Getting Started with HUE

HUE is an open-source UI system for Apache Hadoop. It was developed early on by Cloudera and later contributed to the open-source community. It is built on the Python web framework Django. With Hue we can operate a Hadoop cluster from a browser, e.g. put/get files, run MapReduce jobs, and so on. Learning resources: http://gethue.com, https://github.com/cloude

# Required dependencies:
gcc gcc-c++ ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi krb5-devel
libtidy (for unit tests only) libxml2-devel libxslt-devel mvn (from maven package)
mysql mysql-devel openssl-devel (for version 7+) openldap-devel python-devel sqlite-devel

# Run the following commands:
make apps
build/env/bin/hue runserver
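On CentOS/RHEL the dependency list above maps to a single install command. A sketch only (not run here); exact package names and availability depend on your distribution and repos:

```shell
# Sketch: install Hue's build dependencies on CentOS/RHEL (run as root).
# Package names follow the list above; "mvn" comes from the maven package.
yum install -y gcc gcc-c++ ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi \
    krb5-devel libtidy libxml2-devel libxslt-devel maven mysql mysql-devel \
    openssl-devel openldap-devel python-devel sqlite-devel
```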

Edit the configuration file /opt/hue-3.7.0-cdh5.3.6/desktop/conf/hue.ini:

  secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn 
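The secret_key is used by Hue's Django layer to sign session cookies; any long random string works. A minimal sketch for generating one (the helper name is ours):

```python
import random
import string

def make_secret_key(length=60):
    # Mix letters, digits, and punctuation, like the sample key in hue.ini above.
    chars = string.ascii_letters + string.digits + string.punctuation
    return "".join(random.choice(chars) for _ in range(length))

print(make_secret_key())
```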
  

Integrating HUE with HDFS

1. Add the following to hdfs-site.xml:

  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>

2. You also need to add this to core-site.xml:

  <property>
    <name>hadoop.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hue.groups</name>
    <value>*</value>
  </property>

3. Also add this to httpfs-site.xml, which might be in /etc/hadoop-httpfs/conf:

  <property>
    <name>httpfs.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>httpfs.proxyuser.hue.groups</name>
    <value>*</value>
  </property>

Sync the configuration to the other nodes.

Start the HttpFS process:

sbin/httpfs.sh start

Configure hue.ini:

  [[hdfs_clusters]]
    # HA support by using HttpFs
    [[[default]]]
      # Enter the filesystem uri
      fs_defaultfs=hdfs://ns1
      # NameNode logical name.
      logical_name=ns1
      # Use WebHdfs/HttpFs as the communication mechanism.
      # Domain should be the NameNode or HttpFs host.
      # Default port is 14000 for HttpFs.
      webhdfs_url=http://hadoop-senior01.zhangbk.com:14000/webhdfs/v1
      # Change this if your HDFS cluster is Kerberos-secured
      security_enabled=false
      # Default umask for file and directory creation, specified in an octal value.
      umask=022
      # Directory of the Hadoop configuration
      hadoop_conf_dir=/opt/hadoop-2.5.0-cdh5.3.6/etc/hadoop
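Under the hood, Hue appends the HDFS path and operation to webhdfs_url and issues standard WebHDFS REST calls. A sketch of the resulting URL (the function name is ours; the query format is the WebHDFS convention):

```python
def webhdfs_op_url(base, path, op, user="hue"):
    # base is webhdfs_url from hue.ini; path is an absolute HDFS path.
    return "%s%s?op=%s&user.name=%s" % (base.rstrip("/"), path, op, user)

# e.g. listing /user/hue through HttpFs on port 14000:
print(webhdfs_op_url("http://hadoop-senior01.zhangbk.com:14000/webhdfs/v1",
                     "/user/hue", "LISTSTATUS"))
```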

Problems encountered:

Or edit the configuration file hdfs-site.xml:

  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>

Solution:

  Edit the file desktop/libs/hadoop/src/hadoop/fs/webhdfs.py and change DEFAULT_HDFS_SUPERUSER = 'hdfs' to your own user, or create an hdfs user in HUE.
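That one-line change can be scripted with sed; here it is demonstrated on a scratch copy rather than the real webhdfs.py (the user name zhangbk is illustrative):

```shell
# Demo the edit on a scratch copy; point sed at
# desktop/libs/hadoop/src/hadoop/fs/webhdfs.py to do it for real.
f=webhdfs_demo.py
printf "DEFAULT_HDFS_SUPERUSER = 'hdfs'\n" > "$f"
sed -i "s/DEFAULT_HDFS_SUPERUSER = 'hdfs'/DEFAULT_HDFS_SUPERUSER = 'zhangbk'/" "$f"
grep DEFAULT_HDFS_SUPERUSER "$f"
```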

Integrating HUE with YARN

  # Configuration for YARN (MR2)
  # ------------------------------------------------------------------------
  [[yarn_clusters]]
    [[[default]]]
      # Enter the host on which you are running the ResourceManager
      resourcemanager_host=hadoop-senior03.zhangbk.com
      # The port where the ResourceManager IPC listens on
      resourcemanager_port=8032
      # Whether to submit jobs to this cluster
      submit_to=True
      # Resource Manager logical name (required for HA)
      logical_name=
      # Change this if your YARN cluster is Kerberos-secured
      security_enabled=false
      # URL of the ResourceManager API
      resourcemanager_api_url=http://hadoop-senior03.zhangbk.com:8088
      # URL of the ProxyServer API
      proxy_api_url=http://hadoop-senior03.zhangbk.com:8088
      # URL of the HistoryServer API
      history_server_api_url=http://hadoop-senior01.zhangbk.com:19888
      # In secure mode (HTTPS), if SSL certificates from Resource Manager's
      # Rest Server have to be verified against certificate authority
      ssl_cert_ca_verify=False

Integrating HUE with Hive

Edit hive-site.xml:

  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>hadoop-senior01.zhangbk.com</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://hadoop-senior01.zhangbk.com:9083</value>
  </property>

Start the hiveserver2 and metastore services.

Edit the hue.ini file:

  # Settings to configure Beeswax with Hive
  [beeswax]
    # Host where HiveServer2 is running.
    # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
    hive_server_host=hadoop-senior01.zhangbk.com
    # Port where HiveServer2 Thrift server runs on.
    hive_server_port=10000
    # Hive configuration directory, where hive-site.xml is located
    hive_conf_dir=/opt/hive-0.13.1-cdh5.3.6/conf
    # Timeout in seconds for thrift calls to Hive service
    server_conn_timeout=120
    # Choose whether Hue uses the GetLog() thrift call to retrieve Hive logs.
    # If false, Hue will use the FetchResults() thrift call instead.
    use_get_log_api=true
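Before restarting Hue, it is worth checking that the HiveServer2 host and port configured above are actually reachable. A small sketch (the helper name is ours):

```python
import socket

def port_open(host, port, timeout=2.0):
    # True if a TCP connection to host:port succeeds within the timeout.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("hadoop-senior01.zhangbk.com", 10000) for HiveServer2
```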

Integrating HUE with RDBMS

  # Settings for the RDBMS application
  [librdbms]
    # The RDBMS app can have any number of databases configured in the databases
    # section. A database is known by its section name
    # (IE sqlite, mysql, psql, and oracle in the list below).
    [[databases]]
      # sqlite configuration.
      [[[sqlite]]]
        # Name to show in the UI.
        nice_name=SQLite
        # For SQLite, name defines the path to the database.
        name=/opt/hue-3.7.0-cdh5.3.6/desktop/desktop.db
        # Database backend to use.
        engine=sqlite
        # Database options to send to the server when connecting.
        # https://docs.djangoproject.com/en/1.4/ref/databases/
        options={}

      # mysql, oracle, or postgresql configuration.
      [[[mysql]]]
        # Name to show in the UI.
        nice_name=MySql
        # For MySQL and PostgreSQL, name is the name of the database.
        # For Oracle, Name is instance of the Oracle server. For express edition
        # this is 'xe' by default.
        name=test
        # Database backend to use. This can be:
        # 1. mysql
        # 2. postgresql
        # 3. oracle
        engine=mysql
        # IP or hostname of the database to connect to.
        host=hadoop-senior01.zhangbk.com
        # Port the database server is listening to. Defaults are:
        # 1. MySQL: 3306
        # 2. PostgreSQL: 5432
        # 3. Oracle Express Edition: 1521
        port=3306
        # Username to authenticate with when connecting to the database.
        user=root
        # Password matching the username to authenticate with when
        # connecting to the database.
        password=password01
        # Database options to send to the server when connecting.
        # https://docs.djangoproject.com/en/1.4/ref/databases/
        options={"init_command":"SET NAMES 'utf8'"}

Problems encountered:

 2. A server error occurred: 'utf8' codec can't decode byte 0xfc in position 1: invalid start byte

options={"init_command":"SET NAMES 'utf8'"}
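Note that the options value must be valid JSON with straight ASCII quotes; the curly "smart quotes" that web pages often substitute will not parse. A quick check (sketch):

```python
import json

# Straight quotes: parses cleanly.
good = "{\"init_command\": \"SET NAMES 'utf8'\"}"
print(json.loads(good)["init_command"])

# Curly quotes (as pasted from a rendered web page): rejected.
bad = "{\u201cinit_command\u201d: \u201cSET NAMES 'utf8'\u201d}"
try:
    json.loads(bad)
except ValueError as e:
    print("rejected:", e)
```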

Integrating HUE with Oozie

  # Settings to configure liboozie
  [liboozie]
    # The URL where the Oozie service runs on. This is required in order for
    # users to submit jobs. Empty value disables the config check.
    oozie_url=http://hadoop-senior01.zhangbk.com:11000/oozie
    # Requires FQDN in oozie_url if enabled
    security_enabled=false
    # Location on HDFS where the workflows/coordinator are deployed when submitted.
    remote_deployement_dir=/user/oozie-apps

  # Settings to configure the Oozie app
  [oozie]
    # Location on local FS where the examples are stored.
    local_data_dir=/opt/oozie-4.0.0-cdh5.3.6/oozie-apps
    # Location on local FS where the data for the examples is stored.
    sample_data_dir=/opt/oozie-4.0.0-cdh5.3.6/oozie-apps/input-data
    # Location on HDFS where the oozie examples and workflows are stored.
    remote_data_dir=/user/oozie-apps
    # Maximum of Oozie workflows or coordinators to retrieve in one API call.
    oozie_jobs_count=100
    # Use Cron format for defining the frequency of a Coordinator instead of the old frequency number/unit.
    enable_cron_scheduling=true
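With enable_cron_scheduling=true, a coordinator's frequency is written as a 5-field cron expression (minute, hour, day-of-month, month, day-of-week) instead of the old number/unit pair. A rough shape check (a sketch, not a full cron validator):

```python
def looks_like_cron(expr):
    # A cron-style frequency has exactly 5 whitespace-separated fields.
    return len(expr.split()) == 5

print(looks_like_cron("0 2 * * *"))  # daily at 02:00 -> True
print(looks_like_cron("60"))         # old-style "every 60 minutes" -> False
```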

Problems encountered:

/user/oozie/share/lib — the Oozie Share Lib could not be installed to the default location.

Solution:

Add the following to oozie-site.xml:

  <property>
    <name>oozie.service.WorkflowAppService.system.libpath</name>
    <value>/user/oozie/share/lib</value>
  </property>

Re-run:

 bin/oozie-setup.sh sharelib create -fs hdfs://hadoop-senior01.zhangbk.com:8020 -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz

 


Publisher: 全栈程序员-站长. Please credit the source when reprinting: https://javaforall.net/225746.html (original link: https://javaforall.net)
