Use of Hive

Open Ports for Hive Service

Description    Port
JDBC           32203

Basic Usage

Hive on EMR no longer supports submitting commands through hive-cli. Use Beeline instead, in one of the following two modes:

  1. Interactive

    cd $HIVE_HOME
    ./bin/beeline
    !connect jdbc:hive2://${hiveserver2_ip}:$port
    # After connecting, enter the username and password when prompted.
    select 1+1;

  2. Non-interactive

    cd $HIVE_HOME
    ./bin/beeline -u jdbc:hive2://${hiveserver2_ip}:$port -e "${your_query}"

Note: Interactive mode is typically used to run a small number of consecutive queries over a short period of time; use non-interactive mode when processing large amounts of data.
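
For scripted use, the non-interactive form can be wrapped in a small shell helper. The following is a minimal sketch, not an official EMR tool: the function names `hive_jdbc_url` and `run_hive_query` and the `DRY_RUN` switch are assumptions made for illustration. It relies only on the `-u` and `-e` options shown above and defaults to the JDBC port listed at the top of this page.

```shell
# Hypothetical helper: builds the HiveServer2 JDBC URL.
# $1 = HiveServer2 host, $2 = port (defaults to 32203, the JDBC port listed above)
hive_jdbc_url() {
  printf 'jdbc:hive2://%s:%s\n' "$1" "${2:-32203}"
}

# Hypothetical helper: runs a single query non-interactively through Beeline.
# $1 = HiveServer2 host, $2 = port, $3 = query text
# With DRY_RUN=1 the command is printed (one argument per line) instead of executed.
run_hive_query() {
  # Replace the positional parameters with the full Beeline command line.
  set -- "${HIVE_HOME:?HIVE_HOME is not set}/bin/beeline" \
    -u "$(hive_jdbc_url "$1" "$2")" \
    -e "$3"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$@"
  else
    "$@"
  fi
}
```

After sourcing these functions, a call such as `run_hive_query "${hiveserver2_ip}" 32203 'select 1+1;'` would submit the query; setting `DRY_RUN=1` first shows the exact command without contacting the cluster. Passing the query as a single quoted argument keeps the shell from splitting it on spaces.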

Use of UDF/UDAF

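  1. Prepare the jar package that contains the UDF/UDAF function; assume it is named a.jar.
  2. Upload a.jar to the cluster's HDFS.
  3. Register the function in Hive, where ${jarpath} is the HDFS path of the uploaded jar: create function ${func_name} as '${classfullname_to_your_funcclass}' using jar '${jarpath}';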
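
The steps above can be sketched as a short shell snippet. This is an illustrative outline only: the helper name `create_function_sql`, the function name `my_upper`, the class `com.example.udf.MyUpper`, and the HDFS directory `/user/hive/udf` are all hypothetical placeholders, and the cluster commands are left commented out because they require a running cluster.

```shell
# Hypothetical helper: prints the CREATE FUNCTION statement for a jar stored on HDFS.
# $1 = function name, $2 = fully qualified class name, $3 = jar path on HDFS
create_function_sql() {
  printf "CREATE FUNCTION %s AS '%s' USING JAR '%s';\n" "$1" "$2" "$3"
}

# Example flow (assumed paths and names; uncomment on a cluster node):
# hdfs dfs -put a.jar /user/hive/udf/a.jar
# "$HIVE_HOME/bin/beeline" -u "jdbc:hive2://${hiveserver2_ip}:${port}" \
#   -e "$(create_function_sql my_upper com.example.udf.MyUpper hdfs:///user/hive/udf/a.jar)"
```

Generating the statement in one place keeps the function name, class name, and jar path consistent between the upload step and the registration step.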

Extended FDS File System Support

Extending FDS file system support to Hive lets Hive access data on FDS directly. By default, Hive clusters created with EMR cannot access FDS file systems. To have this feature enabled, send a request by email to emr-help@xiaomi.com.