Manage Clusters

In addition to managing clusters through the console, you can log on to the master node directly through SSH. The cluster environment is already configured on the master node, so you can execute commands there directly. After you set up a SOCKS5 proxy tunnel over SSH, you can also access the native Hadoop management pages in the cluster.

Generate a key pair

Execute the following command on your own machine:

ssh-keygen -f ./hadoop_key -C "emr public key"

Here, -f specifies the output file and -C adds a comment. The command produces two files: hadoop_key and hadoop_key.pub. hadoop_key is the private key file, which you keep yourself; hadoop_key.pub is the public key file, which you add to the cluster through the console.

Note: If the private key file is lost, immediately delete the corresponding public key from the cluster to prevent potential security risks.
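Before moving on to the console, it can help to confirm that both files exist, tighten the permissions on the private key (OpenSSH refuses private keys that other users can read), and print the public key for copying. The following is a minimal sketch; the function name prepare_key is hypothetical and not part of any EMR tooling.

```shell
# Hypothetical helper: check the key pair, restrict the private key,
# and print the public key so it can be pasted into the console.
prepare_key() {
    key="$1"
    # Both the private key and the .pub file must be present.
    [ -f "$key" ] && [ -f "$key.pub" ] || {
        echo "missing $key or $key.pub" >&2
        return 1
    }
    # Make the private key readable and writable by the owner only.
    chmod 600 "$key"
    # Print the full public key content.
    cat "$key.pub"
}

# Example:
#   prepare_key ./hadoop_key
```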

Add a public key

Open the console page and click the management button of the corresponding cluster to enter the cluster management page. In the security configuration section at the bottom of the page, click Add a Public Key and enter a name for it. The name can use any characters except |; we recommend using the key file name so that the private key and the public key are easy to match. Copy the entire content of hadoop_key.pub into the content field, then click Save to finish adding the public key.

Remove a public key

Open the console page and click the management button of the corresponding cluster to enter the cluster management page. In the security configuration section at the bottom of the page, click the - to the right of the corresponding public key to remove it.

Log on to the master node

After you have generated a key pair and added the public key, you can log on to the master node via SSH as follows:

  1. Obtain the public IP of the master node. Open the cluster management page and find the IP in the master node entry under basic information.

  2. Use the key to log on via SSH. Currently only the hadoop account is available on the master node, so log on as hadoop. The command is as follows (replace $master_ip with the actual IP):

    ssh -i ./hadoop_key hadoop@$master_ip
    

Here, -i specifies the private key file.

  3. After you log on to the master node, you can directly use commands such as hadoop, hdfs, and yarn.
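For scripted use, the same login can run a single command on the master node without opening an interactive shell, because ssh passes any trailing arguments to the remote side as a command. The sketch below assumes the key file and account from the steps above; the function name emr_run and the variables EMR_KEY, MASTER_IP, and SSH_BIN are hypothetical names chosen for illustration.

```shell
# Assumed settings; adjust them to your environment.
EMR_KEY="${EMR_KEY:-./hadoop_key}"   # private key generated earlier
MASTER_IP="${MASTER_IP:-}"           # public IP of the master node

# Hypothetical helper: run one command on the master node as hadoop.
# SSH_BIN is an optional override, e.g. SSH_BIN=echo for a dry run.
emr_run() {
    if [ -z "$MASTER_IP" ]; then
        echo "MASTER_IP is not set" >&2
        return 1
    fi
    "${SSH_BIN:-ssh}" -i "$EMR_KEY" "hadoop@$MASTER_IP" "$@"
}

# Example (lists the HDFS root directory from your own machine):
#   MASTER_IP=203.0.113.10 emr_run hdfs dfs -ls /
```

Setting SSH_BIN=echo prints the ssh arguments instead of connecting, which is a quick way to check them before touching the cluster.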

Establish a SOCKS5 proxy tunnel

  1. Obtain the public IP of the master node. Open the cluster management page and find the IP in the master node entry under basic information.
  2. On the same machine where you generated the key pair, use SSH to establish a SOCKS5 tunnel. The command is as follows (replace $master_ip with the actual master node IP):

    ssh -i ./hadoop_key -N -D 127.0.0.1:1080 hadoop@$master_ip
    

    In this command, -N tells SSH not to run a remote command and -D opens a SOCKS proxy on the given local address, so the proxy server is 127.0.0.1 and the port is 1080.

  3. Set up a proxy in the browser, for example through a proxy management extension. The exact steps differ between systems and browsers. Select SOCKS5 as the proxy protocol (it may be labeled SOCKS, Version 5) and enter the proxy server and port from step 2: 127.0.0.1 and 1080.

    We recommend SwitchyOmega as a proxy extension; it is very easy to use.

  4. Get the IP of each service in the cluster. Log on to the master node and run:

    cat /home/hadoop/app/emr-master/${cluster_id}/emr-master/9002/deploy-helper/conf/emr/${cluster_id}.yaml
    

    Replace ${cluster_id} with the actual cluster ID.

  5. To access a service, enter the corresponding IP and port in the browser.

Service port list

Service          Port
NameNode         41201
ResourceManager  41701
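To check a service from the command line instead of the browser, the ports listed above can be combined with the service IP and requested through the tunnel. This is a rough sketch: the function names are hypothetical, the http scheme is an assumption (the source does not state it), and 10.0.0.5 is a placeholder for an IP taken from the YAML file in step 4. It relies on curl's --socks5-hostname option and on the tunnel from step 2 already running.

```shell
# Hypothetical helper: map a service name from the port list to its port.
service_port() {
    case "$1" in
        NameNode)        echo 41201 ;;
        ResourceManager) echo 41701 ;;
        *) echo "unknown service: $1" >&2; return 1 ;;
    esac
}

# Build the URL for a service on a given IP.
# The http scheme is an assumption, not stated in the source.
service_url() {
    port=$(service_port "$1") || return 1
    echo "http://$2:$port"
}

# Fetch a service page through the SOCKS5 tunnel on 127.0.0.1:1080.
fetch_service() {
    url=$(service_url "$1" "$2") || return 1
    curl --silent --socks5-hostname 127.0.0.1:1080 "$url"
}

# Example (10.0.0.5 is a placeholder IP):
#   fetch_service NameNode 10.0.0.5
```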