In the terminal,
sudo -u hdfs hadoop fs -mkdir /user/Leela
Note: The parent directory Leela must be created first, and the Hive directory under it afterwards.
sudo -u hdfs hadoop fs -mkdir /user/Leela/Hive
OR
sudo -u hdfs hadoop fs -mkdir hdfs://quickstart.cloudera:8020/user/Leela/Hive/Student
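If the parent directories do not exist yet, the -p flag (available on Hadoop 2.x and later clients, as on the quickstart VM) creates the whole path in one step:
sudo -u hdfs hadoop fs -mkdir -p /user/Leela/Hive/Student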
To list the files in a created directory,
hadoop fs -ls Leela
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2017-07-07 11:34 Leela/Nifi1_out
To Upload a file to HDFS, use
hadoop fs -put /home/cloudera/Desktop/Spark/Words2 /user/Leela/Hive
OR
hadoop fs -put file:///home/cloudera/Desktop/Hadoop/HIve/hive_inputs/student.txt hdfs://quickstart.cloudera:8020/user/Leela/Hive/Student
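hadoop fs -copyFromLocal is an equivalent upload command restricted to local sources; a sketch reusing the paths above:
hadoop fs -copyFromLocal /home/cloudera/Desktop/Spark/Words2 /user/Leela/Hive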
To delete a file in HDFS, use
hadoop fs -rm hdfs://quickstart.cloudera:8020/user/Pig/demo.txt
OR
hadoop fs -rm /user/hadoop/Leela/Nifi1_out/1.txt
followed by,
hadoop fs -expunge
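-expunge empties the HDFS trash. Alternatively, the -skipTrash flag deletes a file immediately without moving it to the trash at all, for example:
hadoop fs -rm -skipTrash /user/hadoop/Leela/Nifi1_out/1.txt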
To delete an (empty) directory in HDFS, use
sudo -u hdfs hadoop fs -rmdir hdfs://quickstart.cloudera:8020/user/Leela/Hive/Student
To recursively delete a directory that has files in it, use -rm -r, eg:
hadoop fs -rm -r /user/Leela/Hive
followed by hadoop fs -expunge
To view the contents of an HDFS file,
hadoop fs -cat /user/hadoop/Leela/Nifi1_out/1.txt
To copy a local directory to HDFS, overwriting the destination if it already exists (-f),
hdfs dfs -put -f /home/cloudera/Employee/newdir1 /user/cloudera
To merge the files in an HDFS directory and save the result to the local filesystem,
hdfs dfs -getmerge /user/cloudera/newdir1 /home/cloudera/Employee/newdir1/MergedEmployee.txt
Change permissions (664 = read/write for owner and group, read-only for others).
hdfs dfs -chmod 664 /user/cloudera/newdir1/MergedEmployee.txt
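Symbolic modes also work; for instance, adding group write permission to the same (illustrative) file:
hdfs dfs -chmod g+w /user/cloudera/newdir1/MergedEmployee.txt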
To find the Hadoop installation directory,
whereis hadoop
To print the Hadoop home directory,
echo $HADOOP_HOME
This command lists the directories together with the users who created them:
hadoop fs -ls
su root
Give the password as cloudera.
su hdfs //switch to the hdfs user if the directory was created by the hdfs user
Note: Switch to the owner of the directory to get permissions on it.
Provide the permissions
hadoop fs -chmod -R 777 /user/Leela
hadoop fs -chmod -R 777 /user
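If changing ownership is preferable to opening permissions up to 777, -chown (run as the hdfs superuser; the cloudera user here is illustrative) reassigns the owner:
sudo -u hdfs hadoop fs -chown -R cloudera /user/Leela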
hadoop fs -ls xml_test //here xml_test is the directory name (hadoop dfs is deprecated in favour of hadoop fs)
In case of a "NameNode is in safe mode" error, use the command below:
sudo -u hdfs hdfs dfsadmin -safemode leave
To Kill an existing Job
/usr/lib/hadoop/bin/hadoop job -kill job_1490524209136_0004
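On newer releases the hadoop job command is deprecated; the mapred CLI exposes the same operation:
mapred job -kill job_1490524209136_0004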
To see the list of running Hadoop daemons (JVM processes),
sudo jps
To get the list of Hadoop directories:
hadoop fs -ls /
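-ls also takes an -R flag to walk a directory tree recursively, eg:
hadoop fs -ls -R /user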
PuTTY Usage
Give the host name along with the user id:
eg: ubuntu@ec2-13-126-209-11.ap-south-1.compute.amazonaws.com
OR
ubuntu@13.126.197.33
For cloud instances, first generate a keypair and use PuTTYgen to convert it into a PPK file.
In PuTTY, under Connection -> SSH -> Auth, provide the path of the generated PPK file and click Open.
The session can also be saved for later reuse.
The PPK file applies to cloud instances; in the case of a datacentre, a username and password have to be entered in the terminal.
WinSCP Usage: To transfer files to a datanode,
provide the host name as above,
Eg: ubuntu@13.126.71.73
Click Advanced -> SSH -> Authentication and provide the PPK file path.
To install the AWS CLI (on Ubuntu/Debian):
sudo apt install awscli
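Once installed, a quick sanity check and credential setup (keys and region are prompted for) can be done with:
aws --version
aws configure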
To get the NameNode's host name,
hdfs getconf -namenodes
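hdfs getconf can also read individual configuration keys; for example, the default filesystem URI:
hdfs getconf -confKey fs.defaultFS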
To view the contents of an HDFS file,
hadoop fs -cat /user/cert/problem/solution/part-m-00000
Copy a file from source to destination
hadoop fs -cp /user/saurzcode/dir1/abc.txt /user/saurzcode/dir2
Move a file from source to destination.
hdfs dfs -mkdir /user/cloudera/problem2
hdfs dfs -mv /user/cloudera/products /user/cloudera/problem2/products
Display the aggregate length of a file.
hdfs dfs -du /user/cloudera/problem2/products/part-m-00000
Display the size of sub-directories in the parent directory (-h displays sizes in human-readable units).
hdfs dfs -du -h /user/cloudera/problem2/products
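Adding -s summarises the whole directory into a single total instead of listing each entry:
hdfs dfs -du -s -h /user/cloudera/problem2/products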
Display last few lines of a file.
hdfs dfs -tail /user/cloudera/problem2/products/part-m-00000
Commands for operating on services in Cloudera, like Hive, Hue, etc.
To get the list of YARN applications running on the cluster,
yarn application -list
To kill a running YARN application,
eg:
yarn application -kill application_1542646340855_17805
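To inspect an application's state before killing it, -status takes the same application id:
yarn application -status application_1542646340855_17805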
A few points:
Scripts for all the services on the machine live under /etc/init.d. /etc/init.d contains the scripts used by the System V init tools (SysVinit). This is the traditional service management package for Linux, containing the init program (the first process that runs when the kernel has finished initializing) as well as some infrastructure to start and stop services and configure them. Specifically, files in /etc/init.d are shell scripts that respond to start, stop and restart commands.
To start all Hadoop services, use:
for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x start ; done
This will start all services whose names start with hadoop-.
To start an individual service, look up its name under /etc/init.d/ and run, for example, sudo service zookeeper-server start.
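The same loop pattern, with stop in place of start, shuts the services down again:
for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x stop ; done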
List the running YARN applications on the node:
yarn application -list
Status of all the running services
service --status-all
Restart HMaster
sudo service hbase-master restart
Note: In case of the error "HBase master daemon is dead and pid file exists [FAILED]", restart HMaster using the above command.
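To confirm the daemon came back up, the standard service status check applies here as well:
sudo service hbase-master status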
Command to start the ZooKeeper service:
sudo service zookeeper-server start
To kill an existing running YARN application:
yarn application -kill <APPLICATIONID>
eg:
yarn application -kill application_1542646340855_17805