Steps for Setting Up a Single-Node Hadoop Cluster
Creating User
This guide assumes that you have a separate user for setting up Hadoop, or that all your nodes already have the same user configured. If not, create one before continuing.
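A minimal sketch of creating such a user on a typical Linux system. The user and group name "hadoop" are assumptions for illustration only; use whatever name is shared across your nodes.

```shell
# Hypothetical user and group names; change them to suit your setup.
HADOOP_USER="hadoop"
HADOOP_GROUP="hadoop"

if id "$HADOOP_USER" >/dev/null 2>&1; then
    echo "User $HADOOP_USER already exists"
elif [ "$(id -u)" -ne 0 ]; then
    # Creating users needs root privileges.
    echo "Re-run as root (or via sudo) to create $HADOOP_USER"
else
    # -f: do not fail if the group already exists.
    groupadd -f "$HADOOP_GROUP"
    # -m: create a home directory, -g: primary group, -s: login shell.
    useradd -m -g "$HADOOP_GROUP" -s /bin/bash "$HADOOP_USER"
fi
```

Afterwards, set a password with passwd and switch to the new account with su before continuing.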
Check Java Version
Install the Java(TM) SE Runtime Environment, version 1.7. To check your Java version, run java -version. If Java is not installed, install Java SE first. Also check that JAVA_HOME is set.
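Both checks can be done in one go. This is only a sketch; the variable names java_status and java_home_status are mine and not part of any tool.

```shell
# Print the Java version if java is on the PATH.
if command -v java >/dev/null 2>&1; then
    java_status="found"
    java -version 2>&1
else
    java_status="missing"
    echo "java not found on PATH; install Java SE first"
fi

# Report whether JAVA_HOME is set to a non-empty value.
if [ -n "${JAVA_HOME:-}" ]; then
    java_home_status="set"
    echo "JAVA_HOME is set to $JAVA_HOME"
else
    java_home_status="not set"
    echo "JAVA_HOME is not set"
fi
```

If JAVA_HOME is not set, export it in .bashrc, pointing at the directory of your Java installation.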
Installing Hadoop
Step 1: Download and Extract
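I personally prefer to have all applications installed under /opt/, so I have done the same here.
Add “export HADOOP_HOME=/usr/local/hadoop” to .bashrc, adjusting the path if you extracted Hadoop somewhere else (for example, under /opt/).
Once the above is set up properly, verify the installation by running hadoop version. This requires $HADOOP_HOME/bin to be on your PATH.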
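A sketch of the environment setup and the verification step. The path is the one quoted above; the PATH line and the BASHRC variable are my additions so that the hadoop command can be found.

```shell
# Path from the text above; adjust it if Hadoop was extracted elsewhere.
HADOOP_HOME_DIR="/usr/local/hadoop"
BASHRC="$HOME/.bashrc"

# Append the exports only once, so re-running this does not duplicate them.
if ! grep -q "HADOOP_HOME=" "$BASHRC" 2>/dev/null; then
    {
        echo "export HADOOP_HOME=$HADOOP_HOME_DIR"
        echo 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin'
    } >> "$BASHRC"
fi

# Load the same values into the current shell, then verify.
export HADOOP_HOME="$HADOOP_HOME_DIR"
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
if command -v hadoop >/dev/null 2>&1; then
    hadoop version
else
    echo "hadoop not found; check that $HADOOP_HOME/bin exists"
fi
```

New terminals pick the variables up from .bashrc automatically; an already open terminal needs source ~/.bashrc.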
Step 2: Creating Required Directories
This assumes that all the data is going to be set up in the same location on all instances.
The directories to be created include the namenode and datanode paths used in Step 3.2: /opt/hadoop/data/name and /opt/hadoop/data/data.
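A minimal sketch that creates the namenode and datanode directories from Step 3.2. The variable names are mine; the paths come from this guide.

```shell
# Paths taken from the hdfs-site.xml step of this guide.
NAME_DIR="/opt/hadoop/data/name"
DATA_DIR="/opt/hadoop/data/data"

# -p creates missing parent directories and is a no-op if they already exist.
if mkdir -p "$NAME_DIR" "$DATA_DIR" 2>/dev/null; then
    echo "Created $NAME_DIR and $DATA_DIR"
else
    echo "Could not create the directories; re-run with sufficient privileges"
fi
```

The directories must be writable by the user that runs Hadoop, so change their ownership with chown if they were created as root.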
Step 3.1: Hadoop Configuration - core-site.xml
The core-site.xml file contains information such as the port number used for the Hadoop instance, the memory allocated for the file system, the memory limit for storing data, and the size of the read/write buffers.
All Hadoop configuration files are located in $HADOOP_HOME/etc/hadoop.
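A minimal core-site.xml might look like the sketch below. The host and port in hdfs://localhost:9000 are assumptions commonly used for a single node and are not taken from this guide. The file is written to the current directory first and copied into the configuration directory only if that directory exists, so back up any existing file beforehand.

```shell
# Configuration directory, falling back to the path used earlier in this guide.
CONF_DIR="${HADOOP_HOME:-/usr/local/hadoop}/etc/hadoop"

# fs.defaultFS tells clients where the namenode listens (assumed host and port).
cat > core-site.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

# Install the file only when the Hadoop configuration directory is present.
if [ -d "$CONF_DIR" ]; then
    cp core-site.xml "$CONF_DIR/core-site.xml"
fi
```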
Step 3.2: Hadoop Configuration - hdfs-site.xml
The hdfs-site.xml file contains information such as the replication factor, the namenode path, and the datanode path on your local file system, where you want to store the Hadoop infrastructure.
Assuming the following details:
dfs.replication = 1
namenode path = /opt/hadoop/data/name
datanode path = /opt/hadoop/data/data
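With the values above, hdfs-site.xml could be generated as in this sketch. The property names are the Hadoop 2.x ones (dfs.namenode.name.dir and dfs.datanode.data.dir); if you run a different release, check its hdfs-default.xml first.

```shell
# Configuration directory, falling back to the path used earlier in this guide.
CONF_DIR="${HADOOP_HOME:-/usr/local/hadoop}/etc/hadoop"

# Replication factor 1 plus the namenode and datanode paths listed above.
cat > hdfs-site.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///opt/hadoop/data/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///opt/hadoop/data/data</value>
  </property>
</configuration>
EOF

# Install the file only when the Hadoop configuration directory is present.
if [ -d "$CONF_DIR" ]; then
    cp hdfs-site.xml "$CONF_DIR/hdfs-site.xml"
fi
```

A replication factor of 1 is fine on a single node, since there is only one datanode to hold each block.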
Step 3.3: Hadoop Configuration - yarn-site.xml
This file is used to configure YARN for Hadoop.
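A minimal yarn-site.xml sketch. The only property set here, yarn.nodemanager.aux-services, enables the shuffle service that MapReduce jobs need; treat it as a starting point for a single node, not a complete YARN configuration.

```shell
# Configuration directory, falling back to the path used earlier in this guide.
CONF_DIR="${HADOOP_HOME:-/usr/local/hadoop}/etc/hadoop"

# Enable the MapReduce shuffle auxiliary service on the node manager.
cat > yarn-site.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF

# Install the file only when the Hadoop configuration directory is present.
if [ -d "$CONF_DIR" ]; then
    cp yarn-site.xml "$CONF_DIR/yarn-site.xml"
fi
```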
Step 3.4: Hadoop Configuration - mapred-site.xml
This file specifies which MapReduce framework we are using. By default, Hadoop ships only a template for this file, mapred-site.xml.template. Copy mapred-site.xml.template to mapred-site.xml, or create the file yourself.
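This sketch takes the "create it" route and writes a mapred-site.xml that selects YARN as the MapReduce framework. If you would rather start from the template, copy mapred-site.xml.template to mapred-site.xml and add the same property by hand.

```shell
# Configuration directory, falling back to the path used earlier in this guide.
CONF_DIR="${HADOOP_HOME:-/usr/local/hadoop}/etc/hadoop"

# Run MapReduce jobs on YARN.
cat > mapred-site.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

# Install the file only when the Hadoop configuration directory is present.
if [ -d "$CONF_DIR" ]; then
    cp mapred-site.xml "$CONF_DIR/mapred-site.xml"
fi
```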
Step 4.1: Verifying Hadoop Installation - Name Node Setup
Set up the namenode by running the command “hdfs namenode -format”.
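A guarded version of the format step, as a sketch. The format_status variable is mine and only records what happened. Formatting erases any existing HDFS metadata in the namenode directory, so run it only on a fresh setup.

```shell
# Format the namenode only if the hdfs command is available on the PATH.
if command -v hdfs >/dev/null 2>&1; then
    if hdfs namenode -format; then
        format_status="done"
    else
        format_status="failed"
    fi
else
    format_status="skipped"
    echo "hdfs not found on PATH; check HADOOP_HOME and PATH"
fi
echo "namenode format: $format_status"
```

Run this as the Hadoop user, so that the files created in the namenode directory end up with the right owner.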