Monday, June 30, 2014

Installing Hadoop-1.2.1 (Stable Ver 1) on Linux (Ubuntu), Part 2

Continued from Part 1 


Go to the bigdata directory and create two directories, name and data. They will hold the NameNode metadata and the DataNode blocks.

$ cd /home/training/bigdata

$ chmod -R 755 hadoop-1.2.1

$ mkdir name

$ mkdir data

$ chmod -R 755 name

$ chmod -R 755 data
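
The directory setup above can also be sketched as a small shell function. This is only a sketch: make_hdfs_dirs is a made-up helper name (it is not part of Hadoop), and it takes the base directory as its only argument.

```shell
# Sketch: create the name and data directories under a base directory.
# make_hdfs_dirs is a hypothetical helper, not something shipped with Hadoop.
make_hdfs_dirs() {
    base="$1"
    # -p creates missing parents and does not fail if a directory already exists
    mkdir -p "$base/name" "$base/data" || return 1
    # owner: read/write/execute; group and others: read/execute
    chmod 755 "$base/name" "$base/data"
}
```

For the layout used in this post it would be called as make_hdfs_dirs /home/training/bigdata.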

To run Hadoop on a single node, the following files need to be configured:

  1. hdfs-site.xml
  2. mapred-site.xml
  3. core-site.xml
  4. masters (for multinode cluster)
  5. slaves (for multinode cluster)


Go to the Hadoop conf directory

$ cd /home/training/bigdata/hadoop-1.2.1/conf

Open the core-site.xml file

$ vim core-site.xml

Add the following inside the <configuration> tag

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>

Save and close the file.
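
To make it clear where the property goes, here is a rough sketch that writes a complete core-site.xml from the shell. The helper name write_core_site is my own and not part of Hadoop. Note that it overwrites any existing core-site.xml in the given directory, so treat it as an illustration rather than a replacement for editing the file by hand.

```shell
# Sketch: write a complete core-site.xml into a conf directory.
# write_core_site is a hypothetical helper; it OVERWRITES an existing file.
write_core_site() {
    conf_dir="$1"
    # Quoted 'EOF' keeps the XML literal (no shell expansion inside the heredoc)
    cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}
```

With the paths from this post the call would be write_core_site /home/training/bigdata/hadoop-1.2.1/conf.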

Next, open hdfs-site.xml and add the following properties inside the <configuration> tag

$ vim hdfs-site.xml

<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/training/bigdata/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/training/bigdata/data</value>
</property>

Save and close the file.
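
A quick way to double-check what was typed into the file is to read a value back out. The function below is a minimal sketch, not a real XML parser: get_property is a made-up name, and it assumes the <name> and <value> tags sit on consecutive lines, the way they are written in this post.

```shell
# Sketch: print the <value> that follows a given <name> in a Hadoop *-site.xml file.
# Assumes <name> and <value> are on consecutive lines, as in the snippets in this post.
# get_property is a hypothetical helper, not part of Hadoop.
get_property() {
    file="$1"
    name="$2"
    # -F treats the name literally (dots are not regex wildcards); -A1 also prints the next line
    grep -F -A1 "<name>$name</name>" "$file" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}
```

For example, [ -d "$(get_property hdfs-site.xml dfs.name.dir)" ] checks that the configured name directory really exists.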

Now, open mapred-site.xml and add the following property inside the <configuration> tag

$ vim mapred-site.xml

<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>

Save and close the file.

For pseudo-distributed mode, both the masters and slaves files already contain localhost, so we don't need to change them for now.


The next step is to format the NameNode.

But before that, let's check that the Java and Hadoop locations we set earlier (JAVA_HOME and HADOOP_HOME) are visible on the PATH.
In a terminal, type

$ echo $PATH

It should display the paths for both Java and Hadoop, since we have already set them.
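
Since echo $PATH only shows the PATH itself, it can help to look at the two variables directly as well. Here is a small sketch of such a check. The function name check_hadoop_env is my own invention, and it only tests that the variables are non-empty, not that they point to a working installation.

```shell
# Sketch: report whether JAVA_HOME and HADOOP_HOME are set in the current shell.
# check_hadoop_env is a hypothetical helper; it returns non-zero if either is missing.
check_hadoop_env() {
    status=0
    for var in JAVA_HOME HADOOP_HOME; do
        # Indirect lookup: copy the value of the variable named in $var into $value
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set"
            status=1
        else
            echo "$var=$value"
        fi
    done
    return "$status"
}
```

Run check_hadoop_env in the same terminal you plan to start Hadoop from, because the variables are per-shell.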


Now let's format the NameNode.

Type the following commands in the terminal

$ cd

$ hadoop namenode -format

Note: Type an uppercase Y when prompted. The check is case-sensitive, and any other answer aborts the format.
Also, the NameNode should be formatted only once. If you format it again, you might lose all previous data.
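
To avoid formatting twice by accident, the command can be wrapped in a guard. The following is only a sketch under one assumption: that a formatted name directory contains a current/VERSION file, which is what Hadoop 1.x writes there. Both function names are hypothetical.

```shell
# Sketch: format the NameNode only if its storage directory is not formatted yet.
# Assumption: a formatted dfs.name.dir contains current/VERSION (true on Hadoop 1.x).
# name_dir_is_formatted and format_namenode_once are hypothetical helper names.
name_dir_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

format_namenode_once() {
    name_dir="$1"
    if name_dir_is_formatted "$name_dir"; then
        echo "$name_dir is already formatted, skipping"
    else
        # Same command as in this post; it still asks for the uppercase Y
        hadoop namenode -format
    fi
}
```

With the directories from this post the call would be format_namenode_once /home/training/bigdata/name.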


Note: In some cases JAVA_HOME is not detected. If you get an error saying that JAVA_HOME is not set, go to the Hadoop conf directory and edit the
hadoop-env.sh file.

$ cd /home/training/bigdata/hadoop-1.2.1/conf

$ vim hadoop-env.sh

Now find the JAVA_HOME line, remove the # comment sign in front of it, and change the path to match your JAVA_HOME. Do not put spaces around the = sign.
Here it is

export JAVA_HOME=/home/training/java
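
After saving the file, a quick grep can confirm that the line is really active and not still commented out. This is a small sketch. The helper name has_java_home is made up, and it only looks for an uncommented line of the form export JAVA_HOME=..., it does not verify the path.

```shell
# Sketch: succeed if a hadoop-env.sh file has an uncommented JAVA_HOME export.
# has_java_home is a hypothetical helper, not part of Hadoop.
has_java_home() {
    # Matches e.g. "export JAVA_HOME=/some/path"; a leading # makes the line not match
    grep -Eq '^[[:space:]]*export[[:space:]]+JAVA_HOME=.+' "$1"
}
```

For example: has_java_home /home/training/bigdata/hadoop-1.2.1/conf/hadoop-env.sh && echo "JAVA_HOME is set in hadoop-env.sh".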

Now we are all set to start the cluster.

In a terminal, type

$ start-all.sh

$ jps

jps should display the process ID and name of each running Java process.
If everything is set up correctly, we should see the following processes running.

NameNode
DataNode
SecondaryNameNode
JobTracker
TaskTracker
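
Checking the list by eye is fine, but the comparison can also be sketched in shell. The function below reads jps output on standard input and prints the name of every expected daemon that is missing. missing_daemons is a hypothetical name, and the sketch assumes the usual jps line format of a process ID followed by the class name.

```shell
# Sketch: read `jps` output on stdin and print each expected Hadoop 1.x daemon that is absent.
# missing_daemons is a hypothetical helper; assumes lines of the form "<pid> <name>".
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare the second field exactly, so NameNode does not match SecondaryNameNode
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

Usage would be jps | missing_daemons. No output means all five daemons are up.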

To stop the cluster, type

$ stop-all.sh
