Monday, June 30, 2014

Installing Hadoop-1.2.1 (Stable Ver 1) on Linux (Ubuntu) Part 1

What is Hadoop?

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

Source : http://hadoop.apache.org/


There are three steps to install Hadoop as a single node cluster.

Step 1. Install Java

Follow this guide to install Java: How to Install/Setup Java Path on Linux

Step 2. Set up SSH for password-less access

Follow this guide: Create and Setup SSH Certificates


Step 3. Install Hadoop


Download the Hadoop binary from http://hadoop.apache.org/releases.html
Choose any stable version.
I have used hadoop-1.2.1 here, from this repository.

After downloading, you should have a file similar to

hadoop-1.2.1.tar.gz

Extract the file using the command

$ tar -xzvf hadoop-1.2.1.tar.gz
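
Before extracting, it can help to check which top-level folder the archive will create. The snippet below is a minimal sketch and not part of the original steps: top_level_dirs is a hypothetical helper name, and it assumes the tar that ships with Ubuntu (GNU tar).

```shell
# top_level_dirs: hypothetical helper, not part of Hadoop.
# Lists the distinct top-level entries inside a .tar.gz archive
# without extracting anything.
top_level_dirs() {
  # -t lists the contents, -z handles gzip, -f names the archive.
  # cut keeps only the first path component of every entry.
  tar -tzf "$1" | cut -d/ -f1 | sort -u
}
```

For example, running top_level_dirs hadoop-1.2.1.tar.gz would be expected to list hadoop-1.2.1, the folder that is moved in the following steps.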

Create a bigdata directory in your home directory.

E.g., here the home directory path is

$ pwd
/home/training

Create a new directory bigdata

$ mkdir bigdata

Go back to the directory where you extracted the archive and
move the extracted hadoop folder into the bigdata directory

$ mv hadoop-1.2.1  /home/training/bigdata

$ cd /home/training/bigdata/hadoop-1.2.1

To check the current working directory type

$ pwd

It should display

/home/training/bigdata/hadoop-1.2.1
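
The create-and-move steps above can also be sketched as one small helper that refuses to overwrite an existing install. This is only an illustration under my own assumptions: move_into is a hypothetical name, and the paths you pass in are whatever you used on your machine.

```shell
# move_into: hypothetical helper, not part of Hadoop.
# Usage: move_into <extracted-folder> <parent-directory>
# Creates the parent directory if needed, moves the folder into it,
# and prints the final location.
move_into() {
  src="$1"
  parent="$2"
  # -p avoids an error when the directory already exists
  mkdir -p "$parent" || return 1
  dest="$parent/$(basename "$src")"
  # Do not clobber an earlier install at the same location
  if [ -e "$dest" ]; then
    echo "already exists: $dest" >&2
    return 1
  fi
  mv "$src" "$parent"/ && echo "$dest"
}
```

With the paths from this post, the call would look like move_into hadoop-1.2.1 /home/training/bigdata.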

Now let's set the path for HADOOP_HOME

Go to home directory

$ cd

Open the .bashrc file in an editor

$ vim .bashrc (or vi .bashrc, if vim is not installed)

Note: To install vim, use the following command

$ sudo apt-get install vim

At the end of the file, add the following settings

#my hadoop settings

export HADOOP_HOME=/home/training/bigdata/hadoop-1.2.1
export PATH=$PATH:$HADOOP_HOME/bin

Note: If you have used a different path, replace it accordingly.

Save and close the file. Reload the config using the command


$ source .bashrc
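
If you would rather not edit the file by hand, the same two settings can be appended from the shell. The snippet below is a rough sketch under my own assumptions: add_hadoop_env is a hypothetical helper name, and it skips the append when an export HADOOP_HOME line is already present, so running it twice does not duplicate the settings.

```shell
# add_hadoop_env: hypothetical helper, not part of Hadoop.
# Usage: add_hadoop_env <rc-file> <hadoop-install-path>
# Appends the HADOOP_HOME and PATH settings to the given file once.
add_hadoop_env() {
  rcfile="$1"
  hadoop_home="$2"
  # Nothing to do if HADOOP_HOME is already exported in this file
  if grep -q '^export HADOOP_HOME=' "$rcfile" 2>/dev/null; then
    return 0
  fi
  {
    printf '\n#my hadoop settings\n'
    printf 'export HADOOP_HOME=%s\n' "$hadoop_home"
    # Single quotes keep $PATH and $HADOOP_HOME literal in the file,
    # so they are expanded when the shell reads .bashrc, not now
    printf '%s\n' 'export PATH=$PATH:$HADOOP_HOME/bin'
  } >> "$rcfile"
}
```

With the path from this post, the call would be add_hadoop_env ~/.bashrc /home/training/bigdata/hadoop-1.2.1. After reloading, echo $HADOOP_HOME should print the install path.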


In the next part, we edit the configuration files for pseudo-distributed mode:

Installing Hadoop-1.2.1 (Stable Ver 1) on Linux (Ubuntu) Part 2

