What is Hadoop?
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
Source : http://hadoop.apache.org/
There are three steps to install Hadoop as a single node cluster.
Step 1. Install Java
Follow this guide to install Java: How to Install/Setup Java Path on Linux
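That guide covers the details; as a minimal sketch on Ubuntu (assuming the openjdk-7-jdk package is available on your release; Hadoop 1.x also runs on Java 6), Java can be installed from the repositories and verified like this:
$ sudo apt-get install openjdk-7-jdk
$ java -version (should print the installed Java version)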
Step 2. Set up SSH for password-less access: Create and Setup SSH Certificates
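Again, the linked guide has the details; a minimal sketch for password-less login to localhost (assuming openssh-server is already installed) looks like this:
$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ ssh localhost (should log in without prompting for a password)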
Step 3. Install Hadoop
Download the Hadoop binary from http://hadoop.apache.org/releases.html and choose any stable version. I have used Hadoop-1.2.1 here, from this repository.
After downloading, you should have a file similar to
hadoop-1.2.1.tar.gz
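If you prefer the command line, the tarball can also be fetched with wget; the mirror path below is only an example (older releases are kept in the Apache archive):
$ wget http://archive.apache.org/dist/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz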
Extract the file using the command:
$ tar -xzvf hadoop-1.2.1.tar.gz
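To confirm the extraction worked, list the new folder; it should show a hadoop-1.2.1 directory:
$ ls -d hadoop-1.2.1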
Create a bigdata directory in your home directory.
E.g., here the home directory path is:
$ pwd
/home/training
Create a new directory bigdata:
$ mkdir bigdata
Go back to where you extracted Hadoop, move the extracted hadoop folder to the bigdata directory, and change into it:
$ mv hadoop-1.2.1 /home/training/bigdata
$ cd /home/training/bigdata/hadoop-1.2.1
To check the current working directory, type:
$ pwd
It should display:
/home/training/bigdata/hadoop-1.2.1
Now let's set the path for HADOOP_HOME.
Go to the home directory:
$ cd
Open the .bashrc file:
$ vim .bashrc (or vi .bashrc, if vim is not installed)
Note: To install vim we can use the following command:
$ sudo apt-get install vim
At the end of the file, add the following settings:
# my hadoop settings
export HADOOP_HOME=/home/training/bigdata/hadoop-1.2.1
export PATH=$PATH:$HADOOP_HOME/bin
Note: If you have used some other path, replace it accordingly.
Save and close the file. Reload the configuration using the command:
$ source .bashrc
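To check that the variables took effect in the current shell, echo HADOOP_HOME and run hadoop version (which should report 1.2.1, provided JAVA_HOME was set up in Step 1):
$ echo $HADOOP_HOME
$ hadoop version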
In the next part we edit the configuration files for pseudo-distributed mode:
Installing Hadoop-1.2.1 (Stable Ver 1) on Linux (Ubuntu) Part 2