Install Cloudera Hadoop on Linux
This article explains how to install Cloudera Hadoop on a single host machine.
Hadoop is a framework for processing large amounts of data in parallel.
Hadoop distributions are provided by several vendors, such as Hortonworks and Cloudera.
To set up Cloudera Hadoop, Java is required.
If Java is not already installed, install JDK 1.6, update 8 or later.
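One way to check the Java requirement before going further is sketched below. The helper name java_version_ok is hypothetical, and the check deliberately accepts only 1.6.0 builds at update 8 or later, because that is the only release this article names. It also assumes that `java -version` prints a line containing a quoted version string such as "1.6.0_18".

```shell
# Return 0 if the version string in $1 (for example "1.6.0_18") is
# JDK 1.6 update 8 or later, and 1 otherwise.
# Assumption: only 1.6.0_N strings are accepted, since the article names JDK 1.6.
java_version_ok() {
    version=$1
    case $version in
        1.6.0_*) update=${version#1.6.0_} ;;
        *) return 1 ;;
    esac
    # Reject an empty or non-numeric update suffix such as "18-ea".
    case $update in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$update" -ge 8 ]
}

# Check the java binary on PATH, if there is one.
if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so redirect it into the pipe.
    installed=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    if java_version_ok "$installed"; then
        echo "Java $installed meets the requirement"
    else
        echo "Java $installed found, but JDK 1.6 update 8 or later is expected"
    fi
else
    echo "java not found on PATH"
fi
```

If the check fails, install a suitable JDK before continuing with the steps below.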
Download the Cloudera-testing.repo file from http://archive.cloudera.com/redhat/cdh/, copy it to /etc/yum.repos.d/, and then update yum so that it picks up the new repository.
Run the commands below to install Hadoop, Hive, and Pig:
yum install hadoop-0.20 -y
yum install hadoop-hive -y
yum install hadoop-pig -y
The commands above install Hadoop to /usr/lib/hadoop, Hive to /usr/lib/hive, and Pig to /usr/lib/pig.
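A quick way to confirm that the three install directories exist is sketched below. The function name missing_dirs is hypothetical; the paths are the ones listed above.

```shell
# Print every directory from the argument list that does not exist.
# Returns 0 when all of them are present and 1 otherwise.
missing_dirs() {
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            status=1
        fi
    done
    return $status
}

if missing_dirs /usr/lib/hadoop /usr/lib/hive /usr/lib/pig; then
    echo "all install directories found"
fi
```

A "missing" line usually means the corresponding yum install step did not complete.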
Set up the environment variables in the ~/.bashrc file as described below:
$ vi ~/.bashrc
export HADOOP_HOME=/usr/lib/hadoop
export HIVE_HOME=/usr/lib/hive
export PIG_HOME=/usr/lib/pig
export PATH=$HADOOP_HOME/bin:$PATH:$PIG_HOME/bin:$HIVE_HOME/bin
Save the file, then reload it in the current shell:
$ source ~/.bashrc
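After reloading the file, it can help to confirm that the variables are set and that each bin directory really is on the PATH. The following is a minimal sketch; path_has is a hypothetical helper name, not part of Hadoop.

```shell
# Return 0 if directory $1 is one of the colon-separated entries in $2.
path_has() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report the state of each variable set in ~/.bashrc.
for var in HADOOP_HOME HIVE_HOME PIG_HOME; do
    # Indirect lookup of the variable named in $var (names are fixed above).
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
        echo "$var is not set"
    elif path_has "$value/bin" "$PATH"; then
        echo "$var=$value (bin directory is on PATH)"
    else
        echo "$var=$value (bin directory is NOT on PATH)"
    fi
done
```

The wrapping colons in path_has make the match exact, so /usr/lib does not count as a match for /usr/lib/hadoop/bin.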
Open $HADOOP_HOME/conf/hadoop-env.sh and add the JAVA_HOME path. For example:
export JAVA_HOME=/usr/java/jdk1.6.0_18
Open $HADOOP_HOME/conf/core-site.xml and set fs.default.name to the NameNode server name (or localhost) and port. For example:
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
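In core-site.xml, each setting is a property element inside the top-level configuration element. The sketch below generates a minimal file with only this one setting; write_core_site is a hypothetical helper, and it writes to a scratch file so the result can be reviewed before it is merged into $HADOOP_HOME/conf/core-site.xml. Note that it overwrites its target, so it is not suitable for a file that already holds other properties.

```shell
# Write a minimal core-site.xml to file $1 that sets fs.default.name to URI $2.
# Assumption: no other properties are needed; the target file is overwritten.
write_core_site() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>$2</value>
  </property>
</configuration>
EOF
}

# Generate the file in a scratch location and print it for review.
scratch=$(mktemp)
write_core_site "$scratch" hdfs://localhost:9000
cat "$scratch"
```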