Wednesday, July 8, 2015

Apache Whirr basic tutorial explained

What is Apache Whirr

Apache Whirr is an open source Java library for creating and setting up Hadoop clusters on different cloud services. It also provides command-line tools to launch Hadoop services. Whirr uses the jclouds API underneath to interact with the different cloud providers.

Whirr Advantages

Apache Whirr provides the following advantages:
1. No need to write separate scripts for each cloud provider to run cloud services
2. A common API to interact with different cloud providers for provisioning
3. Install, configure, and deploy Hadoop clusters quickly, in minutes

If you look at the recipes folder of the Whirr software package, you can see that the following cloud providers and services are supported.

Whirr supported cloud providers
1. Amazon cloud: Hadoop can be set up very easily on Amazon EC2 instances. Clusters can be launched dynamically and destroyed when no longer required.
2. Rackspace cloud
3. OpenStack cloud
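
The launch-and-destroy cycle mentioned for EC2 can be sketched from the command line. This is only a minimal sketch, assuming Whirr is already installed and on the PATH and that a properties file describing the cluster exists. The function names launch_cluster and destroy_cluster and the RECIPE variable are made up for illustration; launch-cluster, destroy-cluster and --config are Whirr's own CLI subcommands and option.

```shell
#!/bin/sh
# RECIPE is a hypothetical variable holding the path to the cluster's properties file.
RECIPE="${RECIPE:-whirr.properties}"

# Start the cluster described by the properties file.
launch_cluster() {
    whirr launch-cluster --config "$RECIPE"
}

# Tear the cluster down once it is no longer needed.
destroy_cluster() {
    whirr destroy-cluster --config "$RECIPE"
}

# Typical session (not run automatically, since it creates billable instances):
#   launch_cluster
#   ... run Hadoop jobs ...
#   destroy_cluster
```

Wrapping the two commands in functions keeps the same properties file in use for both steps, so the cluster that gets destroyed is the one that was launched.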

Whirr supported services
1. Hadoop
2. Cassandra
3. ZooKeeper
4. HBase
5. Flume
6. Kafka
7. MongoDB

How to install Whirr on a local instance

Java is required to set up and install Whirr on any instance.
First, download Whirr from the Apache mirror site http://www.apache.org/dyn/closer.cgi/incubator/whirr/
Then extract the Whirr tarball:

$ tar -xzvf whirr-0.8.0.tar.gz
$ cd whirr-0.8.0

Set the PATH environment variable for Whirr:
$ export PATH=$PATH:/path/to/whirr/bin
To test whether Whirr is working:
$ whirr version
Apache Whirr 0.8.0
The above command displays the version of the installed Whirr package.
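
The installation steps above depend on two things being reachable from the shell: Java and the whirr launcher. A small check like the following can confirm both before going further. It is a sketch only; check_tool is a hypothetical helper name, not something shipped with Whirr.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it reports whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
        return 0
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# Java is a prerequisite; whirr should resolve once PATH has been extended.
check_tool java  || echo "Install Java before running Whirr."
check_tool whirr || echo "Add /path/to/whirr/bin to PATH."
```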
To configure a cloud provider, users have to write a whirr.properties file, which holds the roles and cluster information.

whirr.properties file

# Name of the cluster
whirr.cluster-name=<name of the cluster>
# Roles and services to run on each group of instances
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker

# Cloud provider to use
whirr.provider=<cloud provider>
# Access key ID of the cloud provider account
whirr.identity=<access key ID>
# Secret access key of the cloud provider account
whirr.credential=<secret access key>

# Key pair used to connect to the cloud provider instances
whirr.private-key-file=<path to private key file>
whirr.public-key-file=<path to public key file>
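
To make the template above concrete, here is one way it might be filled in for a two-node Hadoop cluster on Amazon EC2. All of the values are assumptions chosen for illustration: the cluster name, the aws-ec2 provider, credentials read from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, and an SSH key pair at ~/.ssh/id_rsa. Adjust them to your own account before use.

```shell
#!/bin/sh
# Write an example whirr.properties into the current directory.
# The heredoc delimiter is quoted so the shell leaves ${...} untouched;
# those placeholders are resolved by Whirr when it reads the file.
cat > whirr.properties <<'EOF'
whirr.cluster-name=myhadoopcluster
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${whirr.private-key-file}.pub
EOF

# Print the role layout as a quick sanity check of what was written.
grep '^whirr.instance-templates=' whirr.properties
```

With the file in place, the cluster would be started with whirr launch-cluster --config whirr.properties. Reading the credentials from environment variables keeps the secret key out of the file itself.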

That's my understanding of Apache Whirr. Please comment below if you have any questions.
