Apache Whirr basic tutorial explained

Apache Whirr is an open-source Java library for creating and setting up Hadoop clusters on different cloud services.

It also provides command-line tools to launch Hadoop services.

Whirr uses the jclouds API underneath to interact with different cloud providers.

Whirr Advantages

Apache Whirr provides the following advantages.

  • No need to write separate scripts for each cloud provider to deploy and run cloud services
  • Common API to interact with different cloud providers for provisioning
  • Install, configure, and deploy Hadoop clusters within minutes

If you look at the recipes folder of the Whirr software package, you can see that the following cloud providers and services are supported.

Whirr supported cloud providers:

  • Amazon cloud: Hadoop can be set up very easily on Amazon EC2 instances. Clusters can be launched dynamically and destroyed when no longer required
  • Rackspace cloud
  • OpenStack cloud

Whirr supported services:

  • Hadoop
  • Cassandra
  • ZooKeeper
  • HBase
  • Flume
  • Kafka
  • MongoDB

How to install Whirr on a local instance

Java is required to set up and install Whirr on any instance.
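
As a rough sketch, a shell check like the one below could confirm that a Java runtime is on the PATH before Whirr is installed. The function name check_java is hypothetical and used only for illustration.

```shell
# Report whether a Java runtime is available on the PATH.
# check_java is an illustrative helper, not part of Whirr.
check_java() {
  if command -v java >/dev/null 2>&1; then
    echo "java found: $(command -v java)"
  else
    echo "java not found; install a JDK before running whirr" >&2
    return 1
  fi
}

# Run the check without aborting the shell if Java is missing.
check_java || true
```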

First, download Whirr from an Apache mirror site. Then extract the Whirr tarball:

$ tar -xzvf whirr-0.8.0.tar.gz
$ cd whirr-0.8.0

Set the PATH environment variable for Whirr:

$ export PATH=$PATH:/path/to/whirr/bin

To test whether Whirr is working:

$ whirr version
Apache Whirr 0.8.0

The above command displays the version of the installed Whirr package.

To configure a cloud provider, users have to write a whirr.properties file that holds the roles and cluster information.

whirr.properties file

whirr.cluster-name=name of the cluster
# roles and services to run on each group of instances
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker

whirr.provider=name of the cloud provider
whirr.identity=access key ID of the cloud provider account
whirr.credential=secret access key of the cloud provider account

whirr.private-key-file=path to the SSH private key file used to connect to the instances
whirr.public-key-file=path to the matching SSH public key file
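
As an illustration, the template above might be filled in for Amazon EC2 roughly as follows. This is a sketch modeled on the recipes that ship with Whirr, not a verified production setup. The cluster name and key paths are placeholder assumptions, and the access keys are assumed to be exported beforehand as the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

```shell
# Write a sample whirr.properties for Amazon EC2.
# The quoted 'EOF' keeps the ${...} placeholders literal so that Whirr,
# not the shell, resolves them when it reads the file.
cat > whirr.properties <<'EOF'
whirr.cluster-name=myhadoopcluster
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
EOF

# Launch the cluster, then destroy it when it is no longer needed.
# Left commented out because these commands create billable instances.
# whirr launch-cluster --config whirr.properties
# whirr destroy-cluster --config whirr.properties
```

The cluster name myhadoopcluster is only an example. Keeping the access keys in environment variables rather than in the file avoids storing credentials in plain text next to the cluster definition.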

That’s my understanding of Apache Whirr. Please comment below if you have any questions.