Wednesday, June 9, 2010

Introduction to Hadoop

Hadoop is an Apache software framework that provides a distributed programming model for processing large amounts of data (terabytes, say) across a large cluster of nodes. Hadoop is popular for data warehouse workloads, that is, batch analytical (OLAP-style) processing rather than OLTP.

Hadoop Advantages :-

1. Open source framework built on Java.
2. It processes large chunks of data in parallel, in a short time.
3. Because the data is distributed across multiple nodes, we can easily add nodes as the data grows over time, and parallel processing continues without much overhead.
4. Highly scalable and highly fault-tolerant, even with thousands of nodes.

On top of Hadoop, there are several other frameworks:

1. MapReduce
2. Hive
3. Pig
4. HBase
5. Sqoop
6. Flume
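To give a feel for the MapReduce programming model mentioned above, here is a minimal sketch of a word count in plain Python. It does not use the Hadoop API and runs on a single machine. The function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and the loop over lines stands in for work that a real cluster would spread across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, roughly what happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    pairs = []
    # On a cluster, each node would map its own chunk of the data in parallel.
    for line in lines:
        pairs.extend(map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

The point of the split is that each map call only looks at its own line and each reduce call only looks at one key, so both steps can be run on many nodes at once without the pieces needing to know about each other.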
