What is Hive in Hadoop?



Hive is an open source framework written in Java and one of the components of the Hadoop ecosystem. It was developed by Facebook on top of the Hadoop HDFS system.

I have already blogged about the basics of HDFS in Basics of HDFS in Hadoop. Hive can be used to access data (files in HDFS) stored in the Hadoop Distributed File System, or data stored in HBase.
MapReduce is a Java framework for processing that data in parallel.

Hive can be used to analyze large amounts of data on Hadoop without knowing Java MapReduce programming.
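To give a feel for what Hive saves you from writing, here is a rough sketch of the map, shuffle and reduce steps behind a simple "count per group" job. It is plain Python running in one process, not real Hadoop code, and the `RECORDS` data and the `page_views` table named in the comment are made up for illustration.

```python
from collections import defaultdict

# Hypothetical page-view records standing in for the rows of a file in HDFS.
RECORDS = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/cart"),
    ("carol", "/home"),
]


def map_phase(records):
    """Emit one (page, 1) pair per record, as a mapper would."""
    for _user, page in records:
        yield page, 1


def shuffle_phase(pairs):
    """Group the emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values of each key, as a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def count_views(records):
    """Chain the three phases together."""
    return reduce_phase(shuffle_phase(map_phase(records)))


# In Hive the same intent is one query against an assumed table:
#   SELECT page, COUNT(*) FROM page_views GROUP BY page;
if __name__ == "__main__":
    print(count_views(RECORDS))
```

With Hive you only write the query in the comment; the splitting of the work into map and reduce steps is handled for you.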

Hive provides the Hive Query Language (HQL), which is similar to Structured Query Language (SQL). HQL covers the common query types, but with only minimal ANSI SQL support.
If we want features beyond what is built in, such as custom functions or custom aggregations, we have to write our own code, for example a user-defined function or a custom MapReduce script, that can be plugged into Hive.
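One way to plug custom logic into a query is Hive's TRANSFORM clause, which streams rows to an external script as tab-separated lines on standard input and reads the result rows back from standard output. Below is a minimal sketch of such a script. The file name `to_cents.py`, the `orders` table and its `name` and `amount` columns are assumptions made for this example only.

```python
import sys

# Assumed usage from Hive (table, columns and file name are hypothetical):
#   ADD FILE to_cents.py;
#   SELECT TRANSFORM(name, amount) USING 'python to_cents.py' AS (name, cents)
#   FROM orders;


def transform_line(line):
    """Convert one tab-separated row (name, amount) into (NAME, amount in cents)."""
    name, amount = line.rstrip("\n").split("\t")
    cents = int(round(float(amount) * 100))
    return f"{name.upper()}\t{cents}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read rows from stdin and write one transformed row per non-empty input line."""
    for line in stdin:
        if line.strip():
            stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

You can try the script without Hive by piping a tab-separated line into it. For logic that is reused in many queries, a user-defined function written in Java is the other common route.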

How to Execute Hive Queries:-
Hive provides a command line interface, the Hive shell, for executing Hive queries. You can also write the queries in a shell script and call that script. Hive turns these queries into MapReduce jobs, which query and process the data.
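If you prefer to call the Hive command line from a program instead of a shell script, the idea looks roughly like this. The sketch relies on the `hive -e` (inline query) and `hive -f` (query file) options of the Hive CLI. The helper names `build_hive_command` and `run_hive`, and the `page_views` table, are made up for this example, and `run_hive` only works on a machine where a `hive` client is installed.

```python
import shutil
import subprocess


def build_hive_command(query=None, script_path=None):
    """Return the Hive CLI argument list: -e for an inline query, -f for a script file."""
    if (query is None) == (script_path is None):
        raise ValueError("pass exactly one of query or script_path")
    if query is not None:
        return ["hive", "-e", query]
    return ["hive", "-f", script_path]


def run_hive(command):
    """Run the command and return its standard output (needs a hive client on PATH)."""
    if shutil.which(command[0]) is None:
        raise RuntimeError("hive executable not found on PATH")
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Only build and show the command here; call run_hive(cmd) on a Hadoop client machine.
    cmd = build_hive_command(query="SELECT page, COUNT(*) FROM page_views GROUP BY page")
    print(" ".join(cmd))
```

Passing the query as a separate list item, rather than building one long shell string, avoids quoting problems when the query itself contains quotes.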

Hive Advantages:-
1. Hive is built on Hadoop, so it inherits the capabilities Hadoop provides, such as reliability, high availability, tolerance of node failures and the ability to run on commodity hardware.
2. Database developers do not need to learn Java programming to write MapReduce programs for retrieving data from the Hadoop system.

Hive Disadvantages:-
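1. Hive is not for OLTP processing. It is meant for batch analysis of large data sets, not for low-latency transactional reads and writes.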

This post has been a very basic start to exploring what Hive in Hadoop is. Hopefully you have enough information to get started.

If you have any questions, please feel free to leave a comment and I will get back to you.

