Difference between Hive and HBase
Hive | HBase |
---|---|
Hive is a query engine. | HBase is a data store, particularly suited to unstructured data. |
Apache Hive is mainly used for batch processing, i.e. OLAP. | HBase is used extensively for transactional processing where queries need highly interactive response times, i.e. OLTP. |
Hive operations are translated into MapReduce jobs. | HBase operations run in real time on the database. |
For big data applications that require complex, fine-grained processing, hand-written Hadoop MapReduce is a better choice than Hive. | HBase should be used when the data model schema is sparse. |
Hive is used for data warehousing requirements where programmers do not want to write complex MapReduce code. | HBase is an ideal big data solution if the application requires random reads, random writes, or both. |
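Hive supports UPDATE statements only on ACID transactional tables (added in Hive 0.14); earlier versions do not support updates at all. | HBase has no SQL-like query language of its own; data is accessed through its shell commands and client API (get, put, scan), which need to be learned. |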
Hive does not provide interactive querying; it only runs batch processes on Hadoop. | Apache HBase is a NoSQL key/value store that runs on top of HDFS. |
Hive is limited by high query latency. | HBase does not have analytical capabilities. |
Hive is for analytical queries. | HBase is for real-time querying. |
Hive is used for analytical querying of data collected over a period of time. It should not be used for real-time querying. | HBase is well suited to real-time workloads; for example, Facebook uses it for messaging and real-time analytics, and may even use it to count Facebook likes. |
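
To make the contrast in access patterns concrete, here is a minimal Python sketch. It is a toy in-memory model, not the real Hive or HBase API: the `SparseTable` class, the `likes_per_country` function, and the column names (`info:country`, `stats:likes`) are all hypothetical and exist only for illustration. The keyed `put`/`get` calls stand in for HBase-style random reads and writes on a sparse schema, while the full-scan aggregation stands in for the kind of batch query Hive would run.

```python
from collections import defaultdict


class SparseTable:
    """Toy stand-in for an HBase-style table: row key -> {column: value}."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        # Random write: only the addressed cell is touched.
        self._rows[row_key][column] = value

    def get(self, row_key):
        # Random read: a single keyed lookup, no scan of other rows.
        return dict(self._rows.get(row_key, {}))

    def scan(self):
        # Full scan: what a batch analytical job has to do.
        return self._rows.items()


def likes_per_country(table):
    """Batch aggregation over every row, roughly what a Hive query such as
    SELECT country, SUM(likes) ... GROUP BY country would compute."""
    totals = defaultdict(int)
    for _, cells in table.scan():
        if "info:country" in cells:
            totals[cells["info:country"]] += cells.get("stats:likes", 0)
    return dict(totals)


table = SparseTable()
table.put("user1", "info:country", "US")
table.put("user1", "stats:likes", 3)
table.put("user2", "info:country", "DE")
table.put("user2", "stats:likes", 5)
table.put("user3", "info:country", "US")  # sparse row: no likes column stored

table.put("user1", "stats:likes", 4)      # real-time style update of one cell

print(table.get("user1"))        # keyed lookup of a single row
print(likes_per_country(table))  # aggregation over the whole table
```

The keyed lookup touches one row regardless of table size, whereas the aggregation has to visit every row. That is the underlying reason the table above pairs HBase with real-time access and Hive with high-latency analytical queries.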