Hive
- Hive is a component of the Hortonworks Data Platform (HDP).
- Hive provides a SQL interface to data stored in HDP.
- Hive has three main functions:
  - Data summarization
  - Query
  - Analysis
- It supports queries expressed in a language called HiveQL. Hive automatically translates these SQL-like queries into MapReduce jobs that are executed on Hadoop.
- It also enables data serialization and deserialization, and increases flexibility in schema design by including a system catalog called the Hive Metastore.
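The translation of a SQL-like query into a MapReduce job, mentioned above, can be pictured with a small sketch. This is only a conceptual model in plain Python, not Hive's actual query planner: the `sales` table, its columns, and all function names are hypothetical and chosen for illustration. It shows how a query such as `SELECT region, SUM(amount) FROM sales GROUP BY region` could be split into map, shuffle, and reduce phases.

```python
from collections import defaultdict

# Hypothetical sample rows of a table sales(region, amount).
SALES = [
    ("east", 10),
    ("west", 5),
    ("east", 7),
    ("west", 3),
    ("north", 4),
]


def map_phase(rows):
    """Emit one (key, value) pair per row, keyed by the GROUP BY column."""
    for region, amount in rows:
        yield region, amount


def shuffle_phase(pairs):
    """Collect all values that share a key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Apply the aggregate (here SUM) to the values of each key."""
    return {key: sum(values) for key, values in grouped.items()}


def total_by_region(rows):
    # Conceptually: SELECT region, SUM(amount) FROM sales GROUP BY region
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(total_by_region(SALES))
```

On a real cluster the map and reduce phases run in parallel across many nodes; the sketch only mirrors the data flow on a single machine.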
Architecture of Hive:
Features of Hive:
- Support for different storage types such as plain text, RCFile, ORC, HBase, and others.
- Built-in user-defined functions (UDFs) to manipulate dates, strings, and other data, as well as other data-mining tools.
- Hive supports extending the UDF set to handle use-cases not supported by built-in functions.
Limitations of Hive:
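- Hive supports overwriting or appending data, but traditionally not row-level updates and deletes (later versions, from 0.14, add these for transactional tables).
- In Hive, subquery support is limited: subqueries are accepted in the FROM clause, and only certain forms are accepted in the WHERE clause (from version 0.13).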
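Where row-level updates are not available, a common workaround is to rewrite the whole table with the changed rows. The sketch below models only the semantics of appending, overwriting, and an emulated update in plain Python; it does not call any Hive API, and the function names and the `employees` data are hypothetical.

```python
def insert_into(table, new_rows):
    """Append: existing rows are kept and new rows are added."""
    return table + list(new_rows)


def insert_overwrite(table, new_rows):
    """Overwrite: the previous contents are discarded entirely."""
    return list(new_rows)


def emulate_update(table, predicate, change):
    """Emulate a row-level update by rewriting every row of the table."""
    rewritten = [change(row) if predicate(row) else row for row in table]
    return insert_overwrite(table, rewritten)


if __name__ == "__main__":
    employees = [{"id": 1, "dept": "ops"}, {"id": 2, "dept": "dev"}]
    employees = insert_into(employees, [{"id": 3, "dept": "dev"}])
    # Move employee 2 to another department by rewriting the table.
    employees = emulate_update(
        employees,
        lambda row: row["id"] == 2,
        lambda row: {**row, "dept": "data"},
    )
    print(employees)
```

The cost of this approach grows with the size of the table rather than with the number of changed rows, which is why it suits batch-style workloads better than frequent small changes.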