Difference between Pig and Hive
Pig | Hive |
---|---|
Pig is generally used by researchers and programmers. | Hive is mainly used by data analysts. |
Pig is suited to semi-structured data. | Hive is suited to fully structured data. |
Pig has a procedural data flow language (Pig Latin). | Hive has a declarative, SQL-like query language (HiveQL). |
Pig cannot start a Thrift-based server. | Hive can start an optional Thrift-based server, so clients anywhere can send queries directly to the Hive server, which executes them. |
Pig Latin is somewhat SQL-like, but it differs from SQL to a great extent and takes some time and effort to master. | HiveQL borrows directly from SQL, so anyone with SQL expertise can learn it easily. |
Pig supports the Avro file format. | Early Hive releases lacked Avro support; later releases handle Avro through the AvroSerDe. |
Pig operates on the client side of a cluster. | Hive operates on the server side of a cluster. |
Pig is widely used for programming. | Hive is mainly used for generating reports. |
Pig is a strong ETL tool for big data because of its powerful transformation and processing capabilities. | Hive is also useful for ETL (Extract, Transform, and Load). |
Pig has no dedicated metadata database; schemas and data types are defined in the script itself. | Hive uses a variant of SQL DDL: tables are defined beforehand, and the schema details are stored in a local database. |
Pig has no notion of partitions, though a similar effect can be achieved with filters. | Hive supports partitions, so you can process a subset of the data, for example by date or in alphabetical order. |
Pig shows users sample data for each scenario and each step of a script through its ILLUSTRATE operator. | Hive has no equivalent feature. |
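
The procedural versus declarative distinction in the table can be made concrete with a small sketch. It is written in plain Python rather than Pig Latin or HiveQL, and it is only an analogy: the step-by-step function mimics how a Pig Latin script names each intermediate relation, while the second function hands a single SQL statement to SQLite (standing in for Hive) and lets the engine decide how to run it. The sample rows, the `logs` table, and both function names are hypothetical and chosen purely for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample records: (username, bytes_transferred).
ROWS = [("ann", 120), ("bob", 40), ("ann", 300), ("cid", 150), ("bob", 500)]


def procedural_totals(rows, min_bytes):
    """Data-flow style: every step produces a named intermediate result.

    Roughly analogous to a Pig Latin script of the form
    LOAD -> FILTER -> GROUP -> FOREACH ... GENERATE.
    """
    loaded = list(rows)                                  # load step
    filtered = [r for r in loaded if r[1] > min_bytes]   # filter step
    grouped = defaultdict(list)                          # group step
    for username, nbytes in filtered:
        grouped[username].append(nbytes)
    # Aggregate step: one total per group.
    return {username: sum(vals) for username, vals in grouped.items()}


def declarative_totals(rows, min_bytes):
    """Declarative style: state the result wanted, not the steps.

    SQLite is only a stand-in here; HiveQL accepts a similar
    SELECT ... WHERE ... GROUP BY statement.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE logs (username TEXT, bytes INTEGER)")
        conn.executemany("INSERT INTO logs VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT username, SUM(bytes) FROM logs "
            "WHERE bytes > ? GROUP BY username",
            (min_bytes,),
        )
        return dict(cursor)
    finally:
        conn.close()


if __name__ == "__main__":
    print(procedural_totals(ROWS, 100))
    print(declarative_totals(ROWS, 100))
```

Both functions return the same totals per user. The difference is in who controls the execution order: the author of the script in the first case, the query engine in the second.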