Difference between Pig, Hive and Hadoop
Pig | Hive | Hadoop |
---|---|---|
Apache Pig is a high-level platform for creating programs that run on Apache Hadoop. | Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data query and analysis. | Apache Hadoop is an open-source software framework for distributed storage and processing of big data sets using the MapReduce programming model. |
Pig Latin abstracts the programming away from the Java MapReduce idiom into a notation that makes MapReduce programming high level. | Hive provides the SQL abstraction needed to integrate SQL-like queries (HiveQL) into the underlying Java, without the need to implement queries in the low-level Java API. | The Hadoop Common package contains the Java ARchive (JAR) files and scripts needed to start Hadoop. |
Pig has two parts: the Pig interpreter and the language, Pig Latin. | Hive provides data warehousing facilities on top of an existing Hadoop cluster. | Hadoop is basically two things: the Hadoop Distributed File System (HDFS) and a computation or processing framework (MapReduce). |
Pig makes life a lot easier, because writing raw MapReduce jobs is rarely easy. | Users can create tables in Hive and store data there. They can also map their existing HBase tables to Hive and operate on them. | Hadoop itself is not a key/value store. HBase, the distributed, scalable big data store that runs on top of HDFS, is modelled after Google’s Bigtable and stores data as key/value pairs. |
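
To make the MapReduce programming model mentioned in the table more concrete, here is a minimal sketch of a word count in plain Python. It does not use Hadoop at all: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration, and the three phases run in a single process rather than across a cluster. It only shows the shape of the work that Pig Latin and HiveQL let you express at a higher level.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as a framework would do
    between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files stored in HDFS.
    sample = ["pig hive hadoop", "hive runs on hadoop", "pig runs on hadoop"]
    print(word_count(sample))
```

Even for a task this small, the map, shuffle and reduce steps have to be spelled out by hand. That boilerplate is what a Pig Latin script or a HiveQL query hides from the user.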