Apache Hive vs. Impala
Apache Hive | Impala |
---|---|
Hive generates query expressions at compile time and does not generate runtime code for “big loops”. | Impala generates runtime code for “big loops” using LLVM. |
Hadoop 2.7.3 | Hadoop 2.6.0 |
All queries run through LLAP | Runtime Filtering Optimization Enabled |
ORCFile format with zlib compression | Parquet format with snappy compression |
Every Hive query incurs a “cold start” overhead. | Impala avoids startup overhead because its daemon processes are started at boot time and are always ready to process a query. |
Apache Hive might not be ideal for interactive computing. | Impala is meant for interactive computing. |
Hive is batch-oriented and built on Hadoop MapReduce. | Impala is more like an MPP database. |
Hive supports complex types. | Impala does not support complex types. |
Apache Hive is fault tolerant. | Impala does not support fault tolerance. |
Hive is the more universal, versatile, and pluggable option. | Impala is used for its raw processing power and very fast analytic results. |
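
The table's point about runtime code generation for “big loops” can be illustrated with a toy sketch. This is only an analogy in Python, not how either engine is implemented: Impala emits native code through LLVM, whereas the sketch below just builds a specialized Python function once and reuses it for every row. The expression tree, the column names (`price`, `qty`), and the helper names (`interpret`, `to_source`, `generate`) are all hypothetical.

```python
# Toy analogy only: neither Hive nor Impala operates on Python objects.
# Expression tree for the hypothetical predicate: (price * qty) > 100
EXPR = (">", ("*", ("col", "price"), ("col", "qty")), ("const", 100))

OPS = {">": lambda a, b: a > b, "*": lambda a, b: a * b}


def interpret(node, row):
    """Generic path: walk the whole expression tree again for every row."""
    kind = node[0]
    if kind == "col":
        return row[node[1]]
    if kind == "const":
        return node[1]
    return OPS[kind](interpret(node[1], row), interpret(node[2], row))


def to_source(node):
    """Turn the tree into Python source text for this one expression."""
    kind = node[0]
    if kind == "col":
        return f"row[{node[1]!r}]"
    if kind == "const":
        return repr(node[1])
    return f"({to_source(node[1])} {kind} {to_source(node[2])})"


def generate(node):
    """Codegen path: build a specialized function once, before the loop."""
    source = f"lambda row: {to_source(node)}"
    return eval(compile(source, "<generated>", "eval"))


rows = [
    {"price": 30, "qty": 2},
    {"price": 60, "qty": 2},
    {"price": 10, "qty": 20},
]

predicate = generate(EXPR)  # paid once per query
interpreted = [r for r in rows if interpret(EXPR, r)]  # tree walk per row
generated = [r for r in rows if predicate(r)]  # direct call per row
```

Both paths select the same rows; the difference is that the generated function skips the per-row tree walk and dispatch, which is the kind of overhead that matters when the loop runs over very many rows.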