Difference between Impala and Apache Hive
Impala | Apache Hive |
---|---|
Impala performs runtime code generation for “big loops” using LLVM. | Apache Hive generates query expressions at compile time. |
Hadoop 2.6.0 | Hadoop 2.7.3 |
Runtime Filtering Optimization Enabled | All queries run through LLAP |
Parquet format with Snappy compression | ORC file format with zlib compression |
Impala avoids startup overhead because its daemon processes are started at boot time and are always ready to process a query. | Every Hive query incurs a “cold start” overhead. |
Impala is meant for interactive computing. | Apache Hive might not be ideal for interactive computing. |
Impala is more like an MPP database. | Hive is batch-oriented and based on Hadoop MapReduce. |
Impala does not support complex types. | Hive supports complex types. |
Impala does not support fault tolerance. | Apache Hive is fault tolerant. |
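
The storage-format row above (Parquet with Snappy for Impala, ORC with zlib for Hive) can be made concrete with a small sketch. The Python below only builds the SQL text and does not connect to any cluster. The table names, column names, and helper functions are hypothetical, and the statements are an assumption about how such tables are commonly declared (Impala's `COMPRESSION_CODEC` query option and Hive's `orc.compress` table property), not a record of how the compared systems were actually configured.

```python
# Illustrative sketch: helper, table, and column names are hypothetical.

def columns_sql(columns):
    """Render [(name, type), ...] as a SQL column list."""
    return ", ".join(f"{name} {sql_type}" for name, sql_type in columns)


def impala_parquet_ddl(table, columns):
    """Impala statements: select Snappy for Parquet writes, then create the table."""
    return [
        "SET COMPRESSION_CODEC=snappy",
        f"CREATE TABLE {table} ({columns_sql(columns)}) STORED AS PARQUET",
    ]


def hive_orc_ddl(table, columns):
    """Hive statement: create an ORC table compressed with zlib."""
    return [
        f"CREATE TABLE {table} ({columns_sql(columns)}) "
        "STORED AS ORC TBLPROPERTIES ('orc.compress'='ZLIB')"
    ]


if __name__ == "__main__":
    cols = [("id", "BIGINT"), ("name", "STRING")]
    statements = impala_parquet_ddl("events_parquet", cols) + hive_orc_ddl("events_orc", cols)
    for stmt in statements:
        print(stmt + ";")
```

The Impala statements would be run through an Impala client such as `impala-shell`, and the Hive statement through a Hive client such as `beeline`. The two dialects are kept in separate helpers because the compression setting lives in a session option on one side and a table property on the other.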