Best tool to process streaming web data on Hadoop, Pig or Hive:
- Pig is easy to program.
- It makes it trivial to achieve parallel execution of simple, “embarrassingly parallel” data analysis tasks.
- Complex tasks made up of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain.
- Pig can optimize the execution of these tasks automatically, allowing the user to focus on semantics rather than efficiency.
- Pig is extensible: users can create their own functions for special-purpose processing.
- Pig and Hive can be used together for efficient processing.
- Pig is the best tool for parsing (ETL) jobs, and it supports UDFs better than Hive does.
- You can build your own framework on top of Pig and call HiveQL and MapReduce jobs from it for additional functionality.
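
To make the "data flow sequence" and "own functions" points above more concrete, here is a rough sketch in plain Python (not Pig Latin, and not run on Hadoop). It only mimics the shape of a Pig-style pipeline: filter, apply a user-defined function, then group and count. The sample records and the names `RECORDS`, `extract_host` and `hits_per_host` are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical sample of web log records: (url, http_status)
RECORDS = [
    ("http://example.com/a", 200),
    ("http://example.com/b", 404),
    ("http://example.org/c", 200),
    ("http://example.com/d", 200),
]


def extract_host(url):
    """Plays the role of a user-defined function (UDF): pull the host out of a URL."""
    return urlparse(url).netloc


def hits_per_host(records):
    """Each step corresponds loosely to one statement in a data flow script."""
    # Step 1 (like FILTER): keep only successful requests
    ok = (rec for rec in records if rec[1] == 200)
    # Step 2 (like FOREACH ... GENERATE): apply the UDF to every record
    hosts = (extract_host(url) for url, _status in ok)
    # Step 3 (like GROUP + COUNT): count records per host
    return dict(Counter(hosts))


if __name__ == "__main__":
    print(hits_per_host(RECORDS))
```

Each step takes the output of the previous one and none of them says how the work is executed. That separation is what lets a system like Pig decide on parallelism and optimization for you.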