[Solved-1 Solution] Processing JSON through Pig Scripts?
JSON:
- Each Pig tuple is stored on one line (as one value for TextOutputFormat) so that it can be read easily using TextInputFormat. Pig tuples are mapped to JSON objects.
- Pig bags are mapped to JSON arrays. Pig maps are also mapped to JSON objects. Maps are assumed to be string-to-string. A schema is stored in a side file to handle the mapping between JSON and Pig types. The schema file shares the same format as the one used in PigStorage.
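The one-record-per-line layout described above can be illustrated with a small sketch. This is plain Python standing in for the data shape, not Pig's own storage code; the records and the field names (`name`, `tags`, `props`) are hypothetical, and the side schema file is not modeled.

```python
import json

# Hypothetical records standing in for Pig tuples (field names are invented).
# "tags" plays the role of a bag (JSON array); "props" plays the role of a
# string-to-string map (JSON object).
records = [
    {"name": "alice", "tags": [{"tag": "a"}, {"tag": "b"}], "props": {"k": "v"}},
    {"name": "bob", "tags": [], "props": {}},
]


def to_json_lines(rows):
    """Serialize each record as exactly one JSON object per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


def from_json_lines(text):
    """Read the records back, one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


text = to_json_lines(records)
round_tripped = from_json_lines(text)
```

Because every record occupies a single line, a line-oriented reader can split the input on newlines without parsing the JSON first, which is the point of pairing the format with TextInputFormat.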
Problem:
- Suppose you have recently started working with JSON files and processing the data with Pig scripts, and you came across PiggyBank, which looked useful for loading and processing JSON files in Pig scripts.
- Here is a simple Pig script and the corresponding JSON:
The following exception was thrown at runtime:
Pig log file:
Please give the correct solution.
Solution 1:
We can handle nested JSON loading with Twitter's Elephant Bird.
- This parses the JSON into a map schema; a JSONArray is parsed into a DataBag of maps.
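To make the resulting shape concrete, here is a rough Python sketch of what "a map whose array values become a bag of maps" looks like. It does not use Elephant Bird or Pig; the input line and its field names (`user`, `orders`, `id`, `item`) are hypothetical, and the final step only approximates what flattening the bag in a Pig script would produce.

```python
import json

# Hypothetical nested record; the field names are invented for illustration.
line = (
    '{"user": "alice", '
    '"orders": [{"id": "1", "item": "book"}, {"id": "2", "item": "pen"}]}'
)

record = json.loads(line)   # top-level JSON object -> map-like dict
orders = record["orders"]   # JSON array -> list of dicts, analogous to a bag of maps

# Approximation of flattening the bag: one output row per array element,
# with the outer field repeated on every row.
rows = [(record["user"], order["id"], order["item"]) for order in orders]
```

Each element of the array stays a map, so nested fields are looked up by key after loading rather than being declared as typed columns up front.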