[Solved-1 Solution] How to Get Pig to Work with LZO Files?
Problem:
How can I get Pig to work with LZO files?
Solution 1:
- You can use this:
1. Clone hadoop-lzo from GitHub.
2. Compile it to get a hadoop-lzo*.jar and the native *.so libraries. You'll need to compile this on a 64-bit machine.
3. Copy the native libs to
4. Copy the java jar to
5. Then configure Hadoop and Pig so that the java.library.path property points to the LZO native libraries.
6. We can do this in
with:
7. Now start the Grunt shell by running pig again, and make sure everything still works.
8. All we need to do now is install elephant-bird.
9. Run ant in the elephant-bird folder to build a jar.
10. For simplicity's sake, move all relevant jars (hadoop-lzo-x.x.x.jar and elephant-bird-x.x.x.jar)
11. Play around with loading normal files and LZO files in the Grunt shell. Register the relevant jars mentioned above, try loading a file, limit the output to a manageable number of rows, and dump it. This should all work fine whether you're using a normal text file or an LZO file.
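
The last step can be sketched as a small shell script that writes a Pig script and runs it in local mode. This is only an illustration under stated assumptions: the jar locations, the input file test.lzo, and the script name load_lzo.pig are hypothetical placeholders, and the loader class com.twitter.elephantbird.pig.load.LzoTextLoader is assumed to be present in the elephant-bird jar you built.

```shell
#!/bin/sh
# Hypothetical locations: point these at wherever you put the jars in step 10.
HADOOP_LZO_JAR="${HADOOP_LZO_JAR:-./lib/hadoop-lzo-x.x.x.jar}"
ELEPHANT_BIRD_JAR="${ELEPHANT_BIRD_JAR:-./lib/elephant-bird-x.x.x.jar}"
INPUT="${INPUT:-test.lzo}"        # hypothetical LZO-compressed text file
SCRIPT="${SCRIPT:-load_lzo.pig}"  # hypothetical name for the generated script

# Write a Pig script that registers both jars, loads the file,
# keeps only 10 rows, and dumps them to the console.
cat > "$SCRIPT" <<EOF
REGISTER $HADOOP_LZO_JAR;
REGISTER $ELEPHANT_BIRD_JAR;
-- LzoTextLoader is assumed to exist in your elephant-bird build.
raw = LOAD '$INPUT' USING com.twitter.elephantbird.pig.load.LzoTextLoader();
few = LIMIT raw 10;
DUMP few;
EOF

# Run the script in local mode if pig is on the PATH; otherwise just report.
if command -v pig >/dev/null 2>&1; then
    pig -x local "$SCRIPT"
else
    echo "pig not found; wrote $SCRIPT"
fi
```

To try the same thing with a normal text file, drop the USING clause so that Pig falls back to its default loader, PigStorage. The statements in the generated script can also be typed one at a time in the Grunt shell.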