[Solved-3 Solutions] How to incorporate the current input filename into my Pig Latin script?
What is PigStorage()?
- The PigStorage() function loads and stores data as structured text files. It takes as a parameter the delimiter that separates the fields of a tuple. By default, the delimiter is '\t'.
Syntax
- Given below is the syntax of the PigStorage() function.
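As a rough sketch, PigStorage() accepts an optional field delimiter and an optional options string. The file names and schema below are hypothetical and are only there to make the call concrete.

```pig
-- General form (both arguments are optional):
--   PigStorage([field_delimiter], ['options'])

-- Hypothetical path and schema, for illustration only.
students = LOAD 'student_data.txt' USING PigStorage(',')
           AS (id:int, name:chararray, city:chararray);

-- Store the relation again; with no argument the delimiter is a tab.
STORE students INTO 'student_out' USING PigStorage();
```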
Problem:
How to incorporate the current input filename into my Pig Latin script?
Solution 1:
We can pass the -tagsource option to PigStorage.
The first field in each tuple will then contain the input path (INPUT_FILE_NAME).
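A minimal sketch of this approach might look like the following. The path 'input', the comma delimiter, and the alias src_file are assumptions for illustration; fields are referenced by position so the example does not depend on a declared schema.

```pig
-- 'input' is a hypothetical path; replace it with your own file or directory.
A = LOAD 'input' USING PigStorage(',', '-tagsource');

-- $0 is the field added by -tagsource; $1 onward are the original columns.
B = FOREACH A GENERATE $0 AS src_file, $1;

DUMP B;
```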
Solution 2:
- The Pig wiki has an example of PigStorageWithInputPath, a loader that adds the filename as an additional chararray field:
Example:
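Usage of such a loader could be sketched as below. This assumes the PigStorageWithInputPath class has been compiled and packaged into a jar (the name myudfs.jar is hypothetical), that it lives in the default package, and that it has a no-argument constructor; check these against the actual UDF source before relying on them.

```pig
-- Hypothetical jar containing the compiled PigStorageWithInputPath class.
REGISTER myudfs.jar;

-- Hypothetical input directory; no schema is declared here because the
-- position of the extra filename field depends on the UDF implementation.
A = LOAD '/directory/of/files/*' USING PigStorageWithInputPath();

-- Inspect a few tuples to see where the filename field appears.
B = LIMIT A 5;
DUMP B;
```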
UDF
Solution 3:
- -tagsource is deprecated in Pig 0.12.0. Instead, use one of the following options:
- -tagFile - Prepends the input source file name to the beginning of each tuple.
- -tagPath - Prepends the input source file path to the beginning of each tuple.
With -tagPath, the full file path is returned as the first column.
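Both options can be sketched as follows, assuming Pig 0.12.0 or later. The path '/user/myFile.TXT' and the comma delimiter are placeholders for illustration.

```pig
-- Hypothetical input path; replace with your own.
-- -tagPath: the first column of each tuple is the full path of the source file.
A = LOAD '/user/myFile.TXT' USING PigStorage(',', '-tagPath');
DUMP A;

-- -tagFile: the first column of each tuple is only the source file name.
B = LOAD '/user/myFile.TXT' USING PigStorage(',', '-tagFile');
DUMP B;
```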