[Solved-2 Solutions] Pig problem with split string (STRSPLIT)?
What is split string
- This function is used to split a given string by a given delimiter.
Syntax
- The syntax of STRSPLIT() is: STRSPLIT(string, regex, limit). The function accepts the string to be split, a regular expression, and an integer limit (the maximum number of substrings the string should be split into).
- The function parses the string and splits it wherever the regular expression matches, producing at most n substrings, where n is the value passed to limit.
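As a minimal sketch of the syntax above, the following Pig Latin splits a name field on an underscore into at most two parts. The file name names.txt, the relation names, and the field name are hypothetical and only serve the illustration.

```pig
-- Hypothetical input: one underscore-separated name per line, e.g. Robin_Smith
names = LOAD 'names.txt' AS (name:chararray);

-- STRSPLIT(string, regex, limit): split on an underscore into at most 2 substrings
parts = FOREACH names GENERATE name, STRSPLIT(name, '_', 2);

DUMP parts;
```

With a limit of 2, a value such as Robin_Smith is split into the tuple (Robin,Smith); a value with more underscores keeps everything after the first one in the second element.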
Problem:
- Given a relation H1, we want to use STRSPLIT to split its first field ($0) on a semicolon into a tuple, but the statement always fails with an error message.
Is there any solution?
Solution 1:
- There is an escaping problem in the Pig parsing routines when they encounter the semicolon inside the delimiter string.
- We can use a Unicode escape sequence for the semicolon instead: \u003B.
- However, the backslash must itself be slash-escaped, and the sequence must be placed in a single-quoted string: '\\u003B'.
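Applied to the problem above, the escaped delimiter might look like the sketch below. It assumes H1 is loaded from a hypothetical file input.txt with a single chararray field holding values separated by semicolons; the limit of 3 is likewise an assumption.

```pig
-- Assumption: each line of input.txt holds three values separated by semicolons
H1 = LOAD 'input.txt' AS (item:chararray);

-- The delimiter is the Unicode escape for a semicolon with its backslash doubled,
-- so the quoted string contains no literal semicolon for the parser to trip over
B = FOREACH H1 GENERATE STRSPLIT($0, '\\u003B', 3);

DUMP B;
```

Because the delimiter is treated as a regular expression, the escape is resolved to a semicolon only when the split is performed, after the statement has already been parsed.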
Solution 2:
- STRSPLIT on a semicolon is tricky. A reported workaround is to place the STRSPLIT call, semicolon delimiter included, inside a nested FOREACH block.
- This is how we originally implemented the STRSPLIT() command.
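The workaround is commonly written as a nested FOREACH block, sketched below, where the split is assigned to an alias inside the braces. The input file, relation, and field names are hypothetical, and whether a literal semicolon is accepted here may depend on the Pig version, so treat this as something to try rather than a guaranteed fix.

```pig
-- Assumption: each line of input.txt holds values separated by semicolons
H1 = LOAD 'input.txt' AS (item:chararray);

-- The STRSPLIT call sits inside the nested block rather than directly after GENERATE
B = FOREACH H1 {
    parts = STRSPLIT(item, ';', 3);
    GENERATE parts;
};

DUMP B;
```

If the literal semicolon is still rejected, the escaped form '\\u003B' from Solution 1 can be used inside the block as well.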