[Solved-1 Solution] Pig Order By Query ?
What is Order By ?
- The ORDER BY operator is used to display the contents of a relation in sorted order based on one or more fields.
Syntax
- Given below is the syntax of the ORDER BY operator.
grunt> Relation_name2 = ORDER Relation_name1 BY field_name (ASC|DESC);
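- As a minimal sketch of this syntax, the script below sorts a relation on two fields. The file name `data.txt`, the comma delimiter, and the field names `f1`, `f2` and `n` are assumptions made for illustration; they are not part of the original question.

```pig
-- Hypothetical input: a comma-delimited file with lines such as k1,k4,10
A = LOAD 'data.txt' USING PigStorage(',') AS (f1:chararray, f2:chararray, n:int);

-- Sort on f2 ascending, then on n descending among rows with the same f2
sorted = ORDER A BY f2 ASC, n DESC;

DUMP sorted;
```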
Problem:
grunt> dump jn;
(k1,k4,10)
(k1,k5,15)
(k2,k4,9)
(k3,k4,16)
grunt> jn = group jn by $1;
grunt> dump jn;
- From this grouped relation, we want the following output:
(k4,{(k3,k4,16),(k1,k4,10)})
(k5,{(k1,k5,15)})
- Basically, we want to sort the tuples in each group on the numbers (10, 9 and 16 for k4) in descending order and keep the top 2 for every group.
How can we do it?
Solution 1:
- We can use a nested FOREACH with ORDER BY, as in the code below:
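A = LOAD 'data';
jn = GROUP A BY $1;
B = FOREACH jn {
    -- sort each group's tuples on the third field, largest first
    sorted = ORDER A BY $2 DESC;
    -- keep only the top 2 tuples of the group
    lim = LIMIT sorted 2;
    -- emit the group key together with the limited bag
    GENERATE group, lim;
};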
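- The same idea can be sketched end to end with a declared schema, so that the third field is compared as a number (fields loaded without a schema default to bytearray in Pig). This variant sorts in descending order and emits the group key so that the result has the shape of the desired output. The file name, the comma delimiter, and the field names are assumptions for illustration only.

```pig
-- Hypothetical comma-delimited input; the field names are illustrative
A = LOAD 'data.txt' USING PigStorage(',') AS (f1:chararray, f2:chararray, n:int);
jn = GROUP A BY f2;
top2 = FOREACH jn {
    sorted = ORDER A BY n DESC;   -- largest n first within each group
    lim = LIMIT sorted 2;         -- keep at most 2 tuples per group
    GENERATE group, lim;          -- group key plus the limited bag
};
DUMP top2;
```

- For the sample tuples in the problem, the k4 group should keep (k3,k4,16) and (k1,k4,10), and the k5 group its single tuple (k1,k5,15).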