[Solved-2 Solutions] How to use the Apache Pig RANK function?
What is the RANK function?
- RANK, available since Pig 0.11.0, assigns a rank to every tuple in a relation and prepends it as a new first field.
Syntax:
ranked = RANK input [BY [COL [ASC|DESC]]] [DENSE];
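To make the optional parts of the syntax concrete, here is a minimal sketch of the three common forms. The input file 'data' and its (id, rating) schema are assumptions chosen for illustration, as are the alias names.

```pig
-- Hypothetical comma-separated input with lines such as: a,5
A = LOAD 'data' USING PigStorage(',') AS (id:chararray, rating:int);

-- No BY clause: each tuple simply gets a sequential number.
numbered = RANK A;

-- BY clause: tuples with the same rating share a rank,
-- and the ranks that follow a tie skip ahead (1, 2, 2, 4).
ranked = RANK A BY rating DESC;

-- DENSE: ties still share a rank, but no ranks are skipped (1, 2, 2, 3).
dense_ranked = RANK A BY rating DESC DENSE;

DUMP ranked;
DUMP dense_ranked;
```

In all three cases the rank is computed over the whole relation, not within groups.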
Problem:
How do you use the Apache Pig RANK function?
Solution 1:
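- We can group the data by id, sort each group by rating, and then use the DataFu UDF Enumerate to append an index to each tuple of the bags. The index acts as the rank within each id.
register datafu-1.1.0.jar;
-- '1' makes the appended index start at 1 instead of 0
define Enumerate datafu.pig.bags.Enumerate('1');
data = load 'data' using PigStorage(',') as (id:chararray, rating:int);
data = group data by id;
data = foreach data {
    sorted = order data by rating DESC;
    generate group, sorted;
};
-- append the index to every tuple of each sorted bag
data = foreach data generate FLATTEN(Enumerate(sorted));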
Solution 2:
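- We can use the built-in RANK function as below. Here A is the relation loaded from the input data, and the rank is computed over the whole relation rather than per id:
A = load 'data' using PigStorage(',') as (id:chararray, rating:int);
B = rank A by rating DESC;
dump B;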