[Solved-1 Solution] Percentile calculation in Pig Latin?
Problem:
How can we calculate percentiles in Pig Latin?
Solution 1:
- We can use the StreamingQuantile UDF from the Apache DataFu library.
The script below passes the quantiles 0.0, 0.5, and 1.0 to the UDF, so for each item it returns the minimum, median, and maximum of value.
Input:
item1,234
item1,324
item1,769
item2,23
item2,23
item2,45
Pig Script:
register datafu-1.2.0.jar;
define Quantile datafu.pig.stats.StreamingQuantile('0.0','0.5','1.0');
data = load 'data' using PigStorage(',') as (item:chararray, value:int);
quantiles = FOREACH (GROUP data by item) GENERATE group, Quantile(data.value);
dump quantiles;
Output:
(item1,(234.0,324.0,769.0))
(item2,(23.0,23.0,45.0))
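StreamingQuantile estimates quantiles in a single pass and does not need sorted input, so on large groups its results are approximate. If exact values are needed, DataFu also provides datafu.pig.stats.Quantile, which expects each bag to be sorted first. The script below is a minimal sketch of that variant, assuming the same jar, input file, and schema as above. The alias name ExactQuantile and the choice of the 0.90 and 0.95 percentiles are illustrative only.

```pig
register datafu-1.2.0.jar;

-- Exact quantiles; the input bag must already be sorted by value.
-- The quantile list (median, 90th, 95th percentile) is an illustrative choice.
define ExactQuantile datafu.pig.stats.Quantile('0.5','0.9','0.95');

data = load 'data' using PigStorage(',') as (item:chararray, value:int);

quantiles = FOREACH (GROUP data BY item) {
    -- Sort each group's values before handing them to the UDF.
    sorted = ORDER data BY value;
    GENERATE group, ExactQuantile(sorted.value);
};

dump quantiles;
```

Sorting every group costs more than the streaming estimate, so this variant is probably best kept for cases where the groups are small or exact percentiles are required.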