pig tutorial - apache pig tutorial - Apache Pig AVG() Function - pig latin - apache pig - pig hadoop
What is AVG() Function in Apache Pig ?
- The AVG() function which is used in Apache Pig returns the average value of the numeric columns which is done in a bag.
- The AVG() function is done to compute the average of the numerical values which is done within a bag.
- The AVG() function will ignore the NULL values while we are calculating the average values
- The AVG() Function computes the numeric values within a single-column bag.
- The AVG() function will require a preceding GROUP ALL statement for global averages and the GROUP BY statement for the group averages. which is given in the average function

Learn Apache Pig - Apache Pig tutorial - Apache Pig AVG() - Apache Pig examples - Apache Pig programs

Learn Apache Pig - Apache Pig tutorial - Apache Pig AVG() Function - Apache Pig examples - Apache Pig programs
Syntax
grunt> AVG(expression)
Example
wikitechy_employee_details.txt
001,Suresh,Reddy,21,9848022337,Hyderabad,89
002,fathima,Battacharya,22,9848022338,Kolkata,78
003,Ramesh,Khanna,22,9848022339,Delhi,90
004,Preethi,Agarwal,21,9848022330,Pune,93
005,Deepthi,Mohanthy,23,9848022336,Bhuwaneshwar,75
006,Sruti,Mishra,23,9848022335,Chennai,87
007,Kama,Nayak,24,9848022334,trivendram,83
008,Vanitha,Nambiayar,24,9848022333,Chennai,72
- And hence we have loaded this file into Pig which is done with the relation name wikitechy_employee_details as shown below.
grunt> wikitechy_employee_details = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_details.txt' USING PigStorage(',')
as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray, gpa:int);
Calculating the Average GPA
- Let us group the relation wikitechy_employee_details by using the Group All operator, and store the result in the relation which is called employee_group_all which is given below:
grunt> employee_group_all = Group wikitechy_employee_details All;
- This statement which is given above will produce a relation for employee_group_all as shown below.
grunt> Dump employee_group_all;
(all,{(8,Vanitha,Nambiayar,24,9848022333,Chennai,72),
(7,Kama,Nayak,24,9848022334,trivendram,83),
(6,Sruti,Mishra,23,9848022335,Chennai,87),
(5,Deepthi,Mohanthy,23,9848022336,Bhuwaneshwar,75),
(4,Preethi,Agarwal,21,9848022330,Pune,93),
(3 ,Ramesh,Khanna,22,9848022339,Delhi,90),
(2,fathima,Battacharya,22,9848022338,Ko lkata,78),
(1,Suresh,Reddy,21,9848022337,Hyderabad,89)})
- We will now calculate the global average GPA of all the employees by using the AVG() function which is given below:
grunt> employee_gpa_avg = foreach employee_group_all Generate (wikitechy_employee_details.firstname,wikitechy_employee_details.gpa), AVG(wikitechy_employee_details.gpa);
Verification:
grunt> Dump employee_gpa_avg;
Output:
(({(Vanitha),(Kama),(Sruti),(Deepthi),(Preethi),(Ramesh),(fathima),(Suresh) },
{ (72) , (83) , (87) , (75) , (93) , (90) , (78) , (89) }),83.375)