pig tutorial - apache pig tutorial - Apache Pig - IsEmpty() Function - pig latin - apache pig - pig hadoop
What is IsEmpty() function in Apache Pig ?
- The IsEmpty() function in Apache Pig checks whether a bag or map is empty.
- It is most often used in a FILTER statement to keep or discard records.
- It returns a Boolean value: true if the bag or map contains no elements, false otherwise.
- It is meaningful only for expressions of type bag or map.
- Its argument is an expression, most often a single field name.

Syntax
IsEmpty(expression)
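As a minimal sketch of where the call usually sits, assume a relation named grouped that was produced by COGROUP and contains a bag named left_rel. Both names, and the two result aliases, are hypothetical placeholders, not part of the example that follows.

```pig
-- Hypothetical names: grouped and left_rel are placeholders for
-- a COGROUP result and one of the bags it contains.

-- Keep only the groups whose left_rel bag has no tuples.
missing_left = FILTER grouped BY IsEmpty(left_rel);

-- Keep only the groups whose left_rel bag has at least one tuple.
has_left = FILTER grouped BY NOT IsEmpty(left_rel);
```

Because IsEmpty() returns a Boolean, it can be combined with NOT, AND and OR like any other FILTER condition.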
Example
Assume that we have two files, wikitechy_employee_sales.txt and wikitechy_employee_bonus.txt, in the HDFS directory /pig_data/, with the contents shown below:
wikitechy_employee_sales.txt
1,Robin,22,25000,sales
2,BOB,23,30000,sales
3,Maya,23,25000,sales
4,Sara,25,40000,sales
5,David,23,45000,sales
6,Maggy,22,35000,sales
wikitechy_employee_bonus.txt
1,Robin,22,25000,sales
2,Jaya,23,20000,admin
3,Maya,23,25000,sales
4,Alia,25,50000,admin
5,David,23,45000,sales
6,Omar,30,30000,admin
- We load the files into Pig as the relations employee_sales and employee_bonus.
employee_sales
grunt> employee_sales = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_sales.txt' USING PigStorage(',')
as (sno:int, name:chararray, age:int, salary:int, dept:chararray);
employee_bonus
grunt> employee_bonus = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_bonus.txt' USING PigStorage(',')
as (sno:int, name:chararray, age:int, salary:int, dept:chararray);
Next, group both relations by age using the COGROUP operator.
grunt> cogroup_data = COGROUP employee_sales by age, employee_bonus by age;
Verify the relation cogroup_data using the DUMP operator, as shown below.
grunt> Dump cogroup_data;
(22,{(6,Maggy,22,35000,sales),(1,Robin,22,25000,sales)},{(1,Robin,22,25000,sales)})
(23,{(5,David,23,45000,sales),(3,Maya,23,25000,sales),(2,BOB,23,30000,sales)},{(5,David,23,45000,sales),(3,Maya,23,25000,sales),(2,Jaya,23,20000,admin)})
(25,{(4,Sara,25,40000,sales)},{(4,Alia,25,50000,admin)})
(30,{},{(6,Omar,30,30000,admin)})
Now use the IsEmpty() function to list the groups whose employee_sales bag is empty.
grunt> isempty_data = filter cogroup_data by IsEmpty(employee_sales);
Verification
grunt> Dump isempty_data;
(30,{},{(6,Omar,30,30000,admin)})
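A common follow-up is to turn the filtered groups back into plain records. The sketch below assumes it is entered in the same Grunt session, after isempty_data has been defined; the alias bonus_only is a hypothetical name introduced here for illustration.

```pig
-- Assumes isempty_data from the session above.
-- bonus_only is a hypothetical alias chosen for this sketch.

-- Unnest the employee_bonus bag of every group that has no
-- matching tuples in employee_sales.
bonus_only = FOREACH isempty_data GENERATE FLATTEN(employee_bonus);

DUMP bonus_only;
-- With the sample files above, the only such group is age 30:
-- (6,Omar,30,30000,admin)
```

This gives the employee_bonus records whose age does not appear in employee_sales.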