Apache Pig tutorial - Apache Pig LIMIT Operator
What is the LIMIT Operator in Apache Pig?
- The LIMIT operator is used to get a limited number of tuples from a relation.
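- Despite the name, it is unrelated to the mathematical limit operator (such as lim x→0) that is used to define the derivative of a function.
- It truncates a relation to at most the requested number of tuples. If the relation holds fewer tuples than requested, all of them are returned.
- Unless the relation has been sorted with ORDER BY first, there is no guarantee which tuples are returned.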

Syntax
grunt> Result = LIMIT Relation_name required_number_of_tuples;
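The number of tuples does not have to be a literal. As a minimal sketch, assuming a Pig release that accepts scalar expressions in LIMIT (the Pig documentation describes this for 0.10 and later), the count can be derived from another relation. The relation names employees, all_rows, row_count and half_data are illustrative assumptions, and the file is the sample file used in the example that follows.

```pig
-- Illustrative sketch: keep half of the tuples of a relation.
-- employees, all_rows, row_count and half_data are hypothetical names.
employees = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_details.txt' USING PigStorage(',')
    AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);

-- Group everything into a single bag so the tuples can be counted.
all_rows = GROUP employees ALL;
row_count = FOREACH all_rows GENERATE COUNT(employees) AS total;

-- row_count has exactly one tuple, so row_count.total can be used as a scalar.
half_data = LIMIT employees row_count.total / 2;
DUMP half_data;
```

The expression may contain only constants and scalars; it cannot refer to columns of the relation being limited.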
Example
- Ensure that we have a file named wikitechy_employee_details.txt in the HDFS directory /pig_data/ as given below.
wikitechy_employee_details.txt
111,Anu,Shankar,23,9876543210,Chennai
112,Barvathi,Nambiayar,24,9876543211,Chennai
113,Kajal,Nayak,24,9876543212,Trivendram
114,Preethi,Antony,21,9876543213,Pune
115,Raj,Gopal,21,9876543214,Hyderabad
116,Yashika,Kannan,22,9876543215,Delhi
117,siddu,Narayanan,22,9876543216,Kolkata
118,Timple,Mohanthy,23,9876543217,Bhuwaneshwar
- Load this file into Pig with the relation name wikitechy_employee_details as given below.
grunt> wikitechy_employee_details = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_details.txt' USING PigStorage(',')
as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);
- Get the first four tuples of the relation and store them in another relation named limit_data using the LIMIT operator as given below.
grunt> limit_data = LIMIT wikitechy_employee_details 4;
Verification
- Now verify the relation limit_data using the DUMP operator as shown below.
grunt> Dump limit_data;
Output
- The following output displays the contents of the relation limit_data.
(111,Anu,Shankar,23,9876543210,Chennai)
(112,Barvathi,Nambiayar,24,9876543211,Chennai)
(113,Kajal,Nayak,24,9876543212,Trivendram)
(114,Preethi,Antony,21,9876543213,Pune)
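LIMIT on its own does not promise which tuples come back, so the output above can differ between runs. A common way to request a particular set is to sort first and then limit. The following is a minimal sketch under that assumption, reusing the sample file from this example; the relation names sorted_data and oldest_four are illustrative, not part of the original tutorial.

```pig
-- Illustrative sketch: return the four oldest employees.
-- sorted_data and oldest_four are hypothetical relation names.
wikitechy_employee_details = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_employee_details.txt' USING PigStorage(',')
    AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);

-- Sort by age, oldest first, so that LIMIT picks a well-defined set.
sorted_data = ORDER wikitechy_employee_details BY age DESC;

-- Keep only the first four tuples of the sorted relation.
oldest_four = LIMIT sorted_data 4;
DUMP oldest_four;
```

With the sample data this selects the employees aged 24 and 23. The relative order of employees who share the same age is not guaranteed.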