Using the IN clause with Pig FILTER
IN operator
- Earlier versions of Pig had no IN operator. To imitate an IN test, users had to chain several equality comparisons together with OR.
- With the IN operator, the same kind of expression can be written more compactly as a single membership test against a list of values.
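The two forms can be sketched as follows. This is a minimal illustration, not the original example: the relation name `sales`, the file `sales.txt`, and its schema are hypothetical, and the second statement assumes a Pig release that includes the IN operator.

```pig
-- Hypothetical input: comma-separated rows of year,product,quantity
sales = LOAD 'sales.txt' USING PigStorage(',')
        AS (year:int, product:chararray, quantity:int);

-- Without IN: one equality test per value, joined with OR
by_or = FILTER sales BY (year == 2000) OR (year == 2001) OR (year == 2002);

-- With IN: the same condition as a single membership test
by_in = FILTER sales BY year IN (2000, 2001, 2002);

DUMP by_in;
```

Both relations should contain the same rows; the IN form is simply shorter and easier to extend when more values are added.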
Problem:
How do you use the IN clause with Pig FILTER?
Solution 1:
Filter in Pig
- Pig allows you to remove unwanted records based on a condition. The FILTER functionality is similar to the WHERE clause in SQL.
- The FILTER operator in Pig is used to remove unwanted records from the data file.
- The syntax of the FILTER operator is shown below:
    new_relation = FILTER relation BY condition;
Here relation is the data set on which the filter is applied, condition is the filter condition, and new_relation is the relation created after filtering the rows.
Pig Filter Examples:
Let's consider the sales data set below as an example:
year,product,quantity
---------------------
2000, iphone, 1000
2001, iphone, 1500
2002, iphone, 2000
2000, nokia, 1200
2001, nokia, 1500
2002, nokia, 900
1. Select products whose quantity is greater than or equal to 1000.
2. Select products whose quantity is greater than 1000 and whose year is 2001.
3. Select products whose year is not 2000.
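The three filters above might be written as in the sketch below. It assumes the sales rows are stored in a hypothetical file `sales.txt`, without the header lines and without spaces after the commas; the relation names are illustrative only.

```pig
-- Hypothetical file holding the six data rows shown above
sales = LOAD 'sales.txt' USING PigStorage(',')
        AS (year:int, product:chararray, quantity:int);

-- 1. quantity greater than or equal to 1000
--    (keeps every row except 2002,nokia,900)
q1 = FILTER sales BY quantity >= 1000;

-- 2. quantity greater than 1000 and year equal to 2001
--    (keeps 2001,iphone,1500 and 2001,nokia,1500)
q2 = FILTER sales BY quantity > 1000 AND year == 2001;

-- 3. year not 2000 (keeps the four rows from 2001 and 2002)
q3 = FILTER sales BY year != 2000;

DUMP q1;
DUMP q2;
DUMP q3;
```

The third filter could equally be expressed as `NOT (year == 2000)`; the `!=` form is used here only because it is shorter.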
- We can use all the logical operators (NOT, AND, OR) and relational operators (<, >, ==, !=, >=, <=) in filter conditions.
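As a rough illustration of combining these operators in one condition, the sketch below again assumes the sales rows live in a hypothetical `sales.txt` with no header lines and no spaces after the commas.

```pig
-- Hypothetical file holding the six sales rows
sales = LOAD 'sales.txt' USING PigStorage(',')
        AS (year:int, product:chararray, quantity:int);

-- nokia rows from any year other than 2001, or any row with quantity below 1000
picked = FILTER sales BY (product == 'nokia' AND NOT (year == 2001))
                      OR (quantity < 1000);

DUMP picked;
```

Parentheses make the grouping explicit, which helps when AND, OR, and NOT appear in the same condition.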