[Solved-1 Solution] Max/Min for whole sets of records in PIG ?
What is max()
- The Pig Latin MAX() function is used to calculate the highest value for a column (numeric values or chararrays) in a single-column bag. While calculating the maximum value, the Max() function ignores the NULL values.
- To get the global maximum value, we need to perform a Group All.operation, and calculate the maximum value using the MAX() function.
- To get the maximum value of a group, we need to group it using the Group By operator and proceed with the maximum function.
Syntax
- Given below is the syntax of the Max() function.
What is min()
- The MIN() function of Pig Latin is used to get the minimum (lowest) value (numeric or chararray) for a certain column in a single-column bag. While calculating the minimum value, the MIN() function ignores the NULL values.
- To get the global minimum value, we need to perform a Group All operation, and calculate the minimum value using the MIN() function.
- To get the minimum value of a group, we need to group it using theGroup By operator and proceed with the minimum function.
Syntax
- Here is the syntax of the MIN() function.
Problem:
- Here is an example of set of records that we loading from a file and the first thing we need to do is get the max and min of a column.
- In SQL we would do this with a subquery like this:
- It has a MAX and MIN function but when we tried doing the following it didn't work:
- This one is not working. Is there any solution?
Solution 1:
- We need to group all the data together but no extra column is required if we can use GROUP ALL
Pig:
Input: