Apache Pig tutorial - GetYear()
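What is the getYear() method?
- In JavaScript, the getYear() method of a Date object returns the year of the specified date, according to local time, minus 1900.
- Because getYear() does not return the full year (the "year 2000 problem"), it is deprecated and has been replaced by the getFullYear() method. It is a different function from Pig's GetYear(), which returns the full four-digit year.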
Syntax
dateObj.getYear()

GetYear() function in Apache Pig
- This function accepts a date-time object as its parameter and returns the year of that date-time object as an integer.
Syntax
grunt> GetYear(datetime)
Example
- Ensure that we have a file named wikitechy_date.txt in the HDFS directory /pig_data/ as shown below.
- This file contains the date-of-birth details of several persons. Each record holds a person id followed by a date and time.
wikitechy_date.txt
001,1989/09/26 09:00:00
002,1980/06/20 10:22:00
003,1990/12/19 03:11:44
We have loaded this file into Pig with a relation named date_data as given below.
grunt> date_data = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_date.txt' USING PigStorage(',')
as (id:int,date:chararray);
- The following is an example of the GetYear() function.
- It retrieves the year from the given date-time object.
- First, we generate a date-time object for each person using the ToDate() function, as given below.
grunt> todate_data = foreach date_data generate ToDate(date,'yyyy/MM/dd HH:mm:ss')
as (date_time:DateTime );
grunt> Dump todate_data;
(1989-09-26T09:00:00.000+05:30)
(1980-06-20T10:22:00.000+05:30)
(1990-12-19T03:11:44.000+05:30)
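The +05:30 offset in the output above comes from the default time zone of the session that produced it. Pig's ToDate() also has a three-argument form that takes a time zone string. The following is a minimal sketch, assuming the same date_data relation; the relation name todate_utc and the choice of 'UTC' are illustrative assumptions, not part of the original example.

```pig
-- Parse the same strings, but interpret them in an explicit time zone.
-- 'UTC' is only an example value; todate_utc is a hypothetical relation name.
todate_utc = FOREACH date_data GENERATE
    ToDate(date, 'yyyy/MM/dd HH:mm:ss', 'UTC') AS date_time;

DUMP todate_utc;
```

For these sample rows the year returned by GetYear() should not depend on the zone, since the zone only changes how each local timestamp is anchored.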
- Now we get the year from the date of birth of each person using the GetYear() function and store it in the relation named getyear_data.
grunt> getyear_data = foreach todate_data generate (date_time), GetYear(date_time);
Verification
- Now verify the contents of the getyear_data relation using the Dump operator, as given below.
grunt> Dump getyear_data;
Output
The above statement prints the contents of the relation named getyear_data, one tuple per person, with the date-time value followed by its year.
(1989-09-26T09:00:00.000+05:30,1989)
(1980-06-20T10:22:00.000+05:30,1980)
(1990-12-19T03:11:44.000+05:30,1990)
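The steps above can also be condensed into a single script. The following is a minimal sketch, assuming the same input file and schema as in the example; the names year_data, birth_year and born_before_1990 are hypothetical and chosen only for illustration. It nests ToDate() inside GetYear() so that no intermediate relation is needed, then uses the resulting integer in a FILTER.

```pig
-- Load the raw file; the date column is still a plain string at this point.
date_data = LOAD 'hdfs://localhost:9000/pig_data/wikitechy_date.txt'
    USING PigStorage(',')
    AS (id:int, date:chararray);

-- Convert the string and extract the year in one FOREACH.
-- birth_year is a hypothetical alias for the value returned by GetYear().
year_data = FOREACH date_data GENERATE
    id,
    GetYear(ToDate(date, 'yyyy/MM/dd HH:mm:ss')) AS birth_year;

-- GetYear() returns an int, so the result can be compared numerically.
born_before_1990 = FILTER year_data BY birth_year < 1990;

DUMP born_before_1990;
```

With the three sample rows, this filter should keep the records whose years are 1989 and 1980 and drop the one from 1990.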