What is the purpose of the @outputSchema decorator in a Python UDF?

Specifying the UDF output schema

A UDF has input and output. Here are the different ways you can specify the output format of a Python UDF using the outputSchema decorator.

Sample Code:

# the original udf
# it returns a single chararray (that's PigLatin for String)
@outputSchema('word:chararray')
def hi_world():
    return "hello world"
    
# this one returns a Python tuple. Pig recognises the first element 
# of the tuple as a chararray like before, and the next one as a 
# long (a kind of integer)
@outputSchema("word:chararray,number:long")
def hi_everyone():
    return "hi there", 15

#we can use outputSchema to define nested schemas too, here is a bag of tuples
@outputSchema('some_bag:bag{t:(field_1:chararray, field_2:int)}')
def bag_udf():
    return [
        ('hi',1000),
        ('there',2000),
        ('bill',0)
    ]

#and here is a map
@outputSchema('something_nice:map[]')
def my_map_maker():
    return {"a": "b", "c": "d", "e": "f"}

outputSchema can be used to declare that a function outputs one basic type or a combination of them. Those types are:

chararray: like a string
bytearray: a bunch of bytes in a row. Like a string but not as human friendly
long: 64-bit signed integer
int: 32-bit signed integer
double: 64-bit floating point number
datetime
boolean
If no schema is specified, Pig assumes that the UDF outputs a bytearray.
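
The samples above take no arguments, so here is a minimal sketch of UDFs that receive input fields and declare typed output. It assumes the file may also be run outside Pig for a quick check, where the outputSchema name is not defined; the fallback decorator and its declared_schema attribute are hypothetical stand-ins for local testing only and are not Pig's own implementation. The function names are illustrative as well.

```python
# Outside Pig the name outputSchema is undefined, so fall back to a stand-in.
# ASSUMPTION: this shim is hypothetical and exists only for local testing;
# inside Pig the real decorator is used and this branch is skipped.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            func.declared_schema = schema  # hypothetical attribute name
            return func
        return wrap


# one input field in, one double out
@outputSchema('celsius:double')
def to_celsius(fahrenheit):
    # a null input field arrives as None, so pass it through unchanged
    if fahrenheit is None:
        return None
    return (float(fahrenheit) - 32.0) * 5.0 / 9.0


# one input field in, a tuple of (chararray, int) out
@outputSchema('word:chararray,length:int')
def word_and_length(word):
    if word is None:
        return None
    return word, len(word)
```

The schema string names and types the returned value for Pig; the functions themselves stay plain Python and can be called directly when testing.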
