sqoop - Sqoop Integration - apache sqoop - sqoop tutorial - sqoop hadoop



Sqoop Integration with Hadoop Ecosystem:

  • Sqoop Integration is defined as data was moved between RDBMS to HDFS.
  • This imported data may further be required code analysed using hive or hbase.
  • Sqoop offers property to directly import data to Hive / Hbase.
  • Just add "--import-hive" at the end of the command.

Example:

sqoop import \ --connect 
"jdbc:mysql://localhost/training" \ --username training -P \ --table cityByCountry \ --target-dir /user/where_clause \ --where "state = 'Alaska'" \--import -hive  -m 1  
Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team
 Sqoop integration

Learn Sqoop - Sqoop tutorial - Sqoop integration - Sqoop examples - Sqoop programs

How to Integrate Sqoop in a Data Ingestion Layer with Hive, Oozie in a Big Data Application:

  • Sqoop is a tool which helps to migrate and transfer the data between RDBMS and Hadoop system in bulk mode.
  • This blog post will focus on integrating Sqoop with other projects in Hadoop ecosystem and Big Data Applications.
Sqoop related tags : sqoop import , sqoop interview questions , sqoop export , sqoop commands , sqoop user guide , sqoop documentation

Environment

  • Java: JDK 1.7
  • Cloudera version: CDH4.6
  • MySQL.

Initial steps

  • Assume that we will have some data on HDFS of our Hadoop cluster and in our MySQL.
  • Now we want to integrate jobs with Oozie and Hive.

Code walkthrough

Below is an Oozie workflow which will trigger import job to import the data from MySQL to Hadoop:

<workflow-app name="musqoop-wf" xmlns="uri:oozie:workflow:0.1">
 ...
<action name="mysqoopaction">
<sqoop xmlns="uri:oozie:mysqoopaction:0.2">
<job-tracker>myjt:8021</job-tracker>
<name-node>nn:8020</name-node>
<command>import --table People <https://datafloq.com/people/?utm=internal> --connect ...</command>
</sqoop>
<ok to="next"/>
<error to="error"/>

</action>
 ...
</workflow-app>
Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team
Sqoop related tags : sqoop import , sqoop interview questions , sqoop export , sqoop commands , sqoop user guide , sqoop documentation

How to add the property for Sqoop when integrating with Oozie?

  • This workflow file will help to configure the property to sqoop.

For Example: Add a statement in a transaction when import the data.

<workflow-app name="musqoop-wf" xmlns="uri:oozie:workflow:0.1">
 ...
<action name="mysqoopaction">
<sqoop xmlns="uri:oozie:mysqoopaction:0.2">
<job-tracker>myjt:8021</job-tracker>
<name-node>nn:8020</name-node>
<configuration>
<property>
<name>sqoop.export.statements.per.transaction</name>
<value>10</value>
</property>
</configuration>
<command>import --table People --connect ...</command>
</sqoop>
<ok to="next"/>
<error to="error"/>
</action>

 ...
</workflow-app>
Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team
  • Below command will import the data to Hadoop and into Hive also.
sqoop import \
 --connect jdbc:mysql://mysql.example.com/myintegrate \
 --username hn \
 --password hn \
 --table student \
 --hive-import
Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team
  • To Import the data to Hadoop and into Hive also the below command will be helpful, But it will support to partition data only.
sqoop import \
 --connect jdbc:mysql://mysql.example.com/myintegrate \
 --username hn \
 --password hn \
 --table student\
 --hive-import \
 --hive-partition-key day \
 --hive-partition-value "2016-09-11"
Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team

Verify and Result

  • And we can try to trigger the Oozie workflow for first two commands to make sure the XML schemas in workflow files.
  • For Hive integration, after the importing job complete, we can try to show table in Hive to make sure the table has data from MySQLshow tables by:
select * from student;
select * from student where LastName =’HN’;
select * from student where LastName like %’HN’%;
select count(*) from student;
Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team


Related Searches to Sqoop Integration

-