sqoop - Sqoop Export - apache sqoop - sqoop tutorial - sqoop hadoop
Sqoop related tags : sqoop import , sqoop interview questions , sqoop export , sqoop commands , sqoop user guide , sqoop documentation
What is Sqoop Export?
- The export tool exports a set of files from HDFS back to an RDBMS. The target table must already exist in the database.
- The input files are read and parsed into a set of records according to the user-specified delimiters.
- The default operation is to transform these into a set of INSERT statements that inject the records into the database.
- In "update mode," Sqoop will generate UPDATE statements that replace existing records in the database.
Learn sqoop - sqoop tutorial - Sqoop Export - sqoop examples - sqoop programs
Export control arguments:
Argument | Description |
---|---|
--direct | Use direct export fast path |
--export-dir | HDFS source path for the export |
-m,--num-mappers | Use n map tasks to export in parallel |
--table | Table to populate |
--update-key | Anchor column to use for updates. Use a comma separated list of columns if there are more than one column. |
--update-mode | Specify how updates are performed when new rows are found with non-matching keys in database. |
Legal values for mode include updateonly (default) and allowinsert. | |
--input-null-string | The string to be interpreted as null for string columns |
--input-null-non-string | The string to be interpreted as null for non-string columns |
--staging-table | The table in which data will be staged before being inserted into the destination table. |
--clear-staging-table | Indicates that any data present in the staging table can be deleted. |
--batch | Use batch mode for underlying statement execution. |
Syntax
Insert:
- Insert mode will insert the new records from HDFS to RDBMS table.
Note: If a record already present in the database table with same primary key, then it will raise MySQLIntegrityConstraintViolationException exception.
Update:
- Update mode will update the records in the RDBMS from HDFS data.
- Update mode only update already existing records, it will not insert new records in the RDBMS.
Example
- Let us take an example of the employee data in file, in HDFS.
- The employee data is available in emp_data file in ‘emp/’ directory in HDFS. The emp_data is as follows.
- It is mandatory that the table to be exported is created manually and is present in the database from where it has to be exported.
- The following query is used to create the table ‘wikitechy_employee’ in mysql command line.
- The following command is used to export the table data (which is in emp_data file on HDFS) to the employee table in db database of Mysql database server.
- The following command is used to export the table data (which is in emp_data file on HDFS) to the employee table in db database of Mysql database server.
- The following command is used to verify the table in mysql command line.