Sqoop Import All Tables - Apache Sqoop Tutorial
How to import all tables from an RDBMS database server into HDFS?
- The import-all-tables tool imports a set of tables from an RDBMS to HDFS. Data from each table is stored in a separate HDFS directory whose name matches the table name.
- For the import-all-tables tool to be useful, the following conditions must be met:
- Each table must have a single-column primary key, or the --autoreset-to-one-mapper option must be used.
- You must intend to import all columns of each table.
- You must not intend to use a non-default splitting column, nor impose any conditions via a WHERE clause.
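When some tables lack a single-column primary key, the first condition can be relaxed with --autoreset-to-one-mapper, which falls back to a single map task for those tables only. A minimal sketch (the hostname, database name, and username below are placeholders, not from this tutorial):

```shell
# Import every table; tables without a usable split key are
# imported sequentially with one mapper instead of failing.
$ sqoop import-all-tables \
    --connect jdbc:mysql://db.example.com/salesdb \
    --username sqoopuser -P \
    --autoreset-to-one-mapper
```

Tables that do have a single-column primary key are still imported in parallel; the fallback applies only where no split key is available.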
Syntax
$ sqoop import-all-tables (generic-args) (import-args)
$ sqoop-import-all-tables (generic-args) (import-args)
Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team
Import control arguments:
| Argument | Description |
|---|---|
| --as-avrodatafile | Imports data to Avro Data Files |
| --as-sequencefile | Imports data to SequenceFiles |
| --as-textfile | Imports data as plain text (default) |
| --direct | Use direct import fast path |
| --direct-split-size <n> | Split the input stream every n bytes when importing in direct mode |
| --inline-lob-limit <n> | Set the maximum size for an inline LOB |
| -m,--num-mappers <n> | Use n map tasks to import in parallel |
| --warehouse-dir <dir> | HDFS parent for table destination |
| -z,--compress | Enable compression |
| --compression-codec <c> | Use Hadoop codec (default gzip) |
- These arguments behave in the same manner as they do when used for the sqoop-import tool, but the --table, --split-by, --columns, and --where arguments are invalid for sqoop-import-all-tables. The --exclude-tables argument is for sqoop-import-all-tables only.
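The import-control arguments above can be combined with --exclude-tables to skip specific tables. A hedged sketch (the excluded table names, warehouse path, and username are illustrative assumptions, not from this tutorial):

```shell
# Import every table in corp except two audit tables, compressed,
# using 4 parallel map tasks per table, under a common HDFS parent.
$ sqoop import-all-tables \
    --connect jdbc:mysql://db.foo.com/corp \
    --username sqoopuser -P \
    --exclude-tables audit_log,audit_trail \
    --warehouse-dir /user/sqoopuser/corp \
    -m 4 -z
```

--exclude-tables takes a comma-separated list of table names; each remaining table lands in its own directory under the --warehouse-dir parent.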
Example
- Let us take an example of importing all tables from the userdb database.
- The list of tables that the database userdb contains is as follows.
+--------------------+
| Tables |
+--------------------+
| emp |
| emp_add |
| emp_contact |
+--------------------+
- The following command is used to import all the tables from the userdb database.
$ sqoop import-all-tables \
--connect jdbc:mysql://localhost/userdb \
--username root
Note: when using import-all-tables, every table in the database must have a single-column primary key, unless the --autoreset-to-one-mapper option is passed.
- The following command is used to verify that the table data from the userdb database was imported into HDFS.
$ $HADOOP_HOME/bin/hadoop fs -ls
- It shows the tables of the userdb database as a list of directories.
Output
drwxr-xr-x - hadoop supergroup 0 2014-12-22 22:50 _sqoop
drwxr-xr-x - hadoop supergroup 0 2014-12-23 01:46 emp
drwxr-xr-x - hadoop supergroup 0 2014-12-23 01:50 emp_add
drwxr-xr-x - hadoop supergroup 0 2014-12-23 01:52 emp_contact
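To inspect the imported rows of a single table, the standard hadoop fs commands work on the table's directory. Each directory holds one part file per map task (the part-m-* naming is the usual mapper output convention):

```shell
# Print the imported rows of the emp table as comma-delimited text.
$ hadoop fs -cat emp/part-m-*
```

The same pattern applies to emp_add and emp_contact, each under its own directory.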
Import all tables from the corp database:
$ sqoop import-all-tables --connect jdbc:mysql://db.foo.com/corp
Validating:
$ hadoop fs -ls
Found 4 items
drwxr-xr-x - someuser somegrp 0 2010-04-27 17:15 /user/someuser/EMPLOYEES
drwxr-xr-x - someuser somegrp 0 2010-04-27 17:15 /user/someuser/PAYCHECKS
drwxr-xr-x - someuser somegrp 0 2010-04-27 17:15 /user/someuser/DEPARTMENTS
drwxr-xr-x - someuser somegrp 0 2010-04-27 17:15 /user/someuser/OFFICE_SUPPLIES