What is Sqoop?

Relational databases are among the most common data sources for Big Data, and Hadoop is a framework used to analyze that data. Sqoop is a tool that imports data from relational databases into Hadoop HDFS and exports data from Hadoop HDFS back to relational databases.
The relational database can be MySQL, PostgreSQL, Oracle, Redshift, or any other RDBMS.
Sqoop uses MapReduce to import and export the data, which provides parallel operation as well as fault tolerance.
Sqoop is an open source software product of the Apache Software Foundation.
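
As a rough illustration of the two directions described above, here is a minimal sketch of what an import and an export might look like on the command line. The connection string, user name, table, and HDFS directory are made-up values, not taken from any real setup. The script only prints the commands (a dry run), so it can be tried without a database or a Hadoop cluster.

```shell
#!/bin/sh
# Hypothetical values for illustration only; replace them with your own.
DB_URL="jdbc:mysql://db.example.com/sales"   # JDBC connection string
DB_USER="sqoop_user"                         # database account
TABLE="orders"                               # table to copy
HDFS_DIR="/user/hadoop/orders"               # HDFS directory holding the data
MAPPERS=4                                    # number of parallel map tasks

# Build the import command (RDBMS -> HDFS). -P prompts for the password.
sqoop_import_cmd() {
  echo "sqoop import --connect $DB_URL --username $DB_USER -P --table $TABLE --target-dir $HDFS_DIR --num-mappers $MAPPERS"
}

# Build the export command (HDFS -> RDBMS). The target table must already exist.
sqoop_export_cmd() {
  echo "sqoop export --connect $DB_URL --username $DB_USER -P --table $TABLE --export-dir $HDFS_DIR --num-mappers $MAPPERS"
}

# Dry run: print both commands instead of executing them.
sqoop_import_cmd
sqoop_export_cmd
```

The --num-mappers option is where the parallelism mentioned above shows up: Sqoop splits the work across that many map tasks. To run the commands for real, paste the printed lines into a shell on a machine where Sqoop is installed.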

Prerequisites
Before we start with Sqoop, the following background knowledge is needed to run Sqoop jobs:
· Basic knowledge of the Linux operating system and its commands
· Concepts of relational database management systems
· Concepts of Hadoop and HDFS, including basic commands


Starting with Sqoop
Let’s start with a very basic but important command, which lists all the available Sqoop commands:

$ sqoop help
Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information
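
Each command in the list above has its own options. A small sketch of how to look them up with the help command, passing the name of another command (here import) as its argument. The wrapper function show_tool_help is a hypothetical helper added for this example; it prints a message instead of failing when Sqoop is not installed.

```shell
#!/bin/sh
# Print the detailed help for one Sqoop command, e.g. "import".
# show_tool_help is a hypothetical helper, not part of Sqoop itself.
show_tool_help() {
  if command -v sqoop >/dev/null 2>&1; then
    # Sqoop is installed: ask it for the options of the given command.
    sqoop help "$1"
  else
    # Sqoop is missing: say so rather than fail with "command not found".
    echo "sqoop not found on PATH; cannot show help for '$1'"
  fi
}

show_tool_help import
```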


In the next post, we will see how to use each of the Sqoop commands above, one by one.
