
Showing posts from 2016

What is Sqoop?

As we all know, relational databases are one of the main data sources for Big Data, and Hadoop is a framework we use to analyze big data. Sqoop is a tool that imports data from relational databases into Hadoop HDFS and exports data from Hadoop HDFS back to relational databases. The database can be MySQL, PostgreSQL, Oracle, Redshift, or any other RDBMS. Sqoop uses MapReduce to import and export the data, which provides parallel operation as well as fault tolerance. Sqoop is an open source software product of the Apache Software Foundation.

Prerequisites

Before we start with Sqoop, the following knowledge is required to run Sqoop jobs:

· Basic knowledge of the Linux operating system and its commands
· Concepts of relational database management systems
· Concepts of Hadoop and HDFS, with basic commands

Starting with Sqoop

Let's start with a very basic and important command which will tell you abou