#bigdata 30e — Apache Flume and Sqoop

To capture data and move it into Hadoop, we have two tools that are part of the Hadoop ecosystem: Flume and Sqoop.

1 — APACHE FLUME

Flume is open source software, originally developed by Cloudera and later handed over to the Apache Foundation, which now manages it.

Credits Apache Foundation

Flume allows streaming data, coming from multiple locations, to be “injected” or moved into Hadoop clusters and written into HDFS.

Flume is used to collect log files from a worldwide network of clusters and store the data in HDFS, where it can be analyzed later or even in real time.
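To make the flow above more concrete, here is a minimal sketch in Python that renders a Flume agent configuration. A Flume agent wires a source to a sink through a channel. The helper name `render_flume_agent`, the component names (`src1`, `ch1`, `sink1`), and the example paths are assumptions made for illustration. The property keys follow the standard Flume spooling-directory source, memory channel, and HDFS sink.

```python
def render_flume_agent(agent, spool_dir, hdfs_path):
    """Return the text of a Flume agent properties file.

    The agent watches a local directory for finished log files
    (spooldir source), buffers events in memory, and writes them
    to HDFS. All component names here are illustrative.
    """
    lines = [
        # Declare the components that make up the agent.
        f"{agent}.sources = src1",
        f"{agent}.channels = ch1",
        f"{agent}.sinks = sink1",
        "",
        # Source: pick up files dropped into a local directory.
        f"{agent}.sources.src1.type = spooldir",
        f"{agent}.sources.src1.spoolDir = {spool_dir}",
        f"{agent}.sources.src1.channels = ch1",
        "",
        # Channel: in-memory buffer between source and sink.
        f"{agent}.channels.ch1.type = memory",
        "",
        # Sink: write the events into HDFS as a plain data stream.
        f"{agent}.sinks.sink1.type = hdfs",
        f"{agent}.sinks.sink1.hdfs.path = {hdfs_path}",
        f"{agent}.sinks.sink1.hdfs.fileType = DataStream",
        f"{agent}.sinks.sink1.channel = ch1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical directory and NameNode address.
    print(render_flume_agent("agent1", "/var/log/webapp",
                             "hdfs://namenode:8020/flume/logs"))
```

The generated text would typically be saved to a file and passed to the `flume-ng agent` command together with the agent name. A memory channel is fast but loses buffered events if the agent stops, so a file channel may be a better fit when durability matters.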

2 — APACHE SQOOP

Sqoop is open source software designed to transfer data between relational database systems and Hadoop.

Credits Apache Foundation

It is used in data warehousing to extract structured data for analysis in Hadoop.
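As an illustration of such an extraction, the following Python sketch assembles a `sqoop import` command line that copies one relational table into an HDFS directory. The helper name `sqoop_import_args` and all example values (JDBC URL, user, table, paths) are hypothetical. The flags themselves are standard Sqoop import options.

```python
import shlex


def sqoop_import_args(jdbc_url, username, password_file,
                      table, target_dir, num_mappers=4):
    """Build the argument list for a basic `sqoop import` run."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC URL of the source database
        "--username", username,
        "--password-file", password_file,  # file holding the password
        "--table", table,                  # relational table to copy
        "--target-dir", target_dir,        # destination directory in HDFS
        "--num-mappers", str(num_mappers), # number of parallel map tasks
    ]


if __name__ == "__main__":
    # Hypothetical connection details, for illustration only.
    args = sqoop_import_args(
        "jdbc:mysql://db.example.com/sales",
        "etl_user",
        "/user/etl/.db_password",
        "orders",
        "/data/warehouse/orders",
    )
    # Print a shell-safe command; run it on a machine with Sqoop installed.
    print(shlex.join(args))
```

Building the command as a list rather than a single string keeps each value as a separate argument, which avoids quoting problems if the command is later handed to `subprocess.run`.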

CURIOSITIES

  1. One of the advantages of Flume is that the captured data can be stored directly in HBase or HDFS.
  2. Flume is widely used to import large volumes of event data produced on social networks such as Facebook and Twitter, and on e-commerce sites such as Amazon.
  3. Sqoop works with relational databases such as Teradata, Netezza, Oracle, MySQL, and PostgreSQL.
  4. Sqoop helps to offload ETL (Extract, Transform, Load) tasks from the data warehouse to Hadoop in an efficient and low-cost way.
  5. Sqoop can also perform the reverse task, transferring Hadoop data into a relational database.
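The reverse direction mentioned in the last item can be sketched in the same way. This minimal Python example builds a `sqoop export` command line that pushes files from an HDFS directory into an existing relational table. The helper name `sqoop_export_args` and the example values are assumptions. The flags are standard Sqoop export options.

```python
import shlex


def sqoop_export_args(jdbc_url, username, password_file, table, export_dir):
    """Build the argument list for a basic `sqoop export` run."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,             # JDBC URL of the target database
        "--username", username,
        "--password-file", password_file,  # file holding the password
        "--table", table,                  # existing table that receives the rows
        "--export-dir", export_dir,        # HDFS directory holding the data
    ]


if __name__ == "__main__":
    # Hypothetical connection details, for illustration only.
    args = sqoop_export_args(
        "jdbc:postgresql://db.example.com/reports",
        "etl_user",
        "/user/etl/.db_password",
        "daily_totals",
        "/data/results/daily_totals",
    )
    print(shlex.join(args))
```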

More information about this article

Article selected from the eBook “Big Data for Executives and Market Professionals.”
eBook in English: Amazon or Apple Store
eBook in Portuguese: Amazon or Apple Store

Author. Big Data Researcher. USA WebCT IT Executive. Education Technology Director. Portuguese Brazilian citizen. Ex-soccer player. Peace for everyone.

Jose Antonio Ribeiro Neto (Zezinho)
