Apache Flume
- Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data.
- It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and failover and recovery mechanisms.
- It uses a simple, extensible data model that allows for online analytic applications.
Flume defines a simple pipeline structure with three roles:
- Source
- Channel
- Sink
- Sources define where data comes from, e.g. a file or a message queue (Kafka, JMS).
- Channels are the pipes that connect sources with sinks; they buffer events until a sink consumes them.
- Sinks deliver the data received from sources, via channels, to its destination.
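
The three roles can be seen together in an agent configuration file. The following is a minimal sketch, modeled on the single-node example in the Flume User Guide. The agent name `a1`, the component names `r1`, `c1`, and `k1`, and the port `44444` are illustrative choices, not required values.

```properties
# Name the components of this agent (names are arbitrary labels)
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listen for lines of text on a TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: buffer events in memory between source and sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Sink: write each event to the agent's log
a1.sinks.k1.type = logger

# Wire the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Note that a source is bound with the plural key `channels`, because it can write to several channels, while a sink is bound with the singular key `channel`, because it reads from exactly one. Assuming the file is saved as `example.conf`, an agent like this is typically started with `flume-ng agent --conf conf --conf-file example.conf --name a1`.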