ApacheCon North America 2014 has ended
ApacheCon North America 2014 took place April 7-9 in Denver, CO.
Tuesday, April 8 • 10:30am - 11:20am
Real Time Data Ingest into Hadoop using Flume

Apache Flume is a real-time distributed data ingest system designed specifically for the Hadoop ecosystem. Flume is a highly scalable distributed system that guarantees delivery from a large number of data sources to an eventual destination such as HDFS or HBase. Flume runs in extremely large deployments at several companies around the world, transferring several hundred terabytes every weekend.

In this presentation, we will go through the fundamental components that make up Flume, and how to configure and deploy Flume on your cluster so that it scales with the number of sources and the amount of data. As a committer and an engineer supporting Flume in production, I will present standard deployment topologies and explain how to design one of your own.

Speakers
Hari Shreedharan

Software Engineer, Cloudera
Hari Shreedharan is a PMC member on Apache Flume and a committer on Apache Sqoop. He is a Software Engineer at Cloudera. He regularly presents at conferences and meetups related to Hadoop and Big Data.


Tuesday April 8, 2014 10:30am - 11:20am MDT
Confluence C
