Engineering BIG DATA with HADOOP - PowerPoint PPT Presentation

About This Presentation
Title:

Engineering BIG DATA with HADOOP

Description:

This presentation introduces Big Data with Hadoop.

Transcript and Presenter's Notes

Title: Engineering BIG DATA with HADOOP


1
ENGINEERING BIG DATA WITH HADOOP
  • BY
  • International School of Engineering
  • We Are Applied Engineering

Disclaimer: Some of the images and content have
been taken from multiple online sources. This
presentation is intended only for knowledge
sharing, not for any commercial purpose.
2
OVERVIEW
  • WHAT IS BIG DATA?
  • EXPLOSION OF DATA
  • DATA CONTRIBUTIONS
  • DATA EXPLOSION
  • WHO ARE THE PLAYERS?
  • BIG DATA: BIG PICTURE LANDSCAPE
  • BIG DATA ENTERPRISE ROLES
  • WHAT IS HADOOP?
  • EVOLUTION OF HADOOP
  • HADOOP ECOSYSTEM
  • HADOOP ECOSYSTEM MAP
  • HADOOP 30,000 FEET VIEW
  • BIG DATA ANALYTICS Case studies
  • VIDEO OF HADOOP ECOSYSTEM

3
WHAT IS BIG DATA?
  • High-volume, high-velocity and high-variety
    information assets that demand cost-effective,
    innovative forms of information processing for
    enhanced insight and decision making.
  • - Gartner

HIGH VOLUME
HIGH VARIETY
HIGH VELOCITY
4
EXPLOSION OF DATA
5
Source: http://www.emc.com/leadership/digital-universe/iview/index.htm
6
DATA CONTRIBUTIONS
7
DATA EXPLOSION
8
Source: http://www.emc.com/collateral/about/news/idc-emc-digital-universe-2011-infographic.pdf
9
Source: http://www.emc.com/collateral/about/news/idc-emc-digital-universe-2011-infographic.pdf
10
WHO ARE THE PLAYERS?
11
(No Transcript)
12
BIG DATA: BIG PICTURE LANDSCAPE
13
BIG DATA ENTERPRISE ROLES
14
INTRODUCTION TO
15
WHAT IS HADOOP?
  • Flexible
  • Structured/Unstructured
  • Text/Binary
  • Schema/Schema less
  • 100% Open Source
  • Scalable
  • Petabytes of Data
  • Thousands of Nodes

Source: http://cloudtimes.org/2013/06/25/hadoop-as-a-service-market-growing/
16
EVOLUTION OF HADOOP
How does an Elephant Sneak up on you?
17
HADOOP ECOSYSTEM
Chukwa
Sqoop
Zookeeper
Pig
HBase
Avro
Mahout
Flume
Whirr
MapReduce Engine
Hama
Hadoop Distributed File System
Hive
Hadoop Common
18
HADOOP ECOSYSTEM MAP
Source: http://indoos.wordpress.com/2010/08/16/hadoop-ecosystem-world-map/
19
Hadoop Evolution Map Explained!
  • How did it all start? With huge volumes of data
    on the web.
  • Nutch was built to crawl this web data.
  • The huge volume of data had to be stored, so
    HDFS was born.
  • How to use this data? The MapReduce framework
    was built for coding and running analytics, in
    Java or, via Hadoop Streaming, in any language.
  • How to get unstructured data in (web logs, click
    streams, Apache logs, server logs): FUSE, WebDAV,
    Chukwa, Flume, Scribe.
  • HIHO and Sqoop load data into HDFS, so RDBMSs
    can join the Hadoop bandwagon.
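
The map and reduce steps mentioned above can be sketched in a few lines of Python. This is a minimal, single-process illustration of the programming model (a word count over log lines), not Hadoop code: the function names `mapper`, `reducer` and `run_local`, and the sample log lines, are assumptions made for this example. On a real cluster the framework, not `sorted`, performs the shuffle between the two steps.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map step: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1


def reducer(sorted_pairs):
    """Reduce step: sum the counts for each word.

    Expects pairs already sorted by key, which is how the shuffle
    phase would deliver them to a reducer.
    """
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce inside one process."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    # Hypothetical sample input standing in for web server logs.
    logs = ["GET /index.html", "GET /about.html", "POST /index.html"]
    for word, count in sorted(run_local(logs).items()):
        print(f"{word}\t{count}")
```

The mapper and reducer only ever see one record or one key group at a time, which is what lets a framework run many copies of them in parallel on different nodes.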

20
Continued
  • High-level interfaces are required over low-level
    MapReduce programming: Pig, Hive, Jaql
  • BI tools with advanced UI reporting (drill-down,
    etc.): Intellicus
  • Workflow tools over MapReduce processes and
    high-level languages: Oozie
  • Monitor and manage Hadoop, run jobs/Hive, view
    HDFS at a high level: Hue, Karmasphere, Eclipse
    plugin, Cacti, Ganglia
  • Support frameworks: Avro (serialization),
    ZooKeeper (coordination)
  • More high-level interfaces/uses: Mahout, Elastic
    MapReduce
  • OLTP is also possible: HBase

21
HADOOP 30,000 FEET VIEW
  • Distribute data initially
  • Let processors / nodes work on local data
  • Minimize data transfer over network
  • Replicate data multiple times for increased
    availability
  • Write applications at a high level
  • Programmers should not have to worry about
    network programming, temporal dependencies,
    low-level infrastructure, etc.
  • Minimize talking between nodes (shared-nothing)
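
The ideas of distributing data, replicating it, and letting nodes work on local data can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the functions `place_blocks` and `schedule` and the node names are hypothetical, and the round-robin placement is a simplification (HDFS itself uses a rack-aware placement policy). The replication factor of 3 matches the HDFS default.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Simplified stand-in for a real placement policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


def schedule(placement, alive):
    """For each block, pick a live node that already stores a replica.

    Running the task where the data lives avoids moving the block
    over the network; replicas let the job survive node failures.
    """
    assignment = {}
    for block, replicas in placement.items():
        local = [node for node in replicas if node in alive]
        if not local:
            raise RuntimeError(f"block {block} has no live replica")
        assignment[block] = local[0]
    return assignment


if __name__ == "__main__":
    nodes = ["n1", "n2", "n3", "n4"]  # hypothetical node names
    placement = place_blocks(4, nodes)
    # Suppose n1 has failed: every block is still readable locally elsewhere.
    print(schedule(placement, alive={"n2", "n3", "n4"}))
```

With three replicas per block, losing one node in this example still leaves every block with a live copy, so each task can be scheduled next to its data.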

22
BIG DATA ANALYTICS
  • Case Studies

23
YAHOO - PERSONALIZATION
24
YAHOO SEARCH ASSIST
25
For a detailed description of HADOOP ECOSYSTEM
components,
check out our video on
26
International School of Engineering
Plot no 63/A, 1st Floor, Road No 13, Film Nagar,
Jubilee Hills, Hyderabad-500033
For Individuals: (91) 9502334561/62
For Corporates: (91) 9618 483 483
Facebook: www.facebook.com/insofe
SlideShare: www.slideshare.net/INSOFE