Apache Hadoop – The Big Name In The Big Data World

1
Apache Hadoop – The Big Name In The Big Data World
Java/J2EE Capabilities
2
What is Apache Hadoop?
  • A data management framework built for Big Data
  • Open-source software for distributed processing
    of large data sets
  • Offers distributed parallel processing that scales
    from a single server to thousands of machines
  • Supports processing and analysis of thousands of
    terabytes of data
  • Well suited to increasing business efficiency and
    maximizing ROI
  • Latest release: 2.6.0, released on 18 November
    2014

3
Main Modules of Hadoop
4
Main Modules of Hadoop (contd.)
  • Hadoop Common
      • Common utilities that support the other Hadoop
        modules and subprojects
      • Includes the file system, RPC and serialization
        libraries
  • Hadoop Distributed File System (HDFS)
      • A distributed file system that provides access
        to application data
      • Spans all nodes in a Hadoop cluster and links
        them into one big file system
      • Java-based, providing scalable and reliable data
        storage
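
The idea of one file system spanning many nodes can be illustrated with a small sketch. It assumes details the slide does not state: that a file is cut into fixed-size blocks (128 MB here) and that each block is copied to several nodes (3 copies here). The function names and the round-robin placement are hypothetical simplifications, not the HDFS API or its actual placement policy.

```python
import math

MB = 1024 * 1024


def split_into_blocks(file_size, block_size):
    """Return an (offset, length) pair for every block of a file."""
    count = math.ceil(file_size / block_size)
    return [
        (i * block_size, min(block_size, file_size - i * block_size))
        for i in range(count)
    ]


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin.

    This is an illustrative policy only; it ignores racks and node load.
    """
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


# Example: a 300 MB file on a hypothetical four-node cluster.
blocks = split_into_blocks(300 * MB, 128 * MB)
placement = place_replicas(len(blocks), ["node1", "node2", "node3", "node4"], 3)
for index, (offset, length) in enumerate(blocks):
    print(index, offset // MB, length // MB, placement[index])
```

Because every block lives on more than one node in this sketch, losing a single node still leaves a copy of each block available, which is the intuition behind the "reliable data storage" bullet.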

5
Main Modules of Hadoop (contd.)
  • Hadoop YARN
      • Used for job scheduling and cluster resource
        management
      • Splits the two roles of the JobTracker, namely
        resource management and job scheduling, into
        separate components
  • Hadoop MapReduce
      • A system for parallel processing of large data
        sets
      • A framework that assigns work to the nodes in a
        cluster
      • Used to write applications that reliably process
        large amounts of data on multiple hardware nodes
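
The MapReduce programming model can be sketched with the classic word-count example. The code below is a single-process illustration in plain Python and does not use the Hadoop API; the function names are hypothetical. In a real cluster, the map and reduce calls would be distributed across nodes by the framework.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(word_count(["big data", "Big cluster"]))
```

Each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, which is what lets a framework run them in parallel on different nodes.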

6
Other Hadoop Related Projects at Apache
  • Avro
  • Cassandra
  • HBase
  • Hive
  • Pig
  • Spark
  • Ambari
  • Chukwa
  • Mahout
  • Tez
  • ZooKeeper

7
Why Hadoop?
  • Next-generation real-time analytics
  • Rich ecosystem
  • Scale-out storage
  • Reduced cost of ownership
  • Scalability, flexibility and reliability
  • Fault tolerance
  • Simple programming models

8
THANK YOU
Looking Forward To A Mutually Beneficial Association.
Assuring You Of Our Best Services Always.
SPEC INDIA
SPEC House, Parth Complex, Swastik Cross Road,
Navrangpura, Ahmedabad-380 009, INDIA
Tel.: 91-79-26404031 to 34
VoIP: 1-908-450-9862
Instant Messengers: spec.bd, spec_india, bd.spec,
specindia2009, specindia.bd
E-mail: lead_at_spec-india.com
URL: http://www.spec-india.com