5 Emerging Ideas in Hadoop Technology That Are Trending

Description:

This presentation describes current trends in Hadoop technology. We offer courses and training for Hadoop. Hadoop training in Chennai is the best center for effective learning.

Transcript and Presenter's Notes


1
5 Emerging Ideas in Hadoop Technology That Are
Trending
Copyright 2008 PresentationFx.com
Redistribution Prohibited Image woodsy/sxc.hu
This text section may be deleted for
presentation.
2
1. WEB NOTEBOOKS
  • Web notebooks are a way to write code within the
    web browser and have it run against a cluster of
    servers.
  • Generally, web notebooks support programming
    languages such as Scala and Python, as well as
    markup languages such as HTML and Markdown, which
    make a notebook easier to present.
  • Integration of SQL into web notebooks has also
    become a more popular feature, although the
    capabilities of web notebooks vary greatly.
  • The main current limitation of these notebooks
    lies within the realm of security.
  • Currently there is no real security model in
    these web notebooks, but by putting a web server
    in front of them, some level of security can be
    achieved.

3
2. ALGORITHMS FOR MACHINE LEARNING
  • The application of machine-learning algorithms is
    a hot topic, and there are a number of important
    reasons for this.
  • The first is that most people can see the
    potential of leveraging machine-learning
    algorithms to gain more insights into the data
    they have.
  • Whether the goal is creating a recommendation
    engine, personalizing a website, identifying
    anomalies, or detecting fraud, interest in this
    area is strong.
  • Two short books, "A New Look at Anomaly
    Detection" and "Practical Machine Learning:
    Innovations in Recommendation", can each be read
    within a few hours.
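
To make the anomaly-identification use case above concrete, here is a minimal sketch in Python. It uses a simple z-score rule, which is only one of many possible techniques and is not taken from the presentation; the function name, the threshold, and the sample amounts are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    Illustrative only: real fraud or anomaly detection usually needs
    more robust models than a single z-score cutoff.
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical transaction amounts; one is far outside the usual range.
amounts = [52.0, 48.5, 50.2, 49.9, 51.3, 47.8, 50.6, 49.1, 950.0, 50.4]
print(find_anomalies(amounts))  # prints [(8, 950.0)]
```

A small sample limits how large a z-score can get, which is why the sketch uses a threshold of 2.0 rather than the more common 3.0.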

4
3. SQL ON HADOOP
  • Apache Hive is the SQL-on-Hadoop technology that
    has been around the longest, and is probably the
    most widely used.
  • The Hive Metastore can be leveraged by other
    technologies such as Apache Drill.
  • The benefit in this case is that Drill can read
    the metadata from Hive and then run the queries
    itself instead of depending on the Hive
    MapReduce runtime.
  • This approach is significantly faster and is one
    of the preferred ways of using Hive.
  • Now that you understand the background of SQL on
    Hadoop, let's take a look at two technologies
    that are gaining the most traction in this space.
5
4. STREAM PROCESSING TECHNOLOGIES
  • It seems these days that everyone wants their
    stream processing framework to be the framework
    used.
  • There are so many projects (free and paid) in
    this space that it can make your head spin:
    Apache Flink, Spark Streaming, Apache Apex
    (incubating), Apache Samza, Apache Storm, and
    Akka Streams, as well as StreamSets.
  • Apache Storm was once considered the leader in
    this technology area.
  • While it is true that the use of Apache Storm is
    declining, the Storm API will likely live a long
    time. It has now been adopted by private code
    bases such as Twitter's Heron, and it is also
    supported by Apache Flink.
  • Apache Beam is a rising star when it comes to
    frameworks for both batch and streaming
    data-parallel processing pipelines. It runs on
    both Flink and Spark and is worth keeping an eye
    on.
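
A core operation that the stream processing frameworks above share is grouping an unbounded sequence of events into windows. The following is a minimal, framework-free sketch of a tumbling (fixed, non-overlapping) window count in plain Python. It does not use the API of any of the named projects; the function name and the sample events are hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Yield (window_start, Counter) for each fixed-size window.

    `events` is an iterable of (timestamp_seconds, key) pairs that is
    assumed to be ordered by timestamp.
    """
    current_start = None
    counts = Counter()
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        if current_start is None:
            current_start = window_start
        elif window_start != current_start:
            # The previous window is complete; emit it and start a new one.
            yield current_start, counts
            current_start, counts = window_start, Counter()
        counts[key] += 1
    if current_start is not None:
        # Emit the final, possibly partial, window.
        yield current_start, counts


# Hypothetical click-stream events: (timestamp in seconds, event type).
events = [(0, "click"), (3, "view"), (9, "click"),
          (10, "click"), (14, "view"), (21, "click")]
results = [(start, dict(c)) for start, c in tumbling_window_counts(events, 10)]
print(results)
```

Because the function is a generator, it processes events one at a time and never needs the whole stream in memory. Real frameworks add what this sketch leaves out, such as out-of-order events, distribution, and fault tolerance.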

6
5. MESSAGING PLATFORMS
  • While stream processing engines are hot,
    messaging platforms are probably hotter. They can
    be used to create scalable architectures and are
    being adopted rapidly across many organizations.
  • The top reason that the messaging platform model
    is so important is that it can support huge
    volumes of events.
  • Less than 10 years ago, people would get excited
    about being able to handle 50,000 to 100,000
    message events per second on a server. 
  • The cost to scale this platform is very low,
    which means a properly built application can
    scale without re-architecting the entire
    platform.
  • The same platform can also handle data movement
    and let development and quality assurance teams
    test with production payloads. The value is
    tremendous.
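
One way to picture how a messaging platform lets several teams work from the same production payloads is an append-only log in which each consumer group keeps its own read position. The sketch below is a toy in-memory model of that idea, not the API of any particular product; the class name, group names, and messages are hypothetical.

```python
class TopicLog:
    """Append-only in-memory log; each consumer group tracks its own offset."""

    def __init__(self):
        self._messages = []
        self._offsets = {}  # consumer group name -> next offset to read

    def publish(self, message):
        """Append a message and return the offset it was stored at."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, group, max_messages=10):
        """Return the next unread messages for this group and advance it."""
        start = self._offsets.get(group, 0)
        batch = self._messages[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for order_id in range(5):
    log.publish({"order_id": order_id})

# The production consumer and a QA consumer read independently,
# so QA sees exactly the payloads production saw.
production_batch = log.poll("production")
qa_batch = log.poll("qa")
print(production_batch == qa_batch)  # prints True
```

Because reading does not remove messages, adding a new consumer group needs no change to the publisher or to existing consumers.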

7
CONVERGED ARCHITECTURAL APPROACH
  • As you can see, there are a lot of technology
    areas to keep an eye on. Be thoughtful about how
    you leverage these new technologies.
  • They bring with them the ability to think
    differently by simplifying business processes,
    which can enable a business to directly integrate
    analytics into core business functions.
  • Many of the technologies in the Hadoop ecosystem
    are considered big data technologies. We provide
    training for Hadoop technology. Don't hesitate
    to contact us: 805627677

www.datawaretools.in/chennai/