Title: Hadoop online training|Bytesonlinetraining.com
Hadoop Online Training
- Bytes online Training
- www.bytesonlinetraining.com
Introduction
- Step by Step Solution of Hadoop
- 1) Hadoop is a well-suited approach to distributed computing and storage of data.
- 2) For a long time, people stored their information on a server in big databases.
- 3) Normally, if you have row-and-column data, this works really well.
- 4) For example, imagine a small Amazon early in its life, when the only
things it stores are user info and purchase order history. If you
have, say, 1 million users and each makes 5 purchases a year, your
user data is 1 million records and your purchase history grows by
5 million records a year.
- 5) After that, you decide to store every product a user adds to the
shopping cart but never purchases, so you start storing abandoned-cart
information, which is 10 million records a year (2 abandoned
purchases for every real purchase).
- 6) Later, you decide to store every product every user looks at.
If a user looks at 10 products before each completed or abandoned
purchase, you have got 150 million records every year. After that,
you decide to store every single click a user makes... it is not
long before you are storing billions or trillions of records every
year, and the old storage structures don't work. A single database
with billions of records is going to be so huge that, even if it
works, it is going to be slow. This is where Hadoop comes into
existence...
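To make the growth in the example concrete, here is a small sketch that redoes the arithmetic step by step. The function name and parameter names are made up for illustration; the numbers are the ones used in the example above.

```python
def yearly_records(users=1_000_000, purchases_per_user=5,
                   abandoned_per_purchase=2, views_per_cart=10):
    """Return yearly record counts for the hypothetical small store."""
    purchases = users * purchases_per_user
    # 2 abandoned carts for every real purchase
    abandoned = purchases * abandoned_per_purchase
    # 10 products viewed before each completed or abandoned purchase
    views = (purchases + abandoned) * views_per_cart
    return {
        "users": users,
        "purchases": purchases,
        "abandoned carts": abandoned,
        "product views": views,
    }


if __name__ == "__main__":
    for label, count in yearly_records().items():
        print(f"{label:>16}: {count:>12,}")
```

Each new kind of event multiplies the previous totals, which is why the record count climbs from millions to hundreds of millions after only a few decisions.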
What is Hadoop?
- 1) Hadoop is clever software designed to tackle the above situation.
Say we have 5 billion records of what users clicked on.
- 2) We want to analyze this data to create
recommendations so people can find the products they want more
quickly. How do we do this? We already said above that a single
database would be too large and expensive.
- 3) So Hadoop comes along and basically says: buy a lot of
commodity hardware (cheap desktops you can get from anyone) and
network them together. One of these desktops is going to be the
brain (main node) of this thing. When you give it data, it will
store it on one of the other desktops (nodes). And when you need
to query it, you send the query to the brain, and the brain will
ask each desktop to return whatever results it finds in the data
it happens to be storing.
- 4) Instead of using costly hardware, you use cheaper hardware and,
moreover, distribute the queries to the machines that hold the
data locally. It's equivalent to this question: would you hire the
fastest reader in the world, who can read, say, 1,000 pages a
minute, to read a billion pages, or hire 75 average people who can
read 15 pages a minute to each read about 13 million pages for
less money? Together the 75 readers cover 1,125 pages a minute, so
the latter is faster and cheaper, but only if you've got a large
amount of data.
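The "brain and desktops" picture above can be sketched in a few lines of Python. This is only a toy model of the idea, not Hadoop's actual API: the class names `Node` and `Brain`, the round-robin placement, and the sample records are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """One cheap desktop: holds a slice of the records and searches only that slice."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def store(self, record):
        self.records.append(record)

    def query(self, predicate):
        # Each node scans only the data it happens to be storing.
        return [r for r in self.records if predicate(r)]


class Brain:
    """The main machine: decides where data goes and fans queries out to every node."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._next = 0

    def store(self, record):
        # Round-robin placement is an assumption made for this sketch.
        self.nodes[self._next % len(self.nodes)].store(record)
        self._next += 1

    def query(self, predicate):
        # Ask every node in parallel, then merge the partial results.
        with ThreadPoolExecutor(max_workers=len(self.nodes)) as pool:
            partials = list(pool.map(lambda node: node.query(predicate), self.nodes))
        return [r for part in partials for r in part]


if __name__ == "__main__":
    brain = Brain([Node(f"desktop-{i}") for i in range(3)])
    for user, product in [("ann", "book"), ("bob", "lamp"), ("ann", "pen"), ("cy", "book")]:
        brain.store({"user": user, "product": product})
    print(brain.query(lambda r: r["user"] == "ann"))
```

No single desktop ever sees the whole data set. Each one answers for its own slice, and the brain only has to combine the small partial answers.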
Apache Hadoop
Developer: Apache Software Foundation
Release Date: December 10, 2011
Written In: Java
Operating System: Cross-platform
Type: Distributed file system
- Master your concepts from the basics and become an
expert with Bytesonlinetraining.com. We provide:
- Certification Preparation
- Job Support
- Interview Help
- General Training
- Crash Course
- For more details, visit our site: http://www.bytesonlinetraining.com
- Call: 91 998-952-7180