Basic Things You Should Know About Data Science - PowerPoint PPT Presentation

About This Presentation
Title:

Basic Things You Should Know About Data Science

Description:

With more than 6 billion devices connected to the internet at present, as much as 2.5 million terabytes of data are produced each day. By 2020, many more devices are expected to be connected, producing an estimated 30 million terabytes of data each day.

Slides: 11
Provided by: rogersamara
Category: Other

Transcript and Presenter's Notes

Title: Basic Things You Should Know About Data Science


1
Basic Things You Should Know About Data Science
  • Roger Samara

2
Roger Samara
  • With more than 6 billion devices connected to
    the internet at present, as much as 2.5 million
    terabytes of data are produced each day. By 2020,
    many more devices are expected to be connected,
    producing an estimated 30 million terabytes of
    data each day.
  • As a tech lover or an IT professional, this
    should make you curious to explore more,
    especially if you are someone who wants to know
    more about Data Science.
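
The figures quoted on this slide are easier to grasp after a little unit conversion. The sketch below is only illustrative arithmetic: it assumes decimal units (1 TB = 10**12 bytes) and takes the slide's numbers at face value. The function and constant names are made up for this example.

```python
# Figures quoted on the slide, taken at face value.
# Assumption: decimal units, so 1 EB = 1,000,000 TB and 1 TB = 1,000 GB.
DEVICES_NOW = 6e9          # "more than 6 billion" connected devices
TB_PER_DAY_NOW = 2.5e6     # 2.5 million terabytes per day
TB_PER_DAY_2020 = 30e6     # estimated 30 million terabytes per day

TB_PER_EB = 1e6
GB_PER_TB = 1e3


def to_exabytes(terabytes):
    """Convert terabytes to exabytes."""
    return terabytes / TB_PER_EB


def growth_factor(old, new):
    """Return how many times larger `new` is than `old`."""
    return new / old


def gb_per_device(terabytes, devices):
    """Average gigabytes per device for a given daily volume."""
    return terabytes * GB_PER_TB / devices


if __name__ == "__main__":
    print(f"Today: {to_exabytes(TB_PER_DAY_NOW):.1f} EB per day")
    print(f"2020:  {to_exabytes(TB_PER_DAY_2020):.1f} EB per day")
    print(f"Growth: {growth_factor(TB_PER_DAY_NOW, TB_PER_DAY_2020):.0f}x")
    print(f"Per device today: {gb_per_device(TB_PER_DAY_NOW, DEVICES_NOW):.2f} GB per day")
```

In other words, the slide's numbers correspond to about 2.5 exabytes per day today, a twelvefold increase by 2020, and roughly 0.42 GB per connected device per day at present.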

3
Let's explore a few basic things about data
science with Roger Samara.
  • What actually is Data Science all about?
  • Hadoop's role when you talk about Data Science
  • What is R in Data Science?
  • What is Apache Mahout?

4
What actually is Data Science all about?
  • Data Science has become a hot topic among the
    new technologies and trends in the Information
    Technology world. As with many technologies,
    people often start discussing it loosely, without
    actually knowing what the technology means, what
    falls within its scope and so on. It is therefore
    worth discussing in a bit of detail.

5
  • The confusion arises when you consider data
    science as part of today's technical landscape,
    because it comes with numerous components. When
    people talk about the constituents of data
    science, they are often actually talking about
    big data. At the same time, they are talking
    about the several jobs that form part of Data
    Science: what the role of a Data Scientist really
    is, what a Data Curator's role is, what a Data
    Librarian's role is and so on. At present, when
    you talk about it as a field in its own right, it
    mainly deals with large volumes of data.

6
Hadoop's role when you talk about Data Science
  • Data science essentially refers to big data and
    to the many frameworks that are used to handle
    it. A significant number of frameworks exist,
    and each has its own pros and cons. Hadoop is
    the most widespread and mainstream of them.
    Whenever you talk about data science, you are
    talking about the various analyses performed on
    this large amount of data, so you really can't
    escape Hadoop.
  • When you perform an ordinary statistical
    analysis, there is no need to care about Hadoop
    or any similar big data framework. Data Science,
    however, is a different creature. Also, Hadoop
    is written in Java, so it will really help if
    you understand Java too.
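
To give a feel for how a framework like Hadoop handles large data, here is a minimal sketch of the MapReduce programming model, which Hadoop uses for processing. This is not Hadoop code and calls no Hadoop API: it runs in a single Python process, and the function names are assumptions chosen for illustration. It only shows the three conceptual steps (map, group by key, reduce) on a word count.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run the three steps in sequence over a list of documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    docs = ["big data", "Big insights from data"]
    print(word_count(docs))
```

In a real cluster the map and reduce steps would run in parallel on many machines, each working on its own slice of the data; the sketch keeps everything in one list so the flow is easy to follow.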

7
What is R in Data Science?
  • R is a programming language for statistics.
    Avoiding R is not a good idea: when you need to
    apply different algorithms over this large
    volume of data, whether to get at the insights
    in the data or simply to run certain machine
    learning algorithms on top of it, you will need
    R.

8
What is Apache Mahout?
  • Apache Mahout is a library used for machine
    learning. It is developed under the Apache
    Software Foundation. So what are the reasons it
    has become so popular? The secret sauce is that
    it ties in directly with data science. It is
    really not just about the sheer volume of data.
    It is very much about getting useful insights
    from a given set of data.

9
  • Mahout integrates directly with Hadoop, which
    enables it to use Hadoop's processing capacity
    to run its algorithms on big data. According to
    Roger Samara, if you look at large organizations
    such as Facebook and LinkedIn, you will come
    across Mahout implementations.
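
As an illustration of what "getting useful insights from a given set of data" can look like, the sketch below implements a very small co-occurrence recommender in plain Python. It is not Mahout code and does not use Mahout's API. The function names and the sample baskets are invented for this example, and a production system would run a distributed version of the same idea over far larger data.

```python
from collections import Counter
from itertools import permutations


def cooccurrence(baskets):
    """Count how often each ordered pair of distinct items shares a basket."""
    counts = Counter()
    for basket in baskets:
        for a, b in permutations(set(basket), 2):
            counts[(a, b)] += 1
    return counts


def recommend(item, baskets, top_n=2):
    """Return the items that most often appear together with `item`."""
    counts = cooccurrence(baskets)
    scores = {b: n for (a, b), n in counts.items() if a == item}
    # Sort by descending count, then alphabetically to break ties.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


if __name__ == "__main__":
    # Hypothetical purchase histories, one list per customer.
    sample = [
        ["laptop", "mouse", "bag"],
        ["laptop", "mouse"],
        ["laptop", "bag", "mouse"],
        ["phone", "case"],
        ["laptop", "dock"],
    ]
    print(recommend("laptop", sample))
```

The point of the example is that the value comes from the pattern (which items go together), not from the raw size of the data set.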

10
Roger Samara