Basics of R programming (1) - PowerPoint PPT Presentation

About This Presentation
Title:

Basics of R programming (1)

Description:

Introduction of R Programming – PowerPoint PPT presentation

Number of Views:1118
Slides: 33
Provided by: Rahulverma123

less

Transcript and Presenter's Notes

Title: Basics of R programming (1)


1
R PROGRAMMING
  • ARAVALI COLLEGE OF ENGINEERING MANAGEMENT
  • (ACEM, FARIDABAD)

2
Program Name B.TECH CSESemester VIICourse
Name R PROGRAMMINGCourse Code
OEC-CS-701(III)Faculty Name RASHIKA
SINGHDesignation ASSISTANT PROFESSORDepartment
__CSE_____
3
UNIT No.1
  • INTRODUCTION
  • Getting R
  • R Version
  • 32-bit versus 64-bit
  • The R Environment
  • Command Line Interface
  • RStudio
  • Revolution Analytics
  • Learning Outcome
  • Familiarize themselves with R and the RStudio IDE

4
R PROGRAMMING
  • R is a software environment which is used to
    analyze statistical information and graphical
    representation.
  • R allows us to do modular programming using
    functions.
  •  This programming language was named R, based on
    the first name letter of the two authors (Robert
    Gentleman and Ross Ihaka).

5
  • "R is an interpreted computer programming
    language which was created by Ross Ihaka and
    Robert Gentleman at the University of Auckland,
    New Zealand." 
  • The R Development Core Team currently develops R.
    It is also a software environment used to
    analyze statistical information, graphical
    representation, reporting, and data modeling.
  • R is the implementation of the S
    programming language, which is combined
    with lexical scoping semantics.
  • R not only allows us to do branching and looping
    but also allows to do modular programming using
    functions. R allows integration with the
    procedures written in the C, C, .Net, Python,
    and FORTRAN languages to improve efficiency.
  • In the present era, R is one of the most
    important tool which is used by researchers, data
    analyst, statisticians, and marketers for
    retrieving, cleaning, analyzing, visualizing, and
    presenting data.

6
History of R Programming
  • The history of R goes back about 20-30 years ago.
    R was developed by Ross lhaka and Robert
    Gentleman in the University of Auckland, New
    Zealand, and the R Development Core Team
    currently develops it.
  • This programming language name is taken from the
    name of both the developers.
  • The first project was considered in 1992. The
    initial version was released in 1995, and in
    2000, a stable beta version was released.

7
(No Transcript)
8
The following table shows the release date,
version, and description of R language
Version-Release Date Description
0.49 1997-04-23 First time R's source was released, and CRAN (Comprehensive R Archive Network) was started.
0.60 1997-12-05 R officially gets the GNU license.
0.65.1 1999-10-07 update.packages and install.packages both are included.
1.0 2000-02-29 The first production-ready version was released.
9

1.4 2001-12-19 First version for Mac OS is made available.

2.0 2004-10-04 The first version for Mac OS is made available.
2.1 2005-04-18 Add support for UTF-8encoding, internationalization, localization etc.
2.11 2010-04-22 Add support for Windows 64-bit systems.
2.13 2011-04-14 Added a function that rapidly converts code to byte code.
10
2.14 2011-10-31 Added some new packages.
2.15 2012-03-30 Improved serialization speed for long vectors.
3.0 2013-04-03 Support for larger numeric values on 64-bit systems.
3.4 2017-04-21 The just-in-time compilation (JIT) is enabled by default.
3.5 2018-04-23 Added new features such as compact internal representation of integer sequences, serialization format etc.
11
Features of R programming
  • It is a simple and effective programming language
    which has been well developed.
  • It is data analysis software.
  • It is a well-designed, easy, and effective
    language which has the concepts of user-defined,
    looping, conditional, and various I/O facilities.
  • It has a consistent and incorporated set of tools
    which are used for data analysis.
  • For different types of calculation on arrays,
    lists and vectors, R contains a suite of
    operators.
  • It provides effective data handling and storage
    facility.
  • It is an open-source, powerful, and highly
    extensible software.
  • It provides highly extensible graphical
    techniques.
  • It allows us to perform multiple calculations
    using vectors.
  • R is an interpreted language.

12
Why use R Programming?
13
  • R is important for Data Science?
  • R plays a very important role in Data Science,
    you will be benefited with following operations
    in R.
  • You can run your code without any compiler  R is
    an interpreted language. Hence we can run code
    without any compiler. R interprets the code and
    makes the development of code easier.
  • Many calculations done with vectors  R is a
    vector language, so anyone can add functions to a
    single Vector without putting in a loop. Hence, R
    is powerful and faster than other languages.
  • Statistical Language  R used in biology,
    genetics as well as in statistics. R is a turning
    complete language where any type of task can
    perform.
  • 2. R is Good for Business?
  • R will just not help you in the technical fields,
    it will also be a great help in your business.
  • Here, the major reason is that R is open-source,
    therefore it can be modified and redistributed as
    per the users need. It is great for
    visualization and has far more capabilities as
    compared to other tools.
  • For data-driven businesses, lack of Data
    Scientists is a huge concern. Companies are using
    R programming as their core platform and are
    recruiting trained R programmers.

14
  • 3. R is a gateway to Lucrative Career
  • R language is used extensively in Data Science.
    This field offers some of the
  • highest-paying jobs in the world today. 
  • 4. Open-source
  • R is an open-source language. It is maintained
    by a community of active users and
  • you can avail R for free. You can modify various
    functions in R and make your own
  • packages. Since R is issued under the General
    Public Licence (GNU), there are no
  • restrictions on its usage.
  • 5. Popularity
  • R has become one of the most popular programming
    languages in the industries.
  • Conventionally, R was mostly used in academia but
    with the emergence of Data
  • Science, the need for R in the industries became
    evident. R is used at Facebook for social
  • network analysis. It is being used at Twitter for
    semantic analysis as well as visualizations.
  • 6. Robust Visualization Library
  • R comprises of libraries like ggplot2,
    plotly that offer aesthetic graphical plots to
    its
  • users. R is most widely recognized for its
    stunning visualizations which gives it an
  • edge over other Data Science programming
    languages.

15

7. With R, you can develop amazing Web-Apps R
provides you with the ability to build aesthetic
web-applications. Using the R Shiny package, you
can develop interactive dashboards straight from
the console of your R IDE. Using this, you can
embed your visualizations and enhance the
storytelling of your data analysis through
aesthetic visualizations. 8. A go-to language
for Statistics and Data Science R is the
standard language for Statistics and Data
Science. R was developed for statistics, by
statisticians. It has been in use even before the
word Data Science was coined. Statisticians
and Data Scientists are most familiar with R
than any other programming language. R
facilitates various statistical operations
through its thousands of packages. 9. R is being
used in almost every industry R is one of the
most widely used programming languages in the
world today. It is used in almost every industry,
ranging from finance, banking to medicine and
manufacturing. R is used for portfolio
management, risk analytics in finance and banking
industries. It is used for carrying out an
analysis of drug discovery and genomic analysis
in bioinformatics. R is also used to implement
various statistical measures to optimize
industrial processes.
16
Pros and Cons of R Programming Language
17
PROS/ADVATAGES OF R PROGRAMMING
  • 1. Open Source
  • R is an open-source programming language. This
    means that anyone can work with R without any
    need for a license or a fee. Furthermore, you can
    contribute towards the development of R
    by customizing its packages, developing new ones
    and resolving issues.
  • 2. Exemplary Support for Data Wrangling
  • R provides exemplary support for data wrangling.
    The packages like dplyr, readr are capable of
    transforming messy data into a structured form.
  • 3. The Array of Packages
  • R has a vast array of packages. With over 10,000
    packages in the CRAN repository, the number is
    constantly growing. These packages appeal to all
    the areas of industry.
  • 4. Quality Plotting and Graphing
  • R facilitates quality plotting and graphing. The
    popular libraries like ggplot2 and plotly advocate
    for aesthetic and visually appealing graphs that
    set R apart from other programming languages.
  • 5. Highly Compatible
  • R is highly compatible and can be paired with
    many other programming languages like C, C,
    Java, and Python. It can also be integrated with
    technologies like Hadoop and various other
    database management systems as well.

18
  • 6. Platform Independent
  • R is a platform-independent language. It is a
    cross-platform programming language, meaning that
    it can be run quite easily on Windows, Linux, and
    Mac.
  • 7. Eye-Catching Reports
  • With packages like Shiny and Markdown, reporting
    the results of an analysis is extremely easy with
    R. You can make reports with the data, plots and
    R scripts embedded in them. You can even make
    interactive web apps that allow the user to play
    with the results and the data.
  • 8. Machine Learning Operations
  • R provides various facilities for carrying out
    machine learning operations like classification,
    regression and also provides features for
    developing artificial neural networks.
  • 9. Statistics
  • R is prominently known as the lingua franca of
    statistics. This is the main reason as to why R
    is dominant among other programming languages for
    developing statistical tools.
  • 10. Continuously Growing
  • R is a constantly evolving programming language.
    It is a state of the art technology that provides
    updates whenever any new feature is added.

19
Disadvantages of R Programming
  • 1. Weak Origin
  • R shares its origin with a much older programming
    language S. This means that its base package
    does not have support for dynamic or 3D graphics.
    With common packages of R like Ggplot2 and
    Plotly, it is possible to create dynamic, 3D as
    well as animated graphics.
  • 2. Data Handling
  • In R, the physical memory stores the objects.
    This is in contrast to other languages like
    Python. Furthermore, R utilizes more memory as
    compared with Python. Also, R requires the entire
    data in one single place, that is, in the memory.
    Therefore, it is not an ideal option when dealing
    with Big Data. However, with data management
    packages and integration with Hadoop possible,
    this is easily covered.
  • 3. Basic Security
  • R lacks basic security. This feature is an
    essential part of most programming languages like
    Python. Because of this, there are several
    restrictions with R as it cannot be embedded into
    a web-application.

20
  • 4. Complicated Language
  • R is not an easy language to learn. It has a
    steep learning curve. Due to this, people who do
    not have prior programming experience may find it
    difficult to learn R.
  • 5. Lesser Speed
  • R packages and the R programming language is much
    slower than other languages like MATLAB and Python
    .
  • 6. Spread Across various Packages
  • The algorithms in R are spread across different
    packages. Programmers without prior knowledge of
    packages may find it difficult to implement
    algorithms.

21
R VERSIONS
  • RStudio Server enables users and administrators
    to have very fine-grained control over which
    versions of R are used in various contexts.
    Capabilities include
  • Administrators can install several versions of R
    and specify a global default version as well as
    per-user or per-group default versions.
  • Users can switch between any of the available
    versions of R as they like.
  • Users can specify that individual R projects
    remember their last version of R and always use
    that version until explicitly migrated to a new
    version.

22
  • On Windows, RStudio uses the system's current
    version of R by default. When R is installed on
    Windows it writes the version being installed to
    the Registry as the "current" version of R (the
    specific registry keys written are
    described here). This is the version of R which
    RStudio runs against by default.
  • You can override which version of R is used via
    General panel of the RStudio Options dialog. This
    dialog allows you to specify that RStudio should
    always bind to the default 32 or 64-bit version
    of R, or to specify a different version
    altogether

23
Note that by holding down the Control key during
the launch of RStudio you can cause the R version
selection dialog to display at startup.
24
32-BIT VERSUS 64-BIT
  • we have two R binaries
  • /APPS/32/bin/R /APPS/64/bin/R If you are on 32
    bit, then you automatically get the first when
    you just say R. If you are on 64 bit, then you
    automatically get the second when you just say R.
  • On 64 bit you can run the first, if you so
    desire, but you must invoke it using the full
    path name /APPS/32/bin/R. It will work in
    emulation mode.
  • If you never load your own C code into R
    with dyn.load(sharedlibraryname) and never use
    libraries other than those that those installed
    by the system administrators and you load
    withlibrary(packagename), then you should have no
    problems. Just say R with no additional fuss to
    invoke R, and it will work.
  • Otherwise, you will have to be aware of What
    architecture you are running on.What architecture
    the R binary you are r

25
APPLICATIONS OF R
  • 1. Finance
  • Data Science is most widely used in the financial
    industry.
  • R is the most popular tool for this role. This is
    because R provides an advanced statistical suite
    that is able to carry out all the
    necessary financial tasks.
  • With the help of R, financial institutions are
    able to perform downside risk measurement, adjust
    risk performance and utilize visualizations
    like candlestick charts, density plots, drawdown
    plots, etc.
  • 2. Banking
  • Just like financial institutions, banking
    industries make use of R for credit risk modeling
    and other forms of risk analytics.
  • Bank of America makes use of R for financial
    reporting. With the help of R, the data
    scientists at BOA are able to analyze financial
    losses and make use of Rs visualization tools.

26
  • 3. Healthcare
  • Genetics, Bioinformatics, Drug Discovery,
    Epidemiology are some of the fields in healthcare
    that make heavy usage of R. With the help of R,
    these companies are able to crunch data and
    process information, providing an essential
    backdrop for further analysis and data
    processing.
  • 4. Social Media
  • For many beginners in Data Science and R, social
    media is a data playground. Sentiment Analysis
    and other forms of social media data mining are
    some of the important statistical tools that are
    used with R.
  • Social Media is also a challenging field for Data
    Science because the data prevalent on social
    media websites is mostly unstructured in nature.
    R is used for social media analytics, for
    segmenting potential customers and targeting them
    for selling your products.

27
  • 5. E-Commerce
  • The e-commerce industry is one of the most
    important sectors that utilize Data Science. R is
    one of the standard tools that is being used in
    e-commerce.
  • Since these internet-based companies have to deal
    with various forms of data, structured and
    unstructured, as well as from varying data
    sources like spreadsheets and databases (SQL
    NoSQL), R proves to be an effective choice for
    these industries.
  • 6. Manufacturing
  • Manufacturing companies like Ford, Modelez, and
    John Deere use R to analyze customer sentiment.
    This helps them optimize their product according
    to trending consumer interests and also to match
    their production volume to varying market demand.
    They also use R to minimize their production
    costs and maximize profits.

28
Real-Life Use Cases of R Language
  • Facebook  Facebook uses R to update status and
    its social network graph. It is also used for
    predicting colleague interactions with R.
  • Ford Motor Company  Ford relies on Hadoop. It
    also relies on R for statistical analysis as well
    as carrying out data-driven support for decision
    making.
  • Google  Google uses R to calculate ROI on
    advertising campaigns and to predict economic
    activity and also to improve the efficiency of
    online advertising.
  • Foursquare  R is an important stack behind
    Foursquares famed recommendation engine.
  • John Deere  Statisticians at John Deere use R
    for time series modeling and also geospatial
    analysis in a reliable and reproducible way. The
    results are then integrated with Excel and SAP.
  • Microsoft  Microsoft uses R for the Xbox
    matchmaking service and also as a statistical
    engine within the Azure ML framework.
  • Mozilla  It is the foundation behind the Firefox
    web browser and uses R to visualize web activity.

29
FUTURE SCOPE OF R
30
  • 1. Data Scientist
  • The profession of Data Scientist is the most
    demanding job role. A Data Scientist is supposed
    to extract data, transform it into a structured
    format, perform analysis and forecast future
    insights. For this purpose, R is the most ideal
    tool as it provides efficient data handling
    capability as well as a robust set of analysis
    and machine learning tools. 
  • 2. Business Analyst
  • A Business Analyst has to develop solutions that
    are technical in nature for the various business
    problems. They are required to seek solutions,
    advance the efforts of the company as well as
    fulfill the requirements of the business. For
    this purpose, R provides various business
    intelligence tools through its extensive
    packages. 

31
  • 3. Data Analyst
  • A Data Analyst is responsible for extracting and
    analyzing data. This task requires extensive
    usage of Rs statistical libraries to deliver
    accurate results so that the companies can make
    careful data-driven decisions. 
  • 4. Data Visualization Expert
  • R is most popular for its visualization
    libraries. Due to this reason, Data Visualization
    experts in R programming are in-demand in the
    industries. The various packages of R
    likeggplot2, plotly, etc provide visually
    appealing graphs and plots to their users.
  • 5. Quantitative Analyst
  • Quantitative Analysts are engaged in the
    financial and banking industries. These
    industries have to deal with all types of data
    and R provides an ideal solution to their various
    data problems. 

32
Aravali College of Engineering And
Management Jasana, Tigaon Road, Neharpar,
Faridabad, Delhi NCR Toll Free Number 91-
8527538785 Website  www.acem.edu.in
Write a Comment
User Comments (0)
About PowerShow.com