Facilitating Data Analysis Through Ruby - PowerPoint PPT Presentation

1
Facilitating Data Analysis Through Ruby-R
Interaction
  • Rishi Gharpuray, James Woo, Carl Leonard and Jim
    DeLeo
  • Scientific Computing Section
  • Department of Clinical Research Informatics
  • NIH Clinical Center
  • National Institutes of Health
  • Bethesda, Maryland

2
Objective
  • To create and develop a graphical software
    module, written in Ruby, for Smart-Mart™, a
    software environment presently under development
    by the NIHCC Scientific Computing Section. Plots
    produced with this module should facilitate data
    analysis for biomedical research and promote the
    objectives of data mining and translational
    medicine.

3
Ruby Overview
  • Ruby is an interpreted, interactive language,
    popular for web development, that combines ideas
    from the classic C language and Perl, another
    programming language. Ruby is very easy to use
    compared to the syntactically complex and cryptic
    C and C++. For loops, instead of requiring all
    sorts of parentheses and uncommon characters, can
    simply be written as
  • for i in 1..20
  • to go through the loop body 20 times.
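
As a minimal sketch of the loop syntax described above (the variable names count, total and i are illustrative and do not come from the slides), a Ruby for loop over the inclusive range 1..20 runs its body 20 times. The second form, using each with a block, is the spelling most Ruby programmers would reach for.

```ruby
# Count how many times the loop body runs.
count = 0
for i in 1..20 # 1..20 is an inclusive range: i takes the values 1 through 20
  count += 1
end
puts count

# The same iteration written with a block, the more common Ruby idiom.
total = 0
(1..20).each { |i| total += i }
puts total
```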

4
Ruby Overview Continued
  • Ruby is easily extended with Gems, which can
    conveniently be installed at the command line
    with the gem install <name of gem> command.
    RubyForge, a website with forums and a variety
    of useful packages, has become one of my best
    friends. http://rubyforge.org/
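
After running gem install, it can be handy to confirm from Ruby itself that a gem is actually available. The sketch below is one way to do that, assuming a standard RubyGems installation. The helper name gem_installed? is hypothetical, while Gem::Specification.find_all_by_name is part of RubyGems.

```ruby
require 'rubygems'

# Returns true when at least one version of the named gem is installed.
# find_all_by_name returns an empty array (it does not raise) for unknown gems.
def gem_installed?(name)
  !Gem::Specification.find_all_by_name(name).empty?
end

if gem_installed?('rsruby')
  puts 'rsruby is installed'
else
  puts 'rsruby is missing; try: gem install rsruby'
end
```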

5
R Overview
  • R is at first glance a very simple
    statistical language that can be used to plot
    histograms, pie charts, bar graphs, line graphs,
    and the typical scatter plot with regression
    lines. After delving deeper into the language,
    however, contour plots, density plots, complex
    histograms, and more reveal themselves.

6
Command Line Overview (cmd)
  • Most Windows users will be familiar with the
    command line, which can be opened from Run by
    typing cmd. The two main commands essential to
    navigating with cmd are dir and cd: dir lists
    the contents of the current directory, and cd
    changes into a specific folder within the
    current directory.
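
The same two navigation steps can also be done from inside a Ruby script. This is a small sketch using only Ruby's standard Dir class; the folder name plots is made up for the example, and everything happens in a temporary directory so nothing is left behind.

```ruby
require 'tmpdir'

listing = nil
inner_dir = nil

# Work inside a throwaway directory that is deleted when the block ends.
Dir.mktmpdir do |root|
  Dir.mkdir(File.join(root, 'plots'))    # hypothetical folder name
  listing = Dir.children(root)           # like dir: list the contents
  Dir.chdir(File.join(root, 'plots')) do # like cd plots
    inner_dir = File.basename(Dir.pwd)   # confirm where we ended up
  end                                    # the previous directory is restored here
end

puts listing.inspect
puts inner_dir
```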

7
Problems with Windows
  • Installation of Ruby Gems became quite
    complicated on the Windows command line; it was
    also unpredictable.
  • Installation through Windows can get
    mind-boggling and messy at times, so a stronger
    command line, capable of executing all the
    installations, would be much better. This is
    precisely why I turned to the Linux-based Ubuntu!

8
Why Linux and not Windows?
  • "There are so many software choices when it
    comes to doing any specific task. You could
    search for a text editor on Freshmeat and yield
    hundreds, if not thousands, of results. My article
    on 5 Linux text editors you should know about
    explains how there are so many options just for
    editing text on the command line due to the open
    source nature of Linux. Regular users and
    programmers contribute applications all the time.
    Sometimes it's a simple modification or feature
    enhancement of an already existing piece of
    software, sometimes it's a brand new application.
    In addition, software on Linux tends to be packed
    with more features and greater usability than
    software on Windows. Best of all, the vast
    majority of Linux software is free and open
    source. Not only are you getting the software for
    no charge, but you have the option to modify the
    source code and add more features if you
    understand the programming language. What more
    could you ask for?" (Ubuntu Website)

9
Using Ubuntu
  • Ubuntu's webpage describes it as "a
    community developed, Linux-based operating system
    that is perfect for laptops, desktops and
    servers. It contains all the applications you
    need: a web browser, presentation, document and
    spreadsheet software, instant messaging and much
    more." In addition, it is easy to download and
    upgrade, and best of all it's free!

10
Connecting R and Ruby
  • Our project in DCRI is unified by the Ruby
    language, which we are using to implement
    SmartMart™. My tasks, however, were heavily tied
    to the statistics language R. Hence, I had to
    find a way to connect the two languages to
    maintain consistency.

11
RSRuby Gem
  • This is essentially a bridge library that
    gives Ruby programmers access to the
    spectacular R environment. RSRuby embeds a full R
    interpreter inside the running Ruby script,
    allowing R methods to be called and data to be
    passed between the Ruby script and the R
    interpreter. Most data conversion is handled
    automatically, but user-definable conversion
    routines can also be written to handle any R or
    Ruby class.

12
R from Command Line
  • Running R from the command line is a piece of
    cake; however, most important applications rely
    on scripts, or numerous lines of code, to
    accomplish a task. It is often more economical
    to open an editor such as Nano or vi to write an
    R script and then execute it.

13
Editor Software-Nano
  • Nano is a very easy editor to use, and can be
    opened with the simple command nano.
    Unfortunately, Linux protects some files with a
    password, so the prefix sudo must be added to
    edit them. So, the sudo nano -w command opens a
    save-able version of the source code file, which
    can be of any type, but will be either .r or .rb
    for my project.

14
RSRuby Produced Graphs
  • The following code is all Ruby code, but it
    executes R commands! My comments, which for the
    non-coders are segments of code that are not
    executed and serve to help other programmers and
    viewers follow the complexities of the code, are
    marked with a # in Ruby, as well as in R. My task
    of producing these graphs will be explained
    through these comments.

15
Ruby Code Segment 1
  • GNU nano 2.0.7
    File: rsrubyplots.rb
  • # The following two lines import two libraries
    # into the code so that functions and variables
    # defined in these two libraries will be
    # accessible to the code in this rsrubyplots.rb
    # file.
  • require 'rubygems'
  • require 'rsruby'
  • # Create a variable called r that will be placed
    # before R commands and acts as the instance
    # variable for R.
  • r = RSRuby.instance
  • # Create two arrays storing data elements!
  • array1 = r.c(1,2,3,4,5,6,7,7,8,5,4,3,23,2,4,5,6,77,5,4,3,3,4,5,56,12,34,67,89,27)
  • array2 = r.c(2,3,4,5,6,7,8,8,6,6,5,4,4,3,23,2,4,5,6,6,45,4,4,3,5,6,43,3,4,80)

16
Scatter plot Command
  • ylabel = "Y-Axis (Dependent Variable)"
  • xlabel = "X-Axis (Independent Variable)"
  • title = "Scatter Plot (Insert Title Here)"
  • r.png(r.paste("/home/rgharpuray/scatterplot.png"))
  • r.plot('x' => array1, 'y' => array2, 'xlab' => xlabel, 'ylab' => ylabel, 'main' => title)
  • r.eval_R("dev.off()")

17
Pie Graph Command
  • # Pie graph command
  • title = "Pie Chart (Insert Title Here)"
  • r.png(r.paste("/home/rgharpuray/piegraph.png"))
  • r.pie('x' => array1, 'labels' => r.c("group1","group2","group3","group4","group5","group6","group7","group8","group9","group10","group11"), 'main' => title)
  • r.eval_R("dev.off()")

18
Rnorm Command
  • title = "Rnorm (Insert Title Here)"
  • r.png(r.paste("/home/rgharpuray/rnorm.png"))
  • data = r.rnorm(100)
  • r.plot('x' => data, 'xlab' => xlabel, 'ylab' => ylabel, 'main' => title)
  • r.eval_R("dev.off()")

19
QQPlot Command
  • title = "QQPlot Comparison of Two Distributions (Insert Title Here)"
  • r.png(r.paste("/home/rgharpuray/qqplot.png"))
  • r.qqplot('x' => array1, 'y' => array2, 'xlab' => xlabel, 'ylab' => ylabel, 'main' => title)
  • r.eval_R("dev.off()")

20
Histogram
  • title = "Histogram (Insert Title Here)"
  • r.png(r.paste("/home/rgharpuray/histogram.png"))
  • r.hist('x' => array1, 'xlab' => xlabel, 'ylab' => ylabel, 'col' => "green", 'main' => title)
  • r.eval_R("dev.off()")
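
Each plot in the preceding slides follows the same three steps: open a PNG device, call a plotting function, close the device. The sketch below shows one way that pattern could be wrapped in a single method. The helper plot_to_png and the RecordingR stand-in class are hypothetical names introduced here, not part of RSRuby; the stand-in only records calls so the example runs without R installed. With a real RSRuby instance, r would be the object returned by RSRuby.instance.

```ruby
# Wraps the open-device / plot / close-device pattern in one method.
# `r` is assumed to respond to png, eval_R and the plotting functions,
# as the RSRuby instance in the slides does.
def plot_to_png(r, plot_function, path, args)
  r.png(path)                   # open a PNG graphics device at `path`
  begin
    r.send(plot_function, args) # e.g. r.hist('x' => data, 'main' => title)
  ensure
    r.eval_R('dev.off()')       # always close the device, even on error
  end
  path
end

# Stand-in object that records every call made on it.
class RecordingR
  attr_reader :calls

  def initialize
    @calls = []
  end

  def method_missing(name, *args)
    @calls << [name, args]
    nil
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

fake_r = RecordingR.new
plot_to_png(fake_r, :hist, 'histogram.png',
            'x' => [1, 2, 3], 'col' => 'green', 'main' => 'Histogram')
puts fake_r.calls.map(&:first).inspect
```

Closing the device in an ensure clause means a failed plot call does not leave a half-written PNG device open.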

21

[Diagram] Data from Researcher (text file, table, ppt, database, etc.) -> Data Acquired -> James Woo's Module: data structure and conversion to usable type (array) -> Data Managed and Organized! -> Data Plotted/Graphed!

22
Explanation of Big Picture
  • The previous diagram summarizes how all of
    our projects come together to form an effective
    and interesting whole! Data is acquired from the
    biomedical researcher. Data may be in the form of
    a database, Excel spreadsheet, text file, or
    more. My coworker James's module converts this
    data to a more effective and usable form of
    storage, an array. It is in this form that my
    plotting commands come into play: I take the
    reformatted data produced by James and make
    plots out of it. Through this entire process,
    it is our hope that the information and data
    acquired can be more effectively used and
    applied!
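
The conversion module itself is not shown in these slides, so the following is only a hypothetical stand-in for the "convert to array" step. It assumes the researcher's data arrives as plain text with one comma-separated x,y pair per line; the method name parse_xy and the sample values are made up for illustration. The resulting arrays are in the same shape as array1 and array2 in the plotting code.

```ruby
# Parses lines of "x,y" text into two parallel arrays of Floats.
# Blank lines and lines starting with '#' are skipped.
def parse_xy(text)
  xs = []
  ys = []
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#')

    x, y = line.split(',').map { |field| Float(field.strip) }
    xs << x
    ys << y
  end
  [xs, ys]
end

# Made-up sample data standing in for a researcher's text file.
sample = <<~DATA
  # x,y
  1,2
  2,3.5
  3,4
DATA

array1, array2 = parse_xy(sample)
puts array1.inspect
puts array2.inspect
```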

23
Results/Conclusions
  • Overall, my module has been a success and has
    produced a variety of graphs using a mixture of
    Ruby and R commands! Analyzing the data will be
    much easier and will produce interesting
    insights. Nevertheless, like many science
    experiments and computer programs, my modules
    took quite a while to perfect and make functional!

24
Acknowledgements
  • My mentors Carl Leonard and Jim DeLeo, in
    conjunction with my coworker James Woo, helped
    me complete my tasks and fix errors that
    presented themselves. I am very thankful to them
    for providing me with help this summer.
  • Documentation for R in the "An Introduction to
    R" PDF manual also helped me execute these R
    commands.
  • The RSRuby documentation helped me recognize how
    to transition from R commands to Ruby commands.

25
What I Learned and My Experience
  • The internship this summer has taught me a
    great deal about several aspects of coding of
    which I was originally oblivious or by which I
    was slightly perplexed. It was very entertaining
    as well as enlightening to see how all of our
    tasks, despite being independent, came together
    to form an amazing project! A big thanks to my
    coworkers, James Woo, Simone Campbell, and
    Morgan Clinton, as well as my mentors Carl
    Leonard and Jim DeLeo!

26
References
  • McCown, Frank. "Producing Simple Graphs with R."
    2006-2007. 29 June 2008 ccown/r/.
  • "Support." RubyForge. 28 June 2008.
  • Ubuntu. 05 June 2008.
  • Venables, W. N., and D. M. Smith. "An
    Introduction to R." R Project. 23 June 2008. 1
    July 2008 -intro.pdf.
  • Warnes, Gregory R. "RSRuby Documentation." 2 July
    2008 /.

27
THE END
  • Any questions, just shoot an email to me at
  • rgharpuray_at_gmail.com
  • or
  • rishig_at_stanford.edu