Title: Implementing Legacy Statistical Algorithms in a Spreadsheet Environment
1Implementing Legacy Statistical Algorithms in a
Spreadsheet Environment
- Stephen W. Liddle
- Information Systems Faculty
- Rollins eBusiness Center
John S. Lawson Department of Statistics
Brigham Young University Provo, UT 84602
2Overview
- Introduction
- Fundamentals of VBA in Excel
- Retargeting traditional algorithms to a
spreadsheet environment - Converting FORTRAN to VBA
- Conclusions
3Why Convert FORTRAN Programs to Run in a
Spreadsheet Environment?
- Useful code available that is not implemented in
standard statistical packages - FORTRAN compilers not usually available on normal
Windows workstation - Many textbooks refer to published FORTRAN
algorithms
4Sources for Published FORTRAN Algorithms
- STATLIB (http//lib.stat.cmu.edu/)
- General Archive
- Applied Statistics Archive
- Journal of Quality Technology Archive
- JASA Software Archive
- JCGS Archive
5Advantages of Running Legacy FORTRAN Code in Excel
- Comfortable environment for practitioners
- More user friendly input from spreadsheet
- Output to spreadsheet allows further graphical
and computational analysis of results with Excel
functions
6(No Transcript)
7(No Transcript)
8(No Transcript)
9(No Transcript)
10(No Transcript)
11(No Transcript)
12(No Transcript)
13(No Transcript)
14Proposed Methodology
- Understand original FORTRAN program
- Choose suitable I/O methods
- Convert original FORTRAN code to VBA
- Test and use resulting Excel code
15Visual Basic For Applications
- Built on ANSI BASIC
- Language engine of Microsoft Office
- Modern structured programming language
- Has vast array of types, functions, programming
helps - Powerful support environment (Office platform)
- Popular in business contexts
16Excel Object Model
- Objects in Excel are addressable in VBA
- Each object has
- Properties
- Methods
17Input/Output Methods
- Non-interactive
- Files, databases
- Worksheet cells
- Interactive
- Message boxes
- Input boxes
- Custom GUI forms
18FORTRAN vs. VBA
- VBA (-bSqr (b 2-4ac))/(2a)
- FORTRAN (-bSQRT(b2-4ac))/(2a)
19More Operators
- .EQ.
- .NE. ltgt
- .LT. lt
- .LE. lt
- .GT. gt
- .GE. gt
- .AND. And
- .OR. Or
- .NOT. Not
- //
20Data Types
- INTEGER Byte, Integer, Long
- REAL Single
- DOUBLE PRECISION Double
- COMPLEX Non-primitive in VBA
- LOGICAL Boolean
- CHARACTER String
- CHARACTERlength Stringlength
- Other notable VBA types
- Currency, Decimal, Date, Variant
21Worksheet Functions
- ChiDist(x,deg_freedom)
- Returns one-tailed probability of the ?2
distribution. - Correl(array1,array2)
- Returns the correlation coefficient of two cell
ranges. - Fisher(x)
- Returns the Fisher transformation at a given x.
- Pearson(array1,array2)
- Returns the Pearson product moment correlation
coefficient for two sets. - Quartile(array,quart)
- Returns the requested quartile of a data set.
- StDev(array)
- Returns the standard deviation of a data set.
- ZTest(array,x,sigma)
- Returns the two-tailed P-value of a z-test.
22Flow-Control Statements
23Subtle Differences (Gotchas)
- Implicit conversion of real to integer values
- FORTRAN truncate
- VBA round
- Solution use VBAs Fix(), which truncates
- Both languages allow implicit typing
- This introduces ambiguity
- Solution supply explicit types everywhere
24Eliminating Goto Statements
- Computer science accepts the axiom that goto is
generally considered harmful - We advocate rewriting alogrithms to use
structured programming techniques where feasible - Sine qua non is make it work
- Its a good idea for maintainability,
understandability to move to structured form
25Eliminating Goto Statements
DO 8 J1,3 ... 6 ...
IF(OBJFN.GT.BESTFN) GO TO 7 ...
GO TO 6 7 IF(J.EQ.3) GO TO 8
XKBESTK-STEP 8 CONTINUE
26Eliminating Goto Statements
For j1 To 3 ... 6 ...
IF(OBJFN.GT.BESTFN) GO TO 7 ...
GO TO 6 7 IF(J.EQ.3) GO TO 8
XKBESTK-STEP 8 Next j
27Eliminating Goto Statements
For j1 To 3 ... 6 ...
IF(OBJFN.GT.BESTFN) GO TO 7 ...
GO TO 6 7 If j ltgt 3 Then xk
bestk - step End If Next j
28Eliminating Goto Statements
For j1 To 3 ... Do Until
objfn gt bestfn ... Loop
If j ltgt 3 Then xk bestk - step
End If Next j
29Our Reasoning
- Digital assets are fragile
- FORTRAN is not universally available
- Excel is a ubiquitous, powerful platform
- VBA is a full-featured language capable of
handling sophisticated statistical computations
30Conclusions
- We recommend creating a Web-based repository of
Excel/VBA implementations of classic statistical
algorithms - We can preserve our legacy algorithms in this
modern spreadsheet environment - E-mail us if you want a copy of our manuscript
(liddle or lawson_at_byu.edu)