Using SQL Databases from APL (Dyalog - PowerPoint PPT Presentation

About This Presentation
Title:

Using SQL Databases from APL (Dyalog

Description:

Using SQL Databases from APL (Dyalog & other) J. Merrill Analytical Software Corp. jamesmerrill_at_usa.net – PowerPoint PPT presentation

Number of Views:142
Avg rating:3.0/5.0
Slides: 99
Provided by: JMer4
Category:

less

Transcript and Presenter's Notes

Title: Using SQL Databases from APL (Dyalog


1
Using SQL Databases from APL (Dyalog other)
  • J. Merrill
  • Analytical Software Corp.
  • jamesmerrill_at_usa.net

2
Overview
  • About my APL and SQL experience
  • This talk emphasizes information about SQL
  • Concepts you must know
  • Weaknesses you need to understand
  • Ideas about SQL database design
  • Why so much about SQL?
  • Could a few people learn the details about SQL,
    and build tools to shield other developers from
    the issues?
  • Why you should not hide SQL from APLers
  • The very successful project that was a failure
  • APL talking to SQL is not hard to do (any more)

3
My Background
  • I started working with APL professionally in
    1975, as a user of STSC timesharing.
  • While working at STSC, I took a course in
    relational database technology and SQL from a
    representative of a small company called
    Relational Software Inc. Later they changed
    their name to Oracle Ive worked with SQL for
    longer than Oracle Corporation has existed!

4
SQL Work at STSC
  • I automated connection of an Oracle database of
    microcomputer software sales data to the existing
    (APL-based) accounting system.
  • I designed a portable interface between STSCs
    APLPlus and SQL on these platforms
  • VMS talking to RDB
  • VMS talking to Oracle
  • Unix talking to Oracle
  • and implemented the APL in those interfaces.

5
STSC, Manugistics, and Later
  • I was a software development consultant for many
    years with both STSC and (after the name change)
    Manugistics.
  • Client projects included migrating an APL-based
    database of foreign exchange trading data to
    Microsoft SQL Server.
  • Since founding Analytical Software Corp. Ive
    worked extensively with clients using SQL Server
    from APLWin.

6
SQL Concepts You Must Know
  • SQL has only four data manipulation verbs
  • RDBMSs use Client / Server techniques
  • SQL has the concept of null
  • All data is in a table (like an APL matrix)
  • SQL columns are named, not numbered
  • SQL columns are strongly typed
  • Relationships are data not structure

7
DML select insert update delete
  • SQLs select is extraordinarily versatile
  • SQLs insert and update normally operate on a
    single row at a time and therefore are quite
    primitive
  • If insert and update work on multiple rows, not a
    single row, they become very powerful (as
    powerful as select)
  • SQLs delete is usually simple but can use some
    powerful select techniques

8
SQL Has DML and DDL
  • DML means data manipulation language
  • SQLs four DML verbs are select, insert, update,
    delete
  • For defining the database structure (schema), SQL
    has DDL
  • DDL means data definition language
  • DDL is less standardized between products

9
SQL select is Very Powerful
  • select supports extremely complex statements that
    can access data from multiple tables, potentially
    joining them using every available join technique
    in a single query
  • Much of selects power is based on the ability to
    define derived tables within a statement, and use
    those tables as if they held real data but no
    data is actually stored to run the query
  • The syntax is intended let you express intent
    without specifying a particular processing
    sequence
  • Powerful does not mean simple or intuitive

10
insert update SQLs Weak Spot
  • When data is in memory (in an application)
  • insert statements allow you to add one row,
    giving values for each column in the new row
  • update statements typically reference a single
    row by its primary key (though it can change many
    columns in one update)
  • Applying a group of changes based on user input
    typically involves issuing many one-row-each
    insert or update statements

11
Multi-Row insert and update
  • insert supports adding the result of any select
    to an existing table
  • update similarly can update many rows of one
    table from the contents of another table,
    providing the tables can be joined (usually on
    the primary key)
  • The syntax for this is not completely standard
    and some products may not support this
  • Get the data on the server if possible!

12
SQL delete Does What It Says
  • delete can use the full power of select to
    determine what row or rows of a table should be
    deleted
  • When deleting rows based on a list of primary key
    values, implementations that limit the length of
    a SQL statement (or an individual clause within
    the statement) can require multiple delete
    statements to delete a large number of rows

13
SQL Concepts You Must Know
  • SQL has only four data manipulation verbs
  • RDBMSs use Client / Server techniques
  • SQL has the concept of null
  • All data is in a table (like an APL matrix)
  • SQL columns are named, not numbered
  • SQL columns are strongly typed
  • Relationships are data not structure

14
Client / Server Concepts
  • Clients send requests to the server and it
    responds with the results
  • Only the DBMS server touches the data
  • Its good to have the server do things like
    compute totals so that results are smaller
  • The number of back-and-forth interactions is
    potentially more significant than the amount of
    data being transferred

15
SQL is the Language for Server Requests and
Responses
  • Your program sends SQL statements (only) to the
    server and handles the responses
  • Some tools hide the fact that everything is done
    with SQL, by generating SQL for you
  • SQL is very powerful for reading data (select)
    and deleting data (delete)
  • SQL is quite stupidly simple for writing data in
    memory into a table (insert, update)
  • The over-the-wire protocol is complex, but is
    hidden by (most) client libraries

16
One Program Touches the Data
  • A single coordinating program, the DBMS server,
    performs all access (reading and writing) to data
    files holding the SQL data
  • Access to data is controlled by the server
  • The server process can be run on a very powerful
    (expensive) computer
  • Conflicting requests from different users can be
    handled more easily

17
Let the Server Do the Work
  • The server should be a better computer than the
    one running your program
  • It is optimal if the server can get the job done
    without sending a lot of data over the wire to
    your program
  • Downside the more you get the server to do the
    work (by sending SQL), the more you have to worry
    that things will need to change if you switch to
    another DBMS

18
Avoid Chatty Programs
  • Each interaction with the server takes time for
    the back-and-forth protocol, in addition to
    whatever actual data is passed
  • Rather than sending a series of steps to the
    server one after the other, its better to send
    one larger request
  • SQLs simplistic insert update is an issue
  • Downside the more you bunch up your requests,
    the more difficult it is to handle errors

19
SQL Concepts You Must Know
  • SQL has only four data manipulation verbs
  • RDBMSs use Client / Server techniques
  • SQL has the concept of null
  • All data is in a table (like an APL matrix)
  • SQL columns are named, not numbered
  • SQL columns are strongly typed
  • Relationships are data not structure

20
SQLs Concept Called null
  • Intended to represent a missing or unknown
    value, rather than an empty value
  • If a value is null, it is not equal to any
    particular value, nor is it not-equal to any
    particular value. Suppose A is an arbitrary SQL
    expression
  • int_column 5
  • char_column hello
  • One would think that the result of
  • A or (not A)
  • has to be true but in SQL thats not the case
    (!).

21
null Causes three valued logic
  • APLers think we understand booleans, but then
    along comes SQLs notion of null that changes the
    rules weve learned
  • When nulls are involved, the result of a boolean
    expression is one of true, false, or null
  • The result of any expression is null if any value
    involved in the expression is null
  • The SQL function coalesce returns the first
    not-null parameter value passed
  • coalesce(column, value_if_null)

22
Use Null Values Sparingly
  • Allowing nulls sometimes makes sense
  • Definitely makes sense for date or time
  • Makes sense for numeric data when 0 is not an
    appropriate no information replacement
  • Dont use a missing value code in the database,
    even if you use one in your APL data
  • Any distinction between NULL and (an empty
    string) is hard to see, justify, or explain to
    non-programmers (or programmers!).
  • Sometimes its not your decision

23
SQL Concepts You Must Know
  • SQL has only four data manipulation verbs
  • RDBMSs use Client / Server techniques
  • SQL has the concept of null
  • All data is in a table (like an APL matrix)
  • SQL columns are named, not numbered
  • SQL columns are strongly typed
  • Relationships are data not structure

24
All SQL Data is in Tables
  • There is no other persistent data structure
  • SQL table values are conceptual scalars
  • SQL can hold strings (enclosed char vectors)
  • Otherwise, no nested data (in standard SQL)
  • If APL had scalar strings, a SQL table would be
    just like a simple APL matrix
  • SQL locates rows using data values only
  • No notion like row number in an APL matrix

25
SQL Columns
  • Columns are named, not numbered
  • Each column has a (scalar) data type
  • Most data types can support the special SQL value
    null
  • Handling null values correctly can be difficult

26
Column Names
  • One difference between products is in what column
    names they allow
  • Some allow spaces in names, but that makes it
    necessary to quote those column names (so SQL
    can parse statements using them)
  • Standard quoting techniques vary (!)
  • Advice use simple, not-very-long names
  • Its not always your choice

27
SQL Concepts You Must Know
  • SQL has only four data manipulation verbs
  • RDBMSs use Client / Server techniques
  • SQL has the concept of null
  • All data is in a table (like an APL matrix)
  • SQL columns are named, not numbered
  • SQL columns are strongly typed
  • Relationships are data not structure

28
Column Type Catgeories
  • Numbers
  • Strings
  • Dates and Times
  • Binary
  • Time Intervals
  • Boolean

29
Numbers
  • Integers of various sizes (1, 2, 4, 8)
  • Usually no unsigned integer
  • 4- or 8-byte floating point
  • Fixed width and number of decimals
  • Money sometimes available
  • 64-bit integer with 4 implied decimals

30
Strings
  • Fixed length with maximum size
  • Variable length with maximum size
  • Unlimited size often available
  • Multi-byte data (Unicode) often available

31
Dates and Times
  • One or more of date, time, datetime
  • Concept of NULL is very useful here
  • Dates are stored as Julian date values
  • Many support negative Julian dates
  • Time values are stored as fraction of a day
  • 0.25 is 6am, 0.5 is noon, 0.75 is 6pm
  • Datetime values stored as (day timefraction)
  • If only datetime is available,
  • time can are represented as being on day 0
  • date can be represented as being at midnight

32
Time Intervals
  • Supposedly standard
  • Not universally available (SQL Server)
  • Can be represented with a date or datetime value
    holding the result of subtracting one date or
    datetime value from another

33
Binary
  • Fixed length with a maximum size
  • Variable length with a maximum size
  • Unlimited size often available

34
Boolean
  • A single bit (!)
  • Not fixed or variable length with a maximum
    size like string and binary data
  • Sometimes boolean columns support null (so there
    are 3 states)
  • My advice use a one-wide character column
    instead

35
Time for a Break ?
36
SQL Concepts You Must Know
  • SQL has only four data manipulation verbs
  • RDBMSs use Client / Server techniques
  • SQL has the concept of null
  • All data is in a table (like an APL matrix)
  • SQL columns are named, not numbered
  • SQL columns are strongly typed
  • Relationships are data not structure

37
How Relationships Work in SQL
  • In APL, you can choose to represent related
    structures by nesting
  • In SQL, there are only tables and there is no
    possibility of nesting
  • Relationships are stored by having a column in
    one table store a value identifying a related row
    in another table (or null if there is no related
    row)

38
SQL Table Design
  • All columns in a table should represent
    information about one entity (or concept)
  • row data about one instance
  • column same data re many instances
  • The term field is deprecated (but widely used)
  • Each table needs a primary key column whose
    value can be stored in any other table that has a
    relationship with rows in this table
  • SQL-generated primary key values allow
    relationships to be represented easily and can
    prevent issues that arise when users think they
    want to change key values

39
SQL Relationships
  • One to many (parent - child)
  • Many to one (lookup)
  • Hierarchy
  • Many to many

40
Relationships one to many
  • Each (one) invoice can have any number (many) of
    line items
  • LineItem table has an InvoiceID column holding
    Invoice table primary key (PK) value
  • No line items for invoice N is easily
    represented by there being no rows in the
    LineItem table with InvoiceID N
  • An APL application might use nesting, storing a
    vector or matrix of lineitem data within each
    Invoice data structure

41
One to Many Parent - Child
  • The one table (Invoice) is parent
  • The many table (LineItem) is child
  • Column holding primary key (PK) of another table
    is called a foreign key (FK) to (or into) that
    table
  • Foreign keys are critical to competent SQL
    database design

42
Relationships many to one
  • There can be any number (many) of employees in
    each (one) department
  • Employee table has a column holding Department
    table PK value (read as FK to Department)
  • No employees in department N is again easy
  • Not thought of as parent / child you dont think
    of departments as owning employees the way
    invoices own line items
  • An APL application could (but wouldnt) use
    nesting, storing a vector of EmployeeID values
    (but not Employee data) for each department

43
Relationships hierarchy
  • Each employee has a manager the manager is also
    an employee
  • Employee table has a ManagerID column holding FK
    to the same (Employee) table
  • Top of hierarchy is represented by null ManagerID
    value (no manager)
  • APL could use nesting, but I havent

44
Relationships many to many
  • Requires a third table to hold two many-to-one
    relationships to the other two tables
  • ClassStudent table records that
  • each class can have any number of students
    enrolled in it
  • each student enrolls in any number of classes
  • You cant use nesting for this (without
    double-storing the information), so APL
    applications usually do it the same way

45
More SQL Design Thoughts
  • Advice use your products identity or
    sequence feature to create PK values
  • When in doubt, define another table
  • Examples of extra tables that add flexibility
  • Storing historical data (not just the current
    value, but changes over time)

46
Advice Use DBMS-assigned Values as Primary Keys
  • Most implementations support having the DBMS
    assign the value of a particular column to be 1
    the previous value, during insert of a new row
  • Columns defined this way make excellent PKs that
    are efficient and easily used as FKs
  • Columns that are user-visible primary keys
    (e.g. Product Code, Region Code, Department Code)
    become attributes
  • Changing them no longer causes pain and trouble

47
When In (Any) Doubt, Define Another Table
  • When you see a numeric suffix on a column name,
    thats an indication that another table should
    have been built
  • FK columns in an Operation table named Doctor1
    Doctor2 Doctor3 Doctor4 would be a poor design
  • What if more than 4 doctors involved?
  • Better to have a child table with any number
    (even zero) of doctors for an operation

48
More Tables Are Often Better
  • Some cases where using more tables results in a
    more flexible system
  • Address table where both Customer and Employee
    tables have AddressID column
  • Address table has column AddressType (FK to
    AddressType table with descriptions Billing,
    Shipping, Home, Office)
  • Name table where Patient table has NameID
  • Column Primary in Name table marks the current
    (primary) name to use for this person

49
Historical Data
  • Data that changes over time where history needs
    to be maintained can be in a child table with a
    ValidUntil column, and possibly also a ValidSince
    column
  • The currently active child table row is marked by
    a null ValidUntil value
  • Find the active child table row on date X
  • (X gt coalesce( ValidSince, X ))
  • (X lt coalesce( ValidUntil, X ))
  • Code needs to ensure that timespans dont overlap
  • Can store ID of active child table row in parent

50
SQL Joins
  • Because data values are used to define
    relationships, SQL needs to be able to combine
    (join) tables based on the defined FK-to-table
    relations and it can do so quite flexibly
  • Each SQL statement must re-specify the join
    condition(s) tools to generate SQL code can make
    this easier to get right
  • SQL supports different kinds of joins
  • Natural join
  • Left (or right) join
  • Cross join

51
Time for Lunch ?
52
SQL Join is What in APL?
  • Similar to A,BBpk?AfkToB
  • Natural join
  • values of AfkToB that would cause INDEX ERROR
    in APL instead remove those rows from A (!)
  • if there are multiple rows that match in B, rows
    of A are repeated to match
  • Left join adds a row of null values to B to
    avoid losing rows of A with unmatched values
  • A,(B??null)Bpk?AfkToB
  • Cross join like APL outer product (usually this
    is a mistake) all combinations of values

53
Relationships vs. Join Types
  • Parent-child relationship left join
  • Lookup table left join
  • Natural join removes unmatched rows make sure
    thats what you want
  • Many-to-many should normally be two left joins
    from the middle table, because its two combined
    one-to-many relationships
  • Hierarchies are non-trivial to handle

54
Given the Issues, Why Use SQL?
  • Standard outside the APL world
  • Superb support for transactions
  • Security is built-in
  • Advanced database features would be very
    difficult to provide otherwise

55
SQL Databases are Standard
  • Wide availability of expertise
  • Backup and recovery are solved problems
  • Other parts of an application can be developed by
    others or using other tools
  • Many non-APL tools can work with them
  • Logical and physical data modeling
  • Reporting
  • Performance analysis

56
Database Transactions
  • ACID characteristics
  • Atomicity, Consistency, Isolation, Durability
  • All-or-nothing data updates, ensuring data
    consistency even when there are errors during
    updates
  • Each users work is not affected by the work of
    other users
  • Server or application software crashes leave data
    intact after a restart

57
Security Built In
  • DBMS server controls all access to data
  • application can be the only way to data or
  • can use Windows identity as SQL identity
  • Access can be removed from the underlying tables,
    perhaps granting select access via views that use
    user identity
  • Corporations are accustomed to securing their
    databases, but not their (APL) files

58
Advanced DBMS Features
  • Flexible high-performance indexing
  • Views (including materialized views)
  • Sub-queries
  • Stored procedures
  • Triggers
  • XML data handling
  • Partitioned tables
  • Materialized views

59
Database Indexes
  • Auxiliary data structures to speed access to
    data, like the index of a book
  • Can include multiple columns
  • A unique index prevents duplicates
  • A clustered index changes the physical
    arrangement of rows
  • Clustering a child table by the parent key places
    all child rows adjacent on disk

60
Database Views
  • Almost any SQL select statement can be saved as
    a view
  • Views act like virtual tables
  • There is no data in a view, only in the table or
    tables referenced by the view
  • Views can be targeted by select statements
    exactly as if they were physical tables
  • In some limited cases, views can be updated
  • Users can be granted access to a view rather than
    to the table(s) named in the view

61
SQL Subqueries
  • SQL statements can define derived tables that
    exist only for that select
  • Like a view that exists for one statement
  • Who has not purchased product X?
  • select from Customer where CustomerID not in
  • (select distinct inv.CustomerID
  • from Invoice as inv join LineItem as li
  • on inv.InvoiceID li.InvoiceID
  • where li.Product X)

62
Stored Procedures and Triggers
  • Programs that run within the database
  • Stored Procedures (SPs) can be called from client
    programs, or by other SPs
  • Triggers are a special type of SP that fire (run)
    on any of insert, update, delete
  • Can be used to validate or audit
  • Usually written a product-specific proprietary
    language
  • Writing in Java or .Net is becoming possible

63
Support for XML Data
  • Products are vying for the best support
  • XML data can of course be stored as
    variable-length character data, but that doesnt
    require any special support
  • Examples of XML support
  • Load data or create virtual tables from XML
  • Use XPath syntax to locate data within the XML
    stored in a column, and use that data to select
    rows

64
Partitioned Tables
  • Dividing a logical table, one defined by a union
    view, into separate physical tables
  • The purpose is to divide the workload of
    accessing (or updating) the table between
    multiple servers
  • Only appropriate for huge tables, or ones that
    become a bottleneck in a transaction processing
    system

65
Materialized Views
  • Some views are better off being computed once and
    stored, rather than being virtual (and thus
    re-computed each time theyre accessed)
  • If SQL can know how to update the view when the
    underlying data changes, this can speed up access
    to the view at the cost of storing the
    materialized view and making needed updates
  • The views that are eligible to be materialized
    are fairly limited in most products today

66
A Quick Look at Implementation
  • Usually multiple tables in each physical file
  • flexible assignment of tables to files
  • Each tables data is a group of pages
  • A page of data contains multiple rows
  • Pages of a table are linked together
  • Index data may be in file with table data
  • Indexes reference rows by page and position
    within the page

67
Which database to choose?
  • Its often not your choice
  • If you can choose, what matters most to you?
  • cost
  • reliability
  • availability of knowledgeable people
  • performance
  • If performance is critical, test lesser-known
    products some are amazingly fast for simpler
    usage scenarios yet quite robust and stable

68
Can You Choose?
  • In corporate development, the data likely already
    exists in a particular database
  • Only if you are building a new system (or a new
    product) are you given the opportunity to select
    the database platform
  • Many customers care only that its one of the
    big ones (Oracle, SQL Server, DB2)
  • To customers, the formula is often that the
    unknown is risky, and risk is bad

69
Whats Most Important?
  • Licensing costs vary dramatically
  • Pure open source is free for a reason
  • no support unless you pay
  • The big three cost an arm and a leg
  • but your company may already license it
  • Lesser-known products can be a great value
  • If its mission-critical, you want big 3
  • but its still your problem (and will be said to
    be your fault) if the database fails

70
Whats Most Important?
  • Sometimes youll need to find expertise
  • Oracle is notorious for needing a guru
  • If any part of the application is built outside
    APL, youll find people more easily if you go
    with one of the big boys
  • Performance is rarely known to be critical
  • Starting out, you may have no idea if the DBMSs
    performance will be a problem
  • Except in extreme cases, you can handle it

71
Lesser-known Products
  • Im somewhat biased towards them
  • Sometimes support is unbelievably great
  • Theyre swimming against the tide help show the
    world that they deserve to exist
  • Using APL is already weird and risky
  • If your test results impress you as much as
    theyve impressed me, why not use what you think
    is the best tool, despite the risk of an
    unknown tool?

72
APL-based or APL-aware?
  • If you find a system that seems to fit well, and
    is either based on APL or has some special
    support for APL callers, why would you reject it
    automatically?
  • But you might not really be using a SQL RDBMS
    in all such those cases, and it depends on why
    youre leaning towards SQL in the first place

73
Time for a Break ?
74
APL Can Speak to SQL Easily
  • SQAPL for Dyalog APL (uses ODBC)
  • Also available for Dyalog Unix (not free, nor are
    the best ODBC drivers for Unix)
  • DataDirect apparently has the best Unix ODBC
    drivers they are recommended by Dyalog (others
    work as well)
  • Same API for APLWin is called APLLink
  • ADO.Net Windows database API 6
  • ADO (original), or OLE DB, via COM/OCX
  • Write directly to some API
  • Use a home-grown DLL, OCX, or AuxProc

75
SQAPL Wins
  • Its very hard to prove that statement in every
    environment, with every workload
  • Why its better
  • Its the only interface that understands how APL
    works and inherently supports its arrays
  • Other interfaces require major work to hide
    one-row-at-a-time programming models
  • Version 5 has a new mechanism that can provide a
    significant performance improvement (10x better
    has been reported) with a relatively small change

76
Other Choices Are Inferior
  • Not APL oriented
  • Very loopy with scalar thinking APIs
  • Only recently have APIs (in .Net) reached parity
    with SQAPL techniques from 5 years ago!
  • Why bang your head against the wall when SQAPL
    comes free with Dyalog APL?
  • SQAPL is a proven, tested, robust solution
  • Put in something about the stupidity of trying to
    re-invent a tough, well-worn, smooth, very round
    wheel

77
SQAPL Sets
  • APL developers think of changes to data happening
    all at once, when an array value is stored (e.g.
    in a file component)
  • Storing an APL array as SQL data frequently
    involves multiple SQL operations, because each
    SQL operation changes only one row at a time
  • SQAPL Sets are an abstraction to make working
    with SQL data more APL-like

78
Basics of Using Sets
  • If you plan to modify data, start by reading data
    using one of the set tools the result is a
    set handle and a data matrix
  • Modify the matrix by adding and/or deleting rows,
    and/or updating values
  • Call set update to store the data, passing the
    set handle and the new matrix

79
How Do Sets Work?
  • The retrieved data is stored, associated with the
    set handle that is returned to you
  • When you call set update, APL is used to
    compare the old and new tables and make the
    changes by generating the needed SQL statements
    and running them
  • Sets are an extraordinarily simple but powerful
    idea take advantage of them

80
SQAPLs Current Sets Implementation is Not
Complete
  • Should use SQL transactions
  • So the new version of the in-memory array will be
    successfully stored in the database, or the
    database will be unchanged
  • Could support DBMS-assigned identity values
  • As Ive said, I think these are very useful
  • Returning the identity values back to APL is very
    important (to avoid the need to re-query)

81
More Possible Enhancements
  • Fix a few bugs
  • SQL handles are not closed ASAP
  • Occasional inconsistent error handling (some now
    fixed)
  • Could easily support multi-column keys (but I
    prefer to use identity values as keys)
  • Could support optimistic concurrency
  • Could support automatic updates to related
    (child) tables

82
Database-Specific Tools
  • SQAPL provides the tools to talk to various SQL
    databases. (Remember that they only understand
    SQL statements.)
  • A desirable tool would be one to improve the
    performance of Set Update by passing data to be
    updated in fewer steps, perhaps as XML.
  • With SQL Server in particular, sending XML data
    for insert and update can let them operate in
    bulk, rather than one row at a time. Impressive
    performance gains would be quite likely.

83
Improving SQAPLs API
  • Working with an APLWin client, I did major
    design and implementation work on an
    object-oriented layer built on the SQAPL code.
    This included making most of the enhancements
    discussed above, adding major features for
    calling stored procedures, creating and modifying
    tables, and much more. I hope to have a similar
    opportunity in Dyalog APL.

84
Performance Issues
  • Most of these are SQL / DBMS issues, not issues
    of the APL interface to SQL
  • LAN vs WAN makes a huge difference (latency)
  • Consider using a middle tier in WAN case
  • The nature of these issues varies wildly from one
    DBMS to another
  • The same SQL statement may be very fast in one
    implementation, very slow in another

85
Figuring Out Whats Wrong
  • APL is the wrong tool for analyzing performance
    problems
  • APL cant see what the DBMS is doing
  • APL cant see the system-level slowdowns
  • Non-APL tools exist for no other purpose
  • If the problem is too much back and forth, can
    you find a way to blast the data across in
    fewer steps?

86
Metadata Based Tools
  • Most APL applications are based on, or at least
    use, metadata data about the data being
    manipulated. APLers use these techniques
    naturally. It surprises me how infrequently
    other programmers in other languages seem to do
    so.
  • Tools that know your database structure can be
    used for many things. For example, if the table
    hierarchy is available as metadata, a routine can
    be written to produce the SQL where clause to
    join an arbitrary list of tables.

87
Other Tools
  • Knowing the datatype of database table columns
    can simplify creating UI screens.
  • Knowing the relationship between tables can let
    (tools called by) model code retrieve related
    data automatically.
  • Automating changes to the database schema given
    metadata is possible I know, Ive done it.
    (Its not trivial!)

88
Lessons Learned
  • Always (except that one should never say always)
    use identity values as PKs
  • Unless you can prove that you have one of the
    (very rare, in my opinion) cases where theres a
    real reason not to
  • GUID PKs are an alternative to identity values,
    but
  • they are comparatively fat vs. integers
  • the problems with identity values are overstated
  • the any location benefit is overstated

89
More Lessons
  • Always write APL that writes SQL
  • Build tools to help you do so
  • Dont hide SQL from APL developers
  • Table names should be singular (my opinion) not
    all people agree
  • Try not to base your design on the ability of
    your (current) DBMS to handle it well

90
Unproven (by me) Ideas
  • These are thoughts that havent been fully
    implemented (at least not by me in APL), so be
    aware of that before basing your development on
    them.
  • They represents some potential best practices
    that borrow significantly from work done by
    others, mostly in other languages.

91
Model Definition is Often Hazy
  • The line between UI, data model, and database can
    be unclear.
  • Data manipulated by the UI of a form is most
    easily stored in attributes of the form.
  • Many programs have a UI data model thats
    separate from the model of the data as stored
    persistently (whether or not its stored in a SQL
    database).

92
Problems Caused by UI Models
  • Changes must be coordinated
  • When model data is stored in the UI, making
    changes to either the UI or the data definition
    forces them to change at the same time.
  • UI validation code works from the UI model
  • Code for event-handling in the UI naturally does
    BR (business rule) validation using UI-stored
    data.
  • Validation logic gets repeated
  • The model must enforce BRs to prevent bad data
    from being stored, so logic gets repeated.

93
Use Model Code in the UI
  • When a models data is simple and the power of
    APL to manipulate data structures makes that
    common when writing in APL it can be tempting
    to do the manipulation directly.
  • User interface code should avoid that, instead
    invoking methods of the model even though they
    may be trivial.
  • Consider a web app server-side code cant rely
    on data being correct when submitted, so it must
    do validation.
  • Code for browser-side validation either repeats
    the logic or consults the server. Ideally, the
    latter is done asynchronously (using an
    Ajax-style model).

94
UI / Data Model Separation
  • It is clear that the user interface code is best
    kept separate from (a client of) data models.
  • This results in models that are easier to test,
    and easier to re-use in other contexts.
  • When the UI needs to implement a Save action, it
    can be tempting to use your apps data storage
    APIs directly in the code that gathers the data
    from where its stored in the UI.
  • Better would be to call a models Save method.

95
UI Forms Reference a Model
  • UI forms should have a reference to a model
    object.
  • The model should be able to hold data about
    multiple data instances (database table rows)
    this is APL, after all.
  • The model can have documented events to which the
    UI can subscribe.
  • UI data should be passed to the model as soon as
    possible.

96
Unified Memory / Database Model
  • The model should know about the state of the data
    it holds relative to the database.
  • Is it in the database (vs. new data)?
  • If it is, has it changed since then?
  • Is it currently valid (ready to be saved)?
  • If not, what error message data is there?
  • One good source of design ideas is Scott Amblers
    ActiveRecord model. It is only slightly
    array-oriented, so it is not ideal unchanged.

97
Writing Models
  • Write model methods on an as-needed basis.
    APL programmers often work to create tools that
    have much more functionality than is needed
    today. When writing models, put required
    functionality into lower-level tools (creating
    them as needed) and keep only high-level logic,
    and calls to the lower-level tools, in the
    models. Maximize the work not done.

98
A Big Thank You to Morten
  • I would not be here if Morten had not asked me
    to deliver this talk. I had not seen Dyalog APL
    in more than 20 years, having used only APLPLUS
    (now APL) for all that time. I appreciate
    seeing where Dyalog APL is today, and Im always
    happy to visit Denmark. Ive enjoyed whats felt
    like a warm welcome from this community. I thank
    both you and Morten for your generosity.
Write a Comment
User Comments (0)
About PowerShow.com