Data::ObjectDriver: A relational mapper that doesn't suck (July 27th 2006)



1
Data::ObjectDriver: A relational mapper that doesn't suck
July 27th 2006
2
Relational mapper
  • Abstraction layer
  • Maps from classes and objects to relational
    database tables
  • Build database-backed sites without (much)
    database knowledge

3
Doesn't suck?
  • A bold claim!
  • Actually, all relational mappers suck

4
They all suck
  • Slower than raw SQL
  • Never as powerful as raw SQL
  • Reinventing SQL syntax

5
But
  • Reduces code complexity (and overall code)
  • Reduces potential for errors
  • Particularly drastic errors

6
Back story
  • Movable Type (Six Apart's first product) was
    released in 2001
  • Server software, and hobbyist software at that
  • Minimal prerequisites: only HTML::Template and
    Image::Size

7
Options we briefly considered (in 2001)
  • Class::DBI
  • Too many required modules
  • Funky class structure: class is both base object
    and driver
  • Tangram: too theoretical
  • Alzabo: too complex

8
Plus, we were crazy
  • We wrote full support for Berkeley DB, along with
    custom-built indexes
  • It wasn't much fun
  • But it made sense at the time

9
Hence MT::ObjectDriver
  • Supports all of the normal operations: creating,
    loading, deleting rows
  • Very minimal support for JOINs
  • Still used in Movable Type today

10
Fast forward four years, to mid 2005
  • We're starting a new project: Vox
  • New database architecture (like LiveJournal's)
  • But LJ writes all of its queries by hand
  • We wanted to abstract caching and partitioning

11
Options we considered (in 2005)
  • All of the previous options: discarded for the
    same reasons
  • DBIx::Class wasn't released yet
  • Plus, we knew we had a good, stable codebase
  • Worked for four years on MT, 2 on TypePad

12
Hence Data::ObjectDriver
  • All of the obvious features of a relational
    mapper
  • Plus transparent caching and partitioning
    support
  • Plus layered driver architecture
  • Easy to write your own driver
  • Easy to plug layers together

13
Typical Layers
14
Goals
  • Transparent
  • Flexible
  • Subclassable

15
Embrace SQL
  • Don't eliminate it
  • It's a good, flexible language
  • Replacing SQL with a new syntax (in Perl) is silly

16
Handles all of the easy things
  • Creates objects, updates objects
  • Looks up objects by primary key
  • Searches for objects by various columns

17
For example: a recipe database
    package Recipe;
    use base qw( Data::ObjectDriver::BaseObject );
    __PACKAGE__->install_properties({
        columns     => [ 'id', 'title' ],
        datasource  => 'recipes',
        primary_key => 'id',
        driver      => Data::ObjectDriver::Driver::DBI->new(
            dsn => 'dbi:SQLite:dbname=global.db',
        ),
    });

18
But your traffic grows!
  • Your single database is overwhelmed with SELECT
    queries!
  • What do you do?
  • You add caching

19
Caching Goals
  • Transparent: Automate the easy things, like
    caching by primary key
  • Flexible: Allow the application to mix and match
    caching drivers per class/table

20
So, we had this
    __PACKAGE__->install_properties({
        columns     => [ 'id', 'title' ],
        datasource  => 'recipes',
        primary_key => 'id',
        driver      => Data::ObjectDriver::Driver::DBI->new(
            dsn => 'dbi:SQLite:dbname=global.db',
        ),
    });

21
So we extend our Recipe class
    __PACKAGE__->install_properties({
        columns     => [ 'id', 'title' ],
        datasource  => 'recipes',
        primary_key => 'id',
        driver      => Data::ObjectDriver::Driver::Cache::Memcached->new(
            cache    => Cache::Memcached->new({ servers => [ ... ] }),
            fallback => Data::ObjectDriver::Driver::DBI->new(
                dsn => 'dbi:SQLite:dbname=global.db',
            ),
        ),
    });

22
Caching Effect
  • All primary key lookups now come out of memcached
  • Records in memcached are automatically kept in
    sync with the database
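
How a caching layer can sit in front of a fallback driver and stay in sync is sketched below in plain Perl. This is not Data::ObjectDriver's code: the package names `Sketch::HashStore` and `Sketch::CacheLayer` are hypothetical, and in-memory hashes stand in for both memcached and the database. It only illustrates the idea of answering primary key lookups from the cache and updating the cache on every write.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a database driver: rows live in a hash keyed by id.
package Sketch::HashStore;
sub new    { my ($class) = @_; return bless { rows => {}, reads => 0 }, $class }
sub lookup { my ($self, $id) = @_; $self->{reads}++; return $self->{rows}{$id} }
sub save   { my ($self, $row) = @_; $self->{rows}{ $row->{id} } = {%$row}; return 1 }
sub remove { my ($self, $id) = @_; delete $self->{rows}{$id}; return 1 }

# Hypothetical caching layer: same interface, wraps any fallback driver.
package Sketch::CacheLayer;
sub new {
    my ($class, %args) = @_;
    return bless { cache => {}, fallback => $args{fallback} }, $class;
}
sub lookup {
    my ($self, $id) = @_;
    return $self->{cache}{$id} if exists $self->{cache}{$id};    # cache hit
    my $row = $self->{fallback}->lookup($id);                    # cache miss
    $self->{cache}{$id} = $row if defined $row;
    return $row;
}
sub save {
    my ($self, $row) = @_;
    $self->{fallback}->save($row);             # write through to the database
    $self->{cache}{ $row->{id} } = {%$row};    # then refresh the cached copy
    return 1;
}
sub remove {
    my ($self, $id) = @_;
    $self->{fallback}->remove($id);
    delete $self->{cache}{$id};                # never serve a deleted row
    return 1;
}

package main;

my $db     = Sketch::HashStore->new;
my $driver = Sketch::CacheLayer->new(fallback => $db);

$driver->save({ id => 1, title => 'Banana bread' });
$driver->lookup(1) for 1 .. 2;                 # both answered by the cache
$driver->save({ id => 1, title => 'Banana loaf' });
my $updated = $driver->lookup(1);
print "$updated->{title} after $db->{reads} database reads\n";
```

Because the caching layer exposes the same methods as the store it wraps, the caller does not need to know which one it is talking to, which is the "transparent" goal from the earlier slide.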

23
New Feature!
  • Visitors clamor for a comment feature
  • So you add it: comments/notes on recipes

24
Recipe Notes
    package RecipeNote;
    use base qw( Data::ObjectDriver::BaseObject );
    __PACKAGE__->install_properties({
        columns     => [ 'recipe_id', 'note_id', 'author', 'text' ],
        datasource  => 'recipe_note',
        primary_key => [ 'recipe_id', 'note_id' ],
        driver      => Data::ObjectDriver::Driver::Cache::Cache->new(
            cache    => Cache::Memcached->new({ servers => [ ... ] }),
            fallback => Data::ObjectDriver::Driver::DBI->new(
                dsn => 'dbi:SQLite:dbname=global.db',
            ),
        ),
    });

25
But your site is still growing, oh no!
  • Visitors to your site post comments on recipes
  • Write traffic is crushing your single database
    server
  • What now?
  • Partition the data to spread the writes

26
Partitioning Background
  • Move as much data as possible into partitions
  • Global database tables are an index into
    partitions
  • All partitioned tables use composite primary keys
  • We always know where to look up partitioned data
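
The lookup path described above can be sketched in plain Perl. Everything here is a hypothetical stand-in: `%global_recipes` plays the global table, `@partitions` plays two database servers, and the `partition_id` column name is an assumption made for illustration. The point is that the first part of the composite key (`recipe_id`) is enough to find the right partition.

```perl
use strict;
use warnings;

# Hypothetical global table: records which partition holds each recipe's data.
my %global_recipes = (
    1 => { title => 'Banana bread', partition_id => 0 },
    2 => { title => 'Gazpacho',     partition_id => 1 },
);
my @partitions = ( {}, {} );    # stand-ins for two database servers

# Resolve the partition from the parent recipe in the global table.
sub partition_for {
    my ($recipe_id) = @_;
    my $recipe = $global_recipes{$recipe_id}
        or die "unknown recipe $recipe_id\n";
    return $partitions[ $recipe->{partition_id} ];
}

# Notes are stored under a composite (recipe_id, note_id) key.
sub note_key { return join ':', @_ }

sub save_note {
    my (%note) = @_;
    my $store = partition_for($note{recipe_id});
    $store->{ note_key($note{recipe_id}, $note{note_id}) } = {%note};
    return 1;
}

sub lookup_note {
    my ($recipe_id, $note_id) = @_;
    return partition_for($recipe_id)->{ note_key($recipe_id, $note_id) };
}

save_note(recipe_id => 1, note_id => 1, author => 'ann', text => 'Add walnuts');
save_note(recipe_id => 2, note_id => 1, author => 'bob', text => 'Serve cold');

my $note = lookup_note(2, 1);
print "$note->{author}: $note->{text}\n";
```

Callers only ever pass the composite key; which server the row lives on is decided inside `partition_for`.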

27
Partitioning Goals
  • Transparent: Caller should never have to care
    about whether an object is in a partitioned table
  • Flexible: Applications define their own
    partitioning scheme--it's not imposed by the
    framework
  • Simple: Partitioning is hard--try to make it
    easier

28
We had this
    package RecipeNote;
    use base qw( Data::ObjectDriver::BaseObject );
    __PACKAGE__->install_properties({
        columns     => [ 'recipe_id', 'note_id', 'author', 'text' ],
        datasource  => 'recipe_note',
        primary_key => [ 'recipe_id', 'note_id' ],
        driver      => Data::ObjectDriver::Driver::Cache::Cache->new(
            cache    => Cache::Memcached->new({ servers => [ ... ] }),
            fallback => Data::ObjectDriver::Driver::DBI->new(
                dsn => 'dbi:SQLite:dbname=global.db',
            ),
        ),
    });

29
and now we have this
    package RecipeNote;
    use base qw( Data::ObjectDriver::BaseObject );
    __PACKAGE__->install_properties({
        columns     => [ 'recipe_id', 'note_id', 'author', 'text' ],
        datasource  => 'recipe_note',
        primary_key => [ 'recipe_id', 'note_id' ],
        driver      => Data::ObjectDriver::Driver::Cache::Cache->new(
            cache    => Cache::Memcached->new({ servers => [ ... ] }),
            fallback => Data::ObjectDriver::Driver::SimplePartition->new(
                using => 'Recipe',
            ),
        ),
    });

30
Partitioning Effect
  • Recipe notes spread across multiple servers
  • Writes are spread across multiple partitions
  • Horizontal scaling: add another database server,
    increase capacity

31
Fun stuff: Parallelization
  • We have an asynchronous job system called Gearman
  • Client submits a couple of jobs, then waits for
    all of them to complete while they're worked on in
    parallel
  • We have many partitions of data, all with the
    same data model
  • Wouldn't it be nice to be able to query them in
    parallel, then merge the results?
  • Yes.

32
Fun stuff: Parallelization
  • Create a new driver, ParallelQuery
  • The driver knows about the various database
    partitions
  • Queries are submitted in parallel, and the
    results are merged, like a map-reduce algorithm
  • A class that represents a data set across all
    partitions can use this driver
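
The query-then-merge shape described above can be sketched in plain Perl. This is not the ParallelQuery driver itself: `query_partition` and `merge_results` are hypothetical names, the partitions are in-memory arrays, and the `created` column is invented for the example. The per-partition queries also run one after another here; the loop marks the spot where a job system such as Gearman would fan them out.

```perl
use strict;
use warnings;

# Hypothetical partitions: each holds rows with the same columns.
my @partitions = (
    [ { note_id => 1, author => 'ann', created => 30 },
      { note_id => 2, author => 'cy',  created => 10 } ],
    [ { note_id => 1, author => 'bob', created => 20 } ],
);

# "Map" step: run the same query against a single partition.
sub query_partition {
    my ($rows, $filter) = @_;
    return [ grep { $filter->($_) } @$rows ];
}

# "Reduce" step: flatten the per-partition results, sort newest first,
# and apply an optional limit.
sub merge_results {
    my ($results, %opt) = @_;
    my @all = sort { $b->{created} <=> $a->{created} } map { @$_ } @$results;
    splice @all, $opt{limit} if defined $opt{limit} && @all > $opt{limit};
    return \@all;
}

# Sequential in this sketch; each iteration could be dispatched as its own job.
my $recent = sub { $_[0]{created} >= 15 };
my @per_partition;
for my $rows (@partitions) {
    push @per_partition, query_partition($rows, $recent);
}

my $newest = merge_results(\@per_partition, limit => 2);
print join(', ', map { $_->{author} } @$newest), "\n";
```

Sorting and limiting happen only in the merge step, so each partition can answer independently without knowing about the others.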

33
Caching support
  • Memcached
  • Cache family of modules
  • Simple in-memory cache

34
Database support
  • MySQL
  • PostgreSQL
  • SQLite

35
Other Useful Features
  • Views
  • (Simple) Query Profiling
  • Triggers

36
Current Status
  • Data::ObjectDriver has a version number of 0.03
  • But don't let that fool you: it's stable

37
It's stable!
  • Based on a codebase that's been stable and
    production-ready for five years
  • Used on TypePad and Vox, and other Six Apart
    projects
  • http://code.sixapart.com/