Title: WHAT THE MARKET-LEADING DBMS VENDORS DON
1. WHAT THE MARKET-LEADING DBMS VENDORS DON'T WANT YOU TO KNOW
- Disruption is gathering steam
2. Curt Monash
- Analyst since 1981
- Covered DBMS since the pre-relational days
- Also analytics, search, etc.
- Own firm since 1987
- Publicly available research
- Blogs, including DBMS2 (www.dbms2.com, the source for most of this talk)
- Feed at www.monash.com/blogs.html
- White papers and more at www.monash.com
3. Database diversity
- Mike Stonebraker, PhD
- One size doesn't fit all
- Curt Monash, PhD
- Horses for courses
- Database diversity
- Mike and Curt
- The world needs 9 to 11 different kinds of data management software
4. The case for grand integrated DBMS
- Theoretical relational model has great advantages
- Actual relational DBMS are versatile and modular
- Software developers have economies of scale
- Vendor consolidation theoretically saves effort and money
- So does database consolidation
5. The case for database diversity
- Different kinds of data require fundamentally different kinds of data management software
- Putting all that together in one system is extremely hard
- Nobody has ever done it well
6. Application and use cases
- High-end e-commerce
- 100-terabyte analytics
- High-volume call center
- Media-heavy web startup
- Simple departmental application
- General enterprise or SaaS app
- End-user or ISV
7. Data management distinctions
- Fundamental
- Data manipulation language
- Data access method
- Practical
- Type of data
- Type of hardware
- Administrative burden
- Performance stresses and metrics
8. Very practical
9. Major components of DBMS cost
- License and maintenance
- Especially maintenance
- Hardware, power, facilities
- Mainly for VLDB analytics
- Installation and ongoing administration
- Time-to-benefit is a factor too
- Programming
- Sometimes a differentiator
10. 11 kinds of data management software
- High-end OLTP/general-purpose DBMS
- Mid-range OLTP/general-purpose DBMS
- Row-based analytic RDBMS
- Column- or array-based analytic RDBMS
- Text search engines
- XML and OO DBMS (but these may merge with search)
- RDF and other graphical DBMS (but these may merge with relational)
- Event/stream processing engines (aka CEP)
- Embedded DBMS for devices
- Sub-DBMS file managers (e.g. MapReduce/Hadoop)
- Science DBMS
11. High-end OLTP/general-purpose DBMS
- Oracle, DB2, MS SQL Server, et al.
- Amazing throughput and scale-up
- Bullet-proofing
- 24/7
- Security certifications
- Datatype extensibility
- Expensive, expensive, expensive
12. Mid-range OLTP/general-purpose DBMS
- Three main groups
- Crippled high-end (Express editions)
- ISV/VAR-focused (Progress, several non-relational)
- Open source-based (Postgres, MySQL)
- Some are comparable to (or better than) the systems that ran the world in the 1990s
- What does the Postgres family still lack?
- Generally inexpensive
13. Row-based analytic RDBMS
- Data warehouses should be in separate instances
- But that's not enough
- Sequential vs. random reads
- MPP vs. SMP
- Teradata, Netezza, DATAllegro
14. Column- or array-based analytic RDBMS
- Retrieving whole rows carries penalties
- I/O
- Optimization
- Columnar is better
- But not in all use cases
- MOLAP may be superseded
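
The I/O penalty of retrieving whole rows can be illustrated with a small sketch. The table, its four integer columns, and all function names below are hypothetical, and byte counts stand in for disk I/O; this is not how any particular product stores data.

```python
import struct

# Hypothetical table: 1,000 rows of (id, price, quantity, region), all 32-bit ints.
ROWS = [(i, 100 + i % 7, i % 5, i % 3) for i in range(1000)]
ROW_FORMAT = "<4i"                        # one row = four ints = 16 bytes
ROW_SIZE = struct.calcsize(ROW_FORMAT)

def to_row_store(rows):
    """Row layout: each row's fields are stored next to each other."""
    return b"".join(struct.pack(ROW_FORMAT, *row) for row in rows)

def to_column_store(rows):
    """Column layout: each column's values are stored contiguously."""
    return [struct.pack(f"<{len(col)}i", *col) for col in zip(*rows)]

def sum_price_from_rows(blob):
    """Summing one column still walks every byte of every row."""
    total = bytes_read = 0
    for _id, price, _qty, _region in struct.iter_unpack(ROW_FORMAT, blob):
        total += price
        bytes_read += ROW_SIZE
    return total, bytes_read

def sum_price_from_columns(columns):
    """Summing one column touches only that column's bytes."""
    price_blob = columns[1]
    prices = struct.unpack(f"<{len(price_blob) // 4}i", price_blob)
    return sum(prices), len(price_blob)

row_total, row_bytes = sum_price_from_rows(to_row_store(ROWS))
col_total, col_bytes = sum_price_from_columns(to_column_store(ROWS))
print(row_total == col_total, row_bytes, col_bytes)
```

Both layouts return the same sum, but in this toy case the row layout reads 16,000 bytes and the column layout 4,000. A query that needs every field of a few rows would favor the row layout instead, which fits the "not in all use cases" caveat.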
15. Text search engines
- 85% of all information is in text
- and 16.9% of all statistics are made up out of thin air
- There really are a lot of words out there
- And search interfaces are hugely important
- Text search has its own data access methods
- May play more nicely with columnar than row-based RDBMS
- Watch integrations with other analytic datatypes
- Attivio (relational, a little XML)
- Mark Logic (a lot of XML)
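
One data access method commonly associated with text search is the inverted index, which maps each term to the documents containing it. The sketch below is a minimal illustration under that assumption; the sample documents and function names are hypothetical, and the slides do not say which structures any named product uses.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search_all(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)

# Hypothetical sample documents.
DOCS = {
    1: "Columnar analytic DBMS",
    2: "Text search engines index words",
    3: "Analytic text search over XML",
}
INDEX = build_inverted_index(DOCS)
hits = search_all(INDEX, "text search")
```

The lookup goes from term to documents, never from row to column values, which is why a B-tree over whole rows is a poor substitute for it.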
16. XML and OO DBMS
- Reasons for logical XML structures
- Schema flexibility
- Dressed-up text
- XML is the transport format, and it's too complex to unpack
- The data came from neither an RDBMS nor a text store in the first place
- Native XML data access methods
- Like text and object
- So far mainly in niches
17. RDF and other graphical DBMS
- Semantic web is overhyped
- but the world DOES need ontology management systems
- Much depends on path length
- Analytic RDBMS may do the job
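
One plausible reading of the path-length point is that, in a relational store, each extra hop along a graph path costs another self-join of the edge table. The sketch below shows that under stated assumptions: the `edges` table, its sample data, and the helper name are hypothetical, and SQLite stands in for an RDBMS purely for convenience.

```python
import sqlite3

def reachable_in_k_hops(conn, start, k):
    """Return nodes exactly k edges away from start (k >= 1).

    The query uses the edge table k times, i.e. k - 1 self-joins,
    so the work grows with path length.
    """
    joins = "".join(
        f" JOIN edges e{i} ON e{i}.src = e{i - 1}.dst" for i in range(2, k + 1)
    )
    sql = (
        f"SELECT DISTINCT e{k}.dst FROM edges e1{joins} "
        "WHERE e1.src = ? ORDER BY 1"
    )
    return [row[0] for row in conn.execute(sql, (start,))]

# Hypothetical edge data: a -> b -> c -> d, plus a -> e -> c.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e"), ("e", "c")],
)

one_hop = reachable_in_k_hops(conn, "a", 1)
two_hops = reachable_in_k_hops(conn, "a", 2)
three_hops = reachable_in_k_hops(conn, "a", 3)
```

For short paths a handful of joins is well within what an analytic RDBMS handles; long or unbounded paths are where a specialized engine would have more of a case.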
18. Event/stream processing engines
- Design point: super-low latency
- but there are other applications
- Data is executed against queries rather than vice versa
- Could be the future of BI
- and of social networking
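
The idea of data being executed against queries can be sketched as follows: queries are registered once and stay resident, and each arriving event is tested against all of them. The class names, event fields, and sample queries are hypothetical; real CEP engines add windows, joins, and much more.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class StandingQuery:
    """A query that stays registered and accumulates matching events."""
    name: str
    predicate: Callable[[dict], bool]
    matches: List[dict] = field(default_factory=list)

class StreamEngine:
    """Holds standing queries; events flow past them one at a time."""

    def __init__(self):
        self.queries: List[StandingQuery] = []

    def register(self, name, predicate):
        query = StandingQuery(name, predicate)
        self.queries.append(query)
        return query

    def push(self, event):
        """Run one event against every standing query; return names that fired."""
        fired = []
        for query in self.queries:
            if query.predicate(event):
                query.matches.append(event)
                fired.append(query.name)
        return fired

engine = StreamEngine()
large = engine.register("large_trade", lambda e: e["size"] >= 10_000)
acme = engine.register("acme_ticks", lambda e: e["symbol"] == "ACME")

# Hypothetical events arriving in order.
events = [
    {"symbol": "ACME", "size": 500},
    {"symbol": "INIT", "size": 20_000},
    {"symbol": "ACME", "size": 15_000},
]
fired = [engine.push(event) for event in events]
```

Nothing is stored and then queried later; each event is evaluated on arrival, which is what makes very low latency a natural design point.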
19. Embedded DBMS for devices
- Products
- Sybase SQL Anywhere
- solidDB: focused on caching post-acquisition?
- Cloudscape: vaporized?
- McObject: tiny startup
- Features
- Load-and-forget
- Zero-DBA
- Small-footprint
- Sometimes -- subsettable library
20. Matching analytic DBMS to use cases
- 100 TB data mart
- 50 TB enterprise data warehouse
- 5 GB to 5 TB OLTP offload
21. Matching OLTP/general DBMS to use cases
- Market leader
- High-end e-commerce
- High-volume call center
- Mid-range
- Web startup
- It depends on how locked-in you are
- Simple departmental application
- General enterprise or SaaS app
22. Clayton Christensen's disruption narrative
- Market leaders have many advantages, including top technology.
- Followers come up with good technology too.
- The leaders stay ahead by making their products ever better and more complex.
- The followers sell into new or non-mainstream markets, at prices the leaders can't match. So they dominate new markets.
- Old markets turn into low-margin commodity-fests.
- Unless they diversify, old leaders are doomed.
23. That's what's happening here
- Much DBMS complexity is without benefit
- Other complexity only benefits a few high-end customers
- Data warehouse specialists exploit radically superior technology (e.g., MPP)
- Open source vendors have radically different price points and business models
- Open source adoption has been strongest in non-traditional markets.
24. And the big vendors know it
- Oracle is diversifying furiously
- Oracle has announced a clear focus on top-end customers
- IBM is obviously focused on the high end too
- Oracle and (to some extent) IBM are buying alternative DBMS technologies
- Microsoft and IBM aren't dependent on the DBMS business anyway