Title: Dr. Vishal Sikka
1. Data Management in Enterprise Apps: Some Perspectives
- Dr. Vishal Sikka
- Chief Software Architect
- SAP AG
2. A Brief Introduction to SAP and Data Management in Our Applications
The Current Situation: Some Existing and Emerging Divides
Our Approach to Two of These Divides
The Lessons Learned and Some Open Problems
3. SAP at a Glance
- Who we are
- Founded in 1972
- 2005 revenues 8.5 Billion
- 34,600 customers
- 37,500 employees
- 12 Million users in 120 countries
- 1,600 partners
- What we do
- Largest enterprise applications company in the world
- Serve most back-end and front-end business processes
- Leader in ERP, CRM, SCM, ...
- Leading platform to build and run apps on
- 25 industry solutions
[Figure labels: SAP Composites, Other Composites, SAP NetWeaver, Data, Infrastructure]
4. Our data management requirements are massive
- mySAP ERP HCM: customer with payroll calculations for 500,000 employees in 3 hours
- SAP for Engineering & Construction: customer with 5,000 concurrent active users
- SAP for Consumer Products: customer with 1.4 million sales order line items per day
- mySAP SCM: customer with 4.5 million characteristic combinations, 512 GB memory in liveCache
- SAP NetWeaver Portal: customer with 300,000 users (20,000 concurrent)
- SAP for Utilities: 25 million business partners, 85 million service and sales orders per year
- SAP NetWeaver BI: customer with 40 TB database live; average DB size of top 10 live BI customers is 5.5 TB
- mySAP ERP: a customer with 5 users on a laptop
[Figure label: mySAP Business Suite]
5. Data Management from SAP's Perspective
- There is >10 PB of transactional and analytical data processed by SAP apps worldwide
- We are the largest applications consumer and reseller of data worldwide
- Our data is of many different types, shapes and sizes
- Transactional, Analytical, Text/Unstructured, Master, Events, ...
- Data has different requirements, different optimizations
- Significant need for deriving value from this data
[Figure labels: SAP Applications; Unstructured, Event, Master, Transactional, Analytical]
6. Data through the SAP Lens: Not All Data Is Alike
Progression Over Time
Textual and Unstructured Data
- Order: > TB
- Mostly read
- Slow change
- Many queries
- Unstructured
- Contextual
Analytical Data
- Order: > TB
- Read only
- Slow changes
- Many queries
- Flexibility
- Performance
Transactional Data
- Order: 100 GB
- Write > read
- Many changes
- Accurate
- Consistent
- Performance
- All back-end apps
Event Data
- Order: < TB
- Many writes
- Few queries
- Distributed
- Filtering
- Correlation
Master Data
- Order: 1 GB
- Mostly read
- Mid change
- Many queries
- Distributed
7. 3-tier C/S Architecture of Basis: Our Application Server
8. Memory Management in Basis: outside the DBMS
- Buffers in the application server significantly improve performance. In a classical 3-tier system, network round trips diminished the benefits of the DBMS cache, while TCO optimization required one DB for >10 app servers.
- Application-level locking (Enqueue and Application LUW) compensates for the absence of fine-grained locking in the DBMS and provides the transaction support needed by application servers (multiple users accessing the same DB, complex screen processing with workflow on the front end).
- Numerous other optimizations and DB abstractions.
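The application-level locking bullet above can be illustrated with a small lock table held in the application server rather than in the DBMS. This is a minimal sketch and not SAP's Enqueue implementation: the class `LockTable`, its method names, and the (table, key) lock granularity are hypothetical choices made for illustration.

```python
import threading


class LockTable:
    """Hypothetical application-level lock table (illustrative only)."""

    def __init__(self):
        self._guard = threading.Lock()   # protects the dictionary itself
        self._owners = {}                # (table, key) -> owner id

    def enqueue(self, owner, table, key):
        """Try to lock one logical row; return True on success."""
        with self._guard:
            current = self._owners.get((table, key))
            if current is None or current == owner:
                self._owners[(table, key)] = owner
                return True
            return False                 # held by another user: reject, do not block

    def dequeue_all(self, owner):
        """Release every lock held by `owner`, e.g. at the end of a logical unit of work."""
        with self._guard:
            held = [k for k, o in self._owners.items() if o == owner]
            for k in held:
                del self._owners[k]
            return len(held)


locks = LockTable()
first = locks.enqueue("user_a", "ORDERS", 4711)
second = locks.enqueue("user_b", "ORDERS", 4711)   # same logical row, different user
other = locks.enqueue("user_b", "ORDERS", 4712)    # different row: row-level granularity
released = locks.dequeue_all("user_a")
retry = locks.enqueue("user_b", "ORDERS", 4711)
```

Because the lock lives in application memory, it can span several screens and database transactions, which is the kind of long-running unit of work the slide attributes to Application LUW.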
9. Bringing Data Closer to Applications: SAP liveCache
- liveCache is a main-memory DB component used in SAP SCM's APO
- Rapid Planning Matrix in the Automotive Industry
- Common ERP system: plan the manufacturing of 20,000 cars/day
- Needed volumes are much higher
- liveCache enables planning 500,000 cars/hour
- Demand Planning (DP)
- Interactive planning: 10x performance gain compared to DB-based solution
- Consistent storage of data (no need for aggregation/disaggregation batch jobs)
- Production Planning (PP/DS)
- Performance gain of 15x in rescheduling production runs and DS heuristics
- Data volume 5x higher in planning board compared to common ERP system
- Consolidation of data structures via generic liveCache data types
- E.g. one order data type: a single order type with multiple attributes instead of a few dozen different specific order types in ERP
- Bringing development teams closer together
- liveCache applications team bridges technology knowledge with business process knowledge by working together with the application team on the usage of liveCache, as well as on optimization of business logic.
- Common team working together for several years → 3000 happy deployments.
10. A Brief Introduction to SAP and Data Management in Our Applications
The Current Situation: Some Existing and Emerging Divides
Our Approach to Some of These Divides
The Lessons Learned and Some Open Problems
11. New needs: Innovate, Be flexible, Stay high-performant
"Once my system is up and running, you, SAP, can touch my core processes once every 5 years ... and it needs to be a Saturday and my CEO wants me to innovate every quarter"
- CIO, Fortune 1000 Manufacturing Company
12. New requirements, New divides
- More decoupled business processes
- More visible Physical-Digital divide
- Infrastructure subjected to much higher volumes (events, sensors, ...)
- Greater need for in-context usage
- Multiple UIs
- More visible work-personal divide
- Users are a lot more used to search; lack of structure is academic to them
- Different requirements on front-end than on back-end
- e.g. easier front-end application composition
- Many more deployment options
- Greater flexibility → easy integration, better component semantics
New application architectures are necessary: SOA is the biggest component, but there are others
13. Technology Shifts
Architectural Shift (1990 → 2006)
- Disk based data storage → In-memory data stores
- Simple consumption of applications (Fat client UI, EDI) → Multi-channel UI, high event volume, cross industry value chains
- General-purpose, application-agnostic database → Application-aware and intelligent data management
Technology Drivers     1990          2006          Improvement
CPU                    0.05 MIPS/$   7.15 MIPS/$   143x
Memory                 0.02 MB/$     5 MB/$        250x
Addressable Memory     16 Bits       64 Bits       2^48 x
Network Speed          100 Mbps      10 Gbps       100x
Disk Speed             5 Kilo RPM    15 Kilo RPM   3x
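The improvement factors on this slide are the 2006 figure divided by the 1990 figure for each driver. The short check below recomputes them from the slide's own numbers. The per-cost units for CPU and memory are an assumption (the unit after "MIPS/" and "MB/" is not legible in the source), but the ratios do not depend on it.

```python
from fractions import Fraction

# (driver, 1990 value, 2006 value), both in the same unit.
# Per-cost units for CPU and memory are assumed; only the ratio matters here.
drivers = [
    ("CPU (MIPS per unit cost)", Fraction("0.05"), Fraction("7.15")),
    ("Memory (MB per unit cost)", Fraction("0.02"), Fraction("5")),
    ("Network speed (Mbps)", Fraction(100), Fraction(10_000)),
    ("Disk speed (RPM)", Fraction(5_000), Fraction(15_000)),
]

# Exact rational arithmetic avoids floating-point noise in the ratios.
improvement = {name: new / old for name, old, new in drivers}

# Addressable memory grows with the address width: 2**64 / 2**16 = 2**48.
addressable_factor = 2 ** 64 // 2 ** 16

for name, factor in improvement.items():
    print(f"{name}: {float(factor):g}x")
print(f"Addressable memory: 2^48 = {addressable_factor}x")
```

Disk speed improves by a factor of 3 while memory per unit cost improves by a factor of 250, which is the gap behind the slide's shift from disk-based storage to in-memory data stores.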
14. A Brief Introduction to SAP and Data Management in Our Applications
The Current Situation: Some Existing and Emerging Divides
Our Approach to Two of These Divides
The Lessons Learned and Some Open Problems
15. Addressing DB Architecture Gap: SAP BI Accelerator
- Performance: 1 billion records analyzed in 3 seconds
- Delivery: off-the-shelf hardware, appliance setup
- Predictability: consistent response, no tuning, fast load
- Integration: built for, and closely integrated with, SAP NW BI
16. Addressing DB Architecture Gap: SAP BI Accelerator
- Performance: 1 billion records analyzed in 3 seconds
- Affordability: off-the-shelf hardware, appliance setup
- Agility: consistent response, no tuning, fast load
- Integration: closely integrated with SAP BI
17. BI Accelerator: Key Technology
[Figure labels: BI Application Server (SAP BI AppServer), SAP BI Accelerator, Database Server, Storage subsystem; scalability by adding blades]
- Main memory technology
- Inspired by text search
- On the fly aggregation
- L2 cache miss optimization
- Column based data structures
- Highly compressed: dictionary based, Golomb, sparse, ...
- Fast updates with write-optimized delta mechanism
- Compressed data structures for read access
- Parallel and distributed execution engine
- Distributed joins, horizontal table split
- Intelligent partitioning (along join paths)
- Data distribution optimizer
- Model based data layer
- Exploit data model for performance optimization and data distribution
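Three of the ideas listed on this slide (dictionary-based column compression, a write-optimized delta, and on-the-fly aggregation) can be shown together in a few lines. This is a minimal sketch of the general technique and not the BI Accelerator's actual data structures: `DictColumn`, `merge_delta`, and `aggregate` are hypothetical names, and the sample data is made up.

```python
from collections import defaultdict


class DictColumn:
    """Hypothetical dictionary-encoded column with a write-optimized delta."""

    def __init__(self):
        self.dictionary = []    # sorted distinct values of the main store
        self.codes = []         # one small integer per row, indexing into dictionary
        self.delta = []         # uncompressed recent appends

    def append(self, value):
        self.delta.append(value)          # cheap write: no re-encoding

    def merge_delta(self):
        """Fold the delta into the compressed, read-optimized main store."""
        values = [self.dictionary[c] for c in self.codes] + self.delta
        self.dictionary = sorted(set(values))
        position = {v: i for i, v in enumerate(self.dictionary)}
        self.codes = [position[v] for v in values]
        self.delta = []

    def values(self):
        """Yield all row values in insertion order: main store, then delta."""
        for c in self.codes:
            yield self.dictionary[c]
        yield from self.delta


def aggregate(group_col, measure_col):
    """Sum a measure per group by scanning both columns; no precomputed aggregates."""
    totals = defaultdict(int)
    for group, amount in zip(group_col.values(), measure_col.values()):
        totals[group] += amount
    return dict(totals)


region, revenue = DictColumn(), DictColumn()
for r, v in [("EMEA", 10), ("APJ", 5), ("EMEA", 7), ("AMER", 3)]:
    region.append(r)
    revenue.append(v)
region.merge_delta()
revenue.merge_delta()
region.append("APJ")      # lands in the delta, still visible to queries
revenue.append(4)
totals = aggregate(region, revenue)
```

Queries read the compressed main store and the small delta together, so new rows are visible immediately while the bulk of the data stays in its compact, read-optimized form.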
18. Key Benefits
- Predictable (near constant) query response time
- Query execution shifted from DB to BI Accelerator
- Fast in-memory full table scans guarantee stable response times
- Column based data structures support fast joins
- Intelligent partitioning and data distribution allow massive parallelization
- Reduced maintenance costs
- Simplified cube modeling (normalization for semantic reasons only)
- No more aggregates (or aggregate administration)
- Less need for DB optimization
- Reduced hardware costs
- Commodity hardware (blades) with standard equipment
- Linear scalability with number of processors / cores
- Use of blade infrastructure instead of big SMP box
- Packaged as an appliance
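The claim that partitioning allows massive parallelization rests on a simple pattern: split the table horizontally, scan each partition independently, and merge the partial results. The sketch below shows that pattern under stated assumptions: the round-robin split policy and all function names are hypothetical, and threads merely stand in for separate blades (in CPython they illustrate the structure, not a real speedup).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_horizontally(rows, parts):
    """Round-robin split of a table into `parts` row partitions (an assumed policy)."""
    return [rows[i::parts] for i in range(parts)]


def scan_partition(rows):
    """Full scan of one partition, producing a partial sum per group."""
    partial = Counter()
    for group, amount in rows:
        partial[group] += amount
    return partial


def parallel_aggregate(rows, parts=4):
    """Scan partitions concurrently, then merge the partial results."""
    partitions = split_horizontally(rows, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(scan_partition, partitions))
    total = Counter()
    for partial in partials:
        total.update(partial)    # Counter.update adds counts together
    return dict(total)


rows = [("EMEA", 10), ("APJ", 5), ("EMEA", 7), ("AMER", 3), ("APJ", 4)]
result = parallel_aggregate(rows, parts=3)
```

Each partition scan touches only its own rows, so adding partitions (and the hardware to scan them) shortens the scan without changing the merged result.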
19. BI Accelerator: Future
- Complete OLAP layer as part of the BI Accelerator
- Integration of text search and BIA technology
- Master data support
- Enterprise search on BI data
- Reporting on text data (information extraction)
- Support of flat cubes
- Use of commodity coprocessors
- Network processors
- Graphic card processors
- Application data model integration
20. SAP Enterprise Search
- Search in the enterprise
- Business objects
- Business context awareness
- Role
- Authorizations, Compliance
- Current work context
- Graceful degradation with decreasing structure
- Multiple clients
- Stand alone and embedded into applications
- Integration into non-SAP sources
- SAP Enterprise Search is a stand-alone business search xApp and a framework for search as a service
[Figure labels: clients (Portal, Office, Devices, Desktop); SAP Enterprise Search; SAP NetWeaver Business Process Platform; Desktop Search Service, Internet Search Service, R/3 via BAPIs, Search Indexing; mySAP Business Suite, 3rd party, Documents]
21. SAP Enterprise Search
- Access more information from any place
- Get the right answer to enterprise questions anywhere, anytime
- Access data from your workplace or mobile device.
- Simple to use: Open to everyone
- Pre-built common queries
- Smart context
- Better Answers: Leverage context information and meta data
- Support targeted search for object types
- Enhance search and displays by contextual meta data related queries, object scoping
- Go Deep: Find the right information across all your sources
- Penetrate entire corporate data sources, including searching for documents and business objects simultaneously
- Ensure service-oriented, multi-device scalable operation
- Reach Out: Embed search into everyday tools
- Design simple search front ends that are compliant with the respective devices, including Portal, Desktop, SMS, e-mail, mobile
22. The Argo Widget
23. Enterprise Search Example
24. Enterprise Search Example (Cont'd)
25. A Brief Introduction to SAP and Data Management in Our Applications
The Current Situation: Some Existing and Emerging Divides
Our Approach to Some of These Divides
The Lessons Learned and Some Open Problems
26. Master Data Management
- Characterized By
- Business Entities with
- Multiple data models
- Multiple application sources
- Reference Models
- Single logical model
- Multiple physical models
- Source of Truth
- No single source of truth
- Access Characteristics
- Serves as reference data
- Few systems write
- Many systems read
- 360° view of data
- Full analytics view
- Full operational view
[Figure: Master Data Management Architecture. Labels: MDM Application Services (Quality, Visibility, Governance, Validation, Analytics); Meta-data, Master; Unified Data Management Layer (Distributed Query, Data Federation, Multiple Data Source Management, Data Mappings); Legacy Data, Unstructured Data, Structured Data; Connectivity Fabric (Events, Services)]
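The combination of a single logical model, multiple physical models, and no single source of truth can be made concrete with a small data-mapping example. This is a minimal sketch under stated assumptions: the source names, the field names in `MAPPINGS`, and the "first non-empty value wins" merge rule are all hypothetical and chosen only for illustration.

```python
# Hypothetical field mappings from two source systems to one logical model.
MAPPINGS = {
    "crm": {"cust_no": "id", "cust_name": "name", "town": "city"},
    "erp": {"KUNNR": "id", "NAME1": "name", "ORT01": "city"},
}


def to_logical(source, record):
    """Rename a source record's fields to the logical model's field names."""
    mapping = MAPPINGS[source]
    return {logical: record[physical]
            for physical, logical in mapping.items()
            if physical in record}


def federate(records_by_source):
    """Merge per-source records into one view per business entity id.

    No source is treated as the single source of truth: for each field the
    first non-empty value seen wins, and every contributing source is recorded.
    """
    merged = {}
    for source, records in records_by_source.items():
        for record in records:
            logical = to_logical(source, record)
            entity = merged.setdefault(logical["id"], {"sources": []})
            entity["sources"].append(source)
            for field, value in logical.items():
                if value and not entity.get(field):
                    entity[field] = value
    return merged


view = federate({
    "crm": [{"cust_no": "C1", "cust_name": "Acme", "town": ""}],
    "erp": [{"KUNNR": "C1", "NAME1": "ACME Corp", "ORT01": "Berlin"}],
})
```

Here the merged entity takes its name from the first source and its city from the second, which is one simple way for many systems to read a unified view while only a few write.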
27. Event Processing
- Characterized By
- Continuous streams of near real-time data
- High data flow rate and large volume need parallel processing
- Significant main memory processing
- Continuous evaluation of rules
- Edge Devices as data producers
- Edge devices (RFID, sensor data) generate a significant number of events
- Orders of magnitude scale in data, e.g. shop floor sensor devices
- Large volume of event data dictates pre-processing for consumption
- Events externalized non-invasively for several forms of consumption
- Automatic correlation and context determination of business events
[Figure labels: Input Streams (Event Streams, Data IN); Event Management (Filters, Correlation Engine, Correlation Rules, Event Memory/Storage, Response); Output Streams (Business Events, Actions, Query Results, BI/Reports, Alerts)]
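The pipeline named on this slide (input streams, filters, a correlation engine with rules and event memory, output streams) can be sketched as a generator chain. This is a minimal illustration, not SAP's event management component: the rule "N filtered events from one device within a time window", the threshold filter, and all names are hypothetical.

```python
from collections import defaultdict, deque


def make_filter(min_value):
    """Pre-processing filter: drop readings below a threshold."""
    return lambda event: event["value"] >= min_value


class CorrelationEngine:
    """Hypothetical rule: N filtered events from one device within a time window."""

    def __init__(self, count, window_seconds):
        self.count = count
        self.window = window_seconds
        self.memory = defaultdict(deque)   # event memory: device -> recent timestamps

    def feed(self, event):
        """Add one event; return a business event dict if the rule fires, else None."""
        recent = self.memory[event["device"]]
        recent.append(event["ts"])
        while recent and event["ts"] - recent[0] > self.window:
            recent.popleft()               # forget events outside the window
        if len(recent) >= self.count:
            recent.clear()                 # avoid re-firing on the same events
            return {"type": "alert", "device": event["device"], "ts": event["ts"]}
        return None


def process(stream, keep, engine):
    """Input stream -> filter -> correlation -> output stream of business events."""
    for event in stream:
        if keep(event):
            business_event = engine.feed(event)
            if business_event is not None:
                yield business_event


stream = [
    {"device": "s1", "ts": 0, "value": 95},
    {"device": "s1", "ts": 2, "value": 20},    # filtered out
    {"device": "s2", "ts": 3, "value": 99},
    {"device": "s1", "ts": 20, "value": 97},   # ts=0 has left the window by now
    {"device": "s1", "ts": 25, "value": 96},   # second event inside the window
]
alerts = list(process(stream, make_filter(90),
                      CorrelationEngine(count=2, window_seconds=10)))
```

Filtering before correlation keeps the in-memory event store small, which matters when edge devices produce orders of magnitude more raw readings than business events.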
28. Lessons Learned
- It's not the technology, stupid. Application perspectives provide grounding for data management.
- → So learn what the apps' needs are
- One size does not fit all. Applications' data mgmt needs are changing, and this requires a rethink in data mgmt architecture.
- → So let's go rethink data mgmt for the enterprise