Title: Past, Present, Future of Data/Information Modeling
1Past, Present, Future of Data/Information
Modeling
Peter P. Chen
- Foster Distinguished Chair Professor
- Computer Science Dept.
- Louisiana State University
- Baton Rouge, LA 70803, USA
- pchen_at_lsu.edu
- http://www.csc.lsu.edu/chen
2Overview
- Historical Background
- How Entity-Relationship Model (ERM) was Developed
- Last Twenty-Five Years
- ER Conferences, IDEF, ANSI/IRDS, CASE Methodologies/Tools
- The Present
- OO, Data Mining, UML
- The Future
- Discovering Links/Relationships
- Validity/Credibility of Data, Machine Learning/Reasoning
- Natural Languages and Data/Information Modeling
- Modeling for XML/Semantic Web
- Conclusions
3The Needs of the DB Community in the Early 70s
- For Software/Hardware Vendors
- Integration of Various File and DB Formats
- Incorporating More Data Semantics
- For User Organizations
- A Unified Methodology for File and DB Design for Various File Systems and DBMSs
- Incorporating More Business Rules
4How the ERM was Developed -- Right Place at the Right Time (I)?
- I Got My Ph.D. from Harvard in 1973
- Thesis Title: Optimal File Allocation
- Worked for Honeywell from 73 to 74
- In a 10-person Architecture Team for Next Generation Distributed System
- Many Team Members Were DB Experts 20 Years Older
- Charles Bachman, Henry Leftkovitz, John Lyon
5Right Place at the Right Time (II)?
- Joined MIT Management School Faculty in 74
- Interacted with User Organizations
- They wanted a unified modeling and design methodology
- Completed the ERM Paper
- Most other faculty members were busy implementing DBMS prototypes
6Concepts of Entity and Relationship
Figure 2
7An Example of ER Diagram
Figure 2
8Theoretical Foundations of ER Model
- Set Theory
- Modern Algebra
- Logic
- Lattice Theory
9Defining ER Concepts using Set
Figure 2
10Relationship as an Ordered Tuple
Figure 2
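A minimal sketch of the set-based view named in the two slide titles above, assuming the usual formulation in which an entity set is a set of entities and a relationship set is a subset of the Cartesian product of the participating entity sets. The entity identifiers, the `WORKS_ON` relationship, and the `PERCENT_TIME` attribute are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical entity sets: each is simply a set of entity identifiers.
EMPLOYEE = {"e1", "e2", "e3"}
PROJECT = {"p1", "p2"}

# A relationship is an ordered tuple (employee, project); a relationship
# set is a subset of the Cartesian product EMPLOYEE x PROJECT.
WORKS_ON = {("e1", "p1"), ("e2", "p1"), ("e2", "p2")}

# An attribute viewed as a function from relationships to values
# (hypothetical: percentage of time an employee spends on a project).
PERCENT_TIME = {("e1", "p1"): 100, ("e2", "p1"): 40, ("e2", "p2"): 60}


def is_relationship_set(rel, *entity_sets):
    """Return True if the i-th component of every tuple comes from the i-th entity set."""
    return rel <= set(product(*entity_sets))


def projects_of(employee):
    """Return the projects related to the given employee through WORKS_ON."""
    return {p for (e, p) in WORKS_ON if e == employee}
```

Because the tuples are ordered, `("p1", "e1")` is not a valid member of a relationship set over `EMPLOYEE` x `PROJECT`, even though both components are known entities.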
11First ER Paper: Initial Reactions (I)
- Published in ACM Transactions on Database Systems, Vol. 1, No. 1, pp. 9-36, March 1976
12First ER Paper: Initial Reactions (II)
- The Situation Then
- Most People Were in a Religious War
- I Was a New Kid on the Block
- The Advice I Got
- Drop the ER Model
- Join One of the Religious Camps
13The First Five Years (I)
- Persistence with the ER Model
- Continued to Write Papers on ERM
- Organized First ER Conference in 1979 at UCLA
- 2nd ER Conference Two Years Later
- Now, an Annual Conference
- November 2001 in Japan
- 2002 in Finland
- 2003 in Chicago
- 2004 in Shanghai, China
14The First Five Years (II)
- Some Academic People Started to Develop Semantics-Richer Data Models
- Mike Hammer of EECS, MIT, now a guru in reverse-engineering
- US-AF ICAM/IDEF Project
- Served as a consultant
- The ERM Became the Basis of IDEF
- More Companies Started to Experiment with ERM
15Related Developments in Next 20 Years
- Codd's RM/T Model Added ER Concepts
- Bachman's Partnership Model, too
- ANSI/IRDS Standard for Information Resource Dictionary Systems (IRDS) Adopted ERM
- CASE (Computer-Aided Software Engineering)
- First Major CASE Symposium in Atlanta, 1987, Keynote Speaker
- IBM AD/Cycle Based on ERM
- IBM DB2 Repository Mgr -- ERM
- Oracle Designer/2000 -- ERM
16The Present Status of Data/Information Modeling (I)
- ER Modeling is the most widely used methodology in the business DB application development world
- More than 85 of the FORTUNE 3,000 companies and major organizations are using it
- More advanced ER concepts are proposed and used
- UML, which is a specific language syntax,
reinforces the ER concepts
17The Present Status of Information Modeling (II)
- OO Modeling incorporates many concepts of ERM
- Object is an implementation concept
- Current OO methodologies need more general concepts of relationship
- What is Data Mining?
- Discover hidden relationships
- Discover the embedded ER Models
18The Future (I)
- Discovering Links/Relationships from Data in Various Sources (such as DARPA's EELD Program)
- Validity/Credibility Analysis and Integration of Data, Machine Learning/Reasoning
- Natural Languages vs. Data/Info Modeling. ER Modeling Concepts are Similar to:
- Chinese Character Composition Methods
- Ancient Egyptian Hieroglyphs
- English Sentence Grammar Structure
19The Future (II)
- ERM and the Fundamental Principles of Systems Architecture
- ER Model is closely related to:
- XML
- Semantic Web
20Information Validity/Credibility Analysis
- A Paper was published in InfoFusion 2001, Montreal
- Algorithm was developed
- Prototype was developed
- Also developed a machine learning algorithm
21ER Modeling and English
- First presented the ideas (abstract of a paper) at the 2nd ER Conference in Washington, D.C., 1981
- Paper was published in Information Sciences, 1983
- Adopted as a standard systems analysis technique by some large consulting firms
- Recently, OO Analysis re-discovered some of the basic concepts
- Also, the research community started to use, modify, or extend the concepts
25Chinese Characters as Models of Real World Entities
Figure 2
26Ancient Egyptian Hieroglyph
Figure 2
27Various Components of XML
- XML has many components
- XML (language part)
- XSL
- DOM
- DTD
- XLink and XPointer
- RDF
- XML Schema
- etc.
- Not all are compatible with each other
28What is RDF?
- Acronym for Resource Description Framework
- As a way to specify metadata
- Two parts
- Model and Syntax
- Schema
- The RDF Schema is not yet a W3C recommended specification
- http://www.w3.org/TR/REC-rdf-syntax
29W3C Pays Attention to ERM
- The Cambridge Communiqué (http://www.w3.org/TR/schema-arch) states:
- RDF can be viewed as a member of the Entity-Relationship Model Family
- In several articles, Tim Berners-Lee discusses the similarities and differences between ERM and RDF.
30RDF vs. ER Model
- RDF can be viewed as a version of the binary ER model (but at a lower and more detailed level)
- RDF's dependence on sentence analysis is similar to a series of works on the correspondence between the ER model and English (and several other natural languages).
- Reference: Chen, P.P., Entity-Relationship Diagram and English Sentence Structure, Information Sciences, 1983, Academic Press.
- Major concepts:
- Noun --> Entity, Verb --> Relationship
- Adjective --> Attribute of Entity, Adverb --> Attribute of Relationship
- Gerund --> Relationship-converted Entity
- Etc.
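The part-of-speech correspondences listed above can be sketched as a simple lookup, together with a toy reduction of a sentence to a binary (subject, predicate, object) triple in the spirit of RDF as a binary ER model. This is only an illustration under stated assumptions: the sentence is tagged by hand rather than by any parser, the function names `to_er` and `to_triple` are hypothetical, and no actual RDF library or syntax is involved.

```python
# Part-of-speech to ER-construct correspondences, as listed on the slide.
POS_TO_ER = {
    "noun": "entity",
    "verb": "relationship",
    "adjective": "attribute of entity",
    "adverb": "attribute of relationship",
    "gerund": "relationship-converted entity",
}

# Hypothetical hand-tagged sentence: "Diligent employee quickly completes project".
SENTENCE = [
    ("diligent", "adjective"),
    ("employee", "noun"),
    ("quickly", "adverb"),
    ("completes", "verb"),
    ("project", "noun"),
]


def to_er(tagged_words):
    """Map (word, part_of_speech) pairs to (word, ER construct) pairs."""
    return [(w, POS_TO_ER[pos]) for w, pos in tagged_words if pos in POS_TO_ER]


def to_triple(tagged_words):
    """Return (first noun, first verb, second noun), or None if the pattern is absent."""
    nouns = [w for w, pos in tagged_words if pos == "noun"]
    verbs = [w for w, pos in tagged_words if pos == "verb"]
    if len(nouns) >= 2 and verbs:
        return (nouns[0], verbs[0], nouns[1])
    return None


if __name__ == "__main__":
    for word, construct in to_er(SENTENCE):
        print(f"{word}: {construct}")
    print(to_triple(SENTENCE))
```

The triple keeps only the two entities and the relationship between them; the adjective and adverb would have to be carried as further statements about the entity and the relationship, which is one way to read the "lower and more detailed level" remark above.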
31Real World Modeling: Fundamental Principles of Systems Architecture
- Entity Lattice and Other Mathematical Structures and Operations
- Fundamental Principles of Systems Architecture
- Starting from Info System Architecture, then extending to all kinds of systems
- Fundamental Questions on:
- Representation/Understanding
- Operations
- Costs/Benefits/Optimization
33Conclusions (I)
- ER Modeling was triggered by the needs:
- Unifying data views from top-down and bottom-up perspectives
- For vendors and user organizations
- Incorporating more semantics
- Entity and relationship are fundamental concepts for:
- Data/Knowledge Representation
- Database design
- Software engineering
- Information system development
- And others (data mining, ...)
34Conclusions (II)
- Future
- Discovering Missing/Intended/Un-intended Relationships from Data
- Prediction of the Validity of Data and Data Model, Machine Learning/Reasoning
- Natural Languages and ERM
- Multi-language Information Extraction and Understanding
- Culture-based Modeling Methodology
- Modeling and Design of Web
- Theory of Web, Semantic Web
- Fundamental Principles of Systems Architecture
35References
- ER and other Conferences
- ER2002 (Finland), ER2003 (Chicago), ER2004 (Shanghai)
- http://www.conceptualmodeling.org
- Chen's papers online
- http://www.csc.lsu.edu/chen
- XML Schema
- Primer: http://www.w3c.org/TR/xmlschema-0/
- Structure: http://www.w3c.org/TR/xmlschema-1/
- Data Types: http://www.w3c.org/TR/xmlschema-2/
- XML XLink and XPointer
- http://www.w3c.org/XML/Linking
- RDF
- http://www.w3c.org/RDF/