Title: Industrial data quality
1 Industrial data quality
Guy Pierra pierra_at_ensma.fr
2 Content
- Industrial data quality dimensions
- Industrial data quality factors
- IDQ effort within WG2
- Conclusion / Recommendation
3 Industrial Data Quality dimensions
Data Quality dimensions?
- 1 - Data correctness (intrinsic)
- It is generally agreed that data should also be "fit for their intended use":
- 2 - Accessibility
- 3 - Relevance
- 4 - Ease of use
[Diagram: business situation, abstraction process, symbols]
Quality factors: how to improve quality?
4 Industrial Data Quality factors
- 1 - Check Data Correctness (CDC)
- Method? Compare with other data
- Cost/benefit analysis
- Application domain ? ?
- Cost
- specifying ?
- checking ?
Cost ? Benefit?
5 Industrial Data Quality factors (cont.)
- 2 - Improve abstraction Process (IP)
- Method? ISO 9000-like certification
- Application domain
- Data correctness ?
- Accessibility ?
- Relevance ?/?
- easy to use ?
- Cost
- specifying ? ?
- checking ? ? ?
Cost ? ? ? Benefit ?
6 Industrial Data Quality factors (cont.)
- 3 - Improve intrinsic data quality (IDQ)
- Method? Concentrate on model issues
- Data correctness: improve data semantics
7 Industrial Data Quality factors (cont.)
- 3 - Improve intrinsic data quality (IDQ)
- Method? Concentrate on model issues
- Data correctness: improve data semantics
- Set-theoretic view of data
- an entity or an attribute stands for a set of values
- restricting the size of that set through formal constraints avoids errors
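The set-theoretic view above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the "screw" entity, its attribute names `diameter_mm` and `length_mm`, the `EntityType` helper, and the three constraints are invented and are not drawn from any SC4 or PLIB schema. The sketch only shows how each added constraint shrinks the set of admissible records and so rejects typical data-entry errors.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical helper types: a record is a plain dict, a constraint is a predicate.
Record = Dict[str, object]
Constraint = Callable[[Record], bool]


@dataclass
class EntityType:
    """An entity type viewed as the set of records satisfying all its constraints."""
    name: str
    constraints: List[Constraint] = field(default_factory=list)

    def admits(self, record: Record) -> bool:
        # all() stops at the first failing constraint, so later constraints
        # may rely on what earlier ones have already established.
        return all(check(record) for check in self.constraints)


def is_positive_number(value: object) -> bool:
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and value > 0
    )


# Without constraints the entity stands for the set of all possible records.
loose_screw = EntityType("screw")

# Illustrative constraints (assumed, not from a standard) restrict that set.
strict_screw = EntityType(
    "screw",
    [
        lambda r: is_positive_number(r.get("diameter_mm")),
        lambda r: is_positive_number(r.get("length_mm")),
        lambda r: r["length_mm"] > r["diameter_mm"],
    ],
)

candidates: List[Record] = [
    {"diameter_mm": 6, "length_mm": 40},    # plausible record
    {"diameter_mm": -6, "length_mm": 40},   # sign error
    {"diameter_mm": 40, "length_mm": 6},    # attributes swapped
    {"diameter_mm": "6", "length_mm": 40},  # wrong value type
]

accepted_loose = [c for c in candidates if loose_screw.admits(c)]
accepted_strict = [c for c in candidates if strict_screw.admits(c)]

print(len(accepted_loose), "records admitted without constraints")
print(len(accepted_strict), "record(s) admitted with constraints")
```

With no constraints, all four candidate records belong to the set, including the three erroneous ones. With the three assumed constraints, only the first record is admitted.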
8 Industrial Data Quality factors (cont.)
- 3 - Improve intrinsic data quality (IDQ)
- 3.1 - Data correctness: improve data semantics
- SC4 application
- Maintain/extend use of constraint-based modeling
- Develop constraint-based usage guide (e.g., SASIG) ? ?
- 3.2 - Accessibility
- SC4 application
- Ensure free accessibility when it is a necessary condition of usage ? ?
- 3.3 - Relevance
- SC4 application
- Consider developing formal usage guide / user requirements ?
- 3.4 - Ease of use
- How to? Major ease-of-use problem: lack of interoperability!!
- SC4 application
- Ensure interoperability of SC4 standards!! ?
9 Industrial Data Quality factors (end)
- 1 - Check Data Correctness (CDC): Cost ? Benefit ?
- 2 - Improve abstraction Process (IP): Cost ? ? ? Benefit ?
- 3 - Improve intrinsic data quality (IDQ): Cost ? Benefit ? ?
10 IDQ Quality effort within WG2
- 3.1 - Data correctness
- The PLIB dictionary/ontology meta-schema includes a huge number of constraints
- Part 42.2 contains a constraint schema for instance values
- 3.2 - Accessibility
- A lot of actions to ensure free-of-charge access to the PLIB dictionary
- Development of a set of Web services for Internet access
- 3.3 - Relevance
- nothing!!
- 3.4 - Ease of use
- A world-wide initiative to ensure interoperability of all product domain dictionaries: OIDDI
- Collaboration with UN/CEFACT to ensure interoperability/orthogonality with Business Process
11 Conclusion / Recommendation
- Data Quality spreads over a number of dimensions
- Of the three identified factors (CDC, IP, IDQ), Intrinsic Data Quality seems both
- the most efficient from a cost/benefit analysis
- the best suited to SC4 technology
- Main recommendations to SC4
- promote the development of formal constraint-based usage guides
- request specific free-of-charge accessibility for those SC4 standards that define computer-sensible meaning identifiers
- ensure interoperability between SC4-defined industrial data standards