Title: Kumar Madurai
1. Knowledge Engineering: Using Linked Data in an Enterprise
- Kumar Madurai
- October 21, 2013
2. Knowledge Engineering Problem Context
- 15 different product definitions
- 17 different application systems creating design,
regulatory, and production data
- 100s of change requests monthly
- Tribal knowledge about product design and process
design not captured anywhere
3. Opportunities for Improvement
- How do we reduce the complexity?
- How do we promote consistency?
- How do we foster collaborative sharing of data
and knowledge?
- How do we get an integrated view of process and
product data across the product life cycle?
4. Knowledge Engineering Semantic Data Framework
Diagram layers:
Applications Consuming Domain Views
Linked Concepts and Properties in Domain Views
Model1 RDF Triples, Model2 RDF Triples, Model3 RDF Triples, Model4 RDF Triples
Semantic Layer in Oracle
Mapping Rules using D2RQ
Staged Data in Oracle
Database1, Database2, Database3, Database4
Original Data Sources
Non-Validated Systems
Database1, Database2, Database3, Database4
Validated Systems
5. Relational Data Source - Simple Example

Equipment Database
Equipment_Id  Inspector_Id  Inspection_Date
E02363        103546        01/10/2013

Document Database
Document_Id  Author_Id  Creation_Date
D8946        jmathis    06/15/2012

Employee Database
Employee_Id  Employee_Name  Network_Id
103546       Joe Mathis     jmathis

Query: Give me the document(s) authored by the
inspector of equipment E02363
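
To make the query concrete, here is a minimal relational sketch in Python using the standard-library sqlite3 module. It assumes the three source tables have been copied into a single in-memory database (in the slides they live in three separate databases), that the lowercase table and column names are illustrative, and that the employee ID equals the inspector ID 103546.

```python
import sqlite3

# One in-memory database stands in for the three separate source databases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE equipment (equipment_id TEXT, inspector_id TEXT, inspection_date TEXT);
CREATE TABLE document  (document_id TEXT, author_id TEXT, creation_date TEXT);
CREATE TABLE employee  (employee_id TEXT, employee_name TEXT, network_id TEXT);
INSERT INTO equipment VALUES ('E02363', '103546', '01/10/2013');
INSERT INTO document  VALUES ('D8946', 'jmathis', '06/15/2012');
INSERT INTO employee  VALUES ('103546', 'Joe Mathis', 'jmathis');
""")


def documents_by_inspector(equipment_id):
    """Return IDs of documents authored by the inspector of the given equipment."""
    rows = conn.execute(
        """
        SELECT d.document_id
        FROM equipment e
        JOIN employee p ON p.employee_id = e.inspector_id  -- inspector is an employee
        JOIN document d ON d.author_id = p.network_id      -- author is a network ID
        WHERE e.equipment_id = ?
        """,
        (equipment_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(documents_by_inspector("E02363"))  # ['D8946']
```

The join only works because the employee table carries both identifiers: inspectors are recorded by employee ID, authors by network ID.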
6. Ontology Model for Relational Example
Diagram labels: rdfs:domain and rdfs:range edges
(ranges include xsd:string and m:Person), and one
rdfs:subClassOf edge
7. Semantic Data for Relational Example
Diagram edge labels: rdf:type, inspectedBy, authoredBy,
equipmentID, documentID, inspectionDate, creationDate,
networkID, employeeID, employeeName
Diagram node values: E02363, Person_103546, D8946,
Person_jmathis, 01/10/2013, 06/15/2012, 103546,
Joe Mathis, jmathis
8. Inference Rule Example
CONSTRUCT { ?emp2 rdf:type m:Employee .
            ?emp2 owl:sameAs ?emp1 }
WHERE { ?emp1 rdf:type m:Employee .
        ?emp1 m:networkID ?netID .
        BIND (URI(CONCAT("http://KE/Data/SEM", "Person_", ?netID)) AS ?emp2) }
With OWL inferencing:
SELECT ?name
WHERE { Document_D8946 authoredBy ?author .
        ?author employeeName ?name }
Results in name = Joe Mathis
Diagram labels: rdf:type, owl:sameAs, networkID,
employeeName; values: jmathis, Joe Mathis
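
The effect of the rule can be imitated in plain Python over a set of triples. This is only a sketch: the node name Employee_103546 is a hypothetical stand-in (the slides do not show the employee node's name), triples are simple tuples of strings, and apply_same_as copies statements across owl:sameAs pairs, which covers only a small part of what an OWL reasoner does.

```python
RDF_TYPE = "rdf:type"
SAME_AS = "owl:sameAs"

# Employee_103546 is a hypothetical node name used for illustration only.
triples = {
    ("Document_D8946", "authoredBy", "Person_jmathis"),
    ("Employee_103546", RDF_TYPE, "m:Employee"),
    ("Employee_103546", "networkID", "jmathis"),
    ("Employee_103546", "employeeName", "Joe Mathis"),
}


def construct_same_as(graph):
    """Imitate the CONSTRUCT rule: mint Person_<netID> for every employee."""
    employees = {s for (s, p, o) in graph if p == RDF_TYPE and o == "m:Employee"}
    new = set()
    for (s, p, o) in graph:
        if s in employees and p == "networkID":
            alias = "Person_" + o
            new.add((alias, RDF_TYPE, "m:Employee"))
            new.add((alias, SAME_AS, s))
    return new


def apply_same_as(graph):
    """Copy statements between nodes linked by owl:sameAs until nothing changes."""
    graph = set(graph)
    pairs = {(s, o) for (s, p, o) in graph if p == SAME_AS}
    pairs |= {(b, a) for (a, b) in pairs}  # owl:sameAs is symmetric
    changed = True
    while changed:
        changed = False
        for (a, b) in pairs:
            for (s, p, o) in list(graph):
                if p == SAME_AS:
                    continue
                for t in ((b, p, o) if s == a else None,
                          (s, p, b) if o == a else None):
                    if t is not None and t not in graph:
                        graph.add(t)
                        changed = True
    return graph


def author_names(graph, document):
    """Equivalent of the SELECT query: names of the document's authors."""
    authors = {o for (s, p, o) in graph if s == document and p == "authoredBy"}
    return sorted({o for (s, p, o) in graph if s in authors and p == "employeeName"})


print(author_names(triples, "Document_D8946"))   # [] without the rule
inferred = apply_same_as(triples | construct_same_as(triples))
print(author_names(inferred, "Document_D8946"))  # ['Joe Mathis']
```

Without the constructed owl:sameAs link, Person_jmathis has no employeeName, so the query returns nothing; with it, the name attached to the employee record becomes reachable from the document.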
9. A Manufacturing Example

Manufacturing Database

Product Table
Prod_Id  Prod_Name  Mfg_Line
P1       Splash     Line 1
P2       Trident    Line 4

BOM Table
Prod_Id  Matl_Used       Qty_Used
P1       Pink Colorant   20
P1       Silver Wrapper  1
P1       Melon Flavor    15
P2       Red Colorant    13
P2       Silver Wrapper  1
P2       Cherry Flavor   10

Purchasing Database

Raw Material Table
Matl_Name       Supplier_Id
Colorant-Pink   S1
Colorant-Red    S1
Flavor-Melon    S2
Flavor-Cherry   S3
Wrapper-Silver  S4

Supplier Table
Supplier_Id  Supplier_Name
S1           Acme
S2           Foods-R-Us
S3           Yummy
S4           Lotus Inc.

Query: Give me the suppliers of raw materials for
products made on Line 4
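
One detail in these tables is worth making concrete: the BOM table names materials as "Red Colorant" while the Raw Material table uses "Colorant-Red", so a plain equality join finds no suppliers. The sketch below uses Python's sqlite3 with one in-memory database standing in for both source databases. The helper to_bom_name and its naming rule (split on the hyphen, reverse the words) are assumptions made for this sample data, not something the slides specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product      (prod_id TEXT, prod_name TEXT, mfg_line TEXT);
CREATE TABLE bom          (prod_id TEXT, matl_used TEXT, qty_used INTEGER);
CREATE TABLE raw_material (matl_name TEXT, supplier_id TEXT);
CREATE TABLE supplier     (supplier_id TEXT, supplier_name TEXT);
INSERT INTO product VALUES ('P1', 'Splash', 'Line 1'), ('P2', 'Trident', 'Line 4');
INSERT INTO bom VALUES
  ('P1', 'Pink Colorant', 20), ('P1', 'Silver Wrapper', 1), ('P1', 'Melon Flavor', 15),
  ('P2', 'Red Colorant', 13),  ('P2', 'Silver Wrapper', 1), ('P2', 'Cherry Flavor', 10);
INSERT INTO raw_material VALUES
  ('Colorant-Pink', 'S1'), ('Colorant-Red', 'S1'), ('Flavor-Melon', 'S2'),
  ('Flavor-Cherry', 'S3'), ('Wrapper-Silver', 'S4');
INSERT INTO supplier VALUES
  ('S1', 'Acme'), ('S2', 'Foods-R-Us'), ('S3', 'Yummy'), ('S4', 'Lotus Inc.');
""")


def to_bom_name(purchasing_name):
    """Hypothetical rule: 'Colorant-Red' -> 'Red Colorant'."""
    return " ".join(reversed(purchasing_name.split("-")))


# Expose the helper to SQL so it can be used inside the join condition.
conn.create_function("to_bom_name", 1, to_bom_name)

QUERY = """
SELECT DISTINCT s.supplier_name
FROM product p
JOIN bom b          ON b.prod_id = p.prod_id
JOIN raw_material r ON {match}
JOIN supplier s     ON s.supplier_id = r.supplier_id
WHERE p.mfg_line = ?
ORDER BY s.supplier_name
"""


def suppliers_for_line(line, match):
    """Run the supplier query with the given material-matching condition."""
    sql = QUERY.format(match=match)
    return [row[0] for row in conn.execute(sql, (line,))]


naive = suppliers_for_line("Line 4", "r.matl_name = b.matl_used")
linked = suppliers_for_line("Line 4", "to_bom_name(r.matl_name) = b.matl_used")
print(naive)   # [] because the two databases name materials differently
print(linked)  # ['Acme', 'Lotus Inc.', 'Yummy']
```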
10. Ontology Model for Manufacturing Database
Diagram labels: rdfs:domain and rdfs:range edges
11. Ontology Model for Purchasing Database
Diagram labels: rdfs:domain and rdfs:range edges
12. RDF Triples from D2RQ Mapping - Manufacturing Database
m:Product_P1 rdf:type m:Product
m:Product_P1 m:ProductID "P1"
m:Product_P1 m:ProductName "Splash"
m:Product_P1 m:hasManufacturingLine m:Line_1
m:Product_P2 rdf:type m:Product
m:Product_P2 m:ProductID "P2"
m:Product_P2 m:ProductName "Trident"
m:Product_P2 m:hasManufacturingLine m:Line_4
m:Line_1 rdf:type m:ManufacturingLine
m:Line_1 rdfs:label "Line 1"
m:Line_4 rdf:type m:ManufacturingLine
m:Line_4 rdfs:label "Line 4"
m:Bom_P1_Pink_Colorant rdf:type m:BillOfMaterial
m:Bom_P1_Pink_Colorant m:hasProduct m:Product_P1
m:Bom_P1_Pink_Colorant m:hasRawMaterial m:Material_Pink_Colorant
m:Bom_P1_Pink_Colorant qtyUsed 20
m:Bom_P1_Silver_Wrapper rdf:type m:BillOfMaterial
m:Bom_P1_Silver_Wrapper m:hasProduct m:Product_P1
m:Bom_P1_Silver_Wrapper m:hasRawMaterial m:Material_Silver_Wrapper
m:Bom_P1_Silver_Wrapper qtyUsed 1
..
Question: How do we update the unitOfMeasure
property?
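
The triples listed on this slide follow a regular pattern: each table row becomes a subject whose URI is built from key columns, and each remaining column becomes a property. The Python sketch below imitates that pattern for the Product and BOM rows. It is not the D2RQ mapping language itself; the function names are hypothetical, and the URI patterns are inferred from the triples shown here.

```python
def product_triples(row):
    """Map one Product table row (Prod_Id, Prod_Name, Mfg_Line) to triples."""
    prod_id, prod_name, mfg_line = row
    subject = f"m:Product_{prod_id}"
    line = "m:" + mfg_line.replace(" ", "_")  # 'Line 1' -> m:Line_1
    return [
        (subject, "rdf:type", "m:Product"),
        (subject, "m:ProductID", prod_id),
        (subject, "m:ProductName", prod_name),
        (subject, "m:hasManufacturingLine", line),
        (line, "rdf:type", "m:ManufacturingLine"),
        (line, "rdfs:label", mfg_line),
    ]


def bom_triples(row):
    """Map one BOM table row (Prod_Id, Matl_Used, Qty_Used) to triples."""
    prod_id, matl_used, qty_used = row
    matl = matl_used.replace(" ", "_")        # 'Pink Colorant' -> Pink_Colorant
    subject = f"m:Bom_{prod_id}_{matl}"
    return [
        (subject, "rdf:type", "m:BillOfMaterial"),
        (subject, "m:hasProduct", f"m:Product_{prod_id}"),
        (subject, "m:hasRawMaterial", f"m:Material_{matl}"),
        (subject, "qtyUsed", qty_used),
    ]


for triple in product_triples(("P1", "Splash", "Line 1")):
    print(*triple)
for triple in bom_triples(("P1", "Pink Colorant", 20)):
    print(*triple)
```

Running the same functions over every row produces repeated line triples (for example when two products share a line), so a real load would collect the results into a set.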
13. RDF Triples from D2RQ Mapping - Purchasing Database
p:Material_Colorant_Pink rdf:type p:RawMaterial
p:Material_Colorant_Pink p:materialName "Colorant-Pink"
p:Material_Colorant_Pink p:hasMaterialType p:MaterialType_Colorant
p:Material_Colorant_Pink p:hasSupplier p:Supplier_S1
p:Material_Wrapper_Silver rdf:type p:RawMaterial
p:Material_Wrapper_Silver p:materialName "Wrapper-Silver"
p:Material_Wrapper_Silver p:hasMaterialType p:MaterialType_Wrapper
p:Material_Wrapper_Silver p:hasSupplier p:Supplier_S4
..
p:Supplier_S1 rdf:type p:Supplier
p:Supplier_S1 p:supplierID "S1"
p:Supplier_S1 p:supplierName "Acme"
p:Supplier_S4 rdf:type p:Supplier
p:Supplier_S4 p:supplierID "S4"
p:Supplier_S4 p:supplierName "Lotus Inc."
p:MaterialType_Colorant rdf:type p:MaterialType
p:MaterialType_Colorant rdfs:label "Colorant"
p:MaterialType_Wrapper rdf:type p:MaterialType
p:MaterialType_Wrapper rdfs:label "Wrapper"
..
Question: How do we link the materials from
Purchasing to Manufacturing?
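
One possible answer, in the spirit of the owl:sameAs rule shown in the inference example, is to generate a link triple for each purchasing material that points at the corresponding manufacturing material. The Python sketch below assumes the two naming schemes differ only in word order (p:Material_Colorant_Pink versus m:Material_Pink_Colorant); that rule and the function names are hypothetical and would need checking against real data.

```python
purchasing_triples = [
    ("p:Material_Colorant_Pink", "rdf:type", "p:RawMaterial"),
    ("p:Material_Colorant_Pink", "p:hasSupplier", "p:Supplier_S1"),
    ("p:Material_Wrapper_Silver", "rdf:type", "p:RawMaterial"),
    ("p:Material_Wrapper_Silver", "p:hasSupplier", "p:Supplier_S4"),
    ("p:Supplier_S1", "rdf:type", "p:Supplier"),
]


def manufacturing_uri(purchasing_uri):
    """Assumed rule: reverse the words after 'Material' and switch the prefix."""
    local = purchasing_uri.split(":", 1)[1]  # e.g. Material_Colorant_Pink
    kind, *words = local.split("_")          # 'Material', ['Colorant', 'Pink']
    return "m:" + "_".join([kind] + words[::-1])


def same_as_links(triples):
    """Emit one owl:sameAs triple per purchasing raw material."""
    materials = sorted(
        s for (s, p, o) in triples if p == "rdf:type" and o == "p:RawMaterial"
    )
    return [(s, "owl:sameAs", manufacturing_uri(s)) for s in materials]


for link in same_as_links(purchasing_triples):
    print(*link)
```

Keeping the links as separate triples, rather than renaming either side, leaves both source graphs untouched and makes the linking rule easy to audit or replace.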
14. Data Traceability is Critical
- Ability to link any item used in an application to the exact data source, all the way downstream
  - Use of the hasDataSource property for every instance created
- Linking of data occurs at two levels: across the product genealogy (horizontal and business driven), and across the system layers (vertical and technology driven)
- Semantic relationships between concepts should be defined properly and maintained to reflect changes in underlying source systems
- Important to keep non-validated data in their own models (semantic graphs), especially in a regulated environment
- Specific verification / validation steps to be performed when new applications are brought on board using the semantic layer
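
A minimal Python sketch of the first and fourth points: every instance gets a hasDataSource triple when it is loaded, and validated and non-validated data go into separate named graphs. The graph names, source labels, and function names are hypothetical placeholders; only the hasDataSource property comes from the slide.

```python
from collections import defaultdict

# Named graphs (models): graph name -> set of triples.
graphs = defaultdict(set)


def load_instance(graph_name, subject, rdf_class, source, properties):
    """Add an instance to a graph and always record where it came from."""
    graph = graphs[graph_name]
    graph.add((subject, "rdf:type", rdf_class))
    graph.add((subject, "hasDataSource", source))
    for predicate, value in properties.items():
        graph.add((subject, predicate, value))


def data_sources(subject):
    """Trace an instance back to (graph name, data source) pairs."""
    return sorted(
        (name, o)
        for name, graph in graphs.items()
        for (s, p, o) in graph
        if s == subject and p == "hasDataSource"
    )


def missing_provenance(graph_name):
    """List typed instances in a graph that lack a hasDataSource triple."""
    graph = graphs[graph_name]
    typed = {s for (s, p, o) in graph if p == "rdf:type"}
    traced = {s for (s, p, o) in graph if p == "hasDataSource"}
    return sorted(typed - traced)


# Source labels below are illustrative placeholders.
load_instance("validated", "m:Product_P2", "m:Product",
              "ManufacturingDB.Product", {"m:ProductName": "Trident"})
load_instance("non_validated", "p:Supplier_S1", "p:Supplier",
              "PurchasingDB.Supplier", {"p:supplierName": "Acme"})
# A triple added outside load_instance has no provenance and is flagged.
graphs["non_validated"].add(("p:Supplier_S9", "rdf:type", "p:Supplier"))

print(data_sources("m:Product_P2"))
print(missing_provenance("non_validated"))
```

A check like missing_provenance is the kind of step that could be part of verification when a new application starts consuming the semantic layer.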
15. Conclusion / Takeaways
- Ontological modeling of enterprise data stored in conventional databases is the first and crucial step
- Augmenting the model with rules adds more power to the inferencing capabilities of the model
- Annotation properties (rdfs:label, rdfs:comment, rdfs:seeAlso, etc.) can also be used to add semantic meaning to the data
- D2RQ provides a flexible mapping language and a set of tools to enable the conversion of relational data to RDF triples
- Judicious use of owl:sameAs helps in linking instances that are the same but from different sources
- Critical to ensure data traceability (also called data provenance), which has to be planned for in the model and when data is loaded into the database