Business Rules, Data Quality, and Information Compliance - PowerPoint PPT Presentation

Slides: 81
Provided by: davidl78
Transcript and Presenter's Notes

Title: Business Rules, Data Quality, and Information Compliance


1
Business Rules, Data Quality, and Information
Compliance
  • David Loshin
  • Knowledge Integrity Incorporated
  • loshin_at_knowledge-integrity.com
  • (301) 754-6350

2
Agenda
  • Data Quality and Risk
  • Information Compliance
  • Information Compliance and Business Rules

3
The Value of Information Quality
  • Both internal and external factors affect how
    information is represented and used
  • Traditional focus on structure instead of content
    has led to deficiencies in asserting validity of
    data
  • There is growing recognition that responsibility
    for information quality lies with the business
    client
  • Increased external pressures are beginning to
    influence the care we take in managing content

4
Risks and Data Quality
Policy
Compliance
Value/Benefits
Poor Data Quality
5
Types of Risk
  • Compliance Risks
  • System Development Risks
  • Increased Operational Costs
  • Lost Opportunities
  • Being able to quantify impediments to achieving
    business objectives due to poor data quality
    should be seen as a critical part of business
    risk management

6
Managing Data Quality
  • Data is a critical organizational asset
  • Manage the quality of data with the same
    diligence that companies use in managing all
    other assets and resources (TDWI, Data Quality
    and the Bottom Line)
  • Companies that manage their data as a strategic
    resource and invest in its quality are already
    pulling ahead in terms of reputation and
    profitability (PriceWaterhouseCoopers, Global
    Data Management Survey 2001)

7
Data Quality is Critical to Data Warehouse Success
  • By 2005, more than 50 percent of projects will
    fail; Fortune 1000 companies will spend or lose
    more on operational inefficiencies in the
    back office than on data warehousing or CRM
    (Ted Friedman, Gartner Group)
  • According to the PriceWaterhouseCoopers Global
    Data Management Survey, poor data quality meant
    that:
  • Over 50% of respondents had incurred extra costs
    to prepare reconciliations
  • A third had been forced to delay or scrap new
    systems
  • Almost a third had failed to bill or collect
    receivables

8
Fitness for Use
  • Defect-free data is not a requirement
  • Instead, target measured compliance with user
    expectations above agreed-to thresholds
  • But how do we know when data quality is at an
    acceptable level?
  • Fuzzy notions of good vs. bad data
  • Different criteria for different users
  • Data sets are used in ways they were never
    intended
  • Data Quality is Contextual

9
What Can Go Wrong?
  • Data entry errors
  • Absence of agreement as to business term meanings
  • Mismatched syntax, formats and structures
  • Unexpected changes in source systems
  • Multiple interfaces to same back end
  • Validity failures
  • Data conversion errors
  • Changes in use and perception of data

10
The Real Problem
  • There are no objective measures of data quality
  • The only industry metrics are based on name
    deduplication or address standardization
  • The scope of the business importance of
    information validation is widely underestimated!

11
Understanding and Addressing the Problem
  • Any situation where information must comply with
    business client expectations may be considered a
    data quality problem
  • To effectively manage data quality, we must be
    able to
  • Determine data quality expectations
  • Identify contextual metrics
  • Assess levels of data quality
  • Identify opportunities for improvement
  • Eliminate sources of problems
  • Measure continuous improvement against baseline

12
The Knowledge Integrity Approach
  • Introduce a methodology for expressing business
    client data quality expectations and measuring
    conformance with those expectations
  • Provide a framework for defining data quality and
    business rules at a high level
  • Address the most common data quality problems at
    the source instead of repetitive data correction
  • Demonstrate the ability to transform the
    statement of rules into actualized processes to
    measure and report based on those defined rules
  • Information Compliance

13
Information Compliance
  • The coordinated, measurable conformance of a
    collection of data instances with a set of
    explicitly defined data expectations expressed
    using a formal rule language.
  • Can be used
  • To characterize fitness for use
  • To motivate the definition of data expectations
  • To determine validity of information within a set
    of defined constraints
  • As the basis for business performance metrics and
    measurements of those metrics
  • To enable rapid root cause analysis of
    information scrap and rework

14
Information Compliance and Data Quality
  • Effective measurement of compliance with
    expectations is a core component of a data
    quality strategy
  • Defining information consumer expectations
    provides clarity as to data quality requirements
  • Expectations must be tied to specific business
    impacts
  • Well-defined requirements provide insight into
    objective metrics and key data quality indicators

15
Information Compliance: Not Just for Data Quality!
  • Traditional design processes
  • Documentation/data dictionaries
  • Business management
  • Policies
  • Standards
  • Regulations
  • Knowledge management

16
Maturation of the Web
17
Neo-Centralization
We are going to make scads of money!
Business Intelligence Data Warehouse
How do we reconcile between these different
systems?
18
Regulatory Compliance
Your company's CEO
Watch this space
19
Business Management
  • Our corporate pricing strategy has been revised
    to reflect a more customer-friendly approach
  • The first time a customer buys one of our
    products, he or she will be given an immediate
    15% discount
  • To encourage repeat sales, any of our current
    customers can buy additional quantities of any
    already purchased product at a 10% discount
  • Our preferred customers receive a 25% discount on
    any purchase

20
Business Management
  • New customers are given a 15% discount on first
    purchase of a product
  • Current customers are given a 10% discount on
    each additional product purchased
  • Preferred customers are given a 25% discount on
    all products
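
The three restated rules are compact enough to write as a function. The sketch below is one possible reading, not the presenter's implementation: the function name and parameters are hypothetical, "new customer" is assumed to mean a customer with no prior purchases, and the precedence (preferred first) is an assumption, since the slide does not say which discount applies when more than one rule matches.

```python
def discount_percent(is_preferred, prior_purchases):
    """Return the discount, in percent, for a single purchase.

    Assumptions not stated on the slide: preferred status takes
    precedence, and a new customer is one with zero prior purchases.
    """
    if is_preferred:
        return 25  # preferred customers: all products
    if prior_purchases == 0:
        return 15  # new customers: first purchase of a product
    return 10      # current customers: each additional product


print(discount_percent(is_preferred=False, prior_purchases=0))
```

Writing the rules down this way forces the precedence question into the open, which the prose version leaves unanswered.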

21
Policy, Example 1
  • The following paragraphs have been taken from the
    Yahoo! Privacy policy,
    http://privacy.yahoo.com/privacy/us/
  • Yahoo! does not rent, sell, or share personal
    information about you with other people or
    nonaffiliated companies except to provide
    products or services you've requested, when we
    have your permission, or under the following
    circumstances:
  • We provide the information to trusted partners
    who work on behalf of or with Yahoo! under
    confidentiality agreements. These companies may
    use your personal information to help Yahoo!
    communicate with you about offers from Yahoo! and
    our marketing partners. However, these companies
    do not have any independent right to share this
    information.
  • We have a parent's permission to share the
    information if the user is a child under age 13.
    Parents have the option of allowing Yahoo! to
    collect and use their child's information without
    consenting to Yahoo! sharing of this information
    with people and companies who may use this
    information for their own purposes

23
Policy, Example 2
  • The following paragraph has been taken from the
    Earthlink Small Office DSL Terms and Conditions,
    http://www.earthlink.net/about/policies/smofficeterms/
  • If you need to cancel your EarthLink Small Office
    DSL Service after installation, please send your
    written request to EarthLink Business Access
    Customer Service at fax (408) 881-3011. We
    require 30 days notice for service cancellation.
    To process your cancellation request, we require
    that you provide the following: (1) Written
    request submitted on company letterhead by your
    billing contact; (2) Your customer or account
    number; (3) Current phone number; (4) Reason for
    canceling service.

25
Data Dictionary Example 1
26
Data Dictionary Example 2
27
Standards, Example 1
28
Standards, Example 1
29
Standards, Example 2
30
Legislation
From the Personal Responsibility and Work
Opportunity Reconciliation Act of 1996 (PRWORA):
(1) TRANSMISSION OF WAGE WITHHOLDING NOTICES TO
EMPLOYERS. Within 2 business days after the date
information regarding a newly hired employee is
entered into the State Directory of New Hires,
the State agency enforcing the employee's child
support obligation shall transmit a notice to the
employer of the employee directing the employer
to withhold from the income of the employee an
amount equal to the monthly (or other periodic)
child support obligation (including any past due
support obligation) of the employee, unless the
employee's income is not subject to withholding
pursuant to section 466(b)(3).
32
Regulations
  • Sarbanes-Oxley Act of 2002, which requires each
    annual report of an issuer to contain an
    "internal control report", which shall:
  • state the responsibility of management for
    establishing and maintaining an adequate internal
    control structure and procedures for financial
    reporting; and
  • contain an assessment, as of the end of the
    issuer's fiscal year, of the effectiveness of the
    internal control structure and procedures of the
    issuer for financial reporting.

33
Information Compliance Activities
  • Documentation of Policy and Linkage to Business
    Rules
  • Assessment
  • Monitoring
  • ROI calculation
  • Root cause analysis
  • Continuous data quality improvement
  • Knowledge Capture, Management, Transfer

34
Benefits
  • Standardization of metadata and widely-shared
    reference information across collection of
    organizations
  • Ability to capture and manage business logic as
    content
  • Overall improved information quality and improved
    operational efficiency
  • High-level description of information integration
    process

35
More Benefits
  • Abstraction of business rule specification from
    implementation provides for
  • Rapid application development
  • Retargetability
  • Reuse
  • Continuous improvement
  • Enhanced matrixed information coordination

36
What is a Business Rule?
  • a statement that defines or constrains some
    aspect of a business by asserting control over
    some behavior of that business

37
Business Rules?
OK, now get this down. If it is a Monday, and it
is raining outside, and if there might be a red
Corvette parked on the roof of the garage, then
if the client's mood is OK, then we can charge
the double rate when the client's head is turned
to the right, and
Yeah, this is good that we are finally
documenting this business rule.
38
Rules and Rules
  • The value of business rules lies in the ability
    to describe assertions about a system using a
    formal framework that is both actionable and
    adaptable
  • The popular perception of business rules leaves a
    wide gap between what is describable and what is
    actionable
  • In other words, what people think of as being
    business rules are usually not business rules

39
Rule-Based Validation
VALIDATE
Event
System State
Rules Engine
Business Rules
40
The Value of Rules
  • Documents business logic
  • Automates business processes
  • Middle ground of definition between technicians
    and business clients
  • Rapid development
  • Rapid adaptability to change

41
Data Quality Business Rules
42
Defining a Semantic Rule Hierarchy
  • Let's use what we know about data to drive that
    definition framework
  • Discuss how to transform those rules into
    operational code in different ways
  • We can see ways in which the rules can be
    integrated into higher-level information
    compliance applications

43
Granularity and the Semantic Hierarchy I
  • Distinct values, perhaps bound to single instance
    objects

44
Granularity and the Semantic Hierarchy II
  • Sets of values unbound to a specific attribute

45
Data Values
  • Range restrictions
  • Format Restrictions
  • Data domains
  • Null values

46
Null Values
  • No Value
  • Unavailable
  • Not Applicable
  • Not Classified
  • Unknown
  • Default

47
Null Values: Some Examples
  • Social Security Numbers
  • 000000000, 999999999
  • Names
  • ?, ??, ???, ????, ?????, N/A, NA,
    NONE, None, UNKNOWN, Unknown, n/a,
    na, none, unknown
  • Phone Numbers
  • No phone number provided
  • 000-000-0000
  • 999-999-9999
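
A check for these stand-in values can be sketched in Python. The lookup table is seeded only with the examples listed on this slide; the names NULL_REPRESENTATIONS and is_null and the field keys are hypothetical, and a real list would have to come from profiling the actual data.

```python
# Known stand-ins for "no value", per field, taken from the slide's examples.
NULL_REPRESENTATIONS = {
    "ssn": {"000000000", "999999999"},
    "name": {"?", "??", "???", "????", "?????", "N/A", "NA", "NONE", "None",
             "UNKNOWN", "Unknown", "n/a", "na", "none", "unknown"},
    "phone": {"No phone number provided", "000-000-0000", "999-999-9999"},
}


def is_null(field, value):
    """Return True if value is missing or is a known stand-in for a missing value."""
    if value is None or value.strip() == "":
        return True
    return value.strip() in NULL_REPRESENTATIONS.get(field, set())


print(is_null("ssn", "999999999"), is_null("name", "Loshin"))
```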

48
Data Domain Definition
  • Enumerated Domains
  • Define States as Alabama, Alaska, ...
  • Implemented in SQL as creation of a temporary
    table and populating it with values
  • Table-derived Domains
  • Define validIDs as employee.id when
    employee.status = active
  • Implemented in SQL as a subquery
  • Constructive Domains
  • Valid values are defined as a function of other
    values
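
The three kinds of domain can be illustrated with plain Python collections. This is a sketch under assumptions: the employee rows are made up, the state list is left truncated as on the slide, and the constructive example (a line total that must equal quantity times unit price) is hypothetical, chosen only to show a domain defined as a function of other values.

```python
# Enumerated domain: the values are listed explicitly (truncated, as on the slide).
STATES = {"Alabama", "Alaska"}

# Table-derived domain: ids of employees whose status is active.
employees = [
    {"id": 1, "status": "active"},
    {"id": 2, "status": "terminated"},
    {"id": 3, "status": "active"},
]
valid_ids = {e["id"] for e in employees if e["status"] == "active"}


# Constructive domain: membership is computed from other values.
def in_line_total_domain(total, quantity, unit_price):
    """Hypothetical rule: a valid total equals quantity times unit price."""
    return total == quantity * unit_price


print(sorted(valid_ids))
```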

49
Granularity and the Semantic Hierarchy III
  • All values bound to different instances of the
    same attribute

50
Data Attributes
  • Absence of values
  • Restriction of values

51
Null Value Rules
  • Nulls not Allowed
  • Order.productID may not be null
  • Null Representations
  • Define NO_NUMBER as NOVALUE as no number
    provided
  • Represented Nulls Allowed

52
Domain Membership
  • Assert that each value in all instances of a
    named attribute is taken from the specified data
    domain
  • Payroll.id belongs to validIDs
  • Implemented in SQL as a query extracting all
    records whose values are not in the named domain,
    represented by a subquery
  • Data domains may be shared across an enterprise
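
The SQL formulation described above can be tried with Python's built-in sqlite3 module. The table layouts and rows are invented for illustration; only the rule (Payroll.id belongs to validIDs) and the subquery approach come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER, status TEXT);
    CREATE TABLE payroll  (id INTEGER, amount INTEGER);
    INSERT INTO employee VALUES (1, 'active'), (2, 'inactive'), (3, 'active');
    INSERT INTO payroll  VALUES (1, 5000), (2, 4200), (4, 3900);
""")

# Rule: Payroll.id belongs to validIDs.
# The subquery plays the role of the validIDs domain; the outer query
# extracts every record whose id is not in that domain.
violations = conn.execute("""
    SELECT id, amount FROM payroll
    WHERE id NOT IN (SELECT id FROM employee WHERE status = 'active')
    ORDER BY id
""").fetchall()

print(violations)  # [(2, 4200), (4, 3900)]
```

One caveat of this formulation: if the subquery can return NULL, NOT IN matches nothing, so a production version would need to exclude nulls from the domain.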

53
Granularity and the Semantic Hierarchy IV
  • Single set of attribute, value pairs

54
Data Records
  • Completeness
  • Exemption
  • Consistency

55
Record-Level Completeness
  • Assert that a record is not complete unless every
    attribute in a given list is non-null
  • Example: If (order.productClass = option) then
    incomplete without underlier, strikePrice, and
    expiration
  • Implemented in SQL as extracting any record where
    the condition is true and any of the named
    attributes is null
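
The same extraction can be sketched in Python over a list of records. The helper name and the sample orders are hypothetical; the rule is the option example above.

```python
def completeness_violations(records, condition, required):
    """Return records where the condition holds and any required attribute is null."""
    return [r for r in records
            if condition(r) and any(r.get(a) is None for a in required)]


orders = [
    {"id": 1, "productClass": "option", "underlier": "IBM",
     "strikePrice": 90, "expiration": "2004-06"},
    {"id": 2, "productClass": "option", "underlier": "IBM",
     "strikePrice": None, "expiration": "2004-06"},
    {"id": 3, "productClass": "stock", "underlier": None,
     "strikePrice": None, "expiration": None},
]

incomplete = completeness_violations(
    orders,
    condition=lambda r: r["productClass"] == "option",
    required=["underlier", "strikePrice", "expiration"],
)
print([r["id"] for r in incomplete])  # [2]
```

Order 3 is not flagged because the condition does not apply to it.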

56
Record-Level Exemption
  • Asserts that under the specified condition an
    attribute should not have a value
  • Example: If (hcfa1500.otherInsurance = N) then
    exempt (hcfa1500.otherPolicyNumber,
    hcfa1500.otherGroupName)
  • Implemented similarly to completeness
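
An exemption check mirrors the completeness check with the null test inverted. A minimal sketch, with a hypothetical helper name and made-up claim records:

```python
def exemption_violations(records, condition, exempt):
    """Return records where the condition holds yet an exempt attribute has a value."""
    return [r for r in records
            if condition(r) and any(r.get(a) is not None for a in exempt)]


claims = [
    {"id": "A", "otherInsurance": "N", "otherPolicyNumber": None, "otherGroupName": None},
    {"id": "B", "otherInsurance": "N", "otherPolicyNumber": "P-77", "otherGroupName": None},
    {"id": "C", "otherInsurance": "Y", "otherPolicyNumber": "P-12", "otherGroupName": "Acme"},
]

bad = exemption_violations(
    claims,
    condition=lambda r: r["otherInsurance"] == "N",
    exempt=["otherPolicyNumber", "otherGroupName"],
)
print([r["id"] for r in bad])  # ['B']
```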

57
Record-Level Consistency
  • Either a straightforward assertion or one guarded
    by a condition
  • Example: If (employee.level = manager) then
    (employee.salary > 30000) and (employee.salary
    < 54000)
  • Implemented by finding all records that meet the
    condition but not the assertion
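
A sketch of that search in Python, assuming records are dictionaries; the helper name and the staff rows are hypothetical.

```python
def consistency_violations(records, condition, assertion):
    """Return records that meet the condition but fail the assertion."""
    return [r for r in records if condition(r) and not assertion(r)]


staff = [
    {"name": "Ann", "level": "manager", "salary": 42000},
    {"name": "Bob", "level": "manager", "salary": 61000},
    {"name": "Cyd", "level": "clerk", "salary": 25000},
]

flagged = consistency_violations(
    staff,
    condition=lambda r: r["level"] == "manager",
    assertion=lambda r: 30000 < r["salary"] < 54000,
)
print([r["name"] for r in flagged])  # ['Bob']
```

An unguarded assertion is the special case where the condition always returns True.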

58
Granularity and the Semantic Hierarchy V
  • Relationship of sets of values bound to sets of
    attributes

59
Functional Dependency
  • Asserts the dependence of one set of attribute
    values on a different set of attribute values
  • Example: ZIPCode DEPEND ON Street, City,
    State
  • Implemented by searching for occurrences where
    pairs of records that share the determining
    attribute values have different dependent
    variables
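
Rather than comparing every pair of records, the same violations can be found by grouping on the determining attributes and flagging any group with more than one dependent value. This grouping approach is a substitution for the pairwise search, and the helper name and address rows are made up.

```python
from collections import defaultdict


def fd_violations(records, determinant, dependent):
    """Map each determinant key to its dependent values when more than one is seen."""
    seen = defaultdict(set)
    for r in records:
        key = tuple(r[a] for a in determinant)
        seen[key].add(tuple(r[a] for a in dependent))
    return {key: values for key, values in seen.items() if len(values) > 1}


addresses = [
    {"Street": "1 Main St", "City": "Springfield", "State": "XX", "ZIPCode": "00001"},
    {"Street": "1 Main St", "City": "Springfield", "State": "XX", "ZIPCode": "00002"},
    {"Street": "9 Oak Ave", "City": "Springfield", "State": "XX", "ZIPCode": "00001"},
]

conflicts = fd_violations(addresses, ["Street", "City", "State"], ["ZIPCode"])
print(conflicts)
```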

60
Uniqueness
  • Asserts that across a set of records, there may
    not be two or more records sharing the same
    values in a set of named attributes
  • Example: NAME, SSN are UNIQUE
  • Implemented by searching for pairs of records
    that share the values for the
    set of named attributes
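
Counting occurrences of each value combination finds the same duplicates without an explicit pairwise comparison. A minimal sketch with a hypothetical helper name and made-up records:

```python
from collections import Counter


def uniqueness_violations(records, attributes):
    """Return each value combination that occurs more than once, with its count."""
    counts = Counter(tuple(r[a] for a in attributes) for r in records)
    return {key: n for key, n in counts.items() if n > 1}


people = [
    {"NAME": "Pat Doe", "SSN": "123456789"},
    {"NAME": "Pat Doe", "SSN": "123456789"},
    {"NAME": "Pat Doe", "SSN": "987654321"},
]

duplicates = uniqueness_violations(people, ["NAME", "SSN"])
print(duplicates)  # {('Pat Doe', '123456789'): 2}
```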

61
Granularity and the Semantic Hierarchy VI
  • Multiple sets of attribute, value pairs

62
Instance Classification Assertions
  • Groups sets of objects together, then applies
    assertion to group
  • Example: All test results with the same PSA test
    score value and the same prostate cancer risk and
    a risk factor greater than 10 must have a white
    blood count greater than 1000
  • CLASSIFY BY <classification expression>,
    <condition> IMPLIES <consequent>

63
Granularity and the Semantic Hierarchy VII
  • Aggregation of a set of values

64
Aggregate Assertion
  • Aggregate functions are useful for making
    assertions
  • Aggregates AVG, COUNT, MAX, MIN, SUM
  • An aggregate assertion makes some statement about
    compliance with respect to the result of an
    aggregate function
  • Example: The number of distinct values in today's
    update must not be less than the number of
    distinct values in yesterday's update
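
The distinct-count example reduces to a one-line comparison. The function name and sample updates below are hypothetical.

```python
def distinct_count_holds(todays_values, yesterdays_values):
    """True when today's update has at least as many distinct values as yesterday's."""
    return len(set(todays_values)) >= len(set(yesterdays_values))


yesterday = ["A", "B", "B", "C"]   # 3 distinct values
today_ok = ["A", "B", "C", "D"]    # 4 distinct values
today_bad = ["A", "A", "B"]        # 2 distinct values

print(distinct_count_holds(today_ok, yesterday))   # True
print(distinct_count_holds(today_bad, yesterday))  # False
```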

65
Aggregate Dependence
  • A dependence rule uses the result of an aggregate
    function to check for compliance against other
    data instances
  • Example: Any order greater than 1000.00 is given
    a 10% discount

66
Granularity and the Semantic Hierarchy VIII
  • Assertional relationship that exists across
    multiple instance sets

67
Foreign Key Assertion
  • This rule establishes a connective link between a
    set of attributes of one set of objects to a set
    of attributes in a different set of objects
  • The assertion implies that all instances of the
    targeted attributes exist within one object from
    the source set of objects
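
A foreign key assertion can be checked by collecting the key values present in the source set and flagging target records that match none of them. The helper name, attribute names, and rows are hypothetical.

```python
def foreign_key_violations(targets, target_attrs, sources, source_attrs):
    """Return target records whose key values match no source record."""
    source_keys = {tuple(s[a] for a in source_attrs) for s in sources}
    return [t for t in targets
            if tuple(t[a] for a in target_attrs) not in source_keys]


products = [{"code": "P1"}, {"code": "P2"}]
line_items = [
    {"order": 10, "productCode": "P1"},
    {"order": 10, "productCode": "P9"},
]

orphans = foreign_key_violations(line_items, ["productCode"], products, ["code"])
print(orphans)  # [{'order': 10, 'productCode': 'P9'}]
```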

68
Projected Completeness
  • For all objects related to each source object, if
    a condition evaluates to true then related object
    attribute lists must have values
  • Example: All order line items must have a product
    code and a quantity

69
Projected Exemption
  • For all objects related to each source object, if
    a condition evaluates to true then related object
    attribute lists may not have values
  • Example: For each order, if an item ordered comes
    in a single color, the color must be null

70
Projected Consistency
  • For all objects related to each source object, if
    a condition evaluates to true then the consequent
    must also evaluate to true
  • Example: For all extensions associated with a PBX
    number, if the line is marked as active, then a
    router ID must be a valid router ID

71
Did I Just Pull a Fast One?
  • For all extensions associated with a PBX number,
    if the line is marked as active, then a router ID
    must be a valid router ID
  • We should be able to combine assertions from
    lower levels at higher levels, right?
  • The rule above embeds a domain membership
    assertion valid router ID
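
The composition can be made explicit in code: a domain membership test is passed in as the consequent of a projected consistency rule. This is a sketch under assumptions; the router ID domain, the helper names, and the PBX data are all invented.

```python
VALID_ROUTER_IDS = {"R-1", "R-2"}  # hypothetical data domain


def in_domain(value, domain):
    """Lower-level assertion: domain membership."""
    return value in domain


def projected_violations(source_to_related, condition, consequent):
    """For each source object, collect related objects that meet the
    condition but fail the consequent."""
    out = {}
    for source, related in source_to_related.items():
        failed = [obj for obj in related if condition(obj) and not consequent(obj)]
        if failed:
            out[source] = failed
    return out


extensions_by_pbx = {
    "PBX-A": [
        {"ext": "101", "active": True, "routerID": "R-1"},
        {"ext": "102", "active": True, "routerID": "R-9"},
        {"ext": "103", "active": False, "routerID": None},
    ],
    "PBX-B": [{"ext": "201", "active": True, "routerID": "R-2"}],
}

failures = projected_violations(
    extensions_by_pbx,
    condition=lambda e: e["active"],
    consequent=lambda e: in_domain(e["routerID"], VALID_ROUTER_IDS),
)
print(failures)
```

Only extension 102 is reported: it is active and its router ID is outside the domain, while the inactive extension 103 is not subject to the rule.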

72
Semantic Hierarchy Summary
  • Values
  • Sets of values
  • Values bound to object attribute across a set of
    objects
  • Values assigned to a single object's attributes
  • Relationship of sets of attribute values across
    set of objects
  • Values assigned to attributes of a set of objects
  • Aggregation of a set of values
  • Assertions that cross object set boundaries

73
Rule-Based Validation
74
Taking Action
?
Express rules formally
State rules at high level
Identify business rules in context
75
Exploit Formality
  • Formal representation has certain characteristics
    that are desirable
  • A constraint expression neatly splits an object
    collection into compliant and non-compliant sets
  • Formal specification is implementation-independent
  • Well-defined syntax is parsable

76
Transforming Rules
  • Validation scheme can be constructed by
    operationalizing each formal rule statement
  • Question: How is a rule operationalized?
  • Answer: Provide a scheme for turning each formal
    rule statement into a corresponding executable
    statement in some target implementation framework
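
As one illustration of such a scheme, the sketch below turns a "nulls not allowed" rule into a SQL query and runs it with Python's sqlite3 module. The translator function, the table (named orders because ORDER is a reserved word in SQL), and its rows are hypothetical; the table and attribute names are assumed to come from trusted rule definitions rather than user input, since they are interpolated directly into the query text.

```python
import sqlite3


def nulls_not_allowed_sql(table, attribute):
    """Translate the rule '<table>.<attribute> may not be null' into a query
    that extracts the violating records."""
    return f"SELECT * FROM {table} WHERE {attribute} IS NULL"


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderID INTEGER, productID TEXT);
    INSERT INTO orders VALUES (1, 'P1'), (2, NULL);
""")

query = nulls_not_allowed_sql("orders", "productID")
violations = conn.execute(query).fetchall()
print(query)
print(violations)  # [(2, None)]
```

A different target framework would need only a different translator for the same rule statement.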

77
Distinguishing Noncompliant Objects
  • By asserting a constraint relating to a set of
    objects within our semantic hierarchy, we
    effectively define a bisection of that set into
    two subsets
  • Conformant objects, or ones that do not violate
    the constraint
  • Nonconformant objects, or ones that do violate
    the constraint

In practice, if we can operationalize the test of
the constraint, we can use it as a test to
identify and extract noncompliant objects
78
Distinguishing Noncompliant Objects - Benefits
  • Violating objects can be collected and grouped by
    violated rule
  • Eases reconciliation
  • Improves root cause analysis
  • The rule statement itself can be used as an error
    message
  • Provides high-level feedback
  • Understandable by both technicians and business
    clients
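
A small validator shows both ideas at once: records are split into conformant and nonconformant sets, and violations are grouped under the rule statement, which doubles as the error message. The first rule is taken from the Null Value Rules slide; the second rule, the function name, and the order records are hypothetical.

```python
from collections import defaultdict

# Each rule pairs its statement with an executable test.
RULES = [
    ("Order.productID may not be null", lambda r: r["productID"] is not None),
    ("Order.quantity must be greater than 0", lambda r: r["quantity"] > 0),
]


def validate(records, rules):
    """Split records into conformant ones and violations grouped by rule statement."""
    conformant, by_rule = [], defaultdict(list)
    for record in records:
        failed = [statement for statement, test in rules if not test(record)]
        for statement in failed:
            by_rule[statement].append(record)
        if not failed:
            conformant.append(record)
    return conformant, dict(by_rule)


orders = [
    {"id": 1, "productID": "P1", "quantity": 2},
    {"id": 2, "productID": None, "quantity": 1},
    {"id": 3, "productID": None, "quantity": 0},
]

ok, report = validate(orders, RULES)
for statement, records in report.items():
    print(f"{statement}: {[r['id'] for r in records]}")
```

A record that breaks several rules appears under each of them, which keeps the per-rule counts usable for root cause analysis.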

79
Conclusion
  • Data quality is a special case of a more
    general concept called Information Compliance
  • Information Compliance introduces a methodology
    for capturing and formally expressing user
    information expectations and measuring
    conformance with those expectations
  • Information Compliance can be implemented using a
    business rules approach

80
Questions?
  • If you have questions, please contact me
  • David Loshin
  • 301-754-6350
  • loshin_at_knowledge-integrity.com