Pr - PowerPoint PPT Presentation

About This Presentation
Title:

Pr

Description:

Predictive Modeling: Methods to predict expected claims costs. Uses historical data and calibrated models. Many uses in health insurance context: Renewal underwriting ...

Slides: 83
Provided by: actuaries
Learn more at: http://www.actuaries.org

Transcript and Presenter's Notes

2
New Developments in Predictive Modeling
Jonathan Shreve, FSA, MAAA
Principal and Consulting Actuary, Milliman
3
SOMMAIRE / SUMMARY
  • Overview
  • Optimal Use of Risk Adjusters
  • Lifestyle-Based Prediction

4
  • Predictive Modeling
  • Methods to predict expected claims costs
  • Uses historical data and calibrated models
  • Many uses in health insurance context
  • Renewal underwriting
  • Cost impact modeling
  • Payment equalization
  • Care management

5
  • Health Insurance Market in the United States
  • Individual
  • Small group (2-50 employees)
  • Medium group (51-500 employees)
  • Large group (> 500 employees)

6
  • Risk Adjusters Overview
  • Risk adjusters measure morbidity
  • Used for adjusting payments (Medicare),
    predictive modeling (SG rating), and medical
    management (DM)
  • Function of age, gender, and claim history
    (diagnoses and services - medical and/or Rx).
  • ERG, ACG, DxCG, etc.

7
  • Risk Adjusters Overview
  • Claim detail is sorted and formatted
  • Software assigns members to relatively broad
    diagnosis categories (e.g. Symmetry has 120
    categories called Episode Risk Groups (ERGs))
  • Output file (array) of 0s and 1s under each
    demographic category and each condition category
    for each member
  • Regression to fit actual costs to array of 0s and
    1s
  • Other risk adjusters
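
The steps on this slide (a 0/1 array per member, then a regression of actual costs on that array) can be sketched in a few lines. This is only an illustration under stated assumptions: the four column names, the six members, and their costs are hypothetical, and ordinary least squares via the normal equations stands in for whatever fitting procedure a commercial risk adjuster actually uses.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("indicator columns are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_risk_weights(flags, costs):
    """Least-squares weights from the normal equations (X'X) w = X'y."""
    k = len(flags[0])
    xtx = [[sum(row[i] * row[j] for row in flags) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(flags, costs)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_cost(weights, member_flags):
    """Predicted cost is the sum of the weights of the flags that are set."""
    return sum(w * f for w, f in zip(weights, member_flags))


# Hypothetical columns: two demographic cells and two condition categories.
COLUMNS = ["age_under_40", "age_40_plus", "diabetes", "asthma"]
flags = [
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
]
costs = [1000.0, 4000.0, 2500.0, 2000.0, 5000.0, 6500.0]  # made-up annual costs

weights = fit_risk_weights(flags, costs)
for name, weight in zip(COLUMNS, weights):
    print(f"{name}: {weight:.0f}")
```

Each fitted weight is the incremental cost attached to one demographic cell or condition category, so a new member's prediction is just the sum of the weights for the flags that member carries.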

8
  • Risk Adjusters Theoretical Value

9
  • United States Small Group Underwriting
  • Small group rating
  • Health insurance coverage
  • Small group: 2 to 50 employees
  • Guaranteed Issue
  • Limits on rate adjustments due to health status
  • Limits on rates offered to different groups

10
  • Introduction: Real World Considerations
  • Delay between when rates are developed and the
    rating period
  • Incomplete data (IBNR)
  • Rating limits (total Health Status Factor and
    changes)
  • Turnover
  • Competing against carriers' new business methods,
    not their renewal methods

11
  • Introduction: Prior Studies
  • Society of Actuaries Report (May 2002, Cumming
    et al.)
  • Society of Actuaries Health Section Council
    Article (Aug 2003, Ellis - DxCGs)
  • Society of Actuaries Report (Summer 2006)

12
  • Society of Actuaries Assessment of Available
    Claims Based Predictive Modeling/Risk Adjuster
    Tools
  • Objective analysis of predictive power of
    commercially available risk adjusters
  • Updates 2002 study
  • Measures R-squared, MAPE, and grouped statistics
    (including fit within disease category)

13
  • Society of Actuaries Assessment of Available
    Claims Based Predictive Modeling/Risk Adjuster
    Tools
  • Vendors/Products Included
  • Company: Product
  • Ingenix: Episode Risk Groups (ERGs)
  • Ingenix: Pharmacy Risk Groups (PRGs)
  • Ingenix: Impact Pro
  • Johns Hopkins: Adjusted Clinical Groups (ACGs)
  • UCSD, Todd Gilmer: Medicaid Rx
  • MedAI: MedAI
  • DxCG: Diagnostic Cost Groups (DCGs)
  • DxCG: RxGroups
  • DxCG: Underwriting Models
  • 3M: Clinical Risk Groups

14
  • Society of Actuaries Assessment of Available
    Claims Based Predictive Modeling/Risk Adjuster
    Tools
  • Biggest changes from prior study
  • New tools (e.g. MedAI)
  • Improvement in tools
  • Use of prior costs in some models
  • Results with data lag

15
  • Publicly Available Risk Adjusters
  • Medicaid Rx
  • RxRisk
  • CDPS
  • Information from 2002 study: "A Comparative
    Analysis of Claims-Based Methods of Health Risk
    Assessment for Commercial Populations,"
    Cumming/Knutson/Cameron/Derrick
  • Some restrictions on use may exist

16
  • Publicly Available Risk Adjusters
  • Medicaid Rx
  • Pharmacy based risk assessment model developed by
    Todd Gilmer and others at Univ. of California
  • Assigns each member to one or more of 45
    condition categories based on prescription drugs
    used
  • Assigns each member to one of 11 age/gender
    categories
  • Predicts overall costs for each member
  • Includes separate sets of weights for adults and
    children

17
  • Publicly Available Risk Adjusters
  • RxRisk
  • Pharmacy based risk assessment model developed by
    Paul Fishman at Group Health Cooperative of Puget
    Sound
  • Assigns each member to one or more of 27 medical
    condition categories for adults, and up to 42 for
    children
  • Assigns members to one of 22 age/gender
    categories
  • Predicts total medical costs for each member

18
  • Publicly Available Risk Adjusters
  • CDPS (www.medicine.ucsd.edu/fpm/cdps)
  • Diagnosis based risk assessment model developed
    by Richard Kronick and others at the Univ. of
    California
  • Originally intended for use with Medicaid,
    including disabled and Temporary Assistance for
    Needy Families (TANF) populations
  • Assigns members to up to 67 possible medical
    condition categories
  • Assigns members to one of 16 age/gender
    categories
  • Predicts total medical costs
  • Model contains different sets of weights for
    adults and children

19
  • Milliman Research
  • Optimal Renewal Guidelines
  • Goal of Research
  • Understand current small group renewal practices
  • Identify optimal renewal methodologies

20
  • Introduction: Survey Results
  • What methods are currently practiced to rate
    small groups at renewal?
  • Surveyed 21 carriers on SG methods
  • 30% of carriers used risk adjusters
  • 60% of groups

21
  • Introduction: Main Components
  • Individualized Data Analysis
  • Carrier Analysis
  • Competitive Simulation

22
  • Introduction: Individualized Data
  • Large multicarrier database used to review
    individual predictions
  • Advantages
  • Large database
  • Good geographical representation
  • Disadvantages
  • No group identifiers
  • Manual rate unavailable

23
  • Introduction: Carrier Data
  • Advantages
  • Actual Group Data
  • Group Manual Rates Available
  • Disadvantages
  • Medium sized data set
  • Geographical concentration
  • Biased

24
  • Models: Loss Ratio Model
  • 1st Renewal
  • 2nd Renewal

25
  • Models: Risk Adjuster Model
  • 1st Renewal
  • 2nd Renewal

26
  • Models: Service Category Model
  • 1st Renewal

27
  • Results: Error Measures
  • R-Squared - % of variance from the mean explained
    by rating variables
  • MAPE - Absolute error as % of total costs
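
The two error measures can be written out directly from the definitions on this slide. The sketch below assumes MAPE means total absolute error divided by total actual cost, which is how the slide words it; the four sample values are invented for illustration.

```python
def r_squared(actual, predicted):
    """Share of variance around the mean that the predictions explain."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_residual / ss_total


def mape_of_total(actual, predicted):
    """Total absolute prediction error as a fraction of total actual cost."""
    total_abs_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total_abs_error / sum(actual)


# Hypothetical group costs, expressed as % of manual rate.
actual = [80.0, 100.0, 120.0, 100.0]
predicted = [90.0, 100.0, 110.0, 100.0]

print(f"R-squared: {r_squared(actual, predicted):.2f}")
print(f"MAPE:      {mape_of_total(actual, predicted):.2%}")
```

R-squared squares each miss, so it is dominated by the largest errors, while MAPE weights every unit of error equally.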

28
  • Results: Theoretical

29
  • Results: Error Calculation Example
  • Small Group ABC
  • Traditional Prediction: 150%
  • Risk Adjuster Prediction: 125%
  • Actual Claims equal 120% of manual
  • Which method is better?
  • Error / R-squared?
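
The arithmetic behind this example is short enough to work through. The sketch assumes the three figures are percentages of the manual rate (the transcript dropped the units) and compares the two predictions under an absolute-error view and a squared-error view.

```python
# Figures from the slide, read as percentages of the manual rate (an assumption).
actual = 120
predictions = {"traditional": 150, "risk_adjuster": 125}

# Absolute error is the building block of MAPE.
absolute_errors = {m: abs(p - actual) for m, p in predictions.items()}
# Squared error is the building block of R-squared.
squared_errors = {m: (p - actual) ** 2 for m, p in predictions.items()}

for method in predictions:
    print(method, absolute_errors[method], squared_errors[method])
```

For this group the risk adjuster prediction is closer on both measures. The gap is 6 to 1 in absolute terms (30 points against 5) and 36 to 1 in squared terms (900 against 25), which shows how much more heavily a squared-error measure penalizes a large miss.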

30
  • Results: Credibility Weights
  • 1st Renewal, Individual Analysis

Svc category 2 IP, 24 OP, 18 Rx
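
The way a credibility weight and a rating cap interact can be sketched as follows. The transcript does not give the formula used in the research, so this assumes the standard linear credibility blend between an experience-based factor and the manual (1.00), followed by a symmetric cap on the resulting health status factor. All function and parameter names are hypothetical.

```python
def blended_factor(experience_factor, credibility, manual_factor=1.0):
    """Credibility-weighted blend of group experience and the manual rate."""
    if not 0.0 <= credibility <= 1.0:
        raise ValueError("credibility must be between 0 and 1")
    return credibility * experience_factor + (1.0 - credibility) * manual_factor


def apply_rating_cap(factor, cap):
    """Limit the health status factor to the range [1 - cap, 1 + cap]."""
    return min(max(factor, 1.0 - cap), 1.0 + cap)


# Hypothetical group: experience suggests 200% of manual, 50% credibility.
uncapped = blended_factor(experience_factor=2.0, credibility=0.5)
capped = apply_rating_cap(uncapped, cap=0.35)
print(uncapped, capped)
```

Once the cap binds, further accuracy in the uncapped prediction no longer changes the quoted rate, which is one way to read the rating-cap charts on the following slides.
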
31
  • Results: R-square
  • R-Square vs. Rating Caps (Group Size 10)

32
  • Results: Mean Absolute Prediction Error (as %)
  • MAPE vs. Rating Caps (Group Size 10)

33
  • Results: Mean Absolute Prediction Error (as %)
  • MAPE vs. Group Size (Rating Cap 35%)

34
  • Results: Mean Absolute Prediction Error (as %)
  • MAPE vs. Group Size (Uncapped)

35
  • Results: Carrier Analysis
  • Real groups
  • Turnover
  • Biased sample
  • Traditional / Risk Adjuster very similar!
  • Health status correlation

36
  • Competitive Simulation: Introduction
  • Based on carrier data
  • Excel model - stochastic
  • First renewal with 9 months of historic claims.
  • New business method accuracy simulated relative
    to renewal method accuracy (less accurate)
  • New business quotes generated stochastically
    (Bayesian from renewal quote distribution) with
    some correlation among different carriers
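
A toy Monte Carlo version of this kind of simulation is sketched below. It is not the Excel model described on the slide: instead of drawing new business quotes "Bayesian from renewal quote distribution", it draws each quote as the true cost times lognormal noise, with a shared error component to create correlation among carriers. Every parameter value, the switching rule, and the loss-ratio definition (true cost divided by quote, ignoring expenses) are assumptions made for illustration.

```python
import math
import random


def average(values):
    return sum(values) / len(values) if values else float("nan")


def simulate_market(n_groups, n_competitors, renewal_sd=0.05,
                    new_business_sd=0.15, carrier_correlation=0.3,
                    switch_threshold=0.05, seed=0):
    """Simulate renewals where each group may leave for a cheaper quote."""
    rng = random.Random(seed)
    # Split new business error into a part shared by all carriers and a
    # carrier-specific part, so competing quotes are positively correlated.
    shared_sd = new_business_sd * math.sqrt(carrier_correlation)
    own_sd = new_business_sd * math.sqrt(1.0 - carrier_correlation)
    retained, lost = [], []
    for _group in range(n_groups):
        true_cost = math.exp(rng.gauss(0.0, 0.3))
        renewal_quote = true_cost * math.exp(rng.gauss(0.0, renewal_sd))
        shared_error = rng.gauss(0.0, shared_sd)
        best_quote = min(
            true_cost * math.exp(shared_error + rng.gauss(0.0, own_sd))
            for _carrier in range(n_competitors)
        )
        # The group leaves only if the best quote beats renewal by a margin.
        if best_quote < renewal_quote * (1.0 - switch_threshold):
            lost.append(true_cost / best_quote)
        else:
            retained.append(true_cost / renewal_quote)
    return {
        "lapse_rate": len(lost) / n_groups,
        "retained_loss_ratio": average(retained),
        "new_business_loss_ratio": average(lost),
    }


results = simulate_market(n_groups=5000, n_competitors=3)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Varying `n_competitors` or `new_business_sd` in this sketch gives a feel for why the number of competing quotes and the accuracy of new business methods matter in the results that follow.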

37
  • Competitive Simulation: Results
  • Small improvements in new business methods
    significantly increase profitability for new
    business and hurt profitability for renewal
  • Very sensitive to point at which group seeks new
    business quotes (try to keep your groups from
    getting quotes!)
  • Number of competing quotes is important.
  • Accuracy and results are sensitive to credibility
    of risk adjuster and/or historic experience
    components

38
  • Research Conclusions
  • Marginal value of improvements decreases as
    allowable rate variation decreases, and as group
    size increases
  • New business is less profitable than renewal
    business. Don't chase the wrong groups away.
  • Competitive results are very sensitive to
    accuracy of new business methods
  • Credibility is affected by accuracy / explanatory
    power of manual rate and level of health status
    correlation

39
  • Recommendations
  • Understand effects of rating environment
  • Fundamentals (Blocking & Tackling)
  • Objectively analyze what prediction method is
    right for you. It may be that multiple methods
    are most appropriate (state, group size, costs,
    etc).
  • Use all relevant data / information on a group.
  • Understand what your competitors are doing with
    new business
  • Assign credibility explicitly and carefully.
  • Use a rigorous, systematic method to develop
    renewal quotes, with appropriate, efficient
    manual intervention.
  • Capture all information on each renewal quote and
    what happens with group. Analyze data and modify
    your approach.

40
  • Lifestyle-Based Prediction

41
  • The US Surgeon General
  • 70% of the diseases and subsequent deaths
    in the U.S. are lifestyle-based
  • The Centers for Disease Control
  • Lifestyle-based chronic diseases account
    for 75% of the United States' $1.4 trillion
    medical care costs

42
  • Definition of Lifestyle Diseases
  • Lifestyle diseases (also called diseases of
    longevity or diseases of civilization) are
    diseases that appear to increase in frequency as
    countries become more industrialized and people
    live longer. (WHO)
  • Lifestyle disease is a disease associated with
    the way a person or group of people lives.
  • Lifestyle diseases include atherosclerosis, heart
    disease, and stroke; obesity and type 2 diabetes;
    and diseases associated with smoking, alcohol, and
    drug abuse. Regular physical activity helps
    prevent obesity, heart disease, hypertension,
    diabetes, colon cancer, and premature mortality.
    (Stedman's Medical Dictionary)

43
  • Lifestyle-Based Diseases
  • Lifestyle-Based Diseases/Conditions
  • Diabetes
  • Hypertension
  • Cardiovascular
  • Stroke
  • COPD
  • Most cancers
  • Some mental health: depression, Alzheimer's, etc.
  • Others: osteoporosis, arthritis, back pain, etc.
  • Maternity

44
  • Lifestyle-Based Diseases
  • Correlation between Lifestyle and Cancer

Source: American Cancer Society
45
  • 2004 INTERHEART Study
  • Over 90% of the risk of a heart attack
    (myocardial infarction) is attributed to
    lifestyle factors
  • Factors include abnormal lipids, smoking,
    hypertension, abdominal obesity, consumption of
    fruits and vegetables, alcohol and regular
    physical activity
  • Family history, thought by many to be the major
    risk, accounts for only 1% of the population
    attributable risk

46
  • Lifestyle Based Prediction (LBP)
  • Most healthcare costs are driven by lifestyle
    choices
  • Claims data does not reflect lifestyle
  • How else can we gather this information?

47
  • Lifestyle-Based Prediction (LBP)
  • Lifestyle-Based Prediction is based on strong
    correlations that exist between lifestyle-based
    behaviors and diseases, in particular
    lifestyle-based diseases
  • LBP switches the method of detection focus from
    poorly correlated medical events to highly
    correlated lifestyle behaviors

48
  • Challenges in Predictive Modeling
  • Predictive models are only as good as the data
    that drive them
  • Challenge 1: New business
  • Challenge 2: High employee turnover
  • Challenge 3: Data consolidation
  • Challenge 4: Increase in lifestyle diseases

49
  • Development of Lifestyle-Based Prediction Models
  • Over 700 fields of lifestyle-based data are
    appended to two data sets
  • Individuals with a disease state
  • Base group: average representation of the group
    at large
  • Clinical datasets development
  • Various models are tested including linear
    regression, logistic regression, CHAID
    analysis, discriminant analysis, Bayesian
    methods, and cluster analysis

50
  • Ties Between Lifestyles and Diseases
  • Two types of statistical principles used in LBP
  • Correlation: Lifestyle-based behaviors which
    will result in a higher propensity for an
    individual to have the disease
  • Obesity and latent lifestyle promote diabetes
  • Causality: There are lifestyle-based behaviors
    that exist or change as a result of the disease
  • Once diagnosed with diabetes, you become a diet
    food purchaser

51
  • Lifestyle-Based Prediction Example

52
  • Maternity Example
  • Traditional maternity factors are based on
    age/sex/geographic/family enrollment
  • In fact, a simple Bayesian model using number and
    ages of children can lift results by over 40%
  • Lifestyle-Based Prediction can dramatically
    improve accuracy by including number and ages of
    children, financial indicators, household living
    parameters, etc.

53
  • Early Disease Detection Study (EDDS)
  • Screening Data
  • Over 100,000 patient screening records per
    condition
  • Abdominal Aortic Aneurysm (AA Screening)
  • Carotid Artery Disease (CA Ultrasound)
  • Congestive Heart Failure (Cardiac Echo)
  • Diabetes (Fasting Plasma Glucose)
  • Osteoporosis (Bone Densitometer)
  • Peripheral Arterial Disease (Ankle Brachial
    Index)

54
  • Early Disease Detection Study (EDDS)
  • Health Information
  • Health History
  • 45 Personal health history elements
  • Medical histories: stroke, heart attack, CAD,
    etc.
  • Medical procedures: improve blood flow to heart
    or legs, prior screenings, medications, etc.
  • Medical symptoms: chest pain, loss of speech,
    blurred vision, etc.
  • 10 Family history elements
  • Medical conditions
  • Medical procedures

55
  • Early Disease Detection Study (EDDS)
  • Lifestyle Information
  • Lifestyle Elements
  • 8 Exercise elements
  • How often do you exercise
  • What types of exercise
  • 5 Tobacco elements
  • 8 Nutritional elements
  • Caffeine intake
  • Calcium intake
  • Fast food intake
  • Food group intake

56
  • Early Disease Detection Study (EDDS) Results
  • Predictive coefficients for the 21
    lifestyle-based elements were relatively equal to
    the 55 health elements in all six cases
  • Minimum: Coronary Artery Disease
  • Lifestyle-based elements relatively equal to the
    health history elements on a stand-alone basis
  • Maximum: Osteoporosis
  • Lifestyle elements have twice the potential to
    affect the score compared to health history
    elements
  • Combination of lifestyle with health elements
    increased health risk identification by over 45%
    (as defined by R-squared)

57
  • Currently in Place
  • Applications and enrollment forms
  • Individuals and groups
  • Family information
  • Age, sex and age differences in family members
  • Employment
  • Job description
  • Height/weight
  • Commute time
  • Geography

58
  • HRAs and Other Surveys
  • Excellent source for lifestyle-based data
  • Several key problems
  • Expensive to administer (> $10/member)
  • Additional cost tied to participation incentives
  • Poor participation rates
  • Questionable results on the unhealthiest
    population
  • Timing issues for new business/members

59
  • Publicly Available Consumer Data
  • Who, What, Where & Why

60
  • Consumer Data in the United States
  • The plethora of consumer data has dramatically
    changed our way of interacting with consumers
  • Consumer data measured in Disk Storage per Person
    (DSPS)
  • 1985: 0.02 Mbytes/yr
  • 1995: 26 Mbytes/yr
  • 2005: 3,500 Mbytes/yr

61
  • Consumer Data: Why?
  • Primarily used for marketing, customer service
    and fraud purposes
  • United States Gramm-Leach-Bliley Act of 1999
  • Requires opt-out
  • Permitted by law
  • Joint marketing agreements

62
  • Consumer Data: Where?
  • Government Public Records
  • Census
  • Financial Services
  • Surveys
  • Warranties
  • Loyalty Programs
  • Internet Purchases
  • Subscriptions

63
  • Consumer Data: Who?
  • 95% of U.S. Households
  • Historically household-based
  • Newest trend: individual-based
  • Observed
  • Implied

64
  • Consumer Data: What?
  • Traditional Demographics
  • Age, sex, race, etc.
  • Financial
  • Homeowner, credit score, mortgage/auto/credit
    card balances, etc.
  • Household
  • Marital status, number and ages of children,
    etc.

65
  • Consumer Data: What?
  • Lifestyle-Based Elements
  • Physical inactivity
  • Television time, computer time, board games,
    stamp and coin collecting, etc.
  • Physical activity
  • Running, walking, cycling, aerobics, golf,
    tennis, etc.

66
  • Consumer Data: What?
  • Lifestyle-Based Elements
  • Food purchases
  • Fast food, diet food, gourmet, vegetarian, etc.
  • Wine and other alcohol
  • Self improvement
  • Health fitness, dieting/weight loss, etc.
  • Mental wellness, personal improvement, etc.

67
  • Consumer Data: What?
  • Lifestyle-Based Elements
  • Tobacco
  • Occupation
  • Travel
  • Motor vehicle type
  • Recreational vehicles
  • Other

68
  • The Expense of Consumer Data
  • Medical Data Costs
  • MIB, Rx, historical medical, etc. start at about
    $10.00 per individual and go up
  • Consumer Data Costs
  • Rapidly decreasing in price due to fierce
    competition
  • 5 years ago: 100 data elements cost $2.00/head
  • Today: over 500 data elements cost $0.25/head
  • The data needed for medical modeling costs about
    $0.10/head or less

69
  • Practical Applications
  • Individual
  • Small group (2-50 employees)
  • Medium group (51-500 employees)
  • Large group (> 500 employees)

70
  • Practical Applications: Tele-underwriting
  • Determiner of At Risk population
  • Who to call
  • Identifier of Risk Conditions
  • What questions to ask

71
  • Practical Application: Preferred Risk
  • Determination of Jet Issue Application
  • Clean application plus healthy score
  • Determination of Preferred Status
  • Current techniques rely on clean application plus
    "what?"
  • Lifestyle indicators provide the best "what"

72
  • Massive Consumer Database
  • Over 55 million records in the US
  • Every US adult over the age of 50
  • Over 500 fields of lifestyle-based data
  • Updated monthly
  • Scored for marketing and health risk status
    monthly
  • Looking at real-time hosted applications

73
  • Cancer Policy Example
  • Model Objective
  • Develop Models to Identify the Most Risky Cancer
    Policies in Terms of Claims and Track the Quality
    of Portfolio
  • Rank Customers by Their Likelihood to Have Claims
    in the Next 2.5 Years
  • Used in Conjunction With the Underwriting Rules
    to Validate and Improve Underwriting Process

74
  • Risk Model: Logistic Regression
  • The risk model was based on the comparison of key
    customer demographics and lifestyle
    characteristics of policyholders or applicants
    who had claims in the performance window against
    the people who do not have claims.
  • The rank and plot distribution of the claims vs.
    non-claims are compared for each demographic
    attribute.
  • The attributes which showed significantly
    different distributions or trends were selected
    for the Logistic regression analysis.
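
A minimal logistic regression of the kind described here can be sketched without any libraries. The slides do not say how the model was fitted, so plain batch gradient descent is an assumption, and the two features (a centered issue age and a children flag) and the eight records are hypothetical stand-ins for the selected attributes.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def claim_probability(bias, weights, features):
    """Modelled probability of a claim in the performance window."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def fit_logistic(rows, labels, learning_rate=0.1, epochs=5000):
    """Fit bias and weights by batch gradient descent on the log-loss."""
    n_rows, n_features = len(rows), len(rows[0])
    bias, weights = 0.0, [0.0] * n_features
    for _epoch in range(epochs):
        grad_bias, grad_weights = 0.0, [0.0] * n_features
        for features, label in zip(rows, labels):
            error = claim_probability(bias, weights, features) - label
            grad_bias += error
            for j, value in enumerate(features):
                grad_weights[j] += error * value
        bias -= learning_rate * grad_bias / n_rows
        weights = [w - learning_rate * g / n_rows
                   for w, g in zip(weights, grad_weights)]
    return bias, weights


# Hypothetical applicants: [issue age in decades relative to 50, has children].
rows = [[-2.0, 1], [-1.5, 0], [-1.0, 1], [-0.5, 0],
        [0.5, 1], [1.0, 0], [1.5, 1], [2.0, 0]]
labels = [0, 0, 0, 1, 0, 0, 1, 1]  # 1 = had a claim in the window

bias, weights = fit_logistic(rows, labels)
older = claim_probability(bias, weights, [2.0, 0])
younger = claim_probability(bias, weights, [-2.0, 0])
print(f"older applicant: {older:.2f}, younger applicant: {younger:.2f}")
```

The fitted probabilities are what allow policies to be ranked by likelihood of claim, which is the use the Cancer Policy Example describes.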

75
  • The Key Drivers of the Application Risk Model
  • ISSUEAGE: Customer Age at the Time of
    Application
  • CHILD: Presence of Children (Yes/No)
  • MARRITAL: Marital Status
  • VEHREG: Dominant Vehicle Life Style
  • KID610: Have Kids Between 6 and 10 Years Old
  • VEHSUV: Dominant Vehicle Life Style
  • ADUL35: Adult Age Under 35 in Household
  • ADUL65P: Adult Age Over 65 in Household
  • RATIO1: Weight/height for the First Individual
  • RATIO2: Weight/height for the Second Individual

76
  • 5% of the Customers Ranked by Scores
    include 13% of Claims

Conclusion: The Lorenz curve shows that the
Application Risk Model rank-orders claim risk
well.
77
  • Model Summary
  • By working at the top 20% of the policies, we
    have the potential to cut 43% of claims, which
    represents 45% of dollar losses. The hit rate
    (number of good policies sacrificed per bad
    policy stopped) is 14 (in the 2.5 year analysis
    window), and the model's lift renders 117% gains
    in targeting.
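
The capture and lift figures quoted here follow from ranking policies by score and counting the claims that fall in the top slice. The sketch below shows that calculation on ten made-up policies, then applies the lift formula to the slide's own figures, read as 43% of claims in the top 20% of policies. Function names and sample data are hypothetical.

```python
def capture_rate(scores, had_claim, top_fraction):
    """Share of all claims found in the top-scoring fraction of policies."""
    ranked = sorted(zip(scores, had_claim), key=lambda pair: pair[0],
                    reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    claims_in_top = sum(claim for _score, claim in ranked[:n_top])
    return claims_in_top / sum(had_claim)


def lift(capture, top_fraction):
    """How many times more claims the top slice holds than a random slice."""
    return capture / top_fraction


# Ten hypothetical policies, already listed from highest to lowest score.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
had_claim = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

sample_capture = capture_rate(scores, had_claim, top_fraction=0.2)
print(sample_capture, lift(sample_capture, 0.2))

# Slide figures: 43% of claims captured in the top 20% of policies.
slide_lift = lift(0.43, 0.20)
print(f"slide lift: {slide_lift:.2f}")
```

A lift of about 2.15 means the top 20% holds a little over twice its proportional share of claims, which corresponds to a gain of roughly 115% over random selection. That is close to, though not identical to, the 117% quoted on the slide, whose exact definition the transcript does not give.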

78
  • Profit Impact Scenario

79
  • Statistical Results
  • Compared Traditional Underwriting and LBA Scores
    to Actual Claims Results
  • LBA Beat Traditional Underwriting in All
    Statistical Measures
  • Adjusted R-squared
  • Bias
  • MSE
  • MAD
  • AAD

80
Operational Overview - Individual
81
  • Conclusion
  • Recognize that much of medical costs cannot be
    predicted by traditional methods
  • Look for nontraditional data sources
  • The real value of consumer data in the healthcare
    industry lies in its ability to predict
    lifestyle-based diseases.
  • Whether used as an identifier for health risks or
    as an early predictor of a disease state, we see
    the use of Lifestyle-Based Analytics accelerating
    rapidly within the healthcare and in particular
    disease management industries.

82
Questions?
Jonathan Shreve, FSA, MAAA, Milliman
Jon.Shreve_at_Milliman.com
001 303-299-9400