Interesting Association Rules of Household Indicators of Poverty - PowerPoint PPT Presentation

Title: Interesting Association Rules of Household Indicators of Poverty


1
Interesting Association Rules of Household
Indicators of Poverty
  • By
  • Nkumbuludzi Ndwapi

2
Contents
  • Introduction
  • Literature Review
  • Methodology
  • Analysis
  • Conclusions and Recommendations

3
Introduction
  • One of the major problems troubling the continent
    of Africa is poverty.
  • Poverty is the state in which one lacks the means
    to satisfy their basic needs.
  • Defining poverty often leads to the question of
    how to determine when a household is to be
    considered poor. 

4
Introduction contd
  • There are various ways in which poverty can be
    measured:
      • income poverty
      • human poverty
      • capabilities deprivation.
  • Coleman and Cressay (1990) explained that
    measures of income poverty are:
      • Absolute
      • Relative

5
Introduction contd
  • The above definitions reflect the multidimensional
    complexity of poverty.
  • Mining Association Rules can be used to unravel
    how certain aspects of human poverty are
    associated or related.

6
Statement of Problem
  • Poverty alleviation has been an issue of major
    concern to the Government of Botswana.
  • According to Buthali (1997) the government took
    the decision to focus on the productive mining
    sector.
  • The redistribution of the revenue from this sector
    depends on understanding what makes a household
    poor.

7
Statement of Problem contd
  • Characterising a household as poor should not be
    based only on economic characteristics.
  • According to Buthali (1997) the analysis of
    poverty in Botswana has been solely based on
    poverty baskets and poverty datum lines over the
    years.
  • The Human Development programme (1997) reported
    that figures based on poverty lines are usually
    converted to dollars and this distorts the real
    levels of inflation.
  • Purchasing Power Parity exchange rates that are
    used to turn the $1/day poverty line into
    national currencies are inappropriate.

8
Objectives
  • To develop association measures that could be
    used to analyse the multi-factored nature of
    poverty
  • To determine the most common (frequent) types of
    housing and living conditions based on the HIES
    2002/3 data set.
  • To investigate interesting association measures
    that exist between different housing and living
    conditions which cannot be determined using
    traditional statistical techniques
  • To classify households using interesting rules as
    to whether they are poor or not.

9
Literature Review
  • Income is limited as an indicator of poverty
    because it does not capture public goods,
    non-market goods, rationing, and the problem of
    distorted or imperfect markets (Alkire and
    Leander, 2005)
  • So income as the sole indicator of well-being is
    inappropriate and should be supplemented by other
    attributes or variables, for example housing,
    literacy, life expectancy, provision of public
    goods and so on.
  • Alkire and Leander (2006) explain that
    multidimensionality was also advocated by the
    basic needs approach, just as Sen's (1997)
    capability approach argued that well-being is
    multidimensional.

10
Literature Review contd
  • Individuals and households that are able to meet
    their basic food needs but are unable to provide
    adequately for basic non-food needs would still
    be classified as poor based on this criterion
    (Obuseng and Powder 2003).
  • Research by Ngwame et al. (2002) revealed that,
    as with deprivation in terms of dwelling
    structure, lack of access to safe water in South
    Africa is common among the poor.
  • Ngwame et al. (2002) explain that there is
    often a relationship between poor housing and
    poor sanitation facilities.

11
Methodology
  • The data
      • Obtained from the HIES 2003
      • Original variables and their categories are as
        in Table 1.
  • General Categorization
  • Categorization of Indicators of poverty
      • Indicators of poverty are as in Table 2.
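
Before rules can be mined, each household record has to be expressed as a "transaction", i.e. a set of items. One common way to do this is to turn every variable/category pair into an item. The sketch below illustrates the idea only: the variable names and categories are hypothetical and are not the HIES variables of Table 1 or Table 2.

```python
# Hypothetical household records. The variable names and categories are
# illustrative assumptions, not the actual HIES variables.
households = [
    {"water": "piped_indoors", "toilet": "flush", "walls": "brick"},
    {"water": "communal_tap", "toilet": "pit_latrine", "walls": "mud"},
    {"water": "communal_tap", "toilet": "none", "walls": "mud"},
]


def to_transaction(record):
    """Turn one household record into a set of 'variable=category' items."""
    return frozenset(f"{var}={cat}" for var, cat in record.items())


# One transaction (item set) per household.
transactions = [to_transaction(h) for h in households]

for t in transactions:
    print(sorted(t))
```

Each household then plays the role of a shopping basket, and each housing or living condition plays the role of an item in it.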

12
Mining Association Rules
  • Mining Association Rules is a branch of data
    mining that seeks to understand the shopping
    behavior of supermarket customers:
      • Which items tend to be bought together?
      • Which items are bought as substitutes?
  • Agrawal, Imielinski and Swami (1993) are widely
    credited with introducing the method of
    Association Rules at the 1993 International
    Conference on Management of Data, held in
    Washington DC, USA.
  • In the intervening 15 years, mining association
    rules has become one of the best studied
    problems of data mining.
  • As with other data mining techniques, association
    rules (AR) have mainly been used in market
    research studies.
  • MAR aims to answer questions such as: how often
    does a shopping basket that contains meat also
    contain wine?
  • MAR is a form of unsupervised learning.

13
Mining Association Rules contd
  • Searching the database for interesting patterns
    is an unsupervised learning process.
  • Suppose that X and Y represent 2 different sets
    of potential items in a shopping basket, e.g.
    X = {Meat, Maize meal} and Y = {Spinach}.
  • Then an association rule between X and Y is a
    rule, r: X → Y.
  • The rule r is interpreted in this case as
    meaning that customers who buy Meat and Maize
    meal are likely to buy spinach with a certain
    probability.
  • With unsupervised learning, one must then define
    conditions that make such a rule of interest.

14
Interesting Measures
  • The rule r: X → Y makes no prediction of Y in the
    entire database.
  • In the original paper, Agrawal et al. (1993)
    introduced the support-confidence framework, and
    the apriori algorithm for mining the rules.
  • Hahsler (2005) gives a comprehensive summary of
    commonly used and recent additions to measures of
    interestingness of mined association rules.
  • Hahsler et al. (2007) provide additional
    probability-based measures of interestingness as
    well as an R package, arules, for mining
    association rules, which implements two of the
    most popular algorithms: apriori and eclat.
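
To make the apriori idea concrete, here is a minimal sketch of a level-wise frequent itemset search in plain Python. It is not the arules implementation; the function name `frequent_itemsets`, the toy baskets and the 0.6 threshold are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (apriori-style) search for itemsets with support >= min_support."""
    n = len(transactions)

    def support(itemset):
        # Proportion of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]  # candidate 1-itemsets
    result = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        frequent = {}
        for c in current:
            s = support(c)
            if s >= min_support:
                frequent[c] = s
        result.update(frequent)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (the apriori property).
        k += 1
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        current = [
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        ]
    return result


# Toy shopping baskets (illustrative data only).
baskets = [frozenset(b) for b in [
    {"meat", "maize meal", "spinach"},
    {"meat", "maize meal"},
    {"meat", "spinach"},
    {"maize meal", "spinach"},
    {"meat", "maize meal", "spinach"},
]]

result = frequent_itemsets(baskets, min_support=0.6)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With these baskets every single item and every pair reaches the threshold, while the three-item set appears in only 2 of 5 baskets and is dropped.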

15
Interesting Measures contd
  • The support of a rule, r: X → Y, is the likelihood
    of finding a transaction containing all items in
    X and Y in the database. Estimated by the
    proportion of transactions in which X and Y are
    both present.
  • The confidence of a rule r: X → Y is the
    likelihood of finding Y among all transactions
    that contain X. Estimated by the proportion of
    transactions containing X and Y among
    transactions containing X.
  • This is a conditional probability: given that a
    transaction contains X, what is the likelihood
    that it also contains Y?
  • Lift (initially called interest) of r: X → Y
    compares the likelihood of finding both X and Y
    in a transaction to the probability of their
    joint occurrence if they occurred independently,
    i.e. if customers bought the two itemsets
    independently.
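
The three measures above can be sketched in a few lines of Python. The baskets are toy data made up for illustration, and the function names are assumptions rather than the arules interface.

```python
def support(itemset, transactions):
    """Proportion of transactions containing every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Estimate of P(Y | X): support of X and Y together over support of X."""
    return support(lhs | rhs, transactions) / support(lhs, transactions)


def lift(lhs, rhs, transactions):
    """Observed joint support relative to what independence would give."""
    return confidence(lhs, rhs, transactions) / support(rhs, transactions)


# Toy shopping baskets (illustrative data only).
baskets = [frozenset(b) for b in [
    {"meat", "maize meal", "spinach"},
    {"meat", "maize meal"},
    {"meat", "spinach"},
    {"maize meal", "spinach"},
    {"meat", "maize meal", "spinach"},
]]

X = frozenset({"meat", "maize meal"})
Y = frozenset({"spinach"})

print("support   ", round(support(X | Y, baskets), 3))
print("confidence", round(confidence(X, Y, baskets), 3))
print("lift      ", round(lift(X, Y, baskets), 3))
```

Here X and Y occur together in 2 of 5 baskets (support 0.4), X occurs in 3 of 5, so confidence is 2/3, and since Y occurs in 4 of 5 the lift is 5/6. A lift below 1 means the two itemsets appear together slightly less often than independence would predict.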

16
Interesting Measures contd
  • "chiSquare" (see Liu et al. 1999). The chi-square
    statistic to test for independence between the
    LHS (X) and RHS (Y) of the rule. The critical
    value of the chi-square distribution with 1
    degree of freedom (2x2 contingency table) at
    alpha = 0.05 is 3.84; higher chi-square values
    indicate that the LHS (X) and the RHS (Y) are not
    independent.
  • "oddsRatio" (see Tan et al. 2004). The odds of
    finding X in transactions which contain Y divided
    by the odds of finding X in transactions which do
    not contain Y. Range: 0 ... 1 ... Inf (1 indicates
    that Y is not associated with X).
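
Both measures can be computed from the four cell counts of the 2x2 contingency table of X against Y. The sketch below uses the standard shortcut formula for the chi-square statistic of a 2x2 table; the counts are hypothetical and the function names are assumptions, not the arules interface.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 table (1 degree of freedom).

    n11: transactions with X and Y     n10: with X but not Y
    n01: with Y but not X              n00: with neither
    """
    n = n11 + n10 + n01 + n00
    numerator = n * (n11 * n00 - n10 * n01) ** 2
    denominator = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return numerator / denominator


def odds_ratio(n11, n10, n01, n00):
    """Odds of X among transactions with Y over odds of X among those without Y.

    Undefined (division by zero) when n10 or n01 is 0.
    """
    return (n11 * n00) / (n10 * n01)


# Hypothetical counts for 100 transactions.
counts = (30, 10, 20, 40)

chi2 = chi_square(*counts)
print("chi-square:", round(chi2, 2), "-> not independent" if chi2 > 3.84 else "-> no evidence")
print("odds ratio:", odds_ratio(*counts))
```

For these counts the statistic is about 16.67, well above the 3.84 critical value, and the odds ratio is 6, so X and Y would be judged positively associated.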

17
Interesting Measures contd
  • Consider the following transactional matrix
  • The various measures of interestingness can be
    computed using the following equations.
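
The equations for this slide are not reproduced in the transcript. As a hedged reconstruction, the standard forms consistent with the verbal definitions on the preceding slides are given below. The count notation (N, n_X, n_Y, n_XY and the 2x2 cells n_11, n_10, n_01, n_00) is an assumption introduced here.

```latex
% N      = total number of transactions
% n_X    = transactions containing X;  n_Y = transactions containing Y
% n_{XY} = transactions containing all items of both X and Y
% 2x2 table: n_{11} = n_{XY}, n_{10} = X without Y,
%            n_{01} = Y without X, n_{00} = neither
\begin{align*}
\mathrm{supp}(X \to Y) &= \hat{P}(X \cup Y) = \frac{n_{XY}}{N} \\
\mathrm{conf}(X \to Y) &= \hat{P}(Y \mid X) = \frac{n_{XY}}{n_{X}} \\
\mathrm{lift}(X \to Y) &= \frac{\mathrm{supp}(X \to Y)}{\mathrm{supp}(X)\,\mathrm{supp}(Y)}
                        = \frac{N\, n_{XY}}{n_{X}\, n_{Y}} \\
\chi^2 &= \frac{N\,(n_{11} n_{00} - n_{10} n_{01})^2}
               {(n_{11}+n_{10})(n_{01}+n_{00})(n_{11}+n_{01})(n_{10}+n_{00})} \\
\mathrm{oddsRatio}(X \to Y) &= \frac{n_{11}\, n_{00}}{n_{10}\, n_{01}}
\end{align*}
```

Here P(X ∪ Y) follows the association-rule convention of meaning that a transaction contains every item in X and every item in Y.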

18
Analysis
  • Descriptive statistics
  • Mining Association Rules
  • General Rules
  • Rules of Poverty
  • Cluster Analysis

19
Conclusions
  • Considering the rules presented above, mining
    association rules is a technique that could be
    used to analyse household poverty and how certain
    household characteristics relate.
  • For mining association rules to bring out the
    best rules, surveys that are deliberately intended
    to capture the poor must be explored.
  • Rules resulting from such surveys will be the
    best at defining and explaining household poverty.

20
Further research and Limitations
  • For further research, an ordinal regression model
    is proposed based on the clusters presented in
    this paper, and the possibility of clustering the
    rules themselves is worth exploring.
  • Considering the results of the paper, the
    objectives of the research have been satisfied.
  • A limitation of this paper is that it does
    not take into account the different localities of
    households; indicators of poverty in urban and
    rural areas are expected to be different.

21
  • Thank You!