DATA AND DATA COLLECTION - PowerPoint PPT Presentation

Provided by: JustDj

1
DATA AND DATA COLLECTION
  • Lecture 3

2
What is STATISTICS?
  • Statistics is a discipline which is concerned
    with
  • designing experiments and other data collection,
  • summarising information to aid understanding,
  • drawing conclusions from data, and
  • estimating the present or predicting the future.

3
What is STATISTICS?
  • A branch of applied mathematics concerned with
  • the collection
  • interpretation of quantitative data
  • the use of probability theory to estimate
    population parameters

4
What is STATISTICS?
  • I like to think of statistics as the science of
    learning from data
  • Jon Kettenring 1997, President, American
    Statistical Association

5
Jun-07    GSE All Share Index
1         5,225.80
4         5,226.04
5         5,226.40
6         5,226.72
7         5,226.92
8         5,237.34
9         5,237.74
10        5,238.90
13        5,238.99
14        5,250.31
15        5,258.52
18        5,263.37
19        5,263.58

8
What is Data?
  • It is a collection of facts from which meaningful
    conclusions can be drawn.
  • Examples
  • names,
  • numbers,
  • text,
  • graphics,
  • decimals.
  • The singular form is datum and the plural form is
    data.

9
Types of Data
  • Qualitative
  • Quantitative

10
Qualitative
  • Qualitative data is not provided numerically.
  • It is non-numeric data.
  • E.g. colour, race, geographical region, industry,
    sex, type of car, place of birth, etc.
  • Qualitative data may also be referred to as
    categorical.

11
Quantitative
  • Quantitative data is given numerically; it is
    numeric data.
  • This can be further categorised into
  • Discrete
  • continuous

12
Quantitative
  • Discrete data are numeric data that have a finite
    number of possible values and represent counts.
  • E.g. a finite subset of the counting numbers,
    such as 1, 2, 3, 4 and 5, or
  • how many students were present on a given day.
  • Discrete data are represented by integers. E.g.
    the number of firms listed on the Ghana Stock
    Exchange.

13
Quantitative
  • Continuous quantitative data have infinite
    possibilities.
  • They can be represented by real numbers.
  • These are continuous with no gaps or
    interruptions.
  • Physically measurable quantities of length,
    volume, time, mass, etc. are generally considered
    continuous.
  • At the physical level, especially for mass, this
    may not be true.
  • E.g. company profit, height, mass and length.

14
Quantitative
  • The structure and nature of data will greatly
    affect the choice of analysis method.
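
A minimal sketch of how the type of data steers the choice of summary. The observations are invented for illustration, and using the Python type of each value (str, int, float) as a stand-in for qualitative, discrete and continuous data is an assumption of this sketch, not part of the lecture.

```python
from statistics import mean, median, mode

# Hypothetical observations, invented for illustration only.
car_types = ["saloon", "pickup", "saloon", "bus", "saloon"]  # qualitative
students_present = [31, 28, 30, 31, 29]                      # quantitative, discrete
heights_m = [1.62, 1.75, 1.68, 1.80, 1.71]                   # quantitative, continuous


def summarise(values):
    """Return a summary suited to the kind of data supplied."""
    if all(isinstance(v, str) for v in values):
        # Categories cannot be averaged; report the most frequent one.
        return {"type": "qualitative", "mode": mode(values)}
    if all(isinstance(v, int) for v in values):
        # Counts: the mean and the median are both meaningful.
        return {"type": "discrete", "mean": mean(values), "median": median(values)}
    # Anything else is treated as a continuous measurement.
    return {"type": "continuous", "mean": mean(values), "median": median(values)}


print(summarise(car_types))
print(summarise(students_present))
print(summarise(heights_m))
```

The mode is the only one of these summaries that makes sense for categories, which is why the qualitative branch does not attempt a mean.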

15
Data-Cross Sectional
  • Data sets may also be described as
  • cross-sectional
  • time series
  • Cross-sectional data refers to data collected by
    observing many subjects (such as individuals,
    firms or countries/regions) at the same point of
    time, or without regard to differences in time.
  • Put another way, a cross-sectional data set
    contains observations on multiple phenomena
    observed at a single point in time.

16
Data-Cross Sectional
  • In cross-sectional data, the values of the data
    points have meaning, but the ordering of the data
    points does not.
  • Analysis of cross-sectional data usually consists
    of comparing the differences among the subjects.
    E.g.
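
A minimal sketch of a cross-sectional comparison. The firm names and profit figures are hypothetical, invented only to show that the analysis compares subjects with one another and that reordering the observations leaves the result unchanged.

```python
import random
from statistics import mean

# Hypothetical profits (in millions) for four firms in the same year;
# firm names and figures are invented for illustration.
profits = {"Firm A": 12.5, "Firm B": 8.0, "Firm C": 15.5, "Firm D": 10.0}

average = mean(profits.values())

# Compare subjects: each firm's difference from the cross-sectional average.
differences = {firm: profit - average for firm, profit in profits.items()}

# Reorder the observations; the order carries no meaning, so the
# average is the same as before.
shuffled = list(profits.items())
random.Random(0).shuffle(shuffled)
reordered_average = mean(profit for _, profit in shuffled)

print(differences)
print(average == reordered_average)
```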

17
Data-Time Series
  • Time series data is a sequence of numerical data
    points in successive order, usually occurring in
    uniform intervals.
  • A sequence of numbers collected at regular
    intervals over a period of time.
  • Stated in yet another way, time series data is a
    data set containing observations on a single
    phenomenon observed over multiple time periods.

18
Data-Time Series
  • In time series data, both the values and the
    ordering of the data points have meaning.
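
A minimal sketch of why ordering matters in a time series, using the GSE All Share Index readings listed on slide 5. The helper name `changes` is introduced here for illustration; rounding to two decimals is a presentation choice of this sketch.

```python
# GSE All Share Index readings from slide 5, kept in their original order.
index = [5225.80, 5226.04, 5226.40, 5226.72, 5226.92, 5237.34, 5237.74,
         5238.90, 5238.99, 5250.31, 5258.52, 5263.37, 5263.58]


def changes(series):
    """Change from each observation to the next, rounded to 2 decimals."""
    return [round(later - earlier, 2)
            for earlier, later in zip(series, series[1:])]


in_order = changes(index)
reversed_order = changes(index[::-1])

# In the original order every change is positive (a steady rise);
# in reversed order every change is negative (a steady fall).
print(in_order)
print(reversed_order)
```

The values are identical in both runs; only the order differs, and that alone turns a rise into a fall.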

19
Data-Panel
  • A data set containing observations on multiple
    phenomena observed over multiple time periods is
    called panel data.
  • The second dimension of the data may be something
    other than time.
  • For example, when there is a sample of groups,
    such as company subsidiaries, and several
    observations from every group, the data is panel
    data.

20
Data-Panel
  • Whereas time series and cross-sectional data are
    both one-dimensional, panel data sets are
    two-dimensional.
  • Some data sets could possess more than two
    dimensions.
  • In such cases the data are called
    multi-dimensional panel data.
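
A minimal sketch of panel data as a two-dimensional structure. The subsidiaries, years and profit figures are hypothetical, and storing the panel as a dictionary keyed by (subject, year) is just one possible layout. Fixing one dimension recovers a cross-section; fixing the other recovers a time series.

```python
# Hypothetical panel: profit (in millions) for two subsidiaries over
# three years. Names and figures are invented for illustration.
panel = {
    ("Subsidiary A", 2005): 4.0,
    ("Subsidiary A", 2006): 4.5,
    ("Subsidiary A", 2007): 5.0,
    ("Subsidiary B", 2005): 2.0,
    ("Subsidiary B", 2006): 2.5,
    ("Subsidiary B", 2007): 3.5,
}


def cross_section(data, year):
    """Fix the time period: one value per subject."""
    return {unit: value for (unit, y), value in data.items() if y == year}


def time_series(data, unit):
    """Fix the subject: values ordered by time period."""
    return [value for (u, _), value in sorted(data.items()) if u == unit]


print(cross_section(panel, 2006))
print(time_series(panel, "Subsidiary A"))
```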

21
Source of Data
  • Primary
  • Secondary.

22
Source-Primary
  • Primary data is gathered specifically for a
    research project.
  • Data collected from the original source.
  • Examples include data from
  • focus groups,
  • telephone surveys,
  • interviews,
  • questionnaires.

23
Source-Primary
  • Advantages of primary data
  • Collection based on researcher's need
  • Control over measurement selection and execution
  • Timeliness of the data can be controlled
  • Representativeness of the data can be ensured

24
Source-Primary
  • Advantages of primary data
  • Type of information desired can be directly
    determined by the design of the questions
  • Collected to fit the specific purpose
  • Data are current
  • Secrecy can be maintained

25
Source-Primary
  • Disadvantages of primary data
  • Expensive
  • Time-consuming
  • Quality declines if interviews are lengthy
  • Reluctance to participate in lengthy interviews

26
Source-Secondary
  • Secondary data is information that has already
    been collected and is available to the public.
  • Examples
  • population statistics from the Ghana Statistical
    Service (GSS) Census Office,
  • economic indicators from the GSS,
  • trading data from the Ghana Stock Exchange (GSE),
  • information in government documents,
  • Industry and trade journals.

27
Source-Secondary
  • Data contained in the published accounts of
    organisations.
  • Many businesses and organisations also collect
    information about their customers or clients
    (such as where they live), and this is also
    considered secondary data.

28
Types of Secondary data

29
Source-Secondary
  • Advantages of secondary data
  • Little cost or time required to access data
    (inexpensive)
  • Not confined to immediate level or unit of
    analysis
  • Available more quickly
  • Several sources are available.

30
Source-Secondary
  • Advantages of secondary data
  • Saves time and money if on target
  • Aids in determining direction for primary data
    collection
  • Pinpoints the kinds of people to approach
  • Serves as a basis of comparison for other data

31
Source-Secondary
  • Disadvantages of secondary data
  • Information may be outdated
  • May not be suitable
  • Methodology for collection may be inappropriate.
  • May not be on target with the research problem
  • Quality and accuracy of data may pose a problem

32
Evaluating Secondary Data
  • Overall suitability
  • Precise suitability
  • Costs and benefits

33
Evaluating Secondary Data
  • Overall suitability
  • Does the data set contain the information you
    require to answer your research question(s) and
    meet your objectives?
  • Do the measures used match those you require?
  • Is the data set a proxy for the data you really
    need?
  • Does the data set cover the population which is
    the subject of your research?

34
Evaluating Secondary Data
  • Overall suitability
  • Can data about the population which is the
    subject of your research be separated from
    unwanted data?
  • Are the data sufficiently up to date?
  • Are data available for all the variables you
    require to answer your research question(s) and
    meet your objectives?

35
Evaluating Secondary Data
  • Precise suitability
  • How reliable is the data set you are thinking of
    using?
  • How credible is the data source?
  • Is the methodology clearly described?
  • If sampling was used what was the procedure and
    what were the associated sampling errors and
    response rates?

36
Evaluating Secondary Data
  • Precise suitability
  • Who was responsible for collecting or recording
    the data?
  • (For surveys) is a copy of the questionnaire or
    interview checklist included?
  • (For compiled data) are you clear how the data
    were analysed and compiled?

37
Evaluating Secondary Data
  • Precise suitability
  • Are the data likely to contain measurement bias?
  • What was the original purpose for which the data
    were collected?
  • Who were the target audience and what was their
    relationship to the data collector or compiler?

38
Evaluating Secondary Data
  • Precise suitability
  • Have there been any documented changes in the way
    the data are measured or recorded, including
    definition changes?
  • How consistent are the data obtained from this
    source when compared with data from other
    sources?

39
Evaluating Secondary Data
  • Costs and benefits
  • Are you happy that the data have been recorded
    accurately?
  • What are the financial and time costs of
    obtaining these data?
  • Have the data already been entered into a
    computer?
  • Do the overall benefits of using this secondary
    data source outweigh the associated costs?

40
Methods of Data Collection
  • Census
  • Survey

41
Sample Selection
  • Population
  • Sample frame
  • Sample size
  • Sampling error
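
A minimal sketch tying the four terms above together. The population is generated at random purely for illustration, and the sample frame is assumed to cover the whole population, which real frames often do not. Sampling error is shown here as the gap between the sample mean and the population mean.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: ages of 1,000 people, generated for illustration.
population = [rng.randint(18, 65) for _ in range(1000)]

# Sample frame: the list of population members we can actually draw from.
# Here it is assumed to cover everyone (index positions 0..999).
sample_frame = list(range(len(population)))

# Sample size: how many members of the frame are selected.
sample_size = 50
chosen = rng.sample(sample_frame, sample_size)
sample = [population[i] for i in chosen]

# Sampling error: how far the sample estimate is from the population value.
sampling_error = mean(sample) - mean(population)

print(f"population mean = {mean(population):.2f}")
print(f"sample mean     = {mean(sample):.2f}")
print(f"sampling error  = {sampling_error:.2f}")
```

In practice the population mean is unknown, so the sampling error cannot be computed directly like this; it can only be estimated.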

42
Principles of Sampling
  • Probability
  • Non-Probability

43
Methods of Sampling
  • Random Sampling
  • Purposive
  • Stratified sampling
  • Systematic sampling
  • Multi-stage and multi-phase
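
A minimal sketch of three of the methods listed above: random, systematic and stratified sampling. The sample frame, its "region" strata and the function names are hypothetical. Purposive and multi-stage sampling are not shown, since they depend on the researcher's judgement or on a nested design.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical sample frame of 20 members, each tagged with a stratum.
frame = [{"id": i, "region": "north" if i < 8 else "south"} for i in range(20)]


def simple_random(frame, n):
    """Every member has the same chance of selection."""
    return rng.sample(frame, n)


def systematic(frame, n):
    """Pick every k-th member after a random start, with k = len(frame) // n."""
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]


def stratified(frame, n, key="region"):
    """Sample each stratum in proportion to its share of the frame.

    Shares are rounded, so the total can differ slightly from n when the
    proportions do not divide evenly.
    """
    strata = {}
    for member in frame:
        strata.setdefault(member[key], []).append(member)
    chosen = []
    for members in strata.values():
        share = round(n * len(members) / len(frame))
        chosen.extend(rng.sample(members, share))
    return chosen


print([m["id"] for m in simple_random(frame, 5)])
print([m["id"] for m in systematic(frame, 5)])
print([(m["id"], m["region"]) for m in stratified(frame, 5)])
```

With 8 of the 20 members in the north and 12 in the south, a stratified sample of 5 takes 2 from the north and 3 from the south.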