Title: Statistics New Zealand
1 Statistics New Zealand's Case Study - Creating a New Business Model for a National Statistical Office of the 21st Century
Craig Mitchell, Gary Dunnet, Matjaz Jug
2 Overview
- Introduction - organisation, programme, strategy
- The Statistical Metadata Systems and the Statistical Cycle - description of the metainformation systems, overview of the process model, description of the different metadata groups
- Statistical Metadata in each phase of the Statistical Cycle - metadata produced and used
- Systems and Design issues - IT architecture, tools, standards
- Organisational and cultural issues - user groups
- Lessons learned
4 Business Model Transformation Strategy
- A number of standard, generic end-to-end processes for the collection, analysis and dissemination of statistical data and information
- Includes statistical methods
- Covers the business process life-cycle
- Enables statisticians to focus on data quality, with best-practice methods implemented, greater coordination and effective resource utilisation
- A disciplined approach to data and metadata management, using a standard information lifecycle
- An agreed enterprise-wide technical architecture
5 BmTS Metadata
- The Business Model Transformation Strategy (BmTS) is designing a metadata management strategy that ensures metadata:
- fits into a metadata framework that can adequately describe all of Statistics New Zealand's data and, under the Official Statistics Strategy (OSS), the data of other agencies
- documents all stages of the statistical life cycle, from conception to archiving and destruction
- is centrally accessible
- is automatically populated during the business process, wherever possible
- is used to drive the business process
- is easily accessible by all potential users
- is populated and maintained by data creators
- is managed centrally
6 A - Existing Metadata Issues
- metadata is not kept up to date
- metadata maintenance is considered a low priority
- metadata is not held in a consistent way
- relevant information is unavailable
- there is confusion about what metadata needs to be stored
- the existing metadata infrastructure is under-utilised
- there is a failure to meet the metadata needs of advanced data users
- it is difficult to find information unless you have some expertise or know it exists
- there is inconsistent use of classifications / terminology
- in some instances there is little information about data: where it came from, the processes it has been through, or even the question to which it relates
7 B - Target Metadata Principles
- metadata is centrally accessible
- metadata structure is strongly linked to data
- metadata is shared between data sets
- content and structure conform to standards
- metadata is managed end-to-end across the data life cycle
- there is a registration process (workflow) associated with each metadata element (see the sketch after this list)
- capture metadata at source, automatically
- ensure the cost to producers is justified by the benefit to users
- metadata is considered active
- metadata is managed at as high a level as possible
- metadata is readily available and usable in the context of clients' information needs (internal or external)
- track the use of some types of metadata (e.g. classifications)
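The registration and "active metadata" principles can be pictured with a small sketch. This is a minimal, hypothetical illustration in Python, assuming a simple in-memory registry; the class names, identifiers and workflow states are invented for the example and are not Statistics NZ's actual design.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class RegistrationStatus(Enum):
    """Illustrative workflow states for a registered metadata element."""
    DRAFT = "draft"
    CANDIDATE = "candidate"
    REGISTERED = "registered"
    SUPERSEDED = "superseded"


@dataclass
class MetadataElement:
    """A centrally managed metadata element with provenance and lifecycle state."""
    identifier: str
    definition: str
    owner: str                                   # the data creator responsible for maintenance
    status: RegistrationStatus = RegistrationStatus.DRAFT
    registered_on: date | None = None

    def register(self) -> None:
        """Promote a draft element once it has passed the registration workflow."""
        self.status = RegistrationStatus.REGISTERED
        self.registered_on = date.today()


# A single, central registry keyed by identifier, shared between data sets.
registry: dict[str, MetadataElement] = {}

employed = MetadataElement(
    identifier="concept.labour.employed",        # invented identifier
    definition="Whether a respondent is in paid employment.",
    owner="Labour Market statistics",
)
employed.register()
registry[employed.identifier] = employed
```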
8 How to get from A to B?
- Identified the key (10) components of our information model
- Service Oriented Architecture
- Developed a Generic Business Process Model
- Development approach moved from stove-pipes to components and core teams
- Governance - architectural reviews and a staged funding model
- Re-use of components
9 The 10 Components within BmTS
11 Statistics New Zealand - Current Information Framework
[Figure: the generic business process (Need, Design/Build, Collect, Process, Analyse, Disseminate) runs across a range of information stores organised by subject area (silos) - Time Series Store (INFOS), QMS, Ag, HES etc., ICS Store, Web Store - alongside a Metadata Store (statistical, e.g. SIM), a Reference Data Store (e.g. BF, CARS), a Software Register, a Document Register and Management Information (HR & Finance data stores).]
12 Statistics New Zealand - Future Information Framework
[Figure: the same generic business process (Need, Design/Build, Collect, Process, Analyse, Disseminate) operates over shared stores - an Input Data Store holding time series, raw data, clean data and summary data, and an Output Data Store (a confidentialised, physically separated copy of the IDS), plus ICS and Web stores - alongside a Metadata Store (statistical/process/knowledge), a Reference Data Store, a Software Register, a Document Register and Management Information (HR & Finance data stores).]
13 CMF - gBPM Mapping

CMF Lifecycle Model                       | Statistics NZ gBPM (sub-process level)
1 - Survey planning and design            | Need (sub-processes 1.1 - 1.5); Develop & Design (sub-processes 2.1 - 2.6)
2 - Survey preparation                    | Build (sub-processes 3.1 - 3.7); Collect (sub-process 4.1)
3 - Data collection                       | Collect (sub-processes 4.2 - 4.4)
4 - Input processing                      | Collect (sub-process 4.5); Process (sub-processes 5.1 - 5.3)
5 - Derivation, estimation, aggregation   | Process (sub-processes 5.4 - 5.7)
6 - Analysis                              | Analyse (sub-processes 6.1 - 6.6)
7 - Dissemination                         | Disseminate (sub-processes 7.1 - 7.5)
8 - Post-survey evaluation                | Not an explicit process, but seen as a vital feedback loop
14 Metadata End-to-End
- Need
- capture requirements, e.g. usage of data, quality requirements
- access existing data element concept definitions to clarify requirements
- Design
- capture constraints and basic dissemination plans, e.g. products
- capture design parameters that could be used to drive automated processes, e.g. stratification (see the sketch after this slide)
- capture descriptive metadata about the collection, e.g. methodologies used
- reuse or create required data definitions, questions, classifications
- Build
- capture operational metadata about the selection process, e.g. number in each stratum
- access design metadata to drive the selection process
- Collect
- capture metadata about the process
- access procedural metadata about rules used to drive processes
- capture metadata, e.g. quality metrics
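To make the "design metadata drives the process" idea concrete, here is a minimal sketch, assuming stratified sample selection as the automated process. The collection name, strata and sample sizes are invented for illustration only.

```python
import random

# Design-phase metadata captured once and stored centrally: stratum definitions
# and sample sizes. Names and numbers are illustrative, not a real survey design.
design_metadata = {
    "collection": "Household Economic Survey",
    "strata": [
        {"name": "North Island urban", "sample_size": 3},
        {"name": "South Island urban", "sample_size": 2},
    ],
}


def select_sample(frame: dict[str, list[str]], design: dict) -> dict[str, list[str]]:
    """Build-phase selection driven entirely by the stored design metadata."""
    selection = {}
    for stratum in design["strata"]:
        units = frame[stratum["name"]]
        selection[stratum["name"]] = random.sample(units, stratum["sample_size"])
    return selection


# Operational metadata (e.g. the number selected in each stratum) is captured
# as a by-product of running the process.
frame = {
    "North Island urban": [f"NI-{i}" for i in range(10)],
    "South Island urban": [f"SI-{i}" for i in range(10)],
}
sample = select_sample(frame, design_metadata)
operational_metadata = {name: len(units) for name, units in sample.items()}
```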
15 Metadata End-to-End (2)
- Process
- capture metadata about the operation of processes
- access procedural metadata, e.g. edit parameters
- create and/or reuse derivation definitions and imputation parameters
- Analyse
- capture metadata, e.g. quality measures
- access design parameters to drive estimation processes
- capture information about quality assurance and sign-off of products
- access definitional metadata to be used in the creation of products
- Disseminate
- capture operational metadata
- access procedural metadata about customers
- metadata needed to support Search, Acquire, Analyse (incl. integrate), Report
- capture re-use requirements, including importance of data (fitness for purpose)
- Archive or Destruction
- detail on the length of the data life cycle
16 Metadata End-to-End - Worked Example
- Question text: "Are you employed?"
- Need
- Concept discussed with users
- Check international standards
- Assess existing collections' questions
- Design
- Design question text, answers and methodologies
- Align with output variables (e.g. ILO classifications)
- Data model, supported through the meta-model
- Develop Business Process Model - process, data and metadata flows
- Build
- Concept Library - questions, answers and methods
- Plug-and-play methods, with parameters (metadata) the key
- System of linkages (no hard-coding)
17 Metadata End-to-End - Worked Example
- Question text: "Do you live in Wellington?"
- Collect
- Question, answers and methods rendered to the questionnaire (sketched below)
- Deliver the question to respondents
- Confirm quality of the concept
- Process
- Draw questions, answers and methods from the meta-store
- Business logic drawn from the rules engine
- Analyse
- Deliver question text, answers and methods to the analyst
- Search and discover data through metadata
- Access the knowledge base (metadata)
- Disseminate
- Deliver question text, answers and methods to the user
- Archive question text, answers and methods
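The worked example can be sketched as follows: the question, its answer codes and its classification alignment are held once in a concept library and then rendered, validated and delivered by linkage rather than hard-coding. The identifiers, library structure and functions below are hypothetical, chosen only to illustrate the plug-and-play idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionConcept:
    """A question stored once in a concept library and linked, not hard-coded."""
    concept_id: str
    text: str
    answer_codes: dict[str, str]        # code -> label
    classification: str                 # output classification it aligns with


# Illustrative entry only; the real concept library and identifiers will differ.
library = {
    "Q.EMPLOYED": QuestionConcept(
        concept_id="Q.EMPLOYED",
        text="Are you employed?",
        answer_codes={"1": "Yes", "2": "No"},
        classification="ILO labour force status",
    )
}


def render_to_questionnaire(concept_id: str) -> str:
    """Collect: the questionnaire is rendered from the stored question and answer codes."""
    q = library[concept_id]
    options = ", ".join(f"{code}={label}" for code, label in q.answer_codes.items())
    return f"{q.text} ({options})"


def validate_response(concept_id: str, code: str) -> bool:
    """Process: business logic is drawn from the same metadata, not re-keyed."""
    return code in library[concept_id].answer_codes


print(render_to_questionnaire("Q.EMPLOYED"))   # Collect
print(validate_response("Q.EMPLOYED", "1"))    # Process
# Analyse / Disseminate: the same text, answers and classification are delivered
# to analysts and users, and archived with the data.
```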
18 Conceptual View of Metadata
- Anything related to data, but not dependent on data, is metadata
- There are four types of metadata in the model - Conceptual (including contextual), Operational, Quality and Physical - as defined by MetaNet
19 Implementation - Dimensional Model
[Figures, slides 19-20: metadata dimensions arranged around a central FACT.]
21 Architecture
[Figure: user access through an Information Portal; a service layer providing metadata and data services; and an Input Data Environment holding the FACTs. A sketch of the service-layer idea follows.]
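A hedged sketch of the service-layer idea: the information portal talks to a service interface rather than directly to the metadata store or the Input Data Environment. The interface, its methods and the in-memory implementation are illustrative assumptions, not the actual enterprise services.

```python
from typing import Protocol


class MetadataService(Protocol):
    """Hypothetical service-layer interface: the portal never touches stores directly."""
    def get_definition(self, concept_id: str) -> str: ...
    def search(self, term: str) -> list[str]: ...


class InMemoryMetadataService:
    """Toy implementation standing in for the real metadata store."""
    def __init__(self) -> None:
        self._definitions = {"employed": "In paid employment (illustrative definition)."}

    def get_definition(self, concept_id: str) -> str:
        return self._definitions[concept_id]

    def search(self, term: str) -> list[str]:
        return [k for k in self._definitions if term.lower() in k]


def portal_lookup(service: MetadataService, term: str) -> list[str]:
    """The portal consumes services; swapping implementations needs no portal change."""
    return [service.get_definition(cid) for cid in service.search(term)]


print(portal_lookup(InMemoryMetadataService(), "employ"))
```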
22 Fact definitions (a sketch follows this list)
- Versioning
- Time
- Questions & Variables
- Dimensions & Hierarchies
- Units of Interest
- Collections & Instruments
- Respondents
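A minimal sketch of the dimensional model: every statistical fact carries the agreed metadata dimensions listed above, so data from different collections can be combined by filtering and grouping on shared metadata. The field names and example values are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    """A statistical fact stored against shared metadata dimensions (illustrative fields)."""
    value: float
    variable: str           # question / variable the value answers
    unit_of_interest: str   # statistical unit (person, household, business, ...)
    category: str           # classification category, e.g. region
    collection: str         # collection and instrument it came from
    reference_time: date    # reference period
    version: int            # versioning of the fact


facts = [
    Fact(1.0, "employed", "person:0001", "Wellington", "HES 2006", date(2006, 6, 30), 1),
    Fact(0.0, "employed", "person:0002", "Auckland", "HES 2006", date(2006, 6, 30), 1),
]

# Because every fact carries the same dimensions, data is "integrateable":
# selection and aggregation work through the metadata, not survey-specific code.
wellington = [f for f in facts if f.category == "Wellington" and f.variable == "employed"]
```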
23 Goal - Overall Metadata Environment
24 Metadata - Recent Practical Experiences
- Generic data model, federated cluster design
- Metadata the key
- Corporately agreed dimensions
- Data is integrateable, rather than integrated
- Blaise to Input Data Environment
- Exporting Blaise metadata
- Rules Engine (see the sketch after this list)
- Based around a spreadsheet
- Working with a workflow engine to improve (BPM based)
- IDE Metadata tool
- Currently spreadsheet based
- Audience Model
- Public, professional and technical, with 'system' added
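The spreadsheet-based rules engine can be sketched as follows: edit rules are held as data (a CSV string stands in for the spreadsheet here) and a small generic engine applies them, so changing the rules requires no code change. The column names, rules and actions are invented for the example.

```python
import csv
import io

# Edit rules held as data; subject-matter staff can change parameters without
# touching code. The rule columns and actions are illustrative only.
rules_sheet = """variable,min,max,action
hours_worked,0,168,reject
income,0,,flag
"""


def load_rules(sheet: str) -> list[dict]:
    """Read the rule table (the spreadsheet stand-in) into a list of rule rows."""
    return list(csv.DictReader(io.StringIO(sheet)))


def apply_rules(record: dict, rules: list[dict]) -> list[str]:
    """Generic engine: the behaviour comes entirely from the rule table."""
    findings = []
    for rule in rules:
        value = record.get(rule["variable"])
        if value is None:
            continue
        low = float(rule["min"]) if rule["min"] else None
        high = float(rule["max"]) if rule["max"] else None
        if (low is not None and value < low) or (high is not None and value > high):
            findings.append(f"{rule['action']}: {rule['variable']}={value}")
    return findings


print(apply_rules({"hours_worked": 200, "income": 45000}, load_rules(rules_sheet)))
# -> ['reject: hours_worked=200']
```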
25 SOA
26 Standards & Models - The MetaNet Reference Model™
- Two-level model based on:
- Concepts - basic ideas, the core of the model
- Characteristics - elements and attributes that make concepts unique
- Terms and descriptions can be adapted
- Concepts must stay the same
- Concepts should be distinct and consistent
- Concepts have hierarchy and relationships
27 [Figure - example concepts: a Collection (e.g. Census, frequency 5-yearly) with an instance (e.g. Census 2006); Classification 'City' with Category 'WGTN'; Classification 'NZ Island' with Category 'NTH ISL'. A sketch of these concepts follows.]
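A minimal sketch of the two-level idea using the example above: stable concepts (collections, classifications, categories) with characteristics that make each instance unique, and a simple hierarchy between them. This is a toy representation with invented field names, not the MetaNet model itself.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A stable concept; characteristics (attributes) make a particular instance unique."""
    name: str
    characteristics: dict[str, str] = field(default_factory=dict)
    narrower: list["Concept"] = field(default_factory=list)   # hierarchy / relationships


# Classifications and their categories form a hierarchy of concepts.
nz_island = Concept("Classification: NZ Island")
north_island = Concept("Category: North Island", {"code": "NTH ISL"})
nz_island.narrower.append(north_island)

city = Concept("Classification: City")
wellington = Concept("Category: Wellington", {"code": "WGTN"})
city.narrower.append(wellington)

# A collection concept, made unique by its characteristics, with a specific instance.
census = Concept(
    "Collection: Census",
    {"frequency": "5-yearly"},
    narrower=[Concept("Collection instance: Census 2006")],
)
```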
28 Defining Metadata Concepts - Example
29 How will we use MetaNet?
- Used to guide the development of a Stats NZ model
- Another model (SDMX) will be used for additional support where there are gaps
- Provides the base for consistency across systems and frameworks
- Will allow for better use and understanding of data
- Will highlight duplications and gaps in current storage
30 Metainformation systems
[Figure: a concept-based model (SIM) linking Collections, Data Collections, Variables, Facts, Classifications, Classification Categories, Domain Values, Statistical Units, Responses, Sample Designs and Concordances; other metadata is stored in the IDE, CARS, the Business Frame, survey systems, BmTS components, etc.]
31 Metadata Users - External
- Government
- Public
- External statisticians (incl. international organisations)
32 Metadata Users - Internal
- Statistical Analysts
- IT Personnel (business analysts, IT designers, technical leads, developers, testers, etc.)
- Management
- Data Managers / Custodians / Archivists
- Statistical Methodologists
- External Statisticians (researchers etc.)
- Architects - data, process and application
- Respondent Liaison
- Survey Developers
- Metadata and Interoperability Experts
- Project Managers & Teams
- IT Management
- Product Development and Publishing
- Information & Customer Services
33 Lessons Learnt - Metadata Concepts
- Apart from the 'basic' principles, metadata principles are quite difficult to get a good understanding of, and this makes communicating them even harder.
- Everyone has a view on what metadata they need - the list of metadata requirements / elements can be endless. Given the breadth of metadata, an incremental approach to the delivery of storage facilities is fundamental.
- Establish a metadata framework that best fits your organisation, upon which discussions can be based - we have agreed on MetaNet, supplemented with SDMX.
34 Lessons Learnt - BPM
- To make data re-use a reality there is a need to go back to first principles, i.e. what is the concept behind the data item? Surprisingly, it can be difficult for some subject-matter areas to identify these first principles, particularly if the collection has been in existence for some time.
- Be prepared for survey-specific requirements: the BPM exercise is absolutely needed to define the common processes and to identify the survey-specific features that may still be required.
35 Lessons Learnt - Implementation
- Without significant governance it is very easy to start with a generic service concept and yet still deliver a silo solution. The ongoing upgrade of all generic services is needed to avoid this.
- Expecting delivery of generic services from input- or output-specific projects leads to significant tensions, particularly in relation to added scope within fixed resources and schedules. Delivering business services at the same time as developing and delivering the underlying architecture services adds significant complexity to implementation.
36 Lessons Learnt - Implementation (2)
- A well-defined relationship between data and metadata is very important. The approach of directly connecting each data element, defined as a statistical fact, to the metadata dimensions proved successful because we were able to test and utilise the concept before the (costly) development of metadata management systems.
37 Lessons Learnt - SOA
- The adoption and implementation of SOA as a Statistical Information Architecture requires a significant mind-shift from data processing to enabling enterprise business processes through the delivery of enterprise services.
- Skilled resources, familiar with SOA concepts and their application, are very difficult to recruit, and equally difficult to grow.
38 Lessons Learnt - Governance
- The move from silo systems to a BmTS-type model is a major challenge that should not be under-estimated.
- Having an active Standards Governance Committee, made up of senior representatives from across the organisation (ours has the 3 DGSs on it), is very useful. This forum provides an environment in which standards can be discussed and agreed, and the Committee can take on the role of the 'authority to answer to' if need be.
39 Lessons Learnt - Other
- There is a need to consider the audience of the metadata.
- Some metadata is better than no metadata - as long as it is of good quality.
- Do not expect to get it 100% right the very first time.
40 Questions?