The FRB and XML

1
The FRB and XML
  • National data and International standards
  • San Cannon
  • Federal Reserve Board
  • IASSIST 2005

2
Background
  • The Fed is a statistical agency as well as a
    central bank and regulatory agency.
  • Lots of data and information are available on the
    public website.
  • Statistical data are varied: monthly industrial
    production indexes (non-financial), daily
    interest and exchange rates (financial),
    quarterly financial flows for various sectors of
    the economy, surveys of small businesses and
    consumers, etc.

3
The different roles are often competing
interests...
  • Sometimes it seems that the statistical agency
    role is secondary.
  • Data are not always easy to find.
  • Downloads are not customizable.
  • Example: extracting one industrial production
    series requires two text files, cutting and
    pasting, and reformatting.
  • All-or-nothing approach.
  • Complete: yes. User-friendly: no.

4
Other agencies making great strides
  • Bureau of Economic Analysis has wonderful tabling
    capabilities: www.bea.gov
  • Bureau of Labor Statistics has query screens,
    series select screens and frequently requested
    statistics: www.bls.gov

5
Taking an extra step
  • We wanted to build something forward-looking:
    XML was identified early on.
  • It is the most flexible option and seems to be
    the trend for the future.
  • Financial data are already heading that way:
    FinXML, FpML (Financial products Markup
    Language), MDDL (Market Data Definition
    Language), XBRL (eXtensible Business Reporting
    Language)

6
How do we do it?
  • Build our own XML definitions
  • Pro: would fit our data perfectly
  • Con: we'd be the only ones
  • Use financial definitions
  • Pro: lots of others use them
  • Con: we have nonfinancial data
  • Try SDMX (Statistical Data and Metadata
    eXchange)
  • Pro: designed for time series data
  • Con: new kid on the block

7
But nothing goes smoothly at first
  • SDMX is based on key families and codelists
    where every concept can be represented by a code
    with a corresponding definition in a list
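
As a rough illustration of the idea (not the Board's actual definitions), a codelist can be modelled as a mapping from code to definition, and a key family as an ordered list of concepts, each tied to one codelist. The codes, definitions, and the COUNTRY concept below are made up for illustration; only the names FREQ and CL_FREQ appear elsewhere in this presentation.

```python
# Hypothetical codelists: every code has a corresponding definition.
CODELISTS = {
    "CL_FREQ": {"M": "Monthly", "Q": "Quarterly"},
    "CL_COUNTRY": {"US": "United States", "GB": "United Kingdom"},
}

# Hypothetical key family: an ordered list of (concept, codelist) pairs.
KEY_FAMILY = [("FREQ", "CL_FREQ"), ("COUNTRY", "CL_COUNTRY")]


def describe(key):
    """Look up the definition of each code in a dot-separated key."""
    codes = key.split(".")
    if len(codes) != len(KEY_FAMILY):
        raise ValueError(f"expected {len(KEY_FAMILY)} codes, got {len(codes)}")
    result = {}
    for (concept, codelist), code in zip(KEY_FAMILY, codes):
        result[concept] = CODELISTS[codelist][code]  # KeyError if the code is unknown
    return result


print(describe("Q.GB"))
```

A key is only valid here if it has exactly one code per concept in the key family, which is the constraint that becomes awkward for names with a varying number of parts.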

8
We think about data differently
  • The Fed uses mnemonic series names where each
    character in our series name has meaning and
    names are hierarchical.

9
Fitting a square peg in a round hole.
  • Data represented by a concrete number of concepts
    are much easier to represent with key family
    dimensions and attributes
  • Q.SCBA.GB.92 → Freq.Topic.Country.BIS code
  • M.HBBA.US.01 → Freq.Topic.Country.BIS code
  • Hierarchical relationships and a varying number
    of concepts make life more difficult: a single
    key family isn't possible
  • JQI_I02YMF_N.M → Topic_Industry_SA.Freq
  • RIFSPPNA2P2D30_N.B → Topic?_SA.Freq
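
To make the contrast concrete, the sketch below splits the two kinds of names shown on this slide. The dimension labels come from the slide; the parsing rules (split on "." for the fixed-length keys, take the frequency after the last "." and split the rest on "_" for the mnemonic names) are assumptions inferred from the four examples, not the Fed's documented naming rules.

```python
FIXED_DIMENSIONS = ["Freq", "Topic", "Country", "BIS code"]


def parse_fixed_key(key):
    """Map a dot-separated key onto a fixed list of dimensions."""
    parts = key.split(".")
    if len(parts) != len(FIXED_DIMENSIONS):
        raise ValueError(f"{key!r} does not have {len(FIXED_DIMENSIONS)} parts")
    return dict(zip(FIXED_DIMENSIONS, parts))


def parse_mnemonic(name):
    """Split a mnemonic name into its frequency and a variable number of segments."""
    body, freq = name.rsplit(".", 1)
    return {"Freq": freq, "segments": body.split("_")}


print(parse_fixed_key("Q.SCBA.GB.92"))
print(parse_mnemonic("JQI_I02YMF_N.M"))      # three segments before the frequency
print(parse_mnemonic("RIFSPPNA2P2D30_N.B"))  # two segments before the frequency
```

The fixed keys always yield the same four dimensions, while the mnemonic names yield a different number of segments from one series to the next, which is why one fixed list of dimensions cannot describe both.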

10
SDMX only provides a framework
  • We still needed to build the actual schemas to
    describe our data within the SDMX metaschema
    framework.
  • Each data release uses its own schema or set of
    schemas. Each schema is based on a key family
    used to describe the data.
  • Currently, our schemas are tailored to meet our
    data needs.

11
Storage adds further complications
  • We need to store data and metadata in a database
    to be retrieved with queries.
  • Native XML databases are in their infancy.
  • We couldn't find many people storing XML-tagged
    data in relational databases.

12
So what did we end up with?
  • The data model is a hybrid: a tree structure
    flattened to fit the codelist setup.
  • We store the XML as carefully sliced text in a
    relational database and we can build an index
    structure that allows us to respond to ad-hoc
    queries very efficiently, even for large volumes
    of data.
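
The transcript does not show the actual table layout, so the following is only a minimal sketch of the general approach: XML slices kept as text in a relational table, with an index that supports ad-hoc selection. It uses Python's built-in sqlite3 module with an in-memory database; the table name, column names, series names, and XML fragments are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE series_fragment (
        release    TEXT NOT NULL,   -- hypothetical release identifier
        series_key TEXT NOT NULL,   -- hypothetical series name
        xml_text   TEXT NOT NULL    -- one slice of XML stored as plain text
    )
    """
)
# The index lets lookups by release and series avoid a full table scan.
conn.execute("CREATE INDEX idx_fragment ON series_fragment (release, series_key)")

rows = [
    ("CP", "SERIES_A", '<Series id="SERIES_A"><Obs time="2005-01" value="1.0"/></Series>'),
    ("CP", "SERIES_B", '<Series id="SERIES_B"><Obs time="2005-01" value="2.0"/></Series>'),
]
conn.executemany("INSERT INTO series_fragment VALUES (?, ?, ?)", rows)


def build_extract(release, keys):
    """Assemble a custom extract by concatenating the stored slices for the requested series."""
    placeholders = ",".join("?" for _ in keys)
    query = (
        "SELECT xml_text FROM series_fragment "
        f"WHERE release = ? AND series_key IN ({placeholders}) "
        "ORDER BY series_key"
    )
    fragments = [row[0] for row in conn.execute(query, [release, *keys])]
    return "<DataSet>" + "".join(fragments) + "</DataSet>"


print(build_extract("CP", ["SERIES_B"]))
```

Because each slice is already well-formed XML, a customized download can be produced by selecting rows and wrapping them in a container element, without parsing the XML at query time.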

13
This kind of structure
14
Looks like this in SDMX-ML
  • Commercial Paper Outstandings
  • codelist="CL_TIME"
  • concept="FREQ" codelist="CL_FREQ"/
  • codelist="CL_CP_SA"/
  • codelist="CL_CP_IND_TYPE"/
  • codelist="CL_CP_ORIG"/
  • codelist="CL_CP_OWN"/
  • codelist="CL_CP_NSASC"/
  • codelist="CL_UNIT" attachmentLevel="Group"
    assignmentStatus="Mandatory"/
  • codelist="CL_UNIT_MULT" attachmentLevel="Group"
    assignmentStatus="Mandatory"/
  • codelist="CL_OBS_STATUS" attachmentLevel="Observation"
    assignmentStatus="Mandatory"/
  • attachmentLevel="Series" assignmentStatus="Mandatory"/
  • attachmentLevel="Series" assignmentStatus="Conditional"/
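
The fragment above lost its element names in transcription, so the sketch below uses a simplified stand-in rather than real SDMX-ML. The element names (KeyFamily, Dimension, Attribute) and the concept names CP_SA, UNIT and OBS_STATUS are assumptions for illustration; the codelist names, the FREQ concept, and the attachmentLevel and assignmentStatus values are taken from the slide. It shows how such a definition separates dimensions, which identify a series, from attributes, which attach to a group, series, or observation.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for a key family definition.
# Element names and most concept names are assumptions, not SDMX-ML.
KEY_FAMILY_XML = """
<KeyFamily>
  <Dimension concept="FREQ" codelist="CL_FREQ"/>
  <Dimension concept="CP_SA" codelist="CL_CP_SA"/>
  <Attribute concept="UNIT" codelist="CL_UNIT"
             attachmentLevel="Group" assignmentStatus="Mandatory"/>
  <Attribute concept="OBS_STATUS" codelist="CL_OBS_STATUS"
             attachmentLevel="Observation" assignmentStatus="Mandatory"/>
</KeyFamily>
"""

root = ET.fromstring(KEY_FAMILY_XML)

# Dimensions: each concept is tied to the codelist that supplies its codes.
dimensions = [(d.get("concept"), d.get("codelist")) for d in root.findall("Dimension")]

# Attributes: each one records where it attaches and whether it is required.
attributes = [
    (a.get("codelist"), a.get("attachmentLevel"), a.get("assignmentStatus"))
    for a in root.findall("Attribute")
]

for concept, codelist in dimensions:
    print(f"dimension {concept} uses {codelist}")
for codelist, level, status in attributes:
    print(f"attribute from {codelist}: level={level}, status={status}")
```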

15
Which gets stored like this
16
And the end result?
  • The Data Download Project (DDP) is the largest,
    most complex application on the Board's public
    website.
  • It's also the first production application to
    deliver customized data extracts in SDMX format.
  • And now.
  • Version 1.0!

17-35
(No Transcript)
36
Next steps
  • Do performance testing and verify server load
    capabilities.
  • Polish the interface, do usability testing and
    verify compliance with Section 508 regulations.
  • Long run: work with other central banks on a
    common schema framework.
  • Release on the unsuspecting public! Target:
    third quarter 2005

37
The last slide
  • Questions? Comments?
  • Thank you for your attention!
  • San Cannon
  • scannon_at_frb.gov
  • (202) 452-3710