Building EDEN NonXML Files November 2005 - PowerPoint PPT Presentation

1 / 76
About This Presentation
Title:

Building EDEN NonXML Files November 2005

Description:

Cause: An invalid Category Permitted Value is provided. ... that was less than the sum of the student counts for the Category Set being reported. ... – PowerPoint PPT presentation

Number of Views:41
Avg rating:3.0/5.0
Slides: 77
Provided by: brando8
Category:

less

Transcript and Presenter's Notes

Title: Building EDEN NonXML Files November 2005


1
Building EDEN Non-XML FilesNovember 2005 
U.S. DEPARTMENT OF EDUCATION
2
Agenda
  • ESS Introduction A brief overview of the
    features of ESS.
  • ESS Databases The path data takes from
    submission to the data repository.
  • File formats High level overview of available
    file formats.
  • File Specification Review Step through File
    Specifications.
  • Order of File Submission Importance of
    Directory file and managing data identifiers.
  • Review of Errors and Edits Sharing of solutions
    to common problems. Open session with group to
    discuss their states issues.
  • SEA Tools - What do you use? Open session for
    states to share how they prepare files. Review
    successes and failures.

3
ESS Sub-Applications
Access to ESS Home Page
Access to transmit data files
Access to view status of transmitted files.
Including format and validation errors,
reasonability edit warnings.
Access to view submission status by SEA, LEA and
School.
This sub-application is available to view and
verify data at a State, LEA, or School level.
This area allows you to change your password as
needed.
4
ESS Databases  
5
System Components
  • Data Receipt and Processing System (DRPS)
  • Receives files and checks for errors
  • Staging Database
  • Performs reasonability edits
  • Data Repository
  • Responsibility of data transferred to ED

6
Data Receipt and Processing System
  • Manages transmission of files from the SEAS to
    EDEN
  • Receives files from SEA
  • Performs format and validation edits
  • Generates submission and error reports
  • Emails file receipt and status to submitter
  • If a file contains any format or validation
    errors, its data does not go beyond this point

7
Staging Database
  • Contains data from files that passed format and
    validation edits
  • Contents can be viewed from the Education Unit
    Profile
  • Performs reasonability edits on submitted data
  • Data requiring explanations from the SEA does not
    move past this point
  • Before leaving this stage, ED will contact SEA
    regarding any data quality concerns

8
Data Repository
  • Contains cleaned and edited files
  • Responsibility of the data has transferred from
    SEA to ED
  • SEA will receive notice 96 hours before data is
    moved to DR
  • Additional review of data in the DR performed by
    ED subject matter experts before data is used or
    made available to public

9
(No Transcript)
10
  • File Format Options

11
File Formats
  • Fixed Format (.txt)
  • Comma-delimited Format (.csv)
  • Tab-delimited Format (.tab)
  • XML Format (.xml)

12
  • Fixed Format
  • Each record has a pre-defined length
  • Each field in a record is assigned a beginning
    position and has a pre-defined length
  • All fields must be included in a record, whether
    used or not
  • Filler fields and optional fields for which no
    value is reported should be blank in fixed
    position records
  • Unused positions of a field should be blank

13
  • Fixed Format
  • For a fixed format file, the external file name
    in the header record must have a file extension
    of .txt
  • Numeric fields must not contain commas
  • Data in decimal fields must contain the decimal
    point (e.g., 1.123)

14
  • Fixed Format Example

15
  • Comma-delimited Format
  • Similar to the fixed format for order and format
    of each field in a record
  • Each field is separated by a comma instead of
    beginning and ending in a specified position
  • All fields, including filler fields must be
    accounted for

16
  • Comma-delimited Format
  • Except for the last field in a record, each field
    whether valued or not, must be designated by a
    comma, including filler fields
  • Example ( indicates CRLF)
  • First Field Three Bears School
  • Second Field Filler
  • Third Field 123 Main Street
  • Last Field Filler
  • THREE BEARS SCHOOL,,123 MAIN STREET,

17
  • Comma-delimited Format
  • When the last field in a record is a filler, the
    record will appear to end with a comma, but it
    actually ends with a blank filler field
  • For valued fields, the comma should immediately
    follow the data value, except for the last field
    in the record
  • For a comma-delimited file, the external file
    name in the header record must have a file
    extension of .csv

18
  • Comma-delimited Format Examples
  • Header Record Field Content
  • Field Name Format Data Value
  • File Type STRING LEA DIRECTORY INFO
  • Total Records in File NUMBER 123
  • Filename STRING EULEADIRECTORYVer0001.csv
  • File Identifier STRING LEA Directory Example
  • File Reporting Period STRING 2003-2004
  • LEA DIRECTORY INFO,123,EULEADIRECTORYVer0001.csv,L
    EA Directory Example,
  • 2003-2004

19
  • Comma-delimited Format Examples

20
  • Tab-delimited Format
  • Almost identical to comma-delimited format except
    for the following differences
  • Each field is separated by a tab character
    instead of a comma
  • Blank fields and fillers need only be designated
    with a tab character

21
  • Tab-delimited Format
  • For a tab-delimited file, the external file name
    in the header record must have a file extension
    of .tab
  • When the last field in a record is a filler, the
    record will appear to end with a tab character,
    but it actually ends with a blank filler field

22
  • Tab-delimited Format
  • Except for the last field in a record, each field
    whether valued or not, must be designated by a
    tab character, including filler fields
  • Example (gt indicates a tab character -
    indicates CRLF)
  • First Field Three Bears School
  • Second Field Filler
  • Third Field 123 Main Street
  • Last Field Filler
  • THREE BEARS SCHOOLgtgt123 MAIN STREETgt

23
  • Tab-delimited Format Examples
  • Header Record Field Content
  • Field Name Format Data Value
  • File Type STRING LEA DIRECTORY INFO
  • Total Records in File NUMBER 123
  • Filename STRING EULEADIRECTORYVer0001.tab
  • File Identifier STRING LEA Directory Example
  • File Reporting Period STRING 2003-2004
  • LEA DIRECTORY INFOgt123gtEULEADIRECTORYVer0001.tabgtL
    EA Directory Examplegt
  • 2003-2004

24
  • Tab-delimited Format Examples

25
  • Extensible Markup Language - XML
  • XML is a standard for creating documents that
    describe and apply a structure to data
  • XML document are self-describing making them
    readable by both humans and computers
  • Extensible means there not a fixed set of
    elements like HTML. Authors can define their own
    tags.
  • A common format for exchanging data over the Web.

26
  • Example of XML
  • ltFILETRANSMISSION FILELAYOUTTYPE"SEA MEMBERSHIP
    TABLE"
  • FILEID"SEA MEMBERSHIP 15 OCT 2005"
  • SCHOOLYEAR"2005-2006"gt
  • ltAGENCY FIPSSTATECODE"99" STATEAGENCYIDNUMBER
    "01"
  • STATELEAIDNUMBER"518"gt
  • ltTABLETYPE TYPEABBRV"MEMBER"
    TOTALINDICATOR"N"gt
  • ltCATEGORY TYPE"GRADELVMEM"
    VALUE"03"/gt
  • ltCATEGORY TYPE"RACEETHNIC"
    VALUE"AS"/gt
  • ltCATEGORY TYPE"GENDER" VALUE"F"/gt
  • ltAMOUNTgt456lt/AMOUNTgt
  • ltEXPLANATION/gt
  • lt/TABLETYPEgt
  • lt/AGENCYgt
  • lt/FILETRANSMISSIONgt

27
File Specifications
28
File Specifications
  • Contains layout of all non-XML file types
  • 83 separate specifications- 1 for each file type
  • All have similar format
  • Primary reference point for preparing and
    submitting files
  • Revised for the 2005-2006 collection period
  • Current specifications available at
    http//www.ed.gov/about/inits/ed/pbdmi/file-specif
    ications.html

29
Section 1.0- Purpose
  • Provides an overview of what data is submitted in
    the file
  • Lists changes from previous school years file
    specifications
  • Clarifies the set of Educational Units from which
    the data should be collected
  • The Category Set table is now located in this
    section
  • Easier to determine requirements of a file at a
    glance

30
Section 1.0- Purpose, contd
  • Categories
  • Lists the data groups that make up a specific
    table entry
  • Details which fields should be populated for any
    given record in a file

31
Section 2.0- File Naming Convention
  • Describes naming convention for consistency
  • File names can be no longer than 25 characters
  • Not strictly necessary, but strongly recommended

32
Section 3.0- Guidance for Submitting this File
  • Helpful hints and frequently asked questions
    appear in this section
  • Information is specific to each particular file
    specification
  • Clarifies when zero rows need to be submitted

33
Sections 4-6 File Level Specifications
  • Sections 4, 5, and 6 provide the technical
    details needed to prepare the SEA, LEA, and
    School level files
  • If data is not collected at a given level, that
    level is left out
  • Includes tables listing the data groups and their
    permitted values
  • Consists of 3 subsections
  • Header Record Definition
  • Data Record Definition
  • Examples

34
Sections 4-6 File Level Specifications, contd
  • The Header and Data Record tables contain 7
    columns
  • Data Group Name
  • Start Position
  • Length
  • Type
  • Mandatory/Optional
  • Definition/Comments
  • Permitted Values

35
Sections 4-6 File Level Specifications, contd
  • Data groups are listed in the order they should
    exist within each record
  • Start position
  • Indicates the starting point of a data group
    measured in characters from the beginning of a
    line
  • Only applies to fixed-width files
  • Length
  • The maximum number of characters for a data group
    value
  • Values exceeding this length can cause format
    errors
  • Applies to all non-XML file types

36
Sections 4-6 File Level Specifications, contd
Types
  • String
  • Can consist of any alphanumerical character
  • Leading zeros are significant in numbers
    expressed as strings
  • Number
  • Can consist of any integer value -1 or greater
  • Leading zeros are not required
  • Decimal
  • Used to report percentages
  • Consists of a digit followed by 4 decimal places
    (0.8355 83.55)

37
Mandatory vs. Optional
  • An M in the Mandatory/Optional column indicates
    that the field is mandatory and must contain some
    data for each record within the file
  • An O in the Mandatory/Optional column indicates
    that the field is Optional and data is not
    required for that field
  • Optional fields may still be required in
    accordance with the category chart of the file
    specification

38
Definition, contd
  • Definition/Comments- gives a brief explanation of
    the data group along with additional information
    on preparing the file
  • Permitted Values- lists the only accepted values
    for a given field
  • If blank, permitted values must be consistent
    with the data type

39
Examples
  • The examples section includes an excerpt of
    properly formatted file in each of the 3 non-XML
    formats

40
Order of File Submissions
41
  • 05-06 Transmittals
  • Importance of Directory Files
  • Relationships Among Files

42
  • Timing for 2005-2006 EDEN Data Submission
  • The submission time frame for School Year
    2005-2006 data begins in January 2006.
  • SEAs should begin submitting data as soon as the
    data is available after January 1, 2006.
  • See Workbook

43
  • Importance of Directory Files
  • Prior to any other data being submitted for an
    education unit, a complete and correct Directory
    record must be in place
  • SEA Directory data must be in place prior to
    submission of LEA Directory data
  • LEA Directory data must be in place prior to
    submission of School Directory data

44
  • Importance of Directory Files
  • Directory and Grades Offered files are not school
    year specific, but continue from year to year
  • They only need to be submitted when data changes
    or if data for the education unit does not exist
    in the database
  • Operational Status is now collected with the
    directory file.

45
  • Relationships Among Files
  • A Membership record (or any other non-directory
    record) for a school will not be accepted unless
    a record for that school exists in the directory
    file
  • Grades Offered files should be submitted for any
    LEA or School which has students enrolled

46
  • Updating Directory Files
  • Submitting a transmittal file with only an
    additional or corrected education unit is ok, or
    the entire directory transmittal file may be
    resubmitted
  • If the entire Directory transmittal file is
    submitted, be sure to include the additional
    and/or corrected education unit(s)
  • Submit grades offered for the additional
    education unit
  • Submit membership data for the additional
    education unit

47
  • Directory Files Identifiers
  • State Identifiers are the primary identifiers
    used to match a submitted education unit
    directory record to an education unit directory
    record on the EDEN database
  • NCES Identifiers may be provided, but are only
    used a secondary check

48
  • Identifiers for SEA Data
  • FIPS State Code and State Agency Number
  • State Agency Number (default 01)

49
  • Identifiers for LEA Data
  • FIPS State Code
  • State Agency Number
  • State-assigned LEA ID

50
  • Identifiers for School Data
  • FIPS State Code
  • State Agency Number
  • State-assigned LEA ID
  • State-assigned School Identifier

51
  • Adding/Updating an SEA
  • FIPS/Agency Match - FIPS Code and State Agency
    Number in submission match the FIPS Code and
    State Agency Number on EDEN
  • Treated as update - Submission changes applied
  • FIPS Match Agency Not Match - FIPS Code in
    submission matches FIPS Code in EDEN - State
    Agency Number in submission does not match State
    Agency Number in EDEN
  • Treated as new state agency Agency added

52
  • Adding/Updating an LEA
  • Primary IDs Match, No Secondary IDs/Secondary IDs
    Match
  • Treated as update - Submission LEA changes
    applied
  • State SEA IDs Match, LEA ID Not Match, No
    Secondary IDs
  • Treated as new LEA added to EDEN
  • State IDs Match, LEA ID Not Match, Secondary IDs
    Match
  • Error (An attempt is being made to associate 2
    different State LEA IDs to a single NCES ID.)

53
  • Modifying State Identifiers
  • It is not possible to modify a States ID by
    submitting an updated directory file
  • EDEN will see the modified ID as a new unit and
    add it to the system, resulting in duplicate
    records

54
  • Adding/Updating a School
  • The same logic for identifiers applies at the
    school level

55
EDEN Edits, Errors, and Warnings
56
  • Format Edits Errors
  • Validation Edits Errors
  • Reasonability Edits Errors

57
  • Format Edits and Errors
  • Definition transmitted file must be in an
    allowable specified format
  • Purpose If the transmitted file is not in an
    allowable specified format, DRPS cannot interpret
    it
  • Result If the file is acceptably formatted,
    Validation Edits are performed. If not, the file
    is rejected and an email will be sent providing a
    link to the report.
  • A single format error will result in rejection

58
  • Invalid File Format
  • Message Data is not in correct fixed (txt)
    file format.
  • Message Data is not in correct delimited
    (csv/tab) file format.

59
  • Validation (Hard) Edits and Errors
  • Definition Identification of submissions where
    invalid data has been used for a field where
    specific values or a domain has been specified
  • Purpose To ensure that the data stored in the
    Data Repository can be interpreted according to
    agreed upon rules
  • Result Any validation error in any submission
    will cause the entire file to be rejected.
    However, all validation errors will be identified
    up to a threshold
  • High Error Rates currently 1000 or more errors
    identified will result in the termination of
    validation edit processing
  • If a file contains no validation errors, any
    applicable reasonability edits will be performed.
    All submissions in the Transmittal will be
    loaded into the Staging Database.

60
  • Most Common Validation Error 1
  • Message The Category Set combination is invalid
    for Table Type ltTable Type Namegt for the reported
    ltcount or subtotalgt.
  • Cause A count or sub-total for an invalid
    combination of Categories has been provided. For
    example, a Membership student count was provided
    only for grade level, where the Race/Ethnicity
    and Gender are also required. In the case of a
    Membership sub-total, a subtotal was provided for
    all 3 Categories (Grade Level, Race/Ethnicity and
    Gender) where each sub-total required only
    contains 2 Categories each.
  • Resolution Refer to the appropriate file
    specifications document for the correct Category
    Sets and their valid combinations for a count or
    subtotal for the table type.

61
  • Most Common Validation Error 2
  • Message "Submitted State ID is not found in
    EDEN."
  • Cause An attempt was made to submit metric
    information (e.g., Membership) for an LEA or
    school that does not exist in the EDEN Staging
    Database.
  • Resolution Verify that the State School
    Identifier in the membership file submission is
    correct. If the State Identifier in the
    membership file is correct, submit a directory
    record for the school and then resubmit the
    membership file. Refer to the EDEN Workbook for
    a discussion on "Relationships Among Files."

62
  • Most Common Validation Error 3
  • Message "The Category Code ltvaluegt, which was
    submitted for the reported ltTable Type Namegt, is
    not a Permitted Code."
  • Cause An invalid Category Permitted Value is
    provided.
  • Resolution Refer to the appropriate file
    specifications document for the permitted code
    values that can be reported for each Category in
    a Category Set for the data group in error.

63
  • Most Common Validation Error 4
  • Message The State Agency Number is not found in
    EDEN Directory.
  • Cause The State Agency Number in a submission
    file cannot be found in the EDEN data base.
  • Resolution Ensure that the State Agency Number
    for the FIPS State Code being submitted is
    correct.

64
  • Most Common Validation Error 5
  • Message Sub-total missing for ltTable Type Namegt
    Table Type.
  • Cause A required sub-total is not provided for a
    count type.
  • Resolution Refer to the appropriate file
    specifications document for the correct Category
    Sets and their valid combinations for a sub-total
    for this Data Group.

65
  • Most Common Validation Error 6
  • Message Field DisplayName contains value that
    is not part of the valid set.
  • Cause The permitted value specified in the
    displayed field name is not valid.
  • Resolution Correct the permitted value in error.
    Refer to the appropriate file specifications
    document for valid permitted values.

66
  • Most Common Validation Error 7
  • Message The table type abbreviation does not
    exist.
  • Cause An invalid Table Type Name was provided in
    the Header Record.
  • Resolution Ensure that the Table Type Name is
    not misspelled. (e.g., MUMBER instead of
    MEMBER). In Fixed Format (.txt) files, ensure
    that the Table Type Name is left-justified in the
    field. Refer to the File Format Specifications
    for the Table Type being submitted for the valid
    Table Type Name.

67
  • Most Common Validation Error 8
  • Message The Grand Total for ltTable Type Namegt
    is less than the sum of its parts.
  • Cause A grand total was provided that was less
    than the sum of the student counts for the
    Category Set being reported. For example, the
    grand total for the MEMBER Table Type is a total
    of the Grade Level, Race Ethnicity and Gender
    student counts for each education unit being
    reported (i.e., a grand total is reported for
    each education unit and must be equal to or
    greater than the sum of the student counts for
    that education unit).
  • Resolution Correct the grand total for the Table
    Name and Category Set in question and resubmit
    the file.

68
  • Most Common Validation Error 9
  • Message Grand Total is missing for ltTable Type
    Namegt Table Type.
  • Cause A required grand total is not provided for
    a count type.
  • Resolution Refer to the appropriate file
    specifications document for the correct Category
    Sets and their valid combinations for a grand
    total for this Data Group.

69
  • Most Common Validation Error 10
  • Message The LEA ID is invalid for this State.
  • Cause In a submission for an LEA, the
    combination of State FIPS Code, State Agency
    Number and LEA State Identifier cannot be found
    in the EDEN Staging Database.
  • Resolution Add the LEA Directory information as
    appropriate and then resubmit the school
    Directory.

70
  • Reasonability Edits
  • Definition Identification of submissions where
    a questionable data has been submitted
  • Purpose While the data may fall outside normal
    expectation there may be reasonable explanations
    for it. An explanation is supplied to save
    future communication between ED analysts or other
    interpreters of the same data in the future
  • Result
  • All submissions in the file are loaded into the
    Staging Database but those submissions where
    reasonability edits have failed must be
    explained.
  • Such submissions will not be transferred to the
    Data Repository until a reasonable explanation
    has been approved

71
  • Reasonability Warning 1
  • Message School Membership count changed
    significantly from last year.
  • Cause The student membership school count for a
    count of more than 10 students has an increase or
    decrease of more than 50 in the student count
    for that category group.
  • Resolution The State should verify that the
    count in question has been reported correctly and
    provide an explanation accordingly.

72
  • Reasonability Warning 2
  • Message School Membership subtotal changed
    significantly from last year.
  • Cause The submitted student membership sub-total
    for a school has an increase or decrease of more
    than 25 than the prior year subtotal.
  • Resolution The State should verify that the
    sub-total in question has been reported correctly
    and provide an explanation accordingly.

73
  • Reasonability Warning 3
  • Message LEA Membership count changed
    significantly from last year.
  • Cause The student membership LEA count for a
    submitted count of more than 25 students has an
    increase or decrease of more than 25 than the
    prior year student count for that category group.
  • Resolution The State should verify that the
    count in question has been reported correctly and
    provide an explanation accordingly.

74
Errors and Edits
  • Questions concerning specific errors?
  • Questions concerning problems or concerns with
    regards to transmittals?

Open Session
75
SEA Tools
  • What the tools, programs your states uses to
    build your EDEN Files?
  • Share any failures?
  • Share any successes?

Open Session
76
Recap Objectives
  • Provided a basic understanding of ESS and its
    sub-applications.
  • Provided an understanding of the data file
    submission lifecycle.
  • Provided an in-depth knowledge of the File
    Specifications and how to read them.
  • Provided time to learn from other SEA Partners
    concerning internally used tools and processes.
Write a Comment
User Comments (0)
About PowerShow.com