Title: Steward system
1Steward system
- Seokjoo Lee
- University of Floridaseokjoo_at_ufl.edu
2Table of contents
- System configuration
- Operation
- Management and expansion
- Migration
- Hotter program
3(No Transcript)
4Steward System configuration
5Steward data flow
Every day, TSS data are delivered from D2, D4 and
D6 into Steward. Data sizes are 100MB, 230MB and
110MB each.
6Steward system configuration-ETL(1)
- FTP server
- - Filezila FTP server version 0.9.30 beta
version is installed to serve the TSS data
transfer from each district. - - C\Steward_ftp is assigned for the FTP storage
- - Each district can access C\Steward_ftp\DistrctX
folder for its file transfer. - - Host IP cdw880.ce.ufl.edu
- Administrator User ID cdw
- Administrator Password trc513
- District User ID cdw_user_d1 cdw_user_d8
- District Password steward08
- Data backup
- - 2BrightSpark SyncBack version 3.2.19.0 is used
to backup FTP folders into local file storage
daily - - Two copies of backup are created into two local
file storages (external hard drive 1GByte each,
USB interface)
7Steward system configuration-ETL(2)
- Data transformation
- - Custom-build SunETLUtility is used to transform
the raw TSS data into the aggregated Steward data
format - - SunETLUtility can be executed in command-line
or stand-alone program. - - SunETLUtility needs the following subfolders
- FacilityDataFolder Input facility data folder
- ToConvertFolder Input TSS data folder
- ConvertedDataFolder Output folder for
converted data/report - GroupDataFolder Output folder for group data
- - SunETLUtility command line format
- SunETLUtility /dDx /iInputDataFolder
/oOutputDataFolder " - example) SunETLUtility /d"D2" /i"c\D2Data"
/o"c\Steward\Input"
8Steward system configuration-DB(1)
- Steward database tools
- - Oracle Database 10g 10.2.0.1.0 Enterprise
edition - - Oracle Warehouse Builder 10g 10.2.0.1.311 GUI
interface - - Oracle Workflow 2.6.4.0.0 Scheduler
- Steward database/web server
- - Dell Optiplex GX620
- - Pentium 4 3.2GHz/ 4GB Ram/Windows XP3 SP3
- - Seagate 160GB HDD (SATA I) for Oracle SW
- - Seagate 500GB HDD (SATA II) for database
- 1Oracle patch version 10.2.0.3 is expected to be
used. metalink
9Steward system configuration-DB(2)
- Steward DB tables
- - Design Strategy for Steward database
- -- Keep the simple architecture to avoid the
costly joins - -- Allow duplicate data tables/views rather
than on-the-fly computations - - Fact tables (Cubes)
- -- Station-level tables TSS_5min, TSS_15min
and TSS_1hr - -- Lane-level tables TSS_5min_lane,
TSS_15min_lane and TSS_1hr_lane - - Dimension tables (Dimension)
- -- date_ref, time_ref, tss_station
10Steward system configuration-DB(3)
- - Materialized views
- -- SYSTEM_LANE_5MIN, SYSTEM_LANE_15MIN,
SYSTEM_LANE_1HR - -- VOLUME_MAP_5MIN, VOLUME_MAP_15MIN,
VOLUME_MAP_1HR - -- TRAFFIC_COUNTS_5MIN, TRAFFIC_COUNTS_15MIN,
TRAFFIC_COUNTS_1HR - -- PM_5MIN, TT_REL_5MIN
- -- MAX_FLOW_5MIN, MAX_FLOW_15MIN, MAX_FLOW_1HR
- -- EFF_DET_LEN_5MIN, EFF_DET_LEN_15MIN,
EFF_DET_LEN_1HR - - Materialized views are partitioned by Date and
District and indexed by Station_ID, Hour and Date - - Fact tables will be partitioned and indexed on
next major revision - - Materialized views are refreshed on demand.
They will be updated into on-commit on next major
revision
11(No Transcript)
12(No Transcript)
13(No Transcript)
14(No Transcript)
15Steward system configuration-Web(1)
- Web design concepts
- - Minimize the on-the-fly computations by using
pre-developed materialized views - - Minimize the complex queries on the client
side by using the query functions where the
materialized are not applicable. - Steward web tools
- - Microsoft Internet Information Services 5.1
for web server - - Text-based ASP, JavaScript, HTML
16Steward system configuration-Web(2)
- Steward web architecture
- - Home
- - Overview
- - Resources
- - Maps
- - District data
- -- TSS Facility-Level Reports
- -- TSS Section-Level Reports
- -- TSS Station-Level Reports
- -- TSS ETL Reports
- -- TVT Station-Level Reports
- -- TVT ETL Reports
- -- Report Archive
17Steward system configuration-Web(3)
18Steward operations -ETL process (1)
- D2, D4 and D6 pushes archived TSS data into
Steward FTP server daily at 3AM, 2AM and 2AM. - - Average file sizes are 15MB, 30MB and 15MB
in compressed file format (100MB, 230MB and 110MB
in uncompressed format) - - File location c\Steward_FTP\DistrictX
- File backup and uncompressing
- - Backup location u\DISTRICT_DATA\Districtx\r
awdata - - Uncompressing location c\Program
Files\TRC\SunETLUtility\ToConvert - Data transformation
- - Raw traffic data are processed using custom
program (SunETLUtility) - - Process the data files
- -- Assign Station/Lane ID
- -- Aggregated 1min, 5min, 15min and 60min
lane/station data - -- Run the basic quality check
19Steward operations -ETL process (2)
- -- One input data file would generate one
converted file, one daily report, eight
aggregated data file. For example,
TSS-03032009--1.dat from District 2 would
generate the following output files - ConvertedData\D2_TSS-03032009-0.csv
- ConvertedData\D2_TSS-03032009-DailyReport.
Log - GroupData\D2_TSS-03032009-1S.csv
- GroupData\D2_TSS-03032009-1L.csv
- GroupData\D2_TSS-03032009-5S.csv
- GroupData\D2_TSS-03032009-5L.csv
- GroupData\D2_TSS-03032009-15S.csv
- GroupData\D2_TSS-03032009-15L.csv
- GroupData\D2_TSS-03032009-60S.csv
- GroupData\D2_TSS-03032009-60L.csv
- - Report process results to Steward Admin and
users - - Archive all the processed files and prepare
the data loading - -- Backup location u\DISTRICT_DATA\Distr
ictx\processeddata\ConvertedData - u\DISTRICT_DATA\Districtx\processeddata\Gr
oupData - Data backup
- - Steward_FTP folder are backup into U drive at
4AM daily - - U drive are back-up into the identical V drive
at 5AM daily
20Steward operations -ETL process (3)
- Data loading
- - Oracle Process Flow modules are used to load
TSS data files - - For one raw TSS file, six processed data
files are loaded (5min, 15min, 60min station and
lane data) - - Oracle Process Flow modules can be
executable manually using Control Center or
automatically using Oracle Schedules
21Steward operations -ETL process (4)
- After data uploading, data changes can be
verified from their table row counts.
22Steward operations -ETL process (5)
- Database performance can be monitored from Oracle
Enterprise Manager
23Steward operations - Current status
- Traffic data from districts
- Daily data are not received, data CRC error or
our ftp server issues.
District TSS Data Range Missing days Roadways Data issues
2 6-28-07 to 2-11-09 6 days I-95 I-295
4 5-1-08 to 2-1-09 14 days I-95, I-595 I-95 NB has two HOT lanes 118 New stations are added at I-75
6 8-1-08 to 2-1-09 7 days SR-826, I-195, I-75 I-95 data are not stabilized yet.
24Steward operations -Web
25Management plan
- Daily data loading
- - Daily transform report by email
- DB performance
- - Oracle enterprise manager console
- Web statistics
- - WebLog Expert Lite version 5.6
26Expansion scenario (1)
- New data for CDW three possible scenarios
- Limited information on district
changes/information - - New HOT lanes in District 4 and 6
- - New facility (I-75) in District 4
- - Limited information (I-295 milepost) in
District 2
27Expansion scenario (2)
- Expansion might cause the performance issues on
preprocessing, DB performance or web interface. - Unexpected system configurations could impact the
system modifications - - Two HOT lanes in District 4 and District 6
- - HOT lanes in NB are operational but SB will
be implemented later - - 30sec aggregation interval in D5
28Expansion scenario (3)
- New analysis report
- - New materialized views needs be created.
- - Web interface needs be revised
29Statewide traffic CDW at other states
- Rob Hranac at Berkeley Transportation Systems
defines the progress of Archived Data User
Service (ADUS) as follows - - Data ? Reports (for decision support system)
? Application (web 2.0) ? Prediction ? Control
Automation - Most of states are in the stage 2 (Report) and
move toward stage 3 (Application) - PeMS at California are developing integration of
traffic data and incident data on GIS. - PeMS are trying to integrate the transportation
simulation model with the traffic data - WisTransPortal integrated traffic data database
with lane closure system - U of Maryland developed traffic data warehouse
for Washigton DC, Maryland and Virginia - U of Maryland developed the web-based analytical
tools for the traffic data and incident data
30Migration plan
- Web service
- - Web sources are server independent and ready
to migrate - Database
- - Migrate Oracle warehouse builder metadata
- - TSS data will be re-loaded rather than table
migration - - TSS data tables will be reconfigured for
Index and partitions - - Refresh method for TSS materialized views
will be reconfigured from on-demand to on-commit - ETL process
- - ETL process will be customized for new system
31Hotter program
- Utility program for analyzing the operation of
High Occupancy Toll (HOT) lanes within SunGuide
traffic management systems in Florida. - I-95 in D4 and D6 has HOV lanes on SB and HOT
lanes on NB - Input data are generated from Steward manually
- Hotter program generates analysis report on
HOV/HOT lanes in comma-delimited (CSV) file
format - Results may be plotted if desired from the
analysis files