Title: D-1
1Permits Over the Internet
2Paperless Permitting
- Issuing engineering permits has been a slow and
paper-intensive process
- At the City of Los Angeles, issuing engineering
permits is now a paperless, Internet-based
process that can complete a full circle of plan
check redlining / plan update in less than one
hour
- And all of the records are available in digital
form for use in the future
3Issuing Permits on the Internet
- To see how the City of Los Angeles issues permits
on the Internet, go to
- http://eng.lacity.org/
- On the Permits drop down list, click on Demo B
Permits. This will take you to
- http://eng.lacity.org/demos/bpermits/start.htm
- Clicking on customer will take you to
- http://eng.lacity.org/demos/bpermits/index.htm
- Choose from the options to explore the system.
4Permit Via the Internet
5Plan Check Over the Internet
6Plan Check Over the Internet
- At http://eng.lacity.org/demos/bpermits/index.htm
- There is a demonstration of how
- Raster Scanned and AutoCAD drawings can be
- Uploaded
- Checked
- Returned for changes
7Plan Check Via the Internet
8The Human Genome: A Record (1)
- The human genome has been decoded
- By Celera Genomics of Rockville, Maryland
- By the year 2004, doctors will treat cartilage
injuries with cartilage grown from cartilage
precursor cells from the patient
- (Robert Nerem of Georgia Tech)
- Within 10 years complex organs such as hearts
could be grown
- (Buddy Ratner of the Univ. of Washington in
Seattle)
9The Human Genome: A Record (2)
- The drug Pleconaril kills 169 viruses
- Including viral meningitis, the flu, and polio
- Could be in drugstores around the end of 2000
- Monkeys were cloned in 1999
- Genes to add iron and beta-carotene, the
precursor to vitamin A, to rice (International
Rice Research Institute)
- Eliminate iron-deficiency anemia, the world's
worst nutrition disorder, which affects nearly 2
billion people
- Eliminate vitamin A deficiency, the world's
leading cause of blindness and a malaise that
affects as many as 250 million children
10Index Boundary
- Similar to the division between an Archive and a
Museum
- ChemAbstracts, Columbus Ohio needed to separate
chemicals from genes
- The dividing line was reported to be a molecular
atomic weight of 100 thousand.
- http://www.CAS.org/
11Communications
12Workflow
13- Document Routing
- Structured vs Ad Hoc in Same System
- Proprietary vs Commercial Email
- Is Routing Setup Graphical?
- Can You Find In-Process Documents?
- Even when you delete the node they are enqueued
for?
14Workflow Functions (1)
- Workflow Metadata Management
- Synchronization Points
- Ad Hoc Activities
- Purge, Archive, Delete
- Names and Roles
- Error Reporting and Control
- Security, Locking, Process Integrity
15Workflow Functions (2)
- Subprocess (Workflow Subroutine)
- Transition Condition
- Workflow Process / Activity / Instance
- Manual Process / Activity / Instance
- Worklist (Queue)
- Workflow Control Data
- Workflow Monitoring
16Workflow (1)
- Dependent on Email System Integrity, or
Independent Database
- Client Ability to Interrupt One Document Process
to Handle a Higher Priority Document
- Ability to Find a Document Anywhere in the
Workflow System
17Workflow (2)
- GUI Workflow Editor
- Ability to Change Workflow On-the-Fly
- Automatically Identify Documents That No Longer
Have a Destination in the System
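The workflow requirements above (find any document, change the routing on the fly, identify documents that no longer have a destination) can be sketched in a few lines. This is only an illustration under stated assumptions: the `Workflow` class, its method names, and the "ORPHANED" label are hypothetical and do not describe any particular workflow product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical routing table: node name -> queue of document ids."""

    queues: dict[str, list[str]] = field(default_factory=dict)
    orphans: list[str] = field(default_factory=list)

    def add_node(self, node: str) -> None:
        self.queues.setdefault(node, [])

    def enqueue(self, node: str, doc_id: str) -> None:
        self.queues[node].append(doc_id)

    def delete_node(self, node: str) -> None:
        # Documents queued for a deleted node are kept, not dropped,
        # so an administrator can still find and reroute them.
        self.orphans.extend(self.queues.pop(node))

    def find(self, doc_id: str) -> str | None:
        """Return the node holding doc_id, 'ORPHANED', or None if unknown."""
        for node, docs in self.queues.items():
            if doc_id in docs:
                return node
        return "ORPHANED" if doc_id in self.orphans else None

    def reroute(self, doc_id: str, node: str) -> None:
        """Move an orphaned document to an existing node."""
        self.orphans.remove(doc_id)
        self.queues[node].append(doc_id)
```

A document enqueued for a node that is later deleted stays findable as "ORPHANED" until it is rerouted from an administrative terminal.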
18Workflow (3)
- Manage Queue Length
- Manage Length of Time Each Document Spends in the
System
- Reroute Documents from an Administrative Terminal
19The Internet
2040 Million Internet Hosts in Major World Cities
- 140 million users, 700 million webpages,
206 of 246 countries and territories
21Change: Learning to Use the Internet
- Even the people who make the most use of the
Internet only use a part of its capabilities.
- These power users make up far less than one
percent of the population.
- Huge changes would occur if the lives of even ten
percent of the population became as
Internet-centric as the lives of these power
users.
- But, as great, and as fast, as these changes are
sure to be, there are bigger and faster changes
coming.
22The Internet Will Change Faster Than We Can Learn
to Use It
- Because, even as we learn to use it, the Internet
itself is changing.
- Terabit per second fiber optic transmitters and
receivers will be introduced by Nortel Networks
in 2001
- In five years, DVD quality video-telephony could
be as free and as available as email is today.
- If this seems impossible, remember that five
years ago, free worldwide email seemed
impossible.
23Displacements
- With free video-telephony, the jobs of
receptionists in offices may be exported to
locations around the world.
- Video stores will likely disappear.
- Tractor drivers may work from other countries.
- The 95 percent drop in rural populations may
occur again.
- A simple change in the quality-of-service
Internet protocol, a programming change, can
provide free CD quality telephony worldwide with
existing Internet hardware.
- The existing telephony infrastructure will become
redundant
24Internet Delivery: the Big Picture
- January 1999: Internet carries 2 PetaBytes per
week
- 2 thousand TeraBytes, 2 million GigaBytes, 2
billion MegaBytes
- Doubles every 6 months
- Free
- DSL can work with existing phone lines at up to 8
Megabits (1 MegaByte) per second. (2 minutes per
box)
- ATM 300 pages per second (5 Boxes per Minute)
- OC192 2 file cabinets per second (8 boxes per
second)
- Dark Fiber 4 million boxes per second
25Offshore Clerks
- Video on demand over the Internet in 3 years.
- Free worldwide video-phone calls.
- Offshore indexing of scanned documents
- Offshore receptionists via video-phone
- Offshore system administration
- Offshore Visual Basic programming, etc.
26How the Internet Works
27Images per Second
28Internet Delivery
- Modem (56 Kbits/s) 3 pages per minute
- ISDN (128Kbits/s) 10 pages per minute (complex
to do)
- Cable (TV) Modem (500 Kbits/s) 1 page per second
- DSL (Digital Subscriber Line) (8 Mbits/s) 20
pages/s
- T1 3 pages per second
- T3 100 pages per second
- LAN (10Base T) 2 pages per second
- LAN (100Base T) 20 pages per second
- ATM 300 pages per second
- OC192 8 boxes per second (2 file cabinets per
second)
- Dark Fiber 4 million boxes per second
- (1 million file cabinets per second)
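The delivery rates above follow from dividing the link speed by the size of one page image. The sketch below assumes a 50 KB compressed page image, a value chosen only because it reproduces the DSL figure of 20 pages per second; the slide does not state a page size, and the other rows imply different sizes or protocol overhead.

```python
PAGE_BITS = 50_000 * 8  # assumed size of one compressed page image, in bits


def pages_per_second(link_bits_per_second: float, page_bits: int = PAGE_BITS) -> float:
    """Ideal page throughput of a link, ignoring protocol overhead."""
    return link_bits_per_second / page_bits


# Nominal link speeds taken from the list above.
links = {
    "Modem": 56_000,
    "ISDN": 128_000,
    "Cable modem": 500_000,
    "DSL": 8_000_000,
}

for name, rate in links.items():
    print(f"{name}: {pages_per_second(rate):.2f} pages per second")
```

Changing `PAGE_BITS` shows how sensitive the pages-per-second figures are to the assumed scan resolution and compression.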
29DWDM
- (Dense Wavelength Division Multiplexing)
- Northern Telecom markets 6.4 Terabit per second
fiber optic transmitters and receivers.
- Works with existing fibers
- 5 million TV channels per fiber
- 100 million phone calls per fiber
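The per-fiber figures can be checked with simple division. The sketch assumes one phone call occupies 64 Kbps (the standard digitized voice channel); the per-channel TV bit rate is not stated on the slide, so it is derived here rather than assumed.

```python
FIBER_BPS = 6.4e12        # 6.4 Terabit per second, from the slide
VOICE_CALL_BPS = 64_000   # one digitized phone call, assumed 64 Kbps
TV_CHANNELS = 5_000_000   # channels per fiber, from the slide


def calls_per_fiber(fiber_bps: float = FIBER_BPS) -> float:
    """Number of simultaneous 64 Kbps calls the fiber could carry."""
    return fiber_bps / VOICE_CALL_BPS


def implied_tv_channel_bps(fiber_bps: float = FIBER_BPS) -> float:
    """Bit rate per TV channel implied by the slide's channel count."""
    return fiber_bps / TV_CHANNELS


print(f"{calls_per_fiber():,.0f} phone calls per fiber")
print(f"{implied_tv_channel_bps() / 1e6:.2f} Mbit/s per TV channel")
```

The call count matches the slide; the implied 1.28 Mbit/s per TV channel suggests the slide assumes heavily compressed video.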
30Topologies
31Topologies
32Geographic Extents
- SANs (Storage / System Area Network)
- Computer Room
- LANs (Local Area Network)
- Building
- MANs (Metropolitan Area Network)
- Campus or Metro Area
- (Now Becoming Intranets)
- VANs (Value Added Network)
- World (Becoming Part of the Internet)
33Internet
34Protocol Stack
35Images Across the Internet
- At the Blackboard
- Internet, Intranet, Extranet, and Trust
- The seven layers of protocols (the protocol
stack)
- Why 7 different protocols are in use
simultaneously
- And, how one protocol can be substituted for
another
- Or, simulated via tunneling
- The structure of the Internet and the protocol
stack
- Why the Internet is called an Inter - Net
36Protocol Stack
- Application
- Presentation
- Session
- Transport
- Network
- Link
- Physical
37Internet Etc.
- Internet
- Made up of linked independent networks
- Designed to survive nuclear Armageddon
- Intranet
- Private Internet (under owner's control)
- Can use all internet software and metaphors
- Extranet
- Linked Intranets
- Linked (shared) security, trust
38Intranet
- Use the Internet as digital POTS
- Does not allow use by outsiders
- Requires a firewall at every Internet interface
- Will become the universal method of private
network data access and transfer
39Extranet
- Linked Intranets
- Requires cooperative, trusting parties
40IP (Internet Protocol)
- TCP/IP (Transmission Control Protocol/IP)
- IP Address (Internet Protocol)
- 192.0.0.0
- Four numbers, each from 0 to 255 (2^8)
- (Base 256)
- Specifies one of 4 billion possible addresses
- (2^32)
- Much too small, being extended to 2^128.
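The base-256 reading of an IP address can be tried directly with Python's standard `ipaddress` module. This is a small illustration only; the helper `octets_to_int` is a hypothetical name introduced here.

```python
import ipaddress


def octets_to_int(octets: list[int]) -> int:
    """Read four 0-255 numbers as the digits of one base-256 number."""
    value = 0
    for octet in octets:
        value = value * 256 + octet
    return value


addr = ipaddress.IPv4Address("192.0.0.0")
octets = list(addr.packed)       # the four numbers of the dotted form

ipv4_space = 2 ** 32             # about 4 billion addresses
ipv6_space = 2 ** 128            # the extended address space

print(octets, octets_to_int(octets), int(addr))
print(f"{ipv4_space:,} IPv4 addresses")
print(f"{ipv6_space:.3e} addresses with 128 bits")
```

The hand-rolled base-256 conversion and the library's `int(addr)` agree, which is all that "four numbers, each from 0 to 255" means.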
41ATM
- Asynchronous Transfer Mode
- Required underneath to support smooth (not
choppy) personal communications
- Voice and video telephony
- Video on demand which will replace
- Video stores
- Cable TV
- Broadcast TV
- Security cameras
42HDTV ATM Studio WAN Switch
- FORE Systems and Tektronix demonstrated
uncompressed HDTV (High Definition TeleVision)
switching at NAB 99 (National Association of
Broadcasters), in Las Vegas, NV, April 19, 1999
for studio LANs (Local Area Networks), WANs (Wide
Area Networks) and the Internet. www.FORE.com
- A FORE ATM (Asynchronous Transfer Mode) switch
with OC-48c (Optical Carrier 2.488
Gigabit/second) ports transported real-time,
uncompressed, HDTV in the industry-standard 1.5
Gigabit/second stream, SMPTE292M (Society of
Motion Picture and Television Engineers, SMPTE).
- The Tektronix video edge device is a full 10-bit
digital video transport device.
- The FORE ATM Switch is a 40 Gigabit/second,
non-blocking, full-duplex switch, capable of
switching 20, independent, uncompressed, HDTV
signals.
- With this great headroom, (compressed HDTV is
much easier to send than uncompressed HDTV),
compressed HDTV is very likely to arrive into the
home, on demand, over the Internet, on a fiber
optic link, within a decade.
43CSMA/CD (Ethernet)
- CSMA/CD Carrier Sense Multiple Access with
Collision Detection
44Ethernet Segmenting and Routing
- Being replaced by switched ethernet.
- Every NIC (Network Interface Card) on a LAN can
hear everyone else and has to look at their
traffic.
- By splitting or segmenting a LAN, you cut the
traffic in half on each half of the LAN.
- To pass only the messages that need to cross
the split, a router is used.
45Gateways
- A gateway lets people in one network into the
secured environment of another network.
- A firewall is part of an industrial strength
gateway.
46Spatial Diversity
- For communications
- Do not route your backup communications link
through the same conduit as your primary link.
- For data storage
- Do not store your backups next to your computer.
47Bandwidth
48Bandwidth
- Bandwidth is literally the frequency range of a
transmission medium.
- It is taken to mean the amount of data that can
fit through a communications link.
- People can be said to have a fixed bandwidth that
can be used to handle minor details (that could
be eliminated by consistent system design) or for
useful work.
49A Raster Image is a Digital Analog
- An analog is a replication of something in
another medium.
- A raster is a replication of a page in a digital
medium.
50All Digital Signals are Analog
- Digital ones and zeros are mathematical concepts.
- In the real world ones and zeros are voltage or
current levels that can be distinguished.
- We store and transmit our digital analogs using
analog digital mediums.
- As electronics technology enters the nano-world
of quantum electronics, ones and zeros may be
represented by discrete quanta.
51Finding the Bits
- Recovery of the Clock
- From an accurate clock (causes slips)
- Synchronous
- From a transition in the carrier (Start Stop
Bits)
- Asynchronous
- From the signal (limit on the maximum value run)
- Isochronous
52Acronyms, Palindromes, and Inverted Eponymy
- Modulator, Demodulator (Modem)
- Radio Detecting and Ranging (Radar)
- Light Amplification by Stimulated Emission of
Radiation (Laser)
- Historical Eponymy in Computing
- Virtual Memory System (for the Virtual Address
eXtension computer) for Windows New Technology
- HAL for International Business Machines
53C = Frequency x Wave Length
- Microwaves are in the GHz range.
- Fiber Optics Use 1300 nm (nanometer) Light
- C is the speed of light in E = MC^2
- C is a universal constant that is a part of the
definition of the universe.
- C is about 300 MMeters per second.
54C = Frequency x Wave Length
- Microwaves
- (10^-1) x (3 x 10^9) = (3 x 10^8)
- (100 mm) x (3 GHz) = 300,000 KM / sec
- C is about 300 MMeters per second = 300,000 KM /
sec
- Fiber Optics (1300 Nanometers)
- (1.3 x 10^-6) x (2.3 x 10^14) = (3 x 10^8)
- (1,300 nm) x (230 THz) = 300,000 KM / sec
- And, a rainbow is one octave
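The same arithmetic can be run in code. The sketch uses the exact defined value of C; the visible-light limits of 380 nm and 750 nm used for the "one octave" check are an assumption (commonly quoted approximate bounds), not values from the slides.

```python
C_M_PER_S = 299_792_458  # speed of light in vacuum, metres per second


def frequency_hz(wavelength_m: float) -> float:
    """C = frequency x wavelength, solved for frequency."""
    return C_M_PER_S / wavelength_m


microwave_ghz = frequency_hz(0.1) / 1e9      # 100 mm wavelength
fiber_thz = frequency_hz(1300e-9) / 1e12     # 1300 nm light

# An octave is a doubling of frequency. Assumed visible range: 380-750 nm.
violet_hz = frequency_hz(380e-9)
red_hz = frequency_hz(750e-9)
octave_ratio = violet_hz / red_hz

print(f"100 mm  -> {microwave_ghz:.2f} GHz")
print(f"1300 nm -> {fiber_thz:.0f} THz")
print(f"violet / red frequency ratio: {octave_ratio:.2f}")
```

The ratio comes out close to 2, which is the sense in which a rainbow spans about one octave.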
55Future Networking
56Meeting the Fiber When It Arrives at Your House
- T1 Was Invented for Rural Areas In the 1960s
- All modem transmissions have been digitized at 64
Kbps for 30 years (even for 300 bit per second
modems)
- Cable Modems Are 10 Mbps (Mega-bits per sec.)
- DirecTV Channels are 25 Mbps
- OC192 SONET Fiber Interface is 10 Gbps
- SONET (Synchronous Optical NETwork)
- Fiber to the Home Will Be Multi-Gigabits per sec.
57Telco
- Telco (Telephone company)
- POTS
- Plain Old Telephone Service
- SONET
- Synchronous Optical NETwork
- Asymmetric Internet
- Twisted pair to carry mouse clicks (Slow)
- DirecTV to carry images and video (Fast)
58The Spanish Armada A Story
- In 1588 the Spanish Armada sailed for England.
- The English spent a considerable sum building a
series of signal relay towers from the coast to
London.
- The investment was wildly successful. It carried
a single bit: "The Spanish are Coming!"
- If communications costs have been dropping
rapidly for hundreds of years, then it should
have cost millions to send a single bit in the
16th century.
59Document Management
60Documents
- We got 'em
- We stored 'em
- We preserved 'em
- We studied their format
- We studied their structure
- We recorded their metadata
- How do we manage 'em?
61Managing 'Em
- Records Management
- Libraries
- Archives
- Museums
62Records Management
63RIM
- Records and Information Management
- The problem at Comdex was that no one visiting
the ARMA booth knew what ARMA meant
- And, none of the ARMA banners or signs spelled it
out.
- ARMA members need to get out there where no one
knows who they are.
- (Association of Records Managers and
Administrators, International)
64CD ROMs Are 20 Years Old The Internet is 30
Years Old
65Here We Are In The Third Millennium
66The Third Millennium Started in 1994
67Dionysius the Short Was Off by 6 Years When He
Established the Calendar for the Catholic Church.
68We Base A Lot on the Records We Keep
69Buck Rogers in the Twenty-Fifth Century
- What can we learn from the past?
70Preparing for RIM in the Thirteenth
Millennium (RIM: Records and Information
Management)
- Of interest in the Silver State Chapter of ARMA
International (Association of Records Managers
and Administrators)
71Records Management
- Views the Recorded Activity of the Corporation as
a Whole
- Assesses All Constituencies' Needs and
Requirements for the Records
- Reviews Records for Security, Integrity, and
Accessibility
72Records Managers in the Third Millennium (The
View from Inside)
- Professionalization
- What was accidental is now a profession
- Diversification
- Records now include voice and email, databases,
and more
- Knowledge Management
- Integration
- Records managers work with IS and top management
- Background of technology
- Technological change continues to accelerate
- Education is the key
- Professional organizations ARMA
- Management, marketing, software
73Records Management Time Frame
- Records Managers Must Accommodate Changes in
Technology from the Time a Record is Created
Until the Record is Destroyed
74History of Records Management (1)
- Dr. Nathaniel S. Rousenau invented vertical
filing (File Cabinets).
- First General Records Disposal Act passed by US
Congress in 1889
- US Bureau of Efficiency Established 1912.
- US National Archives Founded 1934.
- First Records Disposition Schedule by the US
National Archives in 1943
75History of Records Management (2)
- ARMA International (Association of Records
Managers and Administrators) founded in 1956 as
the American Records Management Association.
- ICRM (Institute of Certified Records Managers)
founded in 1975. (Administers the CRM exam.)
76Records Management Program
- Records Survey
- Records Inventory
- Records Retention Schedule
- All Types of Records
- Origin
- Physical Class
- Function
- Organizational Relationship
- Applicable Regulations
77Lifecycles
78Forms Lifecycles
- Form Lifecycle
- Form Instance Lifecycle
79Records Life Cycle
- Creation
- Distribution and Immediate Use
- Storage and Maintenance
- Retention
- Disposition
- Archival Preservation
80Inactive Records
- Bankers Boxes (Records Cartons)
- Shelving
- Barcoding, Scan on Demand
- Inhouse
- Commercial Records Centers
- 10 to 25 cents per month per box
- 2 to 5 dollar retrieval fee per box
81Why Have Records Management? (1)
- Regulatory Compliance
- Business Operation
- Cost Containment
- Monitor New Technology
- Minimize Litigation Risk
- Safeguard Vital Information
82Why Have Records Management? (2)
- Control the Creation of Records
- Support Management Decisions and Planning
- Preserve the Corporate Memory
- Foster Professionalism in Business
83Records Management Barriers
- Records Management does not generate income.
- Records Management is not the organization's
primary business.
- Most Records Management tasks are discretionary.
84Tasks Related to Records Management
- Forms Management
- Mail / Message Management / Internet Management
- Reprographics / Demand Printing / Report
Distribution / COOL
85Summary
- Formats for Preservation in 3 Parts
- Meta data, including indices, is required to
interpret documents
- File format integrity: to open a document file,
you need the application, the OS, and the
computer hardware (Raster format has the longest
life.)
- Preserving the bits: ECCs (Error Correcting
Codes) recover bad bits. Copying restores ECCs.
86ISO 9000
- Quality Standard
- Uses Records to Support Quality Management
- Requires Explicit Procedures for Records and
Information Management
- Manages Organization Wide Quality Including
Sales and Marketing
87Do You Want to Be an ISO 9000 Organization?
(Library, Archives, Museum, Records Center)
(Manufacturer, City, University, Law Firm)
88Libraries
89Libraries
- Libraries are libraries
- Digital technologies are one of the means by
which libraries serve society
90Digital Library Examples
- U of Michigan, The Making of America
- Un-deskewed Raster Images OCR
- at http://www.umdl.umich.EDU/moa/
- UVA (Virginia)
- Collects Electronic Documents (SGML)
- at http://etext.lib.Virginia.EDU/
- Digital Library References (IFLA)
- International Federation of Library Associations
- at http://www.nlc-bnc.ca/ifla/
- Metadata (Dublin Core)
- at http://www.dlib.org/dlib/february98/02weibel.html
91Ptolemaic Epicycles (1)
- All means of expression are being recreated in
digital form.
- All techniques of reproducing expression are
being recreated in digital form.
- The complexity of digital records is growing
along all known dimensions.
92Ptolemaic Epicycles (2)
- The good news is
- The growth is somewhat bounded.
- Current growth in complexity is mostly a product
of converting existing techniques to digital
form.
- When this is completed, only new forms and
techniques will have to be handled.
- These forms and techniques will likely appear at
their historic, much slower rate.
- The current growth is like typesetting all
existing hand written books
93User Links
- Added over time
- Not encouraged by libraries and archives
- Lost with broken links
94Should You Convert to a Common Format
- Limited conversion, avoid migration
- Common Formats
- Raster (G4 - Group 4 Fax)
- HTML (HyperText Markup Language)
- SGML (Standard Generalized Markup Language)
- Book
- Libraries encourage their production
- Libraries do not create books
95Migratory Effects: Are there (Insurmountable)
Problems?
- Just ask Mr. Twelve Feet
- The migrated Mark Twain
- Technology Without an Interesting Name
- TWAIN is a set of drivers for slow scanners.
http://www.TWAIN.org/
- ISIS (Image and Scanner Interface Specification)
is a set of drivers for fast scanners.
http://www.PixTran.com/
- The NBI word processors were Nothing But
Initials.
- It is said that the Archbishop of Seville
believed that all of science and technology was
in the etymology of language.
- To orient a map, place the east at the top.
- The La Brea Tar Pits in Spanish or English
96Geospatial Cataloging
97Geospatial Cataloging
- MARC 21 Format for Bibliographic Data
- (MAchine Readable Cataloging)
- 034 Coded Cartographic Mathematical Data Field
- http://lcweb.loc.gov/marc/bibliographic/nlr/nlr0xx.html
- Supports proximity searching
- E.g. All maps that cover the area within 100
miles (Km) of a City, border, river or a given
point (radius search).
- Proximity searching, like all searching
techniques, is best used in combination with all
other search techniques.
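A radius search over cataloged map extents can be sketched as follows. The `MapRecord` type and the sample extent are hypothetical, and the approach (clamp the query point to the map's bounding box, then measure the great-circle distance) is one simple approximation; it does not handle maps that cross the 180 degree meridian or the poles.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


@dataclass
class MapRecord:
    """Bounding coordinates of a map in signed decimal degrees."""

    title: str
    west: float
    east: float
    north: float
    south: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(rec: MapRecord, lat: float, lon: float, radius_km: float) -> bool:
    """True if any part of the map's bounding box lies within radius_km."""
    nearest_lat = min(max(lat, rec.south), rec.north)
    nearest_lon = min(max(lon, rec.west), rec.east)
    return haversine_km(lat, lon, nearest_lat, nearest_lon) <= radius_km


# Hypothetical catalog entry roughly covering the Los Angeles area.
la_map = MapRecord("Los Angeles", west=-118.7, east=-118.1, north=34.4, south=33.7)
print(within_radius(la_map, 34.05, -118.24, 100))   # query point inside the map
print(within_radius(la_map, 32.72, -117.16, 100))   # query point well to the south
```

In a real catalog this test would be combined with the other search criteria, as the slide recommends.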
98Maps
- Map
- surface of the earth or celestial body
- Celestial chart
- Map of celestial bodies above an area of the
surface of earth or celestial body (rare).
- Opposite of a map of a surface
99034 Coded Cartographic Mathematical Data Field
- a Category of scale
- b Constant ratio linear horizontal code
- c Constant ratio linear vertical scale
- d Coordinates westernmost longitude
- e Coordinates easternmost longitude
- f Coordinates northernmost latitude
- g Coordinates southernmost latitude
- h Angular scale
- j Declination northern limit
- k Declination southern limit
- m Right ascension eastern limit
- n Right ascension western limit
- p Equinox
- s G-ring latitude
- t G-ring longitude
- 6 Linkage
- 8 Field link and sequence number
100034 Field
- Indicator
- First
- 0 Scale indeterminate/No scale recorded
- 1 Single scale
- 3 Range of scales
- Second Type of Ring
- Blank not applicable
- 0 Outer ring (e.g. Edge of a map)
- 1 Exclusion ring
- (e.g. Area taken up by legend of map covers
part of map)
101a Category of Scale
- a Linear scale
- b Angular scale
- Used for celestial charts
- z Other type of scale
102b c Constant Ratio Linear Code
- b Constant ratio linear horizontal code
- c Constant ratio linear vertical scale
- The denominator of the representative fraction
for the horizontal scale
- 1 inch = 100 feet (1 / (12 x 100)) = 1:1,200 scale
- e.g. b2500 c2500
- (1 foot = 12 inches)
- 1 mm = 1 Km (1,000 x 1,000) = 1:1,000,000 scale
- e.g. b1000000
- (1,000 millimeters = 1 meter, 1,000 meters = 1
Kilometer)
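The scale denominator is the ground distance divided by the map distance once both are in the same unit. A minimal sketch, with hypothetical helper names, that reproduces the two conversions above:

```python
INCHES_PER_FOOT = 12
MM_PER_KM = 1_000 * 1_000  # 1,000 mm per metre x 1,000 m per kilometre


def scale_denominator(map_distance: float, ground_distance: float) -> int:
    """Denominator of the representative fraction.

    Both distances must already be expressed in the same unit.
    """
    return round(ground_distance / map_distance)


def subfield(code: str, denominator: int) -> str:
    """Format a scale subfield as it is written on the slides, e.g. b1200."""
    return f"{code}{denominator}"


inch_to_100_feet = scale_denominator(1, 100 * INCHES_PER_FOOT)
mm_to_km = scale_denominator(1, MM_PER_KM)

print(subfield("b", inch_to_100_feet))
print(subfield("b", mm_to_km))
```

The same function serves the vertical scale subfield, since only the measured direction differs.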
103 c Constant Ratio Linear Vertical Scale
- The denominator of the representative fraction
for the vertical scale of relief models and other
three-dimensional items. (3D: three-dimensional
maps)
- Viz. How high are the bumps on the map?
(videlicet)
- e.g. 1 inch = 1 mile: 5,280 x 12 = 63,360
- c63360
- (5,280 feet = 1 mile, 12 inches = 1 foot)
- e.g. 10 mm = 1 Km (100 x 1,000 = 100,000)
- (1,000 millimeters = 1 meter, 1,000 meters = 1
Kilometer)
- c100000
104Edges of a Rectangular Map
- To Orient a Map: Literally to put the East at the
Top
- To view the map annotation as right reading
- A convention that went out of use about 1500 AD.
- d Coordinates westernmost longitude (left side)
- e Coordinates easternmost longitude (right side)
- f Coordinates northernmost latitude (top)
- g Coordinates southernmost latitude (bottom)
105Edges of A Map
- Subfields d, e, f, and g always appear
together.
- Each subfield is eight characters in length
- Each subfield consists of the hemisphere,
degrees, minutes, and seconds recorded in the
pattern hdddmmss. The degree, minute, and second
subelements are each right justified and each
unused position contains a zero
106Subelements hdddmmss
- Subelements are each right justified and each
unused position contains a zero
- h hemisphere
- (examples are for a map 2 degrees on a side
centered on Greenwich, England which is located
at longitude 0° 0' 0", latitude 51° 28' 38"
http://www.rog.nmm.ac.uk/)
- N North (e.g. fN0522838)
- S South (e.g. fS0502838)
- E East (e.g. eE0010000)
- W West (e.g. dW0010000)
- ddd degrees
- mm minutes
- ss seconds
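The hdddmmss pattern is easy to generate mechanically. The sketch below is an illustration only; the function names are hypothetical, and the decimal-degree helper assumes the usual convention that north and east are positive.

```python
def encode_hdddmmss(hemisphere: str, degrees: int, minutes: int, seconds: int) -> str:
    """Build an eight-character coordinate: h + ddd + mm + ss, zero filled."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be N, S, E, or W")
    return f"{hemisphere}{degrees:03d}{minutes:02d}{seconds:02d}"


def from_decimal(value: float, is_latitude: bool) -> str:
    """Encode a signed decimal-degree value (north and east positive)."""
    if is_latitude:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    total_seconds = round(abs(value) * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return encode_hdddmmss(hemisphere, degrees, minutes, seconds)


print(encode_hdddmmss("N", 52, 28, 38))
print(from_decimal(-1.0, is_latitude=False))
```

Each result is always eight characters long, which is what lets the four edge subfields be read positionally.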
107Celestial Charts
- h Angular scale: The scale for celestial charts.
- j Declination northern limit
- k Declination southern limit
- Subfields j and k are each eight characters in
length and consist of the hemisphere, degree,
minutes, and seconds of the declination of a
celestial chart, recorded in the pattern
hdddmmss. The degree, minute, and second
subelements are each right justified and each
unused position contains a zero. (See Subfields
d, e, f, and g)
- m Right ascension eastern limit
- n Right ascension western limit
- Subfields m and n are each six characters in
length and consist of the right ascension of a
celestial chart, recorded in the pattern hhmmss
(hours, minutes, seconds). Each subelement is
right justified and each unused position
contains a zero.
- p Equinox: The year or year and month of a
celestial chart recorded in the pattern yyyy.mm
108255 Field
- The 255 field is the text form of the entry that
is coded in 034 field.
109Rings
- Exclusion ring (e.g. Area taken up by legend of
map which covers a portion of the map)
- Outer ring (e.g. Edge of a map)
- A ring of 14 points (and 14 segments) would not
be sufficient to define the boundary of the City
of Los Angeles or the exclusion areas within the
City of Los Angeles for The City of Culver City,
the City of Beverly Hills or the City of San
Fernando.
- s G-ring latitude
- t G-ring longitude
- 6 Linkage
- 6 <linking tag>-<occurrence number>/
- Occurrence number is two digits, right justified.
- 8 Field link and sequence number (see next
slide)
110Rings (cont.)
- 8 Field link and sequence number
- Subfield 8 contains data that identifies linked
fields and may also propose a sequence for the
linked fields. Subfield 8 may be repeated to
link a field to more than one other group of
fields. The structure and syntax for the field
link and sequence number subfield is
- 8 <linking number.sequence number>
- Linking number
- This is the first data element in the subfield
and required if the subfield is used. It is a
variable-length whole number that occurs in
subfield 8 in all fields that are to be linked.
Fields with the same linking number are
considered linked.
- Sequence number
- This number is separated from linking number by a
period "." and is optional. It is a
variable-length whole number that may be used to
indicate the relative order for display of the
linked fields (lower sequence numbers displaying
before higher ones). If it is used it must occur
in all 8 subfields containing the same linking
number.
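The linking rules above can be sketched as a small parser. It handles only the two elements described on the slide (linking number and optional sequence number); the function names and the sample field tags are hypothetical.

```python
from __future__ import annotations


def parse_subfield_8(value: str) -> tuple[int, int | None]:
    """Split 'linking.sequence' into whole numbers; sequence is optional."""
    linking, separator, sequence = value.partition(".")
    return int(linking), (int(sequence) if separator else None)


def group_linked(fields: list[tuple[str, str]]) -> dict[int, list[str]]:
    """Group (tag, subfield 8 value) pairs by linking number.

    Within a group, tags are ordered by sequence number, lower first.
    """
    groups: dict[int, list[tuple[int, str]]] = {}
    for tag, value in fields:
        linking, sequence = parse_subfield_8(value)
        groups.setdefault(linking, []).append((sequence or 0, tag))
    return {linking: [tag for _, tag in sorted(items)] for linking, items in groups.items()}


# Hypothetical record: two fields share linking number 1, one field uses 2.
fields = [("255", "1.2"), ("034", "1.1"), ("500", "2")]
print(group_linked(fields))
```

Fields with the same linking number end up in one group, displayed in sequence-number order.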
111Archives
112The Technology
- Voyager 1 billion years, beyond the solar system
- Forever 100 billion years, big bang to big
crunch
- Engineering bits to last forever: ECC
- Ion milled nickel and iridium disks
- Cost reduction of 1-million-to-1 in 25 years
- 1-septillion-to-1 (10^24) in 100 years
- Successive S curves of one technology after
another provide a steady reduction in cost over
time.
113The Technology
- Can the entire record of our civilization be
expressed as ones and zeros?
- DVDs, Fax, Digital Cameras, Word Processors
- Our World Digital with an Analog Veneer
- CAM (Computer Aided Manufacturing)
- Based on digital model
- Nanotechnology
- The clearest form of a book is the digital
typesetter file.
114Archives Management
- Appraisal
- Review all Records
- Accessioning
- File Archived Records in Archives
- Prepare, Adapt Finding Aids for Use
- Protect and Preserve Forever
115The Effect of Our Efforts
- From many civilizations, one civilization.
- Cataloging adds to understanding.
- Time and Distance
- Chemistry, Nanotechnology, Phylogenetics,
Genealogy
- Hypertext links, and infinite un-dos
- Designed to stay out of the way.
- Soon "We lost it." will be replaced by
"We can't erase it."
- What should we avoid recording?
116Preserving Information Forever
- We can now preserve information forever.
- We should start planning now, to be in control of
the technology.
- The lessons we have learned, and that we now call
the professional management of Libraries,
Archives, and Museums, do not need many additions
to make use of digital technology.
- We must see the creation in our own acts and in
the creative acts we seek to preserve. We must
separate creation from technology.
- We have a professional responsibility to use the
new technology wisely in service to our society,
to our civilization, and to future recipients of
our preserved record.
117- Levels of Electronic Preservation
- A function of an Archive, not Records Management
- Preserving all the steps in document printing.
- The raster image that is printed is generated by
the RIP.
- The RIP interprets the PDL page image.
- RIP (Raster Image Processor) PDL (Page
Description Language)
- The document creation application writes the PDL
file.
- The document application runs on a specific
version of the operating system (e.g. Microsoft
Windows 2000).
- The operating system runs on a specific hardware
configuration.
118Practical
- Don't do anything you don't understand
- Require that all systems be understandable
- We need archivists, not digital archivists
- Digital stuff is a tool
- Knowing the available tools is a professional
responsibility
- Someone must be in charge, it may be you
- Not acting is acting
- Fit wisdom into the plan
119What Could You Require?
- Require raster input
- Require PDF input
- Create PDF raster from PDF vector files
- Require customers to convert native format to XML
and PDF
- Require XML Schema be used by customer to
validate XML documents
- Convert your metadata to XML validated by an XML
schema
- Collect and seal all formats of a given document
together in an electronic container to maintain
cotemporal provenance
120Preserving Electronic Records
- The first tasks of a data archivist would
include
- Getting the bits and provenance off the incoming
media.
- NARA copies all digital documents to magnetic
tape
- (United States National Archives and Records
Administration)
- Affixing the archive's digital seal.
- Digitally signing the package.
- (Archiving is done by people.)
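One way to picture a digital seal is a manifest of cryptographic digests over every file in the package. The sketch below is an assumption about how such a seal could work, not the procedure of NARA or any archive; the file names and key are hypothetical, and the HMAC step is only a standard-library stand-in for a true public-key digital signature.

```python
import hashlib
import hmac
import json


def seal_package(files: dict[str, bytes]) -> dict:
    """Digest every item, then digest the manifest itself as the seal."""
    manifest = {name: hashlib.sha256(data).hexdigest() for name, data in sorted(files.items())}
    seal = hashlib.sha256(json.dumps(manifest, sort_keys=True).encode("utf-8")).hexdigest()
    return {"manifest": manifest, "seal": seal}


def sign_seal(seal: str, key: bytes) -> str:
    """Keyed digest of the seal (stand-in for a real digital signature)."""
    return hmac.new(key, seal.encode("ascii"), hashlib.sha256).hexdigest()


def verify_package(files: dict[str, bytes], sealed: dict) -> bool:
    """Recompute the seal and compare; any changed bit changes the seal."""
    return hmac.compare_digest(seal_package(files)["seal"], sealed["seal"])


# Hypothetical package: raster and XML forms of the same document.
package = {"plan.tif": b"raster bits", "plan.xml": b"<plan/>"}
sealed = seal_package(package)
signature = sign_seal(sealed["seal"], key=b"archive-secret")
print(sealed["seal"], signature)
```

Re-running `verify_package` during later copies is one way to monitor for bad bits over time.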
121Emulators
122Emulators
- The need for emulators
- Preserving a word processor file requires
preserving
- Word Processor
- Operating System
- Computer Configuration
- Display
- Printer
123The Theory of Emulators
- Turing Machines
- Any computer can be simulated on a Turing
Machine.
- A Turing Machine can run on any computer.
- Hardware emulators are faster.
- The need for speed becomes moot over time.
- Software emulators can be preserved as binary
files.
- Archivists will become emulator experts.
- Just as they are paper experts now.
124Current Emulators in PCs
- Each PC operating system emulates previous
versions
- To a degree
- Even with emulators, preserving a print-ready
raster bit-map of each preserved page is
necessary.
- Future operation of emulators can be checked to
the pixel.
125A History of Emulators: Intel
- Each Intel chip emulates all previous Intel
chips.
- Chips and operating systems are developed using
emulators before the new, target chip is ready.
- 4004 (4 bit), 4040, 8008 (8 bit), 8080, 8086 (16
bit), 80186, 80286, 80386 (32 bit), 80486, 80586
(Pentium), 80686 (Pentium II), 80786 (Merced) (64
bit)
- Bill Gates developed his Basic interpreter on a
software emulator of an Intel 8080 that ran on a
DEC (Digital Equipment Corporation) PDP 10
computer.
126A Call for Emulators
- Emulators can be preserved forever in binary
form.
- Emulators can be run forever.
- Run in series, each emulator run on the next
emulator created.
- Run on intermediate machine such as a Turing
Machine
- Limits the number of layers to two for all time.
- May Fail in Microcode: Intel's Pentium math bug
- A fringe effect, to be noted.
127Archivists: What to Do
- Be certain to save a raster format copy of every
document (scanned or printed to image)
- Easy, you can always just scan the document
- Save the raster in G4 compressed format
- PDF is a good candidate for G4 format
- Digitally seal your fascicles
- Monitor the bad bits
- Try to get the other formats: Native, XML, and
Vector (PDF)
- Seal them together with the raster format for the
document
- Support the SAA (Society of American Archivists)
- Studying and preserving emulators
- Working to preserve GIS, Databases, and
multimedia files
128Litigation Support
129Document Processing
- Discovery, Production
- Scan, Identify, OCR, Store
- Code, Index, Workflow
- Import, Load, Merge, Link
- Display, Print, Transmit
- Search, Tag, Collaborate, Prepare Case
- In-Court Presentation
- Warehouse/Archive (To serve client interest)
- System Design, Sources of Information
130Discovery/Production
131Discovery / Production
- Discovery is the process of obtaining access to a
copy of the opponent's files for the purpose of
searching through them.
- Production is the process of responding to a
discovery request by giving your organization's
opponent access to your organization's files.
132Informal Formality
- Email and Voice Mail
- All Corporate Records are Discoverable (in US)
- Windows 2000 Supports Exchange for Email
- Windows 2000 Has Extensive Support for Voice
Mail
- Informal Statements End Up In Permanent System
Backups - the Keep Forevers
133Scan
134Document Identification
- Done by scanner operator
- Not the same as document indexing
- Minimum information necessary to link paper
document with digital document images
- Also done by Bates numbers
- Bates stamp marks each page with a sequential
number and automatically advances for next page
- Pages can be marked before or after scanning
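The sequential numbering a Bates stamp applies can be sketched in software for pages marked after scanning. This is a minimal illustration only: the prefix "ABC", the six-digit width, and the function names are assumptions, not part of any particular litigation support product.

```python
def bates_numbers(prefix, start, page_count, width=6):
    """Return one sequential Bates label per page, e.g. ABC000001."""
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + page_count)]


def bates_range(prefix, start, page_count, width=6):
    """Return the (first, last) labels for a multipage document."""
    labels = bates_numbers(prefix, start, page_count, width)
    return labels[0], labels[-1]


# A three-page letter stamped starting at number 1 (hypothetical prefix).
print(bates_numbers("ABC", 1, 3))
print(bates_range("ABC", 10, 5))
```

The start and end labels returned by `bates_range` correspond to the start and ending numbers recorded when a multipage document is coded.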
135Code
136Code
- Letter Example for Coding
- From (Author)
- To (Recipient)
- cc (Carbon Copy)
- Date
- Subject
- Document Type
- Other fields as requested
- Keywords
- Bates number
- Start and ending number for multipage document
137In-Court Presentation
138In-Court Presentation
- Three parts of vision
- Resolution, detail (black and white)
- Color
- Motion
- The foundation of optical tricks (illusions)
- Big, bright, high resolution
- Medium can carry the same weight as the
presentation content
139Warehouse/Archive
140Warehouse/Archive
- Law firms keep client records as long as they
serve the client's interests.
- This frequently results in permanent retention.
- Most firms retain both the document image and the
paper that was scanned.
- Eliminating the paper would greatly reduce
storage costs and would not require any
additional scanning, coding, or indexing because
the coded and indexed digital copy already exists.
141Litigation Support Goals
- We Do Not Have Any Records That Show That
- We Have Analyzed the Documents You Provided
Under Our Discovery Proceedings
- Here is a Record of All of the Documents We Have
in Storage
- We Destroyed Those Documents According to this
Retention Schedule
142Operating a Commercial Records Center (CRC)
143Commercial Records Centers
- Contract off-site records storage
- In records storage cartons
- A standard records storage carton (box) is about 12
inches wide by 15 inches long by 9 1/2 inches
deep (300 mm x 375 mm x 235 mm)
- Warehouses with 18 to 55 foot ceilings
- New standard from the US National Archives
- 36 CFR (Code of Federal Regulations) Part 1228
- Proposed rule published in the Federal Register
on April 30, 1999, in Part VII at page 23,504
144Commercial Records Centers
- About 400 in United States
- Average about 1 million boxes per company
- Many small companies
- A few large national / international chains
- Beginning to appear outside the United States in
greater numbers
- The first were in Canada and Europe
- PRISM International (Professional Records and
Information Services Management International)
- Formerly ACRC (Association of Commercial Records
Centers)
145Scanning Today and in the Future
- What should a Commercial Records Center Do?
- Scanning is less expensive than delivery.
- Scanning is more expensive than storage.
- Future Technology (in Commercial Records Centers)
- What will happen before you pay off your
mortgage? (20 years)
- Or sell your building? (5 years)
146How Many Pages Will You Scan in the First Year?
- 40 deliveries per day (replaced by scanning)
- 100 pages per delivery (desired folder contents)
- Customers often order a box to get a folder
- 40 deliveries x 100 pages = 4 thousand pages /
day
- 250 days x 4 thousand pages = 1 million pages /
yr.
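The first-year estimate is a simple product, sketched below so the inputs can be changed for a particular records center. The constant and function names are illustrative assumptions; the default values are the ones on this slide.

```python
DELIVERIES_PER_DAY = 40       # deliveries replaced by scanning
PAGES_PER_DELIVERY = 100      # desired folder contents
WORKING_DAYS_PER_YEAR = 250


def pages_per_year(deliveries=DELIVERIES_PER_DAY,
                   pages=PAGES_PER_DELIVERY,
                   days=WORKING_DAYS_PER_YEAR):
    """Pages scanned in a year if every delivery becomes a scan."""
    return deliveries * pages * days


print(pages_per_year())  # 40 x 100 x 250
```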
147Converting Boxes to GigaBytes
- ~ means about, approximately
- Quick rule: 4 boxes ~ 1 CD-R ~ 1/2 GigaByte
- CD-R (Recordable CD)
- 8 boxes ~ 1 GigaByte (1,000 MegaBytes)
- 8 thousand boxes ~ 1 TeraByte (1,000 GigaBytes)
- 1 box ~ 2,500 pages (single sided assumed)
- Double sided pages require twice as much storage
per box.
- These numbers will get you started; you can
measure the actual storage used for more
precision.
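The quick rule can be written as a small helper for sizing a conversion project. This is a rough sketch using the rule-of-thumb figures above; the function name is an assumption, and measured storage should replace these constants once real scans exist.

```python
BOXES_PER_GIGABYTE = 8  # rule of thumb, single sided pages


def boxes_to_gigabytes(boxes, double_sided=False):
    """Approximate digital storage, in GigaBytes, for a number of boxes."""
    gigabytes = boxes / BOXES_PER_GIGABYTE
    # Double sided pages require twice as much storage per box.
    return gigabytes * 2 if double_sided else gigabytes


print(boxes_to_gigabytes(4))      # about one CD-R
print(boxes_to_gigabytes(8000))   # about one TeraByte, in GigaBytes
```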
148Digitize Everything: The Story (1)
- Blank CD-R (Compact Disc - Recordable)
- USD 1 to USD 2
- Guaranteed for 100 years
- 1/2 GigaByte (4 boxes) ~ 25 to 50 US-cents per box
- Blank DVD-R (commonly Digital Video Disc - R)
- Ditto (in a year or two) (USD 1 to USD 2, 100
year guarantee)
- 4.5 GigaBytes (36 boxes) ~ 3 to 5 US-cents per
box
149Digitize Everything: The Story (2)
- Cost of on-line digital magnetic storage
- Declining at 40 percent per year
- 8 boxes per GigaByte (1,000 MegaBytes)
- USD 10 to USD 100 per GigaByte to purchase
- USD 2.50 to USD 25 per box for permanent
storage
- 1,000,000 hours MTBF (Mean Time Between Failure)
- The cost of scanning
- USD 5 cents per page minimum
- Changes very slowly.
- 2,500 pages per box
- USD 125 per box for scanning
150Cost of Scanning
- USD 5 cents per page minimum (absolute minimum)
- The cost of scanning changes slowly
- 2,500 pages per box
- USD 125 per box for scanning
- USD 125 thousand for 1 thousand boxes
- USD 1.25 million for 10 thousand boxes
- USD 12.5 million for 100 thousand boxes
- USD 125 million for 1 million boxes
151Cost of Storing a Box
- 15 to 25 USD cents per month
- USD 1.80 to USD 3.00 per box per year
- USD 18.00 to USD 30.00 to set up an annuity
for perpetual storage, to store one box forever
- vs. USD 125 per box for scanning
- For about an additional 20 percent, all scanned
materials can be stored forever in paper form as
well as in digital form.
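The comparison between scanning a box and storing it forever can be sketched as follows. The 10 percent interest rate is an assumption inferred from the slide's figures (the annuity is ten times the annual storage charge); it is not stated in the source, and the function names are illustrative.

```python
def scanning_cost_per_box(cents_per_page=5, pages_per_box=2500):
    """Cost in USD to scan one box."""
    return cents_per_page * pages_per_box / 100


def perpetual_storage_cost(cents_per_month, interest_rate=0.10):
    """USD needed today so the interest pays the storage charge forever.

    Present value of a perpetuity: annual cost / interest rate.
    The 10 percent rate is an assumed value.
    """
    annual_usd = cents_per_month * 12 / 100
    return annual_usd / interest_rate


scan = scanning_cost_per_box()
low = perpetual_storage_cost(15)
high = perpetual_storage_cost(25)
print(f"scan: USD {scan:.2f}, store forever: USD {low:.2f} to {high:.2f}")
print(f"paper storage adds {low / scan:.0%} to {high / scan:.0%}")
```

Under these assumptions the perpetual paper storage adds roughly 14 to 24 percent to the scanning cost, which is where the "about 20 percent" figure comes from.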
152Cost of Scanning
- Scanning service bureaus will always scan for
less than commercial records centers.
- Scanning service bureaus: USD 5 cents per page
- Commercial records centers: USD 20 cents per page
- It will always cost more to store boxes in
scanning service bureaus.
- Commercial records centers have the records, and
the customers want them delivered immediately.
153The Desire to Scan Everything
- It is too expensive (not cost effective)
- If the customer really wants every page in a box,
consider physically delivering the box.
- Some well funded customers may still want
scanning
- You may lose these customers if you can't offer
scanning
- Keeps scanner busy
- Steady cash flow
154Scanning and Customer Retention
- Customers may not be ready for scanning for
years, but they want to be sure their commercial
records center has it now.
- Scanning rounds out brochures and customer
presentations.
- Customers will use their commercial records
center's scanning services (and scanning
materials) to show they (the customers) are
planning for the future.
- Do not start with difficult customers.
- Applies to getting new customers too.
155Is it Legal?
- Yes, it is analogous to microfilming.
- Anything can be done wrong.
- Records storage centers are a link in the
physical chain of custody of records.
- The chain can be broken through errors.
- Clever lawyers can create doubt in any situation.
- Scanning is no different from any other business
activity.
156Business Issues in Scanning
- Start small, fail small.
- Experience comes from computer system
replacements.
- Get it working before your customers see it.
- Learn as much as you can about it.
- Learn as much as you can about operating it.
- Your competitor used to own a computer store.
- Finish projects early and often so you can apply
what you learned early and often.
- Get behind your system and make it successful.
157Pages to MegaBytes (and GigaBytes)
- ~ means about, approximately
- 8 1/2 by 11 inches is 93 1/2 square inches (~ 100
sq. in.)
- 300 dpi x 300 dpi = 90 thousand dots per square
inch
- 100 square inches x 90 thousand dots = 9
million dots
- 9 million 1 bit (black or white) dots ~ 1
MegaByte / page
- With 20 to 1 compression, 1 MegaByte ~ 50
KiloBytes
- 20 pages x 50 KiloBytes per page = 1 MegaByte
- 20 thousand pages ~ 1 GigaByte (1,000 MegaBytes)
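The page-size arithmetic above can be reproduced without the rounding, which shows how close the rules of thumb are. This is a sketch under the slide's assumptions (letter-size page, 300 dpi, 1 bit per dot, 20 to 1 compression); the function name is illustrative, and real compression ratios vary with page content.

```python
def compressed_page_bytes(width_in=8.5, height_in=11, dpi=300,
                          bits_per_dot=1, compression=20):
    """Approximate bytes for one scanned, compressed black and white page."""
    dots = width_in * height_in * dpi * dpi   # 93.5 sq. in. x 90,000 dots
    raw_bytes = dots * bits_per_dot / 8       # 8 bits per byte
    return raw_bytes / compression


page = compressed_page_bytes()
print(f"{page:,.0f} bytes per page")
print(f"{1_000_000_000 / page:,.0f} pages per GigaByte")
```

Without rounding the page area up to 100 square inches, a page comes to about 53 KiloBytes and a GigaByte holds about 19 thousand pages, consistent with the 50 KiloByte and 20 thousand page rules of thumb.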
158 Cost of Storage vs Cost of Scanning
- ~ means about, approximately
- Cost to scan 1 box at 5 cents per page ~ USD 125
- Cost to store 1 box in digital form
- USD 10 per GigaByte in 2001
- 8 boxes per GigaByte => USD 1.25 per box to
store digitally
- Cost of digital storage drops at 40 percent per
year
- And it already does not matter in 2001
- Magnetic disk drives are advertised to have a
1,000,000 hours MTBF (Mean Time Between Failure)
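The effect of a 40 percent annual decline can be projected with a short calculation. This is a sketch that simply extends the slide's 2001 figures forward at a constant rate; the constant rate and the function name are assumptions, not a forecast.

```python
def storage_cost_per_box(years_after_2001, usd_per_gb_2001=10.0,
                         boxes_per_gb=8, annual_decline=0.40):
    """Projected USD to store one box digitally, N years after 2001."""
    usd_per_gb = usd_per_gb_2001 * (1 - annual_decline) ** years_after_2001
    return usd_per_gb / boxes_per_gb


for years in (0, 1, 5):
    print(years, round(storage_cost_per_box(years), 4))
```

Against a scanning cost of about USD 125 per box, the storage cost is already around 1 percent of the total in 2001 and shrinks from there.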
159Scan on Demand
- Competes with the cost of physical delivery
- Requires a computer system
- Track scanned images
- Provide security and passwords
- Integration with records management system
- Must be managed
- Or it is just a fax system
160Indexing
- Box in warehouse
- Folder in box
- Document in folder
- Name of document
- Document labels: date, number, client name, etc.
- Full text will help avoid indexing
- Client may do detailed indexing
161Fax Only
- If you have two databases of the same client
information, the information for a given client
will be different in the two databases.
- Scanning twice produces two copies.
- The two copies will be different.
- The two copies will be compared by the customer.
- Careful management of data reduces problems.
162 1 Scan System ~ 1 Truck
- The two cost about the same to own and operate.
- Like trucks, you may want two complete systems.
- You will need a systems person you can work with
and trust.
- The only way to understand backing up a system is
to lose data.
- Most records centers use computers to track boxes
and have lost data.
163Internet Delivery as an Email Attachment
- Modem (56 Kbits/s) ~ 3 pages per minute
- 1 box per day
- ISDN (128 Kbits/s) ~ 10 pages per minute
- 1 box in 1/2 day
- Cable (TV) Modem (500 Kbits/s) ~ 1 page per second
- 1 box per hour
- DSL (Digital Subscriber Line) (8 Mbits/s) ~ 20
pages/s
- 1 box in 2 minutes
- Ranges from 1/2 to 8 Mbits per second depending
on location.
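The delivery times above can be approximated from line speed, page size, and pages per box. This sketch assumes a 50 KiloByte page (taken as 50,000 bytes) and 2,500 pages per box, and it computes ideal line-rate figures with no protocol or email encoding overhead. It therefore gives a best case; the slide's slower-link figures are more conservative.

```python
PAGE_BYTES = 50_000      # assumed compressed page size
PAGES_PER_BOX = 2500


def seconds_per_page(bits_per_second, page_bytes=PAGE_BYTES):
    """Ideal transfer time for one page, ignoring all overhead."""
    return page_bytes * 8 / bits_per_second


def minutes_per_box(bits_per_second):
    """Ideal transfer time for a full box of pages."""
    return seconds_per_page(bits_per_second) * PAGES_PER_BOX / 60


links = {"Modem": 56_000, "ISDN": 128_000, "Cable": 500_000, "DSL": 8_000_000}
for name, speed in links.items():
    print(f"{name}: {seconds_per_page(speed):.2f} s/page, "
          f"{minutes_per_box(speed):.0f} min/box")
```

At 8 Mbits/s the ideal figure matches the slide: 0.05 seconds per page, or 20 pages per second, and about 2 minutes per box.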
164What Does the User Do With It
- Keep the scanned document as an email attachment.
- Add the document to a document management system.
- Manually
- Automatically
- Depend on you to find it for them when they need
it again.
165TransFormat System
166Permanent Electronic Records
- Require
- A format that will remain readable for a long
period of time
- Long lasting bits
167The Stored Records are the Key, Not the System
- First store the records (documents) in a
permanent format and organization.
- Then load the records into your first system.
- The records organization on the media should be
chosen to facilitate migration between systems.
- Test the quality of your storage organization
- Consider the first system you load to be the
first test, of many tests to come, in which
stored records will populate a new software
system.
168Manageable by a Records Manager
- Permanent records should be stored in a system
that is manageable by a RIM professional.
- (Records and Information Management)
- This is a simple (but essential) design
requirement
- Automobiles are designed to be managed by the
people who drive them.
- Managed, not built.
169Professional Responsibility
- RIM professionals are responsible for the
experience that their clients have when their
clients use the systems that are managed by the
RIM professionals.
- Including
- Surprises when record formats mysteriously
change.
- Surprises when printed documents look different
from what is on the screen.
- The responsibility is to educate users about
endemic problems in electronic records, not to
fix problems for which there is no solution.
- Suggesting work-arounds, not offering perfection
- Similar to assisting clients with the use of
paper records
170Chain of Custody
- Written policies and procedures
- System logs (ISO 9000-like audit records)
- Integrated with TF system and its operation
- Based on digital signature maintenance
171Retention Schedule
- Basis for TransFormat system
- Includes working documents
- Records Survey
- Records Inventory
- Record life-cycles (retention periods) (examples)
- Permanent (forever) (birth certificate) (land
records)
- Long-term (30 years) (e.g. personnel records)
- Short-term (7 years) (tax)
- Working records (1-2 years) (worksheets)
- Ephemeral records (lunch orders)
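The example retention periods can be held in a simple lookup table that a system consults to decide when a record becomes eligible for destruction. This is a hypothetical sketch, not the TransFormat design: the category keys and function name are assumptions, working records use the upper bound of 2 years, and ephemeral records are omitted because no period is given for them.

```python
from datetime import date

# Retention periods in years; None means keep forever.
RETENTION_YEARS = {
    "permanent": None,   # birth certificates, land records
    "long-term": 30,     # personnel records
    "short-term": 7,     # tax records
    "working": 2,        # worksheets (upper bound of 1-2 years)
}


def destruction_date(category, created):
    """Date a record may be destroyed, or None if it is kept forever."""
    years = RETENTION_YEARS[category]
    if years is None:
        return None
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Created on 29 February and the target year is not a leap year.
        return created.replace(year=created.year + years, day=28)


print(destruction_date("short-term", date(2001, 4, 15)))
print(destruction_date("permanent", date(2001, 4, 15)))
```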
172Website Maintenance
- TF (TransFormat) systems, like all modern
software, will provide intranet and perhaps
Internet access to records.
- Maintenance of the TF system website that
provides this access should be integrated with
the metadata maintenance for general records
management.
- For example, the descriptions of records series
and their finding aids should be loaded onto the
website automatically, from the TF system
metadata.
- If the website information (metadata) is
maintained separately from the TF system
metadata, then there will be two databases that
are supposed to contain the same information.
- As is always the case, according to Murphy's Law,
if any piece of information is in two databases,
it will be different in the two databases.
- Copying information from one database to another
doubles the maintenance burden, at least.
- All displays of information should be computer
retrieved from one master copy of the
information.
173Where Are Your Electronic Records?
- On which disk?
- On which server?
- On which SAN? (Storage Area Network)
- On which backup?
- Is it an incremental backup?
- Is it a full backup?
- Is it a volume backup?
- Is it retention schedule compliant?
- How is it protected from
- Access (retrieval, alteration)?
- Destruction (denial of service)?
174Preservation Forever
- Permanent retention
- Getting the records back in the box (fascicle)
- Where the records can be protected and managed
- Where are your electronic records right now?
- Electronic records deteriorate over time
- Bits fade away (like disintegrating acid paper)
- Formats fade away (like lost languages)
- Systems and hardware fade away (like knotted
documents)
- Knowledge and integrity fade away (chain of
custody)
- Fascicles
- Electronically signed virtual digital containers
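One way to notice fading bits is to record a cryptographic digest when a fascicle is sealed and recompute it later. The sketch below uses SHA-256 from Python's standard library as a stand-in. It is an assumption, not the fascicle format itself: a digest alone is not a digital signature, and a real seal would also sign the digest with a private key.

```python
import hashlib


def seal(fascicle_bytes):
    """Return a SHA-256 digest recorded when the fascicle is sealed."""
    return hashlib.sha256(fascicle_bytes).hexdigest()


def is_intact(fascicle_bytes, recorded_seal):
    """True if the fascicle still matches the digest recorded at sealing."""
    return seal(fascicle_bytes) == recorded_seal


original = b"scanned page images and metadata"   # placeholder content
recorded = seal(original)

damaged = original[:-1] + b"X"                   # simulate a bad bit
print(is_intact(original, recorded), is_intact(damaged, recorded))
```

Because sealed fascicles do not change, any mismatch on a later check indicates deterioration or alteration of that copy, and an intact duplicate can be used to replace it.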
175 2 Year Time Horizon Backup / Disaster Planning
- Because fascicle based system records (on
fascicles) do not change, no baseline, and
corresponding (complex) incremental backups (and
media rotations), are necessary.
- Will survive a technical staff 2 year time
horizon
- All that is required is that as new fascicles are
filled, the new fascicles are digitally sealed,
and at least 7 duplicate copies of each of the
new fascicles are made for offsite storage.
- The offsite storage should