Title: ESE250: Digital Audio Basics
1ESE250: Digital Audio Basics
- Week 10, November 17, 2009
- File System
2Review
- Everything reduces to bits
- Songs → digitized and encoded
- Machine code → bit encoding for machine instructions
- Ebook text, Homework PDF, movies, ...
- Memories store bits
- Non-volatile memories store them persistently
(when the power goes off)
3Persistent Storage Questions
- How do we save data across boots
- When the computer is off
- How do we save data to move between machines?
- How do we organize our data so we can find it again?
- Tell others to find it?
4Strawman 1
- Guess a random address and write data there
- Write down addresses on paper
- To playback song
- Run program located at address 42000
- On data located at address 81736
- How do I know if I can use the block at address 90,000?
- Why might this be problematic?
5Course Map
Numbers correspond to course weeks
6Outline
- What does technology give us?
- Requirements?
- Interlude
- File System Sketch
7Technology
9Hard Disk
- Disc with magnetic material on its surface
- Divided into tracks (circles)
- Modern disks
- 300,000 TPI
- TPI = Tracks Per Inch
10Hard Disk
- Disc with magnetic material on its surface
- Divided into bit regions
- Modern disks
- 1.5M BPI
- BPI = bits per inch
11Hard Disk
- Each bit located at a position (R, θ)
- R: select track
- θ: select bit from track
- Disc spins
- Traces through θ
12Hard Disk
- Each bit located at a position (R, θ)
- Add arm to move head
13Hard Disk
- Each bit located at a position (R, θ)
- Head arm moves
- Varies R
14Disk Bandwidth
- Typical disk speed?
- 15,000 RPM
- One rotation every
- 60 s / 15,000 = 4 ms
- At R = 1 inch and 1.5M BPI, how many bits/second?
- 2π × 1 in × 1.5M bits/in / 4 ms
- ≈ 9 Mbits / 4 ms = 2.25 Gb/s
- ≈ 280 MB/s
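The bandwidth arithmetic above can be checked with a short sketch. The function name and parameters are illustrative, not from the slides. The slide rounds the track capacity down to 9 Mbits, so the unrounded result comes out slightly higher than 280 MB/s.

```python
import math

def track_bandwidth_bits_per_s(rpm, radius_in, bits_per_inch):
    """Bits passing under the head per second while following one track."""
    rotation_s = 60.0 / rpm                                 # one revolution
    bits_per_track = 2 * math.pi * radius_in * bits_per_inch
    return bits_per_track / rotation_s

bps = track_bandwidth_bits_per_s(rpm=15_000, radius_in=1.0, bits_per_inch=1.5e6)
print(f"{bps / 1e9:.2f} Gb/s, {bps / 8 / 1e6:.0f} MB/s")   # 2.36 Gb/s, 295 MB/s
```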
15Disk Speed
- Move head in R?
- Also a few milliseconds
- Typical data access: 10 ms
- E.g. 4 ms rotate + 6 ms seek
16Throughput and Implications
- Disk throughput is high relative to access time
- 10 ms latency
- 280 MB/s throughput (1 B per 4 ns)
- What does this drive us to?
- 10 ms seek → random byte access ≈ 100 B/s
- Sequential access ≈ 280 MB/s
- Want to exploit sequential access!
- Read blocks of data
17Read Data Blocks
- How many sequential bytes can we read in 1 ms?
- 280 MB/s × 0.001 s
- = 280 KB
- Can read 280 KB in about the same time as 1 byte
- 6 ms seek, 4 ms rotation, 1 ms data read
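A small sketch of why block reads pay off, using the slide's 6 ms seek, 4 ms rotation, and 280 MB/s figures. The function name and the block sizes tried are illustrative assumptions.

```python
SEEK_S = 0.006           # seek time from the slide
ROTATE_S = 0.004         # rotational delay from the slide
THROUGHPUT_BPS = 280e6   # sequential bytes/second from the slide

def effective_throughput(block_bytes):
    """Bytes/second delivered when every access reads one block after a seek."""
    access_s = SEEK_S + ROTATE_S + block_bytes / THROUGHPUT_BPS
    return block_bytes / access_s

for size in (1, 4096, 280_000, 28_000_000):
    print(f"{size:>10} B block -> {effective_throughput(size):,.0f} B/s")
```

A 1-byte access delivers about 100 B/s, while a 280 KB block delivers about 25 MB/s for only 1 ms of extra transfer time.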
18Seagate 2.5-inch Disk Drive
5400 RPM
http://www.seagate.com/docs/pdf/datasheet/disc/ds_momentus_5400_psd.pdf
19FLASH Memory
Week 8
- Exploit tunneling
- Use high-voltage to reduce barrier
- Tunnel charge onto floating node
- Charge trapped on node
- Use field from floating node to modulate
conduction
http://commons.wikimedia.org/wiki/File:Flash-Programming.png
20Flash
- NOR: read like other memories
- NAND: sequential read within page
- Denser than NOR
- Can only erase in blocks
- 4KB, 64KB-256KB
- Once erased can write byte (page) at a time
- Write time variable
- Typically need feedback to sense when written
21Samsung 256Mx8 NAND Flash
http://www.datasheetcatalog.com/datasheets_pdf/K/9/E/2/K9E2G08U0M.shtml
22Intel Solid-State Drive (SSD)
35,000/s × 4KB = 140MB/s
SSD: http://download.intel.com/design/flash/nand/extreme/extreme-sata-ssd-datasheet.pdf
23Requirements
24File
- File: sequence of bits that go together
- MP3 encoding
- Executable for mp3player
- Picture in JPEG
- PDF for your lab writeup
- How big is a file?
25File
- File: sequence of bits that go together
- Like an object
- A base address
- Length or extent
- Generally has a type, but only used weakly
- Mp3, WAV, x86 executable, ...
- On unix/linux
- an array of unsigned char
- with a length
- Magic number → tries to convey type
- In-band, in the file (first word?)
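A minimal sketch of in-band type detection by magic number. The prefixes listed are well-known leading bytes for these formats. The table and the function name are illustrative; real tools check many more patterns.

```python
# Leading bytes ("magic numbers") for a few well-known formats.
MAGIC = [
    (b"%PDF", "PDF document"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x7fELF", "ELF executable"),
    (b"ID3", "MP3 audio with ID3v2 tag"),
    (b"RIFF", "RIFF container (e.g. WAV)"),
]

def guess_type(data: bytes) -> str:
    """Guess a file type from its first bytes; the file itself is still just bytes."""
    for prefix, name in MAGIC:
        if data.startswith(prefix):
            return name
    return "unknown"

print(guess_type(b"%PDF-1.4 ..."))   # PDF document
print(guess_type(b"hello world"))    # unknown
```

The type is only a hint: nothing stops a file from starting with misleading bytes, which is one sense in which the type is "only used weakly".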
26Create a File?
- What do I need to do to create/store a new file?
- Allocate/reserve space for it
- Give it a name?
- Make a record somewhere of mapping between name
and location
31Strawman 2
- Keep track of next free address
- free_address initialized to 0
- When create file, give it space at free_address
- Increment free_address by length
- Store name address in table
- Maybe put table at high addresses
- When free_address + length > table_base
- Device is full
[Figure: free_address marks the end of allocated space; name table at high addresses]
File1 → 0
File2 → 250
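Strawman 2 can be sketched in a few lines. The class name, the dictionary used for the name table, and the table_base value of 900 are assumptions for illustration. Only free_address, the uniqueness requirement, and the full-device check come from the slides.

```python
class Strawman2:
    """Bump allocator: files are laid out back to back; nothing is reclaimed."""

    def __init__(self, table_base):
        self.free_address = 0          # next free address, initialized to 0
        self.table_base = table_base   # name table lives at high addresses
        self.table = {}                # name -> start address

    def create(self, name, length):
        if name in self.table:
            raise ValueError("file names must be unique")
        if self.free_address + length > self.table_base:
            raise MemoryError("device is full")
        self.table[name] = self.free_address
        self.free_address += length    # increment free_address by length
        return self.table[name]

disk = Strawman2(table_base=900)       # hypothetical device layout
disk.create("File1", 250)
disk.create("File2", 100)
print(disk.table)                      # {'File1': 0, 'File2': 250}
```

There is no delete method because a single pointer gives no way to hand space back.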
33Evaluating Strawman 2
- Bad
- What happens when delete a file?
- How reuse space?
- Add data to files?
- Table gets big
- All filenames have to be unique?
- Demands coordination between
- users?
- Programs?
- Programs from different vendors?
- Good
- Accommodates variable length files
- Allows contiguous access
34Files Grow and Shrink
- Essay/homework gets longer as write it
- Don't know how long it will be when start
- Database of checks written grows
- TODO before end-of-term list shrinks?
- Where put additional space?
- Allocate new space at end of disk?
35Delete Files
- Don't need lame .o files once build executable
- Replace false start
- That was a bad picture of me
- Now have better
- Don't want anyone to see my secret plans to take over the world
- Want the space back because drive filling up
36Repurposing Space
- How reclaim space?
- With single free_address pointer
- can't keep track of all the places where there is space
- What can do?
- Keep a list of free regions
- Try to find a region where will fit
37Finding Contiguous Space
- What if our disk looks like this
- and we want to allocate a large file?
- Disk has capacity
- But cannot allocate because not contiguous
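A sketch of the fragmentation problem with a free-region list and first-fit search. The (start, size) tuple layout and the example regions are hypothetical.

```python
def first_fit(free_regions, length):
    """Return the start of the first free region that can hold `length`, else None.

    free_regions is a list of (start, size) tuples; this layout is an assumption.
    """
    for start, size in free_regions:
        if size >= length:
            return start
    return None

# Hypothetical fragmented disk: 300 units free in total, largest hole is 120.
free_regions = [(0, 100), (250, 80), (500, 120)]
total_free = sum(size for _, size in free_regions)
print(total_free)                     # 300
print(first_fit(free_regions, 90))    # 0
print(first_fit(free_regions, 200))   # None
```

The last call fails even though 300 units are free, because no single region is large enough.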
38Bad Sectors
- Portions of a disk head may be bad
- At manufacture time
- Go bad during use
- Portions of Flash RAM may be bad
- Manufacturing defects
- Limited number of write cycles
- Also inhibits contiguous allocation
39Naming Conflicts
- How solve naming conflicts?
- Provide separate contexts
- E.g. separate space of names
- for each user
- for each program
- Typically with a directory structure
40Directory
- Special file that contains name to location mappings
- Once a file, we can easily allow hierarchy
- Directories can contain directories
41Requirement Roundup
- Find things easily and quickly
- Minimize what we need to look at to find data
- Portable (is the sole state holder)
- Self describing
- Fast read
- Attempt to layout contiguous files
- Fast write
- Not take too long to find space for file
- Support deletion (repurpose capacity)
- Use (most) of capacity
- Allow files to be non-contiguous
- Tolerate errors in media
- Don't depend on contiguous blocks to be good
- Isolate/differentiate who can access what
Challenge: both asymptotics and constants (e.g. 280MB/s throughput vs. 10ms random access) matter.
42Interlude
43Disk Data Security
- How is security enforced?
- OS demands credentials for login
- User doesn't get direct access to hardware
- OS intermediates
44Physical Disk Access
- What happens if the disk is removed from the physical machine?
- Plugged into another machine that
- Someone else has administrator access on?
- Doesn't respect the users/isolation?
45Common News Item
- Computer hard drive sold on eBay 'had details of top secret U.S. missile defence system'
- By Daily Mail Reporter. Last updated at 11:08 AM on 07th May 2009
- Highly sensitive details of a US military missile air defence system were found on a second-hand hard drive bought on eBay.
- The test launch procedures were found on a hard disk for the THAAD (Terminal High Altitude Area Defence) ground to air missile defence system, used to shoot down Scud missiles in Iraq.
- The disk also contained security policies, blueprints of facilities and personal information on employees including social security numbers, belonging to technology company Lockheed Martin, who designed and built the system.
- Read more: http://www.dailymail.co.uk/news/article-1178239/Computer-hard-drive-sold-eBay-details-secret-U-S-missile-defence-system.html#ixzz0Wxa60PT9
46all too common
- VA Update on Missing Hard Drive in Birmingham, Ala.
- February 10, 2007
- Investigation Yielding Additional Information
- WASHINGTON -- The Department of Veterans Affairs (VA) today issued an update on the information potentially contained on a missing government-owned, portable hard drive used by a VA employee at a Department facility in Birmingham, Ala.
- "Our investigation into this incident continues, but I believe it is important to provide the public additional details as quickly as we can," said Jim Nicholson, Secretary of Veterans Affairs. "I am concerned and will remain so until we have notified those potentially affected and get to the bottom of what happened."
- "VA will continue working around the clock to determine every possible detail we can," Nicholson said.
- VA and VA's Office of Inspector General have learned that data files the employee was working with may have included sensitive VA-related information on approximately 535,000 individuals. The investigation has also determined that information on approximately 1.3 million non-VA physicians, both living and deceased, could have been stored on the missing hard drive. It is believed, though, that most of the physician information is readily available to the public. Some of the files, however, may contain sensitive information.
47Still Happening
- Probe Targets Archives' Handling of Data on 70 Million Vets
- By Ryan Singel, October 1, 2009
- The inspector general of the National Archives and Records Administration is investigating a potential data breach affecting tens of millions of records about U.S. military veterans, Wired.com has learned. The issue involves a defective hard drive the agency sent back to its vendor for repair and recycling without first destroying the data. ....
- The incident was reported to NARA's inspector general by Hank Bellomy, a NARA IT manager, who charges that the move put 70 million veterans at risk of identity theft, and that NARA's practice of returning hard drives unsanitized was symptomatic of an irresponsible security mindset unbecoming to America's record-keeping agency.
- "This is the single largest release of personally identifiable information by the government ever," Bellomy told Wired.com. "When the USDA did the same thing, they provided credit monitoring for all their employees. We leaked 70 million records, and no one has heard a word of it."
http://www.wired.com/threatlevel/2009/10/probe-targets-archives-handling-of-data-on-70-million-vets/
48Caveats
- On standard unix/windows setups
- Without the OS providing protection, all the data is accessible
- Sometimes good for recovery
- On standard unix/windows setups
- rm/del doesn't make the data go away
- Also sometimes useful for recovery
- Even format does not guarantee data is overwritten
- See "Remembrance of Data Passed: A Study of Disk Sanitization Practices"
- IEEE Security and Privacy, v1n1 p17-27 (linked from today's reading)
49File System Sketch
50Sketch
- Manage the disk at the level of blocks of fixed size (bnodes)
- Format disk for bnodes
- File is a collection of bnodes
- Directory is a kind of file
- Root of system bnode in known location
51bnode
- Fixed-size block of data
- Minimum unit of storage allocation
- bnodes map to physical addresses
- E.g. bnode 76 → address 76 × 4096 = 311296
- Or bnode 76 → R = 1.012 in, θ = 32.07 degrees
- Address physical resources through bnodes
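The linear bnode-to-address mapping from the example, as a sketch. The 4096-byte size matches the slide's arithmetic. The function names are illustrative, and a real device could use a different mapping, such as the (R, θ) form.

```python
BNODE_SIZE = 4096  # bytes per bnode, as in the slide's example

def bnode_to_address(bnode):
    """Byte address where a bnode starts under a simple linear mapping."""
    return bnode * BNODE_SIZE

def address_to_bnode(address):
    """Inverse mapping: (bnode number, byte offset inside that bnode)."""
    return divmod(address, BNODE_SIZE)

print(bnode_to_address(76))        # 311296
print(address_to_bnode(311300))    # (76, 4)
```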
52bnode Size
- How big should a bnode be?
- Needs to be bigger than the block address
- Problems with small blocks?
- Longer addresses
- Can address smaller FS w/ fixed address bits
- Problems with large blocks?
- Minimum allocation increment
- → Internal fragmentation
- Typical values: 4KB, 1KB, 256B
- Trending toward larger these days
- Intel 4KB SSD, 64KB for some flash
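A sketch of internal fragmentation for one file at several bnode sizes. The 3172-byte file length is only an example value. The model ignores per-bnode metadata and index blocks, which is a simplifying assumption.

```python
def allocation(file_bytes, bnode_bytes):
    """Return (bnodes needed, wasted bytes) for a file stored in fixed-size bnodes."""
    bnodes = max(1, -(-file_bytes // bnode_bytes))   # ceiling division, at least one
    return bnodes, bnodes * bnode_bytes - file_bytes

for size in (256, 1024, 4096, 65536):
    print(size, allocation(3172, size))
```

With 256 B bnodes the file needs 13 bnodes and wastes 156 bytes. With 64KB bnodes it needs one bnode and wastes 62,364 bytes.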
53Files from bnodes
- Use bnodes as file handle
- How we address the file
- bnode contains metadata
- Block type, File type, length
- Small file: all data in single bnode
[Figure: bnode 76 labeled "obj, 3172"]
54Files from bnodes
- Large file: tree of bnodes
[Figure: bnode 76 labeled "obj, 15,791"]
55Files from bnodes
- Large file: tree of bnodes
- Multi-level if necessary
[Figure: bnode 76 labeled "obj, 15,791"; a further label reads 37253]
56Files from bnodes
- Large file: tree of bnodes
- Multi-level if necessary
- Overhead for tree structure?
- 4KB pages
- About 1000-way tree
- 4000KB tree (1000 data pages) needs 1001 pages
- 8KB tree needs 3 pages (50% overhead)
- In practice inodes avoid this worst-case
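The page counts above can be reproduced with a sketch that counts data pages plus index pages. It assumes each index page points to about 1000 pages and that a single data page needs no index. The function name is illustrative.

```python
def pages_needed(data_pages, fanout=1000):
    """Total pages (data + index) for a file stored as a tree of pages.

    Assumes one index page can point to `fanout` pages; a file that fits
    in one data page needs no index page at all.
    """
    total = data_pages
    level = data_pages
    while level > 1:
        level = -(-level // fanout)   # index pages needed above this level
        total += level
    return total

print(pages_needed(2))       # 3
print(pages_needed(1000))    # 1001
print(pages_needed(1001))    # 1004
```

One page past the fanout forces a second index level, which is the "multi-level if necessary" case.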
57EXT2 inode
indirect blocks
(12 of these)
Source: http://www.tldp.org/LDP/tlk/fs/filesystem.html
58File Expansion with bnodes
- Expand file
- Add bnodes to file
[Figure: bnode 76 labeled "obj, 15,791", shown before and after expansion]
59Directory
- File
- With type directory
- contains name/bnode pairs
- Small
- Fits in one bnode
- Large
- Tree of bnodes
- Just like file
[Figure: bnode labeled "Directory, 234" holding name/bnode pairs]
Lab1, 76
Lab2, 98
Lab3, 1034
Lab4, 267
Lab5, 2053
...
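A sketch of how name/bnode pairs support hierarchical lookup. The in-memory DISK dictionary, the root at bnode 2, the "ese250" directory at bnode 50, and the file contents are all hypothetical. Only the Lab1 → 76 and Lab2 → 98 pairs come from the slide.

```python
# Hypothetical stand-in for the disk: bnode number -> contents.
# Directories hold name -> bnode pairs; ordinary files hold bytes.
DISK = {
    2:  {"type": "directory", "entries": {"ese250": 50}},        # assumed root
    50: {"type": "directory", "entries": {"Lab1": 76, "Lab2": 98}},
    76: {"type": "obj", "data": b"lab 1 contents"},
    98: {"type": "obj", "data": b"lab 2 contents"},
}
ROOT_BNODE = 2

def lookup(path):
    """Walk a /-separated path from the root, one directory bnode per component."""
    bnode = ROOT_BNODE
    for name in filter(None, path.split("/")):
        node = DISK[bnode]
        if node["type"] != "directory":
            raise NotADirectoryError(name)
        bnode = node["entries"][name]   # KeyError if the name is absent
    return bnode

print(lookup("/ese250/Lab1"))   # 76
```

Because a directory is just a file of a particular type, the same lookup step repeats at every level of the hierarchy.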
60Free bnodes
- Keep track of free and usable bnodes
- Grouped by contiguous set of free blocks
- Allocation: try to find contiguous set of bnodes to satisfy file need
- and try not to break up large contiguous block unnecessarily
- Deletion: try to reassemble free blocks
- E.g. delete 13 → make 10-14 a length-5 block
[Figure: bnodes 0-23, with the free ones grouped by run length]
Length 1: 1, 14
Length 2: 3-4, 7-8
Length 3: 10-12, 16-18
Length 4: 20-23
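A sketch of reassembling free blocks, using the free bnodes listed on this slide. Recomputing the runs from a set is for illustration only. A real file system would maintain the grouping incrementally, and the function name is an assumption.

```python
def free_runs(free_bnodes):
    """Group free bnode numbers into maximal contiguous runs as (start, length)."""
    runs = []
    for b in sorted(free_bnodes):
        if runs and runs[-1][0] + runs[-1][1] == b:
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((b, 1))                          # start a new run
    return runs

# Free bnodes from the slide's figure.
free = {1, 14, 3, 4, 7, 8, 10, 11, 12, 16, 17, 18, 20, 21, 22, 23}
before = free_runs(free)
free.add(13)                      # delete the file occupying bnode 13
after = free_runs(free)
print(before)
print(after)                      # 10-12, 13 and 14 merge into (10, 5)
```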
61Superblock
- For bootstrapping and file system management
- Each file system has a master block in a canonical location (first block on device)
- Describes file-system type
- Root bnode
- Keeps track of free lists, at least the head pointers (to bnodes, blocks)
- Corruption of superblock makes file system unreadable
- → Store backup copies on disk
62Format disk
- Identify all non-defective bnodes
- Defective blocks skipped
- → those addresses not assigned to bnodes
- Create free bnode data structure
- Create superblock
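The format steps can be sketched as one function. Everything here is an assumption for illustration: is_defective is a hypothetical media test supplied by the caller, block 0 is reserved for the superblock and assumed good, and the dictionary layout of the superblock is invented.

```python
def format_disk(num_blocks, is_defective, block_size=4096):
    """Return (superblock, bnode_map) for a freshly formatted device.

    is_defective(block) is a hypothetical media test; physical block 0 is
    reserved for the superblock and assumed to be good.
    """
    # Defective blocks are skipped, so their addresses never become bnodes.
    good = [b for b in range(1, num_blocks) if not is_defective(b)]
    bnode_map = dict(enumerate(good))        # bnode number -> physical block
    root = 0                                 # first bnode holds the root directory
    superblock = {
        "fs_type": "sketch-fs",              # made-up type name
        "block_size": block_size,
        "root_bnode": root,
        "free_bnodes": [n for n in bnode_map if n != root],
    }
    return superblock, bnode_map

sb, bnode_map = format_disk(8, lambda b: b in {3, 5})
print(bnode_map)            # {0: 1, 1: 2, 2: 4, 3: 6, 4: 7}
print(sb["free_bnodes"])    # [1, 2, 3, 4]
```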
63Review Sketch
- Manage the disk at the level of bnodes
- Format disk for bnodes
- File is a collection of bnodes
- Directory is a kind of file
- Root of system bnode in known location
64Requirement Review
- Find things easily and quickly
- Minimize what we need to look at to find data
- Directory structure
- Portable (is the sole state holder)
- Self describing → superblock, metadata
- Fast read
- Attempt to layout contiguous files
- Fast write
- Not take too long to find space for file → efficient free structure
- Support deletion (repurpose capacity)
- Return bnodes to free list
- Use (most) of capacity
- Allow files to be non-contiguous → bnodes
- Tolerate errors in media
- Don't depend on contiguous blocks to be good → bnodes
- Isolate/differentiate who can access what
65Learn More
- Online reading/pointers
- Unix File System Tutorial
- Flash, SSD, Hard drive data sheets
- Data found on hard drive articles
- Courses
- CIS121: efficient data structures
- CIS380: operating systems
66Big Ideas
- Persistence/Volatility
- Self-describing
- Every disk different (at least due to media defects)
- Only place to save data across boots
- Naming
- Must have canonical way of referencing data
- Indirection
- Build logically contiguous region from non-contiguous physical regions
- Deal with growing, variable size files and errors in media