Title: Topics in Preservation Science
1The Reconstruction of Mechanically Recorded Sound by Image Processing
Update on Collaboration with the Library of Congress
Carl Haber, Lawrence Berkeley National Lab
2Collaborators
Vitaliy Fadeyev, Carl Haber, Zach Radding, and Jim Triplett: Lawrence Berkeley National Lab
Christian Maul: Taicaan Technology, U.K.
John W. McBride: University of Southampton, U.K.
Mitch Golden
Peter Alyea, Larry Applebaum, Mark Roosa, Sam Brylawski: The Library of Congress
Bill Klinger: ARSC
George Horn: Fantasy Records, Berkeley
3Lawrence Berkeley National Lab www.lbl.gov
- Founded in 1931 by E.O.Lawrence
- Oldest of US National Labs
- Operated by the University of California for the US DoE
- 4000 Staff, 800 Students, 2000 Guests
- 14 Research Divisions including
- Physics, Nuclear Science
- Materials, Chemical Science
- Life Sciences, Physical Bioscience
- Energy and Environment, Earth
- Computing
- Major user facilities:
- Advanced Light Source
- Nat. Center for Electron Microscopy
- Nat. Energy Research Super Computer Center
4Outline
- Introduction
- Summary of method (mostly a repeat)
- Towards a real 2D machine (I.R.E.N.E.)
- 3D reconstruction of an Edison cylinder
- Plans for the 3D research program
- Conclusions
- V. Fadeyev and C. Haber, J. Audio Eng. Soc., vol. 51, no. 12, pp. 1172-1185 (2003 Dec.).
- IRENE Proposal 12-Feb-2004
- V. Fadeyev et al, LBNL Report-54927
5Introduction
- We have investigated the problem of optically recovering mechanical sound recordings without contact to the medium.
- Address concerns of the preservation, archival, and research communities:
- The reconstruction of delicate or damaged media
- Mass digitization of diverse media
- The approach evolved naturally out of methods of optical metrology, pattern recognition, and image processing.
- First shown at the LC in July 2003.
- Research is now supported by an LC/DOE agreement.
- Message to take away from today's presentation:
- The techniques yield good reproductions and some improvements.
- Measurement, data storage, and computing technologies may be approaching performance levels required for this application.
- Strong development program for the near future in both the mass digitization and analytic aspects.
6Traditional Contact Playback
- Bulky stylus riding in a narrow groove leads to issues with tracking, condition of the groove, debris and contamination, and wear.
- Presence of trained manpower or supervision is de facto required.
- Modulation is lateral for most disc media, and vertical for Edison cylinders.
- Transduction may be electrical or acoustic.
(Figure labels: groove width 160 µm, lateral modulation)
7A Non-contact Method
- Using digital optical techniques, the pattern of undulations in a surface can be imaged.
- Cover surface with sequential views or grid of points.
- Views can be stitched together into a surface map.
- The images can be processed to remove defects and analyzed to model the stylus motion.
- The stylus motion model can be sampled at a standard frequency and converted to digital sound format.
- Real time playback is not required; de facto, the method is aimed at reconstruction and digitization.
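The last conversion step (a sampled stylus motion model written out in a digital sound format) could look roughly like the sketch below. It is only an illustration: the function name `velocity_to_wav_bytes`, the 44.1 kHz rate, the 16-bit mono format, and the peak normalization are all assumptions, not details taken from the slides.

```python
import io
import wave

import numpy as np


def velocity_to_wav_bytes(velocity, sample_rate=44100):
    """Scale a stylus-velocity sequence to 16-bit PCM and return WAV file bytes."""
    v = np.asarray(velocity, dtype=float)
    v = v - v.mean()                          # remove any constant offset
    peak = np.max(np.abs(v))
    if peak > 0:
        v = v / peak                          # normalize to the range [-1, 1]
    pcm = np.round(v * 32767).astype("<i2")   # little-endian 16-bit samples
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)                     # mono
        w.setsampwidth(2)                     # 2 bytes per sample
        w.setframerate(sample_rate)
        w.writeframes(pcm.tobytes())
    return buf.getvalue()


# Hypothetical input: a 1 kHz tone standing in for a reconstructed velocity signal
t = np.arange(44100) / 44100.0
wav_bytes = velocity_to_wav_bytes(np.sin(2 * np.pi * 1000 * t))
```

Writing to an in-memory buffer keeps the sketch free of file paths; the same bytes can be saved to disk as a .wav file.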
8Parameter | 78 rpm, 10 inch | Cylinder
Cut | Lateral | Vertical
Revolutions per minute | 78.26 | 80-160
Max/Min radii, inches (mm) | 4.75/1.875 (120.65/47.63) | 2 - 5, fixed (50.8-127)
Area containing audio data (mm²) | 38600 | 16200 (2 inch)
Total length of groove | 152 meters | 64-128 meters
Groove width at top, inches (µm) | 0.006 (160) | variable
Lines/inch (lines/mm), Gd | 96-136 (3.78-5.35) | 100-200 (3.94-7.87)
Line spacing (microns) | 175 - 250 | 125 - 250
Ref. level peak velocity at 1 KHz | 7 cm/sec (11 µm) | NA
Max groove amplitude (microns) | 100 - 125 | 10
Groove depth | 80 µm | 20 µm
Noise level below reference, S/N | 17-37 dB | ?
Dynamic range | 30-50 dB | ?
Groove max amplitude at noise level | 1.6 - 0.16 µm | ?
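Several entries in this table can be cross-checked with simple arithmetic. The sketch below assumes a sinusoidal groove (peak displacement = peak velocity / 2πf), treats dB as an amplitude ratio (20 log10), and approximates groove length as recorded area divided by line spacing. The helper names are illustrative, and the 100 lines/inch used for the disc is an inferred value inside the quoted 96-136 range, not a number stated on the slide.

```python
import math


def peak_amplitude_um(peak_velocity_cm_s, freq_hz):
    """Peak displacement (microns) of a sinusoid with the given peak velocity."""
    velocity_um_s = peak_velocity_cm_s * 1e4          # cm/s -> microns/s
    return velocity_um_s / (2 * math.pi * freq_hz)


def attenuate(amplitude, db_below):
    """Amplitude lying db_below decibels under the given amplitude."""
    return amplitude / 10 ** (db_below / 20)


def groove_length_m(area_mm2, lines_per_inch):
    """Approximate groove length: recorded area divided by line spacing."""
    spacing_mm = 25.4 / lines_per_inch
    return area_mm2 / spacing_mm / 1000.0


ref = peak_amplitude_um(7.0, 1000.0)      # reference level: 7 cm/sec at 1 KHz
noise_hi = attenuate(ref, 17)             # noise 17 dB below reference
noise_lo = attenuate(ref, 37)             # noise 37 dB below reference
disc_len = groove_length_m(38600, 100)    # assumed 100 lines/inch
cyl_len_100 = groove_length_m(16200, 100)
cyl_len_200 = groove_length_m(16200, 200)
```

Under these assumptions the results land close to the tabulated values: roughly 11 microns at reference level, 1.6 to 0.16 microns at the noise level, about 152 meters of groove for the disc, and 64 to 128 meters for the cylinder.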
9(Figure with panels a, b, c, d, e)
10Imaging Methods Electronic Camera
(Figure: coaxially illuminated groove on a shellac 78 rpm disc; labels: scratch, surface, dust, groove bottom, 160 µm)
- 2D method, CCD or CMOS image sensor, frame or line format
- Cameras contain 1 x 4000, or 768 x 494 pixels, or up to a few Mega-Pixels
- 1 pixel = 1 micron on the disc surface
- Magnification and pixel size yield sufficient resolution for audio data measurement due to pixel interpolation
- Frames acquired at 30-100 fps, lines at 10-20K lps
11Imaging Methods Confocal Scanning Probe
3D method.
Acquire measurements over a grid of points at up to 4 KHz.
Grid spacing may be optimized in design of the scan.
Surface of an Edison cylinder
12Imaging Methods White Light Interferometry
- 3D method, requires depth scan.
- Acquire frames at 1-20 seconds/frame, depending upon depth and slope
- Frame size 0.5 - 2.5 mm²
- Stitch together adjacent frames
Scratch in a 78 rpm shellac disc
Wax cylinder surface
13Issue of Aliasing
- Sampling theorem:
- Sample at a rate of at least 2f, where f is the highest frequency of interest
- Apply low pass filter above f to prevent aliased components appearing in data, unless noise above f can be neglected.
- In optical approach sampling is done by pixelization of image:
- High sampling frequency
- Use of pixel size to achieve effective low pass filtering
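The low pass effect of a finite pixel size can be illustrated by treating each pixel as a box average over its width: a sinusoidal feature of wavelength λ is then attenuated by |sinc(w/λ)| for pixel width w. This is a simplified model offered as a sketch; the box-average assumption and the example numbers below are not taken from the slides.

```python
import numpy as np


def pixel_response(pixel_um, wavelength_um):
    """Relative amplitude of a sinusoid after averaging over one pixel width.

    For a box average of width w the response is |sin(pi*w/L) / (pi*w/L)|,
    which is |np.sinc(w/L)| in NumPy's normalized convention.
    """
    ratio = pixel_um / np.asarray(wavelength_um, dtype=float)
    return np.abs(np.sinc(ratio))


# Hypothetical numbers: 1 micron pixels, features of 100, 4 and 1 micron wavelength
resp = pixel_response(1.0, [100.0, 4.0, 1.0])
```

In this model, long wavelengths pass almost unchanged while features at the scale of one pixel are strongly suppressed, which is the sense in which the pixel acts as a low pass filter.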
14Comparison of Specifications
Parameter | 2D Digital Camera | W.L.I. | Confocal
Acquisition | n (x m) pixels | n x m pixels | point
Transverse resolution | 0.5 - 1 µm | Pixel projection, 1-10 µm | 1.5 - 10 µm
Vertical resolution | NA | 10 nanometers | 10 nanometers
Points/measurement | 4000 | 480 x 540 = 259200 | 1
Max time/measurement | 50 µs | 1-10 seconds | 250 µs
Effective time/point | 12.5 ns | 4-40 µs | 250 µs
Low pass filtering? | Image field twice w. offset | Image field twice w. offset | 2 passes w. offset
Depth of field | 0 - 75 µm | Depth is scanned | 20 µm - mm
Cost of probe only | 7K | >100K | 30K
15Speed and Data
- 2D scans for lateral discs
- Line scan camera: 5-15 min/scan for a 10 inch 78 rpm disc
- 190 Mbytes of raw images per 1 s of audio
- 3D scans for vertical media
- Confocal method depends upon grid: 12 - 24 hours, 450 M points
- Interferometric frame methods: 1 - 5 hours
- 3D for deep groove lateral discs
- Confocal with optimized grid: 15 - 60 hours for a 10 inch 78 rpm disc
- Interferometric frame method: 10 - 100 hours (depends on groove depth)
- Key 3D issues are slope and depth
(Figure labels: vertical groove, lateral groove)
16Image Processing
- Intensity profile and edge finding measure features
- Shape recognition
- Dilation operation can remove dust particles
- Example is 2D but generalizes to 3D
- Knowledge of groove geometry provides a powerful constraint for rejecting debris and damage
(Figure labels: dilation, edge finding)
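A toy version of dust removal by dilation is sketched below. It assumes a dark dust speck on a bright groove bottom, uses a 3 x 3 grey-scale dilation (local maximum), and follows it with an erosion to restore the original edges (a morphological closing). The follow-up erosion, the window size, and the test image are illustrative choices, not the authors' stated procedure.

```python
import numpy as np


def dilate(img, size=3):
    """Grey-scale dilation: each pixel becomes the maximum over a size x size window."""
    pad = size // 2
    p = np.pad(img, pad, mode="edge")
    h, w = img.shape
    shifted = [p[i:i + h, j:j + w] for i in range(size) for j in range(size)]
    return np.max(shifted, axis=0)


def erode(img, size=3):
    """Grey-scale erosion: the minimum over the same window."""
    return -dilate(-img, size)


# Hypothetical image: dark background, bright groove bottom (rows 3-6),
# and a single dark dust pixel sitting on the bright band
img = np.zeros((10, 12), dtype=int)
img[3:7, :] = 255
img[5, 4] = 0

# Dilation fills the speck; erosion then shrinks the band back to its true edges
cleaned = erode(dilate(img))
```

Features wider than the window would survive this step, which is where the groove geometry constraint mentioned above would have to take over.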
17Signal Analysis
- For recording and playback, signal is proportional to stylus velocity
- Electrical: magnetic induction
- Acoustic: plane wave approximation, air pressure and velocity are proportional and in phase
- Electrical recordings are (deliberately) mediated by equalization scheme to attenuate low frequencies and boost high frequencies
- Acoustic recordings are (naturally) mediated by the frequency response of horns and diaphragms.
- Potential to improve fidelity with modeling of acoustic component response
- Groove data is in digital form, suitable for numerical analysis
- Determine velocity by numerical differentiation
(Figure labels: max. slope, amplitude, wavelength)
18Response of horn and diaphragm at low frequency
can modify the response and cause deviations from the
constant velocity characteristic.
19Response of one real horn
From Maxfield, J.P. and Harrison, H.C., The Bell
System Technical Journal, Volume V, No.3, July
1926, pp. 493-524
20(Dis)Advantages of Imaging Method
Disadvantages:
- Data intensive, storage and manipulation of large data-sets
- Scanning speed, 3D methods may be quite slow, use for special cases only?
- Is ultimate resolution sufficient to provide required fidelity?
Advantages:
- Delicate samples can be played without further damage.
- Independent of record material and format: wax, metal, shellac, acetates
- Effects of damage and debris (noise sources) reduced by image processing. Scratched regions can be interpolated, broken media re-assembled.
- Discrete noise sources are resolved in the spatial domain where they originate rather than as a random effect in the audio playback.
- Dynamic effects of damage (skips, ringing) are reduced.
- Classic distortions (wow, flutter, tracing and tracking errors, pinch effects etc) are absent or resolved as geometrical corrections
- Operator intervention during transcription is reduced, mass digitization.
21Relationship to Other Work
- Laser turntables (www.ELPJ.com): reflected laser spot, susceptible to damage, debris, and surface reflectivity.
- Stanke and Paul (3D Measurement and modelling in cultural applications, Inform. Serv. Use 15 (1995) 289-301): depth sensed from greyscale in 2D image of galvano, states the basic approach in general.
- S. Cavaglieri et al, Proc. of AES 20th International Conference, Budapest, Oct 5-7, 2001: photographic contact prints and scanner to archive groove pattern in 2D, no 3D analog.
- O. Springer (http://www.cs.huji.ac.il/springer/): use of desk top scanner on vinyl record, lacks resolution.
- W. Penn (First Monday, volume 8, number 5 (May 2003)): real time interferometric cylinder playback system in development.
22Test of Concept using 2D Imaging
- Precision optical metrology system SmartScope manufactured by Optical Gauging Products.
- Video zoom microscope with electronic camera and precision stage motion in x-y-z.
- Image acquisition with pattern recognition and analysis reporting software
- Wrote program to scan groove, report, and process data (offline).
- Study of 78 rpm shellac disc (1950)
23Offline Data Processing
1. Reformatting data in one global coordinate system
2. Removal of big outliers
3. Filtering by selecting on the distance between interval pair; merging two sides into one
4. Matching the adjacent frames
5. Fit the groove shape $R = R_0 + C\phi + A\sin^2(\phi + \phi_0)$
6. Numerical differentiation and resampling
7. Multiple runs addition; conversion to WAV format
24Raw measurement of groove bottom edges
ΔR distribution
ΔR: width across groove bottom
Averaged and filtered for known width, ΔR < cut
Frames aligned
Measurement spacing along time axis: 66 KHz
Stylus velocity by numerical differentiation: 4th order polynomial fit of 15 points about each sample
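Differentiation by a local polynomial fit (4th order, 15 points about each sample) can be sketched as follows. This is one plausible reading of the slide, not the authors' code: the function name is illustrative, the fit is done in sample-index units for numerical stability, and samples near the ends without a full window are simply left as NaN.

```python
import numpy as np


def local_poly_derivative(y, dt, window=15, order=4):
    """Derivative at each sample from a least-squares polynomial fit
    of the given order to the `window` points centred on that sample."""
    half = window // 2
    k = np.arange(-half, half + 1)                    # sample offsets, 0 at the centre
    # Design matrix with columns 1, k, k^2, ...; the slope of the fitted
    # polynomial at k = 0 is its linear coefficient.
    design = np.vander(k, order + 1, increasing=True).astype(float)
    slope_weights = np.linalg.pinv(design)[1] / dt    # convert per-sample to per-second
    y = np.asarray(y, dtype=float)
    dy = np.full(y.shape, np.nan)                     # ends without a full window stay NaN
    for i in range(half, len(y) - half):
        dy[i] = slope_weights @ y[i - half:i + half + 1]
    return dy


# Hypothetical check: a 1 kHz sine sampled at the 66 KHz measurement spacing
fs = 66000.0
t = np.arange(2000) / fs
velocity = local_poly_derivative(np.sin(2 * np.pi * 1000 * t), 1.0 / fs)
```

Because the fit weights are fixed for a given window and order, the loop amounts to a convolution; this is the same idea as a Savitzky-Golay derivative filter.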
25Waveform comparison
(Figure: optical, stylus, and CD waveforms; time scales 19.1 seconds and 40 ms)
- Clear reduction in clicks and pops
- Similarity of fine waveform structure
26Sound Comparison
Goodnight Irene by H. Ledbetter (Leadbelly) and
J.Lomax, performed by The Weavers with Gordon
Jenkins and His Orchestra 1950
Sound from the CD of re-mastered tape.
Sound from the mechanical (stylus) readout.
Sound from the optical readout.
Sound from the optical readout with commercial noise reduction.
27Frequency Spectra
- FFT spectra of optical (top), stylus (middle), and CD/tape (bottom)
- Audio content in range 100 - 4000 Hz very similar
- More high frequency content in stylus and CD versions.
- Effects of equalization and differentiation?
- Low frequency structure in optical sample (audible).
(Figure: FFT spectra of the three versions)
28Waveform Comparison 2nd Sample
optical
stylus
CD
29Sound Comparison
Nobody Knows the Trouble I See, traditional,
performed by Marian Anderson, Matrix
D7-RB-0814-2A, 1947
Sound from the CD of re-mastered tape.
Sound from the mechanical (stylus) readout.
Sound from the optical readout.
Sound from the optical readout with commercial noise reduction.
30Directions
- In July 2003, at the LC, some possible future directions were discussed:
- The 2D test was promising, what would it take to make a machine to run near real-time on discs? Could this address mass digitization needs?
- 3D may be the ultimate goal, could you do a small feasibility study similar to the 2D test?
- Can you propose a research program to further the 3D technology?
31Design of a 2D Machine
- In Fall of 2003 LBNL supported design study for a 2D machine based upon methods shown in the test.
- Specifications vetted with LC staff.
- Media sample collection provided by LC.
- Basic design developed by LBNL mechanical and software engineering staff.
- Design, cost, and schedule passed internal reviews at LBNL.
- Documents submitted to LC in 2/2004
32Basic Features and Goals
- I.R.E.N.E. (Image, Reconstruct, Erase Noise, Etc.)
- Follow 2D approach used in test, image groove bottom and/or top since that is known to work at some level. Quality consistent with test or better(?).
- Emphasize throughput.
- Encompass as much variation in media as possible.
- Handle broken or partial discs.
- Facility to (temporarily) flatten flexible media (Memovox)
- User friendly interface.
- Commercial off-the-shelf components
- Provide a test bed for the mass digitization application.
33Media Condition Survey
- Approx 40 discs provided by LC
- 65% GOOD: should reconstruct well
- 25% FAIR: may require additional software developments to reconstruct well
- 10% POOR: reconstruct with excess noise or distortions, or not at all.
- Similar breakdown on a random set of 78 rpm shellac discs.
34Media Condition Shellac
good
fair
Rough groove bottom edge
fair
poor
Multiple edges
35Media Condition Acetate
Exudated deep cleaned
good
Exudated surface cleaned
36System Layout
37Components
38- 4000 pixel, 18 KHz line scan sensor
- Magnify to 1 pixel = 1 µm
- 7.6 x 10^5 lines/outer ring
- 390 KHz sampling
- Time/ring: 40 seconds
- 73 mm / 4 mm: 19 rings
- 19 x 40 sec: 13 minutes
- Reduce with variable speed on inner rings: 9 minutes
- Scan time increased if warping is large.
Based upon 10 inch, 78 rpm geometry
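The scan-time figures on this slide can be reproduced approximately from the 10 inch, 78 rpm geometry (radii 120.65 mm and 47.63 mm, 78.26 rpm). The sketch below is a back-of-envelope reconstruction under stated assumptions: one image line per micron of circumference, rings 4000 pixels wide, 390 KHz interpreted as the audio sampling rate at the innermost groove, and "variable speed" interpreted as each ring taking time proportional to its own outer circumference. The function and key names are illustrative.

```python
import math


def scan_estimate(r_max_mm=120.65, r_min_mm=47.63, pixel_um=1.0,
                  pixels_per_line=4000, line_rate_hz=18000.0, rpm=78.26):
    """Rough scan-time arithmetic for a line scan camera imaging a disc in rings."""
    ring_width_mm = pixels_per_line * pixel_um / 1000.0
    rings = math.ceil((r_max_mm - r_min_mm) / ring_width_mm)

    # One image line per pixel-sized step along the circumference
    lines_outer = 2 * math.pi * r_max_mm * 1000.0 / pixel_um
    time_outer_s = lines_outer / line_rate_hz

    # Every ring scanned with the outer-ring line count
    fixed_total_s = rings * time_outer_s

    # Assumed variable-speed mode: each ring needs only its own circumference
    variable_total_s = sum(
        2 * math.pi * (r_max_mm - k * ring_width_mm) * 1000.0 / pixel_um / line_rate_hz
        for k in range(rings)
    )

    # Lines per revolution at the innermost groove, divided by playback time per revolution
    lines_inner = 2 * math.pi * r_min_mm * 1000.0 / pixel_um
    min_sampling_hz = lines_inner / (60.0 / rpm)

    return {
        "rings": rings,
        "lines_outer": lines_outer,
        "time_outer_s": time_outer_s,
        "fixed_total_s": fixed_total_s,
        "variable_total_s": variable_total_s,
        "min_sampling_hz": min_sampling_hz,
    }


est = scan_estimate()
```

With these inputs the estimate gives 19 rings, about 7.6 x 10^5 lines and roughly 42 seconds for the outer ring, a little over 13 minutes at fixed speed, a little over 9 minutes with variable speed, and about 390 KHz sampling at the inner radius, in line with the rounded numbers on the slide.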
39Performance
40Software Interface
41Software Features
- Control framework
- Data acquisition functions
- Image processing, data reduction
- Analysis package (filter, noise reduction), data quality monitor
- Calibration tools
- Directory structure
- Configuration management
- Display tools and diagnostics
- Commented source code
42Issue of Alternative Lighting
- Addition of diffuse ring light adds new features to image.
- Diameter of light source is important
- Depends strongly on location of features which reflect back into optics
- Inclusion of these features requires additional algorithms
- Usefulness still needs to be tested
(Figure labels: 1 ring, 2 rings)
43IRENE Summary
- Design study completed and documented.
- Projected scan time reasonable for a production-like machine.
- Includes a suite of attractive features (GUI, broken, warped discs, data quality plots etc.)
- Would also provide a powerful test-bed for further development.
- Provides a new statistical view of disc media
44Test of 3D Scanning
- 2 methods were identified: confocal scanning and white light interferometry
- Collaboration with Taicaan Technology and Dept of Engineering Sciences, U of Southampton, UK: confocal tools in hand
- LBNL, MG designed the experiment, performed analysis and interpretation.
- UK Group configured the hardware and performed the scan.
- Scan speed was not emphasized, wanted to perform a proof-of-concept.
45 3D Study of an Edison Cylinder
Utilize confocal scanning probe at 300 Hz
Along axis at 3 mm/sec (10 µm points)
Angular increment 0.01° (96 KHz)
59 hours of scanning for 30 seconds of audio
46Frequency Distance Map
Diameter (inches): 2.1875
RPM | 80 | 90 | 120 | 144 | 160
Surface velocity (mm/s) | 232.740 | 261.832 | 349.109 | 418.931 | 465.479
Frequency (Hz) | Wavelength (µm) at each RPM
10 | 23274.0 | 26183.2 | 34910.9 | 41893.1 | 46547.9
100 | 2327.4 | 2618.3 | 3491.1 | 4189.3 | 4654.8
500 | 465.5 | 523.7 | 698.2 | 837.9 | 931.0
1000 | 232.7 | 261.8 | 349.1 | 418.9 | 465.5
5000 | 46.5 | 52.4 | 69.8 | 83.8 | 93.1
10000 | 23.3 | 26.2 | 34.9 | 41.9 | 46.5
15000 | 15.5 | 17.5 | 23.3 | 27.9 | 31.0
20000 | 11.6 | 13.1 | 17.5 | 20.9 | 23.3
44100 | 5.3 | 5.9 | 7.9 | 9.5 | 10.6
88200 | 2.6 | 3.0 | 4.0 | 4.7 | 5.3
100000 | 2.3 | 2.6 | 3.5 | 4.2 | 4.7
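The entries of this map follow from the cylinder diameter and rotation speed: surface velocity is π x diameter x RPM / 60, and the wavelength on the surface is velocity divided by audio frequency. The sketch below regenerates a few entries; the function names are illustrative, and reading the 96 KHz figure as the equivalent sampling rate of a 0.01 degree angular step at 160 RPM is an interpretation rather than something the slides state outright.

```python
import math

DIAMETER_IN = 2.1875   # cylinder diameter from the table


def surface_velocity_mm_s(rpm, diameter_in=DIAMETER_IN):
    """Linear speed of the cylinder surface under the probe."""
    return math.pi * diameter_in * 25.4 * rpm / 60.0


def wavelength_um(freq_hz, rpm):
    """Length on the surface occupied by one cycle of the given audio frequency."""
    return surface_velocity_mm_s(rpm) * 1000.0 / freq_hz


def sampling_rate_hz(angular_step_deg, rpm):
    """Equivalent audio sampling rate of a fixed angular step at playback speed."""
    return (360.0 / angular_step_deg) * rpm / 60.0


v80 = surface_velocity_mm_s(80)
w_1k_80 = wavelength_um(1000, 80)
w_44k_160 = wavelength_um(44100, 160)
fs_160 = sampling_rate_hz(0.01, 160)
```

The same helpers can be used to choose a grid spacing: the circumferential step has to stay at or below half the shortest wavelength of interest.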
47Confocal Probe Specifications
Parameter | Value
Probe Model | STIL CHR150
Depth of field | 350 microns
Spot size | 7.5 microns
Sampling Frequency | 300 Hz
Vertical Resolution | 10 nanometers
Vertical Accuracy | 100 nanometers
Step size across grooves | 10 microns
Step size along grooves (circumferential) | 0.01° (about 5 microns on circumference)
Linear scan speed (parallel to cylinder axis) | 3 mm/second
48Methodology
- Parabolic fits to each groove bottom with fixed curvature to determine depth. Damage and debris are filtered with shape constraint.
- No explicit low pass filter applied but high sampling avoids most of the noise.
- Overall shape is distorted but can constrain with an averaged measurement of ridge heights
- Signal is (approximately) stylus velocity, perform numerical differentiation using Discrete Fourier Transform
49Overall Shape of Cylinder
- Perfect cylinder would be flat in this view
- Off center rotation
- Out of round elliptical
- Local deformity
- Heard as rumble and other low frequency effects
50(No Transcript)
51Numerical Differentiation and Filtering
The filtering factor M is defined as follows.
- The cut below 20 Hz removes the low frequency structure in the bottom-only data due to the cylinder shape irregularity.
- The 400 Hz wide transition to zero at 5.0 KHz was used to avoid the interference-like pattern triggered by jumps in the data.
- The cut above 5.2 KHz satisfies the Nyquist criterion before re-sampling to a lower digital audio standard.
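Differentiation by Discrete Fourier Transform combined with a band-limiting factor can be sketched as below. The exact form of M is not given in this transcript, so the filter here is a hypothetical stand-in: zero below 20 Hz, one in the pass band, and a linear taper from 4.8 KHz down to zero at 5.2 KHz (a 400 Hz wide transition centred on 5.0 KHz). The taper shape and its placement are assumptions.

```python
import numpy as np


def filter_factor(freqs_hz, low_cut=20.0, taper_start=4800.0, taper_end=5200.0):
    """Hypothetical M(f): 0 below low_cut, 1 in the pass band,
    linear taper to 0 between taper_start and taper_end, 0 above."""
    f = np.abs(freqs_hz)
    m = np.clip((taper_end - f) / (taper_end - taper_start), 0.0, 1.0)
    m[f < low_cut] = 0.0
    return m


def dft_velocity(displacement, sample_rate):
    """Differentiate in the frequency domain (multiply by i*2*pi*f) and apply M(f)."""
    n = len(displacement)
    spectrum = np.fft.rfft(displacement)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    spectrum *= 1j * 2 * np.pi * freqs * filter_factor(freqs)
    return np.fft.irfft(spectrum, n)


# Hypothetical test signal at 96 KHz: an offset, a 1 kHz tone, and an 8 kHz tone
fs = 96000.0
t = np.arange(9600) / fs
depth = 5.0 + np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 8000 * t)
velocity = dft_velocity(depth, fs)
```

In this example the offset and the 8 kHz component are removed, and the 1 kHz component comes out as its time derivative.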
52Waveforms
bottoms
tops
top - bottom
53Comparison of Waveforms
optical (t - b)
stylus
54Sound Comparison
- Just Before the Battle, Mother, composed by George F. Root, performed by Will Oakland and Chorus, 1909
- 1516 (..76 4M-297-2), originally as Amberol 297, 1909
- With stylus, flat equalization
- Optical version, flat equalization
- With commercial noise reduction and low frequency boost
thanks to George Horn, Fantasy Records, Berkeley
55Frequency Spectra
3D optical version with top-bottom shape
correction
Version played on modern cylinder system with
electrical stylus
56Frequency Spectra
Groove bottoms only (B)
Ridge tops only (T)
T - B (version played already)
57 3D Research Plan
- Setting up similar scanning system at LBNL.
- Acquired 4000 Hz confocal probe with high intensity xenon light source.
- Configure stages for raster and helical scans.
- Study data quality versus probe speed and grid spacing to optimize overall scan time.
- Study media with mould growth and other damage.
- Development of scratch correction code.
- Some 3D studies of disc media may be possible.
- Possibility of WLI study (?)
58(No Transcript)
59(No Transcript)
60Conclusions
- Image based methods have sufficient resolution to reconstruct audio data from mechanical media and reduce impulse noise.
- Basic process is data intensive compared to simple stylus playback.
- 2D approach may be suitable for mass digitization. How general is the 2D image quality? IRENE design can address these and other key issues.
- At present 3D methods may be suitable for reconstruction of particular samples since they require hours per scan.
- Upcoming 3D research program should clarify some issues of ultimate scan time, address damaged media
- Reports available:
- 2D: LBNL-51983, JAES Dec 2003
- 3D: LBNL-54927, to be submitted to JAES
- Info at URL www-cdf.lbl.gov/av
61Extra Slides
62Measurement of Noise at Rmin and Rmax
Optical readings
Upper sample is at outer radius
Lower sample is at inner radius
From Goodnight Irene disc
If noise is dominated by surface structures of constant size distribution, the outer radius amplitude and frequency should be greater due to greater linear speed there
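The size of the expected effect can be estimated from the linear groove speed, v = 2πR x rpm / 60. The sketch below uses the nominal 10 inch, 78 rpm radii quoted earlier in the deck (120.65 mm and 47.63 mm), which are assumed stand-ins since the actual radii of the two noise samples are not given; the function names and the 100 micron feature size are illustrative.

```python
import math


def linear_speed_mm_s(radius_mm, rpm=78.26):
    """Speed of the groove past the read point at a given radius."""
    return 2 * math.pi * radius_mm * rpm / 60.0


def feature_frequency_hz(feature_size_um, radius_mm, rpm=78.26):
    """Frequency associated with a surface feature of fixed size at a given radius."""
    return linear_speed_mm_s(radius_mm, rpm) * 1000.0 / feature_size_um


outer = linear_speed_mm_s(120.65)   # nominal outer radius (assumed)
inner = linear_speed_mm_s(47.63)    # nominal inner radius (assumed)
ratio = outer / inner

# A hypothetical 100 micron structure seen at each radius
f_outer = feature_frequency_hz(100.0, 120.65)
f_inner = feature_frequency_hz(100.0, 47.63)
```

Under this model a fixed-size structure appears at about 2.5 times higher frequency at the outer radius, and since the velocity signal scales with linear speed, its amplitude rises by the same factor.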
63Physical Origin of Noise in Optical Reconstruction
- View of raw groove shape data from region of pause, before differentiation into velocities.
- Upper plot is 0.6 second portion.
- Lower plot shows deviations about 10 Hz waveform.
- Each point is an independent edge detection across the groove bottom.
- Clear structures, spanning multiple points, are resolved, of typical scale 100 microns (0.2 ms) x 0.2 microns !!!
(Figure scale label: 0.1 seconds)
64Parameter | 78 r.p.m., 10 inch | 33 1/3 r.p.m., 12 inch
Groove width at top | 150-200 µm | 25-75 µm
Grooves/inch (grooves/mm), Gd | 96-136 (3.78-5.35) | 200-300 (7.87-11.81)
Groove spacing | 175-250 µm | 84-125 µm
Reference level peak velocity at 1 KHz | 7 cm/sec | 7 cm/sec (0.0011 cm)
Maximum groove amplitude | 100-125 µm | 38-50 µm
Noise level below reference, S/N | 17-37 dB | 50 dB
Dynamic range | 30-50 dB | 56 dB
Groove max amplitude at noise level | 1.6 - 0.16 µm | 0.035 µm
Maximum/Minimum radii | 120.65/47.63 mm | 146.05/60.33 mm
Area containing audio data | 38600 mm² | 55650 mm²
Total length of groove | 152 meters | 437 meters
65Cylinder Sample
Parameter | Value
Cylinder issue | Edison Blue Amberol
Diameter | 2 inch (2.1875 inches)
Artist | Will Oakland and Chorus
Title | Just Before the Battle, Mother
Serial number | 1516 (..76 4M-297-2), originally as Amberol 297, 1909
Date of original recording | 1909
Date of manufacture | 1920s
Tracks per inch (t.p.i.) | 200
Groove spacing | 127 µm