Title: Using the JPEG2000 image format for storage and access in biodiversity collections
1Using the JPEG2000 image format for storage and
access in biodiversity collections.
- Chris Freeland
- Missouri Botanical Garden
2But first, an oversight
3Overview of JPEG2000
- Wavelet-based compression
- Different than JPEG
- Decompress without extracting entire file
- Proposed in 2000 to supersede JPEG
- Hasn't
- Slow adoption in museums and libraries
- Poor (no) native browser support
- Few open source options
- Faster adoption in medical imaging, other
commercial applications
4Parts of the format
- Part 1, Core coding system (JP2)
- defines format adopted as standard first.
- Part 2, Extensions
- Part 3, Motion JPEG 2000
- Part 4, Conformance
- Part 5, Reference software
- Part 6, Compound image file format (JPM)
- Part 7 has been abandoned
- Part 8, Security (JPSEC)
- Part 9, Protocols and API (JPIP)
- Part 10, JP3D (volumetric imaging)
- Part 11, JPWL (wireless applications)
- Part 12, ISO Base Media File Format (common w/
MPEG-4)
5Advantages of JPEG2000
- Region extraction
- Compression
- Both lossless and lossy
- Self-containedness
- XML metadata + image
- Multiple objects can be bundled together
- Progressive Transmission
- Lower quality at early load
http://www.dlib.org/dlib/september08/chute/09chute.html
6Region Extraction
Give me x,y coordinates at z resolution.
72ppi 20KB JPG
600ppi, 200MB TIF encode to 100MB JP2
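As a rough illustration of "x,y coordinates at z resolution": each JPEG 2000 resolution level halves the image dimensions, so a region given in full-resolution pixels can be mapped onto a reduced level by dividing by a power of two. The function name and the `reduce` parameter below are illustrative only and do not come from any particular decoder.

```python
import math

def region_at_level(x, y, w, h, reduce):
    """Map a full-resolution region onto a reduced resolution level.

    Each reduction level halves width and height, so coordinates
    scale by 2**reduce. The region is rounded outward so the reduced
    region still covers everything the caller asked for.
    """
    scale = 2 ** reduce
    x0 = x // scale
    y0 = y // scale
    x1 = math.ceil((x + w) / scale)
    y1 = math.ceil((y + h) / scale)
    return x0, y0, x1 - x0, y1 - y0

# A 1024x768 window at (4000, 2000), viewed three levels down (1/8 scale).
print(region_at_level(4000, 2000, 1024, 768, 3))  # (500, 250, 128, 96)
```

A decoder that supports region extraction only has to touch the parts of the codestream covering that window, which is why a small screen-resolution view does not require unpacking the whole master file.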
7How many books in a ___?
Luis Soriano, with Alpha and Beto
- 2 Biblioburros = 4,800 books
1 Biblioburro = 2,400 books
BHL to date = 9 Biblioburros!
http://www.nytimes.com/2008/10/20/world/americas/20burro.html
8Storage requirement for a digital Biblioburro
- 2,400 books / Biblioburro
- (9,238,295 pages / 22,118 books in BHL) = 418 pages / book
- 1,002,437 pages / Biblioburro
- Avg size of each image file
- RAW/TIF = 24MB, JP2 = 2MB
- Drive space needed / Biblioburro
- TIF = 24TB, JP2 = 2TB
2 TB JP2
2,400 books
24 TB TIFs
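The arithmetic on this slide can be reproduced in a few lines. The page, book, and file-size figures are the ones quoted above; the variable names and the use of decimal units (1 TB = 1,000,000 MB) are assumptions made for this sketch.

```python
# Figures quoted on the slide.
BHL_PAGES = 9_238_295
BHL_BOOKS = 22_118
BOOKS_PER_BIBLIOBURRO = 2_400
AVG_FILE_MB = {"TIF": 24, "JP2": 2}

pages_per_book = BHL_PAGES / BHL_BOOKS
pages_per_biblioburro = pages_per_book * BOOKS_PER_BIBLIOBURRO

def storage_tb(fmt):
    """Drive space for one Biblioburro, in decimal terabytes."""
    return pages_per_biblioburro * AVG_FILE_MB[fmt] / 1_000_000

print(round(pages_per_book))         # 418
print(round(pages_per_biblioburro))  # 1002437
print(round(storage_tb("TIF")))      # 24
print(round(storage_tb("JP2")))      # 2
```

Note that the 1,002,437 figure comes from the unrounded pages-per-book average; multiplying the rounded 418 by 2,400 gives a slightly larger number.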
9Self-containedness / metadata bundling
- Not just an image, but an image, its content and its context
- Adobe XMP
- Dublin Core
- Your own XML
- TIF headers and JPEG limit fields
- Can describe more than just an image
- A whole web site
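To make the bundling idea concrete, here is a minimal sketch of how XML metadata can be wrapped for a JP2 file. JP2 files are built from boxes, each with a 4-byte big-endian length, a 4-byte type, and a payload; the XML box uses the type `xml ` (with a trailing space). The `metadata` wrapper element and the helper names are hypothetical, and the sketch only builds the box bytes; it does not splice them into an existing file.

```python
import struct
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace

def dublin_core_xml(fields):
    """Serialize a dict of Dublin Core elements to UTF-8 XML bytes."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="utf-8")

def xml_box(payload):
    """Wrap a payload in a JP2-style box: length, type 'xml ', content.

    The length field counts the 8-byte header plus the payload.
    """
    return struct.pack(">I4s", 8 + len(payload), b"xml ") + payload

box = xml_box(dublin_core_xml({
    "title": "Mushrooms of America, edible and poisonous",
    "type": "Image",
}))
print(len(box), box[4:8])
```

Because the payload is arbitrary XML, the same container can carry Dublin Core, XMP, or a locally defined schema alongside the image data.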
10Barriers for adoption
- Lack of affordable, scalable serving options
- Until recently, no open source server
- Commercial options expensive
- No native browser support
- Safari does, but via QuickTime
- But why??
- PNG?
- No motivation?
- Community skepticism
11Encoding Software
- Commercial
- Adobe Photoshop
- LuraTech SDK
- LizardTech
- Non-Commercial
- Kakadu
- ImageMagick
- IrfanView
12Decoding and Serving
- Commercial
- LizardTech
- Aware
- LuraTech ICS
- FSIV
- Non-Commercial
- Kakadu
- GSIV
- djatoka
13Part 9 JPIP
- Protocol and API for transmitting JP2
- Designed for HTTP, but not restricted to that carrier
- Don't need a browser
- Implementations are available, use is infrequent
- HiRISE camera on Mars Reconnaissance Orbiter
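As a hedged illustration of what a JPIP request over HTTP can look like, the sketch below builds a query string for a window of an image. The field names (`target`, `fsiz`, `roff`, `rsiz`, `type`) are given as I understand them from Part 9 and should be checked against the standard or your server's documentation; the host, path, and file name are made up.

```python
from urllib.parse import urlencode

def jpip_request_url(base, target, frame_size, offset, size):
    """Build a JPIP-style window request.

    frame_size: (width, height) of the resolution the client wants
    offset:     (x, y) of the window's top-left corner at that resolution
    size:       (width, height) of the window
    """
    query = {
        "target": target,
        "fsiz": f"{frame_size[0]},{frame_size[1]}",
        "roff": f"{offset[0]},{offset[1]}",
        "rsiz": f"{size[0]},{size[1]}",
        "type": "jpp-stream",
    }
    # Keep commas literal so the value pairs stay readable.
    return base + "?" + urlencode(query, safe=",")

url = jpip_request_url(
    "http://jpip.example.org/jpip",  # hypothetical server
    "plate.jp2",                     # hypothetical image name
    frame_size=(1024, 768),
    offset=(256, 128),
    size=(512, 512),
)
print(url)
```

Any HTTP client can issue such a request, which is the sense in which a browser is not required.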
14Current use of JP2 in BHL
- Serve 85 (lossy) .jp2
- LizardTech decoder
- Tiled on the fly
- Cached for performance
- GSIV browser-based client viewer
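The "tiled on the fly, cached for performance" pattern can be sketched as follows. This is not the BHL or LizardTech implementation: the tile size, the function names, and the stand-in decoder (which returns a label instead of pixels) are all assumptions made for illustration.

```python
from functools import lru_cache

TILE_SIZE = 256   # assumed tile edge in pixels
decode_calls = 0  # counts how often the (expensive) decoder runs

def decode_region(page_id, x, y, w, h):
    """Stand-in for a real JP2 decoder; returns a label, not pixels."""
    global decode_calls
    decode_calls += 1
    return f"page {page_id}: {w}x{h} at ({x},{y})"

@lru_cache(maxsize=4096)
def get_tile(page_id, col, row, width, height):
    """Return one tile of a page, decoding only on a cache miss.

    Tiles on the right and bottom edges are clipped to the image size.
    """
    x, y = col * TILE_SIZE, row * TILE_SIZE
    w = min(TILE_SIZE, width - x)
    h = min(TILE_SIZE, height - y)
    return decode_region(page_id, x, y, w, h)

first = get_tile(1274907, 2, 1, 600, 900)
again = get_tile(1274907, 2, 1, 600, 900)  # served from the cache
print(first, decode_calls)
```

In a real deployment the cache would more likely live on disk or in a shared store than in process memory, but the trade is the same: decode once, serve many times.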
15A user requests Mushrooms of America, edible and poisonous, Plate X
http://www.biodiversitylibrary.org/page/1274907
[Diagram labels: Browser, GSIV.js, .jpg, /page/1274907, www.biodiversitylibrary.org, images.mobot.org, LizardTech ExpressServer, BHLdb, Internet Archive, .jp2, pageid 1274907, locate, http://www.archive.org/download/mushroomsofameri00palm/.../mushroomsofameri00palm_0010.jp2]
20The Future: djatoka
- Developed at Los Alamos National Laboratory, Research Library
- Use of the ISO-standardized JPEG 2000 format as the service format
- Java-based open source solution built around the Kakadu JPEG 2000 library
- Geared towards reuse through URI-addressability of all image disseminations including regions, rotations, and format transformations
- Provision of a consistent, guessable URI pattern for image disseminations based on the ANSI/NISO OpenURL standard
- Provision of an extensible service framework for image disseminations enabled by OCLC's Java OpenURL package
- Availability of image disseminations in a range of image formats
- Availability of image disseminations for locally stored JPEG 2000 files, as well as for Web-accessible images in a variety of formats
- Configurable server-side, file-based caching
- Ajax-based client reference implementation, based on IIPImage JavaScript Viewer, which allows panning, zooming, and selecting the URI of the current view.
http://www.dlib.org/dlib/september08/chute/09chute.html
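A "guessable URI pattern" based on OpenURL might be assembled as in the sketch below. The parameter names (`svc.region`, `svc.level`, and so on), the service identifiers, the region order of y, x, height, width, and the `/adore-djatoka/resolver` path are recalled from djatoka's documentation and should be verified against the release you deploy; the host and image URL are made up.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def djatoka_region_url(resolver, image_id, level, region,
                       rotate=0, fmt="image/jpeg"):
    """Build an OpenURL request for one region of an image.

    region is (y, x, height, width), the order djatoka is understood
    to expect; level selects the resolution level.
    """
    y, x, h, w = region
    params = {
        "url_ver": "Z39.88-2004",
        "rft_id": image_id,
        "svc_id": "info:lanl-repo/svc/getRegion",
        "svc_val_fmt": "info:ofi/fmt:kev:mtx:jpeg2000",
        "svc.format": fmt,
        "svc.level": str(level),
        "svc.rotate": str(rotate),
        "svc.region": f"{y},{x},{h},{w}",
    }
    return resolver + "?" + urlencode(params)

url = djatoka_region_url(
    "http://localhost:8080/adore-djatoka/resolver",  # hypothetical host
    "http://example.org/images/plate.jp2",           # hypothetical image
    level=3,
    region=(0, 0, 256, 256),
)
print(url)
```

Because every view (region, rotation, output format) is fully described by the URL, views can be bookmarked, cited, and cached like any other web resource.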
21References
- djatoka
- http://www.dlib.org/dlib/july08/buonora/07buonora.html
- HUL Page Image Compression for Mass Digitization
- http://preserve.harvard.edu/massdig/hul_study/
- JP2 in Libraries and Archives
- http://j2karclib.info/taxonomy/term/2
- JPEG 2000 - a Practical Digital Preservation Standard?
- http://www.dpconline.org/docs/reports/dpctw08-01.pdf
- JPEG2000 site
- http://www.jpeg.org/jpeg2000/
22Contact
- Chris Freeland
- Missouri Botanical Garden
- 4344 Shaw Blvd.
- St. Louis, MO 63110
- chris.freeland_at_mobot.org
- http://www.chrisfreeland.com