Title: Cross-media Intelligent Searching in
1Cross-media Intelligent Searching in Digital
Library
- Yueting Zhuang
- Zhejiang University, China
- Nov. 18, 2006, Egypt
2Outline
- 1. CADAL China digital library
- 2. Our Vision to next generation of digital library
- 3. From Multimedia Retrieval to Cross-media Retrieval
- 4. Retrieval of Chinese calligraphy character: a cross-media practice
- 5. Building Personalized Portal
- 6. Conclusion
43rd Workshop 2004, CMU, USA
5ICUDL 2005, Zhejiang University, China
71. CADAL China Digital Library
- China-US One Million Book Digital Library Project
- a unique library resource for scholars, students, and citizens
- contains over one million scanned books
- A big step towards the goal: create a universal, free-to-read digital library
- Make knowledge available on the web to anyone, anytime, anywhere
http://www.cadal.zju.edu.cn
9- As of today, CADAL has achieved
- 1.023 million books were digitized, including
- Degree dissertation
- Modern Chinese books
- Traditional cultural resources
- English books
- Supporting multimedia resource
- Image
- audio
- video
- 3D model
- Chinese calligraphy
- about 200,000 clicks a day (http://www.cadal.zju.edu.cn)
- users spread over 70 countries and regions
- 16 scanning centers in China, occupying more than 2000 square meters
10Scanning books
Processing digitized books
12Users spread over 70 countries and regions
13- Service structure of CADAL
14- Current services provided by CADAL
(1) Metadata searching
- digital resources are classified into 8 classes according to the publication time and type.
- both unified and advanced search are provided for all resources
15(2) Unified search
16China Ancient
Choose the types of resources to search
17Search results contain each type of resource.
18(3) advanced search
Users can choose the search scope, the combination of results, and the result style.
Secondary search, full texts, and detailed information are available on the result page.
19(4) full-text search
- Full text search uses the texts from OCR
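Full-text search over OCR output can be illustrated with a small inverted index. This is a minimal sketch, not the CADAL implementation: the page IDs, the sample OCR text, and the function names `build_index` and `search` are assumptions made for illustration.

```python
from collections import defaultdict

def build_index(pages):
    """Map every word to the set of page IDs whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word.strip(".,;:!?")].add(page_id)
    return index

def search(index, query):
    """Return the IDs of pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical OCR text, keyed by "book/page".
pages = {
    "book1/p1": "The giant panda lives in China.",
    "book1/p2": "Calligraphy is written with a brush.",
    "book2/p7": "A panda eats bamboo in China.",
}
index = build_index(pages)
hits = search(index, "panda china")
```

Here `hits` holds the two pages that mention both query words. A production system would also need tokenization suited to Chinese text and tolerance for OCR errors.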
212. Our Vision to Next Generation of Digital
Library
- typical features of existing DLs
- books are indexed by title, author, keywords
- users query books by keywords input
- mostly only text information is returned
- multimodal data is not fully supported
- What will the next generation of DL look like?
- support multimodal sources
- enable cross-media retrieval
22Extension to the concept of Book
- The key to our vision of the next generation of digital library is the extension of the book concept
- A book is regarded as not only written symbols on paper, but also any type of multimedia item, such as
- A video clip
- An audio clip
- A painting
- ...
23So in the next generation of DL, a book can be multimodal
- We can find a general data structure to represent multimodal books
24 Supporting multimodal data is an important trend in multimedia retrieval
We get multimodal information from the real world. Can we also get multimodal data from the digital world, especially from a digital library?
25Cross-media retrieval
- After the extension of the Book concept, retrieval shall also be extended.
- We call it cross-media retrieval.
26Scenario: a simple example of cross-media retrieval
[Figure: three possible starting queries: a giant panda image, a giant panda text (textual description of the giant panda: "the Panda is a kind of cat which"), and a giant panda audio clip]
User can start a query from any type of media,
and relevant multimedia data would be returned.
27 Cross-media retrieval is a useful way to access
multimodal data
- Cross-media retrieval can be regarded as the
simulation of the real world, and it helps us get
multimodal data in a more flexible and more
informative way!
28What does cross-media retrieval need to do?
The query can be an image, an audio clip, or keywords
303. From Multimedia Retrieval to Cross-media Retrieval
(1) Image retrieval: content-based
31[Figure: searching images from a query example, with relevance feedback from positive and negative examples]
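Relevance feedback with positive and negative examples can be illustrated with a Rocchio-style query update. This is a minimal sketch, not the system shown on the slide: the Rocchio formula is swapped in as a common technique, and the feature vectors, the weights `alpha`, `beta`, `gamma`, and the function names are assumptions made for illustration.

```python
import numpy as np

def rocchio_update(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.25):
    """Move the query vector toward positive examples and away from negative ones."""
    new_query = alpha * np.asarray(query, dtype=float)
    if len(positives) > 0:
        new_query += beta * np.mean(positives, axis=0)
    if len(negatives) > 0:
        new_query -= gamma * np.mean(negatives, axis=0)
    return new_query

def rank_by_distance(query, database):
    """Return database indices sorted by Euclidean distance to the query."""
    dists = np.linalg.norm(np.asarray(database, dtype=float) - query, axis=1)
    return np.argsort(dists)

# Hypothetical 2-D image features (for example, two color statistics per image).
database = np.array([[0.0, 0.0], [1.0, 1.0], [0.9, 1.1], [5.0, 5.0]])
query = np.array([0.2, 0.2])
first_ranking = rank_by_distance(query, database)

# The user marks image 1 as a positive example and image 0 as a negative example.
refined = rocchio_update(query, positives=database[[1]], negatives=database[[0]])
second_ranking = rank_by_distance(refined, database)
```

After the feedback round the refined query sits close to the positive example, so image 1 moves to the top of `second_ranking` while image 0 drops.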
32(2) Image retrieval text-based
Query text
33(3) Motion retrieval
Given a query example of motion data, we can find similar motion data in the database.
34(4) Audio retrieval: content-based
System Framework
35Audio retrieval: key techniques
- extract auditory features from audio clips in the compressed domain
- apply fuzzy clustering to the auditory features
- represent audio clips with the cluster centers
- retrieve similar audio clips by cluster center matching
- introduce relevance feedback techniques
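The clustering and matching steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: plain fuzzy c-means is used as a stand-in for the fuzzy clustering step, the frame features are synthetic 2-D vectors rather than compressed-domain audio features, and `clip_distance` is one hypothetical way to match cluster centers.

```python
import numpy as np

def fuzzy_c_means(frames, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Plain fuzzy c-means; returns cluster centers of shape (n_clusters, dim)."""
    rng = np.random.default_rng(seed)
    # Random membership matrix whose rows sum to 1.
    u = rng.random((len(frames), n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ frames) / w.sum(axis=0)[:, None]
        # Distance of every frame to every center (epsilon avoids division by zero).
        d = np.linalg.norm(frames[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers

def clip_distance(centers_a, centers_b):
    """Symmetric average nearest-center distance between two clips."""
    d = np.linalg.norm(centers_a[:, None, :] - centers_b[None, :, :], axis=2)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

def make_clip(rng, means, n_frames=40):
    """Synthetic clip: frame features scattered around the given means."""
    return np.vstack([rng.normal(mu, 0.1, size=(n_frames, 2)) for mu in means])

rng = np.random.default_rng(1)
query_clip = make_clip(rng, [(0, 0), (1, 1)])
similar_clip = make_clip(rng, [(0, 0), (1, 1)])
different_clip = make_clip(rng, [(5, 5), (6, 4)])

# Each clip is represented only by its cluster centers.
c_query, c_similar, c_different = (
    fuzzy_c_means(clip) for clip in (query_clip, similar_clip, different_clip)
)
dist_similar = clip_distance(c_query, c_similar)
dist_different = clip_distance(c_query, c_different)
```

Because each clip is reduced to a handful of centers, matching compares a few vectors per clip instead of every frame.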
36 Audio retrieval: an example
[Figure: query example, feature weights, and weight adjusting through relevance feedback]
37(5) Video retrieval: overview
- unlike text resources, video is unstructured.
- rich in visual contents
- poor in semantic understanding
- the challenging issues
- summarization and structuring
- video mining
38(5) Video retrieval: key techniques
- video structuring
- construct video table-of-content (VTOC)
- make it physically structured.
- video summarization
- help the user quickly grasp the content of video clips
- support video browsing
- video encoding/compression
39[Figure: video structuring hierarchy. Levels: video stream, shot, key frame, group, scene, table of contents. Operations: shot boundary detection, key frame extraction (temporal and spatial features), grouping, scene construction, concept clustering]
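The lowest levels of this hierarchy, shot boundary detection and key frame extraction, can be sketched with gray-level histogram differences. This is a minimal illustration, not the method used in the talk: the histogram-difference test, the threshold of 0.5, the middle-frame heuristic, and the synthetic frames are all assumptions.

```python
import numpy as np

def gray_histogram(frame, bins=16):
    """Normalized gray-level histogram of one frame (2-D array, values 0..255)."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / hist.sum()

def detect_shot_boundaries(frames, threshold=0.5):
    """Indices i where a new shot starts, judged against frame i - 1."""
    hists = [gray_histogram(f) for f in frames]
    boundaries = []
    for i in range(1, len(hists)):
        # The L1 distance between two normalized histograms lies in [0, 2].
        if np.abs(hists[i] - hists[i - 1]).sum() > threshold:
            boundaries.append(i)
    return boundaries

def key_frames(frames, boundaries):
    """Pick the middle frame of each shot as its key frame (a simple heuristic)."""
    starts = [0] + boundaries
    ends = boundaries + [len(frames)]
    return [(s + e - 1) // 2 for s, e in zip(starts, ends)]

rng = np.random.default_rng(0)

def shot(level, n):
    """n noisy frames whose gray level is centered on `level`."""
    return [np.clip(rng.normal(level, 5, size=(24, 32)), 0, 255) for _ in range(n)]

# Synthetic video: 5 dark frames, 4 bright frames, 6 mid-gray frames.
video = shot(40, 5) + shot(200, 4) + shot(120, 6)
cuts = detect_shot_boundaries(video)
keys = key_frames(video, cuts)
```

Grouping shots into scenes and building the table of contents would operate on the key frames produced here.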
40- Video summary: video content mining
[Figure: video content mining turns the original video (redundant) into a summarized video (concise and informative)]
Find meaningful patterns to support efficient video browsing
41Two news videos are segmented into 6 video shots (the key frames are shown below). Their total length is 3 minutes.
42After video summarization, the video is 3 seconds long and consists of the 3 key frames shown below.
43 video shot clustering result
44 video browse
45 video browse
summary
key frames
46(6) 3D model retrieval overview
measure 3D model with shape similarity
47(6) 3D model retrieval an example
query example
48- As shown above, multimedia retrieval is generally content-based X retrieval (CBXR).
49- Towards cross-media retrieval
[Figure: image retrieval, audio retrieval, video retrieval, motion retrieval, and 3D model retrieval (CBXR) lead towards cross-media retrieval]
We can provide a more flexible and efficient way to access multimodal data. We call it cross-media retrieval.
50- Support multimodal sources
- smooth integration of multimodal data
- query media objects by examples of different modalities
- Challenging issues
- texts, images, audios, etc. are represented with different features
- different features are heterogeneous
- cross-media similarity cannot be measured by content features
- there is a semantic gap between low-level features and semantics
51- Our Solution to Cross-media retrieval
- build cross-indexing from multimodal data
- organize multimedia document
- explore cross-media correlations
52Cross-indexing-based retrieval General idea
Retrieval interface
53(1) Cross-index retrieval interface
The system now supports images, audio, and video. Users can submit any of these media objects, and the system returns relevant images, audio clips, and videos.
54Building multimedia document General idea
- definition of multimedia document
- a logical representation of multimodal data
- consists of semantically related media objects
- formal structure
Document = <ID, Title, URI, KeywordList, ElementSet, LinkSet>
ElementSet = (Audio | Image | Text | Video)^i, i ∈ N
Audio = <ID, ParentID, URI, Size, KeywordList, AudioFeature>
Image = <ID, ParentID, URI, Size, KeywordList, ImageFeature>
Text = <ID, ParentID, URI, KeywordList>
Video = <ID, ParentID, URI, Frames, KeywordList, VideoFeature>
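The formal structure above maps naturally onto a small set of record types. The sketch below is one possible encoding, not the authors' implementation: the Python class and field names, the representation of features as float lists, the reading of LinkSet as pairs of element IDs, and the sample URIs are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class MediaObject:
    id: str
    parent_id: str                      # ID of the owning Document
    uri: str
    keywords: List[str] = field(default_factory=list)

@dataclass
class Audio(MediaObject):
    size: int = 0
    feature: List[float] = field(default_factory=list)   # AudioFeature

@dataclass
class Image(MediaObject):
    size: int = 0
    feature: List[float] = field(default_factory=list)   # ImageFeature

@dataclass
class Text(MediaObject):
    pass

@dataclass
class Video(MediaObject):
    frames: int = 0
    feature: List[float] = field(default_factory=list)   # VideoFeature

@dataclass
class Document:
    id: str
    title: str
    uri: str
    keywords: List[str] = field(default_factory=list)
    elements: List[MediaObject] = field(default_factory=list)    # ElementSet
    links: List[Tuple[str, str]] = field(default_factory=list)   # LinkSet (assumed: element ID pairs)

    def add(self, element: MediaObject) -> None:
        """Attach a media object, enforcing the ParentID relation."""
        if element.parent_id != self.id:
            raise ValueError("element.parent_id must match the document ID")
        self.elements.append(element)

    def by_type(self, cls):
        """Return all elements of one media type."""
        return [e for e in self.elements if isinstance(e, cls)]

# Hypothetical document grouping semantically related media objects.
doc = Document(id="d1", title="Giant panda", uri="doc://d1", keywords=["panda"])
doc.add(Image(id="i1", parent_id="d1", uri="img://i1", keywords=["panda"], size=2048))
doc.add(Audio(id="a1", parent_id="d1", uri="aud://a1", size=4096))
doc.links.append(("i1", "a1"))
```

Keeping `ParentID` on every element lets a query that matches a single media object be traced back to the whole document.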
55Building multimedia document: framework
[Figure: system framework. Labels: Storage Subsystem; Multimedia document; keyword, text, image, audio, video, graphics; Learning and Relevance feedback subsystem; Query Processor (multimedia document, media objects); Preprocessing; Semantic skeleton base]
56Building multimedia document: retrieval interface
- The figure on the left shows the relevant media data retrieved by the query "water".
- A multimedia document is visualized as its sketch, i.e. text, images, and key-frame lists for videos.
- Besides keyword-based search, the user can perform a content-based search with a specific media object as the query example
57Exploring cross-media correlations: challenges
Gap 1: Content gap
Challenges
- multimodal data reside in heterogeneous feature spaces
- the semantic gap
58Exploring cross-media correlations: solutions
Images and audio clips represent high-level semantics from different perspectives. If we can find the correlation between different perspectives, we can enable cross-media retrieval with the bridge of correlations.
[Figure: correlations between media objects, with the example concepts bird, tiger, explosion, dog, and car]
59Exploring cross-media correlations: mathematical realization
Basic idea
X and Y are of different dimensions.
They are mapped to X' and Y', which are of the same dimension.
At the same time, the correlation between X' and Y' maximally coincides with the correlation between X and Y.
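One standard way to map two feature sets of different dimensions into a shared subspace while preserving their correlation is canonical correlation analysis (CCA). The slide does not name the method, so CCA is an assumption here, as are the synthetic data, the regularization term `reg`, and the names `cca`, `X_sub`, and `Y_sub`.

```python
import numpy as np

def inv_sqrt(mat):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def cca(X, Y, k, reg=1e-6):
    """Return (Wx, Wy, correlations) for the top-k canonical directions."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    n = len(X)
    Sxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n
    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is)
    return Sxx_is @ U[:, :k], Syy_is @ Vt.T[:, :k], s[:k]

rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))                                       # shared latent "semantics"
X = z @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))   # e.g. 5-D image features
Y = z @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(500, 3))   # e.g. 3-D audio features

Wx, Wy, corrs = cca(X, Y, k=2)
X_sub = (X - X.mean(axis=0)) @ Wx    # image features in the shared 2-D subspace
Y_sub = (Y - Y.mean(axis=0)) @ Wy    # audio features in the shared 2-D subspace
measured = np.corrcoef(X_sub[:, 0], Y_sub[:, 0])[0, 1]
```

Once both media types live in the same subspace, an ordinary distance between a projected image and a projected audio clip can serve as a cross-media similarity.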
60Exploring cross-media correlations: subsequent challenges
1. How to measure both intra- and inter-media correlations?
[Figure: intra-media and cross-media correlations]
2. How to introduce new media objects into the system?
[Figure: testing data are located in the correlation network in the subspace]
624. Retrieval of Chinese Calligraphy Character
- Original calligraphy works are unique.
- They exist on paper and bamboo slips, and are easily destroyed.
63How to search?
- In our digital library, we digitize Chinese calligraphy works
- Design retrieval systems to make them sharable by everyone on the Internet.
641. to query similar characters
Similar characters could be found and returned to
users. This is like traditional content based
image retrieval.
652. to find out where a character comes from
Character ? comes from this work
We aim to provide an intelligent way to find the surrounding characters and present them to users.
66System Overview
67 (1). segmentation
- noise elimination
- page-image analysis
- smoothing
(2). retrieval
- feature extraction
- shape matching
- speed up
68(1) segmentation
minimum-bounding box
We segment the page into columns, then cut the columns into individual characters, each within its minimum bounding box.
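The column-then-character segmentation can be sketched with projection profiles on a binarized page. This is an illustration only: projection profiles are an assumed technique, the page is a tiny synthetic array, and the sketch ignores touching characters and noise, which a real system must handle.

```python
import numpy as np

def runs(profile):
    """Return (start, end) pairs of consecutive non-zero entries; end is exclusive."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

def segment_page(page):
    """Split a binary page (1 = ink) into boxes (top, bottom, left, right)."""
    boxes = []
    for left, right in runs(page.sum(axis=0)):            # columns: vertical projection
        column = page[:, left:right]
        for top, bottom in runs(column.sum(axis=1)):      # characters: horizontal projection
            glyph = column[top:bottom]
            # Tighten to the minimum bounding box of the ink inside this cell.
            cols = np.flatnonzero(glyph.sum(axis=0))
            boxes.append((top, bottom, left + cols[0], left + cols[-1] + 1))
    return boxes

page = np.zeros((12, 10), dtype=int)
page[1:4, 1:4] = 1      # character 1, first column
page[6:9, 2:4] = 1      # character 2, first column (narrower)
page[2:5, 6:9] = 1      # character 3, second column
boxes = segment_page(page)
```

Each box can then be cropped from the page image and passed on to feature extraction.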
69(2) Retrieval of Chinese Calligraphy Characters
Calligraphy characters are written with a brush instead of a hard pen. The brush makes strokes vary in shape and thickness. Ancient calligraphy has also suffered much degradation from natural changes.
We use contour points to represent each calligraphy character, and keep the features of each individual character in the database.
70- use polar coordinates to represent the characters
Divide the directions equally into 8 bins, and divide each bin into 4 areas. Then count the points in every bin, as shown in the picture.
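The 8 x 4 polar binning can be sketched as follows. This is a minimal illustration under assumptions the slide does not state: the pole is placed at the centroid of the contour points, radii are normalized by the largest radius, and the 4 areas are equal-width radial rings.

```python
import numpy as np

def polar_histogram(points, n_angle=8, n_radius=4):
    """Count contour points in n_angle x n_radius polar bins around the centroid."""
    pts = np.asarray(points, dtype=float)
    rel = pts - pts.mean(axis=0)
    radius = np.hypot(rel[:, 0], rel[:, 1])
    angle = np.arctan2(rel[:, 1], rel[:, 0]) % (2 * np.pi)
    a_bin = np.minimum((angle / (2 * np.pi) * n_angle).astype(int), n_angle - 1)
    # Normalize radii by the largest one so the descriptor is scale-invariant.
    r_max = radius.max() if radius.max() > 0 else 1.0
    r_bin = np.minimum((radius / r_max * n_radius).astype(int), n_radius - 1)
    hist = np.zeros((n_angle, n_radius), dtype=int)
    np.add.at(hist, (a_bin, r_bin), 1)
    return hist

# Hypothetical contour: 16 points on a unit circle plus 16 on a circle of radius 0.6.
theta = np.linspace(0, 2 * np.pi, 16, endpoint=False) + 0.1
outer = np.c_[np.cos(theta), np.sin(theta)]
contour = np.vstack([outer, 0.6 * outer])
hist = polar_histogram(contour)
```

The flattened 32-value histogram is a compact shape feature that can be stored per character and compared with any vector distance.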
71- coarse-to-fine Strategy
- improve Shape matching algorithm
- dynamic time warping (DTW) of projection histograms
- extended DTW for 2D calligraphy contour warping
- high dimensional indexing
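The coarse matching step, dynamic time warping of projection histograms, can be sketched with the classic DTW recurrence. This is an illustration, not the authors' improved algorithm: the absolute-difference local cost, the column-wise projection, and the toy glyphs are assumptions, and the extended 2D contour warping is not shown.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def projection_histogram(glyph, axis=0):
    """Ink count per column (axis=0) or per row (axis=1) of a binary glyph."""
    return glyph.sum(axis=axis).astype(float)

# Hypothetical glyphs: a vertical bar, the same bar stretched sideways, a flat stroke.
bar = np.zeros((6, 6), dtype=int)
bar[:, 2:4] = 1
wide_bar = np.zeros((6, 9), dtype=int)
wide_bar[:, 3:6] = 1
flat = np.zeros((6, 6), dtype=int)
flat[2:4, :] = 1

h_bar, h_wide, h_flat = (projection_histogram(g) for g in (bar, wide_bar, flat))
d_same_shape = dtw_distance(h_bar, h_wide)
d_other_shape = dtw_distance(h_bar, h_flat)
```

DTW absorbs the horizontal stretch between the two bars, so their cost is zero, while the flat stroke stays far away. A cheap test like this can prune candidates before a finer contour-based match.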
72Visualization of Chinese Calligraphy
Retrieval result
Shape-based character retrieval
Submit Example
745. Building Personalized Portal
- Personalized portal
- Web personalization is the technique to help users quickly locate interesting information which features multimedia and cross-media.
- Service integration around the content
- Information filtering based recommendation
Show me the information that I really need !
75- Personalization services provided by portal
- my bookshelf
- my bookmark
- my rules
- personal profile
- setting
My bookshelf
My bookmark
Books recommended by rules
76- service integration around the content
- detail information about book
- translate metadata
- full-text search
- my bookshelf management
- ranking
- CALIS union catalog and inter-library loan
- My bookshelf management
- my bookmark management
- bilingual translation
- full-text search
77- information filtering based recommendation
- the classification of Web data
- content data: texts, images
- structure data: XML/HTML tags
- usage data: Web access logs
- user profile: preferences, demographic information
- implementing information filtering techniques
- content based filtering method
- collaborative filtering method
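The collaborative filtering method can be illustrated with a user-based recommender over usage data. This is a minimal sketch, not the portal's implementation: the rating matrix, the cosine similarity measure, and the function name `recommend` are assumptions made for illustration.

```python
import numpy as np

def recommend(ratings, user, top_n=1):
    """User-based collaborative filtering on a users x books matrix (0 = unread)."""
    norms = np.linalg.norm(ratings, axis=1)
    norms[norms == 0] = 1.0
    sims = (ratings @ ratings[user]) / (norms * norms[user])   # cosine similarity
    sims[user] = 0.0                                           # ignore the target user
    scores = sims @ ratings                                    # similarity-weighted votes
    scores[ratings[user] > 0] = -np.inf                        # skip books already read
    ranked = np.argsort(-scores)
    return [int(b) for b in ranked[:top_n] if np.isfinite(scores[b])]

# Hypothetical usage data: rows are users, columns are books, values are ratings.
ratings = np.array([
    [5, 4, 0, 0, 0],   # user 0: the target user
    [5, 5, 4, 0, 0],   # user 1: similar taste, also liked book 2
    [0, 0, 0, 5, 4],   # user 2: different taste
], dtype=float)
suggested = recommend(ratings, user=0)
```

Book 2 is suggested because the only user with similar taste rated it highly. Content-based filtering would instead compare book features against the user profile.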
786. Conclusion
- The next generation of digital library shall focus more on multimedia retrieval, and finally on cross-media retrieval.
- But more research issues remain to be addressed
- Cross-Media Representation Framework
- Cross-Media Knowledge-based Reasoning
- Analysis and Recognition
- Complex retrieval
79Thanks !