MediaView Towards a Semantic Multimedia Database Model - PowerPoint PPT Presentation

Provided by: Qing51

1
MediaView -- Towards a Semantic Multimedia
Database Model
  • Qing Li
  • Dept of Computer Science
  • City University of Hong Kong

2
Outline
  • Motivation & Introduction
  • Modeling Constructs
  • Logical Implementation
  • Real-World Applications
  • Conclusion

3
State-of-the-art
  • Multimedia Systems and Applications
  • an explosive growth in recent years
  • demand on managing multimedia using databases
  • Database techniques for multimedia
  • data modeling
  • indexing
  • query processing
  • presentation synchronization

4
Semantic Gap
  • Semantics-intensive multimedia systems and applications require the
    semantic meaning of the data
  • Non-semantic multimedia data models model only raw data and primitive
    properties (size, format, etc.)
  • The mismatch between the two is the semantic gap
5
Semantic modeling of multimedia -- Why hard?
  • Context-dependency
  • Semantics is not a static and intrinsic property
  • The semantics of an object often depends on
  • the application/user that manipulates the object
  • the role that the object plays
  • other objects in the same context

Example contexts: Van Gogh's paintings, flower
6
Why hard? (cont.)
  • Modality-independency
  • Media objects of different modalities may suggest
    similar or related semantic meanings.
  • Example: one query returns results of three modalities (image, video,
    text). The text result reads: "Harry Potter has never been the star of a
    Quidditch team, scoring points while riding a
    broom far above the ground. He knows no spells,
    has never helped to hatch a dragon, and has never
    worn a cloak of invisibility."
7
MediaView: A Semantic Bridge
  • An object-oriented view mechanism that bridges
    the semantic gap between multimedia systems and
    databases
  • Core concept: media view (MV)
  • a customized context for semantic interpretation
    of media objects (text docs, images, video, etc.)
  • media views collectively constitute the conceptual
    infrastructure of a multimedia system or
    application

8
Architecture
MediaView Mechanism
9
Fundamentals of MediaView
  • Basic concepts: class vs. MV
  • View operators: basic functions of MV
  • View algebra: derivations of MV
  • Comparison: other dynamic data models

10
Basic Concepts
  • Definition 1: Let C be the set of base classes. A
    base class Ci ∈ C has a unique class name, a type
    description, and a set of objects associated with
    it. The type of Ci is referred to as type(Ci),
    which defines a set of properties as the common
    interface of all the instances of Ci. This set of
    properties is referred to as properties(Ci), and
    each property in it can be a value of a simple
    type, an instance of a certain class, or a
    method. The set of objects associated with Ci is
    defined as extent(Ci) = {o | o ∈ Ci}.

11
Basic Concepts
  • Definition 2: A media view MVi is a virtual class
    that has a unique view name, a type description,
    and a set of objects associated with it. The type
    of MVi is referred to as type(MVi), which defines
    a set of properties, properties(MVi), as the
    common interface of all its instances.
    Similarly, a property can be a value of a simple
    type, an instance of a media view, or a method.
    The set of objects associated with MVi is defined
    as extent(MVi) = {o | o ∈ MVi}.

12
Basic Concepts
  • So, a media view MVi can be represented as a
    triple
  • MVi = <Mi, Pi, Ri>
  • where
  • Mi - a set of objects that are included in MVi
    as its members. Each object o ∈ Mi belongs to a
    certain source class, and different members of
    MVi may belong to different source classes.
  • Pi - a set of properties (attributes and
    methods) applied either on MVi itself (Piv) or on
    all the members (Pim).
  • Ri - a set of relationships; each r ∈ Ri is of
    the form <oj, ok, t>, which denotes a
    relationship of type t between members oj and ok
    in MVi. Ri itself may form a graph.
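
A minimal Python sketch of the triple above, assuming objects are identified by plain strings. The class name `MediaView`, its field names, and the sample object ids are illustrative choices, not part of the original model.

```python
from dataclasses import dataclass, field

@dataclass
class MediaView:
    """Hypothetical container for the triple <Mi, Pi, Ri>."""
    name: str
    members: set = field(default_factory=set)         # Mi: object ids from any source class
    view_props: dict = field(default_factory=dict)    # Piv: properties of the view itself
    member_props: dict = field(default_factory=dict)  # Pim: properties applied to every member
    relationships: set = field(default_factory=set)   # Ri: (oj, ok, t) triples

    def relate(self, oj, ok, t):
        """Record a relationship of type t between two members of this view."""
        if oj not in self.members or ok not in self.members:
            raise ValueError("both objects must be members of the view")
        self.relationships.add((oj, ok, t))

# Members may come from different source classes (here an image and a text document).
vg = MediaView("van_gogh", members={"img:sunflowers", "txt:biography"})
vg.relate("img:sunflowers", "txt:biography", "described-by")
```

Storing relationships as a set of triples keeps Ri readable as an edge list, which matches the remark that Ri may form a graph.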

13
Basic Concepts
  • Definition 3: A base class Ci is defined as a
    subclass of another base class Cj if and only if
    the following two conditions hold: (1)
    properties(Cj) ⊆ properties(Ci), and (2)
    extent(Ci) ⊆ extent(Cj). If Ci is a subclass of
    Cj, we also say that there is an is-a
    relationship from Ci to Cj. A base schema (BS) is
    a directed acyclic graph G = (V, E), where V is a
    finite set of vertices and E is a finite set of
    edges, a binary relation defined on V × V. Each
    element in V corresponds to a base class Ci. Each
    edge of the form e = <Ci, Cj> ∈ E represents an
    is-a relationship from Ci to Cj (that is, Ci is a
    subclass of Cj).

14
Basic Concepts
  • Definition 4: A media view MVi is a subview of
    another media view MVj (or there is an is-a
    relationship from MVi to MVj) if and only if
    properties(MVj) ⊆ properties(MVi) and extent(MVi)
    ⊆ extent(MVj). A view schema (VS) is a directed
    acyclic graph G = (V, E), where a vertex in V
    corresponds to a media view MVi, and an edge
    e = <MVi, MVj> ∈ E represents an is-a relationship
    from MVi to MVj (that is, MVi is a subview of MVj).
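
The two subset conditions of the subview definition can be checked directly. The sketch below assumes each view is given as a pair (property names, extent); the function name `is_subview` and the sample views are hypothetical.

```python
def is_subview(props_i, extent_i, props_j, extent_j):
    """True if view i is a subview of view j:
    properties(MVj) is a subset of properties(MVi) and
    extent(MVi) is a subset of extent(MVj)."""
    return set(props_j) <= set(props_i) and set(extent_i) <= set(extent_j)

# Hypothetical views: "painting" is more general than "vg_painting".
painting = ({"title"}, {"o1", "o2", "o3"})
vg_painting = ({"title", "period"}, {"o1", "o2"})

print(is_subview(*vg_painting, *painting))  # True
print(is_subview(*painting, *vg_painting))  # False
```

Note the opposite directions of the two inclusions: a subview has more properties but fewer members.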

15
Basic Concepts
  • An example

16
Basic Concepts
  • Semantics-based data reorganization via media
    views

17
Basic Concepts
  • Definition 5: The semantic graph (SG) is an
    undirected graph G = (V, E), where V is a finite
    set of vertices and E is a finite set of edges.
    Each element Vi ∈ V corresponds to a multimedia
    object Oi in the database. E is a ternary
    relation defined on V × V × N. Each e = <Vi, Vj, n> ∈ E
    represents a semantic link of degree n between
    objects Oi and Oj, where n is the number of media
    views to which both objects belong. We define n
    as the correlation factor between Oi and Oj.

18
Basic Concepts
  • Definition 6: The correlation matrix M = [Mij] is
    the adjacency matrix of the semantic graph.
    Specifically, each element Mij contains the
    correlation factor between Oi and Oj, with all
    the diagonal elements set to zero.
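
As an illustration, the correlation matrix can be computed by counting, for each pair of objects, the media views that contain both. This is a sketch under the assumption that each view's extent is a set of object ids; `correlation_matrix` and the sample data are made up for the example.

```python
from itertools import combinations

def correlation_matrix(objects, views):
    """Build M where M[i][j] is the number of media views containing both
    objects[i] and objects[j]; diagonal entries stay zero."""
    index = {o: i for i, o in enumerate(objects)}
    n = len(objects)
    m = [[0] * n for _ in range(n)]
    for extent in views.values():
        # Every unordered pair inside one view adds 1 to its correlation factor.
        for a, b in combinations(sorted(extent), 2):
            i, j = index[a], index[b]
            m[i][j] += 1
            m[j][i] += 1
    return m

objects = ["o1", "o2", "o3"]
views = {"flower": {"o1", "o2"}, "painting": {"o1", "o2", "o3"}}
M = correlation_matrix(objects, views)
print(M)  # [[0, 2, 1], [2, 0, 1], [1, 1, 0]]
```

The matrix is symmetric because the semantic graph is undirected.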

19
Basic Concepts
  • Semantic Graph Model

20
View Operators
  • A set of operators that take media views and view
    instances as operands.
  • Our intention is not to come up with a complete
    set of operators, but to focus on those that are
    indispensable in supporting queries and
    navigation over multimedia objects.

21
View Operators
  • type-level
  • V-overlap
  • syntax: <boolean> v-overlap(<media view1, media
    view2>)
  • semantics: true if and only if (∃ o ∈
    O)(o ∈ extent(<media view1>) and o ∈ extent(<media
    view2>))
  • Cross
  • syntax: {<object>} cross(<media view1, media
    view2>)
  • semantics: {<object>} = {o ∈ O | o ∈
    extent(<media view1>) and o ∈ extent(<media
    view2>)}
  • Sum
  • syntax: {<object>} sum(<media view1,
    media view2>)
  • semantics: {<object>} = {o ∈ O | o ∈
    extent(<media view1>) or o ∈ extent(<media view2>)}
  • Subtract
  • syntax: {<object>} subtract(<media view1, media
    view2>)
  • semantics: {<object>} = {o ∈ O | o ∈ extent(<media
    view1>) and o ∉ extent(<media view2>)}
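
If each extent is held as a Python set, the four type-level operators map onto built-in set operations. This is only a sketch; the function names mirror the operator names, with `sum_` renamed to avoid shadowing the built-in.

```python
def v_overlap(extent1, extent2):
    """True iff at least one object belongs to both extents."""
    return not extent1.isdisjoint(extent2)

def cross(extent1, extent2):
    """Objects that belong to both extents."""
    return extent1 & extent2

def sum_(extent1, extent2):
    """Objects that belong to either extent."""
    return extent1 | extent2

def subtract(extent1, extent2):
    """Objects in the first extent that are not in the second."""
    return extent1 - extent2

flower = {"o1", "o2"}
painting = {"o2", "o3"}
print(v_overlap(flower, painting))  # True
print(cross(flower, painting))      # {'o2'}
print(subtract(flower, painting))   # {'o1'}
```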

22
View Operators
  • instance-level
  • Class
  • syntax: <base class> class(<view instance>)
  • semantics: <view instance> is an instance of <base
    class>
  • components
  • syntax: {<object>} components(<view instance>)
  • semantics: {<object>} = {o ∈ O | o is a component
    (direct or indirect) of <view instance>}
  • i-overlap
  • syntax: <boolean> i-overlap(<view instance1>,
    <view instance2>)
  • semantics: true if and only if (∃ o ∈ O)(o ∈
    components(<view instance1>) and o ∈
    components(<view instance2>))
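
The "direct or indirect" wording of components suggests a transitive closure. The sketch below assumes a hypothetical `parts` mapping from an object id to the ids of its direct components; that layout is not specified in the slides.

```python
def components(instance, parts):
    """Return all direct and indirect components of `instance`.
    `parts` maps an object id to its direct component ids (assumed layout)."""
    found, stack = set(), list(parts.get(instance, ()))
    while stack:
        o = stack.pop()
        if o not in found:
            found.add(o)
            stack.extend(parts.get(o, ()))
    return found

def i_overlap(a, b, parts):
    """True iff the two instances share at least one component."""
    return not components(a, parts).isdisjoint(components(b, parts))

parts = {"slideshow": ["clip", "caption"], "clip": ["frame1"], "album": ["frame1"]}
print(sorted(components("slideshow", parts)))  # ['caption', 'clip', 'frame1']
print(i_overlap("slideshow", "album", parts))  # True
```

The `found` set also guards against infinite loops if the component structure happens to contain a cycle.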

23
View Algebra
  • Functions
  • -- derivation of new MVs from existing MVs
  • Heuristic Enumeration
  • Blind enumeration
  • Content-based enumeration
  • Semantics-based enumeration

24
View Algebra
  • Definition 7. The n-level correlation matrix M(n)
    is derived from the correlation matrix M by the
    following formula
  • where n is a positive integer and k (0 < k < 1)
    is a constant. Each element
    M(n)ij is defined as the n-level correlation
    factor between objects Oi and Oj.

25
View Algebra
  • Algebra Operators
  • select from src-MV where <predicate>
  • project <property-list> from src-MV
  • intersect (src-MV1, src-MV2)
  • union (src-MV1, src-MV2)
  • difference (src-MV1, src-MV2)

26
Comparison (vs. class)
27
Comparison (vs. traditional object view)
28
Logical Implementation
  • MediaView Construction
  • MediaView Customization
  • MediaView Evolution

29
MediaViews Construction
  • Work with CBIR systems to acquire the knowledge
    from queries
  • Learn from previously performed queries
  • A multi-system approach to support multi-modality
    of media objects
  • Organize the semantics by following WordNet

30
Why WordNet?
  • Queries may vary greatly because users are free
    to choose their own query keywords
  • We need an approach to organize that knowledge
    into a logical structure
  • A simple context: a concept in WordNet
  • Common media views correspond to simple
    contexts
  • We provide all common media views, on which
    users can build complex ones.

31
Navigating the Multimedia Database
  • Navigating via semantic relationships of WordNet
  • Semantic Relationship Examples
  • Synonymy (similar): pipe, tube
  • Antonymy (opposite): fast, slow
  • Hyponymy (subordinate): tree, plant
  • Meronymy (part): chimney, house
  • Troponymy (manner): march, walk
  • Entailment: drive, ride

32
Navigating the Multimedia Database
33
MediaViews Construction
34
Multi-dimensional Semantic Space
  • IS-A relationship in thesaurus
  • For example, Season has a 4-dimensional semantic
    space: spring, summer, autumn, winter

35
Encoding with Probabilistic Tree
  • A Probabilistic Tree specifies the probability of
    one media object semantically matching a certain
    concept in thesaurus.

36
Encoding with Probabilistic Tree
  • Procedure
  • Step i: Following the thesaurus, trace from the
    target concept C1 to the root concept Root in
    the thesaurus. Assume the path is <C1, C2, ..., Root =
    Cn>. Start from CC = Cn and initially set P = 1.
  • Step ii: Suppose CC = Ci, and the next concept Ci-1
    is one of the k sub-concepts of Ci. If CC is
    encoded in the Probabilistic Tree of this media
    object, then let
  • If not, we let
  • Step iii: If CC has not reached C1, repeat Step
    ii. Otherwise, P is the probability of the media object
    matching concept C1.
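
The update formulas of Step ii did not survive in this transcript, so the sketch below fills them with explicit assumptions: when the current concept is encoded for the media object, P is multiplied by the stored probability of the next sub-concept; otherwise P is multiplied by 1/k, a uniform split over the k sub-concepts. Both rules, the function name `match_probability`, and the data layout are guesses for illustration only.

```python
def match_probability(path, encoded, fanout):
    """path: [C1, C2, ..., Root]; encoded: {concept: {sub_concept: prob}} for one
    media object; fanout: {concept: k}, taken from the thesaurus.
    ASSUMPTION: both update rules below are guesses, not the slide's formulas."""
    p = 1.0
    # Walk from the root (last element) down to the target concept C1.
    for i in range(len(path) - 1, 0, -1):
        cc, child = path[i], path[i - 1]
        if cc in encoded:
            p *= encoded[cc].get(child, 0.0)  # assumed: use the stored probability
        else:
            p *= 1.0 / fanout[cc]             # assumed: uniform over k sub-concepts
    return p

path = ["spring", "season", "root"]
encoded = {"season": {"spring": 0.5, "summer": 0.5}}
fanout = {"root": 4, "season": 4}
p = match_probability(path, encoded, fanout)
print(p)  # 0.125
```

Here "root" is not encoded, so the first step contributes 1/4, and the encoded "season" node contributes 0.5.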

37
Evolution through Feedback
  • A progressive approach
  • MediaView is accumulated along with the processes
    of user interactions
  • Two phases of feedback
  • System-feedback
  • User-feedback

38
Evolution through Feedback

39
Evolution through Feedback
  • Procedure
  • Record each feedback performed by users.
  • For each CBIR system i involved, calculate its
    accuracy rate of retrieval. That is, simply
    divide the number of correct results (according
    to user feedback) by the total number of
    retrieved results.
  • Reset the value of to its accuracy rate
    respectively.
  • Wait for next session of user feedback.
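
A minimal sketch of the per-system accuracy computation, taking the accuracy rate to be the fraction of a system's retrieved results that users marked as correct. The feedback layout (one boolean per retrieved result) and the system ids are assumptions.

```python
def accuracy_rates(feedback):
    """feedback: {system_id: [bool, ...]}, one entry per retrieved result,
    True when the user judged that result correct (assumed layout)."""
    return {
        system: (sum(marks) / len(marks) if marks else 0.0)
        for system, marks in feedback.items()
    }

session = {"cbir_a": [True, True, False, True], "cbir_b": []}
rates = accuracy_rates(session)
print(rates)  # {'cbir_a': 0.75, 'cbir_b': 0.0}
```

A system with no retrieved results is given a rate of 0.0 here simply to avoid division by zero; another default may suit a real system better.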

40
Fuzzy Logic based Evolution Approach
  • Due to the uncertainty of semantics, we cannot
    make an absolute assertion that a media object is
    relevant or irrelevant to a context
  • A media object in a database may be retrieved as
    a relevant result for a context several times;
    the more often a media object is retrieved, the
    more confident we can be that it is
    relevant to the context.

41
Fuzzy Logic based Evolution Approach
  • For a media object e, a context c,
  • - the accumulation of historical
    feedback information (from both system and
    users)
  • - the adjustment of after each feedback
    session

42
Inverse Propagation of Feedback
  • The drawback of calculating the probability in a
    top-down fashion
  • E.g. deciding whether a media object matches season
    cannot benefit from the fact that the media object was a
    match for spring
  • Solution: propagate the confidence value of a
    media object being relevant to a concept
    bottom-up along the hierarchical structure

43
Inverse Propagation of Feedback
  • Procedure
  • Wait for a feedback session.
  • For each positive feedback, namely one stating that a
    concept C is relevant to a media object:
    following the thesaurus, trace from C to the root
    concept Root. Assume the path is
    <C, C1, C2, ..., Root = Cn>.
  • Append each Ci, where i = 1 to n, as positive
    feedback for that media object as well.
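
The bottom-up propagation can be sketched as a walk along parent links. The `parent` mapping stands in for the thesaurus hierarchy and is an assumption, as is the tiny sample hierarchy.

```python
def propagate_positive(concept, parent, feedback):
    """Add `concept` and every ancestor up to the root to `feedback` (a set).
    `parent` maps a concept to its parent concept, None for the root."""
    node = concept
    while node is not None:
        feedback.add(node)
        node = parent.get(node)
    return feedback

# Hypothetical fragment of a thesaurus hierarchy.
parent = {"spring": "season", "season": "time", "time": None}
positives = propagate_positive("spring", parent, set())
print(sorted(positives))  # ['season', 'spring', 'time']
```

After propagation, a later query for "season" can use the fact that the object was confirmed as a match for "spring".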

44
MediaView Customization
  • Two level MediaView Framework

45
MediaView Customization
  • Dynamically construct complex-context-based media
    views based on simple ones
  • An example complex context: the Grand Hall in
    City University
  • Several user-level operators are devised to
    support more complex/advanced contexts, besides
    the basic operators

46
User-level Operators
  • INHERIT_MV(N: mv-name, NS: set-of-mv-refs, VP:
    set-of-property-ref, MP: set-of-property-ref):
    mv-ref
  • UNION_MV(N: mv-name, NS: set-of-mv-refs): mv-ref
  • INTERSECTION_MV(N: mv-name, NS: set-of-mv-refs):
    mv-ref
  • DIFFERENCE_MV(N1: mv-ref, N2: mv-ref): mv-ref

47
Build a MediaView in Run-time
  • Example: find out info about "Van Gogh"
  • Who is "Van Gogh"?
  • What is his work?
  • Know more about his whole life.
  • Know more about his country.
  • See his famous painting "sunflower"

48
Build a MediaView in Run-time
  • Who is Van Gogh?
  • INHERIT_MV(V. Gogh, <painter>, name = Van Gogh, )
  • What is his work?
  • INTERSECTION_MV(work, <painting>, vg)
  • Know more about his whole life.
  • INTERSECTION_MV(life, <biography>, vg)
  • Know more about his country.
  • INTERSECTION_MV(country, <country>, vg)
  • See his famous painting sunflower
  • Set sunflower = INTERSECTION_MV(sunflower,
    <sunflower>, <painting>)
  • Set vg_sunflower =
    INTERSECTION_MV(vg_sunflower, vg_work,
    sunflower)
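
To make the run-time composition concrete, here is a sketch of three of the user-level operators over a hypothetical in-memory registry that maps a view name to its set of member ids. The registry, the lower-case function names, and all sample members are assumptions; INHERIT_MV is omitted because its exact semantics are not given in the slides.

```python
views = {}  # hypothetical registry: view name -> set of member ids

def union_mv(name, sources):
    """Create view `name` holding the members of any source view."""
    views[name] = set().union(*(views[s] for s in sources))
    return name

def intersection_mv(name, sources):
    """Create view `name` holding the members common to all source views."""
    extents = [views[s] for s in sources]
    views[name] = set.intersection(*extents) if extents else set()
    return name

def difference_mv(n1, n2):
    """Members of view n1 that are not in view n2 (returned, not registered)."""
    return views[n1] - views[n2]

# Made-up extents for three simple contexts.
views["painting"] = {"p1", "p2", "p3"}
views["sunflower"] = {"p2", "photo9"}
views["vg"] = {"p1", "p2", "bio1"}

intersection_mv("sunflower_painting", ["sunflower", "painting"])
intersection_mv("vg_sunflower", ["vg", "sunflower_painting"])
print(views["vg_sunflower"])  # {'p2'}
```

Each operator returns the new view's name so that the result can be passed straight into the next call, mirroring the chained `Set ... = INTERSECTION_MV(...)` steps.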

49
Authoring Scenario
  • Creates a new media view named after the subject
  • All multimedia materials used in the document
    would be put into this MediaView for further
    reference.
  • To collect the most relevant materials for
    authoring, the user performs the MediaView
    building process.
  • Import suitable media objects by browsing media
    views
  • Reference the manner and style of authoring, to
    find other media views with similar topics.
  • Drag & Drop
  • learning-from-references

50
Interface of Our Authoring System
51
System Features
  • A Dynamic Environment
  • Helps a user select materials from the database
    to incorporate into the document
  • Query other similar media views for referencing
    the manner and/or style of authoring

52
Real-World Applications
  • A Multimedia Recipe Database
  • Modeling basis
  • Personalized (context-aware) manipulation
  • Cross-media indexing and retrieval system
  • Novel way of annotating and retrieving media
    objects
  • Lead to new indexing strategies

53
A Personalized Recipe Database System
  • People cannot live without food
  • Existing recipe websites provide huge amounts of
    recipes throughout the world
  • Fail to support analyzing and comparing
    recipes (What are the important cooking principles
    and skills? What makes two dishes taste so
    different? etc.)
  • Unable to help users find similar recipes in a
    comprehensive manner (only keyword-based search
    on recipe names)
  • Fail to adapt recipes to meet the real-world
    situation (e.g. due to lack of ingredients or
    user preference)

54
A Personalized Recipe Database System -- Our
Contributions
  • Propose a recipe model which encompasses static
    attributes as well as dynamic behaviours (e.g.
    cooking procedures and constraints)
  • Present a novel perspective of evaluating the
    quality of a recipe by constructing and
    analysing its cooking graph (capture both action
    flows and data/ingredient flows)
  • Provide a promising way to address the problem of
    recipe adaptation heuristically (with flexible
    and feasible solutions)

55
Recipe on the Web
56
Sample Recipe -- The Cooking Procedure of Triple
Cheese Pasta Primavera
57
Sample Recipe
Parsing the Cooking Procedure of Triple Cheese
Pasta Primavera
58
Recipe Model
  • A recipe R is modeled and represented by a tuple
    of three elements
  • R = <M, RP, SP>
  • where
  • (a) M = {Mi | i = 1..m}: a set of ingredients. An
    ingredient Mi is either a basic ingredient or a
    set of ingredients
  • Mi = <MID, MP>, where MID is a unique identity and
    MP is a set of member-level properties (and
    functions) such as the name, quantity and image
  • An ingredient Mi belongs to one of three
    classes: Main, Minor and Seasoning
  • (b) RP is a set of recipe-level properties (and
    functions) applied on R itself, such as the main
    cooking style, region, nutrition and images of
    the dish of the recipe

59
Recipe Model
  • (c) SP = (V, E, Cons, Ingr) is a labeled directed
    Cooking Graph
  • V = {vi | i = 1..n} is a set of nodes.
  • vi: a cooking action
  • Cooking action constraints Cons(vi): associated
    constraint conditions that should be satisfied
    when the action of vi takes place, e.g.
    conditions on temperature and duration.
  • E is a set of directed edges on V: the temporal
    execution flow of the cooking actions, named
    action flows.
  • An edge <vi, vj>: vj should take place after vi.
  • Cooking transition constraints Cons(vi, vj):
    the conditions that should be satisfied for the
    flow to take place.
  • Ingr(vi): ingredients that should be added into
    vi
  • O(vi): the output ingredients of vi
  • These inputs and outputs for the nodes are
    called ingredient flows.
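
One possible in-memory layout for the cooking graph is sketched below. The class `CookingGraph`, its method names, and the two-step fragment are invented for illustration; the fragment is not the recipe shown in the slides, and constraints are kept as plain strings for simplicity.

```python
from dataclasses import dataclass, field

@dataclass
class CookingGraph:
    """Hypothetical container for SP = (V, E, Cons, Ingr) plus the outputs O."""
    actions: dict = field(default_factory=dict)  # V: node id -> cooking action
    edges: set = field(default_factory=set)      # E: (vi, vj), vj takes place after vi
    cons: dict = field(default_factory=dict)     # Cons: keyed by vi or by (vi, vj)
    ingr: dict = field(default_factory=dict)     # Ingr(vi): ingredients added at vi
    out: dict = field(default_factory=dict)      # O(vi): output ingredients of vi

    def add_action(self, vid, action, ingredients=(), outputs=(), constraint=None):
        self.actions[vid] = action
        self.ingr[vid] = list(ingredients)
        self.out[vid] = list(outputs)
        if constraint is not None:
            self.cons[vid] = constraint

    def add_flow(self, vi, vj, constraint=None):
        if vi not in self.actions or vj not in self.actions:
            raise KeyError("both actions must exist before adding a flow")
        self.edges.add((vi, vj))
        if constraint is not None:
            self.cons[(vi, vj)] = constraint

# Made-up two-step fragment, not taken from the slides.
g = CookingGraph()
g.add_action("v1", "boil", ingredients=["pasta", "water"],
             outputs=["cooked pasta"], constraint="duration >= 8 min")
g.add_action("v2", "mix", ingredients=["cheese"], outputs=["dish"])
g.add_flow("v1", "v2", constraint="pasta drained")
```

Keeping action constraints and transition constraints in one dictionary works because node ids and edge tuples never collide as keys.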

60
Cooking Graph
The Cooking Graph of Triple Cheese Pasta
Primavera
61
Basic Properties
  • Definition 1. (Reachability) A cooking graph is
    defined as reachable if each of its nodes is
    reachable; a node is reachable if it is on a
    directed path from a starting node to the end
    node.
  • Definition 2. (Consistency) A cooking graph is
    defined to be consistent if the conditions for
    each node/edge are consistent (i.e. there exists
    an assignment to variables that makes the conditions
    true).
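
The reachability property can be checked with two graph traversals: a node lies on a directed path from a starting node to the end node exactly when it is reachable forward from some start and can itself reach the end. The function below is a sketch; its name and argument layout are assumptions.

```python
def is_reachable_graph(nodes, edges, starts, end):
    """True if every node lies on a directed path from a start node to `end`."""
    fwd, bwd = {}, {}
    for u, v in edges:
        fwd.setdefault(u, []).append(v)
        bwd.setdefault(v, []).append(u)

    def closure(seeds, adj):
        # Depth-first traversal collecting everything reachable from the seeds.
        seen, stack = set(seeds), list(seeds)
        while stack:
            for nxt in adj.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    from_start = closure(starts, fwd)  # nodes reachable from a starting node
    to_end = closure([end], bwd)       # nodes from which the end node is reachable
    return all(n in from_start and n in to_end for n in nodes)

edges = {("a", "b"), ("b", "c")}
print(is_reachable_graph({"a", "b", "c"}, edges, ["a"], "c"))                    # True
print(is_reachable_graph({"a", "b", "c", "d"}, edges | {("a", "d")}, ["a"], "c"))  # False
```

In the second call node "d" is a dead end: it can be reached from "a" but never leads to the end node "c".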

62
Constraints and Rules
  • Definition 3. (Constraint) A constraint is a
    predicate followed by one or more terms, enclosed
    in parentheses and separated by commas; a term is
    either a constant, a variable or a function
    expression.
  • Constraints specify all kinds of conditions or
    restrictions in the recipe model
  • Three categories: intra-recipe constraints,
    inter-recipe constraints and outer-recipe
    constraints.
  • Incompatible(Spinach, Tofu) says spinach and tofu
    are incompatible and should not be cooked
    together.
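
A constraint such as Incompatible(Spinach, Tofu) can be checked against a recipe's ingredient list. The sketch below stores each incompatible pair as a frozenset; the names `INCOMPATIBLE` and `violations` are hypothetical.

```python
# Each constraint Incompatible(x, y) is stored as an unordered pair.
INCOMPATIBLE = {frozenset({"spinach", "tofu"})}

def violations(ingredients):
    """Return the incompatible pairs that are both present in `ingredients`."""
    items = {i.lower() for i in ingredients}
    return [tuple(sorted(pair)) for pair in INCOMPATIBLE if pair <= items]

print(violations(["Spinach", "Tofu", "salt"]))  # [('spinach', 'tofu')]
print(violations(["tofu", "salt"]))             # []
```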

63
Constraints and Rules
  • Definition 4. (Rule) A rule is a logical
    implication of the form If α Then β (or α → β),
    where α and β are sentences.
  • Validate the correctness of a recipe through
    reasoning and recognition process.
  • Handle complex situations such as to make
    necessary adjustment or compensation once an
    improper cooking action occurs.
  • Describe cooking skills that have been widely
    accepted and commonly used.
  • Over_Put(salt) → Add(vinegar | water) says that if
    too much salt has been put into a dish, then
    neutralize the salty taste by adding either
    vinegar or water.

64
Recipe Cooking Graph Mining
  • Pattern: some subgraphs occur in one or more
    cooking graphs and have a certain influence on
    the cooking effects (e.g. taste, appearance).
  • Find patterns for a set of recipes
  • What's usually done and what's usually put in during the
    cooking procedure (one action, a series of
    actions, an ingredient, a set of ingredients,
    actions combined with ingredients)
  • Cooking graphs of different recipes may share the
    same pattern
  • Distinct subgraphs that determine the cooking
    effect (e.g. taste) should be identified

65
Sample Patterns
66
Sample Cooking Style
Generally describe how a recipe is cooked in a
Pattern Combination or in Graph Abstraction.
67
User Adaptation
  • Usually a user wants to make a dish that has the
    same cooking result (e.g. taste, appearance) as
    the recipe exhibits.
  • Unfortunately, the user is very likely to get a
    slightly or even totally different dish as he/she
    modifies the cooking procedure.
  • Objective reasons: e.g. lack of some ingredients
  • Subjective reasons: e.g. wrong cooking actions through
    carelessness, or personal preference

68
User Adaptation
  • When the user makes an adaptation, the system
    will check if the modified cooking graph is
    feasible.
  • If not, a set of feasible templates is provided.
  • The remaining subgraph is replaced by the
    user-selected one.
  • Property check (Reachability, Consistency)

Template Selection and Instantiation
69
Prototype System: Global System vs. User Space
70
Prototype System: Recipe Browser
71
Prototype System: Cooking Pattern Miner
72
Prototype System: Similarity Calculator
73
Summary
  • Proposed a data model to represent a recipe
  • Advocated cooking graph mining to find frequently
    used patterns (actions, ingredients)
  • Attempted to solve the recipe adaptation problem by
    using patterns as templates
  • Developed a prototype system: RecipeView
  • Further work includes
  • discover patterns of cooking graphs
  • Refine and strengthen the algorithm of recipe
    adaptation

74
Application Scenario
75
Application Scenario
  • Advantages (vs. traditional retrieval techniques)
  • Easy-to-compose query
  • By browsing (to get seed objects of arbitrary
    modalities)
  • By subject (simply keyword) at various
    abstraction level
  • Multi-modal results
  • a collection of images, text docs, videos, etc
  • vs. a single type of media
  • Semantically relevant results
  • natural outcome of exploring previously learnt
    knowledge
  • vs. a set of specifically chosen features

76
Advantages (cont'd)
  • Hill-climbing effect: retrieval performance
    improves as more user interactions are conducted

77
Conclusion
  • MediaView: a semantic multimedia database
    modeling mechanism
  • to bridge the semantic gap between conventional
    database and semantics-intensive multimedia
    applications
  • A set of user-level operators to accommodate the
    specialization/generalization relationships among
    the media views

78
Conclusion
  • MediaView promises more effective access to the
    content of media databases
  • Users could get the right stuff and tailor it to
    the context of their application easily.
  • Providing the most relevant content from
    pre-learnt semantic links between media and
    context
  • → high-performance database browsing and
    multimedia authoring tools can enable more
    comprehensive applications for the user

79
Conclusion
  • Users could customize specific media views
    according to their tasks, by using user-level
    operators
  • The effectiveness of using MediaView in the
    experimental problem domains
  • Multimedia recipe database
  • Cross-media indexing and retrieval

80
Further Issues
  • The development and transition of MediaView to a
    fully-fledged multimedia database system
    supporting declarative queries
  • Intensive and extensive performance studies
  • Advanced semantic relations (e.g. temporal and
    spatial ones) can also be incorporated when
    combining individual media views

81
  • Thank you!
  • Q & A
  • Email: Qing.Li_at_cityu.edu.hk