MediaHub: An Intelligent MultiMedia Distributed Platform Hub

1
MediaHub: An Intelligent MultiMedia Distributed
Platform Hub
  • Glenn Campbell, Tom Lunney, Paul Mc Kevitt
  • School of Computing and Intelligent Systems
  • Faculty of Engineering
  • University of Ulster, Magee Campus
  • Northland Road, Derry
  • {Campbell-g8, TF.Lunney, P.McKevitt}@ulster.ac.uk

2
Outline
  • Goals and objectives
  • Key research problems
  • Distributed Processing
  • Distributed Platforms
  • Architecture of MediaHub
  • Decision making in MediaHub
  • Comparison to related research
  • Tools and future development

3
Goals
  • The primary objectives of MediaHub are to:
  • Interpret/generate semantic representations of
    multimodal input/output
  • Perform fusion and synchronisation of multimodal
    data (decision-making)
  • Implement and evaluate a multimodal platform hub
    (MediaHub)

4
Goals
  • Research questions
  • Semantic representation?
  • Communication with other elements of a platform?
  • Decision-making?

5
Key research problems
  • Semantic Representation
  • Represent language and vision
  • Frames or XML?
  • Semantic Storage
  • Blackboard model?
  • Non-blackboard model?
  • Decision-making
  • Fusion and synchronisation
  • AI technique

6

Semantic representation
  • Frames (CHAMELEON)
    (Brøndsted et al. 1998, 2001)
  • [MODULE
    INPUT: input
    INTENTION: intention-type
    TIME: timestamp]
  • [SPEECH-RECOGNISER
    UTTERANCE: (Point to Hanne's office)
    INTENTION: instruction!
    TIME: timestamp]
  • [GESTURE
    GESTURE: coordinates (3, 2)
    INTENTION: pointing
    TIME: timestamp]
  • XML (M3L, SmartKom)
    (Bühler et al. 2002, Wahlster et al. 2001)
    <presentationTask>
    <presentationGoal>
    <inform> <informFocus> <RealizationType>list</RealizationType> </informFocus> </inform>
    <abstractPresentationContent>
    <discourseTopic> <goal>epg_browse</goal> </discourseTopic>
    <informationSearch id="dim24"><tvProgram id="dim23">
    <broadcast><timeDeictic id="dim16">now</timeDeictic>
    <between>2003-03-20T19:42:32 2003-03-20T22:00:00</between>
    <channel><channel id="dim13"/> </channel>
    </broadcast></tvProgram>
    </informationSearch>
    <result> <event>
    <pieceOfInformation>
    <tvProgram id="ap_3">
    <broadcast> <beginTime>2003-03-20T19:50:00</beginTime>
    <endTime>2003-03-20T19:55:00</endTime>
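
A minimal sketch of how the same semantic content could be held as a frame and then written out as XML, which is the choice this slide poses. The `Frame` class, the `frame_to_xml` helper and the XML element names are hypothetical illustrations; they are not the CHAMELEON frame format or the M3L schema.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Frame:
    """One input frame: producing module, slot/value content, intention, time."""
    module: str
    content: dict
    intention: str
    time: float


def frame_to_xml(frame: Frame) -> str:
    """Serialise a frame to a small XML fragment (element names are invented)."""
    root = ET.Element("frame", {"module": frame.module})
    for slot, value in frame.content.items():
        ET.SubElement(root, slot).text = str(value)
    ET.SubElement(root, "intention").text = frame.intention
    ET.SubElement(root, "time").text = str(frame.time)
    return ET.tostring(root, encoding="unicode")


# Timestamps are placeholder values in seconds.
speech = Frame("SPEECH-RECOGNISER", {"utterance": "Point to Hanne's office"}, "instruction", 12.4)
gesture = Frame("GESTURE", {"coordinates": "(3, 2)"}, "pointing", 12.6)

print(frame_to_xml(speech))
print(frame_to_xml(gesture))
```

Keeping the in-memory form separate from the serialised form means the frame-or-XML decision only affects what modules exchange, not how each module reasons internally.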

7
Semantic storage
  • Blackboard or Non-blackboard?
  • High coupling - blackboard?
  • Low coupling - distributed architecture?
  • Communication
  • Via central blackboard?
  • Message passing between modules?
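
To make the blackboard option concrete, here is a minimal sketch of a central store that modules post to and subscribe on. The `Blackboard` class and the topic names are assumptions made for illustration; they are not taken from CHAMELEON, SmartKom or any platform listed in this presentation.

```python
from collections import defaultdict


class Blackboard:
    """Central store: modules post entries and subscribers are notified."""

    def __init__(self):
        self.entries = []                      # everything posted so far
        self.subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register a callback to run whenever `topic` is posted."""
        self.subscribers[topic].append(callback)

    def post(self, topic, data):
        """Record the entry, then notify every subscriber of that topic."""
        self.entries.append((topic, data))
        for callback in self.subscribers[topic]:
            callback(data)


received = []                                  # stands in for a decision-making module
board = Blackboard()
board.subscribe("gesture", received.append)
board.post("gesture", {"coordinates": (3, 2), "intention": "pointing"})
board.post("speech", {"utterance": "Point to Hanne's office"})

print(received)
print(len(board.entries))
```

In the message-passing alternative, the gesture module would call the decision-making module directly and nothing would be retained centrally, which lowers coupling to the hub but loses the shared history held in `entries`.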

8
Decision-making (fusion and synchronisation)
  • Rule-based
  • Potential for other AI techniques
  • Fuzzy Logic
  • Neural Networks
  • Genetic Algorithms
  • Bayesian Networks (CPNs)

9
Distributed processing
  • DACS (Fink et al. 1995, 1996)
  • Open Agent Architecture (OAA)
    (Cheyer et al. 1998, OAA 2004)
  • JATLite (Kristensen 2001, Jeon et al. 2000)
  • JavaSpaces (Freeman 2004)
  • CORBA (Vinoski 1993)
  • .NET (Fay 2003)

10
Intelligent Multimedia Distributed Platforms
  • Blackboard Model
  • Ymir (Thórisson 1999)
  • CHAMELEON (Brøndsted et al. 1998, 2001)
  • SmartKom
    (Bühler et al. 2002, Wahlster et al. 2001,
    SmartKom 2004)
  • DARBS (Nolle et al. 2001)
  • DARPA Galaxy Communicator (Bayer et al. 2001)
  • Psyclone (Psyclone 2004)
  • Spoken Image/SONAS
    (Ó Nualláin et al. 1994, Ó Nualláin & Smith
    1994, Kelleher et al. 2000)

11
Intelligent Multimedia Distributed Platforms
  • Non-blackboard Model
  • WAXHOLM (Carlson et al. 1996)
  • AESOPWORLD (Okada 1996)
  • COLLAGEN (Rich et al. 1997)
  • INTERACT (Waibel et al. 1996)
  • Oxygen (Oxygen 2004)
  • EMBASSI (Kirste 2001, EMBASSI 2004)
  • MIAMM (MIAMM 2004)

12
CHAMELEON
  • Language and vision integration system
  • Consists of ten modules, mostly programmed in C
    and C++
  • DACS communication system used for communication
  • Blackboard stores semantic representations
    produced by other modules
  • Communication between modules achieved by
    exchanging semantic representations between
    themselves or blackboard
  • Semantic representation in form of input, output
    and integration frames

13
Architecture of CHAMELEON
14
SmartKom
  • User adaptive interface for human-computer
    interaction
  • Mobile
  • Public
  • Home/Office
  • Facilitates speech, gestures and facial
    expression input
  • XML-based mark-up language, M3L, used for
    semantic representation
  • Distributed multiple blackboard model

15
Architecture of SmartKom
16
Architecture of MediaHub
  • Dialogue Manager
  • Acts as a blackboard module
  • Facilitates communication between other modules
  • Synchronisation
  • Semantic Representation Database
  • Provides semantic representation of language and
    vision data
  • Decision Making Module
  • AI technique for a unique form of decision-making
  • Bayesian Networks (CPNs)
  • Neural Networks, Genetic Algorithms, Fuzzy Logic

17
Architecture of MediaHub
18
Decision Making Module
19
Decision making in MediaHub
  • Decisions at Input
  • Determining semantic content of input
  • Fusing semantics of input (into frames/XML)
  • Resolving ambiguity at input
  • Decisions at Output
  • Synchronising language with visual output
  • Choosing the best modality for output (i.e.
    language or vision)

20
Input example
Copy all files from the "process control" folder
of this computer to a new folder called "check
data" on that computer.
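
The utterance above only makes sense once "this computer" and "that computer" are bound to pointing gestures. The sketch below shows one simple way to do that: a rule-based pairing by timestamp. It is not the Bayesian approach proposed for MediaHub, and the `fuse` function, the `max_gap` threshold, the target names and all timestamps are hypothetical.

```python
def fuse(deictic_phrases, gestures, max_gap=1.0):
    """Pair each deictic phrase with the nearest-in-time unused pointing gesture.

    deictic_phrases: list of (phrase, time) tuples from the speech recogniser
    gestures:        list of (target, time) tuples from the gesture tracker
    max_gap:         largest time difference (seconds) accepted as a match
    """
    remaining = list(gestures)
    bindings = {}
    for phrase, spoken_at in deictic_phrases:
        if not remaining:
            break
        nearest = min(remaining, key=lambda g: abs(g[1] - spoken_at))
        if abs(nearest[1] - spoken_at) <= max_gap:
            bindings[phrase] = nearest[0]
            remaining.remove(nearest)   # each gesture resolves one phrase only
    return bindings


phrases = [("this computer", 2.1), ("that computer", 5.8)]
points = [("PC-A", 2.3), ("PC-B", 5.6)]
print(fuse(phrases, points))
```

A phrase with no gesture inside `max_gap` is left unbound, which is the kind of ambiguity at input that the decision-making module would still have to resolve.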
21
Output Example
This is the best route from Paul's office to
Tom's office.
22
Comparison to related research
23
Potential Tools
  • Main Programming Language
  • Java
  • C++
  • Communication
  • .NET
  • DACS
  • Semantic Representation
  • XML
  • XHTML+Voice
  • SMIL
  • RDF Schema
  • MPEG-7
  • EMMA

24
Potential Tools
  • Decision Making Tools
  • HUGIN GUI / API (Hugin 2004)
  • Microsoft MSBNx / MSBN3 (Kadie et al. 2001)
  • GeNIe/SMILE (Genie 2005)
  • Netica (Norsys 2005)
  • Bayes Net Toolbox (BNT 2005)
  • BUGS (BUGS 2005)

25
Hugin
  • Tool for implementing Bayesian Networks as CPNs
    (Causal Probabilistic Networks)
  • Hugin GUI
  • Graphical user interface to Hugin decision engine
  • Hugin API
  • Library implemented in C, C++, Java
  • Allows programs to implement Bayesian Networks
    for decision making

26
Bayesian Networks
  • AKA Bayes nets, Causal Probabilistic Networks
    (CPNs), Bayesian Belief Networks
  • Consist of nodes and directed edges between
    nodes
  • Node represents a variable
  • Edge represents cause-effect relationship
  • An edge connecting two nodes A and B indicates a
    direct influence exists between state of A and
    the state of B

27
Simple Bayesian Network
Diet and Exercise nodes have influence over
Weight Loss node
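
A minimal sketch of this three-node network in plain Python, using inference by enumeration. It does not use the Hugin API, and every probability in it is invented purely for illustration.

```python
from itertools import product

# Illustrative priors and conditional probability table (invented numbers).
p_diet = {True: 0.4, False: 0.6}
p_exercise = {True: 0.5, False: 0.5}
p_loss_given = {            # P(WeightLoss=True | Diet, Exercise)
    (True, True): 0.9,
    (True, False): 0.6,
    (False, True): 0.5,
    (False, False): 0.1,
}


def joint(diet, exercise, loss):
    """P(Diet, Exercise, WeightLoss) = P(Diet) * P(Exercise) * P(WeightLoss | Diet, Exercise)."""
    p_loss = p_loss_given[(diet, exercise)]
    return p_diet[diet] * p_exercise[exercise] * (p_loss if loss else 1 - p_loss)


def prob_loss():
    """Marginal P(WeightLoss=True), summing out both parent nodes."""
    return sum(joint(d, e, True) for d, e in product([True, False], repeat=2))


def prob_diet_given_loss():
    """Posterior P(Diet=True | WeightLoss=True) by Bayes' rule."""
    numerator = sum(joint(True, e, True) for e in [True, False])
    return numerator / prob_loss()


print(round(prob_loss(), 3))
print(round(prob_diet_given_loss(), 3))
```

With these made-up numbers the marginal probability of weight loss is about 0.48, and observing weight loss raises the belief in Diet from the prior 0.4 to about 0.625. Enumeration is only practical for tiny networks; a tool such as Hugin would be used for anything larger.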
28
Future development
  • Define necessary decisions
  • Develop Bayesian decision making using Hugin API
    for Java
  • Semantic storage
  • Communication
  • Semantic representation scheme
  • Semantic representation database
  • Acquire multimodal corpora for testing
  • Test MediaHub in an existing multimodal platform,
    e.g. CONFUCIUS (Ma & Mc Kevitt 2003)

29
Conclusion
  • An intelligent multimodal distributed platform
    hub called MediaHub will be developed
  • MediaHub will interpret and generate semantic
    representations of multimodal input and output
  • MediaHub will perform fusion and synchronisation
    of language and vision data
  • MediaHub will provide a new method of decision
    making within a distributed platform hub
  • MediaHub will be tested within an existing
    multimodal platform (e.g. CONFUCIUS)