Chapter 6 Text and Multimedia Languages and Properties

Slides: 16
Provided by: KCK86

1
Chapter 6 Text and Multimedia Languages and
Properties

2
Introduction
  • Document has given syntax and structure
  • also has semantics
  • may have presentation style associated with it
  • Figure 6.1 depicts all these relationships
  • document can also have information about itself,
    called metadata

3
  • Syntax of document can express different elements
    such as structure, presentation style, semantics
  • one or more of these elements may be given
    together
  • structural element (e.g. section) can have fixed
    formatting style

4
  • Syntax of document can be
      • implicit in its content
      • expressed in declarative language or programming
        language (PL)
  • current trend is to use languages that
      • provide information on document structure,
        format and semantics
      • are readable by humans and computers
  • SGML is one such language

5
Metadata
  • Metadata is data about data
  • metadata associated with text includes
      • author
      • date of publication
      • source of publication
      • document length (in pages, words, bytes)
      • document genre (book, article, memo)
  • Machine Readable Cataloging Record (MARC) is
    most widely used format for library records
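
The metadata fields listed above can be pictured as a small record attached to a document. The following is only an illustrative sketch: the class name, field names and sample values are hypothetical and do not follow MARC or any other cataloging standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TextMetadata:
    """Descriptive metadata for one text document (hypothetical field names)."""
    author: str
    published: date
    source: str
    length_words: int
    genre: str


def describe(meta: TextMetadata) -> str:
    """Render the metadata as a one-line catalog entry."""
    return (f"{meta.author} ({meta.published.year}), {meta.source}, "
            f"{meta.genre}, {meta.length_words} words")


# Made-up sample values, for illustration only.
example = TextMetadata(
    author="A. Author",
    published=date(1999, 5, 1),
    source="Example Press",
    length_words=1200,
    genre="article",
)
```

Note that the record describes the document without containing any of its text, which is the sense of "data about data".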

6
  • On the Web, metadata is used for many purposes
      • cataloging
      • content rating (e.g. to protect children from
        reading some types of documents)
      • intellectual property rights
      • digital signatures (for authentication)
      • privacy levels (who should/should not have access
        to document)
      • applications to electronic commerce (EC), etc.

7
  • New standard for Web metadata is Resource
    Description Framework (RDF)
  • RDF allows description of Web resources
  • consists of description of nodes and attached
    attribute/value pairs
  • nodes can be any Web resource, i.e. anything
    identified by a URI (URLs are one kind of URI)
  • attributes are properties of nodes, and their
    values are text strings or other nodes
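
The node and attribute/value model described above can be sketched with plain tuples. This is a simplified illustration, not RDF syntax: the URIs, attribute names and helper functions are all hypothetical, and "node" here just means a resource that has at least one statement about it.

```python
from collections import defaultdict

# Each statement is (node, attribute, value).
# A value is either a text string or the URI of another node.
triples = [
    ("http://example.org/doc1", "author", "http://example.org/people/alice"),
    ("http://example.org/doc1", "title", "Chapter 6"),
    ("http://example.org/people/alice", "name", "Alice"),
]


def properties_of(node, statements):
    """Collect the attribute/value pairs attached to one node."""
    props = defaultdict(list)
    for subject, attribute, value in statements:
        if subject == node:
            props[attribute].append(value)
    return dict(props)


def is_node(value, statements):
    """True if the value is itself described by some statement."""
    return any(subject == value for subject, _, _ in statements)
```

Here the `author` value of `doc1` is another node with its own properties, while the `title` value is a plain text string.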

8
Text
  • With the advent of computers, it became necessary
    to code text in binary digits
  • first coding schemes were EBCDIC and ASCII
  • for internationalization of oriental languages
    like Chinese or Japanese Kanji, 16-bit Unicode
    (ISO 10646) exists
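
The difference between the coding schemes can be seen by encoding the same characters in ASCII and in a 16-bit Unicode encoding. The sketch below uses Python's built-in codecs; the function name is hypothetical, and UTF-16 (big-endian) is chosen here as one concrete 16-bit encoding. Characters outside the Basic Multilingual Plane take two 16-bit units in UTF-16, which the example does not cover.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of text in UTF-16 (big-endian) and ASCII.

    The ASCII entry is None when the text cannot be represented in ASCII.
    """
    sizes = {"utf-16-be": len(text.encode("utf-16-be"))}
    try:
        sizes["ascii"] = len(text.encode("ascii"))
    except UnicodeEncodeError:
        sizes["ascii"] = None
    return sizes


latin = encoded_sizes("A")        # a Latin letter fits in both schemes
kanji = encoded_sizes("\u6f22")   # U+6F22, a Han character, has no ASCII code
```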

9
Text Formats
  • No single format for text documents
  • in the past, IR systems would convert document to
    internal format
      • drawback: content of document can then no
        longer be changed
  • current IR systems have filters to handle most
    popular document formats, in particular Word,
    WordPerfect or FrameMaker
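
One way to picture the filter approach is a table that maps a file type to a function extracting plain text, leaving the original file untouched. This is a minimal sketch under stated assumptions: the registry, the function names and the choice to dispatch on file extension are all hypothetical, and only a trivial plain-text filter is shown, since real Word, WordPerfect or FrameMaker filters would need format-specific parsers.

```python
from pathlib import Path


def plain_text_filter(raw: bytes) -> str:
    """Trivial filter: treat the bytes as UTF-8 text."""
    return raw.decode("utf-8", errors="replace")


# Hypothetical registry; a real system would add one filter per format.
FILTERS = {".txt": plain_text_filter}


def extract_text(filename: str, raw: bytes) -> str:
    """Pick a filter by file extension and return the indexable text."""
    suffix = Path(filename).suffix.lower()
    try:
        converter = FILTERS[suffix]
    except KeyError:
        raise ValueError(f"no filter registered for {suffix!r}")
    return converter(raw)
```

The filter only reads the document bytes, so the document stays in its native format and can still be edited with its original application.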

10
  • Other text formats for document interchange
  • Rich Text Format (RTF)
      • used by word processors and has ASCII syntax
  • Portable Document Format (PDF)
      • developed for displaying and printing documents
  • Multipurpose Internet Mail Extensions (MIME)
      • used to encode electronic mail
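
As an illustration of MIME in practice, the sketch below builds a simple mail message with Python's standard `email` package and lets the library add the MIME headers. The helper name, addresses and message text are made up for the example.

```python
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Create a plain-text mail message; the library adds the MIME headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)  # sets Content-Type and MIME-Version
    return msg


msg = build_message(
    "alice@example.org",
    "bob@example.org",
    "Chapter 6",
    "Notes on text formats.",
)
```

Calling `msg.as_string()` shows the headers followed by the encoded body, which is the form in which the message would travel.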
