Structured Documents - PowerPoint PPT Presentation

Slides: 49
Provided by: Doug9

Transcript and Presenter's Notes

Title: Structured Documents


1
Structured Documents
  • Week 3
  • LBSC 690
  • Information Technology

2
Outline
  • Questions
  • Finishing networks
  • Building the Web
  • Building a better Web

3
TCP/IP layer architecture
[Figure: TCP/IP layer diagram. Application to Application: virtual network service. Transport to Transport: virtual link for end-to-end packets. Network to Network: virtual link for packets. Link to Link: link for bits.]

4
The TCP/IP Protocol Stack
  • Link layer moves bits
  • Ethernet, cable modem, DSL
  • Network layer moves packets
  • IP
  • Transport layer provides services to applications
  • UDP, TCP
  • Application layer uses those services
  • DNS, SFTP, SSH,

5
User Datagram Protocol (UDP)
  • The Internet's basic transport service
  • Sends every packet immediately
  • Passes received packets to the application
  • No delivery guarantee
  • Collisions can result in packet loss
  • Example: sending clicks on a web browser

6
Transmission Control Protocol (TCP)
  • Built on IP, the network layer's UDP-like packet service
  • Guarantees delivery of all data
  • Retransmits missing data
  • Guarantees data will be delivered in order
  • Buffers subsequent packets if necessary
  • No guarantee of delivery time
  • Long delays may occur without warning
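
The in-order guarantee above can be sketched as a small reordering buffer. This is an illustration only, not how any real TCP stack is written: the function name, the tuple format, and the sequence numbers starting at 0 are all assumptions, and retransmission is left out.

```python
def deliver_in_order(arrivals):
    """Yield payloads in sequence order, buffering any that arrive early."""
    buffered = {}   # sequence number -> payload held until its turn comes
    next_seq = 0    # next sequence number the application is waiting for
    for seq, payload in arrivals:
        buffered[seq] = payload
        # Release every consecutive packet that is now available.
        while next_seq in buffered:
            yield buffered.pop(next_seq)
            next_seq += 1


# Packets 1 and 3 arrive early and are held until 0 and 2 show up.
arrivals = [(1, "b"), (0, "a"), (3, "d"), (2, "c")]
print(list(deliver_in_order(arrivals)))
```

If packet 0 never arrives, nothing is delivered at all, which is the "no guarantee of delivery time" point in miniature.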

7
File Transfer Protocol (FTP)
  • Used to move files between machines
  • Upload (put) moves files from client to server
  • Download (get) moves files from server to client
  • Available using command line and GUI interfaces
  • Normally requires an account on the server
  • Userid anonymous provides public access
  • Web browsers incorporate anonymous FTP
  • Automatically converts end-of-line conventions
  • Unless you select binary

8
Hands On FTP
  • Start a cmd window
  • Type ftp ftp.umiacs.umd.edu
  • Log in anonymously with
  • User: anonymous
  • Password: your email address
  • Go download a file
  • Type cd pub/gina/lbsc690/
  • Type binary
  • Type get hwOne.ppt
  • Exit
  • Type quit
  • Try it again with a graphical FTP program
  • WS_FTP, for example
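
The same session can be scripted with Python's standard ftplib module. This is a sketch under assumptions: the function name and the email address are made up, and whether the server named on the slide still accepts anonymous FTP is unknown, so the example call is left commented out.

```python
from ftplib import FTP


def fetch_binary(host, directory, filename, email):
    """Download one file in binary mode using anonymous FTP."""
    with FTP(host) as ftp:                  # like typing: ftp host
        ftp.login("anonymous", email)       # user anonymous, password = email address
        ftp.cwd(directory)                  # like typing: cd directory
        with open(filename, "wb") as out:
            # RETR in binary mode corresponds to "binary" followed by "get".
            ftp.retrbinary(f"RETR {filename}", out.write)
    # Leaving the with-block closes the connection, like typing: quit


# Needs network access; host and path are the ones used on the slide.
# fetch_binary("ftp.umiacs.umd.edu", "pub/gina/lbsc690/", "hwOne.ppt", "you@example.com")
```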

9
Encryption
  • Secret-key systems (e.g., DES)
  • Use the same key to encrypt and decrypt
  • Public-key systems (e.g., PGP)
  • Public key: open, for encryption
  • Private key: secret, for decryption
  • Digital signatures
  • Encrypt with private key, decrypt with public key
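
A toy RSA calculation shows how the two keys of a public-key system relate. The numbers are deliberately tiny textbook values and offer no security; the helper name apply_key is an assumption made for this sketch.

```python
# Toy RSA key pair: far too small to be secure, chosen so the arithmetic is visible.
p, q = 61, 53
n = p * q                  # 3233, shared by both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent (open)
d = pow(e, -1, phi)        # private exponent (secret); needs Python 3.8+


def apply_key(value, exponent):
    """Encrypt or decrypt a small integer with the given key exponent."""
    return pow(value, exponent, n)


message = 65

# Encryption: anyone may use the public key, only the key owner can undo it.
ciphertext = apply_key(message, e)
assert apply_key(ciphertext, d) == message

# Digital signature: the owner uses the private key, anyone can check with the public key.
signature = apply_key(message, d)
assert apply_key(signature, e) == message
```

A secret-key system such as DES would instead use one shared key for both directions.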

10
Encrypted Standards
  • Secure Shell (SSH)
  • Replaces Telnet
  • Secure FTP (SFTP)/Secure Copy (SCP)
  • Replaces FTP
  • Secure HTTP (HTTPS)
  • Used for financial and other private data
  • Wired Equivalent Privacy (WEP)
  • Used on wireless networks

11
Network Abuse
  • Flooding
  • Excessive activity, intended to prevent valid
    activity
  • Worms
  • Like a virus, but self-propagating
  • Sniffing
  • Monitoring network traffic (e.g., for passwords)

12
Encryption Issues
  • Key length
  • 128 bits balances speed and protection today
  • Trust infrastructure
  • How do you prevent bait and switch?
  • Who certifies a digital signature is valid?

13
The World-Wide Web
[Figure: Web request flow. Labels: My Browser, Send Request, Proxy Server (local copy of page requested), Fetch Page, Internet, Remote Server, Page Requested.]

14
Web Standards
  • HTML
  • How to write and interpret the information
  • URL
  • Where to find it
  • HTTP
  • How to get it

15
HyperText Transfer Protocol (HTTP)
  • Send request
  • GET /path/file.html HTTP/1.0
  • From: someuser@jmarshall.com
  • User-Agent: HTTPTool/1.0
  • Server response
  • HTTP/1.0 200 OK
  • Date: Fri, 31 Dec 1999 23:59:59 GMT
  • Content-Type: text/html
  • Content-Length: 1354
  • <html><body> <h1>Happy New Millennium!</h1>
    </body> </html>
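
A server response like the one above is plain text: a status line, header lines, a blank line, then the body. The sketch below splits one apart; the function name is an assumption, and Content-Length is left out because this shortened body is not 1354 bytes long.

```python
def parse_response(raw):
    """Split a raw HTTP/1.0 response into status code, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")        # a blank line ends the headers
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")         # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


raw = (
    "HTTP/1.0 200 OK\r\n"
    "Date: Fri, 31 Dec 1999 23:59:59 GMT\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body><h1>Happy New Millennium!</h1></body></html>"
)
code, headers, body = parse_response(raw)
print(code, headers["content-type"])
```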

16
Uniform Resource Locator (URL)
  • Uniquely identify web pages on the WWW
  • Domain name
  • Directory path
  • File name

URL: http://www.clis.umd.edu/courses/schedules/fall2003.html
  • Domain name: www.clis.umd.edu
  • Directory path: /courses/schedules/
  • File name: fall2003.html
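
The three parts of a URL can be pulled apart with Python's standard library. The function name url_parts is an assumption for this sketch; the URL is the one from the slide.

```python
import posixpath
from urllib.parse import urlsplit


def url_parts(url):
    """Return (domain name, directory path, file name) for a URL."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    return parts.netloc, directory, filename


print(url_parts("http://www.clis.umd.edu/courses/schedules/fall2003.html"))
```
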
17
HyperText Markup Language (HTML)
  • Simple document structure language for Web
  • Advantages
  • Adapts easily to different display capabilities
  • Widely available display software (browsers)
  • Disadvantages
  • Does not directly control layout

18
Hands On: Learning HTML From Examples
  • Use Internet Explorer to find a page you like
  • http://www.glue.umd.edu/oard
  • On the View menu select Source
  • Opens a notepad window with the source
  • Compare HTML source with the Web page
  • Observe how each effect is achieved

19
Hands On: Adopt a Web Page
  • Modify the HTML source using notepad
  • For example, change the page to yours
  • Save the HTML source on your M drive
  • In the File menu, select Save As
  • Select All Files and name it test.html
  • FTP it to your /pub directory on WAM
  • sftp wam.umd.edu
  • cd ../pub/
  • put test.html
  • View it
  • http://www.wam.umd.edu/(yourlogin)/test.html

20
HTML Document Structure
  • Tags mark structure
  • <html>a document</html>
  • <ol>an ordered list</ol>
  • <i>something in italics</i>
  • Tag name in angle brackets <>
  • Not case sensitive
  • Open/Close pairs
  • Close tag may be optional (if unambiguous)

21
Logical Structure Tags
  • Head
  • Title
  • Body
  • Headers: <h1> <h2> <h3> <h4> <h5>
  • Lists: <ol>, <ul> (can be nested)
  • Paragraphs: <p>
  • Definitions: <dt><dd>
  • Tables: <table> <tr> <td> </td> </tr> </table>
  • Role: <cite>, <address>, <strong>,
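
One way to see the logical structure of a page is to list its tags by nesting depth. The sketch below uses Python's standard html.parser; the class name and sample page are assumptions, and it expects every tag to be closed, which real HTML does not require.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Record each opening tag together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


page = (
    "<html><head><title>Demo</title></head>"
    "<body><h1>Top</h1><ol><li>one</li></ol></body></html>"
)
lister = TagLister()
lister.feed(page)
for depth, tag in lister.outline:
    print("  " * depth + tag)
```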

22
Rendering
  • Different devices have different capabilities
  • Desktop
  • PDA
  • Rendering maps logical tags to physical layout
  • Controls line wrap, size, font
  • Place the title in the page border
  • Render <h1> as 24pt Times
  • Render <strong> as bold
  • Somewhat browser-dependent
  • Internet Explorer and Netscape make different
    choices

23
Physical Structure Tags
  • Font
  • Typeface: <font face="Arial"></font>
  • Size: <font size="1"></font>
  • Color: <font color="#990000"></font>
  • http://webmonkey.wired.com/webmonkey/reference/color_codes/
  • Emphasis
  • Bold: <b></b>
  • Italics: <i></i>

24
Hypertext Anchors
  • Links make the Web a web!
  • Internal anchors: somewhere on the same page
  • <a href="#students"> Students</a>
  • Links to <a name="students">Student
    Information</a>
  • External anchors: to another page
  • <a href="http://www.clis.umd.edu">CLIS</a>
  • <a href="http://www.clis.umd.edu#students">CLIS
    students</a>
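
The difference between internal and external anchors shows up in the href value: an internal link starts with "#" and points at a named anchor on the same page. A minimal sketch that sorts the links on a page, with an assumed class name and sample markup:

```python
from html.parser import HTMLParser


class LinkSorter(HTMLParser):
    """Collect internal links, external links, and named anchor targets."""

    def __init__(self):
        super().__init__()
        self.internal, self.external, self.targets = [], [], []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is not None:
            # "#name" stays on this page; anything else leads to another page.
            (self.internal if href.startswith("#") else self.external).append(href)
        if "name" in attrs:
            self.targets.append(attrs["name"])


page = (
    '<a href="#students">Students</a>'
    '<a name="students">Student Information</a>'
    '<a href="http://www.clis.umd.edu">CLIS</a>'
)
sorter = LinkSorter()
sorter.feed(page)
print(sorter.internal, sorter.external, sorter.targets)
```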

25
Images
  • <img src="URL"> or <img src="path/file">
  • <img src="http://www.clis.umd.edu/IMAGES/head.gif">
  • SRC: can be URL or path/file
  • ALT: a text string
  • ALIGN: position of the image
  • WIDTH and HEIGHT: size of the image
  • Can use as anchor
  • <a href="URL"><img src="URL2"></a>
  • Example
  • http://www.umiacs.umd.edu/daqingd/Image-Alignment.html

26
Tables
  • <table align="center">
  • <caption align="right">The caption</caption>
  • <tr align="LEFT">
  • <th> Header1 </th>
  • <th> Header2</th>
  • </tr>
  • <tr><td>first row, first item </td>
  • <td>first row, second item</td></tr>
  • <tr><td>second row, first item</td>
  • <td>second row, second item</td></tr>
  • </table>
  • Example: http://www.umiacs.umd.edu/daqingd/Simple-Table.html
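
Because table markup is so regular, it is often generated rather than typed. The sketch below builds the same kind of table from Python lists; the function name and its parameters are assumptions, and the align attributes from the slide are omitted for brevity.

```python
from html import escape


def html_table(headers, rows, caption=None):
    """Build a simple HTML table: one header row, then one row per item in rows."""
    lines = ["<table>"]
    if caption:
        lines.append(f"<caption>{escape(caption)}</caption>")
    lines.append("<tr>" + "".join(f"<th>{escape(h)}</th>" for h in headers) + "</tr>")
    for row in rows:
        cells = "".join(f"<td>{escape(str(cell))}</td>" for cell in row)
        lines.append("<tr>" + cells + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(html_table(
    ["Header1", "Header2"],
    [["first row, first item", "first row, second item"],
     ["second row, first item", "second row, second item"]],
    caption="The caption",
))
```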

27
Frames
  • Divide browser pages into separate sections
  • Useful when you want to scroll separately
  • Each section can display an HTML page
  • Example 1: menu frame on the left side of a page
  • <frameset cols="10%,90%">
  • <frame src="template.html">
  • <frame src="images.html">
  • </frameset>
  • Example 2
  • http://www.hq.nasa.gov/alsj/frame.html

28
Designing Web Pages
  • Key design issues
  • Content: What do you want to publish?
  • Style: How do you want to present it?
  • Syntax: How can you achieve that presentation?
  • Sources of information
  • Online tutorials (Yahoo points to lots of these)
  • Technical materials (e.g., the HTML 4.0 spec)

29
Some Style Guidelines
  • Design for generic browsers
  • And test on every version you wish to support
  • Provide appropriate access points
  • User needs and navigation strategies differ
  • Design useful navigational aids
  • A Web search may lead to the middle of a site
  • Include some indication of currency
  • Date of last update, new icons, etc.
  • Indicate who is responsible for the content
  • Helps readers assess authority

30
Accessibility Guidelines
  • Design for device independence
  • Maintain backward compatibility
  • Provide alternative pages if necessary
  • Provide alternatives for aural and visual content
  • Alt tags for images, transcripts for audio
  • Make it easy for assistive devices to work
  • Combine structural markup and style sheets
  • Give a title to each frame
  • Use HTML tables only for tabular data
  • Use markup to indicate language switching

31
HTML Editors
  • Goal is to create Web pages, not learn HTML!
  • Several are available
  • Macromedia Dreamweaver available commercially
  • In Netscape, File Edit Page for Composer
  • Tend to use physical layout tags extensively
  • Detailed control can make hand-editing difficult
  • You may still need to edit the HTML file
  • Some editors use browser-specific features
  • Some HTML features may be missing entirely
  • File names may be butchered by FTP

32
HTML Validators
  • Syntax checking: cross-browser compatibility
  • http://validator.w3.org
  • Style checking: improved accessibility
  • http://bobby.watchfire.com

33
What's Wrong with the Web?
  • HTML
  • Confounds structure and appearance (XML)
  • HTTP
  • Can't recognize related transactions (Cookies)
  • URL
  • Links break when you move a file (PURL)

34
What's a Document?
  • Content
  • Structure
  • Appearance
  • Behavior

35
History of Structured Documents
  • Early standards were typesetting languages
  • NROFF, TeX, LaTeX, SGML
  • HTML was developed for the Web
  • Too specialized for other uses
  • Specialized standards met other needs
  • Change tracking in Word, annotating manuscripts,
  • XML seeks to unify these threads
  • One standard format for printing, viewing,
    processing

36
Goals of XML
  • Metalanguage
  • A toolkit for designing markup languages
  • Unambiguous markup
  • Clear span of tags
  • Separate markup from presentation
  • Style info goes in a stylesheet, so it is easy to change
  • Be simple

37
A Family of Standards
  • Definition: DTD
  • Names known types of entities with labels
  • Defines part-whole and is-a relationships
  • Markup: XML
  • Tags regions of text with labels
  • Markup: XLink
  • Defines hypertext (and other) link
    relationships
  • Presentation: XSL
  • Specifies how each type of entity should be
    rendered

38
XML Example
  • View "The Song of Wandering Aengus"
  • http://www.umiacs.umd.edu/oard/teaching/690/fall05/notes/3/xml.htm
  • Built from three files
  • yeats01.xml
  • poem01.dtd
  • poem01.xsl

39
An XML Example
<?xml version="1.0"?>
<!DOCTYPE POEM SYSTEM "poem01.dtd">
<?xml-stylesheet type="text/xsl" href="poem01.xsl"?>
<POEM>
  <TITLE>The Song of Wandering Aengus</TITLE>
  <AUTHOR>
    <FIRSTNAME>W.B.</FIRSTNAME>
    <LASTNAME>Yeats</LASTNAME>
  </AUTHOR>
  <STANZA>
    <LINE>I went on to the hazel wood,</LINE>
    <LINEIN>Because a fire was in my head,</LINEIN>
    <LINE>And cut and peeled a hazel wand,</LINE>
  </STANZA>
</POEM>
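
Because the markup is unambiguous, a program can pick out each labeled region directly. A minimal sketch using Python's standard xml.etree.ElementTree; the DOCTYPE and stylesheet lines are left out of the sample string to keep it short, and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

doc = """<POEM>
  <TITLE>The Song of Wandering Aengus</TITLE>
  <AUTHOR><FIRSTNAME>W.B.</FIRSTNAME><LASTNAME>Yeats</LASTNAME></AUTHOR>
  <STANZA>
    <LINE>I went on to the hazel wood,</LINE>
    <LINEIN>Because a fire was in my head,</LINEIN>
    <LINE>And cut and peeled a hazel wand,</LINE>
  </STANZA>
</POEM>"""

poem = ET.fromstring(doc)
title = poem.findtext("TITLE")
author = f"{poem.findtext('AUTHOR/FIRSTNAME')} {poem.findtext('AUTHOR/LASTNAME')}"
# Each child of STANZA is a LINE or LINEIN element; keep the tag with the text.
lines = [(el.tag, el.text) for el in poem.find("STANZA")]

print(title, "by", author)
for tag, text in lines:
    print(("    " if tag == "LINEIN" else "") + text)   # indent LINEIN lines
```
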
40
Document Type Definition (DTD)
<!ELEMENT poem ( (title, author, stanza) )>
<!ELEMENT title (#PCDATA) >
<!ELEMENT author (firstname, lastname) >
<!ELEMENT firstname (#PCDATA) >
<!ELEMENT lastname (#PCDATA) >
<!ELEMENT stanza (line | linein) >
<!ELEMENT line (#PCDATA) >
<!ELEMENT linein (#PCDATA) >

#PCDATA: span of text
a,b: a followed by b
a|b: either a or b
a*: 0 or more a's
a+: 1 or more a's
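
The content-model operators behave much like regular expressions over element names, which suggests a rough way to check a list of child elements against a model. This is a sketch only, not a DTD validator: the function names are assumptions, it handles element-only models (no #PCDATA), and the "+" on the stanza model in the usage lines is an assumed quantifier chosen for illustration.

```python
import re


def model_to_regex(model):
    """Translate a DTD-style content model into a regex over 'name ' tokens."""
    compact = re.sub(r"\s+", "", model)
    # Each element name becomes a group matching that name plus a trailing space.
    # '|', '*', '+', '(' and ')' already mean the same thing in a regex.
    pattern = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda m: f"(?:{re.escape(m.group(0))} )",
        compact,
    )
    # ',' means "followed by", which in a regex is plain concatenation.
    return re.compile(pattern.replace(",", ""))


def children_match(model, children):
    """True if the sequence of child element names fits the content model."""
    sequence = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(sequence) is not None


print(children_match("(firstname, lastname)", ["firstname", "lastname"]))
print(children_match("(firstname, lastname)", ["lastname", "firstname"]))
print(children_match("(line | linein)+", ["line", "linein", "line"]))
```
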
41
Specifying Appearance XSL
<xsl:template match="POEM">
  <HTML>
    <BODY BGCOLOR="#FFFFCC">
      <xsl:apply-templates/>
    </BODY>
  </HTML>
</xsl:template>
<xsl:template match="TITLE">
  <H1>
    <FONT COLOR="Green">
      <xsl:value-of/>
    </FONT>
  </H1>
</xsl:template>
42
An XLink Example
<poem xmlns:xlink="http://www.w3.org/1999/xlink">
  <author xlink:href="yeatsRDFS3.xml" xlink:type="simple">W. B. Yeats</author>
  <poems>
    <poem1 xlink:href="http://www.kirjasto.sci.fi/wbyeats.htm" xlink:type="simple">The Rose</poem1>
    <poem2 xlink:href="http://www.kirjasto.sci.fi/wbyeats.htm" xlink:type="simple">The Tower</poem2>
  </poems>
</poem>
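
XLink attributes live in their own namespace, so a program reading them has to ask for the namespace-qualified name. A minimal sketch with Python's standard xml.etree.ElementTree; the function name simple_links is an assumption.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

doc = """<poem xmlns:xlink="http://www.w3.org/1999/xlink">
  <author xlink:href="yeatsRDFS3.xml" xlink:type="simple">W. B. Yeats</author>
  <poems>
    <poem1 xlink:href="http://www.kirjasto.sci.fi/wbyeats.htm" xlink:type="simple">The Rose</poem1>
    <poem2 xlink:href="http://www.kirjasto.sci.fi/wbyeats.htm" xlink:type="simple">The Tower</poem2>
  </poems>
</poem>"""


def simple_links(root):
    """Return (link text, target) for every element carrying a simple XLink."""
    links = []
    for el in root.iter():
        # ElementTree spells a namespaced attribute as "{namespace-uri}localname".
        if el.get(f"{{{XLINK}}}type") == "simple":
            links.append((el.text, el.get(f"{{{XLINK}}}href")))
    return links


for text, target in simple_links(ET.fromstring(doc)):
    print(text, "->", target)
```
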
43
Some XML Applications
  • Text Encoding Initiative
  • For adding annotation to historical manuscripts
  • http://www.tei-c.org/
  • Encoded Archival Description
  • To enhance automated processing of finding aids
  • http://www.loc.gov/ead/
  • Metadata Encoding and Transmission Standard
  • Bundles descriptive and administrative metadata
  • http://www.loc.gov/standards/mets/

44
What's Wrong with the Web?
  • HTML
  • Confounds structure and appearance (XML)
  • HTTP
  • Can't recognize related transactions (Cookies)
  • URL
  • Links break when you move a file (PURL)

45
Cookies
  • Servers know users by IP address and port
  • Because that's where they send the Web pages
  • Cookies preserve state
  • Server sends data to the browser
  • Browser later responds with the same data
  • A unique code (server-side state)
  • Information about the user (client-side state)
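
The "unique code" variant can be sketched with Python's standard http.cookies module: the server keeps the real state and the cookie carries only a key to it. The function names, the cookie name "session", and the stored fields are assumptions for illustration.

```python
import secrets
from http.cookies import SimpleCookie

sessions = {}   # server-side state, keyed by the unique code stored in the cookie


def server_first_visit(user_name):
    """Create server-side state and return the Set-Cookie header line."""
    code = secrets.token_hex(8)
    sessions[code] = {"user": user_name, "visits": 1}
    cookie = SimpleCookie()
    cookie["session"] = code
    return cookie.output(header="Set-Cookie:")


def server_later_visit(cookie_value):
    """Look up the state for the name=value text the browser sent back."""
    cookie = SimpleCookie(cookie_value)
    state = sessions[cookie["session"].value]
    state["visits"] += 1
    return state


set_cookie = server_first_visit("reader")
# The browser stores the name=value pair and returns it with its next request.
returned = set_cookie.split(":", 1)[1].strip()
print(server_later_visit(returned))
```

Storing information about the user in the cookie itself (client-side state) would skip the sessions table but expose that data to the browser.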

46
Persistent URLs
  • www.purl.org

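[Figure: PURL resolution. My Browser sends a PURL to the PURL Server and receives a URL; it then sends that URL to the Resource Server and receives the page.]
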
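
The idea behind a persistent URL is one level of indirection: the published address never changes, and only a lookup table is updated when the resource moves. A toy sketch, with made-up paths and hosts and assumed function names:

```python
# Hypothetical lookup table kept by the PURL server.
purl_table = {"/docs/syllabus": "http://old-host.example/690/syllabus.html"}


def resolve(purl_path):
    """Return the (status, location) a PURL server might answer with."""
    target = purl_table.get(purl_path)
    return (302, target) if target else (404, None)


def move_resource(purl_path, new_url):
    """Record a new location; the published PURL itself stays the same."""
    purl_table[purl_path] = new_url


print(resolve("/docs/syllabus"))
move_resource("/docs/syllabus", "http://new-host.example/syllabus.html")
print(resolve("/docs/syllabus"))
```
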
47
Summary
  • Learning to build simple Web pages is easy
  • Which is good news for the homework!
  • All documents are structured documents
  • XML is a flexible markup language toolkit
  • The key is to understand its capabilities
  • XML editors can hide much of the complexity

48
Before You Go!
  • On a sheet of paper (no names), answer the
    following question
  • What was the muddiest point in today's class?