Dickson K.W. Chiu - PowerPoint PPT Presentation

Transcript and Presenter's Notes


1
A Script Language for Generating Internet-bots
  • Dickson K.W. Chiu
  • Department of Computer Science and Engineering
  • Chinese University of Hong Kong
  • Shatin, Hong Kong
  • kwchiu_at_ieee.org

2
Agenda
  • Introduction and Motivation
  • WebScript Operating Environment
  • The WebScript Language
  • Example Application
  • Conclusion and Further Work

3
Motivation of WebScript
  • Most web services and agents available only
    through manual web pages (e.g. online ordering)
  • Need human attention
  • Long delay impairing E-commerce
  • Motivated by scripts in terminal emulation
    programs (Telix/Procomm)
  • Generic tool for automating web interactions
  • Also useful for casual end-users e.g. get stock
    price into personal db from a web page

4
WebScript Features
  • Minimal core language
  • Complete set of primitives for responding to HTML
    forms
  • Information extraction from web pages based on
    pattern-matching
  • Interfacing to back-end databases or storing data
    to files
  • Raising exceptions and alerting

5
Architecture & Operating Environment
  • Translator (instead of interpreter)
  • Perl or Java target language
  • Stand-alone utility at client
  • Server-side utility - script supplied by client
    or from repository
  • Translator service for thin clients (translated
    code executed at client-side)
  • Programming productivity tool
  • Part of complex information system
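As a rough illustration of the translator approach (as opposed to an interpreter), the sketch below maps a few script statements to source lines in a target language. Python stands in here for the Perl or Java targets named on the slide. The rule table, the statement grammar, and the runtime helpers `fetch` and `submit` that the generated code refers to are all assumptions made for this sketch, not the actual WebScript translator.

```python
import re

# Hypothetical statement patterns mapped to target-language templates.
# The generated code assumes runtime helpers fetch() and submit() exist.
RULES = [
    (re.compile(r"^Timeout\s+(\d+)$", re.I), "timeout_ms = {0}"),
    (re.compile(r"^URL\s+(\S+)$", re.I), 'page = fetch("{0}")'),
    (re.compile(r"^Fillform\s+(\w+)\s+(\S+)$", re.I), 'form["{0}"] = {1}'),
    (re.compile(r"^Button\s+(.+)$", re.I), 'page = submit(form, button="{0}")'),
]


def translate(script):
    """Translate each non-empty script line into one line of target source."""
    out = []
    for lineno, raw in enumerate(script.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        for pattern, template in RULES:
            match = pattern.match(line)
            if match:
                out.append(template.format(*match.groups()))
                break
        else:
            # No rule matched: report the offending line instead of guessing.
            raise SyntaxError(f"line {lineno}: cannot translate {line!r}")
    return "\n".join(out)


SCRIPT = """
Timeout 5000
URL http://example.com/
Fillform Dname domain
Button Search
"""
generated = translate(SCRIPT)
print(generated)
```

Because the output is ordinary source text, the same translation step could run as a stand-alone utility, on a server, or as a service for thin clients, which is the deployment flexibility the slide lists.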

6
The WebScript Language Mechanism
  • Based on HTML features
  • Automate HTTP messages
  • Simulates a user browsing a target web page,
    entering information and pressing buttons
  • Carry out delegated actions and/or extract
    relevant information from pages
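To make "automate HTTP messages" concrete, the sketch below builds the POST request that a browser would issue when a user fills in a form and presses a button. It is only an illustration in Python using the standard library; the URL, field names, and the helper name `build_form_post` are hypothetical, and the request is constructed but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(action_url, fields):
    """Build (but do not send) the POST request a form submission produces."""
    body = urlencode(fields).encode("ascii")
    return Request(
        action_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical form: one text field plus the pressed submit button.
req = build_form_post(
    "https://example.com/cgi-bin/search.cgi",
    {"Dname": "example.org", "Search": "Search"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Passing the request to `urllib.request.urlopen` would actually send it; that step is left out so the sketch runs without network access.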

7
Basic Language Constructs
  • Variables / Parameters - String type and
    structured type
  • Structured type based on db table / class
    definition
  • Simple control flow primitives
  • Perl expression and functions
  • Subroutines

8
Interfacing & Information Extraction
  • Connect to db (ODBC, MySQL, Postgres)
  • Send db statements (SQL) to obtain results (with
    cursor)
  • Insert a tuple / object from a structured
    variable
  • Download URL for processing
  • Save to file or as objects in host db
  • Extract information by matching regular
    expressions
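The extraction and storage steps listed above can be sketched roughly as follows. This is an illustration in Python, not WebScript itself: the helper `extract_first`, the sample page, the regular expression, and the table layout are assumptions, and an in-memory SQLite database stands in for the ODBC, MySQL, or Postgres back ends named on the slide.

```python
import re
import sqlite3


def extract_first(page, pattern, after, before):
    """Return the first regex match located between two marker strings."""
    start = page.find(after)
    if start < 0:
        return None
    start += len(after)
    end = page.find(before, start)
    if end < 0:
        return None
    match = re.search(pattern, page[start:end])
    return match.group(0) if match else None


# Hypothetical downloaded page fragment.
page = "<td>Price per year:</td><td>US$ 35.00</td><td>Setup fee</td>"
price = extract_first(page, r"[0-9]+\.?[0-9]*", "Price per year:", "Setup fee")

# Store the extracted value as a tuple in a (stand-in) host database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registrars (reg_name TEXT, price TEXT)")
conn.execute("INSERT INTO registrars VALUES (?, ?)", ("ExampleReg", price))
rows = conn.execute("SELECT reg_name, price FROM registrars").fetchall()
print(rows)
```

Restricting the search to the text between the two markers keeps the pattern itself simple, since it does not have to describe the surrounding markup.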

9
HTML Form Dialogue
  • From script variables and expressions, fill in
    fields, select check-box / radio-buttons / pop-up
    list, etc.
  • Press Buttons
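One way to picture the form dialogue is as a merge of the form's default field values with values supplied by the script, followed by a button press. The Python sketch below parses a hypothetical form with the standard `html.parser` module; the class `FormFields`, the function `fill_form`, and the sample HTML are assumptions for illustration, and only `<input>` elements are handled (no `<select>` or `<textarea>`).

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFields(HTMLParser):
    """Collect the default name/value pairs of <input>s in the first form."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = a.get("action")
        elif tag == "input" and self._in_form and a.get("name"):
            kind = (a.get("type") or "text").lower()
            if kind in ("submit", "button", "image", "reset"):
                return  # buttons are only sent when pressed
            if kind in ("checkbox", "radio") and "checked" not in a:
                return  # unchecked boxes are not submitted
            self.fields[a["name"]] = a.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def fill_form(html, values, button=None):
    """Overlay script values on form defaults; optionally press a button."""
    parser = FormFields()
    parser.feed(html)
    data = dict(parser.fields)
    data.update(values)
    if button is not None:
        name, label = button
        data[name] = label
    return parser.action, urlencode(data)


PAGE = """
<form action="/cgi-bin/domain_search.cgi" method="post">
  <input type="text" name="Dname" value="">
  <input type="hidden" name="session" value="abc123">
  <input type="checkbox" name="newsletter" value="yes">
  <input type="submit" name="go" value="Search">
</form>
"""
action, body = fill_form(PAGE, {"Dname": "example.org"}, button=("go", "Search"))
print(action)
print(body)
```

Keeping hidden fields and other defaults from the page means the script only has to name the fields it wants to change.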

10
Example 1: Database-driven script for checking Registration Price

11
Example 1: Database-driven script for checking Registration Price
  • Webscript CheckDomainReg
  • DBconnect (h, MySQL, localhost, OrderClerk, pwd, services)
  • /* registrars in a table in the RDBMS */
  • Declare r h.registrars
  • Declare newprice
  • DBcommand h select * from registrars result r continue q1
  • Timeout 5000
  • After retry 5 Raise
  • On error retry Checkpoint 1
  • While (r.Reg_name <> NULL)
  • /* while there's another registrar */
  • Checkpoint 1
  • URL r.URL
  • Expect title r.title Raise page_changed
  • Extract first like [0-9]*\.?[0-9]* after r.pattern1 before
    r.pattern2 to newprice
  • If (newprice = NULL)
  • Raise page_changed
  • If newprice <> h.price
  • Dbcommand h update registrars set date_changed = curdate(),
    price=newprice
  • DBcontinue q1 result r /* get next registrar */
  • Dbdisconnect h
  • Return
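The exception-handling statements in this script (Timeout, After retry 5 Raise, On error retry Checkpoint 1) suggest a retry loop around the steps that follow a checkpoint. The Python sketch below shows one possible reading: transient errors restart the step up to a fixed number of times, while an exception raised deliberately by the script (such as page_changed) propagates at once. The names `run_from_checkpoint` and `PageChanged`, and this interpretation of the semantics, are assumptions rather than the documented behaviour of WebScript.

```python
class PageChanged(Exception):
    """Raised by the script itself when a page no longer looks as expected."""


def run_from_checkpoint(step, max_retries=5):
    """Run step(); on a transient error, restart it up to max_retries times."""
    retries = 0
    while True:
        try:
            return step()
        except PageChanged:
            raise  # explicit script exceptions are never retried
        except OSError as exc:  # covers timeouts and connection errors
            if retries >= max_retries:
                raise RuntimeError(
                    f"gave up after {max_retries} retries"
                ) from exc
            retries += 1


# Simulated step: times out twice, then yields a price.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "35.00"


result = run_from_checkpoint(flaky_fetch)
print(result, calls["n"])
```

Separating transient failures from deliberate exceptions lets a changed page be reported immediately instead of being fetched again five times.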

12
Example 2: Online Domain Name Registration
  • Webscript regdomainame (n string)
  • /* input form number */
  • DBconnect (h, MySQL, localhost, OrderClerk, pwd, services)
  • /* order_form in a table in the RDBMS */
  • Declare o h.order_form
  • DBcommand h select * from order_form where order_num=n
    result o
  • Timeout 5000
  • After retry 5 raise
  • On error retry Checkpoint 1
  • Checkpoint 1
  • URL http://www.niceg.com/registrate.html
  • Expect title regist
  • Form 1 post https://www.nicreg.com/cgi-bin/domain_search.cgi
  • Fillform Dname order_form.domain
  • Button Search
  • Expect page available raise domain_not_available
  • Form 1 post https://www.nicreg.com/cgi-bin/registrate.cgi
  • Fillform Dname default
  • Button Register
  • On error retry no
  • Form 1 post https://www.nicreg.com/cgi-bin/autoccreg.cgi
  • Fillform Company order_form.company
  • Fillform Address order_form.address ...
  • Button Submit for Processing
  • Expect page Successful Registration
  • DBdisconnect h
  • Return

13
Concluding Remarks
  • Part of the ADOME-WFMS project
  • Flexible script language, WebScript, for generating
    Internet-bots
  • Simple, application-oriented
  • Tailor-made primitives for db access, web
    dialogue and exception handling
  • Suitable for E-commerce environment
  • Easier to develop, understand, debug and maintain

14
Further Work
  • Scripting with XML pages
  • Java code generation
  • Script development tools
  • Recording, monitoring and debugging tools
  • Displaying form and db fields for drag-and-drop
  • Script development methodology