Building Intelligent Web Agents with CFML, Michael Dinowitz, November 2000

Transcript and Presenter's Notes

1
Building Intelligent Web Agents with CFML
Michael Dinowitz
November, 2000
2
Intelligent Agents in ColdFusion
  • What are Agents?
  • Code that does automatic work for you
  • Involves retrieving information
  • Processing or storing that information
  • Usually a single page or has no interface
  • What are Intelligent Agents (IA)?
  • Term used for a specific class of agents
  • Retrieves remote information
  • Processes the retrieved information
  • Decision making code built in
  • Usually involves Parsing operations
  • Interfaces with remote processes

3
Intelligent Agents in ColdFusion
  • What aren't Intelligent Agents?
  • Push of any sort (CFMAIL)
  • Calls to structured locations
  • DBs
  • LDAP
  • Browsers
  • Grey Areas - Structured data
  • Syndicated data (Spectra)
  • HTTP query returns
  • Comma delimited information
  • Most local information calls

4
Intelligent Agents in ColdFusion
  • Broad examples
  • CF_StockGrabber - grabs and processes stock
    information
  • CF_UPS - interface to UPS shipping data
  • CF_MetaSearch - searches multiple search engines
    and collates results
  • CF_GetTags

5
Intelligent Agents in ColdFusion
  • Technologies used for retrieval
  • CFHTTP - retrieves web pages
  • CFFTP - retrieves FTP information
  • CFX_Socket - socket calls for information
  • CFX_NNTP - retrieves Usenet news
  • Technologies used for parsing
  • Find() / FindNoCase()
  • Replace() / ReplaceNoCase()
  • Mid()
  • REFind() / REFindNoCase()
  • REReplace() / REReplaceNoCase()

6
IA technique I - CF_EbayItem
  • 1. Define what you want
  • A page from eBay with the results of a search
  • 2. Define how it will be displayed
  • Whole page returned in a variable. No parsing
  • 3. Define the steps to get it
  • CFHTTP to retrieve a page
  • Place information in file or on browser

7
CFHTTP Basics
  • Url - URL to retrieve. Does not need the http://
    prefix
  • Method - Get or Post.
  • ResolveUrl - Turns all relative links into full
    ones. Needed for graphics and links from the
    page.
  • Notes
  • The URL does not need to be prefixed by http://,
    but it's good practice to do so.
  • Get is standard and uses the tag as is. Post
    requires a CFHTTPPARAM as well as a closing
    CFHTTP tag.
  • ResolveUrl should only be used when you expect to
    follow links from the called page or want to see
    the media content.
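
The two methods described above can be sketched as follows. This is a minimal illustration only: the example.com URLs, the form field name `query`, and the variable names `page` and `results` are placeholders, not taken from the presentation.

```cfml
<!--- GET: the tag stands alone; the response body lands in cfhttp.FileContent --->
<cfhttp url="http://www.example.com/index.html" method="GET" resolveurl="true">
<cfset page = cfhttp.FileContent>

<!--- POST: needs at least one CFHTTPPARAM and a closing CFHTTP tag --->
<cfhttp url="http://www.example.com/search.cfm" method="POST">
    <cfhttpparam type="FormField" name="query" value="amulets">
</cfhttp>
<cfset results = cfhttp.FileContent>
```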

8
IA technique I - CF_EbayItem
  • IA technique I - CF_Ebay (Code)

<CFPARAM name="attributes.ReturnVar" default="ReturnVar">
<CFHTTP url="http://search-desc.ebay.com/search/search.dll?MfcISAPICommand=GetResult&ebaytag1=ebayreg&ht=1&query=#attributes.searchitem#&ebaytag1code=0&srchdesc=y&SortProperty=MetaNewSort"
        method="GET" resolveurl="true">
<CFSET "Caller.#Attributes.ReturnVar#" = CFHttp.FileContent>

9
IA technique II - CF_EbayItem
  • 1. Define what you want
  • All items from an eBay search
  • 2. Define how it will be displayed
  • In a return array
  • 3. Define the string to search for in the page
  • ViewItem&item=449570667">HEBREW AMULETS By T
    Schrire
  • 4. Define the steps to get it
  • CFHTTP to retrieve a page
  • CFLOOP over the page for elements
  • FindNoCase() to get start of specific element
  • FindNoCase() to get end of specific element
  • Mid() to get whole element
  • Place information in array for return

10
Find()/FindNoCase() Basics
  • FindNoCase(substring, string [, start])
  • SubString - The exact string you're looking for
  • String - The string that you're searching
  • Start - Optional start position.
  • Notes
  • FindNoCase is slightly slower, but better when
    you don't know the exact case of the text you're
    looking for.
  • Always a good idea to set a start. Speeds up the
    search.
  • Remember that the return value is the START
    position of the SubString. Add the SubString
    length to get the end position.
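
As a small sketch of the last note, the following computes both the start of a marker and the position just past it. The sample string and the variable names (`page`, `marker`, `markerStart`, `markerEnd`) are made up for illustration.

```cfml
<cfset page = 'Intro <A HREF="item1.html">First</A> outro'>
<cfset marker = '<a href="'>
<!--- FindNoCase returns the 1-based position where the marker STARTS, or 0 if absent --->
<cfset markerStart = FindNoCase(marker, page, 1)>
<cfif markerStart GT 0>
    <!--- Add the marker length to get the position just past the marker --->
    <cfset markerEnd = markerStart + Len(marker)>
    <cfoutput>Marker starts at #markerStart#; the text after it starts at #markerEnd#</cfoutput>
</cfif>
```

Checking for a return value of 0 before doing any arithmetic avoids extracting from the wrong place when the marker is missing.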

11
Mid() Basics
  • Mid(string, start, count)
  • String - The string that contains the SubString
    you want.
  • Start - The start position of the SubString you
    want.
  • Count - The number of characters in the SubString
    that you want.
  • Notes
  • When used with FindNoCase, it is usual to have a
    start variable and an end variable. The count
    is then End - Start.
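
A minimal sketch of the start/end pattern, assuming a made-up sample string and the hypothetical variable names `startPos` and `endPos`:

```cfml
<cfset page = 'before <b>HEBREW AMULETS</b> after'>
<!--- startPos: first character after the opening tag --->
<cfset startPos = FindNoCase("<b>", page, 1) + Len("<b>")>
<!--- endPos: position where the closing tag begins, searching from startPos --->
<cfset endPos = FindNoCase("</b>", page, startPos)>
<!--- The count is endPos - startPos, so the closing tag itself is excluded --->
<cfset title = Mid(page, startPos, endPos - startPos)>
<cfoutput>#title#</cfoutput>
```

With this sample string the output is HEBREW AMULETS.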

12
IA technique II - CF_EbayItem
<CFPARAM name="Attributes.ReturnArray" default="ReturnArray">
<CFHTTP url="http://search-desc.ebay.com/search/search.dll?MfcISAPICommand=GetResult&ebaytag1=ebayreg&ht=1&query=#Attributes.SearchItem#&ebaytag1code=0&srchdesc=y&SortProperty=MetaNewSort"
        method="GET" resolveurl="true">
<CFSET LocalArray = ArrayNew(1)>

13
IA technique II - CF_EbayItem
  • ...y.com/aw-cgi/eBayISAPI.dll?ViewItem&item=',
    cfhttp.filecontent, end)
  • ...', cfhttp.filecontent,
    start) + 4
  • ...ent, start, end-start))
  • ...y
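
The code on this slide survives only in fragments, so the following is a sketch of the steps listed for technique II, not the presenter's original code. It assumes a short sample string in place of cfhttp.FileContent, a made-up link format (`view?item=`), and hypothetical variable names.

```cfml
<!--- Sample page text standing in for cfhttp.FileContent (an assumption for illustration) --->
<cfset page = '<a href="view?item=1">First</a> text <a href="view?item=2">Second</a>'>
<cfset openTag = '<a href="view?item='>
<cfset closeTag = '</a>'>
<cfset items = ArrayNew(1)>
<cfset endPos = 1>
<cfloop condition="true">
    <!--- Find the start of the next element, searching from where the last one ended --->
    <cfset startPos = FindNoCase(openTag, page, endPos)>
    <cfif startPos EQ 0><cfbreak></cfif>
    <!--- Find the closing tag, then move past it so the whole element is included --->
    <cfset endPos = FindNoCase(closeTag, page, startPos)>
    <cfif endPos EQ 0><cfbreak></cfif>
    <cfset endPos = endPos + Len(closeTag)>
    <cfset ArrayAppend(items, Mid(page, startPos, endPos - startPos))>
</cfloop>
<cfoutput>#ArrayLen(items)# elements found</cfoutput>
```

Carrying `endPos` forward as the next search start is what keeps the loop from finding the same element twice.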

14
IA technique III - CF_EbayItem
  • 1. Define what you want
  • All items from an eBay search
  • 2. Define how it will be displayed
  • In a return array
  • 3. Define the string to search for in the page
  • ViewItem&item=449570667">HEBREW AMULETS By T
    Schrire
  • 4. Define the steps to get it
  • CFHTTP to retrieve a page
  • CFLOOP over the page for elements
  • REFindNoCase() to get specific element
  • Mid() to get whole element
  • Place information in array for return

15
REFind()/REFindNoCase() Basics
  • REFindNoCase(RegEx, String [, Start] [, ReturnSub])
  • RegEx - Regular Expression to use as search
    criteria
  • String - String to search in
  • Start - Position in String to start search at
  • ReturnSub - Returns sub expressions as defined in
    the RegEx
  • Notes
  • Start should always be used as it speeds up the
    search. If using ReturnSub, it is required and
    can be set to 1.
  • This function returns the numeric position of the
    searched-for text unless ReturnSub is specified,
    in which case it returns a structure.

16
REFind()/REFindNoCase() Basics
  • The structure returned by this function has two
    keys (Pos, Len), each of which is an array. The
    first element (Variable.Pos[1], Variable.Len[1])
    always contains the position/length of the
    ENTIRE match. Each additional array element
    contains the position and length of a subexpression.
  • Variable
  • Variable.Pos[1], Variable.Pos[2]
  • Variable.Len[1], Variable.Len[2]
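
A short sketch of reading that structure, assuming a made-up sample string and hypothetical variable names (`sample`, `match`, `whole`, `itemNumber`):

```cfml
<cfset sample = "Item 449570667 is listed">
<!--- The parentheses define one subexpression; the fourth argument asks for a structure --->
<cfset match = REFindNoCase("item ([0-9]+)", sample, 1, true)>
<cfif match.pos[1] GT 0>
    <!--- Element 1 is the entire match; element 2 is the first subexpression --->
    <cfset whole = Mid(sample, match.pos[1], match.len[1])>
    <cfset itemNumber = Mid(sample, match.pos[2], match.len[2])>
    <cfoutput>#whole# / #itemNumber#</cfoutput>
</cfif>
```

When nothing matches, the first position is 0, which is why the sketch tests `match.pos[1]` before calling Mid().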

17
RegEx Basics
  • The following is a fast rundown of important
    characters in Regular Expressions
  • In most cases, a character is equal to itself
  • A \ will escape any special character
  • A period (.) represents any one character
  • .at can mean bat, cat, rat, or any other single
    character followed by at.
  • A pair of brackets ([ ]) denotes a set of characters
    (i.e., any one of them can match)
  • [01256] means any one of those digits
  • A dash (-) within a set means a range of characters
  • [0-9] means any single digit from 0 through 9
  • A caret (^) at the start of a set means not the set
  • [^aeiou] means any character except those vowels

18
RegEx Basics
  • Parentheses are used to denote a compound
    expression or a subexpression
  • (this) will return the position and length of the
    word this
  • When used within a compound, a pipe (|) means
    either/or
  • (this|that) will return the position and length
    of the first occurrence of this or that
  • A question mark (?) means that the previous
    character, set or compound may or may not exist,
    but if it does, it exists only once
  • A plus (+) means that the previous character, set
    or compound must exist 1 or more times
  • An asterisk (*) means that the previous
    character, set or compound may exist 0 or more
    times
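
To try these characters out, a few REFind() calls against one sample string can help. The sample string and variable names below are invented for illustration.

```cfml
<cfset sample = "order 66 for this cat">
<!--- Each call returns the 1-based position of the first match, or 0 if there is none --->
<cfset anyChar  = REFind(".at", sample)>
<cfset digits   = REFind("[0-9]+", sample)>
<cfset notVowel = REFind("[^aeiou]", sample)>
<cfset either   = REFind("(this|that)", sample)>
<cfset optional = REFind("cats?", sample)>
<cfoutput>#anyChar# #digits# #notVowel# #either# #optional#</cfoutput>
```

Changing the sample string and watching which positions move is a quick way to check what each pattern matches.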

19
IA technique III - CF_EbayItem

<CFPARAM name="Attributes.ReturnArray" default="ReturnArray">
<CFHTTP url="http://search-desc.ebay.com/search/search.dll?MfcISAPICommand=GetResult&ebaytag1=ebayreg&ht=1&query=#Attributes.SearchItem#&ebaytag1code=0&srchdesc=y&SortProperty=MetaNewSort"
        method="GET" resolveurl="true">
<CFSET LocalArray = ArrayNew(1)>

20
IA technique III - CF_EbayItem
  • ...y\.com/aw-cgi/eBayISAPI\.dll\?ViewItem&item=[0-9]...
    "', cfhttp.filecontent, end, 1)

21
IA technique III - CF_EbayItem
  • ...ent, Item.pos[1], Item.len[1]))
  • ...y
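
Only fragments of the technique III code survive here, so what follows is a sketch of the listed steps under stated assumptions, not the original. The sample string stands in for cfhttp.FileContent, and the link format, the pattern, and all variable names are hypothetical.

```cfml
<!--- Sample page text standing in for cfhttp.FileContent (an assumption for illustration) --->
<cfset page = '<a href="view?item=101">First</a> text <a href="view?item=202">Second</a>'>
<!--- One pattern describes the whole element, so a single call finds both its start and length --->
<cfset pattern = '<a href="view\?item=[0-9]+">[^<]*</a>'>
<cfset items = ArrayNew(1)>
<cfset searchFrom = 1>
<cfloop condition="true">
    <cfset item = REFindNoCase(pattern, page, searchFrom, true)>
    <cfif item.pos[1] EQ 0><cfbreak></cfif>
    <cfset ArrayAppend(items, Mid(page, item.pos[1], item.len[1]))>
    <!--- Continue searching just past the element that was found --->
    <cfset searchFrom = item.pos[1] + item.len[1]>
</cfloop>
<cfoutput>#ArrayLen(items)# elements found</cfoutput>
```

Compared with the two FindNoCase() calls of technique II, the regular expression replaces the separate start and end searches with one call.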

22
Extra Information
  • CFHTTP Headers - extra information returned by a
    CFHTTP (or any HTTP) call
  • FILECONTENT - Text grabbed
  • HEADER - Header info (including cookies)
  • MIMETYPE - Return mime type
  • RESPONSEHEADER - structure with all information
    except content
  • STATUSCODE - HTTP return code
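
One way these values might be used is to check the response before parsing it. This sketch assumes a placeholder URL and treats only a 200 response with a text MIME type as parseable, which is an illustrative policy rather than anything the presentation prescribes.

```cfml
<cfhttp url="http://www.example.com/" method="GET">
<!--- StatusCode holds the code plus reason text, for example "200 OK" --->
<cfif Left(cfhttp.StatusCode, 3) EQ "200" AND FindNoCase("text/", cfhttp.MimeType) EQ 1>
    <cfset page = cfhttp.FileContent>
<cfelse>
    <!--- Nothing worth parsing: leave the page empty --->
    <cfset page = "">
</cfif>
```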

23
Syndication (WDDX Queries)
  • Can return structured information as a query
  • Better to use WDDX to send the query encoded in a
    packet
  • Basis of Spectra syndication
  • Can pass binary files encoded with ToBase64()
    function
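
A minimal sketch of a WDDX round trip for a query, assuming a made-up one-row query and the hypothetical variable names `products`, `packet`, and `restored`. In practice the packet would travel between servers, for example as the body of a CFHTTP response.

```cfml
<cfset products = QueryNew("product,price")>
<cfset QueryAddRow(products, 1)>
<cfset QuerySetCell(products, "product", "Widget", 1)>
<cfset QuerySetCell(products, "price", 9.95, 1)>
<!--- Serialize the query into a WDDX packet (a string of XML) --->
<cfwddx action="cfml2wddx" input="#products#" output="packet">
<!--- On the receiving side, turn the packet back into a query --->
<cfwddx action="wddx2cfml" input="#packet#" output="restored">
<cfoutput>#restored.RecordCount# row(s): #restored.product[1]#</cfoutput>
```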

24
Conference Closing Slide