Advanced Regular Expressions - PowerPoint PPT Presentation

Slides: 17
Provided by: michae1565
Transcript and Presenter's Notes

Title: Advanced Regular Expressions


1
Advanced Regular Expressions
  • Or: What's special about RegEx in MX

2
Your Presenter
  • Michael Dinowitz
  • Head of House of Fusion
  • Publisher of Fusion Authority
  • Founding member of Team Macromedia
  • Doing this since June 95
  • Called on for the black magic code

3
Disclaimer Introduction
  • If you don't know the basics, get out
  • No real changes from CF 5 or CFMX 6

4
Basic additions
  • Greedy vs. Lazy
  • + is one or more, and as many as it can
  • +? is one or more, but only as many as it needs
  • ++ is the same as greedy but does not allow
    backtracking (not in CFMX)
  • Nested subexpressions
  • In order of execution: from the outside in
  • Then left to right
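
The quantifier and group-ordering rules above can be tried out in a few lines. This is only a sketch using java.util.regex as a stand-in for the CFMX engine (an assumption; the two engines may differ in details), and the class and helper names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyLazyDemo {
    // Hypothetical helper: first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Hypothetical helper: capture group n of the first match, or null if none.
    static String group(String regex, String input, int n) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(n) : null;
    }

    public static void main(String[] args) {
        String html = "<b>one</b><b>two</b>";
        System.out.println(firstMatch("<b>.+</b>", html));   // greedy: runs to the last </b>
        System.out.println(firstMatch("<b>.+?</b>", html));  // lazy: stops at the first </b>
        System.out.println(firstMatch("a++a", "aaa"));       // possessive: never gives back, so no match

        // Nested groups: 1 is the outer group, 2 and 3 the inner ones, left to right.
        String nested = "((\\d+)-(\\d+))";
        for (int i = 1; i <= 3; i++) {
            System.out.println(i + ": " + group(nested, "call 555-1234", i));
        }
    }
}
```

The possessive line is there only to show what the ++ form does in an engine that supports it.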

5
Character vs. POSIX Classes
  • Non-special characters become special
  • Uses a backslash (\) to specify being special
  • Shorter than POSIX classes
  • Harder to read for newbies

6
Basic Character Classes
  • \b - word boundary
  • Any jump from alphanumeric to non-alphanumeric (or back)
  • reFindNoCase('\bbig\b', 'big') returns 1
  • \B - non-boundary: between any 2 characters of the same type
  • reFindNoCase('\B', 'big') returns 2
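
The two boundary results above can be reproduced outside ColdFusion. The sketch below assumes java.util.regex treats \b and \B the same way; positionNoCase is a hypothetical helper that mimics the 1-based position (0 for no match) that the slide's function calls report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BoundaryDemo {
    // Hypothetical helper: 1-based position of the first case-insensitive
    // match, or 0 when there is none.
    static int positionNoCase(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
        return m.find() ? m.start() + 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(positionNoCase("\\bbig\\b", "big"));     // whole word found at the start
        System.out.println(positionNoCase("\\B", "big"));           // first non-boundary is between b and i
        System.out.println(positionNoCase("\\bbig\\b", "bigger"));  // no boundary after "big"
    }
}
```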

7
More Character Classes
  • \A - same as ^ (but not affected by (?m))
  • \Z - same as $ (but not affected by (?m))
  • \n - newline
  • \r - carriage return
  • \t - tab
  • \d - any digit ([0-9])
  • \D - any non-digit ([^0-9])

8
More Character Classes
  • \w - any alphanumeric character or underscore
    ([[:alnum:]_])
  • \W - any character that is not alphanumeric or
    underscore ([^[:alnum:]_])
  • \s - any whitespace character including tab,
    space, newline, carriage return, and form feed
    ([ \t\n\r\f])
  • \S - any non-whitespace character ([^ \t\n\r\f])
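
As a rough illustration of the shorthand classes on the last two slides, here is a sketch in Java (an assumption: java.util.regex stands in for the CFMX engine, and the helper names are invented for this example).

```java
final class CharClassDemo {
    // \D removes everything that is not a digit.
    static String digitsOnly(String input) {
        return input.replaceAll("\\D", "");
    }

    // \s+ collapses any run of spaces, tabs, and line breaks into one space.
    static String collapseWhitespace(String input) {
        return input.replaceAll("\\s+", " ");
    }

    // \W removes everything that is not a letter, digit, or underscore.
    static String wordCharsOnly(String input) {
        return input.replaceAll("\\W", "");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("(555) 123-4567"));
        System.out.println(collapseWhitespace("a \t\n b"));
        System.out.println(wordCharsOnly("snake_case!"));
    }
}
```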

9
Expression Modifiers
  • At beginning of expression
  • (?i) - causes expression to be case insensitive
    (same as NoCase version)
  • (?m) - multi-line mode
  • ^ and $ match at each line, not only at the ends
    of the entire string
  • Carriage return Chr(13) is not treated as a new line
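
A small sketch of these modifiers, again assuming java.util.regex as a stand-in. One difference worth labeling: Java's multi-line mode does treat a lone carriage return as a line break unless the d (Unix lines) flag is added, so (?md) is used here to approximate the behavior described on the slide. The count helper is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ModifierDemo {
    // Hypothetical helper: number of matches of regex in input.
    static int count(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "item 1\nitem 2\nother";
        System.out.println(count("^item", text));          // ^ only at the start of the string
        System.out.println(count("(?m)^item", text));      // ^ at the start of every line
        System.out.println(count("(?m)\\Aitem", text));    // \A ignores (?m)
        System.out.println(count("(?i)ITEM", text));       // case-insensitive
        // With the d flag, only \n ends a line, so \r does not start a new one.
        System.out.println(count("(?md)^item", "item 1\ritem 2"));
    }
}
```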

10
Expression Modifiers
  • (?x) - ignores all white space in the expression
  • Also allows usage of # for comments
    (written ## inside a CFML string)
  • # will comment to end of line
  • reFind("(?x) one ## first option
  • | two ## second option
  • | three\ point\ five ## note escaped spaces
  • ", "three point five")
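
The same commented pattern can be sketched in Java, whose (?x) mode also ignores unescaped white space and treats # as a comment to end of line. This is an approximation of the slide's example, not the CFMX call itself; the class and helper names are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CommentModeDemo {
    // Each line carries one alternative plus a comment; the trailing \n
    // ends the comment on that line.
    static final String REGEX =
        "(?x) one                  # first option\n" +
        "   | two                  # second option\n" +
        "   | three\\ point\\ five # note escaped spaces\n";

    // Hypothetical helper: first match in input, or null if none.
    static String firstMatch(String input) {
        Matcher m = Pattern.compile(REGEX).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("three point five"));
        System.out.println(firstMatch("threepointfive")); // spaces are required, so no match
    }
}
```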

11
Group Modifiers
  • Affects only the group it's in
  • Must be at beginning of group
  • (?#) comment
  • Must escape
  • (?:) does not add group to return collection
  • (?=) positive lookahead
  • (?!) negative lookahead

12
Positive Lookahead
  • Tests if the text in the parenthesis exists
  • Does not save the text into return collection
  • Does not consume text
  • <a(?=.*href).*?href="([^"]*).*?>
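
A sketch of the three lookahead properties listed above, using java.util.regex as a stand-in. The anchor-tag pattern here is a tightened variant written for this example (it uses [^>]* instead of .* so it stays inside one tag); it is not the slide's exact expression, and the helper names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PositiveLookaheadDemo {
    // Assumed variant of the slide's pattern: the lookahead checks for href
    // inside the tag, then group 1 captures the href value.
    static final Pattern LINK =
        Pattern.compile("<a(?=[^>]*href)[^>]*?href=\"([^\"]*)\"[^>]*>");

    // Hypothetical helper: the href value of the first anchor tag, or null.
    static String hrefOf(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // The lookahead group is not counted among the capture groups.
    static int groupCount() {
        return LINK.matcher("").groupCount();
    }

    // Hypothetical helper: first match of regex in input, or null.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(hrefOf("<a class=\"x\" href=\"/home\">Home</a>"));
        System.out.println(groupCount());
        // The u is tested but not consumed: only q is part of the match.
        System.out.println(firstMatch("q(?=u)", "quit"));
    }
}
```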

13
Negative Lookahead
  • Tests if the text in the parenthesis does not
    exist
  • Does not save the text into return collection
  • Does not consume text
  • (<a(?!.*?target)[^>]*>)
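
The negative form can be sketched the same way: find anchor tags that carry no target attribute. Again this assumes java.util.regex as a stand-in, and the pattern is a variant written for this example (with \b and [^>]* added to keep the test inside a single tag), not the slide's exact expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NegativeLookaheadDemo {
    // Assumed variant: match an opening <a ...> tag only if no "target"
    // appears before the closing >.
    static final Pattern NO_TARGET = Pattern.compile("<a\\b(?![^>]*target)[^>]*>");

    // Hypothetical helper: all opening anchor tags that lack a target attribute.
    static List<String> linksWithoutTarget(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = NO_TARGET.matcher(html);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<a href=\"a.html\">A</a> <a href=\"b.html\" target=\"_blank\">B</a>";
        System.out.println(linksWithoutTarget(html));
    }
}
```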

14
Replace conversion
  • Used in REReplace()/REReplaceNoCase()
  • Either converts the next character or a
    specific section of characters
  • \u converts next character to uppercase
  • \l converts the next character to lowercase
  • \U...\E converts a block to uppercase
  • \L...\E converts a block to lowercase
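
java.util.regex has no \u or \U...\E in replacement strings, so the sketch below only emulates the effect by hand: one helper uppercases the first letter of each word (the \u idea) and one uppercases every whole match (the \U...\E idea). The helper names are invented for this illustration.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplaceCaseDemo {
    // Emulates a \u-style replacement: uppercase the first character of each word.
    static String capitalizeWords(String input) {
        Matcher m = Pattern.compile("\\b(\\w)(\\w*)").matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String rep = m.group(1).toUpperCase(Locale.ROOT) + m.group(2);
            m.appendReplacement(out, Matcher.quoteReplacement(rep));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Emulates a \U...\E-style replacement: uppercase each whole match.
    static String upperCaseMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String rep = m.group().toUpperCase(Locale.ROOT);
            m.appendReplacement(out, Matcher.quoteReplacement(rep));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalizeWords("hello world"));
        System.out.println(upperCaseMatches("<[^>]+>", "<b>bold</b>"));
    }
}
```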

15
Not Supported
  • Positive Lookbehinds
  • Negative Lookbehinds
  • Other features
  • All accessible through the Java RegEx engine
  • Massimo has a CFC pre-built to do this
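
For a sense of what the Java engine adds, here is a minimal sketch of both lookbehind forms in java.util.regex. How the call is wired up from CFML is not shown and is outside this example; the class and helper names are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookbehindDemo {
    // Hypothetical helper: first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "cost: $42, qty 7";
        // Positive lookbehind: digits that follow a dollar sign.
        System.out.println(firstMatch("(?<=\\$)\\d+", text));
        // Negative lookbehind: a whole number not preceded by a dollar sign.
        System.out.println(firstMatch("(?<!\\$)\\b\\d+", text));
    }
}
```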

16
Resources
  • Chapters in most CFMX books
  • CF-RegEx mailing list
  • This presentation
  • Books
  • Mastering Regular Expressions, 2nd Edition
  • Teach Yourself Regular Expressions in 10 Minutes
  • Java Regular Expressions: Taming the
    java.util.regex Engine