Regular Expressions in .NET

Learn more at: https://www.cs.odu.edu

1
Regular Expressions in .NET
  • Ashraya R. Mathur
  • CS 795 - .NET Security

2
Outline
  • Introduction to Regular Expressions
  • Regular Expression Syntax
  • Validation in ASP.NET
  • Regular Expressions in .NET Programming
  • Demonstrations
  • Conclusion

3
What are Regular Expressions?
  • Definition
  • A Regular Expression is a pattern, written as a
    series of characters, that a regex engine
    compiles into a matcher for finding and
    manipulating text
  • Allow you to
  • Extract, edit, replace, or delete text substrings
  • Add the extracted strings to a collection in
    order to generate a report
  • Are a universally valuable skill applicable in
    .NET, Java, Perl, PHP, JavaScript, and many other
    programming languages

4
Common Regular Expression Uses
  • Form and Data Validation
  • Query-String Validation
  • Data Clean-up / Reformatting
  • Data search and retrieval
  • HTML / XML Information Retrieval
  • Parsing Log Files

5
Regular Expressions Syntax
  • Simple Expressions
  • Simplest Regular Expression - the literal string
  • Quantifiers
  • *, which describes "0 or more occurrences",
  • +, which describes "1 or more occurrences", and
  • ?, which describes "0 or 1 occurrence".
  • Explicit Quantifiers - {x,y}, which allow an
    exact number or range to be specified
  • Quantifiers always refer to the pattern
    immediately preceding (to the left of) the
    quantifier
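
The quantifiers above can be tried out with a few lines of C#. This is a minimal sketch, not part of the original slides: the class name QuantifierDemo and the helper FullMatch are hypothetical, and the helper wraps each pattern in ^(?:...)$ so that the whole input has to match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: true only when the entire input matches the pattern.
    public static bool FullMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }

    public static void Main()
    {
        Console.WriteLine(FullMatch("ac", "ab*c"));        // * accepts zero b's
        Console.WriteLine(FullMatch("ac", "ab+c"));        // + requires at least one b
        Console.WriteLine(FullMatch("abc", "ab?c"));       // ? accepts zero or one b
        Console.WriteLine(FullMatch("abbbc", "ab{2,3}c")); // {2,3} accepts two or three b's
        Console.WriteLine(FullMatch("abab", "(ab){2}"));   // the quantifier applies to the group on its left
    }
}
```

Only the second call should report False, because + needs at least one b between a and c.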

6
Regular Expressions Syntax (contd)
  • Metacharacters
  • include the following: ^ $ . [ ] ( ) | and \
  • . matches any single character (except a
    newline, by default)
  • ^ and $ mark the start and end positions of the
    text (of each line under
    RegexOptions.Multiline). Ex: ^a[a-z]b$
  • () are used to group an expression. Ex: (abc)
  • [] define a class of characters from which the
    pattern can match one. Ex: [a-z], [A-Z], [0-9]
  • | indicates an either-or situation. Ex: ab|cd
  • \ is used as an escape character. Ex: c:\\
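
A short C# sketch of the metacharacters in action. The class name MetacharacterDemo, the helper Found, and the sample inputs are all made up for illustration; verbatim strings (@"...") are used so that backslashes reach the regex engine unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MetacharacterDemo
{
    // Hypothetical helper: true when the pattern matches anywhere in the input.
    public static bool Found(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        Console.WriteLine(Found("c.t", "cut"));           // . stands for any one character
        Console.WriteLine(Found("^a[a-z]b$", "axb"));     // anchors plus a character class
        Console.WriteLine(Found("^a[a-z]b$", "xaxb"));    // fails: the text does not start with a
        Console.WriteLine(Found("ab|cd", "xxcdxx"));      // either ab or cd
        Console.WriteLine(Found(@"c:\\", @"c:\temp"));    // \\ matches one literal backslash
    }
}
```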

7
Sample Regular Expressions
Pattern                           Description
\d{5}                             5 numeric digits, US ZIP code.
(\d{5})(-\d{4})?                  5 digits with an optional -4 digit suffix. US ZIP or ZIP+4 format
\w+@[a-z]+?\.[a-z]{2,3}           Simple email validation expression
\d{3}-\d{2}-\d{4}                 Social Security Number Validation
\d{1,2}\/\d{1,2}\/\d{4}           Date Format Validation
([\w-]+\.)+[\w-]+(/[\w- ./?]*)?   URL Validation
/\*.*\*/                          Matches the contents of a C-style comment /* ... */

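
The ZIP, Social Security number, and date patterns above can be exercised from C# as follows. This is an illustrative sketch under a few assumptions: the class SamplePatterns and its IsValid method are hypothetical names, the sample inputs are invented, and each pattern is wrapped in ^...$ so that the whole input must match rather than a substring.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SamplePatterns
{
    // Patterns from the table, anchored so the entire input must match.
    private static readonly Dictionary<string, string> Patterns =
        new Dictionary<string, string>
        {
            { "zip",  @"^\d{5}(-\d{4})?$" },
            { "ssn",  @"^\d{3}-\d{2}-\d{4}$" },
            { "date", @"^\d{1,2}/\d{1,2}/\d{4}$" },
        };

    // Hypothetical helper: kind is one of "zip", "ssn", "date".
    public static bool IsValid(string kind, string input)
    {
        return Regex.IsMatch(input, Patterns[kind]);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("zip", "23529"));        // plain 5-digit ZIP
        Console.WriteLine(IsValid("zip", "23529-0162"));   // ZIP+4
        Console.WriteLine(IsValid("ssn", "123-45-6789"));  // format check only
        Console.WriteLine(IsValid("date", "12-25-2005"));  // wrong separator, rejected
    }
}
```

These patterns check format only; the date pattern, for instance, would also accept 99/99/9999.
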
8
Validation in ASP.NET
  • RegularExpressionValidator Validation Control
  • Allows you to validate inputs by providing a
    regular expression which must match the input.
  • The regular expression pattern is specified by
    setting the ValidationExpression property of the
    control.
  • Key properties
  • ControlToValidate
  • ErrorMessage (for the ValidationSummary)
  • <asp:RegularExpressionValidator runat="server"
    id="date1" ControlToValidate="TextBox1"
    ErrorMessage="Invalid Date" ValidationExpression
    ="\d{1,2}\/\d{1,2}\/\d{4}" />

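The same date format can also be re-checked in server-side code. The sketch below is an assumption-laden illustration, not the control's implementation: DateInputCheck and IsValidDate are hypothetical names, and the pattern is wrapped in ^(?:...)$ to mimic the validator's requirement that the entire input match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DateInputCheck
{
    // Same expression as the ValidationExpression on the slide.
    private const string DatePattern = @"\d{1,2}\/\d{1,2}\/\d{4}";

    // Hypothetical helper: true when the whole input has the form d/m/yyyy or dd/mm/yyyy.
    public static bool IsValidDate(string input)
    {
        return Regex.IsMatch(input, "^(?:" + DatePattern + ")$");
    }

    public static void Main()
    {
        Console.WriteLine(IsValidDate("12/25/2005"));     // well-formed
        Console.WriteLine(IsValidDate("on 12/25/2005"));  // extra text, rejected
        Console.WriteLine(IsValidDate("12/25/05"));       // two-digit year, rejected
    }
}
```

Like the validator, this checks only the shape of the input, so a calendar check (for example with DateTime.TryParse) would still be needed to reject impossible dates.
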
9
Regular Expressions in .NET Programming
  • .NET Base Classes
  • Namespace System.Text.RegularExpressions
  • Can use from any .NET language
  • Implements the Traditional NFA RegEX Engine
  • As do Java, Perl, PHP, etc.
  • Almost all patterns will work the same
  • Supports Named Captures, which not every
    engine offers

10
The RegEx Namespace: System.Text.RegularExpressions
  • Regex
  • Match
  • MatchCollection
  • Group
  • GroupCollection
  • Capture
  • CaptureCollection
  • RegexCompilationInfo

11
The Regex Base Class
  • The Regex class represents a single regular
    expression
  • It is immutable, which means once you create it,
    you cannot change it
  • To create a Regex object in C#, you can first
    declare it and then instantiate it with the
    regular expression pattern, as shown here

    Regex myRegex;
    myRegex = new Regex(RegularExpressionPattern);

12
The Regex Base Class (Contd)
  • Match: Searches a given string and returns a
    single Match object for the first text that is
    matched by the regular expression pattern
  • Matches: Searches a given string and returns a
    MatchCollection object for all locations that are
    matched by the pattern stored in the Regex object
  • IsMatch: Returns true if the provided string
    contains a match for the pattern
  • Split: Splits the given string into an array of
    substrings using the regular expression pattern
    as the delimiter
  • Replace: Replaces any instances of text that
    match the pattern in the Regex object with the
    provided replacement
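
The five methods can be compared side by side on one input. This is a minimal sketch with invented names (RegexMethodsDemo and its helpers) and an invented sample string; the pattern \d+ simply matches runs of digits.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexMethodsDemo
{
    // One immutable Regex instance, reused by every helper below.
    private static readonly Regex Digits = new Regex(@"\d+");

    public static string FirstNumber(string s) { return Digits.Match(s).Value; }
    public static int CountNumbers(string s) { return Digits.Matches(s).Count; }
    public static bool HasNumber(string s) { return Digits.IsMatch(s); }
    public static string[] SplitOnNumbers(string s) { return Digits.Split(s); }
    public static string MaskNumbers(string s) { return Digits.Replace(s, "#"); }

    public static void Main()
    {
        string input = "a1b22c";
        Console.WriteLine(FirstNumber(input));                       // Match: first run of digits
        Console.WriteLine(CountNumbers(input));                      // Matches: how many runs
        Console.WriteLine(HasNumber(input));                         // IsMatch: any run at all?
        Console.WriteLine(string.Join(",", SplitOnNumbers(input)));  // Split: digits act as delimiters
        Console.WriteLine(MaskNumbers(input));                       // Replace: each run becomes #
    }
}
```

Note that Split returns an extra empty string when a match sits at the very start or end of the input.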

13
Demonstration 1
    private void btnRun_Click(object sender, System.EventArgs e)
    {
        // Use a single Regex object to determine if there is a match,
        // passing in the pattern and the option to ignore case
        Regex rxMatch = new Regex(txtRegEx.Text, RegexOptions.IgnoreCase);
        // Determine if there is a match using the user input
        bool blnResult = rxMatch.IsMatch(txtText.Text);
        // Display those results to the user
        MessageBox.Show("The Result is " + blnResult.ToString(), "RegEx Demo");
    }

14
Match and Match Collection
  • Allow us to obtain the details of each match
    made via a regular expression
  • Match - represents a single match made
  • MatchCollection - a collection of Match objects
  • When the Match method of the Regex object is
    used, it returns a Match object that contains the
    matching text
  • The MatchCollection object contains a series of
    Match objects, each representing a single
    substring from the string searched
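
A console-only sketch of walking a MatchCollection, independent of the Windows Forms demos. The class MatchListing, the method Describe, and the sample sentence are hypothetical; each Match exposes the matched text through Value and its position through Index.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchListing
{
    // Returns one "index:value" entry per match found in the input.
    public static List<string> Describe(string input, string pattern)
    {
        var results = new List<string>();
        MatchCollection matches = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);
        foreach (Match m in matches)
        {
            results.Add(m.Index + ":" + m.Value);
        }
        return results;
    }

    public static void Main()
    {
        // Three-letter words ending in "at"
        foreach (string line in Describe("The cat sat on the mat", "[a-z]at"))
        {
            Console.WriteLine(line);
        }
    }
}
```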

15
Demonstration 2
    private void btnRun_Click(object sender, System.EventArgs e)
    {
        // Use a single Regex object to determine if there is a match,
        // passing in the pattern and the option to ignore case
        Regex rxMatch = new Regex(txtRegEx.Text, RegexOptions.IgnoreCase);
        Match mtMatch;
        MatchCollection mtCol;
        mtMatch = rxMatch.Match(txtText.Text);
        mtCol = rxMatch.Matches(txtText.Text);
        MessageBox.Show("There are " + mtCol.Count + " match(es) found.", "RegEx Demos");

16
Demonstration 2 (contd)
        // If there are more than 0 matches, show them
        if (mtCol.Count > 0)
        {
            // Use the Match object here
            do
            {
                // We want the match value and its position in the string
                MessageBox.Show("Result at position " + mtMatch.Index.ToString() + ": "
                    + mtMatch.Value, "RegEx Demos");
                mtMatch = mtMatch.NextMatch();
            } while (mtMatch.Success);
        }
    }

17
Group and GroupCollection
  • Capturing: ()
  • The captured subsequence may be used later in the
    expression, via a back reference, and may also be
    retrieved from the Match object once the match
    operation is complete
  • Non-Capturing: (?:)
  • Named Capture: (?<name>)
  • Uses names for the captured groups instead of
    numbers
  • Substitutions
  • Specialized Replace via groups, e.g. $1 or
    ${name}
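
The three kinds of group can be combined in one pattern. The following is a sketch with invented names (GroupDemo, AreaCode, LastFour, Mask) and an invented phone-number format; it assumes .NET's rule that unnamed groups are numbered first, so the only unnamed capturing group here is group 1.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // (?<area>...) is a named capture, (?:...) groups without capturing,
    // and the final (...) is an ordinary numbered capture (group 1).
    private static readonly Regex Phone =
        new Regex(@"(?<area>\d{3})-(?:\d{3})-(\d{4})");

    public static string AreaCode(string s) { return Phone.Match(s).Groups["area"].Value; }
    public static string LastFour(string s) { return Phone.Match(s).Groups[1].Value; }

    // Substitution: ${area} and $1 refer back to the captured groups.
    public static string Mask(string s) { return Phone.Replace(s, "(${area}) ***-$1"); }

    public static void Main()
    {
        string input = "Call 555-123-4567 today";
        Console.WriteLine(AreaCode(input));
        Console.WriteLine(LastFour(input));
        Console.WriteLine(Mask(input));
    }
}
```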

18
Backreferences & Advanced Grouping
  • Backreferences
  • Allows you to match the same characters as a
    previous group
  • Match repeated words
  • (\b[a-zA-Z]+\b)\s\1
  • Advanced Grouping
  • Positive Look-Ahead Assertion (?=)
  • Negative Look-Ahead Assertion (?!)
  • Positive Look-Behind Assertion (?<=)
  • Negative Look-Behind Assertion (?<!)
  • Non-Backtracking (?>)
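
A hedged sketch of a backreference and two look-around assertions. All names and sample strings are invented for illustration. The repeated-word pattern is a slight variation on the one in the slide: it adds a trailing \b so that "the theme" is not reported as a repeat, and uses \s+ to tolerate several spaces.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BackreferenceDemo
{
    // \1 must repeat exactly what group 1 captured.
    public static List<string> RepeatedWords(string text)
    {
        var words = new List<string>();
        foreach (Match m in Regex.Matches(text, @"\b([a-zA-Z]+)\s+\1\b"))
        {
            words.Add(m.Groups[1].Value);
        }
        return words;
    }

    // Positive look-ahead: digits only if " dollars" follows; the suffix is not consumed.
    public static string DollarAmount(string text)
    {
        return Regex.Match(text, @"\d+(?= dollars)").Value;
    }

    // Positive look-behind: digits only if a $ sign precedes them.
    public static string PriceAfterSign(string text)
    {
        return Regex.Match(text, @"(?<=\$)\d+").Value;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", RepeatedWords("this is is a test test")));
        Console.WriteLine(DollarAmount("10 dollars and 20 euros"));
        Console.WriteLine(PriceAfterSign("cost $45 for 3 items"));
    }
}
```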

19
Replacing Substrings
  • The Replace method of Regex is used to replace
    matched portions of a given string with the
    specified replacement.
  • Example using a named-capture substitution
  • NewDateYMD = Regex.Replace(OldDateMDY,
    @"\b(?<month>\d{1,2})/(?<day>\d{1,2})/(?<year>\d{2,4})\b",
    "${year}-${month}-${day}");
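
A self-contained version of this date rewrite, as a sketch: the class DateRewriter, the method MdyToYmd, and the sample sentence are hypothetical, while the pattern and the replacement string follow the slide.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DateRewriter
{
    // Rewrites every month/day/year date in the text as year-month-day.
    public static string MdyToYmd(string text)
    {
        return Regex.Replace(
            text,
            @"\b(?<month>\d{1,2})/(?<day>\d{1,2})/(?<year>\d{2,4})\b",
            "${year}-${month}-${day}");
    }

    public static void Main()
    {
        Console.WriteLine(MdyToYmd("Due 12/25/2005, shipped 1/3/2006"));
    }
}
```

The replacement only reorders the captured text, so single-digit months and days are not zero-padded.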

20
Demonstration 3
    private void btnCapture_Click(object sender, System.EventArgs e)
    {
        // A basic pattern that captures any run of 4 letters
        string strRegExPattern = "([A-Za-z]{4})";
        Regex rxGroups = new Regex(strRegExPattern, RegexOptions.IgnoreCase);
        // Match object, using a group here
        Match mtGroup = rxGroups.Match(txtCapture.Text);
        // Show group 1 of every match (nothing is shown if there is no match)
        while (mtGroup.Success)
        {
            MessageBox.Show(mtGroup.Groups[1].Value, "RegEx Demos");
            mtGroup = mtGroup.NextMatch();
        }
    }

21
Demonstration 3 (contd)
    private void btnNamedCapture_Click(object sender, System.EventArgs e)
    {
        // A basic pattern that captures any run of 4 letters,
        // this time using a named capture
        string strRegExPattern = "(?<word>[A-Za-z]{4})";
        Regex rxGroups = new Regex(strRegExPattern, RegexOptions.IgnoreCase);
        Match mtGroup = rxGroups.Match(txtCapture.Text);
        while (mtGroup.Success)
        {
            // Show the match using the named reference "word"
            MessageBox.Show(mtGroup.Result("${word}"), "RegEx Demos");
            mtGroup = mtGroup.NextMatch();
        }
    }

22
Demonstration 3 (contd)
    private void btnBack_Click(object sender, System.EventArgs e)
    {
        // Use the Regex object to find a duplicated word, using the
        // (\b[a-zA-Z]+\b)\s\1 pattern entered in txtRegEx
        Regex rxMatch = new Regex(txtRegEx.Text, RegexOptions.IgnoreCase);
        // Replace each repeated pair with a single copy of the word ($1)
        string strReplace = rxMatch.Replace(txtBack.Text, "$1");
        // Show the results
        MessageBox.Show(strReplace, "RegEx Demos");
    }

23
References
  • Regular Expression Library: http://regexlib.com/
  • Regular Expressions Information Website:
    http://www.regular-expressions.info/dotnet.html
  • Regular Expressions in .NET, MSDN Library
  • Professional Visual Studio 2005,
    Andrew Parsons and Nick Randolph

24
Questions?