CSC 4630 - PowerPoint PPT Presentation

Provided by: supp158

Transcript and Presenter's Notes

1
CSC 4630
  • Meeting 7
  • February 7, 2007

2
More Scripting Languages
  • awk, named for Aho, Weinberger, Kernighan
  • Script is embedded in a nested looping control
    structure
  • for each input file line do
  • for each pattern {action} statement do
  • if pattern matches line then action
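
The looping structure can be seen with a small sketch. The three-line input and the two patterns are made up for illustration; the point is only that awk tries every pattern {action} statement, in order, on each input line.

```shell
# Two pattern {action} statements; both are tried on every input line.
out=$(printf 'apple\nbanana\ncherry\n' | awk '
/a/ { print "has a: " $0 }
/e/ { print "has e: " $0 }
')
echo "$out"
```

Both lines for "apple" appear before anything is printed for "banana", which suggests the line loop is the outer one.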

3
awk Programs
  • Generally a sequence of pattern {action}
    statements
  • If action is missing, matched lines are printed
    (meaning written to STDOUT)
  • If pattern is missing, action is carried out for
    all lines
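
A minimal sketch of the two defaults, using an invented three-line input: a program with only a pattern, and a program with only an action.

```shell
# Pattern only: matching lines are printed unchanged.
matched=$(printf 'red 1\nblue 2\nred 3\n' | awk '/red/')
# Action only: the action runs for every line.
seconds=$(printf 'red 1\nblue 2\nred 3\n' | awk '{ print $2 }')
echo "$matched"
echo "$seconds"
```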

4
Running awk Programs
  • Short one, composed at keyboard with little
    thought
  • awk 'program' file1 file2
  • Note that awk can take a sequence of files as
    input.
  • Long one, composed in editor
  • awk -f progfile file1 file2
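
Both ways of running awk can be tried as follows. This is only a sketch: file1, file2, and progfile are scratch files created in a temporary directory for the illustration, and their contents are invented.

```shell
# Hypothetical scratch files, created only for this illustration.
dir=$(mktemp -d)
printf 'a b\nc d\n' > "$dir/file1"
printf 'e f\n' > "$dir/file2"
printf '{ print $2 }\n' > "$dir/progfile"

# Short program typed on the command line; both files are read in sequence.
inline=$(awk '{ print $1 }' "$dir/file1" "$dir/file2")
# Longer program kept in a file and loaded with -f.
fromfile=$(awk -f "$dir/progfile" "$dir/file1" "$dir/file2")
echo "$inline"
echo "$fromfile"
rm -r "$dir"
```

The single quotes around the inline program keep the shell from interpreting $1 itself.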

5
awk's View of Files
  • Inputs to awk are text files
  • Divided into lines
  • Each line divided into fields by blanks or tabs
    (the default separator)
  • Each field referenced by relative number, $1, $2,
    $3, ...
  • $0 refers to the entire line

6
Examples
  • awk '{print $1}' names
  • Print the first field in each line of the names
    file
  • awk '/M/' names
  • Print each line of the names file that contains
    an upper case M

7
Some Built-In Variables
  • NR, line number of current line of input (runs
    sequentially over all input files)
  • NF, number of fields in current line
  • FS, the field separator
  • FS = "\t" sets the separator to tab only
  • FS = ":" sets the separator to colon
  • FNR, number of the current line (record) in the
    current input file (resets when a new input file
    is opened)
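
The difference between NR and FNR shows up only with more than one input file. The sketch below assumes two invented colon-separated files, a and b, and sets FS in a BEGIN action so that it takes effect before the first line is read.

```shell
# Hypothetical colon-separated input files, created only for this illustration.
dir=$(mktemp -d)
printf 'root:x:0\nguest:x:500\n' > "$dir/a"
printf 'ann:x:501\n' > "$dir/b"

# NR keeps counting across files; FNR restarts at 1 for each file.
out=$(awk 'BEGIN { FS = ":" } { print NR, FNR, NF, $1 }' "$dir/a" "$dir/b")
echo "$out"
rm -r "$dir"
```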

8
Examples
  • { print NR, NF }
  • { print NR, $0 }
  • { print $NF }
  • NR == 10
  • NF != 3
  • NF > 4

9
Patterns
  • Special patterns
  • BEGIN: Action is done once before any lines of
    the input file(s) are read
  • END: Action is done once after the last file has
    been processed
  • Relational expressions between strings or numbers
  • Arguments treated as numbers, if possible
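
A short sketch combining the special patterns with a relational expression. The input values are invented; note that 5 is compared with 10 as a number, so it does not count as greater even though the string "5" sorts after "10".

```shell
out=$(printf '5\n12\n30\n' | awk '
BEGIN   { print "start" }
$1 > 10 { n++ }
END     { print n, "values above 10" }
')
echo "$out"
```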

10
Comparison Operators
  • < less than
  • > greater than
  • <= less than or equal to
  • >= greater than or equal to
  • == equal to
  • != not equal to
  • ~ matches
  • !~ does not match

11
Regular Expressions
  • Enclosed in / /
  • Matches anywhere in the entire line
  • Field match specified as $3 ~ /Ab/, for example
  • Special symbols
  • \ ^ $ . [ ] | ( ) * + ?

12
Examples
  • /Asia/
  • /./
  • /a\$/
  • /\t/
  • $2 !~ /[0-9]/
  • /(apple|cherry) (pie|tart)/ (note space)
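
A few of these patterns can be tried on a small invented input. This is a sketch only; the four sample lines are assumptions chosen so that each pattern selects something different.

```shell
input='Asia 3000
Europe 12x
apple pie 7
cherry tart 9'

# Whole-line match.
a=$(printf '%s\n' "$input" | awk '/Asia/')
# Field match: second field contains no digit.
b=$(printf '%s\n' "$input" | awk '$2 !~ /[0-9]/ { print $1 }')
# Alternation and grouping, with a literal space between the groups.
c=$(printf '%s\n' "$input" | awk '/(apple|cherry) (pie|tart)/ { print $3 }')
echo "$a"; echo "$b"; echo "$c"
```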

13
C Escape Sequences
  • \b backspace
  • \f formfeed
  • \n newline
  • \r carriage return
  • \t tab
  • \ddd character whose ASCII value in octal is
    ddd
  • \" quotation mark
  • \c any other character c literally

14
Actions
  • Mini C-like programs
  • Can extend over several lines
  • Statements terminated by semicolons or newlines.
    Statements grouped with braces { }.
  • Variables are either floating point numbers or
    strings.
  • Variables are automatically declared and
    initialized
  • Strings initialized to "", the empty string
  • Numbers initialized to 0
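
The automatic initialization can be checked with a sketch like the following. The variable names count and joined are made up; neither is set before its first use.

```shell
out=$(printf 'a\nb\nc\n' | awk '
{
    count = count + 1      # count starts out as 0
    joined = joined $0     # joined starts out as the empty string
}
END { print count; print "[" joined "]" }
')
echo "$out"
```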

15
Assignment Statements
  • Simple version v = e
  • Variable or field name assigned value of
    expression
  • Assignment operators v op= e means
    v = v op e
  • Legal values of op are +, -, *, /, %, ^
  • Used because interpreted code runs faster
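
A brief sketch of the compound assignment operators, run from a BEGIN action so that no input is needed. The starting value 10 is arbitrary.

```shell
out=$(awk 'BEGIN {
    v = 10
    v += 5;  print v    # same as v = v + 5
    v -= 3;  print v    # same as v = v - 3
    v *= 2;  print v    # same as v = v * 2
    v /= 8;  print v    # same as v = v / 8
    v %= 2;  print v    # same as v = v % 2
}')
echo "$out"
```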

16
Increment Operators
  • Borrowed from C
  • Prefix or postfix
  • ++ or --
  • Example x = 3. What is the value of k?
  • k = x++
  • k = ++x
  • k = x--
  • k = --x
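
One way to answer the question is to let awk print k and x after each form. In this sketch x is reset to 3 before every line, so the four cases are independent.

```shell
out=$(awk 'BEGIN {
    x = 3; k = x++; print k, x    # postfix: k gets the old value
    x = 3; k = ++x; print k, x    # prefix: k gets the new value
    x = 3; k = x--; print k, x
    x = 3; k = --x; print k, x
}')
echo "$out"
```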

17
Arithmetic Functions
  • sin(x) assumes x is in radians
  • cos(x) assumes x is in radians
  • atan2(y,x) range from -pi to pi
  • exp(x) exponential
  • log(x) natural logarithm of x, so x > 0
  • sqrt(x) square root of x, so x >= 0
  • int(x) truncates fractional part
  • rand() returns a random number r with 0 <= r < 1
  • srand(x) sets the seed for rand to x
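
A small sketch exercising several of these functions. The seed 42 is arbitrary, and because the random sequence differs between awk implementations, only the range of rand() is checked rather than a particular value.

```shell
out=$(awk 'BEGIN {
    print int(3.9), int(-3.9)        # truncation toward zero
    print sqrt(16), exp(0), log(1)
    printf "%.4f\n", atan2(0, -1)    # pi, to four decimal places
    srand(42); r = rand()
    print (r >= 0 && r < 1)          # 1 means r is in the expected range
}')
echo "$out"
```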

18
Strings
  • Literal values enclosed in double quotes
  • "abc" "Wildcats rule" "20 bananas"
  • Concatenation represented by juxtaposition
  • s = "Villanova"
  • t = "Wildcats"
  • print s t

19
String Functions
  • Standard string operations (cf. head, tail,
    firstfew, lastfew, allbut)
  • length(s) length of s
  • length alone means length($0)
  • index(s,t) if t is a substring of s return
    position of first character, return 0 otherwise
  • substr(s,p) returns substring from position p to
    the end of s (positions count from 1), returns
    empty string if p > length(s)
  • substr(s,p,n) returns substring of length n
    starting at position p
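
The functions above can be tried on a single invented input line. This is a sketch; positions are counted from 1.

```shell
out=$(printf 'Villanova Wildcats\n' | awk '{
    print length($1), length     # length alone means length($0)
    print index($0, "Wild")      # position where "Wild" starts
    print substr($0, 11)         # from position 11 to the end
    print substr($0, 1, 4)       # four characters starting at position 1
}')
echo "$out"
```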

20
String Functions (2)
  • Editing functions
  • sub(r,s) replace r by s in current record (first
    occurrence only)
  • sub(r,s,t) replace r by s in t (first occurrence
    only)
  • gsub(r,s) replace r by s in current record
    (globally)
  • gsub(r,s,t) replace r by s in t (globally)
  • In all cases, return the number of substitutions
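
A minimal sketch contrasting sub and gsub on an invented line. The copy t is made first, so the global replacement works on the original text while sub changes the current record.

```shell
out=$(printf 'banana bandana\n' | awk '{
    t = $0
    n1 = sub(/an/, "AN")        # first occurrence in the current record
    n2 = gsub(/an/, "-", t)     # every occurrence in t
    print n1, $0
    print n2, t
}')
echo "$out"
```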

21
Control Structures
  • if (<expression>) <s1> else <s2>
  • <expression> can be any expression; true is
    defined to be non-zero or non-null
  • <s1> and <s2> can be any group of statements
  • Note the critical parentheses that separate the
    conditional expression from <s1>

22
Control Structures (2)
  • while (<expression>) <s1>
  • Same rules as for if-then-else

23
Control Structures (3)
  • for (<e1>; <e2>; <e3>) <s1> is equivalent to
  • <e1>; while (<e2>) { <s1>; <e3> }
  • for (k in <array>) <s1> loops over the subscripts
    of an array, but the order of the subscripts is
    implementation-dependent. Careful: awk allows general
    subscripting. Strings can be used as subscripts.
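
The sketch below illustrates both loop forms. The first command builds the same string with a for loop and with its while rewrite; the second sums invented scores in an array subscripted by name. Because the for-in order is not guaranteed, the output is piped through sort to make it repeatable.

```shell
# The for loop and its while rewrite build the same string.
loops=$(awk 'BEGIN {
    for (i = 1; i <= 3; i++) a = a i
    i = 1
    while (i <= 3) { b = b i; i++ }
    if (a == b) print "same", a; else print "different"
}')

# Strings as array subscripts; sort fixes the otherwise unspecified order.
totals=$(printf 'ann 3\nbob 5\nann 4\n' | awk '
{ total[$1] += $2 }
END { for (name in total) print name, total[name] }
' | LC_ALL=C sort)
echo "$loops"
echo "$totals"
```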

24
Control Structures (4)
  • Go to structures
  • break: when executed within a for or while
    statement, causes an immediate exit
  • continue: when executed within a for or while
    statement, causes immediate execution of the next
    iteration
  • next: causes the next line (record) of the
    input file to be read and the sequence of pattern
    {action} statements executed on it
  • exit: causes the program to jump to the END
    pattern, execute it, and stop
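
All four statements appear in the following sketch. The input words skip, keep, and stop are invented markers chosen to trigger next and exit; the loop bounds are arbitrary.

```shell
out=$(printf 'skip 1\nkeep 2\nstop 3\nkeep 4\n' | awk '
$1 == "skip" { next }          # go straight to the next input line
$1 == "stop" { exit }          # jump to END, then stop
{ print "saw", $2 }
END {
    for (i = 1; i <= 10; i++) {
        if (i == 2) continue   # skip the rest of this iteration
        if (i == 4) break      # leave the loop entirely
        print "i =", i
    }
}')
echo "$out"
```

The fourth input line is never processed, because exit ends the reading of input at the third.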