I256: Applied Natural Language Processing

1
I256: Applied Natural Language Processing
Marti Hearst, Aug 30, 2006
2
Today
  • Introductions
  • Python Basics

3
Introduction to NLTK
  • The Natural Language Toolkit (NLTK) provides:
    • Basic classes for representing data relevant to
      natural language processing.
    • Standard interfaces for performing tasks, such as
      tokenization, tagging, and parsing.
    • Standard implementations of each task, which can
      be combined to solve complex problems.
    • Pre-parsed corpora and tools to access them.

4
NLTK Example Modules
  • nltk_lite.tokenize: processing individual
    elements of text, such as words or sentences.
  • nltk_lite.probability: modeling frequency
    distributions and probabilistic systems.
  • nltk_lite.tag: tagging tokens with supplemental
    information, such as parts of speech or WordNet
    sense tags.
  • nltk_lite.parser: high-level interface for
    parsing texts.
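
To make the ideas of tokenization and frequency distributions concrete, here is a minimal sketch in plain Python 3. It does not use NLTK or the nltk_lite API; the function names tokenize and frequency_distribution are hypothetical, chosen only for illustration, and the regular expression is a deliberately crude word splitter.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase word tokens (a crude stand-in for a real tokenizer)."""
    return re.findall(r"\w+", text.lower())

def frequency_distribution(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)

tokens = tokenize("The cat sat on the mat. The mat was flat.")
freq = frequency_distribution(tokens)

# Show the two most frequent tokens with their counts.
print(freq.most_common(2))
```

A toolkit module would typically handle punctuation, contractions, and sentence boundaries far more carefully than this one-line pattern.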

5
Python and Natural Language Processing
  • Python is a great language for NLP:
  • Simple (and fun!)
  • Powerful string manipulation
  • Easy to debug
  • Interpreted language
  • Easy to test small steps incrementally
  • Exceptions
  • Easy to structure
  • Modules
  • Object-oriented programming
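
As a small illustration of two of these points, string manipulation and exceptions, here is a hedged sketch in Python 3. The sample line and the helper to_int are made up for this example and do not come from the slides.

```python
# A made-up line of comma-separated text with untidy spacing.
line = "  tokenize , 42 , n/a "

# String methods: split on commas, then strip surrounding whitespace.
fields = [field.strip() for field in line.split(",")]

def to_int(field):
    """Return the field as an int, or None if it is not a valid integer."""
    try:
        return int(field)
    except ValueError:
        # The exception says exactly what went wrong, which helps debugging.
        return None

values = [to_int(field) for field in fields]
print(fields)
print(values)
```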

6
An Interpreted Language
  • The interpreter processes what you've typed as
    soon as you hit <return>:
  • >>> 3 * 4
  • 12
  • >>>
  • Python is sensitive to leading whitespace.
  • If you put in extra spaces, or too few, it will
    complain.
  • If you type a multi-line command, you must do the
    indenting; the interpreter helps you with this:
  • >>> if 4 > 3:
  • ...     print "duh"
  • ...
  • duh
  • >>>
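
The whitespace rule can be checked without an interactive session. The sketch below assumes Python 3 (where print is a function, unlike the print statement used on the slide) and uses the built-in compile to see which snippets the interpreter accepts. The helper name check_indentation is hypothetical.

```python
def check_indentation(source):
    """Return 'ok' if the source compiles, or the name of the error raised."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except IndentationError as err:
        return type(err).__name__

# Correctly indented body under the 'if'.
good = "if 4 > 3:\n    print('duh')\n"

# Too few spaces: the body is not indented at all.
too_few = "if 4 > 3:\nprint('duh')\n"

# Extra spaces: a line is indented for no reason.
extra = "x = 1\n    y = 2\n"

for snippet in (good, too_few, extra):
    print(check_indentation(snippet))
```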

7
Some Python Basics
  • Strings

8
Some Python Basics
  • Lists

9
Some Python Basics
  • Iteration over Lists

10
Modules and Packages
  • Python modules package program code and data for
    reuse. (Lutz)
  • Similar to a library in C or a package in Java.
  • Python packages are hierarchical modules (i.e.,
    modules that contain other modules).
  • Three commands for accessing modules:
    • import
    • from...import
    • reload
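
The idea of a package as a module that contains other modules can be seen in the standard library. This Python 3 sketch uses xml.etree.ElementTree purely as a convenient example of a nested name; the choice of package and the sample XML string are assumptions for illustration, not part of the lecture.

```python
import xml.etree.ElementTree

# 'xml' is a package, 'etree' is a package inside it,
# and 'ElementTree' is an ordinary module inside that.
root = xml.etree.ElementTree.fromstring("<s><w>hello</w><w>world</w></s>")
words = [w.text for w in root.findall("w")]

# Packages carry a __path__ attribute; plain modules do not.
xml_is_package = hasattr(xml, "__path__")
element_tree_is_package = hasattr(xml.etree.ElementTree, "__path__")

print(words, xml_is_package, element_tree_is_package)
```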

11
Modules and Packages: import
  • The import command loads a module.
  • Load the regular expression module:
  • >>> import re
  • To access the contents of a module, use dotted
    names.
  • Use the search function from the re module:
  • >>> re.search("\w", str)
  • To list the contents of a module, use dir:
  • >>> dir(re)
  • ['DOTALL', 'I', 'IGNORECASE', ...]
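
Put together as a script, the three steps look roughly like this. It is a sketch in Python 3; the sample string and the variable names text, first_word, and public_names are assumptions for illustration (the example avoids naming a variable str, since that would hide the built-in string type).

```python
import re

text = "Applied NLP, Fall 2006"

# Dotted name: the search function lives inside the re module.
match = re.search(r"\w+", text)
first_word = match.group()

# dir lists the names a module defines; filter out the private ones.
public_names = [name for name in dir(re) if not name.startswith("_")]

print(first_word)
print(public_names[:5])
```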

12
Modules and Packages: from...import
  • The from...import command loads individual
    functions and objects from a module.
  • Load the search function from the re module:
  • >>> from re import search
  • Once an individual function or object is loaded
    with from...import, it can be used directly.
  • Use the search function from the re module:
  • >>> search("\w", str)
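
The same lookup written with from...import, as a short Python 3 sketch. The sample string and the pattern for digits are assumptions chosen for illustration.

```python
from re import search

text = "Applied NLP, Fall 2006"

# No dotted name needed: 'search' is now a name in this namespace.
match = search(r"\d+", text)
year = match.group()

print(year)
```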

13
import vs. from...import
  • import
    • Keeps module functions separate from user
      functions.
    • Requires the use of dotted names.
    • Works with reload.
  • from...import
    • Puts module functions and user functions
      together.
    • More convenient names.
    • Does not work with reload.
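
One consequence of putting module functions and user functions together is that names can collide. The Python 3 sketch below defines a hypothetical user function that happens to be called search, then shows what each import style does to it.

```python
import re

def search(pattern, text):
    """A user-defined function that happens to share a name with re.search."""
    return pattern in text

# With plain import, the two functions stay separate.
user_result = search("NLP", "Applied NLP")        # the user function, returns a bool
module_result = re.search("NLP", "Applied NLP")   # the module function, returns a match

# from...import rebinds the name, silently replacing the user function.
from re import search
shadowed_result = search("NLP", "Applied NLP")    # now a match object, not a bool

print(user_result, module_result.group(), shadowed_result.group())
```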

14
Modules and Packages: reload
  • If you edit a module, you must use the reload
    command before the changes become visible in
    Python:
  • >>> import mymodule
  • ...
  • >>> reload(mymodule)
  • The reload command only affects modules that have
    been loaded with import; it does not update
    individual functions and objects loaded with
    from...import.
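
The difference can be observed directly. The slides use the built-in reload of the Python version current in 2006; this sketch assumes Python 3, where the equivalent is importlib.reload. The module name mymodule_demo, the function greet, and the temporary directory are all hypothetical scaffolding so that the example is self-contained.

```python
import importlib
import os
import sys
import tempfile

# Skip bytecode caching so the quick edit below is always re-read from source.
sys.dont_write_bytecode = True

def write_module(directory, body):
    """Write (or overwrite) the demo module's source file."""
    with open(os.path.join(directory, "mymodule_demo.py"), "w") as f:
        f.write(body)

def reload_demo():
    """Return (before edit, dotted name after reload, from-import name after reload)."""
    with tempfile.TemporaryDirectory() as tmp:
        write_module(tmp, "def greet():\n    return 'old'\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            import mymodule_demo
            from mymodule_demo import greet
            before = mymodule_demo.greet()

            # "Edit" the module, then reload it.
            write_module(tmp, "def greet():\n    return 'new version'\n")
            importlib.reload(mymodule_demo)
            return before, mymodule_demo.greet(), greet()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("mymodule_demo", None)

print(reload_demo())
```

The dotted name picks up the edited function, while the name bound by from...import still refers to the function object loaded before the edit.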

15
Configuring the Python IDE
  • Called IDLE
  • You can set key bindings
  • Go to Options > Configure IDLE
  • Select Keys tab
  • Select an action and specify an alternative
    binding
  • Click Save as New Custom Key Set
  • Give it a name
  • Click Apply so it takes hold
  • If you want to use an existing binding (say,
    Control-A):
  • First find the command that has that binding
  • Change it to something else
  • Click Apply
  • Now choose your command and change its binding
    to Control-A

16
For Next Week
  • Monday holiday, no class
  • Sign up for the email list!
  • Mail to majordomo_at_sims.berkeley.edu
  • Put in msg body: subscribe anlp
  • For Wed Sept 6:
  • Finish the programming tutorial.
  • Do the regular expression tutorial.
  • We'll go through regexes some in class.