Fall 2005

1

EECS 595 / LING 541 / SI 661 & 761
Natural Language Processing
  • Fall 2005
  • Lecture Notes #1

2
Introduction
3
Course logistics
  • Instructor: Prof. Dragomir Radev
    (radev_at_umich.edu). Ph.D., Computer Science,
    Columbia University. Formerly at IBM TJ Watson
    Research Center.
  • Times: Thursdays 2:40-5:25 PM, in 411 West Hall
  • Office hours: TBA, 3080 West Hall Connector

Course home page:
http://www.si.umich.edu/radev/NLP-fall2005
4
Example (from a famous movie)
Dave Bowman: Open the pod bay doors, HAL.
HAL: I'm sorry, Dave. I'm afraid I can't do that.
5
Example
I saw her fall
  • How many different interpretations does the above
    sentence have? How many of them are
    reasonable/grammatical?

8
Example 1
The Standard and Poor's 500 and the Nasdaq
composite index both reached four-year highs
Thursday as investors, unfazed by oil prices
nearing $70 per barrel, welcomed a raft of strong
earnings reports.
14
Example 2
Accenture posts higher earnings
Consulting and technology services firm beats
estimates; stock gains in after-hours trading.
July 7, 2005: 4:35 PM EDT
NEW YORK (Reuters) - Accenture Ltd., one
of the world's largest consulting and technology
services firms, posted a higher quarterly profit
Thursday boosted by a rebound in consulting
demand. Fiscal third-quarter net income more
than doubled to about $484 million, or 51 cents a
share, from $210 million, or 37 cents a share, a
year earlier, the company said. Analysts had
expected earnings of 43 cents a share, according
to First Call. Accenture stock rose about 2
percent in after-hours trading after falling
nearly 6 percent in regular New York Stock
Exchange trading.
15
  • Gary Larson (The Far Side) cartoon
  • What we say to dogs
  • "Okay, Ginger! I've had it! You stay out of the
    garbage! Understand, Ginger?"
  • What they hear
  • "Blah Ginger! blah blah blah blah blah blah blah
    blah blah blah blah Ginger?"

16
Time Warner to hold off on Cablevision
But top Time Warner execs said it may eventually
be interested in the cable assets.
July 8, 2005: 7:20 PM EDT
SUN VALLEY, Idaho (Reuters) - A top
Time Warner Inc. executive said Friday it could
not bid for Cablevision until it completes a deal
to buy Adelphia Communications Corp., splashing
cold water on early buyout speculation. Time
Warner is in a joint deal with Comcast Corp. to
buy bankrupt cable provider Adelphia
Communications Corp. "We can't do anything else
until we get it (Adelphia) integrated," said Don
Logan, chairman of Time Warner's media and
communications group. But he added, "We've
always said we are interested in Cablevision. ...
Anything is possible." In June, the Dolan family
offered Cablevision shareholders about $33.50 per
share in a $7.9 billion deal to take the company
private. Analysts and one of Cablevision's top
investors have said the offer is too low and
could put the cable system, which serves 3
million customers in the New York area, into play
for other suitors, including Time Warner Cable
and Comcast. Wall Street analysts said in June
that Time Warner, if it were to bid, could top
the offer with a $35 to $40 per share bid. Time
Warner is the parent company of this Web site.
Time Warner chief executive Dick Parsons said on
Friday his company's decision about whether to
buy Cablevision Corp. rests on whether the Dolan
family decides to put it up for sale. "Chuck
(Dolan) controls it and it's not as if we could
take it away from him," Parsons said during a
break at the Allen & Co. conference in Sun
Valley, Idaho. "When he's ready to bring that
asset to market he knows we're here." Parsons
would not comment on whether he has had recent
conversations with Dolan about buying
Cablevision. Parsons said he and Dolan agree
that cable assets are undervalued and that now is
a good time to buy them. Time Warner is the
parent company of CNN/Money.
17
Stocks edge up
Major gauges make tentative gains at Friday's
open after steep Fed-inspired selloff.
July 1, 2005: 9:46 AM EDT
NEW YORK
(CNN/Money) - Stocks inched higher early Friday,
recovering some from the big selloff after the
Federal Reserve boosted interest rates again, and
signaled it didn't intend to pause anytime soon.
The Dow Jones industrial average (down 99.51 to
10,274.97, Charts), the broader Standard & Poor's
500 (up 2.50 to 1,193.83, Charts) index and the
Nasdaq composite (up 4.84 to 2,061.80, Charts)
all added a few points in the early going, with
the Nasdaq lagging the blue chip indicators a
bit. Stocks ended a mixed quarter on a down note
Thursday, with the Dow losing more than 100
points after the Fed raised the target for its
fed funds rate, an overnight bank lending rate,
another quarter point to 3.25 percent. In the
closely watched statement, the central bankers
acknowledged the impact of higher energy prices
and other negatives, but said the economic
expansion remains on track. They also pledged to
keep raising rates at a "measured" pace, all of
which suggested that they don't plan to pause in
the near term. Gains early Friday were broad
based, with 27 out of 30 Dow issues rising. In
corporate news, Microsoft (up 0.02 to 24.86,
Research) has settled antitrust claims made by
IBM (unchanged at 74.20, Research), the
companies said Friday. The software leader will
pay IBM $775 million as part of the deal. A
number of economic reports were due around 10
a.m. ET. The Institute for Supply Management's
manufacturing index for June was expected to have
risen to 51.5 in the month from 51.4 in May,
according to a consensus of economists surveyed
by Briefing.com. The revised read on June
consumer sentiment from the University of
Michigan was also due, as was the May read on
construction spending. Treasury prices slipped
after Thursday's big rally. The fall raised the
yield on the 10-year note to 3.94 percent from
3.92 percent late Thursday. Treasury prices and
yields move in opposite directions. In currency
trading, the dollar jumped versus the euro and
the yen. U.S. light crude oil for August
delivery rose 32 cents to trade at $56.82 a
barrel in electronic trading. Crude set a record
closing price for a nearby futures contract at
$60.54 on Monday. COMEX gold fell $1.20 to
$435.90 an ounce. In global trade, Asian-Pacific
markets ended mostly lower, and European markets
rose at midday.    
18
Google cracks $300
Shares of the popular search engine pass $300 for
the first time and are now up 260% since IPO.
June 27, 2005: 5:52 PM EDT
By Paul R. La Monica, CNN/Money senior writer
NEW YORK (CNN/Money) - Shares of Google, the popular
search-engine company, surpassed the $300 level
for the first time on Monday, sparking memories
of the dot-com stock craze of the late 1990s.
Google gained 2.3 percent to finish at $304.10,
slightly below its high for the day of $304.30.
The stock has now gained nearly 260 percent since
it went public last August at $85 a share. Much
of the optimism surrounding Google comes from the
fact that it is the leader in the white-hot
online advertising industry. The company reported
much better than expected sales and earnings for
the first quarter, thanks to a booming market for
online advertising, particularly ads tied to
specific keyword searches. And during the past
few weeks, Google has released several new
features -- including a desktop search function
for businesses and a test version of a
personalized home page tool -- that should help
the company remain competitive against rivals
Yahoo! and Microsoft. Several analysts have also
speculated that Google will soon launch an online
payment service that could compete against eBay's
PayPal. In addition, many investors have been
betting that the company, which now has a market
value of nearly $85 billion, will soon be added
to the benchmark S&P 500 index. But the stock's
meteoric rise as of late -- shares have surged
more than 50 percent since the company reported
first-quarter results in mid-April -- has some
analysts thinking that the stock could take a hit
in the near future. "You might see the stock
pause temporarily," said Marianne Wolk, an
analyst with Susquehanna Financial Group. "For
the longer term, we're still very bullish but in
the very short term it wouldn't be a surprise to
see the stock stabilize or pull back." The key
for Google will be how strong its second quarter
results are. Google is set to report these
numbers on July 21. Analysts expect Google's
sales, excluding revenues it shares with
affiliates, a figure known as traffic acquisition
costs or TAC, to come in at $840 million, nearly
double last year's levels. Earnings, excluding
certain one-time charges, are forecast at $1.21,
an increase of 121 percent from a year ago. Wolk
thinks that Google should meet these targets but
does not believe the company will report results
that are significantly better than consensus
projections. And if Google does not continue to
beat estimates, the stock could take a bath.
"For Google to keep heading higher, it's
absolutely critical that they keep hitting
numbers. Everyone now believes the story," said
John Tinker, an analyst with ThinkEquity
Partners. Still, many investors are finding it
hard to bet against Google because it has been
posting extremely strong levels of sales growth
and healthy profit margins as a public company.
So the comparisons to the late 1990s, when shares
of many unprofitable Internet companies soared
solely due to hype, may not be apt. To that end,
Google is expected to generate nearly $3.6
billion in sales, excluding TAC and revenue of $5
billion next year as the company continues to
benefit from a shift of advertising dollars from
more mainstream media sources such as television,
radio, and newspapers, to the Web. In addition
to its ubiquitous search engine, Google has
branched out into related areas in order to
capitalize on the boom in online advertising. The
company has a comparison shopping site, Froogle,
a free e-mail service called Gmail which features
ads embedded in e-mails, and a local search site
that operates as kind of a Web version of the
Yellow Pages. Google also has expanded rapidly
abroad, with sales from outside the U.S.
accounting for nearly 40 percent of total sales
in the first quarter. What's more, some argue
that Google is not overvalued, since it continues
to trade at a discount to its top rival, Yahoo.
However, this gap has narrowed significantly as
of late. Google's price-to-earnings ratio, based
on 2005 earnings estimates, is 58. Yahoo trades
at 61.5 times earnings estimates for this year.
"Google is not an undiscovered stock any more,"
said Tinker. "It's no longer inefficiently
priced." And Google also potentially faces the
issue of the summer sluggishness that typically
affects Internet stocks. Last year, shares of
several Internet companies plunged in July as
results did not live up to lofty expectations.
19
Silly sentences
  • Children make delicious snacks
  • Stolen painting found by tree
  • I saw the Grand Canyon flying to New York
  • Court to try shooting defendant
  • Ban on nude dancing on Governor's desk
  • Red tape holds up new bridges
  • Iraqi head seeks arms
  • Blair wins on budget, more lies ahead
  • Local high school dropouts cut in half
  • Hospitals are sued by seven foot doctors
  • In America a woman has a baby every 15 minutes.
    How does she do that?

20
Main problems in language
  • Novel words and usages
  • Blogs, little r me,7342.67
  • Spam as verb, email
  • Inconsistencies
  • Beverly Hills, Beverly Sills
  • junior college, college junior
  • pet spray, pet llama
  • Parsing problems
  • Cup holder
  • Federal Reserve Board Chairman
  • Implicature/reasoning
  • World knowledge
  • Subjectivity, scoping, negation

21
Types of ambiguity
  • Morphological: Joe is quite impossible. Joe is
    quite important.
  • Phonetic: Joe's finger got number.
  • Part of speech: Joe won the first round.
  • Syntactic: Call Joe a taxi.
  • PP attachment: Joe ate pizza with a fork. Joe ate
    pizza with meatballs. Joe ate pizza with Mike.
    Joe ate pizza with pleasure.
  • Sense: Joe took the bar exam.
  • Modality: Joe may win the lottery.
  • Subjectivity: Joe believes that stocks will rise.
  • Scoping: Joe likes ripe apples and pears.
  • Negation: Joe likes his pizza with no cheese and
    tomatoes.
  • Referential: Joe yelled at Mike. He had broken
    the bike. Joe yelled at Mike.
    He was angry at him.
  • Reflexive: John bought him a present. John bought
    himself a present.
  • Ellipsis and parallelism: Joe gave Mike a beer
    and Jeremy a glass of wine.
  • Metonymy: Boston called and left a message for
    Joe.
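
The PP attachment case can be made concrete by counting parse trees. What follows is a minimal sketch, assuming a toy grammar in Chomsky normal form and a six-word lexicon; both are invented here for illustration and are not from the lecture. It uses a CKY-style chart that stores, for each span, how many trees each category has.

```python
from collections import defaultdict

# Hypothetical toy grammar (binary rules only); an assumption for illustration.
BINARY = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],   # PP modifies the verb phrase (instrument reading)
    ("NP", "PP"): ["NP"],   # PP modifies the noun phrase (topping reading)
    ("Det", "N"): ["NP"],
    ("P", "NP"): ["PP"],
}

# Hypothetical lexicon; every word must be listed here.
LEXICON = {
    "joe": ["NP"], "ate": ["V"], "pizza": ["NP"],
    "with": ["P"], "a": ["Det"], "fork": ["N"],
}

def count_parses(sentence, start="S"):
    """Count distinct parse trees rooted in `start` for the sentence."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    # chart[i][j][X] = number of trees with root X covering words[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag in LEXICON[word]:
            chart[i][i + 1][tag] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point
                for left, left_count in chart[i][k].items():
                    for right, right_count in chart[k][j].items():
                        for parent in BINARY.get((left, right), []):
                            chart[i][j][parent] += left_count * right_count
    return chart[0][n][start]

if __name__ == "__main__":
    print(count_parses("Joe ate pizza with a fork."))
```

Under this grammar the sentence gets two trees, one per attachment site. The grammar alone cannot prefer one, which is why the fork/meatballs/Mike/pleasure variants need lexical or world knowledge to resolve.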

22
Synonyms/paraphrases
The S&P 500 climbed 6.93, or 0.56 percent, to
1,243.72, its best close since June
12, 2001. The Nasdaq gained 12.22, or 0.56
percent, to 2,198.44 for its best showing since
June 8, 2001. The DJIA rose 68.46, or
0.64 percent, to 10,705.55, its highest level
since March 15.
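
The three sentences above report the same kind of event with different verbs (climbed, gained, rose). As a rough sketch, and not a method taken from the lecture, a single regular expression with a small hand-picked verb list can pull the same fields out of all three. The pattern, the verb list, and the helper names are assumptions made for this example; the text is re-typed with the index written as "S&P 500".

```python
import re

TEXT = (
    "The S&P 500 climbed 6.93, or 0.56 percent, to 1,243.72, its best close "
    "since June 12, 2001. The Nasdaq gained 12.22, or 0.56 percent, to "
    "2,198.44 for its best showing since June 8, 2001. The DJIA rose 68.46, "
    "or 0.64 percent, to 10,705.55, its highest level since March 15."
)

# Hand-picked paraphrases of "went up"; an assumption, not an exhaustive list.
RISE_VERBS = ("climbed", "gained", "rose")

PATTERN = re.compile(
    r"The (?P<index>[A-Z][\w&]*(?: \d+)?) "
    r"(?P<verb>" + "|".join(RISE_VERBS) + r") "
    r"(?P<change>[\d.]+), or (?P<pct>[\d.]+) percent, "
    r"to (?P<close>[\d,]+\.\d+)"
)

def extract_moves(text):
    """Return one record per index movement found in the text."""
    moves = []
    for m in PATTERN.finditer(text):
        moves.append({
            "index": m["index"],
            "verb": m["verb"],
            "change": float(m["change"]),
            "pct": float(m["pct"]),
            "close": float(m["close"].replace(",", "")),
        })
    return moves

def implied_pct(move):
    """Percent change implied by the point change and the closing value."""
    previous_close = move["close"] - move["change"]
    return 100 * move["change"] / previous_close

if __name__ == "__main__":
    for move in extract_moves(TEXT):
        print(move["index"], move["verb"], round(implied_pct(move), 2))
```

The `implied_pct` helper is a consistency check: the percentage recomputed from the point change and the close should agree with the reported one after rounding. A verb outside the list (say, "advanced") would be missed, which is the paraphrase problem in miniature.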
23
What is Natural Language Processing?
  • Natural Language Processing (NLP) is the study of
    the computational treatment of natural language.
  • NLP draws on research in Linguistics, Theoretical
    Computer Science, Mathematics and Statistics,
    Artificial Intelligence, Psychology, etc.

24
NLP
  • Information extraction
  • Named entity recognition
  • Trend analysis
  • Subjectivity analysis
  • Text classification
  • Anaphora resolution, alias resolution
  • Cross-document crossreference
  • Parsing
  • Semantic analysis
  • Word sense disambiguation
  • Word clustering
  • Question answering
  • Summarization
  • Document retrieval (filtering, routing)
  • Structured text (relational tables)
  • Paraphrasing and paraphrasing/entailment ID
  • Text generation
  • Machine translation

25
What is needed (1): linguistic knowledge
  • Examples
  • Zipf's law: rank(w_i) × freq(w_i) ≈ const
  • Collocations
  • Strong beer but *powerful beer
  • Big sister but *large sister
  • Stocks rise but ?stocks ascend (225,000 hits on
    Google vs. 47 hits)
  • Constituents
  • Children eat pizza.
  • They eat pizza.
  • My cousin's neighbor's children eat pizza.
  • _ Eat pizza!
  • Burstiness
  • P(c_t = 2 | c_t >= 1)
  • How to get it
  • Manual rules
  • Automatically acquired from large text
    collections (corpora)
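
Zipf's law from the list above can be checked on any tokenized text by ranking words by frequency and multiplying rank by frequency. The snippet below is only a sketch: the function name and the sample sentence are made up for illustration, and a text this small cannot show the law convincingly. On a large corpus the product tends to stay within the same order of magnitude across many ranks.

```python
from collections import Counter

def zipf_table(tokens):
    """Return (rank, word, freq, rank * freq) rows, most frequent word first."""
    counts = Counter(tokens)
    return [
        (rank, word, freq, rank * freq)
        for rank, (word, freq) in enumerate(counts.most_common(), start=1)
    ]

if __name__ == "__main__":
    # Hypothetical sample text; replace with tokens from a real corpus.
    sample = "the cat saw the dog and the dog saw the cat".split()
    for row in zipf_table(sample):
        print(row)
```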

26
Linguistics
  • Knowledge about language
  • Phonetics and phonology - the study of sounds
  • Morphology - the study of word components
  • Syntax - the study of sentence and phrase
    structure
  • Lexical semantics - the study of the meanings of
    words
  • Compositional semantics - how to combine words
  • Pragmatics - how to accomplish goals
  • Discourse conventions - how to deal with units
    larger than utterances

27
What is needed (2): mathematical and
computational tools
  • Language models
  • Estimation methods
  • Hidden Markov Models (HMM) for sequences
  • Context-free grammars (CFG) for trees
  • Conditional Random Fields (CRF)
  • Generative/discriminative models
  • Maximum entropy models
  • Random walks
  • Latent semantic indexing (LSI)
  • Representation issues
  • Feature engineering
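
To give the first two items (language models, estimation methods) a concrete shape, here is a minimal sketch of a bigram language model estimated by relative frequency (maximum likelihood, no smoothing). The function names, the `<s>` and `</s>` boundary markers, and the three training sentences are assumptions chosen for this example, not material from the course.

```python
from collections import Counter

def train_bigram(sentences):
    """Collect context counts and bigram counts from whitespace-tokenized text."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent pairs
    return contexts, bigrams

def bigram_prob(contexts, bigrams, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen contexts."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]

if __name__ == "__main__":
    # Hypothetical toy corpus.
    corpus = ["children eat pizza", "they eat pizza", "they eat snacks"]
    contexts, bigrams = train_bigram(corpus)
    print(bigram_prob(contexts, bigrams, "eat", "pizza"))
```

Unsmoothed estimates assign zero probability to any bigram missing from the training data, which is one reason estimation methods get their own line in the list.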

28
Theoretical Computer Science
  • Automata
  • Deterministic and non-deterministic finite-state
    automata
  • Push-down automata
  • Grammars
  • Regular grammars
  • Context-free grammars
  • Context-sensitive grammars
  • Complexity
  • Algorithms
  • Dynamic programming

29
Mathematics and Statistics
  • Probabilities
  • Statistical models
  • Hypothesis testing
  • Linear algebra
  • Optimization
  • Numerical methods

30
Artificial Intelligence
  • Logic
  • First-order logic
  • Predicate calculus
  • Agents
  • Speech acts
  • Planning
  • Constraint satisfaction
  • Machine learning

31
Existing applications
  • Web search
  • Natural language interfaces to databases
  • Parsing job postings
  • Military intelligence
  • Summarizing medical records
  • Information extraction for databases
  • Wrapper induction

32
Potential applications
  • Trend recognition
  • DB conversion: named entity extraction,
    classification, relation extraction
  • Detecting change
  • Summarization
  • Social network analysis
  • Assigning subjectivity scores (stars)
  • Sentiment classification
  • Alignment of text w/ other signal (time series)
  • Record linkage

33
Current work at CLAIR
  • Semi-supervised entity and relation extraction
  • Subjectivity analysis, factuality extraction
  • Protein interaction recognition
  • Text summarization
  • Text mining from the Web
  • Lexical network models of the Web
  • Syntactic alignment
  • Chronology recovery
  • Classification

34
Final remarks
  • Language is not adversarial
  • It is used to convey useful information
  • Hard to extract this information automatically
  • Need to use NLP
  • Inference: mathematics, statistics, machine
    learning
  • Networks/fields
  • Graph theory
  • Differential equations
  • Statistics/optimization
  • Linguistics/KR/AI
  • Sequence alignment
  • Linear algebra/vector analysis

35
Ambiguity
I saw her fall.
  • The categories of knowledge of language can be
    thought of as ambiguity-resolving components
  • How many different interpretations does the above
    sentence have?
  • How can each ambiguous piece be resolved?
  • Does speech input make the sentence even more
    ambiguous?

Time flies like an arrow.
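
One way to see how quickly readings multiply is to enumerate part-of-speech assignments before any syntax is applied. This is a small sketch assuming a hypothetical lexicon; the tag sets below are illustrative guesses, not the output of a tagger or anything stated in the lecture.

```python
from itertools import product

# Hypothetical mini-lexicon: each word maps to the tags assumed possible for it.
LEXICON = {
    "time": ["N", "V"],
    "flies": ["N", "V"],
    "like": ["V", "P"],
    "an": ["Det"],
    "arrow": ["N"],
}

def tag_sequences(sentence):
    """List every combination of per-word tags allowed by the lexicon."""
    words = sentence.lower().rstrip(".").split()
    options = [LEXICON[word] for word in words]
    return [list(zip(words, tags)) for tags in product(*options)]

if __name__ == "__main__":
    for sequence in tag_sequences("Time flies like an arrow."):
        print(sequence)
```

With two candidate tags for each of three words, the lexicon alone allows 2 × 2 × 2 = 8 tag sequences. Syntactic and semantic knowledge then has to rule most of them out, which is the sense in which the knowledge categories act as ambiguity-resolving components.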
36
The alphabet soup (NLP vs. CL vs. SP vs. HLT vs.
NLE)
  • NLP (Natural Language Processing)
  • CL (Computational Linguistics)
  • SP (Speech Processing)
  • HLT (Human Language Technology)
  • NLE (Natural Language Engineering)
  • Other areas of research: Speech and Text
    Generation, Speech and Text Understanding,
    Information Extraction, Information Retrieval,
    Dialogue Processing, Inference
  • Related areas: Spelling Correction, Grammar
    Correction, Text Summarization

37
Some demos
  • AT&T Labs Text to Speech
    (http://www.research.att.com/projects/tts/demo.html)
  • Babelfish (http://babelfish.altavista.com)
  • OneAcross (http://www.oneacross.com)
  • AskJeeves (http://www.ask.com)
  • IONaut (http://www.ionaut.com:8400), seems to be
    down
  • NSIR
    (http://tangra.si.umich.edu/clair/NSIR/html/nsir.cgi)
  • AnswerBus (http://www.answerbus.com)
  • NewsInEssence (http://www.newsinessence.com)

42
The Turing Test
  • Alan Turing: the Turing test (language as test
    for intelligence)
  • Three participants: a computer and two humans
    (one is an interrogator)
  • Interrogator's goal: to tell the machine and
    human apart
  • Machine's goal: to fool the interrogator into
    believing that a person is responding
  • Other human's goal: to help the interrogator
    reach his goal

Q: Please write me a sonnet on the topic of the
Forth Bridge.
A: Count me out on this one. I never could write
poetry.
Q: Add 34957 to 70764.
A: 105621 (after a pause)
43
Some brief history
  • Foundational insights (40s and 50s): automaton
    (Turing), probabilities, information theory
    (Shannon), formal languages (Backus and Naur),
    noisy channel and decoding (Shannon), first
    systems (Davis et al., Bell Labs)
  • Two camps (57-70): symbolic and stochastic.
    Transformation grammar (Harris, Chomsky),
    artificial intelligence (Minsky, McCarthy,
    Shannon, Rochester), automated theorem proving
    and problem solving (Newell and Simon).
    Bayesian reasoning (Mosteller and Wallace).
    Corpus work (Kucera and Francis).

44
Some brief history
  • Four paradigms (70-83): stochastic (IBM),
    logic-based (Colmerauer, Pereira and Warren, Kay,
    Bresnan), NLU (Winograd, Schank, Fillmore),
    discourse modelling (Grosz and Sidner)
  • Empiricism and finite-state models redux (83-93):
    Kaplan and Kay (phonology and morphology), Church
    (syntax)
  • Late years (94-03): strong integration of
    different techniques, different areas (including
    speech and IR), probabilistic models, machine
    learning

45
The state of the art and the near-term future
  • World-Wide Web (WWW)
  • Sample scenarios
  • generate weather reports in two languages
  • teaching deaf people to speak
  • translate Web pages into different languages
  • speak to your appliances
  • find restaurants
  • answer questions
  • grade essays (?)
  • closed-captioning in many languages
  • automatic description of a soccer game

46
Structure of the course
  • Three major parts
  • Linguistic, mathematical, and computational
    background
  • Computational models of morphology, syntax,
    semantics, discourse, pragmatics
  • Applications: text generation, machine
    translation, information extraction, etc.
  • Three major goals
  • Learn the basic principles and theoretical issues
    underlying natural language processing
  • Learn techniques and tools used to develop
    practical, robust systems that can communicate
    with users in one or more languages
  • Gain insight into many open research problems in
    natural language

47
Readings
  • Speech and Language Processing (Daniel Jurafsky
    and James Martin), Prentice-Hall, 2000, ISBN
    0-13-095069-6
  • Handouts given in class
  • 1-2 chapters per week

Optional readings: Natural Language
Understanding by Allen; Foundations of
Statistical Natural Language Processing by
Manning and Schütze.
48
Grading
  • Four homework assignments (40%)
  • Midterm (15%)
  • Final project (20%)
  • Final exam (25%)
  • Additional requirements for SI761

49
Assignments
  • (subject to change)
  • Finite-state modeling, part of speech tagging,
    and information extraction
  • Fsmtools/lextools/JMX (Bell Labs, Penn)
  • Tagging and parsing
  • Brill tagger/Charniak parser (JHU, Brown)
  • Machine translation
  • GIZA/Rewrite decoder (Aachen, JHU, ISI)
  • Text generation
  • FUF/Surge (Columbia)

50
Syllabus
51
Other meetings
  • CLAIR meeting
  • (TBA)
  • Artificial Intelligence Seminar
  • (Tuesdays 4-5:30)
  • STIET
  • (Thursdays 4-5:30)

52
Projects
Each student will be responsible for designing
and completing a research project that
demonstrates the ability to use concepts from the
class in addressing a practical problem. A
significant part of the final grade will depend
on the project assignment. Students can elect to
do a project on an assigned topic, or to select a
topic of their own. The final version of the
project will be put on the World Wide Web, and
will be defended in front of the class at the end
of the semester (procedure TBA). In some cases
(and only with the instructor's approval), students
may be allowed to work in pairs when the
project's scope is significant.
53
Sample projects
  • Noun phrase parser
  • Paraphrase identification
  • Question answering
  • NL access to databases
  • Named entity tagging
  • Rhetorical parsing
  • Anaphora resolution, entity crossreference
  • Document and sentence alignment
  • Using bioinformatics methods
  • Encyclopedia
  • Information extraction
  • Speech processing
  • Sentence normalization
  • Text summarization
  • Sentence compression
  • Definition extraction
  • Crossword puzzle generation
  • Prepositional phrase attachment
  • Machine translation
  • Generation
  • Semi-structured document parsing
  • Semantic analysis of short queries
  • User-friendly summarization
  • Number classification
  • Domain-specific PP attachment
  • Time-dependent fact extraction

54
Main research forums and other pointers
  • Conferences: ACL/NAACL, SIGIR, AAAI/IJCAI, ANLP,
    Coling, HLT, EACL/NAACL, AMTA/MT Summit,
    ICSLP/Eurospeech
  • Journals: Computational Linguistics, Natural
    Language Engineering, Information Retrieval,
    Information Processing and Management, ACM
    Transactions on Information Systems, ACM TALIP,
    ACM TSLP
  • University centers: Columbia, CMU, JHU, Brown,
    UMass, MIT, UPenn, USC/ISI, NMSU, Michigan,
    Maryland, Edinburgh, Cambridge, Saarland,
    Sheffield, and many others
  • Industrial research sites: IBM, SRI, BBN, MITRE,
    MSR, (AT&T, Bell Labs, PARC)
  • Startups: Language Weaver, Ask.com, LCC
  • The Anthology: http://www.aclweb.org/anthology

56
What this course is NOT
  • EECS 597 / LING 792 / SI 661: Language and
    Information, last taught in Winter 2005,
    essentially an introduction to corpus-based and
    statistical NLP.
  • Topics covered: introduction to computational
    linguistics, information theory, data compression
    and coding, N-gram models, clustering,
    lexicography, collocations, text summarization,
    information extraction, question answering, word
    sense disambiguation, analysis of style, and
    other topics.
  • SI 760: Information Retrieval, last taught
    Winter 2005.
  • Topics covered: information need, IR models,
    documents, queries, query languages, relevance,
    retrieval evaluation, reference collections,
    query expansion and relevance feedback, indexing
    and searching, XML retrieval, language modeling
    approaches, crawling the Web, hyperlink analysis,
    measuring the Web, similarity and clustering,
    social network analysis for IR, hubs and
    authorities, PageRank and HITS, focused crawling,
    relevance transfer, question answering
  • The new advanced NLP/IR course, to be offered
    Winter 2006.
  • An undergraduate Linguistics course such as Ling
    212: Intro to the Symbolic Analysis of Language
    or Ling 320: Programming for Linguistics and
    Language Studies

57
Other sites
  • Johns Hopkins University (Jason Eisner):
    http://www.cs.jhu.edu/jason/465/
  • Cornell University (Lillian Lee):
    http://courses.cs.cornell.edu/cs674/2002SP/
  • Stanford University (Chris Manning):
    http://www.stanford.edu/class/cs224n/
  • JHU Summer workshop:
    http://www.clsp.jhu.edu/ws2003/calendar/preliminary.shtml

58
Readings
  • J&M Chapters 1, 2
  • What is Computational Linguistics? by Hans
    Uszkoreit:
    http://www.coli.uni-sb.de/hansu/what_is_cl.html
  • Lecture notes 1