Title: Overview
1Overview
- LING 5200
- Computational Corpus Linguistics
- Martha Palmer
2What's a corpus?
- McEnery & Wilson
- (i) (loosely) any body of text
- (ii) (most commonly) a body of machine-readable text
- (iii) (more strictly) a finite collection of machine-readable text, sampled to be maximally representative of a language or variety
3What's corpus linguistics?
- the study of language based on examples of real-life language use (McEnery & Wilson)
- A methodology, not a branch of linguistics
- Biber et al.
- Uses computers
- Natural texts
- Large principled collection
- Both quantitative and qualitative
4What was Chomsky's complaint?
- Linguistics should model competence, not performance. What are the underlying rules that allow us to generate language?
- Context: structuralists believed in collecting linguistic data about a language without taking meaning and communication into consideration.
- Mirrors the debate between the rationalists and the empiricists.
- But does Chomsky account for meaning?
- (see Searle)
5Which branches of linguistics can make use of corpus linguistics?
- Phonetics
- Phonology
- Morphology
- Syntax
- Semantics
- Pragmatics
- Psycholinguistics
- Computational Lx
- Descriptive Lx
- Historical Lx
- Sociolinguistics
6Corpus linguistics in context
7What's LING 5200 Corpus Linguistics?
8Overview
- Quick intro to Unix
- A little corpus design
- Quick tour of corpora and annotation
- Tools for working with corpora
- Programming in Python
- Some software engineering
9Why Python?
- It works
- Many advantages
- It's a bona fide programming language
- You'll need it for CSCI 5832
10Administrative things
- Textbooks: Unix, Python
- Office hours: Mon 5-6, Tues 1-2
- verbs.colorado.edu/mpalmer/ling5200
- Prerequisites: none
- Grades: homeworks/project
- Accounts on babel
11Logging on for the first time
- First thing to do: change your password.
- passwd
- Give it your current password, then your new password. Repeat the new one (to catch typos).
12Connecting with another computer
- ssh -l your_name babel.colorado.edu
- You are prompted to log in.
13Logging on for the first time, again
- First thing to do: change your password.
- passwd
- Give it your current password, then your new password. Repeat the new one. (Why?)
14Where am I?
- Type pwd
- You see something like this:
- /home/mpalmer
15What's that mean??
16Important directories
/
  bin
  home
    mpalmer
      ling5200
        RCS
  etc
  usr
    local
      bin
17Important directories
/
  bin
  home
    mpalmer
      ling5200        <- /home/mpalmer/ling5200
        RCS
  etc
  usr
    local
      bin
18Important directories
/
  bin
  home
    mpalmer
      ling5200        <- /home/mpalmer/ling5200
        RCS
  etc
  usr
    local
      bin             <- /usr/local/bin
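The two full paths on the slide can be read off the tree by joining directory names with /. A minimal sketch, assuming mktemp and find are available: a scratch directory stands in for the real / so nothing on the system is changed, and only the directories named in those two paths (plus the other top-level ones) are recreated.

```shell
# Scratch directory plays the role of "/" for this demo (an assumption, not the real root).
scratch=$(mktemp -d)
cd "$scratch"

# Recreate the branches named on the slide; -p also makes the parent directories.
mkdir -p bin etc home/mpalmer/ling5200 usr/local/bin

# List every directory under the stand-in root, one path per line.
dirs=$(find . -type d | LC_ALL=C sort)
echo "$dirs"
```

In the listing, ./home/mpalmer/ling5200 and ./usr/local/bin correspond to /home/mpalmer/ling5200 and /usr/local/bin on the slide, with the leading . standing for the scratch root.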
19Navigating directories
- ls to list contents, cd to change directory
- Directories are just like Windows folders
- /home/mpalmer shortcut: ~
- the directory above this one: ..
- this directory: .
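A minimal sketch of these shortcuts, run in a throwaway directory so nothing real is touched. The scratch directory (made with mktemp, assumed to be available) and the mpalmer/ling5200 folders inside it are stand-ins for illustration, not the real layout on babel.

```shell
# Build a small stand-in directory tree to move around in.
scratch=$(mktemp -d)
mkdir -p "$scratch/mpalmer/ling5200"

cd "$scratch/mpalmer/ling5200"
here=$(pwd)

cd .            # "." is this directory, so we stay where we are
same=$(pwd)

cd ..           # ".." is the directory above this one
parent=$(pwd)

echo "$here"
echo "$same"
echo "$parent"

# "~" expands to your home directory (the value of $HOME).
echo ~
```

The first two lines printed are identical, and the third is the same path with the final ling5200 removed.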
20What's in the neighborhood?
- Type ls
- You see a list of directories and files that are contained within the current directory
- Homework_1.txt
- tools
- buglog.txt
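To try this without needing those exact files, the listing can be reproduced in a scratch directory. This is only a sketch: the three names are copied from the slide, the files are created empty, and treating tools as a directory is an assumption. The order in which ls prints names can differ between systems.

```shell
# Scratch directory so the demo does not depend on what is in your account.
scratch=$(mktemp -d)
cd "$scratch"

# Create two empty files and one directory with the names from the slide.
touch Homework_1.txt buglog.txt
mkdir tools

# Plain ls: names only.
listing=$(ls)
echo "$listing"

# ls -F adds a trailing / to directories, which helps tell them from files.
marked=$(ls -F)
echo "$marked"
```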
21I'd like to go somewhere else
- Type pwd
- Type cd
- Where are you?
- Type cd ..
- Where are you?
- Type cd your_user_id
- Where are you?
22Unix is a verb-initial language
- cd = "go" (the verb comes first)
- the argument after cd = where to go
23Unix is a verb-initial language
- cd = "go"
- If no argument, I assume you mean "home"
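The verb-then-argument pattern, and the "no argument means home" default, can be sketched as follows. Assumptions for the demo only: a scratch directory stands in for /home, your_user_id is a placeholder name, and HOME is pointed at the stand-in so that your real home directory is not involved. Do not reset HOME like this in normal use.

```shell
# Stand-in for /home/your_user_id inside a scratch directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/home/your_user_id"
HOME="$scratch/home/your_user_id"   # demo only: pretend this is home
export HOME

cd                  # verb only: no argument, so "go home"
first=$(pwd)

cd ..               # verb + argument: go to the directory above
second=$(pwd)

cd your_user_id     # verb + argument: go into the named directory
third=$(pwd)

printf '%s\n' "$first" "$second" "$third"
```

The first and third lines printed are the same directory: going up one level and then back down by name returns you to where cd alone took you.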
24Making a new directory
- Type cd
- Type ls
- Type mkdir ling5200
- Type ls
- Go to the directory you just made (how?)
- Type pwd
- Type ls
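One way to work through these steps is sketched below. To keep the demo harmless it starts from a scratch directory (made with mktemp, assumed to be available) rather than your real home directory, so the first cd differs from the slide; the remaining commands are the ones listed.

```shell
# Start somewhere empty instead of the real home directory.
scratch=$(mktemp -d)
cd "$scratch"

before=$(ls)        # nothing here yet
mkdir ling5200      # make the new directory
after=$(ls)         # now the listing shows it

cd ling5200         # go to the directory just made
where=$(pwd)
inside=$(ls)        # a brand-new directory has nothing in it

echo "before: $before"
echo "after:  $after"
echo "where:  $where"
echo "inside: $inside"
```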