Title: Introduction to Cognitive Science Linguistics Component
1 Introduction to Cognitive Science: Linguistics Component
- Lecture 2
- September 22, 2005
- (2.00 p.m. to 3.50 p.m.)
- Venue: Meng Wah Complex, Room 324
- Lecturer: Dr. A. B. Bodomo
- Department of Linguistics
- abbodomo_at_hku.hk
2 Topic 3: Formal Grammar, Parsing and Generation
3 Introduction
- In my previous lectures, we discussed how tacit linguistic knowledge can be represented at the various levels of phonology, morphology, syntax, semantics, pragmatics, and their interfaces, including morphophonology, morphosyntax, and the syntax-semantics interrelationships.
- In this lecture, we shall look closely at how these linguistic knowledge representations can be formalised into an algorithm, a computational procedure for processing this linguistic knowledge.
4 Keywords
- Constituent structure rules
- initial symbol
- terminal symbol
- non-terminal symbol
- generative grammar
- formal grammar
5 Formal devices and notation
- The symbol → indicates that a node is rewritten as, consists of, or has the constituents that follow it.
- This is used in rewrite rules of the type S → NP VP: a sentence, S, has the constituents noun phrase (NP) and verb phrase (VP).
- Optionality in the grammar is expressed as {X, Y}.
- This means apply either X or Y, but not both.
6 Formal devices and notation (contd)
- Initial symbol: the symbol from which a rewrite rule begins (e.g. S).
- Terminal symbol: an end symbol from which no constituent structure can be further developed (e.g. N, V, Art). All others are non-terminal symbols (e.g. NP, VP).
- The symbol # is used to indicate a constituent boundary, e.g. #_ is word initial while _# is word final.
- The notation X (Y) implies that X is obligatory and may be followed by Y.
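The X (Y) notation can be unpacked mechanically into plain rewrite alternatives. The following is a minimal sketch, not part of the lecture: the function name `expand_optional` and the convention of writing an optional symbol as the string "(Y)" are assumptions made for illustration.

```python
from itertools import product


def expand_optional(rhs):
    """Expand a right-hand side such as ['X', '(Y)'] into all plain sequences.

    A symbol written in parentheses is optional; every other symbol is obligatory.
    """
    choices = []
    for symbol in rhs:
        if symbol.startswith("(") and symbol.endswith(")"):
            # Optional symbol: it may be absent (None) or present.
            choices.append([None, symbol[1:-1]])
        else:
            # Obligatory symbol: exactly one choice.
            choices.append([symbol])
    # Combine the choices position by position and drop the absent symbols.
    return [[s for s in combo if s is not None] for combo in product(*choices)]


print(expand_optional(["X", "(Y)"]))  # [['X'], ['X', 'Y']]
```

Under this reading, a rule whose right-hand side is X (Y) stands for two rules, one with X alone and one with X followed by Y.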
7 Two main aspects of grammatical information processing: generating and parsing sentences
- Before we begin, let us illustrate with a simple grammar and lexicon, using the following sentence: The students greeted the teacher.
8 The students greeted the teacher.
- Grammar
- S → NP VP
- VP → V NP
- NP → Art N
- Lexicon 1
- greeted: V, _NP
- students: N
- the: Art
- teacher: N
This grammar can also generate (i.e. produce) the following sentences:
- The teacher greeted the students.
- The teacher scared the students.
- The child ate an apple.
But you have to augment (i.e. increase) the lexicon as follows:
Lexicon 2
- an: Art
- the: Art
- teacher: N
- students: N
- apple: N
- child: N
- greeted: V, _NP
- scared: V, _NP
- ate: V, _NP
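To see how much a three-rule grammar can produce once the lexicon grows, the sketch below enumerates every word string that the grammar licenses with Lexicon 2. It is an illustration only: the names `LEXICON_2`, `TERMINAL_SEQUENCE` and `all_sentences` are assumptions, and the three rules are flattened by hand into the single terminal sequence Art N V Art N.

```python
from itertools import product

# Lexicon 2 from the slide, grouped by word class.
LEXICON_2 = {
    "Art": ["the", "an"],
    "N": ["students", "teacher", "apple", "child"],
    "V": ["greeted", "scared", "ate"],
}

# S -> NP VP, VP -> V NP and NP -> Art N together always yield this sequence.
TERMINAL_SEQUENCE = ["Art", "N", "V", "Art", "N"]


def all_sentences(lexicon):
    """Return every word string the grammar licenses with the given lexicon."""
    slots = [lexicon[category] for category in TERMINAL_SEQUENCE]
    return [" ".join(words) for words in product(*slots)]


sentences = all_sentences(LEXICON_2)
print(len(sentences))                         # 2 * 4 * 3 * 2 * 4 = 192
print("the child ate an apple" in sentences)  # True
```

The list also contains strings such as "an teacher greeted the apple", because this small grammar has no rules for article choice or for which nouns fit which verbs.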
9 Sentence generation: the algorithm
- To produce a sentence we need three things:
- A set of phrase structure rules (as illustrated above)
- A lexicon (as illustrated above), and
- A lexical insertion rule (as explained below)
- A lexical insertion rule is an instruction to select the right word from a lexicon.
- The following is an example of a lexical insertion rule:
10 Lexical insertion rule
- For each terminal symbol of a phrase structure rule, select a word from the lexicon that satisfies the following conditions:
- It is a member of the class of the terminal symbol (e.g. N, V).
- Its subcategorization frame matches that of the terminal symbol (e.g. V, _NP).
- Attach this word as the daughter of this terminal symbol.
- The set of rules above constitutes what is known as a sentence generator.
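The two conditions of the lexical insertion rule can be written as a simple filter over the lexicon. This is a hedged sketch rather than the lecture's own formulation: the dictionary layout, the name `candidates`, and the use of `None` for "no subcategorization frame" are assumptions.

```python
# Lexicon 1: word -> (word class, subcategorization frame or None).
LEXICON_1 = {
    "greeted": ("V", "_NP"),
    "students": ("N", None),
    "the": ("Art", None),
    "teacher": ("N", None),
}


def candidates(terminal, frame, lexicon):
    """Return the words that may be attached as daughter of `terminal`.

    A word qualifies if its class equals the terminal symbol and its
    subcategorization frame equals the frame required at that position.
    """
    return [
        word
        for word, (word_class, word_frame) in lexicon.items()
        if word_class == terminal and word_frame == frame
    ]


print(candidates("V", "_NP", LEXICON_1))  # ['greeted']
print(candidates("N", None, LEXICON_1))   # ['students', 'teacher']
```

Any word returned by the filter is a legal choice; which one is attached decides which sentence is generated.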
11
- The whole procedure of beginning with an initial symbol, working through phrase structure rules, and then adding the lexical items via lexical insertion rules is driven by an algorithm, or a set of instructions.
- Let us set out an algorithm for the generation (production) of the sentence The students greeted the teacher, given a grammar and a lexicon as follows.
12 The students greeted the teacher
Lexicon 1
- greeted: V, _NP
- students: N
- the: Art
- teacher: N
Grammar
- PS Rule (a): S → NP VP
- PS Rule (b): VP → V NP
- PS Rule (c): NP → Art N
- Rule 1: Start with the initial symbol, S.
- Rule 2: For every non-terminal symbol, X, find a phrase structure rule with X as the left-hand symbol and others as the right-hand symbol(s), and develop a rewrite rule with X as the mother and the right-hand symbols as ordered daughters.
- Rule 3: Apply Rule 2 until all branches end in terminal symbols.
- Rule 4: Apply the lexical insertion rule iteratively until every terminal symbol is replaced by a lexical item.
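Rules 1 to 4 can be sketched as a short program. This is an illustration under stated assumptions, not the lecture's implementation: trees are nested tuples, the names `expand`, `insert_words` and `leaves` are invented here, the words to insert are supplied by the caller in left-to-right order, and subcategorization frames are left out for brevity.

```python
GRAMMAR = {"S": ["NP", "VP"], "VP": ["V", "NP"], "NP": ["Art", "N"]}
LEXICON = {"Art": ["the"], "N": ["students", "teacher"], "V": ["greeted"]}


def expand(symbol):
    """Rules 2 and 3: rewrite non-terminals until every branch ends in a terminal."""
    if symbol not in GRAMMAR:  # terminal symbol such as N, V, Art
        return (symbol,)
    return (symbol, *[expand(child) for child in GRAMMAR[symbol]])


def insert_words(tree, words):
    """Rule 4: attach the next chosen word as daughter of each terminal symbol."""
    if len(tree) == 1:  # a bare terminal symbol with no daughter yet
        terminal = tree[0]
        word = next(words)
        if word not in LEXICON[terminal]:
            raise ValueError(f"{word!r} is not listed as {terminal}")
        return (terminal, word)
    return (tree[0], *[insert_words(child, words) for child in tree[1:]])


def leaves(tree):
    """Read the words off the finished tree from left to right."""
    if isinstance(tree[1], str):
        return [tree[1]]
    return [word for child in tree[1:] for word in leaves(child)]


skeleton = expand("S")  # Rule 1: start with the initial symbol, S
chosen = iter(["the", "students", "greeted", "the", "teacher"])
tree = insert_words(skeleton, chosen)
print(tree)
print(" ".join(leaves(tree)))  # the students greeted the teacher
```

Changing the words passed in (for example swapping "students" and "teacher") yields a different sentence from the same tree skeleton.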
13 Illustrating the algorithm
Applying Rule 1
Applying Rule 2,3
Applying Rule 3
Applying Rule 4
14
- From the above we can see that we have started from an initial symbol and have ended with terminal symbols with lexical items as their daughters. A sentence has thus been generated (produced), telling us how this sentence is built up.
- Now, let us see how we can begin with an existing sentence and then break it down into its component parts by applying rules.
15 Sentence parsing: the algorithm
- To parse a sentence means to analyse it into its constituent parts by the systematic application of lexical insertion rules and some phrase structure rules.
- It is like the reverse process of generation.
16 Types of parsing
- Top-down: begin with the symbol S.
- Bottom-up: begin with terminal symbols (words).
Possible research: Which types of parsing in natural languages provide the most cognitively realistic and efficient parser?
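The top-down strategy can be sketched as a recursive procedure that starts from S and works towards the words. This is a minimal illustration, assuming the three-rule grammar used earlier in the lecture; the names `WORD_CLASS` and `parse_top_down` are invented here, and because every non-terminal has exactly one rule, no backtracking is needed.

```python
GRAMMAR = {"S": ["NP", "VP"], "VP": ["V", "NP"], "NP": ["Art", "N"]}
WORD_CLASS = {"the": "Art", "students": "N", "teacher": "N", "greeted": "V"}


def parse_top_down(symbol, words, pos=0):
    """Expand `symbol` starting at word position `pos`.

    Returns (tree, next position) on success, or None on failure.
    """
    if symbol in GRAMMAR:
        children = []
        for child in GRAMMAR[symbol]:
            result = parse_top_down(child, words, pos)
            if result is None:
                return None
            subtree, pos = result
            children.append(subtree)
        return (symbol, *children), pos
    # Terminal symbol: the next word must belong to this word class.
    if pos < len(words) and WORD_CLASS.get(words[pos]) == symbol:
        return (symbol, words[pos]), pos + 1
    return None


words = "the students greeted the teacher".split()
result = parse_top_down("S", words)
# The parse succeeds only if S covers every word of the input.
print(result is not None and result[1] == len(words))  # True
```

A bottom-up parser reaches the same tree from the opposite direction, starting with the word classes and ending at S.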
17 Some sentence parsing rules which constitute a PARSER
- For a sentence, S:
- Rule 1: Determine from the lexicon the word class of every item and develop a partial tree for each word where the word class label dominates the word.
- Rule 2: Find a PS rule of the type X → Y Z where the right-hand symbols match some sequence of categories in the structure so far, and develop a partial tree with X as the mother and the right-hand symbols as ordered daughters.
- Rule 3: Continue Rule 2 until the root, S, is reached and there are no unattached strings.
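The three parsing rules amount to a bottom-up procedure, which can be sketched as follows for the sentence The man drank the tea. The names `RULES`, `WORD_CLASS`, `reduce_once` and `parse_bottom_up` are assumptions made for this illustration, and the sketch always applies the first rule that matches, which is enough for this small unambiguous grammar but not for grammars that need backtracking.

```python
RULES = [("S", ["NP", "VP"]), ("VP", ["V", "NP"]), ("NP", ["Art", "N"])]
WORD_CLASS = {"the": "Art", "man": "N", "tea": "N", "drank": "V"}


def reduce_once(trees):
    """Rule 2: replace the first sequence of partial trees matching a PS rule."""
    for mother, rhs in RULES:
        n = len(rhs)
        for i in range(len(trees) - n + 1):
            if [tree[0] for tree in trees[i:i + n]] == rhs:
                # Build a partial tree with the mother dominating the daughters.
                trees[i:i + n] = [(mother, *trees[i:i + n])]
                return True
    return False


def parse_bottom_up(sentence):
    words = sentence.lower().rstrip(".").split()
    # Rule 1: one partial tree per word, labelled with its word class.
    trees = [(WORD_CLASS.get(word), word) for word in words]
    # Rule 3: keep applying Rule 2 until nothing more can be combined.
    while reduce_once(trees):
        pass
    # Success means a single tree rooted in S and no unattached strings.
    if len(trees) == 1 and trees[0][0] == "S":
        return trees[0]
    return None


print(parse_bottom_up("The man drank the tea."))
print(parse_bottom_up("The man drank"))  # None
```

The second call returns None because the partial trees NP and V are left unattached and the root S is never reached.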
18 The man drank the tea.
Lexicon 1
- drank: V, _NP
- man: N
- the: Art
- tea: N
Grammar
- PS Rule 1: S → NP VP
- PS Rule 2: VP → V NP
- PS Rule 3: NP → Art N
Applying Rule 1
Applying Rule 2
19 Applying Rule 3
20 Conclusion
- Parsing and generation of natural language data is a very important area of linguistics, especially in computer applications of natural languages, which have become an important aspect of the computer or information processing industry.
21 Topic 4: Language and Literacy Acquisition
22 Keywords
- language acquisition
- innateness hypothesis
- language faculty / Language Acquisition Device (LAD)
- literacy
- levels of literacy
- literacy acquisition
23 Introduction
- Theme
- A survey of how linguistic knowledge is acquired/learnt by speakers of a language, from the point of view of spoken language and from the point of view of literacy (reading and writing).
- Objective
- An understanding of the basic terms and issues in language and literacy acquisition.
- An interface approach: rather than rigidly treating language acquisition as separate and different from literacy acquisition, we will look at how language acquisition relates to literacy acquisition.
24 What is language acquisition?
- Gleitman and Bloom 1999: 434
- refers to the process of attaining a specific variant of human language ... the fundamental puzzle in understanding this process has to do with the open-ended nature of what is learned: children appropriately use words acquired in one context to make reference in the next, and they construct novel sentences to make known their changing thoughts and desires (in MIT Encyclopedia of the Cognitive Sciences).
- Crystal 1997: 430
- The process of learning a first language in children.
- The analogous process of gaining a foreign or second language.
25 Explaining how languages are acquired
- In previous lectures we have tried to account for how all and only the grammatical sentences of a language are produced and represented in the brain of the speakers of a language.
- However, a complete account of linguistic knowledge representation must address the issue of how we acquire a language as children and how we learn foreign languages as adults.
- We will mainly be concerned with first language acquisition and not foreign language learning.
26 Stages of language development
- The single word stage (12-18 months)
- The language of the child consists of just a few isolated words of the target language, e.g. mamma, daddy, etc.
- Very little grammatical development.
- The grammar stage (19-29 months)
- Marked by the emergence of a few nominal and verbal inflections in languages that have these.
- A few phrases and word utterances apparently strung together: mammy, milk; daddy bye bye; etc.
- 30 months
- Can produce more adult-like speech: Where's daddy? Daddy, I want to go with you.
27 Explaining language acquisition
- The reason for the uniformity and rapidity in child language acquisition is contained in the innateness hypothesis.
- This is, at least, the position of Chomsky and most cognitive approaches to linguistic explanation.
- In this hypothesis, language acquisition is determined by a biologically endowed innate language faculty (also called the Language Acquisition Device (LAD)).
- The LAD, or language learning program, in children's brains provides them with a set of procedures (let us call it an algorithm since we are computer/cognitive science inclined) for developing a grammar.
- Input: the linguistic experience they get from parents and teachers.
28 The nature of the language faculty
- Children can acquire any language as their native tongue.
- e.g. a child of Cantonese-speaking parents growing up in England can learn to speak perfect English as her native tongue.
- Those aspects of language that are innately determined are universal.
- The language faculty does not vary significantly from human to human.
An important aspect of work on the language faculty is the search for principles of Universal Grammar!
29 Universal Grammar (UG)
- A theory of the human language faculty, i.e. a module of the mind/brain involved in the basic design of language (Noam Chomsky).
- It is part of an innate biologically endowed language faculty, an innate mental organ specific to the human species.
- It allows us to perceive and interpret information governed by certain formal constraints.
- These formal constraints refer to a system of rules and representations and one of its operations (its grammar) by which the acceptable sentences of a language can be generated.
- Examples of formal universals, linguistic constraints of an abstract nature: the binding principles determining what can or cannot be the antecedent of an anaphoric, pronominal, or fully referential nominal element, etc.
30 Literacy Acquisition
- Literacy: the ability to read, write and calculate basic numbers.
- Difficult to define: it can mean different things to different people in different areas, e.g. computer literacy, investment literacy, etc.
- Is literacy part of our mental, cognitive faculty?
- Yes, because any human can acquire literacy, i.e. learn how to read, write and calculate basic numbers, given the right environment.
31 Levels of Literacy (cf. Stages of language acquisition)
- 6 stages of reading (Daswani 1999)
- Stages 1-3: Pre-reading, decoding, fluency (approx. grades 1-3)
- Stage 4: Acquiring new knowledge (approx. grades 4-8)
- Stage 5: Reading a range of complex materials critically (grades 9-12)
- Stage 6: Mature reader able to read for various purposes: professional, personal, civic (university and beyond)
32 The relationship between language and literacy acquisition
- Traditional/historical view of child language acquisition: learning to speak happens up to the age of five years, while learning to read happens after five.
- Now they are seen as very intertwined, i.e. very related: learning to speak and learning to be literate both deal with learning to use language.
- The basis of learning to speak has been outlined to provide an ecology for literacy. The most important lesson is that learning to speak and learning to read are very much interwoven.
33 Evidence of the interface of language and literacy acquisition
- They are both part of learning to USE language.
- Both need input from the environment.
- This can be compared with Vygotsky's idea of ZOPED, the zone of proximal development, i.e. the distance between child initiative and the ability of the child to do things under the influence of parental support.
- The learning environment: participants, situation, activity and a mechanism.
- Literacy acquisition is like language acquisition (cf. Givon's idea of literacy acquisition as a weak reflex of language acquisition).
- Literacy is best acquired in a language one has acquired.
34 Conclusion
- Literacy (reading and writing) is then another level/kind of linguistic knowledge representation.
- Spoken and written linguistic knowledge representations interface with each other and are very intertwined.
- Language and literacy acquisition have very important social, educational and cognitive implications.
- Language and literacy acquisition should therefore form an integral part of cognitive science.
35 References
- David Barton. 1994. The roots of literacy. In Literacy: An Introduction to the Ecology of Written Language. Oxford UK and Cambridge USA: Blackwell. Chapter 9, p. 130-139.
- C. J. Daswani. 1999. Literacy. In Bernard Spolsky (ed.) 1999. Concise Encyclopedia of Educational Linguistics. Oxford: Elsevier Science Ltd.
- Viv Edwards and David Corson (eds.) 1997. Encyclopedia of Language and Education, Volume 2: Literacy. Netherlands: Kluwer Academic Publishers.
- Talmy Givon. 1998. The grammar of literacy. In Syntaxis, 1, 1998: 1-40.
- Elfrieda Hiebert. 1994. Literacy in preschool programs. In Alan C. Purves et al. (eds.) 1994. Encyclopedia of English Studies and Language Arts. New York: Scholastic. 754-756.
- Ernest Lepore and Zenon Pylyshyn (eds.) 1999. What Is Cognitive Science. Blackwell Publishers. (especially chapters 10, 11, 12, and 13)
- Neil Stillings and others. 1995. Cognitive Science: An Introduction. MIT Press. (especially chapters 6, 9, 10, and 11)
- Daniel A. Wagner. 1994. Literacy definitions. In Alan C. Purves et al. (eds.) 1994. Encyclopedia of English Studies and Language Arts. New York: Scholastic. 748-752.
- R. Wilson and Frank C. Keil (eds.) 1999. The MIT Encyclopedia of the Cognitive Sciences. MIT Press.
- Lila Gleitman and Paul Bloom. Language Acquisition. p. 434-438
- David Olson. Literacy. p. 481-482
36 Tentative list of research topics for Cognitive Science students
- Supervisor: Dr. Adams BODOMO (abbodomo_at_hku.hk)
- Topics in Syntax: Theory, Description and Application
- Building human language components in computational systems
- The LFG treatment of serial verbs, complex predicates, and other verbal constructions in various languages: French, Norwegian, Japanese, Chinese, Dagaare, etc.
- Topics in Language and Literacy as cognitive processes
- Chinese writing and computer technology: survey and evaluation of various inputting systems
- New forms and functions of language and literacy in the age of information technology (emails, ICQ, bulletin boards, mobile phone texting, etc.)
- A survey of SMS texting as a cognitive and communicative process in HK
- The grammar of aphasic patients
37 Further studies: courses by Dr Bodomo
- LING1002 - Language.com: Language in the Contemporary World (1st year undergraduate, co-taught with other staff members)
- LING2011 - Language and Literacy in the Information Age
- LING2032 - Syntactic Theory
- LING2018 - Lexical-Functional Grammar
- LING2041 - Language and Information Technology
- LING2050 - Grammatical Description
- LING2051 - French Syntax and Universal Grammar
- Also consider the B.A. in Human Language Technology (HLT) as an option for a minor
38 Take-home Quiz
- Please submit your answers to your tutor on or before September 22, 2005.
39 - The End -