1
Connectionist Models of Language Development
Grammar and the Lexicon
  • Steve R. Howell
  • McMaster University, 1999

2
Overview
  • Description of Research Plan
  • Explanation of Research Goals
  • Examine the inspiration for this research, in
    both the connectionist and language sub-fields
  • Methods
  • Results (Preliminary)
  • Discussion and Future Directions

3
Overall Research Plan
  • Pursuit of an integrated, multi-level
    connectionist model of language development
  • Multi-level: dealing with several different
    levels or parts of the language task
  • Integrated: non-modular, homogeneous
    functioning throughout the multi-level design

4
Research Goals
  • Better understanding of language development
    process
  • Ability to test different interventions on a
    successful model, instead of on children,
    including possibly lesioning the model
  • Functional language-learning model for AI and
    software (e.g. chatterbots on the net)

5
Connectionist Inspiration
  • Work of Jeff Elman on models of grammar learning
    using Simple Recurrent Networks (SRNs)
  • Work of Landauer et al. on acquisition of
    semantic information (i.e. the lexicon) through
    analysis of many weak word-to-word relations in
    real-world text

6
Language-Domain Inspiration
  • Evidence against a sharp divide in the
    acquisition of the lexicon and grammar (e.g.
    Bates)
  • The lexicon develops first, but grammar
    development overlaps with it, proceeding
    seemingly in step with it
  • Hence the present focus on homogeneous
    mechanisms to explain the two

7
Method
  • Computer Simulation of Connectionist (Neural
    Network) model
  • The base algorithm and structure is Elman's
    (1990) Simple Recurrent Network
  • Modifications include sub-word-level input,
    multi-level architecture, and automated localist
    to distributed representation conversion

8
Diagram of SRN
9
Parts of an Elman SRN
  • Input Layer of Units
  • Larger (usually) Hidden Layer of Units
  • Context Layer: a memory connected to the hidden
    layer
  • Output Layer of units, same size as input layer
  • Uses back-propagation learning algorithm
  • Uses a prediction task to provide a more
    plausible teaching signal
  • Recurrent context units take a copy of the
    hidden units at each time step (sketched below)
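
To make the mechanics concrete, here is a minimal sketch of one SRN training step in Python/NumPy. The layer sizes, sigmoid units, and learning rate are illustrative assumptions, not values from the present simulations.

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hid = 8, 16                        # illustrative sizes only
    W_ih = rng.normal(0, 0.1, (n_hid, n_in))   # input -> hidden weights
    W_ch = rng.normal(0, 0.1, (n_hid, n_hid))  # context -> hidden weights
    W_ho = rng.normal(0, 0.1, (n_in, n_hid))   # hidden -> output weights
    context = np.zeros(n_hid)                  # context units start at rest
    lr = 0.1                                   # assumed learning rate

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_step(x_t, x_next):
        """One prediction-task step: the target is simply the next input."""
        global context, W_ih, W_ch, W_ho
        hidden = sigmoid(W_ih @ x_t + W_ch @ context)
        output = sigmoid(W_ho @ hidden)
        # Plain back-propagation on the one-step error (no unrolling in time)
        err_out = (output - x_next) * output * (1 - output)
        err_hid = (W_ho.T @ err_out) * hidden * (1 - hidden)
        W_ho -= lr * np.outer(err_out, hidden)
        W_ih -= lr * np.outer(err_hid, x_t)
        W_ch -= lr * np.outer(err_hid, context)
        context = hidden.copy()   # context takes a copy of the hidden units
        return output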

10
Modifications: Sub-word Input
  • Triples (Mozer; Wickelgren), or artificial
    phonemes
  • A recently completed simulation demonstrates the
    superiority of triple- or phoneme-level word
    representations over whole-word localist
    representations for grammar learning (phonics?)

11
Representations of Words
  • Localist (Elman, 1990)
    0 0 0 0 0 0 0 1
    0 0 0 1 0 0 0 0
  • Binary Distributed (triples)
    0 1 0 0 1 0 1 1
    1 0 1 1 1 1 1 1
  • Fully Distributed (semantic encoding)
    0.43 0.23 0.03 0.1 0.04
    0.22 0.12 0.04 0.42 0.5
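
The three coding schemes can be illustrated directly in Python/NumPy; the toy vocabulary and the particular vectors below are assumed for the example, not taken from the simulations.

    import numpy as np

    vocab = ["dog", "cat", "chased", "the"]    # assumed toy vocabulary

    # Localist (Elman, 1990): one unit per word, exactly one unit on
    localist = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

    # Binary distributed (triples): several units on per word; a fixed
    # random binary code stands in here for the triple-derived pattern
    rng = np.random.default_rng(0)
    binary_distributed = {w: rng.integers(0, 2, size=8) for w in vocab}

    # Fully distributed (semantic encoding): graded values on all units
    fully_distributed = {w: rng.random(5).round(2) for w in vocab}

    print(localist["dog"])             # [1. 0. 0. 0.]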

12
Route to Multi-level Architecture
  • Elman SRN showed how word co-occurrence
    information could be used to learn word
    relationships (simple grammar)
  • Learning mapped from the previous words (the
    context) to the next word predicted
  • Even with a sub-word distributed representation,
    prediction is still of the next word

13
Elman (1990) Clustering Results
14
Sub-word prediction
  • If we use a sliding window on the input text
    (e.g. five letters for three-letter triples),
    then we are predicting the next triple from the
    previous triples: true sub-word prediction
  • e.g. The dog chased the cat...
  • Time 1 - The_d: The, he_, e_d (predict _do)
  • Time 2 - he_do: he_, e_d, _do
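
A sketch of the sliding-window extraction in Python follows; pairing each window with the next incoming triple as the prediction target is one reading of the scheme above.

    def triples(window):
        """All three-letter triples inside a character window."""
        return [window[i:i + 3] for i in range(len(window) - 2)]

    text = "The_dog_chased_the_cat"   # word boundaries written as '_'
    W = 5                             # a five-letter window holds three triples

    for t in range(len(text) - W - 1):
        window = text[t:t + W]
        target = text[t + 3:t + 6]    # the next triple, to be predicted
        print(f"Time {t + 1}: {window} -> {triples(window)}, predict {target}")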

15
Sub-word Advantages
  • Richer representations, accessing more of the
    data inherent in the text or speech stream
  • Makes prediction/internal representation easier
  • Eliminates the need for artificial
    pre-processing of text into word vectors;
    letters are translated into triple vectors
    automatically

16
Sub-word Disadvantages
  • Cannot output words easily; we have only a
    collection of triples
  • Must stack a clean-up net on top in order to
    reach word representations from the existing
    triple representations
  • Hence the multi-layer approach: combine
    prediction at two time-scales and levels of
    granularity, using the same method (sketched
    below)

17
Multi-layer SRN Diagram
18
Multi-layer SRN
  • Triples/letters level: Input Layer 1, Hidden
    Layer 1, Context Layer 1, Output Layer 1
  • Learns to predict triples/phonemes
  • Word level: Input Layer 2 (= Hidden Layer 1),
    Hidden Layer 2, Context Layer 2, Output Layer 2
  • Predicts words from triples/phonemes
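
A structural sketch of the two-level forward pass in Python/NumPy, assuming, as listed above, that Input Layer 2 is Hidden Layer 1; all layer sizes are placeholders.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    n_triple, n_hid1, n_hid2, n_word = 8, 16, 20, 12   # placeholder sizes
    rng = np.random.default_rng(1)
    init = lambda rows, cols: rng.normal(0, 0.1, (rows, cols))
    W1_ih, W1_ch, W1_ho = init(n_hid1, n_triple), init(n_hid1, n_hid1), init(n_triple, n_hid1)
    W2_ih, W2_ch, W2_ho = init(n_hid2, n_hid1), init(n_hid2, n_hid2), init(n_word, n_hid2)
    ctx1, ctx2 = np.zeros(n_hid1), np.zeros(n_hid2)

    def forward(triple_vec):
        """One time step through both levels of the stacked SRN."""
        global ctx1, ctx2
        # Level 1: predicts the next triple/phoneme from the triple stream
        hid1 = sigmoid(W1_ih @ triple_vec + W1_ch @ ctx1)
        next_triple = sigmoid(W1_ho @ hid1)
        # Level 2: Input Layer 2 = Hidden Layer 1; predicts the next word
        hid2 = sigmoid(W2_ih @ hid1 + W2_ch @ ctx2)
        next_word = sigmoid(W2_ho @ hid2)
        ctx1, ctx2 = hid1.copy(), hid2.copy()   # contexts copy hidden units
        return next_triple, next_word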
