Using CTW as a language modeler in Dasher


1
Using CTW as a language modeler in Dasher

Martijn van Veen
05-02-2007
Signal Processing Group
Department of Electrical Engineering
Eindhoven University of Technology

2
Overview

What is Dasher
And what is a language model
What is CTW
And how to implement it in Dasher
Decreasing the model costs
Conclusions and future work

3
Dasher

Text input method
Continuous gestures
Language model
Let's give it a try!

Dasher
4
Dasher Language Model

Conditional probability for each alphabet symbol,
given the previous symbols
Similar to compression methods
Requirements
Sequential
Fast
Adaptive
Model is trained
Better compression -> faster text input
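The link between compression and prediction on this slide can be made concrete with a small sketch: a sequential model that assigns conditional probability p to the symbol that actually occurs spends about -log2(p) bits on it, so better predictions mean fewer bits. The `predict` callback and the `uniform_model` stand-in below are hypothetical names used only for illustration; they are not part of Dasher.

```python
import math

def code_length_bits(text, predict):
    """Ideal code length of `text` under a sequential model.

    `predict(context)` is a hypothetical callback (an assumption of this
    sketch) returning a dict: symbol -> conditional probability given the
    symbols seen so far.
    """
    total = 0.0
    for i, symbol in enumerate(text):
        p = predict(text[:i])[symbol]
        total += -math.log2(p)  # cost in bits of this symbol
    return total

def uniform_model(context, alphabet="ab"):
    """Stand-in model: ignores the context, every symbol equally likely."""
    return {s: 1.0 / len(alphabet) for s in alphabet}

bits = code_length_bits("abba", uniform_model)
bits_per_symbol = bits / len("abba")
```

With a uniform model over two symbols every symbol costs exactly one bit; a trained, adaptive model aims to push the average below that.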

5
Dasher Language model

PPM: Prediction by Partial Match
Predictions by models of different order
Weight factor for each model
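As a rough illustration of combining predictions from models of different order with a weight per model, the sketch below mixes order-0 to order-2 frequency estimates linearly. This is a deliberately simplified stand-in: real PPM implementations, including the one in Dasher, derive the weights from an escape mechanism, and all function names and the fixed weights here are assumptions of the example.

```python
from collections import Counter, defaultdict

def train_counts(text, max_order):
    """counts[k][context][symbol] = occurrences of symbol after a k-symbol context."""
    counts = [defaultdict(Counter) for _ in range(max_order + 1)]
    for i, symbol in enumerate(text):
        for k in range(max_order + 1):
            if i >= k:
                counts[k][text[i - k:i]][symbol] += 1
    return counts

def blended_probability(counts, context, symbol, alphabet, weights):
    """Weighted mix of the order-k estimates; weights are assumed to sum to 1."""
    uniform = 1.0 / len(alphabet)
    p = 0.0
    for k, w in enumerate(weights):
        seen = None
        if k <= len(context):
            ctx = context[len(context) - k:]
            seen = counts[k].get(ctx)
        if seen:
            p_k = seen[symbol] / sum(seen.values())
        else:
            p_k = uniform  # context never seen: fall back to a uniform guess
        p += w * p_k
    return p

text = "abracadabra"
alphabet = sorted(set(text))
counts = train_counts(text, max_order=2)
weights = [0.2, 0.3, 0.5]  # hypothetical fixed weights for orders 0, 1, 2
p_r = blended_probability(counts, "ab", "r", alphabet, weights)
total = sum(blended_probability(counts, "ab", s, alphabet, weights) for s in alphabet)
```

In the training text "r" always follows "b" and "ab", so the higher-order models dominate the mix, and the blended values still sum to one over the alphabet.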

6
Dasher Language model

Asymptotically, PPM reduces to a fixed-order context model
But the incomplete model works better!

7
CTW Tree model

Source structure in the model, parameters memoryless

KT estimator
a: number of zeros
b: number of ones
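For reference, the Krichevsky-Trofimov (KT) estimator predicts the next bit from the counts a and b as P(next = 0) = (a + 1/2) / (a + b + 1) and P(next = 1) = (b + 1/2) / (a + b + 1). A minimal sketch with exact fractions follows; the function names are chosen for this example only.

```python
from fractions import Fraction

def kt_next_probability(a, b, bit):
    """KT probability that the next bit equals `bit` after a zeros and b ones."""
    count = b if bit == 1 else a
    # (count + 1/2) / (a + b + 1), written with integer numerator and denominator
    return Fraction(2 * count + 1, 2 * (a + b + 1))

def kt_block_probability(bits):
    """Probability the KT estimator assigns to a whole bit sequence."""
    a = b = 0
    p = Fraction(1)
    for bit in bits:
        p *= kt_next_probability(a, b, bit)
        if bit:
            b += 1
        else:
            a += 1
    return p

p_01 = kt_block_probability([0, 1])  # one zero, one one
p_00 = kt_block_probability([0, 0])  # two zeros
```

The block probability depends only on the counts, not on the order of the bits, which is why a node only needs to store a and b.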

8
CTW Context tree

Context-Tree Weighting: combine all possible tree models up to a maximum depth
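The weighting can be sketched with the standard binary CTW recursion: a node at maximum depth uses its KT estimate, and every other node averages its own KT estimate with the product of its two children's weighted probabilities. The code below recomputes everything from scratch for a short sequence, which is far from how an efficient implementation works; it is only meant to show the recursion, and the names are assumptions of this example.

```python
from fractions import Fraction

def kt_block(a, b):
    """KT block probability of any sequence with a zeros and b ones."""
    p = Fraction(1)
    for i in range(a):
        p *= Fraction(2 * i + 1, 2 * (i + 1))
    for j in range(b):
        p *= Fraction(2 * j + 1, 2 * (a + j + 1))
    return p

def ctw_probability(bits, past, depth):
    """Weighted probability of `bits`, given at least `depth` earlier bits in `past`."""
    if len(past) < depth:
        raise ValueError("need at least `depth` past bits")
    history = list(past) + list(bits)
    start = len(past)

    def weighted(context):
        # `context` lists the preceding bits, most recent first.
        a = b = 0
        for t in range(start, len(history)):
            recent = tuple(history[t - 1 - k] for k in range(len(context)))
            if recent == context:
                if history[t]:
                    b += 1
                else:
                    a += 1
        p_e = kt_block(a, b)
        if len(context) == depth or a + b == 0:
            return p_e  # leaf of the context tree, or a node that saw no data
        children = weighted(context + (0,)) * weighted(context + (1,))
        return Fraction(1, 2) * p_e + Fraction(1, 2) * children

    return weighted(())

p = ctw_probability([1, 0], past=[0], depth=1)
```

Summing `ctw_probability` over all bit sequences of a given length yields exactly one, so the root probability is a valid coding distribution.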

9
CTW tree update
10
CTW Implementation

Current implementation
Ratio of block probabilities stored in each node
Efficient but patented
Develop a new implementation
Use only integer arithmetic, avoid divisions
Represent both block probabilities as fractions
Ensure denominators equal by cross-multiplication
Store the numerators, scale if necessary
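One possible reading of the last four bullets is sketched below: two probabilities share an implicit common denominator, so only their integer numerators are stored; multiplying each by its own fraction is done by cross-multiplication, and both numerators are shifted right together when they grow too large. This is an illustration of the idea under those assumptions, not the implementation described in the talk; `MAX_BITS` and both function names are hypothetical.

```python
MAX_BITS = 32  # hypothetical limit on the size of a stored numerator

def rescale(n1, n2, max_bits=MAX_BITS):
    """Shift both numerators right by the same amount so the larger fits in max_bits."""
    excess = max(n1.bit_length(), n2.bit_length()) - max_bits
    if excess > 0:
        n1 >>= excess
        n2 >>= excess
    # Clamp to 1 so neither value becomes zero (this distorts very small ratios).
    return max(n1, 1), max(n2, 1)

def multiply_pair(n1, n2, num1, den1, num2, den2):
    """Multiply n1/D by num1/den1 and n2/D by num2/den2 with a shared denominator.

    Cross-multiplication gives n1*num1*den2 and n2*num2*den1 over the common
    denominator D*den1*den2, which never has to be stored or divided by.
    """
    return rescale(n1 * num1 * den2, n2 * num2 * den1)

n1, n2 = multiply_pair(1, 1, 1, 2, 3, 4)  # (1/2) versus (3/4)
big1, big2 = rescale(1 << 40, 1 << 39)
```

Because only the ratio of the two numerators matters in this scheme, the shared denominator can be dropped and only multiplications, shifts, and comparisons remain.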

11
CTW for Text

Binary decomposition
Adjust zero-order estimator
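Binary decomposition turns the prediction of one text symbol into a chain of binary decisions, each handled by its own binary model. The sketch below assumes a fixed-length, most-significant-bit-first decomposition, which the slide does not specify; `bit_probability` is a hypothetical callback standing in for the per-node binary models.

```python
def decompose(symbol, bits=8):
    """Yield (node, bit) pairs for one symbol, most significant bit first.

    `node` numbers the position in the decomposition tree (root = 1) and so
    identifies which binary model predicts this bit.
    """
    node = 1
    for shift in range(bits - 1, -1, -1):
        bit = (symbol >> shift) & 1
        yield node, bit
        node = 2 * node + bit

def symbol_probability(symbol, bit_probability, bits=8):
    """Probability of a symbol as the product of its binary decisions.

    `bit_probability(node)` is assumed to return P(bit = 1) at that node.
    """
    p = 1.0
    for node, bit in decompose(symbol, bits):
        p1 = bit_probability(node)
        p *= p1 if bit else 1.0 - p1
    return p

steps = list(decompose(0b101, bits=3))
p_uniform = symbol_probability(ord("a"), lambda node: 0.5)
total = sum(symbol_probability(s, lambda node: 0.25, bits=3) for s in range(8))
```

Whatever the individual binary models predict, the symbol probabilities obtained this way sum to one over the alphabet.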

12
Results

Comparing PPM and CTW language models

Single file

Input file   CTW     PPM     Difference (%)
Book 2       1.979   2.177    9.10
NL           2.364   2.510    5.82

Model trained with English text

Input file   CTW     PPM     Difference (%)
Book 2       2.632   2.876    8.48
NL           4.356   5.014   13.12

Model trained with English text and user input

Input file   CTW     PPM     Difference (%)
GB           2.847   3.051    6.69
Book 2       2.380   2.543    6.41
Book 2       2.295   2.448    6.25
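The Difference column is consistent with the relative gain of CTW over PPM, expressed as a percentage of the PPM figure. A short check under that assumption (the function name is made up for this example):

```python
def relative_gain_percent(ctw, ppm):
    """Reduction of CTW relative to PPM, as a percentage of the PPM value."""
    return 100.0 * (ppm - ctw) / ppm

book2_single = relative_gain_percent(1.979, 2.177)
nl_single = relative_gain_percent(2.364, 2.510)
```

Rounded to two decimals these reproduce the first two rows of the results (9.10 and 5.82).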
13
CTW Model costs

What are model costs?

14
CTW Model costs

Actual model and alphabet size fixed -> optimize weight factor alpha
Per tree -> not enough parameters
Per node -> not enough adaptivity
Optimize alpha per depth of the tree
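To make the role of the weight factor concrete: standard CTW mixes a node's own estimate and the product of its children with weight 1/2 each, and a per-depth alpha generalizes that fixed 1/2. The sketch assumes alpha is the weight on the node's own estimate; the slide does not state which term alpha multiplies, and the function name and numbers are illustrative only.

```python
def weighted_probability(p_e, p_children, depth, alphas):
    """Mix a node's own estimate with the product of its children's probabilities.

    alphas[depth] is the weight on the node's own estimate p_e
    (an assumed convention for this sketch).
    """
    alpha = alphas[depth]
    return alpha * p_e + (1.0 - alpha) * p_children

# Standard CTW: alpha = 0.5 at every depth.
standard = weighted_probability(0.125, 0.25, depth=0, alphas=[0.5, 0.5])
# A smaller alpha at depth 0 shifts weight toward the deeper models.
tuned = weighted_probability(0.125, 0.25, depth=0, alphas=[0.2, 0.5])
```

When the deeper models explain the data better, lowering alpha at that depth raises the weighted probability, which is the sense in which tuning alpha can reduce model costs.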

15
CTW Model costs

Exclusion: only use Betas of the actual model
Iterative process
Convergent?
Approximation: to find the actual model, use alpha = 0.5

16
CTW Model costs

Compression of an input sequence
Model costs are significant, especially for short sequences
No decrease by optimizing alpha per depth?

Symbols   Alpha = 0.5   Alpha after exclusion   Without model costs
100       5.73          5.21                    4.94
1,000     4.22          4.07                    3.68
10,000    3.12          3.07                    2.77
100,000   2.33          2.32                    2.13
600,000   1.95          1.95                    1.83
17
CTW Model costs

Maximize probability in the root, instead of the probability per depth
Exclusion based on alpha = 0.5 is almost optimal

Symbols   Alpha = 0.5   Alpha after exclusion   Max. probability in root   Without model costs
100       0.8437        0.8117                  0.8113                     0.7022
1,000     0.6236        0.6213                  0.6209                     0.5330
10,000    0.3830        0.3792                  0.3794                     0.3276
100,000   0.2661        0.2652                  0.2647                     0.2389
600,000   0.2248        0.2242                  0.2241                     0.2098
18
CTW Model costs

Results in Dasher scenario

Trained model
Negative effect if no user text is available

Language   Alpha = 0.5   Alpha after exclusion
GB         2.01          2.04
NL         4.34          4.36

Trained with concatenated user text
Small positive effect if user text added to training text, and very similar to it

Language   Alpha = 0.5   Alpha after exclusion
GB         2.30          2.28
NL         4.12          4.13
19
Conclusions