1
Using CTW as a language modeler in Dasher
  • Martijn van Veen
  • 05-02-2007
  • Signal Processing Group
  • Department of Electrical Engineering
  • Eindhoven University of Technology

2
Overview
  • What is Dasher
  • And what is a language model
  • What is CTW
  • And how to implement it in Dasher
  • Decreasing the model costs
  • Conclusions and future work

3
Dasher
  • Text input method
  • Continuous gestures
  • Language model
  • Let's give it a try!

4
Dasher Language Model
  • Conditional probability for each alphabet symbol,
    given the previous symbols
  • Similar to compression methods
  • Requirements
  • Sequential
  • Fast
  • Adaptive
  • Model is trained
  • Better compression → faster text input (see the
    relation below)
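
The standard information-theoretic relation behind the last bullet: the code length a model assigns to a text is

    \ell(x_1^N) \approx \sum_{t=1}^{N} -\log_2 P(x_t \mid x_1^{t-1}) \;\text{bits},

and fewer bits per symbol means the intended symbols get larger regions on the Dasher canvas, so they are reached with shorter gestures.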

5
Dasher Language Model
  • PPM: Prediction by Partial Match
  • Predictions by models of different orders
  • Weight factor for each model (see the mixture sketch
    below)
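
A simplified way to read the weighting bullet (in actual PPM the weights arise implicitly from the escape mechanism, so this explicit mixture view is an illustration, not PPM's literal algorithm):

    P(x_t \mid x_1^{t-1}) \;=\; \sum_{d=0}^{D} w_d \, P_d(x_t \mid x_{t-d}^{t-1}),
    \qquad \sum_{d} w_d = 1.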

6
Dasher Language Model
  • Asymptotically PPM reduces to a fixed-order context
    model
  • But the incomplete model works better!

7
CTW Tree model
  • Source structure in the model; parameters
    memoryless
  • KT estimator (formula below)
  • a: number of zeros
  • b: number of ones
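
The Krichevsky-Trofimov estimator named above, in its standard form:

    P_e(a, b) \;=\; \frac{\prod_{i=0}^{a-1}\bigl(i + \tfrac12\bigr)\,\prod_{j=0}^{b-1}\bigl(j + \tfrac12\bigr)}{(a + b)!},

or sequentially, the probability that the next bit is a zero:

    P(X_{t+1} = 0 \mid a, b) \;=\; \frac{a + \tfrac12}{a + b + 1}.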

8
CTW Context tree
  • Context-Tree Weighting: combine all possible tree
    models up to a maximum depth (recursion below)
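
The standard CTW recursion over a context tree of maximum depth D: every node s mixes its own memoryless estimate with the split into its two child contexts:

    P_w^s \;=\;
    \begin{cases}
      \tfrac12 P_e^s + \tfrac12 P_w^{0s} P_w^{1s} & \text{if } d(s) < D,\\
      P_e^s & \text{if } d(s) = D.
    \end{cases}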

9
CTW tree update
10
CTW Implementation
  • Current implementation
  • Ratio of block probabilities stored in each node
  • Efficient, but patented
  • Develop a new implementation (sketch below)
  • Use only integer arithmetic, avoid divisions
  • Represent both block probabilities as fractions
  • Ensure the denominators are equal by
    cross-multiplication
  • Store the numerators, scale if necessary
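
A minimal sketch of that bookkeeping for a leaf node, under the assumption of a node layout with two counts and two numerators over one shared denominator (names and layout are illustrative, not the actual Dasher or thesis code):

    #include <cstdint>

    struct CtwNode {
        uint32_t a = 0, b = 0;          // counts of zeros and ones seen so far
        uint64_t peNum = 1, pwNum = 1;  // numerators of P_e and P_w
        uint64_t den = 1;               // denominator shared by both

        // KT update for one bit: P_e *= (2*count + 1) / (2*(a + b + 1)).
        // Multiplying everything out keeps both probabilities over one
        // common denominator (the cross-multiplication step), so no
        // integer division is ever performed.
        void updateLeaf(int bit) {
            const uint64_t numFactor = 2ull * (bit ? b : a) + 1;
            const uint64_t denFactor = 2ull * (a + b + 1);
            peNum *= numFactor;
            pwNum *= numFactor;   // at a leaf, P_w equals P_e
            den   *= denFactor;   // an internal node would instead mix in
                                  // the product of its children's P_w here
            bit ? ++b : ++a;
            rescale();
        }

        // Halving numerators and denominator together preserves their
        // ratios, so a shift replaces the scaling division.
        void rescale() {
            while (den >= (1ull << 32)) {
                peNum >>= 1;
                pwNum >>= 1;
                den   >>= 1;
            }
        }
    };

Keeping both numerators over one implicit denominator is what the backup slide at the end refers to with "store the numerators of the block probabilities".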

11
CTW for Text
  • Binary decomposition: map each text symbol to a
    sequence of bits, each predicted by its own context
    tree (sketch below)
  • Adjust the zero-order estimator
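
A sketch of a plain fixed 8-bit decomposition, assuming a hypothetical per-node model type CtwModel (a zero-order KT estimator stands in for a full per-node context tree below, and floating point is used for readability; the talk's own decomposition tree structure differs):

    #include <cstdint>

    // Zero-order KT estimator standing in for a per-node context tree.
    struct CtwModel {
        uint32_t a = 0, b = 0;  // zero / one counts
        double probabilityOfZero() const { return (a + 0.5) / (a + b + 1.0); }
        void   update(int bit)           { bit ? ++b : ++a; }
    };

    // Walk the decomposition tree from the root to the leaf for `symbol`;
    // the product of the per-bit probabilities is the symbol probability.
    // A depth-8 binary tree has 255 internal nodes, one model each.
    double predictAndUpdate(uint8_t symbol, CtwModel (&nodes)[255]) {
        double p = 1.0;
        unsigned node = 0;                 // root of the decomposition tree
        for (int i = 7; i >= 0; --i) {
            const int bit = (symbol >> i) & 1;
            const double p0 = nodes[node].probabilityOfZero();
            p *= bit ? (1.0 - p0) : p0;
            nodes[node].update(bit);
            node = 2 * node + 1 + bit;     // descend to child 0 or 1
        }
        return p;
    }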

12
Results
  • Comparing PPM and CTW language models
  • Three scenarios: a single file; a model trained with
    English text; a model trained with English text and
    user input
  • Rates in bits/symbol, difference in percent

Single file:
Input file   CTW     PPM     Difference (%)
Book 2       1.979   2.177   9.10
NL           2.364   2.510   5.82

Model trained with English text:
Input file   CTW     PPM     Difference (%)
Book 2       2.632   2.876   8.48
NL           4.356   5.014   13.12

Model trained with English text and user input:
Input file   CTW     PPM     Difference (%)
GB           2.847   3.051   6.69
Book 2       2.380   2.543   6.41
Book 2       2.295   2.448   6.25
13
CTW Model costs
  • What are model costs?
  • Roughly: the extra code length CTW pays, relative to
    the best single tree model in hindsight, for learning
    the model structure

14
CTW Model costs
  • Actual model and alphabet size fixed → optimize the
    weight factor alpha
  • Per tree → not enough parameters
  • Per node → not enough adaptivity
  • Optimize alpha per depth of the tree (recursion below)
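
With a weight factor alpha in place of the fixed 1/2, the recursion from the context-tree slide becomes (per-depth alpha; notation assumed for illustration):

    P_w^s \;=\; \alpha_{d(s)}\, P_e^s \;+\; \bigl(1 - \alpha_{d(s)}\bigr) P_w^{0s} P_w^{1s}.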

15
CTW Model costs
  • Exclusion: only use the betas of the actual model
  • Iterative process
  • Convergent?
  • Approximation: to find the actual model, use
    alpha = 0.5

16
CTW Model costs
  • Compression of an input sequence (bits/symbol)
  • Model costs significant, especially for short
    sequences
  • No decrease by optimizing alpha per depth?

Symbols   Alpha = 0.5   Alpha after exclusion   Without model costs
100       5.73          5.21                    4.94
1,000     4.22          4.07                    3.68
10,000    3.12          3.07                    2.77
100,000   2.33          2.32                    2.13
600,000   1.95          1.95                    1.83
17
CTW Model costs
  • Maximize the probability in the root,
    instead of the probability per depth
  • Exclusion based on alpha = 0.5 is almost optimal

Symbols   Alpha = 0.5   Alpha after exclusion   Max. probability in root   Without model costs
100       0.8437        0.8117                  0.8113                     0.7022
1,000     0.6236        0.6213                  0.6209                     0.5330
10,000    0.3830        0.3792                  0.3794                     0.3276
100,000   0.2661        0.2652                  0.2647                     0.2389
600,000   0.2248        0.2242                  0.2241                     0.2098
18
CTW Model costs
  • Results in the Dasher scenario
  • Trained model: negative effect if no user text is
    available
  • Trained with concatenated user text: small positive
    effect if user text is added to the training text and
    is very similar to it

Trained model (no user text):
Language   Alpha = 0.5   Alpha after exclusion
GB         2.01          2.04
NL         4.34          4.36

Trained with concatenated user text:
Language   Alpha = 0.5   Alpha after exclusion
GB         2.30          2.28
NL         4.12          4.13
19
Conclusions
  • New CTW implementation
  • Only integer arithmetic
  • Avoids patented techniques
  • New decomposition tree structure
  • Dasher language model based on CTW
  • Predictions about 6 percent more accurate than
    PPM-D
  • Decreasing the model costs
  • Only an insignificant decrease possible with our
    method

20
Future work
  • Make CTW suitable for MobileDasher
  • Decrease memory usage
  • Decrease the number of computations
  • Combine language models
  • Select the locally best model, or weight models
    together
  • Combine languages in one model
  • Do the models differ in structure or in parameters?

21
  • Thank you for your attention
  • Ask away!

22
CTW Implementation
  • Store the numerators of the block probabilities