Title: Improving Chinese Handwriting Recognition by Fusing Speech Recognition
- Zhang Xi-Wen
- CSE, CUHK and HCI Lab., ISCAS
- 2005.4.12
 
Outline
- 1 Chinese handwriting recognition
- 2 Chinese speech recognition
- 3 Information fusion
- 4 Experimental results
 
Handwriting recognition
- Handwriting segmentation
- Character recognition
 
1.1 Handwriting segmentation
- Segmentation is more difficult for Chinese handwriting.
Character extraction using histogram
- A histogram of between-stroke gaps is built.
- The dimidiate threshold of the histogram is used to extract lines of strokes.
- The dimidiate threshold of the histogram of a line of strokes is used to extract characters.
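The thresholding steps above can be sketched in Python as follows. This is a minimal sketch, not the method from the slides: it assumes each stroke is reduced to a (start, end) extent along one axis, and it reads the "dimidiate threshold" as a cut that splits the gaps into two groups, placed here at the midpoint of the widest jump between sorted gap values. The names `gaps_between`, `split_threshold`, and `group_strokes` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) extent of one stroke along one axis


def gaps_between(strokes: List[Interval]) -> List[float]:
    """Gap between each stroke and the running extent of the strokes before it."""
    gaps = []
    reach = strokes[0][1]
    for start, end in strokes[1:]:
        gaps.append(max(0.0, start - reach))  # overlapping strokes give a gap of 0
        reach = max(reach, end)
    return gaps


def split_threshold(gaps: List[float]) -> float:
    """Assumed rule: cut at the midpoint of the widest jump between sorted gaps."""
    ordered = sorted(gaps)
    jumps = [(b - a, (a + b) / 2.0) for a, b in zip(ordered, ordered[1:])]
    if not jumps:
        return float("inf")  # fewer than two gaps: nothing to separate
    return max(jumps)[1]


def group_strokes(strokes: List[Interval]) -> List[List[Interval]]:
    """Start a new group wherever the gap exceeds the threshold."""
    strokes = sorted(strokes)
    if len(strokes) < 2:
        return [strokes] if strokes else []
    gaps = gaps_between(strokes)
    threshold = split_threshold(gaps)
    groups = [[strokes[0]]]
    for stroke, gap in zip(strokes[1:], gaps):
        if gap > threshold:
            groups.append([stroke])
        else:
            groups[-1].append(stroke)
    return groups


# Made-up horizontal extents of five strokes in one line of handwriting.
strokes = [(0.0, 2.0), (2.5, 4.0), (9.0, 11.0), (11.5, 13.0), (18.0, 20.0)]
groups = group_strokes(strokes)
print(len(groups))  # 3
```

Under these assumptions the same routine would serve both steps: vertical extents to extract lines of strokes, then horizontal extents within one line to extract characters.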
Figure 1. Handwriting segmentation
Remaining problems
- A Chinese character may be mis-segmented into several characters.
- Several Chinese characters may be mis-grouped as a single character.
- Segmentation errors inevitably result in handwriting recognition errors.
1.2 Character recognition
- Isolated character recognizer from HW
- Many candidates
 
Figure 2. Handwriting recognition. Panels: the handwriting; the text recognized from the handwriting; the ground-truth text.
2 Speech recognition
- Chinese speech.
- On-line, microphone.
- Continuous speech recognizer from MS.
 
Figure 3. Speech recognition. Panels: the text recognized from the speech corresponding to the handwriting; the ground-truth text.
3 Text fusion
- An optimization problem
- Dynamic programming
 
3.1 Principles
- The fused text should contain more semantic information.
- Construct a text with the fewest characters and the most semantic information.
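One way to read this principle as an optimization objective is sketched below. It is an illustrative assumption, not a formula from the slides: $\mathcal{C}$ stands for the set of candidate fused texts, $S(T)$ for a hypothetical semantic score, $\lvert T \rvert$ for the number of characters in $T$, and $\lambda \ge 0$ for a hypothetical weight on length.

```latex
\hat{T} = \arg\max_{T \in \mathcal{C}} \bigl[\, S(T) - \lambda \lvert T \rvert \,\bigr]
```

A larger $\lambda$ favors shorter texts; $\lambda = 0$ keeps only the semantic term.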
3.2 Four ways
Figure 4. Texts to be fused. Panels: the text recognized from the handwriting; the text recognized from the speech corresponding to the handwriting.
3.3 Dynamic programming
- A directed graph.
- Optimal paths.
 
Figure 5. A directed graph with N levels.
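Finding an optimal path through a directed graph with N levels can be sketched with a Viterbi-style dynamic program. This is a minimal sketch under stated assumptions, not the scoring used in the slides: each level is taken to hold candidate characters, every node of one level is assumed to connect to every node of the next, and `node_score`, `edge_score`, the toy `lexicon`, and the candidate lists are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def best_path(
    levels: Sequence[Sequence[str]],
    node_score: Callable[[int, str], float],
    edge_score: Callable[[str, str], float],
) -> Tuple[float, List[str]]:
    """Return the highest-scoring path that takes one node from each level."""
    if not levels or any(len(level) == 0 for level in levels):
        raise ValueError("every level needs at least one candidate")

    # best[j]: best total score of any path ending at candidate j of the current level
    best = [node_score(0, cand) for cand in levels[0]]
    back: List[List[int]] = []  # back[k-1][j]: predecessor index chosen for node j of level k

    for k in range(1, len(levels)):
        new_best, pointers = [], []
        for cand in levels[k]:
            scores = [
                best[i] + edge_score(prev, cand)
                for i, prev in enumerate(levels[k - 1])
            ]
            i_best = max(range(len(scores)), key=scores.__getitem__)
            new_best.append(scores[i_best] + node_score(k, cand))
            pointers.append(i_best)
        best = new_best
        back.append(pointers)

    # Trace the optimal path backwards from the best node of the last level.
    j = max(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [levels[-1][j]]
    for k in range(len(levels) - 1, 0, -1):
        j = back[k - 1][j]
        path.append(levels[k - 1][j])
    path.reverse()
    return total, path


# Made-up toy data: four levels of candidate characters and a two-word lexicon.
levels = [["手", "于"], ["写", "与"], ["识", "只"], ["别", "另"]]
lexicon = {"手写", "识别"}

total, path = best_path(
    levels,
    node_score=lambda level, cand: 0.0,  # assumed: no per-candidate confidence
    edge_score=lambda prev, cand: 1.0 if prev + cand in lexicon else 0.0,
)
print("".join(path), total)  # 手写识别 2.0
```

The work per level is proportional to the product of adjacent level sizes, so the search stays cheap even though the number of distinct paths grows exponentially with N.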
(a) Text recognized from the handwriting.
(b) Text recognized from the speech corresponding to the handwriting.
(c) The optimal fused text corresponding to the optimal path.
(d) The ground-truth text.
Figure 6. Text fusion using DP.
3.4 A language model
Lexicon
4 Experimental results
- Thank you very much for your criticism, comments and suggestions!
- Email: xwzhang_at_cse.cuhk.edu.hk
- Tel: 3163-4260