Title: CALO Speed-Up Task Progress Report in January
1 CALO Speed-Up Task: Progress Report in January
- Arthur Chan, Jahanzeb Sherwani
- Carnegie Mellon University
- Feb 2, 2004
2 Overview of S3 and S3.3: Computations at every frame
3 Current System Specifications (without Gaussian Selection)
| | Sphinx 3 | Sphinx 3.3 |
|---|---|---|
| Speed on P4-1G (tested on the Communicator task) | 14xRT (11xRT GMM, 3xRT search) | 7xRT without SVQ (6xRT GMM, 1xRT search) |
| GMM computation | Not optimized (few code optimizations) | Can apply sub-VQ-based Gaussian selection |
| Lexicon | Flat | Tree |
| Search | Beam on search, no beam on GMM | Beam on search, beam on GMM |
4 Our Plan in Q1: upgrade s3.3 to s3.4
- Fast GMM Computation
  - 4 levels of optimization
  - Combination of multiple methods in Gaussian selection
- Phoneme look-ahead
  - Reduction of search space by determining the active phoneme list at word-begin.
- Other features
  - Multiple and dynamic LM
  - Integration with end-pointing.
  - APIs of the recognizer
- All implemented in S3.3
5 Fast GMM Computation Level 1: Frame Selection
- Compute GMM scores in every other frame only.
- Improvement: compute GMM scores only if the current frame is not similar to the previous frame.
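The two frame-selection strategies on this slide can be sketched as follows. This is a minimal illustration, not the Sphinx code: the function names, the Euclidean distance measure, and the `threshold` parameter are all assumptions, and the sketch compares each frame with the last frame that was actually computed (rather than the immediately preceding one) so that slow drift cannot go unnoticed.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature vectors (assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def every_other_frame(n_frames):
    """Simple down-sampling: compute GMM scores on even-numbered frames only."""
    return [t % 2 == 0 for t in range(n_frames)]


def select_frames(frames, threshold):
    """Return one boolean per frame: True where GMM scores should be recomputed.

    A frame is recomputed when it is the first frame, or when it differs from
    the last computed frame by more than `threshold` (a hypothetical tuning
    parameter). Otherwise the scores of the last computed frame are reused.
    """
    decisions = []
    last_computed = None
    for frame in frames:
        if last_computed is None or frame_distance(frame, last_computed) > threshold:
            decisions.append(True)
            last_computed = frame
        else:
            decisions.append(False)
    return decisions


if __name__ == "__main__":
    frames = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]]
    print(every_other_frame(len(frames)))
    print(select_frames(frames, threshold=1.0))
```

Skipped frames would reuse the most recently computed scores, so the saving is proportional to the fraction of `False` entries.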
6 Fast GMM Computation Level 2: GMM Selection
- Compute a GMM only when its base phone is highly likely.
- Other GMMs are backed off to the base-phone scores.
- Used by Microsoft and Akinobu (1999).
- Known problem: can increase the load of the forward search.
[Figure label: GMM]
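The back-off idea above can be sketched as follows, assuming that base-phone (context-independent) scores are already available for the current frame. The function name, the dictionary layout, and the log-domain `beam` are hypothetical choices for illustration and are not taken from the Sphinx code.

```python
def select_gmm_scores(ci_scores, cd_to_ci, compute_cd_score, beam):
    """Score context-dependent (CD) GMMs only for likely base phones.

    ci_scores:        dict mapping base phone -> log-likelihood in this frame
    cd_to_ci:         dict mapping CD model name -> its base phone
    compute_cd_score: callable(cd_name) -> log-likelihood (the expensive step)
    beam:             non-negative width in the log domain (assumed parameter)

    Returns (scores, n_computed), where scores maps every CD model to either
    its exact score or the backed-off score of its base phone.
    """
    best = max(ci_scores.values())
    scores = {}
    n_computed = 0
    for cd_name, base in cd_to_ci.items():
        if ci_scores[base] >= best - beam:
            scores[cd_name] = compute_cd_score(cd_name)  # exact computation
            n_computed += 1
        else:
            scores[cd_name] = ci_scores[base]  # back off to the base phone
    return scores, n_computed


if __name__ == "__main__":
    ci = {"AA": -10.0, "B": -50.0}
    mapping = {"AA(B,T)": "AA", "B(AA,AA)": "B"}
    print(select_gmm_scores(ci, mapping, lambda name: -12.0, beam=20.0))
```

Because backed-off models all share their base phone's score, more hypotheses can tie in the search, which is one way the extra search load mentioned on the slide could arise.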
7 Fast GMM Computation Level 3: Gaussian Selection
- Compute Gaussian distributions only when they are in the neighborhood of the feature vector (Bocchieri 93).
- Refinement (Knill and Gales 96).
- Combination with other methods may be useful.
[Figure labels: Gaussian, GMM]
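A much simplified sketch of VQ-based Gaussian selection follows. It assumes a small codebook, diagonal covariances, and an unweighted Euclidean neighborhood with a hypothetical `radius` parameter; the published methods use more careful distance measures and assign a floor score to unselected Gaussians, which this sketch omits.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_shortlists(codewords, means, radius):
    """Offline step: for each codeword, list the Gaussians whose mean is within `radius`."""
    return [
        [g for g, mean in enumerate(means) if sq_dist(cw, mean) <= radius ** 2]
        for cw in codewords
    ]


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def selected_scores(x, codewords, shortlists, means, variances):
    """Run-time step: quantize x, then score only the Gaussians on that codeword's shortlist."""
    nearest = min(range(len(codewords)), key=lambda c: sq_dist(x, codewords[c]))
    return {g: log_gaussian(x, means[g], variances[g]) for g in shortlists[nearest]}


if __name__ == "__main__":
    codewords = [[0.0, 0.0], [10.0, 10.0]]
    means = [[0.0, 1.0], [9.0, 10.0], [0.5, 0.0]]
    variances = [[1.0, 1.0]] * 3
    shortlists = build_shortlists(codewords, means, radius=2.0)
    print(shortlists)
    print(selected_scores([0.2, 0.1], codewords, shortlists, means, variances))
```

The shortlists are built once offline, so the per-frame cost is one nearest-codeword search plus the exact scores of the shortlisted Gaussians.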
8 Fast GMM Computation Level 4: Sub-vector quantization
- (Ravi 98) Clustering of sub-vectors using sub-vector quantization.
- Only compute the scores of the sub-vector codewords.
- Gives an approximate score of a Gaussian.
- In S3.3, it is currently used as a way to do Gaussian selection.
[Figure labels: Gaussian, Feature Component]
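The sub-vector quantization idea can be sketched as follows, under several assumptions: the feature vector is split into fixed groups of dimensions, each group has its own small codebook of diagonal Gaussians, and every full Gaussian is mapped to one codeword per group. All names and the `beam` parameter are illustrative and do not reflect the S3.3 data structures.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def svq_approx_scores(x, subvectors, codebooks, gauss_to_codeword):
    """Approximate every Gaussian's log score from per-sub-vector codeword scores.

    subvectors:        list of dimension-index lists, e.g. [[0, 1], [2]]
    codebooks:         codebooks[s] is a list of (mean, var) codewords for sub-vector s
    gauss_to_codeword: gauss_to_codeword[g][s] is Gaussian g's codeword index in sub-vector s
    """
    # Score each codeword once per frame; this table is shared by all Gaussians.
    table = []
    for dims, codebook in zip(subvectors, codebooks):
        sub_x = [x[d] for d in dims]
        table.append([log_gaussian(sub_x, mean, var) for mean, var in codebook])
    # A Gaussian's approximate score is the sum of its codewords' scores.
    return [
        sum(table[s][cw] for s, cw in enumerate(mapping))
        for mapping in gauss_to_codeword
    ]


def select_gaussians(approx_scores, beam):
    """Keep Gaussians whose approximate score is within `beam` of the best one."""
    best = max(approx_scores)
    return [g for g, score in enumerate(approx_scores) if score >= best - beam]


if __name__ == "__main__":
    subvectors = [[0], [1]]
    codebook = [([0.0], [1.0]), ([5.0], [1.0])]
    approx = svq_approx_scores([0.0, 0.0], subvectors, [codebook, codebook],
                               [[0, 0], [1, 1], [0, 1]])
    print(approx)
    print(select_gaussians(approx, beam=15.0))
```

Used this way, the approximate scores act only as a filter: the Gaussians returned by `select_gaussians` would then be scored exactly, which matches the slide's remark that SVQ currently serves as a form of Gaussian selection.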
9 So far
- Progress: S3 (100%), s3.3 (50%)
- Frame, GMM and Gaussian levels of optimization completed.
- Baseline (BL): 17.1% Err., 11xRT for GS
- Cautious optimization
  - 17.5% Err., 2.6xRT GS (75% reduction, 5% degradation)
- Aggressive optimization
  - 19.8% Err., 0.8xRT GS (90% reduction, 20% degradation)
- Should be good enough when ported to s3.3
10 Fast Match in Search
- Compute the most likely phones in a frame.
- Search is carried out only when the first phone is a likely one.
- Complementary to GMM selection.
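A minimal sketch of this fast-match step follows. It assumes per-frame base-phone log-likelihoods are available for the current frame and, optionally, a few look-ahead frames; taking each phone's best score over that window, the `beam` parameter, and the toy lexicon are all assumptions made for illustration.

```python
def active_phones(ci_scores_window, beam):
    """Return the set of phones considered likely at the current frame.

    ci_scores_window: list of dicts (phone -> log-likelihood); the first entry
                      is the current frame, the rest are look-ahead frames
    beam:             non-negative width in the log domain (assumed parameter)
    """
    phones = ci_scores_window[0].keys()
    # Assumption: a phone's look-ahead score is its best score within the window.
    window_best = {p: max(frame[p] for frame in ci_scores_window) for p in phones}
    best = max(window_best.values())
    return {p for p, score in window_best.items() if score >= best - beam}


def words_to_enter(lexicon, active):
    """Allow a word to start only if its first phone is in the active list."""
    return [word for word, phones in lexicon.items() if phones[0] in active]


if __name__ == "__main__":
    window = [
        {"K": -5.0, "AE": -40.0, "T": -30.0},
        {"K": -8.0, "AE": -6.0, "T": -35.0},
    ]
    lexicon = {"cat": ["K", "AE", "T"], "at": ["AE", "T"], "tack": ["T", "AE", "K"]}
    active = active_phones(window, beam=10.0)
    print(sorted(active))
    print(words_to_enter(lexicon, active))
```

With a single-frame window this reduces to picking the likely phones of the current frame, as stated on the slide; a longer window is one possible way to realize phoneme look-ahead.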
11 Outlook in February
- Porting of frame, GMM and Gaussian levels of optimization to s3.3 (started)
- Integration with feature level of optimization.
- Phoneme look-ahead in s3.3
- Implement Multi-LM and Dynamic LM