Speech tools

1
Speech tools
  • Jean-Philippe Goldman
  • 03.03.2004

2
Two questions
  • What kind of data?
  • Which task?

3
What kind of data?
  • Speech content (noise, multiple voices, ...)
  • Data file: sound / transcription / pitch curve
  • Sampling/quantization: 16k, 12k, 8k, 4k; 8 bit
  • Size: 16 kHz at 16 bit = 256 kbps ≈ 1.9 MB/min ≈ 115 MB/h
  • Format
  • Sound: wav, wma, mp3, ogg, aiff, aifc, au, vox,
    raw, sd, CSL, Ogg/Vorbis, NIST/Sphere
  • Transcription: HTK, TIMIT, TextGrid, Phondat
  • Number of files
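
The size figures on this slide follow directly from sample rate times bit depth. A minimal Python sketch of that arithmetic, assuming uncompressed mono PCM and decimal megabytes (the function name `pcm_storage` is made up for this illustration):

```python
def pcm_storage(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> dict:
    """Return the bitrate and storage needs of uncompressed PCM audio."""
    bits_per_second = sample_rate_hz * bits_per_sample * channels
    bytes_per_second = bits_per_second / 8
    return {
        "kbps": bits_per_second / 1000,
        "mb_per_min": bytes_per_second * 60 / 1_000_000,
        "mb_per_hour": bytes_per_second * 3600 / 1_000_000,
    }


if __name__ == "__main__":
    # Compare a few of the sampling rates listed on the slide at 16 bit.
    for rate in (16000, 12000, 8000):
        s = pcm_storage(rate, 16)
        print(f"{rate} Hz, 16 bit: {s['kbps']:.0f} kbps, "
              f"{s['mb_per_min']:.2f} MB/min, {s['mb_per_hour']:.1f} MB/h")
```

For 16 kHz at 16 bit this gives 256 kbps, 1.92 MB/min and 115.2 MB/h, matching the rounded values above. Compressed formats (mp3, ogg, wma) will be smaller by a codec-dependent factor.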

4
Which task?
  • Visualization and editing: record, play, edit, mix, add effects
  • Analysis: spectral, pitch
  • Speech manipulation: filtering, mixing, adding effects,
    prosodic manipulation
  • Annotation: segmentation, labeling
  • Scripting: batch processing, communication with external programs
  • Plotting

5
Examples of tasks
  • build stimuli for an experiment (e.g.
    cross-splicing)
  • manage a speech database for a TTS engine
  • create a prosodic database
  • analyze a speech corpus from experiment recordings
  • verify/correct an automatic segmentation
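
As an illustration of the first task, cross-splicing joins the beginning of one recording to the end of another at chosen cut points. The sketch below is only an assumption about how this could be scripted: it works on plain Python lists of samples, and the function name `cross_splice` and the optional linear crossfade are choices made for this example, not something taken from any of the tools discussed here.

```python
def cross_splice(a, b, cut_a, cut_b, fade=0):
    """Join a[:cut_a] to b[cut_b:].

    With fade > 0, the last `fade` samples before cut_a in `a` are blended
    linearly with the `fade` samples before cut_b in `b`, which softens the
    discontinuity at the splice point.
    """
    if not (0 <= cut_a <= len(a) and 0 <= cut_b <= len(b)):
        raise ValueError("cut points must lie inside the signals")
    if fade < 0 or fade > cut_a or fade > cut_b:
        raise ValueError("fade must fit before both cut points")

    head = list(a[:cut_a - fade])
    blend = []
    for i in range(fade):
        w = (i + 1) / (fade + 1)  # weight of b rises across the fade region
        blend.append((1 - w) * a[cut_a - fade + i] + w * b[cut_b - fade + i])
    tail = list(b[cut_b:])
    return head + blend + tail


if __name__ == "__main__":
    token_a = [1, 1, 1, 1]   # stand-in for the samples of a first recording
    token_b = [5, 5, 5, 5]   # stand-in for the samples of a second recording
    print(cross_splice(token_a, token_b, 2, 2))          # hard splice
    print(cross_splice(token_a, token_b, 2, 2, fade=1))  # one-sample crossfade
```

In practice the cut points would come from a segmentation (label file), and cutting at zero crossings or using a short fade helps avoid audible clicks.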

6
Two questions
  • What kind of data?
  • Which task?

Two rules
  • no single tool does everything
  • there are plenty of ways to do any one thing

7
Tool features
  • Visualization/editing
  • Analysis
  • Speech manipulation
  • Annotation
  • Scripting
  • Plotting
  • Supported formats
  • Platform/installation
  • Evolution/community
  • Accessibility
  • Price

8
Software
  • Goldwave (audio editor)
  • ESPS/xwaves (routines + visualization)
  • Praat (speech analysis)
  • WaveSurfer (speech editor)
  • Transcriber (annotation tool)
  • Matlab (general-purpose software)
  • OGI speech tools (routines + application development)
  • WinPitch, PitchWorks, Phonedit, CoolEdit, ...

9
Goldwave
  • self-described as a "top rated, professional digital
    audio editor"

10
Goldwave
  • Pros: editing (good memory management for large
    files), many FX, noise reduction, real-time
    spectrum and VU meters, various formats, batch
    conversion, chain effects, easy interface
  • Cons: nothing speech-specific (pitch, formants),
    Windows only, no scripting
  • Good for file editing, not for speech analysis

12
ESPS - Waves
  • Developed by Entropic / AT&T. Now publicly available
  • The comp.speech FAQ says:
  • ESPS: a comprehensive set of speech
    analysis/processing tools
  • Waves: a graphical front-end for speech
    processing (waveforms, spectrograms, pitch);
    includes a signal labeling utility

14
ESPS - Waves
  • Pros: powerful, designed for large files
  • Cons: Unix only (free BSD), non-standard formats,
    requires programming skills, development has
    stopped

15
Praat
  • Developed by P. Boersma and D. Weenink at the
    Institute of Phonetic Sciences, University of
    Amsterdam
  • General-purpose speech tool: editing,
    segmentation and labeling, prosodic manipulation

17
Praat
  • Pros: designed for speech analysis (not only
    sound editing or spectrogram visualization), nice
    GUI, scripting, active development and community,
    prosodic manipulation
  • Cons: limited scripting language, its own native
    formats for transcription and pitch files

18
WaveSurfer
  • Open Source tool for sound visualization and
    manipulation
  • speech/sound analysis and sound
    annotation/transcription
  • platform for more advanced/specialized
    applications extending WaveSurfer with new
    custom plug-ins or embedding WaveSurfer
    visualization components in other applications
  • Requires the Snack Sound Toolkit

20
Transcriber
  • Authors: C. Barras, E. Geoffrois
  • Relies on Snack (Tcl/Tk)
  • Good for annotation
  • Nice, simple GUI
  • No speech analysis

22
Matlab (Mathworks)
  • Mathematical computing environment
  • Signal Processing Toolbox: filter design,
    spectral analysis, waveform generation, linear
    prediction
  • voicebox (2002): mike.brookes_at_ic.ac.uk
  • pitch determination algorithm (2002): Xuejing Sun,
    sunxj_at_northwestern.edu
  • colea speech editor (1998): Philip Loizou,
    loizou_at_utdallas.edu, Univ. of Texas at Dallas

23
Matlab (Mathworks)
  • Pros: open, powerful, scripting, excellent
    plotting
  • Cons: small speech community, few shared standards,
    not designed for large files

24
OGI speech tools/CSLU Toolkit
  • Development started in 1992 in C on Unix, at the
    Center for Spoken Language Understanding (CSLU)
    at OGI
  • Includes:
  • an X Window display tool (LYRE) to display and edit
    speech signals, spectrograms, phoneme labels, and
    other information
  • a set of C library routines (LIBNSPEECH),
    utilities for converting file formats, filtering,
    neural network training, a vector quantizer, and a
    database utility to automate speech-database
    queries
  • a set of Perl scripts, used mainly
    to automate the use of the OGI Speech Tools
  • man pages
  • RAD: rapid application development
  • Points of entry at three levels: package (C),
    script (Tcl), GUI (Tk)
  • Free for research use

26
Summary
Columns: Edit, Anal., Manip., Annot., Script, Plot, Format, OS, Evolut., Comm., Price
  • Goldwave: win, 40
  • ESPS/waves: C, sh, Unix, free
  • Praat: yes, native, console, sendpraat, src, free
  • WaveSurfer: snack, C, tcl/tk, python, src, free
  • Transcriber: xml, free
  • OGI Toolkit: free
  • Matlab: Sigproc, packages, native, no, BSD, stud. 100, 40/tbx
  • Legend: yes, but requires some development

27
Expect to do conversions
  • Sound files: Goldwave (Windows), sox (Unix)
  • Transcription files: scripts to convert
    text-formatted label files
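
A label-conversion script of the kind mentioned above can be very short. The following Python sketch converts HTK-style labels to TIMIT-style labels. It assumes that each HTK line reads `start end label` with times in units of 100 ns, and that the TIMIT-style output uses sample indices at a given sampling rate (16 kHz here); the function name `htk_to_timit` is made up for this example, and HTK master label files (MLF) with several utterances are not handled.

```python
HTK_UNITS_PER_SECOND = 10_000_000  # assumption: HTK times are in units of 100 ns


def htk_to_timit(htk_text: str, sample_rate_hz: int = 16000) -> str:
    """Convert 'start end label' lines from 100 ns units to sample indices."""
    out = []
    for line in htk_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or incomplete lines
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        start_sample = round(start * sample_rate_hz / HTK_UNITS_PER_SECOND)
        end_sample = round(end * sample_rate_hz / HTK_UNITS_PER_SECOND)
        out.append(f"{start_sample} {end_sample} {label}")
    return "\n".join(out)


if __name__ == "__main__":
    htk = "0 2500000 sil\n2500000 4000000 ah"
    print(htk_to_timit(htk))
```

The same pattern (read lines, rescale the time fields, write lines) covers most conversions between text-formatted label files; only the time unit and the field order change.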

28
Links
  • www.goldwave.com
  • www.speech.kth.se/software/esps
  • www.praat.org
  • www.speech.kth.se/software/wavesurfer
  • www.cse.ogi.edu/toolkit
  • www.mathworks.com (Matlab)
  • www.lpl.univ-aix.fr/sqlab/ (phonedit)
  • www.sciconrd.com/pworks.htm (PitchWorks)
  • www.winpitch.com (WinPitch)
  • www.adobe.com (CoolEdit, now Audition)