Voice Biometrics - PowerPoint PPT Presentation

About This Presentation
Title:

Voice Biometrics

Slides: 34
Provided by: Ale8341

Transcript and Presenter's Notes

Title: Voice Biometrics


1
Voice Biometrics
2
Topics covered
  • History
  • Speech vs Voice
  • How it works
  • Algorithms
  • Problems with VR
  • Modern Applications
  • Future: why VR is good

3
History of Speech Recognition
  • 1876
  • Alexander Graham Bell invented the telephone
  • Wanted to help people with hearing difficulties
  • 1922
  • REX, a toy dog that would come out of its house
    when it received a certain amount of acoustic
    energy

4
History (contd)
  • 1936
  • Bell Labs invented the voice synthesizer, which is
    better known today as a computer voice
  • 1960s
  • Bell Labs invented a speech recognition system
    with a 10-word vocabulary

5
History (contd)
  • 1970s
  • Threshold Technology released the VIP 100 System
  • First system with a 1000-word vocabulary
  • 1978
  • Texas Instruments released the Speak and Spell, a
    toy that spoke words you asked it to

6
History (contd)
  • 1989
  • 486 processor: voice recognition systems became
    more powerful because processors were more
    powerful
  • Less expensive because personal computers were
    becoming more popular

7
History (contd)
  • 1997
  • Dragon Systems released NaturallySpeaking,
    which was the first continuous speech recognition
    system with a large vocabulary
  • IBM soon released VoiceType and ViaVoice, which
    are comparable systems to NaturallySpeaking

8
History (contd)
  • 2004
  • Dragon Systems NaturallySpeaking 8: continuous
    and natural dictation at 160 words per minute
  • 250,000 word vocabulary, with capabilities to add
    more

9
Speech vs Voice Recognition
  • Speech Recognition
  • Recognizing the speech of a wide range of users
  • Used to recognize words, not to identify speakers
  • Uses include dictation software for programs such
    as Word, audio commands in modern automobiles,
    telephone menus, etc.
  • Voice Recognition
  • Recognizing the voice of selected users
  • Used to identify and/or authenticate users
  • Uses include personal, corporate and government
    security.

10
Speech Recognition
  • 3 Types
  • Discrete word
  • Connected word
  • Continuous speech

11
Discrete Word
  • Requires user to pause between words so software
    can recognize each word.
  • Works well for systems that only require one-word
    responses, e.g. a telephone menu with yes/no
    answers.
  • Useless otherwise

12
Connected Word
  • Similar to discrete word, but allows for multiple
    word phrases or commands.
  • User must speak every word clearly and distinctly
    so software can recognize each word.
  • Improvement over discrete word, but still
    inefficient.

13
Continuous Speech
  • Most natural and user friendly
  • Very difficult to implement
  • Can speak as though having a normal conversation
    with another person.

14
How speech recognition works
  • User speaks into microphone
  • Audio processed by soundcard
  • When soundcard finishes, software distinguishes
    between lower frequency vowels and higher
    frequency consonants
  • Compares results with groups of phonemes and then
    to actual words to determine likely match
  • Phonemes are the smallest units of sound that can
    change the meaning of a word, e.g. rat → bat
  • Words are then ordered into most likely sentence
    combinations and output to word processor
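
The phoneme-to-word matching step above can be sketched as a nearest-match lookup. This is only an illustration: the three-word lexicon, the phoneme spellings, and the use of edit distance as the matching rule are all assumptions, not something the slides specify.

```python
# Hypothetical mini-lexicon: word -> phoneme sequence (spellings are made up).
LEXICON = {
    "rat": ("r", "ae", "t"),
    "bat": ("b", "ae", "t"),
    "bad": ("b", "ae", "d"),
}


def edit_distance(a, b):
    """Number of phoneme insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cost = 0 if pa == pb else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def closest_word(phonemes):
    """Return the lexicon word whose phoneme sequence is nearest to the input."""
    return min(LEXICON, key=lambda word: edit_distance(phonemes, LEXICON[word]))
```

Here "rat" and "bat" differ by exactly one phoneme, which is why a single misheard sound is enough to change the recognized word.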

15
Drawbacks
  • Software requires training that can take
    upwards of twelve hours of usage to become fully
    effective
  • Human ear and brain have evolved to near
    perfection over millions of years, making it very
    hard to mimic
  • Humans are able to discern speech from more
    than just sound, e.g. from context, from hand
    and body signals, unconscious lip reading, etc.
  • Inconsequential background noise may adversely
    affect the software's accuracy

16
Man vs. Machine
  • Top: human error vs. machine error in letter
    recognition
  • Bottom: human error vs. machine error in
    spontaneous speech recognition
  • All types of environments tested

17
Two main obstacles
  • Software must be able to distinguish between
    background noise and desired input
  • Road noise, other people talking, wind, etc
  • Must also be able to recognize all different
    types of voices
  • High, low, squeaky, raspy, loud, quiet.

18
Biometrics Voice Recognition
  • Growing rapidly in corporate and government
    agencies
  • Must take a completely different approach to
    implementation; the software has to be VERY precise
  • The cost of an error differs: a false accept vs.
    a misprinted word

19
Sound types
  • Three types
  • Voiced
  • Sounds produced by the vibration of the vocal
    cords.
  • Unvoiced
  • Sounds produced without vibration of vocal cords,
    e.g. sh
  • Any sound that does not produce a resonant peak.
  • Plosive
  • Sounds produced by the release of air pressure
    built up behind a closure in the vocal tract,
    e.g. the p in putter.
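
One common way to tell voiced from unvoiced frames in software is to look at short-time energy and zero-crossing rate. The slides do not say which method is used, so the sketch below is a swapped-in technique, and the thresholds `energy_floor` and `zcr_limit` are arbitrary illustrative values.

```python
import math


def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum((a >= 0.0) != (b >= 0.0) for a, b in zip(frame, frame[1:]))
    return crossings / (len(frame) - 1)


def classify_frame(frame, energy_floor=1e-4, zcr_limit=0.1):
    """Label a frame as silence, voiced, or unvoiced (thresholds are assumptions)."""
    if short_time_energy(frame) < energy_floor:
        return "silence"
    if zero_crossing_rate(frame) < zcr_limit:
        return "voiced"  # periodic, low-frequency energy
    return "unvoiced"  # noise-like, many sign changes


# Synthetic stand-ins for real audio: a 100 Hz tone and a rapidly alternating hiss.
tone = [math.sin(2 * math.pi * 100 * t / 8000) for t in range(80)]
hiss = [0.1 if t % 2 == 0 else -0.1 for t in range(80)]
```

Plosives are not handled here; detecting them would need a look at how energy changes across consecutive frames.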

20
(No Transcript)
21
Approaches to voice authentication
  • Template matching
  • Speaker-dependent
  • Feature analysis
  • Speaker-independent

22
Template matching
  • User repeats word, series of words, or phrase
    into microphone several times for recording.
  • Voice sample is converted from analogue to
    digital for processing.
  • Extraction algorithms used to analyze digital
    voice sample looking for unique patterns.
  • A stochastic mathematical model is created, which
    produces results from a given input according to
    a statistical distribution.
  • Hidden Markov model

23
Hidden Markov model
  • Variant of a Finite State Machine (FSM)
  • An FSM is a model of computation consisting of a
    set of states, a start state, an input alphabet,
    and a transition function that maps input symbols
    and current states to a next state.
  • Markov property: the occurrence of each event in
    a series of random events depends only on the
    outcome of the immediately preceding event.
  • In any given state, a possible outcome can be
    generated according to its associated probability
    distribution.
  • Only the outcome, not the actual state itself,
    is visible to an outside observer, hence
    Hidden Markov model.
  • Light switch analogy: whether the switch is up or
    down doesn't matter; you only see whether the
    light is on or off.
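
A minimal sketch of the light-switch idea as a two-state hidden Markov model, using the standard forward algorithm to score a sequence of observations. Every probability below is invented for illustration; a real voice model would have many more states and would learn its parameters from enrollment audio.

```python
# Hidden state: switch position. Visible outcome: whether the light is on.
STATES = ("up", "down")
START = {"up": 0.5, "down": 0.5}
TRANS = {
    "up": {"up": 0.7, "down": 0.3},
    "down": {"up": 0.4, "down": 0.6},
}
EMIT = {
    "up": {"on": 0.9, "off": 0.1},
    "down": {"on": 0.2, "off": 0.8},
}


def forward_likelihood(observations):
    """Probability of the observed sequence under the model (forward algorithm)."""
    # alpha[s]: probability of the observations so far, ending in hidden state s
    alpha = {s: START[s] * EMIT[s][observations[0]] for s in STATES}
    for obs in observations[1:]:
        alpha = {
            s: EMIT[s][obs] * sum(alpha[prev] * TRANS[prev][s] for prev in STATES)
            for s in STATES
        }
    return sum(alpha.values())
```

For example, `forward_likelihood(["on", "on"])` sums over every hidden switch sequence that could have produced two "on" observations, without ever needing to know which one actually occurred.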

24
Template matching (cont.)
  • The HMM is used to create statistical profiles of
    each voice sample taken from the user.
  • Profiles are compared to look for repeating
    patterns in the user's voice that can be used to
    identify his/her voice.
  • User is ready to be verified
  • Live voice sample from the user attempting to
    authenticate is compared to that user's profile
    in the database
  • Probability score of whether or not user is who
    they claim to be is created
  • User accepted or denied based on probability score
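
The enroll, score, and accept-or-deny steps above can be sketched as follows. To keep the example short it scores with cosine similarity between feature vectors rather than an HMM likelihood, so it is a stand-in for the scoring the slides describe; the feature vectors and the `threshold` value are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity in [-1, 1] between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def enroll(samples):
    """Average several enrollment feature vectors into one stored template."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def verify(template, live_sample, threshold=0.95):
    """Return (score, accepted); the threshold is an arbitrary illustrative value."""
    score = cosine_similarity(template, live_sample)
    return score, score >= threshold
```

Raising the threshold makes false accepts rarer at the cost of rejecting more genuine users, which is the trade-off behind the crossover rate discussed later.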

25
Feature analysis
  • First processes input using Fourier transforms or
    linear predictive coding (LPC).
  • Fourier transform is a transform used to break
    down signals (sound in this case) into elementary
    frequency components.
  • LPC analyzes the speech signal by estimating the
    formants (resonances that characterize the vocal
    tract of a person), removing their effects from
    the speech signal, and estimating the intensity
    and frequency of the remaining buzz.
  • This process is called inverse filtering and the
    remaining signal is called the residue.
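
Both front-end steps can be sketched in plain Python. The block below uses a naive discrete Fourier transform and the autocorrelation method with the Levinson-Durbin recursion for LPC; these are textbook choices assumed for illustration, and the test signals `tone` and `decay` are synthetic rather than real speech.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each frequency bin from 0 to the Nyquist bin (naive O(n^2) DFT)."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(samples, sample_rate):
    """Frequency in Hz of the strongest non-DC bin."""
    mags = dft_magnitudes(samples)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return peak_bin * sample_rate / len(samples)


def lpc_coefficients(samples, order):
    """Predictor coefficients a[1..order] with x[t] ~ sum(a[i] * x[t - i])."""
    n = len(samples)
    r = [sum(samples[t] * samples[t - k] for t in range(k, n)) for k in range(order + 1)]
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # signal is already perfectly predicted (or silent)
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:]


def residue(samples, coeffs):
    """Inverse filtering: subtract the linear prediction from each sample."""
    p = len(coeffs)
    return [
        samples[t] - sum(coeffs[i] * samples[t - 1 - i] for i in range(p))
        for t in range(p, len(samples))
    ]


# Synthetic inputs: a 100 Hz tone sampled at 800 Hz, and a decaying exponential.
tone = [math.sin(2 * math.pi * 100 * t / 800) for t in range(80)]
decay = [0.9 ** t for t in range(200)]
```

For the decaying exponential, a first-order predictor recovers a coefficient very close to 0.9 and leaves almost nothing in the residue, which is the sense in which inverse filtering removes the predictable part of the signal.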

26
Voice authentication obstacles
  • Human voice changes with age, sickness, etc.
  • Needs to overcome high crossover error rate (the
    operating point where the false accept rate equals
    the false reject rate)
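
A minimal sketch of how a crossover point can be estimated from match scores: sweep a decision threshold and find where the false accept rate and false reject rate are closest. The score lists are made-up numbers, and averaging the two rates at the chosen threshold is a simplification assumed here.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false accept rate, false reject rate) at the given threshold."""
    far = sum(score >= threshold for score in impostor) / len(impostor)
    frr = sum(score < threshold for score in genuine) / len(genuine)
    return far, frr


def crossover(genuine, impostor):
    """Return (threshold, rate) where FAR and FRR are closest to equal."""

    def gap(threshold):
        far, frr = error_rates(genuine, impostor, threshold)
        return abs(far - frr)

    candidates = sorted(set(genuine) | set(impostor))
    best = min(candidates, key=gap)
    far, frr = error_rates(genuine, impostor, best)
    return best, (far + frr) / 2


# Hypothetical similarity scores from genuine users and from impostors.
genuine_scores = [0.9, 0.8, 0.7, 0.6, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.5, 0.65]
```

With these toy scores the two error rates meet at a threshold of 0.6, where one in five impostors is accepted and one in five genuine users is rejected.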

27
Advantages of Voice Authentication
  • Not intrusive
  • More natural to speak
  • No expensive hardware
  • Requires very little memory for voice samples
  • Remote Authentication

28
Voice Recognition Uses
  • Security
  • Financial Transactions
  • Account Access
  • Funds Transfer
  • Bill Payments
  • Trading of Financial Instruments
  • Credit Card Processing

29
Voice Recognition Uses (contd)
  • Authentication
  • Senior Citizen Facilities
  • Voice recognition systems, in conjunction with
    ID cards, are used to let residents into
    their apartments
  • Jail System
  • Can call people on parole or under house arrest
    at any time of day and verify that they are where
    they should be

30
Voice Recognition Uses (contd)
  • Authentication (contd)
  • Criminal Cases/Terrorism
  • Voices on videos, like the ones in which Osama
    Bin Laden appears, are authenticated with voice
    recognition to show that it really is him

31
Uses (cont.)
  • Union Pacific Railroad
  • New York Town Manor
  • Bell Canada
  • Password Journal

32
Future Uses
  • Small Electronic Devices
  • Cell Phones
  • Key combination to unlock phones
  • PDAs and Laptops
  • Same concept could be implemented on these for
    added security

33
Future Uses (contd)
  • Parking Garage Systems
  • Instead of using smart cards that could easily be
    lost, stolen, or even melted, voice recognition
    would take away the possibility of access without
    proper authorization