Title: Neuro-IT Roadmap: Successful in the Physical World
1Neuro-IT Roadmap: Successful in the Physical World
- Robust perception
- Image processing
- Speech recognition
- Multimodal human machine interaction
- System integration
- Scene analysis and representation
2Automotive: Overtake-Checker and Door-Opener Assistant
Dr. Axel Techmer Infineon Technologies
3Security: Face Detection and Recognition
- Leading-edge approach to face detection (University of Bochum)
- Detection of face regions (a)
- Pre-selection of frontal faces (b)
- Face recognition (c,d)
- Elastic graph matching
- Gabor Wavelet Transform
Ruhr University Bochum
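The Gabor wavelet transform named on this slide filters an image with a bank of oriented, band-limited kernels. As a rough illustration only (not the Bochum implementation), the sketch below builds the real part of a single 2-D Gabor kernel in plain Python. The function name `gabor_kernel` and all parameter values are assumptions chosen for the example.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Real part of a 2-D Gabor kernel as a list of rows (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier wave runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope limits the kernel in space.
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            # Cosine carrier selects one spatial frequency.
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# Hypothetical parameters: 9x9 kernel, 4-pixel wavelength, horizontal carrier.
kernel = gabor_kernel(size=9, wavelength=4.0, theta=0.0, sigma=2.0)
```

A full transform would convolve the image with several such kernels at different orientations and wavelengths; elastic graph matching would then compare the filter responses at graph nodes.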
4Vision Instruction Processor (VIP)
Infineon Technologies, Corporate Research,
Systems Technology
5Vision Instruction Processor (VIP)
- Prototype available since May 2001
- SIMD - Architecture
- 204 instructions
- 10 Million logic transistors
- On-chip memory 37KB
- Technology 0.35µm
- Clock 100 MHz
- Power consumption 100µW/MOPS
- Die size 22mm x 23mm
- Peak Performance 53 GOPS
- PCI-Board with VIP and camera submodules
- Software Tools for VIP: Compiler, Debugger, Profiler
- Software Tools on Host: MS Visual C with VPL-Library
- Application demonstrators: Car Vision, Face recognition, MPEG2, Graphic
- in 0.13µm CMOS Technology
- Clock 200 MHz
- Peak Perf. 106 GOPS
- Die Size 70 mm²
- Power Consump. 700 mW
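The figures above can be cross-checked with simple arithmetic. The sketch below derives operations per clock cycle and power figures from the quoted numbers; it assumes the 100 µW/MOPS value applies at peak throughput, which the slide does not state. Function names are illustrative.

```python
def ops_per_cycle(peak_gops, clock_mhz):
    """Operations completed per clock cycle at peak throughput."""
    return (peak_gops * 1e9) / (clock_mhz * 1e6)

def power_watts(uw_per_mops, peak_gops):
    """Total power in watts, assuming the per-MOPS figure holds at peak."""
    total_microwatts = uw_per_mops * peak_gops * 1000  # 1 GOPS = 1000 MOPS
    return total_microwatts / 1e6

def microwatts_per_mops(power_mw, peak_gops):
    """Energy efficiency in µW/MOPS from total power and peak throughput."""
    return (power_mw * 1000) / (peak_gops * 1000)

# 0.35 µm prototype: 53 GOPS at 100 MHz, 100 µW/MOPS
prototype_ops = ops_per_cycle(53, 100)       # 530 operations per cycle
prototype_power = power_watts(100, 53)       # about 5.3 W

# 0.13 µm version: 106 GOPS at 200 MHz, 700 mW
shrink_ops = ops_per_cycle(106, 200)         # 530 operations per cycle
shrink_eff = microwatts_per_mops(700, 106)   # about 6.6 µW/MOPS
```

Under these assumptions both versions execute the same number of operations per cycle, so the doubled peak performance comes from the doubled clock, while the power per MOPS drops by roughly a factor of 15.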
6Car Vision Components - Hardware
Dr. Axel Techmer Infineon Technologies
7Neuro-IT Roadmap: Successful in the Physical World
- Robust perception
- Image processing
- Speech recognition
- Multimodal human machine interaction
- System integration
- Scene analysis and representation
8Classical Sound Processing for Speech Recognition
9Speech production time waveform
10FFT resolves neither frequency nor temporal structure
- FFT
- frequency resolution 50 Hz
- temporal resolution 20 ms
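The two numbers are linked: for a rectangular analysis window of duration T, the spacing of FFT frequency bins is 1/T, so a 20 ms frame gives 50 Hz bins. A minimal sketch of that trade-off follows; the 16 kHz sample rate is an assumption for illustration, not a value from the slides.

```python
def frequency_resolution_hz(window_ms):
    """FFT bin spacing in Hz for an analysis window of the given duration."""
    return 1000.0 / window_ms

def samples_per_window(window_ms, sample_rate_hz):
    """Number of samples in one analysis window."""
    return int(round(window_ms * sample_rate_hz / 1000.0))

res_20ms = frequency_resolution_hz(20.0)   # 50 Hz bins for a 20 ms frame
res_10ms = frequency_resolution_hz(10.0)   # 100 Hz bins for a 10 ms frame
n_samples = samples_per_window(20.0, 16000)  # 320 samples at an assumed 16 kHz
```

Shortening the window improves temporal resolution but coarsens the frequency bins in the same proportion, which is why a single FFT frame cannot resolve both well.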
11Classical Sound Processing for Speech Recognition
Time structure of the speech signal (< 20 ms) is lost in the magnitude spectrum (FFT)
Humans extract both temporal and spectral information for robust speech recognition
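One way to see the loss of time structure: a real signal and its time-reversed copy have identical magnitude spectra, although their waveforms differ. The sketch below checks this with a small pure-Python DFT; the test signal (a decaying sinusoid, 64 samples) is an arbitrary choice for illustration.

```python
import cmath
import math

def dft_magnitude(x):
    """Magnitude of the discrete Fourier transform of a real sequence."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

n = 64
# Decaying sinusoid: sharp onset followed by a slow decay.
signal = [math.exp(-t / 8.0) * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
# Time-reversed copy: slow build-up followed by an abrupt stop.
reversed_signal = signal[::-1]

mag_a = dft_magnitude(signal)
mag_b = dft_magnitude(reversed_signal)

# Largest difference between the two magnitude spectra (numerically zero).
max_diff = max(abs(a - b) for a, b in zip(mag_a, mag_b))
```

The difference between the two signals lives entirely in the phase, which the magnitude spectrum discards.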
12Auditory Sound Processing
Processing chain: sound signal → ear canal → middle ear
13Auditory Sound Processing
Processing chain: sound signal → ear canal → middle ear → inner ear hydrodynamics
Figure scale bar: 100 µm
14Dynamic Compression in the Inner Ear
Inner ear model responses to 1 kHz tones
apical
basal
15Auditory Sound Processing
Processing chain: sound signal → ear canal → middle ear → inner ear hydrodynamics → sensory cell → synaptic mechanisms
16Coding of Sound into Action Potentials
regular firing pattern (Δt = 10 ms → f0 = 100 Hz)
Figure labels: frequency axis (low to high), F0
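The relation on this slide is simply f0 = 1/Δt: spikes spaced 10 ms apart encode a fundamental frequency of 100 Hz. A minimal sketch, using hypothetical spike times rather than measured data:

```python
def f0_from_spike_times(spike_times_ms):
    """Estimate fundamental frequency (Hz) from the mean inter-spike interval."""
    intervals = [b - a for a, b in zip(spike_times_ms, spike_times_ms[1:])]
    mean_interval_ms = sum(intervals) / len(intervals)
    return 1000.0 / mean_interval_ms

# Hypothetical regular spike train with one spike every 10 ms.
spikes = [0.0, 10.0, 20.0, 30.0, 40.0]
f0 = f0_from_spike_times(spikes)  # 100.0 Hz
```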
17Spectral and Temporal Sound Processing in the Auditory Pathway
18Neuro-IT Roadmap: Successful in the Physical World
- Robust perception
- Image processing
- Speech recognition
- Multimodal human machine interaction
- System integration
- Scene analysis and representation
19Audio-Visual Speech Recognition
20Audio-Visual Speech Recognition
Tracking of lip motion with sub-pixel precision
21Audio-Visual Speech Recognition
Tracking of lip motion with sub-pixel precision
two - one - seven - three - five - nine - eight - zero - four - six
Hidden Markov speech recognizer
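A hidden Markov recognizer finds the most likely state sequence for a series of observed features, typically with the Viterbi algorithm. The sketch below is a toy illustration, not the recognizer shown on the slide: it uses two word-level states and two quantised feature labels, and every probability in it is invented for the example.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely state sequence for the observations (log-domain Viterbi)."""
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor state for s and the score of arriving from it.
            prev, score = max(
                ((p, best[-1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            scores[s] = score + math.log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Toy model (all values hypothetical): feature "a" is typical of "two",
# feature "b" is typical of "one".
states = ["two", "one"]
start_p = {"two": 0.5, "one": 0.5}
trans_p = {"two": {"two": 0.7, "one": 0.3}, "one": {"two": 0.3, "one": 0.7}}
emit_p = {"two": {"a": 0.9, "b": 0.1}, "one": {"a": 0.2, "b": 0.8}}

decoded = viterbi(["a", "a", "b", "b"], states, start_p, trans_p, emit_p)
# decoded == ["two", "two", "one", "one"]
```

In an audio-visual system the observations would presumably combine acoustic features with lip-motion features, and each word would be modelled by several sub-word states rather than one.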
22Multi-modal: Pointing, gaze, gestures, facial expressions
Dr. Axel Steinhage, Infineon Technologies AG
23Neuro-IT Roadmap: Successful in the Physical World
- Robust perception
- Image processing
- Speech recognition
- Audio-visual speech recognition
- Multimodal human machine interaction
- System integration
- Scene analysis and representation
24Man-Machine-Interaction based on natural communication channels
Dr. Axel Steinhage, Infineon Technologies
Items presented by VPA
Virtual Personal Assistant (VPA)
Cheap sensors (Webcam, Microphone)
Interactive communication between user and VPA
Natural channels: speech, lip motion, gestures ...
25Man-Machine-Interaction based on natural communication channels
Dr. Axel Steinhage, Infineon Technologies
Human expert via Advanced Videophone (HHI)
Items presented by VPA
Advanced Videophone
Virtual Personal Assistant (VPA)
Cheap sensors (Webcam, Microphone)
Interactive communication between user and VPA
Natural channels: speech, lip motion, gestures ...
26What do we gain from Neuro-IT?
- Robust perception
- Image processing
- Speech recognition
- Robust processing
- Tools for Neuroscience
- Successful in the Physical World
World knowledge → Constructed brain
- Scene analysis and representation
- Intelligent human-machine interaction
- Natural feedback
- Intelligent virtual person
→ Conscious Machines
→ Factor 10
- Digital and/or analog neuronal networks
- Massively parallel processing hardware
27Neuro-IT Roadmap: Successful in the Physical World
Werner Hemmert, Infineon Technologies AG, CPR-ST
Prof. Dr. Dr. h.c. H.-P. Zenner, Prof. Dr. A.W. Gummer
Prof. Dr. D.M. Freeman, Dr. M. Mermelstein, B. Tsai
U. Dürig, M. Despont, G. Genolet, U. Drechsler, P. Vettiger, G. Binnig
Prof. Dr. U. Ramacher, J.-P. de la Cruz-Guiterrez, M. Holmberg, Dr. A. Steinhage, Dr. A. Techmer