Sound Music MIDI in Multimedia

About This Presentation

Description:

Computer Representation of Sound, speech and MIDI – PowerPoint PPT presentation

Number of Views:556

Updated: 12 October 2015

Slides: 42

Provided by: bimray8729

Category: How To, Education & Training

Transcript and Presenter's Notes

1
SOUND, MUSIC, MIDI, SPEECH

Mr. Bimal Kumar Ray
Dept. of Information Science and Telecommunication
Ravenshaw University

2
What is Sound/Audio

The perception of sound by human beings is a very complex process.
The human ear is the detector which receives and interprets the sound.
Sound is a combination of high- and low-pressure regions propagated through the air in the form of a wave.
Sound is a physical phenomenon produced by the vibration of matter and transmitted as waves.
Sound can be periodic (e.g. a musical tone) or non-periodic (e.g. noise).
Sound is a mechanical wave: an oscillation of pressure transmitted through a solid, liquid or gas, composed of frequencies within the range of hearing.
To create sound, the computer feeds an electrical signal of a certain frequency to the speaker.
Every sound is composed of waves of many different frequencies and shapes, but the simplest sound we can hear is a sine wave.

3
What is Sound/Audio

Sound waves can be characterised by the following attributes:
Period, Pitch, Volume, Frequency, Amplitude, Bandwidth, Sampling, Loudness and Dynamic range.
Period: the interval at which a periodic signal repeats.
Pitch: a perception of sound by human beings. It measures how high the sound is as perceived by a listener.
Volume: the height of each peak in the sound wave.
Frequency (perceived as pitch): the number of cycles per second, i.e. the inverse of the period. The greater the distance between the peaks, the lower the frequency and the lower the sound.
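
The link between period, frequency and the distance between peaks can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the speed of sound (343 m/s, roughly the value in air at room temperature) is an assumption that does not come from the slides.

```python
def frequency_hz(period_s):
    """Frequency is the number of cycles per second: the inverse of the period."""
    return 1.0 / period_s


def wavelength_m(freq_hz, speed_m_s=343.0):
    """Distance between two peaks; 343 m/s is an assumed speed of sound in air."""
    return speed_m_s / freq_hz


# A longer period means a lower frequency and a longer distance between peaks.
low_tone = frequency_hz(0.01)     # period of 10 ms
high_tone = frequency_hz(0.001)   # period of 1 ms
print(low_tone, wavelength_m(low_tone))
print(high_tone, wavelength_m(high_tone))
```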

4
What is Sound/Audio

Loudness: an important perceptual quality, also called volume.
Amplitude: the measure of sound level. For a digital sound, the amplitude is the sample value.
Sounds have different loudness because they carry different amounts of power. The unit of power is the watt.
Sound level is measured in bels or, more commonly, decibels (dB). Examples:
160 dB: Jet engine
130 dB: Large orchestra
100 dB: Car/bus on highway
70 dB: Voice conversation
50 dB: Quiet residential areas
30 dB: Very soft whisper
20 dB: Sound studio
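
The decibel is a logarithmic measure of a power ratio. The sketch below uses the standard definition (10 times the base-10 logarithm of the ratio), which the slides do not spell out; the function names are hypothetical.

```python
import math


def power_ratio_to_db(power, reference_power):
    """Level in decibels of `power` relative to `reference_power`."""
    return 10.0 * math.log10(power / reference_power)


def db_to_power_ratio(level_db):
    """Inverse conversion: how many times more power a level difference means."""
    return 10.0 ** (level_db / 10.0)


# Difference between a voice conversation (70 dB) and a soft whisper (30 dB),
# taken from the list above, expressed as a power ratio.
ratio = db_to_power_ratio(70 - 30)
print(ratio)
```

Every 10 dB step corresponds to a tenfold increase in power, which is why a compact scale can cover everything from a sound studio to a jet engine.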

5
What is Sound/Audio

To include sound in a multimedia application, the sound waves must be converted from analog to digital form.
This conversion is called sampling: every fraction of a second, a sample of the sound is recorded as digital bits.
Two factors affect the quality of digitized sound:
Sample rate: the number of times per second a sample is taken.
Most common sampling rates are 11.025, 22.05 and 44.1 kHz.
Sample size: the amount of information stored about each sample.
Most common sample sizes are 8 and 16 bits.
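
A minimal sketch of the two factors, assuming a pure sine tone as the input sound. The function names are invented for this example, and storing samples as signed integers is a simplification (real 8-bit audio files often use unsigned values).

```python
import math


def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Sample a sine wave: one value every 1/sample_rate_hz seconds."""
    count = round(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(count)]


def quantize(samples, bits):
    """Store each sample in [-1.0, 1.0] as a signed integer of `bits` bits."""
    max_code = 2 ** (bits - 1) - 1   # 127 for 8 bits, 32767 for 16 bits
    return [round(s * max_code) for s in samples]


samples = sample_sine(441, 44100, 0.01)   # 10 ms of a 441 Hz tone at 44.1 kHz
codes_8 = quantize(samples, 8)            # coarse: 8-bit sample size
codes_16 = quantize(samples, 16)          # fine: 16-bit sample size
```

A higher sample rate gives more values per second, and a larger sample size gives each value finer steps.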

6
What is Sound/Audio

Dynamic range: the change in sound levels. For example, a large orchestra can reach 130 dB at its climax and drop to as low as 30 dB at its softest, giving a range of 100 dB.
Bandwidth: the range of frequencies a device can produce, or a human can hear.
FM radio: 50 Hz - 15 kHz
Children's ears: 20 Hz - 20 kHz
Older ears: 50 Hz - 10 kHz
Ultrasound: 20 kHz - 1 GHz
Hypersound: 1 GHz - 10 THz

7
Computer Representation of Sound

Sound waves are continuous, while computers are good at handling discrete numbers.
In order to store a sound wave in a computer, samples of the wave are taken.
Each sample is represented by a number, the code.
This process is known as digitisation.
This method of digitising sound is known as pulse code modulation (PCM).
To capture a frequency, the sampling rate must be at least twice that frequency, and human hearing extends to about 20 kHz. This is why one of the most popular sampling rates for high quality sound is 44100 Hz.
Another aspect we need to consider is the resolution, i.e. the number of bits used to represent a sample.
16 bits are used for each sample in high quality sound.
Different sound cards have different capabilities for processing digital sound.
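
Sampling rate and resolution each set a simple limit, sketched below. The first function relies on the sampling theorem (a sampled signal can only represent frequencies up to half the sampling rate), which is a standard result and not something stated on the slide; the function names are hypothetical.

```python
def highest_frequency_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent (half the rate)."""
    return sample_rate_hz / 2


def number_of_codes(bits):
    """How many distinct sample values a resolution of `bits` bits can encode."""
    return 2 ** bits


for rate in (11025, 22050, 44100):
    print(rate, "Hz sampling covers up to", highest_frequency_hz(rate), "Hz")

for bits in (8, 16):
    print(bits, "bits give", number_of_codes(bits), "codes")
```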

8
Computer Representation of Sound

Recording and digitising sound:
An analogue-to-digital converter (ADC) converts the analogue sound signal into digital samples.
A digital signal processor (DSP) processes the samples, e.g. filtering, modulation, compression, and so on.
Playing back sound:
A digital signal processor processes the samples, e.g. decompression, demodulation, and so on.
A digital-to-analogue converter (DAC) converts the digital samples into an analogue sound signal.

9
Quality vs File Size

The size of a digital recording depends on the sampling rate, resolution, number of channels and duration.
S = R x (b/8) x C x D
S = file size, in bytes
R = sampling rate, in samples per second
b = resolution, in bits
C = channels: 1 for mono, 2 for stereo
D = recording duration, in seconds
A higher sampling rate and a higher resolution give higher quality but a bigger file size.

10
Quality vs File Size

For example, if we record 10 seconds of stereo music at 44.1 kHz, 16 bits, the size will be
S = 44100 x (16/8) x 2 x 10
  = 1,764,000 bytes
  = 1722.7 Kbytes
  = 1.68 Mbytes
Note: 1 Kbyte = 1024 bytes, 1 Mbyte = 1024 Kbytes
High quality sound files are very big; however, the file size can be reduced by compression.
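
The calculation above can be reproduced with a short function. This is a sketch of the formula S = R x (b/8) x C x D for uncompressed sound only; the function name is made up for this example.

```python
def pcm_size_bytes(rate_hz, bits, channels, duration_s):
    """Uncompressed size in bytes: S = R x (b/8) x C x D."""
    return (rate_hz * bits * channels * duration_s) // 8


size = pcm_size_bytes(44100, 16, 2, 10)   # 10 s of 16-bit stereo at 44.1 kHz
kbytes = size / 1024                       # 1 Kbyte = 1024 bytes
mbytes = kbytes / 1024                     # 1 Mbyte = 1024 Kbytes
print(size, round(kbytes, 1), round(mbytes, 2))
```

Changing any one argument shows the trade-off directly: halving the sampling rate, or going from stereo to mono, halves the file size.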

11
Audio File Formats

The most commonly used digital sound format in Windows systems is the .wav file.
Sound is stored in .wav files as digital samples, using Pulse Code Modulation.
Each .wav file has a header containing information about the file:
type of format, e.g. PCM or other modulations
size of the data
number of channels
samples per second
bytes per sample
There is usually no compression in .wav files.
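
The header fields listed above can be inspected with the `wave` module from the Python standard library. The sketch below first writes one second of silence into an in-memory buffer so that it has something to read; the helper names and the dictionary keys are invented for this example.

```python
import io
import wave


def write_silence(fileobj, rate=44100, bits=16, channels=2, seconds=1):
    """Write an uncompressed PCM .wav stream containing only zero samples."""
    with wave.open(fileobj, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(bits // 8)          # bytes per sample
        w.setframerate(rate)               # samples per second
        w.writeframes(bytes(rate * seconds * channels * (bits // 8)))


def read_header(fileobj):
    """Return the main header fields of a .wav stream."""
    with wave.open(fileobj, "rb") as w:
        return {
            "channels": w.getnchannels(),
            "samples_per_second": w.getframerate(),
            "bytes_per_sample": w.getsampwidth(),
            "frames": w.getnframes(),
            "compression": w.getcomptype(),
        }


buffer = io.BytesIO()
write_silence(buffer)
buffer.seek(0)
print(read_header(buffer))
```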
Other formats may use different compression techniques to reduce file size.
.vox: uses Adaptive Differential Pulse Code Modulation (ADPCM).
.mp3: MPEG-1 Layer 3 audio.
RealAudio (.ra, .ram, .rm): a proprietary format.

12
Audio File Formats

WMA: Windows Media Audio (.wma)
Windows Media Audio is a Microsoft file format for encoding digital audio files, similar to MP3, though it can compress files at a higher rate than MP3.
MOV (movie): basically a video format where the pictures are omitted.
RIFF: Resource Interchange File Format, a Microsoft-developed format capable of handling digital audio and MIDI.
SDMI (Secure Digital Music Initiative): designed to protect against most forms of unauthorised copying.
SND (sound): limited to 8 bits, with interpreters available for the PC.
Ogg (.ogg): Ogg is a container format for compressed audio, comparable to other formats used to store and play digital music. It is typically used with Vorbis, an audio compression scheme that is designed to be contained in Ogg.

13
Audio File Formats

AIFF: Audio Interchange File Format is mostly used on Apple Macintosh and Silicon Graphics systems. AIFF files are easily converted to other file formats, but can be quite large. One minute of 16-bit stereo audio sampled at 44.1 kHz takes up about 10 megabytes.
Dolby Digital Surround Sound: also known as AC-3 (Audio Coding 3), or Dolby 5.1 (where .1 indicates the subwoofer bass channel). Dolby Digital has been chosen as the standard sound technology for DVD (digital video disc) and HDTV (high-definition TV).
Dolby Digital Surround Sound, digital track on film: a digitally encoded system of 6 separate and independent surround sound channels for 6 speakers: front (left/right), rear (left/right), front centre and subwoofer.
MIDI: Musical Instrument Digital Interface (.mid)
The MIDI representation of a sound includes values for the notes, pitch, length and volume.
It can also include additional characteristics, such as attack and decay time.

14
MUSIC

Music can be described in a symbolic way.
Music is the art of arranging tones in an orderly sequence so that they produce a sound.
Music is an art form whose medium is sound and silence.
Common elements of music are pitch, notes, scales, tempo, etc.
Any sound, including music, may be represented in that way.

15
Musical Instrument Digital Interface

A MIDI interface between electronic musical instruments and computers is a small piece of equipment that plugs directly into the computer's serial port and allows the transmission of music signals.
MIDI is a standard that allows a set of different musical instruments to exchange musical information.
The MIDI protocol is an entire music description language in binary form.

16
Musical Instrument Digital Interface

In MIDI, each word describing an action of a musical performance is assigned a specific binary code.
MIDI data is communicated digitally through a production system as a string of MIDI messages.
MIDI is a standard control language and hardware specification.
MIDI allows electronic musical instruments and other devices to communicate real-time and non-real-time performance and control data.

17
Musical Instrument Digital Interface

A MIDI interface has 2 different components:
1. Hardware to connect the equipment
The physical connection of musical instruments: the MIDI cable, and the hardware that processes the electrical signals received over the cable.
2. Data format
Encodes the information to be processed by the hardware.
The MIDI data format does not include the encoding of individual sample values, as audio formats do.

18
MIDI Connection

19
MIDI CABLES

20
MIDI DEVICES

A computer can control the output of individual MIDI devices.
A MIDI device communicates with other MIDI devices over channels.
Musical data transmitted over a channel are reproduced by the synthesizer at the receiving end.
Synthesizer: an electronic musical instrument, typically operated by a keyboard, producing sounds by generating and combining signals of different frequencies.
The computer can use the same interface to receive, store and process encoded musical data.
A computer uses the MIDI interface to control instruments for playout.

21
MIDI DEVICES

The microprocessor communicates with the keyboard to know what notes the musician is playing, and with the control panel to know what commands the musician wants to send to the microprocessor.
Pressing keys on the keyboard signals the microprocessor what notes to play and how long to play them.
The sound generator produces an audio signal.
The sound generator changes the quality of the sound, for example its pitch, loudness, tone, etc.

22
MIDI DEVICES

Sequencer: replays a sequence of MIDI messages.
MIDI interface: connects a group of MIDI devices together.
Sound sampler: records sound, then replays it on request.
Can perform transposition (shifting) of one base sample to produce different pitches.
Can take the average of several samples, then produce an interpolated output sound with a unique quality.
Control panel: controls all the functions of the MIDI devices.
Memory: stores all information for the sound format.

23
MIDI DEVICES

Keyboard (MIDI I/O)
i. Note polyphony
Nowadays, most keyboards have polyphony.
ii. Touch response
A keyboard can sense different levels of input pressure.
Keyboard synthesizer: a keyboard synthesizer has real-time audio output.
Some keyboard synthesizers support DSP (Digital Signal Processing), which gives more available effects: echo, chorus, etc.
You can then compose and make music with just a keyboard.
Guitar, flute, violin, drum set

24
MIDI DEVICES

Controllers
Numbered controllers
e.g. volume panel
Continuous Controllers
You can roll the controller to get a particular
value
e.g. modulation wheel
On/Off Controllers
can send two different values (e.g. 0/127)
e.g. foot pedal (sustain pedal)
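
A controller movement travels as a three-byte message: a status byte, the controller number and the value. The sketch below is an illustration under assumptions: the status value 0xB0 and the controller numbers (1 = modulation wheel, 7 = volume, 64 = sustain pedal) are taken from the MIDI 1.0 specification rather than from the slides, and the function names are hypothetical.

```python
CONTROL_CHANGE = 0xB0      # status nibble for controller messages (MIDI 1.0)
MODULATION_WHEEL = 1       # continuous controller
VOLUME = 7                 # numbered controller
SUSTAIN_PEDAL = 64         # on/off controller


def control_change(channel, controller, value):
    """Build a controller message for channel 0..15 with a 7-bit value."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if not 0 <= controller <= 127 or not 0 <= value <= 127:
        raise ValueError("controller and value must be 0..127")
    return bytes([CONTROL_CHANGE | channel, controller, value])


def sustain(channel, pressed):
    """On/off controllers send only the two extreme values, 0 and 127."""
    return control_change(channel, SUSTAIN_PEDAL, 127 if pressed else 0)


print(control_change(0, MODULATION_WHEEL, 42).hex())
print(sustain(0, True).hex(), sustain(0, False).hex())
```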

25
MIDI MESSAGE

MIDI uses a specific data format for each instrument.
The MIDI data format is digital, and the data are grouped into messages.
The messages are transmitted between the computer and the connected systems.
When a musician plays a key, the MIDI interface generates a MIDI message that defines the start of each strike and its intensity.
When the musician releases the key, a corresponding message is generated and transmitted.
Messages are assigned to channels.
- A channel is a separate path through which signals can flow.
Devices are set to respond to particular channels.
Every message (except system messages) has a channel number, which is stored in bits 0..3 of the status byte.
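
Extracting the channel number from a status byte is a single bit mask. In this sketch the function name is hypothetical, and the rule that status bytes from 0xF0 upwards are system messages without a channel comes from the MIDI 1.0 specification.

```python
def split_status(status):
    """Split a status byte into (message type, channel).

    System messages (0xF0 and above) carry no channel, so the channel is None.
    """
    if not 0x80 <= status <= 0xFF:
        raise ValueError("a status byte has its most significant bit set")
    if status >= 0xF0:
        return status, None
    message_type = status & 0xF0   # upper 4 bits: what kind of message
    channel = status & 0x0F        # bits 0..3: which channel it is for
    return message_type, channel


print(split_status(0b10010011))   # a channel message on channel 3
print(split_status(0b11111010))   # a system message, no channel
```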

26
MIDI CHANNEL MESSAGES

1. MIDI channel messages have 4 modes
Mode 1: Omni On, Poly: usually for testing devices
Mode 2: Omni On, Mono: has little purpose
Mode 3: Omni Off, Poly: for general purpose
Mode 4: Omni Off, Mono: for general purpose
where
i. Omni On/Off
respond to all messages regardless of their channel (On), or only to messages on the device's own channel (Off)
ii. Poly/Mono
respond to multiple/single notes per channel
2. Channel Voice Messages
Carry the musical component of a piece. A message usually has 2 types of byte:
i. Status byte
the 4 most significant bits identify the message type,
the 4 least significant bits identify which channel is to be affected
ii. Data byte
the most significant bit is 0, indicating a data byte;
the rest are data bits
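
As an illustration of the status byte / data byte layout, the sketch below assembles Note On and Note Off messages. The type codes 0x90 and 0x80 come from the MIDI 1.0 specification, not from the slides, and the function names are made up for this example.

```python
NOTE_OFF = 0x80   # message type in the 4 most significant bits (MIDI 1.0)
NOTE_ON = 0x90


def channel_voice_message(message_type, channel, *data):
    """One status byte followed by data bytes whose top bit is always 0."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15")
    if any(not 0 <= d <= 0x7F for d in data):
        raise ValueError("data bytes must be 0..127")
    return bytes([message_type | channel, *data])


def note_on(channel, note, velocity):
    """Key pressed: note number plus how hard it was struck."""
    return channel_voice_message(NOTE_ON, channel, note, velocity)


def note_off(channel, note):
    """Key released."""
    return channel_voice_message(NOTE_OFF, channel, note, 0)


message = note_on(0, 60, 100)     # note 60 on the first channel
print([format(b, "08b") for b in message])
```

Printing the bytes in binary makes the rule visible: only the first byte starts with a 1.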

27
MIDI SYSTEM MESSAGE

Real-time System Messages
Start
1st byte: status byte = 11111010
directs slave devices to start playback from time 0
Stop
1st byte: status byte = 11111100
directs slave devices to stop playback
the song position value doesn't change, so playback can be resumed where it stopped with the Continue message
Continue
1st byte: status byte = 11111011
directs slave devices to start playback from the present song position value
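
How a slave device might react to these three messages can be sketched as a tiny state machine. The class and its `tick` method are hypothetical simplifications: a real device advances its song position from timing messages, which are not modelled here.

```python
START = 0b11111010
CONTINUE = 0b11111011
STOP = 0b11111100


class SlaveTransport:
    """Toy model of a slave device's playback state."""

    def __init__(self):
        self.position = 0
        self.playing = False

    def handle(self, status):
        if status == START:        # play from time 0
            self.position = 0
            self.playing = True
        elif status == STOP:       # stop, but keep the song position
            self.playing = False
        elif status == CONTINUE:   # resume from the present position
            self.playing = True

    def tick(self):
        """Advance the song position by one step while playing."""
        if self.playing:
            self.position += 1


device = SlaveTransport()
device.handle(START)
device.tick()
device.tick()
device.handle(STOP)
device.tick()                      # ignored: the device is stopped
device.handle(CONTINUE)
device.tick()
print(device.position)
```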

28
MIDI SYSTEM MESSAGE

System Reset
1st byte: status byte = 11111111
devices return their control values to the default settings,
e.g. reset MIDI mode / program number assigned to patch
System Exclusive messages
The MIDI specification can't address every unique need of each MIDI device, so it leaves room for device-specific data.
SysEx messages are unique to a specific manufacturer.
1st byte: status byte = 11110000
2nd byte: manufacturer ID, e.g. 1 = Sequential, 67 = Yamaha
3rd byte onwards: data byte(s)
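
A System Exclusive message can be assembled byte by byte following the layout above. One detail is an addition from the MIDI 1.0 specification rather than from the slide: the data is closed by an end-of-exclusive byte, 11110111. The function name and the payload are made up for this example.

```python
SYSEX_START = 0b11110000
SYSEX_END = 0b11110111    # end-of-exclusive marker (MIDI 1.0, not on the slide)
YAMAHA = 67               # manufacturer ID quoted on the slide


def system_exclusive(manufacturer_id, payload):
    """Status byte, manufacturer ID, device-specific data bytes, end marker."""
    body = [manufacturer_id, *payload]
    if any(not 0 <= b <= 0x7F for b in body):
        raise ValueError("ID and data bytes must be 0..127")
    return bytes([SYSEX_START, *body, SYSEX_END])


packet = system_exclusive(YAMAHA, [1, 2, 3])   # arbitrary example payload
print(packet.hex(" "))
```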

29
MIDI SOFTWARE

Music recording and performance applications
Record MIDI messages as they enter the computer from other MIDI devices, then store, edit and play back the messages in performance.

This is a daisy-chain network, where devices are connected serially.

30
MIDI SOFTWARE

Recording software
e.g. Sony Sound Forge, Sonar, Cool Edit Pro, etc.
Much more efficient than tape recording
Can redo the recording process
Can easily do editing
Also allows effects (reverb, echo, chorus, etc.)

31
MIDI SOFTWARE

Musical notation and printing applications
Allow writing music in traditional musical notation.
The user can then play back the music using a performance program, or print the music on paper for live performance or publication.
Music education applications
Synthesizer patch editors and librarians
Store the information of different synthesizer patches in the computer's memory and allow editing of the patches in the computer.

32
APPLICATIONS OF MIDI

1. Studio production
recording, playback and editing
creative control/effects can be added
2. Making scores
with score editing software, MIDI is excellent for making scores
some MIDI software provides an auto-arrangement function
3. Learning
You can write a MIDI orchestra, which is always ready to practise with you
4. Commercial products
mobile phone ring tones, music box music, etc.
5. Musical analysis
MIDI has detailed parameters for every input note
It is useful for doing research
For example, a pianist can input his performance with a MIDI keyboard, then we can analyze his performance style from the parameters

33
Introduction in Speech

Speech is the ability to express thoughts and feelings by articulate sounds.
Speech is our basic communication tool.
Speech: the power of speaking; oral communication.
We have long hoped to be able to communicate with machines using speech.
Speech output deals with the machine generation of speech.
Voiced speech signals have an almost periodic structure over a certain time interval.
The spectrum of some sounds has characteristic maxima that normally involve up to five frequencies.

34
Speech Generation

Speech generation is a very interesting field for multimedia systems.
Speech recognition is the foundation of human-computer interaction using speech.
Speech generation is real-time signal generation.
Generated speech must be understandable and sound natural.
Speech recognition operates in different contexts:
Dependent on or independent of the speaker.
Discrete (individual) words or continuous speech.
Small vocabulary or large vocabulary.
Quiet environment or noisy environment.

35
Digital Speech

[Figure: waveform and spectrogram of a speech signal]

36
Speech Generation

A major challenge in speech output is how to generate these signals in real time, so that a speech output system can, for instance, convert text to speech automatically.
The most important technical terms used in relation to speech output include:
Speech basic frequency: the lowest periodic signal component in the speech signal.
A voiced sound is generated by oscillations of the vocal cords. The sounds M, W and L are examples.
Unvoiced sounds are generated with the vocal cords open, for example F and S.

[Figure: speech recognition block diagram. Labels: speech, parameter analyzer, reference patterns, comparison and decision algorithm, language model, words]

37
Voiced and Unvoiced Speech

[Figure: speech signal with silence, unvoiced and voiced segments labelled]

38

Speech synthesis generates speech with the desired properties (pitch, speed, loudness, etc.).
Speech synthesis has been widely used for text-to-speech systems and various telephone services.
The easiest and most often used speech synthesis method is waveform concatenation.

[Figure: increasing the pitch without changing the speed]

39
Speech Analysis

The primary quality characteristic of each speech recognition session is the probability of recognizing a word correctly. A word is always recognized only with a certain probability.
Speech analysis can also serve to analyze who is speaking, that is, to recognize a speaker for identification and verification.
The computer can identify and verify a person by, for example, fingerprint or voice.

40
Speech Transmission

Speech processing and speech transmission technology are expanding fields of active research.
Speech transmission is a field concerned with highly efficient encoding of speech signals to enable low-rate data transmission over networks.
New challenges arise from the anywhere, anytime nature of mobile communications and from Internet-based transmission protocols, such as Voice over IP.
"Advances in Digital Speech Transmission" provides an up-to-date overview of the field, including topics such as speech coding in heterogeneous communication networks, wideband coding, and the quality assessment of wideband speech.