Sanatan Dharma College, Ambala Cantt
B.C.A(3rd Year) Ms. Garima Sudan
Assistant Professor
Multimedia and tools
Unit -3
Digital Audio
What Is a Sound Wave?
Before talking about the digitization of sound it is crucial to understand what a
sound wave is exactly. Sound waves are realized when an instigating factor, such
as the striking of a drum head or the plucking of a string, causes the molecules
of a medium, typically air, to move. The molecules vibrate in a process that
alternates between compression (becoming denser and tightly packed) and
rarefaction (becoming less dense). The wave propagates in this way through the
medium until its energy is dissipated in the form of heat. It should be noted
that the medium could also be a liquid or a solid, and in fact, air is one of the
slowest mediums for transmitting sound. The diagram below identifies how the
amplitude of a sine wave would correspond to the compression and rarefaction
of molecules in a medium.
Anything that vibrates — a string, drum head, wine glass, tuning fork — will
produce a corresponding movement of molecules in the medium which we
perceive as sound.
What Is Analog Sound?
For some people, the term analog sound refers to old technology — which is of
course true to some extent. But analog technologies remain a crucial part of
music production. To understand why, we should define the origin of the term.
The word “analog” is derived from the word “analogous” which means comparable,
similar or related. In terms of audio technology, this idea is clearly present in
two fundamental devices for recording and creating music — the speaker and
the microphone.
Microphones create a change in voltage or capacitance that is analogous to the
movement of their diaphragm, which is instigated by a sound wave.
Speakers transform an electrical signal into a sound wave by creating analogous
movements of a speaker cone.
Both devices are considered transducers, in that they convert one form of
energy into another. And both are analog devices that are just as relevant today
as the day they were invented.
How Is Sound Digitized?
Before the advent of computers, sound was recorded using technologies like
magnetic tape, vinyl and — very early on — wax cylinders. Engineers strived for
the highest fidelity possible depending on the limitations of the medium. One of
the major potential limitations is that of dynamic range — the range of possible
amplitude values from the noise floor to the maximum peak level before the
onset of distortion. As audio production technology has advanced, so has
dynamic range as evidenced below:
Approximate Dynamic Ranges
FM Radio: 50 dB
Cassette Tape: 60-70 dB
Vinyl: 70-88 dB
Audio CD: 96 dB
24 Bit Audio: 144 dB
It should be noted that the goal of absolute fidelity is somewhat misleading. The
resurgence of vinyl and retro or lo-fi aesthetics indicates that taste and the
effect or “limitations” of a particular medium can be valued as part of the
process, and the so-called flaws in audio fidelity may actually be desired.
The digitization of sound is necessary whenever computers are involved in the
recording, production or dissemination of music, which pretty much covers
everything except live performance. And even then, on-stage musicians are
probably using digital effects somewhere along the line.
To convert analog sound to the digital realm means taking an analog signal and
creating a representation of that signal in the language of computers, which is
binary (zeroes and ones).
An analog signal is continuous, meaning constantly changing in amplitude and time.
Digital conversion requires that it be sampled or measured periodically to make
it understandable and editable in a computer system. There are two conversion-
related terms to be aware of:
Analog to digital converter (ADC) – converts an analog signal to a digital file
Digital to analog converter (DAC) – converts a digital file to an analog signal
This pairing of devices or processes is the essence of digital audio production.
Sample Rate and Bit Depth
The digitization process has several user-defined variables which will influence
things like frequency range, dynamic range, file size and fidelity. Two
fundamental variables you should be aware of are sampling rate and bit
depth (or resolution).
The sampling rate is the rate at which amplitude measurements are taken as an
analog signal is converted or as a previously digitized file is resampled. The
resampling process could downsample (which reduces the sampling rate)
or upsample (which increases the rate).
Downsampling might be required when files recorded or created at higher
sampling rates, such as 48 kHz (48,000 samples per second) or 96 kHz (96,000
samples per second), need to be prepared for audio CD distribution. This
particular medium requires a sampling rate of 44.1 kHz (44,100 samples per
second).
Upsampling is used by mastering engineers to create higher resolution files
before processing to provide better results. This is followed by a downsampling
process to prepare the file for distribution.
A visual representation of the sampling process.
Nyquist Rate
The Nyquist rate is a concept derived from digital sampling theory which states
that to accurately represent a particular frequency, the signal must be sampled
at twice the rate of that frequency. For example, to create an accurate digital
representation of 10 kHz, you would need to use a minimum 20 kHz sampling rate.
When the audio CD standard was developed this was one consideration in
determining the standard sampling rate to be used. Based on the Nyquist
theorem, a sampling rate of 44.1 kHz can accurately recreate a 22,050 Hz
frequency in the digital realm. Since the range of human hearing is generally
considered to be 20 Hz to 20 kHz, this was considered to be sufficient and
manageable by computing systems and equipment at the time. Since then, higher
sampling rates have become commonplace, including 48 kHz (used in video
contexts), 88.2 kHz, 96 kHz, and 192 kHz.
A logical question is: why use such high sampling rates when human perception
tops out at roughly 20 kHz? Part of the answer lies in
the benefits of oversampling, which can reduce audio artifacts known as aliasing.
When audio effects processing is performed at higher rates, the results are
improved and the presence of artifacts is reduced. For more about oversampling,
see the article “Oversampling in Digital Audio: What Is It and When Should You
Use It?”
In terms of recording, using higher sampling rates seems to provide a more
pristine result as well. When interactions between frequencies occur, sum tones
and difference tones are produced and the capability of a digital conversion
process to represent frequencies beyond the range of human hearing can
contribute to better results in the audible range.
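As a rough illustration of the Nyquist rate (a sketch added for clarity, not part of any standard; the tone frequencies and sampling rates below are arbitrary example values), the short Python snippet computes the minimum sampling rate for a given frequency and the alias frequency that results when a tone is sampled below that rate.

def nyquist_rate(frequency_hz):
    # Minimum sampling rate needed to represent this frequency accurately.
    return 2 * frequency_hz

def alias_frequency(tone_hz, sample_rate_hz):
    # Frequency a tone "folds back" to when sampled below its Nyquist rate.
    folded = tone_hz % sample_rate_hz
    return folded if folded <= sample_rate_hz / 2 else sample_rate_hz - folded

print(nyquist_rate(10_000))              # 20000: a 10 kHz tone needs at least 20 kHz sampling
print(alias_frequency(10_000, 44_100))   # 10000: represented correctly at 44.1 kHz
print(alias_frequency(10_000, 12_000))   # 2000: aliases down to 2 kHz at only 12 kHz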
Equally or perhaps more important than sampling rate is bit depth or resolution.
This can be thought of as the accuracy of how each sample is measured. The
higher the bit depth the more accurate the amplitude measurement. The three
most common bit depths used are 16, 24, and 32 bit. Each bit in a binary system
can be either 0 or 1, which translates to a certain number of possible values
based on the number of bits used. For example:
16 bit samples can have 2^16 possible values, or 65,536
24 bit samples can have 2^24 possible values, or 16,777,216
32 bit samples can have 2^32 possible values, or 4,294,967,296
The higher the number of possible values the less quantization error and
hence the less noise in a recording. This translates into a significantly wider
dynamic range for 24 bit versus 16 bit recordings (see the dynamic range chart
above for the difference between 16 bit and 24 bit audio).
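The relationship between bit depth, the number of possible values, and dynamic range can be checked with a few lines of Python (an illustrative sketch; it uses the common approximation of roughly 6 dB of dynamic range per bit):

import math

def quantization_levels(bits):
    # Number of distinct amplitude values available at a given bit depth.
    return 2 ** bits

def approx_dynamic_range_db(bits):
    # 20 * log10(2^bits), i.e. about 6.02 dB per bit.
    return 20 * math.log10(2 ** bits)

for bits in (16, 24, 32):
    print(bits, quantization_levels(bits), round(approx_dynamic_range_db(bits), 1))
# 16 65536 96.3
# 24 16777216 144.5
# 32 4294967296 192.7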
Below is an example of two different bit depths used in computer graphics.
Consider two digital color palettes: 8 bit and 24 bit.
Note that in the 8 bit palette there are only 256 choices (2^8), meaning if you are
trying to match an existing color, you can only get so close.
In the 24 bit palette, the choices are in the millions and the image appears to be
almost a continuous blurring of one color transforming into the next. With this
palette, you could get much closer to a specific color.
In terms of audio, less error or rounding of values means a more accurate digital
representation of the analog input.
Audio Signal
Analog audio is a method of sound recording and reproduction that captures and
stores audio signals as continuous waveforms. This process directly corresponds
to the variations in air pressure of the original sound wave, allowing for a more
natural representation of the sound. Analog audio has been the standard for
many years, particularly before the advent of digital technologies, and is still
valued today for its distinctive qualities and unique sound.
Key Concepts and Characteristics of Analog Audio:
1. Waveform Representation
• Continuous Signal: Analog audio captures sound as a continuous electrical
signal. The signal’s amplitude and frequency are directly proportional to
the original sound wave, meaning that it replicates the natural
fluctuations of sound in its entirety.
2. Recording Mediums
• Magnetic Tape: One of the most common analog recording methods
involves capturing audio on magnetic tape. This was widely used in
cassette tapes, reel-to-reel tapes, and professional recording studios.
• Vinyl Records: Analog audio is also represented as grooves in vinyl
records. The physical shape of these grooves corresponds to the sound
wave, and when a needle (stylus) moves along the grooves, it vibrates to
reproduce the original sound.
• Mechanical and Acoustical Storage: Some early recording methods like
wax cylinders and phonographs used mechanical means to store analog
sound information.
3. Characteristics and Quality
• Warmth and Richness: Analog audio is often described as having a warm,
natural sound quality. This is because it captures subtle nuances and
harmonics that are sometimes lost in digital recordings.
• Dynamic Range: Analog systems can handle a wide dynamic range, but
they are also subject to limitations such as tape saturation or the
inability to capture extremely low or high frequencies accurately.
• Noise and Distortion: Analog recordings are susceptible to external
interference, which can result in tape hiss, hum, or other forms of
background noise. Over time, the quality of analog media can degrade due
to wear and tear.
4. Playback and Reproduction
• The playback of analog audio involves mechanical or electrical means to
convert the stored waveform back into an audio signal. For example, in a
vinyl record, the stylus reads the groove and vibrates to produce an
electrical signal, which is then amplified and played back through
speakers.
5. Editing and Mixing
• Editing analog audio is typically more complex and involves physically
cutting and splicing tape or manipulating it through mechanical means.
The limitations in editing make the process more labor-intensive but
sometimes contribute to a unique, raw sound.
6. Analog vs. Digital Sound
• Analog Fidelity: Analog audio can theoretically capture all the
information within the audible frequency range because it is not limited
by discrete sampling points. However, in practice, imperfections and
limitations of the medium can introduce noise, distortion, and a loss of
fidelity.
• Perceived Differences: Many listeners describe analog audio as warmer,
smoother, or more "musical" due to its ability to capture the harmonic
distortion and subtle fluctuations present in the sound wave.
Advantages of Analog Audio:
• Natural Sound: Analog recordings are often praised for their warmth,
depth, and natural sound characteristics.
• Unique Distortion and Artifacts: The harmonic distortion introduced by
analog equipment like tube amplifiers or tape machines can add a pleasing
quality to the sound.
• Organic Quality: Analog’s continuous signal representation can give it a
more organic and less clinical feel.
Disadvantages of Analog Audio:
• Susceptibility to Noise and Degradation: Analog recordings can pick up
external noise and degrade over time, losing quality with repeated
playback.
• Storage and Handling: Analog media like vinyl records and tapes require
careful handling and storage, as they can be damaged by dust, humidity,
and heat.
• Editing and Reproduction Limitations: Analog editing is more
cumbersome, and creating exact copies can result in a loss of quality.
Examples of Analog Audio Formats:
• Vinyl Records: Sound is etched into a disc as physical grooves that
correspond to the audio waveform.
• Cassette Tapes: Audio is stored magnetically on a strip of tape, which is
read by a tape head.
• Reel-to-Reel Tapes: Used in professional recording, these tapes offer
higher fidelity and quality compared to cassette tapes.
• 8-Track Tapes: An older, now obsolete format for playing music in a
continuous loop.
Digital vs Analog audio
Digital and analog audio are two distinct methods of recording, storing, and
reproducing sound. They have fundamental differences in the way they capture
and process audio signals:
1. Definition
• Analog Audio: Analog audio refers to sound recorded as continuous
electrical signals that directly correspond to the sound waves. It
captures the nuances of the original sound wave in its entirety.
• Digital Audio: Digital audio involves converting sound waves into a series
of discrete numbers or data points using a process called sampling. The
original sound is represented as a sequence of binary numbers.
2. Recording Process
• Analog Audio Recording: Captured using microphones and stored on
mediums like vinyl records or magnetic tapes. The waveform of the
recorded signal is continuous and replicates the changes in sound
pressure.
• Digital Audio Recording: Sound is captured, sampled, and quantized into
a numerical format (e.g., CD audio is sampled at 44.1 kHz with 16-bit
depth). These samples are stored digitally (e.g., on a hard drive or digital
storage device).
3. Signal Quality
• Analog Audio: Often praised for its "warmth" and "natural" sound, as it
captures all variations of the audio signal without truncation. However, it
is susceptible to noise, distortion, and degradation over time.
• Digital Audio: Offers cleaner, more precise sound with greater
consistency. Digital recordings do not degrade over time but may lose
some nuances compared to analog, depending on the sample rate and bit
depth.
4. Noise and Distortion
• Analog Audio: Prone to noise and distortion from mechanical and
electrical sources, which can affect playback quality (e.g., tape hiss, vinyl
crackles).
• Digital Audio: Generally free from noise and distortion introduced during
recording or playback, as long as the equipment is functioning properly.
The main limitation is quantization noise, but this is minimal with high
sampling rates.
5. Editing and Processing
• Analog Audio: Editing is more challenging and typically involves cutting
and splicing tapes or using complex mixing techniques. Analog processes
may introduce additional noise or loss of quality.
• Digital Audio: Much easier to edit, as you can manipulate individual
samples with software. Digital files can be copied, edited, and processed
repeatedly without loss of quality.
6. Storage and Portability
• Analog Audio: Requires larger physical storage (e.g., tapes, records) and
is vulnerable to environmental conditions like humidity or heat.
• Digital Audio: Can be compressed and stored on various devices like CDs,
DVDs, or hard drives, and easily transmitted over the internet.
7. Fidelity and Accuracy
• Analog Audio: Theoretically offers infinite resolution in capturing sound
since it is a continuous signal, but practical limitations like noise and wear
reduce its fidelity.
• Digital Audio: Offers highly accurate reproduction of sound based on
sampling rate and bit depth. Higher sample rates (e.g., 96 kHz, 24-bit)
offer more detail, approaching or exceeding what is typically perceptible
to the human ear.
8. Usage and Popularity
• Analog Audio: Favored by some audiophiles, musicians, and sound
engineers for its unique character and sound qualities. Often used for
vinyl records, some studio recordings, and certain live performances.
• Digital Audio: Dominates the market in music production, broadcasting,
and consumer electronics due to its convenience, versatility, and high-
quality playback.
9. Examples
• Analog Audio Formats: Vinyl records, magnetic tapes, cassettes, reel-to-
reel tapes.
• Digital Audio Formats: CDs, MP3s, WAV files, FLAC files, and digital
streaming formats.
10. Pros and Cons
Pros
• Analog Audio: warm, natural sound; continuous signal capture; unique character for some genres
• Digital Audio: clean, precise sound; easily editable and processable; no degradation over time
Cons
• Analog Audio: susceptible to noise/distortion; degrades over time
• Digital Audio: limited resolution based on sampling; may lack the "warmth" of analog
Digitization of Sound
Digitization is the process of converting an analog signal to a digital signal.
There are three steps in the digitization of sound.
Sampling
Sampling is a process of measuring air pressure amplitude at equally spaced
moments in time, where each measurement constitutes a sample. A sampling rate
is the number of times the analog sound is sampled per second. A higher sampling
rate implies that more samples are taken during the given time interval and
ultimately, the quality of reconstruction is better. The sampling rate is
measured in Hertz (Hz for short), the unit for cycles per second. A sampling
rate of 5000 Hz (or 5 kHz, which is the more common usage) implies that 5,000
samples are taken every second. The three sampling rates most often used in
multimedia are 44.1 kHz (CD quality), 22.05 kHz and 11.025 kHz.
Quantization
Quantization is the process of representing the amplitude of each sample as an
integer. The number of bits used to represent the value of each sample is known
as the sample size, bit depth or resolution. Commonly used sample sizes are
either 8 bits or 16 bits. The larger the sample size, the more
accurately the data will describe the recorded sound. An 8-bit sample size
provides 256 equal measurement units to describe the level and frequency of
the sound in that slice of time. A 16-bit sample size provides 65,536 equal units
to describe the sound in that sample slice of time. The value of each sample is
rounded off to the nearest integer (quantization) and if the amplitude is greater
than the intervals available, clipping of the top and bottom of the wave occurs.
Encoding
Encoding converts the integer base-10 number to a base-2, that is, binary
number. The output is a binary expression in which each bit is either a 1 (pulse)
or a 0 (no pulse).
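To tie the three steps together, here is a minimal Python sketch (illustrative only; the 8 kHz sampling rate, 8-bit sample size and 1 kHz sine tone are arbitrary example values, and the sine function merely stands in for a real analog input) that samples a signal, quantizes each sample to an integer level, and encodes the levels in binary:

import math

SAMPLE_RATE = 8000          # samples per second
BIT_DEPTH = 8               # bits per sample, giving 2^8 = 256 quantization levels
LEVELS = 2 ** BIT_DEPTH

def analog_signal(t):
    # Stand-in for the analog input: a 1 kHz sine tone with amplitude in [-1, 1].
    return math.sin(2 * math.pi * 1000 * t)

samples = []
for n in range(8):                                      # take 8 samples (1 ms of audio)
    t = n / SAMPLE_RATE                                 # sampling: equally spaced moments in time
    amplitude = analog_signal(t)
    level = round((amplitude + 1) / 2 * (LEVELS - 1))   # quantization: map [-1, 1] to 0..255
    level = max(0, min(LEVELS - 1, level))              # clip anything outside the range
    samples.append(level)

encoded = [format(level, "08b") for level in samples]   # encoding: base-10 integers to binary
print(samples)   # eight integer levels in the range 0..255
print(encoded)   # the same levels as 8-bit binary strings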
Quantization of Audio
Quantization is the process of assigning a discrete value, from a range of
possible values, to each sample. The number of possible values depends on the
number of bits used to represent each sample. Quantization results in a stepped
waveform resembling the source signal.
Quantization Error/Noise
The difference between a sample and the value assigned to it is known as
quantization error or noise.
Signal to Noise Ratio (SNR)
The signal-to-noise ratio refers to signal quality versus quantization error. The
higher the signal-to-noise ratio, the better the voice quality. Working with very
small signal levels often introduces proportionally more error, so instead of
uniform quantization, non-uniform quantization is used in the form of companding.
Companding is a process of distorting the analog signal in a controlled way by
compressing large values at the source before quantization takes place and then
expanding them at the receiving end.
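Companding is commonly implemented with the μ-law curve used in digital telephony. The Python sketch below is a simplified illustration of the idea rather than a complete codec (μ = 255 is the value used in North American telephony); it compresses a sample before quantization and expands it again at the receiving end.

import math

MU = 255   # mu-law parameter

def compress(x):
    # Compress a sample in [-1, 1]; small values get relatively more resolution.
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y):
    # Inverse operation applied at the receiving end.
    return math.copysign((math.exp(abs(y) * math.log1p(MU)) - 1) / MU, y)

x = 0.01                 # a very quiet sample
y = compress(x)          # compressed to about 0.23, so it survives coarse quantization
print(round(y, 3), round(expand(y), 5))   # 0.228 0.01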
Transmission of Audio
In order to send the sampled digital sound or audio over a wire, that is, to
transmit the digital audio, it must eventually be recovered as an analog signal
at the receiving end. This process is called demodulation.
PCM Demodulation
A PCM demodulator reads each sampled value, applies analog filters to suppress
energy outside the expected frequency range, and outputs the resulting analog
signal at the receiving end of the network.
Digital audio file formats
Digital audio file formats refer to the different types of files that store audio
data. Each audio file type has unique benefits and drawbacks. Knowing which one
is best for a specific task or situation saves time and reduces errors. Here are
seven popular audio file types and some of the differences between them.
1. M4A audio file type
The M4A is an MPEG-4 audio file. It is a compressed audio file that has become
common as cloud storage and the larger hard drives in contemporary computers
have raised the demand for higher quality. Its high quality keeps it relevant, as
users who need to hear distinct detail in audio files will choose it over more
common file types.
M4A file types are compressed audio files used by Apple iTunes.
Music download software like Apple iTunes uses M4A instead of MP3 because it is
smaller in size and higher in quality. Its limitations come in the form of
compatibility, as a lot of software is unable to recognize the M4A, making it
ideal for only a select type of user.
2. FLAC
The FLAC audio file is the Free Lossless Audio Codec. It is an audio file
compressed to a smaller size than the original file. It's a sophisticated file
type that is less commonly used among audio formats. This is because, even though it has its
advantages, it often needs special downloads to function. When you consider
that audio files are shared often, this can make for quite an inconvenience to
each new user who receives one.
The FLAC is a lossless audio file.
What makes the FLAC so important is that its lossless compression saves space and
makes an audio file easier to share, while allowing it to be restored to the
original quality standard. A FLAC file typically requires about sixty percent of
the storage space of the original audio file – this saves a lot of hard drive
space and time spent uploading or downloading.
3. MP3
The MP3 audio file is an MPEG audio layer 3 file format. The key feature of
MP3 files is the compression that saves valuable space while maintaining near-
flawless quality of the original source of sound. This compression makes the MP3
very popular for all mobile audio-playing devices, particularly the Apple iPod.
The MP3 stays relevant among newer audio file types due to its high quality and
small size.
MP3 continues to be relevant in today’s digital landscape because it’s compatible
with nearly every device capable of reading audio files. The MP3 is probably
best used for extensive audio file sharing due to its manageable size. It also
works well for websites that host audio files. Finally, the MP3 remains popular
because of its overall sound quality. Though not the highest quality, it has
enough other benefits to compensate.
4. MP4
An MP4 audio file is often mistaken as an improved version of the MP3 file.
However, this couldn’t be further from the truth. The two are completely
different and the similarities come from their namesake rather than their
functionality. Also note that the MP4 is sometimes referred to as a video file
instead of an audio file. This isn’t an error, as in fact it’s both an audio and video
file.
There are plenty of differences between the MP4 and MP3.
An MP4 audio file type is a comprehensive media container, capable of holding
audio, video and other media. The MP4 holds data rather than the code needed to
decode it. This is important to note, as MP4 files require the appropriate codecs
before their contents can be decoded and played.
5. WAV
A WAV audio file is a Waveform Audio File that stores waveform data. The stored
waveform data presents a picture of the strength of volume and sound in specific
parts of the WAV file. It is entirely possible to compress a WAV file, though
this is not standard. Also, the WAV is typically used on Windows systems.
The WAV offers an uncompressed format.
The easiest way to envision this concept is by thinking of ocean waves. The
water is loudest, fullest and strongest when the wave is high. The same holds
true for the waveform in the WAV. The visuals are high and large when the
sound increases in the file. WAV files are usually uncompressed audio files,
though it’s not a requirement of the format.
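Because WAV files usually hold plain uncompressed PCM, Python's standard wave module can read them directly. A minimal sketch (the filename example.wav is just a placeholder for any WAV file on disk):

import wave

with wave.open("example.wav", "rb") as wav_file:        # "example.wav" is a placeholder path
    print("Channels:    ", wav_file.getnchannels())
    print("Sample width:", wav_file.getsampwidth(), "bytes")   # 2 bytes per sample = 16-bit audio
    print("Sample rate: ", wav_file.getframerate(), "Hz")
    print("Frames:      ", wav_file.getnframes())
    pcm_bytes = wav_file.readframes(wav_file.getnframes())     # the raw PCM sample data
    print("Data size:   ", len(pcm_bytes), "bytes")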
6. WMA
The WMA (Windows Media Audio) is a Windows-based alternative to the more
common and popular MP3 file type. What makes it so beneficial is its efficient
compression (including a lossless variant), which retains high audio quality
through all types of restructuring processes. Even though it's such a quality
audio format, it's not the most popular, because it's inaccessible to many users,
especially those who don't use the Windows operating system.
The WMA is a great file for Windows users.
If you’re a Windows user, simply double-click any WMA file to open it. The file
will open with Windows Media Player (unless you’ve changed the default program).
If you’re not using Windows, there are some alternatives to help you out. The
first option is to download a third-party system that plays the WMA. If this
isn’t something you want to do, consider converting the WMA to a different
audio format. There are plenty of conversion tools available.
7. AAC
The AAC (Advanced Audio Coding) is an audio file that delivers decently high-
quality sound and is enhanced using advanced coding. It has never been one of
the most popular audio formats, especially when it comes to music files, but the
AAC does still serve some purpose for major systems. This includes popular
mobile devices and video gaming units, where the AAC is a standard audio
component.
The AAC is a highly-practical audio file.
To open an AAC file, the most common and direct format for most users is
through iTunes. All this entails is launching the iTunes system and opening the
AAC file from your computer in the ‘File’ menu. If you don’t have iTunes and
want an alternative, consider downloading third-party software capable of
opening the AAC. If that doesn’t suit your needs, convert the AAC to a more
common audio file type.
MIDI (Musical Instrument Digital Interface)
The Musical Instrument Digital Interface (MIDI) is a music transmission and
storage standard that was originally developed for digital music synthesizers.
MIDI does not convey recorded sound; instead, it contains musical notes,
durations, and pitch information, which the receiving device can use to play the
music from its own sound library.
What is MIDI?
Musical Instrument Digital Interface is like a universal plug-and-play for music.
It is a standard, supported by sound card and instrument manufacturers, for
recording and playing back music on digital synthesizers. It was created to
control one keyboard from another, but it was quickly adopted for use with personal
computers. It’s like having a remote control that not only plays notes but can
also start up beats, play patterns, and control effects on other gear,
even computers loaded with music programs. A fact about MIDI is that it
doesn't deal with the actual sounds but with the instructions on how to make
those sounds: what notes to play, how soft or loud to play them, and what feel to
give them.
How does MIDI Work?
Musical Instrument Digital Interface is a protocol that enables electronic
musical instruments, computers, and other devices to communicate with each
other. Unlike transmitting audio signals, MIDI transmits data about musical
events like notes, pitch, velocity, and timing. This data is structured into
messages, including Note On/Off, Control Change, and others. Devices are
connected via MIDI cables or Universal Serial Bus, with controllers sending
MIDI messages based on user input, and sound generators receiving these
messages to produce sound. MIDI allows for real-time performance control and
recording/playback in sequencers, providing a standardized method for
electronic musical devices to interact and create music.
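Because MIDI messages are just short sequences of bytes, the structure described above can be shown directly. The Python sketch below only builds the byte strings and prints them; it does not send anything to a real device, and the note and velocity values are arbitrary examples.

def note_on(channel, note, velocity):
    # 3-byte Note On message; channels are numbered 1-16, stored as 0-15 in the low nibble.
    status = 0x90 | (channel - 1)           # upper nibble 0x9 = Note On
    return bytes([status, note & 0x7F, velocity & 0x7F])

def note_off(channel, note):
    # Matching Note Off message (status nibble 0x8), velocity 0.
    status = 0x80 | (channel - 1)
    return bytes([status, note & 0x7F, 0])

# Middle C (note number 60) played fairly loudly on channel 1.
print(note_on(1, 60, 100).hex())    # 903c64
print(note_off(1, 60).hex())        # 803c00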
Uses of MIDI in Music
• Playing Instruments: It lets you control different sounds or instruments
from just one keyboard or pad. So, you could play a piano, drums, or even
strings from the same device.
• Making Music: MIDI helps in putting together music tracks. You can
record bits of music, mess around with them, fix timing, or change notes
until everything sounds just right.
• Writing Songs: It’s great for songwriters too. You can tap out a tune on a
MIDI keyboard, and the software shows you the notes, making it easier to
see and tweak your music.
• Creating Sounds: If you’re into making unique sounds, MIDI can be your
best friend. You can use it to control synths and create cool new sounds for
your tracks.
• Performing Live: In live shows, MIDI can be a game-changer. One person
can control a whole bunch of instruments and gadgets, making sure the music
and even stage lights or visuals are all in sync.
• Learning Music: It’s also a fantastic tool for learning music. Whether it’s
understanding how different parts come together or getting the hang of
musical notes, MIDI can make learning more interactive.
• Working in Studios: In recording studios, MIDI is everywhere. It
connects with the software, helping to control virtual instruments, edit music,
or even automate mixing tasks like adjusting volumes or effects.
• Automating Stuff: And not just in studios, MIDI helps in live setups too,
managing gear like mixers or effects, making sure everything changes just at
the right moment in a song or show.
Why Use Musical Instrument Digital Interface?
Using MIDI, or Musical Instrument Digital Interface, is crucial in the music
world because it streamlines composing, recording, and performing. It’s a key
player in music production, letting musicians and producers easily control
synthesizers, sequence tracks, and edit music. MIDI works well in live
performance too, enabling one-person control over multiple instruments and tech,
making shows smoother and more dynamic. Its universal compatibility means it
works with all sorts of equipment, no matter the brand. In studios, MIDI is vital
for manipulating digital audio workstations (DAWs), enhancing sound design, and
ensuring precise automation of musical elements. For learners, it’s a fantastic
tool, making music education more interactive and accessible. Essentially, MIDI
transforms musical ideas into a format that digital devices can understand and
manipulate, making it a fundamental technology in today’s music industry.
MIDI Channels
• Number of Channels: MIDI supports up to 16 channels on a single MIDI
port. This means you can control up to 16 different instruments or sound
sources through one cable or connection.
• Channel Assignment: Each MIDI instrument or device can be set to
listen to a specific channel. For example, you might have a keyboard set to
play a piano sound on MIDI channel 1, a drum machine receiving on channel 2,
and a string section on channel 3.
• Independent Control: Because each channel operates independently,
changes made on one channel, like altering the volume or applying an effect,
won’t affect the other channels. This independence is crucial for complex
arrangements and performances.
• Multi-Timbral Instruments: Some synthesizers and sound modules are
multi-timbral, meaning they can receive and play back different sounds on
different channels simultaneously. For instance, a multi-timbral synthesizer
could play a bassline, a melody, and a drum part on separate channels all at
the same time.
• Versatility in Use: MIDI channels enhance live performance flexibility,
allowing performers to switch between instruments or layers of sounds easily.
In studio production, they help in arranging and mixing by keeping different
instrument parts on separate channels for easier editing and manipulation.
3 Most Common MIDI Setups
• DAW and MIDI controller: The most basic and common MIDI setup is
to use a MIDI controller with your DAW in a home studio. It’s an easy,
portable, and effective way to use MIDI.
• Computer, MIDI interface and synthesizers: MIDI tracks in your DAW
sequencer can operate actual hardware synthesisers because of your MIDI
interface’s conversion capabilities. That means you may use all of your digital
tools and skills to enter and alter notes before playing them back on a real
synthesizer—or any other MIDI-capable device.
• Hardware sequencer, drum machine and synthesizer: The sequencer
uses MIDI THRU to transfer data to three devices: two synths and a drum
machine. This setup is similar to a small DAW setup made up just of
hardware gear.
What are MIDI files?
A MIDI file stores MIDI data that can be played back by a device. A MIDI file
simply provides data on which notes to play; therefore, the sound will vary
depending on the device playing it back. Because these files are so small, they
were popular in early video games and as mobile ringtones. The standard MIDI
file is a file format for exchanging MIDI information. It usually has the .mid
file extension. Because of the wide availability of MIDI music from early video
games, some bands now use game consoles as instruments. This style is known as
chiptune.
Advantages of MIDI
• MIDI is incredibly versatile, allowing for the control and synchronization
of a wide range of electronic musical instruments, software, and devices.
• MIDI messages are lightweight compared to audio data.
• MIDI facilitates real-time performance.
• MIDI is a standardized protocol, ensuring compatibility between
different MIDI-enabled devices from various manufacturers.
• MIDI is easy to implement.
• MIDI has small file size.
Disadvantages of MIDI
• MIDI does not transmit audio signals.
• The overall latency of MIDI is high.
• MIDI functionality depends on hardware and software.
• MIDI data transmission can be susceptible to interference or dropout,
leading to data loss or corruption.
• MIDI does not specify the final sound.
Multimedia Digital Audio Coding
Audio coding is used to obtain compact digital representations of high-fidelity
audio signals for efficient transmission or storage. The objective of audio
coding is to represent the signal with the minimum number of bits while achieving
transparent signal reproduction, that is, generating output audio that cannot be
distinguished from the original input.
Pulse Code Modulation (PCM)
PCM is a method or technique for transmitting analog data in a digital and binary
way, independent of the complexity of the analog waveform. All types of analog
data, such as video, voice and music, can be transferred using PCM. It is the
standard form for digital audio in computers. It is used by the Blu-ray, DVD and
Compact Disc formats and by other systems such as digital telephone systems.
PCM consists of three steps to digitize an analog signal.
Sampling
Sampling is the process of reading the values of the filtered analog signal at
regular intervals. Suppose the analog signal is sampled every Ts seconds; Ts is
referred to as the sampling interval. Then
fs = 1/Ts
is called the sampling rate or sampling frequency.
Quantization
Quantization is the process of assigning a discrete value from a range of
possible values to each obtained sample. Possible values will depend on the
number of bits used to represent each sample. Quantization can be achieved
either by rounding the sample to the nearest available value or by truncating it
to the nearest available value below the actual sample. The difference between the sample and the value
assigned to it is known as the quantization noise or quantization error.
Quantization noise can be reduced by increasing the number of quantization
intervals or levels because the difference between the input signal amplitude
and the quantization interval decreases as the number of quantization intervals
increases.
Binary Encoding
Binary encoding is the process of representing the sampled values as binary
numbers in the range 0 to n-1. The value of n is chosen as a power of 2,
depending on the accuracy required.
Differential Coding
Differential coding operates by making numbers small. Smaller numbers require
less information to code than larger numbers. It would be more accurate to
claim that similar numbers require less information to code. Suppose you are
working with a string of bytes. Bytes range from 0 to 255. Here is a string:
150 152 148 150 153 149 152
On the scale of 0 to 255, those numbers are reasonably large, but they are quite
similar to each other. The differences are therefore taken and coded instead of
the complete numbers. Normally the first number is kept as is and then the
differences are computed.
150
152 - 150 = 2
148 - 152 = -4
150 - 148 = 2
153 - 150 = 3
149 - 153 = -4
152 - 149 = 3
Thus the coded string of numbers is
150 2 -4 2 3 -4 3
When decoding this string of numbers, start with the first number and start
adding the deltas to get the remaining numbers:
150
150 + 2 = 152
152 - 4 = 148
148 + 2 = 150
150 + 3 = 153
153 - 4 = 149
149 + 3 = 152
Therefore, audio is often stored not as simple PCM but in a form that exploits
these differences.
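The worked example above translates directly into code. A minimal Python sketch of differential encoding and decoding (the byte values are the ones used in the text):

def delta_encode(values):
    # Keep the first value, then store each sample as the difference from the previous one.
    encoded = [values[0]]
    for previous, current in zip(values, values[1:]):
        encoded.append(current - previous)
    return encoded

def delta_decode(encoded):
    # Start with the first value and keep adding the stored differences.
    decoded = [encoded[0]]
    for difference in encoded[1:]:
        decoded.append(decoded[-1] + difference)
    return decoded

samples = [150, 152, 148, 150, 153, 149, 152]
differences = delta_encode(samples)
print(differences)                # [150, 2, -4, 2, 3, -4, 3]
print(delta_decode(differences))  # [150, 152, 148, 150, 153, 149, 152]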
Lossless Predictive Coding
Predictive coding is based on the principle that most source or analog signals
show significant correlation between successive samples so encoding uses
redundancy in sample values, which implies a lower bit rate. Therefore, in
predictive coding, we predict the current sample value from previous samples and
encode the difference between the actual value of the sample and the predicted
value. The difference between the actual and predicted samples is known as the
prediction error. Predictive coding thus consists of finding these differences
and transmitting them using a PCM system.
Assume that the input signal (or samples) is the set of integer values fn. Then we
predict the integer value f¯n as simply the previous value. Therefore
f¯n = fn-1
Let en be the prediction error (i.e. the difference between the actual and the
predicted signal):
en = fn - f¯n
We would like our error value en to be as small as possible. Therefore we would
wish our prediction f¯n to be as close as possible to the actual signal fn.
But for a particular sequence of signal values, a combination of more than one
previous value fn-1, fn-2, fn-3 and so on may provide a better prediction of fn.
For example
f¯n = 1/2 (fn-1 + fn-2)
Here f¯n is the mean of the previous two values. Such a predictor can be followed
by a truncating or rounding operation to produce an integer result.
Differential Pulse Code Modulation, DPCM
DPCM is exactly the same as predictive coding except that it incorporates a
quantizer step.
DPCM is a procedure of converting an analog into a digital signal in which an
analog signal is sampled and then the difference between the actual sample value
and its predicted value is quantized and then encoded forming a digital value.
DPCM codewords represent differences between samples, unlike PCM, where
codewords represent sample values.
DPCM algorithm predicts the next sample based on the previous samples and the
encoder stores only the difference between this prediction and the actual value.
When the prediction is reasonable, fewer bits are required to represent the same
information. This type of encoding reduces the number of bits required per
sample by about 25% compared to PCM.
Nomenclature for different signal values is as follows:
fn - Original Signal
f¯n - Predicted Signal
f´n - Quantized, Reconstructed Signal
en - Prediction Error
e´n - Quantized Error Value
The equations that describe DPCM are as follows:
f¯n = function of (f´n-1, f´n-2, ...)
en = fn - f¯n
e´n = Q[en]
transmit codeword(e´n)
reconstruct: f´n = f¯n + e´n
Note that the predictor is always based on the reconstructed, quantized version
of the signal, because the decoder has access only to the reconstructed values,
not to the original signal. If we tried to use the original signal fn in the
predictor in place of f´n, the quantization error would tend to accumulate and
could get worse rather than being centered on zero.
The figure shows the schematic diagram for the DPCM encoder (transmitter).
We observe the following points from this figure:
⚫ As shown in figure, the predictor makes use of the reconstructed, quantized
signal values f´n.
⚫ The quantizer can be uniform or non-uniform. For any signal, we select the
size of quantization steps so that they correspond to the range (maximum
and minimum) of the signal.
⚫ Codewords for quantized error value e´n are produced using entropy coding
(for example Huffman coding). Therefore, the box labeled symbol coder in
the figure simply means entropy coding.
⚫ The prediction value f¯n is based on previous values of f´n. Therefore we need
memory (a buffer) for the predictor.
⚫ The quantization noise (fn-f´n) is equal to the quantization effect on the
error term, en-e´n.
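A compact Python sketch of the DPCM loop described by the equations above (illustrative only; the uniform quantizer with step size 2 and the sample values are arbitrary choices, and a real coder would also entropy-code the quantized errors):

def quantize(error, step=2):
    # Uniform quantizer Q[e]: round the prediction error to the nearest multiple of the step size.
    return step * round(error / step)

def dpcm_encode(samples, step=2):
    # Predict each sample as the previous reconstructed value and transmit the quantized error.
    codewords = [samples[0]]                 # first sample sent as-is to start the loop
    reconstructed_prev = samples[0]
    for sample in samples[1:]:
        prediction = reconstructed_prev      # predictor uses the reconstructed value f'(n-1)
        quantized_error = quantize(sample - prediction, step)
        codewords.append(quantized_error)
        reconstructed_prev = prediction + quantized_error   # f'(n) = prediction + quantized error
    return codewords

def dpcm_decode(codewords, step=2):
    reconstructed = [codewords[0]]
    for quantized_error in codewords[1:]:
        reconstructed.append(reconstructed[-1] + quantized_error)
    return reconstructed

samples = [150, 152, 148, 150, 153, 149, 152]
codes = dpcm_encode(samples)
print(codes)                 # [150, 2, -4, 2, 4, -4, 2]
print(dpcm_decode(codes))    # [150, 152, 148, 150, 154, 150, 152]: close to, not identical to, the input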
ADPCM (Adaptive DPCM)
DPCM coder has two components, namely the quantizer and the predictor. In
Adaptive Differential Pulse Code Modulation (ADPCM), the quantizer and
predictor are adaptive. This means that they change to match the
characteristics of the speech (or audio) being coded.
ADPCM adapts the quantizer step size to suit the input. The step size can be
changed, along with the decision boundaries, by using a non-uniform quantizer.
There are two ways to do this:
Forward Adaptive Quantization - properties of the input signal are used.
Backward Adaptive Quantization - properties of the quantized output are used;
if the errors become too large, a non-uniform quantizer is chosen.
Choosing the predictor with forward or backward adaptation, so that the predictor
coefficients become adaptive, is also known as Adaptive Predictive Coding (APC).
The predictor is usually a linear function of previously reconstructed quantized
values f´n. The number of previous values used is called the order of the
predictor. For example, if we use X previous values, we need X coefficients ai,
i = 1, 2, ..., X in the predictor. Therefore
f¯n = a1 f´n-1 + a2 f´n-2 + ... + aX f´n-X
However, making the prediction coefficients adaptive, so that they multiply
previous quantized values, leads to a complicated set of equations that must be
solved for these coefficients. The figure shows a schematic diagram
for the ADPCM encoder and decoder.
Delta Modulation
Nyquist rate or frequency is defined as the minimum rate at which a
finite bandwidth signal needs to be sampled to retain all of the information. In
order to get a comparatively better sampling rate in the differential pulse code
modulation process, the signal’s sampling rate is maintained higher than the
Nyquist rate.
In the Differential Pulse Code Modulation (DPCM) process, when the sampling
interval is reduced, the sample-to-sample amplitude difference becomes small
enough to be represented with 1-bit quantization, and hence the step size will be
very small.
Delta modulation is a process mainly used in the transmission of voice
information. It is a technique where analog-to-digital and digital-to-analog signal
conversion are involved. Delta modulation (DM) is a simplified form of DPCM. In
this technique, the difference between consecutive signal samples is encoded; in
DM, the data to be transmitted is reduced to a 1-bit data stream.
Block Diagram of Delta Modulator
The sampling rate is comparatively very high in the delta modulation technique.
The value of the step size after quantization is smaller. In the delta modulation
process, the quantization design is easy and simple, and it gives the user the
option to design the bit rate.
The delta modulator includes a 1-bit quantizer as shown in the figure above and
a delay circuit along with two summer circuits. The output of the delta
modulator will be a stair-case approximated waveform. The step size of this
waveform is the delta (Δ). The output quality of the waveform is moderate. In
order to obtain a high signal-to-noise ratio, DM must adopt oversampling
techniques, in which the analog signal is sampled at a rate many times higher
than the Nyquist rate.
The bandwidth in bits/second needed to transmit a delta-modulated signal is equal
to the sampling frequency. We can find the bandwidth required to transmit the
modulated signal using the formula below:
Bandwidth required to transmit the modulated signal = fs samples/second × 1 bit/sample
= fs bits/second
where fs is the signal's sampling frequency.
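A simple Python sketch of a 1-bit delta modulator and demodulator (illustrative only; the step size of 0.1 and the sample values are arbitrary): each transmitted bit says whether the staircase approximation should step up or down.

def delta_modulate(samples, step=0.1):
    # Encode each sample as a single bit: 1 = step up, 0 = step down.
    approximation = 0.0
    bits = []
    for sample in samples:
        if sample >= approximation:
            bits.append(1)
            approximation += step
        else:
            bits.append(0)
            approximation -= step
    return bits

def delta_demodulate(bits, step=0.1):
    # Rebuild the staircase from the bit stream; a low pass filter would then smooth it.
    approximation = 0.0
    staircase = []
    for bit in bits:
        approximation += step if bit else -step
        staircase.append(round(approximation, 2))
    return staircase

samples = [0.05, 0.22, 0.38, 0.33, 0.12, -0.05]
bits = delta_modulate(samples)
print(bits)                    # [1, 1, 1, 1, 0, 0]
print(delta_demodulate(bits))  # [0.1, 0.2, 0.3, 0.4, 0.3, 0.2]: the staircase approximation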
Delta Demodulator
Refer to the below figure for the delta demodulator. The delta demodulator
includes a delay circuit, a low pass filter, and a summer connected as per the
image below. Since the predictor circuit is removed, there is no assumed input
given to the demodulator.
A low pass filter is included in the circuit to eliminate noise and suppress
out-of-band components. Granular noise, which is referred to as step-size error,
is reduced by this filtering. When there is no noise, the output of the modulator
is equal to the demodulator input.
Advantages of Delta Modulation
• Design is easy and simple.
• It is a 1-bit quantizer.
• Modulator & demodulator can be designed easily.
• In delta modulation, the quantization design is very simple.
• The bit rate can be chosen by the user.
Disadvantages of Delta Modulation
• When the value of the delta is small, slope overload distortion is seen,
which is a type of noise.
• When the value of delta is large, granular noise is seen, which is a type of
noise.
Adaptive Delta Modulation
An advanced form of delta modulation is adaptive delta modulation. As we know,
granular noise and slope overload distortion are the types of noise seen in delta
modulation. To eliminate these kinds of noise, an adaptive delta modulation
technique was developed. The slope error seen in basic delta modulation is
reduced in this process, and slope overload error and granular error are largely
removed. Quantization noise is removed in the demodulation process using an LPF
(low pass filter).
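One common way to adapt the step size can be sketched as follows (a simplified Python illustration, not the exact scheme used in any particular standard; the minimum step of 0.05 and growth factor of 1.5 are arbitrary): the step grows while the output bits keep repeating (a steep slope) and shrinks when they alternate (a flat region).

def adaptive_delta_modulate(samples, min_step=0.05, factor=1.5):
    # 1-bit coder whose step size grows on repeated bits and shrinks when the bits alternate.
    approximation = 0.0
    step = min_step
    previous_bit = None
    bits = []
    for sample in samples:
        bit = 1 if sample >= approximation else 0
        if previous_bit is None or bit == previous_bit:
            step *= factor                        # same direction: enlarge step to fight slope overload
        else:
            step = max(min_step, step / factor)   # direction changed: shrink step to reduce granular noise
        approximation += step if bit else -step
        bits.append(bit)
        previous_bit = bit
    return bits

samples = [0.0, 0.3, 0.8, 1.2, 1.3, 1.3, 1.25]   # a quickly rising, then flattening input
print(adaptive_delta_modulate(samples))          # [1, 1, 1, 1, 1, 1, 0]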
Advantages of Adaptive Delta Modulation
• Adaptive delta modulation offers extremely high performance.
• This technique decreases the need for correction circuits in radio design
and error detection.
• Dynamic range is high since the variable step size covers a large range of
values.
• Slope overload error and granular error are not seen.
• Slope error is reduced.