Minimum and maximum sound quality. How audio is encoded


The main parameters affecting the quality of digital sound recording are:

§ Bit depth of the ADC and DAC

§ Sampling frequency of the ADC and DAC

§ Jitter of the ADC and DAC

§ Resampling

Also, the parameters of the analog path of digital sound recording and sound reproduction devices remain important:

§ Signal to noise ratio

§ Total harmonic distortion

§ Intermodulation distortion

§ Unevenness of the amplitude-frequency response

§ Crosstalk between channels

§ Dynamic range

Digital Sound Recording Technique

Digital sound is currently recorded in recording studios using personal computers together with other expensive, high-quality equipment. The "home studio" is also widespread: professional and semi-professional recording equipment is used to create high-quality recordings at home.

Computer sound cards are used, whose ADCs and DACs most often work at 24 bits and 96 kHz; a further increase in bit depth and sampling rate brings practically no improvement in recording quality.

There is a whole class of computer programs - sound editors that allow you to work with sound:

§ record incoming audio stream

§ create (generate) sound

§ modify an existing recording (add samples, change timbre, sound speed, cut parts, etc.)

§ rewrite from one format to another

§ convert between different audio codecs

Some simple programs only allow converting between formats and codecs.

Varieties of digital audio formats

There are various concepts of audio format.

The digital representation of audio data depends on the quantization method used by the analog-to-digital converter (ADC). In sound engineering, two types of quantization are currently most common:

§ pulse code modulation

§ sigma-delta modulation

Quantization bit depth and sampling rate are often specified for various audio recording and playback devices as a digital audio representation format (24 bit / 192 kHz; 16 bit / 48 kHz).

The file format determines the structure and presentation of audio data when stored on a PC storage device. To eliminate the redundancy of audio data, audio codecs are used, with the help of which audio data is compressed. There are three groups of audio file formats:

§ uncompressed audio formats such as WAV, AIFF

§ lossless compressed audio formats (APE, FLAC)

§ audio formats with lossy compression (mp3, ogg)

Module (tracker) music file formats stand apart. Their music is created synthetically or from samples of pre-recorded live instruments, and they are mainly used to create modern electronic music (MOD). The MIDI format can also be placed here: it is not a sound recording, but with a sequencer it allows music to be recorded and played back as a set of commands.

Digital audio media formats are used both for mass distribution of sound recordings (CD, SACD) and in professional sound recording (DAT, minidisc).

For surround sound systems, sound formats can also be distinguished, which are mainly multichannel sound accompaniments to movies. These systems have entire format families from two large competing companies, Digital Theater Systems Inc. - DTS and Dolby Laboratories Inc. - Dolby Digital.

The number of channels in multichannel sound systems (5.1; 7.1) is also called a format. Initially such a system was developed for cinemas, but its use was later extended beyond them.

Software audio codecs used for speech and IP telephony:

§ G.723.1 - one of the basic codecs for IP telephony applications

§ G.729 is a proprietary narrowband codec that is used to digitally represent speech

§ Internet Low Bitrate Codec (iLBC) - a popular free codec for IP telephony (in particular, for Skype and Google Talk)

An audio codec (audio encoder/decoder) is a computer program or hardware device designed to encode or decode audio data.

Software codec

At the software level, an audio codec is a specialized computer program that compresses or decompresses digital audio data according to an audio file format or a streaming audio format. As a compressor, an audio codec must deliver an audio signal of a given quality (fidelity) in the smallest possible size. Compression reduces the space required to store audio data and can also reduce the bandwidth of the channel over which the data is transmitted. Most audio codecs are implemented as software libraries that interact with one or more audio players such as QuickTime Player, XMMS, Winamp, VLC media player, MPlayer or Windows Media Player.

Popular software audio codecs by application:

§ MPEG-1 Layer III (MP3) is a proprietary codec for audio recordings (music, audiobooks, etc.) for computer technology and digital players

§ Ogg Vorbis (OGG) - the second most popular format, widely used in computer games and in file-sharing networks for transferring music

§ GSM-FR is the first digital speech coding standard used in GSM phones

§ Adaptive Multi-Rate (AMR) - recording of the human voice in mobile phones and other mobile devices

The human ear perceives sound at a frequency of 20 vibrations per second (low sound) to 20,000 vibrations per second (high sound).

A person can perceive sound over a huge range of intensities, in which the maximum intensity is $10^{14}$ times greater than the minimum (one hundred thousand billion times). A special unit, the decibel (dB), is used to measure the loudness of sound (Table 5.1). A decrease or increase in sound volume by 10 dB corresponds to a decrease or increase in sound intensity by a factor of 10.
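The link between decibels and intensity ratios stated above can be checked with a short calculation. This is only a minimal sketch: the function names `intensity_ratio_to_db` and `db_to_intensity_ratio` are illustrative and do not come from the text.

```python
import math


def intensity_ratio_to_db(ratio: float) -> float:
    """Convert a ratio of two sound intensities to a difference in decibels."""
    return 10 * math.log10(ratio)


def db_to_intensity_ratio(db: float) -> float:
    """Convert a difference in decibels to a ratio of sound intensities."""
    return 10 ** (db / 10)


# A change of 10 dB corresponds to a tenfold change in intensity.
print(db_to_intensity_ratio(10))

# The full range of human hearing (a ratio of 10**14) spans about 140 dB.
print(intensity_ratio_to_db(1e14))
```

Each further step of 10 dB multiplies the intensity by another factor of 10, which is why a ratio of $10^{14}$ fits into a scale of only about 140 dB.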

Time sampling of sound. In order for a computer to process sound, a continuous audio signal must be converted to digital discrete form using time sampling. A continuous sound wave is divided into separate small time sections, for each such section a certain value of the sound intensity is set.

Thus, the continuous dependence of the sound loudness on time A (t) is replaced by a discrete sequence of loudness levels. On the graph, this looks like replacing a smooth curve with a sequence of "steps" (Fig. 1.2).

Fig. 1.2. Time sampling of audio

Sampling frequency. A microphone connected to the sound card is used to record analog sound and convert it to digital form. The quality of the resulting digital sound depends on the number of measurements of the sound volume level per unit of time, i.e. on the sampling rate. The more measurements are made in 1 second (the higher the sampling rate), the more accurately the "ladder" of the digital sound signal follows the curve of the analog signal.

Audio sampling rate is the number of sound volume measurements in one second.
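Time sampling can be illustrated with a small sketch that measures a continuous signal at evenly spaced moments. The 440 Hz test tone and the names `sample_signal` and `tone` are assumptions made for illustration only.

```python
import math


def sample_signal(signal, duration_s, sampling_rate_hz):
    """Measure a continuous signal (a function of time) at evenly spaced moments."""
    n_samples = round(duration_s * sampling_rate_hz)
    return [signal(i / sampling_rate_hz) for i in range(n_samples)]


def tone(t):
    # Hypothetical test signal: a 440 Hz sine wave with amplitude 1.
    return math.sin(2 * math.pi * 440 * t)


# The same 10 ms of sound measured at two sampling rates.
coarse = sample_signal(tone, 0.01, 8000)   # 8000 measurements per second
fine = sample_signal(tone, 0.01, 48000)    # 48,000 measurements per second

print(len(coarse), len(fine))
```

The higher sampling rate yields six times as many measurements for the same stretch of sound, so the "ladder" has narrower steps.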

The audio sampling rate can range from 8000 to 48000 sound volume measurements per second.

Audio coding depth. Each "step" is assigned a specific value of the sound volume level. Sound volume levels can be thought of as a set of N possible states, and encoding them requires a certain amount of information I, which is called the audio coding depth.

Audio coding depth is the amount of information needed to encode the discrete volume levels of digital audio.

If the coding depth is known, the number of digital sound volume levels can be calculated using the formula $N = 2^{I}$. Let the sound coding depth be 16 bits; then the number of sound volume levels is:

$N = 2^{I} = 2^{16} = 65\,536$.

During encoding, each sound volume level is assigned its own 16-bit binary code: the lowest level corresponds to the code 0000000000000000 and the highest to 1111111111111111.
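The formula and the binary codes above can be reproduced in a few lines. This is a simplified sketch: it assumes the volume level is given as a number between 0.0 and 1.0 and uses unsigned codes as in the text (real 16-bit audio files usually store signed values). The names `levels`, `quantize` and `to_binary_code` are illustrative.

```python
def levels(bit_depth: int) -> int:
    """Number of distinct volume levels N = 2**I for a coding depth of I bits."""
    return 2 ** bit_depth


def quantize(volume: float, bit_depth: int) -> int:
    """Map a volume in the range 0.0..1.0 to the nearest discrete level."""
    clipped = min(max(volume, 0.0), 1.0)
    return round(clipped * (levels(bit_depth) - 1))


def to_binary_code(level: int, bit_depth: int) -> str:
    """Write a level number as a binary code of exactly bit_depth digits."""
    return format(level, f"0{bit_depth}b")


print(levels(16))                               # 65536
print(to_binary_code(quantize(0.0, 16), 16))    # 0000000000000000
print(to_binary_code(quantize(1.0, 16), 16))    # 1111111111111111
```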

The quality of the digitized sound. The higher the sampling rate and coding depth, the better the digitized sound will be. The lowest quality, corresponding to the quality of a telephone connection, is obtained at a sampling rate of 8000 times per second, a coding depth of 8 bits and a single audio track ("mono" mode). The highest quality considered here, corresponding to audio CD quality, is achieved at a sampling rate of 48,000 times per second, a coding depth of 16 bits and two audio tracks ("stereo" mode). (An audio CD itself is sampled 44,100 times per second.)

It should be remembered that the higher the digital sound quality, the larger the information volume of the audio file. It is possible to estimate the information volume of a digital stereo sound file with a duration of 1 second with an average sound quality (16 bits, 24,000 measurements per second). To do this, the coding depth must be multiplied by the number of measurements in 1 second and multiplied by 2 (stereo sound):

16 bits × 24,000 × 2 = 768,000 bits = 96,000 bytes = 93.75 KB.
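The same estimate can be written as a small function so that other combinations of parameters are easy to try. This is a sketch of the calculation above; the function name `audio_size_bits` is illustrative.

```python
def audio_size_bits(bit_depth, sampling_rate_hz, channels, duration_s):
    """Information volume of uncompressed digital audio, in bits."""
    return bit_depth * sampling_rate_hz * channels * duration_s


# 1 second of stereo sound, 16 bits, 24,000 measurements per second.
bits = audio_size_bits(16, 24_000, 2, 1)
size_bytes = bits // 8
size_kb = size_bytes / 1024

print(bits, size_bytes, size_kb)   # 768000 96000 93.75
```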

Sound editors. Sound editors allow you to not only record and play sound, but also edit it. The digitized sound is presented in sound editors in a visual form, so the operations of copying, moving and deleting parts of the audio track can be easily performed using the mouse. In addition, you can superimpose audio tracks on top of each other (mix sounds) and apply various acoustic effects (echo, reverse playback, etc.).

Sound editors allow you to change the quality of digital sound and the size of an audio file by changing the sampling rate and coding depth. Digitized audio can be saved without compression in the universal WAV format or with compression in the MP3 format.
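Saving uncompressed sound in the WAV format can be sketched with Python's standard `wave` module. The example below is only an illustration: it generates a hypothetical 440 Hz mono tone, writes it to an in-memory buffer instead of a file on disk, and reads the parameters back. The function name `make_wav_bytes` and the chosen parameters are assumptions.

```python
import io
import math
import struct
import wave


def make_wav_bytes(freq_hz=440.0, duration_s=0.5, sampling_rate_hz=8000):
    """Build a mono, 16-bit WAV recording of a sine tone and return it as bytes."""
    n_samples = round(duration_s * sampling_rate_hz)
    frames = bytearray()
    for i in range(n_samples):
        value = math.sin(2 * math.pi * freq_hz * i / sampling_rate_hz)
        # Store each measurement as a signed 16-bit little-endian integer.
        frames += struct.pack("<h", round(value * 32767))

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)                  # mono
        wav.setsampwidth(2)                  # 2 bytes = 16-bit coding depth
        wav.setframerate(sampling_rate_hz)   # sampling rate
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


data = make_wav_bytes()
with wave.open(io.BytesIO(data), "rb") as wav:
    print(wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate(), wav.getnframes())
```

Changing `sampling_rate_hz` or the sample width changes the size of the resulting data in direct proportion, which is what a sound editor does when it lowers the quality of a recording.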

When audio is stored in compressed formats, low-intensity frequencies that coincide in time with high-intensity frequencies, and are therefore "redundant" for human perception, are discarded. Such a format allows sound files to be compressed dozens of times, but it leads to an irreversible loss of information (files cannot be restored to their original form).

Control questions

1. How do the sampling rate and coding depth affect the quality of digital audio?

Self-study assignments

1.22. Multiple-choice assignment. The sound card performs binary encoding of the analog audio signal. How much information is needed to encode each of the 65,536 possible signal strength levels?
1) 16 bits; 2) 256 bits; 3) 1 bit; 4) 8 bits.

1.23. An assignment with a detailed answer. Estimate the information volume of digital audio files with a duration of 10 seconds at a coding depth and sampling rate of an audio signal that provide the minimum and maximum sound quality:
a) mono, 8 bits, 8000 measurements per second;
b) stereo, 16 bits, 48,000 measurements per second.

1.24. An assignment with a detailed answer. Determine the length of a sound file that will fit on a 3.5" floppy disk (note that 2847 sectors of 512 bytes each are allocated for storing data on such a floppy disk):
a) with low sound quality: mono, 8 bits, 8000 measurements per second;
b) with high sound quality: stereo, 16 bits, 48,000 measurements per second.
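One possible way to approach assignment 1.24 is to divide the available capacity by the number of bytes that one second of sound occupies. The sketch below assumes uncompressed sound with no file header; the function name `max_duration_s` is illustrative.

```python
def max_duration_s(capacity_bytes, bit_depth, sampling_rate_hz, channels):
    """Longest uncompressed recording, in seconds, that fits into the capacity."""
    bytes_per_second = bit_depth * sampling_rate_hz * channels / 8
    return capacity_bytes / bytes_per_second


floppy_bytes = 2847 * 512   # data area of the floppy disk, in bytes

low_quality = max_duration_s(floppy_bytes, 8, 8000, 1)      # mono, 8 bits
high_quality = max_duration_s(floppy_bytes, 16, 48000, 2)   # stereo, 16 bits

print(round(low_quality, 1), round(high_quality, 1))   # about 182.2 and 7.6 seconds
```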

Goal. Understand the process of converting audio information and master the concepts needed to calculate the volume of audio information. Learn to solve problems on the topic.

Motivation. Preparation for the exam.

Lesson plan

1. Viewing the presentation on the topic with the teacher's comments (Appendix 1)

Presentation material: Audio coding.

Since the early 1990s, personal computers have been able to work with sound information. Any computer with a sound card, microphone, and speakers can record, save, and play back audio information.

The process of converting sound waves into binary code in the computer memory:

The process of reproduction of sound information stored in the computer memory:

Sound is a wave with continuously varying amplitude and frequency. The larger the amplitude, the louder the sound is for a person; the higher the frequency of the signal, the higher the tone. Computer software now allows a continuous audio signal to be converted into a sequence of electrical impulses that can be represented in binary form. In the process of encoding, the continuous audio signal undergoes time sampling: the continuous sound wave is divided into separate small time sections, and a certain amplitude value is set for each such section.

Thus, the continuous dependence of the signal amplitude on time A(t) is replaced by a discrete sequence of loudness levels. On the graph, this looks like replacing a smooth curve with a sequence of "steps". Each "step" is assigned a value of the sound volume level, its code (1, 2, 3, and so on). Sound loudness levels can be considered as a set of possible states; accordingly, the more loudness levels are distinguished during encoding, the more information the value of each level carries and the higher the quality of the sound.

An audio adapter (sound card) is a special device connected to a computer that converts electrical oscillations of audio frequency into a numerical binary code when sound is input, and performs the reverse conversion (from a numerical code to electrical oscillations) when sound is played.

During recording, the audio adapter measures the amplitude of the electric current at a fixed period and enters the binary code of the obtained value into a register. The resulting code is then copied from the register into the computer's RAM. The quality of computer sound is determined by the characteristics of the audio adapter:

Sampling rate
Bit depth (sound depth).

Time sampling rate

This is the number of measurements of the input signal in 1 second. Frequency is measured in hertz (Hz). One measurement in one second corresponds to a frequency of 1 Hz. 1000 measurements in 1 second - 1 kilohertz (kHz). Typical sampling rates of audio adapters:

11 kHz, 22 kHz, 44.1 kHz, etc.

The bit depth (sound depth) is the number of bits in the audio adapter register; it specifies the number of possible sound levels.

The bit depth determines the accuracy with which the input signal is measured. The larger the bit depth, the smaller the error of each individual conversion of the magnitude of the electrical signal into a number and back. If the bit depth is 8 (16), then measuring the input signal can yield $2^{8} = 256$ ($2^{16} = 65\,536$) different values. Obviously, a 16-bit audio adapter encodes and reproduces sound more accurately than an 8-bit one. Modern sound cards provide a 16-bit audio coding depth. The number of different signal levels (states for a given coding) can be calculated using the formula:

$N = 2^{I} = 2^{16} = 65\,536$, where I is the sound depth.

Thus, modern sound cards can encode 65,536 signal levels. Each value of the amplitude of the audio signal is assigned a 16-bit code. When a continuous audio signal is binary encoded, it is replaced by a sequence of discrete signal levels. The coding quality depends on the number of measurements of the signal level per unit of time, that is, on the sampling rate. The more measurements are made in 1 second (the higher the sampling rate), the more accurate the binary coding procedure.

Sound file - a file that stores audio information in numerical binary form.

2. Reviewing the units of measurement of information

1 byte = 8 bits

1 KB = $2^{10}$ bytes = 1024 bytes

1 MB = $2^{10}$ KB = 1024 KB

1 GB = $2^{10}$ MB = 1024 MB

1 TB = $2^{10}$ GB = 1024 GB

1 PB = $2^{10}$ TB = 1024 TB
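The units above can be applied mechanically: divide by 1024 until the number becomes smaller than 1024. The helper below is a minimal sketch; the name `human_size` is illustrative.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB"]


def human_size(n_bytes):
    """Express a size in bytes using the largest unit that keeps the value below 1024."""
    value = float(n_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024   # move on to the next, 1024 times larger unit


print(human_size(96_000))      # 93.75 KB
print(human_size(1024 ** 2))   # 1.00 MB
```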

3. Consolidating the studied material using the presentation and the textbook

4. Solving problems

Textbook, with the solution shown in the presentation.

Problem 1. Determine the information volume of a stereo audio file with a duration of 1 second with high sound quality (16 bits, 48 kHz).

Problem (on your own). Textbook, with the solution shown in the presentation.
Determine the information volume of a digital audio file with a duration of 10 seconds at a sampling rate of 22.05 kHz and a resolution of 8 bits.

5. Consolidation. Solving problems at home and independently in the next lesson

Determine the amount of memory for storing a digital audio file, the playing time of which is two minutes at a sampling rate of 44.1 kHz and a resolution of 16 bits.

The user has a memory of 2.6 MB at his disposal. You need to record a 1 minute digital audio file. What should be the sampling rate and bit depth?

Free disk space is 5.25 MB and the sound card bit depth is 16. What is the duration of a digital audio file recorded at a sampling rate of 22.05 kHz?

One minute of a digital audio recording takes 1.3 MB on disk, and the sound card bit depth is 8. At what sampling rate was the sound recorded?

How much storage space is required to store a high-quality digital audio file with a playing time of 3 minutes?

A digital audio file contains a low-quality recording (the sound is dull and muffled). How long does the file play if its size is 650 KB?

Two minutes of a digital audio recording take 5.05 MB on disk. The sampling rate is 22,050 Hz. What is the bit depth of the audio adapter?

Free disk space is 0.1 GB and the sound card bit depth is 16. What is the duration of a digital audio file recorded at a sampling rate of 44,100 Hz?

Answers

No. 92. 124.8 seconds.

No. 93. 22.05 kHz.

No. 94. High sound quality is achieved at a sampling rate of 44.1 kHz and an audio adapter bit depth of 16. The required memory capacity is 15.1 MB.

No. 95. A dull and muffled sound is characterized by the following parameters: sampling rate 11 kHz, audio adapter bit depth 8. The duration of the sound is 60.5 s.

No. 96. 16 bits.

No. 97. 20.3 minutes.
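Several of these answers can be checked by rearranging the same relation: size in bits = sampling rate × bit depth × number of channels × duration. The sketch below assumes mono recordings (one channel), which is the assumption that reproduces the answers above; the function names are illustrative.

```python
MB = 1024 * 1024   # bytes in one megabyte


def solve_duration(size_bytes, rate_hz, depth_bits, channels=1):
    """Duration in seconds of a recording of the given size."""
    return size_bytes * 8 / (rate_hz * depth_bits * channels)


def solve_sampling_rate(size_bytes, seconds, depth_bits, channels=1):
    """Sampling rate in Hz that produces the given size."""
    return size_bytes * 8 / (seconds * depth_bits * channels)


def solve_bit_depth(size_bytes, seconds, rate_hz, channels=1):
    """Bit depth that produces the given size."""
    return size_bytes * 8 / (seconds * rate_hz * channels)


print(round(solve_duration(5.25 * MB, 22_050, 16), 1))               # about 124.8 s
print(round(solve_sampling_rate(1.3 * MB, 60, 8)))                   # close to 22,050 Hz
print(round(solve_bit_depth(5.05 * MB, 120, 22_050)))                # 16 bits
print(round(solve_duration(0.1 * 1024 * MB, 44_100, 16) / 60, 1))    # about 20.3 minutes
```

The computed sampling rate comes out slightly above 22.05 kHz because the file size in the problem is rounded; the answer is the nearest standard rate.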

Literature

1. I. G. Semakin, E. K. Henner (eds.). Computer science: problem book and practical work, volume 1.

2. Elena Aleksandrovna Supryagina, computer science teacher. Sound. Binary coding of audio information. Festival of pedagogical ideas "Open lesson".

3. N. Ugrinovich. Informatics and information technology, grades 10-11. Moscow: BINOM. Knowledge Lab, 2003.

There are three main types of digital audio formats:

uncompressed formats: no compression;
lossy formats: lossy compression;
lossless formats: lossless compression.

Lossy compression is a technology in which the encoded file is significantly smaller than the original because information that human hearing does not perceive is removed.

The disadvantage of this technology is that the compressed file will never be identical to the original.

List of the most common lossy formats:

AAC (.m4a, .mp4, .m4p, .aac) - Advanced Audio Coding (often in an MPEG-4 container)
MP2 (MPEG Layer 2)
MP3 (MPEG Layer 3)
MPC (known as Musepack, formerly named MPEGplus or MP+)
Ogg Vorbis
WMA (Windows Media Audio)

Format	Quantization, bits	Sampling frequency, kHz	Bit rate, kbit/s	Compression ratio
DTS	20-24	48; 96	up to 1536	~3:1, lossy
MP3	floating	up to 48	up to 320	11:1, lossy
AAC	floating	up to 96	up to 529	lossy
Ogg Vorbis	up to 32	up to 192	up to 1000	lossy
WMA	up to 24	up to 96	up to 768	2:1; a lossless version exists

Lossless compressed audio formats include:

FLAC (Free Lossless Audio Codec)
APE (Monkey's Audio)
WV (WavPack)

These formats make it possible to convert CDs to digital files without loss of quality. For example, you can take a CD, convert it to WAV, then convert the WAV to FLAC, then convert the FLAC back to WAV and burn it to a blank CD, and you will have an absolutely identical copy of the source.
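The idea of a lossless round trip can be demonstrated in miniature. The sketch below does not use FLAC: as a stand-in it compresses raw 16-bit samples with the general-purpose `zlib` compressor from Python's standard library, then decompresses them and compares the result with the original. The 441 Hz test tone is a hypothetical signal chosen for illustration.

```python
import math
import struct
import zlib

# One second of a hypothetical 441 Hz tone: 16-bit mono samples at 44,100 Hz.
samples = [round(32767 * math.sin(2 * math.pi * 441 * i / 44_100))
           for i in range(44_100)]
pcm = struct.pack(f"<{len(samples)}h", *samples)   # raw uncompressed data

compressed = zlib.compress(pcm, 9)       # lossless compression (stand-in for FLAC)
restored = zlib.decompress(compressed)   # back to raw data

print(len(pcm), len(compressed))
print(restored == pcm)   # True: every byte of the original is recovered
```

A dedicated audio codec such as FLAC uses a different algorithm tuned for sound, but the defining property is the same: the decoded data is bit-for-bit identical to the source.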

In which format does music sound best?

The most popular is the lossless FLAC format, and one of the most commonly used programs for converting CDs to FLAC is EAC (Exact Audio Copy).

Of all the parameters of digital audio, attention should be paid first of all to the following:

sampling rate (how finely the analog signal is sampled in time),
bit rate (the amount of information contained in the file per second of audio).

The sampling rate is the number of times per second that the analog signal is measured. The most common sampling rate in quality audio formats is 44.1 kHz.

It is generally accepted that a high bit rate guarantees the best quality. This is true, but only if the source file is of good quality. A high-quality MP3 should have a bit rate of 320 kbps, while a high-quality FLAC file usually has a bit rate of 900 kbps or higher.
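To put these bit rates in context, they can be compared with the bit rate of uncompressed sound, which follows directly from the sampling rate, coding depth and number of channels. The sketch below assumes CD-style parameters (44.1 kHz, 16 bits, stereo) and takes 1 kbps as 1000 bits per second; the function names are illustrative.

```python
def pcm_bitrate_kbps(sampling_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed audio in kilobits per second."""
    return sampling_rate_hz * bit_depth * channels / 1000


def size_mb(bitrate_kbps, duration_s):
    """Approximate file size in megabytes for a given bit rate and duration."""
    return bitrate_kbps * 1000 * duration_s / 8 / (1024 * 1024)


uncompressed = pcm_bitrate_kbps(44_100, 16, 2)

print(uncompressed)                    # about 1411.2 kbps
print(round(uncompressed / 320, 2))    # MP3 at 320 kbps: about 4.41 times smaller
print(round(uncompressed / 128, 2))    # MP3 at 128 kbps: about 11 times smaller
print(round(size_mb(320, 180), 1))     # a 3-minute MP3 at 320 kbps, in MB
```

A FLAC bit rate of around 900 kbps therefore corresponds to roughly two thirds of the uncompressed stream, with nothing discarded.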

Which music format has the best quality

Besides the audio format itself, high-quality sound also requires high-quality playback equipment: speakers, amplifiers, headphones. In other words, with desktop PC speakers and budget headphones you will not be able to fully enjoy high-quality sound and unleash the full potential of lossless formats.

Without going deep into technical details, the following formats can be advised:

For home listening I recommend FLAC, which in my opinion is the best format. For a portable audio player, MP3 with a bit rate of 320 kbps is a good choice. Personally, I use only FLAC on all devices, since the capacity of microSD cards is enough to store a sufficient amount of data in the player.

As for the equipment for high-quality music reproduction, I advise you to pay attention to the following brands:

If budget acoustics do not suit you and you are a fan of high-quality (Hi-Fi or Hi-End) equipment, then everything is in your hands and limited only by your budget; I will not give any recommendations here.

Dependence of the loudness, as well as the pitch of the sound on the intensity and frequency of the sound wave

Hertz (denoted Hz) is a unit of measurement of the frequency of periodic processes (for example, oscillations).
1 Hz means one occurrence of such a process in one second: 1 Hz = 1/s.

If we have 10 Hz, this means ten occurrences of such a process in one second.

The human ear can perceive sound at frequencies from 20 vibrations per second (20 Hz, low sound) to 20,000 vibrations per second (20 kHz, high sound).

In addition, a person can perceive sound over a wide range of intensities, in which the maximum intensity is $10^{14}$ times greater than the minimum (one hundred thousand billion times).

A special unit, the decibel (dB), was introduced to measure the loudness of sound.

A decrease or increase in sound volume by 10 dB corresponds to a decrease or increase in sound intensity by a factor of 10.

Sound volume in decibels

For computer systems to process sound, the continuous audio signal must be converted to digital, discrete form using time sampling.

To do this, the continuous sound wave is divided into separate small time sections, and a certain value of the sound intensity is set for each such section.

Time sampling of audio

A microphone connected to the sound card is used to record analog audio and convert it to digital form.

The denser the discrete stripes are located on the graph, the better it will ultimately be possible to recreate the original sound.

The quality of the obtained digital sound depends on the number of measurements of the sound volume level per unit of time, i.e. the sampling rate.

Audio sampling rate is the number of sound volume measurements in one second.

The more measurements are made in one second (the higher the sampling rate), the more accurately the "ladder" of the digital audio signal repeats the analog signal curve.

Each "step" on the graph is assigned a certain value of the sound volume level. Sound volume levels can be thought of as a set of N possible states (gradations), and encoding them requires a certain amount of information I, which is called the audio coding depth.

Audio coding depth is the amount of information required to encode the discrete loudness levels of digital audio.

If the coding depth is known, the number of digital sound loudness levels can be calculated by the general formula $N = 2^{I}$.

For example, if the audio coding depth is 16 bits, then the number of audio volume levels is:

$N = 2^{I} = 2^{16} = 65\,536$.

During encoding, each sound volume level is assigned its own 16-bit binary code: the lowest level corresponds to the code 0000000000000000 and the highest to 1111111111111111.

Digitized sound quality

So, the higher the sampling rate and the coding depth, the better the digitized sound will be and the closer it can be brought to the original sound.

The highest quality of digitized audio considered here, corresponding to audio CD quality, is achieved at a sampling rate of 48,000 times per second, a coding depth of 16 bits and two audio tracks ("stereo" mode). (An audio CD itself is sampled 44,100 times per second.)

It must be remembered that the higher the digital sound quality, the larger the information volume of the audio file.

You can easily estimate the information volume of a digital stereo sound file with a duration of 1 second with an average sound quality (16 bits, 24,000 measurements per second). To do this, the coding depth must be multiplied by the number of measurements per second and multiplied by 2 channels (stereo sound):

16 bits × 24,000 × 2 = 768,000 bits = 96,000 bytes = 93.75 KB.

Sound editors

Sound editors allow you not only to record and play sound, but also to edit it. Among the best known are Sony Sound Forge, Adobe Audition, GoldWave and others.

The digitized sound is presented in sound editors in a clear visual form, so the operations of copying, moving and deleting parts of the audio track can be easily performed using a computer mouse.

In addition, you can overlay audio tracks on top of each other (mix sounds) and apply various acoustic effects (echo, reverse playback, etc.).

When sound is stored in compressed formats, low-intensity frequencies that coincide in time with high-intensity frequencies, and are therefore inaudible ("redundant") for human perception, are discarded. Such a format allows audio files to be compressed dozens of times, but it leads to an irreversible loss of information (files cannot be restored to their original form).