Monday, March 1, 2004
System Design: What to Listen for in Digital Audio
We live in an analog universe, where every sound and color has infinite complexity, depth and resolution. Every sense–hearing, taste, smell, etc.–is analog in nature. As humans, we don't have the ability to directly process digital data.
When we digitize these "real" signals in hardware, we should understand that a digital replica is always less accurate than the analog original.
The analog-to-digital conversion of live audio has both resolution limits (bits of resolution, or granularity) and sampling limits (samples per second). Also, to avoid "aliasing," or false conversion, some low-pass filtering at or below the Nyquist limit must occur before conversion begins. In short, the digital audio signal is a time-sampled, filtered, finite-resolution conversion of the original audio stream and contains considerably less data than the original.
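The resolution limit described above can be illustrated with a minimal sketch. It assumes samples normalized to the range -1.0 to 1.0 and a signed converter; the function names `quantize` and `worst_case_error` are hypothetical and chosen only for illustration.

```python
def quantize(sample: float, bits: int) -> float:
    """Round a sample in [-1.0, 1.0] to the nearest level of a signed `bits`-bit converter."""
    levels = 2 ** (bits - 1)                    # e.g. 32768 for 16 bits
    code = round(sample * levels)               # nearest integer code
    code = max(-levels, min(levels - 1, code))  # clamp to the representable range
    return code / levels


def worst_case_error(bits: int) -> float:
    """Half of one quantization step: the largest rounding error for samples that do not clip."""
    return 0.5 / (2 ** (bits - 1))
```

Each added bit halves the worst-case rounding error, but the error never reaches zero, which is the sense in which the digital replica is always coarser than the original.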
This concept becomes important when recreating an analog signal to be heard and understood by humans–who have only analog ears. Issues involving digital compact disc (CD) technology–which depends on high-quality analog-to-digital and digital-to-analog conversions–are similar to those found in cockpit digital audio systems.
Typical CD-quality digital resolution is 16 bits or higher, at the 44.1-KHz-or-greater sampling rate required to convert a 20-KHz audio signal with adequate fidelity–a massive amount of streaming data. Many newer consumer and professional audio systems use much higher sampling rates and conversion resolution to recreate the original audio even better. For the technically curious, a detailed explanation of the complex bit coding scheme used in audio CDs can be viewed at www.ee.washington.edu/conselec/CE/kuhn/cdaudio2/95x7.htm.
Essentially, the CD concept is to deliver twin (stereo) 16-bit packets of serial data (which contain the instantaneous audio value) at a 44,100-packet-per-second rate to recreate the original audio. This is in addition to considerable error correction and parity coding, to help correct for defects in the data stream. This is a lot of data: a single audio channel requires a serial bit stream of at least 705,600 bits per second, and the actual rate is much higher once error correction data is included.
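The raw data rate quoted above follows directly from the sample size and sampling rate. A short sketch of the arithmetic, using a hypothetical helper named `pcm_bit_rate` and ignoring error correction overhead entirely:

```python
def pcm_bit_rate(bits_per_sample: int, sample_rate_hz: int, channels: int = 1) -> int:
    """Raw (uncoded) bit rate of a PCM audio stream, in bits per second."""
    return bits_per_sample * sample_rate_hz * channels


mono = pcm_bit_rate(16, 44_100)        # one channel: 705,600 bits per second
stereo = pcm_bit_rate(16, 44_100, 2)   # both channels: 1,411,200 bits per second
```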
Interestingly, CDs helped bring about a revival of old analog vinyl records, as more dedicated audiophiles found the CD audio quality lacking in many recorded pieces.
For conversion at 44.1 KHz, Nyquist sampling theory implies a maximum audio frequency of about 20 KHz, or just under half the sampling rate. In addition, a steeply sloped low-pass filter (a "brick wall" filter) must be used ahead of the converter to block any higher frequencies, or they will be "mis-sampled" and result in distortion and unwanted interference. This mis-sampling effect is called aliasing, a common problem in many digital oscilloscopes.
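To see where a mis-sampled tone ends up, the folding behavior can be sketched as below. This assumes an ideal sampler fed a pure tone with no anti-alias filter in front of it; `alias_frequency` is a hypothetical name used only for this illustration.

```python
def alias_frequency(f_in_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after ideal sampling with no anti-alias filter."""
    folded = f_in_hz % sample_rate_hz              # position within one sampling interval
    return min(folded, sample_rate_hz - folded)    # reflect about half the sampling rate
```

For example, a 30-KHz tone sampled at 44.1 KHz reappears at 14.1 KHz, well inside the audible band, while a 1-KHz tone passes through unchanged.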
If set too low, the low-pass filter has a significant impact on the high-frequency content common in human speech. It can result in audio that is almost unintelligible, even when the audio passband should be adequate. For designers of airborne systems to the DO-214 standard, the minimum 6-KHz audio passband requires a sampling rate of at least about 15 Ksamples per second. But this may not be adequate; it is only a minimum threshold to consider in a design.
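One way to arrive at a figure near 15 Ksamples per second is to take the Nyquist rate for the passband and add headroom for a realizable filter. In the sketch below, the 1.25 guard factor is purely an assumption chosen to reproduce that figure; it is not taken from DO-214, and `min_sample_rate` is a hypothetical name.

```python
def min_sample_rate(passband_hz: float, guard_factor: float = 1.25) -> float:
    """Nyquist rate for the passband, scaled by an assumed margin for filter roll-off."""
    nyquist_rate = 2.0 * passband_hz       # theoretical minimum: twice the highest frequency
    return nyquist_rate * guard_factor     # extra room so the filter need not be ideal
```

With a 6-KHz passband this gives 12,000 samples per second at the theoretical limit and 15,000 with the assumed margin.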
CD digital audio technology is not viable for transmission by most radio communications, so some creative technology is required to recreate a much poorer replica in the few kilohertz of available modulation space. This is usually done by employing complex modulation to encode multistate values and some kind of delta-sigma technique that attempts to send only ongoing changes, rather than the actual instantaneous state of a signal. In this way, data rates of 9,600 baud can actually encode a voice, although the results often are not impressive. Strange artifacts frequently exist within the audio.
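The idea of sending only changes can be illustrated with plain 1-bit delta modulation. This is a deliberately simple stand-in, not the coding scheme of any particular radio or vocoder; the function names and the fixed step size are assumptions made for the sketch.

```python
def delta_encode(samples, step):
    """Emit one bit per sample: 1 if the input is at or above the running estimate, else 0."""
    estimate = 0.0
    bits = []
    for s in samples:
        bit = 1 if s >= estimate else 0
        estimate += step if bit else -step   # track the input one step at a time
        bits.append(bit)
    return bits


def delta_decode(bits, step):
    """Rebuild the staircase approximation by accumulating the same steps."""
    estimate = 0.0
    out = []
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out
```

Only one bit per sample crosses the link, but the decoder can follow the input no faster than one step per sample, which hints at why low-rate change-based coding produces audible artifacts on fast-moving signals.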
High-quality, change-based digital signals are common, using MP3 (MPEG-1 Audio Layer III, from the Moving Picture Experts Group) encoding. But this is still a high-data-rate technique, although it uses less data than direct bit encoding.
Many people experience digital audio every day in a most unattractive way, via cell phone calls. The low data rates (and narrow radio frequency channel widths) often make the conversations hard to understand and are an example of how not to implement a digital network in an aircraft. Digital transmissions for cell and encrypted radios have data rates of 4,800, 9,600, 14.4K or even 19.2K baud. Critically, the frequency truncation inherent in Nyquist limit filtering also removes many key elements of speech, especially consonant and sibilant sounds, stripping out essential cues for speech recognition.
It's easy to measure bandwidth and data rates, and perform spectrum analysis of the signals. But a standardized and repeatable measurement of intelligibility or clarity–a critical factor for this technology–remains elusive. This is unfortunate, but underscores the inescapable variability of digital audio. What seems technically impressive on a specification level may sound awful in practice and result in many speech communication errors. Digital transmission, even at low rates, is perfect for data or text communication but can be inadequate for reliable speech recognition.
So why use digital audio in an aircraft? The reasons have nothing to do with fidelity but everything to do with how the signal is handled, stored and processed. Digital audio can be stored digitally, a simple process that can be lossless and free of degradation. In addition, digital audio can be processed as a digital bit stream. This can be a compelling feature for transmission and for rejecting many types of interference–which is why it is particularly useful in noisy aircraft environments.
If we look at a 1-KHz audio signal at 100 millivolts (mV) sent 20 feet (6.1 meters) in an aircraft on a shielded wire, we can clearly see the comparative situation. If the line impedance is 600 ohms (common for audio), one can easily inject audible interference into the wire–certainly enough to be significant, relative to the 100-mV level. Twenty mV probably will be injected at various frequencies that will slip through the audio passband (400-Hz, impulse noise, other adjacent audio, etc.), resulting in a poor, 5-to-1 signal-to-noise ratio. Relative noise could probably be reduced to 50-to-1 by such tricks as lowering the impedance, increasing signal levels, improving grounding, using differential connections and rerouting paths.
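The ratios above can also be expressed in decibels, the more usual unit for signal-to-noise figures. The sketch below treats the quoted levels as voltage amplitudes measured across the same impedance; `snr_db` is a hypothetical helper name.

```python
import math


def snr_db(signal_mv: float, noise_mv: float) -> float:
    """Signal-to-noise ratio in decibels for two voltage levels across the same impedance."""
    return 20.0 * math.log10(signal_mv / noise_mv)


poor = snr_db(100.0, 20.0)     # the 5-to-1 case, roughly 14 dB
improved = snr_db(100.0, 2.0)  # the 50-to-1 case, roughly 34 dB
```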
Running the signal digitally, we have to send a digital stream at a system bit rate adequate to pass 1 KHz. If the impedance is low (easily brought down to a low value like 50 or 124 ohms), injecting other signals becomes difficult. Most importantly, there is no interference until the injected signal is so large as to upset the digital stream by being interpreted as valid data (volts, not mV). Concerning injected interference, this situation results in dramatically improved signal-to-noise ratios.
We can improve our chances further by switching to optical fiber and creating an absolutely interference-proof audio link that can run for miles. With a bit of ingenuity, this appealing fiber technique could also be used in an analog mixing system.
Digital signals also can be mixed with zero cross-talk or source contamination, since the digital mixing occurs mathematically, not by physical signal circuit combinations. This is very attractive when more than one station is needed in the aircraft.
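Mathematical mixing amounts to adding sample values frame by frame. A minimal sketch, assuming signed 16-bit integer samples and simple saturation on overflow; the name `mix` and the clipping policy are illustrative assumptions, not a description of any particular audio panel.

```python
def mix(*channels, bits=16):
    """Sum corresponding samples from each channel, saturating at the signed `bits`-bit limits."""
    hi = 2 ** (bits - 1) - 1    # 32767 for 16 bits
    lo = -(2 ** (bits - 1))     # -32768 for 16 bits
    return [max(lo, min(hi, sum(frame))) for frame in zip(*channels)]


combined = mix([1000, -2000, 30000], [500, 500, 10000])
```

Because each source stays a separate list of numbers until the addition, one station's mix cannot leak into another's, which is the zero cross-talk property described above.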
However, this circuitry is not trivial to design, and it carries a high-risk element, as the key components often are subject to short market life spans and narrow temperature ranges. Undertaking this kind of project requires extraordinary management and a fair amount of luck. In addition, a digital system may still require a fallback analog emergency mode, as failure of a digital system is generally catastrophic in nature, with no "fail passive" mode possible.
Some control and function concepts are fundamentally analog in nature, including level controls–something digital designers often ignore. Up/down rockers, buttons and slew controls for audio level often are counterintuitive and awkward for users. They can draw a strong negative reaction, especially if the intervals available never seem to deliver the desired setting. People often find a lever, knob or other continuous control much easier to understand and operate, and prefer a smooth and correctly tapered level adjustment to a digitally stepped one. Momentary pushbutton controls also can be surprisingly difficult to operate in a high-vibration environment like an aircraft, as opposed to an armchair with a TV remote control.
Walter Shawlee 2 may be reached by e-mail at firstname.lastname@example.org.