Digital Sucks


Does all digital suck?  Let's focus on the difference between theory and implementation...


Digital audio is audio which has been sampled at regular, discrete intervals and converted to numeric data.  All the original audio information that lies between the sample intervals is, of course, lost.

Delta Sigma ("new hotness")

Delta Sigma analog/digital converters are rapidly gaining popularity in affordable digital-processing chips.  They involve sampling the audio signal at extremely high frequencies (usually 64 or more times higher than the upper limit of the audio frequency range) and storing the data as a 1-bit stream.  This stream is basically the image of the audio signal with a lot of higher-frequency noise superimposed over it and represented as a sequence of on and off bits.  The sampling frequency is so high that the original analog waveform is approximated by dithering the on/off bits over the time axis, with longer "on" times for the positive side of the wave, and longer "off" times for the negative side.  To convert it back to analog, just pump the stream into an analog integrator (a simple op-amp circuit) and filter out the noise.  The ons and offs then average out as the original audio waveform again.  It's not perfect, but I think a very "tape-like" sound quality can result from just the right amount of low-pass filtering.  If the sample rate is sufficiently high, the added noise is above the audio frequency range, so the processed audio has plenty of clean bandwidth.
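To make the "on/off bits that average out" idea concrete, here is a minimal sketch of a first-order delta-sigma modulator in plain Python. It is only an illustration: the 64x oversampling figure comes from the text, while the 1 kHz test tone, its 0.5 amplitude, and the moving-average "filter" are assumptions chosen for simplicity. Real converter chips use higher-order modulators and much better filters.

```python
import math

def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: inputs in [-1, 1] -> list of 0/1 bits."""
    integrator = 0.0
    feedback = 0.0  # previous output level (+1 or -1), zero before the first bit
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate input minus last output
        bit = 1 if integrator >= 0 else 0   # 1-bit quantizer
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits

def moving_average(bits, window):
    """Crude low-pass filter: map bits to +/-1 and average over a sliding window."""
    levels = [1.0 if b else -1.0 for b in bits]
    out = []
    acc = 0.0
    for i, v in enumerate(levels):
        acc += v
        if i >= window:
            acc -= levels[i - window]       # drop the sample leaving the window
        out.append(acc / min(i + 1, window))
    return out

OVERSAMPLE = 64                  # figure quoted in the text
BASE_RATE = 44_100
RATE = OVERSAMPLE * BASE_RATE    # 1-bit stream rate
TONE_HZ = 1_000                  # assumed test tone

n = RATE // 100                  # 10 ms of signal
signal = [0.5 * math.sin(2 * math.pi * TONE_HZ * i / RATE) for i in range(n)]
bits = delta_sigma_1bit(signal)
recovered = moving_average(bits, OVERSAMPLE)

worst = max(abs(r - s) for r, s in zip(recovered[OVERSAMPLE:], signal[OVERSAMPLE:]))
print(f"worst difference between input and averaged bit stream: {worst:.3f}")
```

During the positive half of the wave the stream holds noticeably more ones than zeros, and the averaged stream follows the input apart from a small delay and some residual ripple.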

Learn more about Delta Sigma:
http://en.wikipedia.org/wiki/Delta-sigma_modulation
http://www.beis.de/Elektronik/DeltaSigma/DeltaSigma.html
http://www.dsdproaudio.com/html/dsd_sacd_explained.html
 

44.1 kHz ("old and busted")

At the top limit of the human audio spectrum, 20 kHz, a sampler digitizing at the standard CD-quality rate of 44.1 kHz only measures the signal roughly 2 times per cycle. 
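The "roughly 2 times per cycle" figure is simple arithmetic, sketched here in Python. The list of tone frequencies is just an assumed set of examples.

```python
SAMPLE_RATE_HZ = 44_100  # CD sample rate from the text

def samples_per_cycle(tone_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """How many samples land inside one cycle of a tone at tone_hz."""
    return sample_rate_hz / tone_hz

for tone_hz in (1_000, 5_000, 10_000, 20_000):
    print(f"{tone_hz:>6} Hz: {samples_per_cycle(tone_hz):.3f} samples per cycle")
```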

The problem I have with 44.1 kHz is that sampling a signal only slightly more than 2 times per cycle is the bare theoretical minimum requirement to produce "accurate" samples.  I use quotation marks around the word "accurate" because actually information is lost and errors are introduced during the sampling process.  The only reason the word "accurate" can be used is that the original waveforms can be reproduced using complex algorithms, assuming that the original source material consisted only of sine waves and no frequencies greater than 1/2 the sample rate.  Much of the original detail must be replaced by a mathematically reconstructed version during playback.  Theorists will tell you that this is "perfect" reproduction (ignoring the distortion caused by the various filters which process the information during its conversion from analog to digital, through various up-sampling and down-sampling intermediate steps, and back to analog, again through various intermediate steps--all of which rely on imperfect technology that is forced to make compromises along the way). 

The frequency range of human hearing lies roughly between 20 Hz and 20 kHz.  Many species and some humans can hear frequencies beyond this range, but there are several other reasons for sampling audio at much more than just twice the highest audio frequency.  One reason is the Hypersonic Effect.  Another reason is that the anti-aliasing filters would be much less critical and do their work above the audio spectrum, where they could cause less damage. 

One huge problem is the collision between what is theoretically possible and what technology actually produces.  

Time Smearing
As a first step, a sampler must filter away everything higher in frequency than 1/2 the sample rate.  A steep-slope low-pass filter such as is needed to pass 20 kHz and block 22.05 kHz would cause distortions in the phase of the signal as the frequency increases.  What this means is that the higher frequencies will be shifted somewhat in the time domain relative to the lower frequencies.  This will have some audible effect on the sound itself.  (I think hi-hat cymbals, for example, sound somewhat fake and plastic when digitized; the sizzle is gone, IMO.)  Even more apparent to the human ear will be the 'smearing' effect on the 3-D localization of a stereo image.  The human ear is very sensitive to differences in time between the left and right ears.  If the higher frequencies arrive just a tiny bit later than the lower frequencies, the 3-D soundstage will be less detailed and less natural. 
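To get a feel for how steep that filter has to be, the sketch below works out the slope needed between the 20 kHz passband edge and half the sample rate. The 96 dB attenuation target is an assumption (roughly the dynamic range of 16-bit audio), not a figure from any particular converter, and the 96 kHz rate is included only for comparison.

```python
import math

def octaves_between(low_hz, high_hz):
    """Width of a frequency span in octaves."""
    return math.log2(high_hz / low_hz)

def required_slope_db_per_octave(passband_edge_hz, sample_rate_hz, attenuation_db):
    """Average slope needed to reach attenuation_db by half the sample rate."""
    nyquist_hz = sample_rate_hz / 2
    return attenuation_db / octaves_between(passband_edge_hz, nyquist_hz)

ATTENUATION_DB = 96.0  # assumed target, for illustration only

slope_44k = required_slope_db_per_octave(20_000, 44_100, ATTENUATION_DB)
slope_96k = required_slope_db_per_octave(20_000, 96_000, ATTENUATION_DB)
print(f"44.1 kHz sampling: about {slope_44k:.0f} dB per octave")
print(f"96 kHz sampling:   about {slope_96k:.0f} dB per octave")
```

Under these assumptions the 44.1 kHz case needs a slope several times steeper than the 96 kHz case, because the whole transition has to fit into about a seventh of an octave.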

Jitter
Another source of sampler error is clock jitter.  The very fast, precise timing required for our 44.1 kHz sampling rate should take one sample roughly every 22.7 microseconds; but instead the point in time when the sampling occurs can vary by 50 nanoseconds.  More or less random inaccuracies in the timing of the sampler smear the audio image and give it a cheap, artificial quality.
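A rough, first-order estimate of what a timing error does to a sample: the amplitude error is about the signal's slope multiplied by the timing error. The sketch below uses the 50 ns figure from the text; the full-scale 20 kHz sine is an assumed worst case, since that is where an audio-band signal changes fastest.

```python
import math

SAMPLE_RATE_HZ = 44_100
JITTER_S = 50e-9      # timing error quoted in the text
TONE_HZ = 20_000      # assumed worst case: top of the audio band
AMPLITUDE = 1.0       # assumed full-scale sine

sample_period_s = 1 / SAMPLE_RATE_HZ

# A sine A*sin(2*pi*f*t) changes fastest at its zero crossings,
# where its slope is 2*pi*f*A.
max_slope = 2 * math.pi * TONE_HZ * AMPLITUDE
worst_error = max_slope * JITTER_S                    # linear approximation
error_db = 20 * math.log10(worst_error / AMPLITUDE)   # relative to full scale

print(f"sample period: {sample_period_s * 1e6:.2f} microseconds")
print(f"worst-case error: {worst_error:.5f} of full scale ({error_db:.1f} dB)")
```

Lower frequencies and quieter signals have gentler slopes, so the same timing error produces a smaller amplitude error there.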

Aliasing, Errors, Noise, and Loss
The sampler will detect and store data less and less accurately as the input signal rises in frequency, and this is further complicated by quick, high-frequency transient spikes (such as cymbal crashes).  Everything above the 2 kHz frequency range is subjected to increasing degrees of sampler distortion as the frequency rises (because the sampler will rarely encounter a higher-frequency signal at the exact top or bottom of its cycle).

Sampling and Data Loss

Analog input



Let's say this is a 5 kHz audio signal


 



Now, if we add a 16 kHz signal to it, the result is the point-by-point sum of the two waves.  Where the two waves have the same sign they sum together, and where they have opposite signs they subtract from each other...



So the result of adding the 16 kHz to the 5 kHz might look something like this.

 

Sample Points



Now if we sample this signal at 44.1 kHz, these are the points we'd end up with.
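The sample points can also be computed rather than drawn. This sketch assumes both tones are sine waves of equal amplitude starting at zero phase, which the text does not specify.

```python
import math

SAMPLE_RATE_HZ = 44_100

def composite(t):
    """5 kHz tone plus 16 kHz tone (equal amplitudes are an assumption)."""
    return math.sin(2 * math.pi * 5_000 * t) + math.sin(2 * math.pi * 16_000 * t)

def sample(signal, sample_rate_hz, count):
    """Measure signal at count evenly spaced instants, starting at t = 0."""
    return [signal(n / sample_rate_hz) for n in range(count)]

points = sample(composite, SAMPLE_RATE_HZ, 10)
for n, value in enumerate(points):
    print(f"sample {n}: t = {n / SAMPLE_RATE_HZ * 1e6:7.2f} us, value = {value:+.4f}")
```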

 

Decimation



The sampled audio is now a series of numbers, which would not sound too good if it were not for the mathematics involved in bringing it back into the analog domain.  Note that the sharp corners of the samples and the differences between the formerly smooth wave and the stair-step patterns are called "quantization noise."  This quantization noise must be filtered out at the digital-to-analog conversion stage.
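Turning each measurement into a number also means rounding it to the nearest value the word length can hold. Here is a minimal sketch of that rounding step, assuming signed 16-bit codes as on a CD and input values scaled to the range -1 to 1. The function name and the test values are made up for illustration.

```python
import math

def quantize(value, bits=16):
    """Round a value in [-1, 1] to the nearest signed integer code, clipping at the ends."""
    full_scale = 2 ** (bits - 1)          # 32768 for 16 bits
    code = round(value * full_scale)
    return max(-full_scale, min(full_scale - 1, code))

STEP = 1 / 2 ** 15                        # size of one 16-bit step

# Assumed test signal: 100 points of a half-amplitude sine.
values = [0.5 * math.sin(2 * math.pi * k / 100) for k in range(100)]
worst = max(abs(quantize(v) * STEP - v) for v in values)
print(f"worst rounding error: {worst:.2e} (half a step is {STEP / 2:.2e})")
```

As long as the signal does not clip, the rounding error for any one sample is never more than half a step.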
 

Reconstruction



Modern digital-to-analog filters up-sample the data to a much higher rate than it was originally sampled at in order to fill in the spaces between samples with curves using the sinc function.  Here's roughly how the sampled audio might look after being converted back into analog by a digital interpolation filter.  (Forgive my crude rendering.)  It does a very good job of approximating the analog waveform, considering that much of the original data was lost in the sampling process.  In fact, it will probably do a better job than my drawing.  I will try to create these curves using the actual sinc function soon (looking for some free software to run the numbers through); this rendering is just using simple curves.
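For anyone who wants to run the numbers, here is a minimal sketch of sinc interpolation (the Whittaker-Shannon formula) in plain Python. It assumes the same equal-amplitude 5 kHz plus 16 kHz test signal and a finite block of 2,000 samples. A real interpolation filter uses a short windowed approximation of the sinc function rather than this brute-force sum.

```python
import math

SAMPLE_RATE_HZ = 44_100

def composite(t):
    """Assumed test signal: equal-amplitude 5 kHz and 16 kHz sines."""
    return math.sin(2 * math.pi * 5_000 * t) + math.sin(2 * math.pi * 16_000 * t)

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, sample_rate_hz, t):
    """Estimate the signal at time t by summing one sinc curve per sample."""
    position = t * sample_rate_hz   # time measured in sample periods
    return sum(s * sinc(position - n) for n, s in enumerate(samples))

COUNT = 2_000
samples = [composite(n / SAMPLE_RATE_HZ) for n in range(COUNT)]

# Evaluate between sample instants, near the middle of the block,
# where cutting the sum off at the block edges matters least.
for offset in (0.25, 0.5, 0.75):
    t = (COUNT // 2 + offset) / SAMPLE_RATE_HZ
    rebuilt = reconstruct(samples, SAMPLE_RATE_HZ, t)
    print(f"offset {offset}: original = {composite(t):+.4f}, rebuilt = {rebuilt:+.4f}")
```

Because only a finite block of samples is used, the rebuilt values are approximate, and the error grows toward the ends of the block. The same function can be evaluated on a fine time grid to plot the curve between the sample points.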

A high-quality digital-to-analog system can re-create surprisingly accurate sine waves based on even scant information from the digital samples by using digital filters which re-create smooth, curved transitions between the samples, rather than jagged stair-step patterns. What if the original data was much more complex than a simple sine wave, though? No doubt a great deal of lost high-frequency audio detail, ambience, and texture can never be re-created in this way, since the only information stored for each sample is the amplitude, not the actual angle of the rise or fall of the wave. The angle can only be calculated by the data before and after each sample, filling the lost pieces with smooth curves derived from sine-waves. All the original audio data in between each sample point has already been lost forever. Hence the "deadness" that is ascribed to digital audio by audiophiles. Digital has the ability to reproduce simple sine waves quite well, but more complex and textured wave forms must necessarily be simplified and smoothed out somewhat by the conversion processes, and, ultimately, not enough data is being saved at the currently-popular sample rates to ensure detailed reproduction of real-world audio.  Real-world audio will contain waveforms that are not only additive mixtures of various pure frequencies, but also phase-shifted components (frequencies which are subtracted rather than added, in varying, frequency-dependent degrees), subtle harmonic overtones and ambient nuances, very fast transitory spikes, and asymmetrical and unnatural wave forms produced by electronic instruments, making them no longer simple derivations of the sine function.  In other words, the spaces in between the samples (at the common sample rate of 44.1 kHz) are often significant and not always predictable.

The details the anti-aliasing filter and sampler errors leave out will have to be inferred by the listener at some psycho-acoustic level.  This extra effort required of the listener, plus the harshness and lack of subtlety of digitally-sampled audio, results in a quick onset of ear fatigue.  The absence of some of the original harmonic and ambient information kills the feeling of "being there."  The best 30 ips half-inch analog recorders can capture frequencies past 50 kHz, re-creating a live, detailed, realistic, "present", exciting sonic image.  I remember when I was a kid, I would close my eyes as I listened to records (especially the first few tracks along the outside of the record, where the bandwidth is highest) and the stereo image would transport me.  I don't get that from CDs (sample rate of 44.1 kHz for a bandwidth of just barely 20 kHz) or FM radio (32 kHz for a bandwidth of only 16 kHz).  

Music as Art and Self Expression
From an electronic standpoint, an all-analog signal path presents a musician with an instrument which is physically coupled to the output device. An electric guitarist can think of his instrument as consisting of everything from the strings to the amp's speakers, which then acoustically feed back again into the strings; they are all one. There is no latency. That feeling of oneness is part of the magic of performing music.

The feeling of playing through a digital effect is that of inputting to a system which then creates an artificial product. The link is broken. What comes out is not what you put in; it is something different. It might even sound awesome, but it is still different.  This can have an adverse effect on the creative flow of a performance.

Latency
Virtually all digital recording and signal processing equipment will have some discernible lag.  Many times it is not enough of a delay to affect the player's performance.  In the case of multitrack digital recording, however, latency can keep a musician's tracks from being tight and "in the pocket."  Even the best player can't make his performance sync up with a recording if the equipment won't track and play back accurately in real time.
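One contributor to that lag is easy to estimate: digital gear usually processes audio in blocks, and a block cannot be processed until it has been filled. The sketch below computes that delay alone. The buffer sizes are hypothetical examples, and a real device adds converter and processing delays on top.

```python
def buffer_latency_ms(buffer_frames, sample_rate_hz):
    """Time needed to fill one buffer of buffer_frames samples, in milliseconds."""
    return buffer_frames / sample_rate_hz * 1000.0

SAMPLE_RATE_HZ = 44_100

for frames in (64, 256, 1024):   # hypothetical buffer sizes
    print(f"{frames:>5} frames: {buffer_latency_ms(frames, SAMPLE_RATE_HZ):.2f} ms")
```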

Cheapo Digital Guitar Effects and Playing Dynamics
There's a reason digital audio has earned its reputation for sterility, lifelessness, and harshness.  Mainly, it's because it tends to be sterile, lifeless, and harsh.  Especially when used as an electric guitar effect, in my opinion (since most of those applications tend to be very cheaply made).  Electric guitars are generally used in conjunction with distortion or overdrive.  The most natural-sounding overdrive devices are really vacuum tubes, which many solid-state distortions are designed to emulate.  A natural-sounding overdrive will make use of playing dynamics and will respond to different string attacks with varying tones and levels of distortion.  Tube amplifiers are generally used with guitars to help facilitate this desirable behavior.  A low-quality digital sampler anywhere in the signal chain between a guitar and an amplifier will remove this natural link between the player's hands and the tone of the amp.  If you must use digital effects, keep in mind they are generally best used after distortion effects (unless you prefer a compressed, dead guitar tone).  If your amp's input stage is overdriven, then using your digital effect in the amp's effects loop will help to minimize its dynamic-killing tendency (assuming the digital effect can handle the amp's line-level signal; if not, it's easy enough to add an attenuator in front of the digital effect's input). 

Best Digital Practices
Ideally, the best place to make use of digital effects is after the guitar amp's speaker, processing either a mic'ed or a cabinet-simulated, direct-out signal.  At this point in the signal chain, the instrument and amplifier have already done everything they need to do together (assuming there is an acoustic interaction between the guitar and the speaker to allow for natural feedback), so the basic tone won't be killed.

In the same way, when recording, going first into analog tape and then converting to digital may warm up the signal and create a more natural-sounding result.

Obviously, the higher quality your digital equipment and the higher sample rates and word lengths you use, the better results you will get.

Technology Improvements
I think that the emerging 24-bit/96 kHz standard is a move in the right direction.  Unfortunately, the consumer audio CD format standard is still 16-bit/44.1 kHz for the time being.  The next big thing, however, is called DSD (Direct Stream Digital), by Sony (see http://www.dsdproaudio.com/html/dsd_sacd_explained.html).
Because the signal is sampled at 2.8 MHz (2,822,400 samples per second) using only one bit (on/off), the audible result is virtually indistinguishable from analog, and the capacity is increased to 4.7 GB for the same physical size disc.  The dynamic range is also increased to beyond 120 dB. 
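The DSD numbers relate to the CD numbers by simple arithmetic, checked in the sketch below using only the rates and word lengths quoted above, per audio channel.

```python
CD_RATE_HZ = 44_100
CD_BITS = 16
DSD_RATE_HZ = 2_822_400
DSD_BITS = 1

oversampling = DSD_RATE_HZ / CD_RATE_HZ          # DSD rate as a multiple of the CD rate
cd_bits_per_second = CD_RATE_HZ * CD_BITS        # per channel
dsd_bits_per_second = DSD_RATE_HZ * DSD_BITS     # per channel
data_ratio = dsd_bits_per_second / cd_bits_per_second

print(f"DSD samples {oversampling:.0f} times faster than CD")
print(f"CD:  {cd_bits_per_second:,} bits per second per channel")
print(f"DSD: {dsd_bits_per_second:,} bits per second per channel ({data_ratio:.0f}x)")
```

So the DSD rate is 64 times the CD sample rate, which matches the "64 or more times" oversampling typical of delta-sigma converters, and the raw data rate per channel is 4 times that of a CD.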

Digital Audio Basics...
http://www.amek.com/oldsite/datashee/aesebu.htm

Sample Rates with Audio Examples...
http://artistpro.com/

The Importance of Digital Filtering...
http://www.iar-80.com/page25.html