Wednesday 19 February 2014

Is Digital Really Digital?

I have been engaged elsewhere over the last two weeks in an on-line exchange on a subject which ultimately boils down to “Why do CD transports sound different?”.  For most people, digital audio is quite a straightforward thing.  We read a bunch of digital data off a digital storage medium and play that data back through an audio converter.  Provided the transport is able to read the data accurately, then there is no basis on which to anticipate that different transports can sound at all different.

As the 1960’s rolled into the 1970’s the prevailing wisdom regarding turntables was not all that different.  Supposedly, all a turntable had to do was rotate at a constant speed and there was nothing more to it than that.  Once wow, flutter, and rumble were reduced to levels determined to be undetectable, there was no basis on which to anticipate that different turntables could sound at all different.  And then along came Ivor Tiefenbrun, who founded Linn and proceeded to turn the audio world totally upside down with his forcefully-delivered theory that the turntable - yes, the bit that just spins round - was in fact the single most important determinant of sound quality.  A theory that he was able to reduce conclusively to practice.  The wow, flutter, and rumble argument was consigned to the same dustbin of history into which the “bits is bits is bits” argument is currently tumbling, if it has not already landed.

Ivor’s “dry bones” theory of turntables was quite easy to understand.  It depended simply on taking into account mechanical relationships which had hitherto been assumed to be non-contributory.  Reading an LP with a stylus is an entirely mechanical process.  The motion of the stylus, being in contact with the LP’s surface, is transformed into an electrical signal by the motor assembly built into the cartridge body.  However, the LP sits on the turntable’s platter.  The platter sits on the bearing.  The bearing is bolted to the chassis.  The pickup arm’s base is also bolted to the chassis.  The pickup arm itself is connected to the arm base via a bearing assembly.  Finally, the cartridge body is bolted to the pickup arm.  Therefore, the voltage generated by the cartridge reflects not only the relative motion of the stylus in the groove, but also any mechanical detritus that may exist in the “mechanical ground plane” represented by the arrangement of interconnecting elements.  As the theory gained ground, so the “mechanical detritus” became better understood in terms of loose joints, vibration, energy storage and isolation.  Ivor’s Linn Sondek was considered to be outrageously expensive for a turntable.  That said, today’s ultra high end turntables - whatever you may think of them - can sell at prices that would make even Ivor wince.

Applying Ivor’s lateral thinking process to the modern (strange word, that, for something which is already going the way of the dodo) CD transport, we need to look more closely at the processes that we might otherwise assume to be non-contributory.  Like the turntable of yore, a CD transport spins a disc and a transducer reads the music off its surface.  In this case the music comprises pits impressed into the surface of a layer of metal beneath the protective plastic surface of the disc.  To detect these pits, the CD player contains a tiny laser.  The laser beam is focussed down onto the metallic surface of the disc, and reflects off it.  The reflection is picked up by a photodiode which then outputs an electrical signal in response.  The idea is that the pristine surface produces a nice clean reflection, whereas the pit produces a more diffuse reflection.  The clean reflection results in more reflected light impinging on the photodiode, and the diffuse reflection off the pit results in less.  The electronics behind the photodiode then try to determine where the reflected signal switches between a pit and a clean reflection.  In the CD format each such switch marks a “1”, and the length of the stretch between switches sets how many “0”s separate them.

The whole process is not as clean and tidy as you might imagine.  First of all, the actual light output by the laser is noisy for a whole bunch of reasons.  Secondly, the beam has to pass through the plastic protective layer on the CD’s surface, both before and after reflection, and that plastic layer can be scratched and dirty.  Thirdly, the beam position is controlled by a servo, which means that it is constantly drifting slightly in and out of alignment with the stream of pits.  Then the photodiode itself is noisy.  All this noise means that it can be extremely challenging for the electronic circuitry to reliably detect whether the signal represents a 1 or a 0.  In fact, it gets it wrong alarmingly often.  To deal with this inescapable problem, the CD standard requires the actual data to be protected by an error-correcting code (a cross-interleaved Reed-Solomon code, or CIRC) before it is written to the disc using a channel code known as eight-to-fourteen modulation.  The error-correcting code adds a whole bunch of extra bits to the data stream in such a way that if there is an error in reading an individual bit, it can be detected and automatically corrected.  So even though the read-off error rate can be quite high, the actual data is nonetheless very accurately retrieved from the disc.  Many people, therefore, will point out quite reasonably that unless the disc is badly scratched, marked, or otherwise damaged, it is fair to assume that the data stream extracted from a CD is essentially accurate.
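For anyone who wants to see how extra bits can repair a misread bit, here is a minimal sketch using a Hamming(7,4) code.  To be clear, this is not the code a CD uses (the disc format relies on a far more powerful scheme); it is just about the simplest example of the same principle, and the function names are made up for illustration.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword with 3 parity bits."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Repair at most one flipped bit; return (data bits, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three parity checks, read as a binary number, point at the bad bit.
    position = s1 + 2 * s2 + 4 * s3
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


sent = hamming74_encode([1, 0, 1, 1])
received = list(sent)
received[4] ^= 1                      # simulate one misread bit
recovered, where = hamming74_decode(received)
print(recovered, "error was at position", where)
```

Three extra bits per four data bits is a heavy overhead, and this toy code only copes with one bad bit per codeword.  Real disc formats spread their redundancy over much longer blocks so that bursts of errors from a scratch can also be repaired.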

Aside from getting the ones and zeros correct, a critical aspect of digital audio is timing jitter.  The theory underlying digital audio makes the fundamental assumption that the digital samples are converted to analog at exactly the correct sample rate.  Slight variations in the sample rate in a real-world system are referred to as jitter.  It is therefore important that the data coming off the transport is also synchronized very precisely to the sample rate.  However, since the rate at which data is retrieved from the disc is governed by the speed at which the disc spins, this means that the disc’s speed needs to be controlled with phenomenal accuracy.  And this is further compounded by the fact that because the pits on the disc have exactly the same spacing from the centre of the disc to the outside, the actual spinning speed of the disc varies dramatically from the start of the disc (the centre) to the end (the outside).  In today’s transports, the data is buffered to get around this.  The data is read into a buffer at a higher speed than is needed for playback, and the actual output data of the player is then transmitted according to a separate, and highly accurate, clock.  Most people, therefore, will point out quite validly that a modern transport’s jitter performance should be decoupled from the mechanism of the rotating platform.
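To put a rough number on how much the spinning speed has to change, here is a small sketch.  The figures are my assumptions rather than anything measured: a track speed of 1.3 metres per second past the laser, and a recorded area running from about 25 mm to about 58 mm from the centre, which are the nominal values usually quoted for a CD.

```python
import math


def spindle_rpm(radius_m, linear_velocity_m_s=1.3):
    """Rotation rate needed to keep the track moving past the laser at a fixed speed."""
    revolutions_per_second = linear_velocity_m_s / (2 * math.pi * radius_m)
    return revolutions_per_second * 60


inner_rpm = spindle_rpm(0.025)   # assumed start of the recorded area
outer_rpm = spindle_rpm(0.058)   # assumed end of the recorded area
print(f"start of disc: {inner_rpm:.0f} rpm, end of disc: {outer_rpm:.0f} rpm")
```

On those assumptions the disc has to slow from roughly 500 rpm to a little over 200 rpm across a full-length disc, which is why nobody relies on the spindle itself for timing.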

These two issues between them appear to confirm that there is no reason to imagine that two transports should sound significantly different.  Unfortunately, experience suggests that CD transports continue to sound different in practice.  Without much of an evidentiary basis to enable blame for this to be laid at the door of data errors (even though this claim continues to be made, mostly, it must be said, with no factual basis) the usual culprit is held to be jitter.

Jitter is a very helpful villain when it comes to needing something to blame.  The notional jitter sensitivity of digital audio is stupefyingly tight.  Yet it applies to the specific timing at which individual sample values are converted from digital to analog.  This is a signal which is difficult to measure, since the master clock is not normally externally accessible.  Instead, you can measure the jitter inherent in a serial data stream (such as S/PDIF) between transport and DAC, although it is not clear what the relationship would be between the jitter of the data stream and the jitter of the master clock.  Also, you can look for measurable artifacts in the analog output of the DAC, which you can then relate to the jitter properties of the master clock, although the underlying theories which are used to derive these relationships are built on highly simplistic assumptions.  In short, it is very handy to be able to blame something on jitter, because there is very little in the way of a basis upon which to dispute such an assertion.
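Here is the sort of back-of-envelope sum that lies behind the "stupefyingly tight" figure.  It is a sketch built on exactly the kind of simplistic assumption mentioned above: a full-scale sine wave, and the question of how big a timing error it takes to move the signal by one least significant bit at its steepest point.  The function name and the choice of a 20kHz test tone are mine.

```python
import math


def timing_error_for_one_lsb(signal_hz, bits):
    """Timing error (seconds) that shifts a full-scale sine by one LSB at its zero crossing."""
    # A sine of amplitude A changes fastest at the zero crossing: 2 * pi * f * A per second.
    # One LSB is 2A / 2**bits, so the time taken to move one LSB is 1 / (pi * f * 2**bits).
    return 1.0 / (math.pi * signal_hz * 2 ** bits)


cd_limit = timing_error_for_one_lsb(20_000, 16)
hires_limit = timing_error_for_one_lsb(20_000, 24)
print(f"16-bit: {cd_limit * 1e12:.0f} ps, 24-bit: {hires_limit * 1e12:.2f} ps")
```

For 16-bit audio this simple model comes out at a couple of hundred picoseconds, and for 24-bit audio at around one picosecond.  Whether numbers derived this way mean much in a real DAC is a separate question.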

The notion behind jitter is that a sample arrives early (or late) – maybe by a fraction of a nanosecond – and as a consequence the analog output changes amplitude a fraction of a nanosecond early (or late).  The problem is that these circuits just don’t respond unambiguously over those timescales.  One nanosecond is one thousandth of a microsecond (which in turn is a millionth of a second).  The waveform that is the output of the DAC core needs to change over a period from the end of one sample to the beginning of the next.  This is a timescale of the order of microseconds.  If you want to determine the precise timing of that change to an accuracy of less than one nanosecond, it means you have to measure it with a bandwidth exceeding 1GHz.  This is way up into the RF end of the frequency spectrum.  Look at ANY such signal with a bandwidth of 1GHz, and zoom in to a nanosecond-scale resolution, and ALL you will see is noise.  This is because RF is all-pervasive.  If it wasn’t, none of our radios, TVs, cell phones, WiFi, GPS, or Bluetooth devices would work.  Stopping RF from infiltrating - and propagating within - electronic circuits is a major, major challenge.  Particularly if those circuits have to deal with signals within the RF bandwidth as a matter of design.

In practice, what happens is that the actual waveform over timescales corresponding to sample rates is arrived at by bandwidth limitation.  Bandwidth limitation is in effect a big averaging filter – the peaks and troughs of the noise cancel each other out and you are left with the underlying signal.  Perhaps the underlying signal does average out to be a tad early or a tad late.  But the other thing about noise is that the underlying signal can also average out to be a tad too high or a tad too low (if you know enough about the noise – and the problem is we mostly don’t – you could even predict how often, and by how much, this will happen).  I am not sure that it is even possible in any practical sense to separate those two phenomena.  In any case, the solution lies in managing the RF noise problem.  It can’t be avoided, because the inside of a CD transport is an inherently RF-rich environment.  Many DAC manufacturers are already addressing this with various degrees of sophistication.  I suspect they still have a lot further to go.
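The averaging idea is easy to try for yourself.  The sketch below is a toy, not a model of any real DAC: it takes a steady level, buries it in random noise, and then averages blocks of 100 readings as a crude stand-in for bandwidth limitation.  The noise level, block size, and function name are arbitrary choices of mine.

```python
import random
import statistics


def band_limit(samples, window):
    """Crude low-pass filter: average each non-overlapping block of `window` samples."""
    return [statistics.fmean(samples[i:i + window])
            for i in range(0, len(samples) - window + 1, window)]


random.seed(1)                 # fixed seed so the run is repeatable
true_level = 1.0               # the underlying signal we are trying to see
noise_sd = 0.5                 # assumed noise level, half the signal size
raw = [true_level + random.gauss(0, noise_sd) for _ in range(100_000)]
filtered = band_limit(raw, 100)

raw_spread = statistics.pstdev(raw)
filtered_spread = statistics.pstdev(filtered)
print(f"spread before: {raw_spread:.3f}, spread after: {filtered_spread:.3f}")
```

Averaging 100 readings cuts the scatter by about a factor of ten, but it does not remove it.  Each averaged value still lands a tad too high or a tad too low, which is the residual error the paragraph above is talking about.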

Going back to the turntable again, the problem is to read the analog undulations in the surface of a plastic disc and represent them as an analog voltage.  The solution was to eliminate every possible interference from the unavoidable mechanical elements of the design.  It is the same problem in the digital world.  We have to read the digital undulations in the surface of a plastic disc and represent them as a digital voltage.  Except this time it is not mechanical interference, but RF electrical interference we have to worry about.
 

When we say something is digital, it is not really sufficient to say that it deals with ones and zeros.  It deals with situations where we have no need to take into account the possibility of something taking on a value other than one or zero - or, more accurately, taking on a value that cannot be expressed in its totality using an arrangement of multiple ones and zeros.  Digital signals behave in a logical fashion, and represent a logical, ordered, bounded state of affairs.  Once those constraints fail to apply, then we are no longer looking at a digital signal.  ALL of our digital signals are naturally contaminated with RF noise.  Understanding the behaviour of a digital data stream contaminated with RF noise requires treating it as an analog waveform.
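One simple way to see why the analog view matters: a real digital edge takes time to rise, and the receiver decides "that was a transition" at the moment the voltage crosses a threshold.  Any noise voltage riding on the edge moves that moment.  The sketch below assumes a perfectly straight edge, and the voltages and rise time are hypothetical round numbers, not measurements of any actual transport or interface.

```python
def crossing_time_shift_s(noise_v, swing_v, rise_time_s):
    """Shift in threshold-crossing time caused by noise on a straight-line rising edge.

    A negative result means the edge appears to arrive early.
    """
    slew_v_per_s = swing_v / rise_time_s      # how fast the edge climbs
    return -noise_v / slew_v_per_s            # extra voltage gets it to threshold sooner


# Hypothetical example: 0.5 V swing, 5 ns rise time, 10 mV of noise on top.
shift = crossing_time_shift_s(noise_v=0.010, swing_v=0.5, rise_time_s=5e-9)
print(f"edge appears to move by {shift * 1e12:.0f} ps")
```

With those made-up figures, ten millivolts of noise moves the apparent edge by about a tenth of a nanosecond.  The ones and zeros are still read correctly, but the timing of the edge has become an analog quantity.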

Our challenge is to seriously reduce the RF noise in our digital environments, as Ivor Tiefenbrun did with mechanical noise in his turntable designs.  In doing so, we must bear in mind that we can never eliminate it.  Just as even the very best turntables of today, while sounding indisputably better than their forebears of yesteryear, still do manage to sound subtly different, so digital transports - in fact digital sources of every stripe - will always continue to sound slightly different, even if by ever smaller degrees.