Tuesday 20 January 2015

High-Order DSD

As support for regular DSD (aka DSD64) becomes close to a requirement for manufacturers of not only high-end DACs but also a number of entry-level models, so the cutting edge of audio technology moves ever upward to more exotic versions of DSD denoted by the terms DSD128, DSD256, DSD512, etc.  What are these, why do they exist, and what are the challenges faced in playing them?  I thought a post on that topic might be helpful.

Simply put, these formats are identical to regular DSD, except that the sample rate is increased.  The benefit in doing so is twofold.  First, you can reduce the magnitude of the noise floor in the audio band.  Second, you can push the onset of undesirable ultrasonic noise further away from the audio band.

DSD is a noise-shaped 1-bit PCM encoding format (Oh yes it is!).  Because of that, the encoded analog signal can be reconstructed simply by passing the raw 1-bit data stream through a low-pass filter.  One way of looking at this is that at any instant in time the analog signal is very close to being the average of a number of consecutive DSD bits which encode that exact moment.  Consider this: the average of the sequence 1,0,0,0,1,1,0,1 is exactly 0.5 because it comprises four zeros and four ones.  Obviously, any sequence of 8 bits comprising four zeros and four ones will have an average value of 0.5.  So, if all we want is for our average to be 0.5, we have many choices as to how we can arrange the four zeros and four ones.
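
To put a number on "many choices", here is a small Python sketch that checks the average of the example sequence and counts how many 8-bit sequences contain exactly four ones.  The names `average`, `example` and `arrangements` are my own, chosen purely for illustration.

```python
from itertools import combinations

def average(bits):
    """Mean of a sequence of 0/1 values."""
    return sum(bits) / len(bits)

# The example sequence from the text: four ones and four zeros.
example = [1, 0, 0, 0, 1, 1, 0, 1]

# Every 8-bit sequence with exactly four ones corresponds to a choice
# of which 4 of the 8 positions hold a one.
arrangements = list(combinations(range(8), 4))

print(average(example))     # 0.5
print(len(arrangements))    # 70
```

So even in this toy case there are 70 different bit patterns that all average to 0.5.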

That simplistic illustration is a good example of how noise shaping works.  In effect we have a choice as to how we can arrange the stream of ones and zeros such that passing it through a low pass filter recreates the original waveform.  Some of those choices result in a lower noise floor in the audio band, but figuring out how to make those choices optimally is rather challenging from a mathematical standpoint.  Theory, however, does tell us a few things.  The first is that you cannot just take noise away from a certain frequency band.  You can only move it into another frequency band (or spread it over a selection of other frequency bands).  The second is that there are limits to both how low the noise floor can be depressed at the frequencies where you want to remove noise, and how high the noise floor can be raised at the frequencies you want to move it to.
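
As a rough illustration only, here is a first-order sigma-delta modulator in Python.  It is far simpler than the high-order modulators used to make real DSD, and the function names and the test signal are my own assumptions, but it shows the principle: the quantisation error is fed back so that it ends up mostly at high frequencies, and even a crude low-pass filter (a moving average) recovers the input.

```python
import math

def sigma_delta_1st_order(samples):
    """Encode values in the range [0, 1] as a 1-bit stream.

    Illustrative first-order modulator, not a production DSD encoder.
    """
    bits = []
    integrator = 0.0
    for x in samples:
        integrator += x                      # accumulate the input
        bit = 1 if integrator >= 0.5 else 0  # 1-bit quantiser
        integrator -= bit                    # feed back what was emitted
        bits.append(bit)
    return bits

def moving_average(values, window):
    """Crude low-pass filter: mean of each run of `window` values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# A slow sine wave, offset so that it stays between 0 and 1.
N = 2048
signal = [0.5 + 0.4 * math.sin(2 * math.pi * n / N) for n in range(N)]

bits = sigma_delta_1st_order(signal)

W = 64
recovered = moving_average(bits, W)
smoothed_input = moving_average(signal, W)
worst = max(abs(r - s) for r, s in zip(recovered, smoothed_input))
print(worst)  # stays below 1/W
```

Because the integrator never drifts by more than half a step either way, the average of any W consecutive bits stays within 1/W of the average of the input over the same stretch.  Real modulators do much better than this by using higher-order feedback loops.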

Just like digging a hole in the ground, what you end up with is a low frequency area where you have removed as much of the noise as you can, and a high frequency area where all this removed noise has been piled up.  If DSD is to work, the low frequency area must cover the complete audio band, and the noise floor there must be pushed down by a certain minimum amount.  DSD was originally developed and specified to have a sample rate of 2,822,400 samples per second (2.8MHz) as this is the lowest convenient sample rate at which we can realize those key criteria.  We call it DSD64 because 2.8224MHz is exactly 64 times the standard sample rate of CD audio (44.1kHz).  The downside is that the removed noise starts to pile up uncomfortably close to the audio band, and it turns out that all the optimizing in the world does not make a significant dent in that problem.

This is the fundamental limitation of DSD64.  If we want to move the ultrasonic noise further away from the audio band we have to increase either the bit depth or the sample rate.  Of the two, there are, surprisingly enough, perhaps more reasons to want to increase the bit depth than the sample rate.  However, these are trumped by the great advantages in implementing an accurate D/A converter if the ‘D’ part is 1-bit.  Therefore we now have various new flavours of DSD with higher and higher sample rates.  DSD128 has a sample rate of 128 times 44.1kHz, which works out to about 5.6MHz.  Likewise we have DSD256, DSD512, and even DSD1024.
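
The naming arithmetic is simple enough to tabulate.  A minimal Python sketch (the helper name `dsd_rate_hz` is mine):

```python
CD_RATE_HZ = 44_100  # standard CD sample rate

def dsd_rate_hz(multiple):
    """Sample rate of DSD<multiple>, e.g. multiple=64 for DSD64."""
    return multiple * CD_RATE_HZ

for m in (64, 128, 256, 512, 1024):
    rate = dsd_rate_hz(m)
    print(f"DSD{m}: {rate:,} Hz ({rate / 1e6:.4f} MHz)")
```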

Of these, perhaps the biggest bang for the buck is obtained with DSD128.  Already, it moves the rise in the ultrasonic noise to nearly twice as far from the audio band as it was with DSD64.  Critical listeners - particularly those who record microphone feeds direct to DSD - are close to unanimous in their preference for DSD128 over DSD64.  The additional benefits in going to DSD256 and above seem to be real enough, but definitely fall into the realm of diminishing returns.  Moreover, although the remarkably low cost and huge capacity of hard disks today make the storage of a substantial DSD library a practical possibility, a library in DSD512, for example, would start to represent a significant expense in both disk storage and download bandwidth.  In any case, as a result of all these developments, DSD128 recordings are now beginning to be made available in larger and larger numbers, and very occasionally we get sample tracks made available for evaluation in DSD256 format.  However, at the time of writing I don’t know where you can go to download samples of DSD512 or higher.
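
To get a feel for the storage numbers, here is a back-of-envelope Python sketch.  It assumes uncompressed stereo data and ignores any file-format overhead; the helper name `raw_bytes_per_minute` is mine.

```python
CD_RATE_HZ = 44_100  # standard CD sample rate

def raw_bytes_per_minute(multiple, channels=2):
    """Raw 1-bit data for one minute of DSD<multiple>.

    Assumes uncompressed data and no container overhead.
    """
    bits_per_second = multiple * CD_RATE_HZ * channels
    return bits_per_second * 60 // 8

for m in (64, 128, 256, 512):
    mb = raw_bytes_per_minute(m) / 1e6
    print(f"DSD{m}: {mb:.1f} MB per minute")
```

On those assumptions DSD64 comes to about 42 MB per minute and DSD512 to about 339 MB per minute, so an hour of stereo DSD512 is roughly 20 GB.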

In the Apple World where BitPerfect users live, playback of DSD requires the use of the DoP (“DSD over PCM”) protocol.  This dresses up a DSD bitstream in a faux PCM format, where a 24-bit PCM word comprises 16 bits of raw DSD data plus an 8-bit marker which identifies it as such.  Windows users have the ability to use an ASIO driver which dispenses with the need for the 8-bit marker and transmits the raw DSD data directly to the DAC in its “native” format.  ASIO for Mac, while possible, remains problematic.
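
For the curious, here is a minimal Python sketch of how one channel of DSD data might be packed into DoP words.  It assumes the marker byte occupies the top 8 bits of each 24-bit word and alternates between 0x05 and 0xFA on successive words, and that the earliest DSD bit goes in the most significant position of the 16-bit payload, which is my reading of the DoP standard.  The function name `pack_dop` is hypothetical.

```python
# Marker bytes alternate on successive words of a channel (assumed
# here to be 0x05 then 0xFA, per my reading of the DoP standard).
DOP_MARKERS = (0x05, 0xFA)

def pack_dop(dsd_bits):
    """Pack one channel's DSD bits (a list of 0/1) into 24-bit DoP words."""
    if len(dsd_bits) % 16:
        raise ValueError("need a whole number of 16-bit groups")
    words = []
    for i in range(0, len(dsd_bits), 16):
        payload = 0
        for bit in dsd_bits[i:i + 16]:
            payload = (payload << 1) | bit  # earliest bit ends up as MSB
        marker = DOP_MARKERS[(i // 16) % 2]
        words.append((marker << 16) | payload)
    return words

words = pack_dop([1] * 16 + [0] * 16)
print([f"{w:06X}" for w in words])  # ['05FFFF', 'FA0000']
```

A DAC that understands DoP looks for the marker pattern in the top byte and, when it finds it, treats the lower 16 bits as DSD rather than as PCM sample values.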

As mentioned, DoP encoding transmits the data to the DAC using a faux PCM stream format.  For DSD64 the DAC’s USB interface must provide 24-bit/176.4kHz support, which is generally not a particularly challenging requirement.  For DSD128 the required PCM stream format is 24-bit/352.8kHz which is still not especially challenging, but is less commonly encountered.  But if we go up to DSD256 we now have a requirement for a 24-bit/705.6kHz PCM stream format.  The good news is that your Mac can handle it out of the box, but unfortunately, very few DACs offer this.  Inside your DAC, if you prise off the cover, you will find that the USB subsystem is separate from the DAC chip itself.  USB receiver chipsets are sourced from specialist suppliers, and if you want one that will support a 24/705.6 format it will cost you more.  Additionally, if you are currently using a different receiver chipset, you may have a lot of time and effort invested in programming it, and you will have to return to GO if you move to a new design (do not collect $200).  The situation gets progressively worse with higher rate DSD formats.
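
Those PCM rates follow directly from the 16 DSD bits carried in each 24-bit word.  A quick Python check (the helper name `dop_pcm_rate_hz` is mine):

```python
CD_RATE_HZ = 44_100          # standard CD sample rate
DSD_BITS_PER_DOP_WORD = 16   # 16 DSD bits ride in each 24-bit PCM word

def dop_pcm_rate_hz(multiple):
    """PCM sample rate needed to carry DSD<multiple> over 2-channel DoP."""
    return multiple * CD_RATE_HZ // DSD_BITS_PER_DOP_WORD

for m in (64, 128, 256, 512):
    print(f"DSD{m} -> 24-bit/{dop_pcm_rate_hz(m) / 1000:.1f} kHz")
```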

Thus it is that we see examples of DSD-compatible DACs such as the OPPO HA-1 which offers DSD256 support, but only in “native” mode.  What this means is that if you have a Mac and are therefore constrained to using DoP, you need access to a 24/705.6 PCM stream format in order to deliver DSD256, and the HA-1 has apparently been designed with a USB receiver chipset that does not support it.  It may not be as simple as that, and there may be other considerations at play, but if so I am not aware of them.

Interestingly, the DoP specification does offer a workaround for precisely this circumstance.  It provides for an alternative to a 2-channel 24/705.6 PCM format using a 4-channel 24/352.8 PCM format.  The 8-bit DoP marker specified is different, which enables the DAC to tell 4-channel DSD128 from 2-channel DSD256 (they would otherwise be indistinguishable).  Very few DAC manufacturers currently support this variant format.  Mytek is the only one I know of - as I understand it their 192-DSD DAC supports DSD128 using the standard 2-channel DoP over USB, but using the 4-channel variant DoP over FireWire.
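
The arithmetic behind the workaround can be sketched in a few lines of Python.  This is only a rate calculation, not a description of how the variant actually lays the data out across the channels, and the helper name `dop_frame_rate_hz` is mine.

```python
CD_RATE_HZ = 44_100          # standard CD sample rate
DSD_BITS_PER_DOP_WORD = 16   # 16 DSD bits ride in each 24-bit PCM word

def dop_frame_rate_hz(multiple, audio_channels=2, pcm_channels=2):
    """PCM frame rate needed when the DoP words for `audio_channels`
    channels of DSD<multiple> are spread over `pcm_channels` PCM channels."""
    words_per_second = (multiple * CD_RATE_HZ * audio_channels
                        // DSD_BITS_PER_DOP_WORD)
    return words_per_second // pcm_channels

print(dop_frame_rate_hz(256))                    # 705600: standard 2-channel DoP
print(dop_frame_rate_hz(256, pcm_channels=4))    # 352800: 4-channel variant
print(dop_frame_rate_hz(128, audio_channels=4,
                        pcm_channels=4))         # 352800: genuine 4-channel DSD128
```

The last two cases come out as the same PCM stream format, which is why the different marker is needed to tell them apart.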

Because of its negligible adoption rate, BitPerfect currently does not support the 4-channel DoP variant.  If we did, it would require some additional configuration options in the DSD Support window.  I worry that such options are bound to end up confusing people.  For example, despite what our user manual says, you would not believe the number of customers who write to me because they have checked the “dCS DoP” checkbox and wonder why DSD playback isn’t working!  Maybe they were hoping it would make their DACs sound like a dCS, I dunno.  I can only imagine what they will make of a 2ch/4ch configurator!!!

As a final observation, some playback software will convert, on the fly, high-order DSD formats which are not supported by the user’s DAC to a lower-order DSD format which is.  While this is a noble solution, it should be noted that format conversion in DSD is a fundamentally lossy process, and that all of the benefits of the higher-order DSD format - and more - will as a result be lost.  In particular, the ultrasonic noise profile will be that of the output DSD format, not that of the source DSD format.  Additionally, DSD bitstreams are created by Sigma-Delta Modulators.  These are complex algorithms which are seriously hard to design and implement successfully, particularly if you want anything beyond modest performance out of them.  The FPGA-based implementation developed for the PS Audio DirectStream DAC is an example of a good one, but there are some less-praiseworthy efforts out there.  In general, you can expect to obtain audibly superior results by pre-converting instead to 24/176.4 (or even 24/352.8) PCM using DSD Master, which will retain both the extended frequency response and the lower ultrasonic noise floor of the DSD256 original.