Introduction to DAC Audio

DAC (Digital to Analogue Converter) audio is a way of producing realistic sounds using digital equipment (or computers to you and me!). In this sequence of articles we will explore the DACAudio library for both ESP32 and Arduino systems (at the time of writing only the ESP32 is supported, but Arduino support will come).

Sound – A Beginners Introduction
Sounds are just vibrations. A vibrating object (or instrument) pushes on the air particles in front of it and the energy is passed on, air particle to air particle, until the air particles in our ear pass this vibrating energy on to our ear drum. Through some other clever nature stuff, the ear converts this into electrical signals that the brain can interpret as sound. The pitch of a sound is determined by how fast it vibrates: its frequency, measured in Hertz (Hz), which is vibrations per second.

Vibrations produce sound waves and, like other types of wave, these have certain properties. The two main ones we’re interested in are (as already mentioned) the frequency (how often the source is vibrating) and the amplitude (how loud the sound is). Below is an example of a simple sound wave:

The frequency is how many complete waves occur in 1 second. In the example above, if the time taken for the entire wave to cross the screen was 0.125 seconds then the frequency of this sound wave is 32Hz. That is to say, there are 4 complete waves per 0.125s, which converted to Hz (number of waves per second) is 4/0.125 = 32Hz. This is a very low pitch or frequency. The other property is amplitude (the height of the wave above the centre); for sound this relates directly to the loudness of the sound.
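The arithmetic above can be checked with a few lines of Python. This is only a sketch to make the sum concrete; the function names frequency_hz and period_s are made up for this illustration and are not part of the DACAudio library.

```python
def frequency_hz(complete_waves: float, duration_s: float) -> float:
    """Frequency in hertz: complete waves divided by the seconds they took."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return complete_waves / duration_s


def period_s(freq_hz: float) -> float:
    """Time taken by one complete wave, in seconds (the inverse of frequency)."""
    if freq_hz <= 0:
        raise ValueError("freq_hz must be positive")
    return 1.0 / freq_hz


if __name__ == "__main__":
    f = frequency_hz(4, 0.125)   # 4 complete waves in 0.125 seconds
    print(f)                     # 32.0
    print(period_s(f))           # 0.03125 (each wave lasts 1/32 of a second)
```

The period is worth knowing too: it is the time budget available for drawing one complete wave, which matters later when samples have to be sent out at a steady rate.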

An important thing to notice about the wave is that the amplitude changes gradually over time (producing the waveform we see). It’s not instant high then instant low. We can produce sounds like that and they are called square waves, but in nature this type of sound wave is very, very rare (as in non-existent!). It looks like this:

 

Computers are very good at producing square waves though, because they can switch voltages high and low very, very quickly, and in most home computers from the 70s to perhaps the very early 80s this was the type of wave output. This gives computers of that age a generally recognisable sound.
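To see the difference between the two shapes in numbers, here is a small Python sketch that generates samples of each. It assumes amplitudes normalised to the range -1.0 to +1.0, and the function names are invented for this example. The point to notice is that the square wave only ever takes two values, while the smooth wave passes through many in-between values.

```python
import math


def sine_samples(cycles: int, samples_per_cycle: int) -> list:
    """Smoothly varying wave: amplitude moves gradually between -1.0 and +1.0."""
    total = cycles * samples_per_cycle
    return [math.sin(2 * math.pi * n / samples_per_cycle) for n in range(total)]


def square_samples(cycles: int, samples_per_cycle: int) -> list:
    """Square wave: high for the first half of each cycle, low for the second."""
    total = cycles * samples_per_cycle
    half = samples_per_cycle // 2
    return [1.0 if (n % samples_per_cycle) < half else -1.0 for n in range(total)]


if __name__ == "__main__":
    print(square_samples(1, 8))                        # only 1.0 and -1.0 appear
    print([round(s, 2) for s in sine_samples(1, 8)])   # many in-between values
```

A pin that can only be switched fully high or fully low can produce the first list directly, but not the second.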

To produce different types of sound requires different waves (such as the sine wave shown earlier). The Commodore 64 had a very good sound setup, enabling better sound than most of its rivals. But why are we mentioning this? Well, the embedded MCUs we work with (Arduino, ESP32 etc.) have no real sound capabilities at all and generally a simple square wave beep is all they really do. This makes them very similar to those early computers. However they are a lot faster (particularly the ESP products) and that means we can take advantage of that speed to do some work in software to produce more advanced sounds.

DACAudio Library
That’s where the DACAudio library comes in. I’ve written this to provide an easy interface for programmers and hardware designers who want to add more complex sound to their projects with minimum effort. It uses a digital to analogue converter, which is built into the ESP32 (but not into Arduinos), which is why I chose that chip for the main development. This allows relatively precise control over the final waveform. There are other ways of getting sound out of an ESP32, such as its in-built support for I2S streaming. With this you can output digital sound to nearly any pin and with a couple or so components you can get stereo sound out. Really quite cool. In the future the library will support this approach, but for now it uses the DAC pin, which reduces external components (admittedly very slightly), and I think conceptually it is easier for a beginner to see how sound could be produced that way. In any case that’s the way the design was originally conceived!!

 

What do DACs do?
DACs take a digital value and convert it to an output voltage on a specified pin, typically in the range 0V to Vcc (whatever Vcc may be, although sometimes you can specify or set an upper voltage point).

So for example, suppose Vcc is 3.3V and our digital value can be anything between 0 and 255 (8 bits). Setting it to 0 would result in 0V at the output pin, a value of 127 would give approximately 1.64V (just under half of Vcc) and 255 would give 3.3V. In this way we could control the brightness of a light or the speed of a motor by altering the voltage supplied to it using a digital number: 0 for the lowest value, 255 for the largest, and any value in between.
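The mapping from digital value to voltage is a simple proportion, which the following Python sketch works through. It assumes an ideal DAC whose output runs exactly from 0V to Vcc; dac_output_voltage is a hypothetical helper written for this article, not a library call.

```python
def dac_output_voltage(value: int, vcc: float = 3.3, bits: int = 8) -> float:
    """Ideal DAC output: the value's fraction of full scale, multiplied by Vcc."""
    max_value = (1 << bits) - 1          # 255 for an 8-bit DAC
    if not 0 <= value <= max_value:
        raise ValueError("value out of range for this DAC")
    return vcc * value / max_value


if __name__ == "__main__":
    for value in (0, 127, 255):
        # Rounded to 2 decimal places for display only
        print(value, round(dac_output_voltage(value), 2))
```

Note that the midpoint value 127 comes out just under half of Vcc, because 127 is slightly less than half of 255.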

But you may say, “I can already do that using PWM”. Well, yes, PWM simulates a DAC, but it is not suitable for all applications. DACs in particular come into their own when it comes to sound.

 

DACs and Sound
The advantages of storing sound as a digital representation of itself were easily spotted by the early music and electronics engineers. For example being able to easily make perfect copies each and every time rather than having to “look after” a very special master copy of the music (which in itself was often copied to junior master copies) was obviously beneficial. It also did away with expensive high quality duplicating equipment. You could now send perfect copies over the telephone line with no loss of quality. Try sending your latest album to the other side of the world over the phone line for duplication using an analogue copy! Impossible, so master copies had to be shipped to duplication plants physically.

Once your sound has been stored digitally (converted using an ADC), if you want to listen to it again with your analogue ears then you need to convert it back from digital data into an analogue signal, and hence the need for DACs.

Resolution and accuracy
The higher the resolution, the more accurate your digitally stored sound will be when replayed. Note that there are other factors in sound accuracy, such as sample rate, which we will cover in later articles on producing sound. For now we will just look at resolution. The CD specification uses 16 bit resolution, so the original sound (when converted to a voltage using a microphone) is stored as a 16 bit number, which can take 65536 different values (for example 0 to 65535). The more bits used to store the sound, the finer and more accurate the representation you will get. However the human ear has limitations, so it’s not simply a matter of more bits giving better sound: there comes a point where our ears cannot tell the difference.
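The effect of resolution can be put into numbers with a short Python sketch. It assumes the sound level has been scaled to the range 0.0 to 1.0 and rounds it to the nearest available step; the names levels, quantize and quantization_error are illustrative only.

```python
def levels(bits: int) -> int:
    """Number of distinct values an n-bit number can hold."""
    return 1 << bits


def quantize(x: float, bits: int) -> int:
    """Round a level in the range 0.0..1.0 to the nearest n-bit code."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be between 0.0 and 1.0")
    return round(x * (levels(bits) - 1))


def quantization_error(x: float, bits: int) -> float:
    """How far the stored value ends up from the original level."""
    max_code = levels(bits) - 1
    return abs(quantize(x, bits) / max_code - x)


if __name__ == "__main__":
    print(levels(8), levels(16))   # 256 65536
    x = 0.3337                     # an arbitrary example level
    print(quantization_error(x, 8), quantization_error(x, 16))
```

In the worst case the error is half a step, so every extra bit halves it. That is why 16 bits tracks the original far more closely than 8 bits does.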

In the ESP32 the DAC is 8 bit, giving a resolution of 256 levels (values from 0 to 255). This may seem poor but in fact it does allow us to have a good representation of sound, albeit not of audiophile quality. With our 3.3 volt processor this should mean a 0 sent to the DAC would give 0V on the DAC output pin and 255 would give 3.3V. However in real life the circuitry gives slightly different values for various technical reasons, generally starting a little over 0V and ending at around 3.24V for a value of 255.
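As a taste of what 8 bits can represent, the sketch below builds one cycle of a sine wave as a table of values from 0 to 255, centred on the middle of the range. This is an illustration of the idea only, not how the DACAudio library stores its waves; the table length of 32 and the name sine_table_8bit are assumptions made for the example.

```python
import math


def sine_table_8bit(length: int = 32) -> list:
    """One sine cycle scaled from -1.0..+1.0 into the 8-bit range 0..255."""
    table = []
    for n in range(length):
        s = math.sin(2 * math.pi * n / length)      # -1.0 .. +1.0
        code = round(127.5 + 127.5 * s)             # shift and scale to 0 .. 255
        table.append(max(0, min(255, code)))        # clamp, to be safe
    return table


if __name__ == "__main__":
    print(sine_table_8bit())
```

Sending these values to a DAC one after another at a steady rate, and repeating the table, would trace out a stepped approximation of the smooth wave shown earlier rather than a square wave.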

What Next
Use the menus above to navigate to the next article in the Audio section. Alternatively click here to look at the hardware build in order to try out the library.