New object-based audio codec: the future of immersive sound

Steve Ahern experiences immersive object-based 3D audio.
At last week’s Digital Broadcasting Symposium in Kuala Lumpur I was treated to a demo of Fraunhofer’s latest innovation, immersive 3D object-based audio.
3D sound has been around for a long while. It began in theatres, where multiple speakers were placed around the room and movies with multi-channel soundtracks were screened.
Then it moved into home theatres with 5.1 surround sound, which worked well for movies played from DVDs and video files that had multichannel outputs, but could only synthesise (fake) the surround sound experience when showing broadcast TV delivered in stereo. Next came 7.1 sound, which added a couple more speakers at the back of the room so that the bad guy’s footsteps in the movie could be heard creeping up behind you before you saw him.
But these home theatre audio systems lacked height.
They also required too many speakers and too many audio tracks to carry the sound.
I experienced Fraunhofer’s Next Generation Audio (NGA) for broadcast through a single high quality sound bar placed at the front of the room. There’s a paper about NGA here for the tech geeks amongst us.
When listening to a recording of a cathedral choir I could hear the echo from above me as the sound bounced off the church’s high vaulted stone ceiling. I was tempted to look up.
In another demo of a sports event, where I was trackside for a sprint race, I closed my eyes to really test the spatial experience. I could hear the track and field sounds in front of me, the crowd next to and behind me, and people cheering above me in the high stadium seats.
All this from one sound bar.
The presentation was organised by Toni Fiedler, General Manager of Fraunhofer for China & APAC, and Kurt Zou from Pleasant Audio/Fraunhofer.
Just in case we thought it was a trick, Zou assured us there were no hidden speakers in the ceiling or elsewhere in the room.


Only a small number of people were allowed in the room because it had been tuned to a central ‘sweet spot,’ where we could hear the sound at its best.
There are several smart elements to this system. The engineering of the sound bar is important, as is the encoding technology.
The sound bar must contain multiple good quality speakers, with separate drivers, all pointed in different directions so that it can dynamically move the audio around the space depending on how large or small the room is. At setup, a remote control device with a microphone is placed at the chosen ‘sweet spot’ in the room and the different speakers are then adjusted to give the full experience to that location. As you move around the room the audio experience changes slightly.
The beauty of the sound bar and the encoding technology feeding it is that it’s very easy to install – just unpack the speaker, find a good location for it, then tune the room with the remote.
But there is much more to the engineering than just a good sound bar.
The Fraunhofer audio codec is the secret behind the immersive sound. It is designed to deliver the object-based MPEG-H audio to the sound bar in a way that allows the spatial separation to be manipulated and delivered by the speakers.
Kurt Zou told me: “The key point is the kind of audio information the bitstream has. You can’t reproduce such real atmosphere with a limited audio source like legacy stereo... just like, without good ingredients, even a grandmaster chef can’t make great food.”


Instead of needing multiple audio tracks to deliver the sound coming from each microphone, the object-based audio codec carries metadata that captures the relative position of the sounds from the original 360-degree microphone array, and that data is used to decide what sound to send to each speaker.
The information about the sound’s relative position is what generates the ‘immersive experience.’ Using object-based audio, the new codec can be used to tailor playback to sound best on any device and in any environment.
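For the tech geeks, here is a minimal sketch of that idea, not Fraunhofer's renderer or the MPEG-H algorithm. It assumes each sound carries just one piece of position metadata (a horizontal angle) and uses simple constant-power panning between the two nearest speakers. The function names, angle convention and speaker layouts are all made up for illustration; the point is only that the same metadata can drive a stereo pair or a five-speaker layout without a separate mix for each.

```python
import math


def pan_gains(object_az_deg, speaker_az_deg):
    """Constant-power pan of one sound between the two speakers either side of it.

    Angles are in degrees: 0 = straight ahead, positive = to the listener's
    left (an assumed convention for this sketch). Speakers are assumed to
    sit in distinct directions.
    """
    n = len(speaker_az_deg)
    if n < 2:
        raise ValueError("need at least two speakers to pan between")

    # Walk around the ring of speakers in order of angle.
    order = sorted(range(n), key=lambda i: speaker_az_deg[i] % 360.0)
    target = object_az_deg % 360.0
    gains = [0.0] * n

    for k in range(n):
        i, j = order[k], order[(k + 1) % n]
        start = speaker_az_deg[i] % 360.0
        span = (speaker_az_deg[j] - speaker_az_deg[i]) % 360.0 or 360.0
        offset = (target - start) % 360.0
        if offset <= span:
            # t runs from 0 (at speaker i) to 1 (at speaker j).
            t = offset / span
            gains[i] = math.cos(t * math.pi / 2)
            gains[j] = math.sin(t * math.pi / 2)
            return gains
    return gains


def render_scene(objects, speaker_az_deg):
    """objects: dict of name -> angle. Returns name -> list of speaker gains."""
    return {name: pan_gains(az, speaker_az_deg) for name, az in objects.items()}


# Made-up position metadata for two sounds in a sports broadcast.
scene = {"commentary": 0.0, "crowd": 90.0}

stereo = [30.0, -30.0]                        # left, right
surround = [30.0, -30.0, 0.0, 110.0, -110.0]  # L, R, C, Ls, Rs

stereo_gains = render_scene(scene, stereo)
surround_gains = render_scene(scene, surround)
```

The scene description never changes; only the speaker list does. A real renderer also handles height, distance and room acoustics, which this sketch ignores.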
Once you start to broadcast or stream object-based audio, you no longer need lots of extra bandwidth to carry additional audio tracks. Even moderate quality speakers or headphones can generate immersive sound experiences if they are driven by the right codecs. With object-based audio, the user also has much greater ability to tune the mix of sounds to suit themselves.
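To show what tuning the mix could look like in practice, here is a rough sketch under stated assumptions. Because each sound arrives as its own object, a listener's preference can be applied as a per-object gain before everything is summed. The object names, the sample values and the plus or minus 12 dB limit are hypothetical, chosen only for illustration, and are not taken from the Fraunhofer system.

```python
def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def personalised_mix(objects, user_gain_db, limits_db=(-12.0, 12.0)):
    """Sum audio objects into one signal after applying the listener's gains.

    objects:      dict of name -> list of samples (all the same length)
    user_gain_db: dict of name -> requested gain in dB (missing = 0 dB)
    limits_db:    hypothetical range the listener is allowed to adjust within
    """
    lowest, highest = limits_db
    length = len(next(iter(objects.values())))
    mix = [0.0] * length

    for name, samples in objects.items():
        requested = user_gain_db.get(name, 0.0)
        allowed = max(lowest, min(highest, requested))  # clamp to the limits
        factor = db_to_linear(allowed)
        for i, sample in enumerate(samples):
            mix[i] += factor * sample
    return mix


# Made-up samples: a steady commentary signal and a noisier crowd.
objects = {
    "commentary": [0.5, 0.5, 0.5],
    "crowd": [0.2, -0.2, 0.2],
}

default_mix = personalised_mix(objects, {})
quiet_crowd = personalised_mix(objects, {"crowd": -12.0})
```

With a conventional stereo broadcast the commentary and crowd are already baked together, so there is nothing left to adjust; here the crowd can be turned down without touching the commentary.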
While this technology is made for television, it can also be used for audio, most notably streaming music services and high quality digital radio broadcasts. It is also likely to be used in gaming headphones, to improve the gamer experience.
I think I may have just heard the next generation of broadcast audio.



More about the Fraunhofer research institute here. The video below illustrates the immersive sound experience.


