HD, 4K, 8K, 16K? The evolution of the visual component in entertainment media seems to be rapid, but what about audio?
The MPEG-H audio format, also known as Next Generation Audio (NGA), is changing the way we produce and perceive audio content. It enables immersive 3D sound, enhanced audio quality, and personalized playback for an engaging listening experience.
Next Generation Audio is a catchy name given to the latest auditory evolution in broadcasting and streaming devices.
Next Generation Audio should not only provide consumers and users with immersive listening experiences but should also offer interaction possibilities. That sounds very promising, so what’s behind creating it?
The DVB consortium decided to support two formats: Dolby AC-4 and MPEG-H. Both are audio formats that support multi-channel and object-based content while making do with comparatively low bandwidth.
The MPEG-H Authoring Suite provides powerful and flexible tools for creating MPEG-H-enabled content. With the advent of immersive audio production technologies, content creators gain more creative freedom to shape how sound elements are woven together, much like a composer in a symphony.
The big advantage of object-based audio is its channel independence, since rendering only happens in the end user's playback system. Systems that support Next Generation Audio must therefore have an appropriate decoder installed. This promises playback that is always optimized for the device at hand.
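To make the idea of rendering in the playback system concrete, here is a minimal sketch of object-based mixing in Python. It assumes a simplified model (the function and field names are hypothetical illustrations, not the actual MPEG-H decoder API): each mono object carries a gain and an azimuth, and the playback device pans it into its local speaker layout, here plain stereo with constant-power panning.

```python
import math

def render_stereo(objects, num_samples):
    """Mix mono audio objects into a stereo bus using constant-power panning.

    Each object is a dict with a mono 'samples' list, a linear 'gain',
    and an 'azimuth' in degrees (-90 = hard left, +90 = hard right).
    This is a toy model of what an NGA decoder does with object metadata,
    not the real MPEG-H renderer.
    """
    left = [0.0] * num_samples
    right = [0.0] * num_samples
    for obj in objects:
        # Map azimuth to a pan angle in [0, pi/2] for constant-power panning.
        theta = (obj["azimuth"] + 90.0) / 180.0 * (math.pi / 2)
        l_gain = obj["gain"] * math.cos(theta)
        r_gain = obj["gain"] * math.sin(theta)
        for i, s in enumerate(obj["samples"][:num_samples]):
            left[i] += s * l_gain
            right[i] += s * r_gain
    return left, right

# A centred dialogue object and a hard-left ambience object.
dialogue = {"samples": [1.0, 1.0], "azimuth": 0.0, "gain": 1.0}
ambience = {"samples": [0.5, 0.5], "azimuth": -90.0, "gain": 0.8}
L, R = render_stereo([dialogue, ambience], 2)
```

The same objects could just as well be rendered to 5.1 or to headphone binaural, which is exactly the channel independence described above: the channel layout is only decided at playback time.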
What about interaction? The magic word here is metadata. As already mentioned, rendering takes place in the playback system. The Audio Definition Model (ADM) serves as a metadata framework for Next Generation Audio content, describing sound elements so that they can be manipulated during immersive audio production.
With the help of metadata, audio systems can offer personalized listening experiences. Information about positions of audio objects and interaction parameters allow viewers to create their own immersive sound mix.
So how does the decoder know what the mix should sound like? That's right: through metadata. It is transmitted as a separate track and contains information about volume ratios, positions of audio objects, and interaction parameters.
Besides mixing, the producer can set parameters that determine to what extent interactions with audio elements are possible later during playback. In practice, these are usually controllable volume ratios or the selection of audio elements from different presets.
Examples include different languages or dialogue enhancement for people with hearing loss. In theory, however, a producer could open up every conceivable parameter to intervention during playback. An interesting idea, which in turn raises interesting questions.
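A small sketch of how such producer-defined limits might work, under assumed data structures (the field names are illustrative, not the actual MPEG-H bitstream syntax): the producer authors presets and per-object gain limits, and the decoder clamps any user request to stay within them.

```python
# Toy model of producer-authored interactivity metadata. The producer defines
# presets (gain offsets in dB) and per-object gain limits; the decoder keeps
# user choices inside those limits.
presets = {
    "default":   {"dialogue": 0.0, "ambience": 0.0},
    "dialogue+": {"dialogue": 6.0, "ambience": -6.0},  # e.g. for hearing loss
}
gain_limits_db = {"dialogue": (-3.0, 9.0), "ambience": (-12.0, 3.0)}

def apply_user_gain(obj_name, requested_db, preset="default"):
    """Combine the preset offset with a user request, clamped to the
    producer-defined range for that object."""
    lo, hi = gain_limits_db[obj_name]
    total = presets[preset][obj_name] + requested_db
    return max(lo, min(hi, total))

# A user trying to push dialogue far beyond the producer's limit is clamped.
boosted = apply_user_gain("dialogue", 20.0, preset="dialogue+")  # capped at 9.0
```

This is also where the editorial questions discussed below come in: whatever freedom the listener gets is exactly the freedom the producer chose to encode.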
If you think briefly about the theoretical possibilities of interaction, it quickly becomes clear that it is probably not always sensible to have new audio technology give the consumer so much freedom over individual preferences.
Think, for example, of the news or political broadcasts in general. Here, too much individualization would probably be counterproductive, or almost manipulative, if certain persons could simply be muted. In other words, how Next Generation Audio is and will be used depends on the content.
Also, the question then arises as to whether and how television broadcasts, for example, will develop. The potential for new broadcasting concepts is definitely there!
Establishing these new freedoms could prove difficult, as users are not used to being able to intervene in or enhance the mix of a broadcast. Moreover, these interactive possibilities are likely to appeal only to a technology-oriented clientele and will probably pass the broader audience by unnoticed.
Who knows, maybe broadcasts optimized for Next Generation Audio could help to ensure that the interaction possibilities are not only registered by people but also actively accepted.
The effects of these possibilities on the production and distribution side are also exciting to watch, as the mixing engineer has to say goodbye to delivering a mix set in stone. Seen pessimistically, one releases a mix that will probably never be the personally preferred optimum, because its completion is left to the consumer.
Seen optimistically, however, it is a new challenge to explore and try out new things, and that is exciting.
MPEG-H Audio is a groundbreaking next-generation audio standard that is transforming the landscape of audio production, streaming, and playback. Designed to deliver immersive sound and personalized audio experiences, MPEG-H Audio stands out with its efficient audio compression and adaptability across various platforms.
This standard is widely embraced in industries such as broadcasting, streaming, and music production, thanks to its versatile features.
One of the standout features of MPEG-H Audio is its support for object-based audio, which allows sound elements to be treated as individual objects. This means that each audio element, whether it’s a voice, instrument, or sound effect, can be independently manipulated and positioned within a 3D soundscape.
This capability is complemented by scene-based audio, which captures the spatial characteristics of a sound environment, providing a more realistic and immersive audio experience.
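Scene-based audio in MPEG-H is based on Ambisonics, which encodes the whole sound field rather than individual objects or speaker feeds. As a minimal illustration, here is a first-order Ambisonics encoder in the traditional B-format (FuMa) convention; it is a sketch for intuition, not the Higher Order Ambisonics machinery MPEG-H actually carries.

```python
import math

def encode_foa(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order Ambisonics B-format (W, X, Y, Z),
    FuMa convention: W is the omnidirectional component attenuated by 1/sqrt(2),
    and X, Y, Z capture the front/back, left/right, and up/down directions.
    The four signals describe the sound field independently of any speaker layout.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2)
    x = sample * math.cos(az) * math.cos(el)
    y = sample * math.sin(az) * math.cos(el)
    z = sample * math.sin(el)
    return w, x, y, z

# A unit sample arriving from the listener's left (azimuth +90°, elevation 0°).
w, x, y, z = encode_foa(1.0, 90.0, 0.0)
```

As with object-based audio, the decoding to an actual speaker layout or to binaural headphones happens only at playback, which is what makes the scene representation layout-independent.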
Moreover, MPEG-H Audio’s advanced audio compression algorithms ensure high-quality sound while minimizing bandwidth usage, making it ideal for streaming services and mobile devices.
Whether you’re listening on smart speakers, home theaters, or on the go, MPEG-H Audio adapts to different playback environments, delivering consistent and high-fidelity audio content. As the next generation of audio technology, MPEG-H Audio is setting new standards for immersive sound and personalized listening experiences.
In May 2017, South Korea introduced MPEG-H as the first Next Generation Audio codec for a 4K UHD TV service. Major events are often important springboards for the development and advancement of new technologies.
The 2018 Olympic Games in Pyeongchang, for example, were an important stepping stone for the use of Next Generation Audio. Furthermore, the “Rock in Rio” festival and the Eurovision Song Contest have already been broadcast in the MPEG-H format.
Meanwhile, the format is also officially used in China and Brazil. It would be interesting to hear what consumers who already use these formats regularly have to say about them.
The audio codecs AC-4 and MPEG-H are also already used in the purely musical field, for various music streaming services. However, so far only the immersive aspect of Next Generation Audio (NGA) is relevant here.
The still rather limited range of “3D music” is usually available only for a surcharge on top of a normal streaming subscription. Also, apart from headphones, there are still very few products on the market that can play NGA.
As immersive audio becomes more widespread, it brings about new possibilities for sound design, enhancing the immersive audio experience for both live production and studio broadcasting. It will probably take some time before the format becomes established, unless the audio content is made more easily accessible.
The future of audio technology is brimming with exciting possibilities and rapid advancements. Emerging technologies like spatial audio, object-based audio, and advanced noise-canceling are revolutionizing the way we experience sound.
Next-generation audio standards such as MPEG-H Audio are at the forefront of this transformation, enabling immersive and interactive audio experiences that closely mimic real-world environments.
One of the most promising developments is the integration of AI and machine learning in audio technology. These advancements are enhancing audio synthesis, sound recognition, and audio enhancement, leading to more sophisticated and personalized audio experiences.
For instance, AI-driven algorithms can now create realistic soundscapes, improve speech intelligibility, and even restore audio quality in old recordings.
As audio technology continues to evolve, we can expect to see a surge in personalized and immersive audio experiences. The rise of smart speakers, mobile devices, and streaming services is driving demand for high-fidelity, realistic, and customizable sound.
With next-generation audio technologies like MPEG-H Audio, the future promises unprecedented levels of audio fidelity and realism, transforming the way we interact with sound in our daily lives. Whether it’s for entertainment, communication, or professional use, the advancements in audio technology are set to deliver a richer and more engaging audio experience.
Strict rules apply to participation in the Eurovision Song Contest. For example, the songs may not have been published before September 1 of the previous year and may not exceed a length of three minutes. Most entries actually come close to exhausting the 180 seconds, but shorter is also conceivable.
Covers are not allowed, but note that the language in which the song is sung is not prescribed, so contributions in fantasy languages are also possible. A maximum of six people per country are allowed on stage, while animals are forbidden.
The songs are performed live, but the music comes from a backing track. Since 2021, the voices of background singers have also been allowed to be pre-recorded, which was not permitted before.
So I can't subscribe to the prejudice that “they all can't sing.” As the audio industry evolves, many channels and broadcasters now harness the potential of Next Generation Audio to offer viewers an immersive listening experience, pushing programme production beyond the boundaries of the past.
Next Generation Audio definitely has potential to bring a breath of fresh air to audio consumption and production. The integration of audio production technologies enables professionals to craft tailor-made listening experiences, whether for streaming services, smart speakers, or mobile devices.
It remains to be seen to what extent this potential will be exploited, and above all, whether it will reach the people. I think a step in the right direction would be to make access to the technology more intuitive and attractive for consumers on the one hand, and to make the corresponding production tools more accessible for freelance producers on the other.
The latter concerns the musical sector in particular. But the technology is already very advanced, and the creative possibilities ahead are vast. I'm really keen to see immersive and interactive audio reach people via NGA; it can't take much longer.