Audio Intelligence with Raoul Wedel
In case you haven’t noticed, the AI revolution is in full swing, and it’s not just about ChatGPT, image generation, or AI voice. It’s about seamlessly integrating these cutting-edge technologies into existing systems. Microsoft has already announced its Copilot products for Office and plans to integrate them into Windows. Meanwhile, Meta is rolling out AI features such as chatbots and image generation on WhatsApp and Messenger.
However, many larger media companies in the US, including iHeart and other broadcasters, are blocking the use of ChatGPT and other generative AI tools for audio, image and video within their organisations as they struggle to keep up with the rapid pace of advancement.
As someone who experienced the early days of radio automation in the 80s and 90s, I can’t help but see some parallels. Back then, some stations were hesitant to allow computers to be connected to the internet, which eventually led to disastrous outcomes when virus-infected memory sticks were plugged into their systems.
So, how can media organisations manage the risks associated with integrating AI technologies into mainstream communication and messaging?
When not connected to large language models like ChatGPT, voice technology carries relatively low risk. However, cloned voices may be used as deepfakes, or voice talents may not have given explicit permission for their use. To avoid legal issues, make sure your voice provider can furnish all the required documentation and spoken-word consent. Some providers, like Resemble.ai, have developed technologies to detect AI-generated voices, but these may be less effective when the audio has been processed or is played over speakers or phones.
Voice models are typically created using a vast corpus of scientific audio data and audiobooks, posing little risk. Nevertheless, some providers may use user recordings for training data or create unauthorised voice clones, leading to potential legal ramifications.
Numerous audio processing tools have emerged that can remove background noise and echo, and separate music from voice. However, separating music from voice can pose a legal threat, such as when a DJ extracts the music from a copyrighted song and uses it in a promo. Media companies need to be aware of these potential copyright violations.
AI music generation is a hotly debated topic. Groundbreaking music generation models like Stable Audio have been trained on data used with consent, such as content from production music site Audiosparx. However, open-source models can be exploited by individuals who train them on copyrighted content and flood platforms like Spotify with AI-generated releases.
Additionally, there is a chance that a model may accidentally generate a song similar to a copyrighted track. For broadcasters, the risk of using AI-generated music that infringes on copyrights is significant.
On the other hand, haven’t all musicians been ‘trained’ or influenced by other artists? Why should a computer be any different?
Large language models
Large language models like ChatGPT and Bard have become increasingly popular for generating creative content, but they also come with certain challenges. One such challenge is the phenomenon of “hallucination,” where the AI generates content that might be imaginative but isn’t necessarily grounded in facts or reality. While this can lead to interesting and innovative ideas, it can also result in misinformation or content that strays too far from the desired topic.
To strike the right balance, media companies can employ techniques such as fine-tuning these models with domain-specific data and setting strict guidelines on content generation. Additionally, human supervision and collaboration remain essential to ensure the output aligns with the intended message and adheres to journalistic standards. By combining the creative capabilities of large language models with human expertise, media companies can leverage the power of AI while maintaining accuracy and credibility in their content.
The bottom line is that the AI revolution is reshaping the media industry, bringing innovation and enhanced experiences for audiences worldwide. Media companies need to adapt and evolve, harnessing the power of AI to create more engaging, immersive, and accessible content while being mindful of the legal and ethical implications. For those who navigate this new frontier with caution and foresight, the potential for transformation and growth in the media industry is immense.
About the Author
With a career in the radio industry spanning more than 30 years, Raoul Wedel is CEO of Wedel Software, a leading international provider of broadcast software solutions. In 2021 he launched the Adthos Ad Platform, bringing broadcast-quality AI and synthetic voice technology to the audio advertising industry for the first time. The platform continues to deliver more market firsts, including the option of creating 100% AI-generated audio ads.
Adthos is an international advertiser on the radioinfo group of sites.
Main Pic: Shutterstock