Google DeepMind creates AI model that can add sound to silent videos

Charlie Chaplin in The Gold Rush (1925).
Credit: Alamy

Fresh off animating memes over the past few days, AI has turned its attention to silent videos, specifically bringing audio to AI-generated clips.

Google’s DeepMind research arm has built a powerful new AI model, called V2A (video-to-audio), that can add audio to videos without sound, dubbing over the top with sound effects and music.

What is most impressive about the new research is the model's ability to accurately follow the visuals. In one clip they show a close-up of guitar playing, and the generated music closely matches the actual notes being played.

In some ways, it’s the other side of the coin from ElevenLabs' demonstration last month of generating music from a visual prompt. It also brings plenty of potential for restoring old media that no longer has an audio component, and Charlie Chaplin may be about to get a new voice if the technology progresses further.

While the Google DeepMind model isn't available to use yet, there is a similar tool from ElevenLabs that you can try today. If you want to create a video to test it with, check out our list of the 5 best AI video generators.

Google's new audio generation is off to a solid start

In a thread of posts on X, Google’s DeepMind account starts things off with a character walking through an eerily lit tunnel.

Light choir music plays over dramatic percussion, and the character’s footsteps can be heard as they move through the scene.

The second, with audio generated from the prompt “Wolf howling at the moon,” ties in nicely with the animation and even offers a chorus of howls in the distance.

The harmonica example sounds a little too “uncanny valley” in the way its pitch shifts, but the backing underneath it is solid, while the jellyfish clip sounds like, well, jellyfish. Notably, that one uses some extra prompts, including “marine life” and “ocean”.

The video with the prompt “A drummer on a stage at a concert surrounded by flashing lights and a cheering crowd” is a little off, though. For one, the beats don’t quite match the rhythm in the video once it gets going, and the sticks appear to be focused on the snare and maybe a floor tom, while the audio sounds a tad more complex, with other drums involved.

Still, it’s an impressive start to a project that’s only likely to grow with time.

Limitations of the DeepMind model

Like many projects from Google, this hasn't been released yet; it's just a research preview. Google says there are limitations and safety issues to address first.

For example: "Since the quality of the audio output is dependent on the quality of the video input, artefacts or distortions in the video, which are outside the model’s training distribution, can lead to a noticeable drop in audio quality."

The team is also working on lip syncing for videos with speech: the model currently attempts this, but it isn't always accurate and can create an uncanny valley effect.

ElevenLabs is working on a similar project

Not to be outdone, ElevenLabs this week revealed its new Text to Sound Effects API, which generates sound effects from a text description of the audio you want.

Unlike Google's V2A model, the API from ElevenLabs is already accessible and, in our experiments, works surprisingly well.

In the example above, a video of a bottle smashing gets a few different options to choose from, while the DiCaprio laughing meme gets additional audio from other people in the room.

The company 'bootstrapped' a quick app to demonstrate what is possible with the API, allowing you to upload a video and have it add the sound. This is free to use and open source, and you can try it right now.

ElevenLabs told Tom's Guide the real aim is to have other companies and developers build things with the API themselves, such as integrating it into generative video tools.
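For developers who want to experiment before building that kind of integration, a call to the sound effects API looks roughly like the sketch below. This is a minimal example assuming ElevenLabs' publicly documented /v1/sound-generation endpoint and its JSON fields; the exact parameters may change, so check the current API docs before relying on it.

```python
# Minimal sketch of a Text to Sound Effects request. Assumes the publicly
# documented /v1/sound-generation endpoint; field names may differ from the
# current ElevenLabs docs, so verify before use.
import requests

API_KEY = "YOUR_ELEVENLABS_API_KEY"  # placeholder, not a real key

response = requests.post(
    "https://api.elevenlabs.io/v1/sound-generation",
    headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
    json={
        "text": "Glass bottle smashing on a concrete floor",  # describe the sound
        "duration_seconds": 3,       # optional: clip length
        "prompt_influence": 0.5,     # optional: how literally to follow the prompt
    },
    timeout=60,
)
response.raise_for_status()

# The endpoint returns the generated audio as MP3 bytes.
with open("bottle_smash.mp3", "wb") as f:
    f.write(response.content)
```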
