I went hands-on with Veo 3, Google’s new AI video generator, to test its power and its limitations. It can create some truly mind-blowing fake videos.
When the AI-generated “Will Smith eating spaghetti” video went viral a little over 2 years ago, I wasn’t as skeptical as some about the future of AI video generation. I anticipated improvements, but I never imagined the technology would advance so quickly. Indeed, it was just last month that Google rolled out Veo 2, its second-generation AI video generator model, to the public, and the company is already back with a much more impressive model. After making over 25 videos with it, I’m convinced that Google’s Veo 3 is a mind-blowing advancement in AI video generation, for better or worse.

What is Veo 3?
Veo 3 is Google’s state-of-the-art text-to-video generation model. Like Veo 2, Veo 3 creates high-quality videos in a range of subjects and styles, even capturing nuanced object interactions and human expressions. Both models also block “harmful requests and results” and mark their video outputs with an invisible watermark called SynthID.
The Veo 2 model could only produce silent videos, making it more like a high-quality GIF generator. The new Veo 3 model, however, supports native audio generation, putting it leaps and bounds ahead of its predecessor. The new model can not only generate sound effects and ambient noise but also create dialogue that’s synced with the video.
While Veo 3 outputs are still limited to short, 8-second video clips, the addition of native audio generation has allowed people to create some truly mind-blowing AI videos that have taken the Internet by storm. I’m sure you’ve seen some of these videos already, but if not, we’ve put together a collection of over 25 videos made by Veo 3 that demonstrate the tool’s prowess and its current limitations. While Veo 3 can be a pain to work with, its low barrier to entry makes it an incredible tool for anyone with enough time to create convincing, life-like videos, and I’m not convinced the world is ready for this.

Veo 3 almost makes it too easy to create realistic videos
If you’ve spent any amount of time on social media in the last few weeks, then you’ve probably seen people argue over whether 100 men could beat 1 gorilla in a fight. It’s become somewhat of a meme, with laypeople and experts alike chiming in on the debate. Some amateur video makers have even created their own simulations of the hypothetical brawl. I wanted to see how easy it would be for me, someone with virtually no 3D animation experience, to make a video showing 100 men take on 1 gorilla.
It was as simple as asking the Gemini chatbot, “Create a video showing 100 men fighting one silverback gorilla.”
Now, I’m sure if you pixel-peep, you can find some errors. Maybe you’ll spot some men or weapons in the background appearing or disappearing randomly, or perhaps you’ll notice that there clearly aren’t 100 men in this 8-second clip. But if you were to simply watch this video on a small smartphone screen, you’d be hard-pressed to find any major issues at a glance.
This video definitely captures the chaotic, fast-paced action that would ensue when 100 men take on 1 gorilla. The sound that Veo 3 generated for the gorilla’s punches had weight to it, making it feel believable. I knew it was an AI-generated video, of course, because I was the one who made it. But when I showed this clip to my mom, who was unaware of the memes on social media, she asked me what movie it was from!
Another video that demonstrates Veo 3’s skills at simulating animal physics is this one:
I asked Gemini to make a video of “a bull rampaging in a shop selling fine china,” and it created a video where, again, if you were to pixel-peep, you’d probably find issues.