Meta’s Movie Gen looks like a huge leap forward for AI video (but you can’t use it yet)

6 hours ago


At this point, you probably either love the idea of making realistic videos with generative AI, or you think it’s a morally bankrupt endeavor that devalues artists and will usher in a disastrous era of deepfakes we’ll never escape from. It’s hard to find middle ground. Meta isn’t going to change minds with Movie Gen, its latest video creation AI model, but no matter what you think of AI media creation, it could end up being a significant milestone for the industry.

Movie Gen can produce realistic videos alongside music and sound effects at 16 fps or 24 fps at up to 1080p (upscaled from 768 by 768 pixels). It can also generate personalized videos if you upload a photo, and crucially, it appears to be easy to edit videos using simple text commands. Notably, it can also edit normal, non-AI videos with text. It’s easy to imagine how that could be useful for cleaning up something you’ve shot on your phone for Instagram. Movie Gen is purely research at the moment. Meta won’t be releasing it to the public, so we have a bit of time to think about what it all means.

The company describes Movie Gen as its “third wave” of generative AI research, following its initial media creation tools like Make-A-Scene, as well as more recent offerings using its Llama AI model. Movie Gen is powered by a 30 billion parameter transformer model that can make 16-second videos at 16 fps, or 10-second clips at 24 fps. It also has a 13 billion parameter audio model that can make 45 seconds of 48kHz audio like “ambient sound, sound effects (Foley), and instrumental background music” synchronized to video. There’s no synchronized voice support yet “due to our design choices,” the Movie Gen team wrote in their research paper.
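
For a rough sense of scale, the figures quoted above can be turned into frame and sample counts. This is back-of-the-envelope arithmetic on the article's numbers, not anything taken from Meta's paper; the variable and function names are illustrative, and the audio figure assumes a single channel, since the channel count isn't stated here.

```python
# Clip lengths and frame rates as quoted in the article.
# Names are hypothetical; this is plain arithmetic, not Meta's code.
video_modes = {"16 fps": (16, 16), "24 fps": (10, 24)}  # label -> (seconds, fps)


def frame_count(seconds, fps):
    """Total frames in a clip of the given length and frame rate."""
    return seconds * fps


frames = {label: frame_count(s, f) for label, (s, f) in video_modes.items()}

# 45 seconds of 48kHz audio, counted per channel (channel count is an assumption).
audio_samples = 45 * 48_000

print(frames)
print(audio_samples)
```

Both video modes work out to a similar number of frames (256 versus 240), which suggests the choice is mostly a trade between clip length and smoothness.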