Google’s VideoPoet can generate videos from text, image and audio input
Google has introduced VideoPoet, an AI-driven tool that allows creators to generate video from text, image and audio input.
Its features also include video frame continuation, video inpainting and outpainting, video stylisation, and video-to-audio capabilities.
The platform lets users generate videos with controllable motion and stylised effects from text prompts alone, without complex commands. Users can type a prompt and the model will return a result, or provide an image and ask for customised motion, style or audio.
The tool is built on MAGVIT-2 and can produce high-motion, variable-length videos. MAGVIT-2 is a video tokeniser designed to convert both videos and images into concise, expressive tokens drawn from a common token vocabulary.
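To make the tokeniser idea concrete, here is a minimal toy sketch of what "converting frames into tokens from a shared vocabulary" means: frames are split into patches, and each patch is mapped to the nearest entry in a codebook, producing the integer token ids a language model would operate on. The codebook here is random and all names are illustrative; this is not the MAGVIT-2 implementation.

```python
# Toy illustration of a video tokeniser: patches -> nearest codebook id -> tokens.
# Codebook is random here; a real tokeniser like MAGVIT-2 learns it from data.
import numpy as np

rng = np.random.default_rng(0)

CODEBOOK_SIZE = 1024          # shared vocabulary for video and image tokens
PATCH = 4                     # each 4x4 pixel patch becomes one token

codebook = rng.normal(size=(CODEBOOK_SIZE, PATCH * PATCH))  # stand-in for learned codes

def tokenise(frame: np.ndarray) -> np.ndarray:
    """Map each PATCHxPATCH block of a greyscale frame to its nearest code id."""
    h, w = frame.shape
    patches = (frame.reshape(h // PATCH, PATCH, w // PATCH, PATCH)
                    .transpose(0, 2, 1, 3)
                    .reshape(-1, PATCH * PATCH))
    # Nearest-neighbour lookup in the codebook (squared Euclidean distance).
    dists = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)          # one integer token per patch

def detokenise(tokens: np.ndarray, h: int, w: int) -> np.ndarray:
    """Reconstruct an approximate frame from token ids via codebook lookup."""
    patches = codebook[tokens].reshape(h // PATCH, w // PATCH, PATCH, PATCH)
    return patches.transpose(0, 2, 1, 3).reshape(h, w)

frame = rng.normal(size=(16, 16))        # a toy 16x16 greyscale "frame"
tokens = tokenise(frame)                 # 16 tokens, each in [0, 1024)
approx = detokenise(tokens, 16, 16)
print(tokens[:8], approx.shape)
```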
VideoPoet will also provide interactive editing capabilities and consolidate multiple video generation functionalities within a single large language model (LLM).
“VideoPoet is a simple modeling method that can convert any autoregressive language model or large language model (LLM) into a high-quality video generator,” according to a post shared on the Google website.
The company said the model is autoregressive, meaning it produces output by conditioning on what it has already generated.
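In code, autoregressive generation is simply a loop in which each new token is sampled conditioned on everything produced so far. The sketch below uses a toy stand-in model that returns random logits; in VideoPoet the tokens would be video or audio tokens and the model a trained LLM.

```python
# Minimal sketch of an autoregressive decoding loop; the toy_model is a
# placeholder, not Google's model.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 1024

def toy_model(context: list[int]) -> np.ndarray:
    """Stand-in for a trained LLM: returns logits over the token vocabulary."""
    return rng.normal(size=VOCAB)

def generate(prompt_tokens: list[int], steps: int) -> list[int]:
    tokens = list(prompt_tokens)
    for _ in range(steps):
        logits = toy_model(tokens)               # condition on all prior tokens
        probs = np.exp(logits - logits.max())    # softmax over the vocabulary
        probs /= probs.sum()
        tokens.append(int(rng.choice(VOCAB, p=probs)))  # sample the next token
    return tokens

print(generate([1, 2, 3], steps=5))  # prompt tokens followed by 5 sampled ones
```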
The company shared several demos on its website that show the platform can also create a short film by combining numerous video clips.
Google has not yet made VideoPoet publicly available, as the tool remains under development.
However, the Google research team has published a demo website showcasing videos created with the platform.