Google Gemini AI Now Lets You Convert Photos to Videos: What You Need to Know

futureTEKnow
July 11, 2025

KEY POINTS

Gemini AI now converts photos and text into 8-second videos with sound
Available to Ultra and Pro subscribers via web, with mobile support coming soon
Strict content guidelines prevent misuse and protect privacy
Best results currently come from simple animations of objects and nature
Face and complex motion animation are still under development

Google has rolled out a new feature for its Gemini AI platform that lets users transform photos and text into short, 8-second video clips with sound. The update is currently available to subscribers of the Google AI Ultra and Pro plans through the Gemini chat interface on the web, with a mobile app update coming soon.

How Does the Photo-to-Video Feature Work?

Upload a photo and add a text description.
Gemini generates a 720p landscape MP4 video—up to 8 seconds long, complete with sound.
The process is streamlined within the Gemini chat, making it user-friendly for both creators and casual users.

✨📦✨ Special delivery! A new Gemini feature just dropped. Make photos come alive by turning them into videos with sound.
— Google Gemini App (@GeminiApp) July 10, 2025

Why Is This Update Important?

This isn’t Google’s first foray into AI-powered video creation. The technology was initially showcased as part of Veo 3, Google’s advanced video-generation model, and was previously limited to Flow, a standalone filmmaking tool. Now, by integrating it into Gemini, Google is democratizing access to AI video creation for a much wider audience.

Competing in the AI Video Race

Google’s move comes as the competition heats up in the AI video space. Rivals like OpenAI, Runway, Alibaba, and Kuaishou are all racing to launch their own generative video tools. By embedding this capability directly into Gemini, Google is aiming to keep pace and set new standards for AI-driven creative tools.

User Guidelines and Limitations

To prevent misuse, Google has implemented strict content guidelines:

Videos may not use images of celebrities, politicians, or other public figures.
Content that promotes violence or bullying is prohibited.

Despite these safeguards, the technology is still evolving. Early testers found that while Gemini excels at animating nature scenes, drawings, and objects, it struggles with more complex tasks. For instance, attempts to create talking videos from photos sometimes resulted in altered facial features or even changes in race. Simple prompts—like making a plant sway or animating a cat—worked well, but more ambitious requests, such as making a person breakdance, often produced awkward or unintended results.

A Google spokesperson emphasized that the AI is not programmed to change appearances and that improvements, especially for face animation, are in the pipeline.

What This Means for Content Creators

This update is a significant step for anyone interested in AI-powered content creation. The ability to quickly generate short videos from photos and text opens up new possibilities for storytelling, marketing, and social media engagement. As the technology matures, expect even more sophisticated video generation capabilities to become available to a broader audience.

Google’s latest update signals a new era in AI-assisted creativity, putting powerful video generation tools directly into users’ hands.
