Google is rolling out a powerful update to its Gemini AI assistant: the ability to upload and analyze video clips directly on Android, iPhone, and the web. This significant enhancement expands Gemini’s multimodal capabilities, letting users go beyond text and images by including video in their prompts.
The feature was first spotted in a gradual rollout around June 17th; Google has since confirmed it is live “for everyone” as of June 19th, covering both free and paid Gemini users across supported platforms.
How to Upload Videos to Gemini
Accessing this new functionality is straightforward:
On Mobile (Android & iPhone): Make sure you have the latest version of the Gemini app on iOS/iPadOS, or the Google app (version 16.24 stable or higher) on Android. Open a Gemini chat, tap the ‘+’ (plus) icon next to the prompt bar, then choose ‘Gallery’ or ‘Files’ and select the clip you want to upload. If the feature is active for your account, video files will be selectable.
On Web (gemini.google.com): Navigate to the Gemini website, then drag and drop a video file directly into the prompt bar to upload it.
Once uploaded, the video appears within the chat interface, often with a playback control, allowing you to rewatch the clip as you interact with Gemini.
What Gemini Can Do With Your Videos
With video analysis capabilities, Gemini can process the content of your uploaded clips to provide insights, answer questions, or generate descriptions. This opens up a range of possibilities:
Identify Objects or Actions: Ask Gemini about specific things happening in the video, such as identifying an object displayed on a screen or describing an action taking place.
Summarize Content: Request a summary of the video’s main points or events.
Generate Descriptions: Get detailed, evocative descriptions of scenes, like analyzing the setting and atmosphere of a landscape video.
Answer Contextual Questions: Ask questions about information presented visually in the video, like the time shown on a device screen.
For instance, you could upload a clip of a nature scene and ask Gemini to describe it. The AI can analyze visual cues such as colors, lighting, and textures to build a rich narrative, and may even infer sounds from visual evidence: in one example, Gemini described the atmosphere of a forest path covered in foliage and suggested the “rustling sound of leaves.”
Understanding Availability and Limitations
While the rollout is now broadly available, it is part of a phased deployment, so availability may still vary slightly between accounts. The feature supports uploads from your existing video library; the built-in Gemini camera does not currently record video directly for analysis, though code findings suggest that capability may arrive in a future update.
Initial reports and APK teardowns indicated an early limit of five minutes of combined video length for direct uploads. For longer videos, the existing method of providing a link to an unlisted YouTube video remains an option for analysis.
This video analysis feature builds on Gemini’s existing ability to interact with YouTube videos via links and complements other recent updates like the stable release of Gemini 2.5 Pro, a persistent navigation drawer on Android, and the introduction of features like Scheduled Actions and even song identification capabilities on Android.
By integrating direct video upload and analysis, Google is significantly enhancing Gemini’s ability to process and understand multimodal inputs, bringing it closer to parity with other advanced AI models and making it a more versatile tool for users across mobile and web platforms.