Generative AI pioneer Midjourney, renowned for pushing the boundaries of AI image creation, is now stepping into AI video generation. The company officially announced its first AI video model, Midjourney V1, on Wednesday, June 18, 2025, with availability rolling out to users around June 19. This marks a significant expansion, allowing users to animate their existing images and transform static visuals into short, captivating video clips.
The core function of Midjourney V1 is its image-to-video capability. It takes any still image – whether previously generated within Midjourney or uploaded by the user – and brings it to life. The feature is designed to be easy to find: users simply select an image from their gallery on the web interface or via Discord and look for the new “Animate” option.
Upon initiating the process, Midjourney V1 automatically generates an initial five-second video clip. To provide users with creative options, the model typically produces four distinct video outputs based on the single image, allowing selection of the most suitable result.
Bringing Images to Life: Control and Customization
While V1 offers automated animation, users aren’t limited to default settings. The model provides options for customizing the video’s motion. Users can choose between an “Automatic” mode, where Midjourney determines the movement, and a “Manual” mode, in which they provide text prompts to guide the animation and can even specify whether the model should adhere strictly to the prompt or inject a “creative flair” for potentially unexpected results.
Further control is available through adjustable motion settings:
Low Motion: Ideal for subtle, ambient scenes where the camera remains largely static and the movement is concentrated within the subject or elements of the image. However, early use suggests this setting can sometimes result in videos with very little noticeable animation.
High Motion: Designed for more dynamic sequences featuring greater subject movement and potential camera motion. While this can produce more dramatic effects, there’s a higher chance of visual artifacts or glitches appearing in the output. Experimenting with both settings is recommended to find the best fit for a given image; the sketch after this list summarizes the option space.
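For readers who think in data structures, here is one way to summarize V1’s animation controls as described above. This is purely illustrative: the field names are hypothetical and mirror this article’s description of the web/Discord interface, not any actual Midjourney API.

```python
# Illustrative summary of Midjourney V1's animation controls.
# Not a real API: key names are hypothetical; the values come from
# the descriptions in this article.
animate_options = {
    "mode": ("automatic", "manual"),           # manual accepts a text motion prompt
    "prompt_adherence": ("strict", "creative"),  # manual mode only
    "motion": {
        "low": "mostly static camera; subtle movement within the subject",
        "high": "dynamic subject and camera motion; higher risk of artifacts",
    },
    "initial_clip_seconds": 5,   # length of the auto-generated first clip
    "outputs_per_job": 4,        # V1 returns four variants per animated image
}
```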
Need a longer clip? Midjourney V1 allows users to extend the duration of their generated videos. From the initial 5-second clip, users can add approximately four seconds at a time, with the ability to extend the video up to four times. This brings the maximum possible length of a V1-generated video to around 20-21 seconds.
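A quick back-of-the-envelope check of that ceiling, treating the article’s approximate figures as exact:

```python
# Maximum V1 clip length, using the (approximate) figures quoted above.
INITIAL_SECONDS = 5     # auto-generated first clip
EXTENSION_SECONDS = 4   # "approximately four seconds" per extension
MAX_EXTENSIONS = 4      # a video can be extended up to four times

max_length = INITIAL_SECONDS + MAX_EXTENSIONS * EXTENSION_SECONDS
print(max_length)  # 21 -- consistent with the quoted ~20-21 second ceiling
```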
Pricing, Availability, and Market Position
As a premium feature, access to Midjourney V1 requires a paid subscription. Generating videos comes at a notably higher cost than generating still images – specifically, roughly eight times the cost of a standard image generation job. While this consumes credits faster, Midjourney positions its video pricing competitively, claiming that the cost per second is comparable to performing an image upscale and therefore, second for second, significantly cheaper than some rival offerings.
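Combining that 8x figure with the four 5-second clips a single job produces (noted earlier) gives a rough per-second cost. The “image job” unit below is an assumption used purely for the ratio; Midjourney’s actual credit accounting may differ.

```python
# Rough per-second cost of V1 video, measured in image-job equivalents.
# Assumes one video job costs ~8 image jobs and yields four 5-second clips,
# per the figures in this article; real credit accounting may differ.
VIDEO_JOB_COST = 8.0     # in image-job equivalents
CLIPS_PER_JOB = 4
SECONDS_PER_CLIP = 5

total_seconds = CLIPS_PER_JOB * SECONDS_PER_CLIP  # 20 s of footage per job
print(VIDEO_JOB_COST / total_seconds)             # 0.4 image jobs per second
```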
Midjourney V1 is currently accessible through the company’s web interface and via Discord; there is no dedicated mobile app support for the video feature yet. Subscription tiers affect video access: Basic plans allow limited use, while Pro and Mega subscribers benefit from unlimited video generation when using the platform’s “Relax” mode.
The launch of V1 places Midjourney in direct competition with major players in the burgeoning AI video space, including OpenAI’s Sora, Google’s Veo, and Runway ML. While competitors like Runway and Sora may offer longer clips or advanced features like built-in audio (which V1 currently lacks), Midjourney appears to be carving out its niche. CEO David Holz has suggested Midjourney’s focus remains on creative storytelling and imagination, rather than immediately targeting corporate or Hollywood-level productions.
A Stepping Stone to Wider Ambitions
Midjourney views the V1 video model not just as a new feature, but as a crucial step in a much larger, long-term research and development roadmap. David Holz has outlined the ambitious goal of creating AI models capable of real-time, open-world simulation. This vision involves building complex “world models” that integrate image generation, video, 3D environments, and real-time interaction – in effect, interactive, navigable worlds created from prompts. V1, which adds motion to static images, is seen as a fundamental building block (image → video) toward that ultimate goal of interactive 3D spaces and real-time experiences.
The introduction of V1 also comes amidst a significant legal challenge. Disney and Universal have filed a copyright infringement lawsuit against Midjourney, alleging the company trained its AI models on protected intellectual property, including popular characters like Darth Vader and Homer Simpson, without permission. The lawsuit targets the video service preemptively, claiming Midjourney is also training its video model on this protected data, potentially gaining an unfair advantage and facilitating widespread plagiarism. The legal battle highlights the ongoing tensions regarding AI training data and copyright law, a factor that could influence the pace of Midjourney’s future developments.
Despite the legal hurdles and the initial focus on short, artistic clips (described by some early users as “dreamlike” or “moving paintings” rather than photorealistic), the release of Midjourney V1 marks a significant milestone. It expands the platform’s creative toolkit, brings AI video generation capabilities to its large user base, and signals Midjourney’s commitment to its long-term vision of developing increasingly complex and interactive AI generation technologies. Early reactions from creators have reportedly been positive, praising the accessibility and results, even while noting areas for future improvement like video length and audio integration.