Core Functionality and Platform Access
Midjourney has officially entered the AI video space with the launch of its V1 model, an image-to-video generator. The system takes a user-provided image, either uploaded or previously generated by another Midjourney model, and transforms it into a set of four distinct five-second videos. Staying true to its roots, Midjourney offers V1 through Discord, although at its initial launch the functionality is available only on the web. This approach maintains the familiar workflow for its existing user base while expanding its creative toolkit from static images to multimedia content.
Market Competition and Creative Focus
The release of V1 positions Midjourney against established AI video models from major players such as OpenAI's Sora, Runway's Gen-4, Adobe's Firefly, and Google's Veo 3. While many competitors are developing controllable AI video tools for commercial applications, Midjourney continues to distinguish itself by catering to creative professionals and artists. Early demonstrations of V1's output align with this focus, producing videos that are otherworldly and artistic rather than hyperrealistic. The initial response from the community has been positive, though direct comparisons to more mature models on the market are still pending.
Customization, Features, and Current Limitations
User Control and Settings
Midjourney's V1 gives users several settings for shaping the final video output. There are two primary animation modes: an automatic setting that animates the source image in a randomized way, and a manual mode in which users supply a text description of the desired animation. Further control is offered through toggles for camera and subject movement, with a "low motion" option for subtle effects and a "high motion" option for more dynamic results.
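Midjourney exposes these controls through its interface rather than a public API, so the sketch below is purely illustrative: a minimal way to represent the options described above (animation mode, optional text prompt, and motion level), with every name invented for this example.

```python
# Hypothetical representation of V1's documented settings; these class and
# field names are invented for illustration and are not a Midjourney API.
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class VideoJobSettings:
    source_image: str                                # uploaded or previously generated image
    animation_mode: Literal["automatic", "manual"] = "automatic"
    motion_prompt: Optional[str] = None              # text description, used in manual mode
    motion_level: Literal["low", "high"] = "low"     # "low motion" vs. "high motion"


job = VideoJobSettings(
    source_image="castle.png",
    animation_mode="manual",
    motion_prompt="the camera slowly pans across the courtyard",
    motion_level="high",
)
```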
Video Length and Extension Capabilities
Videos generated by V1 have a base length of five seconds, but users can extend a clip by four seconds at a time, up to four times, for a maximum length of 21 seconds per generation.
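As a quick check of the arithmetic, the stated limits work out as follows (the constant names are descriptive only, not official parameters):

```python
# Maximum clip length under the extension rules described above.
BASE_SECONDS = 5        # initial clip length
EXTENSION_SECONDS = 4   # seconds added per extension
MAX_EXTENSIONS = 4      # extensions allowed per clip

max_length = BASE_SECONDS + EXTENSION_SECONDS * MAX_EXTENSIONS
print(max_length)       # 5 + 4 * 4 = 21 seconds
```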
Notable Omissions at Launch
At launch, the V1 model has some significant limitations compared to rivals. The most noticeable is the lack of sound generation; any audio or soundtrack must be added in post-production using external software. Additionally, the platform does not currently support advanced editing features like timelines, scene transitions, or tools for ensuring continuity between different clips.
Pricing Structure and Subscription Plans
Midjourney has set the cost for video generation at eight times that of standard image generation. This means subscribers will use their monthly generation allotments much faster when creating videos. The most affordable way to access V1 is through the Basic plan, which costs $10 per month.
For heavy users, the Pro plan ($60/month) and Mega plan ($120/month) offer unlimited video generations in the slower "Relax" mode. The company has stated it will re-evaluate its video pricing model over the next month.
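To see how quickly the 8x multiplier consumes a subscription, a back-of-the-envelope estimate helps; the multiplier comes from Midjourney's stated pricing, but the allotment figure below is a placeholder rather than an actual plan quota.

```python
# Rough illustration of how video jobs draw down a monthly allotment.
VIDEO_COST_MULTIPLIER = 8  # Midjourney prices one video job at roughly 8 image jobs


def remaining_image_jobs(monthly_image_jobs: int, video_jobs: int) -> int:
    """Image-equivalent jobs left after running `video_jobs` video generations."""
    return monthly_image_jobs - video_jobs * VIDEO_COST_MULTIPLIER


# Hypothetical allotment equivalent to 200 image jobs:
print(remaining_image_jobs(200, 10))  # 200 - 10 * 8 = 120 image-equivalent jobs left
```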
Legal Scrutiny and Industry Copyright Concerns
The launch of V1 occurs under the shadow of significant legal challenges. Just a week prior to the release, Midjourney was sued by Disney and Universal for alleged copyright infringement. The lawsuit claims that Midjourney's AI image models were trained on and continue to produce images of the studios' copyrighted characters, such as Darth Vader and Homer Simpson. This legal battle reflects a broader fear within Hollywood and other media industries that generative AI tools could devalue the work of human creatives and are being built on unlicensed copyrighted works. The lawsuit specifically anticipates future infringement from the video service, alleging it was likely trained on protected characters in motion.
Midjourney's Bold Long-Term Plan
Despite the immediate focus on video, Midjourney has much larger ambitions. In a blog post, CEO David Holz described the AI video model as a step toward the company's ultimate goal: creating AI models "capable of real-time open-world simulations". This vision involves merging static image generation, motion, and 3D spatial navigation into a single, unified system. After its video models, the company plans to build models for producing 3D renderings and, eventually, real-time models, positioning V1 as a foundational building block for these future technologies.
Q&A
How does Midjourney's V1 video model work?
V1 is an image-to-video model that you access via Discord (web only at launch). You upload an image or use one generated by Midjourney, and the model produces four different five-second video clips based on that image. You can then extend these clips in four-second increments up to a total of 21 seconds.
What does it cost to use Midjourney's V1 video generator?
A video generation costs eight times as much as a standard image generation in terms of subscription credits. The cheapest access is through the $10-per-month Basic plan. Subscribers to the $60 Pro and $120 Mega plans get unlimited video generations in the slower "Relax" mode.
How does Midjourney V1 compare to competitors like Sora or Runway?
Midjourney V1 differentiates itself by focusing on a more artistic, otherworldly aesthetic rather than the hyperrealism often targeted by competitors for commercial use. At launch, it lacks features like sound generation and advanced in-app video editing tools that some other platforms offer.