Imagen 4 Text-to-Image Now in Google Gemini API


Google continues to rapidly expand its generative AI offerings, recently integrating its advanced text-to-image model, Imagen 4, into the Gemini API and Google AI Studio. This move provides developers and creators with powerful new capabilities for generating high-fidelity images directly through Google’s AI platforms.

The Imagen 4 release includes both the standard Imagen 4 model and a more capable version, Imagen 4 Ultra. Imagen 4 Ultra is specifically designed to achieve higher alignment with user prompts and deliver superior visual fidelity compared to previous iterations.

Imagen 4 marks a significant step forward in text-to-image generation. A key improvement is its ability to render text within images, a task that has historically been difficult for AI models. Early feedback from testers supports this, noting sharper lettering and fewer visual artifacts in generated images compared to Imagen 3.

The models are designed for performance, offering near real-time image generation. They can generate up to four images per request, each at a resolution of 1024 x 1024 pixels, and accept detailed prompts of up to 480 tokens. For higher resolution, Imagen 4 Ultra offers an optional 2K export accessible via Google Cloud’s Vertex AI platform.
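The four-images-per-request cap means larger jobs have to be split into batches. The following is a minimal sketch of that planning step only, assuming the cap stated above; the function name plan_batches is hypothetical and the code does not call the Gemini API.

```python
# Limit taken from the article; everything else here is illustrative.
MAX_IMAGES_PER_REQUEST = 4


def plan_batches(total_images: int, per_request: int = MAX_IMAGES_PER_REQUEST) -> list[int]:
    """Return how many images to ask for in each request.

    Hypothetical helper: it only does the arithmetic of splitting a job
    into requests that respect the per-request image cap.
    """
    if total_images < 0:
        raise ValueError("total_images must be non-negative")
    if not 1 <= per_request <= MAX_IMAGES_PER_REQUEST:
        raise ValueError(f"per_request must be between 1 and {MAX_IMAGES_PER_REQUEST}")

    # Full batches first, then one smaller batch for any remainder.
    full_batches, remainder = divmod(total_images, per_request)
    return [per_request] * full_batches + ([remainder] if remainder else [])


if __name__ == "__main__":
    print(plan_batches(10))  # → [4, 4, 2]
```

Ten images would therefore take three requests, which matters once the per-minute request limit and per-image pricing come into play.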

Imagen 4 is currently available as a paid preview at $0.04 per image, while the more precise Imagen 4 Ultra costs $0.06 per image. Billing is pay-as-you-go through Google Cloud, and both versions offer limited free access through Google AI Studio, with broader availability and additional pricing tiers planned for the near future. Developers can access the models via the same /generate endpoint used for Gemini models, with a default rate limit of 20 requests per minute per project during the preview. A faster version, Imagen 4 Fast, is also expected soon.
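To make the pricing and rate limit concrete, here is a rough back-of-the-envelope estimator. It is a sketch under the preview figures quoted in this article ($0.04 and $0.06 per image, 20 requests per minute, up to four images per request); the model labels used as dictionary keys and the names JobEstimate and estimate_job are hypothetical, and the code makes no API calls.

```python
from typing import NamedTuple

# Preview figures quoted in the article. Prices are kept in whole cents
# to avoid floating-point rounding. The dictionary keys are illustrative
# labels, not confirmed API model identifiers.
PRICE_CENTS_PER_IMAGE = {"imagen-4": 4, "imagen-4-ultra": 6}
REQUESTS_PER_MINUTE = 20
MAX_IMAGES_PER_REQUEST = 4


class JobEstimate(NamedTuple):
    requests: int        # requests needed at the per-request image cap
    cost_usd: str        # total price, formatted as dollars
    quota_windows: int   # one-minute rate-limit windows the job spans


def estimate_job(model: str, total_images: int) -> JobEstimate:
    """Estimate requests, cost, and rate-limit windows for a batch job."""
    if total_images < 0:
        raise ValueError("total_images must be non-negative")
    cents = PRICE_CENTS_PER_IMAGE[model] * total_images

    # Ceiling division: -(-a // b) rounds up for non-negative integers.
    requests = -(-total_images // MAX_IMAGES_PER_REQUEST)
    quota_windows = -(-requests // REQUESTS_PER_MINUTE)

    cost_usd = f"${cents // 100}.{cents % 100:02d}"
    return JobEstimate(requests, cost_usd, quota_windows)


if __name__ == "__main__":
    print(estimate_job("imagen-4", 1000))
    print(estimate_job("imagen-4-ultra", 1000))
```

Under these assumptions, 1,000 images need 250 requests spread over 13 one-minute windows, costing $40.00 on Imagen 4 or $60.00 on Imagen 4 Ultra. Actual limits and prices may change once the preview ends.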

To promote transparency and help users identify AI-generated content, each image created using Imagen 4 includes a digital SynthID watermark. Google is also rolling out a SynthID Detector tool to help verify the origin of watermarked content.

The launch of Imagen 4 in the Gemini API is part of a larger set of recent Google AI announcements, including updates to its creative AI models such as Veo (video generation) and Flow (filmmaking tool), as well as developer tools like the new Gemini CLI and an AI-first Google Colab. Together, these releases reflect Google DeepMind’s effort to make high-performance generative AI tools accessible to developers and creators across its ecosystem.
