We've enhanced Grok's image generation abilities with a new model, code-named Aurora. Aurora is an autoregressive mixture-of-experts network trained to predict the next token from interleaved text and image data. We trained the model on billions of examples from the internet, giving it a deep understanding of the world. As a result, it excels at photorealistic rendering and precisely following text instructions. Beyond text, the model also has native support for multimodal input, allowing it to take inspiration from or directly edit user-provided images.
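As a rough illustration of what "predicting the next token from interleaved text and image data" can mean, the toy sketch below flattens text and image segments into one sequence and lists the (context, target) pairs an autoregressive model would learn from. This post does not describe Aurora's tokenizer, vocabulary, or training code, so the `<boi>`/`<eoi>` boundary markers, the `img_*` token names, and both helper functions are hypothetical.

```python
from typing import List, Tuple

# Hypothetical special tokens marking where an image's tokens begin and end.
BOI, EOI = "<boi>", "<eoi>"

# A segment is a modality tag ("text" or "image") plus its tokens.
Segment = Tuple[str, List[str]]


def interleave(segments: List[Segment]) -> List[str]:
    """Flatten text and image segments into a single token sequence."""
    sequence: List[str] = []
    for kind, tokens in segments:
        if kind == "image":
            # Wrap image tokens so the boundaries are visible in the sequence.
            sequence += [BOI, *tokens, EOI]
        elif kind == "text":
            sequence += tokens
        else:
            raise ValueError(f"unknown segment kind: {kind}")
    return sequence


def next_token_pairs(sequence: List[str]) -> List[Tuple[List[str], str]]:
    """Pair each non-empty proper prefix with the token that follows it."""
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


# Made-up example: a caption, a two-token "image", then more text.
segments: List[Segment] = [
    ("text", ["a", "cat"]),
    ("image", ["img_17", "img_4"]),
    ("text", ["in", "anime", "style"]),
]
sequence = interleave(segments)
pairs = next_token_pairs(sequence)

for context, target in pairs:
    print(" ".join(context), "->", target)
```

In this framing, text tokens and image tokens share one sequence, so the same next-token objective covers both generating an image from text and continuing from a user-provided image.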
Grok's new capabilities are now available on the 𝕏 platform in select countries and will roll out to all users within a week.
Image Generation
Grok can now generate high-quality images across several domains where other image generation models often struggle. It can render precise visual details of real-world entities, text, and logos, and it can create realistic portraits of humans.
Cybertruck under an aurora
Image Editing
Our new image generation model can now take images as input, giving users greater creative control and flexibility. We will release this capability to users on the 𝕏 platform soon.
Make the cat anime style
Looking Forward
At xAI, we are advancing the frontier of multimodal understanding and generation. If this goal inspires you, we invite you to join us on this journey. We are hiring!