AI Image Generation

VariantLab supports multiple AI models from different providers for image generation. Choose the model that best fits your needs based on quality, speed, resolution, and style.

Available Models

Gemini 2.5 Flash Image

Google's fast, cost-effective image generation model. Great for rapid iteration and exploring prompt ideas.

Specification	Value
Provider	Google
Max Resolution	1024px
Speed	Fast (seconds)
Aspect Ratios	10 options

Best for: Prototyping, testing prompts, high-volume generation, budget-conscious projects.

Gemini 3.0 Pro Image

Google's high-quality generation model with support for up to 4K resolution.

Specification	Value
Provider	Google
Max Resolution	4096px
Speed	Slower (10-30 seconds)
Size Options	1K, 2K, 4K
Aspect Ratios	10 options

Best for: Final production images, print-quality output, projects requiring high resolution and detail.

FLUX.2 [pro]

High-fidelity image generation from Black Forest Labs with a commercial license.

Specification	Value
Provider	Black Forest Labs
Max Resolution	2048px
Speed	Moderate
Size Options	1K, 2K
Aspect Ratios	10 options

Best for: Commercial projects, photorealistic styles, high-quality generation with licensing clarity.

Note: FLUX.2 [pro] is generation-only. Variations automatically use FLUX.2 [pro] Edit.

GPT Image 1

OpenAI's multimodal image generation model with quality level control.

Specification	Value
Provider	OpenAI
Max Resolution	1024px
Speed	Moderate
Quality Levels	Low, Medium, High
Aspect Ratios	3 options (1:1, 2:3, 3:2)

Best for: Diverse art styles, text rendering in images, OpenAI ecosystem integration.

GPT Image 1.5

OpenAI's latest image model. It is more cost-effective than GPT Image 1 and adds an "auto" quality option.

Specification	Value
Provider	OpenAI
Max Resolution	1024px
Speed	Moderate
Quality Levels	Low, Medium, High, Auto
Aspect Ratios	3 options (1:1, 2:3, 3:2)

Best for: Cost-effective OpenAI generation, letting the model choose optimal quality with "auto" mode.

Imagen 4

Google's latest dedicated image generation model with excellent prompt adherence.

Specification	Value
Provider	Google
Max Resolution	2048px
Speed	Moderate
Size Options	1K, 2K
Aspect Ratios	5 options

Best for: High-quality generation, photorealistic output, strong prompt following.

Note: Imagen 4 is generation-only. Variations automatically use a compatible Gemini model.

Imagen 4 Ultra

The highest-quality Imagen model with enhanced detail and output quality.

Specification	Value
Provider	Google
Max Resolution	2048px
Speed	Slower
Size Options	1K, 2K
Aspect Ratios	5 options

Best for: Maximum quality output, detailed illustrations, premium production images.

Note: Imagen 4 Ultra is generation-only. Variations automatically use a compatible Gemini model.

Seedream 4.5

ByteDance's unified generation and editing model.

Specification	Value
Provider	ByteDance
Max Resolution	2048px
Speed	Moderate
Size Options	1K, 2K
Aspect Ratios	7 options

Best for: Generation and editing in one model, diverse art styles.

Coming soon: Seedream 4.5 is not yet available. It will be enabled in a future update.
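
The specifications above can be collected into a small lookup table, for example to filter models by the resolution a project needs. This is a minimal sketch: `ModelSpec`, `MODELS`, and `models_supporting` are hypothetical names for illustration and are not part of any VariantLab API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """One row of the specification tables above (hypothetical structure)."""
    name: str
    provider: str
    max_resolution_px: int
    aspect_ratio_count: int
    available: bool = True


# Values copied from the specification tables in this section.
MODELS = [
    ModelSpec("Gemini 2.5 Flash Image", "Google", 1024, 10),
    ModelSpec("Gemini 3.0 Pro Image", "Google", 4096, 10),
    ModelSpec("FLUX.2 [pro]", "Black Forest Labs", 2048, 10),
    ModelSpec("GPT Image 1", "OpenAI", 1024, 3),
    ModelSpec("GPT Image 1.5", "OpenAI", 1024, 3),
    ModelSpec("Imagen 4", "Google", 2048, 5),
    ModelSpec("Imagen 4 Ultra", "Google", 2048, 5),
    # Listed as coming soon, so it is excluded from selection.
    ModelSpec("Seedream 4.5", "ByteDance", 2048, 7, available=False),
]


def models_supporting(min_resolution_px: int) -> list[str]:
    """Return names of available models whose max resolution meets the minimum."""
    return [
        m.name
        for m in MODELS
        if m.available and m.max_resolution_px >= min_resolution_px
    ]


print(models_supporting(4096))
```

Filtering on `available` keeps Seedream 4.5 out of the results until it is enabled.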

Quality Levels

GPT Image models support quality levels that control output fidelity:

Level	Description
Low	Fastest, lowest cost
Medium	Balanced quality and cost
High	Best quality, highest cost
Auto	Model chooses optimal quality (GPT Image 1.5 only)

Higher quality levels produce more detailed images but cost more Mana.
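
As a rough illustration of the table above, the sketch below checks whether a quality level is valid for a given model. The names `QUALITY_LEVELS` and `is_valid_quality` are assumptions made for this example, not VariantLab identifiers.

```python
# Quality levels per model, as listed in the table above.
QUALITY_LEVELS = {
    "GPT Image 1": ("low", "medium", "high"),
    "GPT Image 1.5": ("low", "medium", "high", "auto"),
}


def is_valid_quality(model: str, level: str) -> bool:
    """Return True if the model supports the given quality level.

    Models without quality levels (anything not in QUALITY_LEVELS)
    always return False.
    """
    return level.lower() in QUALITY_LEVELS.get(model, ())


print(is_valid_quality("GPT Image 1.5", "Auto"))
print(is_valid_quality("GPT Image 1", "Auto"))
```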

Variation Model Override

Some models (like Imagen and FLUX.2 [pro]) are generation-only and don't support image editing natively. When you use these models for base generation, VariantLab automatically assigns a compatible model for variations:

Generation Model	Variation Model
FLUX.2 [pro]	FLUX.2 [pro] Edit
Imagen 4	Gemini 2.5 Flash Image
Imagen 4 Ultra	Gemini 2.5 Flash Image

You can change the variation model in project settings if you prefer a different option.
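
The override behavior can be pictured as a lookup with a fallback. The following is a sketch under stated assumptions: `resolve_variation_model` is a hypothetical helper, and the fallback (a model that supports editing handles its own variations) is inferred from the text rather than confirmed by it.

```python
# Default variation models for generation-only models, from the table above.
DEFAULT_VARIATION_MODEL = {
    "FLUX.2 [pro]": "FLUX.2 [pro] Edit",
    "Imagen 4": "Gemini 2.5 Flash Image",
    "Imagen 4 Ultra": "Gemini 2.5 Flash Image",
}


def resolve_variation_model(generation_model, override=None):
    """Pick the model used for variations.

    An explicit override (the project setting) wins. Otherwise,
    generation-only models map to their default variation model, and
    every other model is assumed to handle its own variations.
    """
    if override is not None:
        return override
    return DEFAULT_VARIATION_MODEL.get(generation_model, generation_model)


print(resolve_variation_model("Imagen 4"))
print(resolve_variation_model("Imagen 4", override="Gemini 3.0 Pro Image"))
```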

Aspect Ratios

Available aspect ratios vary by model. The Gemini models and FLUX.2 [pro] support all 10 ratios:

Ratio	Use Case
1:1	Avatars, icons, square art
3:2	Landscape photography
2:3	Portrait photography
4:3	Traditional landscape
3:4	Traditional portrait
5:4	Wide traditional
4:5	Tall traditional
16:9	Widescreen, banners
9:16	Vertical, mobile
21:9	Ultra-wide, panoramic

GPT Image models support 3 ratios (1:1, 2:3, 3:2). Imagen models support 5 ratios (1:1, 3:4, 4:3, 9:16, 16:9). Seedream 4.5 supports 7 ratios.
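
To make the relationship between aspect ratio and output size concrete, the sketch below converts a ratio into pixel dimensions. It assumes the longer side is pinned to the chosen size and the shorter side is rounded to the nearest pixel. The exact dimensions each model produces are not documented here, so treat the results as illustrative. `dimensions` is a hypothetical helper.

```python
# Ratio lists as stated in this section.
ALL_RATIOS = ("1:1", "3:2", "2:3", "4:3", "3:4", "5:4", "4:5", "16:9", "9:16", "21:9")
GPT_IMAGE_RATIOS = ("1:1", "2:3", "3:2")
IMAGEN_RATIOS = ("1:1", "3:4", "4:3", "9:16", "16:9")


def dimensions(ratio: str, long_side_px: int) -> tuple[int, int]:
    """Return (width, height) with the longer side equal to long_side_px.

    Assumption: the long side is pinned and the short side is rounded to
    the nearest pixel; real models may snap to different sizes.
    """
    w, h = (int(part) for part in ratio.split(":"))
    if w >= h:
        return long_side_px, round(long_side_px * h / w)
    return round(long_side_px * w / h), long_side_px


print(dimensions("16:9", 1024))
print(dimensions("9:16", 1024))
```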

Writing Effective Prompts

Structure

A good prompt includes:

Subject - What you're generating
Style - Art style, medium
Details - Colors, features, accessories
Composition - Position, framing
Background - Setting, context

Example Prompts

Character avatar:

A cute robot mascot with large expressive blue eyes,
shiny silver metallic body, small antenna on top,
flat digital art style, centered composition,
solid white background, full body visible

Game asset:

Medieval fantasy sword with glowing blue blade,
ornate golden hilt with gems, magical particles,
game asset style, centered, transparent background,
high detail, no shadows

NFT art:

Abstract geometric lion portrait, low poly style,
vibrant gradient colors purple to orange,
modern digital art, centered, dark background

Tips for Better Results

Be specific - "glowing blue LED eyes" beats "blue eyes"
Include style - Always mention the art style
Request centering - Helps with trait detection later
Solid backgrounds - Easier for background removal
Mention what to avoid - "no text, no watermarks"
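
The prompt structure and tips above can be combined in a small helper that assembles a prompt from its parts. This is a sketch only: `build_prompt` and its parameters are hypothetical, and the ordering of parts simply mirrors the example prompts in this section.

```python
def build_prompt(subject, style, details=(), composition="", background="", avoid=()):
    """Join prompt components into one comma-separated prompt string.

    Order follows the example prompts: subject, details, style,
    composition, background, then "no ..." exclusions.
    """
    parts = [subject, *details, style, composition, background]
    parts += [f"no {item}" for item in avoid]
    # Drop empty components so optional fields leave no stray commas.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="A cute robot mascot",
    style="flat digital art style",
    details=["large expressive blue eyes", "shiny silver metallic body"],
    composition="centered composition",
    background="solid white background",
    avoid=["text", "watermarks"],
)
print(prompt)
```

Keeping each component as a separate argument makes it easy to vary one element, such as the style, while holding the rest of the prompt fixed.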

Generation Settings

Remove Background

Enable this setting to automatically remove the background after generation:

Uses the selected background removal model
Replaces the background with transparency
Can be applied or removed later

Background Removal Models

Model	Best For
U2Net Fast	General purpose, quick
ISNet General	Digital art, clip art
U2Net Pro	High quality general
U2Net Cloth	Clothing items
U2Net Human	People, portraits
Silueta	High quality general
ISNet Anime	Anime, manga style
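
One way to read the table above is as a mapping from content type to a suggested model. The sketch below encodes that mapping. The content-type keys and the `pick_background_model` helper are hypothetical, and falling back to U2Net Fast for unknown content is an assumption based on its "general purpose" description.

```python
# Suggested model per content type, derived from the table above.
# The keys are illustrative labels, not VariantLab setting values.
BACKGROUND_MODEL_BY_CONTENT = {
    "general": "U2Net Fast",
    "digital_art": "ISNet General",
    "clothing": "U2Net Cloth",
    "portrait": "U2Net Human",
    "anime": "ISNet Anime",
}


def pick_background_model(content_type: str) -> str:
    """Return a suggested background removal model for a content type."""
    # Assumption: unknown content falls back to the general-purpose model.
    return BACKGROUND_MODEL_BY_CONTENT.get(content_type, "U2Net Fast")


print(pick_background_model("anime"))
```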

Cost Estimation

Before you generate, the generate button shows the estimated Mana cost for your current model and settings. Costs vary by model, image size, and quality level: faster models and smaller sizes cost less, while higher-quality models and larger outputs cost more.

Regenerating Images

Click the regenerate icon to create a new image:

Uses the current prompt and settings
Replaces the existing image in that slot
Costs the same Mana as a new generation

Multiple Base Images

Generate up to 5 base images per project:

Click + in the thumbnail stack
Generate or upload into the new slot
Star your favorite as the base for the pipeline

Having multiple options lets you pick the best starting point for your collection.
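
The slot behavior described in the last two sections (at most 5 base images, regeneration replacing an image in place, and one starred base) can be modeled roughly as follows. `BaseImageStack` and its methods are hypothetical names for illustration, not VariantLab's implementation.

```python
MAX_BASE_IMAGES = 5  # Limit stated above: up to 5 base images per project.


class BaseImageStack:
    """Illustrative model of a project's base image slots."""

    def __init__(self):
        self.images = []      # Image identifiers, one per slot.
        self.starred = None   # Index of the starred slot, if any.

    def add(self, image_id):
        """Add a generated or uploaded image to a new slot; return its index."""
        if len(self.images) >= MAX_BASE_IMAGES:
            raise ValueError(f"a project holds at most {MAX_BASE_IMAGES} base images")
        self.images.append(image_id)
        return len(self.images) - 1

    def regenerate(self, index, new_image_id):
        """Replace the image in an existing slot, as regeneration does."""
        if not 0 <= index < len(self.images):
            raise IndexError("no such slot")
        self.images[index] = new_image_id

    def star(self, index):
        """Mark a slot as the base for the pipeline."""
        if not 0 <= index < len(self.images):
            raise IndexError("no such slot")
        self.starred = index

    @property
    def base(self):
        """Return the starred image, or None if nothing is starred."""
        return None if self.starred is None else self.images[self.starred]


stack = BaseImageStack()
for i in range(MAX_BASE_IMAGES):
    stack.add(f"img-{i}")
stack.star(2)
stack.regenerate(2, "img-2-retry")
print(stack.base)
```

Because the star is stored as a slot index, regenerating the starred slot keeps it starred and points the pipeline at the new image. Whether VariantLab behaves this way is an assumption of this sketch.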
