AI Models Reference

Technical reference for the AI models available in VariantLab.

Image Generation Models

Gemini 2.5 Flash Image

Google's fast image generation model.

| Specification | Value |
| --- | --- |
| Model ID | gemini-2.5-flash-image |
| Provider | Google |
| Max Resolution | 1024 × 1024 |
| Generation Speed | 2-5 seconds |
| Image Sizes | 1K |
| Aspect Ratios | 10 |

Capabilities: Fast iteration, good quality for most use cases, supports reference images.

Best for: Prototyping, testing prompts, high-volume generation, budget-conscious projects.

Gemini 3.0 Pro Image

Google's high-quality image generation model.

| Specification | Value |
| --- | --- |
| Model ID | gemini-3-pro-image-preview |
| Provider | Google |
| Max Resolution | 4096 × 4096 |
| Generation Speed | 10-30 seconds |
| Image Sizes | 1K, 2K, 4K |
| Aspect Ratios | 10 |

Capabilities: Higher image quality, up to 4K resolution, better detail rendering, supports reference images.

Best for: Final production images, print-quality output, high-resolution assets.

FLUX.2 [pro]

Black Forest Labs' high-fidelity generation model with commercial licensing.

| Specification | Value |
| --- | --- |
| Model ID | flux-2-pro |
| Provider | Black Forest Labs |
| Max Resolution | 2048 × 2048 |
| Image Sizes | 1K, 2K |
| Aspect Ratios | 10 |

Capabilities: High-quality generation, commercial license, supports reference images.

Best for: Commercial projects, photorealistic styles, high-fidelity output.

Note: FLUX.2 [pro] is generation-only. Variations use the FLUX.2 [pro] Edit model (flux-2-pro-edit).

FLUX.2 [pro] Edit

Black Forest Labs' professional image editing model. It does not appear in the generation model selector; VariantLab uses it automatically for variations when FLUX.2 [pro] is the generation model.

| Specification | Value |
| --- | --- |
| Model ID | flux-2-pro-edit |
| Provider | Black Forest Labs |
| Max Resolution | 2048 × 2048 |
| Image Sizes | 1K, 2K |
| Edit Only | Yes |

Capabilities: Image editing, commercial license.

GPT Image 1

OpenAI's multimodal image generation model.

| Specification | Value |
| --- | --- |
| Model ID | gpt-image-1 |
| Provider | OpenAI |
| Max Resolution | 1024 × 1024 |
| Image Sizes | 1K (fixed) |
| Quality Levels | Low, Medium, High |
| Aspect Ratios | 3 (1:1, 2:3, 3:2) |

Capabilities: Image generation and editing, quality level control, supports reference images.

Best for: Diverse art styles, text rendering, quality-cost tradeoff control.

GPT Image 1.5

OpenAI's latest image model with improved cost-effectiveness.

| Specification | Value |
| --- | --- |
| Model ID | gpt-image-1.5 |
| Provider | OpenAI |
| Max Resolution | 1024 × 1024 |
| Image Sizes | 1K (fixed) |
| Quality Levels | Low, Medium, High, Auto |
| Aspect Ratios | 3 (1:1, 2:3, 3:2) |

Capabilities: Image generation and editing, quality level control (including auto), supports reference images.

Best for: Cost-effective OpenAI generation, automatic quality optimization.

Imagen 4

Google's latest dedicated image generation model.

| Specification | Value |
| --- | --- |
| Model ID | imagen-4 |
| Provider | Google |
| Max Resolution | 2048 × 2048 |
| Image Sizes | 1K, 2K |
| Aspect Ratios | 5 (1:1, 3:4, 4:3, 9:16, 16:9) |

Capabilities: High-quality generation, strong prompt adherence.

Best for: Photorealistic output, high-quality illustrations.

Note: Imagen 4 is generation-only. Variations default to Gemini 2.5 Flash Image.

Imagen 4 Ultra

The highest-quality Imagen model.

| Specification | Value |
| --- | --- |
| Model ID | imagen-4-ultra |
| Provider | Google |
| Max Resolution | 2048 × 2048 |
| Image Sizes | 1K, 2K |
| Aspect Ratios | 5 (1:1, 3:4, 4:3, 9:16, 16:9) |

Capabilities: Premium image quality, enhanced detail rendering.

Best for: Maximum quality output, premium production images.

Note: Imagen 4 Ultra is generation-only. Variations default to Gemini 2.5 Flash Image.

Seedream 4.5

ByteDance's unified generation and editing model.

| Specification | Value |
| --- | --- |
| Model ID | seedream-4.5 |
| Provider | ByteDance |
| Max Resolution | 2048 × 2048 |
| Image Sizes | 1K, 2K |
| Aspect Ratios | 7 (1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9) |
| Status | Coming soon |

Capabilities: Image generation and editing in one model.

Detection Model

Gemini 2.5 Flash

Used for trait detection and mask generation.

| Specification | Value |
| --- | --- |
| Model ID | gemini-2.5-flash |
| Input | Image + text prompt |
| Output | Bounding box coordinates |

Capabilities: Image understanding, object detection, segmentation mask generation, fast processing.

How it works:

  1. The model receives the base image and a detection prompt
  2. The model analyzes the image for the specified trait
  3. The model returns bounding box coordinates
  4. VariantLab converts the coordinates to a pixel mask
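
The final step, turning a bounding box into a pixel mask, can be sketched as follows. This is a minimal illustration and not VariantLab's actual implementation: it assumes each box arrives as `[ymin, xmin, ymax, xmax]` on a normalized 0 to 1000 grid, which this page does not confirm, and the function name `bbox_to_mask` is hypothetical.

```python
from typing import List, Sequence


def bbox_to_mask(
    box: Sequence[float], width: int, height: int, scale: int = 1000
) -> List[List[int]]:
    """Rasterize one bounding box into a height x width mask of 0/1 values.

    Assumption (not confirmed by this page): `box` is
    [ymin, xmin, ymax, xmax] on a normalized 0..scale grid.
    """
    ymin, xmin, ymax, xmax = box

    def to_px(value: float, size: int) -> int:
        # Scale to pixel units, then clamp to the image bounds.
        return max(0, min(size, round(value / scale * size)))

    top, bottom = to_px(ymin, height), to_px(ymax, height)
    left, right = to_px(xmin, width), to_px(xmax, width)

    # Pixels inside the box are 1, everything else is 0.
    return [
        [1 if top <= y < bottom and left <= x < right else 0 for x in range(width)]
        for y in range(height)
    ]


# A box covering the right half of the middle two rows of an 8 x 4 image.
mask = bbox_to_mask([250, 500, 750, 1000], width=8, height=4)
```

A rectangular mask like this is only a starting point, which is consistent with the manual refinement mentioned under Limitations.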

Variation Model Mapping

Some generation-only models cannot edit images directly. VariantLab automatically maps these to a compatible editing model for variations:

| Generation Model | Default Variation Model |
| --- | --- |
| FLUX.2 [pro] | FLUX.2 [pro] Edit |
| Imagen 4 | Gemini 2.5 Flash Image |
| Imagen 4 Ultra | Gemini 2.5 Flash Image |

Models that support editing natively (Gemini, GPT Image, Seedream) use themselves for variations. You can override the variation model in project settings.
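
The mapping rule can be expressed as a small lookup with a fallback. The sketch below uses the model IDs listed on this page; the function name `resolve_variation_model` and the `override` parameter are hypothetical stand-ins for the project-settings override, not a documented VariantLab API.

```python
from typing import Optional

# Generation-only models and their default variation model, per the table above.
DEFAULT_VARIATION_MODEL = {
    "flux-2-pro": "flux-2-pro-edit",
    "imagen-4": "gemini-2.5-flash-image",
    "imagen-4-ultra": "gemini-2.5-flash-image",
}


def resolve_variation_model(
    generation_model: str, override: Optional[str] = None
) -> str:
    """Return the model ID to use for variations.

    An explicit override wins; otherwise generation-only models are mapped
    to their default editing model, and models that edit natively are
    returned unchanged.
    """
    if override:
        return override
    return DEFAULT_VARIATION_MODEL.get(generation_model, generation_model)
```

For example, `resolve_variation_model("imagen-4")` yields `gemini-2.5-flash-image`, while `resolve_variation_model("gpt-image-1")` returns `gpt-image-1` because that model edits natively.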

Model Selection Guide

| Priority | Recommended Model |
| --- | --- |
| Speed | Gemini 2.5 Flash Image |
| Quality | Imagen 4 Ultra, Gemini 3.0 Pro Image |
| Budget | Gemini 2.5 Flash Image, GPT Image 1.5 |
| Resolution | Gemini 3.0 Pro Image (up to 4K) |
| Iteration | Gemini 2.5 Flash Image |
| Final output | Imagen 4 Ultra, Gemini 3.0 Pro Image |
| Commercial use | FLUX.2 [pro] |
| Quality control | GPT Image 1 / 1.5 (quality levels) |
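
If you pick models programmatically, the guide above can be encoded as a lookup from priority to model IDs. This is only an illustrative sketch: the `RECOMMENDED` table and `recommend` helper are hypothetical names, and the model IDs are the ones listed earlier on this page.

```python
from typing import Dict, List

# Priority -> recommended model IDs, transcribed from the selection guide.
RECOMMENDED: Dict[str, List[str]] = {
    "speed": ["gemini-2.5-flash-image"],
    "quality": ["imagen-4-ultra", "gemini-3-pro-image-preview"],
    "budget": ["gemini-2.5-flash-image", "gpt-image-1.5"],
    "resolution": ["gemini-3-pro-image-preview"],
    "iteration": ["gemini-2.5-flash-image"],
    "final output": ["imagen-4-ultra", "gemini-3-pro-image-preview"],
    "commercial use": ["flux-2-pro"],
    "quality control": ["gpt-image-1", "gpt-image-1.5"],
}


def recommend(priority: str) -> List[str]:
    """Return the recommended model IDs for a priority (case-insensitive)."""
    key = priority.strip().lower()
    if key not in RECOMMENDED:
        raise KeyError(f"unknown priority: {priority!r}")
    # Return a copy so callers cannot mutate the shared table.
    return list(RECOMMENDED[key])
```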

Limitations

Resolution Limits

  • Gemini 2.5 Flash Image, GPT Image 1, GPT Image 1.5: 1024px maximum
  • FLUX.2 [pro], FLUX.2 [pro] Edit, Imagen 4, Imagen 4 Ultra, Seedream 4.5: Up to 2048px
  • Gemini 3.0 Pro Image: Up to 4096px

Aspect Ratio Limits

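  • GPT Image models: 3 ratios only (1:1, 2:3, 3:2)
  • Imagen models: 5 ratios (1:1, 3:4, 4:3, 9:16, 16:9)
  • Seedream 4.5: 7 ratios (1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9)
  • Gemini image models, FLUX.2 [pro]: 10 ratios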
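
These limits lend themselves to a pre-flight check before a request is sent. The sketch below is an assumption-laden illustration, not part of VariantLab: `Limits`, `LIMITS`, and `check_request` are hypothetical names, and because this page does not enumerate the 10 ratios supported by the Gemini and FLUX models, the ratio check is skipped for them.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional


class Limits(NamedTuple):
    max_px: int                       # maximum side length in pixels
    ratios: Optional[FrozenSet[str]]  # None: ratios not enumerated on this page


GPT_RATIOS = frozenset({"1:1", "2:3", "3:2"})
IMAGEN_RATIOS = frozenset({"1:1", "3:4", "4:3", "9:16", "16:9"})
SEEDREAM_RATIOS = GPT_RATIOS | IMAGEN_RATIOS  # the 7 ratios listed for Seedream 4.5

LIMITS: Dict[str, Limits] = {
    "gemini-2.5-flash-image": Limits(1024, None),
    "gemini-3-pro-image-preview": Limits(4096, None),
    "flux-2-pro": Limits(2048, None),
    "gpt-image-1": Limits(1024, GPT_RATIOS),
    "gpt-image-1.5": Limits(1024, GPT_RATIOS),
    "imagen-4": Limits(2048, IMAGEN_RATIOS),
    "imagen-4-ultra": Limits(2048, IMAGEN_RATIOS),
    "seedream-4.5": Limits(2048, SEEDREAM_RATIOS),
}


def check_request(model_id: str, longest_side_px: int, aspect_ratio: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    limits = LIMITS[model_id]
    problems: List[str] = []
    if longest_side_px > limits.max_px:
        problems.append(f"{model_id} supports at most {limits.max_px}px")
    # Only check ratios for models whose ratios are spelled out on this page.
    if limits.ratios is not None and aspect_ratio not in limits.ratios:
        problems.append(f"{model_id} does not support aspect ratio {aspect_ratio}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violated limit at once.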

Detection

  • Accuracy varies by image complexity
  • May require manual refinement
  • Works best with clear, distinct traits

Updates

Model capabilities may change as providers update their APIs. Check this page for the latest specifications.