AI Models Reference
Technical reference for the AI models available in VariantLab.
Image Generation Models
Gemini 2.5 Flash Image
Google's fast image generation model.
| Specification | Value |
|---|---|
| Model ID | gemini-2.5-flash-image |
| Provider | Google |
| Max Resolution | 1024 × 1024 |
| Generation Speed | 2-5 seconds |
| Image Sizes | 1K |
| Aspect Ratios | 10 |
Capabilities: Fast iteration, good quality for most use cases, supports reference images.
Best for: Prototyping, testing prompts, high-volume generation, budget-conscious projects.
Gemini 3.0 Pro Image
Google's high-quality image generation model.
| Specification | Value |
|---|---|
| Model ID | gemini-3-pro-image-preview |
| Provider | Google |
| Max Resolution | 4096 × 4096 |
| Generation Speed | 10-30 seconds |
| Image Sizes | 1K, 2K, 4K |
| Aspect Ratios | 10 |
Capabilities: Higher image quality, up to 4K resolution, better detail rendering, supports reference images.
Best for: Final production images, print-quality output, high-resolution assets.
FLUX.2 [pro]
Black Forest Labs' high-fidelity generation model with commercial licensing.
| Specification | Value |
|---|---|
| Model ID | flux-2-pro |
| Provider | Black Forest Labs |
| Max Resolution | 2048 × 2048 |
| Image Sizes | 1K, 2K |
| Aspect Ratios | 10 |
Capabilities: High-quality generation, commercial license, supports reference images.
Best for: Commercial projects, photorealistic styles, high-fidelity output.
FLUX.2 [pro] is generation-only. Variations use the FLUX.2 [pro] Edit model (flux-2-pro-edit).
FLUX.2 [pro] Edit
Black Forest Labs' professional image editing model. It does not appear in the generation model selector; VariantLab uses it automatically for variations when FLUX.2 [pro] is the generation model.
| Specification | Value |
|---|---|
| Model ID | flux-2-pro-edit |
| Provider | Black Forest Labs |
| Max Resolution | 2048 × 2048 |
| Image Sizes | 1K, 2K |
| Edit Only | Yes |
Capabilities: Image editing, commercial license.
GPT Image 1
OpenAI's multimodal image generation model.
| Specification | Value |
|---|---|
| Model ID | gpt-image-1 |
| Provider | OpenAI |
| Max Resolution | 1024 × 1024 |
| Image Sizes | 1K (fixed) |
| Quality Levels | Low, Medium, High |
| Aspect Ratios | 3 (1:1, 2:3, 3:2) |
Capabilities: Image generation and editing, quality level control, supports reference images.
Best for: Diverse art styles, text rendering, quality-cost tradeoff control.
GPT Image 1.5
OpenAI's latest image model with improved cost-effectiveness.
| Specification | Value |
|---|---|
| Model ID | gpt-image-1.5 |
| Provider | OpenAI |
| Max Resolution | 1024 × 1024 |
| Image Sizes | 1K (fixed) |
| Quality Levels | Low, Medium, High, Auto |
| Aspect Ratios | 3 (1:1, 2:3, 3:2) |
Capabilities: Image generation and editing, quality level control (including auto), supports reference images.
Best for: Cost-effective OpenAI generation, automatic quality optimization.
Imagen 4
Google's latest dedicated image generation model.
| Specification | Value |
|---|---|
| Model ID | imagen-4 |
| Provider | Google |
| Max Resolution | 2048 × 2048 |
| Image Sizes | 1K, 2K |
| Aspect Ratios | 5 (1:1, 3:4, 4:3, 9:16, 16:9) |
Capabilities: High-quality generation, strong prompt adherence.
Best for: Photorealistic output, high-quality illustrations.
Imagen 4 is generation-only. Variations default to Gemini 2.5 Flash Image.
Imagen 4 Ultra
The highest-quality Imagen model.
| Specification | Value |
|---|---|
| Model ID | imagen-4-ultra |
| Provider | Google |
| Max Resolution | 2048 × 2048 |
| Image Sizes | 1K, 2K |
| Aspect Ratios | 5 (1:1, 3:4, 4:3, 9:16, 16:9) |
Capabilities: Premium image quality, enhanced detail rendering.
Best for: Maximum quality output, premium production images.
Imagen 4 Ultra is generation-only. Variations default to Gemini 2.5 Flash Image.
Seedream 4.5
ByteDance's unified generation and editing model.
| Specification | Value |
|---|---|
| Model ID | seedream-4.5 |
| Provider | ByteDance |
| Max Resolution | 2048 × 2048 |
| Image Sizes | 1K, 2K |
| Aspect Ratios | 7 (1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9) |
| Status | Coming soon |
Capabilities: Image generation and editing in one model.
Detection Model
Gemini 2.5 Flash
Used for trait detection and mask generation.
| Specification | Value |
|---|---|
| Model ID | gemini-2.5-flash |
| Input | Image + text prompt |
| Output | Bounding box coordinates |
Capabilities: Image understanding, object detection, segmentation mask generation, fast processing.
How it works:
- The model receives the base image and a detection prompt
- It analyzes the image for the specified trait
- It returns bounding box coordinates for that trait
- VariantLab converts the bounding box to a pixel mask
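The last step above, turning a bounding box into a pixel mask, can be sketched as follows. This is a minimal illustration, not VariantLab's actual implementation. The function name `box_to_mask`, the `[y_min, x_min, y_max, x_max]` coordinate order, and the 0 to 1000 normalization are all assumptions, since this page does not specify the coordinate format.

```python
from typing import List, Sequence


def box_to_mask(
    box: Sequence[float],
    width: int,
    height: int,
    scale: float = 1000.0,
) -> List[List[int]]:
    """Return a height x width mask: 1 inside the box, 0 elsewhere.

    Assumption: `box` is [y_min, x_min, y_max, x_max], normalized to 0..scale.
    """
    y_min, x_min, y_max, x_max = box

    def to_px(value: float, extent: int) -> int:
        # Scale a normalized coordinate to pixels and clamp it to the image.
        return max(0, min(extent, round(value / scale * extent)))

    left, right = to_px(x_min, width), to_px(x_max, width)
    top, bottom = to_px(y_min, height), to_px(y_max, height)

    mask = [[0] * width for _ in range(height)]
    for y in range(top, bottom):
        for x in range(left, right):
            mask[y][x] = 1
    return mask


# A box covering the right half of the middle rows of an 8 x 4 image.
mask = box_to_mask([250, 500, 750, 1000], width=8, height=4)
```

A rectangular mask like this is only a starting point, which is consistent with the note under Limitations that detection may require manual refinement.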
Variation Model Mapping
Generation-only models cannot edit images directly. VariantLab automatically maps each of them to a compatible editing model for variations:
| Generation Model | Default Variation Model |
|---|---|
| FLUX.2 [pro] | FLUX.2 [pro] Edit |
| Imagen 4 | Gemini 2.5 Flash Image |
| Imagen 4 Ultra | Gemini 2.5 Flash Image |
Models that support editing natively (Gemini, GPT Image, Seedream) use themselves for variations. You can override the variation model in project settings.
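The mapping above amounts to a lookup with a fallback. The sketch below uses the model IDs listed on this page. The function name `resolve_variation_model` and the `override` parameter are hypothetical and only illustrate the behavior described here, not VariantLab's internal code.

```python
from typing import Optional

# Default variation model for each generation-only model (IDs from this page).
DEFAULT_VARIATION_MODEL = {
    "flux-2-pro": "flux-2-pro-edit",
    "imagen-4": "gemini-2.5-flash-image",
    "imagen-4-ultra": "gemini-2.5-flash-image",
}


def resolve_variation_model(
    generation_model: str,
    override: Optional[str] = None,
) -> str:
    """Pick the model used for variations.

    Order of precedence: an explicit override (standing in for the project
    setting), then the default mapping, then the generation model itself
    for models that edit natively.
    """
    if override:
        return override
    return DEFAULT_VARIATION_MODEL.get(generation_model, generation_model)
```

For example, `resolve_variation_model("imagen-4")` returns `"gemini-2.5-flash-image"`, while `resolve_variation_model("gpt-image-1")` returns `"gpt-image-1"` because that model edits natively.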
Model Selection Guide
| Priority | Recommended Model |
|---|---|
| Speed | Gemini 2.5 Flash Image |
| Quality | Imagen 4 Ultra, Gemini 3.0 Pro Image |
| Budget | Gemini 2.5 Flash Image, GPT Image 1.5 |
| Resolution | Gemini 3.0 Pro Image (up to 4K) |
| Iteration | Gemini 2.5 Flash Image |
| Final output | Imagen 4 Ultra, Gemini 3.0 Pro Image |
| Commercial use | FLUX.2 [pro] |
| Quality control | GPT Image 1 / 1.5 (quality levels) |
Limitations
Resolution Limits
- Gemini 2.5 Flash Image, GPT Image 1, GPT Image 1.5: 1024px maximum
- FLUX.2 [pro], FLUX.2 [pro] Edit, Imagen 4, Imagen 4 Ultra, Seedream 4.5: Up to 2048px
- Gemini 3.0 Pro Image: Up to 4096px
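These limits can be checked before a request is submitted. The sketch below takes the maximum edge lengths from the specification tables on this page. The helper name `supports_size` is hypothetical, and reading the size labels as 1K = 1024px, 2K = 2048px, and 4K = 4096px is an assumption inferred from how the Image Sizes and Max Resolution rows line up.

```python
# Maximum edge length in pixels per model ID (from the tables on this page).
MAX_EDGE_PX = {
    "gemini-2.5-flash-image": 1024,
    "gpt-image-1": 1024,
    "gpt-image-1.5": 1024,
    "flux-2-pro": 2048,
    "flux-2-pro-edit": 2048,
    "imagen-4": 2048,
    "imagen-4-ultra": 2048,
    "seedream-4.5": 2048,
    "gemini-3-pro-image-preview": 4096,
}

# Assumed meaning of the size labels used in the Image Sizes rows.
SIZE_LABEL_PX = {"1K": 1024, "2K": 2048, "4K": 4096}


def supports_size(model_id: str, size_label: str) -> bool:
    """Return True if the model's maximum resolution covers the size label.

    Raises KeyError for a model ID or size label not listed above.
    """
    return SIZE_LABEL_PX[size_label] <= MAX_EDGE_PX[model_id]
```

Under these assumptions, `supports_size("gemini-3-pro-image-preview", "4K")` is true and `supports_size("imagen-4", "4K")` is false.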
Aspect Ratio Limits
- GPT Image models: 3 ratios only (1:1, 2:3, 3:2)
- Imagen models: 5 ratios
- Seedream 4.5: 7 ratios
- Gemini image models, FLUX.2 [pro]: 10 ratios
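A similar check works for aspect ratios. The sketch below covers only the models whose ratios are enumerated in the tables on this page. The ten ratios for the Gemini image models and FLUX.2 [pro] are not listed here, so the hypothetical helper `is_ratio_supported` returns `None` for them rather than guessing.

```python
from typing import Optional

_GPT_RATIOS = frozenset({"1:1", "2:3", "3:2"})
_IMAGEN_RATIOS = frozenset({"1:1", "3:4", "4:3", "9:16", "16:9"})
_SEEDREAM_RATIOS = frozenset({"1:1", "2:3", "3:2", "3:4", "4:3", "9:16", "16:9"})

# Only models whose ratios are spelled out on this page appear here.
SUPPORTED_RATIOS = {
    "gpt-image-1": _GPT_RATIOS,
    "gpt-image-1.5": _GPT_RATIOS,
    "imagen-4": _IMAGEN_RATIOS,
    "imagen-4-ultra": _IMAGEN_RATIOS,
    "seedream-4.5": _SEEDREAM_RATIOS,
}


def is_ratio_supported(model_id: str, ratio: str) -> Optional[bool]:
    """Return True/False for enumerated models, None when the list is unknown."""
    ratios = SUPPORTED_RATIOS.get(model_id)
    if ratios is None:
        return None
    return ratio in ratios
```

Returning `None` keeps "not supported" distinct from "not documented here", so a caller can fall back to the provider's own documentation for the ten-ratio models.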
Detection
- Accuracy varies by image complexity
- May require manual refinement
- Works best with clear, distinct traits
Updates
Model capabilities may change as providers update their APIs. Check this page for the latest specifications.