[Feature]: Support for Image Generation #950

## Description

Add native image generation capabilities to Bifrost, enabling unified access to image generation models (DALL-E, Stable Diffusion, Midjourney API, etc.) through the existing gateway infrastructure with full streaming support, semantic caching, and UI rendering.

As AI applications increasingly combine text and image generation, Bifrost should provide a unified interface for image generation models with the same benefits it offers for LLMs:
- **Provider abstraction** - Switch between DALL-E, Stable Diffusion, etc. seamlessly
- **Fallback support** - Automatic failover between image providers
- **Observability** - Logging, metrics, and cost tracking for image generation
- **Caching** - Semantic cache for repeated image prompts
- **Streaming** - Progressive image loading for better UX

## Scope

### In Scope
- [ ] New `/v1/images/generations` endpoint (OpenAI-compatible)
- [ ] Image generation via Chat Completion API (tool use pattern)
- [ ] Image generation via Responses API (native support)
- [ ] Streaming image delivery (base64 chunks)
- [ ] Semantic caching for image generation
- [ ] UI components for image rendering
- [ ] Provider implementations: OpenAI DALL-E, Azure DALL-E

### Out of Scope (Future Work)
- Image editing (`/v1/images/edits`)
- Image variations (`/v1/images/variations`)
- Video generation
- Additional providers (Stability AI, Midjourney)

---

## Technical Design

Note: The snippets below are simplified examples meant to illustrate the design; they are not final implementations.

### 1. Schema Definitions

#### New File: `core/schemas/images.go`

```go
package schemas

// Request Types
const (
    ImageGenerationRequest       RequestType = "image_generation"
    ImageGenerationStreamRequest RequestType = "image_generation_stream"
)

// BifrostImageGenerationRequest represents an image generation request
type BifrostImageGenerationRequest struct {
    Provider       ModelProvider              `json:"provider"`
    Model          string                     `json:"model"`
    Input          *ImageGenerationInput      `json:"input"`
    Params         *ImageGenerationParameters `json:"params,omitempty"`
    Fallbacks      []Fallback                 `json:"fallbacks,omitempty"`
    RawRequestBody []byte                     `json:"-"`
}

type ImageGenerationInput struct {
    Prompt string `json:"prompt"`
}

type ImageGenerationParameters struct {
    N              *int    `json:"n,omitempty"`               // Number of images (1-10)
    Size           *string `json:"size,omitempty"`            // "256x256", "512x512", "1024x1024", "1792x1024", "1024x1792"
    Quality        *string `json:"quality,omitempty"`         // "standard", "hd"
    Style          *string `json:"style,omitempty"`           // "natural", "vivid"
    ResponseFormat *string `json:"response_format,omitempty"` // "url", "b64_json"
    User           *string `json:"user,omitempty"`
    ExtraParams    map[string]interface{} `json:"extra_params,omitempty"`
}

// BifrostImageGenerationResponse represents the response
type BifrostImageGenerationResponse struct {
    ID          string                     `json:"id"`
    Created     int64                      `json:"created"`
    Model       string                     `json:"model"`
    Data        []ImageData                `json:"data"`
    Usage       *ImageUsage                `json:"usage,omitempty"`
    ExtraFields BifrostResponseExtraFields `json:"extra_fields,omitempty"`
}

type ImageData struct {
    URL           string `json:"url,omitempty"`
    B64JSON       string `json:"b64_json,omitempty"`
    RevisedPrompt string `json:"revised_prompt,omitempty"`
    Index         int    `json:"index"`
}

type ImageUsage struct {
    PromptTokens int `json:"prompt_tokens"`
    TotalTokens  int `json:"total_tokens"`
}

// Streaming Response
type BifrostImageStreamResponse struct {
    ID            string  `json:"id"`
    Type          string  `json:"type"`                        // "image.chunk", "image.complete", "error"
    Index         int     `json:"index"`                       // Which image (0-N)
    ChunkIndex    int     `json:"chunk_index"`                 // Chunk order within image
    PartialB64    string  `json:"partial_b64,omitempty"`       // Base64 chunk
    RevisedPrompt string  `json:"revised_prompt,omitempty"`    // Sent with the "image.complete" event
    Usage         *ImageUsage `json:"usage,omitempty"`         // Sent with the final event
    Error         *BifrostError `json:"error,omitempty"`
}
```
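
The field comments above imply range constraints that something has to enforce before a request reaches a provider. A minimal sketch of such a check, assuming the limits stated in those comments (`n` between 1 and 10, and the listed sizes); `validateParams` and the trimmed local copy of `ImageGenerationParameters` are hypothetical and exist only so the example compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// ImageGenerationParameters is a trimmed local copy of the schema above.
type ImageGenerationParameters struct {
	N    *int
	Size *string
}

// allowedSizes mirrors the sizes listed in the schema comment.
var allowedSizes = map[string]bool{
	"256x256": true, "512x512": true, "1024x1024": true,
	"1792x1024": true, "1024x1792": true,
}

// validateParams (hypothetical helper) rejects values outside the documented
// ranges. Nil params and nil fields mean "use the provider default".
func validateParams(p *ImageGenerationParameters) error {
	if p == nil {
		return nil
	}
	if p.N != nil && (*p.N < 1 || *p.N > 10) {
		return fmt.Errorf("n must be between 1 and 10, got %d", *p.N)
	}
	if p.Size != nil && !allowedSizes[*p.Size] {
		return errors.New("unsupported size: " + *p.Size)
	}
	return nil
}

func main() {
	n, size := 11, "1024x1024"
	fmt.Println(validateParams(&ImageGenerationParameters{N: &n, Size: &size}))
}
```

Individual providers may accept a narrower set of sizes, so a real implementation would probably validate per provider rather than globally.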

### 2. Provider Interface Extension

#### Update: `core/schemas/provider.go`

```go
type Provider interface {
    // ... existing methods ...

    // Image Generation
    ImageGeneration(ctx context.Context, key Key, request *BifrostImageGenerationRequest) (
        *BifrostImageGenerationResponse, *BifrostError)
    ImageGenerationStream(ctx context.Context, postHookRunner PostHookRunner, key Key,
        request *BifrostImageGenerationRequest) (chan *BifrostStream, *BifrostError)
}
```

### 3. Core Bifrost Methods

#### Update: `core/bifrost.go`

```go
// Add public methods
func (b *Bifrost) ImageGenerationRequest(ctx context.Context,
    req *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError)

func (b *Bifrost) ImageGenerationStreamRequest(ctx context.Context,
    req *schemas.BifrostImageGenerationRequest) (chan *schemas.BifrostStream, *schemas.BifrostError)

// Update handleRequest switch statement
case schemas.ImageGenerationRequest:
    resp, err := provider.ImageGeneration(req.Context, key, req.BifrostRequest.ImageGenerationRequest)
    if err != nil {
        return nil, err
    }
    response.ImageGenerationResponse = resp
```
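
The request struct carries a `Fallbacks` list, and the description promises automatic failover between image providers. One way the core could walk that list is sketched below. Everything here is an assumption made for illustration: `generateFunc`, `generateWithFallback`, and the string-keyed provider map stand in for the real provider plumbing.

```go
package main

import (
	"errors"
	"fmt"
)

// generateFunc stands in for a provider's ImageGeneration call (hypothetical signature).
type generateFunc func(prompt string) (string, error)

// errRateLimited is a sample provider failure used in the demo below.
var errRateLimited = errors.New("rate limited")

// generateWithFallback tries each provider in order and returns the name and
// result of the first one that succeeds. If all fail, the errors are joined.
func generateWithFallback(prompt string, order []string, providers map[string]generateFunc) (string, string, error) {
	if len(order) == 0 {
		return "", "", errors.New("no providers configured")
	}
	var errs []error
	for _, name := range order {
		gen, ok := providers[name]
		if !ok {
			errs = append(errs, fmt.Errorf("%s: provider not configured", name))
			continue
		}
		out, err := gen(prompt)
		if err == nil {
			return name, out, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", name, err))
	}
	return "", "", errors.Join(errs...)
}

func main() {
	providers := map[string]generateFunc{
		"openai": func(string) (string, error) { return "", errRateLimited },
		"azure":  func(p string) (string, error) { return "image for " + p, nil },
	}
	name, out, err := generateWithFallback("a red fox", []string{"openai", "azure"}, providers)
	fmt.Println(name, out, err) // prints "azure image for a red fox <nil>"
}
```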

### 4. HTTP Transport Layer

#### Update: `transports/bifrost-http/handlers/inference.go`

```go
// Add route
r.POST("/v1/images/generations", h.imageGeneration)

// Handler implementation
func (h *CompletionHandler) imageGeneration(ctx *fasthttp.RequestCtx) {
    var req ImageGenerationHTTPRequest
    if err := sonic.Unmarshal(ctx.PostBody(), &req); err != nil {
        // error handling: write an error response, then stop processing
        return
    }

    bifrostReq := toBifrostImageRequest(&req)

    if req.Stream != nil && *req.Stream {
        h.handleStreamingImageGeneration(ctx, bifrostReq)
        return
    }

    resp, err := h.client.ImageGenerationRequest(ctx, bifrostReq)
    // response handling
}
```
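
The handler above relies on `ImageGenerationHTTPRequest` and `toBifrostImageRequest`, neither of which is defined in this proposal. The sketch below shows one possible shape, under stated assumptions: the HTTP body is flat and OpenAI-style, an optional `provider/model` prefix selects the provider, and `imageRequest` is a trimmed stand-in for `BifrostImageGenerationRequest`. Unlike the call site above, this version also returns an error so an empty prompt can be rejected early. It uses `encoding/json` rather than `sonic` only to stay dependency-free.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ImageGenerationHTTPRequest is a hypothetical flat, OpenAI-style request body.
type ImageGenerationHTTPRequest struct {
	Model  string  `json:"model"`
	Prompt string  `json:"prompt"`
	N      *int    `json:"n,omitempty"`
	Size   *string `json:"size,omitempty"`
	Stream *bool   `json:"stream,omitempty"`
}

// imageRequest is a trimmed stand-in for BifrostImageGenerationRequest.
type imageRequest struct {
	Provider string
	Model    string
	Prompt   string
	N        *int
	Size     *string
}

// toBifrostImageRequest maps the HTTP body onto the internal request shape.
func toBifrostImageRequest(req *ImageGenerationHTTPRequest) (*imageRequest, error) {
	if strings.TrimSpace(req.Prompt) == "" {
		return nil, fmt.Errorf("prompt is required")
	}
	out := &imageRequest{Model: req.Model, Prompt: req.Prompt, N: req.N, Size: req.Size}
	// Assumption: an optional "provider/model" prefix selects the provider.
	// Without a prefix, Provider stays empty and is resolved elsewhere.
	if provider, model, found := strings.Cut(req.Model, "/"); found {
		out.Provider, out.Model = provider, model
	}
	return out, nil
}

func main() {
	body := []byte(`{"model":"openai/dall-e-3","prompt":"A serene Japanese garden","n":1}`)
	var req ImageGenerationHTTPRequest
	if err := json.Unmarshal(body, &req); err != nil {
		fmt.Println("bad request:", err)
		return
	}
	out, err := toBifrostImageRequest(&req)
	if err != nil {
		fmt.Println("invalid request:", err)
		return
	}
	fmt.Println(out.Provider, out.Model, *out.N) // prints "openai dall-e-3 1"
}
```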

### 5. Streaming Implementation

#### New File: `framework/streaming/images.go`

```go
package streaming

import (
    "sort"
    "strings"
    "sync"
    "time"

    "github.com/maximhq/bifrost/core/schemas"
)

type ImageStreamChunk struct {
    Timestamp    time.Time
    Delta        *schemas.BifrostImageStreamResponse
    FinishReason *string
    ChunkIndex   int
    ImageIndex   int
    ErrorDetails *schemas.BifrostError
}

// Pool for memory efficiency
var imageStreamChunkPool = sync.Pool{
    New: func() interface{} {
        return &ImageStreamChunk{}
    },
}

func (a *Accumulator) addImageStreamChunk(requestID string, chunk *ImageStreamChunk, isFinal bool) error {
    acc := a.getOrCreateStreamAccumulator(requestID)
    acc.mu.Lock()
    defer acc.mu.Unlock()

    acc.ImageStreamChunks = append(acc.ImageStreamChunks, chunk)

    if isFinal {
        return a.processImageStreamingResponse(requestID, acc)
    }
    return nil
}

func (a *Accumulator) processImageStreamingResponse(requestID string, acc *StreamAccumulator) error {
    // Sort chunks by ImageIndex, then ChunkIndex
    sort.Slice(acc.ImageStreamChunks, func(i, j int) bool {
        if acc.ImageStreamChunks[i].ImageIndex != acc.ImageStreamChunks[j].ImageIndex {
            return acc.ImageStreamChunks[i].ImageIndex < acc.ImageStreamChunks[j].ImageIndex
        }
        return acc.ImageStreamChunks[i].ChunkIndex < acc.ImageStreamChunks[j].ChunkIndex
    })

    // Reconstruct complete images from chunks
    images := make(map[int]*strings.Builder)
    for _, chunk := range acc.ImageStreamChunks {
        if _, ok := images[chunk.ImageIndex]; !ok {
            images[chunk.ImageIndex] = &strings.Builder{}
        }
        images[chunk.ImageIndex].WriteString(chunk.Delta.PartialB64)
    }

    // Build final response
    // ...
    return nil
}
```
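
The sort-then-concatenate step is easiest to check in isolation. The following is a self-contained sketch of the same idea, with a local `chunk` type and a hypothetical `reassemble` helper standing in for the accumulator types above. It feeds chunks in out of order and shows that each image's base64 string is rebuilt in the right sequence.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// chunk is a local stand-in for ImageStreamChunk with only the fields needed here.
type chunk struct {
	ImageIndex int
	ChunkIndex int
	PartialB64 string
}

// reassemble orders chunks by (ImageIndex, ChunkIndex) and concatenates the
// base64 fragments per image. It sorts a copy so the caller's slice is untouched.
func reassemble(chunks []chunk) map[int]string {
	sorted := append([]chunk(nil), chunks...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].ImageIndex != sorted[j].ImageIndex {
			return sorted[i].ImageIndex < sorted[j].ImageIndex
		}
		return sorted[i].ChunkIndex < sorted[j].ChunkIndex
	})

	builders := make(map[int]*strings.Builder)
	for _, c := range sorted {
		b, ok := builders[c.ImageIndex]
		if !ok {
			b = &strings.Builder{}
			builders[c.ImageIndex] = b
		}
		b.WriteString(c.PartialB64)
	}

	images := make(map[int]string, len(builders))
	for idx, b := range builders {
		images[idx] = b.String()
	}
	return images
}

func main() {
	out := reassemble([]chunk{
		{ImageIndex: 0, ChunkIndex: 1, PartialB64: "KGgo"},
		{ImageIndex: 1, ChunkIndex: 0, PartialB64: "AAAA"},
		{ImageIndex: 0, ChunkIndex: 0, PartialB64: "iVBORw0"},
	})
	fmt.Println(out[0], out[1]) // prints "iVBORw0KGgo AAAA"
}
```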

#### Update: `framework/streaming/types.go`

```go
type StreamAccumulator struct {
    // ... existing fields ...
    ImageStreamChunks []*ImageStreamChunk
}

const StreamTypeImage = "image.generation"
```

### 6. Semantic Cache Integration

#### Update: `plugins/semanticcache/main.go`

```go
// Add image generation to cacheable request types
func (p *SemanticCachePlugin) PreHook(ctx context.Context, req *schemas.BifrostRequest) error {
    switch req.RequestType {
    case schemas.ChatCompletionRequest, schemas.ImageGenerationRequest:
        return p.checkCache(ctx, req)
    }
    return nil
}

// Image-specific cache key generation
func (p *SemanticCachePlugin) getImageCacheKey(req *schemas.BifrostImageGenerationRequest) string {
    // Hash: prompt + size + quality + style + n
    h := xxhash.New()
    h.WriteString(req.Input.Prompt)
    if req.Params != nil {
        if req.Params.Size != nil {
            h.WriteString(*req.Params.Size)
        }
        if req.Params.Quality != nil {
            h.WriteString(*req.Params.Quality)
        }
        // ... other params
    }
    return fmt.Sprintf("img_%x", h.Sum64())
}
```
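
One detail worth settling for the cache key: writing fields back to back without a separator lets different parameter combinations produce the same byte stream (for example size `"ab"` plus quality `"c"` versus size `"a"` plus quality `"bc"`). The sketch below writes a NUL byte after every field and covers all five inputs named in the comment above. It swaps in FNV-1a from the standard library in place of xxhash purely to stay dependency-free; `imageCacheKey`, `imageParams`, and `deref` are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
)

// imageParams is a trimmed local copy of ImageGenerationParameters.
type imageParams struct {
	N       *int
	Size    *string
	Quality *string
	Style   *string
}

// deref returns the pointed-to string, or "" for nil.
func deref(s *string) string {
	if s == nil {
		return ""
	}
	return *s
}

// imageCacheKey hashes prompt + size + quality + style + n. Every field is
// followed by a NUL separator, and unset fields still write the separator,
// so field positions stay fixed. A nil field and an empty string hash the same.
func imageCacheKey(prompt string, p *imageParams) string {
	if p == nil {
		p = &imageParams{}
	}
	h := fnv.New64a()
	writeField := func(s string) {
		h.Write([]byte(s))
		h.Write([]byte{0})
	}
	writeField(prompt)
	writeField(deref(p.Size))
	writeField(deref(p.Quality))
	writeField(deref(p.Style))
	if p.N != nil {
		writeField(strconv.Itoa(*p.N))
	} else {
		writeField("")
	}
	return fmt.Sprintf("img_%x", h.Sum64())
}

func main() {
	size, quality := "1024x1024", "hd"
	fmt.Println(imageCacheKey("A serene Japanese garden", &imageParams{Size: &size, Quality: &quality}))
}
```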

#### Cache Storage Schema

```go
// Vector store properties for image cache
Properties: []Property{
    {Name: "request_hash", DataType: "string"},
    {Name: "prompt_embedding", DataType: "vector"},  // For semantic similarity
    {Name: "image_urls", DataType: "string[]"},      // Cached URLs
    {Name: "image_b64", DataType: "string[]"},       // Cached base64 (optional)
    {Name: "revised_prompts", DataType: "string[]"},
    {Name: "expires_at", DataType: "int"},
    {Name: "provider", DataType: "string"},
    {Name: "model", DataType: "string"},
    {Name: "params_hash", DataType: "string"},
}
```
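
The `expires_at` property pairs with the 5-minute TTL named in the rollout plan. A minimal sketch of how an entry could be stamped and checked, assuming `expires_at` holds Unix seconds (the schema only says `int`); `cacheEntry`, `newCacheEntry`, and `isFresh` are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// cacheTTL matches the 5-minute TTL named in the rollout plan.
const cacheTTL = 5 * time.Minute

// demoStart is a fixed timestamp so the example is deterministic.
var demoStart = time.Unix(1699999999, 0)

// cacheEntry is a trimmed, hypothetical view of a stored record.
type cacheEntry struct {
	RequestHash string
	ImageB64    []string
	ExpiresAt   int64 // assumption: Unix seconds
}

// newCacheEntry stamps the entry with an expiry of now + cacheTTL.
func newCacheEntry(hash string, images []string, now time.Time) cacheEntry {
	return cacheEntry{RequestHash: hash, ImageB64: images, ExpiresAt: now.Add(cacheTTL).Unix()}
}

// isFresh reports whether the entry may still be served at time now.
func (e cacheEntry) isFresh(now time.Time) bool {
	return now.Unix() < e.ExpiresAt
}

func main() {
	e := newCacheEntry("img_abc", []string{"iVBORw0KGgo..."}, demoStart)
	fmt.Println(e.isFresh(demoStart.Add(4*time.Minute)), e.isFresh(demoStart.Add(5*time.Minute)))
	// prints "true false"
}
```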

### 7. Provider Implementation (OpenAI)

#### New File: `core/providers/openai/images.go`

```go
package openai

type OpenAIImageRequest struct {
    Model          string  `json:"model"`
    Prompt         string  `json:"prompt"`
    N              *int    `json:"n,omitempty"`
    Size           *string `json:"size,omitempty"`
    Quality        *string `json:"quality,omitempty"`
    Style          *string `json:"style,omitempty"`
    ResponseFormat *string `json:"response_format,omitempty"`
    User           *string `json:"user,omitempty"`
}

type OpenAIImageResponse struct {
    Created int64 `json:"created"`
    Data    []struct {
        URL           string `json:"url,omitempty"`
        B64JSON       string `json:"b64_json,omitempty"`
        RevisedPrompt string `json:"revised_prompt,omitempty"`
    } `json:"data"`
}

func (p *OpenAIProvider) ImageGeneration(ctx context.Context, key schemas.Key,
    req *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {

    openaiReq := toOpenAIImageRequest(req)

    resp, err := p.doRequest(ctx, key, "POST", "/v1/images/generations", openaiReq)
    if err != nil {
        return nil, toBifrostError(err)
    }

    return toBifrostImageResponse(resp), nil
}

func (p *OpenAIProvider) ImageGenerationStream(ctx context.Context, postHookRunner schemas.PostHookRunner,
    key schemas.Key, req *schemas.BifrostImageGenerationRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {

    // The DALL-E endpoint returns each image in one response, so streaming is
    // emulated by delivering the base64 payload in chunks. Chunks are only
    // emitted when the provider response carries b64_json data.
    streamChan := make(chan *schemas.BifrostStream, 10)

    go func() {
        defer close(streamChan)

        // Generate image (non-streaming from provider)
        resp, err := p.ImageGeneration(ctx, key, req)
        if err != nil {
            streamChan <- &schemas.BifrostStream{BifrostError: err}
            return
        }

        // Stream base64 data in chunks
        for i, img := range resp.Data {
            if img.B64JSON != "" {
                chunks := chunkBase64(img.B64JSON, 64*1024) // 64KB chunks
                for j, chunk := range chunks {
                    streamChan <- &schemas.BifrostStream{
                        BifrostImageStreamResponse: &schemas.BifrostImageStreamResponse{
                            ID:         resp.ID,
                            Type:       "image.chunk",
                            Index:      i,
                            ChunkIndex: j,
                            PartialB64: chunk,
                        },
                    }
                }
            }
            // Send completion marker
            streamChan <- &schemas.BifrostStream{
                BifrostImageStreamResponse: &schemas.BifrostImageStreamResponse{
                    ID:            resp.ID,
                    Type:          "image.complete",
                    Index:         i,
                    RevisedPrompt: img.RevisedPrompt,
                },
            }
        }
    }()

    return streamChan, nil
}
```
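
The streaming code calls `chunkBase64`, which this proposal does not define. A minimal sketch of what it could look like, assuming it takes the full base64 string and a maximum chunk size in bytes:

```go
package main

import "fmt"

// chunkBase64 splits s into pieces of at most size bytes. Base64 text is
// ASCII, so slicing by byte offset never splits a character.
func chunkBase64(s string, size int) []string {
	if size <= 0 || len(s) == 0 {
		return nil
	}
	chunks := make([]string, 0, (len(s)+size-1)/size)
	for start := 0; start < len(s); start += size {
		end := start + size
		if end > len(s) {
			end = len(s)
		}
		chunks = append(chunks, s[start:end])
	}
	return chunks
}

func main() {
	fmt.Println(chunkBase64("iVBORw0KGgo=", 5)) // prints "[iVBOR w0KGg o=]"
}
```

Because 64*1024 is a multiple of 4, every chunk produced at that size starts on a base64 group boundary, so a client can decode each chunk independently if it wants to process bytes as they arrive.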

### 8. UI Components

#### New File: `ui/components/chat/ImageMessage.tsx`

```tsx
import React from 'react';
import { Card } from '@/components/ui/card';
import { Skeleton } from '@/components/ui/skeleton';

interface ImageMessageProps {
  images: Array<{
    url?: string;
    b64_json?: string;
    revised_prompt?: string;
    index: number;
  }>;
  isStreaming?: boolean;
  streamProgress?: number; // 0-100
}

export const ImageMessage: React.FC<ImageMessageProps> = ({
  images,
  isStreaming,
  streamProgress
}) => {
  return (
    <div className="grid grid-cols-2 gap-4 my-4">
      {images.map((img, idx) => (
        <Card key={idx} className="overflow-hidden">
          {isStreaming && !img.url && !img.b64_json ? (
            <div className="relative">
              <Skeleton className="w-full aspect-square" />
              <div className="absolute bottom-2 left-2 text-sm text-muted-foreground">
                Loading... {streamProgress ?? 0}%
              </div>
            </div>
          ) : (
            <>
              <img
                src={img.url || `data:image/png;base64,${img.b64_json}`}
                alt={img.revised_prompt || `Generated image ${idx + 1}`}
                className="w-full h-auto"
                loading="lazy"
              />
              {img.revised_prompt && (
                <div className="p-2 text-xs text-muted-foreground border-t">
                  {img.revised_prompt}
                </div>
              )}
            </>
          )}
        </Card>
      ))}
    </div>
  );
};
```

---

## API Examples

### REST API

**Request:**
```bash
curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "dall-e-3",
    "prompt": "A serene Japanese garden with cherry blossoms",
    "n": 1,
    "size": "1024x1024",
    "quality": "hd",
    "response_format": "b64_json"
  }'
```

**Response:**
```json
{
  "id": "img-abc123",
  "created": 1699999999,
  "model": "dall-e-3",
  "data": [
    {
      "b64_json": "iVBORw0KGgo...",
      "revised_prompt": "A tranquil Japanese garden featuring blooming cherry blossom trees...",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "total_tokens": 15
  },
  "extra_fields": {
    "provider": "openai",
    "latency_ms": 8500,
    "cache_debug": null
  }
}
```

### Streaming Response (SSE)

```
data: {"id":"img-abc123","type":"image.chunk","index":0,"chunk_index":0,"partial_b64":"iVBORw0KGgo..."}

data: {"id":"img-abc123","type":"image.chunk","index":0,"chunk_index":1,"partial_b64":"AAAANSUhEU..."}

data: {"id":"img-abc123","type":"image.complete","index":0,"revised_prompt":"A tranquil Japanese garden...","usage":{"prompt_tokens":15,"total_tokens":15}}

data: [DONE]
```
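
On the consuming side, a client has to read the `data:` lines, stop at `[DONE]`, and stitch the `partial_b64` fragments together. The sketch below shows one way to do that in Go, assuming the event shape in the example above and assuming chunks for a given image arrive in order. `readImageStream`, `streamEvent`, and the demo stream are hypothetical. Note the enlarged scanner buffer: a 64KB base64 chunk plus its JSON framing exceeds `bufio.Scanner`'s default 64KB token limit.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// streamEvent mirrors the fields of the SSE payload that the client needs.
type streamEvent struct {
	Type       string `json:"type"`
	Index      int    `json:"index"`
	ChunkIndex int    `json:"chunk_index"`
	PartialB64 string `json:"partial_b64"`
}

const demoStream = `data: {"type":"image.chunk","index":0,"chunk_index":0,"partial_b64":"iVBOR"}

data: {"type":"image.chunk","index":0,"chunk_index":1,"partial_b64":"w0KGgo"}

data: {"type":"image.complete","index":0}

data: [DONE]
`

// demoReader returns the sample stream as an io.Reader.
func demoReader() io.Reader { return strings.NewReader(demoStream) }

// readImageStream consumes SSE "data:" lines until [DONE] and returns the
// concatenated base64 per image index.
func readImageStream(r io.Reader) (map[int]string, error) {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 128*1024), 1024*1024) // allow lines up to 1 MiB
	images := make(map[int]string)
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "data:") {
			continue // blank separators and non-data fields
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" {
			break
		}
		var ev streamEvent
		if err := json.Unmarshal([]byte(payload), &ev); err != nil {
			return nil, fmt.Errorf("decode event: %w", err)
		}
		if ev.Type == "image.chunk" {
			// Simple concatenation; a strings.Builder per image scales better.
			images[ev.Index] += ev.PartialB64
		}
	}
	return images, sc.Err()
}

func main() {
	images, err := readImageStream(demoReader())
	fmt.Println(images[0], err) // prints "iVBORw0KGgo <nil>"
}
```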

---

## Files to Modify/Create

### New Files
| File | Purpose |
|------|---------|
| `core/schemas/images.go` | Image generation type definitions |
| `core/providers/openai/images.go` | OpenAI DALL-E implementation |
| `core/providers/azure/images.go` | Azure DALL-E implementation |
| `core/internal/testutil/image_generation.go` | Image generation test scenario |
| `core/internal/testutil/image_generation_stream.go` | Image generation stream test scenario |
| `framework/streaming/images.go` | Image stream accumulation |
| `ui/components/chat/ImageMessage.tsx` | React image rendering component |
| `ui/hooks/useImageStream.ts` | Streaming hook for progressive loading |

### Modified Files
| File | Changes |
|------|---------|
| `core/schemas/bifrost.go` | Add ImageGenerationRequest/Response to BifrostRequest/Response unions |
| `core/schemas/provider.go` | Add ImageGeneration methods to Provider interface |
| `core/bifrost.go` | Add ImageGenerationRequest/StreamRequest methods, update handleRequest |
| `core/providers/openai/openai.go` | Implement Provider interface methods |
| `transports/bifrost-http/handlers/inference.go` | Add `/v1/images/generations` route |
| `framework/streaming/types.go` | Add ImageStreamChunk, StreamTypeImage |
| `framework/streaming/accumulator.go` | Add image chunk pool, processing methods |
| `plugins/semanticcache/main.go` | Add image generation caching support |
| `plugins/semanticcache/search.go` | Add image-specific cache search |

---

## Testing Plan

### Unit Tests
- [ ] Schema serialization/deserialization
- [ ] Request transformation (Bifrost → OpenAI format)
- [ ] Response transformation (OpenAI → Bifrost format)
- [ ] Stream chunk accumulation
- [ ] Cache key generation

### Integration Tests
- [ ] End-to-end image generation (non-streaming)
- [ ] End-to-end streaming image generation
- [ ] Fallback to secondary provider
- [ ] Cache hit/miss scenarios
- [ ] Error handling (rate limits, invalid prompts)

### Load Tests
- [ ] Concurrent image generation requests
- [ ] Stream memory usage under load
- [ ] Cache performance at scale

---

## Rollout Plan

1. **Phase 1**: Core schema and provider implementation (OpenAI + Azure)
2. **Phase 2**: HTTP transport and non-streaming endpoint
3. **Phase 3**: Streaming support and accumulator
4. **Phase 4**: Semantic cache integration (Base64 storage, 5-minute TTL)
5. **Phase 5**: UI components and documentation

---

## Design Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| **Cache Storage** | Base64 Data | Store actual image bytes to prevent expiration issues with provider URLs |
| **Initial Providers** | OpenAI + Azure | Both DALL-E providers for redundancy and enterprise support |
| **Streaming Chunk Size** | 64KB | Balance between latency and overhead |

---

## References

- [OpenAI Images API](https://platform.openai.com/docs/api-reference/images)
- [Azure DALL-E](https://learn.microsoft.com/en-us/azure/ai-services/openai/dall-e-quickstart)
- [Bifrost Architecture](https://github.com/maximhq/bifrost)

---

## Critical Files Reference

| Component | File Path |
|-----------|-----------|
| Core schemas | `core/schemas/bifrost.go`, `core/schemas/chatcompletions.go` |
| Provider interface | `core/schemas/provider.go` |
| Main engine | `core/bifrost.go` |
| OpenAI provider | `core/providers/openai/openai.go`, `core/providers/openai/chat.go` |
| HTTP handlers | `transports/bifrost-http/handlers/inference.go` |
| Streaming framework | `framework/streaming/accumulator.go`, `framework/streaming/types.go` |
| Semantic cache | `plugins/semanticcache/main.go`, `plugins/semanticcache/search.go` |
| UI components | `ui/components/` |

