Labels: enhancement, good first issue
Description
Add native image generation capabilities to Bifrost, enabling unified access to image generation models (DALL-E, Stable Diffusion, Midjourney API, etc.) through the existing gateway infrastructure with full streaming support, semantic caching, and UI rendering.
As AI applications increasingly combine text and image generation, Bifrost should provide a unified interface for image generation models with the same benefits it offers for LLMs:
- Provider abstraction - Switch between DALL-E, Stable Diffusion, etc. seamlessly
- Fallback support - Automatic failover between image providers
- Observability - Logging, metrics, and cost tracking for image generation
- Caching - Semantic cache for repeated image prompts
- Streaming - Progressive image loading for better UX
Scope
In Scope
- New /v1/images/generations endpoint (OpenAI-compatible)
- Image generation via Chat Completion API (tool use pattern)
- Image generation via Responses API (native support)
- Streaming image delivery (base64 chunks)
- Semantic caching for image generation
- UI components for image rendering
- Provider implementations: OpenAI DALL-E, Azure DALL-E
Out of Scope (Future Work)
- Image editing (/v1/images/edits)
- Image variations (/v1/images/variations)
- Video generation
- Additional providers (Stability AI, Midjourney)
Technical Design
Note: The snippets below are illustrative sketches to convey the design, not final implementations.
1. Schema Definitions
New File: core/schemas/images.go
package schemas
// Request Types
const (
ImageGenerationRequest RequestType = "image_generation"
ImageGenerationStreamRequest RequestType = "image_generation_stream"
)
// BifrostImageGenerationRequest represents an image generation request
type BifrostImageGenerationRequest struct {
Provider ModelProvider `json:"provider"`
Model string `json:"model"`
Input *ImageGenerationInput `json:"input"`
Params *ImageGenerationParameters `json:"params,omitempty"`
Fallbacks []Fallback `json:"fallbacks,omitempty"`
RawRequestBody []byte `json:"-"`
}
type ImageGenerationInput struct {
Prompt string `json:"prompt"`
}
type ImageGenerationParameters struct {
N *int `json:"n,omitempty"` // Number of images (1-10 for DALL-E 2; DALL-E 3 accepts only 1)
Size *string `json:"size,omitempty"` // DALL-E 2: "256x256", "512x512", "1024x1024"; DALL-E 3: "1024x1024", "1792x1024", "1024x1792"
Quality *string `json:"quality,omitempty"` // "standard", "hd" (DALL-E 3 only)
Style *string `json:"style,omitempty"` // "natural", "vivid" (DALL-E 3 only)
ResponseFormat *string `json:"response_format,omitempty"` // "url", "b64_json"
User *string `json:"user,omitempty"`
ExtraParams map[string]interface{} `json:"extra_params,omitempty"`
}
// BifrostImageGenerationResponse represents the response
type BifrostImageGenerationResponse struct {
ID string `json:"id"`
Created int64 `json:"created"`
Model string `json:"model"`
Data []ImageData `json:"data"`
Usage *ImageUsage `json:"usage,omitempty"`
ExtraFields BifrostResponseExtraFields `json:"extra_fields,omitempty"`
}
type ImageData struct {
URL string `json:"url,omitempty"`
B64JSON string `json:"b64_json,omitempty"`
RevisedPrompt string `json:"revised_prompt,omitempty"`
Index int `json:"index"`
}
type ImageUsage struct {
PromptTokens int `json:"prompt_tokens"`
TotalTokens int `json:"total_tokens"`
}
// Streaming Response
type BifrostImageStreamResponse struct {
ID string `json:"id"`
Type string `json:"type"` // "image.chunk", "image.complete", "error"
Index int `json:"index"` // Which image (0-N)
ChunkIndex int `json:"chunk_index"` // Chunk order within image
PartialB64 string `json:"partial_b64,omitempty"` // Base64 chunk
RevisedPrompt string `json:"revised_prompt,omitempty"` // Sent with the image.complete event
Usage *ImageUsage `json:"usage,omitempty"` // On final chunk
Error *BifrostError `json:"error,omitempty"`
}
2. Provider Interface Extension
Update: core/schemas/provider.go
type Provider interface {
// ... existing methods ...
// Image Generation
ImageGeneration(ctx context.Context, key Key, request *BifrostImageGenerationRequest) (
*BifrostImageGenerationResponse, *BifrostError)
ImageGenerationStream(ctx context.Context, postHookRunner PostHookRunner, key Key,
request *BifrostImageGenerationRequest) (chan *BifrostStream, *BifrostError)
}
3. Core Bifrost Methods
Update: core/bifrost.go
// Add public methods
func (b *Bifrost) ImageGenerationRequest(ctx context.Context,
req *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError)
func (b *Bifrost) ImageGenerationStreamRequest(ctx context.Context,
req *schemas.BifrostImageGenerationRequest) (chan *schemas.BifrostStream, *schemas.BifrostError)
// Update handleRequest switch statement
case schemas.ImageGenerationRequest:
resp, err := provider.ImageGeneration(req.Context, key, req.BifrostRequest.ImageGenerationRequest)
if err != nil {
return nil, err
}
response.ImageGenerationResponse = resp
4. HTTP Transport Layer
Update: transports/bifrost-http/handlers/inference.go
// Add route
r.POST("/v1/images/generations", h.imageGeneration)
// Handler implementation
func (h *CompletionHandler) imageGeneration(ctx *fasthttp.RequestCtx) {
var req ImageGenerationHTTPRequest
if err := sonic.Unmarshal(ctx.PostBody(), &req); err != nil {
// error handling: write an error response, then stop processing
return
}
bifrostReq := toBifrostImageRequest(&req)
if req.Stream != nil && *req.Stream {
h.handleStreamingImageGeneration(ctx, bifrostReq)
return
}
resp, err := h.client.ImageGenerationRequest(ctx, bifrostReq)
// response handling
}
5. Streaming Implementation
New File: framework/streaming/images.go
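package streaming
import (
"sort"
"strings"
"sync"
"time"
"github.com/maximhq/bifrost/core/schemas" // assumed module path
)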
type ImageStreamChunk struct {
Timestamp time.Time
Delta *schemas.BifrostImageStreamResponse
FinishReason *string
ChunkIndex int
ImageIndex int
ErrorDetails *schemas.BifrostError
}
// Pool for memory efficiency
var imageStreamChunkPool = sync.Pool{
New: func() interface{} {
return &ImageStreamChunk{}
},
}
func (a *Accumulator) addImageStreamChunk(requestID string, chunk *ImageStreamChunk, isFinal bool) error {
acc := a.getOrCreateStreamAccumulator(requestID)
acc.mu.Lock()
defer acc.mu.Unlock()
acc.ImageStreamChunks = append(acc.ImageStreamChunks, chunk)
if isFinal {
return a.processImageStreamingResponse(requestID, acc)
}
return nil
}
func (a *Accumulator) processImageStreamingResponse(requestID string, acc *StreamAccumulator) error {
// Sort chunks by ImageIndex, then ChunkIndex
sort.Slice(acc.ImageStreamChunks, func(i, j int) bool {
if acc.ImageStreamChunks[i].ImageIndex != acc.ImageStreamChunks[j].ImageIndex {
return acc.ImageStreamChunks[i].ImageIndex < acc.ImageStreamChunks[j].ImageIndex
}
return acc.ImageStreamChunks[i].ChunkIndex < acc.ImageStreamChunks[j].ChunkIndex
})
// Reconstruct complete images from chunks
images := make(map[int]*strings.Builder)
for _, chunk := range acc.ImageStreamChunks {
if chunk.Delta == nil {
continue // error or marker chunks carry no base64 payload
}
if _, ok := images[chunk.ImageIndex]; !ok {
images[chunk.ImageIndex] = &strings.Builder{}
}
images[chunk.ImageIndex].WriteString(chunk.Delta.PartialB64)
}
// Build final response
// ...
return nil
}
Update: framework/streaming/types.go
type StreamAccumulator struct {
// ... existing fields ...
ImageStreamChunks []*ImageStreamChunk
}
const StreamTypeImage = "image.generation"
6. Semantic Cache Integration
Update: plugins/semanticcache/main.go
// Add image generation to cacheable request types
func (p *SemanticCachePlugin) PreHook(ctx context.Context, req *schemas.BifrostRequest) error {
switch req.RequestType {
case schemas.ChatCompletionRequest, schemas.ImageGenerationRequest:
return p.checkCache(ctx, req)
}
return nil
}
// Image-specific cache key generation
func (p *SemanticCachePlugin) getImageCacheKey(req *schemas.BifrostImageGenerationRequest) string {
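// Hash: prompt + size + quality + style + n.
// Each field is followed by a NUL separator so adjacent values cannot
// run together and collide (e.g. "ab"+"c" vs "a"+"bc").
h := xxhash.New()
if req.Input != nil {
h.WriteString(req.Input.Prompt)
}
h.WriteString("\x00")
if req.Params != nil {
if req.Params.Size != nil {
h.WriteString(*req.Params.Size)
}
h.WriteString("\x00")
if req.Params.Quality != nil {
h.WriteString(*req.Params.Quality)
}
h.WriteString("\x00")
// ... other params
}
return fmt.Sprintf("img_%x", h.Sum64())
}
Cache Storage Schema: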
// Vector store properties for image cache
Properties: []Property{
{Name: "request_hash", DataType: "string"},
{Name: "prompt_embedding", DataType: "vector"}, // For semantic similarity
{Name: "image_urls", DataType: "string[]"}, // Cached URLs
{Name: "image_b64", DataType: "string[]"}, // Cached base64 (optional)
{Name: "revised_prompts", DataType: "string[]"},
{Name: "expires_at", DataType: "int"},
{Name: "provider", DataType: "string"},
{Name: "model", DataType: "string"},
{Name: "params_hash", DataType: "string"},
}
7. Provider Implementation (OpenAI)
New File: core/providers/openai/images.go
package openai
type OpenAIImageRequest struct {
Model string `json:"model"`
Prompt string `json:"prompt"`
N *int `json:"n,omitempty"`
Size *string `json:"size,omitempty"`
Quality *string `json:"quality,omitempty"`
Style *string `json:"style,omitempty"`
ResponseFormat *string `json:"response_format,omitempty"`
User *string `json:"user,omitempty"`
}
type OpenAIImageResponse struct {
Created int64 `json:"created"`
Data []struct {
URL string `json:"url,omitempty"`
B64JSON string `json:"b64_json,omitempty"`
RevisedPrompt string `json:"revised_prompt,omitempty"`
} `json:"data"`
}
func (p *OpenAIProvider) ImageGeneration(ctx context.Context, key schemas.Key,
req *schemas.BifrostImageGenerationRequest) (*schemas.BifrostImageGenerationResponse, *schemas.BifrostError) {
openaiReq := toOpenAIImageRequest(req)
resp, err := p.doRequest(ctx, key, "POST", "/v1/images/generations", openaiReq)
if err != nil {
return nil, toBifrostError(err)
}
return toBifrostImageResponse(resp), nil
}
func (p *OpenAIProvider) ImageGenerationStream(ctx context.Context, postHookRunner schemas.PostHookRunner,
key schemas.Key, req *schemas.BifrostImageGenerationRequest) (chan *schemas.BifrostStream, *schemas.BifrostError) {
// The DALL-E endpoints return each image in a single response, so this
// sketch generates the image first and then delivers it as base64 chunks.
// Chunks are only produced when the provider returns b64_json.
streamChan := make(chan *schemas.BifrostStream, 10)
go func() {
defer close(streamChan)
// send delivers one event, or gives up once the caller cancels ctx,
// so the goroutine cannot block forever on an abandoned channel.
send := func(ev *schemas.BifrostStream) bool {
select {
case streamChan <- ev:
return true
case <-ctx.Done():
return false
}
}
// Generate image (non-streaming from provider)
resp, err := p.ImageGeneration(ctx, key, req)
if err != nil {
send(&schemas.BifrostStream{BifrostError: err})
return
}
// Stream base64 data in chunks
for i, img := range resp.Data {
if img.B64JSON != "" {
chunks := chunkBase64(img.B64JSON, 64*1024) // 64KB chunks
for j, chunk := range chunks {
if !send(&schemas.BifrostStream{
BifrostImageStreamResponse: &schemas.BifrostImageStreamResponse{
ID: resp.ID,
Type: "image.chunk",
Index: i,
ChunkIndex: j,
PartialB64: chunk,
},
}) {
return
}
}
}
// Send completion marker; usage is attached to the last image's marker
done := &schemas.BifrostImageStreamResponse{
ID: resp.ID,
Type: "image.complete",
Index: i,
RevisedPrompt: img.RevisedPrompt,
}
if i == len(resp.Data)-1 {
done.Usage = resp.Usage
}
if !send(&schemas.BifrostStream{BifrostImageStreamResponse: done}) {
return
}
}
}()
return streamChan, nil
}
8. UI Components
New File: ui/components/chat/ImageMessage.tsx
import React from 'react';
import { Card } from '@/components/ui/card';
import { Skeleton } from '@/components/ui/skeleton';
interface ImageMessageProps {
images: Array<{
url?: string;
b64_json?: string;
revised_prompt?: string;
index: number;
}>;
isStreaming?: boolean;
streamProgress?: number; // 0-100
}
export const ImageMessage: React.FC<ImageMessageProps> = ({
images,
isStreaming,
streamProgress
}) => {
return (
<div className="grid grid-cols-2 gap-4 my-4">
{images.map((img, idx) => (
<Card key={idx} className="overflow-hidden">
{isStreaming && !img.url && !img.b64_json ? (
<div className="relative">
<Skeleton className="w-full aspect-square" />
<div className="absolute bottom-2 left-2 text-sm text-muted-foreground">
Loading... {streamProgress ?? 0}%
</div>
</div>
) : (
<>
<img
src={img.url || `data:image/png;base64,${img.b64_json}`}
alt={img.revised_prompt || `Generated image ${idx + 1}`}
className="w-full h-auto"
loading="lazy"
/>
{img.revised_prompt && (
<div className="p-2 text-xs text-muted-foreground border-t">
{img.revised_prompt}
</div>
)}
</>
)}
</Card>
))}
</div>
);
};
API Examples
REST API
Request:
curl -X POST http://localhost:8080/v1/images/generations \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
-d '{
"model": "dall-e-3",
"prompt": "A serene Japanese garden with cherry blossoms",
"n": 1,
"size": "1024x1024",
"quality": "hd",
"response_format": "b64_json"
}'
Response:
{
"id": "img-abc123",
"created": 1699999999,
"model": "dall-e-3",
"data": [
{
"b64_json": "iVBORw0KGgo...",
"revised_prompt": "A tranquil Japanese garden featuring blooming cherry blossom trees...",
"index": 0
}
],
"usage": {
"prompt_tokens": 15,
"total_tokens": 15
},
"extra_fields": {
"provider": "openai",
"latency_ms": 8500,
"cache_debug": null
}
}
Streaming Response (SSE)
data: {"id":"img-abc123","type":"image.chunk","index":0,"chunk_index":0,"partial_b64":"iVBORw0KGgo..."}
data: {"id":"img-abc123","type":"image.chunk","index":0,"chunk_index":1,"partial_b64":"AAAANSUhEU..."}
data: {"id":"img-abc123","type":"image.complete","index":0,"revised_prompt":"A tranquil Japanese garden...","usage":{"prompt_tokens":15,"total_tokens":15}}
data: [DONE]
Files to Modify/Create
New Files
| File | Purpose |
|---|---|
| core/schemas/images.go | Image generation type definitions |
| core/providers/openai/images.go | OpenAI DALL-E implementation |
| core/providers/azure/images.go | Azure DALL-E implementation |
| core/internal/testutil/image_generation.go | Image generation test scenario |
| core/internal/testutil/image_generation_stream.go | Image generation stream test scenario |
| framework/streaming/images.go | Image stream accumulation |
| ui/components/chat/ImageMessage.tsx | React image rendering component |
| ui/hooks/useImageStream.ts | Streaming hook for progressive loading |
Modified Files
| File | Changes |
|---|---|
| core/schemas/bifrost.go | Add ImageGenerationRequest/Response to BifrostRequest/Response unions |
| core/schemas/provider.go | Add ImageGeneration methods to Provider interface |
| core/bifrost.go | Add ImageGenerationRequest/StreamRequest methods, update handleRequest |
| core/providers/openai/openai.go | Implement Provider interface methods |
| transports/bifrost-http/handlers/inference.go | Add /v1/images/generations route |
| framework/streaming/types.go | Add ImageStreamChunks field to StreamAccumulator, StreamTypeImage |
| framework/streaming/accumulator.go | Add image chunk pool, processing methods |
| plugins/semanticcache/main.go | Add image generation caching support |
| plugins/semanticcache/search.go | Add image-specific cache search |
Testing Plan
Unit Tests
- Schema serialization/deserialization
- Request transformation (Bifrost → OpenAI format)
- Response transformation (OpenAI → Bifrost format)
- Stream chunk accumulation
- Cache key generation
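As an illustration of the schema serialization item, the check below round-trips a trimmed copy of the parameters struct through JSON. It is a sketch under assumptions: `imageParams` and `roundTrip` are hypothetical stand-ins for ImageGenerationParameters and a real test helper, and it uses the standard library encoding/json rather than sonic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageParams is a trimmed, hypothetical copy of ImageGenerationParameters.
type imageParams struct {
	N       *int    `json:"n,omitempty"`
	Size    *string `json:"size,omitempty"`
	Quality *string `json:"quality,omitempty"`
}

// roundTrip marshals p and unmarshals the result into a fresh value,
// returning both the JSON text and the decoded copy.
func roundTrip(p imageParams) (string, imageParams, error) {
	raw, err := json.Marshal(p)
	if err != nil {
		return "", imageParams{}, err
	}
	var out imageParams
	if err := json.Unmarshal(raw, &out); err != nil {
		return "", imageParams{}, err
	}
	return string(raw), out, nil
}

func main() {
	n, size := 1, "1024x1024"
	raw, out, err := roundTrip(imageParams{N: &n, Size: &size})
	if err != nil {
		panic(err)
	}
	// Quality is nil, so omitempty drops it from the payload.
	fmt.Println(raw)                // {"n":1,"size":"1024x1024"}
	fmt.Println(out.Quality == nil) // true
}
```

Pointer fields with omitempty keep "unset" distinct from a zero value, which matters when the gateway forwards only the parameters the caller supplied.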
Integration Tests
- End-to-end image generation (non-streaming)
- End-to-end streaming image generation
- Fallback to secondary provider
- Cache hit/miss scenarios
- Error handling (rate limits, invalid prompts)
Load Tests
- Concurrent image generation requests
- Stream memory usage under load
- Cache performance at scale
Rollout Plan
- Phase 1: Core schema and provider implementation (OpenAI + Azure)
- Phase 2: HTTP transport and non-streaming endpoint
- Phase 3: Streaming support and accumulator
- Phase 4: Semantic cache integration (Base64 storage, 5min TTL)
- Phase 5: UI components and documentation
Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Cache Storage | Base64 Data | Store actual image bytes to prevent expiration issues with provider URLs |
| Initial Providers | OpenAI + Azure | Both DALL-E providers for redundancy and enterprise support |
| Streaming Chunk Size | 64KB | Balance between latency and overhead |
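The provider sketch calls a chunkBase64 helper that this issue does not define. One possible implementation is shown below as an assumption, not the intended one: it rounds the chunk size down to a multiple of 4 so that every chunk is itself valid base64, which lets a consumer decode chunks independently. With the 64KB setting, each full chunk decodes to 48 KiB of image data.

```go
package main

import "fmt"

// chunkBase64 splits a base64 string into pieces of at most size bytes.
// size is rounded down to a multiple of 4 (minimum 4) so that chunk
// boundaries always fall between complete 4-character base64 groups.
func chunkBase64(s string, size int) []string {
	size -= size % 4
	if size < 4 {
		size = 4
	}
	chunks := make([]string, 0, (len(s)+size-1)/size)
	for len(s) > size {
		chunks = append(chunks, s[:size])
		s = s[size:]
	}
	if len(s) > 0 {
		chunks = append(chunks, s)
	}
	return chunks
}

func main() {
	// A requested size of 6 is rounded down to 4.
	chunks := chunkBase64("aGVsbG8h", 6)
	fmt.Println(len(chunks), chunks) // 2 [aGVs bG8h]
}
```

Slicing the string shares the underlying memory, so chunking a multi-megabyte image does not copy the payload.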
References
Critical Files Reference
| Component | File Path |
|---|---|
| Core schemas | core/schemas/bifrost.go, core/schemas/chatcompletions.go |
| Provider interface | core/schemas/provider.go |
| Main engine | core/bifrost.go |
| OpenAI provider | core/providers/openai/openai.go, core/providers/openai/chat.go |
| HTTP handlers | transports/bifrost-http/handlers/inference.go |
| Streaming framework | framework/streaming/accumulator.go, framework/streaming/types.go |
| Semantic cache | plugins/semanticcache/main.go, plugins/semanticcache/search.go |
| UI components | ui/components/ |