
[Bug]: Goroutines leak on context cancellation #828

@pjcdawkins


Prerequisites

  • I have searched existing issues and discussions to avoid duplicates
  • I am using the latest version (or have tested against main/nightly)

Description

Bifrost has a bug where worker goroutines and streaming goroutines don't clean up on context cancellation, causing them to block indefinitely and leak.

[reported using Claude]

  1. Worker Loop Doesn't Monitor Context: The `requestWorker` function (`bifrost.go:2016-2135`) uses `for req := range queue`, which only exits when the queue channel is closed during `Shutdown()`. Workers never check `bifrost.ctx.Done()` for cancellation, so they continue blocking on `<-queue` waiting for requests even after the context is cancelled.

  2. Streaming Goroutines Block on I/O: The streaming goroutine in `HandleOpenAIChatCompletionStreaming` (`openai.go:786-939`) blocks on `scanner.Scan()` at line 804. The context cancellation check happens inside the loop, so it never executes while the goroutine is blocked on I/O. When the context times out, the goroutine stays permanently blocked waiting for data.

Steps to reproduce

https://gist.github.com/pjcdawkins/6f63fad7eea19c3d698b2740aaf21959

Expected behavior

Goroutines should return to baseline after context cancellation. When bifrost.Init() is called with a context and that context is cancelled, all worker goroutines and streaming goroutines should clean up and exit.

Actual behavior

Multiple goroutines leak (workers + streaming goroutines). They remain blocked indefinitely until process exit, causing memory growth and resource exhaustion.

Affected area(s)

Core (Go)

Version

v1.2.22

Environment

- Go version: 1.25
- OS: Linux
- Affected providers: any provider that uses streaming

Relevant logs/output

### Root Cause Analysis

**Issue 1: Worker Loop (`bifrost.go:2026`)**

The worker loop only exits on channel close:

```go
for req := range queue {  // ONLY EXITS ON CHANNEL CLOSE
    // ... request processing
}
```


This loop only exits when the queue channel is closed in `Shutdown()` at line 2514. When a context is cancelled, the worker doesn't check `bifrost.ctx.Done()` and continues blocking.

**Why passing context to `Init()` doesn't fix it**: While `bifrost.Init()` accepts and stores a context in `bifrost.ctx`, the `requestWorker` goroutines don't monitor this context. They only monitor the per-request context (`req.Context`) for individual operations, not for the worker lifecycle itself.
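
In other words, the stored context never reaches the worker's receive loop. A simplified sketch of the current shape (illustrative only; `processRequest` is a stand-in name, not the exact code):

```go
// Simplified sketch of the current behaviour (not the exact Bifrost code):
// bifrost.ctx is stored at Init() but never selected on in this loop; only the
// per-request context is consulted, and only while a request is being processed.
for req := range queue {
    processRequest(req.Context, req) // honours req.Context for this request only
    // bifrost.ctx is never checked here, so cancelling it cannot stop the loop
}
```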

**Issue 2: Streaming I/O Blocking (`openai.go:804`)**


```go
for scanner.Scan() {  // BLOCKS HERE ON I/O
    // Context check happens AFTER scan completes
    select {
    case <-ctx.Done():
        return
    default:
    }
    // ... process line
}
```


The `ctx.Done()` check is inside the loop, so it never executes while the goroutine is blocked on `scanner.Scan()`. When the context times out, the HTTP client may not immediately close the connection, and the goroutine stays blocked indefinitely.
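
A standalone sketch (not Bifrost code) makes this concrete: with the same in-loop check, the goroutine stays stuck in `scanner.Scan()` after cancellation and only exits once the underlying reader is closed.

```go
package main

import (
    "bufio"
    "context"
    "fmt"
    "io"
    "time"
)

func main() {
    pr, pw := io.Pipe() // stands in for the streaming HTTP response body
    ctx, cancel := context.WithCancel(context.Background())

    exited := make(chan struct{})
    go func() {
        defer close(exited)
        scanner := bufio.NewScanner(pr)
        for scanner.Scan() { // blocks here waiting for data
            select {
            case <-ctx.Done():
                return
            default:
            }
        }
    }()

    cancel() // cancel the context; no data arrives and the reader stays open

    select {
    case <-exited:
        fmt.Println("goroutine exited") // does not happen at this point
    case <-time.After(500 * time.Millisecond):
        fmt.Println("still blocked in scanner.Scan() despite cancellation")
    }

    pw.Close() // closing the stream is what finally unblocks Scan
    <-exited
    fmt.Println("goroutine exited only after the reader was closed")
}
```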


### Proposed Fixes

**Fix 1: Make Workers Monitor Context Cancellation**

In `bifrost.go:requestWorker`:

```go
func (bifrost *Bifrost) requestWorker(...) {
    // Monitor both queue closure AND context cancellation
    for {
        select {
        case <-bifrost.ctx.Done():
            bifrost.logger.Debug("worker exiting due to context cancellation")
            return
        case req, ok := <-queue:
            if !ok {
                return  // Queue closed - shutdown
            }
            // Process request...
        }
    }
}
```


**Fix 2: Make Streaming Goroutines Respect Context**

Monitor context in parallel with I/O and force-close response body on cancellation:

```go
go func() {
    done := make(chan struct{})
    defer close(done)

    // Monitor context and force cleanup
    go func() {
        select {
        case <-ctx.Done():
            if resp.BodyStream() != nil {
                resp.BodyStream().Close()
            }
        case <-done:
        }
    }()

    // Existing streaming logic...
}()
```
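
As a generic, self-contained version of the same idea (illustrative only, not the actual Bifrost code), a watcher goroutine closes the stream on cancellation, which unblocks `scanner.Scan()`, and the `done` channel stops the watcher on the normal path so the watcher itself cannot leak:

```go
package stream

import (
    "bufio"
    "context"
    "io"
)

// streamLines is an illustrative sketch, not Bifrost's actual code: a watcher
// goroutine closes body when ctx is cancelled, which forces the blocked
// scanner.Scan() to return; closing done on return stops the watcher so it
// does not leak on the normal (non-cancelled) path.
func streamLines(ctx context.Context, body io.ReadCloser, handle func(string)) error {
    done := make(chan struct{})
    defer close(done)

    go func() {
        select {
        case <-ctx.Done():
            body.Close() // unblocks the Scan call below
        case <-done: // streaming finished normally; stop watching
        }
    }()

    scanner := bufio.NewScanner(body)
    for scanner.Scan() {
        handle(scanner.Text())
    }
    return scanner.Err()
}
```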


## Additional Notes

The channel send pattern at lines 2094-2129 correctly prevents workers from blocking on sends using select with timeout. However, this doesn't help when workers are already blocked on channel receives (`<-queue`) or I/O operations (`scanner.Scan()`).
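
For reference, that guarded-send pattern has roughly this shape (a generic sketch; the actual code at `bifrost.go:2094-2129` may differ, and `sendResult` is a hypothetical name):

```go
package worker

import (
    "context"
    "time"
)

// sendResult is a generic sketch (not the exact Bifrost code) of a send that
// cannot block forever: it gives up on context cancellation or after a timeout.
func sendResult[T any](ctx context.Context, ch chan<- T, v T, sendTimeout time.Duration) bool {
    select {
    case ch <- v:
        return true // delivered to the consumer
    case <-ctx.Done():
        return false // caller gave up; drop the result
    case <-time.After(sendTimeout):
        return false // consumer is not draining the channel; avoid a permanent block
    }
}
```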

## Workarounds

Until fixed, consumers can:
1. Create fresh Bifrost instances for each isolated operation
2. Explicitly call `Shutdown()` when done with a Bifrost instance
3. Accept the leak for short-lived processes that exit soon anyway
4. Set aggressive timeouts at the HTTP client level

Regression?

No response

Severity

Medium (some functionality impaired)
