Open
Description
Time to first token (TTFT) is a key user experience metric that I'd like to monitor across the various LLM providers I use. It'd be great to have a uniform way of measuring TTFT across all these instrumentations. I can see a couple of ways of doing it:
- emitting a span event when the first token is received
- measuring the time until the first token and setting a new `genai.time_to_first_token` attribute (or similar) on the span.
This measurement wouldn't really apply to non-streaming use cases, but for streams that take tens of seconds, my janky version of it has proven really useful for showing how much of the total time is TTFT versus actual streaming. See a WIP implementation here: 53b6bb4
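To make the two options concrete, here is a minimal sketch of how both could be combined in a stream wrapper. It is not the WIP commit referenced above, and it does not use any real instrumentation library: `RecordingSpan` is a hypothetical stand-in that only mimics `set_attribute` and `add_event`, `measure_ttft` is an illustrative helper name, and the `first_token` event name is an assumption. The attribute name `genai.time_to_first_token` is the one suggested above.

```python
import time
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional, Tuple


class RecordingSpan:
    """Hypothetical stand-in for a tracing span (not a real library class)."""

    def __init__(self) -> None:
        self.attributes: Dict[str, Any] = {}
        self.events: List[Tuple[str, Dict[str, Any]]] = []

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value

    def add_event(self, name: str, attributes: Optional[Dict[str, Any]] = None) -> None:
        self.events.append((name, dict(attributes or {})))


def measure_ttft(
    stream: Iterable[Any],
    span: Any,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Any]:
    """Wrap a token stream and record TTFT on the span when the first chunk arrives.

    The start time is taken when this function is called, so call it right
    before (or right after) the streaming request is issued.
    """
    start = clock()

    def _iterate() -> Iterator[Any]:
        first_seen = False
        for chunk in stream:
            if not first_seen:
                first_seen = True
                ttft_seconds = clock() - start
                # Option 1: a span event marking the arrival of the first token.
                # The event name "first_token" is an assumption.
                span.add_event("first_token", {"genai.time_to_first_token": ttft_seconds})
                # Option 2: an attribute on the span (name as proposed in this issue).
                span.set_attribute("genai.time_to_first_token", ttft_seconds)
            yield chunk

    return _iterate()


if __name__ == "__main__":
    def fake_stream() -> Iterator[str]:
        # Simulated provider stream: a delay before the first token, then fast tokens.
        time.sleep(0.2)
        for token in ["Hello", ",", " world"]:
            yield token

    demo_span = RecordingSpan()
    for _ in measure_ttft(fake_stream(), demo_span):
        pass
    print(demo_span.attributes)
```

If the stream yields nothing, neither the event nor the attribute is set, which matches the point above that the measurement does not apply to non-streaming responses. Subtracting the recorded value from the total span duration gives the actual streaming time.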