Replies: 1 comment 1 reply
You can override the value of `max_tokens` to whatever you want; 8192 is just a default value: `model = Claude(model="claude-3-5-sonnet-v2@20241022", max_tokens=...)`
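For a complete call, a minimal sketch (the import path is inferred from `src/google/adk/models/anthropic_llm.py` referenced below; 16384 is an arbitrary illustrative value, not from the original reply):

```python
# Hedged sketch: raising the limit at construction time. The import path is
# inferred from src/google/adk/models/anthropic_llm.py; 16384 is arbitrary.
from google.adk.models.anthropic_llm import Claude

model = Claude(
    model="claude-3-5-sonnet-v2@20241022",
    max_tokens=16384,  # overrides the 8192 default
)
```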
Is your feature request related to a problem? Please describe.
Yes. There are several related problems. The current `max_tokens` is hardcoded to 8192 (e.g., in the `Claude` model at `src/google/adk/models/anthropic_llm.py:298`). This causes two critical issues: long responses are silently truncated, and there is no built-in way to adapt the limit per invocation.

Real-World Impact
Personal Experience: While building an agentic system that queries databases to provide recommendations, I encountered this issue frequently. The system would retrieve data and generate responses that sometimes exceeded 8192 tokens, causing critical information to be truncated.
The Manual Workaround I Had to Implement:
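(The original snippet did not survive extraction; below is a minimal sketch of that kind of retry loop, written against the raw Anthropic SDK rather than ADK. Model name and limits are illustrative.)

```python
# Minimal sketch of the manual workaround: retry with a larger max_tokens
# whenever the response stops because it hit the limit.
import anthropic

client = anthropic.Anthropic()

def ask(prompt: str, limit: int = 2048, ceiling: int = 16384):
    while True:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=limit,
            messages=[{"role": "user", "content": prompt}],
        )
        # stop_reason == "max_tokens" means the answer was cut off.
        if response.stop_reason != "max_tokens" or limit >= ceiling:
            return response
        limit = min(limit * 2, ceiling)  # retry with a larger budget
```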
This same issue also occurred when using OpenAI models in other projects - responses would sometimes generate more tokens than expected, requiring similar manual intervention.
The Problem This Creates: silently truncated output, ad-hoc retry logic scattered through application code, and prompt-engineering workarounds to keep responses artificially short.

Example Problem:
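(The original example was lost in extraction; the following stands in for it, assuming `Claude` exposes `max_tokens` as a field, as the line reference below suggests.)

```python
# The limit is fixed at model-construction time, so any response longer
# than 8192 tokens is truncated with no supported way to recover.
from google.adk.models.anthropic_llm import Claude

model = Claude(model="claude-3-5-sonnet-v2@20241022")
print(model.max_tokens)  # 8192 - the hardcoded ceiling this issue is about
```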
Describe the solution you'd like
I propose adding several complementary features:
Add an Adaptive Token Limit Management feature that automatically adjusts `max_tokens` based on response quality and context.

Proposed API:
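(The original block was lost; this is a hypothetical shape for the config. The class name is invented, but the fields mirror the parameters listed under "Behavior" below.)

```python
# Hypothetical shape of the proposed config; the class name is invented,
# but the fields mirror the parameters described under "Behavior".
from dataclasses import dataclass

@dataclass
class AdaptiveTokenConfig:
    enabled: bool = True
    initial_tokens: int = 2048    # limit for the first invocation
    min_tokens: int = 512         # never shrink below this
    max_tokens: int = 16384       # never grow beyond this
    increase_factor: float = 1.5  # applied after a truncated response
    decrease_factor: float = 0.75 # applied after a short response
```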
Usage:
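(Also hypothetical: `RunConfig` exists at `src/google/adk/agents/run_config.py`, but the `adaptive_tokens` field is part of this proposal, not the current API.)

```python
from google.adk.agents.run_config import RunConfig

# adaptive_tokens is the proposed extension point, not an existing field.
run_config = RunConfig(
    adaptive_tokens=AdaptiveTokenConfig(initial_tokens=2048, max_tokens=16384),
)
```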
Behavior:
- Start with `initial_tokens` (e.g., 2048) for the first invocation
- On truncation, grow the limit by `increase_factor` (e.g., 2048 → 3072 → 4608 → ...)
- When a response finishes well under the limit, apply `decrease_factor` for the next invocation
- Always keep the limit at `min_tokens` or above, and never exceed `max_tokens`
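In code, one step of this rule amounts to the following sketch (names follow the hypothetical config above; note 2048 × 1.5 = 3072 and 3072 × 1.5 = 4608, matching the example sequence):

```python
def next_limit(current: int, truncated: bool, cfg: AdaptiveTokenConfig) -> int:
    """Grow after truncation, shrink otherwise, clamped to [min_tokens, max_tokens]."""
    if truncated:  # the response ended with FinishReason.MAX_TOKENS
        return min(int(current * cfg.increase_factor), cfg.max_tokens)
    return max(int(current * cfg.decrease_factor), cfg.min_tokens)
```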
Describe alternatives you've considered

- Manual Adjustment: Users could manually set `max_tokens` per invocation, but this requires guessing the right value up front and repeating the workaround in every project.
- Fixed Higher Limits: Simply increasing the default to 16384 only raises the ceiling; sufficiently long responses are still truncated.
- External Tracking: Building token management outside ADK means every application reimplements retry and detection logic the framework could provide once.
Additional context
Use Cases: agentic systems that query databases and return long, data-heavy answers (as described above), and more generally any agent whose output length varies widely between invocations.
Implementation Notes:
- Could be configured per run via `RunConfig` (with an `enabled=True` toggle)
- Could integrate with `ContextCacheConfig` for cost savings
- Truncation is detectable via `FinishReason.MAX_TOKENS` in the response
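(A sketch of the truncation check, assuming ADK surfaces the `google.genai` `FinishReason` enum on its responses, as the note above suggests.)

```python
from google.genai import types

def was_truncated(finish_reason) -> bool:
    # MAX_TOKENS is the finish reason for output cut off by the limit.
    return finish_reason == types.FinishReason.MAX_TOKENS
```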
Related Code:

- `src/google/adk/models/anthropic_llm.py:298` (`max_tokens: int = 8192`)
- `src/google/adk/agents/run_config.py`

Priority:
High - This directly addresses a critical production issue that forces developers to implement manual workarounds. The problem affects more than one backend: the same truncation behavior showed up with OpenAI models as well as Claude.
This feature would eliminate the need for manual token management and prompt engineering workarounds, allowing developers to focus on building better agentic systems rather than managing token limits.