-
Notifications
You must be signed in to change notification settings - Fork 68
Fix find_dotenv assertion error in local conversation #1326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Handle AssertionError and OSError from dotenv_values() gracefully when find_dotenv() fails to walk the call stack. This fixes the issue where local conversation would crash with AssertionError when dotenv tries to locate a .env file in contexts with shallow or unusual call stacks. The fix catches these exceptions and returns None, allowing the code to continue execution without breaking. Environment variables are still prioritized and checked first before attempting to load from dotenv. Fixes #1325 Co-authored-by: openhands <openhands@all-hands.dev>
| return env_value | ||
| try: | ||
| return dotenv_values().get(key) | ||
| except (AssertionError, OSError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there an AssertionError in the first place, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea 🤷 it just happened randomly in the middle of my CLI session
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could load dotenv at the beginning of the execution path (not in observability utils) and then we shouldn't get any such error? @OpenHands WDYT? Look at the original issue #1325 and rethink / reinvestigate why the problem occurred in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on it! enyst can track my progress at all-hands.dev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of findings and recommendation
-
Root cause: The AssertionError came from python-dotenv’s find_dotenv() when called indirectly via dotenv_values() without arguments. find_dotenv() walks Python stack frames (using frame.f_back). In our local conversation path, heavy decorator/wrapper usage can exhaust the frame chain, making the assertion frame.f_back is not None fail.
-
Answer to “Should we load dotenv at the beginning of the execution path?”: Yes. Loading dotenv once at process startup (outside observability utils) avoids the fragile stack-walk entirely. After that, get_env can simply read from os.environ, and we don’t rely on dotenv_values()/find_dotenv() mid-execution.
-
Practical approach:
- Keep the current defensive handling in get_env() (catch AssertionError/OSError) for robustness and backward compatibility.
- At application entry points (CLI, server), call load_dotenv(usecwd=True, override=False) early to populate os.environ deterministically without stack inspection.
- Long-term, consider simplifying get_env() to only read os.environ and leave .env loading to the host application. If we keep a dotenv fallback, prefer using usecwd=True or an explicit path rather than relying on call-stack introspection.
Bottom line: Loading dotenv at startup is the right architectural fix and would have prevented the AssertionError; the existing try/except in get_env is a good safety net.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xingyaoww How about this? So maybe it shouldn't be necessary to find_dotenv here at all?
OpenHands/OpenHands-CLI#162
Description
This PR fixes issue #1325 where
find_dotenvbreaks local conversation with anAssertionError.Problem
The
get_env()function inopenhands/sdk/observability/utils.pycallsdotenv_values()without any arguments. Whendotenv_values()is called without arguments, it internally callsfind_dotenv()to locate the .env file. Thefind_dotenv()function tries to walk up the call stack to find the directory of the calling file, but in certain execution contexts (like when running in local conversation or when the call stack is unusual), the assertionassert frame.f_back is not Nonecan fail because the frame stack is exhausted.Solution
The fix catches
AssertionErrorandOSErrorfromdotenv_values()and returnsNonegracefully. This allows the code to continue execution without breaking. The implementation:Noneif value cannot be retrieved from either sourceChanges
Modified:
openhands-sdk/openhands/sdk/observability/utils.pyget_env()function to handleAssertionErrorandOSErrorfromdotenv_values()Added: Tests for the fix
tests/sdk/observability/test_utils.py- Tests forget_env()function including error handlingtests/sdk/observability/test_laminar.py- Tests forshould_enable_observability()with dotenv errorsTesting
All tests pass:
The fix has been validated to:
Nonegracefully when dotenv fails withAssertionErrorNonegracefully when dotenv fails withOSErrorFixes #1325
@xingyaoww can click here to continue refining the PR
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.12-nodejs22golang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:3d8c0fc-pythonRun
All tags pushed for this build
About Multi-Architecture Support
3d8c0fc-python) is a multi-arch manifest supporting both amd64 and arm643d8c0fc-python-amd64) are also available if needed