89 changes: 89 additions & 0 deletions .github/workflows/examples-adk.yml
@@ -0,0 +1,89 @@
name: Examples - ADK

permissions:
  contents: read

on:
  schedule:
    # Every day at 4 AM UTC+8
    - cron: '0 20 * * *'
  workflow_dispatch:
  repository_dispatch:
    types: [ci-adk, ci-all]

run-name: >-
  ${{ github.event_name == 'repository_dispatch'
  && format(
    'PR #{0} - Label {1} - {2}',
    github.event.client_payload.pull_number,
    github.event.client_payload.ci_label,
    github.event.client_payload.correlation_id
  )
  || format('ADK - {0}', github.event_name) }}

jobs:
  adk:
    if: >
      github.event_name != 'repository_dispatch' ||
      github.event.action == 'ci-adk' ||
      github.event.action == 'ci-all'
    name: ADK (Python 3.12)
    runs-on: [self-hosted, 1ES.Pool=agl-runner-gpu]
    timeout-minutes: 30
    steps:
      - name: Check GPU status
        run: nvidia-smi

      - name: Check disk space
        run: df -h

      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event_name == 'repository_dispatch' && github.event.client_payload.pr_ref || (github.event.pull_request.number && format('refs/pull/{0}/merge', github.event.pull_request.number)) || github.ref }}

      - uses: astral-sh/setup-uv@v7
        with:
          enable-cache: true
          python-version: '3.12'

      - name: Sync dependencies
        run: |
          uv sync --frozen --no-default-groups --extra verl \
            --group dev --group experiment --group agents --group torch-gpu-stable

      - name: Freeze dependencies
        run: |
          set -ex
          uv pip freeze | tee requirements-freeze.txt
          echo "UV_LOCKED=1" >> $GITHUB_ENV
          echo "UV_NO_SYNC=1" >> $GITHUB_ENV

      - name: Upload dependencies artifact
        uses: actions/upload-artifact@v4
        with:
          name: dependencies-adk
          path: requirements-freeze.txt
          compression-level: 0

      - name: Launch LiteLLM Proxy
        run: ./scripts/litellm_run.sh
        env:
          AZURE_API_BASE: ${{ secrets.AZURE_GROUP_SUBSCRIPTION_API_BASE }}
          AZURE_API_KEY: ${{ secrets.AZURE_GROUP_SUBSCRIPTION_API_KEY }}

      - name: Prepare ADK dataset
        run: |
          set -ex
          cd examples/google_adk
          uv run python prepare_dataset.py --download --outdir .

      - name: ADK debug sanity check
        run: |
          set -ex
          cd examples/google_adk
          uv run python adk_debug.py --file data/test.parquet --index 0
        env:
          OPENAI_API_BASE: http://localhost:12306/
          OPENAI_API_KEY: dummy
          OPENAI_MODEL: meta-llama/Meta-Llama-3-8B-Instruct
        if: success() || failure()
8 changes: 1 addition & 7 deletions .gitignore
@@ -206,10 +206,4 @@ cython_debug/
.cursorindexingignore

# Claude
.claude/*.local.json

# Dashboard generated files
agentlightning/dashboard/**/*.css
agentlightning/dashboard/**/*.js
agentlightning/dashboard/**/*.html
agentlightning/dashboard/**/*.svg
.claude/*.local.json
111 changes: 111 additions & 0 deletions docs/how-to/train-adk-agent.md
@@ -0,0 +1,111 @@
# Train ADK Agent with Agent-lightning and VERL
> **Contributor:** Let's refine this example. Users might have already read the how-to with verl + langchain. This how-to should focus on the differences -- how to make ADK agent available for training. Omitting other unnecessary details.

> **Author:** Thank you so much! I have updated and made a latest commit based on the feedback. Ready for the review.


This how-to keeps only the steps required to make the ADK-powered agent visible to the Agent-lightning trainer and to launch VERL training. For end-to-end reference implementations, open [`examples/google_adk`](../examples/google_adk) while you follow along.

## 1. Prerequisites

- Install dependencies that ship the ADK wrappers and VERL runner:

```bash
pip install "agentlightning[verl,adk]" "google-adk>=0.3.0"
```

- Prepare two Parquet files under `examples/google_adk/data`: `train.parquet` and `test.parquet`. Run `uv run python prepare_dataset.py --download --outdir examples/google_adk` to pull the Spider dataset that we reuse from the SQL tutorial, or supply your own JSON/CSV via `--train` / `--test`.
- Export the OpenAI-compatible endpoint that will back the ADK agent (native OpenAI, Azure, or a local vLLM proxy):

```bash
export OPENAI_API_BASE=http://localhost:8000/v1
export OPENAI_API_KEY=<redacted>
export OPENAI_MODEL=meta-llama/Meta-Llama-3-8B-Instruct
export HF_TOKEN=<token used by VERL to download weights>
```

## 2. Wrap the ADK agent

[`examples/google_adk/adk_agent.py`]({{ src("examples/google_adk/adk_agent.py") }}) defines `LitAdkAgent`, a thin subclass of [`agl.LitAgent`][agentlightning.LitAgent]. Its responsibilities:

- Pull the `"main_llm"` resource that VERL injects into each rollout.
- Construct the ADK orchestrator (Agent + Orchestrator or any custom logic) with that LLM endpoint.
- Emit spans automatically through ADK’s tracing hooks while answering the task.
- Return a scalar reward from `rollout(...)`. Do **not** call [`emit_reward`][agentlightning.emit_reward] when returning a value.

Because `LitAdkAgent` is already implemented, you only need to verify that your ADK-side plan/execution logic looks up the base URL and credentials from the provided `agl.LLM`. That is what makes the agent “available” to the trainer—no extra registration layer is required.
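The contract described above can be sketched without any ADK or Agent-lightning imports. All names below are illustrative stand-ins for the real `agl.LitAgent` / `agl.LLM` API (whose exact signatures may differ); the point is the shape: pull `"main_llm"` from the injected resources, run the agent against that endpoint, and return a scalar reward.

```python
from dataclasses import dataclass, field


@dataclass
class LLM:
    """Stand-in for agl.LLM: the endpoint resource injected per rollout."""
    endpoint: str
    model: str
    api_key: str
    sampling_parameters: dict = field(default_factory=dict)


class LitAdkAgentSketch:
    """Illustrative shape of a LitAgent subclass; not the real agentlightning API."""

    def rollout(self, task: dict, resources: dict) -> float:
        llm: LLM = resources["main_llm"]  # injected by the trainer / VERL
        # Real code would build the ADK orchestrator against llm.endpoint
        # and llm.api_key, answer the task, and score the answer.
        answer = f"SELECT ...  -- produced via {llm.model} at {llm.endpoint}"
        # Return the scalar reward directly; do not also call emit_reward.
        return 1.0 if task.get("ground_truth", "") in answer else 0.0
```

If your ADK-side logic reads the base URL and key from anywhere other than the injected resource (for example, a hard-coded environment variable), VERL's updated endpoint will never reach the agent, which is the most common wiring mistake.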

## 3. Provide resources to the trainer

Making the ADK agent usable during training boils down to handing the trainer an initial `"main_llm"` resource and pointing it at `LitAdkAgent`. The snippet below matches what `train_adk.py` does:

```python
import os

import pandas as pd

import agentlightning as agl
from examples.google_adk.adk_agent import LitAdkAgent

verl_config = {
"algorithm": {"adv_estimator": "grpo"},
"data": {"train_batch_size": 32, "max_prompt_length": 4096, "max_response_length": 2048},
"actor_rollout_ref": {
"rollout": {"name": "vllm", "n": 4, "multi_turn": {"format": "hermes"}},
"actor": {"ppo_mini_batch_size": 32, "optim": {"lr": 1e-6}},
"model": {"path": "meta-llama/Meta-Llama-3-8B-Instruct"},
},
"trainer": {"n_gpus_per_node": 1, "val_before_train": True, "test_freq": 32, "save_freq": 64},
}

trainer = agl.Trainer(
n_runners=10,
algorithm=agl.VERL(verl_config),
adapter={"agent_match": "LitAdkAgent"},
initial_resources={
"main_llm": agl.LLM(
endpoint=os.environ["OPENAI_API_BASE"],
model=os.environ["OPENAI_MODEL"],
api_key=os.environ["OPENAI_API_KEY"],
sampling_parameters={"temperature": 0.0},
)
},
)

agent = LitAdkAgent()
train_data = pd.read_parquet("data/train.parquet").to_dict("records")
val_data = pd.read_parquet("data/test.parquet").to_dict("records")
trainer.fit(agent, train_dataset=train_data, val_dataset=val_data)
```

Key takeaways:

- The agent becomes discoverable to VERL once you pass it to `trainer.fit(...)`.
- The `"main_llm"` key is a convention—use it consistently between the trainer config and the agent’s rollout.
- `adapter.agent_match` filters spans so that VERL only consumes the ADK agent’s traces.
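Conceptually, `adapter.agent_match` acts as a filter over the span stream. A toy version of that filtering is shown below; the real adapter's matching semantics (exact name, regex, or span hierarchy) are an assumption here, so treat this only as a mental model:

```python
import re


def filter_spans(spans: list[dict], agent_match: str) -> list[dict]:
    """Keep only spans whose agent name matches the pattern (toy model of adapter.agent_match)."""
    pattern = re.compile(agent_match)
    return [s for s in spans if pattern.search(s.get("agent", ""))]
```

With `agent_match="LitAdkAgent"`, spans emitted by other agents or background tooling in the same process are dropped before VERL computes advantages.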

## 4. Launch the packaged script

All the wiring above is already bundled inside [`examples/google_adk/train_adk.py`]({{ src("examples/google_adk/train_adk.py") }}). From the example directory, run:

```bash
python train_adk.py \
--train-file data/train.parquet \
--val-file data/test.parquet \
--model ${OPENAI_MODEL:-meta-llama/Meta-Llama-3-8B-Instruct} \
--endpoint ${OPENAI_API_BASE:-http://localhost:8000/v1}
```

Helpful flags:

- `--ci` or `--ci-fast` to shrink runner count + dataset slices.
- `--wandb-project` / `--wandb-run-name` if you want W&B logging.
- `--external-store-address` to connect to an existing LightningStore (reuse traces between runs).

Use `python adk_debug.py --file data/test.parquet` for a quick dry run that exercises the agent without launching VERL.

## 5. Example training result

A representative CI-fast run (1 runner, Spider-derived dataset downloaded via `prepare_dataset.py --download`, vLLM backend on a single A100-40GB) produced:

| Step | Avg reward | Notes |
| ---- | ---------- | ----- |
| 0 | 0.08 | Random rollout before updates |
| 32 | 0.31 | First validation pass after GRPO update |
| 64 | 0.47 | Checkpoint saved (`ckpt-00064`) |
| 96 | 0.52 | Plateau; spans show stable ADK orchestration |

Your numbers will vary with model choice and dataset, but seeing validation reward rise above random baseline and spans streaming into LightningStore confirms that the ADK agent is correctly wired into Agent-lightning’s training stack.
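The trend the table claims (reward rising above the random baseline, then flattening) can be checked mechanically. The numbers below are copied from the table above:

```python
# Validation rewards from the CI-fast run table.
rewards = {0: 0.08, 32: 0.31, 64: 0.47, 96: 0.52}

steps = sorted(rewards)
values = [rewards[s] for s in steps]

# Reward should be non-decreasing across validation passes...
assert all(a <= b for a, b in zip(values, values[1:]))

# ...and the per-interval gain should shrink toward the end (plateau).
gains = [b - a for a, b in zip(values, values[1:])]
assert gains[-1] < gains[0]
```

A healthy run shows the same two properties regardless of the absolute numbers; a flat or falling curve from step 0 usually means the agent never saw the updated `"main_llm"` endpoint.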

> **Contributor:** If that's a how-to, please write the training results here.

6 changes: 5 additions & 1 deletion examples/apo/room_selector_apo.py
@@ -50,7 +50,11 @@ def main() -> None:
trainer = Trainer(
algorithm=algo,
# Increase the number of runners to run more rollouts in parallel
n_runners=8,
strategy={
> **Contributor:** why this change?

"type": "shm",
"n_runners": 8,
"main_thread": "algorithm",
},
# APO algorithm needs a baseline
# Set it either here or in the algo
initial_resources={
54 changes: 54 additions & 0 deletions examples/google_adk/README.md
@@ -0,0 +1,54 @@
# ADK Example

This folder hosts the runnable sample that pairs the ADK agent with Agent-lightning’s VERL integration. For architectural details, please read the [Train ADK Agent how-to](../../docs/how-to/train-adk-agent.md). This README focuses only on installing dependencies and running the scripts.

## Install

```bash
cd examples/google_adk
pip install "agentlightning[verl,adk]" "google-adk>=0.3.0"
# or: uv sync
```

You’ll need a machine with a 40 GB GPU (A100 or similar) for full training; CPU + smaller GPUs are fine for CI modes.

## Prepare data

Create `data/train.parquet` and `data/test.parquet` that match the `AdkTask` schema (`question`, `app_id`, `ground_truth`, optional `meta`). To download and convert the Spider dataset (same as `examples/spider`):

```bash
uv run python prepare_dataset.py --download --outdir data
```

Alternatively, convert your own JSON/CSV files using `--train` and `--test` flags. See the how-to guide for details.
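Before writing your own Parquet files, it is worth validating rows against the column list above. The checker below assumes only the columns named in this README (`question`, `app_id`, `ground_truth`, optional `meta`); the real `AdkTask` type may carry additional constraints:

```python
REQUIRED = ("question", "app_id", "ground_truth")
OPTIONAL = ("meta",)


def validate_task(row: dict) -> dict:
    """Raise ValueError if a row does not fit the AdkTask column list from this README."""
    missing = [k for k in REQUIRED if not row.get(k)]
    if missing:
        raise ValueError(f"row missing required columns: {missing}")
    unknown = sorted(set(row) - set(REQUIRED) - set(OPTIONAL))
    if unknown:
        raise ValueError(f"unexpected columns: {unknown}")
    return row
```

Running every record through a check like this before `to_parquet` catches schema drift much earlier than a failed rollout mid-training.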

## Run training

```bash
python train_adk.py \
--train-file data/train.parquet \
--val-file data/test.parquet \
--model ${OPENAI_MODEL:-meta-llama/Meta-Llama-3-8B-Instruct} \
--endpoint ${OPENAI_API_BASE:-http://localhost:8000/v1}
```

Useful flags:

- `--ci` / `--ci-fast` shrink runner count and dataset slices for smoke tests.
- `--external-store-address` connects to an existing LightningStore service.
- `--wandb-project` / `--wandb-run-name` enable Weights & Biases logging.

Environment variables the scripts read:

- `OPENAI_API_BASE`, `OPENAI_API_KEY`, `OPENAI_MODEL`
- `HF_TOKEN` (required for VERL checkpoints hosted on Hugging Face)

## Quick debug loop

Before spending GPU hours, run:

```bash
python adk_debug.py --file data/test.parquet --index 0
```

This executes a single rollout with the same ADK wiring used in training, letting you confirm credentials, dataset rows, and trace emission without launching VERL. Use `--model` and `--endpoint` overrides to point at different LLM backends.