@@ -1,8 +1,8 @@
-FROM anyscale/ray:2.49.0-slim-py312-cu128
+FROM anyscale/ray:2.52.0-slim-py312-cu128

# C compiler for Triton’s runtime build step (vLLM V1 engine)
# https://github.com/vllm-project/vllm/issues/2997
RUN sudo apt-get update && \
    sudo apt-get install -y --no-install-recommends build-essential

-RUN pip install vllm==0.10.1
+RUN pip install vllm==0.11.0
doc/source/serve/tutorials/deployment-serve-llm/gpt-oss/README.md (18 additions & 17 deletions)
@@ -200,33 +200,21 @@ For production deployment, use Anyscale services to deploy the Ray Serve app to

### Launch the service

-Anyscale provides out-of-the-box images (`anyscale/ray-llm`), which come pre-loaded with Ray Serve LLM, vLLM, and all required GPU and runtime dependencies. See the [Anyscale base images](https://docs.anyscale.com/reference/base-images) for details on what each image includes.
+Anyscale provides out-of-the-box images (`anyscale/ray-llm`) which come pre-loaded with Ray Serve LLM, vLLM, and all required GPU/runtime dependencies. This makes it easy to get started without building a custom image.

-Build a minimal Dockerfile:
-```Dockerfile
-FROM anyscale/ray:2.49.0-slim-py312-cu128
-
-# C compiler for Triton’s runtime build step (vLLM V1 engine)
-# https://github.com/vllm-project/vllm/issues/2997
-RUN sudo apt-get update && \
-    sudo apt-get install -y --no-install-recommends build-essential
-
-RUN pip install vllm==0.10.1
-```

-Create your Anyscale service configuration in a new `service.yaml` file and reference the Dockerfile with `containerfile`:
+Create your Anyscale Service configuration in a new `service.yaml` file:

```yaml
# service.yaml
-name: deploy-gpt-oss
-containerfile: ./Dockerfile # Build Ray Serve LLM with vllm==0.10.1
+name: deploy-llama-3-8b
+image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
compute_config:
  auto_select_worker_config: true
working_dir: .
cloud:
applications:
  # Point to your app in your Python module
-  - import_path: serve_gpt_oss:app
+  - import_path: serve_llama_3_1_8b:app
```


@@ -237,6 +225,19 @@ Deploy your service:
anyscale service deploy -f service.yaml
```
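Once the rollout finishes, the service exposes an OpenAI-compatible HTTP API (Ray Serve LLM serves chat traffic on the `/v1/chat/completions` route). The following stdlib-only sketch shows one way to query it; the base URL, bearer token, and model ID below are placeholders, so substitute the values that `anyscale service deploy` prints and the model ID from your app's config:

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def send_chat_request(base_url: str, token: str, payload: dict) -> dict:
    """POST the payload to the service's /v1/chat/completions route."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder model ID; use the one defined in your serve app's config.
    payload = build_chat_request("my-model-id", "What is Ray Serve?")
    print(json.dumps(payload, indent=2))
    # Placeholder endpoint and token from the `anyscale service deploy` output:
    # send_chat_request("https://<service-url>", "<token>", payload)
```

The `openai` Python client should work against the same route as well, pointed at the service base URL.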

+**Custom Dockerfile**
+You can customize the container by building your own Dockerfile. In your Anyscale Service config, reference the Dockerfile with `containerfile` (instead of `image_uri`):
+
+```yaml
+# service.yaml
+# Replace:
+# image_uri: anyscale/ray-llm:2.52.0-py311-cu128
+
+# with:
+containerfile: ./Dockerfile
+```
+
+See the [Anyscale base images](https://docs.anyscale.com/reference/base-images) for details on what each image includes.

---

@@ -263,33 +263,21 @@
"\n",
"### Launch the service\n",
"\n",
-"Anyscale provides out-of-the-box images (`anyscale/ray-llm`), which come pre-loaded with Ray Serve LLM, vLLM, and all required GPU and runtime dependencies. See the [Anyscale base images](https://docs.anyscale.com/reference/base-images) for details on what each image includes.\n",
+"Anyscale provides out-of-the-box images (`anyscale/ray-llm`) which come pre-loaded with Ray Serve LLM, vLLM, and all required GPU/runtime dependencies. This makes it easy to get started without building a custom image.\n",
"\n",
-"Build a minimal Dockerfile:\n",
-"```Dockerfile\n",
-"FROM anyscale/ray:2.49.0-slim-py312-cu128\n",
-"\n",
-"# C compiler for Triton’s runtime build step (vLLM V1 engine)\n",
-"# https://github.com/vllm-project/vllm/issues/2997\n",
-"RUN sudo apt-get update && \\\n",
-"    sudo apt-get install -y --no-install-recommends build-essential\n",
-"\n",
-"RUN pip install vllm==0.10.1\n",
-"```\n",
"\n",
-"Create your Anyscale service configuration in a new `service.yaml` file and reference the Dockerfile with `containerfile`:\n",
+"Create your Anyscale Service configuration in a new `service.yaml` file:\n",
"\n",
"```yaml\n",
"# service.yaml\n",
-"name: deploy-gpt-oss\n",
-"containerfile: ./Dockerfile # Build Ray Serve LLM with vllm==0.10.1\n",
+"name: deploy-llama-3-8b\n",
+"image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.\n",
"compute_config:\n",
"  auto_select_worker_config: true \n",
"working_dir: .\n",
"cloud:\n",
"applications:\n",
"  # Point to your app in your Python module\n",
-"  - import_path: serve_gpt_oss:app\n",
+"  - import_path: serve_llama_3_1_8b:app\n",
"```\n",
"\n",
"\n",
@@ -311,6 +299,19 @@
"id": "7e6de36c",
"metadata": {},
"source": [
+"**Custom Dockerfile** \n",
+"You can customize the container by building your own Dockerfile. In your Anyscale Service config, reference the Dockerfile with `containerfile` (instead of `image_uri`):\n",
+"\n",
+"```yaml\n",
+"# service.yaml\n",
+"# Replace:\n",
+"# image_uri: anyscale/ray-llm:2.52.0-py311-cu128\n",
+"\n",
+"# with:\n",
+"containerfile: ./Dockerfile\n",
+"```\n",
+"\n",
+"See the [Anyscale base images](https://docs.anyscale.com/reference/base-images) for details on what each image includes.\n",
"\n",
"---\n",
"\n",
@@ -1,6 +1,6 @@
# service.yaml
name: deploy-gpt-oss
-containerfile: ./Dockerfile # Build Ray Serve LLM with vllm==0.10.1
+image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
compute_config:
  auto_select_worker_config: true
working_dir: .
@@ -1,8 +1,8 @@
-FROM anyscale/ray:2.49.0-slim-py312-cu128
+FROM anyscale/ray:2.52.0-slim-py312-cu128

# C compiler for Triton’s runtime build step (vLLM V1 engine)
# https://github.com/vllm-project/vllm/issues/2997
RUN sudo apt-get update && \
    sudo apt-get install -y --no-install-recommends build-essential

-RUN pip install vllm==0.10.0
+RUN pip install vllm==0.11.0
@@ -190,7 +190,7 @@ Create your Anyscale service configuration in a new `service.yaml` file:
```yaml
#service.yaml
name: deploy-deepseek-r1
-image_uri: anyscale/ray-llm:2.49.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
+image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
compute_config:
  auto_select_worker_config: true
  # Change default disk size to 1000GB
@@ -241,7 +241,7 @@
"```yaml\n",
"#service.yaml\n",
"name: deploy-deepseek-r1\n",
-"image_uri: anyscale/ray-llm:2.49.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.\n",
+"image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.\n",
"compute_config:\n",
"  auto_select_worker_config: true \n",
"  # Change default disk size to 1000GB\n",
@@ -1,6 +1,6 @@
#service.yaml
name: deploy-deepseek-r1
-image_uri: anyscale/ray-llm:2.49.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
+image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
compute_config:
  auto_select_worker_config: true
  # Change default disk size to 1000GB
@@ -1,8 +1,8 @@
-FROM anyscale/ray:2.49.0-slim-py312-cu128
+FROM anyscale/ray:2.52.0-slim-py312-cu128

# C compiler for Triton’s runtime build step (vLLM V1 engine)
# https://github.com/vllm-project/vllm/issues/2997
RUN sudo apt-get update && \
    sudo apt-get install -y --no-install-recommends build-essential

-RUN pip install vllm==0.10.0
+RUN pip install vllm==0.11.0
@@ -167,7 +167,7 @@ Create your Anyscale service configuration in a new `service.yaml` file:
```yaml
# service.yaml
name: deploy-llama-3-70b
-image_uri: anyscale/ray-llm:2.49.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
+image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
compute_config:
  auto_select_worker_config: true
working_dir: .
@@ -218,7 +218,7 @@
"```yaml\n",
"# service.yaml\n",
"name: deploy-llama-3-70b\n",
-"image_uri: anyscale/ray-llm:2.49.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.\n",
+"image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.\n",
"compute_config:\n",
"  auto_select_worker_config: true \n",
"working_dir: .\n",
@@ -1,6 +1,6 @@
# service.yaml
name: deploy-llama-3-70b
-image_uri: anyscale/ray-llm:2.49.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
+image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
compute_config:
  auto_select_worker_config: true
working_dir: .
@@ -1,8 +1,8 @@
-FROM anyscale/ray:2.49.0-slim-py312-cu128
+FROM anyscale/ray:2.52.0-slim-py312-cu128

# C compiler for Triton’s runtime build step (vLLM V1 engine)
# https://github.com/vllm-project/vllm/issues/2997
RUN sudo apt-get update && \
    sudo apt-get install -y --no-install-recommends build-essential

-RUN pip install vllm==0.10.0
+RUN pip install vllm==0.11.0
@@ -163,7 +163,7 @@ Create your Anyscale Service configuration in a new `service.yaml` file:
```yaml
# service.yaml
name: deploy-llama-3-8b
-image_uri: anyscale/ray-llm:2.49.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
+image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
compute_config:
  auto_select_worker_config: true
working_dir: .
@@ -214,7 +214,7 @@
"```yaml\n",
"# service.yaml\n",
"name: deploy-llama-3-8b\n",
-"image_uri: anyscale/ray-llm:2.49.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.\n",
+"image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.\n",
"compute_config:\n",
"  auto_select_worker_config: true \n",
"working_dir: .\n",
@@ -1,6 +1,6 @@
# service.yaml
name: deploy-llama-3-8b
-image_uri: anyscale/ray-llm:2.49.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
+image_uri: anyscale/ray-llm:2.52.0-py311-cu128 # Anyscale Ray Serve LLM image. Use `containerfile: ./Dockerfile` to use a custom Dockerfile.
compute_config:
  auto_select_worker_config: true
working_dir: .