Skip to content

RayCluster

RayCluster is a custom resource definition (CRD). KubeRay operator will listen to the resource events about RayCluster and create related Kubernetes resources (e.g. Pod & Service). Hence, KubeRay operator installation and CRD registration are required for this guide.

Prerequisites

See kuberay-operator/README.md for more details. * Helm * Install custom resource definition and KubeRay operator (covered by the following end-to-end example.)

End-to-end example

# Step 1: Create a KinD cluster 
kind create cluster

# Step 2: Register a Helm chart repo
helm repo add kuberay https://ray-project.github.io/kuberay-helm/

# Step 3: Install both CRDs and KubeRay operator v1.1.0.
helm install kuberay-operator kuberay/kuberay-operator --version 1.1.0

# Step 4: Install a RayCluster custom resource
# (For x86_64 users)
helm install raycluster kuberay/ray-cluster --version 1.1.0
# (For arm64 users, e.g. Mac M1)
# See here for all available arm64 images: https://hub.docker.com/r/rayproject/ray/tags?page=1&name=aarch64
helm install raycluster kuberay/ray-cluster --version 1.1.0 --set image.tag=nightly-aarch64

# Step 5: Verify the installation of KubeRay operator and RayCluster 
kubectl get pods
# NAME                                          READY   STATUS    RESTARTS   AGE
# kuberay-operator-6fcbb94f64-gkpc9             1/1     Running   0          89s
# raycluster-kuberay-head-qp9f4                 1/1     Running   0          66s
# raycluster-kuberay-worker-workergroup-2jckt   1/1     Running   0          66s

# Step 6: Forward the port of Dashboard
kubectl port-forward svc/raycluster-kuberay-head-svc 8265:8265

# Step 7: Check 127.0.0.1:8265 for the Dashboard

# Step 8: Log in to Ray head Pod and execute a job.
kubectl exec -it ${RAYCLUSTER_HEAD_POD} -- bash
python -c "import ray; ray.init(); print(ray.cluster_resources())" # (in Ray head Pod)

# Step 9: Check 127.0.0.1:8265/#/job. The status of the job should be "SUCCEEDED".

# Step 10: Uninstall RayCluster
helm uninstall raycluster

# Step 11: Verify that RayCluster has been removed successfully
# NAME                                READY   STATUS    RESTARTS   AGE
# kuberay-operator-6fcbb94f64-gkpc9   1/1     Running   0          9m57s