kuberay-operator¶
A Helm chart for deploying the KubeRay operator on Kubernetes.
Homepage: https://github.com/ray-project/kuberay
Introduction¶
This document explains how to install the CRDs (RayCluster, RayJob, RayService) and the KubeRay operator with a Helm chart.
Prerequisites¶
- Kubernetes
- Helm >= 3
Make sure you are running Helm v3 or later. The existing CI tests run against Helm v3.4.1 and v3.9.4.
helm version
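The version requirement above can also be checked in a script. The following is a minimal sketch, assuming `helm version --short` prints a string of the form `v3.9.4+g...`; the helper names `helm_major` and `helm_is_supported` are hypothetical and not part of Helm or KubeRay.

```shell
# Extract the major version from a Helm version string such as "v3.9.4+gdbc6d8e".
helm_major() {
  ver="${1#v}"        # drop the leading "v"
  echo "${ver%%.*}"   # keep everything before the first dot
}

# Succeed only when the given version string is Helm 3 or newer.
helm_is_supported() {
  [ "$(helm_major "$1")" -ge 3 ]
}

# Query Helm only when it is actually installed.
if command -v helm >/dev/null 2>&1; then
  current="$(helm version --short)"
  if helm_is_supported "$current"; then
    echo "Helm $current is supported"
  else
    echo "Helm $current is too old; install Helm v3+" >&2
  fi
fi
```

Parsing only the major version keeps the check independent of the exact patch releases used in CI.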
Install CRDs and KubeRay operator¶
- Install a stable version via Helm repository (only supports KubeRay v0.4.0+)
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
# Install both CRDs and KubeRay operator v1.1.0.
helm install kuberay-operator kuberay/kuberay-operator --version 1.1.0
# Check the KubeRay operator Pod in `default` namespace
kubectl get pods
# NAME READY STATUS RESTARTS AGE
# kuberay-operator-6fcbb94f64-mbfnr 1/1 Running 0 17s
- Install the nightly version
# Step1: Clone KubeRay repository
git clone https://github.com/ray-project/kuberay.git
# Step2: Move to `helm-chart/kuberay-operator`
cd kuberay/helm-chart/kuberay-operator
# Step3: Install KubeRay operator
helm install kuberay-operator .
- Install KubeRay operator without installing CRDs
- In some cases, the installation of the CRDs and the installation of the operator may require different levels of admin permissions, so these two installations could be handled as different steps by different roles.
- Use Helm's built-in `--skip-crds` flag to install the operator only. See this document for more details.
# Step 1: Install CRDs only (for cluster admin)
kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/crd?ref=v1.1.0&timeout=90s"
# Step 2: Install KubeRay operator only. (for developer)
helm install kuberay-operator kuberay/kuberay-operator --version 1.1.0 --skip-crds
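Because the two steps may be run by different people, it helps to pin both to the same KubeRay version. The sketch below wraps the commands above in shell functions that share one version variable. The function names and the `KUBERAY_VERSION` variable are hypothetical, and the CRD names in `verify_crds` assume the CRDs are registered under the `ray.io` API group.

```shell
# Single source of truth for the release, so the CRDs and the chart cannot drift apart.
KUBERAY_VERSION="${KUBERAY_VERSION:-1.1.0}"

# Build the kustomize URL for the CRDs of a given KubeRay version.
crd_url() {
  echo "github.com/ray-project/kuberay/ray-operator/config/crd?ref=v$1&timeout=90s"
}

# Step 1 (cluster admin): install the CRDs only.
install_crds() {
  kubectl create -k "$(crd_url "$KUBERAY_VERSION")"
}

# Optional check between the two steps (assumed CRD names): all three should be listed.
verify_crds() {
  kubectl get crd rayclusters.ray.io rayjobs.ray.io rayservices.ray.io
}

# Step 2 (developer): install the operator and skip the CRDs bundled in the chart.
install_operator() {
  helm install kuberay-operator kuberay/kuberay-operator \
    --version "$KUBERAY_VERSION" --skip-crds
}
```

A cluster admin would run `install_crds && verify_crds`, after which a developer runs `install_operator`. Nothing is executed until one of the functions is called.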
List the chart¶
To list the `kuberay-operator` release:
helm ls
# NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
# kuberay-operator default 1 2023-09-22 02:57:17.306616331 +0000 UTC deployed kuberay-operator-1.1.0
Uninstall the Chart¶
# Uninstall the `kuberay-operator` release
helm uninstall kuberay-operator
# The operator Pod should be removed.
kubectl get pods
# No resources found in default namespace.
Working with Argo CD¶
If you use Argo CD to manage the operator, the sync fails with an error complaining that the CRDs are too long, the same problem described in this issue. The recommended solution is to split the operator into two Argo CD applications:
- The first app just installs the CRDs with `Replace=true` directly, snippet:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ray-operator-crds
spec:
  project: default
  source:
    repoURL: https://github.com/ray-project/kuberay
    targetRevision: v1.0.0-rc.0
    path: helm-chart/kuberay-operator/crds
  destination:
    server: https://kubernetes.default.svc
  syncPolicy:
    syncOptions:
    - Replace=true
...
- The second app installs the Helm chart with `skipCrds=true` (new feature in Argo CD 2.3.0), snippet:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ray-operator
spec:
  source:
    repoURL: https://github.com/ray-project/kuberay
    targetRevision: v1.0.0-rc.0
    path: helm-chart/kuberay-operator
    helm:
      skipCrds: true
  destination:
    server: https://kubernetes.default.svc
    namespace: ray-operator
  syncPolicy:
    syncOptions:
    - CreateNamespace=true
...
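To see which CRD files are likely to trigger the problem, their size can be compared against Kubernetes' 262144-byte limit on the total size of an object's annotations. This is a rough sketch, assuming the Argo CD error is the usual `metadata.annotations: Too long` failure caused by client-side apply storing the whole manifest in an annotation. The YAML file size only approximates the stored annotation, and the helper names are hypothetical.

```shell
# Kubernetes rejects objects whose annotations exceed 256 KiB in total.
ANNOTATION_LIMIT_BYTES=262144

# Succeed when the file is larger than the annotation limit.
exceeds_annotation_limit() {
  size="$(wc -c < "$1" | tr -d ' ')"
  [ "$size" -gt "$ANNOTATION_LIMIT_BYTES" ]
}

# Print a one-line verdict for every YAML file in a directory.
report_crds() {
  for f in "$1"/*.yaml; do
    [ -e "$f" ] || continue   # skip when the glob matched nothing
    if exceeds_annotation_limit "$f"; then
      echo "$f: too large for client-side apply, sync with Replace=true"
    else
      echo "$f: within the limit"
    fi
  done
}
```

From a checkout of the KubeRay repository, `report_crds helm-chart/kuberay-operator/crds` lists each CRD file with its verdict.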
Values¶
Key | Type | Default | Description |
---|---|---|---|
nameOverride | string | "kuberay-operator" | String to partially override release name. |
fullnameOverride | string | "kuberay-operator" | String to fully override release name. |
componentOverride | string | "kuberay-operator" | String to override component name. |
image.repository | string | "quay.io/kuberay/operator" | Image repository. |
image.tag | string | "nightly" | Image tag. |
image.pullPolicy | string | "IfNotPresent" | Image pull policy. |
labels | object | {} | Extra labels. |
annotations | object | {} | Extra annotations. |
serviceAccount.create | bool | true | Specifies whether a service account should be created. |
serviceAccount.name | string | "kuberay-operator" | The name of the service account to use. If not set and create is true, a name is generated using the fullname template. |
logging.stdoutEncoder | string | "json" | Log encoder to use for stdout (one of json or console). |
logging.fileEncoder | string | "json" | Log encoder to use for file logging (one of json or console). |
logging.baseDir | string | "" | Directory for kuberay-operator log file. |
logging.fileName | string | "" | File name for kuberay-operator log file. |
logging.sizeLimit | string | "" | EmptyDir volume size limit for kuberay-operator log file. |
batchScheduler.enabled | bool | false | |
batchScheduler.name | string | "" | |
featureGates[0].name | string | "RayClusterStatusConditions" | |
featureGates[0].enabled | bool | true | |
featureGates[1].name | string | "RayJobDeletionPolicy" | |
featureGates[1].enabled | bool | false | |
metrics.enabled | bool | true | Whether KubeRay operator should emit control plane metrics. |
operatorCommand | string | "/manager" | Path to the operator binary |
leaderElectionEnabled | bool | true | If leaderElectionEnabled is set to true, the KubeRay operator will use leader election for high availability. |
rbacEnable | bool | true | If rbacEnable is set to false, no RBAC resources will be created, including the Role for leader election, the Role for Pods and Services, and so on. |
crNamespacedRbacEnable | bool | true | When crNamespacedRbacEnable is set to true, the KubeRay operator will create a Role for RayCluster preparation (e.g., Pods, Services) and a corresponding RoleBinding for each namespace listed in the "watchNamespace" parameter. Please note that even if crNamespacedRbacEnable is set to false, the Role and RoleBinding for leader election will still be created. Note: (1) This variable is only effective when rbacEnable and singleNamespaceInstall are both set to true. (2) In most cases, it should be set to true, unless you are using a Kubernetes cluster managed by GitOps tools such as ArgoCD. |
singleNamespaceInstall | bool | false | When singleNamespaceInstall is true: - Install namespaced RBAC resources such as Role and RoleBinding instead of cluster-scoped ones like ClusterRole and ClusterRoleBinding so that the chart can be installed by users with permissions restricted to a single namespace. (Please note that this excludes the CRDs, which can only be installed at the cluster scope.) - If "watchNamespace" is not set, the KubeRay operator will, by default, only listen to resource events within its own namespace. |
env | string | nil | Environment variables. |
resources | object | {"limits":{"cpu":"100m","memory":"512Mi"}} | Resource requests and limits for containers. |
livenessProbe.initialDelaySeconds | int | 10 | |
livenessProbe.periodSeconds | int | 5 | |
livenessProbe.failureThreshold | int | 5 | |
readinessProbe.initialDelaySeconds | int | 10 | |
readinessProbe.periodSeconds | int | 5 | |
readinessProbe.failureThreshold | int | 5 | |
podSecurityContext | object | {} | Set up securityContext to improve Pod security. |
service.type | string | "ClusterIP" | Service type. |
service.port | int | 8080 | Service port. |
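The values in the table can be overridden with a values file. The sketch below writes a small override file and defines two helpers that pass it to Helm. The file name `kuberay-values.yaml`, the helper names, and the chosen settings (console logging, single-namespace install, a 1Gi memory limit) are illustrative assumptions, not recommended defaults.

```shell
VALUES_FILE="kuberay-values.yaml"

# Write the overrides; every key used here appears in the table above.
cat > "$VALUES_FILE" <<'EOF'
logging:
  stdoutEncoder: console
singleNamespaceInstall: true
resources:
  limits:
    cpu: 100m
    memory: 1Gi
EOF

# Render the manifests locally to review the effect of the overrides.
render_chart() {
  helm template kuberay-operator kuberay/kuberay-operator \
    --version 1.1.0 -f "$VALUES_FILE"
}

# Install the operator with the overrides applied.
install_with_values() {
  helm install kuberay-operator kuberay/kuberay-operator \
    --version 1.1.0 -f "$VALUES_FILE"
}
```

Running `render_chart` first shows the resulting manifests without touching the cluster. Single settings can also be passed on the command line, for example `--set singleNamespaceInstall=true`.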