Commit cc8dba30 · Author: Rajadeepan D Ramesh · Committer: Kubernetes Prow Robot

Update scheduler name (#741)

Parent 43aec15e
......@@ -4,18 +4,18 @@ description = "How to schedule a job with gang-scheduling"
weight = 30
+++
This guide describes how to use kube-batch to support gang-scheduling in
This guide describes how to use [volcano scheduler](https://github.com/volcano-sh/scheduler) to support gang-scheduling in
Kubeflow, to allow jobs to run multiple pods at the same time.
## Running jobs with gang-scheduling
To use gang-scheduling, you have to install kube-batch in your cluster first as a secondary scheduler of Kubernetes and configure operator to enable gang-scheduling.
To use gang-scheduling, you have to install volcano scheduler in your cluster first as a secondary scheduler of Kubernetes and configure operator to enable gang-scheduling.
* Kube-batch's introduction is [here](https://github.com/kubernetes-sigs/kube-batch), and also check how to install it [here](https://github.com/kubernetes-sigs/kube-batch/blob/master/doc/usage/tutorial.md).
* Volcano's scheduler repo is [here](https://github.com/volcano-sh/scheduler) and check how to install it [here](https://github.com/volcano-sh/volcano).
* Take tf-operator for example, enable gang-scheduling in tf-operator by setting true to `--enable-gang-scheduling` flag.
**Note:** Kube-batch and operator in Kubeflow achieve gang-scheduling by using pdb. operator will create the pdb of the job automatically. You can know more about pdb [here](https://kubernetes.io/docs/tasks/run-application/configure-pdb/).
**Note:** Volcano scheduler and operator in Kubeflow achieve gang-scheduling by using pdb. operator will create the pdb of the job automatically. You can know more about pdb [here](https://kubernetes.io/docs/tasks/run-application/configure-pdb/).
To use kube-batch to schedule your job as a gang, you have to specify the schedulerName in each replica; for example.
To use volcano scheduler to schedule your job as a gang, you have to specify the schedulerName in each replica; for example.
```yaml
apiVersion: "kubeflow.org/v1beta1"
......@@ -74,16 +74,16 @@ spec:
restartPolicy: OnFailure
```
## About kube-batch and gang-scheduling
With using kube-batch to apply gang-scheduling, a job can run only if there are enough resources for all the pods of the job. Otherwise, all the pods will be in pending state waiting for enough resources. For example, if a job requiring N pods is created and there are only enough resources to schedule N-2 pods, then N pods of the job will stay pending.
## About volcano scheduler and gang-scheduling
With using volcano scheduler to apply gang-scheduling, a job can run only if there are enough resources for all the pods of the job. Otherwise, all the pods will be in pending state waiting for enough resources. For example, if a job requiring N pods is created and there are only enough resources to schedule N-2 pods, then N pods of the job will stay pending.
**Note:** when in a high workload, if a pod of the job dies when the job is still running, it might give other pods chance to occupied the resources and cause deadlock.
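The all-or-nothing rule described above can be illustrated with a minimal sketch. This is not how volcano scheduler is implemented; the function name `pods_to_start` and the idea of reducing cluster capacity to a single count of schedulable pods are assumptions made purely for illustration.

```python
def pods_to_start(required_pods: int, schedulable_pods: int) -> int:
    """Hypothetical all-or-nothing admission check for one job.

    required_pods:    number of pods the job needs (N in the text above).
    schedulable_pods: how many of those pods the cluster could place right now
                      (an illustrative simplification of real resource accounting).
    Returns the number of pods that may start: either all of them or none.
    """
    if required_pods <= 0:
        raise ValueError("required_pods must be positive")
    # Gang semantics: partial placement is never allowed.
    return required_pods if schedulable_pods >= required_pods else 0


if __name__ == "__main__":
    n = 4
    print(pods_to_start(n, n - 2))  # room for only N-2 pods: nothing starts
    print(pods_to_start(n, n))      # room for all N pods: the whole gang starts
```

With room for only N-2 pods the function returns 0, which corresponds to all N pods staying pending.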
## Troubleshooting
If you keep getting problems related to RBAC in your kube-batch.
If you keep getting problems related to RBAC in your volcano scheduler.
You can try to add the following rules into your clusterrole of scheduler used by kube-batch.
You can try to add the following rules into your clusterrole of scheduler used by volcano scheduler.
```
- apiGroups:
- '*'
......