DI Orchestrator

DI Orchestrator manages DI (Decision Intelligence) jobs on Kubernetes using a Custom Resource and an Operator.

Prerequisites

A working Kubernetes cluster. Follow the Kubernetes instructions to create a cluster, or set up a local Kubernetes node with kind or minikube.

Install DI Orchestrator

DI Orchestrator consists of two components: di-operator and di-server. Install them with the following command.

kubectl create -f ./config/di-manager.yaml

di-operator and di-server are installed in the di-system namespace. Check that both pods are running:

$ kubectl get pod -n di-system
NAME                               READY   STATUS    RESTARTS   AGE
di-operator-57cc65d5c9-5vnvn       1/1     Running   0          59s
di-server-7b86ff8df4-jfgmp         1/1     Running   0          59s
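
The check above can be scripted. The following is a minimal sketch, not part of DI Orchestrator: the helper name all_pods_running is hypothetical, and it assumes the default `kubectl get pod` table layout shown above, with STATUS in the third column.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DI Orchestrator).
# Reads `kubectl get pod` table output on stdin and succeeds only if
# at least one pod is listed and every pod's STATUS column is "Running".
all_pods_running() {
  awk 'NR > 1 && NF { n++; if ($3 != "Running") bad++ }
       END { exit ((n > 0 && bad == 0) ? 0 : 1) }'
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl get pod -n di-system | all_pods_running && echo "DI Orchestrator is up"
```

The helper only inspects text, so it can be tried on saved output without a cluster. On a live cluster, `kubectl wait --for=condition=Ready pod --all -n di-system --timeout=120s` is an alternative that blocks until the pods are ready.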

Submit DIJob

# submit DIJob
$ kubectl create -f config/samples/dijob-gobigger.yaml

# list the pods: the coordinator is created by di-operator
# a few seconds later, collectors and learners are created by di-server
$ kubectl get pod
NAME                READY   STATUS    RESTARTS   AGE
gobigger-test-0-0   1/1     Running   0          4m17s
gobigger-test-0-1   1/1     Running   0          4m17s

# get logs of coordinator
$ kubectl logs -n xlab gobigger-test-0-0
Bind subprocesses on these addresses: ['tcp://10.148.3.4:22270',
'tcp://10.148.3.4:22271']
[Warning] no enough data: 128/0
...
[Warning] no enough data: 128/120
Current Training: Train Iter(0) Loss(102.256)
Current Training: Train Iter(0) Loss(103.133)
Current Training: Train Iter(20)        Loss(28.795)
Current Training: Train Iter(20)        Loss(32.837)
...
Current Training: Train Iter(360)       Loss(12.850)
Current Training: Train Iter(340)       Loss(11.812)
Current Training: Train Iter(380)       Loss(12.892)
Current Training: Train Iter(360)       Loss(13.621)
Current Training: Train Iter(400)       Loss(15.183)
Current Training: Train Iter(380)       Loss(14.187)
Current Evaluation: Train Iter(404)     Eval Reward(-1788.326)
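
To follow training progress, the loss values can be pulled out of these logs. This is a minimal sketch under the assumption that training lines keep the exact "Current Training: Train Iter(N) Loss(X)" shape shown above; the helper name extract_loss is hypothetical and not part of DI Orchestrator.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DI Orchestrator).
# Reads coordinator logs on stdin and prints one "ITER LOSS" pair per
# training line; warnings, evaluation lines, and other output are skipped.
extract_loss() {
  sed -n 's/^Current Training: Train Iter(\([0-9]*\))[[:space:]]*Loss(\([0-9.]*\)).*$/\1 \2/p'
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl logs -n xlab gobigger-test-0-0 | extract_loss
```

The output is plain whitespace-separated text, so it can be piped into sort, awk, or a plotting tool.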