diff --git a/doc/howto/usage/k8s/k8s_cn.md b/doc/howto/usage/k8s/k8s_cn.md index ab07cb9cd5b135ddea82b3360720537f1dc5a801..f05fb981679d49d6774ee2f0f72039e8c263f219 100644 --- a/doc/howto/usage/k8s/k8s_cn.md +++ b/doc/howto/usage/k8s/k8s_cn.md @@ -4,14 +4,15 @@ ## 制作Docker镜像 -在一个功能齐全的Kubernetes机群里,通常我们会安装Ceph等分布式文件系统来存储训练数据。这样的话,一个分布式Paddle训练任务中的每个进程都可以从Ceph读取数据。在这个例子里,我们只演示一个单机作业,所以可以简化对环境的要求,把训练数据直接放在 -Paddle的Docker image里。为此,我们需要制作一个包含训练数据的Paddle镜像。 - -Paddle 的 [Quick Start Tutorial](http://www.paddlepaddle.org/doc/demo/quick_start/index_en.html) -里介绍了用Paddle源码中的脚本下载训练数据的过程。 -而 `paddledev/paddle:cpu-demo-latest` 镜像里有 Paddle 源码与demo,( 请注意,默认的 -Paddle镜像 `paddledev/paddle:cpu-latest` 是不包括源码的, Paddle的各版本镜像可以参考 [Docker installation guide](http://www.paddlepaddle.org/doc/build/docker_install.html) ),所以我们使用这个镜像来下载训练数据到Docker container中,然后把这个包含了训练数据的container保存为一个新的镜像。 - +在一个功能齐全的Kubernetes机群里,通常我们会安装Ceph等分布式文件系统来存储训练数据。这样的话,一个分布式PaddlePaddle训练任务中 +的每个进程都可以从Ceph读取数据。在这个例子里,我们只演示一个单机作业,所以可以简化对环境的要求,把训练数据直接放在 +PaddlePaddle的Docker Image里。为此,我们需要制作一个包含训练数据的PaddlePaddle镜像。 + +PaddlePaddle的 `paddlepaddle/paddle:cpu-demo-latest` 镜像里有PaddlePaddle的源码与demo, +(请注意,默认的PaddlePaddle生产环境镜像 `paddlepaddle/paddle:latest` 是不包括源码的,PaddlePaddle的各版本镜像可以参考 +[Docker installation guide](http://paddlepaddle.org/docs/develop/documentation/zh/getstarted/build_and_install/docker_install_cn.html)), +下面我们使用这个镜像来下载数据到Docker Container中,并把这个包含了训练数据的Container保存为一个新的镜像。 + ### 运行容器 ``` diff --git a/doc/howto/usage/k8s/k8s_en.md b/doc/howto/usage/k8s/k8s_en.md index 0c3ab05b708e7a924577c26496b8c55126e76c62..c66c295e2afae9499e33d9cdb1521e653f8430aa 100644 --- a/doc/howto/usage/k8s/k8s_en.md +++ b/doc/howto/usage/k8s/k8s_en.md @@ -4,11 +4,20 @@ ## Build Docker Image -In distributed Kubernetes cluster, we will use Ceph or other shared storage system for storing training related data so that all processes in Paddle training can retrieve data from Ceph. In this example, we will only demo training job on single machine. In order to simplify the requirement of the environment, we will directly put training data into Paddle's Docker Image, so we need to create a Paddle Docker image that already includes the training data. +In distributed Kubernetes cluster, we will use Ceph or other distributed +storage system for storing training related data so that all processes in +PaddlePaddle training can retrieve data from Ceph. In this example, we will +only demo training job on single machine. In order to simplify the requirement +of the environment, we will directly put training data into the PaddlePaddle Docker Image, +so we need to create a PaddlePaddle Docker image that includes the training data. + +The production Docker Image `paddlepaddle/paddle:cpu-demo-latest` has the PaddlePaddle +source code and demo. (Caution: Default PaddlePaddle Docker Image `paddlepaddle/paddle:latest` doesn't include +the source code, PaddlePaddle's different versions of Docker Image can be referred here: +[Docker Installation Guide](http://paddlepaddle.org/docs/develop/documentation/zh/getstarted/build_and_install/docker_install_en.html)), +so we run this Docker Image and download the training data, and then commit the whole +Container to be a new Docker Image. -Paddle's [Quick Start Tutorial](http://www.paddlepaddle.org/doc/demo/quick_start/index_en.html) introduces how to download and train data by using script from Paddle's source code. -And `paddledev/paddle:cpu-demo-latest` image has the Paddle source code and demo. (Caution: Default Paddle image `paddledev/paddle:cpu-latest` doesn't include the source code, Paddle's different versions of image can be referred here: [Docker installation guide](http://www.paddlepaddle.org/doc/build/docker_install.html)), so we run this container and download the training data, and then commit the whole container to be a new Docker image. - ### Run Docker Container ```