<!DOCTYPE html> <!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]--> <!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]--> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title>Single-Node Training on Kubernetes - PaddlePaddle Documentation</title> <link rel="stylesheet" href="../../../_static/css/theme.css" type="text/css" /> <link rel="index" title="Index" href="../../../genindex.html"/> <link rel="search" title="Search" href="../../../search.html"/> <link rel="top" title="PaddlePaddle Documentation" href="../../../index.html"/> <link rel="up" title="Distributed Training" href="cluster_train_cn.html"/> <link rel="next" title="Distributed Training on Kubernetes" href="k8s_distributed_cn.html"/> <link rel="prev" title="Submitting Training Jobs on an OpenMPI Cluster" href="openmpi_cn.html"/> <link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" /> <link rel="stylesheet" href="../../../_static/css/override.css" type="text/css" /> <script> var _hmt = _hmt || []; (function() { var hm = document.createElement("script"); hm.src = "//hm.baidu.com/hm.js?b9a314ab40d04d805655aab1deee08ba"; var s = document.getElementsByTagName("script")[0]; s.parentNode.insertBefore(hm, s); })(); </script> <script src="../../../_static/js/modernizr.min.js"></script> </head> <body class="wy-body-for-nav" role="document"> <header class="site-header"> <div class="site-logo"> <a href="/"><img src="../../../_static/images/PP_w.png"></a> </div> <div class="site-nav-links"> <div class="site-menu"> <a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on GitHub</a> <div class="language-switcher dropdown"> <a type="button" data-toggle="dropdown"> <span>English</span> <i class="fa fa-angle-up"></i> <i class="fa fa-angle-down"></i> </a> <ul class="dropdown-menu"> <li><a href="/doc_cn">中文</a></li> <li><a href="/doc">English</a></li> </ul> </div> <ul class="site-page-links"> <li><a
href="/">Home</a></li> </ul> </div> <div class="doc-module"> <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../../../getstarted/index_cn.html">Getting Started</a></li> <li class="toctree-l1 current"><a class="reference internal" href="../../index_cn.html">Advanced Guide</a></li> <li class="toctree-l1"><a class="reference internal" href="../../../api/index_cn.html">API</a></li> <li class="toctree-l1"><a class="reference internal" href="../../../faq/index_cn.html">FAQ</a></li> </ul> <div role="search"> <form id="rtd-search-form" class="wy-form" action="../../../search.html" method="get"> <input type="text" name="q" placeholder="Search docs" /> <input type="hidden" name="check_keywords" value="yes" /> <input type="hidden" name="area" value="default" /> </form> </div> </div> </div> </header> <div class="main-content-wrap"> <nav class="doc-menu-vertical" role="navigation"> <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../../../getstarted/index_cn.html">Getting Started</a><ul> <li class="toctree-l2"><a class="reference internal" href="../../../getstarted/build_and_install/index_cn.html">Install and Build</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../../getstarted/build_and_install/pip_install_cn.html">Install Using pip</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../getstarted/build_and_install/docker_install_cn.html">Install and Run Using Docker</a></li> <li class="toctree-l3"><a class="reference internal" href="../../dev/build_cn.html">Build and Test PaddlePaddle Using Docker</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../getstarted/build_and_install/build_from_source_cn.html">Build from Source</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../../getstarted/concepts/use_concepts_cn.html">Basic Concepts</a></li> </ul> </li> <li class="toctree-l1 current"><a class="reference internal" href="../../index_cn.html">Advanced Guide</a><ul class="current"> <li class="toctree-l2"><a class="reference
internal" href="../cmd_parameter/index_cn.html">Set Command-Line Parameters</a><ul> <li class="toctree-l3"><a class="reference internal" href="../cmd_parameter/use_case_cn.html">Use Cases</a></li> <li class="toctree-l3"><a class="reference internal" href="../cmd_parameter/arguments_cn.html">Argument Overview</a></li> <li class="toctree-l3"><a class="reference internal" href="../cmd_parameter/detail_introduction_cn.html">Detailed Description</a></li> </ul> </li> <li class="toctree-l2 current"><a class="reference internal" href="cluster_train_cn.html">Distributed Training</a><ul class="current"> <li class="toctree-l3"><a class="reference internal" href="fabric_cn.html">fabric cluster</a></li> <li class="toctree-l3"><a class="reference internal" href="openmpi_cn.html">openmpi cluster</a></li> <li class="toctree-l3 current"><a class="current reference internal" href="#">kubernetes single node</a></li> <li class="toctree-l3"><a class="reference internal" href="k8s_distributed_cn.html">kubernetes distributed</a></li> <li class="toctree-l3"><a class="reference internal" href="k8s_aws_cn.html">Kubernetes Cluster Training on AWS</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../capi/index_cn.html">PaddlePaddle C-API</a><ul> <li class="toctree-l3"><a class="reference internal" href="../capi/compile_paddle_lib_cn.html">Build the PaddlePaddle Inference Library</a></li> <li class="toctree-l3"><a class="reference internal" href="../capi/organization_of_the_inputs_cn.html">Organizing Input/Output Data</a></li> <li class="toctree-l3"><a class="reference internal" href="../capi/workflow_of_capi_cn.html">C-API Workflow</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../dev/contribute_to_paddle_cn.html">How to Contribute Code</a></li> <li class="toctree-l2"><a class="reference internal" href="../../dev/write_docs_cn.html">How to Contribute to or Modify the Documentation</a></li> <li class="toctree-l2"><a class="reference internal" href="../../deep_model/rnn/index_cn.html">RNN Models</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../deep_model/rnn/rnn_config_cn.html">RNN Configuration</a></li> <li
class="toctree-l3"><a class="reference internal" href="../../deep_model/rnn/recurrent_group_cn.html">Recurrent Group Tutorial</a></li> <li class="toctree-l3"><a class="reference internal" href="../../deep_model/rnn/hierarchical_layer_cn.html">Layers That Accept Two-Level Sequences as Input</a></li> <li class="toctree-l3"><a class="reference internal" href="../../deep_model/rnn/hrnn_rnn_api_compare_cn.html">Comparing the Single-Level and Two-Level RNN APIs</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../optimization/gpu_profiling_cn.html">GPU Profiling and Tuning</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../../../api/index_cn.html">API</a><ul> <li class="toctree-l2"><a class="reference internal" href="../../../api/v2/model_configs.html">Model Configuration</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/config/activation.html">Activation</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/config/layer.html">Layers</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/config/evaluators.html">Evaluators</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/config/optimizer.html">Optimizer</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/config/pooling.html">Pooling</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/config/networks.html">Networks</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/config/attr.html">Parameter Attribute</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../../api/v2/data.html">Data Access</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/data/data_reader.html">Data Reader Interface</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/data/image.html">Image Interface</a></li> <li class="toctree-l3"><a class="reference internal"
href="../../../api/v2/data/dataset.html">Dataset</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../../api/v2/run_logic.html">Training and Inference</a></li> <li class="toctree-l2"><a class="reference internal" href="../../../api/v2/fluid.html">Fluid</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/layers.html">layers</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/data_feeder.html">data_feeder</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/executor.html">executor</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/initializer.html">initializer</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/evaluator.html">evaluator</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/nets.html">nets</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/optimizer.html">optimizer</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/param_attr.html">param_attr</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/profiler.html">profiler</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/regularizer.html">regularizer</a></li> <li class="toctree-l3"><a class="reference internal" href="../../../api/v2/fluid/io.html">io</a></li> </ul> </li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../../../faq/index_cn.html">FAQ</a><ul> <li class="toctree-l2"><a class="reference internal" href="../../../faq/build_and_install/index_cn.html">Build, Install, and Unit Tests</a></li> <li class="toctree-l2"><a class="reference internal" href="../../../faq/model/index_cn.html">Model Configuration</a></li> <li class="toctree-l2"><a class="reference internal" href="../../../faq/parameter/index_cn.html">Parameter Settings</a></li>
<li class="toctree-l2"><a class="reference internal" href="../../../faq/local/index_cn.html">Local Training and Prediction</a></li> <li class="toctree-l2"><a class="reference internal" href="../../../faq/cluster/index_cn.html">Cluster Training and Prediction</a></li> </ul> </li> </ul> </nav> <section class="doc-content-wrap"> <div role="navigation" aria-label="breadcrumbs navigation"> <ul class="wy-breadcrumbs"> <li><a href="../../index_cn.html">Advanced Guide</a> > </li> <li><a href="cluster_train_cn.html">Distributed Training</a> > </li> <li>Single-Node Training on Kubernetes</li> </ul> </div> <div class="wy-nav-content" id="doc-content"> <div class="rst-content"> <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article"> <div itemprop="articleBody"> <div class="section" id="kubernetes"> <span id="kubernetes"></span><h1>Single-Node Training on Kubernetes<a class="headerlink" href="#kubernetes" title="Permalink to this headline">¶</a></h1> <p>This document describes how to launch a single-node, CPU-only PaddlePaddle training job on a Kubernetes cluster. The next document describes how to launch a distributed training job.</p> <div class="section" id="docker"> <span id="docker"></span><h2>Build the Docker Image<a class="headerlink" href="#docker" title="Permalink to this headline">¶</a></h2> <p>A fully featured Kubernetes cluster usually has a distributed file system such as Ceph installed to store training data, so that every process of a distributed PaddlePaddle training job can read its data from Ceph. This example only demonstrates a single-node job, so we can relax the environment requirements and put the training data directly into the PaddlePaddle Docker image. To do that, we need to build a PaddlePaddle image that contains the training data.</p> <p>The <code class="docutils literal"><span class="pre">paddlepaddle/paddle:cpu-demo-latest</span></code> image contains the PaddlePaddle source code and demos. (Note that the default production image, <code class="docutils literal"><span class="pre">paddlepaddle/paddle:latest</span></code>, does not include the source code; for the available image versions, see the <a class="reference external" href="http://paddlepaddle.org/docs/develop/documentation/zh/getstarted/build_and_install/docker_install_cn.html">Docker Installation Guide</a>.) Below, we use this image to download the data into a Docker container and then save that container, which now holds the training data, as a new image.</p> <div class="section" id=""> <span id="id1"></span><h3>Run the Container<a class="headerlink" href="#" title="Permalink to this headline">¶</a></h3> <div class="highlight-default"><div
class="highlight"><pre><span></span>$ docker run --name quick_start_data -it paddlepaddle/paddle:cpu-demo-latest
</pre></div> </div> </div> <div class="section" id=""> <span id="id2"></span><h3>Download the Data<a class="headerlink" href="#" title="Permalink to this headline">¶</a></h3> <p>Inside the container, change to the <code class="docutils literal"><span class="pre">/root/paddle/demo/quick_start/data</span></code> directory and run <code class="docutils literal"><span class="pre">get_data.sh</span></code> to download the data:</p> <div class="highlight-default"><div class="highlight"><pre><span></span>root@fbd1f2bb71f4:~/paddle/demo/quick_start/data# ./get_data.sh
Downloading Amazon Electronics reviews data...
--2016-10-31 01:33:43-- http://snap.stanford.edu/data/amazon/productGraph/categoryFiles/reviews_Electronics_5.json.gz
Resolving snap.stanford.edu (snap.stanford.edu)... 171.64.75.80
Connecting to snap.stanford.edu (snap.stanford.edu)|171.64.75.80|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 495854086 (473M) [application/x-gzip]
Saving to: 'reviews_Electronics_5.json.gz'

10% [=======> ] 874,279 64.7KB/s eta 2h 13m
</pre></div> </div> </div> <div class="section" id=""> <span id="id3"></span><h3>Modify the Startup Script<a class="headerlink" href="#" title="Permalink to this headline">¶</a></h3> <p>After the data has been downloaded, edit <code class="docutils literal"><span class="pre">/root/paddle/demo/quick_start/train.sh</span></code> so that it reads as follows (the only change is the added <code class="docutils literal"><span class="pre">cd</span></code> command):</p> <div class="highlight-default"><div class="highlight"><pre><span></span>set -e
# Added line: run from the demo directory so the relative paths below resolve.
cd /root/paddle/demo/quick_start
cfg=trainer_config.lr.py
#cfg=trainer_config.emb.py
#cfg=trainer_config.cnn.py
#cfg=trainer_config.lstm.py
#cfg=trainer_config.bidi-lstm.py
#cfg=trainer_config.db-lstm.py
paddle train \
  --config=$cfg \
  --save_dir=./output \
  --trainer_count=4 \
  --log_period=20 \
  --num_passes=15 \
  --use_gpu=false \
  --show_parameter_stats_period=100 \
  --test_all_data_in_one_period=1 \
  2>&1 | tee 'train.log'
</pre></div> </div> </div> <div class="section" id=""> <span id="id4"></span><h3>Commit the Image<a class="headerlink" href="#"
title="Permalink to this headline">¶</a></h3> <p>After modifying the startup script, exit the container and use the <code class="docutils literal"><span class="pre">docker</span> <span class="pre">commit</span></code> command to create a new image.</p> <div class="highlight-default"><div class="highlight"><pre><span></span>$ docker commit quick_start_data mypaddle/paddle:quickstart
</pre></div> </div> </div> </div> <div class="section" id="kubernetes"> <span id="id5"></span><h2>Train with Kubernetes<a class="headerlink" href="#kubernetes" title="Permalink to this headline">¶</a></h2> <blockquote> <div>For workloads whose container exits on its own once the task completes, Kubernetes provides the Job resource type. The steps below use a Job to run the training.</div></blockquote> <div class="section" id="yaml"> <span id="yaml"></span><h3>Write the YAML File<a class="headerlink" href="#yaml" title="Permalink to this headline">¶</a></h3> <p>The training output may be deleted when the container is destroyed, so a volume must be mounted before the container is created in order to preserve the results. Using the image built earlier, we can create a <a class="reference external" href="http://kubernetes.io/docs/user-guide/jobs/#what-is-a-job">Kubernetes Job</a>. A simple YAML file looks like this:</p> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">apiVersion</span><span class="p">:</span> <span class="n">batch</span><span class="o">/</span><span class="n">v1</span>
<span class="n">kind</span><span class="p">:</span> <span class="n">Job</span>
<span class="n">metadata</span><span class="p">:</span>
  <span class="n">name</span><span class="p">:</span> <span class="n">quickstart</span>
<span class="n">spec</span><span class="p">:</span>
  <span class="n">parallelism</span><span class="p">:</span> <span class="mi">1</span>
  <span class="n">completions</span><span class="p">:</span> <span class="mi">1</span>
  <span class="n">template</span><span class="p">:</span>
    <span class="n">metadata</span><span class="p">:</span>
      <span class="n">name</span><span class="p">:</span> <span class="n">quickstart</span>
    <span class="n">spec</span><span class="p">:</span>
      <span class="n">volumes</span><span class="p">:</span>
      <span class="o">-</span> <span class="n">name</span><span class="p">:</span> <span class="n">output</span>
        <span
class="n">hostPath</span><span class="p">:</span>
          <span class="n">path</span><span class="p">:</span> <span class="o">/</span><span class="n">home</span><span class="o">/</span><span class="n">work</span><span class="o">/</span><span class="n">paddle_output</span>
      <span class="n">containers</span><span class="p">:</span>
      <span class="o">-</span> <span class="n">name</span><span class="p">:</span> <span class="n">pi</span>
        <span class="n">image</span><span class="p">:</span> <span class="n">mypaddle</span><span class="o">/</span><span class="n">paddle</span><span class="p">:</span><span class="n">quickstart</span>
        <span class="n">command</span><span class="p">:</span> <span class="p">[</span><span class="s2">"/bin/bash"</span><span class="p">,</span> <span class="s2">"-c"</span><span class="p">,</span> <span class="s2">"/root/paddle/demo/quick_start/train.sh"</span><span class="p">]</span>
        <span class="n">volumeMounts</span><span class="p">:</span>
        <span class="o">-</span> <span class="n">name</span><span class="p">:</span> <span class="n">output</span>
          <span class="n">mountPath</span><span class="p">:</span> <span class="o">/</span><span class="n">root</span><span class="o">/</span><span class="n">paddle</span><span class="o">/</span><span class="n">demo</span><span class="o">/</span><span class="n">quick_start</span><span class="o">/</span><span class="n">output</span>
      <span class="n">restartPolicy</span><span class="p">:</span> <span class="n">Never</span>
</pre></div> </div> </div> <div class="section" id="paddlepaddle-job"> <span id="paddlepaddle-job"></span><h3>Create the PaddlePaddle Job<a class="headerlink" href="#paddlepaddle-job" title="Permalink to this headline">¶</a></h3> <p>Create the Kubernetes Job from the YAML file written above with the following command:</p> <div class="highlight-default"><div class="highlight"><pre><span></span>$ kubectl create -f paddle.yaml
</pre></div> </div> <p>View the details of the Job:</p> <div class="highlight-default"><div class="highlight"><pre><span></span>$ kubectl get job
NAME         DESIRED   SUCCESSFUL   AGE
quickstart   1         0            58s

$ kubectl describe job quickstart
Name: quickstart
Namespace: default
Image(s): registry.baidu.com/public/paddle:cpu-demo-latest
Selector: controller-uid=f120da72-9f18-11e6-b363-448a5b355b84
Parallelism: 1
Completions: 1
Start Time: Mon, 31 Oct 2016 11:20:16 +0800
Labels: controller-uid=f120da72-9f18-11e6-b363-448a5b355b84,job-name=quickstart
Pods Statuses: 0 Running / 1 Succeeded / 0 Failed
Volumes:
  output:
    Type: HostPath (bare host directory volume)
    Path: /home/work/paddle_output
Events:
  FirstSeen   LastSeen   Count   From                SubobjectPath   Type     Reason             Message
  ---------   --------   -----   ----                -------------   ------   ------             -------
  1m          1m         1       {job-controller }                   Normal   SuccessfulCreate   Created pod: quickstart-fa0wx
</pre></div> </div> </div> <div class="section" id=""> <span id="id6"></span><h3>View the Training Results<a class="headerlink" href="#" title="Permalink to this headline">¶</a></h3> <p>The description of the Pod that belongs to the Job shows which host the Pod ran on.</p> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">kubectl</span> <span class="n">describe</span> <span class="n">pod</span> <span class="n">quickstart</span><span class="o">-</span><span class="n">fa0wx</span>
<span class="n">Name</span><span class="p">:</span> <span class="n">quickstart</span><span class="o">-</span><span class="n">fa0wx</span>
<span class="n">Namespace</span><span class="p">:</span> <span class="n">default</span>
<span class="n">Node</span><span class="p">:</span> <span class="n">paddle</span><span class="o">-</span><span class="n">demo</span><span class="o">-</span><span class="n">let02</span><span class="o">/</span><span class="mf">10.206</span><span class="o">.</span><span class="mf">202.44</span>
<span class="n">Start</span> <span class="n">Time</span><span class="p">:</span> <span class="n">Mon</span><span class="p">,</span> <span class="mi">31</span> <span class="n">Oct</span> <span class="mi">2016</span> <span class="mi">11</span><span class="p">:</span><span class="mi">20</span><span class="p">:</span><span
class="mi">17</span> <span class="o">+</span><span class="mi">0800</span>
<span class="n">Labels</span><span class="p">:</span> <span class="n">controller</span><span class="o">-</span><span class="n">uid</span><span class="o">=</span><span class="n">f120da72</span><span class="o">-</span><span class="mi">9</span><span class="n">f18</span><span class="o">-</span><span class="mf">11e6</span><span class="o">-</span><span class="n">b363</span><span class="o">-</span><span class="mi">448</span><span class="n">a5b355b84</span><span class="p">,</span><span class="n">job</span><span class="o">-</span><span class="n">name</span><span class="o">=</span><span class="n">quickstart</span>
<span class="n">Status</span><span class="p">:</span> <span class="n">Succeeded</span>
<span class="n">IP</span><span class="p">:</span> <span class="mf">10.0</span><span class="o">.</span><span class="mf">0.9</span>
<span class="n">Controllers</span><span class="p">:</span> <span class="n">Job</span><span class="o">/</span><span class="n">quickstart</span>
<span class="n">Containers</span><span class="p">:</span>
  <span class="n">quickstart</span><span class="p">:</span>
    <span class="n">Container</span> <span class="n">ID</span><span class="p">:</span> <span class="n">docker</span><span class="p">:</span><span class="o">//</span><span class="n">b8561f5c79193550d64fa47418a9e67ebdd71546186e840f88de5026b8097465</span>
    <span class="n">Image</span><span class="p">:</span> <span class="n">registry</span><span class="o">.</span><span class="n">baidu</span><span class="o">.</span><span class="n">com</span><span class="o">/</span><span class="n">public</span><span class="o">/</span><span class="n">paddle</span><span class="p">:</span><span class="n">cpu</span><span class="o">-</span><span class="n">demo</span><span class="o">-</span><span class="n">latest</span>
    <span class="n">Image</span> <span class="n">ID</span><span class="p">:</span> <span class="n">docker</span><span class="p">:</span><span
class="o">//</span><span class="mf">18e457</span><span class="n">ce3d362ff5f3febf8e7f85ffec852f70f3b629add10aed84f930a68750</span>
    <span class="n">Port</span><span class="p">:</span>
    <span class="n">Command</span><span class="p">:</span>
      <span class="nb">bin</span><span class="o">/</span><span class="n">bash</span>
      <span class="o">-</span><span class="n">c</span>
      <span class="o">/</span><span class="n">root</span><span class="o">/</span><span class="n">paddle</span><span class="o">/</span><span class="n">demo</span><span class="o">/</span><span class="n">quick_start</span><span class="o">/</span><span class="n">train</span><span class="o">.</span><span class="n">sh</span>
    <span class="n">QoS</span> <span class="n">Tier</span><span class="p">:</span>
      <span class="n">cpu</span><span class="p">:</span> <span class="n">BestEffort</span>
      <span class="n">memory</span><span class="p">:</span> <span class="n">BestEffort</span>
    <span class="n">State</span><span class="p">:</span> <span class="n">Terminated</span>
      <span class="n">Reason</span><span class="p">:</span> <span class="n">Completed</span>
      <span class="n">Exit</span> <span class="n">Code</span><span class="p">:</span> <span class="mi">0</span>
      <span class="n">Started</span><span class="p">:</span> <span class="n">Mon</span><span class="p">,</span> <span class="mi">31</span> <span class="n">Oct</span> <span class="mi">2016</span> <span class="mi">11</span><span class="p">:</span><span class="mi">20</span><span class="p">:</span><span class="mi">20</span> <span class="o">+</span><span class="mi">0800</span>
      <span class="n">Finished</span><span class="p">:</span> <span class="n">Mon</span><span class="p">,</span> <span class="mi">31</span> <span class="n">Oct</span> <span class="mi">2016</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span><span class="p">:</span><span class="mi">46</span> <span class="o">+</span><span class="mi">0800</span>
    <span class="n">Ready</span><span
class="p">:</span> <span class="kc">False</span>
    <span class="n">Restart</span> <span class="n">Count</span><span class="p">:</span> <span class="mi">0</span>
    <span class="n">Environment</span> <span class="n">Variables</span><span class="p">:</span>
<span class="n">Conditions</span><span class="p">:</span>
  <span class="n">Type</span>    <span class="n">Status</span>
  <span class="n">Ready</span>   <span class="kc">False</span>
<span class="n">Volumes</span><span class="p">:</span>
  <span class="n">output</span><span class="p">:</span>
    <span class="n">Type</span><span class="p">:</span> <span class="n">HostPath</span> <span class="p">(</span><span class="n">bare</span> <span class="n">host</span> <span class="n">directory</span> <span class="n">volume</span><span class="p">)</span>
    <span class="n">Path</span><span class="p">:</span> <span class="o">/</span><span class="n">home</span><span class="o">/</span><span class="n">work</span><span class="o">/</span><span class="n">paddle_output</span>
</pre></div> </div> <p>We can also log in to that host to view the training results.</p> <div class="highlight-default"><div class="highlight"><pre><span></span><span class="p">[</span><span class="n">root</span><span class="nd">@paddle</span><span class="o">-</span><span class="n">demo</span><span class="o">-</span><span class="n">let02</span> <span class="n">paddle_output</span><span class="p">]</span><span class="c1"># ll</span>
<span class="n">total</span> <span class="mi">60</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">20</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00000</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span
class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">20</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00001</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00002</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00003</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00004</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span
class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00005</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00006</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00007</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00008</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00009</span>
<span
class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00010</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00011</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00012</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00013</span>
<span class="n">drwxr</span><span class="o">-</span><span class="n">xr</span><span class="o">-</span><span class="n">x</span> <span class="mi">2</span> <span class="n">root</span> <span class="n">root</span> <span class="mi">4096</span> <span class="n">Oct</span> <span class="mi">31</span> <span class="mi">11</span><span class="p">:</span><span class="mi">21</span> <span class="k">pass</span><span class="o">-</span><span class="mi">00014</span>
</pre></div> </div> </div> </div> </div> </div> </div> <footer> <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation"> <a href="k8s_distributed_cn.html" class="btn btn-neutral float-right" title="Distributed Training on Kubernetes" accesskey="n">Next <span class="fa fa-arrow-circle-right"></span></a> <a href="openmpi_cn.html" class="btn btn-neutral" title="Submitting Training Jobs on an OpenMPI Cluster" accesskey="p"><span class="fa fa-arrow-circle-left"></span> Previous</a> </div> <hr/> <div role="contentinfo"> <p> © Copyright 2016, PaddlePaddle developers. </p> </div> Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>. </footer> </div> </div> </section> </div> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT:'../../../', VERSION:'', COLLAPSE_INDEX:false, FILE_SUFFIX:'.html', HAS_SOURCE: true, SOURCELINK_SUFFIX: ".txt", }; </script> <script type="text/javascript" src="../../../_static/jquery.js"></script> <script type="text/javascript" src="../../../_static/underscore.js"></script> <script type="text/javascript" src="../../../_static/doctools.js"></script> <script type="text/javascript" src="../../../_static/translations.js"></script> <script type="text/javascript" src="https://cdn.bootcss.com/mathjax/2.7.0/MathJax.js"></script> <script type="text/javascript" src="../../../_static/js/theme.js"></script> <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script> <script
src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script> <script src="../../../_static/js/paddle_doc_init.js"></script> </body> </html>