<!DOCTYPE html> <!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]--> <!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]--> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <title>Local Training and Prediction - PaddlePaddle Documentation</title> <link rel="stylesheet" href="../../_static/css/theme.css" type="text/css" /> <link rel="index" title="Index" href="../../genindex.html"/> <link rel="search" title="Search" href="../../search.html"/> <link rel="top" title="PaddlePaddle Documentation" href="../../index.html"/> <link rel="up" title="FAQ" href="../index_cn.html"/> <link rel="next" title="Cluster Training and Prediction" href="../cluster/index_cn.html"/> <link rel="prev" title="Parameter Settings" href="../parameter/index_cn.html"/> <link rel="stylesheet" href="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/css/perfect-scrollbar.min.css" type="text/css" /> <link rel="stylesheet" href="../../_static/css/override.css" type="text/css" /> <script src="../../_static/js/modernizr.min.js"></script> </head> <body class="wy-body-for-nav" role="document"> <header class="site-header"> <div class="site-logo"> <a href="/"><img src="../../_static/images/PP_w.png"></a> </div> <div class="site-nav-links"> <div class="site-menu"> <a class="fork-on-github" href="https://github.com/PaddlePaddle/Paddle" target="_blank"><i class="fa fa-github"></i>Fork me on Github</a> <div class="language-switcher dropdown"> <a type="button" data-toggle="dropdown"> <span>English</span> <i class="fa fa-angle-up"></i> <i class="fa fa-angle-down"></i> </a> <ul class="dropdown-menu"> <li><a href="/doc_cn">中文</a></li> <li><a href="/doc">English</a></li> </ul> </div> <ul class="site-page-links"> <li><a href="/">Home</a></li> </ul> </div> <div
class="doc-module"> <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_cn.html">Getting Started</a></li> <li class="toctree-l1"><a class="reference internal" href="../../howto/index_cn.html">Advanced Guide</a></li> <li class="toctree-l1"><a class="reference internal" href="../../api/index_cn.html">API</a></li> <li class="toctree-l1 current"><a class="reference internal" href="../index_cn.html">FAQ</a></li> </ul> <div role="search"> <form id="rtd-search-form" class="wy-form" action="../../search.html" method="get"> <input type="text" name="q" placeholder="Search docs" /> <input type="hidden" name="check_keywords" value="yes" /> <input type="hidden" name="area" value="default" /> </form> </div> </div> </div> </header> <div class="main-content-wrap"> <nav class="doc-menu-vertical" role="navigation"> <ul class="current"> <li class="toctree-l1"><a class="reference internal" href="../../getstarted/index_cn.html">Getting Started</a><ul> <li class="toctree-l2"><a class="reference internal" href="../../getstarted/build_and_install/index_cn.html">Install and Build</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/pip_install_cn.html">Install with pip</a></li> <li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/docker_install_cn.html">Install and Run with Docker</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/dev/build_cn.html">Build and Test PaddlePaddle with Docker</a></li> <li class="toctree-l3"><a class="reference internal" href="../../getstarted/build_and_install/build_from_source_cn.html">Build from Source</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../getstarted/concepts/use_concepts_cn.html">Basic Usage Concepts</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../../howto/index_cn.html">Advanced Guide</a><ul> <li class="toctree-l2"><a class="reference internal" href="../../howto/usage/cmd_parameter/index_cn.html">Setting Command-line Arguments</a><ul> <li
class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/use_case_cn.html">Use Cases</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/arguments_cn.html">Argument Overview</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cmd_parameter/detail_introduction_cn.html">Detailed Description</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../howto/usage/cluster/cluster_train_cn.html">Distributed Training</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/fabric_cn.html">fabric Cluster</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/openmpi_cn.html">openmpi Cluster</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_cn.html">kubernetes on a Single Machine</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_distributed_cn.html">kubernetes Distributed</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/usage/cluster/k8s_aws_cn.html">kubernetes Cluster Training on AWS</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../howto/usage/capi/index_cn.html">PaddlePaddle C-API</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/compile_paddle_lib_cn.html">Building the PaddlePaddle Inference Library</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/organization_of_the_inputs_cn.html">Organization of Input/Output Data</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/usage/capi/workflow_of_capi_cn.html">C-API Workflow</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../howto/dev/contribute_to_paddle_cn.html">How to Contribute Code</a></li> <li class="toctree-l2"><a class="reference internal" href="../../howto/dev/write_docs_cn.html">How to Contribute/Modify Documentation</a></li> <li class="toctree-l2"><a class="reference internal"
href="../../howto/deep_model/rnn/index_cn.html">RNN Models</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/rnn_config_cn.html">RNN Configuration</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/recurrent_group_cn.html">Recurrent Group Tutorial</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/hierarchical_layer_cn.html">Layers That Accept Nested Sequences as Input</a></li> <li class="toctree-l3"><a class="reference internal" href="../../howto/deep_model/rnn/hrnn_rnn_api_compare_cn.html">Comparison of Single-level and Nested RNN APIs</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../howto/optimization/gpu_profiling_cn.html">GPU Profiling and Tuning</a></li> </ul> </li> <li class="toctree-l1"><a class="reference internal" href="../../api/index_cn.html">API</a><ul> <li class="toctree-l2"><a class="reference internal" href="../../api/v2/model_configs.html">Model Configuration</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/activation.html">Activation</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/layer.html">Layers</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/evaluators.html">Evaluators</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/optimizer.html">Optimizer</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/pooling.html">Pooling</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/networks.html">Networks</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/config/attr.html">Parameter Attribute</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../api/v2/data.html">Data Access</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/data_reader.html">Data Reader Interface</a></li> <li
class="toctree-l3"><a class="reference internal" href="../../api/v2/data/image.html">Image Interface</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/data/dataset.html">Dataset</a></li> </ul> </li> <li class="toctree-l2"><a class="reference internal" href="../../api/v2/run_logic.html">Training and Application</a></li> <li class="toctree-l2"><a class="reference internal" href="../../api/v2/fluid.html">Fluid</a><ul> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/layers.html">layers</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/data_feeder.html">data_feeder</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/executor.html">executor</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/initializer.html">initializer</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/evaluator.html">evaluator</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/nets.html">nets</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/optimizer.html">optimizer</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/param_attr.html">param_attr</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/profiler.html">profiler</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/regularizer.html">regularizer</a></li> <li class="toctree-l3"><a class="reference internal" href="../../api/v2/fluid/io.html">io</a></li> </ul> </li> </ul> </li> <li class="toctree-l1 current"><a class="reference internal" href="../index_cn.html">FAQ</a><ul class="current"> <li class="toctree-l2"><a class="reference internal" href="../build_and_install/index_cn.html">Build, Install and Unit Tests</a></li> <li class="toctree-l2"><a class="reference internal"
href="../model/index_cn.html">Model Configuration</a></li> <li class="toctree-l2"><a class="reference internal" href="../parameter/index_cn.html">Parameter Settings</a></li> <li class="toctree-l2 current"><a class="current reference internal" href="#">Local Training and Prediction</a></li> <li class="toctree-l2"><a class="reference internal" href="../cluster/index_cn.html">Cluster Training and Prediction</a></li> </ul> </li> </ul> </nav>
<section class="doc-content-wrap"> <div class="wy-nav-content" id="doc-content"> <div class="rst-content"> <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article"> <div itemprop="articleBody"> <div class="section" id="id1">
<h1><a class="toc-backref" href="#id10">Local Training and Prediction</a><a class="headerlink" href="#id1" title="Permalink to this headline">¶</a></h1>
<div class="contents topic" id="contents"> <p class="topic-title first">Contents</p>
<ul class="simple"> <li><a class="reference internal" href="#id1" id="id10">Local Training and Prediction</a><ul>
<li><a class="reference internal" href="#id2" id="id11">1. How to reduce memory usage</a><ul>
<li><a class="reference internal" href="#dataprovider" id="id12">Reducing DataProvider buffer pool memory</a></li>
<li><a class="reference internal" href="#id3" id="id13">Neuron activation memory</a></li>
<li><a class="reference internal" href="#id4" id="id14">Parameter memory</a></li> </ul> </li>
<li><a class="reference internal" href="#id5" id="id15">2. How to speed up training</a><ul>
<li><a class="reference internal" href="#id6" id="id16">Reducing data loading time</a></li>
<li><a class="reference internal" href="#id7" id="id17">Speeding up the training computation</a></li>
<li><a class="reference internal" href="#id8" id="id18">Using more computing resources</a></li> </ul> </li>
<li><a class="reference internal" href="#gpu" id="id19">3. How to specify GPU devices</a></li>
<li><a class="reference internal" href="#floating-point-exception" id="id20">4.
What to do when training exits with a <code class="code docutils literal"><span class="pre">Floating</span> <span class="pre">point</span> <span class="pre">exception</span></code></a></li>
<li><a class="reference internal" href="#infer-layer" id="id21">5. How to call the infer interface to output the prediction results of multiple layers</a></li>
<li><a class="reference internal" href="#layeroutput" id="id22">6. How to get the output of a particular layer during training</a></li>
<li><a class="reference internal" href="#id9" id="id23">7. How to get parameter weights and gradients during training</a></li>
</ul> </li> </ul> </div>
<div class="section" id="id2">
<h2><a class="toc-backref" href="#id11">1. How to reduce memory usage</a><a class="headerlink" href="#id2" title="Permalink to this headline">¶</a></h2>
<p>Training a neural network is very demanding on both host memory and GPU memory; it often consumes tens of GB of host memory and several GB of GPU memory. PaddlePaddle's memory usage falls into the following categories:</p>
<ul class="simple">
<li>DataProvider buffer pool memory (host memory only)</li>
<li>Neuron activation memory (host memory and GPU memory)</li>
<li>Parameter memory (host memory and GPU memory)</li>
<li>Miscellaneous memory</li>
</ul>
<p>Miscellaneous memory is the memory that PaddlePaddle itself uses, such as string allocations and temporary variables; it is not considered here.</p>
<div class="section" id="dataprovider">
<h3><a class="toc-backref" href="#id12">Reducing DataProvider buffer pool memory</a><a class="headerlink" href="#dataprovider" title="Permalink to this headline">¶</a></h3>
<p>PyDataProvider loads data asynchronously and shuffles by randomly picking samples directly from an in-memory pool, that is:</p>
<img src="../../_images/graphviz-9be6aad37f57c60f4b971dde0ef44ce27179cf9a.png" alt="digraph { rankdir=LR; data files -&gt; memory pool -&gt; PaddlePaddle training }" />
<p>Shrinking this memory pool therefore reduces memory usage and also shortens the data loading phase before training starts. However, the pool size determines the granularity of the shuffle. If you shrink the pool but still need randomized data, it is best to shuffle the data file each time before it is read. One possible implementation is:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="c1"># Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.</span>
<span class="c1">#</span>
<span class="c1"># Licensed under the Apache License, Version 2.0 (the "License");</span>
<span class="c1"># you may not use this file except in compliance with the License.</span>
<span class="c1"># You may obtain a copy of the License at</span>
<span class="c1">#</span>
<span class="c1">#     http://www.apache.org/licenses/LICENSE-2.0</span>
<span class="c1">#</span>
<span class="c1"># Unless required by applicable law or agreed to in writing, software</span>
<span class="c1"># distributed under the License is distributed on an "AS IS" BASIS,</span>
<span class="c1"># WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.</span>
<span class="c1"># See the License for the specific language governing permissions and</span>
<span class="c1"># limitations under the License.</span>
<span class="kn">import</span> <span class="nn">os</span>

<span class="kn">from</span> <span class="nn">paddle.trainer.PyDataProvider2</span> <span class="kn">import</span> <span class="n">provider</span>


<span class="nd">@provider</span><span class="p">(</span><span class="n">min_pool_size</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="o">...</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="n">settings</span><span class="p">,</span> <span class="n">filename</span><span class="p">):</span>
    <span class="c1"># Shuffle the file on disk first, because a small pool barely shuffles.</span>
    <span class="n">os</span><span class="o">.</span><span class="n">system</span><span class="p">(</span><span class="s1">'shuf </span><span class="si">%s</span><span class="s1"> &gt; </span><span class="si">%s</span><span class="s1">.shuf'</span> <span class="o">%</span> <span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">filename</span><span class="p">))</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'</span><span class="si">%s</span><span class="s1">.shuf'</span> <span class="o">%</span> <span class="n">filename</span><span class="p">,</span> <span class="s1">'r'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="n">f</span><span class="p">:</span>
            <span class="c1"># get_sample_from_line is a user-defined parser (not shown).</span>
            <span class="k">yield</span> <span class="n">get_sample_from_line</span><span class="p">(</span><span class="n">line</span><span class="p">)</span>
</pre></div> </div>
<p>This greatly reduces memory usage and may also speed up training. For details, see <span class="xref std std-ref">api_pydataprovider2</span>.</p>
</div>
<div class="section" id="id3">
<h3><a class="toc-backref" href="#id13">Neuron activation memory</a><a class="headerlink" href="#id3" title="Permalink to this headline">¶</a></h3>
<p>During training, a neural network temporarily stores some data for every activation, such as the neuron activation values. These data are used to update the parameters during back-propagation. The memory they occupy depends mainly on two factors: the batch size and the length of each sequence. In other words, it is proportional to the number of time steps contained in each mini-batch.</p>
<p>There are therefore two options:</p>
<ul class="simple">
<li>Reduce the batch size, i.e. change <code class="code docutils literal"><span class="pre">settings(batch_size=1000)</span></code> in the network configuration to a smaller value. Note that the batch size is itself a hyperparameter of the network, so reducing it may affect the training result.</li>
<li>Shorten the sequences, or simply drop very long ones. For example, if most sequences in a dataset have a length of 100-200 but one sequence has a length of 10000, memory can easily be exhausted, especially in RNNs such as LSTM.</li>
</ul>
</div>
<div class="section" id="id4">
<h3><a class="toc-backref" href="#id14">Parameter memory</a><a class="headerlink" href="#id4" title="Permalink to this headline">¶</a></h3>
<p>PaddlePaddle supports many optimization algorithms (optimizers), and different optimizers need different amounts of memory. For example, <code class="code docutils literal"><span class="pre">adadelta</span></code> needs roughly 5 times the size of the weight parameters: if the saved model directory is <code class="code docutils literal"><span class="pre">100M</span></code>, this optimizer needs at least <code class="code docutils literal"><span class="pre">500M</span></code> of memory.</p>
<p>To save memory, consider an optimizer that keeps less state per parameter, such as <code class="code docutils literal"><span class="pre">momentum</span></code>.</p>
</div> </div>
<div class="section" id="id5">
<h2><a class="toc-backref" href="#id15">2.
How to speed up training</a><a class="headerlink" href="#id5" title="Permalink to this headline">¶</a></h2>
<p>To speed up PaddlePaddle training, consider the following:</p>
<ul class="simple">
<li>Reduce the data loading time</li>
<li>Speed up the training computation</li>
<li>Use distributed training to harness more computing resources</li>
</ul>
<div class="section" id="id6">
<h3><a class="toc-backref" href="#id16">Reducing data loading time</a><a class="headerlink" href="#id6" title="Permalink to this headline">¶</a></h3>
<p>When using <code class="code docutils literal"><span class="pre">pydataprovider</span></code>, you can shrink the buffer pool and enable in-memory caching, which greatly speeds up data loading. Shrinking the <code class="code docutils literal"><span class="pre">DataProvider</span></code> buffer pool works on the same principle as shrinking it to reduce memory usage, described earlier.</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">os</span>

<span class="kn">from</span> <span class="nn">paddle.trainer.PyDataProvider2</span> <span class="kn">import</span> <span class="n">provider</span>


<span class="nd">@provider</span><span class="p">(</span><span class="n">min_pool_size</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="o">...</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="n">settings</span><span class="p">,</span> <span class="n">filename</span><span class="p">):</span>
    <span class="c1"># Shuffle the file on disk first, because a small pool barely shuffles.</span>
    <span class="n">os</span><span class="o">.</span><span class="n">system</span><span class="p">(</span><span class="s1">'shuf </span><span class="si">%s</span><span class="s1"> &gt; </span><span class="si">%s</span><span class="s1">.shuf'</span> <span class="o">%</span> <span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">filename</span><span class="p">))</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'</span><span class="si">%s</span><span class="s1">.shuf'</span> <span class="o">%</span> <span class="n">filename</span><span class="p">,</span> <span class="s1">'r'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="n">f</span><span class="p">:</span>
            <span class="c1"># get_sample_from_line is a user-defined parser (not shown).</span>
            <span class="k">yield</span> <span class="n">get_sample_from_line</span><span class="p">(</span><span class="n">line</span><span class="p">)</span>
</pre></div> </div>
<p>The <code class="code docutils literal"><span class="pre">@provider</span></code> interface also has a <code class="code docutils literal"><span class="pre">cache</span></code> parameter that controls caching. When it is set to <code class="code docutils literal"><span class="pre">CacheType.CACHE_PASS_IN_MEM</span></code>, the data generated during the first <code class="code docutils literal"><span class="pre">pass</span></code> (one pass is one full sweep over the training data) is cached in memory. Every later <code class="code docutils literal"><span class="pre">pass</span></code> then reads directly from this in-memory cache instead of reading from the <code class="code docutils literal"><span class="pre">python</span></code> side again. This also greatly reduces the data loading time.</p>
</div>
<div class="section" id="id7">
<h3><a class="toc-backref" href="#id17">Speeding up the training computation</a><a class="headerlink" href="#id7" title="Permalink to this headline">¶</a></h3>
<p>PaddlePaddle supports sparse training. Sparse training requires the training feature to be one of <code class="code docutils literal"><span class="pre">sparse_binary_vector</span></code>, <code class="code docutils literal"><span class="pre">sparse_vector</span></code>, or <code class="code docutils literal"><span class="pre">integer_value</span></code>. In addition, the layer that consumes this training data must have its parameter set to sparse update mode, i.e. <code class="code docutils literal"><span class="pre">sparse_update=True</span></code>.</p>
<p>A simple <code class="code docutils literal"><span class="pre">word2vec</span></code>-style language model is used here as an example:</p>
<p>The two words before and the two words after a word are used to predict the word in the middle. The DataProvider for this task is:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">paddle.trainer.PyDataProvider2</span> <span class="kn">import</span> <span class="o">*</span>

<span class="n">DICT_DIM</span> <span class="o">=</span> <span class="mi">3000</span>


<span class="nd">@provider</span><span class="p">(</span><span class="n">input_types</span><span class="o">=</span><span class="p">[</span><span class="n">integer_sequence</span><span class="p">(</span><span class="n">DICT_DIM</span><span class="p">),</span> <span class="n">integer_value</span><span class="p">(</span><span class="n">DICT_DIM</span><span class="p">)])</span>
<span class="k">def</span> <span class="nf">process</span><span class="p">(</span><span class="n">settings</span><span class="p">,</span> <span class="n">filename</span><span class="p">):</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">filename</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="c1"># yield word ids to predict inner word id</span>
        <span class="c1"># such as [28, 29, 10, 4], 4</span>
        <span class="c1"># It means the sentence is 28, 29, 4, 10, 4.</span>
        <span class="c1"># read_next_from_file is a user-defined helper (not shown).</span>
        <span class="k">yield</span> <span class="n">read_next_from_file</span><span class="p">(</span><span class="n">f</span><span class="p">)</span>
</pre></div> </div>
<p>The configuration for this task is:</p>
<div class="highlight-default"><div class="highlight"><pre><span></span><span class="o">...</span>  <span class="c1"># the settings and the data provider definition are omitted.</span>
<span class="n">DICT_DIM</span> <span class="o">=</span> <span class="mi">3000</span>  <span class="c1"># dictionary dimension.</span>
<span class="n">word_ids</span> <span class="o">=</span> <span class="n">data_layer</span><span class="p">(</span><span class="s1">'word_ids'</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">DICT_DIM</span><span class="p">)</span>

<span class="n">emb</span> <span class="o">=</span> <span class="n">embedding_layer</span><span class="p">(</span>
    <span class="nb">input</span><span class="o">=</span><span class="n">word_ids</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="mi">256</span><span class="p">,</span> <span class="n">param_attr</span><span class="o">=</span><span class="n">ParamAttr</span><span class="p">(</span><span class="n">sparse_update</span><span class="o">=</span><span class="kc">True</span><span class="p">))</span>
<span class="n">emb_sum</span> <span class="o">=</span> <span class="n">pooling_layer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">emb</span><span class="p">,</span> <span class="n">pooling_type</span><span class="o">=</span><span class="n">SumPooling</span><span class="p">())</span>
<span class="n">predict</span> <span class="o">=</span> <span class="n">fc_layer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">emb_sum</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">DICT_DIM</span><span class="p">,</span> <span class="n">act</span><span class="o">=</span><span class="n">Softmax</span><span class="p">())</span>
<span class="n">outputs</span><span class="p">(</span>
    <span class="n">classification_cost</span><span class="p">(</span>
        <span class="nb">input</span><span class="o">=</span><span class="n">predict</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">data_layer</span><span class="p">(</span>
            <span class="s1">'label'</span><span class="p">,</span> <span class="n">size</span><span class="o">=</span><span class="n">DICT_DIM</span><span class="p">)))</span>
</pre></div> </div>
</div>
<div class="section" id="id8">
<h3><a class="toc-backref" href="#id18">Using more computing resources</a><a class="headerlink" href="#id8" title="Permalink to this headline">¶</a></h3>
<p>More computing resources can be used in the following ways:</p>
<ul class="simple">
<li>Single-machine CPU training<ul>
<li>Use multi-threaded training by setting the command-line argument <code class="code docutils literal"><span class="pre">trainer_count</span></code>.</li>
</ul> </li>
<li>Single-machine GPU training<ul>
<li>Train on a GPU by setting the command-line argument <code class="code docutils literal"><span class="pre">use_gpu</span></code>.</li>
<li>Train on multiple GPUs by setting the command-line arguments <code class="code docutils literal"><span class="pre">use_gpu</span></code> and <code class="code docutils literal"><span class="pre">trainer_count</span></code>.</li>
</ul> </li>
<li>Multi-machine training<ul>
<li>See <span class="xref std std-ref">cluster_train</span>.</li>
</ul> </li>
</ul>
</div> </div>
<div class="section" id="gpu">
<h2><a class="toc-backref" href="#id19">3. How to specify GPU devices</a><a class="headerlink" href="#gpu" title="Permalink to this headline">¶</a></h2>
<p>For example, on a machine with 4 GPUs numbered from 0, to use GPUs 2 and 3:</p>
<ul class="simple">
<li>Option 1: select specific GPUs through the <a class="reference external" href="http://www.acceleware.com/blog/cudavisibledevices-masking-gpus">CUDA_VISIBLE_DEVICES</a> environment variable.</li>
</ul>
<div class="highlight-bash"><div class="highlight"><pre><span></span>env <span class="nv">CUDA_VISIBLE_DEVICES</span><span class="o">=</span><span class="m">2</span>,3 paddle train --use_gpu<span class="o">=</span><span class="nb">true</span> --trainer_count<span class="o">=</span><span class="m">2</span>
</pre></div> </div>
<ul class="simple">
<li>Option 2: select them through the command-line argument <code class="docutils literal"><span class="pre">--gpu_id</span></code>.</li>
</ul>
<div class="highlight-bash"><div class="highlight"><pre><span></span>paddle train --use_gpu<span class="o">=</span><span class="nb">true</span> --trainer_count<span class="o">=</span><span class="m">2</span> --gpu_id<span class="o">=</span><span class="m">2</span>
</pre></div> </div>
</div>
<div class="section"
id="floating-point-exception">
<h2><a class="toc-backref" href="#id20">4. What to do when training exits with a <code class="code docutils literal"><span class="pre">Floating</span> <span class="pre">point</span> <span class="pre">exception</span></code></a><a class="headerlink" href="#floating-point-exception" title="Permalink to this headline">¶</a></h2>
<p>The Paddle binary traps floating-point exceptions at runtime and exits immediately whenever one occurs (i.e. when NaN or Inf appears during training). Floating-point exceptions are usually caused by overflow, division by zero, and similar problems. The main causes are:</p>
<ul class="simple">
<li>The parameters or the gradients become too large during training, so that accumulating, multiplying, or dividing them overflows.</li>
<li>The model never converges and diverges to a region with extremely large values.</li>
<li>The training data is problematic and drives the parameters into a singular state; or the input data is on too large a scale, with some feature values reaching the millions, so that matrix multiplication can overflow.</li>
</ul>
<p>There are two effective remedies:</p>
<ol class="arabic simple">
<li>Set the <code class="code docutils literal"><span class="pre">gradient_clipping_threshold</span></code> parameter, for example:</li>
</ol>
<div class="highlight-python"><div class="highlight"><pre><span></span>optimizer = paddle.optimizer.RMSProp(
    learning_rate=1e-3,
    gradient_clipping_threshold=10.0,
    regularization=paddle.optimizer.L2Regularization(rate=8e-4))
</pre></div> </div>
<p>For details, see the <a class="reference external" href="https://github.com/PaddlePaddle/models/blob/develop/nmt_without_attention/train.py#L35">nmt_without_attention</a> example.</p>
<ol class="arabic simple" start="2">
<li>Set the <code class="code docutils literal"><span class="pre">error_clipping_threshold</span></code> parameter, for example:</li>
</ol>
<div class="highlight-python"><div class="highlight"><pre><span></span>decoder_inputs = paddle.layer.fc(
    act=paddle.activation.Linear(),
    size=decoder_size * 3,
    bias_attr=False,
    input=[context, current_word],
    layer_attr=paddle.attr.ExtraLayerAttribute(
        error_clipping_threshold=100.0))
</pre></div> </div>
<p>For the complete code, see the <a class="reference external" href="https://github.com/PaddlePaddle/book/blob/develop/08.machine_translation/train.py#L66">machine translation</a> example.</p>
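<p>To make the effect of a clipping threshold concrete, here is a minimal, framework-independent sketch in plain Python. It is not PaddlePaddle's implementation: the helper name <code>clip_elementwise</code> is hypothetical, and element-wise clamping is an assumption made for illustration, so check the optimizer source for the exact rule it applies.</p>

```python
def clip_elementwise(values, threshold):
    """Clamp every entry of values to the interval [-threshold, threshold]."""
    return [max(-threshold, min(threshold, v)) for v in values]


# A gradient with two exploding entries (made-up numbers).
grad = [0.5, -30.0, 12.0]
clipped = clip_elementwise(grad, 10.0)
print(clipped)  # [0.5, -10.0, 10.0]
```

<p>Entries already inside the interval are left untouched, so a well-behaved gradient is not changed; only the outliers that could overflow in later arithmetic are bounded.</p>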
<p>Differences between <code class="code docutils literal"><span class="pre">gradient_clipping_threshold</span></code> and <code class="code docutils literal"><span class="pre">error_clipping_threshold</span></code>:</p>
<ol class="arabic simple">
<li>Both truncate gradients, but at different times: the former is applied when the <code class="code docutils literal"><span class="pre">optimizer</span></code> updates the network parameters, while the latter is applied during the backward computation of the activation function;</li>
<li>They truncate different objects: the former clips the gradients of the learnable parameters, while the latter clips the gradients propagated back to the preceding layer;</li>
</ol>
<p>Besides these, reducing the learning rate or normalizing the data can also resolve this kind of problem.</p>
</div>
<div class="section" id="infer-layer">
<h2><a class="toc-backref" href="#id21">5. How to call the infer interface to output the prediction results of multiple layers</a><a class="headerlink" href="#infer-layer" title="Permalink to this headline">¶</a></h2>
<ul class="simple">
<li>Pass the layers to be output as the <code class="code docutils literal"><span class="pre">output_layer</span></code> argument of the <code class="code docutils literal"><span class="pre">paddle.inference.Inference()</span></code> interface:</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">inferer</span> <span class="o">=</span> <span class="n">paddle</span><span class="o">.</span><span class="n">inference</span><span class="o">.</span><span class="n">Inference</span><span class="p">(</span><span class="n">output_layer</span><span class="o">=</span><span class="p">[</span><span class="n">layer1</span><span class="p">,</span> <span class="n">layer2</span><span class="p">],</span> <span class="n">parameters</span><span class="o">=</span><span class="n">parameters</span><span class="p">)</span>
</pre></div> </div>
<ul class="simple">
<li>Specify the fields to output. Taking the <code class="code docutils literal"><span class="pre">value</span></code> field as an example:</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">out</span> <span class="o">=</span> <span class="n">inferer</span><span class="o">.</span><span class="n">infer</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">data_batch</span><span class="p">,</span> <span class="n">field</span><span class="o">=</span><span class="p">[</span><span class="s2">"value"</span><span class="p">])</span>
</pre></div> </div>
<p>Note that:</p>
<ul class="simple">
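<li>If two layers are specified as output layers, the expected result is actually two matrices.</li>
<li>Suppose the output A of the first layer is an N1 * M1 matrix and the output B of the second layer is an N2 * M2 matrix.</li>
<li>By default, paddle.v2 concatenates A and B horizontally. When N1 and N2 differ, the following error is raised:</li>
</ul>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="ne">ValueError</span><span class="p">:</span> <span class="nb">all</span> <span class="n">the</span> <span class="nb">input</span> <span class="n">array</span> <span class="n">dimensions</span> <span class="k">except</span> <span class="k">for</span> <span class="n">the</span> <span class="n">concatenation</span> <span class="n">axis</span> <span class="n">must</span> <span class="n">match</span> <span class="n">exactly</span>
</pre></div> </div>
<p>The concatenation fails because the output matrices of the layers have different heights. This typically happens when:</p>
<ul class="simple">
<li>a sequence layer and a non-sequence layer are output together;</li>
<li>multiple output layers process sequences of different lengths.</li>
</ul>
<p>In this case, set <code class="code docutils literal"><span class="pre">flatten_result=False</span></code> when calling the infer interface to skip the concatenation step. The infer interface then returns a Python list:</p>
<ul class="simple">
<li>The number of elements in the list equals the number of output layers in the network.</li>
<li>Each element is the output matrix of one layer, of type numpy ndarray.</li>
<li>The height of each layer's output matrix equals the number of samples for non-sequence input, and the total number of elements across the input sequences for sequence input. The width equals the size of the layer in the configuration.</li>
</ul>
</div> <div class="section" id="layeroutput"> <h2><a class="toc-backref" href="#id22">6. 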
How do I get the output of a layer during training?</a><a class="headerlink" href="#layeroutput" title="Permalink to this headline">¶</a></h2>
<p>In the event_handler, call <code class="code docutils literal"><span class="pre">event.gm.getLayerOutputs("layer_name")</span></code> to get the forward output, for the current mini-batch, of the layer named <code class="code docutils literal"><span class="pre">layer_name</span></code> in the model configuration. The returned values are all of type <code class="code docutils literal"><span class="pre">numpy.ndarray</span></code> and can be used, for example, to compute custom evaluation metrics:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">logging</span>

<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">paddle.v2</span> <span class="kn">as</span> <span class="nn">paddle</span>

<span class="n">logger</span> <span class="o">=</span> <span class="n">logging</span><span class="o">.</span><span class="n">getLogger</span><span class="p">(</span><span class="vm">__name__</span><span class="p">)</span>


<span class="k">def</span> <span class="nf">score_diff</span><span class="p">(</span><span class="n">right_score</span><span class="p">,</span> <span class="n">left_score</span><span class="p">):</span>
    <span class="c1"># Mean absolute difference between the two score layers.</span>
    <span class="k">return</span> <span class="n">np</span><span class="o">.</span><span class="n">average</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">right_score</span> <span class="o">-</span> <span class="n">left_score</span><span class="p">))</span>


<span class="k">def</span> <span class="nf">event_handler</span><span class="p">(</span><span class="n">event</span><span class="p">):</span>
    <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">event</span><span class="p">,</span> <span class="n">paddle</span><span class="o">.</span><span class="n">event</span><span class="o">.</span><span class="n">EndIteration</span><span class="p">):</span>
        <span class="c1"># Report every 25 mini-batches.</span>
        <span class="k">if</span> <span class="n">event</span><span class="o">.</span><span class="n">batch_id</span> <span class="o">%</span> <span class="mi">25</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
            <span class="n">diff</span> <span class="o">=</span> <span class="n">score_diff</span><span class="p">(</span>
                <span class="n">event</span><span class="o">.</span><span class="n">gm</span><span class="o">.</span><span class="n">getLayerOutputs</span><span class="p">(</span><span class="s2">"right_score"</span><span class="p">)[</span><span class="s2">"right_score"</span><span class="p">][</span>
                    <span class="s2">"value"</span><span class="p">],</span>
                <span class="n">event</span><span class="o">.</span><span class="n">gm</span><span class="o">.</span><span class="n">getLayerOutputs</span><span class="p">(</span><span class="s2">"left_score"</span><span class="p">)[</span><span class="s2">"left_score"</span><span class="p">][</span>
                    <span class="s2">"value"</span><span class="p">])</span>
            <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">((</span><span class="s2">"Pass </span><span class="si">%d</span><span class="s2"> Batch </span><span class="si">%d</span><span class="s2"> : Cost </span><span class="si">%.6f</span><span class="s2">, "</span>
                         <span class="s2">"average absolute diff scores: </span><span class="si">%.6f</span><span class="s2">"</span><span class="p">)</span> <span class="o">%</span>
                        <span class="p">(</span><span class="n">event</span><span class="o">.</span><span class="n">pass_id</span><span class="p">,</span> <span class="n">event</span><span class="o">.</span><span class="n">batch_id</span><span class="p">,</span> <span class="n">event</span><span class="o">.</span><span class="n">cost</span><span class="p">,</span> <span class="n">diff</span><span class="p">))</span>
</pre></div> </div>
<p>Note: this method cannot retrieve values inside the step of <code class="code docutils literal"><span class="pre">paddle.layer.recurrent_group</span></code>, but it can retrieve the output of <code class="code docutils literal"><span class="pre">paddle.layer.recurrent_group</span></code>.</p>
</div> <div class="section" id="id9"> <h2><a class="toc-backref" href="#id23">7. 
How do I get the weights and gradients of parameters during training?</a><a class="headerlink" href="#id9" title="Permalink to this headline">¶</a></h2>
<p>In some cases, inspecting the weights (also called parameters) at the current mini-batch lets you observe concrete values during training and locate problems quickly. You can print them in the <code class="code docutils literal"><span class="pre">event_handler</span></code> (note that <code class="code docutils literal"><span class="pre">paddle.event.EndForwardBackward</span></code> must be used so that the values are also available when training on GPU), for example:</p>
<div class="highlight-python"><div class="highlight"><pre><span></span><span class="o">...</span>
<span class="n">parameters</span> <span class="o">=</span> <span class="n">paddle</span><span class="o">.</span><span class="n">parameters</span><span class="o">.</span><span class="n">create</span><span class="p">(</span><span class="n">cost</span><span class="p">)</span>
<span class="o">...</span>
<span class="k">def</span> <span class="nf">event_handler</span><span class="p">(</span><span class="n">event</span><span class="p">):</span>
    <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">event</span><span class="p">,</span> <span class="n">paddle</span><span class="o">.</span><span class="n">event</span><span class="o">.</span><span class="n">EndForwardBackward</span><span class="p">):</span>
        <span class="c1"># Log every parameter's value and gradient once per 25 mini-batches.</span>
        <span class="k">if</span> <span class="n">event</span><span class="o">.</span><span class="n">batch_id</span> <span class="o">%</span> <span class="mi">25</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
            <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">parameters</span><span class="o">.</span><span class="n">keys</span><span class="p">():</span>
                <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"Param </span><span class="si">%s</span><span class="s2">, Grad </span><span class="si">%s</span><span class="s2">"</span><span class="p">,</span>
                            <span class="n">parameters</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">p</span><span class="p">),</span> <span class="n">parameters</span><span class="o">.</span><span class="n">get_grad</span><span class="p">(</span><span class="n">p</span><span class="p">))</span>
</pre></div> </div>
<p>Note: both techniques, getting the output of a layer during training and getting the weights and gradients of parameters during training, copy data from C++ to numpy, which slows training down. Avoid them in performance-sensitive training runs.</p>
</div> </div> </div> </div> <footer> <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation"> <a href="../cluster/index_cn.html" class="btn btn-neutral float-right" title="Cluster training and prediction" accesskey="n">Next <span class="fa fa-arrow-circle-right"></span></a> <a href="../parameter/index_cn.html" class="btn btn-neutral" title="Parameter settings" accesskey="p"><span class="fa fa-arrow-circle-left"></span> Previous</a> </div> <hr/> <div role="contentinfo"> <p> © Copyright 2016, PaddlePaddle developers. </p> </div> Built with <a href="http://sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>. 
</footer> </div> </div> </section> </div> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT:'../../', VERSION:'', COLLAPSE_INDEX:false, FILE_SUFFIX:'.html', HAS_SOURCE: true, SOURCELINK_SUFFIX: ".txt", }; </script> <script type="text/javascript" src="../../_static/jquery.js"></script> <script type="text/javascript" src="../../_static/underscore.js"></script> <script type="text/javascript" src="../../_static/doctools.js"></script> <script type="text/javascript" src="../../_static/translations.js"></script> <script type="text/javascript" src="https://cdn.bootcss.com/mathjax/2.7.0/MathJax.js"></script> <script type="text/javascript" src="../../_static/js/theme.js"></script> <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script> <script src="https://cdn.jsdelivr.net/perfect-scrollbar/0.6.14/js/perfect-scrollbar.jquery.min.js"></script> <script src="../../_static/js/paddle_doc_init.js"></script> </body> </html>