<p>This tutorial introduces techniques we use to profile and tune the
CPU performance of PaddlePaddle. We will use the Python packages
<code class="docutils literal"><span class="pre">cProfile</span></code> and <code class="docutils literal"><span class="pre">yep</span></code>, and Google’s <code class="docutils literal"><span class="pre">perftools</span></code>.</p>
<p>Profiling is the process that reveals performance bottlenecks,
which can be very different from what developers expect.
Performance tuning is done to fix these bottlenecks. Performance
optimization alternates between profiling and tuning.</p>
<p>PaddlePaddle users program AI applications by calling the Python API, which calls
into <code class="docutils literal"><span class="pre">libpaddle.so</span></code>, a shared library written in C++. In this tutorial, we focus on
the profiling and tuning of</p>
<ol class="simple">
...
...
focus on. We can sort the above profiling file by <code class="docutils literal"><span class="pre">tottime</span></code>:</p>
</pre></div>
</div>
<p>We can see that the most time-consuming function is the <code class="docutils literal"><span class="pre">built-in</span> <span class="pre">method</span> <span class="pre">run</span></code>, which is a C++ function in <code class="docutils literal"><span class="pre">libpaddle.so</span></code>. We will
explain how to profile C++ code in the next section. For the
moment, let’s look into the third function, <code class="docutils literal"><span class="pre">sync_with_cpp</span></code>, which is a
Python function. We can click it to understand more about it:</p>
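<p>For readers who prefer the standard library over
<code class="docutils literal"><span class="pre">cprofilev</span></code>, the built-in <code class="docutils literal"><span class="pre">pstats</span></code> module can do the same
sort by <code class="docutils literal"><span class="pre">tottime</span></code> and list the callers of <code class="docutils literal"><span class="pre">sync_with_cpp</span></code>; the
sketch below assumes the hypothetical <code class="docutils literal"><span class="pre">profile.out</span></code> file from the
earlier example.</p>
<div class="highlight-python"><div class="highlight"><pre>import pstats

# Load the cProfile output and list the ten functions with the
# largest total (self) time.
stats = pstats.Stats('profile.out')
stats.sort_stats('tottime').print_stats(10)

# Show which functions call sync_with_cpp and how much time the
# calls account for.
stats.print_callers('sync_with_cpp')
</pre></div>
</div>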
<spanid="id1"></span><h2>Look into the Profiling File<aclass="headerlink"href="#look-into-the-profiling-file"title="Permalink to this headline">¶</a></h2>
<p>The tool we used to look into the profiling file generated by
<spanid="examining-the-profiling-file"></span><h2>Examining the Profiling File<aclass="headerlink"href="#examining-the-profiling-file"title="Permalink to this headline">¶</a></h2>
<p>The tool we used to examine the profiling file generated by
<codeclass="docutils literal"><spanclass="pre">perftools</span></code> is <aclass="reference external"href="https://github.com/google/pprof"><codeclass="docutils literal"><spanclass="pre">pprof</span></code></a>, which
provides a Web-based GUI like <codeclass="docutils literal"><spanclass="pre">cprofilev</span></code>.</p>
<p>We can rely on the standard Go toolchain to retrieve the source code
...
...
of the gradient of multiplication takes 2% to 4% of the total running
time, and <code class="docutils literal"><span class="pre">MomentumOp</span></code> takes about 17%. Obviously, we’d want to