test=develop, test=document_fix (#2144)

LGTM

6281ff04 · Daniel Yang · GitHub · 2b979c42 · 6281ff04 · 6281ff04
8 changed file
--- a/doc/fluid/advanced_guide/index_cn.rst
+++ b/doc/fluid/advanced_guide/index_cn.rst
@@ -2,31 +2,13 @@
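 Advanced Guide
 ##############

-If you are already fairly proficient at using PaddlePaddle for routine tasks and want to learn more about its industrial deployment capabilities, or want to try your own addon development, please read:
+If you have already learned to use PaddlePaddle for routine tasks and want to learn more about its industrial deployment capabilities, please read:

-    - `Data Preparation <../advanced_guide/data_preparing/index_cn.html>`_: introduces efficient synchronous and asynchronous data reading methods
-
-    - `Distributed Training <../advanced_guide/distributed_training/index_cn.html>`_: introduces how to use distributed training

    - `Inference and Deployment <../advanced_guide/inference_deployment/index_cn.html>`_: introduces how to run inference with a trained model

-    - `Performance Tuning <../advanced_guide/performance_improving/index_cn.html>`_: introduces tuning methods for working with PaddlePaddle
-
-    - `Model Evaluation/Debugging <../advanced_guide/evaluation_debugging/index_cn.html>`_: introduces typical methods for model evaluation and debugging
-
-    - `Addon Development <../advanced_guide/addon_development/index_cn.html>`_: introduces how to add a new Operator and how to contribute code to the PaddlePaddle open-source community
-
-    - `Environment Variable FLAGS <../advanced_guide/flags/flags_cn.html>`_ 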
-
-
 ..  toctree::
    :hidden:

-    data_preparing/index_cn.rst
-    distributed_training/index_cn.rst
    inference_deployment/index_cn.rst
-    performance_improving/index_cn.rst
-    evaluation_debugging/index_cn.rst
-    addon_development/index_cn.rst
-    flags/flags_cn.rst

--- a/doc/fluid/advanced_guide/index_en.rst
+++ b/doc/fluid/advanced_guide/index_en.rst
@@ -8,30 +8,13 @@ Advanced User Guides

 Now that you are familiar with PaddlePaddle, read on to learn more about:

-    - `Prepare Data <data_preparing/index_en.html>`_: How to prepare the data efficiently.
-
-    - `Distributed Training <distributed_training/index_en.html>`_: How to apply the distributed training in your projects.

    - `Deploy Inference Model <inference_deployment/index_en.html>`_: How to deploy the trained network to perform practical inference

-    - `Practice Improving <performance_improving/index_en.html>`_: How to do profiling for Fluid programs
-
-    - `Model Evaluation and Debugging <evaluation_debugging/index_en.html>`_: How to evaluate your program.
-
-    - `Addon Development <addon_development/index_en.html>`_: How to contribute code and documentation to our communities
-
-    - `FLAGS <flags/flags_en.html>`_ 
-

 ..  toctree::
    :hidden:

-    data_preparing/index_en.rst
-    distributed_training/index_en.rst
    inference_deployment/index_en.rst
-    performance_improving/index_en.rst
-    evaluation_debugging/index_en.rst
-    addon_development/index_en.rst
-    flags/flags_en.rst


--- a/doc/fluid/beginners_guide/hapi.md
+++ b/doc/fluid/beginners_guide/hapi.md
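 # Introduction to the High-Level API

-# Introduction
+## Introduction

 PaddleHapi is the new high-level API of PaddlePaddle. It further encapsulates and upgrades the PaddlePaddle API, offering simpler and easier-to-use interfaces that make PaddlePaddle easier to learn and use while extending its functionality.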


--- a/doc/fluid/beginners_guide/index_cn.md
+++ b/doc/fluid/beginners_guide/index_cn.md
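-# PaddlePaddle 2.0 Overview
+Quick Start
+===========

+PaddlePaddle 2.0 Overview
+-------------------------
 While retaining the strengths of version 1.x in industrial-grade, large-scale, efficient training and fast multi-platform deployment, PaddlePaddle 2.0 focuses on improving the usability of the framework. Most of the optimization is in the user interaction layer, lowering the learning threshold and improving development efficiency. Beginners and senior experts alike can use PaddlePaddle more easily for deep learning development, accelerating both cutting-edge algorithm research and industrial-grade task development.

-This version is a beta version that is still under iterative development and is not yet stable; APIs may later undergo incompatible upgrades based on feedback. Developers who want to experience the latest PaddlePaddle features are welcome to try it; for industrial application scenarios that require high stability, the stable Paddle 1.8 release is recommended. This version mainly promotes the imperative development mode and provides high-level API encapsulation. The imperative development mode is very flexible, and the high-level API can greatly reduce repetitive code. For beginners or basic task scenarios, the high-level API is recommended because it is simple and easy to use; for senior developers who want to implement complex functions, the dynamic graph API is recommended because it is flexible and efficient.
+This version is a beta version that is still under iterative development and is not yet stable; APIs may later undergo incompatible upgrades based on feedback. Developers who want to experience the latest PaddlePaddle features are welcome to try it; for industrial application scenarios that require high stability, the stable Paddle
+1.8 release is recommended. This version mainly promotes the imperative development mode and provides high-level API encapsulation. The imperative development mode is very flexible, and the high-level API can greatly reduce repetitive code. For beginners or basic task scenarios, the high-level API is recommended because it is simple and easy to use; for senior developers who want to implement complex functions, the dynamic graph API is recommended because it is flexible and efficient.

 Compared with version 1.x, the major upgrades in PaddlePaddle 2.0 are as follows: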

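+------------------+---------------------------------------------------+---------------------------------------------------------+
 |                  | PaddlePaddle 1.x                                  | PaddlePaddle 2.0                                        |
-| -------- | ---------------------------------- | ------------------------------------- |
+==================+===================================================+=========================================================+
 | Development mode | Declarative recommended                           | Imperative recommended                                  |
+------------------+---------------------------------------------------+---------------------------------------------------------+
 | Networking style | Functional networking recommended                 | Object-oriented networking recommended                  |
+------------------+---------------------------------------------------+---------------------------------------------------------+
 | High-level API   | None                                              | Encapsulates common operations for low-code development |
+------------------+---------------------------------------------------+---------------------------------------------------------+
 | Basic API        | fluid directory; unclear structure, outdated APIs | paddle directory; restructured, deprecated APIs removed |
+------------------+---------------------------------------------------+---------------------------------------------------------+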

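-## Development Mode
+Development Mode
+----------------

 PaddlePaddle supports both declarative and imperative development modes, combining the efficiency of declarative programming with the flexibility of imperative programming.

 In the declarative programming mode (also commonly called static mode or define-and-run mode), a program is clearly divided into two stages: network structure definition and execution. The definition stage declares the network structure without passing in any concrete training data. In the execution stage, the user passes in concrete data through feed and, once the computation finishes, retrieves the results through fetch. For example: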

-```python
-import numpy
-import paddle
-# Define input data placeholders
-a = paddle.nn.data(name="a", shape=[1], dtype='int64')
-b = paddle.nn.data(name="b", shape=[1], dtype='int64')
-# Build the network (here it consists of a single operation, elementwise_add)
-result = paddle.elementwise_add(a, b)
-# Prepare to run the network
-cpu = paddle.CPUPlace() # Define the computing device; here we choose to run on the CPU
-exe = paddle.Executor(cpu) # Create the executor
-# Create the input data
-x = numpy.array([2])
-y = numpy.array([3])
-# Run the network
-outs = exe.run(
+.. code:: python
+
+    import numpy
+    import paddle
+    # Define input data placeholders
+    a = paddle.nn.data(name="a", shape=[1], dtype='int64')
+    b = paddle.nn.data(name="b", shape=[1], dtype='int64')
+    # Build the network (here it consists of a single operation, elementwise_add)
+    result = paddle.elementwise_add(a, b)
+    # Prepare to run the network
+    cpu = paddle.CPUPlace() # Define the computing device; here we choose to run on the CPU
+    exe = paddle.Executor(cpu) # Create the executor
+    # Create the input data
+    x = numpy.array([2])
+    y = numpy.array([3])
+    # Run the network
+    outs = exe.run(
        feed={'a':x, 'b':y}, # Assign the input data x and y to the variables a and b
        fetch_list=[result]  # Use fetch_list to specify the variables whose results should be fetched
        )
-# Print the results
-print (outs)
-#[array([5], dtype=int64)]
-```
+    # Print the results
+    print (outs)
+    #[array([5], dtype=int64)]

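 The advantage of the declarative development mode is that the global network information is available before the program executes, which makes it easy to optimize the computation graph globally and improve performance. Because a global computation graph exists, the graph can also be exported to a file and deployed in non-Python development environments such as C/C++/JavaScript. Its drawback is that, because network definition and execution are separate stages, the concrete data to be processed is unknown at definition time, which makes programs harder to develop and debug.

 In the imperative programming mode (also commonly called dynamic mode, eager mode, or define-by-run mode), the program executes immediately as the network structure is defined, so execution results are available in real time. For example: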

-```python
-import numpy
-import paddle
-from paddle.imperative import to_variable
+.. code:: python

-# Switch to the imperative programming mode
-paddle.enable_imperative()
+    import numpy
+    import paddle
+    from paddle.imperative import to_variable

-# Create the data
-x = to_variable(numpy.array([2]))
-y = to_variable(numpy.array([3]))
-# Define the operation and execute it
-z = paddle.elementwise_add(x, y)
-# Print the execution result
-print (z.numpy())
-```
+    # Switch to the imperative programming mode
+    paddle.enable_imperative()
+
+    # Create the data
+    x = to_variable(numpy.array([2]))
+    y = to_variable(numpy.array([3]))
+    # Define the operation and execute it
+    z = paddle.elementwise_add(x, y)
+    # Print the execution result
+    print (z.numpy())

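 PaddlePaddle 2.0 recommends imperative programming, which lets developers use native Python control flow and is flexible and easy to develop and debug. To retain the performance and deployment advantages of declarative programming, PaddlePaddle also provides automatic conversion, which turns code containing Python control flow into a Program that is executed by the underlying Executor.

-## Networking Style
+Networking Style
+----------------

 PaddlePaddle 1.x makes heavy use of functional networking. This style is concise but less expressive. For instance, inspecting the values of the parameters implicit in fc, or clipping one particular parameter, is difficult, because the implicit parameters can only be accessed through their names. For example: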

-```python
-import paddle.fluid as fluid
+.. code:: python
+
+    import paddle.fluid as fluid
+
+    data = fluid.layers.data(name="data", shape=[32, 32], dtype="float32")
+    fc = fluid.layers.fc(input=data, size=1000, act="tanh")

-data = fluid.layers.data(name="data", shape=[32, 32], dtype="float32")
-fc = fluid.layers.fc(input=data, size=1000, act="tanh")
-```
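+PaddlePaddle 2.0 recommends object-oriented networking, in which a custom network structure is defined by inheriting from the ``paddle.nn.Layer`` class and implementing its ``__init__`` and ``forward`` functions. With this style, each member of the class can be accessed conveniently through the class's member variables. For example:

-PaddlePaddle 2.0 recommends object-oriented networking, in which a custom network structure is defined by inheriting from the `paddle.nn.Layer` class and implementing its `__init__` and `forward` functions. With this style, each member of the class can be accessed conveniently through the class's member variables. For example: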
+.. code:: python

-```python
-import paddle
+    import paddle

-class SimpleNet(paddle.nn.Layer):
+    class SimpleNet(paddle.nn.Layer):
        def __init__(self, in_size, out_size):
            super(SimpleNet, self).__init__()
            self._linear = paddle.nn.Linear(in_size, out_size)
@@ -90,25 +102,26 @@ class SimpleNet(paddle.nn.Layer):
        def forward(self, x):
            y = self._linear(x)
            return y
-```

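-## High-Level API
+High-Level API
+--------------

 Developing a deep learning task with PaddlePaddle involves a set of basic operations: data processing, networking, training, evaluation, model export, and inference deployment. These operations recur across tasks, and with the basic API developers have to write the code for them repeatedly, which adds to the workload of model development. The high-level API encapsulates these basic operations and provides higher-level development interfaces, so developers only need to care about data processing and custom networking and can complete the rest by calling the high-level API. In the MNIST handwritten digit recognition task, using the high-level API reduces non-networking code by 80% compared with an implementation based on the dynamic graph basic API.

-Another benefit of the high-level API is that a single line of code, `paddle.enable_imperative`, switches between the imperative and declarative programming modes. During development you can use the imperative mode for easy debugging; once development is complete, you can switch to the declarative mode to speed up training and simplify deployment. This combines the real-time execution and easy debugging of imperative programming with the global optimization and easy deployment of declarative programming.
+Another benefit of the high-level API is that a single line of code, ``paddle.enable_imperative``, switches between the imperative and declarative programming modes. During development you can use the imperative mode for easy debugging; once development is complete, you can switch to the declarative mode to speed up training and simplify deployment. This combines the real-time execution and easy debugging of imperative programming with the global optimization and easy deployment of declarative programming.

 The following is a basic example of the high-level API: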

-```python
-import numpy as np
-import paddle
-import paddle.nn.functional as F
-from paddle.incubate.hapi.model import Model, Input, Loss
-from paddle.incubate.hapi.loss import CrossEntropy
+.. code:: python

-# Networking with the high-level API requires inheriting from Model; the Model class implements the logic needed to run the model
-class SimpleNet(Model):
+    import numpy as np
+    import paddle
+    import paddle.nn.functional as F
+    from paddle.incubate.hapi.model import Model, Input, Loss
+    from paddle.incubate.hapi.loss import CrossEntropy
+
+    # Networking with the high-level API requires inheriting from Model; the Model class implements the logic needed to run the model
+    class SimpleNet(Model):
        def __init__(self, in_size, out_size):
            super(SimpleNet, self).__init__()
            self._linear = paddle.nn.Linear(in_size, out_size)
@@ -118,49 +131,62 @@ class SimpleNet(Model):
            pred = F.softmax(z)
            return pred

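-# For compatibility with the declarative mode, define the data shapes and types; if the declarative mode is not used, the data placeholders can be omitted
-inputs = [Input([None, 8], 'float32', name='image')]
-labels = [Input([None, 1], 'int64', name='labels')]
+    # For compatibility with the declarative mode, define the data shapes and types; if the declarative mode is not used, the data placeholders can be omitted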
+    inputs = [Input([None, 8], 'float32', name='image')]
+    labels = [Input([None, 1], 'int64', name='labels')]

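-# Define the model network structure, including the loss function and the optimization algorithm
-model = SimpleNet(8, 8)
-optimizer = paddle.optimizer.AdamOptimizer(learning_rate=0.1, parameter_list=model.parameters())
-model.prepare(optimizer, CrossEntropy(), None, inputs, labels, device='cpu')
+    # Define the model network structure, including the loss function and the optimization algorithm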
+    model = SimpleNet(8, 8)
+    optimizer = paddle.optimizer.AdamOptimizer(learning_rate=0.1, parameter_list=model.parameters())
+    model.prepare(optimizer, CrossEntropy(), None, inputs, labels, device='cpu')

-# Switch the execution mode
-paddle.enable_imperative(paddle.CPUPlace())
+    # Switch the execution mode
+    paddle.enable_imperative(paddle.CPUPlace())

-# Batch-based training
-batch_num = 10
-x = np.random.random((4, 8)).astype('float32')
-y = np.random.randint(0, 8, (4, 1)).astype('int64')
-for i in range(batch_num):
+    # Batch-based training
+    batch_num = 10
+    x = np.random.random((4, 8)).astype('float32')
+    y = np.random.randint(0, 8, (4, 1)).astype('int64')
+    for i in range(batch_num):
        model.train_batch(inputs=x, labels=y)
-```

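-For more models and examples developed with the high-level API, see the GitHub repo: [hapi](https://github.com/paddlepaddle/hapi) 
+For more models and examples developed with the high-level API, see the GitHub repo:
+`hapi <https://github.com/paddlepaddle/hapi>`__

-## Basic API
+Basic API
+---------

 PaddlePaddle 2.0 provides new APIs, such as paddle.nn.Linear, that support both the declarative and imperative development modes, avoiding the confusion of using different APIs in the two modes. The original PaddlePaddle 1.x APIs are located under the paddle.fluid directory, and some of the networking APIs there, such as fluid.layers.fc, can only be used for declarative development and not for imperative development.

 PaddlePaddle 2.0 adjusts the API directory structure, moving the APIs from the original paddle.fluid directory to the paddle directory to make the development interfaces clearer. The adjusted directory structure is as follows: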

+---------------------+------------------------------------------------------------------------------------------------------------------------------------+
 | Directory           | Functions and included APIs                                                                                                        |
-| ----------------- | ------------------------------------------------------------ |
-| paddle.*          | Aliases of common APIs kept in the paddle root; currently all APIs under paddle.tensor and paddle.framework |
-| paddle.tensor     | Tensor operation APIs, e.g. zeros (creation), matmul (matrix), concat (transform), elementwise_add (math), argmax (search) |
-| paddle.nn         | Networking APIs, e.g. input placeholders data/Input, control flow while_loop/cond, loss functions, convolution, LSTM, activations |
+=====================+====================================================================================================================================+
+| paddle.\*           | Aliases of common APIs kept in the paddle root; currently all APIs under paddle.tensor and paddle.framework                        |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| paddle.tensor       | Tensor operation APIs, e.g. zeros (creation), matmul (matrix), concat (transform), elementwise\_add (math), argmax (search)        |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| paddle.nn           | Networking APIs, e.g. input placeholders data/Input, control flow while\_loop/cond, loss functions, convolution, LSTM, activations |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------+
 | paddle.framework    | Basic framework APIs, e.g. Variable, Program, Executor                                                                             |
-| paddle.imperative | APIs specific to imperative mode, e.g. to_variable, prepare_context |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| paddle.imperative   | APIs specific to imperative mode, e.g. to\_variable, prepare\_context                                                              |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------+
 | paddle.optimizer    | Optimization algorithm APIs, e.g. SGD, Adagrad, Adam                                                                               |
-| paddle.metric     | Evaluation metric APIs, e.g. accuracy, cos_sim             |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------+
+| paddle.metric       | Evaluation metric APIs, e.g. accuracy, cos\_sim                                                                                    |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------+
 | paddle.io           | Data input/output APIs, e.g. save, load, Dataset, DataLoader                                                                       |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------+
 | paddle.device       | Device management APIs, e.g. CPUPlace, CUDAPlace                                                                                   |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------+
 | paddle.fleet        | Distributed training APIs                                                                                                          |
+---------------------+------------------------------------------------------------------------------------------------------------------------------------+

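-PaddlePaddle 2.0 also cleans up some Paddle 1.x APIs, removing some that are no longer recommended. For details, see the Release Note.
-
+PaddlePaddle 2.0 also cleans up some Paddle
+1.x APIs, removing some that are no longer recommended. For details, see the Release
+Note.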

 ..  toctree::
    :hidden:

--- a/doc/fluid/index_cn.rst
+++ b/doc/fluid/index_cn.rst
@@ -12,8 +12,6 @@

    install/index_cn.rst
    beginners_guide/index_cn.rst
-    user_guides/index_cn.rst
    advanced_guide/index_cn.rst
    api_cn/index_cn.rst
-    faq/index_cn.rst
    release_note_cn.md
--- a/doc/fluid/index_en.rst
+++ b/doc/fluid/index_en.rst
@@ -6,8 +6,6 @@
 	
    install/index_en.rst
    beginners_guide/index_en.rst
-    user_guides/index_en.rst
    advanced_guide/index_en.rst
    api/index_en.rst
-    faq/index_en.rst
    release_note_en.md
--- a/doc/fluid/release_note_cn.md
+++ b/doc/fluid/release_note_cn.md
--- a/doc/fluid/release_note_en.md
+++ b/doc/fluid/release_note_en.md
+# Release Note
+
+## Important Statements
+
+- This is a beta version. It is still being iterated on and is not yet stable, and APIs may later undergo incompatible upgrades based on feedback. Developers who want to experience the latest features of Paddle are welcome to try this version. For industrial application scenarios requiring high stability, the stable Paddle Version 1.8 is recommended.
+
+- This version mainly promotes the dynamic graph development method and provides high-level API encapsulation. The dynamic graph mode is highly flexible, and high-level APIs can greatly reduce duplicated code. For beginners or basic task scenarios, the high-level API development method is recommended because it is simple and easy to use. For senior developers who want to implement complex functions, the dynamic graph API is recommended because it is flexible and efficient.
+
+- This version also optimizes the Paddle API directory system. APIs in the original directories remain available through aliases, but new programs are recommended to use the new directory structure.
+
+## Basic Framework
+
+### Basic APIs
+
+- Networking APIs are unified across dynamic and static graphs and can run in both the dynamic and static graph modes
+
+- The API directory structure is adjusted. In the Paddle Version 1.x, the APIs are mainly located in the paddle.fluid directory. This version adjusts the API directory structure so that the classification is more reasonable. The specific adjustment rules are as follows:
+
+  - Moves the APIs related to the tensor operations in the original fluid.layers directory to the paddle.tensor directory
+  - Moves the networking-related operations in the original fluid.layers directory to the paddle.nn directory. Puts the types with parameters in the paddle.nn.layers directory and the functional APIs in the paddle.nn.functional directory
+  - Moves the special API for dynamic graphs in the original fluid.dygraph directory to the paddle.imperative directory
+  - Creates a paddle.framework directory that is used to store framework-related program, executor, and other APIs
+  - Creates a paddle.distributed directory that is used to store distributed related APIs
+  - Creates a paddle.optimizer directory that is used to store APIs related to optimization algorithms
+  - Creates a paddle.metric directory that is used to store APIs related to evaluation metric calculation
+  - Creates a paddle.incubate directory that is used to store incubating code. APIs may be adjusted. This directory stores code related to complex number computation and high-level APIs
+  - Creates an alias in the paddle directory for all APIs in the paddle.tensor and paddle.framework directories. For example, paddle.tensor.creation.ones can use paddle.ones as an alias
+
+- The added APIs are as follows:
+
+  - Adds eight networking APIs in the paddle.nn directory: interpolate, LogSoftmax, ReLU, Sigmoid, loss.BCELoss, loss.L1Loss, loss.MSELoss, and loss.NLLLoss
+  - Adds 59 tensor-related APIs in the paddle.tensor directory: add, addcmul, addmm, allclose, arange, argmax, atan, bmm, cholesky, clamp, cross, diag\_embed, dist, div, dot, elementwise\_equal, elementwise\_sum, equal, eye, flip, full, full\_like, gather, index\_sample, index\_select, linspace, log1p, logsumexp, matmul, max, meshgrid, min, mm, mul, nonzero, norm, ones, ones\_like, pow, randint, randn, randperm, roll, sin, sort, split, sqrt, squeeze, stack, std, sum, t, tanh, tril, triu, unsqueeze, where, zeros, and zeros\_like
+  - Adds device\_guard that is used to specify a device. Adds manual\_seed that is used to initialize a random number seed
+
+### High-level APIs
+
+- Adds a paddle.incubate.hapi directory that encapsulates common operations in the model development process, such as networking, training, evaluation, inference, and access, to enable low-code development. Compared with the dynamic graph implementation of the MNIST task, high-level APIs can reduce executable code by 80%.
+- Adds the Model class, which inherits from the Layer class and encapsulates common basic functions in the model development process, including:
+  - Provides a prepare API that is used to specify a loss function and an optimization algorithm
+  - Provides a fit API to implement training and evaluation. Implements the execution of model storage and other user-defined functions during the training process by means of callback
+  - Provides an evaluate interface to implement inference and evaluation metric calculation on the evaluation set
+  - Provides a predict interface to implement specific test data inference
+  - Provides a train\_batch interface to implement the training of single-batch data
+- Adds a dataset interface to encapsulate commonly-used data sets and supports random access to data
+- Adds encapsulation of common Loss and Metric types
+- Adds 16 common data processing interfaces including Resize and Normalize in the CV field
+- Adds lenet, vgg, resnet, mobilenetv1, and mobilenetv2 image classification backbone networks in the CV field
+- Adds MultiHeadAttention, BeamSearchDecoder, TransformerEncoder, TransformerDecoder, and DynamicDecode APIs in the NLP field
+- Releases 12 models based on high-level API implementation, including Transformer, Seq2seq, LAC, BMN, ResNet, YOLOv3, VGG, MobileNet, TSM, CycleGAN, Bert, and OCR
+
+### Performance Optimization
+
+- Adds a `reshape+transpose+matmul` fuse so that the performance of the INT8 model is improved by about 4% (on the 6271 machine) after Ernie quantization. After the quantization, the speed of the INT8 model is increased by about 6.58 times compared with the FP32 model on which DNNL optimization (including fuses) and quantization are not performed
+
+### Debugging Analysis
+
+- To address overly lengthy program printouts and their inefficient use during debugging, considerably simplifies the printed strings of objects such as programs, blocks, operators, and variables, improving debugging efficiency without losing useful information
+- To address the unsafe third-party library API `boost::get` and the difficulty of debugging the exceptions it raises at runtime, adds the `BOOST_GET` series of macros to replace over 600 risky `boost::get` calls in Paddle. Enriches the error message for `boost::bad_get` exceptions by adding the C++ error message stack, the error file and line number, the expected output type, and the actual type, improving the debugging experience
+
+## Bug Fixes
+
+- Fix the bug of wrong computation results when any slice operation exists in the while loop
+- Fix the problem of degradation of the transformer model caused by inplace ops
+- Fix the problem of running failure of the last batch in the Ernie precision test
+- Fix the problem of failure to correctly exit when exceptions occur in context of fluid.dygraph.guard
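
The declarative (define-and-run) and imperative (define-by-run) modes contrasted in the overview diff above can be illustrated without Paddle. The following is a minimal plain-Python sketch, not Paddle code: `Placeholder`, `Add`, and `run` are hypothetical names invented for this illustration and do not correspond to any Paddle API.

```python
# Plain-Python sketch of the two development modes (no Paddle required).
# Placeholder, Add, and run are hypothetical names used only for illustration.


class Placeholder:
    """Stand-in for a declarative input whose value is supplied later."""

    def __init__(self, name):
        self.name = name


class Add:
    """Graph node that records an addition without computing it."""

    def __init__(self, left, right):
        self.left = left
        self.right = right


def run(node, feed):
    """Evaluate a recorded graph once concrete values are fed in."""
    if isinstance(node, Placeholder):
        return feed[node.name]
    if isinstance(node, Add):
        return run(node.left, feed) + run(node.right, feed)
    raise TypeError(f"unsupported node: {node!r}")


# Declarative (define-and-run): build the whole graph first, feed data later.
a, b = Placeholder("a"), Placeholder("b")
graph = Add(a, b)
declarative_result = run(graph, feed={"a": 2, "b": 3})

# Imperative (define-by-run): the operation executes as soon as it is written.
imperative_result = 2 + 3

print(declarative_result, imperative_result)
```

In the declarative half, nothing is computed until `run` receives the feed, which is what makes whole-graph optimization and export possible; in the imperative half, the intermediate value is available immediately, which is what makes debugging easier.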