Fix paddle docs (#2592)

* fix___init_bug
* fix migration guide
* add api_reference test=develop
* fix style test=develop
* change faq to chinese test=develop
* add release note en
* fix release note test=develop
* fix api_reference test=develop
* fix style test=develop

ec40ff48 · Chen Long · GitHub · e26fbc40
8 changed files
--- a/doc/paddle/api/alias_api_mapping
+++ b/doc/paddle/api/alias_api_mapping
@@ -44,7 +44,7 @@ paddle.tensor.manipulation.reshape	paddle.reshape,paddle.tensor.reshape
 paddle.fluid.layers.increment	paddle.increment,paddle.tensor.increment,paddle.tensor.math.increment
 paddle.fluid.compiler.CompiledProgram	paddle.static.CompiledProgram
 paddle.tensor.manipulation.flip	paddle.flip,paddle.reverse,paddle.tensor.flip,paddle.tensor.reverse
-paddle.distributed.__init__.paddle.fluid.dygraph.parallel.ParallelEnv	paddle.distributed.ParallelEnv
+paddle.fluid.dygraph.parallel.ParallelEnv	paddle.distributed.ParallelEnv
 paddle.fluid.layers.hash	paddle.nn.functional.hash,paddle.nn.functional.lod.hash
 paddle.nn.functional.activation.selu	paddle.nn.functional.selu
 paddle.nn.functional.input.embedding	paddle.nn.functional.embedding
@@ -193,7 +193,7 @@ paddle.nn.layer.conv.Conv1d	paddle.nn.Conv1d,paddle.nn.layer.Conv1d
 paddle.fluid.param_attr.ParamAttr	paddle.ParamAttr,paddle.framework.ParamAttr
 paddle.fluid.layers.retinanet_target_assign	paddle.nn.functional.retinanet_target_assign,paddle.nn.functional.vision.retinanet_target_assign
 paddle.fluid.initializer.Xavier	paddle.nn.initializer.Xavier
-paddle.distributed.__init__.paddle.fluid.dygraph.parallel.prepare_context	paddle.distributed.prepare_context
+paddle.fluid.dygraph.parallel.prepare_context	paddle.distributed.prepare_context
 paddle.tensor.math.pow	paddle.pow,paddle.tensor.pow
 paddle.fluid.layers.bipartite_match	paddle.nn.functional.bipartite_match,paddle.nn.functional.vision.bipartite_match
 paddle.fluid.input.embedding	paddle.static.nn.embedding
@@ -253,7 +253,7 @@ paddle.tensor.linalg.matmul	paddle.matmul,paddle.tensor.matmul
 paddle.fluid.layers.generate_proposals	paddle.nn.functional.generate_proposals,paddle.nn.functional.vision.generate_proposals
 paddle.nn.layer.loss.SmoothL1Loss	paddle.nn.SmoothL1Loss,paddle.nn.layer.SmoothL1Loss
 paddle.fluid.dygraph.checkpoint.save_dygraph	paddle.save,paddle.framework.save
-paddle.framework.__init__.paddle.fluid.core	paddle.framework.core
+paddle.fluid.core	paddle.framework.core
 paddle.nn.functional.vision.grid_sample	paddle.nn.functional.grid_sample
 paddle.tensor.random.rand	paddle.rand,paddle.tensor.rand
 paddle.fluid.layers.cond	paddle.nn.cond,paddle.nn.control_flow.cond
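
The three fixes above all remove an `__init__` segment that had leaked into the left-hand column of the mapping. A minimal sketch of a check for such entries follows. It assumes each line holds a real path, some whitespace, and then a comma-separated alias list; the helper names `parse_alias_line` and `is_malformed` are hypothetical and not part of the Paddle docs tooling.

```python
def parse_alias_line(line):
    """Split one mapping line into (real_path, [aliases])."""
    # Assumption: the two columns are separated by whitespace (a tab in the file).
    real_path, aliases = line.strip().split(None, 1)
    return real_path, aliases.split(",")


def is_malformed(real_path):
    """Flag paths that embed an '__init__' segment, like the entries fixed here."""
    return "__init__" in real_path.split(".")


lines = [
    "paddle.tensor.math.pow\tpaddle.pow,paddle.tensor.pow",
    "paddle.framework.__init__.paddle.fluid.core\tpaddle.framework.core",
    "paddle.fluid.core\tpaddle.framework.core",
]

for line in lines:
    real_path, aliases = parse_alias_line(line)
    status = "malformed" if is_malformed(real_path) else "ok"
    print(f"{status}: {real_path} -> {aliases}")
```

Running a check of this kind over the whole file would list any remaining entries with the same problem.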

--- a/doc/paddle/api/index_cn.rst
+++ b/doc/paddle/api/index_cn.rst
 ==================
-API REFERENCE
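+API Documentation
 ==================
+
+PaddlePaddle (PArallel Distributed Deep LEarning) is an easy-to-use, efficient, flexible, and extensible deep learning framework.
+This page lists the APIs supported by PaddlePaddle 2.0-beta, where you can look up information about each API.
+
+In addition, you can refer to PaddlePaddle's `GitHub <https://github.com/PaddlePaddle/Paddle>`_ for details, or read the `Release Notes <../release_note_cn.html>`_ to learn about the features of the new version.
+
+**The API directory structure of PaddlePaddle Framework 2.0 is as follows:**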
+
+-------------------------------+-------------------------------------------------------+
+| Directory                     | Functions and Included APIs                           |
+===============================+=======================================================+
+| paddle.\*                     | The aliases of commonly used APIs are reserved in the |
+|                               | paddle root directory, which currently include all    |
+|                               | the APIs in the paddle.tensor and paddle.framework    |
+|                               | directories                                           |
+-------------------------------+-------------------------------------------------------+
+| paddle.tensor                 | APIs related to tensor operations such as creating    |
+|                               | zeros, matrix operation matmul, transforming concat,  |
+|                               | computing add, and finding argmax                     |
+-------------------------------+-------------------------------------------------------+
+| paddle.nn                     | Networking-related APIs such as Linear, Conv2d, loss  |
+|                               | function, convolution, LSTM, and activation function  |
+-------------------------------+-------------------------------------------------------+
+| paddle.static.nn              | Special APIs for networking under a static graph such |
+|                               | as input placeholder data/Input and control flow      |
+|                               | while_loop/cond                                       |
+-------------------------------+-------------------------------------------------------+
+| paddle.static                 | APIs related to the basic framework under a static    |
+|                               | graph such as Variable, Program, and Executor         |
+-------------------------------+-------------------------------------------------------+
+| paddle.framework              | Universal APIs and imperative mode APIs such as       |
+|                               | to_tensor and prepare_context                         |
+-------------------------------+-------------------------------------------------------+
+| paddle.optimizer              | APIs related to optimization algorithms such as SGD,  |
+|                               | Adagrad, and Adam                                     |
+-------------------------------+-------------------------------------------------------+
+| paddle.optimizer.lr_scheduler | APIs related to learning rate decay                   |
+|                               |                                                       |
+-------------------------------+-------------------------------------------------------+
+| paddle.metric                 | APIs related to evaluation metric computation such as |
+|                               | accuracy and auc                                      |
+-------------------------------+-------------------------------------------------------+
+| paddle.io                     | APIs related to data input and output such as save,   |
+|                               | load, Dataset, and DataLoader                         |
+-------------------------------+-------------------------------------------------------+
+| paddle.device                 | APIs related to device management such as CPUPlace    |
+|                               | and CUDAPlace                                         |
+-------------------------------+-------------------------------------------------------+
+| paddle.distributed            | Basic distributed APIs                                |
+|                               |                                                       |
+-------------------------------+-------------------------------------------------------+
+| paddle.distributed.fleet      | High-level distributed APIs                           |
+|                               |                                                       |
+-------------------------------+-------------------------------------------------------+
+| paddle.vision                 | Vision domain APIs such as datasets, data processing, |
+|                               | and commonly used basic network structures like       |
+|                               | resnet                                                |
+-------------------------------+-------------------------------------------------------+
+| paddle.text                   | NLP domain APIs such as datasets, data processing,    |
+|                               | and commonly used network structures like             |
+|                               | transformer                                           |
+-------------------------------+-------------------------------------------------------+
--- a/doc/paddle/api/index_en.rst
+++ b/doc/paddle/api/index_en.rst
 ==================
-API REFERENCE
+API Reference
 ==================
+
+PaddlePaddle (PArallel Distributed Deep LEarning) is an efficient, flexible, and extensible deep learning framework.
+This page lists the APIs supported by PaddlePaddle 2.0-beta. You can view the information of the APIs here.
+
+In addition, you can refer to PaddlePaddle's `GitHub <https://github.com/PaddlePaddle/Paddle>`_ for details, or read `Release Notes <../release_note_en.html>`_ to learn about the features of the new version.
+
+**The API directory structure of PaddlePaddle 2.0-beta is as follows:**
+
+-------------------------------+-------------------------------------------------------+
+| Directory                     | Functions and Included APIs                           |
+===============================+=======================================================+
+| paddle.*                      | The aliases of commonly used APIs are reserved in the |
+|                               | paddle root directory, which currently include all    |
+|                               | the APIs in the paddle.tensor and paddle.framework    |
+|                               | directories                                           |
+-------------------------------+-------------------------------------------------------+
+| paddle.tensor                 | APIs related to tensor operations such as creating    |
+|                               | zeros, matrix operation matmul, transforming concat,  |
+|                               | computing add, and finding argmax                     |
+-------------------------------+-------------------------------------------------------+
+| paddle.nn                     | Networking-related APIs such as Linear, Conv2d, loss  |
+|                               | function, convolution, LSTM, and activation function  |
+-------------------------------+-------------------------------------------------------+
+| paddle.static.nn              | Special APIs for networking under a static graph such |
+|                               | as input placeholder data/Input and control flow      |
+|                               | while_loop/cond                                       |
+-------------------------------+-------------------------------------------------------+
+| paddle.static                 | APIs related to the basic framework under a static    |
+|                               | graph such as Variable, Program, and Executor         |
+-------------------------------+-------------------------------------------------------+
+| paddle.framework              | Universal APIs and imperative mode APIs such as       |
+|                               | to_variable and prepare_context                       |
+-------------------------------+-------------------------------------------------------+
+| paddle.optimizer              | APIs related to optimization algorithms such as SGD,  |
+|                               | Adagrad, and Adam                                     |
+-------------------------------+-------------------------------------------------------+
+| paddle.optimizer.lr_scheduler | APIs related to learning rate attenuation             |
+|                               |                                                       |
+-------------------------------+-------------------------------------------------------+
+| paddle.metric                 | APIs related to evaluation index computation such as  |
+|                               | accuracy and auc                                      |
+-------------------------------+-------------------------------------------------------+
+| paddle.io                     | APIs related to data input and output such as save,   |
+|                               | load, Dataset, and DataLoader                         |
+-------------------------------+-------------------------------------------------------+
+| paddle.device                 | APIs related to device management such as CPUPlace    |
+|                               | and CUDAPlace                                         |
+-------------------------------+-------------------------------------------------------+
+| paddle.distributed            | Distributed related basic APIs                        |
+|                               |                                                       |
+-------------------------------+-------------------------------------------------------+
+| paddle.distributed.fleet      | Distributed related high-level APIs                   |
+|                               |                                                       |
+-------------------------------+-------------------------------------------------------+
+| paddle.vision                 | Vision domain APIs such as datasets, data processing, |
+|                               | and commonly used basic network structures like       |
+|                               | resnet                                                |
+-------------------------------+-------------------------------------------------------+
+| paddle.text                   | NLP domain APIs such as datasets, data processing,    |
+|                               | and commonly used basic network structures like       |
+|                               | transformer                                           |
+-------------------------------+-------------------------------------------------------+
--- a/doc/paddle/api/not_display_doc_list
+++ b/doc/paddle/api/not_display_doc_list
 paddle.utils
 paddle.incubate
+paddle.hapi
--- a/doc/paddle/faq/index_cn.rst
+++ b/doc/paddle/faq/index_cn.rst
 ##############
-FAQ
+Common Issues
 ##############
 If you run into usage questions while developing with the Paddle framework and want quick official answers and guidance, check the FAQ first.


--- a/doc/paddle/guides/index_cn.rst
+++ b/doc/paddle/guides/index_cn.rst
@@ -8,7 +8,7 @@ PaddlePaddle (PArallel Distributed Deep LEarning)是一个易用、高效、灵

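 Let's start by learning the basic concepts of PaddlePaddle:

- `Version Conversion Tool <./migration_cn.html>`_: introduces the changes from Paddle 1 to Paddle 2 and how to use the Paddle1to2 conversion tool.
+- `Version Migration <./migration_cn.html>`_: introduces the changes from Paddle 1 to Paddle 2 and how to use the Paddle1to2 conversion tool.
 - `Dynamic-to-Static Graph Conversion <./dygraph_to_static/index_cn.html>`_: introduces how to convert a Paddle dynamic graph to a static graph
 - `Model Saving and Loading <./model_save_load_cn.html>`_: introduces how to save and load Paddle models and parameters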


--- a/doc/paddle/guides/migration_cn.rst
+++ b/doc/paddle/guides/migration_cn.rst
-Paddle 1 to Paddle 2
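+Version Migration
 ====================

 In PaddlePaddle Framework v2.0-beta, the most important changes are the full upgrade of the API system and the comprehensive improvement of the dynamic graph capability. The following briefly introduces Paddle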

--- a/doc/paddle/release_note_en.md
+++ b/doc/paddle/release_note_en.md
 # Release Note

-## Important Statements
+## Important Update
+This is the beta release of PaddlePaddle Framework v2.0. The most important changes are a full upgrade of the API system and comprehensive improvements to the imperative programming (dynamic graph) capability. This version systematically reorganizes the directory structure of the basic PaddlePaddle APIs, fixes a number of long-standing issues, fills in missing APIs, and in particular provides more complete high-level APIs. It also adds support for quantization training and mixed precision training under a dynamic graph. Dynamic-to-static conversion now has complete syntax support, and usability is substantially improved. Dynamic graph functionality is close to complete, and the default development mode of PaddlePaddle is changed to dynamic graph mode. In addition, the C++ APIs of the inference library are upgraded and optimized, and both the inference library's support for quantized models and its inference performance are enhanced.

- This version is a beta version. It is still in iteration and is not stable at present. Incompatible upgrade may be subsequently performed on APIs based on the feedback. For developers who want to experience the latest features of Paddle, welcome to this version. For industrial application scenarios requiring high stability, the stable Paddle Version 1.8 is recommended.
- This version mainly popularizes the imperative programming development method and provides the encapsulation of high-level APIs. The imperative programming(dynamic graph) mode has great flexibility and high-level APIs can greatly reduces duplicated codes. For beginners or basic task scenarios, the high-level API development method is recommended because it is simple and easy to use. For senior developers who want to implement complex functions, the imperative programming API is commended because it is flexible and efficient.
- This version also optimizes the Paddle API directory system. The APIs in the original directory can create an alias and are still available, but it is recommended that new programs use the new directory structure.
-
-## Basic Framework
+## Training Framework

 ### Basic APIs

- Networking APIs achieve dynamic and static unity and support operation in imperative programming and declarative programming modes(static graph)
+#### Compatibility Description
+
+For Paddle 2.x, users are recommended to use the APIs in the paddle root directory. All Paddle 1.x APIs are retained in the paddle.fluid directory. By design, training code written for Paddle 1.x runs on Paddle 2.x without modification, and models saved by Paddle 1.x training can be used for inference with Paddle 2.x.
+
+#### Directory Structure Adjustment
+- Based on the 2.0-alpha version, this version has made some adjustments to the directory structure. The latest adjusted directory structure is as follows:
+
+  | Directory | Functions and Included APIs |
+  | :--- | --------------- |
+  | paddle.* | The aliases of commonly used APIs are reserved in the paddle root directory, which currently include all the APIs in the paddle.tensor and paddle.framework directories |
+  | paddle.tensor | APIs related to tensor operations such as creating zeros, matrix operation matmul, transforming concat, computing add, and finding argmax |
+  | paddle.nn | Networking-related APIs such as Linear, Conv2d, loss function, convolution, LSTM, and activation function |
+  | paddle.static.nn | Special APIs for networking under a static graph such as input placeholder data/Input and control flow while_loop/cond |
+  | paddle.static | APIs related to the basic framework under a static graph such as Variable, Program, and Executor |
+  | paddle.framework | Universal APIs and imperative mode APIs such as to_variable and prepare_context |
+  | paddle.optimizer | APIs related to optimization algorithms such as SGD, Adagrad, and Adam |
+  | paddle.optimizer.lr_scheduler | APIs related to learning rate attenuation |
+  | paddle.metric | APIs related to evaluation index computation such as accuracy and auc |
+  | paddle.io | APIs related to data input and output such as save, load, Dataset, and DataLoader |
+  | paddle.device | APIs related to device management such as CPUPlace and CUDAPlace |
+  | paddle.distributed | Distributed related basic APIs |
+  | paddle.distributed.fleet | Distributed related high-level APIs |
+  | paddle.vision | Vision domain APIs such as datasets, data processing, and commonly used basic network structures like resnet |
+  | paddle.text | NLP domain APIs such as datasets, data processing, and commonly used basic network structures like transformer |
+
+#### API Alias Rules
+- For the convenience of users, APIs are aliased under several paths, such as `paddle.add -> paddle.tensor.add`. Users are recommended to use the shorter path `paddle.add`.
+
+- All the APIs in the framework and tensor directories are aliased in the paddle root directory. Apart from a few special APIs, no other APIs have aliases in the paddle root directory.

- The API directory structure is adjusted. In the Paddle Version 1.x, the APIs are mainly located in the paddle.fluid directory. This version adjusts the API directory structure so that the classification is more reasonable. The specific adjustment rules are as follows:
+- All the APIs in the paddle.nn directory, except those in the functional directory, have aliases in the paddle.nn directory. All the APIs in the functional directory have no aliases in the paddle.nn directory.

-  - Moves the APIs related to the tensor operations in the original fluid.layers directory to the paddle.tensor directory
-  - Moves the networking-related operations in the original fluid.layers directory to the paddle.nn directory. Puts the types with parameters in the paddle.nn.layers directory and the functional APIs in the paddle.nn.functional directory
-  - Moves the special API for imperative programming in the original fluid.dygraph directory to the paddle.imperative directory
-  - Creates a paddle.framework directory that is used to store framework-related program, executor, and other APIs
-  - Creates a paddle.distributed directory that is used to store distributed related APIs
-  - Creates a paddle.optimizer directory that is used to store APIs related to optimization algorithms
-  - Creates a paddle.metric directory that is used to create APIs related to evaluation index calculation
-  - Creates a paddle.incubate directory that is used to store incubating codes. APIs may be adjusted. This directory stores codes related to complex number computation and high-level APIs
-  - Creates an alias in the paddle directory for all APIs in the paddle.tensor and paddle.framework directories. For example, paddle.tensor.creation.ones can use paddle.ones as an alias
+- The following are some special alias relations. It is recommended to use the names on the left.
+  - paddle.sigmoid -> paddle.tensor.sigmoid -> paddle.nn.functional.sigmoid
+  - paddle.tanh -> paddle.tensor.tanh -> paddle.nn.functional.tanh
+  - paddle.remainder -> paddle.mod -> paddle.floor_mod
+  - paddle.divide -> paddle.true_divide
+  - paddle.rand -> paddle.uniform
+  - paddle.randn -> paddle.standard_normal
+  - Optimizer.clear_grad -> Optimizer.clear_gradients
+  - Optimizer.set_state_dict -> Optimizer.set_dict
+  - Optimizer.get_lr -> Optimizer.current_step_lr
+  - Layer.clear_grad -> Layer.clear_gradients
+  - Layer.set_state_dict -> Layer.set_dict
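
The alias relations listed above can be treated as chains whose leftmost entry is the recommended name. The sketch below is illustrative only: it copies the module-level chains from the list into plain Python data, and the `preferred_name` helper is hypothetical rather than a Paddle API.

```python
# Each chain is copied from the list above; the first entry is the recommended name.
ALIAS_CHAINS = [
    ["paddle.sigmoid", "paddle.tensor.sigmoid", "paddle.nn.functional.sigmoid"],
    ["paddle.tanh", "paddle.tensor.tanh", "paddle.nn.functional.tanh"],
    ["paddle.remainder", "paddle.mod", "paddle.floor_mod"],
    ["paddle.divide", "paddle.true_divide"],
    ["paddle.rand", "paddle.uniform"],
    ["paddle.randn", "paddle.standard_normal"],
]

# Map every name in a chain to the chain's first (recommended) entry.
PREFERRED = {name: chain[0] for chain in ALIAS_CHAINS for name in chain}


def preferred_name(name):
    """Return the recommended spelling, or the name itself if it is not listed."""
    return PREFERRED.get(name, name)


print(preferred_name("paddle.floor_mod"))
print(preferred_name("paddle.add"))
```

A lookup of this shape only normalizes names; it says nothing about whether the aliased functions accept identical arguments.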

- The added APIs are as follows:
+#### Name Change of Commonly Used APIs

-  - Adds eight networking APIs in the paddle.nn directory: interpolate, LogSoftmax, ReLU, Sigmoid, loss.BCELoss, loss.L1Loss, loss.MSELoss, and loss.NLLLoss
-  - Adds 59 tensor-related APIs in the paddle.tensor directory: add, addcmul, addmm, allclose, arange, argmax, atan, bmm, cholesky, clamp, cross, diag\_embed, dist, div, dot, elementwise\_equal, elementwise\_sum, equal, eye, flip, full, full\_like, gather, index\_sample, index\_select, linspace, log1p, logsumexp, matmul, max, meshgrid, min, mm, mul, nonzero, norm, ones, ones\_like, pow, randint, randn, randperm, roll, sin, sort, split, sqrt, squeeze, stack, std, sum, t, tanh, tril, triu, unsqueeze, where, zeros, and zeros\_like
-  - Adds device\_guard that is used to specify a device. Adds manual\_seed that is used to initialize a random number seed
+- This version uses tensors to represent data; the tensor creation API paddle.fluid.dygraph.to_variable is renamed to paddle.to_tensor
+- Addition, subtraction, multiplication, and division use their full names (add, subtract, multiply, divide)
+- Element-wise operations no longer carry the elementwise prefix
+- Operations along a given axis no longer carry the reduce prefix
+- For the Conv, Pool, Dropout, BatchNorm, and Pad networking APIs, 1d, 2d, and 3d suffixes are added according to the input data type

- Some of the APIs in the original fluid directory have not been migrated to the paddle directory
-  - The following API under fluid.contrib directory are kept in the original location, not migrated：BasicGRUUnit, BasicLSTMUnit, BeamSearchDecoder, Compressor, HDFSClient, InitState, QuantizeTranspiler, StateCell, TrainingDecoder, basic_gru, basic_lstm, convert_dist_to_sparse_program, ctr_metric_bundle, extend_with_decoupled_weight_decay, fused_elemwise_activation, fused_embedding_seq_pool, load_persistables_for_increment, load_persistables_for_inference, match_matrix_tensor, memory_usage, mixed_precision.AutoMixedPrecisionLists, mixed_precision.decorate, multi_download, multi_upload, multiclass_nms2, op_freq_statistic, search_pyramid_hash, sequence_topk_avg_pooling, shuffle_batch, tree_conv, var_conv_2d
-  - The following APIs related to LodTensor are still under development and have not been migrated yet：LoDTensor, LoDTensorArray, create_lod_tensor, create_random_int_lodtensor, DynamicRNN, array_length, array_read, array_write, create_array, ctc_greedy_decoder, dynamic_gru, dynamic_lstm, dynamic_lstmp, im2sequence, linear_chain_crf, lod_append, lod_reset, sequence_concat, sequence_conv, sequence_enumerate, sequence_expand, sequence_expand_as, sequence_first_step, sequence_last_step, sequence_mask, sequence_pad, sequence_pool, sequence_reshape, sequence_reverse, sequence_scatter, sequence_slice, sequence_softmax, sequence_unpad, tensor_array_to_tensor
-  - The following APIs related to distributed training are still under development, not migrated yet
-  - The following APIs in fluid.nets directory will be implemented with high level API， not migrated：nets.glu, nets.img_conv_group, nets.scaled_dot_product_attention, nets.sequence_conv_pool, nets.simple_img_conv_pool
-  - The following APIs are to be improved, not migrated：dygraph.GRUUnit, layers.DecodeHelper, layers.GreedyEmbeddingHelper, layers.SampleEmbeddingHelper, layers.TrainingHelper, layers.autoincreased_step_counter, profiler.cuda_profiler, profiler.profiler, profiler.reset_profiler, profiler.start_profiler, profiler.stop_profiler
-  - The following APIs are no longer recommended and are not migrated：DataFeedDesc, DataFeeder, clip.ErrorClipByValue, clip.set_gradient_clip, dygraph_grad_clip.GradClipByGlobalNorm, dygraph_grad_clip.GradClipByNorm, dygraph_grad_clip.GradClipByValue, initializer.force_init_on_cpu, initializer.init_on_cpu, io.ComposeNotAligned.with_traceback, io.PyReader, io.load_params, io.load_persistables, io.load_vars, io.map_readers, io.multiprocess_reader, io.save_params, io.save_persistables, io.save_vars, io.xmap_readers, layers.BasicDecoder, layers.BeamSearchDecoder, layers.Decoder, layers.GRUCell, layers.IfElse, layers.LSTMCell, layers.RNNCell, layers.StaticRNN, layers.Switch, layers.While, layers.create_py_reader_by_data, layers.crop, layers.data, layers.double_buffer, layers.embedding, layers.fill_constant_batch_size_like, layers.gaussian_random_batch_size_like, layers.get_tensor_from_selected_rows, layers.load, layers.merge_selected_rows, layers.one_hot, layers.py_reader, layers.read_file, layers.reorder_lod_tensor_by_rank, layers.rnn, layers.uniform_random_batch_size_like, memory_optimize, release_memory, transpiler.memory_optimize, transpiler.release_memory
+  | Paddle 1.8    | Paddle 2.0-beta |
+  | --------------- | ------------------------ |
+  | paddle.fluid.layers.elementwise_add | paddle.add               |
+  | paddle.fluid.layers.elementwise_sub | paddle.subtract          |
+  | paddle.fluid.layers.elementwise_mul | paddle.multiply          |
+  | paddle.fluid.layers.elementwise_div | paddle.divide |
+  | paddle.fluid.layers.elementwise_max | paddle.maximum             |
+  | paddle.fluid.layers.elementwise_min | paddle.minimum |
+  | paddle.fluid.layers.reduce_sum | paddle.sum |
+  | paddle.fluid.layers.reduce_prod | paddle.prod |
+  | paddle.fluid.layers.reduce_max | paddle.max        |
+  | paddle.fluid.layers.reduce_min | paddle.min        |
+  | paddle.fluid.layers.reduce_all | paddle.all        |
+  | paddle.fluid.layers.reduce_any | paddle.any        |
+  | paddle.fluid.dygraph.Conv2D | paddle.nn.Conv2d |
+  | paddle.fluid.dygraph.Conv2DTranspose | paddle.nn.ConvTranspose2d |
+  | paddle.fluid.dygraph.Pool2D | paddle.nn.MaxPool2d, paddle.nn.AvgPool2d |
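
As an illustration of how the rename table might be applied to existing source text, here is a minimal sketch using a few rows copied from the table. It is not the Paddle1to2 conversion tool; the `RENAMES` subset and the `rename_calls` helper are hypothetical, and only names are rewritten, never arguments.

```python
import re

# A few one-to-one rows copied from the table above (a subset, for illustration).
RENAMES = {
    "paddle.fluid.layers.elementwise_add": "paddle.add",
    "paddle.fluid.layers.elementwise_mul": "paddle.multiply",
    "paddle.fluid.layers.reduce_sum": "paddle.sum",
    "paddle.fluid.layers.reduce_max": "paddle.max",
}

# Match whole dotted names only: not preceded by a word character or dot,
# and not followed by a word character.
_PATTERN = re.compile(
    r"(?<![\w.])(" + "|".join(re.escape(old) for old in RENAMES) + r")(?!\w)"
)


def rename_calls(source):
    """Replace each Paddle 1.8 name found in `source` with its 2.0-beta name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


old_code = (
    "z = paddle.fluid.layers.elementwise_add(x, y)\n"
    "s = paddle.fluid.layers.reduce_sum(z)"
)
print(rename_calls(old_code))
```

Rows that map one old name to several new ones, such as Pool2D, need a manual decision and are deliberately left out of this sketch.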
+
+#### Fixing and Improving APIs
+- Modified and improved a total of 155 APIs. See [Link](https://github.com/PaddlePaddle/Paddle/wiki/Paddle-2.0beta-Upgraded-API-List) and the API document
+- Fixed APIs related to random number generation including: seed setting paddle.rand, randn, randint, randperm, dropout, Uniform, and Normal
+- Upgraded the code of the underlying C++ operators for the following APIs. The changes are compatible in principle, although minor incompatibilities cannot be ruled out: linspace, concat, gather, gather_nd, split, squeeze, unsqueeze, clip, argmax, argmin, mean, norm, unique, cumsum, LeakyReLU, leaky_relu, hardshrink, embedding, margin_ranking_loss, grid_sample, affine_grid
+- Added oneDNN support for the relu6 and Sigmoid activation functions
+
+#### Multi-device/Distributed Training APIs
+- Single-Machine Multi-Card Training Under a Dynamic Graph
+  - Added paddle.distributed.spawn(func, args=(), nprocs=-1, join=True, daemon=False, **options), which is used to start multi-card training under a dynamic graph.
+  - Added paddle.distributed.init_parallel_env(), which is used to initialize the environment of multi-card training under a dynamic graph.
+  - Added paddle.distributed.get_rank(), which is used to get the rank of the current process during the multi-card training.
+  - Added paddle.distributed.get_world_size(), which is used to get the total number of processes participating in training during the multi-card training.
+
+- Distributed Collective Communication
+  - Added paddle.distributed.broadcast(tensor, src, group=0), which broadcasts a tensor of a specified process to all the processes.
+  - Added paddle.distributed.all_reduce(tensor, op=ReduceOp.SUM, group=0), which performs the reduce operation on specified tensors of all the processes and returns results to all the processes.
+  - Added paddle.distributed.reduce(tensor, dst, op=ReduceOp.SUM, group=0), which performs the reduce operation on specified tensors of all the processes and returns results to specified processes.
+  - Added paddle.distributed.all_gather(tensor_list, tensor, group=0), which gathers specified tensors of all the processes and returns results to all the processes.
+  - Added paddle.distributed.scatter(tensor, tensor_list=None, src=0, group=0), which distributes tensors in a specified tensor list to all the processes.
+  - Added paddle.distributed.barrier(group=0), which synchronizes all the processes.
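
The semantics of the collective operations described above can be illustrated without any distributed runtime. The sketch below is a pure-Python simulation, not Paddle code: index i of each list stands for the value held by rank i, and the function names merely mirror the operations they imitate.

```python
# Pure-Python stand-ins: index i of each list is the value held by rank i.

def broadcast(values, src):
    """Every rank ends up with the value held by rank `src`."""
    return [values[src] for _ in values]


def all_reduce(values):
    """Every rank ends up with the sum over all ranks (the SUM reduce op)."""
    total = sum(values)
    return [total for _ in values]


def all_gather(values):
    """Every rank ends up with the list of all ranks' values."""
    return [list(values) for _ in values]


def scatter(chunks):
    """Rank i receives the i-th entry of the source rank's list."""
    return list(chunks)


held = [1, 2, 3, 4]  # four simulated processes
print(broadcast(held, src=0))
print(all_reduce(held))
print(all_gather(held)[2])
print(scatter([10, 20, 30, 40]))
```

In a real job each rank holds only its own tensor and the exchange happens over the communication backend; the lists here just make the before and after states visible side by side.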

 ### High-level APIs

- Adds a paddle.incubate.hapi directory. Encapsulates common operations such as networking, training, evaluation, inference, and access during the model development process. Implements low-code development. Uses the imperative programming implementation mode of MNIST task comparison. High-level APIs can reduce 80% of executable codes.
- Adds model-type encapsulation. Inherits the layer type. Encapsulates common basic functions during the model development process, including:
-  - Provides a prepare API that is used to specify a loss function and an optimization algorithm
-  - Provides a fit API to implement training and evaluation. Implements the execution of model storage and other user-defined functions during the training process by means of callback
-  - Provides an evaluate interface to implement the inference and evaluation index calculation on the evaluation set
-  - Provides a predict interface to implement specific test data inference
-  - Provides a train\_batch interface to implement the training of single-batch data
- Adds a dataset interface to encapsulate commonly-used data sets and supports random access to data
- Adds encapsulation of common Loss and Metric types
- Adds 16 common data processing interfaces including Resize and Normalize in the CV field
- Adds lenet, vgg, resnet, mobilenetv1, and mobilenetv2 image classification backbone networks in the CV field
- Adds MultiHeadAttention, BeamSearchDecoder, TransformerEncoder, TransformerDecoder, and DynamicDecode APIs in the NLP field
- Releases 12 models based on high-level API implementation, including Transformer, Seq2seq, LAC, BMN, ResNet, YOLOv3, VGG, MobileNet, TSM, CycleGAN, Bert, and OCR
+- Added PaddlePaddle high-level APIs to encapsulate common operations such as networking, training, evaluation, inference, and access so as to implement low-code development. In the MNIST handwritten digit recognition task, high-level APIs reduce the amount of executable code by 80% compared with the imperative programming implementation.
+
+- **Data Management**
+  - Unified data loading and usage method
+    - Dataset definition, which is implemented by inheriting `paddle.io.Dataset`.
+    - Multi-process data loading using `paddle.io.DataLoader`.
+  - Added `paddle.io.IterableDataset`, which is used for a streaming dataset and supports its concurrent acceleration in `paddle.io.DataLoader`.
+  - Added `paddle.io.get_worker_info` for dividing child process data in `paddle.io.IterableDataset`.
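
The per-worker data division described above can be sketched in plain Python. This is an illustrative stand-in, not Paddle code: `shard_for_worker` is a hypothetical helper, and the worker id and worker count are passed in explicitly, whereas inside a real `paddle.io.IterableDataset` they would be expected to come from `paddle.io.get_worker_info()`.

```python
from itertools import islice


def shard_for_worker(samples, worker_id, num_workers):
    """Yield every num_workers-th sample, starting at offset worker_id.

    Hypothetical helper: in Paddle the two integers are assumed to be read
    from the object returned by paddle.io.get_worker_info().
    """
    return islice(samples, worker_id, None, num_workers)


# Simulate three loader workers reading the same stream of ten samples.
num_workers = 3
shards = [
    list(shard_for_worker(range(10), worker_id, num_workers))
    for worker_id in range(num_workers)
]
print(shards)
```

Striding by worker id keeps the shards disjoint without any coordination between processes, which is why each worker only needs to know its own id and the total worker count.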
+
+- **Model Networking**
+  - Added the encapsulation of the common loss API `paddle.nn.loss.*` and metric API `paddle.metric.*`
+  - Released 12 models based on high-level API implementations, including Transformer, Seq2seq, LAC, BMN, ResNet, YOLOv3, VGG, MobileNet, TSM, CycleGAN, Bert, OCR. The code can be found in [PaddlePaddle/hapi examples](https://github.com/PaddlePaddle/hapi/tree/master/examples).
+
+- **Model Execution**
+  - Added class API `paddle.Model`, which encapsulates the common model development methods:
+    - API `Model.summary` to view the network structure and the number of parameters of the dynamic graph networking.
+    - API `Model.prepare` to specify a loss function and an optimization algorithm.
+    - API `Model.fit` to implement training and evaluation, which can execute user-defined functions such as model storage by means of callbacks.
+    - API `Model.evaluate` to implement inference and the computation of evaluation metrics on the evaluation set.
+    - API `Model.predict` to implement inference on specific test data.
+    - API `Model.train_batch` to implement training on a single batch of data.
+    - API `Model.eval_batch` to implement evaluation on a single batch of data.
+    - API `Model.test_batch` to implement testing on a single batch of data.
+    - API `Model.save`/`Model.load`, which supports storing an inference model in dynamic graph training mode.
+  - Added callback API `paddle.callbacks.*` as a model execution API, which performs logging and Checkpoint model saving, etc. Users can customize a callback by inheriting `paddle.callbacks.Callback`.
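
The callback mechanism mentioned for `Model.fit` can be illustrated with a minimal, framework-free sketch. Everything here is hypothetical (the `Callback` base class, its hook names, and the `fit` function are stand-ins written for this example); it only shows the general pattern of a training loop invoking user-defined hooks, not the actual `paddle.callbacks` interface.

```python
class Callback:
    """Hypothetical base class: every hook is a no-op by default."""

    def on_epoch_begin(self, epoch):
        pass

    def on_epoch_end(self, epoch, logs):
        pass


class LossLogger(Callback):
    """User-defined callback that records the loss after each epoch."""

    def __init__(self):
        self.history = []

    def on_epoch_end(self, epoch, logs):
        self.history.append((epoch, logs["loss"]))


def fit(train_step, epochs, callbacks=()):
    """Toy training loop that calls the hooks around each epoch."""
    for epoch in range(epochs):
        for cb in callbacks:
            cb.on_epoch_begin(epoch)
        logs = {"loss": train_step(epoch)}
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs)


logger = LossLogger()
# The lambda stands in for a real training step and returns a fake loss.
fit(lambda epoch: 1.0 / (epoch + 1), epochs=3, callbacks=[logger])
print(logger.history)
```

Subclassing a base class with no-op hooks lets a custom callback override only the events it cares about, which matches the "customize a callback by inheriting" wording above.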
+
+- **Domain APIs**
+  - Added computer vision (CV) APIs `paddle.vision`
+    - Added dataset API `paddle.vision.datasets.*`, which encapsulates common public datasets and supports random access to data.
+    - Added 24 common data preprocessing APIs `paddle.vision.transforms.*` such as Resize, Normalize, etc.
+    - Added image classification backbone network and pre-training parameters:
+      - `paddle.vision.models.lenet` or `paddle.vision.lenet`
+      - `paddle.vision.models.vgg` or `paddle.vision.vgg`
+      - `paddle.vision.models.resnet` or `paddle.vision.resnet`
+      - `paddle.vision.models.mobilenetv1` or `paddle.vision.mobilenetv1`
+      - `paddle.vision.models.mobilenetv2` or `paddle.vision.mobilenetv2`
+  - Added natural language processing (NLP) APIs `paddle.text`.
+    - Added dataset API `paddle.text.datasets.*`, which encapsulates commonly-used datasets and supports random access to data.
+    - Added networking API `paddle.text.*`.
+- **Automatic Breakpoint Restart**
+  - Added API `train_epoch_range`, which implements the epoch-level `checkpoint` autosave and autoloading functions on a static graph and supports automatic breakpoint restart.
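
The idea of epoch-level checkpoint autosave and automatic restart can be sketched without Paddle. The `epoch_range` generator and the JSON checkpoint file below are assumptions made for illustration; they are not the implementation or signature of `train_epoch_range`.

```python
import json
import os
import tempfile


def epoch_range(max_epochs, ckpt_path):
    """Yield epoch indices, resuming from the checkpoint file if it exists.

    Hypothetical sketch: the checkpoint stores only the next epoch to run.
    """
    start = 0
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            start = json.load(f)["next_epoch"]
    for epoch in range(start, max_epochs):
        yield epoch
        # Reached only after the caller finished this epoch's loop body.
        with open(ckpt_path, "w") as f:
            json.dump({"next_epoch": epoch + 1}, f)


ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")

first_run = []
for epoch in epoch_range(5, ckpt):
    if epoch == 2:
        break  # simulated interruption in the middle of epoch 2
    first_run.append(epoch)

# A restarted job picks up at the first epoch that did not complete.
second_run = list(epoch_range(5, ckpt))
print(first_run, second_run)
```

Saving only after an epoch completes means an interrupted epoch is rerun in full on restart, which is the usual trade-off of epoch-level (rather than step-level) checkpoints.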
+
+### Function Optimization (Including Distributed)
+
+#### Dynamic Graph to Static Graph
+
+- **Added Syntax Support for ProgramTranslator**
+
+  - Added dynamic-to-static support for the return syntax so as to return in advance or to return different types of tensors or none in if-elif-else or loop conditions during the dynamic-to-static conversion.
+
+  - Added dynamic-to-static support for the print syntax so that print (tensor) can also print out a tensor in the dynamic-to-static conversion.
+
+  - Added dynamic-to-static support for the "for traversing a tensor", "for traversing a tensor using enumeration", "for traversing a TensorList", and "for traversing a TensorList using enumeration" syntaxes so that loop operations over tensors can be flexibly used in the dynamic-to-static conversion.
+
+  - Added dynamic-to-static support for the assert syntax to ensure that an asserted tensor is true (bool type) or non-zero (other data types) in the dynamic-to-static conversion.
+
+  - Added support for converting data type casts so that dynamic graph type conversion statements such as float(tensor) and int(tensor) can also be performed in a static graph.
+
+- **ProgramTranslator Usability Optimization Function**
+
+  - Changed the dynamic-to-static return type from callable to class StaticLayer. This class makes it easier to obtain converted static graph information by calling .code, .main_program, and other APIs.
+
+  - Added set_verbosity and set_code_level APIs so that users can set a log class to view a log in the dynamic-to-static running process or a converted code in intermediate state.
+
+  - Added InputSpec to specify the shape and data type of an input tensor variable.
+
+  - Optimized the error messages displayed when dynamic-to-static running fails, so that an error raised in the converted static graph code is reported against the original line of dynamic graph code; removed some dynamic-to-static frames from Python stacks so that error messages relate more closely to user code.
+
+  - Supported breakpoint debugging using pdb.set_trace() during the dynamic-to-static conversion.
+
+- **Optimized Deployment of Model Storage and Loading APIs**
+
+  - Added paddle.jit.save API, which is used to save a dynamic-to-static model so that the API is easier to use; deleted an old API ProgramTranslator.save_inference_model.
+  - Added paddle.jit.load API, which is used to load inference models including models saved by paddle.jit.save and paddle.io.save_inference_model. After being loaded, models can be used for model inference or model training optimization in a dynamic graph.
+
+
+#### Mixed Precision Training
+- Added support for mixed precision training of dynamic graphs. Training the ResNet-50 model on V100 with mixed precision is 2.6 times as fast as training with fp32.
+
+#### Quantitative Training
+
+- Added the `ImperativeQuantAware` class, which provides dynamic graph quantitative training. Currently, the quantization of Conv2D, Linear, and other layers is supported. The supported model types include MobileNetV1, MobileNetV2, and ResNet50.
+- After dynamic graph quantitative training is performed on a model, any quantitative model saved using the `ImperativeQuantAware.save_quantized_model` API can be deployed for inference using the Paddle-Lite inference library.
+- For static graph quantization, Conv2d_transpose quantization and per-channel Linear quantization are supported.
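
To make the per-channel wording concrete, the sketch below contrasts one scale for a whole weight matrix with one scale per output channel. It assumes symmetric abs-max int8 quantization, which the release note does not specify, and the helper names and weight values are invented for illustration.

```python
def abs_max_scale(values, bits=8):
    """Scale that maps the largest magnitude to the int range (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for int8
    largest = max(abs(v) for v in values)
    return largest / qmax if largest else 1.0  # avoid a zero scale


def quantize(values, scale, bits=8):
    """Round to the nearest integer step and clamp to the symmetric int range."""
    qmax = 2 ** (bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


# Two output channels with very different magnitudes (made-up numbers).
weights = [
    [0.02, -0.01, 0.005],
    [1.5, -2.0, 0.75],
]

# Per-tensor: a single scale shared by every channel.
per_tensor_scale = abs_max_scale([v for row in weights for v in row])
q_tensor = [quantize(row, per_tensor_scale) for row in weights]

# Per-channel: each row (output channel) gets its own scale.
per_channel_scales = [abs_max_scale(row) for row in weights]
q_channel = [quantize(row, s) for row, s in zip(weights, per_channel_scales)]

print(q_tensor[0], q_channel[0])
```

With a shared scale the small-magnitude channel collapses onto a handful of integer levels, whereas its own scale lets it use the full int8 range; that resolution gain is the usual motivation for per-channel quantization.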
+
+#### Performance Optimization (Including Distributed)
+
+- Simplified the underlying DataLoader implementation logic in dynamic graph mode, reduced the thread reading overhead, and further improved the data reading efficiency and the overall model training speed. The overall training speed of MobileNetV1 on a single V100 card with BatchSize = 128 is increased by 34%.
+- Upgraded and optimized dynamic graph networking. A large number of dynamic graph APIs directly call automatically generated Pybind APIs, improving performance.
+
+#### Basic Functions for Dynamic Graph
+
+- Supported updating gradients with sparse parameters by configuring embedding and other APIs.
+- Added over 120 member functions of Tensor type, including Tensor().abs(), Tensor().add(), and Tensor().cos().
+- Added dir() API for a layer to facilitate viewing the attributes and functions in the layer.
+- Added an optimizer.set_lr() API so that users can flexibly adjust the learning rate in dynamic graph mode.
+- Added a global parameter initialization method API set_global_initializer to define a global parameter initialization method.
+- Added oneDNN (former MKL-DNN) support for dynamic training and inference. ResNet50 oneDNN dynamic training with the MNIST dataset is enabled.
+
+#### Debugging Analysis
+
+- Uniformly replaced roughly 100 uses of LOG(FATAL) for throwing exceptions with PADDLE_THROW; optimized the format and content of the errors raised when the framework does not support a behavior.
+- Improved Signal Handler implementation within the framework; optimized the error format and content when system signal error occurs during the execution.
+- Optimized the framework error stack format. The python error stack occurring during the compilation is moved to below the native error stack to improve error message reading experience.
+- Further improved the error types and prompt messages of check errors within the framework, about 1,300 in total so far, to enhance the overall debugging usability of the framework.
+- Enhanced dynamic graph error messages. Error messages on the Pybind layer under a dynamic graph are systematically enhanced to improve user experience.
+
+### Bug Fixing
+
+- Fixed the problem that AttributeError may unexpectedly occur when the add_parameter API is used on a layer under a dynamic graph; enhanced the input check.
+- Fixed the problem that tensors of int8 and uint8 types could not be printed normally, so that their data is now output correctly.
+
+#### Dependency Library Upgrading
+- Upgraded oneDNN (former MKL-DNN) from Version 1.3 to Version 1.5.
+
+
+## Inference

-### Performance Optimization
+### Paddle Inference

- Adds a `reshape+transpose+matmul` fuse so that the performance of the INT8 model is improved by about 4% (on the 6271 machine) after Ernie quantization. After the quantization, the speed of the INT8 model is increased by about 6.58 times compared with the FP32 model on which DNNL optimization (including fuses) and quantization are not performed
+#### API
+- Fully upgraded the inference C++ APIs. The new version of the APIs is recommended. The original APIs are reserved tentatively, but give a warning during use, and are planned to be deleted in the future. The upgrade to the new version of the APIs mainly involves naming standardization and usage method simplification. The important changes include:
+  - adding a `paddle_infer` namespace for the C++ APIs, containing inference-related APIs.
+  - renaming `ZeroCopyTensor` to `Tensor` as the default input/output representation method for the inference APIs.
+  - simplifying `CreatePaddlePredictor` to `CreatePredictor` and reserving the support for only `AnalysisConfig`, not for other Configs any more.
+  - adding service-related utility classes such as `PredictorPool`, which can be used when multiple predictors are created.

-### Debugging Analysis
+#### Functional Upgrading
+- Upgraded the operator version compatibility information registry to support more accurate Op version information and improve inferential compatibility.
+- Added adaptive support for TensorRT (TRT) Version 7.1.
+- Paddle-TensorRT enhances the support for the PaddleSlim quantitative model. Multiple tasks such as detection, classification, and segmentation on CV are covered.
+- Added the support for user-defined operators for Python-side inference.
+- Added the kernel support for `elementwise_add` and `elementwise_mul` INT8 oneDNN (former MKL-DNN) on the CPU side.
+- Improved the usability of CPU-side test quantitative models. A simultaneous comparison test of original models with quantitative models is supported.
+- Added the adaptive support for Jetson Nx hardware.

- To solve the problem of program printing contents being too lengthy and low utilization efficiency during debugging, considerably simplifies the printing strings of objects such as programs, blocks, operators, and variables, thus improving the debugging efficiency without losing effective information
- To solve the problem of insecure third-party library APIs `boost::get` and difficulty in debugging due to exceptions during running, adds the `BOOST_GET` series of macros to replace over 600 risky `boost::get` in Paddle. Richens error message during `boost::bad_get` exceptions. Specifically, adds the C++ error message stack, error file and line No., expected output type, and actual type, thus improving the debugging experience
+#### Performance Optimization
+- Added conv + affine_op pass. The MASK-RCNN fp32 single-thread performance is improved by 26% (1.26x) on machine 6248.
+- Added fc + gru fuse pass and enabled the oneDNN (former MKL-DNN) GRU fp32 kernel, speeding up GRU fp32 model inference on 4 CPU threads by 20% (1.2x) on an Intel Xeon 6248 machine.
+- Added oneDNN inplace support for many operators (2% speedup for the Feature fp32 model).
+- Optimized the oneDNN LRN operator (1% speedup for the GoogleNet fp32 model).
+- Improved the transformation and optimization of quantized models.
+- Optimized the CUDA ArgMin and ArgMax operators so that their binary size is reduced from 60 M to 1.3 M.

-## Bug Fixes
+#### Bug Fixing

- Fix the bug of wrong computation results when any slice operation exists in the while loop
- Fix the problem of degradation of the transformer model caused by inplace ops
- Fix the problem of running failure of the last batch in the Ernie precision test
- Fix the problem of failure to correctly exit when exceptions occur in context of fluid.dygraph.guard
+- Fixed the mask-rcnn inference error under CPU inference.
+- Fixed the error occurring in CPU multithread inference on oneDNN quantized INT8 models.