Unverified commit 7421aaa7, authored by qiceng, committed by GitHub

release (#1950)

add the compatibility statement in release notes
Parent 74be6398
@@ -62,7 +62,7 @@
- Support named_sublayers and named_parameters for more convenient programming.
- Support the Linear lr warmup decay strategy.
- Performance optimization
- Optimized the interaction between Python and C++, as well as GradMaker, OperatorBase, allocator, etc. For the LSTM-based language model task, performance on a P40 machine improves by 270%.
- Removed redundant code to fix the performance problem caused by useless, repeated optimized_guard calls in optimize. For the Transformer model (batch_size=64) on a P40 machine, optimizers such as SGD and Adam gain 5% to 8% in performance.
- To remove the performance cost of the extra scale_op that AdamOptimizer added to update the beta parameters, the beta update logic is fused into adam_op, which reduces op kernel call overhead. The Dialogue-PLATO model gains 9.67% in performance on a P40 machine.
- Optimized the asynchronous DataLoader for dynamic graph mode. For CV model tasks such as Mnist and ResNet, single-card training speed on a P40 machine improves by more than 40%.
@@ -331,3 +331,23 @@
- Fixed some bugs related to reshape and Conv2D in dynamic graph mode; fixed a program crash caused by some parameters in the network having no gradient.
- Fixed a bug where GradientClip ran incorrectly in parameter server mode.
- Fixed a memory leak in the fully asynchronous mode of the parameter server.
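## Compatibility notes
- Static graph: Version 1.7 is fully compatible with the previous version (1.6.0 ~ 1.6.3). Models trained with 1.6+ can be trained or used for inference in 1.7.
- Dynamic graph: Version 1.7 includes many usability improvements, and some of these upgrades break compatibility:
  - The `name_scope` parameter is removed from all APIs provided by paddle.fluid.dygraph. The parameter served no practical purpose in design or use and was redundant, so it was removed everywhere to simplify API calls.
  - Changes to API parameter lists:
    - Conv2D, Conv2DTranspose, Conv3D, Conv3DTranspose: add required parameter `num_channels` (required, int), the number of channels of the input image.
    - LayerNorm: remove parameter `begin_norm_axis`; add required parameter `normalized_shape` (required, int | list | tuple), the shape to be normalized.
    - NCE: add required parameter `dim` (required, int), the input dimension (usually the word-embedding dimension).
    - PRelu: add parameter `input_shape` (list | tuple), the input dimensions. It is required only when the `mode` parameter is "all".
    - BilinearTensorProduct: remove parameter `size`; add required parameters `input1_dim` (required, int) and `input2_dim` (required, int), the dimension sizes of the first and second inputs respectively.
    - GroupNorm: add required parameter `num_channels` (required, int), the number of channels of the input.
    - SpectralNorm: add required parameter `weight_shape` (required, list | tuple), the shape of the weight parameter.
    - TreeConv: add parameter `feature_size` (required, int), the size of the last dimension of the shape of nodes_vector.
  - API usage changes:
    - Embedding: the last dimension of the input is no longer required to be 1, and the rule for the output shape has changed. For example, in NLP tasks, Embedding in 1.6 requires input of shape [batch_size, seq_len, 1]. In 1.7, to keep the output shape as [batch_size, seq_len, embedding_size], the input shape must be [batch_size, seq_len].
    - The FC API is removed; use Linear instead. See the Linear API documentation for migration details.
    - When defining an Optimizer, you must explicitly specify the list of parameters to optimize. The list can be obtained directly from a Layer's parameters() interface.
  - Parameter name changes:
    - To align with static graph, the parameter naming rules changed in 1.7, so incremental training cannot load models saved by the previous version (1.6.0 ~ 1.6.3).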
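
The Embedding input-shape change described in these notes can be illustrated without the framework. The following is a minimal sketch in plain Python, not PaddlePaddle code: `shape_of` and `squeeze_last_dim` are hypothetical helpers, and nested lists stand in for tensors. It converts 1.6-style ids of shape [batch_size, seq_len, 1] into the [batch_size, seq_len] layout that the notes say 1.7 expects.

```python
def shape_of(nested):
    """Return the shape of a rectangular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(shape)


def squeeze_last_dim(ids):
    """Drop a trailing dimension of size 1 from [batch_size, seq_len, 1] ids.

    Hypothetical helper for illustration only; it is not part of PaddlePaddle.
    """
    for row in ids:
        for cell in row:
            if not (isinstance(cell, list) and len(cell) == 1):
                raise ValueError("expected ids shaped [batch_size, seq_len, 1]")
    return [[cell[0] for cell in row] for row in ids]


# 1.6-style input: batch_size=2, seq_len=3, trailing dimension of 1.
ids_v16 = [[[4], [7], [1]], [[2], [0], [9]]]
# 1.7-style input: the same ids without the trailing dimension.
ids_v17 = squeeze_last_dim(ids_v16)
print(shape_of(ids_v16), "->", shape_of(ids_v17))
```

In real code the same reshaping would be done on the framework's tensors before they reach Embedding; the sketch only shows the shape bookkeeping.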
@@ -358,3 +358,27 @@ This version focuses on enhancement of the framework functions, includes improvi
- Fixed some bugs related to reshape and Conv2D depthwise conv in dynamic graph mode; fixed a program crash caused by some parameters in the network having no gradient.
- Fixed a bug where GradientClip ran incorrectly in parameter server mode.
- Fixed a memory leak in the fully asynchronous mode of the parameter server.
## Compatibility notes
- Static graph: Version 1.7 is fully compatible with the previous version (1.6.0 ~ 1.6.3). Models trained with 1.6+ can be trained or used for inference in 1.7.
- Dynamic graph: Version 1.7 includes many usability improvements, and some of these upgrades break compatibility:
  - The `name_scope` parameter is removed from all APIs provided by paddle.fluid.dygraph. The parameter served no practical purpose in design or use and was redundant, so it was removed everywhere to simplify API calls.
  - Changes to API parameter lists:
    - Conv2D, Conv2DTranspose, Conv3D, Conv3DTranspose: add required parameter `num_channels` (required, int), the number of channels of the input image.
    - LayerNorm: remove parameter `begin_norm_axis`; add required parameter `normalized_shape` (required, int | list | tuple), the shape to be normalized.
    - NCE: add required parameter `dim` (required, int), the input dimension (usually the word-embedding dimension).
    - PRelu: add parameter `input_shape` (list | tuple), the input dimensions. It is required only when the `mode` parameter is "all".
    - BilinearTensorProduct: remove parameter `size`; add required parameters `input1_dim` (required, int) and `input2_dim` (required, int), the dimension sizes of the first and second inputs respectively.
    - GroupNorm: add required parameter `num_channels` (required, int), the number of channels of the input.
    - SpectralNorm: add required parameter `weight_shape` (required, list | tuple), the shape of the weight parameter.
    - TreeConv: add parameter `feature_size` (required, int), the size of the last dimension of the shape of nodes_vector.
  - API usage changes:
    - Embedding: the last dimension of the input is no longer required to be 1, and the rule for the output shape has changed. For example, in NLP tasks, Embedding in 1.6 requires input of shape [batch_size, seq_len, 1]. In 1.7, to keep the output shape as [batch_size, seq_len, embedding_size], the input shape must be [batch_size, seq_len].
    - The FC API is removed; use Linear instead. See the Linear API documentation for migration details.
    - When defining an Optimizer, you must explicitly specify the list of parameters to optimize. The list can be obtained directly from a Layer's parameters() interface.
  - Parameter name changes:
    - To align with static graph, the parameter naming rules changed in 1.7, so incremental training cannot load models saved by the previous version (1.6.0 ~ 1.6.3).
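
The parameter-list changes in these notes can be turned into a small migration checklist. The sketch below is plain Python and does not import PaddlePaddle: `API_CHANGES` and `check_kwargs` are hypothetical names introduced here, the table is transcribed by hand from the notes, and the check looks only at keyword arguments, so positional arguments in 1.6-style calls are not detected.

```python
# Changes transcribed from the compatibility notes in this diff.
# "added" holds parameters the notes mark as required. PRelu's input_shape is
# left out because it is required only when mode is "all".
# NCE is listed as gaining `dim`, following the Chinese version of the notes.
API_CHANGES = {
    "Conv2D": {"added": ["num_channels"], "removed": []},
    "Conv2DTranspose": {"added": ["num_channels"], "removed": []},
    "Conv3D": {"added": ["num_channels"], "removed": []},
    "Conv3DTranspose": {"added": ["num_channels"], "removed": []},
    "LayerNorm": {"added": ["normalized_shape"], "removed": ["begin_norm_axis"]},
    "NCE": {"added": ["dim"], "removed": []},
    "BilinearTensorProduct": {
        "added": ["input1_dim", "input2_dim"],
        "removed": ["size"],
    },
    "GroupNorm": {"added": ["num_channels"], "removed": []},
    "SpectralNorm": {"added": ["weight_shape"], "removed": []},
    "TreeConv": {"added": ["feature_size"], "removed": []},
}


def check_kwargs(layer, kwargs):
    """Return migration hints for a 1.6-style keyword-argument dict.

    Hypothetical helper for illustration; it is not a PaddlePaddle API.
    """
    change = API_CHANGES.get(layer, {"added": [], "removed": []})
    problems = []
    # name_scope was removed from every paddle.fluid.dygraph API.
    for name in ["name_scope"] + change["removed"]:
        if name in kwargs:
            problems.append("remove '%s'" % name)
    for name in change["added"]:
        if name not in kwargs:
            problems.append("add required '%s'" % name)
    return problems


# A 1.6-style LayerNorm call, expressed as keyword arguments.
hints = check_kwargs("LayerNorm", {"name_scope": "ln", "begin_norm_axis": 1})
for hint in hints:
    print(hint)
```

Keeping the table as data rather than as branching logic makes it easy to correct an entry if the official API documentation differs from this reading of the notes.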