2.2.2 Release Note

1. Important Updates

We are pleased to release PaddlePaddle 2.2.2. This version mainly fixes functional and performance issues found in 2.2.1 and enhances some existing features.

2. Training Framework (including distributed training)

(1)New Features

API

  • Add paddle.nn.Mish and paddle.nn.functional.mish, which compute the mish activation function element-wise. (#38803)
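
For readers unfamiliar with the activation, the following is a minimal pure-Python reference sketch of the standard mish definition, mish(x) = x * tanh(softplus(x)). It is not Paddle's implementation; the helper names `softplus`, `mish`, and `mish_elementwise` are illustrative only.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Standard definition: mish(x) = x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


def mish_elementwise(values):
    # Apply mish independently to every element of a flat list.
    return [mish(v) for v in values]


if __name__ == "__main__":
    print(mish_elementwise([-1.0, 0.0, 1.0]))
```

Because the function is applied independently to each element, the output always has the same shape as the input.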

Others

  • paddle.nn.PReLU, paddle.nn.functional.prelu, and paddle.static.nn.prelu now support the data_format parameter, which specifies the data layout of the input (for example, NCHW or NHWC). (#38495)
  • paddle.index_select now supports the float16 data type. (#38751)
  • Improve the error message of paddle.multiplex when a tensor in inputs has size 0. (#38757)
  • Add the initialization parameter data_loader to paddle.fluid.contrib.slim.quantization.PostTrainingQuantization, which accepts a paddle.io.DataLoader object or a Python generator. (#38729)
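
To illustrate why a layout parameter matters for PReLU, here is a small pure-Python sketch, assuming that data_format names the position of the channel axis (channel-first "NCL" versus channel-last "NLC") and that there is one learnable weight per channel. The function `prelu_3d` is hypothetical and works on nested lists; it is not Paddle code.

```python
def prelu_3d(x, weight, data_format="NCL"):
    """PReLU over a rank-3 nested list with one weight per channel.

    "NCL" means shape [batch, channel, length];
    "NLC" means shape [batch, length, channel].
    """
    if data_format not in ("NCL", "NLC"):
        raise ValueError(f"unsupported data_format: {data_format}")
    out = []
    for sample in x:
        rows = []
        for i, row in enumerate(sample):
            new_row = []
            for j, v in enumerate(row):
                # The layout decides which index selects the channel weight.
                c = i if data_format == "NCL" else j
                new_row.append(v if v > 0 else weight[c] * v)
            rows.append(new_row)
        out.append(rows)
    return out


if __name__ == "__main__":
    data = [[[-1.0, 2.0], [-3.0, -4.0]]]
    print(prelu_3d(data, [0.25, 0.5], "NCL"))
    print(prelu_3d(data, [0.25, 0.5], "NLC"))
```

The same numbers produce different outputs under the two layouts, which is why the layout has to be stated explicitly when the input is not channel-first.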

(2)Bug Fixes

API

  • Fix a runtime error in paddle.max when the input has x.ndim > 6 and axis < 0. (#38070)
  • Fix incorrect results of paddle.max and paddle.min on CPU devices when the axis parameter is a list with len(axis) == x.ndim and axis[i] < 0. (#38478)
  • Fix bug where paddle.nn.functional.unfold does not distinguish between compile time and runtime in its InferShape calculation. (#38925) (#38834)
  • Fix unnecessary synchronization between the GPU and CPU when paddle.nn.functional.cross_entropy checks labels. (#38849)
  • Fix incorrect input gradients in the backward computation when paddle.distributed.split splits an FC layer along columns. (#38724)
  • Fix bug where paddle.nn.Layer.to does not support the paddle.dtype type. (#38108)
  • Fix bug where the output tensor shape of paddle.linalg.svd with full_matrices=True in static graphs differs from the shape in dynamic graphs. (#37744)
  • Fix incorrect result dimensions when Tensor slice indexing uses multiple None indices. (#37400)
  • Fix a GPU memory leak in Tensor index assignment in some scenarios. (#38098)
  • Fix bug where conv2d reports a missing-attribute error when a model exported with save_inference_model has a backward pass added for training. (#38832)
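
The paddle.max and paddle.min fixes above concern negative axis values. As a rough illustration of the bookkeeping involved, the sketch below maps negative axes to their non-negative equivalents before a reduction. The helper `normalize_axes` is hypothetical and is not taken from the Paddle source.

```python
def normalize_axes(axis, ndim):
    """Map possibly negative reduction axes to sorted non-negative ones."""
    if isinstance(axis, int):
        axis = [axis]
    result = []
    for a in axis:
        if not -ndim <= a < ndim:
            raise ValueError(f"axis {a} is out of range for ndim={ndim}")
        # A negative axis counts from the last dimension.
        a = a + ndim if a < 0 else a
        if a in result:
            raise ValueError(f"axis {a} is specified more than once")
        result.append(a)
    return sorted(result)


if __name__ == "__main__":
    # All three axes of a rank-3 tensor, written with negative indices.
    print(normalize_axes([-1, -2, -3], 3))
    # A single negative axis on a rank-7 tensor.
    print(normalize_axes(-1, 7))
```

When the normalized list covers every dimension, as in the first call, the reduction collapses the whole tensor to a single value.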

IR (Intermediate Representation)

  • Dynamic Graph to Static Graph

    • Fix inconsistent behavior of some initialization-related APIs between dynamic and static graphs. (#37827)
    • Fix bug where paddle is treated as a variable during dynamic-to-static code transcription. (#37999)
    • Fix bug where code comments that protrude beyond the code's indentation cause an error during dynamic-to-static code transcription. (#38003)
    • Fix an infinite loop in for ... zip ... statements during dynamic-to-static conversion. (#37846)
  • Model quantization

    • Fix redundant nodes in models exported from dynamic graph quantization-aware training. (#38122) (#38025)
    • Remove the clip_extra setting from exported quantized models so that they can run inference on Paddle Lite. (#38343)
    • Fix the quantization settings of the flatten_contiguous_range operator, whose output was misconfigured during quantization. (#37741)
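
Regarding the for ... zip ... fix in the dynamic-to-static list above: the sketch below shows the Python semantics that any rewrite of such a loop has to preserve. The while-loop version is a hypothetical lowering written for illustration, not the code Paddle's transcriber generates.

```python
def pairwise_sum_for(xs, ys):
    # Original form: zip stops at the shorter of the two sequences.
    out = []
    for x, y in zip(xs, ys):
        out.append(x + y)
    return out


def pairwise_sum_while(xs, ys):
    # Hypothetical lowering to an explicit loop with an index and a bound.
    out = []
    i = 0
    n = min(len(xs), len(ys))
    while i < n:
        out.append(xs[i] + ys[i])
        # Without this increment, or with a bound that is never reached,
        # the rewritten loop would never terminate.
        i += 1
    return out


if __name__ == "__main__":
    print(pairwise_sum_for([1, 2, 3], [10, 20]))
    print(pairwise_sum_while([1, 2, 3], [10, 20]))
```

Both functions return the same result for any pair of lists, including lists of different lengths and empty lists.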

Others

  • Custom OP

    • Fix bug where custom operators may report an error caused by incomplete files when their Python APIs are loaded from multiple processes. (#38128)
    • Fix a compilation failure on CentOS platforms caused by -D_GLIBCXX_USE_CXX11_ABI not taking effect as expected. (#37878)
  • Dynamic graph inplace strategy

    • Fix bug where the accumulator reports an error when multiple inplace OPs are executed consecutively. (#38406)
    • Fix bug where the __setitem__ method of Tensor builds an incorrect backward graph when performing an inplace operation on leaf nodes. (#38014)
  • NHWC strategy

    • Fix undefined intermediate variables in the backward Op of batchnorm_op when the data type is FP32, dims = 2, and data_layout = NHWC. (#37020)

3. Paddle Inference

(1)Function Optimization

Framework and API updates

  • The C API now supports handling C++ std::string. (#38667)

Back-end capability enhancement

  • GPU and TensorRT subgraph engine related updates
    • Support TensorRT inference for the relu, relu6, tanh, sigmoid, pool2d, concat, batch_norm, split, gelu, scale, swish, prelu, clip, reduce_sum, and reduce_mean operators with static shape and 2-dimensional input. (#37773)
    • Support TensorRT inference for the mish activation function. (#38866)

(2)Bug Fixes

Framework and API fixes

  • Operator fixes

    • Fix incompatibility of the roi_align operator when used with TRT. (#38788)
    • Add broadcasting support for elementwise operators when the inputs have the same number of dimensions. (#37908)
  • Framework function fixes

    • Fix the model pruning logic in dynamic-to-static conversion so that operators containing a subblock are pruned correctly. (#37579)
    • Fix errors reported by the CreatePredictor interface under multiple threads. CreatePredictor can now be called from multiple threads without causing inference exceptions. (#37894)
    • When setting up config, support passing an empty string as the params file for models without weights. (#38579)
    • Fix bug where a CPU tensor fed directly to the Paddle-TRT engine was not copied to the GPU. (#37427)
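
The elementwise broadcasting item in the operator fixes above can be pictured with the usual NumPy-style rule for inputs of equal rank: two dimensions are compatible when they are equal or when one of them is 1. The sketch below assumes that rule applies here; `broadcast_shape_same_rank` is a hypothetical helper, not a Paddle Inference API.

```python
def broadcast_shape_same_rank(shape_x, shape_y):
    """Output shape for an elementwise op on two inputs of equal rank."""
    if len(shape_x) != len(shape_y):
        raise ValueError("this sketch only covers inputs of the same rank")
    out = []
    for dx, dy in zip(shape_x, shape_y):
        if dx == dy or dy == 1:
            out.append(dx)
        elif dx == 1:
            out.append(dy)
        else:
            raise ValueError(f"dimensions {dx} and {dy} cannot be broadcast")
    return out


if __name__ == "__main__":
    # Each input is stretched along the dimension where its size is 1.
    print(broadcast_shape_same_rank([2, 1, 4], [2, 3, 1]))
```

Shapes that differ in a dimension where neither size is 1, such as [2, 3] and [2, 4], are rejected with a ValueError.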

Back-end capability fixes

  • TensorRT subgraph engine fixes

    • Fix an error when running pool2d on TensorRT with certain parameter combinations. (#37929)
  • MKLDNN engine fixes

    • Fix bug where the mkldnn kernel of matmul_v2 does not support two inputs whose shapes have different lengths. (#38733)
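
As an illustration of what "shapes of different lengths" means for a batched matrix multiply, the sketch below infers the output shape under NumPy-style matmul rules: the shorter batch shape is left-padded with 1s and then broadcast. It assumes both inputs have rank 2 or higher, and `matmul_output_shape` is a hypothetical helper rather than the mkldnn kernel's logic.

```python
def matmul_output_shape(shape_x, shape_y):
    """Output shape of a batched matmul for inputs of rank >= 2."""
    if len(shape_x) < 2 or len(shape_y) < 2:
        raise ValueError("this sketch only covers inputs of rank >= 2")
    *batch_x, m, k = shape_x
    *batch_y, k2, n = shape_y
    if k != k2:
        raise ValueError(f"inner dimensions {k} and {k2} do not match")
    # Left-pad the shorter batch shape with 1s so both have the same rank.
    rank = max(len(batch_x), len(batch_y))
    batch_x = [1] * (rank - len(batch_x)) + list(batch_x)
    batch_y = [1] * (rank - len(batch_y)) + list(batch_y)
    batch = []
    for bx, by in zip(batch_x, batch_y):
        if bx == by or by == 1:
            batch.append(bx)
        elif bx == 1:
            batch.append(by)
        else:
            raise ValueError(f"batch dimensions {bx} and {by} cannot be broadcast")
    return batch + [m, n]


if __name__ == "__main__":
    # A rank-4 input multiplied by a rank-2 weight matrix.
    print(matmul_output_shape([8, 2, 3, 4], [4, 5]))
```

In the usage line, the rank-2 operand is treated as if it had batch shape [1, 1], so the result keeps the batch dimensions of the rank-4 operand.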

Others

  • Fix a possible hang of the ERNIE model under TRT8. (#37839)
