Commits · 0594d2a7f086cc64b58f01aeb0299cc06c683825 · 机器未来 / Paddle

12 Oct, 2021 · 2 commits
- Z
  Revert "refine case when thread_num = 1 (#36201)" (#36347) · 0594d2a7
  Committed by Zeng Jinle on Oct 12, 2021
```
This reverts commit 7e60cc63.
```
  0594d2a7
- A
  Fix stop_gradient in RunProgramOp (#36339) · 2a75b447
  Committed by Aurelius84 on Oct 12, 2021
```
* Fix stop_gradient in RunProgramOp

* fix reference
```
  2a75b447
11 Oct, 2021 · 27 commits
- L
  refine auto_growth allocator (#35732) · 6d353aa5
  Committed by Leo Chen on Oct 11, 2021
```
* do not use alignedAllocator when cuda has alignment

* update test

* fix error during multiple process
```
  6d353aa5
- D
  [heterps] add fuse_allreduce (#35131) · e5b4dd73
  Committed by danleifeng on Oct 11, 2021
```
* heterps:add fuse_allreduce op; test=develop
* add program_mode in minimize for pslib mode;test=develop
```
  e5b4dd73
- J
  
  fix for matmul_v2 6D x 2D (#36342) · 339cb191
  Committed by jakpiase on Oct 11, 2021
  
  339cb191
- Z
  Add FLAGS_allreduce_record_one_event to remove event waiting number (#36263) · 7b45a46e
  Committed by Zeng Jinle on Oct 11, 2021
```
* add FLAGS_allreduce_record_one_event

* add more comments

* fix ut

* improve coverage

* fix ut, improve coverage
```
  7b45a46e
- L
  Add nn.functional.sparse_attention and some test cases, test=develop (#35757) · 85b77232
  Committed by Liu-xiandong on Oct 11, 2021
```
Add paddle.nn.functional.sparse_attention API

    This PR mainly wraps the sparse_attention functionality at the Python layer; the main OP code is in #PR35676.

    In addition, unit tests are added for the wrapped Python interface.
```
  85b77232
- J
  
  added missing bf16 ops (#36291) · 14393876
  Committed by jakpiase on Oct 11, 2021
  
  14393876
- Z
  
  Add more tests and fix bugs for cudnn_norm_conv_test and cudnn_bn_and_relu_test (#36314) · a679fcbb
  Committed by Zhang Zheng on Oct 11, 2021
  
  a679fcbb
- N
  Add functor_primitives.h for kernel primitive api (#36203) · 830debc2
  Committed by niuliling123 on Oct 11, 2021
```
* Add functor_primitives.h for kernel primitive api

* update

* move namespace kps

* subFunctor init_data

* delete InvalidArgumentError
```
  830debc2
- S
  
  fix bug of clear third_party cache every 10 days (#36332) · eaeeb884
  Committed by Sing_chan on Oct 11, 2021
  
  eaeeb884
- S
  
  change exit code of pip install dependencies to 5 (#36016) · fc5415d6
  Committed by Sing_chan on Oct 11, 2021
  
  fc5415d6
- Y
  
  fix_dp_grad_merge_with_grad_clip_by_global_norm (#36334) · 1026052c
  Committed by Yuang Liu on Oct 11, 2021
  
  1026052c
- Z
  [Paddle-ASP] Revise 4d tensor sparsity mask pattern for conv2d sparsity (#36054) · 00245cfd
  Committed by zlsh80826 on Oct 11, 2021
```
Sparse tensor cores for convolution require the input channel dimension to be 2:4 structured sparse,
so the mask has to be applied along the input channel dimension in order to use them.
```
  00245cfd
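
The 2:4 pattern mentioned in this commit can be illustrated with a small pure-Python sketch. It is not the Paddle-ASP implementation: the function names are hypothetical, the `[out_c][in_c][kh][kw]` nested-list layout is an assumption, and keeping the two largest-magnitude values per group of four is just one common way to choose the mask.

```python
def two_of_four_mask(row):
    """Return a 0/1 mask keeping the 2 largest-magnitude entries in each group of 4."""
    if len(row) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    mask = [0] * len(row)
    for start in range(0, len(row), 4):
        group = range(start, start + 4)
        # Assumption: rank by absolute value; ties keep the earlier index.
        keep = sorted(group, key=lambda i: abs(row[i]), reverse=True)[:2]
        for i in keep:
            mask[i] = 1
    return mask


def conv_weight_mask(weight):
    """Build a 2:4 mask along in_c for a nested list laid out as [out_c][in_c][kh][kw]."""
    out_c, in_c = len(weight), len(weight[0])
    kh, kw = len(weight[0][0]), len(weight[0][0][0])
    mask = [[[[0] * kw for _ in range(kh)] for _ in range(in_c)] for _ in range(out_c)]
    for o in range(out_c):
        for h in range(kh):
            for w in range(kw):
                # Gather the values that share (o, h, w) and vary over input channels.
                channel_slice = [weight[o][c][h][w] for c in range(in_c)]
                for c, m in enumerate(two_of_four_mask(channel_slice)):
                    mask[o][c][h][w] = m
    return mask
```

Every group of four consecutive input channels at a fixed output channel and kernel position ends up with exactly two kept entries, which is the property the commit message says sparse tensor cores require.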
- C
  add reshard module (#35779) · c38b0488
  Committed by caozhou on Oct 11, 2021
```
* add reshard module

* fix conflict

* update reshard module

* update and add unittest

* update reshard module and unittest

* add more unittests
```
  c38b0488
- Y
  
  fix multi-node (#36329) · 7a724ddb
  Committed by yaoxuefeng on Oct 11, 2021
  
  7a724ddb
- T
  
  Fix, test=document_fix (#36336) · 414c252a
  Committed by tianshuo78520a on Oct 11, 2021
  
  414c252a
- W
  enhance yolobox trt plugin (#34128) · 71cb3ff8
  Committed by wangxinxin08 on Oct 11, 2021
```
* enhance yolobox plugin
```
  71cb3ff8
- Q
  [NPU] fix matmul_v2 and utils.run_check, test=develop (#36164) · 7850f7ce
  Committed by Qi Li on Oct 11, 2021
```
* [NPU] fix matmul_v2 and utils.run_check, test=develop

* remove debug files, test=develop

* fix install_check, test=develop

* fix doc, test=develop

* fix review comments, test=develop
```
  7850f7ce
- Q
  [NPU] fix set_value, test=develop (#36272) · 83541fd4
  Committed by Qi Li on Oct 11, 2021
```
* [NPU] fix set_value, test=develop

* fix typo, test=develop

* fix typo, test=develop
```
  83541fd4
- Q
  
  [NPU] fix softmax_with_cross_entropy in dygraph, test=develop (#36297) · 11061325
  Committed by Qi Li on Oct 11, 2021
  
  11061325
- S
  
  fix bug of upload third party to bos (#36311) · 64d08c0e
  Committed by Sing_chan on Oct 11, 2021
  
  64d08c0e
- X
  
  use unified external error message for cufft api (#36114) · 642aaa2e
  Committed by Xiaoxu Chen on Oct 11, 2021
  
  642aaa2e
- F
  fix fft axis (#36321) · 2bf82e75
  Committed by Feiyu Chan on Oct 11, 2021
```
fix: `-1` is used when fft's axis is `0`
```
  2bf82e75
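
The commit message does not show what caused `-1` to be used for axis `0`. One common way this kind of bug arises in Python, shown here purely as an assumption with hypothetical function names, is defaulting with `or`, which treats `0` as falsy:

```python
def resolve_axis_buggy(axis=None):
    # `0 or -1` evaluates to -1, so an explicit axis of 0 is silently replaced.
    return axis or -1


def resolve_axis_fixed(axis=None):
    # Only fall back to the last axis when no axis was given at all.
    return -1 if axis is None else axis
```

Comparing against `None` explicitly keeps `0` as a valid axis while still defaulting to the last axis when the argument is omitted.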
- 李
  
  fix the hidden method in paddle.distributed.utils file (#36210) · ea76457c
  Committed by 李季 on Oct 11, 2021
  
  ea76457c
- W
  add mish trt plugin (#34123) · 2b7b752a
  Committed by wangxinxin08 on Oct 11, 2021
```
* add mish trt plugin, compile & install success, run error. test=develop
* modify code according to review
* add TRT_NOEXCEPT for mish trt plugin
* add unittest for mish trt plugin
* remove unnecessary check of mish in op_teller.cc
* fix some problem of trt8
* add check and modify unittest while converting mish to trt plugin
Co-authored-by: dengkaipeng <dengkaipeng@baidu.com>
```
  2b7b752a
- B
  add skip case in trt converter ut (#36287) · 34bd18ff
  Committed by baoachun on Oct 11, 2021
```
* add skip case in trt converter ut

* disable group_norm trt plugin
```
  34bd18ff
- H
  Add use_cinn Flag and RunFromCinn in PE (#36107) · 5690666c
  Committed by Huihuang Zheng on Oct 11, 2021
```
Add use_cinn flag and use it to control whether we run PaddlePaddle using CINN.

Also add:

Replace PaddlePaddle graph with a CINN graph in a pass
PE Method to feed data and run the graph by CINN
```
  5690666c
- J
  
  Add skip case for conv2d convert test (#36301) · 9b987b3d
  Committed by JingZhuangzhuang on Oct 10, 2021
  
  9b987b3d
09 Oct, 2021 · 9 commits
- Z
  
  Implement Fused BN + Add + Relu with cudnnFusedOps API. (#35955) · 7e6c0cee
  Committed by Zhang Zheng on Oct 09, 2021
  
  7e6c0cee
- Y
  
  Enhance OpTest for bfloat16. (#36079) · 91119271
  Committed by Yiqun Liu on Oct 09, 2021
  
  91119271
- Z
  Add const for OpDesc::id() and VarDesc::id() (#36298) · cb620ca6
  Committed by Zeng Jinle on Oct 09, 2021
```
* add const OpDesc id()

* add const for VarDesc::id()
```
  cb620ca6
- F
  Add new API 'tensordot' (#36273) · 21dc7f40
  Committed by From00 on Oct 09, 2021
```
* Add new API tensordot

* Set timeout value 400 for UT; Fix format for EN docs

* Set timeout value 1000 for UT; Fix format for EN docs

* Remove some input check

* Coding style improve: don't compare boolean values to True or False
using ==
```
  21dc7f40
- Z
  
  fill_diagonal op fix border cross caused by offset (#36212) · 62e41150
  Committed by zhiboniu on Oct 09, 2021
  
  62e41150
- Z
  update fft api path (#36219) · c8a01010
  Committed by zhiboniu on Oct 09, 2021
```
* update fft api path
* add sample code for ihfft2
Co-authored-by: chenfeiyu <chenfeiyu@baidu.com>
```
  c8a01010
- Z
  support ClipGradByGlobalNorm in sharding (#36012) · 623df429
  Committed by zhaoyingli on Oct 09, 2021
```
* support ClipGradByGlobalNorm in sharding

* support ClipGradByGlobalNorm in sharding

* test=allcase
```
  623df429
- W
  C++ support register pass via PassDesc (#36095) · 2fd8deea
  Committed by wuhuanzhou on Oct 09, 2021
```
Support registering GeneratePass from C++, simplifying development for subgraph optimization scenarios such as fusion.
```
  2fd8deea
- W
  fix hasattr(paddle.fluid.ir.PassDesc.OP, '__name__') error (#36229) · d8887afa
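  Committed by wuhuanzhou on Oct 09, 2021
```
For arguments that do not satisfy the conditions of the overridden __getattr__, always raise AttributeError, so that behavior matches the non-overridden version.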
```
  d8887afa
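
Why the exception type matters for `hasattr` can be seen in a short standalone sketch. The two classes below are hypothetical stand-ins, not Paddle's `PassDesc.OP`; they only illustrate the standard Python rule that `hasattr` returns `False` on `AttributeError` and lets any other exception propagate.

```python
class StrictNamespace:
    """Hypothetical object whose __getattr__ resolves a fixed set of names."""

    _known = {"conv2d", "relu"}

    def __getattr__(self, name):
        # __getattr__ is only called after normal attribute lookup fails.
        if name in self._known:
            return f"op:{name}"
        # Raising AttributeError keeps hasattr() and getattr(obj, name, default) working.
        raise AttributeError(f"{type(self).__name__!r} object has no attribute {name!r}")


class LeakyNamespace:
    """Hypothetical counterexample that raises the wrong exception type."""

    def __getattr__(self, name):
        # hasattr() does not catch ValueError, so the caller sees a crash.
        raise ValueError(name)
```

With `StrictNamespace`, `hasattr(obj, '__name__')` simply returns `False`; with `LeakyNamespace`, the same call raises `ValueError`, which is the kind of error the commit title describes.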
08 Oct, 2021 · 2 commits
- J
  Fix for oneDNN conv op (#36284) · 57e8cbec
  Committed by jakpiase on Oct 08, 2021
```
* fix for conv op

* Minor change
```
  57e8cbec
- Z
  Support CUDA Graph on ParallelExecutor (#36250) · f9591bb1
  Committed by Zeng Jinle on Oct 08, 2021
```
* support CUDA Graph on PE

* add ut, fix CI compile

* reduce memory consumption

* fix CUDA 10 CI

* improve coverage

* improve python coverage
```
  f9591bb1
