Commits · d7064f0435ce1c35c2b57bf6fcbef6b2597c5f4f · Crayon鑫 / Paddle

13 Oct, 2021 14 commits

[PaddlePaddle hackathon] + ADD CELU (#36088) · d7064f04
Committed by yujun on Oct 13, 2021
```
* update

* update

* update

* try make CI pass

* doc typo

* update doc string
```
d7064f04

Committed by limingshu on Oct 13, 2021

* A leap of try for cudaLaunchCooperativeKernel

* fix bugs

* Totally replace the lars cuda kernel

* Fix bugs

* a test for lars merge

* Adding las_op_momentum infer_shape

* Fix codes

* use avg_numel instead of max_numel to acquire grid num

* modify unittest files about lars op

* Finally converge when merged-lars works

* fix ctest files

* add merged_operation kernel when cuda version is older than 11

* Fix code style

* fix ctest failure

* fix error

* fix all ctest error and change lars compute code of cpu

* fix bugs on v100.

* revert python modification about lars

* revert python modification codes

0c31579c

Verify the correctness of graph rewrited by GeneratePass (#36116) · 24418479
Committed by wuhuanzhou on Oct 13, 2021
```
Check detail PR description at https://github.com/PaddlePaddle/Paddle/pull/36116
```
24418479
[AMP] add attr is_distributed for layer.to (#36221) · 9a9953d9
Committed by zhangbo9674 on Oct 13, 2021
```
* add attr is_distributed

* refine code

* refine black/white list for pure fp16
```
9a9953d9

Add fp16 for clip_by_norm & clip_by_global_norm (#36198) · 3a869cc5

Committed by zhangbo9674 on Oct 13, 2021

* add fp16 for clip_by_norm api

* support ClipByGlobalNorm for fp16 in dygraph

* add unittest for dygraph clipGlobalNorm

* refine unittest for dygraph clipGlobalNorm for mac and windows

* refine unittest

* add unittest for fp64

* refine unittest for fp64

3a869cc5

support auto parallel data shard (#36055) · 85bb1a85
Committed by Guoxia Wang on Oct 13, 2021

85bb1a85

fix pp comm init bug (#36377) · 817f9ef0
Committed by caozhou on Oct 13, 2021

817f9ef0

[Amp] refine code of amp level (#36362) · 59e425cd
Committed by Leo Chen on Oct 13, 2021
```
* refine amp level

* fix typo

* update tracer._amp_level
```
59e425cd

Remove RunFromCinn in PE because We Will Call CinnRunner in Compute of SubgraphOp (#36385) · e051bba0
Committed by Huihuang Zheng on Oct 13, 2021
```
Remove RunFromCinn method in PE because We Will Call CinnRunner in Compute method of SubgraphOp
```
e051bba0

[New Feature] Support triple grad in Paddle (#36187) · 2c44ee7e

Committed by Jiabin Yang on Oct 13, 2021

* native commit for triple grad of sigmoid

* Updated unittests files

* init functional jacobian api

* Updated trible_test func

* Updated gradient_checker & test_script

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* fix dygraph grad to support high differential

* polish API docstring

* Updated gradient checker and some related files

* fix double grad strip error for high differential

* fix double grad strip error for high differential

* Add Sigmoid triple grad tests

* fix dygraph double grad dtype error when calling for high differential scenario

* Updated triple grad tests func

* Use np.random to initialize ddx

* Updated triple_grad_check func

* add todo for gradient checker and refine some comments

* remove additional code

* add test for warning in backward.py

* format python code

Co-authored-by: veyron95 <veyron_wu@163.com>
Co-authored-by: levi131 <limaolin01@baidu.com>

2c44ee7e

[PaddleInference] Pass: add int8 flag for op (#36042) · d7858c99

Committed by Wangzheee on Oct 13, 2021

* add_int_pass

* add_int8_flag_pass

* add_int8_flag_pass

* fix CMakeLists.txt

* fix test_trt_fc_fuse_quant_dequant_pass.py

* fix python/paddle/fluid/tests/unittests/ir/inference/test_trt_fc_fuse_quant_dequant_pass.py

* fix test_trt_fc_fuse_quant_dequant_pass.py

d7858c99

[PaddlePaddle Hackathon] add AlexNet (#36058) · caa2003a
Committed by fuqianya on Oct 13, 2021
```
* add alexnet
```
caa2003a

Set NIGHTLY tag for 'tensordot' UT (#36354) · 90457d8c
Committed by From00 on Oct 13, 2021

90457d8c

unify usage of tuple and list (#36368) · 3c2bdaa8
Committed by levi131 on Oct 13, 2021
```
* modify format

* modify format
```
3c2bdaa8

12 Oct, 2021 21 commits
- Revert "refine LarsOptimizer (#36351)" (#36369) · 033a73c3
  Committed by Zeng Jinle on Oct 12, 2021
```
This reverts commit b3f6eedb.
```
  033a73c3
- change the paddle.mm to matmul_v2 (#35770) · fba355fb
  Committed by wawltor on Oct 12, 2021
```
* change the paddle.mm to matmul_v2

* update the code for the mm

* update the document for the mm
```
  fba355fb
- [NPU] concat supports dtype int64 for model deepfm (#36327) · 5f1eb839
  Committed by Aganlengzi on Oct 12, 2021
```
* [NPU] modify for model deepfm

* [NPU] unit test delete precision control

* [NPU] add more unit test

* revert elementwise_mul related modification

* [NPU] add more unit tests for concat
```
  5f1eb839
- fix windows bug that python virtual env can't find python executable (#36227) · 6920afeb
  Committed by zhouweiwei2014 on Oct 12, 2021
  
  6920afeb
- delete remove_static_file() function in error.py (#36153) · 40cfe7b2
  Committed by 0x45f on Oct 12, 2021
```
* change time to remove static tempfile

* delete remove_static_file() function
```
  40cfe7b2
- [Autograd.functional] VJP and JVP (#36020) · 1e1aa197
  Committed by Tongxin Bai on Oct 12, 2021
```
* autograd.functional passed pylint checker.

* autograd.functional: fix import errors.

* autograd.functional: fixed unit tests.

* autograd.functional minor format change
```
  1e1aa197
- [NPU] fix elementwise_mul to support broadcast, test=develop (#36258) · 09778f46
  Committed by Qi Li on Oct 12, 2021
```
* [NPU] fix elementwise_mul to support broadcast, test=develop

* remove debug files, test=develop

* add axis support, test=develop
```
  09778f46
- refine LarsOptimizer (#36351) · b3f6eedb
  Committed by Zeng Jinle on Oct 12, 2021
  
  b3f6eedb
- Update loss.py · f77083bb
  Committed by HydrogenSulfate on Oct 11, 2021
  
  f77083bb
- Update test_cross_entropy_loss.py · 59841e6f
  Committed by HydrogenSulfate on Oct 11, 2021
  
  59841e6f
- Update test_cross_entropy_loss.py · a4246b90
  Committed by HydrogenSulfate on Oct 11, 2021
  
  a4246b90
- Update loss.py · 6cd41cec
  Committed by HydrogenSulfate on Oct 11, 2021
  
  6cd41cec
- Update loss.py · 3675f25d
  Committed by HydrogenSulfate on Oct 11, 2021
  
  3675f25d
- Update loss.py · 53dc0143
  Committed by HydrogenSulfate on Oct 11, 2021
  
  53dc0143
- Update loss.py · 8c2fbc31
  Committed by HydrogenSulfate on Oct 11, 2021
  
  8c2fbc31
- Fix the bug when axis is specified and weight is provided · 1d660eb6
  Committed by HydrogenSulfate on Oct 11, 2021
  
  1d660eb6
- [NPU] add int64 kernel for slice, test=develop (#36328) · 8cc7146d
  Committed by Qi Li on Oct 12, 2021
```
* [NPU] add int64 kernel for scale and slice, test=develop

* remove int64 for scale, test=develop
```
  8cc7146d
- Add pool2d test convert (#36338) · e275e423
  Committed by JingZhuangzhuang on Oct 11, 2021
  
  e275e423
- fix bugs in mp_layers, pp_layers and HybridParallelClipGrad (#36144) · d247cf17
  Committed by Haohongxiang on Oct 12, 2021
```
* fix calling bug of HybridParallelClipGrad

* fix bugs of HybridParallelClipGrad

* add unittest of pp with HybridParallelClipGrad

* fix bugs in mp_layers.py

* update

* fix bugs in pp_layers.py

* update
```
  d247cf17
- fft: modify sample code result (#36325) · ec148cab
  Committed by LJQ❤️ on Oct 12, 2021
  
  ec148cab
- Fix stop_gradient in RunProgramOp (#36339) · 2a75b447
  Committed by Aurelius84 on Oct 12, 2021
```
* Fix stop_gradient in RunProgramOp

* fix reference
```
  2a75b447
11 Oct, 2021 5 commits
- [heterps] add fuse_allreduce (#35131) · e5b4dd73
  Committed by danleifeng on Oct 11, 2021
```
* heterps:add fuse_allreduce op; test=develop
* add program_mode in minimize for pslib mode;test=develop
```
  e5b4dd73
- fix for matmul_v2 6D x 2D (#36342) · 339cb191
  Committed by jakpiase on Oct 11, 2021
  
  339cb191
- Add FLAGS_allreduce_record_one_event to remove event waiting number (#36263) · 7b45a46e
  Committed by Zeng Jinle on Oct 11, 2021
```
* add FLAGS_allreduce_record_one_event

* add more comments

* fix ut

* improve coverage

* fix ut, improve coverage
```
  7b45a46e
- Add nn.functional.sparse_attention and some test cases, test=develop (#35757) · 85b77232
  Committed by Liu-xiandong on Oct 11, 2021
```
Add paddle.nn.functional.sparse_attention API

    This PR mainly wraps the sparse_attention functionality in a Python-level API; for the main OP code, see #PR35676.

    In addition, unit tests have been added for the wrapped Python API.
```
  85b77232
- fix_dp_grad_merge_with_grad_clip_by_global_norm (#36334) · 1026052c
  Committed by Yuang Liu on Oct 11, 2021
  
  1026052c

Crayon鑫 / Paddle is in sync with the fork's source project