Commits · d3c9394202579ab65bedfb3cbe0cc058a410f600 · PaddlePaddle / Paddle

18 Oct, 2021 · 3 commits
- Fix conv2d op_teller error (#36474) · d3c93942
  Committed by JingZhuangzhuang on Oct 17, 2021
  
  d3c93942
- [autograd.functional] Fix a bug on handling v=None in vjp and jvp (#36445) · 79dbbcce
  Committed by Tongxin Bai on Oct 18, 2021
```
* autograd.functional passed pylint checker.

* autograd.functional: fix import errors.

* autograd.functional: fixed unit tests.

* autograd.functional minor format change

* [autograd.functional] Fixed vjp and jvp's v=None bug.
```
  79dbbcce
- modify ut of cond (#36475) · e496d1e9
  Committed by Haohongxiang on Oct 18, 2021
  
  e496d1e9
17 Oct, 2021 · 2 commits
- refine rescale_grad (#36490) · 4e036fa1
  Committed by Zeng Jinle on Oct 17, 2021
  
  4e036fa1
- Revert "fix the initializer of resnet unit op (#36483)" (#36487) · 314cc495
  Committed by Zeng Jinle on Oct 17, 2021
```
This reverts commit 0452f27c.
```
  314cc495
16 Oct, 2021 · 1 commit
- fix the initializer of resnet unit op (#36483) · 0452f27c
  Committed by Zhang Zheng on Oct 16, 2021
```
* fix the initializer of resnet unit op

* fix the initializer of resnet unit op
```
  0452f27c
15 Oct, 2021 · 11 commits

Remove wrong __restrict__ of CUDA LarsMomentumOpKernel (#36460) · adb80494
Committed by Zeng Jinle on Oct 15, 2021
```
* remove wrong restrict

* remove master_param_out __restrict__

* update
```
adb80494
fix opt-offload save bug (#36433) · e703a2ed
Committed by duanboqiang on Oct 15, 2021

e703a2ed
Add ResNetUnit Python API (#35426) · 12882b2f
Committed by Zhang Zheng on Oct 15, 2021

12882b2f
feat: Add TRT support for 3D(batch_norm_op and elementwise_add_op) (#36446) · 2de0b58e
Committed by feng_shuai on Oct 15, 2021

2de0b58e

Committed by Nyakku Shigure on Oct 15, 2021

* add resnext model
* add zh docs
* add unittest
* test performance
Co-authored-by: Ainavo <ainavo@163.com>
Co-authored-by: pithygit <pyg20200403@163.com>

277c9a55

fix no_grad context error in train mode when using save/load (#36434) · 37257d6a
Committed by 0x45f on Oct 15, 2021
```
* fix no_grad context error in train mode when using save/load

* change net to train mode in test case
```
37257d6a
dynamic load mkl as a fft backend when it is avaialble and requested (#36414) · f45e6cf6
Committed by Feiyu Chan on Oct 15, 2021

f45e6cf6

Add BuildCinnPass (#36345) · b3f02c57

Committed by jiangcheng on Oct 15, 2021

* Add CinnSubgraphSearchPass

* solve CI problem of subgraph order not same

* fix some bug by review advices

* ensure the independently of subgraph, that mean the subgraph should not have link to out-graph

* rename cinn_subgraph_search_pass to build_cinn_pass and delete paddle_to_cinn_pass

* add flag to control wheter append build cinn pass

* remove AppendPass at ParallelExecutorPassBuilder

* rename paddle_to_cinn_pass to build_cinn_pass in build_strategy and close test_run_from_cinn

b3f02c57

[New Feature] Support tanh triple grad (#36225) · 808be657

Committed by Jiabin Yang on Oct 15, 2021

* native commit for triple grad of sigmod

* Updated unittests files

* init functional jacobian api

* Updated trible_test func

* Updated gradient_checker & test_script

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* fix dygraph grad to support high differential

* polish API docstring

* Updated gradient checker and some related files

* fix double grad strip error for high differential

* fix double grad strip error for high differential

* Add Sigmoid triple grad tests

* fix dygraph double grad dtype error when calling for high differential senario

* Updated triple grad teses func

* Use np.random to initialize ddx

* Updated triple_grad_check func

* add todo for gradient checker and refine some comments

* remove additional code

* add test for warnging in backward.py

* add tanh triple grad

* format python code

* refine code
Co-authored-by: veyron95 <veyron_wu@163.com>
Co-authored-by: levi131 <limaolin01@baidu.com>

808be657

fix momentum ops (#36452) · 4dda18a8
Committed by Zeng Jinle on Oct 15, 2021

4dda18a8
close some check on CI-OP-Benchmark, test=develop (#36442) · 8566cc98
Committed by wuhuanzhou on Oct 15, 2021

8566cc98

14 Oct, 2021 · 18 commits
- add sparse_embedding doc (#36283) · 6ccc2a40
  Committed by Yanxing Shi on Oct 14, 2021
```
* add sparse_embedding doc

* delete wrong space

* fix error for sample code

* fix error for doc compile

* delete __all__

* modify sample code
```
  6ccc2a40
- optimize-offload support adamw op type (#36432) · 66c58fa3
  Committed by duanboqiang on Oct 14, 2021
  
  66c58fa3
- fix lars (#36431) · 8256f6fa
  Committed by Zeng Jinle on Oct 14, 2021
  
  8256f6fa
- enable 3rd order test case (#36427) · 3cf57646
  Committed by levi131 on Oct 14, 2021
  
  3cf57646
- refine merge lars (#36428) · 63fd7d66
  Committed by Zeng Jinle on Oct 14, 2021
  
  63fd7d66
- inference support bert when exists matmul_v2 (#36424) · 3e6d9dbb
  Committed by Wilber on Oct 14, 2021
```
* support bert when exists matmul_v2

* update
```
  3e6d9dbb
- Add the complete code and related files of resnet_unit_op (#36366) · 12e6dbbc
  Committed by Zhang Zheng on Oct 14, 2021
  
  12e6dbbc
- [NPU] Add density_prior_box (#36361) · bed4fb27
  Committed by zhulei on Oct 14, 2021
```
* [NPU] Add density_prior_box op

* [NPU] Add density_prior_box op
```
  bed4fb27
- Revert "Implemented LRU based cache clearing (#36290)" (#36426) · 5d18967b
  Committed by lidanqing on Oct 14, 2021
```
This reverts commit bf748f24.
```
  5d18967b
- Merge momentum ops/kernels (#36380) · f4eda869
  Committed by Zeng Jinle on Oct 14, 2021
```
* merge momentum ops

* update

* add ut to improve coverage

* remove optimizer change

* fix error msg

* update ut

* add __restrict__ for CUDA

* update ut

* move merged_momentum_op to optimizer dir

* fix coverage
```
  f4eda869
- refine lars (#36409) · eb722e34
  Committed by Zeng Jinle on Oct 14, 2021
  
  eb722e34
- [HybridParallel]Rebuild code for pipeline (#36396) · 8ffcc7c8
  Committed by ShenLiang on Oct 14, 2021
```
* add no_sync for parameters sync

* add pipeline for moe
```
  8ffcc7c8
- reduce some unittest's parallel number to avoding timeout failure (#36397) · 693b1aa1
  Committed by Sing_chan on Oct 14, 2021
  
  693b1aa1
- fix import bug for assign (#36406) · cb5bf583
  Committed by levi131 on Oct 14, 2021
  
  cb5bf583
- Add static memory analysis module (#36408) · fb68ea62
  Committed by Zeng Jinle on Oct 14, 2021
```
* add memory_analysis

* fix has_none
```
  fb68ea62
- [hybrid enhance] add flag to control the avg position for grad merge under pipeline mode (#36384) · 03d8304f
  Committed by Yuang Liu on Oct 14, 2021
  
  03d8304f
- Sparsity support (#36413) · b857d755
  Committed by JingZhuangzhuang on Oct 13, 2021
```
* add pool2d convert test

* modify error

* modify error

* modify error

* modify error

* modify error

* modify error

* sparsity support
```
  b857d755
- clean inference logs when config.DisableGlogInfo is triggered (#36356) · 7f5128f4
  Committed by Pei Yang on Oct 14, 2021
  
  7f5128f4
13 Oct, 2021 · 5 commits

fix BatchNorm for fp16 (#36376) · 8fd1b6ad
Committed by Guoxia Wang on Oct 13, 2021
```
* fix BatchNorm for fp16
```
8fd1b6ad
[PaddlePaddle hackathon] + ADD CELU (#36088) · d7064f04
Committed by yujun on Oct 13, 2021
```
* update

* update

* update

* try make CI pass

* doc typo

* update doc string
```
d7064f04

Merge lars op (#35476) · 0c31579c

Committed by limingshu on Oct 13, 2021

* A leap of try for cudaLaunchCooperativeKernel

* fix bugs

* Totally replace the lar cuda kernel

* Fix bugs

* a test for lars merge

* Adding las_op_momentum infer_shape

* Fix codes

* use avg_numel instead of max_numel to acquire grid num

* modify unittest files about lars op

* Finally converge when merged-lars works

* fix ctest files

* add merged_operation kernel when cuda version is older than 11

* Fix code style

* fix ctest failure

* fix error

* fix all ctest error and change lars compute code of cpu

* fix bugs on v100.

* revert python modififation about lars

* revert python modification codes

0c31579c

Verify the correctness of graph rewrited by GeneratePass (#36116) · 24418479
Committed by wuhuanzhou on Oct 13, 2021
```
Check detail PR description at https://github.com/PaddlePaddle/Paddle/pull/36116
```
24418479
[AMP] add attr is_distributed for layer.to (#36221) · 9a9953d9
Committed by zhangbo9674 on Oct 13, 2021
```
* add attr is_distributed

* refine code

* refine black/white list for pure fp16
```
9a9953d9

PaddlePaddle / Paddle · synced successfully about 1 year ago