Commits · 999242e35f450e2904df22a56ca8954f1811dbf8 · BaiXuePrincess / Paddle

19 Oct, 2021 · 10 commits
- [NPU] Add iou_similarity op (#36412) · 999242e3
  Authored by zhulei on Oct 19, 2021
```
* [NPU] Add iou_similarity op

* [NPU] Add iou_similarity op

* [NPU] Add iou_similarity op
```
  999242e3
- [heterps]edit shrink and unseenday logit for pslib (#36194) · 9e494472
  Authored by danleifeng on Oct 19, 2021
  
  9e494472
- Inference add type check in copy_from_cpu (#36429) · be6a8330
  Authored by Wilber on Oct 19, 2021
```
* update

* fix ut error

* update ut
```
  be6a8330
- add nearest_interp_v2 trt plugin (#34126) · 7b67f398
  Authored by wangxinxin08 on Oct 19, 2021
```
* add nearest_interp_v2 trt plugin
```
  7b67f398
- [hybrid] static model parallel dropout support deterministic RandomSeedGenerator (#36228) · 8cc8e411
  Authored by WangXi on Oct 19, 2021
  
  8cc8e411
- fix replicate pad when input size is 0 (#36510) · d89a759b
  Authored by littletomatodonkey on Oct 19, 2021
```
* fix replicate pad when input size is 0
* add unit test
```
  d89a759b
- catch the generatorfunction and intercept it. (#35369) · 7edcc4fb
  Authored by xiongkun on Oct 19, 2021
```
* catch the generatorfunction and intercept it.

* add test generator

* add test case

* refine the testcase
```
  7edcc4fb
- [paddle.linalg.qr] Add the Qr Operator (#35742) · 34d785c2
  Authored by Yulong Ao on Oct 19, 2021
```
* Add QR decomposition op

* Change codes to adapt to new svd_helper

* Update linalg.py

Restore the deleted comma

* Restore the deleted line

* Update linalg.py

* Update linalg.py

* Improve the qr code by reviews

* Update QR based on CI results

* Update qr doc, test=document_fix

* Change unsafe and ill-formed codes
```
  34d785c2
- Add auto parallel cost model and unittests (#36363) · a573a7ed
  Authored by YipZLF on Oct 19, 2021
```
* Add auto parallel cost model and unittests

* Fixed code styles.

* Fixed bugs and codes style

* fixed typo

* Improved code style: object encapsulation.

* Fixed codes.

* Refactored estimate_cost

* Fixed typo
```
  a573a7ed
- Add pow2_decay_with_linear_warmup op (#36421) · 305b99a0
  Authored by Zeng Jinle on Oct 19, 2021
```
* add pow2_warmup op

* remove contrib __all__

* add AttrT

* rename

* follow comments

* fix duplicate PADDLE_RESTRICT
```
  305b99a0
18 Oct, 2021 · 9 commits

[HybridParallel]Support fp16 in dygraph hybrid parallel (#36420) · 10f0a0f6

Authored by Haohongxiang on Oct 18, 2021

* [HybridParallel]Support fp16 in dygraph hybrid parallel

* update

* update

* update for recompute

* add unittest of pp+fp16

* add unittest of recompute+fp16

* update

* modify ut

10f0a0f6

Added softplus FP32 FWD OneDNN kernel (#36382) · bdac9ff6

Authored by jakpiase on Oct 18, 2021

* added softplus

* refactored softplus op

* deleted unnecessary file

* added missing file

* added formatting

* disabled tests if GPU is used

* added reviewer suggestion

* unified softplus kernel

bdac9ff6

Lml/vhp (#36146) · 4c0ad772

Authored by levi131 on Oct 18, 2021

* init functional jacobian api

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* init hessian API

* save status

* polish API docstring

* modify docstring

* add utils.py

* save status

* fix dygraph double grad dtype error when calling for high differential scenario

* reinvoke ci

* test_hessian.py is ok

* polish hessian API

* init vhp

* Revert "init vhp"

This reverts commit cbd4d3b66abe82b0ac10721b9eddeb7d82e0a1c8.

* init vhp

* finish vhp API logically

* add test for partial_engine.cc

* modify numerical_delta with dtype float32

* merge fix for dtype float64

* spell fix

* save status

* polish code

* rm _stop_gradient_pre_process

* save status

* add example for vhp interface

* add _compute_numerical_vjp and _compute_numerical_vhp

* test is ok

* vhp is ok

* add testVHPFloat64

* modify for comments

* modify format

* modify format

* save status

* test_vhp is ok

* finish code polish

* small modify for v is None
Co-authored-by: JiabinYang <360788950@qq.com>

4c0ad772

[NPU] add kernels for elementwise_add gather_nd tile, test=develop (#36464) · cbd15f7d
Authored by Qi Li on Oct 18, 2021

cbd15f7d
[NPU] fix dtype for arg_max, test=develop (#36457) · 8757fc5b
Authored by Qi Li on Oct 18, 2021

8757fc5b
quant support matmul_v2 (#36469) · 051544b6
Authored by ceci3 on Oct 18, 2021
```
* quant support matmul_v2

* fix format
```
051544b6
[XPU AMP] 1. xpu support gradient acc 2. xpu support create tensor in dygraph... · d19a9b39
Authored by taixiurong on Oct 18, 2021
```
[XPU AMP] 1. xpu support gradient acc 2. xpu support create tensor in dygraph 3. xpu support update weight params in amp (#36439)
```
d19a9b39

[autograd.functional] Fix a bug on handling v=None in vjp and jvp (#36445) · 79dbbcce

Authored by Tongxin Bai on Oct 18, 2021

* autograd.functional passed pylint checker.

* autograd.functional: fix import errors.

* autograd.functional: fixed unit tests.

* autograd.functional minor format change

* [autograd.functional] Fixed vjp and jvp's v=None bug.

79dbbcce

modify ut of cond (#36475) · e496d1e9
Authored by Haohongxiang on Oct 18, 2021

e496d1e9

15 Oct, 2021 · 4 commits

fix no_grad context error in train mode when using save/load (#36434) · 37257d6a
Authored by 0x45f on Oct 15, 2021
```
* fix no_grad context error in train mode when using save/load

* change net to train mode in test case
```
37257d6a

Add BuildCinnPass (#36345) · b3f02c57

Authored by jiangcheng on Oct 15, 2021

* Add CinnSubgraphSearchPass

* solve CI problem of subgraph order not same

* fix some bug by review advices

* ensure the independence of subgraphs, meaning a subgraph should not link to the out-graph

* rename cinn_subgraph_search_pass to build_cinn_pass and delete paddle_to_cinn_pass

* add flag to control whether append build cinn pass

* remove AppendPass at ParallelExecutorPassBuilder

* rename paddle_to_cinn_pass to build_cinn_pass in build_strategy and close test_run_from_cinn

b3f02c57

[New Feature] Support tanh triple grad (#36225) · 808be657

Authored by Jiabin Yang on Oct 15, 2021

* native commit for triple grad of sigmoid

* Updated unittests files

* init functional jacobian api

* Updated trible_test func

* Updated gradient_checker & test_script

* finish test with dtype float32

* add float64 test case

* polish code

* use atol=1e-5 with dtype float64

* fix for ci

* set timeout for test_jacobian

* fix dygraph grad to support high differential

* polish API docstring

* Updated gradient checker and some related files

* fix double grad strip error for high differential

* fix double grad strip error for high differential

* Add Sigmoid triple grad tests

* fix dygraph double grad dtype error when calling for high differential scenario

* Updated triple grad tests func

* Use np.random to initialize ddx

* Updated triple_grad_check func

* add todo for gradient checker and refine some comments

* remove additional code

* add test for warning in backward.py

* add tanh triple grad

* format python code

* refine code
Co-authored-by: veyron95 <veyron_wu@163.com>
Co-authored-by: levi131 <limaolin01@baidu.com>

808be657

fix momentum ops (#36452) · 4dda18a8
Authored by Zeng Jinle on Oct 15, 2021

4dda18a8

14 Oct, 2021 · 9 commits
- add sparse_embedding doc (#36283) · 6ccc2a40
  Authored by Yanxing Shi on Oct 14, 2021
```
* add sparse_embedding doc

* delete wrong space

* fix error for sample code

* fix error for doc compile

* delete __all__

* modify sample code
```
  6ccc2a40
- enable 3rd order test case (#36427) · 3cf57646
  Authored by levi131 on Oct 14, 2021
  
  3cf57646
- Add the complete code and related files of resnet_unit_op (#36366) · 12e6dbbc
  Authored by Zhang Zheng on Oct 14, 2021
  
  12e6dbbc
- [NPU] Add density_prior_box (#36361) · bed4fb27
  Authored by zhulei on Oct 14, 2021
```
* [NPU] Add density_prior_box op

* [NPU] Add density_prior_box op
```
  bed4fb27
- Merge momentum ops/kernels (#36380) · f4eda869
  Authored by Zeng Jinle on Oct 14, 2021
```
* merge momentum ops

* update

* add ut to improve coverage

* remove optimizer change

* fix error msg

* update ut

* add __restrict__ for CUDA

* update ut

* move merged_momentum_op to optimizer dir

* fix coverage
```
  f4eda869
- refine lars (#36409) · eb722e34
  Authored by Zeng Jinle on Oct 14, 2021
  
  eb722e34
- [HybridParallel]Rebuild code for pipeline (#36396) · 8ffcc7c8
  Authored by ShenLiang on Oct 14, 2021
```
* add no_sync for parameters sync

* add pipeline for moe
```
  8ffcc7c8
- Add static memory analysis module (#36408) · fb68ea62
  Authored by Zeng Jinle on Oct 14, 2021
```
* add memory_analysis

* fix has_none
```
  fb68ea62
- [hybrid enhance] add flag to control the avg position for grad merge under pipeline mode (#36384) · 03d8304f
  Authored by Yuang Liu on Oct 14, 2021
  
  03d8304f
13 Oct, 2021 · 8 commits

[PaddlePaddle hackathon] + ADD CELU (#36088) · d7064f04
Authored by yujun on Oct 13, 2021
```
* update

* update

* update

* try make CI pass

* doc typo

* update doc string
```
d7064f04

Merge lars op (#35476) · 0c31579c

Authored by limingshu on Oct 13, 2021

* A leap of try for cudaLaunchCooperativeKernel

* fix bugs

* Totally replace the lars cuda kernel

* Fix bugs

* a test for lars merge

* Adding las_op_momentum infer_shape

* Fix codes

* use avg_numel instead of max_numel to acquire grid num

* modify unittest files about lars op

* Finally converge when merged-lars works

* fix ctest files

* add merged_operation kernel when cuda version is older than 11

* Fix code style

* fix ctest failure

* fix error

* fix all ctest error and change lars compute code of cpu

* fix bugs on v100.

* revert python modification about lars

* revert python modification codes

0c31579c

Verify the correctness of graph rewrited by GeneratePass (#36116) · 24418479
Authored by wuhuanzhou on Oct 13, 2021
```
Check detail PR description at https://github.com/PaddlePaddle/Paddle/pull/36116
```
24418479
[AMP] add attr is_distributed for layer.to (#36221) · 9a9953d9
Authored by zhangbo9674 on Oct 13, 2021
```
* add attr is_distributed

* refine code

* refine black/white list for pure fp16
```
9a9953d9

Add fp16 for clip_by_norm & clip_by_global_norm (#36198) · 3a869cc5

Authored by zhangbo9674 on Oct 13, 2021

* add fp16 for clip_by_norm api

* support ClipByGlobalNorm for fp16 in dygraph

* add unittest for dygraph clipGlobalNorm

* refine unittest for dygraph clipGlobalNorm for mac and windows

* refine unittest

* add unittest for fp64

* refine unittest for fp64

3a869cc5

support auto parallel data shard (#36055) · 85bb1a85
Authored by Guoxia Wang on Oct 13, 2021

85bb1a85
fix pp comm init bug (#36377) · 817f9ef0
Authored by caozhou on Oct 13, 2021

817f9ef0
[Amp] refine code of amp level (#36362) · 59e425cd
Authored by Leo Chen on Oct 13, 2021
```
* refine amp level

* fix typo

* update tracer._amp_level
```
59e425cd

BaiXuePrincess / Paddle is up to date with the fork source project