Commits · 69a04209fc72a1b9181ea0f224b51ff920927f0c · PaddlePaddle / Paddle

Feb 23, 2022 · 8 commits

refactor range unittest for kunlun (#39800) · 69a04209
Committed by zhangxiaoci on Feb 23, 2022
```
*test=kunlun
```

[KP] Add elementwise add xpu after phi, test=develop (#39787) · 1a1a2ce8

Committed by Liu-xiandong on Feb 23, 2022

* [KP] Add elementwise add xpu, test=develop

* modify the File Permissions

* modify the copyright time

* modify code style

* modify code style

update gather_nd trt converter ut (#39584) · 4130b640
Committed by baoachun on Feb 23, 2022
```
* update gather_nd trt converter ut

* update ut
```
refactoring gather/masked_select/arg_max unittests for kunlun, *test=kunlun (#39711) · da492a13
Committed by TTerror on Feb 23, 2022

fix 'is with a literal' warning (#39798) · 22abb6b3
Committed by Leo Chen on Feb 23, 2022
```
* fix 'is with a literal'

* fix typo
```
fix activation ut typo xpu. test=kunlun (#39813) · 9880595a
Committed by houj04 on Feb 23, 2022

[Eager] Support Eager mode for some model testcase (#39248) · abe232d8

Committed by wanghuancoder on Feb 23, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop
Co-authored-by: JiabinYang <360788950@qq.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>

[bf16] add bf16 kernel: elementwise_div (#39602) · ca4df333

Committed by zhangbo9674 on Feb 23, 2022

* add elementwise_div

* refine rocm

* refine code

* refine op register

* solve conflict

* refine unittest

* refine unittest precision

* add rocm

Feb 22, 2022 · 14 commits
- Auto Parallel support conditional block (#39612) · a08ee62a
  Committed by JZ-LIANG on Feb 22, 2022
```
* add subblock logic for context and partitioner

* partitioner support sub blocks

* revise typos

* fixed param init bug for while

* chmod 644

* add unitest

* mv forward parser

* update unitest

* update dist op ctx

* update dist op ctx

* fixed bug in dist op ctx

* fixed bug for recompute subblock
```
- disable some distribute test case when in CPU test env (#39801) · ae8c811a
  Committed by YUNSHEN XIE on Feb 22, 2022
  
- added round fwd onednn kernel (#39653) · 74c0bc1c
  Committed by jakpiase on Feb 22, 2022
  
- Add the implementation of TCP Store (#39384) · b95cd3b7
  Committed by lilong12 on Feb 22, 2022
```
* add tcp_socket and tcp_store
```
- delete gather_ut skip_case (#39657) · da43e065
  Committed by feng_shuai on Feb 22, 2022
```
* delete gather_ut skip_case

* add trt version limit
```
- Adapt to batch_norm_grad op and add align function in roi_align op for kunlun (#39685) · f33ae206
  Committed by Leo Guo on Feb 22, 2022
```
* Adapt to batch_norm_grad op and add align function in
roi_align op for kunlun, *test=kunlun

* Adapt to batch_norm, batch_norm_grad op api for kunlun, and add unit-tests of batch_norm, roi_align. *test=kunlun
```
- update unittests for nearest_interp_v2_op_xpu: 'sync' from gpu. test=kunlun (#39768) · e89bf25b
  Committed by houj04 on Feb 22, 2022
  
- [Auto Parallel] Add the high-level Engine API (#39709) · 5595fdbb
  Committed by Yulong Ao on Feb 22, 2022
```
* [Auto Parallel] Add the high-level Engine API

* Update the test cmakefile
```
- refactor reshape2/shape unittest for kunlun (#39665) · c8d6c146
  Committed by zhangxiaoci on Feb 22, 2022
```
*test=kunlun
```
- delete skip_case for dropout_ut (#39629) · cdf05dfc
  Committed by feng_shuai on Feb 22, 2022
  
- fix:Modify matrix latitude (#39686) · b8dbffb7
  Committed by feng_shuai on Feb 22, 2022
  
- Modified RandomKernel with Kernel Primitive API (#39666) · 9f94821b
  Committed by niuliling123 on Feb 22, 2022
```
* Modified RandomKernel with Kernel Primitive API

* update pten.h to phi.h

* update

* update fullKernel
```
- [PTen->Phi PR2] Rename PT_REGISTER macro to PD_REGISTER (#39790) · 4a338796
  Committed by Chen Weihang on Feb 22, 2022
```
* unify register macro

* rename declare macro

* fix infrt error
```
- [fleet exe] supprot fp16 feed and fetch on cpp side (#39758) · 73bf9673
  Committed by Yuang Liu on Feb 22, 2022
  
Feb 21, 2022 · 9 commits

[PluggableDevice]custom kernel to phi core structs (#39690) · 68631ed4

Committed by Aganlengzi on Feb 21, 2022

* [PluggableDevice]custom kernel to pten core structs

* mod extension.h for custom op

* compatible python for CI

* support custom context

* refactor to pten

* fix windows and ut

[Dy2St]Fix cond grad error when handle tensor array (#39689) · a863b32e
Committed by 0x45f on Feb 21, 2022
```
* fix cond grad error when handle tensor array

* add UT
```

disable some distribute test case when in CPU test env (#39682) · 941bdb41

Committed by wanghuancoder on Feb 21, 2022

* disable some distribute test case when in CPU test env, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

fix fill_constant bug, *test=kunlun (#39681) · b1805727
Committed by z8hanghuan on Feb 21, 2022
```
* fix fill_constant bug, *test=kunlun

* fix fill_constant bug,*test=kunlun
```

[bf16] add bf16 kernel: elementwise_max (#39461) · 93016331

Committed by zhangbo9674 on Feb 21, 2022

* add elementwise_max & unittest

* refine cuda register and unittest

* refine unittest

* refine uinttest for bf16

* refine optest

* refine code

* refine unittest

* refine unittest

update unittests for activation ops on xpu test=kunlun (#39677) · 83dd7e47

Committed by houj04 on Feb 21, 2022

* update unittests for activation ops on xpu. test=kunlun

* update input data range. test=kunlun

* update input data range. test=kunlun

[PTen]Remove infershape of Reshape OP (#39631) · 45dd4a5f

Committed by YuanRisheng on Feb 21, 2022

* remove infershape and Xshape

* add xshape

* fix bugs when run ci

* fix bugs when run ci

* fix bugs when run infrt test

* pass converage

update unittest for reduce_prod op on xpu. test=kunlun (#39671) · f858b645

Committed by houj04 on Feb 21, 2022

* update unittest for reduce_prod op on xpu. test=kunlun

* update unittest for reduce_prod op on xpu. test=kunlun

* bugfix: use dtype instead of float32. test=kunlun

fix alignment bug (#39747) · 65ced1fa
Committed by sneaxiy on Feb 21, 2022

Feb 20, 2022 · 1 commit
- Add int16 support for several ops (#39636) · 267275d9
  Committed by sneaxiy on Feb 20, 2022
```
* add more op int16 support

* fix xpu ci
```
Feb 19, 2022 · 2 commits

Enabled test_matmul_v2_op for final state Eager Dygraph (#39504) · 77625d7d
Committed by Zhanlue Yang on Feb 19, 2022
```
* Enabled test_matmul_v2_op for final state Eager Dygraph

* Fixed minor issue

* Fixed format issue
```

Add the DistributedFusedLamb optimizer (#39148) · 5df3cd61

Committed by sneaxiy on Feb 19, 2022

* add DistributedFusedLamb op

* polish code

* fix compile error

* compatible with pten changement

* fix rocm compile error

* improve converage

* update upstream/develop

* fix cast_with_ptr.h

* add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1

* fix clip before allreduce

* add use_master_param_norm

* code polish

* fix bug

* fix ROCM ci

Feb 18, 2022 · 6 commits
- bug fix (#39630) · bbf31a4e
  Committed by zhaoyingli on Feb 18, 2022
  
- [AMP] support GPU BF16 amp for dygraph (#39029) · 7d6d3848
  Committed by zhangbo9674 on Feb 18, 2022
```
* support dtype param for auto_cast

* add amp_dtype for tracer

* add unsupported bf16 list

* support bf16 amp for O2

* refine python interface for bfloat16

* refine code

* refine code

* refine unittest

* refine code

* refine code

* add bf16 o1

* refine code by comment

* add gradient accumulator

* add recompute
```
- refactor the forward implementation of shape npu op (#39613) · e674af23
  Committed by baoachun on Feb 18, 2022
  
- new way of unit test , *test=kunlun (#39650) · c5179772
  Committed by z8hanghuan on Feb 18, 2022
```
* new way of unit test , *test=kunlun

* new way of ut, *test=kunlun
```
- [MLU]add matmul and matmul_v2 op (#39539) · 229ec32a
  Committed by qipengh on Feb 18, 2022
```
* [MLU]add matmul and matmul_v2 op

* [MLU] fix data_type and del matmul

* [MLU] fix compile error

* [MLU] fix ci_check error
```
- add flatten op pytest (#39534) · 50fb57c9
  Committed by joeqiao12 on Feb 18, 2022
  
PaddlePaddle / Paddle: last synced successfully nearly 2 years ago
