Commits · b33a3c232c6161b2e1f7e4c4c78012e2715c9047 · BaiXuePrincess / Paddle

26 Feb, 2022 · 2 commits

Support custom implement for C++ API (#39521) · caea126c

Authored by zyfncg on Feb 26, 2022

* Support custom implement for C++ API

* rename api_invoke_impl to api_custom_impl

* remove manual_api

* delete mutable_data in copy_to api

* fix problem of copy_to

* add unittest for infer_meta_fn_factory

* fix split cofig in yaml

* fix split cofig in yaml

* modify sum api yaml

* add copy_to wrapped infermeta

* rollback copy impl

caea126c

[Eager Hook] Support GradientHook and ReduceHook, expose related interface to python (#39893) · a456dda6
Authored by Weilong Wu on Feb 26, 2022
```
* Support Eager Hook, expose interface to python

* Fix CI issue
```
a456dda6

25 Feb, 2022 · 6 commits
- added logsoftmax oneDNN kernel (#39793) · 584844ec
  Authored by jakpiase on Feb 25, 2022
  
  584844ec
- Add MultiTensorApply to calculate L2-Norm in DistributedFusedLamb optimizer (#39900) · d32a0102
  Authored by sneaxiy on Feb 25, 2022
```
* add multi tensor apply l2 norm

* add multi_tensor_apply code

* make sizeof(TensorMeta) smalller

* move code to distributed_fused_lamb_op.cu

* remove useless FLAGS
```
  d32a0102
- [MLU]support launch process on mlu (#39839) · 2533cac6
  Authored by zn on Feb 25, 2022
  
  2533cac6
- [bf16] add bf16 kernel: elementwise_add elementwise_mul elementwise_sub (#39716) · 2fedd39b
  Authored by zhangbo9674 on Feb 25, 2022
```
* add ele_add

* add ele_mul

* add ele_sub

* sovle conflict

* fix npu

* refine ele_add

* add ele_mul unittest

* refine ele_sub

* refine ci

* refine unittest
```
  2fedd39b
- add reduce_min and reduce_max (#39899) · 44da9b42
  Authored by joeqiao12 on Feb 25, 2022
  
  44da9b42
- [MLU] add elementwise_mul mlu kernel (#39864) · 04d324b2
  Authored by fwenguang on Feb 25, 2022
  
  04d324b2

24 Feb, 2022 · 11 commits

[IPU] Update IpuStrategy Python Part (#39646) · e0409c93

Authored by Allen Guo on Feb 24, 2022

* Update IpuStrategy Python Part

* add docs

* add add_custom_op for ipu_strategy

* fix build warning

* rm unneeded part

* clean api

* fix typo

* update option names

* update IpuStrategy doc

e0409c93

[MLU]add mlu kernel for allreduce (#39788) · ce207c3a
Authored by zn on Feb 24, 2022

ce207c3a

fix paddle.where torch diff (#39859) · c5ae43a2
Authored by ronnywang on Feb 24, 2022

c5ae43a2

Fix unittests for eigh op (#39568) · 539fb0d7
Authored by crystal on Feb 24, 2022
```
* fix eigh test

* modify atol and rtol
```
539fb0d7

[doc]Fix maxunpool2d example (#39862) · eb4ad509

Authored by xiaoting on Feb 24, 2022

* fix maxunpool2d example, test=document_fix

* fix maxunpool2d example, test=document_fix

eb4ad509

Added nearest interp v2 BF16 FWD kernel (#39490) · 2ec943a7

Authored by jakpiase on Feb 24, 2022

* added nearest interp v2 bf16

* disabled bilinear interp nhwc test

* added skipping UT for gpu

* added NHWC support

* removed unnecessary statements

* minor change

* CI fix

* added appropriate changes to interpolate_v1

* fix after review

* minor change

* minor change

* revert unwanted deletions

* CI fix

2ec943a7

Refactored GradNodeAccumulation data structure and behaviour (#39526) · 1abfc8dd

Authored by Zhanlue Yang on Feb 24, 2022

* Refactored GradNodeAccumulation data structure and behaviour

* Fixed CI issues

* Fix compilation issues

* Fixed minor issues

* Reverted changes for intermediate and OverwriteOutput

* fixed minor issue

* Fixed code format issues

* Fixed CI-Coverage issue

* Fixed CI issues

1abfc8dd

fix 'invalid escape sequence' (#39842) · 4e26fa57
Authored by Leo Chen on Feb 24, 2022
```
* fix 'invalid escape sequence'

* fix assert error
```
4e26fa57

Add Note for Place of Executor in Parallel Environment (#39063) · 867224b2
Authored by Huihuang Zheng on Feb 24, 2022
```
Add note for Place of Executor in parallel environment
```
867224b2

fix bug for block state (#39854) · 5fd7b5c3
Authored by JZ-LIANG on Feb 24, 2022

5fd7b5c3

[Eager] save load testcase (#39571) · 6b5749eb

Authored by wanghuancoder on Feb 24, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* save load, eager, test=develop

* save load, eager, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* revert static_runner, test=develop

* EagerTensor to Tensor, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop

* merge, test=develop

* merge, test=develop

Co-authored-by: JiabinYang <360788950@qq.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>

6b5749eb

23 Feb, 2022 · 11 commits

added paddle_bfloat to requirements (#39740) · 2457a7d1
Authored by jakpiase on Feb 23, 2022

2457a7d1

Add ProcessGroupNCCL for distributed training (#39737) · 0b205817
Authored by ShenLiang on Feb 23, 2022
```
* add processgroup_nccl
```
0b205817

change CUDA implementaion of bernoulli OP (#39732) · b9675acc
Authored by zhouweiwei2014 on Feb 23, 2022
```
* change CUDA implementaion of bernoulli OP

* fix CI
```
b9675acc

refactor range unittest for kunlun (#39800) · 69a04209
Authored by zhangxiaoci on Feb 23, 2022
```
*test=kunlun
```
69a04209

[KP] Add elementwise add xpu after phi, test=develop (#39787) · 1a1a2ce8

Authored by Liu-xiandong on Feb 23, 2022

* [KP] Add elementwise add xpu, test=develop

* modify the File Permissions

* modify the copyright time

* modify code style

* modify code style

1a1a2ce8

update gather_nd trt converter ut (#39584) · 4130b640
Authored by baoachun on Feb 23, 2022
```
* update gather_nd trt converter ut

* update ut
```
4130b640

refactoring gather/masked_select/arg_max unittests for kunlun, *test=kunlun (#39711) · da492a13
Authored by TTerror on Feb 23, 2022

da492a13

fix 'is with a literal' warning (#39798) · 22abb6b3
Authored by Leo Chen on Feb 23, 2022
```
* fix 'is with a literal'

* fix typo
```
22abb6b3

fix activation ut typo xpu. test=kunlun (#39813) · 9880595a
Authored by houj04 on Feb 23, 2022

9880595a

[Eager] Support Eager mode for some model testcase (#39248) · abe232d8

Authored by wanghuancoder on Feb 23, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop

Co-authored-by: JiabinYang <360788950@qq.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>

abe232d8

[bf16] add bf16 kernel: elementwise_div (#39602) · ca4df333

Authored by zhangbo9674 on Feb 23, 2022

* add elementwise_div

* refine rocm

* refine code

* refine op register

* solve conflict

* refine unittest

* refine unittest precision

* add rocm

ca4df333

22 Feb, 2022 · 10 commits
- unset fluid in tensor (#35082) · 42eb56e2
  Authored by zhiboniu on Feb 22, 2022
  
  42eb56e2
- Auto Parallel support conditional block (#39612) · a08ee62a
  Authored by JZ-LIANG on Feb 22, 2022
```
* add subblock logic for context and partitioner

* partitioner support sub blocks

* revise typos

* fixed param init bug for while

* chmod 644

* add unitest

* mv forward parser

* update unitest

* update dist op ctx

* update dist op ctx

* fixed bug in dist op ctx

* fixed bug for recompute subblock
```
  a08ee62a
- disable some distribute test case when in CPU test env (#39801) · ae8c811a
  Authored by YUNSHEN XIE on Feb 22, 2022
  
  ae8c811a
- added round fwd onednn kernel (#39653) · 74c0bc1c
  Authored by jakpiase on Feb 22, 2022
  
  74c0bc1c
- Add the implementation of TCP Store (#39384) · b95cd3b7
  Authored by lilong12 on Feb 22, 2022
```
* add tcp_socket and tcp_store
```
  b95cd3b7
- delete gather_ut skip_case (#39657) · da43e065
  Authored by feng_shuai on Feb 22, 2022
```
* delete gather_ut skip_case

* add trt version limit
```
  da43e065
- Adapt to batch_norm_grad op and add align function in roi_align op for kunlun (#39685) · f33ae206
  Authored by Leo Guo on Feb 22, 2022
```
* Adapt to batch_norm_grad op and add align function in
roi_align op for kunlun, *test=kunlun

* Adapt to batch_norm, batch_norm_grad op api for kunlun, and add unit-tests of batch_norm, roi_align. *test=kunlun
```
  f33ae206
- fix bug in new the_one_ps (#39505) · d56a0a1b
  Authored by wangguanqun on Feb 22, 2022
```
* fix benchmark and communicator config

* fix bugs of the_one_ps

* multi program and fix bug in optimizer

* multi program in the_one_ps

* public commcontext
```
  d56a0a1b
- unset fluid in nn.others (#34935) · a710738e
  Authored by zhiboniu on Feb 22, 2022
  
  a710738e
- update unittests for nearest_interp_v2_op_xpu: 'sync' from gpu. test=kunlun (#39768) · e89bf25b
  Authored by houj04 on Feb 22, 2022
  
  e89bf25b

BaiXuePrincess / Paddle is in sync with the fork's source project