Commits · 974676bc6ec41e222083729af55f34e4b2f20f2e · PaddlePaddle / Paddle

14 Jun, 2023 · 1 commit

support sharding stage1 (#54069) · 974676bc

Committed by pangengzheng on Jun 14, 2023

* support sharding stage1

* fix unittest

* format

* pass sharded sharding params_and_grads to inner_opt apply_optimize

* change sharding gradient allreduce to reduce

* support save state_dict adaptively and support sharding with mp

* fix sharding test

* test set_state_dict

* add more unit test

* fix global norm of mp case

* polish

* hack to calculate global norm in order to remove diff in calculating global norm values in HybridParallelClipGrad compared to dp

* remove print

13 Apr, 2023 · 1 commit

[enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h (#52651) · 5664ea26

Committed by HongyuJia on Apr 13, 2023

* [enforce.h Decouple logging.h] Delete glog/logging.h from enforce.h

* Add logging.h for profiler.cc

* Add logging.h for gloo_utils.h

* Add logging.h for addmm_kernel_impl.h

* Add logging.h for addmm_grad_kernel_impl.h

* Add logging.h for p_send_kernel.cu

* Add logging.h for determinant_grad_kernel_impl.h

* Add logging.h for p_recv_kernel.cu

* Add logging.h for elementwise_grad_base.h

* Add logging.h for transfer_layout_kernel.cc

* Add logging.h for eigvals_kernel.cc and index_select_impl.h

* Add logging.h for all files in kernel directory

* Add logging.h for xpu_info.cc

* Add logging.h for xpu

11 Apr, 2023 · 2 commits

delete remote_prefetch (#52748) · 3951c40d

Committed by zhangyuqin1998 on Apr 11, 2023
  
[BUG Fixes] adadelta lr support (#49732) · 23032590

Committed by wangzhen38 on Apr 11, 2023
  
04 Apr, 2023 · 1 commit

rename_bilinear_tensor_product (#52375) · 34069c46

Committed by zhangyuqin1998 on Apr 04, 2023

* rename_bilinear_tensor_product

* fix

30 Mar, 2023 · 2 commits

[Zero-Dim] Support broadcast_tensors input 0D and distribution API output 0D (#51721) · 2bd0a946

Committed by zhouweiwei2014 on Mar 30, 2023

Speedup worker (#51760) · 8ca86d72

Committed by pangengzheng on Mar 30, 2023

* support run haokanctr model in heterps-models

* polish setup.py

* polish JVM_LIB in env_dict

* align infer auc with DistPsArch pre-stable

* async and multi thread data feed

* rewrite dense tensor initialization

* async infer shape and reuse memory

27 Mar, 2023 · 2 commits

edit format of mea (#52147) · 13baef48

Committed by ZhangDY-6483 on Mar 27, 2023
  
fix_gcc12_error (#52007) · b2bd74f7

Committed by risemeup1 on Mar 27, 2023

* fix_gcc12_error

* patch on eigen3 for fixing gcc12 error

* Update multiary.cc

24 Mar, 2023 · 2 commits

[PHI]fix momentum dtype infer (#51353) · 648ec795

Committed by PuQing on Mar 24, 2023

* fix momentum dtype infer

* fix momentum datatype

* fix on cpu

* add momentum

Memory Efficient Attention (#51867) · e5ad3859

Committed by ZhangDY-6483 on Mar 24, 2023

* first version, notest

* return final rst, notest

* use infinity() instead of max

* ut structure

* start up of ut

* generate lse

* update

* add dependency

* reconstruct cmake

* move file

* add memory efficient attention and fix blasimpl

* update

* update cmake

* add namespace

* update cmake

* use .cu

* update for pad3d

* bug fix

* bug fix

* update

* bug fix

* update enforce

* add test case

* merge the lse pad

* fix kernel_fn of backward

* fix PADDLE_ENFORCE_EQ and phi_api

* fix PADDLE_ENFORCE

* fix PADDLE_ENFORCE

* rerun coverage

* fix memory efficient attention test

* rerun ci

* add cuda version condition

* add cuda version condition

* delete WIP test

* replace PADDLE_ENFORCE

* edit the namespace of datatype in multiple.cc

* rerun

* rerun

---------
Co-authored-by: liuyuang <liuyuang@baidu.com>

23 Mar, 2023 · 1 commit

[Prim] add meshgrid composite rule (#51061) · 53bb883d

Committed by chenjian on Mar 23, 2023

* add meshgrid composite rule

* add meshgrid composite rule

* update

* add into CMakeLists

* fix

* update

* update

* optimize code

* fix meshgrid op

* update test

22 Mar, 2023 · 1 commit

Add fused_linear_param_grad_add_kernel (#51805) · f59c5d8b

Committed by sneaxiy on Mar 22, 2023

* add fused_linear_param_grad_add_kernel

* fix compile error

* remove flag

* fix ci compile error

* fix ci compile error

* revert pylayer revision

* fix ci ut

* improve performance

21 Mar, 2023 · 1 commit

[PHI decoupling] Move DataType* from paddle::experimental to phi namespace (#51716) · 4638a62e

Committed by iSerendipity on Mar 21, 2023

* move DataType from paddle::experimental to phi

* convert namespace

* convert namespace

* convert namespace

* clarify namespace

* convert more datatype

* Revert "convert more datatype"

This reverts commit 083b462959e6a22d4d8767707b628b95b396642e.

* convert more in auto_code_generator

* fix conflicts for XPU

* fix namespace conflicts

* fix errors

* Revert "fix errors"

This reverts commit f9d9958b54ee32141112274c8a5c3c381ab0f876.

* fix errors

* fix formatting

09 Mar, 2023 · 1 commit

Add output defs for sgd kernel (#51332) · c0f84b8f

Committed by iSerendipity on Mar 09, 2023

* Add output defs for sgd kernel

* add datatype infer for sgd

* add infer logic

08 Mar, 2023 · 1 commit

Add multi_precision param for adamax op (#49705) · 151ec311

Committed by niuliling123 on Mar 08, 2023
  
06 Mar, 2023 · 1 commit

Add multiprecision for adadelta op (#50131) · a8a2b7f4

Committed by niuliling123 on Mar 06, 2023
  
03 Mar, 2023 · 1 commit

Add multi_precision for adagrad op (#50078) · 4779c2c1

Committed by niuliling123 on Mar 03, 2023
  
01 Mar, 2023 · 1 commit

Add multiprecision for rms op (#50132) · 48060b2e

Committed by niuliling123 on Mar 01, 2023
  
17 Feb, 2023 · 1 commit

Rename MultiTensorAdam To FusedAdam (#50449) · e6af9bd2

Committed by yuehuayingxueluo on Feb 17, 2023

* rename multi_tensor_adam to fused_adam

* fix some bugs

* fix CI coverage

* rename test_fused_adam.py

* fix some bug

* add test_fused_adam_op.py

* fix some bugs

* fix fused_adam_op.cc

* fix CI bugs

* fix CI bug

* fix CI bug

16 Feb, 2023 · 1 commit

Add logspace yaml (#49194) · c284d42a

Committed by Chen Weihang on Feb 16, 2023

* add logspace yaml

* update by comments

* resolve test framework conflict

09 Feb, 2023 · 1 commit

Add MultiTensorAdam OP (#49220) · 10654c77

Committed by yuehuayingxueluo on Feb 09, 2023

* add multi_tensor_adam

* update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py

* fix adam.py optimizer.py

* fix adamw.py

* fix test_multi_tensor_adam.py

* fix CI bug

* fix CI coverage

* fix ci bug

* fix betapow

* fix some bugs

* fix test_adamw_op.py

* fix CI coverage

* fix multi_tensor_adam_kernel.cc

* fix CI bug

* fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py

* fix code style

* update C++ parts

* remove python parts modification temporarily

* add C++ ut

* update betapow copy code logic

* fix ci ut

* fix windows ci

* fix coverage ci

* improve coverage rate

---------
Co-authored-by: sneaxiy <sneaxiy@126.com>

31 Jan, 2023 · 2 commits

Fix null pointer of case15: paddle.broadcast_tensors (#49980) · 78ec942b

Committed by RedContritio on Jan 31, 2023

* fix incorrect output shape of broadcast

* add unittest

support 0d tensor for interpolate (#49929) · 2e156ac8

Committed by xiaoting on Jan 31, 2023

* support 0d tensor for interpolate

* support 0d tensor for interpolate

* add xpu unittest for interp

* update unittest for interpolate

* fix coverage

* fix code style

* fix for coverage

* fix coverage

16 Jan, 2023 · 1 commit

add add_n for the 0d tensor (#49854) · 65b0181e

Committed by wawltor on Jan 16, 2023

28 Dec, 2022 · 1 commit

fix bugs of paddle.multiplex API (#49368) · f6f0c562

Committed by Haohongxiang on Dec 28, 2022
  
26 Dec, 2022 · 1 commit

[0d Tensor] update scatter for zero-dimension tensor (#49279) · 73aa98cf

Committed by Roc on Dec 26, 2022

* revert concat and change concat to stack

* let stack kernel support int8, uint8 and bool type

23 Dec, 2022 · 1 commit

add rnn-t loss and api (#49199) · c088f9ec

Committed by Hui Zhang on Dec 23, 2022

* add warp transducer code

22 Dec, 2022 · 1 commit

[Paddle Inference] Add moe phi kernel (#48703) · def2a87f

Committed by xiaoxiaohehe001 on Dec 22, 2022
  
09 Dec, 2022 · 1 commit

move share_buffer kernel to phi (#48858) · c2e77ba3

Committed by Leo Chen on Dec 09, 2022

* move share_buffer kernel to phi

* fix ut

* add source file

* fix window links

05 Dec, 2022 · 1 commit

[0D Tensor]support 0d tensor for dist.scatter and dist.broadcast (#48638) · 22ec915c

Committed by Roc on Dec 05, 2022
  
17 Nov, 2022 · 1 commit

[PHI]Standardise some C++ API (Part5) (#47860) · f3650201

Committed by YuanRisheng on Nov 17, 2022

* standard api

* fix xpu bugs

11 Nov, 2022 · 1 commit

[Zero-Dim] fix batch_norm op infermeta bug (#47858) · 18549417

Committed by zhouweiwei2014 on Nov 11, 2022
  
02 Nov, 2022 · 1 commit

[PHI]Standardise some C++ API (Part3) (#47532) · fe8c6796

Committed by YuanRisheng on Nov 02, 2022

* Standardise batch norm

* standardize conv3d and depthwise_conv2d

* fix ci bugs

01 Nov, 2022 · 1 commit

[PHI]Standardise some C++ API (Part2) (#47510) · 399047d7

Committed by YuanRisheng on Nov 01, 2022

* standard_api

* add hardtanh

31 Oct, 2022 · 1 commit

[PHI]Standardise some C++ API (#47385) · 60e0c506

Committed by YuanRisheng on Oct 31, 2022

* standard api

* fix ci bugs

* fix ci bugs

* fix ce bugs

17 Oct, 2022 · 1 commit

[PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398

Committed by YuanRisheng on Oct 17, 2022

* namespace modify

* update by comment

12 Oct, 2022 · 1 commit

[Zero-Dim] support input 0D Tensor for some unary api (#45992) · 05c2b9ba

Committed by zhouweiwei2014 on Oct 12, 2022

* [Zero-Dim] support input 0D Tensor for unary api

* fix CI

10 Oct, 2022 · 1 commit

[PHI]Add RNN yaml (#46812) · ab60fd8b

Committed by YuanRisheng on Oct 10, 2022

* add yaml entry for rnn and rnn_grad, move infershape function for rnn_grad to phi infer meta

* WIP: move rnn kernel to phi

* Change the code generation to avoid converting from initializer list to tuple of heterogeneous types.
This is only triggered when an api has intermediate outputs, and the outputs are of heterogeneous types.

* fix the bug that when none in a vector of tensors requires gradient, the conversion from InferShapeContext to InferMetaContext (a.k.a. BuildInferMetaContext) produces erroneous results.

* fix ci bugs

* fix ci bugs

* fix ci bugs

* modify code according to comments

Co-authored-by: chenfeiyu <chenfeiyu@baidu.com>

09 Oct, 2022 · 1 commit

[Sparse] Add a batch_norm kernel (#46359) · 888223b7

Committed by zhangkaihuo on Oct 09, 2022

PaddlePaddle / Paddle · synced successfully about 1 year ago
