Commits · 6833ecfe94272cdf97bfaa667d100d3f6318ba49 · PaddlePaddle / Paddle

14 Sep, 2022 · 2 commits
- Fix DistributedFusedLAMB NaN problem (#46011) · 6833ecfe
  Committed by sneaxiy on Sep 14, 2022
```
* fix distributed_fused_lamb nan

* remove CUDA_ASSERT
```
- [MLU] add mergedAdam kernel. (#45965) · bf6ec262
  Committed by Chenxiao Niu on Sep 14, 2022
06 Sep, 2022 · 2 commits
- migrate deformable_conv and merged momentum kernels to phi, test=kunlun (#45691) · 7f3c7aeb
  Committed by ykkk2333 on Sep 06, 2022
- [XPU] rmsprop to phi. (#45734) · 1137677a
  Committed by houj04 on Sep 06, 2022
02 Sep, 2022 · 2 commits
- migrate shaple sgd, split,sign xpu kernels to phi, test=kunlun (#45607) · 3b9b4c34
  Committed by ykkk2333 on Sep 02, 2022
- [XPU]Migrate Adam XPU kernel into Phi (#45572) · cbabbe2e
  Committed by Aurelius84 on Sep 02, 2022
```
* [XPU]Migrate Adam XPU kernel into Phi

* test=kunlun
```
01 Sep, 2022 · 2 commits
- xpu-paddlepaddle-37 [Task] Migrate lamb to phi (#45520) · 1a0ef45e
  Committed by taixiurong on Sep 01, 2022
```
test=kunlun
```
- [XPU]Migrate adamw XPU kernel into Phi (#45609) · f5a041e6
  Committed by Aurelius84 on Sep 01, 2022
```
* [XPU]Migrate adamw XPU kernel into Phi

* test=kunlun

* test=kunlun
```
31 Aug, 2022 · 1 commit
- Move XPU momentum to phi (#45565) · d7807806
  Committed by WangZhen on Aug 31, 2022
```
* Move XPU momentum to phi, test=kunlun

* Fix mu type, test=kunlun
```
24 Aug, 2022 · 1 commit
- Support fp16 of adam operator in xpu environment (#45292) · a012d426
  Committed by mengqingchun02 on Aug 24, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support fp16 of adam operator in xpu environment. test=kunlun

* support fp16 of adam operator in xpu environment. test=kunlun

* support fp16 of adam operator in xpu environment. test=kunlun

19 Aug, 2022 · 1 commit
- [XPU] add merged_momentum unittest and change momentum (#45241) · e0f1c9f2
  Committed by dongfangshenzhu on Aug 19, 2022

* add merged_momentum *test=kunlun

* add merged_momentum *test=kunlun

* add fp16 to merged_momentum,*test=kunlun

* change dist_model.cc

* add merged_momentum unittest and  change momentum,test=kunlun

* add merged_momentum unittest and  change momentum,test=kunlun

* add merged_momentum unittest and  change momentum,test=kunlun

* add merged_momentum unittest and  change momentum,test=kunlun

17 Aug, 2022 · 1 commit
- [MLU] fix copy error (#45194) · 75690584
  Committed by fwenguang on Aug 17, 2022
08 Aug, 2022 · 1 commit
- move lamb_op to phi (#44899) · 4a7aa7c3
  Committed by Thomas Young on Aug 08, 2022
04 Aug, 2022 · 2 commits
- [XPU] add merged_momentum including fp32 and fp16 (#44824) · 4922376c
  Committed by dongfangshenzhu on Aug 04, 2022
```
* add merged_momentum *test=kunlun

* add merged_momentum *test=kunlun

* add fp16 to merged_momentum,*test=kunlun
```
- opt allreduce (#44843) · 1f9e2742
  Committed by sneaxiy on Aug 04, 2022
03 Aug, 2022 · 1 commit
- Add use_hierarchical_allreduce for DistributedFusedLAMB (#44821) · c770053c
  Committed by sneaxiy on Aug 03, 2022
```
* add use_hierarchical_allreduce

* support hierarchical allreduce for more cases
```
01 Aug, 2022 · 1 commit
- unify gpu context (#44740) · 86763023
  Committed by Leo Chen on Aug 01, 2022

* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes

29 Jul, 2022 · 1 commit
- add some fp16 op for kunlun resnet50 model (#44672) · fecbc958
  Committed by QingshuChen on Jul 29, 2022
```
* add some fp16 op for kunlun resnet50 model
*test=kunlun

* tmp
*test=kunlun
```
27 Jul, 2022 · 1 commit
- [DCU] Fix NAN problem when training BERT on DUC platform (#44643) · 28aa0c61
  Committed by Yuang Liu on Jul 27, 2022
25 Jul, 2022 · 1 commit
- [Phi] Migrate squared_l2_norm_op to phi (#44492) · 3e170163
  Committed by lyq on Jul 25, 2022
22 Jul, 2022 · 1 commit
- add xpu lars_momentum/pow2_decay (#44448) · 8ccbb863
  Committed by QingshuChen on Jul 22, 2022
```
*test=kunlun
```
14 Jul, 2022 · 2 commits
- [operator migration] Migrate infer shape for merged momentum (#44338) · 246ac976
  Committed by Yuang Liu on Jul 14, 2022
- [operator migration] Migrate merged momentum cpu/gpu kernels (#44300) · d15b490a
  Committed by Yuang Liu on Jul 14, 2022
13 Jul, 2022 · 1 commit
- fix cpu lars_momentum bug & add xpu grad_add/log_softmax/log_softmax_… (#44260) · d6d60cbc
  Committed by QingshuChen on Jul 13, 2022
```
* fix cpu lars_momentum bug & add xpu grad_add/log_softmax/log_softmax_grad
*test=kunlun

* minor
*test=kunlun
```
12 Jul, 2022 · 1 commit
- [Phi] Migrate merged_adam_op into Phi (#44184) · d55ee95f
  Committed by zhangbo9674 on Jul 12, 2022
```
* remov merged_adam_op to phi

* refine code
```
11 Jul, 2022 · 1 commit
- rmsprop for xpu. test=kunlun (#44175) · 3ca713ee
  Committed by houj04 on Jul 11, 2022
```
* rmsprop for xpu. test=kunlun

* minor fix (follow comments). test=kunlun
```
02 Jul, 2022 · 1 commit
- unify cpu context, part2 (#44012) · 755438a7
  Committed by Leo Chen on Jul 02, 2022

* fix init()

* delete test_device_context

* replace CPUDeviceContext with CPUContext

* fix test_scalar

* remove dot_op.cc

* fix compile

26 Jun, 2022 · 1 commit
- format all files in fluid using new config (#43776) · 576236a0
  Committed by Sing_chan on Jun 26, 2022
13 Jun, 2022 · 1 commit
- [MLU]add lookup_table_v2 op and fix amp feature of bert with mlu device (#43366) · 67bd5d9c
  Committed by qipengh on Jun 13, 2022
10 Jun, 2022 · 1 commit
- fix nullptr (#43370) · acfd7129
  Committed by sneaxiy on Jun 10, 2022
09 Jun, 2022 · 1 commit
- Add nproc_per_node for DistributedFusedLamb (#43295) · 6678def9
  Committed by sneaxiy on Jun 09, 2022

* add nproc_per_node for DistributedFusedLamb

* fix nproc_per_node communicator bug

* fix ring_id = 1 init bug

* fix ci

* fix test_parallel_executor_mnist.py

07 Jun, 2022 · 2 commits
- Optimized the performance of activation op in XPU2 (#43187) · d5afc1ba
  Committed by shixingbo on Jun 07, 2022
- Add use_master_acc_grad for DistributedFusedLamb (#43266) · 601d7a35
  Committed by sneaxiy on Jun 07, 2022
```
* add use_master_acc_grad

* add ut
```
05 Jun, 2022 · 1 commit
- 【code format check upgrade】 step2：clang-format (#42840) · a3730dc8
  Committed by Sing_chan on Jun 05, 2022
04 Jun, 2022 · 1 commit
- 【code format check upgrade】 step2：cmake-format (#43057) · 92568edb
  Committed by Sing_chan on Jun 04, 2022
27 May, 2022 · 1 commit
- [Phi] Change optional tensor from `optional<const Tensor&>` to `optional<Tensor>` (#42939) · 6d78524c
  Committed by zyfncg on May 27, 2022

* refactor the optional tensor

* remove optiona<MetaTensor> in InferMeta

* fix bug

* fix optional<vector<Tensor>>

* fix bug

* fix rmsprop

* fix amp of eager_gen

* polish code

* fix deleted code

* fix merge conflict

* polish code

* remove is_nullopt_

* fix merge conflict

* fix merge conflict

16 May, 2022 · 1 commit
- Add the new XDNN implementation. test=kunlun (#42683) · 87667c66
  Committed by wbn on May 16, 2022

* Add the new XDNN implementation. test=kunlun

* Add the new XDNN implementation. test=kunlun

* Modify the code based on review, test=kunlun

11 May, 2022 · 1 commit
- remove old XDNN implementation test=kunlun (#42404) · 7b828f71
  Committed by taixiurong on May 11, 2022
10 May, 2022 · 1 commit
- [MLU]add adam, adamw op of mlu device (#42557) · cc077693
  Committed by qipengh on May 10, 2022
29 Apr, 2022 · 1 commit
- [OP]Fix adamw not registered into AllKernels (#42391) · 683f152a
  Committed by Aurelius84 on Apr 29, 2022

PaddlePaddle / Paddle · synced successfully about 1 year ago
