Commits · e5ed5257083b92b018330812c33c746bae26fb41 · PaddlePaddle / Paddle

17 Nov, 2022 3 commits

Support bfloat16 for adamw and adam optimizer. Fit the lr for pure bf16... · e5ed5257

Committed by Yuang Liu on Nov 17, 2022

Support bfloat16 for adamw and adam optimizer. Fit the lr for pure bf16 training with tensor fusion. (#48041)

* add bfloat16 for adamw

* set lr not to bfloat16 for pure bf16 training

* update the logic

* update the adamw optimizer

* support bfloat for adam

e5ed5257

Add vectorized bfloat16 atomicAdd (#48056) · ccbd03d5

Committed by sneaxiy on Nov 17, 2022

* add vectorized bfloat16 atomicAdd

* fix compile error

* fix compile error again

* fix V100 compile error

* fix V100 compile again

ccbd03d5

generate static graph code for some op (#48036) · 7cc0d171

Committed by zyfncg on Nov 17, 2022

7cc0d171

16 Nov, 2022 14 commits
- rm "paddle/fluid/framework/gpu_utils.h" in phi (#48020) · 29a0987a
  Committed by huangjiyi on Nov 16, 2022
  
  29a0987a
- [NPU] update npu prop, test=develop (#47859) · ad8847aa
  Committed by Qi Li on Nov 16, 2022
```
* [NPU] update npu prop, test=develop

* remove ddim.h

* remove diff

* update storage prop, test=develop
```
  ad8847aa
- [Paddle Inference] Add fill_any_like trt converter. (#47974) · d6be9000
  Committed by xiaoxiaohehe001 on Nov 16, 2022
```
* add_fill_any_like

* add_fill_any_like
```
  d6be9000
- elementwise_floordiv (#47944) · b4b78060
  Committed by wenbin on Nov 16, 2022
```
* elementwise_op

* add teller

* modify ut

* comments

* modify ut

* return

* modify
```
  b4b78060
- trt memory set change from setMaxWorkspaceSize to setMemoryPoolLimit since trt 8.3+ (#47795) · 9cf3aa61
  Committed by Zhang Jun on Nov 16, 2022
  
  9cf3aa61
- [inference][trt] update trt hardswish plugin to layer (#47745) · 6c54e0e8
  Committed by Zhang Jun on Nov 16, 2022
  
  6c54e0e8
- [Opt depthwise_conv2d] Simplify depthwise_conv2d use_cudnn attribute (#48010) · 7c304580
  Committed by HongyuJia on Nov 16, 2022
```
* simplify depthwise_conv2d phi kernel selection

* fix depthwise_conv2d
```
  7c304580
- Add bf16 data type support to oneDNN bilinear_interp kernel (#46770) · 8e6315e4
  Committed by Piotr Paturej on Nov 16, 2022
```
* Enable bf16 in oneDNN bilinear_interp kernel

* Fix bilinear_interp_v2 not enabled in models

* Remove unnecessary checks
```
  8e6315e4
- Fix paddle rec, kim, dsin models' bugs (#47792) · e23dfed9
  Committed by ykkk2333 on Nov 16, 2022
```
* add stat tool

* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun

* embedding and embedding_grad add int32 input, test=kunlun
```
  e23dfed9
- remove avx check (#48003) · a762d68e
  Committed by hong on Nov 16, 2022
```
* remove avx check

* fix bug;
```
  a762d68e
- increase the level of some log (#47990) · 2f8901cb
  Committed by Leo Chen on Nov 16, 2022
  
  2f8901cb
- move "gpu_primitives.h" to phi (#48015) · 9adca1e7
  Committed by Wang Xin on Nov 16, 2022
  
  9adca1e7
- Update `ProcessGroupCustom` for `sync_op` compatibility (#47976) · e4ebf383
  Committed by Wen Sun on Nov 16, 2022
```
* refactor: update pg custom

* fix: use new api in ut

* fix: typo

* revert: recover legacy apis

* fix: add GetDeviceContext
```
  e4ebf383
- feat(ipu): add paddle inference support for model_runtime. (#47364) · 39c85064
  Committed by czr-gc on Nov 16, 2022
  
  39c85064
15 Nov, 2022 12 commits
- add gather dtype err msg (#48002) · 5859d0a6
  Committed by sneaxiy on Nov 15, 2022
  
  5859d0a6
- [Opt Error Message] Opt error message when selecting kernels under phi (#47970) · fd550c1b
  Committed by HongyuJia on Nov 15, 2022
```
* opt error message when selecting kernels under phi

* fix for loop

* polish error message

* polish error message, split into 3 error condition

* polish error message
```
  fd550c1b
- fix onednn bugs, test=document_fix (#48013) · 21d4fa02
  Committed by YuanRisheng on Nov 15, 2022
  
  21d4fa02
- Added optimization pass for oneDNN layernorm kernel (#47782) · 519e7426
  Committed by jakpiase on Nov 15, 2022
```
* optimization for ln

* fix

* added output to gpd

* added formatting

* fix
```
  519e7426
- [Zero-Dim] Make auto parallel judge dim more strict (#47961) · 626d7bcb
  Committed by zhouweiwei2014 on Nov 15, 2022
  
  626d7bcb
- Update for scatter support fake 2d index (#47946) · e65bac28
  Committed by Yuang Liu on Nov 15, 2022
  
  e65bac28
- [convert_to_mixed_precision] fallback to fp32 when encounter circle (#47902) · a00aebe1
  Committed by Wilber on Nov 15, 2022
  
  a00aebe1
- [Zero-Dim] support input 0D Tensor for xpu kernel, test=kunlun (#47849) · d4d3d7ed
  Committed by zhouweiwei2014 on Nov 15, 2022
  
  d4d3d7ed
- mkldnn directory cleanup (#47779) · 8a339d24
  Committed by Sławomir Siwek on Nov 15, 2022
```
* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency
```
  8a339d24
- [PHI decoupling] remove "paddle/fluid/platform/complex.h" in phi (#47926) · aa08b769
  Committed by huangjiyi on Nov 15, 2022
```
* rm "paddle/fluid/platform/complex.h" in phi

* fix codestyle with pre-commit
```
  aa08b769
- remove 'paddle/fluid/operators/conv_op.h' from phi (#47914) · f7bf2930
  Committed by Wang Xin on Nov 15, 2022
  
  f7bf2930
- [PHI decoupling] remove dependency on "paddle/fluid/operators/elementwise/xxx.h" in phi (#47870) · 04c29558
  Committed by huangjiyi on Nov 15, 2022
```
* rm "paddle/fluid/operators/elementwise/xxx.h" in phi

* fix bugs

* add LaunchElementwiseCudaKernel in phi

* Revert "add LaunchElementwiseCudaKernel in phi"

This reverts commit 588f45bbdad2372ec7bff0c567a29bff675d22e1.

* rm indirect dependence to "elementwise_op_impl.cu.h"

rm indirect dependence to "elementwise_op_impl.cu.h"

Revert "add LaunchElementwiseCudaKernel in phi"

This reverts commit 588f45bbdad2372ec7bff0c567a29bff675d22e1.

add LaunchElementwiseCudaKernel in phi

fix bugs

* rm LaunchSameDimsElementwiseCudaKernel and LaunchElementwiseCudaKernel in phi
```
  04c29558
14 Nov, 2022 11 commits
- Refactor collective communication send_partial, recv_partial, all_gather_partial C++ API (#47863) · 25e63dca
  Committed by Wen Sun on Nov 14, 2022
```
* refactor: simplify send, recv interfaces

* refactor: rm send_partial, recv_partial, all_gather_partial
```
  25e63dca
- [Paddle Inference] Add where trt converter (#47820) · dac0f7dd
  Committed by xiaoxiaohehe001 on Nov 14, 2022
  
  dac0f7dd
- Remove place for process group (#47857) · 2d383b81
  Committed by LiYuRio on Nov 14, 2022
  
  2d383b81
- [Zero-Dim] support input 0D Tensor as scalar attribute for some api (#47689) · e0be4b94
  Committed by zhouweiwei2014 on Nov 14, 2022
```
* [Zero-Dim] support input 0D Tensor as scalar attribute for some api

* fix doc
```
  e0be4b94
- add cos double and triple grad operator (#47796) · 1a145aab
  Committed by cyber-pioneer on Nov 14, 2022
  
  1a145aab
- Modified mem_desc() to return reference to Tensor::memory::desc to (#47844) · 2182a4f9
  Committed by Jacek Czaja on Nov 14, 2022
```
avoid copying
```
  2182a4f9
- remove heter and hccl (#47918) · 9191e743
  Committed by LiYuRio on Nov 14, 2022
  
  9191e743
- Do not release memory cache after build_op_func_list in interpretercore (#47910) · 8347354d
  Committed by Ruibiao Chen on Nov 14, 2022
  
  8347354d
- Fix HOSTDEVICE redefinition during XPU KP compilation, test=kunlun (#47885) · 81e16a85
  Committed by niuliling123 on Nov 14, 2022
  
  81e16a85
- add lite opencl support api (#47112) · 798ab3f9
  Committed by engineer1109 on Nov 14, 2022
  
  798ab3f9
- fix squueze_transpose (#47911) · f50de679
  Committed by yeliang2258 on Nov 14, 2022
  
  f50de679

PaddlePaddle / Paddle synced successfully about 1 year ago