Commits · 86d92092e762ae8330800cd98a7f553b7fe3954e · BaiXuePrincess / Paddle

28 Nov, 2022 · 3 commits
- Use phi layernorm (#48276) · 86d92092
  Committed by MarDino on Nov 28, 2022
  
  86d92092
- fix expand as op (#48336) · 827fd5cd
  Committed by Thomas Young on Nov 28, 2022
```
* fix expand as op

* fix bug
```
  827fd5cd
- add square fp16 *test=kunlun (#48095) · 81d0a3cc
  Committed by haosicheng on Nov 28, 2022
  
  81d0a3cc
25 Nov, 2022 · 2 commits

Group norm fp16 support (#48222) · 34fd65cf
Committed by Wang Bojun on Nov 25, 2022
```
* group norm fp16 support
```
34fd65cf

add bfloat16 support for more ops (#48272) · aaf3a13e

Committed by sneaxiy on Nov 25, 2022

* add bfloat16 support for more ops

* fix ci compile

* fix windows compile error

* fix windows compile error

* fix rocm compile error

* fix ROCM compile error

aaf3a13e

24 Nov, 2022 · 8 commits

add exp_grad, hard_sigmoid and hard_sigmoid_grad for xpu, test=kunlun (#48307) · d2f87d96
Committed by zhangyikun02 on Nov 24, 2022

d2f87d96
add pad3d and pad3d_grad op for xpu, test=kunlun (#48306) · 22555e96
Committed by zhangyikun02 on Nov 24, 2022

22555e96

[Fluid clean] (#48105) · 43b92b63

Committed by wangxiaoning on Nov 24, 2022

* add index sample fp16 support

* remove fluid APIs in distributed_strategy.py and role_maker.py

* Revert "remove fluid APIs in distributed_strategy.py and role_maker.py"

This reverts commit 223bbee990d3bf69e252fc3c0f19e3873550a264.

* remove fluid APIs in distributed_strategy.py and role_maker.py

* remove index sample op changes

* remove fluid APIs under fleet.base

* remove fluid APIs under fleet.layers.mpu

* remove fluid APIs under fleet.meta_optimizers

* fix fluid error

* fix util_factory.py

* reset fluid.io.load_inference_model API

43b92b63

[PHI decoupling] simplify "convert_utils.h" in fluid (#48168) · de4310e6

Committed by huangjiyi on Nov 24, 2022

* rm dependence to "convert_utils.h" in some files

* fix bugs

* replace DataType2String with DataTypeToString

* replace framework::DataTypeSize with phi::SizeOf

* mv convert_function from fluid to phi and rm old map

* recommit with pre-commit

* repalce ProtoVarType with ProtoDataType and update comment.

* fix error about include "dnnl.hpp"

* revert add dep mkldnn to convert_utils in phi

* add mkldnn deps in convert_utils.h in phi

* move deps to convert_utils.h in phi

de4310e6

[PHI decoupling] remove "paddle/fluid/platform/enforce.h" in phi (#48049) · df23c7c3
Committed by PuQing on Nov 24, 2022

df23c7c3
[PHI] Migrate batch_norm_grad kernel (#48288) · 561b7278
Committed by Sławomir Siwek on Nov 24, 2022

561b7278
fix adam thread num (#48297) · dd27996c
Committed by sneaxiy on Nov 24, 2022

dd27996c

do not calc reduce_all in eager mode (#48199) · bcf75132

Committed by wanghuancoder on Nov 24, 2022

* do not calc reduce_all in eager mode

* refine python c cast list

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

bcf75132

23 Nov, 2022 · 5 commits
- [PHI decoupling] move im2col from fluid to phi (#48174) · 88cac16b
  Committed by huangjiyi on Nov 23, 2022
```
* decouple im2col from fluid

* move im2col to phi

* fix build error

* delete redundant comment
```
  88cac16b
- add masked_select_grad kernel (#48137) · db0ea0ce
  Committed by ykkk2333 on Nov 23, 2022
```
* add stat tool

* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun

* add masked_selected_grad kernel,test=kunlun
```
  db0ea0ce
- Add bfloat16 type support for abs op (#48205) · 29d75c14
  Committed by limingshu on Nov 23, 2022
```
* first commit

* 2nd commit
```
  29d75c14
- fix vector out of range error (#48255) · a606db67
  Committed by Leo Chen on Nov 23, 2022
  
  a606db67
- add warpctc kernel and change cast_v2 to cast for xpu, test=kunlun (#48134) · 25ffe9c2
  Committed by zhangyikun02 on Nov 23, 2022
  
  25ffe9c2
22 Nov, 2022 · 3 commits
- [PHI] Migrate elementwise_div + all elementwise grad kernels (#48210) · 78b30e97
  Committed by Piotr Paturej on Nov 22, 2022
```
* Migrate elementwise_div

* Migrate elementwise grad kernels
```
  78b30e97
- [PHI decoupling] move vol2col from fluid to phi (#48175) · aa36c6aa
  Committed by huangjiyi on Nov 22, 2022
```
* move vol2col from fluid to phi

* update copyright year
```
  aa36c6aa
- bf16 for interpolate, nhwc for bf16 (#48192) · e0dd4ee9
  Committed by Yuang Liu on Nov 22, 2022
  
  e0dd4ee9
21 Nov, 2022 · 7 commits

[PHI] Migrate mul_grad kernel (#48061) · 55f6fb3d

Committed by Sławomir Siwek on Nov 21, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

* mul_grad

55f6fb3d

refine reduce_all (#48133) · 56f15c43
Committed by wanghuancoder on Nov 21, 2022
```
* refine reduce_all
```
56f15c43
Fix wrong eigen header include in data_type.h (#48157) · 70589379
Committed by zyfncg on Nov 21, 2022
```
* Fix wrong eigen header include

* fix compile bug
```
70589379
[PHI decoupling] move "thread pool" from fluid to phi (#48075) · 3ca7328f
Committed by PuQing on Nov 21, 2022
```
* move threadpool

fix cmake

* fix make
```
3ca7328f
add adamw suppor xpu, test=kunlun (#48114) · 27e252d9
Committed by taixiurong on Nov 21, 2022

27e252d9
[PHI decoupling] move cross_entropy from fluid to phi (#48160) · 3501ff7d
Committed by huangjiyi on Nov 21, 2022
```
* move cross_entropy from fluid to phi

* replace mutable_data with Alloc

* use .template
```
3501ff7d
remove macros.h (#48069) · 02c51f3b
Committed by PuQing on Nov 21, 2022

02c51f3b

18 Nov, 2022 · 10 commits

[PHI] Migrate matmul_grad kernel (#48023) · 4ab18ada

Committed by Sławomir Siwek on Nov 18, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid depedency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

4ab18ada

[PHI] Migrate conv_transpose kernel (#48119) · 9aacb31b

Committed by Zuza Gawrysiak on Nov 18, 2022

* Migrate conv_transpose to phi

* Move handler to kernel

* kernel m

* Fix formatting

* handler

* remove fluid

* revert tcp_store

* tcp_store

* remove unused

* Fix declaration

* add dnn input

* Fix typo
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

9aacb31b

Optimize FusedBiasAddGelu Kernel (#47679) · b0e28540

Committed by MarDino on Nov 18, 2022

* Add quick gelu and fused bias add kernel

* fix annotation

* remove useless code

* add fast gelu option and set it in multi transformer op

* add flag to restrict if use fast gelu approximate

* fix flags conflict

* fix use tanh function instead

* add cudart version limit

* use phi fast tanh func

* fix comment

b0e28540

[PHI decoupling] move "gpu_device_function.h" from fluid to phi (#48097) · 27ee6e71

Committed by huangjiyi on Nov 18, 2022

* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi

* update copyright years

* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi

* fix rocm-complie bugs

27ee6e71

correct sync behavior for XPU distributed training (#47882) · aafa9820

Committed by james on Nov 18, 2022

* correct sync behavior for XPU distributed training

XPU support event mechanism similar to cuda event, so it is advisable to
use an event to sync compute/comm streams for performance. However this
mechanism is never fully tested, and inconsistent loss/ending_epochs are
reported. Therefore, this PR replaces event sync with stream waiting as
a temporary solution.

* remove compile warning

aafa9820

CUDNN v8 Implementation of Convolution Kernels (#47454) · 14a6e67b

Committed by Tian Zheng on Nov 18, 2022

* Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation

* Fix macro

* Add implementation for conv_kernel and conv_grad_kernel

* Modification after rebase onto latest develop

* Modify plan cache to comply with the API of phi::autotune

* Refactor to reduce duplicate code

* Review fix:
- move functions in  conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernelk.cu
- add const specifier for input tensor
- add logging when plans fail to execute
- move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h

* - move plan building outside of cache

* Fix ROCM build

14a6e67b

add bf16 for numel (#48121) · a7d306af
Committed by Yuang Liu on Nov 18, 2022

a7d306af
cast and gradient_accumulator support double for xpu, test=kunlun (#47800) · 982d5ff7
Committed by zhangyikun02 on Nov 18, 2022

982d5ff7
fix onednn prelu header (#48064) · 85598e31
Committed by Sylwester Fraczek on Nov 18, 2022

85598e31
rm "paddle/fluid/operators/amp/fp16_type_traits.h" in phi (#48051) · e4670d80
Committed by huangjiyi on Nov 18, 2022

e4670d80

17 Nov, 2022 · 2 commits
- [NPU] add _npu_identity op and api, test=develop (#47850) · 099c2302
  Committed by Qi Li on Nov 17, 2022
```
* [NPU] add _npu_identity op and api, test=develop

* fix doc

* address comments
```
  099c2302
- fix the thread number to ensure deterministic of embedding kernel (#48073) · 5329187d
  Committed by xiongkun on Nov 17, 2022
  
  5329187d

BaiXuePrincess / Paddle is in sync with the fork's source project
