Commits · 70589379211de9b1b63681c55fa771776974a848 · BaiXuePrincess / Paddle

21 Nov, 2022 5 commits
- Fix wrong eigen header include in data_type.h (#48157) · 70589379
  Committed by zyfncg on Nov 21, 2022
```
* Fix wrong eigen header include

* fix compile bug
```
  70589379
- [PHI decoupling] move "thread pool" from fluid to phi (#48075) · 3ca7328f
  Committed by PuQing on Nov 21, 2022
```
* move threadpool

fix cmake

* fix make
```
  3ca7328f
- add adamw support for xpu, test=kunlun (#48114) · 27e252d9
  Committed by taixiurong on Nov 21, 2022
  
  27e252d9
- [PHI decoupling] move cross_entropy from fluid to phi (#48160) · 3501ff7d
  Committed by huangjiyi on Nov 21, 2022
```
* move cross_entropy from fluid to phi

* replace mutable_data with Alloc

* use .template
```
  3501ff7d
- remove macros.h (#48069) · 02c51f3b
  Committed by PuQing on Nov 21, 2022
  
  02c51f3b
18 Nov, 2022 10 commits

[PHI] Migrate matmul_grad kernel (#48023) · 4ab18ada

Committed by Sławomir Siwek on Nov 18, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

4ab18ada

[PHI] Migrate conv_transpose kernel (#48119) · 9aacb31b

Committed by Zuza Gawrysiak on Nov 18, 2022

* Migrate conv_transpose to phi

* Move handler to kernel

* kernel m

* Fix formatting

* handler

* remove fluid

* revert tcp_store

* tcp_store

* remove unused

* Fix declaration

* add dnn input

* Fix typo
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

9aacb31b

Optimize FusedBiasAddGelu Kernel (#47679) · b0e28540

Committed by MarDino on Nov 18, 2022

* Add quick gelu and fused bias add kernel

* fix annotation

* remove useless code

* add fast gelu option and set it in multi transformer op

* add flag to restrict if use fast gelu approximate

* fix flags conflict

* fix use tanh function instead

* add cudart version limit

* use phi fast tanh func

* fix comment

b0e28540
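
The commit above mentions a fast GELU option built on a tanh function. As a rough illustration only, the sketch below compares the exact erf-based GELU with the widely used tanh approximation and applies it after a bias add. It is not the Paddle CUDA kernel: the function names (`gelu_exact`, `gelu_tanh`, `bias_add_gelu`) and the `approximate` switch are hypothetical, and whether #47679 uses exactly this formula is an assumption.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Common tanh approximation of GELU (assumed, not taken from the PR)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def bias_add_gelu(xs, bias, approximate=True):
    """Fused-style helper: add the bias, then apply GELU element by element.

    `approximate` is a hypothetical flag that selects the tanh variant.
    """
    act = gelu_tanh if approximate else gelu_exact
    return [act(x + b) for x, b in zip(xs, bias)]


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, gelu_exact(x), gelu_tanh(x))
```

The two variants agree closely for typical activations, which is why an approximate path can be offered behind a flag while the exact path stays the default.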

[PHI decoupling] move "gpu_device_function.h" from fluid to phi (#48097) · 27ee6e71

Committed by huangjiyi on Nov 18, 2022

* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi

* update copyright years

* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi

* fix rocm-compile bugs

27ee6e71

correct sync behavior for XPU distributed training (#47882) · aafa9820

Committed by james on Nov 18, 2022

* correct sync behavior for XPU distributed training

XPU supports an event mechanism similar to CUDA events, so it is advisable to
use an event to sync compute/comm streams for performance. However, this
mechanism has never been fully tested, and inconsistent loss/ending_epochs have
been reported. Therefore, this PR replaces event sync with stream waiting as
a temporary solution.

* remove compile warning

aafa9820

CUDNN v8 Implementation of Convolution Kernels (#47454) · 14a6e67b

Committed by Tian Zheng on Nov 18, 2022

* Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation

* Fix macro

* Add implementation for conv_kernel and conv_grad_kernel

* Modification after rebase onto latest develop

* Modify plan cache to comply with the API of phi::autotune

* Refactor to reduce duplicate code

* Review fix:
- move functions in conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernel.cu
- add const specifier for input tensor
- add logging when plans fail to execute
- move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h

* move plan building outside of cache

* Fix ROCM build

14a6e67b

add bf16 for numel (#48121) · a7d306af
Committed by Yuang Liu on Nov 18, 2022

a7d306af

cast and gradient_accumulator support double for xpu, test=kunlun (#47800) · 982d5ff7
Committed by zhangyikun02 on Nov 18, 2022

982d5ff7

fix onednn prelu header (#48064) · 85598e31
Committed by Sylwester Fraczek on Nov 18, 2022

85598e31

rm "paddle/fluid/operators/amp/fp16_type_traits.h" in phi (#48051) · e4670d80
Committed by huangjiyi on Nov 18, 2022

e4670d80

17 Nov, 2022 10 commits
- [NPU] add _npu_identity op and api, test=develop (#47850) · 099c2302
  Committed by Qi Li on Nov 17, 2022
```
* [NPU] add _npu_identity op and api, test=develop

* fix doc

* address comments
```
  099c2302
- fix the thread number to ensure determinism of the embedding kernel (#48073) · 5329187d
  Committed by xiongkun on Nov 17, 2022
  
  5329187d
- rm "paddle/fluid/framework/convert_utils.h" in phi (#48001) · 2f34fc7a
  Committed by huangjiyi on Nov 17, 2022
  
  2f34fc7a
- [PHI]Standardise some C++ API (Part5) (#47860) · f3650201
  Committed by YuanRisheng on Nov 17, 2022
```
* standard api

* fix xpu bugs
```
  f3650201
- xpu-paddlepaddle-41 [Task] ffn and attention test=kunlun (#46658) · 071708fa
  Committed by taixiurong on Nov 17, 2022
  
  071708fa
- move "function_traits.h" from fluid to phi (#48065) · b7841a2b
  Committed by Wang Xin on Nov 17, 2022
  
  b7841a2b
- Implement a common dimension simplifier. (#47981) · bf6af816
  Committed by Yiqun Liu on Nov 17, 2022
```
* Implement a common dims simplifier.

* Fix the include position error.

* Reduce the cpu overhead of broadcast computing.
```
  bf6af816
- rm "paddle/phi/kernels/gpu/batch_norm_utils.h" in phi (#48057) · b7e120d2
  Committed by huangjiyi on Nov 17, 2022
  
  b7e120d2
- [PHI decoupling] move "paddle/fluid/operators/math.h" to phi (#48062) · f62bd3b4
  Committed by huangjiyi on Nov 17, 2022
```
* rm "paddle/fluid/operators/math.h" in phi

* rm "paddle/fluid/operators/math.h" in fluid
```
  f62bd3b4
- Support bfloat16 for adamw and adam optimizer. Fit the lr for pure bf16... · e5ed5257
  Committed by Yuang Liu on Nov 17, 2022
```
Support bfloat16 for adamw and adam optimizer. Fit the lr for pure bf16 training with tensor fusion. (#48041)

* add bfloat16 for adamw

* set lr not to bfloat16 for pure bf16 training

* update the logic

* update the adamw optimizer

* support bfloat for adam
```
  e5ed5257
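
The "common dimension simplifier" commit (#47981) in the list above talks about reducing the CPU overhead of broadcast computing. One plausible reading is that adjacent axes with the same broadcast behaviour get merged so that kernels see fewer dimensions. The sketch below illustrates that general idea only; the function name `simplify_broadcast_dims` and its merging rule are assumptions, not Paddle's implementation, and the inputs are assumed to be broadcast-compatible.

```python
def simplify_broadcast_dims(shapes):
    """Merge adjacent axes that share the same broadcast pattern.

    `shapes` is a list of broadcast-compatible shapes. Returns one
    simplified shape per input, all with the same reduced rank.
    Hypothetical helper for illustration only.
    """
    rank = max(len(s) for s in shapes)
    # Left-pad every shape with 1s so all inputs have the same rank.
    padded = [[1] * (rank - len(s)) + list(s) for s in shapes]

    merged = []          # one column of per-input sizes per kept axis
    prev_pattern = None  # which inputs were broadcast on the last kept axis
    for axis in range(rank):
        col = [p[axis] for p in padded]
        if max(col) == 1:
            continue  # size-1 for every input: the axis carries no data
        pattern = tuple(c == 1 for c in col)
        if merged and pattern == prev_pattern:
            # Same inputs broadcast on both axes, so the axes can be fused.
            merged[-1] = [a * b for a, b in zip(merged[-1], col)]
        else:
            merged.append(col)
            prev_pattern = pattern

    if not merged:  # every axis had size 1
        merged.append([1] * len(shapes))
    return [[col[i] for col in merged] for i in range(len(shapes))]


if __name__ == "__main__":
    print(simplify_broadcast_dims([(2, 3, 4, 5), (1, 1, 4, 5)]))
```

In this sketch a rank-4 broadcast between `(2, 3, 4, 5)` and `(1, 1, 4, 5)` collapses to a rank-2 problem, which means less index arithmetic per element on the host and in the kernel.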
16 Nov, 2022 4 commits
- rm "paddle/fluid/framework/gpu_utils.h" in phi (#48020) · 29a0987a
  Committed by huangjiyi on Nov 16, 2022
  
  29a0987a
- Add bf16 data type support to oneDNN bilinear_interp kernel (#46770) · 8e6315e4
  Committed by Piotr Paturej on Nov 16, 2022
```
* Enable bf16 in oneDNN bilinear_interp kernel

* Fix bilinear_interp_v2 not enabled in models

* Remove unnecessary checks
```
  8e6315e4
- Fix paddle rec, kim, dsin models' bugs (#47792) · e23dfed9
  Committed by ykkk2333 on Nov 16, 2022
```
* add stat tool

* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun

* embedding and embedding_grad add int32 input, test=kunlun
```
  e23dfed9
- move "gpu_primitives.h" to phi (#48015) · 9adca1e7
  Committed by Wang Xin on Nov 16, 2022
  
  9adca1e7
15 Nov, 2022 6 commits

add gather dtype err msg (#48002) · 5859d0a6
Committed by sneaxiy on Nov 15, 2022

5859d0a6

[Zero-Dim] support input 0D Tensor for xpu kernel, test=kunlun (#47849) · d4d3d7ed
Committed by zhouweiwei2014 on Nov 15, 2022

d4d3d7ed

mkldnn directory cleanup (#47779) · 8a339d24

Committed by Sławomir Siwek on Nov 15, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

8a339d24

[PHI decoupling] remove "paddle/fluid/platform/complex.h" in phi (#47926) · aa08b769
Committed by huangjiyi on Nov 15, 2022
```
* rm "paddle/fluid/platform/complex.h" in phi

* fix codestyle with pre-commit
```
aa08b769

remove 'paddle/fluid/operators/conv_op.h' from phi (#47914) · f7bf2930
Committed by Wang Xin on Nov 15, 2022

f7bf2930

[PHI decoupling] remove dependency on "paddle/fluid/operators/elementwise/xxx.h" in phi (#47870) · 04c29558

Committed by huangjiyi on Nov 15, 2022

* rm "paddle/fluid/operators/elementwise/xxx.h" in phi

* fix bugs

* add LaunchElementwiseCudaKernel in phi

* Revert "add LaunchElementwiseCudaKernel in phi"

This reverts commit 588f45bbdad2372ec7bff0c567a29bff675d22e1.

* rm indirect dependence to "elementwise_op_impl.cu.h"

rm indirect dependence to "elementwise_op_impl.cu.h"

Revert "add LaunchElementwiseCudaKernel in phi"

This reverts commit 588f45bbdad2372ec7bff0c567a29bff675d22e1.

add LaunchElementwiseCudaKernel in phi

fix bugs

* rm LaunchSameDimsElementwiseCudaKernel and LaunchElementwiseCudaKernel in phi

04c29558

14 Nov, 2022 1 commit
- add cos double and triple grad operator (#47796) · 1a145aab
  Committed by cyber-pioneer on Nov 14, 2022
  
  1a145aab
11 Nov, 2022 4 commits
- [Zero-Dim] fix batch_norm op infermeta bug (#47858) · 18549417
  Committed by zhouweiwei2014 on Nov 11, 2022
  
  18549417
- remove "paddle/fluid/framework/op_registry.h" from phi (#47868) · 78c8c7de
  Committed by Wang Xin on Nov 11, 2022
  
  78c8c7de
- remove fluid/framework/lod_tensor.h from phi (#47840) · 494cab07
  Committed by Wang Xin on Nov 11, 2022
  
  494cab07
- Simplify the autotune cache codes. (#47667) · 8758a338
  Committed by Yiqun Liu on Nov 11, 2022
  
  8758a338

BaiXuePrincess / Paddle is in sync with the fork's source project