Commits · edda13cd88b269c932e1d8fafa5a6fabbbda72a2 · PaddlePaddle / Paddle

18 Nov, 2022 6 commits

correct sync behavior for XPU distributed training (#47882) · aafa9820

Committed by james on Nov 18, 2022

* correct sync behavior for XPU distributed training

XPU supports an event mechanism similar to CUDA events, so it is advisable to
use an event to sync compute/comm streams for performance. However, this
mechanism has never been fully tested, and inconsistent loss/ending_epochs have
been reported. Therefore, this PR replaces event sync with stream waiting as
a temporary solution.

* remove compile warning

aafa9820

CUDNN v8 Implementation of Convolution Kernels (#47454) · 14a6e67b

Committed by Tian Zheng on Nov 18, 2022

* Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation

* Fix macro

* Add implementation for conv_kernel and conv_grad_kernel

* Modification after rebase onto latest develop

* Modify plan cache to comply with the API of phi::autotune

* Refactor to reduce duplicate code

* Review fix:
- move functions in conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernel.cu
- add const specifier for input tensor
- add logging when plans fail to execute
- move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h

* move plan building outside of cache

* Fix ROCM build

14a6e67b

add bf16 for numel (#48121) · a7d306af
Committed by Yuang Liu on Nov 18, 2022

a7d306af

cast and gradient_accumulator support double for xpu, test=kunlun (#47800) · 982d5ff7
Committed by zhangyikun02 on Nov 18, 2022

982d5ff7

fix onednn prelu header (#48064) · 85598e31
Committed by Sylwester Fraczek on Nov 18, 2022

85598e31

rm "paddle/fluid/operators/amp/fp16_type_traits.h" in phi (#48051) · e4670d80
Committed by huangjiyi on Nov 18, 2022

e4670d80

17 Nov, 2022 10 commits
- [NPU] add _npu_identity op and api, test=develop (#47850) · 099c2302
  Committed by Qi Li on Nov 17, 2022
```
* [NPU] add _npu_identity op and api, test=develop

* fix doc

* address comments
```
  099c2302
- fix the thread number to ensure deterministic of embedding kernel (#48073) · 5329187d
  Committed by xiongkun on Nov 17, 2022
  
  5329187d
- rm "paddle/fluid/framework/convert_utils.h" in phi (#48001) · 2f34fc7a
  Committed by huangjiyi on Nov 17, 2022
  
  2f34fc7a
- [PHI]Standardise some C++ API (Part5) (#47860) · f3650201
  Committed by YuanRisheng on Nov 17, 2022
```
* standard api

* fix xpu bugs
```
  f3650201
- xpu-paddlepaddle-41 [Task] ffn and attention test=kunlun (#46658) · 071708fa
  Committed by taixiurong on Nov 17, 2022
  
  071708fa
- move "function_traits.h" from fluid to phi (#48065) · b7841a2b
  Committed by Wang Xin on Nov 17, 2022
  
  b7841a2b
- Implement a common dimension simplifier. (#47981) · bf6af816
  Committed by Yiqun Liu on Nov 17, 2022
```
* Implement a common dims simplifier.

* Fix the include position error.

* Reduce the cpu overhead of broadcast computing.
```
  bf6af816
- rm "paddle/phi/kernels/gpu/batch_norm_utils.h" in phi (#48057) · b7e120d2
  Committed by huangjiyi on Nov 17, 2022
  
  b7e120d2
- [PHI decoupling] move "paddle/fluid/operators/math.h" to phi (#48062) · f62bd3b4
  Committed by huangjiyi on Nov 17, 2022
```
* rm "paddle/fluid/operators/math.h" in phi

* rm "paddle/fluid/operators/math.h" in fluid
```
  f62bd3b4
- Support bfloat16 for adamw and adam optimizer. Fit the lr for pure bf16... · e5ed5257
  Committed by Yuang Liu on Nov 17, 2022
```
Support bfloat16 for adamw and adam optimizer. Fit the lr for pure bf16 training with tensor fusion. (#48041)

* add bfloat16 for adamw

* set lr not to bfloat16 for pure bf16 training

* update the logic

* update the adamw optimizer

* support bfloat for adam
```
  e5ed5257
16 Nov, 2022 4 commits
- rm "paddle/fluid/framework/gpu_utils.h" in phi (#48020) · 29a0987a
  Committed by huangjiyi on Nov 16, 2022
  
  29a0987a
- Add bf16 data type support to oneDNN bilinear_interp kernel (#46770) · 8e6315e4
  Committed by Piotr Paturej on Nov 16, 2022
```
* Enable bf16 in oneDNN bilinear_interp kernel

* Fix bilinear_interp_v2 not enabled in models

* Remove unnecessary checks
```
  8e6315e4
- Fix paddle rec, kim, dsin models' bugs (#47792) · e23dfed9
  Committed by ykkk2333 on Nov 16, 2022
```
* add stat tool

* add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun

* embedding and embedding_grad add int32 input, test=kunlun
```
  e23dfed9
- move "gpu_primitives.h" to phi (#48015) · 9adca1e7
  Committed by Wang Xin on Nov 16, 2022
  
  9adca1e7
15 Nov, 2022 6 commits

add gather dtype err msg (#48002) · 5859d0a6
Committed by sneaxiy on Nov 15, 2022

5859d0a6

[Zero-Dim] support input 0D Tensor for xpu kernel, test=kunlun (#47849) · d4d3d7ed
Committed by zhouweiwei2014 on Nov 15, 2022

d4d3d7ed

mkldnn directory cleanup (#47779) · 8a339d24

Committed by Sławomir Siwek on Nov 15, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

8a339d24

[PHI decoupling] remove "paddle/fluid/platform/complex.h" in phi (#47926) · aa08b769
Committed by huangjiyi on Nov 15, 2022
```
* rm "paddle/fluid/platform/complex.h" in phi

* fix codestyle with pre-commit
```
aa08b769

remove 'paddle/fluid/operators/conv_op.h' from phi (#47914) · f7bf2930
Committed by Wang Xin on Nov 15, 2022

f7bf2930

[PHI decoupling] remove dependency on "paddle/fluid/operators/elementwise/xxx.h" in phi (#47870) · 04c29558

Committed by huangjiyi on Nov 15, 2022

* rm "paddle/fluid/operators/elementwise/xxx.h" in phi

* fix bugs

* add LaunchElementwiseCudaKernel in phi

* Revert "add LaunchElementwiseCudaKernel in phi"

This reverts commit 588f45bbdad2372ec7bff0c567a29bff675d22e1.

* rm indirect dependence to "elementwise_op_impl.cu.h"

rm indirect dependence to "elementwise_op_impl.cu.h"

Revert "add LaunchElementwiseCudaKernel in phi"

This reverts commit 588f45bbdad2372ec7bff0c567a29bff675d22e1.

add LaunchElementwiseCudaKernel in phi

fix bugs

* rm LaunchSameDimsElementwiseCudaKernel and LaunchElementwiseCudaKernel in phi

04c29558

14 Nov, 2022 1 commit
- add cos double and triple grad operator (#47796) · 1a145aab
  Committed by cyber-pioneer on Nov 14, 2022
  
  1a145aab
11 Nov, 2022 7 commits
- [Zero-Dim] fix batch_norm op infermeta bug (#47858) · 18549417
  Committed by zhouweiwei2014 on Nov 11, 2022
  
  18549417
- remove "paddle/fluid/framework/op_registry.h" from phi (#47868) · 78c8c7de
  Committed by Wang Xin on Nov 11, 2022
  
  78c8c7de
- remove fluid/framework/lod_tensor.h from phi (#47840) · 494cab07
  Committed by Wang Xin on Nov 11, 2022
  
  494cab07
- Simplify the autotune cache codes. (#47667) · 8758a338
  Committed by Yiqun Liu on Nov 11, 2022
  
  8758a338
- [Sparse]Optimize BatchNorm1D forward in test mode (#47736) · 6cdc18af
  Committed by zhangkaihuo on Nov 11, 2022
  
  6cdc18af
- [PHI decoupling] remove dependency on 2 header files in fluid from phi (#47842) · 1ad95e97
  Committed by huangjiyi on Nov 11, 2022
```
* rm "paddle/fluid/operators/eigen/eigen_function.h" in phi

* rm "paddle/fluid/operators/elementwise/elementwise_op_function.h" in phi

* Revert "rm "paddle/fluid/operators/elementwise/elementwise_op_function.h" in phi"

This reverts commit c4ba51225e3652f1d80925afba406612968f0ee9.
```
  1ad95e97
- [PHI decoupling] remove #include "paddle/fluid/platform/bfloat16.h" in phi (#47831) · e0742c48
  Committed by PuQing on Nov 11, 2022
  
  e0742c48
10 Nov, 2022 6 commits
- conv2d_transpose and deformable_conv unrestricted some limit for xpu2, test=kunlun (#47837) · a38fc5e1
  Committed by zhangyikun02 on Nov 10, 2022
  
  a38fc5e1
- [phi] migrate prelu (#47422) · cdd8c8ab
  Committed by Sylwester Fraczek on Nov 10, 2022
```
* migrate prelu

* remove cache

* review fixes
```
  cdd8c8ab
- [PHI]Standardise some C++ API (Part4) (#47702) · 594bd723
  Committed by YuanRisheng on Nov 10, 2022
```
* standard api

* fix sparse bugs

* fix xpu bugs, test=kunlun

* remove hard code for custom unittest

* open ci, test=kunlun

* deal with conflict
```
  594bd723
- [PHI decoupling] remove fluid/framework/generator.h from phi (#47822) · 28c56d77
  Committed by Wang Xin on Nov 10, 2022
```
* remove fluid/framework/generator.h from phi

* fix PR-CI-Kunlun-KP-Build fail
```
  28c56d77
- [PHI decoupling] remove "paddle/fluid/platform/device/gpu/gpu_launch_config.h" in phi (#47808) · 40a9b488
  Committed by PuQing on Nov 10, 2022
```
* rm fluid gpu_launch_config

* fix type
```
  40a9b488
- [PHI Decoupling] remove "paddle/fluid/platform/float16.h" and... · 8164b97a
  Committed by huangjiyi on Nov 10, 2022
```
[PHI Decoupling] remove "paddle/fluid/platform/float16.h" and "paddle/fluid/platform/for_range.h" in phi. (#47817)

* rm "paddle/fluid/platform/float16.h" in phi

* rm "paddle/fluid/platform/for_range.h" in phi
```
  8164b97a

PaddlePaddle / Paddle synced successfully about 1 year ago