Commits · 3ca7328f75b4bb86274ded7c3226c64aa673b21c · PaddlePaddle / Paddle

Nov 21, 2022 · 15 commits
- [PHI decoupling] move "thread pool" from fluid to phi (#48075) · 3ca7328f
  Committed by PuQing on Nov 21, 2022
```
* move threadpool

fix cmake

* fix make
```
- (fluid cleanup) Remove filter by instag in nn.py under fluid (#47929) · 468f8815
  Committed by 傅剑寒 on Nov 21, 2022
- add adamw support for xpu, test=kunlun (#48114) · 27e252d9
  Committed by taixiurong on Nov 21, 2022
- add check_xpu_dependence.sh script. (#48154) · 394a7179
  Committed by houj04 on Nov 21, 2022
- Remove fluid.layers.relu6 under fluid directory (#47876) · 5a45ceb2
  Committed by 傅剑寒 on Nov 21, 2022
```
* remove relu6 test case under fluid

* fix relu6 test case in mkldnn_elt_act_fuse_pass
```
- Remove API: selu (#47969) · 1175a2b9
  Committed by Vvsmile on Nov 21, 2022
```
replace paddle.fluid.layers.selu with paddle.nn.functional.selu
```
- [Clean Fluid API] Remove API: gather (#47954) · 844ab6fe
  Committed by Vvsmile on Nov 21, 2022
```
* Remove API: gather
	replace the paddle.fluid.layers.gather with paddle.gather

* modify the call of gather from old style to new style
```
- Update AUTHORS.md (#48177) · 1ba308f5
  Committed by engineer1109 on Nov 21, 2022
- round (#48107) · b546438c
  Committed by wenbin on Nov 21, 2022
- [PHI decoupling] move cross_entropy from fluid to phi (#48160) · 3501ff7d
  Committed by huangjiyi on Nov 21, 2022
```
* move cross_entropy from fluid to phi

* replace mutable_data with Alloc

* use .template
```
- Unify `ProcessGroupNCCL` APIs underlying implementation (#48163) · 88410225
  Committed by Wen Sun on Nov 21, 2022
```
* refactor: replace Collective & PointToPoint with NCCLEnv

* refactor: rename to RunFnInNCCLEnv

* refactor: pass std::function by value
```
- add new map instance (#48145) · 2a47416c
  Committed by LiYuRio on Nov 21, 2022
- return pointer rather than reference (#48152) · 403d58bb
  Committed by LiYuRio on Nov 21, 2022
- remove macros.h (#48069) · 02c51f3b
  Committed by PuQing on Nov 21, 2022
- add state_dict convert (#48161) · c00f0daf
  Committed by sneaxiy on Nov 21, 2022
Nov 20, 2022 · 1 commit
- remove range from fluid (#48086) · 5675c7d5
  Committed by ccrrong on Nov 20, 2022
```
* remove range
```
Nov 19, 2022 · 2 commits
- refactor: rm redundant funcs (#48149) · f38e09f0
  Committed by Wen Sun on Nov 19, 2022
- [CustomPlace] fix amp (#48090) · c775bc69
  Committed by Aganlengzi on Nov 19, 2022
```
* [CustomPlace] fix amp

* [CustomPlace] fix amp

* fix ut because of too long time matmul fp16
```
Nov 18, 2022 · 22 commits

refine save hook (#48124) · 04709310
Committed by wanghuancoder on Nov 18, 2022

Fused QKVBiasAdd and Transpose with Split Q, KV (#47680) · d595928e
Committed by MarDino on Nov 18, 2022

* fused qkvBiasAdd and transpose with split qkv

* fix typo

* fix format

* fix name

* add annotation

* fix comment

clear fluid apis: fix apis in fleet and passes (#48021) · e5408835
Committed by yuehuayingxueluo on Nov 18, 2022
```
* clear fluid apis in fleet and passes

* fix model.py

* fix model.py

* fix cpp_pass.py
```

[PHI] Migrate matmul_grad kernel (#48023) · 4ab18ada
Committed by Sławomir Siwek on Nov 18, 2022

* cleanup unused code

* unify is_int8 is_bfloat16

* Simplify matmul_v2 FWD kernel

* remove RunKernel methods

* remove import namespace

* remove headers

* clean fluid/phi cross imports

* remove fluid axpy_handler

* delete fluid methods

* activations

* OneDNNMemDesc

* MKLDNNFormatForSize

* MatchShapeToLayout

* MKLDNNMemoryFormat

* MKLDNNFormat

* ReorderMKLDNNHandler

* to_void_cast

* review suggestions

* interpolate

* remove fluid dependency

* init

* ExecuteMatMulV2

* rm fluid kernel

* matmul_grad

* remove mutable_data

Remove API: pad_constant_like (#47949) · 7073ed5b
Committed by Vvsmile on Nov 18, 2022
```
remove pad_constant_like which is not used in paddle 2.0
```

[PHI] Migrate conv_transpose kernel (#48119) · 9aacb31b
Committed by Zuza Gawrysiak on Nov 18, 2022

* Migrate conv_transpose to phi

* Move handler to kernel

* kernel m

* Fix formatting

* handler

* remove fluid

* revert tcp_store

* tcp_store

* remove unused

* Fix declaration

* add dnn input

* Fix typo
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>

delete logical_xor api (#48070) · ec778272
Committed by 201716010711 on Nov 18, 2022

Fix bug of zero_allocator in HostAlloc (#48108) · 7f92e27e
Committed by zyfncg on Nov 18, 2022
```
* fix bug of zero_allocator in host

* fix test compile bug

* add unittest

* update test
```
(fluid cleanup) remove stack in nn.py under fluid (#47942) · 058aa381
Committed by 傅剑寒 on Nov 18, 2022

Optimize FusedBiasAddGelu Kernel (#47679) · b0e28540
Committed by MarDino on Nov 18, 2022

* Add quick gelu and fused bias add kernel

* fix annotation

* remove useless code

* add fast gelu option and set it in multi transformer op

* add flag to restrict if use fast gelu approximate

* fix flags conflict

* fix use tanh function instead

* add cudart version limit

* use phi fast tanh func

* fix comment
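
The bullets above mention a "fast gelu" option and a switch to a tanh function. As a rough illustration only, the sketch below compares the exact erf-based GELU with the widely used tanh approximation, applied after a bias add. It assumes the commit refers to that standard approximation, which the log does not confirm. The names `gelu_exact`, `gelu_tanh`, and `bias_add_gelu` are hypothetical and are not Paddle APIs.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Standard tanh approximation of GELU (assumed, not taken from the PR)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def bias_add_gelu(row, bias, approximate=True):
    """Add a bias element-wise, then apply GELU (hypothetical helper)."""
    act = gelu_tanh if approximate else gelu_exact
    return [act(v + b) for v, b in zip(row, bias)]


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, gelu_exact(x), gelu_tanh(x))
```

The real kernel fuses these steps on the GPU; this scalar version only shows the arithmetic that such a fusion would combine.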

[PHI decoupling] move "gpu_device_function.h" from fluid to phi (#48097) · 27ee6e71
Committed by huangjiyi on Nov 18, 2022

* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi

* update copyright years

* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi

* fix rocm-compile bugs

Refactor collective communication reduce, scatter, reduce_scatter C++ API (#48115) · edda13cd
Committed by Wen Sun on Nov 18, 2022

[AutoParallel] selective recompute (#48111) · d7f7963f
Committed by zhaoyingli on Nov 18, 2022
```
* [AutoParallel] selective recompute

* add cmakelist
```

correct sync behavior for XPU distributed training (#47882) · aafa9820
Committed by james on Nov 18, 2022

* correct sync behavior for XPU distributed training

XPU supports an event mechanism similar to CUDA events, so it is advisable to
use an event to sync compute/comm streams for performance. However, this
mechanism has never been fully tested, and inconsistent loss/ending_epochs
have been reported. Therefore, this PR replaces event sync with stream waiting
as a temporary solution.

* remove compile warning

Add description to `nn.functional.celu` (#48074) · 1fb4d90b
Committed by Dandelight on Nov 18, 2022

fix device id issue for xpu eager mode (#48076) · 3b18d96b
Committed by james on Nov 18, 2022

* fix device id issue for xpu eager

The xpu device id is not correctly set in eager mode, so vars stay on dev0 unless
XPUDeviceGuard is called, leading to this error message on every node with rank != 0:
"NotImplementedError: (Unimplemented) Place Place(xpu:0) is not supported."

* fix typo

* fix pybind error

CUDNN v8 Implementation of Convolution Kernels (#47454) · 14a6e67b
Committed by Tian Zheng on Nov 18, 2022

* Refactor conv_kernel and conv_grad_kernel to provide interface for CUDNNv8 implementation

* Fix macro

* Add implementation for conv_kernel and conv_grad_kernel

* Modification after rebase onto latest develop

* Modify plan cache to comply with the API of phi::autotune

* Refactor to reduce duplicate code

* Review fix:
- move functions in conv_kernel_impl_v8.h and conv_grad_kernel_impl_v8.h to conv_kernel.cu and conv_grad_kernel.cu
- add const specifier for input tensor
- add logging when plans fail to execute
- move CudnnConvBwdFilterV8 and CudnnConvBwdDataV8 to conv_cudnn_frontend.h

* move plan building outside of cache

* Fix ROCM build

remove unused fluid beam_search_decoder (#48096) · 593bc4e2
Committed by GGBond8488 on Nov 18, 2022

add bf16 for numel (#48121) · a7d306af
Committed by Yuang Liu on Nov 18, 2022

[PHI decoupling] remove "gpu_primitives.h" in fluid (#48063) · 9918bf9c
Committed by Wang Xin on Nov 18, 2022
```
* remove "gpu_primitives.h" in fluid namespace

* fix PR-CI-GpuPS fail

* fix PR-CI-GpuPS fail
```

Allow to specify train_bs and eval_bs separately in hapi.fit() (#48032) · a33d563c
Committed by parap1uie-s on Nov 18, 2022

* Fix hAPI bug of not compatible with LayerHook

https://github.com/PaddlePaddle/Paddle/issues/47000

* Fix hAPI bug of not compatible with LayerHook

* Allow to specify train_bs and eval_bs separately in hapi.fit()

* Update model.py

* Update Model.py

* Update test_model.py

* update model.py

cast and gradient_accumulator support double for xpu, test=kunlun (#47800) · 982d5ff7
Committed by zhangyikun02 on Nov 18, 2022


PaddlePaddle / Paddle · synced successfully more than 1 year ago