Commits · 0fd8ee63c318471e77a82a66cc9378382cf23da4 · Crayon鑫 / Paddle

02 Aug, 2022 5 commits

Multihead matmul fp16 (#44792) · 0fd8ee63

Committed by Wilber on Aug 02, 2022

* multihead matmul add fp16

* fix windows error

* fix rocm error

* fix rocm error

0fd8ee63

[Phi] Move QR to Phi (#44742) · 2cf2e786

Committed by Yulong Ao on Aug 02, 2022

* [Phi] Move Qr to the Phi

* [Phi] Register the cpu grad kernel for qr

* [Phi] Share the cuda kernels to lstsq

* [Phi] Remove some improper include files

* [Phi] Modify codes based on the reviews

* [Phi] Remove unnecessary files and add the cuda_only comment

* [Phi] Remove the unnecessary include file

* [Phi] Remove qr_op.cu and lstsq_op.cu

2cf2e786

[XPU] fp16 for layer_norm op (#44778) · 4c3e13de
Committed by houj04 on Aug 02, 2022
```
* [XPU] fp16 for layer_norm op. test=kunlun
```
4c3e13de

[phi] add yolov3_loss yaml and unittest (#44476) · c7cf12fc

Committed by ccrrong on Aug 02, 2022

* add yaml and unittest

* update yaml

* update backward yaml and unittest

* update yaml

* add Yolov3LossGradInferMeta

* update yolov3_loss_op.cc

* fix bug

* code format

c7cf12fc

support beam_search operator on xpu. test=kunlun (#44720) · 9bf80772

Committed by mengqingchun02 on Aug 02, 2022

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

* support beam_search operator on xpu. test=kunlun

9bf80772

01 Aug, 2022 7 commits

unify gpu context (#44740) · 86763023

Committed by Leo Chen on Aug 01, 2022

* remove cudaDeviceContext

* remove more template

* fix rocm compile

* remove alias name CUDADeviceContext

* fix compile

* fix tests

* revert changes

86763023

Revert for cmake static library errors on XPU KP #44762 · f15d930a
Committed by zhiboniu on Aug 01, 2022

f15d930a

[operator migration] Migrate unstack_op and nms_op (#44424) · 9d2e0ecb

Committed by Thomas Young on Aug 01, 2022

* update unstack_op

* update unstack_op

* update unstack_op

* fix unstack test

* update unstack

* update with remote

* fix unstack_test.py

* temp_save_change_nms_op

* add nms test

* update nms fix

* update unstack_op

* temp save change

* finish fix nms_op

* pass nms test

* fix CI

* fix ops test

* save change

* fix code style

* fix code style

* fix ci and codestyle

* fix ci
Co-authored-by: ShiningZhang <zhang_liang1991@126.com>

9d2e0ecb

Fix to CI (#44744) · 71f74f5c
Committed by Jacek Czaja on Aug 01, 2022
```
* - fix

* - another fix

* lint
```
71f74f5c

migrate overlap_add and overlap_add_grad op (#44739) · 2a8219c1
Committed by levi131 on Aug 01, 2022
```
* update code format

* add yaml and test

* update for comments
```
2a8219c1

migrate reduce_amin,reduce_amax kernel to phi (#44698) · 8482f1ae
Committed by Xiaoxu Chen on Aug 01, 2022

8482f1ae

[PHI] Move lu_unpack to phi (#44674) · c905a9e9

Committed by Lin Manhui on Aug 01, 2022

* Add kernel declarations

* Copy kernel implementation code

* Transfer implementation code

* Register new kernels

* Remove old kernels

* Fix code style

* Fix bugs

* mutable_data->HostAlloc

* Transfer infermeta

* Add yaml and update python api

* Add PADDLE_WITH_HIP check

* Update unittests

* Add kernel declarations

* Copy kernel implementation code

* Transfer kernel implementation code

* Register new kernels

* Remove old kernels

* Add lu_unpack_sig

* Fix bugs

* Fix bugs

* Fix bugs

* Optimize directory structure

* Add output checks

* Update include files

* lu_impl.h->lu_kernel_impl.h

* Transfer infermeta

* Add yaml and update python api

* Add check_eager
Co-authored-by: Bobholamovic <linmanhui@baidu.com>

c905a9e9

30 Jul, 2022 1 commit

Phi prior box (#44431) · d92b2f2d
Committed by zhiboniu on Jul 30, 2022
```
* phi_prior_box

* add float[] support

* phi_prior_box_optest

* update
```
d92b2f2d

29 Jul, 2022 9 commits

unify fluid::CUDADeviceContext and phi::GpuContext (#44723) · 88490567
Committed by Leo Chen on Jul 29, 2022
```
* remove cudaDeviceContext

* remove more template

* fix rocm compile
```
88490567

[API/OP] Migrate Lstsq op into phi (#44318) · ab2aaf8b

Committed by Haohongxiang on Jul 29, 2022

* migrate lstsq op

* update

* fix bugs for CIs

* update

* fix bugs

* add uts

* update

* update

* update

* fix bugs of jip

* fix bugs of hip

* update

* update according to review

* update

* update

* update

* update

ab2aaf8b

add some fp16 op for kunlun resnet50 model (#44672) · fecbc958
Committed by QingshuChen on Jul 29, 2022
```
* add some fp16 op for kunlun resnet50 model
*test=kunlun

* tmp
*test=kunlun
```
fecbc958

phi_multiclass_nms3 (#44613) · a9919903
Committed by zhiboniu on Jul 29, 2022

a9919903

[WIP] Matmul v1 & v2 unification -- part 1 (#44640) · 653885a5

Committed by Jacek Czaja on Jul 29, 2022

* - Unit tests to be debugged

- fix

- refactor

- diagnostic

- more diagnostic

- fix

- Fix number two

- fix

- fix

- fix

- alpha added

- more fixes

- compilation fix

- removed diagnostic code

- cosmetic fixes

* lint

653885a5

move CUDAStream to phi (#44529) · da3743fd

Committed by Leo Chen on Jul 29, 2022

* init

* move CUDAStream to phi

* fix compilation

* merge develop

* add stream_owned_ member

* split cuda_stream.h

* fix cpu compile

* fix constructor

* fix bug

* fix windows compile

* fix inference test_levit

* fix windows tests

da3743fd

[PHI] Move lu to phi (#44605) · 3d88816e

Committed by Lin Manhui on Jul 29, 2022

* Add kernel declarations

* Copy kernel implementation code

* Transfer implementation code

* Register new kernels

* Remove old kernels

* Fix code style

* Fix bugs

* mutable_data->HostAlloc

* Transfer infermeta

* Add yaml and update python api

* Add PADDLE_WITH_HIP check

* Update unittests

* Fix bugs

* Fix bugs

* Optimize directory structure

* Add output checks

* lu_impl.h->lu_kernel_impl.h
Co-authored-by: Bobholamovic <linmanhui@baidu.com>

3d88816e

fused_fc_elementwise_layernorm_op support fp16 (#44710) · 856f741a
Committed by ming1753 on Jul 29, 2022
```
* fused_fc_elementwise_layernorm support fp16

* fused_fc_elementwise_layernorm support double
```
856f741a

[XPU] add sampling_id op, add top_k op, update xdnn api. test=kunlun (#44704) · e61f48c1
Committed by houj04 on Jul 29, 2022

e61f48c1

28 Jul, 2022 10 commits

[phi]move softsign from fluid to phi (#44616) · 20759c30

Committed by HongyuJia on Jul 28, 2022

* test_activation_op unittest error, yaml & activation.py in_dygraph_mode incomplete

* fix test_activation_op unittest error, add yaml and dygraph test

* fix code style with pre-commit

* try to fix namespace error of abs in activation_functor.h

* fix namespace error of abs

20759c30

migrate dirichlet kernel to phi (#44434) · 798a4eac
Committed by Xiaoxu Chen on Jul 28, 2022
```
* migrate dirichlet op kernel to phi

* fix dirichlet sample memory leak
```
798a4eac

fix bugs of lstsq (#44689) · 2781740b
Committed by Haohongxiang on Jul 28, 2022

2781740b

[MLU] fix log_softmax mode selection. (#44669) · a9f76d07
Committed by Chenxiao Niu on Jul 28, 2022

a9f76d07

Move frame kernel to phi (#44615) · 28b4b2f7

Committed by Charles-hit on Jul 28, 2022

* Move frame OP to phi, add frame OP yaml config and supplement single test

* add Header file of in_dygraph_mode

* Modify variable name and FrameGradInferMeta multiplex UnchangedInferMeta

* move seq2col to phi

28b4b2f7

Move api(lgamma) from legacy_api.yaml to api.yaml (#44355) · 511a2c1c

Committed by Charles-hit on Jul 28, 2022

* Move api(lgamma) from legacy_api.yaml to api.yaml

* Move api(lgamma) from legacy_api.yaml to api.yaml

* Move api(lgamma) from legacy_api.yaml to api.yaml

* modify code style

* add x to X mapping

* add definition of lgamma

* delete redundant lgamma definitions

* Modify code comments

* Modify ops.py code format

* add lgamma  single test and lgamma api in fluid

* Optimized lgamma unittest

511a2c1c

support log_grad op, *test=kunlun (#44662) · 067107ad
Committed by z8hanghuan on Jul 28, 2022

067107ad

Complete the dtypes for all_gather, add all_gather_object api (#44417) · d4cf02bc
Committed by LiYuRio on Jul 28, 2022

d4cf02bc

[PHI] Move spectral_norm to phi (#44577) · 768e50c9

Committed by Lin Manhui on Jul 28, 2022

* Add kernel declarations

* Copy kernel implementation code

* Transfer implementation code

* Fix: Move out_grad to first

* Register new kernels

* Remove old kernels

* Move out_grad to last

* Fix bugs

* Transfer infermeta

* Add yaml files

* Add blank line

* Fix code style

* Optimize directory structure
Co-authored-by: Bobholamovic <linmanhui@baidu.com>

768e50c9

[XPU] add top_k op (#44656) · acf07c74

Committed by houj04 on Jul 28, 2022

* [XPU] add top_k op. test=kunlun

* [XPU] add top_k op. test=kunlun

* use PADDLE_ENFORCE_XDNN_NOT_NULL to check pointer. test=kunlun

acf07c74

27 Jul, 2022 5 commits

[MLU]fix sync_batch_norm and concat_grad op (#44586) · f49b0cb9
Committed by qipengh on Jul 27, 2022

f49b0cb9

[phi] move crop_tensor kernel from fluid to phi (#44574) · b20f771f
Committed by freeliuzc on Jul 27, 2022
```
* move crop_tensor from fluid to phi

* delete fluid header files

* fix crop_tensor_op dygraph_mode bug

* modify header files, add out tensor check
```
b20f771f

[DCU] Fix NAN problem when training BERT on DUC platform (#44643) · 28aa0c61
Committed by Yuang Liu on Jul 27, 2022

28aa0c61

Phi average accumulates migration (#44554) · eafd4280
Committed by Wang Bojun on Jul 27, 2022
```
* move average_accumulates op to phi kernel
```
eafd4280

fix bug of elementwise_add_grad, *test=kunlun (#44545) · 35ca1ce4
Committed by z8hanghuan on Jul 27, 2022
```
* fix bug of elementwise_add_grad, *test=kunlun

* fix bug, *test=kunlun

* rm pooling_t, *test=kunlun

* fix bug of ew_add_grad when inplace, *test=kunlun
```
35ca1ce4

26 Jul, 2022 3 commits

add sin,cos,exp primitive operators (#44345) · 22342d51
Committed by Xiaoxu Chen on Jul 26, 2022

22342d51

fix windows cuda11.7 bug (#44601) · e3ee5103
Committed by Sing_chan on Jul 26, 2022

e3ee5103

[Phi] Migrate box coder to phi. (#44550) · 98f8fa4c
Committed by lyq on Jul 26, 2022

98f8fa4c
