Commits · e808fa30f73fa242821f573823e1d7c16760fad0 · PaddlePaddle / Paddle

Mar 20, 2023 · 1 commit
- Mv phi and fluid/test To test dir (#50640) · e808fa30
  Committed by tianshuo78520a on Mar 20, 2023
Mar 19, 2023 · 2 commits
- resgister for ftt_r2c, ftt_c2_r (#51563) · d431b7c7
  Committed by Difer on Mar 19, 2023
```
* resgister for ftt_r2c, ftt_c2_r

* fix clang-format
```
- [phi] Add output defs for argsort kernel (#51407) · 545e20f8
  Committed by Sanbu on Mar 19, 2023
```
* Add output defs for argsort kernel

* Update argsort_kernel.cc

* Update argsort_kernel.cu

* Update argsort_kernel.cc
```
Mar 17, 2023 · 1 commit
- [PHI] Add multinomial output defs (#51357) · b647c2f0
  Committed by PuQing on Mar 17, 2023
```
* add multinomial output defs

* fix register on gpu
```
Mar 16, 2023 · 8 commits

[Custom Operator] Custom op support inplace mechanism (#51620) · f824bc0d

Committed by HongyuJia on Mar 16, 2023

* init unit test commit, contains register thinking

* support inplace

* get inplaced x.grad

* Try support inplace and hook at the same time

* Support inplace, need debug

* Support inplace successfully

* Inplace use Tensor&, consistent with Tensor*

* fix MapPlainOutputs bug

* fix double grad inplace error

Update from_blob API (#51646) · c07c7712

Committed by Huang Jiyi on Mar 16, 2023

* remove contexts in tensor_utils

* update from_blob

* update from_blob

* update from_blob

* fix bug

* fix bug

[PHI] Add rnn and searchsorted output defs (#51360) · 3094d475
Committed by PuQing on Mar 16, 2023
```
* add rnn and searchsorted output defs

* add gpu kernel
```
[phi decoupling] remove fluid gpu_info usage in phi (#51699) · 907433a7
Committed by Huang Jiyi on Mar 16, 2023
```
* remove fluid thread_data_registry

* update

* fix bug
```

split layernorm pass (#51228) · 3f3372b6

Committed by wenbin on Mar 16, 2023

* split pass

* fix compile

* fix ut

* more time

* modify ut

* reduce dim

* fix compile

* reshape weight

* tensor

* remove enforce

* static shape ut

* batchsize

* reorder pass

* minus test cases

* windows timeout

* windows time out

* remove test for windows

* correct

* sssss

* xxx

add output defs for atan2 kernel (#51312) · ab3b87a6
Committed by Infinity_lee on Mar 16, 2023
```
* fix atan2

* fix

* fix

* fix

* fix error

* fix error

* fix
```
Add output defs for generate_proposals,instance_norm kernel (#51576) · 939b58b2
Committed by Sanbu on Mar 16, 2023
```
* Add output defs for generate_proposals,instance_norm kernel

* fix
```
add conv2d and conv2d_grad as default deny cinn ops (#51645) · d021095e
Committed by Leo Chen on Mar 16, 2023

Mar 15, 2023 · 5 commits

add output defs for eig kernel (#51319) · 5cb95856
Committed by Infinity_lee on Mar 15, 2023
```
* fix eig

* fix

* fix

* fix

* fix
```

[PHI] remove operator.h in blas.h (rebase to latest codebase) (#51472) · 427712df

Committed by iSerendipity on Mar 15, 2023

* Revert "Revert "【Hackathon No.67】remove operator.h in blas.h (#50989)" (#51467)"

This reverts commit b9d91531.

* remove cout

* add header

* fix missing header

* fix refer fluid error

* fix missing header

* Update repeat_interleave_grad_kernel_impl.h

Change to phi style datatype.

* Update repeat_interleave_grad_kernel_impl.h

Fix missing header

* datatype fluid -> phi

* paddle::experimental -> phi

* fix reference error

* fix reference error

* fix reference error

* fix errors

* fix missing FLAGS

* fix missing headers

* fix missing headers

* fix missing headers

* fix missing headers

* fix missing header

* fix missing header

* fix errors

Speedup datafeed (#51624) · effe2c11
Committed by pangengzheng on Mar 15, 2023

Move the "GetExpectedKernelType" into "get_expected_kernel_func.cc" (#51453) · f0db1f7e

Committed by HappyHeavyRain on Mar 15, 2023

* test_get_kernel

* add invoke signature

* change reduce_max

* change frobenius_norm

* reset reduce_max according to composite and change reduce_all

* fix the bug when Scalar(*)

* fix 'scalar when support_tensor'

* change code according to review

* change 'keep_signature' to 'manual_signature' and add some erro info

fix cuda graph (#51648) · 53c73c77
Committed by pangyoki on Mar 15, 2023

Mar 14, 2023 · 6 commits
- Fix typos (#51379) · e34c79c7
  Committed by chenxujun on Mar 14, 2023
- cuda graph support multi-stream for new executor (#51389) · 579fb5fd
  Committed by pangyoki on Mar 14, 2023
```
* cuda graph support multi-stream for new executor

* fix windows compile error

* delete create_cuda_graph_stream
```
- add output defs for histogram kernel (#51317) · 2876f6f8
  Committed by Infinity_lee on Mar 14, 2023
- add register of select (#51595) · 93867e20
  Committed by Ackeraa on Mar 14, 2023
```
add register of select
Co-authored-by: Nwqgo <1552367872@qq.com>
```
- [prim] enable dygraph_to_static to support custom_vjp · d0c80f43
  Committed by cxxly on Feb 24, 2023
- [Hackathon NO.73] Add temporal_shift operator for Paddle-TRT (#51207) · e79699fb
  Committed by Sonder on Mar 14, 2023
Mar 13, 2023 · 15 commits

Update interpreter_util.cc (#51478) · eeb0cfdc
Committed by lubiu on Mar 13, 2023

Add phi operator all_gather (#51420) · afa26a59
Committed by TaoTao Li on Mar 13, 2023
```
* add all_gather and fix conflicts

* fix code format

* fix ut

* fix broadcast ut
```
Add output defs for mode kernel (#51363) · 383a3f8c
Committed by Zhenghai Zhang on Mar 13, 2023
```
* Add output defs for mode kernel

* fix bug
```
Add output defs for fused_matmul kernel (#51326) · bc3afd82
Committed by iSerendipity on Mar 13, 2023
```
* remove fused_matmul from list

* add infermeta for fused matmul
```

Fused softplus (#51087) · fdcfa04f

Committed by Sławomir Siwek on Mar 13, 2023

* mkldnn->onednn

* fused softplus op + kernel

* remove extra attributes

* add missing handler

* change var name

Add output defs for conv3d_coo distribute_fpn_proposals kernel (#51516) · 09241d85
Committed by Sanbu on Mar 13, 2023
```
* Add output defs for conv3d_coo distribute_fpn_proposals kernel

* fix
```
fix ps_proto_bug (#51449) · 5dfbb229
Committed by risemeup1 on Mar 13, 2023

[with_data_parallel][part6] remove with_data_parallel in distributed optimizer (#50719) · 1404f732

Committed by kangguangli on Mar 13, 2023

* find relevant testcase

* remove with_data_parallel

* trigger CI

* do not apply ParameterServerGraphOptimizer

* remove useless optimizer

* remove with_data_parallel in test_dist_base

* fix test_fleet_base_3

* only reserve changes for GraphExecutionOptimizer

* fix bug

* fix test_minst_dgc_nccl

* fix typo

* fix test_dist_mnist_gradient_merge

* rm TestDistMnistNCCL2DGCMultiCards

* fix optimizer conflicts

* fix dist_mnist

* fix test_dist_hapi

* delete test_fleet_graph_execution_meta_optimizer & test_fleet_graph_executor

* temporally not delete unittest

* fix unittests

* fix ci

* recover prune in python/paddle/hapi/model.py

remove flags_enable_parallel_graph (#51375) · 38865fcd
Committed by kangguangli on Mar 13, 2023

fix phi xpu kernel tensor transform (#51306) · fcab331d
Committed by zhupengyang on Mar 13, 2023

add register of kthvalue (#51534) · 87c5f23b

Committed by junxiu777 on Mar 13, 2023

* add register of KthvalueKernel

add register of KthvalueKernel

* Update kthvalue_kernel.cc

* Update kthvalue_kernel.cu

[Paddle Inference ]use python to generate cutlass code (#50603) · 4e9e23cb

Committed by zhoutianzi666 on Mar 13, 2023

* use python to generate cutlass code

* refine CommonConvKernelPart1, CommonConvKernelPart2

* remove useless code in generate_cutlass_code.sh

* add more config in conv2d_residual

* CommonCutlassConvKernelPart1 and CommonCutlassConvKernelPart2

* add group conv support in util.cu

* remove .sh

* refine name

* make name goodgit status!

* add fuse_alpha

* make code easy to understand

* mot fopen generate in py

* use python script to generate conv2d,group=1 cutlass code

* use const &

* use const & && use python script to generate conv2d/group=1 code

add register of auc (#51451) · 39899d79

Committed by Little-chick on Mar 13, 2023

* Update interpreter_util.cc

* Update auc_kernel.cc

* Update auc_kernel.cu

* Update auc_kernel.cc

* Update auc_kernel.cu

remove_viterbi_devode (#51523) · 1d992173
Committed by 404988613 on Mar 13, 2023
```
* Update interpreter_util.cc

* Update interpreter_util.cc
```
[xpu] optimize multi_encoder_xpu_fuse_pass performance (#51346) · e2cdd4a3
Committed by zhupengyang on Mar 13, 2023

Mar 12, 2023 · 2 commits
- add register of bincount (#51508) · 48090c72
  Committed by hellockx on Mar 12, 2023
```
* Update interpreter_util.cc

* Update bincount_kernel.cc

* Update bincount_kernel.cu
```
- add register of is_empty (#51484) · 5afab2cd
  Committed by hellolllw on Mar 12, 2023
```
* Update interpreter_util.cc

* Update is_empty_kernel.cc

* Update is_empty_kernel.cc
```

PaddlePaddle / Paddle: synced successfully about 1 year ago
