Commits · 48b3e86956fd8b25f11be60e04df0b63df857d4c · BaiXuePrincess / Paddle

31 Jan, 2023 11 commits
- H
  [Decouple phi] Decouple custom_op in fluid and phi (#49866) · 48b3e869
  Committed by HongyuJia on Jan 31, 2023
```
* decouple phi custom_op

* decouple phi custom_op, remove codes

* delete custom symbol of inference
```
  48b3e869
- 张
  
  fix div 0 error in conv1_transpose (#50000) · 1755a154
  Committed by 张春乔 on Jan 31, 2023
  
  1755a154
- R
  Fix stack overflow of case10: paddle.unique (#49981) · dbfdefa7
  Committed by RedContritio on Jan 31, 2023
```
* add axis check in UniqueRawInferMeta

* add unittest for negative axis

* simplify check for unique
```
  dbfdefa7
- R
  Fix null pointer of case 14 paddle.atan2 (#49973) · 82edc65b
  Committed by RedContritio on Jan 31, 2023
```
* add elements count check in atan2

* add unittest and pre-check in inferMeta

* add dimension check
```
  82edc65b
- 张
  Fix the div 0 error of matrix_power (#49942) · fb74147c
  Committed by 张春乔 on Jan 31, 2023
```
* add zero size check in matrix_power_kernel_impl.h

* add zero size check in matrix_power_kernel_impl.h

* add zero size check in unittest

* bug_fix

* bug_fix

* bug_fix

* bug_fix

* bug_fix

* bug fix

* bug_fix

* bug_fix

* add static check

* delete the dy codes
```
  fb74147c
- R
  Fix stack overflow of case9: paddle.repeat_interleave (#49982) · 66682be0
  Committed by RedContritio on Jan 31, 2023
```
* support negative index in repeat_interleave

* add unittest
```
  66682be0
- 张
  
  fix the div 0 error of pixel_shuffle (#49996) · baf96a12
  Committed by 张春乔 on Jan 31, 2023
  
  baf96a12
- R
  
  add dims check for nms_kernel (#49993) · 4976153d
  Committed by RedContritio on Jan 31, 2023
  
  4976153d
- Y
  Unify the gpu implementation of stack and unstack to reuse the optimization. (#49748) · 3586e856
  Committed by Yiqun Liu on Jan 31, 2023
```
* Unify the gpu implementation of stack and unstack to reuse the optimization.

* Optimize the cuda implementation of unstack.

* Use GpuMemcpyAsync instead of memory::Copy.

* Fix error of calculating the index.

* Use FastDivMod to further imporve the performance of unstack.
```
  3586e856
- L
  
  add multi fetch (#50070) · a8078bbd
  Committed by LiYuRio on Jan 31, 2023
  
  a8078bbd
- 姜
  rm flags retain grad in pybind (#49888) · 9c3a35b9
  Committed by 姜永久 on Jan 31, 2023
```
* rm flags_retain grad in pybind

* retain grads for xpu test

* set retain grad for xpu

* rm flag

* lint

---------
Co-authored-by: wanghuancoder <wanghuan29@baidu.com>
```
  9c3a35b9
30 Jan, 2023 8 commits

[CINN] fix build_cinn_pass collect inplace var bug (#50072) · ac84dce9
Committed by jiangcheng on Jan 30, 2023

ac84dce9

Fix null pointer of case 2 paddle.linalg.lu_unpack (#49976) · 6f8ec229

Committed by RedContritio on Jan 30, 2023

* add pivots type check and fix batchsize error

* add unittest for batchsize = 0

* fix nullptr in lu_unpack

fix batchsize error in LU_Unpack
add nullptr check in OneFunctor

* remove exception in device code

6f8ec229

[Divide by 0 Error] add pinv check (#49951) · f6e874bc

Committed by Ryan on Jan 30, 2023

* add pinv check

* add unitest

* update unitest

* roll back

* fix not call stupid bug

* use context

f6e874bc

add phi tensor vector array api from fluid (#49885) · 094e3b8c
Committed by engineer1109 on Jan 30, 2023
```
replace all TensorFromVector & TensorToVector

AssignKernel async copy
```
094e3b8c

Support stream priority for standalone executor (#49939) · 172d1de6

Committed by Ruibiao Chen on Jan 30, 2023

* Support stream priority for standalone executor

* Fix compile error

* Fix compile error

* Fix compile error

* Fix compile error

* Fix compile error

172d1de6

[Pglbox2.0] merge gpugraph to develop (#49946) · cb525d4e

Committed by zmxdream on Jan 30, 2023

* add set slot_num for psgpuwraper (#177)

* add set slot_num_for_pull_feature for psgpuwarper

* Add get_epoch_finish python interface (#182)

* add get_epoch_finish interface

* add return

* delete return

* add unzip op (#183)

* fix miss key for error dataset (#186)

* fix miss key for error dataset

* fix miss key for error dataset
Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* add excluded_train_pair and infer_node_type (#187)

* support return of degree (#188)

* fix task stuck in barrier (#189)
Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* check node/feature format when loading (#190)

* check node&feature format when loading

* check node&feature format when loading (2)

* degrade log (#191)

* [PGLBOX]fix conflict

* [PGLBOX]fix conflict

* [PGLBOX]replace LodTensor with phi::DenseTensor

* [PGLBOX]fix gpu_primitives.h include path

* [PGLBOX]from platform::PADDLE_CUDA_NUM_THREADS to phi::PADDLE_CUDA_NUM_THREADS

* [PGLBOX]fix unzip example code

* [PGLBOX]fix unzip example code

* [PGLBOX]fix unzip example code

* [PGLBOX]fix unzip example code

* [PGLBOX]fix unzip ut

* [PGLBOX]fix unzip ut

* [PGLBOX]fix code style

* [PGLBOX]fix code style

* [PGLBOX]fix code style

* fix code style

* fix code style

* fix unzip ut

* fix unzip ut

* fix unzip ut

* fix unzip

* fix code stype

* add ut

* add c++ ut & fix train_mode_ set

* fix load into memory

* fix c++ ut

* fix c++ ut

* fix c++ ut

* fix c++ ut

* fix code style

* fix collective

* fix unzip_op.cc

* fix barrier

* fix code style

* fix barrier

* fix barrier

* fix code styple

* fix unzip

* add unzip.py

* add unzip.py

* fix unzip.py

---------
Co-authored-by: chao9527 <33347532+chao9527@users.noreply.github.com>
Co-authored-by: Siming Dai <908660116@qq.com>
Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com>
Co-authored-by: yangjunchao <yangjunchao@baidu.com>

cb525d4e

Add a cudnn version check to the logic that maps depthwise_conv to conv (#50058) · 320958eb
Committed by gem5 on Jan 30, 2023

320958eb

make FLAGS_gemm_use_half_precision_compute_type=false by default (#50050) · 964cd660
Committed by sneaxiy on Jan 30, 2023
```
* make FLAGS_gemm_use_half_precision_compute_type=false defaultly

* fix comments
```
964cd660

29 Jan, 2023 8 commits
- J
  
  [CINN] BuildCinnPass collect inplace var from all cluster instead op (#50057) · 6d13992e
  Committed by jiangcheng on Jan 29, 2023
  
  6d13992e
- Z
  
  refine code (#50053) · f8557cd9
  Committed by zhangbo9674 on Jan 29, 2023
  
  f8557cd9
- S
  
  update latest ps.proto (#50054) · 3da73f8f
  Committed by sneaxiy on Jan 29, 2023
  
  3da73f8f
- S
  Add the missing ps.proto and remove ps_pb2.py (#50040) · ba67361b
  Committed by sneaxiy on Jan 29, 2023
```
* add missing proto file

* fix windows ci

* fix ci compile error
```
  ba67361b
- R
  [CustomDevice] registering feed_dense_tensor, feed_sparse_coo_tensor,... · 50d92531
  Committed by ronnywang on Jan 29, 2023
```
[CustomDevice] registering feed_dense_tensor, feed_sparse_coo_tensor, feed_strings kernels for custom device (#50042)

* [CustomDevice] registering feed_dense_tensor, feed_sparse_coo_tensor, feed_strings kernels for custom device

* update

* update

* update
```
  50d92531
- L
  [FleetExecutor] Remove max_slot_num and implement multi-scope fetch (#50041) · decbb588
  Committed by LiYuRio on Jan 29, 2023
```
* remove max_slot_num

* fix test case
```
  decbb588
- J
  [CINN] collect inplace var into cinn op desc's kInplaceVarNames attribute (#49898) · bad49b51
  Committed by jiangcheng on Jan 29, 2023
```
* [CINN] collect inplace var into cinn op desc's kInplaceVarNames attribute

* attr move from op desc to subgraph

* GetFetchIds from var_map instead of var_model_to_program_map_
```
  bad49b51
- Y
  
  Fused attention pass backward pattern (#49855) · 8e02f290
  Committed by Yuang Liu on Jan 29, 2023
  
  8e02f290
28 Jan, 2023 1 commit
- L
  
  add cond interceptor (#50019) · b2706b0c
  Committed by LiYuRio on Jan 28, 2023
  
  b2706b0c
25 Jan, 2023 1 commit
- L
  remove useless kTranspose enum element (#38660) · f43cb3b7
  Committed by limingshu on Jan 25, 2023
```
Co-authored-by: zhangbopd <1299246947@qq.com>
```
  f43cb3b7
20 Jan, 2023 4 commits
- J
  
  【Prim】Refactor prim flags system (#49930) · 23d20e30
  Committed by Jiabin Yang on Jan 20, 2023
  
  23d20e30
- J
  Fix for bad_alloc in oneDNN matmul_grad kernel (#48593) · 44855da3
  Committed by jakpiase on Jan 20, 2023
```
* fix for matmul_grad

* another fix for matmul_grad

* fix
```
  44855da3
- S
  
  add unique support zero dim (#49260) · ee4e5323
  Committed by sprouteer on Jan 20, 2023
  
  ee4e5323
- J
  [KUNLUN] update xccl lib & use native Reduce in dygraph (#49941) · 073f7ced
  Committed by jameszhang on Jan 20, 2023
```
* update xccl lib & use native Reduce in dygraph

* minor
```
  073f7ced
19 Jan, 2023 5 commits

add test for zero dimensional tensor for real, imag, angle, conj, as_real and sequence_pad (#49921) · 64b3f2f6
Committed by Feiyu Chan on Jan 19, 2023

64b3f2f6

Fix paddle.queeze_ bug (#49903) · 11e34ae0

Committed by heliqi on Jan 19, 2023

* fix queeze_ bug

* fix slove use squeeze_kernel

* fix slove use squeeze_kernel

* fix slove use squeeze_kernel

* add test case

11e34ae0

【prim】Modify dygraph code_gen , add set_output (#49918) · 22b5241f
Committed by xiaoguoguo626807 on Jan 19, 2023
```
* modify name

* merge develop

* fix param

* fix exp gen bug

* fix sum_grad

* comment
```
22b5241f

[KUNLUN] add op: maxpool_with_index (#49505) · f71f77e9

Committed by jameszhang on Jan 19, 2023

* [KUNLUN] add op: maxpool_with_index

* use DeviceContext::Alloc() instead of DenseTensor::mutable_data()

* fix file format

* solve clip unittest failure

* minor fix

* Revert "solve clip unittest failure" since the issue is fixed
in #49535

This reverts commit 1127adc66e79afe35ac3c00bb34e6aaa7cd7d78b.

* align with xdnn on the definition of mask in max_pool_with_index

* minor

f71f77e9

[Paddle Inference]Support PaddlePaddle Backend on Triton (#49758) · e3f39833
Committed by heliqi on Jan 19, 2023
```
* support PaddlePaddle Backend on Triton

* fix test cases

* fix Codestyle

* add test case

* add test case
```
e3f39833

18 Jan, 2023 2 commits

Handle repetitive code in oneDNN activation fuse passes (#49824) · a1b2e1e2

Committed by Sławomir Siwek on Jan 18, 2023

* extract fuse pass logic to header file

* adjust namespaces

* Update paddle/fluid/framework/ir/mkldnn/activation_onednn_fuse_pass.h

update date
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* add inline remove static
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

a1b2e1e2

Add align check for Concat Kernel (#49761) · 24379442
Committed by MarDino on Jan 18, 2023
```
* add align check

* refine
```
24379442

BaiXuePrincess / Paddle is in sync with the project it was forked from