Commits · 382e9a065ad395bcd377699beea200008edc1444 · PaddlePaddle / Paddle

30 Jan, 2023 · 2 commits
- Add a cuDNN version check to the logic that maps depthwise_conv to conv (#50058) · 320958eb
  Committed by gem5 on Jan 30, 2023
- make FLAGS_gemm_use_half_precision_compute_type=false by default (#50050) · 964cd660
  Committed by sneaxiy on Jan 30, 2023
```
* make FLAGS_gemm_use_half_precision_compute_type=false defaultly

* fix comments
```
29 Jan, 2023 · 8 commits
- [CINN] BuildCinnPass collect inplace var from all cluster instead op (#50057) · 6d13992e
  Committed by jiangcheng on Jan 29, 2023
- refine code (#50053) · f8557cd9
  Committed by zhangbo9674 on Jan 29, 2023
- update latest ps.proto (#50054) · 3da73f8f
  Committed by sneaxiy on Jan 29, 2023
- Add the missing ps.proto and remove ps_pb2.py (#50040) · ba67361b
  Committed by sneaxiy on Jan 29, 2023
```
* add missing proto file

* fix windows ci

* fix ci compile error
```
- [CustomDevice] registering feed_dense_tensor, feed_sparse_coo_tensor,... · 50d92531
  Committed by ronnywang on Jan 29, 2023
```
[CustomDevice] registering feed_dense_tensor, feed_sparse_coo_tensor, feed_strings kernels for custom device (#50042)

* [CustomDevice] registering feed_dense_tensor, feed_sparse_coo_tensor, feed_strings kernels for custom device

* update

* update

* update
```
- [FleetExecutor] Remove max_slot_num and implement multi-scope fetch (#50041) · decbb588
  Committed by LiYuRio on Jan 29, 2023
```
* remove max_slot_num

* fix test case
```
- [CINN] collect inplace var into cinn op desc's kInplaceVarNames attribute (#49898) · bad49b51
  Committed by jiangcheng on Jan 29, 2023
```
* [CINN] collect inplace var into cinn op desc's kInplaceVarNames attribute

* attr move from op desc to subgraph

* GetFetchIds from var_map instead of var_model_to_program_map_
```
- Fused attention pass backward pattern (#49855) · 8e02f290
  Committed by Yuang Liu on Jan 29, 2023
28 Jan, 2023 · 1 commit
- add cond interceptor (#50019) · b2706b0c
  Committed by LiYuRio on Jan 28, 2023
25 Jan, 2023 · 1 commit
- remove useless kTranspose enum element (#38660) · f43cb3b7
  Committed by limingshu on Jan 25, 2023
```
Co-authored-by: zhangbopd <1299246947@qq.com>
```
20 Jan, 2023 · 4 commits
- 【Prim】Refactor prim flags system (#49930) · 23d20e30
  Committed by Jiabin Yang on Jan 20, 2023
- Fix for bad_alloc in oneDNN matmul_grad kernel (#48593) · 44855da3
  Committed by jakpiase on Jan 20, 2023
```
* fix for matmul_grad

* another fix for matmul_grad

* fix
```
- add unique support zero dim (#49260) · ee4e5323
  Committed by sprouteer on Jan 20, 2023
- [KUNLUN] update xccl lib & use native Reduce in dygraph (#49941) · 073f7ced
  Committed by jameszhang on Jan 20, 2023
```
* update xccl lib & use native Reduce in dygraph

* minor
```
19 Jan, 2023 · 5 commits

add test for zero dimensional tensor for real, imag, angle, conj, as_real and sequence_pad (#49921) · 64b3f2f6
Committed by Feiyu Chan on Jan 19, 2023

Fix paddle.queeze_ bug (#49903) · 11e34ae0

Committed by heliqi on Jan 19, 2023

* fix queeze_ bug

* fix slove use squeeze_kernel

* fix slove use squeeze_kernel

* fix slove use squeeze_kernel

* add test case

【prim】Modify dygraph code_gen , add set_output (#49918) · 22b5241f
Committed by xiaoguoguo626807 on Jan 19, 2023
```
* modify name

* merge develop

* fix param

* fix exp gen bug

* fix sum_grad

* comment
```

[KUNLUN] add op: maxpool_with_index (#49505) · f71f77e9

Committed by jameszhang on Jan 19, 2023

* [KUNLUN] add op: maxpool_with_index

* use DeviceContext::Alloc() instead of DenseTensor::mutable_data()

* fix file format

* solve clip unittest failure

* minor fix

* Revert "solve clip unittest failure" since the issue is fixed in #49535

This reverts commit 1127adc66e79afe35ac3c00bb34e6aaa7cd7d78b.

* align with xdnn on the definition of mask in max_pool_with_index

* minor

[Paddle Inference]Support PaddlePaddle Backend on Triton (#49758) · e3f39833
Committed by heliqi on Jan 19, 2023
```
* support PaddlePaddle Backend on Triton

* fix test cases

* fix Codestyle

* add test case

* add test case
```

18 Jan, 2023 · 11 commits

Handle repetitive code in oneDNN activation fuse passes (#49824) · a1b2e1e2

Committed by Sławomir Siwek on Jan 18, 2023

* extract fuse pass logic to header file

* adjust namespaces

* Update paddle/fluid/framework/ir/mkldnn/activation_onednn_fuse_pass.h

update date
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

* add inline remove static
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>

Add align check for Concat Kernel (#49761) · 24379442
Committed by MarDino on Jan 18, 2023
```
* add align check

* refine
```
fix cast issue (#49909) · 55ccb429
Committed by wenbin on Jan 18, 2023
```
* fix cast issue

* add ut
```
[Zero-Dim] support input 0D for paddle.moveaxis / quantile (#49813) · 26140ec8
Committed by zhouweiwei2014 on Jan 18, 2023
```
* [Zero-Dim] support input 0D for paddle.moveaxis/quantile

* fix CI
```

[PHI] remove bitwise and, or, xor (#49916) · 9056cc8b

Committed by RuohengMa on Jan 18, 2023

* add reduce_sum_int64 and reduce_sum_int8 xpu kernels

* [PHI] add clip grad kernel with support type float32 and int32

* [PHI unittest] add clip_grad unit test

* adapt code to clang-format

* update xpu api output with clip_grad api

* remove int8 support of reduce_sum xpu kernel since it can not pass unit tests

* adapt license date, add code for XPUDataType convertion

* add int8 support of reduce_sum

* add reduce_sum unit tests for dtype int64, int8, and add more test cases

* update license date

* remove buggy bitwise and, or and xor xpu kernels, refine bitwise not xpu kernel

* change license date


[XPU] add logical_not op. (#49911) · 60d1199a
Committed by houj04 on Jan 18, 2023


kunlun support p2p send/recv (#49896) · 7242f40b
Committed by jameszhang on Jan 18, 2023

[0 Tensor support] support the 0d tensor for the cumsum (#49518) · 5fca45ea

Committed by wawltor on Jan 18, 2023

* Add the cumsum 0d tensor

* xpu and cpu judge the 0d  tensor

* change to 2022 to 2023 in new commit

* fix the reverse logic


[Zero-Dim] Fix bug in masked_select for XPU (#49904) · 1a8be158
Committed by Zhang Zheng on Jan 18, 2023


fix cinn compilation with py38 (#49883) · bc93452d
Committed by Leo Chen on Jan 18, 2023

use default XPU stream for computing (#49806) · f6b23d6d

Committed by jameszhang on Jan 18, 2023

* revert to use default XPU stream for computing

XPUContext now has a null stream by default. If you want to use a separate stream (e.g. in async collective communication), you should create a dedicated XPUContext and invoke its XPUContext::CreateStream()

* minor

17 Jan, 2023 · 8 commits

Add more dy2st ut2 (#49881) · 2242136a
Committed by Jiabin Yang on Jan 17, 2023
```
* add test for composite with dy2st

* add more log
```

Refine munmap freq for RefcountedMemoryMapAllocation (#49691) · 3fdc105f

Committed by zhangbo9674 on Jan 17, 2023

* refine munmap freq for ref_cnt_mmap_allocator

* add shm reuse logic

* fix compile bug

* fix compile bug

* fix bug of file refcount

* fix compile bug

* fix compile bug

* refine code for delete shm case

* polish code

* refine shm cache pool size setting logic

* set buffer is 2

* refine shm cache size logic

* refine max shm cache

* refine shm cache size

Rewrite mat reshape transpose testers (#49580) · d9d47dc6

Committed by Paulina Gacek on Jan 17, 2023

* reshape_transpose_matmul_pass_tester rewritten

* matmul_transpose_reshape_pass_tester rewritten

* mkldnn to onednn

[Zero-Dim] support input 0D Tensor for equal_all (#49845) · f287b1e9
Committed by yeliang2258 on Jan 17, 2023
```
* add zero dims test

* update code

* fix zero dims

* update code
```

support CUDA Graph for new executor (#49708) · 8e5ed04d

Committed by pangyoki on Jan 17, 2023

* new exe supports CUDA Graph

* fix

* fix

* fix

* fix FLAGS_use_stream_safe_cuda_allocator in unittest

* insert output of coalesce_tensor op to skip_gc_var

* fix

Prim api gen (#49654) · 813e27c9

Committed by xiaoguoguo626807 on Jan 17, 2023

* proto type of composite grad in paddle

* proto type of composite grad in paddle

* refactor composite api with phi

* fix compile error

* support static graph code-gen for squeeze op

* generate static graph code of unsqueeze

* refine op name

* fix compile error

* add extra output in op_compat

* remove debug log

* fix clang compile error

* support prim switch flag

* support prim switch flag

* fix dygraph error

* merge develop

* add code_gen

* add necessary files without codegen

* fix code_gen bug

* add deps

* modify igmnore

* add ignore

* delete std cout

* add composite logic for backward.py

* add tanh first order grad composite

* support enable_prim flag for static graph

* throw expection when both GrapOpMaker and GradCompOpMaker not been registered

* reorganize the directory of prim api tests

* fix windows error

* add eager_utils

* add eager_utils

* modify code gen

* add composite parse

* add unittest for get_grad_op_desc

* code optimize

* fix static test on windows

* support generate static graph code for imag and real op

* fix windows compile error in test_static_prim

* merge develop

* disable test eager in inference

* prim code gen

* disable eager compile in inference

* origin_yaml codegen success

* rm other file

* rm gitignore file

* code_style

* add eager test

* code_style

* clear #

* merge develop

* clear #

* remove useless files

* modify static test

* support bool flag from singlton

* merge develop

* recover git ignore

* fix conflict

* clear prim_gen

* recover git ignore for generated op

* parse_yaml success

* fix test compile error

* remove some tests

* add python test

* code_style

* revert parse_utils+ clear prim_gen

* fix some name issue

* add composite code gen

* modify backward yaml

* fix static composite grad maker code gen

* remove addtional files

* add some static funcs unit test

* fix some bugs

* fix composite grad maker register code gen

* optimize some functions

* modify gen cmake

* add more api gen

* add header

* modify static

* add static expand unsqueeze

* comments

* modify compopmaker

* revert

* modify gen name
Co-authored-by: JiabinYang <360788950@qq.com>
Co-authored-by: zyfncg <zhangyunfei07@baidu.com>
Co-authored-by: cxxly <chenxx_id@163.com>
Co-authored-by: charles-hit <wanghao107@baidu.com>

[PHI]Change feed_op to phi kernel (#49116) · f7f1dc03

Committed by YuanRisheng on Jan 17, 2023

* change feed_op to phi kernel

* fix ci bugs

* fix build bugs

* fix ci bugs

* fix compile bugs

* fix ci bugs

* perfect code

* perfect comment code

* fix install bugs

* modify code according comment

* remove visitor in feed_op

* modify according comment

* perfect code according comment

* add infershape

* fix py3 bugs

* fix getexpected kernel type

* fix getexpected kernel type

* fix ci bugs

* add registry for custom device

* fix py3 bugs

* fix floating point error

* fix py3 test bugs


add test for composite with dy2st (#49873) · b927ce81
Committed by Jiabin Yang on Jan 17, 2023

PaddlePaddle / Paddle · synced successfully about 1 year ago