Commits · 788be26d31b67b8bf786dc0bca4b132319e11ffb · PaddlePaddle / Paddle

13 Jul, 2023 6 commits
- add phi operator c_concat and ut (#55320) · 788be26d
  Committed by lil-Xing on Jul 13, 2023
```
* add phi operator c_concat and ut

* update create_var use

* update copyright
```
- Move compare_raw_kernel to legacy (#53928) · 1dd8770a
  Committed by zhangyuqin1998 on Jul 13, 2023
```
* Move compare_raw_kernel to legacy

* fix

* Update compare_kernel.cc

* Move compare_raw_kernel to legacy
```
- fix roi_align roi_pool to static num 0 (#55342) · 0a21836d
  Committed by Feng Ni on Jul 13, 2023
- fix conv_fusion in multi thread. (#55374) · ceb83562
  Committed by Wilber on Jul 13, 2023
- Add matmul_int8 op (#55228) · 27cc0df5
  Committed by RichardWooSJTU on Jul 13, 2023
```
* add matmul int8
```
- Modify bf16 and fix the elementwise_max (#54799) · 6f7ceca0
  Committed by Qi Shao on Jul 13, 2023
```
* modify the accuracy checking framework of bf16 optest, including both of forward and backward
```
12 Jul, 2023 3 commits

Fix llm int8 build error (#55338) · 006bd959

Committed by FormlessUnit on Jul 12, 2023

* add macro to avoid llm.int8 build error

* fix ci

---------
Co-authored-by: wufeisheng <wfs1997@163.com>

[ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7

Committed by YangQun on Jul 12, 2023

* squash pick the poc code
* fix build after rebase
* fix int8 conv and fc uts
* Fix and clean-up Get_SRC_Scale_Memory
* fix floating point fc uts
* fix test_analyzer_int8_googlenet
* test_analyzer_int8_mobilenetv1
* fix int8 mobilenet v2 and v3
* fix build error after rebase
* [oneDNN] rename library version
* fix conv bias datatype
* try to fix import error
* fix rebase error
* [oneDNN] pack library into python wheel
* add MKLDNN_SHARED_LIB_3 to env_dict
* fix test_analyzer_bert
* fix fill_constant op kernel
* fix ernie and matmul op ut
* fix softplus ut
* fix conv+relu6 fusion ut
* fix hardswish fusion
* fix quant+transpose fusion ut
* fixsgd ut
* fix int8 matmul with flatten
* fix fc+scale fusion
* fix conv/matmul+gelu fusion uts
* fix rebase error
* Revert "fix conv/matmul+gelu fusion uts"
This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
* upgrade to onednn v3.1
* remove older version onednn
* use densetensor::data() for achieving mean and var in layernorm impl
* comments for atol of integer tests
* fix clang-format
* Revert "remove older version onednn"
This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
* improve binary handle
* fix expand kernel
* Revert "use densetensor::data() for achieving mean and var in layernorm impl"
* always use forward_inference for conv
* remove activation scales
* rollback changes to mkldnn.cmake
* address comments
* port changes to dequantize kernel
* fix merge error
* fix fused_elementwise_kernel
* upgrade onednn version to v3.1.1
* fix some approval error
* fix error msg format
* remove old onednn libs
* try to fix symbolic link issue
* fix cinn test case segfault
* do not explicit link test with onednn
* remove unnecessary changes
* integrate CINN with onednn v3
* link with mkldnn project
* fix cinn build file

---------
Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
Co-authored-by: Chen, Xinyu1 <xinyu1.chen@intel.com>
Co-authored-by: tianshuo78520a <707759223@qq.com>

[clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7

Committed by Wang Xin on Jul 12, 2023

* [clang-tidy] enable readability-container-size-empty check

* fix test_custom_kernel Failed

* add clang-tid-10 in dockerfile

* add clang-tidy in dockerfile

* fix bug

11 Jul, 2023 3 commits
- [ROCM] reduce build log (#55097) · a1396a80
  Committed by ronnywang on Jul 11, 2023
- Integrate rmsnorm kernel (#54998) · 97d3d6ee
  Committed by MarDino on Jul 11, 2023
```
* add rmsnorm kernel
* add static graph test
* fix round type
* use alignas to avoid msvc compile error
* remove redundant headerfile to avoid rocm compile error
* fix rocm compile not found cub
* Add document
```
- Linear compress (#55128) · f4290a92
  Committed by FormlessUnit on Jul 11, 2023
```
* rename weight_only/llm.int8
```
07 Jul, 2023 3 commits
- [fix] move exception throw out of omp parallel for loop (#55064) · 9ed8bafd
  Committed by xiaoye on Jul 07, 2023
- [XPU] Add layernorm fuse pass (#55154) · eb12739e
  Committed by wz1qqx on Jul 07, 2023
- fix index_put bug when index is multi-dim bool tensor (#55191) · 4543ca91
  Committed by 傅剑寒 on Jul 07, 2023
```
* fix index_put bug when index is multi-dim bool tensor

* fix name error
```
06 Jul, 2023 3 commits
- Fix bugs of dropout and dropout_grad. test=kunlun. (#55184) · 0e67fb63
  Committed by Leo Guo on Jul 06, 2023
```
* Fix bugs of dropout and dropout_grad. test=kunlun

* Modify the code style of dropout_grad_kernel. test=kunlun
```
- fix workspace size (#54884) · 71568672
  Committed by Shijie on Jul 06, 2023
- [XPU] speed up for special case of strided_slice op. (#55166) · 2ff949da
  Committed by houj04 on Jul 06, 2023
05 Jul, 2023 2 commits
- masked select x and mask support broadcast (#54776) · 413d1abf
  Committed by Hui Zhang on Jul 05, 2023
```
* masked select forward support broadcast

* cpu forward and backward

* gpu support mask broadcast

* fix comment

* x support broadcast

* fix comment
```
- [sparse] Add backend conv2d support (#54707) · 3e3f5d90
  Committed by LUZY0726 on Jul 05, 2023
04 Jul, 2023 2 commits
- [XPU] Add XPU plugin support (#55101) · 6d5d9f23
  Committed by hong19860320 on Jul 04, 2023
```
* Add XPU plugin to support the customized ops or improve the performance of the fusion ops based on hand-written xpu micro kernels.

* refine README.md
```
- [CustomDevice] refine set_constant_with_place by calling full kernel (#55089) · 07b83f2e
  Committed by ronnywang on Jul 04, 2023
03 Jul, 2023 5 commits
- [XPU] Fix the topk, set_value ops that using temporary tensors avoiding the... · cc2059a0
  Committed by jiangfan06 on Jul 03, 2023
```
[XPU] Fix the topk, set_value ops that using temporary tensors avoiding the memory overlaps during multi-stream inference (#54851)
```
- [CustomDevice] release device manager in py::atexit (#54932) · e5725680
  Committed by ronnywang on Jul 03, 2023
```
* [CustomDevice] release device manager in py::atexit

* fix hip_version macro

* update

* update
```
- 【PaddlePaddle Hackathon 4】No.63 : add lerp bf16 support (#53078) · ce31a72e
  Committed by LoneRanger on Jul 03, 2023
```
* add lerp bf16 support

* fix bug

* Update test_lerp_op.py

modify the input dtype

* modify the test_lerp_op.py

* Update test_lerp_op.py

* fix bug of import

* add user_defined_grads

* Update test_lerp_op.py

* fix bug of grad

* fix bug of grad

* fix bug of grad

* add the check for bfloat16 dtype
```
- add linear_compress API (#54140) · c4d5ec66
  Committed by FormlessUnit on Jul 03, 2023
```
* add linear_compress API
```
- Update the rope op according to the comments (#54985) · 2401d48d
  Committed by niuliling123 on Jul 03, 2023
02 Jul, 2023 1 commit
- Fix fetch op and null type bug (#55027) · a20051cd
  Committed by hong on Jul 02, 2023
```
* fix_fetch_op_and_null_type_bug

* fix compile bug

* add test case
```
30 Jun, 2023 1 commit
- [XPU] Add conv2d transpose fuse pass (#54904) · 12c15b89
  Committed by mjp9527 on Jun 30, 2023
29 Jun, 2023 3 commits
- Fix compiling on XPU related to MPTypeTrait. (#54924) · 7353e9e9
  Committed by Yiqun Liu on Jun 29, 2023
```
* Fix compiling on XPU related to MPTypeTrait.

* Unify the use of MPTypeTrait.

* Fix compiling error.
```
- Add fused_rope forward op (#54351) · a215c46a
  Committed by niuliling123 on Jun 29, 2023
```
* style

* more

* update ctest

* Update legacy_backward.yaml

* Update legacy_ops.yaml

* Update legacy_ops.yaml

* update

* update

* update for move
```
- [XPU] fix layer_norm_grad bug when bias_grad and scale_grad are nullptr (#54669) · 55b974e7
  Committed by haosicheng on Jun 29, 2023
28 Jun, 2023 4 commits
- [XPU][PHI Kernels] add int_with_ll quantization for conv kernels (#54827) · bd67209f
  Committed by lijin23 on Jun 28, 2023
```
* add int_with_ll to conv

* fix bugs when output_size is specified for conv2d_transpose
```
- [BugFix] Fix bug for binary_cross_entropy_with_logits loss (#54869) · bb42d870
  Committed by Siming Dai on Jun 28, 2023
```
* add pos_weight in kernel

* fix unittest

* fix xpu

* fix bce unittest, change infermeta order
```
- [ROCM] fix cupti, rccl on rocm (#54807) · 57da105c
  Committed by ronnywang on Jun 28, 2023
```
* [ROCM] fix cupti, hipcub

* update

* update
```
- Support 0-D Tensor for check_numerics_kernel. (#54868) · b7fbd339
  Committed by Yiqun Liu on Jun 28, 2023
27 Jun, 2023 2 commits
- delete swish_raw (#54536) · 0cdaafea
  Committed by zhangyuqin1998 on Jun 27, 2023
```
* delete swish_raw

* fix

* Update activation_kernel.cc

* fix
```
- add all_to_all phi operator (#54797) · 158b7ae5
  Committed by TaoTao Li on Jun 27, 2023
```
* add all_to_all phi operator, kernel, api

* add all_to_all ut

* tinyfix
```
26 Jun, 2023 2 commits

exclude xpu (#54848) · 6962d3e2

Committed by pangengzheng on Jun 26, 2023

remove ops from OpsWithFluidKernelNeedMoveToPhi set (#54007) · 733eca85

Committed by Sonder on Jun 26, 2023

* remove ops from OpsWithFluidKernelNeedMoveToPhi set

* open static build flag

* OpsWithFluidKernelNeedMoveToPhi

* open new_executor_static_build

* add infermate for cudnn_lstm

* fix

* update

* fix

* update

* update

* update

* fix pow2 decay

* fix pow2 decay

* recover analysis_predictor.cc

* fix pow2 decay

* fix cudnn lstm

* add output register info for svd

* fix pow2_decay_with_linear_warmup_kernel

* recover test lstm cudnn

* recover svg register codes

* fix register info

* fix reduce sum register info

* add output info for adadelta

* add output info for adadelta

* add output info for adamax

* fix complex abs register info

* add register info for cudnn_lstm_grad

* recover

* fix lstm cudnn

* fix

* fix xpu output registe info

* remove std::cout

* add backend

* remove output info in pow2_decay_with_linear_warmup_kernel

* add judgment in TensorShouldBeFakeInitialized

* recover power_

* close new_executor_static_build

* fix set_value_xpu

PaddlePaddle / Paddle synced successfully about 1 year ago