Commits · 772be4f50759ad504d0b23af0a42cb68bb42f9f7 · BaiXuePrincess / Paddle

06 Feb, 2022 · 1 commit
- [PTEN] Add Gpu context (#39305) · a821c4a9
  Committed by Wilber on Feb 06, 2022
28 Jan, 2022 · 1 commit
- compile fix (#39272) · 91dd0f0d
  Committed by wenbin on Jan 28, 2022
```
* slice

* shuffle pass enhancement
```
27 Jan, 2022 · 4 commits

[PluggableDevice] Add custom kernel support based on pten kernel management (#38848) · a8879215

Committed by Aganlengzi on Jan 27, 2022

* [Demo] custom kernel based on pten kernel

* merge and npu custom work well

* del comments

* delete other code

* fix CUDAContext

* fix not found small_vector.h

* support NPU

* fix NPUContext

* fix DeviceContext support

* add UT

* fix call

* add UT

* fix

* fix for comments and ut

* add MACRO control

* fix multi input output

* support env CUSTOM_DEVICE_ROOT

* deal with special cases

* fix for Windows

* try coverage with test_custom_kernel_dot.py

* fix test_custom_kernel_dot

* fix test_custom_kernel_dot

* fix merge

* fix merge

* fix CI

* update

* merge and fix

* remove WITH_CUSTOM_KERNEL

* fix merge

* merge and fix

* fix ut

* fix ut for mac

* add more UT

* add more UT

* fix

fix shuffle_channel_detect_pass (#39242) · af9ddeb7
Committed by wenbin on Jan 27, 2022
```
* shuffle channel pass

* add ut

* timeout fix

* makefile fix
```

fix the c api unit test failed in windows. test=develop (#39244) · 9b79988c
Committed by 王明冬 on Jan 27, 2022

[Paddle-Inference]: fix concat slice (#39096) · f080e8d5

Committed by Wangzheee on Jan 27, 2022

* Paddle-Inference:fix_concat_slice

* Paddle-Inference:fix_concat_slice

* Paddle-Inference:fix_concat_slice

* Paddle-Inference:fix_concat_slice

* [Paddle-Inference]: fix concat slice

* [Paddle-Inference]: fix concat slice

* [Paddle-Inference]: fix concat slice

26 Jan, 2022 · 2 commits

[pten] remove deprecated fluid op kernel for pten (#38842) · 3ab9aef1

Committed by Leo Chen on Jan 26, 2022

* update cmake file to remove fluid kernel

* add pten declaration.h to where pybind.h used

* fix sync_bn and tensorrt_engine

* refine detection_library

* fix interpreter_core

* support eager legacy

* fit eager legacy for pten

* fall back to cpu if not found kernel

* fix compile problem

* fix compile problem

* refine fallback logic

* fit operator.run()

* fix xpu compile

* fit for new_exec

* add REGISTER_OP_WITHOUT_GRADIENT

* un-cache pt_kernel_context

* fix compile

* fix cudnn

* fix compiling with on_infer

* fix mkldnn

* fix isfinite_v2

* fix xpu problem

* fix op_device

* refine fallback for xpu

* fix xpu compile

* merge develop

* refine code format

* fix compile

* fix compile

* add data_transfer

* fix PreparePtenData

* fix cpu context

* merge develop

* fix compile

* fix error device context

* fix xpu

* fix dev_ctx

support npu weight unified H2D copy before inference (#39160) · 106b5514
Committed by baoachun on Jan 26, 2022
```
* support npu weight unified H2D copy

* remove redundant variable
```

25 Jan, 2022 · 3 commits

[inference] update trt convert reduce op&ut,test=develop (#39088) · 80753755

Committed by Zhang Jun on Jan 25, 2022

* [inference] update convert reduce op&ut,test=develop

* update

* update

* update

* add int32 support

* add int32 support

* add comments

* trt < 7.0 do not support int32

* test=develop

* update

* test=develop

[Move selected_rows PR #3] Change the relationship of [include/Cmake]. (#39128) · 2bafd338

Committed by Weilong Wu on Jan 25, 2022

* Added selected_rows and rw_lock to pten

* Renamed the unit test target to fix CI

* Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid

* Remove rw_lock.h,rw_lock_test.cc in fluid

* Use pten::RWLock and pten::AutoRDLock, fix CI

* Use pten::SelectedRows

* Use pten::SelectedRows

* Fix to pass NPU CI

* Use pten::SelectedRows, to pass NPU CI

* To fix NPU CI

* To fix NPU CI again

[PTen] Migrate string tinyformat errors and part of enforce into pten (#39051) · 6ca49164

Committed by xiongkun on Jan 25, 2022

* transfer: string tinyformat errors and part of enforce into pten

* remove comment

* fix by code review

* assert is not compile in -DNDEBUG

* add string as dependences of paddle_inference

24 Jan, 2022 · 1 commit

[PTEN] Move dynload from fluid to pten. (#39120) · 3c1dc6f6

Committed by Wilber on Jan 24, 2022

* move dynload from fluid to pten.

* fix ci compile

* fix windows ci compile.

* update

* update

* fix compile error

18 Jan, 2022 · 4 commits

Mish FP32/BF16 kernel, conv and fc fuse passes (#38623) · 1d18bc2c

Committed by Sławomir Siwek on Jan 18, 2022

* Mish

* Change exp() library

* mish fuse pass

* mish attrs

* fixes

* mishop maker

* remove attrs

* mish kernal for bf16

* fc+mish fuse

* fix code format error

* Resolve merge conflicts

* Update mish operator version

* update mish variable to new naming convention

[Unify Tensors PR #8] Merged Tensor into DenseTensor, test=allcases (#38914) · 2052f1e3

Committed by Zhanlue Yang on Jan 18, 2022

* Merged LoDTensor with Tensor,test=allcases

* Patched python level LoDTensor

* Patched python level LoDTensor

* Merge Tensor into DenseTensor

* Fixed namespace issues,test=allcases

* Fixed merge issues

* Fixed inference issues

* Fixed NPU test issues

* Fixed merge issues

fix trt convert conv2d skip (#38999) · dfa242e4
Committed by JingZhuangzhuang on Jan 18, 2022
```
* fix trt convert conv2d skip

* fix trt convert conv2d skip
```

modify transpose params check (#39006) · 27f8460a
Committed by wenbin on Jan 18, 2022
```
* modify params check

* correct compile
```

17 Jan, 2022 · 2 commits

disable unsupported trt dimension (#38962) · 55e9087f
Committed by wenbin on Jan 17, 2022
```
* develop test

* throw

* ne

* wrong cnt
```

[Pten] Replace platform::Place to pten::Place. (#38899) · c48a9ad5

Committed by Wilber on Jan 17, 2022

* add pten::Place data structure.

* update ci problem

* fix ci problem

* update

* using platform::Place=pten::Place

* remove BOOST_GET_CONST for CPUPlace and GPUPlace

* compile pass 25%.

* compile pass 45%

* compile pass 60%

* remove boost_get for xpu npu mlu and ipu

* compile pass on cpu and gpu.

* fix compile problem

* fix compile error.

* update

* fix ci problem

* update

* ci approve

* fix ci problem

* fix ci eager test problem

* remove BOOST_GET_CONST

* fix npu compile

15 Jan, 2022 · 1 commit

[Unify Tensors PR #7] Merged LoDTensor with Tensor, test=allcases (#38880) · 88966b28

Committed by Zhanlue Yang on Jan 15, 2022

* Merged LoDTensor with Tensor,test=allcases

* Patched python level LoDTensor

* Fixed example code failure

* Polished function names, removed duplicated forward declarations

14 Jan, 2022 · 1 commit

add flatten_contiguous_range OpConvert for Paddle-TRT (#38922) · 050aa6fe

Committed by heliqi on Jan 14, 2022

* add trt_convert_flatten_contiguous_rang op

* trt version >7,support trt_convert_flatten_contiguous_rang

* trt version >7,support trt_convert_flatten_contiguous_rang

* trt version >7,support trt_convert_flatten_contiguous_rang

* test cast add trt version >=7 skip

13 Jan, 2022 · 3 commits
- [Paddle-Inference] add Paddle Trt config: with_interleaved (#38884) · dccdc719
  Committed by Wangzheee on Jan 13, 2022
```
* add Paddle Trt config: with_interleaved
```
- roi_align aligned supported (#38905) · 08dcea18
  Committed by wenbin on Jan 13, 2022
```
roi_align aligned supported
```
- splits allocation for pten, test=develop (#38853) · 277cf900
  Committed by 石晓伟 on Jan 13, 2022
11 Jan, 2022 · 1 commit
- roi_align fix (#38788) · ffbc2122
  Committed by fengkuangxiaxia on Jan 11, 2022
10 Jan, 2022 · 1 commit

[Unify Tensors PR #5] framework::Tensor inherits from DenseTensor,test=allcases (#38632) · 5c73a6ea

Committed by Zhanlue Yang on Jan 10, 2022

* Added shared_ptr<Allocation> member & corresponding interfaces to Storage

* Removed original pten::Allocation from Storage and adjusted the interfaces accordingly

* Fixed issues with storage offset

* Used place to malloc allocation for TensorStorage

* [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor

* Fixed issues with place

* Added comments

* Moved mutable_data with stream argument to DenseTensor

* Added set_offset interface

* Fixed CI issues,test=allcases

* [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor

* Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor

* Modified framework::Tensor to inherit from DenseTensor

* Reverted changes too pten_layout() interface

* Removed friend classes

* Rearranged cfunction calls from tensor.data<void>() to tensor.data()

* Fixed CI issues

* Fixed lite issues

* Fixed data() interface issues,test=allcases

* Resolved IsInitialized() issues

* Fixed ResetHolder() issues

* Fixed MKLDNN & Storage issues

* Resolved ShareBufferWith() issues

* Fixed LoD issues

06 Jan, 2022 · 1 commit
- nearest_interp_v2 bug fix (#38725) · 9c1167cf
  Committed by wenbin on Jan 06, 2022
```
* bug fix

* remove blank
```
05 Jan, 2022 · 2 commits

inference c_api support std::string (#38667) · f289cf85
Committed by Wilber on Jan 05, 2022
```
* c_api support std::string

* update

* update

* add NOTE

* fix delete error.
```

Quantize nearest_interp and nearest_interp_v2 (#38622) · 1456b02d

Committed by joanna.wozna.intel on Jan 05, 2022

* Quantize nearest_interp and nearest_interp_v2

* Check if avx_core supported

* Add depthwise_conv2d to supported quantization list

04 Jan, 2022 · 1 commit
- plugin terminate should be called by TensorRT (#38374) · ba411960
  Committed by zlsh80826 on Jan 04, 2022
31 Dec, 2021 · 1 commit
- Fix for MKLDNNDeviceContext error in matmul_v2_transpose_reshape fuse pass when GLOG_v set (#38554) · 1d31764e
  Committed by jakpiase on Dec 31, 2021
```
* glog fix

* changed approach
```
30 Dec, 2021 · 2 commits
- dynamic shape clone (#38520) · 339c34e6
  Committed by wenbin on Dec 30, 2021
```
* dynamic shape clone supported
```
- params file will not be a nessary file (#38579) · de26b88b
  Committed by JingZhuangzhuang on Dec 30, 2021
23 Dec, 2021 · 2 commits
- block warning when build demo_ci and infer_ut (#38306) · 3629cd27
  Committed by Sing_chan on Dec 23, 2021
```
* block warning when build demo_ci and infer_ut

* use build pipe line clone to test
```
- Support external stream. (#38373) · 15ad7ee4
  Committed by Wilber on Dec 23, 2021
```
* support external stream.

* update

* update

* update
```
20 Dec, 2021 · 3 commits

add gelu pbtxt for conv+gelu mkldnn fuse pass (#38162) · 1b7f6ae9
Committed by baoachun on Dec 20, 2021

Support FP16 for more ops (#38123) · 1f445bf3

Committed by sneaxiy on Dec 20, 2021

* support FP16 for more ops

* add amp list tests

* refine reduce_mean_grad

* fix OP benchmark ci

* fix fp16 reduce_mean

* updat ut, but still have some problems

* remove mean/reduce_mean fp16 kernel

add matmul_scale_fuse_pass (#37962) · ce335c23

Committed by heliqi on Dec 20, 2021

* add matmul_scale matmul_v2_scale fuse pass

* add scaletensor judge

* modify var name

* add timeout notest;test=coverag

* fix error commit

* fix use_mkldnn attr

* fix use_mkldnn attr

17 Dec, 2021 · 2 commits
- fix bug when build inference lib without tensorrt (#38156) · 6d1b8c52
  Committed by Sing_chan on Dec 17, 2021
- [Paddle-TRT] Use TRT inspector to show the information inside an engine to verbose log (#38200) · 237c1fe6
  Committed by Leo Chen on Dec 17, 2021
```
* Inspect the information inside a TRT engine.

* Follow up the google code style.

* Fix code error.
```
15 Dec, 2021 · 1 commit

add mkldnn conv3d_bias_mkldnn_fuse_pass ut (#37700) · 0456e003

Committed by baoachun on Dec 15, 2021

* add mkldnn conv3d_bias_mkldnn_fuse_pass ut

* update conv3d_bias_mkldnn_fuse_pass ut

* disable conv3d_bias_mkldnn_fuse_pass

BaiXuePrincess / Paddle is in sync with its fork source project.