- 16 Feb, 2022 2 commits
-
-
Committed by Wangzheee
[Paddle-Inference] support preln-ernie: add preln_emb_eltwise_layernorm_op, preln_skip_layernorm_op (#39570) * support preln_ernie: add preln_emb_eltwise_layernorm_op, preln_skip_layernorm_op * support preln_ernie: add preln_emb_eltwise_layernorm_op, preln_skip_layernorm_op
-
Committed by YuanRisheng
* remove reshape and elementwise_add registry * delete code * fix bugs when run ci ut * remove log * fix bugs when run unit test * fix bugs when run unit test * fix bugs when run cinn * fix bugs when run ci-mac-python3 * fix compile bugs * fix compile bugs * fix compile bugs * fix bugs when run kunlun * fix bugs when compile * update code according to comment
-
- 15 Feb, 2022 5 commits
-
-
Committed by Wangzheee
[Paddle-Inference] support preln_ernie: add preln_embedding_eltwise_layernorm_fuse_pass, preln_skip_layernorm_fuse_pass (#39508) * support preln_ernie * support preln_ernie
-
Committed by feng_shuai
-
Committed by Leo Chen
* Replace GeLU plugin with TRT built-in layers for approximate GeLU * Add TensorRT built-in layer for nonapproximate GeLU
-
Committed by feng_shuai
-
Committed by Aurelius84
* #1 migrate dist-related type()-> dtype() * move datatype function from pten -> fluid/framework * change type() in imperative into convert(dtype()) * modify xx_tensor->type into xx_tensor->dtype * change the set_type interface and the caller * modify xx_tensor.type into xx_tensor.dtype * fix mutable_data(place, dtype()) * change caller of mutable_data in pten and distributed * change the caller of mutable_data in fluid/framework * change the caller of mutable_data in imperative directory * mutable_data: inference * update the call of mutable_data * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType * pass the compile. the next step is remove VarType in Pten * fix all and remove VarType from pten. success in linux. Next task is other platform * fix conflict with develop * fix compiled error * Fix reset conversion * fix conflict * fix compiled problem * fix typo * Fix << in tensor_utils.cc * fix type->dtype * fix unittest * fix tensor init constructor * fix DataTypeSize for BFloat16 * fix code style * fix npu compiled error * fix npu * compile npu successfully * fix conflict * fix conflict Co-authored-by: Nxiongkun <xiongkun03@baidu.com>
-
- 14 Feb, 2022 1 commit
-
-
Committed by Sylwester Fraczek
* prevent squashing pair u8 dequantize -> s8 quantize * add relu op to check for uint8 * fix ptq fc attr name fuse_activation->activation_type * fix * add unit test * remove unused variable * test fix unsuccessful * fix test and logic * multiline comment * remove cout * Revert "fix ptq fc attr name fuse_activation->activation_type" This reverts commit ffd023353a5e9b0fd15e81b9e9f9fe1794035017. * fix ptq fc attr name fuse_activation->activation_type
-
- 11 Feb, 2022 3 commits
-
-
Committed by Leo Chen
-
Committed by JingZhuangzhuang
-
Committed by Wangzheee
* support ernie quant model with interleaved * support ernie quant model with interleaved * support ernie quant model with interleaved * support ernie quant model with interleaved * support ernie quant model with interleaved * support ernie quant model with interleaved * support ernie quant model with interleaved
-
- 10 Feb, 2022 2 commits
-
-
Committed by chenyanlann
-
Committed by wenbin
* mkldnn conv fix * definition
-
- 09 Feb, 2022 1 commit
-
-
Committed by Wangzheee
* rebuild matmul pass: trt and gpu_cpu * rebuild matmul pass: trt and gpu_cpu * rebuild matmul pass: trt and gpu_cpu * rebuild matmul pass: trt and gpu_cpu
-
- 06 Feb, 2022 1 commit
-
-
Committed by Wilber
-
- 28 Jan, 2022 1 commit
-
-
Committed by wenbin
* slice * shuffle pass enhancement
-
- 27 Jan, 2022 4 commits
-
-
Committed by Aganlengzi
* [Demo] custom kernel based on pten kernel * merge and npu custom work well * del comments * delete other code * fix CUDAContext * fix not found small_vector.h * support NPU * fix NPUContext * fix DeviceContext support * add UT * fix call * add UT * fix * fix for comments and ut * add MACRO control * fix multi input output * support env CUSTOM_DEVICE_ROOT * deal with special cases * fix for Windows * try coverage with test_custom_kernel_dot.py * fix test_custom_kernel_dot * fix test_custom_kernel_dot * fix merge * fix merge * fix CI * update * merge and fix * remove WITH_CUSTOM_KERNEL * fix merge * merge and fix * fix ut * fix ut for mac * add more UT * add more UT * fix
-
Committed by wenbin
* shuffle channel pass * add ut * timeout fix * makefile fix
-
Committed by 王明冬
-
Committed by Wangzheee
* Paddle-Inference:fix_concat_slice * Paddle-Inference:fix_concat_slice * Paddle-Inference:fix_concat_slice * Paddle-Inference:fix_concat_slice * [Paddle-Inference]: fix concat slice * [Paddle-Inference]: fix concat slice * [Paddle-Inference]: fix concat slice
-
- 26 Jan, 2022 2 commits
-
-
Committed by Leo Chen
* update cmake file to remove fluid kernel * add pten declaration.h to where pybind.h used * fix sync_bn and tensorrt_engine * refine detection_library * fix interpreter_core * support eager legacy * fit eager legacy for pten * fall back to cpu if not found kernel * fix compile problem * fix compile problem * refine fallback logic * fit operator.run() * fix xpu compile * fit for new_exec * add REGISTER_OP_WITHOUT_GRADIENT * un-cache pt_kernel_context * fix compile * fix cudnn * fix compiling with on_infer * fix mkldnn * fix isfinite_v2 * fix xpu problem * fix op_device * refine fallback for xpu * fix xpu compile * merge develop * refine code format * fix compile * fix compile * add data_transfer * fix PreparePtenData * fix cpu context * merge develop * fix compile * fix error device context * fix xpu * fix dev_ctx
-
Committed by baoachun
* support npu weight unified H2D copy * remove redundant variable
-
- 25 Jan, 2022 3 commits
-
-
Committed by Zhang Jun
* [inference] update convert reduce op&ut,test=develop * update * update * update * add int32 support * add int32 support * add comments * trt < 7.0 do not support int32 * test=develop * update * test=develop
-
Committed by Weilong Wu
* Added selected_rows and rw_lock to pten * Renamed the unit test target to fix CI * Removed Class SelectedRows in Fluid, changed include/cmake relationship, use pten::SelectedRows in Fluid * Remove rw_lock.h,rw_lock_test.cc in fluid * Use pten::RWLock and pten::AutoRDLock, fix CI * Use pten::SelectedRows * Use pten::SelectedRows * Fix to pass NPU CI * Use pten::SelectedRows, to pass NPU CI * To fix NPU CI * To fix NPU CI again
-
Committed by xiongkun
* transfer: string tinyformat errors and part of enforce into pten * remove comment * fix by code review * assert is not compiled in -DNDEBUG * add string as dependencies of paddle_inference
-
- 24 Jan, 2022 1 commit
-
-
Committed by Wilber
* move dynload from fluid to pten. * fix ci compile * fix windows ci compile. * update * update * fix compile error
-
- 18 Jan, 2022 4 commits
-
-
Committed by Sławomir Siwek
* Mish * Change exp() library * mish fuse pass * mish attrs * fixes * mishop maker * remove attrs * mish kernel for bf16 * fc+mish fuse * fix code format error * Resolve merge conflicts * Update mish operator version * update mish variable to new naming convention
-
Committed by Zhanlue Yang
* Merged LoDTensor with Tensor,test=allcases * Patched python level LoDTensor * Patched python level LoDTensor * Merge Tensor into DenseTensor * Fixed namespace issues,test=allcases * Fixed merge issues * Fixed inference issues * Fixed NPU test issues * Fixed merge issues
-
Committed by JingZhuangzhuang
* fix trt convert conv2d skip * fix trt convert conv2d skip
-
Committed by wenbin
* modify params check * correct compile
-
- 17 Jan, 2022 2 commits
-
-
Committed by wenbin
* develop test * throw * ne * wrong cnt
-
Committed by Wilber
* add pten::Place data structure. * update ci problem * fix ci problem * update * using platform::Place=pten::Place * remove BOOST_GET_CONST for CPUPlace and GPUPlace * compile pass 25%. * compile pass 45% * compile pass 60% * remove boost_get for xpu npu mlu and ipu * compile pass on cpu and gpu. * fix compile problem * fix compile error. * update * fix ci problem * update * ci approve * fix ci problem * fix ci eager test problem * remove BOOST_GET_CONST * fix npu compile
-
- 15 Jan, 2022 1 commit
-
-
Committed by Zhanlue Yang
* Merged LoDTensor with Tensor,test=allcases * Patched python level LoDTensor * Fixed example code failure * Polished function names, removed duplicated forward declarations
-
- 14 Jan, 2022 1 commit
-
-
Committed by heliqi
* add trt_convert_flatten_contiguous_rang op * trt version >7,support trt_convert_flatten_contiguous_rang * trt version >7,support trt_convert_flatten_contiguous_rang * trt version >7,support trt_convert_flatten_contiguous_rang * test case add trt version >=7 skip
-
- 13 Jan, 2022 3 commits
- 11 Jan, 2022 1 commit
-
-
Committed by fengkuangxiaxia
-
- 10 Jan, 2022 1 commit
-
-
Committed by Zhanlue Yang
* Added shared_ptr<Allocation> member & corresponding interfaces to Storage * Removed original pten::Allocation from Storage and adjusted the interfaces accordingly * Fixed issues with storage offset * Used place to malloc allocation for TensorStorage * [Unify Tensors PR #3]Ported framework::Tensor interfaces to pten::DenseTensor * Fixed issues with place * Added comments * Moved mutable_data with stream argument to DenseTensor * Added set_offset interface * Fixed CI issues,test=allcases * [Unify Tensors PR #4] Port LoDTensor interfaces to DenseTensor * Removed friend class EigenTensor/EigenMatrix/EigenVector from Tensor * Modified framework::Tensor to inherit from DenseTensor * Reverted changes to pten_layout() interface * Removed friend classes * Rearranged cfunction calls from tensor.data<void>() to tensor.data() * Fixed CI issues * Fixed lite issues * Fixed data() interface issues,test=allcases * Resolved IsInitialized() issues * Fixed ResetHolder() issues * Fixed MKLDNN & Storage issues * Resolved ShareBufferWith() issues * Fixed LoD issues
-
- 06 Jan, 2022 1 commit
-
-
Committed by wenbin
* bug fix * remove blank
-