- 30 Nov, 2022 6 commits
-
-
Committed by Aurelius84
* [Perf]Fix interpolate OutSize data transform problem * fix code style * fix grad * fix phi kernel
-
Committed by MarDino
* add activation support * fix cublasLt bug * remove useless code and fix test random range
-
Committed by zhangbo9674
* add fuse act add grad pass * polish code * refine code * add test * refine code
-
Committed by zyfncg
* rename some kernel name * fix compile problem
-
Committed by RichardWooSJTU
* delete unnecessary shape and slice op Co-authored-by: Your Name <you@example.com>
-
Committed by james
Some legacy code still uses xpu_wait() for stream sync, but it only syncs the default stream. This PR replaces those calls with dev_ctx.Wait() to ensure that the correct stream is always used.
-
- 29 Nov, 2022 12 commits
-
-
Committed by lzy
* fix mma_tensorcore (__CUDA_ARCH__) * disable tensorcore by default. Tensorcore is disabled by default because the __CUDA_ARCH__ check can cause undefined behavior in some environments; it can be enabled manually on a machine that supports tensorcore.
-
Committed by Paulina Gacek
* transpose2 kernel migrated * Got rid of mutable_data * x modification added * ops added in extra info file * Formatting fix * 2 fuse passes with transpose2 commented * nr of outs changed in 2 passes, passes uncommented * Changes in passes reverted * transpose changed in operator.cc * MKLDNN check in operator.cc * Transpose fixes * Fix deleted from operato * template corrected Co-authored-by: Paulina Gacek <paulinagacek@intel.com>
-
Committed by 张春乔
* replace LoDTensor with phi::DenseTensor in fluid\operators * replace LoDTensor with phi::DenseTensor in fluid\operators * Update split_lod_tensor_op.cc * Update warpctc_op.cc * Update broadcast_tensors_op.cc * Update crf_decoding_op.cc * Update lstm_op.cc * Update lstm_op.cc * Update lod_reset_op.cc * Update gru_op.cc * Update linear_chain_crf_op.cc * resume 2 files for conflict * Update gru_op.cc * Update linear_chain_crf_op.cc * Update lstm_op.cc
-
Committed by Nyakku Shigure
* isort all files * revert conflicting files * revert conflicting files * revert conflicting files
-
Committed by Sławomir Siwek
-
Committed by Sławomir Siwek
* cleanup unused code * unify is_int8 is_bfloat16 * Simplify matmul_v2 FWD kernel * remove RunKernel methods * remove import namespace * remove headers * clean fluid/phi cross imports * remove fluid axpy_handler * delete fluid methods * activations * OneDNNMemDesc * MKLDNNFormatForSize * MatchShapeToLayout * MKLDNNMemoryFormat * MKLDNNFormat * ReorderMKLDNNHandler * to_void_cast * review suggestions * interpolate * remove fluid dependency * init * ExecuteMatMulV2 * rm fluid kernel * matmul_grad * remove mutable_data * mul_grad * matmul fwd * add extra attr * temp disable passes * re-enable passes * workaround for matmul+act * fix for matmul+eltwise_add * fix typo * merge bugfix #48364 * remove merge conflict
-
Committed by kangguangli
* fix:add no support for cuda_arch<700 * replace Executor in while op with InterpreterCore * cache InterpreterCore as the member of WhileOp * fix bug: tensor place changed because of assign op in while loop * refine code * refine code * refine code * hot fix * fix compile * merge develop * follow comments * add log for test * remove LoDTensor * set flag control_flow_use_new_executor false Co-authored-by: fengshuai <fengshuai03@baidu.com> Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
-
Committed by JZ-LIANG
* get default calc stream from execution ctx instead of global dev ctx pool.
-
Committed by LiYuRio
* remove lod_tensor_to_array, array_to_lod_tensor, DynamicRNN * remove less_equal, greater_than, greater_equal, equal, not_equal
-
Committed by Sławomir Siwek
-
Committed by HappyHeavyRain
* generate static graph code for lerp by yaml, test=develop * modify the op_compat.yaml of lerp, test=develop * generate static graph code for lerp by yaml, test=develop * modify the op_compat.yaml of lerp, test=develop * remove the 'attrs' of lerp, test=develop Signed-off-by: lizhiyu02 <1528794076@qq.com> Signed-off-by: lizhiyu02 <1528794076@qq.com>
-
Committed by zhangkaihuo
-
- 28 Nov, 2022 9 commits
-
-
Committed by Sławomir Siwek
-
Committed by jakpiase
* re-enabled reshape, squeeze and flatten kernels * added formatting
-
Committed by Wang Bojun
* add trt support
-
Committed by zyfncg
* generate static graph code for some operators * add some ops generate * revert npu gelu
-
Committed by huangjiyi
* decouple cudnn_desc.h from fluid * move cudnn_desc.h from fluid to phi * fix bugs * decouple cudnn_helper.h from fluid * fix bugs * move cudnn_helper.h from fluid to phi * add fluid cudnn_helper.h * move miopen_desc.h from fluid to phi * move miopen_helper.h from fluid to phi * fix bugs * move gpu_dnn.h from fluid to phi * fix bugs * update copyright year * simplify gpu_dnn.h in fluid * fix bugs * fix xpu build bug * fix compile bug * fix bug
-
Committed by 张春乔
-
Committed by Asthestarsfalll
-
Committed by MarDino
-
Committed by wenbin
-
- 24 Nov, 2022 2 commits
-
-
Committed by huangjiyi
* rm dependence to "convert_utils.h" in some files * fix bugs * replace DataType2String with DataTypeToString * replace framework::DataTypeSize with phi::SizeOf * mv convert_function from fluid to phi and rm old map * recommit with pre-commit * replace ProtoVarType with ProtoDataType and update comment. * fix error about include "dnnl.hpp" * revert add dep mkldnn to convert_utils in phi * add mkldnn deps in convert_utils.h in phi * move deps to convert_utils.h in phi
-
Committed by Sławomir Siwek
-
- 23 Nov, 2022 4 commits
-
-
Committed by huangjiyi
* decouple im2col from fluid * move im2col to phi * fix build error * delete redundant comment
-
Committed by duanyanhui
-
Committed by zhangyikun02
-
Committed by MarDino
* use fused mlp in multi transformer * Restruct code * use cublaslt to fuse ffn * fix conflict
-
- 22 Nov, 2022 7 commits
-
-
Committed by Piotr Paturej
* Migrate elementwise_div * Migrate elementwise grad kernels
-
Committed by HongyuJia
-
Committed by huangjiyi
* move vol2col from fluid to phi * update copyright year
-
Committed by Tian Zheng
* Skip tests that use fused_ops on H100 * Add error message to FusedOps on H100
-
Committed by Sylwester Fraczek
Removed ResidualData and Bias from ExtraAttrProperties because they are not attributes. Removed bug with checking for ResidualData attribute in matmul_elementwise_add_fuse_pass. Removed residualData from list of matmul outputs in cpu_bfloat16_pass.cc because it is an input. Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
Committed by Hulek
* Delete caching from requantize_mkldnn_op and changed to Acquire API * Fixed codestyle and implementation
-
Committed by huangjiyi
* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi * update copyright years * rm "fluid/platform/device/gpu/gpu_device_function.h" in phi * rm dependence to "gpu_device_function.h" in fluid * rm gpu_device_function.h etc in fluid * fix rocm-compile bugs * fix cuda_helper_test.cu bugs
-