- 25 Nov, 2022 5 commits
-
-
Committed by Chitsing KUI
* attr ready * op ip ready * start dynamic * end2end ok * input shape to map, stat by op * layer wip * first version ready * fix proto depds * fix profiler deps * fix flops typo, rm tuple shape
-
Committed by Ruibiao Chen
* Move stream_anayzer to interpreter * Refactor StreamAnalyzer * Refactor RunNextInstructionList * Remove no_data_transform_index * Fix typos * Fix data_transfer OpFuncType error * Add event for depend_op * Update transfer OpFuncType for heter place
-
Committed by Nyakku Shigure
-
Committed by wanghuancoder
-
Committed by houj04
-
- 24 Nov, 2022 12 commits
-
-
Committed by tianshuo78520a
-
Committed by zhangyikun02
-
Committed by zhangyikun02
-
Committed by huangjiyi
* rm dependence to "convert_utils.h" in some files * fix bugs * replace DataType2String with DataTypeToString * replace framework::DataTypeSize with phi::SizeOf * mv convert_function from fluid to phi and rm old map * recommit with pre-commit * replace ProtoVarType with ProtoDataType and update comment. * fix error about include "dnnl.hpp" * revert add dep mkldnn to convert_utils in phi * add mkldnn deps in convert_utils.h in phi * move deps to convert_utils.h in phi
-
Committed by PuQing
-
Committed by Wangzheee
* optimize token prune
-
Committed by Nyakku Shigure
-
Committed by HongyuJia
* support default use_gpudnn=True * fully support cudnn in phi * add header file * add white_list, verify accuracy * phi support all cudnn * opt affine_grad * try different arches of pretrained_model * try different arches of pretrained_model * add debug string * debug eager_method * add debug string, pass all local ctest * polish all debug code * delete use_cudnn relevant code autogen * fix depthwise_conv2d * Share all other members of Tensor except use_cudnn * polish codes according to review opinion * polish codes according to review opinion, fix bug * polish codes according to review opinion, opt performance * polish codes according to review opinion, fix pooling.py
-
Committed by Sławomir Siwek
-
Committed by james
Note: this is a temporary solution and should be replaced once the reduce kernel is natively supported on KL2.
-
Committed by wanghuancoder
* do not calc reduce_all in eager mode * refine python c cast list * refine * refine * refine * refine * refine * refine * refine * refine * refine
-
Committed by wanghuancoder
* dense tensor in eager mode support data_ptr
-
- 23 Nov, 2022 10 commits
-
-
Committed by Wen Sun
* feat: static check
-
Committed by huangjiyi
* decouple im2col from fluid * move im2col to phi * fix build error * delete redundant comment
-
Committed by Charles-hit
* add nparray case for basic operator * fix unit test * fix unit test * add unit test * fix unit test
-
Committed by ykkk2333
* add stat tool * add roll and roll_grad kernels and strided_slice and strided_slice_grad kernels, test=kunlun * add masked_selected_grad kernel,test=kunlun
-
Committed by Wilber
-
Committed by Yuanle Liu
-
Committed by sneaxiy
* make bfloat16 implicit convert to float/double * fix bfloat16_test ut compile
-
Committed by duanyanhui
-
Committed by zhangyikun02
-
Committed by MarDino
* use fused mlp in multi transformer * Restructure code * use cublaslt to fuse ffn * fix conflict
-
- 22 Nov, 2022 8 commits
-
-
Committed by Piotr Paturej
* Migrate elementwise_div * Migrate elementwise grad kernels
-
Committed by feng_shuai
* fix:fix the bug of trt_8.0.3.4 * fix: fix the bug of trt_8.0 * fix: notes
-
Committed by HongyuJia
-
Committed by huangjiyi
* move vol2col from fluid to phi * update copyright year
-
Committed by Tian Zheng
* Skip tests that use fused_ops on H100 * Add error message to FusedOps on H100
-
Committed by Sylwester Fraczek
Removed ResidualData and Bias from ExtraAttrProperties because they are not attributes. Removed the bug in the check for the ResidualData attribute in matmul_elementwise_add_fuse_pass. Removed residualData from the list of matmul outputs in cpu_bfloat16_pass.cc because it is an input. Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
-
Committed by Hulek
* Delete caching from requantize_mkldnn_op and changed to Acquire API * Fixed codestyle and implementation
-
Committed by huangjiyi
* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi * update copyright years * rm "fluid/platform/device/gpu/gpu_device_function.h" in phi * rm dependence to "gpu_device_function.h" in fluid * rm gpu_device_function.h etc in fluid * fix rocm-compile bugs * fix cuda_helper_test.cu bugs
-
- 21 Nov, 2022 5 commits
-
-
Committed by Leo Chen
* fix doc of NPUPlace * fix doc of NPUPlace, test=document_fix
-
Committed by Roc
-
Committed by Sylwester Fraczek
* add fc-residual quantization * revert removal of check for use_mkldnn * fix bug * add disable_logs * review fix call twice AreScalesPresntForNodes instead of if-else * rewrite residual input to output * revert fc mkldnn taking residual data * format fix * fix LoDTensor->DenseTensor * LoDTensor->DenseTensor * output->input * revert changes to unsupported script * remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc
-
Committed by RichardWooSJTU
-
Committed by Sławomir Siwek
* cleanup unused code * unify is_int8 is_bfloat16 * Simplify matmul_v2 FWD kernel * remove RunKernel methods * remove import namespace * remove headers * clean fluid/phi cross imports * remove fluid axpy_handler * delete fluid methods * activations * OneDNNMemDesc * MKLDNNFormatForSize * MatchShapeToLayout * MKLDNNMemoryFormat * MKLDNNFormat * ReorderMKLDNNHandler * to_void_cast * review suggestions * interpolate * remove fluid dependency * init * ExecuteMatMulV2 * rm fluid kernel * matmul_grad * remove mutable_data * mul_grad
-