- 28 Nov, 2022 14 commits
-
Committed by Asthestarsfalll
-
Committed by PuQing
-
Committed by Qi Li
* [NPU] apply npu_identity to conv bn and copy2cpu, test=develop
* update npu identity to share data with x, test=develop
* address review comments, test=develop
-
Committed by zhangbo9674
* add trace mode for interpretercore
* fix bug
* add a ctrl flag
* add record for memcpyd2h
* polish code
* polish code
-
Committed by Ruibiao Chen
* Remove kSyncRun in StreamAnalyzer
* Update code
-
Committed by huangjiyi
* rm fluid “xpu_header.h” deps in phi
* move part of xpu_op_list.h from fluid to phi
* add fluid xpu_op_list deps
* add glog deps for xpu_op_list in phi
* fix PR-CI-Kunlun
-
Committed by zyfncg
* add fluid_op_name_map
* rename some kernel name
* add comments for op-kernel map
* refine map name of op to kernel
-
Committed by MarDino
-
Committed by wenbin
-
Committed by xiaoxiaohehe001
* add_gather_nd_
* add_gather_nd_
* add_gather_nd_
-
Committed by Thomas Young
* fix expand as op
* fix bug
-
Committed by haosicheng
-
Committed by xiaoguoguo626807
* remove fluid.reduce_sum
* remove fluid.reduce_sum
* modify axis and import paddle
* modify keepdim and out_name
* modify unittest
* modify unittest
* modify CI_static and loss.py
* modify test_mse_loss
* modify static ci
* modify static ci datatype
* add import paddle in test
* fix conflict
* fix conflict
* modify ci
* modify ci
* fix_conflict
* fix bug
* code_style
-
Committed by 张春乔
* Update communicator.cc
* Update communicator.cc
* remove LoDTensor
* remove LoDTensor and Tensor
-
- 26 Nov, 2022 2 commits
- 25 Nov, 2022 13 commits
-
Committed by zhangxin81
* fix loopup_table plugin deserialize size error
-
Committed by wanghuancoder
* for xpu multi thread bug test
-
Committed by Wangzheee
* fix
-
Committed by Wang Bojun
* group norm fp16 support
-
Committed by Chitsing KUI
* attr ready
* op ip ready
* start dynamic
* end2end ok
* input shape to map, stat by op
* layer wip
* first version ready
* fix proto deps
* fix profiler deps
* fix flops typo, rm tuple shape
-
Committed by Ruibiao Chen
* Move stream_anayzer to interpreter
* Refactor StreamAnalyzer
* Refactor RunNextInstructionList
* Remove no_data_transform_index
* Fix typos
* Fix data_transfer OpFuncType error
* Add event for depend_op
* Update transfer OpFuncType for heter place
-
Committed by Nyakku Shigure
-
Committed by Roc
* support xpu scalar inplace
* sharding for xpu

Co-authored-by: Nheyanru <81976792+heyanru01@users.noreply.github.com>
-
Committed by wanghuancoder
-
Committed by wanghuancoder
-
Committed by houj04
-
Committed by sneaxiy
-
Committed by sneaxiy
* add bfloat16 support for more ops
* fix ci compile
* fix windows compile error
* fix windows compile error
* fix rocm compile error
* fix ROCM compile error
-
- 24 Nov, 2022 11 commits
-
Committed by tianshuo78520a
-
Committed by zhangyikun02
-
Committed by zhangyikun02
-
Committed by wangxiaoning
* add index sample fp16 support
* remove fluid APIs in distributed_strategy.py and role_maker.py
* Revert "remove fluid APIs in distributed_strategy.py and role_maker.py" This reverts commit 223bbee990d3bf69e252fc3c0f19e3873550a264.
* remove fluid APIs in distributed_strategy.py and role_maker.py
* remove index sample op changes
* remove fluid APIs under fleet.base
* remove fluid APIs under fleet.layers.mpu
* remove fluid APIs under fleet.meta_optimizers
* fix fluid error
* fix util_factory.py
* reset fluid.io.load_inference_model API
-
Committed by huangjiyi
* rm dependence to "convert_utils.h" in some files
* fix bugs
* replace DataType2String with DataTypeToString
* replace framework::DataTypeSize with phi::SizeOf
* mv convert_function from fluid to phi and rm old map
* recommit with pre-commit
* replace ProtoVarType with ProtoDataType and update comment.
* fix error about include "dnnl.hpp"
* revert add dep mkldnn to convert_utils in phi
* add mkldnn deps in convert_utils.h in phi
* move deps to convert_utils.h in phi
-
Committed by PuQing
-
Committed by Wangzheee
* optimize token prune
-
Committed by Nyakku Shigure
-
Committed by HongyuJia
* support default use_gpudnn=True
* fully support cudnn in phi
* add header file
* add white_list, verify accuracy
* phi support all cudnn
* opt affine_grad
* try different arches of pretrained_model
* try different arches of pretrained_model
* add debug string
* debug eager_method
* add debug string, pass all local ctest
* polish all debug code
* delete use_cudnn relevant code autogen
* fix depthwise_conv2d
* Share all other members of Tensor except use_cudnn
* polish codes according to review opinion
* polish codes according to review opinion, fix bug
* polish codes according to review opinion, opt performance
* polish codes according to review opinion, fix pooling.py
-
Committed by Sławomir Siwek
-
Committed by james
Note: this is a temporary solution, should be replaced once reduce kernel is natively supported on KL2
-