- 01 Sep, 2022 4 commits
-
-
Committed by HongyuJia
* copy kernel file to phi * delete some code * migrate uniform_random, test=kunlun * fix input error, test=kunlun * fix gpu register error, test=kunlun * add include file, test=kunlun * try fix error from CI, test=kunlun * polish other PR * fix CI-coverage error, test=kunlun
-
Committed by Leo Chen
-
Committed by wangguanqun
* config * fix unittest * zero init & cache & patch config * add barrier to save and load * add unittest
-
Committed by Leo Chen
* refine cmake of framework * add deps for dense tensor * fix deps * remove alloc(ctx) * add depends on mkldnn
-
- 31 Aug, 2022 3 commits
- 30 Aug, 2022 2 commits
-
-
Committed by zyfncg
* add runtime config in phi * add runtime attr for op desc and op * fix no proto error * adjust opdesc set_attr impl * try to remove conv_op extra attrs * add init runtime attr map * change extra header path * fix runtime_attr * fix trace_op * fix bug of pass * fix merge conflict * fix dygraph attrs * fix bug of pass * fix dygraph bug * fix unittest module * delete extra attr default * fix dropout kernel * polish code * fix extra output of instance_norm * fix merge conflict * fix op_desc bug * add extra attr in yaml for conv3d_transpose * don't remove extra input and output * fix save_inference_model * fix bug of batch_norm * revert some change * polish log * polish code * add code comment Co-authored-by: Chen Weihang <chenweihang@baidu.com>
-
Committed by zhoutianzi666
Add constant folding pass; for some models, it reduces latency.
-
- 29 Aug, 2022 2 commits
-
-
Committed by zhangbo9674
* add interpretercore * refine backward program id * add code * refine program * refine code * create forward/backward_program by prog2graph2prog method * test, do not care * refine code * refine code * refine code * test, do not care * add interpretorcore * add scope * refine scope create method * add jit for new_exe * solve conflict * delete unused code * polish code * polish code * refine scope in inplace * refine for datatransfer * refine _rebuild_from_desc * refine control eager deletion attr * refine used_for_jit * refine jit for infer * op size0 use ori program * polish code * refine jit * refine run_program_op ut * refine inplace * refine control * refine graph helper * refine control * refine inplace * refine buffer_share_inplace_pass * polish code * polish code * refine usage for compilerProgram * refine control * test * test core cache * refine code * refine io.py * increase test_seq2seq timeout * refine convert program * refine interpretercore_cache release * delete buildinplace * refine partial_program && io * refine code for io * test * test * test
-
Committed by Aurelius84
* [OpAttr]num_rows/num_columns of eye support Tensor type * fix attr cast with long type
-
- 26 Aug, 2022 2 commits
-
-
Committed by kangguangli
* remove fluid kernel and activate phi kernel * fix parameter error * transfer mkldnn part * modify header file path * fix compile error * transfer special case * fix lod setting and special case for layout setting * add testcase and refine code
-
Committed by 王明冬
-
- 25 Aug, 2022 3 commits
-
-
Committed by Feiyu Chan
-
Committed by danleifeng
* update brpc version;test=develop
-
Committed by ronnywang
* [NPU] add run_program_op_npu * add run_program_op_npu ut
-
- 24 Aug, 2022 3 commits
-
-
Committed by ShenLiang
* fix utest * fix utest * fix utest * fix log * fix random utest
-
Committed by Leo Chen
* make tensor_util contains no cuda code * refine isfinite * revert ut * move isfinite function to its op * fix test * fix compile * std::isnan is not defined for int type on windows * fix windows compile * fix fp16 * fix rocm compile * revert gradient node
-
Committed by Wilber
-
- 23 Aug, 2022 4 commits
-
-
Committed by pangyoki
-
Committed by zhaoyingli
* add quant pass
-
Committed by OccupyMars2025
-
Committed by OccupyMars2025
* Update scope.h * typo * Update dense_tensor.inl
-
- 22 Aug, 2022 3 commits
-
-
Committed by joanna.wozna.intel
* Add int8 support for matmul+elementwise_add fuse * Corrections after review and ernie test fix
-
Committed by Sławomir Siwek
* merge conv_concat_relu to conv_act * fix typo * extend unit test * reuse existing gpd * codestyle * enforce mkldnn conv
-
Committed by Yuanle Liu
-
- 19 Aug, 2022 1 commit
-
-
Committed by Ruibiao Chen
* Fix random op dependency and lr_schedule bugs for standalone executor * Fix CI errors * Fix CI errors * Fix CI errors
-
- 18 Aug, 2022 3 commits
-
-
Committed by pangyoki
apply buffer_shared_inplace_pass and inplace_addto_op_pass pass to program in Standalone Executor (#45085) * apply inplace addto in python apply_pass * fix * apply inplace pass for program * skip feed and fetch var * fix block_desc.move_from * fix block desc * alltoall remove inplace * fix
-
Committed by zhangxiaoci
* change to async mode for xpu multi-card training in static graph mode * minor bugfix * irrelevant. move to another pr * move change to other pr * fix stream issue * fix 'stream not meet with current context' error * fix branch diverge, test=kunlun
-
Committed by JingZhuangzhuang
* fix infer trans scope * fix infer trans scope * fix infer trans scope * fix infer trans scope Co-authored-by: dingjiawei <327396238@qq.com>
-
- 17 Aug, 2022 2 commits
-
-
Committed by Aurelius84
* [OpAttr]Add SupportTensor for OpMaker * fix typo * fix code style * add SupportTensor for concat op * add unittest for register Tensor * add shape checker and split attribute
-
Committed by feng_shuai
-
- 16 Aug, 2022 4 commits
-
-
Committed by Chen Weihang
* move check finite and unscale kernel into phi * move infershape into phi * move update_loss_scaling kernel into phi * remove original kernels * move update loss scaling infershape into phi * add header for xpu and npu * solve coverage failed * fix npu test failed * remove mutable data in cu file * fix new executor failed * add valid check for meta tensor output
-
Committed by feng_shuai
* convert multihead to oss * fix:bug * fix:delete const cast * fix:don't support bias_qk * add vit pass * fix:convert bug and add preln_residual_bias * support length=-1 * add UT for convert * add no_bias_qk support for gpu_multihead_op * delete infer_shape depends on bias_qk * oss just can be used in T4 and A* * fix:change api for ROCM CI
-
Committed by Wangzheee
-
Committed by Feiyu Chan
-
- 15 Aug, 2022 2 commits
-
-
Committed by Yuanle Liu
-
Committed by Yulong Ao
* [Auto Parallel] Move the distributed info from python to c++ * [Auto Parallel] Add dist_attrs for VarDesc and OpDesc * [Auto Parallel] Add the lost file * [Auto Parallel] Make the dist attr be unique_ptr * [Auto Parallel] Add the proto conversion * [Auto Parallel] Improve the proto support * [Auto Parallel] Fix the bugs for adding a device or a link * [Auto Parallel] Add the C++ ProcessMesh and DistributedMapper * [Auto Parallel] Improve the impl of these dist attrs * [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh * [Auto Parallel] Fix the unittest problem * [Auto Parallel] Explicitly add the src file for auto_parallel target * [Auto Parallel] Add the proto dependency explicitly * [Auto Parallel] Fix the cmake bug on windows and mac * [Auto Parallel] Remove the pybind11 header file in process_mesh.h * [Auto Parallel] Remove unused codes * [Auto Parallel] Check whether the dist attr is null * [Auto Parallel] Implement the assign operator for OpDesc explicitly
-
- 14 Aug, 2022 1 commit
-
-
Committed by xiaoxiaohehe001
This reverts commit 84bf5c31.
-
- 13 Aug, 2022 1 commit
-
-
Committed by Leo Chen
* add cached_serialize_str_ * support program hash * add sha * add ut * use hash_str only for new_exe * fix attr order
-