- 15 Mar, 2023 2 commits
-
-
Committed by Guoxia Wang
-
Committed by zhangyuqin1998
* Delete hardswish_raw op * fix ut
-
- 14 Mar, 2023 24 commits
-
-
Committed by Vvsmile
-
Committed by zhouweiwei2014
-
Committed by ccrrong
* add split_with_num composite rule * add split_with_num composite rule * add split composite rule * update * update test * update test * delete split_with_num_grad
-
Committed by qizhaoaoe
-
Committed by limingshu
* first commit * fix code bugs in for_loop * fix bugs in cuLoadAddStridedInputs. * optimization for LayerNormBackwardComputeGradInput * add unit test for validating the optimization * fix windows ci error
-
Committed by gouzil
-
Committed by pangyoki
* cuda graph support multi-stream for new executor * fix windows compile error * delete create_cuda_graph_stream
-
Committed by zhaoyingli
-
Committed by YuhangLi
* wisemax fp16 support * add bf16 support 4 elementwise_max * append broadcast 4 op 4 fp16 / bf16 * fix elewise_max ut bf16 numeric delta * append fp/bf16 uts * add fp/bf16 uts * change bf16 uts delta * fix some issue * add prim 4 fp16
-
Committed by wangxiaoning
-
Committed by wenbin
-
Committed by Wang Bojun
* fix conv2d filter
-
Committed by zhiboniu
* add fp16 and bf16 test * update
-
Committed by cxxly
-
Committed by xiongkun
* [CINN]Enhance CacheKey hash logic by considering input dtypes (#50557) --------- Co-authored-by: jiangcheng <thisjiang@qq.com> * [prim] enable dygraph_to_static to support custom_vjp * Pr 50885 (#7) * [CINN]Enhance CacheKey hash logic by considering input dtypes (#50557) * [CINN]Enhance CacheKey hash logic by considering input dtypes --------- Co-authored-by: jiangcheng <thisjiang@qq.com> * [prim] enable dygraph_to_static to support custom_vjp * fix code in a dy2static-friendly way. * [dystatic] add hooker for prim --------- Co-authored-by: Aurelius84 <zhangliujie@baidu.com> Co-authored-by: jiangcheng <thisjiang@qq.com> Co-authored-by: cxxly <chenxx_id@163.com> * [prim] enable dygraph_to_static to support custom_vjp * fix cast prim and vjp dtype mapping error bug * [dy2static-ci] fix dy2static ci errors. --------- Co-authored-by: Aurelius84 <zhangliujie@baidu.com> Co-authored-by: jiangcheng <thisjiang@qq.com> Co-authored-by: cxxly <chenxx_id@163.com>
-
Committed by cxxly
-
Committed by cxxly
-
Committed by Aurelius84
* [CINN]Enhance CacheKey hash logic by considering input dtypes * add unittest * fix typo * fix typo * fix map.at * fix find * fix test * fix cinn cache key structure realize * using ordered map for attributes * add test by review advice --------- Co-authored-by: jiangcheng <thisjiang@qq.com>
-
Committed by Sonder
-
Committed by denglianbin
* finish task * add static_check and fix unittest. * add int32/64 * Update test_cross_op.py --------- Co-authored-by: Zhang Ting <Douyaer2020@qq.com>
-
Committed by Aurelius84
* Fix is_paddle_func not take effect for plain paddle API * fix typo * fix typo
-
Committed by JYChen
* fix UT when np >= 1.24 * optimize description of this change
-
Committed by Infinity_lee
-
Committed by cyber-pioneer
-
- 13 Mar, 2023 14 commits
-
-
Committed by Aurelius84
-
Committed by TaoTao Li
* add all_gather and fix conflicts * fix code format * fix ut * fix broadcast ut
-
Committed by heyanru
* refresh * compat * register * testop * fix * fix * fox * cast * cast * fix * type * fix * out * cast * fix * fix * fix * broad * broad * broad * fix * fix * fix * fix * fix * broad * broad * numel * fix * fix * fix * fix * cinn * fix * fix * fix * fix
-
Committed by mengziheng
* first test * add unsqueeze_op
-
Committed by wangxiaoning
* add fp16/bf16 * add grad bf16 * test name
-
Committed by Sławomir Siwek
* mkldnn->onednn * fused softplus op + kernel * remove extra attributes * add missing handler * change var name
-
Committed by wenbin
* squeeze2_op * add ut * fix ut * fix static * modify ut
-
Committed by kangguangli
* find relevant testcase * remove with_data_parallel * trigger CI * do not apply ParameterServerGraphOptimizer * remove useless optimizer * remove with_data_parallel in test_dist_base * fix test_fleet_base_3 * only reserve changes for GraphExecutionOptimizer * fix bug * fix test_minst_dgc_nccl * fix typo * fix test_dist_mnist_gradient_merge * rm TestDistMnistNCCL2DGCMultiCards * fix optimizer conflicts * fix dist_mnist * fix test_dist_hapi * delete test_fleet_graph_execution_meta_optimizer & test_fleet_graph_executor * temporarily not delete unittest * fix unittests * fix ci * recover prune in python/paddle/hapi/model.py
-
Committed by kangguangli
-
Committed by xysheng-baidu
* Add expand composite rule * reshape x when dim_in less than dim_out * add tile op for expand * remove tensor shape case when comp prim * enable cinn case * dim_out can't be 0 * update test case for prim type
-
Committed by zhoutianzi666
* use python to generate cutlass code * refine CommonConvKernelPart1, CommonConvKernelPart2 * remove useless code in generate_cutlass_code.sh * add more config in conv2d_residual * CommonCutlassConvKernelPart1 and CommonCutlassConvKernelPart2 * add group conv support in util.cu * remove .sh * refine name * make name goodgit status! * add fuse_alpha * make code easy to understand * mot fopen generate in py * use python script to generate conv2d,group=1 cutlass code * use const & * use const & && use python script to generate conv2d/group=1 code
-
Committed by kangguangli
* remove with_data_parallel in test_sync_batch_norm_op * fix debug code * polish code * polish code * polish code
-
Committed by jiangcheng
-
Committed by houj04
* [XPU] add increment op. * fix ci
-