- 12 Aug, 2021 7 commits
-
-
Committed by Wilber
-
Committed by Feng Xing
This PR adds the fused transformer files that define the C interface, including classes and functions.
-
Committed by zhulei
* Fix safety-bug of functional.linear
-
Committed by ShenLiang
* add recompute for pp
* add recompute offload
* add recompute partition
-
Committed by wuhuachaocoding
-
Committed by Fan Zhang
* [NPU] Support npu op expand_v2 and expand_v2_grad
* update test_expand_v2_op_npu.py
* modify expand_v2_op_npu.cc
-
Committed by Peihan
* add det_mv3_db & LeViT test case in pr-ci-inference
* fix LeViT model dir bugs
* fix grammar error
-
- 11 Aug, 2021 21 commits
-
-
Committed by Jacek Czaja
* Added softmax without caching
* Binary is no longer manually cached
* Activation oneDNN caching removed
* Removed manual caching of activation
* modified UT
* fix
* fixes to building
* fix
* fix to UT
* Faulty UT workaround
* approval workaround
* Fixes after review
* compilation fixes
* more lint fixes
* more fixes after review
* fixes after another round of review
-
Committed by zhangbo9674
* add state_dict and load_state_dict and unittest for class GradScaler
* refine unittest for coverage of load_state_dict
* refine comments of code-block
* refine some comments
* refine state_dict code and unittest
* add #require gpu, xpu for GradScaler get/set example code
* refine example code
* refine unittest for state_dict
* fix bug of DataLoader in TestGradScalerStateDict
* add flag FLAGS_cudnn_deterministic
-
Committed by WeiXin
* add set_value_grad op
* add unittest.
* polish unittest.
* polish code.
* support cuda kernel
* polish code according to CI
* polish code.
* remove *.pyc
* polish code.
* add unittest to improve coverage.
* polish code.
-
Committed by Wangzheee
* fix_fc_reshape_convert
* fix
-
Committed by Fan Zhang
-
Committed by pangyoki
* add while read_from_array write_to_array npu op
* optimize unittest
-
Committed by Roc
-
Committed by ronnywang
* add momentum_op_npu and test
* update
* fix hang
-
Committed by ronnywang
* add reduce_mean_op_npu and test
* remove skip.If
* update
-
Committed by ronnywang
* add batch_norm_op_npu and tests
* remove skip.If
* fix bug
-
Committed by Hao Lin
* Add ext_tensor.slice() API, test=develop
* Call Tensor::mutable_data first to fix bugs and add test for writing to sliced tensor
* Fix unit test bug
* Fix code format problem, test=develop
* Fix code format problem
* strengthen unit test
* Use CustomTensorUtils::ShareDataFrom to simplify code
-
Committed by WangXi
-
Committed by lilong12
* add auto_parallel apis
-
Committed by ShenLiang
* add save/load for pipelineparallel
* add save/load
-
Committed by 0x45f
* add exp and exp_grad npu op
* modify support register type
* remove empty line and remove exp_grad support data type int/int64
* move exp and exp_grad kernel to activation_op_npu.cc, delete attrs
* move code to activation_op_npu.cc
-
Committed by andyjpaddle
-
Committed by wenbin
-
Committed by Yuang Liu
-
Committed by niuliling123
-
Committed by From00
* Add NPU kernel for TopKV2 op
* deleted unnecessary cache file static_mode_white_list.cpython-37.pyc
* A draft for error checking
* A commit with accuracy error for float32 data
* Modify code according to the review comments
-
Committed by hong
* add not used output var to gc_check_list; test=develop
* add useless output to gc check list; test=develop
-
- 10 Aug, 2021 12 commits
-
-
Committed by Liu-xiandong
* fix npu compile error, test=develop
* [NPU] Support npu kernel for flatten_contiguous_range op, test=develop
* Update flatten_op_npu.cc

Co-authored-by: qili93 <qili93@qq.com>
-
Committed by niuliling123
Add Kernel Primitives API: ReadData, WriteData, ComputeFunctor
-
Committed by chentianyu03
-
Committed by Aganlengzi
* [NPU] add squared_l2_norm and tests
* [NPU] replace Square&ReduceSumD with SquareSumV1
-
Committed by zyfncg
* Support npu kernel for fill_any_like op
* modify the description of exception
* remove useless template element
* remove useless decorator
* fix the code format error
-
Committed by andyjpaddle
* fix npu compile error, test=develop
* add fill constant batch size like op npu, test=develop

Co-authored-by: qili93 <qili93@qq.com>
-
Committed by XGZhang
-
Committed by shangliang Xu
-
Committed by WangXi
-
Committed by chentianyu03
-
Committed by chentianyu03
* add any.hpp to utils and replace boost::any with self-defined paddle::any
* add copy any.hpp to custom op depends
* modify any.hpp include path
* remove boost from setup.py.in
* add copy any.hpp to custom op depends
* move any.hpp to paddle/utils/ dirs
* move any.h to extension/include directory
* copy utils to right directories
-
Committed by Hui Zhang
* fix for div zero
* fix err; test=develop
* fix lod
-