- 26 May, 2021 1 commit
-
-
Committed by wuhuanzhou
* optimize OP's compilation time, test=develop * add more op and run ci test, test=develop * CUDA Kernel register in cc file, test=develop * fix macros, test=develop * fix undefined symbol error, test=develop * fix compilation error and undefined symbol, test=develop * fix compilation error on Windows, test=develop * fix compilation error on Windows, test=develop
-
- 25 May, 2021 4 commits
-
-
Committed by 石晓伟
* add the op def proto, test=develop * add while.pbtxt
-
Committed by 王明冬
-
Committed by danleifeng
* fix hogwild_worker dev_ctx place bug; test=develop
-
Committed by jakpiase
-
- 21 May, 2021 1 commit
-
-
Committed by 王明冬
-
- 20 May, 2021 1 commit
-
-
Committed by chentianyu03
* add complex template file * add numtraits for complex template * add complex template type register * modify specify template of complex * modify specify template of complex * modify specify template of complex * modify specify template of complex * make TensorCheckerVisitor support complex type * fix operator= error * add complex template * add complex template type * add complex template type to pyarray transform * add complex template type to pyarray transform * remove complex type for dlpack register * set dlpack support complex type * set dlpack support complex type * set dlpack support complex type * remove explicit for complex constructor * add complex unit test file
-
- 18 May, 2021 1 commit
-
-
Committed by Thunderbrook
* unit double * unit double
-
- 17 May, 2021 2 commits
-
-
Committed by ShenLiang
* fix precision of mp * fix bug of seed * fix dp * print group
-
Committed by Aurelius84
* BugFix with ParseInputDataType from LodTensorArray * BugFix with ParseInputDataType from LodTensorArray
-
- 13 May, 2021 1 commit
-
-
Committed by Baibaifan
-
- 12 May, 2021 2 commits
- 11 May, 2021 2 commits
-
-
Committed by xiayanming
-
Committed by ShenLiang
* fix find_unused_parameters default value
-
- 10 May, 2021 1 commit
-
-
Committed by Thunderbrook
* pslib with cmake * heter util * vlog * heter server test * add dtor * cmake
-
- 08 May, 2021 3 commits
-
-
Committed by danleifeng
* add trainprofiler for heterps in oneps; test=develop * add set_use_ps_gpu; test=develop
-
Committed by houj04
-
Committed by lilong12
* add raw program, test=develop
-
- 07 May, 2021 1 commit
-
-
Committed by Zhou Wei
* Remove paddle_custom_op dynamic libraries, change link to FLUID_CORE on windows, and check copy_to * fix CI
-
- 06 May, 2021 1 commit
-
-
Committed by gongweibao
-
- 30 April, 2021 1 commit
-
-
Committed by XiangGao
-
- 29 April, 2021 3 commits
-
-
Committed by Chen Weihang
-
Committed by cc
-
Committed by Pei Yang
-
- 28 April, 2021 3 commits
-
-
Committed by denglin-github
* Add dlnne engine runtime * Fix log * Remove <const_cast> and remove unrelated modify with dlnne, +clang-format * Fix CMakeList format error * Add copyright message * Fix dlnne CMakeList.txt * Add some paddlepaddle_pass to support more networks * Fix some format bug * Add delete dropout_op pass * Fix some format bug * Fix format bug
-
Committed by Thunderbrook
* Revert "Revert "[PsCore] optimize performance of large kv (#32535)" (#32599)" This reverts commit 809ac036. * brpc dep
-
Committed by Jacek Czaja
* - Added clearing oneDNN per executor * - Executor does not always have FLAGS_use_mkldnn set to true
-
- 27 April, 2021 2 commits
-
-
Committed by tianshuo78520a
This reverts commit 4b7242b0.
-
Committed by XiangGao
Co-authored-by: Yang Zhang <yangzhang@live.com>
-
- 26 April, 2021 3 commits
-
-
Committed by Thunderbrook
* optimize pull sparse * optimize pull sparse * change macro * format
-
Committed by Yiqun Liu
* Unset ReserveSpace for inference program. * Support training from an inference program.
-
Committed by 石晓伟
-
- 25 April, 2021 3 commits
- 23 April, 2021 2 commits
-
-
Committed by Aurelius84
* Refine Constructor logic of ParallelExecutor * refine function name * refine code comment
-
Committed by Baibaifan
solve hccl communicate conflict (#32447)
-
- 22 April, 2021 1 commit
-
-
Committed by WeiXin
* support save/load binary format tensor * Fix error when create cudaplace * Fix error when create cudaplace * Fix error when create cudaplace * get device context from pool. * move define of 'SerializeToStream' and 'DeserializeFromStream' to 'lod_tensor.cc' and 'selected_rows.cc'. * improve coverage. * improve coverage. * polish API * deal with conflict * disable save/load large file in unittest * split unittest.
-
- 21 April, 2021 1 commit
-
-
Committed by zhang wenhui
* add allreduce and broadcast without test (#31024) add allreduce and broadcast without test * Refactor HCCLCommContext to be compatible with Paddle (#31359) Refactor HCCLCommContext to be compatible with Paddle (#31359) * [NPU] add npu kernel for communication op (#31437) * add allreduce and broadcast without test * add c_broadcast_test case * build c_comm_init and c_create_group operators * make the whole thing compile * add broadcast and init op test case but run failed * make unit test compile * fix broadcast test bug and change into hcom for ccl * change c_comm_init and c_create_group ops accordingly * make tests compile * transfer code to 27 * compiled successfully in 28, but run failed * test broadcast in 28, but failed * make hcom primitives work * change hccl data type for base.h * fix broadcast bug * make attributes work * fix group name bug * add allreduce but test failed * allreduce bug for qiuliang * allreduce finished * add allgather and reducescatter * merge all op code * add allgather test * finish run all ccl op test exclude send/recv * all all op and test exclude send/recv * send_v2_npu.cc recv_v2_npiu.cc compiled * fix ccl core dump bug and test allgather, reducescatter, broadcast op * fix allreduce bug just for test * hcom send&recv test pass, without hcom_destroy * for qiuliang test * Ascend Send&Recv Test Pass * all op (ex send/recv) ok * fix bug * merge all ccl op * style merge to PaddlePaddle * merge style * new merge style * merge style 2 * insert an empty at the end * disable ctest for hcom to pass ci Co-authored-by: void-main <voidmain1313113@gmail.com> Co-authored-by: f2hkop <f2huestc@outlook.com> * Add auto-increasing tag id for Hcom OPs (#31702) * add c_reduce_sum op (#31793) add c_reduce_sum op * update Ascendrc hccl to 20.3 (#32126) update Ascendrc hccl to 20.3 (#32126) * fix merge code * change cmake.txt1 * [NPU] Support npu kernel for c sync stream op (#31386) * sync stream npu op * add with_ascend_acl * update c++ unittest * compile all failed * try to pre commit * after pre commit * merge&compile&test hccl successfully! * fix code style * fix code style * fix bugs about hccl * fix some bugs * fix code style * fix style * fix style * fix * fixed * merge develop Co-authored-by: lw921014 <liuwei921014@yeah.net> Co-authored-by: Void Main <voidmain1313113@gmail.com> Co-authored-by: f2hkop <f2huestc@outlook.com> Co-authored-by: xiayanming <41795079@qq.com>
-