- 25 Apr, 2021 6 commits
-
-
Committed by wawltor
* fix bug: when x.dim < y.dim, the result of compare_op is the inverse of the expected result
* support CUDA in the fix for the compare broadcast bug
-
Committed by Shang Zhizhou
* fix tc trt shape
* fix fc dynamic shape
* add fc shape assert
* update
-
Committed by pangyoki
* let paddle.utils.install_check support CPU package with GPU device
* use use_cuda in dygraph checking
* add unittest for install_check
-
Committed by Leo Chen
-
Committed by Zhang Ting
-
Committed by WeiXin
-
- 24 Apr, 2021 2 commits
-
-
Committed by Huihuang Zheng
Reduce the max iteration size to fix the random test_yolov3 failure on Windows with OpenBLAS. Decrease the batch size to fix random failures in pe-related unit tests.
-
Committed by zhiboniu
-
- 23 Apr, 2021 8 commits
-
-
Committed by lilong12
* add c_identity op, test=develop
-
Committed by Leo Chen
* refactor_check_finite_and_scale_npu_kernel
* fix compile
* add alloc_float_status op
* add alloc_float_status op
* add FloatStatus for check_finite_and_unscale
* refine code
* remove unnecessary logic
* refine for fleet
-
Committed by zhiboniu
-
Committed by Baibaifan
solve hccl communication conflict (#32447)
-
Committed by lilong12
* add c_concat op
-
Committed by shanliang1992
-
Committed by ShenLiang
-
Committed by Kqnonrime
* fix two error messages
* fix two error messages
* fix error
* fix error
* fix error
* fix error
* fix some error messages
* fix some errors
* fix error
* fix some errors
* fix some errors
* fix some errors
* fix one error
* fix some errors
* fix seven error messages
* fix error
* fix error
* fix error
* fix error
-
- 22 Apr, 2021 11 commits
-
-
Committed by Yang Zhang
-
Committed by wuyefeilin
support int32 and int64 kernels for the clip operator
-
Committed by hutuxian
-
Committed by Yuang Liu
-
Committed by Feiyu Chan
* import sequence_* API to new namespace
* fix typos, remove alias marking
* update sample code
* fix sample code
* fix docstring for sequence_mask
-
Committed by wangxinxin08
* modify conv2d_transpose docs
-
Committed by zhiboniu
-
Committed by ShenLiang
* add clip/check
* add amp & clip grad in dygraph
* add logging
-
Committed by Feiyu Chan
add glu in nn.functional
-
Committed by WeiXin
* support save/load binary format tensor
* Fix error when creating cudaplace
* Fix error when creating cudaplace
* Fix error when creating cudaplace
* get device context from pool.
* move definitions of 'SerializeToStream' and 'DeserializeFromStream' to 'lod_tensor.cc' and 'selected_rows.cc'.
* improve coverage.
* improve coverage.
* polish API
* deal with conflict
* disable save/load large file in unittest
* split unittest.
-
Committed by tianshuo78520a
-
- 21 Apr, 2021 13 commits
-
-
Committed by Chen Weihang
* add support for optimizer with varbase input
* refine cond
* fix failed unittest
* add test for coverage
-
Committed by zhang wenhui
* add allreduce and broadcast without test (#31024)
* Refactor HCCLCommContext to be compatible with Paddle (#31359)
* [NPU] add npu kernel for communication op (#31437)
* add allreduce and broadcast without test
* add c_broadcast_test case
* build c_comm_init and c_create_group operators
* make the whole thing compile
* add broadcast and init op test case but run failed
* make unit test compile
* fix broadcast test bug and change into hcom for ccl
* change c_comm_init and c_create_group ops accordingly
* make tests compile
* transfer code to 27
* compiled successfully in 28, but run failed
* test broadcast in 28, but failed
* make hcom primitives work
* change hccl data type for base.h
* fix broadcast bug
* make attributes work
* fix group name bug
* add allreduce but test failed
* allreduce bug for qiuliang
* allreduce finished
* add allgather and reducescatter
* merge all op code
* add allgather test
* finish run all ccl op test exclude send/recv
* all all op and test exclude send/recv
* send_v2_npu.cc recv_v2_npiu.cc compiled
* fix ccl core dump bug and test allgather, reducescatter, broadcast op
* fix allreduce bug just for test
* hcom send&recv test pass, without hcom_destroy
* for qiuliang test
* Ascend Send&Recv Test Pass
* all op (ex send/recv) ok
* fix bug
* merge all ccl op
* style merge to PaddlePaddle
* merge style
* new merge style
* merge style 2
* insert an empty at the end
* disable ctest for hcom to pass ci
Co-authored-by: void-main <voidmain1313113@gmail.com>
Co-authored-by: f2hkop <f2huestc@outlook.com>
* Add auto-increasing tag id for Hcom OPs (#31702)
* add c_reduce_sum op (#31793)
* update Ascendrc hccl to 20.3 (#32126)
* fix merge code
* change cmake.txt1
* [NPU] Support npu kernel for c sync stream op (#31386)
* sync stream npu op
* add with_ascend_acl
* update c++ unittest
* compile all failed
* try to pre commit
* after pre commit
* merge&compile&test hccl successfully!
* fix code style
* fix code style
* fix bugs about hccl
* fix some bugs
* fix code style
* fix style
* fix style
* fix
* fixed
* merge develop
Co-authored-by: lw921014 <liuwei921014@yeah.net>
Co-authored-by: Void Main <voidmain1313113@gmail.com>
Co-authored-by: f2hkop <f2huestc@outlook.com>
Co-authored-by: xiayanming <41795079@qq.com>
-
Committed by Yiqun Liu
-
Committed by huangxu96
-
Committed by Aurelius84
-
Committed by Aurelius84
-
Committed by Yuang Liu
-
Committed by Leo Chen
* [NPU] register finalize on exit
* fix
-
Committed by liuyuhui
-
Committed by jakpiase
-
Committed by gongweibao
-
Committed by jakpiase
-
Committed by xiemoyuan
* remove fluid for auto_checkpoint.
* fix bug.
-