- 25 Apr, 2021 11 commits
-
-
Committed by liym27
-
Committed by liym27
-
Committed by WeiXin
* support save/load binary format tensor
* Fix error when creating cudaplace
* Fix error when creating cudaplace
* Fix error when creating cudaplace
* get device context from pool.
* move the definitions of 'SerializeToStream' and 'DeserializeFromStream' to 'lod_tensor.cc' and 'selected_rows.cc'.
* support complex object
* improve coverage.
* improve coverage
* improve coverage.
* fix a bug.
* polish API
* save/load program
* paddle.save/load: layer
* deal with conflict
* if PY2, block test_paddle_save_load.TestSaveLoadLayer
* polish code.
* polish code
* edit unittest
* The condition for an object to be identified as a state_dict becomes stricter
* use 'core._cuda_synchronize'
-
Committed by minghaoBD
-
Committed by ShenLiang
* add pipeline layer
-
Committed by lilong12
* update
-
Committed by wawltor
* fix bug: when x.dim < y.dim, the result of compare_op is the inverse of the expected result
* support CUDA in the fix for the compare broadcast bug
-
Committed by Shang Zhizhou
* fix tc trt shape
* fix fc dynamic shape
* add fc shape assert
* update
-
Committed by pangyoki
* let paddle.utils.install_check support CPU package with GPU device
* use use_cuda in dygraph checking
* add unittest for install_check
-
Committed by Leo Chen
-
Committed by WeiXin
-
- 24 Apr, 2021 2 commits
-
-
Committed by Huihuang Zheng
Reduce max iter size to fix random failures of test_yolov3 on Windows with OpenBLAS. Decrease batch size to fix random failures of PE-related unit tests.
-
Committed by zhiboniu
-
- 23 Apr, 2021 6 commits
-
-
Committed by lilong12
* add c_identity op, test=develop
-
Committed by Leo Chen
* refactor_check_finite_and_scale_npu_kernel
* fix compile
* add alloc_float_status op
* add alloc_float_status op
* add FloatStatus for check_finite_and_unscale
* refine code
* remove unnecessary logic
* refine for fleet
-
Committed by lilong12
* add c_concat op
-
Committed by shanliang1992
-
Committed by ShenLiang
-
Committed by Kqnonrime
* fix two error message
* fix two error message
* fix error
* fix error
* fix error
* fix error
* fix some error message
* fix some error
* fix error
* fix some error
* fix some error
* fix some error
* fix one error
* fix some error
* fix seven error message
* fix error
* fix error
* fix error
* fix error
-
- 22 Apr, 2021 7 commits
-
-
Committed by Yang Zhang
-
Committed by wuyefeilin
support int32 and int64 kernel for clip operator
-
Committed by zhiboniu
-
Committed by ShenLiang
* add clip/check
* add amp & clip grad in dygraph
* add logging
-
Committed by Feiyu Chan
add glu in nn.functional
-
Committed by WeiXin
* support save/load binary format tensor
* Fix error when creating cudaplace
* Fix error when creating cudaplace
* Fix error when creating cudaplace
* get device context from pool.
* move the definitions of 'SerializeToStream' and 'DeserializeFromStream' to 'lod_tensor.cc' and 'selected_rows.cc'.
* improve coverage.
* improve coverage.
* polish API
* deal with conflict
* disable save/load large file in unittest
* split unittest.
-
Committed by tianshuo78520a
-
- 21 Apr, 2021 5 commits
-
-
Committed by Chen Weihang
* add support for optimizer with varbase input
* refine cond
* fix failed unittest
* add test for coverage
-
Committed by Yuang Liu
-
Committed by liuyuhui
-
Committed by jakpiase
-
Committed by jakpiase
-
- 20 Apr, 2021 3 commits
- 19 Apr, 2021 4 commits
-
-
Committed by Leo Chen
* [NPU] support GarbageCollector for npu (#31874)
* support GarbageCollector for npu
* fix typo
* fix gather_grad
* disable NPUDefaultStreamGarbageCollector on NPU
* [NPU] support npu for memcpy op (#31808)
* support npu for memcpy op
* add ut
* fix ut
* fix typo
* 【NPU】fix bug of using temp vector (#31963)
* fix bug when beta1_pow on cpu (#31995)
* [NPU] support npu profiler (#31684)
* support npu profiler
* add python api
* fix bugs
* add wrapper for incomplete type
* update profile proto
* record npu wait
* add xpu placeholder
* fix adam (#32016)
* [NPU] enable async copy and add wait before sync operation (#31956)
* enable async copy and add wait before sync operation
* remove unnecessary wait
* add FillNpuTensorWithConstant
* refine
* fix fill_constant
* make TensorFromVector/TensorToVector sync
* [NPU] Support dataloader on npu place. (#31867)
* [NPU] Wait on NPUPlace (#32086)
* [NPU] fix cast op (#32121)
* fix npu kernel of cast op to handle casting to same dtype
* add comments
* [NPU] support cann 20.3 (#32044)
* fix compile problem on cann 20.3
* fix ut
* fix test_mul
* fix check_finite_and_scale
* fix lookup_table_v2_grad
* fix cmake
* support print op
* [NPU] Support npu save load (#31893)
* support save load for NPU
* add save load npu unittest
* support np.array transform in NPU
* fix errors
* delete dygraph in unittest
* add Wait
* fix unittest
* fix review comment
* fix unittest problem
* fix little problem
* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance (#32196)
* change aclrtSynchronizeDevice to aclrtSynchronizeStream for better performance
* refine code
* fix NPUDeviceContext in all c++ unittest (#32198)
* fix NPUDeviceContext in all c++ unittest
* refine log
Co-authored-by: pangyoki <pangyoki@126.com>
* [NPU] Remove TensorFromVector and avoid sync copy in npu op kernel for better performance (#31994)
* enable async copy and add wait before sync operation
* remove unnecessary wait
* add FillNpuTensorWithConstant
* refine
* fix fill_constant
* change TensorFromVector to FillNpuTensorWithConstant
* fix ignored api
* delete extra unittest
* fix little error
* fix update_loss_scaling_op_npu and check_finite_and_unscale_op_npu
* change TensorCopySync to TensorCopy
* delete useless Wait and add StreamWait
* fix npu_stream error
* fix check_finite_and_unscale_op_npu TensorCopy
* only save stream wait
* fix NPUDeviceContext in all c++ unittest
* delete wait
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
* delete useless unittest file (#32206)
* Fix op test (#32231)
* fix conditional block (#32243)
* fix adam bug again (#32246)
* fix compile
* fix ut
* fix ut
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
Co-authored-by: pangyoki <pangyoki@126.com>
-
Committed by ShenLiang
* support dp & mp
-
Committed by Jiabin Yang
* fix sublayer error with include_sublayers=False
* add ut
* refactor include_sublayers related api
* fix ut
* fix ut of transformer
* fix ut of transformer
* remove useless code
* change sublayer api
* polish code
* add test for include_self=True
-
Committed by joanna.wozna.intel
-
- 17 Apr, 2021 1 commit
-
-
Committed by ShenLiang
* add model parallel support in dygraph
-
- 15 Apr, 2021 1 commit
-
-
Committed by 123malin
* add index_dataset and index_sampler for tree-based model
-