- 03 Aug, 2021 1 commit
-
-
Committed by QingshuChen
* support Kunlun2
* support KL2
* support KL2
-
- 09 Apr, 2021 1 commit
-
-
Committed by Leo Chen
* [feature] support npu allocator (#30840) [feature] support npu allocator
* [feature] support npu operator (#30951) [feature] support npu operator
* [feature] support npu allocator, part 2 (#30972)
* support npu allocator
* add npu device context
* fix some compile problem
* fix some compile problem
* add npu info
* compile ok
* fix include dir
* support naive_best_fit_allocator
* run ut ok, but failed to exit
* call aclrtResetDevice before exit
* fix aclFinalize
* add system allocator test
* add selected_gpus in gtest
* add tensor_test for npu
* support npu op, initial commit
* add npu stream
* add elementwise_add_op
* compile ok
* fix typo
* fix elementwise_add_op_npu_test
* support op run
* test can run but failed
* change aclopExecuteV2 to aclopCompileAndExecute
* support parsing ascend rank table file (#31000) support parsing ascend rank table file
* Fix reshape on GE graph. (#31084) Fix reshape on GE graph
* add npu kernel for elementwise_sub and elementwise_sub_grad (#30973)
* add npu sub op
* fix typo
* rename test
* fix bug
* fix bug
* add fp16 kernel
* fix typo
* support sub grad op
* support elementwise_sub_grad op
  Co-authored-by: frankwhzhang <frankwhzhang@126.com>
* Fix compilation problem (#31100) Fix compilation problem (#31100)
* fix compile
* fix code style
* remove const_cast
* support adding correct npu op in pybind.h (#31143)
* support adding correct npu op in pybind.h
* refine code
* [NPU] Support executor with NPU (#31057)
* [NPU] Support executor with NPU
* Fix code according to reviews
* Fix code
* Add unittest for sub op npu
* refactor npu device manager (#31154) refactor npu device manager (#31154)
* fix selected npus
* fix compile
* fix reading flags from env
* format
Co-authored-by: xiayanming <41795079@qq.com>
Co-authored-by: gongweibao <weibao.gong@gmail.com>
Co-authored-by: frankwhzhang <frankwhzhang@126.com>
Co-authored-by: liym27 <33742067+liym27@users.noreply.github.com>
-
- 22 Feb, 2021 1 commit
-
-
Committed by Qi Li
-
- 04 Feb, 2021 1 commit
-
-
Committed by wanghuancoder
* use iwyu clean include second time, test=develop
-
- 15 Jan, 2021 1 commit
-
-
Committed by 石晓伟
-
- 17 Dec, 2020 1 commit
-
-
Committed by wanghuancoder
* Windows generate pdb and dump, for debug
* fix code style, test=develop
* modify cmakelist
-
- 20 Nov, 2020 1 commit
-
-
Committed by gongweibao
-
- 04 Nov, 2020 1 commit
-
-
Committed by Chen Weihang
-
- 30 Oct, 2020 1 commit
-
-
Committed by Leo Chen
-
- 21 Aug, 2020 1 commit
-
-
Committed by QingshuChen
* support Baidu AI Accelerator
* test=kunlun
* minor
* test=kunlun
* support xpu op in separate file
* test=kunlun
* update XPU error message and remove duplicated code
* test=kunlun
* minor
* test=kunlun
* minor
* test=kunlun
-
- 04 Aug, 2020 1 commit
-
-
Committed by Chen Weihang
-
- 29 Jul, 2020 2 commits
-
-
Committed by Chen Weihang
* unified signal error format
* refine signal error message
-
Committed by Chen Weihang
* simplify buffered reader to improve DataLoader performance
* fix 22 failed unittests
* fix cuda pinned context condition
* fix test_reader_reset failed
* fix two failed unittests
* change unittest place
* polish error message
* polish cast op GetExpectedKernelType
* remove debug info in unittest
-
- 15 Jul, 2020 1 commit
-
-
Committed by GaoWei8
* Refine PADDLE_ENFORCE in paddle/fluid/platform test=develop
-
- 07 Jul, 2020 1 commit
-
-
Committed by GaoWei8
* refine PADDLE_ENFORCE test=develop
-
- 03 Jun, 2020 1 commit
-
-
Committed by Chen Weihang
* remove REPLACE_ENFORCE_GLOG compile option & add ci rule prohibit LOG(FATAL) using, test=develop
* remove ci test case, test=develop
* replace all LOG(FATAL) & polish message, test=develop
* fix typo, test=develop
* polish error info detail, test=develop
-
- 01 Jun, 2020 1 commit
-
-
Committed by Wilber
-
- 19 May, 2020 1 commit
-
-
Committed by Leo Chen
-
- 29 Apr, 2020 1 commit
-
-
Committed by 石晓伟
* update the analysis predictor, test=develop
* update the unit test, test=develop
* no priority set before the interface determined, test=develop
* interface name generalization, test=develop
-
- 04 Apr, 2020 1 commit
-
-
Committed by Leo Chen
* fix init_gflags with 'python -c', test=develop
* add test, test=develop
* use sys.executable instead of python, test=develop
* keep dummy, test=develop
-
- 05 Dec, 2019 1 commit
-
-
Committed by Huihuang Zheng
As the title
-
- 04 Dec, 2019 1 commit
-
-
Committed by Pei Yang
* make DisableGlogInfo able to mute all logs in inference.
-
- 03 Dec, 2019 1 commit
-
-
Committed by Huihuang Zheng
Add a warning message when GLOG initialization fails
-
- 18 Oct, 2019 1 commit
-
-
Committed by WangXi
-
- 11 Sep, 2019 1 commit
-
-
Committed by Huihuang Zheng
TemporaryAllocator is a singleton used to allocate memory for cuDNN. Since it is a singleton, we can delete it for better memory performance. We replace TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for the stream, so no singleton is needed. Also added data_feed_proto to operator to fix CI in CPU compilation.
-
- 30 Aug, 2019 2 commits
-
-
Committed by liuwei1031
-
Committed by Zeng Jinle
-
- 28 Aug, 2019 1 commit
-
-
Committed by Zeng Jinle
* add signal message to stderr, test=develop
* add unittests for ugly SignalHandle, test=develop
-
- 16 Aug, 2019 1 commit
-
-
Committed by Zeng Jinle
-
- 04 Jul, 2019 1 commit
-
-
Committed by chengduo
* enhance execution error info test=develop
-
- 05 Jun, 2019 1 commit
-
-
Committed by chengduo
test=develop
-
- 18 Apr, 2019 1 commit
-
-
Committed by gongweibao
-
- 28 Mar, 2019 1 commit
-
-
Committed by gongweibao
-
- 15 Mar, 2019 1 commit
-
-
Committed by qingqing01
* Support Sync Batch Norm.
* Note, do not enable it in one device.
Usage:
    build_strategy = fluid.BuildStrategy()
    build_strategy.sync_batch_norm = True
    binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(
        loss_name=loss_mean.name,
        build_strategy=build_strategy)
-
- 21 Feb, 2019 1 commit
-
-
Committed by Dun
* refine profiler && add runtime tracer
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* fix bug && test=develop
* add thread id map && test=develop
* test=develop
* testing
* bug fix
* remove cuda event && refine code && test=develop
* test=develop
* test=develop
* test=develop
* fix windows temp file && test=develop
* test=develop
* fix windows bug && test=develop
* fix start up issue && test=develop
* code polish && test=develop
* remove unused code && test=develop
* add some cupti cbid && test=develop
* add FLAGS_multiple_of_cupti_buffer_size && test=develop
* fix compile error && test=develop
* add keyword && test=develop
* fix && test=develop
* code polish && test=develop
-
- 21 Dec, 2018 1 commit
-
-
Committed by chengduo
* Add Temporary Allocator
* add Temporary Allocator to DeviceContext test=develop
* code refine test=develop
* fix mean_iou test=develop
* Add DeviceTemporaryAllocator test=develop
* fix conv_op bug test=develop
* small fix test=develop
* code refine test=develop
* log refine test=develop
* fix unit test test=develop
* move double check
* refine concat_and_split test=develop
* add limit_of_temporary_allocation test=develop
* fix name test=develop
-
- 05 Dec, 2018 2 commits
-
-
Committed by tensor-tang
test=develop
-
Committed by tensor-tang
test=develop
-
- 04 Dec, 2018 1 commit
-
-
Committed by Wu Yi
* wip multi process multi gpu dist training
* workable for p2p
* update test=develop
* change back env name test=develop
* fix alloc init
* fix cpu build test=develop
* fix mac tests test=develop
* refine code
* refine test=develop
-
- 26 Nov, 2018 1 commit
-
-
Committed by minqiyang
test=develop
-