- 21 Apr, 2020 1 commit
-
-
Committed by Zhou Wei
* cherry-pick, Optimize the error messages of paddle CUDA API
* fix the error messages of paddle CUDA API
* Refactoring PADDLE_ENFORCE_CUDA_SUCCESS, and apply to curand/cudnn/cublas/NCCL
* remove build_ex_string
-
- 20 Apr, 2020 1 commit
-
-
Committed by guofei
cherry-pick #23645
-
- 14 Apr, 2020 1 commit
-
-
Committed by Zeng Jinle
* correct reader device index, test=develop
* fix async executor scope var initialization, test=release/2.0
-
- 10 Apr, 2020 2 commits
- 09 Apr, 2020 1 commit
-
-
Committed by mozga-intel
* Remove the NGraph engine from PDPD repository
  1. Each operator was removed from the operator's directory
  2. Each test was removed from the unittest directory
  3. The parallel executor support was removed from the PDPD
  4. The CMake file was removed from the PDPD
  5. The NG flags were removed from the repository
  test=develop
* Remove ngraph from:
  1. Cmake file
  2. Python file
  test=develop
-
- 07 Apr, 2020 1 commit
-
-
Committed by qingqing01
* Make optimizer consistent in dygraph and static-graph and remove some LOG-INFO
-
- 05 Apr, 2020 1 commit
-
- 04 Apr, 2020 1 commit
-
-
Committed by Zhen Wang
* solve the conflict of ops with the same name. test=develop
-
- 03 Apr, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 01 Apr, 2020 1 commit
-
-
Committed by Zeng Jinle
-
- 25 Mar, 2020 2 commits
-
-
Committed by Zeng Jinle
-
Committed by Zeng Jinle
-
- 20 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
* sequential reader stage 1, test=develop
* fix ut, test=develop
* fix iterable=False reset bug, add some logs and polish code, test=develop
* inference feed partial data, test=develop
* Turn on keep_order=True for test, test=develop
* enhance ut to test more cases, test=develop
* test commit for reverting
* Revert "test commit for reverting", test=develop This reverts commit 80aef42e.
* add ut of merged and unmerged results, test=develop
* add more uts for coverage and add en doc of api, test=develop
* follow comments, test=develop
* change note style, test=develop
-
- 09 Mar, 2020 1 commit
-
-
Committed by Zeng Jinle
* refine grad maker, test=develop
* refactor tracer stage 1, test=develop
* merge develop to solve conflict a third time, test=develop
-
- 02 Mar, 2020 1 commit
-
-
Committed by Zhen Wang
* update ScopeBufferedSSAGraphExecutor, AsyncSSAGraphExecutor, ThreadedSSAGraphExecutor, FastThreadedSSAGraphExecutor, ParallelSSAGraphExecutor and ParallelExecutor for fetching unmerged results.
* add the unit test for fetch_unmerged.
* update ut for multi-card and multi-cpu.
* add the error message and the user suggestion in FetchOpHandle. test=develop
-
- 23 Feb, 2020 1 commit
-
-
Committed by tianshuo78520a
-
- 22 Feb, 2020 1 commit
-
-
Committed by tangwei12
* add sync communicator and implement
-
- 13 Feb, 2020 1 commit
-
-
Committed by Yiqun Liu
test=develop
-
- 12 Feb, 2020 1 commit
-
-
Committed by tangwei12
* add thread barrier for the compiled program
-
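- 11 Feb, 2020 1 commit
-
-
Committed by Wilber
Support compiling without a dependency on NCCL. [1/2]
In a multi-GPU setup, if the build was not compiled with the WITH_NCCL option enabled, the GPUs cannot communicate, so only a single GPU can be used.
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>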
-
- 07 Feb, 2020 1 commit
-
-
Committed by Yiqun Liu
* Add the first implementation of fusion_group op #19621 (#3)
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop
* Disable for mac and windows. test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread. test=develop
* Refine the CUDA kernel to support large dims. test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation of fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test. test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op. test=develop
* Disable fusion_group op for mac and windows. test=develop
* Make the compiling of device code return status instead of hanging up. test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names. test=develop
* Add the check of CUDA driver library in unittest. test=develop
* Enable generating code for a given subgraph. #21126 (#4)
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearrangement of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions. test=develop
* Improve the loading statements in generated codes. test=develop
* Remove unused arguments from formal list. test=develop
* Enable the detection of subgraph of grad ops.
* Generate code for detected subgraph in fusion_group_pass.
* Add an option in BuildStrategy to enable fusion_group_pass and add unittest. test=develop
* Fix a bug when checking whether the shapes of all inputs are the same.
* Add debug information.
* Move subgraph_detector from inference/analysis to the common framework/ir directory. (#5) test=develop
* Call subgraph_detector in fusion_group pass. test=develop
* Disable fusion_group when WITH_GPU is OFF. test=develop
* Refine all PADDLE_ENFORCE messages. test=develop
* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op. test=develop
* Follow review comments. test=develop
-
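- 05 Feb, 2020 1 commit
-
-
Committed by Wilber
Added a WITH_NCCL CMake option that explicitly specifies whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF if WITH_GPU is OFF.
Added the PADDLE_WITH_NCCL definition.
NCCL compilation can be disabled for single-machine, single-GPU use. Multi-GPU use needs NCCL enabled (the default); if NCCL is disabled, only a single GPU can be used.
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>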
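The option rules described in this commit message can be sketched in a few lines of Python. This is only an illustration of the stated logic, not code from the repository: the function names `resolve_with_nccl` and `usable_gpu_count` are hypothetical, and the real behaviour lives in the project's CMake files and C++ sources.

```python
def resolve_with_nccl(with_gpu: bool, with_nccl: bool = True) -> bool:
    """Effective WITH_NCCL value (hypothetical helper).

    WITH_NCCL defaults to ON, but is forced OFF when WITH_GPU is OFF.
    """
    return with_nccl and with_gpu


def usable_gpu_count(visible_gpus: int, nccl_enabled: bool) -> int:
    """Number of GPUs that can be used together (hypothetical helper).

    Without NCCL the cards cannot communicate, so at most one is usable.
    """
    if visible_gpus <= 0:
        return 0
    return visible_gpus if nccl_enabled else 1


if __name__ == "__main__":
    nccl = resolve_with_nccl(with_gpu=True, with_nccl=False)
    print(nccl, usable_gpu_count(4, nccl))
```

With `with_gpu=True` and `with_nccl=False`, the sketch reports NCCL as disabled and limits a four-GPU machine to one usable card, which matches the constraint stated in the commit message.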
-
- 17 Jan, 2020 1 commit
-
-
Committed by tangwei12
* add half_async in the communicator
* fix DistributedStrategy
-
- 13 Jan, 2020 1 commit
-
-
Committed by Chen Weihang
* polish error message of parallel executor, test=develop
* change PADDLE_ENFORCE, test=develop
-
- 10 Jan, 2020 1 commit
-
-
Committed by Zhen Wang
* add bn and relu fuse pass
* add op attr assert and dtype assert
* fix some input and output bugs for the fused op and pattern.
* add the unittest for fuse_bn_act_pass. test=develop
* use normative enforce statements. test=develop
* add the cpu test. test=develop
* add the support of batch_size=1 for the bn with relu op. test=develop
* add the error type for paddle throws. test=develop
* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop
-
- 19 Dec, 2019 1 commit
-
-
Committed by WangXi
-
- 18 Dec, 2019 1 commit
-
-
Committed by Huihuang Zheng
Fixed bugs:
1. The condition sub-graph is not pruned.
2. When the backward graph is extremely simple, all of the backward ops are pruned.
-
- 15 Dec, 2019 1 commit
-
-
Committed by WangXi
-
- 12 Dec, 2019 1 commit
-
-
Committed by WangXi
-
- 11 Dec, 2019 1 commit
-
-
Committed by Zeng Jinle
-
- 06 Dec, 2019 1 commit
-
-
Committed by Zeng Jinle
* polish infer shape registry, test=develop
* modify some operators registry, test=develop
-
- 28 Nov, 2019 1 commit
-
-
Committed by Zeng Jinle
* fix ref_cnt pass, test=develop
* add cpp unittests to reference_count_pass, test=develop
* follow comments, test=develop
-
- 25 Nov, 2019 1 commit
-
-
Committed by zhouwei25
-
- 18 Nov, 2019 1 commit
-
-
Committed by Zeng Jinle
* fix warnings of gcc 8 compilation, test=develop
* fix boost::bad_get, test=develop
* refine PADDLE_ENFORCE, test=develop
-
- 13 Nov, 2019 2 commits
-
-
Committed by Chen Weihang
Add examples for error message writing specification - PreconditionNotMet, Unimplemented, Unavailable (#21137)
* add examples for error spec, test=develop
* change ENFORCE to ENFORCE_**, test=develop
-
Committed by Chen Weihang
* add examples for error msg spec, test=develop
* change ENFORCE to ENFORCE_**, test=develop
* fix error, test=develop
-
- 12 Nov, 2019 1 commit
-
-
Committed by WangXi
-
- 05 Nov, 2019 1 commit
-
-
Committed by Zeng Jinle
* support no need buffer vars in dygraph, test=develop
* fix inference compilation error, test=develop
* update no_need_buffer_vars_inference, test=develop
* add unittests for no_need_buffer_vars_context, test=develop
* refine no_need_buffer_vars by return ref, test=develop
* polish some codes, test=develop
-
- 01 Nov, 2019 1 commit
-
-
Committed by Zeng Jinle
-