- 31 3月, 2020 1 次提交
-
-
由 Leo Chen 提交于
* expand parameters, test=develop * support resnet, test=develop * fix resnet, test=develop * support duplicable out, test=develop * support ptb * fix bugs, test=develop * support null input, test=develop * fix bugs, test=develop * fix batchNorm is_test, test=develop * refine code, test=develop * follow comments, test=develop * follow comments, test=develop * follow comments, test=develop * follow comments, test=develop
-
- 29 3月, 2020 1 次提交
-
-
由 Zeng Jinle 提交于
* distinguish public/private vars, test=develop * fix windows issues, test=develop
-
- 27 3月, 2020 1 次提交
-
-
由 Zeng Jinle 提交于
* expose dygraph.grad api, test=develop, test=document_fix * add more parameter in dygraph.grad API, test=develop * add only_inputs=True parameter, test=develop * follow comments, test=develop, test=document_fix * fix typo, test=develop, test=document_fix
-
- 26 3月, 2020 2 次提交
-
-
由 Zhaolong Xing 提交于
* add dynamic plugin support. test=develop * change emb eltwise layernorm to math function test=develop * add emb eltwise layernorm test=develop * can run dynamic shape ernie test=develop * fix ci test=develop * add ut for trt ernie dynamic test=develop * refine dynamic shape c++ interface. test=develop * fix comments test=develop * fix comments test=develop
-
由 xujiaqi01 提交于
* add clear_one_table * test=develop
-
- 20 3月, 2020 2 次提交
-
-
由 Zeng Jinle 提交于
* sequential reader stage 1, test=develop * fix ut, test=develop * fix iterable=False reset bug, add some logs and polish code, test=develop * inference feed partial data, test=develop * Turn on keep_order=True for test, test=develop * enhance ut to test more cases, test=develop * test commit for reverting * Revert "test commit for reverting", test=develop This reverts commit 80aef42e. * add ut of merged and unmerged results, test=develop * add more uts for coverages and add en doc of api, test=develop * follow comments, test=develop * change note style, test=develop
-
由 Zeng Jinle 提交于
* add double grad implementation for dygraph, test=develop * polish code, add uts, test=develop * fix place bug, test=develop * polish codes, add more uts for coverages, test=develop * add no_grad_set, test=develop * add star gan ut, test=develop * follow comments, test=develop
-
- 19 3月, 2020 1 次提交
-
-
由 songyouwei 提交于
* move __getitem__ to cpp * bug fix * add type check and gil release * support negative step with omitted ends test=develop * code refine test=develop * bug fix test=develop * slice always return different pyobj test=develop
-
- 09 3月, 2020 2 次提交
-
-
由 zhaoyuchen2018 提交于
As model fails when enable int8 quant, so disable allocate memory in cpu for small variable.
-
由 Zhaolong Xing 提交于
* change the ci trt from version 5. to 6.0 * paddle-trt dynamic shape support init * conv+bias or conv+bn dynamic shape support test=develop * modity trt engine opconvert test=develop * fix ci error test=develop
-
- 04 3月, 2020 2 次提交
-
-
由 Zeng Jinle 提交于
* add recorded cuda memory apis, fix typo, test=develop * add more ut, test=develop * follow comments, test=develop * fix py35 incompatible issues, test=develop
-
由 石晓伟 提交于
* encapsulate the PaddleTensorToLoDTensor, test=develop * serialize the pd_tensor, test=develop * serialize tensors to file, test=develop
-
- 03 3月, 2020 1 次提交
-
-
由 Zhang Ting 提交于
* add fluid.device_guard to specify the device type for Op
-
- 02 3月, 2020 3 次提交
-
-
由 Zhen Wang 提交于
* update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results. * add the unit test for fetch_unmerged. * update ut for multi-card and multi-cpu. * add the error message and the user suggestion in FetchOpHandle. test=develop
-
由 Chen Weihang 提交于
* add lodtensor share memory & serialization, test=develop * fix windows compile error, test=develop * deal vartype pickle & fix unittest matching error message, test=develop * update timeout variable name, test=develop * refactor memory map implement, test=develop * clear mmap file discripter when exit unexpectedly, test=develop * remove the child process fd in advance, test=develop * remove mmap fds after Queue.put in child process, test=develop * add hard unittests for register exit func, test=develop * fix python2 compatibility problem in unittest, test=develop * fix exception unittest error, test=develop * polish code based review comment, test=develop
-
由 hutuxian 提交于
* user can call dataset.set_download_cmd to set its customized download cmd * add UT to cover this scenario
-
- 27 2月, 2020 1 次提交
-
-
由 zhaoyuchen2018 提交于
* Refine adam op, test=develop * Fuse kernels together to reduce cpu time. * Refine paddle enforce, test=develop * Remove some comments, test=develop * Refine code,test=develop * Refine cuda kernel, test=develop * Refine code according to comments, test=develop
-
- 26 2月, 2020 1 次提交
-
-
由 Leo Chen 提交于
* support cond in clone, test=develop * refine code, test=develop * refine code, test=develop * follow comments, test=develop * refine code, test=develop
-
- 25 2月, 2020 1 次提交
-
-
由 hutuxian 提交于
* Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator. * Add a config for DynamicAdjustChannelNum function to denote whether we will discard the remaining instances when they are not be distributed evenly. * Remove CPU code in Pull/PushSparse and we will add it back when testing it fully. * Fix some known issues: such as copying persistable vars after one epoch running.
-
- 23 2月, 2020 1 次提交
-
-
由 tianshuo78520a 提交于
-
- 22 2月, 2020 1 次提交
-
-
由 tangwei12 提交于
* add sync communicator and implement
-
- 18 2月, 2020 1 次提交
-
-
由 wangchaochaohu 提交于
* add python flag to control profile level test=develop
-
- 12 2月, 2020 1 次提交
-
-
由 tangwei12 提交于
* add thread barrier for the compiled program
-
- 11 2月, 2020 1 次提交
-
-
由 yaoxuefeng 提交于
* update * update test=develop * update compile set test=develop * update compile set test=develop * update test=develop * update test=develop * update test=develop * update compile setting test=develop * update compile setting test=develop * update run demo test=develop * update test=develop * update test=develop * fix test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update format test=develop * update format test=develop * update style test=develop * update style test=develop * change style test=develop * change style test=develop * change style test=develop * add dataset unittest test=develop * update test=develop * update for record test=develop * udpate style for record test=develop * update for record test=develop * update for record test=develop * update for record test=develop * fix format test=develop * update test=develop * update test=develop * update test=develop * update test=develop * update test=develop
-
- 10 2月, 2020 1 次提交
-
-
由 Wilber 提交于
Compile without nccl deps. [1/2] Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 07 2月, 2020 1 次提交
-
-
由 Yiqun Liu 提交于
* Add the first implememtation of fusion_group op #19621 (#3) * Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop * Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop * Disable for mac and windows. test=develop * Refine the codes to support manually specified num_threads and workload_per_thread. test=develop * Refine the CUDA kernel to support large dims. test=develop * Add DeviceCodePool to manage all device codes. * Add the first implementation fusion_group op. * Add unit-test for fusion_group op. * Add the check of result. * Add the check of nvrtc in unit-test. test=develop * Add comment to explain the inputs, outputs and features of fusion_group op. test=develop * Disable fusion_group op for mac and windows. test=develop * Make the compiling of device code return status instead of hanging up. test=develop * Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API. * Unify fusion_group_op's input and output names. test=develop * Add the check of CUDA driver library in unittest. test=develop * Enable generating code for a given subgraph. #21126 (#4) * Enable generating code for a given subgraph. * Support sorting the subgraph. * Remove the rearange of expressions because we use the sorted subgraph directly. * Enable generating code for a subgraph which is composed of grad ops. * Use expression information to check the accuracy in unittest. * Separate load and store from computation expressions. test=develop * Improve the loading statements in generated codes. test=develop * Remove unused arguments from formal list. test=develop * Enable the detection of subgraph of grad ops. * Generate code for detected subgraph in fusion_group_pass. * Add an option in BuildStrategy to enable fusion_group_pass and add unittest. test=develop * Fix a bug when checking whether the shape of all inputs are the same. * Add debug information. * Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#5) test=develop * Call subgraph_detector in fusion_group pass. test=develop * Disable fusion_group when WITH_GPU is OFF. test=develop * Refine all PADDLE_ENFORCE message. test=develop * Fix the case that some inputs are not defined in grad ops, and set op_role for fused op. test=develop * Follow review comments. test=develop
-
- 05 2月, 2020 1 次提交
-
-
由 Wilber 提交于
cmake选项中添加了WITH_NCCL,显示指定是否编译NCCL的部分代码,WITH_NCCL默认打开,但如果WITH_GPU为OFF,则关闭WITH_NCCL 添加了PADDLE_WITH_NCCL定义 单机单卡能够关闭NCCL编译,多卡的话需要默认打开NCCL,如果关闭NCCL,则只能使用单卡 Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 04 2月, 2020 2 次提交
- 02 2月, 2020 1 次提交
-
-
由 xujiaqi01 提交于
* add GeneralRoleMaker which is for general usage * test=develop
-
- 21 1月, 2020 1 次提交
-
-
由 Leo Chen 提交于
remove unnecessary template.
-
- 19 1月, 2020 1 次提交
-
-
由 Leo Chen 提交于
* use function instead of lambda, test=develop * follow comments, test=develop
-
- 17 1月, 2020 2 次提交
-
-
由 Yiqun Liu 提交于
* Implement a common python unittest to test the ir passes. test=develop * Save the results in np.array and support to startup on CPU. test=develop * Fix the unittest. test=develop * Add check_program to check whether the optimized program is different from the origin one. test=develop * Remove the inferface all_ops. test=develop * Add exception test in pass_test. test=develop
-
由 tangwei12 提交于
* add half_async in the communicator * fix DistributedStrategy
-
- 16 1月, 2020 1 次提交
-
-
由 Chen Weihang 提交于
* add multiprocess for dygraph data loader, test=develop * polish code & add safe gurad, test=develop * refactor dygraph dataloader & add signal handler, test=develop * fix member initializer compile error on ci, test=develop * fix member initializer compile error one more, test=develop * remove useless config, test=develop * skip windows incompatible problem, test=develop * add unittest for coverage, test=coverage * add more exception unittest case, test=develop * deal with signal handler coverage, test=develop * polish code & add signal handler tests, test=develop * deal with coverage ci problem, test=develop * split data loader test & coverage ci fix, test=develop * remove test_imperative_data_loader_with_exception, test=develop * remove singal process except test case, test=develop * add exception tests again & remove sample list test, test=develop * split normal and exception unittests to diff class, test=develop * polish doc for use_multiprocess effect in static mode, test=develop
-
- 14 1月, 2020 1 次提交
-
-
由 xujiaqi01 提交于
* add collective communication library in fleet to replace mpi * test=develop
-
- 10 1月, 2020 1 次提交
-
-
由 Zhen Wang 提交于
* add bn and relu fuse pass * add op attr assert and dtype assert * fix some inputs&&outputs bugs for the fused op and pattern. * add the unittest for fuse_bn_act_pass. test=develop * use normative enforce statements. test=develop * add the cpu test. test=develop * add the support of batch_size=1 for the bn with relu op. test=develop * add the error type for paddle throws. test=develop * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop
-
- 08 1月, 2020 1 次提交
-
-
由 zhongpu 提交于
* modify fc to linear in sample code, test=develop * remove FC, test=develop * remove warnings, test=develop * drop fluid/imperative/README.md , test=develop * change fc to linear, test=develop * polish code style, test=develop
-
- 06 1月, 2020 2 次提交
-
-
由 Huihuang Zheng 提交于
-
由 123malin 提交于
* add distributed_strategy
-