- 12 Mar, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 11 Mar, 2020 4 commits
-
-
Committed by Wilber
* add skip_layernorm pass. test=develop
-
Committed by wawltor
In the gemm op, use gemm in place of batched gemm to speed up the matmul op.
-
Committed by Adam
-
Committed by Zhaolong Xing
* 1. add embedding eltwise layernorm fuse 2. add embedding eltwise layernorm op 3. refine inplace_add_relu 4. refine fc_eltwise_layernorm test=develop
* 1. refine fc test=develop
* fix comments test=develop
* fix comments test=develop
-
- 10 Mar, 2020 1 commit
-
-
Committed by guofei
As the title.
-
- 09 Mar, 2020 6 commits
-
-
Committed by Zeng Jinle
* refine grad maker, test=develop
* refactor tracer stage 1, test=develop
* merge develop to resolve conflicts a third time, test=develop
-
Committed by liu zhengxi
* fix fc padding during fusion, test=develop
* fix optim model inference after SaveOptimModel, test=develop
-
Committed by tangwei12
* fix communicator when breaking under PyReader mode, test=develop
* revert some vlog level to 0, test=develop
-
Committed by mapingshuo
add lookup_table_dequant_op
-
Committed by zhaoyuchen2018
Because the model fails when int8 quantization is enabled, disable CPU memory allocation for small variables.
-
Committed by Zhaolong Xing
* change the ci trt from version 5. to 6.0
* paddle-trt dynamic shape support init
* conv+bias or conv+bn dynamic shape support test=develop
* modify trt engine opconvert test=develop
* fix ci error test=develop
-
- 08 Mar, 2020 1 commit
-
-
Committed by tangwei12
-
- 07 Mar, 2020 2 commits
-
-
Committed by Zhang Ting
-
Committed by wangchaochaohu
* refine the profiler print test=develop
-
- 06 Mar, 2020 2 commits
-
-
Committed by Michał Gallus
-
Committed by Chen Weihang
* polish implementation details of data loader, test=develop
* solve coverage ci problem, test=develop
-
- 05 Mar, 2020 3 commits
-
-
Committed by Wilber
Fix concat_mkldnn op when encountering extreme conditions.
-
Committed by hong
* reduce default attrs for dynamic graph, test=develop
* add some explanations for explicit attr, test=develop
* tweak explicit attr comments, test=develop
-
Committed by Zhaolong Xing
test=develop
-
- 04 Mar, 2020 3 commits
-
-
Committed by hong
* fix loaded program load bug; test=develop
* first version
* speed up backward engine; test=develop
* remove useless code; test=develop
* recover io.py; test=develop
* remove useless code; test=develop
* remove useless code; test=develop
-
Committed by Zeng Jinle
* add recorded cuda memory apis, fix typo, test=develop
* add more ut, test=develop
* follow comments, test=develop
* fix py35 incompatible issues, test=develop
-
Committed by 石晓伟
* encapsulate the PaddleTensorToLoDTensor, test=develop
* serialize the pd_tensor, test=develop
* serialize tensors to file, test=develop
-
- 03 Mar, 2020 3 commits
-
-
Committed by Zhang Ting
-
Committed by Zhang Ting
* add fluid.device_guard to specify the device type for Op
-
Committed by 石晓伟
* change the function in op_teller, test=develop
* correct the commit-id, test=develop
-
- 02 Mar, 2020 6 commits
-
-
Committed by Zhen Wang
* update ScopeBufferedSSAGraphExecutor, AsyncSSAGraphExecutor, ThreadedSSAGraphExecutor, FastThreadedSSAGraphExecutor, ParallelSSAGraphExecutor and ParallelExecutor for fetching unmerged results.
* add the unit test for fetch_unmerged.
* update ut for multi-card and multi-cpu.
* add the error message and the user suggestion in FetchOpHandle. test=develop
-
Committed by wangchaochaohu
-
Committed by Chen Weihang
* add lodtensor share memory & serialization, test=develop
* fix windows compile error, test=develop
* handle vartype pickle & fix unittest matching error message, test=develop
* update timeout variable name, test=develop
* refactor memory map implementation, test=develop
* clear mmap file descriptors when exiting unexpectedly, test=develop
* remove the child process fd in advance, test=develop
* remove mmap fds after Queue.put in child process, test=develop
* add hard unittests for register exit func, test=develop
* fix python2 compatibility problem in unittest, test=develop
* fix exception unittest error, test=develop
* polish code based on review comments, test=develop
-
Committed by liu zhengxi
* fix inference c api lod, test=develop
* fix capi lod problem and enrich tests, test=develop
* delete useless header files and alter const_cast, test=develop
-
Committed by wangchaochaohu
* add profiler_help.h to refine the code test=develop
-
Committed by hutuxian
* users can call dataset.set_download_cmd to set a customized download command
* add UT to cover this scenario
-
- 01 Mar, 2020 1 commit
-
-
Committed by wangchaochaohu
* Add the codegen and auto fusion for sum Op in fusion group
-
- 28 Feb, 2020 2 commits
-
-
Committed by tianshuo78520a
-
Committed by Kaipeng Deng
-
- 27 Feb, 2020 3 commits
-
-
Committed by zhaoyuchen2018
* Refine adam op, test=develop
* Fuse kernels together to reduce cpu time.
* Refine paddle enforce, test=develop
* Remove some comments, test=develop
* Refine code, test=develop
* Refine cuda kernel, test=develop
* Refine code according to comments, test=develop
-
Committed by wangguanzhong
-
Committed by FlyingQianMM
* Correct CPU gradients of the argsort op, form a network to test its forward and backward process, test=develop
* fix dynamic threshold error in test_argsort_op, test=develop
-
- 26 Feb, 2020 2 commits