- 11 Feb, 2020 3 commits
-
-
Committed by zhaoyuchen2018
* Refine code, fix select tile error, test=develop
* Refine element type and some comments, test=develop
* Refine comments and gpu utils, test=develop
* Remove some useless conditions
* Refine floor and ceil, test=develop
* Refine for loop, test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
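Committed by Wilber
Support compiling without depending on NCCL. [1/2] In a multi-card setup, if the build was compiled without the WITH_NCCL switch turned on, the cards cannot communicate with each other, so only a single card can be selected for use.
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>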
-
Committed by guofei
This PR makes the assign op support LoDTensorArray and enables loop_vars in while_loop to be a tuple or list.
-
- 10 Feb, 2020 9 commits
-
-
Committed by Zhaolong Xing
[Refine Paddle-TRT INT8]: Support PaddleSlim's Resnet50, Mobilenetv1, Yolov3 models for Inference. (#22483)
* add int8 op teller for trt.
* refine trt int8
* add int8 op teller for trt. test=develop
-
Committed by liu zhengxi
* add InferencePassTest for testing precision of inference passes, test=develop
-
Committed by zhongpu
add cp27-cp27m-gcc82 and cp27-cp27mu-gcc82 branches to support compiling paddle with gcc8.2, test=develop (#22504)
-
Committed by Guo Sheng
-
Committed by cc
* post_training_quantization supports setting bits, test=develop
* up, test=develop
-
Committed by Wilber
Compile without nccl deps. [1/2]
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
Committed by Yiqun Liu
test=develop
-
Committed by Wilber
-
Committed by Huihuang Zheng
This PR provides a very basic and simple framework for transforming Dygraph to Static Graph. API names and final outputs are not determined yet. Feel free to modify or add classes, functions, or types if you find the framework is not extensible enough for your needs.
-
- 07 Feb, 2020 6 commits
-
-
Committed by Zhong Hui
Fix the integer overflow problem in the sequence2batch op by changing int32_t to size_t at /paddle/fluid/operators/math/sequence2batch.h#L122.
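The kind of overflow this commit message describes can be illustrated with a minimal sketch. The helpers `FlatOffset` and `FitsInInt32` below are hypothetical names made up for illustration and are not taken from sequence2batch.h; the sketch also assumes a platform where `std::size_t` is 64 bits wide.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from Paddle): flat offset of the first element
// of `row` in a row-major buffer with `width` elements per row.
// Doing the multiplication in std::size_t (assumed 64-bit here) keeps the
// result exact for products that exceed the int32_t range.
inline std::size_t FlatOffset(std::size_t row, std::size_t width) {
  return row * width;
}

// Hypothetical helper: reports whether a value could have been stored in
// an int32_t without overflowing.
inline bool FitsInInt32(std::size_t value) {
  return value <=
         static_cast<std::size_t>(std::numeric_limits<std::int32_t>::max());
}
```

With 70000 rows of 70000 elements, `FlatOffset(70000, 70000)` is 4900000000, which `FitsInInt32` rejects; an index of that size is what an `int32_t` variable cannot represent.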
-
Committed by cc
* support weight quantization in post_training_quantization, test=develop
* add test for weight quantization, test=develop
-
Committed by Yiqun Liu
* Add the first implementation of fusion_group op #19621 (#3)
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop
* Disable for mac and windows. test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread. test=develop
* Refine the CUDA kernel to support large dims. test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test. test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op. test=develop
* Disable fusion_group op for mac and windows. test=develop
* Make the compiling of device code return status instead of hanging up. test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names. test=develop
* Add the check of CUDA driver library in unittest. test=develop
* Enable generating code for a given subgraph. #21126 (#4)
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearrangement of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions. test=develop
* Improve the loading statements in generated codes. test=develop
* Remove unused arguments from formal list. test=develop
* Enable the detection of subgraph of grad ops.
* Generate code for detected subgraph in fusion_group_pass.
* Add an option in BuildStrategy to enable fusion_group_pass and add unittest. test=develop
* Fix a bug when checking whether the shapes of all inputs are the same.
* Add debug information.
* Move subgraph_detector from inference/analysis to the common framework/ir directory. (#5) test=develop
* Call subgraph_detector in fusion_group pass. test=develop
* Disable fusion_group when WITH_GPU is OFF. test=develop
* Refine all PADDLE_ENFORCE messages. test=develop
* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op. test=develop
* Follow review comments. test=develop
-
Committed by Aurelius84
* polish backward api doc test=develop, test=document_preview, test=document_fix
* polish backward api doc test=develop, test=document_preview, test=document_fix
* no_grad supports set of Variable test=develop, test=document_preview
* polish sample code of append_backward test=develop, test=document_preview
* change assert to raise TypeError test=develop, test=document_preview
* fix failed unittest test=develop
* rm useless file test=develop
* polish en doc test=develop
* polish code of no_grad_set test=develop
* polish code of no_grad_set test=develop
-
Committed by Tao Luo
test=develop
-
Committed by LielinJiang
* optimize interpolate op, test=develop
-
- 06 Feb, 2020 7 commits
-
-
Committed by wangchaochaohu
-
Committed by Yiqun Liu
Correct the use of DeviceContext in unittest sequence_pooling_test and sequence_padding_test (#22456)
* Add log in memory::Copy for debug purpose.
* Change to use the context in DeviceContextPool directly in sequence_pooling_test, instead of creating a new one.
* Change to use the context in DeviceContextPool directly in sequence_padding_test, instead of creating a new one. test=develop
* Change the type of second_dim from size_t to int64_t. test=develop
-
Committed by flame
* R-language inference support
-
Committed by Tao Luo
-
Committed by joanna.wozna.intel
* Add dequant scale squash test=develop
* Correct dequant-scale squash test test=develop
-
Committed by mapingshuo
* update readme
* test=develop
-
Committed by Aurelius84
* add skip_check_grad_ci of var_conv_2d test=develop
* modify check_shape_white_list test=develop
-
- 05 Feb, 2020 7 commits
-
-
Committed by Zhaolong Xing
* add mutex for trt engine test=develop
* add the test for copy_to_cpu test=develop
-
Committed by XiaoguangHu
Fix documentation errors
-
Committed by Daniel Yang
* these are the revised readme/readme_cn pull request, test=develop
* this is a readme pull request, test=develop
* these are the revised readme/readme_cn pull request, test=develop
-
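Committed by Wilber
Added WITH_NCCL to the cmake options to explicitly specify whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF if WITH_GPU is OFF.
Added the PADDLE_WITH_NCCL definition.
A single-machine, single-card build can disable NCCL compilation. Multi-card use needs NCCL enabled (the default); if NCCL is disabled, only a single card can be used.
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>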
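The single-card fallback described in this commit message can be sketched roughly as follows. Only the PADDLE_WITH_NCCL macro name comes from the commit message; the function `UsableDeviceCount` is a hypothetical helper invented for illustration and is not Paddle's actual API.

```cpp
#include <cstddef>

// Hypothetical helper (not Paddle's API): how many of the visible cards
// a process may use, depending on whether NCCL support was compiled in.
inline std::size_t UsableDeviceCount(std::size_t visible_devices) {
#ifdef PADDLE_WITH_NCCL
  // NCCL is available, so all visible cards can communicate.
  return visible_devices;
#else
  // Without NCCL the cards cannot communicate: fall back to one card.
  return visible_devices > 1 ? 1 : visible_devices;
#endif
}
```

Compiled without defining PADDLE_WITH_NCCL, `UsableDeviceCount(4)` evaluates to 1; compiled with the macro defined, it evaluates to 4.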
-
Committed by Bai Yifan
-
Committed by Tao Luo
* Sigmoid bug fix, test=develop
* fix code format test=develop
Co-authored-by: Manjunath Bhat <manjunathbhat9920@gmail.com>
-
Committed by xujiaqi01
* add hdfs ls retry time and sleep time, fix save inference
* test=develop
-
- 04 Feb, 2020 4 commits
-
-
Committed by xujiaqi01
* fix copy table bug that lost some feasigns
* test=develop
-
Committed by Leo Chen
* add int16 support, test=develop
* add test, test=develop
* fix typo, test=develop
* fix dtype error in slice, test=develop
-
Committed by 石晓伟
-
Committed by juncaipeng
* fix chain doc, test=develop, test=document_preview
-
- 03 Feb, 2020 1 commit
-
-
Committed by tangwei12
* fix bug with half communicator
-
- 02 Feb, 2020 2 commits
-
-
Committed by liu zhengxi
* update the ut precision of pad, pad2d, and pad_constant_like from fp32 to fp64, test=develop
-
Committed by xujiaqi01
* add GeneralRoleMaker which is for general usage
* test=develop
-
- 31 Jan, 2020 1 commit
-
-
Committed by Michał Gallus
* Enable quantize to reorder to nchw as well
* Correct FC MKL-DNN input dim requirements to accept 3D
* Improve DNNL FC format, error and 3D input handling test=develop
* Improve error checking in FC test=develop
* Improve PADDLE_ENFORCE messages in fc-related files
* Remove data layout attribute from obligatory pass args test=develop
* Fix message in fc_mkldnn_pass to be logically correct test=develop
-