- 10 Feb, 2020 3 commits
-
-
Committed by cc
* post_training_quantization supports setting bits, test=develop
* up, test=develop
-
Committed by Wilber
-
Committed by Huihuang Zheng
This PR provides a very basic and simple framework for transforming Dygraph to Static Graph. API names and final outputs are not determined yet. Feel free to modify or add classes, functions, or types if you find the framework is not extensible enough.
-
- 07 Feb, 2020 3 commits
-
-
Committed by cc
* support weight quantization in post_training_quantization, test=develop
* add test for weight quantization, test=develop
-
Committed by Yiqun Liu
* Add the first implementation of fusion_group op #19621 (#3)
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop
* Disable for mac and windows. test=develop
* Refine the code to support manually specified num_threads and workload_per_thread. test=develop
* Refine the CUDA kernel to support large dims. test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation of fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test. test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op. test=develop
* Disable fusion_group op for mac and windows. test=develop
* Make the compiling of device code return status instead of hanging up. test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names. test=develop
* Add the check of CUDA driver library in unittest. test=develop
* Enable generating code for a given subgraph. #21126 (#4)
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearrangement of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions. test=develop
* Improve the loading statements in generated codes. test=develop
* Remove unused arguments from formal list. test=develop
* Enable the detection of subgraph of grad ops.
* Generate code for detected subgraph in fusion_group_pass.
* Add an option in BuildStrategy to enable fusion_group_pass and add unittest. test=develop
* Fix a bug when checking whether the shapes of all inputs are the same.
* Add debug information.
* Move subgraph_detector from inference/analysis to the common framework/ir directory. (#5) test=develop
* Call subgraph_detector in fusion_group pass. test=develop
* Disable fusion_group when WITH_GPU is OFF. test=develop
* Refine all PADDLE_ENFORCE messages. test=develop
* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op. test=develop
* Follow review comments. test=develop
-
Committed by Aurelius84
* polish backward api doc test=develop, test=document_preview, test=document_fix
* polish backward api doc test=develop, test=document_preview, test=document_fix
* no_grad supports set of Variable test=develop, test=document_preview
* polish sample code of append_backward test=develop, test=document_preview
* change assert to raise TypeError test=develop, test=document_preview
* fix failed unittest test=develop
* rm useless file test=develop
* polish en doc test=develop
* polish code of no_grad_set test=develop
* polish code of no_grad_set test=develop
-
- 06 Feb, 2020 2 commits
-
-
Committed by Tao Luo
-
Committed by Aurelius84
* add skip_check_grad_ci of var_conv_2d test=develop
* modify check_shape_white_list test=develop
-
- 05 Feb, 2020 3 commits
-
-
Committed by Wilber
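Added WITH_NCCL to the cmake options to explicitly specify whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF if WITH_GPU is OFF. Added the PADDLE_WITH_NCCL definition. A single-machine, single-GPU build can disable NCCL compilation; multi-GPU use needs NCCL enabled (the default), and if NCCL is disabled only a single GPU can be used. Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>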
-
Committed by Bai Yifan
-
Committed by xujiaqi01
* add hdfs ls retry time and sleep time, fix save inference
* test=develop
-
- 04 Feb, 2020 2 commits
-
-
Committed by Leo Chen
* add int16 support, test=develop
* add test, test=develop
* fix typo, test=develop
* fix dtype error in slice, test=develop
-
Committed by juncaipeng
* fix chain doc, test=develop, test=document_preview
-
- 03 Feb, 2020 1 commit
-
-
Committed by tangwei12
* fix bug with half communicator
-
- 02 Feb, 2020 2 commits
-
-
Committed by liu zhengxi
* update the ut precision of pad, pad2d and pad_constant_like from fp32 to fp64, test=develop
-
Committed by xujiaqi01
* add GeneralRoleMaker which is for general usage
* test=develop
-
- 25 Jan, 2020 2 commits
-
-
Committed by lidanqing
-
Committed by joanna.wozna.intel
-
- 23 Jan, 2020 1 commit
-
-
Committed by Wojciech Uss
-
- 22 Jan, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 21 Jan, 2020 7 commits
-
-
Committed by lilong12
-
Committed by 石晓伟
* add no_grad_set value check in op_test, test=develop
* update ops list, test=develop
-
Committed by Chengmo
* test=develop, fix geo Send & Init
-
Committed by songyouwei
test=develop
-
Committed by gongweibao
-
Committed by juncaipeng
* remove skip_check in test_activation_mkldnn_op, test=develop
-
Committed by whs
-
- 20 Jan, 2020 1 commit
-
-
Committed by Zeng Jinle
* polish backward prune, test=develop
* fix control flow op bug, test=develop
* add some unittests, test=develop
* fix unittest args, test=develop
* follow huihuang's comments, test=develop
-
- 19 Jan, 2020 3 commits
-
-
Committed by Zeng Jinle
* fix dataloader early reset bug, test=develop
* change implementation, test=develop
* fix ut, test=develop
-
Committed by zhupengyang
-
Committed by juncaipeng
-
- 18 Jan, 2020 1 commit
-
-
Committed by Chengmo
fix timeout of test_dist_fleet_geo
-
- 17 Jan, 2020 5 commits
-
-
Committed by Yiqun Liu
* Implement a common python unittest to test the ir passes. test=develop
* Save the results in np.array and support starting up on CPU. test=develop
* Fix the unittest. test=develop
* Add check_program to check whether the optimized program is different from the original one. test=develop
* Remove the interface all_ops. test=develop
* Add exception test in pass_test. test=develop
-
Committed by songyouwei
* fix circular dependency
* try import layers.nn from dygraph test=develop
-
Committed by songyouwei
* allow sublayer or param shadow attrs
* add unittest test=develop
* change remove fn name test=develop
-
Committed by zhupengyang
-
Committed by tangwei12
* add half_async in the communicator
* fix DistributedStrategy
-
- 16 Jan, 2020 3 commits
-
-
Committed by hong
* add learning rate api; test=develop
* fix unit test coverage; test=develop
* fix travis ci error; test=develop
* fix comment; test=develop
* fix example error; test=develop
* polish the api description, test=develop
Co-authored-by: zhongpu <2013000149@qq.com>
-
Committed by Li Fuchen
* Fixed warpctc, test=develop
* Set lod level of sequence_unpad's output to 1 at compile time test=develop
* fix the en doc and example code of warpctc, test=develop, test=document_fix
-
Committed by zhangchunle
-