- 17 Sep, 2019 2 commits
-
-
Committed by 翟飞跃
* Implement the operator with sparse matrix multiply
* Update the URL of mklml library. test=develop
* Disable MKLML implementation when using non-Linux. test=develop
* optimize bp with mkl sparse matrix test=develop
* tmp add fused_emb_seq layer
* Add the support of padding_idx attribute. test=develop
* add padding_idx support test=develop
* implement grad refer lego test=develop
-
Committed by chengduo
* fix example error test=develop
* Remove set_desc test=develop
-
- 16 Sep, 2019 11 commits
-
-
Committed by chengduo
* fix warning info test=develop
* fix bug of all_reduce_deps_pass test=develop
-
Committed by ruri
* add unit test for square error cost op
-
Committed by Zeng Jinle
-
Committed by Yiqun Liu
* Refine the codes related to fc op.
* Add GPU implementation for fc functor.
* Apply fc_fuse_pass in GPU inference. test=develop
* Change the cmake for fc op.
* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
* Add an attribute to set the activation type in fc_op.
* Enhance the unittest of fc_op. test=develop
* Remove the declaration of FCOpGrad back to the header file. test=develop
* Set default value for newly added arguments in test_fc_op. test=develop
* Enhance fc_fuse_pass to enable fusing relu.
* Allow printing the shapes of var_desc in graph. test=develop
* Enhance fc_fuse_pass_tester.
* Remove the use of PADDLE_ENFORCE. test=develop
* Correct the number of ops after fusing. test=develop
* Fix a typo. test=develop
* Set activation_type to null when there is no relu in fc. test=develop
* Refine fc_fuse_pass's codes.
* Enable the set of shape for tensor.
* Refine repeated_fc_relu_pass and add unittest. test=develop
-
Committed by zhouwei25
-
Committed by Zeng Jinle
-
Committed by zhongpu
* add kernel for squeeze_op, test=develop
* delete comment, test=develop
-
Committed by zhongpu
* add kernel for unstack_op, test=develop
* add kernel for unstack_op, test=develop
* add kernel for unstack_op, test=develop
* adjust the code format, test=develop
* modify some comment, test=develop
-
Committed by Chen Weihang
-
Committed by Kaipeng Deng
-
Committed by tangwei12
fix wrong place with distributed_lookup_table
-
- 14 Sep, 2019 3 commits
-
-
Committed by tianshuo78520a
* change approve site; test=develop
* test=develop
-
Committed by Adam
test=develop
-
Committed by Yihua Xu
test=develop
-
- 13 Sep, 2019 1 commit
-
-
Committed by chengduo
* Open fuse all reduce op test=develop
* Add Fuse optimization op log
* Add log in fuse_optimizer op pass and fuse all_reduce op pass
* replace with boost::optional<bool> test=develop
* Polish code test=develop
* fix code coverage test=develop
-
- 12 Sep, 2019 3 commits
-
-
Committed by Aurelius84
* add one_hot_v2_op to remove last_dims==1 test=develop
* add api unittest code for CI_Coverage test=develop
* improve CI_Coverage rate by adding test_with_depth test=develop
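
To illustrate what dropping the last_dims==1 requirement could look like, here is a minimal pure-Python sketch. It is not the Paddle kernel: the helpers `one_hot` and `shape` are hypothetical, and the sketch assumes that an index input of shape [N] produces an output of shape [N, depth] without needing a trailing size-1 axis.

```python
def one_hot(indices, depth):
    """Append a one-hot axis of length `depth` to a nested list of ints.

    Hypothetical helper for illustration only; not a Paddle API.
    """
    if isinstance(indices, list):
        return [one_hot(item, depth) for item in indices]
    if not 0 <= indices < depth:
        raise ValueError(f"index {indices} is outside [0, {depth})")
    return [1.0 if k == indices else 0.0 for k in range(depth)]


def shape(nested):
    """Return the dimensions of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return dims


# Indices are passed as shape [3], with no trailing axis of size 1.
ids = [1, 0, 2]
encoded = one_hot(ids, depth=4)
```

In this sketch `shape(ids)` is `[3]` and `shape(encoded)` is `[3, 4]`; the caller never has to reshape the indices to `[3, 1]` first.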
-
Committed by JesseyXujin
-
Committed by Jacek Czaja
test=develop - fix to BWD test=develop
-
- 11 Sep, 2019 14 commits
-
-
Committed by Huihuang Zheng
TemporaryAllocator is a singleton used to allocate memory for cuDNN. Removing the singleton improves memory performance. This change replaces TemporaryAllocator with CUDADeviceContextAllocator and CUDADeviceContextAllocation, which use a stream callback to free the memory allocated for a stream, so no singleton is needed. Also adds data_feed_proto to operator to fix CI in CPU compilation.
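
The stream-callback idea described above can be sketched in plain Python. This is a simulation under stated assumptions, not Paddle's implementation: `FakeStream` and `StreamScopedAllocator` are hypothetical names, and the "stream" is modeled as a simple FIFO queue in which a release callback runs only after all work queued before it.

```python
from collections import deque


class FakeStream:
    """Hypothetical stand-in for a CUDA stream: tasks run in FIFO order."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, fn):
        self._queue.append(fn)

    def synchronize(self):
        # Run everything queued so far, oldest first.
        while self._queue:
            self._queue.popleft()()


class StreamScopedAllocator:
    """Hypothetical allocator owned by one stream context, not a global singleton."""

    def __init__(self, stream):
        self.stream = stream
        self.live = set()
        self._next_id = 0

    def allocate(self, nbytes):
        handle = (self._next_id, nbytes)
        self._next_id += 1
        self.live.add(handle)
        return handle

    def release_after_stream(self, handle):
        # Defer the free until every task queued before this call has run.
        self.stream.enqueue(lambda: self.live.discard(handle))


stream = FakeStream()
alloc = StreamScopedAllocator(stream)
log = []

buf = alloc.allocate(1024)
stream.enqueue(lambda: log.append(("kernel uses", buf)))
alloc.release_after_stream(buf)

still_live_before_sync = buf in alloc.live  # the free has not run yet
stream.synchronize()
```

Because the release is queued behind the work that uses the buffer, the memory stays valid until that work has finished, and each allocator instance is tied to its own stream rather than shared process-wide.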
-
Committed by Zeng Jinle
* make leaky relu inplacable, test=develop
* force add unittests to pass coverage, test=develop
-
Committed by chengduo
Fix test_parallel_executor_test_while_train
-
Committed by Zeng Jinle
-
Committed by chengduo
* fix vlog level and fuse option type test=develop
-
Committed by Jacek Czaja
test=develop - Cosmetic fixes test=develop
-
Committed by Yiqun Liu
* Refine the codes related to fc op.
* Add GPU implementation for fc functor.
* Apply fc_fuse_pass in GPU inference. test=develop
* Change the cmake for fc op.
* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
* Add an attribute to set the activation type in fc_op.
* Enhance the unittest of fc_op. test=develop
* Remove the declaration of FCOpGrad back to the header file. test=develop
* Set default value for newly added arguments in test_fc_op. test=develop
-
Committed by Aurelius84
* Remove constraint that last dimension is forced to be 1 in huber_loss test=develop
* add y[rank-1] == 1 when x_rank=y_rank test=develop
* modify into contain_unknown_dim test=develop
-
Committed by chengduo
* Enable fused_all_reduce_op_handle support GPU and CPU Gradients
-
Committed by Youwei Song
* update dygraph api-doc and backward api-doc, test=develop
* update dygraph api-doc and backward api-doc, update api.spec, test=develop
* update dygraph api-doc and backward api-doc, update api.spec, test=develop
* update API.spec, test=develop
-
Committed by Thunderbrook
-
Committed by Youwei Song
* fix dygraph partial backward problem, test=develop
* add unittest, fix ClearGradient. test=develop
* add filter and error in python side, test=develop
* rebase develop, test=develop
* bug fix for list equals in py3.5, test=develop
* bug fix for list equals, test=develop
-
Committed by Tao Luo
remove unused accuracy-diff warpctc-cudnn implementation test=develop
-
Committed by Bai Yifan
* split teacher checkpoints with student checkpoints, test=develop
* add unittest for graph.merge(), test=develop
-
- 10 Sep, 2019 6 commits
-
-
Committed by Zeng Jinle
-
Committed by Adam
* MKLDNN handler cleanup
* MKLDNN handler cleanup test=develop
-
Committed by chengduo
test=develop
-
Committed by XiaoguangHu
Add document annotations for FLAGS that need to be open to external developers test=develop (#19692) Add document annotations for FLAGS that need to be open to external developers
-
Committed by Zeng Jinle
-
Committed by baojun
-