- 24 Jul, 2019 2 commits
-
-
Committed by Bob Zhu
* extend matmul op to support multiple head multiplication. With multi-head support, the multiplication of two big matrices is split into the multiplication of several (head_number) small matrices. E.g. if Mat A is [3, 24] and Mat B is [24, 4], when multiplying A and B with head_number as 4, Mat A will be split into 4 matrices of [3, 6] and Mat B into 4 matrices of [6, 4]. The result is 4 matrices of [3, 4], i.e. a final matrix of [3, 16].
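The split described in this commit message can be illustrated with a minimal NumPy sketch. This is not the operator's actual implementation; the function name `multihead_matmul` is hypothetical, and the assumption that the per-head results are concatenated along the column axis is inferred from the [3, 16] output shape in the message.

```python
import numpy as np

def multihead_matmul(a, b, head_number):
    """Hypothetical reference: split A by columns and B by rows into
    head_number pieces, multiply piece by piece, and concatenate the
    per-head results along the column axis."""
    k = a.shape[1]
    assert k == b.shape[0], "inner dimensions must match"
    assert k % head_number == 0, "inner dimension must divide evenly"
    a_heads = np.split(a, head_number, axis=1)  # e.g. 4 x [3, 6]
    b_heads = np.split(b, head_number, axis=0)  # e.g. 4 x [6, 4]
    return np.concatenate([x @ y for x, y in zip(a_heads, b_heads)], axis=1)

a = np.arange(3 * 24, dtype=np.float64).reshape(3, 24)
b = np.arange(24 * 4, dtype=np.float64).reshape(24, 4)
out = multihead_matmul(a, b, head_number=4)
print(out.shape)
```

With the shapes from the commit message, `out` has shape (3, 16), and its first four columns equal `a[:, :6] @ b[:6, :]`.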
-
Committed by whs
* Make lod reset op support appending a lod level. * Fix API.spec test=develop * Fix unittest. test=develop * Add python api for lod append. test=develop * Fix API.spec test=develop * Fix format of doc. test=develop * Fix unittest. test=develop * Fix doc. test=develop
-
- 23 Jul, 2019 4 commits
-
-
Committed by Jacek Czaja
test=develop - compilation fix - Yet another compilation fix - Even yet another compilation fix - Surprise! Again compilation fix - lint fixes test=develop - Fix to workspace acquire of LRN test=develop - Fix to hash of BWD LRN test=develop - fix to lrn BWD PD acquire test=develop - Fixing LRN PD creation test=develop - cosmetic fix in comment test=develop - Fixes after review test=develop
-
Committed by chengduo
* support sparse gradients test=develop
-
Committed by wangchaochaohu
* rewrite the conv_op using cudnn_conv_helper * add workspace limit for v7 test=develop * fix test=develop * add half float test=develop * fix test=develop * fix test=develop * revise code style test=develop * fix test=develop
-
Committed by Yi Liu
* support distributed classification training * update API.spec * fix even division in python3 * change "index_range" to "index_num" in shard_index operator test=document_preview test=develop
-
- 22 Jul, 2019 4 commits
-
-
Committed by qingqing01
-
Committed by Tao Luo
test=develop
-
Committed by whs
test=develop
-
Committed by Bai Yifan
-
- 20 Jul, 2019 2 commits
-
-
Committed by cjt222
add license
-
Committed by wangguanzhong
* fix clip_by_norm doc, test=develop
-
- 19 Jul, 2019 2 commits
-
-
Committed by Huihuang Zheng
Test PaddingRNN on V100 GPU device. Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU. GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR). Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR).
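The relative gains implied by these figures can be worked out with a short calculation. The numbers are taken directly from the commit message above; the variable names are illustrative only.

```python
# Figures reported in the commit message (V100, large model, padding mode).
mem_with_pr, mem_without_pr = 6414, 6837        # GPU memory in MiB
speed_with_pr, speed_without_pr = 10.28, 9.89   # steps per second

# Relative memory saving and speedup, as percentages of the baseline.
mem_saving_pct = (mem_without_pr - mem_with_pr) / mem_without_pr * 100
speedup_pct = (speed_with_pr - speed_without_pr) / speed_without_pr * 100

print(f"memory saved: {mem_with_pr - mem_without_pr:+d} MiB ({mem_saving_pct:.1f}% lower)")
print(f"speed gain: {speedup_pct:.1f}% faster")
```

This works out to roughly 6% less GPU memory and about 4% more steps per second for this single configuration.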
-
Committed by Adam
test=develop
-
- 18 Jul, 2019 2 commits
-
-
Committed by hutuxian
* hash_op support int64 hash_size * add corresponding UT
-
Committed by guru4elephant
* remove ctr reader; all of its functionality is covered by dataset
-
- 17 Jul, 2019 3 commits
-
-
Committed by Yang Zhang
* Add GPU implementation for `prelu` backward pass test=develop * Fix logic error in `prelu` GPU backward and simplify a bit test=develop * Fix `prelu` backward CUDA implementation test=develop. The CPU version was not actually used, so the test passed
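For reference, the textbook `prelu` backward pass can be sketched in NumPy as below. This is not the CUDA kernel from the commit; the function name `prelu_backward` is hypothetical, and the sketch assumes a single scalar `alpha` shared by all elements.

```python
import numpy as np

def prelu_backward(x, alpha, dout):
    """Hypothetical reference for prelu(x) = x if x > 0 else alpha * x,
    assuming one scalar alpha shared across all elements."""
    # Gradient w.r.t. the input: pass through where x > 0, scale by alpha elsewhere.
    dx = np.where(x > 0, dout, alpha * dout)
    # Gradient w.r.t. alpha: only non-positive inputs contribute x * dout.
    dalpha = np.sum(np.where(x > 0, 0.0, x * dout))
    return dx, dalpha

x = np.array([-2.0, 3.0])
dout = np.array([1.0, 1.0])
dx, dalpha = prelu_backward(x, alpha=0.25, dout=dout)
print(dx, dalpha)
```

A CPU reference of this kind is the sort of check that a GPU backward implementation is usually compared against in a unit test.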
-
Committed by Yihua Xu
-
Committed by baojun
-
- 16 Jul, 2019 2 commits
-
-
Committed by Jacek Czaja
* - Added partial draft of pooling acquire - Workspace support - compilation fix - Added draft of pooling backward reimplementation - Segfault fix - reverted 'any' for diff_dst creation in pooling - Lint fixes test=develop - lint fixes test=develop - Further lint fixes test=develop * - Fixes after review test=develop * - Lint fixes test=develop * - Even more lint fixes test=develop
-
Committed by chengduo
test=develop
-
- 15 Jul, 2019 1 commit
-
-
Committed by guru4elephant
* make auc op compatible with 1 dim
-
- 11 Jul, 2019 2 commits
-
-
Committed by Hongyu Liu
-
Committed by Zeng Jinle
* feature/buffer_shared_inplace, test=develop * refine code, test=develop * fix elementwise_add op cpu inplace and sum inplace bug, test=develop * add unittest and debug log, test=develop * fix parallel_executor scope bug, polish code, test=develop * fix sum op, activation op, single_in_place_inference bug, test=develop * remove kLocalExecScopeName, test=develop * fix unittest,test=develop * fix out_var first version bug, test=develop * follow comments,test=develop
-
- 10 Jul, 2019 4 commits
-
-
Committed by Zeng Jinle
* clean code of dim and place, test=develop * fix failed unittests, test=develop
-
Committed by Jacek Czaja
-
Committed by Yibing Liu
-
Committed by Physher
-
- 09 Jul, 2019 3 commits
-
-
Committed by Jiabin Yang
* test=develop, fix docker with paddle nccl problem * test=develop, fix/gcc_4.8_ubt_link_error * test=develop, fix code format
-
Committed by Physher
-
Committed by LielinJiang
* fix transform matrix bug, test=develop * modify API.spec
-
- 08 Jul, 2019 1 commit
-
-
Committed by Zhaolong Xing
* Fix Mask rcnn predictor 1. refine memory optim algorithm to support the model with the block op. 2. output diff : modify the affine channel fuse 3. add condition_block_infer op add interface for setting trt calib table dir test=develop * add the missing files. test=develop
-
- 05 Jul, 2019 1 commit
-
-
Committed by zhaoyuchen2018
* Fix topk cannot handle 1D vector bug Add path to handle 1D vector test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com> * refine code test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
- 04 Jul, 2019 2 commits
-
-
Committed by qingqing01
* Refine Infershape in activation_op for double_grad.
-
Committed by chengduo
-
- 03 Jul, 2019 5 commits
-
-
Committed by zhoukunsheng
-
Committed by zhoukunsheng
* test=develop support Tensor input for chunk_eval op * test=develop fix testcase for chunk_eval op * test=develop fix typos in nn.py
-
Committed by zhoukunsheng
-
Committed by zhoukunsheng
-
Committed by zhoukunsheng
-