- 24 Jul, 2019 1 commit
-
-
Committed by Bob Zhu
* Extend matmul op to support multiple-head multiplication. With multiple-head support, the multiplication of two large matrices is split into the multiplication of several (head_number) smaller matrices. For example, if Mat A is [3, 24] and Mat B is [24, 4], multiplying A and B with head_number set to 4 splits Mat A into 4 matrices of [3, 6] and Mat B into 4 matrices of [6, 4]. The result is 4 matrices of [3, 4], i.e. [3, 16].
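The shape arithmetic in this commit message can be illustrated with a minimal NumPy sketch. This is not the operator's actual implementation: the function name `multi_head_matmul`, the split axes, and the column-wise concatenation of the per-head results are assumptions inferred from the shapes quoted in the message.

```python
import numpy as np


def multi_head_matmul(a, b, head_number):
    """Hypothetical helper: multiply a and b head by head.

    Assumes a is split along its columns and b along its rows, as the
    shapes in the commit message suggest ([3, 24] -> 4 x [3, 6] and
    [24, 4] -> 4 x [6, 4]).
    """
    a_heads = np.split(a, head_number, axis=1)  # head_number blocks of [M, K / head_number]
    b_heads = np.split(b, head_number, axis=0)  # head_number blocks of [K / head_number, N]
    # Each per-head product is [M, N]; the blocks are laid side by side.
    products = [x @ y for x, y in zip(a_heads, b_heads)]
    return np.concatenate(products, axis=1)     # [M, head_number * N]


a = np.arange(3 * 24, dtype=np.float64).reshape(3, 24)
b = np.arange(24 * 4, dtype=np.float64).reshape(24, 4)
out = multi_head_matmul(a, b, head_number=4)
print(out.shape)  # (3, 16)
```

Under these assumptions, summing the four [3, 4] blocks of `out` reproduces the ordinary product `a @ b`, which is a quick way to sanity-check the split.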
-
- 28 Jun, 2019 1 commit
-
-
Committed by Zeng Jinle
* add_elementwise_add_inplace_test, test=develop
* rename file, test=develop
-
- 25 Jun, 2019 1 commit
-
-
Committed by Hongyu Liu
* sequence mask supports max length tensor input; test=develop
* add rnn_impl.py; test=develop
* add basic gru lstm unittest; test=develop
* fix api spec; test=develop
* fix sequence_mask op bug; test=develop test=document_preview
* change +-*x to elementwise_op; test=develop
* add mkl flag; test=develop
* fix rnn impl bug; test=develop
* update api spec; test=develop
* fix doc bug; test=develop
* fix lstm bugs; test=develop
-
- 14 Jun, 2019 1 commit
-
-
Committed by Yiqun Liu
test=develop
-
- 12 Jun, 2019 1 commit
-
-
Committed by Yiqun Liu
Optimize the concat and split CUDA implementation for cases when the number of inputs/outputs is less than 5. (#17979) test=develop
-
- 10 Jun, 2019 1 commit
-
-
Committed by Yibing Liu
* Enable seq_pool op to accept len 0 input test=develop
* Update sequence_pool's api test=develop
* Add more unittest cases for seq_pool op test=develop
* Remove legacy comments test=develop
* Don't use template in op maker test=develop
-
- 30 May, 2019 1 commit
-
-
Committed by Yiqun Liu
* Enhance fused_elementwise_activation op. test=develop
* Move the api fused_elementwise_activation to contrib. test=develop
* Add including files. test=develop
* Add the support of sigmoid in fused_elementwise_activation op.
* Update API.spec. test=develop
-
- 29 May, 2019 1 commit
-
-
Committed by Yiqun Liu
Optimize the concat and split kernel for special cases when the number of inputs/outputs is 2 (#17415)
* Optimize the concat and split kernel for special cases when the number of inputs/outputs is 2. test=develop
* Refine code. test=develop
* Correct the condition. test=develop
* Move the definition of tmp_data outside the if statement.
* Print the cudnn minor version. test=develop
* Fix the case when in_num/o_num is 1 in concat/split op. test=develop
* Remove const_cast. test=develop
-
- 24 May, 2019 1 commit
-
-
Committed by tensor-tang
* refine softmax fwd test=develop
* refine cpu softmax bwd test=develop
* fix batch size test=develop
* fix compile issue with gpu test=develop
* add value clip
-
- 23 May, 2019 1 commit
-
-
Committed by tensor-tang
* refine softmax fwd test=develop
* fix compile issue with gpu test=develop
* add value clip to avoid exp
-
- 21 May, 2019 1 commit
-
-
Committed by liuwei1031
http://newicafe.baidu.com:80/issue/PaddleSec-33/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-28/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-25/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-24/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-21/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-20/show?from=page
test=develop
-
- 16 May, 2019 1 commit
-
-
Committed by zhaoyuchen2018
* Improve gru unit performance. Refine code. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
* Add conditional compile for gru opt. Do not enable gru opt if compute capability < 700. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
* Refine code. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
- 15 May, 2019 1 commit
-
-
Committed by Krzysztof Binias
test=develop
-
- 10 May, 2019 1 commit
-
-
Committed by zhaoyuchen2018
Refine code: fuse cublas calls and kernels into one CUDA kernel. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
- 07 May, 2019 1 commit
-
-
Committed by Kaipeng Deng
* add attr axis infershape. test=develop
* add CUDA kernel. test=develop
* fix unittest. test=develop
* fix unittest for soft_label. test=develop
* fix fp16 unittest. test=develop
* remove commented-out code. test=develop
* refine test for axis. test=develop
* add python api. test=develop
* fix doc. test=develop
* fix fp16 unittest. test=develop
* fix ngraph test. test=develop
* fix ENFORCE for test_imperative_transformer. test=develop
* fit for ngraph test. test=develop
* fix after rebase develop. test=develop
* fix doc. test=develop
* fix API.spec. test=develop
* fix test_layers. test=develop
* fix format. test=develop
-
- 20 Apr, 2019 1 commit
-
-
Committed by Yibing Liu
* Support seq len equal to 0 in sequence ops test=develop
* Add more test cases
* Fix some comments test=develop
* Fix py3 error test=develop
-
- 17 Apr, 2019 1 commit
-
-
Committed by Kevin
* fix overflow by int32 mul test=develop
* fix reference nullptr
* fix codestyle test=develop
* modify to point in ContextProjectFunctor test=develop
* modify to point in ContextProjectFunctor test=develop
* modify . to -> test=develop
-
- 12 Apr, 2019 3 commits
-
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
Committed by Qiao Longfei
-
- 25 Mar, 2019 1 commit
-
-
Committed by dengkaipeng
-
- 20 Mar, 2019 2 commits
-
-
Committed by phlrain
-
Committed by dengkaipeng
-
- 18 Mar, 2019 2 commits
-
-
Committed by dengkaipeng
-
Committed by phlrain
-
- 14 Mar, 2019 2 commits
-
-
Committed by sneaxiy
test=develop
-
Committed by Zeng Jinle
test=develop
-
- 12 Mar, 2019 1 commit
-
-
Committed by sneaxiy
test=develop
-
- 08 Mar, 2019 3 commits
-
-
Committed by tensor-tang
test=develop
-
Committed by Yiqun Liu
Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106) test=develop
-
Committed by Yiqun Liu
Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106) test=develop
-
- 07 Mar, 2019 1 commit
-
-
Committed by tensor-tang
test=develop
-
- 04 Mar, 2019 3 commits
-
-
Committed by Yiqun Liu
test=develop
-
Committed by Yihua Xu
test=develop
-
Committed by Qiao Longfei
-
- 28 Feb, 2019 1 commit
-
-
Committed by Yiqun Liu
test=develop
-
- 26 Feb, 2019 1 commit
-
-
Committed by Yihua Xu
test=develop
-
- 22 Feb, 2019 2 commits
-
-
Committed by tensor-tang
* Revert "Optimze Gelu with MKL Erf function (#15770)" This reverts commit 676995c8.
* test=develop
-
Committed by Yihua Xu
* Optimize for gelu operator
* Set up the low accuracy mode of MKL ERF function. test=develop
* Only enable MKLML ERF when OS is linux
* Use the special mklml version that includes the vmsErf function to verify gelu mkl kernel. test=develop
* Add the CUDA macro to avoid NVCC's compile issue. test=develop
* Add the TODO comments for mklml library modification. test=develop
* Clean Code test=develop
* Add the comment of macro for NVCC compiler. test=develop
-
- 19 Feb, 2019 1 commit
-
-
Committed by xuezhong
test=develop
-