Commits · 49523ea1898ddaea4e166ae5a603fec577e165c9 · BaiXuePrincess / Paddle

03 Sep, 2019 1 commit

replace PADDLE_ASSERT with PADDLE_ASSERT_MSG (#19586) · 49523ea1

Committed by Tao Luo on Sep 03, 2019

* remove unused PADDLE_ASSERT(_IS_NOT_ERROR)

* replace PADDLE_ASSERT with PADDLE_ASSERT_MSG

test=develop

49523ea1

02 Sep, 2019 1 commit
- fix the compilation issue on Windows caused by mkl_CSRMM (#19533) · 84c72801
  Committed by zhouwei25 on Sep 02, 2019
  
  84c72801
29 Aug, 2019 1 commit
- fix softmax seg fault in AVX, test=develop (#19487) · 11f2f784
  Committed by Zeng Jinle on Aug 29, 2019
  
  11f2f784
20 Aug, 2019 1 commit

Use sparse matrix to implement fused emb_seq_pool operator (#19064) · b9203958

Committed by Yihua Xu on Aug 20, 2019

* Implement the operator with sparse matrix multiply

* Update the URL of mklml library.

test=develop

* Disable MKLML implementation on non-Linux platforms.

test=develop

* Ignore the deprecated status for windows

test=develop

b9203958

19 Aug, 2019 1 commit
- change PADDLE_ENFORCE to PADDLE_ENFORCE_CUDA_SUCCESS (#19205) · af0fbd90
  Committed by silingtong123 on Aug 19, 2019
```
* print error code if cuda related API fails
```
  af0fbd90
01 Aug, 2019 1 commit
- Fix depthwise conv gpu kernel bug (#18582) · 22fa4c2d
  Committed by LielinJiang on Aug 01, 2019
```
* fix depthwise conv gpu kernel bug, test=develop
* add more depthwise conv test, test=develop
```
  22fa4c2d
24 Jul, 2019 1 commit

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Committed by Bob Zhu on Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multiple-head support, the multiplication of two big matrices is
split into the multiplication of several (head_number) small matrices. E.g., if
Mat A is [3, 24] and Mat B is [24, 4], when multiplying A and B with head_number
set to 4, Mat A will be split into 4 matrices of [3, 6] and Mat B into 4 matrices
of [6, 4]. The final result will be 4 matrices of [3, 4], i.e. [3, 16].
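
The splitting described in this message can be sketched in plain Python. This is only an illustration of the shapes involved, not the operator's actual kernel; the helper names `matmul` and `multi_head_matmul` are made up for this example.

```python
def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def multi_head_matmul(a, b, head_number):
    """Split A by columns and B by rows into head_number blocks, multiply
    the matching blocks, and concatenate the products along columns."""
    k = len(b)
    assert len(a[0]) == k and k % head_number == 0
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        a_h = [row[h * step:(h + 1) * step] for row in a]  # [M, K / head_number]
        b_h = b[h * step:(h + 1) * step]                    # [K / head_number, N]
        for out_row, prod_row in zip(out, matmul(a_h, b_h)):
            out_row.extend(prod_row)
    return out


# Shapes from the commit message: A is [3, 24], B is [24, 4], head_number is 4.
a = [[float(i * 24 + j) for j in range(24)] for i in range(3)]
b = [[float(i * 4 + j) for j in range(4)] for i in range(24)]
c = multi_head_matmul(a, b, 4)
print(len(c), len(c[0]))  # 3 16
```

With `head_number` set to 1 the sketch reduces to an ordinary matrix product.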

220eef60

28 Jun, 2019 1 commit
- Add a unittest to inplace elementwise_add (#18385) · f5641000
  Committed by Zeng Jinle on Jun 28, 2019
```
* add_elementwise_add_inplace_test,test=develop

* rename file, test=develop
```
  f5641000
25 Jun, 2019 1 commit

Sequence mask support tensor (#18249) · df2eee71

Committed by Hongyu Liu on Jun 25, 2019

* sequence mask support max length tensor input; test=develop

* add rnn_impl.py; test=develop

* add basic gru lstm unittest; test=develop

* fix api spec; test=develop

* fix sequence_mask op bug;
test=develop
test=document_preview

* change +-*x to elementwise_op; test=develop

* add mkl flag; test=develop

* fix rnn impl bug; test=develop

* update api spec; test=develop

* fix doc bug; test=develop

* fix lstm bugs; test=develop
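
For readers unfamiliar with the op, a sequence mask can be sketched as follows. This is a minimal pure-Python illustration, not Paddle's implementation; accepting `maxlen` as a one-element list is only a stand-in here for the "max length tensor input" mentioned above.

```python
def sequence_mask(lengths, maxlen=None):
    """Return one 0/1 row per sequence: position j is 1 when j < length."""
    if isinstance(maxlen, (list, tuple)):
        # Hypothetical stand-in for a one-element tensor holding the max length.
        maxlen = int(maxlen[0])
    if maxlen is None:
        maxlen = max(lengths) if lengths else 0
    return [[1 if j < n else 0 for j in range(maxlen)] for n in lengths]


print(sequence_mask([1, 3, 2]))     # [[1, 0, 0], [1, 1, 1], [1, 1, 0]]
print(sequence_mask([2, 0], [3]))   # [[1, 1, 0], [0, 0, 0]]
```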

df2eee71

14 Jun, 2019 1 commit
- Optimize fused_elewise_activation_grad op. (#18041) · 660c1a65
  Committed by Yiqun Liu on Jun 14, 2019
```
test=develop
```
  660c1a65
12 Jun, 2019 1 commit
- Optimize the concat and split cuda implementation for cases when the number of... · 7e463c84
  Committed by Yiqun Liu on Jun 12, 2019
```
Optimize the concat and split cuda implementation for cases when the number of inputs/outputs is less than 5. (#17979)

test=develop
```
  7e463c84
10 Jun, 2019 1 commit

Enable seq_pool op to accept len 0 input (#17284) · 33d1e565

Committed by Yibing Liu on Jun 10, 2019

* Enable seq_pool op to accept len 0 input

test=develop

* Update sequence_pool's api

test=develop

* Add more unittest cases for seq_pool op

test=develop

* Remove legacy comments

test=develop

* Don't use template in op maker

test=develop
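
As a rough illustration of what accepting a length-0 input means for a pooling op, the sketch below sum-pools sequences described by offset boundaries and emits a fill value for empty ones. The function name, the offset layout, and the `pad_value` parameter are assumptions made for this example and are not taken from the commit.

```python
def sequence_sum_pool(values, offsets, pad_value=0.0):
    """Sum-pool each sequence values[offsets[i]:offsets[i + 1]].

    An empty sequence (two equal consecutive offsets) yields pad_value
    instead of being rejected.
    """
    pooled = []
    for start, end in zip(offsets, offsets[1:]):
        pooled.append(sum(values[start:end]) if end > start else pad_value)
    return pooled


# Three sequences of lengths 2, 0 and 2.
print(sequence_sum_pool([1.0, 2.0, 3.0, 4.0], [0, 2, 2, 4]))  # [3.0, 0.0, 7.0]
```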

33d1e565

30 May, 2019 1 commit

Enhance fused_elementwise_activation op and add python api in contrib.layers (#17236) · 8fd39f3e

Committed by Yiqun Liu on May 30, 2019

* Enhance fused_elementwise_activation op.
test=develop

* Move the api fused_elementwise_activation to contrib.
test=develop

* Add including files.
test=develop

* Add the support of sigmoid in fused_elementwise_activation op.

* Update API.spec.
test=develop

8fd39f3e

29 May, 2019 1 commit

Optimize the concat and split kernel for special cases when the number of... · 5782ddda

Committed by Yiqun Liu on May 29, 2019

Optimize the concat and split kernel for special cases when the number of inputs/outputs is 2 (#17415)

* Optimize the concat and split kernel for special cases that the number of inputs/outputs is 2.
test=develop

* Refine codes.
test=develop

* Correct the condition.
test=develop

* Move the define of tmp_data outside the if statement.

* Print the cudnn minor version.
test=develop

* Fix the case when in_num/o_num is 1 in concat/split op.
test=develop

* Remove const_cast.
test=develop

5782ddda

24 May, 2019 1 commit

[CPU] refine cpu softmax bwd (#17534) · 7ae461eb

Committed by tensor-tang on May 24, 2019

* refine softmax fwd

test=develop

* refine cpu softmax bwd

test=develop

* fix batch size

test=develop

* fix compile issue with gpu

test=develop

* add value clip

7ae461eb

23 May, 2019 1 commit

[CPU] refine softmax op fwd on CPU (#17522) · 0600b370

Committed by tensor-tang on May 23, 2019

* refine softmax fwd

test=develop

* fix compile issue with gpu

test=develop

* add value clip to avoid exp
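
The commit message does not spell out how the clip works. One common way to clip values before exponentiation in a softmax is sketched below; the threshold of -64 and the function name are assumptions for illustration only.

```python
import math

CLIP_THRESHOLD = -64.0  # assumed lower bound, not taken from the commit


def softmax_with_clip(row, threshold=CLIP_THRESHOLD):
    """Softmax over one row: shift by the max, then clip from below so that
    exp() never receives an extremely negative argument."""
    row_max = max(row)
    shifted = [max(x - row_max, threshold) for x in row]
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax_with_clip([1000.0, 0.0, -1000.0])
print(probs)
```

Because of the clip, every output stays strictly positive, which keeps a later logarithm of the result finite.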

0600b370

21 May, 2019 1 commit

fix security bugs : (#17464) · ba70cc49

Committed by liuwei1031 on May 21, 2019

http://newicafe.baidu.com:80/issue/PaddleSec-33/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-28/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-25/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-24/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-21/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-20/show?from=page

test=develop

ba70cc49

16 May, 2019 1 commit

Add conditional compile for gru opt (#17368) · b02f2aff

Committed by zhaoyuchen2018 on May 16, 2019

* improve gru unit performance.
refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Add conditional compile for gru opt

Not enable gru opt if compute ability < 700

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* refine code.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

b02f2aff

15 May, 2019 1 commit
- Optimize the sequence padding op (#17403) · 0823a7bc
  Committed by Krzysztof Binias on May 15, 2019
```
test=develop
```
  0823a7bc
10 May, 2019 1 commit

improve gru unit performance. (#16338) · 8a2caacd

Committed by zhaoyuchen2018 on May 10, 2019

refine code

fuse cuBLAS calls and kernels into one CUDA kernel.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

8a2caacd

07 May, 2019 1 commit

Softmax_cross_entropy op add axis (#16806) · a71d8fdb

Committed by Kaipeng Deng on May 07, 2019

* add attr axis infershape. test=develop

* add CUDA kernel. test=develop

* fix unittest. test=develop

* fix unittest for soft_label. test=develop

* fix fp16 unittest. test=develop

* remove comment code. test=develop

* refine test for axis. test=develop

* add python api. test=develop

* fix doc. test=develop

* fix fp16 unittest. test=develop

* fix ngraph test. test=develop

* fix ENFORCE for test_imperative_transformer. test=develop

* fit for ngraph test. test=develop

* fix after rebase develop. test=develop

* fix doc. test=develop

* fix API.spec. test=develop

* fix test_layers. test=develop

* fix format. test=develop

a71d8fdb

20 Apr, 2019 1 commit

Support seq len equal to 0 in sequence ops (#16935) · 3c375751

Committed by Yibing Liu on Apr 20, 2019

* Support seq len equal to 0 in sequence ops

test=develop

* Add more test cases

* Fix some comments

test=develop

* Fix py3 error

test=develop

3c375751

17 Apr, 2019 1 commit

fix overflow by int32 mul test=develop (#16794) · c474e7dd

Committed by Kevin on Apr 17, 2019

* fix overflow by int32 mul test=develop

* fix reference nullptr

* fix codestyle test=develop

* modify to point in ContextProjectFunctor test=develop

* modify to point in ContextProjectFunctor test=develop

* modify . to -> test=develop
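
The kind of overflow named in this commit title can be reproduced with a small simulation of 32-bit signed multiplication. The operand values below are hypothetical; the commit does not say which product overflowed.

```python
def mul_int32(a, b):
    """Multiply two integers the way a 32-bit signed type would,
    wrapping around on overflow (two's complement)."""
    r = (a * b) & 0xFFFFFFFF
    return r - 0x100000000 if r >= 0x80000000 else r


rows, cols = 70000, 70000        # hypothetical sizes whose product exceeds 2**31 - 1
wrapped = mul_int32(rows, cols)  # what a 32-bit multiply would produce
exact = rows * cols              # what a 64-bit (or wider) multiply produces
print(wrapped, exact)            # 605032704 4900000000
```

Widening one operand to a 64-bit type before multiplying is the usual remedy for this class of bug.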

c474e7dd

12 Apr, 2019 3 commits
- fix cpplint test=develop · faae1b41
  Committed by Qiao Longfei on Apr 12, 2019
  
  faae1b41
- add cpu_merge_add_multi_noduplicated_test test=develop · 0a8ff2ec
  Committed by Qiao Longfei on Apr 12, 2019
  
  0a8ff2ec
- optimize merge add if input rows of all selected rows is not duplicated · 920a9609
  Committed by Qiao Longfei on Apr 12, 2019
  
  920a9609
25 Mar, 2019 1 commit
- fix format. test=develop · 90bd038d
  Committed by dengkaipeng on Mar 25, 2019
  
  90bd038d
20 Mar, 2019 2 commits
- fix sequence pad; test=develop · 1580be5d
  Committed by phlrain on Mar 20, 2019
  
  1580be5d
- add jit kernel for softmax axis. test=develop · 93701dba
  Committed by dengkaipeng on Mar 20, 2019
  
  93701dba
18 Mar, 2019 2 commits
- refine softmax kernel. test=develop · 6c641827
  Committed by dengkaipeng on Mar 18, 2019
  
  6c641827
- remove resize then seq num == 1; test=develop · 802b3348
  Committed by phlrain on Mar 18, 2019
  
  802b3348
14 Mar, 2019 2 commits
- revert revert 16144 · 5a92e4c0
  Committed by sneaxiy on Mar 14, 2019
```
test=develop
```
  5a92e4c0
- Revert "PaddingRNN model memory optimize" · a91964c8
  Committed by Zeng Jinle on Mar 14, 2019
```
test=develop
```
  a91964c8
12 Mar, 2019 1 commit
- refine code · b26e9bd2
  Committed by sneaxiy on Mar 12, 2019
```
test=develop
```
  b26e9bd2
08 Mar, 2019 3 commits
- simplify the jitkernel templates and tests · 14a764c9
  Committed by tensor-tang on Mar 08, 2019
```
test=develop
```
  14a764c9
- Make parent_idx a dispensable output for beam_search op to support models... · 66ead07e
  Committed by Yiqun Liu on Mar 08, 2019
```
Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106)

test=develop
```
  66ead07e
- Make parent_idx a dispensable output for beam_search op to support models... · 5bde1202
  Committed by Yiqun Liu on Mar 08, 2019
```
Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106)

test=develop
```
  5bde1202
07 Mar, 2019 1 commit
- unify the kernelfuncs cache and add unit test · 802f362a
  Committed by tensor-tang on Mar 07, 2019
```
test=develop
```
  802f362a
04 Mar, 2019 2 commits
- Fix error in CUDA kernel of beam_search. (#15957) · c90b82a6
  Committed by Yiqun Liu on Feb 28, 2019
```
test=develop
```
  c90b82a6
- Optimize gelu operation with mkl erf. · b48d56e8
  Committed by Yihua Xu on Feb 26, 2019
```
test=develop
```
  b48d56e8
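
For context, the erf-based form of GELU is gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2))). The sketch below evaluates it with Python's `math.erf`, which stands in here for the vectorized MKL routine the commit title refers to; the function name `gelu` is chosen for this example.

```python
import math


def gelu(x):
    """Exact (erf-based) GELU: x times the standard normal CDF at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


for x in (-1.0, 0.0, 1.0):
    print(x, gelu(x))
```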

BaiXuePrincess / Paddle is in sync with the fork's source project