Commits · 5782dddad05ca41c9da8861cb399998f8138bba0 · Crayon鑫 / Paddle

29 May, 2019 · 1 commit

Optimize the concat and split kernel for special cases when the number of... · 5782ddda

Committed by Yiqun Liu on May 29, 2019

Optimize the concat and split kernel for special cases when the number of inputs/outputs is 2 (#17415)

* Optimize the concat and split kernel for special cases that the number of inputs/outputs is 2.
test=develop

* Refine codes.
test=develop

* Correct the condition.
test=develop

* Move the define of tmp_data outside the if statement.

* Print the cudnn minor version.
test=develop

* Fix the case when in_num/o_num is 1 in concat/split op.
test=develop

* Remove const_cast.
test=develop

24 May, 2019 · 1 commit

[CPU] refine cpu softmax bwd (#17534) · 7ae461eb

Committed by tensor-tang on May 24, 2019

* refine softmax fwd

test=develop

* refine cpu softmax bwd

test=develop

* fix batch size

test=develop

* fix compile issue with gpu

test=develop

* add value clip

23 May, 2019 · 1 commit

[CPU] refine softmax op fwd on CPU (#17522) · 0600b370

Committed by tensor-tang on May 23, 2019

* refine softmax fwd

test=develop

* fix compile issue with gpu

test=develop

* add value clip to avoid exp

21 May, 2019 · 1 commit

fix security bugs : (#17464) · ba70cc49

Committed by liuwei1031 on May 21, 2019

http://newicafe.baidu.com:80/issue/PaddleSec-33/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-28/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-25/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-24/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-21/show?from=page
http://newicafe.baidu.com:80/issue/PaddleSec-20/show?from=page

test=develop

16 May, 2019 · 1 commit

Add conditional compile for gru opt (#17368) · b02f2aff

Committed by zhaoyuchen2018 on May 16, 2019

* improve gru unit performance.
refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Add conditional compile for gru opt

Not enable gru opt if compute ability < 700

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* refine code.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

15 May, 2019 · 1 commit
- Optimize the sequence padding op (#17403) · 0823a7bc
  Committed by Krzysztof Binias on May 15, 2019
```
test=develop
```
10 May, 2019 · 1 commit

improve gru unit performance. (#16338) · 8a2caacd

Committed by zhaoyuchen2018 on May 10, 2019

refine code

fuse cublas calling and kernels into one cuda kernel.

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

07 May, 2019 · 1 commit

Softmax_cross_entropy op add axis (#16806) · a71d8fdb

Committed by Kaipeng Deng on May 07, 2019

* add attr axis infershape. test=develop

* add CUDA kernel. test=develop

* fix unittest. test=develop

* fix unittest for soft_label. test=develop

* fix fp16 unittest. test=develop

* remove comment code. test=develop

* refine test for axis. test=develop

* add python api. test=develop

* fix doc. test=develop

* fix fp16 unittest. test=develop

* fix ngraph test. test=develop

* fix ENFORCE for test_imperative_transformer. test=develop

* fit for ngraph test. test=develop

* fix after rebase develop. test=develop

* fix doc. test=develop

* fix API.spec. test=develop

* fix test_layers. test=develop

* fix format. test=develop

20 Apr, 2019 · 1 commit

Support seq len equal to 0 in sequence ops (#16935) · 3c375751

Committed by Yibing Liu on Apr 20, 2019

* Support seq len equal to 0 in sequence ops

test=develop

* Add more test cases

* Fix some comments

test=develop

* Fix py3 error

test=develop

17 Apr, 2019 · 1 commit

fix overflow by int32 mul test=develop (#16794) · c474e7dd

Committed by Kevin on Apr 17, 2019

* fix overflow by int32 mul test=develop

* fix reference nullptr

* fix codestyle test=develop

* modify to point in ContextProjectFunctor test=develop

* modify to point in ContextProjectFunctor test=develop

* modify . to -> test=develop

12 Apr, 2019 · 3 commits
- fix cpplint test=develop · faae1b41
  Committed by Qiao Longfei on Apr 12, 2019
- add cpu_merge_add_multi_noduplicated_test test=develop · 0a8ff2ec
  Committed by Qiao Longfei on Apr 12, 2019
- optimize merge add if input rows of all selected rows is not duplicated · 920a9609
  Committed by Qiao Longfei on Apr 12, 2019
25 Mar, 2019 · 1 commit
- fix format. test=develop · 90bd038d
  Committed by dengkaipeng on Mar 25, 2019
20 Mar, 2019 · 2 commits
- fix sequence pad; test=develop · 1580be5d
  Committed by phlrain on Mar 20, 2019
- add jit kernel for softmax axis. test=develop · 93701dba
  Committed by dengkaipeng on Mar 20, 2019
18 Mar, 2019 · 2 commits
- refine softmax kernel. test=develop · 6c641827
  Committed by dengkaipeng on Mar 18, 2019
- remove resize then seq num == 1; test=develop · 802b3348
  Committed by phlrain on Mar 18, 2019
14 Mar, 2019 · 2 commits
- revert revert 16144 · 5a92e4c0
  Committed by sneaxiy on Mar 14, 2019
```
test=develop
```
- Revert "PaddingRNN model memory optimize" · a91964c8
  Committed by Zeng Jinle on Mar 14, 2019
```
test=develop
```
12 Mar, 2019 · 1 commit
- refine code · b26e9bd2
  Committed by sneaxiy on Mar 12, 2019
```
test=develop
```
08 Mar, 2019 · 3 commits
- simplify the jitkernel templates and tests · 14a764c9
  Committed by tensor-tang on Mar 08, 2019
```
test=develop
```
- Make parent_idx a dispensable output for beam_search op to support models... · 66ead07e
  Committed by Yiqun Liu on Mar 08, 2019
```
Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106)

test=develop
```
- Make parent_idx a dispensable output for beam_search op to support models... · 5bde1202
  Committed by Yiqun Liu on Mar 08, 2019
```
Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106)

test=develop
```
07 Mar, 2019 · 1 commit
- unify the kernelfuncs cache and add unit test · 802f362a
  Committed by tensor-tang on Mar 07, 2019
```
test=develop
```
04 Mar, 2019 · 3 commits
- Fix error in CUDA kernel of beam_search. (#15957) · c90b82a6
  Committed by Yiqun Liu on Feb 28, 2019
```
test=develop
```
- Optimize gelu operation with mkl erf. · b48d56e8
  Committed by Yihua Xu on Feb 26, 2019
```
test=develop
```
- improve communicator · 3691a46f
  Committed by Qiao Longfei on Mar 04, 2019
28 Feb, 2019 · 1 commit
- Fix error in CUDA kernel of beam_search. (#15957) · 87248281
  Committed by Yiqun Liu on Feb 28, 2019
```
test=develop
```
26 Feb, 2019 · 1 commit
- Optimize gelu operation with mkl erf. · 73967886
  Committed by Yihua Xu on Feb 26, 2019
```
test=develop
```
22 Feb, 2019 · 2 commits

Revert 15770 develop a6910f90 gelu mkl opt (#15872) · ee2321de
Committed by tensor-tang on Feb 22, 2019
```
* Revert "Optimize Gelu with MKL Erf function (#15770)"

This reverts commit 676995c8.

* test=develop
```

Optimize Gelu with MKL Erf function (#15770) · 676995c8

Committed by Yihua Xu on Feb 22, 2019

* Optimize for gelu operator

* Set up the low accuracy mode of MKL ERF function.

test=develop

* Only enable MKLML ERF when OS is linux

* Use the special mklml version included vmsErf function to verify gelu mkl kernel.

test=develop

* Add the CUDA macro to avoid NVCC's compile issue.

test=develop

* Add the TODO comments for mklml library modification.

test=develop

* Clean Code

test=develop

* Add the comment of macro for NVCC compiler.

test=develop

19 Feb, 2019 · 1 commit
- update comment · f2262d73
  Committed by xuezhong on Feb 19, 2019
```
test=develop
```
11 Feb, 2019 · 1 commit
- pass test for lstm op · fb9a6a2b
  Committed by xuezhong on Feb 11, 2019
```
test=develop
```
02 Feb, 2019 · 1 commit
- fix dependency · 061299be
  Committed by peizhilin on Feb 02, 2019
```
test=develop
```
30 Jan, 2019 · 5 commits
- remove debug print · 4c98c2cc
  Committed by xuezhong on Jan 30, 2019
- add sample_logits op · 58ad40cc
  Committed by xuezhong on Jan 30, 2019
- add cell clip and proj clip, fix bug for h0 · 88083632
  Committed by xuezhong on Jan 30, 2019
- add analyzer_transformer_test · 3d0ecab4
  Committed by Tao Luo on Jan 30, 2019
```
test=develop
```
- Return parent_idx in beam_search op (#15520) · 16d54f7f
  Committed by Yiqun Liu on Jan 30, 2019
```
* Refine beam_search_op to output an extra parent_idx tensor.
test=develop

* Fix the unittest test_beam_search_op.
test=develop

* Fix the merging mistake.
test=develop
```