Commits · 920a960974eaa6af1f250ec6d299ccfa603dafd2 · 机器未来 / Paddle

12 Apr, 2019 1 commit
- optimize merge add if input rows of all selected rows is not duplicated · 920a9609
  Authored by Qiao Longfei on Apr 12, 2019
25 Mar, 2019 1 commit
- fix format. test=develop · 90bd038d
  Authored by dengkaipeng on Mar 25, 2019
20 Mar, 2019 2 commits
- fix sequence pad; test=develop · 1580be5d
  Authored by phlrain on Mar 20, 2019
- add jit kernel for softmax axis. test=develop · 93701dba
  Authored by dengkaipeng on Mar 20, 2019
18 Mar, 2019 2 commits
- refine softmax kernel. test=develop · 6c641827
  Authored by dengkaipeng on Mar 18, 2019
- remove resize then seq num == 1; test=develop · 802b3348
  Authored by phlrain on Mar 18, 2019
14 Mar, 2019 2 commits
- revert revert 16144 · 5a92e4c0
  Authored by sneaxiy on Mar 14, 2019
```
test=develop
```
- Revert "PaddingRNN model memory optimize" · a91964c8
  Authored by Zeng Jinle on Mar 14, 2019
```
test=develop
```
12 Mar, 2019 1 commit
- refine code · b26e9bd2
  Authored by sneaxiy on Mar 12, 2019
```
test=develop
```
08 Mar, 2019 3 commits
- simplify the jitkernel templates and tests · 14a764c9
  Authored by tensor-tang on Mar 08, 2019
```
test=develop
```
- Make parent_idx a dispensable output for beam_search op to support models... · 66ead07e
  Authored by Yiqun Liu on Mar 08, 2019
```
Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106)

test=develop
```
- Make parent_idx a dispensable output for beam_search op to support models... · 5bde1202
  Authored by Yiqun Liu on Mar 08, 2019
```
Make parent_idx a dispensable output for beam_search op to support models saved by older paddle version. (#16106)

test=develop
```
07 Mar, 2019 1 commit
- unify the kernelfuncs cache and add unit test · 802f362a
  Authored by tensor-tang on Mar 07, 2019
```
test=develop
```
04 Mar, 2019 3 commits
- Fix error in CUDA kernel of beam_search. (#15957) · c90b82a6
  Authored by Yiqun Liu on Feb 28, 2019
```
test=develop
```
- Optimize gelu operation with mkl erf. · b48d56e8
  Authored by Yihua Xu on Feb 26, 2019
```
test=develop
```
- improve communicator · 3691a46f
  Authored by Qiao Longfei on Mar 04, 2019
28 Feb, 2019 1 commit
- Fix error in CUDA kernel of beam_search. (#15957) · 87248281
  Authored by Yiqun Liu on Feb 28, 2019
```
test=develop
```
26 Feb, 2019 1 commit
- Optimize gelu operation with mkl erf. · 73967886
  Authored by Yihua Xu on Feb 26, 2019
```
test=develop
```
22 Feb, 2019 2 commits

Revert 15770 develop a6910f90 gelu mkl opt (#15872) · ee2321de
Authored by tensor-tang on Feb 22, 2019
```
* Revert "Optimze Gelu with MKL Erf function (#15770)"

This reverts commit 676995c8.

* test=develop
```

Optimze Gelu with MKL Erf function (#15770) · 676995c8

Authored by Yihua Xu on Feb 22, 2019

* Optimize for gelu operator

* Set up the low accuracy mode of MKL ERF function.

test=develop

* Only enable MKLML ERF when OS is linux

* Use the special mklml version that includes the vmsErf function to verify gelu mkl kernel.

test=develop

* Add the CUDA macro to avoid NVCC's compile issue.

test=develop

* Add the TODO comments for mklml library modification.

test=develop

* Clean Code

test=develop

* Add the comment of macro for NVCC compiler.

test=develop


19 Feb, 2019 1 commit
- update comment · f2262d73
  Authored by xuezhong on Feb 19, 2019
```
test=develop
```
11 Feb, 2019 1 commit
- pass test for lstm op · fb9a6a2b
  Authored by xuezhong on Feb 11, 2019
```
test=develop
```
02 Feb, 2019 1 commit
- fix dependency · 061299be
  Authored by peizhilin on Feb 02, 2019
```
test=develop
```
30 Jan, 2019 5 commits
- remove debug print · 4c98c2cc
  Authored by xuezhong on Jan 30, 2019
- add sample_logits op · 58ad40cc
  Authored by xuezhong on Jan 30, 2019
- add cell clip and proj clip, fix bug for h0 · 88083632
  Authored by xuezhong on Jan 30, 2019
- add analyzer_transformer_test · 3d0ecab4
  Authored by Tao Luo on Jan 30, 2019
```
test=develop
```
- Return parent_idx in beam_search op (#15520) · 16d54f7f
  Authored by Yiqun Liu on Jan 30, 2019
```
* Refine beam_search_op to output an extra parent_idx tensor.
test=develop

* Fix the unittest test_beam_search_op.
test=develop

* Fix the merging mistake.
test=develop
```
29 Jan, 2019 3 commits
- cache fc kernel · a18c0d42
  Authored by tensor-tang on Jan 29, 2019
```
test=develop
```
- cache softmax kernel func · 6e1ee7fb
  Authored by tensor-tang on Jan 29, 2019
```
test=develop
```
- refine softmax and use with cache · d59f7335
  Authored by tensor-tang on Jan 28, 2019
```
test=develop
```
24 Jan, 2019 2 commits

Add the CUDA kernel for beam_search op (#15020) · 3008fa12

Authored by Yiqun Liu on Jan 24, 2019

* Refine the beam_search op and test.

* A basic CUDA implementation of beam_search for small batch_size.

* Implement CUDA kernel for beam_search_op.

* Use multiple CUDA threads in the same block to select the top beam.

* Update the python api of beam_search op.

* Enable extend function in CPU kernel of beam_search op.

* Unify the CUDA codes.
test=develop

* Unify the CPU kernel of beam_search op.

* Ensure the selected items of beam_search_op's CPU kernel are sorted by scores.

* Update the description of beam_search in API.spec.

* Enable the use of CUDA kernel in beam_search op.

* Exclude the beam_search's CUDA unittest when there is no CUDA gpu, and delete some debugging statements.
test=develop

* Follow comments.
test=develop

* Call the CPU kernel for beam_search op when batch_size > 4.
test=develop

* Remove the except of is_empty op in PrepareData.
test=develop


nce add check sample lables, test=develop (#15463) · 5cfc40de
Authored by tangwei12 on Jan 24, 2019
```
* nce add check sample lables, test=develop
```

21 Jan, 2019 1 commit

Memory optimization of depthwise conv op and group norm op (#15313) · 9f8f0fc2

Authored by Dun on Jan 21, 2019

* mem opt

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* test=develop

* refine code  test=develop

* refine code  test=develop

* refine code  test=develop

* refine code  test=develop

* refine with cub test=develop

* fix mkldnn test && remove comments && test=develop

* polish code && test=develop

* add only_forward test && test=develop


18 Jan, 2019 1 commit

Tree conv op (#15217) · e2ba9668

Authored by zhaozhehao on Jan 18, 2019

* refactor tree2col operator with new memory mechanism test=develop

* test=develop

* test=develop

* Modified API according to panyx0718 test=develop

* fix API change according to heavengate test=develop

* Modify API comment test=develop


14 Jan, 2019 1 commit
- fix gru_gpu_kernel test=develop · 4d15515c
  Authored by Qiao Longfei on Jan 14, 2019
13 Jan, 2019 3 commits
- fix build problem test=develop · 4feae253
  Authored by Qiao Longfei on Jan 13, 2019
- update avx gru grad kernel test=develop · 4c7be265
  Authored by Qiao Longfei on Jan 13, 2019
- update gru_grad_op · 9b16e540
  Authored by Qiao Longfei on Jan 13, 2019
```
test=develop
```
10 Jan, 2019 1 commit

[Feature] support mix precision training for resnet (#14899) · fd854183

Authored by Wu Yi on Jan 10, 2019

* clip softmax for fp16

* updates

* fuse xent support fp16 test=develop

* wip

* wip

* add simple row reduce

* wip fp16 accurate softmax

* add accurate softmax kernel for fp16 test=develop

* update test=develop

* fix cpu build test=develop

* update api.spec test=develop

* follow comments test=develop

* fix build test=develop

* fix trt build test=develop

* fix inference build test=develop

* fix merge test=develop

* update test=develop

* try fix build test=develop

* fix build test=develop

* rename real_exp test=develop

* fortest

* remove hacky kernels test=develop

* clean up test=develop

