Commits · 5b8837897d7dedfb27425bf63922d913a7369eb2 · BaiXuePrincess / Paddle

03 Jan, 2020 3 commits

Add the first implementation of fusion_group op (#19621) · d4832077

Committed by Yiqun Liu on Jan 03, 2020

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation of fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Refine the calling of PADDLE_ENFORCE.
test=develop

d4832077

[DNNL] 3D Fully-Connected (#21746) · 61921084
Committed by Michał Gallus on Jan 03, 2020

61921084

fix generate_proposal_labels op (#21793) · aa2ed0dc
Committed by FDInSky on Jan 03, 2020
```
* test=develop fix generate_proposal_labels op
```
aa2ed0dc

02 Jan, 2020 2 commits

update error log for batch_norm_grad (#22017) · 95d79b6d
Committed by ceci3 on Jan 02, 2020
```
* update error information about batch_norm_grad

* update bn,test=develop
```
95d79b6d

fix integer overflow in match_matrix (#22036) · c53b62eb

Committed by Aurelius84 on Jan 02, 2020

* fix integer overflow in match_matrix test=develop

* fix integer overflow in match_matrix test=develop

* fix typo test=develop

c53b62eb

01 Jan, 2020 1 commit
- polish default error msg & cublas error hint, test=develop (#22032) · 2e908225
  Committed by Chen Weihang on Jan 01, 2020
  
  2e908225
31 Dec, 2019 1 commit
- polish code test=develop (#22014) · 64baee41
  Committed by wangchaochaohu on Dec 31, 2019
  
  64baee41
30 Dec, 2019 4 commits
- Add error message for failed cublas initialization (#21995) · 35ff1568
  Committed by Chen Weihang on Dec 30, 2019
  
  35ff1568
- fix no hint problem when using ENFORCE for cuda, test=develop (#21994) · fbb42173
  Committed by Chen Weihang on Dec 30, 2019
  
  fbb42173
- Modify demo_ci to support Windows, prepare for PR_Windows_Inference (#21873) · e66f92d1
  Committed by zhouwei25 on Dec 30, 2019
  
  e66f92d1
- fix broadcast bug;test=develop (#21898) · b7697f62
  Committed by danleifeng on Dec 30, 2019
  
  b7697f62
29 Dec, 2019 1 commit

Fix multi-threads memory out of bounds error for passes (#21920) · 196e20df

Committed by liu zhengxi on Dec 29, 2019

* fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop

* fix attention_lstm_fuse_pass during multi-threads inference, test=develop

* fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop

* fix fc_lstm_fuse_pass during multi-threads inference, test=develop

* fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop

196e20df

27 Dec, 2019 5 commits

Refine multihead kernel, align block to 32 (#21961) · 8859ddd6

Committed by zhaoyuchen2018 on Dec 27, 2019

* Refine multihead kernel, align block to 32

test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>

* Refine log comments

test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>

8859ddd6

test=develop, remove unused variable (#21974) · fd9b00df
Committed by silingtong123 on Dec 27, 2019

fd9b00df

add shuffle batch op (#21674) · cee2ccb0

Committed by zhoushiyu on Dec 27, 2019

* add shuffle batch op, test=develop, test=document_preview

* fix size_t conflict and check_output test=develop, test=document_preview

* fix bug test=develop, test=document_preview

* add unittest of shuffle_batch layer test=develop, test=document_preview

* fix py coverage and op input type, test=develop, test=document_preview

* fix py coverage, test=develop

* fix en doc, test=develop

* move to contrib test=develop

* add unique_name test=develop

* invoke shuffle_batch in contrib.layers test=develop

cee2ccb0

make reverse op support negative axis (#21925) · c3e19549
Committed by mapingshuo on Dec 27, 2019
```
* make reverse op support negative axis
```
c3e19549

fix multi-thread error of fc_gru_fuse_pass.cc, test=develop (#21841) · 03479469
Committed by 石晓伟 on Dec 27, 2019
```
* fix multi-thread error of fc_gru_fuse_pass.cc, test=develop

* export FLAGS and GLOG symbols, test=develop
```
03479469

26 Dec, 2019 4 commits
- add conda build python script test=develop (#21943) · de9ba01f
  Committed by wangchaochaohu on Dec 27, 2019
```
* add script for conda package build
```
  de9ba01f
- Remove double registered dataType in Pad2d (#21942) · 10d68469
  Committed by Aurelius84 on Dec 26, 2019
```
* fix compile error in CUDA10 test=develop

* remove double in pad2d test=develop
```
  10d68469
- Fix openblas bug to support compiling on Windows when WITH_MKL=OFF (#21902) · 2df4be5d
  Committed by zhouwei25 on Dec 26, 2019
```
* Fix openblas to support compiling on Windows when WITH_MKL=OFF
```
  2df4be5d
- fix aucop stat shape (#21846) · 27decacb
  Committed by hutuxian on Dec 26, 2019
```
* fix stat shape back in global auc scenario
* add UT to cover global auc
```
  27decacb
25 Dec, 2019 6 commits
- fix trt calib not working bug, test=develop (#21934) · 3e5008ad
  Committed by Pei Yang on Dec 25, 2019
  
  3e5008ad
- add register op_data_type of pad/expand_as et al. (#21718) · 5cb2c741
  Committed by Aurelius84 on Dec 25, 2019
```
* add register op_data_type test=develop

* fix register bug in isfinite op test=develop

* rm int int64_t in pad2d gradKernel  test=develop
```
  5cb2c741
- Pack imperative/layer into paddle_framework.so (#21921) · 20667458
  Committed by qingqing01 on Dec 25, 2019
```
* Pack imperative/layer into paddle_framework.so
```
  20667458
- fix matmul error message; test=develop (#21885) · 30d000f8
  Committed by hong on Dec 25, 2019
  
  30d000f8
- remove patch command and file of cares to improve quality of Paddle Repo (#21776) · a01663ca
  Committed by zhouwei25 on Dec 25, 2019
  
  a01663ca
- python zero copy inference, delete pass (#21897) · 2bbc0d7d
  Committed by flame on Dec 25, 2019
```
* python zero copy inference
* support delete inference pass
```
  2bbc0d7d
24 Dec, 2019 5 commits

Optimize adam speed (#21777) · 51a86d2b

Committed by Aurelius84 on Dec 24, 2019

* optimize adam speed by removing _finish_update test=develop

* fix SparseAdamFunctor param list test=develop

* Remove scale_op in expect_list of adam_op test=develop

* fix test optimizer loss assert error test=develop

* fix test optimizer loss assert error test=develop

* modify PADDLE_ENFORCE usage test=develop

* fix op_type in lamb_op.cc test=develop

* fix errors ostream format bug test=develop

* add betaPowOut in ngraph op test=develop

* fix ngraph::op api for gcc8 test=develop

* clean code test=develop

* modify struct into class test=develop

* remove code of beta1Tensor in lamb_op test=develop

51a86d2b

Update layers used in ptb model to use auto-generated op functions in dygraph mode (#21724) · 310edc0d

Committed by Leo Chen on Dec 24, 2019

* update layers, test=develop

* fix input numpy, test=develop

* fix bugs, test=develop

* follow comments, test=develop

* update getitem, test=develop

310edc0d

change qat_performance with mobilenet, change batch_size of qat2_resnet50 (#21895) · 9dff56e8
Committed by lidanqing on Dec 24, 2019
```
test=develop
```
9dff56e8

Update iou_similarity op to support non-normalized bbox (#21671) · 6b9fbcf3
Committed by FDInSky on Dec 24, 2019
```
Update iou_similarity op to support non-normalized bbox
```
6b9fbcf3

Modify the while_loop API (#21844) · 46f9184a
Committed by guofei on Dec 24, 2019

46f9184a

23 Dec, 2019 3 commits
- Fix default label dim of label_smooth_op. test=develop (#21862) · 7689b6aa
  Committed by Guo Sheng on Dec 23, 2019
  
  7689b6aa
- change ci check rule of deleting unit-test (#21876) · 13e4756f
  Committed by zhouwei25 on Dec 23, 2019
  
  13e4756f
- optimize fc jit (#21878) · d4dda862
  Committed by GaoWei8 on Dec 23, 2019
```
test=develop
```
  d4dda862
20 Dec, 2019 4 commits

fix execution order of ci_check_unittest, and add it to Linux_py35 (#21640) · 013225bb
Committed by zhouwei25 on Dec 20, 2019

013225bb

fix softmax_with_cross_entropy_fix bug, test=develop (#21810) · 2b941736
Committed by Chen Weihang on Dec 20, 2019

2b941736

add table id in cache shuffle (#21585) · c3cf42d0

Committed by Thunderbrook on Dec 20, 2019

* general table

* add sparse table
test=develop

* no cvm
test=develop

* add no_cvm
test=develop

* add note
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* code style
test=develop

* add key of optimizer
test=develop

* solve pslib stop core
test=develop

* barrier
test=develop

* add notes
test=develop

* add table id in cache shuffle
test=develop

* table id
test=develop

* code style
test=develop

c3cf42d0

Disable memory opt pass when DNNL is on (#21826) · 253e6642

Committed by Michał Gallus on Dec 20, 2019

* Disable memory opt pass when DNNL is on

* Refine comment above mem optimization pass enablement

test=develop

253e6642

19 Dec, 2019 1 commit
- Speed GEO dense calc & communication (#21579) · a86f11b5
  Committed by Chengmo on Dec 19, 2019
```
* test=develop, speed dense calc & communication
```
  a86f11b5

BaiXuePrincess / Paddle is in sync with the source project it was forked from