Commits · 5b8837897d7dedfb27425bf63922d913a7369eb2 · BaiXuePrincess / Paddle

03 Jan, 2020 3 commits

Add the first implementation of fusion_group op (#19621) · d4832077

Committed by Yiqun Liu on Jan 03, 2020

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation of fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Refine the calling of PADDLE_ENFORCE.
test=develop


[DNNL] 3D Fully-Connected (#21746) · 61921084
Committed by Michał Gallus on Jan 03, 2020

fix generate_proposal_labels op (#21793) · aa2ed0dc
Committed by FDInSky on Jan 03, 2020
```
* test=develop fix generate_proposal_labels op
```

02 Jan, 2020 2 commits

update error log for batch_norm_grad (#22017) · 95d79b6d
Committed by ceci3 on Jan 02, 2020
```
* update error information about batch_norm_grad

* update bn,test=develop
```

fix integer overflow in match_matrix (#22036) · c53b62eb

Committed by Aurelius84 on Jan 02, 2020

* fix integer overflow in match_matrix test=develop

* fix integer overflow in match_matrix test=develop

* fix typo test=develop


31 Dec, 2019 1 commit

polish code test=develop (#22014) · 64baee41
Committed by wangchaochaohu on Dec 31, 2019

30 Dec, 2019 1 commit

fix broadcast bug;test=develop (#21898) · b7697f62
Committed by danleifeng on Dec 30, 2019

27 Dec, 2019 3 commits

Refine multihead kernel, align block to 32 (#21961) · 8859ddd6

Committed by zhaoyuchen2018 on Dec 27, 2019

* Refine multihead kernel, align block to 32

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* Refine log comments

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>


add shuffle batch op (#21674) · cee2ccb0

Committed by zhoushiyu on Dec 27, 2019

* add shuffle batch op, test=develop, test=document_preview

* fix size_t conflict and check_output test=develop, test=document_preview

* fix bug test=develop, test=document_preview

* add unittest of shuffle_batch layer test=develop, test=document_preview

* fix py coverage and op input type, test=develop, test=document_preview

* fix py coverage, test=develop

* fix en doc, test=develop

* move to contrib test=develop

* add unique_name test=develop

* invoke shuffle_batch in contrib.layers test=develop

make reverse op support negative axis (#21925) · c3e19549
Committed by mapingshuo on Dec 27, 2019
```
* make reverse op support negative axis
```

26 Dec, 2019 2 commits

Remove double registered dataType in Pad2d (#21942) · 10d68469
Committed by Aurelius84 on Dec 26, 2019
```
* fix compile error in CUDA10 test=develop

* remove double in pad2d test=develop
```

fix aucop stat shape (#21846) · 27decacb
Committed by hutuxian on Dec 26, 2019
```
* fix stat shape back in global auc scenario
* add UT to cover global auc
```

25 Dec, 2019 3 commits

add register op_data_type of pad/expand_as et al. (#21718) · 5cb2c741
Committed by Aurelius84 on Dec 25, 2019
```
* add register op_data_type test=develop

* fix register bug in isfinite op test=develop

* rm int int64_t in pad2d gradKernel test=develop
```

fix matmul error message; test=develop (#21885) · 30d000f8
Committed by hong on Dec 25, 2019

remove patch command and file of cares to improve quality of Paddle Repo (#21776) · a01663ca
Committed by zhouwei25 on Dec 25, 2019

24 Dec, 2019 3 commits

Optimize adam speed (#21777) · 51a86d2b

Committed by Aurelius84 on Dec 24, 2019

* optimize adam speed by removing _finish_update test=develop

* fix SparseAdamFunctor param list test=develop

* Remove scale_op in expect_list of adam_op test=develop

* fix test optimizer loss assert error test=develop

* fix test optimizer loss assert error test=develop

* modify PADDLE_ENFORCE usage test=develop

* fix op_type in lamb_op.cc test=develop

* fix errors ostream format bug test=develop

* add betaPowOut in ngraph op test=develop

* fix ngraph::op api for gcc8 test=develop

* clean code test=develop

* modify struct into class test=develop

* remove code of beta1Tensor in lamb_op test=develop

Update iou_similarity op to support non-normalized bbox (#21671) · 6b9fbcf3
Committed by FDInSky on Dec 24, 2019
```
Update iou_similarity op to support non-normalized bbox
```

Modify the while_loop API (#21844) · 46f9184a
Committed by guofei on Dec 24, 2019

23 Dec, 2019 2 commits

Fix default label dim of label_smooth_op. test=develop (#21862) · 7689b6aa
Committed by Guo Sheng on Dec 23, 2019

optimize fc jit (#21878) · d4dda862
Committed by GaoWei8 on Dec 23, 2019
```
test=develop
```

20 Dec, 2019 1 commit

fix softmax_with_cross_entropy_fix bug, test=develop (#21810) · 2b941736
Committed by Chen Weihang on Dec 20, 2019

19 Dec, 2019 4 commits

Speed GEO dense calc & communication (#21579) · a86f11b5
Committed by Chengmo on Dec 19, 2019
```
* test=develop, speed dense calc & communication
```

handle multi-inputs with empty inputs for mkldnn_concat_op (#21827) · 666c3bb9
Committed by Wojciech Uss on Dec 19, 2019
```
test=develop
```

Make While Op able to run on GPU place and add while_loop unittest (#21672) · 8b7c50f4
Committed by guofei on Dec 19, 2019
```
1. Make while_op accept GPU conditional data
2. Add more complex test cases for while_loop API
```

fix batch_norm_grad infer shape=0 & add allreduce enforce shape, test=develop (#21801) · 17299b8d
Committed by WangXi on Dec 19, 2019

17 Dec, 2019 1 commit

Fix That conditional_block_op Doesn't Have InferShape (#21733) · 0677a1c1
Committed by Huihuang Zheng on Dec 17, 2019

16 Dec, 2019 3 commits

Fix softmax cuda bug (#21720) · a5a8d144
Committed by zhaoyuchen2018 on Dec 16, 2019
```
* Fix softmax cuda bug

* Refine multihead log and softmax logic
```

yolo_box OP add Attr(clip_bbox). (#21620) · 943a4449
Committed by Kaipeng Deng on Dec 16, 2019
```
* yolo_box OP add Attr(clip_bbox). test=develop
```

Fix elementwise_pow bug on CUDA place with integer (#21675) · 7181afd7
Committed by Leo Chen on Dec 16, 2019
```
* fix elementwise_pow bug on integer, test=develop

* use llrint to support elementwise_pow_grad, test=develop

* add some tests, test=develop

* revert grad functor, test=develop
```

15 Dec, 2019 1 commit

Rename paddle throw error macro (#21657) · 1fd1f06f
Committed by Chen Weihang on Dec 15, 2019
```
* rename paddle throw error macro, test=develop

* fix new error use case, test=develop
```

12 Dec, 2019 2 commits

Add reshape int8 mkldnn op (#21428) · d419b859

Committed by joanna.wozna.intel on Dec 12, 2019

* Add reshape int8 op

test=develop

* Change test to CPUPlace

test=develop

* Correct tests

test=develop


memory leak for cpu (#21174) · 9ad940fd

Committed by tangwei12 on Dec 12, 2019

* add fake init for the trainer, fix large memory hold in the trainer
* do not merge recv vars from a remote endpoint, test=develop
* add recv and save op, merge slice var in one op, save memory
* remove hsigmoid with pull sparse, test=develop


11 Dec, 2019 1 commit

Modify padding strategy: remove weight copy in fc padding (#21650) · 5af0c7ba
Committed by GaoWei8 on Dec 11, 2019
```
test=develop
```

10 Dec, 2019 5 commits

fix the mean grad OP performance improvement test=develop (#21658) · 5eec8cf5
Committed by wangchaochaohu on Dec 10, 2019

refine some grad op makers, test=develop (#21629) · 29f64c8c
Committed by Zeng Jinle on Dec 10, 2019

Dropout with seed (#21590) · e2d849b9
Committed by mapingshuo on Dec 10, 2019
```
* add seed op
```

MKL-DNN 1.0 Update (#20162) · e81f0228

Committed by Adam on Dec 10, 2019

* MKLDNN v1.0 rebase to Paddle 1.6
test=develop

* Add hacky paddle::string::to_string() implementation

* vectorize<int64_t>() -> vectorize() cleanup
test=develop

* PADDLE_ENFORCE and void_cast fixes
test=develop

* Rebase changes
test=develop

* Cosmetics
test=develop

* Delete MKL from mkldnn.cmake
test=develop

* CMake debug commands
test=develop

* Delete MKLDNN_VERBOSE and rebase fixes
test=develop

* Rebase fixes
test=develop

* Temporarily disable int8 resnet101 vgg16 and vgg19 tests
test=develop

* Add libmkldnn.so.1 to python setup
test=develop

* Add libmkldnn.so.1 to inference_lib cmake after rebase
test=develop

* Post rebase fixes + FC int8 changes
test=develop

* Fix LRN NHWC
test=develop

* Fix NHWC conv3d
test=develop

* Windows build fix + next conv3d fix
test=develop

* Fix conv2d on AVX2 machines
test=develop

Mean gpu optimize (#21643) · 95b95a28
Committed by wangchaochaohu on Dec 09, 2019
```
* accelerate mean op test=develop
```

06 Dec, 2019 2 commits

Polish op registry codes (#21561) · 0f888836
Committed by Zeng Jinle on Dec 06, 2019
```
* polish infer shape registry, test=develop

* modify some operators registry, test=develop
```

Set lod_level of Out in compile time of sequence_pool_op (#21604) · 3d9dee57
Committed by Aurelius84 on Dec 06, 2019

BaiXuePrincess / Paddle is in sync with the fork's source project
