Commits · 2efb282c861b2eba83bdfdbe3c8f7b3d0de16a25 · Crayon鑫 / Paddle

24 Jul, 2019 2 commits

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Bob Zhu authored Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multi-head support, the multiplication of two large matrices is
split into the multiplication of several (head_number) smaller matrices. E.g., if
Mat A is [3, 24] and Mat B is [24, 4], then multiplying A and B with head_number
set to 4 splits Mat A into 4 matrices of [3, 6] and Mat B into 4 matrices of
[6, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].
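
The shape arithmetic in this commit message can be sketched in plain Python. This is an illustration only, not the operator's actual kernel: the helper names `matmul` and `multi_head_matmul` are hypothetical, and it assumes that A is split along its columns, B along its rows, and that the per-head results are concatenated along the column axis.

```python
def matmul(a, b):
    """Plain matrix product of two row-major nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def multi_head_matmul(a, b, head_number):
    """Multiply a [M, K] by b [K, N] head by head.

    Assumption: A is split along its columns and B along its rows into
    head_number equal chunks; per-head results are concatenated along
    the column axis, giving a [M, head_number * N] result.
    """
    k = len(b)
    if k % head_number != 0:
        raise ValueError("K must be divisible by head_number")
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        a_h = [row[h * step:(h + 1) * step] for row in a]  # [M, K / head_number]
        b_h = b[h * step:(h + 1) * step]                   # [K / head_number, N]
        for out_row, prod_row in zip(out, matmul(a_h, b_h)):
            out_row.extend(prod_row)
    return out


a = [[i * 24 + j for j in range(24)] for i in range(3)]   # shape [3, 24]
b = [[(i + j) % 5 for j in range(4)] for i in range(24)]  # shape [24, 4]
result = multi_head_matmul(a, b, head_number=4)
print(len(result), len(result[0]))  # 3 16
```

Under these assumptions, summing the four [3, 4] blocks element-wise gives back the ordinary [3, 4] product of A and B, which is a quick way to sanity-check the split.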

220eef60

Add python API for appending LoD level (#18702) · 075e1cf7

whs authored Jul 24, 2019

* Make lod reset op support appending a LoD level.

* Fix API.spec
test=develop

* Fix unittest.
test=develop

* Add python api for lod append.
test=develop

* Fix API.spec
test=develop

* Fix format of doc.
test=develop

* Fix unittest.
test=develop

* Fix doc.
test=develop

075e1cf7

23 Jul, 2019 4 commits

[MKL-DNN] Extended LRN with reusing via Acquire API (#18675) · 95c1816e

Jacek Czaja authored Jul 23, 2019

test=develop

- compilation fix

- Yet another compilation fix

- Even yet another compilation fix

- Surprise! Again compilation fix

- lint fixes

test=develop

- Fix to workspace acquire of LRN

test=develop

- Fix to hash of BWD LRN

test=develop

- fix to lrn BWD PD acquire

test=develop

- Fixing LRN PD creation

test=develop

- cosmetic fix in comment

test=develop

- Fixes after review

test=develop

95c1816e

Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664) · fd3aad6c
chengduo authored Jul 23, 2019
```
* support sparse gradients
test=develop
```
fd3aad6c

Cudnn convolution reconstruction (#18284) · 6b78e00d

wangchaochaohu authored Jul 23, 2019

* rewrite the conv_op using cudnn_conv_helper

* add workspace limit for v7 test=develop

* fix test=develop

* add half float test=develop

* fix test=develop

* fix test=develop

* revise code style test=develop

* fix test=develop

6b78e00d

supports distributed classification (#18690) · 157211c4

Yi Liu authored Jul 23, 2019

* supports distributed classification training
* update API.spec
* fix evenly division in python3
* change "index_range" to "index_num" in shard_index operator
test=document_preview
test=develop

157211c4

22 Jul, 2019 4 commits
- Fix CPU implementation of roi_align_op backward (#18728) · 3429e65a
  qingqing01 authored Jul 22, 2019
  
  3429e65a
- Revert "Add LeakyRelu MKLDNN support (#18656)" (#18723) · bd22453f
  Tao Luo authored Jul 22, 2019
```
test=develop
```
  bd22453f
- Make infer shape of pad2d support inputs with negative dims at compile time. (#18695) · 189b08dc
  whs authored Jul 22, 2019
```
test=develop
```
  189b08dc
- add license, test=develop (#18709) · 7e3963f2
  Bai Yifan authored Jul 22, 2019
  
  7e3963f2
20 Jul, 2019 2 commits
- test=develop (#18701) · ccf06a48
  cjt222 authored Jul 20, 2019
```
add license
```
  ccf06a48
- fix clip_by_norm doc (#18688) · 185b3ace
  wangguanzhong authored Jul 20, 2019
```
* fix clip_by_norm doc, test=develop
```
  185b3ace
19 Jul, 2019 2 commits

Support memory eager deletion on recurrent OP (#17710) · 89bc3fd8

Huihuang Zheng authored Jul 19, 2019

Test PaddingRNN on V100 GPU device.

Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU.

GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR)
Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)
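
To put the two measurements above in relative terms, the following short calculation derives the percentage change from the reported numbers. It is only arithmetic on the figures quoted in the commit message; the helper name `relative_change` is introduced here for illustration.

```python
def relative_change(new, old):
    """Signed relative change of `new` with respect to `old`, in percent."""
    return (new - old) / old * 100.0


memory_change = relative_change(6414, 6837)  # GPU memory in MiB, this PR vs. without
speed_change = relative_change(10.28, 9.89)  # steps per second, this PR vs. without

print(f"GPU memory: {memory_change:+.1f}%")  # GPU memory: -6.2%
print(f"Speed: {speed_change:+.1f}%")        # Speed: +3.9%
```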

89bc3fd8

Add LeakyRelu MKLDNN support (#18656) · d6b6a337
Adam authored Jul 19, 2019
```
test=develop
```
d6b6a337

18 Jul, 2019 2 commits
- hash_op support int64 hash_size (#18674) · bb2f5d24
  hutuxian authored Jul 18, 2019
```
* hash_op support int64 hash_size
* add corresponding UT
```
  bb2f5d24
- remove ctr reader, all functions are satisfied in dataset (#18672) · 5ed713d5
  guru4elephant authored Jul 18, 2019
```
* remove ctr reader, all functions are satisfied in dataset
```
  5ed713d5
17 Jul, 2019 3 commits
- Add cuda implementation for `prelu` backward pass (#18633) · ce1ec332
  Yang Zhang authored Jul 17, 2019
```
* Add GPU implementation for `prelu` backward pass

test=develop

* Fix logic error in `prelu` GPU backward and simplify a bit

test=develop

* Fix `prelu` backward CUDA implementation

test=develop

CPU version was not used actually, so test passed
```
  ce1ec332
- [CPU] Fix the compiling issue with AVX512F macro. (#18634) · 97549a4f
  Yihua Xu authored Jul 17, 2019
  
  97549a4f
- [NGraph] handle dim element 0 of ngraph op (#18568) · 256ba7cb
  baojun authored Jul 16, 2019
  
  256ba7cb
16 Jul, 2019 2 commits

[MKL-DNN] Reimplemented pool2d mkl-dnn to use Acquire API (#18585) · 71d883b8

Jacek Czaja authored Jul 16, 2019

* - Added partial draft of pooling acquire

- Workspace support

- compilation fix

- Added draft of pooling backward reimplementation

- Segfault fix

- reverted 'any' for diff_dst creation in pooling

- Lint fixes

test=develop

- lint fixes

test=develop

- Further lint fixes

test=develop

* - Fixes after review

test=develop

* - Lint fixes

test=develop

* - Even more lint fixes

test=develop

71d883b8

fix bug of scatter op (#18640) · f4ec7d54
chengduo authored Jul 16, 2019
```
test=develop
```
f4ec7d54

15 Jul, 2019 1 commit
- make auc op compatible with 1 dim (#18551) · ab57d389
  guru4elephant authored Jul 15, 2019
```
* make auc op compatible with 1 dim
```
  ab57d389
11 Jul, 2019 2 commits

fix cudnn lstm shape bug; test=develop (#18492) · a20b2b43

Hongyu Liu authored Jul 11, 2019

a20b2b43

Feature/buffer_shared_inplace (#17911) · d3003a16

Zeng Jinle authored Jul 11, 2019

* feature/buffer_shared_inplace, test=develop

* refine code, test=develop

* fix elementwise_add op cpu inplace and sum inplace bug, test=develop

* add unittest and debug log, test=develop

* fix parallel_executor scope bug, polish code, test=develop

* fix sum op, activation op, single_in_place_inference bug, test=develop

* remove kLocalExecScopeName, test=develop

* fix unittest,test=develop

* fix out_var first version bug, test=develop

* follow comments,test=develop

d3003a16

10 Jul, 2019 4 commits
- Clean unused code of dim and place (#18565) · be24e5b3
  Zeng Jinle authored Jul 10, 2019
```
* clean code of dim and place, test=develop

* fix failed unittests, test=develop
```
  be24e5b3
- Activations MKLDNN ops refactoring (#18191) · 8869d7f7
  Jacek Czaja authored Jul 10, 2019
  
  8869d7f7
- Register fp16 for concat_op (#18563) · b86234fc
  Yibing Liu authored Jul 10, 2019
  
  b86234fc
- fix compile error caused by gcc4.8 related commit;test=develop (#18567) · 5e1220ef
  Physher authored Jul 10, 2019
  
  5e1220ef
09 Jul, 2019 3 commits
- Fix/gcc 4.8 ubt link error (#18558) · 667f88f9
  Jiabin Yang authored Jul 09, 2019
```
* test=develop, fix docker with paddle nccl problem

* test=develop, fix/gcc_4.8_ubt_link_error

* test=develop, fix code format
```
  667f88f9
- Add mkldnn int8 mul-op kernel (#17834) · 0caa08ea
  Physher authored Jul 09, 2019
  
  0caa08ea
- Fix roi_perspective_transform_op bug (#18522) · 24d1c44a
  LielinJiang authored Jul 09, 2019
```
* fix transform matrix bug, test=develop

* modify API.spec
```
  24d1c44a
08 Jul, 2019 1 commit

Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532) · 88b52a27

Zhaolong Xing authored Jul 08, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

88b52a27

05 Jul, 2019 1 commit

Fix topk cannot handle 1D vector bug (#18466) · 832d8191

zhaoyuchen2018 authored Jul 05, 2019

* Fix topk cannot handle 1D vector bug

Add path to handle 1D vector

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

832d8191

04 Jul, 2019 2 commits
- Refine Infershape in activation_op for double_grad. (#18485) · 7ac4818a
  qingqing01 authored Jul 04, 2019
```
* Refine Infershape in activation_op for double_grad.
```
  7ac4818a
- Make fuse_all_reduce_op_pass support mix_precision (#17652) · 74538573
  chengduo authored Jul 04, 2019
  
  74538573
03 Jul, 2019 5 commits
- support Tensor input for edit_distance op (#18162) · 7c6f2350
  zhoukunsheng authored Jul 03, 2019
  
  7c6f2350
- support Tensor input for chunk_eval op (#18226) · 26318544
  zhoukunsheng authored Jul 03, 2019
```
* test=develop
support Tensor input for chunk_eval op

* test=develop
fix testcase for chunk_eval op

* test=develop
fix typos in nn.py
```
  26318544
- add unique kernel and op (#17557) · 206c44e2
  zhoukunsheng authored Jul 03, 2019
  
  206c44e2
- upgrade hash op to support Tensor and LoDTensor input (#17998) · 71af72b1
  zhoukunsheng authored Jul 03, 2019
  
  71af72b1
- add ones_like op (#17388) · d3b3443d
  zhoukunsheng authored Jul 03, 2019
  
  d3b3443d
