Commits · 2efb282c861b2eba83bdfdbe3c8f7b3d0de16a25 · Crayon鑫 / Paddle

24 Jul, 2019 5 commits

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Committed by Bob Zhu on Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multi-head support, the multiplication of two large matrices is
split into the multiplication of several (head_number) small matrices. E.g., if
Mat A is [3, 24] and Mat B is [24, 4], multiplying A and B with head_number
set to 4 splits Mat A into 4 matrices of [3, 6] and Mat B into 4 matrices of
[6, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].

220eef60
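
The head-splitting scheme described in the matmul commit (220eef60) can be sketched in plain Python. This is only an illustration of the shapes given in the commit message, not Paddle's implementation: the helper names `matmul` and `multi_head_matmul` are hypothetical, and concatenating the per-head results side by side is an assumption based on the stated [3, 16] output.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def multi_head_matmul(a, b, head_number):
    """Split A by columns and B by rows into head_number blocks,
    multiply each pair of blocks, and concatenate the results
    horizontally (assumed layout)."""
    k = len(b)
    if k % head_number != 0 or any(len(row) != k for row in a):
        raise ValueError("inner dimension must match and divide by head_number")
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        lo, hi = h * step, (h + 1) * step
        a_h = [row[lo:hi] for row in a]  # [M, K / head_number]
        b_h = b[lo:hi]                   # [K / head_number, N]
        for out_row, c_row in zip(out, matmul(a_h, b_h)):
            out_row.extend(c_row)
    return out


# Shapes from the commit message: A is [3, 24], B is [24, 4], 4 heads.
a = [[i * 24 + j for j in range(24)] for i in range(3)]
b = [[i * 4 + j for j in range(4)] for i in range(24)]
result = multi_head_matmul(a, b, head_number=4)
shape = (len(result), len(result[0]))  # (3, 16)
```

With `head_number=1` the sketch reduces to an ordinary matrix product, which is a quick way to sanity-check it.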

Add python API for appending LoD level (#18702) · 075e1cf7

Committed by whs on Jul 24, 2019

* Make lod reset op support appending a LoD level.

* Fix API.spec
test=develop

* Fix unit test.
test=develop

* Add python api for lod append.
test=develop

* Fix API.spec
test=develop

* Fix format of doc.
test=develop

* Fix unit test.
test=develop

* Fix doc.
test=develop

075e1cf7

Modify auc doc. Add output variable description, previously was the scalar... · 25c9b57b

Committed by JesseyXujin on Jul 24, 2019

Modify auc doc. Add output variable description; the output was previously a scalar type and is now a tuple type. test=develop (#18771)

25c9b57b

Update trt5 for paddle-trt (#18645) · 26ae6d49

Committed by Zhaolong Xing on Jul 24, 2019

* update paddle-trt for:
    1. fix bug: core dump in split plugin when batch > 2.
    2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.)
    3. add new attr to dropout.
    4. shuffle channel, swish, relu6 support
    test=develop

* 1. fix ci
test=develop

26ae6d49

add slot to sparse table (#18686) · d8396281

Committed by Thunderbrook on Jul 24, 2019

The change includes 2 things:

1. Saving the delta model and shrinking the table were previously controlled by the same parameter; now delete_after_unseen_days is added to control table shrinking.
2. Values in the sparse table previously had no slot; now a slot is added to the sparse table, and DownpourCtrAccessor is added to support the new meta.
test=develop

d8396281

23 Jul, 2019 5 commits

[MKL-DNN] Extended LRN with reusing via Acquire API (#18675) · 95c1816e

Committed by Jacek Czaja on Jul 23, 2019

test=develop

- compilation fix

- Yet another compilation fix

- Even yet another compilation fix

- Surprise! Again compilation fix

- lint fixes

test=develop

- Fix to workspace acquire of LRN

test=develop

- Fix to hash of BWD LRN

test=develop

- fix to lrn BWD PD acquire

test=develop

- Fixing LRN PD creation

test=develop

- cosmetic fix in comment

test=develop

- Fixes after review

test=develop

95c1816e

support patch data, add load_one_table, fix bug (#18509) · d18aabb4

Committed by jiaqi on Jul 23, 2019

(1) support patch data (merge slots of instances with the same line id, modify dense layer which
changes its size)
(2) add fleet load_one_table interface, supporting loading from a paddle model and from a pslib model
(3) fix push sparse bug which caused push sparse to take more time (about 10% in my test case)
(4) when some slots are not in one of your networks (join/update, etc.), data feed, collect label info, and push/pull sparse will skip these slots instead of throwing an error.
(5) add more debug info in TrainFilesWithProfiler

d18aabb4

Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664) · fd3aad6c
Committed by chengduo on Jul 23, 2019
```
* support sparse gradients
test=develop
```
fd3aad6c

Cudnn convolution reconstruction (#18284) · 6b78e00d

Committed by wangchaochaohu on Jul 23, 2019

* rewrite the conv_op using cudnn_conv_helper

* add workspace limit for v7 test=develop

* fix test=develop

* add half float test=develop

* fix test=develop

* fix test=develop

* revise code style test=develop

* fix test=develop

6b78e00d

supports distributed classification (#18690) · 157211c4

Committed by Yi Liu on Jul 23, 2019

* supports distributed classification training
* update API.spec
* fix even division in python3
* change "index_range" to "index_num" in shard_index operator
test=document_preview
test=develop

157211c4

22 Jul, 2019 4 commits
- Fix CPU implementation of roi_align_op backward (#18728) · 3429e65a
  Committed by qingqing01 on Jul 22, 2019
  
  3429e65a
- Revert "Add LeakyRelu MKLDNN support (#18656)" (#18723) · bd22453f
  Committed by Tao Luo on Jul 22, 2019
```
test=develop
```
  bd22453f
- Make infer shape of pad2d support input with negative dims at compile time. (#18695) · 189b08dc
  Committed by whs on Jul 22, 2019
```
test=develop
```
  189b08dc
- add license, test=develop (#18709) · 7e3963f2
  Committed by Bai Yifan on Jul 22, 2019
  
  7e3963f2
20 Jul, 2019 2 commits
- test=develop (#18701) · ccf06a48
  Committed by cjt222 on Jul 20, 2019
```
add license
```
  ccf06a48
- fix clip_by_norm doc (#18688) · 185b3ace
  Committed by wangguanzhong on Jul 20, 2019
```
* fix clip_by_norm doc, test=develop
```
  185b3ace
19 Jul, 2019 3 commits

Support memory eager deletion on recurrent OP (#17710) · 89bc3fd8

Committed by Huihuang Zheng on Jul 19, 2019

Test PaddingRNN on V100 GPU device.

Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU.

GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR)
Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)

89bc3fd8
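
The PaddingRNN figures quoted in the eager-deletion commit (89bc3fd8) can be turned into relative changes with a few lines of arithmetic. The snippet below is a small sketch that uses only the four numbers from the commit message; the variable names are illustrative.

```python
# Figures quoted in the commit message (V100, large model, padding mode).
mem_with_pr, mem_without_pr = 6414, 6837        # GPU memory, MiB
speed_with_pr, speed_without_pr = 10.28, 9.89   # steps/s

# Absolute and relative memory saving.
mem_saved = mem_without_pr - mem_with_pr
mem_saved_pct = 100 * mem_saved / mem_without_pr

# Relative throughput gain.
speedup_pct = 100 * (speed_with_pr - speed_without_pr) / speed_without_pr

print(f"memory saved: {mem_saved} MiB ({mem_saved_pct:.2f}%)")
print(f"speed gain: {speedup_pct:.2f}%")
```

Both changes are measured relative to the run without the PR, so the memory figure is a reduction and the speed figure is an increase.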

MKL-DNN upgrade to 0.20 (#18370) · 0d8e6c9b
Committed by Jacek Czaja on Jul 19, 2019
```
test=develop
```
0d8e6c9b
Add LeakyRelu MKLDNN support (#18656) · d6b6a337
Committed by Adam on Jul 19, 2019
```
test=develop
```
d6b6a337

18 Jul, 2019 4 commits

Optimize the content of error reporting information, print error code and... · 772e0956

Committed by zhouwei25 on Jul 18, 2019

Optimize the content of error reporting information, print error code and official document web sites (#18671)

optimize the error reporting information of cuda related API
index on develop: 130ac177 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into develop

772e0956

Feature/auto_growth_allocator (#18561) · ae58afc5

Committed by Zeng Jinle on Jul 18, 2019

* feature/auto_growth_allocator, test=develop

* add unittest of AlignedAllocator, test=develop

* try to turn on auto_growth to test on CI, test=develop

* fix segmentation fault in mixed_vector.h, test=develop

* add unittests, test=develop

ae58afc5

hash_op support int64 hash_size (#18674) · bb2f5d24
Committed by hutuxian on Jul 18, 2019
```
* hash_op support int64 hash_size
* add corresponding UT
```
bb2f5d24
remove ctr reader, all functions are satisfied in dataset (#18672) · 5ed713d5
Committed by guru4elephant on Jul 18, 2019
```
* remove ctr reader, all functions are satisfied in dataset
```
5ed713d5

17 Jul, 2019 5 commits

remove async executor and add data_feed.proto to the deps of train demo (#18659) · d714bf03
Committed by guru4elephant on Jul 17, 2019
```
* remove async executor and add data_feed.proto to the deps of train demo
```
d714bf03

Add cuda implementation for `prelu` backward pass (#18633) · ce1ec332

Committed by Yang Zhang on Jul 17, 2019

* Add GPU implementation for `prelu` backward pass

test=develop

* Fix logic error in `prelu` GPU backward and simplify a bit

test=develop

* Fix `prelu` backward CUDA implementation

test=develop

CPU version was not used actually, so test passed

ce1ec332

Fix Bitmain Predictor::Clone() (#18599) · 25d80791

Committed by 石晓伟 on Jul 17, 2019

* update anakin-engine interfaces for content-dnn

test=develop

* support only-gpu mode of Anakin

modify eltwise parse

test=develop

* modification for thread-safe

test=develop

* Integrated template instance

test=develop

* increase template parameters

test=develop

* support MLU predictor

test=develop

* update anakin cmake files

test=develop

* update TargetWrapper::set_device

* update the initialization of anakin subgraph

test=develop

* use the default constructor of base class

test=develop

* load model from buffer with length

test=develop

* modify the access level of class

test=develop

* support anakin for bitmain arch

test=develop

* remove files

* checkout cmakelists

test=develop

* modify interfaces

test=develop

* add cmake dependencies

test=develop

* enforce the outputs of net

test=develop

25d80791

[CPU] Fix the compiling issue with AVX512F macro. (#18634) · 97549a4f
Committed by Yihua Xu on Jul 17, 2019

97549a4f
[NGraph] handle dim element 0 of ngraph op (#18568) · 256ba7cb
Committed by baojun on Jul 16, 2019

256ba7cb

16 Jul, 2019 4 commits

fix PE fetch bug (#18644) · a6d468a2
Committed by chengduo on Jul 16, 2019
```
test=develop
```
a6d468a2
print out error code of cudaGetDeviceProperties if failed (#18643) · 75953096
Committed by liuwei1031 on Jul 16, 2019

75953096

[MKL-DNN] Reimplemented pool2d mkl-dnn to use Acquire API (#18585) · 71d883b8

Committed by Jacek Czaja on Jul 16, 2019

* - Added partial draft of pooling acquire

- Workspace support

- compilation fix

- Added draft of pooling backward reimplementation

- Segfault fix

- reverted 'any' for diff_dst creation in pooling

- Lint fixes

test=develop

- lint fixes

test=develop

- Further lint fixes

test=develop

* - Fixes after review

test=develop

* - Lint fixes

test=develop

* - Even more lint fixes

test=develop

71d883b8

fix bug of scatter op (#18640) · f4ec7d54
Committed by chengduo on Jul 16, 2019
```
test=develop
```
f4ec7d54

15 Jul, 2019 1 commit
- make auc op compatible with 1 dim (#18551) · ab57d389
  Committed by guru4elephant on Jul 15, 2019
```
* make auc op compatible with 1 dim
```
  ab57d389
12 Jul, 2019 4 commits
- not use transferscope cache in cpu case (#18578) · ff77dea9
  Committed by Leo Zhao on Jul 12, 2019
```
* not use transferscope cache in cpu case

test=develop

* adjust variable name and add comments

test=develop

* use correct format for class member in operator.h

* use correct format for class member in operator.cc

test=develop
```
  ff77dea9
- fix #17430: unexpected training behavior with int64-type attr (#18264) · b414645a
  Committed by 123malin on Jul 12, 2019
```
* fix int64_t

* update fill constant op unittest

* add empty line
```
  b414645a
- delete AllocatorFacade destructor (#18606) · db212bb9
  Committed by tangwei12 on Jul 12, 2019
```
* delete m, test=develop
```
  db212bb9
- Modify embedding_op input dtype to int64 (#18598) · 995d7d86
  Committed by Kevin on Jul 12, 2019
  
  995d7d86
11 Jul, 2019 3 commits
- add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy (#18580) · 076f8331
  Committed by Tao Luo on Jul 11, 2019
```
* add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy

test=develop

* enhance MkldnnPostReset

test=develop

* add comments for mkldnn_cache_capacity field

test=develop
```
  076f8331
- fix cudnn lstm shape bug; test=develop (#18492) · a20b2b43
  Committed by Hongyu Liu on Jul 11, 2019
  
  a20b2b43
- Polish backwards optimizer dependency codes and use more default values. (#18255) · c0a82748
  Committed by gongweibao on Jul 11, 2019
  
  c0a82748

Crayon鑫 / Paddle is in sync with the fork source project
