Commits · d3ac561d65d77c06026f3b33a5cf5f41f065f0b5 · PaddlePaddle / Paddle

25 Jul, 2019 4 commits
- fix deformable_conv_op compile error, test=develop (#18793) · d3ac561d
  Committed by Bai Yifan on Jul 25, 2019
- change ComputeINT8 to template version to remove checking dst_datatype code (#18756) · 9ecd8ee7
  Committed by lidanqing on Jul 25, 2019
```
* change INT8 to a template so that checking dst_dt with if-else can be removed. CI will be enabled after addressing review comments

* reverse user_residual_memory_p and user_bias_memory_p declaration scope
test=develop
```
- fix bug of swish op formula, test=develop (#18772) · d9e7b5b5
  Committed by JesseyXujin on Jul 25, 2019
- split test_dist_se_resnext.py into 4 testcases (#18743) · 2efb282c
  Committed by guru4elephant on Jul 25, 2019
```
* split test_dist_se_resnext.py into 4 testcases
```
24 Jul, 2019 8 commits

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Committed by Bob Zhu on Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multi-head support, the multiplication of two big matrices is
split into the multiplication of several (head_number) small matrices. E.g. if
Mat A is [3, 24] and Mat B is [24, 4], when multiplying A and B with head_number
set to 4, Mat A is split into 4 matrices of [3, 6] and Mat B into 4 matrices of
[6, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].

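The head-splitting arithmetic described in the matmul commit (#18570) can be sketched in plain Python. This is only an illustration of the shapes in the commit message, not the operator's actual kernel: the helper names `matmul` and `multihead_matmul` are hypothetical, and concatenating the per-head results along the column axis is an assumption drawn from the "[3, 16]" remark.

```python
def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def multihead_matmul(a, b, head_number):
    """Split A by columns and B by rows into head_number blocks, multiply
    block by block, and concatenate the per-head results along the columns."""
    k = len(b)  # shared inner dimension
    assert k % head_number == 0, "inner dimension must divide evenly"
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        lo, hi = h * step, (h + 1) * step
        a_h = [row[lo:hi] for row in a]  # [M, K / head_number]
        b_h = b[lo:hi]                   # [K / head_number, N]
        for out_row, res_row in zip(out, matmul(a_h, b_h)):
            out_row.extend(res_row)
    return out


a = [[i * 24 + j for j in range(24)] for i in range(3)]  # Mat A: [3, 24]
b = [[i * 4 + j for j in range(4)] for i in range(24)]   # Mat B: [24, 4]
result = multihead_matmul(a, b, head_number=4)
shape = (len(result), len(result[0]))                    # 3 rows, 16 columns
```

Summing the four [3, 4] blocks element-wise gives back the ordinary [3, 4] product of A and B, which is a convenient sanity check for the split.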

Add python API for appending LoD level (#18702) · 075e1cf7

Committed by whs on Jul 24, 2019

* Make lod_reset op support appending a LoD level.

* Fix API.spec
test=develop

* Fix unit test.
test=develop

* Add Python API for LoD append.
test=develop

* Fix API.spec
test=develop

* Fix format of doc.
test=develop

* Fix unit test.
test=develop

* Fix doc.
test=develop

remove package.cmake (#18760) · 8de5aa1b
Committed by Tao Luo on Jul 24, 2019
```
test=develop
```
Enhance backward process (#18700) · 8259f141
Committed by chengduo on Jul 24, 2019
```
* prune backward ops
test=develop
```

Modify auc doc. Add output variable description, previously was the scalar... · 25c9b57b

Committed by JesseyXujin on Jul 24, 2019

Modify auc doc. Add output variable description; the output was previously a scalar type and is now a tuple type. test=develop (#18771)

Update trt5 for paddle-trt (#18645) · 26ae6d49

Committed by Zhaolong Xing on Jul 24, 2019

* update paddle-trt for:
    1. fix bug: core dump in split plugin when batch > 2.
    2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms).
    3. add new attr to dropout.
    4. shuffle channel, swish, relu6 support
    test=develop

* 1. fix ci
test=develop

add slot to sparse table (#18686) · d8396281

Committed by Thunderbrook on Jul 24, 2019

The change includes 2 things:

1. Saving the delta model and shrinking the table were previously controlled by the same parameter; delete_after_unseen_days is now added to control table shrinking.
2. Values in the sparse table previously had no slot; a slot is now added to the sparse table, along with DownpureCtrAccessor to support the new meta.
test=develop
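As a rough illustration of the sparse table change (#18686), the sketch below models a table whose values carry a slot and whose shrink step is governed by `delete_after_unseen_days`. Only that parameter name comes from the commit message. The `SparseValue` class, its fields, the `shrink_table` helper and the "keep if unseen days do not exceed the threshold" rule are all hypothetical assumptions, not the pslib implementation.

```python
from dataclasses import dataclass


@dataclass
class SparseValue:
    slot: int         # slot id stored alongside the value (hypothetical field)
    unseen_days: int  # days since the feature was last seen (hypothetical field)
    weight: float


def shrink_table(table, delete_after_unseen_days):
    """Return a copy of the table without entries that have gone unseen
    for more than delete_after_unseen_days (assumed threshold rule)."""
    return {
        key: value
        for key, value in table.items()
        if value.unseen_days <= delete_after_unseen_days
    }


table = {
    101: SparseValue(slot=1, unseen_days=2, weight=0.5),
    102: SparseValue(slot=1, unseen_days=40, weight=0.1),
    203: SparseValue(slot=2, unseen_days=30, weight=0.3),
}
kept = shrink_table(table, delete_after_unseen_days=30)
```

Keeping the shrink threshold separate from the delta-save setting means either can be tuned without affecting the other, which is the point the commit message makes.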

modify install GPU 97 (#18768) · f0cfc3c3
Committed by xsrobin on Jul 24, 2019
```
* modify install GPU97

* modify install GPU97
```

23 Jul, 2019 6 commits

[MKL-DNN] Extended LRN with reusing via Acquire API (#18675) · 95c1816e

Committed by Jacek Czaja on Jul 23, 2019

test=develop

- compilation fix

- Yet another compilation fix

- Even yet another compilation fix

- Surprise! Again compilation fix

- lint fixes

test=develop

- Fix to workspace acquire of LRN

test=develop

- Fix to hash of BWD LRN

test=develop

- fix to lrn BWD PD acquire

test=develop

- Fixing LRN PD creation

test=develop

- cosmetic fix in comment

test=develop

- Fixes after review

test=develop


remove unused cmake file (#18744) · 0ae45f0b
Committed by Tao Luo on Jul 23, 2019
```
test=develop
```

support patch data, add load_one_table, fix bug (#18509) · d18aabb4

Committed by jiaqi on Jul 23, 2019

(1) support patch data (merge slots of instances with the same line id, modify the dense layer whose
size changes)
(2) add fleet load_one_table interface, which supports loading from a paddle model and from a pslib model
(3) fix a push sparse bug that made push sparse cost more time (about 10% in my test case)
(4) when some slots are not in one of your networks (join/update, etc.), data feed, collect label info, and push/pull sparse will skip these slots instead of throwing an error.
(5) add more debug info in TrainFilesWithProfiler

Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664) · fd3aad6c
Committed by chengduo on Jul 23, 2019
```
* support sparse gradients
test=develop
```

Cudnn convolution reconstruction (#18284) · 6b78e00d

Committed by wangchaochaohu on Jul 23, 2019

* rewrite the conv_op using cudnn_conv_helper

* add workspace limit for v7 test=develop

* fix test=develop

* add half float test=develop

* fix test=develop

* fix test=develop

* revise code style test=develop

* fix test=develop


supports distributed classification (#18690) · 157211c4

Committed by Yi Liu on Jul 23, 2019

* supports distributed classification training
* update API.spec
* fix even division in python3
* change "index_range" to "index_num" in shard_index operator
test=document_preview
test=develop
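Two items in the distributed classification commit (#18690) lend themselves to a short sketch: integer division under Python 3 and the `index_num` argument of the shard_index operator. The function below is a plain-Python stand-in, not the operator itself. The remapping rule (ceil-divide `index_num` across shards, keep the in-shard offset, otherwise emit an ignore value) reflects my understanding of the operator's documented behavior and should be treated as an assumption, as should the default `ignore_value=-1`.

```python
def shard_index(values, index_num, nshards, shard_id, ignore_value=-1):
    """Map global indices to shard-local offsets for one shard."""
    # Ceil division written with // so shard_size stays an int on Python 3;
    # a plain / would yield a float there.
    shard_size = (index_num + nshards - 1) // nshards
    return [
        v % shard_size if v // shard_size == shard_id else ignore_value
        for v in values
    ]


labels = [1, 6, 12, 19]
shard0 = shard_index(labels, index_num=20, nshards=2, shard_id=0)
shard1 = shard_index(labels, index_num=20, nshards=2, shard_id=1)
```

With 20 indices over 2 shards each shard covers 10 indices, so `shard0` keeps 1 and 6 while `shard1` sees 12 and 19 as local offsets 2 and 9.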

22 Jul, 2019 11 commits
- Fix CPU implementation of roi_align_op backward (#18728) · 3429e65a
  Committed by qingqing01 on Jul 22, 2019
- add parameter server launch (#18687) · 70b03760
  Committed by guru4elephant on Jul 22, 2019
```
add parameter server launch so that a user can easily launch parameter server
```
- add more traceback to py_reader error msg, test=develop (#18722) · d07ad4c6
  Committed by Zeng Jinle on Jul 22, 2019
- Fix random test_recurrent_op failure (#18718) · a3028bb7
  Committed by Huihuang Zheng on Jul 22, 2019
```
The change includes 3 things:

1. Set CPU_NUM to 1 in the tests because otherwise the ParallelExecutor prints a warning that CPU_NUM is not set and uses the default of 1.

2. Old tests compare two RNNs, a hand-written simple RNN and the same RNN built by Paddle, but initialized the RNN weights with numpy random and Paddle random separately. Fixed it by setting the weight and bias values explicitly.

3. Also set the numpy random seed in the tests. Now the diff between the two RNNs can be smaller (rtol from 0.1, 0.2 to 0.01) in the tests.

test=develop
```
- Revert "Add LeakyRelu MKLDNN support (#18656)" (#18723) · bd22453f
  Committed by Tao Luo on Jul 22, 2019
```
test=develop
```
- Change api approval people name (#18699) · 58469186
  Committed by tianshuo78520a on Jul 22, 2019
- Make infer shape of pad2d support input with negative dims at compile time. (#18695) · 189b08dc
  Committed by whs on Jul 22, 2019
```
test=develop
```
- remove unused gzstream.cmake (#18705) · c457a69d
  Committed by Tao Luo on Jul 22, 2019
```
test=develop
```
- do some odd jobs (#18641) · d8458483
  Committed by tangwei12 on Jul 22, 2019
```
do some odd jobs, test=develop
```
- add license, test=develop (#18709) · 7e3963f2
  Committed by Bai Yifan on Jul 22, 2019
- split different comm method for mnist distributed training (#18715) · ebf9797e
  Committed by guru4elephant on Jul 22, 2019
```
* split different comm method for mnist distributed training
```
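The test_recurrent_op fix (#18718) in the 22 Jul list rests on three ideas: a fixed random seed, one shared set of weights for both RNNs, and a tighter relative tolerance. A minimal sketch of that testing pattern follows, using only the standard library. The scalar RNN and both function names are hypothetical stand-ins; the real test compares a hand-written RNN against one built by Paddle.

```python
import math
import random


def rnn_iterative(inputs, w, u, b):
    """Reference RNN: h_t = tanh(w * x_t + u * h_{t-1} + b), scalar state."""
    h = 0.0
    for x in inputs:
        h = math.tanh(w * x + u * h + b)
    return h


def rnn_recursive(inputs, w, u, b):
    """Same recurrence written recursively, standing in for the second build."""
    if not inputs:
        return 0.0
    h_prev = rnn_recursive(inputs[:-1], w, u, b)
    return math.tanh(w * inputs[-1] + u * h_prev + b)


random.seed(123)  # a fixed seed makes every run use the same numbers
w, u, b = (random.uniform(-1, 1) for _ in range(3))  # one shared set of weights
inputs = [random.uniform(-1, 1) for _ in range(8)]

out_a = rnn_iterative(inputs, w, u, b)
out_b = rnn_recursive(inputs, w, u, b)
outputs_match = math.isclose(out_a, out_b, rel_tol=0.01)  # rtol as in the commit
```

Because both implementations receive identical weights, any remaining difference comes from the computation itself, which is what allows the tolerance to drop from 0.1 or 0.2 to 0.01.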
20 Jul, 2019 2 commits
- test=develop (#18701) · ccf06a48
  Committed by cjt222 on Jul 20, 2019
```
add license
```
- fix clip_by_norm doc (#18688) · 185b3ace
  Committed by wangguanzhong on Jul 20, 2019
```
* fix clip_by_norm doc, test=develop
```
19 Jul, 2019 5 commits
- Support memory eager deletion on recurrent OP (#17710) · 89bc3fd8
  Committed by Huihuang Zheng on Jul 19, 2019
```
Test PaddingRNN on V100 GPU device.

Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU.

GPU memory (MiB):   6414 (this PR)     vs   6837 (without this PR)
Speed (steps/s):    10.28 (this PR)    vs   9.89 (without this PR)
```
- MKL-DNN upgrade to 0.20 (#18370) · 0d8e6c9b
  Committed by Jacek Czaja on Jul 19, 2019
```
test=develop
```
- Add LeakyRelu MKLDNN support (#18656) · d6b6a337
  Committed by Adam on Jul 19, 2019
```
test=develop
```
- add check of executor (#17986) · 0b9acb49
  Committed by tangwei12 on Jul 19, 2019
```
* add check of executor, test=develop
```
- Change to use brpc rdma branch instead of personal branch. (#18683) · ec1000cc
  Committed by gongweibao on Jul 19, 2019
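The PaddingRNN figures quoted in the eager-deletion commit (#17710) in the 19 Jul list can be turned into relative changes with a few lines of arithmetic. The four numbers are taken from the commit message; the helper name `relative_change` is hypothetical.

```python
def relative_change(new, old):
    """Percentage change of new relative to old."""
    return (new - old) / old * 100.0


memory_change = relative_change(6414, 6837)  # GPU memory in MiB, with vs without the PR
speed_change = relative_change(10.28, 9.89)  # steps/s, with vs without the PR
summary = "memory {:+.1f}%, speed {:+.1f}%".format(memory_change, speed_change)
```

On these numbers the PR uses roughly 6.2% less GPU memory and runs about 3.9% more steps per second.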
18 Jul, 2019 4 commits

Optimize the content of error reporting information, print error code and... · 772e0956

Committed by zhouwei25 on Jul 18, 2019

Optimize the content of error reporting information, print error code and official documentation websites (#18671)

optimize the error reporting information of CUDA-related APIs
index on develop: 130ac177 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into develop

Feature/auto_growth_allocator (#18561) · ae58afc5

Committed by Zeng Jinle on Jul 18, 2019

* feature/auto_growth_allocator, test=develop

* add unittest of AlignedAllocator, test=develop

* try to turn on auto_growth to test on CI, test=develop

* fix segmentation fault in mixed_vector.h, test=develop

* add unittests, test=develop


hash_op support int64 hash_size (#18674) · bb2f5d24
Committed by hutuxian on Jul 18, 2019
```
* hash_op support int64 hash_size
* add corresponding UT
```
update readme to 1.5.1 (#18670) · a5d4c2fa
Committed by xsrobin on Jul 18, 2019

PaddlePaddle / Paddle synced successfully more than 1 year ago
