Commits · 3816d221ffdcf97babf0ad95f2ae03e70be698ea · BaiXuePrincess / Paddle

02 Aug, 2019 1 commit

Fusion: seqpool_cvm_concat (#18471) · ee2f296e

Committed by 石晓伟 on Aug 02, 2019

* add fusion_seqpool_cvm_concat test=develop

* simplify pass, test=develop

* fix code style, test=develop

ee2f296e

01 Aug, 2019 3 commits

Add the op of unique_with_counts, expand count function of the op unique (#18720) · 3ab1866c

Committed by wawltor on Aug 01, 2019

* test=develop
Add the op of unique_with_counts; the op computes the unique elements of the input data and outputs the corresponding indices and counts.

* test=develop
Check the input and dtype in the op of unique_with_counts

* test=develop
test=document_preview
update the API.spec for `unique_with_counts`, at the same time, optimize the python api in the op of `unique_with_counts`

* test=develop
test=document_preview
Fix some python api problems in the op of `unique_with_counts`, and change the error message in this op.

* Fix some API problem in the op of `unique_with_counts`
test=develop
test=document_preview

* test=develop
test=document_preview
Fix the api sample of op `unique_with_counts`, and update api.spec
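The behaviour described in this commit message (unique values plus the corresponding indices and counts) can be sketched in plain Python. This is only an illustration of the semantics, not the operator's kernel: the function below is a hypothetical helper, and keeping the unique values in first-seen order is an assumption.

```python
def unique_with_counts(values):
    """Return (unique, index, count) for a 1-D list (illustrative helper)."""
    unique = []     # distinct values, in order of first appearance (assumed order)
    position = {}   # value -> its position in `unique`
    index = []      # for each input element, its position in `unique`
    count = []      # number of occurrences of each distinct value
    for v in values:
        if v not in position:
            position[v] = len(unique)
            unique.append(v)
            count.append(0)
        index.append(position[v])
        count[position[v]] += 1
    return unique, index, count


unique, index, count = unique_with_counts([2, 3, 3, 1, 5, 3])
print(unique, index, count)
```

Here `index` maps every input element back to its slot in `unique`, so the input can be rebuilt as `[unique[i] for i in index]`.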

3ab1866c

- Removed passing X from FWD to GRAD via device context (#18911) · 5cf2d385

Committed by Jacek Czaja on Aug 01, 2019

test=develop

- Extracted key generation from FWD and GRAD into separate function

test=develop

- Compilation fix

test=develop

- Another compilation fix

test=develop

5cf2d385

Fix depthwise conv gpu kernel bug (#18582) · 22fa4c2d

Committed by LielinJiang on Aug 01, 2019

* fix depthwise conv gpu kernel bug, test=develop

* add more depthwise conv test, test=develop

22fa4c2d

31 Jul, 2019 7 commits

fix several security bugs reported by security team (#18831) · 0d996908

Committed by liuwei1031 on Jul 31, 2019

* fix security issue, test=develop

* bug fix, test=develop

* throw an exception when null pointer data with non-zero length PaddleBuf is passed, test=develop

0d996908

Trt fp16 support (#18860) · 61238d31

Committed by Zhaolong Xing on Jul 31, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

* add trt fp16 support
test=develop

61238d31

[DyGraph] Make multi-card program faster (#18892) · 20859c08

Committed by chengduo on Jul 31, 2019

* update parallel.py
test=develop

20859c08

Add center Loss Op Support (#18681) · 24f85431

Committed by HaoRen on Jul 31, 2019

* support center loss
* change tensor copy api to high level api tensorcopy

* test=develop rewrite the center_loss cuda_kernel to make it faster,
add documentation for the center loss api, and update the test function

* test=document_preview test=develop
update document of center loss

* test=document_preview test=develop
modify API.spec, modify test code, remove unused const_cast

24f85431

use mkl to accelerate gelu_grad (#18099) · 86e494eb

Committed by Leo Zhao on Jul 31, 2019

test=develop

86e494eb

Optimize the error report information when loadcombine fail to open model... · dfd6a62a

Committed by wopeizl on Jul 31, 2019

Optimize the error report information when loadcombine fail to open model files test=develop (#18888)

dfd6a62a

upgrade ngraph version and simplify ngraph engine (#18853) · adcfc53b

Committed by baojun on Jul 30, 2019

* upgrade ngraph to v0.24 test=develop

* simplify io test=develop

adcfc53b

30 Jul, 2019 2 commits

[MKL-DNN] Fix int8 performance regression (#18758) · cfcb96d2

Committed by Jacek Czaja on Jul 30, 2019

test=develop

- optimization of TID to string

test=develop

cfcb96d2

Add elementwise_pow_op backward implementation and the unit test codes of it. (#18848) · e0a2d4df

Committed by danleifeng on Jul 30, 2019

e0a2d4df

28 Jul, 2019 1 commit

fix affine_channel no_need buffer bug, test=develop (#18844) · 9a8a7a1d

Committed by Zeng Jinle on Jul 28, 2019

9a8a7a1d

26 Jul, 2019 4 commits

Add LeakyReLU MKLDNN support (#18762) · ee022279

Committed by Adam on Jul 26, 2019

ee022279

remove unused TransposeINT8Op for higher UT coverage (#18791) · b05bdda0

Committed by lidanqing on Jul 26, 2019

test=develop

b05bdda0

fix mul_mkldnn_op build failure (#18816) · c5f47c21

Committed by Physher on Jul 26, 2019

c5f47c21

clarify MKLDNN INT8 Mul Op attributes (#18685) · a5c98630

Committed by Physher on Jul 26, 2019

a5c98630

25 Jul, 2019 4 commits

fix roi_align_op cpu backward's bug (#18789) · cff5e2c1

Committed by FDInSky on Jul 25, 2019

* test=develop fix cpu roi_align_op backward bug

cff5e2c1

fix deformable_conv_op compile error, test=develop (#18793) · d3ac561d

Committed by Bai Yifan on Jul 25, 2019

d3ac561d

change ComputeINT8 to template version to remove checking dst_datatype code (#18756) · 9ecd8ee7

Committed by lidanqing on Jul 25, 2019

* change INT8 to template so that checking dst_dt with if-else could be removed. CI will be enabled after fixing reviews

* reverse user_residual_memory_p and user_bias_memory_p declaration scope
test=develop

9ecd8ee7

fix bug of swish op formula,test=develop (#18772) · d9e7b5b5

Committed by JesseyXujin on Jul 25, 2019

d9e7b5b5

24 Jul, 2019 2 commits

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Committed by Bob Zhu on Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multiple-head support, the multiplication of two big matrices is
split into multiplications of several (head_number) small matrices. E.g. if
Mat A is [3, 24] and Mat B is [24, 4], when multiplying A and B with head_number
as 4, Mat A will be split into 4 matrices of [3, 6] and Mat B into 4 matrices of
[6, 4]. The result will be 4 matrices of [3, 4], i.e. [3, 16].
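
The shape arithmetic in this message can be checked with a small pure-Python sketch. It is not the operator's implementation: `multi_head_matmul` is a hypothetical helper, and the split axes (A by columns, B by rows) and the column-wise concatenation of the per-head results are inferred from the shapes quoted above.

```python
def matmul(a, b):
    """Plain matrix product of two row-major nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def multi_head_matmul(a, b, head_number):
    """Split A by columns and B by rows into head_number blocks, multiply
    block-wise, and concatenate the per-head results along the columns."""
    k = len(b)  # shared inner dimension
    assert k % head_number == 0, "inner dimension must divide evenly"
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        a_h = [row[h * step:(h + 1) * step] for row in a]  # [M, K / head_number]
        b_h = b[h * step:(h + 1) * step]                   # [K / head_number, N]
        for out_row, res_row in zip(out, matmul(a_h, b_h)):
            out_row.extend(res_row)
    return out


a = [[1] * 24 for _ in range(3)]   # Mat A: [3, 24]
b = [[1] * 4 for _ in range(24)]   # Mat B: [24, 4]
result = multi_head_matmul(a, b, head_number=4)
print(len(result), len(result[0]))  # rows and columns of the result
```

With all-ones inputs every output entry equals 6, the length of the inner dimension of one head.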

220eef60

Add python API for appending LoD level (#18702) · 075e1cf7

Committed by whs on Jul 24, 2019

* Make lod reset op support for append lod level.

* Fix API.spec
test=develop

* Fix unit test.
test=develop

* Add python api for lod append.
test=develop

* Fix API.spec
test=develop

* Fix format of doc.
test=develop

* Fix unit test.
test=develop

* Fix doc.
test=develop

075e1cf7

23 Jul, 2019 4 commits

[MKL-DNN] Extended LRN with reusing via Acquire API (#18675) · 95c1816e

Committed by Jacek Czaja on Jul 23, 2019

test=develop

- compilation fix

- Yet another compilation fix

- Even yet another compilation fix

- Surprise! Again compilation fix

- lint fixes

test=develop

- Fix to workspace acquire of LRN

test=develop

- Fix to hash of BWD LRN

test=develop

- fix to lrn BWD PD acquire

test=develop

- Fixing LRN PD creation

test=develop

- cosmetic fix in comment

test=develop

- Fixes after review

test=develop

95c1816e

Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664) · fd3aad6c

Committed by chengduo on Jul 23, 2019

* support sparse gradients
test=develop

fd3aad6c

Cudnn convolution reconstruction (#18284) · 6b78e00d

Committed by wangchaochaohu on Jul 23, 2019

* rewrite the conv_op using cudnn_conv_helper

* add workspace limit for v7 test=develop

* fix test=develop

* add half float test=develop

* fix test=develop

* fix test=develop

* revise code style test=develop

* fix test=develop

6b78e00d

supports distributed classification (#18690) · 157211c4

Committed by Yi Liu on Jul 23, 2019

* supports distributed classification training
* update API.spec
* fix evenly division in python3
* change "index_range" to "index_num" in shard_index operator
test=document_preview
test=develop
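
For readers unfamiliar with the `shard_index` operator mentioned above, the following is a rough sketch of how such an operator can map global class indices to per-shard local indices. The rule shown (ceiling-divided shard size, `ignore_value` for indices owned by other shards) is an assumption based on the operator's later public documentation and may differ from the behaviour at this commit; the function is a hypothetical stand-in, not Paddle code.

```python
def shard_index(indices, index_num, nshards, shard_id, ignore_value=-1):
    """Map global indices to local indices of one shard (illustrative only)."""
    # Integer division is explicit here: in Python 3, `/` would yield a float.
    shard_size = (index_num + nshards - 1) // nshards
    return [v % shard_size if v // shard_size == shard_id else ignore_value
            for v in indices]


# 20 classes split over 2 shards: shard 0 owns 0..9, shard 1 owns 10..19.
on_shard_0 = shard_index([16, 1], index_num=20, nshards=2, shard_id=0)
on_shard_1 = shard_index([16, 1], index_num=20, nshards=2, shard_id=1)
print(on_shard_0, on_shard_1)
```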

157211c4

22 Jul, 2019 4 commits

Fix CPU implementation of roi_align_op backward (#18728) · 3429e65a

Committed by qingqing01 on Jul 22, 2019

3429e65a

Revert "Add LeakyRelu MKLDNN support (#18656)" (#18723) · bd22453f

Committed by Tao Luo on Jul 22, 2019

test=develop

bd22453f

Make infer shape of pad2d support for input with negative dims in compile time. (#18695) · 189b08dc

Committed by whs on Jul 22, 2019

test=develop

189b08dc

add license, test=develop (#18709) · 7e3963f2

Committed by Bai Yifan on Jul 22, 2019

7e3963f2

20 Jul, 2019 2 commits

test=develop (#18701) · ccf06a48

Committed by cjt222 on Jul 20, 2019

add license

ccf06a48

fix clip_by_norm doc (#18688) · 185b3ace

Committed by wangguanzhong on Jul 20, 2019

* fix clip_by_norm doc, test=develop

185b3ace

19 Jul, 2019 2 commits

Support memory eager deletion on recurrent OP (#17710) · 89bc3fd8

Committed by Huihuang Zheng on Jul 19, 2019

Test PaddingRNN on V100 GPU device.

Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU.

GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR)
Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)
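
As a quick reading aid, the relative changes implied by the two measurements above can be worked out as follows. The numbers are taken from the commit message; the variable names are illustrative.

```python
mem_with, mem_without = 6414, 6837        # GPU memory in MiB
speed_with, speed_without = 10.28, 9.89   # training speed in steps/s

# Relative memory reduction and relative speed gain, both in percent.
mem_saving_pct = (mem_without - mem_with) / mem_without * 100
speedup_pct = (speed_with / speed_without - 1) * 100

print(f"memory saved: {mem_saving_pct:.1f}%")
print(f"speed gain:   {speedup_pct:.1f}%")
```

That is roughly a 6.2% reduction in GPU memory and a 3.9% increase in steps per second for this one configuration.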

89bc3fd8

Add LeakyRelu MKLDNN support (#18656) · d6b6a337

Committed by Adam on Jul 19, 2019

test=develop

d6b6a337

18 Jul, 2019 2 commits

hash_op support int64 hash_size (#18674) · bb2f5d24

Committed by hutuxian on Jul 18, 2019

* hash_op support int64 hash_size
* add corresponding UT

bb2f5d24

remove ctr reader, all functions are satisfied in dataset (#18672) · 5ed713d5

Committed by guru4elephant on Jul 18, 2019

* remove ctr reader, all functions are satisfied in dataset

5ed713d5

17 Jul, 2019 2 commits

Add cuda implementation for `prelu` backward pass (#18633) · ce1ec332

Committed by Yang Zhang on Jul 17, 2019

* Add GPU implementation for `prelu` backward pass

test=develop

* Fix logic error in `prelu` GPU backward and simplify a bit

test=develop

* Fix `prelu` backward CUDA implementation

test=develop

CPU version was not used actually, so test passed

ce1ec332

[CPU] Fix the compiling issue with AVX512F macro. (#18634) · 97549a4f

Committed by Yihua Xu on Jul 17, 2019

97549a4f

BaiXuePrincess / Paddle is in sync with its fork source project
