Commits · 22fa4c2d2440208870eb94de8694c05f2605cfe8 · BaiXuePrincess / Paddle

01 Aug, 2019 1 commit
- L
  Fix depthwise conv gpu kernel bug (#18582) · 22fa4c2d
  Committed by LielinJiang on Aug 01, 2019
```
* fix depthwise conv gpu kernel bug, test=develop
* add more depthwise conv test, test=develop
```
  22fa4c2d
31 Jul, 2019 9 commits

GPU allocation uses fraction of available memory (#18896) · ea6ee76f
Committed by Huihuang Zheng on Jul 31, 2019
```
GPU allocation uses fraction of available memory, also fix the GetUsed without lock
```
ea6ee76f
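
The two ideas in this commit message can be illustrated with a minimal sketch: sizing an allocation as a fraction of the memory currently available, and reading the used-bytes counter under a lock. The class and method names below are hypothetical and are not Paddle's allocator API.

```python
import threading


class GpuMemoryTracker:
    """Hypothetical tracker for illustration only; not Paddle's allocator."""

    def __init__(self, fraction):
        if not 0.0 < fraction <= 1.0:
            raise ValueError("fraction must be in (0, 1]")
        self._fraction = fraction
        self._used = 0
        self._lock = threading.Lock()

    def chunk_size(self, available_bytes):
        # Size the chunk from memory that is currently available,
        # rather than from the device's total memory.
        return int(available_bytes * self._fraction)

    def record_alloc(self, nbytes):
        # Writers update the counter while holding the lock.
        with self._lock:
            self._used += nbytes

    def get_used(self):
        # Readers take the same lock, so they never race with a writer.
        with self._lock:
            return self._used


tracker = GpuMemoryTracker(fraction=0.5)
size = tracker.chunk_size(available_bytes=6 * 1024**3)  # 6 GiB currently free
tracker.record_alloc(size)
print(size, tracker.get_used())
```

How the available amount is queried from the device is left out here; the sketch only assumes that a byte count is passed in.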

fix several security bugs reported by security team (#18831) · 0d996908

Committed by liuwei1031 on Jul 31, 2019

* fix security issue, test=develop

* bug fix, test=develop

* throw an exception when null pointer data with non-zero length PaddleBuf is passed, test=develop

0d996908

Trt fp16 support (#18860) · 61238d31

Committed by Zhaolong Xing on Jul 31, 2019

* Fix Mask rcnn predictor
    1. refine memory optim algorithm to support the model with the block op.
    2. output diff : modify the affine channel fuse
    3. add condition_block_infer op
add interface for setting trt calib table dir
test=develop

* add the missing files.
test=develop

* 1 add trt fp16 support
test=develop

61238d31

[DyGraph] Make multi-card program faster (#18892) · 20859c08
Committed by chengduo on Jul 31, 2019
```
* update parallel.py
test=develop
```
20859c08

Add center Loss Op Support (#18681) · 24f85431

Committed by HaoRen on Jul 31, 2019

* support center loss
* change tensor copy  api to high level api tensorcopy

* test=develop rewrite the center_loss cuda_kernel to make it faster
and add document of the center loss api, also update test function

* test=document_preview test=develop
update document of center loss

* test=document_preview test=develop
modify API.spec modify test code remove unused const_cast

24f85431

replace paper link (#18861) · d21c3914
Committed by lvmengsi on Jul 31, 2019
```
Update conv2d transpose link
```
d21c3914
use mkl to accelerate gelu_grad (#18099) · 86e494eb
Committed by Leo Zhao on Jul 31, 2019
```
test=develop
```
86e494eb
Optimize the error report information when loadcombine fail to open model... · dfd6a62a
Committed by wopeizl on Jul 31, 2019
```
Optimize the error report information when loadcombine fail to open model files test=develop (#18888)
```
dfd6a62a
upgrade ngraph version and simplify ngraph engine (#18853) · adcfc53b
Committed by baojun on Jul 30, 2019
```
* upgrade ngraph to v0.24 test=develop

* simplify io test=develop
```
adcfc53b

30 Jul, 2019 4 commits
- W
  Make lod_append support variable lod. (#18908) · 6cccab92
  Committed by whs on Jul 30, 2019
```
test=develop
```
  6cccab92
- J
  [MKL-DNN] Fix int8 performance regression (#18758) · cfcb96d2
  Committed by Jacek Czaja on Jul 30, 2019
```
test=develop

- optimization of TID to string

test=develop
```
  cfcb96d2
- D
  
  Add elementwise_pow_op backward implementation and the unit test codes of it. (#18848) · e0a2d4df
  Committed by danleifeng on Jul 30, 2019
  
  e0a2d4df
- L
  Revert "use static variable to do cache instead of thread local in thread... · 10eeed93
  Committed by Leo Zhao on Jul 30, 2019
```
Revert "use static variable to do cache instead of thread local in thread frequent switching case (#18428)" (#18879)

This reverts commit ce38bb53.

test=develop
```
  10eeed93
29 Jul, 2019 3 commits

Try to modify external gflags to solve CI compilation (#18872) · 0d3f16f5
Committed by Huihuang Zheng on Jul 29, 2019

0d3f16f5

Remove legacy C++ memory optimization codes (#18834) · 8008ab4e

Committed by Zeng Jinle on Jul 29, 2019

* remove legacy memory optimization codes, test=develop

* follow huihuang's comments,test=develop

* follow luotao's comments, test=develop

8008ab4e

add clear_model interface in fleetwrapper (#18815) · 52c1431e

Committed by Thunderbrook on Jul 29, 2019

* dump slot

* test

* proto

* dump slot

* test

* proto

* code style

* code style

* code style

* style

* add delete after unseen days

* add unseen days

* code style

* conflict solve
test=develop

* add clear model

* code style
test=develop

* code style
test=develop

52c1431e

28 Jul, 2019 2 commits
- Z
  
  fix affine_channel no_need buffer bug, test=develop (#18844) · 9a8a7a1d
  Committed by Zeng Jinle on Jul 28, 2019
  
  9a8a7a1d
- L
  Fix drop deconv (#18813) · 829ef262
  Committed by lvmengsi on Jul 28, 2019
```
* replace link

* update api.spec

* fix mistake
```
  829ef262
27 Jul, 2019 2 commits
- H
  Merge cuda 9/10 dockerfile with root dockerfile (#18693) · cfce4994
  Committed by Huihuang Zheng on Jul 27, 2019
```
Also fix a dependency error which may cause compile error
```
  cfce4994
- C
  Open fuse optimization ops (#18741) · 4140fe11
  Committed by chengduo on Jul 27, 2019
```
* open fuse optimization ops
test=develop
```
  4140fe11
26 Jul, 2019 5 commits
- A
  
  Add LeakyReLU MKLDNN support (#18762) · ee022279
  Committed by Adam on Jul 26, 2019
  
  ee022279
- L
  remove unused TransposeINT8Op for higher UT coverage (#18791) · b05bdda0
  Committed by lidanqing on Jul 26, 2019
```
test=develop
```
  b05bdda0
- Z
  Feature/mem opt pass refactor (#18735) · a802da65
  Committed by Zeng Jinle on Jul 26, 2019
```
* first version memory optimize pass, test=develop

* remove move_tensor_sharing_pass, test=develop

* refine code comments, add unittests, test=develop

* turn off memory_optimize by default, test=develop

* follow huihuang's comments, test=develop

* follow chengduoZH's comments, test=develop

* fix grammar error, add const qualifier, fix pass_test exception message, test=develop

* follow chengduoZH's comments 2nd, test=develop
```
  a802da65
- P
  
  fix mul_mkldnn_op build failure (#18816) · c5f47c21
  Committed by Physher on Jul 26, 2019
  
  c5f47c21
- P
  
  clarify MKLDNN INT8 Mul Op attributes (#18685) · a5c98630
  Committed by Physher on Jul 26, 2019
  
  a5c98630
25 Jul, 2019 7 commits
- F
  fix roi_align_op cpu backward's bug (#18789) · cff5e2c1
  Committed by FDInSky on Jul 25, 2019
```
* test=develop fix cpu roi_align_op backward bug
```
  cff5e2c1
- 石
  Fix examples of API (#18092) · 9dbb62ee
  Committed by 石晓伟 on Jul 25, 2019
```
* fix logical APIs

test=develop

test=document_preview

* fix isfinite

* update matmul comments

* update API.spec

test=document_preview

test=develop

* update API.spec

test=document_preview

test=develop

* update API.spec

test=document_preview

test=develop
```
  9dbb62ee
- C
  fix build strategy doc (#18725) · 292dfbce
  Committed by chengduo on Jul 25, 2019
```
test=develop
```
  292dfbce
- F
  Fix shrink-dense and add scale-datanorm (#18746) · c167a4b4
  Committed by fuyinno4 on Jul 25, 2019
```
Fix FleetWrapper:
1. fix shrink dense: just scale show
2. add datanorm scale: divide datanorm's gradient by batch_size
```
  c167a4b4
- B
  
  fix deformable_conv_op compile error, test=develop (#18793) · d3ac561d
  Committed by Bai Yifan on Jul 25, 2019
  
  d3ac561d
- L
  change ComputeINT8 to template version to remove checking dst_datatype code (#18756) · 9ecd8ee7
  Committed by lidanqing on Jul 25, 2019
```
* change INT8 to template so that checking dst_dt with if-else could be removed. CI will be enabled after fixing reviews

* reverse user_residual_memory_p and user_bias_memory_p declaration scope
test=develop
```
  9ecd8ee7
- J
  
  fix bug of swish op formula,test=develop (#18772) · d9e7b5b5
  Committed by JesseyXujin on Jul 25, 2019
  
  d9e7b5b5
24 Jul, 2019 5 commits

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Committed by Bob Zhu on Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multiple-head support, the multiplication of two big matrices is
split into the multiplication of several (head_number) small matrices. E.g., if
Mat A is [3, 24] and Mat B is [24, 4], when multiplying A and B with head_number
as 4, Mat A will be split into 4 matrices of [3, 6] and Mat B into 4 matrices of
[6, 4]. The final result will be 4 matrices of [3, 4], i.e. [3, 16].
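
The split described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the op's implementation: the function names are made up for the example, and A is assumed to be split by columns and B by rows, which is what the shapes in the message imply.

```python
# Pure-Python sketch of the multi-head split described in the commit message.
# Function names are illustrative and are not Paddle's API.

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def multihead_matmul(a, b, head_number):
    """Split A by columns and B by rows into head_number blocks, multiply
    block i of A with block i of B, and concatenate the per-head results
    side by side."""
    k = len(b)
    if k % head_number != 0 or any(len(row) != k for row in a):
        raise ValueError("inner dimensions must match and divide by head_number")
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        a_h = [row[h * step:(h + 1) * step] for row in a]  # [M, K / head_number]
        b_h = b[h * step:(h + 1) * step]                   # [K / head_number, N]
        for out_row, res_row in zip(out, matmul(a_h, b_h)):
            out_row.extend(res_row)                        # append [M, N] block
    return out


a = [[1] * 24 for _ in range(3)]   # Mat A: [3, 24]
b = [[1] * 4 for _ in range(24)]   # Mat B: [24, 4]
c = multihead_matmul(a, b, head_number=4)
print(len(c), len(c[0]))           # 3 16
```

With head_number set to 1 the function reduces to an ordinary matrix product, so the output width grows from N to head_number * N as heads are added.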

220eef60

Add python API for appending LoD level (#18702) · 075e1cf7

Committed by whs on Jul 24, 2019

* Make lod reset op support for append lod level.

* Fix API.spec
test=develop

* Fix unit test.
test=develop

* Add python api for lod append.
test=develop

* Fix API.spec
test=develop

* Fix format of doc.
test=develop

* Fix unit test.
test=develop

* Fix doc.
test=develop

075e1cf7

Modify auc doc. Add output variable description, previously was the scalar... · 25c9b57b

Committed by JesseyXujin on Jul 24, 2019

Modify auc doc. Add output variable description; the output was previously a scalar type and is now a tuple type. test=develop (#18771)

25c9b57b

Update trt5 for paddle-trt (#18645) · 26ae6d49

Committed by Zhaolong Xing on Jul 24, 2019

* update paddle-trt for:
    1. fix bug: when batch > 2, core in split plugin.
    2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.)
    3. add new attr to dropout.
    4. shuffle channel, swish, relu6 support
    test=develop

* 1. fix ci
test=develop

26ae6d49

add slot to sparse table (#18686) · d8396281

Committed by Thunderbrook on Jul 24, 2019

The change includes 2 things:

1. Saving the delta model and shrinking the table were previously controlled by the same parameter; delete_after_unseen_days is now added to control table shrinking.
2. Values in the sparse table previously had no slot; a slot is now added to the sparse table, and DownpureCtrAccessor is added to support the new meta.
test=develop
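
As a rough illustration of item 1, a shrink step driven by delete_after_unseen_days might look like the sketch below. The table layout, the field names `slot` and `unseen_days`, and the choice to keep entries whose counter equals the threshold are all assumptions made for the example; none of this is the pslib API.

```python
# Hypothetical sketch: the parameter name follows the commit message,
# everything else is invented for illustration.

def shrink_table(table, delete_after_unseen_days):
    """Return a copy of the table without entries unseen for too long.

    `table` maps a feature key to a dict holding a `slot` id and an
    `unseen_days` counter. Entries are dropped once the counter exceeds
    the threshold (whether the real check is > or >= is an assumption).
    """
    return {
        key: value
        for key, value in table.items()
        if value["unseen_days"] <= delete_after_unseen_days
    }


table = {
    101: {"slot": 1, "unseen_days": 0},
    102: {"slot": 1, "unseen_days": 12},
    103: {"slot": 2, "unseen_days": 30},
}
kept = shrink_table(table, delete_after_unseen_days=14)
print(sorted(kept))  # [101, 102]
```

Keeping the shrink threshold as its own argument mirrors the point of the change: shrinking no longer has to share a setting with saving the delta model.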

d8396281

23 Jul, 2019 2 commits

[MKL-DNN] Extended LRN with reusing via Acquire API (#18675) · 95c1816e

Committed by Jacek Czaja on Jul 23, 2019

test=develop

- compilation fix

- Yet another compilation fix

- Even yet another compilation fix

- Surprise! Again compilation fix

- lint fixes

test=develop

- Fix to workspace acquire of LRN

test=develop

- Fix to hash of BWD LRN

test=develop

- fix to lrn BWD PD acquire

test=develop

- Fixing LRN PD creation

test=develop

- cosmetic fix in comment

test=develop

- Fixes after review

test=develop

95c1816e

support patch data, add load_one_table, fix bug (#18509) · d18aabb4

Committed by jiaqi on Jul 23, 2019

(1) support patch data (merge slots of instances with the same line id, modify dense layer which
changes its size)
(2) add fleet load_one_table interface, supporting load from a paddle model and from a pslib model
(3) fix push sparse bug which caused push sparse to cost more time (about 10% in my test case)
(4) when some slots are not in one of your networks (join/update, etc.), data feed, collect label info, and push/pull sparse will skip these slots instead of throwing an error.
(5) add more debug info in TrainFilesWithProfiler
（5）add more debug info in TrainFilesWithProfiler

d18aabb4

BaiXuePrincess / Paddle is in sync with the fork source project
