Commits · a802da650bf2f7a68bd5c61ef77c1932378234d4 · 机器未来 / Paddle

26 Jul, 2019 1 commit

Feature/mem opt pass refactor (#18735) · a802da65

Committed by Zeng Jinle on Jul 26, 2019

* first version memory optimize pass, test=develop

* remove move_tensor_sharing_pass, test=develop

* refine code comments, add unittests, test=develop

* turn off memory_optimize by default, test=develop

* follow huihuang's comments, test=develop

* follow chengduoZH's comments, test=develop

* fix grammar error, add const qualifier, fix pass_test exception message, test=develop

* follow chengduoZH's comments 2nd, test=develop

25 Jul, 2019 4 commits

Fix examples of API (#18092) · 9dbb62ee

Committed by 石晓伟 on Jul 25, 2019

* fix logical APIs

test=develop

test=document_preview

* fix isfinite

* update matmul comments

* update API.spec

test=document_preview

test=develop

* update API.spec

test=document_preview

test=develop

* update API.spec

test=document_preview

test=develop

refine launch_ps and role_maker (#18795) · 30562e37
Committed by guru4elephant on Jul 25, 2019
```
refine launch_ps and role_maker
```

Fix shrink-dense and add scale-datanorm (#18746) · c167a4b4

Committed by fuyinno4 on Jul 25, 2019

Fix FleetWrapper:
1. fix shrink dense: just scale show
2. add datanorm scale: divide datanorm's gradient by batch_size
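
The second item can be pictured with a minimal sketch. This is not the FleetWrapper code: the function name `scale_datanorm_grad` and the flat-list gradient are assumptions made only for illustration.

```python
def scale_datanorm_grad(grad, batch_size):
    """Divide every element of a data-norm gradient by the batch size.

    Hypothetical helper: `grad` is a flat list of floats standing in
    for the real gradient tensor.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [g / batch_size for g in grad]


scaled = scale_datanorm_grad([8.0, -4.0, 2.0], batch_size=4)
print(scaled)  # [2.0, -1.0, 0.5]
```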

split test_dist_se_resnext.py into 4 testcases (#18743) · 2efb282c
Committed by guru4elephant on Jul 25, 2019
```
* split test_dist_se_resnext.py into 4 testcases
```

24 Jul, 2019 5 commits

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Committed by Bob Zhu on Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multi-head support, the multiplication of two big matrices is
split into multiplications of several (head_number) small matrices. E.g., if
Mat A is [3, 24] and Mat B is [24, 4], multiplying A and B with head_number
set to 4 splits Mat A into 4 matrices of [3, 6] and Mat B into 4 matrices of
[6, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].
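
The shape arithmetic in this message can be checked with a small pure-Python sketch. It is not the matmul op itself: the helper names and the column-wise concatenation order of the per-head results are assumptions made for illustration.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def multi_head_matmul(a, b, head_number):
    """Split A by columns and B by rows into head_number blocks,
    multiply matching blocks, and concatenate the results column-wise."""
    k = len(b)  # shared inner dimension
    if k % head_number != 0 or any(len(row) != k for row in a):
        raise ValueError("inner dimension must match and be divisible by head_number")
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        a_h = [row[h * step:(h + 1) * step] for row in a]  # [M, K / head_number]
        b_h = b[h * step:(h + 1) * step]                   # [K / head_number, N]
        for out_row, part in zip(out, matmul(a_h, b_h)):
            out_row.extend(part)
    return out


a = [[i + j for j in range(24)] for i in range(3)]  # Mat A: [3, 24]
b = [[i * j for j in range(4)] for i in range(24)]  # Mat B: [24, 4]
c = multi_head_matmul(a, b, head_number=4)
print(len(c), len(c[0]))  # 3 16
```

Summing the four [3, 4] blocks of `c` element-wise gives back the ordinary product of A and B, which is a convenient sanity check for the split.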


Add python API for appending LoD level (#18702) · 075e1cf7

Committed by whs on Jul 24, 2019

* Make lod reset op support for append lod level.

* Fix API.spec
test=develop

* Fix unittest.
test=develop

* Add python api for lod append.
test=develop

* Fix API.spec
test=develop

* Fix format of doc.
test=develop

* Fix unittest.
test=develop

* Fix doc.
test=develop

Enhance backward process (#18700) · 8259f141
Committed by chengduo on Jul 24, 2019
```
* prune backward ops
test=develop
```

Modify auc doc. Add output variable description, previously was the scalar... · 25c9b57b

Committed by JesseyXujin on Jul 24, 2019

Modify auc doc. Add output variable description; the output was previously a scalar type and is now a tuple type. test=develop (#18771)


add slot to sparse table (#18686) · d8396281

Committed by Thunderbrook on Jul 24, 2019

The change includes 2 things:

1. Saving the delta model and shrinking the table were previously controlled by the same parameter; now delete_after_unseen_days is added to control shrinking the table.
2. Values in the sparse table previously had no slot; now a slot is added to the sparse table, and DownpureCtrAccessor is added to support the new meta.
test=develop
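
A rough picture of the first item, with a dedicated threshold controlling which entries a shrink removes. Everything here except the parameter name `delete_after_unseen_days` is hypothetical (the dict layout, the `unseen_days` field, and the "greater than" comparison), and it is not the pslib implementation.

```python
def shrink_table(table, delete_after_unseen_days):
    """Return a copy of `table` without entries that have been unseen too long.

    `table` maps a feature key to a dict holding its 'slot' and the number of
    days since the key was last seen. Both fields are illustrative only.
    """
    return {
        key: value
        for key, value in table.items()
        if value["unseen_days"] <= delete_after_unseen_days
    }


table = {
    101: {"slot": 1, "unseen_days": 0},
    102: {"slot": 1, "unseen_days": 12},
    203: {"slot": 2, "unseen_days": 31},
}
kept = shrink_table(table, delete_after_unseen_days=30)
print(sorted(kept))  # [101, 102]
```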

23 Jul, 2019 3 commits

support patch data, add load_one_table, fix bug (#18509) · d18aabb4

Committed by jiaqi on Jul 23, 2019

(1) support patch data (merge slots of instances with the same line id, modify the dense layer whose size changes)
(2) add fleet load_one_table interface, supporting loading from a paddle model and from a pslib model
(3) fix a push sparse bug which caused push sparse to cost more time (about 10% in my test case)
(4) when some slots are not in one of your networks (join/update, etc.), data feed, collect label info, and push/pull sparse will skip these slots instead of throwing an error
(5) add more debug info in TrainFilesWithProfiler

Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664) · fd3aad6c
Committed by chengduo on Jul 23, 2019
```
* support sparse gradients
test=develop
```

supports distributed classification (#18690) · 157211c4

Committed by Yi Liu on Jul 23, 2019

* supports distributed classification training
* update API.spec
* fix evenly division in python3
* change "index_range" to "index_num" in shard_index operator
test=document_preview
test=develop
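
The division fix and the `index_num` attribute can be pictured with a small sketch of index sharding. It is a pure-Python illustration rather than the operator's code, and the `ignore_value` default and the exact formula are stated here as assumptions. The point it shows is that `//` keeps the shard size an integer under Python 3, where `/` would yield a float.

```python
def shard_index(indices, index_num, nshards, shard_id, ignore_value=-1):
    """Map global indices to shard-local ones, marking foreign indices.

    Illustrative sketch only; not the shard_index operator itself.
    """
    if not 0 <= shard_id < nshards:
        raise ValueError("shard_id must be in [0, nshards)")
    # Floor division: under Python 3, "/" would produce a float here.
    shard_size = (index_num + nshards - 1) // nshards
    return [
        x % shard_size if x // shard_size == shard_id else ignore_value
        for x in indices
    ]


print(shard_index([1, 6, 12, 19], index_num=20, nshards=2, shard_id=0))  # [1, 6, -1, -1]
print(shard_index([1, 6, 12, 19], index_num=20, nshards=2, shard_id=1))  # [-1, -1, 2, 9]
```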

22 Jul, 2019 5 commits

add more traceback to py_reader error msg, test=develop (#18722) · d07ad4c6
Committed by Zeng Jinle on Jul 22, 2019


Fix random test_recurrent_op failure (#18718) · a3028bb7

Committed by Huihuang Zheng on Jul 22, 2019

The change includes 3 things:

1. Set CPU_NUM to 1 in the tests, because otherwise the ParallelExecutor prints a warning that CPU_NUM is not set and uses the default of 1.

2. The old tests compared two RNNs, a hand-written simple RNN and the same RNN built by Paddle, but initialized the RNN weights separately with numpy random and Paddle random. Fixed it by setting the weight and bias values explicitly.

3. Also set the numpy random seed in the tests. The allowed difference between the two RNNs can now be smaller (rtol reduced from 0.1 and 0.2 to 0.01) in the tests.

test=develop
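
The testing idea in items 2 and 3 can be sketched without Paddle: two implementations of the same recurrence share explicitly set weights and a seeded input, so they can be compared under a tight relative tolerance. The scalar RNN and both function names below are illustrative assumptions, not the code of test_recurrent_op.

```python
import math
import random


def rnn_loop(inputs, w_x, w_h, bias):
    """Hand-written scalar RNN: h_t = tanh(w_x * x_t + w_h * h_{t-1} + bias)."""
    h = 0.0
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + bias)
    return h


def rnn_recursive(inputs, w_x, w_h, bias):
    """The same recurrence written recursively, standing in for the framework-built RNN."""
    if not inputs:
        return 0.0
    h_prev = rnn_recursive(inputs[:-1], w_x, w_h, bias)
    return math.tanh(w_x * inputs[-1] + w_h * h_prev + bias)


random.seed(123)  # fixed seed makes the inputs reproducible
inputs = [random.uniform(-1.0, 1.0) for _ in range(8)]
w_x, w_h, bias = 0.5, -0.3, 0.1  # explicit values shared by both versions

expected = rnn_loop(inputs, w_x, w_h, bias)
actual = rnn_recursive(inputs, w_x, w_h, bias)
assert math.isclose(actual, expected, rel_tol=0.01)
```

If each version drew its own random weights, the comparison would only pass with a loose tolerance or by luck, which is the kind of flakiness the commit describes.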

Revert "Add LeakyRelu MKLDNN support (#18656)" (#18723) · bd22453f
Committed by Tao Luo on Jul 22, 2019
```
test=develop
```
do some odd jobs (#18641) · d8458483
Committed by tangwei12 on Jul 22, 2019
```
do some odd jobs, test=develop
```
split different comm method for mnist distributed training (#18715) · ebf9797e
Committed by guru4elephant on Jul 22, 2019
```
* split different comm method for mnist distributed training
```

19 Jul, 2019 3 commits

Support memory eager deletion on recurrent OP (#17710) · 89bc3fd8

Committed by Huihuang Zheng on Jul 19, 2019

Test PaddingRNN on V100 GPU device.

Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU.

GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR)
Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)
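
For a quick sense of scale, the two measurements above can be turned into relative changes. The helper name `relative_change` is just an illustrative choice; the four input numbers are the ones reported in this message.

```python
def relative_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0


memory_saving = -relative_change(6414, 6837)  # MiB, lower is better
speedup = relative_change(10.28, 9.89)        # steps/s, higher is better
print(f"memory saved: {memory_saving:.2f}%")  # memory saved: 6.19%
print(f"speedup: {speedup:.2f}%")             # speedup: 3.94%
```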

Add LeakyRelu MKLDNN support (#18656) · d6b6a337
Committed by Adam on Jul 19, 2019
```
test=develop
```
add check of executor (#17986) · 0b9acb49
Committed by tangwei12 on Jul 19, 2019
```
* add check of executor, test=develop
```

18 Jul, 2019 3 commits

Feature/auto_growth_allocator (#18561) · ae58afc5

Committed by Zeng Jinle on Jul 18, 2019

* feature/auto_growth_allocator, test=develop

* add unittest of AlignedAllocator, test=develop

* try to turn on auto_growth to test on CI, test=develop

* fix segmentation fault in mixed_vector.h, test=develop

* add unittests, test=develop

hash_op support int64 hash_size (#18674) · bb2f5d24
Committed by hutuxian on Jul 18, 2019
```
* hash_op support int64 hash_size
* add corresponding UT
```
remove ctr reader, all functions are satisfied in dataset (#18672) · 5ed713d5
Committed by guru4elephant on Jul 18, 2019
```
* remove ctr reader, all functions are satisfied in dataset
```

15 Jul, 2019 2 commits
- make auc op compatible with 1 dim (#18551) · ab57d389
  Committed by guru4elephant on Jul 15, 2019
```
* make auc op compatible with 1 dim
```
- increase timeout again (#18628) · b71b4543
  Committed by guru4elephant on Jul 15, 2019
```
test=develop
```
12 Jul, 2019 3 commits
- fix #17430: training with int64-type attr behaves unexpectedly (#18264) · b414645a
  Committed by 123malin on Jul 12, 2019
```
* fix int64_t

* update fill constant op unittest

* add empty line
```
- Modify embedding_op input dtype to int64 (#18598) · 995d7d86
  Committed by Kevin on Jul 12, 2019
- 1) change to parallel mode on python coverage run (#18594) · 9ad57f2d
  Committed by kh2se2013 on Jul 12, 2019
```
2) add pip install coverage in Dockerfile.tmp
test=develop
```
11 Jul, 2019 2 commits

Polish backwards optimizer dependency codes and use more default values. (#18255) · c0a82748
Committed by gongweibao on Jul 11, 2019


Feature/buffer_shared_inplace (#17911) · d3003a16

Committed by Zeng Jinle on Jul 11, 2019

* feature/buffer_shared_inplace, test=develop

* refine code, test=develop

* fix elementwise_add op cpu inplace and sum inplace bug, test=develop

* add unittest and debug log, test=develop

* fix parallel_executor scope bug, polish code, test=develop

* fix sum op, activation op, single_in_place_inference bug, test=develop

* remove kLocalExecScopeName, test=develop

* fix unittest, test=develop

* fix out_var first version bug, test=develop

* follow comments, test=develop

10 Jul, 2019 3 commits

Add code example in CI (#18228) · 1c10dac4

Committed by tianshuo78520a on Jul 10, 2019

* test api example

* update python

* add sampcd_processor.py

* add if 0

* sort

* test paddle

* test paddle

* test paddle

* add whitelist

* change sampcd_processor.py

* change sampcd_processor.py

* change sampcd_processor.py

* add exit

* test=develop

* test=develop

update dygraph api doc for web (#18550) · b6d5c74f
Committed by lujun on Jul 10, 2019
```
remove dygraph.enable from __all__
hide dygraph.profiler
add doc to dygraph.no_grad
```
upgrade collective fleet api (#18533) · 9c17a899
Committed by guru4elephant on Jul 10, 2019
```
* upgrade collective fleet api
```

09 Jul, 2019 3 commits
- QAT int8 MKL-DNN transformation pass with MUL (#18322) · a25be53c
  Committed by bingyanghuang on Jul 09, 2019
- Add mkldnn int8 mul-op kernel (#17834) · 0caa08ea
  Committed by Physher on Jul 09, 2019
- Fix roi_perspective_transform_op bug (#18522) · 24d1c44a
  Committed by LielinJiang on Jul 09, 2019
```
* fix transform matrix bug, test=develop

* modify API.spec
```
08 Jul, 2019 1 commit
- add random port (#18504) · 1f1cc222
  Committed by guru4elephant on Jul 08, 2019
```
* add random port
```
05 Jul, 2019 2 commits

Fix topk cannot handle 1D vector bug (#18466) · 832d8191

Committed by zhaoyuchen2018 on Jul 05, 2019

* Fix topk cannot handle 1D vector bug

Add path to handle 1D vector

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>

* refine code

test=develop
Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
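
The idea of adding a path for 1-D input can be sketched in pure Python as follows. This is not the topk kernel: the function, its list-based inputs, and the choice to treat a 1-D vector as a single row are assumptions made for illustration.

```python
import heapq


def topk(data, k):
    """Return (values, indices) of the k largest entries along the last axis.

    Accepts either a 1-D list or a 2-D list of rows. A 1-D input is handled
    by treating it as a single row and unwrapping the result afterwards.
    """
    is_1d = not (data and isinstance(data[0], (list, tuple)))
    rows = [data] if is_1d else data
    values, indices = [], []
    for row in rows:
        if k > len(row):
            raise ValueError("k must not exceed the row length")
        best = heapq.nlargest(k, range(len(row)), key=row.__getitem__)
        indices.append(best)
        values.append([row[i] for i in best])
    return (values[0], indices[0]) if is_1d else (values, indices)


print(topk([3, 1, 4, 1, 5], 2))          # ([5, 4], [4, 2])
print(topk([[1, 9, 2], [7, 3, 8]], 1))   # ([[9], [8]], [[1], [2]])
```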


Hide no support (#18515) · 7586cdd5

Committed by Jiabin Yang on Jul 05, 2019

* test=develop, fix docker with paddle nccl problem

* test=develop, hide no_support api and add ut for it
