Commits · 20f18930ae463f4eba1f8c0b682fb7db5ddbce33 · 机器未来 / Paddle

12 Aug, 2019 · 2 commits

Add hard swish op (new op) (#19001) · 20f18930

Committed by huangjun12 on Aug 12, 2019

* add hard_swish activation op (new op)
test=develop

* remove redundancy files

* modify document content of HardSwish OP

* add API test in test_layers.py

* add dynamic_graph for test_hard_swish

20f18930
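
For readers unfamiliar with the activation added by this commit, here is a minimal plain-Python sketch of hard swish. The formula `x * min(max(x + offset, 0), threshold) / scale` and the defaults `threshold=6`, `scale=6`, `offset=3` are assumptions taken from the commonly published definition of hard swish, not from this commit's diff, and the function below is illustrative rather than the op's actual kernel.

```python
def hard_swish(x, threshold=6.0, scale=6.0, offset=3.0):
    """Hard swish for one float (assumed formula and defaults, see lead-in)."""
    # Clamp (x + offset) into [0, threshold], then rescale and gate x with it.
    return x * min(max(x + offset, 0.0), threshold) / scale


# Evaluate the sketch on a few sample points.
values = [-4.0, -1.0, 0.0, 1.0, 4.0]
outputs = [hard_swish(v) for v in values]
```

Under these assumptions, inputs at or below `-offset` map to zero and inputs at or above `threshold - offset` pass through unchanged.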

Polish fleet API to support cuda collective mode and nccl2 mode. (#18966) · 29d87812
Committed by gongweibao on Aug 12, 2019
```
Polish fleet API to support cuda collective mode and nccl2 mode
```
29d87812

11 Aug, 2019 · 1 commit

add save cache model api in fleet& add slots shuffle in dataset module & add... · 9150cf50

Committed by yaoxuefeng on Aug 11, 2019

add save cache model api in fleet& add slots shuffle in dataset module & add metric op to calculate ctr related metrics (#18871)

* add ctr related metric layer test=develop

* add save cache and slots shuffle test=develop

* add save cache and slots shuffle test=develop

* fix error

* fix error

* fix style for ci

* fix for comments

* change SlotsShuffle input to std::string for generality

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* fix style

* change non-const reference to pointer

* fix style

* fix style

* fix style test=develop

* fix style  test=develop

* add return ins num in ctr metric op

* change dtype to float in metric_op.py

* fix error test=develop

* fix style test=develop

* fix API spec

* fix API spec

* fix API spec test=develop

* add UT test=develop

9150cf50

10 Aug, 2019 · 2 commits

Try to deprecate unstable python memory optimize (#18983) · c194b0c8

Committed by Zeng Jinle on Aug 10, 2019

* deprecate python memory optimize, test=develop

* remove memory_optimize in unittests, test=develop

* add unittests to deprecated interfaces, test=develop

c194b0c8

Datafeed support reading to cuda place directly. (#19071) · 5a80cc84

Committed by hutuxian on Aug 10, 2019

* add a place field in DataFeed to denote which place it will feed data to.
* abstract the copy process in CopyToFeedTensor function
* add UT for float32 type and for CUDAPlace

5a80cc84

09 Aug, 2019 · 5 commits

prune the feed op in compiler (#18997) · 3f4c088a
Committed by chengduo on Aug 09, 2019
```
test=develop
```
3f4c088a

add eye op, kernel and unit test test=develop (#18980) · 4397cb31

Committed by ShenLiang on Aug 09, 2019

* add eye op,test=document_preview test=develop

* fix the API.spec, test=develop

* fix the document, test=document_preview test=develop

* add unit test for CI coverage, test=develop

4397cb31

Add trilinear_interp OP (#18711) · f86fead6

Committed by Kaipeng Deng on Aug 09, 2019

* add trilinear interp. test=develop

* fix unittest. test=develop

* add python api and test_layers. test=develop

* refine API.spec. test=develop

* fix format. test=develop

* add python API test. test=develop

* format code. test=develop

* refine code structure. test=develop

* fix format

* fix doc. test=develop

* fix coverage. test=develop

* fix format. test=develop

f86fead6

Enhance fuse optimization op pass (#19010) · 17d62ab2
Committed by chengduo on Aug 09, 2019
```
* Enhance fuse optimization op pass
test=develop
```
17d62ab2

Add call stack info during compile time (#19067) · 21440b4d

Committed by chengduo on Aug 09, 2019

* Add call stack info during runtime and compile time
test=develop

* Rename operator_call_stack
test=develop

* Add unit test
test=develop

* follow comment
test=develop

21440b4d

08 Aug, 2019 · 1 commit

Fix memory overwriting of tensors returned by executor (#19030) · 8f537354

Committed by Leo Chen on Aug 08, 2019

* fix memory overlapping of fetch var (return of executor.run), test=develop

* fix wrong usage of ParallelExecutor in op_test, test=develop

* remove useless parameter and simplify code

* avoid untimely tensor destruction, test=develop

* add testcase independent of OpTest, test=develop

8f537354

06 Aug, 2019 · 2 commits

Add var_conv_2d op (#18518) · e681d655

Committed by Kevin on Aug 06, 2019

* fix overflow by int32 mul test=develop

* fix reference nullptr

* fix codestyle test=develop

* modify to point in ContextProjectFunctor test=develop

* modify to point in ContextProjectFunctor test=develop

* modify . to -> test=develop

* add var_conv_2d op test=develop

* edit api.spec test=develop

* ignore unittest if with_mkl=off test=develop

* fix python3 division test=develop

* fix ignore unittest bug test=develop

* remove useless code test=develop

* modify api.spec test=develop

* modify default_grad.spec test=develop

e681d655

reduce_unittest_time,test=develop (#19005) · 311f90f1

Committed by Zeng Jinle on Aug 06, 2019

311f90f1

05 Aug, 2019 · 1 commit

support tensor input for ctc align op (#18887) · faf6890b

Committed by Liufang Sang on Aug 05, 2019

```
* test=develop support Tensor input for ctc_align_op

* test=develop add some comment
```
faf6890b

04 Aug, 2019 · 1 commit

make listen and server as exclusive run (#18990) · c97ea53c

Committed by Dong Daxiang on Aug 04, 2019

```
make listen and server as exclusive run
```
c97ea53c

02 Aug, 2019 · 2 commits

Open gc by default (#18836) · 7ac748ad

Committed by Zeng Jinle on Aug 02, 2019

* open gc by default, test=develop

* fix test_train_recognize_digits and disable gc when ngraph is enabled, test=develop

* fix conditional_block op eager deletion bug, test=develop

* add some comments to reviewers, test=develop

7ac748ad

Fusion: seqpool_cvm_concat (#18471) · ee2f296e

Committed by 石晓伟 on Aug 02, 2019

* add fusion_seqpool_cvm_concat test=develop

* simplify pass, test=develop

* fix code style, test=develop

ee2f296e

01 Aug, 2019 · 2 commits

Add the op of unique_with_counts, expand count function of the op unique (#18720) · 3ab1866c

Committed by wawltor on Aug 01, 2019

* test=develop
Add the unique_with_counts op, which computes the unique elements of the input and outputs the corresponding indices and counts.

* test=develop
Check the input and dtype in the op of unique_with_counts

* test=develop
test=document_preview
Update the API.spec for `unique_with_counts` and, at the same time, optimize the Python API of the `unique_with_counts` op.

* test=develop
test=document_preview
Fix some Python API problems in the `unique_with_counts` op, and change the error message in this op.

* Fix some API problem in the op of `unique_with_counts`
test=develop
test=document_preview

* test=develop
test=document_preview
Fix the api sample of op `unique_with_counts`, and update api.spec

3ab1866c
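
The behaviour described in this commit message (unique values plus the matching indices and counts) can be illustrated with a small plain-Python sketch. The function below is a hypothetical stand-in, not the op's implementation; in particular, keeping unique values in first-occurrence order is an assumption of this sketch.

```python
def unique_with_counts(values):
    """Return (unique, index, count) for a flat list.

    unique: distinct values in first-occurrence order (assumed ordering)
    index:  for each input element, its position in `unique`
    count:  how many times each unique value occurs
    """
    position = {}  # value -> slot in `unique`
    unique, index, count = [], [], []
    for v in values:
        if v not in position:
            position[v] = len(unique)
            unique.append(v)
            count.append(0)
        slot = position[v]
        index.append(slot)
        count[slot] += 1
    return unique, index, count


unique, index, count = unique_with_counts([2, 3, 3, 1, 5, 3])
# unique -> [2, 3, 1, 5]
# index  -> [0, 1, 1, 2, 3, 1]
# count  -> [1, 3, 1, 1]
```

The `index` output lets the original input be rebuilt as `[unique[i] for i in index]`.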

Fix depthwise conv gpu kernel bug (#18582) · 22fa4c2d
Committed by LielinJiang on Aug 01, 2019
```
* fix depthwise conv gpu kernel bug, test=develop
* add more depthwise conv test, test=develop
```
22fa4c2d

31 Jul, 2019 · 2 commits

Add center Loss Op Support (#18681) · 24f85431

Committed by HaoRen on Jul 31, 2019

* support center loss
* change tensor copy api to high level api tensorcopy

* test=develop rewrite the center_loss cuda_kernel to make it faster,
add documentation for the center loss api, and update the test function

* test=document_preview test=develop
update document of center loss

* test=document_preview test=develop
modify API.spec, modify test code, remove unused const_cast

24f85431

make dist unit test exclusive run (#18865) · 2bb296df
Committed by Dong Daxiang on Jul 31, 2019
```
make dist unit test exclusive run
```
2bb296df

30 Jul, 2019 · 2 commits

Add elementwise_pow_op backward implementation and its unit tests. (#18848) · e0a2d4df

Committed by danleifeng on Jul 30, 2019

e0a2d4df

add CPUInplaceTestWithFuseOptimizationOps (#18867) · ecd2bdad

Committed by chengduo on Jul 30, 2019

```
test=develop
```
ecd2bdad

29 Jul, 2019 · 1 commit

Remove legacy C++ memory optimization codes (#18834) · 8008ab4e

Committed by Zeng Jinle on Jul 29, 2019

* remove legacy memory optimization codes, test=develop

* follow huihuang's comments,test=develop

* follow luotao's comments, test=develop

8008ab4e

28 Jul, 2019 · 1 commit

fix affine_channel no_need buffer bug, test=develop (#18844) · 9a8a7a1d

Committed by Zeng Jinle on Jul 28, 2019

9a8a7a1d

27 Jul, 2019 · 1 commit

Open fuse optimization ops (#18741) · 4140fe11

Committed by chengduo on Jul 27, 2019

```
* open fuse optimization ops
test=develop
```
4140fe11

26 Jul, 2019 · 2 commits

Add LeakyReLU MKLDNN support (#18762) · ee022279

Committed by Adam on Jul 26, 2019

ee022279

Feature/mem opt pass refactor (#18735) · a802da65

Committed by Zeng Jinle on Jul 26, 2019

* first version memory optimize pass, test=develop

* remove move_tensor_sharing_pass, test=develop

* refine code comments, add unittests, test=develop

* turn off memory_optimize by default, test=develop

* follow huihuang's comments, test=develop

* follow chengduoZH's comments, test=develop

* fix grammar error, add const qualifier, fix pass_test exception message, test=develop

* follow chengduoZH's comments 2nd, test=develop

a802da65

25 Jul, 2019 · 1 commit

split test_dist_se_resnext.py into 4 testcases (#18743) · 2efb282c

Committed by guru4elephant on Jul 25, 2019

```
* split test_dist_se_resnext.py into 4 testcases
```
2efb282c

24 Jul, 2019 · 3 commits

Extend Matmul to support matrix multiplication with multiple heads (#18570) · 220eef60

Committed by Bob Zhu on Jul 24, 2019

* extend matmul op to support multiple head multiplication

With multiple-head support, the multiplication of two big matrices is
split into multiplications of several (head_number) small matrices. E.g. if
Mat A is [3, 24] and Mat B is [24, 4], then when multiplying A and B with
head_number as 4, Mat A is split into 4 matrices of [3, 6] and Mat B into 4
matrices of [6, 4]. The final result is 4 matrices of [3, 4], i.e. [3, 16].

220eef60
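
The shape arithmetic in this commit message can be checked with a short plain-Python sketch. It assumes that A is split along its columns and B along its rows into contiguous blocks, and that the per-head results are concatenated along the last axis; the commit message gives only the shapes, so this layout and the helper names are assumptions.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]


def multi_head_matmul(a, b, head_number):
    """Split the shared dimension into `head_number` blocks and multiply per head."""
    k = len(b)
    assert len(a[0]) == k and k % head_number == 0
    step = k // head_number
    out = [[] for _ in a]
    for h in range(head_number):
        lo, hi = h * step, (h + 1) * step
        a_h = [row[lo:hi] for row in a]  # [M, K / head_number]
        b_h = b[lo:hi]                   # [K / head_number, N]
        c_h = matmul(a_h, b_h)           # [M, N]
        for out_row, c_row in zip(out, c_h):
            out_row.extend(c_row)        # concatenate heads along the last axis
    return out


# Shapes from the commit message: A is [3, 24], B is [24, 4], head_number is 4.
a = [[float(i + j) for j in range(24)] for i in range(3)]
b = [[float(i * j % 5) for j in range(4)] for i in range(24)]
result = multi_head_matmul(a, b, head_number=4)
shape = (len(result), len(result[0]))  # (3, 16)
```

Under this layout, summing the four [3, 4] blocks of the result reproduces the ordinary product of A and B, which is a convenient sanity check.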

Add python API for appending LoD level (#18702) · 075e1cf7

Committed by whs on Jul 24, 2019

* Make lod reset op support for append lod level.

* Fix API.spec
test=develop

* Fix unit test.
test=develop

* Add python api for lod append.
test=develop

* Fix API.spec
test=develop

* Fix format of doc.
test=develop

* Fix unit test.
test=develop

* Fix doc.
test=develop

075e1cf7

Enhance backward process (#18700) · 8259f141
Committed by chengduo on Jul 24, 2019
```
* prune backward ops
test=develop
```
8259f141

23 Jul, 2019 · 2 commits

Make fuse_optimizer_op_pass also work when the model contains sparse gradients. (#18664) · fd3aad6c

Committed by chengduo on Jul 23, 2019

```
* support sparse gradients
test=develop
```
fd3aad6c

supports distributed classification (#18690) · 157211c4

Committed by Yi Liu on Jul 23, 2019

```
* supports distributed classification training
* update API.spec
* fix evenly division in python3
* change "index_range" to "index_num" in shard_index operator
test=document_preview
test=develop
```
157211c4

22 Jul, 2019 · 3 commits

Fix random test_recurrent_op failure (#18718) · a3028bb7

Committed by Huihuang Zheng on Jul 22, 2019

The change includes 3 things:

1. Set CPU_NUM to 1 in the tests, because otherwise the ParallelExecutor prints a warning that CPU_NUM is not set and uses the default of 1.

2. The old tests compare two RNNs, a hand-written simple RNN and the same RNN built by Paddle, but initialized the RNN weights separately with numpy random and Paddle random. Fixed by setting the weight and bias values explicitly.

3. Also set the numpy random seed in the tests. The difference between the two RNNs can now be smaller in the tests (rtol lowered from 0.1 and 0.2 to 0.01).

test=develop

a3028bb7

Revert "Add LeakyRelu MKLDNN support (#18656)" (#18723) · bd22453f
Committed by Tao Luo on Jul 22, 2019
```
test=develop
```
bd22453f
split different comm method for mnist distributed training (#18715) · ebf9797e
Committed by guru4elephant on Jul 22, 2019
```
* split different comm method for mnist distributed training
```
ebf9797e

19 Jul, 2019 · 3 commits

Support memory eager deletion on recurrent OP (#17710) · 89bc3fd8

Committed by Huihuang Zheng on Jul 19, 2019

Test PaddingRNN on V100 GPU device.

Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU.

GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR)
Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)

89bc3fd8
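
To put the figures reported in this commit message in relative terms, the arithmetic can be worked out as below. Only the four measured numbers come from the commit message; the variable names are illustrative.

```python
# Figures reported in the commit message (PaddingRNN, large model, one V100 GPU).
mem_with, mem_without = 6414, 6837        # GPU memory in MiB
speed_with, speed_without = 10.28, 9.89   # steps per second

mem_saved = mem_without - mem_with
mem_saved_pct = 100.0 * mem_saved / mem_without
speedup_pct = 100.0 * (speed_with / speed_without - 1.0)

print(f"memory saved: {mem_saved} MiB ({mem_saved_pct:.1f}%)")  # 423 MiB (6.2%)
print(f"speedup: {speedup_pct:.1f}%")                           # 3.9%
```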

Add LeakyRelu MKLDNN support (#18656) · d6b6a337
Committed by Adam on Jul 19, 2019
```
test=develop
```
d6b6a337
add check of executor (#17986) · 0b9acb49
Committed by tangwei12 on Jul 19, 2019
```
* add check of executor, test=develop
```
0b9acb49
