Commits · fa7ace7cf2859f927c26f1970bbc2f5551532df1 · 机器未来 / Paddle

10 Jan, 2020 · 2 commits

Cherry pick from #21862 (#22194) · fa7ace7c

Committed by Guo Sheng on Jan 10, 2020

* Fix default label dim of label_smooth_op. test=develop (#21862)

* Fix unit tests of label_smooth_op's data size.

fa7ace7c
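
For context on the label_smooth_op fix above, label smoothing mixes a one-hot label with a prior distribution. The sketch below is plain Python rather than Paddle's kernel; the function name `label_smooth` and the uniform default prior are illustrative assumptions, not code taken from the commit.

```python
def label_smooth(label, epsilon=0.1, prior=None):
    """Return (1 - epsilon) * label + epsilon * prior for one sample.

    `label` is a one-hot (or soft) label vector. When `prior` is omitted,
    a uniform distribution over the classes is assumed.
    """
    num_classes = len(label)
    if prior is None:
        prior = [1.0 / num_classes] * num_classes
    if len(prior) != num_classes:
        raise ValueError("prior must have one entry per class")
    return [(1.0 - epsilon) * y + epsilon * p for y, p in zip(label, prior)]


smoothed = label_smooth([0.0, 1.0, 0.0, 0.0], epsilon=0.1)
```

The prior must have one entry per class, which is why the dimension of the label and the prior has to agree.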

[cherry-pick] Add FC padding, ernie test unit and layernorm parallel (#22198) · 3df38f5c

Committed by GaoWei8 on Jan 10, 2020

* Optimize the kernel implementation of layernorm with openmp (#20895)

* Add ernie c++ inference test (#21015)

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* remove ngraph

* optimize gpu test
test=develop

* optimize codes
test=develop

* fix cmake fails on inference_download_and_uncompress (#21185)

* solve cmake fails on inference_download_and_uncompress
test=develop

* solve cmake fails on inference_download_and_uncompress
test=develop

* Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972)

* Add fc padding to solve mkl performance
test=develop

* fix gpu pass and error information
test=develop

* fix fc_fuse_pass_test
test=develop

* fix error information
test=develop

* fix error information
test=develop

* fix name and add fc op padding test
test=develop

* fix attributes
test=develop

* optimize fc padding
test=develop

* fix test
test=develop

* Polish the codes of fc when needs padding (#21378)

test=develop

* Add ernie large c++ inference test (#21365)

* add ernie-large test
test=develop

* add ernie large c++ inference test
test=develop

* Modify padding strategy: remove weight copy in fc padding (#21650)

test=develop

* optimize fc jit (#21878)

test=develop
Co-authored-by: Yihua Xu <yihuaxu@hotmail.com>

3df38f5c

09 Jan, 2020 · 2 commits
- fix softmax_with_cross_entropy_fix bug, test=develop (#21810) (#22183) · bc385a29
  Committed by Chen Weihang on Jan 09, 2020
  
  bc385a29
- [Cherry-pick 1.6] fix batch_norm_grad shape=0 & allreduce shape enforce &... · 515b206d
  Committed by WangXi on Jan 09, 2020
```
[Cherry-pick 1.6] fix batch_norm_grad shape=0 & allreduce shape enforce & sync_batch_norm hang in fleet (#22157)
```
  515b206d
08 Jan, 2020 · 1 commit
- [cherry-pick] Fix softmax cuda bug (#21720) (#22160) · b9a1d954
  Committed by zhaoyuchen2018 on Jan 08, 2020
```
* Fix softmax cuda bug

* Refine multihead log and softmax logic

* Align block to 32
```
  b9a1d954
07 Jan, 2020 · 2 commits

Fix optimizer op infershape failed in dygraph multi-cards mode (#21374) (#22112) · 34ef38c8

Committed by Chen Weihang on Jan 07, 2020

* add param & grad shape check for sgd op

* add _reshape_inplece interface for dygraph parallel

* refine unittest based on paddle/models scripts, test=develop

* add unittest for parallel grad fuse, test=develop

34ef38c8

Fix the global_step & continuous applying error in EMA (#22090) (#22130) · 9b64d636
Committed by Yibing Liu on Jan 07, 2020
```
* Fix the global_step & continuous applying error in EMA

* Fix for step 0 & add unit test

test=release/1.6
```
9b64d636

06 Dec, 2019 · 2 commits
- cherry-pick MKL-DNN NHWC FWD support fix (#21593) · 1f598dfa
  Committed by bingyanghuang on Dec 06, 2019
  
  1f598dfa
- cherry-pick pyramid_hash op test=develop (#20779)(#18525) (#21562) · f83254d6
  Committed by Aurelius84 on Dec 06, 2019
  
  f83254d6
05 Dec, 2019 · 1 commit

[Cherry-pick] fix the computation for dx (grad for x) for prelu operation. (#20949) (#21514) · 40549473

Committed by lilong12 on Dec 05, 2019

* fix the computation for dx (grad for x) for prelu operation. (#20949)

* set the default value of alpha for prelu to 0.25, test=develop

* add the call to __syncthreads(), test=develop

* fix the implementation of cpu prelu, test=develop

* repair the implementation of element mode prelu, test=develop

* modify test_prelu_op.py, test=develop

40549473
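
As a reference for the prelu dx fix above, the standard PReLU forward and backward rules can be sketched in plain Python. This is a minimal sketch of the textbook formulas with one shared alpha, not the CPU or CUDA kernel from the commit; the function names are hypothetical, and the default of 0.25 follows the commit message.

```python
def prelu_forward(x, alpha=0.25):
    """y = x where x > 0, otherwise alpha * x (single shared alpha)."""
    return [v if v > 0 else alpha * v for v in x]


def prelu_backward(x, dout, alpha=0.25):
    """Return (dx, dalpha) for upstream gradient dout.

    dx     = dout          where x > 0
           = alpha * dout  otherwise
    dalpha = sum of dout * x over the elements where x <= 0
    """
    dx = [g if v > 0 else alpha * g for v, g in zip(x, dout)]
    dalpha = sum(g * v for v, g in zip(x, dout) if v <= 0)
    return dx, dalpha


dx, dalpha = prelu_backward([-2.0, 3.0], [1.0, 1.0])
```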

04 Dec, 2019 · 3 commits

Refactor fetch handler (#21264) (#21537) · 87a8caa8

Committed by tangwei12 on Dec 04, 2019

* fix fetch handler problem and refactor
When users define a FetchHandler class, they should initialize the handler
with a variable dict. Each key of the variable dict is a user-defined name,
and each value is a Variable generated from the Python API.

For each fetch, users should implement the handler function, in which
fetched_result_dict is available, and they can access the fetched values
with the user-defined keys.
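
The usage pattern described in this message can be sketched as follows. `FetchHandlerSketch`, `LossRecorder`, `simulate_fetch` and the variable name `"mean_0.tmp_0"` are hypothetical stand-ins written for illustration; they are not Paddle's classes, and the real base class may take additional arguments.

```python
class FetchHandlerSketch:
    """Hypothetical stand-in for the FetchHandler base class."""

    def __init__(self, var_dict):
        # Keys are user-defined names; values stand in for Variables
        # created through the Python API.
        self.var_dict = dict(var_dict)

    def handler(self, fetched_result_dict):
        raise NotImplementedError("subclasses implement handler()")


class LossRecorder(FetchHandlerSketch):
    """User-defined handler that records the value fetched under 'loss'."""

    def __init__(self, var_dict):
        super().__init__(var_dict)
        self.seen = []

    def handler(self, fetched_result_dict):
        # Fetched values are looked up with the same user-defined keys.
        self.seen.append(fetched_result_dict["loss"])


def simulate_fetch(handler, fetch_fn):
    """Simulate one fetch: resolve every variable, then call the handler."""
    results = {name: fetch_fn(var) for name, var in handler.var_dict.items()}
    handler.handler(results)
    return results


handler = LossRecorder({"loss": "mean_0.tmp_0"})
simulate_fetch(handler, lambda var: 0.5)  # pretend every variable evaluates to 0.5
```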

87a8caa8

Fix dgc clip & rampup step, test=release/1.6 (#21519) · 3f1169fe
Committed by WangXi on Dec 04, 2019

3f1169fe
[cherry pick] Conv2d and Conv2d transpose MKL-DNN NHWC support (#21525) · 0e63746b
Committed by bingyanghuang on Dec 04, 2019

0e63746b

03 Dec, 2019 · 7 commits
- set dim[0] to -1 if dim[0] < 0 during compiling for c_allgather op (#21402) (#21512) · df2b4002
  Committed by lilong12 on Dec 03, 2019
```
* set dim[0] to -1 if dim[0] < 0 and remove assertion to runtime, test=develop
```
  df2b4002
- [cherry-pick] Improve argsort performance. (#21267) (#21442) · 66c18f4a
  Committed by zhaoyuchen2018 on Dec 03, 2019
```
* Improve argsort performance.

- Sorting 200000 elements with argsort on a V100
is ~190x faster:
before optimization: 0.53s
after optimization: 0.0027s

- Add fp16 support

* Refine error message
* Refine code
* Add descending sort

test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
```
  66c18f4a
- [cherry-pick] add Adam beta1/beta2 support Variable (#21433) · 735a2db0
  Committed by Kaipeng Deng on Dec 03, 2019
```
* add Adam beta1/beta2 support Variable. test=develop
```
  735a2db0
- [cherry-pick] Add Asypadding for conv fusion. (#21041) (#21439) · 2660107c
  Committed by zhaoyuchen2018 on Dec 03, 2019
```
* Add Asypadding for conv fusion.

test=develop

reference: pr/20042

* Fix eigen build link error

* Change back file mode

* Use math function & add more checks.
```
  2660107c
- add the framework support for distfc (#21197) (#21463) · e06f4439
  Committed by lilong12 on Dec 03, 2019
```
* add the framework support for distfc and ut, test=develop
* fix the implementation of shard_index_op, test=develop
```
  e06f4439
- [cherry-pick] add bn momentum variable (#21435) · 9c63b7c1
  Committed by Kaipeng Deng on Dec 03, 2019
```
* batch_norm momentum support variable. test=develop
```
  9c63b7c1
- cherry-pick LRN and Pool2d (FWD) NHWC support (#21476) · ccb508dc
  Committed by bingyanghuang on Dec 03, 2019
  
  ccb508dc
29 Nov, 2019 · 1 commit
- Fix dgc accuracy by mv regularization to local, test=release/1.6 (#21390) · 6ce49eea
  Committed by WangXi on Nov 29, 2019
  
  6ce49eea
28 Nov, 2019 · 1 commit

cherry-pick1.6 fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21339) · 072eb5b6

Committed by xujiaqi01 on Nov 28, 2019

* fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21052)

* fix cache table bug
* add save_paddle_inference_model
* fix hdfs util bug
* test=develop

* fix several sparse table issues (#20686)

* no longer need to define the embedding layers of every slot (without omission) in each program. make trainer_param repeated in ps.proto.
* add find_distributed_lookup_table_grads instead of hard-coding GRAD
* support embedding stop gradient; push sparse raised an error before this fix.
* fix fill sparse: skip slots which do not have an embedding. before this fix, each slot's embedding in a sparse table had to be used in all training programs.
* fix pull sparse: skip slots which do not have an embedding.
* fix collect feasign label info: skip slots which do not have an embedding.
* support multiple sparse tables in one or more training programs: each program can pull/push its own related sparse tables instead of all sparse tables.
* test=develop

* add copy table (#21086)

* copy some feasigns and corresponding embeddings from one sparse table to another
* copy all feasigns and corresponding embeddings from one sparse table to another
* copy all dense params from one table to another
* copy some local vars to other local vars

* fix fs_client_param bug (#21212)

* fix fs_client_param bug, user can set this config through fleet_desc_file or fleet config
* test=develop

* fix fleet util bug (#21254)

* fix fleet util bug in save paddle inference model
* test=develop

072eb5b6

26 Nov, 2019 · 4 commits
- [Cherry pick] instance_norm, gradients and batch_norm (#21301) · 97bbab47
  Committed by Lv Mengsi on Nov 26, 2019
```
* Fix gradients (#20857)

* fix_gradients

* fix_gradients, test=develop

* fix instance norm (#21042)

* fix instance norm

* update unittest,test=develop

* fix_bn

* revert unittest,test=develop
```
  97bbab47
- [cherry-pick] Refactor mkldnn elementwise_mul and error message for NHWC in mkldnn (#21361) · 03dda317
  Committed by bingyanghuang on Nov 26, 2019
  
  03dda317
- [Cherry-pick 1.6] Fix dgc buffer illegal & reuse velocity & fix fuse (#21281) · 93c7f058
  Committed by WangXi on Nov 26, 2019
  
  93c7f058
- Fix INF bug of softmax_cross_entropy_op, test=release/1.6 (#21283) · 3423f0b6
  Committed by WangXi on Nov 26, 2019
  
  3423f0b6
25 Nov, 2019 · 2 commits

cherry-pick error info check of Print_op for release1.6 (#21349) · 9a98d11e

Committed by lijianshe02 on Nov 25, 2019

* add input type and input data type check for Print_op test=develop (#21250)

* add input type and input data type check for Print_op test=develop

* cherry-pick error info check of Print_op for release1.6 test=develop

* cherry-pick error info check of Print_op for release1.6 test=develop

9a98d11e

[cherry-pick] fix crop_tensor, maxout and lrn (#21302) · 3848f720

Committed by Zhang Ting on Nov 25, 2019

* [cherry-pick] All elements in attr(shape) of crop_tensor can be -1 and int32/64 kernel registered (#20756)

* All elements in attr(shape) of crop_tensor can be -1, test=develop, test=document_preview

* fix the bug that attr(offsets) should be initialized, test=develop

* [cherry-pick] maxout supports channel_last input (#20846)

* maxout support channel_last input, test=develop

* modified details of Input(X) and Attr(groups, axis) in doc, test=develop

* [cherry-pick] lrn supports channel_last input, test=develop (#20954)

3848f720

23 Nov, 2019 · 1 commit
- [cherry-pick] fix elementwise mod (#21315) · 5e35e5ea
  Committed by Kaipeng Deng on Nov 23, 2019
```
* fix elementwise_mod FP kernel. test=develop

* fix unittest. test=develop
```
  5e35e5ea
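
The commit message does not say what was wrong with the floating-point kernel. One common pitfall with floating-point modulo is the sign of the result: C-style `fmod` takes the sign of the dividend, while Python's `%` takes the sign of the divisor. The sketch below shows that sign adjustment in plain Python, assuming the intended behavior is to match Python's `%`; the function name is hypothetical and this is not the kernel code.

```python
import math


def elementwise_mod_float(a, b):
    """Floating-point modulo whose result takes the sign of the divisor b."""
    res = math.fmod(a, b)  # C-style remainder, sign follows the dividend a
    if res != 0 and ((b < 0) != (res < 0)):
        res += b  # shift into the divisor's sign range
    return res


values = [elementwise_mod_float(a, b) for a, b in [(-7.5, 2.0), (7.5, -2.0)]]
```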
21 Nov, 2019 · 1 commit

[cherry-pick]fix bug in pool/conv/conv_transpose: UpdatePaddingAndDilation,... · 7ab85396

Committed by liym27 on Nov 21, 2019

[cherry-pick]fix bug in pool/conv/conv_transpose: UpdatePaddingAndDilation, _get_padding_with_SAME and conv2dtranspose_forward_naive. (#20997) (#21225)

* fix bug in pool/conv/conv_transpose:
    1. It should be stride[i] not stride[0] in UpdatePaddingAndDilation;
    2. fix bug of func  _get_padding_with_SAME in test_conv/conv_transpose_op.py;
    3. fix bug of the computation process in function conv2dtranspose_forward_naive.
    test=release/1.6

7ab85396
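
Item 1 of the commit above concerns using the stride of each dimension, not always `stride[0]`, when deriving padding. A minimal sketch of the usual SAME-padding rule in plain Python illustrates why this matters; `same_padding` is a hypothetical helper, not the `UpdatePaddingAndDilation` function itself, and dilation is assumed to be 1.

```python
def same_padding(input_size, kernel_size, strides):
    """Return (pad_before, pad_after) per spatial dimension for SAME padding."""
    paddings = []
    for i in range(len(input_size)):
        # Output size is ceil(input / stride), using the stride of dimension i.
        out_size = (input_size[i] + strides[i] - 1) // strides[i]
        pad_sum = max((out_size - 1) * strides[i] + kernel_size[i] - input_size[i], 0)
        pad_before = pad_sum // 2
        paddings.append((pad_before, pad_sum - pad_before))
    return paddings


pads = same_padding([7, 8], [3, 3], [1, 2])
```

With strides `[1, 2]`, the second dimension gets padding `(0, 1)`; computing it with `strides[0]` instead would give `(1, 1)`.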

14 Nov, 2019 · 1 commit
- fix error message in expand API, and fix two error unit-tests (#21180) · cdb81264
  Committed by Tao Luo on Nov 14, 2019
```
test=release/1.6
```
  cdb81264
11 Nov, 2019 · 1 commit
- Disable cudnn_conv in Parallel Executor unit tests. (#21083) · e7d5e0ea
  Committed by Huihuang Zheng on Nov 11, 2019
```
TODO: fix cudnn_conv and re-enable it

test=develop
test=release/1.6
```
  e7d5e0ea
07 Nov, 2019 · 2 commits

[cherry-pick] Add support for asymmetric padding in MKLDNN pool, conv and conv_transpose (#21072) · e8890031

Committed by Adam on Nov 07, 2019

* Add asymmetric padding support for mkldnn pooling
test=develop

* Add asymmetric padding support for mkldnn conv
test=develop

* Add asymmetric padding support for mkldnn conv_transpose
test=develop

e8890031

fix uniform random (#21009) (#21057) · e112ea2b
Committed by hong on Nov 07, 2019
```
* fix uniform random; test=develop

* add uniform random test; test=develop
```
e112ea2b

01 Nov, 2019 · 4 commits

Cherry pick bug fix for Ops: reshape,concat, split and squeeze (#20929) · 33d7aae1

Committed by liym27 on Nov 01, 2019

* [cherry-pick]fix bug in reshape: (#20781)

handle the case where the shape of the input contains more than one -1.

* [cherry-pick]support Tensor for split and concat, support -1 in num_or_sections, add check num_or_sections (#20780)

* improve split and concat op:
1. support Tensor for argument 'dim' in split op.
2. support Tensor for argument 'axis' in concat op.
* redefine function GetDataFromTensor and set unknown output shape to -1.
* add check: Attr(sections) match Input(X).
* support Tensor for attr(sections) and attr(sections) can contain -1.
* modify error message and fix bug for concat and call Resize only when necessary.
test=release/1.6

* [cherry-pick]improve unsqueeze op to support int, Tensor for argument axes (#20824)

* improve unsqueeze op to support int, Tensor and Tensor list for argument axes.
* call Resize only when necessary. test=release/1.6

* [cherry-pick]Compatible int32 and int64 for attr in concat/split/unsqueeze. test=release/1.6 (#20912)

33d7aae1
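
The split-related items in this commit mention that sections may contain -1 and that sections are checked against the input. A minimal sketch of that kind of resolution in plain Python follows; `resolve_sections` is a hypothetical helper, and allowing at most one -1 entry is an assumption made here, not something the commit message states.

```python
def resolve_sections(dim_size, sections):
    """Replace a single -1 in `sections` so the entries sum to `dim_size`."""
    unknown = [i for i, s in enumerate(sections) if s == -1]
    if len(unknown) > 1:
        raise ValueError("at most one section may be -1")
    known_sum = sum(s for s in sections if s != -1)
    resolved = list(sections)
    if unknown:
        if known_sum > dim_size:
            raise ValueError("known sections exceed the dimension size")
        # The unknown section takes whatever is left of the dimension.
        resolved[unknown[0]] = dim_size - known_sum
    elif known_sum != dim_size:
        raise ValueError("sections must sum to the dimension size")
    return resolved


sections = resolve_sections(10, [2, -1, 3])
```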

cherry-pick1.6 simplify master+patch, remove ins when size != merge_size or has... · 3db61dc0

Committed by xujiaqi01 on Nov 01, 2019

cherry-pick1.6 simplify master+patch, remove ins when size != merge_size or has conflict slot (#20941)

* simplify master+patch, remove ins when size != merge_size or has conflict slot
* test=develop

3db61dc0

Optimize decay (#20816) (#20952) · 781d2844
Committed by 123malin on Nov 01, 2019
```
* update pserver decay blocks

* update distributed notify handler
```
781d2844
[Cherry-pick]Cherry pick paddle cloud role maker (#20947) · 0b429a22
Committed by Chengmo on Nov 01, 2019
```
* Fix Paddle Cloud role maker (#20860)
```
0b429a22

31 Oct, 2019 · 1 commit
- fix repeated_fc_fuse_pass and jit::matmul bug test=develop test=release/1.6 (#20948) · ad867398
  Committed by Wilber on Oct 31, 2019
```
- fix jit::matmul bug 

input x, shape(m, k), weight, shape(k, n) 
```
  ad867398
30 Oct, 2019 · 1 commit

Cherry pick save load new feature (#20877) · 5119f262

Committed by hong on Oct 30, 2019

* Serialize to pickle format (#20820)

test=develop

* save load problem fix and new feature add (#20823)

* fix persistable;

* fix save load bugs; test=develop

* fix bug; test=develop

* add example for new io api; test=develop

* add example; test=develop

5119f262
