Commits · 3df38f5cdd0866c1e78f1c2674d3d6cf3166d35f · 机器未来 / Paddle

10 Jan, 2020 · 2 commits

[cherry-pick] Add FC padding, ernie test unit and layernorm parallel (#22198) · 3df38f5c

Committed by GaoWei8 on Jan 10, 2020

* Optimize the kernel implementation of layernorm with openmp (#20895)

* Add ernie c++ inference test (#21015)

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* Add ernie unit test
test=develop

* remove ngraph

* optimize gpu test
test=develop

* optimize codes
test=develop

* fix cmake fails on inference_download_and_uncompress (#21185)

* solve cmake fails on inference_download_and_uncompress
test=develop

* solve cmake fails on inference_download_and_uncompress
test=develop

* Add fc padding to improve mkl GEMM's performance when N and K are multiple of 128. (#20972)

* Add fc padding to solve mkl performance
test=develop

* fix gpu pass and error information
test=develop

* fix fc_fuse_pass_test
test=develop

* fix error information
test=develop

* fix error information
test=develop

* fix name and add fc op padding test
test=develop

* fix attributes
test=develop

* optimize fc padding
test=develop

* fix test
test=develop

* Polish the codes of fc when needs padding (#21378)

test=develop

* Add ernie large c++ inference test (#21365)

* add ernie-large test
test=develop

* add ernie large c++ inference test
test=develop

* Modify padding strategy: remove weight copy in fc padding (#21650)

test=develop

* optimize fc jit (#21878)

test=develop
Co-authored-by: Yihua Xu <yihuaxu@hotmail.com>
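
Note on the FC padding item (#20972) in the message above: its title says padding is added when N and K are multiples of 128. The following is a minimal Python sketch of that condition only. The pad width of 4, the function names, and the zero-fill layout are assumptions made for illustration and are not taken from the Paddle source. The sketch also does not reflect the later change in #21650, which removed the weight copy.

```python
ALIGN = 128  # from the commit title: padding applies when N and K are multiples of 128
PAD = 4      # assumed pad width, chosen for illustration only


def needs_padding(k, n, align=ALIGN):
    """Return True when both weight dimensions are multiples of `align`."""
    return k % align == 0 and n % align == 0


def padded_shape(k, n, align=ALIGN, pad=PAD):
    """Return (rows, cols) of the weight buffer after optional padding."""
    if needs_padding(k, n, align):
        return k + pad, n + pad
    return k, n


def pad_weights(weights, align=ALIGN, pad=PAD):
    """Zero-pad a K x N weight matrix (a list of rows) when padding applies."""
    k, n = len(weights), len(weights[0])
    rows, cols = padded_shape(k, n, align, pad)
    padded = [row + [0.0] * (cols - n) for row in weights]
    padded += [[0.0] * cols for _ in range(rows - k)]
    return padded


shape_aligned = padded_shape(128, 256)    # (132, 260): both dims are multiples of 128
shape_unaligned = padded_shape(100, 256)  # (100, 256): left unchanged
```

Because the extra rows and columns are zeros, a GEMM over the padded buffer yields the same values in the original output region; only the leading dimensions change.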

fix multi-thread error of fc_gru_fuse_pass.cc, test=develop (#21841) (#22185) · e8e12499

Committed by 石晓伟 on Jan 10, 2020

* fix multi-thread error of fc_gru_fuse_pass.cc, test=develop

* export FLAGS and GLOG symbols, test=develop

09 Jan, 2020 · 1 commit

[Cherry-pick 1.6] fix batch_norm_grad shape=0 & allreduce shape enforce &... · 515b206d

Committed by WangXi on Jan 09, 2020

[Cherry-pick 1.6] fix batch_norm_grad shape=0 & allreduce shape enforce & sync_batch_norm hang in fleet (#22157)

08 Jan, 2020 · 1 commit

Fix multi-threads memory out of bounds error for passes (#21920) (#22132) · 835201bf

Committed by liu zhengxi on Jan 08, 2020

* fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop

* fix attention_lstm_fuse_pass during multi-threads inference, test=develop

* fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop

* fix fc_lstm_fuse_pass during multi-threads inference, test=develop

* fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop

07 Jan, 2020 · 1 commit

fix trt calib not working bug, test=develop (#21934) (#22110) · 5a611afd

Committed by Pei Yang on Jan 07, 2020

09 Dec, 2019 · 1 commit

Revert "CHERRY_PICK: TRT int8: refine trt int8 for dynamic range set (#21112) (#21449)" (#21619) · f7c629d9

Committed by Zhaolong Xing on Dec 09, 2019

This reverts commit 0473cdb8.

04 Dec, 2019 · 5 commits

make config option DisableGlogInfo() able to mute all inference logs (#21544) · 857cd9f8

Committed by Pei Yang on Dec 04, 2019

make config option DisableGlogInfo() able to mute all inference logs

Refactor fetch handler (#21264) (#21537) · 87a8caa8

Committed by tangwei12 on Dec 04, 2019

* fix fetch handler problem and refactor
When a user defines a FetchHandler class, they should initialize the handler
with a variable dict. Each key of the variable dict is a user-defined name,
and each value is a Variable generated from the Python API.

For each fetch, the user should implement the handler function, in which
fetched_result_dict is available, so the fetched values can be accessed
by the user-defined keys.
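
The pattern described in this message can be sketched in plain Python as follows. This is a stand-in written for illustration, not Paddle's FetchHandler class: `FetchHandlerSketch`, `RecordLoss`, `run_fetch`, and the variable name `mean_0.tmp_0` are all hypothetical, and the real class may take other constructor arguments.

```python
class FetchHandlerSketch:
    """Hypothetical stand-in for the handler described above (not Paddle's class)."""

    def __init__(self, var_dict):
        # Keys are user-defined names; values stand in for Variables.
        self.var_dict = var_dict

    def handler(self, fetched_result_dict):
        # Subclasses override this; results arrive keyed by the same user-defined names.
        raise NotImplementedError


class RecordLoss(FetchHandlerSketch):
    """Example subclass that stores the value fetched under the key "loss"."""

    def __init__(self, var_dict):
        super().__init__(var_dict)
        self.seen = []

    def handler(self, fetched_result_dict):
        self.seen.append(fetched_result_dict["loss"])


def run_fetch(handler, fetch_fn):
    """Hypothetical driver: fetch every variable, then call the handler once."""
    results = {name: fetch_fn(var) for name, var in handler.var_dict.items()}
    handler.handler(results)
    return results


recorder = RecordLoss({"loss": "mean_0.tmp_0"})
run_fetch(recorder, lambda var: 0.25)  # the lambda stands in for a real fetch
```

The point of the design is that user code never handles internal variable names: it registers names once in the dict and reads results back under the same keys.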


[cherry-pick] NV JETSON support and auto_growth strategy for inference. (#21500) · 20a09375

Committed by Zhaolong Xing on Dec 04, 2019

* ADD NV JETSON SUPPORT
test=release/1.6

* CHERRY_PICK: specify the auto growth allocator for inference.
test=release/1.6

[cherry pick] Conv2d and Conv2d transpose MKL-DNN NHWC support (#21525) · 0e63746b

Committed by bingyanghuang on Dec 04, 2019

Pick disable reshape inplace in dygraph (#21486) · 32a0eb50

Committed by hong on Dec 04, 2019

* disable reshape inplace in dygraph model; test=develop (#21157)

* fix ExecutionContext::HasInput and ExecutionContext::HasOutput depend on the scope structure, test=develop (#20721)

03 Dec, 2019 · 2 commits

revert ProgOptimUnsupported check, test=release/1.6 (#21475) · 5c7c6b1e

Committed by 石晓伟 on Dec 03, 2019

cherry-pick LRN and Pool2d (FWD) NHWC support (#21476) · ccb508dc

Committed by bingyanghuang on Dec 03, 2019

02 Dec, 2019 · 2 commits

[cherry-pick] find lookup table in order & support dump param (#21347) · 893ea7e0

Committed by Thunderbrook on Dec 02, 2019

* support dump param of model into afs (#20302)

* support dump param to afs
test=develop

* code style
test=develop

* code style
test=develop

* dump param
test=develop

* dump param
test=develop

* dump param
test=develop

* dump param
test=develop

* find lookup table in order (#20932)

test=develop

* cherry-pick
test=develop

* solve pslib core in stop worker
test=develop

* print table stat info for pslib
test=develop

CHERRY_PICK: TRT int8: refine trt int8 for dynamic range set (#21112) (#21449) · 0473cdb8

Committed by Zhaolong Xing on Dec 02, 2019

28 Nov, 2019 · 1 commit

cherry-pick1.6 fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21339) · 072eb5b6

Committed by xujiaqi01 on Nov 28, 2019

* fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21052)

* fix cache table bug
* add save_paddle_inference_model
* fix hdfs util bug
* test=develop

* fix several sparse table issues (#20686)

* no longer need to define all embedding layers (no one less) of all slots in each program. make trainer_param repeated in ps.proto.
* add find_distributed_lookup_table_grads instead of hard code GRAD
* support embedding stop gradient. push sparse has error before fix this.
* fix fill sparse, skip slots which do not have embedding. each slot's embedding in a sparse table should be used in all training programs before fix this.
* fix pull sparse, skip slots which do not have embedding.
* fix collect feasign label info, skip slots which do not have embedding.
* support when there are multi sparse tables in one or multi training programs, each program can pull/push its own related sparse tables instead of all sparse tables.
* test=develop

* add copy table (#21086)

* copy some feasigns and corresponding embeddings from one sparse table to another
* copy all feasigns and corresponding embeddings from one sparse table to another
* copy all dense params from one table to another
* copy some local vars to other local vars

* fix fs_client_param bug (#21212)

* fix fs_client_param bug, user can set this config through fleet_desc_file or fleet config
* test=develop

* fix fleet util bug (#21254)

* fix fleet util bug in save paddle inference model
* test=develop
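
Note on the "add copy table (#21086)" item in the message above: it lists copying some or all feasigns and their embeddings from one sparse table to another. A minimal sketch of that idea follows, with each sparse table modelled as a plain dict from feasign to embedding vector. The function name, the dict model, and the sample values are assumptions for illustration and do not correspond to the pslib or fleet API.

```python
def copy_sparse_table(src, dst, feasigns=None):
    """Copy embeddings from src to dst; copy every feasign when `feasigns` is None."""
    keys = list(src) if feasigns is None else [k for k in feasigns if k in src]
    for key in keys:
        dst[key] = list(src[key])  # copy the vector so the tables do not share storage
    return len(keys)


table_a = {101: [0.1, 0.2], 102: [0.3, 0.4], 103: [0.5, 0.6]}
table_b = {}

# Copy some feasigns; 999 is absent from the source table and is skipped.
copied_some = copy_sparse_table(table_a, table_b, feasigns=[101, 103, 999])
```

Calling the same function without `feasigns` covers the "copy all" case from the list.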

26 Nov, 2019 · 1 commit

[Cherry-pick 1.6] Fix dgc buffer illegal & reuse velocity & fix fuse (#21281) · 93c7f058

Committed by WangXi on Nov 26, 2019

25 Nov, 2019 · 1 commit

Add pre-condition check for fuse optimizer op pass (#21005) (#21305) · 9f004548

Committed by Chen Weihang on Nov 25, 2019

* add pre condition check for fuse optimizer op pass, test=develop

* add log & set init to zero, test=develop

* fix test_fuse_all_reduce_pass failed, test=develop

* polish details, test=develop

* refine PADDLE_ENFORCE & remove needless VLOG, test=develop

* refactor op check method, test=develop

21 Nov, 2019 · 1 commit

Cherry-pick error type support for release1.6 (#21294) · 974b8a83

Committed by Chen Weihang on Nov 21, 2019

* delete paddle infershape enforce macro (#20832)

* Polish and arrange code in enforce.h (#20901)

* Enrich the type of error and declare the error type interfaces (#21024)

* Enrich the type of error and declare the error type interfaces, test=develop

* adjust tests to adapt new form, test=develop

* add inference deps with error_codes.pb.h, test=develop

* restore stack iter start pos, test=develop

* polish code based review comments, test=develop

* Add dependency for error_codes.proto (#21084)

* fix activation_functions deps, test=develop, test=document_fix

* add error_codes_proto deps, test=develop, test=document_fix

* try delete enforce.h, test=develop, test=document_fix

* change cuda enforce & add example (#21142)
test=release/1.6

07 Nov, 2019 · 1 commit

[cherry-pick] fix squared_mat_sub_fuse_pass bug when elementwise_op input is... · e6ed6379

Committed by Wilber on Nov 07, 2019

[cherry-pick] fix squared_mat_sub_fuse_pass bug when elementwise_op input is persistable param test=develop test=release/1.6 (#21044)

fix squared_mat_sub_fuse_pass bug when elementwise_op input is persistable param

02 Nov, 2019 · 1 commit

fix infer crashes caused by conv/pool upgrades, test=release/1.6 (#20969) · 53f1e024

Committed by 石晓伟 on Nov 02, 2019

* fix infer crashes caused by conv/pool upgrades, test=release/1.6

* fix bug, test=release/1.6

01 Nov, 2019 · 3 commits

cherry-pick1.6 simplify master+patch, remove ins when size != merge_size or has... · 3db61dc0

Committed by xujiaqi01 on Nov 01, 2019

cherry-pick1.6 simplify master+patch, remove ins when size != merge_size or has conflict slot (#20941)

* simplify master+patch, remove ins when size != merge_size or has conflict slot
* test=develop

add check nan / inf in downpour worker (#20694) (#20925) · 5c3656bb

Committed by xujiaqi01 on Nov 01, 2019

* add check nan / inf in downpour worker during training
* test=develop

Optimize decay (#20816) (#20952) · 781d2844

Committed by 123malin on Nov 01, 2019

* update pserver decay blocks

* update distributed notify handler

30 Oct, 2019 · 2 commits

[cherry-pick] Add support to gcc8, add docker env (#20892) · 6fb04e8a

Committed by liu zhengxi on Oct 30, 2019

* add support to gcc8, add docker env
* remove the warning issue

Cherry pick save load new feature (#20877) · 5119f262

Committed by hong on Oct 30, 2019

* Serialize to pickle format (#20820)

test=develop

* save load problem fix and new feature add (#20823)

* fix persistable;

* fix save load bugs; test=develop

* fix bug; test=develop

* add example for new io api; test=develop

* add example; test=develop

29 Oct, 2019 · 1 commit

[Cherry-pick to 1.6] Block part of "tensor should not be null" error message (#20845) · d29e9aa4

Committed by Chen Weihang on Oct 29, 2019

* Add IndicateVarDataType interface to block tensor is not initialized problem in OP GetExpectedKernelType (#20044)

* add indicate_var_data_type interface, test=develop

* add unittests & polish error message, test=develop

* remove needless include, test=develop

* extract public function & polish message, test=develop

* delete empty var check, test=develop

* change data_type to pointer parameter, test=develop

* polish details, test=develop

* Replace risky GetInputType method with secure IndicateVarDataType interface (#20668)

* replace part of the old implementation, test=develop

* restore concat op, test=develop

* update all ops implementation & delete GetDataTypeOfVar func, test=develop

test=release/1.6

25 Oct, 2019 · 1 commit

Make formatted ENFORCE stack adapt to more situations (#20826) (#20828) · 4841474e

Committed by Chen Weihang on Oct 25, 2019

24 Oct, 2019 · 1 commit

[Cherry-pick 1.6]Add more error debug message to Operator::Run (#20807) · 03a89450

Committed by Zeng Jinle on Oct 24, 2019

* add more err msg, test=develop

* add more unittests, test=release/1.6

21 Oct, 2019 · 1 commit

[Cherry-pick 1.6] Fix DGC test and DGC nan bug (#20708) · 2378aa8a

Committed by WangXi on Oct 21, 2019

20 Oct, 2019 · 1 commit

CHERRY_PICK 20720: fix ts_sort's bug, test=develop (#20726) · a7d0d888

Committed by Zhaolong Xing on Oct 20, 2019

test=release/1.6

18 Oct, 2019 · 1 commit

[MKL-DNN] Added mkl-dnn cache clearing when creating Executor instance (#20241) (#20693) · 2099618d

Committed by Michał Gallus on Oct 18, 2019

test=release/1.6

* - Flushing mkl-dnn cache

test=develop

- Disabled clearing cache for LoadModel

- Added clearing of mkl-dnn cache when Executor is created

test=develop

- Do not clear for GPU places

test=develop

- compilation fix

test=develop

* - Moved clearing of mkl-dnn cache in destructor of executor

test=develop

* - Compilation fix

test=develop

- Reverted conditional clearing of mkl-dnn cache in Executor's
  destructor

test=develop

- compilation fix

17 Oct, 2019 · 2 commits

[Cherry-pick 1.6]Fix op run log when memory optimization strategy is enabled (#20696) · a77d75cd

Committed by Zeng Jinle on Oct 17, 2019

* fix op log bug, test=release/1.6

* add unittests, test=release/1.6

[cherry-pick]Fix communicator slow bug & fix communicator stop bug (#20366) (#20646) · eeaf04da

Committed by Chengmo on Oct 17, 2019

* Fix communicator slow bug & fix communicator stop bug (#20366)

* test=develop,Fix communicator slow bug

* test=develop, delete if() in stop_worker()

* test=develop

* fix UT, test=develop

* fix bug in fetch handler, test=develop

* fix bug in fetch handler, test=develop

* test=develop, fix fetch barrier bug

* test=develop, bug fix

* test=develop, bug fix

* test=develop, fix bug

* test=develop,test=release/1.6

15 Oct, 2019 · 1 commit

dump fix dov vec file num (#20539) (#20605) · ddf80c4b

Committed by Thunderbrook on Oct 15, 2019

* support dump multi file
test=develop

* dump fix num file
test=develop

14 Oct, 2019 · 4 commits

support convert tensor to cudf depends on dlpack test=release/1.6 (#20611) · 5da8db61

Committed by 633WHU on Oct 14, 2019

add DisableGlogInfo() to AnalysisConfig, test=develop (#20581) (#20600) · fed1263c

Committed by Pei Yang on Oct 14, 2019

fix parse content in CreatePreLoadReaders (#20258) (#20570) · 4f66e922

Committed by xujiaqi01 on Oct 14, 2019

Fix parse content in CreatePreLoadReaders. Before this fix, if you use dataset.set_parse_content and dataset.preload, parse content didn't work.

[cherry-pick] Add multihead matmul fuse pass (#20167) (#20592) · cefbcf77

Committed by zhaoyuchen2018 on Oct 13, 2019

* Add Multihead matmul fuse pass (#20167)

* Add multihead fuse pass for ernie opt

* Refine softmax

test=develop

* Refine cuda kernel

* Refine cuda version

* Refine cmake

test=develop

* refine header file

* refine test case and pass
* refine comments

* Delete useless code.

test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>

13 Oct, 2019 · 1 commit

fix cuda dev_ctx by event, test=release/1.6 (#20559) · c8de7284

Committed by Zeng Jinle on Oct 13, 2019

机器未来 / Paddle · In sync with the fork's source project