- 10 Jan, 2020 2 commits
-
-
Committed by GaoWei8
* Optimize the kernel implementation of layernorm with openmp (#20895)
* Add ernie c++ inference test (#21015)
* Add ernie unit test test=develop
* Add ernie unit test test=develop
* Add ernie unit test test=develop
* remove ngraph
* optimize gpu test test=develop
* optimize codes test=develop
* fix cmake fails on inference_download_and_uncompress (#21185)
* solve cmake fails on inference_download_and_uncompress test=develop
* solve cmake fails on inference_download_and_uncompress test=develop
* Add fc padding to improve mkl GEMM's performance when N and K are multiples of 128. (#20972)
* Add fc padding to solve mkl performance test=develop
* fix gpu pass and error information test=develop
* fix fc_fuse_pass_test test=develop
* fix error information test=develop
* fix error information test=develop
* fix name and add fc op padding test test=develop
* fix attributes test=develop
* optimize fc padding test=develop
* fix test test=develop
* Polish the codes of fc when needs padding (#21378) test=develop
* Add ernie large c++ inference test (#21365)
* add ernie-large test test=develop
* add ernie large c++ inference test test=develop
* Modify padding strategy: remove weight copy in fc padding (#21650) test=develop
* optimize fc jit (#21878) test=develop
Co-authored-by: Yihua Xu <yihuaxu@hotmail.com>
-
Committed by 石晓伟
* fix multi-thread error of fc_gru_fuse_pass.cc, test=develop
* export FLAGS and GLOG symbols, test=develop
-
- 09 Jan, 2020 1 commit
-
-
Committed by WangXi
[Cherry-pick 1.6] fix batch_norm_grad shape=0 & allreduce shape enforce & sync_batch_norm hang in fleet (#22157)
-
- 08 Jan, 2020 1 commit
-
-
Committed by liu zhengxi
* fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop
* fix attention_lstm_fuse_pass during multi-threads inference, test=develop
* fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop
* fix fc_lstm_fuse_pass during multi-threads inference, test=develop
* fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop
-
- 07 Jan, 2020 1 commit
-
-
Committed by Pei Yang
-
- 09 Dec, 2019 1 commit
-
-
Committed by Zhaolong Xing
This reverts commit 0473cdb8.
-
- 04 Dec, 2019 5 commits
-
-
Committed by Pei Yang
Make the config option DisableGlogInfo() able to mute all inference logs
-
Committed by tangwei12
* fix fetch handler problem and refactor. When a user defines a FetchHandler class, they should initialize the handler with a variable dict. Each key of the variable dict is a user-defined name, and each value is a Variable generated from the Python API. For each fetch, the user should implement the handler function, in which fetched_result_dict is available, and can access the fetched values by the user-defined keys.
-
Committed by Zhaolong Xing
* ADD NV JETSON SUPPORT test=release/1.6
* CHERRY_PICK: specify the auto growth allocator for inference. test=release/1.6
-
Committed by bingyanghuang
-
Committed by hong
* disable reshape inplace in dygraph model; test=develop (#21157)
* fix ExecutionContext::HasInput and ExecutionContext::HasOutput depending on the scope structure, test=develop (#20721)
-
- 03 Dec, 2019 2 commits
-
-
Committed by 石晓伟
-
Committed by bingyanghuang
-
- 02 Dec, 2019 2 commits
-
-
Committed by Thunderbrook
* support dump param of model into afs (#20302)
* support dump param to afs test=develop
* code style test=develop
* code style test=develop
* dump param test=develop
* dump param test=develop
* dump param test=develop
* dump param test=develop
* find lookup table in order (#20932) test=develop
* cherry-pick test=develop
* solve pslib core in stop worker test=develop
* print table stat info for pslib test=develop
-
Committed by Zhaolong Xing
-
- 28 Nov, 2019 1 commit
-
-
Committed by xujiaqi01
* fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21052)
* fix cache table bug
* add save_paddle_inference_model
* fix hdfs util bug
* test=develop
* fix several sparse table issues (#20686)
* no longer need to define all embedding layers (no one less) of all slots in each program. make trainer_param repeated in ps.proto.
* add find_distributed_lookup_table_grads instead of hard-coded GRAD
* support embedding stop gradient. push sparse had an error before this fix.
* fix fill sparse, skip slots which do not have embedding. before this fix, each slot's embedding in a sparse table had to be used in all training programs.
* fix pull sparse, skip slots which do not have embedding.
* fix collect feasign label info, skip slots which do not have embedding.
* support multiple sparse tables in one or more training programs; each program can pull/push its own related sparse tables instead of all sparse tables.
* test=develop
* add copy table (#21086)
* copy some feasigns and corresponding embeddings from one sparse table to another
* copy all feasigns and corresponding embeddings from one sparse table to another
* copy all dense params from one table to another
* copy some local vars to other local vars
* fix fs_client_param bug (#21212)
* fix fs_client_param bug, user can set this config through fleet_desc_file or fleet config
* test=develop
* fix fleet util bug (#21254)
* fix fleet util bug in save paddle inference model
* test=develop
-
- 26 Nov, 2019 1 commit
-
-
Committed by WangXi
-
- 25 Nov, 2019 1 commit
-
-
Committed by Chen Weihang
* add pre condition check for fuse optimizer op pass, test=develop
* add log & set init to zero, test=develop
* fix test_fuse_all_reduce_pass failed, test=develop
* polish details, test=develop
* refine PADDLE_ENFORCE & remove needless VLOG, test=develop
* refactor op check method, test=develop
-
- 21 Nov, 2019 1 commit
-
-
Committed by Chen Weihang
* delete paddle infershape enforce macro (#20832)
* Polish and arrange code in enforce.h (#20901)
* Enrich the type of error and declare the error type interfaces (#21024)
* Enrich the type of error and declare the error type interfaces, test=develop
* adjust tests to adapt new form, test=develop
* add inference deps with error_codes.pb.h, test=develop
* restore stack iter start pos, test=develop
* polish code based on review comments, test=develop
* Add dependency for error_codes.proto (#21084)
* fix activation_functions deps, test=develop, test=document_fix
* add error_codes_proto deps, test=develop, test=document_fix
* try delete enforce.h, test=develop, test=document_fix
* change cuda enforce & add example (#21142) test=release/1.6
-
- 07 Nov, 2019 1 commit
-
-
Committed by Wilber
[cherry-pick] fix squared_mat_sub_fuse_pass bug when elementwise_op input is persistable param test=develop test=release/1.6 (#21044)
fix squared_mat_sub_fuse_pass bug when elementwise_op input is persistable param
-
- 02 Nov, 2019 1 commit
-
-
Committed by 石晓伟
* fix infer crashes caused by conv/pool upgrades, test=release/1.6
* fix bug, test=release/1.6
-
- 01 Nov, 2019 3 commits
-
-
Committed by xujiaqi01
cherry-pick1.6 simplify master+patch, remove ins when size != merge_size or has conflict slot (#20941)
* simplify master+patch, remove ins when size != merge_size or has conflict slot
* test=develop
-
Committed by xujiaqi01
* add check nan / inf in downpour worker during training
* test=develop
-
Committed by 123malin
* update pserver decay blocks
* update distributed notify handler
-
- 30 Oct, 2019 2 commits
-
-
Committed by liu zhengxi
* add support to gcc8, add docker env
* remove the warning issue
-
Committed by hong
* Serialize to pickle format (#20820) test=develop
* save load problem fix and new feature add (#20823)
* fix persistable;
* fix save load bugs; test=develop
* fix bug; test=develop
* add example for new io api; test=develop
* add example; test=develop
-
- 29 Oct, 2019 1 commit
-
-
Committed by Chen Weihang
* Add IndicateVarDataType interface to block tensor is not initialized problem in OP GetExceptedKernelType (#20044)
* add indicate_var_data_type interface, test=develop
* add unittests & polish error message, test=develop
* remove needless include, test=develop
* extract public function & polish message, test=develop
* delete empty var check, test=develop
* change data_type to pointer parameter, test=develop
* polish details, test=develop
* Replace risky GetInputType method with secure IndicateVarDataType interface (#20668)
* replace part of the old implementation, test=develop
* restore concat op, test=develop
* update all ops implementation & delete GetDataTypeOfVar func, test=develop test=release/1.6
-
- 25 Oct, 2019 1 commit
-
-
Committed by Chen Weihang
-
- 24 Oct, 2019 1 commit
-
-
Committed by Zeng Jinle
* add more err msg, test=develop
* add more unittests, test=release/1.6
-
- 21 Oct, 2019 1 commit
-
-
Committed by WangXi
-
- 20 Oct, 2019 1 commit
-
-
Committed by Zhaolong Xing
test=release/1.6
-
- 18 Oct, 2019 1 commit
-
-
Committed by Michał Gallus
test=release/1.6
* - Flushing mkl-dnn cache test=develop
  - Disabled clearing cache for LoadModel
  - Added clearing of mkl-dnn cache when Executor is created test=develop
  - Do not clear for GPU places test=develop
  - compilation fix test=develop
* - Moved clearing of mkl-dnn cache in destructor of executor test=develop
* - Compilation fix test=develop
  - Reverted conditional clearing of mkl-dnn cache in Executor's destructor test=develop
  - compilation fix
-
- 17 Oct, 2019 2 commits
-
-
Committed by Zeng Jinle
* fix op log bug, test=release/1.6
* add unittests, test=release/1.6
-
Committed by Chengmo
* Fix communicator slow bug & fix communicator stop bug (#20366)
* test=develop, Fix communicator slow bug
* test=develop, delete if() in stop_worker()
* test=develop
* fix UT, test=develop
* fix bug in fetch handler, test=develop
* fix bug in fetch handler, test=develop
* test=develop, fix fetch barrier bug
* test=develop, bug fix
* test=develop, bug fix
* test=develop, fix bug
* test=develop, test=release/1.6
-
- 15 Oct, 2019 1 commit
-
-
Committed by Thunderbrook
* support dump multi file test=develop
* dump fix num file test=develop
-
- 14 Oct, 2019 4 commits
-
-
Committed by 633WHU
-
Committed by Pei Yang
-
Committed by xujiaqi01
Fix parse content in CreatePreLoadReaders. Before this fix, parse content did not take effect when dataset.set_parse_content was used together with dataset.preload.
-
Committed by zhaoyuchen2018
* Add Multihead matmul fuse pass (#20167)
* Add multihead fuse pass for ernie opt
* Refine softmax test=develop
* Refine cuda kernel
* Refine cuda version
* Refine cmake test=develop
* refine header file
* refine test case and pass
* refine comments
* Delete useless code. test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
- 13 Oct, 2019 1 commit
-
-
Committed by Zeng Jinle
-