1. 10 Jan, 2020 2 commits
    • [cherry-pick] Add FC padding, ernie test unit and layernorm parallel (#22198) · 3df38f5c
      GaoWei8 committed
      * Optimize the kernel implementation of layernorm with openmp (#20895)
      
      * Add ernie c++ inference test (#21015)
      
      * Add ernie unit test
      test=develop
      
      * Add ernie unit test
      test=develop
      
      * Add ernie unit test
      test=develop
      
      * remove ngraph
      
      * optimize gpu test
      test=develop
      
      * optimize codes
      test=develop
      
      * fix cmake fails on inference_download_and_uncompress (#21185)
      
      * solve cmake fails on inference_download_and_uncompress
      test=develop
      
      * solve cmake fails on inference_download_and_uncompress
      test=develop
      
      * Add fc padding to improve mkl GEMM's performance when N and K are multiples of 128. (#20972)
      
      * Add fc padding to solve mkl performance
      test=develop
      
      * fix gpu pass and error information
      test=develop
      
      * fix fc_fuse_pass_test
      test=develop
      
      * fix error information
      test=develop
      
      * fix error information
      test=develop
      
      * fix name and add fc op padding test
      test=develop
      
      * fix attributes
      test=develop
      
      * optimize fc padding
      test=develop
      
      * fix test
      test=develop
      
      * Polish the codes of fc when needs padding (#21378)
      
      test=develop
      
      * Add ernie large c++ inference test (#21365)
      
      * add ernie-large test
      test=develop
      
      * add ernie large c++ inference test
      test=develop
      
      * Modify padding strategy: remove weight copy in fc padding (#21650)
      
      test=develop
      
      * optimize fc jit (#21878)
      
      test=develop
      Co-authored-by: Yihua Xu <yihuaxu@hotmail.com>
    • fix multi-thread error of fc_gru_fuse_pass.cc, test=develop (#21841) (#22185) · e8e12499
      石晓伟 committed
      * fix multi-thread error of fc_gru_fuse_pass.cc, test=develop
      
      * export FLAGS and GLOG symbols, test=develop
  2. 09 Jan, 2020 1 commit
  3. 08 Jan, 2020 1 commit
    • Fix multi-threads memory out of bounds error for passes (#21920) (#22132) · 835201bf
      liu zhengxi committed
      * fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop
      
      * fix attention_lstm_fuse_pass during multi-threads inference, test=develop
      
      * fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop
      
      * fix fc_lstm_fuse_pass during multi-threads inference, test=develop
      
      * fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop
  4. 07 Jan, 2020 1 commit
  5. 09 Dec, 2019 1 commit
  6. 04 Dec, 2019 5 commits
  7. 03 Dec, 2019 2 commits
  8. 02 Dec, 2019 2 commits
  9. 28 Nov, 2019 1 commit
    • cherry-pick1.6 fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21339) · 072eb5b6
      xujiaqi01 committed
      * fix cache table bug, add save_paddle_inference_model, fix hdfs util bug (#21052)
      
      * fix cache table bug
      * add save_paddle_inference_model
      * fix hdfs util bug
      * test=develop
      
      * fix several sparse table issues (#20686)
      
      * no longer need to define the embedding layers of every slot in each program. make trainer_param repeated in ps.proto.
      * add find_distributed_lookup_table_grads instead of hard code GRAD
      * support embedding stop gradient. push sparse raised an error before this fix.
      * fix fill sparse, skip slots which do not have embedding. before this fix, each slot's embedding in a sparse table had to be used in all training programs.
      * fix pull sparse, skip slots which do not have embedding.
      * fix collect feasign label info, skip slots which do not have embedding.
      * support when there are multi sparse tables in one or multi training programs, each program can pull/push its own related sparse tables instead of all sparse tables.
      * test=develop
      
      * add copy table (#21086)
      
      * copy some feasigns and corresponding embeddings from one sparse table to another
      * copy all feasigns and corresponding embeddings from one sparse table to another
      * copy all dense params from one table to another
      * copy some local vars to other local vars
      
      * fix fs_client_param bug (#21212)
      
      * fix fs_client_param bug, user can set this config through fleet_desc_file or fleet config
      * test=develop
      
      * fix fleet util bug (#21254)
      
      * fix fleet util bug in save paddle inference model
      * test=develop
  10. 26 Nov, 2019 1 commit
  11. 25 Nov, 2019 1 commit
    • Add pre-condition check for fuse optimizer op pass (#21005) (#21305) · 9f004548
      Chen Weihang committed
      * add pre condition check for fuse optimizer op pass, test=develop
      
      * add log & set init to zero, test=develop
      
      * fix test_fuse_all_reduce_pass failed, test=develop
      
      * polish details, test=develop
      
      * refine PADDLE_ENFORCE & remove needless VLOG, test=develop
      
      * refactor op check method, test=develop
  12. 21 Nov, 2019 1 commit
    • Cherry-pick error type support for release1.6 (#21294) · 974b8a83
      Chen Weihang committed
      * delete paddle infershape enforce macro (#20832)
      
      * Polish and arrange code in enforce.h (#20901)
      
      * Enrich the type of error and declare the error type interfaces (#21024)
      
      * Enrich the type of error and declare the error type interfaces, test=develop
      
      * adjust tests to adapt new form, test=develop
      
      * add inference deps with error_codes.pb.h, test=develop
      
      * restore stack iter start pos, test=develop
      
      * polish code based review comments, test=develop
      
      * Add dependency for error_codes.proto (#21084)
      
      * fix activation_functions deps, test=develop, test=document_fix
      
      * add error_codes_proto deps, test=develop, test=document_fix
      
      * try delete enforce.h, test=develop, test=document_fix
      
      * change cuda enforce & add example (#21142)
      test=release/1.6
  13. 07 Nov, 2019 1 commit
  14. 02 Nov, 2019 1 commit
  15. 01 Nov, 2019 3 commits
  16. 30 Oct, 2019 2 commits
  17. 29 Oct, 2019 1 commit
    • [Cherry-pick to 1.6] Block part of "tensor should not be null" error message (#20845) · d29e9aa4
      Chen Weihang committed
      * Add IndicateVarDataType interface to block the "tensor is not initialized" problem in OP GetExpectedKernelType (#20044)
      
      * add indicate_var_data_type interface, test=develop
      
      * add unittests & polish error message, test=develop
      
      * remove needless include, test=develop
      
      * extract public function & polish message, test=develop
      
      * delete empty var check, test=develop
      
      * change data_type to pointer parameter, test=develop
      
      * polish details, test=develop
      
      * Replace risky GetInputType method with secure IndicateVarDataType interface (#20668)
      
      * replace part of the old implementation, test=develop
      
      * restore concat op, test=develop
      
      * update all ops implementation & delete GetDataTypeOfVar func, test=develop
      
      test=release/1.6
  18. 25 Oct, 2019 1 commit
  19. 24 Oct, 2019 1 commit
  20. 21 Oct, 2019 1 commit
  21. 20 Oct, 2019 1 commit
  22. 18 Oct, 2019 1 commit
    • [MKL-DNN] Added mkl-dnn cache clearing when creating Executor instance (#20241) (#20693) · 2099618d
      Michał Gallus committed
      test=release/1.6
      
      * - Flushing mkl-dnn cache
      
      test=develop
      
      - Disabled clearing cache for LoadModel
      
      - Added clearing of mkl-dnn cache when Executor is created
      
      test=develop
      
      - Do not clear for GPU places
      
      test=develop
      
      - compilation fix
      
      test=develop
      
      * - Moved clearing of mkl-dnn cache in destructor of executor
      
      test=develop
      
      * - Compilation fix
      
      test=develop
      
      - Reverted conditional clearing of mkl-dnn cache in Executor's
        destructor
      
      test=develop
      
      - compilation fix
  23. 17 Oct, 2019 2 commits
  24. 15 Oct, 2019 1 commit
  25. 14 Oct, 2019 4 commits
  26. 13 Oct, 2019 1 commit