1. 22 Apr, 2022 1 commit
    • Ssd sparse table (#41812) · cca57c4a
      zhaocaibei123 committed
      * [cherry-pick2.3]fix compile bug of windows cuda11.5 (#41464)
      
      cherry-pick
      
      fix compile bug of windows cuda11.5 #41433
      
      * fix bug of missing boost when compile cache.cc (#41449)
      
      [cherry-pick #41430] fix bug of random compile failure due to incorrect compile order of dependencies
      
      * Fix eager try catch (#41438) (#41477)
      
      [Cherry-Pick]Fix eager try catch (#41438)
      
      * Cherry-pick-PR41407, fix device_id bug for final_state op in multiprocess testcase (#41407) (#41475)
      
      Cherry-pick PR #41407
      
      * [BugFix] Add error hint for one_hot gpu version (#41335) (#41495)
      
      * add one_hot gpu hint
      
      * move allow_out_of_range judgement
      
      * delete useless unittest
      
      * fix bugs of reshape double grad infermeta (#41459) (#41493)
      
      * [cherrypick-2.3] modify infer gpu memory strategy (#41427), remove cudnn_deterministic=True (#41341) (#41491)
      Co-authored-by: JingZhuangzhuang <75348594+JZZ-NOTE@users.noreply.github.com>
      
      * [Cherry-pick][ROCm] fix dcu error in device event base, test=develop (#41523)
      
      Cherry-pick of #41521
      
      * [Cherry-Pick]Cherry pick PR41200, PR41474, PR41382 (#41509)
      
      * Use `self` as a parameter of _hash_with_id function to avoid error caused by hash_id reuse (#41200)
      
      * Add fill_constant_batch_size YAML and UT (#41474)
      
      * Switch some dy2st UT to eager mode (#41382)
      
      * Switch some dy2st UT to eager mode
      
      * Fix test_lstm and remove test_transformer
      
      * Run test_resnet_v2 in old dy mode
      
      * Unittest recover (#41431)
      
      * update name
      
      * update name
      
      * fix test
      
      * fix fleet bind
      
      * update name
      
      * update name
      
      * fix test
      
      * fix gpups wrapper
      
      * remove Push/Pull/Load/Save with context in client and wrapper base class
      
      * fix
      
      * fix
      
      * remove some interface
      
      * fix
      
      * remove
      
      * code style
      
      * recover
      
      * fix
      
      * remove unused code
      
      * remove some unused table & accessor & CommonDenseTable => MemoryDenseTable
      
      * fix
      
      * fix
      
      * fix
      
      * recover
      
      * remove unused code
      
      * recover unittest
      
      * fix
      
      * remove
      
      * fix
      
      * remove useless code
      
      * remove
      
      * fix
      
      * recover
      
      * remove
      Co-authored-by: esythan <esythan@126.com>
      
      * add ssd sparse table
      
      * fix
      
      * add cache shuffle
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * add unit test
      
      * fix
      Co-authored-by: Zhou Wei <1183042833@qq.com>
      Co-authored-by: Sing_chan <51314274+betterpig@users.noreply.github.com>
      Co-authored-by: 0x45f <23097963+0x45f@users.noreply.github.com>
      Co-authored-by: pangyoki <pangyoki@126.com>
      Co-authored-by: Siming Dai <908660116@qq.com>
      Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
      Co-authored-by: Zhang Jun <ewalker@live.cn>
      Co-authored-by: JingZhuangzhuang <75348594+JZZ-NOTE@users.noreply.github.com>
      Co-authored-by: Qi Li <qili93@qq.com>
      Co-authored-by: esythan <esythan@126.com>
  2. 15 Apr, 2022 2 commits
    • solve brpc compile in arm-ubuntu18 (#41649) · 56dafc4f
      ziyoujiyi committed
      * back fl
      
      * delete ssl cert
      
      * .
      
      * make warning
      
      * .
      
      * unittest paral degree
      
      * solve unittest
      
      * heter & multi cloud comm ready
      
      * .
      
      * .
      
      * arm_brpc compile
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * only output is ok
      
      * base is ok
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * add switch server bin
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * adapt brpc ssl
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
    • update (#41762) · 482e5b6c
      lilong12 committed
  3. 04 Apr, 2022 1 commit
    • Table refine: Pull/Push(TableContext) (#41320) · 19cb0d18
      zhaocaibei123 committed
      * update name
      
      * update name
      
      * fix test
      
      * fix fleet bind
      
      * update name
      
      * update name
      
      * fix test
      
      * fix gpups wrapper
      
      * remove Push/Pull/Load/Save with context in client and wrapper base class
      
      * fix
      
      * fix
      
      * remove some interface
      
      * fix
      
      * remove
      
      * code style
      
      * recover
      
      * fix
      
      * remove unused code
      
      * fix
      
      * recover
      
      * fix
      Co-authored-by: esythan <esythan@126.com>
  4. 02 Apr, 2022 2 commits
  5. 01 Apr, 2022 1 commit
  6. 31 Mar, 2022 1 commit
  7. 30 Mar, 2022 1 commit
  8. 28 Mar, 2022 1 commit
  9. 23 Mar, 2022 1 commit
    • two-phase training for ps (#40762) · b1a4668c
      zhaocaibei123 committed
      * fix benchmark and communicator config
      
      * fix bugs of the_one_ps
      
      * multi program and fix bug in optimizer
      
      * multi program in the_one_ps
      
      * public commcontext
      
      * ps optimizer multi programs
      
      * cvm & datanorm backend
      
      * fix dim
      
      * fix unittest
      
      * fix
      
      * the one ps merge
      
      * remove comm
      
      * add DownpourLiteWorker
      
      * all
      
      * fix
      
      * fix
      
      * device worker downpour lite
      
      * fix
      
      * fix bug in global shuffle
      
      * save inference model
      
      * fix & add log
      
      * fix
      
      * remove log
      
      * fix
      
      * fix save summary
      
      * fix
      
      * fix pscore
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * remove logs
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * add some comments
      
      * fix
      Co-authored-by: esythan <esythan@126.com>
  10. 21 Mar, 2022 1 commit
  11. 17 Mar, 2022 1 commit
  12. 28 Feb, 2022 1 commit
  13. 22 Feb, 2022 1 commit
    • fix bug in new the_one_ps (#39505) · d56a0a1b
      wangguanqun committed
      * fix benchmark and communicator config
      
      * fix bugs of the_one_ps
      
      * multi program and fix bug in optimizer
      
      * multi program in the_one_ps
      
      * public commcontext
  14. 20 Feb, 2022 1 commit
  15. 19 Feb, 2022 2 commits
    • [Pten]Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264
      Aurelius84 committed
      * Unify paddle/pten::framework::ddim into pten::ddim
      
      * fix paddle namespace
      
      * compile successfully
      
      * fix npu src file
      
      * fix conflict
      
      * fix conflict
      
      * fix tensorrt compiler error
      
      * fix conflict
      
      * fix conflict
      
      * fix test file conflict
      
      * fix conflict
      
      * fix mlu file conflict
      
      * fix mlu file conflict
      
      * fix cinn header file conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
    • Update record interface using part1 (#39693) · eec6ef81
      chenjian committed
      * fix RecordEvent interface
      
      * modify default level to 4
      
      * update interface use
      
      * add const default trace level
      
      * update record event interface using
      
      * update operator.cc
      
      * update part1
      
      * fix include profiler.h header in ps server
      
      * fix include profiler.h header in ps server
  16. 18 Feb, 2022 1 commit
  17. 15 Feb, 2022 1 commit
    • [PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404
      Aurelius84 committed
      * #1 migrate dist-related type()-> dtype()
      
      * move datatype function from pten -> fluid/framework
      
      * change type() in imperative into convert(dtype())
      
      * modify xx_tensor->type into xx_tensor->dtype
      
      * change the set_type interface and the caller
      
      * modify xx_tensor.type into xx_tensor.dtype
      
      * fix mutable_data(place, dtype())
      
      * change caller of mutable_data in pten and distributed
      
      * change the caller of mutable_data in fluid/framework
      
      * change the caller of mutable_data in imperative directory
      
      * mutable_data: inference
      
      * update the call of mutable_data
      
      * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType
      
      * pass the compile. the next step is remove VarType in Pten
      
      * fix all and remove VarType from pten. success in linux. Next task is other platform
      
      * fix conflict with develop
      
      * fix compiled error
      
      * Fix reset conversion
      
      * fix conflict
      
      * fix compiled problem
      
      * fix typo
      
      * Fix << in tensor_utils.cc
      
      * fix type->dtype
      
      * fix unittest
      
      * fix tensor init constructor
      
      * fix DataTypeSize for BFloat16
      
      * fix code style
      
      * fix npu compiled error
      
      * fix npu
      
      * compile npu successfully
      
      * fix conflict
      
      * fix conflict
      Co-authored-by: xiongkun <xiongkun03@baidu.com>
  18. 11 Feb, 2022 1 commit
  19. 06 Feb, 2022 1 commit
  20. 30 Jan, 2022 1 commit
  21. 25 Jan, 2022 1 commit
  22. 12 Jan, 2022 1 commit
    • the_one_ps dirs reconstruct (#38804) · 50609214
      ziyoujiyi committed
      * delete gloo connect retry
      
      * the_one_ps dirs reconstruct
      
      * .
      
      * .
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify