1. 22 Apr, 2022 1 commit
    • Ssd sparse table (#41812) · cca57c4a
      zhaocaibei123 committed
      * [cherry-pick2.3]fix compile bug of windows cuda11.5 (#41464)
      
      cherry-pick
      
      fix compile bug of windows cuda11.5 #41433
      
      * fix bug of missing boost when compile cache.cc (#41449)
      
      [cherry-pick #41430] fix bug of random compile failure due to incorrect compile order of dependencies
      
      * Fix eager try catch (#41438) (#41477)
      
      [Cherry-Pick]Fix eager try catch (#41438)
      
      * Cherry-pick-PR41407, fix device_id bug for final_state op in multiprocess testcase (#41407) (#41475)
      
      Cherry-pick PR #41407
      
      * [BugFix] Add error hint for one_hot gpu version (#41335) (#41495)
      
      * add one_hot gpu hint
      
      * move allow_out_of_range judgement
      
      * delete useless unittest
      
      * fix bugs of reshape double grad infermeta (#41459) (#41493)
      
      * [cherrypick-2.3] modify infer gpu memory strategy (#41427), remove cudnn_deterministic=True (#41341) (#41491)
      Co-authored-by: JingZhuangzhuang <75348594+JZZ-NOTE@users.noreply.github.com>
      
      * [Cherry-pick][ROCm] fix dcu error in device event base, test=develop (#41523)
      
      Cherry-pick of #41521
      
      * [Cherry-Pick]Cherry pick PR41200, PR41474, PR41382 (#41509)
      
      * Use `self` as a parameter of _hash_with_id function to avoid error caused by hash_id reuse (#41200)
      
      * Add fill_constant_batch_size YAML and UT (#41474)
      
      * Switch some dy2st UT to eager mode (#41382)
      
      * Switch some dy2st UT to eager mode
      
      * Fix test_lstm and remove test_transformer
      
      * Run test_resnet_v2 in old dy mode
      
      * Unittest recover (#41431)
      
      * update name
      
      * update name
      
      * fix test
      
      * fix fleet bind
      
      * update name
      
      * update name
      
      * fix test
      
      * fix gpups wrapper
      
      * remove Push/Pull/Load/Save with context in client and wrapper base class
      
      * fix
      
      * fix
      
      * remove some interface
      
      * fix
      
      * remove
      
      * code style
      
      * recover
      
      * fix
      
      * remove code unused
      
      * remove some unused table & accessor & CommonDenseTable => MemoryDenseTable
      
      * fix
      
      * fix
      
      * fix
      
      * recover
      
      * remove unused code
      
      * recover unittest
      
      * fix
      
      * remove
      
      * fix
      
      * remove code unuseful
      
      * remove
      
      * fix
      
      * recover
      
      * remove
      Co-authored-by: esythan <esythan@126.com>
      
      * add ssd sparse table
      
      * fix
      
      * add cache shuffle
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * add unit test
      
      * fix
      Co-authored-by: Zhou Wei <1183042833@qq.com>
      Co-authored-by: Sing_chan <51314274+betterpig@users.noreply.github.com>
      Co-authored-by: 0x45f <23097963+0x45f@users.noreply.github.com>
      Co-authored-by: pangyoki <pangyoki@126.com>
      Co-authored-by: Siming Dai <908660116@qq.com>
      Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
      Co-authored-by: Zhang Jun <ewalker@live.cn>
      Co-authored-by: JingZhuangzhuang <75348594+JZZ-NOTE@users.noreply.github.com>
      Co-authored-by: Qi Li <qili93@qq.com>
      Co-authored-by: esythan <esythan@126.com>
  2. 19 Apr, 2022 1 commit
  3. 15 Apr, 2022 1 commit
    • gpu_graph engine optimization+ (#41455) · ce72690c
      seemingwang committed
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      
      * test performance
      
      * test performance
      
      * test performance
      
      * test
      
      * test
      
      * update bfs
      
      * change cmake
      
      * test
      
      * test gpu speed
      
      * gpu_graph_engine optimization
      
      * add ssd layer to graph_engine
      
      * fix allocation
      
      * fix syntax error
      
      * fix syntax error
      
      * fix pscore class
      
      * fix
      
      * recover test
      
      * recover test
      
      * fix spelling
      
      * recover
      
      * fix
  4. 05 Apr, 2022 1 commit
    • Table refine: remove table/accessor unuseful (#41400) · a288fcab
      zhaocaibei123 committed
      * update name
      
      * update name
      
      * fix test
      
      * fix fleet bind
      
      * update name
      
      * update name
      
      * fix test
      
      * fix gpups wrapper
      
      * remove Push/Pull/Load/Save with context in client and wrapper base class
      
      * fix
      
      * fix
      
      * remove some interface
      
      * fix
      
      * remove
      
      * code style
      
      * recover
      
      * fix
      
      * remove code unused
      
      * remove some unused table & accessor & CommonDenseTable => MemoryDenseTable
      
      * fix
      
      * fix
      
      * fix
      
      * recover
      
      * remove unused code
      Co-authored-by: esythan <esythan@126.com>
  5. 02 Apr, 2022 2 commits
  6. 01 Apr, 2022 1 commit
  7. 30 Mar, 2022 1 commit
  8. 23 Mar, 2022 1 commit
    • two-phase training for ps (#40762) · b1a4668c
      zhaocaibei123 committed
      * fix benchmark and communicator config
      
      * fix bugs of the_one_ps
      
      * multi program and fix bug in optimizer
      
      * multi program in the_one_ps
      
      * public commcontext
      
      * ps optimizer multi programs
      
      * cvm & datanorm backend
      
      * fix dim
      
      * fix unittest
      
      * fix
      
      * the one ps merge
      
      * remove comm
      
      * add DownpourLiteWorker
      
      * all
      
      * fix
      
      * fix
      
      * device worker downpour lite
      
      * fix
      
      * fix bug in global shuffle
      
      * save inference model
      
      * fix & add log
      
      * fix
      
      * remove log
      
      * fix
      
      * fix save summary
      
      * fix
      
      * fix pscore
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * remove logs
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * fix
      
      * add some comments
      
      * fix
      Co-authored-by: esythan <esythan@126.com>
  9. 17 Mar, 2022 1 commit
  10. 30 Jan, 2022 1 commit
  11. 25 Jan, 2022 1 commit
  12. 02 Nov, 2021 1 commit
  13. 27 May, 2021 1 commit
    • [PsCore] support ssd (#33031) · 988b5fe1
      Thunderbrook committed
      * support ssd in PsCore
      
      * remove log
      
      * remove bz2
      
      * default value
      
      * code style
      
      * parse table class
      
      * code style
      
      * add define
  14. 02 Apr, 2021 1 commit
    • graph engine (#31226) · 94736d60
      seemingwang committed
      * graph engine demo
      
      * upload unsaved changes
      
      * fix dependency error
      
      * fix shard_num problem
      
      * py client
      
      * remove lock and graph-type
      
      * add load direct graph
      
      * add load direct graph
      
      * add load direct graph
      
      * batch random_sample
      
      * batch_sample_k
      
      * fix num_nodes size
      
      * batch brpc
      
      * batch brpc
      
      * add test
      
      * add test
      
      * add load_nodes; change add_node function
      
      * change sample return type to pair
      
      * resolve conflict
      
      * resolved conflict
      
      * resolved conflict
      
      * separate server and client
      
      * merge pair type
      
      * fix
      
      * resolved conflict
      
      * fixed segment fault; high-level VLOG for load edges and load nodes
      
      * random_sample return 0
      
      * rm useless loop
      
      * test:load edge
      
      * fix ret -1
      
      * test: rm sample
      
      * rm sample
      
      * random_sample return future
      
      * random_sample return int
      
      * test fake node
      
      * fixed here
      
      * memory leak
      
      * remove test code
      
      * fix return problem
      
      * add common_graph_table
      
      * random sample node & test & change data-structure from linkedList to vector
      
      * add common_graph_table
      
      * sample with srand
      
      * add node_types
      
      * optimize nodes sample
      
      * recover test
      
      * random sample
      
      * destruct weighted sampler
      
      * GraphEdgeBlob
      
      * WeightedGraphEdgeBlob to GraphEdgeBlob
      
      * WeightedGraphEdgeBlob to GraphEdgeBlob
      
      * pybind sample nodes api
      
      * pull nodes with step
      
      * fixed pull_graph_list bug; add test for pull_graph_list by step
      
      * add graph table;name
      
      * add graph table;name
      
      * add pybind
      
      * add pybind
      
      * add FeatureNode
      
      * add FeatureNode
      
      * add FeatureNode Serialize
      
      * add FeatureNode Serialize
      
      * get_feat_node
      
      * avoid local rpc
      
      * fix get_node_feat
      
      * fix get_node_feat
      
      * remove log
      
      * get_node_feat return py:bytes
      
      * merge develop with graph_engine
      
      * fix threadpool.h head
      
      * fix
      
      * fix typo
      
      * resolve conflict
      
      * fix conflict
      
      * recover lost content
      
      * fix pybind of FeatureNode
      
      * recover cmake
      
      * recover tools
      
      * resolve conflict
      
      * resolve linking problem
      
      * code style
      
      * change test_server port
      
      * fix code problems
      
      * remove shard_num config
      
      * remove redundant threads
      
      * optimize start server
      
      * remove logs
      
      * fix code problems by reviewers' suggestions
      Co-authored-by: Huang Zhengjie <270018958@qq.com>
      Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
      Co-authored-by: suweiyue <suweiyue@baidu.com>
      Co-authored-by: luobin06 <luobin06@baidu.com>
      Co-authored-by: liweibin02 <liweibin02@baidu.com>
  15. 04 Feb, 2021 1 commit
  16. 12 Jan, 2021 1 commit
  17. 08 Jan, 2021 1 commit
  18. 24 Dec, 2020 1 commit
  19. 10 Dec, 2020 1 commit