1. 18 Feb, 2022 1 commit
    • Z
      [AMP] support GPU BF16 amp for dygraph (#39029) · 7d6d3848
      zhangbo9674 committed
      * support dtype param for auto_cast
      
      * add amp_dtype for tracer
      
      * add unsupported bf16 list
      
      * support bf16 amp for O2
      
      * refine python interface for bfloat16
      
      * refine code
      
      * refine code
      
      * refine unittest
      
      * refine code
      
      * refine code
      
      * add bf16 o1
      
      * refine code by comment
      
      * add gradient accumulator
      
      * add recompute
  2. 21 Dec, 2021 2 commits
  3. 09 Dec, 2021 1 commit
    • H
      support offload in sharding stage2 (#37904) · dfed4a63
      Haohongxiang committed
      * merge latest develop branch
      
      * fix bugs
      
      * update
      
      * fix bugs for unittest
      
      * modify for less use of gpu mem
      
      * fix bugs of using _reset_grad_inplace_version
      
      * update
      
      * update
      
      * modify for CI-Coverage
      
      * retrigger all CIs
  4. 29 Nov, 2021 2 commits
  5. 25 Nov, 2021 1 commit
  6. 25 Oct, 2021 1 commit
  7. 21 Oct, 2021 1 commit
  8. 18 Oct, 2021 1 commit
  9. 13 Oct, 2021 1 commit
  10. 11 Oct, 2021 1 commit
  11. 08 Oct, 2021 1 commit
  12. 24 Sep, 2021 1 commit
    • S
      fix distributed ops combining problems (#35942) · 4c35f515
      seemingwang committed
      * graph engine demo
      
      * upload unsaved changes
      
      * fix dependency error
      
      * fix shard_num problem
      
      * py client
      
      * remove lock and graph-type
      
      * add load direct graph
      
      * add load direct graph
      
      * add load direct graph
      
      * batch random_sample
      
      * batch_sample_k
      
      * fix num_nodes size
      
      * batch brpc
      
      * batch brpc
      
      * add test
      
      * add test
      
      * add load_nodes; change add_node function
      
      * change sample return type to pair
      
      * resolve conflict
      
      * resolved conflict
      
      * resolved conflict
      
      * separate server and client
      
      * merge pair type
      
      * fix
      
      * resolved conflict
      
      * fixed segment fault; high-level VLOG for load edges and load nodes
      
      * random_sample return 0
      
      * rm useless loop
      
      * test:load edge
      
      * fix ret -1
      
      * test: rm sample
      
      * rm sample
      
      * random_sample return future
      
      * random_sample return int
      
      * test fake node
      
      * fixed here
      
      * memory leak
      
      * remove test code
      
      * fix return problem
      
      * add common_graph_table
      
      * random sample node & test & change data structure from linkedList to vector
      
      * add common_graph_table
      
      * sample with srand
      
      * add node_types
      
      * optimize nodes sample
      
      * recover test
      
      * random sample
      
      * destruct weighted sampler
      
      * GraphEdgeBlob
      
      * WeightedGraphEdgeBlob to GraphEdgeBlob
      
      * WeightedGraphEdgeBlob to GraphEdgeBlob
      
      * pybind sample nodes api
      
      * pull nodes with step
      
      * fixed pull_graph_list bug; add test for pull_graph_list by step
      
      * add graph table;name
      
      * add graph table;name
      
      * add pybind
      
      * add pybind
      
      * add FeatureNode
      
      * add FeatureNode
      
      * add FeatureNode Serialize
      
      * add FeatureNode Serialize
      
      * get_feat_node
      
      * avoid local rpc
      
      * fix get_node_feat
      
      * fix get_node_feat
      
      * remove log
      
      * get_node_feat return py:bytes
      
      * merge develop with graph_engine
      
      * fix threadpool.h head
      
      * fix
      
      * fix typo
      
      * resolve conflict
      
      * fix conflict
      
      * recover lost content
      
      * fix pybind of FeatureNode
      
      * recover cmake
      
      * recover tools
      
      * resolve conflict
      
      * resolve linking problem
      
      * code style
      
      * change test_server port
      
      * fix code problems
      
      * remove shard_num config
      
      * remove redundant threads
      
      * optimize start server
      
      * remove logs
      
      * fix code problems by reviewers' suggestions
      
      * move graph files into a folder
      
      * code style change
      
      * remove graph operations from base table
      
      * optimize get_feat function of graph engine
      
      * fix long long count problem
      
      * remove redundant graph files
      
      * remove unused shell
      
      * recover dropout_op_pass.h
      
      * fix potential stack overflow when request number is too large & node add & node clear & node remove
      
      * when sample k is larger than neighbor num, return directly
      
      * using random seed generator of paddle to speed up
      
      * fix bug of random sample k
      
      * fix code style
      
      * fix code style
      
      * add remove graph to fleet_py.cc
      
      * fix blocking_queue problem
      
      * fix style
      
      * fix
      
      * recover capacity check
      
      * add remove graph node; add set_feature
      
      * add remove graph node; add set_feature
      
      * add remove graph node; add set_feature
      
      * add remove graph node; add set_feature
      
      * fix distributed op combining problems
      
      * optimize
      
      * remove logs
      Co-authored-by: Huang Zhengjie <270018958@qq.com>
      Co-authored-by: Weiyue Su <weiyue.su@gmail.com>
      Co-authored-by: suweiyue <suweiyue@baidu.com>
      Co-authored-by: luobin06 <luobin06@baidu.com>
      Co-authored-by: liweibin02 <liweibin02@baidu.com>
      Co-authored-by: tangwei12 <tangwei12@baidu.com>
  13. 17 Sep, 2021 1 commit
    • Z
      [AMP] Support pure fp16 training mode for dygraph (#35521) · adaeee4d
      zhangbo9674 committed
      * add pure fp16 major function in auto_cast & tracer
      
      * support master weight in dygraph for pure fp16
      
      * check mix dtype of fp16&fp32 for check_finite_and_unscale op
      
      * change pure fp16 function name
      
      * refine some bug in auto_cast
      
      * refine auto_cast interface logic
      
      * add param _casted_by_pure_fp16 for class Layer
      
      * support state_dict hook for save model by user appointed dtype in pure_fp16_decorator
      
      * refine pure_fp16_decorator as decorator
      
      * add unittest
      
      * add comment
      
      * add comment
      
      * support recompute
      
      * add comment for auto_cast and decorator
      
      * support to_static_state_dict for paddle.jit.save
      
      * remove the limit on the number of models and optimizers
      
      * add lookup_table in black_list
      
      * fix momentum and layer state_dict
      
      * fix bug in layer state_dict
      
      * fix bug in layer state_dict_helper
      
      * refine unittest
      
      * refine test_momentun_op
      
      * refine interface and some code
      
      * refine amp_decorator interface
      
      * refine pure fp16 interface
      
      * refine master weight interface
  14. 15 Sep, 2021 1 commit
  15. 14 Sep, 2021 1 commit
    • H
      Add solutions to PyLayer which is unsupported in DataParallel (#35401) · d483b8c0
      Haohongxiang committed
      * Add solutions to PyLayer which is unsupported in DataParallel
      
      * modify note format for parallel.py
      
      * modify docs of dataparallel
      
      * add docs of dp with pylayer
      
      * modify docs format
      
      * modify example format
      
      * change example of dp with pylayer
      
      * add unittest for dp with pylayer
      
      * modify ut
      
      * merge latest codes
      
      * update
      
      * modify for CI-Coverage
      
      * modify text-indent
  16. 13 Sep, 2021 1 commit
  17. 09 Aug, 2021 1 commit
  18. 05 Jul, 2021 1 commit
  19. 01 Jul, 2021 1 commit
  20. 21 Jun, 2021 1 commit
  21. 07 Jun, 2021 1 commit
  22. 03 Jun, 2021 1 commit
  23. 17 May, 2021 1 commit
  24. 06 May, 2021 1 commit
  25. 30 Apr, 2021 1 commit
  26. 26 Apr, 2021 1 commit
  27. 25 Apr, 2021 2 commits
  28. 22 Apr, 2021 1 commit
  29. 19 Apr, 2021 1 commit
  30. 07 Apr, 2021 1 commit
    • T
      Struct SparseValue && Bug Fix (#31721) · a881b4d5
      tangwei12 committed
      * add PullSparseValue for pull sparse
      
      * fix bug for PullSparseValue
      
      * add test mode in lookuptable
      
      * revert API change
      
      * add comment for is_training
  31. 02 Mar, 2021 1 commit
  32. 14 Jan, 2021 1 commit
  33. 24 Dec, 2020 1 commit
  34. 08 Dec, 2020 1 commit
  35. 30 Nov, 2020 1 commit
  36. 16 Oct, 2020 1 commit
  37. 29 Sep, 2020 1 commit