Commits · 809ac03656712744d6dea7a6268aeeea46b6f12e · BaiXuePrincess / Paddle

27 Apr, 2021 · 1 commit
- Revert "[PsCore] optimize performance of large kv (#32535)" (#32599) · 809ac036
  Committed by tianshuo78520a on Apr 27, 2021
```
This reverts commit 4b7242b0.
```
  809ac036
26 Apr, 2021 · 1 commit
- [PsCore] optimize performance of large kv (#32535) · 4b7242b0
  Committed by Thunderbrook on Apr 26, 2021
```
* optimize pull sparse

* optimize pull sparse

* change macro

* format
```
  4b7242b0
15 Apr, 2021 · 1 commit

heterps support pscore (#32093) · 9f8c8f96

Committed by Thunderbrook on Apr 15, 2021

* pscore support heterps

* fleet cmake

* fleet wrapper

* macro

* solve conflict

* solve conflict

* add unitest

* paddle enforce

* unitest

* unitest

* unitest

9f8c8f96

18 Mar, 2021 · 1 commit
- 【Paddle.Fleet】Fix one ps gradient clip (#31664) · 09482dde
  Committed by Chengmo on Mar 18, 2021
```
* fix one ps gradient clip
```
  09482dde
24 Feb, 2021 · 1 commit

fix entry (#31079) · ebbdf525

Committed by tangwei12 on Feb 24, 2021

* fix entry

* fix distributed lookup table fuse case

* fix entry bug at first time

* move entry from paddle.fluid -> paddle.distributed

* fix ut with paddle.enable_static()
Co-authored-by: malin10 <malin10@baidu.com>

ebbdf525

08 Jan, 2021 · 1 commit
- 【Paddle.Fleet】Fix tensor table (#30075) · 528e03fc
  Committed by Chengmo on Jan 08, 2021
```
* add tensor table
```
  528e03fc
24 Dec, 2020 · 1 commit

[Feature] one ps (3/4) (#29604) · 032414ca

Committed by tangwei12 on Dec 24, 2020

* oneps (3/4)
Co-authored-by: MrChengmo <cmchengmo@163.com>
Co-authored-by: malin10 <malin10@baidu.com>
Co-authored-by: chengmo <chengmo@baidu.com>

032414ca

02 Dec, 2020 · 1 commit

Add pure fp16 training with master weights. (#27712) · be3777a5

Committed by Zhen Wang on Dec 02, 2020

* add the weight decay func for the momentum op

* Add the multi_precision function in Momentum Optimizer.

* Make sure that the initial value of master weights are same with the fp16 weights.

* add static loss scaling.

* add the rescale_grad function in the pure fp16 training.

* use the original momentum updating method.

* Polish some codes, such as variable names.

* add docstring for apis.

* update the var creation details of _create_master_weight.

* not modify codes about imperative momentum updating.

* Fix the error of test_dist_sparse_tensor_load_momentum UT.

* add unit test for multi precision fp16 training.

* add more unit tests for CI.

* Use lower threshold values for allclose comparing in test_multi_precision_fp16_train UT.

* For CI Coverage Checking.

be3777a5

30 Nov, 2020 · 1 commit

Update ps gpu (#29209) · b5c63423

Committed by 123malin on Nov 30, 2020

* fix paramete prefetch & device guard
Co-authored-by: MrChengmo <cmchengmo@163.com>
Co-authored-by: chengmo <chengmo@baidu.com>

b5c63423

28 Oct, 2020 · 1 commit
- 【Paddle.Fleet】Fix fleetrun heter (#28252) · 4dc8c44b
  Committed by Chengmo on Oct 28, 2020
```
* fix fleetrun heter ps on paddlecloud
```
  4dc8c44b
15 Oct, 2020 · 1 commit

【paddle.fleet】geo send sparse optimize (#27719) · aa3b4ed7

Committed by 123malin on Oct 15, 2020

* test=develop, fix geo sgd communicator

* test=develop, gloo_init_method

* test=develop, bug fix for gloo http_init

aa3b4ed7

14 Oct, 2020 · 1 commit
- 【paddle.fleet】fix sparse load (#27680) · 328cb289
  Committed by Chengmo on Oct 14, 2020
```
* add sparse tensor load method
```
  328cb289
13 Oct, 2020 · 1 commit

【paddle.fleet】Update fleetrun & ps-heter (#27472) · c5f2802d

Committed by Chengmo on Oct 13, 2020

* refine fleetrun.ps_launch

* update fleet run for multi device support

* ps_graph support ps-gpu

* fix heter save

* add heter save unittest

* fix unittest & simple code

* update fleetrun

* fix fleetrun

* fix launch barrier

* fix role maker

* add paddlecloud rolemaker unittest

* rename heter_worker_device_guard

c5f2802d

29 Sep, 2020 · 1 commit
- test=develop, optimize geo communicator (#26857) · cc780b19
  Committed by 123malin on Sep 29, 2020
```
* test=develop, optimize geo communicator 
```
  cc780b19
23 Sep, 2020 · 1 commit

large scale kv speedup (#26510) · bc5f0246

Committed by tangwei12 on Sep 23, 2020

* rename communicator meet->BatchesCounter

* fix parame recv for sparse

* geo sparse init from pserver

* optimize init from pserver

* add large scale optimizer fuse(SGD/ADAM)

* rectification init_worker and exe.run startup program

bc5f0246

20 Sep, 2020 · 1 commit

【paddle.fleet】Fix/role maker api fix (#27326) · d6b54de4

Committed by tangwei12 on Sep 20, 2020

* fix fleet util and gloo

* fix worker endpoints

* fix

* fix UT

* fix gloo

* fix gloo

* update gloo

* update gloo

* update gloo

* update gloo

* update gloo

* fix gloo wrapper for hdfs

* add file gloo and UT

* fix UT

* fix UT

* fix UT

* hide public method of RoleMaker

* fix UT

* GPU fleetrun support gloo

* parameterserver fleetrun support gloo

* add UT

* add UT

* fix UT

* fix get server endpoint

* fix get server endpoint

* fix UT

* hide public method of rolemaker

* hide public method of rolemaker

* hide public method of rolemaker

* Update test_fleet_rolemaker_new.py

* hide public method of rolemaker

* hide public method of rolemaker

d6b54de4

08 Sep, 2020 · 1 commit
- 【paddle.fleet】parameter_server_optimizer support auto_strategy (#26838) · f2d68d3e
  Committed by 123malin on Sep 08, 2020
```
* test=develop, add ps auto
```
  f2d68d3e
04 Sep, 2020 · 1 commit
- fix Heter Ps multi thread (#26876) · c4846196
  Committed by Chengmo on Sep 04, 2020
```
* fix heter-ps multi thread
```
  c4846196
02 Sep, 2020 · 1 commit
- supplement bug fix of parameter server (#26217) · d0962abd
  Committed by Chengmo on Sep 02, 2020
```
* fix fluid.embedding
```
  d0962abd
30 Aug, 2020 · 1 commit
- 【paddle.fleet】Support Heter Parameter Server (#25998) · 7f2aa2db
  Committed by Chengmo on Aug 30, 2020
```
* Support Heter Parameter Server
```
  7f2aa2db
21 Aug, 2020 · 1 commit
- fix decay global counter (#26387) · 8e4ed662
  Committed by tangwei12 on Aug 21, 2020
```
* fix decay global counter

* remove unused print, test=distp0
```
  8e4ed662
07 Aug, 2020 · 1 commit
- Fix/large scale fix (#25999) · 3755564a
  Committed by tangwei12 on Aug 07, 2020
```
* fix large scale KV 
* fix single training using async ssa graph
```
  3755564a
30 Jul, 2020 · 1 commit

Integrated Trainer of Parameter Server (API add... · caa90a65

Committed by tangwei12 on Jul 30, 2020

Integrated Trainer of Parameter Server (API add `fluid.contrib.layers.sparse_embedding` only) (#22957)

* Integrated Trainer of Parameter Server

caa90a65

BaiXuePrincess / Paddle · In sync with the fork's source project