Commits · e64fed86c8c09cd40e62e6312665f8b35e20c374 · Crayon鑫 / Paddle

16 Sep, 2021 1 commit

Committed by wangguanqun on Sep 16, 2021

* add trainer desc config to distributed strategy

* code style modified

* data_feed set lod

* fix bug

* code style

* fix bug

e64fed86

25 May, 2021 1 commit
- fix hogwild_worker init_place bug (#33078) · 88dfb30f
  Committed by danleifeng on May 25, 2021
```
* fix hogwild_worker dev_ctx place bug; test=develop
```
  88dfb30f
08 May, 2021 1 commit
- [heterps] support cuda11 for heterps; add profiler in oneps (#32640) · beab9563
  Committed by danleifeng on May 08, 2021
```
* add trainprofiler for heterps in oneps; test=develop

* add set_use_ps_gpu; test=develop
```
  beab9563
01 Apr, 2021 1 commit
- LOG CLEAN (#31819) · 0589ed21
  Committed by tangwei12 on Apr 01, 2021
```
* upgrade vlog

* train from dataset fetch optimize
```
  0589ed21
04 Feb, 2021 1 commit
- use iwyu clean include second time, test=develop (#30829) · 35c5b23f
  Committed by wanghuancoder on Feb 04, 2021
```
* use iwyu clean include second time, test=develop
```
  35c5b23f
12 Jan, 2021 1 commit

Fix/distributed proto (#29981) · 25f80fd3

Committed by tangwei12 on Jan 12, 2021

* rename sendrecv.proto to namespace paddle.distributed

* split ps with distributed

25f80fd3

24 Dec, 2020 1 commit

[Feature] one ps (3/4) (#29604) · 032414ca

Committed by tangwei12 on Dec 24, 2020

* oneps (3/4)
Co-authored-by: MrChengmo <cmchengmo@163.com>
Co-authored-by: malin10 <malin10@baidu.com>
Co-authored-by: chengmo <chengmo@baidu.com>

032414ca

14 Oct, 2020 1 commit

Multi task (#26002) · 5a83496c

Committed by zhang wenhui on Oct 14, 2020

* add multitask

* add multitask, test=develop

* fix code style, test=develop

* add partial push dense, test=develop

* fix has_key in py3, test=develop

* fix, test=develop

* fix, test=develop

* fix, test=develop

5a83496c

28 Jul, 2020 1 commit
- Polish framework error message part3 (#25701) · c34c80d3
  Committed by Chen Weihang on Jul 28, 2020
```
* polish framework error message part3

* polish details

* fix error message print error
```
  c34c80d3
03 Jun, 2020 1 commit
- add try_catch mechanism to downpour_worker, print all parameters of the program (#24700) · 9d2bd0ac
  Committed by 123malin on Jun 03, 2020
```
* test=develop, add try_catch for debug
```
  9d2bd0ac
19 May, 2020 1 commit

Random Dump (#24477) · 0ec3a42e

Committed by hutuxian on May 19, 2020

* Refactor code for dump_field & dump_param: abstract the common function into the base class.
* Support dumping randomly & randomly with lineid
* Support specifying the random interval, which avoids printing too many logs.

0ec3a42e

02 Apr, 2020 1 commit
- fix stat var in hogwild worker (#23367) · 93ea9dd2
  Committed by xujiaqi01 on Apr 02, 2020
```
* fix stat var in hogwild worker
* test=develop
```
  93ea9dd2
17 Feb, 2020 1 commit
- support dumping params/grads in transpiler mode (#22490) · 00594c1c
  Committed by 123malin on Feb 17, 2020
  
  00594c1c
17 Jan, 2020 1 commit
- integrated HALF_ASYNC to communicator (#21869) · 82bc814a
  Committed by tangwei12 on Jan 17, 2020
```
* add half_async in the communicator
* fix DistributedStrategy
```
  82bc814a
30 Aug, 2019 1 commit

add thread scope stat accurate metrics test=develop (#19480) · 10ca3f96

Committed by yaoxuefeng on Aug 30, 2019

* add thread scope stat accurate metrics test=develop

* fix style

* fix style

* fix style

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix style test=develop

* fix conflict

* fix style

* fix style test=develop

* fix error test=develop

* fix error test=develop

10ca3f96

23 Jul, 2019 1 commit

support patch data, add load_one_table, fix bug (#18509) · d18aabb4

Committed by jiaqi on Jul 23, 2019

(1) support patch data (merge slots of instances with the same line id; modify a dense layer that changes its size)
(2) add fleet load_one_table interface, which supports loading from a paddle model and from a pslib model
(3) fix push sparse bug that caused push sparse to take more time (about 10% in my test case)
(4) when some slots are not in one of your networks (join/update, etc.), data feed, collect label info, and push/pull sparse skip these slots instead of throwing an error
(5) add more debug info in TrainFilesWithProfiler

d18aabb4

24 May, 2019 1 commit
- polish_executor_and_add_ctx_cache (#17536) · 7f8bc49d
  Committed by guru4elephant on May 24, 2019
```
* polish_executor_and_add_ctx_cache
```
  7f8bc49d
15 May, 2019 1 commit

add save/load model, shrink table, cvm, config file & fix pull dense bug (#17118) · 66d51206

Committed by jiaqi on May 15, 2019

* add save/load model, shrink table, cvm, config file & fix pull dense bug
test=develop

* fix global shuffle bug, fix pull dense bug, fix release memory bug, fix shrink error
add client flush, add get data size
test=develop

* fix global shuffle bug
test=develop

* fix global shuffle bug
test=develop

* fix code style
test=develop

* fix code style & modify pslib cmake
test=develop

* fix error of _role_maker
test=develop

* fix code style
test=develop

* fix code style
test=develop

* fix code style
test=develop

* fix code style
test=develop

* fix code style
test=develop

* fix windows compile error of fleet
test=develop

* fix global shuffle bug

* add comment
test=develop

* update pslib.cmake
test=develop

* fix fill sparse bug
test=develop

* fix push sparse bug
test=develop

66d51206

29 Mar, 2019 13 commits
- fix async_executor problem and remove some unnecessary testcase, fix trainer_desc import problem · d739bab8
  Committed by dongdaxiang on Mar 28, 2019
```
test=develop
```
  d739bab8
- add infer_from_dataset for inference · 60b7bf6f
  Committed by dongdaxiang on Mar 28, 2019
  
  60b7bf6f
- refine print fetch list · 6bf796df
  Committed by dongdaxiang on Mar 21, 2019
  
  6bf796df
- add fetch var function · 68d7bf3d
  Committed by dongdaxiang on Mar 20, 2019
```
test=develop
```
  68d7bf3d
- add data_generator into paddle.fluid.incubate.data_generator, add op run log... · 73b1f396
  Committed by dongdaxiang on Mar 17, 2019
```
add data_generator into paddle.fluid.incubate.data_generator, add op run log in hogwild_device_worker and downpour_device_worker
test=develop
```
  73b1f396
- add training speed log · 73544e8b
  Committed by dongdaxiang on Mar 15, 2019
  
  73544e8b
- add IO percent for multi_trainer · 9419de52
  Committed by dongdaxiang on Mar 15, 2019
  
  9419de52
- add trainfileswithprofiler for downpour worker · 6af697ad
  Committed by dongdaxiang on Mar 15, 2019
  
  6af697ad
- fix data reading bugs in api, add VLOG(3) log for setup · b66f0074
  Committed by dongdaxiang on Mar 10, 2019
  
  b66f0074
- add printer for fetch variable · cf136064
  Committed by dongdaxiang on Feb 18, 2019
  
  cf136064
- fix class register problem · 39014b9f
  Committed by dongdaxiang on Feb 02, 2019
  
  39014b9f
- refine device_worker and trainer code · c1650120
  Committed by dongdaxiang on Feb 02, 2019
```
test=develop
```
  c1650120
- add dist_multi_trainer for distributed training, add trainer_factory and... · 855bf579
  Committed by dongdaxiang on Jan 28, 2019
```
add dist_multi_trainer for distributed training, add trainer_factory and device_worker_factory so that we can easily extend new training mode, add pull dense worker which is a singleton for parameter fetching
```
  855bf579
