Commits · f5c6dd6def835d912624ebfcf183b2732c949eea · BaiXuePrincess / Paddle

14 May, 2020 1 commit
- S
  
  test=develop (#24522) · f5c6dd6d
  Committed by swtkiwi on May 14, 2020
  
  f5c6dd6d
26 Apr, 2020 1 commit

Simplify Program printing code to improve debugging efficiency (#23918) · 25a233e4

Committed by Chen Weihang on Apr 26, 2020

* add to_readable_code method, test=develop

* polish doc details, test=develop

* polish doc note, test=develop

* fix unittest error, test=develop

* fix coverage, test=develop

* add print test, test=develop

* add print param, test=develop

* hidden to_readable_code api, test=develop

* remove original tool methods, test=develop

* remove old api using code, test=develop

25a233e4

18 Mar, 2020 1 commit
- T
  
  fix bug at sync with communicator (#23077) · 853f2e52
  Committed by tangwei12 on Mar 18, 2020
  
  853f2e52
28 Feb, 2020 1 commit
- T
  
  fix typo word (#22784) · 433cef03
  Committed by tianshuo78520a on Feb 28, 2020
  
  433cef03
25 Feb, 2020 1 commit

PaddleBox Framework Part2 (#22466) · 175954d8

Committed by hutuxian on Feb 25, 2020

* Add two types of Metric Calculator: MultiTaskCalculator & CmatchRankCalculator.
* Add a config option for the DynamicAdjustChannelNum function that controls whether the remaining instances are discarded when they cannot be distributed evenly.
* Remove the CPU code in Pull/PushSparse; it will be added back once it has been fully tested.
* Fix some known issues, such as copying persistable vars after one epoch of running.

175954d8

23 Feb, 2020 1 commit
- T
  
  fix typo words (#22653) · d2ba91aa
  Committed by tianshuo78520a on Feb 23, 2020
  
  d2ba91aa
22 Feb, 2020 1 commit
- T
  SYNC with communicator (#22344) · 66a31501
  Committed by tangwei12 on Feb 22, 2020
```
* add sync communicator and implement
```
  66a31501
15 Feb, 2020 1 commit
- T
  deprecated for distribute transpiler api (#22513) · 948299ae
  Committed by tangwei12 on Feb 15, 2020
```
* add deprecated for distribute transpiler, will delete it after 2.0.0, test=develop
```
  948299ae
17 Jan, 2020 1 commit
- T
  integrated HALF_ASYNC to communicator (#21869) · 82bc814a
  Committed by tangwei12 on Jan 17, 2020
```
* add half_async in the communicator
* fix DistributedStrategy
```
  82bc814a
13 Jan, 2020 1 commit
- 1
  Bug fix for sparse recorder (#21969) · 985bceac
  Committed by 123malin on Jan 13, 2020
```
* test=develop, bug fix for sparse recorder
```
  985bceac
07 Jan, 2020 2 commits
- C
  Update pyramid related OP (#21372) · 418abc92
  Committed by Chengmo on Jan 07, 2020
```
* add special way to add distribute vars, update Pyramid hash op
```
  418abc92
- C
  Fix grad clip (#21784) · 5c339193
  Committed by Chengmo on Jan 07, 2020
```
* fix grad clip, clip op belongs to Backward op when running in Parameter Server mode.
```
  5c339193
06 Jan, 2020 1 commit
- 1
  add distributed_strategy (#21710) · 7fb817d4
  Committed by 123malin on Jan 06, 2020
```
* add distributed_strategy
```
  7fb817d4
12 Dec, 2019 1 commit

memory leak for cpu (#21174) · 9ad940fd

Committed by tangwei12 on Dec 12, 2019

* add fake init for the trainer, fix large memory hold in the trainer
* do not merge recv vars from a remote endpoint, test=develop
* add recv and save op, merge slice var in one op, save memory
* remove hsigmoid with pull sparse, test=develop

9ad940fd

06 Dec, 2019 1 commit
- H
  Paddlebox Related to Framework (#21586) · c5aec2fe
  Committed by hutuxian on Dec 06, 2019
```
* Add a single_process_multi_thread transpiler.
* Add some UTs.
* Fix some API description.
```
  c5aec2fe
28 Nov, 2019 1 commit
- K
  add Adam beta1/beta2 support Variable (#21234) · ebfb720a
  Committed by Kaipeng Deng on Nov 28, 2019
```
* add Adam beta1/beta2 support Variable. test=develop
```
  ebfb720a
01 Nov, 2019 1 commit
- 1
  Optimize decay (#20816) · 20cdff0e
  Committed by 123malin on Nov 01, 2019
```
* update pserver decay blocks

* update distributed notify handler
```
  20cdff0e
17 Oct, 2019 1 commit
- T
  fix fetch handler error with pslib (#20679) · 1d925440
  Committed by tangwei12 on Oct 17, 2019
```
* fix fetch handler error with pslib
* fix distributed lookup table op with 1 pserver
```
  1d925440
15 Oct, 2019 2 commits

Fix communicator slow bug & fix communicator stop bug (#20366) · 940c6ff1

Committed by Chengmo on Oct 15, 2019

* test=develop,Fix communicator slow bug

* test=develop, delete if() in stop_worker()

* test=develop

* fix UT, test=develop

* fix bug in fetch handler, test=develop

* fix bug in fetch handler, test=develop

* test=develop, fix fetch barrier bug

* test=develop, bug fix

* test=develop, bug fix

* test=develop, fix bug

940c6ff1

1
bug fix: invalid learning rate decay in pserver async mode (#20325) · b4a3b750
Committed by 123malin on Oct 15, 2019
```
* bug fix: invalid learning rate decay in pserver async mode
```
b4a3b750

11 Oct, 2019 1 commit
- T
  doc fix, test=develop, test=document_fix (#20239) · a010d883
  Committed by tangwei12 on Oct 11, 2019
```
* doc fix, test=develop, test=document_fix
```
  a010d883
09 Oct, 2019 1 commit
- C
  Fix transpiler en doc (#20149) · 494d6cf2
  Committed by Chengmo on Oct 09, 2019
```
* test=develop,test=document_fix,fix transpiler doc,add API.spec
```
  494d6cf2
07 Oct, 2019 2 commits
- C
  Speed GEO-SGD (#20158) · eb05db71
  Committed by Chengmo on Oct 07, 2019
```
* delete debug vlog & add rpc function & fix word2vec bug & speed GEO-SGD
```
  eb05db71
- T
  Trainer heartbeat for async mode (#19600) · b5a41046
  Committed by tangwei12 on Oct 07, 2019
```
Heartbeat for distributed async training.
```
  b5a41046
30 Sep, 2019 2 commits
- C
  Add GEO-SGD distribute training algorithm (#20018) · 728ec1b4
  Committed by Chengmo on Sep 30, 2019
```
* refactor geo sgd & communicator
```
  728ec1b4
- Z
  Add deprecated memory optimize doc (#20111) · 5f2290ab
  Committed by Zeng Jinle on Sep 30, 2019
```
* add deprecated memory optimize doc, test=develop, test=document_fix

* merge develop to solve conflict, test=develop, test=document_fix
```
  5f2290ab
26 Sep, 2019 1 commit
- 1
  fix APIs, test=document_preview (#19954) · 6c74e738
  Committed by 123malin on Sep 26, 2019
```
* fix DistributeTranspilerConfig document, test=develop
```
  6c74e738
16 Sep, 2019 1 commit
- T
  fix sync_with_distributed_lookup_table, test=develop (#19737) · 6a1db204
  Committed by tangwei12 on Sep 16, 2019
```
fix wrong place with distributed_lookup_table
```
  6a1db204
06 Sep, 2019 1 commit
- 1
  Optimize fleet API: add input check for some interfaces (#18971) · a25a716e
  Committed by 123malin on Sep 06, 2019
```
* fleet api add input check, test=develop
```
  a25a716e
28 Aug, 2019 2 commits

Y
adapt fleet api for localsgd and support nccl comm configuration in executor (#19443) · 4ef6b845
Committed by Yi Liu on Aug 28, 2019
```
test=develop
```
4ef6b845

Fix the correctness of async mode at distributed training (#18863) · 65c73684

Committed by tangwei12 on Aug 28, 2019

* fix correctness of the communicator

* fix a bug in send thread when sending var context is empty, test=develop

* add lookup_table_prefetch_op and prefetch optimize, test=develop

* remove GPU support for remote prefetch

* word2vec force with CPU, test=develop

* test dist remote lookup table force with CPU, test=develop

65c73684

26 Aug, 2019 1 commit
- T
  fix distribute transpiler GRPC error code 4, RPC Deadline (#18984) · 19dac67e
  Committed by tangwei12 on Aug 26, 2019
```
* fix sync mode hang in transpiler
* remove sync mode in send/recv
* replace PADDLE_ENFORCE with PADDLE_ENFORCE_NE
```
  19dac67e
16 Aug, 2019 1 commit

remove unused inference_transpiler unit-tests (#19130) · 2f8c7e02

Committed by Tao Luo on Aug 16, 2019

* remove unused inference_transpiler unit-tests

test=develop

* remove InferenceTranspiler usage in quantize_transpiler.py

test=develop

2f8c7e02

12 Aug, 2019 1 commit
- G
  Polish fleet API to support cuda collective mode and nccl2 mode. (#18966) · 29d87812
  Committed by gongweibao on Aug 12, 2019
```
Polish fleet API to support cuda collective mode and nccl2 mode
```
  29d87812
10 Aug, 2019 1 commit

Try to deprecate unstable python memory optimize (#18983) · c194b0c8

Committed by Zeng Jinle on Aug 10, 2019

* deprecate python memory optimize, test=develop

* remove memory_optimize in unittests, test=develop

* add unittests to deprecated interfaces, test=develop

c194b0c8

29 Jul, 2019 1 commit

Remove legacy C++ memory optimization codes (#18834) · 8008ab4e

Committed by Zeng Jinle on Jul 29, 2019

* remove legacy memory optimization codes, test=develop

* follow huihuang's comments,test=develop

* follow luotao's comments, test=develop

8008ab4e

23 Jul, 2019 1 commit

supports distributed classification (#18690) · 157211c4

Committed by Yi Liu on Jul 23, 2019

* supports distributed classification training
* update API.spec
* fix evenly division in python3
* change "index_range" to "index_num" in shard_index operator
test=document_preview
test=develop

157211c4
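
The `shard_index` change and the Python 3 division fix listed above can be illustrated with a minimal pure-Python sketch. It assumes the operator splits `index_num` label indices into `nshards` contiguous ranges and maps indices outside the current shard to an `ignore_value`; the function below is a hypothetical stand-in written for illustration, not Paddle's implementation.

```python
def shard_index(indices, index_num, nshards, shard_id, ignore_value=-1):
    """Hypothetical sketch: map global indices to shard-local indices."""
    if not 0 <= shard_id < nshards:
        raise ValueError("shard_id must be in [0, nshards)")
    # Ceiling division done with the integer operator `//`.
    # In Python 3, `/` would return a float here, which is the kind of
    # bug an "even division" fix typically addresses.
    shard_size = (index_num + nshards - 1) // nshards
    return [
        idx % shard_size if idx // shard_size == shard_id else ignore_value
        for idx in indices
    ]


labels = [1, 6, 12, 19]
print(shard_index(labels, index_num=20, nshards=2, shard_id=0))  # [1, 6, -1, -1]
print(shard_index(labels, index_num=20, nshards=2, shard_id=1))  # [-1, -1, 2, 9]
```

Passing the total count (`index_num`) rather than a range lets each shard derive its own bounds from `shard_id`, which is one plausible reason for the rename.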

22 Jul, 2019 1 commit
- T
  do some odd jobs (#18641) · d8458483
  Committed by tangwei12 on Jul 22, 2019
```
do some odd jobs, test=develop
```
  d8458483
11 Jul, 2019 1 commit
- G
  
  Polish backwards optimizer dependency codes and use more default values. (#18255) · c0a82748
  Committed by gongweibao on Jul 11, 2019
  
  c0a82748
02 Jul, 2019 1 commit

supports collective training with programs (#18392) · a873fa84

Committed by Yi Liu on Jul 02, 2019

1. Since the allreduce op has four reduce types, we split it into four ops, one per reduce type.
2. We also refined the collective op code: for example, we separated the collective op kernel into CPUKernel and CUDAKernel, and removed the device-specific DeviceContext template parameter, since the target DeviceContext is already known.
3. We removed the newly added Collective op role to reduce the complexity of program and graph analysis.

a873fa84
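
As a rough illustration of the first point in the commit above, the sketch below simulates an allreduce across ranks in plain Python and exposes one function per reduce type instead of a single op with a mode argument. The four reduce types are assumed to be sum, prod, max and min; all function names are hypothetical and do not refer to Paddle's operators.

```python
from functools import reduce
import operator


def make_allreduce(binary_op):
    """Build an allreduce for one fixed reduce type (illustrative only)."""
    def allreduce(rank_buffers):
        # Combine the buffers element-wise across ranks...
        reduced = [reduce(binary_op, column) for column in zip(*rank_buffers)]
        # ...then give every rank its own copy of the reduced result.
        return [list(reduced) for _ in rank_buffers]
    return allreduce


# One dedicated "op" per reduce type, so no mode needs checking at run time.
allreduce_sum = make_allreduce(operator.add)
allreduce_prod = make_allreduce(operator.mul)
allreduce_max = make_allreduce(max)
allreduce_min = make_allreduce(min)

buffers = [[1, 2, 3], [4, 5, 6]]  # two simulated ranks
print(allreduce_sum(buffers))   # [[5, 7, 9], [5, 7, 9]]
print(allreduce_prod(buffers))  # [[4, 10, 18], [4, 10, 18]]
```

With a separate op per reduce type, each op's behaviour is fixed by its name, which keeps program and graph analysis simpler than inspecting an attribute on a shared op.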

BaiXuePrincess / Paddle is in sync with the fork's source project
