Commits · d0f50ede6c21100f6fe3cc2227dc719549e7350c · Oneflow-Inc / oneflow

07 Sep, 2018 3 commits

feat: update the data members to use RegstSlot in Actor (#1208) · d0f50ede

Committed by Niu Chong on Sep 07, 2018

* feat(register_slot): add the RegstSlot

* feat(register_slot): update RegstSlot if

* feat(actor): update member of Actor to use RegstSlot

* fix(register_slot): fix the available_regst_desc_cnt init val

* refine(register_slot): rename PushBack/PopFront, FindTheRegstDescId to TryPushBack/TryPopFront, HasRegstDescId

* feat(regst_slot): rename ForEachCurRegstDeq/ForEachCurFrontRegst to ForEachRegstDeq/ForEachFrontRegst

* feat(regst_slot): add ForChosenRegstDeq/ForChosenFrontRegst, add CHECK empty in ForEachFrontRegst

* fix(register_slot): fix the CHECK empty


Former-commit-id: 38a50de4

d0f50ede

Dev allreduce2 (#1211) · e1b30bd5

Committed by Jinhui Yuan on Sep 07, 2018

* add ReduceScatter2, ReduceAdd2, ReduceGather2 op and kernel

* add ReduceScatter2, ReduceAdd2, ReduceGather2 task node and actor

* complete Reduce2 op

* TODO: complete ReduceAdd2 kernel

* add ReduceScatter2 task to accept model_diff

* sketch of connecting ReduceScatter2/Add2/Gather2

* build allreduce2 logical graph

* connect allreduce2 task graph

* ReduceScatter2 task node

* complete ReduceAdd2, ReduceGather2 task node

* simplify ReduceAdd2 actor

* refactor ReduceAdd2 task node

* let global add -> gather share path

* separate ReduceLocalAdd2 and ReduceGlobalAdd2

* connect AllReduce2 task graph

* complete ReduceGlobalAdd2 op

* refine ReduceLocalAdd2 task node

* complete ReduceGlobalAdd2 task node

* global AllReduce2 works

* add device_num_of_each_machine to parallel_context

* simplify ReduceGlobalAdd2 runtime

* multi machine multi gpus AllReduce2 works

* add mem sharing and ctrl edge for AllReduce2

* single machine multiple gpu mem sharing works

* refine

* remove the previous allreduce

* change AllReduce2 to AllReduce variable convention

* change filename

* complete transfer to allreduce2

* remove unnecessary format change

* remove unnecessary format change

* simplify

* simplify mem sharing rule for reduce add and gather

* check for local add

* fix reduce_global_add actor bug

* refine reduce task node

* refine variable name

* refine

* refine


Former-commit-id: 5909cc43

e1b30bd5

fix bug in add kernel of allreduce (#1214) · a76f47b3
Committed by Jinhui Yuan on Sep 07, 2018
```
Former-commit-id: 34ce4862
```
a76f47b3

06 Sep, 2018 1 commit
- fix Div function (#1212) · bc8b50c2
  Committed by guo ran on Sep 06, 2018
```
Former-commit-id: 91432cb5
```
  bc8b50c2
04 Sep, 2018 5 commits

Dev hinge loss (#1207) · 9cdea308

Committed by qq_22305325 on Sep 04, 2018

* add hinge loss

* add hinge loss test

* hack hinge loss

* optimize hinge loss

* optimize hinge loss

* optimize hinge loss

* optimize hinge loss


Former-commit-id: 87db37ed

9cdea308

Dev matmul dot multiply (#1189) · 8100cf84

Committed by qq_22305325 on Sep 04, 2018

* add matmul & dot & multiply

* optimize dot kernel

* fix multiply kernel code style

* optimize matmul kernel


Former-commit-id: 6ab4006f

8100cf84

call cudnnBatchNormalizationForwardInference if trainable == false (#1197) · d5c6eecb
Committed by Li Xinqi on Sep 04, 2018
```
Former-commit-id: a21dea46
```
d5c6eecb

Dev embedding hb (#1188) · 5fa70913

Committed by qq_22305325 on Sep 04, 2018

* add embedding look up infer blob desc

* optimize infer blob desc


Former-commit-id: 6c92495a

5fa70913

Dev hinge loss (#1190) · f676d774

Committed by qq_22305325 on Sep 04, 2018

* add hinge loss

* add hinge loss test

* hack hinge loss

* optimize hinge loss

* optimize hinge loss

* optimize hinge loss

* optimize hinge loss


Former-commit-id: e2da4ecf

f676d774

03 Sep, 2018 2 commits
- split sources when infer shape (#1202) · a5f1e505
  Committed by Li Xinqi on Sep 03, 2018
```
Former-commit-id: 34fb73fe
```
  a5f1e505
- two pass to infer shape (#1200) · dd9be365
  Committed by Li Xinqi on Sep 03, 2018
```
Former-commit-id: ece6957b
```
  dd9be365
02 Sep, 2018 3 commits
- no blob copying gdb function (#1196) · 27630a89
  Committed by Li Xinqi on Sep 02, 2018
```
Former-commit-id: da21ecd6
```
  27630a89
- bugfix: bind in regst in backward task nodes (#1193) · 59b9db64
  Committed by Li Xinqi on Sep 02, 2018
```
Former-commit-id: 400cf2a6
```
  59b9db64
- fix bugs in prediction mode (#1194) · 6c7fb61c
  Committed by Jinhui Yuan on Sep 02, 2018
```
Former-commit-id: 2ebe0205
```
  6c7fb61c
01 Sep, 2018 2 commits
- gdb breakpoints function (#1192) · b84b880c
  Committed by Li Xinqi on Sep 01, 2018
```
Former-commit-id: 32053d84
```
  b84b880c
- fix reduce_gather in case of enable_mem_sharing == false (#1186) · 528aeab8
  Committed by Jinhui Yuan on Sep 01, 2018
```
Former-commit-id: ccc3b389
```
  528aeab8
31 Aug, 2018 1 commit
- fix order of shared model nodes (#1180) · 2436d1a1
  Committed by Juncheng on Aug 31, 2018
```
Former-commit-id: 28a6fc98
```
  2436d1a1
30 Aug, 2018 1 commit
- rm duplicate ReduceTaskNodes caused by ReduceConcat&Split (#1179) · 6a139c48
  Committed by Jinhui Yuan on Aug 30, 2018
```
Former-commit-id: 40c299bc
```
  6a139c48
29 Aug, 2018 1 commit

sketch of merge reduce project (#1159) · 0252bca8

Committed by Jinhui Yuan on Aug 29, 2018

* sketch of merge reduce project

* add reduce_concat, reduce_split in logical graph (#1160)

* add reduce_concat, reduce_split in logical graph

* init ReduceTaskNodes in CollectReduceTaskNodes

* add CompTaskNode for ReduceConcat & ReduceSplit

* set ReduceConcat/Split color index

* copy blob desc from ReduceConcat in to ReduceSplit out

* refine CollectReduceTaskNodes

* SetMemSharing for ReduceConcat, ReduceSplit regst

* complete ReduceConcat & ReduceSplit op

* fill ReduceConcat & ReduceSplit kernel

* simplify ReduceConcatCompActor

* make ReduceScatter & ReduceSplit as input-wise actor

* reduce_scatter & reduce_split use is_inplace

* use ByteSizeOfBlobBody for reduce related packed blob

* Fix dev merge reduce (#1168)

* check concat and split occur simultaneously

* fix ReduceScatter & ReduceSplit as Inputwise actor

* ReduceConcat & ReduceSplit works

* fix single gpu issue

* Refactor reduce (#1170)

* backup, not complete yet

* remove reduce_id

* rm useless comment

* add reduce_graph (#1169)

* add reduce_graph

* fix iter

* add IsLogicalNodeMergeable and fix bug

* remove needless constructor calls

* node VisualStr may conflict, using node_id_str instead

* reduce group works (#1171)

* refine

* sort nodes in topo (#1172)

* add reduce_group_size in job_conf, fix 121 config of ReduceSplit and MdUpdt

* resolve code review issues (variable names)

* refine variable names

* Dev merge reduce rename reduce group (#1174)

* ReduceGraph=>ChainLogicalGraph

* rename Group=>Chain

* reformat

* use pointer instead of reference for mutable argument

* format change

* worker node only pull sub_plan (#1176)

* log compile time

* use c++11 member initialization syntax

* FixPackedBlobDescOfProducedRegst for ReduceSplit

* Dev merge reduce refine chain logical graph (#1177)

* remove IsMerageable

* split TryMergeOneChain and rename to TryMergeTwoChains

* reformat

* resolve review issues


Former-commit-id: 3aa79c70

0252bca8

27 Aug, 2018 1 commit
- fix issue: not unbind bn with empty regst (#1166) · 216c4585
  Committed by Jinhui Yuan on Aug 27, 2018
```
Former-commit-id: dc6fbefc
```
  216c4585
25 Aug, 2018 2 commits
- Build task node in topological order (#1162) · 995e2196
  Committed by Jinhui Yuan on Aug 25, 2018
```
Former-commit-id: a8b7dedb
```
  995e2196
- Refactor infer blob desc (#1161) · aa5bee95
  Committed by Jinhui Yuan on Aug 25, 2018
```
* refactor EraseEmptyRegst (no dependence on weak_ptr)

* weak_ptr -> shared_ptr

* refine


Former-commit-id: e585bba0
```
  aa5bee95
24 Aug, 2018 3 commits
- Dev gdb copy blob (#1158) · 362ae2bc
  Committed by Li Xinqi on Aug 24, 2018
```
* gdb copy blob

* make BnInOp2Blob called by gdb easily


Former-commit-id: ee70abf7
```
  362ae2bc
- multi thread build chain_act_sub_graph (#1155) · 496a3781
  Committed by strickland12 on Aug 24, 2018
```
Former-commit-id: 55b46427
```
  496a3781
- refine_init_bitset (#1157) · 4581d147
  Committed by strickland12 on Aug 24, 2018
```
* use resize()

* use .size to calc bitset_num


Former-commit-id: 400e277e
```
  4581d147
22 Aug, 2018 3 commits
- gdb copy blob (#1152) · a52278cd
  Committed by Li Xinqi on Aug 22, 2018
```
Former-commit-id: 3612c581
```
  a52278cd
- Experiment Only In Relay Placement (#1149) · 073a1682
  Committed by strickland12 on Aug 22, 2018
```
* if UseRelayPlacement

* judge if there is only one gpu parallel_conf

* refine

* fix naive error


Former-commit-id: 3ea8ae21
```
  073a1682
- Avoid Infer Unnecessary Register Number (#1148) · cc8f36a9
  Committed by strickland12 on Aug 22, 2018
```
* use Special judgment in InitNodeProducedRegstAct

* abandon kMdUpdtArea ActEvents


Former-commit-id: e853cef6
```
  cc8f36a9
21 Aug, 2018 1 commit

Dev refine runtime (#1147) · 8e16abdd

Committed by Jinhui Yuan on Aug 21, 2018

* clear act_event_logger act_event_bin_filename

* cluster_thrd_ids_key

* simplify ofrecord_decoder multi-thread

* let decoder use AllocateCpuThrdIdEvenly

* let ofrecord_decoder use local thread pool


Former-commit-id: a4860e5b

8e16abdd

20 Aug, 2018 6 commits

bugfix bn trainable==false (#1143) · a8bb2028
Committed by Li Xinqi on Aug 20, 2018
```
Former-commit-id: 9f01aa33
```
a8bb2028
move cudnn_conv_ctx_cache to device directory (#1141) · 33244851
Committed by Jinhui Yuan on Aug 20, 2018
```
Former-commit-id: cae14ff3
```
33244851

Speedup conv algo select (#1140) · 0db85699

Committed by Jinhui Yuan on Aug 20, 2018

* caching the cudnn conv algorithm to eliminate duplicate calculation

* refine cudnn conv algo ctx cache


Former-commit-id: ccb7f43b

0db85699

add CHECK node2ancestors->emplace (#1139) · de24c7b6
Committed by strickland12 on Aug 20, 2018
```
Former-commit-id: b611c93d
```
de24c7b6

rm collect kMdUpdtArea Ancestor (#1137) · 0eecd0e1

Committed by strickland12 on Aug 20, 2018

* rm collect kMdUpdtArea Ancestor

* refine AddOrderingCtrlEdgeInSameChain

* mv ChainGraph to SetChainIdAndOrderInGraphForEachNode

* rm task_node ancestors

* rm emplace()


Former-commit-id: 691704b7

0eecd0e1

Partition plan (#1138) · 14f7137a

Committed by Jinhui Yuan on Aug 20, 2018

* fix typo

* rm useless IsThisMachineMaster

* refine the var name of naive_plan, mem_shared_plan, improved_plan

* refactor PushPlan and PullPlan

* let master node broadcast subplans instead of the whole plan

* remove useless code

* rm useless code

* use total_mbn_name_key


Former-commit-id: b21c190b

14f7137a

19 Aug, 2018 5 commits

Dev trainable false (#1132) · 0d2010a1

Committed by Li Xinqi on Aug 19, 2018

* backpropagate model_diff only if is trainable

* bugfix: consume bw task node only if trainable

* bugfix: connect md_updt and bw_node when bw_node is not null

* bugfix: md_updt enter HandlerNormal only if there is model to train

* set all op trainable = false when predicting


Former-commit-id: be213666

0d2010a1

rm MdUpdt chain merge (#1135) · 5273e91e

Committed by strickland12 on Aug 19, 2018

* rm MdUpdt chain merge

* use area_id == kMdUpdtArea

* rm judgement

* refine IsSubset


Former-commit-id: f9fe1ee0

5273e91e

fix wrong ActNum of ctrl regst produced by AccCompActor (#1136) · 81b38173
Committed by Jinhui Yuan on Aug 19, 2018
```
Former-commit-id: 3654d164
```
81b38173

refine act_id order condition (#1088) · 48cef972

Committed by Jinhui Yuan on Aug 19, 2018

* refine act_id order condition

* strict act id check (excluding model regst)

* add TODO: figure out the ActNumForEachOutput of model regsts to MdSave area


Former-commit-id: 5be84c50

48cef972

remove blob_inited check (#1130) · 3d58b602

Committed by Jinhui Yuan on Aug 19, 2018

* remove blob_inited check

* fix inplace feature of reduce add actor and kernel

* rm useless code

* add EnableInplace, support CPU allreduce


Former-commit-id: 40a9b9a5

3d58b602

Oneflow-Inc / oneflow · last synced more than 2 years ago