Commits · 380d2414d2ebd45dbd04b9a22a3241098790aec3 · Oneflow-Inc / oneflow

01 Nov, 2021 1 commit

Change maybe to optional (#6611) · 380d2414

Committed by Zhanghuihong Guan on Nov 01, 2021

* initial commit, add code for async construct tensor from numpy array

* initial commit to change Maybe to Optional

* delete redundant code

* replace Maybe with Optional

* fix compile errors

* format code

* changes based on review

* format code, fix based on review

* format code

* fix multiclient type

* changes based on review

* changes based on review

* unify calling to IsMultiClient

* refactor multi_client related code

* restore InMultiClient interface

* double check for unnecessary changes

* remove unnecessary changes

* format code

* Update oneflow/api/python/symbol/job_conf_symbol.cpp

* Update oneflow/api/python/symbol/op_conf_symbol.cpp

* Update oneflow/api/python/symbol/op_node_signature_symbol.cpp

* Update oneflow/core/common/optional.h

* Update oneflow/api/python/symbol/string_symbol.cpp

* Update oneflow/api/python/symbol/scope_symbol.cpp

* Update oneflow/api/python/symbol/placement_symbol.cpp

* Update oneflow/api/python/symbol/op_conf_symbol.cpp
Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com>
Co-authored-by: Twice <i@twice.moe>

380d2414

18 Oct, 2021 1 commit

slice_boxing reuse mem (#6549) · 6fc666ab

Committed by guo ran on Oct 18, 2021

6fc666ab

15 Oct, 2021 1 commit

refactor slice boxing (#6413) · fadf39c6

Committed by guo ran on Oct 15, 2021

* refactor slice boxing

* refine

* slice boxing node use compute stream

* refine

* refine

* refine

* refine

fadf39c6

23 Sep, 2021 2 commits

Lazy build tensor compatible to nd_sbp (#6335) · 264f9b9c

Committed by leaves-zwx on Sep 23, 2021

* LazyBuildTensor

* fix slice get sbp

* LazyBuildTensor -> BuildTensor

* fix adam sbp

* fix hierarchy condition

* fix hierarchy parallel conf

* sparse_softmax_cross_entropy_ms functor

* debug code

* rm debug code

* s to b
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

264f9b9c

Add primitive/include (#6379) · 6701db43

Committed by Juncheng on Sep 23, 2021

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

6701db43

11 Sep, 2021 1 commit

Fix bug of Multi-Client src tick output order (#6221) · 788bd1a5

Committed by cheng cheng on Sep 11, 2021

* Fix bug of Multi-Client src tick output order

* Add input/output ctrl edge to DstSubTick for order io and callback_notify

* add test scripts

* remove note

* auto format by CI

* add note of sleep

* auto format by CI
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

788bd1a5

08 Sep, 2021 1 commit

Primitive based copy task node (#6195) · 5c667f5c

Committed by Juncheng on Sep 08, 2021

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

5c667f5c

06 Sep, 2021 1 commit

Remove IDMgr::GetGpuPhyIdFromThrdId/IDMgr::GetDeviceTypeFromThrdId (#6169) · 13b2a48d

Committed by Juncheng on Sep 06, 2021

* Remove IDMgr::GetGpuPhyIdFromThrdId/IDMgr::GetDeviceTypeFromThrdId

* CHECK(new_task_id_)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

13b2a48d

31 Aug, 2021 1 commit

Replace xor with hash combine (part 1) (#6078) · 5c6b9f14

Committed by Twice on Aug 31, 2021

* for all: use hash combine

* for all: add Hash(T...)

* util: clang format

* rename Hash(size_T*, T...) to AddHash

* clang format

* apply
Co-authored-by: Juncheng <liujuncheng1022@gmail.com>

* clang format

* clang format

* fix
Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

5c6b9f14

26 Aug, 2021 1 commit

functional_api: fix build error in mac os (#6010) · a52d35d3

Committed by Twice on Aug 26, 2021

* functional_api: fix build error in mac os

* functional_api: fix interpreter_test

* for all: replace device_type.pb.h with device_type.h

* value_types: impl hash

* revert tools change

* CI: mac only

* cmake: fix of_functional_obj

* cmake: fix of_functional_obj

* CI: revert
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

a52d35d3

19 Aug, 2021 2 commits

Add CudaStreamIndexGenerator::GenerateNamedStreamIndex (#5940) · e7e39aa1

Committed by Juncheng on Aug 19, 2021

Co-authored-by: leaves-zwx <kunta0932@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

e7e39aa1

Remove CudaWorkType (#5942) · 6f38134c

Committed by Juncheng on Aug 19, 2021

Co-authored-by: leaves-zwx <kunta0932@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

6f38134c

17 Aug, 2021 1 commit

Remove GlobalWorkStreamId/GlobalThrdId (#5917) · 19fdde6d

Committed by Juncheng on Aug 17, 2021

* Remove GlobalWorkStreamId/GlobalThrdId

* refine
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

19fdde6d

15 Aug, 2021 1 commit

Rename the `ParallelDistribution` class to `NdSbp` (#5814) · 59d7d346

Committed by Tianyu Zhao on Aug 15, 2021

* Rename `ParallelDistribution` to `NdSbp`

* Rename `ParallelDistribution` to `NdSbp`

* Rename `ParallelDistribution` to `NdSbp`

* auto format by CI

* Rename `ParallelDistribution` to `NdSbp`
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

59d7d346

12 Aug, 2021 1 commit

Rename variables named `*parallel_distribution*` to `*nd_sbp*` (#5815) · 1ba13974

Committed by Tianyu Zhao on Aug 12, 2021

* Rename `parallel_distribution` to `nd_sbp`

* Rename filenames containing `parallel_distribution`

* auto format by CI

* Rename `parallel_distribution` to `nd_sbp`

* auto format by CI

* Rename `parallel_distribution` to `nd_sbp`

* auto format by CI
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

1ba13974

07 Aug, 2021 1 commit

Feat empty op (#5659) · ed9b5a50

Committed by Yinggang Wang on Aug 07, 2021

* fix bugs in sharing EagerBlobObject::blob_desc_.shape and EagerBlobObject::blob_.shape

* feat(EmptyOp): add flow.empty

* docs(EmptyOp): add doctest and refine document

* docs(EmptyOp): refine document

* refactor(Tensor): Tensor constructor use empty_op

* refactor(Tensor): remove useless code

* feat(EmptyOp): support construct in given device and add consistent_empty op

* feat(EmptyOp): support unpacked tuple shape

* refine array functor code

* docs(EmptyOp): update empty op document

* refine code

* docs(EmptyOp): add test and document for consistent empty op

* update document

* fix merge bugs

* fix(*): fix infer distribution

* test(EmptyOp): fix ConsistentEmptyOp CPU_ONLY test bug

* fix(*): init shape when InitBlob

* fix(*): Constant and Empty Op use broadcast sbp

* fix(indexing): replace MakeTensor with functional::Empty

* fix(*): fix compile bug

* refine code

* fix(nnGraph): make eager tensor

* auto format by CI

* fix(Stride): infer stride before initializing shape
Co-authored-by: Xinqi Li <lixinqi0703106@163.com>
Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

ed9b5a50

06 Aug, 2021 1 commit

Remove obsolete Profiler (#5747) · d9d28b80

Committed by Juncheng on Aug 06, 2021

d9d28b80

02 Aug, 2021 1 commit

0-dim tensor support (#5552) · 62a8cd84

Committed by Luyang on Aug 02, 2021

* 0-dim tensor support

* test case

* add more test

* refine

* update

* update default constructor

* reconstruct

* merge master

* remove notes

* remove useless codes

* fix comments

* fix comment

* add test case

* format

* refine

* refine

* refine

* refine

* MirroredTensorMeta::MirroredTensorMeta()

* support 0-dim slice

* support 0-dim slice grad

* refine

* auto format by CI

* refine

* refine

* auto format by CI

* refine

* fix slice bug

* auto format by CI

* fix resnet50 0-dim loss usage

* fix 0-dim tensor usage in test cases

* add skip test

* auto format by CI

* fix test_dataset

* check blobdesc.shape init

* auto format by CI

* remove useless empty shape init

* fix l1loss 0-dim error

* auto format by CI

* fix argmax op test

* fix add_n op test

* auto format by CI

* fix bce loss op test

* auto format by CI

* fix squeeze op test

* fix conv2d op test

* fix xpu_shape for clip_grad_norm

* auto format by CI

* resolve conflict

* fix multi-cpu slice_copier 0-dim bug

* auto format by CI

* add memory copy for 0-dim

* auto format by CI

* support copy0dim

* refine

* auto format by CI

* remove unused code

* fix check for kldivloss

* gpu 0-dim copy

* auto format by CI

* fix clip_grad_norm doctest

* fix reduce_ops doctest

* fix argmax doctest

* fix loss module doctests

* fix math_ops doctests

* fix norm modules doctest
Co-authored-by: Xinqi Li <lixinqi0703106@163.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

62a8cd84

16 Jul, 2021 1 commit

Job pass maybe system (#5503) · 50e1c346

Committed by Li Xinqi on Jul 16, 2021

* refactor job_pass by maybe_system

* remove useless files
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

50e1c346

14 Jul, 2021 1 commit

Output arg modifier return maybe part 1 (#5451) · 77f9d83f

Committed by liufengwei0103 on Jul 14, 2021

* Modified the OutputArgModifyFn interface

* maybe error stack from CheckAndConstructOp to OutputArgModifier callback function

* maybe error stack from CheckAndConstructOp to OutputArgModifier callback function

* OutputArgModifier return maybe part_1

* maybe error stack from CheckAndConstructOp to OutputArgModifier callback function

* add JUST for handler in ForEachOperator
Co-authored-by: aishangjj <702572275@qq.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

77f9d83f

01 Jul, 2021 2 commits

Infer consistent tensor meta (#5118) · c3238bfd

Committed by Li Xinqi on Jul 01, 2021

* Device::compute_dep_object_

* sequentialize instructions in the same stream.

* refactor AttrMap

* refactor Tensor

* Export ConsistentTensor::is_cuda

* remove ConsistentTensor::blob_object

* refactor TensorImpl

* minor fix

* fix compiler complains

* Implements EagerConsistentTensorImpl::New

* minor fix

* fix compiler complains

* remove unused code

* skip test_creating_consistent_tensor

* backup code

* Symbol::shared_from_symbol

* remove redundant header file includes

* fix bug in Symbol::shared_from_symbol

* symbolize ParallelDesc and ParallelDistribution

* symbolize Scope::GetParallelDesc()

* IsScalarType

* fix compiler complains

* InputConsistentTensorMeta

* refactor Scope with PlacementScope

* fix bug in exporting Scope to python

* backup code

* refactor DType

* fix compiler complains

* backup code

* DType is only allowed to be used in python code

* backup code

* dtype api bugfix

* fix error on exiting
Signed-off-by: daquexian <daquexian566@gmail.com>

* lazily get rank
Signed-off-by: daquexian <daquexian566@gmail.com>

* Export const DType* into python

* minor fix

* fix bug

* refine

* refactor signature of OpExpr::InferLogicalShapeAndDtype

* fix bug

* backup_code

* fix bug

* refactor SbpXXX to cfg::SbpXXX

* merge refactor_sbp_to_cfg_sbp

* fix bug

* Infer ConsistentTensorMeta

* Implement EagerConsistentInterpret::ApplyImpl

* 1) move XXXTensorMeta into the new file tensor_meta.h; 2) add new Class ConsistentTensorInferCache

* add class ConsistentTensorInferResult

* remove unused OpArgMutConsistentTensorMeta::parallel_distribution_

* fix stack-overflow bug in Tensor::mut_eager_mirrored_tensor_impl

* ignore empty parallel distribution constraint

* fix bug

* add explicit of cfg

* fix xla compile bug

* auto format by CI

* fix according to comment

* fix bug
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: clackhan <han_binbin@163.com>
Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

c3238bfd

add missing JUST (#5357) · 95337ebc

Committed by daquexian on Jul 01, 2021

* add missing JUST
Signed-off-by: daquexian <daquexian566@gmail.com>

* remove redundant header
Signed-off-by: daquexian <daquexian566@gmail.com>

* add missing JUST in master
Signed-off-by: daquexian <daquexian566@gmail.com>

* fix compile error on gcc5
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

95337ebc

30 Jun, 2021 1 commit

CommNet dynamic register memory (#5281) · c4285319

Committed by Juncheng on Jun 30, 2021

Co-authored-by: guo ran <360112263@qq.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

c4285319

21 Jun, 2021 1 commit

Refactor Memory Zone (#5072) · 50f32b61

Committed by leaves-zwx on Jun 21, 2021

* MemZoneId


Former-commit-id: 7550a129f15554c5a6e480b728079e431c00be25

* move mem zone id source code


Former-commit-id: 3859fc2a0fcda2fb23e57e886a0e3f1c0833d111

* revert


Former-commit-id: 5cf3ad7caebe787918d1ca1c0467415656d9b491

* refine GetProxyNode using MemZoneId


Former-commit-id: fba035f20b44b1acce2900b86b5bd24654e0d982

* refactor MemZoneId121


Former-commit-id: 0868a6139f1cf20dc7474d0a88714e03721c8e8e

* replace using IDMgr interface


Former-commit-id: 98b5db9ed879cd1d8197efd174c6d680bec69560

* fix linkage

* rm useless comment

* replace IsGpuMemZone

* format

* rm deprecated mem zone api in IDMgr

* fix merge conflict error

* refine mem zone id to include node index

* revert added header

* direct init device_id

* address review

* address review
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

50f32b61

15 Jun, 2021 1 commit

Refactor SbpXXX to cfg::SbpXXX (#5120) · dea9215c

Committed by liufengwei0103 on Jun 15, 2021

* refactor SbpXXX to cfg::SbpXXX

* modify ParallelDistributionHint4InputArgNameAndIndex to be const function

* fix sbp to cfg::sbp in job_pass

* fix bug ToProto, InitFromProto and pb passed to cfg

* auto format by CI

* fix gpt segmentation fault

* fix xla

* tmp commit

* tmp commit

* fix xla compile error

* [fix bug] return tmp in model_io_v2

* auto format by CI
Co-authored-by: lixinqi <lixinqi0703106@163.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: poohRui <yuruil@qq.com>

dea9215c

03 Jun, 2021 2 commits

CI checks if license duplicated (#5091) · ac7d3fb8

Committed by Shenghang Tsai on Jun 03, 2021

* Remove redundant copyright header

* ci check if license duplicated

* refine

* refine

* refine

* address review
Co-authored-by: liujuncheng <liujuncheng1022@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

ac7d3fb8

add math.abs (module) (#4952) · 325160bc

Committed by Hongsheng Wang on Jun 03, 2021

* Add scalar support of greater less module (#4841)

* add scalar input support

* format
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 9b11938c

* Refine optimizer (#4840)

* refactor(Optim): refine optimizer codes

* docs(SGD): add document for SGD

* docs(SGD): fix code

* test(Adam): fix test_optim_adam bug
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: cd6ffac6

* add docstring (#4846)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: da82bb8c

* fix eager with unknown symbol id (#4752)

* fix eager with unknown symbol id

* minor fix

* fix conflict

* remove unnecessary function

* remove unnecessary header

* remove unnecessary methods
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: aa9f6f76

* Dev fix linear module (#4836)

* add broadcast matmul support

* refine

* add batch matmul support

* remove redundant test case

* linear module support high dimension input

* format

* fix linear
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: f1ccf2a3

* Add rmsprop optimizer (#4834)

* add rmsprop optimizer

* fix rmsprop optimizer bug

* fix rmsprop optimizer bug

* add rmsprop optimizer docs

* add rmsprop docs

* fix comment

* fix comment

* fix comment

* fix comment
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: f476d48d

* add adamw optimizer (#4824)

* init adamw optimizer

* fix adamw optimizer bug

* fix comment

* fix comment

* code format

* fix comment
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: f15a8aea

* add experimental apis (#4817)

* add experimental apis

* merge master fix conflict

* revert flow._oneflow_internal.dtype to flow.dtype

* refine

* fix test optimizer

* update module docs

* fix unit tests

* fix matmul module test

* fix adamw and rmsprop tests

* fix crossentropy loss grad
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: c50ff3cc

* remove MakeParallelDescByDevice, fix the missing setting of parallel_desc in InstructionMsg copy constructor (#4850)
Signed-off-by: daquexian <daquexian566@gmail.com>
Former-commit-id: d5d7ef56

* add experimental (#4856)



Former-commit-id: 505d4865

* A more efficient implementation of NLL Loss (#4854)

* A more efficient implementation of NLL Loss

* A more efficient implementation of NLL Loss
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: ad5b0129

* Add where module (#4845)

* add broadcast_like module

* add where module, still has bug

* fix where module bug

* fix where module bug

* fix bug and add where module

* fix where module comment

* code format

* fix where module

* code format
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: edf9a190

* dev_compare_cfg_file (#4860)

* dev_compare_cfg_file

* add def of org_content

* minor fix
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 5e399d38

* stateful local opkernel: return a temp parallel ctx (#4857)

* add temp parallel ctx for single card
Signed-off-by: daquexian <daquexian566@gmail.com>

* add TODO comment
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: f450ff89

* Optimize memory occupancy for interface 1.0 (#4844)

* Do not save inputs in function nodes even if requires_grad is true.

* Allocate raw memory with actual size.
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: e5dadaf5

* use less event records (#4861)

* use less event records

* more comments
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 218f61e3

* Refine cast grad func. (#4853)



Former-commit-id: dfe7759e

* copy eager blob object to/from numpy in c++, use busy loop to wait (#4839)

* numpy: create np arr in python and copy in c++, use busy loop to wait
Signed-off-by: daquexian <daquexian566@gmail.com>

* reformat
Signed-off-by: daquexian <daquexian566@gmail.com>

* add CopyBetweenMirroredTensorAndNumpy
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 87c0c3d8

* align squeeze module with torch (#4855)

* align squeeze module with torch

* fix comment

* fix argmax bug

* fix bug
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: a57b7649

* Fix adam weight decay param (#4835)

* fix adam weight decay

* fix adam weight decay

* fix comment

* fix comment

* fix comment

* fix comment

* fix comment

* fix bug

* fix(Adam): fix Adam test bug

* revert adam test threshold to 1e-3

* fix(Adam): fix adam test bug and adjust param to increase error
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: wyg1997 <wyg19970408@gmail.com>
Former-commit-id: 1d5f743d

* Fix groupnorm (#4848)

* fix GroupNorm and modify test case

* add grad op for reshape_like op
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 9bf14b8b

* use composed attr map in contexts (#4838)

* use composed attr map
Signed-off-by: daquexian <daquexian566@gmail.com>

* move implementation to .cpp
Signed-off-by: daquexian <daquexian566@gmail.com>

* OpExpr::New returns Maybe
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 9d78fdba

* flow.Size support negative index and add test (#4870)

* feat(PySize): support negative index and add test

* style(*): refine code

* format code

Former-commit-id: 3519d2e7

* fix export experimental docs bug (#4867)

* fix export experimental docs bug

* fix export experimental docs bug

* fix export experimental docs bug
Co-authored-by: Yao Chi <later@usopp.net>
Former-commit-id: d78c2bc6

* reorder VirtualMachine fields (#4873)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 1ce9f262

* Align module params with torch (#4865)

* align mean module

* allow negative dim param

* support tuple of negative dim param

* refine

* format
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 373cefce

* Generate cfg header and source files in parallel and prevent rebuild from scratch when Python version changes (#4876)

* refine

* Update cfg.cmake

* Update cfg.cmake
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 72204213

* fix interpreter determine output leaf and grad (#4872)

* fix interpreter determine output leaf and grad

* fix GradMode get

* simplify

* add test for no_grad
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: de05e7a5

* support crossentropy loss 3dim (#4875)

* support crossentropy loss 3dim

* merge conflict
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 4e689d25

* support nllloss 3dim (#4874)

* support nllloss 3dim

* support nllloss 3dim

* merge conflict
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 30a75cee

* Device compute dep object (#4862)

* Device::compute_dep_object_

* sequentialize instructions in the same stream.

* adjust atexit sort
Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: clackhan <han_binbin@163.com>
Former-commit-id: 55a223cf

* remove cambricon quantization test (#4879)
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 9cddc629

* Use dlopen to call ibverbs APIs (#4852)

* check in naive struct

* refine

* refine

* refine

* refine

* add functions

* refine

* refine

* refine

* fmt

* refine

* refine

* refine

* refine

* refine

* refine

* add note

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* rm include

* revert cmakelist changes

* refine

* address review

* rename

* address review

* address review

* remove glog dependency

* fix

* refine

* refine

* print lib path in stdout

* address review

* address review

* fix

* support ONEFLOW_LIBIBVERBS_PATH

* add case

* update init_cluster_env.py for ONEFLOW_LIBIBVERBS_PATH

* fix comment

* address review
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 75f11b82

* Add copy user op (#4842)

* copy user op

* add to module and tensor.to interface

* remove unnecessary code

* backward for tensor.to

* remove capture of input

* support cpu only tensor

* module to (#4858)

* remove backward kernel and op

* friendly deal with when tensor.grad is None

* minor fix

* minor fix

* revert

* support 1m1d only

* skip test normalization

* skip test normalization

* skip conv

* support construct device using string

* minor fix

* minor fix

* use maybe

* fix device id type for device infer ctx

* skip batchnorm

* skip some tensor test case
Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 2d5fae50

* Fix reduce sum grad func. (#4882)

* Fix reduce sum grad func.

* Fix zeros op
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: a4022285

* align dim size function (#4880)

* align dim size function

* fix dim usage
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 4592d467

* support expand and repeat op int datatype (#4883)

* support expand and repeat op int datatype

* code format
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 2939e8cb

* return LocalTensor (TensorTuple) directly from op expr __call__ (#4864)

* expose local tensor
Signed-off-by: daquexian <daquexian566@gmail.com>

* mt19937 -> minstd_rand
Signed-off-by: daquexian <daquexian566@gmail.com>

* revert unnecessary diff
Signed-off-by: daquexian <daquexian566@gmail.com>

* fix comments
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 72b76a29

* Support custom parameters for optimizer (#4881)

* feat(Optim): support custom parameters for optimizer

* feat(Adam): adam support custom parameters

* feat(Adamw): adamw support custom parameters

* feat(RMSprop): rmsprop support custom parameters

* style(Optim): refine adam and adamw
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Former-commit-id: a26f7080

* align transpose module with pytorch (#4877)

* align transpose module with pytorch

* fix comment

* align transpose module

* support expand and repeat op int datatype

* fix bug
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 2a5bc359

* Add ones like op (#4889)

* Add ones like op.

Conflicts:
	oneflow/api/foreign_lock_helper.h
	oneflow/api/python/autograd/autograd.cpp
	oneflow/api/python/framework/tensor.cpp
	oneflow/core/framework/op_interpreter/op_interpreter.cpp
	oneflow/core/framework/op_interpreter/op_interpreter_util.cpp
	oneflow/core/framework/tensor.cpp
	oneflow/core/framework/tensor_impl.cpp
	oneflow/core/framework/tensor_impl.h

* Add ones_like unittest.

* Use SwithCase

* Fix typo

* undef

* Bugfix

* Fix merge conflicts

Co-authored-by: hjchen2 <hjchen2>
Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 7b4c8c50

* Cpu support conv module (#4894)

* support expand and repeat op int datatype

* support conv cpu module

* support conv cpu module

* support conv cpu module

* support conv cpu module
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 59fcd027

* Async cuda stream type (#4895)

* add class AsyncCudaStreamType

* fix bug

* remove useless headfile
Co-authored-by: lixinqi <lixinqi0703106@163.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 8784b21d

* skip infer instr when physical operand != nullptr, remove unused code (#4868)

* Disable infer instruction if instruction type has physical operand
Signed-off-by: daquexian <daquexian566@gmail.com>

* remove more infer instructions
Signed-off-by: daquexian <daquexian566@gmail.com>

* raise UNIMPLEMENTED() in infer
Signed-off-by: daquexian <daquexian566@gmail.com>

* fix hanging on exit
Signed-off-by: daquexian <daquexian566@gmail.com>

* reformat
Signed-off-by: daquexian <daquexian566@gmail.com>

* fix typo
Signed-off-by: daquexian <daquexian566@gmail.com>

* wrap results by Tensor() in .to()
Signed-off-by: daquexian <daquexian566@gmail.com>

* set need_check_mem_case to false for copy op
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 3137f51c

* fix and check to with module on forward and backward (#4897)

* fix and check to on forward and backward

* add todo
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 585afc09

* add JUST (#4891)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 98b88e20

* Fix docs bug (#4892)

* support expand and repeat op int datatype

* fix modules docs bug

* fix docs bug

* fix docstring bug

* fix docs bug

* fix docs bug
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: fb164b58

* Add warning when no param update (#4896)

* style(Optim): add warning when no param update

* style(Optim): add TODO
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 7e9902aa

* Add gather embedding module (#4826)

* add gather module

* add gather module

* add test case

* add embedding module

* fix comments

* update embedding module and test case

* refine

* fix comment

* fix comment

* fix comment
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: BBuf <1182563586@qq.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 986becca

* Support create cpu only tensor (#4863)

* support create cpu tensor

* add empty op

* remove skip tensor test case:

* remove skip tensor test case

* remove TODO
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: dab6ba23

* Add permute module (#4901)

* support expand and repeat op int datatype

* fix modules docs bug

* fix docs bug

* fix docstring bug

* fix docs bug

* fix docs bug

* add permute module
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 698e8cd5

* remove useless code in expand module test (#4903)



Former-commit-id: ac06d3c8

* Add conv docs (#4904)

* remove useless code in expand module test

* add conv2d docs

Former-commit-id: 422ced27

* Fix eager test bug (#4678)

* skip test_gpt_data_loader in eager mode

* 1_node_fix_egaer_test_bug

* remove useless head file

* skip tensor and module

* skip 2-D sbp in eager mode

* fix error

* fix bug and remove some skip under eager

* fix error

* del oneflow_api

* rm test_tensor.py

* skip test_summary in eager mode

* skip test_stateful_local_kernel under cpu only mode

* add class AsyncCudaStreamType

* fix bug

* import os

* remove BlobObject::is_python_shutting_down_

* fix error

* skip 2d sbp

* minor fix

* refine comment

* make of_format
Co-authored-by: lixinqi <lixinqi0703106@163.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 425bd439

* cache cudnn handle in bn infer (#4906)
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 516b3e5e

* Fix ones zeros like (#4907)

* feat(xxxLikeOp): ones_like and zeros_like use user op

* fix(Optim): fix learning rate device error bug

* style(*): format codes

* style(*): use int instead of np.int

* test(Optim): add optimizer gpu test (#4908)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 6d7c6d8b

* Bump nccl from 2.8.3 to v2.9.8 (#4899)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 3294cf2a

* Add prelu module (#4902)

* support expand and repeat op int datatype

* add prelu module

* add prelu module

* add prelu module

* fix comments

* fix comment

* fix comment

* add backward test

* fix comment
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 138af616

* add InitEagerSession for eager mode (#4589)

* try to merge eager ofrecord to master branch

* refine

* temp fix

* try to add seed but fails

* try to add seed but fails

* use global function to init mirror/consistent flag

* fix test

* add modules

* fix record modules

* fix destruction order

* fix mirror gen seed

* skip record unit test

* remove TODO
Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: mosout <mosout@qq.com>
Co-authored-by: Ldpe2G <liangdepeng@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: a139f49e

* add hardtanh module (#4914)



Former-commit-id: d3c91a97

* fix_matmul_module_test_ci_bug (#4905)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 5c4a0d74

* Cpu support of batchnorm layernorm module (#4890)

* add rsqrt module

* add batchnorm module cpu support

* refine

* update

* fix param ini

* add reduce series modules

* add batchnorm,layernorm modules and test cases

* refine

* update .rst

* refine according to comments

* refine

* update

* fix layernorm bug

* refine

* remove additional license
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 9262e1ae

* Add flow.tensor (#4829)

* add flow.tensor

* small fix

* fix dtype

* Update oneflow/python/framework/tensor.py
Co-authored-by: daquexian <daquexian566@gmail.com>

* deal with multi-dimension list or tuple

* remove list

* Update oneflow/python/framework/tensor.py
Co-authored-by: daquexian <daquexian566@gmail.com>

* add unit test case

* format
Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: c43caa88

* Feat lr scheduler (#4921)

* feat(LrScheduler): add ConsineScheduler

* feat(LrScheduler): update cosine_scheduler and add test

* feat(LrScheduler): refine codes

* style(*): format codes

* docs(LrScheduler): add document

* docs(LrScheduler): refine documents
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: f7881ce7

* Fix CommNetIf::RegisterMemory/UnRegisterMemory lock scope (#4918)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 1a7c26b1

* add leakyrelu module (#4912)

* add leakyrelu module

* code format

* update docs

* update docs
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 2b74432f

* Support cast in tensor.to (#4917)

* support cast in to

* refactor to interface

* refine doc

* refine doc

* refine kwargs

* add test case support tensor

* minor fix and test case

* format
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 9fac6f36

* add hard swish module (#4915)

* add hard swish module

* fix bug

* fix bug
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 98f937f1

* Replace oneflow_worker with worker agent (#4900)

* naive impl

* refine

* refine

* add log

* refine

* refine

* refine

* refine

* add todo

* refine

* refine

* sync dynamic libs

* refine

* fix docker cmd

* fix rank

* refine

* refine

* add callbacks simple rpc

* refine

* refine

* fix

* refine

* refine

* refine

* fix conn

* support traditional mode

* refine

* refine

* refine

* rm

* refine

* refine

* refine

* refine todo

* refine

* refine

* rm unused

* rm todo

* revert

* refine

* add log

* refine

* refine

* fix order

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* rm

* rename

* add comment

* refine

* rm

* refine

* refine

* refine

* refine

* add todo

* add info

* refine

* refine

* refine

* add back some legacy code

* refine

* refine

* refine

* refine

* refine

* rm oneflow_worker exe

* rm log

* fix bug

* support --cmd

* add check

* refine

* fix

* fmt

Former-commit-id: 37c63928

* add hardsigmoid module (#4919)

* add hardsigmoid module

* refine docs
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 937ccaf3

* Refactor tensor (#4916)

* refactor Tensor

* Export ConsistentTensor::is_cuda

* minor fix

* minor fix
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 087c9753

* add relu6 module (#4925)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: c013fac1

* add cuda test for add method(module) (#4888)

* copy user op

* add to module and tensor.to interface

* remove unnecessary code

* backward for tensor.to

* remove capture of input

* support cpu only tensor

* module to (#4858)

* remove backward kernel and op

* handle the case where tensor.grad is None gracefully

* minor fix

* minor fix

* revert

* support 1m1d only

* skip test normalization

* skip test normalization

* skip conv

* support construct device using string

* minor fix

* minor fix

* use maybe

* fix device id type for device infer ctx

* skip batchnorm

* skip some tensor test case

* startup of add backward

* startup of add gpu test

* refine

* add cuda test for Linear module

* refine after sum fixed

* gpu backward

* gpu backward crashed

* retain grad

* refine according to comments of WangYinggang

* refine: construct specified device tensor

* refine testcase

* refine: specify device when constructing in test case

* refine testcase for linear

* refine

* refine to_device

* refine import statement

* refine import path

* remove useless _to_device fun
Co-authored-by: poohRui <yuruil@qq.com>
Co-authored-by: Yurui Li <32978179+poohRui@users.noreply.github.com>
Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 2e268dfc

* add upsample module (#4923)

* add upsample module

* add upsample2d unittest

* add upsample2d unittest

* add docs

* add UpsamplingNearest2d and UpsamplingBilinear2d module

* code format and add docs

* add more unit_test
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 8125b43f

* add elu module (#4924)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 4dfc375a

* improve ofrecord unit test (#4920)

* improve ofrecord unit test

* remove no_grad

* fix codes according to review

* fix format

Former-commit-id: 984b1f08

* align inplace param (#4933)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 8b0bdcdc

* refactor SGDUpdate and MomentumUpdate UserOp (#4930)

* refactor(ModelUpdate): SGDUpdate and MomentumUpdate use optional input for learning rate

* fix(*): fix bugs

* style(*): refine code
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 49d53fec

* Dev support tensor slice (#4898)

* add scalar input support

* format

* update tensor slice function

* add slice slice_update module

* add slice function in tensor

* refine

* fix tensor slice

* fix bug

* add tensor slice test case

* add logical_slice_assign module

* fix LogicalSliceAssign kernel to support eager local
Signed-off-by: daquexian <daquexian566@gmail.com>

* fix tests
Signed-off-by: daquexian <daquexian566@gmail.com>

* update export strategy

* refine

* fix docs

* add more test case

* fix comments

* refine according to comments

* add TODO item

* format
Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 4c958482

* add tensor.zeros_() and SoftSyncStream instr (#4927)

* add tensor.zeros_() and soft sync stream instr
Signed-off-by: daquexian <daquexian566@gmail.com>

* separate cpu and gpu version of SoftSyncStream

* Remove SyncAutoMemset

* fix compile error
Signed-off-by: daquexian <daquexian566@gmail.com>

* fix wrong parallel_desc()
Signed-off-by: daquexian <daquexian566@gmail.com>

* remove unused code
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 0e7f1c6a

* Only upload log if distributed test fails (#4934)

* only upload log if distributed test fails

* refine

* refine

* reduce timeout

* refine
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 08d62afa

* Add logsigmoid softplus module (#4929)

* add logsigmoid and softplus module

* code format

* fix docs
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 0a636c6f

* Add module backward test (#4926)

* add scalar input support

* format

* update tensor slice function

* add slice slice_update module

* add slice function in tensor

* refine

* fix tensor slice

* fix bug

* add tensor slice test case

* update softmax testcase

* refine softmax backward

* add logsoftmax backward test

* add maskedfill backward test case

* add sigmoid backward test

* rewrite transpose backward op

* add transpose backward test

* format

* refine

* rm useless code

* Fix transpose unittest.

* refine according to comments

* update

* refine

* refine

* format

* fix backward testcase

* fix perm param

* fix bug

* numpy method to calculate sigmoid grad

* format

* refine

* refine comments
Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 7911d713

* fix flow.save (#4941)
Signed-off-by: daquexian <daquexian566@gmail.com>
Former-commit-id: dbd9d76e

* add math.abs (module)


Former-commit-id: 9670c24a707e01d5445196dc01c145bda792995d


* fix zero point in fake quantization pass (#4586)

* fix zero point
Signed-off-by: daquexian <daquexian566@gmail.com>

* round zero_point in fake quant kernel to align with onnx
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: bc9ec6c4

* also allow ONEFLOW_DEBUG (#4950)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: f65019f2

* Add Device Descriptor (#4939)

* Add Device Descriptor

* format

* refine

* refine

* check cuda version

* check cuda version

* fix

* fix

* fix WorkSize

* handle more error
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 1cc67da5

* Refactor Optimizers for eager (#4938)

* refactor(Adam): refactor Adam to use dynamic learning rate

* refactor(Adamw): refactor Adamw Optimizer

* refactor(Rmsprop): refactor Rmsprop to use dynamic learning rate
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 49060956

* Add instructions on making sys env permanent (#4949)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 64d3a1d9

* add gradient functions for dim_gather op (#4913)

* start up

* bugs left

* add backward test case

* fix bugs

* refine testcase

* refine

* replace optrait with composedAttrs

* refine

Former-commit-id: e400bc0b

* add gradient funs for unary and binary math op (#4961)

* add gradient funs for unary and binary math op

* add test exp and pow example

* refine pow test case

* refine

* rename register macro

Former-commit-id: a559f632

* Refactor consistent tensor (#4937)

* refactor Tensor

* Export ConsistentTensor::is_cuda

* remove ConsistentTensor::blob_object

* minor fix

* minor fix

* fix compiler complaints

* remove unused code

* skip test_creating_consistent_tensor

* del useless function
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: clackhan <han_binbin@163.com>
Former-commit-id: 662cda36

* add backward test case


Former-commit-id: 28353e85d5d1aac17aefbcab9f15d78e9aa3bb0e


* support tensorrt7 qat (#4958)

* support tensorrt7

* Ignore handling bias

* support label quantization

Former-commit-id: 53881e39

* modify math.abs test case


Former-commit-id: 1aa043e7221e71f3d049bcc89b8bd673f132b4bc


* fix softmax testcase (#4948)

* add scalar input support

* format

* fix softmax testcase

* fix logsoftmax test
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 10ca2bc0

* add module backward (#4935)

* add reshape module backward

* add expand module backward

* add expand backward

* add expand module backward

* add expand module test

* add squeeze module backward

* code format

* add repeat module backward

* code format

* fix bug

* fix comment

* align expand module with torch

* align repeat module with torch

* fix comment

* fix comment

* fix comment

* fix code format

* fix conflict

* fix bug

* fix comment

* add module backward

* fix pow bug
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 5362a0d8

* rewrite unsqueeze backward (#4966)

* rewrite unsqueeze backward

* fix comments

* fix comment
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: ec1c80ff

* add module backward (#4942)

* add exp module backward

* add greater test

* add less module test

* add negative module backward

* code format

* add matmul backward

* add broadcast_matmul_backward

* code format

* add batch_matmul backward

* add argmax module test

* delete useless code

* fix comment

* fix comment

* fix comment

* fix comment

* fix comment

* code format

* fix bug

* fix pow bug
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: d20a9953

* fix activation ci bug (#4980)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 4e974d7b

* Hashable attr map (#4951)

* Device::compute_dep_object_

* sequentialize instructions in the same stream.

* refactor AttrMap

* remove redundant header file includes
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: ded8a780

* Fix Global<CommNet>::Delete() (#4981)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 8aade7e7

* delete unused header file


Former-commit-id: 2b3c27a907fbc75c537cabbbaaf5818efb2e2a29


* update test case format


Former-commit-id: 9cda51fb978fe91245d8c40a4d3fc6a10f2dd4dc


* add of_softmax_use_fast_math (#4979)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: ba6c64e7

* NetIB device enumeration (#4974)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 3f1038d6

* Fix local tensor requires grad (#4992)

* fix(Tensor): add requires_grad setter for ExportTensor

* test(Tensor): refine tensor autograd test
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 09552182

* Support localtensor slice (#4985)

* add scalar input support

* format

* register local tensor slice methods
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 4f4ce1f2

* Symbol::shared_from_symbol (#4969)

* Symbol::shared_from_symbol

* fix bug in Symbol::shared_from_symbol
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: a30eb43f

* CI formats code automatically (#4983)

* check in

* qq mail

* wrong fmt

* auto format by CI

* use youarefly@qq.com

* wrong fmt

* auto format by CI

* use ci-bot@oneflow.org

* wrong fmt

* auto format by CI
Co-authored-by: oneflow-ci-bot <373331853@qq.com>
Co-authored-by: oneflow-ci-bot <youarefly@qq.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 9cfee9b1

* Eager consistent tensor (#4984)

* Device::compute_dep_object_

* sequentialize instructions in the same stream.

* refactor AttrMap

* refactor Tensor

* Export ConsistentTensor::is_cuda

* remove ConsistentTensor::blob_object

* refactor TensorImpl

* minor fix

* fix compiler complaints

* Implements EagerConsistentTensorImpl::New

* minor fix

* fix compiler complaints

* remove unused code

* skip test_creating_consistent_tensor

* backup code

* Symbol::shared_from_symbol

* remove redundant header file includes

* fix bug in Symbol::shared_from_symbol

* symbolize ParallelDesc and ParallelDistribution
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: clackhan <han_binbin@163.com>
Former-commit-id: 3356bcad

* add resnet50 model test (#4957)

* add resnet50 model test

* update script

* update script

* relax tolerance

* run resnet50 in 1n1d

* fix format

* test resnet50 fun parameters

* add resnet50 with and without bn test

* fix resnet50 without bn train overflow

* change assertEqual to assertTrue
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 01278a66

* Add arg where module (#4998)

* add argmax test

* add argwhere module

* add argwhere module

* code format

* update unit_test

* fix comment

* update docs
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 816983aa

* fix hierarchical_sub_task_graph_builder condition (#4990)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 0cc74104

* Add __str__ and tolist for tensor (#4928)

* add __str__ and tolist

* initial tensor printer

* reorganized tensor str

* minor fix

* use nparray2string

* Add FunctionNode op_name (#4970)

* feat(FunctionNode): add op_type_name

* style(FunctionNode): rename op_name to op_type_name

* add test case for numel

* style(OpExpr): rename type_name to op_type_name (#4976)

* add test for tensor str

* minor fix

* minor fix

* support for local tensor

* support for local tensor

* format

* fix typo
Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: cf2ec98e

* Add squeeze module backward (#5007)

* add argmax test

* add squeeze module backward

* fix conflict
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 10c40f28

* RPC backend local supports barrier of barrier_num > 1 (#4968)

* check in changes

* refine

* fix

* erase when barrier exits
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: c3371486

* add activation module backward test (#4967)

* add tanh module backward test

* add gelu module backward

* fix activation ci bug

* add reshape module backward

* add tensor reshape module and code format

* add permute module backward

* fix conflict

* add argmax test

* fix permute module bug

* add prelu module cpu backward

* fix prelu gpu backward bug

* code format

* restructure hardtanh module test

* add hardtanh backward

* add hardswish backward

* add hardsigmoid module backward

* add relu module backward

* add relu6 module backward

* add elu module backward

* fix comments
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: c1873f74

* Add sys_ptrace for build docker container (#5005)

* add sys_ptrace build docker container

* refine
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 659f59e1

* Refine pythonpath in cmake (#5002)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 7b80f84f

* Refactor scope parallel desc (#4996)

* Symbol::shared_from_symbol

* fix bug in Symbol::shared_from_symbol

* symbolize Scope::GetParallelDesc()

* IsScalarType

* fix compiler complaints

* fix bug
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: clackhan <han_binbin@163.com>
Former-commit-id: 0be87a94

* add arange module backward (#4978)

* add arange module backward

* update

* refine

* fix comments

* refine

* fix docs
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 3878aea7

* CI skips resnet50 to prevent segfault (#5017)

* CI skip resnet50

* fix
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 13b46125

* NetSocket device enumeration (#4997)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: c38ccebb

* Add device unmatch info (#4989)

* add argmax test

* add device unmatch error

* delete unused code

* delete unused code

* refine code

* fix var name error bug

* add prelu exception get

* add prelu exception get

* add prelu exception get

* add prelu exception get

* add exception

* code format

* fix comment

* add more error information

* refine error info
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 8b0905b8

* BindFwBwObaPairs skip parallel_cast (#4986)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 3a8bee33

* rewrite matmul op backward (#4988)

* rewrite matmul op backward

* refine

* update

* fix comments

* refine

* refine
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: f06bf705

* Add concat module backward (#5013)

* add argmax test

* add concat module backward

* add concat backward impl

* add concat module backward

* fix concat module backward bug

* fix concat module backward bug

* fix concat module backward bug

* add concat module backward

* delete unused code

* fix comments

* fix comments

* fix comments

* fix comments
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 21f1bddb

* rewrite dropout backward (#5014)

* rewrite dropout backward

* refine

* fix comments

* auto format by CI
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: bd778bf3

* fix has_grad template (#4962)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 1559abd4

* Upload core files optionally (#5020)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 7b5eb78c

* Fix docs bug (#5019)

* add argmax test

* fix oneflow docstring bug
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 1e9c6ff4

* Doesn't allow CI to run PRs in parallel (#5016)

* update commit

* fix sha
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: d770a56d

* try fix (#5029)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 5b7c7649

* add cosh module (#4943)

* add cosh module

* fix the calculation of cosh backward

* add testcase of cosh
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 7142ed99

* Fix log level (#5009)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 539d6093

* oop oneflow.Model with training, validation and checkpoint (#4972)

* trainer structure

* add test

* add nnmodel api

* nn Model draft

* try run global_func in Model

* fit to be refined

* model run global_func train & eval

* nn Model for function style execution draft test pass

* refactor nn model

* nn model with necessary components

* format

* rm nn prefix of Model

* flow.Model multi-task numpy-input

* (flow.Model)op_dataload support multi job

* (flow.Model) auto job_func signature for numpy input

* (flow.Model)support auto numpy input job

* (flow.Model) numpy input multi job train test pass

* (flow.Model)fix classmethod

* fix test

* (oneflow.Model)training_step multi output, refine according to pep8

* (oneflow.Model)pep8 check pass by flake8

* pytorch-style module
Signed-off-by: daquexian <daquexian566@gmail.com>

* fix typo, update parameter
Signed-off-by: daquexian <daquexian566@gmail.com>

* Model refine

* Model fix typo

* oneflow.Model optimizer variable lazy get, numpy job signature to DataModule

* oneflow.Model merge and format

* oneflow.Model: comment empty func to be overridden

* Optimizer: lazy get var add check and tips

* oneflow.Model: refactor

* oneflow.Model: refactor 2

* add TODO, remove unused import, set consistent to True in parameter
Signed-off-by: daquexian <daquexian566@gmail.com>

* oneflow.Model: ModelStage -> SubStep, TrainStage -> TrainStep

* fix format

* oneflow.Model: SubStep to SubModel

* oneflow.Model: infer_oneflow_data_placeholder and _infer_job_signature

* set placement of parameter
Signed-off-by: daquexian <daquexian566@gmail.com>

* reformat
Signed-off-by: daquexian <daquexian566@gmail.com>

* add __init__.py in oneflow.python.nn.modules
Signed-off-by: daquexian <daquexian566@gmail.com>

* add todo for GetCurrentJobName()

* fix typo

* oneflow.Model: refine error message

* fix format

* OOPModel: import new Module

* oneflow.Model: rm FunctionConfig in Model

* oneflow.Model config_exe to config_execution

* OOPModel: add and test naive validate

* oneflow.Model: merge module

* Optimizer: user mode to confirm that Optimizer.Variable() is called inside a job

* merge master

* oop model : predict demo

* model inherit new module

* refine

* refine oneflow.mode

* add test oop model

* add test

* no_grad on ones_like

* fix has_grad template

* format

* fix data input

* fix

* check oop model

* add model checkpoint

* rm useless code
Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 121b1423

* Add upsample module backward (#5025)

* add argmax test

* add upsample module backward

* update upsample unittest

* fix unittest bug

* refine upsample backward

* code format

* fix comment

* fix comment
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 86d4a672

* Prevent CMake from using highest version of python3 (#5034)

* Use conda python if available

* refine

* refine

* refine
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 01e1a714

* rewrite slice backward (#5018)

* rewrite slice backward

* remove unused .h
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: c1880c01

* fix qat (#5038)

* fix qat

* format

* refine

* add comment

* Update test.yml

* Update test_quantize_op.py

* Update test_quantize_op.py
Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
Former-commit-id: 2dfb5b56

* Support convert tensorbuffer to list of numpy (#4940)

* support convert tensorbuffer to list of numpy

* improve speed

* remove useless codes

* get tensor_buffer shapes and dtypes by single function

* add __eq__ and __hash__ to DType

* add dynamic_out to tensor_buffer_to_list_of_tensors_v2
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 577568af

* Add JUST (#5041)



Former-commit-id: 994b0df0

* Add broadcast like module backward (#5037)

* add argmax test

* add broadcastlike module backward, bug needs fixing

* fix broadcast_like backward bug

* auto format by CI

* refine code
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: e7bb21bb

* Fix segfault caused by zlib in conda when share lib is enabled (#5045)

* check in changes

* address review

* address review
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 9bec9735

* Fix norm grad func to support dynamic attrs. (#5043)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: cc1c554c

* Fix segfault in new interface (#5042)

* fix data race about composed attr map
Signed-off-by: daquexian <daquexian566@gmail.com>

* move ResetPrior before ChooseOpKernel
Signed-off-by: daquexian <daquexian566@gmail.com>

* delete vm before others
Signed-off-by: daquexian <daquexian566@gmail.com>

* revert deletion order change, sync by atexit
Signed-off-by: daquexian <daquexian566@gmail.com>

* add comments
Signed-off-by: daquexian <daquexian566@gmail.com>

* rename
Signed-off-by: daquexian <daquexian566@gmail.com>

* fix multi machine bug
Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Former-commit-id: 68a5c10b

* Add where module backward (#5035)

* add argmax test

* add where module backward

* fix where module unit_test bug

* add zerolike and where op function

* add backward code

* add broadcast like backward

* refine

* fix where module backward bug

* rebuild test

* fix comment

* fix comment

* fix comment

* fix comment

* fix comments
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: dec89a96

* Query system status if CI failed (#5052)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 26bbcfcd

* fix(vm): add virtual machine backpressure (#5050)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: cd21df2f

* change in_edges/out_edges from SKIPLIST to LIST (#5047)

* change in_edges/out_edges from SKIPLIST to LIST

* minor fix

* refine
Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: b9568262

* Stop heartbeat and add barrier before Global<CtrlServer>::Delete() (#5010)

* refine
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: e861dd3b

* fix error on exiting (#5053)

* fix error on exiting
Signed-off-by: daquexian <daquexian566@gmail.com>

* lazily get rank
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 51335128

* Restore eager r50 case (#5055)



Former-commit-id: 64665e75

* Add argmax softplus logsigmoid module backward (#5049)

* add argmax test

* add argmax module backward, bug need fixed

* add leakyrelu module backward

* delete argmax backward test

* delete argmax backward test

* add softplus module backward

* code format

* add softplus module backward

* add logsigmoid module backward
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 06b549b3

* add doctest (#5046)

* add doctest

* refine

* update modules doctest
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 7394ec5f

* refactor DType (#5024)

* refactor DType

* fix compiler complains

* DType is only allowed to be used in python code

* dtype api bugfix

* fix error on exiting
Signed-off-by: daquexian <daquexian566@gmail.com>

* lazily get rank
Signed-off-by: daquexian <daquexian566@gmail.com>

* Export const DType* into python
Co-authored-by: binbinHan <han_binbin@163.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: daquexian <daquexian566@gmail.com>
Former-commit-id: c2b2eb25

* remove try_init_session in new interface (#5061)
Signed-off-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 912bc97d

* Rewrite batch broadcast matmul backward (#5012)

* rewrite matmul op backward

* refine

* update

* fix comments

* refine

* refine

* rewrite batch broadcast matmul backward

* refine

* refine

* refine

* refine

* Add JUST

* restructure matmul series module backward

* refine

* fix comments

* fix comments

* refine
Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: c4f277a1

* modify format


Former-commit-id: 5bdb45adbb0881263ef5fe0afe13dd606a8e6ff6

* modify format

* run make of_format


Former-commit-id: e976b9b0f88bc2e91fddb3053dab5bd5001e8318

* run make of_format

* lock cmake version in manylinux cmake (#5057)
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: fe662377

* Doctest support in CI (#4973)

* check in changes

* refine

* fix

* add test on obj

* add relu example

* run doctest in ci

* dont delete python

* address review

* address review

* address review

* address review

* address review
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 56f6348d

* Update README for 0.4.0 (#4965)

* refine

* remove content

* refine

* require py36

* address review

* refine

* address review
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: cc316206

* Add step lr and lambda lr (#5063)

* feat(StepLR): add StepLR

* feat(LambdaLR): add LambdaLR

* docs(LambdaLR): fix document

* style(LambdaLR): add comment
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 4c005b7c

* Release cuda 112 (#5060)

* Nightly for cu112

* add arg

* nightly
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 72cfa991

* add flow.asin, flow.Tensor.asin, flow.arcsin, flow.Tensor.arcsin, flow.asinh, flow.Tensor.asinh,  flow.arcsinh, flow.Tensor.arcsinh (#4955)

* add flow.asin and  torch.arcsin

* add torch.asin and torch.arcsin

* add torch.sin and torch.arcsin

* add torch.asin and torch.arcsin

* add torch.asin and torch.arcsin

* add torch.asin and torch.arcsin

* add torch.asin and torch.arcsin

* add torch.asin and torch.arcsin

* update test_asin.py including forward and backward

* Update test_math_ops.py

remove asin testcase

* update testcase including forward and backward

* add torch.asinh and torch.Tensor.asinh

* update testcase of asin and asinh

* update testcase of asin and asinh

* update testcase of asin and asinh

* update testcase

* update testcase

* make format

* update license

* update testcase

* check in

* qq mail

* wrong fmt

* auto format by CI

* use youarefly@qq.com

* wrong fmt

* auto format by CI

* use ci-bot@oneflow.org

* wrong fmt

* auto format by CI

* mv op testcase  to test_tensor.py

* auto format by CI

* update docstring

* update docstring

* update doctest

* update doctest

* auto format by CI

* update arcsinh

* auto format by CI
Co-authored-by: 陈岱渊 <chendy@zhejianglab.com>
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: jackalcooper <jackalcooper@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <373331853@qq.com>
Co-authored-by: oneflow-ci-bot <youarefly@qq.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Former-commit-id: 95748870

* Add modules doctest bbuf (#5058)

* add argmax test

* add nllloss doctest

* add doctest

* add crossentropyloss doctest

* add expand module doctest

* add squeeze module doctest

* add repeat module doctest

* add exp module doctest

* add argmax module doctest

* add matmul module doctest

* add greater module doctest

* add less module doctest

* add negative  module doctest

* add linear  module doctest

* add tanh module doctest

* add gelu module doctest

* add reshape module doctest

* add transpose module doctest

* add where  module doctest

* add permute  module doctest

* add prelu  module doctest

* add hardtanh  module doctest

* add activation  module doctest

* add activation  module doctest

* add upsample  module doctest

* auto format by CI
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Former-commit-id: 58183f87

* remove origin testCase in test_math_ops


Former-commit-id: 3a4bae704e2698225b15734fb440a6f8ddef5120

* remove origin testCase in test_math_ops

* Add tensor detach python api (#5068)

* add argmax test

* add tensor detach python api

* delete unused code

Former-commit-id: 9b30b7c9

* Delete preprocessor_internal.h.REMOVED.git-id

* Delete nn_ops.py.REMOVED.git-id
Co-authored-by: Lyon <flowingsun007@163.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
Co-authored-by: Yurui Li <32978179+poohRui@users.noreply.github.com>
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: Liang Depeng <liangdepeng@gmail.com>
Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: binbinHan <han_binbin@163.com>
Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com>
Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
Co-authored-by: Shijie <821898965@qq.com>
Co-authored-by: Yao Chi <later@usopp.net>
Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com>
Co-authored-by: lixinqi <lixinqi0703106@163.com>
Co-authored-by: BBuf <1182563586@qq.com>
Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
Co-authored-by: mosout <mosout@qq.com>
Co-authored-by: poohRui <yuruil@qq.com>
Co-authored-by: guo ran <360112263@qq.com>
Co-authored-by: oneflow-ci-bot <373331853@qq.com>
Co-authored-by: oneflow-ci-bot <youarefly@qq.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: YongtaoShi <73167956+YongtaoShi@users.noreply.github.com>
Co-authored-by: yayeoCddy <Dy_Chen95@163.com>
Co-authored-by: 陈岱渊 <chendy@zhejianglab.com>

325160bc

02 Jun, 2021 1 commit

Remove redundant copyright header (#5066) · 710264f6

Authored by Juncheng on Jun 02, 2021

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

710264f6

26 May, 2021 1 commit

fix hierarchical_sub_task_graph_builder condition (#4990) · 0cc74104

Authored by guo ran on May 26, 2021

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

0cc74104

08 May, 2021 1 commit

Use multi core to run TaskNode::ToProto (#4820) · 14099cc2

Authored by Shenghang Tsai on May 08, 2021

* Serialize proto in binary rather than text

* move del ops out from loop

* refine

* Skip GenCollectiveBoxingPlan if no CollectiveBoxingTaskNode

* multi core to proto

* copy pointers explicitly

* make toproto const method

* reorder

* larger tol

* Update test_layers_conv1d.py

* fix deadlock

* remove ForeignCallBack in Operator::ToOpAttribute
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: liujuncheng <liujuncheng1022@gmail.com>
Co-authored-by: clackhan <han_binbin@163.com>

14099cc2

05 May, 2021 1 commit

NCCL logical support Pipeline Parallel By independent NcclComputeStream. (#4806) · 4a4f0322

Authored by cheng cheng on May 05, 2021

* Fw/Bw support double compute stream

* NCCL comm create by stream id

* 2D NCCL logical kernel support BW independent stream

* StreamIndex: NcclComputeStream for each subgraph insert nccl logical.

* refactor code

* refine code for review

* Add WITH_CUDA in DoJobPass(InsertNcclLogicalOpPass)

4a4f0322

29 Apr, 2021 1 commit

Pipeline Parallelism by stage buffer (#4666) · 080d8eab

Authored by cheng cheng on Apr 29, 2021

* Pipeline Parallelism: checkpointing insert identity buffer op

* fix compiler err

* identity buffer op custom out regst num

* fix bug and runnable

* Chain merge divide fw/bw; MemChain ignore merge; copyhd regst num hack

* Pipeline buffer pass

* Pipeline runnable

* rollback NOT merge mem chain hack

* pipeline_stage_id_hint and rollback checkpointing buffer

* Pipeline buffer only. test pass.

* rollback repeat hack

* Remove CopyHd Hack; Add buffer cross label loader and loss

* refine code for review & fix for new dtype infer

* add note
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

080d8eab

28 Apr, 2021 2 commits

b21 boxing add ctrl_edge (#4770) · a9f70c79

Authored by guo ran on Apr 28, 2021

* b21 boxing add ctrl_edge

* refine
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: cheng cheng <472491134@qq.com>

a9f70c79

Lml/mem optimize (#4725) · 3f728c00

Authored by levi on Apr 28, 2021

* add memory detect info

* small fix in opattrref optimize

* use bitset

* refactor using vector

* refine

* refine

* rename

* refine

* address review

* address review

* refine

* refine

* address review

* smaller BITSET_SIZE

* refine

* refine

* refine

* refine naming

* refine

* refine

* refine

* update

* delete swp file

* small update

* format fix

* format modify

* format modify

* Update compiler.cpp

fix for comment

* Update reshape_user_op_util.cpp

bug about reshape is fixed
Co-authored-by: jackalcooper <jackalcooper@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

3f728c00

23 Apr, 2021 1 commit

Use bitset in MakePredicatorIsReachable to reduce memory usage (#4693) · e86dc7ed

Authored by Shenghang Tsai on Apr 23, 2021

* use bitset

* refactor using vector

* refine

* refine

* rename

* refine

* address review

* address review

* refine

* refine

* address review

* smaller BITSET_SIZE

* refine

* refine

* refine

* refine naming

* refine

* refine

* refine
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

e86dc7ed

19 Apr, 2021 2 commits

Feat: nccl_use_compute_stream support batch accumulation (#4618) · b38d9cde

Authored by cheng cheng on Apr 19, 2021

* NCCL logical refine timeshape

* Insert nccl ops after acc interface

* Insert NCCL ops after acc implement; need refine or add new acc_tick_op

* deadlock

* speed up and run

* add acc tick to fix deadlock; and add nccl comm debug log

* refine log: rm cc_debug_log and cclog

* use reference for speed up

* refine code for review

* fix for review
Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

b38d9cde

Remove RtBlobDesc (#4644) · a4a7e4df

Authored by cheng cheng on Apr 19, 2021

* Remove RtBlobDesc

* refine code for RuntimeBlobShapeInferHelper::BlobDesc4BnInOp
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

a4a7e4df

12 Apr, 2021 1 commit

Fix ctrl_in_op restriction (#4622) · 7758fd82

Authored by Juncheng on Apr 12, 2021

7758fd82

07 Apr, 2021 1 commit

Fix include cuda header (#4590) · 83c2db82

Authored by Juncheng on Apr 07, 2021

* Fix include cuda header

* Fix

83c2db82

06 Apr, 2021 1 commit

Nccl support s1 to B and P to s1 (#4579) · ee2d57b3

Authored by guo ran on Apr 06, 2021

* Nccl support s1 to B and P to s1

* refine

ee2d57b3

Oneflow-Inc / oneflow · last synced more than 2 years ago
