- 28 Oct, 2021 7 commits
-
-
Committed by Shenghang Tsai
* add todo * refine * add attr * refine * refine * add todo * refine * add alias c1 for check-oneflow * fix * update scripts * refine * fix single client env reinit * add attr * save and pass mlir module * fix * restore module in kernel * lower in kernel * refien * add scf to std * update lit * fmt * add all passes * add alisas * refein * refein * add check * fix pass order * add TODO * refein * create jit exe * refein * fix arity * add check and rpint err * refein * refein * refein * refein * refein * refein * emiit c * working * revert * add err print * e2e works * refein * refein * refein * use STATIC_SWITCH_FUNC * add log * rename * use invoke packed * refein * add todo * refein * rm log * fix * refein * rm * refein * add scf to gpu * add cmake flag for cuda runner * add CMAKE_CUDA_COMPILER * refine * refien * register gpu kernel * refein * add gpu passes * refein * add * refine * add ptx to cubin pass * produce cubin * add gpu to llvm pass * refein * add log * refien * link mlir cuda runtime lib * add note * make gpu runner available in file check * rm unused * add to prevent break * fix with cuda * edit mlir by hand to have it run on cuda * rm useless * add todo * upgrade llvm * refein m,irror scripts * fix for llvm upgrade * refein cmake * fix * fix for llvm upgrade * remove unused headers * refeine * refein * refactor * add * refine * refine * cmake first class cuda support * refine * refine * refein * refine * refine * refine * refein * add todo * refine * pass shared lib path from py * prevent redef ONEFLOW_CMAKE_BUILD_TYPE * refine msg * fix fmt * fix fmt * fix fmt * refine * refueb * fix * refactor jit function outline * refein * rm debug log * rm unnecessary erase * use 75 * refein * add allowFoldingUnitDimReshapes * refine * Outline JIT func (#6542) * check in pass impl * add test * check in changes * add todo * extract func to create attrs * refine * refine and mv bert * refein LLVM_EXTERNAL_LIT * refine log user_op::AttrValueUtil::ToCppAttrValue * 
fix for nd_sbp * refine log * fix warnings * fix * leverage input_order and output_order * save lbn_segment_keys as input output order * refine * refein * add CUDATOOLKIT_BIN_ROOT * finish todo * finish todo * finish todo * add matmul * rm repetitive code * add log * add unary * add gather * refine and add gelu * fix loc * add mlir conv op (#6559) * add mlir conv op * fix conv2d tabelgen bug * fix merge compile error * fix comments * Update mlir-cuda-75.cmake * add mlir resnet50 test * add SI32ArrayAttr Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com> * backport refactoring of translation * Add resnet50 mlir dialect part ops (#6607) * add scalar math ops tablegen * add pool ops * add bias_add op * fix comment * fix comment * code format * add reshape op * add reduce ops and restruct scalar math ops * fix bug * fix typo * address review * address review * rm loggin * address review * rm logging * backport variable rename * add flag ONEFLOW_MLIR_ENABLE_FUSERS Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
-
Committed by guo ran
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Shenghang Tsai
* use git to clean dir * rm useless to trigger CI * trigger CI * refine * refine * refine * refine * fix typo PopulateOpAttribute
-
Committed by liufengwei0103
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Luyang
-
Committed by Yinggang Wang
* feat(autograd.Function): add base class define * format * feat(autograd.Function): cache FunctionOpExpr in AutogradFunctionBase and pass autograd.Function name to cpp * feat(autograd.Function): wrapper PyFunction to FType * fix(autograd.Function): fix wrapper function capture bug * feat(autograd.Function): support autograd.Function backward * feat(autograd.Function): refine apply return value * fix(autograd.Function): fix autograd.Function name bug * feat(autograd.Function): refine ctx python api * feat(*): refine apply interface * test(autograd.Function): fix ctx interface and add test * feat(autograd.Function): support mark_non_differentiable * align ctx.saved_tensors interface * docs(autograd.Function): export documentation * refine function names * refine interface * use py::args instead of py::object * refine code * fix(*): fix `func_name` variable conflict with CHECK_JUST * feat(autograd.Function): support static call * docs(autograd.Function): update documentation * refine code * add JUST Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Juncheng
* Interface primitive::BroadcastElementwiseBinary * refine Co-authored-by: guo ran <360112263@qq.com>
-
- 27 Oct, 2021 4 commits
-
-
Committed by daquexian
Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Juncheng
* Matmul kernels use primitive * refine * fix Co-authored-by: guo ran <360112263@qq.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by qq_22305325
* fix_naive_eager_boxing_checker * refine eager_p_to_b_kerne and eager_p_to_s_kernel
-
Committed by Zhanghuihong Guan
* initial commit for adding logical_not operator * adding logical_not op, debugging Dtype related problems * finished testing locally, need to add tests * added tests * added docs and formatted code * format file * format file * remove python wrapper * modification based on review * remove redundant code, format file * modifications based on reviews * modifications based on review * fix duplicate license info * fix docstring * fix docstring warning Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 26 Oct, 2021 8 commits
-
-
Committed by Juncheng
-
Committed by ZZK
* fix sbp for prelu * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Liang Depeng
* imporve roll speed * imporve speed of len(dims) > 1 cases Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by ZZK
* dev torch style permute kernel * Refine * fix batch permute launch condition * fix batch permute dispatch logic * remove redundant header file * simplified check logic * use permute primitives in transpose kernels * fix batch permute logic and avoid mod * remove redundant templates * fix grid step * add grid for loop to avoid the elementnum is too large * fix bug when hw is not divided by tile size * refine format * add a copy kernel as a baseline * remove annotation * add copy kernel * add sync * use batch permute for profile * add copy tile baseline * simplify params for copy kernel * add slow copy kernel * use mul to instead mod and remove copy * use movement size = 4 when h w is modify by 2 * Add temp process for half2 * add half2 specialized kernel * remove redundant license * simplified code * fix format * fix comment * fix comment * use bad for loop condition * merge half2 in load * fix bad for loop in batch permute * refine * use align storage * refine * fix comment * fix comment * fix format * add const and remove redundant header file * remove register macro * refine cuda code * fix guoran comment * fix format * fix some details * remove cuda graph * fix for 0d tensor Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Twice
* c++ standard: bump to 14 * remove cplusplus_14.h & use cxx14 * fix python test * fix .clang-format Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Xiaoyu Xu
* support create opt in graph * add comment Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by guo ran
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by qq_22305325
Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
-
- 25 Oct, 2021 4 commits
-
-
Committed by Juncheng
-
Committed by Li Xinqi
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Liang Depeng
* add roll op * imporve speed * improve speed when len(dims) == 1 * move some logic to C++ * fix static analysis error * refine doc * add roll doc * refine codes according to review comments * remove runcudakernel macro Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com>
-
Committed by Li Xinqi
* remove most usage of macros INTRUSIVE_* * rename most INTRUSVE_XXX macros to REFLECTIVE_XXX * move intrusive::Base to intrusive/base.h * 1) remove OFFSET_STRUCT_FIELD; 2) mv test cases of HeadFreeList into head_free_list_test.cpp Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 23 Oct, 2021 7 commits
-
-
Committed by Li Xinqi
* refactor vm preschedule * TryMoveFromWaitingToReady * revert flying_instruction_cnt * revert to single position to call DispatchInstruction * revert several code * remove is_xxx_hook_empty Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by XIE Xuan
* add commit based pip index.html * use provision for test * roll back to release nodes Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Houjiang Chen
* Async access blob and copy data between python object, and fix deadlock. * revert * fix * refine code style * optimize treat single as tuple * Fix tensor numpy api. * adapt interface to compatiblility test * auto format by CI * refine * Back up the numpy array when copy data from array to tensor async * fix pybind blob api * Make sure array is C-style contiguous. * decrease ref * fixup * Move foreign lock helper base into core/common. * Release GIL before call SpinWaitUntilTimeout Co-authored-by: Zhanghuihong <garfield.gzhh@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by guo ran
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Peihong Liu
* refine sequence_function.h * refine nn_functor with sequence_function * refine activation_functor with sequence_function * refine generator * refine * add thne_if * refine array_functor with sequence_function * refine * refine reduce grad funcs with sequence_function * remove GET_GENERATOR Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by tangnana925
* add test file at first * add tripletMarginLoss py code * module ok * add forward test * amend test code * delete import torch * add autotest ok * delete numpy test code * amend docstring * amend loss.py, delete None * API transfer to C++ * motify module * delete cout * delete cout * Submit some modified code first * submit vector_norm functor * matrix norm * Refine max/min functor (#6359) merge to dev_tripletMarginLoss * replace reducemax and reducemin * amend code error * motify code * delete norm2 * delete print * delete norm2 * delete print * motify review code * add assert to c++ * motify review code * add else * motify review problem * add code * add test code * motify code delete dim_check * delete norm.py code * delete print * delete print * delete pu norm * delete error code * motify docsting * auto format by CI * delete no use num_dims * delete import torch lib * delete CI bug code * motify clip_grad_norm_ resolve autotest bug * auto format by CI * motify loss docstring * motify norm docstring Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
-
Committed by Juncheng
* Fix SimplifyPermutation * fix * fix typo * fix * add test * fix init Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
- 22 Oct, 2021 9 commits
-
-
Committed by Zhanghuihong Guan
* initalizes cuda context in kernel of copy * debugging * add call once * remove redundant code * changes based on review * delete redundant code * fix clang compile error Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by daquexian
Signed-off-by: daquexian <daquexian566@gmail.com>
-
Committed by guo ran
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Shijie
* fix typo * use cuda elementwise Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Shijie
* dev masked fill * refine * make static_check happy Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Juncheng
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by guo ran
* interface primitive::softmax * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Houjiang Chen
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-
Committed by Luyang
-
- 21 Oct, 2021 1 commit
-
-
Committed by Juncheng
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
-