- 19 Nov, 2021 4 commits
-
-
Committed by wuhuanzhou
* GeneratePass support attr condition and mapping, test=develop * fix coverage, test=develop * Add fuse_resnet_unit pass, test=develop * fix CI errors, test=develop * fix CI errors, test=develop * fix unittest error when compiling without CUDA, test=develop * fix static ci error, test=develop * limit kernel size must equal 1, test=develop
-
Committed by Feiyu Chan
-
Committed by Siming Dai
* add cpu version, using set: sum, min, max * add cpu version: mean * improve cpu code and fix dynamic memory allocation problem * fix arg error, add index judge, delete fp16 * fix bug in CudaAtomicMax and CudaAtomicMin * add CUDA version * fix grad_op bug for index * add op test, add correct cpu grad op * Add correct CUDA Mean grad * [Add] Successful MEAN and SUM * [Add] Successful MIN and MAX in CPU * [Add] Successful MIN and MAX in CUDA * fix windows dtype ci * fix ROCM ci by adding HIP flag * rename fused_gather_scatter to send_recv * unify name as send and recv * change zero index return time * add send_recv incubate api * fix index data type, add unittest case for API * delete redundant input tensor * fix en example and docs, add default value in pool_type * add shape judge and max grid judge * fix comment * fix index type bug * add const & * fix en docs * delete numpy in examples * add unittest for int input * fix send_recv comment * change send_recv to graph_send_recv
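The gather-then-scatter reduction that this commit describes (originally fused_gather_scatter, finally graph_send_recv, with sum, mean, min and max pooling) can be sketched in plain Python. This is an illustrative reference only, not the operator's CPU or CUDA kernel: the function name `graph_send_recv_reference` and the parameter names `src_index` and `dst_index` are chosen for this sketch, and it assumes that the output has as many rows as the input and that rows receiving no message are filled with zeros.

```python
def graph_send_recv_reference(x, src_index, dst_index, pool_type="sum"):
    """Gather rows x[src] and reduce them into output rows indexed by dst.

    Illustrative sketch only; names and zero-fill behaviour are assumptions.
    """
    if len(src_index) != len(dst_index):
        raise ValueError("src_index and dst_index must have the same length")
    if pool_type not in ("sum", "mean", "min", "max"):
        raise ValueError("unsupported pool_type: %r" % (pool_type,))

    width = len(x[0]) if x else 0
    # One bucket of incoming rows ("messages") per destination row.
    buckets = [[] for _ in range(len(x))]
    for src, dst in zip(src_index, dst_index):
        buckets[dst].append(x[src])

    out = []
    for rows in buckets:
        if not rows:
            # Assumption: destinations with no incoming message stay zero.
            out.append([0] * width)
            continue
        columns = list(zip(*rows))  # reduce column by column
        if pool_type == "sum":
            out.append([sum(c) for c in columns])
        elif pool_type == "mean":
            out.append([sum(c) / len(c) for c in columns])
        elif pool_type == "min":
            out.append([min(c) for c in columns])
        else:  # "max"
            out.append([max(c) for c in columns])
    return out


x = [[0, 2, 3], [1, 4, 5], [2, 6, 7]]
src_index = [0, 1, 2, 0]
dst_index = [1, 2, 1, 0]
print(graph_send_recv_reference(x, src_index, dst_index, "sum"))
```

In this usage, row 1 of the result receives rows 0 and 2 of `x`, so the four pooling modes differ only in how those two rows are combined.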
-
Committed by LiYuRio
-
- 18 Nov, 2021 4 commits
-
-
Committed by Li Min
* fix bug to support dropout eval grad computing. * Remove useless code.
-
Committed by YuanRisheng
* elementwise_add kernel refactor * fix compile bugs in elementwise_add refactor * fix compile bugs when run in npu/xpu * fix bugs when run unit test * fix bugs when run ci-windows * modify code as recommended * code format adjust * fix bugs when run ci * fix compile bug when run in ci-windows * elementwise_sub refactor * add PD_DLL_DECL for elementwise_sub * fix bugs when compiling
-
Committed by Zhen Wang
* Add the `GetFetchNames` method in CinnGraphSymbolization. * Use unordered_set instead of vector as the type of fetch_var_names. * Reuse the definition of kCompilationKey. * Use CompileOptions to set fetch_var_ids. * Update the argument passing of GraphCompiler.Build. * Fix some bugs in CinnGraphSymbolization::GetFetchIds.
-
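Committed by zhangkaihuo
topk has two implementations: one based on cub and one hand-written kernel. cub obtains the top k by sorting, and measurements over multiple data sets show that it outperforms the hand-written kernel only when input_width >= 128 and k exceeds 75% of input_width.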
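The selection rule stated in this commit message can be written as a small predicate. The sketch below is a plain-Python illustration of the stated condition only; the function name `prefer_cub_topk` is hypothetical, and the actual dispatch code in the kernel is not shown in this log.

```python
def prefer_cub_topk(input_width, k):
    """Return True when the sort-based cub path is expected to be faster.

    Encodes the rule from the commit message: input_width >= 128 and
    k greater than 75% of input_width. Integer arithmetic avoids any
    floating-point rounding at the 75% boundary.
    """
    return input_width >= 128 and 4 * k > 3 * input_width


for width, k in [(128, 97), (128, 96), (64, 60), (1000, 100)]:
    path = "cub" if prefer_cub_topk(width, k) else "hand-written kernel"
    print(width, k, path)
```

Reading "exceeds 75%" as a strict inequality is an interpretation of the message; a non-strict comparison would differ only at the exact boundary.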
-
- 17 Nov, 2021 6 commits
-
-
Committed by Sławomir Siwek
* Use oneDNN reorder instead of custom one * Fix whitespace typo * Fix Code format error * Incorporating feedback * Remove unnecessary reorder * Support GIOHW format * Fix code format error
-
Committed by piotrekobiIntel
* Change first batch of mkldnn headers and namespace names to dnnl * Revert changes to tensor.h, which require approval * Format changes with pre-commit * Add int32 tests * Fix int32 tests and call GetDataFromTensor for int32 * Fix test
-
Committed by niuliling123
* Modify reduce_op.op.h for xpu2 with kernel primitive api
-
Committed by zmx
* fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * refactor heter trainer. test=develop * fix. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop
-
Committed by Leo Chen
* copy beta pow to same place when skip_update=1 * fix xpu
-
Committed by WangXi
-
- 16 Nov, 2021 5 commits
-
-
Committed by arlesniak
* Added BF16 Pool2d grad * upstream pulled * fix for CI * fixes after review
-
Committed by YuanRisheng
* reshape kernel refactor * fix compile bugs when run ci * support xpu for reshape * fix bugs when run unittest in kunlun ci * fix compile bugs when run kunlun * perfect code according to suggestion * add api and unit test for reshape
-
Committed by Yiqun Liu
* Make FLAGS_determinstic effective in conv2d forward. * Add call of SetCinnCudnnDeterministic in cinn_launch op.
-
Committed by jakpiase
-
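Committed by Li Min
The implementation of fused_attention_op uses bias_add, which is itself implemented with kernel primitives. The WriteData API of kernel primitive and its internal implementation later changed: the out-of-bounds check was moved into a template parameter, so the wrong branch was called and out-of-bounds writes corrupted the contents of other GPU memory. The symptom is that a single run of test_fused_attention_op_api.py almost never fails, but repeated runs in a loop with inputs of different shapes produce incorrect results. The failure is intermittent, which makes the bug hard to notice.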
-
- 15 Nov, 2021 6 commits
-
-
Committed by Chen Weihang
* move extension into pten [no-verify] * append tensor methods by ext_tensor [no-verify] * append other tensor methods [no-verify] * ext related files tidy [no-verify] * include relation tidy [no-verify] * add pten tensor test [no-verify] * replace tensor in custom op & compile success * refine tensor constructor for unittest * custom relu jit run success * fix all custom op unittests * add inference cmake adapt [no-verify] * fix failed unittests * fix windows failed unittests * try to fix kunlun and inference failed * fix test_elementwise_api error * try to fix win compile failed * fix kunlun fp16 type error * remove useless handle error macro * add custom linear op test * fix compile failed & add win symbols * fix non pten kernel cast failed * add dll decl for api * polish several details * polish details by review comment * add dll_decl for register
-
Committed by feng_shuai
-
Committed by arlesniak
* Added BF16 to mean op * fix for CI * fix for CI * fix for CI
-
Committed by Weilong Wu
* Add elementwise_mul triple grad kernel * Removed InplaceInferer and polished code
-
Committed by Zeng Jinle
* add split_program * make ut faster * increase ut timeout * make result deterministic * add fuse_all_reduce pass * add ut framework, update * fix ut framework * remove useless code * add coverage support * update * fix CI * fix some bugs and fix ci coverage * fix conflict
-
Committed by zmx
* fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix ut. test=develop * fix ut. test=develop * fix ut. test=develop
-
- 14 Nov, 2021 1 commit
-
-
Committed by YuanRisheng
* reshape kernel refactor * fix compile bugs when run ci * support xpu for reshape * fix bugs when run unittest in kunlun ci * fix compile bugs when run kunlun * perfect code according to suggestion
-
- 13 Nov, 2021 1 commit
-
-
Committed by CtfGo
Modify several implementations on CinnLaunchOp: 1. Skip checking input variables must be used 2. Move current helper functions to a CinnlaunchContext
-
- 12 Nov, 2021 5 commits
-
-
Committed by zhangkaihuo
* fix bug: 1. atten: set the default value of attn_dropout_rate to None 2. ffn: add activation parameter
-
Committed by Chen Weihang
-
Committed by zyfncg
* adjust the param of full_like api in pten * adjust the code format * adjust the code format * adjust the code format
-
Committed by Aganlengzi
-
Committed by YuanRisheng
* elementwise_add kernel refactor * fix compile bugs in elementwise_add refactor * fix compile bugs when run in npu/xpu * fix bugs when run unit test * fix bugs when run ci-windows * modify code as recommended * code format adjust * fix bugs when run ci * fix compile bug when run in ci-windows
-
- 11 Nov, 2021 4 commits
-
-
Committed by zmx
-
Committed by TTerror
* add where/where_index/masked_select for kunlun * fix where/where_index * update where/masked_select
-
Committed by jakpiase
* added softplus + activation fuse pass * minor change * implemented reviewer suggestion * minor fix * minor fix * added scale_out parameter * minor fix * fix for iScan CI * conditionally disabled logs * refactored pass builder
-
Committed by zmx
* change username * fix * fix * fix * fix * fix * update * update * update unittests * fix * update * fix * update * fix * fix * fix * update * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update send_and_recv op. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix ut. test=develop * fix unit. notest,test=coverage * fix ut. notest, test=coverage * update. notest,test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * fix. notest, test=coverage * fix. notest, test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * fix ut. notest, test=coverage * add func. notest, test=coverage * fix ut. notest, test=coverage * fix. test=develop * fix. test=develop
-
- 10 Nov, 2021 3 commits
- 09 Nov, 2021 1 commit
-
-
Committed by Haohongxiang
-