- 11 Jul, 2023 6 commits
-
-
Committed by pangengzheng
* support sharding parallel * fix name * fix * update * test amp for sharding --------- Co-authored-by: pangengzheng <pangengzheng.baidu.com>
-
Committed by ronnywang
-
Committed by MarDino
* add rmsnorm kernel * add static graph test * fix round type * use alignas to avoid msvc compile error * remove redundant header file to avoid rocm compile error * fix rocm compile not found cub * Add document
-
Committed by hong
* support optional input in new_ir * polish code * add coverage test * update * update * add unittest * remove duplicate code * update * fix assign error * revert test arg min max * update * fix bug * polish code * update * fix unique and close op bug * update * update * revert test code * revert unique test * polish code * remove useless code --------- Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>
-
Committed by FormlessUnit
* rename weight_only/llm.int8
-
Committed by hong
-
- 10 Jul, 2023 2 commits
- 07 Jul, 2023 6 commits
-
-
Committed by xiaoye
-
Committed by wz1qqx
-
Committed by ronnywang
-
Committed by hong
* add ir output check in OpTest * add ir grad check in op test * fix legacy name converter bug * add more unittest * fix * fix warprnn op bug * add white list * polish code * polish code * fix cummin cummax bug --------- Co-authored-by: kangguangli <kangguangli@hotmail.com>
-
Committed by 傅剑寒
* fix index_put bug when index is multi-dim bool tensor * fix name error
-
Committed by hong
* add ir output check in OpTest * add ir grad check in op test * fix legacy name converter bug * add more unittest * fix * fix warprnn op bug * add white list * polish code * polish code --------- Co-authored-by: kangguangli <kangguangli@hotmail.com>
-
- 06 Jul, 2023 4 commits
-
-
Committed by Leo Guo
* Fix bugs of dropout and dropout_grad. test=kunlun * Modify the code style of dropout_grad_kernel. test=kunlun
-
Committed by zhangbo9674
-
Committed by Shijie
-
Committed by houj04
-
- 05 Jul, 2023 4 commits
-
-
Committed by Hui Zhang
* masked select forward support broadcast * cpu forward and backward * gpu support mask broadcast * fix comment * x support broadcast * fix comment
-
Committed by hong
* support optional input in new_ir * polish code * add coverage test * update * update * add unittest * remove duplicate code * update * fix assign error * revert test arg min max * update * fix bug * polish code
-
Committed by LUZY0726
-
Committed by RedContritio
* configure elementwise_pow op_version * support auto generate for static op elementwise_pow * pre-commit run
-
- 04 Jul, 2023 4 commits
-
-
Committed by hong19860320
* Add XPU plugin to support the customized ops or improve the performance of the fusion ops based on hand-written xpu micro kernels. * refine README.md
-
Committed by hong
* support optional input in new_ir * polish code * add coverage test * update * update * add unittest * remove duplicate code * set test timeout
-
Committed by RedContritio
-
Committed by ronnywang
-
- 03 Jul, 2023 9 commits
-
-
Committed by jiangfan06
[XPU] Fix the topk and set_value ops that use temporary tensors, avoiding memory overlaps during multi-stream inference (#54851)
-
Committed by lzydev
* support auto-gen concat * fix bug in legacy_backward.yaml * fix bug in get_expeceted_kernel_type
-
Committed by RedContritio
* configure elementwise_mod op_version * support auto generate for static op elementwise_mod
-
Committed by ronnywang
* [CustomDevice] release device manager in py::atexit * fix hip_version macro * update * update
-
Committed by RedContritio
* configure elementwise_floordiv op_version * support auto generate for static op elementwise_floordiv * update unity_build_rule.cmake
-
Committed by LoneRanger
* fix the static op generation for group_norm * fix bug * fix bug * Update op_compat.yaml
-
Committed by LoneRanger
* add lerp bf16 support * fix bug * Update test_lerp_op.py modify the input dtype * modify the test_lerp_op.py * Update test_lerp_op.py * fix bug of import * add user_defined_grads * Update test_lerp_op.py * fix bug of grad * fix bug of grad * fix bug of grad * add the check for bfloat16 dtype
-
Committed by FormlessUnit
* add linear_compress API
-
Committed by niuliling123
-
- 02 Jul, 2023 1 commit
-
-
Committed by hong
* fix_fetch_op_and_null_type_bug * fix compile bug * add test case
-
- 01 Jul, 2023 1 commit
-
-
Committed by kangguangli
* refine program translator * fix warning: not override * fix bug * merge new modifications * modify by reviews * resolve conflicts * resolve conflicts * fix * fix * fix conflicts * add unittest for special op transcriber * set cpu as default backend * modify by reviews
-
- 30 Jun, 2023 3 commits