- 14 Jul, 2023 2 commits
-
-
Committed by Tian Zheng
* Update CUDNN Frontend API to v0.9.1 - Remove old patches - Remove workarounds that are no longer needed * Fix test_switch_autotune
-
Committed by hong
-
- 13 Jul, 2023 28 commits
-
-
Committed by Yuanle Liu
* copy dense_tensor.h to inference lib * update * update
-
Committed by Yuanle Liu
-
Committed by niuliling123
-
Committed by xiaoguoguo626807
-
Committed by freeliuzc
* add init value for CudaSwishFunctor * add new phi kernel fusedBiasActKernel
-
Committed by Yichen Zhang
-
Committed by Ruibiao Chen
* Support nvprof for auto parallel * Fix CI errors * Fix CI errors
-
Committed by Charles-hit
* [prim]support fp16 for instance_norm and instance_norm_grad * support fp16 and bfp16 dtype for instance_norm prim rules * fix new ir test --------- Co-authored-by: cxxly <chenxx_id@163.com>
-
Committed by lil-Xing
* add phi operator c_concat and ut * update create_var use * update copyright
-
Committed by hong
* new ir support builtin slice op * fix phi kernel adaptor bug
-
Committed by gouzil
* [tools] Add CI for assert allclose. * fix * fix \s * update * rm demo1 * add demo1 * fix * rm demo;test=document_fix
-
Committed by zhangyuqin1998
* Move compare_raw_kernel to legacy * fix * Update compare_kernel.cc * Move compare_raw_kernel to legacy
-
Committed by Zhang Zheng
* [CINN] Schedule error message optimization * format code style * add test * fix format * using CINN_THROW and using flags * optimize error msg * do not use abstract class of error handler * fix header
-
Committed by ronnywang
-
Committed by Leo Chen
* Support AMP program for onnx QAT API * Integrate QAT into distributed optimizer * Reduce the size of test data and increase time limit * Use logger and reduce time limit of unittests * Rename and move unittest into fleet test * Test qat_init API
-
Committed by ming1753
-
Committed by Feng Ni
-
Committed by zyfncg
* new group fuse pass api * fix header * update * change logic of get master node to fix bug * revert update for ReduceFuseReduce * modify according review * modify by review * refine * update * fix code-format
-
Committed by zyfncg
* add check of input tensors in Yaml * fix bug of code-gen for opmaker * fix bug
-
Committed by risemeup1
* fix protobuf problem * fix protobuf problem
-
Committed by Wilber
-
Committed by HongyuJia
-
Committed by BiynXu
* [CINN] comb the op lowering code * [CINN] format code of OpLower
-
Committed by RichardWooSJTU
* add matmul int8
-
Committed by hong
* fix edit distance bug * add op define kernel data type * fix bug * update * add header * add op test to cmake
-
Committed by Qi Shao
* modify the accuracy checking framework of bf16 optest, including both of forward and backward
-
Committed by Aurelius84
* [NewIR]Disable copy and assign for Operation * add macros.h
-
Committed by Yuang Liu
-
- 12 Jul, 2023 10 commits
-
-
Committed by HongyuJia
-
Committed by JZ-LIANG
* resolute input sharding conflict maybe * fixed comment --------- Co-authored-by: Yichen Zhang <zhangyichen03@baidu.com> Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
-
Committed by HongyuJia
-
Committed by hong
* fix new ir expand op * fix count bug * remove useless code
-
Committed by Yuanle Liu
* rewrite identity_op_clean_pass * fix * adjust identity_op_clean_pass order in gpu passes * fix ut
-
Committed by FormlessUnit
* add macro to avoid llm.int8 build error * fix ci --------- Co-authored-by: wufeisheng <wfs1997@163.com>
-
Committed by ronnywang
-
Committed by ronnywang
* [CustomDevice] fix release error for process_group_custom * update
-
Committed by wangzhen38
-
Committed by hong
* refine program translator * fix warning: not override * fix bug * merge new modifications * modify by reviews * resolve conflicts * resolve conflicts * fix * fix * update * support selected rows * update * add selectrows * fix bug * add ut * refine code * refine code * update * update * support selected rows * support selected rows * support dense tensor * remove useless code * polish code * remote standalone executor test --------- Co-authored-by: kangguangli <kangguangli@hotmail.com> Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>
-