- 10 May, 2023 1 commit
-
-
Committed by Bo Zhang
* Support different dtypes of inputs for broadcast for dropout optimization (#52093)
* change judgement for DropoutGradGPUKernelDriver
* add UnrollerWithoutVecSize and after this Loaddata to be refined
* pass unittest
* use same unroller with XPU
* BroadcastWithInt64Index
* BroadcastDataLoader template partial specialization
* fix compile errs in ROCms
* PR comment
* dropout_nd_optimization (#51479)
* with printf
* add DropOutNdForwardKernel
* PR comment
* Dropout optimize & clean broadcast inT and ElementwiseType (#52969)
* change judgement for DropoutGradGPUKernelDriver
* add UnrollerWithoutVecSize and after this Loaddata to be refined
* pass unittest
* use same unroller with XPU
* BroadcastWithInt64Index
* BroadcastDataLoader template partial specialization
* fix compile errs in ROCms
* clean ElementwiseT and InT for BroadcastKernel
* default axis and clean inT
* remove redundant fast divmod computation
* optimize drop_nd & drop_nd_grad
* optimize BroadcastDataLoader bf16 fp16
* rm InT etc. after merge develop
* delete constexpr for windows ci
* fix conflict
* fix conflict with develop
* fix conflict
* new clean
* clean
* Fix xpu2 kp compile error (#53548)
* fix conflict
* conflict
-
- 08 March, 2023 1 commit
-
-
Committed by Huang Jiyi
-
- 03 March, 2023 1 commit
-
-
Committed by YuanRisheng
* decouple memory copy
* fix ci bugs
* fix ci compile bugs
* fix rocm compile
* fix ci bugs
-
- 21 February, 2023 1 commit
-
-
Committed by YuanRisheng
* decouple_memory
* perfect memory utils
* fix ci bugs
* fix inference bugs
* fix custom test bugs
* fix coverage bugs
* modify code according to comment
* modify namespace
* deal with compile bugs
-
- 17 February, 2023 1 commit
-
-
Committed by Huang Jiyi
* rm framework::tensor_util in phi
* clean TensoCopy
* fix bugs
* fix bugs
* fix bugs
* replace mutable_data
* revert custom_device_test.cc
-
- 01 September, 2022 1 commit
-
-
Committed by Leo Chen
* refine cmake of framework
* add deps for dense tensor
* fix deps
* remove alloc(ctx)
* add depends on mkldnn
-
- 30 August, 2022 1 commit
-
-
Committed by WangZhen
* [OpAttr]Adapt tensor axis for reduce_min/max/mean/sum/prod
-
- 21 July, 2022 1 commit
-
-
Committed by xiongkun
* svd cpu forward
* svd gpu forward
* transfer the backward of svd
* remove cusolver in svd_grad
* svd kernel bug fix
* fix bugs
* fix bugs.
* fix bug
-
- 21 June, 2022 1 commit
-
-
Committed by Sing_chan
resort .cu headers, set clang-format not sort include block and consider .cu as main source file (#43633)
-
- 05 June, 2022 1 commit
-
-
Committed by Sing_chan
-
- 19 April, 2022 1 commit
-
-
Committed by YuanRisheng
[Phi]Separate AddKernel/DivideKernel/SubtractKernel/MultiplyKernel from ElementwiseKernel(Part1) (#41806)
* separate add/div/sub/mul from elementwise
* delete code
* fix compile bugs
* deal with conflict
* fix bugs when compile
* fix windows unit test bug
* fix ci coverage bugs
-
- 15 April, 2022 1 commit
-
-
Committed by chentianyu03
* split reduce_kernel
* rm reduce_kernel in cmake
* split reduce_grad kernels
* fix cmake build error
* format code
* fix standalone_executor_test error
-
- 17 March, 2022 1 commit
-
-
Committed by YuanRisheng
-
- 16 March, 2022 1 commit
-
-
Committed by chentianyu03
* move reduce kernels into one file
* rename reduce_prod to prod
* move reduce sum/mean from math_kernel into reduce_kernel
* rm comment
-
- 15 March, 2022 1 commit
-
-
Committed by crystal
-
- 14 March, 2022 1 commit
-
-
Committed by crystal
* migrate matrix_rank to phi
* migrate eigh and matrix_rank to phi
* fix matrix_rank
* optimize code
* move matrix_rank to phi
* add max functor
* migrate matrix_rank to phi
* optimize code
-