"PaddleCV/rrpn/checkpoint.py" does not exist on "7759fb755fefb205ae875aa8d485f7854534f673"
- 10 Feb, 2023 1 commit
Committed by RedContritio
* add dim check in scatter * add check in scatter.cu * add unittest * remove unnecessary log and comment --------- Co-authored-by: RedContritio <>
-
- 09 Feb, 2023 2 commits
Committed by Huang Jiyi
* decouple strided_memcpy * move strided_memcpy * move strided_memcpy to phi * fix namespace * update * fix gpu compile bugs
-
Committed by yuehuayingxueluo
* add multi_tensor_adam * update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py * fix adam.py optimizer.py * fix adamw.py * fix test_multi_tensor_adam.py * fix CI bug * fix CI coverage * fix ci bug * fix betapow * fix some bugs * fix test_adamw_op.py * fix CI coverage * fix multi_tensor_adam_kernel.cc * fix CI bug * fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py * fix code style * update C++ parts * remove python parts modification temporarily * add C++ ut * update betapow copy code logic * fix ci ut * fix windows ci * fix coverage ci * improve coverage rate --------- Co-authored-by: sneaxiy <sneaxiy@126.com>
-
- 08 Feb, 2023 1 commit
Committed by Huang Jiyi
-
- 07 Feb, 2023 1 commit
Committed by Yuang Liu
-
- 03 Feb, 2023 1 commit
Committed by RedContritio
-
- 02 Feb, 2023 2 commits
Committed by RedContritio
* add stride check for PoolOutputSize * add unittest
-
Committed by YuanRisheng
* fix bugs * fix ci bugs
-
- 01 Feb, 2023 3 commits
Committed by RedContritio
* add stride check for MaxPool * add unittests
-
Committed by limingshu
* A leap of try for cudaLaunchCooperativeKernel * fix bugs * Totally replace the lar cuda kernel * Fix bugs * fix code according to comments * fix codes according to review comments * adding some function overload * relocate the power operation. * add bf16 support for index select relevant ops * revert bf16 type change. * add changes for more op * fix code writing bugs
-
Committed by limingshu
* profile reduce kernel for fp16 and reduceHigherdim * use reinterpret_cast * fix for CI on ROCm * add Macro for ROCm * ROCm CI config * ROCm CI config * unit test repair * pull * add common_funcs.h * reduceType * Update reduce_function.h * not higher * rename * implement of matmul using cublasLt instead of cublas * cublasLt bugfix * Update matmul_kernel_impl.h * Update matmul_kernel_impl_via_blasLt.h * for-loop-algo * PR comments changes * add macro * ci unused variable isCublasLt * ci unused variable isCublasLt macro * split matmul to autotune * rewrite the split kernel with segmented_array * rewrite the split kernel with segmented_array * rewrite the split kernel with segmented_array * add some method for cuda_graph * fix bugs for rocm * change for ci-error * i dont know why ci-model-benchmark gives a shit error, so i recover codes with original one to see if original codes work. * add some changes for passing mode_benchmark and coverage ci * fix ci error * fix ci-rocm error * add some changes for header --------- Co-authored-by: zhangbopd <1299246947@qq.com> Co-authored-by: Bo Zhang <105368690+zhangbopd@users.noreply.github.com>
-
- 31 Jan, 2023 5 commits
Committed by zhangkaihuo
-
Committed by 张春乔
* fix mod 0 error * fix div 0 error in floormod
-
Committed by xiaoting
* support 0d tensor for interpolate * support 0d tensor for interpolate * add xpu unittest for interp * update unittest for interpolate * fix coverage * fix code style * fix for coverage * fix coverage
-
Committed by 张春乔
-
Committed by Yiqun Liu
* Unify the gpu implementation of stack and unstack to reuse the optimization. * Optimize the cuda implementation of unstack. * Use GpuMemcpyAsync instead of memory::Copy. * Fix error of calculating the index. * Use FastDivMod to further improve the performance of unstack.
-
- 30 Jan, 2023 1 commit
Committed by engineer1109
replace all TensorFromVector & TensorToVector AssignKernel async copy
-
- 18 Jan, 2023 1 commit
Committed by MarDino
* add align check * refine
-
- 16 Jan, 2023 1 commit
Committed by zlsh80826
* Update warpctc for cuda-12 * Deprecate cudaProfilerInitialize for CUDA > 11 * Deprecate CUSPARSE_MV_ALG_DEFAULT for CUDA_VERSION >= 11040 * Add the missing thrust header
-
- 13 Jan, 2023 3 commits
Committed by limingshu
* first commit * add some changes in stack kernel. * move the location of GeneralDivMod * fix code format error according to ci
-
Committed by zhangkaihuo
-
Committed by Yuanle Liu
-
- 11 Jan, 2023 1 commit
Committed by Yiqun Liu
* Implement a common PointerArray. * Polish codes. * Add including of header file. * Add the branch of kFix8. * Fix compiling error. * Add alignas hint to fix the performance drop. * Optimize the H2D copy in stack_grad. * Rename the macro. * Fix align hint for different compilers. * Polish the define of PADDLE_ALIGN. * Fix compiling error. * Remove the align hint on windows.
-
- 10 Jan, 2023 2 commits
- 09 Jan, 2023 2 commits
Committed by MarDino
* add concat optimization * refine * remove annotation * use alignas instead of aligned_storage
-
Committed by wangzhen38
-
- 04 Jan, 2023 1 commit
Committed by Yuanle Liu
-
- 03 Jan, 2023 2 commits
- 26 Dec, 2022 1 commit
Committed by Roc
* revert concat and change concat to stack * let stack kernel support int8, uint8 and bool type
-
- 20 Dec, 2022 1 commit
Committed by huangjiyi
* move dropout_impl from fluid to phi * move cuda_graph_with_memory_pool from fluid to phi * update namespace * remove cuda_graph in fluid * fix mac-build * fix bugs * correct CodeStyle * fix mac-build * fix mutable_data * fix stl include * fix copy param
-
- 19 Dec, 2022 2 commits
Committed by huangjiyi
* move gather_scatter_kernel from fluid to phi * mv gather_scatter_kernel to gather_scatter_functor
-
Committed by huangjiyi
* move maxouting from fluid to phi * move matrix_bit_code from fluid to phi * replace mutable_data and fix include * fix include * move gather_scatter_kernel from fluid to phi * Revert "move gather_scatter_kernel from fluid to phi" This reverts commit 3d0b1eaf179656072e8c483dfca688cccccdda01.
-
- 16 Dec, 2022 1 commit
Committed by MarDino
* optimize bias_add reluv2 in half2 * Add annotation * refine code format
-
- 15 Dec, 2022 1 commit
Committed by huangjiyi
-
- 14 Dec, 2022 1 commit
Committed by limingshu
* First Commit. * add some codes * add elementwise loader * fix code styles * merge with develop * add some changes both in elementwise and transpose * add init operation in broadcast kernel. * change codes according to pr suggestions about transpose file * fix error for op-benchmark ci * fix according to ci
-
- 12 Dec, 2022 2 commits
Committed by 傅剑寒
* fix codestyle * add double complex<float> complex<double> dtype support for syevj_batched * fix use_syevj flag for precision loss when input dtype of syevj_batch is complex128 in some case * optimize eigh in different case * fix missing ; bug * fix use_syevj bug * fix use_cusolver_syevj_batched flag
-
Committed by huangjiyi
* move norm_utils.cu.h from fluid to phi * remove norm_utils.h in fluid * fix bugs and replace mutable_data with Alloc * replace mutable_data with Alloc
-
- 08 Dec, 2022 1 commit
Committed by limingshu
-