- 11 Apr, 2023 1 commit
-
-
Committed by Yiqun Liu
* Fix scale kernel for low precision, cherry pick #50998.
* Fix the FP16 precision problem of add_n. (#50129)
* Change squared_l2_norm to reuse ReduceKernel, and register fp16 and bf16 kernel, which is cherry pick #48315.
* Cherry-pick the fix of MPTypeTrait in KP, which is implemented in #50993.
* Cherry-pick the multi-precision support of AdamW for bf16, #48041.
* Fix compiling error.
* Cherry-pick the fix of CubTensorReduceImpl for bfloat16 in #50993.
* Fix unittest.
---------
Co-authored-by: Nliuruyan <44316842+liuruyan@users.noreply.github.com>
-
- 05 Aug, 2022 1 commit
-
-
Committed by Feiyu Chan
* move fft kernels to phi, done with cufft, pocketfft, mkl_cdft, hipfft
* make stft_op use fft from phi/kernels/funcs, clean code
-
- 21 Jun, 2022 1 commit
-
-
Committed by Sing_chan
re-sort .cu headers, set clang-format to not sort include blocks and to treat .cu as the main source file (#43633)
-
- 05 Jun, 2022 1 commit
-
-
Committed by Sing_chan
-
- 12 Mar, 2022 1 commit
-
-
Committed by zyfncg
* move roi_align kernel to phi
* fix bug of roi_align xpu
-
- 01 Mar, 2022 1 commit
-
-
Committed by zhangbo9674
* add scale gather sum
* refine CUDA_ATOMIC_WRAPPER ADD for bf16
* add gather unittest
* solve conflict
* add scale unittest
* add sum unittest
* solve conflict
* refine gather unittest
* refine unittest
-
- 22 Feb, 2022 1 commit
-
-
Committed by Chen Weihang
* unify register macro
* rename declare macro
* fix infrt error
-
- 20 Feb, 2022 2 commits
-
-
Committed by Chen Weihang
* rename pten dir to phi
* rename namespace to phi
* rename infrt pten dir to phi
* resolve conflict
* rename pten to phi in cmake
* revert all infrt change
* change needed files
* fix infrt failed
* fix inference failed
-
Committed by Yiqun Liu
-
- 11 Feb, 2022 1 commit
-
-
Committed by Zhang Ting
* improve backward performance
* support different dtypes for elementwise ops
-
- 08 Feb, 2022 1 commit
-
-
Committed by niuliling123
* Replace clip, bce_loss, full and full_like with elementwise
-
- 28 Jan, 2022 1 commit
-
-
Committed by YuanRisheng
* refactor scale kernel whose input is selected_rows
* complement upload file
-
- 27 Jan, 2022 1 commit
-
-
Committed by Aurelius84
* Support allocate_from in Tensor and allocate_data in Context
* fix #ifdef CUDA
* fix cycle depends
* fix test_xxx_dev_api failed
* fix windows compiling error
* fix unittest
* modify into PImpl
* fix selected rows
* add TODO comment
* refine interface according to reviewer
-
- 24 Jan, 2022 1 commit
-
-
Committed by 石晓伟
* updates callers, test=develop
* updates tensor, test=develop
* fixes errors, test=develop
* remove some dtypes, test=develop
* fix errors in the base storage modification, test=develop
* fixes a bug, test=develop
* fixes the bugs in push the whole, test=develop
* updates, test=develop
* update
* update, test=develop
* fixes the mac-py3 CI, test=develop
* remove the storage impl, test=develop
* updates some codes, test=develop
* update, test=develop
* updates pten allocation, test=develop
-
- 20 Jan, 2022 1 commit
-
-
Committed by Aurelius84
* Migrate bfloat16/float16/complex from platform into pten::common
* fix typo
* fix code style
-
- 18 Jan, 2022 2 commits
-
-
Committed by Zhanlue Yang
* Merged LoDTensor with Tensor,test=allcases
* Patched python level LoDTensor
* Patched python level LoDTensor
* Merge Tensor into DenseTensor
* Fixed namespace issues,test=allcases
* Fixed merge issues
* Fixed inference issues
* Fixed NPU test issues
* Fixed merge issues
-
Committed by YuanRisheng
-
- 15 Jan, 2022 1 commit
-
-
Committed by Chen Weihang
-
- 13 Jan, 2022 2 commits
-
-
Committed by Chen Weihang
* rename register macro
* fix error changing
* fix format error
-
Committed by chentianyu03
* move dot_dev api into dot_kernel.h
* add infermeta header
* modify to dot kernel in dot_op.h
* move conj dev api into complex_kernel.h
* move sign dev api into sign_kernel.h
* move scale dev api into kernel.h and remove infermeta.h
* rm paddle/pten/include/math.h
* rm paddle/pten/include/math.h
* rm include dir
* rm paddle/pten/include/math.h
* fix conflict with develop branch
* rm devContext in conj_op.h
* add the missing complex_kernel header
-
- 12 Jan, 2022 1 commit
-
-
Committed by Zhang Ting
-
- 04 Jan, 2022 1 commit
-
-
Committed by niuliling123
Add OpFunctor and replace cast, scale, clip, bce_loss and abs_grad with elementwise_no_broadcast (#38500)
-
- 21 Dec, 2021 2 commits
-
-
Committed by Chen Weihang
* rename cuda to gpu
* revert CMake change
* resolve conflict
* rename other cuda to gpu
* polish details
-
Committed by Chen Weihang
* remove eigen and blas dir
* fix declare error
-
- 20 Dec, 2021 1 commit
-
-
Committed by zyfncg
-