- 14 Jun, 2023 2 commits
-
-
Committed by Zhang Jun
[inference][cherrypick] Implement layer_norm op using INormalization Layer and conv_fusion support bias's rank equal to input's rank (#54590) * [inference]conv_fusion support bias's rank equal to input's rank (#54477) * support bias's rank equal to input's rank * [inference][trt]layer_norm op with dynamic shape support INormalizationLayer in TRT8.6 (#54379) * layer_norm op with dynamic shape support INormalizationLayer in TRT8.6 * Using trt layer to make layers_norm op in lower than trt8.6 layer_norm op with dynamic shape support INormalizationLayer in TRT8.6 --------- Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com>
-
-
- 07 Jun, 2023 1 commit
-
-
Committed by Charles-hit
-
- 06 Jun, 2023 2 commits
-
-
Committed by houj04
-
Committed by Zhang Zheng
* Fix compilation error by using thrust * fix
-
- 05 Jun, 2023 7 commits
-
-
Committed by PommesPeter
* feat: added polygamma init code * feat: added polygamma unittest code * test: added more test cases * refactor: added forward impl * refactor: added backward impl * test: updated cases * refactor: updated test cases * refactor: added more case and fixed some bugs * test: updated ref func * refactor: updated code style * refactor: move the code * refactor: updated test * refactor: updated test * docs: updated en doc Co-authored-by: zachary sun <70642955+sunzhongkai588@users.noreply.github.com> * docs: updated math eq --------- Co-authored-by: zachary sun <70642955+sunzhongkai588@users.noreply.github.com>
-
Committed by gouzil
-
Committed by wangzhen38
-
Committed by houj04
-
Committed by umiswing
-
Committed by huangjiyi
Support code generation for op conv2d_transpose, conv3d_transpose, depthwise_conv2d_transpose (#54242)
-
Committed by Asthestarsfalll
* optimize logsumexp in small data scale * fix * fix * add #pragma once * swith to use aligned_vector and support arbitrarily shape * fix store * fix store * refine for special cases * try * fix * update * fix * fix all_reduce * try * fix rocm bug * fix rocm bug * fix rocm bug * fix rocm bug * fix rocm bug * fix rocm bug * fix rocm bug * fix rocm bug
-
- 03 Jun, 2023 1 commit
-
-
Committed by Scotty
-
- 02 Jun, 2023 8 commits
-
-
Committed by RedContritio
-
Committed by Difer
* add fp&bf16 bernoulli * add check_dtype & fix error * fix rocm error
-
Committed by wz1qqx
-
Committed by Hui Zhang
* floor div support float/double/bfloat16/float16 * add ut * fix bug * fix fft.ifftshift for floor_divide upgrade * fix comment * fix bugs * fix bug
-
Committed by Zhang Zheng
* Optimize perf of broadcast matmul * support more dtype
-
Committed by 傅剑寒
-
Committed by Zhang Ting
* support master_grad for adam and momentum Co-authored-by: zhangting_2017@163.com <zhangting2020>
-
Committed by Wang Xin
* static graph autogen code for shape op * fix onednn * fix onednn
-
- 01 Jun, 2023 5 commits
-
-
Committed by umiswing
-
Committed by zhouweiwei2014
-
Committed by ronnywang
* [ROCM] fix multihead_matmul * skip bf16 uts * update
-
Committed by YuanRisheng
-
Committed by huangjiyi
* update * update cmake * update * update * update * update * Revert "update cmake" This reverts commit 1e1dc1b2bc9967b725201272607f939260070fd4. * update * update * update * update
-
- 31 May, 2023 1 commit
-
-
Committed by Charles-hit
* support activation prim op bf16 dtype * remove useless code
-
- 30 May, 2023 4 commits
-
-
Committed by risemeup1
* update_c++17 * update_c++17 * fix windows bug * solve cirle depend * solve cirle depend * solve cirle depend * solve cirle depend * solve cirle depend * fix windows bug * fix compiler error * fix compiler error * update eigen3 * update eigen3 * update eigen3 * fix mac-py3 compiler error * update C++17 * fix mac compiler error * fix compile error * fix coverage_compiler error * fix coverage_ci_problem * fix coverage_error * fix_kunlun200 compile error * fix kunlun200 compiler error * fix compile error * fix compiler error * fix py3 failed test * fix kunlun200 compiler error * test * fix test error * fix test error * fix test error * test * test * fix mac py3 error * fix mac py3 error * fix mac py3 error * fix test error * fix test error * fix compile error * fix compile error * fix compile error * test * test * fix compiler error * test * test * debug on ci * fix compiler error * fix compiler error * test * fix cinn compiler error * test * fix rocm cmpile error * fix cinn and kunlun compile error * update c++14 * Update flags.cmake
-
Committed by shaojie_wang
* softmax fwd: force vec size to 1 when dtype is float * use 1024 as threshold to use cudnn
-
Committed by Yiqun Liu
* Reimplement the check_nan_inf function as check_numerics kernel. * Remove the cpu implemention to phi. * Add ifdef for the including of omp.h. * Move the use of FLAGS_check_nan_inf_level out of header file. * Implement a common PrintAndThrowError function. * Fix the error using of __NVCC__, which should be instead with __CUDA_ARCH__. * Add dependency of phi. * Polish codes and unittest.
-
Committed by houj04
-
- 26 May, 2023 1 commit
-
-
Committed by YuanRisheng
* create phi so * fix ci bugs * fix py3 bugs * add file * fix py3 bugs * fix windows bugs * perfect so * fix py3 bugs * delete all static target in phi * fix windows bugs * fix py3 bugs * fix ci bugs * fix windows bugs * fix bugs: gflags can't be linked by dynamic and static lib * fix bugs that can not load 3rd party * fix ci bugs * fix compile bugs * fix py3 bugs * fix conflict * fix xpu bugs * fix mac compile bugs * fix psgpu bugs * fix inference failed * deal with conflict * fix LIBRARY_PATH bug * fix windows bugs * fix onednn error * fix windows compile bugs * fix windows compile bugs * fix test_cuda_graph_static_mode_error aborted * fix windows bugs * fix mac-python3 error * fix hip compile bugs * change mode to static * change to static mode * fix ci bugs * fix py3 bugs * fix windows bugs * fix bugs * add static flag * add PADDLE_API * change position of PADDLE_API * fix windows bugs * change mode to dynamic lib * fix windows static bugs * deal with conflict * fix windows unit bug * fix coverage * deal with conflict * fix windows-inference * fix py3 bugs * fix bugs when compile type_info * fix compile bugs * fix py3 bugs * fix windows bugs * fix windows openblas * fix xpu bugs * fix enforce_test in windows * update code according comment * fix windows cmake bug * fix windows bugs * fix windows bugs * delete cinn unittest * fix cinn bugs --------- Co-authored-by: lzydev <1528794076@qq.com>
-
- 25 May, 2023 5 commits
-
-
Committed by zhangkaihuo
-
Committed by zhangkaihuo
-
Committed by thunder95
-
Committed by zhouweiwei2014
-
Committed by Leo Chen
* add log for memory stats * fix string_split in einsum
-
- 24 May, 2023 3 commits
-
-
Committed by Yiqun Liu
* Try to increase the repeat of autotune and fix the setting of allow_tf32_cublas. * Change the repeat of cublaslt to 10. * Use FLAGS_cublaslt_exhaustive_search_times as repeats. * Fix compiling error on CI. * Polish the key and simplify codes.
-
Committed by zhangyuqin1998
-
Committed by zhangyuqin1998
* move raw kernels to legacy * Update elementwise_add_kernel.cu * fix
-