- 23 May, 2023 3 commits
- 22 May, 2023 4 commits
-
-
Committed by risemeup1
* update_c++14_to_c++17_on_windows
* disable test_audio_logmel_feature and test_audio_mel_feature
-
Committed by Wilber
-
Committed by Tian Zheng
* Add GPU kernel for multiclass_nms3 op
* Make multiclass_nms3 gpu kernel output consistent with cpu kernel
* Fix API incompatibility
* Fix unittests on builds without CUDA
* Fix ROCM build
* Remove fluid headers; Use default atol for unittest
* Change function and variable naming
* Add comments; Reduce redundant code
* Use paddle test framework
-
Committed by wangshengxiang
* bind xpu op: 3D grid sample
* fix edge cases in xpu op: reshape & slice
-
- 19 May, 2023 1 commit
-
-
Committed by limingshu
* Reorganize the forward codes of flash-attention.
* Fix forward.
* Remove some unused codes.
* Simplify codes and fix backward.
* Change all LOG(INFO) to VLOG and fix the backward.
* add scale for AF2 flash_attn, many thanks to xreki and shaojie for debugging these codes
* decrease the effect of debug print on performance
* Unify the initialization of flashattn arguments.
* Rewrite the reshape of temp_mask and temp_bias.
* API support use_flash_attn.
* Fix compiling error on CI.
* Try to crop the flash-attention lib.
* Correct the condition of whether can use flash-attn.
* Remove the softmax_out argument.
* Remove is_causal.
* Polish codes.
* Fix qkv_transpose_out's shape and scaling of Q * K.
* Update commit of flash-attention.
---------
Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
-
- 18 May, 2023 3 commits
-
-
Committed by Yuanle Liu
-
Committed by RedContritio
* simplify layer_norm_op.cc
* support auto generate for op layer_norm
* update unittest for composite_layer_norm
* remove layer_norm_op.cc from scripts
* replace layer_norm_op with generated_op
* add get_expected_kernel for layer_norm
* update cmake kernel register function for layer_norm_mkldnn_op
-
Committed by 张春乔
* rm cmake npu
* Update generic.cmake
* Update generic.cmake
-
- 17 May, 2023 2 commits
-
-
Committed by risemeup1
* optimize logsumexp in small data scale
* fix
* fix
* add #pragma once
* compile protobuf offline
* add submodule gflags
* check_submodules
* check_submodules
* add_submodule protobuf
* add_submodule_protobuf
* add_submodule
* add .gitmodules
* add_submodules
* fix_compiler error
* support offline compile
* support offline compile
* support offline_compile
* remove cub
* remove brpc
* support offline compile
* support offline compile
* canning patching on cryptopp
* modify .gitignore of cryptopp
* test
* offline compile
* add_submodule zlib
* modify .gitmodules
* modify .gitmodules
* fix setup.py bug
* delete submodule cryptopp
* fix windows compile bug
* fix xxhash compile problem
---------
Co-authored-by: Asthestarsfalll <1186454801@qq.com>
Co-authored-by: Asthestarsfalll <72954905+Asthestarsfalll@users.noreply.github.com>
-
Committed by Wilber
* update openblas version
* update
-
- 15 May, 2023 1 commit
-
-
Committed by chalsliu
* Reduce inference library size and compile time
* resolve conflicts
-
- 14 May, 2023 1 commit
-
-
Committed by tianshuo78520a
* fix build error
* fix build error
* fix
-
- 12 May, 2023 1 commit
-
-
Committed by RuohengMa
-
- 11 May, 2023 2 commits
- 09 May, 2023 1 commit
-
-
Committed by Wilber
-
- 08 May, 2023 1 commit
-
-
Committed by umiswing
-
- 06 May, 2023 1 commit
-
-
Committed by umiswing
kernels.
-
- 28 April, 2023 2 commits
-
-
Committed by wangshengxiang
-
Committed by xiaoguoguo626807
* add mul double grad
* add sub_double_grad
* add add sub high test
* add multiply test
* modify other unsqueeze
* delete api.yaml
* only for make ci run
* modify unsqueeze
* modify unsqueeze
* tmp
* modify operants gen
-
- 27 April, 2023 1 commit
-
-
Committed by risemeup1
* update cmake3.16 to 3.18
* test
* Update Dockerfile.ubuntu
-
- 26 April, 2023 2 commits
- 25 April, 2023 2 commits
- 24 April, 2023 2 commits
-
-
Committed by risemeup1
* fix patch error
* fix patch error
-
Committed by HongyuJia
* [CppExtension Cuda] Add cuda unit test for CppExtension
* update extra_compile_args for CUDAExtension
* add debug info
* Add patch to fix CUDA12 compile error
* patch for all env
* add windows judgement
* Try to fix setup function not found error
* fix mix_relu_and_extension include file
* fix setup compile error
* remove useless debug comments
* add sleep, debug CI-build
* add space to disable cmake cache
* remove debug info
* add space to pass CI-build
-
- 20 April, 2023 1 commit
-
-
Committed by Chitsing KUI
* add flash randomness control
* fix VLOG undefined
-
- 19 April, 2023 1 commit
-
-
Committed by Yuanle Liu
* remove c++14 assert and remove include tensor.h in phi
-
- 18 April, 2023 1 commit
-
-
Committed by 张春乔
-
- 13 April, 2023 3 commits
-
-
Committed by jjyaoao
* remove code with PADDLE_WITH_ASCEND
* try pass codestyle
-
Committed by zhangyuqin1998
* rename PD_REGISTER_GENERAL_KERNEL
* Update feed_op.cc
* fix
* Update strings_empty_kernel.cc
-
Committed by risemeup1
* fix ninja error
* fix_ninja_error_qa
-
- 12 April, 2023 1 commit
-
-
Committed by zqw_1997
* slight modify
* support cuda12+ arch, Hopper arch and discard 30 arch
* add arch 90 for each paddle_known_gpu_archs12
* for comments
-
- 11 April, 2023 3 commits
-
-
Committed by Yuanle Liu
-
Committed by jjyaoao
* Delete the keyword WITH_ASCEND_INT64 in configure.cmake and CMakeList
* try pass Static-Check
-
Committed by ykkk2333
-