1. 23 May, 2023 3 commits
  2. 22 May, 2023 2 commits
  3. 19 May, 2023 1 commit
    • Add flash attention to speedup fused_gate_attention. (#52731) · d29c1f8e
      limingshu committed
      * Reorganize the forward codes of flash-attention.
      
      * Fix forward.
      
      * Remove some unused code.
      
      * Simplify codes and fix backward.
      
      * Change all LOG(INFO) to VLOG and fix the backward.
      
      * add scale for AF2 flash_attn, many thanks to xreki and shaojie for debugging this code
      
      * decrease the effect of debug print on performance
      
      * Unify the initialization of flashattn arguments.
      
      * Rewrite the reshape of temp_mask and temp_bias.
      
      * API support use_flash_attn.
      
      * Fix compiling error on CI.
      
      * Try to crop the flash-attention lib.
      
      * Correct the condition for whether flash-attn can be used.
      
      * Remove the softmax_out argument.
      
      * Remove is_causal.
      
      * Polish codes.
      
      * Fix qkv_transpose_out's shape and scaling of Q * K.
      
      * Update commit of flash-attention.
      
      ---------
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
  4. 17 May, 2023 2 commits
    • Supports offline compilation of Paddle third-party libraries (#53744) · 734dc448
      risemeup1 committed
      * optimize logsumexp in small data scale
      
      * fix
      
      * fix
      
      * add #pragma once
      
      * compile protobuf offline
      
      * add submodule gflags
      
      * check_submodules
      
      * check_submodules
      
      * add_submodule protobuf
      
      * add_submodule_protobuf
      
      * add_submodule
      
      * add .gitmodules
      
      * add_submodules
      
      * fix_compiler error
      
      * support offline compile
      
      * support offline compile
      
      * support offline_compile
      
      * remove cub
      
      * remove brpc
      
      * support offline compile
      
      * support offline compile
      
      * canning patching on cryptopp
      
      * modify .gitignore of cryptopp
      
      * test
      
      * offline compile
      
      * add_submodule zlib
      
      * modify .gitmodules
      
      * modify .gitmodules
      
      * fix setup.py bug
      
      * delete submodule cryptopp
      
      * fix windows compile bug
      
      * fix xxhash compile problem
      
      ---------
      Co-authored-by: Asthestarsfalll <1186454801@qq.com>
      Co-authored-by: Asthestarsfalll <72954905+Asthestarsfalll@users.noreply.github.com>
    • update openblas version (#53748) · 89653668
      Wilber committed
      * update openblas version
      
      * update
  5. 14 May, 2023 1 commit
  6. 12 May, 2023 1 commit
  7. 11 May, 2023 2 commits
  8. 09 May, 2023 1 commit
  9. 08 May, 2023 1 commit
  10. 06 May, 2023 1 commit
  11. 28 Apr, 2023 2 commits
  12. 27 Apr, 2023 1 commit
  13. 26 Apr, 2023 1 commit
  14. 24 Apr, 2023 2 commits
    • Fix gpu ps compile patch error (#53256) · 562d2daf
      risemeup1 committed
      * fix patch error
      
      * fix patch error
    • [CppExtension Cuda] Add cuda unit test for CppExtension (#52900) · 7a9754a7
      HongyuJia committed
      * [CppExtension Cuda] Add cuda unit test for CppExtension
      
      * update extra_compile_args for CUDAExtension
      
      * add debug info
      
      * Add patch to fix CUDA12 compile error
      
      * patch for all env
      
      * add windows judgement
      
      * Try to fix setup function not found error
      
      * fix mix_relu_and_extension include file
      
      * fix setup compile error
      
      * remove useless debug comments
      
      * add sleep, debug CI-build
      
      * add space to disable cmake cache
      
      * remove debug info
      
      * add space to pass CI-build
  15. 20 Apr, 2023 1 commit
  16. 13 Apr, 2023 2 commits
  17. 11 Apr, 2023 1 commit
  18. 10 Apr, 2023 3 commits
  19. 03 Apr, 2023 2 commits
  20. 01 Apr, 2023 1 commit
  21. 30 Mar, 2023 1 commit
  22. 29 Mar, 2023 2 commits
  23. 28 Mar, 2023 1 commit
    • Add basic functionalities to support Scalar & Scalars in op attr (#51984) · 2e9fd5e4
      Feiyu Chan committed
      Add basic functionalities to support Scalar & Scalars in operator attribute.
      
      1. extend allowed types in operator's attribute type, add `paddle::experimental::Scalar`, add corresponding protobuf Message types;
      2. Scalar enhancement, add formatting, equality;
      3. add code to handle Scalar & Scalars in opmaker, conversion from paddle operator to phi kernel, opdesc construction and manipulation, tensorrt converter, tracer, operator construction, etc;
      4. bind `paddle::experimental::Scalar` to python, as `libpaddle.Scalar`;
      5. add functionality to canonicalize attribute map according to OpProto (if the op the attribute map is used for has an OpProto);
      6. add code to manipulate Scalar proto message via protobuf python API.
      
      Add unittests.
      
      1. add test cases for formatting, equality for Scalars, and WrapAsScalars;
      2. add test cases for 'casting' between different morphs of attributes;
      3. add test cases for extracting scalar & scalars from attribute;
      4. add test cases for CanonicalizeScalarAttrs (and fix a bug in type index offset);
      5. fix gmock's library filename on windows platform.
      6. clean code: use canonicalize_attrs instead of inlining the function;
      7. add test cases for libpaddle.Scalar in python code.
      8. add test cases for `make_scalar_proto`, which manipulates proto message `Scalar` via protobuf python API.
  24. 27 Mar, 2023 2 commits
  25. 24 Mar, 2023 2 commits
    • add phi operator allreduce/reduce (#51857) · 47f87ad3
      TaoTao Li committed
      * add all_reduce, reduce kernel and api
      
      * fix all_reduce reduce ut
      
      fix reduce op maker conflict
      
      fix merge conflicts
      
      * fix conflicts, rename ReduceOp->ReduceBaseOp in reduce_ops
      
      rename allreduce op, to remove
      
      * fix code format
      
      fix comments
      
      * modify test_collective_reduce_api ut timeout
      
      * fix PR-CI-Build
      
      fix comments: format phi operator
    • Fix ninja error (#49499) · 7415b101
      risemeup1 committed
      * fix ninja error
      
      * fix_lite_ninja_error
  26. 23 Mar, 2023 1 commit