1. 08 Aug, 2023 12 commits
  2. 07 Aug, 2023 12 commits
    • Y
      Add attn_mask supported for FlashAttnKernel. (#55969) · 42e0c6b8
      yin wei committed
      * add mask
      
      * add backward
      
      * add enforce info
      
      * update scale
      
      * integrate code
      
      * update enforce
      
      * add enforce eq
      
      * add error type
      
      * update enforce
      
      * add test_flash_attention
      
      * Polish codes and fix compiling errors.
      
      * Set num_splits to 0 for flash-attn with tensor mask.
      
      * Fix the compiling error for non flash-attn case.
      
      ---------
      Co-authored-by: Liu Yiqun <liuyiqun01@baidu.com>
      42e0c6b8
    • Y
      [New IR]Add attrs Interface for Python (#55974) · 02e6347d
      YuanRisheng committed
      * add attrs and dtype interface
      
      * fix compile bugs
      
      * fix some bugs
      
      * fix windows bugs
      02e6347d
    • Y
      [Inference] save_optimized_model_pass support tensorrt (#55893) · 6b10c0e5
      Yuanle Liu committed
      * fix cudnn 8.7+ bug on cudnnConvolutionBiasActivationForward
      
      * save_optimized_model_pass support tensorrt
      
      * update
      
      * update
      
      * fix compile
      
      * update
      
      * fix ut timeout
      6b10c0e5
    • G
      5ada98b8
    • R
      30a02d27
    • C
      Fix typos (#56008) · 4d094b0c
      co63oc committed
      4d094b0c
    • X
      [dy2static] PaddleSOT pr (#54202) · c1913a5f
      xiongkun committed
      * add paddle-symbolic-trace to paddle
      
      * add symbolic trace
      
      * delete swp
      
      * support Layer in symbolic trace
      
      * fix test-symbolic-trace, make symbolic trace return a StaticFunction
      
      * template the error message
      
      * fix some unittest
      
      * Modify the execution mode of test
      
      * Modify the module name
      
      * add dy2static unittest decorator
      
      * change some unittest files by @ast_only_test
      
      * fix unittest.
      
      * test-symbolic-trace
      
      * update test_write_python_container.py
      
      * update
      
      * fix test_param_parse.py
      
      * add submodule and ln -sf in cmakefile
      
      * update
      
      * update
      
      * fix some ast only errors
      
      * update
      
      * Polish ut
      
      * fix unittests
      
      * update
      
      * update
      
      * fix unittests
      
      * update
      
      * test warning ast only
      
      * update
      
      * Ast only some uts
      
      * Fix unittests
      
      * test_error ast only
      
      * update
      
      * update
      
      * Support build_strategy for sot
      
      * update
      
      * import sot as a third party module
      
      * update
      
      * update
      
      * Polish code
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * remove old fluid api and use paddle.nn.relu instead
      
      * fix
      
      * comment the print of ast code
      
      * add try-finally block
      
      * fix dy2static stop-gradient bugs
      
      * fix code
      
      * remove unused submodule and minor codestyle fix
      
      * fix
      
      * fix cast error
      
      * fix interpolate meets int64 in static model
      
      * add evalframe support for py311
      
      * fix
      
      * fix err
      
      * switch ENABLE_FALL_BACK=False
      
      * fix
      
      * Fix CI for some unittest
      
      * add ENABLE_SOT
      
      * remove setup.py dependences
      
      ---------
      Co-authored-by: NotHaozi <zhangmenghao@baidu.com>
      Co-authored-by: feifei-111 <2364819892@qq.com>
      Co-authored-by: 0x45f <wangzhen45@baidu.com>
      Co-authored-by: SigureMo <sigure.qaq@gmail.com>
      c1913a5f
    • T
      Test Del paddle_bfloat (#55904) · 8fc2366c
      tianshuo78520a committed
      * Test Del paddle_bfloat
      
      * Del paddle_bfloat test
      8fc2366c
    • Y
      Increase absolute error of test_group_norm_op (#55992) · 496de7f3
      yangjianfengo1 committed
      * inplace tol
      
      * code style
      496de7f3
    • H
      Update Save/Load Interface to 2.0 (#55836) · ab8c3179
      Huihuang Zheng committed
      Update Save/Load Interface to 2.0
      ab8c3179
    • C
      Fix typos, test=document_fix (#56005) · f55f601e
      co63oc committed
      f55f601e
    • U
      [WIP] Integration flash attention 2 (#55758) · 0473369f
      umiswing committed
      * Work for fa-2 padded fwd. Code to be cleaned.
      
      * Work for fa2 unpadded fwd.
      
      * Work for padded-bwd, dk get small diff on np.random.seed(0)
      
      * Anyway I pass paddle's utest, except return softmax without dropout.
      
      * Clean code.
      
      * Modify interface.
      
      * Clean code and add some check.
      
      * Easy compile for dev.
      
      * Fix ci.
      
      * Fix ci-build.
      
      * Add std c++17 option again.
      
      * Limit max job when compiling fa2.
      
      * Remove const_cast
      
      * Add fwd params, to be cleaned.
      
      * Clean code.
      
      * Add bwd params.
      
      * Clean code.
      
      * Add enforce.
      
      * Use v2.0.4
      
      * Pass RNG state to fa2 capi
      
      * Fix review.
      
      * Add assert
      
      * Skip compile for sm less than 80.
      0473369f
  3. 06 Aug, 2023 1 commit
  4. 04 Aug, 2023 10 commits
  5. 03 Aug, 2023 4 commits
  6. 02 Aug, 2023 1 commit
    • Z
      [IR] NewIr Interpreter Beta run regular (#55828) · 63b7fc80
      zhangbo9674 committed
      * add interface
      
      * add code
      
      * add code
      
      * add code
      
      * add code
      
      * fix bug
      
      * fix bug
      
      * add var prefix
      
      * add code
      
      * add code
      
      * add code
      
      * fix compile bug
      
      * fix bug
      
      * refine code
      
      * refine code
      
      * refine code
      
      * refine code
      
      * fix bug
      
      * add code
      
      * add code
      
      * fix bug
      
      * add code
      
      * add code
      
      * refine code
      
      * refine code
      
      * fix bug
      
      * add code
      
      * fix bug in phi__kernel_utils
      
      * refine code
      
      * fix bug
      
      * open flag
      
      * refine code
      
      * fix bug
      
      * fix bug
      
      * refine code
      
      * fix bug
      63b7fc80