1. 07 Aug, 2023 9 commits
    • [Inference] save_optimized_model_pass support tensorrt (#55893) · 6b10c0e5
      Yuanle Liu committed
      * fix cudnn 8.7+ bug on cudnnConvolutionBiasActivationForward
      
      * save_optimized_model_pass support tensorrt
      
      * update
      
      * update
      
      * fix compile
      
      * update
      
      * fix ut timeout
      6b10c0e5
    • 【CINN】refactor codegen for cinn (#55955) · 68b0cf92
      傅剑寒 committed
      * refactor codegen for cinn
      
      * add to_string to some type which can't be += with string
      
      * fix multi-thread bug caused by static var
      
      * delete dead code and comment
      68b0cf92
    • [paddle-trt] x and y 's rank should be same in trt_skip_layernorm_pass (#56007) · db96ae58
      周周周 committed
      * commit
      
      * commit
      
      ---------
      Co-authored-by: zhoukangkang <zhoukangkang@baidu.com>
      db96ae58
    • 5ada98b8
    • 30a02d27
    • [dy2static] PaddleSOT pr (#54202) · c1913a5f
      xiongkun committed
      * add paddle-symbolic-trace to paddle
      
      * add symbolic trace
      
      * delete swp
      
      * support Layer in symbolic trace
      
      * fix test-symbolic-trace, make symbolic trace return a StaticFunction
      
      * template the error message
      
      * fix some unittest
      
      * Modify the execution mode of test
      
      * Modify the module name
      
      * add dy2static unittest decorator
      
      * change some unittest files by @ast_only_test
      
      * fix unittest.
      
      * test-symbolic-trace
      
      * update test_write_python_container.py
      
      * update
      
      * fix test_param_parse.py
      
      * add submodule and ln -sf in cmakefile
      
      * update
      
      * update
      
      * fix some ast only errors
      
      * update
      
      * Polish ut
      
      * fix unittests
      
      * update
      
      * update
      
      * fix unittests
      
      * update
      
      * test warning ast only
      
      * update
      
      * Ast only some uts
      
      * Fix unittests
      
      * test_error ast only
      
      * update
      
      * update
      
      * Support build_strategy for sot
      
      * update
      
      * import sot as a third party module
      
      * update
      
      * update
      
      * Polish code
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * remove old fluid api and use paddle.nn.relu instead
      
      * fix
      
      * comment the print of ast code
      
      * add try-finally block
      
      * fix dy2static stop-gradient bugs
      
      * fix code
      
      * remove unused submodule and minor codestyle fix
      
      * fix
      
      * fix cast error
      
      * fix interpolate meets int64 in static model
      
      * add evalframe support for py311
      
      * fix
      
      * fix err
      
      * switch ENABLE_FALL_BACK=False
      
      * fix
      
      * Fix CI for some unittest
      
      * add ENABLE_SOT
      
      * remove setup.py dependencies
      
      ---------
      Co-authored-by: NotHaozi <zhangmenghao@baidu.com>
      Co-authored-by: feifei-111 <2364819892@qq.com>
      Co-authored-by: 0x45f <wangzhen45@baidu.com>
      Co-authored-by: SigureMo <sigure.qaq@gmail.com>
      c1913a5f
    • Update Save/Load Interface to 2.0 (#55836) · ab8c3179
      Huihuang Zheng committed
      Update Save/Load Interface to 2.0
      ab8c3179
    • [IR] Sovle bugs (#55991) · a25f013b
      zhangbo9674 committed
      * solve conflict bug
      
      * fix bug
      a25f013b
    • [WIP] Integration flash attention 2 (#55758) · 0473369f
      umiswing committed
      * Work for fa-2 padded fwd. Code to be cleaned.
      
      * Work for fa2 unpadded fwd.
      
      * Work for padded-bwd, dk get small diff on np.random.seed(0)
      
      * Anyway I pass paddle's utest, except return softmax without dropout.
      
      * Clean code.
      
      * Modify interface.
      
      * Clean code and add some check.
      
      * Easy compile for dev.
      
      * Fix ci.
      
      * Fix ci-build.
      
      * Add std c++17 option again.
      
      * Limit max job when compiling fa2.
      
      * Remove const_cast
      
      * Add fwd params, to be cleaned.
      
      * Clean code.
      
      * Add bwd params.
      
      * Clean code.
      
      * Add enforce.
      
      * Use v2.0.4
      
      * Pass RNG state to fa2 capi
      
      * Fix review.
      
      * Add assert
      
      * Skip compile for sm less than 80.
      0473369f
  2. 05 Aug, 2023 1 commit
  3. 04 Aug, 2023 11 commits
    • [NewIR] Rename feed with place to data (#55778) · 274e5e54
      kangguangli committed
      * fix bug: feed_with_place should consider variable existence
      
      * fix
      
      * fix build scope
      
      * change method to set feed var name
      
      * remove feed_with_place to placeholder
      
      * fix
      
      * rename to data
      
      * fix
      
      * fix
      274e5e54
    • [Semi AutoParall] Support Partial Semantic I (#55508) · e3b6e02f
      JZ-LIANG committed
      e3b6e02f
    • [NewIR]New ir aot placement refactor (#55810) · dd1379e4
      hong committed
      * refactor aot
      
      * update
      
      * fix bugs
      
      * remove some test
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * update
      dd1379e4
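    • [CINN] Dump more compilation result and optimize parallel compiler flags (#55935) · 39b59603
      Fisher committed
      1. `Parallel Compiler`:
          - Merged `FLAGS_cinn_parallel_compile_size` and `FLAGS_cinn_parallel_compile_thread`; the number of compilation threads is now set through `FLAGS_cinn_parallel_compile_thread` alone, and all `fusion_groups` are distributed evenly across the available threads
          - Enriched the information returned after compilation: in addition to `instruction`, `lowered_function`, `source_code`, and `source_ptx` are now returned for use by upper layers
      2. Debug information:
          - Added `FLAGS_cinn_dump_group_lowered_func`, `FLAGS_cinn_dump_group_source_code`, `FLAGS_cinn_dump_group_ptx`, and `FLAGS_cinn_dump_group_instruction`, which save the intermediate code of each compilation stage separately for each of the `fusion_groups`
          - Reorganized `graph_visualization` so that all visualization graphs and unit-test code are saved correctly by group
      3. Bug fixes:
          - Fixed `MakeDirectory` failing to create directories correctly
      4. Other:
          - Removed some dead code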
      39b59603
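
The "distributed evenly across the available threads" behaviour described in the commit above can be illustrated with a small sketch. This is not CINN's code: the function name `split_evenly` and the round-robin strategy are assumptions chosen for illustration, and the real compiler may partition groups differently.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_evenly(groups: Sequence[T], num_threads: int) -> List[List[T]]:
    """Hypothetical helper: assign groups to worker buckets round-robin.

    Bucket sizes differ by at most one, which is one way to read
    "evenly distributed". The actual CINN partitioning is not shown
    in the commit message, so treat this purely as an illustration.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be >= 1")
    # Do not create more buckets than there are groups (but at least one).
    workers = min(num_threads, len(groups)) or 1
    buckets: List[List[T]] = [[] for _ in range(workers)]
    for index, group in enumerate(groups):
        buckets[index % workers].append(group)
    return buckets


if __name__ == "__main__":
    # Ten placeholder fusion groups spread over three threads.
    for worker_id, bucket in enumerate(split_evenly(list(range(10)), 3)):
        print(worker_id, bucket)
```

With ten groups and three threads, the buckets hold four, three, and three groups, so no thread receives more than one group above any other.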
    • [clang-tidy] enable modernize-use-emplace (#55799) · 469a0392
      Ruibin Cheung committed
      * [clang-tidy] enable modernize-use-emplace
      
      * Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into modernize_use_emplace
      469a0392
    • 1e4f627d
    • [NewIR] add decorator for dy2st test with new ir (#55840) · b67715a4
      kangguangli committed
      * add decorator for new_ir_test
      
      * fix bug and only test in ci-coverage
      
      * fix bug and only test in ci-coverage
      
      * fix
      
      * fix bugs
      
      * fix
      
      * fix
      b67715a4
    • Support Combined indexing for __getitem__ and __setitem__ (#55211) · 697c712f
      JYChen committed
      * WIP: start writing combined indexing get
      
      * list/tuple/Variable
      
      * getitem 80%
      
      * add setitem
      
      * add some unittest for setitem
      
      * lazy import
      
      * fix some setitem error
      
      * fix advance indexing with decreasing axes; fix strided_slice input name
      
      * combine int-tensor getitem is ok (without boolean support & broadcast); add getitem unittest for static
      
      * add broadcast & parse bool tensor for __getitem
      
      * [change getitem] _getitem_impl_ to _getitem_static, not deleting the former one
      
      * refine new getitem; fix ut in variable/var_base
      
      * add __getitem__ ut in dygraph
      
      * re-dispatch getitem for Py/CPP; fix strided_slice decrease axes error in dygraph
      
      * fix ut; support tensor in slice
      
      * [change setitem] _setitem_impl_ to _setitem_static, not deleting the former one
      
      * remove some UT (for some, temporarily)
      
      * add IndexError to solve timeout problem in static-mode
      
      * 1. temporarily forbid all-False bool-indexput; 2. setitem_static will return a new variable
      
      * xpu uses old strategy
      
      * rename dy2st setitem ut to avoid same-name problem
      
      * dy2st for new combined index
      
      * ut case for combine-index with dy2st
      
      * open ut with all-false-bool setitem
      
      * remove useless doc and _getitem_impl_
      
      * change static res
      
      * fix static xpu
      697c712f
    • Fix a bug in VecAutomaticAddPerBlock (#55929) · 81511469
      niuliling123 committed
      81511469
    • [IR] Reshape2 and Flatten_contiguous_range Support Inplace (#55809) · dd0681e3
      chen committed
      * inplace pass support reshape2 and flatten_contiguous_range
      
      * recover the modification to inplace_op_var_pass.cc
      dd0681e3
    • 97ab6aa6
  4. 03 Aug, 2023 14 commits
  5. 02 Aug, 2023 5 commits
    • [EvalFrame] support python3.11 in eval frame. (#55887) · f45dd5ee
      xiongkun committed
      f45dd5ee
    • Eager tensor doc (#55879) · 880e94fc
      wanghuancoder committed
      * add docstring of three eager method
      
      * test=docs_preview
      
      * update element size bind
      
      * update docs of numpy, clone, clear_gradient, element_size; test=docs_preview
      
      * refine clear_gradient docs; test=docs_preview
      
      * refine element_size docs; test=docs_preview
      
      * add detach doc; test=docs_preview
      
      * empty commit; test=docs_preview
      
      * update signature; test=docs_preview
      
      * refactor; test=docs_preview
      
      * empty commit; test=docs_preview
      
      * add docstring of Tensor
      
      * empty commit; test=docs_preview
      
      * refine TensorDoc; test=docs_preview
      
      * refine TensorDoc; test=docs_preview
      
      * remove extra indent in TensorDoc; test=docs_preview
      
      * remove a space; test=docs_preview
      
      * move docs ahead of implementation; test=docs_preview
      
      * refine
      
      ---------
      Co-authored-by: wj-Mcat <1435130236@qq.com>
      Co-authored-by: SigureMo <sigure.qaq@gmail.com>
      880e94fc
    • [clang-tidy] NO.6 enable `modernize-avoid-c-arrays` check (#55774) · c000091e
      gouzil committed
      * [clang-tidy] modernize-avoid-c-arrays
      
      * rollback
      
      * [clang-tidy] fix
      
      * close modernize-avoid-c-arrays
      
      * fix PHI_DEFINE_string; add PHI_DEFINE_bool NOLINT
      
      * fix PHI_DEFINE_string
      
      * fix next_h_state and parity err
      
      * fix win32
      
      * fix cuda_graph
      
      * fix accuracy_kernel
      
      * fix math_function
      
      * fix fused_softmax_mask_kernel.cu load_data and warp_reduce; rollback concat_and_split_functor ins_addr
      
      * fix fused_dropout_add_grad_kernel
      
      * fix
      
      * rollback cu
      
      * rollback concat_and_split_functor.cu
      
      * rollback
      c000091e
    • [XPU]Add conv1d fuse pass (#55719) · 22c7a6eb
      wz1qqx committed
      22c7a6eb
    • [IR] NewIr Interpreter Beta run regular (#55828) · 63b7fc80
      zhangbo9674 committed
      * add interface
      
      * add code
      
      * add code
      
      * add code
      
      * add code
      
      * fix bug
      
      * fix bug
      
      * add var prefix
      
      * add code
      
      * add code
      
      * add code
      
      * fix compile bug
      
      * fix bug
      
      * refine code
      
      * refine code
      
      * refine code
      
      * refine code
      
      * fix bug
      
      * add code
      
      * add code
      
      * fix bug
      
      * add code
      
      * add code
      
      * refine code
      
      * refine code
      
      * fix bug
      
      * add code
      
      * fix bug in phi__kernel_utils
      
      * refine code
      
      * fix bug
      
      * open flag
      
      * refine code
      
      * fix bug
      
      * fix bug
      
      * refine code
      
      * fix bug
      63b7fc80