1. 13 May, 2023 1 commit
  2. 11 May, 2023 1 commit
  3. 10 May, 2023 1 commit
    • B
      [cherry-pick 2.5] Broadcast && Dropout_nd Performance Optimization into Release/2.5 (#53623) · f9ea2301
      Bo Zhang committed
      * Support different dtypes of inputs for broadcast for dropout optimization  (#52093)
      
      * change judgement for DropoutGradGPUKernelDriver
      
      * add UnrollerWithoutVecSize and after this Loaddata to be refined
      
      * pass unittest
      
      * use same unroller with XPU
      
      * BroadcastWithInt64Index
      
      * BroadcastDataLoader template partial specialization
      
      * fix compile errs in ROCms
      
      * PR comment
      
      * dropout_nd_optimization (#51479)
      
      * with printf
      
      * add DropOutNdForwardKernel
      
      * PR comment
      
      * Dropout optimize & clean broadcast inT and ElementwiseType (#52969)
      
      * change judgement for DropoutGradGPUKernelDriver
      
      * add UnrollerWithoutVecSize and after this Loaddata to be refined
      
      * pass unittest
      
      * use same unroller with XPU
      
      * BroadcastWithInt64Index
      
      * BroadcastDataLoader template partial specialization
      
      * fix compile errs in ROCms
      
      * clean ElementwiseT and InT for BroadcastKernel
      
      * default axis and clean inT
      
      * remove redundant fast divmod computation
      
      * optimize drop_nd & drop_nd_grad
      
      * optimize BroadcastDataLoader bf16 fp16
      
      * rm InT etc. after merge develop
      
      * delete constexpr for windows ci
      
      * fix conflict
      
      * fix conflict with develop
      
      * fix conflict
      
      * new clean
      
      * clean
      
      * Fix xpu2 kp compile error (#53548)
      
      * fix conflict
      
      * conflict
  4. 09 May, 2023 8 commits
  5. 08 May, 2023 3 commits
    • Z
      [Paddle-TRT] The Graph uses OpConverterType for op converter (#53214) (#53585) · 2cf4a04a
      zhoutianzi666 committed
      * add `converter_type` for op converter
    • N
      [cherry-pick] Fix core dumped in training when check_nan_inf=1 (#53423) · d5c3f032
      niuliling123 committed
      Fix a bug in the optimizer precision check
    • G
      [Cherry-pick]Cherry pick 0d output (#53538) · 2d02b0c1
      GGBond8488 committed
      * add 0D output support for linalg.slogdet, test=allcase
      
      * fix zero dim test error test=allcase
      
      * fix test error test=allcase
      
      * add static backward test, test=allcase
      
      * support_0D_output_for_matrix_rank_multi_dot, test=allcase
      
      * add 0D output test for matrix_rank and multi_dot test=allcase
      
      * fix assert error ,test=allcase
      
      * fix test error, test=allcase
      
      * fix other test error, test=allcase
      
      * fix other test error, test=allcase
      
      * fix test error, test=allcase
      
      * fix matrix_rank and multi dot test err test=allcase
      
      * fix test error test=allcase
      
      * fix test zero dim test, test=allcase
      
      * add static backward test for multi_dot, test=allcase
      
      * add tol 2d broadcast test case, test=allcase
      
      * fix test error test=allcase
      
      * fix test error test=allcase
      
      * test=allcase
      
      * support_0d_output_for_linalg.norm
      
      * fix test error test=allcase
      
      * fix 0D test
      
      * fix test error test=allcase
      
      * fix test error test=allcase
      
      * fix tests, test=allcase
      
      * fix error,test=allcase
      
      * fix errors ,test=allcase
      
      * add static backward , test=allcase
      
      * add static backward test, test=allcase
      
      * slogdet_support_0D_output
      
      * add new case
      
      * fix tests, test=allcase
      
      * cherry-pick
      
      * cherry-pick
      
      * fix trace gpu kernel 0d error, test=allcase
      
      * fix windows error, test=allcase
      
      * add matrixrank cherry-pick
  6. 06 May, 2023 1 commit
  7. 05 May, 2023 2 commits
  8. 04 May, 2023 1 commit
  9. 28 April, 2023 1 commit
  10. 27 April, 2023 1 commit
  11. 25 April, 2023 2 commits
  12. 24 April, 2023 2 commits
  13. 23 April, 2023 2 commits
    • J
      Cherry pick getitem/setitem 0d (#53125) · a79c04f3
      JYChen committed
      * support 0-D output and 0-D as index in __getitem__
      
      * fix tests
      
      * fix inference and UT
      
      * add unittest for setitem
      
      * fix xpu test
      
      * fix xpu 0-d
    • G
      Fix bug of block desc. (#53163) (#53176) · 7adecf40
      Ghost Screaming committed
      * Fix bug of reduce_sum op: when input.numel() > INT32_MAX, its result is wrong.
      
      * Remove climits.
      
      * Fix bug of BlockDesc::MoveFrom(). It is used to rebuild main_program_desc from the ProgramDesc modified by a Fusion Pass. Because some fused operators need to create new Variables in the modified ProgramDesc, MoveFrom uses std::move() to move these VarDesc objects into main_program_desc. As a result, the pointers held by the modified ProgramDesc become nullptr. Calling block()->Program()->proto() first calls ProgramDesc::Flush(), which may then cause a segmentation fault.
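
The BlockDesc::MoveFrom() failure described in the commit above can be illustrated in isolation. The sketch below is not Paddle code: `ToyVarDesc`, `ToyBlock`, and `CountNullVars` are hypothetical stand-ins, and it assumes the variable descriptors are owned through `std::unique_ptr`. It only shows the general C++ behavior the message relies on: after `std::move()`, the source container still has its slots, but each slot holds `nullptr`, so any later pass over the source that dereferences them would crash.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a variable descriptor (not Paddle's VarDesc).
struct ToyVarDesc {
  std::string name;
};

// Hypothetical stand-in for a block that owns its variable descriptors.
struct ToyBlock {
  std::vector<std::unique_ptr<ToyVarDesc>> vars;

  // Transfers ownership of every descriptor from `src` into this block.
  // `src.vars` keeps its size, but every moved-from unique_ptr is null.
  void MoveFrom(ToyBlock& src) {
    for (auto& v : src.vars) {
      vars.push_back(std::move(v));
    }
  }

  // Counts slots that no longer own a descriptor. A flush-style routine
  // that dereferenced these slots without checking would hit a null pointer.
  std::size_t CountNullVars() const {
    std::size_t n = 0;
    for (const auto& v : vars) {
      if (v == nullptr) {
        ++n;
      }
    }
    return n;
  }
};
```

For example, if a block `fused` owns one descriptor and `main_block.MoveFrom(fused)` is called, `fused.vars.size()` is still 1 while `fused.CountNullVars()` is also 1. A fix along these lines has to either stop the source from being traversed afterwards or leave it in a state that is safe to traverse.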
  14. 20 April, 2023 3 commits
  15. 17 April, 2023 10 commits
  16. 15 April, 2023 1 commit