1. 20 Jun, 2022 1 commit
  2. 15 Jun, 2022 1 commit
  3. 14 Jun, 2022 1 commit
    • X
      [ CherryPick ] Cherry pick for einsum optimization. (#43468) · 22e75d92
      xiongkun committed
      * [EinsumOp] Polish forward logic and backward logic for optimize (#42603)
      
      * change logic for optimize
      
      * modify
      
      * merge
      
      * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 (#43010)
      
      * [EinsumOp] Make EinsumOp support bfloat16. (#43085)
      
      * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0
      
      * make EinsumOp support bf16
      
      * add unittest for BF16
      
      * add condition for test_BF16
      
      * fix bugs
      
      * fix
      
      * change the backward api to fit einsum op
      22e75d92
  4. 08 Jun, 2022 1 commit
    • N
      Replace ReduceAmax/Amax.part.cu with KP (#43202) (#43263) · e161979e
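      niuliling123 committed
      The Reduce amax/amin and frobenius_norm_kernel code was originally implemented with Eigen, which makes these files slow to compile, so this PR replaces it with a KP implementation.
      Remove the duplicated functionality in DefaultElementwiseOperator to reduce the compile time of the elementwise_double_grad OP.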
      e161979e
  5. 07 Jun, 2022 1 commit
  6. 06 Jun, 2022 1 commit
    • N
      cherry-pick 42645 (#43205) · 835a1888
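      niuliling123 committed
      Remove the per-rank template instantiation and the Elementwise call from the Broadcast function to reduce compile time.
      Adapted from PR #42645 on the develop branch. Because the develop and release branches have diverged considerably, a direct cherry-pick is not possible, so the PR is resubmitted against release2.3.
      Instantiating Broadcast per rank causes heavy expansion of the underlying templates, which makes reduce_sum_grad_kernel.cu.o excessively large; this change reduces both the .o size and the compile time.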
      835a1888
  7. 06 May, 2022 1 commit
  8. 04 May, 2022 1 commit
  9. 30 Apr, 2022 1 commit
  10. 28 Apr, 2022 5 commits
  11. 26 Apr, 2022 1 commit
    • C
      [Cherry-pick] Optimize dygraph performance part2 (#42224) · ab24b9c0
      Chen Weihang committed
      * Add paddle::variant and replace paddle::any (#42139)
      
      * add variant and replace any
      
      * split attribute
      
      * Optimize dygraph GetExpectedKernelType perf (#42154)
      
      * opt dygraph scheduling
      
      * revert part impl
      
      * fix variant compile error (#42203)
      
      * replace any by variant in infermeta (#42181)
      ab24b9c0
  12. 25 Apr, 2022 1 commit
    • A
      [Cherry-Pick][Performance]Remove CudaStreamSynchronize in ClipGradByGlobalNorm... · 58d0d15e
      Aurelius84 committed
      [Cherry-Pick][Performance]Remove CudaStreamSynchronize in ClipGradByGlobalNorm and fix shape op (#42170)
      
      * [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT (#42138)
      
      * [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT
      
      * [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT
      
      * [Performance]Remove CudaStreamSynchronize in ClipGradByGlobalNorm (#42132)
      58d0d15e
  13. 21 Apr, 2022 2 commits
  14. 19 Apr, 2022 4 commits
  15. 18 Apr, 2022 2 commits
  16. 15 Apr, 2022 3 commits
  17. 14 Apr, 2022 2 commits
    • C
      Cherry pick final state ops (#41755) · 921a6fb7
      chentianyu03 committed
      * [Yaml]add exp yaml (#41217)
      
      * add exp yaml
      
      * add exp api in test case
      
      * add determinant yaml
      
      * fix exp op unittest
      
      * change test class name
      
      * modify api name
      
      * compatible with raw api
      
      * fix det api
      
      * add python_api
      
      * add test eager for determinant op
      
      * [Yaml] Add assign yaml (#41428)
      
      * add assign yaml
      
      * add assign api
      
      * add assign backward api
      
      * add assign
      
      * add assign yaml
      
      * add assign
      
      * assign yaml
      
      * add assign raw kernel and use assign_raw in yaml
      
      * merge develop branch
      
      * add missing python_api
      
      * exchange assign and assign_raw kernel name (#41625)
      
      * exchange assign and assign_raw kernel name
      
      * fix register error
      
      * [Yaml]add gaussian_random yaml and test case (#41312)
      
      * add gaussian random yaml
      
      * add gaussian_random yaml and test case
      
      * fix error modify of full yaml
      
      * import in_dygraph_mode
      
      * import _in_legacy_dygraph
      
      * add place arg in api
      
      * import __current_expected_place
      
      * fix test_egr_python_api failed case
      
      * add test case
      
      * add cast for NormalInitializer
      
      * fix test error
      
      * fix test error
      
      * rm unused check code
      
      * fix test error in test_initializer_nn
      
      * modify by review
      
      * [Phi]fix split error when sections has 0 size and add test case (#41708)
      
      * fix split error when sections has 0 size and add test case
      
      * fix test case
      921a6fb7
    • W
      add fp16 kernel to clip_grad (#41675) · d447c678
      wuyefeilin committed
      d447c678
  18. 13 Apr, 2022 3 commits
  19. 12 Apr, 2022 3 commits
  20. 11 Apr, 2022 2 commits
    • H
      add depthwise conv hip support (#41537) (#41603) · 676c960c
      hong committed
      676c960c
    • C
      [Cherry-pick] Add truncated_normal/unique/swish/unbind yaml and polish Getting... · b2e095c4
      Chen Weihang committed
      [Cherry-pick] Add truncated_normal/unique/swish/unbind yaml and polish Getting tensor place impl (#41539)
      
      * [Phi] Polish truncated normal kernel and add yaml (#41280)
      
      * polish truncated normal kernel
      
      * add yaml
      
      * add truncated normal kernel and add yaml
      
      * polish unittests and yaml
      
      * import dygraph method
      
      * add unique yaml and final state api (#41460)
      
      * fix get tensor backend set bug (#41478)
      
      * [Phi] Add unbind yaml and final state api (#41277)
      
      * add unbind yaml
      
      * fix unittest
      
      * [Phi] Add swish yaml and final state api (#41479)
      
      * add swish yaml and final state api
      
      * skip mkldnn test
      
      * fix grad mkldnn test
      
      * add cherry-pick lost code
      b2e095c4
  21. 08 Apr, 2022 1 commit
  22. 07 Apr, 2022 2 commits