1. 27 Jun, 2022 1 commit
  2. 24 Jun, 2022 2 commits
    • A
      [cherry-pick] NVIDIA fixes (#43780) · 9edbe4aa
      Committed by Aganlengzi
      * Use all sitepackages path as the library/include path (#42940)
      
      * Fix several unit tests and increase the unit tests stability (#43670)
      
      * Reduce gather op unit tests size and increase the timeout
      
      * Add NVIDIA_TF32_OVERRIDE for multi-processes environment
      
      * Remove record test for device event ut
      
      * Fix 3 unittest errors (#43532)
      
      * Fix test_fuse_resnet_unit failure
      
      * Fix test_imperative_auto_mixed_precision failure
      
      * Fix sparse_attention_op error
      
      * Fix sparse_attention_op error
      
      * Use fixed random seed (#43659)
      
      * for CI test_collective_sendrecv_api
      Co-authored-by: zlsh80826 <rewang@nvidia.com>
      Co-authored-by: Shijie <505749828@qq.com>
    • K
      [cherry pick] fix structure infos conflict in static return_list mode (#43691) · e700ffdc
      Committed by Kaipeng Deng
      * fix structure infos conflict in static return_list mode. test=develop
      
      * fix format. test=develop
      
      * fix format. test=develop
  3. 23 Jun, 2022 1 commit
  4. 22 Jun, 2022 5 commits
    • C
      Cherry pick 43307 (#43618) · d0bbf46c
      Committed by ccrrong
      * add bilinear_interp_v2 converter
      
      * update op_teller.cc
      
      * add unittest for bilinear_interp_v2 converter
      
      * code format
      
      * bug fix
      
      * code format and add unittest
      
      * remove merged modify in op_teller.cc
      
      * code format
      
      * code format
      
      * fix scale init error
    • S
      Cherry-pick PR#43237 from deveop (#43685) · e90dfaf7
      Committed by shiyutang
      * merge_release_and_dev
      
      * merge_release_dev
      
      * update
      
      * Use tempfile to place the temporary files (#43237)
      
      * tempfile_fix
      
      * update
      
      * fix_CI
      
      * update_word2vec.inference.model
      
      * remove_change_in_word2vec_book
      
      * fix_word2vec_book
      
      * rm_affine
      
      * update
    • Z
      fix the bug that _DataLoaderIterMultiProcess use time to generate the seed (#43318) (#43702) · f4c42389
      Committed by Zhang Ting
       fix the bug that _DataLoaderIterMultiProcess use time to generate the seed
      
      cherry-pick #43318
    • Z
      set_state_dict not use state_dict hook (#43407) (#43711) · 0fb66355
      Committed by zhangbo9674
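      During development of the AMP O2 feature, a state_dict hook was added to support specifying the data type in which a network is saved. However, Layer's set_state_dict obtains the network parameters through state_dict and then loads into them, so with the hook in place set_state_dict could no longer load into the network's original parameters.
      This PR adds a switch that controls the hook and disables the hook inside set_state_dict to fix the problem.
      
      See PR 43407 for details.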
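The mechanism described in this commit message can be sketched in plain Python. This is an illustrative sketch only, not Paddle's implementation: `TinyLayer`, the `use_hook` argument, and the list-based "parameters" are all hypothetical names introduced here to show why a hook that returns converted copies breaks loading, and how a switch avoids it.

```python
class TinyLayer:
    """Hypothetical stand-in for a layer; not Paddle's actual Layer class."""

    def __init__(self, params):
        # Parameters are plain mutable lists in this sketch.
        self._params = dict(params)
        self._state_dict_hooks = []

    def register_state_dict_hook(self, hook):
        self._state_dict_hooks.append(hook)

    def state_dict(self, use_hook=True):
        # Without hooks, the values are references to the real parameters.
        state = dict(self._params)
        if use_hook:
            for hook in self._state_dict_hooks:
                # A hook may return converted copies (e.g. another dtype).
                state = hook(state)
        return state

    def set_state_dict(self, new_state):
        # Disable hooks so that we write into the original parameters,
        # not into copies produced by a hook.
        target = self.state_dict(use_hook=False)
        for name, value in new_state.items():
            if name in target:
                target[name][:] = value


layer = TinyLayer({"w": [1.0, 2.0]})
# Hypothetical hook that "casts" every parameter, returning new objects.
layer.register_state_dict_hook(
    lambda state: {k: [int(x) for x in v] for k, v in state.items()}
)
layer.set_state_dict({"w": [3.5, 4.5]})
```

If `set_state_dict` called `state_dict()` with the hook enabled, the assignment would land in the hook's copies and the layer's own parameters would stay unchanged.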
    • Z
      [FIx bug]layer to 'NoneType' object has no attribute 'place' (#43597) (#43717) · 0b879318
      Committed by zhangbo9674
      Bug:
      When an entry in class Layer's _buffers is None, calling to() fails with the error "layer to 'NoneType' object has no attribute 'place'".
      Fix:
      to() now checks for None entries in _buffers and skips any entry that is None.
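The fix described above amounts to a guard before any attribute access on a buffer. A minimal sketch, assuming a hypothetical `Buffer` class and `move_buffers` helper (neither is Paddle's API):

```python
class Buffer:
    """Hypothetical buffer with a `place` attribute, for illustration only."""

    def __init__(self, place):
        self.place = place


def move_buffers(buffers, place):
    """Return a copy of `buffers` moved to `place`, skipping None entries."""
    moved = {}
    for name, buf in buffers.items():
        if buf is None:
            # A None entry has no `place`; keep it as-is and move on.
            moved[name] = None
            continue
        if buf.place == place:
            moved[name] = buf
        else:
            moved[name] = Buffer(place)
    return moved


result = move_buffers({"running_mean": Buffer("cpu"), "unused": None}, "gpu")
```

Without the `is None` check, the `buf.place` access would raise the `AttributeError` quoted in the commit message.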
  5. 21 Jun, 2022 1 commit
  6. 20 Jun, 2022 5 commits
  7. 17 Jun, 2022 2 commits
  8. 16 Jun, 2022 4 commits
  9. 14 Jun, 2022 2 commits
    • X
      [ CherryPick ] Cherry pick for einsum optimization. (#43468) · 22e75d92
      Committed by xiongkun
      * [EinsumOp] Polish forward logic and backward logic for optimize (#42603)
      
      * change logic for optimize
      
      * modifty
      
      * merge
      
      * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 (#43010)
      
      * [EinsumOp] Make EinsumOp support bfloat16. (#43085)
      
      * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0
      
      * make EInsumOP support bf16
      
      * add unittest for BF16
      
      * add condition for test_BF16
      
      * fix bugs
      
      * fix
      
      * change the backward api to fit einsum op
    • F
      Use tempfile to place all the temporary files. (#43392) · afd0c1db
      Committed by freeliuzc
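          Use tempfile for temporary files so that all of them are deleted properly once the unit tests finish, instead of being left behind to take up disk space.
          This PR only touches unit tests and does not affect existing functionality.
          The corresponding change on the develop branch is in PR 43376.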
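The pattern this commit message describes is commonly written with the standard library's `tempfile.TemporaryDirectory`. A minimal sketch, where the test class `SaveModelTest` and the file name `model.bin` are hypothetical and not taken from the PR:

```python
import os
import tempfile
import unittest


class SaveModelTest(unittest.TestCase):
    """Hypothetical unit test that writes its artifacts to a temp directory."""

    def setUp(self):
        # Each test gets a fresh directory.
        self.temp_dir = tempfile.TemporaryDirectory()

    def tearDown(self):
        # Removes the directory and everything written inside it.
        self.temp_dir.cleanup()

    def test_save(self):
        path = os.path.join(self.temp_dir.name, "model.bin")
        with open(path, "wb") as f:
            f.write(b"weights")
        self.assertTrue(os.path.exists(path))
```

Because cleanup happens in `tearDown`, the files are removed whether the test passes or fails.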
  10. 09 Jun, 2022 1 commit
  11. 30 May, 2022 1 commit
  12. 26 May, 2022 1 commit
  13. 19 May, 2022 1 commit
  14. 10 May, 2022 1 commit
  15. 09 May, 2022 1 commit
  16. 07 May, 2022 2 commits
  17. 06 May, 2022 1 commit
  18. 05 May, 2022 2 commits
  19. 03 May, 2022 1 commit
  20. 30 Apr, 2022 4 commits
  21. 29 Apr, 2022 1 commit
    • W
      [cherry-pick 2.3] Add fused_multi_transformer op to optimize transformer... · 50bfe420
      Committed by WangXi
      [cherry-pick 2.3] Add fused_multi_transformer op to optimize transformer generation performance (#42311)
      
      * Add fused_multi_transformer op to optimize transformer generation performance (#41814)
      
      * fix fused_multi_transformer compile failed in cuda arch < sm53 (#42315)
      
      * fix ci timeout