1. 24 Mar, 2022 13 commits
    • J
      fix build_cinn_pass internal var may be control var problem (#40812) · 310b7dba
      Committed by jiangcheng
      * fix build_cinn_pass internal var may be control var problem
      
      * add annotation and vlog by review advice
    • Z
      Support intermediate for Sparse API (#40840) · 98244a9a
      Committed by zyfncg
      * support intermediate for sparse api
      
      * close intermediate in yaml
      
      * fix dygraph_api dep for eager
    • Z
      [AMP] Support amp for Intermediate_dygraph (#40623) · c12f7d48
      Committed by zhangbo9674
      * approve amp for intermediate_dygraph
      
      * add amp_utils for intermediate_dygraph
      
      * add amp needcast check for mlu & npu
      
      * test unittest
      
      * add SetGradNode for set_stop_gradient && add checktensor for GradientHooks
      
      * refine code
      
      * refine unittest of imperative_amp for new dygraph
      
      * inplace api skip amp
      
      * add test_imperative_qat_amp for intermediate amp
      
      * refine code
      
      * refine test_amp ci strategy
      
      * refine unittest code
      
      * refine amp_utils code
      
      * refine amp getpromotetype for some special op
      
      * refine unittest code
    • J
      Correct MultipleQuantizeSquash (#40717) · 753964a2
      Committed by joanna.wozna.intel
      * Correct MultipleQuantizeSquash
      
      * Correct logging
    • R
      [MoE]Assign pos op (#40580) · 305f32d1
      Committed by Roc
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * fix for win
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * fix ut for number count
      Co-authored-by: hlygit66666 <2570058140@qq.com>
    • L
      Refine events waiter (#40876) · 36ee6dd3
      Committed by liutiexing
      * add align for WorkQueue
      
      * add spinlock
      
      * merge develop
      
      * merge
      
      * Add EventsWaiter
      
      * Add EventsWaiter
      
      * update
      
      * Revert "Add EventsWaiter"
      
      This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
      
      * update
      
      * update Error MSG
      
      * update EventsWaiter
      
      * update
      Co-authored-by: liutiexing <liutiexing@google.com>
    • Z
      a8f86600
    • C
      [Phi] Migrate InferShape of multiplex, qr, tril_triu (#40102) · 2e736531
      Committed by caozhou
      * migrate infershape
      
      * fix tril_triu infershape error
      
      * fix qr_op infershape
      
      * add parse qr mode func
      
      * move order
    • Z
      [Refactor] refactored eager_gen.py PR #1 (#40815) · 68c9e3e4
      Committed by Zhanlue Yang
      * [Refactor] refactored eager_gen.py PR #1
      
      * [Refactor] refactored eager_gen.py PR #1
      
      * Refactored version 2
      
      * Added automatic code generation utils
      
      * Fixed merge issues
    • S
      test gpu graph engine's performance (#40775) · 83ae1619
      Committed by seemingwang
      * extract sub-graph
      
      * graph-engine merging
      
      * fix
      
      * fix
      
      * fix heter-ps config
      
      * test performance
      
      * test performance
      
      * test performance
      
      * test
      
      * test
      
      * update bfs
      
      * change cmake
    • 0
      Refine eager run_program OP for dy2st UT (#40768) · 4ccd5cb8
      Committed by 0x45f
      * Refine eager run_program OP for dy2st UT
      
      * append run_program error string and refine run_program_grad
      
      * remove some comments
      
      * refine ConstructXGradTensors
    • C
      [Phi] Move mul op kernel into phi (#40833) · 1b491818
      Committed by Chen Weihang
      * add mul phi kernel
      
      * remove mul op kernel
      
      * remove original mul grad op
      
      * fix cinn test
      
      * fix dygraph test failed
    • N
      Add is_mean param for mean op (#40757) · 7e1155ed
      Committed by niuliling123
  2. 23 Mar, 2022 24 commits
  3. 22 Mar, 2022 3 commits
    • L
      [new-exec] async prepare deps (#40713) · 814f7211
      Committed by Leo Chen
      * async prepare deps
      
      * fix bug that std::future is not set
      
      * add ut
      
      * refine code
      
      * fix standalone ut
      
      * disable prof
    • X
      polish python api logic and add backward python api check (#40666) · c29f85b6
      Committed by xiongkun
      * 1. add the python api grad 2. add final and intermediate state vlog 3. change the python_api error logic
      
      * add python api or close the check_eager=True
      
      * fix the compatibility
      
      * matmul
      
      * disable unittests: test_elementwise_add_op test_scatter_nd_op test_gather_nd_op test_scatter_op test_index_sample_op test_elementwise_add_mkldnn_op
    • H
      Move embedding to phi (#39901) · 0331cfda
      Committed by hong
      * move embedding to phi;
      
      * update sig; test=develop
      
      * move reset impl to phi; test=develop
      
      * remove old register; test=develop
      
      * fix cpu bf16 bug; test=develop
      
      * fix lookup speed error
      
      * polish code
      
      * fix paddle throw type