1. 26 May, 2021 5 commits
    • [NPU] refine NpuOpRunner (#32869) · 8259d9bf
      Committed by Leo Chen
      * refine ~npuOpRunner
      
      * implement destructor and forbid copy
      
      * use reference to avoid copy
      
      * use const reference
      
      * relax adam precision
      
      * fix top_k
    • optimize OP's compilation time (#32617) · 78ecb668
      Committed by wuhuanzhou
      * optimize OP's compilation time, test=develop
      
      * add more op and run ci test, test=develop
      
      * CUDA Kernel register in cc file, test=develop
      
      * fix macros, test=develop
      
      * fix undefined symbol error, test=develop
      
      * fix compilation error and undefined symbol, test=develop
      
      * fix compilation error on Windows, test=develop
      
      * fix compilation error on Windows, test=develop
    • Marker op for profiling (#33034) · 5c79dbb2
      Committed by Yuang Liu
    • Add double grad op for sigmoid activation, test=develop (#32971) · c711e913
      Committed by Zhanlue Yang
      Sigmoid: Out = Sigmoid(X)
      SigmoidGrad: DX = DOut*(1-Out)*Out
      
      [This Patch]
      Out
      DOut -> SigmoidGradGrad -> DOutNew
      DDX                        DDOut
      
      DDOut = (1-Out)*Out*DDX
      DOutNew = (1-2*Out)*DOut*DDX
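
The double-grad formulas in this commit message can be sanity-checked numerically. The following is a minimal sketch in plain Python, not the Paddle kernel itself: the function names `sigmoid_grad`, `sigmoid_grad_grad`, and `finite_difference_check` are hypothetical, and the check assumes that DDOut and DOutNew are the partial derivatives of DX with respect to DOut and Out, each scaled by DDX.

```python
import math


def sigmoid(x):
    # Forward op: Out = Sigmoid(X)
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(out, dout):
    # First-order grad from the commit message: DX = DOut * (1 - Out) * Out
    return dout * (1.0 - out) * out


def sigmoid_grad_grad(out, dout, ddx):
    # Double grad from the commit message (hypothetical scalar version):
    #   DDOut   = (1 - Out) * Out * DDX
    #   DOutNew = (1 - 2*Out) * DOut * DDX
    ddout = (1.0 - out) * out * ddx
    dout_new = (1.0 - 2.0 * out) * dout * ddx
    return ddout, dout_new


def finite_difference_check(x=0.3, dout=0.7, ddx=1.3, eps=1e-6):
    # Compare the closed-form double grad against central differences of DX
    # taken with respect to DOut and Out, each scaled by DDX.
    out = sigmoid(x)
    ddout, dout_new = sigmoid_grad_grad(out, dout, ddx)
    num_ddout = (
        sigmoid_grad(out, dout + eps) - sigmoid_grad(out, dout - eps)
    ) / (2.0 * eps) * ddx
    num_dout_new = (
        sigmoid_grad(out + eps, dout) - sigmoid_grad(out - eps, dout)
    ) / (2.0 * eps) * ddx
    return abs(ddout - num_ddout), abs(dout_new - num_dout_new)


if __name__ == "__main__":
    err_ddout, err_dout_new = finite_difference_check()
    print("abs error DDOut:", err_ddout)
    print("abs error DOutNew:", err_dout_new)
```

Because DX is linear in DOut and quadratic in Out, central differences should agree with the closed forms up to floating-point rounding.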
    • Added cast op oneDNN kernel for bf16/fp32 datatypes casting(FWD/BWD) (#33056) · a2a45d8d
      Committed by jakpiase
      * added op cast functionality for fp32/bf16
      
      * added newline
      
      * added entries in static mode white list and unity build
      
      * fixed failing tests
      
      * changes after review
      
      * added formatting
      
      * upgraded tests file as reviewer suggested
      
      * changes after review
      
      * minor change
  2. 25 May, 2021 5 commits
  3. 24 May, 2021 1 commit
  4. 22 May, 2021 1 commit
    • Added oneDNN matmul grad BF16/FP32 kernel (#32968) · e2a3a6f7
      Committed by jakpiase
      * added support for most matmul cases
      
      * added more functionality
      
      * full functionality of matmul op, fp32 only
      
      * added bf16 tests and functionality
      
      * added formatting
      
      * changes after review
      
      * minor change
      
      * added reviewers suggestions
  5. 21 May, 2021 3 commits
  6. 20 May, 2021 4 commits
    • fix gather op and add logsumexp op on kunlun (#32931) · a96e8bc9
      Committed by TTerror
      * fix gather op and add logsumexp op on kunlun
      
      * update xpu dependence
      
      * update tests and fix elementwise_add
    • revert_matmulv2_npu (#33014) · be8e94aa
      Committed by Baibaifan
    • Add complex template type (#32857) · 738bf20e
      Committed by chentianyu03
      * add complex template file
      
      * add numtraits for complex template
      
      * add complex template type register
      
      * modify specify template of complex
      
      * modify specify template of complex
      
      * modify specify template of complex
      
      * modify specify template of complex
      
      * make TensorCheckerVisitor support complex type
      
      * fix operator= error
      
      * add complex template
      
      * add complex template type
      
      * add complex template type to pyarray transform
      
      * add complex template type to pyarray transform
      
      * remove complex type for dlpack register
      
      * set dlpack support complex type
      
      * set dlpack support complex type
      
      * set dlpack support complex type
      
      * remove explicit for complex constructor
      
      * add complex unit test file
    • L
      14949521
  7. 19 May, 2021 2 commits
  8. 18 May, 2021 4 commits
  9. 14 May, 2021 4 commits
  10. 13 May, 2021 4 commits
  11. 12 May, 2021 2 commits
  12. 10 May, 2021 3 commits
  13. 08 May, 2021 2 commits