1. 27 Oct, 2021 11 commits
  2. 26 Oct, 2021 18 commits
  3. 25 Oct, 2021 11 commits
    • Z
      add ctr accessor (#36601) · cea1ba88
      zhaocaibei123 committed
      cea1ba88
    • A
      [NPU] modifications for model ernie-1.0 (#36642) · 19b02d95
      Aganlengzi committed
      * [NPU] modifications for model ernie-1.0
      
      * rollback 503003 and change cast to dtype
      19b02d95
    • Z
      add op: fused_feedforward(backward) (#35611) · 2dd0a46a
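      zhangkaihuo committed
      This PR contains the backward code for fused_feedforward.
      
      Related kernel implementations: fused_dropout_act_bias, fused_residual_dropout_bias, fused_layernorm_residual_dropout_bias
      
      fused_feedforward is a fused operator. It fuses and wraps the operators of a transformer model's feed-forward layer so that the front end exposes a single interface. Fusion cuts some memory accesses and kernel-launch time, which improves performance.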
      2dd0a46a
    • S
      Add bincount op (#36317) · 39f19127
      smallv0221 committed
      * Add bincount op
      
      * upload cpu version
      
      * fix unittest
      
      * fix unittest
      
      * fix unittest
      
      * fix en doc
      
      * add more test
      
      * fix en doc
      
      * add more test case
      
      * fix test
      
      * fix input validation
      
      * fix input check
      
      * fix unittest
      
      * fix test
      
      * fix en doc
      39f19127
    • T
      CI build PR and dev whl (#36532) · e16fe48d
      tianshuo78520a committed
      CI build PR and dev whl
      e16fe48d
    • Z
      Create CinnCompiler class for compiling subgraphs found by build_cinn_pass. (#36562) · 4c460378
      Zhen Wang committed
      * Init the functions of CinnCompiler.
      
      * Add the unit test for CinnCompiler.
      
      * Fix some compilation errors.
      
      * Update the UT of cinn_compiler.
      
      * Use Decomposer&OpFusion passes in CinnCompiler::CompileGraph.
      
      * Update some comments.
      
      * Uncomment some includes in build_cinn_pass.cc.
      
      * Use refs instead of ptrs as returned types of FindGraph & Compile in CinnCompiler.
      
      * Use the merged CinnGraphSymbolization functions in CinnCompiler.
      4c460378
    • H
      [HybridParallel]fix bug of check_inf in fleet_base.py (#36651) · 59d8b8cb
      Haohongxiang committed
      * fix bug of check_inf
      
      * fix allreduce
      59d8b8cb
    • T
      add some ops to train ssd on kunlun (#36407) · 50778ad6
      TTerror committed
      * add some ops to train ssd on kunlun
      
      * add some ops to train ssd on kunlun
      
      * add some ops to train ssd on kunlun
      
      * update cast op unittest
      
      * update cast op unittest
      
      * update cast op unittest
      
      * update xpu cmake
      
      * update cast unittest
      50778ad6
    • L
      [new-exec] Add events waiter (#36480) · cdb9bfa3
      liutiexing committed
      * add align for WorkQueue
      
      * add spinlock
      
      * merge develop
      
      * merge
      
      * Add EventsWaiter
      
      * update
      
      * update
      
      * update Error MSG
      
      * update EventsWaiter
      cdb9bfa3
    • W
      Fix grid sampler while input size is [1] (#36183) · eff3ee5e
      whs committed
      eff3ee5e
    • Z
      add op: fused_feedforward(forward) (#35843) · b18cbfb2
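      zhangkaihuo committed
      This PR contains only the forward code for fused_feedforward.
      
      Related kernel implementations: fused_dropout_act_bias, fused_residual_dropout_bias, fused_layernorm_residual_dropout_bias
      
      fused_feedforward is a fused operator. It fuses and wraps the operators of a transformer model's feed-forward layer so that the front end exposes a single interface. Fusion cuts some memory accesses and kernel-launch time, which improves performance.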
      b18cbfb2
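
For readers unfamiliar with what a transformer feed-forward layer computes, here is a minimal unfused NumPy sketch of the sequence of steps that an operator like fused_feedforward collapses into one call. It is an illustration only, not the PR's implementation: the function names (`feedforward_reference`, `dropout`, `layer_norm`), the ReLU activation, the post-layer-norm ordering, and the omission of learnable layer-norm scale and shift are all assumptions, and the comments mapping steps to kernels are guesses based on the kernel names alone.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize over the last axis (no learnable scale/shift in this sketch)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def dropout(x, rate, rng):
    """Inverted dropout: zero elements with probability `rate`, rescale the rest."""
    if rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


def feedforward_reference(x, w1, b1, w2, b2, rate=0.0, rng=None):
    """Hypothetical unfused reference: each step is a separate pass over memory."""
    rng = rng if rng is not None else np.random.default_rng(0)
    # Step 1: first linear layer, then bias + activation + dropout
    # (plausibly the role of a kernel like fused_dropout_act_bias).
    hidden = dropout(np.maximum(x @ w1 + b1, 0.0), rate, rng)
    # Step 2: second linear layer, then bias + dropout.
    out = dropout(hidden @ w2 + b2, rate, rng)
    # Step 3: residual connection followed by layer normalization
    # (plausibly the role of fused_layernorm_residual_dropout_bias).
    return layer_norm(x + out)


# Usage: batch of 2 tokens, model width 4, hidden width 8.
rng = np.random.default_rng(42)
x = rng.standard_normal((2, 4))
w1, b1 = rng.standard_normal((4, 8)), np.zeros(8)
w2, b2 = rng.standard_normal((8, 4)), np.zeros(4)
y = feedforward_reference(x, w1, b1, w2, b2, rate=0.0)
```

In this unfused form every intermediate (`hidden`, `out`, the residual sum) is written to and read back from memory, and each step would be a separate kernel launch on a GPU. Those are the costs the commit message says fusion reduces.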