1. 20 Apr, 2022 1 commit
    • enable auto-tune when using cinn (#41795) · d70104e5
      Committed by TeFeng Chen
      * optimize preparation overhead before executing cinn compiled program
      
      * update code notes
      
      * fix flag annotation
      
      * enable auto-tune when using CINN
      
      * update cinn commit tag
      
      * skip test
      
      * fix lacking header file
  2. 19 Apr, 2022 1 commit
  3. 18 Apr, 2022 1 commit
  4. 11 Mar, 2022 1 commit
    • [Phi] Remove needless deps in unittests (#40256) · 89ed57e2
      Committed by Chen Weihang
      * remove needless deps in unittests
      
      * add gpu macro
      
      * fix other unittests
      
      * fix kernel name error
      
      * fix test_prepare_op
      
      * fix failed dygraph unittests
      
      * fix gpu failed tests
      
      * fix cinn test failed
      
      * fix cinn test failed
      
      * fix dropout tests
  5. 03 Mar, 2022 1 commit
    • cinn_launch_op: switch to execution by PE (#39911) · 167d511f
      Committed by TeFeng Chen
      * switch to PE execution in cinn launch
      
      * fix outer variables erased
      
      * skip the map bug temporarily for test
      
      * temporary solution for batch_norm bug
      
      * update comment
      
      * fix compile error
      
      * cinn_instruction_run_op_test: update code to skip external alloc/free instructions generated
  6. 20 Feb, 2022 1 commit
  7. 19 Feb, 2022 1 commit
    • [Pten]Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264
      Committed by Aurelius84
      * Unify paddle/pten::framework::ddim into pten::ddim
      
      * fix paddle namespace
      
      * compile successfully
      
      * fix npu src file
      
      * fix conflict
      
      * fix conflict
      
      * fix tensorrt compiler error
      
      * fix conflict
      
      * fix conflict
      
      * fix test file conflict
      
      * fix conflict
      
      * fix mlu file conflict
      
      * fix mlu file conflict
      
      * fix cinn header file conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
  8. 18 Feb, 2022 1 commit
    • cinn_instruction_run_op test (#39576) · fdc4fe3b
      Committed by TeFeng Chen
      * add cinn_instruction_run_op test code
      
      * update several interfaces of CinnLaunchContext
      
      * update several interfaces and add detail comments in CinnLaunchContext class
      
      * to skip the bug of error message check
      
      * fix ut test failed due to reliant interface updated
  9. 16 Feb, 2022 1 commit
    • [Pten]Remove reshape and elementwise_add's registry code in Fluid (#39317) · c6478270
      Committed by YuanRisheng
      * remove reshape and elementwise_add registry
      
      * delete code
      
      * fix bugs when run ci ut
      
      * remove log
      
      * fix bugs when run unit test
      
      * fix bugs when run unit test
      
      * fix bugs when run cinn
      
      * fix bugs when run ci-mac-python3
      
      * fix compile bugs
      
      * fix compile bugs
      
      * fix compile bugs
      
      * fix bugs when run kunlun
      
      * fix bugs when compile
      
      * update code according to comments
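  10. 18 Jan, 2022 1 commit
  11. 09 Dec, 2021 1 commit
  12. 08 Dec, 2021 1 commit
  13. 01 Dec, 2021 1 commit
  14. 19 Nov, 2021 1 commit
  15. 13 Nov, 2021 1 commit
  16. 05 Nov, 2021 1 commit
  17. 03 Nov, 2021 1 commit
    • improve CinnLaunchOpKernel implement (#36936) · 0590277a
      Committed by CtfGo
      1. Functionality is unchanged; the CinnLaunchOpKernel implementation is simplified. Variable information is now obtained through the standard interface of the ExecutionContext parameter instead of directly from the Scope, which simplifies both the main logic and the helper functions. The original cinn_launch_op_helper was rather redundant, so unnecessary interfaces are removed and the remaining definitions are moved into cinn_launch_op.cc.
      2. Fix the check in CinnLaunchOp InferShape for whether outputs are specified: HasOutput->HasOutputs
      3. Add detailed comments and debug information to ease troubleshooting and code maintenance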
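  18. 01 Nov, 2021 1 commit
    • add cinn_launch_op for using CINN to optimize graph (#36600) · 0a963ee9
      Committed by CtfGo
      Add CinnLaunchOp, which executes the result of CINN subgraph compilation. Key points:
      1. In BuildCinnPass, which partitions the graph into subgraphs, each subgraph is replaced in the original graph by a CinnLaunchOp, which calls CINN to compile and execute that subgraph.
      2. The inputs/outputs of CinnLaunchOp are the inputs and outputs of the subgraph. It also has a `compilation_key` attribute, which serves as the key for fetching the subgraph object and its compilation result from the global cache; BuildCinnPass sets this attribute when it creates the op.
      3. CinnLaunchOp works as follows:
              - Fetch the subgraph object from the global cache
              - Fetch the subgraph's compilation result from the global cache, compiling just-in-time on a cache miss
              - Initialize the runtime data from the variable information (data type, shape) in the compilation result, allocating host/device memory
              - Pack the runtime data into arguments and call CINN's executable object, the runtime program, to run the computation
              - Synchronize the subgraph results to the Paddle-side tensors through the argument pointers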
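The five-step flow described for CinnLaunchOp can be illustrated with a toy sketch in Python. This is not Paddle or CINN code: every name here (`GRAPH_CACHE`, `COMPILED_CACHE`, `compile_subgraph`, `launch`, the dict-based subgraph layout) is a hypothetical stand-in, and the "compiler" merely builds an element-wise add. It only shows the shape of the logic: look up by `compilation_key`, compile on a cache miss, allocate outputs from the recorded shapes, run, and hand results back.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for the global caches keyed by compilation_key.
GRAPH_CACHE: Dict[str, dict] = {}
COMPILED_CACHE: Dict[str, "CompiledSubgraph"] = {}
STATS = {"compiles": 0}  # counts how often the toy compiler runs


@dataclass
class CompiledSubgraph:
    """Toy compilation result: variable shapes plus a runnable program."""
    var_shapes: Dict[str, Tuple[int, ...]]
    program: Callable[[Dict[str, List[float]]], None]


def compile_subgraph(graph: dict) -> CompiledSubgraph:
    """Toy compiler: the produced program writes out = x + y element-wise."""
    STATS["compiles"] += 1

    def program(args: Dict[str, List[float]]) -> None:
        out = args["out"]
        for i, (a, b) in enumerate(zip(args["x"], args["y"])):
            out[i] = a + b

    return CompiledSubgraph(var_shapes=graph["shapes"], program=program)


def launch(compilation_key: str, inputs: Dict[str, List[float]]) -> Dict[str, List[float]]:
    # Step 1: fetch the subgraph object from the global cache.
    graph = GRAPH_CACHE[compilation_key]
    # Step 2: fetch the compiled result, compiling just-in-time on a miss.
    compiled = COMPILED_CACHE.get(compilation_key)
    if compiled is None:
        compiled = compile_subgraph(graph)
        COMPILED_CACHE[compilation_key] = compiled
    # Step 3: allocate output buffers from the recorded shapes.
    args = dict(inputs)
    for name in graph["outputs"]:
        size = 1
        for dim in compiled.var_shapes[name]:
            size *= dim
        args[name] = [0.0] * size
    # Step 4: pack the buffers as arguments and run the program.
    compiled.program(args)
    # Step 5: the program wrote into the buffers in place; return the outputs.
    return {name: args[name] for name in graph["outputs"]}


GRAPH_CACHE["k0"] = {
    "shapes": {"x": (3,), "y": (3,), "out": (3,)},
    "outputs": ["out"],
}
first = launch("k0", {"x": [1.0, 2.0, 3.0], "y": [10.0, 20.0, 30.0]})
second = launch("k0", {"x": [1.0, 1.0, 1.0], "y": [2.0, 2.0, 2.0]})
```

In this sketch the second `launch` call reuses the cached compilation result, so the toy compiler runs only once for the key `"k0"`.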