1. 27 10月, 2021 1 次提交
    • L
      Add fused attention op backward and python layer. (#36498) (#36752) · 64643d50
      Li Min 提交于
      功能:本PR的目标是提高attention模块的计算性能。
      为了减少框架层对op的调度开销,本PR通过在C++层手动实现attention模块,对外提供attention 大op;
      为了减少防存开销,本PR采取了两种优化方法:
      (1)在q,k,v计算时通过共享输入X,将该处的gemm,transpose和bias add从三次调用减少为一次;
      (2)使用kernel融合优化技术,在不同cuda kernel之间通过寄存器传输数据;
      64643d50
  2. 31 8月, 2021 1 次提交
  3. 10 2月, 2021 1 次提交
    • C
      New custom operator extension mechanism (#30690) · f649442d
      Chen Weihang 提交于
      * initial commit: simple demo
      
      * polish copyright format
      
      * add grap op simple demo
      
      * adapt uncertain number of argument
      
      * change trait marco name
      
      * add place & dtype support for add kernel
      
      * add dispath and infershape func
      
      * poish code & add notes
      
      * add dynamic_loader dep for paddle_framework
      
      * add new custom op test dir
      
      * polish impl details
      
      * add unittest for new custom op
      
      * fix failed unittest
      
      * Costum op (#1)
      
      * fix compile error
      
      * wrap framework tensor with LoDTensor
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * add CustomTensor default constructor
      
      * add size() for CustomTensor
      
      * make size const for CustomTensor
      
      * refactor place related api to circle the concept
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * make place const
      
      * make Tensor copy
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * remove additional head of framework
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * add gpu test
      
      * merge latest cwh code in
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * Remove ShareData from user && Change CustomTensor to Tensor && Support more data type (#2)
      
      * fix compile error
      
      * wrap framework tensor with LoDTensor
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * add CustomTensor default constructor
      
      * add size() for CustomTensor
      
      * make size const for CustomTensor
      
      * refactor place related api to circle the concept
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * make place const
      
      * make Tensor copy
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * remove additional head of framework
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * add gpu test
      
      * merge latest cwh code in
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * hid share data from and to
      
      * rename CustomTensor to Tensor
      
      * refactor register design & add test
      
      * change op_funtion to op_meta_info
      
      * split op meta info into .h and .cc
      
      * move get methods into friend class
      
      * move OpMetaInfoHelper into framework space
      
      * move CustomTensorUtils into framework space
      
      * change pybind api name
      
      * move PD C API into op meta info
      
      * add register custom op api
      
      * remove inference cmake change
      
      * refactor copy to api && change Reshape to lowercase && support more dtype && add more test (#3)
      
      * fix compile error
      
      * wrap framework tensor with LoDTensor
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * add CustomTensor default constructor
      
      * add size() for CustomTensor
      
      * make size const for CustomTensor
      
      * refactor place related api to circle the concept
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * make place const
      
      * make Tensor copy
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * remove additional head of framework
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * add gpu test
      
      * merge latest cwh code in
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * hid share data from and to
      
      * rename CustomTensor to Tensor
      
      * support multi dtype
      
      * remove lod, make reshape lowercase, add copy test and refactor copy api
      
      * remove lod, make reshape lowercase, add copy test and refactor copy api
      
      * remove lod, make reshape lowercase, add copy test and refactor copy api
      
      * remove lod, make reshape lowercase, add copy test and refactor copy api
      
      * fix copy to error
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * polish detail & error message
      
      * polish test details
      
      * Add cast api && Change copy related api to copy_to && add more test (#4)
      
      * fix compile error
      
      * wrap framework tensor with LoDTensor
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * add CustomTensor default constructor
      
      * add size() for CustomTensor
      
      * make size const for CustomTensor
      
      * refactor place related api to circle the concept
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * fix compile error
      
      * make place const
      
      * make Tensor copy
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * debug CustomTensor core
      
      * remove additional head of framework
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * use back to shared ptr for custom tensor
      
      * add gpu test
      
      * merge latest cwh code in
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * adjust ut code of custom op
      
      * hid share data from and to
      
      * rename CustomTensor to Tensor
      
      * support multi dtype
      
      * remove lod, make reshape lowercase, add copy test and refactor copy api
      
      * remove lod, make reshape lowercase, add copy test and refactor copy api
      
      * remove lod, make reshape lowercase, add copy test and refactor copy api
      
      * remove lod, make reshape lowercase, add copy test and refactor copy api
      
      * fix copy to error
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add more test
      
      * add type cast
      
      * add cast and make copy to api
      
      * add cast and make copy to api
      
      * add cast and make copy to api
      
      * add cast and make copy to api
      
      * merge cwh code
      
      * merge cwh code
      
      * merge cwh code
      
      * merge cwh code
      
      * merge cwh code
      
      * add more error log
      
      * add more error log
      
      * polish code
      
      * used for test
      
      * remove test comment
      
      * remove test comment
      
      * fix uint8 type error
      
      * fix lost uint8 type error
      
      * add test for coverage
      
      * polish details by reviewer comments
      
      * add prefix for DISABLE_COPY_AND_ASSIGN
      Co-authored-by: NJiabin Yang <360788950@qq.com>
      f649442d
  4. 01 3月, 2018 1 次提交
  5. 24 2月, 2018 1 次提交
  6. 12 2月, 2018 1 次提交
  7. 21 1月, 2018 1 次提交
    • D
      "fix decode bug" (#7711) · e983cc90
      dzhwinter 提交于
      * "fix decode bug"
      
      * "follow commnet"
      
      * "fix error"
      
      * "fix hook bug"
      
      * fix based comment
      
      * fix copyright
      
      * fix based on comment
      e983cc90
  8. 15 1月, 2018 1 次提交
    • D
      Feature/hooks (#7513) · b9b75377
      dzhwinter 提交于
      * add copyright hook
      
      * add copyright hook
      
      * refine copyright hook
      
      * "test copyright hook"
      
      * fix check style
      
      * fix ci
      b9b75377
  9. 21 12月, 2017 1 次提交
  10. 27 2月, 2017 1 次提交