1. 07 Nov, 2022 1 commit
    • [Restore PR] Remove hard code of PADDLE_WITH_CUDA (#47630) · 908a381d
      HongyuJia committed
      * move cudnn hardcode outside GetExpectedKernelType
      
      * add header file
      
      * debug
      
      * update interpreter_util with hardcode
      
      * update interpreter_util headerfile
      
      * solve activation hardcode
      
      * debug with CI
      
      * add mkldnn_op_list header file
      
      * temporarily uncomment mkldnn
      
      * temporarily uncomment mkldnn
      
      * delete sequence_softmax cudnn hardcode
      
      * add hardcode to data_transfer.cc
      
      * update data_transfer headerfile
      
      * try to fix segmentation fault
      
      * update cudnn&miopen_helper
      
      * reset HasAttr of DygraphExctnCtx
      
      * debug, this commit should pass all CI
      
      * debug should pass CI, temporarily disable activation
      
      * debug should pass CI
      
      * fix default_attr=nullptr bug
      
      * clean debug code
      
      * Call SetDnnFallback function in the base class
      
      * activation fallback to plain kernel
      
      * fix default GetExpectedKernelType find wrong kernel
      
      * search cudnn kernel instead of fallback
      
      * fix cudnn_handle bug
      
      * remove tanh use_cudnn
      
      * restore tanh use_cudnn
      
      * debug tanh
      
      * fix tanh bug
      
      * delete activation cudnn kernel
      
      * polish code
  2. 02 Nov, 2022 1 commit
  3. 01 Nov, 2022 1 commit
    • [Kernel Selection] Remove hard code of PADDLE_WITH_CUDA (#47325) · f9134045
      HongyuJia committed
      * move cudnn hardcode outside GetExpectedKernelType
      
      * add header file
      
      * debug
      
      * update interpreter_util with hardcode
      
      * update interpreter_util headerfile
      
      * solve activation hardcode
      
      * debug with CI
      
      * add mkldnn_op_list header file
      
      * temporarily uncomment mkldnn
      
      * temporarily uncomment mkldnn
      
      * delete sequence_softmax cudnn hardcode
      
      * add hardcode to data_transfer.cc
      
      * update data_transfer headerfile
      
      * try to fix segmentation fault
      
      * update cudnn&miopen_helper
      
      * reset HasAttr of DygraphExctnCtx
      
      * debug, this commit should pass all CI
      
      * debug should pass CI, temporarily disable activation
      
      * debug should pass CI
      
      * fix default_attr=nullptr bug
      
      * clean debug code
  4. 25 Oct, 2022 1 commit
  5. 28 Sep, 2022 1 commit
    • Remove the declaration of using Tensor in framework/tensor.h (#46432) · e12a905e
      Chen Weihang committed
      * remove needless using tensor
      
      * remove needless using tensor
      
      * resolve conflict
      
      * replace tensor using
      
      * fix format error
      
      * revert needless changing
      
      * fix rocm and npu compile error
      
      * fix cinn compile error
      
      * fix format error
      
      * fix mkldnn format error
      
      * fix mkldnn format error
      
      * fix cinn compile error
      
      * fix cinn compile error
      
      * fix cinn compile error
      
      * resolve conflict
  6. 16 Sep, 2022 1 commit
    • Support broadcast elementwise operators with int64 index type (#45741) · 20b5bf84
      sneaxiy committed
      * support int64 non-broadcast
      
      * support broadcast case for int64 index
      
      * fix bug
      
      * support more Arity
      
      * remove some codes
      
      * upgrade patchelf to v0.15.0 to pass CI build
      
      * fix bug
      
      * fix patchelf installation
      
      * add debug flags
      
      * remove useless codes
      
      * fix viterbi_decode and set_value op uts
      
      * remove always enable int64
  7. 05 Sep, 2022 1 commit
  8. 01 Aug, 2022 1 commit
    • unify gpu context (#44740) · 86763023
      Leo Chen committed
      * remove cudaDeviceContext
      
      * remove more template
      
      * fix rocm compile
      
      * remove alias name CUDADeviceContext
      
      * fix compile
      
      * fix tests
      
      * revert changes
  9. 27 Jul, 2022 1 commit
  10. 26 Jun, 2022 1 commit
  11. 05 Jun, 2022 1 commit
  12. 04 Jun, 2022 1 commit
  13. 02 Mar, 2022 1 commit
  14. 23 Feb, 2022 1 commit
  15. 20 Feb, 2022 1 commit
  16. 19 Feb, 2022 1 commit
    • [Pten]Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264
      Aurelius84 committed
      * Unify paddle/pten::framework::ddim into pten::ddim
      
      * fix paddle namespace
      
      * compile successfully
      
      * fix npu src file
      
      * fix conflict
      
      * fix conflict
      
      * fix tensorrt compiler error
      
      * fix conflict
      
      * fix conflict
      
      * fix test file conflict
      
      * fix conflict
      
      * fix mlu file conflict
      
      * fix mlu file conflict
      
      * fix cinn header file conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
  17. 16 Feb, 2022 1 commit
    • [bf16] pten matmul cuda kernel support bf16 (#39485) · d5a0d31a
      Leo Chen committed
      * pten matmul cuda kernel support bf16
      
      * fix pten kernel name
      
      * add matmul_grad bf16 kernel
      
      * add emptylike bf16 kernel
      
      * fix compile
      
      * support rocm
      
      * fix error
      
      * fix rocm
      
      * add bf16 header file
      
      * fix compile
  18. 15 Feb, 2022 1 commit
    • [PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404
      Aurelius84 committed
      * #1 migrate dist-related type()-> dtype()
      
      * move datatype function from pten -> fluid/framework
      
      * change type() in imperative into convert(dtype())
      
      * modify xx_tensor->type into xx_tensor->dtype
      
      * change the set_type interface and the caller
      
      * modify xx_tensor.type into xx_tensor.dtype
      
      * fix mutable_data(place, dtype())
      
      * change caller of mutable_data in pten and distributed
      
      * change the caller of mutable_data in fluid/framework
      
      * change the caller of mutable_data in imperative directory
      
      * mutable_data: inference
      
      * update the call of mutable_data
      
      * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType
      
      * compilation passes; the next step is to remove VarType in Pten

      * fix all and remove VarType from pten; succeeds on Linux. The next task is other platforms
      
      * fix conflict with develop
      
      * fix compiled error
      
      * Fix reset conversion
      
      * fix conflict
      
      * fix compiled problem
      
      * fix typo
      
      * Fix << in tensor_utils.cc
      
      * fix type->dtype
      
      * fix unittest
      
      * fix tensor init constructor
      
      * fix DataTypeSize for BFloat16
      
      * fix code style
      
      * fix npu compiled error
      
      * fix npu
      
      * compile npu successfully
      
      * fix conflict
      
      * fix conflict
      Co-authored-by: Nxiongkun <xiongkun03@baidu.com>
  19. 06 Feb, 2022 1 commit
  20. 20 Jan, 2022 1 commit
  21. 18 Jan, 2022 1 commit
  22. 12 Jan, 2022 1 commit
    • Adjust wrapper of gpu_lanuch_config (#38654) · f5166284
      limingshu committed
      * first commit
      
      * fix wrong filename
      
      * fix the misspelled name

      * fix gpu config wrapper

      * modify according to PR advice
      
      * fix GpuLauchConfig1D api bugs
      
      * change the config for dropout grad
      
      * fix bugs
      
      * modify according to PR advice

      * modify according to PR advice
  23. 03 Dec, 2021 1 commit