1. 13 Mar, 2023 1 commit
    • Z
      [Paddle Inference ]use python to generate cutlass code (#50603) · 4e9e23cb
      zhoutianzi666 committed
      * use python to generate cutlass code
      
      * refine CommonConvKernelPart1, CommonConvKernelPart2
      
      * remove useless code in generate_cutlass_code.sh
      
      * add more config in conv2d_residual
      
      * CommonCutlassConvKernelPart1 and CommonCutlassConvKernelPart2
      
      * add group conv support in util.cu
      
      * remove .sh
      
      * refine name
      
      * make name good
      
      * add fuse_alpha
      
      * make code easy to understand
      
      * mot fopen generate in py
      
      * use python script to generate conv2d,group=1 cutlass code
      
      * use const &
      
      * use const & && use python script to generate conv2d/group=1 code
  2. 06 Mar, 2023 1 commit
    • H
      [phi decoupling] decouple dependency to device_context in phi (Part 1) (#50865) · a1006b2b
      Huang Jiyi committed
      * move DeviceContextPool to phi
      
      * add EmplaceExternalContextFunc
      
      * update namespace
      
      * update cmake
      
      * fix bugs and create context_pool_impl.h
      
      * replace platform::is_xxx_place
      
      * fix bugs
      
      * update generator
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix enforce usage
      
      * Revert "fix enforce usage"
      
      This reverts commit 5f521f08a69713cee506e64a00ec6d9fba709e27.
      
      * fix bugs
      
      * rm XPUDeviceContext and CustomDeviceContext
      
      * fix bugs
      
      * fix context init bug
      
      * fix bugs after merge
      
      * fix bugs
      
      * fix name
      
      * fix mutable_data
      
      * update and fix bugs
      
      * fix bugs
      
      * update
      
      * fix bugs
      
      * fix name
      
      * fix bugs
      
      * merge
      
      * fix bugs
      
      * create context_pool in phi/backends
      
      * create context_pool in phi/backends
      
      * fix bugs
      
      * fix xpu bugs
      
      * fix rocm bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix bugs
      
      * fix xpu bugs
      
      * update
      
      * update
      
      * fix bugs
      
      * fix bugs
  3. 02 Mar, 2023 1 commit
  4. 23 Feb, 2023 1 commit
  5. 22 Feb, 2023 1 commit
    • J
      【Prim】Add gather vjp (#50305) · 4db8e5c7
      Jiabin Yang committed
      * tmp gather vjp
      
      * support gather
      
      * remove useless code
      
      * fix compiling error
      
      * fix ut
      
      * add eager test
      
      * add eager test
      
      * add seed
      
      * fix cpu error
      
      * fix transpose op compat
      
      * remove tensor index case
      
      * fix prim_cinn
      
      * fix ut
  6. 20 Feb, 2023 1 commit
    • H
      [Autogen Tensor API] Autogen `tensor_api.cc` (#50642) · c36c7199
      HongyuJia committed
      * polish tensor operants implementation
      
      * change year, 2021->2023
      
      * autogen tensor.h and tensor_api.cc
      
      * polish CMakeLists logic
      
      * cancel tensor.h auto-gen
      
      * clean useless parameter
      
      * delete tensor_api.cc
  7. 18 Feb, 2023 1 commit
  8. 14 Feb, 2023 1 commit
  9. 16 Dec, 2022 1 commit
    • H
      change staticRNN to while (#48213) · 69536892
      hong committed
      * change staticRNN to while
      
      * update code
      
      * fix rnn bug
      
      * update
      
      * fix _find_op_path_ bugs in append_backward.
      
      * polish code
      
      * revert op proto
      
      * update
      
      * update while
      
      * format
      
      * revert test while loop op
      
      * fix create array
      
      * fix windows error
      
      * fix bug
      
      * update
      
      * fix array write bug
      Co-authored-by: xiongkun <xiongkun03@baidu.com>
  10. 12 Dec, 2022 2 commits
  11. 28 Nov, 2022 1 commit
  12. 09 Nov, 2022 1 commit
  13. 18 Oct, 2022 1 commit
    • Z
      [code-gen] Support code-gen for opmaker of sparse op (#46993) · bdd3dde3
      zyfncg committed
      * support generating code of opmaker for backward op invoke forward op
      
      * support code-gen of opmaker for sparse op
      
      * refine logic of choosing phi kernel
      
      * fix compile bug
      
      * fix code_gen bug
      
      * fix bug
      
      * fix kernel signature code-gen
      
      * fix compile bug of VarType
      
      * fix compile bug of VarType
      
      * fix test_sparse_conv_op
      
      * fix test_sparse_norm_op
  14. 27 Sep, 2022 1 commit
  15. 26 Sep, 2022 1 commit
  16. 30 Aug, 2022 1 commit
    • Z
      Remove extra attribute in OpMaker (#44310) · fe321f9a
      zyfncg committed
      * add runtime config in phi
      
      * add runtime attr for op desc and op
      
      * fix no proto error
      
      * adjust opdesc set_attr impl
      
      * try to remove conv_op extra attrs
      
      * add init runtime attr map
      
      * change extra header path
      
      * fix runtime_attr
      
      * fix trace_op
      
      * fix bug of pass
      
      * fix merge conflict
      
      * fix dygraph attrs
      
      * fix bug of pass
      
      * fix dygraph bug
      
      * fix unittest module
      
      * delete extra attr default
      
      * fix dropout kernel
      
      * polish code
      
      * fix extra output of instance_norm
      
      * fix merge conflict
      
      * fix op_desc bug
      
      * add extra attr in yaml for conv3d_transpose
      
      * don't remove extra input and output
      
      * fix save_inference_model
      
      * fix bug of batch_norm
      
      * revert some change
      
      * polish log
      
      * polish code
      
      * add code comment
      Co-authored-by: Chen Weihang <chenweihang@baidu.com>
  17. 26 Aug, 2022 1 commit
  18. 15 Aug, 2022 1 commit
  19. 12 Aug, 2022 1 commit
  20. 12 Jul, 2022 1 commit
  21. 06 Jul, 2022 1 commit
  22. 01 Jul, 2022 1 commit
  23. 20 May, 2022 1 commit
  24. 27 Mar, 2022 1 commit
    • J
      Add StringTensor (#39830) · 0695e1ac
      Jack Zhou committed
      * add string tensor and case convert kernels
      
      * Add strings empty kernel; Reorganize the structure of case convert kernel
      
      * Add string infermeta
      
      * Update mutable_data of string tensor
      
      * rename kernel name
      
      * add string copy tmp
      
      * Fix strings copy device bug
      
      * add utf8 gpu converter
      
      * add string tensor c++ api
      
      * Remove mutable_data of string tensor
      
      * update string tensor interface
      
      * remove charcases_flag.h
      
      * remove some fluid headers
      
      * Add make_ddim
      
      * __HIPCC__ -> PADDLE_WITH_HIP
      
      * remove fluid headers
      
      * fix cpu compile
      
      * remove std::hash
      
      * Fix cudaMalloc
      
      * Remove strings/impl directory
      
      * Fix infrt/get_phi_kernel_info.py;Add custom_kernels deps
      
      * Add empty kernel test
      
      * Remove some comments
      
      * Modify lower/upper api encoding type: string->bool
      
      * STRING->PSTRING; Add CreateInferLikeMeta
      
      * Add code gen for C++ String API
      
      * remove strings_api_utils.h
      
      * Add ignore file (strings_api.h, strings_api.cc)
      
      * update strings gen script
      
      * change args order of case convert kernels
      
      * Add comments for pstring, StringTensor
      
      * cpstring_internal.h -> cpstring_impl.h
      
      * Update according to comments:
      
      1. Remove fluid headers
      2. paddle::platform::errors -> phi::errors
      3. Use 'place.GetType() == phi::AllocationType::GPU' instead of 'paddle::platform::is_cpu_space()'
      4. Use camel code style
      
      * Remove all singletons in strings kernels
      
      * fix rocm compile
      
      * Fix py3 compile
      
      * Fix c++ coverage
      
      * 1. Add pstring proto type
      2. Add StringTensor debug info
      3. Rename case_convert_kernel to strings_lower_upper
      4. Remove serialize/deserialize strings kernel
      
      * DataLayout::PSTRING -> DataLayout::PSTRING_UNION
      
      * Register pstring data type
      
      * Fix strings api gen
      
      * Fix dense tensor register pstring dtype
      
      * Fix error messages
      
      * remove line
      
      * add pstring unittest
      
      * remove test string api unittest
      
      * remove empty line
      
      * Remove some headers to decrease the size of executable file
  25. 17 Mar, 2022 1 commit
  26. 09 Mar, 2022 1 commit
  27. 08 Mar, 2022 1 commit
  28. 03 Mar, 2022 1 commit
  29. 02 Mar, 2022 1 commit
  30. 28 Feb, 2022 1 commit
  31. 20 Feb, 2022 1 commit
  32. 18 Feb, 2022 2 commits
  33. 15 Feb, 2022 1 commit
    • H
      move histogram to pten (#39496) · 556f6eb0
      hong committed
      * move histogram to pten; test=develop
      
      * fix format error; test=develop
      
      * fix histogram kernel format; test=develop
  34. 13 Feb, 2022 1 commit
  35. 09 Feb, 2022 1 commit
  36. 04 Feb, 2022 1 commit
  37. 30 Jan, 2022 1 commit
  38. 27 Jan, 2022 1 commit