1. 28 9月, 2022 2 次提交
  2. 22 9月, 2022 1 次提交
  3. 19 9月, 2022 1 次提交
  4. 09 9月, 2022 1 次提交
  5. 06 9月, 2022 1 次提交
  6. 01 9月, 2022 1 次提交
  7. 31 8月, 2022 1 次提交
  8. 30 8月, 2022 1 次提交
    • Z
      Remove extra attribute in OpMaker (#44310) · fe321f9a
      zyfncg 提交于
      * add runtime config in phi
      
      * add runtime attr for op desc and op
      
      * fix no proto error
      
      * adjust opdesc set_attr impl
      
      * try to remove conv_op extra attrs
      
      * add init runtime attr map
      
      * change extra header path
      
      * fix runtime_attr
      
      * fix trace_op
      
      * fix bug of pass
      
      * fix merge conflict
      
      * fix dygraph attrs
      
      * fix bug of pass
      
      * fix dygraph bug
      
      * fix unittest module
      
      * delete extra attr default
      
      * fix dropout kernel
      
      * polish code
      
      * fix extra output of instance_norm
      
      * fix merge confilct
      
      * fix op_desc bug
      
      * add extra attr in yaml for conv3d_transpose
      
      * don't remove extra input and output
      
      * fix save_inference_model
      
      * fix bug of batch_norm
      
      * revert some change
      
      * polish log
      
      * polish code
      
      * add code comment
      Co-authored-by: NChen Weihang <chenweihang@baidu.com>
      fe321f9a
  9. 25 8月, 2022 1 次提交
  10. 18 8月, 2022 1 次提交
  11. 08 8月, 2022 1 次提交
  12. 02 8月, 2022 1 次提交
  13. 01 8月, 2022 1 次提交
    • L
      unify gpu context (#44740) · 86763023
      Leo Chen 提交于
      * remove cudaDeviceContext
      
      * remove more template
      
      * fix rocm compile
      
      * remove alias name CUDADeviceContext
      
      * fix compile
      
      * fix tests
      
      * revert changes
      86763023
  14. 29 7月, 2022 1 次提交
  15. 26 7月, 2022 1 次提交
  16. 19 7月, 2022 1 次提交
  17. 15 7月, 2022 1 次提交
  18. 02 7月, 2022 1 次提交
    • L
      unify cpu context, part2 (#44012) · 755438a7
      Leo Chen 提交于
      * fix init()
      
      * delete test_device_context
      
      * replace CPUDeviceContext with CPUContext
      
      * fix test_scalar
      
      * remove dot_op.cc
      
      * fix compile
      755438a7
  19. 30 6月, 2022 1 次提交
  20. 28 6月, 2022 1 次提交
    • R
      Remove boost::variant (#43100) · b3cf28f8
      Ruibiao Chen 提交于
      * boost::variant -> paddle::variant
      
      * boost::variant.apply_visit -> paddle::visit
      
      * Update pybind_boost_hraders.h
      
      * Fix CINN compilation errors
      
      * Revert FetchResultType
      b3cf28f8
  21. 26 6月, 2022 1 次提交
  22. 10 6月, 2022 1 次提交
  23. 05 6月, 2022 1 次提交
  24. 04 6月, 2022 1 次提交
  25. 02 6月, 2022 1 次提交
  26. 17 5月, 2022 1 次提交
  27. 12 5月, 2022 1 次提交
  28. 07 4月, 2022 1 次提交
  29. 30 3月, 2022 1 次提交
  30. 07 3月, 2022 1 次提交
    • M
      cuBlasLt Epilogue To Fuse Linear + ReLU|GeLU (#39437) · 2a3d9eca
      Ming-Xu Huang 提交于
      * Added cuBlasLtHandle_t to device context.
      
      * Added fused_gemm_epilogue op.
      
      1. Added fused_gemm_epilogue op to leverage cuBlastLt Epilogue.
      2. Support fusion Act(X*Y + bias), X'dims >=2 and Y'dims shoule be 2.
      2. Act currently only be supported ReLU. (Will add GeLU in the future).
      
      * Added UT to fused_gemm_epilogue op.
      
      * Added LinearAct Pattern
      
      1. Added LinearAct into graph_pattern_detector.* to define (2.)'s
      pattern.
      2. LinearAct is used to detect act(element_add(matmul_v2(x, w), bias)).
      3. act currently only support ReLU (Will support GeLU in the future).
      
      * Added FuseGemmEpiloguePass
      
      1, Added FuseGemmEpiloguePass to handle nn.Linear + Act{ReLU}
      fusion (GeLU will be supported in the future).
      2. Only support matmul_v2 from nn.Linear.
      
      * Added pybind to BuildStrageter.fuse_gemm_epilogue_.
      
      * Added UT for fuse_gemm_epilogue_pass.
      
      * GeLU support and EpilogueSingleton
      
      1. Added GeLU support to fused_gemm_epilogue op.
      2. Added EpilogueSingleton to cache auxiliary pointer.
      3. Added related UTs.
      
      * Rename cublaslt_epilogue_opto gemm_epilogue_op.*.
      
      * Added both train and infer pattern to LinearAct.
      
      1. Added support of fwd graph with grap_ops linking to LinearAct.
      2. Added related changes to fuse_gemm_epilogue_pass for above
      modification.
      
      * Changed CUDA requirement from 11.4 to 11.6 for fuse_gemm_epilogue_pass.
      
      * Added identity activation support to gemm_epilogue_op.
      
      * Added Linear Fusion (matmul_v2 + ele_add)
      
      1. Added matmul_v2 + ele_add pattern to LinearActPattern.
      2. Added matmul_v2 + ele_add support to fuse_gemm_epilogue_pass.
      
      * Rename gemm_epilogue_op.* to fused_gemm_epilogue_op.*
      
      * Add fused_gemm_epilogue_grad op.
      
      1. Added fused_gemm_epilogue_grad to support backward epilogue fusion.
      
      * Add UTs to fused_gemm_epilogue_grad_op.
      
      * Change attribute name in fused_gemm_epilogue_grad_op for clearing.
      
      * Allow DX and DBias be dispensable to fused_gemm_epilogue_grad op.
      
      * Added ElementwiseAdd+Matmul+Act graph pattern detection.
      
      * Fuse backward of Linear( Act(x))
      
      1. Added backward fusion pass to Linear( Act(x)).
      2. Added backward fusion pass to Linear(x).
      
      * Added UTs to backward fusion of Linear(Act(x)).
      
      * Complete document of arguments to fused_gemm_epilogue_op.
      
      * Made arguments of some functions pass by reference.
      
      * Modify code with review comments.
      
      1. Made arguments of some function pass by reference.
      2. Removed redundant code.
      3. Followed Google code style to change code.
      
      * Made 'const' code style be consistent
      
      * Fixed random seed of python UTs.
      
      * Set Compiling constrains to cuBlasLt
      
      1. Require CUDA 11.6+
      2. Remove fuse_gemm_epilogue related tests when CUDA < 11.6.
      
      * Code Reivew from Paddle
      
      1. Changed arguments name is_first_gemm to without_x_gradient for
      clearing.
      2. Applied PADDLE_THROW in fused_gemm_epilogue_op.
      
      * Remove EpilogueSingleton
      
      1. Applied ReserveSpace to replace Epilogue for passing auxiliary
      pointers between FWD and BWD.
      
      * Fix a logical error and enhance UTs.
      
      1. Added act op count checking in UTs.
      2. Fix issue to fuse backward or ReLU(Linear(X)).
      3. TODO: solve GELU fusion issues.
      
      * Fix Linear and GeLU fusion issues.
      
      1. Modified graph_detech_pattern to fit with both linear wiht gelu or
      relu.
      2. Modified data range in Uts to allow negative values.
      
      * Removed fused_gemm_epilogue_op.h.
      
      * Rename namespace pten to phi.
      
      * Rename name of arguments in fused_gemm_epilogue_op
      
      1. bias -> Bias.
      2. out -> Out.
      3. reserve_space -> ReserveSpace.
      
      * Change EpiloguePassActivationCache as local variable.
      
      1. Removed singleton in EpiloguePassActivationCache.
      2. Made EpiloguePassActivationCache as an argument to each pass
      functions.
      2a3d9eca
  31. 28 2月, 2022 1 次提交
    • L
      Update host tracer (#39975) · 406f1b96
      liutiexing 提交于
      * add align for WorkQueue
      
      * add spinlock
      
      * merge develop
      
      * merge
      
      * Add EventsWaiter
      
      * Revert "Add EventsWaiter"
      
      This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
      
      * update HostTracer
      
      * fix
      
      * update
      
      * update
      Co-authored-by: Nliutiexing <liutiexing@google.com>
      406f1b96
  32. 23 2月, 2022 1 次提交
    • C
      Update record interface using part3 (#39695) · 1fcaab45
      chenjian 提交于
      * fix RecordEvent interface
      
      * modify default level to 4
      
      * update interface use
      
      * add const default trace level
      
      * update record event interface using
      
      * update record event interface using
      
      * update record event interface using
      
      * update operator.cc
      
      * update part2
      
      * update part1
      
      * update part3
      
      * fix include profiler.h header in ps server
      
      * fix include profiler.h header in ps server
      
      * fix profiler.h header
      
      * fix profiler.h header
      
      * fix merge buf
      
      * update
      
      * fix bug
      
      * fix bug
      1fcaab45
  33. 21 2月, 2022 1 次提交
    • C
      Update record interface using part2 (#39694) · c984cd85
      chenjian 提交于
      * fix RecordEvent interface
      
      * modify default level to 4
      
      * update interface use
      
      * add const default trace level
      
      * update record event interface using
      
      * update record event interface using
      
      * update operator.cc
      
      * update part2
      
      * update part1
      
      * fix include profiler.h header in ps server
      
      * fix include profiler.h header in ps server
      
      * fix profiler.h header
      c984cd85
  34. 20 2月, 2022 1 次提交
  35. 19 2月, 2022 2 次提交
    • A
      [Pten]Unify paddle/pten::framework::ddim into pten::ddim (#39614) · 2fe04264
      Aurelius84 提交于
      * Unify paddle/pten::framework::ddim into pten::ddim
      
      * fix paddle namespace
      
      * compile sucessfully
      
      * fix npu src file
      
      * fix conflict
      
      * fix conflict
      
      * fix tensorrt compiler error
      
      * fix conflict
      
      * fix conflict
      
      * fix tesst file conflict
      
      * fix conflict
      
      * fix mlu file conflict
      
      * fix mlu file conflict
      
      * fix cinn header file conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
      
      * fix conflict
      2fe04264
    • C
      Update record interface using part1 (#39693) · eec6ef81
      chenjian 提交于
      * fix RecordEvent interface
      
      * modify default level to 4
      
      * update interface use
      
      * add const default trace level
      
      * update record event interface using
      
      * update operator.cc
      
      * update part1
      
      * fix include profiler.h header in ps server
      
      * fix include profiler.h header in ps server
      eec6ef81
  36. 15 2月, 2022 1 次提交
    • A
      [PTen]Migrate proto::VarType outside of Pten (#39411) · 7e7e9404
      Aurelius84 提交于
      * #1 migrate dist-related type()-> dtype()
      
      * move datatype function from pten -> fluid/framework
      
      * change type() in imperative into convert(dtype())
      
      * modify xx_tensor->type into xx_tensor->dtype
      
      * change the set_type interface and the caller
      
      * modify xx_tensor.type into xx_tensor.dtype
      
      * fix mutable_data(place, dtype())
      
      * change caller of mutable_data in pten and distributed
      
      * change the caller of mutable_data in fluid/framework
      
      * change the caller of mutable_data in imperative directory
      
      * mutable_data: inference
      
      * update the call of mutable_data
      
      * transfer MakePenScalarArray MakePtenScalar ResetHolderWithType
      
      * pass the compile. the next step is remove VarType in Pten
      
      * fix all and remove VarType from pten. success in linux. Next task is other platform
      
      * fix conflict with develop
      
      * fix compiled error
      
      * Fix reset conversion
      
      * fix conflict
      
      * fix compiled problem
      
      * fix typo
      
      * Fix << in tensor_utils.cc
      
      * fix type->dtype
      
      * fix unittest
      
      * fix tensor init constructor
      
      * fix DataTypeSize for BFloat16
      
      * fix code style
      
      * fix npu compiled error
      
      * fix npu
      
      * compile npu sucessfully
      
      * fix conflict
      
      * fix conflict
      Co-authored-by: Nxiongkun <xiongkun03@baidu.com>
      7e7e9404
  37. 10 2月, 2022 1 次提交
  38. 02 2月, 2022 1 次提交