1. 31 Oct, 2022 5 commits
    • [Einsum] Einsum support repeated labels. (#47290) · 6e1c14e3
      xiongkun committed
      * add unittest for einsum-v2-trace and diagonal
      
      * repeat labels.
      
      * einsum support repeated labels.
      
      * forward is ok for diagonal and undiagonalized.
      TODO: check backward is ok by our theorem.
      
      * backward is ok!
      
      * fix by PR suggestions.
      
      * fix ci error
      
      * fix ci error
      
      * fix ci warning
    • [CustomDevice] GetCCLComm add custom device support (#47168) · 34d13d6a
      ronnywang committed
      * [CustomDevice] GetCCLComm add custom device support
      
      * update
      
      * update
      
      * update
    • [ControlFlow] replace executor in run method of control flow ops with standalone_executor (#45696) · 3b219e5e
      kangguangli committed
      * replace executor in conditional_block_op.run with standalone_executor
      
      * add block_id as the argument of standalone executor's method run; add print for program
      
      * fix scope bug about conditional block op
      
      * fix bug: unnecessary return of fetch value
      
      * fix typo
      
      * fix: quantization will set variable persistable, and these variables must exist in global scope
      
      * add interpretercore cache for conditional block op but not activate in default
      
      * fix bug: local scope reuse for conditional block op
      
      * reset scope when conditional block op runs
      
      * fix typo
      
      * fix typo and code style
      
      * add build scope for conditional block op
      
      * add skip for transfer_layout kernel
      
      * refine code
      
      * fix reset_scope
      
      * fix reset_scope
      
      * refine code
      
      * refine code
      
      * refine code
      
      1. remove flag use in conditional_block_op
      2. pass execution_config to BuildOpFuncList instead of individual parameter
      
      * refine code
      
      * remove the use of FLAGS_control_flow_use_new_executor_cache
      
      * change FLAGS_control_flow_use_new_executor to false
    • remove boost compiler flags in flags.cmake (#47468) · 91096ae2
      Wang Xin committed
  2. 28 Oct, 2022 1 commit
  3. 27 Oct, 2022 2 commits
  4. 26 Oct, 2022 3 commits
  5. 25 Oct, 2022 2 commits
  6. 24 Oct, 2022 4 commits
  7. 21 Oct, 2022 1 commit
  8. 20 Oct, 2022 2 commits
  9. 19 Oct, 2022 4 commits
  10. 18 Oct, 2022 3 commits
  11. 17 Oct, 2022 4 commits
    • Support BF16 training for sharding (#46846) · 0b39b244
      Ghost Screaming committed
      * Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result
      is wrong.
      
      * support pure bfloat16
      
      * support bf16 linear
      
      * update PR to pass CI
      
      * tiny fix where_grad_kernel.cu
      
      * Support bfloat16 type for reducer and sharding.
      
      * Fix some bug.
      
      * Polish code.
      
      * Polish code.
      
      * Add bfloat16 datatype in fill_grad kernels.
      Co-authored-by: sneaxiy <sneaxiy@126.com>
    • [PHI]Modify DataLayout's namespace from paddle::experimental to phi (#46869) · ec749398
      YuanRisheng committed
      * namespace modify
      
      * update by comment
    • [Hackathon 3rd No.22 ] add paddle.incubate.sparse.reshape (#46694) · abb38136
      OccupyMars2025 committed
      * add sparse reshape
      
      * change the dtype in all test cases to int64
      
      * just one test case
      
      * modify comments
      
      * Update test_sparse_reshape_op.py
      
      * change the type of "shape" from vector<int64_t> to IntArray
      
      * check whether sp_out.to_dense() is the cause  of error
      
      * print sp_out
      
      * Update reshape_kernel.cc
      
      * use numpy to generate the equal paddle tensor
      
      * just check dense_tensor.numpy()
      
      * check cpu and cuda versions
      
      * Update test_sparse_reshape_op.py
      
      * supply all test cases for cpu forward coo kernel
      
      * test forward coo cuda kernel
      
      * change configuration of cuda kernel
      
      * keep only one test case
      
      * test coo cpu kernel (forward and backward)
      
      * row major or column major ???
      
      * test cuda coo forward kernel
      
      * complete declaration and registration
      
      * Update __init__.py
      
      * rebuild
      
      * retrigger CI
      
      * add cudaMalloc and cudaMemcpy  in  ReshapeCooKernel  and change back to row major order in a cuda dense tensor
      
      * modify minor error
      
      * test only cpu coo forward kernel
      
      * add all test cases for coo forward kernel  (both cpu and gpu)
      
      * test all forward kernels (coo, csr; cpu, gpu)
      
      * add all test cases for all kinds of kernels
      
      * just retrigger CI
      
      * Update sparse_ops.yaml
      
      * Update sparse_ops.yaml
      
      * Update sparse_ops.yaml
      
      * resolve conflicts
      
      * Update sparse_ops.yaml
      
      * don't specify tensor place
      
      * new shape has -1 or 0 in it
      
      * Update unary_grad_kernel.h
      
      * correct lvalue error
      
      * code style
      
      * Update sparse_backward.yaml
      
      * Update sparse_ops.yaml
      
      * Update unary_kernel.h
      
      * Update unary.py
      
      * Update sparse_backward.yaml
      
      * Update unary.py
      
      * code style
      
      * code style
      
      * code style
      
      * Update unary.py
      
      * specify tensor place explicitly
      
      * do not use numpy array
      
      * use numpy array in unit test again
      
      * modify example code in docstring
    • f9c1cdc1
  12. 14 Oct, 2022 2 commits
  13. 13 Oct, 2022 5 commits
  14. 12 Oct, 2022 2 commits
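
For readers unfamiliar with what "repeated labels" means in the einsum commit (#47290) listed above: in standard einsum notation, a label that appears twice on one operand forces those axes to share an index, so `ii->` is a trace and `ii->i` extracts a diagonal. The following is a minimal pure-Python sketch of that semantics only. It is not Paddle's implementation, and `einsum_single` is a hypothetical helper name introduced here for illustration.

```python
from itertools import product


def einsum_single(equation, operand, shape):
    """Evaluate a one-operand einsum on a nested-list tensor (illustrative only)."""
    inputs, output = equation.split("->")

    # Each distinct label gets one extent; axes sharing a label must agree in size.
    extent = {}
    for label, size in zip(inputs, shape):
        if extent.setdefault(label, size) != size:
            raise ValueError(f"label {label!r} has mismatched sizes")

    # Labels absent from the output are summed over.
    summed = [label for label in extent if label not in output]

    def element(index_of):
        # A repeated label reuses the same index on every axis it names.
        value = operand
        for label in inputs:
            value = value[index_of[label]]
        return value

    def build(remaining, index_of):
        if remaining:
            label = remaining[0]
            return [
                build(remaining[1:], {**index_of, label: i})
                for i in range(extent[label])
            ]
        total = 0
        for combo in product(*(range(extent[label]) for label in summed)):
            total += element({**index_of, **dict(zip(summed, combo))})
        return total

    return build(output, {})


matrix = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
trace = einsum_single("ii->", matrix, (3, 3))      # sum of matrix[i][i]
diagonal = einsum_single("ii->i", matrix, (3, 3))  # list of matrix[i][i]
```

The same function also handles equations without repeated labels, such as `ij->ji`, which shows that the repeated-label case is just index sharing rather than a separate code path in this sketch.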