1. 28 Mar, 2022 1 commit
  2. 27 Mar, 2022 8 commits
    • L
      [new-exec] fit for mkldnn and inplace op (#40955) · afa0e82c
      Leo Chen committed
      * fit for mkldnn and inplace op
      
      * fix compile
      
      * refine ut
      
      * register op version
      
      * fix inplace op
      
      * fix transfer_layout
    • S
      fix reshape+transpose+matmul (#40948) · 1c6dcfd9
      Sylwester Fraczek committed
    • T
      add check of data type and support mutable_data with compiled infos (#40920) · 6a94adbe
      TeFeng Chen committed
      * support check data type and mutable_data with compiled infos in paddle with cinn
      
      * update cinn_instruction_run_op_test with multi data type
    • H
      Move slice to phi (#40736) · b8236b7b
      hong committed
      * move slice to pten
      
      * merge develop; test=develop
      
      * fix slice bug;
      
      * update
      
      * update
      
      * fix error
      
      * update
      
      * fix bug
      
      * polish code
      
      * polish code
      
      * polish code
      
      * try to fix windows bug
      
      * add gpu compile flag;
      
      * try to fix
      
      * remove template;
      
      * polish code;
      
      * fix npu bug;
      
      * fix npu bug
      
      * fix npu bug; test=develop
      
      * fix slice bug;
      
      * remove no need dep
    • F
      Make StreamSafeCUDAAllocator compatible with NaiveBestFit strategy (#40886) · 0ad2e192
      From00 committed
      * Make StreamSafeCUDAAllocator compatible with NaiveBestFit strategy
      
      * Set FLAGS_use_stream_safe_cuda_allocator to false
      
      * Update
      
      * Remove unnecessary code
      
      * Fix CI errors
      
      * Add UT
    • P
      fix inplace bug in final_state eager_gen (#40979) · 25591674
      pangyoki committed
      * fix inplace bug in final_state eager_gen
      
      * fix python_c_gen
    • Z
      Fix amp with optional api bug (#40980) · 52f07ab4
      zhangbo9674 committed
      * fix amp with optional api bug
      
      * refine optional code for amp
    • J
      Add StringTensor (#39830) · 0695e1ac
      Jack Zhou committed
      * add string tensor and case convert kernels
      
      * Add strings empty kernel; Reorganize the structure of case convert kernel
      
      * Add string infermeta
      
      * Update mutable_data of string tensor
      
      * rename kernel name
      
      * add string copy tmp
      
      * Fix strings copy device bug
      
      * add utf8 gpu converter
      
      * add string tensor c++ api
      
      * Remove mutable_data of string tensor
      
      * update string tensor interface
      
      * remove charcases_flag.h
      
      * remove some fluid headers
      
      * Add make_ddim
      
      * __HIPCC__ -> PADDLE_WITH_HIP
      
      * remove fluid headers
      
      * fix cpu compile
      
      * remove std::hash
      
      * Fix cudaMalloc
      
      * Remove strings/impl directory
      
      * Fix infrt/get_phi_kernel_info.py;Add custom_kernels deps
      
      * Add empty kernel test
      
      * Remove some comments
      
      * Modify lower/upper api encoding type: string->bool
      
      * STRING->PSTRING; Add CreateInferLikeMeta
      
      * Add code gen for C++ String API
      
      * remove strings_api_utils.h
      
      * Add ignore file (strings_api.h, strings_api.cc)
      
      * update strings gen script
      
      * change args order of case convert kernels
      
      * Add comments for pstring, StringTensor
      
      * cpstring_internal.h -> cpstring_impl.h
      
      * Update according to comments:
      
      1. Remove fluid headers
      2. paddle::platform::errors -> phi::errors
      3. Use 'place.GetType() == phi::AllocationType::GPU' instead of 'paddle::platform::is_cpu_space()'
      4. Use camel code style
      
      * Remove all singletons in strings kernels
      
      * fix rocm compile
      
      * Fix py3 compile
      
      * Fix c++ coverage
      
      * 1. Add pstring proto type
      2. Add StringTensor debug info
      3. Rename case_convert_kernel to strings_lower_upper
      4. Remove serialize deserialize strings kernel
      
      * DataLayout::PSTRING -> DataLayout::PSTRING_UNION
      
      * Register pstring data type
      
      * Fix strings api gen
      
      * Fix dense tensor register pstring dtype
      
      * Fix error messages
      
      * remove line
      
      * add pstring unittest
      
      * remove test string api unittest
      
      * remove empty line
      
      * Remove some headers to decrease the size of executable file
  3. 26 Mar, 2022 2 commits
  4. 25 Mar, 2022 20 commits
  5. 24 Mar, 2022 9 commits
    • C
      [Phi] Move mean op kernel into phi (#40872) · 8df91763
      Chen Weihang committed
      * add mean phi kernel
      
      * remove original mean kernel
      
      * add alias name
    • C
      [Phi] Move batch size like infershape into phi (#40847) · 6d3db9c7
      Chen Weihang committed
      * move batch size like infershape
      
      * revert other op change
      
      * call infermeta in infershape
      
      * adjust batchsize like pos
    • Z
      p_norm transfer to phi kernels (#40819) · 92afe146
      zhiboniu committed
    • L
      22a5035e
    • J
      fix build_cinn_pass internal var may be control var problem (#40812) · 310b7dba
      jiangcheng committed
      * fix build_cinn_pass internal var may be control var problem
      
      * add annotation and vlog by review advice
    • Z
      Support intermediate for Sparse API (#40840) · 98244a9a
      zyfncg committed
      * support intermediate for sparse api
      
      * close intermediate in yaml
      
      * fix dygraph_api dep for eager
    • Z
      [AMP] Support amp for Intermediate_dygraph (#40623) · c12f7d48
      zhangbo9674 committed
      * approve amp for intermediate_dygraph
      
      * add amp_utils for intermediate_dygraph
      
      * add amp needcast check for mlu & npu
      
      * test unittest
      
      * add SetGradNode for set_stop_gradient && add checktensor for GradientHooks
      
      * refine code
      
      * refine unittest of imperative_amp for new dygraph
      
      * inplace api skip amp
      
      * add test_imperative_qat_amp for intermediate amp
      
      * refine code
      
      * refine test_amp ci strategy
      
      * refine unittest code
      
      * refine amp_utils code
      
      * refine amp getpromotetype for some special op
      
      * refine unittest code
    • J
      Correct MultipleQuantizeSquash (#40717) · 753964a2
      joanna.wozna.intel committed
      * Correct MultipleQuantizeSquash
      
      * Correct logging
    • R
      [MoE]Assign pos op (#40580) · 305f32d1
      Roc committed
      * # This is a combination of 10 commits.
      # The first commit's message is:
      add expert count op
      
      add ut for expert_count
      
      # This is the 2nd commit message:
      
      update UT only for cuda
      
      # This is the 3rd commit message:
      
      fix for rocm
      
      # This is the 4th commit message:
      
      update ut
      
      # This is the 5th commit message:
      
      add moe module
      
      # This is the 6th commit message:
      
      add expert count op
      
      add ut for expert_count
      
      # This is the 7th commit message:
      
      update UT only for cuda
      
      # This is the 8th commit message:
      
      update ut
      
      # This is the 9th commit message:
      
      add moe module
      
      # This is the 10th commit message:
      
      make expert count private
      
      * add assign pos op
      
      * fix upper num name
      
      * add api _assign pos
      
      * add ut for assign pos op
      
      * update date
      
      * fix for win
      
      * update for test (timeout)
      
      * fix ut
      
      * update
      
      * fix ut for number count
      Co-authored-by: Nhlygit66666 <2570058140@qq.com>