1. 02 Apr, 2022 2 commits
  2. 01 Apr, 2022 6 commits
    • fix mac c++ version (#41172) · a2c01db1
      liutiexing committed
      * fix mac c++ version
      
      * update
      
      * fix apple systems
    • [Phi] Move softmax with cross entropy kernel into phi (#40832) · e6ec98fe
      Chen Weihang committed
      * add cross_entropy_with_softmax phi kernel
      
      * remove softmax_with_cross_entropy kernel
      
      * add softmax_with_cross_entropy grad kernel
      
      * remove original op kernel
      
      * refine cross entropy impl
      
      * fix pointer error
      
      * revert kernel cu change
      
      * fix xpu failed
      
      * fix cinn failed
      
      * fix npu failed
      
      * add forward sig
      
      * add check_nan_inf for pt kernel
      
      * remove repeat cmake item
      
      * fix unittest error
    • [Phi]Interploatd kernels into phi (#40855) · d65a7a46
      chentianyu03 committed
      * add interpolate cpu kernel
      
      * fix nullptr bug
      
      * add interpolate gpu kernel
      
      * fix unit test error
      
      * remove raw kernels
      
      * add cuda kernel impl
      
      * add infermeta
      
      * recover accidentally deleted kernels in interpolate op
      
      * fix grad x_grad name error
      
      * remove interpolate_v2_op.h
      
      * rm unused codes
      
      * fix xpu build error
      
      * fix build error
      
      * fix namespace error
      
      * add register header for npu
      
      * fix infermeta error
      
      * modify by review
      
      * add the missing args in test_trt_convert_nearest_interp_v2
    • [GPUPS]fix CMakeLists with pslib (#41225) · 4da4265a
      zmxdream committed
      * fix cmake. test=develop
      
      * fix. test=develop
      
      * fix dep for graphs_ps_gpu. test=develop
      
      * update. test=develop
      
      * update. test=develop
    • [custom kernel] support fallback (#41212) · 9c2a9afd
      Aganlengzi committed
    • [new-exec] move WaitEvent/RecordEvent into try-catch (#41222) · 5dae6da0
      Leo Chen committed
      * move WaitEvent/RecordEvent into try-catch
      
      * refine supportNpu
  3. 31 Mar, 2022 8 commits
  4. 30 Mar, 2022 6 commits
  5. 29 Mar, 2022 3 commits
  6. 28 Mar, 2022 5 commits
  7. 27 Mar, 2022 5 commits
    • [new-exec] fit for mkldnn and inplace op (#40955) · afa0e82c
      Leo Chen committed
      * fit for mkldnn and inplace op
      
      * fix compile
      
      * refine ut
      
      * register op version
      
      * fix inplace op
      
      * fix transfer_layout
    • add check of data type and support mutable_data with compiled infos (#40920) · 6a94adbe
      TeFeng Chen committed
      * support check data type and mutable_data with compiled infos in paddle with cinn
      
      * update cinn_instruction_run_op_test with multi data type
    • Move slice to phi (#40736) · b8236b7b
      hong committed
      * move slice to pten
      
      * merge develop; test=develop
      
      * fix slice bug;
      
      * update
      
      * update
      
      * fix error
      
      * update
      
      * fix bug
      
      * polish code
      
      * polish code
      
      * polish code
      
      * try to fix windows bug
      
      * add gpu compile flag;
      
      * try to fix
      
      * remove template;
      
      * polish code;
      
      * fix npu bug;
      
      * fix npu bug
      
      * fix npu bug; test=develop
      
      * fix slice bug;
      
      * remove unneeded dep
    • Make StreamSafeCUDAAllocator compatible with NaiveBestFit strategy (#40886) · 0ad2e192
      From00 committed
      * Make StreamSafeCUDAAllocator compatible with NaiveBestFit strategy
      
      * Set FLAGS_use_stream_safe_cuda_allocator to false
      
      * Update
      
      * Remove unnecessary code
      
      * Fix CI errors
      
      * Add UT
    • Add StringTensor (#39830) · 0695e1ac
      Jack Zhou committed
      * add string tensor and case convert kernels
      
      * Add strings empty kernel; Reorganize the structure of case convert kernel
      
      * Add string infermeta
      
      * Update mutable_data of string tensor
      
      * rename kernel name
      
      * add string copy tmp
      
      * Fix strings copy device bug
      
      * add utf8 gpu converter
      
      * add string tensor c++ api
      
      * Remove mutable_data of string tensor
      
      * update string tensor interface
      
      * remove charcases_flag.h
      
      * remove some fluid headers
      
      * Add make_ddim
      
      * __HIPCC__ -> PADDLE_WITH_HIP
      
      * remove fluid headers
      
      * fix cpu compile
      
      * remove std::hash
      
      * Fix cudaMalloc
      
      * Remove strings/impl directory
      
      * Fix infrt/get_phi_kernel_info.py;Add custom_kernels deps
      
      * Add empty kernel test
      
      * Remove some comments
      
      * Modify lower/upper api encoding type: string->bool
      
      * STRING->PSTRING; Add CreateInferLikeMeta
      
      * Add code gen for C++ String API
      
      * remove strings_api_utils.h
      
      * Add ignore file (strings_api.h, strings_api.cc)
      
      * update strings gen script
      
      * change args order of case convert kernels
      
      * Add comments for pstring, StringTensor
      
      * cpstring_internal.h -> cpstring_impl.h
      
      * Update according to comments:
      
        1. Remove fluid headers
        2. paddle::platform::errors -> phi::errors
        3. Use 'place.GetType() == phi::AllocationType::GPU' instead of 'paddle::platform::is_cpu_space()'
        4. Use camel code style
      
      * Remove all singletons in strings kernels
      
      * fix rocm compile
      
      * Fix py3 compile
      
      * Fix c++ coverage
      
      * 1. Add pstring proto type
        2. Add StringTensor debug info
        3. Rename case_convert_kernel to strings_lower_upper
        4. Remove serialize deserialize strings kernel
      
      * DataLayout::PSTRING -> DataLayout::PSTRING_UNION
      
      * Register pstring data type
      
      * Fix strings api gen
      
      * Fix dense tensor register pstring dtype
      
      * Fix error messages
      
      * remove line
      
      * add pstring unittest
      
      * remove test string api unittest
      
      * remove empty line
      
      * Remove some headers to decrease the size of executable file
  8. 25 Mar, 2022 4 commits
  9. 24 Mar, 2022 1 commit