1. 31 Jan, 2023 16 commits
  2. 30 Jan, 2023 5 commits
  3. 25 Jan, 2023 1 commit
  4. 20 Jan, 2023 3 commits
  5. 19 Jan, 2023 2 commits
    • Fix paddle.squeeze_ bug (#49903) · 11e34ae0
      Committed by heliqi
      * fix squeeze_ bug
      
      * fix solve use squeeze_kernel
      
      * fix solve use squeeze_kernel
      
      * fix solve use squeeze_kernel
      
      * add test case
    • [KUNLUN] add op: maxpool_with_index (#49505) · f71f77e9
      Committed by jameszhang
      * [KUNLUN] add op: maxpool_with_index
      
      * use DeviceContext::Alloc() instead of DenseTensor::mutable_data()
      
      * fix file format
      
      * solve clip unittest failure
      
      * minor fix
      
      * Revert "solve clip unittest failure" since the issue is fixed in #49535
      
      This reverts commit 1127adc66e79afe35ac3c00bb34e6aaa7cd7d78b.
      
      * align with xdnn on the definition of mask in max_pool_with_index
      
      * minor
  6. 18 Jan, 2023 8 commits
  7. 17 Jan, 2023 5 commits
    • Refine munmap freq for RefcountedMemoryMapAllocation (#49691) · 3fdc105f
      Committed by zhangbo9674
      * refine munmap freq for ref_cnt_mmap_allocator
      
      * add shm reuse logic
      
      * fix compile bug
      
      * fix compile bug
      
      * fix bug of file refcount
      
      * fix compile bug
      
      * fix compile bug
      
      * refine code for delete shm case
      
      * polish code
      
      * refine shm cache pool size setting logic
      
      * set buffer to 2
      
      * refine shm cache size logic
      
      * refine max shm cache
      
      * refine shm cache size
    • [Zero-Dim] support input 0D Tensor for equal_all (#49845) · f287b1e9
      Committed by yeliang2258
      * add zero dims test
      
      * update code
      
      * fix zero dims
      
      * update code
    • support CUDA Graph for new executor (#49708) · 8e5ed04d
      Committed by pangyoki
      * new exe supports CUDA Graph
      
      * fix
      
      * fix
      
      * fix
      
      * fix FLAGS_use_stream_safe_cuda_allocator in unittest
      
      * insert output of coalesce_tensor op to skip_gc_var
      
      * fix
    • [PHI] Change feed_op to phi kernel (#49116) · f7f1dc03
      Committed by YuanRisheng
      * change feed_op to phi kernel
      
      * fix ci bugs
      
      * fix build bugs
      
      * fix ci bugs
      
      * fix compile bugs
      
      * fix ci bugs
      
      * polish code
      
      * polish code comments
      
      * fix install bugs
      
      * modify code according to review comments
      
      * remove visitor in feed_op
      
      * modify according to review comments
      
      * polish code according to review comments
      
      * add infershape
      
      * fix py3 bugs
      
      * fix GetExpectedKernelType
      
      * fix GetExpectedKernelType
      
      * fix ci bugs
      
      * add registry for custom device
      
      * fix py3 bugs
      
      * fix floating point error
      
      * fix py3 test bugs
    • SetDevice when parsing TensorBase (#49860) · 4c576870
      Committed by HongyuJia