1. 18 Jan, 2023 8 commits
  2. 17 Jan, 2023 6 commits
    • Refine munmap freq for RefcountedMemoryMapAllocation (#49691) · 3fdc105f
      Committed by zhangbo9674
      * refine munmap freq for ref_cnt_mmap_allocator
      
      * add shm reuse logic
      
      * fix compile bug
      
      * fix compile bug
      
      * fix bug of file refcount
      
      * fix compile bug
      
      * fix compile bug
      
      * refine code for delete shm case
      
      * polish code
      
      * refine shm cache pool size setting logic
      
      * set buffer is 2
      
      * refine shm cache size logic
      
      * refine max shm cache
      
      * refine shm cache size
    • [Zero-Dim] support input 0D Tensor for equal_all (#49845) · f287b1e9
      Committed by yeliang2258
      * add zero dims test
      
      * update code
      
      * fix zero dims
      
      * update code
    • support CUDA Graph for new executor (#49708) · 8e5ed04d
      Committed by pangyoki
      * new exe supports CUDA Graph
      
      * fix
      
      * fix
      
      * fix
      
      * fix FLAGS_use_stream_safe_cuda_allocator in unittest
      
      * insert output of coalesce_tensor op to skip_gc_var
      
      * fix
    • [PHI]Change feed_op to phi kernel (#49116) · f7f1dc03
      Committed by YuanRisheng
      * change feed_op to phi kernel
      
      * fix ci bugs
      
      * fix build bugs
      
      * fix ci bugs
      
      * fix compile bugs
      
      * fix ci bugs
      
      * perfect code
      
      * perfect comment code
      
      * fix install bugs
      
      * modify code according comment
      
      * remove visitor in feed_op
      
      * modify according comment
      
      * perfect code according comment
      
      * add infershape
      
      * fix py3 bugs
      
      * fix getexpected kernel type
      
      * fix getexpected kernel type
      
      * fix ci bugs
      
      * add registry for custom device
      
      * fix py3 bugs
      
      * fix floating point error
      
      * fix py3 test bugs
    • SetDevice when parse TensorBase (#49860) · 4c576870
      Committed by HongyuJia
    • 【Prim】Add multiply,expand,div vjp rules (#49831) · 39c6765a
      Committed by Xiaoxu Chen
      * support elementwise base func
      
      * fix compiling error and add test
      
      * support vjp for div using comp
      
      * remove additional change
      
      * fix dy2st error with magic num
      
      * fix dy magic num
      
      * another magic
      
      * another magic
      
      * another magic
      
      * add skip rename strategy
      
      * support add vjp
      
      * support add with new axis cal
      
      * support sub vjp
      
      * [prim] add multiply vjp rules
      
      * [prim] add multiply vjp rules
      
      * [prim] fix no infershape with composite in _append_backward_ops
      
      * [prim] add expand vjp rule
      
      * [prim] add exp vjp rule
      
      * uncomment infer shape for reshape/sum static prim api
      
      * [prim] fix tanh nullptr error
      
      * remove some print message
      
      * fix magic number in run_program relative tests @JiaBinYang
      
      * [prim] add expand,multiply,exp vjp rules
      
      * fix only support single direction reduce error
      
      * infer reduce dims using out dims
      Co-authored-by: NJiabinYang <360788950@qq.com>
  3. 16 Jan, 2023 7 commits
  4. 13 Jan, 2023 16 commits
  5. 12 Jan, 2023 3 commits