1. 29 Jan, 2022 6 commits
  2. 28 Jan, 2022 18 commits
  3. 27 Jan, 2022 16 commits
    • implement AllocateFrom (#39280) · d89f246c
      zhangkaihuo committed
    • Add Khop Graph Sampler API (#39146) · 35f949b5
      Siming Dai committed
      * add the test case for the UVA
      
      * add the context load for the uva
      
      * Add graph_sample kernel
      
      * Add graph_sample commit
      
      * add new commit for graph_sample
      
      * add unsigned long long int
      
      * delete some remarks
      
      * add cpu version
      
      * add cuda eids
      
      * add cpu eids
      
      * delete _uva
      
      * optimize speed: emplace_back, last_layer
      
      * add to_uva_tensor
      
      * add cpu return_eids choice
      
      * add gpu return_eids choice
      
      * add cpu reindex_nodes
      
      * add gpu reindex_nodes
      
      * rename op and add OMP for cpu
      
      * add incubate api
      
      * fix the compile problem for the PADDLE_ENFORCE and different device
      
      * fix the rocm and windows compile problem
      
      * add unittest for graph_sample_neighbors
      
      * fix cpu unittest and unique problem
      
      * fix uva unittest, fix cuda unique problem
      
      * fix the windows compile problem
      
      * fix the windows rand_r compile problem
      
      * add correct unittest, add src_eids dispensable
      
      * delete black
      
      * combine uva unittest
      
      * mv Sample_index to Sample_Index; check input shape; fix random sample func
      
      * delete memset & cudaMemset
      
      * fix according to PR comments
      
      * fix rocm ci
      
      * modify function names according to the specification
      
      * fix windows_openblas ci
      
      * refine annotations, fix windows unittest, add default value for uva device_id, fix bug for input nodes with empty neighbors
      
      * fix rocm ci
      
      * rename graph_sample_neighbors as graph_khop_sampler, add incubate api doc
      
      * add data type
      
      * fix conflict
      Co-authored-by: Nwawltor <fangzeyang0904@hotmail.com>
    • [pten] remove concat fluid kernel (#39268) · 552db8dc
      Leo Chen committed
    • Add kernelsignature constructor for windows (#39253) · 33e3f5ac
      Chen Weihang committed
      * add constructor for win
      
      * change impl
      
      * fix bug
    • [PTen] Remove ReMakePtenDenseTensor (#39094) · 98c1829b
      zyfncg committed
      * remove remake densetensor
      
      * fix eager test error
      
      * fix bug in eager
    • refactor elementwise sub grad (#39225) · 7a1e1193
      YuanRisheng committed
    • [PTen]Support AllocateFrom in Tensor and Alloc/HostAlloc in Context (#39022) · 5631da9c
      Aurelius84 committed
      * Support allocate_from in Tensor and allocate_data in Context
      
      * fix #ifdef CUDA
      
      * fix cycle depends
      
      * fix test_xxx_dev_api failed
      
      * fix windows compiling error
      
      * fix unittest
      
      * modify into PImpl
      
      * fix selected rows
      
      * add TODO comment
      
      * refine interface according to reviewer
    • [PTen] Add infermeta registry (#39204) · f3f16126
      Chen Weihang committed
      * add infermeta registry
      
      * add infermeta registry
      
      * add unittest
      
      * polish details
    • [MLU] add compile ci scripts for MLU, test=mlu_ci (#39122) · 56410b4a
      Qi Li committed
    • [PluggableDevice] Add custom kernel support based on pten kernel management (#38848) · a8879215
      Aganlengzi committed
      * [Demo] custom kernel based on pten kernel
      
      * merge and npu custom work well
      
      * del comments
      
      * delete other code
      
      * fix CUDAContext
      
      * fix not found small_vector.h
      
      * support NPU
      
      * fix NPUContext
      
      * fix DeviceContext support
      
      * add UT
      
      * fix call
      
      * add UT
      
      * fix
      
      * fix for comments and ut
      
      * add MACRO control
      
      * fix multi input output
      
      * support env CUSTOM_DEVICE_ROOT
      
      * deal with special cases
      
      * fix for Windows
      
      * try coverage with test_custom_kernel_dot.py
      
      * fix test_custom_kernel_dot
      
      * fix test_custom_kernel_dot
      
      * fix merge
      
      * fix merge
      
      * fix CI
      
      * update
      
      * merge and fix
      
      * remove WITH_CUSTOM_KERNEL
      
      * fix merge
      
      * merge and fix
      
      * fix ut
      
      * fix ut for mac
      
      * add more UT
      
      * add more UT
      
      * fix
    • fix UT test_lr_scheduler random fail (#39254) · 7e6a2190
      zhouweiwei2014 committed
    • Update passes in quant2_int8_mkldnn_pass (#38912) · 0e235e58
      joanna.wozna.intel committed
      * Update pass in quant2_int8_mkldnn_pass
      
      * Back to the previous scale_matmul order
      
      * Change place of cpu_quantize_placement_pass
    • [pten] add full xpu kernel (#39172) · 93839717
      chentianyu03 committed
      * add full_kernel xpu
      
      * fix full xpu register device type error
      
      * fix full kernel bug
      
      * add fulllike kernel impl and replace with raw kernel
      
      * fix dev_ctx convert template args error
      
      * modify namespace and header file
      
      * add isinf check
      
      * fix input type args in TensorSetConstantXPU error
    • fix shuffle_channel_detect_pass (#39242) · af9ddeb7
      wenbin committed
      * shuffle channel pass
      
      * add ut
      
      * timeout fix
      
      * makefile fix
    • [Auto Parallel] Update Planner (#39201) · f2226441
      caozhou committed
      * update planner
      
      * update unittest
      
      * update dist matmul
      
      * update auto converter
    • optimize kunlun/xpu softmax_with_cross_entropy add add unitest (#39180) · 2b9bb8bb
      QingshuChen committed
      * optimize kunlun/xpu softmax_with_cross_entropy add add unitest
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun