1. 27 Jan, 2022 11 commits
    • [PluggableDevice] Add custom kernel support based on pten kernel management (#38848) · a8879215
      Aganlengzi committed
      * [Demo] custom kernel based on pten kernel
      
      * merge and npu custom work well
      
      * del comments
      
      * delete other code
      
      * fix CUDAContext
      
      * fix not found small_vector.h
      
      * support NPU
      
      * fix NPUContext
      
      * fix DeviceContext support
      
      * add UT
      
      * fix call
      
      * add UT
      
      * fix
      
      * fix for comments and ut
      
      * add MACRO control
      
      * fix multi input output
      
      * support env CUSTOM_DEVICE_ROOT
      
      * deal with special cases
      
      * fix for Windows
      
      * try coverage with test_custom_kernel_dot.py
      
      * fix test_custom_kernel_dot
      
      * fix test_custom_kernel_dot
      
      * fix merge
      
      * fix merge
      
      * fix CI
      
      * update
      
      * merge and fix
      
      * remove WITH_CUSTOM_KERNEL
      
      * fix merge
      
      * merge and fix
      
      * fix ut
      
      * fix ut for mac
      
      * add more UT
      
      * add more UT
      
      * fix
    • fix UT test_lr_scheduler random fail (#39254) · 7e6a2190
      zhouweiwei2014 committed
    • Update passes in quant2_int8_mkldnn_pass (#38912) · 0e235e58
      joanna.wozna.intel committed
      * Update pass in quant2_int8_mkldnn_pass
      
      * Back to the previous scale_matmul order
      
      * Change place of cpu_quantize_placement_pass
    • fix shuffle_channel_detect_pass (#39242) · af9ddeb7
      wenbin committed
      * shuffle channel pass
      
      * add ut
      
      * timeout fix
      
      * makefile fix
    • 【Auto Parallel】Update Planner (#39201) · f2226441
      caozhou committed
      * update planner
      
      * update unitest
      
      * update dist matmul
      
      * update auto converter
    • optimize kunlun/xpu softmax_with_cross_entropy add add unitest (#39180) · 2b9bb8bb
      QingshuChen committed
      * optimize kunlun/xpu softmax_with_cross_entropy add add unitest
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
      
      * minor
      *test=kunlun
    • Add SparseCooTensor and SparseCsrTensor (#38906) · a7edb3f3
      zhangkaihuo committed
      * fix bug:
      1. atten: set the default value of attn_dropout_rate to None
      2. ffn: add activation parameter
      
      * for pure fp16
      
      * Add a SparseCsrTensor
      
      * remove unused functional
      
      * remove const
      
      * remove SetMemoberTensor
      
      * remove non_zero_nums_, the number of non zero elements of each batch can be obtained from the crows
      
      * SparseCooTensor
      
      * add SetMember
      
      * merge upstream; add SetMember
      
      * merge upstream
      
      * merge upstream; add newline at end of file
      
      * add newline at end of file
      
      * remove newline at end of file
      
      * remove newline at end of file
      
      * stash
      
      * use pten::framework::make_ddim

      * use pten::framework::make_ddim
      
      * merge upstream; use the latest mutable_data
      
      * merge upstream; use the latest mutable_data
      
      * return mutable dense tensor
    • 【Auto Parallel】update dist param grad for pass (#38941) · cac6f408
      caozhou committed
      * update dist param grad for pass
      
      * update unitest
      
      * update unitests
      
      * fix conflict
    • [Paddle-Inference]: fix concat slice (#39096) · f080e8d5
      Wangzheee committed
      * Paddle-Inference:fix_concat_slice
      
      * Paddle-Inference:fix_concat_slice
      
      * Paddle-Inference:fix_concat_slice
      
      * Paddle-Inference:fix_concat_slice
      
      * [Paddle-Inference]: fix concat slice
      
      * [Paddle-Inference]: fix concat slice
      
      * [Paddle-Inference]: fix concat slice
    • Take/Put_along_axis more input size support (#39072) · 41a64351
      huangxu96 committed
      Support the cases that the indices shape size is larger than the arr shape size
    • [Optimizer] Add master weight for opt state_dict (#39121) · 3e6950d5
      zhangbo9674 committed
      * add master weight for opt state_dict
      
      * check empty of master weight
      
      * strict gpu test
      
      * refine unittest
  2. 26 Jan, 2022 10 commits
  3. 25 Jan, 2022 17 commits
  4. 24 Jan, 2022 2 commits