1. 23 Aug 2023, 1 commit
  2. 22 Aug 2023, 1 commit
  3. 21 Aug 2023, 1 commit
  4. 18 Aug 2023, 1 commit
    • [Inference] Make share_external_data support bf16 and bool; fix while_op cache_inference_while_scope when using fleet_executor. (#56055) · c65ef07c
      lzy committed
      
      * 1. make share_external_data support bf16 and bool; 2. do not drop_kids when cache_inference_while_scope is enabled
      * fix FLAGS_cache_inference_while_scope
      * add unit test
      * add unit test
      * skip unit test when cudnn_version < 8100
      * skip test share_external_data_bf16 when CUDA_ARCH < 80
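      The zero-copy input path this commit extends is ShareExternalData on paddle_infer::Tensor. Below is a minimal sketch of that call pattern; it is generic Paddle Inference usage rather than code from the PR, the model file names are placeholders, the include path may differ between distributions, and before #56055 the accepted element types did not include bf16 or bool.

      ```cpp
      // Sketch: feed an input without copying, via ShareExternalData.
      // Paths and shapes are placeholders; error handling is omitted.
      #include <vector>
      #include "paddle_inference_api.h"

      int main() {
        paddle_infer::Config config;
        config.SetModel("model.pdmodel", "model.pdiparams");  // placeholder files
        auto predictor = paddle_infer::CreatePredictor(config);

        auto names = predictor->GetInputNames();
        auto input = predictor->GetInputHandle(names[0]);

        // Buffer owned by the caller; the tensor only borrows it, so no copy is
        // made (unlike CopyFromCpu). #56055 widens the dtypes this path accepts
        // to include bf16 and bool.
        std::vector<int> shape = {1, 3, 224, 224};
        std::vector<float> data(1 * 3 * 224 * 224, 0.f);
        input->ShareExternalData(data.data(), shape, paddle_infer::PlaceType::kCPU);

        predictor->Run();
        return 0;
      }
      ```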
  5. 17 Aug 2023, 1 commit
  6. 16 Aug 2023, 1 commit
  7. 15 Aug 2023, 1 commit
  8. 14 Aug 2023, 1 commit
  9. 10 Aug 2023, 1 commit
  10. 09 Aug 2023, 2 commits
  11. 07 Aug 2023, 3 commits
  12. 04 Aug 2023, 2 commits
  13. 03 Aug 2023, 2 commits
  14. 02 Aug 2023, 3 commits
    • [XPU] Add conv1d fuse pass (#55719) · 22c7a6eb
      wz1qqx committed
    • [Inference] Replace groupNorm implementation when data types are bf16/fp16 and data format is NHWC. (#55399) · e61d892a
      yangjianfengo1 committed
      
      * finish
      * handle odd cPerGroup
      * fix bf16
      * single channel
      * code style
      * precision alignment
      * add header file
      * add bf16 header file
      * bf16 2
      * bf16
      * bf16 header
      * bf16 compile
      * Python test
      * bf16 compile
      * bf16 compile
      * disable Python test
      * nhwc
      * test
      * mean var
      * bf16 success
      * su
      * ctest success
      * use is_same_as
      * is_same
      * use is_same
      * rtol
      * gpu_stream
      * remove sigmoid
      * fix bfloat16 type
      * use cuda_bf16_hpp
      * use_cuda_arch
      * bfloat162float2
      * remove inplace_tol
      * remove max_relative_tol
      * temporary save
      * precision alignment
      * temporary save
      * plugin
      * precision alignment
      * alignment
      * include cuda.h
      * remove half
      * half single
      * CI
      * add const
      * CI
      * cudaMemset
      * remove printf
      * fp16 test
      * add half compute
      * remove bf16 CI
      * remove CI
      * CI approval
      * remove fluid include
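      For reference, the math the replaced groupNorm performs on an NHWC tensor is sketched below as a plain float CPU routine. The PR's actual change is a fused GPU plugin doing the same computation in fp16/bf16, so this only illustrates the per-group statistics and the affine step, not the plugin code.

      ```cpp
      // Reference group normalization over an NHWC tensor (float, CPU only).
      #include <cmath>
      #include <vector>

      void GroupNormNHWC(const float* x, const float* gamma, const float* beta,
                         float* y, int n, int h, int w, int c, int groups,
                         float eps = 1e-5f) {
        const int c_per_group = c / groups;  // channels per group
        const int spatial = h * w;
        for (int b = 0; b < n; ++b) {
          for (int g = 0; g < groups; ++g) {
            // 1) mean and variance over (H, W, c_per_group) for this sample/group
            double sum = 0.0, sq_sum = 0.0;
            for (int s = 0; s < spatial; ++s) {
              for (int k = 0; k < c_per_group; ++k) {
                const int ch = g * c_per_group + k;
                const float v = x[(b * spatial + s) * c + ch];  // NHWC indexing
                sum += v;
                sq_sum += static_cast<double>(v) * v;
              }
            }
            const double cnt = static_cast<double>(spatial) * c_per_group;
            const float mean = static_cast<float>(sum / cnt);
            const float var = static_cast<float>(sq_sum / cnt - mean * mean);
            const float rstd = 1.0f / std::sqrt(var + eps);
            // 2) normalize and apply per-channel gamma/beta
            for (int s = 0; s < spatial; ++s) {
              for (int k = 0; k < c_per_group; ++k) {
                const int ch = g * c_per_group + k;
                const int idx = (b * spatial + s) * c + ch;
                y[idx] = (x[idx] - mean) * rstd * gamma[ch] + beta[ch];
              }
            }
          }
        }
      }
      ```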
    • [XPU] Add gather_squeeze_pass (#55605) · d13a49d6
      jiangfan06 committed
  15. 01 Aug 2023, 1 commit
  16. 27 Jul 2023, 2 commits
  17. 24 Jul 2023, 2 commits
  18. 21 Jul 2023, 3 commits
  19. 20 Jul 2023, 2 commits
  20. 19 Jul 2023, 2 commits
  21. 17 Jul 2023, 2 commits
  22. 13 Jul 2023, 1 commit
  23. 12 Jul 2023, 3 commits
    • [Inference] rewrite identity_op_clean_pass (#55240) · 2363e623
      Yuanle Liu committed
      * rewrite identity_op_clean_pass
      * fix
      * adjust identity_op_clean_pass order in gpu passes
      * fix ut
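      As background for what an identity_op_clean_pass removes: operators whose output is numerically identical to their input (for example a scale op with scale=1 and bias=0) can be deleted and their consumers re-wired to the producer. The sketch below shows that idea on a toy op list; it does not use Paddle's real ir::Graph or pass API, and the struct and function names are made up for illustration.

      ```cpp
      // Toy illustration of cleaning identity ops out of an op list.
      #include <string>
      #include <unordered_map>
      #include <vector>

      struct Op {
        std::string type;
        std::string input;
        std::string output;
        std::unordered_map<std::string, float> attrs;
      };

      static bool IsIdentity(const Op& op) {
        if (op.type == "scale") {
          const float s = op.attrs.count("scale") ? op.attrs.at("scale") : 1.f;
          const float b = op.attrs.count("bias") ? op.attrs.at("bias") : 0.f;
          return s == 1.f && b == 0.f;  // scale(x, 1, 0) == x
        }
        return false;
      }

      std::vector<Op> CleanIdentityOps(std::vector<Op> ops) {
        // Map each identity op's output back to its input, then rewrite consumers.
        std::unordered_map<std::string, std::string> replace;
        std::vector<Op> kept;
        for (const auto& op : ops) {
          if (IsIdentity(op)) {
            replace[op.output] = op.input;
          } else {
            kept.push_back(op);
          }
        }
        for (auto& op : kept) {
          while (replace.count(op.input)) op.input = replace.at(op.input);
        }
        return kept;
      }
      ```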
    • [ONEDNN] Upgrade oneDNN version to v3.1 (#52463) · cfa513f7
      YangQun committed
      * squash pick the poc code
      * fix build after rebase
      * fix int8 conv and fc uts
      * Fix and clean-up Get_SRC_Scale_Memory
      * fix floating point fc uts
      * fix test_analyzer_int8_googlenet
      * test_analyzer_int8_mobilenetv1
      * fix int8 mobilenet v2 and v3
      * fix build error after rebase
      * [oneDNN] rename library version
      * fix conv bias datatype
      * try to fix import error
      * fix rebase error
      * [oneDNN] pack library into python wheel
      * add MKLDNN_SHARED_LIB_3 to env_dict
      * fix test_analyzer_bert
      * fix fill_constant op kernel
      * fix ernie and matmul op ut
      * fix softplus ut
      * fix conv+relu6 fusion ut
      * fix hardswish fusion
      * fix quant+transpose fusion ut
      * fix sgd ut
      * fix int8 matmul with flatten
      * fix fc+scale fusion
      * fix conv/matmul+gelu fusion uts
      * fix rebase error
      * Revert "fix conv/matmul+gelu fusion uts"
      This reverts commit 47eb5e49972bd8f7271a233def9bfb3e98ce78e1.
      * upgrade to onednn v3.1
      * remove older version onednn
      * use densetensor::data() for achieving mean and var in layernorm impl
      * comments for atol of integer tests
      * fix clang-format
      * Revert "remove older version onednn"
      This reverts commit 783e57ddfd4401254596eae7d47adb9b03590c09.
      * improve binary handle
      * fix expand kernel
      * Revert "use densetensor::data() for achieving mean and var in layernorm impl"
      * always use forward_inference for conv
      * remove activation scales
      * rollback changes to mkldnn.cmake
      * address comments
      * port changes to dequantize kernel
      * fix merge error
      * fix fused_elementwise_kernel
      * upgrade onednn version to v3.1.1
      * fix some approval error
      * fix error msg format
      * remove old onednn libs
      * try to fix symbolic link issue
      * fix cinn test case segfault
      * do not explicit link test with onednn
      * remove unnecessary changes
      * integrate CINN with onednn v3
      * link with mkldnn project
      * fix cinn build file
      
      ---------
      Co-authored-by: Tomasz Socha <tomasz.socha@intel.com>
      Co-authored-by: Chen, Xinyu1 <xinyu1.chen@intel.com>
      Co-authored-by: tianshuo78520a <707759223@qq.com>
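      Part of what the oneDNN v3 upgrade entails is the API change in which operation descriptors were removed and primitive descriptors are built directly from the engine plus parameters. The snippet below is a generic oneDNN v3-style ReLU example in that spirit, not Paddle's kernel code; it assumes the oneDNN v3 headers and a CPU engine.

      ```cpp
      // Generic oneDNN v3 usage: build a primitive_desc directly (no op_desc),
      // then execute an eltwise ReLU on f32 data.
      #include "oneapi/dnnl/dnnl.hpp"

      int main() {
        dnnl::engine eng(dnnl::engine::kind::cpu, 0);
        dnnl::stream strm(eng);

        dnnl::memory::desc md({2, 3, 8, 8}, dnnl::memory::data_type::f32,
                              dnnl::memory::format_tag::nchw);
        dnnl::memory src(md, eng), dst(md, eng);  // library-allocated buffers

        // v3 style: primitive_desc takes the engine and all parameters directly.
        dnnl::eltwise_forward::primitive_desc relu_pd(
            eng, dnnl::prop_kind::forward_inference, dnnl::algorithm::eltwise_relu,
            md, md, /*alpha=*/0.f, /*beta=*/0.f);
        dnnl::eltwise_forward relu(relu_pd);

        relu.execute(strm, {{DNNL_ARG_SRC, src}, {DNNL_ARG_DST, dst}});
        strm.wait();
        return 0;
      }
      ```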
    • [clang-tidy] enable `readability-container-size-empty` check (#55279) · be3a6fa7
      Wang Xin committed
      * [clang-tidy] enable readability-container-size-empty check
      * fix test_custom_kernel failure
      * add clang-tidy-10 in dockerfile
      * add clang-tidy in dockerfile
      * fix bug
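      For context, readability-container-size-empty flags emptiness tests written against size() and suggests empty() instead. A toy example of the pattern follows; it is illustrative, not taken from the codebase.

      ```cpp
      #include <vector>

      bool HasPending(const std::vector<int>& queue) {
        // clang-tidy (readability-container-size-empty) would flag:
        //   return queue.size() > 0;
        // and suggest the equivalent, clearer form:
        return !queue.empty();
      }
      ```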
  24. 07 Jul 2023, 1 commit