1. 25 Aug, 2023 1 commit
  2. 24 Aug, 2023 2 commits
  3. 23 Aug, 2023 2 commits
    • Integrate TRT qdq layers (#54803) · ae84c603
      Committed by Leo Chen
      * Integrate quantize/dequantize linear and add config for explicit quantization
      
      * Fix the build error
      
      * Add macro for TRT version < 8.0
      
      * Remove qdq UT from windows
      
      * Fix UT failure
      
      * Check TRT version in qdq UT
      
      * Test tensorrt_explicit_enabled API
      
      * Disable QDQ UT if TRT version < 8.5
      
      * Add quantization postfix into public APIs
      
      * Apply code formatter
      
      * Fix the UT failure for explicit quantization
      
      * Apply code formatter on modified files
      
      * Correct the year in copyright
    • b8d7f801
  4. 22 Aug, 2023 1 commit
  5. 21 Aug, 2023 1 commit
  6. 18 Aug, 2023 1 commit
    • [Inference] Make share_external_data supports bf16 and bool; fix while_op cache_inference_while_scope when using fleet_executor. (#56055) · c65ef07c
      Committed by lzy
      
      * 1. make share_external_data supports bf16 and bool; 2. don't drop_kids when cache_inference_while_scope
      
      * fix FLAGS_cache_inference_while_scope
      
      * add unitest
      
      * add unitest
      
      * skip unitest when cudnn_version < 8100
      
      * skip test share_external_data_bf16 when CUDA_ARCH < 80
  7. 17 Aug, 2023 1 commit
  8. 16 Aug, 2023 1 commit
  9. 15 Aug, 2023 1 commit
  10. 14 Aug, 2023 1 commit
  11. 10 Aug, 2023 1 commit
  12. 09 Aug, 2023 2 commits
  13. 07 Aug, 2023 3 commits
  14. 04 Aug, 2023 2 commits
  15. 03 Aug, 2023 2 commits
  16. 02 Aug, 2023 3 commits
    • [XPU]Add conv1d fuse pass (#55719) · 22c7a6eb
      Committed by wz1qqx
    • [Inference] Replace groupNorm when data types are bf16 and fp16, and data format is NHWC implementation. (#55399) · e61d892a
      Committed by yangjianfengo1
      
      * finish
      
      * cpergroup odd
      
      * fix bf16
      
      * single channel
      
      * code style
      
      * precision alignment
      
      * add head_file
      
      * add bf16 head file
      
      * bf16 2
      
      * bf16
      
      * bf16 head
      
      * bf16 compile
      
      * py test
      
      * bf16 compile
      
      * bf16 compile
      
      * unset py test
      
      * nhwc
      
      * test
      
      * mean var
      
      * bf16 success
      
      * su
      
      * ctest success
      
      * use is_same_as
      
      * is_same
      
      * use is_same
      
      * rtol
      
      * gpu_stream
      
      * del sigmod
      
      * fix bfloat16 type
      
      * use cuda_bf16_hpp
      
      * use_cuda_arch
      
      * bfloat162float2
      
      * del inplace_tol
      
      * del max_releative_tol
      
      * temp store
      
      * precision alignment
      
      * temp store
      
      * plugin
      
      * precision alignment
      
      * alignment
      
      * include cuda.h
      
      * del half
      
      * half single
      
      * ci
      
      * add const
      
      * ci
      
      * cudamemset
      
      * del printf
      
      * fp16 test
      
      * add half compute
      
      * del br16 ci
      
      * del ci
      
      * ci approve
      
      * del fluid include
    • [XPU] Add gather_squeeze_pass (#55605) · d13a49d6
      Committed by jiangfan06
  17. 01 Aug, 2023 1 commit
  18. 27 Jul, 2023 2 commits
  19. 24 Jul, 2023 2 commits
  20. 21 Jul, 2023 3 commits
  21. 20 Jul, 2023 2 commits
  22. 19 Jul, 2023 2 commits
  23. 17 Jul, 2023 2 commits
  24. 13 Jul, 2023 1 commit