    alinux: block-throttle: only do io statistics if needed · b8a94ed8
    Committed by Xiaoguang Wang
    task #29063222
    
    The current blk-throttle code always collects io statistics, even
    when users have not specified any valid throttle rules. This adds
    significant overhead for applications that do not use blk-throttle,
    and the overhead is worse on arm; see the perf data below, captured
    on arm:
    
    sudo taskset -c 66 fio -ioengine=io_uring -sqthread_poll=1 -hipri=1
    -sqthread_poll_cpu=65 -registerfiles=1 -fixedbufs=1 -direct=1
    -filename=/dev/nvme0n1 -bs=4k -iodepth=8 -rw=randwrite  -time_based
    -ramp_time=30 -runtime=60  -name="test"
    
    Samples: 25K of event 'cycles', Event count (approx.): 16586974662
    Overhead  Command      Shared Object      Symbol
       3.54%  io_uring-sq  [kernel.kallsyms]  [k] throtl_stats_update_completion
       0.89%  io_uring-sq  [kernel.kallsyms]  [k] throtl_bio_end_io
       0.66%  io_uring-sq  [kernel.kallsyms]  [k] blk_throtl_bio
       0.05%  io_uring-sq  [kernel.kallsyms]  [k] blk_throtl_stat_add
       0.05%  io_uring-sq  [kernel.kallsyms]  [k] throtl_track_latency
       0.01%  io_uring-sq  [kernel.kallsyms]  [k] blk_throtl_bio_endio
    
    Samples: 25K of event 'cycles', Event count (approx.): 16586974662
    Overhead  Command      Shared Object      Symbol
       1.62%  io_uring-sq  [kernel.kallsyms]  [k] io_submit_sqes
       1.06%  io_uring-sq  [kernel.kallsyms]  [k] io_issue_sqe
       0.32%  io_uring-sq  [kernel.kallsyms]  [k] __io_queue_sqe
       0.06%  io_uring-sq  [kernel.kallsyms]  [k] io_queue_sqe
    
    The test above does not set any valid blk-throttle rules, yet the
    overhead introduced by blk-throttle is even larger than that of many
    io_uring framework functions, which is not acceptable.
    
    To fix this, only collect io statistics when users specify valid
    blk-throttle rules, which also improves performance.
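
The idea can be illustrated with a minimal user-space sketch. This is not the code from this patch: `struct demo_tg`, `has_rules` and `demo_io_complete` are hypothetical names made up for illustration, and the real blk-throttle structures and hooks differ. It only shows the general pattern of returning early from the completion path when no rule is configured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a throttle group; NOT the kernel's struct. */
struct demo_tg {
    bool has_rules;        /* true once a valid limit has been configured */
    uint64_t stat_ios;     /* number of completions accounted so far */
    uint64_t stat_lat_ns;  /* sum of accounted completion latencies */
};

/*
 * Hypothetical completion hook: update statistics only when the group
 * has at least one valid rule. Without rules the function returns
 * before touching any counters, so the fast path does no bookkeeping.
 */
void demo_io_complete(struct demo_tg *tg, uint64_t lat_ns)
{
    assert(tg != NULL);

    if (!tg->has_rules)
        return;

    tg->stat_ios++;
    tg->stat_lat_ns += lat_ns;
}
```

In this sketch a group with `has_rules == false` keeps both counters at zero no matter how many completions it sees, while a group with `has_rules == true` accumulates them as before.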
    
    Before this patch:
    clat (usec): min=5, max=6871, avg=18.70, stdev=17.89
     lat (usec): min=9, max=6871, avg=18.84, stdev=17.89
    WRITE: bw=1618MiB/s (1697MB/s), 1618MiB/s-1618MiB/s (1697MB/s-1697MB/s),
    io=94.8GiB (102GB), run=60001-60001msec
    
    With this patch:
    clat (usec): min=5, max=7554, avg=17.49, stdev=18.24
     lat (usec): min=9, max=7554, avg=17.62, stdev=18.24
    WRITE: bw=1727MiB/s (1810MB/s), 1727MiB/s-1727MiB/s (1810MB/s-1810MB/s),
    io=101GiB (109GB), run=60001-60001msec
    
    About 6.6% bps improvement and 6.4% latency reduction.
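
For reference, the percentages can be recomputed from the fio figures above. The helpers below are a small sketch with hypothetical names (`pct_gain`, `pct_reduction`). Using the reported bandwidth (1618 -> 1727 MiB/s) and average lat (18.84 -> 17.62 usec), they give roughly 6.7% and 6.5%, close to the rounded numbers quoted above.

```c
#include <assert.h>

/* Relative increase of `after` over `before`, in percent. */
double pct_gain(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}

/* Relative decrease from `before` to `after`, in percent. */
double pct_reduction(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

For example, `pct_gain(1618.0, 1727.0)` covers the bandwidth change and `pct_reduction(18.84, 17.62)` covers the average latency change.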

    Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
    Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
blk-throttle.c 74.2 KB