    alios: blk-throttle: limit bios to fix amount of pages entering writeback prematurely · 06a67773
    Committed by Xiaoguang Wang
    Currently in blk_throtl_bio(), if a bio exceeds its throtl_grp's bps
    or iops limit, the bio is queued on the throtl_grp's throtl_service_queue.
    The mm subsystem then keeps submitting more pages even though the
    underlying device cannot handle these I/O requests, so a large number of
    pages enter writeback prematurely. If a process later writes to one of
    these pages, it has to wait for a long time.
    
    I ran a test: one process does buffered writes to a 1GB file, with
    its blkcg's max bps limit set to 10MB/s. I observed this:
    	#cat /proc/meminfo  | grep -i back
    	Writeback:        900024 kB
    	WritebackTmp:          0 kB
    
    This Writeback value is far too large: many bios have indeed been
    queued in the throtl_grp's throtl_service_queue. If a process tries to
    write the page of the last bio in this queue, it calls
    wait_on_page_writeback(page), which must wait for all the previous bios
    to finish and therefore takes a long time. We have also seen 120s hung
    task warnings on our servers:
    
     INFO: task kworker/u128:0:30072 blocked for more than 120 seconds.
           Tainted: G            E 4.9.147-013.ali3000_015_test.alios7.x86_64 #1
     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
     kworker/u128:0  D    0 30072      2 0x00000000
     Workqueue: writeback wb_workfn (flush-8:16)
      ffff882ddd066b40 0000000000000000 ffff882e5cad3400 ffff882fbe959e80
      ffff882fa50b1a00 ffffc9003a5a3768 ffffffff8173325d ffffc9003a5a3780
      00ff882e5cad3400 ffff882fbe959e80 ffffffff81360b49 ffff882e5cad3400
     Call Trace:
      [<ffffffff8173325d>] ? __schedule+0x23d/0x6d0
      [<ffffffff81360b49>] ? alloc_request_struct+0x19/0x20
      [<ffffffff81733726>] schedule+0x36/0x80
      [<ffffffff81736c56>] schedule_timeout+0x206/0x4b0
      [<ffffffff81036c69>] ? sched_clock+0x9/0x10
      [<ffffffff81363073>] ? get_request+0x403/0x810
      [<ffffffff8110ca10>] ? ktime_get+0x40/0xb0
      [<ffffffff81732f8a>] io_schedule_timeout+0xda/0x170
      [<ffffffff81733f90>] ? bit_wait+0x60/0x60
      [<ffffffff81733fab>] bit_wait_io+0x1b/0x60
      [<ffffffff81733b28>] __wait_on_bit+0x58/0x90
      [<ffffffff811b0d91>] ? find_get_pages_tag+0x161/0x2e0
      [<ffffffff811aff62>] wait_on_page_bit+0x82/0xa0
      [<ffffffff810d47f0>] ? wake_atomic_t_function+0x60/0x60
      [<ffffffffa02fc181>] mpage_prepare_extent_to_map+0x2d1/0x310 [ext4]
      [<ffffffff8121ff65>] ? kmem_cache_alloc+0x185/0x1a0
      [<ffffffffa0305a2f>] ? ext4_init_io_end+0x1f/0x40 [ext4]
      [<ffffffffa0300294>] ext4_writepages+0x404/0xef0 [ext4]
      [<ffffffff81508c64>] ? scsi_init_io+0x44/0x200
      [<ffffffff81398a0f>] ? fprop_fraction_percpu+0x2f/0x80
      [<ffffffff811c139e>] do_writepages+0x1e/0x30
      [<ffffffff8127c0f5>] __writeback_single_inode+0x45/0x320
      [<ffffffff8127c942>] writeback_sb_inodes+0x272/0x600
      [<ffffffff8127cf6b>] wb_writeback+0x10b/0x300
      [<ffffffff8127d884>] wb_workfn+0xb4/0x380
      [<ffffffff810b85e9>] ? try_to_wake_up+0x59/0x3e0
      [<ffffffff810a5759>] process_one_work+0x189/0x420
      [<ffffffff810a5a3e>] worker_thread+0x4e/0x4b0
      [<ffffffff810a59f0>] ? process_one_work+0x420/0x420
      [<ffffffff810ac026>] kthread+0xe6/0x100
      [<ffffffff810abf40>] ? kthread_park+0x60/0x60
      [<ffffffff81738499>] ret_from_fork+0x39/0x50
    
    To fix this issue, we can simply limit the number of bios queued in the
    throtl_service_queue. Currently we cap it at the throtl_grp's bps or
    iops limit; if the queue still exceeds the cap, the submitter just
    sleeps for a while.
    Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
    Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
    Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
    Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
blk-throttle.c 74.2 KB