1. 27 Nov, 2021 1 commit
  2. 19 Nov, 2021 1 commit
  3. 16 Nov, 2021 2 commits
    • blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release() · 2a19b28f
      Committed by Ming Lei
      To avoid slowing down queue destruction, we don't call
      blk_mq_quiesce_queue() in blk_cleanup_queue(); instead, canceling
      dispatch work is delayed until blk_release_queue().
      
      However, this approach has caused a kernel oops [1], reported by Changhui. The log
      shows that the scsi_device can be freed before blk_release_queue() runs,
      which is expected, since the scsi_device is released after the SCSI disk
      is closed and the scsi_device is removed.
      
      Fix the issue by canceling blk-mq dispatch work in both blk_cleanup_queue()
      and disk_release():
      
      1) When disk_release() runs, the disk has been closed and all sync
      dispatch activity has finished, so canceling dispatch work is enough to
      quiesce filesystem I/O dispatch activity.
      
      2) In blk_cleanup_queue(), we only care about passthrough requests. A
      passthrough request is always explicitly allocated and freed by
      its caller, so once the queue is frozen, all sync dispatch activity
      for passthrough requests has finished, and it is enough to just cancel
      dispatch work to prevent any further dispatch activity.
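
The ordering argued for in the two points above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel patch: every `toy_*` name is hypothetical, the "work item" is a plain flag rather than a real workqueue entry, and everything runs on a single thread. It shows that once pending dispatch work is canceled before the device goes away, a late worker run never touches the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the device whose state is needed at dispatch time. */
struct toy_device {
    int budget;
};

/* Hypothetical stand-in for a request queue with one deferred dispatch work item. */
struct toy_queue {
    struct toy_device *dev; /* borrowed pointer: the queue does not keep it alive */
    bool work_pending;      /* models a queued asynchronous dispatch work item */
    int dispatched;         /* number of dispatch runs that touched the device */
};

/* Models queueing asynchronous dispatch work. */
static void toy_schedule_dispatch(struct toy_queue *q)
{
    q->work_pending = true;
}

/* Models canceling dispatch work; returns true if work was pending. */
static bool toy_cancel_dispatch(struct toy_queue *q)
{
    bool was_pending = q->work_pending;

    q->work_pending = false;
    return was_pending;
}

/* Models the worker: it dereferences the device only if work is still pending. */
static void toy_run_worker(struct toy_queue *q)
{
    if (!q->work_pending)
        return;
    q->work_pending = false;
    assert(q->dev != NULL); /* would fire if the device were already gone */
    q->dev->budget--;
    q->dispatched++;
}

/* Teardown in the order described above: cancel first, then release the device. */
static void toy_cleanup_queue(struct toy_queue *q)
{
    toy_cancel_dispatch(q);
    free(q->dev);
    q->dev = NULL;
}

/* Runs the whole scenario; returns how many dispatches touched the device. */
static int toy_teardown_scenario(void)
{
    struct toy_queue q = { .dev = malloc(sizeof(struct toy_device)) };

    if (!q.dev)
        return -1;
    q.dev->budget = 4;
    toy_schedule_dispatch(&q); /* work is queued but has not run yet */
    toy_cleanup_queue(&q);     /* cancel, then free the device */
    toy_run_worker(&q);        /* late worker run: must be a no-op */
    return q.dispatched;
}
```

Calling `toy_teardown_scenario()` returns 0 in this model because the canceled work never runs. Swapping the two steps inside `toy_cleanup_queue()` would leave the flag set after the device is freed, which is the shape of the failure described in the commit message.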
      
      [1] kernel panic log
      [12622.769416] BUG: kernel NULL pointer dereference, address: 0000000000000300
      [12622.777186] #PF: supervisor read access in kernel mode
      [12622.782918] #PF: error_code(0x0000) - not-present page
      [12622.788649] PGD 0 P4D 0
      [12622.791474] Oops: 0000 [#1] PREEMPT SMP PTI
      [12622.796138] CPU: 10 PID: 744 Comm: kworker/10:1H Kdump: loaded Not tainted 5.15.0+ #1
      [12622.804877] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015
      [12622.813321] Workqueue: kblockd blk_mq_run_work_fn
      [12622.818572] RIP: 0010:sbitmap_get+0x75/0x190
      [12622.823336] Code: 85 80 00 00 00 41 8b 57 08 85 d2 0f 84 b1 00 00 00 45 31 e4 48 63 cd 48 8d 1c 49 48 c1 e3 06 49 03 5f 10 4c 8d 6b 40 83 f0 01 <48> 8b 33 44 89 f2 4c 89 ef 0f b6 c8 e8 fa f3 ff ff 83 f8 ff 75 58
      [12622.844290] RSP: 0018:ffffb00a446dbd40 EFLAGS: 00010202
      [12622.850120] RAX: 0000000000000001 RBX: 0000000000000300 RCX: 0000000000000004
      [12622.858082] RDX: 0000000000000006 RSI: 0000000000000082 RDI: ffffa0b7a2dfe030
      [12622.866042] RBP: 0000000000000004 R08: 0000000000000001 R09: ffffa0b742721334
      [12622.874003] R10: 0000000000000008 R11: 0000000000000008 R12: 0000000000000000
      [12622.881964] R13: 0000000000000340 R14: 0000000000000000 R15: ffffa0b7a2dfe030
      [12622.889926] FS:  0000000000000000(0000) GS:ffffa0baafb40000(0000) knlGS:0000000000000000
      [12622.898956] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [12622.905367] CR2: 0000000000000300 CR3: 0000000641210001 CR4: 00000000001706e0
      [12622.913328] Call Trace:
      [12622.916055]  <TASK>
      [12622.918394]  scsi_mq_get_budget+0x1a/0x110
      [12622.922969]  __blk_mq_do_dispatch_sched+0x1d4/0x320
      [12622.928404]  ? pick_next_task_fair+0x39/0x390
      [12622.933268]  __blk_mq_sched_dispatch_requests+0xf4/0x140
      [12622.939194]  blk_mq_sched_dispatch_requests+0x30/0x60
      [12622.944829]  __blk_mq_run_hw_queue+0x30/0xa0
      [12622.949593]  process_one_work+0x1e8/0x3c0
      [12622.954059]  worker_thread+0x50/0x3b0
      [12622.958144]  ? rescuer_thread+0x370/0x370
      [12622.962616]  kthread+0x158/0x180
      [12622.966218]  ? set_kthread_struct+0x40/0x40
      [12622.970884]  ret_from_fork+0x22/0x30
      [12622.974875]  </TASK>
      [12622.977309] Modules linked in: scsi_debug rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs sunrpc dm_multipath intel_rapl_msr intel_rapl_common dell_wmi_descriptor sb_edac rfkill video x86_pkg_temp_thermal intel_powerclamp dcdbas coretemp kvm_intel kvm mgag200 irqbypass i2c_algo_bit rapl drm_kms_helper ipmi_ssif intel_cstate intel_uncore syscopyarea sysfillrect sysimgblt fb_sys_fops pcspkr cec mei_me lpc_ich mei ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sr_mod cdrom sd_mod t10_pi sg ixgbe ahci libahci crct10dif_pclmul crc32_pclmul crc32c_intel libata megaraid_sas ghash_clmulni_intel tg3 wdat_wdt mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug]
      Reported-by: ChanghuiZhong <czhong@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: linux-scsi@vger.kernel.org
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Link: https://lore.kernel.org/r/20211116014343.610501-1-ming.lei@redhat.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: fix missing queue put in error path · 95febeb6
      Committed by Jens Axboe
      If we fail the submission queue checks, we don't put the queue afterwards.
      This can cause various issues like stalls on scheduler switch or failure
      to remove the device, or like in the original bug report, timeout waiting
      for the device on reboot/restart.
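
The bug class described here, a reference taken on entry but not dropped on a failure path, can be sketched in userspace as follows. This is an illustrative model only, not the code touched by the commit: the `toy_*` names and the plain integer counter are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a queue with a usage counter. */
struct toy_queue_ref {
    int usage; /* number of outstanding references */
};

/* Takes a reference on the queue. */
static void toy_queue_enter(struct toy_queue_ref *q)
{
    q->usage++;
}

/* Drops a reference on the queue. */
static void toy_queue_exit(struct toy_queue_ref *q)
{
    assert(q->usage > 0);
    q->usage--;
}

/*
 * Models a submission path. On success the in-flight request keeps the
 * reference until it completes; on failure the reference must be dropped
 * before returning, otherwise the counter never reaches zero again.
 */
static int toy_submit(struct toy_queue_ref *q, bool checks_pass)
{
    toy_queue_enter(q);
    if (!checks_pass)
        goto fail;
    return 0;

fail:
    toy_queue_exit(q); /* the put that the error path must not skip */
    return -1;
}
```

In this model, a failed `toy_submit()` leaves `usage` at its previous value, while a successful one leaves it incremented until the caller drops it on completion. Without the put under `fail:`, anything that waits for the counter to drain would wait forever, which matches the stalls and timeouts described above.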
      
      While in there, fix a few whitespace discrepancies in the surrounding
      code.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215039
      Fixes: b637108a ("blk-mq: fix filesystem I/O request allocation")
      Reported-and-tested-by: Stephen Smith <stephenmsmith@blueyonder.co.uk>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 13 Nov, 2021 1 commit
  5. 12 Nov, 2021 1 commit
  6. 09 Nov, 2021 1 commit
  7. 08 Nov, 2021 1 commit
    • blk-mq: don't free tags if the tag_set is used by other device in queue initialztion · a846a8e6
      Committed by Ye Bin
      We got a UAF report on v5.10 as follows:
      [ 1446.674930] ==================================================================
      [ 1446.675970] BUG: KASAN: use-after-free in blk_mq_get_driver_tag+0x9a4/0xa90
      [ 1446.676902] Read of size 8 at addr ffff8880185afd10 by task kworker/1:2/12348
      [ 1446.677851]
      [ 1446.678073] CPU: 1 PID: 12348 Comm: kworker/1:2 Not tainted 5.10.0-10177-gc9c81b1e346a #2
      [ 1446.679168] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
      [ 1446.680692] Workqueue: kthrotld blk_throtl_dispatch_work_fn
      [ 1446.681448] Call Trace:
      [ 1446.681800]  dump_stack+0x9b/0xce
      [ 1446.682916]  print_address_description.constprop.6+0x3e/0x60
      [ 1446.685999]  kasan_report.cold.9+0x22/0x3a
      [ 1446.687186]  blk_mq_get_driver_tag+0x9a4/0xa90
      [ 1446.687785]  blk_mq_dispatch_rq_list+0x21a/0x1d40
      [ 1446.692576]  __blk_mq_do_dispatch_sched+0x394/0x830
      [ 1446.695758]  __blk_mq_sched_dispatch_requests+0x398/0x4f0
      [ 1446.698279]  blk_mq_sched_dispatch_requests+0xdf/0x140
      [ 1446.698967]  __blk_mq_run_hw_queue+0xc0/0x270
      [ 1446.699561]  __blk_mq_delay_run_hw_queue+0x4cc/0x550
      [ 1446.701407]  blk_mq_run_hw_queue+0x13b/0x2b0
      [ 1446.702593]  blk_mq_sched_insert_requests+0x1de/0x390
      [ 1446.703309]  blk_mq_flush_plug_list+0x4b4/0x760
      [ 1446.705408]  blk_flush_plug_list+0x2c5/0x480
      [ 1446.708471]  blk_finish_plug+0x55/0xa0
      [ 1446.708980]  blk_throtl_dispatch_work_fn+0x23b/0x2e0
      [ 1446.711236]  process_one_work+0x6d4/0xfe0
      [ 1446.711778]  worker_thread+0x91/0xc80
      [ 1446.713400]  kthread+0x32d/0x3f0
      [ 1446.714362]  ret_from_fork+0x1f/0x30
      [ 1446.714846]
      [ 1446.715062] Allocated by task 1:
      [ 1446.715509]  kasan_save_stack+0x19/0x40
      [ 1446.716026]  __kasan_kmalloc.constprop.1+0xc1/0xd0
      [ 1446.716673]  blk_mq_init_tags+0x6d/0x330
      [ 1446.717207]  blk_mq_alloc_rq_map+0x50/0x1c0
      [ 1446.717769]  __blk_mq_alloc_map_and_request+0xe5/0x320
      [ 1446.718459]  blk_mq_alloc_tag_set+0x679/0xdc0
      [ 1446.719050]  scsi_add_host_with_dma.cold.3+0xa0/0x5db
      [ 1446.719736]  virtscsi_probe+0x7bf/0xbd0
      [ 1446.720265]  virtio_dev_probe+0x402/0x6c0
      [ 1446.720808]  really_probe+0x276/0xde0
      [ 1446.721320]  driver_probe_device+0x267/0x3d0
      [ 1446.721892]  device_driver_attach+0xfe/0x140
      [ 1446.722491]  __driver_attach+0x13a/0x2c0
      [ 1446.723037]  bus_for_each_dev+0x146/0x1c0
      [ 1446.723603]  bus_add_driver+0x3fc/0x680
      [ 1446.724145]  driver_register+0x1c0/0x400
      [ 1446.724693]  init+0xa2/0xe8
      [ 1446.725091]  do_one_initcall+0x9e/0x310
      [ 1446.725626]  kernel_init_freeable+0xc56/0xcb9
      [ 1446.726231]  kernel_init+0x11/0x198
      [ 1446.726714]  ret_from_fork+0x1f/0x30
      [ 1446.727212]
      [ 1446.727433] Freed by task 26992:
      [ 1446.727882]  kasan_save_stack+0x19/0x40
      [ 1446.728420]  kasan_set_track+0x1c/0x30
      [ 1446.728943]  kasan_set_free_info+0x1b/0x30
      [ 1446.729517]  __kasan_slab_free+0x111/0x160
      [ 1446.730084]  kfree+0xb8/0x520
      [ 1446.730507]  blk_mq_free_map_and_requests+0x10b/0x1b0
      [ 1446.731206]  blk_mq_realloc_hw_ctxs+0x8cb/0x15b0
      [ 1446.731844]  blk_mq_init_allocated_queue+0x374/0x1380
      [ 1446.732540]  blk_mq_init_queue_data+0x7f/0xd0
      [ 1446.733155]  scsi_mq_alloc_queue+0x45/0x170
      [ 1446.733730]  scsi_alloc_sdev+0x73c/0xb20
      [ 1446.734281]  scsi_probe_and_add_lun+0x9a6/0x2d90
      [ 1446.734916]  __scsi_scan_target+0x208/0xc50
      [ 1446.735500]  scsi_scan_channel.part.3+0x113/0x170
      [ 1446.736149]  scsi_scan_host_selected+0x25a/0x360
      [ 1446.736783]  store_scan+0x290/0x2d0
      [ 1446.737275]  dev_attr_store+0x55/0x80
      [ 1446.737782]  sysfs_kf_write+0x132/0x190
      [ 1446.738313]  kernfs_fop_write_iter+0x319/0x4b0
      [ 1446.738921]  new_sync_write+0x40e/0x5c0
      [ 1446.739429]  vfs_write+0x519/0x720
      [ 1446.739877]  ksys_write+0xf8/0x1f0
      [ 1446.740332]  do_syscall_64+0x2d/0x40
      [ 1446.740802]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [ 1446.741462]
      [ 1446.741670] The buggy address belongs to the object at ffff8880185afd00
      [ 1446.741670]  which belongs to the cache kmalloc-256 of size 256
      [ 1446.743276] The buggy address is located 16 bytes inside of
      [ 1446.743276]  256-byte region [ffff8880185afd00, ffff8880185afe00)
      [ 1446.744765] The buggy address belongs to the page:
      [ 1446.745416] page:ffffea0000616b00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x185ac
      [ 1446.746694] head:ffffea0000616b00 order:2 compound_mapcount:0 compound_pincount:0
      [ 1446.747719] flags: 0x1fffff80010200(slab|head)
      [ 1446.748337] raw: 001fffff80010200 ffffea00006a3208 ffffea000061bf08 ffff88801004f240
      [ 1446.749404] raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
      [ 1446.750455] page dumped because: kasan: bad access detected
      [ 1446.751227]
      [ 1446.751445] Memory state around the buggy address:
      [ 1446.752102]  ffff8880185afc00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1446.753090]  ffff8880185afc80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1446.754079] >ffff8880185afd00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1446.755065]                          ^
      [ 1446.755589]  ffff8880185afd80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1446.756574]  ffff8880185afe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [ 1446.757566] ==================================================================
      
      Flag 'BLK_MQ_F_TAG_QUEUE_SHARED' will be set if the second device on the
      same host initializes its queue successfully. However, if the second
      device fails to allocate memory in blk_mq_alloc_and_init_hctx(), called from
      blk_mq_realloc_hw_ctxs(), itself called from blk_mq_init_allocated_queue(),
      __blk_mq_free_map_and_rqs() is called on the error path, and if
      'BLK_MQ_TAG_HCTX_SHARED' is not set, 'tag_set->tags' is freed
      while it is still used by the first device.
      
      To fix this issue, move the release of newly allocated hardware contexts from
      blk_mq_realloc_hw_ctxs() to __blk_mq_update_nr_hw_queues(), since there is no
      need to release hardware contexts in blk_mq_init_allocated_queue().
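
The ownership rule behind this fix can be sketched with a userspace model: a failed initialization of a second queue should undo only what that queue allocated itself and leave shared tag storage alone. This is an illustration under stated assumptions, not the actual patch: all `toy_*` names are hypothetical, and allocation failure is simulated with a boolean argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tag set shared by all queues on one host. */
struct toy_tag_set {
    int *tags; /* shared storage, owned by the set, not by any queue */
    int users; /* number of queues currently using the set */
};

/* Hypothetical stand-in for a per-queue hardware context. */
struct toy_hctx {
    int *tags; /* borrowed pointer into the shared set */
};

/* Hypothetical stand-in for a request queue. */
struct toy_mq {
    struct toy_tag_set *set;
    struct toy_hctx *hctx;
};

/* Initializes a queue; fail_alloc simulates a failed hardware context allocation. */
static int toy_init_queue(struct toy_mq *q, struct toy_tag_set *set, bool fail_alloc)
{
    q->hctx = fail_alloc ? NULL : malloc(sizeof(*q->hctx));
    if (!q->hctx) {
        /* Error path: nothing owned by the shared set is released here. */
        q->set = NULL;
        return -1;
    }
    q->set = set;
    q->hctx->tags = set->tags;
    set->users++;
    return 0;
}

/* Tears down a successfully initialized queue; shared tags stay with the set. */
static void toy_exit_queue(struct toy_mq *q)
{
    free(q->hctx);
    q->hctx = NULL;
    q->set->users--;
    q->set = NULL;
}

/* Returns 1 if the first queue's tags survive a failed init of the second. */
static int toy_shared_tags_scenario(void)
{
    struct toy_tag_set set = { .tags = calloc(8, sizeof(int)), .users = 0 };
    struct toy_mq first = { 0 }, second = { 0 };
    int ok;

    if (!set.tags)
        return -1;
    if (toy_init_queue(&first, &set, false) != 0) {
        free(set.tags);
        return -1;
    }
    ok = toy_init_queue(&second, &set, true) == -1 &&
         set.users == 1 && first.hctx->tags == set.tags;
    first.hctx->tags[0] = 1; /* still valid: the shared storage was not freed */
    toy_exit_queue(&first);
    free(set.tags);
    return ok;
}
```

In this model `toy_shared_tags_scenario()` returns 1. If the error path in `toy_init_queue()` called `free(set->tags)` instead, the later write through `first.hctx->tags` would be a use-after-free of the kind KASAN reported above.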
      
      Fixes: 868f2f0b ("blk-mq: dynamic h/w context count")
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Link: https://lore.kernel.org/r/20211108074019.1058843-1-yebin10@huawei.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  8. 05 Nov, 2021 4 commits
  9. 03 Nov, 2021 2 commits
  10. 02 Nov, 2021 2 commits
  11. 29 Oct, 2021 1 commit
  12. 27 Oct, 2021 5 commits
  13. 26 Oct, 2021 1 commit
  14. 22 Oct, 2021 1 commit
  15. 21 Oct, 2021 1 commit
  16. 20 Oct, 2021 6 commits
  17. 19 Oct, 2021 9 commits