1. 03 Dec 2021, 2 commits
  2. 29 Nov 2021, 2 commits
  3. 29 Oct 2021, 1 commit
    • null_blk: Fix handling of submit_queues and poll_queues attributes · 15dfc662
      Committed by Shin'ichiro Kawasaki
      Commit 0a593fbb ("null_blk: poll queue support") introduced the poll
      queue feature to null_blk. After this change, a null_blk device has both
      submit queues and poll queues, and the null_map_queues() callback maps
      both kinds of queues to the corresponding hardware contexts. The commit
      also added the device configuration attribute 'poll_queues' in the same
      manner as the existing attribute 'submit_queues'. These attributes allow
      the number of queues to be modified. However, when a new value is stored
      to one of these attributes, it is applied only to the corresponding
      queue type: when the number of submit queues is updated, the number of
      poll queues is not taken into account, and vice versa. This caused an
      inconsistent number of queues and queue mapping, and resulted in a
      null-ptr-dereference. The failure was observed in blktests block/029
      and block/030.
      
      To avoid the inconsistency, fix the attribute updates to take both
      submit_queues and poll_queues into account. Introduce the helper
      function nullb_update_nr_hw_queues() to handle stores to both
      attributes. Add a poll_queues field to struct nullb_device to track the
      number in the same manner as submit_queues. Add two more fields,
      prev_submit_queues and prev_poll_queues, to keep the values from before
      the change. If the block layer fails to update nr_hw_queues,
      null_map_queues() refers to the previous values to map the queues in the
      same manner as before the change.
      
      Also add poll_queues value checks in nullb_update_nr_hw_queues() and
      null_validate_conf(). They ensure that the poll_queues value of each
      device is within the range from 1 to the value of the poll_queues module
      parameter.
      
      Fixes: 0a593fbb ("null_blk: poll queue support")
      Reported-by: Yi Zhang <yi.zhang@redhat.com>
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Link: https://lore.kernel.org/r/20211029103926.845635-1-shinichiro.kawasaki@wdc.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
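The bookkeeping described in this commit can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: the class `NullbDevice`, the `resize` callback (a stand-in for the block layer that returns the hardware queue count actually in place), the function name `queue_counts_for_mapping`, the lower bound on submit_queues, and the explicit rollback on failure are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class NullbDevice:
    """User-space stand-in for the queue-count fields of struct nullb_device."""
    submit_queues: int = 1
    poll_queues: int = 1
    prev_submit_queues: int = 0
    prev_poll_queues: int = 0


def update_nr_hw_queues(dev, submit_queues, poll_queues, max_poll_queues, resize):
    """Store both counts together so they can never drift apart.

    resize(total) is a hypothetical callback standing in for the block
    layer; it returns the number of hardware queues actually in place.
    """
    if submit_queues < 1:  # assumption: at least one submit queue is needed
        raise ValueError("submit_queues must be at least 1")
    if not 1 <= poll_queues <= max_poll_queues:
        raise ValueError("poll_queues out of range")

    # Remember the old values before changing anything.
    dev.prev_submit_queues = dev.submit_queues
    dev.prev_poll_queues = dev.poll_queues
    dev.submit_queues = submit_queues
    dev.poll_queues = poll_queues

    wanted = submit_queues + poll_queues
    if resize(wanted) == wanted:
        return True

    # Assumption: on failure, fall back to the previous counts.
    dev.submit_queues = dev.prev_submit_queues
    dev.poll_queues = dev.prev_poll_queues
    return False


def queue_counts_for_mapping(dev, nr_hw_queues):
    """Pick the (submit, poll) pair that matches the queues really present."""
    if nr_hw_queues == dev.submit_queues + dev.poll_queues:
        return dev.submit_queues, dev.poll_queues
    if nr_hw_queues == dev.prev_submit_queues + dev.prev_poll_queues:
        return dev.prev_submit_queues, dev.prev_poll_queues
    raise ValueError("hardware queue count matches neither old nor new values")
```

In this model, a mapping request that arrives while a resize is failing still finds a consistent pair of counts, because the previous values remain available for lookup.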
  4. 19 Oct 2021, 1 commit
  5. 18 Oct 2021, 1 commit
  6. 24 Aug 2021, 1 commit
  7. 12 Aug 2021, 1 commit
  8. 01 Jul 2021, 1 commit
  9. 12 Jun 2021, 1 commit
  10. 03 Jun 2021, 1 commit
  11. 01 Jun 2021, 1 commit
  12. 12 Apr 2021, 1 commit
    • null_blk: add option for managing virtual boundary · cee1b215
      Committed by Max Gurtovoy
      This enables changing the virtual boundary of null_blk devices. Until
      now, null_blk devices did not place any restriction on the
      scatter/gather elements received from the block layer. Add a module
      parameter and a configfs option that control the virtual boundary. This
      enables testing the efficiency of the block layer bounce buffer when a
      suitable application sends discontiguous IO to the given device.
      
      Initial testing with patched FIO showed the following results (64 jobs,
      128 iodepth, 1 nullb device):
      IO size   READ (virt=false)   READ (virt=true)   WRITE (virt=false)   WRITE (virt=true)
      -------   -----------------   ----------------   ------------------   -----------------
      1k        10.7M               8482k              10.8M                8471k
      2k        10.4M               8266k              10.4M                8271k
      4k        10.4M               8274k              10.3M                8226k
      8k        10.2M               8131k              9800k                7933k
      16k       9567k               7764k              8081k                6828k
      32k       8865k               7309k              5570k                5153k
      64k       7695k               6586k              2682k                2617k
      128k      5346k               5489k              1320k                1296k
      Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Link: https://lore.kernel.org/r/20210412095523.278632-1-mgurtovoy@nvidia.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  13. 01 Apr 2021, 1 commit
    • null_blk: fix command timeout completion handling · de3510e5
      Committed by Damien Le Moal
      Memory backed or zoned null block devices may generate actual request
      timeout errors due to the submission path being blocked on memory
      allocation or zone locking. Unlike fake timeouts or injected timeouts,
      the request submission path will call blk_mq_complete_request() or
      blk_mq_end_request() for these real timeout errors, causing a double
      completion and use after free situation as the block layer timeout
      handler executes blk_mq_rq_timed_out() and __blk_mq_free_request() in
      blk_mq_check_expired(). This problem often triggers a NULL pointer
      dereference such as:
      
      BUG: kernel NULL pointer dereference, address: 0000000000000050
      RIP: 0010:blk_mq_sched_mark_restart_hctx+0x5/0x20
      ...
      Call Trace:
        dd_finish_request+0x56/0x80
        blk_mq_free_request+0x37/0x130
        null_handle_cmd+0xbf/0x250 [null_blk]
        ? null_queue_rq+0x67/0xd0 [null_blk]
        blk_mq_dispatch_rq_list+0x122/0x850
        __blk_mq_do_dispatch_sched+0xbb/0x2c0
        __blk_mq_sched_dispatch_requests+0x13d/0x190
        blk_mq_sched_dispatch_requests+0x30/0x60
        __blk_mq_run_hw_queue+0x49/0x90
        process_one_work+0x26c/0x580
        worker_thread+0x55/0x3c0
        ? process_one_work+0x580/0x580
        kthread+0x134/0x150
        ? kthread_create_worker_on_cpu+0x70/0x70
        ret_from_fork+0x1f/0x30
      
      This problem very often triggers when running the full btrfs xfstests
      on a memory-backed zoned null block device in a VM with limited amount
      of memory.
      
      Avoid this by executing blk_mq_complete_request() in null_timeout_rq()
      only for commands that are marked for a fake timeout completion using
      the fake_timeout boolean in struct null_cmd. For timeout errors injected
      through debugfs, the timeout handler will execute
      blk_mq_complete_request() as before. This is safe as the submission
      path does not complete requests in this case.
      
      In null_timeout_rq(), also make sure to set the command error field to
      BLK_STS_TIMEOUT and to propagate this error through to the request
      completion.
      Reported-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Tested-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
      Reviewed-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
      Link: https://lore.kernel.org/r/20210331225244.126426-1-damien.lemoal@wdc.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
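The completion rule from this commit can be sketched as a toy user-space model. Everything here is an illustrative assumption rather than kernel code: `Cmd`, `complete`, `timeout_handler` and `submission_path` are hypothetical names, and the status constants are placeholders that do not carry the kernel's numeric values.

```python
BLK_STS_OK = "ok"            # placeholder, not the kernel's value
BLK_STS_TIMEOUT = "timeout"  # placeholder, not the kernel's value


class Cmd:
    """Toy command carrying the fake_timeout flag and an error field."""

    def __init__(self, fake_timeout):
        self.fake_timeout = fake_timeout
        self.error = BLK_STS_OK
        self.completions = 0


def complete(cmd):
    """Complete a command exactly once; a second call models the bug."""
    cmd.completions += 1
    if cmd.completions > 1:
        raise RuntimeError("double completion")


def timeout_handler(cmd):
    """Record the timeout error; complete only fake (injected) timeouts."""
    cmd.error = BLK_STS_TIMEOUT
    if cmd.fake_timeout:
        complete(cmd)


def submission_path(cmd):
    """For a real timeout the blocked submission path completes the command
    itself once it resumes; for a fake timeout it does not."""
    if not cmd.fake_timeout:
        complete(cmd)
```

With this split, each command is completed by exactly one party, whichever order the timeout handler and the submission path run in.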
  14. 25 Jan 2021, 1 commit
  15. 08 Dec 2020, 4 commits
  16. 29 Sep 2020, 1 commit
    • null_blk: add support for max open/active zone limit for zoned devices · dc4d137e
      Committed by Niklas Cassel
      Add support for user space to set a max open zone and a max active zone
      limit via configfs. Both limits default to 0, which means no limit.
      
      Call the block layer API functions used for exposing the configured
      limits to sysfs.
      
      Add accounting in null_blk_zoned so that these new limits are respected.
      Performing an operation that would exceed these limits results in a
      standard I/O error.
      
      A max open zone limit exists in the ZBC standard.
      While null_blk_zoned is used to test the Zoned Block Device model in
      Linux, when it comes to differences between ZBC and ZNS, null_blk_zoned
      mostly follows ZBC.
      
      Therefore, implement the manage open zone resources function from ZBC,
      but additionally add support for max active zones.
      This enables user space not only to test against a device with an open
      zone limit, but also to test against a device with an active zone limit.
      Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
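The accounting idea can be sketched with a small user-space model. This is a simplification under stated assumptions, not the driver's implementation: the names `ZoneCond` and `ZonedDev` are hypothetical, a `False` return stands in for the I/O error, "active" is taken to mean open or closed, and the ZBC handling of implicitly opened zones is deliberately left out.

```python
from enum import Enum


class ZoneCond(Enum):
    EMPTY = 0
    OPEN = 1
    CLOSED = 2
    FULL = 3


class ZonedDev:
    """Toy zoned device; a limit of 0 means 'no limit', as in the commit."""

    def __init__(self, nr_zones, max_open=0, max_active=0):
        self.cond = [ZoneCond.EMPTY] * nr_zones
        self.max_open = max_open
        self.max_active = max_active

    def nr_open(self):
        return sum(c is ZoneCond.OPEN for c in self.cond)

    def nr_active(self):
        # Assumption: active zones are those that are open or closed.
        return sum(c in (ZoneCond.OPEN, ZoneCond.CLOSED) for c in self.cond)

    def open_zone(self, z):
        """Return True on success, False where an I/O error would be reported."""
        c = self.cond[z]
        if c is ZoneCond.OPEN:
            return True
        if c is ZoneCond.FULL:
            return False
        if self.max_open and self.nr_open() >= self.max_open:
            return False
        # Only an EMPTY zone becomes newly active; a CLOSED one already is.
        if (c is ZoneCond.EMPTY and self.max_active
                and self.nr_active() >= self.max_active):
            return False
        self.cond[z] = ZoneCond.OPEN
        return True

    def close_zone(self, z):
        if self.cond[z] is ZoneCond.OPEN:
            self.cond[z] = ZoneCond.CLOSED

    def finish_zone(self, z):
        self.cond[z] = ZoneCond.FULL
```

Closing a zone frees an open slot but keeps the zone active, while finishing it frees both, which is why the two limits have to be tracked separately.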
  17. 25 Sep 2020, 1 commit
  18. 22 Aug 2020, 1 commit
  19. 08 Jul 2020, 1 commit
  20. 01 Jul 2020, 2 commits
  21. 24 Jun 2020, 1 commit
  22. 30 May 2020, 1 commit
    • null_blk: force complete for timeout request · 32215469
      Committed by Dongli Zhang
      Commit 7b11eab0 ("blk-mq: blk-mq: provide forced completion
      method") exports a new API to force a request to complete without error
      injection.
      
      There should be no error injection when the timeout handler completes a
      request.
      
      Otherwise, the sequence below hangs because the timeout handler itself
      fails.
      
      echo 100 > /sys/kernel/debug/fail_io_timeout/probability
      echo 1000 > /sys/kernel/debug/fail_io_timeout/times
      echo 1 > /sys/block/nullb0/io-timeout-fail
      dd if=/dev/zero of=/dev/nullb0 bs=512 count=1 oflag=direct
      
      With this patch, the timeout handler is able to complete the IO.
      Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  23. 21 May 2020, 1 commit
  24. 19 May 2020, 1 commit
  25. 23 Apr 2020, 2 commits
    • null_blk: Cleanup zoned device initialization · d205bde7
      Committed by Damien Le Moal
      Move all zoned mode related code from null_blk_main.c to
      null_blk_zoned.c, avoiding an ugly #ifdef in the process.
      Rename null_zone_init() into null_init_zoned_dev(), null_zone_exit()
      into null_free_zoned_dev() and add the new function
      null_register_zoned_dev() to finalize the zoned dev setup before
      add_disk().
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • null_blk: Fix zoned command handling · 9dd44c7e
      Committed by Damien Le Moal
      For write operations issued to a null_blk device with zoned mode
      enabled, the state and write pointer position of the zone targeted by
      the command should be checked before badblocks and memory backing
      are handled, as the write may fail first due to, for instance, a
      sector position not aligned with the zone write pointer. This order of
      checking for errors reflects more accurately the behavior of physical
      zoned devices.
      
      Furthermore, the write pointer position of the target zone should be
      incremented if and only if no errors are reported by badblocks and
      memory backing handling.
      
      To fix this, introduce the small helper function null_process_cmd(),
      which executes null_handle_badblocks() and null_handle_memory_backed(),
      and use this function in null_zone_write() to correctly handle write
      requests to zoned null devices depending on the type and state of the
      write target zone. Also call this function in null_handle_zoned() to
      process read requests to zoned null devices.
      
      null_process_cmd() is called directly from null_handle_cmd() for
      regular null devices, resulting in no functional change for this type
      of device. To have symmetric names, the function null_handle_zoned()
      is renamed to null_process_zoned_cmd().
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
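The ordering of checks described in this commit can be sketched in user space as follows. This is a hedged model for a sequential-write zone only, not the driver code: `Zone`, `zone_write`, the `process_cmd` callback (standing in for the badblocks and memory-backing handling) and the string status values are all hypothetical.

```python
OK = "ok"        # placeholder status values, not kernel constants
IOERR = "ioerr"


class Zone:
    """Toy sequential-write zone covering sectors [start, start + length)."""

    def __init__(self, start, length):
        self.start = start
        self.length = length
        self.wp = start  # write pointer


def zone_write(zone, sector, nr_sectors, process_cmd):
    """Write to a zone; process_cmd(sector, nr_sectors) returns a status."""
    end = zone.start + zone.length

    # 1. Zone state and write pointer checks come first.
    if zone.wp >= end:               # zone is full
        return IOERR
    if sector != zone.wp:            # not aligned with the write pointer
        return IOERR
    if sector + nr_sectors > end:    # write would cross the zone end
        return IOERR

    # 2. Only then run the badblocks / memory-backing handling.
    status = process_cmd(sector, nr_sectors)
    if status != OK:
        return status

    # 3. Advance the write pointer only when nothing failed.
    zone.wp += nr_sectors
    return OK
```

A write that is misaligned with the write pointer therefore never reaches the backing handlers, and a write that fails in those handlers leaves the write pointer where it was.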
  26. 28 Mar 2020, 2 commits
  27. 13 Mar 2020, 1 commit
  28. 12 Mar 2020, 1 commit
  29. 10 Mar 2020, 4 commits
    • null_blk: Add support for init_hctx() fault injection · 596444e7
      Committed by Bart Van Assche
      This makes it possible to test the error path in blk_mq_realloc_hw_ctxs()
      and also several error paths in null_blk.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Cc: Johannes Thumshirn <jth@kernel.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • null_blk: Handle null_add_dev() failures properly · 9b03b713
      Committed by Bart Van Assche
      If null_add_dev() fails then null_del_dev() is called with a NULL argument.
      Make null_del_dev() handle this scenario correctly. This patch fixes the
      following KASAN complaint:
      
      null-ptr-deref in null_del_dev+0x28/0x280 [null_blk]
      Read of size 8 at addr 0000000000000000 by task find/1062
      
      Call Trace:
       dump_stack+0xa5/0xe6
       __kasan_report.cold+0x65/0x99
       kasan_report+0x16/0x20
       __asan_load8+0x58/0x90
       null_del_dev+0x28/0x280 [null_blk]
       nullb_group_drop_item+0x7e/0xa0 [null_blk]
       client_drop_item+0x53/0x80 [configfs]
       configfs_rmdir+0x395/0x4e0 [configfs]
       vfs_rmdir+0xb6/0x220
       do_rmdir+0x238/0x2c0
       __x64_sys_unlinkat+0x75/0x90
       do_syscall_64+0x6f/0x2f0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Cc: Johannes Thumshirn <jth@kernel.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • null_blk: Fix the null_add_dev() error path · 2004bfde
      Committed by Bart Van Assche
      If null_add_dev() fails, clear dev->nullb.
      
      This patch fixes the following KASAN complaint:
      
      BUG: KASAN: use-after-free in nullb_device_submit_queues_store+0xcf/0x160 [null_blk]
      Read of size 8 at addr ffff88803280fc30 by task check/8409
      
      Call Trace:
       dump_stack+0xa5/0xe6
       print_address_description.constprop.0+0x26/0x260
       __kasan_report.cold+0x7b/0x99
       kasan_report+0x16/0x20
       __asan_load8+0x58/0x90
       nullb_device_submit_queues_store+0xcf/0x160 [null_blk]
       configfs_write_file+0x1c4/0x250 [configfs]
       __vfs_write+0x4c/0x90
       vfs_write+0x145/0x2c0
       ksys_write+0xd7/0x180
       __x64_sys_write+0x47/0x50
       do_syscall_64+0x6f/0x2f0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x7ff370926317
      Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
      RSP: 002b:00007fff2dd2da48 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007ff370926317
      RDX: 0000000000000002 RSI: 0000559437ef23f0 RDI: 0000000000000001
      RBP: 0000559437ef23f0 R08: 000000000000000a R09: 0000000000000001
      R10: 0000559436703471 R11: 0000000000000246 R12: 0000000000000002
      R13: 00007ff370a006a0 R14: 00007ff370a014a0 R15: 00007ff370a008a0
      
      Allocated by task 8409:
       save_stack+0x23/0x90
       __kasan_kmalloc.constprop.0+0xcf/0xe0
       kasan_kmalloc+0xd/0x10
       kmem_cache_alloc_node_trace+0x129/0x4c0
       null_add_dev+0x24a/0xe90 [null_blk]
       nullb_device_power_store+0x1b6/0x270 [null_blk]
       configfs_write_file+0x1c4/0x250 [configfs]
       __vfs_write+0x4c/0x90
       vfs_write+0x145/0x2c0
       ksys_write+0xd7/0x180
       __x64_sys_write+0x47/0x50
       do_syscall_64+0x6f/0x2f0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 8409:
       save_stack+0x23/0x90
       __kasan_slab_free+0x112/0x160
       kasan_slab_free+0x12/0x20
       kfree+0xdf/0x250
       null_add_dev+0xaf3/0xe90 [null_blk]
       nullb_device_power_store+0x1b6/0x270 [null_blk]
       configfs_write_file+0x1c4/0x250 [configfs]
       __vfs_write+0x4c/0x90
       vfs_write+0x145/0x2c0
       ksys_write+0xd7/0x180
       __x64_sys_write+0x47/0x50
       do_syscall_64+0x6f/0x2f0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 2984c868 ("nullb: factor disk parameters")
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Cc: Johannes Thumshirn <jth@kernel.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • null_blk: Fix changing the number of hardware queues · 78b10be2
      Committed by Bart Van Assche
      Instead of initializing null_blk hardware queues explicitly after the
      request queue has been created, provide .init_hctx() and .exit_hctx()
      callback functions. These callbacks are called not only during
      request queue allocation but also when the number of hardware queues
      changes. Allocate nr_cpu_ids queues during initialization to support
      increasing the number of hardware queues above the initial hardware
      queue count.
      
      This change fixes increasing the number of hardware queues above the
      initial number of hardware queues and also keeps nullb->nr_queues in
      sync with the number of hardware queues.
      
      Fixes: 45919fbf ("null_blk: Enable modifying 'submit_queues' after an instance has been configured")
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Cc: Johannes Thumshirn <jth@kernel.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>