1. 24 May, 2022 3 commits
    • bcache: improve multithreaded bch_sectors_dirty_init() · 4dc34ae1
      Coly Li committed
      Commit b144e45f ("bcache: make bch_sectors_dirty_init() to be
      multithreaded") makes bch_sectors_dirty_init() much faster when
      counting dirty sectors by iterating over all dirty keys in the btree.
      But it is not in ideal shape yet and can still be improved.
      
      This patch makes the following changes to improve the current parallel
      dirty key iteration on the btree:
      - Add a read lock on the root node while multiple threads iterate the
        btree, to prevent the root node from being split by I/Os from other
        registered bcache devices.
      - Remove the local variable "char name[32]" and generate the kernel
        thread name string directly when calling kthread_run().
      - Allocate "struct bch_dirty_init_state state" directly on the stack and
        avoid the unnecessary dynamic memory allocation for it.
      - Decrease BCH_DIRTY_INIT_THRD_MAX from 64 to 12, which is enough.
      - Increment &state->started to count a kernel thread only after it has
        been created successfully.
      - When waiting for all dirty key counting threads to finish, use
        wait_event() instead of wait_event_interruptible().
      
      With the above changes, the code is clearer, and some potential error
      conditions are avoided.
      
      Fixes: b144e45f ("bcache: make bch_sectors_dirty_init() to be multithreaded")
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20220524102336.10684-3-colyli@suse.de
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
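The thread accounting listed in the bch_sectors_dirty_init() commit message can be illustrated with a small userspace model. This is only a sketch using POSIX threads, not the bcache code: `struct init_state`, `count_worker()` and `run_workers()` are hypothetical names, and `pthread_cond_wait()` stands in for the kernel's `wait_event()`. It shows the state living on the caller's stack, `started` being incremented only after a thread was created successfully, and a wait that ends only once every started worker has finished.

```c
#include <assert.h>
#include <pthread.h>

#define WORKER_MAX 12            /* thread cap, mirroring the reduced limit */

/* Hypothetical userspace stand-in for the kernel's per-run state struct. */
struct init_state {
    pthread_mutex_t lock;
    pthread_cond_t  done;
    int  started;                /* workers that were actually created */
    int  finished;               /* workers that have completed */
    long total;                  /* stand-in for the dirty sector count */
    pthread_t tid[WORKER_MAX];
};

static void *count_worker(void *arg)
{
    struct init_state *st = arg;

    pthread_mutex_lock(&st->lock);
    st->total += 1;              /* placeholder for real per-thread work */
    st->finished++;
    pthread_cond_signal(&st->done);
    pthread_mutex_unlock(&st->lock);
    return NULL;
}

/* Returns the number of workers that ran, or -1 if none could be started. */
static int run_workers(int want)
{
    /* The state lives on the stack; no dynamic allocation is needed. */
    struct init_state st = { .started = 0, .finished = 0, .total = 0 };
    int i, ran;

    if (want > WORKER_MAX)
        want = WORKER_MAX;
    pthread_mutex_init(&st.lock, NULL);
    pthread_cond_init(&st.done, NULL);

    for (i = 0; i < want; i++) {
        if (pthread_create(&st.tid[st.started], NULL, count_worker, &st) != 0)
            break;               /* stop creating, but still wait for the rest */
        st.started++;            /* counted only after creation succeeded */
    }

    /* Wait until every started worker is done; the loop also absorbs
     * spurious wakeups, so st is never released while still in use. */
    pthread_mutex_lock(&st.lock);
    while (st.finished < st.started)
        pthread_cond_wait(&st.done, &st.lock);
    pthread_mutex_unlock(&st.lock);

    for (i = 0; i < st.started; i++)
        pthread_join(st.tid[i], NULL);

    assert(st.total == st.started);
    ran = st.started;
    pthread_cond_destroy(&st.done);
    pthread_mutex_destroy(&st.lock);
    return ran > 0 ? ran : -1;
}
```

Because the state is on the stack, the function must not return while any worker can still touch it, which is why the wait here cannot be cut short.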
    • bcache: improve multithreaded bch_btree_check() · 62253644
      Coly Li committed
      Commit 8e710227 ("bcache: make bch_btree_check() to be
      multithreaded") makes bch_btree_check() much faster when checking
      all btree nodes during cache device registration. But it is not in
      ideal shape yet and can still be improved.
      
      This patch makes the following changes to improve the current parallel
      btree node check by multiple threads in bch_btree_check():
      - Add a read lock on the root node while checking all the btree nodes
        with multiple threads. Although it is not currently mandatory, it is
        good to have the read lock in the code logic.
      - Remove the local variable 'char name[32]' and generate the kernel
        thread name string directly when calling kthread_run().
      - Allocate the local variable "struct btree_check_state check_state" on
        the stack and avoid the unnecessary dynamic memory allocation for it.
      - Reduce BCH_BTR_CHKTHREAD_MAX from 64 to 12, which is enough.
      - Increment check_state->started to count a kernel thread only after it
        has been created successfully.
      - When waiting for all checking kernel threads to finish, use
        wait_event() instead of wait_event_interruptible().
      
      With this change, the code is clearer, and some potential error
      conditions are avoided.
      
      Fixes: 8e710227 ("bcache: make bch_btree_check() to be multithreaded")
      Signed-off-by: Coly Li <colyli@suse.de>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20220524102336.10684-2-colyli@suse.de
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge branch 'md-next' of... · df7e7f2b
      Jens Axboe committed
      Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.19/drivers
      
      Pull MD updates from Song:
      
      "- Remove uses of bdevname, by Christoph Hellwig;
       - Bug fixes by Guoqing Jiang, and Xiao Ni."
      
      * 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md: fix double free of io_acct_set bioset
        md: Don't set mddev private to NULL in raid0 pers->free
        md: remove most calls to bdevname
        md: protect md_unregister_thread from reentrancy
        md: don't unregister sync_thread with reconfig_mutex held
  2. 23 May, 2022 5 commits
    • md: fix double free of io_acct_set bioset · 42b805af
      Xiao Ni committed
      io_acct_set is now allocated and freed in the personality. Remove the
      code that frees io_acct_set in md_free and md_stop.
      
      Fixes: 0c031fd3 (md: Move alloc/free acct bioset in to personality)
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <song@kernel.org>
    • md: Don't set mddev private to NULL in raid0 pers->free · 0f2571ad
      Xiao Ni committed
      In the normal stop process, the call sequence is:
         do_md_stop
            |
         __md_stop (pers->free(); mddev->private=NULL)
            |
         md_free (free mddev)
      __md_stop sets mddev->private to NULL after pers->free. The raid device
      is stopped and the mddev memory is freed. But in reshape, the mddev is
      not freed and is still used by the new raid.
      
      In reshape, it first sets mddev->private to new_pers and then runs
      old_pers->free(). Now raid0 sets mddev->private to NULL in raid0_free,
      so the new raid can no longer work. It panics when dereferencing
      mddev->private because of the NULL pointer.
      
      It can panic like this:
      [63010.814972] kernel BUG at drivers/md/raid10.c:928!
      [63010.819778] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      [63010.825011] CPU: 3 PID: 44437 Comm: md0_resync Kdump: loaded Not tainted 5.14.0-86.el9.x86_64 #1
      [63010.833789] Hardware name: Dell Inc. PowerEdge R6415/07YXFK, BIOS 1.15.0 09/11/2020
      [63010.841440] RIP: 0010:raise_barrier+0x161/0x170 [raid10]
      [63010.865508] RSP: 0018:ffffc312408bbc10 EFLAGS: 00010246
      [63010.870734] RAX: 0000000000000000 RBX: ffffa00bf7d39800 RCX: 0000000000000000
      [63010.877866] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffa00bf7d39800
      [63010.884999] RBP: 0000000000000000 R08: fffffa4945e74400 R09: 0000000000000000
      [63010.892132] R10: ffffa00eed02f798 R11: 0000000000000000 R12: ffffa00bbc435200
      [63010.899266] R13: ffffa00bf7d39800 R14: 0000000000000400 R15: 0000000000000003
      [63010.906399] FS:  0000000000000000(0000) GS:ffffa00eed000000(0000) knlGS:0000000000000000
      [63010.914485] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [63010.920229] CR2: 00007f5cfbe99828 CR3: 0000000105efe000 CR4: 00000000003506e0
      [63010.927363] Call Trace:
      [63010.929822]  ? bio_reset+0xe/0x40
      [63010.933144]  ? raid10_alloc_init_r10buf+0x60/0xa0 [raid10]
      [63010.938629]  raid10_sync_request+0x756/0x1610 [raid10]
      [63010.943770]  md_do_sync.cold+0x3e4/0x94c
      [63010.947698]  md_thread+0xab/0x160
      [63010.951024]  ? md_write_inc+0x50/0x50
      [63010.954688]  kthread+0x149/0x170
      [63010.957923]  ? set_kthread_struct+0x40/0x40
      [63010.962107]  ret_from_fork+0x22/0x30
      
      Removing the code that sets mddev->private to NULL in raid0 fixes the
      problem.
      
      Fixes: 0c031fd3 (md: Move alloc/free acct bioset in to personality)
      Reported-by: Fine Fan <ffan@redhat.com>
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Song Liu <song@kernel.org>
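The ordering problem described in the raid0 commit message can be reproduced in a reduced userspace model. The sketch below is an assumption-laden illustration, not the md code: `struct mddev` and `struct personality` are cut down to one field each, the conf buffers are plain `malloc()` blocks, and `reshape_keeps_private()` is a hypothetical helper that mimics "install the new private data, then call the old personality's free".

```c
#include <stddef.h>
#include <stdlib.h>

struct mddev {
    void *private;               /* per-personality configuration */
};

/* Hypothetical, heavily reduced model of an md personality. */
struct personality {
    void (*free)(struct mddev *mddev, void *priv);
};

/* Behaviour before the fix: free the conf and also clear mddev->private. */
static void raid0_free_buggy(struct mddev *mddev, void *priv)
{
    free(priv);
    mddev->private = NULL;       /* clobbers the new personality's data */
}

/* Behaviour after the fix: release only what this personality owns. */
static void raid0_free_fixed(struct mddev *mddev, void *priv)
{
    (void)mddev;
    free(priv);
}

static const struct personality raid0_buggy = { .free = raid0_free_buggy };
static const struct personality raid0_fixed = { .free = raid0_free_fixed };

/* Models a reshape. Returns 1 if mddev->private still points at the new
 * data afterwards, 0 if it was clobbered, -1 on allocation failure. */
static int reshape_keeps_private(const struct personality *old_pers)
{
    struct mddev mddev;
    void *old_conf = malloc(16);
    void *new_conf = malloc(16);
    int ok;

    if (!old_conf || !new_conf) {
        free(old_conf);
        free(new_conf);
        return -1;
    }
    mddev.private = old_conf;
    mddev.private = new_conf;            /* step 1: switch to the new data */
    old_pers->free(&mddev, old_conf);    /* step 2: release the old one */
    ok = (mddev.private == new_conf);
    free(new_conf);
    return ok;
}
```

In the model the buggy variant only loses the pointer; in the kernel the next dereference of mddev->private is what triggers the crash shown in the trace.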
    • md: remove most calls to bdevname · 913cce5a
      Christoph Hellwig committed
      Use the %pg format specifier to save on stack consumption and code size.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Song Liu <song@kernel.org>
    • md: protect md_unregister_thread from reentrancy · 1e267742
      Guoqing Jiang committed
      Generally, md_unregister_thread is called with reconfig_mutex held, but
      raid_message in dm-raid doesn't hold reconfig_mutex when unregistering
      the thread, so md_unregister_thread can in theory be called
      simultaneously from two call sites.
      
      After the previous commit, which removes the protection of
      reconfig_mutex for md_unregister_thread completely, the potential issue
      could be worse than before.
      
      Take pers_lock at the beginning of the function to make it safe against
      reentrancy.
      Reported-by: Donald Buczek <buczek@molgen.mpg.de>
      Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
      Signed-off-by: Song Liu <song@kernel.org>
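The idea of taking a lock at the start of the unregister path, so that two callers cannot both tear down the same thread, can be sketched as follows. This is a userspace model with hypothetical names (`struct thread_model`, `register_thread()`, `unregister_thread()`); a pthread mutex named `pers_lock` stands in for the kernel lock, and `free()` stands in for stopping and releasing the kernel thread.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct md_thread. */
struct thread_model {
    int id;
};

static pthread_mutex_t pers_lock = PTHREAD_MUTEX_INITIALIZER;

static struct thread_model *register_thread(int id)
{
    struct thread_model *t = malloc(sizeof(*t));

    if (t)
        t->id = id;
    return t;
}

/* Returns 1 if this call tore the thread down, 0 if there was nothing
 * left to do because another caller got there first. */
static int unregister_thread(struct thread_model **threadp)
{
    struct thread_model *thread;

    pthread_mutex_lock(&pers_lock);
    thread = *threadp;
    if (!thread) {
        pthread_mutex_unlock(&pers_lock);
        return 0;
    }
    *threadp = NULL;             /* later callers now see nothing to stop */
    pthread_mutex_unlock(&pers_lock);

    free(thread);                /* stands in for stopping the thread */
    return 1;
}
```

The pointer is claimed and cleared under the lock, while the slow teardown runs outside it, so a second caller returns early instead of freeing the same object again.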
    • md: don't unregister sync_thread with reconfig_mutex held · 8b48ec23
      Guoqing Jiang committed
      Unregistering sync_thread doesn't need to hold reconfig_mutex since it
      doesn't reconfigure the array.
      
      Holding it could also cause a deadlock for raid5 as follows:
      
      1. Process A tried to reap the sync thread with reconfig_mutex held
         after echoing idle to sync_action.
      2. The raid5 sync thread was blocked if there were too many active stripes.
      3. SB_CHANGE_PENDING was set (because write I/O came from the upper
         layer), which prevents the number of active stripes from decreasing.
      4. SB_CHANGE_PENDING can't be cleared since md_check_recovery was not
         able to take reconfig_mutex.
      
      More details in the link:
      https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t
      
      Also add a parameter to md_reap_sync_thread, since it can be called by
      dm-raid, which doesn't hold reconfig_mutex.
      Reported-and-tested-by: Donald Buczek <buczek@molgen.mpg.de>
      Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: Song Liu <song@kernel.org>
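The deadlock pattern in the steps above, and the way an extra parameter lets the reaper drop the mutex while it waits, can be modelled in userspace. This is a simplified sketch, not the md implementation: the worker here needs `reconfig_mutex` directly, whereas in the commit the dependency is indirect (via md_check_recovery), and the names `sync_worker()`, `reap_sync_worker()`, `stop_array_demo()` and the flag `reconfig_mutex_held` are assumptions made for illustration.

```c
#include <pthread.h>

static pthread_mutex_t reconfig_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in sync thread: it can only finish once it gets reconfig_mutex. */
static void *sync_worker(void *arg)
{
    int *finished = arg;

    pthread_mutex_lock(&reconfig_mutex);
    *finished = 1;
    pthread_mutex_unlock(&reconfig_mutex);
    return NULL;
}

/* Waits for the worker. If the caller holds reconfig_mutex, drop it for
 * the duration of the wait and re-take it afterwards; joining with the
 * mutex held would deadlock against the worker above. */
static int reap_sync_worker(pthread_t tid, int reconfig_mutex_held)
{
    int ret;

    if (reconfig_mutex_held)
        pthread_mutex_unlock(&reconfig_mutex);
    ret = pthread_join(tid, NULL);
    if (reconfig_mutex_held)
        pthread_mutex_lock(&reconfig_mutex);
    return ret;
}

/* Returns 1 if the worker finished, -1 on setup or join failure. */
static int stop_array_demo(void)
{
    pthread_t tid;
    int finished = 0;
    int ret;

    pthread_mutex_lock(&reconfig_mutex);
    if (pthread_create(&tid, NULL, sync_worker, &finished) != 0) {
        pthread_mutex_unlock(&reconfig_mutex);
        return -1;
    }
    ret = reap_sync_worker(tid, 1);      /* caller holds the mutex */
    pthread_mutex_unlock(&reconfig_mutex);
    return ret == 0 ? finished : -1;
}
```

A caller that does not hold the mutex, as dm-raid is described to be, would pass 0 and the function would simply wait.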
  3. 21 May, 2022 1 commit
  4. 20 May, 2022 2 commits
    • Merge tag 'nvme-5.19-2022-05-19' of git://git.infradead.org/nvme into for-5.19/drivers · 8ad9f577
      Jens Axboe committed
      Pull NVMe updates from Christoph:
      
      "nvme updates for Linux 5.19
      
       - set non-mdts limits in nvme_scan_work (Chaitanya Kulkarni)
       - add support for TP4084 - Time-to-Ready Enhancements (me)"
      
      * tag 'nvme-5.19-2022-05-19' of git://git.infradead.org/nvme:
        nvme: set non-mdts limits in nvme_scan_work
        nvme: add support for TP4084 - Time-to-Ready Enhancements
    • nvme: set non-mdts limits in nvme_scan_work · 78288665
      Chaitanya Kulkarni committed
      In the current implementation we set the non-mdts limits by calling
      nvme_init_non_mdts_limits() from nvme_init_ctrl_finish().
      This also tries to set the limits for the discovery controller, which
      has no I/O queues, resulting in the warning message reported by
      nvme_log_error() when running blktests nvme/002:
      
      [ 2005.155946] run blktests nvme/002 at 2022-04-09 16:57:47
      [ 2005.192223] loop: module loaded
      [ 2005.196429] nvmet: adding nsid 1 to subsystem blktests-subsystem-0
      [ 2005.200334] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
      
      <------------------------------SNIP---------------------------------->
      
      [ 2008.958108] nvmet: adding nsid 1 to subsystem blktests-subsystem-997
      [ 2008.962082] nvmet: adding nsid 1 to subsystem blktests-subsystem-998
      [ 2008.966102] nvmet: adding nsid 1 to subsystem blktests-subsystem-999
      [ 2008.973132] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN testhostnqn.
      *[ 2008.973196] nvme1: Identify(0x6), Invalid Field in Command (sct 0x0 / sc 0x2) MORE DNR*
      [ 2008.974595] nvme nvme1: new ctrl: "nqn.2014-08.org.nvmexpress.discovery"
      [ 2009.103248] nvme nvme1: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
      
      Move the call of nvme_init_non_mdts_limits() to nvme_scan_work() after
      we verify that I/O queues are created since that is a converging point
      for each transport where these limits are actually used.
      
      1. FC:
      nvme_fc_create_association()
       ...
       nvme_fc_create_io_queues(ctrl);
       ...
       nvme_start_ctrl()
        nvme_scan_queue()
         nvme_scan_work()
      
      2. PCIe:
      nvme_reset_work()
       ...
       nvme_setup_io_queues()
        nvme_create_io_queues()
         nvme_alloc_queue()
       ...
       nvme_start_ctrl()
        nvme_scan_queue()
         nvme_scan_work()
      
      3. RDMA:
      nvme_rdma_setup_ctrl
       ...
        nvme_rdma_configure_io_queues
        ...
        nvme_start_ctrl()
         nvme_scan_queue()
          nvme_scan_work()
      
      4. TCP:
      nvme_tcp_setup_ctrl
       ...
        nvme_tcp_configure_io_queues
        ...
        nvme_start_ctrl()
         nvme_scan_queue()
          nvme_scan_work()
      
      * nvme_scan_work()
      ...
      nvme_validate_or_alloc_ns()
        nvme_alloc_ns()
         nvme_update_ns_info()
          nvme_update_disk_info()
           nvme_config_discard() <---
           blk_queue_max_write_zeroes_sectors() <---
      Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
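The reasoning in the nvme commit message, that the non-mdts limits should only be queried on the path that runs once I/O queues exist, can be shown with a toy controller model. Everything here is a hypothetical stand-in: `struct ctrl_model`, its fields and the three functions only imitate the roles of nvme_init_non_mdts_limits(), nvme_init_ctrl_finish() and nvme_scan_work(), and the counter replaces the Identify command that a discovery controller rejects.

```c
#include <stdbool.h>

/* Hypothetical model of a controller; not the kernel's struct nvme_ctrl. */
struct ctrl_model {
    bool live;                   /* controller is up */
    int  io_queues;              /* 0 for a discovery controller */
    int  limit_identify_cmds;    /* Identify commands sent for the limits */
};

/* Stand-in for the limits query; it costs one Identify command. */
static int init_non_mdts_limits(struct ctrl_model *c)
{
    c->limit_identify_cmds++;
    return 0;
}

/* Old placement: runs for every controller, discovery included. */
static void init_ctrl_finish_old(struct ctrl_model *c)
{
    init_non_mdts_limits(c);
}

/* New placement: only reached once I/O queues are known to exist. */
static void scan_work_new(struct ctrl_model *c)
{
    if (!c->live || c->io_queues == 0)
        return;                  /* nothing to scan, so no limits needed */
    if (init_non_mdts_limits(c) < 0)
        return;
    /* namespace scanning would follow here */
}
```

With the call moved, a controller without I/O queues never issues the command, which is what removes the logged error in the trace above.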
  5. 19 May, 2022 1 commit
  6. 18 May, 2022 1 commit
    • Merge tag 'nvme-5.19-2022-05-18' of git://git.infradead.org/nvme into for-5.19/drivers · da14f237
      Jens Axboe committed
      Pull NVMe updates from Christoph:
      
      "nvme updates for Linux 5.19
      
       - tighten the PCI presence check (Stefan Roese)
       - fix a potential NULL pointer dereference in an error path
         (Kyle Miller Smith)
       - fix interpretation of the DMRSL field (Tom Yan)
       - relax the data transfer alignment (Keith Busch)
       - verbose error logging improvements (Max Gurtovoy, Chaitanya Kulkarni)
       - misc cleanups (Chaitanya Kulkarni, me)"
      
      * tag 'nvme-5.19-2022-05-18' of git://git.infradead.org/nvme:
        nvme: split the enum used for various register constants
        nvme-fabrics: add a request timeout helper
        nvme-pci: harden drive presence detect in nvme_dev_disable()
        nvme-pci: fix a NULL pointer dereference in nvme_alloc_admin_tags
        nvme: mark internal passthru request RQF_QUIET
        nvme: remove unneeded include from constants file
        nvme: add missing status values to verbose logging
        nvme: set dma alignment to dword
        nvme: fix interpretation of DMRSL
  7. 17 May, 2022 1 commit
  8. 16 May, 2022 9 commits
  9. 10 May, 2022 4 commits
  10. 04 May, 2022 13 commits