1. 12 Apr, 2018, 7 commits
    • K
      nvme-pci: Skip queue deletion if there are no queues · 64ee0ac0
      Keith Busch committed
      A user reported a controller that always retains CSTS.RDY at 1, which
      makes disabling the controller fail during a reset. This also happens
      before the admin queue is allocated, and trying to disable an unallocated
      queue results in a NULL dereference.
      Reported-by: Alex Gagniuc <Alex_Gagniuc@Dellteam.com>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      64ee0ac0
    • A
      nvme: target: fix buffer overflow · 6038aa53
      Arnd Bergmann committed
      nvmet_execute_get_disc_log_page() passes a fixed-length string into
      nvmet_format_discovery_entry(), which then does a longer memcpy() on
      it, as pointed out by gcc-8:
      
      In function 'nvmet_format_discovery_entry',
          inlined from 'nvmet_execute_get_disc_log_page' at drivers/nvme/target/discovery.c:126:4:
      drivers/nvme/target/discovery.c:62:2: error: 'memcpy' forming offset [38, 223] is out of the bounds [0, 37] [-Werror=array-bounds]
        memcpy(e->subnqn, subsys_nqn, NVMF_NQN_SIZE);
      
      Using strncpy() will make this well-defined, filling the rest of the
      buffer with zeroes, under the assumption that the input is either
      a NUL-terminated string, or a byte sequence containing no zeroes.
      If the input is a string that is longer than NVMF_NQN_SIZE, we
      continue to have no NUL-termination in the output.
      
      Fixes: a07b4970 ("nvmet: add a generic NVMe target")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      6038aa53
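The strncpy() behaviour described in the commit message above can be checked in user space. The following is a minimal sketch, not the kernel code: `DEMO_NQN_SIZE` and the `demo_*` helpers are hypothetical names chosen for illustration, and the field is far smaller than the real NVMF_NQN_SIZE.

```c
#include <assert.h>
#include <string.h>

#define DEMO_NQN_SIZE 8 /* hypothetical, much smaller than NVMF_NQN_SIZE */

/* Copy src into a fixed-size field: strncpy() stops reading src at its
 * first NUL and zero-fills whatever is left of dst. */
static void demo_copy_nqn(char *dst, const char *src)
{
	strncpy(dst, src, DEMO_NQN_SIZE);
}

/* Return 1 if the field holds a NUL somewhere in its DEMO_NQN_SIZE bytes. */
static int demo_is_terminated(const char *dst)
{
	return memchr(dst, '\0', DEMO_NQN_SIZE) != NULL;
}

static void demo_strncpy_selfcheck(void)
{
	char field[DEMO_NQN_SIZE];

	/* Short input: copied, then padded with zeroes to the full size. */
	memset(field, 'x', sizeof(field));
	demo_copy_nqn(field, "abc");
	assert(memcmp(field, "abc\0\0\0\0\0", DEMO_NQN_SIZE) == 0);

	/* Over-long input: truncated, and no NUL terminator is written. */
	demo_copy_nqn(field, "0123456789");
	assert(memcmp(field, "01234567", DEMO_NQN_SIZE) == 0);
	assert(!demo_is_terminated(field));
}
```

The second case is the caveat the commit message spells out: a consumer of the field must not assume it is NUL-terminated when the input fills it completely.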
    • J
      nvme: don't send keep-alives to the discovery controller · 74c6c715
      Johannes Thumshirn committed
      NVMe over Fabrics 1.0 Section 5.2 "Discovery Controller Properties and
      Command Support" Figure 31 "Discovery Controller – Admin Commands"
      explicitly lists all commands but "Get Log Page" and "Identify" as
      reserved, but NetApp reports that the Linux host is sending Keep Alive
      commands to the discovery controller, which is a violation of the
      spec.
      
      We're already checking for discovery controllers when configuring the
      keep alive timeout but when creating a discovery controller we're not
      hard wiring the keep alive timeout to 0 and thus remain on
      NVME_DEFAULT_KATO for the discovery controller.
      
      This can be easily reproduced by issuing a direct connect to the
      discovery subsystem using:
      'nvme connect [...] --nqn=nqn.2014-08.org.nvmexpress.discovery'
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Fixes: 07bfcd09 ("nvme-fabrics: add a generic NVMe over Fabrics library")
      Reported-by: Martin George <marting@netapp.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      74c6c715
    • J
      nvme: unexport nvme_start_keep_alive · 00b683db
      Johannes Thumshirn committed
      nvme_start_keep_alive() isn't used outside core.c, so unexport it and
      make it static.
      Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      00b683db
    • M
      nvme-loop: fix kernel oops in case of unhandled command · 11d9ea6f
      Ming Lei committed
      When nvmet_req_init() fails, __nvmet_req_complete() is called
      to handle the target request via .queue_response(), so
      nvme_loop_queue_response() shouldn't be called again for
      handling the failure.
      
      This patch fixes this case as follows:
      
      - move blk_mq_start_request() before nvmet_req_init(), so
      nvme_loop_queue_response() may work well to complete this
      host request
      
      - don't call nvme_cleanup_cmd() which is done in nvme_loop_complete_rq()
      
      - don't call nvme_loop_queue_response() which is done via
      .queue_response()
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      [trimmed changelog]
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      11d9ea6f
    • M
      nvme: enforce 64bit offset for nvme_get_log_ext fn · 7ec6074f
      Matias Bjørling committed
      Compiling on 32-bit systems produces a warning about the shift width,
      because a 32-bit integer is shifted as though it were a 64-bit one.

      Make sure that the offset is always 64-bit, and use macros for retrieving
      the lower and upper bits of the offset.
      Signed-off-by: Matias Bjørling <mb@lightnvm.io>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      7ec6074f
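Splitting a 64-bit offset into two 32-bit halves, as the commit message above describes, can be sketched in user space as follows. The macros imitate the kernel's lower_32_bits() and upper_32_bits() helpers; `struct demo_log_cmd` and its field names are hypothetical stand-ins and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* User-space imitations of the kernel's lower_32_bits()/upper_32_bits().
 * The double 16-bit shift avoids a "shift count >= width" warning when
 * the macro is applied to a type that is only 32 bits wide. */
#define demo_lower_32_bits(n) ((uint32_t)((n) & 0xffffffffu))
#define demo_upper_32_bits(n) ((uint32_t)(((n) >> 16) >> 16))

/* Hypothetical command layout holding the two halves of the offset. */
struct demo_log_cmd {
	uint32_t offset_lower;
	uint32_t offset_upper;
};

/* Taking the offset as uint64_t keeps it 64 bits wide on every
 * architecture, unlike size_t or unsigned long. */
static struct demo_log_cmd demo_fill_offset(uint64_t offset)
{
	struct demo_log_cmd cmd;

	cmd.offset_lower = demo_lower_32_bits(offset);
	cmd.offset_upper = demo_upper_32_bits(offset);
	return cmd;
}

static void demo_offset_selfcheck(void)
{
	struct demo_log_cmd cmd = demo_fill_offset(UINT64_C(0x123456789abcdef0));

	assert(cmd.offset_lower == 0x9abcdef0u);
	assert(cmd.offset_upper == 0x12345678u);
}
```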
    • J
      sr: get/drop reference to device in revalidate and check_events · 2d097c50
      Jens Axboe committed
      We can't just use scsi_cd() to get the scsi_cd structure, we have
      to grab a live reference to the device. For both callbacks, we're
      not inside an open where we already hold a reference to the device.
      
      This fixes device removal/addition under concurrent device access,
      which otherwise could result in the below oops.
      
      NULL pointer dereference at 0000000000000010
      PGD 0 P4D 0
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in:
      sr 12:0:0:0: [sr2] scsi-1 drive
       scsi_debug crc_t10dif crct10dif_generic crct10dif_common nvme nvme_core sb_edac xl
      sr 12:0:0:0: Attached scsi CD-ROM sr2
       sr_mod cdrom btrfs xor zstd_decompress zstd_compress xxhash lzo_compress zlib_defc
      sr 12:0:0:0: Attached scsi generic sg7 type 5
       igb ahci libahci i2c_algo_bit libata dca [last unloaded: crc_t10dif]
      CPU: 43 PID: 4629 Comm: systemd-udevd Not tainted 4.16.0+ #650
      Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
      RIP: 0010:sr_block_revalidate_disk+0x23/0x190 [sr_mod]
      RSP: 0018:ffff883ff357bb58 EFLAGS: 00010292
      RAX: ffffffffa00b07d0 RBX: ffff883ff3058000 RCX: ffff883ff357bb66
      RDX: 0000000000000003 RSI: 0000000000007530 RDI: ffff881fea631000
      RBP: 0000000000000000 R08: ffff881fe4d38400 R09: 0000000000000000
      R10: 0000000000000000 R11: 00000000000001b6 R12: 000000000800005d
      R13: 000000000800005d R14: ffff883ffd9b3790 R15: 0000000000000000
      FS:  00007f7dc8e6d8c0(0000) GS:ffff883fff340000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000010 CR3: 0000003ffda98005 CR4: 00000000003606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       ? __invalidate_device+0x48/0x60
       check_disk_change+0x4c/0x60
       sr_block_open+0x16/0xd0 [sr_mod]
       __blkdev_get+0xb9/0x450
       ? iget5_locked+0x1c0/0x1e0
       blkdev_get+0x11e/0x320
       ? bdget+0x11d/0x150
       ? _raw_spin_unlock+0xa/0x20
       ? bd_acquire+0xc0/0xc0
       do_dentry_open+0x1b0/0x320
       ? inode_permission+0x24/0xc0
       path_openat+0x4e6/0x1420
       ? cpumask_any_but+0x1f/0x40
       ? flush_tlb_mm_range+0xa0/0x120
       do_filp_open+0x8c/0xf0
       ? __seccomp_filter+0x28/0x230
       ? _raw_spin_unlock+0xa/0x20
       ? __handle_mm_fault+0x7d6/0x9b0
       ? list_lru_add+0xa8/0xc0
       ? _raw_spin_unlock+0xa/0x20
       ? __alloc_fd+0xaf/0x160
       ? do_sys_open+0x1a6/0x230
       do_sys_open+0x1a6/0x230
       do_syscall_64+0x5a/0x100
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      2d097c50
  2. 11 Apr, 2018, 2 commits
  3. 10 Apr, 2018, 12 commits
    • M
      backing: silence compiler warning using __printf · a93f00b3
      Mathieu Malaterre committed
      __printf marker was added in commit d2cc4dde ("bdi_register: add
      __printf verification, fix arg mismatch") for function `bdi_register`
      since it is useful to verify format and arguments. Apply equivalent gcc
      attribute to `bdi_register_va`.
      
      Remove warning triggered with W=1:
      
        mm/backing-dev.c:881:2: warning: function might be possible candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mathieu Malaterre <malat@debian.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      a93f00b3
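The format attribute discussed in the commit above can be tried outside the kernel with GCC or Clang. This is a sketch under assumptions: `demo_printf` imitates the kernel's `__printf` macro, and `demo_register()` / `demo_register_va()` are hypothetical functions that only format into a buffer; they are not the `bdi_register` functions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Imitation of the kernel's __printf(a, b) macro (GCC/Clang attribute). */
#define demo_printf(a, b) __attribute__((format(printf, a, b)))

/* va_list variant: the second index is 0 because there are no variadic
 * arguments to check, only the format string itself. */
demo_printf(3, 0)
static int demo_register_va(char *buf, size_t len, const char *fmt, va_list args)
{
	return vsnprintf(buf, len, fmt, args);
}

/* Variadic variant: argument 3 is the format, checking starts at 4. */
demo_printf(3, 4)
static int demo_register(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = demo_register_va(buf, len, fmt, args);
	va_end(args);
	return ret;
}

static void demo_format_selfcheck(void)
{
	char name[16];
	int n = demo_register(name, sizeof(name), "bdi-%d", 7);

	assert(n == 5);
	assert(name[0] == 'b' && name[4] == '7' && name[5] == '\0');
}
```

With the attribute in place, a call such as `demo_register(name, sizeof(name), "bdi-%d", "seven")` is flagged by `-Wformat` at compile time.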
    • M
      blk-mq: remove code for dealing with remapping queue · 37c7c6c7
      Ming Lei committed
      Firstly, since commit 4b855ad3 ("blk-mq: Create hctx for each present CPU"),
      blk-mq doesn't remap queues any more after the CPU topology changes.

      Secondly, set->nr_hw_queues can't be bigger than nr_cpu_ids, and now we map
      all possible CPUs to hw queues, so at least one CPU is mapped to each hctx.

      So queue mapping has become static and fixed just like a percpu variable, and
      we don't need to handle queue remapping any more.
      
      Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      37c7c6c7
    • M
      blk-mq: reimplement blk_mq_hw_queue_mapped · 127276c6
      Ming Lei committed
      A queue now counts as mapped if any online CPU is mapped to this
      hctx, so implement blk_mq_hw_queue_mapped() accordingly.
      
      Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      127276c6
    • M
      blk-mq: don't check queue mapped in __blk_mq_delay_run_hw_queue() · efea8450
      Ming Lei committed
      There are several reasons for removing the check:
      
      1) blk_mq_hw_queue_mapped() now always returns true, since each hctx
      is mapped to at least one CPU

      2) when there isn't any online CPU mapped to this hctx, there won't
      be any IO queued to this CPU, and blk_mq_run_hw_queue() only runs the
      queue if there is IO queued to this hctx

      3) if __blk_mq_delay_run_hw_queue() is called by blk_mq_delay_run_hw_queue(),
      which is run from blk_mq_dispatch_rq_list() or scsi_mq_get_budget(),
      the hctx to be handled has to be mapped.
      
      Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      efea8450
    • M
      blk-mq: remove blk_mq_delay_queue() · 15fe8a90
      Ming Lei committed
      No driver uses this interface any more, so remove it.
      
      Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      15fe8a90
    • M
      blk-mq: introduce blk_mq_hw_queue_first_cpu() to figure out first cpu · f82ddf19
      Ming Lei committed
      This patch introduces the helper blk_mq_hw_queue_first_cpu() for
      figuring out the hctx's first CPU, which avoids code duplication.
      
      Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      f82ddf19
    • M
      blk-mq: avoid to write intermediate result to hctx->next_cpu · 476f8c98
      Ming Lei committed
      This patch figures out the final selected CPU, then writes
      it to hctx->next_cpu once, so that other dispatch paths never
      observe an intermediate next_cpu value.
      
      Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      476f8c98
    • M
      blk-mq: don't keep offline CPUs mapped to hctx 0 · bffa9909
      Ming Lei committed
      Since commit 4b855ad3 ("blk-mq: Create hctx for each present CPU"),
      blk-mq doesn't remap queues after the CPU topology changes. That means
      when some of these offline CPUs come online, they are still mapped to
      hctx 0, and hctx 0 may become the bottleneck of IO dispatch and
      completion.
      
      This patch sets up the mapping from the beginning, and aligns to
      queue mapping for PCI device (blk_mq_pci_map_queues()).
      
      Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: stable@vger.kernel.org
      Fixes: 4b855ad3 ("blk-mq: Create hctx for each present CPU")
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      bffa9909
    • M
      blk-mq: make sure that correct hctx->next_cpu is set · a1c735fb
      Ming Lei committed
      Since commit 20e4d813 ("blk-mq: simplify queue mapping & schedule
      with each possisble CPU"), one hctx can be mapped from CPUs that are all
      offline, in which case hctx->next_cpu can be set incorrectly.

      This patch fixes this issue by making hctx->next_cpu point to the
      first CPU in hctx->cpumask if all CPUs in hctx->cpumask are offline.
      
      Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
      Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Fixes: 20e4d813 ("blk-mq: simplify queue mapping & schedule with each possisble CPU")
      Cc: stable@vger.kernel.org
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      a1c735fb
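The selection rule from the commit above (prefer an online CPU, otherwise fall back to the first CPU in the hctx mask) can be sketched with plain bitmasks. This is an illustration only: the kernel uses `struct cpumask` and its helpers, whereas `demo_first_cpu()` and `demo_pick_next_cpu()` below are hypothetical and limited to 64 CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NO_CPU 64u /* returned when a mask has no bit set */

/* Index of the lowest set bit, or DEMO_NO_CPU if the mask is empty. */
static unsigned demo_first_cpu(uint64_t mask)
{
	unsigned cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return DEMO_NO_CPU;
}

/* Prefer the first CPU that is both in the hctx mask and online. If every
 * CPU of the hctx is offline, fall back to the first CPU in the hctx mask
 * so that next_cpu always names a CPU that belongs to this hctx. */
static unsigned demo_pick_next_cpu(uint64_t hctx_mask, uint64_t online_mask)
{
	unsigned cpu = demo_first_cpu(hctx_mask & online_mask);

	if (cpu == DEMO_NO_CPU)
		cpu = demo_first_cpu(hctx_mask);
	return cpu;
}

static void demo_next_cpu_selfcheck(void)
{
	/* hctx covers CPUs 2 and 3; only CPU 3 is online. */
	assert(demo_pick_next_cpu(0xC, 0x8) == 3);
	/* hctx covers CPUs 2 and 3; both are offline. */
	assert(demo_pick_next_cpu(0xC, 0x3) == 2);
}
```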
    • O
      loop: fix LOOP_GET_STATUS lock imbalance · bdac616d
      Omar Sandoval committed
      Commit 2d1d4c1e made loop_get_status() drop lo_ctl_mutex before
      returning, but the loop_get_status_old(), loop_get_status64(), and
      loop_get_status_compat() wrappers don't call loop_get_status() if the
      passed argument is NULL. The callers expect that the lock is dropped, so
      make sure we drop it in that case, too.
      
      Reported-by: syzbot+31e8daa8b3fc129e75f2@syzkaller.appspotmail.com
      Fixes: 2d1d4c1e ("loop: don't call into filesystem while holding lo_ctl_mutex")
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      bdac616d
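The lock imbalance described above comes down to a contract: the callee returns with the lock released, so a wrapper that returns early must release it as well. Below is a minimal single-threaded sketch of that contract. The counter-based `struct demo_lock` and all `demo_*` functions are hypothetical stand-ins for the mutex and the loop driver functions, and the `-EINVAL` return value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy lock: a counter is enough to observe acquire/release balance. */
struct demo_lock { int held; };

static void demo_lock_acquire(struct demo_lock *l) { l->held++; }
static void demo_lock_release(struct demo_lock *l) { l->held--; }

/* Contract: entered with the lock held, always returns with it released. */
static int demo_get_status(struct demo_lock *l, int *out)
{
	*out = 42; /* placeholder for the real status */
	demo_lock_release(l);
	return 0;
}

/* Wrapper with an early-return path. Because callers rely on the lock
 * being dropped on return, the NULL path has to drop it itself. */
static int demo_get_status_wrapper(struct demo_lock *l, int *arg)
{
	if (!arg) {
		demo_lock_release(l); /* without this the lock stays held */
		return -EINVAL;
	}
	return demo_get_status(l, arg);
}

static void demo_lock_selfcheck(void)
{
	struct demo_lock lock = { 0 };
	int status = 0;

	demo_lock_acquire(&lock);
	assert(demo_get_status_wrapper(&lock, NULL) == -EINVAL);
	assert(lock.held == 0);

	demo_lock_acquire(&lock);
	assert(demo_get_status_wrapper(&lock, &status) == 0);
	assert(lock.held == 0 && status == 42);
}
```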
    • T
      block/loop: fix deadlock after loop_set_status · 1e047eaa
      Tetsuo Handa committed
      syzbot is reporting deadlocks at __blkdev_get() [1].
      
      ----------------------------------------
      [   92.493919] systemd-udevd   D12696   525      1 0x00000000
      [   92.495891] Call Trace:
      [   92.501560]  schedule+0x23/0x80
      [   92.502923]  schedule_preempt_disabled+0x5/0x10
      [   92.504645]  __mutex_lock+0x416/0x9e0
      [   92.510760]  __blkdev_get+0x73/0x4f0
      [   92.512220]  blkdev_get+0x12e/0x390
      [   92.518151]  do_dentry_open+0x1c3/0x2f0
      [   92.519815]  path_openat+0x5d9/0xdc0
      [   92.521437]  do_filp_open+0x7d/0xf0
      [   92.527365]  do_sys_open+0x1b8/0x250
      [   92.528831]  do_syscall_64+0x6e/0x270
      [   92.530341]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [   92.931922] 1 lock held by systemd-udevd/525:
      [   92.933642]  #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x73/0x4f0
      ----------------------------------------
      
      The deadlock turned out to be caused by wait_event_interruptible() in
      blk_queue_enter() getting stuck with bdev->bd_mutex held at
      __blkdev_put() because q->mq_freeze_depth == 1.
      
      ----------------------------------------
      [   92.787172] a.out           S12584   634    633 0x80000002
      [   92.789120] Call Trace:
      [   92.796693]  schedule+0x23/0x80
      [   92.797994]  blk_queue_enter+0x3cb/0x540
      [   92.803272]  generic_make_request+0xf0/0x3d0
      [   92.807970]  submit_bio+0x67/0x130
      [   92.810928]  submit_bh_wbc+0x15e/0x190
      [   92.812461]  __block_write_full_page+0x218/0x460
      [   92.815792]  __writepage+0x11/0x50
      [   92.817209]  write_cache_pages+0x1ae/0x3d0
      [   92.825585]  generic_writepages+0x5a/0x90
      [   92.831865]  do_writepages+0x43/0xd0
      [   92.836972]  __filemap_fdatawrite_range+0xc1/0x100
      [   92.838788]  filemap_write_and_wait+0x24/0x70
      [   92.840491]  __blkdev_put+0x69/0x1e0
      [   92.841949]  blkdev_close+0x16/0x20
      [   92.843418]  __fput+0xda/0x1f0
      [   92.844740]  task_work_run+0x87/0xb0
      [   92.846215]  do_exit+0x2f5/0xba0
      [   92.850528]  do_group_exit+0x34/0xb0
      [   92.852018]  SyS_exit_group+0xb/0x10
      [   92.853449]  do_syscall_64+0x6e/0x270
      [   92.854944]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      [   92.943530] 1 lock held by a.out/634:
      [   92.945105]  #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x3c/0x1e0
      ----------------------------------------
      
      q->mq_freeze_depth == 1 turned out to be caused by loop_set_status()
      forgetting to call blk_mq_unfreeze_queue() on the error paths for the
      info->lo_encrypt_type != NULL case.
      
      ----------------------------------------
      [   37.509497] CPU: 2 PID: 634 Comm: a.out Tainted: G        W        4.16.0+ #457
      [   37.513608] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
      [   37.518832] RIP: 0010:blk_freeze_queue_start+0x17/0x40
      [   37.521778] RSP: 0018:ffffb0c2013e7c60 EFLAGS: 00010246
      [   37.524078] RAX: 0000000000000000 RBX: ffff8b07b1519798 RCX: 0000000000000000
      [   37.527015] RDX: 0000000000000002 RSI: ffffb0c2013e7cc0 RDI: ffff8b07b1519798
      [   37.529934] RBP: ffffb0c2013e7cc0 R08: 0000000000000008 R09: 47a189966239b898
      [   37.532684] R10: dad78b99b278552f R11: 9332dca72259d5ef R12: ffff8b07acd73678
      [   37.535452] R13: 0000000000004c04 R14: 0000000000000000 R15: ffff8b07b841e940
      [   37.538186] FS:  00007fede33b9740(0000) GS:ffff8b07b8e80000(0000) knlGS:0000000000000000
      [   37.541168] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   37.543590] CR2: 00000000206fdf18 CR3: 0000000130b30006 CR4: 00000000000606e0
      [   37.546410] Call Trace:
      [   37.547902]  blk_freeze_queue+0x9/0x30
      [   37.549968]  loop_set_status+0x67/0x3c0 [loop]
      [   37.549975]  loop_set_status64+0x3b/0x70 [loop]
      [   37.549986]  lo_ioctl+0x223/0x810 [loop]
      [   37.549995]  blkdev_ioctl+0x572/0x980
      [   37.550003]  block_ioctl+0x34/0x40
      [   37.550006]  do_vfs_ioctl+0xa7/0x6d0
      [   37.550017]  ksys_ioctl+0x6b/0x80
      [   37.573076]  SyS_ioctl+0x5/0x10
      [   37.574831]  do_syscall_64+0x6e/0x270
      [   37.576769]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
      ----------------------------------------
      
      [1] https://syzkaller.appspot.com/bug?id=cd662bc3f6022c0979d01a262c318fab2ee9b56f

      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: syzbot <bot+48594378e9851eab70bcd6f99327c7db58c5a28a@syzkaller.appspotmail.com>
      Fixes: ecdd0959 ("block/loop: fix race between I/O and set_status")
      Cc: Ming Lei <tom.leiming@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: stable <stable@vger.kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      1e047eaa
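The bug class in the commit above, a freeze that is not undone on an error path, is commonly avoided by routing every exit through a single cleanup label. The sketch below illustrates that pattern only. `struct demo_queue`, the `demo_*` functions and the validation rule are hypothetical and do not reproduce the loop driver.

```c
#include <assert.h>
#include <errno.h>

/* Toy queue: freeze_depth > 0 means new I/O would block. */
struct demo_queue { int freeze_depth; int encrypt_type; };

static void demo_freeze(struct demo_queue *q)   { q->freeze_depth++; }
static void demo_unfreeze(struct demo_queue *q) { q->freeze_depth--; }

#define DEMO_MAX_CRYPT 4 /* hypothetical upper bound for valid types */

/* Every path after demo_freeze() leaves through "exit", so the queue is
 * unfrozen on success and on failure alike. */
static int demo_set_status(struct demo_queue *q, int encrypt_type)
{
	int err = 0;

	demo_freeze(q);

	if (encrypt_type < 0 || encrypt_type >= DEMO_MAX_CRYPT) {
		err = -EINVAL; /* a bare "return" here would leak the freeze */
		goto exit;
	}
	q->encrypt_type = encrypt_type;

exit:
	demo_unfreeze(q);
	return err;
}

static void demo_freeze_selfcheck(void)
{
	struct demo_queue q = { 0, 0 };

	assert(demo_set_status(&q, 99) == -EINVAL);
	assert(q.freeze_depth == 0); /* error path still unfroze the queue */
	assert(demo_set_status(&q, 1) == 0);
	assert(q.freeze_depth == 0 && q.encrypt_type == 1);
}
```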
    • M
      blk-mq: order getting budget and driver tag · 0bca799b
      Ming Lei committed
      This patch orders getting budget and driver tag by making sure the
      driver tag is acquired only after the budget has been obtained, which
      avoids the following race:
      
      1) before dispatch request from scheduler queue, get one budget first, then
      dequeue a request, call it request A.
      
      2) in another IO path for dispatching request B which is from hctx->dispatch,
      driver tag is got, then try to get budget in blk_mq_dispatch_rq_list(),
      unfortunately the budget is held by request A.
      
      3) meantime blk_mq_dispatch_rq_list() is called for dispatching request
      A, and try to get driver tag first, unfortunately no driver tag is
      available because the driver tag is held by request B
      
      4) neither IO path can make progress, and an IO stall results.
      
      This issue can be observed when running dbench on USB storage.
      
      This patch fixes this issue by always getting budget before getting
      driver tag.
      
      Cc: stable@vger.kernel.org
      Fixes: de148297 ("blk-mq: introduce .get_budget and .put_budget in blk_mq_ops")
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Omar Sandoval <osandov@fb.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      0bca799b
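The ordering rule from the commit above (budget first, then driver tag, and give the budget back if no tag is available) can be sketched with two counters. This is a single-threaded illustration under assumptions: `struct demo_res` and the `demo_*` helpers are hypothetical, and real blk-mq code uses its own budget and tag allocators.

```c
#include <assert.h>

/* Two limited resources that a dispatch needs at the same time. */
struct demo_res { int budget; int tags; };

/* Take one unit from a pool; return 1 on success, 0 if it is empty. */
static int demo_get(int *pool)
{
	if (*pool == 0)
		return 0;
	(*pool)--;
	return 1;
}

static void demo_put(int *pool) { (*pool)++; }

/* Every path acquires in the same order: budget, then tag. If the tag is
 * unavailable the budget is returned, so no path sits on one resource
 * while waiting for the other. */
static int demo_dispatch_prepare(struct demo_res *r)
{
	if (!demo_get(&r->budget))
		return 0;
	if (!demo_get(&r->tags)) {
		demo_put(&r->budget);
		return 0;
	}
	return 1;
}

static void demo_budget_selfcheck(void)
{
	struct demo_res r = { 1, 0 };

	/* No tag available: the attempt fails and the budget is not leaked. */
	assert(demo_dispatch_prepare(&r) == 0);
	assert(r.budget == 1 && r.tags == 0);

	/* Once a tag is free, the same request gets both resources. */
	r.tags = 1;
	assert(demo_dispatch_prepare(&r) == 1);
	assert(r.budget == 0 && r.tags == 0);
}
```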
  4. 06 Apr, 2018, 10 commits
    • L
      Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 052c220d
      Linus Torvalds committed
      Pull SCSI updates from James Bottomley:
       "This is mostly updates of the usual drivers: arcmsr, qla2xx, lpfc,
        ufs, mpt3sas, hisi_sas.
      
        In addition we have removed several really old drivers: sym53c416,
        NCR53c406a, fdomain, fdomain_cs and removed the old scsi_module.c
        initialization from all remaining drivers.
      
        Plus an assortment of bug fixes, initialization errors and other minor
        fixes"
      
      * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (168 commits)
        scsi: ufs: Add support for Auto-Hibernate Idle Timer
        scsi: ufs: sysfs: reworking of the rpm_lvl and spm_lvl entries
        scsi: qla2xxx: fx00 copypaste typo
        scsi: qla2xxx: fix error message on <qla2400
        scsi: smartpqi: update driver version
        scsi: smartpqi: workaround fw bug for oq deletion
        scsi: arcmsr: Change driver version to v1.40.00.05-20180309
        scsi: arcmsr: Sleep to avoid CPU stuck too long for waiting adapter ready
        scsi: arcmsr: Handle adapter removed due to thunderbolt cable disconnection.
        scsi: arcmsr: Rename ACB_F_BUS_HANG_ON to ACB_F_ADAPTER_REMOVED for adapter hot-plug
        scsi: qla2xxx: Update driver version to 10.00.00.06-k
        scsi: qla2xxx: Fix Async GPN_FT for FCP and FC-NVMe scan
        scsi: qla2xxx: Cleanup code to improve FC-NVMe error handling
        scsi: qla2xxx: Fix FC-NVMe IO abort during driver reset
        scsi: qla2xxx: Fix retry for PRLI RJT with reason of BUSY
        scsi: qla2xxx: Remove nvme_done_list
        scsi: qla2xxx: Return busy if rport going away
        scsi: qla2xxx: Fix n2n_ae flag to prevent dev_loss on PDB change
        scsi: qla2xxx: Add FC-NVMe abort processing
        scsi: qla2xxx: Add changes for devloss timeout in driver
        ...
      052c220d
    • L
      Merge tag 'for-4.17/block-20180402' of git://git.kernel.dk/linux-block · 3526dd0c
      Linus Torvalds committed
      Pull block layer updates from Jens Axboe:
       "It's a pretty quiet round this time, which is nice. This contains:
      
         - series from Bart, cleaning up the way we set/test/clear atomic
           queue flags.
      
         - series from Bart, fixing races between gendisk and queue
           registration and removal.
      
         - set of bcache fixes and improvements from various folks, by way of
           Michael Lyle.
      
         - set of lightnvm updates from Matias, most of it being the 1.2 to
           2.0 transition.
      
         - removal of unused DIO flags from Nikolay.
      
         - blk-mq/sbitmap memory ordering fixes from Omar.
      
         - divide-by-zero fix for BFQ from Paolo.
      
         - minor documentation patches from Randy.
      
         - timeout fix from Tejun.
      
         - Alpha "can't write a char atomically" fix from Mikulas.
      
         - set of NVMe fixes by way of Keith.
      
         - bsg and bsg-lib improvements from Christoph.
      
         - a few sed-opal fixes from Jonas.
      
         - cdrom check-disk-change deadlock fix from Maurizio.
      
         - various little fixes, comment fixes, etc from various folks"
      
      * tag 'for-4.17/block-20180402' of git://git.kernel.dk/linux-block: (139 commits)
        blk-mq: Directly schedule q->timeout_work when aborting a request
        blktrace: fix comment in blktrace_api.h
        lightnvm: remove function name in strings
        lightnvm: pblk: remove some unnecessary NULL checks
        lightnvm: pblk: don't recover unwritten lines
        lightnvm: pblk: implement 2.0 support
        lightnvm: pblk: implement get log report chunk
        lightnvm: pblk: rename ppaf* to addrf*
        lightnvm: pblk: check for supported version
        lightnvm: implement get log report chunk helpers
        lightnvm: make address conversions depend on generic device
        lightnvm: add support for 2.0 address format
        lightnvm: normalize geometry nomenclature
        lightnvm: complete geo structure with maxoc*
        lightnvm: add shorten OCSSD version in geo
        lightnvm: add minor version to generic geometry
        lightnvm: simplify geometry structure
        lightnvm: pblk: refactor init/exit sequences
        lightnvm: Avoid validation of default op value
        lightnvm: centralize permission check for lightnvm ioctl
        ...
      3526dd0c
    • L
      Merge tag 'edac_for_4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp · dd972f92
      Linus Torvalds committed
      Pull EDAC updates from Borislav Petkov:
       "Noteworthy is the NVDIMM support:
      
         - NVDIMM support to EDAC (Tony Luck)
      
         - misc fixes"
      
      * tag 'edac_for_4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
        EDAC, sb_edac: Remove variable length array usage
        EDAC, skx_edac: Detect non-volatile DIMMs
        firmware, DMI: Add function to look up a handle and return DIMM size
        acpi, nfit: Add function to look up nvdimm device and provide SMBIOS handle
        EDAC: Add new memory type for non-volatile DIMMs
        EDAC: Drop duplicated array of strings for memory type names
        EDAC, layerscape: Allow building for LS1021A
      dd972f92
    • K
      kernel.h: Retain constant expression output for max()/min() · 3c8ba0d6
      Kees Cook committed
      In the effort to remove all VLAs from the kernel[1], it is desirable to
      build with -Wvla.  However, this warning is overly pessimistic, in that
      it is only happy with stack array sizes that are declared as constant
      expressions, and not constant values.  One case of this is the
      evaluation of the max() macro which, due to its construction, ends up
      converting constant expression arguments into a constant value result.
      
      All attempts to rewrite this macro with __builtin_constant_p() failed
      with older compilers (e.g.  gcc 4.4)[2].  However, Martin Uecker,
      constructed[3] a mind-shattering solution that works everywhere.
      Cthulhu fhtagn!
      
      This patch updates the min()/max() macros to evaluate to a constant
      expression when called on constant expression arguments.  This removes
      several false-positive stack VLA warnings from an x86 allmodconfig build
      when -Wvla is added:
      
        $ diff -u before.txt after.txt | grep ^-
        -drivers/input/touchscreen/cyttsp4_core.c:871:2: warning: ISO C90 forbids variable length array ‘ids’ [-Wvla]
        -fs/btrfs/tree-checker.c:344:4: warning: ISO C90 forbids variable length array ‘namebuf’ [-Wvla]
        -lib/vsprintf.c:747:2: warning: ISO C90 forbids variable length array ‘sym’ [-Wvla]
        -net/ipv4/proc.c:403:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
        -net/ipv6/proc.c:198:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
        -net/ipv6/proc.c:218:2: warning: ISO C90 forbids variable length array ‘buff64’ [-Wvla]
      
      This also updates two cases where different enums were being compared
      and explicitly casts them to int (which matches the old side-effect of
      the single-evaluation code): one in tpm/tpm_tis_core.h, and one in
      drm/drm_color_mgmt.c.
      
       [1] https://lkml.org/lkml/2018/3/7/621
       [2] https://lkml.org/lkml/2018/3/10/170
       [3] https://lkml.org/lkml/2018/3/20/845

      Co-Developed-by: Linus Torvalds <torvalds@linux-foundation.org>
      Co-Developed-by: Martin Uecker <Martin.Uecker@med.uni-goettingen.de>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3c8ba0d6
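The constant-expression technique mentioned in the commit above can be tried in user space with GCC or Clang. The sketch below is an illustration in the spirit of that approach, not a copy of the kernel header. All `demo_*` macro names are hypothetical, and it relies on GNU extensions (`sizeof(void)`, statement expressions, `__builtin_choose_expr`).

```c
#include <assert.h>

/* Evaluates to 1 if x is an integer constant expression, else 0. If x is
 * constant, (void *)((long)(x) * 0l) is a null pointer constant, so the
 * conditional has type int * and the sizes match. Otherwise the type is
 * void *, and sizeof(void) is 1 as a GNU extension. */
#define demo_is_constexpr(x) \
	(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

/* Plain comparison: stays a constant expression for constant arguments. */
#define demo_cmp(x, y) ((x) > (y) ? (x) : (y))

/* Single-evaluation comparison for arguments that may have side effects. */
#define demo_cmp_once(x, y) ({ \
	__typeof__(x) demo_x_ = (x); \
	__typeof__(y) demo_y_ = (y); \
	demo_x_ > demo_y_ ? demo_x_ : demo_y_; })

/* Pick the plain form only when both arguments are constant expressions. */
#define demo_max(x, y) \
	__builtin_choose_expr(demo_is_constexpr(x) && demo_is_constexpr(y), \
			      demo_cmp(x, y), demo_cmp_once(x, y))

static void demo_max_selfcheck(void)
{
	char buf[demo_max(16, 24)]; /* constant bound, so not a VLA */
	int i = 1;

	assert(sizeof(buf) == 24);
	assert(demo_is_constexpr(16 + 8) == 1);
	assert(demo_is_constexpr(i) == 0);
	assert(demo_max(i++, 0) == 1); /* i++ is evaluated exactly once */
	assert(i == 2);
}
```

The design point is that the check itself never evaluates its argument, since it only appears inside `sizeof`, so the macro can decide at compile time which comparison form is safe.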
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 5414ab31
      Linus Torvalds committed
      Pull input updates from Dmitry Torokhov:
      
       - new driver for PhoenixRC Flight Controller Adapter
      
       - new driver for RAVE SP Power button
      
       - fixes for autosuspend-related deadlocks in a few input USB drivers
      
       - support for 2nd wheel in ATech PS/2 mouse
      
       - fix for ALPS trackpoint detection on Thinkpad L570 and Latitude 7370
      
       - bunch of cleanups in various PS/2 protocols
      
       - other assorted changes and fixes
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (35 commits)
        Input: i8042 - enable MUX on Sony VAIO VGN-CS series to fix touchpad
        Input: stmfts, s6sy761 - update my e-mail
        Input: stmfts - use async probe & suspend/resume to avoid 2s delay
        Input: ALPS - fix TrackStick detection on Thinkpad L570 and Latitude 7370
        Input: xpad - add PDP device id 0x02a4
        Input: alps - report pressure of v3 and v7 trackstick
        Input: pxrc - new driver for PhoenixRC Flight Controller Adapter
        Input: usbtouchscreen - do not rely on input_dev->users
        Input: usbtouchscreen - fix deadlock in autosuspend
        Input: pegasus_notetaker - do not rely on input_dev->users
        Input: pagasus_notetaker - fix deadlock in autosuspend
        Input: synaptics_usb - do not rely on input_dev->users
        Input: synaptics_usb - fix deadlock in autosuspend
        Input: gpio-keys - add support for wakeup event action
        Input: appletouch - use true and false for boolean values
        Input: silead - add Chuwi Hi8 support
        Input: analog - use get_cycles() on PPC
        Input: stmpe-keypad - remove VLA usage
        Input: i8042 - add Lenovo ThinkPad L460 to i8042 reset list
        Input: add RAVE SP Powerbutton driver
        ...
      5414ab31
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial · 672a9c10
      Linus Torvalds committed
      Pull trivial tree updates from Jiri Kosina.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
        kfifo: fix inaccurate comment
        tools/thermal: tmon: fix for segfault
        net: Spelling s/stucture/structure/
        edd: don't spam log if no EDD information is present
        Documentation: Fix early-microcode.txt references after file rename
        tracing: Block comments should align the * on each line
        treewide: Fix typos in printk
        GenWQE: Fix a typo in two comments
        treewide: Align function definition open/close braces
      672a9c10
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · e8403b49
      Linus Torvalds committed
      Pull HID updates from Jiri Kosina:
      
       - 3rd generation Wacom Intuos BT device support from Aaron Armstrong
         Skomra
      
       - support for NSG-MR5U and NSG-MR7U devices from Todd Kelner
      
       - multitouch Razer Blade Stealth support from Benjamin Tissoires
      
       - Elantech touchpad support from Alexandrov Stansilav
      
       - a few other scattered-around fixes and cleanups to drivers and
         generic code
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (31 commits)
        HID: google: Enable PM Full On mode when adjusting backlight
        HID: google: add google hammer HID driver
        HID: core: reset the quirks before calling probe again
        HID: multitouch: do not set HID_QUIRK_NO_INIT_REPORTS
        HID: core: remove the need for HID_QUIRK_NO_EMPTY_INPUT
        HID: use BIT() macro for quirks too
        HID: use BIT macro instead of plain integers for flags
        HID: multitouch: remove dead zones of Razer Blade Stealth
        HID: multitouch: export a quirk for the button handling of touchpads
        HID: usbhid: extend the polling interval configuration to keyboards
        HID: ntrig: document sysfs interface
        HID: wacom: wacom_wac_collection() is local to wacom_wac.c
        HID: wacom: generic: add the "Report Valid" usage
        HID: wacom: generic: Support multiple tools per report
        HID: wacom: Add support for 3rd generation Intuos BT
        HID: core: rewrite the hid-generic automatic unbind
        HID: sony: Add touchpad support for NSG-MR5U and NSG-MR7U remotes
        HID: hid-multitouch: Use true and false for boolean values
        HID: hid-ntrig: use true and false for boolean values
        HID: logitech-hidpp: document sysfs interface
        ...
      e8403b49
    • L
      Merge tag 'sound-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · e02d37bf
      Linus Torvalds committed
      Pull sound updates from Takashi Iwai:
       "This became a large update. The changes are scattered widely, and the
        majority of them are attributed to ASoC componentization. The gitk
        output made me dizzy, but it's slightly better than London tube.
      
        OK, below are some highlights:
      
         - Continued hardening works in ALSA PCM core; most of the existing
           syzkaller reports should have been covered.
      
         - USB-audio got the initial USB Audio Class 3 support, as well as
           UAC2 jack detection support and more DSD-device support.
      
         - ASoC componentization: finally each individual driver was converted
           to components framework, which is more future-proof for further
           works. Most of the conversions were systematic.
      
         - Lots of fixes for Intel Baytrail / Cherrytrail devices with Realtek
           codecs, typically tablets and small PCs.
      
         - Fixes / cleanups for Samsung Odroid systems
      
         - Cleanups in Freescale SSI driver
      
         - New ASoC drivers:
            * AKM AK4458 and AK5558 codecs
            * A few AMD based machine drivers
            * Intel Kabylake machine drivers
            * Maxim MAX9759 codec
            * Motorola CPCAP codec
            * Socionext Uniphier SoCs
            * TI PCM1789 and TDA7419 codecs
      
         - Retirement of Blackfin drivers along with architecture removal"
      
      * tag 'sound-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (497 commits)
        ALSA: pcm: Fix UAF at PCM release via PCM timer access
        ALSA: usb-audio: silence a static checker warning
        ASoC: tscs42xx: Remove owner assignment from i2c_driver
        ASoC: mediatek: remove "simple-mfd" in the example
        ASoC: cpcap: replace codec to component
        ASoC: Intel: bytcr_rt5651: don't use codec anymore
        ASoC: amd: don't use codec anymore
        ALSA: usb-audio: fix memory leak on cval
        ALSA: pcm: Fix mutex unbalance in OSS emulation ioctls
        ASoC: topology: Fix kcontrol name string handling
        ALSA: aloop: Mark paused device as inactive
        ALSA: usb-audio: update clock valid control
        ALSA: usb-audio: UAC2 jack detection
        ALSA: pcm: Return -EBUSY for OSS ioctls changing busy streams
        ALSA: pcm: Avoid potential races between OSS ioctls and read/write
        ALSA: usb-audio: Integrate native DSD support for ITF-USB based DACs.
        ALSA: usb-audio: FIX native DSD support for TEAC UD-501 DAC
        ALSA: usb-audio: Add native DSD support for Luxman DA-06
        ALSA: usb-audio: fix uac control query argument
        ASoC: nau8824: recover system clock when device changes
        ...
      e02d37bf
    • L
      Merge tag 'dma-mapping-4.17' of git://git.infradead.org/users/hch/dma-mapping · 652ede37
      Linus Torvalds committed
      Pull dma-mapping updates from Christoph Hellwig:
       "Very light this round as the interesting dma mapping changes went
        through the x86 tree.
      
        This just provides proper stubs for architectures not supporting dma
        (Geert Uytterhoeven)"
      
      * tag 'dma-mapping-4.17' of git://git.infradead.org/users/hch/dma-mapping:
        usb: gadget: Add NO_DMA dummies for DMA mapping API
        scsi: Add NO_DMA dummies for SCSI DMA mapping API
        mm: Add NO_DMA dummies for DMA pool API
        dma-coherent: Add NO_DMA dummies for managed DMA API
        dma-mapping: Convert NO_DMA get_dma_ops() into a real dummy
      652ede37
    • L
      Merge tag 'gpio-v4.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 1b2951dd
      Linus Torvalds committed
      Pull GPIO updates from Linus Walleij:
       "This is the bulk of GPIO changes for the v4.17 kernel cycle:
      
        New drivers:
      
         - Nintendo Wii GameCube GPIO, known as "Hollywood"
      
         - Raspberry Pi mailbox service GPIO expander
      
         - Spreadtrum main SC9860 SoC and EIC GPIO controllers.
      
        Improvements:
      
         - Implemented .get_multiple() callback for most of the
           high-performance industrial GPIO cards for the ISA bus.
      
         - ISA GPIO drivers now select the ISA_BUS_API instead of depending on
           it. This is merged with the same pattern for all the ISA drivers
           and some other Kconfig cleanups related to this.
      
        Cleanup:
      
         - Delete the TZ1090 GPIO drivers following the deletion of this SoC
           from the ARM tree.
      
         - Move the documentation over to driver-api to conform with the rest
           of the kernel documentation build.
      
         - Continue to make the GPIO drivers include only
           <linux/gpio/driver.h> and not the too broad <linux/gpio.h> that we
           want to get rid of.
      
         - Managed to remove VLA allocation from two drivers pending more
           fixes in this area for the next merge window.
      
         - Misc janitorial fixes"
      
      * tag 'gpio-v4.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (77 commits)
        gpio: Add Spreadtrum PMIC EIC driver support
        gpio: Add Spreadtrum EIC driver support
        dt-bindings: gpio: Add Spreadtrum EIC controller documentation
        gpio: ath79: Fix potential NULL dereference in ath79_gpio_probe()
        pinctrl: qcom: Don't allow protected pins to be requested
        gpiolib: Support 'gpio-reserved-ranges' property
        gpiolib: Change bitmap allocation to kmalloc_array
        gpiolib: Extract mask allocation into subroutine
        dt-bindings: gpio: Add a gpio-reserved-ranges property
        gpio: mockup: fix a potential crash when creating debugfs entries
        gpio: pca953x: add compatibility for pcal6524 and pcal9555a
        gpio: dwapb: Add support for a bus clock
        gpio: Remove VLA from xra1403 driver
        gpio: Remove VLA from MAX3191X driver
        gpio: ws16c48: Implement get_multiple callback
        gpio: gpio-mm: Implement get_multiple callback
        gpio: 104-idi-48: Implement get_multiple callback
        gpio: 104-dio-48e: Implement get_multiple callback
        gpio: pcie-idio-24: Implement get_multiple/set_multiple callbacks
        gpio: pci-idio-16: Implement get_multiple callback
        ...
      1b2951dd
  5. 05 Apr, 2018 9 commits