1. 24 8月, 2017 1 次提交
    • C
      block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992
      Christoph Hellwig 提交于
      This way we don't need a block_device structure to submit I/O.  The
      block_device has different life time rules from the gendisk and
      request_queue and is usually only available when the block device node
      is open.  Other callers need to explicitly create one (e.g. the lightnvm
      passthrough code, or the new nvme multipathing code).
      
      For the actual I/O path all that we need is the gendisk, which exists
      once per block device.  But given that the block layer also does
      partition remapping we additionally need a partition index, which is
      used for said remapping in generic_make_request.
      
      Note that all the block drivers generally want request_queue or
      sometimes the gendisk, so this removes a layer of indirection all
      over the stack.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      74d46992
  2. 10 8月, 2017 1 次提交
  3. 05 7月, 2017 1 次提交
  4. 04 7月, 2017 5 次提交
  5. 09 6月, 2017 1 次提交
  6. 03 6月, 2017 1 次提交
    • D
      bio-integrity: Do not allocate integrity context for bio w/o data · 3116a23b
      Dmitry Monakhov 提交于
      If bio has no data, such as ones from blkdev_issue_flush(),
      then we have nothing to protect.
      
      This patch prevent bugon like follows:
      
      kfree_debugcheck: out of range ptr ac1fa1d106742a5ah
      kernel BUG at mm/slab.c:2773!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: bcache
      CPU: 0 PID: 4428 Comm: xfs_io Tainted: G        W       4.11.0-rc4-ext4-00041-g2ef0043-dirty #43
      Hardware name: Virtuozzo KVM, BIOS seabios-1.7.5-11.vz7.4 04/01/2014
      task: ffff880137786440 task.stack: ffffc90000ba8000
      RIP: 0010:kfree_debugcheck+0x25/0x2a
      RSP: 0018:ffffc90000babde0 EFLAGS: 00010082
      RAX: 0000000000000034 RBX: ac1fa1d106742a5a RCX: 0000000000000007
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88013f3ccb40
      RBP: ffffc90000babde8 R08: 0000000000000000 R09: 0000000000000000
      R10: 00000000fcb76420 R11: 00000000725172ed R12: 0000000000000282
      R13: ffffffff8150e766 R14: ffff88013a145e00 R15: 0000000000000001
      FS:  00007fb09384bf40(0000) GS:ffff88013f200000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fd0172f9e40 CR3: 0000000137fa9000 CR4: 00000000000006f0
      Call Trace:
       kfree+0xc8/0x1b3
       bio_integrity_free+0xc3/0x16b
       bio_free+0x25/0x66
       bio_put+0x14/0x26
       blkdev_issue_flush+0x7a/0x85
       blkdev_fsync+0x35/0x42
       vfs_fsync_range+0x8e/0x9f
       vfs_fsync+0x1c/0x1e
       do_fsync+0x31/0x4a
       SyS_fsync+0x10/0x14
       entry_SYSCALL_64_fastpath+0x1f/0xc2
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NHannes Reinecke <hare@suse.com>
      Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      3116a23b
  7. 28 10月, 2016 1 次提交
  8. 08 8月, 2016 1 次提交
    • J
      block: rename bio bi_rw to bi_opf · 1eff9d32
      Jens Axboe 提交于
      Since commit 63a4cc24, bio->bi_rw contains flags in the lower
      portion and the op code in the higher portions. This means that
      old code that relies on manually setting bi_rw is most likely
      going to be broken. Instead of letting that brokeness linger,
      rename the member, to force old and out-of-tree code to break
      at compile time instead of at runtime.
      
      No intended functional changes in this commit.
      Signed-off-by: NJens Axboe <axboe@fb.com>
      1eff9d32
  9. 21 7月, 2016 1 次提交
  10. 14 6月, 2016 1 次提交
  11. 09 12月, 2015 1 次提交
  12. 04 12月, 2015 1 次提交
    • K
      blk-integrity: empty implementation when disabled · 06c1e390
      Keith Busch 提交于
      This patch moves the blk_integrity_payload definition outside the
      CONFIG_BLK_DEV_INTERITY dependency and provides empty function
      implementations when the kernel configuration disables integrity
      extensions. This simplifies drivers that make use of these to map user
      data so they don't need to repeat the same configuration checks.
      Signed-off-by: NKeith Busch <keith.busch@intel.com>
      
      Updated by Jens to pass an error pointer return from
      bio_integrity_alloc(), otherwise if CONFIG_BLK_DEV_INTEGRITY isn't
      set, we return a weird ENOMEM from __nvme_submit_user_cmd()
      if a meta buffer is set.
      Signed-off-by: NJens Axboe <axboe@fb.com>
      06c1e390
  13. 22 10月, 2015 3 次提交
  14. 11 9月, 2015 1 次提交
  15. 29 7月, 2015 1 次提交
    • C
      block: add a bi_error field to struct bio · 4246a0b6
      Christoph Hellwig 提交于
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not beeing persistent
      when bios are queued up, and are not passed along from child to parent
      bio in the ever more popular chaining scenario.  Having both mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NHannes Reinecke <hare@suse.de>
      Reviewed-by: NNeilBrown <neilb@suse.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      4246a0b6
  16. 07 7月, 2015 1 次提交
    • M
      bio integrity: do not assume bio_integrity_pool exists if bioset exists · bb8bd38b
      Mike Snitzer 提交于
      bio_integrity_alloc() and bio_integrity_free() assume that if a bio was
      allocated from a bioset that that bioset also had its bio_integrity_pool
      allocated using bioset_integrity_create().  This is a very bad
      assumption given that bioset_create() and bioset_integrity_create() are
      completely disjoint.  Not all callers of bioset_create() have been
      trained to also call bioset_integrity_create() -- and they may not care
      to be.
      
      Fix this by falling back to kmalloc'ing 'struct bio_integrity_payload'
      rather than force all bioset consumers to (wastefully) preallocate a
      bio_integrity_pool that they very likely won't actually need (given the
      niche nature of the current block integrity support).
      
      Otherwise, a NULL pointer "Kernel BUG" with a trace like the following
      will be observed (as seen on s390x using zfcp storage) because dm-io
      doesn't use bioset_integrity_create() when creating its bioset:
      
          [  791.643338] Call Trace:
          [  791.643339] ([<00000003df98b848>] 0x3df98b848)
          [  791.643341]  [<00000000002c5de8>] bio_integrity_alloc+0x48/0xf8
          [  791.643348]  [<00000000002c6486>] bio_integrity_prep+0xae/0x2f0
          [  791.643349]  [<0000000000371e38>] blk_queue_bio+0x1c8/0x3d8
          [  791.643355]  [<000000000036f8d0>] generic_make_request+0xc0/0x100
          [  791.643357]  [<000000000036f9b2>] submit_bio+0xa2/0x198
          [  791.643406]  [<000003ff801f9774>] dispatch_io+0x15c/0x3b0 [dm_mod]
          [  791.643419]  [<000003ff801f9b3e>] dm_io+0x176/0x2f0 [dm_mod]
          [  791.643423]  [<000003ff8074b28a>] do_reads+0x13a/0x1a8 [dm_mirror]
          [  791.643425]  [<000003ff8074b43a>] do_mirror+0x142/0x298 [dm_mirror]
          [  791.643428]  [<0000000000154fca>] process_one_work+0x18a/0x3f8
          [  791.643432]  [<000000000015598a>] worker_thread+0x132/0x3b0
          [  791.643435]  [<000000000015d49a>] kthread+0xd2/0xd8
          [  791.643438]  [<00000000005bc0ca>] kernel_thread_starter+0x6/0xc
          [  791.643446]  [<00000000005bc0c4>] kernel_thread_starter+0x0/0xc
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NJens Axboe <axboe@fb.com>
      bb8bd38b
  17. 22 5月, 2015 1 次提交
    • M
      block: remove management of bi_remaining when restoring original bi_end_io · 326e1dbb
      Mike Snitzer 提交于
      Commit c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for
      non-chains") regressed all existing callers that followed this pattern:
       1) saving a bio's original bi_end_io
       2) wiring up an intermediate bi_end_io
       3) restoring the original bi_end_io from intermediate bi_end_io
       4) calling bio_endio() to execute the restored original bi_end_io
      
      The regression was due to BIO_CHAIN only ever getting set if
      bio_inc_remaining() is called.  For the above pattern it isn't set until
      step 3 above (step 2 would've needed to establish BIO_CHAIN).  As such
      the first bio_endio(), in step 2 above, never decremented __bi_remaining
      before calling the intermediate bi_end_io -- leaving __bi_remaining with
      the value 1 instead of 0.  When bio_inc_remaining() occurred during step
      3 it brought it to a value of 2.  When the second bio_endio() was
      called, in step 4 above, it should've called the original bi_end_io but
      it didn't because there was an extra reference that wasn't dropped (due
      to atomic operations being optimized away since BIO_CHAIN wasn't set
      upfront).
      
      Fix this issue by removing the __bi_remaining management complexity for
      all callers that use the above pattern -- bio_chain() is the only
      interface that _needs_ to be concerned with __bi_remaining.  For the
      above pattern callers just expect the bi_end_io they set to get called!
      Remove bio_endio_nodec() and also remove all bio_inc_remaining() calls
      that aren't associated with the bio_chain() interface.
      
      Also, the bio_inc_remaining() interface has been moved local to bio.c.
      
      Fixes: c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for non-chains")
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      326e1dbb
  18. 02 12月, 2014 1 次提交
    • D
      block: fix regression where bio_integrity_process uses wrong bio_vec iterator · 594416a7
      Darrick J. Wong 提交于
      bio integrity handling is broken on a system with LVM layered atop a
      DIF/DIX SCSI drive because device mapper clones the bio, modifies the
      clone, and sends the clone to the lower layers for processing.
      However, the clone bio has bi_vcnt == 0, which means that when the sd
      driver calls bio_integrity_process to attach DIX data, the
      for_each_segment_all() call (which uses bi_vcnt) returns immediately
      and random garbage is sent to the disk on a disk write.  The disk of
      course returns an error.
      
      Therefore, teach bio_integrity_process() to use bio_for_each_segment()
      to iterate the bio_vecs, since the per-bio iterator tracks which
      bio_vecs are associated with that particular bio.  The integrity
      handling code is effectively part of the "driver" (it's not the bio
      owner), so it must use the correct iterator function.
      
      v2: Fix a compiler warning about abandoned local variables.  This
      patch supersedes "block: bio_integrity_process uses wrong bio_vec
      iterator".  Patch applies against 3.18-rc6.
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Acked-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      594416a7
  19. 14 10月, 2014 1 次提交
  20. 27 9月, 2014 10 次提交
  21. 22 8月, 2014 1 次提交
  22. 02 7月, 2014 1 次提交
    • G
      bio-integrity: add "bip_max_vcnt" into struct bio_integrity_payload · cbcd1054
      Gu Zheng 提交于
      Commit 08778795 ("block: Fix nr_vecs for inline integrity vectors") from
      Martin introduces the function bip_integrity_vecs(get the useful vectors)
      to fix the issue about nr_vecs for inline integrity vectors that reported
      by David Milburn.
      
      But it seems that bip_integrity_vecs() will return the wrong number if the
      bio is not based on any bio_set for some reason(bio->bi_pool == NULL),
      because in that case, the bip_inline_vecs[0] is malloced directly.  So
      here we add the bip_max_vcnt to record the count of vector slots, and
      cleanup the function bip_integrity_vecs().
      Signed-off-by: NGu Zheng <guz.fnst@cn.fujitsu.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      cbcd1054
  23. 19 5月, 2014 1 次提交
  24. 23 4月, 2014 1 次提交
  25. 09 4月, 2014 1 次提交