1. 04 Jan, 2016 1 commit
  2. 01 Nov, 2015 1 commit
  3. 14 Aug, 2015 1 commit
  4. 12 Aug, 2015 1 commit
    • S
      block: don't access bio->bi_error after bio_put() · 9b81c842
      Sasha Levin committed
      Commit 4246a0b6 ("block: add a bi_error field to struct bio") has added a few
      dereferences of 'bio' after a call to bio_put(). This causes use-after-frees
      such as:
      
      [521120.719695] BUG: KASan: use after free in dio_bio_complete+0x2b3/0x320 at addr ffff880f36b38714
      [521120.720638] Read of size 4 by task mount.ocfs2/9644
      [521120.721212] =============================================================================
      [521120.722056] BUG kmalloc-256 (Not tainted): kasan: bad access detected
      [521120.722968] -----------------------------------------------------------------------------
      [521120.722968]
      [521120.723915] Disabling lock debugging due to kernel taint
      [521120.724539] INFO: Slab 0xffffea003cdace00 objects=32 used=25 fp=0xffff880f36b38600 flags=0x46fffff80004080
      [521120.726037] INFO: Object 0xffff880f36b38700 @offset=1792 fp=0xffff880f36b38800
      [521120.726037]
      [521120.726974] Bytes b4 ffff880f36b386f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.727898] Object ffff880f36b38700: 00 88 b3 36 0f 88 ff ff 00 00 d8 de 0b 88 ff ff  ...6............
      [521120.728822] Object ffff880f36b38710: 02 00 00 f0 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.729705] Object ffff880f36b38720: 01 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00  ................
      [521120.730623] Object ffff880f36b38730: 00 00 00 00 00 00 00 00 01 00 00 00 00 02 00 00  ................
      [521120.731621] Object ffff880f36b38740: 00 02 00 00 01 00 00 00 d0 f7 87 ad ff ff ff ff  ................
      [521120.732776] Object ffff880f36b38750: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.733640] Object ffff880f36b38760: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.734508] Object ffff880f36b38770: 01 00 03 00 01 00 00 00 88 87 b3 36 0f 88 ff ff  ...........6....
      [521120.735385] Object ffff880f36b38780: 00 73 22 ad 02 88 ff ff 40 13 e0 3c 00 ea ff ff  .s".....@..<....
      [521120.736667] Object ffff880f36b38790: 00 02 00 00 00 04 00 00 00 00 00 00 00 00 00 00  ................
      [521120.737596] Object ffff880f36b387a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.738524] Object ffff880f36b387b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.739388] Object ffff880f36b387c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.740277] Object ffff880f36b387d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.741187] Object ffff880f36b387e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.742233] Object ffff880f36b387f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [521120.743229] CPU: 41 PID: 9644 Comm: mount.ocfs2 Tainted: G    B           4.2.0-rc6-next-20150810-sasha-00039-gf909086 #2420
      [521120.744274]  ffff880f36b38000 ffff880d89c8f638 ffffffffb6e9ba8a ffff880101c0e5c0
      [521120.745025]  ffff880d89c8f668 ffffffffad76a313 ffff880101c0e5c0 ffffea003cdace00
      [521120.745908]  ffff880f36b38700 ffff880f36b38798 ffff880d89c8f690 ffffffffad772854
      [521120.747063] Call Trace:
      [521120.747520] dump_stack (lib/dump_stack.c:52)
      [521120.748053] print_trailer (mm/slub.c:653)
      [521120.748582] object_err (mm/slub.c:660)
      [521120.749079] kasan_report_error (include/linux/kasan.h:20 mm/kasan/report.c:152 mm/kasan/report.c:194)
      [521120.750834] __asan_report_load4_noabort (mm/kasan/report.c:250)
      [521120.753580] dio_bio_complete (fs/direct-io.c:478)
      [521120.755752] do_blockdev_direct_IO (fs/direct-io.c:494 fs/direct-io.c:1291)
      [521120.759765] __blockdev_direct_IO (fs/direct-io.c:1322)
      [521120.761658] blkdev_direct_IO (fs/block_dev.c:162)
      [521120.762993] generic_file_read_iter (mm/filemap.c:1738)
      [521120.767405] blkdev_read_iter (fs/block_dev.c:1649)
      [521120.768556] __vfs_read (fs/read_write.c:423 fs/read_write.c:434)
      [521120.772126] vfs_read (fs/read_write.c:454)
      [521120.773118] SyS_pread64 (fs/read_write.c:607 fs/read_write.c:594)
      [521120.776062] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
      [521120.777375] Memory state around the buggy address:
      [521120.778118]  ffff880f36b38600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [521120.779211]  ffff880f36b38680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [521120.780315] >ffff880f36b38700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [521120.781465]                          ^
      [521120.782083]  ffff880f36b38780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [521120.783717]  ffff880f36b38800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [521120.784818] ==================================================================
      
      This patch fixes a few of those places that I caught while auditing the patch, but the
      original patch should be audited further for more occurrences of this issue since I'm
      not too familiar with the code.
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      9b81c842
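The ordering problem described in this commit can be illustrated with a small user-space model. This is only a sketch: `fake_bio`, `fake_bio_alloc`, `fake_bio_put` and `complete_bio_safely` are hypothetical stand-ins invented for the example, not kernel functions. The point it shows is that the error value is copied out before the reference is dropped.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bio: only the fields the example needs. */
struct fake_bio {
    int refcount;  /* models the bio reference count */
    int error;     /* models bio->bi_error */
};

static struct fake_bio *fake_bio_alloc(int error)
{
    struct fake_bio *bio = malloc(sizeof(*bio));

    assert(bio != NULL);
    bio->refcount = 1;
    bio->error = error;
    return bio;
}

/* Models bio_put(): drops one reference and frees the object on the last one. */
static void fake_bio_put(struct fake_bio *bio)
{
    if (--bio->refcount == 0)
        free(bio);
}

/* Safe completion: read the error first, then drop the reference. */
static int complete_bio_safely(struct fake_bio *bio)
{
    int err = bio->error;  /* read while the object is still alive */

    fake_bio_put(bio);     /* bio must not be dereferenced after this line */
    return err;
}
```

Calling `complete_bio_safely(fake_bio_alloc(-5))` returns the stored error without touching freed memory. Swapping the two statements in `complete_bio_safely` would reproduce the kind of use-after-free the report above shows.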
  5. 29 Jul, 2015 1 commit
    • C
      block: add a bi_error field to struct bio · 4246a0b6
      Christoph Hellwig committed
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not being persistent
      when bios are queued up, and are not passed along from child to parent
      bio in the ever more popular chaining scenario.  Having both mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      4246a0b6
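As a rough illustration of why an error field stored in the bio itself helps with chaining, here is a user-space sketch. `model_bio` and `model_bio_endio` are hypothetical names, and the "first error wins" propagation policy is an assumption of this example rather than something the commit message specifies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a bio that carries its own status. */
struct model_bio {
    int error;                 /* models bi_error: 0 or a negative errno */
    struct model_bio *parent;  /* models the parent in a bio chain */
};

/* Complete a bio with the given status and pass the result up the chain. */
static void model_bio_endio(struct model_bio *bio, int error)
{
    if (error && !bio->error)
        bio->error = error;    /* assumption: keep the first error seen */
    if (bio->parent)
        model_bio_endio(bio->parent, bio->error);
}

/* Complete a child with child_error and report what the parent ends up with. */
static int chained_error_demo(int child_error)
{
    struct model_bio parent = { 0, NULL };
    struct model_bio child = { 0, &parent };

    model_bio_endio(&child, child_error);
    return parent.error;
}
```

With this layout, `chained_error_demo(-ENOSPC)` yields `-ENOSPC` on the parent, so the specific errno survives the hop from child to parent instead of collapsing to a single "not up to date" state.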
  6. 28 Feb, 2015 1 commit
  7. 14 Feb, 2015 1 commit
    • D
      dm io: reject unsupported DISCARD requests with EOPNOTSUPP · 37527b86
      Darrick J. Wong committed
      I created a dm-raid1 device backed by a device that supports DISCARD
      and another device that does NOT support DISCARD with the following
      dm configuration:
      
       #  echo '0 2048 mirror core 1 512 2 /dev/sda 0 /dev/sdb 0' | dmsetup create moo
       # lsblk -D
       NAME         DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
       sda                 0        4K       1G         0
       `-moo (dm-0)        0        4K       1G         0
       sdb                 0        0B       0B         0
       `-moo (dm-0)        0        4K       1G         0
      
      Notice that the mirror device /dev/mapper/moo advertises DISCARD
      support even though one of the mirror halves doesn't.
      
      If I issue a DISCARD request (via fstrim, mount -o discard, or ioctl
      BLKDISCARD) through the mirror, kmirrord gets stuck in an infinite
      loop in do_region() when it tries to issue a DISCARD request to sdb.
      The problem is that when we call do_region() against sdb, num_sectors
      is set to zero because q->limits.max_discard_sectors is zero.
      Therefore, "remaining" never decreases and the loop never terminates.
      
      To fix this: before entering the loop, check for the combination of
      REQ_DISCARD and no discard and return -EOPNOTSUPP to avoid hanging up
      the mirror device.
      
      This bug was found by the unfortunate coincidence of pvmove and a
      discard operation in the RHEL 6.5 kernel; upstream is also affected.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Acked-by: "Martin K. Petersen" <martin.petersen@oracle.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
      37527b86
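The hang and its fix can be modelled in a few lines of user-space C. This is a simplified sketch, not the dm-io code: `issue_discard` is a hypothetical function that only mirrors the shape of the loop described above, where a zero `max_discard_sectors` would leave `remaining` unchanged forever.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical model of the per-device discard loop.
 * Returns the number of chunks issued, or -EOPNOTSUPP when the
 * device cannot discard at all.
 */
static int issue_discard(unsigned long remaining,
                         unsigned long max_discard_sectors)
{
    int chunks = 0;

    /* Guard added before the loop: without it, num_sectors below would
     * be 0 and "remaining" would never decrease. */
    if (max_discard_sectors == 0)
        return -EOPNOTSUPP;

    while (remaining) {
        unsigned long num_sectors = remaining < max_discard_sectors
                                        ? remaining
                                        : max_discard_sectors;

        remaining -= num_sectors;  /* progress is guaranteed here */
        chunks++;
    }
    return chunks;
}
```

For a 2048-sector request, a device limited to 512 sectors per discard is served in four chunks, while a device with a limit of zero is rejected immediately instead of spinning.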
  8. 02 Aug, 2014 1 commit
  9. 11 Jul, 2014 1 commit
  10. 18 Feb, 2014 1 commit
    • M
      dm io: fix I/O to multiple destinations · d73f9907
      Mikulas Patocka committed
      Commit 003b5c57 ("block: Convert drivers to immutable biovecs") broke
      dm-mirror due to dm-io breakage.
      
      dm-io had three possible iterators (DM_IO_PAGE_LIST, DM_IO_BVEC,
      DM_IO_VMA) that iterate over pages where the I/O should be performed.
      
      The switch to immutable biovecs changed the DM_IO_BVEC iterator to
      DM_IO_BIO.  Before this change the iterator stored the pointer to a bio
      vector in the dpages structure.  The iterator incremented the pointer in
      the dpages structure as it advanced over the pages.  After the immutable
      biovecs change, the DM_IO_BIO iterator stores a pointer to the bio in
      the dpages structure and uses bio_advance to change the bio as it
      advances.
      
      The problem is that the function dispatch_io stores the content of the
      dpages structure into the variable old_pages and restores it before
      issuing I/O to each of the devices.  Before the change, the statement
      "*dp = old_pages;" restored the iterator to its starting position.
      After the change, struct dpages holds a pointer to the bio, thus the
      statement "*dp = old_pages;" doesn't restore the iterator.
      
      Consequently, in the context of dm-mirror: only the first mirror leg is
      written correctly, the kernel locks up when trying to write the other
      mirror legs because the number of sectors to write in the where->count
      variable doesn't match the number of sectors returned by the iterator.
      
      This patch fixes the bug by partially reverting the original patch - it
      changes the code so that struct dpages holds a pointer to the bio vector,
      so that the statement "*dp = old_pages;" restores the iterator correctly.
      
      The field "context_u" holds the offset from the beginning of the current
      bio vector entry, just like the "bio->bi_iter.bi_bvec_done" field.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mikulas Snitzer <snitzer@redhat.com>
      d73f9907
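The difference between restoring an iterator that holds its position by value and one that only holds a pointer to shared state can be sketched as follows. All names here (`cursor`, `iter_by_value`, `iter_by_pointer`, `pages`) are hypothetical. The sketch only shows that the saved copy fails to rewind in the pointer case; it does not reproduce the lock-up described above.

```c
#include <assert.h>
#include <stddef.h>

static const int pages[3] = { 1, 2, 3 };  /* sample data to walk */

struct cursor { size_t pos; };            /* models mutable state inside the bio */

/* Variant A: the iterator stores its position directly. */
struct iter_by_value { const int *data; size_t pos; };

/* Variant B: the iterator stores a pointer to state that lives elsewhere. */
struct iter_by_pointer { const int *data; struct cursor *cur; };

static int sum_two_legs_by_value(const int *data, size_t n)
{
    struct iter_by_value it = { data, 0 };
    struct iter_by_value saved = it;      /* like "old_pages = *dp" */
    int total = 0;

    for (int leg = 0; leg < 2; leg++) {
        it = saved;                       /* like "*dp = old_pages": rewinds */
        while (it.pos < n)
            total += it.data[it.pos++];
    }
    return total;
}

static int sum_two_legs_by_pointer(const int *data, size_t n)
{
    struct cursor cur = { 0 };
    struct iter_by_pointer it = { data, &cur };
    struct iter_by_pointer saved = it;
    int total = 0;

    for (int leg = 0; leg < 2; leg++) {
        it = saved;                       /* restores the pointer only */
        while (it.cur->pos < n)           /* cur.pos is still advanced */
            total += it.data[it.cur->pos++];
    }
    return total;
}
```

Walking `pages` for two legs gives 12 with the by-value iterator and only 6 with the by-pointer iterator, because the second leg starts where the first one ended.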
  11. 24 Nov, 2013 2 commits
    • K
      block: Convert drivers to immutable biovecs · 003b5c57
      Kent Overstreet committed
      Now that we've got a mechanism for immutable biovecs -
      bi_iter.bi_bvec_done - we need to convert drivers to use primitives that
      respect it instead of using the bvec array directly.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: dm-devel@redhat.com
      003b5c57
    • K
      block: Abstract out bvec iterator · 4f024f37
      Kent Overstreet committed
      Immutable biovecs are going to require an explicit iterator. To
      implement immutable bvecs, a later patch is going to add a bi_bvec_done
      member to this struct; for now, this patch effectively just renames
      things.
      Signed-off-by: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: "Ed L. Cashin" <ecashin@coraid.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Geoff Levand <geoff@infradead.org>
      Cc: Yehuda Sadeh <yehuda@inktank.com>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Alex Elder <elder@inktank.com>
      Cc: ceph-devel@vger.kernel.org
      Cc: Joshua Morris <josh.h.morris@us.ibm.com>
      Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux390@de.ibm.com
      Cc: Boaz Harrosh <bharrosh@panasas.com>
      Cc: Benny Halevy <bhalevy@tonian.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Chris Mason <chris.mason@fusionio.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andreas Dilger <adilger.kernel@dilger.ca>
      Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Dave Kleikamp <shaggy@kernel.org>
      Cc: Joern Engel <joern@logfs.org>
      Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Ben Myers <bpm@sgi.com>
      Cc: xfs@oss.sgi.com
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Guo Chao <yan@linux.vnet.ibm.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Asai Thambi S P <asamymuthupa@micron.com>
      Cc: Selvan Mani <smani@micron.com>
      Cc: Sam Bradshaw <sbradshaw@micron.com>
      Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: "Roger Pau Monné" <roger.pau@citrix.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Cc: Ian Campbell <Ian.Campbell@citrix.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Jiang Liu <jiang.liu@huawei.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchand@redhat.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Peng Tao <tao.peng@emc.com>
      Cc: Andy Adamson <andros@netapp.com>
      Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
      Cc: Jie Liu <jeff.liu@oracle.com>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Namjae Jeon <namjae.jeon@samsung.com>
      Cc: Pankaj Kumar <pankaj.km@samsung.com>
      Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
      Cc: Mel Gorman <mgorman@suse.de>
      4f024f37
  12. 23 Sep, 2013 1 commit
    • M
      dm: add reserved_bio_based_ios module parameter · e8603136
      Mike Snitzer committed
      Allow user to change the number of IOs that are reserved by
      bio-based DM's mempools by writing to this file:
      /sys/module/dm_mod/parameters/reserved_bio_based_ios
      
      The default value is RESERVED_BIO_BASED_IOS (16).  The maximum allowed
      value is RESERVED_MAX_IOS (1024).
      
      Export dm_get_reserved_bio_based_ios() for use by DM targets and core
      code.  Switch to sizing dm-io's mempool and bioset using DM core's
      configurable 'reserved_bio_based_ios'.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Frank Mayhar <fmayhar@google.com>
      e8603136
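One way to picture how a tunable with a default and a maximum might be sanitised is the sketch below. The two constants use the values quoted in the commit message; `effective_reserved_ios` is a hypothetical helper, and treating a value of zero as "use the default" is an assumption of this example, not something the message states.

```c
#include <assert.h>

#define RESERVED_BIO_BASED_IOS 16    /* default quoted in the commit message */
#define RESERVED_MAX_IOS       1024  /* maximum quoted in the commit message */

/* Hypothetical helper: map a user-supplied value onto the allowed range. */
static unsigned effective_reserved_ios(unsigned requested)
{
    if (requested == 0)
        return RESERVED_BIO_BASED_IOS;  /* assumption: 0 falls back to default */
    if (requested > RESERVED_MAX_IOS)
        return RESERVED_MAX_IOS;        /* cap oversized requests */
    return requested;
}
```

Under these assumptions a request for 64 is honoured as is, while a request for 4096 is capped at 1024.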
  13. 22 Dec, 2012 1 commit
    • M
      dm kcopyd: add WRITE SAME support to dm_kcopyd_zero · 70d6c400
      Mike Snitzer committed
      Add WRITE SAME support to dm-io and make it accessible to
      dm_kcopyd_zero().  dm_kcopyd_zero() provides an asynchronous interface
      whereas the blkdev_issue_write_same() interface is synchronous.
      
      WRITE SAME is a SCSI command that can be leveraged for more efficient
      zeroing of a specified logical extent of a device which supports it.
      Only a single zeroed logical block is transferred to the target for each
      WRITE SAME and the target then writes that same block across the
      specified extent.
      
      The dm thin target uses this.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      70d6c400
  14. 09 Sep, 2012 1 commit
  15. 08 Mar, 2012 1 commit
    • M
      dm io: fix discard support · 0c535e0d
      Milan Broz committed
      This patch fixes a crash by recognising discards in dm_io.
      
      Currently dm_mirror can send REQ_DISCARD bios if running over a
      discard-enabled device and without support in dm_io the system
      crashes badly.
      
      BUG: unable to handle kernel paging request at 00800000
      IP:  __bio_add_page.part.17+0xf5/0x1e0
      ...
       bio_add_page+0x56/0x70
       dispatch_io+0x1cf/0x240 [dm_mod]
       ? km_get_page+0x50/0x50 [dm_mod]
       ? vm_next_page+0x20/0x20 [dm_mod]
       ? mirror_flush+0x130/0x130 [dm_mirror]
       dm_io+0xdc/0x2b0 [dm_mod]
      ...
      
      Introduced in 2.6.38-rc1 by commit 5fc2ffea
      (dm raid1: support discard).
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Cc: stable@kernel.org
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      0c535e0d
  16. 02 Aug, 2011 1 commit
    • M
      dm io: flush cpu cache with vmapped io · bb91bc7b
      Mikulas Patocka committed
      For normal kernel pages, CPU cache is synchronized by the dma layer.
      However, this is not done for pages allocated with vmalloc. If we do I/O
      to/from vmallocated pages, we must synchronize CPU cache explicitly.
      
      Prior to doing I/O on vmallocated page we must call
      flush_kernel_vmap_range to flush dirty cache on the virtual address.
      After finished read we must call invalidate_kernel_vmap_range to
      invalidate cache on the virtual address, so that accesses to the virtual
      address return newly read data and not stale data from CPU cache.
      
      This patch fixes metadata corruption on dm-snapshots on PA-RISC and
      possibly other architectures with caches indexed by virtual address.
      
      Cc: stable <stable@kernel.org>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      bb91bc7b
  17. 29 May, 2011 1 commit
    • M
      dm io: use fixed initial mempool size · bda8efec
      Mikulas Patocka committed
      Replace the arbitrary calculation of an initial io struct mempool size
      with a constant.
      
      The code calculated the number of reserved structures based on the request
      size and used a "magic" multiplication constant of 4.  This patch changes
      it to reserve a fixed number - itself still chosen quite arbitrarily.
      Further testing might show if there is a better number to choose.
      
      Note that if there is no memory pressure, we can still allocate an
      arbitrary number of "struct io" structures.  One structure is enough to
      process the whole request.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      bda8efec
  18. 10 Mar, 2011 1 commit
    • J
      block: kill off REQ_UNPLUG · 721a9602
      Jens Axboe committed
      With the plugging now being explicitly controlled by the
      submitter, callers need not pass down unplugging hints
      to the block layer. If they want to unplug, it's because they
      manually plugged on their own - in which case, they should just
      unplug at will.
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      721a9602
  19. 10 Sep, 2010 1 commit
    • T
      dm: implement REQ_FLUSH/FUA support for bio-based dm · d87f4c14
      Tejun Heo committed
      This patch converts bio-based dm to support REQ_FLUSH/FUA instead of
      now deprecated REQ_HARDBARRIER.
      
      * -EOPNOTSUPP handling logic dropped.
      
      * Preflush is handled as before but postflush is dropped and replaced
        with passing down REQ_FUA to member request_queues.  This replaces
        one array wide cache flush w/ member specific FUA writes.
      
      * __split_and_process_bio() now calls __clone_and_map_flush() directly
        for flushes and guarantees all FLUSH bio's going to targets are zero
        length.
      
      * It's now guaranteed that all FLUSH bio's which are passed onto dm
        targets are zero length.  bio_empty_barrier() tests are replaced
        with REQ_FLUSH tests.
      
      * Empty WRITE_BARRIERs are replaced with WRITE_FLUSHes.
      
      * Dropped unlikely() around REQ_FLUSH tests.  Flushes are not unlikely
        enough to be marked with unlikely().
      
      * Block layer now filters out REQ_FLUSH/FUA bio's if the request_queue
        doesn't support cache flushing.  Advertise REQ_FLUSH | REQ_FUA
        capability.
      
      * Request based dm isn't converted yet.  dm_init_request_based_queue()
        resets flush support to 0 for now.  To avoid disturbing request
        based dm code, dm->flush_error is added for bio based dm while
        requested based dm continues to use dm->barrier_error.
      
      Lightly tested linear, stripe, raid1, snap and crypt targets.  Please
      proceed with caution as I'm not familiar with the code base.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: dm-devel@redhat.com
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      d87f4c14
  20. 08 Aug, 2010 1 commit
    • C
      block: unify flags for struct bio and struct request · 7b6d91da
      Christoph Hellwig committed
      Remove the current bio flags and reuse the request flags for the bio, too.
      This makes it easier to trace the type of I/O from the filesystem
      down to the block driver.  There were two flags in the bio that were
      missing in the requests:  BIO_RW_UNPLUG and BIO_RW_AHEAD.  Also I've
      renamed two request flags that had a superfluous RW in them.
      
      Note that the flags are in bio.h despite having the REQ_ name - as
      blkdev.h includes bio.h that is the only way to go for now.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
      7b6d91da
  21. 11 Dec, 2009 3 commits
    • M
      dm io: handle empty barriers · 12fc0f49
      Mikulas Patocka committed
      Accept empty barriers in dm-io.
      
      dm-io will process empty write barrier requests just like the other
      read/write requests.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      12fc0f49
    • M
      dm io: remove extra bi_io_vec region hack · f1e53987
      Mikulas Patocka committed
      Remove the hack where we allocate an extra bi_io_vec to store additional
      private data.  This hack prevents us from supporting barriers in
      dm-raid1 without first making another little block layer change.
      Instead of doing that, this patch eliminates the bi_io_vec abuse by
      storing the region number directly in the low bits of bi_private.
      
      We need to store two things for each bio, the pointer to the main io
      structure and, if parallel writes were requested, an index indicating
      which of these writes this bio belongs to.  There can be at most
      BITS_PER_LONG regions - 32 or 64.
      
      The index (region number) was stored in the last (hidden) bio vector and
      the pointer to struct io was stored in bi_private.
      
      This patch now aligns "struct io" on BITS_PER_LONG bytes and stores the
      region number in the low log2(BITS_PER_LONG) bits of bi_private, which
      that alignment leaves free (5 bits on 32-bit, 6 bits on 64-bit).
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      f1e53987
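Packing a small index into the unused low bits of an aligned pointer can be sketched in user space as below. `model_io`, `pack_private`, `unpack_private` and `roundtrip_ok` are hypothetical names, and the manual rounding of a `malloc` buffer is just a stand-in for whatever allocator guarantees the alignment. The commit above obtains it from a slab cache.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Models BITS_PER_LONG: both the alignment in bytes and the region limit. */
#define MAX_REGIONS (sizeof(unsigned long) * CHAR_BIT)

struct model_io { unsigned long error_bits; };  /* hypothetical "struct io" */

/* Combine an aligned pointer and a region index into one pointer-sized value. */
static void *pack_private(struct model_io *io, unsigned region)
{
    assert(((uintptr_t)io & (MAX_REGIONS - 1)) == 0);  /* low bits must be free */
    assert(region < MAX_REGIONS);
    return (void *)((uintptr_t)io | region);
}

/* Split the combined value back into pointer and region index. */
static void unpack_private(void *priv, struct model_io **io, unsigned *region)
{
    uintptr_t val = (uintptr_t)priv;

    *io = (struct model_io *)(val & ~(uintptr_t)(MAX_REGIONS - 1));
    *region = (unsigned)(val & (MAX_REGIONS - 1));
}

/* Allocate an aligned object, pack, unpack, and check nothing was lost. */
static int roundtrip_ok(unsigned region)
{
    unsigned char *raw = malloc(sizeof(struct model_io) + MAX_REGIONS);
    struct model_io *io, *out_io;
    unsigned out_region;
    int ok;

    if (!raw)
        return 0;
    /* Round the raw address up to the next MAX_REGIONS-byte boundary. */
    io = (struct model_io *)(((uintptr_t)raw + MAX_REGIONS - 1) &
                             ~(uintptr_t)(MAX_REGIONS - 1));
    unpack_private(pack_private(io, region), &out_io, &out_region);
    ok = (out_io == io) && (out_region == region);
    free(raw);
    return ok;
}
```

Because the object is aligned on `MAX_REGIONS` bytes, every region index from 0 to `MAX_REGIONS - 1` fits in the low bits without disturbing the pointer, so no extra per-bio storage is needed.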
    • M
      dm io: use slab for struct io · 952b3557
      Mikulas Patocka committed
      Allocate "struct io" from a slab.
      
      This patch changes dm-io, so that "struct io" is allocated from a slab cache.
      It used to be allocated with kmalloc. Allocating from a slab will be needed
      for the next patch, because it requires a special alignment of "struct io"
      and kmalloc cannot meet this alignment.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      952b3557
  22. 22 Jun, 2009 2 commits
  23. 03 Apr, 2009 1 commit
    • M
      dm io: make sync_io uninterruptible · b64b6bf4
      Mikulas Patocka committed
      If someone sends signal to a process performing synchronous dm-io call,
      the kernel may crash.
      
      The function sync_io attempts to exit with -EINTR if it has pending signal,
      however the structure "io" is allocated on stack, so already submitted io
      requests end up touching unallocated stack space and corrupting kernel memory.
      
      sync_io sets its state to TASK_UNINTERRUPTIBLE, so the signal can't break out
      of io_schedule() --- however, if the signal was pending before sync_io entered
      while (1) loop, the corruption of kernel memory will happen.
      
      There is no way to cancel in-progress IOs, so the best solution is to ignore
      signals at this point.
      
      Cc: stable@kernel.org
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      b64b6bf4
  24. 17 Mar, 2009 1 commit
  25. 18 Feb, 2009 1 commit
  26. 29 Dec, 2008 1 commit
    • J
      bio: allow individual slabs in the bio_set · bb799ca0
      Jens Axboe committed
      Instead of having a global bio slab cache, add a reference to one
      in each bio_set that is created. This allows for personalized slabs
      in each bio_set, so that they can have bios of different sizes.
      
      This means we can personalize the bios we return. File systems may
      want to embed the bio inside another structure, to avoid allocating
      more items (and stuffing them in ->bi_private) after they get a bio.
      Or we may want to embed a number of bio_vecs directly at the end
      of a bio, to avoid doing two allocations to return a bio. This is now
      possible.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      bb799ca0
  27. 22 Oct, 2008 1 commit
  28. 25 Apr, 2008 4 commits
    • M
      dm: unplug queues in threads · 7ff14a36
      Mikulas Patocka committed
      Remove an avoidable 3ms delay on some dm-raid1 and kcopyd I/O.
      
      It is specified that any submitted bio without BIO_RW_SYNC flag may plug the
      queue (i.e. block the requests from being dispatched to the physical device).
      
      The queue is unplugged when the caller calls blk_unplug() function. Usually, the
      sequence is that someone calls submit_bh to submit IO on a buffer. The IO plugs
      the queue and waits (to be possibly joined with other adjacent bios). Then, when
      the caller calls wait_on_buffer(), it unplugs the queue and submits the IOs to
      the disk.
      
      This was happening:
      
      When doing O_SYNC writes, function fsync_buffers_list() submits a list of
      bios to dm_raid1, the bios are added to dm_raid1 write queue and kmirrord is
      woken up.
      
      fsync_buffers_list() calls wait_on_buffer().  That unplugs the queue, but
      there are no bios on the device queue as they are still in the dm_raid1 queue.
      
      wait_on_buffer() starts waiting until the IO is finished.
      
      kmirrord is scheduled, kmirrord takes bios and submits them to the devices.
      
      The submitted bio plugs the harddisk queue but there is no one to unplug it.
      (The process that called wait_on_buffer() is already sleeping.)
      
      So there is a 3ms timeout, after which the queues on the harddisks are
      unplugged and requests are processed.
      
      This 3ms timeout meant that in certain workloads (e.g. O_SYNC, 8kb writes),
      dm-raid1 is 10 times slower than md raid1.
      
      Every time we submit something asynchronously via dm_io, we must unplug the
      queue actually to send the request to the device.
      
      This patch adds an unplug call to kmirrord - while processing requests, it keeps
      the queue plugged (so that adjacent bios can be merged); when it finishes
      processing all the bios, it unplugs the queue to submit the bios.
      
      It also fixes kcopyd which has the same potential problem. All kcopyd requests
      are submitted with BIO_RW_SYNC.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      Acked-by: Jens Axboe <jens.axboe@oracle.com>
      7ff14a36
    • A
      dm: move include files · a765e20e
      Alasdair G Kergon committed
      Publish the dm-io, dm-log and dm-kcopyd headers in include/linux.
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      a765e20e
    • H
      dm io: clean interface · 22a1ceb1
      Heinz Mauelshagen committed
      Clean up the dm-io interface to prepare for publishing it in include/linux.
      Signed-off-by: Heinz Mauelshagen <hjm@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      22a1ceb1
    • A
      dm io: rename error to error_bits · e01fd7ee
      Alasdair G Kergon committed
      Rename 'error' to 'error_bits' for clarity.
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      e01fd7ee
  29. 29 Mar, 2008 1 commit
  30. 10 Oct, 2007 1 commit
  31. 13 Jul, 2007 1 commit
  32. 10 May, 2007 2 commits