1. 06 Jan, 2009 (18 commits)
  2. 29 Dec, 2008 (1 commit)
    • bio: allow individual slabs in the bio_set · bb799ca0
      Jens Axboe authored
      Instead of having a global bio slab cache, add a reference to one
      in each bio_set that is created. This allows for personalized slabs
      in each bio_set, so that they can have bios of different sizes.
      
      This means we can personalize the bios we return. File systems may
      want to embed the bio inside another structure, to avoid allocating
      more items (and stuffing them in ->bi_private) after they get a bio.
      Or we may want to embed a number of bio_vecs directly at the end
      of a bio, to avoid doing two allocations to return a bio. This is now
      possible.
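      As a rough illustration, here is a minimal sketch of how a file system
      might use this, assuming the two-argument bioset_create(pool_size,
      front_pad) form this series introduces; the names my_request,
      my_bio_set and my_alloc_request are hypothetical:

          struct my_request {
              void *fs_private;        /* fs-specific context */
              struct bio bio;          /* must be the last member */
          };

          static struct bio_set *my_bio_set;

          static int __init my_init(void)
          {
              /* pad every bio with enough room to embed it in my_request */
              my_bio_set = bioset_create(4, offsetof(struct my_request, bio));
              return my_bio_set ? 0 : -ENOMEM;
          }

          static struct my_request *my_alloc_request(gfp_t gfp)
          {
              struct bio *bio = bio_alloc_bioset(gfp, 1, my_bio_set);

              if (!bio)
                  return NULL;
              /* the bio sits front_pad bytes into the allocation */
              return container_of(bio, struct my_request, bio);
          }
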
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  3. 19 Dec, 2008 (1 commit)
    • md: Don't read past end of bitmap when reading bitmap. · a2ed9615
      NeilBrown authored
      When we read the write-intent-bitmap off the device, we currently
      read a whole number of pages.
      When PAGE_SIZE is 4K, this works due to the alignment we enforce
      on the superblock and bitmap.
      When PAGE_SIZE is 64K, this could read past the end of the device,
      which causes an error.
      
      When we write the superblock, we already clip the last page to just
      the required size.  Copy that code into the read path so that we
      read only the required number of sectors.
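      A minimal sketch of the clipping logic (the helper name and its
      parameters are hypothetical, not the actual patch):

          /* Bytes to read for bitmap page 'page_index': a full page,
           * except for the final page, which is clipped to the bitmap's
           * real size and rounded up to whole 512-byte sectors. */
          static int bitmap_read_bytes(loff_t bitmap_bytes, int page_index)
          {
              loff_t offset = (loff_t)page_index << PAGE_SHIFT;
              int count = PAGE_SIZE;

              if (offset + PAGE_SIZE > bitmap_bytes)   /* last, partial page */
                  count = bitmap_bytes - offset;

              return roundup(count, 512);
          }
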
      Signed-off-by: Neil Brown <neilb@suse.de>
      Cc: stable@kernel.org
  4. 03 Dec, 2008 (1 commit)
    • block: fix setting of max_segment_size and seg_boundary mask · 0e435ac2
      Milan Broz authored
      Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
      devices.
      
      When stacking devices (LVM over MD over SCSI), some of the request
      queue parameters are in some cases not set up correctly by default,
      namely max_segment_size and the seg_boundary mask.
      
      If you create an MD device over SCSI, these attributes are zeroed.
      
      The problem appears when a further device-mapper mapping is stacked
      on top of this one - the queue attributes are then set in DM this way:
      
      request_queue   max_segment_size  seg_boundary_mask
      SCSI                65536             0xffffffff
      MD RAID1                0                      0
      LVM                 65536                 -1 (64bit)
      
      Unfortunately, bio_add_page() (and likewise bio_phys_segments())
      calculates the number of physical segments according to these
      parameters.
      
      During generic_make_request() the segment count is recalculated and
      can increase the bio->bi_phys_segments count over the allowed limit
      (after bio_clone() in the stacking operation).
      
      This is especially a problem for the CCISS driver, where it produces an oops here:
      
          BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);
      
      (MAXSGENTRIES is 31 by default.)
      
      Sometimes even this command is enough to cause the oops:
      
        dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10
      
      This command generates bios with 250 sectors, allocated in 32 4k-pages
      (last page uses only 1024 bytes).
      
      At the LVM layer, the bio is allocated with 31 segments (still OK for
      CCISS); unfortunately, at the lower layer it is recalculated to 32
      segments, which violates the CCISS restriction and triggers the BUG_ON().
      
      The patch tries to fix this by:

       * initializing the attributes above in the queue request constructor
         blk_queue_make_request()

       * making sure that blk_queue_stack_limits() inherits these settings
         (see the sketch below)

       (DM uses its own function to set the limits because
       blk_queue_stack_limits() was introduced later.  It should probably
       switch to the generic stack-limits function too.)

       * setting the default seg_boundary value in one place (blkdev.h)

       * using this mask as the default in DM (instead of -1, which differs
         on 64bit)
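
      Conceptually, the inheritance rule looks like the sketch below (not
      the verbatim patch; it relies on both queues having been initialized
      to real defaults, such as MAX_SEGMENT_SIZE and a non-zero default
      boundary mask in blkdev.h, which is exactly what the fix ensures):

          static void stack_segment_limits(struct request_queue *t,
                                           struct request_queue *b)
          {
              /* the top queue must never be looser than the bottom one */
              t->max_segment_size = min(t->max_segment_size,
                                        b->max_segment_size);
              t->seg_boundary_mask = min(t->seg_boundary_mask,
                                         b->seg_boundary_mask);
          }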
      
      Bugs related to this:
      https://bugzilla.redhat.com/show_bug.cgi?id=471639
      http://bugzilla.kernel.org/show_bug.cgi?id=8672
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Reviewed-by: Alasdair G Kergon <agk@redhat.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Mike Miller <mike.miller@hp.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
  5. 26 Nov, 2008 (2 commits)
  6. 14 Nov, 2008 (6 commits)
  7. 06 Nov, 2008 (3 commits)
    • md: linear: Fix a division by zero bug for very small arrays. · f1cd14ae
      Andre Noll authored
      We currently oops with a divide error on starting a linear software
      raid array consisting of at least two very small (< 500K) devices.
      
      The bug is caused by the calculation of the hash table size which
      tries to compute sector_div(sz, base) with "base" being zero due to
      the small size of the component devices of the array.
      
      Fix this by requiring the hash spacing to be at least one, which
      implies that "base" is also non-zero.
      
      This bug has existed since about 2.6.14.
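      The guard amounts to something like this sketch (variable and
      function names are hypothetical; only sector_div() is the real
      kernel helper):

          static unsigned int hash_table_entries(sector_t array_sectors,
                                                 u32 spacing)
          {
              sector_t sz = array_sectors;
              u32 remainder;

              if (spacing < 1)
                  spacing = 1;   /* keeps the divisor below non-zero */

              remainder = sector_div(sz, spacing); /* sz becomes sz / spacing */
              return sz + (remainder ? 1 : 0);
          }
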
      
      Cc: stable@kernel.org
      Signed-off-by: Andre Noll <maan@systemlinux.org>
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: fix bug in raid10 recovery. · a53a6c85
      NeilBrown authored
      Adding a spare to a raid10 doesn't cause recovery to start.
      This is due to a silly typo in
        commit 6c2fce2e
      and so is a bug in 2.6.27 and 2.6.28-rc.
      
      Thanks to Thomas Backlund for bisecting to find this.
      
      Cc: Thomas Backlund <tmb@mandriva.org>
      Cc: stable@kernel.org
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: revert the recent addition of a call to the BLKRRPART ioctl. · cb3ac42b
      NeilBrown authored
      It turns out that it is only safe to call blkdev_ioctl when the device
      is actually open (as ->bd_disk is set to NULL on last close).  And it
      is quite possible for do_md_stop to be called when the device is not
      open.  So discard the call to blkdev_ioctl(BLKRRPART) which was
      added in
         commit 934d9c23
      
      It is just as easy to call this ioctl from userspace when needed (on
      mdadm -S), so leave it out of the kernel.
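      The userspace side is straightforward; a minimal sketch (the helper
      name is hypothetical, BLKRRPART itself comes from <linux/fs.h>):

          #include <fcntl.h>
          #include <linux/fs.h>      /* BLKRRPART */
          #include <stdio.h>
          #include <sys/ioctl.h>
          #include <unistd.h>

          /* Ask the kernel to re-read dev's partition table, e.g. after
           * "mdadm -S" has stopped the array behind it. */
          static int reread_partitions(const char *dev)
          {
              int fd = open(dev, O_RDONLY);

              if (fd < 0) {
                  perror("open");
                  return -1;
              }
              if (ioctl(fd, BLKRRPART) < 0)
                  perror("ioctl(BLKRRPART)");
              close(fd);
              return 0;
          }
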
      Signed-off-by: NeilBrown <neilb@suse.de>
  8. 30 Oct, 2008 (3 commits)
    • dm snapshot: wait for chunks in destructor · 879129d2
      Mikulas Patocka authored
      If there are several snapshots sharing an origin and one is removed
      while the origin is being written to, the snapshot's mempool may get
      deleted while elements are still referenced.
      
      Prior to dm-snapshot-use-per-device-mempools.patch the pending
      exceptions may still have been referenced after the snapshot was
      destroyed, but this was not a problem because the shared mempool
      was still there.
      
      This patch fixes the problem by tracking the number of mempool
      elements in use; a sketch of the idea follows the scenario below.
      
      The scenario:
      - You have an origin and two snapshots 1 and 2.
      - Someone writes to the origin.
      - It creates two exceptions in the snapshots; the exception in
      snapshot 1 will be the primary exception, and snapshot 2's
      pending_exception->primary_pe will point to the exception in snapshot 1.
      - The exceptions are being relocated; relocation of exception 1 finishes
      (but its pending_exception is still allocated, because it is referenced
      by an exception from snapshot 2).
      - The user lvremoves snapshot 1 --- this calls just suspend (which does
      nothing) and the destructor. md->pending is zero (there is no I/O
      submitted to the snapshot by the md layer), so it won't help us.
      - The destructor waits for kcopyd jobs to finish on snapshot 1 --- but
      there are none.
      - The destructor on snapshot 1 cleans up everything.
      - The relocation of the exception on snapshot 2 finishes and drops its
      reference on primary_pe. This frees the primary_pe pointer, which points
      to the pending exception created for snapshot 1, so it frees memory into
      a non-existing mempool.
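      The tracking idea, sketched (field and function names are modeled on
      dm-snap but are not guaranteed to match the patch hunk exactly):

          struct dm_snapshot {
              mempool_t *pending_pool;
              atomic_t pending_exceptions_count;
              /* ... */
          };

          struct dm_snap_pending_exception {
              struct dm_snapshot *snap;
              /* ... */
          };

          static struct dm_snap_pending_exception *
          alloc_pending_exception(struct dm_snapshot *s)
          {
              struct dm_snap_pending_exception *pe =
                  mempool_alloc(s->pending_pool, GFP_NOIO);

              atomic_inc(&s->pending_exceptions_count);
              pe->snap = s;
              return pe;
          }

          static void free_pending_exception(struct dm_snap_pending_exception *pe)
          {
              struct dm_snapshot *s = pe->snap;

              mempool_free(pe, s->pending_pool);
              smp_mb__before_atomic_dec();  /* free must be visible first */
              atomic_dec(&s->pending_exceptions_count);
          }

          static void snapshot_dtr(struct dm_target *ti)
          {
              struct dm_snapshot *s = ti->private;

              /* wait for exceptions still referenced by other snapshots */
              while (atomic_read(&s->pending_exceptions_count))
                  msleep(1);

              mempool_destroy(s->pending_pool);
              /* ... remaining teardown ... */
          }
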
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm snapshot: fix register_snapshot deadlock · 60c856c8
      Mikulas Patocka authored
      register_snapshot() performs a GFP_KERNEL allocation while holding
      _origins_lock for write, but that could write out dirty pages onto a
      device that attempts to acquire _origins_lock for read, resulting in
      deadlock.
      
      So move the allocation up before taking the lock.
      
      This path is not performance-critical, so it doesn't matter that we
      allocate memory and free it if we find that we won't need it.
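      The pattern, sketched (the lookup/insert helpers and the struct
      layout are simplified, not the exact patch):

          static int register_snapshot(struct dm_snapshot *snap)
          {
              struct origin *o, *new_o;
              struct block_device *bdev = snap->origin->bdev;

              /* GFP_KERNEL may write out dirty pages; do it unlocked */
              new_o = kmalloc(sizeof(*new_o), GFP_KERNEL);
              if (!new_o)
                  return -ENOMEM;

              down_write(&_origins_lock);
              o = __lookup_origin(bdev);
              if (o) {
                  kfree(new_o);   /* already registered; allocation unused */
              } else {
                  o = new_o;
                  o->bdev = bdev;
                  INIT_LIST_HEAD(&o->snapshots);
                  __insert_origin(o);
              }
              list_add_tail(&snap->list, &o->snapshots);
              up_write(&_origins_lock);
              return 0;
          }
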
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    • dm raid1: fix do_failures · b34578a4
      Ilpo Jarvinen authored
      Missing braces.  Commit 1f965b19 (dm raid1: separate region_hash interface
      part1) broke it.
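      A generic illustration (hypothetical names, not the actual hunk) of
      how a refactor that drops braces changes control flow:

          /* intended */
          while (!list_empty(&failures)) {
              bio = pop_failure(&failures);
              handle_failed_bio(bio);
          }

          /* with the braces lost, only the first statement is in the
           * loop; handle_failed_bio() runs once, after the loop ends */
          while (!list_empty(&failures))
              bio = pop_failure(&failures);
              handle_failed_bio(bio);
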
      Signed-off-by: Ilpo Jarvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      Cc: Heinz Mauelshagen <hjm@redhat.com>
  9. 28 Oct, 2008 (1 commit)
    • md: destroy partitions and notify udev when md array is stopped. · 934d9c23
      NeilBrown authored
      md arrays are not currently destroyed when they are stopped - they
      remain in /sys/block.  Last time I tried this I tripped over locking
      too much.
      
      A consequence of this is that udev doesn't remove anything from /dev.
      This is rather ugly.
      
      As an interim measure until proper device removal can be achieved,
      make sure all partitions are removed using the BLKRRPART ioctl, and
      send a KOBJ_CHANGE when an md array is stopped.
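      The notification half, roughly sketched (the helper name is
      hypothetical; kobject_uevent(), disk_to_dev() and KOBJ_CHANGE are
      the real kernel interfaces):

          /* After do_md_stop() has torn the array down: partitions are
           * dropped via the BLKRRPART ioctl path, then udev is poked so
           * it re-evaluates the device nodes. */
          static void md_notify_stopped(struct gendisk *disk)
          {
              kobject_uevent(&disk_to_dev(disk)->kobj, KOBJ_CHANGE);
          }
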
      Signed-off-by: NeilBrown <neilb@suse.de>
  10. 23 Oct, 2008 (1 commit)
  11. 22 Oct, 2008 (3 commits)