1. 18 Apr, 2011 4 commits
    • md: provide generic support for handling unplug callbacks. · 97658cdd
      NeilBrown committed
      When an md device adds a request to a queue, it can call
      mddev_check_plugged.
      If this succeeds then we know that the md thread will be woken up
      shortly, and ->plug_cnt will be non-zero until then, so some
      processing can be delayed.
      
      If it fails, then no unplug callback is expected and the make_request
      function needs to do whatever is required to make the request happen.
      Signed-off-by: NeilBrown <neilb@suse.de>
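The succeed/fail contract described in this commit message can be sketched in userspace C. This is only a model under stated assumptions, not the kernel code: every name here (`model_mddev`, `model_check_plugged`, `model_unplug`, and so on) is hypothetical, and the "md thread" is reduced to a function call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the md device state. */
struct model_mddev {
    bool plug_active;   /* is the submitter currently holding a plug? */
    int plug_cnt;       /* unplug callbacks still outstanding */
    int pending;        /* requests queued but not yet handled */
    int handled;        /* requests the "md thread" has processed */
};

/* Stand-in for the md thread: handle everything that is pending. */
static void model_run_thread(struct model_mddev *m)
{
    m->handled += m->pending;
    m->pending = 0;
}

/* Returns true if an unplug callback is now expected (assumed semantics). */
static bool model_check_plugged(struct model_mddev *m)
{
    if (!m->plug_active)
        return false;
    if (m->plug_cnt == 0)
        m->plug_cnt++;          /* register one callback per plug */
    return true;
}

static void model_md_request(struct model_mddev *m)
{
    m->pending++;
    if (!model_check_plugged(m))
        model_run_thread(m);    /* no unplug is coming, so act now */
}

/* The submitter releases its plug: the callback wakes the thread. */
static void model_unplug(struct model_mddev *m)
{
    m->plug_active = false;
    if (m->plug_cnt > 0) {
        m->plug_cnt--;
        model_run_thread(m);
    }
}
```

While the plug is held, requests accumulate and `plug_cnt` stays non-zero; once it is released, or when no plug exists at all, the work is done immediately.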
    • md - remove old plugging code. · 482c0834
      NeilBrown committed
      md has some plugging infrastructure for RAID5 to use because the
      normal plugging infrastructure required a 'request_queue', and when
      called from dm, RAID5 doesn't have one of those available.
      
      This relied on the ->unplug_fn callback which doesn't exist any more.
      
      So remove all of that code, both in md and raid5.  Subsequent patches
      will restore the plugging functionality.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/dm - remove remains of plug_fn callback. · af1db72d
      NeilBrown committed
      Now that unplugging is done differently, the unplug_fn callback is
      never called, so it can be completely discarded.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: use new plugging interface for RAID IO. · e1dfa0a2
      NeilBrown committed
      md/raid submits a lot of IO from the various raid threads.
      So add start/finish plug calls to those threads so that some
      plugging happens.
      Signed-off-by: NeilBrown <neilb@suse.de>
  2. 06 Apr, 2011 1 commit
    • dm: improve block integrity support · a63a5cf8
      Mike Snitzer committed
      The current block integrity (DIF/DIX) support in DM is verifying that
      all devices' integrity profiles match during DM device resume (which
      is past the point of no return).  To some degree that is unavoidable
      (stacked DM devices force this late checking).  But for most DM
      devices (which aren't stacking on other DM devices) the ideal time to
      verify all integrity profiles match is during table load.
      
      Introduce the notion of an "initialized" integrity profile: a profile
      that was blk_integrity_register()'d with a non-NULL 'blk_integrity'
      template.  Add blk_integrity_is_initialized() to allow checking if a
      profile was initialized.
      
      Update DM integrity support to:
      - check all devices with _initialized_ integrity profiles match
        during table load; uninitialized profiles (e.g. for underlying DM
        device(s) of a stacked DM device) are ignored.
      - disallow a table load that would result in an integrity profile that
        conflicts with a DM device's existing (in-use) integrity profile
      - avoid clearing an existing integrity profile
      - validate all integrity profiles match during resume; but if they
        don't, all we can do is report the mismatch (during resume we're past
        the point of no return)
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  3. 31 Mar, 2011 1 commit
  4. 29 Mar, 2011 1 commit
  5. 24 Mar, 2011 10 commits
  6. 22 Mar, 2011 1 commit
  7. 17 Mar, 2011 1 commit
  8. 10 Mar, 2011 2 commits
    • block: kill off REQ_UNPLUG · 721a9602
      Jens Axboe committed
      With the plugging now being explicitly controlled by the
      submitter, callers need not pass down unplugging hints
      to the block layer. If they want to unplug, it's because they
      manually plugged on their own - in which case, they should just
      unplug at will.
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    • block: remove per-queue plugging · 7eaceacc
      Jens Axboe committed
      Code has been converted over to the new explicit on-stack plugging,
      and delay users have been converted to use the new API for that.
      So let's kill off the old plugging along with aops->sync_page().
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
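The idea of explicit on-stack plugging, where the submitter holds back its own requests and releases them as one batch, can be illustrated with a small userspace model. This is a sketch under assumptions rather than the block layer's implementation: `model_plug`, `model_queue`, and the `model_*` functions are hypothetical names, and a request is reduced to an `int`.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PLUG_MAX 16

/* Hypothetical per-submitter plug that lives on the caller's stack. */
struct model_plug {
    int held[MODEL_PLUG_MAX];   /* requests held back so far */
    size_t count;
};

/* Stand-in for the device queue: it only records what was dispatched. */
struct model_queue {
    int dispatched;   /* total requests sent to the device */
    int batches;      /* number of dispatch rounds */
};

static void model_dispatch(struct model_queue *q, size_t n)
{
    if (n == 0)
        return;
    q->dispatched += (int)n;
    q->batches++;
}

static void model_start_plug(struct model_plug *p)
{
    p->count = 0;
}

/* With a plug, hold the request back; without one, dispatch at once. */
static void model_submit(struct model_queue *q, struct model_plug *p, int req)
{
    if (p == NULL) {
        model_dispatch(q, 1);
        return;
    }
    if (p->count == MODEL_PLUG_MAX) {   /* plug full: flush early */
        model_dispatch(q, p->count);
        p->count = 0;
    }
    p->held[p->count++] = req;
}

/* The submitter unplugs: everything held goes out as one batch. */
static void model_finish_plug(struct model_queue *q, struct model_plug *p)
{
    model_dispatch(q, p->count);
    p->count = 0;
}
```

Because the plug belongs to the submitter and not to the queue, no unplug hint has to be passed down with each request: the caller that started the plug decides when to finish it.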
  9. 04 Mar, 2011 1 commit
  10. 24 Feb, 2011 1 commit
    • md: Fix - again - partition detection when array becomes active · f0b4f7e2
      NeilBrown committed
      Revert
          b821eaa5
      and
          f3b99be1
      
      When I wrote the first of these I had a wrong idea about the
      lifetime of 'struct block_device'.  It can disappear at any time that
      the block device is not open if it falls out of the inode cache.
      
      So relying on the 'size' recorded with it to detect when the
      device size has changed and so we need to revalidate, is wrong.
      
      Rather, we really do need the 'changed' attribute stored directly in
      the mddev and set/tested as appropriate.
      
      Without this patch, a sequence of:
         mknod / open / close / unlink
      
      (which can cause a block_device to be created and then destroyed)
      will result in a rescan of the partition table and consequent removal
      and addition of partitions.
      Several of these in a row can get udev racing to create and unlink and
      other code can get confused.
      
      With the patch, the rescan is only performed when needed and so there
      are no races.
      
      This is suitable for any stable kernel from 2.6.35.
      Reported-by: "Wojcik, Krzysztof" <krzysztof.wojcik@intel.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: stable@kernel.org
  11. 21 Feb, 2011 1 commit
    • md: avoid spinlock problem in blk_throtl_exit · da9cf505
      NeilBrown committed
      blk_throtl_exit assumes that ->queue_lock still exists,
      so make sure that it does.
      To do this, we stop redirecting ->queue_lock to conf->device_lock
      and leave it pointing where it is initialised - __queue_lock.
      
      As the blk_plug functions check the ->queue_lock is held, we now
      take that spin_lock explicitly around the plug functions.  We don't
      need the locking, just the warning removal.
      
      This is needed for any kernel with the blk_throtl code,
      which is 2.6.37 and later.
      
      Cc: stable@kernel.org
      Signed-off-by: NeilBrown <neilb@suse.de>
  12. 16 Feb, 2011 2 commits
    • md: correctly handle probe of an 'mdp' device. · 8f5f02c4
      NeilBrown committed
      'mdp' devices are md devices with preallocated device numbers
      for partitions. As such it is possible to mknod and open a partition
      before opening the whole device.
      
      This causes md_probe() to be called with a device number of a
      partition, which in turn calls mddev_find with such a number.
      
      However mddev_find expects the number of a 'whole device' and
      does the wrong thing with partition numbers.
      
      So add code to mddev_find to remove the 'partition' part of
      a device number and just work with the 'whole device'.
      
      This patch addresses https://bugzilla.kernel.org/show_bug.cgi?id=28652
      
      Reported-by: hkmaly@bigfoot.com
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: <stable@kernel.org>
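Removing the 'partition' part of a device number amounts to masking off the low bits of the minor number. The sketch below assumes that each whole device reserves a block of 64 minors (a shift of 6); that value, and the names `MODEL_MDP_MINOR_SHIFT` and `model_whole_device_minor`, are illustrative assumptions and are not taken from the commit.

```c
#include <assert.h>

/* Assumed: each whole device owns 2^6 = 64 consecutive minor numbers. */
#define MODEL_MDP_MINOR_SHIFT 6

/* Clear the partition bits so that any partition maps to its whole device. */
static unsigned int model_whole_device_minor(unsigned int minor)
{
    return minor & ~((1u << MODEL_MDP_MINOR_SHIFT) - 1u);
}
```

With this shift, minor 70 (partition 6 of the device starting at minor 64) maps back to 64, and a whole-device minor maps to itself.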
    • md: don't set_capacity before array is active. · cbe6ef1d
      NeilBrown committed
      If the desired size of an array is set (via sysfs) before the array is
      active (which is the normal sequence), we currently call set_capacity
      immediately.
      This means that a subsequent 'open' (as can be caused by some
      udev-triggered program) will notice the new size and try to probe for
      partitions.  However as the array isn't quite ready yet the read will
      fail.  Then when the array is ready, as the size doesn't change again
      we don't try to re-probe.
      
      So when setting array size via sysfs, only call set_capacity if the
      array is already active.
      Signed-off-by: NeilBrown <neilb@suse.de>
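The ordering rule in this commit message can be modelled in a few lines of userspace C. This is a hedged sketch: `model_array` and its functions are hypothetical, and the assumption that activation itself publishes the stored size is mine, not something the message states.

```c
#include <assert.h>

/* Hypothetical model: the size requested via sysfs vs. the size openers see. */
struct model_array {
    int active;
    unsigned long long array_sectors;      /* size requested via sysfs */
    unsigned long long visible_capacity;   /* what set_capacity published */
};

/* Store the requested size; publish it only if the array is already active. */
static void model_array_size_store(struct model_array *a,
                                   unsigned long long sectors)
{
    a->array_sectors = sectors;
    if (a->active)
        a->visible_capacity = sectors;
}

/* Assumed: activation publishes the stored size, so openers see a size
 * change only once the array can actually be read. */
static void model_activate(struct model_array *a)
{
    a->active = 1;
    a->visible_capacity = a->array_sectors;
}
```

In this model an 'open' that arrives between the sysfs write and activation still sees a capacity of zero, so the size change that triggers the partition probe happens only when reads can succeed.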
  13. 14 Feb, 2011 1 commit
  14. 13 Feb, 2011 1 commit
  15. 08 Feb, 2011 2 commits
    • FIX: md: process hangs at wait_barrier after 0->10 takeover · 02214dc5
      Krzysztof Wojcik committed
      The following symptoms were observed:
      1. After a raid0->raid10 takeover operation we have an array with 2
      missing disks.
      When we add a disk for rebuild, the recovery process starts as expected
      but does not finish: it stops at about 90% and the md126_resync process
      hangs in "D" state.
      2. Similar behavior occurs when we have a mounted raid0 array and
      execute a takeover to raid10. When we then try to unmount the array,
      the umount process hangs in "D" state.
      
      In both scenarios the processes hang in the same function,
      wait_barrier in raid10.c.
      The process waits in the macro "wait_event_lock_irq" until the
      "!conf->barrier" condition becomes true.
      In the scenarios above that never happens.
      
      The reason was that at the end of level_store, after calling pers->run,
      we call mddev_resume. This calls pers->quiesce(mddev, 0) which, with
      RAID10, calls lower_barrier.
      However raise_barrier hadn't been called on that 'conf' yet,
      so conf->barrier becomes negative, which is bad.
      
      This patch sets conf->barrier=1 after the takeover
      operation. This prevents the barrier from becoming negative after the
      call to lower_barrier().
      Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
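The counter imbalance behind the hang can be reproduced with a small model. This is a sketch under assumptions: `model_conf` and the `model_*` functions are hypothetical, the barrier is reduced to a bare counter, and the `apply_fix` parameter exists only to compare the behaviour before and after the patch.

```c
#include <assert.h>

/* Hypothetical model of the raid10 conf: only the barrier counter. */
struct model_conf {
    int barrier;
};

static void model_raise_barrier(struct model_conf *c) { c->barrier++; }
static void model_lower_barrier(struct model_conf *c) { c->barrier--; }

/* Takeover creates a fresh conf; the fix starts it with barrier = 1. */
static void model_takeover(struct model_conf *c, int apply_fix)
{
    c->barrier = apply_fix ? 1 : 0;
}

/* Resume un-quiesces the array, which lowers the barrier once. */
static void model_resume(struct model_conf *c)
{
    model_lower_barrier(c);
}

/* wait_barrier sleeps until !conf->barrier, so any non-zero value blocks. */
static int model_wait_barrier_would_block(const struct model_conf *c)
{
    return c->barrier != 0;
}
```

Without the fix the counter ends at -1, and since the wait condition tests for exactly zero, waiters never wake; with the fix the single `lower_barrier` brings it back to 0.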
    • md_make_request: don't touch the bio after calling make_request · e91ece55
      Chris Mason committed
      md_make_request was calling bio_sectors() for part_stat_add
      after it was calling the make_request function.  This is
      bad because the make_request function can free the bio and
      because the bi_size field can change around.
      
      The fix here was suggested by Jens Axboe.  It saves the
      sector count before the make_request call.  I hit this
      with CONFIG_DEBUG_PAGEALLOC turned on while trying to break
      his pretty fusionio card.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
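The pattern of the fix, reading what you need from an object before handing it to code that may free it, can be shown in userspace C. This is an illustrative sketch only: `model_bio`, `model_pers_make_request`, and `model_md_make_request` are hypothetical stand-ins, and the callee here always frees the bio to make the hazard explicit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bio: only its size in bytes. */
struct model_bio {
    unsigned int size_bytes;
};

/* Size in 512-byte sectors. */
static unsigned int model_bio_sectors(const struct model_bio *bio)
{
    return bio->size_bytes >> 9;
}

/* Stand-in for a personality's make_request: it takes ownership of the
 * bio and, in this model, frees it immediately. */
static void model_pers_make_request(struct model_bio *bio)
{
    free(bio);
}

/* Save the sector count first, hand the bio off, then do the accounting
 * from the saved value. The bio is never touched after the hand-off. */
static void model_md_make_request(struct model_bio *bio,
                                  unsigned long *sectors_stat)
{
    unsigned int sectors = model_bio_sectors(bio);

    model_pers_make_request(bio);
    *sectors_stat += sectors;
}
```

Reading the size after the hand-off would be a use-after-free in this model, which is the class of bug a debugging allocator such as CONFIG_DEBUG_PAGEALLOC tends to expose.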
  16. 02 Feb, 2011 1 commit
  17. 31 Jan, 2011 8 commits
    • md: don't clear curr_resync_completed at end of resync. · 7281f812
      NeilBrown committed
      There is no need to set this to zero at this point.  It will be
      set to zero by remove_and_add_spares or at the start of
      md_do_sync at the latest.
      And setting it to zero before MD_RECOVERY_RUNNING is cleared can
      make a 'zero' appear briefly in the 'sync_completed' sysfs attribute
      just as resync is finishing.
      
      So simply remove this setting to zero.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: Don't use remove_and_add_spares to remove failed devices from a read-only array · a8c42c7f
      NeilBrown committed
      remove_and_add_spares is called in two places where the needs really
      are very different.
      remove_and_add_spares should not be called on an array which is about
      to be reshaped as some extra devices might have been manually added
      and that would remove them.  However if the array is 'read-auto',
      that will currently happen, which is bad.
      
      So in the 'ro != 0' case don't call remove_and_add_spares but simply
      remove the failed devices as the comment suggests is needed.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • Add raid1->raid0 takeover support · fc3a08b8
      Krzysztof Wojcik committed
      This patch introduces a raid1 to raid0 takeover operation
      in kernel space.
      Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com>
      Signed-off-by: Neil Brown <neilb@nbeee.brown>
    • md: Remove the AllReserved flag for component devices. · f21e9ff7
      NeilBrown committed
      This flag is not needed and is used badly.
      
      Devices that are included in a native-metadata array are reserved
      exclusively for that array - and currently have AllReserved set.
      They all are bd_claimed for the rdev and so cannot be shared.
      
      Devices that are included in external-metadata arrays can be shared
      among multiple arrays - providing there is no overlap.
      These are bd_claimed for md in general - not for a particular rdev.
      
      When changing the amount of a device that is used in an array we need
      to check for overlap.  This currently includes a check on AllReserved.
      So even without overlap, sharing with an AllReserved device is not
      allowed.
      However the bd_claim usage already precludes sharing with these
      devices, so the test on AllReserved is not needed.  And in fact it is
      wrong.
      
      As this is the only use of AllReserved, simply remove all usage and
      definition of AllReserved.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: don't abort checking spares as soon as one cannot be added. · 50da0840
      NeilBrown committed
      As spares can be added manually before a reshape starts, we need to
      find them all to mark some of them as in_sync.
      
      Previously we would abort looking for spares when we found an
      unallocated spare that could not be added to the array (implying there
      was no room for new spares).  However already-added spares could be
      later in the list, so we need to keep searching.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: fix the test for finding spares in raid5_start_reshape. · 469518a3
      NeilBrown committed
      As spares can be added to the array before the reshape is started,
      we need to find and count them when checking there are enough.
      The array could have been degraded, so we need to check all devices,
      not just those outside of the range of devices in the array before
      the reshape.
      
      So instead of checking the index, check the In_sync flag as that
      reliably tells if the device is a spare for this purpose.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: simplify some 'if' conditionals in raid5_start_reshape. · 87a8dec9
      NeilBrown committed
      There are two consecutive 'if' statements.
      
       if (mddev->delta_disks >= 0)
            ....
       if (mddev->delta_disks > 0)
      
      The code in the second is equally valid if delta_disks == 0, and these
      two statements are the only place that 'added_devices' is used.
      
      So make them a single if statement, make added_devices a local
      variable, and re-indent it all.
      
      No functional change.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: revert change to raid_disks on failure. · de171cb9
      NeilBrown committed
      If we try to update_raid_disks and it fails, we should put
      'delta_disks' back to zero.  This is important because some code,
      such as slot_store, assumes that delta_disks has been validated.
      Signed-off-by: NeilBrown <neilb@suse.de>
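The rollback described here follows a common pattern: set a tentative value, validate it, and restore the neutral value if validation fails. The sketch below is a userspace model under assumptions. `model_reshape_dev`, `model_check_reshape`, and the `max_disks` limit are hypothetical and stand in for whatever check the real code performs.

```c
#include <assert.h>

/* Hypothetical model of the two fields involved. */
struct model_reshape_dev {
    int raid_disks;
    int delta_disks;   /* pending change; must be 0 unless validated */
};

/* Hypothetical validation: the new disk count must stay within a limit. */
static int model_check_reshape(const struct model_reshape_dev *d,
                               int max_disks)
{
    return (d->raid_disks + d->delta_disks <= max_disks) ? 0 : -1;
}

/* Try to change the disk count; on failure put delta_disks back to zero
 * so that later code can trust a non-zero delta as validated. */
static int model_update_raid_disks(struct model_reshape_dev *d,
                                   int raid_disks, int max_disks)
{
    int rv;

    d->delta_disks = raid_disks - d->raid_disks;
    rv = model_check_reshape(d, max_disks);
    if (rv < 0)
        d->delta_disks = 0;
    return rv;
}
```

After a failed update the structure looks exactly as it did before the attempt, which is the invariant the message says code such as slot_store relies on.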
  18. 25 Jan, 2011 1 commit