1. 28 Jun, 2008 8 commits
    • Close race in md_probe · f48ed538
      Committed by Neil Brown
      There is a possible race in md_probe.  If two threads call md_probe
      for the same device, then one could exit (having checked that
      ->gendisk exists) before the other has called kobject_init_and_add,
      thus returning an incomplete kobj which will cause problems when
      we try to add children to it.
      
      So extend the range of protection of disks_mutex slightly to
      avoid this possibility.
      Signed-off-by: Neil Brown <neilb@suse.de>
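The locking change in this commit can be sketched in user space. This is a minimal illustration using pthreads, not the kernel code: `toy_mddev`, `toy_md_probe` and the two fields are hypothetical stand-ins, and only the name `disks_mutex` comes from the commit message. The point is that the existence check and the complete initialization sit inside one critical section, so a second caller can never see a half-built object.

```c
#include <pthread.h>

/* Hypothetical stand-in for an md device; not the kernel's structure. */
struct toy_mddev {
    int has_disk;    /* stands in for ->gendisk being set */
    int kobj_ready;  /* stands in for kobject_init_and_add() having run */
};

static pthread_mutex_t disks_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Returns 1 if this call performed the setup, 0 if it was already done. */
int toy_md_probe(struct toy_mddev *dev)
{
    int created = 0;

    pthread_mutex_lock(&disks_mutex);
    if (!dev->has_disk) {
        dev->has_disk = 1;
        /* Still under the lock: no other caller can observe
         * has_disk == 1 while kobj_ready == 0. */
        dev->kobj_ready = 1;
        created = 1;
    }
    pthread_mutex_unlock(&disks_mutex);
    return created;
}
```

If the mutex were released right after setting `has_disk`, a second thread could return early with `kobj_ready` still 0, which is the window the commit describes.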
    • Allow setting start point for requested check/repair · 5e96ee65
      Committed by Neil Brown
      This makes it possible to just resync a small part of an array.
      e.g. if a drive reports that it has questionable sectors,
      a 'repair' of just the region covering those sectors will
      cause them to be read and, if there is an error, re-written
      with correct data.
      Signed-off-by: Neil Brown <neilb@suse.de>
    • Improve setting of "events_cleared" for write-intent bitmaps. · a0da84f3
      Committed by Neil Brown
      When an array is degraded, bits in the write-intent bitmap are not
      cleared, so that if the missing device is re-added, it can be synced
      by updating only those parts of the device that have changed since
      it was removed.
      
      To enable this, an 'events_cleared' value is stored. It is the event
      counter for the array the last time that any bits were cleared.
      
      Sometimes - if a device disappears from an array while it is 'clean' -
      the events_cleared value gets updated incorrectly (there are subtle
      ordering issues between updating events in the main metadata and the
      bitmap metadata) resulting in the missing device appearing to require
      a full resync when it is re-added.
      
      With this patch, we update events_cleared precisely when we are about
      to clear a bit in the bitmap.  We record events_cleared when we clear
      the bit internally, and copy that to the superblock which is written
      out before the bit on storage.  This makes it more "obviously correct".
      
      We also need to update events_cleared when the event_count is going
      backwards (as happens on a dirty->clean transition of a non-degraded
      array).
      
      Thanks to Mike Snitzer for identifying this problem and testing early
      "fixes".
      
      Cc: "Mike Snitzer" <snitzer@gmail.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
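The ordering this commit describes can be modelled with a small toy. The sketch below is an assumption-laden illustration, not the md bitmap code: `toy_bitmap`, `toy_clear_bit` and `toy_flush` are hypothetical names, and the 32-bit words stand in for the real bitmap pages. It shows `events_cleared` being recorded at the moment a bit is cleared in memory, and the superblock copy being written before the cleared bit reaches storage.

```c
#include <stdint.h>

/* Hypothetical model of a write-intent bitmap and its superblock. */
struct toy_bitmap {
    uint64_t events;            /* current array event counter */
    uint64_t events_cleared;    /* in memory: events at the last bit clear */
    uint64_t sb_events_cleared; /* copy held in the on-storage superblock */
    uint32_t mem_bits;          /* dirty bits in memory */
    uint32_t disk_bits;         /* dirty bits as stored */
    int degraded;               /* non-zero while a device is missing */
};

/* Clear one bit (bit < 32) in memory, recording the event count. */
void toy_clear_bit(struct toy_bitmap *b, unsigned bit)
{
    if (b->degraded)
        return; /* bits are not cleared while the array is degraded */
    b->events_cleared = b->events;
    b->mem_bits &= ~(UINT32_C(1) << bit);
}

/* Write out: superblock first, then the bits themselves. */
void toy_flush(struct toy_bitmap *b)
{
    b->sb_events_cleared = b->events_cleared;
    b->disk_bits = b->mem_bits;
}
```

Because the superblock is updated first, storage never holds a cleared bit alongside an older `events_cleared` value.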
    • use bio_endio instead of a call to bi_end_io · 0e13fe23
      Committed by Neil Brown
      Turn calls to bi->bi_end_io() into bio_endio(). Apparently bio_endio does
      exactly the same error processing as is hardcoded at these places.
      
      bio_endio() avoids recursion (or will soon), so it should be used.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
    • linear: correct disk numbering error check · 13864515
      Committed by Nikanth Karthikesan
      From: "Nikanth Karthikesan" <knikanth@novell.com>
      
      Correct disk numbering problem check.
      Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
      Signed-off-by: Neil Brown <neilb@suse.de>
    • Fix error paths if md_probe fails. · 9bbbca3a
      Committed by Neil Brown
      md_probe can fail (e.g. alloc_disk could fail) without
      returning an error (as it always returns NULL).
      So when we call mddev_find immediately afterwards, we need
      to check that md_probe actually succeeded.  This means checking
      that mddev->gendisk is non-NULL.
      
      Cc: <stable@kernel.org>
      Cc: Dave Jones <davej@redhat.com>
      Signed-off-by: Neil Brown <neilb@suse.de>
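The pattern in this commit, a probe routine whose return value carries no error information, can be sketched as follows. Everything here is hypothetical (`toy_md`, `toy_md_probe_may_fail`, `toy_md_open`, the `alloc_fails` switch and the `-1` error value); it only illustrates why the caller has to inspect the `gendisk` pointer itself.

```c
#include <stddef.h>

/* Hypothetical device record; only the field name gendisk mirrors the text. */
struct toy_md {
    void *gendisk;
};

static int toy_disk; /* dummy object for gendisk to point at */

/* Always returns NULL, whether or not the allocation worked. */
void *toy_md_probe_may_fail(struct toy_md *m, int alloc_fails)
{
    if (!alloc_fails)
        m->gendisk = &toy_disk;
    return NULL;
}

/* Caller: success can only be detected through the side effect. */
int toy_md_open(struct toy_md *m, int alloc_fails)
{
    (void)toy_md_probe_may_fail(m, alloc_fails);
    if (m->gendisk == NULL)
        return -1; /* illustrative error value, not the kernel's */
    return 0;
}
```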
    • Don't acknowledge that stripe-expand is complete until it really is. · efe31143
      Committed by Neil Brown
      We shouldn't acknowledge that a stripe has been expanded (when
      reshaping a raid5 by adding a device) until the moved data has
      actually been written out.  However we are currently
      acknowledging (by calling md_done_sync) when the POST_XOR
      is complete and before the write.
      
      So track in s.locked whether there are pending writes, and don't
      call md_done_sync yet if there are.
      
      Note: we also set R5_LOCKED on devices which we are about to
      read from.  This probably isn't technically necessary, but is
      usually done when writing a block, and justifies the use of
      s.locked here.
      
      This bug can lead to a crash if an array is stopped while a reshape
      is in progress.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Neil Brown <neilb@suse.de>
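The completion rule in this commit reduces to one condition, sketched below. The struct and function are hypothetical; only the field name `locked` mirrors `s.locked` from the message, and `post_xor_done` stands in for the POST_XOR step having finished.

```c
/* Hypothetical per-stripe summary used only for this illustration. */
struct toy_stripe_summary {
    int locked;        /* number of blocks with I/O still pending */
    int post_xor_done; /* 1 once the parity computation has finished */
};

/* Return 1 only when it is safe to report the stripe as expanded. */
int toy_can_ack_expand(const struct toy_stripe_summary *s)
{
    /* Finishing the XOR is not enough: pending writes must drain first. */
    return s->post_xor_done && s->locked == 0;
}
```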
    • Ensure interrupted recovery completed properly (v1 metadata plus bitmap) · 8c2e870a
      Committed by Neil Brown
      If, while assembling an array, we find a device which is not fully
      in-sync with the array, it is important to set the "fullsync" flag.
      This is an exact analog to the setting of this flag in hot_add_disk
      methods.
      
      Currently, only v1.x metadata supports having devices in an array
      which are not fully in-sync (it keeps track of how in-sync they are).
      The 'fullsync' flag only makes a difference when a write-intent bitmap
      is being used.  In this case it tells recovery to ignore the bitmap
      and recover all blocks.
      
      This fix is already in place for raid1, but not raid5/6 or raid10.
      
      So without this fix, a raid1 or raid4/5/6 array with version 1.x
      metadata and a write-intent bitmap, that is stopped in the middle
      of a recovery, will appear to complete the recovery instantly
      after it is reassembled, but the recovery will not be correct.
      
      If you might have an array like that, issuing
         echo repair > /sys/block/mdXX/md/sync_action
      
      will make sure recovery completes properly.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Neil Brown <neilb@suse.de>
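The effect of the 'fullsync' flag described in this commit can be illustrated with a toy block counter. This is a sketch under stated assumptions, not md code: `toy_blocks_to_recover` is a hypothetical helper, and a single 32-bit word stands in for the write-intent bitmap.

```c
#include <stdint.h>

/* Count the blocks a recovery pass would rewrite.
 * dirty_bits: toy write-intent bitmap, one bit per block (at most 32 blocks).
 * fullsync:   when non-zero, the bitmap is ignored and every block counts. */
unsigned toy_blocks_to_recover(uint32_t dirty_bits, unsigned nblocks, int fullsync)
{
    unsigned count = 0;

    for (unsigned i = 0; i < nblocks && i < 32; i++) {
        if (fullsync || ((dirty_bits >> i) & 1u))
            count++;
    }
    return count;
}
```

With an empty bitmap and `fullsync` unset the count is zero, which corresponds to the recovery that appears to finish instantly; setting `fullsync` makes the same call cover every block.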
  2. 25 Jun, 2008 17 commits
  3. 24 Jun, 2008 15 commits