1. 27 Nov 2012, 1 commit
    • md/raid1{,0}: fix deadlock in bitmap_unplug. · 874807a8
      NeilBrown authored
      If the raid1 or raid10 unplug function gets called
      from a make_request function (which is very possible) while
      there are bios on the current->bio_list list, it cannot
      safely call bitmap_unplug(): that call may need to submit
      more bios and wait for them to complete, and they won't
      complete while current->bio_list is non-empty.
      
      So detect that case and hand the unplugging off to another thread,
      just as we already do when called from within the scheduler.
      
      The RAID1 version of the bug was introduced in 3.6, so that part of the
      fix is suitable for 3.6.y.  The RAID10 part won't apply.
      
      Cc: stable@vger.kernel.org
      Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
      Reported-by: Peter Maloney <peter.maloney@brockmann-consult.de>
      Signed-off-by: NeilBrown <neilb@suse.de>
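      
      A minimal C sketch of the deferral pattern described above, using invented
      names (this is not the actual md patch): if the unplug callback runs while
      the caller still holds queued bios, the bitmap flush is handed to a worker
      instead of being done inline.
      
          #include <stdbool.h>
          #include <stddef.h>
          
          struct bio_list { int queued; };             /* stand-in for current->bio_list */
          
          static void flush_bitmap_inline(void)   { }  /* would submit bios and wait */
          static void defer_flush_to_worker(void) { }  /* lets another thread flush later */
          
          static void unplug(struct bio_list *cur_list, bool from_schedule)
          {
              if (from_schedule || (cur_list && cur_list->queued)) {
                  /* Waiting here would deadlock: the queued bios cannot
                   * complete until we return, so hand the work off. */
                  defer_flush_to_worker();
                  return;
              }
              flush_bitmap_inline();    /* safe: nothing is blocked behind us */
          }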
  2. 22 Nov 2012, 4 commits
    • md/raid10: decrement correct pending counter when writing to replacement. · 884162df
      NeilBrown authored
      When a write to a replacement device completes, the code carefully
      and correctly finds the rdev that the write actually went to,
      and then blithely calls rdev_dec_pending on the primary rdev,
      even if the write was to the replacement.
      
      This means that any writes to an array while a replacement
      was ongoing would cause the nr_pending count for the primary
      device to go negative, so it could never be removed.
      
      This bug has been present since replacement was introduced in
      3.3, so it is suitable for any -stable kernel since then.
      Reported-by: "George Spelvin" <linux@horizon.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NeilBrown <neilb@suse.de>
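      
      A toy model of the accounting fix, using invented names rather than the
      real md structures: the completion path must decrement the pending count
      of whichever device the write was actually issued to.
      
          #include <stdbool.h>
          
          struct dev    { int nr_pending; };
          struct mirror { struct dev rdev, replacement; };
          
          /* 'to_replacement' records which device the bio was sent to. */
          static void write_done(struct mirror *m, bool to_replacement)
          {
              struct dev *d = to_replacement ? &m->replacement : &m->rdev;
          
              d->nr_pending--;   /* before the fix, m->rdev was always decremented,
                                  * driving its count negative during replacement */
          }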
    • md/raid10: close race that loses writes when a replacement completes. · e7c0c3fa
      NeilBrown authored
      When a replacement operation completes there is a small window
      during which the original device is marked 'faulty' and the replacement
      still looks like a replacement.  The faulty device should be removed and
      the replacement moved into place very quickly, but it isn't instant.
      
      So the code that writes out to the array must handle the possibility that
      the only working device for some slot is the replacement - but it
      doesn't.  If the primary device is faulty it just gives up.  This
      can lead to corruption.
      
      So make the code more robust: if either the primary or the
      replacement is present and working, write to them.  Only when
      neither is present do we give up.
      
      This bug has been present since replacement was introduced in
      3.3, so it is suitable for any -stable kernel since then.
      Reported-by: "George Spelvin" <linux@horizon.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NeilBrown <neilb@suse.de>
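      
      A simplified sketch of the more robust target selection, again with
      invented names: consider the primary and the replacement independently,
      and fail the write only when neither is usable.
      
          #include <stdbool.h>
          #include <stddef.h>
          
          struct dev  { bool present, faulty; };
          struct slot { struct dev *primary, *replacement; };
          
          static bool usable(const struct dev *d)
          {
              return d && d->present && !d->faulty;
          }
          
          static int targets_for_write(const struct slot *s, const struct dev *out[2])
          {
              int n = 0;
          
              if (usable(s->primary))
                  out[n++] = s->primary;       /* old logic gave up if this failed */
              if (usable(s->replacement))
                  out[n++] = s->replacement;   /* now considered in its own right */
              return n;                        /* 0 means the write really must fail */
          }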
    • md/raid5: Make sure we clear R5_Discard when discard is finished. · ca64cae9
      NeilBrown authored
      commit 9e444768
          MD: raid5 avoid unnecessary zero page for trim
      
      changed raid5 to clear R5_Discard when the complete request is
      handled rather than when submitting the per-device discard request.
      However it did not clear R5_Discard for the parity device.
      
      This means that if the stripe_head was reused before it expired from
      the cache, the setting would be wrong and a hang would result.
      
      Also if the R5_Uptodate bit happens to be set, R5_Discard again
      won't be cleared.  But R5_Uptodate really should be clear at this point.
      
      So make sure R5_Discard is cleared in all cases, and clear
      R5_Uptodate when a 'discard' completes.
      Signed-off-by: NeilBrown <neilb@suse.de>
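      
      An illustrative sketch of the flag handling (invented flag and function
      names, not the raid5 code): when the discard completes, clear the discard
      flag on every device in the stripe, parity included, and drop the
      up-to-date flag as well.
      
          #define F_DISCARD  (1u << 0)   /* stands in for R5_Discard */
          #define F_UPTODATE (1u << 1)   /* stands in for R5_Uptodate */
          
          struct stripe_dev { unsigned int flags; };
          
          static void discard_complete(struct stripe_dev *devs, int ndevs)
          {
              for (int i = 0; i < ndevs; i++) {
                  devs[i].flags &= ~F_DISCARD;    /* previously missed on the parity device */
                  devs[i].flags &= ~F_UPTODATE;   /* discarded data must not look valid */
              }
          }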
    • md/raid5: move resolving of reconstruct_state earlier in stripe_handle. · ef5b7c69
      NeilBrown authored
      
      The chunk of code in handle_stripe() which responds to a
      *_result value in reconstruct_state is really the completion
      of some processing that happened outside of handle_stripe
      (possibly asynchronously) and so should be one of the first
      things done in handle_stripe().
      
      After the next patch it will be important that it happens before
      handle_stripe_clean_event(), as that will clear some dev->flags
      bit that this code tests.
      Signed-off-by: NeilBrown <neilb@suse.de>
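      
      A rough ordering sketch with invented helper names, not the real
      handle_stripe(): the completed-reconstruction bookkeeping runs first,
      before the clean-up step that clears the flags it depends on.
      
          enum rstate { RS_IDLE, RS_RESULT };   /* stand-in for reconstruct_state */
          
          struct stripe { enum rstate reconstruct_state; };
          
          static void resolve_reconstruct(struct stripe *sh) { sh->reconstruct_state = RS_IDLE; }
          static void clean_event(struct stripe *sh)         { (void)sh; /* clears dev flags */ }
          
          static void handle_stripe(struct stripe *sh)
          {
              if (sh->reconstruct_state == RS_RESULT)
                  resolve_reconstruct(sh);   /* moved earlier: must run before... */
          
              clean_event(sh);               /* ...this, which clears flags it tests */
          }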
  3. 20 Nov 2012, 4 commits
  4. 31 Oct 2012, 2 commits
    • MD RAID10: Fix oops when creating RAID10 arrays via dm-raid.c · ed30be07
      Jonathan Brassow authored
      Commit 2863b9eb didn't take into account the changes to add TRIM support to
      RAID10 (commit 532a2a3f).  That is, when using dm-raid.c to create the
      RAID10 arrays, there is no mddev->gendisk or mddev->queue.  The code added
      to support TRIM simply assumes that mddev->queue is available without
      checking.  The result is an oops any time dm-raid.c attempts to create a
      RAID10 device.
      Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
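      
      A minimal sketch of the missing guard, with invented names: arrays built
      through dm-raid have no request queue, so the TRIM/discard setup must be
      skipped when the queue is absent.
      
          #include <stdbool.h>
          #include <stddef.h>
          
          struct queue { bool discard_enabled; };
          struct mddev { struct queue *queue; };   /* NULL when created via dm-raid */
          
          static void enable_discard_if_possible(struct mddev *mddev)
          {
              if (!mddev->queue)
                  return;                          /* this check was missing: oops */
              mddev->queue->discard_enabled = true;
          }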
    • md/raid1: Fix assembling of arrays containing Replacements. · 02b898f2
      NeilBrown authored
      setup_conf() in raid1.c uses conf->raid_disks before assigning it
      a value.  It is used when including 'Replacement' devices.
      
      The consequence is that assembling an array which contains a
      replacement will misbehave and either not include the replacement, or
      not include the device being replaced.
      
      Though this doesn't lead directly to data corruption, it could lead to
      reduced data safety.
      
      So use mddev->raid_disks, which is initialised, instead.
      
      The bug was introduced by commit c19d5798
            md/raid1: recognise replacements when assembling arrays.
      
      in 3.3, so the fix is suitable for 3.3.y through 3.6.y.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NeilBrown <neilb@suse.de>
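      
      A toy version of the use-before-initialisation bug, with invented field
      names: size things from the already-initialised mddev value, not from the
      conf structure that is still being built.
      
          struct conf { int raid_disks; int nslots; };
          
          static void setup_conf(struct conf *conf, int mddev_raid_disks)
          {
              /* Buggy pattern: conf->raid_disks is still unset at this point.
               *     conf->nslots = conf->raid_disks * 2;
               * Fixed pattern: use the initialised, caller-supplied value. */
              conf->nslots     = mddev_raid_disks * 2;   /* room for replacements */
              conf->raid_disks = mddev_raid_disks;
          }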
  5. 22 Oct 2012, 1 commit
    • md faulty: use disk_stack_limits() · 0be1fecd
      Eric Sandeen authored
      As of commit fe86cdce
          block: do not artificially constrain max_sectors for stacking drivers
      
      max_sectors defaults to UINT_MAX.  md faulty wasn't using
      disk_stack_limits(), so it inherited this large value as well.
      This triggered a bug in XFS when stressed over md_faulty, when
      a very large bio_alloc() failed.
      
      That was on an older kernel, and I can't reproduce exactly the
      same thing upstream, but I think the fix is appropriate in any
      case.
      
      Thanks to Mike Snitzer for pointing out the problem.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
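      
      A simplified, user-space illustration of what stacking the limits achieves
      (invented names; the real fix simply calls disk_stack_limits()): the
      stacked device takes the stricter of its own and the underlying device's
      limits instead of keeping the huge default.
      
          struct limits { unsigned int max_sectors; };
          
          #define LIMIT_DEFAULT 0xffffffffu        /* analogous to the UINT_MAX default */
          
          /* What stacking accomplishes conceptually: keep the stricter value. */
          static void stack_limits(struct limits *top, const struct limits *lower)
          {
              if (lower->max_sectors < top->max_sectors)
                  top->max_sectors = lower->max_sectors;
          }
          
          static void faulty_setup(struct limits *top, const struct limits *lower)
          {
              top->max_sectors = LIMIT_DEFAULT;    /* what md faulty was left with */
              stack_limits(top, lower);            /* the fix: inherit sane limits */
          }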
  6. 13 Oct 2012, 5 commits
  7. 12 Oct 2012, 12 commits
  8. 11 Oct 2012, 11 commits