1. 15 Mar 2016 (2 commits)
  2. 10 Mar 2016 (2 commits)
    • md/raid5: output stripe state for debug · fb3229d5
      Committed by Shaohua Li
      Neil recently fixed an obscure race in break_stripe_batch_list. Debugging would be
      much easier if we knew the stripe state, so this patch prints it (a small sketch of
      the idea follows this entry).
      Signed-off-by: Shaohua Li <shli@fb.com>
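      Outside the kernel, the idea reduces to printing the state bitmask whenever a
      suspicious condition is hit, so the flags involved in a race are visible in the
      log. The structure and helper below are simplified stand-ins invented for this
      sketch, not the driver's code.

```c
/* Minimal sketch, not kernel code: dump a stripe's state bitmask when an
 * unexpected condition is detected, so the set flags can be decoded later. */
#include <stdio.h>

struct stripe_head {
    unsigned long long sector;  /* which stripe */
    unsigned long state;        /* bitmask of STRIPE_* flags */
};

static void report_unexpected_state(const struct stripe_head *sh)
{
    /* the raw hex mask is enough to work out which flags were set */
    fprintf(stderr, "stripe %llu: unexpected state 0x%lx\n",
            sh->sector, sh->state);
}

int main(void)
{
    struct stripe_head sh = { .sector = 1234, .state = 0x803 };
    report_unexpected_state(&sh);
    return 0;
}
```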
    • md/raid5: preserve STRIPE_PREREAD_ACTIVE in break_stripe_batch_list · 550da24f
      Committed by NeilBrown
      break_stripe_batch_list breaks up a batch and copies some flags from
      the batch head to the members, preserving others.
      
      It doesn't preserve or copy STRIPE_PREREAD_ACTIVE.  This is not
      normally a problem as STRIPE_PREREAD_ACTIVE is cleared when a
      stripe_head is added to a batch, and is not set on stripe_heads
      already in a batch.
      
      However there is no locking to ensure one thread doesn't set the flag
      after it has just been cleared in another.  This does occasionally happen.
      
      md/raid5 maintains a count of the number of stripe_heads with
      STRIPE_PREREAD_ACTIVE set: conf->preread_active_stripes.  When
      break_stripe_batch_list inadvertently clears STRIPE_PREREAD_ACTIVE,
      this count becomes incorrect and will never again return to zero.
      
      md/raid5 delays the handling of some stripe_heads until
      preread_active_stripes becomes zero.  So when the above-mentioned race
      happens, those stripe_heads become blocked and never make progress,
      and writes to the array hang.
      
      So: change break_stripe_batch_list to preserve STRIPE_PREREAD_ACTIVE
      in the members of a batch (a simplified sketch of the flag-mask idea
      follows this entry).
      
      URL: https://bugzilla.kernel.org/show_bug.cgi?id=108741
      URL: https://bugzilla.redhat.com/show_bug.cgi?id=1258153
      URL: http://thread.gmane.org/5649C0E9.2030204@zoner.cz
      Reported-by: Martin Svec <martin.svec@zoner.cz> (and others)
      Tested-by: Tom Weber <linux@junkyard.4t2.com>
      Fixes: 1b956f7a ("md/raid5: be more selective about distributing flags across batch.")
      Cc: stable@vger.kernel.org (v4.1 and later)
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
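      To make the fix concrete, here is a standalone sketch of the flag-mask idea
      (not the kernel source; the struct, flag values, and helper name are invented
      for the example): when a batch is broken up, each member keeps a small set of
      per-stripe flags and copies the rest from the batch head, and the fix amounts
      to adding STRIPE_PREREAD_ACTIVE to the preserved set so the
      preread_active_stripes count stays consistent.

```c
/* Simplified sketch, not the md driver: rebuild each batch member's state
 * from a "preserve" mask plus flags copied from the batch head. The bug was
 * that STRIPE_PREREAD_ACTIVE was missing from the preserve mask. */
#include <stdio.h>

enum {
    STRIPE_ACTIVE         = 1u << 0,
    STRIPE_PREREAD_ACTIVE = 1u << 1,  /* counted in preread_active_stripes */
    STRIPE_DEGRADED       = 1u << 2,
    STRIPE_INSYNC         = 1u << 3,
};

struct stripe_head { unsigned int state; };

/* Per-stripe flags each member keeps across the break-up; the fix adds
 * STRIPE_PREREAD_ACTIVE to this set. */
#define PRESERVE_FLAGS (STRIPE_DEGRADED | STRIPE_PREREAD_ACTIVE)

static void break_batch_member(struct stripe_head *sh,
                               const struct stripe_head *head)
{
    /* keep the preserved per-stripe flags, copy STRIPE_INSYNC from the head */
    sh->state = (sh->state & PRESERVE_FLAGS) | (head->state & STRIPE_INSYNC);
}

int main(void)
{
    struct stripe_head head   = { .state = STRIPE_INSYNC };
    struct stripe_head member = { .state = STRIPE_PREREAD_ACTIVE };

    break_batch_member(&member, &head);
    printf("STRIPE_PREREAD_ACTIVE preserved: %s\n",
           (member.state & STRIPE_PREREAD_ACTIVE) ? "yes" : "no");
    return 0;
}
```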
  3. 08 Mar 2016 (1 commit)
  4. 27 Feb 2016 (4 commits)
    • MD: warn for potential deadlock · 70d9798b
      Committed by Shaohua Li
      The personality thread shouldn't call mddev_suspend(), because
      mddev_suspend() waits for all IO to finish, but that IO is handled in the
      personality thread, so this could cause a deadlock. To catch this early, add
      a warning if mddev_suspend() is called from the personality thread (a
      simplified sketch of the check follows this entry).
      Suggested-by: NeilBrown <neilb@suse.com>
      Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
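      A simplified user-space model of the check being described (the same
      reasoning applies to the check_reshape() entry further down). The structure
      and names are invented for this sketch; it is not the md code. The point is
      simply to warn when the thread that services the array's IO enters the
      suspend path, since suspend waits for IO that only that thread can complete.

```c
/* Illustrative user-space model, not the md driver: warn when the thread
 * that services the array's IO (the "personality thread") tries to suspend
 * the array, because suspend waits for IO that only this thread completes. */
#include <pthread.h>
#include <stdio.h>

struct md_dev {
    pthread_t personality_thread;   /* the thread handling the array's IO */
};

static void mddev_suspend_sketch(struct md_dev *dev)
{
    /* equivalent in spirit to a warning in the real suspend path */
    if (pthread_equal(pthread_self(), dev->personality_thread))
        fprintf(stderr, "WARNING: suspend called from the personality "
                        "thread - this would deadlock\n");
    /* ... here the real code would wait for all in-flight IO to finish ... */
}

static void *personality_thread_fn(void *arg)
{
    struct md_dev *dev = arg;
    dev->personality_thread = pthread_self();
    mddev_suspend_sketch(dev);      /* wrong context: triggers the warning */
    return NULL;
}

int main(void)
{
    struct md_dev dev;
    pthread_t t;

    pthread_create(&t, NULL, personality_thread_fn, &dev);
    pthread_join(t, NULL);

    mddev_suspend_sketch(&dev);     /* different thread: no warning */
    return 0;
}
```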
    • md: Drop sending a change uevent when stopping · 399146b8
      Committed by Sebastian Parschauer
      When stopping an MD device, its device node /dev/mdX may still exist
      afterwards or be recreated by udev. The next open() call
      can lead to creation of an inoperable MD device. The reason for
      this is that a change event (KOBJ_CHANGE) is sent to udev which
      races against the remove event (KOBJ_REMOVE) from md_free().
      So drop sending the change event.
      
      A change is likely also required in mdadm as many versions send the
      change event to udev as well.
      
      Neil mentioned that the change event was a workaround for old kernels, see
      commit 934d9c23 ("md: destroy partitions and notify udev when md array is stopped.").
      New mdadm can handle device removal now, so the workaround isn't required any more.
      
      Cc: NeilBrown <neilb@suse.com>
      Cc: Hannes Reinecke <hare@suse.de>
      Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
      Signed-off-by: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
    • RAID5: revert e9e4c377 to fix a livelock · 6ab2a4b8
      Committed by Shaohua Li
      Revert commit e9e4c377 ("md/raid5: per hash value and exclusive wait_for_stripe").
      
      The problem is that raid5_get_active_stripe waits on
      conf->wait_for_stripe[hash]. Assume hash is 0. My test releases stripes
      in this order:
      - release all stripes with hash 0
      - raid5_get_active_stripe still sleeps since active_stripes >
        max_nr_stripes * 3 / 4
      - release all stripes with hash other than 0. active_stripes becomes 0
      - raid5_get_active_stripe still sleeps, since nobody wakes up
        wait_for_stripe[0]
      The system livelocks. The problem is that active_stripes isn't a per-hash
      count, so reverting the patch makes the livelock go away (a toy model of the
      missed wakeup follows this entry).
      
      Cc: stable@vger.kernel.org (v4.2+)
      Cc: Yuanhan Liu <yuanhan.liu@linux.intel.com>
      Cc: NeilBrown <neilb@suse.de>
      Signed-off-by: Shaohua Li <shli@fb.com>
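      The missed wakeup is easy to model outside the kernel. In this toy sketch
      (all names and numbers are invented; it is not the reverted code) a waiter
      sleeps on a per-hash condition variable while the condition it checks is a
      single global counter, so releases that only signal other hashes change the
      condition without ever waking the waiter.

```c
/* Toy model of the livelock, not kernel code: the waiter's condition depends
 * on a single global counter, but it sleeps on a per-hash wait queue, so
 * releases on other hashes change the condition without waking it. */
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define NR_HASH 8

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wait_for_stripe[NR_HASH];   /* per-hash wait queues */
static int active_stripes = 100;                  /* global, NOT per-hash */
static const int threshold = 75;                  /* ~ max_nr_stripes * 3/4 */

static void *get_active_stripe(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (active_stripes > threshold) {          /* waits on hash 0 only */
        struct timespec deadline = { .tv_sec = time(NULL) + 2 };
        if (pthread_cond_timedwait(&wait_for_stripe[0], &lock, &deadline)) {
            printf("livelock: active_stripes=%d is below the threshold, "
                   "but nothing ever signalled hash 0\n", active_stripes);
            break;
        }
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

static void release_stripe(int hash)
{
    pthread_mutex_lock(&lock);
    active_stripes--;                               /* global count drops... */
    pthread_cond_broadcast(&wait_for_stripe[hash]); /* ...only this hash wakes */
    pthread_mutex_unlock(&lock);
}

int main(void)
{
    pthread_t waiter;

    for (int i = 0; i < NR_HASH; i++)
        pthread_cond_init(&wait_for_stripe[i], NULL);

    pthread_create(&waiter, NULL, get_active_stripe, NULL);
    sleep(1);                      /* let the waiter go to sleep on hash 0 */

    /* Release plenty of stripes, all with hash != 0: the waiter's condition
     * becomes true, but wait_for_stripe[0] is never signalled. */
    for (int i = 0; i < 50; i++)
        release_stripe(1 + i % (NR_HASH - 1));

    pthread_join(waiter, NULL);
    return 0;
}
```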
    • RAID5: check_reshape() shouldn't call mddev_suspend · 27a353c0
      Committed by Shaohua Li
      check_reshape() is called from the raid5d thread. The raid5d thread
      shouldn't call mddev_suspend(), because mddev_suspend() waits for all IO to
      finish while that IO is handled in the raid5d thread, so we could easily
      deadlock here.
      
      This issue was introduced by commit
      738a2738 ("md/raid5: fix allocation of 'scribble' array.").
      
      Cc: stable@vger.kernel.org (v4.1+)
      Reported-and-tested-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
  5. 26 Feb 2016 (1 commit)
  6. 22 Feb 2016 (12 commits)
  7. 21 Feb 2016 (1 commit)
  8. 20 Feb 2016 (8 commits)
  9. 19 Feb 2016 (9 commits)