1. 18 5月, 2010 9 次提交
  2. 16 3月, 2010 1 次提交
    • N
      md: deal with merge_bvec_fn in component devices better. · 627a2d3c
      NeilBrown 提交于
      If a component device has a merge_bvec_fn then as we never call it
      we must ensure we never need to.  Currently this is done by setting
      max_sector to 1 PAGE, however this does not stop a bio being created
      with several sub-page iovecs that would violate the merge_bvec_fn.
      
      So instead set max_segments to 1 and set the segment boundary to the
      same as a page boundary to ensure there is only ever one single-page
      segment of IO requested at a time.
      
      This can particularly be an issue when 'xen' is used as it is
      known to submit multiple small buffers in a single bio.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Cc: stable@kernel.org
      627a2d3c
  3. 26 2月, 2010 1 次提交
  4. 14 12月, 2009 4 次提交
  5. 01 12月, 2009 1 次提交
    • N
      md: revert incorrect fix for read error handling in raid1. · d0e26078
      NeilBrown 提交于
      commit 4706b349 was a forward port of a fix that was needed
      for SLES10.  But in fact it is not needed in mainline because
      the earlier commit dd00a99e fixes the same problem in a
      better way.
      Further, this commit introduces a bug in the way it interacts with
      the automatic read-error-correction.  If, after a read error is
      successfully corrected, the same disk is chosen to re-read - the
      re-read won't be attempted but an error will be returned instead.
      
      After reverting that commit, there is the possibility that a
      read error on a read-only array (where read errors cannot
      be corrected as that requires a write) will repeatedly read the same
      device and continue to get an error.
      So in the "Array is readonly" case, fail the drive immediately on
      a read error.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Cc: stable@kernel.org
      d0e26078
  6. 16 10月, 2009 2 次提交
    • N
      md: raid1/raid10: handle allocation errors during array setup. · ed9bfdf1
      NeilBrown 提交于
      Both raid1 and raid10 create a mempool during startup.
      If the 'alloc' function for this mempool fails, unplug_slaves
      is called.
      If that happens when the pool is being initialised, unplug_slaves
      will try to use the 'conf' structure that isn't filled in yet, and
      badness will happen.
      
      So ensure that unplug_slaves doesn't get called unless we know
      that the conf structure if fully initialised.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      ed9bfdf1
    • N
      md/raid1/raid10: add a cond_resched · 1d9d5241
      NeilBrown 提交于
      During 'check' of a raid1 or raid10 it is possible for the management
      thread to spend a lot of time running 'memcmp' on blocks from
      different devices, so make sure the thread has a chance to schedule.
      raid5d already has a cond_resched (in process_stripe).
      Reported-By: NLee Howard <faxguy@howardsilvan.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      1d9d5241
  7. 23 9月, 2009 3 次提交
  8. 11 9月, 2009 1 次提交
  9. 03 8月, 2009 2 次提交
    • N
      md: Use revalidate_disk to effect changes in size of device. · 449aad3e
      NeilBrown 提交于
      As revalidate_disk calls check_disk_size_change, it will cause
      any capacity change of a gendisk to be propagated to the blockdev
      inode.  So use that instead of mucking about with locks and
      i_size_write.
      
      Also add a call to revalidate_disk in do_md_run and a few other places
      where the gendisk capacity is changed.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      449aad3e
    • A
      md: Push down data integrity code to personalities. · ac5e7113
      Andre Noll 提交于
      This patch replaces md_integrity_check() by two new public functions:
      md_integrity_register() and md_integrity_add_rdev() which are both
      personality-independent.
      
      md_integrity_register() is called from the ->run and ->hot_remove
      methods of all personalities that support data integrity.  The
      function iterates over the component devices of the array and
      determines if all active devices are integrity capable and if their
      profiles match. If this is the case, the common profile is registered
      for the mddev via blk_integrity_register().
      
      The second new function, md_integrity_add_rdev() is called from the
      ->hot_add_disk methods, i.e. whenever a new device is being added
      to a raid array. If the new device does not support data integrity,
      or has a profile different from the one already registered, data
      integrity for the mddev is disabled.
      
      For raid0 and linear, only the call to md_integrity_register() from
      the ->run method is necessary.
      Signed-off-by: NAndre Noll <maan@systemlinux.org>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      ac5e7113
  10. 01 7月, 2009 1 次提交
  11. 18 6月, 2009 3 次提交
  12. 16 6月, 2009 1 次提交
    • N
      md: remove mddev_to_conf "helper" macro · 070ec55d
      NeilBrown 提交于
      Having a macro just to cast a void* isn't really helpful.
      I would must rather see that we are simply de-referencing ->private,
      than have to know what the macro does.
      
      So open code the macro everywhere and remove the pointless cast.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      070ec55d
  13. 23 5月, 2009 1 次提交
  14. 15 4月, 2009 1 次提交
  15. 07 4月, 2009 1 次提交
  16. 06 4月, 2009 1 次提交
  17. 31 3月, 2009 7 次提交
    • D
      md: 'array_size' sysfs attribute · b522adcd
      Dan Williams 提交于
      Allow userspace to set the size of the array according to the following
      semantics:
      
      1/ size must be <= to the size returned by mddev->pers->size(mddev, 0, 0)
         a) If size is set before the array is running, do_md_run will fail
            if size is greater than the default size
         b) A reshape attempt that reduces the default size to less than the set
            array size should be blocked
      2/ once userspace sets the size the kernel will not change it
      3/ writing 'default' to this attribute returns control of the size to the
         kernel and reverts to the size reported by the personality
      
      Also, convert locations that need to know the default size from directly
      reading ->array_sectors to <pers>_size.  Resync/reshape operations
      always follow the default size.
      
      Finally, fixup other locations that read a number of 1k-blocks from
      userspace to use strict_blocks_to_sectors() which checks for unsigned
      long long to sector_t overflow and blocks to sectors overflow.
      Reviewed-by: NAndre Noll <maan@systemlinux.org>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      b522adcd
    • D
      md: centralize ->array_sectors modifications · 1f403624
      Dan Williams 提交于
      Get personalities out of the business of directly modifying
      ->array_sectors.  Lays groundwork to introduce policy on when
      ->array_sectors can be modified.
      Reviewed-by: NAndre Noll <maan@systemlinux.org>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      1f403624
    • D
      md: add 'size' as a personality method · 80c3a6ce
      Dan Williams 提交于
      In preparation for giving userspace control over ->array_sectors we need
      to be able to retrieve the 'default' size, and the 'anticipated' size
      when a reshape is requested.  For personalities that do not reshape emit
      a warning if anything but the default size is requested.
      
      In the raid5 case we need to update ->previous_raid_disks to make the
      new 'default' size available.
      Reviewed-by: NAndre Noll <maan@systemlinux.org>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      80c3a6ce
    • N
      md: enable suspend/resume of md devices. · 409c57f3
      NeilBrown 提交于
      To be able to change the 'level' of an md/raid array, we need to
      suspend the device so that no requests are active - then move some
      pointers around etc.
      
      The code already keeps counts of active requests and the ->quiesce
      function can be used to wait until those counts hit zero.
      However the quiesce function blocks new requests once they are all
      ready 'inside' the personality module, and that is too late if we want
      to replace the personality modules.
      
      So make all md requests come in through a common md_make_request
      function that keeps track of how many requests have entered the
      modules but may not yet be on the internal reference counts.
      Allow md_make_request to be blocked when we want to suspend the
      device, and make it possible to wait for all those in-transit requests
      to be added to internal lists so that ->quiesce can wait for them.
      
      There is still a problem that when a request completes, we drop the
      ref count inside the personality code so there is a short time between
      when the refcount hits zero, and when the personality code is no
      longer being used.
      The personality code never blocks (schedule or spinlock) between
      dropping the refcount and exiting the routine, so this should be safe
      (as put_module calls synchronize_sched() before unmapping the module
      code).
      Signed-off-by: NNeilBrown <neilb@suse.de>
      409c57f3
    • A
      md: Make mddev->size sector-based. · 58c0fed4
      Andre Noll 提交于
      This patch renames the "size" field of struct mddev_s to "dev_sectors"
      and stores the number of 512-byte sectors instead of the number of
      1K-blocks in it.
      
      All users of that field, including raid levels 1,4-6,10, are adjusted
      accordingly. This simplifies the code a bit because it allows to get
      rid of a couple of divisions/multiplications by two.
      
      In order to make checkpatch happy, some minor coding style issues
      have also been addressed. In particular, size_store() now uses
      strict_strtoull() instead of simple_strtoull().
      Signed-off-by: NAndre Noll <maan@systemlinux.org>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      58c0fed4
    • N
      md: move md_k.h from include/linux/raid/ to drivers/md/ · 43b2e5d8
      NeilBrown 提交于
      It really is nicer to keep related code together..
      Signed-off-by: NNeilBrown <neilb@suse.de>
      43b2e5d8
    • N
      md: move lots of #include lines out of .h files and into .c · bff61975
      NeilBrown 提交于
      This makes the includes more explicit, and is preparation for moving
      md_k.h to drivers/md/md.h
      
      Remove include/raid/md.h as its only remaining use was to #include
      other files.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      bff61975