1. 14 Dec, 2009 21 commits
    • raid: improve MD/raid10 handling of correctable read errors. · 1e50915f
      Committed by Robert Becker
      We've noticed severe lasting performance degradation of our raid
      arrays when we have drives that yield large amounts of media errors.
      The raid10 module will queue each failed read for retry, and will
      also attempt to call fix_read_error() to perform the read recovery.
      Read recovery is performed while the array is frozen, so repeated
      recovery attempts can degrade the performance of the array for
      extended periods of time.
      
      With this patch I propose adding a per md device max number of
      corrected read attempts.  Each rdev will maintain a count of
      read correction attempts in the rdev->read_errors field (not
      used currently for raid10). When we enter fix_read_error()
      we'll check to see when the last read error occurred, and
      divide the read error count by 2 for every hour since the
      last read error. If at that point our read error count
      exceeds the read error threshold, we'll fail the raid device.
      
      In addition, in this patch I add sysfs nodes (get/set) for
      the per md max_read_errors attribute and the rdev->read_errors
      attribute, and add some printk's to indicate when
      fix_read_error fails to repair an rdev.
      
      For testing I used debugfs->fail_make_request to inject
      IO errors to the rdev while doing IO to the raid array.
      Signed-off-by: Robert Becker <Rob.Becker@riverbed.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
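
The decay rule described in the commit message above can be sketched roughly as follows. This is only an illustration, not the code from the patch: the struct `rdev_sketch`, its fields, and the helper names `decay_read_errors` and `note_read_error` are hypothetical, time is passed in as plain seconds, and counting the current error before comparing against the threshold is an assumption.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-device state the message describes. */
struct rdev_sketch {
    unsigned int read_errors;      /* read correction attempts so far */
    unsigned long last_error_secs; /* time of the previous read error */
};

/* Halve the error count once for every full hour that has passed. */
static unsigned int decay_read_errors(unsigned int count, unsigned long hours)
{
    /* Shifting by the type width or more is undefined in C, so clamp. */
    if (hours >= sizeof(count) * 8)
        return 0;
    return count >> hours;
}

/*
 * Record one more read error at time now_secs.
 * Returns 1 if the count now exceeds max_read_errors (the device would
 * be failed), 0 otherwise.
 */
static int note_read_error(struct rdev_sketch *rdev, unsigned long now_secs,
                           unsigned int max_read_errors)
{
    unsigned long hours = 0;

    if (now_secs > rdev->last_error_secs)
        hours = (now_secs - rdev->last_error_secs) / 3600;

    rdev->read_errors = decay_read_errors(rdev->read_errors, hours);
    rdev->read_errors++;            /* assumption: count the current error */
    rdev->last_error_secs = now_secs;

    return rdev->read_errors > max_read_errors;
}
```

With a count of 8 and a new error arriving two hours after the previous one, the count decays to 2 and then becomes 3, so a drive that only errors occasionally stays well under the threshold.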
    • md/raid10: print more useful messages on device failure. · 67b8dc4b
      Committed by Robert Becker
      When we get a read error on a device in a RAID10, and attempting to
      repair the error fails, print more useful messages about why it
      failed.
      Signed-off-by: Robert Becker <Rob.Becker@riverbed.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/bitmap: update dirty flag when bitmap bits are explicitly set. · ffa23322
      Committed by NeilBrown
      There is a sysfs file which allows bits in the write-intent
      bitmap to be explicitly set - indicating that the block is thought
      to be 'dirty'.
      When this happens we should really set recovery_cp backwards
      to include the block to reflect this dirtiness.
      
      In particular, a 'resync' process will refuse to start if
      recovery_cp is beyond the end of the array, so this is needed
      to allow a resync to be triggered.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: Support write-intent bitmaps with externally managed metadata. · ece5cff0
      Committed by NeilBrown
      In this case, the metadata needs to not be in the same
      sector as the bitmap.
      md will not read/write any bitmap metadata.  Config must be
      done via sysfs and when a recovery makes the array non-degraded
      again, writing 'true' to 'bitmap/can_clear' will allow bits in
      the bitmap to be cleared again.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/bitmap: move setting of daemon_lastrun out of bitmap_read_sb · 624ce4f5
      Committed by NeilBrown
      Setting daemon_lastrun really has nothing to do with reading
      the bitmap superblock, it just happens to be needed at the same time.
      bitmap_read_sb is about to become optional, so move that code out
      to after the call to bitmap_read_sb.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: support updating bitmap parameters via sysfs. · 43a70507
      Committed by NeilBrown
      A new attribute directory 'bitmap' in 'md' is created which
      contains files for configuring the bitmap.
      'location' identifies where the bitmap is, either 'none',
      or 'file' or 'sector offset from metadata'.
      Writing 'location' can create or remove a bitmap.
      Adding a 'file' bitmap this way is not yet supported.
      'chunksize' and 'time_base' must be set before 'location'
      can be set.
      
      'chunksize' can be set before creating a bitmap, but is
      currently always over-ridden by the bitmap superblock.
      
      'time_base' and 'backlog' can be updated at any time.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reviewed-by: Andre Noll <maan@systemlinux.org>
    • md: factor out parsing of fixed-point numbers · 72e02075
      Committed by NeilBrown
      safe_delay_store can parse fixed point numbers (for fractions
      of a second).  We will want to do that for another sysfs
      file soon, so factor out the code.
      Signed-off-by: NeilBrown <neilb@suse.de>
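
As a rough illustration of what parsing a fixed-point number from a sysfs-style string involves, here is a minimal sketch. It is not the helper the patch factors out: the name `parse_fixed_point`, its signature, and its error convention are hypothetical, extra fractional digits are truncated rather than rounded, and no overflow checking is done.

```c
#include <ctype.h>

/*
 * Parse a decimal string such as "1.25" into an integer scaled by
 * 10^scale (so "1.25" with scale 3 gives 1250, i.e. milliseconds for a
 * value written in seconds). Fractional digits beyond 'scale' places are
 * discarded. One trailing newline is accepted.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_fixed_point(const char *s, unsigned long *res, int scale)
{
    unsigned long result = 0;
    int decimals = -1; /* -1 until a '.' has been seen */
    int digits = 0;    /* number of digits accepted so far */

    while (isdigit((unsigned char)*s) || (*s == '.' && decimals < 0)) {
        if (*s == '.') {
            decimals = 0;
        } else if (decimals < scale) {
            result = result * 10 + (unsigned long)(*s - '0');
            if (decimals >= 0)
                decimals++;
            digits++;
        }
        s++;
    }

    if (digits == 0)
        return -1;      /* nothing numeric was found */
    if (*s == '\n')
        s++;
    if (*s != '\0')
        return -1;      /* trailing garbage, e.g. a second '.' */

    if (decimals < 0)
        decimals = 0;
    while (decimals < scale) {  /* pad missing fractional digits */
        result *= 10;
        decimals++;
    }

    *res = result;
    return 0;
}
```

Keeping the scale as a parameter lets one routine serve several attributes that use different units, which matches the motivation given in the message for factoring the code out.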
    • md: support bitmap offset appropriate for external-metadata arrays. · f6af949c
      Committed by NeilBrown
      For md arrays where metadata is managed externally, the kernel does not
      know about a superblock so the superblock offset is 0.
      If we want to have a write-intent-bitmap near the end of the
      devices of such an array, we should support sector_t sized offset.
      We need the offset to possibly be negative for when the bitmap is
      before the metadata, so use loff_t instead.
      
      Also add sanity check that bitmap does not overlap with data.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: remove needless setting of thread->timeout in raid10_quiesce · 9cd30fdc
      Committed by NeilBrown
      As bitmap_create and bitmap_destroy already set thread->timeout
      as appropriate, there is no need to do it in raid10_quiesce.
      There is a possible need to wake the thread after the timeout
      has been set low, but it is better to do that where the timeout
      is actually set low, in bitmap_create.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: change daemon_sleep to be in 'jiffies' rather than 'seconds'. · 1b04be96
      Committed by NeilBrown
      This removes a lot of multiplications by HZ.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: move offset, daemon_sleep and chunksize out of bitmap structure · 42a04b50
      Committed by NeilBrown
      ... and into bitmap_info.  These are all configuration parameters
      that need to be set before the bitmap is created.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: collect bitmap-specific fields into one structure. · c3d9714e
      Committed by NeilBrown
      In preparation for making bitmap fields configurable via sysfs,
      start tidying up by making a single structure to contain the
      configuration fields.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/raid1: add takeover support for raid5->raid1 · 709ae487
      Committed by NeilBrown
      A 2-device raid5 array can now be converted to raid1.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: add honouring of suspend_{lo,hi} to raid1. · 6eef4b21
      Committed by NeilBrown
      This will allow us to stop writeout to portions of the array
      while they are resynced by someone else - e.g. another node in
      a cluster.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/raid5: don't complete make_request on barrier until writes are scheduled · 729a1866
      Committed by NeilBrown
      The post-barrier-flush is sent by md as soon as make_request on the
      barrier write completes.  For raid5, the data might not be in the
      per-device queues yet.  So for barrier requests, wait for any
      pre-reading to be done so that the request will be in the per-device
      queues.
      
      We use the 'preread_active' count to check that nothing is still in
      the preread phase, and delay the decrement of this count until after
      write requests have been submitted to the underlying devices.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: support barrier requests on all personalities. · a2826aa9
      Committed by NeilBrown
      Previously barriers were only supported on RAID1.  This is because
      other levels require synchronisation across all devices and so needed
      a different approach.
      Here is that approach.
      
      When a barrier arrives, we send a zero-length barrier to every active
      device.  When that completes - and if the original request was not
      empty -  we submit the barrier request itself (with the barrier flag
      cleared) and then submit a fresh load of zero length barriers.
      
      The barrier request itself is asynchronous, but any subsequent
      request will block until the barrier completes.
      
      The reason for clearing the barrier flag is that a barrier request is
      allowed to fail.  If we pass a non-empty barrier through a striping
      raid level it is conceivable that part of it could succeed and part
      could fail.  That would be way too hard to deal with.
      So if the first run of zero length barriers succeeds, we assume all is
      sufficiently well that we send the request and ignore errors in the
      second run of barriers.
      
      RAID5 needs extra care as write requests may not have been submitted
      to the underlying devices yet.  So we flush the stripe cache before
      proceeding with the barrier.
      
      Note that the second set of zero-length barriers are submitted
      immediately after the original request is submitted.  Thus when
      a personality finds mddev->barrier to be set during make_request,
      it should not return from make_request until the corresponding
      per-device request(s) have been queued.
      
      That will be done in later patches.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Reviewed-by: Andre Noll <maan@systemlinux.org>
    • md: don't reset curr_resync_completed after an interrupted resync · efa59339
      Committed by NeilBrown
      If a resync/recovery/check/repair is interrupted for some reason, it
      can be useful to know exactly where it got up to.
      So in that case, do not clear curr_resync_completed.
      Initialise it when starting a resync/recovery/... instead.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: adjust resync_min usefully when resync aborts. · c07b70ad
      Committed by NeilBrown
      When a 'check' or 'repair' finishes we should clear resync_min
      so that a future check/repair will cover the whole array (by default).
      However if it is interrupted, we should update resync_min to
      where we got up to, so that when the check/repair continues it
      just does the remainder of the array.
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md: remove sparse warning:symbol XXX was not declared. · 7820f9e1
      Committed by NeilBrown
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/raid5: remove some sparse warnings. · 8553fe7e
      Committed by NeilBrown
      qd_idx is previously declared and given exactly the same value!
      Signed-off-by: NeilBrown <neilb@suse.de>
    • md/bitmap: protect against bitmap removal while being updated. · aa5cbd10
      Committed by NeilBrown
      A write intent bitmap can be removed from an array while the
      array is active.
      When this happens, all IO is suspended and flushed before the
      bitmap is removed.
      However it is possible that bitmap_daemon_work is still running to
      clear old bits from the bitmap.  If it is, it can dereference the
      bitmap after it has been freed.
      
      So introduce a new mutex to protect bitmap_daemon_work and get it
      before destroying a bitmap.
      
      This is suitable for any current -stable kernel.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: stable@kernel.org
  2. 13 Dec, 2009 11 commits
    • Merge branch 'ixp4xx' of git://git.kernel.org/pub/scm/linux/kernel/git/chris/linux-2.6 · f4054253
      Committed by Linus Torvalds
      * 'ixp4xx' of git://git.kernel.org/pub/scm/linux/kernel/git/chris/linux-2.6:
        IXP4xx: GTWX5715 platform only has two PCI IRQ lines, not four.
        IXP4xx: Introduce IXP4XX_GPIO_IRQ(n) macro and convert IXP4xx platform files.
        IXP4xx: move Gemtek GTWX5715 platform macros to the platform code.
        IXP4xx: Remove unused Motorola PrPMC1100 platform macros.
        IXP4xx: move FSG platform macros to the platform code.
        IXP4xx: move DSM G600 platform macros to the platform code.
        IXP4xx: move NAS100D platform macros to the platform code.
        IXP4xx: move NSLU2 platform macros to the platform code.
        IXP4xx: move Coyote platform macros to the platform code.
        IXP4xx: move AVILA platform macros to the platform code.
        IXP4xx: move IXDP425 platform macros to the platform code.
        IXP4xx: Extend PCI MMIO indirect address space to 1 GB.
        IXP4xx: Fix compilation failure with CONFIG_IXP4XX_INDIRECT_PCI.
        IXP4xx: Drop "__ixp4xx_" prefix from in/out/ioread/iowrite functions for clarity.
        IXP4xx: Rename indirect MMIO primitives from __ixp4xx_* to __indirect_*.
        IXP4xx: Ensure index is positive in irq_to_gpio() and npe_request().
        ARM: fix insl() and outsl() endianness on IXP4xx architecture.
        IXP4xx: Fix normally-disabled debugging text in drivers/net/arm/ixp4xx_eth.c.
        IXP4xx: change the timer base frequency to 66.666000 MHz.
    • [BKL] add 'might_sleep()' to the outermost lock taker · f01eb364
      Committed by Linus Torvalds
      As shown by the previous patch (6698e347: "tty: Fix BKL taken under a
      spinlock bug introduced in the BKL split") the BKL removal is prone to
      some subtle issues, where removing the BKL in one place may in fact make
      a previously nested BKL call the new outer call, and then prone to nasty
      deadlocks with other spinlocks.
      
      In general, we should never take the BKL while we're holding a spinlock,
      so let's just add a "might_sleep()" to it (even though the BKL doesn't
      technically sleep - at least not yet), and we'll get nice warnings the
      next time this kind of problem happens during BKL removal.
      Acked-and-Tested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tty: Fix BKL taken under a spinlock bug introduced in the BKL split · 6698e347
      Committed by Alan Cox
      The fasync path takes the BKL (it probably doesn't need to in fact)
      while holding the file_list spinlock.  You can't do that with the kernel
      lock: it causes lock inversions and deadlocks.
      
      Leave the BKL over that bit for the moment.
      
      Identified by AKPM.
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      Acked-and-Tested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 09cea96c
      Committed by Linus Torvalds
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (151 commits)
        powerpc: Fix usage of 64-bit instruction in 32-bit altivec code
        MAINTAINERS: Add PowerPC patterns
        powerpc/pseries: Track previous CPPR values to correctly EOI interrupts
        powerpc/pseries: Correct pseries/dlpar.c build break without CONFIG_SMP
        powerpc: Make "intspec" pointers in irq_host->xlate() const
        powerpc/8xx: DTLB Miss cleanup
        powerpc/8xx: Remove DIRTY pte handling in DTLB Error.
        powerpc/8xx: Start using dcbX instructions in various copy routines
        powerpc/8xx: Restore _PAGE_WRITETHRU
        powerpc/8xx: Add missing Guarded setting in DTLB Error.
        powerpc/8xx: Fixup DAR from buggy dcbX instructions.
        powerpc/8xx: Tag DAR with 0x00f0 to catch buggy instructions.
        powerpc/8xx: Update TLB asm so it behaves as linux mm expects.
        powerpc/8xx: Invalidate non present TLBs
        powerpc/pseries: Serialize cpu hotplug operations during deactivate Vs deallocate
        pseries/pseries: Add code to online/offline CPUs of a DLPAR node
        powerpc: stop_this_cpu: remove the cpu from the online map.
        powerpc/pseries: Add kernel based CPU DLPAR handling
        sysfs/cpu: Add probe/release files
        powerpc/pseries: Kernel DLPAR Infrastructure
        ...
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 · 6eb7365d
      Committed by Linus Torvalds
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
        ALSA: hda - Overwrite pin config on intel DG45ID board.
        intelhdmi - dont power off HDA link
        ALSA: hrtimer - Fix lock-up
        ALSA: intelhdmi - add channel mapping for typical configurations
        ALSA: intelhdmi - channel mapping applies to Pin
        ALSA: intelhdmi - accept DisplayPort pin
        ALSA: hda - show HBR(High Bit Rate) pin cap in procfs
        ALSA: hda - Fix LED GPIO setup for HP laptops with IDT codecs
        ASoC: Fix build of OMAP sound drivers
        ALSA: opti93x: fix irq releasing if the irq cannot be allocated
    • Merge branch 'omap-for-linus' of... · 9c3936cb
      Committed by Linus Torvalds
      Merge branch 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
      
      * 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: (75 commits)
        omap3: Fix OMAP35XX_REV macros
        omap: serial: fix non-empty uart fifo read abort
        omap3: Zoom2/3: Update hsmmc board config params
        omap3 : Enable TWL4030 Keypad for Zoom2 and Zoom3 boards
        omap3: id code detection 3525 vs 3515
        omap3: rx51: Use wl1251 in SPI mode 3
        omap3: zoom2/3: make MMC slot work again
        omap1: htcherald: Update defconfig to include mux support
        omap1: LCD_DMA: Use some define rather than a hexadecimal
        omap: header: remove unused data-type
        omap: arch/arm/plat-omap/devices.c - sort alphabetically
        omap: Correcting GPMC_CONFIG1_DEVICETYPE_NAND
        OMAP3: serial - allow platforms specify which UARTs to initialize
        omap3: cm-t35: add mux initialization
        OMAP4: Sync up omap4430 defconfig
        OMAP4: Remove the secondary wait loop
        OMAP4: AuxCoreBoot registers only accessible in secure mode
        OMAP4: Fix SRAM base and size
        OMAP4: Fix cpu detection
        omap3: pandora: board file updates for .33
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · 5de76b18
      Committed by Linus Torvalds
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
        be2net: fix error in rx completion processing.
        igbvf: avoid reset storms due to mailbox issues
        igb: fix handling of mailbox collisions between PF/VF
        usb: remove rare pm primitive for conversion to new API
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 · 8d0e7fb9
      Committed by Linus Torvalds
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
        slab, kmemleak: pass the correct pointer to kmemleak_erase()
        slab, kmemleak: stop calling kmemleak_erase() unconditionally
        SLAB: Fix unlikely() annotation in __cache_alloc_node()
        SLAB: Fix lockdep annotations for CPU hotplug
        SLUB: Fix __GFP_ZERO unlikely() annotation
        slub: allow stats to be cleared
    • Merge branch 'sched-fixes-for-linus' of... · 702a7c76
      Committed by Linus Torvalds
      Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (21 commits)
        sched: Remove forced2_migrations stats
        sched: Fix memory leak in two error corner cases
        sched: Fix build warning in get_update_sysctl_factor()
        sched: Update normalized values on user updates via proc
        sched: Make tunable scaling style configurable
        sched: Fix missing sched tunable recalculation on cpu add/remove
        sched: Fix task priority bug
        sched: cgroup: Implement different treatment for idle shares
        sched: Remove unnecessary RCU exclusion
        sched: Discard some old bits
        sched: Clean up check_preempt_wakeup()
        sched: Move update_curr() in check_preempt_wakeup() to avoid redundant call
        sched: Sanitize fork() handling
        sched: Clean up ttwu() rq locking
        sched: Remove rq->clock coupling from set_task_cpu()
        sched: Consolidate select_task_rq() callers
        sched: Remove sysctl.sched_features
        sched: Protect sched_rr_get_param() access to task->sched_class
        sched: Protect task->cpus_allowed access in sched_getaffinity()
        sched: Fix balance vs hotplug race
        ...
      
      Fixed up conflicts in kernel/sysctl.c (due to sysctl cleanup)
    • Merge branch 'topic/hda' into for-linus · 84a3bd06
      Committed by Takashi Iwai
    • Merge branch 'topic/asoc' into for-linus · f52d7a43
      Committed by Takashi Iwai
  3. 12 Dec, 2009 8 commits