1. 12 7月, 2012 11 次提交
    • R
      clk: add highbank clock support · 8d4d9f52
      Rob Herring 提交于
      This adds real clock support to Calxeda Highbank SOC using the common
      clock infrastructure.
      Signed-off-by: NRob Herring <rob.herring@calxeda.com>
      [mturquette@linaro.org: fixed up invalid writes to const struct member]
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      8d4d9f52
    • G
      clk: add DT fixed-clock binding support · 015ba402
      Grant Likely 提交于
      Add support for DT "fixed-clock" binding to the common fixed rate clock
      support.
      Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
      [Rob Herring] Rework and move into common clock infrastructure
      Signed-off-by: NRob Herring <rob.herring@calxeda.com>
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      015ba402
    • G
      clk: add DT clock binding support · 766e6a4e
      Grant Likely 提交于
      Based on work 1st by Ben Herrenschmidt and Jeremy Kerr, then by Grant
      Likely, this patch adds support to clk_get to allow drivers to retrieve
      clock data from the device tree.
      
      Platforms scan for clocks in DT with of_clk_init and a match table, and
      the register a provider through of_clk_add_provider. The provider's
      clk_src_get function will be called when a device references the
      provider's OF node for a clock reference.
      
      v6 (Rob Herring):
          - Return error values instead of NULL to match clock framework
            expectations
      
      v5 (Rob Herring):
          - Move from drivers/of into common clock subsystem
          - Squashed "dt/clock: add a simple provider get function" and
            "dt/clock: add function to get parent clock name"
          - Rebase to 3.4-rc1
          - Drop CONFIG_OF_CLOCK and just use CONFIG_OF
          - Add missing EXPORT_SYMBOL to various functions
          - s/clock-output-name/clock-output-names/
          - Define that fixed-clock binding is a single output
      
      v4 (Rob Herring):
          - Rework for common clk subsystem
          - Add of_clk_get_parent_name function
      
      v3: - Clarified documentation
      
      v2: - fixed errant ';' causing compile error
          - Editorial fixes from Shawn Guo
          - merged in adding lookup to clkdev
          - changed property names to match established convention. After
            working with the binding a bit it really made more sense to follow the
            lead of 'reg', 'gpios' and 'interrupts' by making the input simply
            'clocks' & 'clock-names' instead of 'clock-input-*', and to only use
            clock-output* for the producer nodes. (Sorry Shawn, this will mean
            you need to change some code, but it should be trivial)
          - Add ability to inherit clocks from parent nodes by using an empty
            'clock-ranges' property.  Useful for busses.  I could use some feedback
            on the new property name, 'clock-ranges' doesn't feel right to me.
      Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
      Signed-off-by: NRob Herring <rob.herring@calxeda.com>
      Reviewed-by: NShawn Guo <shawn.guo@freescale.com>
      Cc: Sascha Hauer <kernel@pengutronix.de>
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      766e6a4e
    • L
      ARM: integrator: convert to common clock · a613163d
      Linus Walleij 提交于
      This converts the Integrator platform to use common clock
      and the ICST driver. Since from this point not all ARM
      reference platforms use the clock, we define
      CONFIG_PLAT_VERSATILE_CLOCK and select it for all platforms
      except the Integrator.
      
      Open issue: I could not use the .init_early() field of the
      machine descriptor to initialize the clocks, but had to
      move them to .init_irq(), so presumably .init_early() is
      so early that common clock is not up, and .init_machine()
      is too late since it's needed for the clockevent/clocksource
      initialization. Any suggestions on how to solve this is
      very welcome.
      
      Cc: Russell King <linux@arm.linux.org.uk>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      [mturquette@linaro.org: use 'select' instead of versatile Kconfig]
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      a613163d
    • L
      clk: add versatile ICST307 driver · 91b87a47
      Linus Walleij 提交于
      The ICST307 VCO clock has a shared driver in the ARM
      architecture. This patch provides a wrapper into the common
      clock framework so we can use the implementation in the
      ARM architecture without duplicating the code until all
      ARM platforms using this VCO are moved over. At that point
      we can merge the driver from the ARM platform into the
      generic file altogether.
      
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Mike Turquette <mturquette@ti.com>
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      [mturquette@linaro.org: removed versatile Kconfig]
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      91b87a47
    • L
      ARM: u300: convert to common clock · 50667d63
      Linus Walleij 提交于
      This converts the U300 clock implementation over to use the common
      struct clk and moves the implementation down into drivers/clk.
      Since VCO isn't used in tree it was removed, it's not hard to
      put it back in if need be.
      Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
      [mturquette@linaro.org: trivial Makefile conflict]
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      50667d63
    • R
      clk: cache parent clocks only for muxes · 9ca1c5a4
      Rajendra Nayak 提交于
      caching parent clocks makes sense only when a clock has more
      than one parent (mux clocks).
      Avoid doing this for every other clock.
      Signed-off-by: NRajendra Nayak <rnayak@ti.com>
      [mturquette@linaro.org: removed extra parentheses]
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      9ca1c5a4
    • M
      clk: wm831x: Add initial WM831x clock driver · f05259a6
      Mark Brown 提交于
      The WM831x and WM832x series of PMICs contain a flexible clocking
      subsystem intended to provide always on and system core clocks.  It
      features:
      
      - A 32.768kHz crystal oscillator which can optionally be used to pass
        through an externally generated clock.
      - A FLL which can be clocked from either the 32.768kHz oscillator or
        the CLKIN pin.
      - A CLKOUT pin which can bring out either the oscillator or the FLL
        output.
      - The 32.768kHz clock can also optionally be brought out on the GPIO
        pins of the device.
      
      This driver fully supports the 32.768kHz oscillator and CLKOUT.  The FLL
      is supported only in AUTO mode, the full flexibility of the FLL cannot
      currently be used.
      
      Due to a lack of access to systems where the core SoC has been converted
      to use the generic clock API this driver has been compile tested only.
      Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com>
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      f05259a6
    • R
      clk: Add CLK_IS_BASIC flag to identify basic clocks · f7d8caad
      Rajendra Nayak 提交于
      Most platforms end up using a mix of basic clock types and
      some which use clk_hw_foo struct for filling in custom platform
      information when the clocks don't fit into basic types supported.
      
      In platform code, its useful to know if a clock is using a basic
      type or clk_hw_foo, which helps platforms know if they can
      safely use to_clk_hw_foo to derive the clk_hw_foo pointer from
      clk_hw.
      
      Mark all basic clocks with a CLK_IS_BASIC flag.
      Signed-off-by: NRajendra Nayak <rnayak@ti.com>
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      f7d8caad
    • R
      clk: Add support for rate table based dividers · 357c3f0a
      Rajendra Nayak 提交于
      Some divider clks do not have any obvious relationship
      between the divider and the value programmed in the
      register. For instance, say a value of 1 could signify divide
      by 6 and a value of 2 could signify divide by 4 etc.
      Also there are dividers where not all values possible
      based on the bitfield width are valid. For instance
      a 3 bit wide bitfield can be used to program a value
      from 0 to 7. However its possible that only 0 to 4
      are valid values.
      
      All these cases need the platform code to pass a simple
      table of divider/value tuple, so the framework knows
      the exact value to be written based on the divider
      calculation and can also do better error checking.
      
      This patch adds support for such rate table based
      dividers and as part of the support adds a new
      registration function 'clk_register_divider_table()'
      and a new macro for static definition
      'DEFINE_CLK_DIVIDER_TABLE'.
      Signed-off-by: NRajendra Nayak <rnayak@ti.com>
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      357c3f0a
    • R
      clk: Add support for power of two type dividers · 6d9252bd
      Rajendra Nayak 提交于
      Quite often dividers and the value programmed in the
      register have a relation of 'power of two', something like
      value	div
      0	1
      1	2
      2	4
      3	8...
      
      Add support for such dividers as part of clk-divider.
      
      The clk-divider flag 'CLK_DIVIDER_POWER_OF_TWO' should be used
      to define such clocks.
      Signed-off-by: NRajendra Nayak <rnayak@ti.com>
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      6d9252bd
  2. 07 7月, 2012 2 次提交
  3. 06 7月, 2012 1 次提交
  4. 05 7月, 2012 2 次提交
  5. 04 7月, 2012 4 次提交
    • A
      leds: heartbeat: fix bug on panic · fb31fbeb
      Alexander Holler 提交于
      With commit 49dca5ae I introduced
      a bug (visible if CONFIG_PROVE_RCU is enabled) which occures when a panic
      has happened:
      
      [ 1526.520230] ===============================
      [ 1526.520230] [ INFO: suspicious RCU usage. ]
      [ 1526.520230] 3.5.0-rc1+ #12 Not tainted
      [ 1526.520230] -------------------------------
      [ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
      [ 1526.520230]
      [ 1526.520230] other info that might help us debug this:
      [ 1526.520230]
      [ 1526.520230]
      [ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
      [ 1526.520230] 3 locks held by net.agent/3279:
      [ 1526.520230]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff82f85962>] do_page_fault+0x193/0x390
      [ 1526.520230]  #1:  (panic_lock){+.+...}, at: [<ffffffff82ed2830>] panic+0x37/0x1d3
      [ 1526.520230]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
      [ 1526.520230]
      [ 1526.520230] stack backtrace:
      [ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
      [ 1526.520230] Call Trace:
      [ 1526.520230]  [<ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
      [ 1526.520230]  [<ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
      [ 1526.520230]  [<ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
      [ 1526.520230]  [<ffffffff82f8010e>] down_write+0x26/0x81
      [ 1526.520230]  [<ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
      [ 1526.520230]  [<ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
      [ 1526.520230]  [<ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
      [ 1526.520230]  [<ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
      [ 1526.520230]  [<ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
      [ 1526.520230]  [<ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
      [ 1526.520230]  [<ffffffff82ed28e1>] panic+0xe8/0x1d3
      [ 1526.520230]  [<ffffffff811473e2>] out_of_memory+0x15d/0x1d3
      
      So in case of a panic, now just turn of the LED. Other approaches like
      scheduling a work to unregister the trigger aren't working because there
      isn't much which still runs after a panic occured (except timers).
      Signed-off-by: NAlexander Holler <holler@ahsoftware.de>
      Signed-off-by: NBryan Wu <bryan.wu@canonical.com>
      fb31fbeb
    • N
      md/raid10: fix careless build error · 10684112
      NeilBrown 提交于
      build error introduced by commit b357f04a
      
      That function doesn't get extra args until a later patch.  Bother.
      
      Reported-by: Fengguang Wu <wfg@linux.intel.com> 
      Reported-by: NSimon Kirby <sim@hostway.ca>
      Reported-by: NTobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      10684112
    • L
      floppy: cancel any pending fd_timeouts before adding a new one · dab058fd
      Linus Torvalds 提交于
      In commit 070ad7e7 ("floppy: convert to delayed work and
      single-thread wq") the 'fd_timeout' timer was converted to a delayed
      work.  However, the "del_timer(&fd_timeout)" was lost in the process,
      and any previous pending timeouts would stay active when we then
      re-queued the timeout.
      
      This resulted in the floppy probe sequence having a (stale) 20s timeout
      rather than the intended 3s timeout, and thus made booting with the
      floppy driver (but no actual floppy controller) take much longer than it
      should.
      
      Of course, there's little reason for most people to compile the floppy
      driver into the kernel at all, which is why most people never noticed.
      
      Canceling the delayed work where we used to do the del_timer() fixes the
      issue, and makes the floppy probing use the proper new timeout instead.
      The three second timeout is still very wasteful, but better than the 20s
      one.
      Reported-and-tested-by: NAndi Kleen <ak@linux.intel.com>
      Reported-and-tested-by: NCalvin Walton <calvin.walton@kepstin.ca>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      dab058fd
    • R
      clk: fix parent validation in __clk_set_parent() · 863b1327
      Rajendra Nayak 提交于
      The below commit introduced a bug in __clk_set_parent()
      which could cause it to *skip* the parent validation
      which makes sure the parent passed to the api is a valid
      one.
      
          commit 7975059d
          Author: Rajendra Nayak <rnayak@ti.com>
          Date:   Wed Jun 6 14:41:31 2012 +0530
      
              clk: Allow late cache allocation for clk->parents
      
      This was identified by the following compiler warning..
      
          drivers/clk/clk.c: In function '__clk_set_parent':
          drivers/clk/clk.c:1083:5: warning: 'i' may be used uninitialized in this function [-Wuninitialized]
      
      .. as reported by Marc Kleine-Budde.
      
      There were various options discussed on how to fix this, one
      being initing 'i' to clk->num_parents, but the below approach
      was found to be more appropriate as it also makes the 'parent
      validation' code simpler to read.
      Reported-by: NMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: NRajendra Nayak <rnayak@ti.com>
      Signed-off-by: NMike Turquette <mturquette@linaro.org>
      Cc: stable@kernel.org
      863b1327
  6. 03 7月, 2012 20 次提交
    • M
      dm persistent data: fix allocation failure in space map checker init · b0239faa
      Mike Snitzer 提交于
      If CONFIG_DM_DEBUG_SPACE_MAPS is enabled and memory is fragmented and a
      sufficiently-large metadata device is used in a thin pool then the space
      map checker will fail to allocate the memory it requires.
      
      Switch from kmalloc to vmalloc to allow larger virtually contiguous
      allocations for the space map checker's internal count arrays.
      Reported-by: NVivek Goyal <vgoyal@redhat.com>
      Cc: stable@kernel.org
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
      b0239faa
    • M
      dm persistent data: handle space map checker creation failure · 62662303
      Mike Snitzer 提交于
      If CONFIG_DM_DEBUG_SPACE_MAPS is enabled and dm_sm_checker_create()
      fails, dm_tm_create_internal() would still return success even though it
      cleaned up all resources it was supposed to have created.  This will
      lead to a kernel crash:
      
      general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      ...
      RIP: 0010:[<ffffffff81593659>]  [<ffffffff81593659>] dm_bufio_get_block_size+0x9/0x20
      Call Trace:
        [<ffffffff81599bae>] dm_bm_block_size+0xe/0x10
        [<ffffffff8159b8b8>] sm_ll_init+0x78/0xd0
        [<ffffffff8159c1a6>] sm_ll_new_disk+0x16/0xa0
        [<ffffffff8159c98e>] dm_sm_disk_create+0xfe/0x160
        [<ffffffff815abf6e>] dm_pool_metadata_open+0x16e/0x6a0
        [<ffffffff815aa010>] pool_ctr+0x3f0/0x900
        [<ffffffff8158d565>] dm_table_add_target+0x195/0x450
        [<ffffffff815904c4>] table_load+0xe4/0x330
        [<ffffffff815917ea>] ctl_ioctl+0x15a/0x2c0
        [<ffffffff81591963>] dm_ctl_ioctl+0x13/0x20
        [<ffffffff8116a4f8>] do_vfs_ioctl+0x98/0x560
        [<ffffffff8116aa51>] sys_ioctl+0x91/0xa0
        [<ffffffff81869f52>] system_call_fastpath+0x16/0x1b
      
      Fix the space map checker code to return an appropriate ERR_PTR and have
      dm_sm_disk_create() and dm_tm_create_internal() check for it with
      IS_ERR.
      Reported-by: NVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
      62662303
    • M
      dm persistent data: fix shadow_info_leak on dm_tm_destroy · 25d7cd6f
      Mike Snitzer 提交于
      Cleanup the shadow table before destroying the transaction manager.
      
      Reference: leak was identified with kmemleak when running
      test_discard_random_sectors in the thinp-test-suite.
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
      25d7cd6f
    • J
      dm thin: commit metadata before creating metadata snapshot · 0d200aef
      Joe Thornber 提交于
      Userland sometimes sees a corrupt metadata block if metadata is changing
      rapidly when a metadata snapshot is reserved for userland,  To make the
      problem go away, commit before we take the metadata snapshot (which is a
      sensible thing to do anyway).
      
      The checksums mean userland spots this corruption immediately so there's
      no risk of acting on incorrect data.  No corruption exists from the
      kernel's point of view, and thin_check passes after pool shutdown.
      
      I believe this is to do with shared blocks at the first level of the
      {device, mapping} btree.  Prior to the metadata-snap support no sharing
      at this level was possible, so this patch is only required after commit
      cc8394d8 ("dm thin: provide userspace
      access to pool metadata").
      Signed-off-by: NJoe Thornber <ejt@redhat.com>
      Signed-off-by: NMike Snitzer <snitzer@redhat.com>
      Signed-off-by: NAlasdair G Kergon <agk@redhat.com>
      0d200aef
    • D
      drm/i915: kick any firmware framebuffers before claiming the gtt · 9f846a16
      Daniel Vetter 提交于
      Especially vesafb likes to map everything as uc- (yikes), and if that
      mapping hangs around still while we try to map the gtt as wc the
      kernel will downgrade our request to uc-, resulting in abyssal
      performance.
      
      Unfortunately we can't do this as early as readon does (i.e. as the
      first thing we do when initializing the hw) because our fb/mmio space
      region moves around on a per-gen basis. So I've had to move it below
      the gtt initialization, but that seems to work, too. The important
      thing is that we do this before we set up the gtt wc mapping.
      
      Now an altogether different question is why people compile their
      kernels with vesafb enabled, but I guess making things just work isn't
      bad per se ...
      
      v2:
      - s/radeondrmfb/inteldrmfb/
      - fix up error handling
      
      v3: Kill #ifdef X86, this is Intel after all. Noticed by Ben Widawsky.
      
      v4: Jani Nikula complained about the pointless bool primary
      initialization.
      
      v5: Don't oops if we can't allocate, noticed by Chris Wilson.
      
      v6: Resolve conflicts with agp rework and fixup whitespace.
      
      This is commit e188719a in drm-next.
      
      Backport to 3.5 -fixes queue requested by Dave Airlie - due to grub
      using vesa on fedora their initrd seems to load vesafb before loading
      the real kms driver. So tons more people actually experience a
      dead-slow gpu. Hence also the Cc: stable.
      
      Cc: stable@vger.kernel.org
      Reported-and-tested-by: N"Kilarski, Bernard R" <bernard.r.kilarski@intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      9f846a16
    • T
      drm: edid: Don't add inferred modes with higher resolution · 7b668ebe
      Takashi Iwai 提交于
      When a monitor EDID doesn't give the preferred bit, driver assumes
      that the mode with the higest resolution and rate is the preferred
      mode.  Meanwhile the recent changes for allowing more modes in the
      GFT/CVT ranges give actually more modes, and some modes may be over
      the native size.  Thus such a mode would be picked up as the preferred
      mode although it's no native resolution.
      
      For avoiding such a problem, this patch limits the addition of
      inferred modes by checking not to be greater than other modes.
      Also, it checks the duplicated mode entry at the same time.
      Reviewed-by: NAdam Jackson <ajax@redhat.com>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      7b668ebe
    • J
      drm/radeon: fix rare segfault · 1ef5325b
      Jerome Glisse 提交于
      In gem idle/busy ioctl the radeon object was derefenced after
      drm_gem_object_unreference_unlocked which in case the object
      have been destroyed lead to use of a possibly free pointer with
      possibly wrong data.
      Signed-off-by: NJerome Glisse <jglisse@redhat.com>
      Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
      Reviewed-by: NChristian König <christian.koenig@amd.com>
      Signed-off-by: NDave Airlie <airlied@redhat.com>
      1ef5325b
    • N
      md: fix up plugging (again). · b357f04a
      NeilBrown 提交于
      The value returned by "mddev_check_plug" is only valid until the
      next 'schedule' as that will unplug things.  This could happen at any
      call to mempool_alloc.
      So just calling mddev_check_plug at the start doesn't really make
      sense.
      
      So call it just before, or just after, queuing things for the thread.
      As the action that happens at unplug is to wake the thread, this makes
      lots of sense.
      If we cannot add a plug (which requires a small GFP_ATOMIC alloc) we
      wake thread immediately.
      
      RAID5 is a bit different.  Requests are queued for the thread and the
      thread is woken by release_stripe.  So we don't need to wake the
      thread on failure.
      However the thread doesn't perform certain actions when there is any
      active plug, so it is important to install a plug before waking the
      thread.  So for RAID5 we install the plug *before* queuing the request
      and waking the thread.
      
      Without this patch it is possible for raid1 or raid10 to queue a
      request without then waking the thread, resulting in the array locking
      up.
      
      Also change raid10 to only flush_pending_write when there are not
      active plugs, just like raid1.
      
      This patch is suitable for 3.0 or later.  I plan to submit it to
      -stable, but I'll like to let it spend a few weeks in mainline
      first to be sure it is completely safe.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      b357f04a
    • N
      md: support re-add of recovering devices. · f4563091
      NeilBrown 提交于
      We currently only allow a device to be re-added if it appear to be
      in-sync.  This is overly restrictive as it may be desirable to re-add
      a device that is in the middle of recovery.
      
      So remove the test for "InSync" - the test on rdev->raid_disk is
      sufficient to ensure that the re-add will succeed.
      Reported-by: NAlexander Lyakas <alex.bolshoy@gmail.com>
      Tested-by: NAlexander Lyakas <alex.bolshoy@gmail.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      f4563091
    • N
      md/raid1: fix bug in read_balance introduced by hot-replace · 32644afd
      NeilBrown 提交于
      When we added hot_replace we doubled the number of devices
      that could be in a RAID1 array.  So we doubled how far read_balance
      would search.  Unfortunately we didn't double the point at which
      it looped back to the beginning - so it effectively loops over
      all non-replacement disks twice.
      This doesn't cause bad behaviour, but it pointless and means we
      never read from replacement devices.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      32644afd
    • S
      raid5: delayed stripe fix · fab363b5
      Shaohua Li 提交于
      There isn't locking setting STRIPE_DELAYED and STRIPE_PREREAD_ACTIVE bits, but
      the two bits have relationship. A delayed stripe can be moved to hold list only
      when preread active stripe count is below IO_THRESHOLD. If a stripe has both
      the bits set, such stripe will be in delayed list and preread count not 0,
      which will make such stripe never leave delayed list.
      Signed-off-by: NShaohua Li <shli@fusionio.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      fab363b5
    • M
      md/raid456: When read error cannot be recovered, record bad block · 2e8ac303
      majianpeng 提交于
      We may not be able to fix a bad block if:
       - the array is degraded
       - the over-write fails.
      
      In these cases we currently eject the device, but we should
      record a bad block if possible.
      Signed-off-by: Nmajianpeng <majianpeng@gmail.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      2e8ac303
    • N
      md: make 'name' arg to md_register_thread non-optional. · 0232605d
      NeilBrown 提交于
      Having the 'name' arg optional and defaulting to the current
      personality name is no necessary and leads to errors, as when
      changing the level of an array we can end up using the
      name of the old level instead of the new one.
      
      So make it non-optional and always explicitly pass the name
      of the level that the array will be.
      Reported-by: Nmajianpeng <majianpeng@gmail.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      0232605d
    • N
      md/raid10: fix failure when trying to repair a read error. · 055d3747
      NeilBrown 提交于
      commit 58c54fcc
           md/raid10: handle further errors during fix_read_error better.
      
      in 3.1 added "r10_sync_page_io" which takes an IO size in sectors.
      But we were passing the IO size in bytes!!!
      This resulting in bio_add_page failing, and empty request being sent
      down, and a consequent BUG_ON in scsi_lib.
      
      [fix missing space in error message at same time]
      
      This fix is suitable for 3.1.y and later.
      
      Cc: stable@vger.kernel.org
      Reported-by: NChristian Balzer <chibi@gol.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      055d3747
    • N
      md/raid5: fix refcount problem when blocked_rdev is set. · 5f066c63
      NeilBrown 提交于
      commit 43220aa0
          md/raid5: fix a hang on device failure.
      
      fixed a hang, but introduced a refcounting in-balance so
      that if the presence of bad-blocks ever caused an rdev to
      be 'blocked' we would increment the refcount on the rdev and
      never decrement it.
      
      So added the needed rdev_dec_pending when md_wait_for_blocked_rdev
      is not called.
      Reported-by: Nmajianpeng <majianpeng@gmail.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      5f066c63
    • M
      md:Add blk_plug in sync_thread. · 7c2c57c9
      majianpeng 提交于
      Add blk_plug in sync_thread will increase the performance of sync.
      Because sync_thread did not blk_plug,so when raid sync, the bio merge
      not well.
      
      Testing environment:
      SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI
      Controller.
      OS:Linux xxx 3.5.0-rc2+ #340 SMP Tue Jun 12 09:00:25 CST 2012
      x86_64 x86_64 x86_64 GNU/Linux.
      RAID5: four ST31000524NS disk.
      
      Without blk_plug:recovery speed about 63M/Sec;
      Add blk_plug:recovery speed about 120M/Sec.
      
      Using blktrace:
      blktrace -d /dev/sdb -w 60  -o -|blkparse -i -
      
      without blk_plug:
      Total (8,16):
       Reads Queued:      309811,     1239MiB	 Writes Queued:           0,        0KiB
       Read Dispatches:   283583,     1189MiB	 Write Dispatches:        0,        0KiB
       Reads Requeued:         0		 Writes Requeued:         0
       Reads Completed:   273351,     1149MiB	 Writes Completed:        0,        0KiB
       Read Merges:        23533,    94132KiB	 Write Merges:            0,        0KiB
       IO unplugs:             0        	 Timer unplugs:           0
      
      add blk_plug:
      Total (8,16):
       Reads Queued:      428697,     1714MiB	 Writes Queued:           0,        0KiB
       Read Dispatches:     3954,     1714MiB	 Write Dispatches:        0,        0KiB
       Reads Requeued:         0		 Writes Requeued:         0
       Reads Completed:     3956,     1715MiB	 Writes Completed:        0,        0KiB
       Read Merges:       424743,     1698MiB	 Write Merges:            0,        0KiB
       IO unplugs:             0        	 Timer unplugs:        3384
      
      The ratio of merge will be markedly increased.
      Signed-off-by: Nmajianpeng <majianpeng@gmail.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      7c2c57c9
    • M
      md/raid5: In ops_run_io, inc nr_pending before calling md_wait_for_blocked_rdev · 1850753d
      majianpeng 提交于
      In ops_run_io(), the call to md_wait_for_blocked_rdev will decrement
      nr_pending so we lose the reference we hold on the rdev.
      So atomic_inc it first to maintain the reference.
      
      This bug was introduced by commit  73e92e51
          md/raid5.  Don't write to known bad block on doubtful devices.
      
      which appeared in 3.0, so patch is suitable for stable kernels since
      then.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Nmajianpeng <majianpeng@gmail.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      1850753d
    • M
      md/raid5: Do not add data_offset before call to is_badblock · 6c0544e2
      majianpeng 提交于
      In chunk_aligned_read() we are adding data_offset before calling
      is_badblock.  But is_badblock also adds data_offset, so that is bad.
      
      So move the addition of data_offset to after the call to
      is_badblock.
      
      This bug was introduced by commit 31c176ec
           md/raid5: avoid reading from known bad blocks.
      which first appeared in 3.0.  So that patch is suitable for any
      -stable kernel from 3.0.y onwards.  However it will need minor
      revision for most of those (as the comment didn't appear until
      recently).
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Nmajianpeng <majianpeng@gmail.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      6c0544e2
    • N
      md/raid5: prefer replacing failed devices over want-replacement devices. · 5cfb22a1
      NeilBrown 提交于
      If a RAID5 has both a failed device and a device marked as
      'WantReplacement', then we should preferentially replace the failed
      device.
      However the current code replaces whichever is found first.
      So split into 2 loops, check fail failed/missing first, and only check
      for WantReplacement if nothing is failed or missing.
      Reported-by: Nmajianpeng <majianpeng@gmail.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      5cfb22a1
    • N
      md/raid10: Don't try to recovery unmatched (and unused) chunks. · fc448a18
      NeilBrown 提交于
      If a RAID10 has an odd number of chunks - as might happen when there
      are an odd number of devices - the last chunk has no pair and so is
      not mirrored.  We don't store data there, but when recovering the last
      device in an array we retry to recover that last chunk from a
      non-existent location.  This results in an error, and the recovery
      aborts.
      
      When we get to that last chunk we should just stop - there is nothing
      more to do anyway.
      
      This bug has been present since the introduction of RAID10, so the
      patch is appropriate for any -stable kernel.
      
      Cc: stable@vger.kernel.org
      Reported-by: NChristian Balzer <chibi@gol.com>
      Tested-by: NChristian Balzer <chibi@gol.com>
      Signed-off-by: NNeilBrown <neilb@suse.de>
      fc448a18