1. 15 Jun, 2013 1 commit
    • M
      powerpc: Fix stack overflow crash in resume_kernel when ftracing · 0e37739b
      Michael Ellerman committed
      It's possible for us to crash when running with ftrace enabled, eg:
      
        Bad kernel stack pointer bffffd12 at c00000000000a454
        cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
            pc: c00000000000a454: resume_kernel+0x34/0x60
            lr: c00000000000335c: performance_monitor_common+0x15c/0x180
            sp: bffffd12
           msr: 8000000000001032
           dar: bffffd12
         dsisr: 42000000
      
      If we look at current's stack (paca->__current->stack) we see it is
      equal to c0000002ecab0000. Our stack is 16K, and comparing to
      paca->kstack (c0000002ecab3e30) we can see that we have overflowed our
      kernel stack. This leads to us writing over our struct thread_info, and
      in this case we have corrupted thread_info->flags and set
      _TIF_EMULATE_STACK_STORE.
      
      Dumping the stack we see:
      
        3:mon> t c0000002ecab0000
        [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
        [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
        --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
        [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
        [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
        [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
        [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
        [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
        [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
        [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
        --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
        [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
        [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
        [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
        [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
        [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
        [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
        --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
      
        ... and so on
      
      __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
      path. At that point the irq state is not consistent, ie. interrupts are
      hard disabled (by the exception entry), but the paca soft-enabled flag
      may be out of sync.
      
      This leads to the local_irq_restore() in trace_graph_entry() actually
      enabling interrupts, which we do not want. Because we have not yet
      reprogrammed the decrementer we immediately take another decrementer
      exception, and recurse.
      
      The fix is twofold. Firstly make sure we call DISABLE_INTS before
      calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles
      the irq state in the paca with the hardware, making it safe again to
      call local_irq_save/restore().
      
      Although that should be sufficient to fix the bug, we also mark the
      runlatch routines as notrace. They are called very early in the
      exception entry and we are asking for trouble tracing them. They are
      also fairly uninteresting and tracing them just adds unnecessary
      overhead.
      
      [ This regression was introduced by fe1952fc
        "powerpc: Rework runlatch code" by myself --BenH
      ]
      
      CC: <stable@vger.kernel.org> [v3.4+]
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
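The irq-state mismatch described in the resume_kernel commit above can be illustrated with a small user-space model. This is only a sketch: `struct cpu_model`, `reconcile_irq_state()` and the other names are hypothetical stand-ins, not the powerpc kernel code, and the model ignores details such as replaying pending interrupts. It shows why a save/restore pair run with a stale soft-enabled flag ends up enabling interrupts, and why reconciling first avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of lazy interrupt masking. All names here are hypothetical. */
struct cpu_model {
    bool hard_enabled;  /* models the hardware interrupt-enable bit */
    bool soft_enabled;  /* models the software soft-enabled flag */
};

/* Exception entry: hardware disables interrupts, the software flag is untouched. */
static void exception_entry(struct cpu_model *cpu)
{
    cpu->hard_enabled = false;
}

/* Models the reconcile step: make the software flag match the hardware state. */
static void reconcile_irq_state(struct cpu_model *cpu)
{
    if (!cpu->hard_enabled)
        cpu->soft_enabled = false;
}

/* Models local_irq_save(): remember the software flag, then soft-disable. */
static bool model_irq_save(struct cpu_model *cpu)
{
    bool flags = cpu->soft_enabled;

    cpu->soft_enabled = false;
    return flags;
}

/* Models local_irq_restore(): restoring "enabled" really enables interrupts. */
static void model_irq_restore(struct cpu_model *cpu, bool flags)
{
    cpu->soft_enabled = flags;
    if (flags)
        cpu->hard_enabled = true;
}

/*
 * Runs a save/restore pair early in exception entry and reports whether
 * interrupts end up hard-enabled afterwards.
 */
bool traced_call_enables_irqs(bool reconcile_first)
{
    struct cpu_model cpu = { .hard_enabled = true, .soft_enabled = true };
    bool flags;

    exception_entry(&cpu);
    if (reconcile_first)
        reconcile_irq_state(&cpu);

    flags = model_irq_save(&cpu);
    model_irq_restore(&cpu, flags);

    assert(cpu.soft_enabled == flags);
    return cpu.hard_enabled;
}
```

In this model, `traced_call_enables_irqs(false)` reports that interrupts were re-enabled inside the exception path, while `traced_call_enables_irqs(true)` leaves them disabled.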
  2. 11 Jun, 2013 6 commits
    • B
      Fix lockup related to stop_machine being stuck in __do_softirq. · 34376a50
      Ben Greear committed
      The stop machine logic can lock up if all but one of the migration
      threads make it through the disable-irq step and the one remaining
      thread gets stuck in __do_softirq.  The reason __do_softirq can hang is
      that it has a bail-out based on jiffies timeout, but in the lockup case,
      jiffies itself is not incremented.
      
      To work around this, re-add the max_restart counter in __do_softirq and
      stop processing softirqs after 10 restarts.
      
      Thanks to Tejun Heo and Rusty Russell and others for helping me track
      this down.
      
      This was introduced in 3.9 by commit c10d7367 ("softirq: reduce
      latencies").
      
      It may be worth looking into ath9k to see if it has issues with its irq
      handler at a later date.
      
      The hang stack traces look something like this:
      
          ------------[ cut here ]------------
          WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7()
          Watchdog detected hard LOCKUP on cpu 2
          Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
          Pid: 23, comm: migration/2 Tainted: G         C   3.9.4+ #11
          Call Trace:
           <NMI>   warn_slowpath_common+0x85/0x9f
            warn_slowpath_fmt+0x46/0x48
            watchdog_overflow_callback+0x9c/0xa7
            __perf_event_overflow+0x137/0x1cb
            perf_event_overflow+0x14/0x16
            intel_pmu_handle_irq+0x2dc/0x359
            perf_event_nmi_handler+0x19/0x1b
            nmi_handle+0x7f/0xc2
            do_nmi+0xbc/0x304
            end_repeat_nmi+0x1e/0x2e
           <<EOE>>
            cpu_stopper_thread+0xae/0x162
            smpboot_thread_fn+0x258/0x260
            kthread+0xc7/0xcf
            ret_from_fork+0x7c/0xb0
          ---[ end trace 4947dfa9b0a4cec3 ]---
          BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17]
          Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
          irq event stamp: 835637905
          hardirqs last  enabled at (835637904): __do_softirq+0x9f/0x257
          hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80
          softirqs last  enabled at (5654720): __do_softirq+0x1ff/0x257
          softirqs last disabled at (5654725): irq_exit+0x5f/0xbb
          CPU 1
          Pid: 17, comm: migration/1 Tainted: G        WC   3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M.
          RIP: tasklet_hi_action+0xf0/0xf0
          Process migration/1
          Call Trace:
           <IRQ>
            __do_softirq+0x117/0x257
            irq_exit+0x5f/0xbb
            smp_apic_timer_interrupt+0x8a/0x98
            apic_timer_interrupt+0x72/0x80
           <EOI>
            printk+0x4d/0x4f
            stop_machine_cpu_stop+0x22c/0x274
            cpu_stopper_thread+0xae/0x162
            smpboot_thread_fn+0x258/0x260
            kthread+0xc7/0xcf
            ret_from_fork+0x7c/0xb0
      Signed-off-by: Ben Greear <greearb@candelatech.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Pekka Riikonen <priikone@iki.fi>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
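The restart limit described in the stop_machine lockup commit above can be sketched as a bounded loop. This is a user-space illustration under stated assumptions, not the kernel's softirq code: `struct softirq_model`, `run_pass()` and `process_softirqs()` are hypothetical names, and only the limit of 10 restarts comes from the commit message. The point is that the loop terminates on a counter, so it cannot spin forever even when a time-based bail-out never fires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RESTART 10  /* restart limit named in the commit message */

/* Hypothetical model of softirq state. */
struct softirq_model {
    uint32_t pending;       /* bitmask of raised softirqs */
    uint32_t reraise_mask;  /* bits whose handlers raise themselves again */
    unsigned long handled;  /* total handler invocations */
};

/* Run every currently pending handler once. */
static void run_pass(struct softirq_model *m)
{
    uint32_t pending = m->pending;
    unsigned bit;

    m->pending = 0;
    for (bit = 0; bit < 32; bit++) {
        uint32_t mask = UINT32_C(1) << bit;

        if (!(pending & mask))
            continue;
        m->handled++;
        if (m->reraise_mask & mask)
            m->pending |= mask;  /* handler raised itself again */
    }
}

/*
 * Process pending work, restarting while new work keeps arriving, but
 * never running more than MAX_RESTART passes. Returns the number of passes.
 * Work still pending on return would be deferred to another context.
 */
unsigned process_softirqs(struct softirq_model *m)
{
    int max_restart = MAX_RESTART;
    unsigned passes = 0;

    assert(m != NULL);
    do {
        run_pass(m);
        passes++;
    } while (m->pending && --max_restart);

    return passes;
}
```

A handler that always raises itself again stops after 10 passes with its bit still pending, whereas ordinary work that does not re-raise finishes in a single pass.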
    • L
      Merge tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs · 1b79821f
      Linus Torvalds committed
      Pull net/9p bug fix from Eric Van Hensbergen:
       "zero copy error fix"
      
      * tag '9p-3.10-bug-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
        net/9p: Handle error in zero copy request correctly for 9p2000.u
    • L
      Merge tag 'spi-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · ab029631
      Linus Torvalds committed
      Pull spi fixes from Mark Brown:
       "A few nasty issues, particularly a race with the interrupt controller
        in the xilinx driver, together with a couple of more minor fixes and a
        much needed move of the mailing list away from sourceforge."
      
      * tag 'spi-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: hspi: fixup long delay time
        spi: spi-xilinx: Remove ISR race condition
        spi: topcliff-pch: fix error return code in pch_spi_probe()
        spi: topcliff-pch: Pass correct pointer to free_irq()
        spi: Move mailing list to vger
    • L
      Merge tag 'stable/for-linus-3.10-rc5-tag' of... · 50e6f851
      Linus Torvalds committed
      Merge tag 'stable/for-linus-3.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
      
      Pull xen fixes from Konrad Rzeszutek Wilk:
       "Two bug-fixes for regressions:
         - xen/tmem stopped working after a certain combination of
           modprobe/swapon was used
         - cpu online/offlining would trigger WARN_ON."
      
      * tag 'stable/for-linus-3.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
        xen/tmem: Don't over-write tmem_frontswap_poolid after tmem_frontswap_init set it.
        xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU.
    • L
      Merge tag 'regmap-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · 30f5f739
      Linus Torvalds committed
      Pull regmap fixes from Mark Brown:
       "The biggest fix here is Lars-Peter's fix for custom locking callbacks
        which is pretty localised but important for those devices that use the
        feature.  Otherwise we've got a couple of fairly small cleanups which
        would have been sent sooner were it not for letting Lars-Peter's patch
        soak for a while"
      
      * tag 'regmap-v3.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: rbtree: Fixed node range check on sync
        regmap: regcache: Fixup locking for custom lock callbacks
        regmap: debugfs: Check return value of regmap_write()
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 822b4b6f
      Linus Torvalds committed
      Pull crypto fixes from Herbert Xu:
       "This fixes a build problem in sahara and temporarily disables two new
        optimisations because of performance regressions until a permanent fix
        is ready"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: sahara - fix building as module
        crypto: blowfish - disable AVX2 implementation
        crypto: twofish - disable AVX2 implementation
  3. 10 Jun, 2013 12 commits
  4. 09 Jun, 2013 13 commits
  5. 08 Jun, 2013 8 commits
    • L
      Merge tag 'trace-fixes-v3.10-rc3-v3' of... · 14d0ee05
      Linus Torvalds committed
      Merge tag 'trace-fixes-v3.10-rc3-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull tracing fixes from Steven Rostedt:
       "This contains 4 fixes.
      
        The first two fix the case where, with full RCU debugging enabled,
        enabling function tracing causes a live lock of the system.  This is
        due to the added debug checks in rcu_dereference_raw(), which is used
        by the function tracer.  These checks are themselves traced by the
        function tracer, and they add enough overhead to slow the system down
        to the point where finishing an interrupt can take longer than the
        time until the next interrupt is triggered, causing a live lock from
        the timer interrupt.
      
        Talking this over with Paul McKenney, we came up with a fix that adds
        a new rcu_dereference_raw_notrace() that does not perform these added
        checks, and let the function tracer use that.
      
        The third commit fixes a failed compile when branch tracing is
        enabled, because the branch tracer was not updated when the
        trace_test_buffer() selftest was converted.

        The fourth patch fixes a bug caught by the RCU lockdep code where
        rcu_read_lock() is performed when rcu is disabled (either going to or
        from idle, or user space).  This happened on the irqsoff tracer as it
        calls task_uid().  The fix here was to use current_uid() when
        possible, which doesn't use rcu locking and which, luckily, is always
        what is used when irqsoff calls this code."
      
      * tag 'trace-fixes-v3.10-rc3-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Use current_uid() for critical time tracing
        tracing: Fix bad parameter passed in branch selftest
        ftrace: Use the rcu _notrace variants for rcu_dereference_raw() and friends
        rcu: Add _notrace variation of rcu_dereference_raw() and hlist_for_each_entry_rcu()
    • R
      Revert "ACPI / scan: do not match drivers against objects having scan handlers" · ea7f6656
      Rafael J. Wysocki committed
      Commit 9f29ab11 ("ACPI / scan: do not match drivers against objects
      having scan handlers") introduced a boot regression on Tony's ia64 HP
      rx2600.  Tony says:
      
        "It panics with the message:
      
         Kernel panic - not syncing: Unable to find SBA IOMMU: Try a generic or DIG kernel
      
         [...] my problem comes from arch/ia64/hp/common/sba_iommu.c
         where the code in sba_init() says:
      
              acpi_bus_register_driver(&acpi_sba_ioc_driver);
              if (!ioc_list) {
      
         but because of this change we never managed to call ioc_init()
         so ioc_list doesn't get set up, and we die."
      
      Revert it to avoid this breakage and we'll fix the problem it attempted
      to address later.
      Reported-by: Tony Luck <tony.luck@gmail.com>
      Cc: 3.9+ <stable@vger.kernel.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • O
      Merge tag 'mxs-fixes-3.10' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes · 090878aa
      Olof Johansson committed
      From Shawn Guo, mxs fixes for 3.10:
      
      - Since the move to MULTI_IRQ_HANDLER, the 0x7f polling for "no
        interrupt" in icoll_handle_irq() is wrong, because 0x7f is a
        valid interrupt number, the irq of gpio bank 0.  That unnecessary
        polling results in the driver not detecting when irq 0x7f is active,
        which effectively deadlocks the machine.  The fix removes the
        interrupt poll loop and allows usage of the gpio0 interrupt without
        an infinite loop.
      
      * tag 'mxs-fixes-3.10' of git://git.linaro.org/people/shawnguo/linux-2.6:
        ARM: mxs: icoll: Fix interrupts gpio bank 0
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • O
      Merge tag 'imx-fixes-3.10-2' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes · 3d0d8b91
      Olof Johansson committed
      From Shawn Guo, imx fixes for 3.10, take 2:
      
      - One device tree fix that adds the per clock to all spi nodes.
        The clock is needed by the spi driver to calculate the bit rate
        divisor.  The spi nodes in the current device trees either do not
        have the clock or define it as a dummy clock, in which case the
        driver probe will fail or spi will run at a wrong bit rate.
      
      - Two imx6q clock fixes, which correct axi_sels and ldb_di_sels.
      
      * tag 'imx-fixes-3.10-2' of git://git.linaro.org/people/shawnguo/linux-2.6:
        ARM: imx: clk-imx6q: AXI clock select index is incorrect
        ARM: dts: imx: fix clocks for cspi
        ARM i.MX6q: fix for ldb_di_sels
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • D
      ARM: exynos: add debug_ll_io_init() call in exynos_init_io() · 9c1fcdcc
      Doug Anderson committed
      If the early MMU mapping of the UART happens to get booted out of the
      TLB between the start of paging_init() and when we finally re-add the
      UART at the very end of s3c_init_cpu(), we'll get a hang at bootup if
      we've got early_printk enabled.  Avoid this hang by calling
      debug_ll_io_init() early.
      
      Without this patch, you can reliably reproduce a hang when early
      printk is enabled by adding flush_tlb_all() at the start of
      exynos_init_io().  After this patch the hang goes away.
      Signed-off-by: Doug Anderson <dianders@chromium.org>
      Acked-by: Kukjin Kim <kgene.kim@samsung.com>
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • O
      Merge tag 'renesas-fixes-for-v3.10' of... · fb565ff7
      Olof Johansson committed
      Merge tag 'renesas-fixes-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
      
      From Simon Horman, Renesas ARM based SoC fixes for v3.10:
      - Correction to USB OVC and PENC pin groupings on r8a7779 SoC.
        This avoids conflicts when the USB_OVCn pins are used by another function.
        This has been observed to be a problem in v3.10-rc1.
      - Update CMT clock rating for sh73a0 SoC to resolve boot failure
        on kzm9g-reference. This resolves a regression between v3.9 and v3.10-rc1.
      
      * tag 'renesas-fixes-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
        ARM: shmobile: sh73a0: Update CMT clockevent rating to 80
        sh-pfc: r8a7779: Don't group USB OVC and PENC pins
      Signed-off-by: Olof Johansson <olof@lixom.net>
    • T
      ARM: EXYNOS: uncompress - print debug messages if DEBUG_LL is defined · 437d8ac5
      Tushar Behera committed
      Printing low-level debug messages assumes that the specified UART port
      has been preconfigured by the bootloader. An incorrectly specified UART
      port results in the system stalling while printing the message
      "Uncompressing Linux... done, booting the kernel".

      This UART port number is specified through S3C_LOWLEVEL_UART_PORT. Since
      the UART port might differ between boards, it is not possible to
      specify it correctly for every board that uses a common defconfig file.

      Calling this print subroutine only when DEBUG_LL is defined fixes the
      problem. By disabling DEBUG_LL in the default config file, we would be
      able to boot multiple boards with different default UART ports.

      With this approach, we miss the print "Uncompressing Linux...
      done, booting the kernel." when DEBUG_LL is not defined.
      Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
      Signed-off-by: Olof Johansson <olof@lixom.net>
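The idea in the EXYNOS uncompress commit above, printing only when DEBUG_LL is defined, can be sketched in plain C. This is an illustration under assumptions, not the actual uncompress code: `uart_putc()`, `decompress_puts()`, `uart_bytes_written()` and the `DEBUG_LL_ENABLED` helper macro are hypothetical, and the "UART" here is just an in-memory buffer. Only the DEBUG_LL symbol itself comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Turn the presence of DEBUG_LL into a constant usable in normal C code. */
#ifdef DEBUG_LL
#define DEBUG_LL_ENABLED 1
#else
#define DEBUG_LL_ENABLED 0
#endif

/* Stand-in for the UART: an in-memory log instead of real hardware. */
static char uart_log[128];
static size_t uart_len;

/* Hypothetical low-level output routine; on hardware this could stall. */
static void uart_putc(char c)
{
    assert(uart_len < sizeof(uart_log));
    uart_log[uart_len++] = c;
}

/* Print a message only when low-level debugging was requested at build time. */
void decompress_puts(const char *s)
{
    if (!DEBUG_LL_ENABLED)
        return;  /* never touch the (possibly unconfigured) UART */
    while (*s)
        uart_putc(*s++);
}

/* Number of bytes that reached the stand-in UART. */
size_t uart_bytes_written(void)
{
    return uart_len;
}
```

Built without `-DDEBUG_LL`, `decompress_puts()` writes nothing, which mirrors the trade-off the commit message mentions: the boot message is lost, but a wrongly configured port can no longer stall the boot.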
    • L
      Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband · e8193ce5
      Linus Torvalds committed
      Pull infiniband fixes from Roland Dreier:
       - qib RCU/lockdep fix
       - iser device removal fix, plus doc fixes
      
      * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
        IB/qib: Fix lockdep splat in qib_alloc_lkey()
        MAINTAINERS: Add entry for iSCSI Extensions for RDMA (iSER) initiator
        IB/iser: Add Mellanox copyright
        IB/iser: Fix device removal flow