1. 18 Feb, 2014 25 commits
  2. 14 Feb, 2014 1 commit
  3. 10 Feb, 2014 4 commits
  4. 09 Feb, 2014 10 commits
    • perf/x86/p4: Block PMIs on init to prevent a stream of unknown NMIs · 90ed5b0f
      Committed by Don Zickus
      A bunch of unknown NMIs have popped up on a Pentium4 recently when booting
      into a kdump kernel.  This was exposed because the watchdog timer went
      from 60 seconds down to 10 seconds (increasing the ability to reproduce
      this problem).
      
      What is happening is on boot up of the second kernel (the kdump one),
      the previous nmi_watchdogs were enabled on thread 0 and thread 1.  The
      second kernel only initializes one cpu but the perf counter on thread 1
      still counts.
      
      Normally in a kdump scenario, the other cpus are blocking in an NMI loop,
      but more importantly their local apics have the performance counters disabled
      (iow LVTPC is masked).  So any counters that fire are masked and never get
      through to the second kernel.
      
      However, on a P4 the local apic is shared by both threads and thread1's PMI
      (despite being configured to only interrupt thread1) will generate an NMI on
      thread0.  Because thread0 knows nothing about this NMI, it is seen as an
      unknown NMI.
      
      On its own this would be tolerable: it is a kdump kernel, strange things
      happen, and a single unknown NMI is no big deal.
      
      Unfortunately, the P4 comes with another quirk: the overflow bit has to be
      cleared to prevent a stream of NMIs.  Thread0 knows nothing about the
      counter, so the bit is never cleared.  This is the problem.
      
      The kdump kernel can not execute because of the endless NMIs that happen.
      
      To solve this, I instrumented the p4 perf init code to walk all the counters
      and zero them out (just like a normal reset would).
      
      Now when the counters go off, they do not generate anything and no unknown
      NMIs are seen.
      
      I tested this on a P4 we have in our lab.  After two or three crashes, I could
      normally reproduce the problem.  Now after 10 crashes, everything continues
      to boot correctly.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140120154115.GZ25953@redhat.com
      [ Fixed a stylistic detail. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
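The effect of zeroing the counters at init can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: the names `run`, `zero_all`, `leftover_state` and the threshold `OVERFLOW_AT` are hypothetical, a counter is reduced to a `config` word (non-zero meaning "enabled") and a `value`, and the model does not reproduce the uncleared overflow bit that turns one stray PMI into an endless stream.

```python
OVERFLOW_AT = 3  # toy overflow threshold (assumption, not a hardware value)


def leftover_state():
    """Hypothetical state inherited from the first kernel: two live counters."""
    return [{"config": 1, "value": 0}, {"config": 1, "value": 0}]


def zero_all(counters):
    """Model of the fix: walk every counter and zero its configuration."""
    for counter in counters:
        counter["config"] = 0


def run(counters, ticks):
    """Advance enabled counters and return how many PMIs (NMIs) they raise."""
    nmis = 0
    for _ in range(ticks):
        for counter in counters:
            if counter["config"] == 0:
                continue  # disabled counters neither count nor interrupt
            counter["value"] += 1
            if counter["value"] >= OVERFLOW_AT:
                counter["value"] = 0
                nmis += 1
    return nmis
```

In this model, counters left running by the first kernel keep raising interrupts in the second one, while a call to `zero_all()` before `run()` leaves nothing that can fire.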
    • perf/x86/p4: Fix counter corruption when using lots of perf groups · 13beacee
      Committed by Don Zickus
      On a P4 box stressing perf with:
      
         ./perf record -o perf.data ./perf stat -v ./perf bench all
      
      it was noticed that a slew of unknown NMIs would pop out rather quickly.
      
      Painfully debugging this ancient platform led me to notice cross-cpu counter
      corruption.
      
      The P4 machine is special in that it has 18 counters, half are used for cpu0
      and the other half is for cpu1 (or all 18 if hyperthreading is disabled).  But
      the splitting of the counters has to be actively managed by the software.
      
      In this particular bug, one of the cpu0 specific counters was being used by
      cpu1 and caused all sorts of random unknown nmis.
      
      I am not entirely sure about the corruption path, but what happens is:
      
       o perf schedules a group with p4_pmu_schedule_events()
       o inside p4_pmu_schedule_events(), it notices an hwc pointer is being reused
         but for a different cpu, so it 'swaps' the config bits and returns the
         updated 'assign' array with a _new_ index.
       o perf schedules another group with p4_pmu_schedule_events()
       o inside p4_pmu_schedule_events(), it notices an hwc pointer is being reused
         (the same one as above) but for the _same_ cpu [BUG!!], so it updates the
         'assign' array to use the _old_ (wrong cpu) index because the _new_ index is in
         an earlier part of the 'assign' array (and hasn't been committed yet).
       o perf commits the transaction using the wrong index and corrupts the other cpu
      
      The [BUG!!] is because the 'hwc->config' is updated but not the 'hwc->idx'.  So
      the check for 'p4_should_swap_ts()' is correct the first time around but
      incorrect the second time around (because hwc->config was updated in between).
      
      I think the spirit of perf was to not modify anything until all the
      transactions had a chance to 'test' if they would succeed, and if so, commit
      atomically.  However, P4 breaks this spirit by touching the hwc->config
      element.
      
      So my fix is to continue the un-perf like breakage, by assigning hwc->idx to -1
      on swap to tell follow up group scheduling to find a new index.
      
      Of course, if the transaction fails, rolling this back will be difficult, but
      that is no different from how the current code works. :-)  And I wasn't sure
      how much effort I should put into cleaning up the code for a platform that is
      almost 10 years old by now.
      
      Hence the lazy fix.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1391024270-19469-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
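The scheduling sequence described in this commit can be replayed with a small toy model. It is a sketch, not the kernel's `p4_pmu_schedule_events()`: the `Event` class, `schedule()`, `two_passes()`, and the simple "counters 0-8 belong to cpu0, 9-17 to cpu1" split are all assumptions made for illustration. The only behaviour taken from the commit message is that the config bits are swapped immediately, while the stale index is reused unless it is reset to -1.

```python
from dataclasses import dataclass

COUNTERS_PER_THREAD = 9  # toy split of the 18 counters (assumption)


def counter_owner(idx):
    """Thread that owns counter idx in this toy model (0-8 -> 0, 9-17 -> 1)."""
    return idx // COUNTERS_PER_THREAD


@dataclass
class Event:
    config_thread: int  # thread the config bits currently target
    idx: int = -1       # last committed counter index, -1 if none


def schedule(event, cpu, invalidate_on_swap):
    """One scheduling 'test' pass for one event; returns the chosen index."""
    if event.idx != -1 and event.config_thread == cpu:
        # Config already matches this cpu, so the old index is trusted.
        return event.idx
    if event.idx != -1 and invalidate_on_swap:
        event.idx = -1  # the fix: make follow-up passes look for a new index
    # Swap the config bits right away, before anything is committed.
    event.config_thread = cpu
    # Hand out the first counter owned by this cpu (nothing else is in use).
    return cpu * COUNTERS_PER_THREAD


def two_passes(invalidate_on_swap):
    """Schedule the same event twice on cpu 1 before any commit happens."""
    event = Event(config_thread=0, idx=3)  # last ran on cpu 0, counter 3
    first = schedule(event, 1, invalidate_on_swap)
    second = schedule(event, 1, invalidate_on_swap)
    return first, second
```

Without the reset, the second pass sees a matching config and hands back counter 3, which belongs to the other thread in this model. With the reset, both passes pick a counter owned by cpu 1.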
    • x86/nmi: Push duration printk() to irq context · e90c7853
      Committed by Peter Zijlstra
      Calling printk() from NMI context is bad (TM), so move it to IRQ
      context.
      
      In doing so we slightly change (probably wreck) the debugfs
      nmi_longest_ns thingy, in that it doesn't update to reflect the
      longest, nor does writing to it reset the count.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Link: http://lkml.kernel.org/n/tip-rdw0au56a5ymis1u8p48c12d@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Push the duration-logging printk() to IRQ context · 6a02ad66
      Committed by Peter Zijlstra
      Calling printk() from NMI context is bad (TM), so move it to IRQ
      context.
      
      This also avoids the problem where the printk() time is measured by
      the generic NMI duration goo and triggers a second warning.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Link: http://lkml.kernel.org/n/tip-75dv35xf6dhhmeb7nq6fua31@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge branch 'linus' into perf/core · 3c3d7cb1
      Committed by Ingo Molnar
      Refresh the branch to a v3.14-rc base before queueing up new devel patches.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/x86: Fix Userspace RDPMC switch · 0e9f2204
      Committed by Peter Zijlstra
      The current code forgets to change the CR4 state on the current CPU.
      Use on_each_cpu() instead of smp_call_function().
      Reported-by: Mark Davies <junk@eslaf.co.uk>
      Suggested-by: Mark Davies <junk@eslaf.co.uk>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: fweisbec@gmail.com
      Link: http://lkml.kernel.org/n/tip-69efsat90ibhnd577zy3z9gh@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
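The difference between the two calls named in this commit can be shown with a toy model. In the kernel, `smp_call_function()` runs a function on all other CPUs, while `on_each_cpu()` also runs it on the calling CPU. The Python functions below only borrow those names; their signatures, the `enable_rdpmc()` helper, and the per-CPU dictionary standing in for the CR4.PCE bit are assumptions made for illustration.

```python
def smp_call_function(func, cpus, current):
    """Toy model: run func on every CPU except the calling one."""
    for cpu in cpus:
        if cpu != current:
            func(cpu)


def on_each_cpu(func, cpus, current):
    """Toy model: run func on every CPU, including the calling one."""
    for cpu in cpus:
        func(cpu)


def enable_rdpmc(call, ncpus=4, current=0):
    """Flip a per-CPU PCE flag using the given cross-call helper."""
    pce = {cpu: False for cpu in range(ncpus)}  # stand-in for CR4.PCE per CPU

    def set_pce(cpu):
        pce[cpu] = True

    call(set_pce, range(ncpus), current)
    return pce
```

With the first helper the calling CPU keeps its old flag, which mirrors the "forgets to change the CR4 state on the current CPU" symptom. With the second, every CPU ends up in the same state.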
    • perf/x86/intel/p6: Add userspace RDPMC quirk for PPro · e97df763
      Committed by Peter Zijlstra
      PPro machines can die hard when PCE gets enabled due to a CPU erratum.
      The safe way is to disable it by default and keep it disabled.
      
      See erratum 26 in:
      
        http://download.intel.com/design/archives/processors/pro/docs/24268935.pdf
      
      Reported-and-Tested-by: Mark Davies <junk@eslaf.co.uk>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vince@deater.net>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140206170815.GW2936@laptop.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'pinctrl-v3.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 49447903
      Committed by Linus Torvalds
      Pull pinctrl fixes from Linus Walleij:
       "First round of pin control fixes for v3.14:
      
         - Protect pinctrl_list_add() with the proper mutex.  This was
           identified by Red Hat.  It caused nasty locking warnings and was
           root-caused by Stanislaw Gruszka.
      
         - Avoid adding dangerous debugfs files when either half of the
           subsystem is unused: pinmux or pinconf.
      
         - Various fixes to various drivers: locking, hardware particulars, DT
           parsing, error codes"
      
      * tag 'pinctrl-v3.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: tegra: return correct error type
        pinctrl: do not init debugfs entries for unimplemented functionalities
        pinctrl: protect pinctrl_list add
        pinctrl: sirf: correct the pin index of ac97_pins group
        pinctrl: imx27: fix offset calculation in imx_read_2bit
        pinctrl: vt8500: Change devicetree data parsing
        pinctrl: imx27: fix wrong offset to ICONFB
        pinctrl: at91: use locked variant of irq_set_handler
    • Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c132adef
      Committed by Linus Torvalds
      Pull irq fix from Thomas Gleixner:
       "Add a missing Kconfig dependency"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Generic irq chip requires IRQ_DOMAIN
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c1ff8431
      Committed by Linus Torvalds
      Pull x86 fixes from Peter Anvin:
       "Quite a varied little collection of fixes.  Most of them are
        relatively small or isolated; the biggest one is Mel Gorman's fixes
        for TLB range flushing.
      
        A couple of AMD-related fixes (including not crashing when given an
        invalid microcode image) and fix a crash when compiled with gcov"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, microcode, AMD: Unify valid container checks
        x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=y
        x86/efi: Allow mapping BGRT on x86-32
        x86: Fix the initialization of physnode_map
        x86, cpu hotplug: Fix stack frame warning in check_irq_vectors_for_cpu_disable()
        x86/intel/mid: Fix X86_INTEL_MID dependencies
        arch/x86/mm/srat: Skip NUMA_NO_NODE while parsing SLIT
        mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge
        x86: mm: change tlb_flushall_shift for IvyBridge
        x86/mm: Eliminate redundant page table walk during TLB range flushing
        x86/mm: Clean up inconsistencies when flushing TLB ranges
        mm, x86: Account for TLB flushes only when debugging
        x86/AMD/NB: Fix amd_set_subcaches() parameter type
        x86/quirks: Add workaround for AMD F16h Erratum792
        x86, doc, kconfig: Fix dud URL for Microcode data