1. 17 May, 2011 2 commits
  2. 16 May, 2011 1 commit
    • Y
      x86, apic: Fix spurious error interrupts triggering on all non-boot APs · e503f9e4
      Youquan Song committed
      This patch fixes a bug reported by a customer, who found
      that many spurious error interrupts were reported on all
      non-boot CPUs (APs) during the system boot stage.
      
      According to Chapter 10 of the Intel Software Developer's Manual,
      Volume 3A, the local APIC may signal an illegal vector error when
      an LVT entry is set to an illegal vector value (0~15) under
      FIXED delivery mode (bits 8-10 are 0), regardless of whether
      the mask bit is set or an interrupt actually happens. These
      errors are seen as error interrupts.
      
      The initial value of thermal LVT entries on all APs always reads
      0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
      sequence to them and LVT registers are reset to 0s except for
      the mask bits which are set to 1s when APs receive INIT IPI.
      
      When the BIOS takes over the thermal throttling interrupt,
      the LVT thermal delivery mode should be SMI, and the kernel is
      required to keep each AP's LVT thermal monitor register
      programmed the same way.
      
      The issue occurs when the BIOS does not take over the thermal
      throttling interrupt: the AP's LVT thermal monitor register is
      restored to 0x10000, which means vector 0 and fixed delivery mode,
      so all APs signal illegal vector error interrupts.
      
      This patch checks that the interrupt delivery mode is not fixed mode
      before restoring the AP's LVT thermal monitor register.
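The check described above can be illustrated with a small user-space sketch. It is not the kernel change itself: the macro names and the helper `lvt_thermal_should_restore()` are hypothetical, and the layout assumed here (delivery mode in bits 8-10, mask in bit 16) is the LVT layout given in the Intel SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed LVT entry layout (Intel SDM Vol. 3A, local APIC chapter). */
#define LVT_DELIVERY_MODE_MASK 0x00000700u /* bits 8-10: delivery mode */
#define LVT_DM_FIXED           0x00000000u /* fixed delivery mode */
#define LVT_DM_SMI             0x00000200u /* SMI delivery mode */
#define LVT_MASKED             0x00010000u /* bit 16: mask bit */

/*
 * Hypothetical helper: returns true when a saved LVT thermal value may be
 * written back on an AP. A value in fixed delivery mode is rejected, because
 * writing 0x10000 (vector 0, fixed mode, masked) raises an illegal vector
 * error even though the entry is masked.
 */
static bool lvt_thermal_should_restore(uint32_t saved_lvt)
{
	return (saved_lvt & LVT_DELIVERY_MODE_MASK) != LVT_DM_FIXED;
}
```

With these definitions, the post-INIT value `0x10000` is skipped, while a BIOS-programmed SMI entry such as `LVT_MASKED | LVT_DM_SMI` would still be restored.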
      Signed-off-by: Youquan Song <youquan.song@intel.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Acked-by: Yong Wang <yong.y.wang@intel.com>
      Cc: hpa@linux.intel.com
      Cc: joe@perches.com
      Cc: jbaron@redhat.com
      Cc: trenn@suse.de
      Cc: kent.liu@intel.com
      Cc: chaohong.guo@intel.com
      Cc: <stable@kernel.org> # As far back as possible
      Link: http://lkml.kernel.org/r/1303402963-17738-1-git-send-email-youquan.song@intel.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 13 May, 2011 1 commit
  4. 10 May, 2011 1 commit
    • J
      x86, UV: Fix NMI handler for UV platforms · 1d44e828
      Jack Steiner committed
      This fixes problems seen on UV systems handling NMIs from the
      node controller.
      
      I isolated the "dazed..." messages that I saw earlier to a bug in
      the BMC on our platform. It was sending NMIs w/o properly setting
      a register that indicated the source of NMI.
      
      So rather than _assuming_ any unhandled NMI came from the UV system
      maintenance console (SMC), add a check to verify that the SMC actually
      sent the NMI.
      Signed-off-by: Jack Steiner <steiner@sgi.com>
      Cc: gorcunov@gmail.com
      Cc: dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 06 May, 2011 1 commit
  6. 03 May, 2011 1 commit
  7. 02 May, 2011 1 commit
  8. 28 Apr, 2011 1 commit
    • S
      x86: devicetree: Configure IOAPIC pin only once · 20443598
      Sebastian Andrzej Siewior committed
      We use io_apic_setup_irq_pin() to configure a pin's interrupt
      number, polarity and type. This is done on every irq_create_of_mapping()
      call, which happens for instance during PCI enable calls. Level-typed
      interrupts are masked by default; edge-typed ones are unmasked.
      
      On the first ->xlate() call the level interrupt is configured and
      masked. The driver calls request_irq() and the line is unmasked. Let's
      assume the interrupt line is shared with another device and we call
      pci_enable_device() for this device. The ->xlate() configures the pin
      again and it is masked. request_irq() does not unmask the line because
      it _is_ already unmasked according to its internal state. So the
      interrupt will never be unmasked again.
      
      This patch is based on an earlier work by Torben Hohn and solves the
      problem by configuring the pin only once. Since all devices must agree
      on the same type and polarity there is no point in configuring the pin
      more than once.
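The "configure only once" idea can be modelled in a few lines of plain C. This is a sketch under stated assumptions, not the kernel implementation: `NR_PINS`, `setup_pin()` and `setup_pin_once()` are hypothetical names, and the pin state is reduced to a single masked flag.

```c
#include <stdbool.h>

#define NR_PINS 24 /* assumed pin count, for illustration only */

static bool pin_programmed[NR_PINS]; /* set once a pin has been configured */
static bool pin_masked[NR_PINS];     /* models the hardware mask state */
static int setup_calls;              /* counts real (re)configurations */

/* Stand-in for the real pin setup: level-typed pins end up masked. */
static void setup_pin(int pin, bool level)
{
	pin_masked[pin] = level;
	setup_calls++;
}

/*
 * Configure a pin the first time it is seen; later calls are no-ops, so a
 * line that a driver has already unmasked is never masked again behind its
 * back. Returns 0 on success, -1 for an out-of-range pin.
 */
static int setup_pin_once(int pin, bool level)
{
	if (pin < 0 || pin >= NR_PINS)
		return -1;
	if (pin_programmed[pin])
		return 0;
	pin_programmed[pin] = true;
	setup_pin(pin, level);
	return 0;
}
```

In this model, a second device sharing the line calls `setup_pin_once()` again, and the mask state left by the first driver's `request_irq()` is preserved.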
      
      [ tglx: Split out the ce4100 part into a separate patch ]
      
      Cc: Torben Hohn <torbenh@linutronix.de>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Link: http://lkml.kernel.org/r/%3C20110427143052.GA15211%40linutronix.de%3E
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  9. 27 Apr, 2011 2 commits
    • D
      perf, x86, nmi: Move LVT un-masking into irq handlers · 2bce5dac
      Don Zickus committed
      It was noticed that P4 machines were generating double NMIs for
      each perf event.  These extra NMIs lead to 'Dazed and confused'
      messages on the screen.
      
      I tracked this down to a P4 quirk that said the overflow bit had
      to be cleared before re-enabling the apic LVT mask.  My first
      attempt was to move the un-masking inside the perf nmi handler
      from before the chipset NMI handler to after.
      
      This broke Nehalem boxes that seem to like the unmasking before
      the counters themselves are re-enabled.
      
      In order to keep this change simple for 2.6.39, I decided to
      just simply move the apic LVT un-masking to the beginning of all
      the chipset NMI handlers, with the exception of Pentium4's to
      fix the double NMI issue.
      
      Later on we can move the un-masking to later in the handlers to
      save a number of 'extra' NMIs on those particular chipsets.
      
      I tested this change on a P4 machine, an AMD machine, a Nehalem
      box, and a core2quad box.  'perf top' worked correctly along
      with various other small 'perf record' runs.  Anything high
      stress breaks all the machines but that is a different problem.
      
      Thanks to various people for testing different versions of this
      patch.
      Reported-and-tested-by: Shaun Ruffell <sruffell@digium.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Link: http://lkml.kernel.org/r/1303900353-10242-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      CC: Cyrill Gorcunov <gorcunov@gmail.com>
    • I
      perf events, x86: Work around the Nehalem AAJ80 erratum · ec75a716
      Ingo Molnar committed
      On Nehalem CPUs the retired branch-misses event can be completely bogus
      when there are no branch misses occurring. When there are a lot of branch
      misses the count is pretty accurate. Still, this leaves us with an
      event that over-counts a lot.
      
      Detect this erratum and work around it by using BR_MISP_EXEC.ANY events.
      These will also count speculated branches, but they are still a lot more
      precise in practice than the architectural event.
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/n/tip-yyfg0bxo9jsqxd6a0ovfny27@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  10. 26 Apr, 2011 1 commit
  11. 25 Apr, 2011 1 commit
  12. 22 Apr, 2011 4 commits
    • P
      perf, x86: Update/fix Intel Nehalem cache events · f4929bd3
      Peter Zijlstra committed
      Change the Nehalem cache events to use retired memory instruction counters
      (similar to Westmere); this greatly improves the provided stats.
      
      Using:
      
      main ()
      {
              int i;
      
              for (i = 0; i < 1000000000; i++) {
                      asm("mov (%%rsp), %%rbx;"
                          "mov %%rbx, (%%rsp);" : : : "rbx");
              }
      }
      
      We find:
      
       $ perf stat --repeat 10 -e instructions:u -e l1-dcache-loads:u -e l1-dcache-stores:u ./loop_1b_loads+stores
        Performance counter stats for './loop_1b_loads+stores' (10 runs):
            4,000,081,056 instructions:u           #      0.000 IPC ( +-   0.000% )
            4,999,502,846 l1-dcache-loads:u          ( +-   0.008% )
            1,000,034,832 l1-dcache-stores:u         ( +-   0.000% )
               1.565184942  seconds time elapsed   ( +-   0.005% )
      
      The 5b is surprising - we'd expect 1b:
      
       $ perf stat --repeat 10 -e instructions:u -e r10b:u -e l1-dcache-stores:u ./loop_1b_loads+stores
        Performance counter stats for './loop_1b_loads+stores' (10 runs):
            4,000,081,054 instructions:u           #      0.000 IPC ( +-   0.000% )
            1,000,021,961 r10b:u                     ( +-   0.000% )
            1,000,030,951 l1-dcache-stores:u         ( +-   0.000% )
               1.565055422  seconds time elapsed   ( +-   0.003% )
      
      Which this patch thus fixes.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Link: http://lkml.kernel.org/n/tip-q9rtru7b7840tws75xzboapv@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • C
      perf, x86: P4 PMU - Don't forget to clear cpuc->active_mask on overflow · 1ea5a6af
      Cyrill Gorcunov committed
      It's not enough to simply disable the event on overflow; the
      cpuc->active_mask bit should be cleared as well, otherwise the counter
      may stay marked "active" even though it is in fact already disabled
      (which may leave the user unable to use this counter any further).
      
      Don pointed out that:
      
       " I also noticed this patch fixed some unknown NMIs
         on a P4 when I stressed the box".
      Tested-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Link: http://lkml.kernel.org/r/1303398203-2918-3-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • I
      x86, perf event: Turn off unstructured raw event access to offcore registers · b52c55c6
      Ingo Molnar committed
      Andi Kleen pointed out that the Intel offcore support patches were merged
      without user-space tool support for the functionality:
      
       |
       | The offcore_msr perf kernel code was merged into 2.6.39-rc*, but the
       | user space bits were not. This made it impossible to set the extra mask
       | and actually do the OFFCORE profiling
       |
      
      Andi submitted a preliminary patch for user-space support, as an
      extension to perf's raw event syntax:
      
       |
       | Some raw events -- like the Intel OFFCORE events -- support additional
       | parameters. These can be appended after a ':'.
       |
       | For example on a multi socket Intel Nehalem:
       |
       |    perf stat -e r1b7:20ff -a sleep 1
       |
       | Profile the OFFCORE_RESPONSE.ANY_REQUEST with event mask REMOTE_DRAM_0
       | that measures any access to DRAM on another socket.
       |
      
      But this kind of usability is absolutely unacceptable - users should not
      be expected to type in magic, CPU and model specific incantations to get
      access to useful hardware functionality.
      
      The proper solution is to expose useful offcore functionality via
      generalized events - that way users do not have to care which specific
      CPU model they are using, they can use the conceptual event and not some
      model specific quirky hex number.
      
      We already have such generalization in place for CPU cache events,
      and it's all very extensible.
      
      "Offcore" events measure general DRAM access patterns along various
      parameters. They are particularly useful in NUMA systems.
      
      We want to support them via generalized DRAM events: either as the
      fourth level of cache (after the last-level cache), or as a separate
      generalization category.
      
      That way user-space support would be very obvious, memory access
      profiling could be done via self-explanatory commands like:
      
        perf record -e dram ./myapp
        perf record -e dram-remote ./myapp
      
      ... to measure DRAM accesses or more expensive cross-node NUMA DRAM
      accesses.
      
      These generalized events would work on all CPUs and architectures that
      have comparable PMU features.
      
      ( Note, these are just examples: an actual implementation could have more
        sophistication and more parameters - as long as they center around
        similarly simple usecases. )
      
      Now we do not want to revert *all* of the current offcore bits, as they
      are still somewhat useful for generic last-level-cache events, implemented
      in this commit:
      
        e994d7d2: perf: Fix LLC-* events on Intel Nehalem/Westmere
      
      But we definitely do not yet want to expose the unstructured raw events
      to user-space, until better generalization and usability is implemented
      for these hardware event features.
      
      ( Note: after generalization has been implemented raw offcore events can be
        supported as well: there can always be an odd event that is marginally
        useful but not useful enough to generalize. DRAM profiling is definitely
        *not* such a category so generalization must be done first. )
      
      Furthermore, PERF_TYPE_RAW access to these registers was not intended
      to go upstream without proper support - it was a side-effect of the above
      e994d7d2 commit, not mentioned in the changelog.
      
      As v2.6.39 is nearing release we go for the simplest approach: disable
      the PERF_TYPE_RAW offcore hack for now, before it escapes into a released
      kernel and becomes an ABI.
      
      Once proper structure is implemented for these hardware events and users
      are offered usable solutions we can revisit this issue.
      Reported-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1302658203-4239-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • A
      perf: Support Xeon E7's via the Westmere PMU driver · b2508e82
      Andi Kleen committed
      There's a new model number public, 47, for Xeon E7 (aka Westmere EX).
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: a.p.zijlstra@chello.nl
      Link: http://lkml.kernel.org/r/1303429715-10202-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  13. 21 Apr, 2011 1 commit
  14. 20 Apr, 2011 1 commit
  15. 19 Apr, 2011 4 commits
  16. 16 Apr, 2011 2 commits
    • J
      x86, amd: Disable GartTlbWlkErr when BIOS forgets it · 5bbc097d
      Joerg Roedel committed
      This patch disables GartTlbWlk errors on AMD Fam10h CPUs if
      the BIOS forgets to do it (or is just too old). Leaving
      these errors enabled can cause a sync-flood on the CPU,
      causing a reboot.
      
      The AMD BKDG recommends disabling GART TLB Wlk Error completely.
      
      This patch is the fix for
      
      	https://bugzilla.kernel.org/show_bug.cgi?id=33012
      
      on my machine.
      Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
      Link: http://lkml.kernel.org/r/20110415131152.GJ18463@8bytes.org
      Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
      Cc: <stable@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • K
      x86, NUMA: Fix fakenuma boot failure · 7d6b4670
      KOSAKI Motohiro committed
      Currently, the numa=fake boot parameter is broken. If it's used,
      the kernel may panic due to a divide by zero error, depending on the
      CPU configuration:
      
      Call Trace:
       [<ffffffff8104ad4c>] find_busiest_group+0x38c/0xd30
       [<ffffffff81086aff>] ? local_clock+0x6f/0x80
       [<ffffffff81050533>] load_balance+0xa3/0x600
       [<ffffffff81050f53>] idle_balance+0xf3/0x180
       [<ffffffff81550092>] schedule+0x722/0x7d0
       [<ffffffff81550538>] ? wait_for_common+0x128/0x190
       [<ffffffff81550a65>] schedule_timeout+0x265/0x320
       [<ffffffff81095815>] ? lock_release_holdtime+0x35/0x1a0
       [<ffffffff81550538>] ? wait_for_common+0x128/0x190
       [<ffffffff8109bb6c>] ? __lock_release+0x9c/0x1d0
       [<ffffffff815534e0>] ? _raw_spin_unlock_irq+0x30/0x40
       [<ffffffff815534e0>] ? _raw_spin_unlock_irq+0x30/0x40
       [<ffffffff81550540>] wait_for_common+0x130/0x190
       [<ffffffff81051920>] ? try_to_wake_up+0x510/0x510
       [<ffffffff8155067d>] wait_for_completion+0x1d/0x20
       [<ffffffff8107f36c>] kthread_create_on_node+0xac/0x150
       [<ffffffff81077bb0>] ? process_scheduled_works+0x40/0x40
       [<ffffffff8155045f>] ? wait_for_common+0x4f/0x190
       [<ffffffff8107a283>] __alloc_workqueue_key+0x1a3/0x590
       [<ffffffff81e0cce2>] cpuset_init_smp+0x6b/0x7b
       [<ffffffff81df3d07>] kernel_init+0xc3/0x182
       [<ffffffff8155d5e4>] kernel_thread_helper+0x4/0x10
       [<ffffffff81553cd4>] ? retint_restore_args+0x13/0x13
       [<ffffffff81df3c44>] ? start_kernel+0x400/0x400
       [<ffffffff8155d5e0>] ? gs_change+0x13/0x13
      
      The divide by zero is caused by the following line, where
      group->cpu_power==0:
      
       kernel/sched_fair.c::update_sg_lb_stats()
              /* Adjust by relative CPU power of the group */
              sgs->avg_load = (sgs->group_load * SCHED_LOAD_SCALE) / group->cpu_power;
      
      This regression was caused by commit e23bba60 ("x86-64, NUMA: Unify
      emulated distance mapping") because it changes cpu -> node
      mapping in the process of dropping fake_physnodes().
      
        old) all cpus are assigned node 0
        now) cpus are assigned round robin
             (the logic is implemented by numa_init_array())
      
        Note: The change in behavior only happens if the system has
              neither an ACPI SRAT table nor AMD northbridge NUMA
              information.
      
      Round robin assignment doesn't work because init_numa_sched_groups_power()
      assumes all logical cpus in the same physical cpu share the same node
      (then it only accounts for group_first_cpu()), and the simple round robin
      breaks the above assumption.
      
      Thus, this patch implements a reassignment of node-ids if buggy firmware
      or numa emulation produces a wrong cpu node map. It enforces that all
      logical cpus in the same physical cpu share the same node.
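The invariant being enforced (every logical cpu of a physical package reports the same node) can be sketched as follows. This is an illustrative model only: `fix_cpu_node_map()` and its array-based interface are hypothetical and do not reflect how the kernel stores topology.

```c
/*
 * pkg_id[cpu] is the physical package of each logical cpu and node[cpu] its
 * (possibly wrong) node id. Every cpu is given the node of the
 * lowest-numbered cpu in the same package, so all siblings agree.
 */
static void fix_cpu_node_map(const int *pkg_id, int *node, int nr_cpus)
{
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		for (int first = 0; first < cpu; first++) {
			if (pkg_id[first] == pkg_id[cpu]) {
				/* node[first] was already settled on an earlier pass. */
				node[cpu] = node[first];
				break;
			}
		}
	}
}
```

With a round-robin map such as nodes `{0, 1, 2, 3}` over packages `{0, 0, 1, 1}`, the sketch rewrites the map to `{0, 0, 2, 2}`, which satisfies the assumption that per-group accounting based on the first cpu of a group relies on.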
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Shaohui Zheng <shaohui.zheng@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/20110415203928.1303.A69D9226@jp.fujitsu.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  17. 07 Apr, 2011 1 commit
    • H
      x86, hibernate: Initialize mmu_cr4_features during boot · 4da9484b
      H. Peter Anvin committed
      Restore the initialization of mmu_cr4_features during boot, which was
      removed without comment in checkin e5f15b45
      
      x86: Cleanup highmap after brk is concluded
      
      thereby breaking resume from hibernate.  This restores previous
      functionality in approximately the same place, and corrects the
      reading of %cr4 on pre-CPUID hardware (%cr4 exists if and only if
      CPUID is supported.)
      
      However, part of the problem is that the hibernate suspend/resume
      sequence should manage the save/restore of %cr4 explicitly.
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      LKML-Reference: <201104020154.57136.rjw@sisk.pl>
  18. 01 Apr, 2011 2 commits
  19. 31 Mar, 2011 1 commit
  20. 30 Mar, 2011 2 commits
    • S
      x86, mtrr, pat: Fix one cpu getting out of sync during resume · 84ac7cdb
      Suresh Siddha committed
      On laptops with core i5/i7, there were reports that after resume
      graphics workloads were performing poorly on a specific AP, while
      the other cpu's were ok. This was observed on a 32bit kernel
      specifically.
      
      Debugging showed that the PAT init was not happening on that AP
      during resume, and hence it was contributing to the poor workload
      performance on that cpu.
      
      On this system, resume flow looked like this:
      
      1. BP starts the resume sequence and we reinit BP's MTRR's/PAT
         early on using mtrr_bp_restore()
      
      2. Resume sequence brings all AP's online
      
      3. Resume sequence now kicks off the MTRR reinit on all the AP's.
      
      4. For some reason, between point 2 and 3, we moved from BP
         to one of the AP's. My guess is that printk() during resume
         sequence is contributing to this. We don't see similar
         behavior with the 64bit kernel but there is no guarantee that
         at this point the remaining resume sequence (after AP's bringup)
         has to happen on BP.
      
      5. set_mtrr() was assuming that we are still on BP and skipped the
         MTRR/PAT init on that cpu (because of 1 above)
      
      6. But we were on an AP and this led to not reprogramming PAT
         on this cpu leading to bad performance.
      
      Fix this by doing unconditional mtrr_if->set_all() in set_mtrr()
      during MTRR/PAT init. This might be unnecessary if we are still
      running on BP. But it is of no harm and will guarantee that after
      resume, all the cpu's will be in sync with respect to the
      MTRR/PAT registers.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1301438292-28370-1-git-send-email-eric@anholt.net>
      Signed-off-by: Eric Anholt <eric@anholt.net>
      Tested-by: Keith Packard <keithp@keithp.com>
      Cc: stable@kernel.org [v2.6.32+]
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • T
      x86: apb_timer: Fixup genirq fallout · 86cc8dfc
      Thomas Gleixner committed
      The lonely user of the internal interface was not in the coccinelle
      script.
      Reported-by: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  21. 29 Mar, 2011 2 commits
    • X
      x86, microcode: Unregister syscore_ops after microcode unloaded · 4ac5fc6a
      Xiaotian Feng committed
      Currently, microcode doesn't unregister syscore_ops after it's
      unloaded. So if we modprobe then rmmod microcode, the stale
      microcode syscore_ops info will stay on syscore_ops_list.
      
      Later when we're trying to reboot/halt/shutdown the machine, kernel
      will panic on syscore_shutdown().
      
      With the patch applied, I can reboot/halt/shutdown my machine successfully.
      Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
      Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      LKML-Reference: <1301387672-23661-1-git-send-email-dfeng@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • J
      x86: Stop including <linux/delay.h> in two asm header files · ca444564
      Jean Delvare committed
      Stop including <linux/delay.h> in x86 header files which don't
      need it. This will let the compiler complain when this header is
      not included by source files when it should, so that
      contributors can fix the problem before building on other
      architectures starts to fail.
      
      Credits go to Geert for the idea.
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Cc: James E.J. Bottomley <James.Bottomley@suse.de>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      LKML-Reference: <20110325152014.297890ec@endymion.delvare>
      [ this also fixes an upstream build bug in drivers/media/rc/ite-cir.c ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  22. 26 Mar, 2011 1 commit
  23. 25 Mar, 2011 4 commits
    • I
      perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue · 45daae57
      Ingo Molnar committed
      Eric Dumazet reported that hardware PMU events do not work on his
      system, due to the BIOS corrupting PMU state:
      
          Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only.
          [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c)
      
      Linus suggested that we continue in the face of such BIOS-induced CPU
      state corruption:
      
         http://lkml.org/lkml/2011/3/24/608
      
      Such BIOSes will have to be fixed - Linux developers rely on a working and
      fully capable PMU and the BIOS interfering with the CPU's PMU state is simply
      not acceptable.
      
      So this patch changes perf to continue when it detects such BIOS
      interaction. Some hardware events may be unreliable due to the BIOS
      writing and re-writing them; there's not much the kernel can do
      about that but to detect the corruption and report it.
      Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • T
      x86: DT: Cleanup namespace and call irq_set_irq_type() unconditional · 07611dbd
      Thomas Gleixner committed
      That call escaped the name space cleanup. Fix it up.
      
      We really want to call there. The chip might have changed since the
      irq was setup initially. So let the core code and the chip decide what
      to do. The status is just an unreliable snapshot.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • T
      x86: DT: Fix return condition in irq_create_of_mapping() · 00a30b25
      Thomas Gleixner committed
      The xlate() function returns 0 or a negative error code. Returning the
      error code blindly will be seen as a huge irq number by the calling
      function because irq_create_of_mapping() returns an unsigned value.
      
      Return 0 (NO_IRQ) as required.
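The signedness problem is easy to reproduce outside the kernel. The sketch below uses hypothetical stand-ins (`xlate()`, `create_mapping_buggy()`, `create_mapping_fixed()`, and a local `NO_IRQ` of 0) that only mimic the shape of the code described above.

```c
#include <errno.h>

#define NO_IRQ 0u /* "no interrupt" marker, as in the commit text */

/* Hypothetical translate step: 0 on success, negative errno on failure. */
static int xlate(int hwirq, unsigned int *virq)
{
	if (hwirq < 0)
		return -EINVAL;
	*virq = (unsigned int)hwirq + 16u; /* arbitrary offset for the sketch */
	return 0;
}

/* Buggy shape: the negative error code is converted to unsigned. */
static unsigned int create_mapping_buggy(int hwirq)
{
	unsigned int virq;
	int ret = xlate(hwirq, &virq);

	if (ret)
		return ret; /* implicit int -> unsigned int conversion */
	return virq;
}

/* Fixed shape: any failure is reported as NO_IRQ. */
static unsigned int create_mapping_fixed(int hwirq)
{
	unsigned int virq;

	if (xlate(hwirq, &virq))
		return NO_IRQ;
	return virq;
}
```

On a typical Linux target with a 32-bit `unsigned int`, the buggy variant turns `-EINVAL` (-22) into 4294967274, which a caller cannot tell apart from a valid, very large irq number.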
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    • D
      perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows · 242214f9
      Don Zickus committed
      The read of the proper MSR register was missed, and instead of the
      counter the configuration register was tested (it always has
      ARCH_P4_UNFLAGGED_BIT cleared), leading to unknown NMIs
      hitting the system. As a result the user may see the "Dazed and
      confused, but trying to continue" message. Fix it by reading the
      proper MSR register.
      
      When an NMI happens on a P4, the perf nmi handler checks the
      configuration register to see if the overflow bit is set or not
      before taking appropriate action.  Unfortunately, various P4
      machines had a broken overflow bit, so a backup mechanism was
      implemented.  This mechanism checked to see if the counter
      rolled over or not.
      
      A previous commit that implemented this backup mechanism was
      broken. Instead of reading the counter register, it used the
      configuration register to determine if the counter rolled over
      or not. Reading that bit would give incorrect results.
      
      This would lead to 'Dazed and confused' messages for the end
      user when using the perf tool (or if the nmi watchdog is
      running).
      
      The fix is to read the counter register before determining if
      the counter rolled over or not.
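A rough model of the backup check is shown below. It rests on assumptions that are not spelled out in the commit text: the counter is taken to be 40 bits wide and programmed with the negated sampling period, so its top bit stays set until it wraps. `P4_CNTRVAL_BITS`, `p4_initial_count()` and `p4_counter_rolled_over()` are hypothetical names for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define P4_CNTRVAL_BITS  40 /* assumed counter width */
#define P4_CNTRVAL_MASK  ((1ULL << P4_CNTRVAL_BITS) - 1)
#define P4_UNFLAGGED_BIT (1ULL << (P4_CNTRVAL_BITS - 1))

/*
 * Value to program so the counter wraps after `period` events.
 * The sketch assumes 0 < period < 2^39, which keeps the top bit set.
 */
static uint64_t p4_initial_count(uint64_t period)
{
	return (0 - period) & P4_CNTRVAL_MASK;
}

/*
 * The test must be applied to the counter value. Applying it to the
 * configuration register, where this bit is always clear, would report a
 * rollover every time.
 */
static bool p4_counter_rolled_over(uint64_t counter_value)
{
	return !(counter_value & P4_UNFLAGGED_BIT);
}
```

A freshly programmed counter therefore reads as "not rolled over", and only a counter that has wrapped past zero is treated as the source of the NMI.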
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Lin Ming <ming.m.lin@intel.com>
      LKML-Reference: <4D8BAB49.3080701@openvz.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  24. 24 Mar, 2011 2 commits
    • N
      x86, dumpstack: Use %pB format specifier for stack trace · 71f9e598
      Namhyung Kim committed
      Improve noreturn function entries in call traces:
      
      Before:
      
       Call Trace:
        [<ffffffff812a8502>] panic+0x8c/0x18d
        [<ffffffffa000012a>] deep01+0x0/0x38 [test_panic]  <--- bad
        [<ffffffff81104666>] proc_file_write+0x73/0x8d
        [<ffffffff811000b3>] proc_reg_write+0x8d/0xac
        [<ffffffff810c7d32>] vfs_write+0xa1/0xc5
        [<ffffffff810c7e0f>] sys_write+0x45/0x6c
        [<ffffffff8f02943b>] system_call_fastpath+0x16/0x1b
      
      After:
      
       Call Trace:
        [<ffffffff812bce69>] panic+0x8c/0x18d
        [<ffffffffa000012a>] panic_write+0x20/0x20 [test_panic] <--- good
        [<ffffffff81115fab>] proc_file_write+0x73/0x8d
        [<ffffffff81111a5f>] proc_reg_write+0x8d/0xac
        [<ffffffff810d90ee>] vfs_write+0xa1/0xc5
        [<ffffffff810d91cb>] sys_write+0x45/0x6c
        [<ffffffff812c07fb>] system_call_fastpath+0x16/0x1b
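Why the "before" trace names the wrong function can be shown with a toy symbol table. The commit text does not describe the mechanism; the sketch assumes the usual explanation that a call to a noreturn function is the caller's last instruction, so the saved return address equals the start of the next symbol, and that a backtrace-style lookup therefore resolves `address - 1`. The table contents and function names are made up for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct sym {
	uintptr_t start;
	uintptr_t size;
	const char *name;
};

/* Toy symbol table: panic_write ends exactly where deep01 begins. */
static const struct sym symtab[] = {
	{ 0x1000, 0x20, "panic_write" },
	{ 0x1020, 0x38, "deep01" },
};

/* Plain lookup: returns the symbol whose range contains addr, or NULL. */
static const char *lookup(uintptr_t addr)
{
	for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
		if (addr >= symtab[i].start &&
		    addr < symtab[i].start + symtab[i].size)
			return symtab[i].name;
	}
	return NULL;
}

/* Backtrace-style lookup: resolve the byte just before the return address. */
static const char *lookup_return_address(uintptr_t ret_addr)
{
	return lookup(ret_addr - 1);
}
```

For the return address `0x1020`, the plain lookup yields `deep01` (offset 0x0), while the backtrace-style lookup yields `panic_write`, matching the difference between the two traces above.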
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <1300934550-21394-2-git-send-email-namhyung@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • O
      crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr,... · 93a72052
      Olaf Hering committed
      crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn
      
      The Xen PV drivers in a crashed HVM guest can not connect to the dom0
      backend drivers because both frontend and backend drivers are still in
      connected state.  To run the connection reset function only in case of a
      crashdump, the is_kdump_kernel() function needs to be available for the PV
      driver modules.
      
      Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
      kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
      usable for modules.
      
      Leave 'elfcorehdr' as early_param().  This changes powerpc from __setup()
      to early_param().  It adds an address range check from x86 also on ia64
      and powerpc.
      
      [akpm@linux-foundation.org: additional #includes]
      [akpm@linux-foundation.org: remove elfcorehdr_addr export]
      [akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
      Signed-off-by: Olaf Hering <olaf@aepfle.de>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>