1. June 21, 2013 (3 commits)
    • x86, trace: Add irq vector tracepoints · cf910e83
      Seiji Aguchi authored
      [Purpose of this patch]
      
      As Vaibhav explained in the thread below, tracepoints for irq vectors
      are useful.
      
      http://www.spinics.net/lists/mm-commits/msg85707.html
      
      <snip>
      The current interrupt traces from irq_handler_entry and irq_handler_exit
      provide when an interrupt is handled.  They provide good data about when
      the system has switched to kernel space and how it affects the currently
      running processes.
      
      There are some IRQ vectors which trigger the system into kernel space,
      which are not handled in generic IRQ handlers.  Tracing such events gives
      us the information about IRQ interaction with other system events.
      
      The trace also tells where the system is spending its time.  We want to
      know which cores are handling interrupts and how they are affecting other
      processes in the system.  Also, the trace provides information about when
      the cores are idle and which interrupts are changing that state.
      <snip>
      
On the other hand, my use case is tracing just the local timer event and
getting the value of the instruction pointer.

I previously suggested adding an instruction-pointer argument to the local
timer event. But the value can also be obtained with an external module
such as systemtap, so there is no need to add any argument to the irq
vector tracepoints now.
      
      [Patch Description]
      
Vaibhav's patch shared a single pair of tracepoints,
irq_vector_entry/irq_vector_exit, across all events. But the use case
above calls for tracing a specific irq vector rather than tracing all
events, and shared tracepoints would add overhead from unwanted events.

So, instead of introducing irq_vector_entry/exit, add the following
tracepoints, which can be enabled independently:
         - local_timer_vector
         - reschedule_vector
         - call_function_vector
         - call_function_single_vector
         - irq_work_entry_vector
         - error_apic_vector
         - thermal_apic_vector
         - threshold_apic_vector
         - spurious_apic_vector
         - x86_platform_ipi_vector
      
Also, introduce logic that switches the IDT at enable/disable time so that
the time penalty is zero while the tracepoints are disabled. The details
are as follows.
 - Create trace irq handlers with entering_irq()/exiting_irq().
 - Create a new IDT, trace_idt_table, at boot time by adding logic to
   _set_gate(). It is simply a copy of the original IDT.
 - Register the new handlers for the tracepoints in the new IDT by
   introducing macros around alloc_intr_gate(), invoked when the irq
   vector handlers are registered.
 - Add a check of whether irq vector tracing is on or off to
   load_current_idt(). This has to be done below the debug check, for
   these reasons:
   - Switching to the debug IDT may be triggered while tracing is enabled.
   - Switching to the trace IDT, on the other hand, is triggered only when
     debugging is disabled.
      
In addition, the new IDT is created only when CONFIG_TRACING is enabled,
to avoid it being used for other purposes.
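
As an illustration, the selection logic in load_current_idt() could look
roughly like this (a minimal sketch; the is_*_idt_enabled()/load_*_idt()
helper names are assumptions used only to convey the ordering):

load_current_idt()
{
	if (is_debug_idt_enabled())		/* debug check must come first */
		load_debug_idt();
	else if (is_trace_idt_enabled())	/* trace IDT only while not debugging */
		load_trace_idt();
	else
		load_idt((const struct desc_ptr *)&idt_descr);
}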
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/51C323ED.5050708@hds.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      cf910e83
    • x86: Rename variables for debugging · 629f4f9d
      Seiji Aguchi authored
Rename the debugging-related variables so that their names describe their
meaning precisely.

Also, introduce a generic way to switch the IDT by checking the current
state (debug on/off).
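
A minimal sketch of the generic switching idea (the debug_idt_ctr and
debug_idt_descr names are assumptions here, shown only for illustration):

is_debug_idt_enabled()
{
	return this_cpu_read(debug_idt_ctr) != 0;	/* per-cpu debug state */
}

load_debug_idt()
{
	load_idt((const struct desc_ptr *)&debug_idt_descr);
}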
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/51C323A8.7050905@hds.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      629f4f9d
    • x86, trace: Introduce entering/exiting_irq() · eddc0e92
      Seiji Aguchi authored
When implementing tracepoints in interrupt handlers, simply adding them to
the performance-sensitive path of the handlers may cause a performance
problem due to the time penalty.

To solve this, the idea is to prepare separate non-trace and trace irq
handlers and switch their IDTs at enable/disable time.
      
      So, let's introduce entering_irq()/exiting_irq() for pre/post-
      processing of each irq handler.
      
      A way to use them is as follows.
      
      Non-trace irq handler:
      smp_irq_handler()
      {
      	entering_irq();		/* pre-processing of this handler */
      	__smp_irq_handler();	/*
      				 * common logic between non-trace and trace handlers
      				 * in a vector.
      				 */
      	exiting_irq();		/* post-processing of this handler */
      
      }
      
Trace irq handler:
      smp_trace_irq_handler()
      {
      	entering_irq();		/* pre-processing of this handler */
      	trace_irq_entry();	/* tracepoint for irq entry */
      	__smp_irq_handler();	/*
      				 * common logic between non-trace and trace handlers
      				 * in a vector.
      				 */
      	trace_irq_exit();	/* tracepoint for irq exit */
      	exiting_irq();		/* post-processing of this handler */
      
      }
      
If the tracepoints could be placed outside entering_irq()/exiting_irq(),
as follows, it would look cleaner.
      
      smp_trace_irq_handler()
      {
      	trace_irq_entry();
      	smp_irq_handler();
      	trace_irq_exit();
      }
      
But it doesn't work.
The problem is with the irq_enter()/irq_exit() calls: they must bracket
trace_irq_entry()/trace_irq_exit(), because rcu_irq_enter() has to run
before any tracepoint is used, as tracepoints use RCU to synchronize.
      
As a possible alternative, we might be able to call irq_enter() first, as
follows, if irq_enter() could nest.
      
smp_trace_irq_handler()
      {
      	irq_entry();
      	trace_irq_entry();
      	smp_irq_handler();
      	trace_irq_exit();
      	irq_exit();
      }
      
But that doesn't work either.
If irq_enter() is nested, it incurs a time penalty because it has to check
whether it was already called. That penalty, however tiny, is undesirable
in performance-sensitive paths.
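
For reference, entering_irq()/exiting_irq() can be thin wrappers along
these lines (a sketch, assuming the usual x86 idle/irq accounting calls):

entering_irq()
{
	irq_enter();	/* rcu_irq_enter() etc. - must precede any tracepoint */
	exit_idle();	/* leave idle-state accounting */
}

exiting_irq()
{
	irq_exit();
}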
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/51C3238D.9040706@hds.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      eddc0e92
2. June 11, 2013 (1 commit)
    • Modify UEFI anti-bricking code · f8b84043
      Matthew Garrett authored
      This patch reworks the UEFI anti-bricking code, including an effective
      reversion of cc5a080c and 31ff2f20. It turns out that calling
      QueryVariableInfo() from boot services results in some firmware
      implementations jumping to physical addresses even after entering virtual
      mode, so until we have 1:1 mappings for UEFI runtime space this isn't
      going to work so well.
      
      Reverting these gets us back to the situation where we'd refuse to create
      variables on some systems because they classify deleted variables as "used"
      until the firmware triggers a garbage collection run, which they won't do
      until they reach a lower threshold. This results in it being impossible to
      install a bootloader, which is unhelpful.
      
      Feedback from Samsung indicates that the firmware doesn't need more than
      5KB of storage space for its own purposes, so that seems like a reasonable
      threshold. However, there's still no guarantee that a platform will attempt
      garbage collection merely because it drops below this threshold. It seems
      that this is often only triggered if an attempt to write generates a
      genuine EFI_OUT_OF_RESOURCES error. We can force that by attempting to
      create a variable larger than the remaining space. This should fail, but if
      it somehow succeeds we can then immediately delete it.
      
      I've tested this on the UEFI machines I have available, but I don't have
      a Samsung and so can't verify that it avoids the bricking problem.
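
The forced garbage collection boils down to something like this (a sketch;
the efi_dummy_name/EFI_DUMMY_GUID names and the exact 5K constant are
illustrative, not verbatim):

#define EFI_MIN_RESERVE 5120	/* keep 5KB free for firmware-internal use */

	if (remaining_size - size < EFI_MIN_RESERVE) {
		/* an over-sized write should fail and trigger garbage collection */
		status = efi.set_variable(efi_dummy_name, &EFI_DUMMY_GUID,
					  attributes, remaining_size, dummy);
		if (status == EFI_SUCCESS)
			/* it somehow succeeded - delete the dummy immediately */
			efi.set_variable(efi_dummy_name, &EFI_DUMMY_GUID,
					 attributes, 0, dummy);
	}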
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Lee, Chun-Yi <jlee@suse.com> [ dummy variable cleanup ]
      Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      f8b84043
3. May 31, 2013 (1 commit)
4. May 10, 2013 (1 commit)
5. May 7, 2013 (1 commit)
6. May 3, 2013 (3 commits)
7. May 1, 2013 (1 commit)
    • dump_stack: unify debug information printed by show_regs() · a43cb95d
      Tejun Heo authored
show_regs() is inherently arch-dependent, but it does make sense to print
generic debug information, and some archs already do so, albeit in
slightly different forms.  This patch introduces a generic function to
print debug information from show_regs() so that different archs print the
same information and it becomes much easier to modify what is printed.
      
      show_regs_print_info() prints out the same debug info as dump_stack()
      does plus task and thread_info pointers.
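
Conceptually, the helper reduces to (a sketch, not the verbatim
implementation):

show_regs_print_info(const char *log_lvl)
{
	dump_stack_print_info(log_lvl);	/* same header that dump_stack() prints */

	printk("%stask: %p ti: %p task.ti: %p\n",
	       log_lvl, current, current_thread_info(),
	       task_thread_info(current));
}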
      
      * Archs which didn't print debug info now do.
      
        alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r,
        metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc,
        um, xtensa
      
      * Already prints debug info.  Replaced with show_regs_print_info().
        The printed information is superset of what used to be there.
      
        arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86
      
* s390 is special in that it used to print arch-specific information
  along with the generic debug info.  Heiko and Martin think the
  arch-specific extras aren't worth keeping an s390-specific
  implementation for.  Converted to use the generic version.
      
      Note that now all archs print the debug info before actual register
      dumps.
      
      An example BUG() dump follows.
      
       kernel BUG at /work/os/work/kernel/workqueue.c:4841!
       invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
       Modules linked in:
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
       Hardware name: empty empty/S3992, BIOS 080011  10/26/2007
       task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
       RIP: 0010:[<ffffffff8234a07e>]  [<ffffffff8234a07e>] init_workqueues+0x4/0x6
       RSP: 0000:ffff88007c861ec8  EFLAGS: 00010246
       RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
       RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
       RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Stack:
        ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
        0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
        ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
       Call Trace:
        [<ffffffff81000312>] do_one_initcall+0x122/0x170
        [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        [<ffffffff81c4776e>] kernel_init+0xe/0xf0
        [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
        [<ffffffff81c47760>] ? rest_init+0x140/0x140
        ...
      
      v2: Typo fix in x86-32.
      
      v3: CPU number dropped from show_regs_print_info() as
          dump_stack_print_info() has been updated to print it.  s390
          specific implementation dropped as requested by s390 maintainers.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Mike Frysinger <vapier@gentoo.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>		[tile bits]
      Acked-by: Richard Kuo <rkuo@codeaurora.org>		[hexagon bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a43cb95d
8. April 30, 2013 (1 commit)
    • mm/hugetlb: add more arch-defined huge_pte functions · 106c992a
      Gerald Schaefer authored
      Commit abf09bed ("s390/mm: implement software dirty bits")
      introduced another difference in the pte layout vs.  the pmd layout on
      s390, thoroughly breaking the s390 support for hugetlbfs.  This requires
      replacing some more pte_xxx functions in mm/hugetlbfs.c with a
      huge_pte_xxx version.
      
      This patch introduces those huge_pte_xxx functions and their generic
      implementation in asm-generic/hugetlb.h, which will now be included on
      all architectures supporting hugetlbfs apart from s390.  This change
      will be a no-op for those architectures.
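
On architectures where the pte and huge pte layouts match, the generic
versions in asm-generic/hugetlb.h are expected to be trivial
pass-throughs, roughly (a sketch of the pattern, not the complete set):

static inline pte_t huge_pte_wrprotect(pte_t pte)
{
	return pte_wrprotect(pte);	/* same layout: defer to the pte op */
}

static inline int huge_pte_write(pte_t pte)
{
	return pte_write(pte);
}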
      
      [akpm@linux-foundation.org: fix warning]
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>	[for !s390 parts]
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      106c992a
9. April 28, 2013 (3 commits)
10. April 27, 2013 (1 commit)
11. April 26, 2013 (1 commit)
    • perf/x86/intel/P4: Robustify P4 PMU types · 5ac2b5c2
      Ingo Molnar authored
Linus found, while extending integer type extension checks in the
sparse static code checker, various fragile patterns of mixed
signed/unsigned 64-bit/32-bit integer use in perf_events_p4.c.

The relevant hardware register ABI is 64 bits wide on 32-bit
kernels as well, so clean it all up a bit, remove unnecessary
casts, and make sure we use 64-bit unsigned integers in these
places.
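
The class of bug is the usual mixed-width shift; a self-contained
illustration (not the actual P4 code):

	u64 bad  = 1 << 31;	/* signed 32-bit shift: sign-extends to
				 * 0xffffffff80000000 on assignment */
	u64 good = 1ULL << 31;	/* unsigned 64-bit shift: 0x0000000080000000 */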
      
      [ Unfortunately this patch was not tested on real P4 hardware,
        those are pretty rare already. If this patch causes any
        problems on P4 hardware then please holler ... ]
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130424072630.GB1780@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5ac2b5c2
12. April 25, 2013 (6 commits)
13. April 22, 2013 (5 commits)
14. April 21, 2013 (1 commit)
15. April 20, 2013 (1 commit)
16. April 19, 2013 (2 commits)
    • mutex: Back out architecture specific check for negative mutex count · cc189d25
      Waiman Long authored
Linus suggested that probably all the supported architectures can
allow a negative mutex count without incorrect behavior, so we can
back out the architecture-specific change and allow the mutex count
to go to any negative number. That should further reduce contention
on non-x86 architectures.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chandramouleeswaran Aswin <aswin@hp.com>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Cc: Norton Scott J <scott.norton@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366226594-5506-5-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cc189d25
    • mutex: Make more scalable by doing less atomic operations · 0dc8c730
      Waiman Long authored
      In the __mutex_lock_common() function, an initial entry into
      the lock slow path will cause two atomic_xchg instructions to be
      issued. Together with the atomic decrement in the fast path, a
      total of three atomic read-modify-write instructions will be
      issued in rapid succession. This can cause a lot of cache
      bouncing when many tasks are trying to acquire the mutex at the
      same time.
      
This patch reduces the number of atomic_xchg() instructions by
checking the counter value first before issuing the instruction.
The atomic_read() function is just a simple memory read, while
atomic_xchg() can cost two orders of magnitude more, or worse. By
using atomic_read() to check the value before calling
atomic_xchg(), we can avoid a lot of unnecessary cache-coherency
traffic. The only downside of this change is that a task on the
slow path has a slightly lower chance of getting the mutex when
competing with a task in the fast path.
      
      The same is true for the atomic_cmpxchg() function in the
      mutex-spin-on-owner loop. So an atomic_read() is also performed
      before calling atomic_cmpxchg().
      
The mutex locking and unlocking code for the x86 architecture
allows any negative number in the mutex count to indicate that
some tasks are waiting for the mutex. I am not so sure whether
that is the case for the other architectures, so the default is to
avoid atomic_xchg() only if the count has already been set to -1.
For x86, the check is extended to all negative numbers to cover
the larger range.
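
The pattern is simply to guard the read-modify-write with a plain read
(a sketch; MUTEX_SHOW_NO_WAITER is the helper name assumed here):

/* x86: any negative count means waiters - skip the xchg when it cannot win */
#define MUTEX_SHOW_NO_WAITER(mutex)	(atomic_read(&(mutex)->count) >= 0)

	if (MUTEX_SHOW_NO_WAITER(lock) &&
	    (atomic_xchg(&lock->count, -1) == 1))
		goto done;	/* got the mutex without further spinning */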
      
The following table shows the jobs-per-minute (JPM) scalability
data on an 8-node, 80-core Westmere box with a 3.7.10 kernel. The
numactl command is used to restrict the running of the
high_systime workloads to 1/2/4/8 nodes, with hyperthreading on
and off.
      
+-----------------+-----------+------------+----------+
|  Configuration  | Mean JPM  |  Mean JPM  | % Change |
|                 | w/o patch | with patch |          |
+-----------------+-----------------------------------+
|                 |      User Range 1100 - 2000       |
+-----------------+-----------------------------------+
| 8 nodes, HT on  |   36980   |   148590   | +301.8%  |
| 8 nodes, HT off |   42799   |   145011   | +238.8%  |
| 4 nodes, HT on  |   61318   |   118445   |  +51.1%  |
| 4 nodes, HT off |  158481   |   158592   |   +0.1%  |
| 2 nodes, HT on  |  180602   |   173967   |   -3.7%  |
| 2 nodes, HT off |  198409   |   198073   |   -0.2%  |
| 1 node , HT on  |  149042   |   147671   |   -0.9%  |
| 1 node , HT off |  126036   |   126533   |   +0.4%  |
+-----------------+-----------------------------------+
|                 |       User Range 200 - 1000       |
+-----------------+-----------------------------------+
| 8 nodes, HT on  |   41525   |   122349   | +194.6%  |
| 8 nodes, HT off |   49866   |   124032   | +148.7%  |
| 4 nodes, HT on  |   66409   |   106984   |  +61.1%  |
| 4 nodes, HT off |  119880   |   130508   |   +8.9%  |
| 2 nodes, HT on  |  138003   |   133948   |   -2.9%  |
| 2 nodes, HT off |  132792   |   131997   |   -0.6%  |
| 1 node , HT on  |  116593   |   115859   |   -0.6%  |
| 1 node , HT off |  104499   |   104597   |   +0.1%  |
+-----------------+-----------+------------+----------+
      
In the low user range of 10-100, the JPM differences were within
+/-1%, so they are not that interesting.

The AIM7 benchmark has a pretty large run-to-run variance due to
the random nature of the subtests executed, so a difference of
less than +/-5% may not be really significant.
      
This patch improves high_systime workload performance at 4 nodes
and up by maintaining transaction rates without a significant
drop-off at high node counts.  The patch has practically no
impact on 1- and 2-node systems.
      
The table below shows the percentage of time (as reported by perf
record -a -s -g) spent in the __mutex_lock_slowpath() function by
the high_systime workload at 1500 users for 2/4/8-node
configurations with hyperthreading off.
      
+---------------+-----------------+------------------+---------+
| Configuration | %Time w/o patch | %Time with patch | %Change |
+---------------+-----------------+------------------+---------+
|    8 nodes    |     65.34%      |      0.69%       |  -99%   |
|    4 nodes    |      8.70%      |      1.02%       |  -88%   |
|    2 nodes    |      0.41%      |      0.32%       |  -22%   |
+---------------+-----------------+------------------+---------+
      
      It is obvious that the dramatic performance improvement at 8
      nodes was due to the drastic cut in the time spent within the
      __mutex_lock_slowpath() function.
      
The table below shows the improvements in other AIM7 workloads
(at 8 nodes, hyperthreading off).
      
      +--------------+---------------+----------------+-----------------+
      |   Workload   | mean % change | mean % change  | mean % change   |
      |              | 10-100 users  | 200-1000 users | 1100-2000 users |
      +--------------+---------------+----------------+-----------------+
      | alltests     |     +0.6%     |   +104.2%      |   +185.9%       |
      | five_sec     |     +1.9%     |     +0.9%      |     +0.9%       |
      | fserver      |     +1.4%     |     -7.7%      |     +5.1%       |
      | new_fserver  |     -0.5%     |     +3.2%      |     +3.1%       |
      | shared       |    +13.1%     |   +146.1%      |   +181.5%       |
      | short        |     +7.4%     |     +5.0%      |     +4.2%       |
      +--------------+---------------+----------------+-----------------+
Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Reviewed-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chandramouleeswaran Aswin <aswin@hp.com>
Cc: Norton Scott J <scott.norton@hp.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Clark Williams <williams@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1366226594-5506-3-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0dc8c730
17. April 18, 2013 (2 commits)
    • iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets · 03bbcb2e
      Neil Horman authored
A few years back Intel published a spec update for the 5520 and 5500
chipsets:
http://www.intel.com/content/dam/doc/specification-update/5520-and-5500-chipset-ioh-specification-update.pdf

It contains an erratum (specifically erratum 53) noting that these
chipsets can't properly do interrupt remapping, and consequently it
recommends that interrupt remapping be disabled in the BIOS.  While many
vendors have a BIOS update that does exactly that, not all do, and of
course not all users update their BIOS to a level that corrects the
problem.  As a result, interrupts can occasionally arrive at a CPU even
after the affinity of that interrupt has been moved, leading to lost or
spurious interrupts (usually characterized by the message:
kernel: do_IRQ: 7.71 No irq handler for vector (irq -1))
      
There have been several recent incidents of people seeing this error, and
investigation has shown that their systems' BIOS level is such that this
feature was not properly turned off.  As such, it would be good to give
them a reminder that their systems are vulnerable to this problem.  For
details from those who reported the problem, please see:
https://bugzilla.redhat.com/show_bug.cgi?id=887006
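
The quirk itself can be as small as an early PCI revision check (a sketch;
the helper name and revision cutoff are assumptions based on the
description above):

static void __init intel_remapping_check(int num, int slot, int func)
{
	u8 revision;

	revision = read_pci_config_byte(num, slot, func, PCI_REVISION_ID);
	if (revision == 0x13)	/* an IOH revision affected by erratum 53 */
		set_irq_remapping_broken();
}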
      
      [ Joerg: Removed CONFIG_IRQ_REMAP ifdef from early-quirks.c ]
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: Prarit Bhargava <prarit@redhat.com>
      CC: Don Zickus <dzickus@redhat.com>
      CC: Don Dutile <ddutile@redhat.com>
      CC: Bjorn Helgaas <bhelgaas@google.com>
      CC: Asit Mallick <asit.k.mallick@intel.com>
      CC: David Woodhouse <dwmw2@infradead.org>
      CC: linux-pci@vger.kernel.org
      CC: Joerg Roedel <joro@8bytes.org>
      CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      CC: Arkadiusz Miśkiewicz <arekm@maven.pl>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
      03bbcb2e
    • tools/power turbostat: display C8, C9, C10 residency · ca58710f
      Kristen Carlson Accardi authored
      Display residency in the new C-states, C8, C9, C10.
      
C8, C9 and C10 are present on some "Fourth Generation Intel(R) Core(TM)
Processors", which are based on the Intel(R) microarchitecture code name
Haswell.
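
A sketch of the counter plumbing, assuming the Haswell package C-state
residency MSRs live at 0x630-0x632:

#define MSR_PKG_C8_RESIDENCY	0x630
#define MSR_PKG_C9_RESIDENCY	0x631
#define MSR_PKG_C10_RESIDENCY	0x632

	if (do_c8_c9_c10) {	/* only on CPU models that have these MSRs */
		get_msr(cpu, MSR_PKG_C8_RESIDENCY, &p->pc8);
		get_msr(cpu, MSR_PKG_C9_RESIDENCY, &p->pc9);
		get_msr(cpu, MSR_PKG_C10_RESIDENCY, &p->pc10);
	}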
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
      ca58710f
18. April 17, 2013 (4 commits)
19. April 16, 2013 (1 commit)
    • efi: Pass boot services variable info to runtime code · cc5a080c
      Matthew Garrett authored
      EFI variables can be flagged as being accessible only within boot services.
      This makes it awkward for us to figure out how much space they use at
      runtime. In theory we could figure this out by simply comparing the results
      from QueryVariableInfo() to the space used by all of our variables, but
      that fails if the platform doesn't garbage collect on every boot. Thankfully,
      calling QueryVariableInfo() while still inside boot services gives a more
      reliable answer. This patch passes that information from the EFI boot stub
      up to the efi platform code.
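
Conceptually, the boot stub queries while boot services are still
available and stashes the answer for the platform code (a sketch; the
call convention shown is an assumption):

	u64 store_size, remaining_size, max_size;
	efi_status_t status;

	/* still inside boot services: QueryVariableInfo() answers reliably */
	status = efi_call_phys4(sys_table->runtime->query_variable_info,
				EFI_VARIABLE_NON_VOLATILE |
				EFI_VARIABLE_BOOTSERVICE_ACCESS |
				EFI_VARIABLE_RUNTIME_ACCESS,
				&store_size, &remaining_size, &max_size);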
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
      cc5a080c
20. April 14, 2013 (1 commit)