1. 21 Jun, 2013 (1 commit)
    • x86, trace: Introduce entering/exiting_irq() · eddc0e92
      Committed by Seiji Aguchi
      When implementing tracepoints in interrupt handlers, if the tracepoints are
      simply added in the performance-sensitive path of the interrupt handlers,
      they may cause a performance problem due to the time penalty.
      
      To solve the problem, the idea is to prepare separate non-trace and trace irq
      handlers and switch between their IDTs when tracing is enabled or disabled.
      
      So, let's introduce entering_irq()/exiting_irq() for pre/post-
      processing of each irq handler.
      
      A way to use them is as follows.
      
      Non-trace irq handler:
      smp_irq_handler()
      {
      	entering_irq();		/* pre-processing of this handler */
      	__smp_irq_handler();	/*
      				 * common logic between non-trace and trace handlers
      				 * in a vector.
      				 */
      	exiting_irq();		/* post-processing of this handler */
      
      }
      
      Trace irq handler:
      smp_trace_irq_handler()
      {
      	entering_irq();		/* pre-processing of this handler */
      	trace_irq_entry();	/* tracepoint for irq entry */
      	__smp_irq_handler();	/*
      				 * common logic between non-trace and trace handlers
      				 * in a vector.
      				 */
      	trace_irq_exit();	/* tracepoint for irq exit */
      	exiting_irq();		/* post-processing of this handler */
      
      }
      
      If the tracepoints could be placed outside entering_irq()/exiting_irq() as
      follows, it would look cleaner.
      
      smp_trace_irq_handler()
      {
      	trace_irq_entry();
      	smp_irq_handler();
      	trace_irq_exit();
      }
      
      But it doesn't work.
      The problem is with when irq_enter()/irq_exit() are called. They must be called
      before trace_irq_entry()/trace_irq_exit(), because rcu_irq_enter() must be
      called before any tracepoints are used, as tracepoints use RCU to synchronize.
      
      As a possible alternative, we may be able to call irq_enter() first as follows
      if irq_enter() can nest.
      
      smp_trace_irq_handler()
      {
      	irq_enter();
      	trace_irq_entry();
      	smp_irq_handler();
      	trace_irq_exit();
      	irq_exit();
      }
      
      But it doesn't work, either.
      If irq_enter() is nested, it may incur a time penalty because it has to check
      whether it was already called. Even a tiny time penalty is undesirable in
      performance-sensitive paths.
      Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
      Link: http://lkml.kernel.org/r/51C3238D.9040706@hds.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      eddc0e92
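
      For context, a minimal sketch of what these helpers could look like, assuming
      the pre/post-processing is just the generic irq_enter()/irq_exit() bookkeeping
      plus the exit_idle() call that existing x86 vector handlers already perform
      (the exact bodies in the kernel may differ):

      static inline void entering_irq(void)
      {
      	irq_enter();	/* must run before any tracepoint: RCU/irq accounting */
      	exit_idle();	/* assumed: leave idle-notifier state, as x86 handlers do */
      }

      static inline void exiting_irq(void)
      {
      	irq_exit();	/* undoes irq_enter(), may run pending softirqs */
      }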
  2. 05 May, 2013 (1 commit)
  3. 04 May, 2013 (2 commits)
  4. 30 Apr, 2013 (4 commits)
  5. 26 Apr, 2013 (1 commit)
    • perf/x86/intel/P4: Robustify P4 PMU types · 5ac2b5c2
      Committed by Ingo Molnar
      Linus found, while extending the integer type extension checks in the sparse
      static code checker, various fragile patterns of mixed signed/unsigned
      64-bit/32-bit integer use in perf_events_p4.c.

      The relevant hardware register ABI is 64 bits wide on 32-bit kernels as
      well, so clean it all up a bit, remove unnecessary casts, and make sure
      we use 64-bit unsigned integers in these places.
      
      [ Unfortunately this patch was not tested on real P4 hardware,
        those are pretty rare already. If this patch causes any
        problems on P4 hardware then please holler ... ]
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20130424072630.GB1780@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5ac2b5c2
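
      As an illustration of the kind of fragility involved (a hypothetical example,
      not the actual perf_events_p4.c code; the function names are made up):

      /* Fragile: flags is a signed 32-bit int, so ~flags is computed in 32 bits
       * and then sign-extended to 64 bits; whether the high 32 bits of the MSR
       * value survive the mask depends on bit 31 of flags. */
      static u64 p4_clear_bits_fragile(u64 msr_val, int flags)
      {
      	return msr_val & ~flags;
      }

      /* Robust: keep the mask unsigned and 64 bits wide, matching the width of
       * the hardware register ABI on both 32-bit and 64-bit kernels. */
      static u64 p4_clear_bits_robust(u64 msr_val, u64 flags)
      {
      	return msr_val & ~flags;
      }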
  6. 22 Apr, 2013 (1 commit)
  7. 21 Apr, 2013 (6 commits)
  8. 18 Apr, 2013 (1 commit)
  9. 16 Apr, 2013 (4 commits)
  10. 12 Apr, 2013 (1 commit)
    • x86: Use a read-only IDT alias on all CPUs · 4eefbe79
      Committed by Kees Cook
      Make a copy of the IDT (as seen via the "sidt" instruction) read-only.
      This primarily removes the IDT from being a target for arbitrary memory
      write attacks, and has the added benefit of also not leaking the kernel
      base offset, if it has been relocated.
      
      We already did this on vendor == Intel and family == 5 because of the
      F0 0F bug, regardless of whether a particular CPU actually had the F0 0F
      bug or not.  Since the workaround was so cheap, there simply was no
      reason to be more specific.  This patch extends the read-only alias to
      all CPUs, but does not activate the #PF to #UD conversion code needed to
      deliver the proper exception in the F0 0F case except on Intel family 5
      processors.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Link: http://lkml.kernel.org/r/20130410192422.GA17344@www.outflux.net
      Cc: Eric Northup <digitaleric@google.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      4eefbe79
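
      The mechanism can be sketched roughly as follows, assuming a dedicated
      fixmap slot (FIX_RO_IDT here) and the usual fixmap/IDT helpers; the slot
      name and call site are assumptions, not a quote of the patch. Writes still
      go through idt_table; only the alias that "sidt" reports is read-only and
      lives at a fixed address.

      static void __init setup_ro_idt_alias(void)
      {
      	/* Map the IDT page read-only at a fixed virtual address. */
      	__set_fixmap(FIX_RO_IDT, __pa_symbol(idt_table), PAGE_KERNEL_RO);

      	/* Load the IDT register from the alias, so "sidt" reveals the
      	 * fixed fixmap address rather than the (possibly relocated)
      	 * kernel address of idt_table. */
      	idt_descr.address = fix_to_virt(FIX_RO_IDT);
      	load_idt(&idt_descr);
      }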
  11. 10 Apr, 2013 (2 commits)
  12. 03 Apr, 2013 (7 commits)
  13. 01 Apr, 2013 (4 commits)
  14. 27 Mar, 2013 (2 commits)
  15. 22 Mar, 2013 (2 commits)
  16. 21 Mar, 2013 (1 commit)
    • perf/x86: Fix uninitialized pt_regs in intel_pmu_drain_bts_buffer() · 0e48026a
      Committed by Stephane Eranian
      This patch fixes an uninitialized pt_regs struct in the BTS drain
      function. The pt_regs struct is propagated all the way to the
      code_get_segment() function from perf_instruction_pointer()
      and may contain garbage.
      
      We cannot simply inherit the actual pt_regs from the interrupt
      because BTS must be flushed on context-switch or when the
      associated event is disabled. And there we do not have a pt_regs
      handy.
      
      Setting pt_regs to all zeroes may not be the best option, but it
      is not clear what else to do given where drain_bts_buffer()
      is called from.
      
      In V2, we move the memset() later in the code to avoid doing it
      when we end up returning early without doing the actual BTS
      processing. We also dropped the reg.val initialization because it
      is redundant with the memset(), as suggested by PeterZ.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: peterz@infradead.org
      Cc: sqazi@google.com
      Cc: ak@linux.intel.com
      Cc: jolsa@redhat.com
      Link: http://lkml.kernel.org/r/20130319151038.GA25439@quad
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0e48026a
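
      The shape of the fix can be sketched as follows (a simplified, hypothetical
      outline, not the literal intel_pmu_drain_bts_buffer() code; the function
      signature and early-return condition are stand-ins):

      static void drain_bts_buffer_sketch(struct perf_event *event,
      				    struct bts_record *at, struct bts_record *top)
      {
      	struct pt_regs regs;

      	/* Bail out before paying for the memset() when there is
      	 * nothing to drain (the V2 change described above). */
      	if (!event || at >= top)
      		return;

      	/* Zero the whole struct so consumers reached via
      	 * perf_instruction_pointer() never see stack garbage. */
      	memset(&regs, 0, sizeof(regs));

      	for (; at < top; at++) {
      		/* synthesize and emit one sample per BTS record ... */
      	}
      }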