1. 21 Jun 2013 (3 commits)
• trace,x86: Move creation of irq tracepoints from apic.c to irq.c · 83ab8514
Authored by Steven Rostedt (Red Hat)
When CONFIG_X86_LOCAL_APIC is not set, apic.c is not compiled, so the
irq tracepoints are never created via the CREATE_TRACE_POINTS macro and
the build fails with:
      
        LD      init/built-in.o
      arch/x86/built-in.o: In function `trace_x86_platform_ipi_entry':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:66: undefined reference to `__tracepoint_x86_platform_ipi_entry'
      arch/x86/built-in.o: In function `trace_x86_platform_ipi_exit':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:66: undefined reference to `__tracepoint_x86_platform_ipi_exit'
      arch/x86/built-in.o: In function `trace_irq_work_entry':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:72: undefined reference to `__tracepoint_irq_work_entry'
      arch/x86/built-in.o: In function `trace_irq_work_exit':
      linux-test.git/arch/x86/include/asm/trace/irq_vectors.h:72: undefined reference to `__tracepoint_irq_work_exit'
      arch/x86/built-in.o:(__jump_table+0x8): undefined reference to `__tracepoint_x86_platform_ipi_entry'
      arch/x86/built-in.o:(__jump_table+0x14): undefined reference to `__tracepoint_x86_platform_ipi_exit'
      arch/x86/built-in.o:(__jump_table+0x20): undefined reference to `__tracepoint_irq_work_entry'
      arch/x86/built-in.o:(__jump_table+0x2c): undefined reference to `__tracepoint_irq_work_exit'
      make[1]: *** [vmlinux] Error 1
      make: *** [sub-make] Error 2
      
      As irq.c is always compiled for x86, it is a more appropriate location
      to create the irq tracepoints.
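
A minimal sketch of the mechanism, assuming the usual CREATE_TRACE_POINTS idiom
(the exact hunk is not reproduced here): the tracepoints are instantiated in the
translation unit that defines the macro before including the trace header, so
that unit must be one that is always built.

	/* arch/x86/kernel/irq.c -- always built on x86 */
	#define CREATE_TRACE_POINTS
	#include <asm/trace/irq_vectors.h>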
      
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      83ab8514
• x86, trace: Add irq vector tracepoints · cf910e83
Authored by Seiji Aguchi
      [Purpose of this patch]
      
      As Vaibhav explained in the thread below, tracepoints for irq vectors
      are useful.
      
      http://www.spinics.net/lists/mm-commits/msg85707.html
      
      <snip>
The current interrupt traces from irq_handler_entry and irq_handler_exit
show when an interrupt is handled.  They provide good data about when
the system has switched to kernel space and how it affects the currently
running processes.
      
There are some IRQ vectors which switch the system into kernel space
but are not handled by the generic IRQ handlers.  Tracing such events gives
us information about IRQ interaction with other system events.
      
      The trace also tells where the system is spending its time.  We want to
      know which cores are handling interrupts and how they are affecting other
      processes in the system.  Also, the trace provides information about when
      the cores are idle and which interrupts are changing that state.
      <snip>
      
On the other hand, my use case is tracing just the local timer event and
getting the value of the instruction pointer.

I previously suggested adding an instruction-pointer argument to the local timer event.
But the value can also be obtained with an external module such as systemtap,
so I don't need to add any argument to the irq vector tracepoints now.
      
      [Patch Description]
      
Vaibhav's patch shared a single pair of tracepoints, irq_vector_entry/irq_vector_exit, across all events.
But the use case above calls for tracing a specific irq vector rather than all events,
and in that case we are concerned about the overhead of unwanted events.

So, instead of introducing irq_vector_entry/exit, add the following tracepoints
so that each can be enabled independently (a sketch of the event definitions follows the list):
         - local_timer_vector
         - reschedule_vector
         - call_function_vector
         - call_function_single_vector
         - irq_work_entry_vector
         - error_apic_vector
         - thermal_apic_vector
         - threshold_apic_vector
         - spurious_apic_vector
         - x86_platform_ipi_vector
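
A hedged sketch of how such per-vector tracepoints are typically declared (an
approximation of the irq_vectors trace header, not a verbatim copy; one event
class is shared, and each vector gets independently switchable entry/exit events):

	DECLARE_EVENT_CLASS(x86_irq_vector,
		TP_PROTO(int vector),
		TP_ARGS(vector),
		TP_STRUCT__entry(
			__field(int, vector)
		),
		TP_fast_assign(
			__entry->vector = vector;
		),
		TP_printk("vector=%d", __entry->vector)
	);

	#define DEFINE_IRQ_VECTOR_EVENT(name)		\
	DEFINE_EVENT(x86_irq_vector, name##_entry,	\
		TP_PROTO(int vector),			\
		TP_ARGS(vector));			\
	DEFINE_EVENT(x86_irq_vector, name##_exit,	\
		TP_PROTO(int vector),			\
		TP_ARGS(vector));

	DEFINE_IRQ_VECTOR_EVENT(local_timer);
	DEFINE_IRQ_VECTOR_EVENT(reschedule);
	/* ... one pair per vector in the list above ... */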
      
Also, introduce logic to switch the IDT at enable/disable time so that the time
penalty is zero while the tracepoints are disabled. In detail:
 - Create trace irq handlers with entering_irq()/exiting_irq().
 - Create a new IDT, trace_idt_table, at boot time by adding logic to
   _set_gate(). It is simply a copy of the original IDT.
 - Register the new tracepoint handlers in the new IDT by introducing
   macros around alloc_intr_gate(), called when the irq_vector handlers are registered.
 - Add a check for whether irq vector tracing is on or off to load_current_idt()
   (see the sketch below). This check has to come after the debug check because:
   - switching to the debug IDT may be triggered while tracing is enabled, whereas
   - switching to the trace IDT is triggered only when debugging is disabled.
      
      In addition, the new IDT is created only when CONFIG_TRACING is enabled to avoid being
      used for other purposes.
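
A hedged sketch of the load_current_idt() ordering described above (helper names
approximate the patch, not verbatim):

	static inline void load_current_idt(void)
	{
		if (is_debug_idt_enabled())		/* debug IDT takes priority ... */
			load_debug_idt();
		else if (is_trace_idt_enabled())	/* ... then the trace IDT */
			load_trace_idt();
		else
			load_idt((const struct desc_ptr *)&idt_descr);
	}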
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/51C323ED.5050708@hds.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      cf910e83
• x86, trace: Introduce entering/exiting_irq() · eddc0e92
Authored by Seiji Aguchi
When implementing tracepoints in interrupt handlers, simply adding them
to the performance-sensitive path of each handler may cause a performance
problem due to the time penalty.
      
To solve the problem, the idea is to prepare non-trace and trace variants of
the irq handlers and switch IDTs when tracing is enabled or disabled.
      
      So, let's introduce entering_irq()/exiting_irq() for pre/post-
      processing of each irq handler.
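
A hedged sketch of what these helpers boil down to, based on the x86 APIC code of
that era (treat it as an approximation rather than the exact patch):

	static inline void entering_irq(void)
	{
		irq_enter();		/* includes rcu_irq_enter() */
		exit_idle();
	}

	static inline void exiting_irq(void)
	{
		irq_exit();
	}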
      
      A way to use them is as follows.
      
      Non-trace irq handler:
      smp_irq_handler()
      {
      	entering_irq();		/* pre-processing of this handler */
      	__smp_irq_handler();	/*
      				 * common logic between non-trace and trace handlers
      				 * in a vector.
      				 */
      	exiting_irq();		/* post-processing of this handler */
      
      }
      
      Trace irq_handler:
      smp_trace_irq_handler()
      {
      	entering_irq();		/* pre-processing of this handler */
      	trace_irq_entry();	/* tracepoint for irq entry */
      	__smp_irq_handler();	/*
      				 * common logic between non-trace and trace handlers
      				 * in a vector.
      				 */
      	trace_irq_exit();	/* tracepoint for irq exit */
      	exiting_irq();		/* post-processing of this handler */
      
      }
      
If the tracepoints could be placed outside entering_irq()/exiting_irq() as follows,
it would look cleaner.
      
      smp_trace_irq_handler()
      {
      	trace_irq_entry();
      	smp_irq_handler();
      	trace_irq_exit();
      }
      
But that doesn't work.
The problem is the placement of irq_enter()/irq_exit(): irq_enter() must be called
before trace_irq_entry() (and trace_irq_exit() must fire before irq_exit()), because
rcu_irq_enter() has to run before any tracepoint is used, as tracepoints use RCU to
synchronize.
      
      As a possible alternative, we may be able to call irq_enter() first as follows
      if irq_enter() can nest.
      
smp_trace_irq_handler()
{
	irq_enter();
      	trace_irq_entry();
      	smp_irq_handler();
      	trace_irq_exit();
      	irq_exit();
      }
      
But that doesn't work either.
If irq_enter() is nested, it incurs a time penalty because it has to check whether
it was already called. That penalty is not desired in performance-sensitive paths,
even if it is tiny.
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/51C3238D.9040706@hds.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      eddc0e92
2. 20 Feb 2013 (1 commit)
3. 28 Jan 2013 (2 commits)
4. 02 Nov 2012 (1 commit)
5. 19 Sep 2012 (1 commit)
6. 16 Jul 2012 (1 commit)
7. 14 Jun 2012 (3 commits)
8. 08 Jun 2012 (2 commits)
• x86/apic: Make cpu_mask_to_apicid() operations check cpu_online_mask · 4988a40c
Authored by Alexander Gordeev
Currently, cpu_mask_to_apicid() must not be passed a cpumask that
contains an offline CPU; otherwise some apic drivers (e.g. x2apic)
might try to access non-existent per-cpu variables. In that regard the
cpu_mask_to_apicid() and cpu_mask_to_apicid_and() operations are
inconsistent.
      
This fix makes the two operations independent of their callers: they
now always return an apicid only for online CPUs. As a result, the
meaning and implementations of cpu_mask_to_apicid() and
cpu_mask_to_apicid_and() become straightforward.
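
A minimal sketch of the resulting pattern (shown here together with the
error-code interface from the companion commit below; names approximate the
default implementation):

	int default_cpu_mask_to_apicid_and(const struct cpumask *cpumask,
					   const struct cpumask *andmask,
					   unsigned int *apicid)
	{
		unsigned int cpu;

		/* Only ever derive an APIC ID from an online CPU. */
		for_each_cpu_and(cpu, cpumask, andmask) {
			if (cpumask_test_cpu(cpu, cpu_online_mask))
				break;
		}

		if (cpu >= nr_cpu_ids)
			return -EINVAL;

		*apicid = per_cpu(x86_cpu_to_apicid, cpu);
		return 0;
	}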
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131624.GG4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4988a40c
• x86/apic: Make cpu_mask_to_apicid() operations return error code · ff164324
Authored by Alexander Gordeev
The current cpu_mask_to_apicid() and cpu_mask_to_apicid_and()
implementations have a few shortcomings:
      
1. The value returned by cpu_mask_to_apicid() is written to
hardware registers unconditionally. Should BAD_APICID ever be
returned, it will be written to the hardware too. But the value of
BAD_APICID is not universal across all hardware in all modes and
might cause unexpected results, e.g. interrupts might get routed
to CPUs that are not configured to receive them.
      
2. Because the value of BAD_APICID is not universal, it is
counter-intuitive to return it for hardware where it does not
make sense (e.g. x2apic).
      
3. The cpu_mask_to_apicid_and() operation is thought of as a
complement to cpu_mask_to_apicid() that merely applies an AND mask
on top of the cpumask being passed. Yet, as a consequence of commit
18374d89, the two operations are inconsistent in that:
  cpu_mask_to_apicid() must not be passed an offline CPU in the cpumask, while
  cpu_mask_to_apicid_and() must not fail and return BAD_APICID.
These limitations are impossible to discern just by looking at
the operations' prototypes.
      
Most of these shortcomings are resolved by returning an error
code instead of BAD_APICID. As a result, faults are reported
back early, rather than leaving the possibility of unexpected
behaviour (as in case 1).
      
The only exception is the setup_timer_IRQ0_pin() routine. Although
obviously at odds with this fix, its existing behaviour is
preserved so as not to break the fragile check_timer(); it would be
better addressed in a separate fix.
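
A hedged sketch of the interface change described above (struct-member layout
approximates arch/x86/include/asm/apic.h of that era):

	/* before: a magic value (BAD_APICID) signals failure */
	unsigned int (*cpu_mask_to_apicid)(const struct cpumask *cpumask);

	/* after: failure is an error code, the apicid becomes an output parameter */
	int (*cpu_mask_to_apicid)(const struct cpumask *cpumask,
				  unsigned int *apicid);
	int (*cpu_mask_to_apicid_and)(const struct cpumask *cpumask,
				      const struct cpumask *andmask,
				      unsigned int *apicid);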
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131559.GF4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ff164324
9. 06 Jun 2012 (1 commit)
10. 07 May 2012 (5 commits)
11. 19 Apr 2012 (1 commit)
12. 29 Mar 2012 (1 commit)
• x86/apic/amd: Be more verbose about LVT offset assignments · 8abc3122
Authored by Robert Richter
Add information about LVT offset assignments in order to better debug firmware
bugs related to them. See the following examples.
      
       # dmesg | grep -i 'offset\|ibs'
       LVT offset 0 assigned for vector 0xf9
       [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu
       [Firmware Bug]: cpu 0, IBS interrupt offset 0 not available (MSRC001103A=0x0000000000000100)
       Failed to setup IBS, -22
      
In this case the BIOS assigns both the MCE (0xf9) and IBS (0x400)
vectors to LVT offset 0, which is why the second APIC setup (IBS)
failed.
      
With a correct setup you get:
      
       # dmesg | grep -i 'offset\|ibs'
       LVT offset 0 assigned for vector 0xf9
       LVT offset 1 assigned for vector 0x400
       IBS: LVT offset 1 assigned
       perf: AMD IBS detected (0x00000007)
       oprofile: AMD IBS detected (0x00000007)
      
Note: the vector value also encodes the message type, so that NMIs
(0x400) can be handled as well. In the firmware bug message the format is
the same as that of the APIC500 register and additionally includes the
mask bit (bit 16).
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8abc3122
13. 24 Dec 2011 (2 commits)
14. 18 Dec 2011 (1 commit)
15. 14 Dec 2011 (1 commit)
• x86: Add per-cpu stat counter for APIC ICR read tries · 346b46be
Authored by Fernando Luis Vázquez Cao
      In the IPI delivery slow path (NMI delivery) we retry the ICR
      read to check for delivery completion a limited number of times.
      
[ The reason for the limited retries is that in some of the places
  where this is used (cpu boot, kdump, etc.) IPI delivery might not
  succeed (due to a firmware bug or system crash, for example),
  and in such a case it is better to give up and resume
  execution of other code. ]
      
      This patch adds a new entry to /proc/interrupts, RTR, which
      tells user space the number of times we retried the ICR read in
      the IPI delivery slow path.
      
This should give some insight into how well the APIC
message delivery hardware is working: if the counts are way
too large then we are hitting a (very) slow path far too
often.
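
A hedged sketch of the counting pattern described above (the loop approximates
the ICR wait path in arch/x86/kernel/apic/apic.c; treat names as illustrative):

	u32 safe_apic_wait_icr_idle(void)
	{
		u32 send_status;
		int timeout = 0;

		do {
			send_status = apic_read(APIC_ICR) & APIC_ICR_BUSY;
			if (!send_status)
				break;
			inc_irq_stat(icr_read_retry_count);	/* new per-cpu counter */
			udelay(100);
		} while (timeout++ < 1000);

		return send_status;
	}

The per-cpu counts are then summed up and shown as the RTR row of /proc/interrupts.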
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Cc: Jörn Engel <joern@logfs.org>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/n/tip-vzsp20lo2xdzh5f70g0eis2s@git.kernel.org
      [ extended the changelog ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
      346b46be
16. 12 Dec 2011 (1 commit)
• x86: Call idle notifier after irq_enter() · 98ad1cc1
Authored by Frederic Weisbecker
Interrupts notify the idle-exit state before calling irq_enter(),
but the notifier code calls rcu_read_lock(), which is not allowed
while RCU is in an extended quiescent state. We need to wait for
irq_enter() -> rcu_idle_exit() to be called before notifying,
otherwise this results in a grumpy RCU:
      
      [    0.099991] WARNING: at include/linux/rcupdate.h:194 __atomic_notifier_call_chain+0xd2/0x110()
      [    0.099991] Hardware name: AMD690VM-FMH
      [    0.099991] Modules linked in:
      [    0.099991] Pid: 0, comm: swapper Not tainted 3.0.0-rc6+ #255
      [    0.099991] Call Trace:
      [    0.099991]  <IRQ>  [<ffffffff81051c8a>] warn_slowpath_common+0x7a/0xb0
      [    0.099991]  [<ffffffff81051cd5>] warn_slowpath_null+0x15/0x20
      [    0.099991]  [<ffffffff817d6fa2>] __atomic_notifier_call_chain+0xd2/0x110
      [    0.099991]  [<ffffffff817d6ff1>] atomic_notifier_call_chain+0x11/0x20
      [    0.099991]  [<ffffffff81001873>] exit_idle+0x43/0x50
      [    0.099991]  [<ffffffff81020439>] smp_apic_timer_interrupt+0x39/0xa0
      [    0.099991]  [<ffffffff817da253>] apic_timer_interrupt+0x13/0x20
      [    0.099991]  <EOI>  [<ffffffff8100ae67>] ? default_idle+0xa7/0x350
      [    0.099991]  [<ffffffff8100ae65>] ? default_idle+0xa5/0x350
      [    0.099991]  [<ffffffff8100b19b>] amd_e400_idle+0x8b/0x110
      [    0.099991]  [<ffffffff810cb01f>] ? rcu_enter_nohz+0x8f/0x160
      [    0.099991]  [<ffffffff810019a0>] cpu_idle+0xb0/0x110
      [    0.099991]  [<ffffffff817a7505>] rest_init+0xe5/0x140
      [    0.099991]  [<ffffffff817a7468>] ? rest_init+0x48/0x140
      [    0.099991]  [<ffffffff81cc5ca3>] start_kernel+0x3d1/0x3dc
      [    0.099991]  [<ffffffff81cc5321>] x86_64_start_reservations+0x131/0x135
      [    0.099991]  [<ffffffff81cc5412>] x86_64_start_kernel+0xed/0xf4
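
A hedged sketch of the resulting ordering in an APIC interrupt handler
(simplified; APIC ack and irq-regs handling omitted):

	void smp_apic_timer_interrupt(struct pt_regs *regs)
	{
		irq_enter();	/* pulls RCU out of its extended quiescent state */
		exit_idle();	/* the idle-exit notifier may now use rcu_read_lock() */
		local_apic_timer_interrupt();
		irq_exit();
	}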
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Andy Henroid <andrew.d.henroid@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      98ad1cc1
17. 10 Nov 2011 (1 commit)
18. 21 Sep 2011 (2 commits)
19. 27 Jul 2011 (1 commit)
20. 13 Jul 2011 (1 commit)
21. 09 Jul 2011 (1 commit)
• x86, boot: Wait for boot cpu to show up if nr_cpus limit is about to hit · 14cb6dcf
Authored by Vivek Goyal
nr_cpus allows one to specify the number of possible cpus in the system.
The current assumption seems to be that the first cpu to show up is the
boot cpu, and this assumption breaks in the kdump scenario, where we can
be booting on a non-boot cpu with nr_cpus=1.

It might happen that the first cpu we parse is not the cpu we boot on,
so the boot cpu is ignored later. The code does recognize this anomaly
and forcibly sets the boot cpu in the physical cpu map with the following
warning:
      
      if (!physid_isset(hard_smp_processor_id(), phys_cpu_present_map)) {
              printk(KERN_WARNING
                      "weird, boot CPU (#%d) not listed by the BIOS.\n",
                      hard_smp_processor_id());
      
              physid_set(hard_smp_processor_id(), phys_cpu_present_map);
      }
      
This patch waits for the boot cpu to show up and starts ignoring cpus
once we have hit (nr_cpus - 1) cpus. Effectively, we are explicitly
reserving one slot out of nr_cpus for the boot cpu.
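
A hedged sketch of the reservation logic described above, condensed from
generic_processor_info() (variable names approximate the patch):

	bool boot_cpu_detected = physid_isset(boot_cpu_physical_apicid,
					      phys_cpu_present_map);

	/*
	 * Until the boot cpu has shown up, accept at most nr_cpu_ids - 1
	 * other cpus so that one slot stays reserved for it.
	 */
	if (!boot_cpu_detected && num_processors >= nr_cpu_ids - 1 &&
	    apicid != boot_cpu_physical_apicid) {
		pr_warning("NR_CPUS/possible_cpus limit almost reached. "
			   "Keeping one slot for boot cpu. Processor 0x%x ignored.\n",
			   apicid);
		return;
	}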
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20110708171926.GF2930@redhat.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      14cb6dcf
22. 09 Jun 2011 (2 commits)
23. 30 May 2011 (1 commit)
24. 20 May 2011 (1 commit)
25. 02 May 2011 (2 commits)
• x86-32, NUMA: Make apic->x86_32_numa_cpu_node() optional · 84914ed0
Authored by Tejun Heo
NUMAQ is the only meaningful user of this callback, and
setup_local_APIC() is its only callsite.  Stop torturing everyone else by
making the callback optional and removing all the boilerplate
implementations and assignments.
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      84914ed0
• x86-32, NUMA: Automatically set apicid -> node in setup_local_APIC() · c4b90c11
Authored by Tejun Heo
Some x86-32 NUMA implementations (NUMAQ) don't initialize the apicid ->
node mapping using set_apicid_to_node() during NUMA init but implement
a custom apic->x86_32_numa_cpu_node() instead.
      
This patch automatically initializes the default apicid -> node mapping
table from apic->x86_32_numa_cpu_node() in setup_local_APIC(), so that
the mapping table is in sync with the actual mapping.
      
As the table isn't used by the custom implementations, this doesn't make
any difference at this point.  It is in preparation for unifying
numa_cpu_node() between x86-32 and x86-64.
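
A hedged sketch of the hunk this adds to setup_local_APIC() (an approximation of
the 32-bit code of that era, not a verbatim copy):

	#ifdef CONFIG_X86_32
		/*
		 * Some NUMA implementations (NUMAQ) don't fill the apicid -> node
		 * mapping during NUMA init; keep the default table in sync with the
		 * custom callback when one is provided.
		 */
		if (apic->x86_32_numa_cpu_node)
			set_apicid_to_node(early_per_cpu(x86_cpu_to_apicid, cpu),
					   apic->x86_32_numa_cpu_node(cpu));
	#endif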
Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      c4b90c11
26. 20 Apr 2011 (1 commit)
• x86, apic: Print verbose error interrupt reason on apic=debug · 2b398bd9
Authored by Youquan Song
      End users worry about the error interrupt printout we generate
      currently:
      
      	pr_debug("APIC error on CPU%d: %02x(%02x)\n",
      	smp_processor_id(), v , v1);
      
      ... and would like to know the reason why error interrupts are generated.
      
      This patch prints out more detailed debug information.
      
Another practical problem is that dynamic debug is not yet initialized
when the APIC initializes, so pr_debug() will not output the error-interrupt
debug information during bootup. This patch therefore uses
apic_printk(APIC_DEBUG, ...), so that the apic=debug boot option prints
verbose error interrupts during bootup.
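
A hedged sketch of the reporting pattern described above (the reason strings
follow the APIC ESR bit layout; treat the exact wording and loop as illustrative):

	static const char * const error_interrupt_reason[] = {
		"Send CS error",		/* APIC Error Bit 0 */
		"Receive CS error",		/* APIC Error Bit 1 */
		"Send accept error",		/* APIC Error Bit 2 */
		"Receive accept error",		/* APIC Error Bit 3 */
		"Redirectable IPI",		/* APIC Error Bit 4 */
		"Send illegal vector",		/* APIC Error Bit 5 */
		"Received illegal vector",	/* APIC Error Bit 6 */
		"Illegal register address",	/* APIC Error Bit 7 */
	};

	apic_printk(APIC_DEBUG, KERN_DEBUG "APIC error on CPU%d: %02x(%02x)",
		    smp_processor_id(), v, v1);
	for (i = 0; i < 8; i++)
		if (v1 & (1 << i))
			apic_printk(APIC_DEBUG, KERN_CONT " : %s",
				    error_interrupt_reason[i]);
	apic_printk(APIC_DEBUG, KERN_CONT "\n");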
Signed-off-by: Youquan Song <youquan.song@intel.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: hpa@linux.intel.com
      Cc: suresh.b.siddha@intel.com
      Cc: yong.y.wang@linux.intel.com
      Cc: jbaron@redhat.com
      Cc: trenn@suse.de
      Cc: kent.liu@intel.com
      Cc: chaohong.guo@intel.com
Link: http://lkml.kernel.org/r/1302762968-24380-2-git-send-email-youquan.song@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
      2b398bd9