提交 · b79109c3bbcf52cac5103979b283b9e5df4e796c · openanolis / cloud-kernel

20 2月, 2009 4 次提交

x86, mce: separate correct machine check poller and fatal exception handler · b79109c3

由 Andi Kleen 提交于 2月 12, 2009

Impact: cleanup, performance enhancement

The machine check poller is diverging more and more from the fatal
exception handler. Instead of adding more special cases separate the code
paths completely. The corrected poll path is actually quite simple,
and this doesn't result in much code duplication.

This makes both handlers much easier to read and results in
cleaner code flow.  The exception handler now only needs to care
about uncorrected errors, which also simplifies the handling of multiple
errors. The corrected poller also now always runs in standard interrupt
context and does not need to do anything special to handle NMI context.

Minor behaviour changes:
- MCG status is now not cleared on polling.
- Only the banks which had corrected errors get cleared on polling
- The exception handler only clears banks with errors now

v2: Forward port to new patch order. Add "uc" argument.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

b79109c3

x86, mce: factor out duplicated struct mce setup into one function · b5f2fa4e

由 Andi Kleen 提交于 2月 12, 2009

Impact: cleanup

This merely factors out duplicated code to set up
the initial struct mce state into a single function.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

b5f2fa4e

x86, mce: implement dynamic machine check banks support · 0d7482e3

由 Andi Kleen 提交于 2月 17, 2009

Impact: cleanup; making code future proof; memory saving on small systems

This patch replaces the hardcoded max number of machine check banks with 
dynamic allocation depending on what the CPU reports. The sysfs
data structures and the banks array are dynamically allocated.

There is still a hard bank limit (128) because the mcelog protocol uses
banks >= 128 as pseudo banks to escape other events. But we expect
that 128 banks is beyond any reasonable CPU for now.

This supersedes an earlier patch by Venki, but it solves the problem
more completely by making the limit fully dynamic (up to the 128
boundary).

This saves some memory on machines with less than 6 banks because
they won't need sysdevs for unused ones and also allows to 
use sysfs to control these banks on possible future CPUs with
more than 6 banks.

This is an updated patch addressing Venki's comments.  I also added in
another patch from Thomas which fixed the error allocation path (that
patch was previously separated)

Cc: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

0d7482e3

x86, mce: enable machine checks in 64-bit defconfig · e35849e9

由 Andi Kleen 提交于 2月 12, 2009

Impact: Low priority fix

The 32-bit defconfig already had it enabled. And it's a pretty
fundamental feature, so better enable it on 64 bits too.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

e35849e9

18 2月, 2009 11 次提交

x86, mce: fix a race condition in mce_read() · ef41df43

由 Huang Ying 提交于 2月 12, 2009

Impact: bugfix

Considering the situation as follow:

before: mcelog.next == 1, mcelog.entry[0].finished = 1

+--------------------------------------------------------------------------
R                   W1                  W2                  W3

read mcelog.next (1)
                    mcelog.next++ (2)
                    (working on entry 1,
                    finished == 0)

mcelog.next = 0
                                        mcelog.next++ (1)
                                        (working on entry 0)
                                                           mcelog.next++ (2)
                                                           (working on entry 1)
                        <----------------- race ---------------->
                    (done on entry 1,
                    finished = 1)
                                                           (done on entry 1,
                                                           finished = 1)

To fix the race condition, a cmpxchg loop is added to mce_read() to
ensure no new MCE record can be added between mcelog.next reading and
mcelog.next = 0.
Signed-off-by: NHuang Ying <ying.huang@intel.com>
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

ef41df43

x86, mce: disable machine checks on offlined CPUs · d6b75584

由 Andi Kleen 提交于 2月 12, 2009

Impact: Lower priority bug fix

Offlined CPUs could still get machine checks, but the machine check handler
cannot handle them properly, leading to an unconditional crash. Disable
machine checks on CPUs that are going down.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

d6b75584

x86, mce: don't set up mce sysdev devices with mce=off · 5b4408fd

由 Andi Kleen 提交于 2月 12, 2009

Impact: bug fix, in this case the resume handler shouldn't run which
	avoids incorrectly reenabling machine checks on resume

When MCEs are completely disabled on the command line don't set
up the sysdev devices for them either.

Includes a comment fix from Thomas Gleixner.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

5b4408fd

x86, mce: switch machine check polling to per CPU timer · 52d168e2

由 Andi Kleen 提交于 2月 12, 2009

Impact: Higher priority bug fix

The machine check poller runs a single timer and then broadcasted an
IPI to all CPUs to check them. This leads to unnecessary
synchronization between CPUs. The original CPU running the timer has
to wait potentially a long time for all other CPUs answering. This is
also real time unfriendly and in general inefficient.

This was especially a problem on systems with a lot of events where
the poller run with a higher frequency after processing some events.
There could be more and more CPU time wasted with this, to
the point of significantly slowing down machines.

The machine check polling is actually fully independent per CPU, so
there's no reason to not just do this all with per CPU timers.  This
patch implements that.

Also switch the poller also to use standard timers instead of work
queues. It was using work queues to be able to execute a user program
on a event, but mce_notify_user() handles this case now with a
separate callback. So instead always run the poll code in in a
standard per CPU timer, which means that in the common case of not
having to execute a trigger there will be less overhead.

This allows to clean up the initialization significantly, because
standard timers are already up when machine checks get init'ed.  No
multiple initialization functions.

Thanks to Thomas Gleixner for some help.

Cc: thockin@google.com
v2: Use del_timer_sync() on cpu shutdown and don't try to handle
migrated timers.
v3: Add WARN_ON for timer running on unexpected CPU
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

52d168e2

x86, mce: always use separate work queue to run trigger · 9bd98405

由 Andi Kleen 提交于 2月 12, 2009

Impact: Needed for bug fix in next patch

This relaxes the requirement that mce_notify_user has to run in process
context. Useful for future changes, but also leads to cleaner
behaviour now. Now instead mce_notify_user can be called directly
from interrupt (but not NMI) context.

The work queue only uses a single global work struct, which can be done safely
because it is always free to reuse before the trigger function is executed.
This way no events can be lost.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

9bd98405

x86, mce: don't disable machine checks during code patching · 123aa76e

由 Andi Kleen 提交于 2月 12, 2009

Impact: low priority bug fix

This removes part of a a patch I added myself some time ago. After some
consideration the patch was a bad idea. In particular it stopped machine check
exceptions during code patching.

To quote the comment:

        * MCEs only happen when something got corrupted and in this
        * case we must do something about the corruption.
        * Ignoring it is worse than a unlikely patching race.
        * Also machine checks tend to be broadcast and if one CPU
        * goes into machine check the others follow quickly, so we don't
        * expect a machine check to cause undue problems during to code
        * patching.

So undo the machine check related parts of
8f4e956b NMIs are still disabled.

This only removes code, the only additions are a new comment.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

123aa76e

x86, mce: disable machine checks on suspend · 973a2dd1

由 Andi Kleen 提交于 2月 12, 2009

Impact: Bug fix

During suspend it is not reliable to process machine check
exceptions, because CPUs disappear but can still get machine check
broadcasts.  Also the system is slightly more likely to
machine check them, but the handler is typically not a position
to handle them in a meaningfull way.

So disable them during suspend and enable them during resume.

Also make sure they are always disabled on hot-unplugged CPUs.

This new code assumes that suspend always hotunplugs all
non BP CPUs.

v2: Remove the WARN_ONs Thomas objected to.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

973a2dd1

x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown · 07db1c14

由 Andi Kleen 提交于 2月 12, 2009

Impact: Bugfix

The ifdef for the apic clear on shutdown for the 64bit intel thermal
vector was incorrect and never triggered. Fix that.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

07db1c14

x86, mce: use force_sig_info to kill process in machine check · 380851bc

由 Andi Kleen 提交于 2月 12, 2009

Impact: bug fix (with tolerant == 3)

do_exit cannot be called directly from the exception handler because
it can sleep and the exception handler runs on the exception stack.
Use force_sig() instead.

Based on a earlier patch by Ying Huang who debugged the problem.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

380851bc

x86, mce: reinitialize per cpu features on resume · 6ec68bff

由 Andi Kleen 提交于 2月 12, 2009

Impact: Bug fix

This fixes a long standing bug in the machine check code. On resume the
boot CPU wouldn't get its vendor specific state like thermal handling
reinitialized. This means the boot cpu wouldn't ever get any thermal
events reported again.

Call the respective initialization functions on resume

v2: Remove ancient init because they don't have a resume device anyways.
    Pointed out by Thomas Gleixner.
v3: Now fix the Subject too to reflect v2 change
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

6ec68bff

x86, rcu: fix strange load average and ksoftirqd behavior · bf51935f

由 Paul E. McKenney 提交于 2月 17, 2009

Damien Wyart reported high ksoftirqd CPU usage (20%) on an
otherwise idle system.

The function-graph trace Damien provided:

>   799.521187 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521371 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521555 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521738 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.521934 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522068 |   1)  ksoftir-2324  |               |                rcu_check_callbacks() {
>   799.522208 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522392 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522575 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522759 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.522956 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523074 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.523214 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523397 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523579 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523762 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.523960 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524079 |   1)  ksoftir-2324  |               |                  rcu_check_callbacks() {
>   799.524220 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524403 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524587 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
>   799.524770 |   1)    <idle>-0    |               |  rcu_check_callbacks() {
> [ . . . ]

Shows rcu_check_callbacks() being invoked way too often. It should be called
once per jiffy, and here it is called no less than 22 times in about
3.5 milliseconds, meaning one call every 160 microseconds or so.

Why do we need to call rcu_pending() and rcu_check_callbacks() from the
idle loop of 32-bit x86, especially given that no other architecture does
this?

The following patch removes the call to rcu_pending() and
rcu_check_callbacks() from the x86 32-bit idle loop in order to
reduce the softirq load on idle systems.
Reported-by: NDamien Wyart <damien.wyart@free.fr>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bf51935f

15 2月, 2009 2 次提交

x86, vm86: fix preemption bug · be716615

由 Thomas Gleixner 提交于 1月 13, 2009

Commit 3d2a71a5 ("x86, traps: converge
do_debug handlers") changed the preemption disable logic of do_debug()
so vm86_handle_trap() is called with preemption disabled resulting in:

 BUG: sleeping function called from invalid context at include/linux/kernel.h:155
 in_atomic(): 1, irqs_disabled(): 0, pid: 3005, name: dosemu.bin
 Pid: 3005, comm: dosemu.bin Tainted: G        W  2.6.29-rc1 #51
 Call Trace:
  [<c050d669>] copy_to_user+0x33/0x108
  [<c04181f4>] save_v86_state+0x65/0x149
  [<c0418531>] handle_vm86_trap+0x20/0x8f
  [<c064e345>] do_debug+0x15b/0x1a4
  [<c064df1f>] debug_stack_correct+0x27/0x2c
  [<c040365b>] sysenter_do_call+0x12/0x2f
 BUG: scheduling while atomic: dosemu.bin/3005/0x10000001

Restore the original calling convention and reenable preemption before
calling handle_vm86_trap().
Reported-by: NMichal Suchanek <hramrach@centrum.cz>
Cc: stable@kernel.org
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

be716615

x86, olpc: fix model detection without OFW · e49590b6

由 Chris Ball 提交于 2月 13, 2009

Impact: fix "garbled display, laptop is unusable" bug

Commit e51a1ac2 ("x86, olpc: fix endian
bug in openfirmware workaround") breaks model comparison on OLPC; the value
0xc2 needs to be scaled up by olpc_board().

The pre-patch version was wrong, but accidentally worked anyway
(big-endian 0xc2 is big enough to satisfy all other board revisions,
but little endian 0xc2 is not).
Signed-off-by: NChris Ball <cjb@laptop.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: NAndres Salomon <dilinger@queued.net>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e49590b6

13 2月, 2009 4 次提交

x86, hpet: fix for LS21 + HPET = boot hang · b13e2464

由 john stultz 提交于 2月 12, 2009

Between 2.6.23 and 2.6.24-rc1 a change was made that broke IBM LS21
systems that had the HPET enabled in the BIOS, resulting in boot hangs
for x86_64.

Specifically commit b8ce3359, which
merges the i386 and x86_64 HPET code.

Prior to this commit, when we setup the HPET timers in x86_64, we did
the following:

	hpet_writel(HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL |
                    HPET_TN_32BIT, HPET_T0_CFG);

However after the i386/x86_64 HPET merge, we do the following:

	cfg = hpet_readl(HPET_Tn_CFG(timer));
	cfg |= HPET_TN_ENABLE | HPET_TN_PERIODIC |
			HPET_TN_SETVAL | HPET_TN_32BIT;
	hpet_writel(cfg, HPET_Tn_CFG(timer));

However on LS21s with HPET enabled in the BIOS, the HPET_T0_CFG register
boots with Level triggered interrupts (HPET_TN_LEVEL) enabled. This
causes the periodic interrupt to be not so periodic, and that results in
the boot time hang I reported earlier in the delay calibration.

My fix: Always disable HPET_TN_LEVEL when setting up periodic mode.
Signed-off-by: NJohn Stultz <johnstul@us.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b13e2464

x86: CPA avoid repeated lazy mmu flush · 7ad9de6a

由 Thomas Gleixner 提交于 2月 12, 2009

Impact: Flush the lazy MMU only once

Pending mmu updates only need to be flushed once to bring the
in-memory pagetable state up to date.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

7ad9de6a

x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context · 34b0900d

由 Thomas Gleixner 提交于 2月 12, 2009

Impact: Catch cases where lazy MMU state is active in a preemtible context

arch_flush_lazy_mmu_cpu() has been changed to disable preemption so
the checks in enter/leave will never trigger. Put the preemtible()
check into arch_flush_lazy_mmu_cpu() to catch such cases.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

34b0900d

x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption · d85cf93d

由 Jeremy Fitzhardinge 提交于 2月 12, 2009

Impact: avoid access to percpu vars in preempible context

They are intended to be used whenever there's the possibility
that there's some stale state which is going to be overwritten
with a queued update, or to force a state change when we may be
in lazy mode.  Either way, we could end up calling it with
preemption enabled, so wrap the functions in their own little
preempt-disable section so they can be safely called in any
context (though preemption should never be enabled if we're actually
in a lazy state).

(Move out of line to avoid #include dependencies.)
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

d85cf93d

12 2月, 2009 2 次提交

x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem · be03d9e8

由 Suresh Siddha 提交于 2月 11, 2009

Jeff Mahoney reported:

> With Suse's hwinfo tool, on -tip:
> WARNING: at arch/x86/mm/pat.c:637 reserve_pfn_range+0x5b/0x26d()

reserve_pfn_range() is not tracking the memory range below 1MB
as non-RAM and as such is inconsistent with similar checks in
reserve_memtype() and free_memtype()

Rename the pagerange_is_ram() to pat_pagerange_is_ram() and add the
"track legacy 1MB region as non RAM" condition.

And also, fix reserve_pfn_range() to return -EINVAL, when the pfn
range is RAM. This is to be consistent with this API design.
Reported-and-tested-by: NJeff Mahoney <jeffm@suse.com>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

be03d9e8

x86/cpa: make sure cpa is safe to call in lazy mmu mode · 4f06b043

由 Jeremy Fitzhardinge 提交于 2月 11, 2009

Impact: fix race leading to crash under KVM and Xen

The CPA code may be called while we're in lazy mmu update mode - for
example, when using DEBUG_PAGE_ALLOC and doing a slab allocation
in an interrupt handler which interrupted a lazy mmu update.  In this
case, the in-memory pagetable state may be out of date due to pending
queued updates.  We need to flush any pending updates before inspecting
the page table.  Similarly, we must explicitly flush any modifications
CPA may have made (which comes down to flushing queued operations when
flushing the TLB).
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: NMarcelo Tosatti <mtosatti@redhat.com>
Cc: Stable Kernel <stable@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

4f06b043

11 2月, 2009 2 次提交

x86, ptrace, mm: fix double-free on race · 9f339e70

由 Markus Metzger 提交于 2月 11, 2009

Ptrace_detach() races with __ptrace_unlink() if the traced task is
reaped while detaching. This might cause a double-free of the BTS
buffer.

Change the ptrace_detach() path to only do the memory accounting in
ptrace_bts_detach() and leave the buffer free to ptrace_bts_untrace()
which will be called from __ptrace_unlink().

The fix follows a proposal from Oleg Nesterov.
Reported-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

9f339e70

ptrace, x86: fix the usage of ptrace_fork() · 06eb23b1

由 Oleg Nesterov 提交于 2月 09, 2009

I noticed by pure accident we have ptrace_fork() and friends. This was
added by "x86, bts: add fork and exit handling", commit
bf53de90.

I can't test this, ds_request_bts() returns -EOPNOTSUPP, but I strongly
believe this needs the fix. I think something like this program

	int main(void)
	{
		int pid = fork();

		if (!pid) {
			ptrace(PTRACE_TRACEME, 0, NULL, NULL);
			kill(getpid(), SIGSTOP);
			fork();
		} else {
			struct ptrace_bts_config bts = {
				.flags = PTRACE_BTS_O_ALLOC,
				.size  = 4 * 4096,
			};

			wait(NULL);

			ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACEFORK);
			ptrace(PTRACE_BTS_CONFIG, pid, &bts, sizeof(bts));
			ptrace(PTRACE_CONT, pid, NULL, NULL);

			sleep(1);
		}

		return 0;
	}

should crash the kernel.

If the task is traced by its natural parent ptrace_reparented() returns 0
but we should clear ->btsxxx anyway.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Acked-by: NMarkus Metzger <markus.t.metzger@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

06eb23b1

10 2月, 2009 2 次提交

i8327: fix outb() parameter order · b52af409

由 Clemens Ladisch 提交于 2月 10, 2009

In i8237A_resume(), when resetting the DMA controller, the parameters to
dma_outb() were mixed up.
Signed-off-by: NClemens Ladisch <clemens@ladisch.de>
[ cleaned up the file a tiny bit. ]
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b52af409

x86: fix math_emu register frame access · d315760f

由 Tejun Heo 提交于 2月 09, 2009

do_device_not_available() is the handler for #NM and it declares that
it takes a unsigned long and calls math_emu(), which takes a long
argument and surprisingly expects the stack frame starting at the zero
argument would match struct math_emu_info, which isn't true regardless
of configuration in the current code.

This patch makes do_device_not_available() take struct pt_regs like
other exception handlers and initialize struct math_emu_info with
pointer to it and pass pointer to the math_emu_info to math_emulate()
like normal C functions do.  This way, unless gcc makes a copy of
struct pt_regs in do_device_not_available(), the register frame is
correctly accessed regardless of kernel configuration or compiler
used.

This doesn't fix all math_emu problems but it at least gets it
somewhat working.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d315760f

09 2月, 2009 5 次提交

x86: math_emu info cleanup · ae6af41f

由 Tejun Heo 提交于 2月 09, 2009

Impact: cleanup

* Come on, struct info?  s/struct info/struct math_emu_info/

* Use struct pt_regs and kernel_vm86_regs instead of defining its own
  register frame structure.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ae6af41f

x86: include correct %gs in a.out core dump · 914c3d63

由 Tejun Heo 提交于 2月 09, 2009

Impact: dump the correct %gs into a.out core dump

aout_dump_thread() read %gs but didn't include it in core dump.  Fix
it.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

914c3d63

x86, vmi: put a missing paravirt_release_pmd in pgd_dtor · 55a8ba4b

由 Alok Kataria 提交于 2月 06, 2009

Commit 6194ba6f ("x86: don't special-case
pmd allocations as much") made changes to the way we handle pmd allocations,
and while doing that it dropped a call to  paravirt_release_pd on the
pgd page from the pgd_dtor code path.

As a result of this missing release, the hypervisor is now unaware of the
pgd page being freed, and as a result it ends up tracking this page as a
page table page.

After this the guest may start using the same page for other purposes, and
depending on what use the page is put to, it may result in various performance
and/or functional issues ( hangs, reboots).

Since this release is only required for VMI, I now release the pgd page from
the (vmi)_pgd_free hook.
Signed-off-by: NAlok N Kataria <akataria@vmware.com>
Acked-by: NJeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>

55a8ba4b

x86: find nr_irqs_gsi with mp_ioapic_routing · 3f4a739c

由 Yinghai Lu 提交于 2月 08, 2009

Impact: find right nr_irqs_gsi on some systems.

One test-system has gap between gsi's:

[    0.000000] ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 4, version 0, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: IOAPIC (id[0x05] address[0xfeafd000] gsi_base[48])
[    0.000000] IOAPIC[1]: apic_id 5, version 0, address 0xfeafd000, GSI 48-54
[    0.000000] ACPI: IOAPIC (id[0x06] address[0xfeafc000] gsi_base[56])
[    0.000000] IOAPIC[2]: apic_id 6, version 0, address 0xfeafc000, GSI 56-62
...
[    0.000000] nr_irqs_gsi: 38

So nr_irqs_gsi is not right. some irq for MSI will overwrite with io_apic.

need to get that with acpi_probe_gsi when acpi io_apic is used
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

3f4a739c

x86: add clflush before monitor for Intel 7400 series · e736ad54

由 Pallipadi, Venkatesh 提交于 2月 06, 2009

For Intel 7400 series CPUs, the recommendation is to use a clflush on the
monitored address just before monitor and mwait pair [1].

This clflush makes sure that there are no false wakeups from mwait when the
monitored address was recently written to.

[1] "MONITOR/MWAIT Recommendations for Intel Xeon Processor 7400 series"
    section in specification update document of 7400 series
    http://download.intel.com/design/xeon/specupdt/32033601.pdfSigned-off-by: NVenkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e736ad54

05 2月, 2009 5 次提交

x86: disable intel_iommu support by default · 0cd5c3c8

由 Kyle McMartin 提交于 2月 04, 2009

Due to recurring issues with DMAR support on certain platforms.
There's a number of filesystem corruption incidents reported:

  https://bugzilla.redhat.com/show_bug.cgi?id=479996
  http://bugzilla.kernel.org/show_bug.cgi?id=12578

Provide a Kconfig option to change whether it is enabled by
default.

If disabled, it can still be reenabled by passing intel_iommu=on to the
kernel. Keep the .config option off by default.
Signed-off-by: NKyle McMartin <kyle@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Acked-By: NDavid Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

0cd5c3c8

x86: don't apply __supported_pte_mask to non-present ptes · b534816b

由 Jeremy Fitzhardinge 提交于 2月 04, 2009

On an x86 system which doesn't support global mappings,
__supported_pte_mask has _PAGE_GLOBAL clear, to make sure it never
appears in the PTE.  pfn_pte() and so on will enforce it with:

static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
{
	return __pte((((phys_addr_t)page_nr << PAGE_SHIFT) |
		      pgprot_val(pgprot)) & __supported_pte_mask);
}

However, we overload _PAGE_GLOBAL with _PAGE_PROTNONE on non-present
ptes to distinguish them from swap entries.  However, applying
__supported_pte_mask indiscriminately will clear the bit and corrupt the
pte.

I guess the best fix is to only apply __supported_pte_mask to present
ptes.  This seems like the right solution to me, as it means we can
completely ignore the issue of overlaps between the present pte bits and
the non-present pte-as-swap entry use of the bits.

__supported_pte_mask contains the set of flags we support on the
current hardware.  We also use bits in the pte for things like
logically present ptes with no permissions, and swap entries for
swapped out pages.  We should only apply __supported_pte_mask to
present ptes, because otherwise we may destroy other information being
stored in the ptes.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

b534816b

x86: fix grammar in user-visible BIOS warning · 45608399

由 Alex Chiang 提交于 2月 04, 2009

Fix user-visible grammo.
Signed-off-by: NAlex Chiang <achiang@hp.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

45608399

x86/Kconfig.cpu: make Kconfig help readable in the console · 36723bfe

由 Borislav Petkov 提交于 2月 04, 2009

Impact: cleanup

Some lines exceed the 80 char width making them unreadable.
Signed-off-by: NBorislav Petkov <petkovbb@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

36723bfe

x86, 64-bit: print DMI info in the oops trace · 48ec4d95

由 Kyle McMartin 提交于 2月 04, 2009

This patch echoes what we already do on 32-bit since
90f7d25c, and prints the DMI
product name in show_regs, so that system specific problems can be
easily identified.
Signed-off-by: NKyle McMartin <kyle@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

48ec4d95

04 2月, 2009 3 次提交

I

Merge branch 'core/xen' into x86/urgent · bb960a1e
由 Ingo Molnar 提交于 2月 04, 2009

bb960a1e

x86: APIC: enable workaround on AMD Fam10h CPUs · 85877061

由 Borislav Petkov 提交于 2月 03, 2009

Impact: fix to enable APIC for AMD Fam10h on chipsets with a missing/b0rked
	ACPI MP table (MADT)

Booting a 32bit kernel on an AMD Fam10h CPU running on chipsets with
missing/b0rked MP table leads to a hang pretty early in the boot process
due to the APIC not being initialized. Fix that by falling back to the
default APIC base address in 32bit code, as it is done in the 64bit
codepath.
Signed-off-by: NBorislav Petkov <borislav.petkov@amd.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

85877061

xen: disable interrupts before saving in percpu · 06fc732c

由 Jeremy Fitzhardinge 提交于 2月 03, 2009

Impact: Fix race condition

xen_mc_batch has a small preempt race where it takes the address of a
percpu variable immediately before disabling interrupts, thereby
leaving a small window in which we may migrate to another cpu and save
the flags in the wrong percpu variable.  Disable interrupts before
saving the old flags in a percpu.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

06fc732c

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功