提交 · fb746d0e1365b7472ccc4c3d5b0672b34a092d0b · openanolis / cloud-kernel

22 1月, 2009 1 次提交

x86: optimise page fault entry, cleanup · fb746d0e

由 Johannes Weiner 提交于 1月 21, 2009

tsk is already assigned to current, drop the redundant second
assignment.
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

fb746d0e

20 1月, 2009 1 次提交

x86: optimise x86's do_page_fault (C entry point for the page fault path) · 92181f19

由 Nick Piggin 提交于 1月 20, 2009

Impact: cleanup, restructure code to improve assembly

gcc isn't _all_ that smart about spilling registers to stack or reusing
stack slots, even with branch annotations. do_page_fault contained a lot
of functionality, so split unlikely paths into their own functions, and
mark them as noinline just to be sure. I consider this actually to be
somewhat of a cleanup too: the main function now contains about half
the number of lines so the normal path is easier to read, while the error
cases are also nicely split away.

Also, ensure the order of arguments to functions is always the same: regs,
addr, error_code. This can reduce code size a tiny bit, and just looks neater
too.

And add a couple of branch annotations.

Before:
  do_page_fault:
          subq    $360, %rsp      #,

After:
  do_page_fault:
          subq    $56, %rsp       #,

bloat-o-meter:
  add/remove: 8/0 grow/shrink: 0/1 up/down: 2222/-1680 (542)
  function                                     old     new   delta
  __bad_area_nosemaphore                         -     506    +506
  no_context                                     -     474    +474
  vmalloc_fault                                  -     424    +424
  spurious_fault                                 -     358    +358
  mm_fault_error                                 -     272    +272
  bad_area_access_error                          -      89     +89
  bad_area                                       -      89     +89
  bad_area_nosemaphore                           -      10     +10
  do_page_fault                               2464     784   -1680

Yes, the total size increases by 542 bytes, due to the extra function calls.
But these will very rarely be called (except for vmalloc_fault) in a normal
workload. Importantly, do_page_fault is less than 1/3rd it's original size,
and touches far less stack.

Existing gotos and branch hints did move a lot of the infrequently used text
out of the fastpath, but that's even further improved after this patch.
Signed-off-by: NNick Piggin <npiggin@suse.de>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

92181f19

13 1月, 2009 1 次提交

x86: avoid theoretical vmalloc fault loop · f313e123

由 Andi Kleen 提交于 1月 09, 2009

Ajith Kumar noticed:

 I was going through the vmalloc fault handling for x86_64 and am unclear
 about the following lines in the vmalloc_fault() function.

 pgd = pgd_offset(current->mm ?: &init_mm, address);
 pgd_ref = pgd_offset_k(address);

 Here the intention is to get the pgd corresponding to the current process
 and sync it up with the pgd in init_mm(obtained from pgd_offset_k).
 However, for kernel threads current->mm is NULL and hence pgd =
 pgd_offset(init_mm, address) = pgd_ref which means the fault handler
 returns without setting the pgd entry in the MM structure in the context
 of which the kernel thread has faulted.  This could lead to never-ending
 faults and busy looping of kernel threads like pdflush.  So, shouldn't the
 pgd = pgd_offset(current->mm ?: &init_mm, address); be pgd =
 pgd_offset(current->active_mm ?: &init_mm, address);

We can use active_mm unconditionally because it should be always set.
Signed-off-by: NAndi Kleen <ak@linux.intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f313e123

07 1月, 2009 1 次提交

mm: invoke oom-killer from page fault · 1c0fe6e3

由 Nick Piggin 提交于 1月 06, 2009

Rather than have the pagefault handler kill a process directly if it gets
a VM_FAULT_OOM, have it call into the OOM killer.

With increasingly sophisticated oom behaviour (cpusets, memory cgroups,
oom killing throttling, oom priority adjustment or selective disabling,
panic on oom, etc), it's silly to unconditionally kill the faulting
process at page fault time.  Create a hook for pagefault oom path to call
into instead.

Only converted x86 and uml so far.

[akpm@linux-foundation.org: make __out_of_memory() static]
[akpm@linux-foundation.org: fix comment]
Signed-off-by: NNick Piggin <npiggin@suse.de>
Cc: Jeff Dike <jdike@addtoit.com>
Acked-by: NIngo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1c0fe6e3

14 11月, 2008 1 次提交

CRED: Wrap task credential accesses in the x86 arch · 350b4da7

由 David Howells 提交于 11月 14, 2008

Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Reviewed-by: NJames Morris <jmorris@namei.org>
Acked-by: NSerge Hallyn <serue@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: NJames Morris <jmorris@namei.org>

350b4da7

27 10月, 2008 1 次提交

trace: add the MMIO-tracer to the tracer menu, cleanup · fd3fdf11

由 Pekka Paalanen 提交于 10月 24, 2008

Impact: cleanup

We can remove MMIOTRACE_HOOKS and replace it with just MMIOTRACE.
MMIOTRACE_HOOKS is a remnant from the time when I thought that
something else could also use the kmmio facilities.
Signed-off-by: NPekka Paalanen <pq@iki.fi>
Acked-by: NSteven Rostedt <rostedt@goodmis.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

fd3fdf11

22 10月, 2008 1 次提交

x86, dumpstack: let signr=0 signal no do_exit · 874d93d1

由 Alexander van Heukelum 提交于 10月 22, 2008

Change oops_end such that signr=0 signals that do_exit
is not to be called.

Currently, each use of __die is soon followed by a call
to oops_end and 'regs' is set to NULL if oops_end is expected
not to call do_exit. Change all such pairs to set signr=0
instead. On x86_64 oops_end is used 'bare' in die_nmi; use
signr=0 instead of regs=NULL there, too.
Signed-off-by: NAlexander van Heukelum <heukelum@fastmail.fm>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

874d93d1

14 10月, 2008 1 次提交

x86/mm: unify init task OOM handling · 3a1dfe6e

由 Ingo Molnar 提交于 10月 13, 2008

Linus noticed that the "again:" versus "survive:" OOM logic for
the init task was arbitrarily different.

The 64-bit codepath is the better one, because it correctly re-lookups
the vma after having dropped the ->mmap_sem.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>

3a1dfe6e

13 10月, 2008 2 次提交

x86/mm: do not trigger a kernel warning if user-space disables interrupts and... · 891cffbd

由 Linus Torvalds 提交于 10月 12, 2008

x86/mm: do not trigger a kernel warning if user-space disables interrupts and generates a page fault

Arjan reported a spike in the following bug pattern in v2.6.27:

   http://www.kerneloops.org/searchweek.php?search=lock_page

which happens because hwclock started triggering warnings due to
a (correct) might_sleep() check in the MM code.

The warning occurs because hwclock uses this dubious sequence of
code to run "atomic" code:

  static unsigned long
  atomic(const char *name, unsigned long (*op)(unsigned long),
         unsigned long arg)
  {
    unsigned long v;
    __asm__ volatile ("cli");
    v = (*op)(arg);
    __asm__ volatile ("sti");
    return v;
  }

Then it pagefaults in that "atomic" section, triggering the warning.

There is no way the kernel could provide "atomicity" in this path,
a page fault is a cannot-continue machine event so the kernel has to
wait for the page to be filled in.

Even if it was just a minor fault we'd have to take locks and might have
to spend quite a bit of time with interrupts disabled - not nice to irq
latencies in general.

So instead just enable interrupts in the pagefault path unconditionally
if we come from user-space, and handle the fault.

Also, while touching this code, unify some trivial parts of the x86
VM paths at the same time.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Reported-by: NArjan van de Ven <arjan@infradead.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

891cffbd

traps: x86: remove trace_hardirqs_fixup from pagefault handler · 69c89b5b

由 Alexander van Heukelum 提交于 9月 26, 2008

The last use of trace_hardirqs_fixup is unnecessary, because the
trap is taken with interrupt off on i386 as well as x86_64, and
the irq-tracer is notified of this from the assembly code.

trace_hardirqs_fixup and trace_hardirqs_fixup_flags are removed
from include/asm-x86/irqflags.h as they are no longer used.
Signed-off-by: NAlexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

69c89b5b

07 9月, 2008 3 次提交

x86: add periodic corruption check · bb577f98

由 Hugh Dickins 提交于 9月 07, 2008

Perodically check for corruption in low phusical memory.  Don't bother
checking at fault time, since it won't show anything useful.
Signed-off-by: NHugh Dickins <hugh@veritas.com>
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bb577f98

x86: check for and defend against BIOS memory corruption · 5394f80f

由 Jeremy Fitzhardinge 提交于 9月 07, 2008

Some BIOSes have been observed to corrupt memory in the low 64k.  This
change:
 - Reserves all memory which does not have to be in that area, to
   prevent it from being used as general memory by the kernel.  Things
   like the SMP trampoline are still in the memory, however.
 - Clears the reserved memory so we can observe changes to it.
 - Adds a function check_for_bios_corruption() which checks and reports on
   memory becoming unexpectedly non-zero.  Currently it's called in the
   x86 fault handler, and the powermanagement debug output.
Signed-off-by: NJeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

5394f80f

x86: adjust vmalloc_sync_all() for Xen (2nd try) · cc643d46

由 Jan Beulich 提交于 8月 29, 2008

Since the fourth PDPT entry cannot be shared under Xen,
vmalloc_sync_all() must iterate over pmd-s rather than pgd-s here.
Luckily, the code isn't used for native PAE (SHARED_KERNEL_PMD is 1)
and the change is benign to non-PAE.

Also do a little more cleanup in that function.
Signed-off-by: NJan Beulich <jbeulich@novell.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>

cc643d46

23 7月, 2008 1 次提交

x86: mm/fault.c declare do_page_fault before they get used · 70ef5641

由 Jaswinder Singh 提交于 7月 23, 2008

declared do_page_fault() in asm-x86/trap.h for both X86_32 and X86_64

removed do_invalid_op declaration from mm/fault.c as it is already declared in asm-x86/trap.h
Signed-off-by: NJaswinder Singh <jaswinder@infradead.org>

70ef5641

08 7月, 2008 1 次提交

x86: simplify vmalloc_sync_all · 67350a5c

由 Jeremy Fitzhardinge 提交于 6月 25, 2008

vmalloc_sync_all() is only called from register_die_notifier and
alloc_vm_area.  Neither is on any performance-critical paths, so
vmalloc_sync_all() itself is not on any hot paths.

Given that the optimisations in vmalloc_sync_all add a fair amount of
code and complexity, and are fairly hard to evaluate for correctness,
it's better to just remove them to simplify the code rather than worry
about its absolute performance.
Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

67350a5c

03 7月, 2008 1 次提交

x86: remove unnecessary #ifdef CONFIG_X86_32...#else · 95c60b08

由 Gustavo Fernando Padovan 提交于 6月 25, 2008

Remove the #ifdef conditional because this comparison is already done in
user_mode_vm().
Signed-off-by: NGustavo F. Padovan <gustavo@las.ic.unicamp.br>
Cc: akpm@osdl.org
Signed-off-by: NIngo Molnar <mingo@elte.hu>

95c60b08

01 7月, 2008 1 次提交

x86: small unifications of address printing · f294a8ce

由 Vegard Nossum 提交于 7月 01, 2008

'man 3 printf' tells me that %p should be printed as if by %#x, but
this is not true for the kernel, which does not use the '0x' prefix
for the %p conversion specifier.

A small cast to (void *) is also prettier than #ifdef/#else/#endif.
Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

f294a8ce

13 6月, 2008 1 次提交

x86: fix endless page faults in mount_block_root for Linux 2.6 · b29c701d

由 Henry Nestler 提交于 5月 12, 2008

Page faults in kernel address space between PAGE_OFFSET up to
VMALLOC_START should not try to map as vmalloc.

Fix rarely endless page faults inside mount_block_root for root
filesystem at boot time.

All 32bit kernels up to 2.6.25 can fail into this hole.
I can not present this under native linux kernel. I see, that the 64bit
has fixed the problem. I copied the same lines into 32bit part.

Recorded debugs are from coLinux kernel 2.6.22.18 (virtualisation):
http://www.henrynestler.com/colinux/testing/pfn-check-0.7.3/20080410-antinx/bug16-recursive-page-fault-endless.txt
The physicaly memory was trimmed down to 192MB to better catch the bug.
More memory gets the bug more rarely.

Details, how every x86 32bit system can fail:

Start from "mount_block_root",
http://lxr.linux.no/linux/init/do_mounts.c#L297
There the variable "fs_names" got one memory page with 4096 bytes.
Variable "p" walks through the existing file system types. The first
string is no problem.
But, with the second loop in mount_block_root the offset of "p" is not
at beginning of page, the offset is for example +9, if "reiserfs" is the
first in list.
Than calls do_mount_root, and lands in sys_mount.
Remember: Variable "type_page" contains now "fs_type+9" and not contains
a full page.
The sys_mount copies 4096 bytes with function "exact_copy_from_user()":
http://lxr.linux.no/linux/fs/namespace.c#L1540

Mostly exist pages after the buffer "fs_names+4096+9" and the page fault
handler was not called. No problem.

In the case, if the page after "fs_names+4096" is not mapped, the page
fault handler was called from http://lxr.linux.no/linux/fs/namespace.c#L1320

The do_page_fault gots an address 0xc03b4000.
It's kernel address, address >= TASK_SIZE, but not from vmalloc! It's
from "__getname()" alias "kmem_cache_alloc".
The "error_code" is 0. "vmalloc_fault" will be call:
http://lxr.linux.no/linux/arch/i386/mm/fault.c#L332

"vmalloc_fault" tryed to find the physical page for a non existing
virtual memory area. The macro "pte_present" in vmalloc_fault()
got a next page fault for 0xc0000ed0 at:
http://lxr.linux.no/linux/arch/i386/mm/fault.c#L282

No PTE exist for such virtual address. The page fault handler was trying
to sync the physical page for the PTE lockup.

This called vmalloc_fault() again for address 0xc000000, and that also
was not existing. The endless began...

In normal case the cpu would still loop with disabled interrrupts. Under
coLinux this was catched by a stack overflow inside printk debugs.
Signed-off-by: NHenry Nestler <henry.nestler@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

b29c701d

24 5月, 2008 3 次提交

x86: mmiotrace full patch, preview 1 · 0fd0e3da

由 Pekka Paalanen 提交于 5月 12, 2008

kmmio.c handles the list of mmio probes with callbacks, list of traced
pages, and attaching into the page fault handler and die notifier. It
arms, traps and disarms the given pages, this is the core of mmiotrace.

mmio-mod.c is a user interface, hooking into ioremap functions and
registering the mmio probes. It also decodes the required information
from trapped mmio accesses via the pre and post callbacks in each probe.
Currently, hooking into ioremap functions works by redefining the symbols
of the target (binary) kernel module, so that it calls the traced
versions of the functions.

The most notable changes done since the last discussion are:
- kmmio.c is a built-in, not part of the module
- direct call from fault.c to kmmio.c, removing all dynamic hooks
- prepare for unregistering probes at any time
- make kmmio re-initializable and accessible to more than one user
- rewrite kmmio locking to remove all spinlocks from page fault path

Can I abuse call_rcu() like I do in kmmio.c:unregister_kmmio_probe()
or is there a better way?

The function called via call_rcu() itself calls call_rcu() again,
will this work or break? There I need a second grace period for RCU
after the first grace period for page faults.

Mmiotrace itself (mmio-mod.c) is still a module, I am going to attack
that next. At some point I will start looking into how to make mmiotrace
a tracer component of ftrace (thanks for the hint, Ingo). Ftrace should
make the user space part of mmiotracing as simple as
'cat /debug/trace/mmio > dump.txt'.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

0fd0e3da

x86: explicit call to mmiotrace in do_page_fault() · 10c43d2e

由 Pekka Paalanen 提交于 5月 12, 2008

The custom page fault handler list is replaced with a single function
pointer. All related functions and variables are renamed for
mmiotrace.
Signed-off-by: NPekka Paalanen <pq@iki.fi>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: pq@iki.fi
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

10c43d2e

x86: add a list for custom page fault handlers. · 86069782

由 Pekka Paalanen 提交于 5月 12, 2008

Provides kernel modules a way to register custom page fault handlers.
On every page fault this will call a list of registered functions. The
functions may handle the fault and force do_page_fault() to return
immediately.

This functionality is similar to the now removed page fault notifiers.
Custom page fault handlers are used by debugging and reverse engineering
tools. Mmiotrace is one such tool and a patch to add it into the tree
will follow.

The custom page fault handlers are called earlier in do_page_fault()
than the page fault notifiers were.
Signed-off-by: NPekka Paalanen <pq@iki.fi>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

86069782

17 4月, 2008 2 次提交

x86: cleanup - rename VM_MASK to X86_VM_MASK · 6b6891f9

由 gorcunov@gmail.com 提交于 3月 28, 2008

This patch renames VM_MASK to X86_VM_MASK (which
in turn defined as alias to X86_EFLAGS_VM) to better
distinguish from virtual memory flags. We can't just
use X86_EFLAGS_VM instead because it is also used
for conditional compilation
Signed-off-by: NCyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

6b6891f9

x86: check vmlinux limits, 64-bit · b4e0409a

由 Ingo Molnar 提交于 2月 21, 2008

these build-time and link-time checks would have prevented the
vmlinux size regression.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b4e0409a

28 3月, 2008 1 次提交

x86: prefetch fix · 3085354d

由 Ingo Molnar 提交于 3月 27, 2008

Linus noticed a second bug and an uncleanliness:

 - we'd return on any instruction fetch fault

 - we'd use both the value of 16 and the PF_INSTR symbol which are
   the same and make no sense

the cleanup nicely unifies this piece of logic.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

3085354d

27 3月, 2008 1 次提交

x86: fix prefetch workaround · bc713dcf

由 Ingo Molnar 提交于 3月 27, 2008

some early Athlon XP's and Opterons generate bogus faults on prefetch
instructions. The workaround for this regressed over .24 - reinstate it.
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bc713dcf

15 2月, 2008 1 次提交

x86: make dump_pagetable() static · cae30f82

由 Adrian Bunk 提交于 2月 13, 2008

dump_pagetable() can now become static.
Signed-off-by: NAdrian Bunk <bunk@kernel.org>
Acked-by: NArjan van de Ven <arjan@linux.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

cae30f82

07 2月, 2008 2 次提交

x86: fix deadlock, make pgd_lock irq-safe · 58d5d0d8

由 Ingo Molnar 提交于 2月 06, 2008

lockdep just caught this one:

=================================
[ INFO: inconsistent lock state ]
2.6.24 #38
---------------------------------
inconsistent {in-softirq-W} -> {softirq-on-W} usage.
swapper/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (pgd_lock){-+..}, at: [<ffffffff8022a9ea>] mm_init+0x1da/0x250
{in-softirq-W} state was registered at:
  [<ffffffffffffffff>] 0xffffffffffffffff
irq event stamp: 394559
hardirqs last  enabled at (394559): [<ffffffff80267f0a>] get_page_from_freelist+0x30a/0x4c0
hardirqs last disabled at (394558): [<ffffffff80267d25>] get_page_from_freelist+0x125/0x4c0
softirqs last  enabled at (393952): [<ffffffff80232f8e>] __do_softirq+0xce/0xe0
softirqs last disabled at (393945): [<ffffffff8020c57c>] call_softirq+0x1c/0x30

other info that might help us debug this:
no locks held by swapper/1.

stack backtrace:
Pid: 1, comm: swapper Not tainted 2.6.24 #38

Call Trace:
 [<ffffffff8024e1fb>] print_usage_bug+0x18b/0x190
 [<ffffffff8024f55d>] mark_lock+0x53d/0x560
 [<ffffffff8024fffa>] __lock_acquire+0x3ca/0xed0
 [<ffffffff80250ba8>] lock_acquire+0xa8/0xe0
 [<ffffffff8022a9ea>] ? mm_init+0x1da/0x250
 [<ffffffff809bcd10>] _spin_lock+0x30/0x70
 [<ffffffff8022a9ea>] mm_init+0x1da/0x250
 [<ffffffff8022aa99>] mm_alloc+0x39/0x50
 [<ffffffff8028b95a>] bprm_mm_init+0x2a/0x1a0
 [<ffffffff8028d12b>] do_execve+0x7b/0x220
 [<ffffffff80209776>] sys_execve+0x46/0x70
 [<ffffffff8020c214>] kernel_execve+0x64/0xd0
 [<ffffffff8020901e>] ? _stext+0x1e/0x20
 [<ffffffff802090ba>] init_post+0x9a/0xf0
 [<ffffffff809bc5f6>] ? trace_hardirqs_on_thunk+0x35/0x3a
 [<ffffffff8024f75a>] ? trace_hardirqs_on+0xba/0xd0
 [<ffffffff8020c1a8>] ? child_rip+0xa/0x12
 [<ffffffff8020bcbc>] ? restore_args+0x0/0x44
 [<ffffffff8020c19e>] ? child_rip+0x0/0x12

turns out that pgd_lock has been used on 64-bit x86 in an irq-unsafe
way for almost two years, since commit 8c914cb7.
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

58d5d0d8

x86: make spurious fault handler aware of large mappings · d8b57bb7

由 Thomas Gleixner 提交于 2月 06, 2008

In very rare cases, on certain CPUs, we could end up in the spurious
fault handler and ignore a large pud/pmd mapping. The resulting pte
pointer points into the mapped physical space and dereferencing it
will fault recursively.

Make the code aware of large mappings and do the permission check
on the pmd/pud entry, when a large pud/pmd mapping is detected.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d8b57bb7

04 2月, 2008 2 次提交

x86: support gbpages in pagetable dump · b5360222

由 Andi Kleen 提交于 2月 04, 2008

Signed-off-by: NAndi Kleen <ak@suse.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

b5360222

x86: reduce ifdef sections in fault.c · cf89ec92

由 Harvey Harrison 提交于 2月 04, 2008

Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

cf89ec92

02 2月, 2008 1 次提交

x86: fixes for lookup_address args · 93809be8

由 Harvey Harrison 提交于 2月 01, 2008

Signedness mismatches in level argument.
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

93809be8

30 1月, 2008 9 次提交

x86: use the same pgd_list for PAE and 64-bit · e3ed910d

由 Jeremy Fitzhardinge 提交于 1月 30, 2008

Use a standard list threaded through page->lru for maintaining the pgd
list on PAE.  This is the same as 64-bit, and seems saner than using a
non-standard list via page->index.
Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

e3ed910d

x86: shrink some ifdefs in fault.c · fd40d6e3

由 Harvey Harrison 提交于 1月 30, 2008

The change from current to tsk in do_page_fault is safe as
this is set at the very beginning of the function.

Removes a likely() annotation from the 64-bit version, this
could have instead been added to 32-bit.
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

fd40d6e3

x86: ignore spurious faults · 5b727a3b

由 Jeremy Fitzhardinge 提交于 1月 30, 2008

When changing a kernel page from RO->RW, it's OK to leave stale TLB
entries around, since doing a global flush is expensive and they pose
no security problem.  They can, however, generate a spurious fault,
which we should catch and simply return from (which will have the
side-effect of reloading the TLB to the current PTE).

This can occur when running under Xen, because it frequently changes
kernel pages from RW->RO->RW to implement Xen's pagetable semantics.
It could also occur when using CONFIG_DEBUG_PAGEALLOC, since it avoids
doing a global TLB flush after changing page permissions.
Signed-off-by: NJeremy Fitzhardinge <jeremy@xensource.com>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

5b727a3b

x86: remove nx_enabled from fault.c · b406ac61

由 Harvey Harrison 提交于 1月 30, 2008

On !PAE 32-bit, _PAGE_NX will be 0, making is_prefetch always
return early.  The test is sufficient on PAE as __supported_pte_mask
is updated in the same places as nx_enabled in init_32.c which also
takes disable_nx into account.
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

b406ac61

x86: unify fault_32|64.c · c61e211d

由 Harvey Harrison 提交于 1月 30, 2008

Unify includes in moved fault.c.

Modify Makefiles to pick up unified file.
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

c61e211d

x86: unify fault_32|64.c with ifdefs · f8c2ee22

由 Harvey Harrison 提交于 1月 30, 2008

Elimination of these ifdefs can be done in a unified file.
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

f8c2ee22

x86: unify fault_32|64.c by ifdef'd function bodies · 1156e098

由 Harvey Harrison 提交于 1月 30, 2008

It's about time to get on with unifying these files, elimination
of the ugly ifdefs can occur in the unified file.
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

1156e098

x86: unify page fault oops printing · 19f0dda9

由 Harvey Harrison 提交于 1月 30, 2008

This changes the oops dumping format for page faults to
be similar between X86_32 and 64.

This is the first user of printk_address on X86_32.
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

19f0dda9

x86: introduce show_fault_oops helper to fault_32|64.c · b3279c7f

由 Harvey Harrison 提交于 1月 30, 2008

This will help when unifying the oops dumping code on 32/64
bit.  No functional changes.
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

b3279c7f

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功