1. 11 December 2015, 1 commit
  2. 07 October 2015, 1 commit
  3. 06 July 2015, 1 commit
  4. 04 June 2015, 1 commit
    •
      x86/asm/entry, x86/vdso: Move the vDSO code to arch/x86/entry/vdso/ · d603c8e1
      Ingo Molnar authored
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d603c8e1
  5. 21 December 2014, 1 commit
    •
      x86_64, vdso: Fix the vdso address randomization algorithm · 394f56fe
      Andy Lutomirski authored
      The theory behind vdso randomization is that it's mapped at a random
      offset above the top of the stack.  To avoid wasting a page of
      memory for an extra page table, the vdso isn't supposed to extend
      past the lowest PMD into which it can fit.  Other than that, the
      address should be a uniformly distributed address that meets all of
      the alignment requirements.
      
      The current algorithm is buggy: the vdso has about a 50% probability
      of being at the very end of a PMD.  The current algorithm also has a
      decent chance of failing outright due to incorrect handling of the
      case where the top of the stack is near the top of its PMD.
      
      This fixes the implementation.  The paxtest estimate of vdso
      "randomisation" improves from 11 bits to 18 bits.  (Disclaimer: I
      don't know what the paxtest code is actually calculating.)
      
      It's worth noting that this algorithm is inherently biased: the vdso
      is more likely to end up near the end of its PMD than near the
      beginning.  Ideally we would either nix the PMD sharing requirement
      or jointly randomize the vdso and the stack to reduce the bias.
      
      In the meantime, this is a considerable improvement with basically
      no risk of compatibility issues, since the allowed outputs of the
      algorithm are unchanged.
      
      As an easy test, doing this:
      
      for i in `seq 10000`; do
        grep -P vdso /proc/self/maps | cut -d- -f1
      done | sort | uniq -d
      
      used to produce lots of output (1445 lines on my most recent run).
      A tiny subset looks like this:
      
      7fffdfffe000
      7fffe01fe000
      7fffe05fe000
      7fffe07fe000
      7fffe09fe000
      7fffe0bfe000
      7fffe0dfe000
      
      Note the suspicious fe000 endings.  With the fix, I get a much more
      palatable 76 repeated addresses.
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      394f56fe
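      The fixed algorithm can be pictured as follows. This is a minimal
      userspace sketch of the approach described above, with assumed names;
      the real code is vdso_addr() in arch/x86/vdso/vma.c:

        /* Sketch: pick a page-aligned slot uniformly at random above the
         * stack top, such that the whole vdso stays below the next PMD
         * boundary (so no extra page table is needed). */
        #include <stdlib.h>

        #define PAGE_SHIFT 12
        #define PMD_SIZE   (1UL << 21)   /* 2 MiB on x86-64 */

        static unsigned long vdso_addr_sketch(unsigned long start,
                                              unsigned long len)
        {
                /* First PMD boundary at or above start. */
                unsigned long end = (start + PMD_SIZE - 1) & ~(PMD_SIZE - 1);

                end -= len;
                if (end <= start)
                        return start;            /* no room to randomize */

                /* Uniform choice over every page-aligned slot in
                 * [start, end]; each slot, not just the last one in the
                 * PMD, is equally likely. */
                unsigned long slots = ((end - start) >> PAGE_SHIFT) + 1;
                return start + ((unsigned long)(rand() % slots) << PAGE_SHIFT);
        }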
  6. 02 11月, 2014 1 次提交
    •
      x86: vdso: Fix build with older gcc · a92f101b
      Andrew Morton authored
      gcc-4.4.4:
      
      arch/x86/vdso/vma.c: In function 'vgetcpu_cpu_init':
      arch/x86/vdso/vma.c:247: error: unknown field 'limit0' specified in initializer
      arch/x86/vdso/vma.c:247: warning: missing braces around initializer
      arch/x86/vdso/vma.c:247: warning: (near initialization for '(anonymous).<anonymous>')
      arch/x86/vdso/vma.c:248: error: unknown field 'limit' specified in initializer
      arch/x86/vdso/vma.c:248: warning: excess elements in struct initializer
      arch/x86/vdso/vma.c:248: warning: (near initialization for '(anonymous)')
      ....
      
      I couldn't find any way of tricking it into accepting an initializer
      format :(
      Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Fixes: 25880156 ("x86/vdso: Change the PER_CPU segment to use struct desc_struct")
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      a92f101b
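      The underlying limitation, for reference: gcc releases before roughly
      4.6 reject designated initializers that name fields of an anonymous
      struct or union member. A self-contained illustration (not the
      kernel's actual desc_struct):

        /* Older gcc: "unknown field 'limit0' specified in initializer". */
        struct desc {
                union {
                        struct {
                                unsigned limit0 : 16;
                                unsigned base0  : 16;
                        };
                        unsigned raw;
                };
        };

        /* Rejected by gcc 4.4:
         *   struct desc d = { .limit0 = 0xffff };
         * Portable workaround: zero the union, then assign the field. */
        static struct desc make_desc(void)
        {
                struct desc d;
                d.raw = 0;
                d.limit0 = 0xffff;
                return d;
        }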
  7. 28 October 2014, 5 commits
  8. 26 July 2014, 1 commit
  9. 12 July 2014, 1 commit
  10. 11 July 2014, 1 commit
  11. 21 May 2014, 2 commits
  12. 06 May 2014, 3 commits
  13. 21 March 2014, 2 commits
  14. 19 March 2014, 1 commit
  15. 12 December 2012, 1 commit
  16. 24 March 2012, 1 commit
    •
      coredump: remove VM_ALWAYSDUMP flag · 909af768
      Jason Baron authored
      The motivation for this patchset was that I was looking for a way for
      a qemu-kvm process to exclude the guest memory, which can be quite
      large, from its core dump.  There are already a number of filter
      flags in /proc/<pid>/coredump_filter; however, these allow one to
      specify 'types' of kernel memory, not specific address ranges (which
      is what is needed in this case).
      
      Since there are no more vma flags available, the first patch eliminates
      the need for the 'VM_ALWAYSDUMP' flag.  The flag is used internally by
      the kernel to mark vdso and vsyscall pages.  However, it is simple
      enough to check if a vma covers a vdso or vsyscall page without the need
      for this flag.
      
      The second patch then replaces the 'VM_ALWAYSDUMP' flag with a new
      'VM_NODUMP' flag, which can be set by userspace using the new madvise
      flag 'MADV_DONTDUMP' and unset via 'MADV_DODUMP'.  The core dump
      filters continue to work the same as before unless 'MADV_DONTDUMP' is
      set on the region.
      
      The qemu code which implements this feature is at:
      
        http://people.redhat.com/~jbaron/qemu-dump/qemu-dump.patch
      
      In my testing the qemu core dump shrank from 383MB to 13MB with this
      patch.
      
      I also believe that the 'MADV_DONTDUMP' flag might be useful for
      security-sensitive apps, which might want to select which areas are
      dumped.
      
      This patch:
      
      The VM_ALWAYSDUMP flag is currently used by the coredump code to
      indicate that a vma is part of a vsyscall or vdso section.  However,
      we can determine whether a vma is in one of these sections by checking
      it against the gate_vma and checking for a non-NULL return value from
      arch_vma_name(), thus freeing a valuable vma bit.
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Acked-by: Roland McGrath <roland@hack.frob.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Avi Kivity <avi@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      909af768
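      From userspace the new flags are used like this (illustrative
      program; MADV_DONTDUMP and MADV_DODUMP are defined in <sys/mman.h>
      from Linux 3.4 on):

        /* Exclude a large guest-RAM-style region from core dumps, then
         * re-include it. */
        #include <stdio.h>
        #include <sys/mman.h>

        int main(void)
        {
                size_t len = 64UL << 20;   /* 64 MiB stand-in for guest RAM */
                void *guest = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
                if (guest == MAP_FAILED)
                        return 1;

                if (madvise(guest, len, MADV_DONTDUMP))  /* skip in dumps */
                        perror("madvise(MADV_DONTDUMP)");

                /* ... a crash here dumps everything except 'guest' ... */

                if (madvise(guest, len, MADV_DODUMP))    /* include again */
                        perror("madvise(MADV_DODUMP)");

                munmap(guest, len);
                return 0;
        }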
  17. 22 2月, 2012 1 次提交
  18. 21 2月, 2012 1 次提交
    •
      x32: Add x32 VDSO support · 1a21d4e0
      H. J. Lu authored
      Add support for the x32 VDSO.  The x32 VDSO takes advantage of the
      similarity between the x86-64 and x32 ABIs to contain the same
      content; only the container differs, as the x32 VDSO is naturally an
      x32 shared object.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      1a21d4e0
  19. 06 August 2011, 1 commit
    •
      x86, amd: Avoid cache aliasing penalties on AMD family 15h · dfb09f9b
      Borislav Petkov authored
      This patch provides performance tuning for the "Bulldozer" CPU. With its
      shared instruction cache there is a chance of generating an excessive
      number of cache cross-invalidates when running specific workloads on the
      cores of a compute module.
      
      This excessive amount of cross-invalidations can be observed if cache
      lines backed by shared physical memory alias in bits [14:12] of their
      virtual addresses, as those bits are used for the index generation.
      
      This patch addresses the issue by clearing all the bits in the
      [14:12] slice of the file mapping's virtual address at generation
      time, thus forcing those bits to be the same for all mappings of a
      single shared library across processes and, in doing so, avoiding
      instruction cache aliases.
      
      It also adds the command line option "align_va_addr=(32|64|on|off)" with
      which virtual address alignment can be enabled for 32-bit or 64-bit x86
      individually, or both, or be completely disabled.
      
      This change leaves virtual region address allocation on other families
      and/or vendors unaffected.
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Link: http://lkml.kernel.org/r/1312550110-24160-2-git-send-email-bp@amd64.org
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      dfb09f9b
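      The alignment itself is a one-liner; a hedged sketch of the idea
      (the real logic hooks the x86 get_unmapped_area() paths and is gated
      on family 15h and the align_va_addr= option):

        /* Force bits [14:12] of a candidate mapping address to zero so
         * every mapping of a shared library gets the same I-cache index
         * bits. */
        #define VA_ALIGN_MASK (0x7UL << 12)   /* bits 14..12 */

        static unsigned long align_va(unsigned long addr)
        {
                return addr & ~VA_ALIGN_MASK;
        }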
  20. 22 July 2011, 1 commit
  21. 14 July 2011, 1 commit
  22. 24 May 2011, 1 commit
    •
      x86-64: Clean up vdso/kernel shared variables · 8c49d9a7
      Andy Lutomirski authored
      Variables that are shared between the vdso and the kernel are
      currently a bit of a mess.  They are each defined with their own
      magic; they are accessed differently in the kernel, the vsyscall
      page, and the vdso; and one of them (vsyscall_clock) doesn't even
      really exist.
      
      This changes them all to use a common mechanism.  All of them are
      declared in vvar.h with a fixed address (validated by the linker
      script).  In the kernel (as before), they look like ordinary
      read-write variables.  In the vsyscall page and the vdso, they are
      accessed through a new macro VVAR, which gives read-only access.
      
      The vdso is now loaded verbatim into memory without any fixups.  As a
      side bonus, access from the vdso is faster because a level of
      indirection is removed.
      
      While we're at it, pack jiffies and vgetcpu_mode into the same
      cacheline.
      Signed-off-by: Andy Lutomirski <luto@mit.edu>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Borislav Petkov <bp@amd64.org>
      Link: http://lkml.kernel.org/r/%3C7357882fbb51fa30491636a7b6528747301b7ee9.1306156808.git.luto%40mit.edu%3E
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      8c49d9a7
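      The mechanism can be pictured like this. A simplified sketch of the
      vvar.h pattern with assumed names and an illustrative address; the
      real macro and layout live in arch/x86/include/asm/vvar.h:

        #define VVAR_BASE 0xffffffffff600800UL   /* illustrative address */

        #ifdef BUILDING_VDSO
        /* vdso/vsyscall view: read-only access at a fixed address. */
        #define VVAR(offset, type) \
                (*(const volatile type *)(VVAR_BASE + (offset)))
        #define vvar_jiffies VVAR(0, unsigned long)
        #else
        /* Kernel view: an ordinary read-write variable, pinned to
         * VVAR_BASE + 0 by the linker script. */
        extern unsigned long jiffies;
        #endif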
  23. 03 8月, 2010 1 次提交
  24. 19 6月, 2010 1 次提交
    •
      x86-64, mm: Initialize VDSO earlier on 64 bits · d7a0380d
      Jiri Slaby authored
      When an initrd is in use and a driver does request_module() in its
      module_init (i.e. __initcall or device_initcall), a modprobe process
      is created with a VDSO mapping.  But the VDSO itself is initialized
      from an __initcall, i.e. at the same level (at the same time), so it
      may not be initialized yet (link order matters).
      
      Move the VDSO initialization code earlier by switching to something
      before rootfs_initcall, where the initrd is loaded as rootfs:
      specifically, to subsys_initcall.  Do this for the standard 64-bit
      path (init_vdso_vars) and for compat (sysenter_setup), just in case
      people have a 32-bit initrd and ia32 emulation built in.
      
      i386 (pure 32-bit) is not affected, since sysenter_setup() is called
      from check_bugs()->identify_boot_cpu() in start_kernel() before
      rest_init()->kernel_thread(kernel_init) where even kernel_init() calls
      do_basic_setup()->do_initcalls().
      
      What this patch fixes are early modprobe crashes such as:
      Unpacking initramfs...
      Freeing initrd memory: 9324k freed
      modprobe[368]: segfault at 7fff4429c020 ip 00007fef397e160c \
          sp 00007fff442795c0 error 4 in ld-2.11.2.so[7fef397df000+1f000]
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      LKML-Reference: <1276720242-13365-1-git-send-email-jslaby@suse.cz>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      d7a0380d
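      In initcall terms the fix amounts to the following (kernel-code
      sketch; the function body is elided):

        #include <linux/init.h>

        /* Initcall levels run strictly in order: core, postcore, arch,
         * subsys, fs, rootfs, device, late.  Registering at subsys level
         * guarantees the vdso is ready before rootfs_initcall loads the
         * initrd, hence before any early modprobe can be exec'd. */
        static int __init init_vdso_vars(void)
        {
                /* ... map the vdso pages and set up its variables ... */
                return 0;
        }
        subsys_initcall(init_vdso_vars);   /* was device-level __initcall */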
  25. 30 3月, 2010 1 次提交
    •
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      Tejun Heo authored
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      The percpu.h -> slab.h dependency is about to be removed.  Prepare
      for this change by updating users of gfp and slab facilities to
      include those headers directly instead of assuming availability.  As
      this conversion needs to touch a large number of source files, the
      following script is used as the basis of the conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following:
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there, i.e. if only gfp is used,
        gfp.h; if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to place the new include so that its order
        conforms to its surroundings.  It's put in the include block which
        contains core kernel includes, in the same order that the rest are
        ordered: alphabetical, Christmas tree, reverse Christmas tree, or
        at the end if there doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have a fitting include block), it prints
        out an error message indicating which .h file needs to be added to
        the file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion;
         some needed manual addition; and for others, adding it to an
         implementation .h or embedding .c file was more appropriate.  This
         step added inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from the tests in
      step 6, I'm fairly confident about the coverage of this conversion
      patch.  If there is a breakage, it's likely to be something in one of
      the arch headers, which should be easily discoverable on most builds
      of the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
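      The end state for a typical converted file is simply explicit
      includes (illustrative fragment):

        /* A file that calls kmalloc()/kfree() now includes slab.h itself,
         * and gfp.h for the GFP_* flags, instead of inheriting both
         * implicitly through percpu.h. */
        #include <linux/gfp.h>    /* GFP_KERNEL and friends */
        #include <linux/slab.h>   /* kmalloc(), kfree() */

        static void *make_buf(void)
        {
                return kmalloc(128, GFP_KERNEL);
        }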
  26. 05 June 2009, 1 commit
  27. 13 April 2009, 1 commit
  28. 21 February 2009, 1 commit
    •
      x86, mm: rename TASK_SIZE64 => TASK_SIZE_MAX · d9517346
      Ingo Molnar authored
      Impact: cleanup
      
      Rename TASK_SIZE64 to TASK_SIZE_MAX, and provide the define on
      32-bit too (mapped to TASK_SIZE).
      
      This allows 32-bit code to make use of the (former) TASK_SIZE64
      symbol as well, in a clean way.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d9517346
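      In effect (sketch of the described mapping; the 64-bit value shown
      is the historical TASK_SIZE64 definition):

        #ifdef CONFIG_X86_64
        #define TASK_SIZE_MAX ((1UL << 47) - PAGE_SIZE)  /* was TASK_SIZE64 */
        #else
        #define TASK_SIZE_MAX TASK_SIZE                  /* new on 32-bit */
        #endif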
  29. 25 December 2008, 1 commit
    •
      [S390] arch_setup_additional_pages arguments · fc5243d9
      Martin Schwidefsky authored
      arch_setup_additional_pages currently gets two arguments: the binary
      format description and an indication of whether the process uses an
      executable stack or not.  The second argument is not used by anybody;
      it could be removed without replacement.
      
      What actually does make sense is to pass an indication of whether the
      process uses the ELF interpreter or not.  The glibc code will not use
      anything from the vdso if the process does not use the dynamic
      linker, so for statically linked binaries the architecture backend
      can choose not to map the vdso.
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      fc5243d9
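      The resulting interface change, as a sketch:

        struct linux_binprm;

        /* Before: the second argument indicated an executable stack and
         * was ignored by every architecture:
         *
         *   int arch_setup_additional_pages(struct linux_binprm *bprm,
         *                                   int executable_stack);
         *
         * After: it says whether the ELF interpreter is in use, so an
         * arch backend can skip mapping the vdso for static binaries. */
        int arch_setup_additional_pages(struct linux_binprm *bprm,
                                        int uses_interp);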
  30. 19 July 2008, 1 commit
  31. 25 May 2008, 1 commit
  32. 30 January 2008, 1 commit