1. 11 Mar 2013, 1 commit
  2. 09 May 2012, 1 commit
  3. 11 Aug 2011, 1 commit
  4. 05 Aug 2011, 2 commits
  5. 15 Jul 2011, 1 commit
  6. 06 Jun 2011, 3 commits
  7. 24 May 2011, 1 commit
    • x86-64: Clean up vdso/kernel shared variables · 8c49d9a7
      Committed by Andy Lutomirski
      Variables that are shared between the vdso and the kernel are
      currently a bit of a mess.  They are each defined with their own
      magic, they are accessed differently in the kernel, the vsyscall page,
      and the vdso, and one of them (vsyscall_clock) doesn't even really
      exist.
      
      This changes them all to use a common mechanism.  All of them are
      declared in vvar.h with a fixed address (validated by the linker
      script).  In the kernel (as before), they look like ordinary
      read-write variables.  In the vsyscall page and the vdso, they are
      accessed through a new macro VVAR, which gives read-only access.
      
      The vdso is now loaded verbatim into memory without any fixups.  As a
      side bonus, access from the vdso is faster because a level of
      indirection is removed.
      
      While we're at it, pack jiffies and vgetcpu_mode into the same
      cacheline.
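      A minimal sketch of the common mechanism, with names modeled on the
      vvar.h this change introduces (illustrative, not the literal merged
      code):
      
        /* vvar.h sketch: one declaration serves kernel, vsyscall and vdso */
        #define VVAR_ADDRESS (-10*1024*1024 - 4096)
        
        #define DECLARE_VVAR(offset, type, name)                \
                static type const * const vvaraddr_ ## name =   \
                        (void *)(VVAR_ADDRESS + (offset));
        
        /* read-only access from the vsyscall page and the vdso */
        #define VVAR(name) (*vvaraddr_ ## name)
        
        /* kernel side: an ordinary read-write variable, placed in a
         * .vvar_<name> section whose address the linker script validates */
        #define DEFINE_VVAR(type, name)                         \
                type name                                       \
                __attribute__((section(".vvar_" #name), aligned(16)))
        
        DECLARE_VVAR(0, volatile unsigned long, jiffies)
        DECLARE_VVAR(8, int, vgetcpu_mode)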
      Signed-off-by: Andy Lutomirski <luto@mit.edu>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Borislav Petkov <bp@amd64.org>
      Link: http://lkml.kernel.org/r/%3C7357882fbb51fa30491636a7b6528747301b7ee9.1306156808.git.luto%40mit.edu%3E
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      8c49d9a7
  8. 22 May 2011, 1 commit
  9. 25 Mar 2011, 1 commit
    • percpu: Always align percpu output section to PAGE_SIZE · 0415b00d
      Committed by Tejun Heo
      The percpu allocator honors alignment requests up to PAGE_SIZE, and both the
      percpu addresses in the percpu address space and the translated kernel
      addresses should be aligned accordingly.  The calculation of the
      former depends on the alignment of percpu output section in the kernel
      image.
      
      The linker script macros PERCPU_VADDR() and PERCPU() are used to
      define this output section and the latter takes @align parameter.
      Several architectures are using an @align smaller than PAGE_SIZE, breaking
      percpu memory alignment.
      
      This patch removes @align parameter from PERCPU(), renames it to
      PERCPU_SECTION() and makes it always align to PAGE_SIZE.  While at it,
      add PCPU_SETUP_BUG_ON() checks such that alignment problems are
      reliably detected and remove percpu alignment comment recently added
      in workqueue.c as the condition would trigger BUG way before reaching
      there.
      
      For um, this patch raises the alignment of the percpu area.  As the area
      is in .init, there shouldn't be any noticeable difference.
      
      This problem was discovered by David Howells while debugging boot
      failure on mn10300.
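      The resulting macro looks roughly like this (a sketch modeled on
      include/asm-generic/vmlinux.lds.h; it is a cpp macro that expands to
      linker script):
      
        #define PERCPU_SECTION(cacheline)                                \
                . = ALIGN(PAGE_SIZE);                                    \
                .data..percpu : AT(ADDR(.data..percpu) - LOAD_OFFSET) {  \
                        VMLINUX_SYMBOL(__per_cpu_load) = .;              \
                        PERCPU_INPUT(cacheline)                          \
                }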
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Mike Frysinger <vapier@gentoo.org>
      Cc: uclinux-dist-devel@blackfin.uclinux.org
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: user-mode-linux-devel@lists.sourceforge.net
      0415b00d
  10. 09 Mar 2011, 1 commit
    • x86: Separate out entry text section · ea714547
      Committed by Jiri Olsa
      Put x86 entry code into a separate link section: .entry.text.
      
      Separating the entry text section seems to have performance
      benefits, caused by more efficient instruction cache usage.
      
      Running hackbench with perf stat --repeat showed that the change
      compresses the icache footprint. The icache load miss rate went
      down by about 15%:
      
       before patch:
               19417627  L1-icache-load-misses      ( +-   0.147% )
      
       after patch:
               16490788  L1-icache-load-misses      ( +-   0.180% )
      
      The motivation for the patch was to fix a particular kprobes bug
      related to the entry text section; the performance advantage was
      discovered accidentally. (A sketch of the new section follows the
      perf output below.)
      
      Whole perf output follows:
      
       - results for current tip tree:
      
        Performance counter stats for './hackbench/hackbench 10' (500 runs):
      
               19417627  L1-icache-load-misses      ( +-   0.147% )
             2676914223  instructions             #      0.497 IPC     ( +- 0.079% )
             5389516026  cycles                     ( +-   0.144% )
      
            0.206267711  seconds time elapsed   ( +-   0.138% )
      
       - results for current tip tree with the patch applied:
      
        Performance counter stats for './hackbench/hackbench 10' (500 runs):
      
               16490788  L1-icache-load-misses      ( +-   0.180% )
             2717734941  instructions             #      0.502 IPC     ( +- 0.079% )
             5414756975  cycles                     ( +-   0.148% )
      
            0.206747566  seconds time elapsed   ( +-   0.137% )
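      The new section is collected inside .text by a helper along these
      lines (a sketch modeled on the ENTRY_TEXT macro in
      include/asm-generic/vmlinux.lds.h):
      
        #define ENTRY_TEXT                                      \
                ALIGN_FUNCTION();                               \
                VMLINUX_SYMBOL(__entry_text_start) = .;         \
                *(.entry.text)                                  \
                VMLINUX_SYMBOL(__entry_text_end) = .;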
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: masami.hiramatsu.pt@hitachi.com
      Cc: ananth@in.ibm.com
      Cc: davem@davemloft.net
      Cc: 2nddept-manager@sdl.hitachi.co.jp
      LKML-Reference: <20110307181039.GB15197@jolsa.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ea714547
  11. 18 Feb 2011, 1 commit
    • x86, trampoline: Common infrastructure for low memory trampolines · 4822b7fc
      Committed by H. Peter Anvin
      Common infrastructure for low memory trampolines.  This code installs
      the trampolines permanently in low memory very early.  It also permits
      multiple pieces of code to be used for this purpose.
      
      This code also introduces a standard infrastructure for computing
      symbol addresses in the trampoline code.
      
      The only change to the actual SMP trampolines themselves is that the
      64-bit trampoline has been made reusable -- the previous version would
      overwrite the code with a status variable; this moves the status
      variable to a separate location.
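      The symbol-address infrastructure amounts to offsetting a
      kernel-image symbol into the low-memory copy, roughly as follows
      (a sketch; the helper and globals follow the commit but treat the
      details as illustrative):
      
        /* trampoline code is linked into the kernel image, then copied
         * to low memory at x86_trampoline_base very early during boot */
        extern unsigned char x86_trampoline_start[];
        extern unsigned char *x86_trampoline_base;
        
        #define TRAMPOLINE_SYM(x)                               \
                ((void *)(x86_trampoline_base +                 \
                          ((const char *)(x) - x86_trampoline_start)))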
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      LKML-Reference: <4D5DFBE4.7090104@intel.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Matthieu Castet <castet.matthieu@free.fr>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      4822b7fc
  12. 10 Feb 2011, 1 commit
  13. 25 Jan 2011, 1 commit
    • percpu: align percpu readmostly subsection to cacheline · 19df0c2f
      Committed by Tejun Heo
      Currently the percpu readmostly subsection may share cachelines with other
      percpu subsections which may result in unnecessary cacheline bounce
      and performance degradation.
      
      This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR()
      linker macros, makes each arch linker scripts specify its cacheline
      size and use it to align percpu subsections.
      
      This is based on Shaohua's x86 only patch.
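      Inside the percpu output section the new parameter is used roughly
      like this (an illustrative excerpt, not the literal merged macro):
      
        /* keep readmostly percpu data on its own cachelines */
        . = ALIGN(cacheline);
        *(.data..percpu..readmostly)
        . = ALIGN(cacheline);
        *(.data..percpu)
        *(.data..percpu..shared_aligned)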
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Shaohua Li <shaohua.li@intel.com>
      19df0c2f
  14. 19 Jan 2011, 1 commit
  15. 18 Jan 2011, 1 commit
    • x86: Make relocatable kernel work with new binutils · 86b1e8dd
      Committed by Shaohua Li
      The CONFIG_RELOCATABLE=y option is broken with new binutils, which
      causes a boot panic.
      
      According to Lu Hongjiu, the affected binutils versions range from
      2.20.51.0.12 to 2.21.51.0.3, released since Oct 22, 2010. At least
      Ubuntu 10.10 ships such binutils. See:
      
          http://sourceware.org/bugzilla/show_bug.cgi?id=12327
      
      The reason for the boot panic is that we have 'jiffies = jiffies_64;' in
      vmlinux.lds.S. The jiffies symbol isn't in any section, so during the
      kernel build there is a warning saying that jiffies is an absolute
      address and can't be relocated. At runtime, jiffies ends up with
      virtual address 0.
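      One way to express the fix (a sketch, not the literal diff) is to
      define the alias inside the .data output section, so the linker emits
      a section-relative, relocatable symbol:
      
        .data : AT(ADDR(.data) - LOAD_OFFSET) {
                DATA_DATA
                CONSTRUCTORS
                /* inside a section: relative, hence relocatable */
                jiffies = jiffies_64;
        }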
      
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      Cc: Lu Hongjiu <hongjiu.lu@intel.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      LKML-Reference: <1295312269.1949.725.camel@sli10-conroe>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      86b1e8dd
  16. 18 Nov 2010, 1 commit
    • x86: Add NX protection for kernel data · 5bd5a452
      Committed by Matthieu Castet
      This patch expands functionality of CONFIG_DEBUG_RODATA to set main
      (static) kernel data area as NX.
      
      The following steps are taken to achieve this:
      
       1. The linker script is adjusted so .text always starts and ends on a page boundary
       2. The linker script is adjusted so .rodata always starts and ends on a page boundary
       3. NX is set for all pages from _etext through _end in mark_rodata_ro
          (sketched after the page-table dumps below)
       4. free_init_pages() sets released memory NX in arch/x86/mm/init.c
       5. The BIOS ROM is set executable (x) when pcibios is used.
      
      The results of patch application may be observed in the diff of kernel page
      table dumps:
      
      pcibios:
      
       --- data_nx_pt_before.txt       2009-10-13 07:48:59.000000000 -0400
       +++ data_nx_pt_after.txt        2009-10-13 07:26:46.000000000 -0400
        0x00000000-0xc0000000           3G                           pmd
        ---[ Kernel Mapping ]---
       -0xc0000000-0xc0100000           1M     RW             GLB x  pte
       +0xc0000000-0xc00a0000         640K     RW             GLB NX pte
       +0xc00a0000-0xc0100000         384K     RW             GLB x  pte
       -0xc0100000-0xc03d7000        2908K     ro             GLB x  pte
       +0xc0100000-0xc0318000        2144K     ro             GLB x  pte
       +0xc0318000-0xc03d7000         764K     ro             GLB NX pte
       -0xc03d7000-0xc0600000        2212K     RW             GLB x  pte
       +0xc03d7000-0xc0600000        2212K     RW             GLB NX pte
        0xc0600000-0xf7a00000         884M     RW         PSE GLB NX pmd
        0xf7a00000-0xf7bfe000        2040K     RW             GLB NX pte
        0xf7bfe000-0xf7c00000           8K                           pte
      
      No pcibios:
      
       --- data_nx_pt_before.txt       2009-10-13 07:48:59.000000000 -0400
       +++ data_nx_pt_after.txt        2009-10-13 07:26:46.000000000 -0400
        0x00000000-0xc0000000           3G                           pmd
        ---[ Kernel Mapping ]---
       -0xc0000000-0xc0100000           1M     RW             GLB x  pte
       +0xc0000000-0xc0100000           1M     RW             GLB NX pte
       -0xc0100000-0xc03d7000        2908K     ro             GLB x  pte
       +0xc0100000-0xc0318000        2144K     ro             GLB x  pte
       +0xc0318000-0xc03d7000         764K     ro             GLB NX pte
       -0xc03d7000-0xc0600000        2212K     RW             GLB x  pte
       +0xc03d7000-0xc0600000        2212K     RW             GLB NX pte
        0xc0600000-0xf7a00000         884M     RW         PSE GLB NX pmd
        0xf7a00000-0xf7bfe000        2040K     RW             GLB NX pte
        0xf7bfe000-0xf7c00000           8K                           pte
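      Step 3 reduces to something like this (a sketch of mark_nxdata_nx();
      it assumes the set_pages_nx() helper and may differ in detail from
      the merged code):
      
        static void mark_nxdata_nx(void)
        {
                /* everything from _etext through _end becomes NX */
                unsigned long start = PFN_ALIGN(_etext);
                unsigned long size  = PFN_ALIGN(_end) - start;
        
                if (__supported_pte_mask & _PAGE_NX)
                        set_pages_nx(virt_to_page(start),
                                     size >> PAGE_SHIFT);
        }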
      
      The patch has been originally developed for Linux 2.6.34-rc2 x86 by
      Siarhei Liakh <sliakh.lkml@gmail.com> and Xuxian Jiang <jiang@cs.ncsu.edu>.
      
       -v1:  initial patch for 2.6.30
       -v2:  patch for 2.6.31-rc7
       -v3:  moved all code into arch/x86, adjusted credits
       -v4:  fixed ifdef, removed credits from CREDITS
       -v5:  fixed an address calculation bug in mark_nxdata_nx()
       -v6:  added acked-by and PT dump diff to commit log
       -v7:  minor adjustments for -tip
       -v8:  rework with the merge of "Set first MB as RW+NX"
      Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com>
      Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu>
      Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: James Morris <jmorris@namei.org>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4CE2F82E.60601@free.fr>
      [ minor cleanliness edits ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      5bd5a452
  17. 07 Sep 2010, 1 commit
  18. 31 Aug 2010, 1 commit
    • x86, iommu: Fix IOMMU_INIT alignment rules · 7ac41ccf
      Committed by Konrad Rzeszutek Wilk
      This boot crash was observed:
      
       DMA-API: preallocated 32768 debug entries
       DMA-API: debugging enabled by kernel config
       BUG: unable to handle kernel paging request at 19da8955
       IP: [<f4ffffff>] 0xf4ffffff
       *pde = 00000000
      
      The crux of the failure was that even if we did not use the
      .iommu_table section at all, the linker would still insert it
      in the vmlinux file. This patch fixes that and also fixes the
      runtime crash where we would try to access the array.
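      The linker-script side of the fix amounts to pinning the section's
      alignment so the boundary symbols and the entry stride stay valid
      even when no entries are emitted (a sketch; details illustrative):
      
        . = ALIGN(8);
        .iommu_table : AT(ADDR(.iommu_table) - LOAD_OFFSET) {
                __iommu_table = .;
                *(.iommu_table)
                __iommu_table_end = .;
        }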
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      LKML-Reference: <1283191802-25086-1-git-send-email-konrad.wilk@oracle.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7ac41ccf
  19. 28 Aug 2010, 1 commit
  20. 27 Aug 2010, 1 commit
    • x86, iommu: Add IOMMU_INIT macros, .iommu_table section, and iommu_table_entry structure · 0444ad93
      Committed by Konrad Rzeszutek Wilk
      This patch set adds a mechanism to "modularize" the IOMMUs we have
      on X86. Currently there are up to six IOMMUs, and they have a complex
      relationship that requires a careful execution order. 'pci_iommu_alloc'
      does that today, but most folks are unhappy with how it does it.
      This patch set addresses this and also paves a mechanism to jettison
      unused IOMMUs during run-time. For details that sparked this, please
      refer to: http://lkml.org/lkml/2010/8/2/282
      
      The first solution that comes to mind is to convert wholesale
      the IOMMU detection routines to be called during the initcall
      time frame. Unfortunately that misses the dependency relationship
      that some of the IOMMUs have (for example: for AMD-Vi IOMMU to work,
      GART detection MUST run first, and before all of that SWIOTLB MUST run).
      
      The second solution would be to introduce a registration call wherein
      the IOMMU would provide its detection/init routines and as well on what
      MUST run before it. That would work, except that 'pci_iommu_alloc',
      which would run through this list, is called during mem_init. This means we
      don't have any memory allocator, and it is so early that we haven't yet
      started running through the initcall_t list.
      
      This solution borrows concepts from the 2nd idea and from how
      MODULE_INIT works. A macro is provided that each IOMMU uses to define
      its detect function and early_init (run before the memory allocator is
      active), as well as which other IOMMU MUST run before it.  Since most
      IOMMUs depend on having SWIOTLB run first ("pci_swiotlb_detect"), a
      convenience macro that declares that dependency is also provided.
      
      This macro is similar in design to the MODULE_PARAM macro: we set up
      an .iommu_table section and populate it with entries matching
      struct iommu_table_entry. During bootup we sort the array so that the
      IOMMUs that MUST run before us come first, and then we simply iterate
      through it, calling the detection routine and, if appropriate, the
      init routines (both are sketched below).
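      A sketch of the structure and one of the registration macros
      (modeled on arch/x86/include/asm/iommu_table.h; names and field
      order are illustrative):
      
        struct iommu_table_entry {
                initcall_t  detect;       /* non-zero if IOMMU found */
                initcall_t  depend;       /* detect that MUST run first */
                void        (*early_init)(void); /* pre-allocator setup */
                void        (*late_init)(void);  /* after core mm is up */
                int         flags;
        };
        
        /* emit an entry into .iommu_table; most users depend on SWIOTLB */
        #define IOMMU_INIT_POST(_detect)                                \
                static const struct iommu_table_entry                   \
                        __iommu_entry_##_detect __used                  \
                        __attribute__((section(".iommu_table"),         \
                                       aligned(8))) =                   \
                        { .detect = _detect,                            \
                          .depend = pci_swiotlb_detect }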
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      LKML-Reference: <1282845485-8991-2-git-send-email-konrad.wilk@oracle.com>
      CC: H. Peter Anvin <hpa@zytor.com>
      CC: Fujita Tomonori <fujita.tomonori@lab.ntt.co.jp>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      0444ad93
  21. 30 Mar 2010, 1 commit
    • x86: Make smp_locks end with page alignment · 596b711e
      Committed by Yinghai Lu
      Fix:
      
       ------------[ cut here ]------------
       WARNING: at arch/x86/mm/init.c:342 free_init_pages+0x4c/0xfa()
       free_init_pages: range [0x40daf000, 0x40db5c24] is not aligned
       Modules linked in:
       Pid: 0, comm: swapper Not tainted
       2.6.34-rc2-tip-03946-g4f16b23-dirty #50 Call Trace:
        [<40232e9f>] warn_slowpath_common+0x65/0x7c
        [<4021c9f0>] ? free_init_pages+0x4c/0xfa
        [<40881434>] ? _etext+0x0/0x24
        [<40232eea>] warn_slowpath_fmt+0x24/0x27
        [<4021c9f0>] free_init_pages+0x4c/0xfa
        [<40881434>] ? _etext+0x0/0x24
        [<40d3f4bd>] alternative_instructions+0xf6/0x100
        [<40d3fe4f>] check_bugs+0xbd/0xbf
        [<40d398a7>] start_kernel+0x2d5/0x2e4
        [<40d390ce>] i386_start_kernel+0xce/0xd5
       ---[ end trace 4eaa2a86a8e2da22 ]---
      
      Comments in vmlinux.lds.S already said:
      
       |        /*
       |         * smp_locks might be freed after init
       |         * start/end must be page aligned
       |         */
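      The fix pads the end of the section as well, so both boundaries are
      page aligned (a sketch of the resulting output section):
      
        . = ALIGN(PAGE_SIZE);
        .smp_locks : AT(ADDR(.smp_locks) - LOAD_OFFSET) {
                __smp_locks = .;
                *(.smp_locks)
                . = ALIGN(PAGE_SIZE);
                __smp_locks_end = .;
        }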
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <1269830604-26214-2-git-send-email-yinghai@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      596b711e
  22. 03 Mar 2010, 2 commits
  23. 15 Dec 2009, 1 commit
    • x86: Regex support and known-movable symbols for relocs, fix _end · 873b5271
      Committed by H. Peter Anvin
      This adds a new category of symbols to the relocs program: symbols
      which are known to be relative, even though the linker emits them as
      absolute; this is the case for symbols that live in the linker script,
      which currently applies to _end.
      
      Unfortunately the previous workaround of putting _end in its own empty
      section was defeated by newer binutils, which remove empty sections
      completely.
      
      This patch also changes the symbol matching to use regular expressions
      instead of hardcoded C for specific patterns.
      
      This is a decidedly non-minimal patch: a modified version of the
      relocs program is used as part of the Syslinux build, and this is
      basically a backport to Linux of some of those changes; they have
      thus been well tested.
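      The symbol matching now goes through POSIX regular expressions,
      roughly as follows (a sketch; the real tables in relocs.c are
      larger):
      
        #include <regex.h>
        
        /* linker-script-defined symbols are emitted as absolute by the
         * linker but must still be relocated; match them by name */
        static regex_t rel_sym;
        
        static void init_sym_regex(void)
        {
                regcomp(&rel_sym, "^(__init_end|_end)$",
                        REG_EXTENDED | REG_NOSUB);
        }
        
        static int sym_is_relative(const char *symname)
        {
                return regexec(&rel_sym, symname, 0, NULL, 0) == 0;
        }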
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4AF86211.3070103@zytor.com>
      Acked-by: Michal Marek <mmarek@suse.cz>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      873b5271
  24. 19 Nov 2009, 1 commit
    • x86: Eliminate redundant/contradicting cache line size config options · 350f8f56
      Committed by Jan Beulich
      Rather than having both X86_L1_CACHE_BYTES and X86_L1_CACHE_SHIFT
      (with inconsistent defaults), just the latter suffices, as the
      former can easily be calculated from it.
      
      To be consistent, also change X86_INTERNODE_CACHE_BYTES to
      X86_INTERNODE_CACHE_SHIFT, and set it to 7 (128 bytes) for NUMA
      to account for last level cache line size (which here matters
      more than L1 cache line size).
      
      Finally, make sure the default value for X86_L1_CACHE_SHIFT,
      when X86_GENERIC is selected, is being seen before that for the
      individual CPU model options (other than on x86-64, where
      GENERIC_CPU is part of the choice construct, X86_GENERIC is a
      separate option on ix86).
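      With only the shift configured, the byte counts are simply derived,
      in the style of arch/x86/include/asm/cache.h:
      
        #define L1_CACHE_SHIFT          (CONFIG_X86_L1_CACHE_SHIFT)
        #define L1_CACHE_BYTES          (1 << L1_CACHE_SHIFT)
        
        #define INTERNODE_CACHE_SHIFT   CONFIG_X86_INTERNODE_CACHE_SHIFT
        #define INTERNODE_CACHE_BYTES   (1 << INTERNODE_CACHE_SHIFT)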
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Acked-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Acked-by: Nick Piggin <npiggin@suse.de>
      LKML-Reference: <4AFD5710020000780001F8F0@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      350f8f56
  25. 29 Oct 2009, 1 commit
  26. 20 Oct 2009, 2 commits
  27. 16 Oct 2009, 1 commit
    • x86: Document linker script ASSERT() quirk · a5912f6b
      Committed by Ingo Molnar
      Older binutils breaks if ASSERT() is used without a sink
      for the output.
      
      For example 2.14.90.0.6 is known to be broken, the link
      fails with:
      
        LD      .tmp_vmlinux1
        ld:arch/x86/kernel/vmlinux.lds:678: parse error
      
      Document this quirk in all three files that use it.
      
        See:    http://marc.info/?l=linux-kbuild&m=124930110427870&w=2
        See[2]: d2ba8b21 ("x86: Fix assert syntax in vmlinux.lds.S")
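      The workaround is to give the assertion a sink by assigning its
      result to the location counter, e.g.:
      
        /* parse error on old binutils: */
        ASSERT((_end - _text <= KERNEL_IMAGE_SIZE),
               "kernel image bigger than KERNEL_IMAGE_SIZE")
        
        /* accepted everywhere: assign the result to the location counter */
        . = ASSERT((_end - _text <= KERNEL_IMAGE_SIZE),
                   "kernel image bigger than KERNEL_IMAGE_SIZE");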
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      LKML-Reference: <4AD6523D.5030909@zytor.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a5912f6b
  28. 15 Oct 2009, 2 commits
  29. 21 Sep 2009, 1 commit
  30. 19 Sep 2009, 4 commits
  31. 25 Aug 2009, 1 commit
    • x86: Fix build with older binutils and consolidate linker script · c62e4320
      Committed by Jan Beulich
      Binutils prior to 2.17 can't deal with the currently possible
      situation of a new segment following the per-CPU segment, but
      that new segment being empty - objcopy misplaces the .bss (and
      perhaps also the .brk) sections outside of any segment.
      
      However, the current ordering of sections really just appears
      to be the effect of cumulative unrelated changes; re-ordering
      things makes it easy to guarantee that the segment following
      the per-CPU one is non-empty, and at once eliminates the need
      for the bogus data.init2 segment.
      
      Once touching this code, also use the various data section
      helper macros from include/asm-generic/vmlinux.lds.h.
      
      -v2: fix !SMP builds.
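      After the re-ordering, the data section reads roughly like this,
      built from the generic helpers (a sketch; the exact macro set is
      illustrative):
      
        .data : AT(ADDR(.data) - LOAD_OFFSET) {
                _sdata = .;
                INIT_TASK_DATA(THREAD_SIZE)
                PAGE_ALIGNED_DATA(PAGE_SIZE)
                CACHELINE_ALIGNED_DATA(L1_CACHE_BYTES)
                READ_MOSTLY_DATA(L1_CACHE_BYTES)
                DATA_DATA
                CONSTRUCTORS
                _edata = .;
        }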
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: <sam@ravnborg.org>
      LKML-Reference: <4A94085D02000078000119A5@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c62e4320