1. 11 Mar 2013, 1 commit
  2. 09 May 2012, 1 commit
  3. 11 Aug 2011, 1 commit
  4. 05 Aug 2011, 2 commits
  5. 15 Jul 2011, 1 commit
  6. 06 Jun 2011, 3 commits
  7. 24 May 2011, 1 commit
    • x86-64: Clean up vdso/kernel shared variables · 8c49d9a7
      Committed by Andy Lutomirski
      Variables that are shared between the vdso and the kernel are
      currently a bit of a mess.  They are each defined with their own
      magic, they are accessed differently in the kernel, the vsyscall page,
      and the vdso, and one of them (vsyscall_clock) doesn't even really
      exist.
      
      This changes them all to use a common mechanism.  All of them are
      declared in vvar.h with a fixed address (validated by the linker
      script).  In the kernel (as before), they look like ordinary
      read-write variables.  In the vsyscall page and the vdso, they are
      accessed through a new macro VVAR, which gives read-only access.
      
      The vdso is now loaded verbatim into memory without any fixups.  As a
      side bonus, access from the vdso is faster because a level of
      indirection is removed.
      
      While we're at it, pack jiffies and vgetcpu_mode into the same
      cacheline.
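      A minimal sketch of the common mechanism, with names modeled on the
      vvar.h this change introduces (illustrative, not the literal merged
      code):
      
        /* vvar.h sketch: one declaration serves kernel, vsyscall and vdso */
        #define VVAR_ADDRESS (-10*1024*1024 - 4096)
        
        #define DECLARE_VVAR(offset, type, name)                \
                static type const * const vvaraddr_ ## name =   \
                        (void *)(VVAR_ADDRESS + (offset));
        
        /* read-only access from the vsyscall page and the vdso */
        #define VVAR(name) (*vvaraddr_ ## name)
        
        /* kernel side: an ordinary read-write variable, placed in a
         * .vvar_<name> section whose address the linker script validates */
        #define DEFINE_VVAR(type, name)                         \
                type name                                       \
                __attribute__((section(".vvar_" #name), aligned(16)))
        
        DECLARE_VVAR(0, volatile unsigned long, jiffies)
        DECLARE_VVAR(8, int, vgetcpu_mode)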
      Signed-off-by: Andy Lutomirski <luto@mit.edu>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Borislav Petkov <bp@amd64.org>
      Link: http://lkml.kernel.org/r/%3C7357882fbb51fa30491636a7b6528747301b7ee9.1306156808.git.luto%40mit.edu%3E
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      8c49d9a7
  8. 22 May 2011, 1 commit
  9. 25 Mar 2011, 1 commit
    • percpu: Always align percpu output section to PAGE_SIZE · 0415b00d
      Committed by Tejun Heo
      The percpu allocator honors alignment requests up to PAGE_SIZE, and both the
      percpu addresses in the percpu address space and the translated kernel
      addresses should be aligned accordingly.  The calculation of the
      former depends on the alignment of percpu output section in the kernel
      image.
      
      The linker script macros PERCPU_VADDR() and PERCPU() are used to
      define this output section and the latter takes @align parameter.
      Several architectures are using an @align smaller than PAGE_SIZE, breaking
      percpu memory alignment.
      
      This patch removes @align parameter from PERCPU(), renames it to
      PERCPU_SECTION() and makes it always align to PAGE_SIZE.  While at it,
      add PCPU_SETUP_BUG_ON() checks such that alignment problems are
      reliably detected and remove percpu alignment comment recently added
      in workqueue.c as the condition would trigger BUG way before reaching
      there.
      
      For um, this patch raises the alignment of the percpu area.  As the area
      is in .init, there shouldn't be any noticeable difference.
      
      This problem was discovered by David Howells while debugging boot
      failure on mn10300.
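      The resulting macro looks roughly like this (a sketch modeled on
      include/asm-generic/vmlinux.lds.h; it is a cpp macro that expands to
      linker script):
      
        #define PERCPU_SECTION(cacheline)                                \
                . = ALIGN(PAGE_SIZE);                                    \
                .data..percpu : AT(ADDR(.data..percpu) - LOAD_OFFSET) {  \
                        VMLINUX_SYMBOL(__per_cpu_load) = .;              \
                        PERCPU_INPUT(cacheline)                          \
                }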
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Mike Frysinger <vapier@gentoo.org>
      Cc: uclinux-dist-devel@blackfin.uclinux.org
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: user-mode-linux-devel@lists.sourceforge.net
      0415b00d
  10. 09 Mar 2011, 1 commit
    • x86: Separate out entry text section · ea714547
      Committed by Jiri Olsa
      Put x86 entry code into a separate link section: .entry.text.
      
      Separating the entry text section seems to have performance
      benefits, caused by more efficient instruction cache usage.
      
      Running hackbench with perf stat --repeat showed that the change
      compresses the icache footprint. The icache load miss rate went
      down by about 15%:
      
       before patch:
               19417627  L1-icache-load-misses      ( +-   0.147% )
      
       after patch:
               16490788  L1-icache-load-misses      ( +-   0.180% )
      
      The motivation for the patch was to fix a particular kprobes bug
      related to the entry text section; the performance advantage was
      discovered accidentally. (A sketch of the new section follows the
      perf output below.)
      
      Whole perf output follows:
      
       - results for current tip tree:
      
        Performance counter stats for './hackbench/hackbench 10' (500 runs):
      
               19417627  L1-icache-load-misses      ( +-   0.147% )
             2676914223  instructions             #      0.497 IPC     ( +- 0.079% )
             5389516026  cycles                     ( +-   0.144% )
      
            0.206267711  seconds time elapsed   ( +-   0.138% )
      
       - results for current tip tree with the patch applied:
      
        Performance counter stats for './hackbench/hackbench 10' (500 runs):
      
               16490788  L1-icache-load-misses      ( +-   0.180% )
             2717734941  instructions             #      0.502 IPC     ( +- 0.079% )
             5414756975  cycles                     ( +-   0.148% )
      
            0.206747566  seconds time elapsed   ( +-   0.137% )
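      The new section is collected inside .text by a helper along these
      lines (a sketch modeled on the ENTRY_TEXT macro in
      include/asm-generic/vmlinux.lds.h):
      
        #define ENTRY_TEXT                                      \
                ALIGN_FUNCTION();                               \
                VMLINUX_SYMBOL(__entry_text_start) = .;         \
                *(.entry.text)                                  \
                VMLINUX_SYMBOL(__entry_text_end) = .;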
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: masami.hiramatsu.pt@hitachi.com
      Cc: ananth@in.ibm.com
      Cc: davem@davemloft.net
      Cc: 2nddept-manager@sdl.hitachi.co.jp
      LKML-Reference: <20110307181039.GB15197@jolsa.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ea714547
  11. 18 Feb 2011, 1 commit
    • x86, trampoline: Common infrastructure for low memory trampolines · 4822b7fc
      Committed by H. Peter Anvin
      Common infrastructure for low memory trampolines.  This code installs
      the trampolines permanently in low memory very early.  It also permits
      multiple pieces of code to be used for this purpose.
      
      This code also introduces a standard infrastructure for computing
      symbol addresses in the trampoline code.
      
      The only change to the actual SMP trampolines themselves is that the
      64-bit trampoline has been made reusable -- the previous version would
      overwrite the code with a status variable; this moves the status
      variable to a separate location.
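      The symbol-address infrastructure amounts to offsetting a
      kernel-image symbol into the low-memory copy, roughly as follows
      (a sketch; the helper and globals follow the commit but treat the
      details as illustrative):
      
        /* trampoline code is linked into the kernel image, then copied
         * to low memory at x86_trampoline_base very early during boot */
        extern unsigned char x86_trampoline_start[];
        extern unsigned char *x86_trampoline_base;
        
        #define TRAMPOLINE_SYM(x)                               \
                ((void *)(x86_trampoline_base +                 \
                          ((const char *)(x) - x86_trampoline_start)))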
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      LKML-Reference: <4D5DFBE4.7090104@intel.com>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Matthieu Castet <castet.matthieu@free.fr>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      4822b7fc
  12. 10 Feb 2011, 1 commit
  13. 25 Jan 2011, 1 commit
    • percpu: align percpu readmostly subsection to cacheline · 19df0c2f
      Committed by Tejun Heo
      Currently the percpu readmostly subsection may share cachelines with other
      percpu subsections which may result in unnecessary cacheline bounce
      and performance degradation.
      
      This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR()
      linker macros, makes each arch linker scripts specify its cacheline
      size and use it to align percpu subsections.
      
      This is based on Shaohua's x86 only patch.
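      Inside the percpu output section the new parameter is used roughly
      like this (an illustrative excerpt, not the literal merged macro):
      
        /* keep readmostly percpu data on its own cachelines */
        . = ALIGN(cacheline);
        *(.data..percpu..readmostly)
        . = ALIGN(cacheline);
        *(.data..percpu)
        *(.data..percpu..shared_aligned)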
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Shaohua Li <shaohua.li@intel.com>
      19df0c2f
  14. 19 Jan 2011, 1 commit
  15. 18 Jan 2011, 1 commit
    • x86: Make relocatable kernel work with new binutils · 86b1e8dd
      Committed by Shaohua Li
      The CONFIG_RELOCATABLE=y option is broken with new binutils, which
      causes a boot panic.
      
      According to Lu Hongjiu, the affected binutils versions range from
      2.20.51.0.12 to 2.21.51.0.3, released since Oct 22, 2010. At least
      Ubuntu 10.10 ships such binutils. See:
      
          http://sourceware.org/bugzilla/show_bug.cgi?id=12327
      
      The reason for the boot panic is that we have 'jiffies = jiffies_64;' in
      vmlinux.lds.S. The jiffies symbol isn't in any section, so during the
      kernel build there is a warning saying that jiffies is an absolute
      address and can't be relocated. At runtime, jiffies ends up with
      virtual address 0.
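      One way to express the fix (a sketch, not the literal diff) is to
      define the alias inside the .data output section, so the linker emits
      a section-relative, relocatable symbol:
      
        .data : AT(ADDR(.data) - LOAD_OFFSET) {
                DATA_DATA
                CONSTRUCTORS
                /* inside a section: relative, hence relocatable */
                jiffies = jiffies_64;
        }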
      
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      Cc: Lu Hongjiu <hongjiu.lu@intel.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      LKML-Reference: <1295312269.1949.725.camel@sli10-conroe>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      86b1e8dd
  16. 18 Nov 2010, 1 commit
    • x86: Add NX protection for kernel data · 5bd5a452
      Committed by Matthieu Castet
      This patch expands functionality of CONFIG_DEBUG_RODATA to set main
      (static) kernel data area as NX.
      
      The following steps are taken to achieve this:
      
       1. The linker script is adjusted so .text always starts and ends on a page boundary
       2. The linker script is adjusted so .rodata always starts and ends on a page boundary
       3. NX is set for all pages from _etext through _end in mark_rodata_ro
          (sketched after the page-table dumps below)
       4. free_init_pages() sets released memory NX in arch/x86/mm/init.c
       5. The BIOS ROM is set executable (x) when pcibios is used.
      
      The results of patch application may be observed in the diff of kernel page
      table dumps:
      
      pcibios:
      
       --- data_nx_pt_before.txt       2009-10-13 07:48:59.000000000 -0400
       +++ data_nx_pt_after.txt        2009-10-13 07:26:46.000000000 -0400
        0x00000000-0xc0000000           3G                           pmd
        ---[ Kernel Mapping ]---
       -0xc0000000-0xc0100000           1M     RW             GLB x  pte
       +0xc0000000-0xc00a0000         640K     RW             GLB NX pte
       +0xc00a0000-0xc0100000         384K     RW             GLB x  pte
       -0xc0100000-0xc03d7000        2908K     ro             GLB x  pte
       +0xc0100000-0xc0318000        2144K     ro             GLB x  pte
       +0xc0318000-0xc03d7000         764K     ro             GLB NX pte
       -0xc03d7000-0xc0600000        2212K     RW             GLB x  pte
       +0xc03d7000-0xc0600000        2212K     RW             GLB NX pte
        0xc0600000-0xf7a00000         884M     RW         PSE GLB NX pmd
        0xf7a00000-0xf7bfe000        2040K     RW             GLB NX pte
        0xf7bfe000-0xf7c00000           8K                           pte
      
      No pcibios:
      
       --- data_nx_pt_before.txt       2009-10-13 07:48:59.000000000 -0400
       +++ data_nx_pt_after.txt        2009-10-13 07:26:46.000000000 -0400
        0x00000000-0xc0000000           3G                           pmd
        ---[ Kernel Mapping ]---
       -0xc0000000-0xc0100000           1M     RW             GLB x  pte
       +0xc0000000-0xc0100000           1M     RW             GLB NX pte
       -0xc0100000-0xc03d7000        2908K     ro             GLB x  pte
       +0xc0100000-0xc0318000        2144K     ro             GLB x  pte
       +0xc0318000-0xc03d7000         764K     ro             GLB NX pte
       -0xc03d7000-0xc0600000        2212K     RW             GLB x  pte
       +0xc03d7000-0xc0600000        2212K     RW             GLB NX pte
        0xc0600000-0xf7a00000         884M     RW         PSE GLB NX pmd
        0xf7a00000-0xf7bfe000        2040K     RW             GLB NX pte
        0xf7bfe000-0xf7c00000           8K                           pte
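      Step 3 reduces to something like this (a sketch of mark_nxdata_nx();
      it assumes the set_pages_nx() helper and may differ in detail from
      the merged code):
      
        static void mark_nxdata_nx(void)
        {
                /* everything from _etext through _end becomes NX */
                unsigned long start = PFN_ALIGN(_etext);
                unsigned long size  = PFN_ALIGN(_end) - start;
        
                if (__supported_pte_mask & _PAGE_NX)
                        set_pages_nx(virt_to_page(start),
                                     size >> PAGE_SHIFT);
        }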
      
      The patch has been originally developed for Linux 2.6.34-rc2 x86 by
      Siarhei Liakh <sliakh.lkml@gmail.com> and Xuxian Jiang <jiang@cs.ncsu.edu>.
      
       -v1:  initial patch for 2.6.30
       -v2:  patch for 2.6.31-rc7
       -v3:  moved all code into arch/x86, adjusted credits
       -v4:  fixed ifdef, removed credits from CREDITS
       -v5:  fixed an address calculation bug in mark_nxdata_nx()
       -v6:  added acked-by and PT dump diff to commit log
       -v7:  minor adjustments for -tip
       -v8:  rework with the merge of "Set first MB as RW+NX"
      Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com>
      Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu>
      Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: James Morris <jmorris@namei.org>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Kees Cook <kees.cook@canonical.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4CE2F82E.60601@free.fr>
      [ minor cleanliness edits ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      5bd5a452
  17. 07 Sep 2010, 1 commit
  18. 31 Aug 2010, 1 commit
    • x86, iommu: Fix IOMMU_INIT alignment rules · 7ac41ccf
      Committed by Konrad Rzeszutek Wilk
      This boot crash was observed:
      
       DMA-API: preallocated 32768 debug entries
       DMA-API: debugging enabled by kernel config
       BUG: unable to handle kernel paging request at 19da8955
       IP: [<f4ffffff>] 0xf4ffffff
       *pde = 00000000
      
      The crux of the failure was that even if we did not use the
      .iommu_table section at all, the linker would still insert it
      in the vmlinux file. This patch fixes that and also fixes the
      runtime crash where we would try to access the array.
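      The linker-script side of the fix amounts to pinning the section's
      alignment so the boundary symbols and the entry stride stay valid
      even when no entries are emitted (a sketch; details illustrative):
      
        . = ALIGN(8);
        .iommu_table : AT(ADDR(.iommu_table) - LOAD_OFFSET) {
                __iommu_table = .;
                *(.iommu_table)
                __iommu_table_end = .;
        }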
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Joerg Roedel <joerg.roedel@amd.com>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      LKML-Reference: <1283191802-25086-1-git-send-email-konrad.wilk@oracle.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7ac41ccf
  19. 28 Aug 2010, 1 commit
  20. 27 Aug 2010, 1 commit
    • x86, iommu: Add IOMMU_INIT macros, .iommu_table section, and iommu_table_entry structure · 0444ad93
      Committed by Konrad Rzeszutek Wilk
      This patch set adds a mechanism to "modularize" the IOMMUs we have
      on X86. Currently there are up to six IOMMUs, and they have a complex
      relationship that requires a careful execution order. 'pci_iommu_alloc'
      does that today, but most folks are unhappy with how it does it.
      This patch set addresses this and also paves a mechanism to jettison
      unused IOMMUs during run-time. For details that sparked this, please
      refer to: http://lkml.org/lkml/2010/8/2/282
      
      The first solution that comes to mind is to convert wholesale
      the IOMMU detection routines to be called during the initcall
      time frame. Unfortunately that misses the dependency relationship
      that some of the IOMMUs have (for example: for AMD-Vi IOMMU to work,
      GART detection MUST run first, and before all of that SWIOTLB MUST run).
      
      The second solution would be to introduce a registration call wherein
      the IOMMU would provide its detection/init routines and as well on what
      MUST run before it. That would work, except that 'pci_iommu_alloc',
      which would run through this list, is called during mem_init. This means we
      don't have any memory allocator, and it is so early that we haven't yet
      started running through the initcall_t list.
      
      This solution borrows concepts from the 2nd idea and from how
      MODULE_INIT works. A macro is provided that each IOMMU uses to define
      its detect function and early_init (run before the memory allocator is
      active), as well as which other IOMMU MUST run before it.  Since most
      IOMMUs depend on having SWIOTLB run first ("pci_swiotlb_detect"), a
      convenience macro that declares that dependency is also provided.
      
      This macro is similar in design to the MODULE_PARAM macro: we set up
      an .iommu_table section and populate it with entries matching
      struct iommu_table_entry. During bootup we sort the array so that the
      IOMMUs that MUST run before us come first, and then we simply iterate
      through it, calling the detection routine and, if appropriate, the
      init routines (both are sketched below).
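      A sketch of the structure and one of the registration macros
      (modeled on arch/x86/include/asm/iommu_table.h; names and field
      order are illustrative):
      
        struct iommu_table_entry {
                initcall_t  detect;       /* non-zero if IOMMU found */
                initcall_t  depend;       /* detect that MUST run first */
                void        (*early_init)(void); /* pre-allocator setup */
                void        (*late_init)(void);  /* after core mm is up */
                int         flags;
        };
        
        /* emit an entry into .iommu_table; most users depend on SWIOTLB */
        #define IOMMU_INIT_POST(_detect)                                \
                static const struct iommu_table_entry                   \
                        __iommu_entry_##_detect __used                  \
                        __attribute__((section(".iommu_table"),         \
                                       aligned(8))) =                   \
                        { .detect = _detect,                            \
                          .depend = pci_swiotlb_detect }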
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      LKML-Reference: <1282845485-8991-2-git-send-email-konrad.wilk@oracle.com>
      CC: H. Peter Anvin <hpa@zytor.com>
      CC: Fujita Tomonori <fujita.tomonori@lab.ntt.co.jp>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      0444ad93
  21. 30 Mar 2010, 1 commit
    • x86: Make smp_locks end with page alignment · 596b711e
      Committed by Yinghai Lu
      Fix:
      
       ------------[ cut here ]------------
       WARNING: at arch/x86/mm/init.c:342 free_init_pages+0x4c/0xfa()
       free_init_pages: range [0x40daf000, 0x40db5c24] is not aligned
       Modules linked in:
       Pid: 0, comm: swapper Not tainted
       2.6.34-rc2-tip-03946-g4f16b23-dirty #50 Call Trace:
        [<40232e9f>] warn_slowpath_common+0x65/0x7c
        [<4021c9f0>] ? free_init_pages+0x4c/0xfa
        [<40881434>] ? _etext+0x0/0x24
        [<40232eea>] warn_slowpath_fmt+0x24/0x27
        [<4021c9f0>] free_init_pages+0x4c/0xfa
        [<40881434>] ? _etext+0x0/0x24
        [<40d3f4bd>] alternative_instructions+0xf6/0x100
        [<40d3fe4f>] check_bugs+0xbd/0xbf
        [<40d398a7>] start_kernel+0x2d5/0x2e4
        [<40d390ce>] i386_start_kernel+0xce/0xd5
       ---[ end trace 4eaa2a86a8e2da22 ]---
      
      Comments in vmlinux.lds.S already said:
      
       |        /*
       |         * smp_locks might be freed after init
       |         * start/end must be page aligned
       |         */
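      The fix pads the end of the section as well, so both boundaries are
      page aligned (a sketch of the resulting output section):
      
        . = ALIGN(PAGE_SIZE);
        .smp_locks : AT(ADDR(.smp_locks) - LOAD_OFFSET) {
                __smp_locks = .;
                *(.smp_locks)
                . = ALIGN(PAGE_SIZE);
                __smp_locks_end = .;
        }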
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <1269830604-26214-2-git-send-email-yinghai@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      596b711e
  22. 03 Mar 2010, 2 commits
  23. 15 Dec 2009, 1 commit
    • x86: Regex support and known-movable symbols for relocs, fix _end · 873b5271
      Committed by H. Peter Anvin
      This adds a new category of symbols to the relocs program: symbols
      which are known to be relative, even though the linker emits them as
      absolute; this is the case for symbols that live in the linker script,
      which currently applies to _end.
      
      Unfortunately the previous workaround of putting _end in its own empty
      section was defeated by newer binutils, which remove empty sections
      completely.
      
      This patch also changes the symbol matching to use regular expressions
      instead of hardcoded C for specific patterns.
      
      This is a decidedly non-minimal patch: a modified version of the
      relocs program is used as part of the Syslinux build, and this is
      basically a backport to Linux of some of those changes; they have
      thus been well tested.
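      The symbol matching now goes through POSIX regular expressions,
      roughly as follows (a sketch; the real tables in relocs.c are
      larger):
      
        #include <regex.h>
        
        /* linker-script-defined symbols are emitted as absolute by the
         * linker but must still be relocated; match them by name */
        static regex_t rel_sym;
        
        static void init_sym_regex(void)
        {
                regcomp(&rel_sym, "^(__init_end|_end)$",
                        REG_EXTENDED | REG_NOSUB);
        }
        
        static int sym_is_relative(const char *symname)
        {
                return regexec(&rel_sym, symname, 0, NULL, 0) == 0;
        }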
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4AF86211.3070103@zytor.com>
      Acked-by: Michal Marek <mmarek@suse.cz>
      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      873b5271
  24. 19 Nov 2009, 1 commit
    • x86: Eliminate redundant/contradicting cache line size config options · 350f8f56
      Committed by Jan Beulich
      Rather than having both X86_L1_CACHE_BYTES and X86_L1_CACHE_SHIFT
      (with inconsistent defaults), just the latter suffices, as the
      former can easily be calculated from it.
      
      To be consistent, also change X86_INTERNODE_CACHE_BYTES to
      X86_INTERNODE_CACHE_SHIFT, and set it to 7 (128 bytes) for NUMA
      to account for last level cache line size (which here matters
      more than L1 cache line size).
      
      Finally, make sure the default value for X86_L1_CACHE_SHIFT,
      when X86_GENERIC is selected, is being seen before that for the
      individual CPU model options (other than on x86-64, where
      GENERIC_CPU is part of the choice construct, X86_GENERIC is a
      separate option on ix86).
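      With only the shift configured, the byte counts are simply derived,
      in the style of arch/x86/include/asm/cache.h:
      
        #define L1_CACHE_SHIFT          (CONFIG_X86_L1_CACHE_SHIFT)
        #define L1_CACHE_BYTES          (1 << L1_CACHE_SHIFT)
        
        #define INTERNODE_CACHE_SHIFT   CONFIG_X86_INTERNODE_CACHE_SHIFT
        #define INTERNODE_CACHE_BYTES   (1 << INTERNODE_CACHE_SHIFT)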
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Acked-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Acked-by: Nick Piggin <npiggin@suse.de>
      LKML-Reference: <4AFD5710020000780001F8F0@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      350f8f56
  25. 29 Oct 2009, 1 commit
  26. 20 Oct 2009, 2 commits
  27. 16 Oct 2009, 1 commit
    • x86: Document linker script ASSERT() quirk · a5912f6b
      Committed by Ingo Molnar
      Older binutils breaks if ASSERT() is used without a sink
      for the output.
      
      For example 2.14.90.0.6 is known to be broken, the link
      fails with:
      
        LD      .tmp_vmlinux1
        ld:arch/x86/kernel/vmlinux.lds:678: parse error
      
      Document this quirk in all three files that use it.
      
        See:    http://marc.info/?l=linux-kbuild&m=124930110427870&w=2
        See[2]: d2ba8b21 ("x86: Fix assert syntax in vmlinux.lds.S")
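      The workaround is to give the assertion a sink by assigning its
      result to the location counter, e.g.:
      
        /* parse error on old binutils: */
        ASSERT((_end - _text <= KERNEL_IMAGE_SIZE),
               "kernel image bigger than KERNEL_IMAGE_SIZE")
        
        /* accepted everywhere: assign the result to the location counter */
        . = ASSERT((_end - _text <= KERNEL_IMAGE_SIZE),
                   "kernel image bigger than KERNEL_IMAGE_SIZE");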
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      LKML-Reference: <4AD6523D.5030909@zytor.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a5912f6b
  28. 15 Oct 2009, 2 commits
  29. 21 Sep 2009, 1 commit
  30. 19 Sep 2009, 4 commits
  31. 25 Aug 2009, 1 commit
    • x86: Fix build with older binutils and consolidate linker script · c62e4320
      Committed by Jan Beulich
      Binutils prior to 2.17 can't deal with the currently possible
      situation of a new segment following the per-CPU segment, but
      that new segment being empty - objcopy misplaces the .bss (and
      perhaps also the .brk) sections outside of any segment.
      
      However, the current ordering of sections really just appears
      to be the effect of cumulative unrelated changes; re-ordering
      things makes it easy to guarantee that the segment following
      the per-CPU one is non-empty, and at once eliminates the need
      for the bogus data.init2 segment.
      
      Once touching this code, also use the various data section
      helper macros from include/asm-generic/vmlinux.lds.h.
      
      -v2: fix !SMP builds.
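      After the re-ordering, the data section reads roughly like this,
      built from the generic helpers (a sketch; the exact macro set is
      illustrative):
      
        .data : AT(ADDR(.data) - LOAD_OFFSET) {
                _sdata = .;
                INIT_TASK_DATA(THREAD_SIZE)
                PAGE_ALIGNED_DATA(PAGE_SIZE)
                CACHELINE_ALIGNED_DATA(L1_CACHE_BYTES)
                READ_MOSTLY_DATA(L1_CACHE_BYTES)
                DATA_DATA
                CONSTRUCTORS
                _edata = .;
        }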
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: <sam@ravnborg.org>
      LKML-Reference: <4A94085D02000078000119A5@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c62e4320