1. 26 Sep, 2006 6 commits
    • R
      [PATCH] i386: Allow a kernel not to be in ring 0 · 78be3706
      Rusty Russell committed
      We allow for the fact that the guest kernel may not run in ring 0.  This
      requires some abstraction in a few places when setting %cs or checking
      privilege level (user vs kernel).
      
      This is Chris' [RFC PATCH 15/33] move segment checks to subarch, except rather
      than using #define USER_MODE_MASK which depends on a config option, we use
      Zach's more flexible approach of assuming ring 3 == userspace.  I also used
      "get_kernel_rpl()" over "get_kernel_cs()" because I think it reads better in
      the code...
      
      1) Remove the hardcoded 3 and introduce #define SEGMENT_RPL_MASK 3
      2) Add a get_kernel_rpl() macro, and don't assume it's zero.
      
      And:
      
      Clean up of patch for letting kernel run other than ring 0:
      
      a. Add some comments about the SEGMENT_IS_*_CODE() macros.
      b. Add a USER_RPL macro.  (Code was comparing a value to a mask
         in some places and to the magic number 3 in other places.)
      c. Add macros for table indicator field and use them.
      d. Change the entry.S tests for LDT stack segment to use the macros
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      78be3706
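The privilege checks described in this commit can be sketched in C as follows. This is a minimal illustration, not the kernel's code: the macro names SEGMENT_RPL_MASK, USER_RPL and get_kernel_rpl() come from the commit message, while SEGMENT_TI_MASK, the helper functions, and the return value of get_kernel_rpl() are assumptions made for the example. The selector layout (bits 0-1 privilege level, bit 2 table indicator, bits 3-15 index) is the standard x86 one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Names follow the commit message; SEGMENT_TI_MASK and the helpers
 * below are illustrative assumptions, not the kernel's definitions. */
#define SEGMENT_RPL_MASK 0x3  /* low two selector bits: requested privilege level */
#define SEGMENT_TI_MASK  0x4  /* table indicator bit: 0 = GDT, 1 = LDT */
#define USER_RPL         0x3  /* ring 3 is assumed to be userspace */

/* A native kernel runs in ring 0; a guest kernel might return 1 here. */
static inline unsigned get_kernel_rpl(void)
{
    return 0;
}

/* True when the selector's privilege level is ring 3 (userspace). */
static inline bool selector_is_user(uint16_t sel)
{
    return (sel & SEGMENT_RPL_MASK) == USER_RPL;
}

/* True when the selector refers to the LDT rather than the GDT. */
static inline bool selector_in_ldt(uint16_t sel)
{
    return (sel & SEGMENT_TI_MASK) != 0;
}

/* Build a GDT selector for the kernel without assuming ring 0. */
static inline uint16_t make_kernel_selector(uint16_t index)
{
    assert(index < 8192);  /* the index field is 13 bits wide */
    return (uint16_t)((index << 3) | get_kernel_rpl());
}
```

Comparing against USER_RPL rather than testing for "not ring 0" is what lets the same check work when the kernel itself runs in ring 1.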
    • R
      [PATCH] i386: Abstract sensitive instructions · 0da5db31
      Rusty Russell committed
      Abstract sensitive instructions in assembler code, replacing them with macros
      (which currently are #defined to the native versions).  We use long names:
      assembler is case-insensitive, so if something goes wrong and macros do not
      expand, it would assemble anyway.
      
      Resulting object files are exactly the same as before.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Andi Kleen <ak@suse.de>
      0da5db31
    • C
      [PATCH] i386: annotate FIX_STACK() and the rest of nmi() · a549b86d
      Chuck Ebbert committed
      In i386's entry.S, FIX_STACK() needs annotation because it
      replaces the stack pointer.  And the rest of nmi() needs
      annotation in order to compile with these new annotations.
      Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      a549b86d
    • F
      [PATCH] i386: Disallow kprobes on NMI handlers · 06039754
      Fernando Luis Vázquez Cao committed
      A kprobe executes IRET early and that could cause NMI recursion and stack
      corruption.
      
      Note: This problem was originally spotted and solved by Andi Kleen in the
      x86_64 architecture. This patch is an adaptation of his patch for i386.
      
      AK: Merged with current code which was a bit different.
      AK: Removed printk in nmi handler that shouldn't be there in the first place
      AK: Added missing include.
      AK: added KPROBES_END
      Signed-off-by: Fernando Vazquez <fernando@intellilink.co.jp>
      Signed-off-by: Andi Kleen <ak@suse.de>
      06039754
    • A
      [PATCH] i386: move kernel_thread_helper into entry.S · 02ba1a32
      Andi Kleen committed
      And add proper CFI annotation to it which was previously
      impossible. This prevents "stuck" messages by the dwarf2 unwinder
      when reaching the top of a kernel stack.
      
      Includes feedback from Jan Beulich
      
      Cc: jbeulich@novell.com
      Signed-off-by: Andi Kleen <ak@suse.de>
      02ba1a32
    • P
      [PATCH] x86: error_code is not safe for kprobes · d28c4393
      Prasanna S.P committed
      This patch moves the entry.S:error_entry to .kprobes.text section,
      since code marked unsafe for kprobes jumps directly to entry.S::error_entry,
      that must be marked unsafe as well.
      This patch also moves all the ".previous.text" asm directives to ".previous"
      for kprobes section.
      
      AK: Following a similar i386 patch from Chuck Ebbert
      AK: Also merged Jeremy's fix in.
      
      +From: Jeremy Fitzhardinge <jeremy@goop.org>
      
      KPROBE_ENTRY does a .section .kprobes.text, and expects its users to
      do a .previous at the end of the function.
      
      Unfortunately, if any code within the function switches sections, for
      example .fixup, then the .previous ends up putting all subsequent code
      into .fixup.  Worse, any subsequent .fixup code gets intermingled with
      the code it's supposed to be fixing (which is also in .fixup).  It's
      surprising this didn't cause more havoc.
      
      The fix is to use .pushsection/.popsection, so this stuff nests
      properly.  A further cleanup would be to get rid of all
      .section/.previous pairs, since they're inherently fragile.
      
      +From: Chuck Ebbert <76306.1226@compuserve.com>
      
      Because code marked unsafe for kprobes jumps directly to
      entry.S::error_code, that must be marked unsafe as well.
      The easiest way to do that is to move the page fault entry
      point to just before error_code and let it inherit the same
      section.
      
      Also moved all the ".previous" asm directives for kprobes
      sections to column 1 and removed ".text" from them.
      Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      d28c4393
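The section-nesting problem described in Jeremy's fix can be modelled with a small C simulation. This is only a sketch of the assembler's bookkeeping, assuming the usual GNU as behaviour (`.previous` swaps the current and previously active section, `.pushsection`/`.popsection` save and restore them on a stack); the struct and function names are invented for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 8

/* Hypothetical model of the assembler's section state. */
struct section_pair { const char *current, *previous; };

struct assembler {
    struct section_pair now;
    struct section_pair stack[MAX_DEPTH];  /* saved by .pushsection */
    int depth;
};

static inline void as_init(struct assembler *as)
{
    as->now.current = ".text";
    as->now.previous = ".text";
    as->depth = 0;
}

/* .section NAME: the old current section becomes "previous". */
static inline void as_section(struct assembler *as, const char *name)
{
    as->now.previous = as->now.current;
    as->now.current = name;
}

/* .previous: swap current and previous. */
static inline void as_previous(struct assembler *as)
{
    const char *tmp = as->now.current;
    as->now.current = as->now.previous;
    as->now.previous = tmp;
}

/* .pushsection NAME: remember the whole state, then switch. */
static inline void as_pushsection(struct assembler *as, const char *name)
{
    assert(as->depth < MAX_DEPTH);
    as->stack[as->depth++] = as->now;
    as_section(as, name);
}

/* .popsection: restore the state saved by the matching push. */
static inline void as_popsection(struct assembler *as)
{
    assert(as->depth > 0);
    as->now = as->stack[--as->depth];
}

static inline int in_section(const struct assembler *as, const char *name)
{
    return strcmp(as->now.current, name) == 0;
}

/* Old scheme: the function body uses .fixup, so the closing
 * .previous lands in .fixup instead of .text. */
static inline const char *end_with_previous(void)
{
    struct assembler as;
    as_init(&as);
    as_section(&as, ".kprobes.text");  /* entry macro */
    as_section(&as, ".fixup");         /* body switches section */
    as_previous(&as);                  /* body returns to .kprobes.text */
    as_previous(&as);                  /* intended end of function */
    return as.now.current;
}

/* Fixed scheme: push/pop nests correctly around the same body. */
static inline const char *end_with_popsection(void)
{
    struct assembler as;
    as_init(&as);
    as_pushsection(&as, ".kprobes.text");
    as_section(&as, ".fixup");
    as_previous(&as);
    assert(in_section(&as, ".kprobes.text"));
    as_popsection(&as);
    return as.now.current;
}
```

In this model end_with_previous() finishes in .fixup, which is the misplacement the commit describes, while end_with_popsection() returns to .text.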
  2. 19 Sep, 2006 1 commit
  3. 01 Aug, 2006 1 commit
  4. 04 Jul, 2006 1 commit
  5. 01 Jul, 2006 1 commit
  6. 28 Jun, 2006 3 commits
    • I
      [PATCH] vdso: randomize the i386 vDSO by moving it into a vma · e6e5494c
      Ingo Molnar committed
      Move the i386 VDSO down into a vma and thus randomize it.
      
      Besides the security implications, this feature also helps debuggers, which
      can COW a vma-backed VDSO just like a normal DSO and can thus do
      single-stepping and other debugging features.
      
      It's good for hypervisors (Xen, VMWare) too, which typically live in the same
      high-mapped address space as the VDSO, hence whenever the VDSO is used, they
      get lots of guest pagefaults and have to fix such guest accesses up - which
      slows things down instead of speeding things up (the primary purpose of the
      VDSO).
      
      There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
      for older glibcs that still rely on a prelinked high-mapped VDSO.  Newer
      distributions (using glibc 2.3.3 or later) can turn this option off.  Turning
      it off is also recommended for security reasons: attackers cannot use the
      predictable high-mapped VDSO page as syscall trampoline anymore.
      
      There is a new vdso=[0|1] boot option as well, and a runtime
      /proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
      on/off.
      
      (This version of the VDSO-randomization patch also has working ELF
      coredumping, the previous patch crashed in the coredumping code.)
      
      This code is a combined work of the exec-shield VDSO randomization
      code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
      started this patch and i completed it.
      
      [akpm@osdl.org: cleanups]
      [akpm@osdl.org: compile fix]
      [akpm@osdl.org: compile fix 2]
      [akpm@osdl.org: compile fix 3]
      [akpm@osdl.org: revert MAXMEM change]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arjan van de Ven <arjan@infradead.org>
      Cc: Gerd Hoffmann <kraxel@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Zachary Amsden <zach@vmware.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      e6e5494c
    • A
      [PATCH] fix broken vm86 interrupt/signal handling · 4031ff38
      Aleksey Gorelov committed
      Commit c3ff8ec3 ("[PATCH] i386: Don't
      miss pending signals returning to user mode after signal processing")
      meant that vm86 interrupt/signal handling got broken for the case when
      vm86 is called from kernel space.
      
      In this scenario, if signal is pending because of vm86 interrupt,
      do_notify_resume/do_signal exits immediately due to user_mode() check,
      without processing any signals.  Thus, resume_userspace handler is spinning
      in a tight loop with signal pending and TIF_SIGPENDING is set.  Previously
      everything worked Ok.
      
      No in-tree usage of vm86() from kernel space exists, but I've heard
      about a number of projects out there which use vm86 calls from kernel,
      one of them being this, for instance:
      
      	http://dev.gentoo.org/~spock/projects/vesafb-tng/
      
      The following patch fixes the issue.
      Signed-off-by: Aleksey Gorelov <aleksey_gorelov@phoenix.com>
      Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      4031ff38
    • R
      [PATCH] x86: increase interrupt vector range · 19eadf98
      Rusty Russell committed
      Remove the limit of 256 interrupt vectors by changing the value stored in
      orig_{e,r}ax to be the complemented interrupt vector.  The orig_{e,r}ax
      needs to be < 0 to allow the signal code to distinguish between return from
      interrupt and return from syscall.  With this change applied, NR_IRQS can
      be > 256.
      
      Xen extends the IRQ numbering space to include room for dynamically
      allocated virtual interrupts (in the range 256-511), which requires a more
      permissive interface to do_IRQ.
      Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
      Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      19eadf98
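The complemented-vector encoding from this commit can be illustrated with a short C sketch. The helper names are hypothetical; the commit only states that the complemented vector is stored in orig_{e,r}ax and that the stored value must be negative. Since ~v equals -v - 1, every non-negative vector maps to a negative value, so the range is no longer capped at 256.

```c
#include <assert.h>

/* Hypothetical helpers; only the complement idea comes from the commit. */

/* Encode an interrupt vector for storage in orig_{e,r}ax.
 * ~v == -v - 1, so the result is negative for every v >= 0. */
static inline long encode_irq_vector(int vector)
{
    assert(vector >= 0);
    return ~(long)vector;
}

/* Recover the vector; complementing twice is the identity. */
static inline int decode_irq_vector(long orig_ax)
{
    return (int)~orig_ax;
}

/* Signal code can tell the two entry paths apart by sign:
 * a syscall stores its non-negative syscall number instead. */
static inline int entered_via_syscall(long orig_ax)
{
    return orig_ax >= 0;
}
```

For example, vector 300 is stored as -301 and decodes back to 300, while a stored syscall number such as 20 stays non-negative.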
  7. 27 Jun, 2006 2 commits
  8. 23 Mar, 2006 1 commit
    • C
      [PATCH] i386: fix singlestep through an int80 syscall · 635cf99a
      Chuck Ebbert committed
      Using PTRACE_SINGLESTEP on a child that does an int80 syscall misses the
      SIGTRAP that should be delivered upon syscall exit.  Fix that by setting
      TIF_SINGLESTEP when entering the kernel via int80 with TF set.
      
      /* Test whether singlestep through an int80 syscall works.
       */
      #define _GNU_SOURCE
      #include <stdio.h>
      #include <signal.h>	/* kill(), SIGUSR1 */
      #include <unistd.h>
      #include <fcntl.h>
      #include <sys/ptrace.h>
      #include <sys/wait.h>
      #include <sys/mman.h>
      #include <asm/user.h>
      
      static int child, status;
      static struct user_regs_struct regs;
      
      static void do_child()
      {
      	ptrace(PTRACE_TRACEME, 0, 0, 0);
      	kill(getpid(), SIGUSR1);
      	asm ("int $0x80" : : "a" (20)); /* getpid */
      }
      
      static void do_parent()
      {
      	unsigned long eip, expected = 0;
      again:
      	waitpid(child, &status, 0);
      	if (WIFEXITED(status) || WIFSIGNALED(status))
      		return;
      
      	if (WIFSTOPPED(status)) {
      		ptrace(PTRACE_GETREGS, child, 0, &regs);
      		eip = regs.eip;
      		if (expected)
      			fprintf(stderr, "child stop @ %08lx, expected %08lx %s\n",
      					eip, expected,
      					eip == expected ? "" : " <== ERROR");
      
      		if (*(unsigned short *)eip == 0x80cd) {
      			fprintf(stderr, "int 0x80 at %08x\n", (unsigned int)eip);
      			expected = eip + 2;
      		} else
      			expected = 0;
      
      		ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);
      	}
      	goto again;
      }
      
      int main(int argc, char * const argv[])
      {
      	child = fork();
      	if (child)
      		do_parent();
      	else
      		do_child();
      	return 0;
      }
      Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      635cf99a
  9. 09 Jan, 2006 1 commit
    • M
      [PATCH] Make vm86 support optional · 64ca9004
      Matt Mackall committed
      This adds an option to remove vm86 support under CONFIG_EMBEDDED.  Saves
      about 5k.
      
      This version eliminates most of the #ifdefs of the previous version and
      instead uses function stubs in vm86.h.  Also, release_vm86_irqs is moved
      from asm-i386/irq.h to a more appropriate home in vm86.h so that the stubs
      can live together.
      
      $ size vmlinux-baseline vmlinux-novm86
         text    data     bss     dec     hex filename
      2920821  523232  190652 3634705  377611 vmlinux-baseline
      2916268  523100  190492 3629860  376324 vmlinux-novm86
      Signed-off-by: Matt Mackall <mpm@selenic.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      64ca9004
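The "about 5k" figure can be checked against the `size` output quoted in this commit. The sketch below simply redoes that arithmetic in C; the struct and function names are made up for the example, and the numbers are copied from the commit message.

```c
#include <assert.h>

/* Section sizes in bytes, as reported by size(1). */
struct section_sizes { long text, data, bss; };

/* Figures copied from the commit message. */
static const struct section_sizes baseline = { 2920821, 523232, 190652 };
static const struct section_sizes novm86   = { 2916268, 523100, 190492 };

/* Total image size: the "dec" column of size(1). */
static inline long total_size(struct section_sizes s)
{
    assert(s.text >= 0 && s.data >= 0 && s.bss >= 0);
    return s.text + s.data + s.bss;
}

/* Bytes saved by going from one build to the other. */
static inline long bytes_saved(struct section_sizes before,
                               struct section_sizes after)
{
    return total_size(before) - total_size(after);
}
```

With these inputs the saving is 4553 bytes of text, 132 of data and 160 of bss, 4845 bytes in total, which matches the difference between the two "dec" columns.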
  10. 07 Jan, 2006 1 commit
  11. 14 Nov, 2005 1 commit
  12. 12 Sep, 2005 1 commit
  13. 08 Sep, 2005 1 commit
  14. 05 Sep, 2005 3 commits
    • P
      [PATCH] uml: SYSEMU: slight cleanup and speedup · 640aa46e
      Paolo 'Blaisorblade' Giarrusso committed
      As a follow-up to "UML Support - Ptrace: adds the host SYSEMU support, for
      UML and general usage" (i.e.  uml-support-* in current mm).
      
      Avoid unconditionally jumping to work_pending and code copying, just reuse
      the already existing resume_userspace path.
      
      One interesting note: Charles P. Wright suggested that the API could be
      improved with no downsides for UML (except that UML would have to support
      yet another host API, since dropping support for the current API is not
      reasonable from the users' point of view).
      Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      CC: Charles P. Wright <cwright@cs.sunysb.edu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      640aa46e
    • B
      [PATCH] Uml support: reorganize PTRACE_SYSEMU support · c8c86cec
      Bodo Stroesser committed
      With this patch, we change the way we handle switching from PTRACE_SYSEMU to
      PTRACE_{SINGLESTEP,SYSCALL}, to free TIF_SYSCALL_EMU from double use as a
      preparation for PTRACE_SYSEMU_SINGLESTEP extension, without changing the
      behavior of the host kernel.
      Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
      Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      c8c86cec
    • L
      [PATCH] UML Support - Ptrace: adds the host SYSEMU support, for UML and general usage · ed75e8d5
      Laurent Vivier committed
            Jeff Dike <jdike@addtoit.com>,
            Paolo 'Blaisorblade' Giarrusso <blaisorblade_spam@yahoo.it>,
            Bodo Stroesser <bstroesser@fujitsu-siemens.com>
      
      Adds a new ptrace(2) mode, called PTRACE_SYSEMU, resembling PTRACE_SYSCALL
      except that the kernel does not execute the requested syscall; this is useful
      to improve performance for virtual environments, like UML, which want to run
      the syscall on their own.
      
      In fact, using PTRACE_SYSCALL means stopping child execution twice, on entry
      and on exit, and each time you also have two context switches; with SYSEMU you
      avoid the 2nd stop and so save two context switches per syscall.
      
      Also, some architectures don't have support in the host for changing the
      syscall number via ptrace(), which is currently needed to skip syscall
      execution (UML turns any syscall into getpid() to avoid it being executed on
      the host).  Fixing that is hard, while SYSEMU is easier to implement.
      
      * This version of the patch includes some suggestions of Jeff Dike to avoid
        adding any instructions to the syscall fast path, plus some other little
        changes, by myself, to make it work even when the syscall is executed with
        SYSENTER (but I'm unsure about them). It has been widely tested for quite a
        lot of time.
      
      * Various fixes were included to handle the various switches between
        various states, i.e. when for instance a syscall entry is traced with one of
        PT_SYSCALL / _SYSEMU / _SINGLESTEP and another one is used on exit.
        Basically, this is done by remembering which one of them was used even after
        the call to ptrace_notify().
      
      * We're combining TIF_SYSCALL_EMU with TIF_SYSCALL_TRACE or TIF_SINGLESTEP
        to make do_syscall_trace() notice that the current syscall was started with
        SYSEMU on entry, so that no notification ought to be done in the exit path;
        this is a bit of a hack, so this problem is solved in another way in next
        patches.
      
      * Also, the effects of the patch:
      "Ptrace - i386: fix Syscall Audit interaction with singlestep"
      are cancelled; they are restored back in the last patch of this series.
      
      Detailed descriptions of the patches doing this kind of processing follow (but
      I've already summed everything up).
      
      * Fix behaviour when changing interception kind #1.
      
        In do_syscall_trace(), we check the status of the TIF_SYSCALL_EMU flag
        only after doing the debugger notification; but the debugger might have
        changed the status of this flag because he continued execution with
        PTRACE_SYSCALL, so this is wrong.  This patch fixes it by saving the flag
        status before calling ptrace_notify().
      
      * Fix behaviour when changing interception kind #2:
        avoid intercepting syscall on return when using SYSCALL again.
      
        A guest process switching from using PTRACE_SYSEMU to PTRACE_SYSCALL
        crashes.
      
        The problem is in arch/i386/kernel/entry.S.  The current SYSEMU patch
        keeps the syscall handler from being called, but does not prevent
        do_syscall_trace() from being called afterwards for syscall completion
        interception.
      
        The appended patch fixes this.  It reuses the flag TIF_SYSCALL_EMU to
        remember "we come from PTRACE_SYSEMU and now are in PTRACE_SYSCALL", since
        the flag is unused in the depicted situation.
      
      * Fix behaviour when changing interception kind #3:
        avoid intercepting syscall on return when using SINGLESTEP.
      
        When testing 2.6.9 with the skas3.v6 patch and my latest patch, I had
        problems with singlestepping on UML in SKAS with SYSEMU.  It looped
        receiving SIGTRAPs without moving forward.  EIP of the traced process was
        the same for all SIGTRAPs.
      
      What's missing is to handle switching from PTRACE_SYSCALL_EMU to
      PTRACE_SINGLESTEP in a way very similar to what is done for the change from
      PTRACE_SYSCALL_EMU to PTRACE_SYSCALL_TRACE.
      
      I.e., after calling ptrace(PTRACE_SYSEMU), on the return path, the debugger is
      notified and then wakes up the process; the syscall is executed (or skipped,
      when do_syscall_trace() returns 0, i.e.  when using PTRACE_SYSEMU), and
      do_syscall_trace() is called again.  Since we are on the return path of a
      SYSEMU'd syscall, if the wake up is performed through ptrace(PTRACE_SYSCALL),
      we must still avoid notifying the parent of the syscall exit.  Now, this
      behaviour is extended even to resuming with PTRACE_SINGLESTEP.
      Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      Cc: Jeff Dike <jdike@addtoit.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      ed75e8d5
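The saving claimed for PTRACE_SYSEMU in this commit can be written out as a tiny cost model in C. It is a rough sketch that only restates the commit's own accounting (two stops per syscall with PTRACE_SYSCALL, one with PTRACE_SYSEMU, two context switches per stop); the enum and function names are invented for the example and it makes no ptrace() calls.

```c
#include <assert.h>

/* Hypothetical labels for the two tracing modes being compared. */
enum trace_mode { TRACE_SYSCALL, TRACE_SYSEMU };

/* Tracer stops per traced syscall: entry and exit for PTRACE_SYSCALL,
 * entry only for PTRACE_SYSEMU. */
static inline int stops_per_syscall(enum trace_mode mode)
{
    return mode == TRACE_SYSCALL ? 2 : 1;
}

/* Each stop costs one switch to the tracer and one switch back. */
static inline long context_switches(enum trace_mode mode, long syscalls)
{
    assert(syscalls >= 0);
    return 2L * stops_per_syscall(mode) * syscalls;
}
```

Under this model a workload of 1000 traced syscalls costs 4000 context switches with PTRACE_SYSCALL and 2000 with PTRACE_SYSEMU, i.e. two saved per syscall as the commit states.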
  15. 01 May, 2005 2 commits
  16. 30 Apr, 2005 1 commit
    • L
      x86: make traps on 'iret' be debuggable in user space · a879cbbb
      Linus Torvalds committed
      This makes a trap on the 'iret' that returns us to user space
      cause a nice clean SIGSEGV, instead of just a hard (and silent)
      exit.
      
      That way a debugger can actually try to see what happened, and
      we also properly notify everybody who might be interested about
      us being gone.
      
      This loses the error code, but tells the debugger what happened
      with ILL_BADSTK in the siginfo.
      a879cbbb
  17. 17 Apr, 2005 2 commits
    • S
      [PATCH] fix crash in entry.S restore_all · 5df24082
      Stas Sergeev committed
      Fix the access-above-bottom-of-stack crash.
      
      1. Allows the valuable optimization to be preserved
      
      2. Works for NMIs
      
      3.  Doesn't care whether or not there are more of the like instances
         where the stack is left empty.
      
      4. Seems to work for me without the crashes:) 
      
      (akpm: this is still under discussion, although I _think_ it's OK.  You might
      want to hold off)
      
      Signed-off-by: Stas Sergeev <stsp@aknet.ru> 
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      5df24082
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds committed
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4