1. 03 4月, 2009 1 次提交
    • O
      signals: remove 'handler' parameter to tracehook functions · 43918f2b
      Oleg Nesterov 提交于
      Container-init must behave like global-init to processes within the
      container and hence it must be immune to unhandled fatal signals from
      within the container (i.e SIG_DFL signals that terminate the process).
      
      But the same container-init must behave like a normal process to processes
      in ancestor namespaces and so if it receives the same fatal signal from a
      process in ancestor namespace, the signal must be processed.
      
      Implementing these semantics requires that send_signal() determine pid
      namespace of the sender but since signals can originate from workqueues/
      interrupt-handlers, determining pid namespace of sender may not always be
      possible or safe.
      
      This patchset implements the design/simplified semantics suggested by
      Oleg Nesterov.  The simplified semantics for container-init are:
      
      	- container-init must never be terminated by a signal from a
      	  descendant process.
      
      	- container-init must never be immune to SIGKILL from an ancestor
      	  namespace (so a process in parent namespace must always be able
      	  to terminate a descendant container).
      
      	- container-init may be immune to unhandled fatal signals (like
      	  SIGUSR1) even if they are from ancestor namespace. SIGKILL/SIGSTOP
      	  are the only reliable signals to a container-init from ancestor
      	  namespace.
      
      This patch:
      
      Based on an earlier patch submitted by Oleg Nesterov and comments from
      Roland McGrath (http://lkml.org/lkml/2008/11/19/258).
      
      The handler parameter is currently unused in the tracehook functions.
      Besides, the tracehook functions are called with siglock held, so the
      functions can check the handler if they later need to.
      
      Removing the parameter simiplifies changes to sig_ignored() in a follow-on
      patch.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Acked-by: NRoland McGrath <roland@redhat.com>
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      43918f2b
  2. 13 3月, 2009 1 次提交
  3. 03 3月, 2009 1 次提交
    • R
      x86-64: syscall-audit: fix 32/64 syscall hole · ccbe495c
      Roland McGrath 提交于
      On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
      ljmp, and then use the "syscall" instruction to make a 64-bit system
      call.  A 64-bit process make a 32-bit system call with int $0x80.
      
      In both these cases, audit_syscall_entry() will use the wrong system
      call number table and the wrong system call argument registers.  This
      could be used to circumvent a syscall audit configuration that filters
      based on the syscall numbers or argument details.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ccbe495c
  4. 21 2月, 2009 1 次提交
    • I
      x86, mm: rename TASK_SIZE64 => TASK_SIZE_MAX · d9517346
      Ingo Molnar 提交于
      Impact: cleanup
      
      Rename TASK_SIZE64 to TASK_SIZE_MAX, and provide the
      define on 32-bit too. (mapped to TASK_SIZE)
      
      This allows 32-bit code to make use of the (former-) TASK_SIZE64
      symbol as well, in a clean way.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d9517346
  5. 11 2月, 2009 1 次提交
  6. 10 2月, 2009 2 次提交
    • T
      x86: make lazy %gs optional on x86_32 · ccbeed3a
      Tejun Heo 提交于
      Impact: pt_regs changed, lazy gs handling made optional, add slight
              overhead to SAVE_ALL, simplifies error_code path a bit
      
      On x86_32, %gs hasn't been used by kernel and handled lazily.  pt_regs
      doesn't have place for it and gs is saved/loaded only when necessary.
      In preparation for stack protector support, this patch makes lazy %gs
      handling optional by doing the followings.
      
      * Add CONFIG_X86_32_LAZY_GS and place for gs in pt_regs.
      
      * Save and restore %gs along with other registers in entry_32.S unless
        LAZY_GS.  Note that this unfortunately adds "pushl $0" on SAVE_ALL
        even when LAZY_GS.  However, it adds no overhead to common exit path
        and simplifies entry path with error code.
      
      * Define different user_gs accessors depending on LAZY_GS and add
        lazy_save_gs() and lazy_load_gs() which are noop if !LAZY_GS.  The
        lazy_*_gs() ops are used to save, load and clear %gs lazily.
      
      * Define ELF_CORE_COPY_KERNEL_REGS() which always read %gs directly.
      
      xen and lguest changes need to be verified.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ccbeed3a
    • T
      x86: add %gs accessors for x86_32 · d9a89a26
      Tejun Heo 提交于
      Impact: cleanup
      
      On x86_32, %gs is handled lazily.  It's not saved and restored on
      kernel entry/exit but only when necessary which usually is during task
      switch but there are few other places.  Currently, it's done by
      calling savesegment() and loadsegment() explicitly.  Define
      get_user_gs(), set_user_gs() and task_user_gs() and use them instead.
      
      While at it, clean up register access macros in signal.c.
      
      This cleans up code a bit and will help future changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d9a89a26
  7. 20 12月, 2008 2 次提交
    • M
      x86, bts: memory accounting · c5dee617
      Markus Metzger 提交于
      Impact: move the BTS buffer accounting to the mlock bucket
      
      Add alloc_locked_buffer() and free_locked_buffer() functions to mm/mlock.c
      to kalloc a buffer and account the locked memory to current.
      
      Account the memory for the BTS buffer to the tracer.
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      c5dee617
    • M
      x86, bts: add fork and exit handling · bf53de90
      Markus Metzger 提交于
      Impact: introduce new ptrace facility
      
      Add arch_ptrace_untrace() function that is called when the tracer
      detaches (either voluntarily or when the tracing task dies);
      ptrace_disable() is only called on a voluntary detach.
      
      Add ptrace_fork() and arch_ptrace_fork(). They are called when a
      traced task is forked.
      
      Clear DS and BTS related fields on fork.
      
      Release DS resources and reclaim memory in ptrace_untrace(). This
      releases resources already when the tracing task dies. We used to do
      that when the traced task dies.
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      bf53de90
  8. 12 12月, 2008 2 次提交
  9. 26 11月, 2008 2 次提交
  10. 10 11月, 2008 1 次提交
  11. 12 10月, 2008 1 次提交
  12. 24 9月, 2008 1 次提交
  13. 23 9月, 2008 1 次提交
    • S
      signals: demultiplexing SIGTRAP signal · da654b74
      Srinivasa Ds 提交于
      Currently a SIGTRAP can denote any one of below reasons.
      	- Breakpoint hit
      	- H/W debug register hit
      	- Single step
      	- Signal sent through kill() or rasie()
      
      Architectures like powerpc/parisc provides infrastructure to demultiplex
      SIGTRAP signal by passing down the information for receiving SIGTRAP through
      si_code of siginfot_t structure. Here is an attempt is generalise this
      infrastructure by extending it to x86 and x86_64 archs.
      Signed-off-by: NSrinivasa DS <srinivasa@in.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: akpm@linux-foundation.org
      Cc: paulus@samba.org
      Cc: linuxppc-dev@ozlabs.org
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      da654b74
  14. 27 7月, 2008 1 次提交
  15. 25 7月, 2008 1 次提交
  16. 17 7月, 2008 2 次提交
    • R
      x86 ptrace: user-sets-TF nits · 380fdd75
      Roland McGrath 提交于
      This closes some arcane holes in single-step handling that can arise
      only when user programs set TF directly (via popf or sigreturn) and
      then use vDSO (syscall/sysenter) system call entry.  In those entry
      paths, the clear_TF_reenable case hits and we must check TIF_SINGLESTEP
      to be sure our bookkeeping stays correct wrt the user's view of TF.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      380fdd75
    • R
      x86 ptrace: unify syscall tracing · d4d67150
      Roland McGrath 提交于
      This unifies and cleans up the syscall tracing code on i386 and x86_64.
      
      Using a single function for entry and exit tracing on 32-bit made the
      do_syscall_trace() into some terrible spaghetti.  The logic is clear and
      simple using separate syscall_trace_enter() and syscall_trace_leave()
      functions as on 64-bit.
      
      The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
      on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
      tracing either 32-bit or 64-bit tasks.  It behaves just like 32-bit.
      
      Changing syscall_trace_enter() to return the syscall number shortens
      all the assembly paths, while adding the SYSEMU feature in a simple way.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      d4d67150
  17. 01 7月, 2008 1 次提交
  18. 14 5月, 2008 1 次提交
  19. 13 5月, 2008 1 次提交
    • M
      x86, ptrace: PEBS support · 93fa7636
      Markus Metzger 提交于
      Polish the ds.h interface and add support for PEBS.
      
      Ds.c is meant to be the resource allocator for per-thread and per-cpu
      BTS and PEBS recording.
      It is used by ptrace/utrace to provide execution tracing of debugged tasks.
      It will be used by profilers (e.g. perfmon2).
      It may be used by kernel debuggers to provide a kernel execution trace.
      
      Changes in detail:
      - guard DS and ptrace by CONFIG macros
      - separate DS and BTS more clearly
      - simplify field accesses
      - add functions to manage PEBS buffers
      - add simple protection/allocation mechanism
      - added support for Atom
      
      Opens:
      - buffer overflow handling
        Currently, only circular buffers are supported. This is all we need
        for debugging. Profilers would want an overflow notification.
        This is planned to be added when perfmon2 is made to use the ds.h
        interface.
      - utrace intermediate layer
      Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      93fa7636
  20. 26 4月, 2008 2 次提交
  21. 17 4月, 2008 1 次提交
  22. 27 3月, 2008 1 次提交
    • A
      x86: ptrace.c: fix defined-but-unused warnings · d8d4f157
      Andrew Morton 提交于
      arch/x86/kernel/ptrace.c:548: warning: 'ptrace_bts_get_size' defined but not used
      arch/x86/kernel/ptrace.c:558: warning: 'ptrace_bts_read_record' defined but not used
      arch/x86/kernel/ptrace.c:607: warning: 'ptrace_bts_clear' defined but not used
      arch/x86/kernel/ptrace.c:617: warning: 'ptrace_bts_drain' defined but not used
      arch/x86/kernel/ptrace.c:720: warning: 'ptrace_bts_config' defined but not used
      arch/x86/kernel/ptrace.c:788: warning: 'ptrace_bts_status' defined but not used
      
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      d8d4f157
  23. 12 3月, 2008 1 次提交
    • R
      x86: ia32 syscall restart fix · 40f0933d
      Roland McGrath 提交于
      The code to restart syscalls after signals depends on checking for a
      negative orig_ax, and for particular negative -ERESTART* values in ax.
      These fields are 64 bits and for a 32-bit task they get zero-extended.
      The syscall restart behavior is lost, a regression from a native 32-bit
      kernel and from 64-bit tasks' behavior.
      
      This patch fixes the problem by doing sign-extension where it matters.
      
      For orig_ax, the only time the value should be -1 but winds up as
      0x0ffffffff is via a 32-bit ptrace call. So the patch changes ptrace to
      sign-extend the 32-bit orig_eax value when it's stored; it doesn't
      change the checks on orig_ax, though it uses the new current_syscall()
      inline to better document the subtle importance of the used of
      signedness there.
      
      The ax value is stored a lot of ways and it seems hard to get them all
      sign-extended at their origins. So for that, we use the
      current_syscall_ret() to sign-extend it only for 32-bit tasks at the
      time of the -ERESTART* comparisons.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      40f0933d
  24. 08 3月, 2008 1 次提交
    • R
      x86_64: make ptrace always sign-extend orig_ax to 64 bits · 84c6f604
      Roland McGrath 提交于
      This makes 64-bit ptrace calls setting the 64-bit orig_ax field for a
      32-bit task sign-extend the low 32 bits up to 64.  This matches what a
      64-bit debugger expects when tracing a 32-bit task.
      
      This follows on my "x86_64 ia32 syscall restart fix".  This didn't
      matter until that was fixed.
      
      The debugger ignores or zeros the high half of every register slot it
      sets (including the orig_rax pseudo-register) uniformly.  It expects
      that the setting of the low 32 bits always has the same meaning as a
      32-bit debugger setting those same 32 bits with native 32-bit
      facilities.
      
      This never arose before because the syscall restart check never
      matched any -ERESTART* values due to lack of sign extension.  Before
      that fix, even 32-bit ptrace setting orig_eax to -1 failed to trigger
      the restart check anyway.  So this was never noticed as a regression
      of 64-bit debuggers vs 32-bit debuggers on the same 64-bit kernel.
      Signed-off-by: NRoland McGrath <roland@redhat.com>
      [ Changed to just do the sign-extension unconditionally on x86-64,
        since orig_ax is always just a small integer and doesn't need
        the full 64-bit range ]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      84c6f604
  25. 01 3月, 2008 1 次提交
  26. 22 2月, 2008 1 次提交
  27. 07 2月, 2008 1 次提交
  28. 30 1月, 2008 7 次提交