1. 21 5月, 2010 1 次提交
  2. 07 3月, 2010 1 次提交
  3. 04 3月, 2010 1 次提交
    • L
      Prioritize synchronous signals over 'normal' signals · a27341cd
      Linus Torvalds 提交于
      This makes sure that we pick the synchronous signals caused by a
      processor fault over any pending regular asynchronous signals sent to
      use by [t]kill().
      
      This is not strictly required semantics, but it makes it _much_ easier
      for programs like Wine that expect to find the fault information in the
      signal stack.
      
      Without this, if a non-synchronous signal gets picked first, the delayed
      asynchronous signal will have its signal context pointing to the new
      signal invocation, rather than the instruction that caused the SIGSEGV
      or SIGBUS in the first place.
      
      This is not all that pretty, and we're discussing making the synchronous
      signals more explicit rather than have these kinds of implicit
      preferences of SIGSEGV and friends.  See for example
      
      	http://bugzilla.kernel.org/show_bug.cgi?id=15395
      
      for some of the discussion.  But in the meantime this is a simple and
      fairly straightforward work-around, and the whole
      
      	if (x & Y)
      		x &= Y;
      
      thing can be compiled into (and gcc does do it) just three instructions:
      
      	movq    %rdx, %rax
      	andl    $Y, %eax
      	cmovne  %rax, %rdx
      
      so it is at least a simple solution to a subtle issue.
      Reported-and-tested-by: NPavel Vilim <wylda@volny.cz>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a27341cd
  4. 12 1月, 2010 1 次提交
    • A
      kernel/signal.c: fix kernel information leak with print-fatal-signals=1 · b45c6e76
      Andi Kleen 提交于
      When print-fatal-signals is enabled it's possible to dump any memory
      reachable by the kernel to the log by simply jumping to that address from
      user space.
      
      Or crash the system if there's some hardware with read side effects.
      
      The fatal signals handler will dump 16 bytes at the execution address,
      which is fully controlled by ring 3.
      
      In addition when something jumps to a unmapped address there will be up to
      16 additional useless page faults, which might be potentially slow (and at
      least is not very efficient)
      
      Fortunately this option is off by default and only there on i386.
      
      But fix it by checking for kernel addresses and also stopping when there's
      a page fault.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b45c6e76
  5. 16 12月, 2009 5 次提交
  6. 11 12月, 2009 2 次提交
  7. 26 11月, 2009 3 次提交
  8. 09 11月, 2009 1 次提交
    • N
      signal: Print warning message when dropping signals · f84d49b2
      Naohiro Ooiwa 提交于
      When the system has too many timers or too many aggregate
      queued signals, the EAGAIN error is returned to application
      from kernel, including timer_create() [POSIX.1b].
      
      It means that the app exceeded the limit of pending signals,
      but in general application writers do not expect this
      outcome and the current silent failure can cause rare app
      failures under very high load.
      
      This patch adds a new message when we reach the limit
      and if print_fatal_signals is enabled:
      
          task/1234: reached RLIMIT_SIGPENDING, dropping signal
      
      If you see this message and your system behaved unexpectedly,
      you can run following command to lift the limit:
      
         # ulimit -i unlimited
      
      With help from Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>.
      Signed-off-by: NNaohiro Ooiwa <nooiwa@miraclelinux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: oleg@redhat.com
      LKML-Reference: <4AF6E7E2.9080406@miraclelinux.com>
      [ Modified a few small details, gave surrounding code some love. ]
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f84d49b2
  9. 24 9月, 2009 4 次提交
  10. 02 8月, 2009 2 次提交
    • L
      do_sigaltstack: small cleanups · 0dd8486b
      Linus Torvalds 提交于
      The previous commit ("do_sigaltstack: avoid copying 'stack_t' as a
      structure to user space") fixed a real bug.  This one just cleans up the
      copy from user space to that gcc can generate better code for it (and so
      that it looks the same as the later copy back to user space).
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0dd8486b
    • L
      do_sigaltstack: avoid copying 'stack_t' as a structure to user space · 0083fc2c
      Linus Torvalds 提交于
      Ulrich Drepper correctly points out that there is generally padding in
      the structure on 64-bit hosts, and that copying the structure from
      kernel to user space can leak information from the kernel stack in those
      padding bytes.
      
      Avoid the whole issue by just copying the three members one by one
      instead, which also means that the function also can avoid the need for
      a stack frame.  This also happens to match how we copy the new structure
      from user space, so it all even makes sense.
      
      [ The obvious solution of adding a memset() generates horrid code, gcc
        does really stupid things. ]
      Reported-by: NUlrich Drepper <drepper@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0083fc2c
  11. 19 6月, 2009 2 次提交
  12. 15 6月, 2009 1 次提交
    • V
      signal: fix __send_signal() false positive kmemcheck warning · 7a0aeb14
      Vegard Nossum 提交于
      This false positive is due to field padding in struct sigqueue. When
      this dynamically allocated structure is copied to the stack (in arch-
      specific delivery code), kmemcheck sees a read from the padding, which
      is, naturally, uninitialized.
      
      Hide the false positive using the __GFP_NOTRACK_FALSE_POSITIVE flag.
      Also made the rlimit override code a bit clearer by introducing a new
      variable.
      
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
      7a0aeb14
  13. 01 5月, 2009 2 次提交
  14. 30 4月, 2009 1 次提交
  15. 15 4月, 2009 2 次提交
    • S
      tracing/events: move trace point headers into include/trace/events · ad8d75ff
      Steven Rostedt 提交于
      Impact: clean up
      
      Create a sub directory in include/trace called events to keep the
      trace point headers in their own separate directory. Only headers that
      declare trace points should be defined in this directory.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      ad8d75ff
    • S
      tracing: create automated trace defines · a8d154b0
      Steven Rostedt 提交于
      This patch lowers the number of places a developer must modify to add
      new tracepoints. The current method to add a new tracepoint
      into an existing system is to write the trace point macro in the
      trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or
      DECLARE_TRACE, then they must add the same named item into the C file
      with the macro DEFINE_TRACE(name) and then add the trace point.
      
      This change cuts out the needing to add the DEFINE_TRACE(name).
      Every file that uses the tracepoint must still include the trace/<type>.h
      file, but the one C file must also add a define before the including
      of that file.
      
       #define CREATE_TRACE_POINTS
       #include <trace/mytrace.h>
      
      This will cause the trace/mytrace.h file to also produce the C code
      necessary to implement the trace point.
      
      Note, if more than one trace/<type>.h is used to create the C code
      it is best to list them all together.
      
       #define CREATE_TRACE_POINTS
       #include <trace/foo.h>
       #include <trace/bar.h>
       #include <trace/fido.h>
      
      Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with
      the cleaner solution of the define above the includes over my first
      design to have the C code include a "special" header.
      
      This patch converts sched, irq and lockdep and skb to use this new
      method.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      a8d154b0
  16. 03 4月, 2009 6 次提交
    • S
      signals: SI_USER: Masquerade si_pid when crossing pid ns boundary · 6588c1e3
      Sukadev Bhattiprolu 提交于
      When sending a signal to a descendant namespace, set ->si_pid to 0 since
      the sender does not have a pid in the receiver's namespace.
      
      Note:
      	- If rt_sigqueueinfo() sets si_code to SI_USER when sending a
      	  signal across a pid namespace boundary, the value in ->si_pid
      	  will be cleared to 0.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6588c1e3
    • S
      signals: protect cinit from blocked fatal signals · b3bfa0cb
      Sukadev Bhattiprolu 提交于
      Normally SIG_DFL signals to global and container-init are dropped early.
      But if a signal is blocked when it is posted, we cannot drop the signal
      since the receiver may install a handler before unblocking the signal.
      Once this signal is queued however, the receiver container-init has no way
      of knowing if the signal was sent from an ancestor or descendant
      namespace.  This patch ensures that contianer-init drops all SIG_DFL
      signals in get_signal_to_deliver() except SIGKILL/SIGSTOP.
      
      If SIGSTOP/SIGKILL originate from a descendant of container-init they are
      never queued (i.e dropped in sig_ignored() in an earler patch).
      
      If SIGSTOP/SIGKILL originate from parent namespace, the signal is queued
      and container-init processes the signal.
      
      IOW, if get_signal_to_deliver() sees a sig_kernel_only() signal for global
      or container-init, the signal must have been generated internally or must
      have come from an ancestor ns and we process the signal.
      
      Further, the signal_group_exit() check was needed to cover the case of a
      multi-threaded init sending SIGKILL to other threads when doing an exit()
      or exec().  But since the new sig_kernel_only() check covers the SIGKILL,
      the signal_group_exit() check is no longer needed and can be removed.
      
      Finally, now that we have all pieces in place, set SIGNAL_UNKILLABLE for
      container-inits.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      b3bfa0cb
    • S
      signals: protect cinit from unblocked SIG_DFL signals · 921cf9f6
      Sukadev Bhattiprolu 提交于
      Drop early any SIG_DFL or SIG_IGN signals to container-init from within
      the same container.  But queue SIGSTOP and SIGKILL to the container-init
      if they are from an ancestor container.
      
      Blocked, fatal signals (i.e when SIG_DFL is to terminate) from within the
      container can still terminate the container-init.  That will be addressed
      in the next patch.
      
      Note:	To be bisect-safe, SIGNAL_UNKILLABLE will be set for container-inits
         	in a follow-on patch. Until then, this patch is just a preparatory
      	step.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      921cf9f6
    • S
      signals: add from_ancestor_ns parameter to send_signal() · 7978b567
      Sukadev Bhattiprolu 提交于
      send_signal() (or its helper) needs to determine the pid namespace of the
      sender.  But a signal sent via kill_pid_info_as_uid() comes from within
      the kernel and send_signal() does not need to determine the pid namespace
      of the sender.  So define a helper for send_signal() which takes an
      additional parameter, 'from_ancestor_ns' and have kill_pid_info_as_uid()
      use that helper directly.
      
      The 'from_ancestor_ns' parameter will be used in a follow-on patch.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7978b567
    • O
      signals: protect init from unwanted signals more · f008faff
      Oleg Nesterov 提交于
      (This is a modified version of the patch submitted by Oleg Nesterov
      http://lkml.org/lkml/2008/11/18/249 and tries to address comments that
      came up in that discussion)
      
      init ignores the SIG_DFL signals but we queue them anyway, including
      SIGKILL.  This is mostly OK, the signal will be dropped silently when
      dequeued, but the pending SIGKILL has 2 bad implications:
      
              - it implies fatal_signal_pending(), so we confuse things
                like wait_for_completion_killable/lock_page_killable.
      
              - for the sub-namespace inits, the pending SIGKILL can
                mask (legacy_queue) the subsequent SIGKILL from the
                parent namespace which must kill cinit reliably.
                (preparation, cinits don't have SIGNAL_UNKILLABLE yet)
      
      The patch can't help when init is ptraced, but ptracing of init is not
      "safe" anyway.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Acked-by: NRoland McGrath <roland@redhat.com>
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f008faff
    • O
      signals: remove 'handler' parameter to tracehook functions · 43918f2b
      Oleg Nesterov 提交于
      Container-init must behave like global-init to processes within the
      container and hence it must be immune to unhandled fatal signals from
      within the container (i.e SIG_DFL signals that terminate the process).
      
      But the same container-init must behave like a normal process to processes
      in ancestor namespaces and so if it receives the same fatal signal from a
      process in ancestor namespace, the signal must be processed.
      
      Implementing these semantics requires that send_signal() determine pid
      namespace of the sender but since signals can originate from workqueues/
      interrupt-handlers, determining pid namespace of sender may not always be
      possible or safe.
      
      This patchset implements the design/simplified semantics suggested by
      Oleg Nesterov.  The simplified semantics for container-init are:
      
      	- container-init must never be terminated by a signal from a
      	  descendant process.
      
      	- container-init must never be immune to SIGKILL from an ancestor
      	  namespace (so a process in parent namespace must always be able
      	  to terminate a descendant container).
      
      	- container-init may be immune to unhandled fatal signals (like
      	  SIGUSR1) even if they are from ancestor namespace. SIGKILL/SIGSTOP
      	  are the only reliable signals to a container-init from ancestor
      	  namespace.
      
      This patch:
      
      Based on an earlier patch submitted by Oleg Nesterov and comments from
      Roland McGrath (http://lkml.org/lkml/2008/11/19/258).
      
      The handler parameter is currently unused in the tracehook functions.
      Besides, the tracehook functions are called with siglock held, so the
      functions can check the handler if they later need to.
      
      Removing the parameter simiplifies changes to sig_ignored() in a follow-on
      patch.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Acked-by: NRoland McGrath <roland@redhat.com>
      Signed-off-by: NOleg Nesterov <oleg@tv-sign.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Daniel Lezcano <daniel.lezcano@free.fr>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      43918f2b
  17. 24 3月, 2009 1 次提交
    • M
      fix ptrace slowness · 53da1d94
      Miklos Szeredi 提交于
      This patch fixes bug #12208:
      
        Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12208
        Subject         : uml is very slow on 2.6.28 host
      
      This turned out to be not a scheduler regression, but an already
      existing problem in ptrace being triggered by subtle scheduler
      changes.
      
      The problem is this:
      
       - task A is ptracing task B
       - task B stops on a trace event
       - task A is woken up and preempts task B
       - task A calls ptrace on task B, which does ptrace_check_attach()
       - this calls wait_task_inactive(), which sees that task B is still on the runq
       - task A goes to sleep for a jiffy
       - ...
      
      Since UML does lots of the above sequences, those jiffies quickly add
      up to make it slow as hell.
      
      This patch solves this by not rescheduling in read_unlock() after
      ptrace_stop() has woken up the tracer.
      
      Thanks to Oleg Nesterov and Ingo Molnar for the feedback.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      CC: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      53da1d94
  18. 05 2月, 2009 1 次提交
    • P
      signal: re-add dead task accumulation stats. · 32bd671d
      Peter Zijlstra 提交于
      We're going to split the process wide cpu accounting into two parts:
      
       - clocks; which can take all the time they want since they run
                 from user context.
      
       - timers; which need constant time tracing but can affort the overhead
                 because they're default off -- and rare.
      
      The clock readout will go back to a full sum of the thread group, for this
      we need to re-add the exit stats that were removed in the initial itimer
      rework (f06febc9: timers: fix itimer/many thread hang).
      
      Furthermore, since that full sum can be rather slow for large thread groups
      and we have the complete dead task stats, revert the do_notify_parent time
      computation.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Reviewed-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      32bd671d
  19. 27 1月, 2009 1 次提交
  20. 14 1月, 2009 2 次提交