1. 02 11月, 2017 1 次提交
  2. 19 8月, 2017 1 次提交
  3. 07 8月, 2017 1 次提交
  4. 25 7月, 2017 1 次提交
    • E
      signal: Remove kernel interal si_code magic · cc731525
      Eric W. Biederman 提交于
      struct siginfo is a union and the kernel since 2.4 has been hiding a union
      tag in the high 16bits of si_code using the values:
      __SI_KILL
      __SI_TIMER
      __SI_POLL
      __SI_FAULT
      __SI_CHLD
      __SI_RT
      __SI_MESGQ
      __SI_SYS
      
      While this looks plausible on the surface, in practice this situation has
      not worked well.
      
      - Injected positive signals are not copied to user space properly
        unless they have these magic high bits set.
      
      - Injected positive signals are not reported properly by signalfd
        unless they have these magic high bits set.
      
      - These kernel internal values leaked to userspace via ptrace_peek_siginfo
      
      - It was possible to inject these kernel internal values and cause the
        the kernel to misbehave.
      
      - Kernel developers got confused and expected these kernel internal values
        in userspace in kernel self tests.
      
      - Kernel developers got confused and set si_code to __SI_FAULT which
        is SI_USER in userspace which causes userspace to think an ordinary user
        sent the signal and that it was not kernel generated.
      
      - The values make it impossible to reorganize the code to transform
        siginfo_copy_to_user into a plain copy_to_user.  As si_code must
        be massaged before being passed to userspace.
      
      So remove these kernel internal si codes and make the kernel code simpler
      and more maintainable.
      
      To replace these kernel internal magic si_codes introduce the helper
      function siginfo_layout, that takes a signal number and an si_code and
      computes which union member of siginfo is being used.  Have
      siginfo_layout return an enumeration so that gcc will have enough
      information to warn if a switch statement does not handle all of union
      members.
      
      A couple of architectures have a messed up ABI that defines signal
      specific duplications of SI_USER which causes more special cases in
      siginfo_layout than I would like.  The good news is only problem
      architectures pay the cost.
      
      Update all of the code that used the previous magic __SI_ values to
      use the new SIL_ values and to call siginfo_layout to get those
      values.  Escept where not all of the cases are handled remove the
      defaults in the switch statements so that if a new case is missed in
      the future the lack will show up at compile time.
      
      Modify the code that copies siginfo si_code to userspace to just copy
      the value and not cast si_code to a short first.  The high bits are no
      longer used to hold a magic union member.
      
      Fixup the siginfo header files to stop including the __SI_ values in
      their constants and for the headers that were missing it to properly
      update the number of si_codes for each signal type.
      
      The fixes to copy_siginfo_from_user32 implementations has the
      interesting property that several of them perviously should never have
      worked as the __SI_ values they depended up where kernel internal.
      With that dependency gone those implementations should work much
      better.
      
      The idea of not passing the __SI_ values out to userspace and then
      not reinserting them has been tested with criu and criu worked without
      changes.
      
      Ref: 2.4.0-test1
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      cc731525
  5. 11 7月, 2017 1 次提交
  6. 20 6月, 2017 1 次提交
  7. 18 6月, 2017 1 次提交
    • E
      signal: Only reschedule timers on signals timers have sent · 57db7e4a
      Eric W. Biederman 提交于
      Thomas Gleixner  wrote:
      > The CRIU support added a 'feature' which allows a user space task to send
      > arbitrary (kernel) signals to itself. The changelog says:
      >
      >   The kernel prevents sending of siginfo with positive si_code, because
      >   these codes are reserved for kernel.  I think we can allow a task to
      >   send such a siginfo to itself.  This operation should not be dangerous.
      >
      > Quite contrary to that claim, it turns out that it is outright dangerous
      > for signals with info->si_code == SI_TIMER. The following code sequence in
      > a user space task allows to crash the kernel:
      >
      >    id = timer_create(CLOCK_XXX, ..... signo = SIGX);
      >    timer_set(id, ....);
      >    info->si_signo = SIGX;
      >    info->si_code = SI_TIMER:
      >    info->_sifields._timer._tid = id;
      >    info->_sifields._timer._sys_private = 2;
      >    rt_[tg]sigqueueinfo(..., SIGX, info);
      >    sigemptyset(&sigset);
      >    sigaddset(&sigset, SIGX);
      >    rt_sigtimedwait(sigset, info);
      >
      > For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this
      > results in a kernel crash because sigwait() dequeues the signal and the
      > dequeue code observes:
      >
      >   info->si_code == SI_TIMER && info->_sifields._timer._sys_private != 0
      >
      > which triggers the following callchain:
      >
      >  do_schedule_next_timer() -> posix_cpu_timer_schedule() -> arm_timer()
      >
      > arm_timer() executes a list_add() on the timer, which is already armed via
      > the timer_set() syscall. That's a double list add which corrupts the posix
      > cpu timer list. As a consequence the kernel crashes on the next operation
      > touching the posix cpu timer list.
      >
      > Posix clocks which are internally implemented based on hrtimers are not
      > affected by this because hrtimer_start() can handle already armed timers
      > nicely, but it's a reliable way to trigger the WARN_ON() in
      > hrtimer_forward(), which complains about calling that function on an
      > already armed timer.
      
      This problem has existed since the posix timer code was merged into
      2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to
      inject not just a signal (which linux has supported since 1.0) but the
      full siginfo of a signal.
      
      The core problem is that the code will reschedule in response to
      signals getting dequeued not just for signals the timers sent but
      for other signals that happen to a si_code of SI_TIMER.
      
      Avoid this confusion by testing to see if the queued signal was
      preallocated as all timer signals are preallocated, and so far
      only the timer code preallocates signals.
      
      Move the check for if a timer needs to be rescheduled up into
      collect_signal where the preallocation check must be performed,
      and pass the result back to dequeue_signal where the code reschedules
      timers.   This makes it clear why the code cares about preallocated
      timers.
      
      Cc: stable@vger.kernel.org
      Reported-by: NThomas Gleixner <tglx@linutronix.de>
      History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Reference: 66dd34ad ("signal: allow to send any siginfo to itself")
      Reference: 1669ce53e2ff ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO")
      Fixes: db8b50ba75f2 ("[PATCH] POSIX clocks & timers")
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      57db7e4a
  8. 10 6月, 2017 2 次提交
  9. 04 6月, 2017 2 次提交
  10. 28 5月, 2017 1 次提交
  11. 22 4月, 2017 1 次提交
  12. 19 4月, 2017 1 次提交
    • P
      mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU · 5f0d5a3a
      Paul E. McKenney 提交于
      A group of Linux kernel hackers reported chasing a bug that resulted
      from their assumption that SLAB_DESTROY_BY_RCU provided an existence
      guarantee, that is, that no block from such a slab would be reallocated
      during an RCU read-side critical section.  Of course, that is not the
      case.  Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire
      slab of blocks.
      
      However, there is a phrase for this, namely "type safety".  This commit
      therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order
      to avoid future instances of this sort of confusion.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: <linux-mm@kvack.org>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      [ paulmck: Add comments mentioning the old name, as requested by Eric
        Dumazet, in order to help people familiar with the old name find
        the new one. ]
      Acked-by: NDavid Rientjes <rientjes@google.com>
      5f0d5a3a
  13. 02 3月, 2017 7 次提交
  14. 28 2月, 2017 1 次提交
  15. 01 2月, 2017 3 次提交
    • F
      signal: Convert obsolete cputime type to nsecs · bde8285e
      Frederic Weisbecker 提交于
      Use the new nsec based cputime accessors as part of the whole cputime
      conversion from cputime_t to nsecs.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Link: http://lkml.kernel.org/r/1485832191-26889-16-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      bde8285e
    • F
      sched/cputime: Convert task/group cputime to nsecs · 5613fda9
      Frederic Weisbecker 提交于
      Now that most cputime readers use the transition API which return the
      task cputime in old style cputime_t, we can safely store the cputime in
      nsecs. This will eventually make cputime statistics less opaque and more
      granular. Back and forth convertions between cputime_t and nsecs in order
      to deal with cputime_t random granularity won't be needed anymore.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Link: http://lkml.kernel.org/r/1485832191-26889-8-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5613fda9
    • F
      sched/cputime: Introduce special task_cputime_t() API to return old-typed cputime · a1cecf2b
      Frederic Weisbecker 提交于
      This API returns a task's cputime in cputime_t in order to ease the
      conversion of cputime internals to use nsecs units instead. Blindly
      converting all cputime readers to use this API now will later let us
      convert more smoothly and step by step all these places to use the
      new nsec based cputime.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Link: http://lkml.kernel.org/r/1485832191-26889-7-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a1cecf2b
  16. 11 1月, 2017 1 次提交
  17. 26 12月, 2016 1 次提交
    • T
      ktime: Get rid of the union · 2456e855
      Thomas Gleixner 提交于
      ktime is a union because the initial implementation stored the time in
      scalar nanoseconds on 64 bit machine and in a endianess optimized timespec
      variant for 32bit machines. The Y2038 cleanup removed the timespec variant
      and switched everything to scalar nanoseconds. The union remained, but
      become completely pointless.
      
      Get rid of the union and just keep ktime_t as simple typedef of type s64.
      
      The conversion was done with coccinelle and some manual mopping up.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      2456e855
  18. 25 12月, 2016 1 次提交
  19. 15 12月, 2016 1 次提交
    • W
      signals: avoid unnecessary taking of sighand->siglock · c7be96af
      Waiman Long 提交于
      When running certain database workload on a high-end system with many
      CPUs, it was found that spinlock contention in the sigprocmask syscalls
      became a significant portion of the overall CPU cycles as shown below.
      
        9.30%  9.30%  905387  dataserver  /proc/kcore 0x7fff8163f4d2
        [k] _raw_spin_lock_irq
                  |
                  ---_raw_spin_lock_irq
                     |
                     |--99.34%-- __set_current_blocked
                     |          sigprocmask
                     |          sys_rt_sigprocmask
                     |          system_call_fastpath
                     |          |
                     |          |--50.63%-- __swapcontext
                     |          |          |
                     |          |          |--99.91%-- upsleepgeneric
                     |          |
                     |          |--49.36%-- __setcontext
                     |          |          ktskRun
      
      Looking further into the swapcontext function in glibc, it was found that
      the function always call sigprocmask() without checking if there are
      changes in the signal mask.
      
      A check was added to the __set_current_blocked() function to avoid taking
      the sighand->siglock spinlock if there is no change in the signal mask.
      This will prevent unneeded spinlock contention when many threads are
      trying to call sigprocmask().
      
      With this patch applied, the spinlock contention in sigprocmask() was
      gone.
      
      Link: http://lkml.kernel.org/r/1474979209-11867-1-git-send-email-Waiman.Long@hpe.comSigned-off-by: NWaiman Long <Waiman.Long@hpe.com>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Stas Sergeev <stsp@list.ru>
      Cc: Scott J Norton <scott.norton@hpe.com>
      Cc: Douglas Hatch <doug.hatch@hpe.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c7be96af
  20. 16 11月, 2016 1 次提交
  21. 15 9月, 2016 1 次提交
  22. 07 7月, 2016 1 次提交
  23. 24 5月, 2016 1 次提交
  24. 04 5月, 2016 1 次提交
    • A
      signals/sigaltstack: Report current flag bits in sigaltstack() · 0318bc8a
      Andy Lutomirski 提交于
      sigaltstack()'s reported previous state uses a somewhat odd
      convention, but the concept of flag bits is new, and we can do the
      flag bits sensibly.  Specifically, let's just report them directly.
      
      This will allow saving and restoring the sigaltstack state using
      sigaltstack() to work correctly.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Amanieu d'Antras <amanieu@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Stas Sergeev <stsp@list.ru>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: linux-api@vger.kernel.org
      Link: http://lkml.kernel.org/r/94b291ec9fd47741a9264851e316e158ded0b00d.1462296606.git.luto@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0318bc8a
  25. 03 5月, 2016 2 次提交
    • S
      signals/sigaltstack: Implement SS_AUTODISARM flag · 2a742138
      Stas Sergeev 提交于
      This patch implements the SS_AUTODISARM flag that can be OR-ed with
      SS_ONSTACK when forming ss_flags.
      
      When this flag is set, sigaltstack will be disabled when entering
      the signal handler; more precisely, after saving sas to uc_stack.
      When leaving the signal handler, the sigaltstack is restored by
      uc_stack.
      
      When this flag is used, it is safe to switch from sighandler with
      swapcontext(). Without this flag, the subsequent signal will corrupt
      the state of the switched-away sighandler.
      
      To detect the support of this functionality, one can do:
      
        err = sigaltstack(SS_DISABLE | SS_AUTODISARM);
        if (err && errno == EINVAL)
      	unsupported();
      Signed-off-by: NStas Sergeev <stsp@list.ru>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Aleksa Sarai <cyphar@cyphar.com>
      Cc: Amanieu d'Antras <amanieu@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Moore <pmoore@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: linux-api@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/1460665206-13646-4-git-send-email-stsp@list.ruSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2a742138
    • S
      signals/sigaltstack: Prepare to add new SS_xxx flags · 407bc16a
      Stas Sergeev 提交于
      This patch adds SS_FLAG_BITS - the mask that splits sigaltstack
      mode values and bit-flags. Since there is no bit-flags yet, the
      mask is defined to 0. The flags are added by subsequent patches.
      With every new flag, the mask should have the appropriate bit cleared.
      
      This makes sure if some flag is tried on a kernel that doesn't
      support it, the -EINVAL error will be returned, because such a
      flag will be treated as an invalid mode rather than the bit-flag.
      
      That way the existence of the particular features can be probed
      at run-time.
      
      This change was suggested by Andy Lutomirski:
      
        https://lkml.org/lkml/2016/3/6/158Signed-off-by: NStas Sergeev <stsp@list.ru>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Amanieu d'Antras <amanieu@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: linux-api@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/1460665206-13646-3-git-send-email-stsp@list.ruSigned-off-by: NIngo Molnar <mingo@kernel.org>
      407bc16a
  26. 23 3月, 2016 1 次提交
    • H
      kernel/signal.c: add compile-time check for __ARCH_SI_PREAMBLE_SIZE · 41b27154
      Helge Deller 提交于
      The value of __ARCH_SI_PREAMBLE_SIZE defines the size (including
      padding) of the part of the struct siginfo that is before the union, and
      it is then used to calculate the needed padding (SI_PAD_SIZE) to make
      the size of struct siginfo equal to 128 (SI_MAX_SIZE) bytes.
      
      Depending on the target architecture and word width it equals to either
      3 or 4 times sizeof int.
      
      Since the very beginning we had __ARCH_SI_PREAMBLE_SIZE wrong on the
      parisc architecture for the 64bit kernel build.  It's even more
      frustrating, because it can easily be checked at compile time if the
      value was defined correctly.
      
      This patch adds such a check for the correctness of
      __ARCH_SI_PREAMBLE_SIZE in the hope that it will prevent existing and
      future architectures from running into the same problem.
      
      I refrained from replacing __ARCH_SI_PREAMBLE_SIZE by offsetof() in
      copy_siginfo() in include/asm-generic/siginfo.h, because a) it doesn't
      make any difference and b) it's used in the Documentation/kmemcheck.txt
      example.
      
      I ran this patch through the 0-DAY kernel test infrastructure and only
      the parisc architecture triggered as expected.  That means that this
      patch should be OK for all major architectures.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      41b27154
  27. 18 2月, 2016 1 次提交
    • D
      signals, pkeys: Notify userspace about protection key faults · cd0ea35f
      Dave Hansen 提交于
      A protection key fault is very similar to any other access error.
      There must be a VMA, etc...  We even want to take the same action
      (SIGSEGV) that we do with a normal access fault.
      
      However, we do need to let userspace know that something is
      different.  We do this the same way what we did with SEGV_BNDERR
      with Memory Protection eXtensions (MPX): define a new SEGV code:
      SEGV_PKUERR.
      
      We add a siginfo field: si_pkey that reveals to userspace which
      protection key was set on the PTE that we faulted on.  There is
      no other easy way for userspace to figure this out.  They could
      parse smaps but that would be a bit cruel.
      
      We share space with in siginfo with _addr_bnd.  #BR faults from
      MPX are completely separate from page faults (#PF) that trigger
      from protection key violations, so we never need both at the same
      time.
      
      Note that _pkey is a 64-bit value.  The current hardware only
      supports 4-bit protection keys.  We do this because there is
      _plenty_ of space in _sigfault and it is possible that future
      processors would support more than 4 bits of protection keys.
      
      The x86 code to actually fill in the siginfo is in the next
      patch.
      Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Amanieu d'Antras <amanieu@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20160212210212.3A9B83AC@viggo.jf.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      cd0ea35f
  28. 06 2月, 2016 1 次提交
  29. 21 11月, 2015 1 次提交
    • R
      kernel/signal.c: unexport sigsuspend() · 9d8a7652
      Richard Weinberger 提交于
      sigsuspend() is nowhere used except in signal.c itself, so we can mark it
      static do not pollute the global namespace.
      
      But this patch is more than a boring cleanup patch, it fixes a real issue
      on UserModeLinux.  UML has a special console driver to display ttys using
      xterm, or other terminal emulators, on the host side.  Vegard reported
      that sometimes UML is unable to spawn a xterm and he's facing the
      following warning:
      
        WARNING: CPU: 0 PID: 908 at include/linux/thread_info.h:128 sigsuspend+0xab/0xc0()
      
      It turned out that this warning makes absolutely no sense as the UML
      xterm code calls sigsuspend() on the host side, at least it tries.  But
      as the kernel itself offers a sigsuspend() symbol the linker choose this
      one instead of the glibc wrapper.  Interestingly this code used to work
      since ever but always blocked signals on the wrong side.  Some recent
      kernel change made the WARN_ON() trigger and uncovered the bug.
      
      It is a wonderful example of how much works by chance on computers. :-)
      
      Fixes: 68f3f16d ("new helper: sigsuspend()")
      Signed-off-by: NRichard Weinberger <richard@nod.at>
      Reported-by: NVegard Nossum <vegard.nossum@oracle.com>
      Tested-by: NVegard Nossum <vegard.nossum@oracle.com>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Cc: <stable@vger.kernel.org>	[3.5+]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9d8a7652