1. 23 1月, 2018 3 次提交
    • E
      signal: Helpers for faults with specialized siginfo layouts · 38246735
      Eric W. Biederman 提交于
      The helpers added are:
      send_sig_mceerr
      force_sig_mceerr
      force_sig_bnderr
      force_sig_pkuerr
      
      Filling out siginfo properly can ge tricky.  Especially for these
      specialized cases where the temptation is to share code with other
      cases which use a different subset of siginfo fields.  Unfortunately
      that code sharing frequently results in bugs with the wrong siginfo
      fields filled in, and makes it harder to verify that the siginfo
      structure was properly initialized.
      
      Provide these helpers instead that get all of the details right, and
      guarantee that siginfo is properly initialized.
      
      send_sig_mceerr and force_sig_mceer are a little special as two si
      codes BUS_MCEERR_AO and BUS_MCEER_AR both use the same extended
      signinfo layout.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      38246735
    • E
      signal: Add send_sig_fault and force_sig_fault · f8ec6601
      Eric W. Biederman 提交于
      The vast majority of signals sent from architecture specific code are
      simple faults.  Encapsulate this reality with two helper functions so
      that the nit-picky implementation of preparing a siginfo does not need
      to be repeated many times on each architecture.
      
      As only some architectures support the trapno field, make the trapno
      arguement only present on those architectures.
      
      Similary as ia64 has three fields: imm, flags, and isr that
      are specific to it.  Have those arguments always present on ia64
      and no where else.
      
      This ensures the architecture specific code always remembers which
      fields it needs to pass into the siginfo structure.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      f8ec6601
    • E
      signal: Don't use structure initializers for struct siginfo · 5f74972c
      Eric W. Biederman 提交于
      The siginfo structure has all manners of holes with the result that a
      structure initializer is not guaranteed to initialize all of the bits.
      As we have to copy the structure to userspace don't even try to use
      a structure initializer.  Instead use clear_siginfo followed by initializing
      selected fields.  This gives a guarantee that uninitialized kernel memory
      is not copied to userspace.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      5f74972c
  2. 16 1月, 2018 4 次提交
    • E
      signal: Unify and correct copy_siginfo_to_user32 · ea64d5ac
      Eric W. Biederman 提交于
      Among the existing architecture specific versions of
      copy_siginfo_to_user32 there are several different implementation
      problems.  Some architectures fail to handle all of the cases in in
      the siginfo union.  Some architectures perform a blind copy of the
      siginfo union when the si_code is negative.  A blind copy suggests the
      data is expected to be in 32bit siginfo format, which means that
      receiving such a signal via signalfd won't work, or that the data is
      in 64bit siginfo and the code is copying nonsense to userspace.
      
      Create a single instance of copy_siginfo_to_user32 that all of the
      architectures can share, and teach it to handle all of the cases in
      the siginfo union correctly, with the assumption that siginfo is
      stored internally to the kernel is 64bit siginfo format.
      
      A special case is made for x86 x32 format.  This is needed as presence
      of both x32 and ia32 on x86_64 results in two different 32bit signal
      formats.  By allowing this small special case there winds up being
      exactly one code base that needs to be maintained between all of the
      architectures.  Vastly increasing the testing base and the chances of
      finding bugs.
      
      As the x86 copy of copy_siginfo_to_user32 the call of the x86
      signal_compat_build_tests were moved into sigaction_compat_abi, so
      that they will keep running.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      ea64d5ac
    • E
      signal: Remove the code to clear siginfo before calling copy_siginfo_from_user32 · eb5346c3
      Eric W. Biederman 提交于
      The new unified copy_siginfo_from_user32 takes care of this.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      eb5346c3
    • E
      signal: Unify and correct copy_siginfo_from_user32 · 212a36a1
      Eric W. Biederman 提交于
      The function copy_siginfo_from_user32 is used for two things, in ptrace
      since the dawn of siginfo for arbirarily modifying a signal that
      user space sees, and in sigqueueinfo to send a signal with arbirary
      siginfo data.
      
      Create a single copy of copy_siginfo_from_user32 that all architectures
      share, and teach it to handle all of the cases in the siginfo union.
      
      In the generic version of copy_siginfo_from_user32 ensure that all
      of the fields in siginfo are initialized so that the siginfo structure
      can be safely copied to userspace if necessary.
      
      When copying the embedded sigval union copy the si_int member.  That
      ensures the 32bit values passes through the kernel unchanged.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      212a36a1
    • E
      signal/blackfin: Move the blackfin specific si_codes to asm-generic/siginfo.h · 71ee78d5
      Eric W. Biederman 提交于
      Having si_codes in many different files simply encourages duplicate definitions
      that can cause problems later.  To avoid that merge the blackfin specific si_codes
      into uapi/asm-generic/siginfo.h
      
      Update copy_siginfo_to_user to copy with the absence of BUS_MCEERR_AR that blackfin
      defines to be something else.
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      71ee78d5
  3. 13 1月, 2018 6 次提交
  4. 04 1月, 2018 1 次提交
    • E
      signal: Simplify and fix kdb_send_sig · 0b44bf9a
      Eric W. Biederman 提交于
      - Rename from kdb_send_sig_info to kdb_send_sig
        As there is no meaningful siginfo sent
      
      - Use SEND_SIG_PRIV instead of generating a siginfo for a kdb
        signal.  The generated siginfo had a bogus rationale and was
        not correct in the face of pid namespaces.  SEND_SIG_PRIV
        is simpler and actually correct.
      
      - As the code grabs siglock just send the signal with siglock
        held instead of dropping siglock and attempting to grab it again.
      
      - Move the sig_valid test into kdb_kill where it can generate
        a good error message.
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      0b44bf9a
  5. 18 11月, 2017 3 次提交
  6. 16 11月, 2017 1 次提交
  7. 02 11月, 2017 1 次提交
  8. 20 9月, 2017 4 次提交
  9. 19 8月, 2017 1 次提交
  10. 07 8月, 2017 1 次提交
  11. 25 7月, 2017 1 次提交
    • E
      signal: Remove kernel interal si_code magic · cc731525
      Eric W. Biederman 提交于
      struct siginfo is a union and the kernel since 2.4 has been hiding a union
      tag in the high 16bits of si_code using the values:
      __SI_KILL
      __SI_TIMER
      __SI_POLL
      __SI_FAULT
      __SI_CHLD
      __SI_RT
      __SI_MESGQ
      __SI_SYS
      
      While this looks plausible on the surface, in practice this situation has
      not worked well.
      
      - Injected positive signals are not copied to user space properly
        unless they have these magic high bits set.
      
      - Injected positive signals are not reported properly by signalfd
        unless they have these magic high bits set.
      
      - These kernel internal values leaked to userspace via ptrace_peek_siginfo
      
      - It was possible to inject these kernel internal values and cause the
        the kernel to misbehave.
      
      - Kernel developers got confused and expected these kernel internal values
        in userspace in kernel self tests.
      
      - Kernel developers got confused and set si_code to __SI_FAULT which
        is SI_USER in userspace which causes userspace to think an ordinary user
        sent the signal and that it was not kernel generated.
      
      - The values make it impossible to reorganize the code to transform
        siginfo_copy_to_user into a plain copy_to_user.  As si_code must
        be massaged before being passed to userspace.
      
      So remove these kernel internal si codes and make the kernel code simpler
      and more maintainable.
      
      To replace these kernel internal magic si_codes introduce the helper
      function siginfo_layout, that takes a signal number and an si_code and
      computes which union member of siginfo is being used.  Have
      siginfo_layout return an enumeration so that gcc will have enough
      information to warn if a switch statement does not handle all of union
      members.
      
      A couple of architectures have a messed up ABI that defines signal
      specific duplications of SI_USER which causes more special cases in
      siginfo_layout than I would like.  The good news is only problem
      architectures pay the cost.
      
      Update all of the code that used the previous magic __SI_ values to
      use the new SIL_ values and to call siginfo_layout to get those
      values.  Escept where not all of the cases are handled remove the
      defaults in the switch statements so that if a new case is missed in
      the future the lack will show up at compile time.
      
      Modify the code that copies siginfo si_code to userspace to just copy
      the value and not cast si_code to a short first.  The high bits are no
      longer used to hold a magic union member.
      
      Fixup the siginfo header files to stop including the __SI_ values in
      their constants and for the headers that were missing it to properly
      update the number of si_codes for each signal type.
      
      The fixes to copy_siginfo_from_user32 implementations has the
      interesting property that several of them perviously should never have
      worked as the __SI_ values they depended up where kernel internal.
      With that dependency gone those implementations should work much
      better.
      
      The idea of not passing the __SI_ values out to userspace and then
      not reinserting them has been tested with criu and criu worked without
      changes.
      
      Ref: 2.4.0-test1
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      cc731525
  12. 11 7月, 2017 1 次提交
  13. 20 6月, 2017 1 次提交
  14. 18 6月, 2017 1 次提交
    • E
      signal: Only reschedule timers on signals timers have sent · 57db7e4a
      Eric W. Biederman 提交于
      Thomas Gleixner  wrote:
      > The CRIU support added a 'feature' which allows a user space task to send
      > arbitrary (kernel) signals to itself. The changelog says:
      >
      >   The kernel prevents sending of siginfo with positive si_code, because
      >   these codes are reserved for kernel.  I think we can allow a task to
      >   send such a siginfo to itself.  This operation should not be dangerous.
      >
      > Quite contrary to that claim, it turns out that it is outright dangerous
      > for signals with info->si_code == SI_TIMER. The following code sequence in
      > a user space task allows to crash the kernel:
      >
      >    id = timer_create(CLOCK_XXX, ..... signo = SIGX);
      >    timer_set(id, ....);
      >    info->si_signo = SIGX;
      >    info->si_code = SI_TIMER:
      >    info->_sifields._timer._tid = id;
      >    info->_sifields._timer._sys_private = 2;
      >    rt_[tg]sigqueueinfo(..., SIGX, info);
      >    sigemptyset(&sigset);
      >    sigaddset(&sigset, SIGX);
      >    rt_sigtimedwait(sigset, info);
      >
      > For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this
      > results in a kernel crash because sigwait() dequeues the signal and the
      > dequeue code observes:
      >
      >   info->si_code == SI_TIMER && info->_sifields._timer._sys_private != 0
      >
      > which triggers the following callchain:
      >
      >  do_schedule_next_timer() -> posix_cpu_timer_schedule() -> arm_timer()
      >
      > arm_timer() executes a list_add() on the timer, which is already armed via
      > the timer_set() syscall. That's a double list add which corrupts the posix
      > cpu timer list. As a consequence the kernel crashes on the next operation
      > touching the posix cpu timer list.
      >
      > Posix clocks which are internally implemented based on hrtimers are not
      > affected by this because hrtimer_start() can handle already armed timers
      > nicely, but it's a reliable way to trigger the WARN_ON() in
      > hrtimer_forward(), which complains about calling that function on an
      > already armed timer.
      
      This problem has existed since the posix timer code was merged into
      2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to
      inject not just a signal (which linux has supported since 1.0) but the
      full siginfo of a signal.
      
      The core problem is that the code will reschedule in response to
      signals getting dequeued not just for signals the timers sent but
      for other signals that happen to a si_code of SI_TIMER.
      
      Avoid this confusion by testing to see if the queued signal was
      preallocated as all timer signals are preallocated, and so far
      only the timer code preallocates signals.
      
      Move the check for if a timer needs to be rescheduled up into
      collect_signal where the preallocation check must be performed,
      and pass the result back to dequeue_signal where the code reschedules
      timers.   This makes it clear why the code cares about preallocated
      timers.
      
      Cc: stable@vger.kernel.org
      Reported-by: NThomas Gleixner <tglx@linutronix.de>
      History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
      Reference: 66dd34ad ("signal: allow to send any siginfo to itself")
      Reference: 1669ce53e2ff ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO")
      Fixes: db8b50ba75f2 ("[PATCH] POSIX clocks & timers")
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      57db7e4a
  15. 10 6月, 2017 2 次提交
  16. 04 6月, 2017 2 次提交
  17. 28 5月, 2017 1 次提交
  18. 22 4月, 2017 1 次提交
  19. 19 4月, 2017 1 次提交
    • P
      mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU · 5f0d5a3a
      Paul E. McKenney 提交于
      A group of Linux kernel hackers reported chasing a bug that resulted
      from their assumption that SLAB_DESTROY_BY_RCU provided an existence
      guarantee, that is, that no block from such a slab would be reallocated
      during an RCU read-side critical section.  Of course, that is not the
      case.  Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire
      slab of blocks.
      
      However, there is a phrase for this, namely "type safety".  This commit
      therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order
      to avoid future instances of this sort of confusion.
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: <linux-mm@kvack.org>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      [ paulmck: Add comments mentioning the old name, as requested by Eric
        Dumazet, in order to help people familiar with the old name find
        the new one. ]
      Acked-by: NDavid Rientjes <rientjes@google.com>
      5f0d5a3a
  20. 02 3月, 2017 4 次提交