1. 23 Aug, 2018 4 commits
  2. 10 Jun, 2018 1 commit
  3. 04 May, 2018 1 commit
    • P
      sched/core: Introduce set_special_state() · b5bf9a90
      Peter Zijlstra committed
      Gaurav reported a perceived problem with TASK_PARKED, which turned out
      to be a broken wait-loop pattern in __kthread_parkme(), but the
      reported issue can (and does) in fact happen for states that do not do
      condition based sleeps.
      
      When the 'current->state = TASK_RUNNING' store of a previous
      (concurrent) try_to_wake_up() collides with the setting of a 'special'
      sleep state, we can lose the sleep state.
      
      Normal condition based wait-loops are immune to this problem, but
      sleep states that are not condition based are subject to it.
      
      There already is a fix for TASK_DEAD. Abstract that and also apply it
      to TASK_STOPPED and TASK_TRACED, both of which also lack a
      condition based wait-loop.
      Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
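The collision described in this commit can be illustrated with a small user-space analogy. This is only a sketch under stated assumptions: the `ex_` names, the state bit values, and the use of a C11 `atomic_flag` as a per-task lock are all hypothetical and are not taken from the kernel source. The idea shown is that if the waker's check-and-store and the sleeper's special-state store are serialized by the same lock, a stale wakeup cannot overwrite a special state that was set afterwards.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state bits for this sketch; not the kernel's values. */
enum {
    EX_TASK_RUNNING = 0,
    EX_TASK_INTERRUPTIBLE = 1,
    EX_TASK_STOPPED = 4
};

struct ex_task {
    atomic_flag lock;      /* stand-in for a per-task spinlock (assumption) */
    unsigned int state;
};

static void ex_task_init(struct ex_task *t, unsigned int state)
{
    atomic_flag_clear(&t->lock);
    t->state = state;
}

static void ex_lock(struct ex_task *t)
{
    while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void ex_unlock(struct ex_task *t)
{
    atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Store a sleep state that has no condition-based wait-loop behind it. */
static void ex_set_special_state(struct ex_task *t, unsigned int state)
{
    ex_lock(t);
    t->state = state;
    ex_unlock(t);
}

/* Wake the task only if its current state matches the waker's mask. */
static bool ex_try_to_wake_up(struct ex_task *t, unsigned int mask)
{
    bool woken = false;

    ex_lock(t);
    if (t->state & mask) {
        t->state = EX_TASK_RUNNING;
        woken = true;
    }
    ex_unlock(t);
    return woken;
}
```

Because both paths hold the lock, the special-state store lands either entirely before the waker's check (the mask no longer matches, so nothing is overwritten) or entirely after the waker's store (the special state wins).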
  4. 27 Apr, 2018 2 commits
    • E
      signal: Extend siginfo_layout with SIL_FAULT_{MCEERR|BNDERR|PKUERR} · 31931c93
      Eric W. Biederman committed
      Update the siginfo_layout function and enum siginfo_layout to represent
      all of the possible field layouts of struct siginfo.
      
      This allows the uses of siginfo_layout in um and arm64 where they are testing
      for SIL_FAULT to be more accurate as this rules out the other cases.
      
      Further, this allows the switch statements on siginfo_layout to be simpler,
      if perhaps a little more wordy, making it easier to understand what is
      actually going on.
      
      As SIL_FAULT_BNDERR and SIL_FAULT_PKUERR are never expected to appear
      in signalfd, just treat them as SIL_FAULT.  To include them would take
      20 extra bytes and pretty much fill up what is left of
      signalfd_siginfo.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
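As a rough illustration of what a finer-grained layout enum buys, the sketch below classifies a fault signal by its si_code. It is a simplified, hypothetical model: the `ex_`/`EX_` names are invented here, the signal numbers are the x86 Linux ones, the si_code numbers are the Linux uapi values as far as I know them, and the real siginfo_layout() handles many more cases (timers, SIGCHLD, SIGPOLL, queued signals) that are omitted.

```c
/* Hypothetical constants: x86 Linux signal numbers and Linux si_code values. */
enum { EX_SIGBUS = 7, EX_SIGSEGV = 11 };
enum { EX_SEGV_BNDERR = 3, EX_SEGV_PKUERR = 4 };
enum { EX_BUS_MCEERR_AR = 4, EX_BUS_MCEERR_AO = 5 };

enum ex_siginfo_layout {
    EX_SIL_KILL,          /* si_code <= 0: sender pid/uid fields */
    EX_SIL_FAULT,         /* plain fault: address only */
    EX_SIL_FAULT_MCEERR,  /* address plus address-lsb */
    EX_SIL_FAULT_BNDERR,  /* address plus lower/upper bounds */
    EX_SIL_FAULT_PKUERR   /* address plus protection key */
};

/* Simplified classifier covering only SIGBUS and SIGSEGV. */
static enum ex_siginfo_layout ex_fault_layout(int sig, int si_code)
{
    if (si_code <= 0)
        return EX_SIL_KILL;
    if (sig == EX_SIGBUS &&
        (si_code == EX_BUS_MCEERR_AR || si_code == EX_BUS_MCEERR_AO))
        return EX_SIL_FAULT_MCEERR;
    if (sig == EX_SIGSEGV && si_code == EX_SEGV_BNDERR)
        return EX_SIL_FAULT_BNDERR;
    if (sig == EX_SIGSEGV && si_code == EX_SEGV_PKUERR)
        return EX_SIL_FAULT_PKUERR;
    return EX_SIL_FAULT;
}

/* signalfd view: BNDERR and PKUERR are folded back into the plain fault. */
static enum ex_siginfo_layout ex_signalfd_layout(int sig, int si_code)
{
    enum ex_siginfo_layout layout = ex_fault_layout(sig, si_code);

    if (layout == EX_SIL_FAULT_BNDERR || layout == EX_SIL_FAULT_PKUERR)
        return EX_SIL_FAULT;
    return layout;
}
```

A caller that tests for `EX_SIL_FAULT` now knows it is dealing with the address-only layout, which is the accuracy gain the commit message describes for um and arm64.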
    • E
      signal: Remove unncessary #ifdef SEGV_PKUERR in 32bit compat code · 36a4ca3d
      Eric W. Biederman committed
      The only architecture that does not support SEGV_PKUERR is ia64 and
      ia64 has not had 32bit support since some time in 2008.  Therefore
      copy_siginfo_to_user32 and copy_siginfo_from_user32 do not need to
      include support for a missing SEGV_PKUERR.
      
      Compile test on ia64.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  5. 25 Apr, 2018 4 commits
    • E
      signal: Remove ifdefs for BUS_MCEERR_AR and BUS_MCEERR_AO · 4181d225
      Eric W. Biederman committed
      With the recent architecture cleanups these si_codes are always
      defined so there is no need to test for them.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • E
      signal: Remove SEGV_BNDERR ifdefs · 3a11ab14
      Eric W. Biederman committed
      After the last round of cleanups to siginfo.h SEGV_BNDERR is defined
      on all architectures so testing to see if it is defined is unnecessary.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • E
      signal: Stop special casing TRAP_FIXME and FPE_FIXME in siginfo_layout · 0c362f96
      Eric W. Biederman committed
      After more experience with the cases where the si_code of 0
      is used both as a signal specific si_code and as SI_USER, it appears
      that no one cares about the signal specific si_code case and the
      good solution is to just fix the architectures by using
      a different si_code.
      
      In none of the conversations has anyone even suggested that
      anything depends on the signal specific redefinition of SI_USER.
      
      There are at least test cases that care when an si_code of 0 does
      not work as SI_USER.
      
      So make things simple and keep the generic code from introducing
      problems by removing the special casing of TRAP_FIXME and FPE_FIXME.
      This will ensure the generic case of sending a signal with
      kill will always set SI_USER and work.
      
      The architecture specific and signal specific overloads that
      set si_code to 0 will now have problems with signalfd and
      the 32bit compat versions of siginfo copying, at least
      until they are fixed.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • E
      signal: Reduce copy_siginfo_to_user to just copy_to_user · c999b933
      Eric W. Biederman committed
      Now that every instance of struct siginfo is initialized it is no
      longer necessary to copy struct siginfo piece by piece to userspace
      but instead the entire structure can be copied.
      
      As well as making the code simpler and more efficient this means that
      copy_siginfo_to_user no longer cares which union member of struct
      siginfo is in use.
      
      In practice this means that all 32bit architectures that define
      FPE_FIXME will properly send SI_USER when kill(SIGFPE) is sent,
      while still performing their historic architectural brokenness when 0
      is used as a floating point signal si_code.  This matches the current
      behavior of 64bit architectures that define FPE_FIXME, which get lucky:
      an overloaded SI_USER has continued to work through copy_siginfo_to_user
      because the 8 byte si_addr occupies the same bytes in struct siginfo
      as the 4 byte si_pid and the 4 byte si_uid.
      
      Problematic architectures still need to fix their ABI so that signalfd
      and 32bit compat code will work properly.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
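The overlap between si_addr and si_pid/si_uid, and why a whole-structure copy no longer needs to know which union member is active, can be sketched with a cut-down union. The type and function names below are hypothetical, the field widths model a 64bit layout, and `memcpy` merely stands in for copy_to_user.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down model of the siginfo union on a 64bit target. */
union ex_sifields {
    struct { int32_t pid; uint32_t uid; } kill;   /* SI_USER style fields */
    struct { uint64_t addr; } fault;              /* fault style field */
};

/* The 8 byte address covers exactly the bytes of the 4 byte pid and uid. */
_Static_assert(offsetof(union ex_sifields, kill.pid) ==
               offsetof(union ex_sifields, fault.addr), "pid aliases addr");
_Static_assert(offsetof(union ex_sifields, kill.uid) == 4, "uid follows pid");
_Static_assert(sizeof(((union ex_sifields *)0)->fault.addr) == 8, "8 byte addr");

/* Fully initialize the union, then fill in the kill-style fields. */
static void ex_fill_kill(union ex_sifields *f, int32_t pid, uint32_t uid)
{
    memset(f, 0, sizeof(*f));
    f->kill.pid = pid;
    f->kill.uid = uid;
}

/* Whole-structure copy: no need to know which union member is in use. */
static void ex_copy_whole(union ex_sifields *dst, const union ex_sifields *src)
{
    memcpy(dst, src, sizeof(*dst));
}
```

Copying piece by piece requires choosing a member based on si_code, and an overloaded si_code of 0 picks the wrong one; copying every byte sidesteps that choice, which is only safe once every instance is fully initialized.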
  6. 03 Apr, 2018 2 commits
  7. 09 Mar, 2018 1 commit
    • D
      arm64: signal: Ensure si_code is valid for all fault signals · af40ff68
      Dave Martin committed
      Currently, as reported by Eric, an invalid si_code value 0 is
      passed in many signals delivered to userspace in response to faults
      and other kernel errors.  Typically 0 is passed when the fault is
      insufficiently diagnosable or when there does not appear to be any
      sensible alternative value to choose.
      
      This appears to violate POSIX, and is intuitively wrong for at
      least two reasons arising from the fact that 0 == SI_USER:
      
       1) si_code is a union selector, and SI_USER (and si_code <= 0 in
          general) implies the existence of a different set of fields
          (siginfo._kill) from that which exists for a fault signal
          (siginfo._sigfault).  However, the code raising the signal
          typically writes only the _sigfault fields, and the _kill
          fields make no sense in this case.
      
          Thus when userspace sees si_code == 0 (SI_USER) it may
          legitimately inspect fields in the inactive union member _kill
          and obtain garbage as a result.
      
          There appears to be software in the wild relying on this,
          albeit generally only for printing diagnostic messages.
      
       2) Software that wants to be robust against spurious signals may
          discard signals where si_code == SI_USER (or <= 0), or may
          filter such signals based on the si_uid and si_pid fields of
          siginfo._kill.  In the case of fault signals, this means
          that important (and usually fatal) error conditions may be
          silently ignored.
      
      In practice, many of the faults for which arm64 passes si_code == 0
      are undiagnosable conditions such as exceptions with syndrome
      values in ESR_ELx to which the architecture does not yet assign any
      meaning, or conditions indicative of a bug or error in the kernel
      or system and thus that are unrecoverable and should never occur in
      normal operation.
      
      The approach taken in this patch is to translate all such
      undiagnosable or "impossible" synchronous fault conditions to
      SIGKILL, since these are at least probably localisable to a single
      process.  Some of these conditions should really result in a kernel
      panic, but due to the lack of diagnostic information it is
      difficult to be certain: this patch does not add any calls to
      panic(), but this could change later if justified.
      
      Although si_code will not reach userspace in the case of SIGKILL,
      it is still desirable to pass a nonzero value so that the common
      siginfo handling code can detect incorrect use of si_code == 0
      without false positives.  In this case the si_code dependent
      siginfo fields will not be correctly initialised, but since they
      are not passed to userspace I deem this not to matter.
      
      A few faults can reasonably occur in realistic userspace scenarios,
      and _should_ raise a regular, handleable (but perhaps not
      ignorable/blockable) signal: for these, this patch attempts to
      choose a suitable standard si_code value for the raised signal in
      each case instead of 0.
      
      arm64 was the only arch to define a BUS_FIXME code, so after this
      patch nobody defines it.  This patch therefore also removes the
      relevant code from siginfo_layout().
      
      Cc: James Morse <james.morse@arm.com>
      Reported-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
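Point 2 of this commit message can be made concrete with a tiny filter of the kind robust userspace software might apply. This is a hypothetical sketch: the `ex_` names are invented for illustration, and only the facts that SI_USER is 0 and that si_code <= 0 denotes a user-originated signal are taken from the message.

```c
#include <stdbool.h>

enum { EX_SI_USER = 0 };   /* SI_USER is 0, as the commit message notes */

enum ex_action { EX_IGNORE_SPURIOUS, EX_HANDLE_FAULT };

/* si_code <= 0 means "sent by a process", not "raised by the kernel". */
static bool ex_looks_user_sent(int si_code)
{
    return si_code <= EX_SI_USER;
}

/* A defensive handler policy: drop anything that looks user-sent. */
static enum ex_action ex_filter(int si_code)
{
    return ex_looks_user_sent(si_code) ? EX_IGNORE_SPURIOUS : EX_HANDLE_FAULT;
}
```

With such a policy, a genuine fault delivered with si_code == 0 falls into the `EX_IGNORE_SPURIOUS` branch, which is exactly the silent loss of a fatal error the message warns about.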
  8. 07 Mar, 2018 1 commit
  9. 23 Jan, 2018 4 commits
    • E
      signal/ptrace: Add force_sig_ptrace_errno_trap and use it where needed · f71dd7dc
      Eric W. Biederman committed
      There are so many places that build struct siginfo by hand that at
      least one of them is bound to get it wrong.  A handful of cases in the
      kernel arguably did just that when using the errno field of siginfo to
      pass non-errno values to userspace.  The usage is limited to a single
      si_code so at least does not mess up anything else.
      
      Encapsulate this questionable pattern in a helper function so
      that the userspace ABI is preserved.
      
      Update all of the places that use this pattern to use the new helper
      function.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • E
      signal: Helpers for faults with specialized siginfo layouts · 38246735
      Eric W. Biederman committed
      The helpers added are:
      send_sig_mceerr
      force_sig_mceerr
      force_sig_bnderr
      force_sig_pkuerr
      
      Filling out siginfo properly can get tricky, especially for these
      specialized cases where the temptation is to share code with other
      cases which use a different subset of siginfo fields.  Unfortunately
      that code sharing frequently results in bugs with the wrong siginfo
      fields filled in, and makes it harder to verify that the siginfo
      structure was properly initialized.
      
      Provide these helpers instead that get all of the details right, and
      guarantee that siginfo is properly initialized.
      
      send_sig_mceerr and force_sig_mceerr are a little special as two
      si_codes, BUS_MCEERR_AO and BUS_MCEERR_AR, both use the same extended
      siginfo layout.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • E
      signal: Add send_sig_fault and force_sig_fault · f8ec6601
      Eric W. Biederman committed
      The vast majority of signals sent from architecture specific code are
      simple faults.  Encapsulate this reality with two helper functions so
      that the nit-picky implementation of preparing a siginfo does not need
      to be repeated many times on each architecture.
      
      As only some architectures support the trapno field, make the trapno
      argument only present on those architectures.
      
      Similarly, as ia64 has three fields (imm, flags, and isr) that
      are specific to it, have those arguments always present on ia64
      and nowhere else.
      
      This ensures the architecture specific code always remembers which
      fields it needs to pass into the siginfo structure.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • E
      signal: Don't use structure initializers for struct siginfo · 5f74972c
      Eric W. Biederman committed
      The siginfo structure has all manner of holes, with the result that a
      structure initializer is not guaranteed to initialize all of the bits.
      As we have to copy the structure to userspace don't even try to use
      a structure initializer.  Instead use clear_siginfo followed by initializing
      selected fields.  This gives a guarantee that uninitialized kernel memory
      is not copied to userspace.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
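The clear-then-assign pattern this commit describes can be sketched in plain C. The struct below is a hypothetical stand-in with a padding hole on LP64 targets, not the real struct siginfo, and `ex_fill_fault` is an invented helper; `memset` plays the role the message assigns to clear_siginfo. Like the kernel pattern, the sketch relies on compilers leaving padding bytes untouched when individual members are stored.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a structure with holes. */
struct ex_info {
    int signo;
    int err;
    int code;
    /* On LP64 targets the compiler inserts 4 padding bytes here. */
    void *addr;
};

/* Zero every byte first, then set only the fields this case uses. */
static void ex_fill_fault(struct ex_info *info, int signo, int code, void *addr)
{
    memset(info, 0, sizeof(*info));
    info->signo = signo;
    info->code = code;
    info->addr = addr;
}

/* Check that any padding between 'code' and 'addr' holds only zero bytes. */
static int ex_padding_is_zero(const struct ex_info *info)
{
    const unsigned char *p = (const unsigned char *)info;
    size_t i = offsetof(struct ex_info, code) + sizeof(info->code);

    for (; i < offsetof(struct ex_info, addr); i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

A designated initializer such as `struct ex_info info = { .signo = 11 };` zeroes the named and unnamed members, but the language does not promise anything about the padding bytes, which is why the byte-wise clear comes first.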
  10. 16 Jan, 2018 4 commits
    • E
      signal: Unify and correct copy_siginfo_to_user32 · ea64d5ac
      Eric W. Biederman committed
      Among the existing architecture specific versions of
      copy_siginfo_to_user32 there are several different implementation
      problems.  Some architectures fail to handle all of the cases in
      the siginfo union.  Some architectures perform a blind copy of the
      siginfo union when the si_code is negative.  A blind copy suggests the
      data is expected to be in 32bit siginfo format, which means that
      receiving such a signal via signalfd won't work, or that the data is
      in 64bit siginfo and the code is copying nonsense to userspace.
      
      Create a single instance of copy_siginfo_to_user32 that all of the
      architectures can share, and teach it to handle all of the cases in
      the siginfo union correctly, with the assumption that siginfo is
      stored internally to the kernel in 64bit siginfo format.
      
      A special case is made for x86 x32 format.  This is needed as presence
      of both x32 and ia32 on x86_64 results in two different 32bit signal
      formats.  By allowing this small special case there winds up being
      exactly one code base that needs to be maintained between all of the
      architectures, vastly increasing the testing base and the chances of
      finding bugs.
      
      As the x86 copy of copy_siginfo_to_user32 was removed, the call of the x86
      signal_compat_build_tests was moved into sigaction_compat_abi, so
      that the tests will keep running.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • E
      signal: Remove the code to clear siginfo before calling copy_siginfo_from_user32 · eb5346c3
      Eric W. Biederman committed
      The new unified copy_siginfo_from_user32 takes care of this.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • E
      signal: Unify and correct copy_siginfo_from_user32 · 212a36a1
      Eric W. Biederman committed
      The function copy_siginfo_from_user32 is used for two things: in ptrace,
      since the dawn of siginfo, for arbitrarily modifying a signal that
      user space sees, and in sigqueueinfo to send a signal with arbitrary
      siginfo data.
      
      Create a single copy of copy_siginfo_from_user32 that all architectures
      share, and teach it to handle all of the cases in the siginfo union.
      
      In the generic version of copy_siginfo_from_user32 ensure that all
      of the fields in siginfo are initialized so that the siginfo structure
      can be safely copied to userspace if necessary.
      
      When copying the embedded sigval union, copy the si_int member.  That
      ensures the 32bit value passes through the kernel unchanged.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    • E
      signal/blackfin: Move the blackfin specific si_codes to asm-generic/siginfo.h · 71ee78d5
      Eric W. Biederman committed
      Having si_codes in many different files simply encourages duplicate definitions
      that can cause problems later.  To avoid that merge the blackfin specific si_codes
      into uapi/asm-generic/siginfo.h
      
      Update copy_siginfo_to_user to cope with the absence of BUS_MCEERR_AR, which blackfin
      defines to be something else.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
  11. 13 Jan, 2018 6 commits
  12. 04 Jan, 2018 1 commit
    • E
      signal: Simplify and fix kdb_send_sig · 0b44bf9a
      Eric W. Biederman committed
      - Rename from kdb_send_sig_info to kdb_send_sig
        As there is no meaningful siginfo sent
      
      - Use SEND_SIG_PRIV instead of generating a siginfo for a kdb
        signal.  The generated siginfo had a bogus rationale and was
        not correct in the face of pid namespaces.  SEND_SIG_PRIV
        is simpler and actually correct.
      
      - As the code grabs siglock just send the signal with siglock
        held instead of dropping siglock and attempting to grab it again.
      
      - Move the sig_valid test into kdb_kill where it can generate
        a good error message.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  13. 05 Dec, 2017 1 commit
    • M
      livepatch: send a fake signal to all blocking tasks · 43347d56
      Miroslav Benes committed
      Live patching consistency model is of LEAVE_PATCHED_SET and
      SWITCH_THREAD. This means that all tasks in the system have to be marked
      one by one as safe to call a new patched function. Safe means when a
      task is not (sleeping) in a set of patched functions. That is, no
      patched function is on the task's stack. Another clearly safe place is
      the boundary between kernel and userspace. The patching waits for all
      tasks to get outside of the patched set or to cross the boundary. The
      transition is completed afterwards.
      
      The problem is that a task can block the transition for quite a long
      time, if not forever. It could sleep in a set of patched functions, for
      example.  Luckily we can force the task to leave the set by sending it a
      fake signal, that is a signal with no data in signal pending structures
      (no handler, no sign of proper signal delivered). Suspend/freezer use
      this to freeze the tasks as well. The task gets TIF_SIGPENDING set and
      is woken up (if it has been sleeping in the kernel before) or kicked by
      rescheduling IPI (if it was running on other CPU). This causes the task
      to go to kernel/userspace boundary where the signal would be handled and
      the task would be marked as safe in terms of live patching.
      
      There are tasks which are not affected by this technique though. The
      fake signal is not sent to kthreads. They should be handled differently.
      They can be woken up so they leave the patched set and their
      TIF_PATCH_PENDING can be cleared thanks to stack checking.
      
      For the sake of completeness, if the task is in TASK_RUNNING state but
      not currently running on some CPU it doesn't get the IPI, but it would
      eventually handle the signal anyway. Second, if the task runs in the
      kernel (in TASK_RUNNING state) it gets the IPI, but the signal is not
      handled on return from the interrupt. It would be handled on return to
      the userspace in the future when the fake signal is sent again. Stack
      checking deals with these cases in a better way.
      
      If the task was sleeping in a syscall it would be woken by our fake
      signal, it would check if TIF_SIGPENDING is set (by calling
      signal_pending() predicate) and return ERESTART* or EINTR. Syscalls with
      ERESTART* return values are restarted in case of the fake signal (see
      do_signal()). EINTR is propagated back to the userspace program. This
      could disturb the program, but...
      
      * each process dealing with signals should react accordingly to EINTR
        return values.
      * syscalls returning EINTR are a quite common situation in the
        system even if no fake signal is sent.
      * freezer sends the fake signal and does not deal with EINTR anyhow.
        Thus EINTR values are returned when the system is resumed.
      
      The very safe marking is done in architectures' "entry" on syscall and
      interrupt/exception exit paths, and in the stack checking functions of
      livepatch.  TIF_PATCH_PENDING is cleared and the next
      recalc_sigpending() drops TIF_SIGPENDING. In connection with this, also
      call klp_update_patch_state() before do_signal(), so that
      recalc_sigpending() in dequeue_signal() can clear TIF_PATCH_PENDING
      immediately and thus prevent a double call of do_signal().
      
      Note that the fake signal is not sent to stopped/traced tasks. Such a task
      prevents the patching from finishing until it continues again (is not traced
      anymore).
      
      Last, sending the fake signal is not automatic. It is done only when
      admin requests it by writing 1 to signal sysfs attribute in livepatch
      sysfs directory.
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: x86@kernel.org
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
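The expectation that userspace should react sensibly to EINTR can be illustrated with a retry wrapper. This is a minimal sketch, not anything from the patch: `ex_retry_on_eintr` and the `ex_io_fn` callback type are hypothetical, and `ex_flaky_io` is a test double that simulates a call interrupted a given number of times so the behavior can be exercised without real signals.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type wrapping a blocking I/O call such as read(). */
typedef long (*ex_io_fn)(void *ctx, void *buf, size_t len);

/* Re-issue the call for as long as it fails with EINTR. */
static long ex_retry_on_eintr(ex_io_fn fn, void *ctx, void *buf, size_t len)
{
    long n;

    do {
        n = fn(ctx, buf, len);
    } while (n < 0 && errno == EINTR);
    return n;
}

/*
 * Test double: ctx points to an int counting how many more times the
 * call should be "interrupted" before it succeeds and reports len bytes.
 */
static long ex_flaky_io(void *ctx, void *buf, size_t len)
{
    int *remaining = ctx;

    (void)buf;
    if (*remaining > 0) {
        (*remaining)--;
        errno = EINTR;
        return -1;
    }
    return (long)len;
}
```

Retrying unconditionally is only one possible policy; a program that uses signals to cancel blocking calls would check its own cancellation flag inside the loop before re-issuing the call.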
  14. 18 Nov, 2017 3 commits
  15. 16 Nov, 2017 1 commit
  16. 02 Nov, 2017 1 commit
  17. 20 Sep, 2017 3 commits