1. 12 Apr 2018, 2 commits
  2. 06 Apr 2018, 1 commit
  3. 10 Mar 2018, 1 commit
  4. 08 Mar 2018, 1 commit
  5. 18 Nov 2017, 5 commits
  6. 17 Aug 2017, 1 commit
    • K
      locking/refcounts, x86/asm: Implement fast refcount overflow protection · 7a46ec0e
      Committed by Kees Cook
      This implements refcount_t overflow protection on x86 without a noticeable
      performance impact, though without the fuller checking of REFCOUNT_FULL.
      
      This is done by duplicating the existing atomic_t refcount implementation,
      normally with just a single instruction added to detect if the refcount
      has gone negative (e.g. wrapped past INT_MAX or below zero). When detected,
      the handler saturates the refcount_t to INT_MIN / 2. With this overflow
      protection, the erroneous reference release that would follow a wrap back
      to zero is blocked from happening, avoiding the class of refcount-overflow
      use-after-free vulnerabilities entirely.
      
      Only the overflow case of refcounting can be perfectly protected, since
      it can be detected and stopped before the reference is freed and left to
      be abused by an attacker. There isn't a way to block early decrements,
      and while REFCOUNT_FULL stops increment-from-zero cases (which would
      be the state _after_ an early decrement and stops potential double-free
      conditions), this fast implementation does not, since it would require
      the more expensive cmpxchg loops. Since the overflow case is much more
      common (e.g. missing a "put" during an error path), this protection
      provides real-world protection. For example, the two public refcount
      overflow use-after-free exploits published in 2016 would have been
      rendered unexploitable:
      
        http://perception-point.io/2016/01/14/analysis-and-exploitation-of-a-linux-kernel-vulnerability-cve-2016-0728/
      
        http://cyseclabs.com/page?n=02012016
      
      This implementation does, however, notice an unchecked decrement to zero
      (i.e. caller used refcount_dec() instead of refcount_dec_and_test() and it
      resulted in a zero). Decrements under zero are noticed (since they will
      have resulted in a negative value), though this only indicates that a
      use-after-free may have already happened. Such notifications are likely
      avoidable by an attacker that has already exploited a use-after-free
      vulnerability, but it's better to have them reported than allow such
      conditions to remain universally silent.
      
      On first overflow detection, the refcount value is reset to INT_MIN / 2
      (which serves as a saturation value) and a report and stack trace are
      produced. When operations detect only negative value results (such as
      changing an already saturated value), saturation still happens but no
      notification is performed (since the value was already saturated).
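
The saturation behaviour described above can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the kernel's x86 implementation: the `sketch_` names are hypothetical, the real code uses `lock incl` followed by `js` and an exception handler, and the `reports` counter merely stands in for the report and stack trace.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical name for the saturation value INT_MIN / 2 described above. */
#define SKETCH_REFCOUNT_SATURATED (INT_MIN / 2)

struct sketch_refcount {
	int value;
	unsigned int reports;	/* stands in for "report and stack trace" */
};

/* Non-atomic model of an increment followed by a sign check. */
static void sketch_refcount_inc(struct sketch_refcount *r)
{
	/* Detect the wrap explicitly to avoid signed-overflow UB in C. */
	bool wrapped = (r->value == INT_MAX);
	int next = wrapped ? INT_MIN : r->value + 1;

	if (next < 0) {			/* models the taken "js" branch */
		if (wrapped)
			r->reports++;	/* first overflow: report once */
		next = SKETCH_REFCOUNT_SATURATED;	/* always saturate */
	}
	r->value = next;
}
```

Starting from INT_MAX, the first call saturates and counts one report; further calls keep the value pinned at INT_MIN / 2 without reporting again.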
      
      On the matter of races, since the entire range beyond INT_MAX but before
      0 is negative, every operation at INT_MIN / 2 will trap, leaving no
      overflow-only race condition.
      
      As for performance, this implementation adds a single "js" instruction
      to the regular execution flow of a copy of the standard atomic_t refcount
      operations. (The non-"and_test" refcount_dec() function, which is uncommon
      in regular refcount design patterns, has an additional "jz" instruction
      to detect reaching exactly zero.) Since this is a forward jump, it is by
      default the non-predicted path, which will be reinforced by dynamic branch
      prediction. The result is this protection having virtually no measurable
      change in performance over standard atomic_t operations. The error path,
      located in .text.unlikely, saves the refcount location and then uses UD0
      to fire a refcount exception handler, which resets the refcount, handles
      reporting, and returns to regular execution. This keeps the changes to
      .text size minimal, avoiding return jumps and open-coded calls to the
      error reporting routine.
      
      Example assembly comparison:
      
      refcount_inc() before:
      
        .text:
        ffffffff81546149:       f0 ff 45 f4             lock incl -0xc(%rbp)
      
      refcount_inc() after:
      
        .text:
        ffffffff81546149:       f0 ff 45 f4             lock incl -0xc(%rbp)
        ffffffff8154614d:       0f 88 80 d5 17 00       js     ffffffff816c36d3
        ...
        .text.unlikely:
        ffffffff816c36d3:       48 8d 4d f4             lea    -0xc(%rbp),%rcx
        ffffffff816c36d7:       0f ff                   (bad)
      
      These are the cycle counts comparing a loop of refcount_inc() from 1
      to INT_MAX and back down to 0 (via refcount_dec_and_test()), between
      unprotected refcount_t (atomic_t), fully protected REFCOUNT_FULL
      (refcount_t-full), and this overflow-protected refcount (refcount_t-fast):
      
        2147483646 refcount_inc()s and 2147483647 refcount_dec_and_test()s:

                            cycles          protections
        atomic_t             82249267387    none
        refcount_t-fast      82211446892    overflow, untested dec-to-zero
        refcount_t-full     144814735193    overflow, untested dec-to-zero, inc-from-zero
      
      This code is a modified version of the x86 PAX_REFCOUNT atomic_t
      overflow defense from the last public patch of PaX/grsecurity, based
      on my understanding of the code. Changes or omissions from the original
      code are mine and don't reflect the original grsecurity/PaX code. Thanks
      to PaX Team for various suggestions for improvement for repurposing this
      code to be a refcount-only protection.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Eric Biggers <ebiggers3@gmail.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Hans Liljestrand <ishkamiel@gmail.com>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Serge E. Hallyn <serge@hallyn.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arozansk@redhat.com
      Cc: axboe@kernel.dk
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-arch <linux-arch@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20170815161924.GA133115@beast
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 02 Mar 2017, 1 commit
  8. 24 Feb 2017, 1 commit
  9. 08 Feb 2017, 1 commit
  10. 25 Jan 2017, 1 commit
  11. 18 Jan 2017, 1 commit
  12. 27 Nov 2016, 1 commit
    • P
      taint/module: Clean up global and module taint flags handling · 7fd8329b
      Committed by Petr Mladek
      The commit 66cc69e3 ("Fix: module signature vs tracepoints:
      add new TAINT_UNSIGNED_MODULE") updated module_taint_flags() to
      potentially print one more character. But it did not increase the
      size of the corresponding buffers in m_show() and print_modules().
      
      We recently made the same mistake when adding a taint flag
      for livepatching, see
      https://lkml.kernel.org/r/cfba2c823bb984690b73572aaae1db596b54a082.1472137475.git.jpoimboe@redhat.com
      
      Also, struct module uses an incompatible type for the mod->taints flags.
      It has survived since the commit 2bc2d61a ("[PATCH] list module
      taint flags in Oops/panic"). At that time, "int" was used for the global
      taint flags. But only the global taint flags were later changed
      to "unsigned long" by the commit 25ddbb18 ("Make the taint
      flags reliable").
      
      This patch defines TAINT_FLAGS_COUNT that can be used to create
      arrays and buffers of the right size. Note that we could not use
      enum because the taint flag indexes are used also in assembly code.
      
      Then it reworks the table that describes the taint flags. The TAINT_*
      numbers can now be used as the index, so the bit number no longer needs
      to be stored. Instead, we add information on whether the taint flag is
      also shown per-module.
      
      Finally, it uses "unsigned long", bit operations, and the updated
      taint_flags table also for mod->taints.
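
The table layout described above can be sketched as follows. This is an illustration under stated assumptions, not the kernel's code: the `SK_`/`sk_` names, the three-flag subset, and the letters chosen are hypothetical stand-ins. The point is that a single count macro sizes both the table and any output buffer, and that plain `#define` indexes remain usable from assembly, which an `enum` would not be.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag indexes; macros rather than an enum, as in the text. */
#define SK_TAINT_PROPRIETARY	0
#define SK_TAINT_WARN		1
#define SK_TAINT_OOT		2
#define SK_TAINT_FLAGS_COUNT	3

struct sk_taint_flag {
	char c_true;	/* character printed when the flag is set */
	bool module;	/* is the flag also shown per-module? */
};

/* The flag index doubles as the table index. */
static const struct sk_taint_flag sk_taint_flags[SK_TAINT_FLAGS_COUNT] = {
	[SK_TAINT_PROPRIETARY]	= { 'P', true },
	[SK_TAINT_WARN]		= { 'W', false },
	[SK_TAINT_OOT]		= { 'O', true },
};

/*
 * Write the per-module flag characters for @taints into @buf, which must
 * hold at least SK_TAINT_FLAGS_COUNT + 1 bytes. Returns the string length.
 */
static size_t sk_module_flags(unsigned long taints, char *buf)
{
	size_t len = 0;

	for (int i = 0; i < SK_TAINT_FLAGS_COUNT; i++) {
		if (sk_taint_flags[i].module && (taints & (1UL << i)))
			buf[len++] = sk_taint_flags[i].c_true;
	}
	buf[len] = '\0';
	return len;
}
```

Because the buffer is sized from the same count as the table, adding a flag cannot silently overflow the buffer, even though only some flags are ever printed per-module.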
      
      It is not optimal because only a few taint flags can be printed by
      module_taint_flags(). But it is better to be on the safe side. IMHO, it is
      not worth the optimization and this is a good compromise.
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Link: http://lkml.kernel.org/r/1474458442-21581-1-git-send-email-pmladek@suse.com
      [jeyu@redhat.com: fix broken lkml link in changelog]
      Signed-off-by: Jessica Yu <jeyu@redhat.com>
  13. 12 Oct 2016, 1 commit
    • H
      x86/panic: replace smp_send_stop() with kdump friendly version in panic path · 0ee59413
      Committed by Hidehiro Kawai
      Daniel Walker reported problems which happen when the
      crash_kexec_post_notifiers kernel option is enabled
      (https://lkml.org/lkml/2015/6/24/44).
      
      In that case, smp_send_stop() is called before entering kdump routines
      which assume other CPUs are still online.  As a result, for x86, kdump
      routines fail to save other CPUs' registers and disable virtualization
      extensions.
      
      To fix this problem, call a new kdump friendly function,
      crash_smp_send_stop(), instead of smp_send_stop() when
      crash_kexec_post_notifiers is enabled.  crash_smp_send_stop() is a weak
      function, and by default it just calls smp_send_stop().  Architecture code
      should override it so that kdump can work appropriately.  This patch only
      provides an x86-specific version.
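
The weak-function pattern described above can be sketched in userspace C. This is a minimal illustration, assuming a GCC- or Clang-compatible compiler for `__attribute__((weak))`; the `sk_` names and the counter are hypothetical and only mimic the shape of the kernel functions.

```c
#include <stdbool.h>

/* Counts how often the generic stop routine ran (illustration only). */
static int sk_generic_stops;

/* Stand-in for the generic smp_send_stop(). */
void sk_smp_send_stop(void)
{
	sk_generic_stops++;
}

/*
 * Weak default: falls back to the generic routine. An architecture can
 * provide a strong definition with the same name in another file, and
 * the linker will pick that one instead.
 */
__attribute__((weak)) void sk_crash_smp_send_stop(void)
{
	sk_smp_send_stop();
}

/* Panic-path choice between the two, keyed on the boot option. */
void sk_panic_stop_cpus(bool crash_kexec_post_notifiers)
{
	if (crash_kexec_post_notifiers)
		sk_crash_smp_send_stop();
	else
		sk_smp_send_stop();
}
```

With no strong override linked in, both branches end up in the generic routine, which matches the default behaviour the message describes for architectures that have not implemented their own version.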
      
      For Xen's PV kernel, just keep the current behavior.
      
      NOTES:
      
      - The right solution would be to place crash_smp_send_stop() before the
        __crash_kexec() invocation in all cases and remove smp_send_stop(), but
        we can't do that until all architectures implement their own
        crash_smp_send_stop().
      
      - crash_smp_send_stop()-like work is still needed by
        machine_crash_shutdown() because crash_kexec() can be called without
        entering panic().
      
      Fixes: f06e5153 (kernel/panic.c: add "crash_kexec_post_notifiers" option)
      Link: http://lkml.kernel.org/r/20160810080948.11028.15344.stgit@sysi4-13.yrl.intra.hitachi.co.jp
      Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Reported-by: Daniel Walker <dwalker@fifo99.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Daniel Walker <dwalker@fifo99.com>
      Cc: Xunlei Pang <xpang@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Daney <david.daney@cavium.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: "Steven J. Hill" <steven.hill@cavium.com>
      Cc: Corey Minyard <cminyard@mvista.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 03 Aug 2016, 1 commit
  15. 21 May 2016, 1 commit
    • P
      printk/nmi: flush NMI messages on the system panic · cf9b1106
      Committed by Petr Mladek
      In NMI context, printk() messages are stored into per-CPU buffers to
      avoid a possible deadlock.  They are normally flushed to the main ring
      buffer via an IRQ work.  But the work is never called when the system
      calls panic() in the very same NMI handler.
      
      This patch tries to flush NMI buffers before the crash dump is
      generated.  In this case it does not risk a double release and bails out
      when the logbuf_lock is already taken.  The aim is to get the messages
      into the main ring buffer when possible.  It makes them better
      accessible in the vmcore.
      
      Then the patch tries to flush the buffers a second time when other CPUs
      are down.  It might be more aggressive and reset logbuf_lock.  The aim
      is to get the messages available for the subsequent kmsg_dump() and
      console_flush_on_panic() calls.
      
      The patch causes vprintk_emit() to be called even in NMI context again.
      But it is done via printk_deferred() so that the console handling is
      skipped.  Consoles use internal locks and we could not prevent a
      deadlock easily.  They are explicitly called later when the crash dump
      is not generated, see console_flush_on_panic().
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jiri Kosina <jkosina@suse.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 23 Mar 2016, 1 commit
    • H
      panic: change nmi_panic from macro to function · ebc41f20
      Committed by Hidehiro Kawai
      Commit 1717f209 ("panic, x86: Fix re-entrance problem due to panic
      on NMI") and commit 58c5661f ("panic, x86: Allow CPUs to save
      registers even if looping in NMI context") introduced nmi_panic() which
      prevents concurrent/recursive execution of panic().  It also saves
      registers for the crash dump on x86.
      
      However, there are some cases where NMI handlers still use panic().
      This patch set partially replaces them with nmi_panic() in those cases.
      
      Even with this patchset applied, some NMI or similar handlers (e.g. the
      MCE handler) continue to use panic().  This is because I can't test them
      well and actual problems are unlikely to happen.  For example, the
      possibility that normal panic and panic on MCE happen simultaneously is
      very low.
      
      This patch (of 3):
      
      Convert nmi_panic() to a proper function and export it instead of
      exporting internal implementation details to modules, for obvious
      reasons.
      Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Acked-by: Michal Nazarewicz <mina86@mina86.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Javi Merino <javi.merino@arm.com>
      Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
      Cc: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 18 Mar 2016, 1 commit
  18. 17 Jan 2016, 1 commit
    • T
      printk: do cond_resched() between lines while outputting to consoles · 8d91f8b1
      Committed by Tejun Heo
      @console_may_schedule tracks whether console_sem was acquired through
      lock or trylock.  If the former, we're inside a sleepable context and
      console_conditional_schedule() performs cond_resched().  This allows
      console drivers which use console_lock for synchronization to yield
      while performing time-consuming operations such as scrolling.
      
      However, the actual console outputting is performed while holding
      irq-safe logbuf_lock, so console_unlock() clears @console_may_schedule
      before starting outputting lines.  Also, only a few drivers call
      console_conditional_schedule() to begin with.  This means that when a
      lot of lines need to be output by console_unlock(), for example on a
      console registration, the task doing console_unlock() may not yield for
      a long time on a non-preemptible kernel.
      
      If this happens with a slow console device, for example a serial
      console, the outputting task may occupy the cpu for a very long time.
      Long enough to trigger softlockup and/or RCU stall warnings, which in
      turn pile more messages, sometimes enough to trigger the next cycle of
      warnings incapacitating the system.
      
      Fix it by making console_unlock() insert cond_resched() between lines if
      @console_may_schedule.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Calvin Owens <calvinowens@fb.com>
      Acked-by: Jan Kara <jack@suse.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Kyle McMartin <kyle@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 19 Dec 2015, 3 commits
    • H
      kexec: Fix race between panic() and crash_kexec() · 7bbee5ca
      Committed by Hidehiro Kawai
      Currently, panic() and crash_kexec() can be called at the same time.
      For example (x86 case):
      
      CPU 0:
        oops_end()
          crash_kexec()
            mutex_trylock() // acquired
              nmi_shootdown_cpus() // stop other CPUs
      
      CPU 1:
        panic()
          crash_kexec()
            mutex_trylock() // failed to acquire
          smp_send_stop() // stop other CPUs
          infinite loop
      
      If CPU 1 calls smp_send_stop() before nmi_shootdown_cpus(), kdump
      fails.
      
      In another case:
      
      CPU 0:
        oops_end()
          crash_kexec()
            mutex_trylock() // acquired
              <NMI>
              io_check_error()
                panic()
                  crash_kexec()
                    mutex_trylock() // failed to acquire
                  infinite loop
      
      Clearly, this is an undesirable result.
      
      To fix this problem, this patch changes crash_kexec() to exclude others
      by using the panic_cpu atomic.
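
The exclusion via a single atomic owner can be sketched with C11 atomics. This is a userspace model under stated assumptions, not the kernel's code: the `sk_` names are hypothetical, the CPU number is passed in as a plain integer, and the actual crash path is omitted.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sentinel meaning "no CPU owns the panic/crash path". */
#define SK_PANIC_CPU_INVALID (-1)

static atomic_int sk_panic_cpu = SK_PANIC_CPU_INVALID;

/*
 * Try to become the single owner of the panic/crash path.
 * Only the first caller wins; everyone else sees the winner's CPU
 * number already stored and is turned away.
 */
static bool sk_claim_panic_cpu(int this_cpu)
{
	int expected = SK_PANIC_CPU_INVALID;

	return atomic_compare_exchange_strong(&sk_panic_cpu, &expected,
					      this_cpu);
}

/* Give ownership back, e.g. when no crash kernel was loaded after all. */
static void sk_release_panic_cpu(void)
{
	atomic_store(&sk_panic_cpu, SK_PANIC_CPU_INVALID);
}
```

In the two scenarios above, whichever of CPU 0 and CPU 1 claims ownership first proceeds, and the other never reaches the code that stops the remaining CPUs, so the ordering problem cannot arise.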
      Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: kexec@lists.infradead.org
      Cc: linux-doc@vger.kernel.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Minfei Huang <mnfhuang@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: x86-ml <x86@kernel.org>
      Link: http://lkml.kernel.org/r/20151210014630.25437.94161.stgit@softrs
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • H
      panic, x86: Allow CPUs to save registers even if looping in NMI context · 58c5661f
      Committed by Hidehiro Kawai
      Currently, kdump_nmi_shootdown_cpus(), a subroutine of crash_kexec(),
      sends an NMI IPI to CPUs which haven't called panic() to stop them,
      save their register information and do some cleanups for crash dumping.
      However, if such a CPU is infinitely looping in NMI context, we fail to
      save its register information into the crash dump.
      
      For example, this can happen when unknown NMIs are broadcast to all
      CPUs as follows:
      
        CPU 0                             CPU 1
        ===========================       ==========================
        receive an unknown NMI
        unknown_nmi_error()
          panic()                         receive an unknown NMI
            spin_trylock(&panic_lock)     unknown_nmi_error()
            crash_kexec()                   panic()
                                              spin_trylock(&panic_lock)
                                              panic_smp_self_stop()
                                                infinite loop
              kdump_nmi_shootdown_cpus()
                issue NMI IPI -----------> blocked until IRET
                                                infinite loop...
      
      Here, since CPU 1 is in NMI context, the second NMI from CPU 0 is
      blocked until CPU 1 executes IRET. However, CPU 1 never executes IRET,
      so the NMI is not handled and the callback function to save registers is
      never called.
      
      In practice, this can happen on some servers which broadcast NMIs to all
      CPUs when the NMI button is pushed.
      
      To save registers in this case, we need to:
      
        a) Return from NMI handler instead of looping infinitely
        or
        b) Call the callback function directly from the infinite loop
      
      Inherently, a) is risky because NMI is also used to prevent corrupted
      data from being propagated to devices.  So, we chose b).
      
      This patch does the following:
      
      1. Move the infinite looping of CPUs which haven't called panic() in NMI
         context (actually done by panic_smp_self_stop()) outside of panic() to
         enable us to refer to pt_regs. Please note that panic_smp_self_stop() is
         still used for normal context.
      
      2. Call a callback of kdump_nmi_shootdown_cpus() directly to save
         registers and do some cleanups after setting waiting_for_crash_ipi which
         is used for counting down the number of CPUs which handled the callback.
      Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Javi Merino <javi.merino@arm.com>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: kexec@lists.infradead.org
      Cc: linux-doc@vger.kernel.org
      Cc: lkml <linux-kernel@vger.kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Stefan Lippers-Hollmann <s.l-h@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Link: http://lkml.kernel.org/r/20151210014628.25437.75256.stgit@softrs
      [ Cleanup comments, fixup formatting. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • H
      panic, x86: Fix re-entrance problem due to panic on NMI · 1717f209
      Committed by Hidehiro Kawai
      If panic on NMI happens just after panic() on the same CPU, panic() is
      recursively called. As a result, the kernel stalls after failing to
      acquire panic_lock.
      
      To avoid this problem, don't call panic() in NMI context if we've
      already entered panic().
      
      For that, introduce nmi_panic() macro to reduce code duplication. In
      the case of panic on NMI, don't return from NMI handlers if another CPU
      already panicked.
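
The three outcomes described above (panic, stay in the NMI handler, or simply return) can be sketched as a decision function. This is a userspace model with hypothetical `sk_`/`SK_` names; it assumes the panicking CPU is recorded in one atomic variable and leaves out what each outcome actually does.

```c
#include <stdatomic.h>

#define SK_NO_CPU (-1)	/* hypothetical "nobody has panicked yet" marker */

enum sk_nmi_action {
	SK_DO_PANIC,	/* first CPU to panic: go ahead */
	SK_SELF_STOP,	/* another CPU already panicked: do not return from NMI */
	SK_RETURN,	/* this CPU is already in panic(): avoid recursion */
};

/* Decide what an NMI handler on @this_cpu should do on a fatal condition. */
static enum sk_nmi_action sk_nmi_panic_action(atomic_int *panic_cpu,
					      int this_cpu)
{
	int old = SK_NO_CPU;

	/* On failure, @old is updated to the CPU that already owns the panic. */
	if (atomic_compare_exchange_strong(panic_cpu, &old, this_cpu))
		return SK_DO_PANIC;
	if (old != this_cpu)
		return SK_SELF_STOP;
	return SK_RETURN;
}
```

The same-CPU case is the re-entrance this commit fixes: the second, nested request is recognised and dropped instead of waiting on a lock the interrupted context already holds.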
      Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Javi Merino <javi.merino@arm.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: kexec@lists.infradead.org
      Cc: linux-doc@vger.kernel.org
      Cc: lkml <linux-kernel@vger.kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Link: http://lkml.kernel.org/r/20151210014626.25437.13302.stgit@softrs
      [ Cleanup comments, fixup formatting. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  20. 21 Nov 2015, 1 commit
  21. 07 Nov 2015, 1 commit
    • V
      panic: release stale console lock to always get the logbuf printed out · 08d78658
      Committed by Vitaly Kuznetsov
      In some cases we may end up killing the CPU holding the console lock
      while still having valuable data in logbuf. E.g. I'm observing the
      following:
      
      - A crash is happening on one CPU and console_unlock() is being called on
        some other.
      
      - console_unlock() tries to print out the buffer before releasing the lock
        and on a slow console it takes time.
      
      - In the meantime the crashing CPU does lots of printk()-s with valuable
        data (which go to the logbuf) and sends IPIs to all other CPUs.
      
      - console_unlock() finishes printing the previous chunk and enables
        interrupts before trying to print out the rest; the CPU catches the IPI
        and never releases the console lock.
      
      This is not the only possible case: in VT/fb subsystems we have many other
      console_lock()/console_unlock() users.  Non-masked interrupts (or
      receiving NMI in case of extreme slowness) will have the same result.
      Getting the whole console buffer printed out on crash should be top
      priority.
      
      [akpm@linux-foundation.org: tweak comment text]
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  22. 01 Jul 2015, 2 commits
  23. 22 Dec 2014, 1 commit
  24. 11 Dec 2014, 1 commit
    • P
      kernel: add panic_on_warn · 9e3961a0
      Committed by Prarit Bhargava
      There have been several times where I have had to rebuild a kernel to
      cause a panic when hitting a WARN() in the code in order to get a crash
      dump from a system.  Sometimes this is easy to do, other times (such as
      in the case of a remote admin) it is not trivial to send new images to
      the user.
      
      A much easier method would be a switch to change the WARN() over to a
      panic.  This makes debugging easier in that I can now test the actual
      image the WARN() was seen on and I do not have to engage in remote
      debugging.
      
      This patch adds a panic_on_warn kernel parameter and a
      /proc/sys/kernel/panic_on_warn sysctl which, when set, make the
      warn_slowpath_common() path call panic().  The function will still print
      out the location of the warning.
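
The control flow of such a switch can be sketched in userspace C. This is only a model: `sk_panic_on_warn` and `sk_warn()` are hypothetical names, the messages are simplified, and instead of halting the function returns whether a panic would follow.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stands in for the boot parameter / sysctl value (0 = off, 1 = on). */
static int sk_panic_on_warn;

/*
 * Model of the slow warning path: always print where the warning fired,
 * then escalate only if the switch is set. Returns true when the real
 * code would go on to panic.
 */
static bool sk_warn(const char *file, int line)
{
	fprintf(stderr, "WARNING: at %s:%d\n", file, line);

	if (sk_panic_on_warn) {
		fprintf(stderr,
			"Kernel panic - not syncing: panic_on_warn set ...\n");
		return true;
	}
	return false;
}
```

Printing the location before checking the switch is what keeps the warning line visible ahead of the panic output.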
      
      An example of the panic_on_warn output:
      
      The first line below is from the WARN_ON() to output the WARN_ON()'s
      location.  After that the panic() output is displayed.
      
          WARNING: CPU: 30 PID: 11698 at /home/prarit/dummy_module/dummy-module.c:25 init_dummy+0x1f/0x30 [dummy_module]()
          Kernel panic - not syncing: panic_on_warn set ...
      
          CPU: 30 PID: 11698 Comm: insmod Tainted: G        W  OE  3.17.0+ #57
          Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
           0000000000000000 000000008e3f87df ffff88080f093c38 ffffffff81665190
           0000000000000000 ffffffff818aea3d ffff88080f093cb8 ffffffff8165e2ec
           ffffffff00000008 ffff88080f093cc8 ffff88080f093c68 000000008e3f87df
          Call Trace:
           [<ffffffff81665190>] dump_stack+0x46/0x58
           [<ffffffff8165e2ec>] panic+0xd0/0x204
           [<ffffffffa038e05f>] ? init_dummy+0x1f/0x30 [dummy_module]
           [<ffffffff81076b90>] warn_slowpath_common+0xd0/0xd0
           [<ffffffffa038e040>] ? dummy_greetings+0x40/0x40 [dummy_module]
           [<ffffffff81076c8a>] warn_slowpath_null+0x1a/0x20
           [<ffffffffa038e05f>] init_dummy+0x1f/0x30 [dummy_module]
           [<ffffffff81002144>] do_one_initcall+0xd4/0x210
           [<ffffffff811b52c2>] ? __vunmap+0xc2/0x110
           [<ffffffff810f8889>] load_module+0x16a9/0x1b30
           [<ffffffff810f3d30>] ? store_uevent+0x70/0x70
           [<ffffffff810f49b9>] ? copy_module_from_fd.isra.44+0x129/0x180
           [<ffffffff810f8ec6>] SyS_finit_module+0xa6/0xd0
           [<ffffffff8166cf29>] system_call_fastpath+0x12/0x17
      
      Successfully tested by me.
      
      hpa said: There is another very valid use for this: many operators would
      rather have a machine shut down than be potentially compromised either
      functionally or security-wise.
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Acked-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. 14 Nov 2014, 1 commit
  26. 09 Aug 2014, 1 commit
  27. 07 Jun 2014, 1 commit
  28. 08 Apr 2014, 1 commit
  29. 31 Mar 2014, 1 commit
    • R
      Use 'E' instead of 'X' for unsigned module taint flag. · 57673c2b
      Committed by Rusty Russell
      Takashi Iwai <tiwai@suse.de> says:
      > The letter 'X' has been already used for SUSE kernels for very long
      > time, to indicate the external supported modules.  Can the new flag be
      > changed to another letter for avoiding conflict...?
      > (BTW, we also use 'N' for "no support", too.)
      
      Note: this code should be cleaned up, so we don't have such maps in
      three places!
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  30. 21 Mar 2014, 1 commit
  31. 13 Mar 2014, 1 commit
    • M
      Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE · 66cc69e3
      Committed by Mathieu Desnoyers
      Users have reported being unable to trace non-signed modules loaded
      within a kernel supporting module signature.
      
      This is caused by tracepoint.c:tracepoint_module_coming() refusing to
      take into account tracepoints sitting within force-loaded modules
      (TAINT_FORCED_MODULE). The reason for this check, in the first place, is
      that a force-loaded module may have a struct module incompatible with
      the layout expected by the kernel, and can thus cause a kernel crash
      upon forced load of that module on a kernel with CONFIG_TRACEPOINTS=y.
      
      Tracepoints, however, specifically accept TAINT_OOT_MODULE and
      TAINT_CRAP, since those modules do not lead to the "very likely system
      crash" issue cited above for force-loaded modules.
      
      With kernels having CONFIG_MODULE_SIG=y (signed modules), a non-signed
      module is tainted re-using the TAINT_FORCED_MODULE taint flag.
      Unfortunately, this means that Tracepoints treat that module as a
      force-loaded module, and thus silently refuse to consider any tracepoint
      within this module.
      
      Since an unsigned module does not fit within the "very likely system
      crash" category of tainting, add a new TAINT_UNSIGNED_MODULE taint flag
      to specifically address this taint behavior, and accept those modules
      within Tracepoints. We use the letter 'X' as a taint flag character for
      a module being loaded that doesn't know how to sign its name (proposed
      by Steven Rostedt).
      
      Also add the missing 'O' entry to trace event show_module_flags() list
      for the sake of completeness.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      NAKed-by: Ingo Molnar <mingo@redhat.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: David Howells <dhowells@redhat.com>
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  32. 14 Feb 2014, 1 commit