1. 26 8月, 2017 1 次提交
    • J
      futex: Remove duplicated code and fix undefined behaviour · 30d6e0a4
      Jiri Slaby 提交于
      There is code duplicated over all architecture's headers for
      futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
      and comparison of the result.
      
      Remove this duplication and leave up to the arches only the needed
      assembly which is now in arch_futex_atomic_op_inuser.
      
      This effectively distributes the Will Deacon's arm64 fix for undefined
      behaviour reported by UBSAN to all architectures. The fix was done in
      commit 5f16a046 (arm64: futex: Fix undefined behaviour with
      FUTEX_OP_OPARG_SHIFT usage). Look there for an example dump.
      
      And as suggested by Thomas, check for negative oparg too, because it was
      also reported to cause undefined behaviour report.
      
      Note that s390 removed access_ok check in d12a2970 ("s390/uaccess:
      remove pointless access_ok() checks") as access_ok there returns true.
      We introduce it back to the helper for the sake of simplicity (it gets
      optimized away anyway).
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
      Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
      Reviewed-by: NDarren Hart (VMware) <dvhart@infradead.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com> [core/arm64]
      Cc: linux-mips@linux-mips.org
      Cc: Rich Felker <dalias@libc.org>
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: peterz@infradead.org
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: sparclinux@vger.kernel.org
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: linux-s390@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: linux-hexagon@vger.kernel.org
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-snps-arc@lists.infradead.org
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-xtensa@linux-xtensa.org
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: openrisc@lists.librecores.org
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: linux-parisc@vger.kernel.org
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: linux-alpha@vger.kernel.org
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/20170824073105.3901-1-jslaby@suse.cz
      30d6e0a4
  2. 24 8月, 2017 1 次提交
  3. 23 8月, 2017 5 次提交
    • A
      ARM: at91: don't select CONFIG_ARM_CPU_SUSPEND for old platforms · dbeb0c8e
      Arnd Bergmann 提交于
      My previous patch fixed a link error for all at91 platforms when
      CONFIG_ARM_CPU_SUSPEND was not set, however this caused another
      problem on a configuration that enabled CONFIG_ARCH_AT91 but none
      of the individual SoCs, and that also enabled CPU_ARM720 as
      the only CPU:
      
      warning: (ARCH_AT91 && SOC_IMX23 && SOC_IMX28 && ARCH_PXA && MACH_MVEBU_V7 && SOC_IMX6 && ARCH_OMAP3 && ARCH_OMAP4 && SOC_OMAP5 && SOC_AM33XX && SOC_DRA7XX && ARCH_EXYNOS3 && ARCH_EXYNOS4 && EXYNOS5420_MCPM && EXYNOS_CPU_SUSPEND && ARCH_VEXPRESS_TC2_PM && ARM_BIG_LITTLE_CPUIDLE && ARM_HIGHBANK_CPUIDLE && QCOM_PM) selects ARM_CPU_SUSPEND which has unmet direct dependencies (ARCH_SUSPEND_POSSIBLE)
      arch/arm/kernel/sleep.o: In function `cpu_resume':
      (.text+0xf0): undefined reference to `cpu_arm720_suspend_size'
      arch/arm/kernel/suspend.o: In function `__cpu_suspend_save':
      suspend.c:(.text+0x134): undefined reference to `cpu_arm720_do_suspend'
      
      This improves the hack some more by only selecting ARM_CPU_SUSPEND
      for the part that requires it, and changing pm.c to drop the
      contents of unused init functions so we no longer refer to
      cpu_resume on at91 platforms that don't need it.
      
      Fixes: cc7a938f ("ARM: at91: select CONFIG_ARM_CPU_SUSPEND")
      Acked-by: NAlexandre Belloni <alexandre.belloni@free-electrons.com>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      dbeb0c8e
    • C
      arm64: kaslr: Adjust the offset to avoid Image across alignment boundary · a067d94d
      Catalin Marinas 提交于
      With 16KB pages and a kernel Image larger than 16MB, the current
      kaslr_early_init() logic for avoiding mappings across swapper table
      boundaries fails since increasing the offset by kimg_sz just moves the
      problem to the next boundary.
      
      This patch rounds the offset down to (1 << SWAPPER_TABLE_SHIFT) if the
      Image crosses a PMD_SIZE boundary.
      
      Fixes: afd0e5a8 ("arm64: kaslr: Fix up the kernel image alignment")
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      a067d94d
    • A
      arm64: kaslr: ignore modulo offset when validating virtual displacement · 4a23e56a
      Ard Biesheuvel 提交于
      In the KASLR setup routine, we ensure that the early virtual mapping
      of the kernel image does not cover more than a single table entry at
      the level above the swapper block level, so that the assembler routines
      involved in setting up this mapping can remain simple.
      
      In this calculation we add the proposed KASLR offset to the values of
      the _text and _end markers, and reject it if they would end up falling
      in different swapper table sized windows.
      
      However, when taking the addresses of _text and _end, the modulo offset
      (the physical displacement modulo 2 MB) is already accounted for, and
      so adding it again results in incorrect results. So disregard the modulo
      offset from the calculation.
      
      Fixes: 08cdac61 ("arm64: relocatable: deal with physically misaligned ...")
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Tested-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      4a23e56a
    • M
      arm64: mm: abort uaccess retries upon fatal signal · 289d07a2
      Mark Rutland 提交于
      When there's a fatal signal pending, arm64's do_page_fault()
      implementation returns 0. The intent is that we'll return to the
      faulting userspace instruction, delivering the signal on the way.
      
      However, if we take a fatal signal during fixing up a uaccess, this
      results in a return to the faulting kernel instruction, which will be
      instantly retried, resulting in the same fault being taken forever. As
      the task never reaches userspace, the signal is not delivered, and the
      task is left unkillable. While the task is stuck in this state, it can
      inhibit the forward progress of the system.
      
      To avoid this, we must ensure that when a fatal signal is pending, we
      apply any necessary fixup for a faulting kernel instruction. Thus we
      will return to an error path, and it is up to that code to make forward
      progress towards delivering the fatal signal.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: NSteve Capper <steve.capper@arm.com>
      Tested-by: NSteve Capper <steve.capper@arm.com>
      Reviewed-by: NJames Morse <james.morse@arm.com>
      Tested-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      289d07a2
    • D
      arm64: fpsimd: Prevent registers leaking across exec · 09662210
      Dave Martin 提交于
      There are some tricky dependencies between the different stages of
      flushing the FPSIMD register state during exec, and these can race
      with context switch in ways that can cause the old task's regs to
      leak across.  In particular, a context switch during the memset() can
      cause some of the task's old FPSIMD registers to reappear.
      
      Disabling preemption for this small window would be no big deal for
      performance: preemption is already disabled for similar scenarios
      like updating the FPSIMD registers in sigreturn.
      
      So, instead of rearranging things in ways that might swap existing
      subtle bugs for new ones, this patch just disables preemption
      around the FPSIMD state flushing so that races of this type can't
      occur here.  This brings fpsimd_flush_thread() into line with other
      code paths.
      
      Cc: stable@vger.kernel.org
      Fixes: 674c242c ("arm64: flush FP/SIMD state correctly after execve()")
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      09662210
  4. 22 8月, 2017 1 次提交
    • T
      sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus() · 2dc77533
      Thomas Petazzoni 提交于
      When building the kernel for Sparc using gcc 7.x, the build fails
      with:
      
      arch/sparc/kernel/pcic.c: In function ‘pcibios_fixup_bus’:
      arch/sparc/kernel/pcic.c:647:8: error: ‘cmd’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
          cmd |= PCI_COMMAND_IO;
              ^~
      
      The simplified code looks like this:
      
      unsigned int cmd;
      [...]
      pcic_read_config(dev->bus, dev->devfn, PCI_COMMAND, 2, &cmd);
      [...]
      cmd |= PCI_COMMAND_IO;
      
      I.e, the code assumes that pcic_read_config() will always initialize
      cmd. But it's not the case. Looking at pcic_read_config(), if
      bus->number is != 0 or if the size is not one of 1, 2 or 4, *val will
      not be initialized.
      
      As a simple fix, we initialize cmd to zero at the beginning of
      pcibios_fixup_bus.
      Signed-off-by: NThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2dc77533
  5. 19 8月, 2017 2 次提交
  6. 18 8月, 2017 3 次提交
  7. 17 8月, 2017 6 次提交
    • K
      locking/refcounts, x86/asm: Implement fast refcount overflow protection · 7a46ec0e
      Kees Cook 提交于
      This implements refcount_t overflow protection on x86 without a noticeable
      performance impact, though without the fuller checking of REFCOUNT_FULL.
      
      This is done by duplicating the existing atomic_t refcount implementation
      but with normally a single instruction added to detect if the refcount
      has gone negative (e.g. wrapped past INT_MAX or below zero). When detected,
      the handler saturates the refcount_t to INT_MIN / 2. With this overflow
      protection, the erroneous reference release that would follow a wrap back
      to zero is blocked from happening, avoiding the class of refcount-overflow
      use-after-free vulnerabilities entirely.
      
      Only the overflow case of refcounting can be perfectly protected, since
      it can be detected and stopped before the reference is freed and left to
      be abused by an attacker. There isn't a way to block early decrements,
      and while REFCOUNT_FULL stops increment-from-zero cases (which would
      be the state _after_ an early decrement and stops potential double-free
      conditions), this fast implementation does not, since it would require
      the more expensive cmpxchg loops. Since the overflow case is much more
      common (e.g. missing a "put" during an error path), this protection
      provides real-world protection. For example, the two public refcount
      overflow use-after-free exploits published in 2016 would have been
      rendered unexploitable:
      
        http://perception-point.io/2016/01/14/analysis-and-exploitation-of-a-linux-kernel-vulnerability-cve-2016-0728/
      
        http://cyseclabs.com/page?n=02012016
      
      This implementation does, however, notice an unchecked decrement to zero
      (i.e. caller used refcount_dec() instead of refcount_dec_and_test() and it
      resulted in a zero). Decrements under zero are noticed (since they will
      have resulted in a negative value), though this only indicates that a
      use-after-free may have already happened. Such notifications are likely
      avoidable by an attacker that has already exploited a use-after-free
      vulnerability, but it's better to have them reported than allow such
      conditions to remain universally silent.
      
      On first overflow detection, the refcount value is reset to INT_MIN / 2
      (which serves as a saturation value) and a report and stack trace are
      produced. When operations detect only negative value results (such as
      changing an already saturated value), saturation still happens but no
      notification is performed (since the value was already saturated).
      
      On the matter of races, since the entire range beyond INT_MAX but before
      0 is negative, every operation at INT_MIN / 2 will trap, leaving no
      overflow-only race condition.
      
      As for performance, this implementation adds a single "js" instruction
      to the regular execution flow of a copy of the standard atomic_t refcount
      operations. (The non-"and_test" refcount_dec() function, which is uncommon
      in regular refcount design patterns, has an additional "jz" instruction
      to detect reaching exactly zero.) Since this is a forward jump, it is by
      default the non-predicted path, which will be reinforced by dynamic branch
      prediction. The result is this protection having virtually no measurable
      change in performance over standard atomic_t operations. The error path,
      located in .text.unlikely, saves the refcount location and then uses UD0
      to fire a refcount exception handler, which resets the refcount, handles
      reporting, and returns to regular execution. This keeps the changes to
      .text size minimal, avoiding return jumps and open-coded calls to the
      error reporting routine.
      
      Example assembly comparison:
      
      refcount_inc() before:
      
        .text:
        ffffffff81546149:       f0 ff 45 f4             lock incl -0xc(%rbp)
      
      refcount_inc() after:
      
        .text:
        ffffffff81546149:       f0 ff 45 f4             lock incl -0xc(%rbp)
        ffffffff8154614d:       0f 88 80 d5 17 00       js     ffffffff816c36d3
        ...
        .text.unlikely:
        ffffffff816c36d3:       48 8d 4d f4             lea    -0xc(%rbp),%rcx
        ffffffff816c36d7:       0f ff                   (bad)
      
      These are the cycle counts comparing a loop of refcount_inc() from 1
      to INT_MAX and back down to 0 (via refcount_dec_and_test()), between
      unprotected refcount_t (atomic_t), fully protected REFCOUNT_FULL
      (refcount_t-full), and this overflow-protected refcount (refcount_t-fast):
      
        2147483646 refcount_inc()s and 2147483647 refcount_dec_and_test()s:
      		    cycles		protections
        atomic_t           82249267387	none
        refcount_t-fast    82211446892	overflow, untested dec-to-zero
        refcount_t-full   144814735193	overflow, untested dec-to-zero, inc-from-zero
      
      This code is a modified version of the x86 PAX_REFCOUNT atomic_t
      overflow defense from the last public patch of PaX/grsecurity, based
      on my understanding of the code. Changes or omissions from the original
      code are mine and don't reflect the original grsecurity/PaX code. Thanks
      to PaX Team for various suggestions for improvement for repurposing this
      code to be a refcount-only protection.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Eric Biggers <ebiggers3@gmail.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Hans Liljestrand <ishkamiel@gmail.com>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Serge E. Hallyn <serge@hallyn.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arozansk@redhat.com
      Cc: axboe@kernel.dk
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-arch <linux-arch@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20170815161924.GA133115@beastSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7a46ec0e
    • A
      x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt' · 187e91fe
      Alexander Potapenko 提交于
      __startup_64() is normally using fixup_pointer() to access globals in a
      position-independent fashion. However 'next_early_pgt' was accessed
      directly, which wasn't guaranteed to work.
      
      Luckily GCC was generating a R_X86_64_PC32 PC-relative relocation for
      'next_early_pgt', but Clang emitted a R_X86_64_32S, which led to
      accessing invalid memory and rebooting the kernel.
      Signed-off-by: NAlexander Potapenko <glider@google.com>
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Davidson <md@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: c88d7150 ("x86/boot/64: Rewrite startup_64() in C")
      Link: http://lkml.kernel.org/r/20170816190808.131748-1-glider@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      187e91fe
    • O
      x86/elf: Remove the unnecessary ADDR_NO_RANDOMIZE checks · 01578e36
      Oleg Nesterov 提交于
      The ADDR_NO_RANDOMIZE checks in stack_maxrandom_size() and
      randomize_stack_top() are not required.
      
      PF_RANDOMIZE is set by load_elf_binary() only if ADDR_NO_RANDOMIZE is not
      set, no need to re-check after that.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NDmitry Safonov <dsafonov@virtuozzo.com>
      Cc: stable@vger.kernel.org
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170815154011.GB1076@redhat.com
      01578e36
    • O
      x86: Fix norandmaps/ADDR_NO_RANDOMIZE · 47ac5484
      Oleg Nesterov 提交于
      Documentation/admin-guide/kernel-parameters.txt says:
      
          norandmaps  Don't use address space randomization. Equivalent
                      to echo 0 > /proc/sys/kernel/randomize_va_space
      
      but it doesn't work because arch_rnd() which is used to randomize
      mm->mmap_base returns a random value unconditionally. And as Kirill
      pointed out, ADDR_NO_RANDOMIZE is broken by the same reason.
      
      Just shift the PF_RANDOMIZE check from arch_mmap_rnd() to arch_rnd().
      
      Fixes: 1b028f78 ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Reviewed-by: NDmitry Safonov <dsafonov@virtuozzo.com>
      Cc: stable@vger.kernel.org
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20170815153952.GA1076@redhat.com
      47ac5484
    • T
      sparc64: remove unnecessary log message · 6170a506
      Tushar Dave 提交于
      There is no need to log message if ATU hvapi couldn't get register.
      Unlike PCI hvapi, ATU hvapi registration failure is not hard error.
      Even if ATU hvapi registration fails (on system with ATU or without
      ATU) system continues with legacy IOMMU. So only log message when
      ATU hvapi successfully get registered.
      Signed-off-by: NTushar Dave <tushar.n.dave@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6170a506
    • D
      sparc64: Don't clibber fixed registers in __multi4. · 79db7958
      David S. Miller 提交于
      %g4 and %g5 are fixed registers used by the kernel for the thread
      pointer and the per-cpu offset.  Use %o4 and %g7 instead.
      
      Diagnosis by Anthony Yznaga.
      
      Fixes: 1b4af13f ("sparc64: Add __multi3 for gcc 7.x and later.")
      Reported-by: NAnatoly Pugachev <matorola@gmail.com>
      Tested-by: NAnatoly Pugachev <matorola@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      79db7958
  8. 16 8月, 2017 1 次提交
  9. 15 8月, 2017 1 次提交
  10. 14 8月, 2017 1 次提交
  11. 11 8月, 2017 11 次提交
  12. 10 8月, 2017 7 次提交