1. 23 4月, 2017 1 次提交
    • I
      Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation" · 6dd29b3d
      Ingo Molnar 提交于
      This reverts commit 2947ba05.
      
      Dan Williams reported dax-pmem kernel warnings with the following signature:
      
         WARNING: CPU: 8 PID: 245 at lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0x1f5/0x200
         percpu ref (dax_pmem_percpu_release [dax_pmem]) <= 0 (0) after switching to atomic
      
      ... and bisected it to this commit, which suggests possible memory corruption
      caused by the x86 fast-GUP conversion.
      
      He also pointed out:
      
       "
        This is similar to the backtrace when we were not properly handling
        pud faults and was fixed with this commit: 220ced16 "mm: fix
        get_user_pages() vs device-dax pud mappings"
      
        I've found some missing _devmap checks in the generic
        get_user_pages_fast() path, but this does not fix the regression
        [...]
       "
      
      So given that there are known bugs, and a pretty robust looking bisection
      points to this commit suggesting that are unknown bugs in the conversion
      as well, revert it for the time being - we'll re-try in v4.13.
      Reported-by: NDan Williams <dan.j.williams@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: aneesh.kumar@linux.vnet.ibm.com
      Cc: dann.frazier@canonical.com
      Cc: dave.hansen@intel.com
      Cc: steve.capper@linaro.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6dd29b3d
  2. 13 4月, 2017 1 次提交
  3. 12 4月, 2017 1 次提交
    • J
      x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space · 5ed386ec
      Joerg Roedel 提交于
      When this function fails it just sends a SIGSEGV signal to
      user-space using force_sig(). This signal is missing
      essential information about the cause, e.g. the trap_nr or
      an error code.
      
      Fix this by propagating the error to the only caller of
      mpx_handle_bd_fault(), do_bounds(), which sends the correct
      SIGSEGV signal to the process.
      Signed-off-by: NJoerg Roedel <jroedel@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: fe3d197f ('x86, mpx: On-demand kernel allocation of bounds tables')
      Link: http://lkml.kernel.org/r/1491488362-27198-1-git-send-email-joro@8bytes.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5ed386ec
  4. 08 4月, 2017 1 次提交
  5. 04 4月, 2017 2 次提交
  6. 03 4月, 2017 2 次提交
  7. 31 3月, 2017 1 次提交
    • B
      x86/boot/32: Flip the logic in test_wp_bit() · 952a6c2c
      Borislav Petkov 提交于
      ... to have a natural "likely()" in the code flow and thus have the
      success case with a branch 99.999% of the times non-taken and function
      return code following it instead of jumping to it each time.
      
      This puts the panic() call at the end of the function - it is going to
      be practically unreachable anyway.
      
      The C code is a bit more readable too.
      
      No functionality change.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: thgarnie@google.com
      Link: http://lkml.kernel.org/r/20170330080101.ywsf5rg6ilzu4itk@pd.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>
      952a6c2c
  8. 30 3月, 2017 2 次提交
  9. 27 3月, 2017 3 次提交
  10. 24 3月, 2017 1 次提交
    • B
      x86/mm/KASLR: Exclude EFI region from KASLR VA space randomization · a46f60d7
      Baoquan He 提交于
      Currently KASLR is enabled on three regions: the direct mapping of physical
      memory, vamlloc and vmemmap. However the EFI region is also mistakenly
      included for VA space randomization because of misusing EFI_VA_START macro
      and assuming EFI_VA_START < EFI_VA_END.
      
      (This breaks kexec and possibly other things that rely on stable addresses.)
      
      The EFI region is reserved for EFI runtime services virtual mapping which
      should not be included in KASLR ranges. In Documentation/x86/x86_64/mm.txt,
      we can see:
      
        ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
      
      EFI uses the space from -4G to -64G thus EFI_VA_START > EFI_VA_END,
      Here EFI_VA_START = -4G, and EFI_VA_END = -64G.
      
      Changing EFI_VA_START to EFI_VA_END in mm/kaslr.c fixes this problem.
      Signed-off-by: NBaoquan He <bhe@redhat.com>
      Reviewed-by: NBhupesh Sharma <bhsharma@redhat.com>
      Acked-by: NDave Young <dyoung@redhat.com>
      Acked-by: NThomas Garnier <thgarnie@google.com>
      Cc: <stable@vger.kernel.org> #4.8+
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1490331592-31860-1-git-send-email-bhe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a46f60d7
  11. 18 3月, 2017 2 次提交
  12. 16 3月, 2017 2 次提交
    • T
      x86/mpx: Make unnecessarily global function static · 6bce725a
      Tobias Klauser 提交于
      Make the function get_user_bd_entry() static as it is not used outside of
      arch/x86/mm/mpx.c
      
      This fixes a sparse warning.
      Signed-off-by: NTobias Klauser <tklauser@distanz.ch>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      6bce725a
    • T
      x86/mm: Adapt MODULES_END based on fixmap section size · f06bdd40
      Thomas Garnier 提交于
      This patch aligns MODULES_END to the beginning of the fixmap section.
      It optimizes the space available for both sections. The address is
      pre-computed based on the number of pages required by the fixmap
      section.
      
      It will allow GDT remapping in the fixmap section. The current
      MODULES_END static address does not provide enough space for the kernel
      to support a large number of processors.
      Signed-off-by: NThomas Garnier <thgarnie@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Luis R . Rodriguez <mcgrof@kernel.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Rafael J . Wysocki <rjw@rjwysocki.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: kasan-dev@googlegroups.com
      Cc: kernel-hardening@lists.openwall.com
      Cc: kvm@vger.kernel.org
      Cc: lguest@lists.ozlabs.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-efi@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: linux-pm@vger.kernel.org
      Cc: xen-devel@lists.xenproject.org
      Cc: zijun_hu <zijun_hu@htc.com>
      Link: http://lkml.kernel.org/r/20170314170508.100882-1-thgarnie@google.com
      [ Small build fix. ]
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      f06bdd40
  13. 14 3月, 2017 6 次提交
  14. 13 3月, 2017 3 次提交
    • D
      x86/mm: Introduce mmap_compat_base() for 32-bit mmap() · 1b028f78
      Dmitry Safonov 提交于
      mmap() uses a base address, from which it starts to look for a free space
      for allocation.
      
      The base address is stored in mm->mmap_base, which is calculated during
      exec(). The address depends on task's size, set rlimit for stack, ASLR
      randomization. The base depends on the task size and the number of random
      bits which are different for 64-bit and 32bit applications.
      
      Due to the fact, that the base address is fixed, its mmap() from a compat
      (32bit) syscall issued by a 64bit task will return a address which is based
      on the 64bit base address and does not fit into the 32bit address space
      (4GB). The returned pointer is truncated to 32bit, which results in an
      invalid address.
      
      To solve store a seperate compat address base plus a compat legacy address
      base in mm_struct. These bases are calculated at exec() time and can be
      used later to address the 32bit compat mmap() issued by 64 bit
      applications.
      
      As a consequence of this change 32-bit applications issuing a 64-bit
      syscall (after doing a long jump) will get a 64-bit mapping now. Before
      this change 32-bit applications always got a 32bit mapping.
      
      [ tglx: Massaged changelog and added a comment ]
      Signed-off-by: NDmitry Safonov <dsafonov@virtuozzo.com>
      Cc: 0x7f454c46@gmail.com
      Cc: linux-mm@kvack.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170306141721.9188-4-dsafonov@virtuozzo.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      1b028f78
    • D
      x86/mm: Add task_size parameter to mmap_base() · 8f3e474f
      Dmitry Safonov 提交于
      To correctly handle 32-bit and 64-bit mmap() syscalls in 64bit applications
      its required to have separate address bases to place a mapping.
      
      The tasksize can be used as an indicator to select the proper parameters
      for mmap_base().
      
      This requires the following changes:
      
       - Add task_size argument to mmap_base() and make the calculation based on it.
       - Provide mmap_legacy_base() as a seperate function
       - Use the new functions in arch_pick_mmap_layout()
      
      [ tglx: Massaged changelog ]
      Signed-off-by: NDmitry Safonov <dsafonov@virtuozzo.com>
      Cc: 0x7f454c46@gmail.com
      Cc: linux-mm@kvack.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170306141721.9188-3-dsafonov@virtuozzo.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      8f3e474f
    • D
      x86/mm: Introduce arch_rnd() to compute 32/64 mmap random base · 6a0b41d1
      Dmitry Safonov 提交于
      The compat (32bit) mmap() sycall issued by a 64-bit task results in a
      mapping above 4GB. That's outside the compat mode address space and
      prevents CRIU to restore 32bit processes from a 64bit application.
      
      As a first step to address this, split out the address base randomizing
      calculation from arch_mmap_rnd() into a helper function, which can be used
      independent of mmap_ia32() based decisions.
      
      [ tglx: Massaged changelog ]
      Suggested-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NDmitry Safonov <dsafonov@virtuozzo.com>
      Cc: 0x7f454c46@gmail.com
      Cc: linux-mm@kvack.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170306141721.9188-2-dsafonov@virtuozzo.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      6a0b41d1
  15. 11 3月, 2017 1 次提交
    • M
      x86/cpu: Drop wp_works_ok member of struct cpuinfo_x86 · 6415813b
      Mathias Krause 提交于
      Remove the wp_works_ok member of struct cpuinfo_x86. It's an
      optimization back from Linux v0.99 times where we had no fixup support
      yet and did the CR0.WP test via special code in the page fault handler.
      The < 0 test was an optimization to not do the special casing for each
      NULL ptr access violation but just for the first one doing the WP test.
      Today it serves no real purpose as the test no longer needs special code
      in the page fault handler and the only call side -- mem_init() -- calls
      it just once, anyway. However, Xen pre-initializes it to 1, to skip the
      test.
      
      Doing the test again for Xen should be no issue at all, as even the
      commit introducing skipping the test (commit d560bc61 ("x86, xen:
      Suppress WP test on Xen")) mentioned it being ban aid only. And, in
      fact, testing the patch on Xen showed nothing breaks.
      
      The pre-fixup times are long gone and with the removal of the fallback
      handling code in commit a5c2a893 ("x86, 386 removal: Remove
      CONFIG_X86_WP_WORKS_OK") the kernel requires a working CR0.WP anyway.
      So just get rid of the "optimization" and do the test unconditionally.
      Signed-off-by: NMathias Krause <minipli@googlemail.com>
      Acked-by: NBorislav Petkov <bp@alien8.de>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Arnd Hannemann <hannemann@nets.rwth-aachen.de>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Link: http://lkml.kernel.org/r/1486933932-585-3-git-send-email-minipli@googlemail.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      6415813b
  16. 10 3月, 2017 2 次提交
  17. 02 3月, 2017 6 次提交
  18. 28 2月, 2017 1 次提交
  19. 25 2月, 2017 2 次提交