1. 16 November 2017, 2 commits
    • x86/mm: Limit mmap() of /dev/mem to valid physical addresses · be62a320
      Authored by Craig Bergstrom
      One thing the /dev/mem access APIs should verify is that there's no way
      that excessively large PFNs can leak into the high bits of the
      page table entry.

      In particular, if people can use "very large physical page addresses"
      through /dev/mem to set the bits past bit 58 (SOFTW4, the permission
      key bits and the NX bit), that could *really* confuse the kernel.
      
      We had an earlier attempt:
      
        ce56a86e ("x86/mm: Limit mmap() of /dev/mem to valid physical addresses")
      
      ... which turned out to be too restrictive (breaking mem=... bootups,
      for example) and had to be reverted in:
      
        90edaac6 ("Revert "x86/mm: Limit mmap() of /dev/mem to valid physical addresses"")
      
      This v2 attempt modifies the original patch and makes sure that mmap(/dev/mem)
      limits the PFNs so that they at least fit in the actual pteval_t architecturally:
      
       - Make sure mmap_mem() actually validates that the offset fits in phys_addr_t
      
          ( This may be indirectly true due to some other check, but it's not
            entirely obvious. )
      
       - Change valid_mmap_phys_addr_range() to just use phys_addr_valid()
         on the top byte
      
          ( Top byte is sufficient, because mmap_mem() has already checked that
            it cannot wrap. )
      
       - Add a few comments about what the valid_phys_addr_range() vs.
         valid_mmap_phys_addr_range() difference is (both checks are sketched below).
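
      A standalone model of these two checks (a sketch: it assumes 46 valid
      physical address bits and tests the last byte of the range, where the
      kernel itself checks the top byte via phys_addr_valid()):

        #include <stdbool.h>
        #include <stddef.h>
        #include <stdint.h>
        #include <stdio.h>

        typedef uint64_t phys_addr_t;
        #define PHYS_BITS 46                    /* assumed for illustration */

        static bool phys_addr_valid(phys_addr_t addr)
        {
                return (addr >> PHYS_BITS) == 0;
        }

        /* The offset must not wrap phys_addr_t, and the last byte of the
         * mapping must be an architecturally valid physical address. */
        static bool mmap_mem_range_ok(phys_addr_t offset, size_t size)
        {
                if (offset + (phys_addr_t)size - 1 < offset)
                        return false;
                return phys_addr_valid(offset + (phys_addr_t)size - 1);
        }

        int main(void)
        {
                printf("%d\n", mmap_mem_range_ok(0x0001000000000000ULL, 4096)); /* 0 */
                printf("%d\n", mmap_mem_range_ok(0x0000200000000000ULL, 4096)); /* 1 */
                return 0;
        }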
      Signed-off-by: Craig Bergstrom <craigb@google.com>
      [ Fixed the checks and added comments. ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [ Collected the discussion and patches into a commit. ]
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hans Verkuil <hans.verkuil@cisco.com>
      Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sander Eikelenboom <linux@eikelenboom.it>
      Cc: Sean Young <sean@mess.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/CA+55aFyEcOMb657vWSmrM13OxmHxC-XxeBmNis=DwVvpJUOogQ@mail.gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Prevent non-MAP_FIXED mapping across DEFAULT_MAP_WINDOW border · 1e0f25db
      Authored by Kirill A. Shutemov
      In the case of 5-level paging, the kernel does not place any mapping above
      47-bit unless userspace explicitly asks for it.
      
      Userspace can request an allocation from the full address space by
      specifying the mmap address hint above 47-bit.
      
      Nicholas noticed that the current implementation violates this interface:
      
        If user space requests a mapping at the end of the 47-bit address space
        with a length which causes the mapping to cross the 47-bit border
        (DEFAULT_MAP_WINDOW), then the vma is partially in the address space
        below and above.
      
      Sanity check the mmap address hint so that the start and end of the
      resulting vma are on the same side of the 47-bit border. If that's not
      the case, fall back to the code path which ignores the address hint and
      allocates from the regular address space below 47-bit.

      To make the checks consistent, mask out the address hint's lower bits
      (either PAGE_MASK or huge_page_mask()) instead of using ALIGN(), which
      can push them up to the next boundary.
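
      The resulting check looks roughly like this (a sketch: the border is
      simplified to exactly 2^47 and the task-size limit is illustrative):

        #include <stdbool.h>

        #define PAGE_SIZE               4096UL
        #define DEFAULT_MAP_WINDOW      (1UL << 47)     /* the 47-bit border */
        #define TASK_SIZE_MAX           ((1UL << 56) - PAGE_SIZE)

        /* Start and end of the would-be VMA must lie on the same side of
         * the border; otherwise the caller ignores the hint entirely. */
        static bool mmap_address_hint_valid(unsigned long addr, unsigned long len)
        {
                if (TASK_SIZE_MAX - len < addr)
                        return false;
                return (addr > DEFAULT_MAP_WINDOW) ==
                       (addr + len > DEFAULT_MAP_WINDOW);
        }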
      
      [ tglx: Moved the address check to a function and massaged comment and
        changelog ]
      Reported-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: linux-mm@kvack.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lkml.kernel.org/r/20171115143607.81541-1-kirill.shutemov@linux.intel.com
  2. 27 October 2017, 1 commit
    • Revert "x86/mm: Limit mmap() of /dev/mem to valid physical addresses" · 90edaac6
      Authored by Ingo Molnar
      This reverts commit ce56a86e.
      
      There's an unanticipated interaction with some boot parameters like 'mem=',
      which now causes the new checks via valid_mmap_phys_addr_range() to be too
      restrictive, in fact crashing a QEMU bootup, as reported by Fengguang Wu.
      
      So while the motivation of the change is still entirely valid, we
      need a few more rounds of testing to get it right - it's way too late
      after -rc6, so revert it for now.
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Craig Bergstrom <craigb@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: dsafonov@virtuozzo.com
      Cc: kirill.shutemov@linux.intel.com
      Cc: mhocko@suse.com
      Cc: oleg@redhat.com
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 20 October 2017, 1 commit
    • x86/mm: Limit mmap() of /dev/mem to valid physical addresses · ce56a86e
      Authored by Craig Bergstrom
      Currently, it is possible to mmap() any offset from /dev/mem.  If a
      program mmap()s /dev/mem offsets outside of the addressable limits
      of a system, the page table can be corrupted by setting reserved bits.
      
      For example if you mmap() offset 0x0001000000000000 of /dev/mem on an
      x86_64 system with a 48-bit bus, the page fault handler will be called
      with error_code set to RSVD.  The kernel then crashes with a page table
      corruption error.
      
      This change prevents this page table corruption on x86 by refusing
      to mmap offsets higher than the highest valid address in the system.
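
      For reference, a user-space snippet along these lines demonstrates the
      problem (requires root; the offset assumes a machine with fewer than 48
      physical address bits):

        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/mman.h>
        #include <unistd.h>

        int main(void)
        {
                int fd = open("/dev/mem", O_RDONLY);

                if (fd < 0)
                        return 1;
                /* An offset far beyond the addressable physical range: with
                 * this change the kernel refuses the mapping up front. */
                void *p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd,
                               0x0001000000000000L);
                printf("mmap: %s\n", p == MAP_FAILED ? "refused" : "mapped");
                close(fd);
                return 0;
        }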
      Signed-off-by: Craig Bergstrom <craigb@google.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: dsafonov@virtuozzo.com
      Cc: kirill.shutemov@linux.intel.com
      Cc: mhocko@suse.com
      Cc: oleg@redhat.com
      Link: http://lkml.kernel.org/r/20171019192856.39672-1-craigb@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 17 August 2017, 2 commits
  5. 21 July 2017, 2 commits
    • x86/mm: Prepare to expose larger address space to userspace · b569bab7
      Authored by Kirill A. Shutemov
      On x86, 5-level paging enables 56-bit userspace virtual address space.
      Not all user space is ready to handle wide addresses. It's known that
      at least some JIT compilers use higher bits in pointers to encode their
      information. This collides with valid pointers under 5-level paging and
      leads to crashes.
      
      To mitigate this, we are not going to allocate virtual address space
      above 47-bit by default.
      
      But userspace can ask for allocation from the full address space by
      specifying a hint address (with or without MAP_FIXED) above 47 bits.
      
      If the hint address is set above 47-bit but MAP_FIXED is not specified,
      we try to look for an unmapped area at the specified address. If it's
      already occupied, we look for an unmapped area in the *full* address
      space, rather than only in the 47-bit window.
      
      A high hint address would only affect the allocation in question, but not
      any future mmap()s.
      
      Specifying a high hint address on an older kernel, or on a machine without
      5-level paging support, is safe. The hint will be ignored and the kernel
      will fall back to allocating from the 47-bit address space.
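
      For example, an application can opt a single allocation into the wider
      address space like this (illustrative; the hint value is arbitrary):

        #include <stdio.h>
        #include <sys/mman.h>

        int main(void)
        {
                /* A hint above the 47-bit border opts this one allocation
                 * into the full address space on 5-level paging kernels. */
                void *hint = (void *)(1UL << 48);
                void *p = mmap(hint, 4096, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

                if (p == MAP_FAILED)
                        return 1;
                /* On older kernels the hint is simply ignored and the
                 * mapping lands below 47-bit as usual. */
                printf("mapped at %p\n", p);
                munmap(p, 4096);
                return 0;
        }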
      
      This approach makes it easy for an application's memory allocator to
      become aware of the large address space without manually tracking the
      allocated virtual address space.
      
      The patch puts all the machinery in place, but does not yet allow userspace
      to have mappings above 47-bit -- TASK_SIZE_MAX has to be raised for that.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20170716225954.74185-7-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Rename tasksize_32bit/64bit to task_size_32bit/64bit() · e8f01a8d
      Authored by Kirill A. Shutemov
      Rename these helpers to be consistent with the spelling of TASK_SIZE and
      related constants.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20170716225954.74185-5-kirill.shutemov@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 13 July 2017, 1 commit
  7. 24 June 2017, 1 commit
    • x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap · 4a06370b
      Authored by Michal Hocko
      Since the following commit in 2008:
      
        cc503c1b ("x86: PIE executable randomization")
      
      we added a heuristic to treat applications with RLIMIT_STACK configured
      to unlimited as legacy. This means:
      
       a) set the mmap_base to 1/3 of address space + randomization and
       b) mmap from bottom to top.
      
      This makes some sense as it allows the stack to grow really large. On the
      other hand it reduces the address space usable for default mmaps
      (without address hint) quite a lot.
      
      We have received a bug report that the SAP HANA workload has run into
      this limitation.
      
      We could argue that the user just got what they asked for when setting
      up the unlimited stack, but realistically a stack growing up to 1/6 of
      TASK_SIZE (allowed by mmap_base) is pretty much unlimited in real life.
      This would give mmap() 20TB of additional address space, which is quite
      nice, especially since that address space is much more likely to be used
      than the reserved stack.
      
      Digging into the history the original implementation of the randomization:
      
        8817210d ("[PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bit")
      
      didn't have this restriction.
      
      So let's try and remove this assumption - hopefully nothing breaks.
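
      A minimal sketch of the heuristic being removed (modelled on
      arch/x86/mm/mmap.c; the personality and sysctl inputs are stand-ins):

        #include <stdbool.h>

        #define RLIM_INFINITY   (~0UL)

        static bool addr_compat_layout;         /* ADDR_COMPAT_LAYOUT set?  */
        static bool sysctl_legacy_va_layout;    /* legacy_va_layout sysctl  */
        static unsigned long stack_rlimit = RLIM_INFINITY;

        static bool mmap_is_legacy_before(void)
        {
                if (addr_compat_layout)
                        return true;
                if (stack_rlimit == RLIM_INFINITY)      /* dropped here */
                        return true;
                return sysctl_legacy_va_layout;
        }

        static bool mmap_is_legacy_after(void)
        {
                if (addr_compat_layout)
                        return true;
                return sysctl_legacy_va_layout;
        }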
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Jiri Kosina <jkosina@suse.cz>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akpm@linux-foundation.org
      Cc: hughd@google.com
      Cc: linux-mm@kvack.org
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/tip-86b110d2ae6365ce91cabd37588bc8611770421a@git.kernel.org
      [ So I've applied this to tip:x86/mm with a wider Cc: list - if anyone objects to this change please holler. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 14 March 2017, 1 commit
  9. 13 March 2017, 3 commits
    • x86/mm: Introduce mmap_compat_base() for 32-bit mmap() · 1b028f78
      Authored by Dmitry Safonov
      mmap() uses a base address, from which it starts to look for free space
      for an allocation.

      The base address is stored in mm->mmap_base and is calculated during
      exec(). It depends on the task size, the configured stack rlimit and
      ASLR randomization, where the number of random bits differs between
      64-bit and 32-bit applications.
      
      Because the base address is fixed, an mmap() from a compat (32-bit)
      syscall issued by a 64-bit task will return an address which is based
      on the 64-bit base address and does not fit into the 32-bit address
      space (4GB). The returned pointer is truncated to 32 bits, which results
      in an invalid address.

      To solve this, store a separate compat address base plus a compat legacy
      address base in mm_struct (see the sketch below). These bases are
      calculated at exec() time and can later be used for the 32-bit compat
      mmap() issued by 64-bit applications.

      As a consequence of this change, 32-bit applications issuing a 64-bit
      syscall (after doing a long jump) will now get a 64-bit mapping. Before
      this change, 32-bit applications always got a 32-bit mapping.
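
      A sketch of the resulting bookkeeping (field names as described above;
      everything else in mm_struct is omitted):

        /* One pair of bases per syscall flavor, all computed at exec() time. */
        struct mm_bases {
                unsigned long mmap_base;                /* native 64-bit mmap() */
                unsigned long mmap_legacy_base;         /* native bottom-up     */
                unsigned long mmap_compat_base;         /* compat 32-bit mmap() */
                unsigned long mmap_compat_legacy_base;  /* compat bottom-up     */
        };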
      
      [ tglx: Massaged changelog and added a comment ]
      Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: 0x7f454c46@gmail.com
      Cc: linux-mm@kvack.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170306141721.9188-4-dsafonov@virtuozzo.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/mm: Add task_size parameter to mmap_base() · 8f3e474f
      Authored by Dmitry Safonov
      To correctly handle 32-bit and 64-bit mmap() syscalls in 64-bit
      applications, it's required to have separate address bases at which to
      place a mapping.

      The task size can be used as an indicator to select the proper parameters
      for mmap_base().
      
      This requires the following changes:
      
       - Add a task_size argument to mmap_base() and base the calculation on it
         (a sketch follows below).
       - Provide mmap_legacy_base() as a separate function.
       - Use the new functions in arch_pick_mmap_layout().
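
      A sketch of the reworked helper (simplified: the stack-rlimit lookup and
      the MIN_GAP/MAX_GAP constants are inlined here for illustration):

        #define PAGE_ALIGN(x)   (((x) + 4095UL) & ~4095UL)
        #define MIN_GAP         (128UL * 1024 * 1024)

        /* The topdown base is now derived from the task size passed in, so
         * 32-bit and 64-bit syscalls can each get a fitting base. */
        static unsigned long mmap_base(unsigned long rnd, unsigned long task_size)
        {
                unsigned long gap = 8UL * 1024 * 1024;  /* stack rlimit stand-in */
                unsigned long max_gap = task_size / 6 * 5;

                if (gap < MIN_GAP)
                        gap = MIN_GAP;
                else if (gap > max_gap)
                        gap = max_gap;

                return PAGE_ALIGN(task_size - gap - rnd);
        }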
      
      [ tglx: Massaged changelog ]
      Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: 0x7f454c46@gmail.com
      Cc: linux-mm@kvack.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170306141721.9188-3-dsafonov@virtuozzo.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/mm: Introduce arch_rnd() to compute 32/64 mmap random base · 6a0b41d1
      Authored by Dmitry Safonov
      The compat (32-bit) mmap() syscall issued by a 64-bit task results in a
      mapping above 4GB. That's outside the compat-mode address space and
      prevents CRIU from restoring 32-bit processes from a 64-bit application.

      As a first step to address this, split the address base randomizing
      calculation out of arch_mmap_rnd() into a helper function, which can be
      used independently of mmap_ia32()-based decisions.
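
      The extracted helper looks roughly like this (a sketch modelled on the
      patch; the kernel helpers are declared as stand-ins and the bit counts
      come from the mmap_rnd_bits knobs):

        #define PAGE_SHIFT 12
        extern unsigned long get_random_long(void);
        extern int mmap_is_ia32(void);
        static unsigned int mmap32_rnd_bits = 8, mmap64_rnd_bits = 28;

        /* Pure randomization math; no 32-vs-64-bit policy in here. */
        static unsigned long arch_rnd(unsigned int rndbits)
        {
                return (get_random_long() & ((1UL << rndbits) - 1)) << PAGE_SHIFT;
        }

        /* The policy decision stays in the caller. */
        unsigned long arch_mmap_rnd(void)
        {
                return arch_rnd(mmap_is_ia32() ? mmap32_rnd_bits : mmap64_rnd_bits);
        }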
      
      [ tglx: Massaged changelog ]
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
      Cc: 0x7f454c46@gmail.com
      Cc: linux-mm@kvack.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170306141721.9188-2-dsafonov@virtuozzo.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  10. 02 March 2017, 2 commits
  11. 11 March 2016, 1 commit
    • x86/mm/32: Enable full randomization on i386 and X86_32 · 8b8addf8
      Authored by Hector Marco-Gisbert
      Currently on i386 and on X86_64 when emulating X86_32 in legacy mode, only
      the stack and the executable are randomized but not other mmapped files
      (libraries, vDSO, etc.). This patch enables randomization for the
      libraries, vDSO and mmap requests on i386 and in X86_32 in legacy mode.
      
      By default on i386 there are 8 bits for the randomization of the libraries,
      vDSO and mmaps, which only uses 1MB of VA space.
      
      This patch preserves the original randomness, using 1MB of VA out of 3GB or
      4GB. We think that 1MB out of 3GB is not a big cost for having the ASLR.
      
      The first obvious security benefit is that in legacy mode all objects are
      randomized (not only the stack and the executable), which greatly
      increases ASLR effectiveness; otherwise attackers may use these
      non-randomized areas. Sensitive setuid/setgid applications also become
      more secure because currently attackers can disable their randomization
      by setting the stack ulimit to "unlimited". This is a very old and widely
      known trick to disable ASLR on i386 which has been allowed for too long.

      Another trick used to disable ASLR was to set the ADDR_NO_RANDOMIZE
      personality flag, but fortunately this doesn't work on setuid/setgid
      applications because there are security checks which clear
      security-relevant flags.

      This patch always randomizes the mmap_legacy_base address, removing the
      possibility of disabling ASLR by setting the stack to "unlimited".
      Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
      Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es>
      Acked-by: Kees Cook <keescook@chromium.org>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akpm@linux-foundation.org
      Cc: Kees Cook <keescook@chromium.org>
      Link: http://lkml.kernel.org/r/1457639460-5242-1-git-send-email-hecmargi@upv.es
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  12. 28 February 2016, 1 commit
    • mm: ASLR: use get_random_long() · 5ef11c35
      Authored by Daniel Cashman
      Replace calls to get_random_int() followed by a cast to (unsigned long)
      with calls to get_random_long().  Also address a shifting bug which, in
      the case of x86, removed the entropy mask for mmap_rnd_bits values > 31
      bits (see the sketch below).
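
      A sketch of the bug and the fix (an mmap_rnd_bits value above 31 is what
      triggers it; the helper declarations are stand-ins):

        extern unsigned int get_random_int(void);
        extern unsigned long get_random_long(void);
        static unsigned int mmap_rnd_bits = 33;        /* > 31 hits the bug */

        static unsigned long rnd_before(void)
        {
                /* (1 << 33) overflows int: the modulus degenerates and the
                 * entropy above bit 31 is lost. */
                return (unsigned long)get_random_int() % (1 << mmap_rnd_bits);
        }

        static unsigned long rnd_after(void)
        {
                /* 64-bit source and 64-bit mask keep every configured bit. */
                return get_random_long() & ((1UL << mmap_rnd_bits) - 1);
        }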
      Signed-off-by: Daniel Cashman <dcashman@android.com>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Nick Kralevich <nnk@google.com>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 15 January 2016, 1 commit
    • x86: mm: support ARCH_MMAP_RND_BITS · 9e08f57d
      Authored by Daniel Cashman
      x86: arch_mmap_rnd() uses hard-coded values, 8 for 32-bit and 28 for
      64-bit, to generate the random offset for the mmap base address.  These
      values represent a compromise between increased ASLR effectiveness and
      avoiding address-space fragmentation.  Replace them with a Kconfig
      option, which is sensibly bounded, so that platform developers may
      choose where to place this compromise.  Keep the default values as the
      new minimums (see the sketch below).
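
      After the change, the random offset is derived from the configured bit
      width instead of a literal; a sketch (defaults as described above,
      kernel helpers declared as stand-ins):

        #define PAGE_SHIFT 12
        extern unsigned int get_random_int(void);
        extern int mmap_is_ia32(void);

        /* The old literals become the (minimum) defaults; Kconfig's
         * ARCH_MMAP_RND_BITS and its compat variant can raise them. */
        static unsigned int mmap_rnd_compat_bits = 8;
        static unsigned int mmap_rnd_bits = 28;

        unsigned long arch_mmap_rnd(void)
        {
                unsigned int bits = mmap_is_ia32() ? mmap_rnd_compat_bits
                                                   : mmap_rnd_bits;

                return ((unsigned long)get_random_int() % (1UL << bits))
                                << PAGE_SHIFT;
        }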
      Signed-off-by: Daniel Cashman <dcashman@google.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: Nick Kralevich <nnk@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 21 July 2015, 1 commit
    • x86/mpx: Do not set ->vm_ops on MPX VMAs · a8965276
      Authored by Kirill A. Shutemov
      MPX sets up a private anonymous mapping, but uses vma->vm_ops too.
      This can confuse the core VM, as it relies on vma->vm_ops to
      distinguish file VMAs from anonymous ones.

      As a result we get SIGBUS, because handle_pte_fault() thinks it's
      a file VMA without vm_ops->fault and doesn't know how to handle
      the situation properly.
      
      Let's fix that by not setting ->vm_ops.
      
      We don't really need ->vm_ops here: an MPX VMA can be detected via the
      VM_MPX flag. And vma_merge() will not merge an MPX VMA with a non-MPX
      VMA, because ->vm_flags won't match.

      The only thing left is the name of the VMA. I'm not sure if it's part
      of the ABI, or whether we can just drop it. The patch keeps it by
      providing arch_vma_name() on x86.
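
      The naming part then looks like this (essentially what the description
      above calls for; the VM_MPX bit position and the minimal struct are
      stand-ins to keep the sketch self-contained):

        #define VM_MPX (1UL << 34)      /* illustrative flag bit */
        struct vm_area_struct { unsigned long vm_flags; /* ... */ };

        /* Keeps the "[mpx]" name in /proc/<pid>/maps without ->vm_ops. */
        const char *arch_vma_name(struct vm_area_struct *vma)
        {
                if (vma->vm_flags & VM_MPX)
                        return "[mpx]";
                return NULL;
        }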
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: <stable@vger.kernel.org> # Fixes: 6b7339f4 (mm: avoid setting up anonymous pages into file mapping)
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave@sr71.net
      Link: http://lkml.kernel.org/r/20150720212958.305CC3E9@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  15. 15 April 2015, 2 commits
    • mm: expose arch_mmap_rnd when available · 2b68f6ca
      Authored by Kees Cook
      When an architecture fully supports randomizing the ELF load location,
      a per-arch mmap_rnd() function is used to find a randomized mmap base.
      In preparation for randomizing the location of ET_DYN binaries
      separately from mmap, this renames and exports these functions as
      arch_mmap_rnd(). Additionally introduces CONFIG_ARCH_HAS_ELF_RANDOMIZE
      for describing this feature on architectures that support it
      (which is a superset of ARCH_BINFMT_ELF_RANDOMIZE_PIE, since s390
      already supports a separated ET_DYN ASLR from mmap ASLR without the
      ARCH_BINFMT_ELF_RANDOMIZE_PIE logic).
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Russell King <linux@arm.linux.org.uk>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: "David A. Long" <dave.long@linaro.org>
      Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
      Cc: Arun Chandran <achandran@mvista.com>
      Cc: Yann Droneaud <ydroneaud@opteya.com>
      Cc: Min-Hua Chen <orca.chen@gmail.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Alex Smith <alex@alex-smith.me.uk>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Vineeth Vijayan <vvijayan@mvista.com>
      Cc: Jeff Bailey <jeffbailey@google.com>
      Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Behan Webster <behanw@converseincode.com>
      Cc: Ismael Ripoll <iripoll@upv.es>
      Cc: Jan-Simon Möller <dl9pf@gmx.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86: standardize mmap_rnd() usage · 82168140
      Authored by Kees Cook
      In preparation for splitting out ET_DYN ASLR, this refactors the use of
      mmap_rnd() to be used similarly to arm, and extracts the checking of
      PF_RANDOMIZE.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  16. 19 February 2015, 1 commit
    • x86, mm/ASLR: Fix stack randomization on 64-bit systems · 4e7c22d4
      Authored by Hector Marco-Gisbert
      The issue is that the stack for processes is not properly randomized on
      64-bit architectures due to an integer overflow.
      
      The affected function is randomize_stack_top() in file
      "fs/binfmt_elf.c":
      
        static unsigned long randomize_stack_top(unsigned long stack_top)
        {
                unsigned int random_variable = 0;

                if ((current->flags & PF_RANDOMIZE) &&
                        !(current->personality & ADDR_NO_RANDOMIZE)) {
                        random_variable = get_random_int() & STACK_RND_MASK;
                        random_variable <<= PAGE_SHIFT;
                }
        #ifdef CONFIG_STACK_GROWSUP
                return PAGE_ALIGN(stack_top) + random_variable;
        #else
                return PAGE_ALIGN(stack_top) - random_variable;
        #endif
        }
      
      Note that it declares the "random_variable" variable as "unsigned int".
      Since the result of shifting STACK_RND_MASK (which is 0x3fffff on
      x86_64, i.e. 22 bits) left by PAGE_SHIFT (which is 12 on x86_64):

        random_variable <<= PAGE_SHIFT;

      needs at least 34 bits (22+12) to be represented, the two leftmost bits
      are dropped when the result is stored in the 32-bit "random_variable".
      
      These two dropped bits have an impact on the entropy of the process
      stack. Concretely, the total stack entropy is reduced by a factor of
      four: from 2^30 down to 2^28 (one fourth of the expected entropy).
      
      This patch restores the entropy by correcting the types involved in the
      operations in the functions randomize_stack_top() and
      stack_maxrandom_size().
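
      With the type corrected, the function reads roughly as follows (a sketch
      of the fix; only the type handling changes, and the CONFIG_STACK_GROWSUP
      branch is omitted here):

        static unsigned long randomize_stack_top(unsigned long stack_top)
        {
                unsigned long random_variable = 0;     /* was: unsigned int */

                if ((current->flags & PF_RANDOMIZE) &&
                        !(current->personality & ADDR_NO_RANDOMIZE)) {
                        random_variable = (unsigned long)get_random_int();
                        random_variable &= STACK_RND_MASK;
                        random_variable <<= PAGE_SHIFT; /* 34 bits survive */
                }
                return PAGE_ALIGN(stack_top) - random_variable;
        }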
      
      The successful fix can be tested with:
      
        $ for i in `seq 1 10`; do cat /proc/self/maps | grep stack; done
        7ffeda566000-7ffeda587000 rw-p 00000000 00:00 0                          [stack]
        7fff5a332000-7fff5a353000 rw-p 00000000 00:00 0                          [stack]
        7ffcdb7a1000-7ffcdb7c2000 rw-p 00000000 00:00 0                          [stack]
        7ffd5e2c4000-7ffd5e2e5000 rw-p 00000000 00:00 0                          [stack]
        ...
      
      Once corrected, the leading bytes should be between 7ffc and 7fff,
      rather than always being 7fff.
      Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
      Signed-off-by: Ismael Ripoll <iripoll@upv.es>
      [ Rebased, fixed 80 char bugs, cleaned up commit message, added test example and CVE ]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: <stable@vger.kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Fixes: CVE-2015-1593
      Link: http://lkml.kernel.org/r/20150214173350.GA18393@www.outflux.net
      Signed-off-by: Borislav Petkov <bp@suse.de>
  17. 09 September 2014, 1 commit
  18. 23 August 2013, 2 commits
  19. 14 August 2013, 1 commit
  20. 11 July 2013, 1 commit
  21. 06 December 2011, 1 commit
  22. 07 August 2011, 1 commit
  23. 06 August 2011, 1 commit
    • x86, amd: Avoid cache aliasing penalties on AMD family 15h · dfb09f9b
      Authored by Borislav Petkov
      This patch provides performance tuning for the "Bulldozer" CPU. With its
      shared instruction cache there is a chance of generating an excessive
      number of cache cross-invalidates when running specific workloads on the
      cores of a compute module.
      
      This excessive amount of cross-invalidations can be observed if cache
      lines backed by shared physical memory alias in bits [14:12] of their
      virtual addresses, as those bits are used for the index generation.
      
      This patch addresses the issue by clearing all the bits in the [14:12]
      slice of the file mapping's virtual address at generation time, thus
      forcing those bits to be the same for all mappings of a single shared
      library across processes and, in doing so, avoiding instruction cache
      aliases (modelled below).
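
      The effect on address generation can be modelled like this
      (illustrative; the kernel implements it as an alignment mask in the
      unmapped-area search):

        #include <stdio.h>

        /* Bits [14:12]: the index slice that must match across processes. */
        #define VA_ALIGN_MASK 0x7000UL

        static unsigned long align_file_mapping(unsigned long addr)
        {
                /* addr is page aligned; round up until bits [14:12] clear. */
                return (addr + VA_ALIGN_MASK) & ~VA_ALIGN_MASK;
        }

        int main(void)
        {
                /* 0x7f0000003000 -> 0x7f0000008000 */
                printf("%#lx\n", align_file_mapping(0x7f0000003000UL));
                return 0;
        }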
      
      It also adds the command line option "align_va_addr=(32|64|on|off)" with
      which virtual address alignment can be enabled for 32-bit or 64-bit x86
      individually, or both, or be completely disabled.
      
      This change leaves virtual region address allocation on other families
      and/or vendors unaffected.
      Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
      Link: http://lkml.kernel.org/r/1312550110-24160-2-git-send-email-bp@amd64.org
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  24. 28 January 2010, 1 commit
    • x86: Use helpers for rlimits · 2854e72b
      Authored by Jiri Slaby
      Make sure the compiler won't do weird things with limits: fetching them
      twice may return two different values after writable limits are
      implemented.

      We can either use the rlimit helpers added in
      3e10e716 or ACCESS_ONCE if they are not
      applicable; this patch uses the helpers (sketched below).
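
      The helpers in question boil down to a single explicit fetch (a sketch
      of the helpers referenced above):

        /* Fetch the limit exactly once, so a concurrent writer cannot make
         * two reads disagree. */
        static inline unsigned long task_rlimit(const struct task_struct *tsk,
                                                unsigned int limit)
        {
                return ACCESS_ONCE(tsk->signal->rlim[limit].rlim_cur);
        }

        static inline unsigned long rlimit(unsigned int limit)
        {
                return task_rlimit(current, limit);
        }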
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      LKML-Reference: <1264609942-24621-1-git-send-email-jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  25. 11 September 2009, 1 commit
    • x86: Increase MIN_GAP to include randomized stack · 80938332
      Authored by Michal Hocko
      Currently we are not including the randomized stack size when calculating
      the mmap_base address in arch_pick_mmap_layout() for the topdown case.
      This might cause mmap_base to start in the reserved stack area, because
      the stack is randomized by up to 1GB on 64-bit (8MB on 32-bit) and the
      minimum gap is 128MB.

      If the stack really grows down to mmap_base, mmap regions can be silently
      overwritten by stack values.
      
      Let's include the maximum stack randomization size in MIN_GAP, which is
      used as the lower bound for the gap in mmap (see the sketch below).
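
      The change amounts to folding the maximum randomization into the gap's
      lower bound (a sketch; 64-bit values):

        /* Before: only the fixed 128MB gap was reserved below the stack. */
        #define MIN_GAP_OLD     (128UL * 1024 * 1024)

        /* After: also reserve the largest possible stack randomization, so
         * mmap_base can no longer land inside the randomized stack area. */
        #define MIN_GAP         (128UL * 1024 * 1024 + stack_maxrandom_size())
        #define MAX_GAP         (TASK_SIZE / 6 * 5)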
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      LKML-Reference: <1252400515-6866-1-git-send-email-mhocko@suse.cz>
      Acked-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Stable Team <stable@kernel.org>
  26. 31 January 2009, 1 commit
  27. 30 January 2008, 5 commits
    • x86: unify mmap_{32|64}.c · 675a0813
      Authored by Harvey Harrison
      mmap_is_ia32() is always true for X86_32, or while emulating IA32 on
      X86_64.

      Randomization is not supported on X86_32 in the legacy layout; both
      layouts allow randomization on X86_64.
      Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: PIE executable randomization, uninlining · 954683a2
      Authored by Andrew Morton
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: PIE executable randomization, checkpatch fixes · bb1ad820
      Authored by Andrew Morton
      #39: FILE: arch/ia64/ia32/binfmt_elf32.c:229:
      +elf32_map (struct file *filep, unsigned long addr, struct elf_phdr *eppnt, int prot, int type, unsigned long unused)
      
      WARNING: no space between function name and open parenthesis '('
      #39: FILE: arch/ia64/ia32/binfmt_elf32.c:229:
      +elf32_map (struct file *filep, unsigned long addr, struct elf_phdr *eppnt, int prot, int type, unsigned long unused)
      
      WARNING: line over 80 characters
      #67: FILE: arch/x86/kernel/sys_x86_64.c:80:
      +			new_begin = randomize_range(*begin, *begin + 0x02000000, 0);
      
      ERROR: use tabs not spaces
      #110: FILE: arch/x86/kernel/sys_x86_64.c:185:
      + ^I        mm->cached_hole_size = 0;$
      
      ERROR: use tabs not spaces
      #111: FILE: arch/x86/kernel/sys_x86_64.c:186:
      + ^I^Imm->free_area_cache = mm->mmap_base;$
      
      ERROR: use tabs not spaces
      #112: FILE: arch/x86/kernel/sys_x86_64.c:187:
      + ^I}$
      
      ERROR: use tabs not spaces
      #141: FILE: arch/x86/kernel/sys_x86_64.c:216:
      + ^I^I/* remember the largest hole we saw so far */$
      
      ERROR: use tabs not spaces
      #142: FILE: arch/x86/kernel/sys_x86_64.c:217:
      + ^I^Iif (addr + mm->cached_hole_size < vma->vm_start)$
      
      ERROR: use tabs not spaces
      #143: FILE: arch/x86/kernel/sys_x86_64.c:218:
      + ^I^I        mm->cached_hole_size = vma->vm_start - addr;$
      
      ERROR: use tabs not spaces
      #157: FILE: arch/x86/kernel/sys_x86_64.c:232:
      +  ^Imm->free_area_cache = TASK_UNMAPPED_BASE;$
      
      ERROR: need a space before the open parenthesis '('
      #291: FILE: arch/x86/mm/mmap_64.c:101:
      +	} else if(mmap_is_legacy()) {
      
      WARNING: braces {} are not necessary for single statement blocks
      #302: FILE: arch/x86/mm/mmap_64.c:112:
      +	if (current->flags & PF_RANDOMIZE) {
      +		mm->mmap_base += ((long)rnd) << PAGE_SHIFT;
      +	}
      
      WARNING: line over 80 characters
      #314: FILE: fs/binfmt_elf.c:48:
      +static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int, unsigned long);
      
      WARNING: no space between function name and open parenthesis '('
      #314: FILE: fs/binfmt_elf.c:48:
      +static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int, unsigned long);
      
      WARNING: line over 80 characters
      #429: FILE: fs/binfmt_elf.c:438:
      +					   eppnt, elf_prot, elf_type, total_size);
      
      ERROR: need space after that ',' (ctx:VxV)
      #480: FILE: fs/binfmt_elf.c:939:
      +				elf_prot, elf_flags,0);
       				                   ^
      
      total: 9 errors, 7 warnings, 461 lines checked
      Your patch has style problems, please review.  If any of these errors
      are false positives report them to the maintainer, see
      CHECKPATCH in MAINTAINERS.
      
      Please run checkpatch prior to sending patches
      
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: PIE executable randomization · cc503c1b
      Authored by Jiri Kosina
      This maps the main executable of (specially compiled/linked -pie/-fpie)
      ET_DYN binaries onto a random address (in cases in which mmap() is
      allowed to perform a randomization).

      The code has been extracted from Ingo's exec-shield patch
      http://people.redhat.com/mingo/exec-shield/
      
      [akpm@linux-foundation.org: fix used-uninitialised warning]
      [kamezawa.hiroyu@jp.fujitsu.com: fixed ia32 ELF on x86_64 handling]
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: clean up arch/x86/mm/mmap_32/64.c · f8eeae68
      Authored by Thomas Gleixner
      Whitespace and coding style cleanup.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  28. 11 October 2007, 1 commit