1. 11 9月, 2017 2 次提交
  2. 10 9月, 2017 2 次提交
    • R
      sparc64: Handle additional cases of no fault loads · b6fe1089
      Rob Gardner 提交于
      Load instructions using ASI_PNF or other no-fault ASIs should not
      cause a SIGSEGV or SIGBUS.
      
      A garden variety unmapped address follows the TSB miss path, and when
      no valid mapping is found in the process page tables, the miss handler
      checks to see if the access was via a no-fault ASI.  It then fixes up
      the target register with a zero, and skips the no-fault load
      instruction.
      
      But different paths are taken for data access exceptions and alignment
      traps, and these do not respect the no-fault ASI. We add checks in
      these paths for the no-fault ASI, and fix up the target register and
      TPC just like in the TSB miss case.
      Signed-off-by: NRob Gardner <rob.gardner@oracle.com>
      Acked-by: NSam Ravnborg <sam@ravnborg.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b6fe1089
    • A
      sparc64: speed up etrap/rtrap on NG2 and later processors · a7159a87
      Anthony Yznaga 提交于
      For many sun4v processor types, reading or writing a privileged register
      has a latency of 40 to 70 cycles.  Use a combination of the low-latency
      allclean, otherw, normalw, and nop instructions in etrap and rtrap to
      replace 2 rdpr and 5 wrpr instructions and improve etrap/rtrap
      performance.  allclean, otherw, and normalw are available on NG2 and
      later processors.
      
      The average ticks to execute the flush windows trap ("ta 0x3") with and
      without this patch on select platforms:
      
       CPU            Not patched     Patched    % Latency Reduction
      
       NG2            1762            1558            -11.58
       NG4            3619            3204            -11.47
       M7             3015            2624            -12.97
       SPARC64-X      829             770              -7.12
      Signed-off-by: NAnthony Yznaga <anthony.yznaga@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a7159a87
  3. 09 9月, 2017 15 次提交
  4. 07 9月, 2017 6 次提交
    • A
      x86/mm: Document how CR4.PCIDE restore works · 1c9fe440
      Andy Lutomirski 提交于
      While debugging a problem, I thought that using
      cr4_set_bits_and_update_boot() to restore CR4.PCIDE would be
      helpful.  It turns out to be counterproductive.
      
      Add a comment documenting how this works.
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1c9fe440
    • A
      x86/mm: Reinitialize TLB state on hotplug and resume · 72c0098d
      Andy Lutomirski 提交于
      When Linux brings a CPU down and back up, it switches to init_mm and then
      loads swapper_pg_dir into CR3.  With PCID enabled, this has the side effect
      of masking off the ASID bits in CR3.
      
      This can result in some confusion in the TLB handling code.  If we
      bring a CPU down and back up with any ASID other than 0, we end up
      with the wrong ASID active on the CPU after resume.  This could
      cause our internal state to become corrupt, although major
      corruption is unlikely because init_mm doesn't have any user pages.
      More obviously, if CONFIG_DEBUG_VM=y, we'll trip over an assertion
      in the next context switch.  The result of *that* is a failure to
      resume from suspend with probability 1 - 1/6^(cpus-1).
      
      Fix it by reinitializing cpu_tlbstate on resume and CPU bringup.
      Reported-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Reported-by: NJiri Kosina <jikos@kernel.org>
      Fixes: 10af6235 ("x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID")
      Signed-off-by: NAndy Lutomirski <luto@kernel.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      72c0098d
    • R
      mm,fork: introduce MADV_WIPEONFORK · d2cd9ede
      Rik van Riel 提交于
      Introduce MADV_WIPEONFORK semantics, which result in a VMA being empty
      in the child process after fork.  This differs from MADV_DONTFORK in one
      important way.
      
      If a child process accesses memory that was MADV_WIPEONFORK, it will get
      zeroes.  The address ranges are still valid, they are just empty.
      
      If a child process accesses memory that was MADV_DONTFORK, it will get a
      segmentation fault, since those address ranges are no longer valid in
      the child after fork.
      
      Since MADV_DONTFORK also seems to be used to allow very large programs
      to fork in systems with strict memory overcommit restrictions, changing
      the semantics of MADV_DONTFORK might break existing programs.
      
      MADV_WIPEONFORK only works on private, anonymous VMAs.
      
      The use case is libraries that store or cache information, and want to
      know that they need to regenerate it in the child process after fork.
      
      Examples of this would be:
       - systemd/pulseaudio API checks (fail after fork) (replacing a getpid
         check, which is too slow without a PID cache)
       - PKCS#11 API reinitialization check (mandated by specification)
       - glibc's upcoming PRNG (reseed after fork)
       - OpenSSL PRNG (reseed after fork)
      
      The security benefits of a forking server having a re-inialized PRNG in
      every child process are pretty obvious.  However, due to libraries
      having all kinds of internal state, and programs getting compiled with
      many different versions of each library, it is unreasonable to expect
      calling programs to re-initialize everything manually after fork.
      
      A further complication is the proliferation of clone flags, programs
      bypassing glibc's functions to call clone directly, and programs calling
      unshare, causing the glibc pthread_atfork hook to not get called.
      
      It would be better to have the kernel take care of this automatically.
      
      The patch also adds MADV_KEEPONFORK, to undo the effects of a prior
      MADV_WIPEONFORK.
      
      This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO:
      
          https://man.openbsd.org/minherit.2
      
      [akpm@linux-foundation.org: numerically order arch/parisc/include/uapi/asm/mman.h #defines]
      Link: http://lkml.kernel.org/r/20170811212829.29186-3-riel@redhat.comSigned-off-by: NRik van Riel <riel@redhat.com>
      Reported-by: NFlorian Weimer <fweimer@redhat.com>
      Reported-by: NColm MacCártaigh <colm@allcosts.net>
      Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Cc: <linux-api@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d2cd9ede
    • R
      x86,mpx: make mpx depend on x86-64 to free up VMA flag · df3735c5
      Rik van Riel 提交于
      Patch series "mm,fork,security: introduce MADV_WIPEONFORK", v4.
      
      If a child process accesses memory that was MADV_WIPEONFORK, it will get
      zeroes.  The address ranges are still valid, they are just empty.
      
      If a child process accesses memory that was MADV_DONTFORK, it will get a
      segmentation fault, since those address ranges are no longer valid in
      the child after fork.
      
      Since MADV_DONTFORK also seems to be used to allow very large programs
      to fork in systems with strict memory overcommit restrictions, changing
      the semantics of MADV_DONTFORK might break existing programs.
      
      The use case is libraries that store or cache information, and want to
      know that they need to regenerate it in the child process after fork.
      
      Examples of this would be:
       - systemd/pulseaudio API checks (fail after fork) (replacing a getpid
         check, which is too slow without a PID cache)
       - PKCS#11 API reinitialization check (mandated by specification)
       - glibc's upcoming PRNG (reseed after fork)
       - OpenSSL PRNG (reseed after fork)
      
      The security benefits of a forking server having a re-inialized PRNG in
      every child process are pretty obvious.  However, due to libraries
      having all kinds of internal state, and programs getting compiled with
      many different versions of each library, it is unreasonable to expect
      calling programs to re-initialize everything manually after fork.
      
      A further complication is the proliferation of clone flags, programs
      bypassing glibc's functions to call clone directly, and programs calling
      unshare, causing the glibc pthread_atfork hook to not get called.
      
      It would be better to have the kernel take care of this automatically.
      
      The patchset also adds MADV_KEEPONFORK, to undo the effects of a prior
      MADV_WIPEONFORK.
      
      This is similar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO:
      
          https://man.openbsd.org/minherit.2
      
      This patch (of 2):
      
      MPX only seems to be available on 64 bit CPUs, starting with Skylake and
      Goldmont.  Move VM_MPX into the 64 bit only portion of vma->vm_flags, in
      order to free up a VMA flag.
      
      Link: http://lkml.kernel.org/r/20170811212829.29186-2-riel@redhat.comSigned-off-by: NRik van Riel <riel@redhat.com>
      Acked-by: NDave Hansen <dave.hansen@intel.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Will Drewry <wad@chromium.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Colm MacCártaigh <colm@allcosts.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      df3735c5
    • M
      mm: arch: consolidate mmap hugetlb size encodings · aafd4562
      Mike Kravetz 提交于
      A non-default huge page size can be encoded in the flags argument of the
      mmap system call.  The definitions for these encodings are in arch
      specific header files.  However, all architectures use the same values.
      
      Consolidate all the definitions in the primary user header file
      (uapi/linux/mman.h).  Include definitions for all known huge page sizes.
      Use the generic encoding definitions in hugetlb_encode.h as the basis
      for these definitions.
      
      Link: http://lkml.kernel.org/r/1501527386-10736-3-git-send-email-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      aafd4562
    • D
      metag/numa: remove the unused parent_node() macro · f0cd3406
      Dou Liyang 提交于
      Commit a7be6e5a ("mm: drop useless local parameters of
      __register_one_node()") removes the last user of parent_node().
      
      The parent_node() macro in METAG architecture is unnecessary.
      
      Remove it for cleanup.
      
      Link: http://lkml.kernel.org/r/1501076076-1974-4-git-send-email-douly.fnst@cn.fujitsu.comSigned-off-by: NDou Liyang <douly.fnst@cn.fujitsu.com>
      Reported-by: NMichael Ellerman <mpe@ellerman.id.au>
      Cc: James Hogan <james.hogan@imgtec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f0cd3406
  5. 05 9月, 2017 12 次提交
  6. 04 9月, 2017 3 次提交