1. 21 1月, 2016 1 次提交
  2. 17 1月, 2016 1 次提交
    • H
      parisc: Protect huge page pte changes with spinlocks · b0e55131
      Helge Deller 提交于
      PA-RISC doesn't have atomic instructions to modify page table entries, so it
      takes spinlock in the TLB handler and modifies the page table entry
      non-atomically. If you modify the page table entry without the spinlock, you
      may race with TLB handler on another CPU and your modification may be lost.
      Protect against that with usage of purge_tlb_start() and purge_tlb_end() which
      handles the TLB spinlock.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # v4.4
      b0e55131
  3. 13 1月, 2016 3 次提交
  4. 12 12月, 2015 1 次提交
  5. 22 11月, 2015 3 次提交
    • H
      parisc: Add Huge Page and HUGETLBFS support · 736d2169
      Helge Deller 提交于
      This patch adds huge page support to allow userspace to allocate huge
      pages and to use hugetlbfs filesystem on 32- and 64-bit Linux kernels.
      A later patch will add kernel support to map kernel text and data on
      huge pages.
      
      The only requirement is, that the kernel needs to be compiled for a
      PA8X00 CPU (PA2.0 architecture). Older PA1.X CPUs do not support
      variable page sizes. 64bit Kernels are compiled for PA2.0 by default.
      
      Technically on parisc multiple physical huge pages may be needed to
      emulate standard 2MB huge pages.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      736d2169
    • H
      parisc: Increase initial kernel mapping to 32MB on 64bit kernel · 332b42e4
      Helge Deller 提交于
      For the 64bit kernel the initially 16 MB kernel memory might become too
      small if you build a kernel with many modules built-in and with kernel
      text and data areas mapped on huge pages.
      
      This patch increases the initial mapping to 32MB for 64bit kernels and
      keeps 16MB for 32bit kernels.
      Signed-off-by: NHelge Deller <deller@gmx.de>
      332b42e4
    • H
      parisc: Add defines for Huge page support · 1f25ad26
      Helge Deller 提交于
      Huge pages on parisc will have the same size as one pmd table, which
      is on a 64bit kernel 2MB on a kernel with 4K kernel page sizes, and
      on a 32bit kernel 4MB when used with 4K kernel pages.
      
      Since parisc does not physically supports 2MB huge page sizes, emulate
      it with two consecutive 1MB page sizes instead. Keeping the same huge
      page size as one pmd will allow us to add transparent huge page support
      later on.
      
      Bit 21 in the pte flags was unused and will now be used to mark a page
      as huge page (_PAGE_HPAGE_BIT).
      Signed-off-by: NHelge Deller <deller@gmx.de>
      1f25ad26
  6. 20 11月, 2015 2 次提交
  7. 10 11月, 2015 1 次提交
  8. 09 11月, 2015 1 次提交
    • H
      parisc: Fixes and cleanups in kernel uapi header files · d0cf62fb
      Helge Deller 提交于
      This patch fixes some bugs and partly cleans up the parisc uapi header
      files to what glibc defined:
      - compat_semid64_ds was wrong and did not take the endianess into
        account
      - ipc64_perm exported userspace types which broke building userspace
        packages on debian (e.g. trinity)
      - ipc64_perm needs to use a 32bit mode_t on 64bit kernel
      - msqid64_ds and semid64_ds needs unsigned longs for various struct members
      - shmid64_ds exported size_t instead of __kernel_size_t
      
      And finally add some compile-time checks for the sizes of those structs
      to avoid future breakage.
      
      Runtime-tested with the Linux Test Project (LTP) testsuite.
      
      Cc: <stable@vger.kernel.org> # 3.18+
      Reviewed-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NHelge Deller <deller@gmx.de>
      d0cf62fb
  9. 25 10月, 2015 1 次提交
  10. 23 9月, 2015 1 次提交
    • P
      atomic, arch: Audit atomic_{read,set}() · 62e8a325
      Peter Zijlstra 提交于
      This patch makes sure that atomic_{read,set}() are at least
      {READ,WRITE}_ONCE().
      
      We already had the 'requirement' that atomic_read() should use
      ACCESS_ONCE(), and most archs had this, but a few were lacking.
      All are now converted to use READ_ONCE().
      
      And, by a symmetry and general paranoia argument, upgrade atomic_set()
      to use WRITE_ONCE().
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: james.hogan@imgtec.com
      Cc: linux-kernel@vger.kernel.org
      Cc: oleg@redhat.com
      Cc: will.deacon@arm.com
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      62e8a325
  11. 08 9月, 2015 1 次提交
  12. 27 7月, 2015 2 次提交
  13. 19 7月, 2015 1 次提交
  14. 18 7月, 2015 1 次提交
  15. 11 7月, 2015 1 次提交
    • J
      parisc: Fix some PTE/TLB race conditions and optimize __flush_tlb_range based on timing results · 01ab6057
      John David Anglin 提交于
      The increased use of pdtlb/pitlb instructions seemed to increase the
      frequency of random segmentation faults building packages. Further, we
      had a number of cases where TLB inserts would repeatedly fail and all
      forward progress would stop. The Haskell ghc package caused a lot of
      trouble in this area. The final indication of a race in pte handling was
      this syslog entry on sibaris (C8000):
      
       swap_free: Unused swap offset entry 00000004
       BUG: Bad page map in process mysqld  pte:00000100 pmd:019bbec5
       addr:00000000ec464000 vm_flags:00100073 anon_vma:0000000221023828 mapping: (null) index:ec464
       CPU: 1 PID: 9176 Comm: mysqld Not tainted 4.0.0-2-parisc64-smp #1 Debian 4.0.5-1
       Backtrace:
        [<0000000040173eb0>] show_stack+0x20/0x38
        [<0000000040444424>] dump_stack+0x9c/0x110
        [<00000000402a0d38>] print_bad_pte+0x1a8/0x278
        [<00000000402a28b8>] unmap_single_vma+0x3d8/0x770
        [<00000000402a4090>] zap_page_range+0xf0/0x198
        [<00000000402ba2a4>] SyS_madvise+0x404/0x8c0
      
      Note that the pte value is 0 except for the accessed bit 0x100. This bit
      shouldn't be set without the present bit.
      
      It should be noted that the madvise system call is probably a trigger for many
      of the random segmentation faults.
      
      In looking at the kernel code, I found the following problems:
      
      1) The pte_clear define didn't take TLB lock when clearing a pte.
      2) We didn't test pte present bit inside lock in exception support.
      3) The pte and tlb locks needed to merged in order to ensure consistency
      between page table and TLB. This also has the effect of serializing TLB
      broadcasts on SMP systems.
      
      The attached change implements the above and a few other tweaks to try
      to improve performance. Based on the timing code, TLB purges are very
      slow (e.g., ~ 209 cycles per page on rp3440). Thus, I think it
      beneficial to test the split_tlb variable to avoid duplicate purges.
      Probably, all PA 2.0 machines have combined TLBs.
      
      I dropped using __flush_tlb_range in flush_tlb_mm as I realized all
      applications and most threads have a stack size that is too large to
      make this useful. I added some comments to this effect.
      
      Since implementing 1 through 3, I haven't had any random segmentation
      faults on mx3210 (rp3440) in about one week of building code and running
      as a Debian buildd.
      Signed-off-by: NJohn David Anglin <dave.anglin@bell.net>
      Cc: stable@vger.kernel.org # v3.18+
      Signed-off-by: NHelge Deller <deller@gmx.de>
      01ab6057
  16. 25 6月, 2015 1 次提交
    • L
      mm: new mm hook framework · 2ae416b1
      Laurent Dufour 提交于
      CRIU is recreating the process memory layout by remapping the checkpointee
      memory area on top of the current process (criu).  This includes remapping
      the vDSO to the place it has at checkpoint time.
      
      However some architectures like powerpc are keeping a reference to the
      vDSO base address to build the signal return stack frame by calling the
      vDSO sigreturn service.  So once the vDSO has been moved, this reference
      is no more valid and the signal frame built later are not usable.
      
      This patch serie is introducing a new mm hook framework, and a new
      arch_remap hook which is called when mremap is done and the mm lock still
      hold.  The next patch is adding the vDSO remap and unmap tracking to the
      powerpc architecture.
      
      This patch (of 3):
      
      This patch introduces a new set of header file to manage mm hooks:
      - per architecture empty header file (arch/x/include/asm/mm-arch-hooks.h)
      - a generic header (include/linux/mm-arch-hooks.h)
      
      The architecture which need to overwrite a hook as to redefine it in its
      header file, while architecture which doesn't need have nothing to do.
      
      The default hooks are defined in the generic header and are used in the
      case the architecture is not defining it.
      
      In a next step, mm hooks defined in include/asm-generic/mm_hooks.h should
      be moved here.
      Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
      Suggested-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2ae416b1
  17. 08 6月, 2015 1 次提交
  18. 19 5月, 2015 2 次提交
  19. 13 5月, 2015 2 次提交
    • T
      arch: Remove __ARCH_HAVE_CMPXCHG · a22e5f57
      Thomas Gleixner 提交于
      We removed the only user of this define in the rtmutex code. Get rid
      of it.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      a22e5f57
    • H
      parisc,metag: Fix crashes due to stack randomization on stack-grows-upwards architectures · d045c77c
      Helge Deller 提交于
      On architectures where the stack grows upwards (CONFIG_STACK_GROWSUP=y,
      currently parisc and metag only) stack randomization sometimes leads to crashes
      when the stack ulimit is set to lower values than STACK_RND_MASK (which is 8 MB
      by default if not defined in arch-specific headers).
      
      The problem is, that when the stack vm_area_struct is set up in fs/exec.c, the
      additional space needed for the stack randomization (as defined by the value of
      STACK_RND_MASK) was not taken into account yet and as such, when the stack
      randomization code added a random offset to the stack start, the stack
      effectively got smaller than what the user defined via rlimit_max(RLIMIT_STACK)
      which then sometimes leads to out-of-stack situations and crashes.
      
      This patch fixes it by adding the maximum possible amount of memory (based on
      STACK_RND_MASK) which theoretically could be added by the stack randomization
      code to the initial stack size. That way, the user-defined stack size is always
      guaranteed to be at minimum what is defined via rlimit_max(RLIMIT_STACK).
      
      This bug is currently not visible on the metag architecture, because on metag
      STACK_RND_MASK is defined to 0 which effectively disables stack randomization.
      
      The changes to fs/exec.c are inside an "#ifdef CONFIG_STACK_GROWSUP"
      section, so it does not affect other platformws beside those where the
      stack grows upwards (parisc and metag).
      Signed-off-by: NHelge Deller <deller@gmx.de>
      Cc: linux-parisc@vger.kernel.org
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: linux-metag@vger.kernel.org
      Cc: stable@vger.kernel.org # v3.16+
      d045c77c
  20. 06 5月, 2015 1 次提交
  21. 22 4月, 2015 2 次提交
  22. 17 4月, 2015 1 次提交
  23. 15 4月, 2015 1 次提交
  24. 13 4月, 2015 1 次提交
  25. 23 3月, 2015 2 次提交
  26. 01 3月, 2015 1 次提交
    • K
      mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines · c07af4f1
      Kirill A. Shutemov 提交于
      Core mm expects __PAGETABLE_{PUD,PMD}_FOLDED to be defined if these page
      table levels folded.  Usually, these defines are provided by
      <asm-generic/pgtable-nopmd.h> and <asm-generic/pgtable-nopud.h>.
      
      But some architectures fold page table levels in a custom way.  They
      need to define these macros themself.  This patch adds missing defines.
      
      The patch fixes mm->nr_pmds underflow and eliminates dead __pmd_alloc()
      and __pud_alloc() on architectures without these page table levels.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c07af4f1
  27. 17 2月, 2015 3 次提交
  28. 13 2月, 2015 1 次提交
    • A
      all arches, signal: move restart_block to struct task_struct · f56141e3
      Andy Lutomirski 提交于
      If an attacker can cause a controlled kernel stack overflow, overwriting
      the restart block is a very juicy exploit target.  This is because the
      restart_block is held in the same memory allocation as the kernel stack.
      
      Moving the restart block to struct task_struct prevents this exploit by
      making the restart_block harder to locate.
      
      Note that there are other fields in thread_info that are also easy
      targets, at least on some architectures.
      
      It's also a decent simplification, since the restart code is more or less
      identical on all architectures.
      
      [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
      Signed-off-by: NAndy Lutomirski <luto@amacapital.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: NRichard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f56141e3