1. 31 May 2019 (1 commit)
  2. 24 May 2019 (5 commits)
  3. 21 May 2019 (2 commits)
    • treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 · 1ccea77e
      Committed by Thomas Gleixner
      Based on 2 normalized pattern(s):
      
        this program is free software you can redistribute it and or modify
        it under the terms of the gnu general public license as published by
        the free software foundation either version 2 of the license or at
        your option any later version this program is distributed in the
        hope that it will be useful but without any warranty without even
        the implied warranty of merchantability or fitness for a particular
        purpose see the gnu general public license for more details you
        should have received a copy of the gnu general public license along
        with this program if not see http www gnu org licenses
      
        this program is free software you can redistribute it and or modify
        it under the terms of the gnu general public license as published by
        the free software foundation either version 2 of the license or at
        your option any later version this program is distributed in the
        hope that it will be useful but without any warranty without even
        the implied warranty of merchantability or fitness for a particular
        purpose see the gnu general public license for more details [based]
        [from] [clk] [highbank] [c] you should have received a copy of the
        gnu general public license along with this program if not see http
        www gnu org licenses
      
      extracted by the scancode license scanner the SPDX license identifier
      
        GPL-2.0-or-later
      
      has been chosen to replace the boilerplate/reference in 355 file(s).
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
      Reviewed-by: Steve Winslow <swinslow@gmail.com>
      Reviewed-by: Allison Randal <allison@lohutok.net>
      Cc: linux-spdx@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ccea77e
    • treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 1 · 16216333
      Committed by Thomas Gleixner
      Based on 2 normalized pattern(s):
      
        this program is free software you can redistribute it and or modify
        it under the terms of the gnu general public license as published by
        the free software foundation either version 2 of the license or at
        your option any later version this program is distributed in the
        hope that it will be useful but without any warranty without even
        the implied warranty of merchantability or fitness for a particular
        purpose see the gnu general public license for more details you
        should have received a copy of the gnu general public license along
        with this program if not write to the free software foundation inc
        51 franklin street fifth floor boston ma 02110 1301 usa
      
        this program is free software you can redistribute it and or modify
        it under the terms of the gnu general public license as published by
        the free software foundation either version 2 of the license or at
        your option [no]_[pad]_[ctrl] any later version this program is
        distributed in the hope that it will be useful but without any
        warranty without even the implied warranty of merchantability or
        fitness for a particular purpose see the gnu general public license
        for more details you should have received a copy of the gnu general
        public license along with this program if not write to the free
        software foundation inc 51 franklin street fifth floor boston ma
        02110 1301 usa
      
      extracted by the scancode license scanner the SPDX license identifier
      
        GPL-2.0-or-later
      
      has been chosen to replace the boilerplate/reference in 176 file(s).
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
      Reviewed-by: Steve Winslow <swinslow@gmail.com>
      Reviewed-by: Allison Randal <allison@lohutok.net>
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: linux-spdx@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190519154040.652910950@linutronix.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      16216333
  4. 17 May 2019 (1 commit)
    • powerpc/mm/hash: Fix get_region_id() for invalid addresses · c179976c
      Committed by Aneesh Kumar K.V
      Accesses by userspace to random addresses outside the user or kernel
      address range will generate an SLB fault. When we handle that fault we
      classify the effective address into several classes, eg. user, kernel
      linear, kernel virtual etc.
      
      For addresses that are completely outside of any valid range, we
      should not insert an SLB entry at all, and instead should
      immediately report an exception.
      
      In the past this was handled in two ways. Firstly we would check the
      top nibble of the address (using REGION_ID(ea)) and that would tell us
      if the address was user (0), kernel linear (c), kernel virtual (d), or
      vmemmap (f). If the address didn't match any of these it was invalid.
      
      Then for each type of address we would do a secondary check. For the
      user region we check against H_PGTABLE_RANGE, for kernel linear we
      would mask the top nibble of the address and then check the address
      against MAX_PHYSMEM_BITS.
      
      As part of commit 0034d395 ("powerpc/mm/hash64: Map all the kernel
      regions in the same 0xc range") we replaced REGION_ID() with
      get_region_id() and changed the masking of the top nibble to only mask
      the top two bits, which introduced a bug.
      
      Addresses less than (4 << 60) are still handled correctly, they are
      either less than (1 << 60) in which case they are subject to the
      H_PGTABLE_RANGE check, or they are correctly checked against
      MAX_PHYSMEM_BITS.
      
      However addresses from (4 << 60) to ((0xc << 60) - 1), are incorrectly
      treated as kernel linear addresses in get_region_id(). Then the top
      two bits are cleared by EA_MASK in slb_allocate_kernel() and the
      address is checked against MAX_PHYSMEM_BITS, which it passes due to
      the masking. The end result is we incorrectly insert SLB entries for
      those addresses.
      
      That is not actually catastrophic, having inserted the SLB entry we
      will then go on to take a page fault for the address and at that point
      we detect the problem and report it as a bad fault.
      
      Still we should not be inserting those entries, or treating them as
      kernel linear addresses in the first place. So fix get_region_id() to
      detect addresses in that range and return an invalid region id, which
      causes us to not insert an SLB entry and directly report an
      exception.
      
      Fixes: 0034d395 ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      [mpe: Drop change to EA_MASK for now, rewrite change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      c179976c
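      The classification change above amounts to rejecting any effective
      address whose top nibble does not belong to a known region before an
      SLB entry is built. A standalone C sketch of that idea follows; the
      enum names and the 0xc kernel base are assumptions taken from the
      commit text, and the sub-regions inside the 0xc range (linear map,
      vmalloc, vmemmap, ...) are deliberately omitted.

        #include <stdint.h>
        #include <stdio.h>

        /* Toy model of the fixed classification, not the kernel's get_region_id(). */
        enum region_id { USER_REGION, KERNEL_REGION, INVALID_REGION };

        static enum region_id classify_ea(uint64_t ea)
        {
            uint64_t id = ea >> 60;     /* top nibble of the effective address */

            if (id == 0x0)
                return USER_REGION;     /* still subject to the pgtable range check */
            if (id == 0xc)
                return KERNEL_REGION;   /* all kernel regions live in the 0xc range */
            return INVALID_REGION;      /* e.g. 4 << 60 .. (0xc << 60) - 1: no SLB entry */
        }

        int main(void)
        {
            printf("%d\n", classify_ea(0x5000000000000000ULL));    /* INVALID_REGION */
            printf("%d\n", classify_ea(0xc000000000000000ULL));    /* KERNEL_REGION */
            return 0;
        }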
  5. 16 5月, 2019 1 次提交
    • powerpc/mm: Drop VM_BUG_ON in get_region_id() · 6457f42e
      Committed by Aneesh Kumar K.V
      We call get_region_id() without validating the ea value. That means
      with a wrong ea value we hit the BUG as below.
      
        kernel BUG at arch/powerpc/include/asm/book3s/64/hash.h:129!
        Oops: Exception in kernel mode, sig: 5 [#1]
        LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
        CPU: 0 PID: 3937 Comm: access_tests Not tainted 5.1.0
        ....
        NIP [c00000000007ba20] do_slb_fault+0x70/0x320
        LR [c00000000000896c] data_access_slb_common+0x15c/0x1a0
      
      Fix this by removing the VM_BUG_ON. All callers make sure the returned
      region id is valid and error out otherwise.
      
      Fixes: 0034d395 ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range")
      Reported-by: Andrew Donnellan <ajd@linux.ibm.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      6457f42e
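      Since the commit relies on every caller validating the returned region
      id, here is a minimal caller-side sketch under the same toy model; the
      function name and error convention are assumptions, not the real
      do_slb_fault().

        #include <stdint.h>
        #include <stdio.h>

        enum region_id { USER_REGION, KERNEL_REGION, INVALID_REGION };

        /* Same toy classifier as in the previous sketch, repeated so this compiles alone. */
        static enum region_id classify_ea(uint64_t ea)
        {
            uint64_t id = ea >> 60;

            return id == 0x0 ? USER_REGION :
                   id == 0xc ? KERNEL_REGION : INVALID_REGION;
        }

        static int handle_slb_fault_model(uint64_t ea)
        {
            /* No BUG() in the classifier: the caller checks the returned id
             * and errors out instead of inserting an SLB entry. */
            if (classify_ea(ea) == INVALID_REGION)
                return -1;

            /* ... build and insert an SLB entry for the valid region ... */
            return 0;
        }

        int main(void)
        {
            printf("%d\n", handle_slb_fault_model(0x5000000000000000ULL));    /* -1 */
            return 0;
        }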
  6. 15 May 2019 (1 commit)
  7. 11 May 2019 (1 commit)
  8. 09 May 2019 (2 commits)
    • x86/mpx, mm/core: Fix recursive munmap() corruption · 5a28fc94
      Committed by Dave Hansen
      This is a bit of a mess, to put it mildly.  But, it's a bug
      that only seems to have shown up in 4.20 but wasn't noticed
      until now, because nobody uses MPX.
      
      MPX has the arch_unmap() hook inside of munmap() because MPX
      uses bounds tables that protect other areas of memory.  When
      memory is unmapped, there is also a need to unmap the MPX
      bounds tables.  Barring this, unused bounds tables can eat 80%
      of the address space.
      
      But, the recursive do_munmap() that gets called via arch_unmap()
      wreaks havoc with __do_munmap()'s state.  It can result in
      freeing populated page tables, accessing bogus VMA state,
      double-freed VMAs and more.
      
      See the "long story" further below for the gory details.
      
      To fix this, call arch_unmap() before __do_munmap() has a chance
      to do anything meaningful.  Also, remove the 'vma' argument
      and force the MPX code to do its own, independent VMA lookup.
      
      == UML / unicore32 impact ==
      
      Remove unused 'vma' argument to arch_unmap().  No functional
      change.
      
      I compile tested this on UML but not unicore32.
      
      == powerpc impact ==
      
      powerpc uses arch_unmap() to watch for munmap() on the
      VDSO and zeroes out 'current->mm->context.vdso_base'.  Moving
      arch_unmap() makes this happen earlier in __do_munmap().  But,
      'vdso_base' seems to only be used in perf and in the signal
      delivery that happens near the return to userspace.  I can not
      find any likely impact to powerpc, other than the zeroing
      happening a little earlier.
      
      powerpc does not use the 'vma' argument and is unaffected by
      its removal.
      
      I compile-tested a 64-bit powerpc defconfig.
      
      == x86 impact ==
      
      For the common success case this is functionally identical to
      what was there before.  For the munmap() failure case, it's
      possible that some MPX tables will be zapped for memory that
      continues to be in use.  But, this is an extraordinarily
      unlikely scenario and the harm would be that MPX provides no
      protection since the bounds table got reset (zeroed).
      
      I can't imagine anyone doing this:
      
      	ptr = mmap();
      	// use ptr
      	ret = munmap(ptr);
      	if (ret)
      		// oh, there was an error, I'll
      		// keep using ptr.
      
      Because if you're doing munmap(), you are *done* with the
      memory.  There's probably no good data in there _anyway_.
      
      This passes the original reproducer from Richard Biener as
      well as the existing mpx selftests/.
      
      The long story:
      
      munmap() has a couple of pieces:
      
       1. Find the affected VMA(s)
       2. Split the start/end one(s) if necessary
       3. Pull the VMAs out of the rbtree
       4. Actually zap the memory via unmap_region(), including
          freeing page tables (or queueing them to be freed).
       5. Fix up some of the accounting (like fput()) and actually
          free the VMA itself.
      
      This specific ordering was actually introduced by:
      
        dd2283f2 ("mm: mmap: zap pages with read mmap_sem in munmap")
      
      during the 4.20 merge window.  The previous __do_munmap() code
      was actually safe because the only thing after arch_unmap() was
      remove_vma_list().  arch_unmap() could not see 'vma' in the
      rbtree because it was detached, so it is not even capable of
      doing operations unsafe for remove_vma_list()'s use of 'vma'.
      
      Richard Biener reported a test that shows this in dmesg:
      
        [1216548.787498] BUG: Bad rss-counter state mm:0000000017ce560b idx:1 val:551
        [1216548.787500] BUG: non-zero pgtables_bytes on freeing mm: 24576
      
      What triggered this was the recursive do_munmap() called via
      arch_unmap().  It was freeing page tables that had not been
      properly zapped.
      
      But, the problem was bigger than this.  For one, arch_unmap()
      can free VMAs.  But, the calling __do_munmap() has variables
      that *point* to VMAs and obviously can't handle them just
      getting freed while the pointer is still in use.
      
      I tried a couple of things here.  First, I tried to fix the page
      table freeing problem in isolation, but I then found the VMA
      issue.  I also tried having the MPX code return a flag if it
      modified the rbtree, which would force __do_munmap() to re-walk
      and restart.  That spiralled out of control in complexity pretty
      fast.
      
      Just moving arch_unmap() and accepting that the bonkers failure
      case might eat some bounds tables seems like the simplest viable
      fix.
      
      This was also reported in the following kernel bugzilla entry:
      
        https://bugzilla.kernel.org/show_bug.cgi?id=203123
      
      There are some reports that this commit triggered this bug:
      
        dd2283f2 ("mm: mmap: zap pages with read mmap_sem in munmap")
      
      While that commit certainly made the issues easier to hit, I believe
      the fundamental issue has been with us as long as MPX itself, thus
      the Fixes: tag below is for one of the original MPX commits.
      
      [ mingo: Minor edits to the changelog and the patch. ]
      Reported-by: Richard Biener <rguenther@suse.de>
      Reported-by: H.J. Lu <hjl.tools@gmail.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: linux-um@lists.infradead.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: stable@vger.kernel.org
      Fixes: dd2283f2 ("mm: mmap: zap pages with read mmap_sem in munmap")
      Link: http://lkml.kernel.org/r/20190419194747.5E1AD6DC@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5a28fc94
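      The core of the fix is an ordering change inside the munmap path. The
      toy model below shows only that ordering, with assumed names; the real
      code lives in mm/mmap.c and the per-arch arch_unmap() implementations.

        #include <stdio.h>

        struct mm { int dummy; };    /* placeholder, not the kernel's mm_struct */

        static void arch_unmap_model(struct mm *mm, unsigned long start, unsigned long end)
        {
            /* MPX-style hook: may recursively unmap bounds tables. It does its
             * own VMA lookup now, which is why the 'vma' argument was dropped. */
            (void)mm;
            printf("arch hook for %#lx-%#lx\n", start, end);
        }

        static int do_munmap_model(struct mm *mm, unsigned long start, unsigned long len)
        {
            arch_unmap_model(mm, start, start + len);    /* runs first after the fix */

            /* ... only now: find/split VMAs, detach them from the tree,
             *     zap pages and page tables, free the VMAs ... */
            return 0;
        }

        int main(void)
        {
            struct mm mm = { 0 };

            return do_munmap_model(&mm, 0x10000, 0x1000);
        }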
    • powerpc/64s: Use early_mmu_has_feature() in set_kuap() · 8150a153
      Committed by Michael Ellerman
      When implementing the KUAP support on Radix we fixed one case where
      mmu_has_feature() was being called too early in boot via
      __put_user_size().
      
      However since then some new code in linux-next has created a new path
      via which we can end up calling mmu_has_feature() too early.
      
      On P9 this leads to crashes early in boot if we have both PPC_KUAP and
      CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. Our early boot code
      calls printk() which calls probe_kernel_read(), that does a
      __copy_from_user_inatomic() which calls into set_kuap() and that uses
      mmu_has_feature().
      
      At that point in boot we haven't patched MMU features yet so the debug
      code in mmu_has_feature() complains, and calls printk(). At that point
      we recurse, eg:
      
        ...
        dump_stack+0xdc
        probe_kernel_read+0x1a4
        check_pointer+0x58
        ...
        printk+0x40
        dump_stack_print_info+0xbc
        dump_stack+0x8
        probe_kernel_read+0x1a4
        probe_kernel_read+0x19c
        check_pointer+0x58
        ...
        printk+0x40
        cpufeatures_process_feature+0xc8
        scan_cpufeatures_subnodes+0x380
        of_scan_flat_dt_subnodes+0xb4
        dt_cpu_ftrs_scan_callback+0x158
        of_scan_flat_dt+0xf0
        dt_cpu_ftrs_scan+0x3c
        early_init_devtree+0x360
        early_setup+0x9c
      
      And so on for infinity, symptom is a dead system.
      
      Even more fun is what happens when using the hash MMU (ie. p8 or p9
      with Radix disabled), and when we don't have
      CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. With the debug disabled
      we don't check if static keys have been initialised, we just rely on
      the jump label. But the jump label defaults to true so we just whack
      the AMR even though Radix is not enabled.
      
      Clearing the AMR is fine, but after we've done the user copy we write
      (0b11 << 62) into AMR. When using hash that makes all pages with key
      zero no longer readable or writable. All kernel pages implicitly have
      key zero, and so all of a sudden the kernel can't read or write any of
      its memory. Again dead system.
      
      In the medium term we have several options for fixing this.
      probe_kernel_read() doesn't need to touch AMR at all, it's not doing a
      user access after all, but it uses __copy_from_user_inatomic() just
      because it's easy, we could fix that.
      
      It would also be safe to default to not writing to the AMR during
      early boot, until we've detected features. But it's not clear that
      flipping all the MMU features to static_key_false won't introduce
      other bugs.
      
      But for now just switch to early_mmu_has_feature() in set_kuap(), that
      avoids all the problems with jump labels. It adds the overhead of a
      global lookup and test, but that's probably trivial compared to the
      writes to the AMR anyway.
      
      Fixes: 890274c2 ("powerpc/64s: Implement KUAP for Radix MMU")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Russell Currey <ruscur@russell.cc>
      8150a153
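      The switch works because an early feature check is just a load and test
      on a global mask, with no dependence on patched jump labels. A userspace
      sketch of that distinction follows; the feature bit and the function
      names are assumptions for illustration only.

        #include <stdbool.h>
        #include <stdint.h>

        static uint64_t mmu_features;            /* stands in for the MMU feature mask */
        #define FTR_ASSUMED_KUAP (1ULL << 0)     /* placeholder feature bit */

        static bool early_mmu_has_feature_model(uint64_t mask)
        {
            return (mmu_features & mask) != 0;   /* plain load and test, safe before patching */
        }

        static void set_kuap_model(unsigned long value)
        {
            if (!early_mmu_has_feature_model(FTR_ASSUMED_KUAP))
                return;                          /* leave the AMR alone if KUAP isn't active */
            /* ... on real hardware this would write 'value' to the AMR SPR ... */
            (void)value;
        }

        int main(void)
        {
            set_kuap_model(0);    /* no-op here: the feature bit is not set in the model */
            return 0;
        }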
  9. 07 May 2019 (1 commit)
    • powerpc/book3s/64: check for NULL pointer in pgd_alloc() · f3935626
      Committed by Rick Lindsley
      When the memset code was added to pgd_alloc(), it failed to consider
      that kmem_cache_alloc() can return NULL. It's uncommon, but not
      impossible under heavy memory contention. Example oops:
      
        Unable to handle kernel paging request for data at address 0x00000000
        Faulting instruction address: 0xc0000000000a4000
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE SMP NR_CPUS=2048 NUMA pSeries
        CPU: 70 PID: 48471 Comm: entrypoint.sh Kdump: loaded Not tainted 4.14.0-115.6.1.el7a.ppc64le #1
        task: c000000334a00000 task.stack: c000000331c00000
        NIP:  c0000000000a4000 LR: c00000000012f43c CTR: 0000000000000020
        REGS: c000000331c039c0 TRAP: 0300   Not tainted  (4.14.0-115.6.1.el7a.ppc64le)
        MSR:  800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 44022840  XER: 20040000
        CFAR: c000000000008874 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
        ...
        NIP [c0000000000a4000] memset+0x68/0x104
        LR [c00000000012f43c] mm_init+0x27c/0x2f0
        Call Trace:
          mm_init+0x260/0x2f0 (unreliable)
          copy_mm+0x11c/0x638
          copy_process.isra.28.part.29+0x6fc/0x1080
          _do_fork+0xdc/0x4c0
          ppc_clone+0x8/0xc
        Instruction dump:
        409e000c b0860000 38c60002 409d000c 90860000 38c60004 78a0d183 78a506a0
        7c0903a6 41820034 60000000 60420000 <f8860000> f8860008 f8860010 f8860018
      
      Fixes: fc5c2f4a ("powerpc/mm/hash64: Zero PGD pages on allocation")
      Cc: stable@vger.kernel.org # v4.16+
      Signed-off-by: Rick Lindsley <ricklind@vnet.linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      f3935626
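      The fix itself is a NULL check before the memset. A userspace sketch
      under assumed names and sizes, with malloc() standing in for
      kmem_cache_alloc():

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        #define PGD_TABLE_SIZE 4096    /* placeholder size for illustration */

        static void *pgd_alloc_model(void)
        {
            void *pgd = malloc(PGD_TABLE_SIZE);

            if (!pgd)
                return NULL;    /* the missing check: never memset() a NULL pointer */

            memset(pgd, 0, PGD_TABLE_SIZE);    /* zero the new PGD page as before */
            return pgd;
        }

        int main(void)
        {
            void *pgd = pgd_alloc_model();

            printf("pgd_alloc_model: %s\n", pgd ? "ok" : "allocation failed");
            free(pgd);
            return 0;
        }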
  10. 03 May 2019 (4 commits)
  11. 02 May 2019 (21 commits)