1. 07 Sep 2018 (1 commit)
    • arm64: fix erroneous warnings in page freeing functions · fac880c7
      Mark Rutland authored
      In pmd_free_pte_page() and pud_free_pmd_page() we try to warn if they
      hit a present non-table entry. In both cases we'll warn for non-present
      entries, as the VM_WARN_ON() only checks the entry is not a table entry.
      
      This has been observed to result in warnings when booting a v4.19-rc2
      kernel under qemu.
      
      Fix this by bailing out earlier for non-present entries.
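      The check ordering described above can be modelled in plain, standalone C (this is a sketch, not the kernel code; the descriptor bit layout follows the arm64 convention of bit 0 = valid, bit 1 = table, and the helper names are hypothetical):

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>

      /* Model of an arm64 table descriptor: bits [1:0] = 0b11 for a table
       * entry, 0b01 for a block entry, 0b00 for an invalid (non-present)
       * entry. This is an illustrative sketch, not kernel code. */
      #define DESC_VALID (1ULL << 0)
      #define DESC_TABLE (1ULL << 1)

      static int desc_present(uint64_t desc) { return (desc & DESC_VALID) != 0; }
      static int desc_table(uint64_t desc)
      {
          return (desc & (DESC_VALID | DESC_TABLE)) == (DESC_VALID | DESC_TABLE);
      }

      /* Before the fix: warn whenever the entry is not a table entry,
       * which wrongly includes non-present entries. */
      static int old_would_warn(uint64_t desc) { return !desc_table(desc); }

      /* After the fix: bail out first for non-present entries, so only a
       * *present* non-table entry triggers the warning. */
      static int new_would_warn(uint64_t desc)
      {
          if (!desc_present(desc))
              return 0;   /* nothing mapped: nothing to free, no warning */
          return !desc_table(desc);
      }

      int main(void)
      {
          uint64_t invalid = 0;
          uint64_t block = DESC_VALID;
          uint64_t table = DESC_VALID | DESC_TABLE;

          assert(old_would_warn(invalid));  /* the erroneous warning */
          assert(!new_would_warn(invalid)); /* fixed: silent for empty entries */
          assert(!new_would_warn(table));
          assert(new_would_warn(block));    /* still warns on present non-table */
          puts("ok");
          return 0;
      }
      ```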
      
      Fixes: ec28bb9c ("arm64: Implement page table free interfaces")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  2. 31 Aug 2018 (1 commit)
    • arm64: mm: always enable CONFIG_HOLES_IN_ZONE · f52bb98f
      James Morse authored
      Commit 6d526ee2 ("arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA")
      only enabled HOLES_IN_ZONE for NUMA systems because the NUMA code was
      choking on the missing zone for nomap pages. This problem doesn't just
      apply to NUMA systems.
      
      If the architecture doesn't set HAVE_ARCH_PFN_VALID, pfn_valid() will
      return true if the pfn is part of a valid sparsemem section.
      
      When working with multiple pages, the mm code uses pfn_valid_within()
      to test that each page it uses within the sparsemem section is valid. On
      most systems memory comes in MAX_ORDER_NR_PAGES chunks which all
      have valid/initialised struct pages. In this case pfn_valid_within()
      is optimised out.
      
      Systems where this isn't true (e.g. due to nomap) should set
      HOLES_IN_ZONE and provide HAVE_ARCH_PFN_VALID so that mm tests each
      page as it works with it.
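      The pfn_valid_within() switch described above can be modelled in a small standalone program (a sketch: the toy pfn_valid() and the hole layout are invented for illustration; the two macro variants mirror the shape of the real HOLES_IN_ZONE conditional):

      ```c
      #include <assert.h>
      #include <stdio.h>

      /* Toy model: pfns 0..9 exist, but pfn 4 is a nomap hole with no
       * initialised struct page. pfn_valid() here stands in for an arch's
       * HAVE_ARCH_PFN_VALID implementation. */
      static int pfn_valid(unsigned long pfn)
      {
          return pfn < 10 && pfn != 4;
      }

      /* Mirrors the shape of the mm conditional: with HOLES_IN_ZONE the mm
       * code checks every pfn; without it the check is optimised out to 1. */
      #define pfn_valid_within_holes(pfn)   pfn_valid(pfn)
      #define pfn_valid_within_noholes(pfn) (1)

      int main(void)
      {
          /* Without HOLES_IN_ZONE, the hole pfn is blindly treated as
           * valid, and its poisoned struct page later trips
           * VM_BUG_ON_PAGE(), as in the oops below. */
          assert(pfn_valid_within_noholes(4) == 1);

          /* With HOLES_IN_ZONE, the per-page check catches the hole. */
          assert(pfn_valid_within_holes(4) == 0);
          assert(pfn_valid_within_holes(3) == 1);
          puts("ok");
          return 0;
      }
      ```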
      
      Currently non-NUMA arm64 systems can't enable HOLES_IN_ZONE, leading to
      a VM_BUG_ON():
      
      | page:fffffdff802e1780 is uninitialized and poisoned
      | raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
      | raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
      | page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
      | ------------[ cut here ]------------
      | kernel BUG at include/linux/mm.h:978!
      | Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      [...]
      | CPU: 1 PID: 25236 Comm: dd Not tainted 4.18.0 #7
      | Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
      | pstate: 40000085 (nZcv daIf -PAN -UAO)
      | pc : move_freepages_block+0x144/0x248
      | lr : move_freepages_block+0x144/0x248
      | sp : fffffe0071177680
      [...]
      | Process dd (pid: 25236, stack limit = 0x0000000094cc07fb)
      | Call trace:
      |  move_freepages_block+0x144/0x248
      |  steal_suitable_fallback+0x100/0x16c
      |  get_page_from_freelist+0x440/0xb20
      |  __alloc_pages_nodemask+0xe8/0x838
      |  new_slab+0xd4/0x418
      |  ___slab_alloc.constprop.27+0x380/0x4a8
      |  __slab_alloc.isra.21.constprop.26+0x24/0x34
      |  kmem_cache_alloc+0xa8/0x180
      |  alloc_buffer_head+0x1c/0x90
      |  alloc_page_buffers+0x68/0xb0
      |  create_empty_buffers+0x20/0x1ec
      |  create_page_buffers+0xb0/0xf0
      |  __block_write_begin_int+0xc4/0x564
      |  __block_write_begin+0x10/0x18
      |  block_write_begin+0x48/0xd0
      |  blkdev_write_begin+0x28/0x30
      |  generic_perform_write+0x98/0x16c
      |  __generic_file_write_iter+0x138/0x168
      |  blkdev_write_iter+0x80/0xf0
      |  __vfs_write+0xe4/0x10c
      |  vfs_write+0xb4/0x168
      |  ksys_write+0x44/0x88
      |  sys_write+0xc/0x14
      |  el0_svc_naked+0x30/0x34
      | Code: aa1303e0 90001a01 91296421 94008902 (d4210000)
      | ---[ end trace 1601ba47f6e883fe ]---
      
      Remove the NUMA dependency.
      
      Link: https://www.spinics.net/lists/arm-kernel/msg671851.html
      Cc: <stable@vger.kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Reported-by: Mikulas Patocka <mpatocka@redhat.com>
      Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Tested-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  3. 30 Aug 2018 (1 commit)
  4. 27 Aug 2018 (1 commit)
  5. 25 Aug 2018 (2 commits)
  6. 24 Aug 2018 (4 commits)
  7. 23 Aug 2018 (1 commit)
    • arch: enable relative relocations for arm64, power and x86 · 271ca788
      Ard Biesheuvel authored
      Patch series "add support for relative references in special sections", v10.
      
      This adds support for emitting special sections such as initcall arrays,
      PCI fixups and tracepoints as relative references rather than absolute
      references.  This reduces the size by 50% on 64-bit architectures, but
      more importantly, it removes the need for carrying relocation metadata for
      these sections in relocatable kernels (e.g., for KASLR) that needs to be
      fixed up at boot time.  On arm64, this reduces the vmlinux footprint of
      such a reference by 8x (8 byte absolute reference + 24 byte RELA entry vs
      4 byte relative reference)
      
      Patch #3 was sent out before as a single patch.  This series supersedes
      the previous submission.  This version makes relative ksymtab entries
      dependent on the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS rather
      than trying to infer from kbuild test robot replies for which
      architectures it should be blacklisted.
      
      Patch #1 introduces the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS,
      and sets it for the main architectures that are expected to benefit the
      most from this feature, i.e., 64-bit architectures or ones that use
      runtime relocations.
      
      Patch #2 adds support for #define'ing __DISABLE_EXPORTS to get rid of
      ksymtab/kcrctab sections in decompressor and EFI stub objects when
      rebuilding existing C files to run in a different context.
      
      Patches #4 - #6 implement relative references for initcalls, PCI fixups
      and tracepoints, respectively, all of which produce sections with order
      ~1000 entries on an arm64 defconfig kernel with tracing enabled.  This
      means we save about 28 KB of vmlinux space for each of these patches.
      
      [From the v7 series blurb, which included the jump_label patches as well]:
      
        For the arm64 kernel, all patches combined reduce the memory footprint
        of vmlinux by about 1.3 MB (using a config copied from Ubuntu that has
        KASLR enabled), of which ~1 MB is the size reduction of the RELA section
        in .init, and the remaining 300 KB is reduction of .text/.data.
      
      This patch (of 6):
      
      Before updating certain subsystems to use place-relative 32-bit
      relocations in special sections, to save space and reduce the number of
      absolute relocations that need to be processed at runtime by relocatable
      kernels, introduce the Kconfig symbol and define it for some architectures
      that should be able to support and benefit from it.
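      The space saving works as sketched below: an entry stores the signed 32-bit offset from its own address to the target, rather than an absolute pointer. The struct and helper names here are illustrative, not the kernel's:

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>

      /* A place-relative 32-bit reference: instead of an 8-byte absolute
       * pointer (which needs a 24-byte RELA record in a relocatable
       * kernel, e.g. with KASLR), the entry stores the 4-byte signed
       * offset from its own address to the target. */
      struct rel_entry {
          int32_t offset;   /* target - &entry->offset */
      };

      static int the_target = 42;    /* stand-in for an initcall/symbol */
      static struct rel_entry entry; /* same image as the target, so the
                                      * offset fits in 32 bits */

      static void rel_set(struct rel_entry *e, const void *target)
      {
          e->offset = (int32_t)((intptr_t)target - (intptr_t)&e->offset);
      }

      static void *rel_get(const struct rel_entry *e)
      {
          return (void *)((intptr_t)&e->offset + e->offset);
      }

      int main(void)
      {
          rel_set(&entry, &the_target);
          /* The stored offset is position-independent: relocating the
           * whole image moves entry and target together, so this 4-byte
           * field never needs a boot-time fixup. */
          int *p = rel_get(&entry);
          assert(*p == 42);
          printf("entry size: %zu bytes\n", sizeof(entry)); /* 4 vs 8 + 24 */
          return 0;
      }
      ```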
      
      Link: http://lkml.kernel.org/r/20180704083651.24360-2-ard.biesheuvel@linaro.org
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Nicolas Pitre <nico@linaro.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: James Morris <james.morris@microsoft.com>
      Cc: Jessica Yu <jeyu@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. 18 Aug 2018 (2 commits)
  9. 17 Aug 2018 (2 commits)
    • arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid() · 5ad356ea
      Greg Hackmann authored
      ARM64's pfn_valid() shifts away the upper PAGE_SHIFT bits of the input
      before seeing if the PFN is valid.  This leads to false positives when
      some of the upper bits are set, but the lower bits match a valid PFN.
      
      For example, the following userspace code looks up a bogus entry in
      /proc/kpageflags:
      
          int pagemap = open("/proc/self/pagemap", O_RDONLY);
          int pageflags = open("/proc/kpageflags", O_RDONLY);
          uint64_t pfn, val;
      
          lseek64(pagemap, [...], SEEK_SET);
          read(pagemap, &pfn, sizeof(pfn));
          if (pfn & (1UL << 63)) {        /* valid PFN */
              pfn &= ((1UL << 55) - 1);   /* clear flag bits */
              pfn |= (1UL << 55);
              lseek64(pageflags, pfn * sizeof(uint64_t), SEEK_SET);
              read(pageflags, &val, sizeof(val));
          }
      
      On ARM64 this causes the userspace process to crash with SIGSEGV rather
      than reading (1 << KPF_NOPAGE).  kpageflags_read() treats the offset as
      valid, and stable_page_flags() will try to access an address between the
      user and kernel address ranges.
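      The false positive and the shape of the fix can be reproduced in a standalone model (a sketch: the memory layout and helper names are invented; only the round-trip shift check mirrors the actual fix):

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdio.h>

      #define PAGE_SHIFT 12

      /* Stand-in for the real "is this physical address memory?" check:
       * pretend physical memory covers pfns below 0x100000. */
      static int phys_pfn_is_memory(uint64_t pfn) { return pfn < 0x100000; }

      /* Buggy shape: shifting pfn up into a physical address silently
       * drops the upper PAGE_SHIFT bits, so a bogus pfn whose low bits
       * match a valid one passes the check. */
      static int pfn_valid_buggy(uint64_t pfn)
      {
          uint64_t addr = pfn << PAGE_SHIFT; /* upper 12 bits lost */
          return phys_pfn_is_memory(addr >> PAGE_SHIFT);
      }

      /* Fixed shape: reject any pfn whose upper bits do not survive the
       * round trip through the shift. */
      static int pfn_valid_fixed(uint64_t pfn)
      {
          uint64_t addr = pfn << PAGE_SHIFT;
          if ((addr >> PAGE_SHIFT) != pfn)
              return 0;
          return phys_pfn_is_memory(addr >> PAGE_SHIFT);
      }

      int main(void)
      {
          /* Upper bit set (as in the kpageflags example), low bits valid. */
          uint64_t bogus = (1ULL << 55) | 0x1234;

          assert(pfn_valid_buggy(bogus) == 1);  /* false positive */
          assert(pfn_valid_fixed(bogus) == 0);
          assert(pfn_valid_fixed(0x1234) == 1);
          puts("ok");
          return 0;
      }
      ```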
      
      Fixes: c1cc1552 ("arm64: MMU initialisation")
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Hackmann <ghackmann@google.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Avoid calling stop_machine() when patching jump labels · f6cc0c50
      Will Deacon authored
      Patching a jump label involves patching a single instruction at a time,
      swizzling between a branch and a NOP. The architecture treats these
      instructions specially, so a concurrently executing CPU is guaranteed to
      see either the NOP or the branch, rather than an amalgamation of the two
      instruction encodings.
      
      However, in order to guarantee that the new instruction is visible, it
      is necessary to send an IPI to the concurrently executing CPU so that it
      discards any previously fetched instructions from its pipeline. This
      operation therefore cannot be completed from a context with IRQs
      disabled, but this is exactly what happens on the jump label path where
      the hotplug lock is held and irqs are subsequently disabled by
      stop_machine_cpuslocked(). This results in a deadlock during boot on
      Hikey-960.
      
      Due to the architectural guarantees around patching NOPs and branches,
      we don't actually need to stop_machine() at all on the jump label path,
      so we can avoid the deadlock by using the "nosync" variant of our
      instruction patching routine.
      
      Fixes: 693350a7 ("arm64: insn: Don't fallback on nosync path for general insn patching")
      Reported-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
      Reported-by: John Stultz <john.stultz@linaro.org>
      Tested-by: Valentin Schneider <valentin.schneider@arm.com>
      Tested-by: Tuomas Tynkkynen <tuomas@tuxera.com>
      Tested-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  10. 12 Aug 2018 (3 commits)
  11. 09 Aug 2018 (1 commit)
  12. 08 Aug 2018 (1 commit)
  13. 07 Aug 2018 (6 commits)
  14. 06 Aug 2018 (1 commit)
  15. 03 Aug 2018 (5 commits)
  16. 02 Aug 2018 (4 commits)
    • kconfig: include kernel/Kconfig.preempt from init/Kconfig · 87a4c375
      Christoph Hellwig authored
      Almost all architectures include it.  Add an ARCH_NO_PREEMPT symbol to
      disable preempt support for alpha, hexagon, non-coldfire m68k and
      user mode Linux.
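      A hypothetical sketch of the opt-out pattern this describes (not a verbatim copy of the upstream Kconfig files): an architecture selects the silent ARCH_NO_PREEMPT symbol, and the preemptible model is gated on it.

      ```kconfig
      # arch/Kconfig (sketch): silent symbol, selected by archs that
      # cannot support preemption (alpha, hexagon, non-coldfire m68k, UML)
      config ARCH_NO_PREEMPT
              bool

      # kernel/Kconfig.preempt (sketch): the preemptible model is
      # unavailable when the arch has opted out
      config PREEMPT
              bool "Preemptible Kernel (Low-Latency Desktop)"
              depends on !ARCH_NO_PREEMPT
      ```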
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • Kconfig: consolidate the "Kernel hacking" menu · 06ec64b8
      Christoph Hellwig authored
      Move the source of lib/Kconfig.debug and arch/$(ARCH)/Kconfig.debug to
      the top-level Kconfig.  For two architectures that means moving their
      arch-specific symbols in that menu into a new arch Kconfig.debug file,
      and for a few more creating a dummy file so that we can include it
      unconditionally.
      
      Also move the actual 'Kernel hacking' menu to lib/Kconfig.debug, where
      it belongs.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • kconfig: include common Kconfig files from top-level Kconfig · 1572497c
      Christoph Hellwig authored
      Instead of duplicating the source statements in every architecture just
      do it once in the toplevel Kconfig file.
      
      Note that with this, the inclusion of arch/$(SRCARCH)/Kconfig moves out of
      the top-level Kconfig into arch/Kconfig so that we don't violate ordering
      constraints while keeping a sensible menu structure.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
    • mm: do not initialize TLB stack vma's with vma_init() · 8b11ec1b
      Linus Torvalds authored
      Commit 2c4541e2 ("mm: use vma_init() to initialize VMAs on stack and
      data segments") tried to initialize various left-over ad-hoc vma's
      "properly", but actually made things worse for the temporary vma's used
      for TLB flushing.
      
      vma_init() doesn't actually initialize all of the vma, just a few
      fields, so doing something like
      
         -       struct vm_area_struct vma = { .vm_mm = tlb->mm, };
         +       struct vm_area_struct vma;
         +
         +       vma_init(&vma, tlb->mm);
      
      was actually very bad: instead of having a nicely initialized vma with
      every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma
      with only a couple of fields initialized.  And they weren't even fields
      that the code in question mostly cared about.
      
      The flush_tlb_range() function takes a "struct vma" rather than a
      "struct mm_struct", because a few architectures actually care about what
      kind of range it is - being able to only do an ITLB flush if it's a
      range that doesn't have data accesses enabled, for example.  And all the
      normal users already have the vma for doing the range invalidation.
      
      But a few people want to call flush_tlb_range() with a range they just
      made up, so they also end up using a made-up vma.  x86 just has a
      special "flush_tlb_mm_range()" function for this, but other
      architectures (arm and ia64) do the "use fake vma" thing instead, and
      thus got caught up in the vma_init() changes.
      
      At the same time, the TLB flushing code really doesn't care about most
      other fields in the vma, so vma_init() is just unnecessary and
      pointless.
      
      This fixes things by having an explicit "this is just an initializer for
      the TLB flush" initializer macro, which is used by the arm/arm64/ia64
      people who mis-use this interface with just a dummy vma.
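      The difference between the two initialization styles can be shown in a standalone sketch (the struct and names are toy stand-ins, not the kernel's; only the designated-initializer macro pattern mirrors the fix):

      ```c
      #include <assert.h>
      #include <stdio.h>
      #include <string.h>

      /* Toy stand-in for struct vm_area_struct; field names are illustrative. */
      struct toy_vma {
          void *vm_mm;
          unsigned long vm_flags;
          unsigned long vm_start;
          unsigned long vm_end;
      };

      /* Models vma_init(): sets only a couple of fields and leaves the
       * rest of a stack-allocated structure uninitialized. */
      static void toy_vma_init(struct toy_vma *vma, void *mm)
      {
          vma->vm_mm = mm;
          vma->vm_flags = 0;
      }

      /* Models the fix: a designated initializer zeroes every field that
       * is not named, which is what the TLB-flush code relied on. */
      #define TOY_TLB_FLUSH_VMA(mm, flags) { .vm_mm = (mm), .vm_flags = (flags) }

      int main(void)
      {
          int fake_mm;

          struct toy_vma good = TOY_TLB_FLUSH_VMA(&fake_mm, 0);
          assert(good.vm_mm == &fake_mm);
          assert(good.vm_start == 0 && good.vm_end == 0); /* zeroed by C */

          struct toy_vma bad;                  /* garbage on the stack */
          memset(&bad, 0xAA, sizeof(bad));     /* make the garbage visible */
          toy_vma_init(&bad, &fake_mm);
          assert(bad.vm_mm == &fake_mm);
          assert(bad.vm_start != 0);           /* still garbage */
          puts("ok");
          return 0;
      }
      ```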
      
      Fixes: 2c4541e2 ("mm: use vma_init() to initialize VMAs on stack and data segments")
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  17. 01 Aug 2018 (1 commit)
  18. 31 Jul 2018 (3 commits)