1. 17 8月, 2018 1 次提交
    • G
      arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid() · 5ad356ea
      Greg Hackmann 提交于
      ARM64's pfn_valid() shifts away the upper PAGE_SHIFT bits of the input
      before seeing if the PFN is valid.  This leads to false positives when
      some of the upper bits are set, but the lower bits match a valid PFN.
      
      For example, the following userspace code looks up a bogus entry in
      /proc/kpageflags:
      
          int pagemap = open("/proc/self/pagemap", O_RDONLY);
          int pageflags = open("/proc/kpageflags", O_RDONLY);
          uint64_t pfn, val;
      
          lseek64(pagemap, [...], SEEK_SET);
          read(pagemap, &pfn, sizeof(pfn));
          if (pfn & (1UL << 63)) {        /* valid PFN */
              pfn &= ((1UL << 55) - 1);   /* clear flag bits */
              pfn |= (1UL << 55);
              lseek64(pageflags, pfn * sizeof(uint64_t), SEEK_SET);
              read(pageflags, &val, sizeof(val));
          }
      
      On ARM64 this causes the userspace process to crash with SIGSEGV rather
      than reading (1 << KPF_NOPAGE).  kpageflags_read() treats the offset as
      valid, and stable_page_flags() will try to access an address between the
      user and kernel address ranges.
      
      Fixes: c1cc1552 ("arm64: MMU initialisation")
      Cc: stable@vger.kernel.org
      Signed-off-by: NGreg Hackmann <ghackmann@google.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      5ad356ea
  2. 25 7月, 2018 1 次提交
    • J
      arm64: fix vmemmap BUILD_BUG_ON() triggering on !vmemmap setups · 7b0eb6b4
      Johannes Weiner 提交于
      Arnd reports the following arm64 randconfig build error with the PSI
      patches that add another page flag:
      
        /git/arm-soc/arch/arm64/mm/init.c: In function 'mem_init':
        /git/arm-soc/include/linux/compiler.h:357:38: error: call to
        '__compiletime_assert_618' declared with attribute error: BUILD_BUG_ON
        failed: sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT)
      
      The additional page flag causes other information stored in
      page->flags to get bumped into their own struct page member:
      
        #if SECTIONS_WIDTH+ZONES_WIDTH+NODES_SHIFT+LAST_CPUPID_SHIFT <=
        BITS_PER_LONG - NR_PAGEFLAGS
        #define LAST_CPUPID_WIDTH LAST_CPUPID_SHIFT
        #else
        #define LAST_CPUPID_WIDTH 0
        #endif
      
        #if defined(CONFIG_NUMA_BALANCING) && LAST_CPUPID_WIDTH == 0
        #define LAST_CPUPID_NOT_IN_PAGE_FLAGS
        #endif
      
      which in turn causes the struct page size to exceed the size set in
      STRUCT_PAGE_MAX_SHIFT. This value is an an estimate used to size the
      VMEMMAP page array according to address space and struct page size.
      
      However, the check is performed - and triggers here - on a !VMEMMAP
      config, which consumes an additional 22 page bits for the sparse
      section id. When VMEMMAP is enabled, those bits are returned, cpupid
      doesn't need its own member, and the page passes the VMEMMAP check.
      
      Restrict that check to the situation it was meant to check: that we
      are sizing the VMEMMAP page array correctly.
      
      Says Arnd:
      
          Further experiments show that the build error already existed before,
          but was only triggered with larger values of CONFIG_NR_CPU and/or
          CONFIG_NODES_SHIFT that might be used in actual configurations but
          not in randconfig builds.
      
          With longer CPU and node masks, I could recreate the problem with
          kernels as old as linux-4.7 when arm64 NUMA support got added.
      Reported-by: NArnd Bergmann <arnd@arndb.de>
      Tested-by: NArnd Bergmann <arnd@arndb.de>
      Cc: stable@vger.kernel.org
      Fixes: 1a2db300 ("arm64, numa: Add NUMA support for arm64 platforms.")
      Fixes: 3e1907d5 ("arm64: mm: move vmemmap region right below the linear region")
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      7b0eb6b4
  3. 15 6月, 2018 1 次提交
  4. 01 5月, 2018 1 次提交
  5. 27 3月, 2018 1 次提交
  6. 07 3月, 2018 1 次提交
    • C
      arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size) · 1f85b42a
      Catalin Marinas 提交于
      Commit 97303480 ("arm64: Increase the max granular size") increased
      the cache line size to 128 to match Cavium ThunderX, apparently for some
      performance benefit which could not be confirmed. This change, however,
      has an impact on the network packets allocation in certain
      circumstances, requiring slightly over a 4K page with a significant
      performance degradation.
      
      This patch reverts L1_CACHE_SHIFT back to 6 (64-byte cache line) while
      keeping ARCH_DMA_MINALIGN at 128. The cache_line_size() function was
      changed to default to ARCH_DMA_MINALIGN in the absence of a meaningful
      CTR_EL0.CWG bit field.
      
      In addition, if a system with ARCH_DMA_MINALIGN < CTR_EL0.CWG is
      detected, the kernel will force swiotlb bounce buffering for all
      non-coherent devices since DMA cache maintenance on sub-CWG ranges is
      not safe, leading to data corruption.
      
      Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com>
      Cc: Timur Tabi <timur@codeaurora.org>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      1f85b42a
  7. 19 1月, 2018 1 次提交
  8. 16 1月, 2018 1 次提交
  9. 15 1月, 2018 1 次提交
  10. 12 12月, 2017 1 次提交
    • S
      arm64: Initialise high_memory global variable earlier · f24e5834
      Steve Capper 提交于
      The high_memory global variable is used by
      cma_declare_contiguous(.) before it is defined.
      
      We don't notice this as we compute __pa(high_memory - 1), and it looks
      like we're processing a VA from the direct linear map.
      
      This problem becomes apparent when we flip the kernel virtual address
      space and the linear map is moved to the bottom of the kernel VA space.
      
      This patch moves the initialisation of high_memory before it used.
      
      Cc: <stable@vger.kernel.org>
      Fixes: f7426b98 ("mm: cma: adjust address limit to avoid hitting low/high memory boundary")
      Signed-off-by: NSteve Capper <steve.capper@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      f24e5834
  11. 06 4月, 2017 4 次提交
  12. 17 1月, 2017 1 次提交
    • A
      arm64: Fix swiotlb fallback allocation · 524dabe1
      Alexander Graf 提交于
      Commit b67a8b29 introduced logic to skip swiotlb allocation when all memory
      is DMA accessible anyway.
      
      While this is a great idea, __dma_alloc still calls swiotlb code unconditionally
      to allocate memory when there is no CMA memory available. The swiotlb code is
      called to ensure that we at least try get_free_pages().
      
      Without initialization, swiotlb allocation code tries to access io_tlb_list
      which is NULL. That results in a stack trace like this:
      
        Unable to handle kernel NULL pointer dereference at virtual address 00000000
        [...]
        [<ffff00000845b908>] swiotlb_tbl_map_single+0xd0/0x2b0
        [<ffff00000845be94>] swiotlb_alloc_coherent+0x10c/0x198
        [<ffff000008099dc0>] __dma_alloc+0x68/0x1a8
        [<ffff000000a1b410>] drm_gem_cma_create+0x98/0x108 [drm]
        [<ffff000000abcaac>] drm_fbdev_cma_create_with_funcs+0xbc/0x368 [drm_kms_helper]
        [<ffff000000abcd84>] drm_fbdev_cma_create+0x2c/0x40 [drm_kms_helper]
        [<ffff000000abc040>] drm_fb_helper_initial_config+0x238/0x410 [drm_kms_helper]
        [<ffff000000abce88>] drm_fbdev_cma_init_with_funcs+0x98/0x160 [drm_kms_helper]
        [<ffff000000abcf90>] drm_fbdev_cma_init+0x40/0x58 [drm_kms_helper]
        [<ffff000000b47980>] vc4_kms_load+0x90/0xf0 [vc4]
        [<ffff000000b46a94>] vc4_drm_bind+0xec/0x168 [vc4]
        [...]
      
      Thankfully swiotlb code just learned how to not do allocations with the FORCE_NO
      option. This patch configures the swiotlb code to use that if we decide not to
      initialize the swiotlb framework.
      
      Fixes: b67a8b29 ("arm64: mm: only initialize swiotlb when necessary")
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      CC: Jisheng Zhang <jszhang@marvell.com>
      CC: Geert Uytterhoeven <geert+renesas@glider.be>
      CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      524dabe1
  13. 12 1月, 2017 1 次提交
  14. 19 12月, 2016 1 次提交
  15. 20 10月, 2016 1 次提交
    • M
      arm64: remove pr_cont abuse from mem_init · f7881bd6
      Mark Rutland 提交于
      All the lines printed by mem_init are independent, with each ending with
      a newline. While they logically form a large block, none are actually
      continuations of previous lines.
      
      The kernel-side printk code and the userspace demsg tool differ in their
      handling of KERN_CONT following a newline, and while this isn't always a
      problem kernel-side, it does cause difficulty for userspace. Using
      pr_cont causes the userspace tool to not print line prefix (e.g.
      timestamps) even when following a newline, mis-aligning the output and
      making it harder to read, e.g.
      
      [    0.000000] Virtual kernel memory layout:
      [    0.000000]     modules : 0xffff000000000000 - 0xffff000008000000   (   128 MB)
          vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000   (129022 GB)
            .text : 0xffff000008080000 - 0xffff0000088b0000   (  8384 KB)
          .rodata : 0xffff0000088b0000 - 0xffff000008c50000   (  3712 KB)
            .init : 0xffff000008c50000 - 0xffff000008d50000   (  1024 KB)
            .data : 0xffff000008d50000 - 0xffff000008e25200   (   853 KB)
             .bss : 0xffff000008e25200 - 0xffff000008e6bec0   (   284 KB)
          fixed   : 0xffff7dfffe7fd000 - 0xffff7dfffec00000   (  4108 KB)
          PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000   (    16 MB)
          vmemmap : 0xffff7e0000000000 - 0xffff800000000000   (  2048 GB maximum)
                    0xffff7e0000000000 - 0xffff7e0026000000   (   608 MB actual)
          memory  : 0xffff800000000000 - 0xffff800980000000   ( 38912 MB)
      [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1
      
      Fix this by using pr_notice consistently for all lines, which both the
      kernel and userspace are happy with.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      f7881bd6
  16. 07 9月, 2016 1 次提交
  17. 22 8月, 2016 1 次提交
  18. 29 7月, 2016 1 次提交
    • D
      arm64:acpi: fix the acpi alignment exception when 'mem=' specified · cb0a6502
      Dennis Chen 提交于
      When booting an ACPI enabled kernel with 'mem=x', there is the
      possibility that ACPI data regions from the firmware will lie above the
      memory limit.  Ordinarily these will be removed by
      memblock_enforce_memory_limit(.).
      
      Unfortunately, this means that these regions will then be mapped by
      acpi_os_ioremap(.) as device memory (instead of normal) thus unaligned
      accessess will then provoke alignment faults.
      
      In this patch we adopt memblock_mem_limit_remove_map instead, and this
      preserves these ACPI data regions (marked NOMAP) thus ensuring that
      these regions are not mapped as device memory.
      
      For example, below is an alignment exception observed on ARM platform
      when booting the kernel with 'acpi=on mem=8G':
      
        ...
        Unable to handle kernel paging request at virtual address ffff0000080521e7
        pgd = ffff000008aa0000
        [ffff0000080521e7] *pgd=000000801fffe003, *pud=000000801fffd003, *pmd=000000801fffc003, *pte=00e80083ff1c1707
        Internal error: Oops: 96000021 [#1] PREEMPT SMP
        Modules linked in:
        CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc3-next-20160616+ #172
        Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1001A 02/09/2016
        task: ffff800001ef0000 ti: ffff800001ef8000 task.ti: ffff800001ef8000
        PC is at acpi_ns_lookup+0x520/0x734
        LR is at acpi_ns_lookup+0x4a4/0x734
        pc : [<ffff0000083b8b10>] lr : [<ffff0000083b8a94>] pstate: 60000045
        sp : ffff800001efb8b0
        x29: ffff800001efb8c0 x28: 000000000000001b
        x27: 0000000000000001 x26: 0000000000000000
        x25: ffff800001efb9e8 x24: ffff000008a10000
        x23: 0000000000000001 x22: 0000000000000001
        x21: ffff000008724000 x20: 000000000000001b
        x19: ffff0000080521e7 x18: 000000000000000d
        x17: 00000000000038ff x16: 0000000000000002
        x15: 0000000000000007 x14: 0000000000007fff
        x13: ffffff0000000000 x12: 0000000000000018
        x11: 000000001fffd200 x10: 00000000ffffff76
        x9 : 000000000000005f x8 : ffff000008725fa8
        x7 : ffff000008a8df70 x6 : ffff000008a8df70
        x5 : ffff000008a8d000 x4 : 0000000000000010
        x3 : 0000000000000010 x2 : 000000000000000c
        x1 : 0000000000000006 x0 : 0000000000000000
        ...
          acpi_ns_lookup+0x520/0x734
          acpi_ds_load1_begin_op+0x174/0x4fc
          acpi_ps_build_named_op+0xf8/0x220
          acpi_ps_create_op+0x208/0x33c
          acpi_ps_parse_loop+0x204/0x838
          acpi_ps_parse_aml+0x1bc/0x42c
          acpi_ns_one_complete_parse+0x1e8/0x22c
          acpi_ns_parse_table+0x8c/0x128
          acpi_ns_load_table+0xc0/0x1e8
          acpi_tb_load_namespace+0xf8/0x2e8
          acpi_load_tables+0x7c/0x110
          acpi_init+0x90/0x2c0
          do_one_initcall+0x38/0x12c
          kernel_init_freeable+0x148/0x1ec
          kernel_init+0x10/0xec
          ret_from_fork+0x10/0x40
        Code: b9009fbc 2a00037b 36380057 3219037b (b9400260)
        ---[ end trace 03381e5eb0a24de4 ]---
        Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
      
      With 'efi=debug', we can see those ACPI regions loaded by firmware on
      that board as:
      
        efi:   0x0083ff185000-0x0083ff1b4fff [Reserved           |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
        efi:   0x0083ff1b5000-0x0083ff1c2fff [ACPI Reclaim Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
        efi:   0x0083ff223000-0x0083ff224fff [ACPI Memory NVS    |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
      
      Link: http://lkml.kernel.org/r/1468475036-5852-3-git-send-email-dennis.chen@arm.comAcked-by: NSteve Capper <steve.capper@arm.com>
      Signed-off-by: NDennis Chen <dennis.chen@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Rafael J. Wysocki <rafael@kernel.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Kaly Xin <kaly.xin@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      cb0a6502
  19. 28 6月, 2016 2 次提交
  20. 21 6月, 2016 1 次提交
  21. 20 4月, 2016 2 次提交
  22. 16 4月, 2016 1 次提交
  23. 14 4月, 2016 4 次提交
    • A
      arm64: mm: move vmemmap region right below the linear region · 3e1907d5
      Ard Biesheuvel 提交于
      This moves the vmemmap region right below PAGE_OFFSET, aka the start
      of the linear region, and redefines its size to be a power of two.
      Due to the placement of PAGE_OFFSET in the middle of the address space,
      whose size is a power of two as well, this guarantees that virt to
      page conversions and vice versa can be implemented efficiently, by
      masking and shifting rather than ordinary arithmetic.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      3e1907d5
    • A
      arm64: mm: free __init memory via the linear mapping · d386825c
      Ard Biesheuvel 提交于
      The implementation of free_initmem_default() expects __init_begin
      and __init_end to be covered by the linear mapping, which is no
      longer the case. So open code it instead, using addresses that are
      explicitly translated from kernel virtual to linear virtual.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      d386825c
    • A
      arm64: add the initrd region to the linear mapping explicitly · 177e15f0
      Ard Biesheuvel 提交于
      Instead of going out of our way to relocate the initrd if it turns out
      to occupy memory that is not covered by the linear mapping, just add the
      initrd to the linear mapping. This puts the burden on the bootloader to
      pass initrd= and mem= options that are mutually consistent.
      
      Note that, since the placement of the linear region in the PA space is
      also dependent on the placement of the kernel Image, which may reside
      anywhere in memory, we may still end up with a situation where the initrd
      and the kernel Image are simply too far apart to be covered by the linear
      region.
      
      Since we now leave it up to the bootloader to pass the initrd in memory
      that is guaranteed to be accessible by the kernel, add a mention of this to
      the arm64 boot protocol specification as well.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      177e15f0
    • A
      arm64/mm: ensure memstart_addr remains sufficiently aligned · 2958987f
      Ard Biesheuvel 提交于
      After choosing memstart_addr to be the highest multiple of
      ARM64_MEMSTART_ALIGN less than or equal to the first usable physical memory
      address, we clip the memblocks to the maximum size of the linear region.
      Since the kernel may be high up in memory, we take care not to clip the
      kernel itself, which means we have to clip some memory from the bottom if
      this occurs, to ensure that the distance between the first and the last
      usable physical memory address can be covered by the linear region.
      
      However, we fail to update memstart_addr if this clipping from the bottom
      occurs, which means that we may still end up with virtual addresses that
      wrap into the userland range. So increment memstart_addr as appropriate to
      prevent this from happening.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2958987f
  24. 21 3月, 2016 1 次提交
  25. 02 3月, 2016 1 次提交
  26. 01 3月, 2016 2 次提交
  27. 27 2月, 2016 1 次提交
    • A
      arm64: vmemmap: use virtual projection of linear region · dfd55ad8
      Ard Biesheuvel 提交于
      Commit dd006da2 ("arm64: mm: increase VA range of identity map") made
      some changes to the memory mapping code to allow physical memory to reside
      at an offset that exceeds the size of the virtual mapping.
      
      However, since the size of the vmemmap area is proportional to the size of
      the VA area, but it is populated relative to the physical space, we may
      end up with the struct page array being mapped outside of the vmemmap
      region. For instance, on my Seattle A0 box, I can see the following output
      in the dmesg log.
      
         vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000   (     8 GB maximum)
                   0xffffffbfc0000000 - 0xffffffbfd0000000   (   256 MB actual)
      
      We can fix this by deciding that the vmemmap region is not a projection of
      the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
      linear region. This way, we are guaranteed that the vmemmap region is of
      sufficient size, and we can even reduce the size by half.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      dfd55ad8
  28. 26 2月, 2016 2 次提交
  29. 24 2月, 2016 1 次提交
    • A
      arm64: kaslr: randomize the linear region · c031a421
      Ard Biesheuvel 提交于
      When KASLR is enabled (CONFIG_RANDOMIZE_BASE=y), and entropy has been
      provided by the bootloader, randomize the placement of RAM inside the
      linear region if sufficient space is available. For instance, on a 4KB
      granule/3 levels kernel, the linear region is 256 GB in size, and we can
      choose any 1 GB aligned offset that is far enough from the top of the
      address space to fit the distance between the start of the lowest memblock
      and the top of the highest memblock.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      c031a421
  30. 19 2月, 2016 1 次提交
    • A
      arm64: allow kernel Image to be loaded anywhere in physical memory · a7f8de16
      Ard Biesheuvel 提交于
      This relaxes the kernel Image placement requirements, so that it
      may be placed at any 2 MB aligned offset in physical memory.
      
      This is accomplished by ignoring PHYS_OFFSET when installing
      memblocks, and accounting for the apparent virtual offset of
      the kernel Image. As a result, virtual address references
      below PAGE_OFFSET are correctly mapped onto physical references
      into the kernel Image regardless of where it sits in memory.
      
      Special care needs to be taken for dealing with memory limits passed
      via mem=, since the generic implementation clips memory top down, which
      may clip the kernel image itself if it is loaded high up in memory. To
      deal with this case, we simply add back the memory covering the kernel
      image, which may result in more memory to be retained than was passed
      as a mem= parameter.
      
      Since mem= should not be considered a production feature, a panic notifier
      handler is installed that dumps the memory limit at panic time if one was
      set.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      a7f8de16