提交 · acafce48b07bf5f9994a38e7fe237193d43d092e · openeuler / Kernel

12 12月, 2018 1 次提交

arm64: Add memory hotplug support · 4ab21506

由 Robin Murphy 提交于 12月 11, 2018

Wire up the basic support for hot-adding memory. Since memory hotplug
is fairly tightly coupled to sparsemem, we tweak pfn_valid() to also
cross-check the presence of a section in the manner of the generic
implementation, before falling back to memblock to check for no-map
regions within a present section as before. By having arch_add_memory(()
create the linear mapping first, this then makes everything work in the
way that __add_section() expects.

We expect hotplug to be ACPI-driven, so the swapper_pg_dir updates
should be safe from races by virtue of the global device hotplug lock.
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

4ab21506

11 12月, 2018 1 次提交

arm64: mm: Introduce DEFAULT_MAP_WINDOW · 363524d2

由 Steve Capper 提交于 12月 06, 2018

We wish to introduce a 52-bit virtual address space for userspace but
maintain compatibility with software that assumes the maximum VA space
size is 48 bit.

In order to achieve this, on 52-bit VA systems, we make mmap behave as
if it were running on a 48-bit VA system (unless userspace explicitly
requests a VA where addr[51:48] != 0).

On a system running a 52-bit userspace we need TASK_SIZE to represent
the 52-bit limit as it is used in various places to distinguish between
kernelspace and userspace addresses.

Thus we need a new limit for mmap, stack, ELF loader and EFI (which uses
TTBR0) to represent the non-extended VA space.

This patch introduces DEFAULT_MAP_WINDOW and DEFAULT_MAP_WINDOW_64 and
switches the appropriate logic to use that instead of TASK_SIZE.
Signed-off-by: NSteve Capper <steve.capper@arm.com>
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

363524d2

10 12月, 2018 1 次提交

arm64: move memstart_addr export inline · 03ef055f

由 Mark Rutland 提交于 12月 07, 2018

Since we define memstart_addr in a C file, we can have the export
immediately after the definition of the symbol, as we do elsewhere.

As a step towards removing arm64ksyms.c, move the export of
memstart_addr to init.c, where the symbol is defined.

There should be no functional change as a result of this patch.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

03ef055f

09 11月, 2018 1 次提交

arm64: memblock: don't permit memblock resizing until linear mapping is up · 24cc61d8

由 Ard Biesheuvel 提交于 11月 07, 2018

Bhupesh reports that having numerous memblock reservations at early
boot may result in the following crash:

  Unable to handle kernel paging request at virtual address ffff80003ffe0000
  ...
  Call trace:
   __memcpy+0x110/0x180
   memblock_add_range+0x134/0x2e8
   memblock_reserve+0x70/0xb8
   memblock_alloc_base_nid+0x6c/0x88
   __memblock_alloc_base+0x3c/0x4c
   memblock_alloc_base+0x28/0x4c
   memblock_alloc+0x2c/0x38
   early_pgtable_alloc+0x20/0xb0
   paging_init+0x28/0x7f8

This is caused by the fact that we permit memblock resizing before the
linear mapping is up, and so the memblock_reserved() array is moved
into memory that is not mapped yet.

So let's ensure that this crash can no longer occur, by deferring to
call to memblock_allow_resize() to after the linear mapping has been
created.
Reported-by: NBhupesh Sharma <bhsharma@redhat.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Tested-by: NMarc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

24cc61d8

31 10月, 2018 3 次提交

mm: remove include/linux/bootmem.h · 57c8a661

由 Mike Rapoport 提交于 10月 30, 2018

Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>

@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
  Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
  Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

57c8a661

memblock: rename free_all_bootmem to memblock_free_all · c6ffc5ca

由 Mike Rapoport 提交于 10月 30, 2018

The conversion is done using

sed -i 's@free_all_bootmem@memblock_free_all@' \
    $(git grep -l free_all_bootmem)

Link: http://lkml.kernel.org/r/1536927045-23536-26-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c6ffc5ca

memblock: replace free_bootmem{_node} with memblock_free · 2013288f

由 Mike Rapoport 提交于 10月 30, 2018

The free_bootmem and free_bootmem_node are merely wrappers for
memblock_free. Replace their usage with a call to memblock_free using the
following semantic patch:

@@
expression e1, e2, e3;
@@
(
- free_bootmem(e1, e2)
+ memblock_free(e1, e2)
|
- free_bootmem_node(e1, e2, e3)
+ memblock_free(e2, e3)
)

Link: http://lkml.kernel.org/r/1536927045-23536-24-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2013288f

21 9月, 2018 1 次提交

arm64: Kconfig: Remove ARCH_HAS_HOLES_MEMORYMODEL · 8a695a58

由 James Morse 提交于 8月 31, 2018

include/linux/mmzone.h describes ARCH_HAS_HOLES_MEMORYMODEL as
relevant when parts the memmap have been free()d. This would
happen on systems where memory is smaller than a sparsemem-section,
and the extra struct pages are expensive. pfn_valid() on these
systems returns true for the whole sparsemem-section, so an extra
memmap_valid_within() check is needed.

On arm64 we have nomap memory, so always provide pfn_valid() to test
for nomap pages. This means ARCH_HAS_HOLES_MEMORYMODEL's extra checks
are already rolled up into pfn_valid().

Remove it.
Acked-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

8a695a58

17 8月, 2018 1 次提交

arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid() · 5ad356ea

由 Greg Hackmann 提交于 8月 15, 2018

ARM64's pfn_valid() shifts away the upper PAGE_SHIFT bits of the input
before seeing if the PFN is valid.  This leads to false positives when
some of the upper bits are set, but the lower bits match a valid PFN.

For example, the following userspace code looks up a bogus entry in
/proc/kpageflags:

    int pagemap = open("/proc/self/pagemap", O_RDONLY);
    int pageflags = open("/proc/kpageflags", O_RDONLY);
    uint64_t pfn, val;

    lseek64(pagemap, [...], SEEK_SET);
    read(pagemap, &pfn, sizeof(pfn));
    if (pfn & (1UL << 63)) {        /* valid PFN */
        pfn &= ((1UL << 55) - 1);   /* clear flag bits */
        pfn |= (1UL << 55);
        lseek64(pageflags, pfn * sizeof(uint64_t), SEEK_SET);
        read(pageflags, &val, sizeof(val));
    }

On ARM64 this causes the userspace process to crash with SIGSEGV rather
than reading (1 << KPF_NOPAGE).  kpageflags_read() treats the offset as
valid, and stable_page_flags() will try to access an address between the
user and kernel address ranges.

Fixes: c1cc1552 ("arm64: MMU initialisation")
Cc: stable@vger.kernel.org
Signed-off-by: NGreg Hackmann <ghackmann@google.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

5ad356ea

25 7月, 2018 1 次提交

arm64: fix vmemmap BUILD_BUG_ON() triggering on !vmemmap setups · 7b0eb6b4

由 Johannes Weiner 提交于 7月 23, 2018

Arnd reports the following arm64 randconfig build error with the PSI
patches that add another page flag:

  /git/arm-soc/arch/arm64/mm/init.c: In function 'mem_init':
  /git/arm-soc/include/linux/compiler.h:357:38: error: call to
  '__compiletime_assert_618' declared with attribute error: BUILD_BUG_ON
  failed: sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT)

The additional page flag causes other information stored in
page->flags to get bumped into their own struct page member:

  #if SECTIONS_WIDTH+ZONES_WIDTH+NODES_SHIFT+LAST_CPUPID_SHIFT <=
  BITS_PER_LONG - NR_PAGEFLAGS
  #define LAST_CPUPID_WIDTH LAST_CPUPID_SHIFT
  #else
  #define LAST_CPUPID_WIDTH 0
  #endif

  #if defined(CONFIG_NUMA_BALANCING) && LAST_CPUPID_WIDTH == 0
  #define LAST_CPUPID_NOT_IN_PAGE_FLAGS
  #endif

which in turn causes the struct page size to exceed the size set in
STRUCT_PAGE_MAX_SHIFT. This value is an an estimate used to size the
VMEMMAP page array according to address space and struct page size.

However, the check is performed - and triggers here - on a !VMEMMAP
config, which consumes an additional 22 page bits for the sparse
section id. When VMEMMAP is enabled, those bits are returned, cpupid
doesn't need its own member, and the page passes the VMEMMAP check.

Restrict that check to the situation it was meant to check: that we
are sizing the VMEMMAP page array correctly.

Says Arnd:

    Further experiments show that the build error already existed before,
    but was only triggered with larger values of CONFIG_NR_CPU and/or
    CONFIG_NODES_SHIFT that might be used in actual configurations but
    not in randconfig builds.

    With longer CPU and node masks, I could recreate the problem with
    kernels as old as linux-4.7 when arm64 NUMA support got added.
Reported-by: NArnd Bergmann <arnd@arndb.de>
Tested-by: NArnd Bergmann <arnd@arndb.de>
Cc: stable@vger.kernel.org
Fixes: 1a2db300 ("arm64, numa: Add NUMA support for arm64 platforms.")
Fixes: 3e1907d5 ("arm64: mm: move vmemmap region right below the linear region")
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

7b0eb6b4

15 6月, 2018 1 次提交

treewide: use PHYS_ADDR_MAX to avoid type casting ULLONG_MAX · d7dc899a

由 Stefan Agner 提交于 6月 14, 2018

With PHYS_ADDR_MAX there is now a type safe variant for all bits set.
Make use of it.

Patch created using a semantic patch as follows:

// <smpl>
@@
typedef phys_addr_t;
@@
-(phys_addr_t)ULLONG_MAX
+PHYS_ADDR_MAX
// </smpl>

Link: http://lkml.kernel.org/r/20180419214204.19322-1-stefan@agner.chSigned-off-by: NStefan Agner <stefan@agner.ch>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d7dc899a

01 5月, 2018 1 次提交

arm64: To remove initrd reserved area entry from memblock · 05c58752

由 CHANDAN VN 提交于 4月 30, 2018

INITRD reserved area entry is not removed from memblock
even though initrd reserved area is freed. After freeing
the memory it is released from memblock. The same can be
checked from /sys/kernel/debug/memblock/reserved.

The patch makes sure that the initrd entry is removed from
memblock when keepinitrd is not enabled.

The patch only affects accounting and debugging. This does not
fix any memory leak.
Acked-by: NLaura Abbott <labbott@redhat.com>
Signed-off-by: NCHANDAN VN <chandan.vn@samsung.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

05c58752

27 3月, 2018 1 次提交

Revert "arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size)" · 3f251cf0

由 Will Deacon 提交于 3月 27, 2018

This reverts commit 1f85b42a.

The internal dma-direct.h API has changed in -next, which collides with
us trying to use it to manage non-coherent DMA devices on systems with
unreasonably large cache writeback granules.

This isn't at all trivial to resolve, so revert our changes for now and
we can revisit this after the merge window. Effectively, this just
restores our behaviour back to that of 4.16.
Signed-off-by: NWill Deacon <will.deacon@arm.com>

3f251cf0

07 3月, 2018 1 次提交

arm64: Revert L1_CACHE_SHIFT back to 6 (64-byte cache line size) · 1f85b42a

由 Catalin Marinas 提交于 2月 28, 2018

Commit 97303480 ("arm64: Increase the max granular size") increased
the cache line size to 128 to match Cavium ThunderX, apparently for some
performance benefit which could not be confirmed. This change, however,
has an impact on the network packets allocation in certain
circumstances, requiring slightly over a 4K page with a significant
performance degradation.

This patch reverts L1_CACHE_SHIFT back to 6 (64-byte cache line) while
keeping ARCH_DMA_MINALIGN at 128. The cache_line_size() function was
changed to default to ARCH_DMA_MINALIGN in the absence of a meaningful
CTR_EL0.CWG bit field.

In addition, if a system with ARCH_DMA_MINALIGN < CTR_EL0.CWG is
detected, the kernel will force swiotlb bounce buffering for all
non-coherent devices since DMA cache maintenance on sub-CWG ranges is
not safe, leading to data corruption.

Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Cc: Timur Tabi <timur@codeaurora.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

1f85b42a

19 1月, 2018 1 次提交

arm64: mm: ignore memory above supported physical address size · e9eaa805

由 Kristina Martsenko 提交于 1月 18, 2018

When booting a kernel without 52-bit PA support (e.g. a kernel with 4k
pages) on a system with 52-bit memory, the kernel will currently try to
use the 52-bit memory and crash. Fix this by ignoring any memory higher
than what the kernel supports.

Fixes: f77d2817 ("arm64: enable 52-bit physical address support")
Signed-off-by: NKristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

e9eaa805

16 1月, 2018 1 次提交

arm64: Stop printing the virtual memory layout · 071929db

由 Laura Abbott 提交于 12月 19, 2017

Printing kernel addresses should be done in limited circumstances, mostly
for debugging purposes. Printing out the virtual memory layout at every
kernel bootup doesn't really fall into this category so delete the prints.
There are other ways to get the same information.
Acked-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NLaura Abbott <labbott@redhat.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

071929db

15 1月, 2018 1 次提交

arm64: replace ZONE_DMA with ZONE_DMA32 · ad67f5a6

由 Christoph Hellwig 提交于 12月 24, 2017

arm64 uses ZONE_DMA for allocations below 32-bits.  These days we
name the zone for that ZONE_DMA32, which will allow to use the
dma-direct and generic swiotlb code as-is, so rename it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NRobin Murphy <robin.murphy@arm.com>

ad67f5a6

12 12月, 2017 1 次提交

arm64: Initialise high_memory global variable earlier · f24e5834

由 Steve Capper 提交于 12月 04, 2017

The high_memory global variable is used by
cma_declare_contiguous(.) before it is defined.

We don't notice this as we compute __pa(high_memory - 1), and it looks
like we're processing a VA from the direct linear map.

This problem becomes apparent when we flip the kernel virtual address
space and the linear map is moved to the bottom of the kernel VA space.

This patch moves the initialisation of high_memory before it used.

Cc: <stable@vger.kernel.org>
Fixes: f7426b98 ("mm: cma: adjust address limit to avoid hitting low/high memory boundary")
Signed-off-by: NSteve Capper <steve.capper@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

f24e5834

06 4月, 2017 4 次提交

arm64: kdump: provide /proc/vmcore file · e62aaeac

由 AKASHI Takahiro 提交于 4月 03, 2017

Arch-specific functions are added to allow for implementing a crash dump
file interface, /proc/vmcore, which can be viewed as a ELF file.

A user space tool, like kexec-tools, is responsible for allocating
a separate region for the core's ELF header within crash kdump kernel
memory and filling it in when executing kexec_load().

Then, its location will be advertised to crash dump kernel via a new
device-tree property, "linux,elfcorehdr", and crash dump kernel preserves
the region for later use with reserve_elfcorehdr() at boot time.

On crash dump kernel, /proc/vmcore will access the primary kernel's memory
with copy_oldmem_page(), which feeds the data page-by-page by ioremap'ing
it since it does not reside in linear mapping on crash dump kernel.

Meanwhile, elfcorehdr_read() is simple as the region is always mapped.
Signed-off-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: NJames Morse <james.morse@arm.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

e62aaeac

arm64: hibernate: preserve kdump image around hibernation · 254a41c0

由 AKASHI Takahiro 提交于 4月 03, 2017

Since arch_kexec_protect_crashkres() removes a mapping for crash dump
kernel image, the loaded data won't be preserved around hibernation.

In this patch, helper functions, crash_prepare_suspend()/
crash_post_resume(), are additionally called before/after hibernation so
that the relevant memory segments will be mapped again and preserved just
as the others are.

In addition, to minimize the size of hibernation image, crash_is_nosave()
is added to pfn_is_nosave() in order to recognize only the pages that hold
loaded crash dump kernel image as saveable. Hibernation excludes any pages
that are marked as Reserved and yet "nosave."
Signed-off-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

254a41c0

arm64: kdump: reserve memory for crash dump kernel · 764b51ea

由 AKASHI Takahiro 提交于 4月 03, 2017

"crashkernel=" kernel parameter specifies the size (and optionally
the start address) of the system ram to be used by crash dump kernel.
reserve_crashkernel() will allocate and reserve that memory at boot time
of primary kernel.

The memory range will be exposed to userspace as a resource named
"Crash kernel" in /proc/iomem.
Signed-off-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: NMark Salter <msalter@redhat.com>
Signed-off-by: NPratyush Anand <panand@redhat.com>
Reviewed-by: NJames Morse <james.morse@arm.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

764b51ea

arm64: limit memory regions based on DT property, usable-memory-range · 8f579b1c

由 AKASHI Takahiro 提交于 4月 03, 2017

Crash dump kernel uses only a limited range of available memory as System
RAM. On arm64 kdump, This memory range is advertised to crash dump kernel
via a device-tree property under /chosen,
   linux,usable-memory-range = <BASE SIZE>

Crash dump kernel reads this property at boot time and calls
memblock_cap_memory_range() to limit usable memory which are listed either
in UEFI memory map table or "memory" nodes of a device tree blob.
Signed-off-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: NGeoff Levand <geoff@infradead.org>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

8f579b1c

17 1月, 2017 1 次提交

arm64: Fix swiotlb fallback allocation · 524dabe1

由 Alexander Graf 提交于 1月 16, 2017

Commit b67a8b29 introduced logic to skip swiotlb allocation when all memory
is DMA accessible anyway.

While this is a great idea, __dma_alloc still calls swiotlb code unconditionally
to allocate memory when there is no CMA memory available. The swiotlb code is
called to ensure that we at least try get_free_pages().

Without initialization, swiotlb allocation code tries to access io_tlb_list
which is NULL. That results in a stack trace like this:

  Unable to handle kernel NULL pointer dereference at virtual address 00000000
  [...]
  [<ffff00000845b908>] swiotlb_tbl_map_single+0xd0/0x2b0
  [<ffff00000845be94>] swiotlb_alloc_coherent+0x10c/0x198
  [<ffff000008099dc0>] __dma_alloc+0x68/0x1a8
  [<ffff000000a1b410>] drm_gem_cma_create+0x98/0x108 [drm]
  [<ffff000000abcaac>] drm_fbdev_cma_create_with_funcs+0xbc/0x368 [drm_kms_helper]
  [<ffff000000abcd84>] drm_fbdev_cma_create+0x2c/0x40 [drm_kms_helper]
  [<ffff000000abc040>] drm_fb_helper_initial_config+0x238/0x410 [drm_kms_helper]
  [<ffff000000abce88>] drm_fbdev_cma_init_with_funcs+0x98/0x160 [drm_kms_helper]
  [<ffff000000abcf90>] drm_fbdev_cma_init+0x40/0x58 [drm_kms_helper]
  [<ffff000000b47980>] vc4_kms_load+0x90/0xf0 [vc4]
  [<ffff000000b46a94>] vc4_drm_bind+0xec/0x168 [vc4]
  [...]

Thankfully swiotlb code just learned how to not do allocations with the FORCE_NO
option. This patch configures the swiotlb code to use that if we decide not to
initialize the swiotlb framework.

Fixes: b67a8b29 ("arm64: mm: only initialize swiotlb when necessary")
Signed-off-by: NAlexander Graf <agraf@suse.de>
CC: Jisheng Zhang <jszhang@marvell.com>
CC: Geert Uytterhoeven <geert+renesas@glider.be>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

524dabe1

12 1月, 2017 1 次提交

arm64: Use __pa_symbol for kernel symbols · 2077be67

由 Laura Abbott 提交于 1月 10, 2017

__pa_symbol is technically the marcro that should be used for kernel
symbols. Switch to this as a pre-requisite for DEBUG_VIRTUAL which
will do bounds checking.
Reviewed-by: NMark Rutland <mark.rutland@arm.com>
Tested-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NLaura Abbott <labbott@redhat.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

2077be67

19 12月, 2016 1 次提交

swiotlb: Convert swiotlb_force from int to enum · ae7871be

由 Geert Uytterhoeven 提交于 12月 16, 2016

Convert the flag swiotlb_force from an int to an enum, to prepare for
the advent of more possible values.
Suggested-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

ae7871be

20 10月, 2016 1 次提交

arm64: remove pr_cont abuse from mem_init · f7881bd6

由 Mark Rutland 提交于 10月 20, 2016

All the lines printed by mem_init are independent, with each ending with
a newline. While they logically form a large block, none are actually
continuations of previous lines.

The kernel-side printk code and the userspace demsg tool differ in their
handling of KERN_CONT following a newline, and while this isn't always a
problem kernel-side, it does cause difficulty for userspace. Using
pr_cont causes the userspace tool to not print line prefix (e.g.
timestamps) even when following a newline, mis-aligning the output and
making it harder to read, e.g.

[    0.000000] Virtual kernel memory layout:
[    0.000000]     modules : 0xffff000000000000 - 0xffff000008000000   (   128 MB)
    vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000   (129022 GB)
      .text : 0xffff000008080000 - 0xffff0000088b0000   (  8384 KB)
    .rodata : 0xffff0000088b0000 - 0xffff000008c50000   (  3712 KB)
      .init : 0xffff000008c50000 - 0xffff000008d50000   (  1024 KB)
      .data : 0xffff000008d50000 - 0xffff000008e25200   (   853 KB)
       .bss : 0xffff000008e25200 - 0xffff000008e6bec0   (   284 KB)
    fixed   : 0xffff7dfffe7fd000 - 0xffff7dfffec00000   (  4108 KB)
    PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000   (    16 MB)
    vmemmap : 0xffff7e0000000000 - 0xffff800000000000   (  2048 GB maximum)
              0xffff7e0000000000 - 0xffff7e0026000000   (   608 MB actual)
    memory  : 0xffff800000000000 - 0xffff800980000000   ( 38912 MB)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1

Fix this by using pr_notice consistently for all lines, which both the
kernel and userspace are happy with.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

f7881bd6

07 9月, 2016 1 次提交

arm64: mm: drop fixup_init() and mm.h · dae8c235

由 Kefeng Wang 提交于 9月 05, 2016

There is only fixup_init() in mm.h , and it is only called
in free_initmem(), so move the codes from fixup_init() into
free_initmem(), then drop fixup_init() and mm.h.
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

dae8c235

22 8月, 2016 1 次提交

arm64: apply __ro_after_init to some objects · 5a9e3e15

由 Jisheng Zhang 提交于 8月 15, 2016

These objects are set during initialization, thereafter are read only.

Previously I only want to mark vdso_pages, vdso_spec, vectors_page and
cpu_ops as __read_mostly from performance point of view. Then inspired
by Kees's patch[1] to apply more __ro_after_init for arm, I think it's
better to mark them as __ro_after_init. What's more, I find some more
objects are also read only after init. So apply __ro_after_init to all
of them.

This patch also removes global vdso_pagelist and tries to clean up
vdso_spec[] assignment code.

[1] http://www.spinics.net/lists/arm-kernel/msg523188.htmlAcked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NJisheng Zhang <jszhang@marvell.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

5a9e3e15

29 7月, 2016 1 次提交

arm64:acpi: fix the acpi alignment exception when 'mem=' specified · cb0a6502

由 Dennis Chen 提交于 7月 28, 2016

When booting an ACPI enabled kernel with 'mem=x', there is the
possibility that ACPI data regions from the firmware will lie above the
memory limit.  Ordinarily these will be removed by
memblock_enforce_memory_limit(.).

Unfortunately, this means that these regions will then be mapped by
acpi_os_ioremap(.) as device memory (instead of normal) thus unaligned
accessess will then provoke alignment faults.

In this patch we adopt memblock_mem_limit_remove_map instead, and this
preserves these ACPI data regions (marked NOMAP) thus ensuring that
these regions are not mapped as device memory.

For example, below is an alignment exception observed on ARM platform
when booting the kernel with 'acpi=on mem=8G':

  ...
  Unable to handle kernel paging request at virtual address ffff0000080521e7
  pgd = ffff000008aa0000
  [ffff0000080521e7] *pgd=000000801fffe003, *pud=000000801fffd003, *pmd=000000801fffc003, *pte=00e80083ff1c1707
  Internal error: Oops: 96000021 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc3-next-20160616+ #172
  Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1001A 02/09/2016
  task: ffff800001ef0000 ti: ffff800001ef8000 task.ti: ffff800001ef8000
  PC is at acpi_ns_lookup+0x520/0x734
  LR is at acpi_ns_lookup+0x4a4/0x734
  pc : [<ffff0000083b8b10>] lr : [<ffff0000083b8a94>] pstate: 60000045
  sp : ffff800001efb8b0
  x29: ffff800001efb8c0 x28: 000000000000001b
  x27: 0000000000000001 x26: 0000000000000000
  x25: ffff800001efb9e8 x24: ffff000008a10000
  x23: 0000000000000001 x22: 0000000000000001
  x21: ffff000008724000 x20: 000000000000001b
  x19: ffff0000080521e7 x18: 000000000000000d
  x17: 00000000000038ff x16: 0000000000000002
  x15: 0000000000000007 x14: 0000000000007fff
  x13: ffffff0000000000 x12: 0000000000000018
  x11: 000000001fffd200 x10: 00000000ffffff76
  x9 : 000000000000005f x8 : ffff000008725fa8
  x7 : ffff000008a8df70 x6 : ffff000008a8df70
  x5 : ffff000008a8d000 x4 : 0000000000000010
  x3 : 0000000000000010 x2 : 000000000000000c
  x1 : 0000000000000006 x0 : 0000000000000000
  ...
    acpi_ns_lookup+0x520/0x734
    acpi_ds_load1_begin_op+0x174/0x4fc
    acpi_ps_build_named_op+0xf8/0x220
    acpi_ps_create_op+0x208/0x33c
    acpi_ps_parse_loop+0x204/0x838
    acpi_ps_parse_aml+0x1bc/0x42c
    acpi_ns_one_complete_parse+0x1e8/0x22c
    acpi_ns_parse_table+0x8c/0x128
    acpi_ns_load_table+0xc0/0x1e8
    acpi_tb_load_namespace+0xf8/0x2e8
    acpi_load_tables+0x7c/0x110
    acpi_init+0x90/0x2c0
    do_one_initcall+0x38/0x12c
    kernel_init_freeable+0x148/0x1ec
    kernel_init+0x10/0xec
    ret_from_fork+0x10/0x40
  Code: b9009fbc 2a00037b 36380057 3219037b (b9400260)
  ---[ end trace 03381e5eb0a24de4 ]---
  Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

With 'efi=debug', we can see those ACPI regions loaded by firmware on
that board as:

  efi:   0x0083ff185000-0x0083ff1b4fff [Reserved           |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
  efi:   0x0083ff1b5000-0x0083ff1c2fff [ACPI Reclaim Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
  efi:   0x0083ff223000-0x0083ff224fff [ACPI Memory NVS    |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*

Link: http://lkml.kernel.org/r/1468475036-5852-3-git-send-email-dennis.chen@arm.comAcked-by: NSteve Capper <steve.capper@arm.com>
Signed-off-by: NDennis Chen <dennis.chen@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Kaly Xin <kaly.xin@arm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cb0a6502

28 6月, 2016 2 次提交

arm64: mm: fix location of _etext · 9fdc14c5

由 Ard Biesheuvel 提交于 6月 23, 2016

As Kees Cook notes in the ARM counterpart of this patch [0]:

  The _etext position is defined to be the end of the kernel text code,
  and should not include any part of the data segments. This interferes
  with things that might check memory ranges and expect executable code
  up to _etext.

In particular, Kees is referring to the HARDENED_USERCOPY patch set [1],
which rejects attempts to call copy_to_user() on kernel ranges containing
executable code, but does allow access to the .rodata segment. Regardless
of whether one may or may not agree with the distinction, it makes sense
for _etext to have the same meaning across architectures.

So let's put _etext where it belongs, between .text and .rodata, and fix
up existing references to use __init_begin instead, which unlike _end_rodata
includes the exception and notes sections as well.

The _etext references in kaslr.c are left untouched, since its references
to [_stext, _etext) are meant to capture potential jump instruction targets,
and so disregarding .rodata is actually an improvement here.

[0] http://article.gmane.org/gmane.linux.kernel/2245084
[1] http://thread.gmane.org/gmane.linux.kernel.hardened.devel/2502Reported-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

9fdc14c5

arm64: mm: simplify memblock numa node extraction · ea2cbee3

由 Mark Rutland 提交于 6月 22, 2016

We currently open-code extracting the NUMA node of a memblock region,
which requires an ifdef to cater for !CONFIG_NUMA builds where the
memblock_region::nid field does not exist.

The generic memblock_get_region_node helper is intended to cater for
this. For CONFIG_HAVE_MEMBLOCK_NODE_MAP, builds this returns reg->nid,
and for for !CONFIG_HAVE_MEMBLOCK_NODE_MAP builds this is a static
inline that returns 0. Note that for arm64,
CONFIG_HAVE_MEMBLOCK_NODE_MAP is selected iff CONFIG_NUMA is.

This patch makes use of memblock_get_region_node to simplify the arm64
code. At the same time, we can move the nid variable definition into the
loop, as this is the only place it is used.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Reviewed-by: NSteve Capper <steve.capper@arm.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

ea2cbee3

21 6月, 2016 1 次提交

arm64: mm: only initialize swiotlb when necessary · b67a8b29

由 Jisheng Zhang 提交于 6月 08, 2016

we only initialize swiotlb when swiotlb_force is true or not all system
memory is DMA-able, this trivial optimization saves us 64MB when
swiotlb is not necessary.
Signed-off-by: NJisheng Zhang <jszhang@marvell.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

b67a8b29

20 4月, 2016 2 次提交

arm64: mm: Show bss segment in kernel memory layout · 9974723e

由 Kefeng Wang 提交于 4月 18, 2016

Show the bss segment information as with text and data in Virtual
memory kernel layout.
Acked-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

9974723e

arm64: mm: make pr_cont() per line in Virtual kernel memory layout · d32351c8

由 Kefeng Wang 提交于 4月 18, 2016

Each line with single pr_cont() in Virtual kernel memory layout,
or the dump of the kernel memory layout in dmesg is not aligned
when PRINTK_TIME enabled, due to the missing time stamps.
Tested-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

d32351c8

16 4月, 2016 1 次提交

arm64, numa: Add NUMA support for arm64 platforms. · 1a2db300

由 Ganapatrao Kulkarni 提交于 4月 08, 2016

Attempt to get the memory and CPU NUMA node via of_numa.  If that
fails, default the dummy NUMA node and map all memory and CPUs to node
0.
Tested-by: NShannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: NRobert Richter <rrichter@cavium.com>
Signed-off-by: NGanapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Signed-off-by: NDavid Daney <david.daney@cavium.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

1a2db300

14 4月, 2016 4 次提交

arm64: mm: move vmemmap region right below the linear region · 3e1907d5

由 Ard Biesheuvel 提交于 3月 30, 2016

This moves the vmemmap region right below PAGE_OFFSET, aka the start
of the linear region, and redefines its size to be a power of two.
Due to the placement of PAGE_OFFSET in the middle of the address space,
whose size is a power of two as well, this guarantees that virt to
page conversions and vice versa can be implemented efficiently, by
masking and shifting rather than ordinary arithmetic.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

3e1907d5

arm64: mm: free __init memory via the linear mapping · d386825c

由 Ard Biesheuvel 提交于 3月 30, 2016

The implementation of free_initmem_default() expects __init_begin
and __init_end to be covered by the linear mapping, which is no
longer the case. So open code it instead, using addresses that are
explicitly translated from kernel virtual to linear virtual.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

d386825c

arm64: add the initrd region to the linear mapping explicitly · 177e15f0

由 Ard Biesheuvel 提交于 3月 30, 2016

Instead of going out of our way to relocate the initrd if it turns out
to occupy memory that is not covered by the linear mapping, just add the
initrd to the linear mapping. This puts the burden on the bootloader to
pass initrd= and mem= options that are mutually consistent.

Note that, since the placement of the linear region in the PA space is
also dependent on the placement of the kernel Image, which may reside
anywhere in memory, we may still end up with a situation where the initrd
and the kernel Image are simply too far apart to be covered by the linear
region.

Since we now leave it up to the bootloader to pass the initrd in memory
that is guaranteed to be accessible by the kernel, add a mention of this to
the arm64 boot protocol specification as well.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

177e15f0

arm64/mm: ensure memstart_addr remains sufficiently aligned · 2958987f

由 Ard Biesheuvel 提交于 3月 30, 2016

After choosing memstart_addr to be the highest multiple of
ARM64_MEMSTART_ALIGN less than or equal to the first usable physical memory
address, we clip the memblocks to the maximum size of the linear region.
Since the kernel may be high up in memory, we take care not to clip the
kernel itself, which means we have to clip some memory from the bottom if
this occurs, to ensure that the distance between the first and the last
usable physical memory address can be covered by the linear region.

However, we fail to update memstart_addr if this clipping from the bottom
occurs, which means that we may still end up with virtual addresses that
wrap into the userland range. So increment memstart_addr as appropriate to
prevent this from happening.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

2958987f

21 3月, 2016 1 次提交

arm64: Split pr_notice("Virtual kernel memory layout...") into multiple pr_cont() · f09f1bac

由 Catalin Marinas 提交于 3月 11, 2016

The printk() implementation has a limit of LOG_LINE_MAX (== 1024 - 32)
buffer per call which the arm64 mem_init() breaches when printing the
virtual memory layout with CONFIG_KASAN enabled. The result is that the
last line is no longer printed. This patch splits the call into a
pr_notice() + additional pr_cont() calls.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Acked-by: NMark Rutland <mark.rutland@arm.com>

f09f1bac

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功