Commits · d50e071fdaa33c1b399c764c44fa1ce879881185 · openanolis / cloud-kernel

09 Aug, 2017 2 commits

arm64: Implement pmem API support · d50e071f

Authored by Robin Murphy on Jul 25, 2017

Add a clean-to-point-of-persistence cache maintenance helper, and wire
up the basic architectural support for the pmem driver based on it.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[catalin.marinas@arm.com: move arch_*_pmem() functions to arch/arm64/mm/flush.c]
[catalin.marinas@arm.com: change dmb(sy) to dmb(osh)]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

arm64: Convert __inval_cache_range() to area-based · d46befef

Authored by Robin Murphy on Jul 25, 2017

__inval_cache_range() is already the odd one out among our data cache
maintenance routines as the only remaining range-based one; as we're
going to want an invalidation routine to call from C code for the pmem
API, let's tweak the prototype and name to bring it in line with the
clean operations, and to make its relationship with __dma_inv_area()
neatly mirror that of __clean_dcache_area_poc() and __dma_clean_area().
The loop clearing the early page tables gets mildly massaged in the
process for the sake of consistency.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

07 Aug, 2017 1 commit

arm64: Decode information from ESR upon mem faults · 1f9b8936

Authored by Julien Thierry on Aug 04, 2017

When receiving unhandled faults from the CPU, the description is very sparse.
Add information about faults decoded from the ESR.

Add defines to esr.h corresponding to the ESR fields. Values are based on the ARM
Architecture Reference Manual (DDI 0487B.a), section D7.2.28 ESR_ELx, Exception
Syndrome Register (ELx) (pages D7-2275 to D7-2280).

New output is of the form:
[   77.818059] Mem abort info:
[   77.820826]   Exception class = DABT (current EL), IL = 32 bits
[   77.826706]   SET = 0, FnV = 0
[   77.829742]   EA = 0, S1PTW = 0
[   77.832849] Data abort info:
[   77.835713]   ISV = 0, ISS = 0x00000070
[   77.839522]   CM = 0, WnR = 1
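The decoding behind this output can be modelled in a few lines. The following is a minimal Python sketch, not the kernel code: the function name decode_esr is hypothetical, and the bit positions are the ESR_ELx data-abort fields as I understand them (EC in bits 31:26, IL in bit 25, ISV bit 24, SET bits 12:11, FnV bit 10, EA bit 9, CM bit 8, S1PTW bit 7, WnR bit 6), so they should be checked against the manual cited above.

```python
# Model of the ESR decode shown in the log above (not the kernel code).
# Bit positions are assumptions to verify against the ARM ARM.
EC_NAMES = {
    0x20: "IABT (lower EL)",
    0x21: "IABT (current EL)",
    0x24: "DABT (lower EL)",
    0x25: "DABT (current EL)",
}


def bit(value, pos):
    """Return the single bit of value at position pos."""
    return (value >> pos) & 1


def decode_esr(esr):
    """Split an ESR value into the fields printed in the fault report."""
    ec = (esr >> 26) & 0x3F
    info = {
        "class": EC_NAMES.get(ec, "unknown"),
        "IL": 32 if bit(esr, 25) else 16,
        "SET": (esr >> 11) & 0x3,
        "FnV": bit(esr, 10),
        "EA": bit(esr, 9),
        "S1PTW": bit(esr, 7),
    }
    if ec in (0x24, 0x25):  # data aborts carry extra fields
        info["ISV"] = bit(esr, 24)
        info["ISS"] = esr & 0x01FFFFFF
        info["CM"] = bit(esr, 8)
        info["WnR"] = bit(esr, 6)
    return info


if __name__ == "__main__":
    for name, value in decode_esr(0x96000070).items():
        print(name, "=", value)
```

The value 0x96000070 is chosen so that the decoded fields agree with the sample output: a data abort from the current EL with WnR set and every other flag clear.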
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: fix "%lu" in a pr_alert() call]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

04 Aug, 2017 1 commit

arm64: Fix potential race with hardware DBM in ptep_set_access_flags() · 6d332747

Authored by Catalin Marinas on Jul 25, 2017

In a system with DBM (dirty bit management) capable agents there is a
possible race between a CPU executing ptep_set_access_flags() (maybe
non-DBM capable) and a hardware update of the dirty state (clearing of
PTE_RDONLY). The scenario:

a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old
   (PTE_AF clear)
b) ptep_set_access_flags() is called as a result of a read access and it
   needs to set the pte to writable, clean and young (PTE_AF set)
c) a DBM-capable agent, as a result of a different write access, is
   marking the entry as young (setting PTE_AF) and dirty (clearing
   PTE_RDONLY)

The current ptep_set_access_flags() implementation would set the
PTE_RDONLY bit in the resulting value overriding the DBM update and
losing the dirty state.

This patch fixes such race by setting PTE_RDONLY to the most permissive
(lowest value) of the current entry and the new one.
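The merge rule can be illustrated with a small Python model of the scenario above. This is a sketch under assumptions, not the kernel implementation: the bit positions (PTE_RDONLY bit 7, PTE_AF bit 10, PTE_WRITE bit 51) are the arm64 values as I recall them, the function names are hypothetical, and the real update runs inside an atomic read-modify-write loop that the model leaves out.

```python
# Assumed arm64 PTE bit positions; verify against pgtable definitions.
PTE_RDONLY = 1 << 7
PTE_AF = 1 << 10
PTE_WRITE = 1 << 51  # shares the bit used by hardware DBM


def naive_update(current, entry):
    """Old behaviour as described: RDONLY is taken from the new entry."""
    flags = entry & (PTE_RDONLY | PTE_AF | PTE_WRITE)
    return (current & ~PTE_RDONLY) | flags


def merged_update(current, entry):
    """Fixed behaviour: RDONLY stays set only if both values have it set."""
    merged = current | (entry & (PTE_AF | PTE_WRITE))
    rdonly = current & entry & PTE_RDONLY
    return (merged & ~PTE_RDONLY) | rdonly


if __name__ == "__main__":
    # (c) hardware has already marked the entry young and dirty
    current = PTE_WRITE | PTE_AF
    # (b) the CPU wants writable, clean and young
    entry = PTE_WRITE | PTE_RDONLY | PTE_AF
    print(hex(naive_update(current, entry)))   # RDONLY set again
    print(hex(merged_update(current, entry)))  # RDONLY stays clear
```

Taking the bitwise AND of the two PTE_RDONLY values is what "most permissive (lowest value)" amounts to for a single bit.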

Fixes: 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

28 Jul, 2017 1 commit

arm64: mmu: Place guard page after mapping of kernel image · 92bbd16e

Authored by Will Deacon on Jul 24, 2017

The vast majority of virtual allocations in the vmalloc region are followed
by a guard page, which can help to avoid overrunning one vma into another,
which may map a read-sensitive device.

This patch adds a guard page to the end of the kernel image mapping (i.e.
following the data/bss segments).

Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

21 Jul, 2017 1 commit

arm64/numa: Drop duplicate message · ece4b206

Authored by Punit Agrawal on Jul 20, 2017

When booting linux on a system without CONFIG_NUMA enabled, the
following messages are printed during boot -

NUMA: Faking a node at [mem 0x0000000000000000-0x00000083ffffffff]
NUMA: Adding memblock [0x8000000000 - 0x8000e7ffff] on node 0
NUMA: Adding memblock [0x8000e80000 - 0x83f65cffff] on node 0
NUMA: Adding memblock [0x83f65d0000 - 0x83f665ffff] on node 0
NUMA: Adding memblock [0x83f6660000 - 0x83f676ffff] on node 0
NUMA: Adding memblock [0x83f6770000 - 0x83f678ffff] on node 0
NUMA: Adding memblock [0x83f6790000 - 0x83fb82ffff] on node 0
NUMA: Adding memblock [0x83fb830000 - 0x83fbc0ffff] on node 0
NUMA: Adding memblock [0x83fbc10000 - 0x83fbdfffff] on node 0
NUMA: Adding memblock [0x83fbe00000 - 0x83fbffffff] on node 0
NUMA: Adding memblock [0x83fc000000 - 0x83fffbffff] on node 0
NUMA: Adding memblock [0x83fffc0000 - 0x83fffdffff] on node 0
NUMA: Adding memblock [0x83fffe0000 - 0x83ffffffff] on node 0
NUMA: Initmem setup node 0 [mem 0x8000000000-0x83ffffffff]
NUMA: NODE_DATA [mem 0x83fffec500-0x83fffedfff]

The information is then duplicated by core kernel messages right after
the above output.

Early memory node ranges
  node   0: [mem 0x0000008000000000-0x0000008000e7ffff]
  node   0: [mem 0x0000008000e80000-0x00000083f65cffff]
  node   0: [mem 0x00000083f65d0000-0x00000083f665ffff]
  node   0: [mem 0x00000083f6660000-0x00000083f676ffff]
  node   0: [mem 0x00000083f6770000-0x00000083f678ffff]
  node   0: [mem 0x00000083f6790000-0x00000083fb82ffff]
  node   0: [mem 0x00000083fb830000-0x00000083fbc0ffff]
  node   0: [mem 0x00000083fbc10000-0x00000083fbdfffff]
  node   0: [mem 0x00000083fbe00000-0x00000083fbffffff]
  node   0: [mem 0x00000083fc000000-0x00000083fffbffff]
  node   0: [mem 0x00000083fffc0000-0x00000083fffdffff]
  node   0: [mem 0x00000083fffe0000-0x00000083ffffffff]
Initmem setup node 0 [mem 0x0000008000000000-0x00000083ffffffff]

Remove the duplication of memblock layout information printed during
boot by dropping the messages from arm64 numa initialisation.
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

20 Jul, 2017 1 commit

dma-coherent: introduce interface for default DMA pool · 43fc509c

Authored by Vladimir Murzin on Jul 20, 2017

Christoph noticed [1] that the default DMA pool in its current form overloads
the DMA coherent infrastructure. In reply, Robin suggested [2] splitting
the per-device vs. global pool interfaces, so that allocation/release
from the default DMA pool is driven by the dma ops implementation.

This patch implements Robin's idea and provides an interface to
allocate/release/mmap the default (aka global) DMA pool.

To make it clear that the existing *_from_coherent routines work on the
per-device pool, rename them to *_from_dev_coherent.

[1] https://lkml.org/lkml/2017/7/7/370
[2] https://lkml.org/lkml/2017/7/7/431

Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

13 Jul, 2017 1 commit

arm64/mmap: properly account for stack randomization in mmap_base · cf92251d

Authored by Rik van Riel on Jul 12, 2017

When RLIMIT_STACK is, for example, 256MB, the current code results in a
gap between the top of the task and mmap_base of 256MB, failing to take
into account the amount by which the stack address was randomized.  In
other words, the stack gets less than RLIMIT_STACK space.

Ensure that the gap between the stack and mmap_base always takes stack
randomization and the stack guard gap into account.
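A rough Python model of the adjusted gap computation follows. It is a sketch under assumptions rather than the kernel function: the parameter names, the 128MB lower clamp, the five-sixths upper clamp, and the 4K page size are illustrative choices, and the overflow check that C code needs is omitted because Python integers do not wrap.

```python
PAGE_SIZE = 4096          # assumed 4K pages
MIN_GAP = 128 << 20       # assumed lower clamp on the gap


def page_align(addr):
    """Round addr up to the next page boundary."""
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def mmap_base(stack_top, rlimit_stack, max_stack_rnd, stack_guard_gap, rnd=0):
    """Place mmap_base below the stack, leaving room for randomization."""
    # The gap covers the stack limit plus the worst-case stack
    # randomization and the stack guard gap.
    gap = rlimit_stack + max_stack_rnd + stack_guard_gap
    max_gap = stack_top // 6 * 5  # assumed upper clamp
    gap = max(MIN_GAP, min(gap, max_gap))
    return page_align(stack_top - gap - rnd)


if __name__ == "__main__":
    top = 1 << 47
    base = mmap_base(top, 256 << 20, 0x3FFFF << 12, 256 * PAGE_SIZE)
    print(hex(top - base))  # larger than the 256MB stack limit
```

With these inputs the distance between the top of the task and mmap_base exceeds RLIMIT_STACK by the randomization range plus the guard gap, which is the property the message asks for.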

Obtained from Daniel Micay's linux-hardened tree.

Link: http://lkml.kernel.org/r/20170622200033.25714-3-riel@redhat.com
Signed-off-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Florian Weimer <fweimer@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

11 Jul, 2017 1 commit

arm64/kasan: don't allocate extra shadow memory · 3f9ec80f

Authored by Andrey Ryabinin on Jul 10, 2017

We used to read several bytes of the shadow memory in advance.
Therefore additional shadow memory was mapped to prevent a crash if a
speculative load happened near the end of the mapped shadow memory.

Now we don't have such speculative loads, so we no longer need to map
additional shadow memory.

Link: http://lkml.kernel.org/r/20170601162338.23540-3-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

07 Jul, 2017 3 commits

mm/hugetlb: add size parameter to huge_pte_offset() · 7868a208

Authored by Punit Agrawal on Jul 06, 2017

A poisoned or migrated hugepage is stored as a swap entry in the page
tables.  On architectures that support hugepages consisting of
contiguous page table entries (such as on arm64) this leads to ambiguity
in determining the page table entry to return in huge_pte_offset() when
a poisoned entry is encountered.

Let's remove the ambiguity by adding a size parameter to convey
additional information about the requested address.  Also fixup the
definition/usage of huge_pte_offset() throughout the tree.

Link: http://lkml.kernel.org/r/20170522133604.11392-4-punit.agrawal@arm.com
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: James Hogan <james.hogan@imgtec.com> (odd fixer:METAG ARCHITECTURE)
Cc: Ralf Baechle <ralf@linux-mips.org> (supporter:MIPS)
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

arm64: hugetlb: remove spurious calls to huge_ptep_offset() · f0b38d65

Authored by Steve Capper on Jul 06, 2017

We don't need to call huge_ptep_offset as our accessors are already
supplied with the pte_t *.  This patch removes those spurious calls.

[punit.agrawal@arm.com: resolve rebase conflicts due to patch re-ordering]
Link: http://lkml.kernel.org/r/20170524115409.31309-3-punit.agrawal@arm.com
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: David Woods <dwoods@mellanox.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

arm64: hugetlb: refactor find_num_contig() · bb9dd3df

Authored by Steve Capper on Jul 06, 2017

Patch series "Support for contiguous pte hugepages", v4.

This patchset updates the hugetlb code to fix issues arising from
contiguous pte hugepages (such as on arm64).  Compared to v3, this
version addresses a build failure on arm64 by including two cleanup
patches.  Other than the arm64 cleanups, the rest are generic code
changes.  The remaining arm64 support based on these patches will be
posted separately.  The patches are based on v4.12-rc2.  Previous
related postings can be found at [0], [1], [2], and [3].

The patches fall into three categories -

* Patches 1-2 - arm64 cleanups required to greatly simplify changing
  huge_pte_offset() prototype in Patch 5.

  Catalin, Will - are you happy for these patches to go via mm?

* Patches 3-4 address issues with gup

* Patches 5-8 relate to passing a size argument to hugepage helpers to
  disambiguate the size of the referred page. These changes are
  required to enable arch code to properly handle swap entries for
  contiguous pte hugepages.

  The changes to huge_pte_offset() (patch 5) touch multiple
  architectures but I've managed to minimise these changes for the
  other affected functions - huge_pte_clear() and set_huge_pte_at().

These patches gate the enabling of contiguous hugepages support on arm64
which has been requested for systems using !4k page granule.

The ARM64 architecture supports two flavours of hugepages -

* Block mappings at the pud/pmd level

  These are regular hugepages where a pmd or a pud page table entry
  points to a block of memory. Depending on the PAGE_SIZE in use, the
  following block mapping sizes are supported -

          PMD    PUD
          ---    ---
  4K:      2M     1G
  16K:    32M
  64K:   512M

  For certain applications/usecases such as HPC and large enterprise
  workloads, folks are using 64k page size but the minimum hugepage size
  of 512MB isn't very practical.

To overcome this ...

* Using the Contiguous bit

  The architecture provides a contiguous bit in the translation table
  entry which acts as a hint to the mmu to indicate that it is one of a
  contiguous set of entries that can be cached in a single TLB entry.

  We use the contiguous bit in Linux to increase the mapping size at the
  pmd and pte (last) level.

  The number of supported contiguous entries varies by page size and
  level of the page table.

  Using the contiguous bit allows additional hugepage sizes -

           CONT PTE    PMD    CONT PMD    PUD
           --------    ---    --------    ---
    4K:         64K     2M         32M     1G
    16K:         2M    32M          1G
    64K:         2M   512M         16G

  Of these, 64K with 4K and 2M with 64K pages have been explicitly
  requested by a few different users.
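The sizes in the two tables can be cross-checked with a short Python calculation. This is only a worked example: the size dictionary restates the tables above, the helper names are hypothetical, and the one outside fact used is that a translation table entry is 8 bytes, so a table page holds PAGE_SIZE / 8 entries.

```python
KB, MB, GB = 1 << 10, 1 << 20, 1 << 30

# Sizes copied from the tables above, keyed by page granule.
SIZES = {
    "4K": {"PTE": 4 * KB, "CONT PTE": 64 * KB, "PMD": 2 * MB, "CONT PMD": 32 * MB},
    "16K": {"PTE": 16 * KB, "CONT PTE": 2 * MB, "PMD": 32 * MB, "CONT PMD": 1 * GB},
    "64K": {"PTE": 64 * KB, "CONT PTE": 2 * MB, "PMD": 512 * MB, "CONT PMD": 16 * GB},
}


def pmd_block_size(page_size):
    """A PMD block covers one full table of last-level pages."""
    entries_per_table = page_size // 8  # 8-byte descriptors
    return page_size * entries_per_table


def contig_entries(granule):
    """Number of contiguous entries implied at the pte and pmd levels."""
    s = SIZES[granule]
    return s["CONT PTE"] // s["PTE"], s["CONT PMD"] // s["PMD"]


if __name__ == "__main__":
    for granule, sizes in SIZES.items():
        assert pmd_block_size(sizes["PTE"]) == sizes["PMD"]
        print(granule, contig_entries(granule))
```

The implied contiguous counts differ by granule and level, which matches the statement that the number of supported contiguous entries varies by page size and page table level.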

Entries with the contiguous bit set are required to be modified all
together - which makes things like memory poisoning and migration
impossible to do correctly without knowing the size of hugepage being
dealt with - the reason for adding size parameter to a few of the
hugepage helpers in this series.

This patch (of 8):

As we regularly check for contiguous pte's in the huge accessors, remove
this extra check from find_num_contig.

[punit.agrawal@arm.com: resolve rebase conflicts due to patch re-ordering]
Link: http://lkml.kernel.org/r/20170524115409.31309-2-punit.agrawal@arm.com
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: David Woods <dwoods@mellanox.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

23 Jun, 2017 3 commits

arm/arm64: KVM: add guest SEA support · 621f48e4

Authored by Tyler Baicar on Jun 21, 2017

Currently external aborts are unsupported by the guest abort
handling. Add handling for SEAs so that the host kernel reports
SEAs which occur in the guest kernel.

When an SEA occurs in the guest kernel, the guest exits and is
routed to kvm_handle_guest_abort(). Prior to this patch, a message
about an unsupported FSC would be printed and nothing else
would happen. With this patch, the code gets routed to the APEI
handling of SEAs in the host kernel to report the SEA information.
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

acpi: apei: handle SEA notification type for ARMv8 · 7edda088

Authored by Tyler Baicar on Jun 21, 2017

The ARM APEI extension proposal added the SEA (Synchronous External Abort)
notification type for ARMv8.
Add a new GHES error source handling function for SEA. If an error
source's notification type is SEA, then this function can be registered
into the SEA exception handler. That way GHES will parse and report
SEA exceptions when they occur.
An SEA can interrupt code that had interrupts masked and is treated as
an NMI. To aid this the page of address space for mapping APEI buffers
while in_nmi() is always reserved, and ghes_ioremap_pfn_nmi() is
changed to use the helper methods to find the prot_t to map with in
the same way as ghes_ioremap_pfn_irq().
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: exception: handle Synchronous External Abort · 32015c23

Authored by Tyler Baicar on Jun 21, 2017

SEA exceptions are often caused by an uncorrected hardware
error, and are handled when data abort and instruction abort
exception classes have specific values for their Fault Status
Code.
When SEA occurs, before killing the process, report the error
in the kernel logs.
Update fault_info[] with specific SEA faults so that the
new SEA handler is used.
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
[will: use NULL instead of 0 when assigning si_addr]
Signed-off-by: Will Deacon <will.deacon@arm.com>

20 Jun, 2017 1 commit

arm64: remove DMA_ERROR_CODE · e0d60ac1

Authored by Christoph Hellwig on May 21, 2017

The dma alloc interface returns an error by returning NULL, and the
mapping interfaces rely on the mapping_error method, which the dummy
ops already implement correctly.

Thus remove the DMA_ERROR_CODE define.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>

15 Jun, 2017 1 commit

arm64/dma-mapping: Remove extraneous null-pointer checks · 577dfe16

Authored by Olav Haugan on Jun 13, 2017

The current null-pointer check in __dma_alloc_coherent and
__dma_free_coherent is not needed anymore since the
__dma_alloc/__dma_free functions won't be called if !dev (dummy ops will
be called instead).
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

12 Jun, 2017 7 commits

arm64: mm: Update perf accounting to handle poison faults · 0e3a9026

Authored by Punit Agrawal on Jun 08, 2017

Re-organise the perf accounting for fault handling in preparation for
enabling handling of hardware poison faults in subsequent commits. The
change updates perf accounting to be in line with the behaviour on
x86.

With this update, the perf fault accounting -

  * Always report PERF_COUNT_SW_PAGE_FAULTS

  * Doesn't report anything else for VM_FAULT_ERROR (which includes
    hwpoison faults)

  * Reports PERF_COUNT_SW_PAGE_FAULTS_MAJ if it's a major
    fault (indicated by VM_FAULT_MAJOR)

  * Otherwise, reports PERF_COUNT_SW_PAGE_FAULTS_MIN
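The four rules above can be written as a small decision function. The Python below is a model for illustration only: the flag values are placeholders rather than the kernel's definitions, VM_FAULT_ERROR is reduced to a few error bits, and perf_events_for is a hypothetical name.

```python
# Placeholder flag values for illustration; not the kernel's definitions.
VM_FAULT_OOM = 0x01
VM_FAULT_SIGBUS = 0x02
VM_FAULT_MAJOR = 0x04
VM_FAULT_HWPOISON = 0x10
VM_FAULT_HWPOISON_LARGE = 0x20
VM_FAULT_ERROR = (VM_FAULT_OOM | VM_FAULT_SIGBUS |
                  VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE)


def perf_events_for(fault):
    """Return the software perf events reported for one fault result."""
    events = ["PERF_COUNT_SW_PAGE_FAULTS"]  # always reported
    if fault & VM_FAULT_ERROR:
        return events                       # nothing else on errors
    if fault & VM_FAULT_MAJOR:
        events.append("PERF_COUNT_SW_PAGE_FAULTS_MAJ")
    else:
        events.append("PERF_COUNT_SW_PAGE_FAULTS_MIN")
    return events


if __name__ == "__main__":
    print(perf_events_for(0))
    print(perf_events_for(VM_FAULT_MAJOR))
    print(perf_events_for(VM_FAULT_HWPOISON))
```

Checking the error bits before the major/minor split is what keeps hwpoison faults out of the MAJ and MIN counters.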
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling · e7c600f1

Authored by Jonathan (Zhixiong) Zhang on Jun 08, 2017

Add VM_FAULT_HWPOISON[_LARGE] handling to the arm64 page fault
handler. Handling of VM_FAULT_HWPOISON[_LARGE] is very similar
to VM_FAULT_OOM, the only difference is that a different si_code
(BUS_MCEERR_AR) is passed to user space and si_addr_lsb field is
initialized.
Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
(fix new __do_user_fault call-site)
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries · f02ab08a

Authored by Punit Agrawal on Jun 08, 2017

When memory failure is enabled, a poisoned hugepage pte is marked as a
swap entry. huge_pte_offset() does not return the poisoned page table
entries when it encounters PUD/PMD hugepages.

This behaviour of huge_pte_offset() leads to errors such as the one below
when munmap is called on poisoned hugepages.

[  344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074.

Fix huge_pte_offset() to return the poisoned pte which is then
appropriately handled by the generic layer code.
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Woods <dwoods@mellanox.com>
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: fault: Print info about page table structure when dumping pte · 1eb34b6e

Authored by Will Deacon on May 15, 2017

Whilst debugging a remote crash, I noticed that show_pte is unhelpful
when it comes to describing the structure of the page table being walked.
This is easily fixed by printing out the page table (swapper vs user),
page size and virtual address size when displaying the PGD address.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: mm: print file name of faulting vma · 83016b20

Authored by Kristina Martsenko on Jun 09, 2017

Print out the name of the file associated with the vma that faulted.
This is usually the executable or shared library name. We already print
out the task name, but also printing the library name is useful for
pinpointing bugs to libraries.

Also print the base address and size of the vma, which together with the
PC (printed by __show_regs) gives the offset into the library.

Fault prints now look like:
test[2361]: unhandled level 2 translation fault (11) at 0x00000012, esr 0x92000006, in libfoo.so[ffffa0145000+1000]

This is already done on x86, for more details see commit 03252919
("x86: print which shared library/executable faulted in segfault etc.
messages v3").
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: mm: don't print out page table entries on EL0 faults · bf396c09

Authored by Kristina Martsenko on Jun 09, 2017

When we take a fault from EL0 that can't be handled, we print out the
page table entries associated with the faulting address. This allows
userspace to print out any current page table entries, including kernel
(TTBR1) entries. Exposing kernel mappings like this could pose a
security risk, so don't print out page table information on EL0 faults.
(But still print it out for EL1 faults.) This also follows the same
behaviour as x86, printing out page table entries on kernel mode faults
but not user mode faults.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: mm: print out correct page table entries · 67ce16ec

Authored by Kristina Martsenko on Jun 09, 2017

When we take a fault that can't be handled, we print out the page table
entries associated with the faulting address. In some cases we currently
print out the wrong entries. For a faulting TTBR1 address, we sometimes
print out TTBR0 table entries instead, and for a faulting TTBR0 address
we sometimes print out TTBR1 table entries. Fix this by choosing the
tables based on the faulting address.
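Choosing the tables from the address alone can be sketched as follows. This Python model assumes a 48-bit VA configuration, so user addresses sit below 1 << 48 and kernel addresses at or above 0xffff000000000000; the function name and the string return values are hypothetical stand-ins for picking the task's page tables or the kernel's.

```python
VA_BITS = 48                        # assumed configuration
MASK64 = (1 << 64) - 1
TASK_SIZE = 1 << VA_BITS            # first address past the user range
VA_START = (MASK64 << VA_BITS) & MASK64  # lowest kernel address


def tables_for(addr):
    """Pick which translation tables describe a faulting address."""
    addr &= MASK64                  # treat the address as unsigned 64-bit
    if addr < TASK_SIZE:
        return "TTBR0"              # user tables
    if addr >= VA_START:
        return "TTBR1"              # kernel tables
    return None                     # in the hole: neither table applies


if __name__ == "__main__":
    print(tables_for(0x12))
    print(tables_for(0xFFFF000008E460D8))
    print(tables_for(0x0001000000000000))
```

Returning nothing for addresses between the two ranges avoids walking either set of tables with an address it cannot translate.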
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
[will: zero-extend addrs to 64-bit, don't walk swapper w/ TTBR0 addr]
Signed-off-by: Will Deacon <will.deacon@arm.com>

02 Jun, 2017 1 commit

arm64: kernel: restrict /dev/mem read() calls to linear region · 1151f838

Authored by Ard Biesheuvel on May 19, 2017

When running lscpu on an AArch64 system that has SMBIOS version 2.0
tables, it will segfault in the following way:

  Unable to handle kernel paging request at virtual address ffff8000bfff0000
  pgd = ffff8000f9615000
  [ffff8000bfff0000] *pgd=0000000000000000
  Internal error: Oops: 96000007 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 1284 Comm: lscpu Not tainted 4.11.0-rc3+ #103
  Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
  task: ffff8000fa78e800 task.stack: ffff8000f9780000
  PC is at __arch_copy_to_user+0x90/0x220
  LR is at read_mem+0xcc/0x140

This is caused by the fact that lscpu issues a read() on /dev/mem at the
offset where it expects to find the SMBIOS structure array. However, this
region is classified as EFI_RUNTIME_SERVICE_DATA (as per the UEFI spec),
and so it is omitted from the linear mapping.

So let's restrict /dev/mem read/write access to those areas that are
covered by the linear region.
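The restriction can be pictured as a range check against the memory regions that are part of the linear mapping. The Python below is a simplified model and not the kernel function: the region list, its (base, length, nomap) tuples, and the function name are assumptions made for illustration.

```python
def valid_phys_addr_range(regions, addr, size):
    """Allow access only if one mapped region covers the whole range.

    regions is a list of (base, length, nomap) tuples, where nomap
    marks memory that is left out of the linear mapping.
    """
    for base, length, nomap in regions:
        if base <= addr and addr + size <= base + length:
            return not nomap
    return False


if __name__ == "__main__":
    regions = [
        (0x40000000, 0x10000000, False),  # ordinary RAM, linearly mapped
        (0x50000000, 0x00010000, True),   # firmware data, not mapped
    ]
    print(valid_phys_addr_range(regions, 0x40001000, 0x1000))  # mapped RAM
    print(valid_phys_addr_range(regions, 0x50000000, 0x100))   # unmapped
```

A read that falls in an unmapped region, or that straddles the end of a mapped one, is rejected up front instead of faulting inside the copy routine.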
Reported-by: Alexander Graf <agraf@suse.de>
Fixes: 4dffbfc4 ("arm64/efi: mark UEFI reserved regions as MEMBLOCK_NOMAP")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

30 May, 2017 2 commits

arm64: mm: explicity include linux/vmalloc.h · 6efd8499

Authored by Tobias Klauser on May 15, 2017

arm64's mm/mmu.c uses vm_area_add_early, struct vm_area and other
definitions but relies on implicit inclusion of linux/vmalloc.h, which
means that changes in other headers could break the build. Thus, add an
explicit include.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: Call __show_regs directly · c07ab957

Authored by Kefeng Wang on May 09, 2017

Generic code expects show_regs() to also dump the stack, but arm64's
show_regs() does not do this. Some arm64 callers of show_regs() *only*
want the registers dumped, without the stack.

To enable generic code to work as expected, we need to make
show_regs() dump the stack. Where we only want the registers dumped,
we must use __show_regs().

This patch updates code to use __show_regs() where only registers are
desired. A subsequent patch will modify show_regs().
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

09 May, 2017 1 commit

arm64: use set_memory.h header · d4bbc30b

Authored by Laura Abbott on May 08, 2017

The set_memory_* functions have moved to set_memory.h. Use that header
explicitly.

Link: http://lkml.kernel.org/r/1488920133-27229-4-git-send-email-labbott@redhat.com
Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

05 May, 2017 1 commit

arm64: Fix the DMA mmap and get_sgtable API with DMA_ATTR_FORCE_CONTIGUOUS · 92f66f84

Authored by Catalin Marinas on Apr 25, 2017

While honouring the DMA_ATTR_FORCE_CONTIGUOUS on arm64 (commit
44176bb3: "arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to
IOMMU"), the existing uses of dma_mmap_attrs() and dma_get_sgtable()
have been broken by passing a physically contiguous vm_struct with an
invalid pages pointer through the common iommu API.

Since the coherent allocation with DMA_ATTR_FORCE_CONTIGUOUS uses CMA,
this patch simply reuses the existing swiotlb logic for mmap and
get_sgtable.

Note that the current implementation of get_sgtable (both swiotlb and
iommu) is broken if dma_declare_coherent_memory() is used since such
memory does not have a corresponding struct page. To be addressed in a
subsequent patch.

Fixes: 44176bb3 ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU")
Reported-by: Andrzej Hajda <a.hajda@samsung.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

02 May, 2017 1 commit

xen/arm,arm64: fix xen_dma_ops after "Consolidate get_dma_ops..." · e0586326

Authored by Stefano Stabellini on Apr 13, 2017

The following commit:

  commit 815dd187
  Author: Bart Van Assche <bart.vanassche@sandisk.com>
  Date:   Fri Jan 20 13:04:04 2017 -0800

      treewide: Consolidate get_dma_ops() implementations

rearranges get_dma_ops in a way that xen_dma_ops are not returned when
running on Xen anymore, dev->dma_ops is returned instead (see
arch/arm/include/asm/dma-mapping.h:get_arch_dma_ops and
include/linux/dma-mapping.h:get_dma_ops).

Fix the problem by storing dev->dma_ops in dev_archdata, and setting
dev->dma_ops to xen_dma_ops. This way, xen_dma_ops is returned naturally
by get_dma_ops. The Xen code can retrieve the original dev->dma_ops from
dev_archdata when needed. It also allows us to remove __generic_dma_ops
from common headers.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Julien Grall <julien.grall@arm.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>        [4.11+]
CC: linux@armlinux.org.uk
CC: catalin.marinas@arm.com
CC: will.deacon@arm.com
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
CC: Julien Grall <julien.grall@arm.com>

29 Apr, 2017 1 commit

iommu: Remove pci.h include from trace/events/iommu.h · 461a6946

Authored by Joerg Roedel on Apr 26, 2017

The include file does not need any PCI specifics, so remove
that include. Also fix the places that relied on it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>

461a6946

20 Apr, 2017 1 commit

arm64: dma-mapping: Remove the notifier trick to handle early setting of dma_ops · b913efe7

Committed by Sricharan R on Apr 10, 2017

With arch_setup_dma_ops now being called late during the device's probe,
after the device's iommu is probed, the notifier trick used to handle the
early setup of dma_ops before the iommu group gets created is no longer
required. So remove the notifiers here.
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
[rm: clean up even more]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

b913efe7

07 Apr, 2017 2 commits

Revert "Revert "arm64: hugetlb: partial revert of "" · 6ae979ab

Committed by Will Deacon on Mar 31, 2017

The use of the contiguous bit by our hugetlb implementation violates
the break-before-make requirements of the architecture and can lead to
silent data corruption or TLB conflict aborts. Once again, disable these
hugetlb sizes whilst it gets worked out.

This reverts commit ab2e1b89.

Conflicts:
	arch/arm64/mm/hugetlbpage.c
Signed-off-by: Will Deacon <will.deacon@arm.com>

6ae979ab

arm64: print a fault message when attempting to write RO memory · b824b930

Committed by Stephen Boyd on Apr 05, 2017

If a page is marked read only we should print out that fact,
instead of printing out that there was a page fault. Right now we
get a cryptic error message that something went wrong with an
unhandled fault, but we don't evaluate the esr to figure out that
it was a read/write permission fault.

Instead of seeing:

  Unable to handle kernel paging request at virtual address ffff000008e460d8
  pgd = ffff800003504000
  [ffff000008e460d8] *pgd=0000000083473003, *pud=0000000083503003, *pmd=0000000000000000
  Internal error: Oops: 9600004f [#1] PREEMPT SMP

we'll see:

  Unable to handle kernel write to read-only memory at virtual address ffff000008e760d8
  pgd = ffff80003d3de000
  [ffff000008e760d8] *pgd=0000000083472003, *pud=0000000083435003, *pmd=0000000000000000
  Internal error: Oops: 9600004f [#1] PREEMPT SMP

We also add a userspace address check into is_permission_fault()
so that the function doesn't return true for ttbr0 PAN faults
when it shouldn't.
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

b824b930
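
As a rough illustration of the ESR evaluation this commit describes, the sketch below decodes the `9600004f` value shown in the Oops lines above. The field positions follow the ESR_ELx layout in the ARM Architecture Reference Manual, but the `SKETCH_*` macros and `sketch_*` helpers are names made up for this example, not the kernel's own. It handles data aborts only and omits the userspace address check for ttbr0 PAN faults that the commit adds.

```c
#include <stdbool.h>
#include <stdint.h>

/* ESR_ELx field layout; macro names are local to this sketch. */
#define SKETCH_ESR_EC_SHIFT	26
#define SKETCH_ESR_EC_MASK	0x3fu
#define SKETCH_ESR_EC_DABT_CUR	0x25u		/* data abort from the current EL */
#define SKETCH_ESR_WNR		(1u << 6)	/* set when the access was a write */
#define SKETCH_ESR_FSC_TYPE	0x3cu		/* fault status code, level bits masked off */
#define SKETCH_ESR_FSC_PERM	0x0cu		/* permission fault, any level */

/* Extract the exception class from a syndrome value. */
static inline unsigned int sketch_esr_ec(uint32_t esr)
{
	return (esr >> SKETCH_ESR_EC_SHIFT) & SKETCH_ESR_EC_MASK;
}

/* True for a data abort from the current EL caused by a permission fault. */
static inline bool sketch_is_permission_fault(uint32_t esr)
{
	return sketch_esr_ec(esr) == SKETCH_ESR_EC_DABT_CUR &&
	       (esr & SKETCH_ESR_FSC_TYPE) == SKETCH_ESR_FSC_PERM;
}

/* True when the permission fault was raised by a write access. */
static inline bool sketch_is_write_to_ro(uint32_t esr)
{
	return sketch_is_permission_fault(esr) && (esr & SKETCH_ESR_WNR) != 0;
}
```

For `0x9600004f` the exception class is `0x25`, the fault status code is `0x0f` (a level 3 permission fault) and the write bit is set, which is the combination the new "write to read-only memory" message reports.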

06 Apr, 2017 6 commits

arm64: kdump: provide /proc/vmcore file · e62aaeac

Committed by AKASHI Takahiro on Apr 03, 2017

Arch-specific functions are added to allow for implementing a crash dump
file interface, /proc/vmcore, which can be viewed as an ELF file.

A user space tool, like kexec-tools, is responsible for allocating
a separate region for the core's ELF header within crash kdump kernel
memory and filling it in when executing kexec_load().

Then, its location will be advertised to crash dump kernel via a new
device-tree property, "linux,elfcorehdr", and crash dump kernel preserves
the region for later use with reserve_elfcorehdr() at boot time.

On crash dump kernel, /proc/vmcore will access the primary kernel's memory
with copy_oldmem_page(), which feeds the data page-by-page by ioremap'ing
it since it does not reside in linear mapping on crash dump kernel.

Meanwhile, elfcorehdr_read() is simple as the region is always mapped.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

e62aaeac

arm64: hibernate: preserve kdump image around hibernation · 254a41c0

Committed by AKASHI Takahiro on Apr 03, 2017

Since arch_kexec_protect_crashkres() removes a mapping for crash dump
kernel image, the loaded data won't be preserved around hibernation.

In this patch, helper functions, crash_prepare_suspend()/
crash_post_resume(), are additionally called before/after hibernation so
that the relevant memory segments will be mapped again and preserved just
as the others are.

In addition, to minimize the size of hibernation image, crash_is_nosave()
is added to pfn_is_nosave() in order to recognize only the pages that hold
loaded crash dump kernel image as saveable. Hibernation excludes any pages
that are marked as Reserved and yet "nosave."
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

254a41c0

arm64: kdump: protect crash dump kernel memory · 98d2e153

Committed by Takahiro Akashi on Apr 03, 2017

arch_kexec_protect_crashkres() and arch_kexec_unprotect_crashkres()
are meant to be called by kexec_load() in order to protect the memory
allocated for crash dump kernel once the image is loaded.

The protection is implemented by unmapping the relevant segments in crash
dump kernel memory, rather than making it read-only as other archs do,
to prevent coherency issues due to potential cache aliasing (with
mismatched attributes).

Page-level mappings are consistently used here so that we can change
the attributes of segments in page granularity as well as shrink the region
also in page granularity through /sys/kernel/kexec_crash_size, putting
the freed memory back to buddy system.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

98d2e153

arm64: mm: add set_memory_valid() · 9b0aa14e

Committed by AKASHI Takahiro on Apr 03, 2017

This function validates and invalidates PTE entries, and will be utilized
in kdump to protect loaded crash dump kernel image.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

9b0aa14e

arm64: kdump: reserve memory for crash dump kernel · 764b51ea

Committed by AKASHI Takahiro on Apr 03, 2017

"crashkernel=" kernel parameter specifies the size (and optionally
the start address) of the system ram to be used by crash dump kernel.
reserve_crashkernel() will allocate and reserve that memory at boot time
of primary kernel.

The memory range will be exposed to userspace as a resource named
"Crash kernel" in /proc/iomem.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Pratyush Anand <panand@redhat.com>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

764b51ea
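
To make the "size, optionally followed by a start address" form of "crashkernel=" concrete, here is a simplified parsing sketch. It is not the kernel's parser: the `sketch_*` helpers are hypothetical, and only the plain `size[@start]` form with K/M/G suffixes is assumed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix; *end points past it. */
static uint64_t sketch_memparse(const char *s, char **end)
{
	uint64_t v = strtoull(s, end, 0);
	unsigned int shift = 0;

	switch (**end) {
	case 'K': case 'k': shift = 10; break;
	case 'M': case 'm': shift = 20; break;
	case 'G': case 'g': shift = 30; break;
	default: break;
	}
	if (shift) {
		v <<= shift;
		(*end)++;
	}
	return v;
}

/*
 * Parse "size[@start]". Returns false on malformed input. *has_base
 * reports whether a start address was given; without one, the caller
 * would be free to pick a suitable location itself.
 */
static bool sketch_parse_crashkernel(const char *arg, uint64_t *size,
				     uint64_t *base, bool *has_base)
{
	char *end;

	*base = 0;
	*has_base = false;
	*size = sketch_memparse(arg, &end);
	if (end == arg || *size == 0)
		return false;

	if (*end == '@') {
		const char *p = end + 1;

		*base = sketch_memparse(p, &end);
		if (end == p)
			return false;
		*has_base = true;
	}
	return *end == '\0';
}
```

For example, `256M@0x80000000` would yield a 256 MiB region starting at `0x80000000`, while `512M` yields a size only.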

arm64: limit memory regions based on DT property, usable-memory-range · 8f579b1c

Committed by AKASHI Takahiro on Apr 03, 2017

Crash dump kernel uses only a limited range of available memory as System
RAM. On arm64 kdump, this memory range is advertised to crash dump kernel
via a device-tree property under /chosen,
   linux,usable-memory-range = <BASE SIZE>

Crash dump kernel reads this property at boot time and calls
memblock_cap_memory_range() to limit the usable memory regions, which are
listed either in the UEFI memory map table or in "memory" nodes of a device
tree blob.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Geoff Levand <geoff@infradead.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

8f579b1c
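
The effect of capping memory to the advertised range can be sketched as clipping a region list to a `[BASE, BASE + SIZE)` window. This is only an illustration of the idea, not the memblock implementation: `struct sketch_region` and `sketch_cap_memory_range()` are hypothetical, and the sketch assumes that no region end or window end overflows 64 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory region: [base, base + size). */
struct sketch_region {
	uint64_t base;
	uint64_t size;
};

/*
 * Clip every region to the window [base, base + size), dropping regions
 * that fall entirely outside it. The array is compacted in place and the
 * number of surviving regions is returned.
 */
static size_t sketch_cap_memory_range(struct sketch_region *regions, size_t n,
				      uint64_t base, uint64_t size)
{
	uint64_t lo = base;
	uint64_t hi = base + size;
	size_t out = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t start = regions[i].base;
		uint64_t end = regions[i].base + regions[i].size;

		if (start < lo)
			start = lo;
		if (end > hi)
			end = hi;
		if (start < end) {
			regions[out].base = start;
			regions[out].size = end - start;
			out++;
		}
	}
	return out;
}
```

After the call, only memory inside the advertised window remains visible as System RAM, whichever firmware table originally described it.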

openanolis / cloud-kernel synced successfully more than 1 year ago
