提交 · 290622efc76ece22ef76a30bf117755891ab27f6 · openanolis / cloud-kernel

01 7月, 2016 2 次提交

arm64: fix "dc cvau" cache operation on errata-affected core · 290622ef

由 Andre Przywara 提交于 6月 28, 2016

The ARM errata 819472, 826319, 827319 and 824069 for affected
Cortex-A53 cores demand to promote "dc cvau" instructions to
"dc civac" as well.
Attribute the usage of the instruction in __flush_cache_user_range
to also be covered by our alternative patching efforts.
For that we introduce an assembly macro which both deals with
alternatives while still tagging the instructions as USER.
Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

290622ef

arm64: mm: remove unnecessary BUG_ON · 6c5269f3

由 Kefeng Wang 提交于 6月 30, 2016

The memblock_alloc() and memblock_alloc_base() will panic on their own
if no free memory, remove pointless BUG_ON.
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

6c5269f3

28 6月, 2016 2 次提交

arm64: mm: fix location of _etext · 9fdc14c5

由 Ard Biesheuvel 提交于 6月 23, 2016

As Kees Cook notes in the ARM counterpart of this patch [0]:

  The _etext position is defined to be the end of the kernel text code,
  and should not include any part of the data segments. This interferes
  with things that might check memory ranges and expect executable code
  up to _etext.

In particular, Kees is referring to the HARDENED_USERCOPY patch set [1],
which rejects attempts to call copy_to_user() on kernel ranges containing
executable code, but does allow access to the .rodata segment. Regardless
of whether one may or may not agree with the distinction, it makes sense
for _etext to have the same meaning across architectures.

So let's put _etext where it belongs, between .text and .rodata, and fix
up existing references to use __init_begin instead, which unlike _end_rodata
includes the exception and notes sections as well.

The _etext references in kaslr.c are left untouched, since its references
to [_stext, _etext) are meant to capture potential jump instruction targets,
and so disregarding .rodata is actually an improvement here.

[0] http://article.gmane.org/gmane.linux.kernel/2245084
[1] http://thread.gmane.org/gmane.linux.kernel.hardened.devel/2502Reported-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

9fdc14c5

arm64: mm: simplify memblock numa node extraction · ea2cbee3

由 Mark Rutland 提交于 6月 22, 2016

We currently open-code extracting the NUMA node of a memblock region,
which requires an ifdef to cater for !CONFIG_NUMA builds where the
memblock_region::nid field does not exist.

The generic memblock_get_region_node helper is intended to cater for
this. For CONFIG_HAVE_MEMBLOCK_NODE_MAP, builds this returns reg->nid,
and for for !CONFIG_HAVE_MEMBLOCK_NODE_MAP builds this is a static
inline that returns 0. Note that for arm64,
CONFIG_HAVE_MEMBLOCK_NODE_MAP is selected iff CONFIG_NUMA is.

This patch makes use of memblock_get_region_node to simplify the arm64
code. At the same time, we can move the nid variable definition into the
loop, as this is the only place it is used.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Reviewed-by: NSteve Capper <steve.capper@arm.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

ea2cbee3

22 6月, 2016 2 次提交

arm64: kill ESR_LNX_EXEC · 541ec870

由 Mark Rutland 提交于 5月 31, 2016

Currently we treat ESR_EL1 bit 24 as software-defined for distinguishing
instruction aborts from data aborts, but this bit is architecturally
RES0 for instruction aborts, and could be allocated for an arbitrary
purpose in future. Additionally, we hard-code the value in entry.S
without the mnemonic, making the code difficult to understand.

Instead, remove ESR_LNX_EXEC, and distinguish aborts based on the esr,
which we already pass to the sole use of ESR_LNX_EXEC. A new helper,
is_el0_instruction_abort() is added to make the logic clear. Any
instruction aborts taken from EL1 will already have been handled by
bad_mode, so we need not handle that case in the helper.

For consistency, the existing permission_fault helper is renamed to
is_permission_fault, and the return type is changed to bool. There
should be no functional changes as the return value was a boolean
expression, and the result is only used in another boolean expression.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Dave P Martin <dave.martin@arm.com>
Cc: Huang Shijie <shijie.huang@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

541ec870

arm64: add macro to extract ESR_ELx.EC · 275f344b

由 Mark Rutland 提交于 5月 31, 2016

Several places open-code extraction of the EC field from an ESR_ELx
value, in subtly different ways. This is unfortunate duplication and
variation, and the precise logic used to extract the field is a
distraction.

This patch adds a new macro, ESR_ELx_EC(), to extract the EC field from
an ESR_ELx value in a consistent fashion.

Existing open-coded extractions in core arm64 code are moved over to the
new helper. KVM code is left as-is for the moment.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Tested-by: NHuang Shijie <shijie.huang@arm.com>
Cc: Dave P Martin <dave.martin@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

275f344b

21 6月, 2016 2 次提交

arm64: mm: only initialize swiotlb when necessary · b67a8b29

由 Jisheng Zhang 提交于 6月 08, 2016

we only initialize swiotlb when swiotlb_force is true or not all system
memory is DMA-able, this trivial optimization saves us 64MB when
swiotlb is not necessary.
Signed-off-by: NJisheng Zhang <jszhang@marvell.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

b67a8b29

arm64: mm: dump: make page table dumping reusable · 4674fdb9

由 Mark Rutland 提交于 5月 31, 2016

For debugging purposes, it would be nice if we could export page tables
other than the swapper_pg_dir to userspace. To enable this, this patch
refactors the arm64 page table dumping code such that multiple tables
may be registered with the framework, and exported under debugfs.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

4674fdb9

14 6月, 2016 1 次提交

arm64: mm: mark fault_info table const · bbb1681e

由 Mark Rutland 提交于 6月 13, 2016

Unlike the debug_fault_info table, we never intentionally alter the
fault_info table at runtime, and all derived pointers are treated as
const currently.

Make the table const so that it can be placed in .rodata and protected
from unintentional writes, as we do for the syscall tables.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

bbb1681e

08 6月, 2016 1 次提交

arm64: mm: always take dirty state from new pte in ptep_set_access_flags · 0106d456

由 Will Deacon 提交于 6月 07, 2016

Commit 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for
hardware AF/DBM") ensured that pte flags are updated atomically in the
face of potential concurrent, hardware-assisted updates. However, Alex
reports that:

 | This patch breaks swapping for me.
 | In the broken case, you'll see either systemd cpu time spike (because
 | it's stuck in a page fault loop) or the system hang (because the
 | application owning the screen is stuck in a page fault loop).

It turns out that this is because the 'dirty' argument to
ptep_set_access_flags is always 0 for read faults, and so we can't use
it to set PTE_RDONLY. The failing sequence is:

  1. We put down a PTE_WRITE | PTE_DIRTY | PTE_AF pte
  2. Memory pressure -> pte_mkold(pte) -> clear PTE_AF
  3. A read faults due to the missing access flag
  4. ptep_set_access_flags is called with dirty = 0, due to the read fault
  5. pte is then made PTE_WRITE | PTE_DIRTY | PTE_AF | PTE_RDONLY (!)
  6. A write faults, but pte_write is true so we get stuck

The solution is to check the new page table entry (as would be done by
the generic, non-atomic definition of ptep_set_access_flags that just
calls set_pte_at) to establish the dirty state.

Cc: <stable@vger.kernel.org> # 4.3+
Fixes: 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Reported-by: NAlexander Graf <agraf@suse.de>
Tested-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

0106d456

03 6月, 2016 1 次提交

arm64: mm: dump: log span level · 48dd73c5

由 Mark Rutland 提交于 5月 31, 2016

The page table dump code logs spans of entries at the same level
(pgd/pud/pmd/pte) which have the same attributes. While we log the
(decoded) attributes, we don't log the level, which leaves the output
ambiguous and/or confusing in some cases.

For example:

0xffff800800000000-0xffff800980000000           6G       RW NX SHD AF        BLK UXN MEM/NORMAL

If using 4K pages, this may describe a span of 6 1G block entries at the
PGD/PUD level, or 3072 2M block entries at the PMD level.

This patch adds the page table level to each output line, removing this
ambiguity. For the example above, this will produce:

0xffffffc800000000-0xffffffc980000000           6G PUD       RW NX SHD AF        BLK UXN MEM/NORMAL

When 3 level tables are in use, and we use the asm-generic/nopud.h
definitions, the dump code treats each entry in the PGD as a 1 element
table at the PUD level, and logs spans as being PUDs, which can be
confusing. To counteract this, the "PUD" mnemonic is replaced with "PGD"
when CONFIG_PGTABLE_LEVELS <= 3. Likewise for "PMD" when
CONFIG_PGTABLE_LEVELS <= 2.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Huang Shijie <shijie.huang@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

48dd73c5

31 5月, 2016 1 次提交

Revert "arm64: hugetlb: partial revert of " · ab2e1b89

由 Will Deacon 提交于 5月 31, 2016

This reverts commit ff792584.

Now that the contiguous-hint hugetlb regression has been debugged and
fixed upstream by 66ee95d1 ("mm: exclude HugeTLB pages from THP
page_mapped() logic"), we can revert the previous partial revert of this
feature.
Signed-off-by: NWill Deacon <will.deacon@arm.com>

ab2e1b89

20 5月, 2016 1 次提交

arm64: mm: use hugetlb_bad_size() · d77e20ce

由 Vaishali Thakkar 提交于 5月 19, 2016

Update setup_hugepagesz() to call hugetlb_bad_size() when unsupported
hugepage size is found.
Signed-off-by: NVaishali Thakkar <vaishali.thakkar@oracle.com>
Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d77e20ce

09 5月, 2016 2 次提交

iommu/dma: Finish optimising higher-order allocations · 3b6b7e19

由 Robin Murphy 提交于 4月 13, 2016

Now that we know exactly which page sizes our caller wants to use in the
given domain, we can restrict higher-order allocation attempts to just
those sizes, if any, and avoid wasting any time or effort on other sizes
which offer no benefit. In the same vein, this also lets us accommodate
a minimum order greater than 0 for special cases.
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Tested-by: NYong Wu <yong.wu@mediatek.com>
Signed-off-by: NJoerg Roedel <jroedel@suse.de>

3b6b7e19

iommu: of: enforce const-ness of struct iommu_ops · 53c92d79

由 Robin Murphy 提交于 4月 07, 2016

As a set of driver-provided callbacks and static data, there is no
compelling reason for struct iommu_ops to be mutable in core code, so
enforce const-ness throughout.
Acked-by: NThierry Reding <treding@nvidia.com>
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Acked-by: NWill Deacon <will.deacon@arm.com>
Signed-off-by: NJoerg Roedel <jroedel@suse.de>

53c92d79

05 5月, 2016 1 次提交

arm64: mm: remove unnecessary EXPORT_SYMBOL_GPL · 394bf2f2

由 Yang Shi 提交于 5月 04, 2016

arch_pick_mmap_layout is only called by fs/exec.c which is always built into
kernel, it looks the EXPORT_SYMBOL_GPL is pointless and no architectures export
it other than ARM64.
Signed-off-by: NYang Shi <yang.shi@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

394bf2f2

28 4月, 2016 2 次提交

arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va · cabe1c81

由 James Morse 提交于 4月 27, 2016

By enabling the MMU early in cpu_resume(), the sleep_save_sp and stack can
be accessed by VA, which avoids the need to convert-addresses and clean to
PoC on the suspend path.

MMU setup is shared with the boot path, meaning the swapper_pg_dir is
restored directly: ttbr1_el1 is no longer saved/restored.

struct sleep_save_sp is removed, replacing it with a single array of
pointers.

cpu_do_{suspend,resume} could be further reduced to not restore: cpacr_el1,
mdscr_el1, tcr_el1, vbar_el1 and sctlr_el1, all of which are set by
__cpu_setup(). However these values all contain res0 bits that may be used
to enable future features.
Signed-off-by: NJames Morse <james.morse@arm.com>
Reviewed-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

cabe1c81

arm64: Fold proc-macros.S into assembler.h · 7b7293ae

由 Geoff Levand 提交于 4月 27, 2016

To allow the assembler macros defined in arch/arm64/mm/proc-macros.S to
be used outside the mm code move the contents of proc-macros.S to
asm/assembler.h.  Also, delete proc-macros.S, and fix up all references
to proc-macros.S.
Signed-off-by: NGeoff Levand <geoff@infradead.org>
Acked-by: NPavel Machek <pavel@ucw.cz>
[rebased, included dcache_by_line_op]
Signed-off-by: NJames Morse <james.morse@arm.com>
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

7b7293ae

25 4月, 2016 2 次提交

arm64: ptdump: add region marker for kasan shadow region · d8fc68a0

由 Ard Biesheuvel 提交于 4月 22, 2016

Annotate the KASAN shadow region with boundary markers, so that its
mappings stand out in the page table dumper output.
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

d8fc68a0

arm64: ptdump: use static initializers for vmemmap region boundaries · c8f8cca4

由 Ard Biesheuvel 提交于 4月 22, 2016

There is no need to initialize the vmemmap region boundaries dynamically,
since they are compile time constants. So just add these constants to the
global struct initializer, and drop the dynamic assignment and related code.
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

c8f8cca4

22 4月, 2016 2 次提交

arm64/dma-mapping: Remove default domain workaround · 921b1f52

由 Robin Murphy 提交于 4月 19, 2016

With the IOMMU core now taking care of default domains for groups
regardless of bus type, we can gleefully rip out this stop-gap, as
slight recompense for having to expand the other one.
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

921b1f52

arm64/dma-mapping: Extend DMA ops workaround to PCI devices · 226d89cb

由 Robin Murphy 提交于 4月 19, 2016

PCI devices now suffer the same hiccup as platform devices, in that they
get their DMA ops configured before they have been added to their bus,
and thus before we know whether they have successfully registered with
an IOMMU or not. Until the necessary driver core changes to reorder
calls during device creation have been worked out, extend our delayed
notifier trick onto the PCI bus so as to avoid broken DMA ops once
IOMMUs get plugged into the PCI code.
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

226d89cb

20 4月, 2016 2 次提交

arm64: mm: Show bss segment in kernel memory layout · 9974723e

由 Kefeng Wang 提交于 4月 18, 2016

Show the bss segment information as with text and data in Virtual
memory kernel layout.
Acked-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

9974723e

arm64: mm: make pr_cont() per line in Virtual kernel memory layout · d32351c8

由 Kefeng Wang 提交于 4月 18, 2016

Each line with single pr_cont() in Virtual kernel memory layout,
or the dump of the kernel memory layout in dmesg is not aligned
when PRINTK_TIME enabled, due to the missing time stamps.
Tested-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

d32351c8

19 4月, 2016 1 次提交

arm64: mm: remove the redundant code · 3a72db70

由 Huang Shijie 提交于 4月 18, 2016

We already re-enable interrupts where necessary in the entry code, so
there is no need to do it again in do_page fault. This patch removes
the redundant code.
Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
Signed-off-by: NHuang Shijie <shijie.huang@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

3a72db70

16 4月, 2016 4 次提交

arm64: Implement ptep_set_access_flags() for hardware AF/DBM · 66dbd6e6

由 Catalin Marinas 提交于 4月 13, 2016

When hardware updates of the access and dirty states are enabled, the
default ptep_set_access_flags() implementation based on calling
set_pte_at() directly is potentially racy. This triggers the "racy dirty
state clearing" warning in set_pte_at() because an existing writable PTE
is overridden with a clean entry.

There are two main scenarios for this situation:

1. The CPU getting an access fault does not support hardware updates of
   the access/dirty flags. However, a different agent in the system
   (e.g. SMMU) can do this, therefore overriding a writable entry with a
   clean one could potentially lose the automatically updated dirty
   status

2. A more complex situation is possible when all CPUs support hardware
   AF/DBM:

   a) Initial state: shareable + writable vma and pte_none(pte)
   b) Read fault taken by two threads of the same process on different
      CPUs
   c) CPU0 takes the mmap_sem and proceeds to handling the fault. It
      eventually reaches do_set_pte() which sets a writable + clean pte.
      CPU0 releases the mmap_sem
   d) CPU1 acquires the mmap_sem and proceeds to handle_pte_fault(). The
      pte entry it reads is present, writable and clean and it continues
      to pte_mkyoung()
   e) CPU1 calls ptep_set_access_flags()

   If between (d) and (e) the hardware (another CPU) updates the dirty
   state (clears PTE_RDONLY), CPU1 will override the PTR_RDONLY bit
   marking the entry clean again.

This patch implements an arm64-specific ptep_set_access_flags() function
to perform an atomic update of the PTE flags.

Fixes: 2f4b829c ("arm64: Add support for hardware updates of the access and dirty pte bits")
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Reported-by: NMing Lei <tom.leiming@gmail.com>
Tested-by: NJulien Grall <julien.grall@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # 4.3+
[will: reworded comment]
Signed-off-by: NWill Deacon <will.deacon@arm.com>

66dbd6e6

arm64, numa: Add NUMA support for arm64 platforms. · 1a2db300

由 Ganapatrao Kulkarni 提交于 4月 08, 2016

Attempt to get the memory and CPU NUMA node via of_numa.  If that
fails, default the dummy NUMA node and map all memory and CPUs to node
0.
Tested-by: NShannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: NRobert Richter <rrichter@cavium.com>
Signed-off-by: NGanapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Signed-off-by: NDavid Daney <david.daney@cavium.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

1a2db300

arm64: Move unflatten_device_tree() call earlier. · 3194ac6e

由 David Daney 提交于 4月 08, 2016

In order to extract NUMA information from the device tree, we need to
have the tree in its unflattened form.

Move the call to bootmem_init() in the tail of paging_init() into
setup_arch, and adjust header files so that its declaration is
visible.

Move the unflatten_device_tree() call between the calls to
paging_init() and bootmem_init().  Follow on patches add NUMA handling
to bootmem_init().
Signed-off-by: NDavid Daney <david.daney@cavium.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

3194ac6e

arm64: Add cpu_panic_kernel helper · 17eebd1a

由 Suzuki K Poulose 提交于 4月 12, 2016

During the activation of a secondary CPU, we could report serious
configuration issues and hence request to crash the kernel. We do
this for CPU ASID bit check now. We will need it also for handling
mismatched exception levels for the CPUs with VHE. Hence, add a
helper to do the same for reusability.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

17eebd1a

15 4月, 2016 3 次提交

arm64: mm: Add trace_irqflags annotations to do_debug_exception() · 6afedcd2

由 James Morse 提交于 4月 13, 2016

With CONFIG_PROVE_LOCKING, CONFIG_DEBUG_LOCKDEP and CONFIG_TRACE_IRQFLAGS
enabled, lockdep will compare current->hardirqs_enabled with the flags from
local_irq_save().

When a debug exception occurs, interrupts are disabled in entry.S, but
lockdep isn't told, resulting in:
DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
------------[ cut here ]------------
WARNING: at ../kernel/locking/lockdep.c:3523
Modules linked in:
CPU: 3 PID: 1752 Comm: perf Not tainted 4.5.0-rc4+ #2204
Hardware name: ARM Juno development board (r1) (DT)
task: ffffffc974868000 ti: ffffffc975f40000 task.ti: ffffffc975f40000
PC is at check_flags.part.35+0x17c/0x184
LR is at check_flags.part.35+0x17c/0x184
pc : [<ffffff80080fc93c>] lr : [<ffffff80080fc93c>] pstate: 600003c5
[...]
---[ end trace 74631f9305ef5020 ]---
Call trace:
[<ffffff80080fc93c>] check_flags.part.35+0x17c/0x184
[<ffffff80080ffe30>] lock_acquire+0xa8/0xc4
[<ffffff8008093038>] breakpoint_handler+0x118/0x288
[<ffffff8008082434>] do_debug_exception+0x3c/0xa8
[<ffffff80080854b4>] el1_dbg+0x18/0x6c
[<ffffff80081e82f4>] do_filp_open+0x64/0xdc
[<ffffff80081d6e60>] do_sys_open+0x140/0x204
[<ffffff80081d6f58>] SyS_openat+0x10/0x18
[<ffffff8008085d30>] el0_svc_naked+0x24/0x28
possible reason: unannotated irqs-off.
irq event stamp: 65857
hardirqs last  enabled at (65857): [<ffffff80081fb1c0>] lookup_mnt+0xf4/0x1b4
hardirqs last disabled at (65856): [<ffffff80081fb188>] lookup_mnt+0xbc/0x1b4
softirqs last  enabled at (65790): [<ffffff80080bdca4>] __do_softirq+0x1f8/0x290
softirqs last disabled at (65757): [<ffffff80080be038>] irq_exit+0x9c/0xd0

This patch adds the annotations to do_debug_exception(), while trying not
to call trace_hardirqs_off() if el1_dbg() interrupted a task that already
had irqs disabled.
Signed-off-by: NJames Morse <james.morse@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

6afedcd2

arm64: cover the .head.text section in the .text segment mapping · 7eb90f2f

由 Ard Biesheuvel 提交于 3月 30, 2016

Keeping .head.text out of the .text mapping buys us very little: its actual
payload is only 4 KB, most of which is padding, but the page alignment may
add up to 2 MB (in case of CONFIG_DEBUG_ALIGN_RODATA=y) of additional
padding to the uncompressed kernel Image.

Also, on 4 KB granule kernels, the 4 KB misalignment of .text forces us to
map the adjacent 56 KB of code without the PTE_CONT attribute, and since
this region contains things like the vector table and the GIC interrupt
handling entry point, this region is likely to benefit from the reduced TLB
pressure that results from PTE_CONT mappings.

So remove the alignment between the .head.text and .text sections, and use
the [_text, _etext) rather than the [_stext, _etext) interval for mapping
the .text segment.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

7eb90f2f

arm64: use 'segment' rather than 'chunk' to describe mapped kernel regions · 2c09ec06

由 Ard Biesheuvel 提交于 3月 30, 2016

Replace the poorly defined term chunk with segment, which is a term that is
already used by the ELF spec to describe contiguous mappings with the same
permission attributes of statically allocated ranges of an executable.
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

2c09ec06

14 4月, 2016 4 次提交

arm64: mm: move vmemmap region right below the linear region · 3e1907d5

由 Ard Biesheuvel 提交于 3月 30, 2016

This moves the vmemmap region right below PAGE_OFFSET, aka the start
of the linear region, and redefines its size to be a power of two.
Due to the placement of PAGE_OFFSET in the middle of the address space,
whose size is a power of two as well, this guarantees that virt to
page conversions and vice versa can be implemented efficiently, by
masking and shifting rather than ordinary arithmetic.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

3e1907d5

arm64: mm: free __init memory via the linear mapping · d386825c

由 Ard Biesheuvel 提交于 3月 30, 2016

The implementation of free_initmem_default() expects __init_begin
and __init_end to be covered by the linear mapping, which is no
longer the case. So open code it instead, using addresses that are
explicitly translated from kernel virtual to linear virtual.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

d386825c

arm64: add the initrd region to the linear mapping explicitly · 177e15f0

由 Ard Biesheuvel 提交于 3月 30, 2016

Instead of going out of our way to relocate the initrd if it turns out
to occupy memory that is not covered by the linear mapping, just add the
initrd to the linear mapping. This puts the burden on the bootloader to
pass initrd= and mem= options that are mutually consistent.

Note that, since the placement of the linear region in the PA space is
also dependent on the placement of the kernel Image, which may reside
anywhere in memory, we may still end up with a situation where the initrd
and the kernel Image are simply too far apart to be covered by the linear
region.

Since we now leave it up to the bootloader to pass the initrd in memory
that is guaranteed to be accessible by the kernel, add a mention of this to
the arm64 boot protocol specification as well.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

177e15f0

arm64/mm: ensure memstart_addr remains sufficiently aligned · 2958987f

由 Ard Biesheuvel 提交于 3月 30, 2016

After choosing memstart_addr to be the highest multiple of
ARM64_MEMSTART_ALIGN less than or equal to the first usable physical memory
address, we clip the memblocks to the maximum size of the linear region.
Since the kernel may be high up in memory, we take care not to clip the
kernel itself, which means we have to clip some memory from the bottom if
this occurs, to ensure that the distance between the first and the last
usable physical memory address can be covered by the linear region.

However, we fail to update memstart_addr if this clipping from the bottom
occurs, which means that we may still end up with virtual addresses that
wrap into the userland range. So increment memstart_addr as appropriate to
prevent this from happening.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

2958987f

25 3月, 2016 2 次提交

arm64: mm: allow preemption in copy_to_user_page · 691b1e2e

由 Mark Rutland 提交于 3月 22, 2016

Currently we disable preemption in copy_to_user_page; a behaviour that
we inherited from the 32-bit arm code. This was necessary for older
cores without broadcast data cache maintenance, and ensured that cache
lines were dirtied and cleaned by the same CPU. On these systems dirty
cache line migration was not possible, so this was sufficient to
guarantee coherency.

On contemporary systems, cache coherence protocols permit (dirty) cache
lines to migrate between CPUs as a result of speculation, prefetching,
and other behaviours. To account for this, in ARMv8 data cache
maintenance operations are broadcast and affect all data caches in the
domain associated with the VA (i.e. ISH for kernel and user mappings).

In __switch_to we ensure that tasks can be safely migrated in the middle
of a maintenance sequence, using a dsb(ish) to ensure prior explicit
memory accesses are observed and cache maintenance operations are
completed before a task can be run on another CPU.

Given the above, it is not necessary to disable preemption in
copy_to_user_page. This patch removes the preempt_{disable,enable}
calls, permitting preemption.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

691b1e2e

arm64: consistently use p?d_set_huge · c661cb1c

由 Mark Rutland 提交于 3月 22, 2016

Commit 324420bf ("arm64: add support for ioremap() block
mappings") added new p?d_set_huge functions which do the hard work to
generate and set a correct block entry.

These differ from open-coded huge page creation in the early page table
code by explicitly setting the P?D_TYPE_SECT bits (which are implicitly
retained by mk_sect_prot() for any valid prot), but are otherwise
identical (and cannot fail on arm64).

For simplicity and consistency, make use of these in the initial page
table creation code.
Signed-off-by: NMark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

c661cb1c

21 3月, 2016 1 次提交

arm64: Split pr_notice("Virtual kernel memory layout...") into multiple pr_cont() · f09f1bac

由 Catalin Marinas 提交于 3月 11, 2016

The printk() implementation has a limit of LOG_LINE_MAX (== 1024 - 32)
buffer per call which the arm64 mem_init() breaches when printing the
virtual memory layout with CONFIG_KASAN enabled. The result is that the
last line is no longer printed. This patch splits the call into a
pr_notice() + additional pr_cont() calls.
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
Acked-by: NMark Rutland <mark.rutland@arm.com>

f09f1bac

18 3月, 2016 1 次提交

mm: remove VM_FAULT_MINOR · 0e8fb931

由 Jan Kara 提交于 3月 17, 2016

The define has a comment from Nick Piggin from 2007:

 /* For backwards compat. Remove me quickly. */

I guess 9 years should not be too hurried sense of 'quickly' even for
kernel measures.
Signed-off-by: NJan Kara <jack@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0e8fb931

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功