Commits · 120f0779c3ed89c25ef1db943feac8ed73a0d7f9 · gsplhtlxg / clone-Linux

11 Mar, 2016 1 commit

arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission · fdc69e7d

Committed by Catalin Marinas on Mar 09, 2016

The set_pte_at() function must update the hardware PTE_RDONLY bit
depending on the state of the PTE_WRITE and PTE_DIRTY bits of the given
entry value. However, it currently only performs this for pte_valid()
entries, ignoring PTE_PROT_NONE. The side-effect is that PROT_NONE
mappings would not have the PTE_RDONLY bit set. Without
CONFIG_ARM64_HW_AFDBM, this is not an issue since such PROT_NONE pages
are not accessible anyway.

With commit 2f4b829c ("arm64: Add support for hardware updates of
the access and dirty pte bits"), the ptep_set_wrprotect() function was
re-written to cope with automatic hardware updates of the dirty state.
As an optimisation, only PTE_RDONLY is checked to assess the "dirty"
status. Since set_pte_at() does not set this bit for PROT_NONE mappings,
such pages may be considered "dirty" as a result of
ptep_set_wrprotect().

This patch updates the pte_valid() check to pte_present() in
set_pte_at(). It also adds PTE_PROT_NONE to the swap entry bits comment.

Fixes: 2f4b829c ("arm64: Add support for hardware updates of the access and dirty pte bits")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Tested-by: Ganapatrao Kulkarni <gkulkarni@cavium.com>
Cc: <stable@vger.kernel.org>

fdc69e7d
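
The gating change described in fdc69e7d can be modelled in user space. The following is a minimal sketch, not the kernel code: the bit positions and the helper name pte_fixup_rdonly() are assumptions made for illustration, and the boolean parameter only exists to compare the old pte_valid() gate with the new pte_present() gate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* Bit positions are assumptions for illustration, not taken from the commit. */
#define PTE_VALID     ((pteval_t)1 << 0)
#define PTE_RDONLY    ((pteval_t)1 << 7)   /* hardware read-only bit */
#define PTE_WRITE     ((pteval_t)1 << 51)  /* software writable / DBM */
#define PTE_DIRTY     ((pteval_t)1 << 55)  /* software dirty */
#define PTE_PROT_NONE ((pteval_t)1 << 58)  /* software PROT_NONE marker */

static bool pte_valid(pteval_t pte)
{
    return (pte & PTE_VALID) != 0;
}

/* Present means hardware-valid or PROT_NONE. */
static bool pte_present(pteval_t pte)
{
    return (pte & (PTE_VALID | PTE_PROT_NONE)) != 0;
}

/*
 * Hypothetical helper: compute the value set_pte_at() would store.
 * check_present = false models the old pte_valid() gate,
 * check_present = true models the pte_present() gate from this patch.
 */
pteval_t pte_fixup_rdonly(pteval_t pte, bool check_present)
{
    bool gate = check_present ? pte_present(pte) : pte_valid(pte);

    /* In this model a PROT_NONE entry is never hardware-valid. */
    assert(!(pte_valid(pte) && (pte & PTE_PROT_NONE)));

    if (!gate)
        return pte;
    if ((pte & PTE_WRITE) && (pte & PTE_DIRTY))
        return pte & ~PTE_RDONLY;   /* writable and dirty: allow writes */
    return pte | PTE_RDONLY;        /* otherwise keep it read-only */
}
```

With the old gate a PROT_NONE entry passes through without PTE_RDONLY, which is the state that a later PTE_RDONLY-only dirty check would misread as dirty; with the new gate the bit is set.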

09 Mar, 2016 1 commit

arm64: account for sparsemem section alignment when choosing vmemmap offset · 36e5cd6b

Committed by Ard Biesheuvel on Mar 08, 2016

Commit dfd55ad8 ("arm64: vmemmap: use virtual projection of linear
region") fixed an issue where the struct page array would overflow into the
adjacent virtual memory region if system RAM was placed so high up in
physical memory that its addresses were not representable in the build time
configured virtual address size.

However, the fix failed to take into account that the vmemmap region needs
to be relatively aligned with respect to the sparsemem section size, so that
a sequence of page structs corresponding with a sparsemem section in the
linear region appears naturally aligned in the vmemmap region.

So round up vmemmap to sparsemem section size. Since this essentially moves
the projection of the linear region up in memory, also revert the reduction
of the size of the vmemmap region.

Cc: <stable@vger.kernel.org>
Fixes: dfd55ad8 ("arm64: vmemmap: use virtual projection of linear region")
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: David Daney <david.daney@cavium.com>
Tested-by: Robert Richter <rrichter@cavium.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

36e5cd6b
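
The alignment requirement in 36e5cd6b can be illustrated with a little arithmetic. This is a sketch under assumed parameters (4 KB pages, 1 GB sparsemem sections); vmemmap_index() is a hypothetical helper, and biasing the array by the section-aligned first PFN is one reading of "round up vmemmap to sparsemem section size", not a copy of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT        12   /* assumed: 4 KB pages */
#define SECTION_SIZE_BITS 30   /* assumed: 1 GB sparsemem sections */
#define PFN_SECTION_SHIFT (SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION ((uint64_t)1 << PFN_SECTION_SHIFT)

/* Round a page frame number down to the start of its sparsemem section. */
uint64_t section_align_down(uint64_t pfn)
{
    return pfn & ~(PAGES_PER_SECTION - 1);
}

/*
 * Hypothetical helper: index of `pfn` in the struct page array when the
 * array is biased by the section-aligned first PFN of RAM, rather than by
 * the raw first PFN.
 */
uint64_t vmemmap_index(uint64_t pfn, uint64_t memstart_pfn)
{
    assert(pfn >= section_align_down(memstart_pfn));
    return pfn - section_align_down(memstart_pfn);
}
```

If RAM starts at PFN 0x80100, the section that begins at PFN 0xC0000 lands at index 0x40000, a multiple of PAGES_PER_SECTION. Subtracting the raw 0x80100 instead would give 0x3FF00, which is not section-aligned.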

27 Feb, 2016 1 commit

arm64: vmemmap: use virtual projection of linear region · dfd55ad8

Committed by Ard Biesheuvel on Feb 26, 2016

Commit dd006da2 ("arm64: mm: increase VA range of identity map") made
some changes to the memory mapping code to allow physical memory to reside
at an offset that exceeds the size of the virtual mapping.

However, since the size of the vmemmap area is proportional to the size of
the VA area, but it is populated relative to the physical space, we may
end up with the struct page array being mapped outside of the vmemmap
region. For instance, on my Seattle A0 box, I can see the following output
in the dmesg log.

  vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000   (     8 GB maximum)
            0xffffffbfc0000000 - 0xffffffbfd0000000   (   256 MB actual)

We can fix this by deciding that the vmemmap region is not a projection of
the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
linear region. This way, we are guaranteed that the vmemmap region is of
sufficient size, and we can even reduce the size by half.

Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

dfd55ad8
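
The sizes mentioned in dfd55ad8 follow from simple arithmetic. The sketch below assumes a 39-bit VA, 4 KB pages and a 64-byte struct page, which is one configuration consistent with the "8 GB maximum" line in the log; none of these values are stated in the commit message, and vmemmap_bytes() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of struct page needed to describe every page in an address range
 * of 2^va_bits bytes.
 */
uint64_t vmemmap_bytes(unsigned va_bits, unsigned page_shift,
                       uint64_t struct_page_size)
{
    assert(va_bits > page_shift && va_bits <= 52);
    return ((uint64_t)1 << (va_bits - page_shift)) * struct_page_size;
}
```

Covering the whole 39-bit range needs 2^27 entries of 64 bytes, i.e. 8 GB. Covering only the linear region, which is half of that range (38 bits), needs 4 GB, matching the remark that the size can be reduced by half.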

26 Feb, 2016 2 commits

arm64: Remove fixmap include fragility · 3eca86e7

Committed by Mark Rutland on Feb 26, 2016

The asm-generic fixmap.h depends on each architecture's fixmap.h to pull
in the definition of PAGE_KERNEL_RO, if this exists. In the absence of
this, FIXMAP_PAGE_RO will not be defined. In mm/early_ioremap.c the
definition of early_memremap_ro is predicated on FIXMAP_PAGE_RO being
defined.

Currently, the arm64 fixmap.h doesn't include pgtable.h for the
definition of PAGE_KERNEL_RO, and as a knock-on effect early_memremap_ro
is not always defined, leading to link-time failures when it is used.
This has been observed with defconfig on next-20160226.

Unfortunately, as pgtable.h includes fixmap.h, adding the include
introduces a circular dependency, which is just as fragile.

Instead, this patch factors out PAGE_KERNEL_RO and other prot
definitions into a new pgtable-prot header which can be included by both
pgtable.h and fixmap.h, avoiding the circular dependency, and ensuring
that early_memremap_ro is always defined where it is used.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

3eca86e7

arm64: Fix building error with 16KB pages and 36-bit VA · cac4b8cd

Committed by Catalin Marinas on Feb 25, 2016

In such a configuration, Linux uses only two levels of page tables and
__pud_populate() should not be used. However, the BUILD_BUG() triggers
since pud_sect() is still defined and the compiler cannot eliminate such
code, even though at run-time it should not be triggered. This patch
extends the #ifdef ARM64_64K_PAGES condition for pud_sect to include
PGTABLE_LEVELS < 3.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

cac4b8cd

19 Feb, 2016 2 commits

arm64: move kernel image to base of vmalloc area · f9040773

Committed by Ard Biesheuvel on Feb 16, 2016

This moves the module area to right before the vmalloc area, and moves
the kernel image to the base of the vmalloc area. This is an intermediate
step towards implementing KASLR, which allows the kernel image to be
located anywhere in the vmalloc area.

Since other subsystems such as hibernate may still need to refer to the
kernel text or data segments via their linear addresses, both are mapped
in the linear region as well. The linear alias of the text region is
mapped read-only/non-executable to prevent inadvertent modification or
execution.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

f9040773

arm64: pgtable: implement static [pte|pmd|pud]_offset variants · 6533945a

Committed by Ard Biesheuvel on Feb 16, 2016

The page table accessors pte_offset(), pud_offset() and pmd_offset()
rely on __va translations, so they can only be used after the linear
mapping has been installed. For the early fixmap and kasan init routines,
whose page tables are allocated statically in the kernel image, these
functions will return bogus values. So implement pte_offset_kimg(),
pmd_offset_kimg() and pud_offset_kimg(), which can be used instead
before any page tables have been allocated dynamically.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

6533945a

16 Feb, 2016 4 commits

arm64: mm: add functions to walk tables in fixmap · 961faac1

Committed by Mark Rutland on Jan 25, 2016

As a preparatory step to allow us to allocate early page tables from
unmapped memory using memblock_alloc, add new p??_{set,clear}_fixmap*
functions which can be used to walk page tables outside of the linear
mapping by using fixmap slots.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

961faac1

arm64: mm: add functions to walk page tables by PA · dca56dca

Committed by Mark Rutland on Jan 25, 2016

To allow us to walk tables allocated into the fixmap, we need to acquire
the physical address of a page, rather than the virtual address in the
linear map.

This patch adds new p??_page_paddr and p??_offset_phys functions to
acquire the physical address of a next-level table, and changes
p??_offset* into macros which simply convert this to a linear map VA.
This renders p??_page_vaddr unused, and hence they are removed.

At the pgd level, a new pgd_offset_raw function is added to find the
relevant PGD entry given the base of a PGD and a virtual address.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

dca56dca
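
The idea behind the p??_page_paddr and p??_offset_phys helpers in dca56dca can be sketched for one level. This is a user-space model under assumed parameters (4 KB pages, 48-bit physical addresses, 8-byte descriptors), with entries passed as plain integers rather than the kernel's pmd_t and pte_t types.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_t;
typedef uint64_t phys_addr_t;

#define PAGE_SHIFT   12                          /* assumed: 4 KB pages */
#define PAGE_SIZE    ((uint64_t)1 << PAGE_SHIFT)
#define PAGE_MASK    (~(PAGE_SIZE - 1))
#define PHYS_MASK    (((uint64_t)1 << 48) - 1)   /* assumed: 48-bit PA */
#define PTRS_PER_PTE (PAGE_SIZE / sizeof(pteval_t))

static_assert(sizeof(pteval_t) == 8, "descriptors are 64-bit in this model");

/* Physical address of the next-level table that a pmd entry points to. */
phys_addr_t pmd_page_paddr(pteval_t pmd)
{
    return pmd & PHYS_MASK & PAGE_MASK;
}

/* Index of the pte that maps `addr` within its table. */
uint64_t pte_index(uint64_t addr)
{
    return (addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/* Physical address of the pte for `addr`; no linear-map translation used. */
phys_addr_t pte_offset_phys(pteval_t pmd, uint64_t addr)
{
    return pmd_page_paddr(pmd) + pte_index(addr) * sizeof(pteval_t);
}
```

A caller that needs a virtual address can then convert the result through whichever mapping is available at that point (the linear map, or a fixmap slot), which is the flexibility the commit message describes.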

arm64: mm: move pte_* macros · 053520f7

Committed by Mark Rutland on Jan 25, 2016

For pmd, pud, and pgd levels of table, functions including p?d_index and
p?d_offset are defined after the p?d_page_vaddr function for the
immediately higher level of table.

The pte functions however are defined much earlier, even though several
rely on the later definition of pmd_page_vaddr. While this isn't
currently a problem as these are macros, it prevents the logical
grouping of later C functions (which cannot rely on prototypes for
functions not yet defined).

Move these definitions after pmd_page_vaddr, for consistency with the
placement of these functions for other levels of table.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

053520f7

arm64: mm: place empty_zero_page in bss · 5227cfa7

Committed by Mark Rutland on Jan 25, 2016

Currently the zero page is set up in paging_init, and thus we cannot use
the zero page earlier. We use the zero page as a reserved TTBR value
from which no TLB entries may be allocated (e.g. when uninstalling the
idmap). To enable such usage earlier (as may be required for invasive
changes to the kernel page tables), and to minimise the time that the
idmap is active, we need to be able to use the zero page before
paging_init.

This patch follows the example set by x86, by allocating the zero page
at compile time, in .bss. This means that the zero page itself is
available immediately upon entry to start_kernel (as we zero .bss before
this), and also means that the zero page takes up no space in the raw
Image binary. The associated struct page is allocated in bootmem_init,
and remains unavailable until this time.

Outside of arch code, the only users of empty_zero_page assume that the
empty_zero_page symbol refers to the zeroed memory itself, and that
ZERO_PAGE(x) must be used to acquire the associated struct page,
following the example of x86. This patch also brings arm64 in line with
these assumptions.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

5227cfa7

25 Jan, 2016 1 commit

arm64: Honour !PTE_WRITE in set_pte_at() for kernel mappings · ac15bd63

Committed by Catalin Marinas on Jan 07, 2016

Currently, set_pte_at() only checks the software PTE_WRITE bit for user
mappings when it sets or clears the hardware PTE_RDONLY accordingly. The
kernel ptes are written directly without any modification, relying
solely on the protection bits in macros like PAGE_KERNEL. However,
modifying kernel pte attributes via pte_wrprotect() would be ignored by
set_pte_at(). Since pte_wrprotect() does not set PTE_RDONLY (it only
clears PTE_WRITE), the new permission is not taken into account.

This patch changes set_pte_at() to adjust the read-only permission for
kernel ptes as well. As a side effect, existing PROT_* definitions used
for kernel ioremap*() need to include PTE_DIRTY | PTE_WRITE.

(additionally, white space fix for PTE_KERNEL_ROX)

Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

ac15bd63

16 Jan, 2016 2 commits

arch/arm64/include/asm/pgtable.h: add pmd_mkclean for THP · 05ee26d9

Committed by Minchan Kim on Jan 15, 2016

MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the contents of
a THP page have been overwritten since the MADV_FREE syscall was called on it.

This patch adds pmd_mkclean for THP page MADV_FREE support.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Shaohua Li <shli@kernel.org>
Cc: <yalin.wang2010@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jason Evans <je@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Shaohua Li <shli@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

05ee26d9

arm64, thp: remove infrastructure for handling splitting PMDs · b7ed934a

Committed by Kirill A. Shutemov on Jan 15, 2016

With new refcounting we don't need to mark PMDs splitting.  Let's drop
code to handle this.

pmdp_splitting_flush() is not needed either: on splitting a PMD we will do
pmdp_clear_flush() + set_pte_at().  pmdp_clear_flush() will do the IPI as
needed for fast_gup.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b7ed934a

05 Jan, 2016 1 commit

arm64: mm: move pgd_cache initialisation to pgtable_cache_init · 39b5be9b

Committed by Will Deacon on Jan 05, 2016

Initialising the support for EFI runtime services requires us to
allocate a pgd off the back of an early_initcall. On systems where the
PGD_SIZE is smaller than PAGE_SIZE (e.g. 64k pages and 48-bit VA), the
pgd_cache isn't initialised at this stage, and we panic with a NULL
dereference during boot:

  Unable to handle kernel NULL pointer dereference at virtual address 00000000

  __create_mapping.isra.5+0x84/0x350
  create_pgd_mapping+0x20/0x28
  efi_create_mapping+0x5c/0x6c
  arm_enable_runtime_services+0x154/0x1e4
  do_one_initcall+0x8c/0x190
  kernel_init_freeable+0x84/0x1ec
  kernel_init+0x10/0xe0
  ret_from_fork+0x10/0x50

This patch fixes the problem by initialising the pgd_cache earlier, in
the pgtable_cache_init callback, which sounds suspiciously like what it
was intended for.

Reported-by: Dennis Chen <dennis.chen@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

39b5be9b

22 Dec, 2015 1 commit

arm64: hugetlb: add support for PTE contiguous bit · 66b3923a

Committed by David Woods on Dec 17, 2015

The arm64 MMU supports a Contiguous bit which is a hint that the TTE
is one of a set of contiguous entries which can be cached in a single
TLB entry.  Supporting this bit adds new intermediate huge page sizes.

The set of huge page sizes available depends on the base page size.
Without using contiguous pages the huge page sizes are as follows.

 4KB:   2MB  1GB
64KB: 512MB

With a 4KB granule, the contiguous bit groups together sets of 16 pages
and with a 64KB granule it groups sets of 32 pages.  This enables two new
huge page sizes in each case, so that the full set of available sizes
is as follows.

 4KB:  64KB   2MB  32MB  1GB
64KB:   2MB 512MB  16GB

If a 16KB granule is used then the contiguous bit groups 128 pages
at the PTE level and 32 pages at the PMD level.

If the base page size is set to 64KB then 2MB pages are enabled by
default.  It is possible in the future to make 2MB the default huge
page size for both 4KB and 64KB granules.

Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
Reviewed-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: David Woods <dwoods@ezchip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

66b3923a
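
The size tables in 66b3923a can be reproduced from the granule and the contiguous-run length. This is a sketch with hypothetical names (struct hugepage_sizes, compute_hugepage_sizes); it assumes 8-byte descriptors and the same run length at the PTE and PMD levels, which holds for the 4 KB and 64 KB rows but not for the 16 KB granule described in the message.

```c
#include <assert.h>
#include <stdint.h>

struct hugepage_sizes {
    uint64_t cont_pte;  /* contiguous run of base pages */
    uint64_t pmd;       /* one PMD-level block */
    uint64_t cont_pmd;  /* contiguous run of PMD-level blocks */
};

/* Derive the huge page sizes, in bytes, for a given base page size. */
struct hugepage_sizes compute_hugepage_sizes(uint64_t page_size,
                                             unsigned cont_entries)
{
    uint64_t entries_per_table = page_size / 8; /* 8-byte descriptors */
    struct hugepage_sizes s;

    /* The page size must be a non-zero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    s.cont_pte = page_size * cont_entries;
    s.pmd = page_size * entries_per_table;
    s.cont_pmd = s.pmd * cont_entries;
    return s;
}
```

For a 4 KB granule with runs of 16 this yields 64 KB, 2 MB and 32 MB; for a 64 KB granule with runs of 32 it yields 2 MB, 512 MB and 16 GB. The 1 GB size for 4 KB pages is the PUD-level block, which this sketch does not compute.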

11 Dec, 2015 1 commit

arm64: Improve error reporting on set_pte_at() checks · 82d34008

Committed by Catalin Marinas on Dec 08, 2015

Currently the BUG_ON() checks do not give enough information about the
PTEs being set. This patch changes BUG_ON to WARN_ONCE and dumps the
values of the old and new PTEs. In addition, the checks are only made if
the new PTE entry is valid.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Ming Lei <tom.leiming@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>

82d34008

01 Dec, 2015 1 commit

arm64: pgtable: implement pte_accessible() · 76c714be

Committed by Will Deacon on Oct 30, 2015

This patch implements the pte_accessible() macro, which can be used to
test whether or not a given pte is a candidate for allocation in the
TLB.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

76c714be

18 Nov, 2015 1 commit

arm64: Fix R/O permissions in mark_rodata_ro · 0b2aa5b8

Committed by Laura Abbott on Nov 12, 2015

The permissions in mark_rodata_ro trigger a build error
with STRICT_MM_TYPECHECKS. Fix this by introducing
PAGE_KERNEL_ROX for the same reasons as PAGE_KERNEL_RO.
From Ard:

"PAGE_KERNEL_EXEC has PTE_WRITE set as well, making the range
writeable under the ARMv8.1 DBM feature, that manages the
dirty bit in hardware (writing to a page with the PTE_RDONLY
and PTE_WRITE bits both set will clear the PTE_RDONLY bit in that case)"

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

0b2aa5b8

09 Nov, 2015 1 commit

arm64: fix R/O permissions of FDT mapping · fb226c3d

Committed by Ard Biesheuvel on Nov 09, 2015

The mapping permissions of the FDT are set to 'PAGE_KERNEL | PTE_RDONLY'
in an attempt to map the FDT as read-only. However, not only does this
break at build time under STRICT_MM_TYPECHECKS (since the two terms are
of different types in that case), it also results in both the PTE_WRITE
and PTE_RDONLY attributes being set, which means the region is still
writable under ARMv8.1 DBM (and an attempted write will simply clear the
PTE_RDONLY bit).

So instead, define PAGE_KERNEL_RO (which already has an established
meaning across architectures) and use that.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

fb226c3d
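
The reason 'PAGE_KERNEL | PTE_RDONLY' stays writable under DBM, as described in fb226c3d, can be shown with a toy model of a store. This is an illustrative sketch only: the bit positions are assumptions and dbm_store() is a hypothetical stand-in for what the hardware does on a write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* Bit positions are assumptions for illustration. */
#define PTE_RDONLY ((pteval_t)1 << 7)
#define PTE_WRITE  ((pteval_t)1 << 51)  /* doubles as the DBM bit */

/*
 * Model a store through *pte on a CPU with hardware dirty-bit management.
 * Returns true if the store succeeds, false if it would fault.
 */
bool dbm_store(pteval_t *pte)
{
    assert(pte != NULL);

    if (!(*pte & PTE_RDONLY))
        return true;            /* already writable */
    if (*pte & PTE_WRITE) {
        *pte &= ~PTE_RDONLY;    /* hardware marks the page dirty */
        return true;
    }
    return false;               /* genuinely read-only: permission fault */
}
```

An entry with both PTE_WRITE and PTE_RDONLY set accepts the store and loses PTE_RDONLY, whereas an entry with only PTE_RDONLY set, which is what a dedicated read-only protection provides, rejects it.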

16 Oct, 2015 1 commit

arm64: Minor coding style fixes for kc_offset_to_vaddr and kc_vaddr_to_offset · 7db743c6

Committed by Catalin Marinas on Oct 16, 2015

These were introduced by commit 03875ad5 (arm64: add
kc_offset_to_vaddr and kc_vaddr_to_offset macro).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

7db743c6

13 Oct, 2015 2 commits

arm64: add kc_offset_to_vaddr and kc_vaddr_to_offset macro · 03875ad5

Committed by yalin wang on Oct 12, 2015

This patch adds kc_offset_to_vaddr() and kc_vaddr_to_offset(). The default
versions do not work on arm64 because some kernel addresses, such as module
and vmemmap addresses, lie below PAGE_OFFSET.

Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

03875ad5
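
One way to make the two conversions from 03875ad5 work for every kernel address, including those below PAGE_OFFSET, is to strip and restore the all-ones prefix above VA_BITS. The sketch below writes them as functions rather than macros; VA_BITS = 39 is an assumed configuration and the bodies are an illustration, not necessarily the exact definitions in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define VA_BITS  39  /* assumed configuration */
#define VA_START (UINT64_C(0xffffffffffffffff) << VA_BITS)

static_assert(VA_BITS > 0 && VA_BITS < 64, "VA_BITS must fit in 64 bits");

/* Drop the kernel VA prefix, leaving a non-negative offset. */
uint64_t kc_vaddr_to_offset(uint64_t vaddr)
{
    return vaddr & ~VA_START;
}

/* Restore the kernel VA prefix on an offset produced above. */
uint64_t kc_offset_to_vaddr(uint64_t offset)
{
    return offset | VA_START;
}
```

With these definitions 0xffffffc000001000 maps to offset 0x4000001000 and back, and an address below PAGE_OFFSET such as 0xffffffbdc0000000 round-trips as well.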

arm64: add KASAN support · 39d114dd

Committed by Andrey Ryabinin on Oct 12, 2015

This patch adds arch specific code for kernel address sanitizer
(see Documentation/kasan.txt).

1/8 of the kernel address space is reserved for shadow memory. There was no
big enough hole for this, so the virtual addresses for the shadow were
stolen from the vmalloc area.

At the early boot stage the whole shadow region is populated with just
one physical page (kasan_zero_page). Later, this page is reused
as a read-only zero shadow for memory that KASan currently
does not track (vmalloc).
After mapping the physical memory, pages for shadow memory are
allocated and mapped.

Functions like memset/memmove/memcpy do a lot of memory accesses.
If a bad pointer is passed to one of these functions, it is important
to catch this. Compiler instrumentation cannot do this since
these functions are written in assembly.
KASan replaces the memory functions with manually instrumented variants.
The original functions are declared as weak symbols so that strong
definitions in mm/kasan/kasan.c can replace them. The original functions
have aliases with a '__' prefix in their names, so the non-instrumented
variants can be called if needed.
Some files are built without kasan instrumentation (e.g. mm/slub.c).
For such files, the original mem* functions are replaced (via #define)
with the prefixed variants to disable memory access checks.

Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

39d114dd
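
The "1/8" figure in 39d114dd comes from mapping every 8 bytes of kernel memory to one shadow byte. A minimal sketch of that address translation follows; the shadow offset is passed as a parameter because its real value depends on the kernel configuration and is not given in the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT 3  /* one shadow byte per 8 bytes */

static_assert((1 << KASAN_SHADOW_SCALE_SHIFT) == 8,
              "shadow covers 1/8 of the address space");

/* Translate a memory address to the address of its shadow byte. */
uint64_t kasan_mem_to_shadow(uint64_t addr, uint64_t shadow_offset)
{
    return (addr >> KASAN_SHADOW_SCALE_SHIFT) + shadow_offset;
}
```

With an illustrative offset of 0x1000, addresses 0x80 through 0x87 all map to shadow byte 0x1010, and 0x88 starts the next one at 0x1011.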

09 Oct, 2015 2 commits

arm64: Default kernel pages should be contiguous · 06f90d25

Committed by Jeremy Linton on Oct 07, 2015

The default page attributes for a PMD being broken should have the CONT bit
set. Create a new definition for an early boot range of PTE's that are
contiguous.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

06f90d25

arm64: Macros to check/set/unset the contiguous bit · 93ef666a

Committed by Jeremy Linton on Oct 07, 2015

Add the supporting macros to check if the contiguous bit
is set, set the bit, or clear it in a PTE entry.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

93ef666a
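
The three operations named in 93ef666a (check, set, clear) amount to simple bit manipulation. The sketch below uses functions instead of macros; the helper names and the position of the contiguous bit (bit 52) are assumptions for illustration, since the commit message gives neither.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;

#define PTE_CONT ((pteval_t)1 << 52)  /* assumed bit position */

static_assert(sizeof(pteval_t) == 8, "descriptors are 64-bit in this model");

/* Is the contiguous hint set on this entry? */
bool pte_cont(pteval_t pte)
{
    return (pte & PTE_CONT) != 0;
}

/* Return a copy of the entry with the contiguous hint set. */
pteval_t pte_mkcont(pteval_t pte)
{
    return pte | PTE_CONT;
}

/* Return a copy of the entry with the contiguous hint cleared. */
pteval_t pte_mknoncont(pteval_t pte)
{
    return pte & ~PTE_CONT;
}
```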

07 Oct, 2015 2 commits

arm64: mm: remove dsb from update_mmu_cache · 120798d2

Committed by Will Deacon on Oct 06, 2015

update_mmu_cache() consists of a dsb(ishst) instruction so that new user
mappings are guaranteed to be visible to the page table walker on
exception return.

In reality this can be a very expensive operation which is rarely needed.
Removing this barrier shows a modest improvement in hackbench scores and,
in the worst case, we re-take the user fault and establish that there
was nothing to do.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

120798d2

arm64: introduce VA_START macro - the first kernel virtual address. · 127db024

Committed by Andrey Ryabinin on Sep 17, 2015

In order to not use lengthy (UL(0xffffffffffffffff) << VA_BITS) everywhere,
replace it with VA_START.

Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

127db024

02 Oct, 2015 1 commit

arm64: Fix THP protection change logic · 1a541b4e

Committed by Steve Capper on Oct 01, 2015

6910fa16 ("arm64: enable PTE type bit in the mask for pte_modify") fixes
a problem whereby a large block of PROT_NONE mapped memory is
incorrectly mapped as block descriptors when mprotect is called.

Unfortunately, a subtle bug was introduced by this fix to the THP logic.

If one mmaps a large block of memory, then faults it such that it is
collapsed into THPs; resulting calls to mprotect on this area of memory
will lead to incorrect table descriptors being written instead of block
descriptors. This is because pmd_modify calls pte_modify which is now
allowed to modify the type of the page table entry.

This patch reverts commit 6910fa16, and
fixes the problem it was trying to address by adjusting PAGE_NONE to
represent a table entry. Thus no change in pte type is required when
moving from PROT_NONE to a different protection.

Fixes: 6910fa16 ("arm64: enable PTE type bit in the mask for pte_modify")
Cc: <stable@vger.kernel.org> # 4.0+
Cc: Feng Kan <fkan@apm.com>
Reported-by: Ganapatrao Kulkarni <Ganapatrao.Kulkarni@caviumnetworks.com>
Tested-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

1a541b4e

14 Sep, 2015 3 commits

arm64: pgtable: use a single bit for PTE_WRITE regardless of DBM · bf950040

Committed by Will Deacon on Sep 11, 2015

Depending on CONFIG_ARM64_HW_AFDBM, we use either bit 57 or 51 of the
pte to represent PTE_WRITE. Given that bit 51 is reserved prior to
ARMv8.1, we can just use that bit regardless of the config option. That
also matches what happens if a kernel configured with ARM64_HW_AFDBM=y
is run on a CPU without the DBM functionality.

Cc: Julien Grall <julien.grall@citrix.com>
Tested-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

bf950040

arm64: Fix pte_modify() to preserve the hardware dirty information · 62d96c71

Committed by Catalin Marinas on Sep 11, 2015

The pte_modify() function with hardware AF/DBM enabled must transfer the
hardware dirty information to the software PTE_DIRTY bit. However, it
was setting this bit in newprot and the mask does not cover such bit.
This patch sets PTE_DIRTY on the original pte which will be preserved in
the returned value.

Fixes: 2f4b829c ("arm64: Add support for hardware updates of the access and dirty pte bits")
Cc: Julien Grall <julien.grall@citrix.com>
Tested-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

62d96c71

arm64: Fix the pte_hw_dirty() check when AF/DBM is enabled · b847415c

Committed by Catalin Marinas on Sep 11, 2015

Commit 2f4b829c ("arm64: Add support for hardware updates of the
access and dirty pte bits") introduced support for handling hardware
updates of the access flag and dirty status. The PTE is automatically
dirtied in hardware (if supported) by clearing the PTE_RDONLY bit when
the PTE_DBM/PTE_WRITE bit is set. The pte_hw_dirty() macro was added to
detect a hardware dirtied pte. The pte_dirty() macro checks for both
software PTE_DIRTY and pte_hw_dirty().

Functions like pte_modify() clear the PTE_RDONLY bit since it is meant
to be set in set_pte_at() when written to memory. In such cases,
pte_hw_dirty() would return true even though such pte is clean. This
patch changes pte_hw_dirty() to test the PTE_DBM/PTE_WRITE bit together
with PTE_RDONLY.

Fixes: 2f4b829c ("arm64: Add support for hardware updates of the access and dirty pte bits")
Reported-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

b847415c
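
The difference between the old and new pte_hw_dirty() tests in b847415c can be sketched as predicates on a raw entry value. This is a user-space model: the bit positions are assumptions, and pte_hw_dirty_old() is a hypothetical name for the pre-patch check, which looked at PTE_RDONLY alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* Bit positions are assumptions for illustration. */
#define PTE_VALID  ((pteval_t)1 << 0)
#define PTE_RDONLY ((pteval_t)1 << 7)
#define PTE_WRITE  ((pteval_t)1 << 51)  /* shares the DBM bit */
#define PTE_DIRTY  ((pteval_t)1 << 55)  /* software dirty */

static_assert(sizeof(pteval_t) == 8, "descriptors are 64-bit in this model");

/* Pre-patch model: any entry without PTE_RDONLY looks hardware-dirty. */
bool pte_hw_dirty_old(pteval_t pte)
{
    return (pte & PTE_RDONLY) == 0;
}

/* Post-patch model: only writable (DBM) entries can be hardware-dirty. */
bool pte_hw_dirty(pteval_t pte)
{
    return (pte & PTE_WRITE) != 0 && (pte & PTE_RDONLY) == 0;
}

/* Dirty if either the software bit or the hardware state says so. */
bool pte_dirty(pteval_t pte)
{
    return (pte & PTE_DIRTY) != 0 || pte_hw_dirty(pte);
}
```

A clean, non-writable entry whose PTE_RDONLY bit has been temporarily cleared, as the message describes for pte_modify(), is reported dirty by the old predicate and clean by the new one.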

08 Aug, 2015 1 commit

arm64/mm: Add PROT_DEVICE_nGnRnE and PROT_NORMAL_WT · 8d446c86

Committed by Jonathan (Zhixiong) Zhang on Aug 07, 2015

UEFI spec 2.5 section 2.3.6.1 defines that
EFI_MEMORY_[UC|WC|WT|WB] are possible EFI memory types for
AArch64.

Each of those EFI memory types is mapped to a corresponding
AArch64 memory type. So we need to define PROT_DEVICE_nGnRnE
and PROT_NORMAL_WT additionally.

MT_NORMAL_WT is defined, and its encoding is added to MAIR_EL1
when initializing the CPU.

Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1438936621-5215-6-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>

8d446c86

28 Jul, 2015 1 commit

arm64: pgtable: fix definition of pte_valid · 766ffb69

Committed by Will Deacon on Jul 28, 2015

pte_valid should check if the PTE_VALID bit (1 << 0) is set in the pte,
so fix the macro definition to use bitwise & instead of logical &&.

Signed-off-by: Will Deacon <will.deacon@arm.com>

766ffb69
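
The effect of the one-character fix in 766ffb69 is easy to demonstrate. In the sketch below, pte_valid_buggy() is a hypothetical name for the pre-patch behaviour; with logical &&, any non-zero entry counts as valid because PTE_VALID is itself a non-zero constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t pteval_t;

#define PTE_VALID ((pteval_t)1 << 0)

static_assert(PTE_VALID == 1, "PTE_VALID is bit 0, as stated in the commit");

/* Pre-patch model: logical AND, true for every non-zero entry. */
bool pte_valid_buggy(pteval_t pte)
{
    return pte && PTE_VALID;
}

/* Post-patch model: bitwise AND, true only when bit 0 is set. */
bool pte_valid(pteval_t pte)
{
    return (pte & PTE_VALID) != 0;
}
```

An entry with only a high software bit set, for example bit 58, is reported valid by the buggy version and invalid by the fixed one.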

27 Jul, 2015 3 commits

arm64: force CONFIG_SMP=y and remove redundant #ifdefs · 4b3dc967

Committed by Will Deacon on May 29, 2015

Nobody seems to be producing !SMP systems anymore, so this is just
becoming a source of kernel bugs, particularly if people want to use
coherent DMA with non-shared pages.

This patch forces CONFIG_SMP=y for arm64, removing a modest amount of
code in the process.

Signed-off-by: Will Deacon <will.deacon@arm.com>

4b3dc967

arm64: Add support for hardware updates of the access and dirty pte bits · 2f4b829c

Committed by Catalin Marinas on Jul 10, 2015

The ARMv8.1 architecture extensions introduce support for hardware
updates of the access and dirty information in page table entries. With
TCR_EL1.HA enabled, when the CPU accesses an address with the PTE_AF bit
cleared in the page table, instead of raising an access flag fault the
CPU sets the actual page table entry bit. To ensure that kernel
modifications to the page tables do not inadvertently revert a change
introduced by hardware updates, the exclusive monitor (ldxr/stxr) is
adopted in the pte accessors.

When TCR_EL1.HD is enabled, a write access to a memory location with the
DBM (Dirty Bit Management) bit set in the corresponding pte
automatically clears the read-only bit (AP[2]). Such DBM bit maps onto
the Linux PTE_WRITE bit and to check whether a writable (DBM set) page
is dirty, the kernel tests the PTE_RDONLY bit. In order to allow
read-only and dirty pages, the kernel needs to preserve the software
dirty bit. The hardware dirty status is transferred to the software
dirty bit in ptep_set_wrprotect() (using load/store exclusive loop) and
pte_modify().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

2f4b829c
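
The load/store exclusive loop that 2f4b829c describes for ptep_set_wrprotect() can be approximated in portable C with a compare-exchange retry loop. This is a sketch, not the kernel's assembly: C11 atomics stand in for ldxr/stxr, the bit positions are assumptions, and ptep_set_wrprotect_model() is a hypothetical name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pteval_t;

/* Bit positions are assumptions for illustration. */
#define PTE_RDONLY ((pteval_t)1 << 7)
#define PTE_WRITE  ((pteval_t)1 << 51)  /* shares the DBM bit */
#define PTE_DIRTY  ((pteval_t)1 << 55)  /* software dirty */

/*
 * Write-protect *ptep. If hardware has marked the entry dirty (PTE_RDONLY
 * clear), fold that into the software PTE_DIRTY bit first. The loop
 * retries whenever the entry changed between the load and the store, so a
 * concurrent hardware update is never lost.
 */
void ptep_set_wrprotect_model(_Atomic pteval_t *ptep)
{
    pteval_t old, val;

    assert(ptep != NULL);
    old = atomic_load(ptep);
    do {
        val = old;
        if (!(val & PTE_RDONLY))
            val |= PTE_DIRTY | PTE_RDONLY;
        val &= ~PTE_WRITE;
    } while (!atomic_compare_exchange_weak(ptep, &old, val));
}
```

Because only PTE_RDONLY is inspected to decide whether the entry is dirty, an entry that reaches this function without PTE_RDONLY set is treated as dirty regardless of how it got into that state.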

arm64: move update_mmu_cache() into asm/pgtable.h · cba3574f

Committed by Will Deacon on Jul 16, 2015

Mark Brown reported an allnoconfig build failure in -next:

  Today's linux-next fails to build an arm64 allnoconfig due to "mm:
  make GUP handle pfn mapping unless FOLL_GET is requested" which
  causes:

  >       arm64-allnoconfig
  > ../mm/gup.c:51:4: error: implicit declaration of function
    'update_mmu_cache' [-Werror=implicit-function-declaration]

Fix the error by moving the function to asm/pgtable.h, as is the case
for most other architectures.

Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>

cba3574f

15 Apr, 2015 1 commit

arm64: expose number of page table levels on Kconfig level · 9f25e6ad

Committed by Kirill A. Shutemov on Apr 14, 2015

We want to use the number of page table levels to define mm_struct.
Let's expose it as CONFIG_PGTABLE_LEVELS.

ARM64_PGTABLE_LEVELS is renamed to PGTABLE_LEVELS and defined before
sourcing init/Kconfig: arch/Kconfig will define the default value and it is
sourced from init/Kconfig.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9f25e6ad

27 Feb, 2015 1 commit

arm64: enable PTE type bit in the mask for pte_modify · 6910fa16

Committed by Feng Kan on Feb 24, 2015

Caught during Trinity testing. pte_modify() does not allow
modification of the PTE type bit, which causes the test to hang
the system. The PTE cannot transition from an
inaccessible page (b00) to a valid page (b11) because the mask
does not allow it. This happens when a big block of mmapped
memory is set to PROT_NONE, then a small piece is broken
off and set to PROT_WRITE | PROT_READ, causing a huge page split.

Signed-off-by: Feng Kan <fkan@apm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

6910fa16

12 Feb, 2015 1 commit

mm: make FIRST_USER_ADDRESS unsigned long on all archs · d016bf7e

Committed by Kirill A. Shutemov on Feb 11, 2015

LKP has triggered a compiler warning after my recent patch "mm: account
pmd page tables to the process":

    mm/mmap.c: In function 'exit_mmap':
 >> mm/mmap.c:2857:2: warning: right shift count >= width of type [enabled by default]

The code:

 > 2857                WARN_ON(mm_nr_pmds(mm) >
   2858                                round_up(FIRST_USER_ADDRESS, PUD_SIZE) >> PUD_SHIFT);

In this, on tile, we have FIRST_USER_ADDRESS defined as 0.  round_up() has
the same type -- int.  PUD_SHIFT.

I think the best way to fix it is to define FIRST_USER_ADDRESS as unsigned
long.  On every arch for consistency.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d016bf7e
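
The type problem behind d016bf7e can be reproduced outside the kernel. The sketch below assumes GCC or Clang (for __typeof__) and C11 (for static_assert); the round_up() macro is a simplified stand-in written for this example, the two FIRST_USER_ADDRESS_* names are hypothetical, and PUD_SHIFT = 30 is an arbitrary illustrative value.

```c
#include <assert.h>

#define PUD_SHIFT 30                  /* assumed for illustration */
#define PUD_SIZE  (1UL << PUD_SHIFT)

/* Round x up to a multiple of y; the result keeps the type of x. */
#define round_mask(x, y) ((__typeof__(x))((y) - 1))
#define round_up(x, y)   ((((x) - 1) | round_mask(x, y)) + 1)

#define FIRST_USER_ADDRESS_INT  0     /* plain int, as on the affected arch */
#define FIRST_USER_ADDRESS_LONG 0UL   /* unsigned long, as after the patch */

/* The rounded value inherits the type of the constant it started from. */
static_assert(sizeof(round_up(FIRST_USER_ADDRESS_INT, PUD_SIZE)) ==
              sizeof(int), "int in, int out");
static_assert(sizeof(round_up(FIRST_USER_ADDRESS_LONG, PUD_SIZE)) ==
              sizeof(unsigned long), "unsigned long in, unsigned long out");

/* Number of PUD-sized units below the first user address. */
unsigned long first_user_pud_count(void)
{
    return round_up(FIRST_USER_ADDRESS_LONG, PUD_SIZE) >> PUD_SHIFT;
}
```

When the constant is an int, the shift that follows operates on an int, so a PUD_SHIFT at least as large as the width of int produces the "right shift count >= width of type" warning quoted in the message. Defining the constant as unsigned long widens the whole expression.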

11 Feb, 2015 1 commit

arm64: drop PTE_FILE and pte_file()-related helpers · 9b3e661e

Committed by Kirill A. Shutemov on Feb 10, 2015

We've replaced remap_file_pages(2) implementation with emulation.  Nobody
creates non-linear mapping anymore.

This patch also adjusts __SWP_TYPE_SHIFT and increases the number of bits
available for the swap offset.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9b3e661e