1. 06 May, 2016 2 commits
  2. 16 Apr, 2016 2 commits
    •
      arm64: Implement ptep_set_access_flags() for hardware AF/DBM · 66dbd6e6
      Catalin Marinas committed
      When hardware updates of the access and dirty states are enabled, the
      default ptep_set_access_flags() implementation based on calling
      set_pte_at() directly is potentially racy. This triggers the "racy dirty
      state clearing" warning in set_pte_at() because an existing writable PTE
      is overridden with a clean entry.
      
      There are two main scenarios for this situation:
      
      1. The CPU getting an access fault does not support hardware updates of
         the access/dirty flags. However, a different agent in the system
         (e.g. SMMU) can do this, therefore overriding a writable entry with a
         clean one could potentially lose the automatically updated dirty
         status.
      
      2. A more complex situation is possible when all CPUs support hardware
         AF/DBM:
      
         a) Initial state: shareable + writable vma and pte_none(pte)
         b) Read fault taken by two threads of the same process on different
            CPUs
         c) CPU0 takes the mmap_sem and proceeds to handling the fault. It
            eventually reaches do_set_pte() which sets a writable + clean pte.
            CPU0 releases the mmap_sem
         d) CPU1 acquires the mmap_sem and proceeds to handle_pte_fault(). The
            pte entry it reads is present, writable and clean and it continues
            to pte_mkyoung()
         e) CPU1 calls ptep_set_access_flags()
      
         If between (d) and (e) the hardware (another CPU) updates the dirty
         state (clears PTE_RDONLY), CPU1 will override the PTE_RDONLY bit,
         marking the entry clean again.
      
      This patch implements an arm64-specific ptep_set_access_flags() function
      to perform an atomic update of the PTE flags.
      
      Fixes: 2f4b829c ("arm64: Add support for hardware updates of the access and dirty pte bits")
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Ming Lei <tom.leiming@gmail.com>
      Tested-by: Julien Grall <julien.grall@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: <stable@vger.kernel.org> # 4.3+
      [will: reworded comment]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      66dbd6e6
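The atomic update described in this commit can be modelled in user space with a compare-and-swap loop. The following is a minimal sketch, not the kernel implementation: the helper name `model_set_access_flags`, the use of C11 atomics, and the bit positions are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative bit positions, loosely modelled on arm64; not kernel definitions. */
#define PTE_VALID   (UINT64_C(1) << 0)
#define PTE_RDONLY  (UINT64_C(1) << 7)   /* set means clean / read-only */
#define PTE_AF      (UINT64_C(1) << 10)  /* access flag */
#define PTE_WRITE   (UINT64_C(1) << 51)  /* writable; hardware may clear PTE_RDONLY */

/*
 * Hypothetical helper: merge the access flags of `entry` into *ptep with a
 * compare-and-swap loop so that a concurrent update of the entry is not lost.
 * Returns nonzero if the stored value changed.
 */
static int model_set_access_flags(_Atomic uint64_t *ptep, uint64_t entry)
{
    const uint64_t mask = PTE_AF | PTE_WRITE | PTE_RDONLY;
    uint64_t old, new;

    assert(ptep);
    old = atomic_load(ptep);
    do {
        /*
         * PTE_RDONLY has inverted polarity (cleared means dirty). Flip it
         * before and after the OR so a cleared bit on either side wins.
         */
        new = old ^ PTE_RDONLY;
        new |= (entry & mask) ^ PTE_RDONLY;
        new ^= PTE_RDONLY;
        /* On failure, `old` is reloaded with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(ptep, &old, new));

    return new != old;
}
```

In this model, if another agent clears PTE_RDONLY between the read of the entry and the update, the stale `entry` value can no longer mark the entry clean again.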
    •
      arm64, mm, numa: Add NUMA balancing support for arm64. · 56166230
      Ganapatrao Kulkarni committed
      Enable NUMA balancing for arm64 platforms.
      Add pte, pmd protnone helpers for use by automatic NUMA balancing.
      Reviewed-by: Steve Capper <steve.capper@arm.com>
      Reviewed-by: Robert Richter <rrichter@cavium.com>
      Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
      Signed-off-by: David Daney <david.daney@cavium.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      56166230
  3. 14 Apr, 2016 3 commits
  4. 11 Mar, 2016 1 commit
    •
      arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission · fdc69e7d
      Catalin Marinas committed
      The set_pte_at() function must update the hardware PTE_RDONLY bit
      depending on the state of the PTE_WRITE and PTE_DIRTY bits of the given
      entry value. However, it currently only performs this for pte_valid()
      entries, ignoring PTE_PROT_NONE. The side-effect is that PROT_NONE
      mappings would not have the PTE_RDONLY bit set. Without
      CONFIG_ARM64_HW_AFDBM, this is not an issue since such PROT_NONE pages
      are not accessible anyway.
      
      With commit 2f4b829c ("arm64: Add support for hardware updates of
      the access and dirty pte bits"), the ptep_set_wrprotect() function was
      re-written to cope with automatic hardware updates of the dirty state.
      As an optimisation, only PTE_RDONLY is checked to assess the "dirty"
      status. Since set_pte_at() does not set this bit for PROT_NONE mappings,
      such pages may be considered "dirty" as a result of
      ptep_set_wrprotect().
      
      This patch updates the pte_valid() check to pte_present() in
      set_pte_at(). It also adds PTE_PROT_NONE to the swap entry bits comment.
      
      Fixes: 2f4b829c ("arm64: Add support for hardware updates of the access and dirty pte bits")
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
      Tested-by: Ganapatrao Kulkarni <gkulkarni@cavium.com>
      Cc: <stable@vger.kernel.org>
      fdc69e7d
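The effect of switching from pte_valid() to pte_present() can be illustrated with a small user-space model. This is a sketch under stated assumptions: the `model_*` names and the bit positions are hypothetical, and the fix-up logic is a simplification of what the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions, not the kernel's definitions. */
#define PTE_VALID      (UINT64_C(1) << 0)
#define PTE_RDONLY     (UINT64_C(1) << 7)
#define PTE_WRITE      (UINT64_C(1) << 51)
#define PTE_DIRTY      (UINT64_C(1) << 55)  /* software dirty bit */
#define PTE_PROT_NONE  (UINT64_C(1) << 58)

static int model_pte_valid(uint64_t pte)
{
    return (pte & PTE_VALID) != 0;
}

static int model_pte_present(uint64_t pte)
{
    return (pte & (PTE_VALID | PTE_PROT_NONE)) != 0;
}

/* Hypothetical model of the PTE_RDONLY fix-up, parameterised on the check used. */
static uint64_t model_fixup_rdonly(uint64_t pte, int (*check)(uint64_t))
{
    if (check(pte)) {
        if ((pte & PTE_WRITE) && (pte & PTE_DIRTY))
            pte &= ~PTE_RDONLY;   /* writable and dirty: allow writes */
        else
            pte |= PTE_RDONLY;    /* otherwise keep the entry clean */
    }
    return pte;
}

/* The "dirty" shortcut described above: only PTE_RDONLY is inspected. */
static int model_looks_hw_dirty(uint64_t pte)
{
    return !(pte & PTE_RDONLY);
}
```

With the pte_valid()-style check, a PROT_NONE entry never gets PTE_RDONLY and so looks dirty to the shortcut; with the pte_present()-style check it does not.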
  5. 09 Mar, 2016 1 commit
  6. 27 Feb, 2016 1 commit
    •
      arm64: vmemmap: use virtual projection of linear region · dfd55ad8
      Ard Biesheuvel committed
      Commit dd006da2 ("arm64: mm: increase VA range of identity map") made
      some changes to the memory mapping code to allow physical memory to reside
      at an offset that exceeds the size of the virtual mapping.
      
      However, since the size of the vmemmap area is proportional to the size of
      the VA area, but it is populated relative to the physical space, we may
      end up with the struct page array being mapped outside of the vmemmap
      region. For instance, on my Seattle A0 box, I can see the following output
      in the dmesg log.
      
         vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000   (     8 GB maximum)
                   0xffffffbfc0000000 - 0xffffffbfd0000000   (   256 MB actual)
      
      We can fix this by deciding that the vmemmap region is not a projection of
      the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
      linear region. This way, we are guaranteed that the vmemmap region is of
      sufficient size, and we can even reduce the size by half.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      dfd55ad8
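The sizes in this commit can be checked with a little arithmetic. The sketch below assumes a 39-bit VA space, 4 KB pages and a 64-byte struct page; these values are not stated in the commit message and were chosen because they reproduce the 8 GB figure in the dmesg output. The function name `vmemmap_bytes` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of struct page needed to describe a region of 2^region_bits bytes
 * made of pages of 2^page_shift bytes each.
 */
static uint64_t vmemmap_bytes(unsigned region_bits, unsigned page_shift,
                              uint64_t struct_page_bytes)
{
    assert(region_bits > page_shift && region_bits - page_shift < 58);
    return (UINT64_C(1) << (region_bits - page_shift)) * struct_page_bytes;
}
```

Under these assumptions, `vmemmap_bytes(39, 12, 64)` covers the whole VA space and gives 8 GB, while `vmemmap_bytes(38, 12, 64)` covers only the linear region (half of the VA space) and gives 4 GB, which is the halving the message mentions.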
  7. 26 Feb, 2016 2 commits
    •
      arm64: Remove fixmap include fragility · 3eca86e7
      Mark Rutland committed
      The asm-generic fixmap.h depends on each architecture's fixmap.h to pull
      in the definition of PAGE_KERNEL_RO, if this exists. In the absence of
      this, FIXMAP_PAGE_RO will not be defined. In mm/early_ioremap.c the
      definition of early_memremap_ro is predicated on FIXMAP_PAGE_RO being
      defined.
      
      Currently, the arm64 fixmap.h doesn't include pgtable.h for the
      definition of PAGE_KERNEL_RO, and as a knock-on effect early_memremap_ro
      is not always defined, leading to link-time failures when it is used.
      This has been observed with defconfig on next-20160226.
      
      Unfortunately, as pgtable.h includes fixmap.h, adding the include
      introduces a circular dependency, which is just as fragile.
      
      Instead, this patch factors out PAGE_KERNEL_RO and other prot
      definitions into a new pgtable-prot header which can be included by both
      pgtable.h and fixmap.h, avoiding the circular dependency, and ensuring
      that early_memremap_ro is always defined where it is used.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      3eca86e7
    •
      arm64: Fix building error with 16KB pages and 36-bit VA · cac4b8cd
      Catalin Marinas committed
      In such a configuration, Linux uses only two levels of page tables and
      __pud_populate() should not be used. However, the BUILD_BUG() triggers
      since pud_sect() is still defined and the compiler cannot eliminate such
      code, even though at run-time it should not be triggered. This patch
      extends the #ifdef ARM64_64K_PAGES condition for pud_sect to include
      PGTABLE_LEVELS < 3.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      cac4b8cd
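The idea of making pud_sect() a compile-time constant when fewer than three levels are in use can be sketched as follows. This is an illustration rather than the patch itself: the PUD_TYPE_* values, the default of two levels, and the helper `model_needs_pud_split` are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default for this sketch: a two-level configuration. */
#ifndef CONFIG_PGTABLE_LEVELS
#define CONFIG_PGTABLE_LEVELS 2
#endif

/* Illustrative descriptor type encoding. */
#define PUD_TYPE_MASK (UINT64_C(3) << 0)
#define PUD_TYPE_SECT (UINT64_C(1) << 0)

#if defined(CONFIG_ARM64_64K_PAGES) || CONFIG_PGTABLE_LEVELS < 3
/* Constant false, so the compiler can drop any code guarded by it. */
#define pud_sect(pud) ((void)(pud), 0)
#else
#define pud_sect(pud) (((pud) & PUD_TYPE_MASK) == PUD_TYPE_SECT)
#endif

/* Hypothetical caller: the guarded branch is dead code when pud_sect() is 0. */
static int model_needs_pud_split(uint64_t pud)
{
    if (pud_sect(pud))
        return 1;
    return 0;
}
```

Because the condition folds to a constant, a build-time check placed in the guarded branch would never be reached by the compiler in the two-level case.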
  8. 19 Feb, 2016 2 commits
  9. 16 Feb, 2016 4 commits
  10. 25 Jan, 2016 1 commit
  11. 16 Jan, 2016 2 commits
    •
      arch/arm64/include/asm/pgtable.h: add pmd_mkclean for THP · 05ee26d9
      Minchan Kim committed
      MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the contents
      of a THP page have been overwritten since the MADV_FREE syscall was called.

      This patch adds pmd_mkclean to support MADV_FREE on THP pages.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: <yalin.wang2010@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Gang <gang.chen.5i5j@gmail.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Daniel Micay <danielmicay@gmail.com>
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Jason Evans <je@fb.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mika Penttilä <mika.penttila@nextfour.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      05ee26d9
    •
      arm64, thp: remove infrastructure for handling splitting PMDs · b7ed934a
      Kirill A. Shutemov committed
      With the new refcounting we don't need to mark PMDs as splitting. Let's
      drop the code that handles this.

      pmdp_splitting_flush() is not needed either: when splitting a PMD we will
      do pmdp_clear_flush() + set_pte_at(). pmdp_clear_flush() will do an IPI as
      needed for fast_gup.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b7ed934a
  12. 05 Jan, 2016 1 commit
    •
      arm64: mm: move pgd_cache initialisation to pgtable_cache_init · 39b5be9b
      Will Deacon committed
      Initialising the support for EFI runtime services requires us to
      allocate a pgd off the back of an early_initcall. On systems where the
      PGD_SIZE is smaller than PAGE_SIZE (e.g. 64k pages and 48-bit VA), the
      pgd_cache isn't initialised at this stage, and we panic with a NULL
      dereference during boot:
      
        Unable to handle kernel NULL pointer dereference at virtual address 00000000
      
        __create_mapping.isra.5+0x84/0x350
        create_pgd_mapping+0x20/0x28
        efi_create_mapping+0x5c/0x6c
        arm_enable_runtime_services+0x154/0x1e4
        do_one_initcall+0x8c/0x190
        kernel_init_freeable+0x84/0x1ec
        kernel_init+0x10/0xe0
        ret_from_fork+0x10/0x50
      
      This patch fixes the problem by initialising the pgd_cache earlier, in
      the pgtable_cache_init callback, which sounds suspiciously like what it
      was intended for.
      Reported-by: Dennis Chen <dennis.chen@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      39b5be9b
  13. 22 Dec, 2015 1 commit
    •
      arm64: hugetlb: add support for PTE contiguous bit · 66b3923a
      David Woods committed
      The arm64 MMU supports a Contiguous bit which is a hint that the TTE
      is one of a set of contiguous entries which can be cached in a single
      TLB entry.  Supporting this bit adds new intermediate huge page sizes.
      
      The set of huge page sizes available depends on the base page size.
      Without using contiguous pages the huge page sizes are as follows.
      
       4KB:   2MB  1GB
      64KB: 512MB
      
      With a 4KB granule, the contiguous bit groups together sets of 16 pages
      and with a 64KB granule it groups sets of 32 pages.  This enables two new
      huge page sizes in each case, so that the full set of available sizes
      is as follows.
      
       4KB:  64KB   2MB  32MB  1GB
      64KB:   2MB 512MB  16GB
      
      If a 16KB granule is used then the contiguous bit groups 128 pages
      at the PTE level and 32 pages at the PMD level.
      
      If the base page size is set to 64KB then 2MB pages are enabled by
      default.  It is possible in the future to make 2MB the default huge
      page size for both 4KB and 64KB granules.
      Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
      Reviewed-by: Steve Capper <steve.capper@linaro.org>
      Signed-off-by: David Woods <dwoods@ezchip.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      66b3923a
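The huge page sizes listed in this commit follow from the granule size and the contiguous group counts. The sketch below recomputes them, assuming 8-byte descriptors so that one table holds page_size / 8 entries; the struct and function names are hypothetical, and the 16 KB results are derived arithmetic rather than figures quoted from the message.

```c
#include <assert.h>
#include <stdint.h>

#define KB(x) ((uint64_t)(x) << 10)
#define MB(x) ((uint64_t)(x) << 20)
#define GB(x) ((uint64_t)(x) << 30)

/* Hypothetical description of a translation granule. */
struct granule {
    uint64_t page_size;   /* base page size in bytes */
    unsigned cont_ptes;   /* pages grouped by the contiguous bit at PTE level */
    unsigned cont_pmds;   /* blocks grouped by the contiguous bit at PMD level */
};

/* One PMD entry maps a full table of PTEs (page_size / 8 entries). */
static uint64_t pmd_block_size(const struct granule *g)
{
    return g->page_size * (g->page_size / 8);
}

static uint64_t cont_pte_size(const struct granule *g)
{
    return g->page_size * g->cont_ptes;
}

static uint64_t cont_pmd_size(const struct granule *g)
{
    return pmd_block_size(g) * g->cont_pmds;
}
```

For a 4 KB granule with groups of 16 this gives 64 KB, 2 MB and 32 MB; for a 64 KB granule with groups of 32 it gives 2 MB, 512 MB and 16 GB, matching the table in the message.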
  14. 11 Dec, 2015 1 commit
  15. 01 Dec, 2015 1 commit
  16. 18 Nov, 2015 1 commit
  17. 09 Nov, 2015 1 commit
    •
      arm64: fix R/O permissions of FDT mapping · fb226c3d
      Ard Biesheuvel committed
      The mapping permissions of the FDT are set to 'PAGE_KERNEL | PTE_RDONLY'
      in an attempt to map the FDT as read-only. However, not only does this
      break at build time under STRICT_MM_TYPECHECKS (since the two terms are
      of different types in that case), it also results in both the PTE_WRITE
      and PTE_RDONLY attributes being set, which means the region is still
      writable under ARMv8.1 DBM (and an attempted write will simply clear the
      PTE_RDONLY bit).

      So define PAGE_KERNEL_RO (which already has an established meaning
      across architectures) and use that instead.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      fb226c3d
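Why a mapping with both PTE_WRITE and PTE_RDONLY set remains writable under DBM can be shown with a simplified model of a store. This is a sketch only: `model_store` is a hypothetical helper, the bit positions are illustrative, and real hardware behaviour involves more state than is modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions, not the kernel's definitions. */
#define PTE_RDONLY  (UINT64_C(1) << 7)
#define PTE_WRITE   (UINT64_C(1) << 51)  /* doubles as the DBM bit in this model */

/*
 * Simplified model of a store through a PTE when hardware dirty bit
 * management is enabled. Returns 1 if the store succeeds, 0 if it faults.
 */
static int model_store(uint64_t *pte)
{
    assert(pte);
    if (!(*pte & PTE_RDONLY))
        return 1;                 /* already writable */
    if (*pte & PTE_WRITE) {
        *pte &= ~PTE_RDONLY;      /* hardware marks the entry dirty */
        return 1;
    }
    return 0;                     /* genuinely read-only: permission fault */
}
```

In this model an entry built as writable plus PTE_RDONLY accepts the store, whereas an entry without PTE_WRITE faults, which is the behaviour a read-only protection value needs.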
  18. 16 Oct, 2015 1 commit
  19. 13 Oct, 2015 2 commits
    •
      arm64: add kc_offset_to_vaddr and kc_vaddr_to_offset macro · 03875ad5
      yalin wang committed
      This patch adds kc_offset_to_vaddr() and kc_vaddr_to_offset().
      The default versions don't work on arm64 because some arm64 kernel
      addresses, such as module and vmemmap addresses, are below PAGE_OFFSET.
      Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      03875ad5
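One way to map any kernel address to a non-negative offset, including addresses below PAGE_OFFSET, is to mask off the upper bits that all kernel addresses share. The sketch below shows that idea; the 39-bit VA_BITS value and the `model_*` names are assumptions, and the commit message does not show the exact macro bodies.

```c
#include <assert.h>
#include <stdint.h>

#define VA_BITS   39  /* assumed VA size for this sketch */
#define VA_START  (UINT64_C(0xffffffffffffffff) << VA_BITS)

/* Strip the upper bits common to all kernel addresses. */
static uint64_t model_kc_vaddr_to_offset(uint64_t v)
{
    return v & ~VA_START;
}

/* Restore the upper bits to recover the kernel virtual address. */
static uint64_t model_kc_offset_to_vaddr(uint64_t o)
{
    return o | VA_START;
}
```

An offset computed by subtracting PAGE_OFFSET would wrap around for an address below PAGE_OFFSET, whereas the masked form stays within the VA_BITS range and round-trips.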
    •
      arm64: add KASAN support · 39d114dd
      Andrey Ryabinin committed
      This patch adds arch-specific code for the kernel address sanitizer
      (see Documentation/kasan.txt).

      1/8 of the kernel address space is reserved for shadow memory. There was
      no big enough hole for this, so the virtual addresses for the shadow were
      stolen from the vmalloc area.

      At early boot, the whole shadow region is populated with just one
      physical page (kasan_zero_page). Later, this page is reused as a
      read-only zero shadow for memory that KASan currently does not track
      (vmalloc).
      After the physical memory has been mapped, pages for shadow memory are
      allocated and mapped.

      Functions like memset/memmove/memcpy do a lot of memory accesses.
      If a bad pointer is passed to one of these functions, it is important
      to catch this. Compiler instrumentation cannot do this since
      these functions are written in assembly.
      KASan replaces the memory functions with manually instrumented variants.
      The original functions are declared as weak symbols so that the strong
      definitions in mm/kasan/kasan.c can replace them. The original functions
      have aliases with a '__' prefix in the name, so the non-instrumented
      variant can be called if needed.
      Some files are built without KASan instrumentation (e.g. mm/slub.c).
      For such files, the original mem* functions are replaced (via #define)
      with the prefixed variants to disable memory access checks.
      Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Tested-by: Linus Walleij <linus.walleij@linaro.org>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      39d114dd
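The 1/8 ratio mentioned in this commit comes from mapping every 8 bytes of memory to one shadow byte. A minimal sketch of that address translation follows; the function name `model_mem_to_shadow` and the offset value used in the usage note are hypothetical, not the arm64 constants.

```c
#include <assert.h>
#include <stdint.h>

#define SHADOW_SCALE_SHIFT 3  /* 8 bytes of memory per shadow byte */

/*
 * Translate a memory address to the address of its shadow byte.
 * `shadow_offset` is a per-configuration constant (hypothetical here).
 */
static uint64_t model_mem_to_shadow(uint64_t addr, uint64_t shadow_offset)
{
    return (addr >> SHADOW_SCALE_SHIFT) + shadow_offset;
}
```

With any fixed offset, the addresses 0x1000 through 0x1007 share one shadow byte and 0x1008 maps to the next one, so a region of N bytes needs N / 8 bytes of shadow.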
  20. 09 Oct, 2015 2 commits
  21. 07 Oct, 2015 2 commits
  22. 02 Oct, 2015 1 commit
  23. 14 Sep, 2015 3 commits
  24. 08 Aug, 2015 1 commit
  25. 28 Jul, 2015 1 commit