1. 29 September 2017, 1 commit
    • arm64: fault: Route pte translation faults via do_translation_fault · 760bfb47
      Committed by Will Deacon
      We currently route pte translation faults via do_page_fault, which elides
      the address check against TASK_SIZE before invoking the mm fault handling
      code. However, this can cause issues with the path walking code in
      conjunction with our word-at-a-time implementation because
      load_unaligned_zeropad can end up faulting in kernel space if it reads
      across a page boundary and runs into a page fault (e.g. by attempting to
      read from a guard region).
      
      In the case of such a fault, load_unaligned_zeropad has registered a
      fixup to shift the valid data and pad with zeroes, however the abort is
      reported as a level 3 translation fault and we dispatch it straight to
      do_page_fault, despite it being a kernel address. This results in calling
      a sleeping function from atomic context:
      
        BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313
        in_atomic(): 0, irqs_disabled(): 0, pid: 10290
        Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
        [...]
        [<ffffff8e016cd0cc>] ___might_sleep+0x134/0x144
        [<ffffff8e016cd158>] __might_sleep+0x7c/0x8c
        [<ffffff8e016977f0>] do_page_fault+0x140/0x330
        [<ffffff8e01681328>] do_mem_abort+0x54/0xb0
        Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0)
        [...]
        [<ffffff8e016844fc>] el1_da+0x18/0x78
        [<ffffff8e017f399c>] path_parentat+0x44/0x88
        [<ffffff8e017f4c9c>] filename_parentat+0x5c/0xd8
        [<ffffff8e017f5044>] filename_create+0x4c/0x128
        [<ffffff8e017f59e4>] SyS_mkdirat+0x50/0xc8
        [<ffffff8e01684e30>] el0_svc_naked+0x24/0x28
        Code: 36380080 d5384100 f9400800 9402566d (d4210000)
        ---[ end trace 2d01889f2bca9b9f ]---
      
      Fix this by dispatching all translation faults to do_translation_fault,
      which avoids invoking the page fault logic for faults on kernel addresses.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Ankit Jain <ankijain@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  2. 23 August 2017, 5 commits
  3. 22 August 2017, 5 commits
  4. 21 August 2017, 5 commits
  5. 11 August 2017, 1 commit
    • arm64: fix pmem interface definition · caf5ef7d
      Committed by Arnd Bergmann
      Defining the two functions as 'static inline' and exporting them
      leads to the interesting case where we can use the interface
      from loadable modules, but not from built-in drivers, as shown
      in this link failure:
      
      vers/nvdimm/claim.o: In function `nsio_rw_bytes':
      claim.c:(.text+0x1b8): undefined reference to `arch_invalidate_pmem'
      drivers/nvdimm/pmem.o: In function `pmem_dax_flush':
      pmem.c:(.text+0x11c): undefined reference to `arch_wb_cache_pmem'
      drivers/nvdimm/pmem.o: In function `pmem_make_request':
      pmem.c:(.text+0x5a4): undefined reference to `arch_invalidate_pmem'
      pmem.c:(.text+0x650): undefined reference to `arch_invalidate_pmem'
      pmem.c:(.text+0x6d4): undefined reference to `arch_invalidate_pmem'
      
      This removes the bogus 'static inline'.
      
      Fixes: d50e071f ("arm64: Implement pmem API support")
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  6. 09 August 2017, 2 commits
  7. 07 August 2017, 1 commit
    • arm64: Decode information from ESR upon mem faults · 1f9b8936
      Committed by Julien Thierry
      When the CPU raises an unhandled fault, the description printed is very
      sparse. Add information about the fault, decoded from the ESR.
      
      Add defines to esr.h for the corresponding ESR fields. Values are based
      on the ARM Architecture Reference Manual (DDI 0487B.a), section D7.2.28
      ESR_ELx, Exception Syndrome Register (ELx) (pages D7-2275 to D7-2280).
      
      New output is of the form:
      [   77.818059] Mem abort info:
      [   77.820826]   Exception class = DABT (current EL), IL = 32 bits
      [   77.826706]   SET = 0, FnV = 0
      [   77.829742]   EA = 0, S1PTW = 0
      [   77.832849] Data abort info:
      [   77.835713]   ISV = 0, ISS = 0x00000070
      [   77.839522]   CM = 0, WnR = 1
      Signed-off-by: Julien Thierry <julien.thierry@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      [catalin.marinas@arm.com: fix "%lu" in a pr_alert() call]
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  8. 04 August 2017, 1 commit
    • arm64: Fix potential race with hardware DBM in ptep_set_access_flags() · 6d332747
      Committed by Catalin Marinas
      In a system with DBM (dirty bit management) capable agents there is a
      possible race between a CPU executing ptep_set_access_flags() (maybe
      non-DBM capable) and a hardware update of the dirty state (clearing of
      PTE_RDONLY). The scenario:
      
      a) the pte is writable (PTE_WRITE set), clean (PTE_RDONLY set) and old
         (PTE_AF clear)
      b) ptep_set_access_flags() is called as a result of a read access and it
         needs to set the pte to writable, clean and young (PTE_AF set)
      c) a DBM-capable agent, as a result of a different write access, is
         marking the entry as young (setting PTE_AF) and dirty (clearing
         PTE_RDONLY)
      
      The current ptep_set_access_flags() implementation would set the
      PTE_RDONLY bit in the resulting value overriding the DBM update and
      losing the dirty state.
      
      This patch fixes the race by setting PTE_RDONLY to the most permissive
      (lowest value) of the current entry and the new one.
      
      Fixes: 66dbd6e6 ("arm64: Implement ptep_set_access_flags() for hardware AF/DBM")
      Cc: Will Deacon <will.deacon@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Steve Capper <steve.capper@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  9. 28 July 2017, 1 commit
  10. 21 July 2017, 1 commit
    • arm64/numa: Drop duplicate message · ece4b206
      Committed by Punit Agrawal
      When booting Linux on a system without CONFIG_NUMA enabled, the
      following messages are printed during boot -
      
      NUMA: Faking a node at [mem 0x0000000000000000-0x00000083ffffffff]
      NUMA: Adding memblock [0x8000000000 - 0x8000e7ffff] on node 0
      NUMA: Adding memblock [0x8000e80000 - 0x83f65cffff] on node 0
      NUMA: Adding memblock [0x83f65d0000 - 0x83f665ffff] on node 0
      NUMA: Adding memblock [0x83f6660000 - 0x83f676ffff] on node 0
      NUMA: Adding memblock [0x83f6770000 - 0x83f678ffff] on node 0
      NUMA: Adding memblock [0x83f6790000 - 0x83fb82ffff] on node 0
      NUMA: Adding memblock [0x83fb830000 - 0x83fbc0ffff] on node 0
      NUMA: Adding memblock [0x83fbc10000 - 0x83fbdfffff] on node 0
      NUMA: Adding memblock [0x83fbe00000 - 0x83fbffffff] on node 0
      NUMA: Adding memblock [0x83fc000000 - 0x83fffbffff] on node 0
      NUMA: Adding memblock [0x83fffc0000 - 0x83fffdffff] on node 0
      NUMA: Adding memblock [0x83fffe0000 - 0x83ffffffff] on node 0
      NUMA: Initmem setup node 0 [mem 0x8000000000-0x83ffffffff]
      NUMA: NODE_DATA [mem 0x83fffec500-0x83fffedfff]
      
      The information is then duplicated by core kernel messages right after
      the above output.
      
      Early memory node ranges
        node   0: [mem 0x0000008000000000-0x0000008000e7ffff]
        node   0: [mem 0x0000008000e80000-0x00000083f65cffff]
        node   0: [mem 0x00000083f65d0000-0x00000083f665ffff]
        node   0: [mem 0x00000083f6660000-0x00000083f676ffff]
        node   0: [mem 0x00000083f6770000-0x00000083f678ffff]
        node   0: [mem 0x00000083f6790000-0x00000083fb82ffff]
        node   0: [mem 0x00000083fb830000-0x00000083fbc0ffff]
        node   0: [mem 0x00000083fbc10000-0x00000083fbdfffff]
        node   0: [mem 0x00000083fbe00000-0x00000083fbffffff]
        node   0: [mem 0x00000083fc000000-0x00000083fffbffff]
        node   0: [mem 0x00000083fffc0000-0x00000083fffdffff]
        node   0: [mem 0x00000083fffe0000-0x00000083ffffffff]
      Initmem setup node 0 [mem 0x0000008000000000-0x00000083ffffffff]
      
      Remove the duplication of memblock layout information printed during
      boot by dropping the messages from arm64 numa initialisation.
      Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  11. 20 July 2017, 1 commit
  12. 13 July 2017, 1 commit
  13. 11 July 2017, 1 commit
  14. 07 July 2017, 3 commits
    • mm/hugetlb: add size parameter to huge_pte_offset() · 7868a208
      Committed by Punit Agrawal
      A poisoned or migrated hugepage is stored as a swap entry in the page
      tables.  On architectures that support hugepages consisting of
      contiguous page table entries (such as on arm64) this leads to ambiguity
      in determining the page table entry to return in huge_pte_offset() when
      a poisoned entry is encountered.
      
      Let's remove the ambiguity by adding a size parameter to convey
      additional information about the requested address.  Also fixup the
      definition/usage of huge_pte_offset() throughout the tree.
      
      Link: http://lkml.kernel.org/r/20170522133604.11392-4-punit.agrawal@arm.com
      Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
      Acked-by: Steve Capper <steve.capper@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: James Hogan <james.hogan@imgtec.com> (odd fixer:METAG ARCHITECTURE)
      Cc: Ralf Baechle <ralf@linux-mips.org> (supporter:MIPS)
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arm64: hugetlb: remove spurious calls to huge_ptep_offset() · f0b38d65
      Committed by Steve Capper
      We don't need to call huge_ptep_offset as our accessors are already
      supplied with the pte_t *.  This patch removes those spurious calls.
      
      [punit.agrawal@arm.com: resolve rebase conflicts due to patch re-ordering]
      Link: http://lkml.kernel.org/r/20170524115409.31309-3-punit.agrawal@arm.com
      Signed-off-by: Steve Capper <steve.capper@arm.com>
      Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
      Cc: David Woods <dwoods@mellanox.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arm64: hugetlb: refactor find_num_contig() · bb9dd3df
      Committed by Steve Capper
      Patch series "Support for contiguous pte hugepages", v4.
      
      This patchset updates the hugetlb code to fix issues arising from
      contiguous pte hugepages (such as on arm64).  Compared to v3, this
      version addresses a build failure on arm64 by including two cleanup
      patches.  Other than the arm64 cleanups, the rest are generic code
      changes.  The remaining arm64 support based on these patches will be
      posted separately.  The patches are based on v4.12-rc2.  Previous
      related postings can be found at [0], [1], [2], and [3].
      
      The patches fall into three categories -
      
      * Patches 1-2 - arm64 cleanups required to greatly simplify changing
        huge_pte_offset() prototype in Patch 5.
      
        Catalin, Will - are you happy for these patches to go via mm?
      
      * Patches 3-4 address issues with gup
      
      * Patches 5-8 relate to passing a size argument to hugepage helpers to
        disambiguate the size of the referred page. These changes are
        required to enable arch code to properly handle swap entries for
        contiguous pte hugepages.
      
        The changes to huge_pte_offset() (patch 5) touch multiple
        architectures but I've managed to minimise these changes for the
        other affected functions - huge_pte_clear() and set_huge_pte_at().
      
      These patches gate the enabling of contiguous hugepages support on arm64
      which has been requested for systems using non-4K page granules.
      
      The ARM64 architecture supports two flavours of hugepages -
      
      * Block mappings at the pud/pmd level
      
        These are regular hugepages where a pmd or a pud page table entry
        points to a block of memory. Depending on the PAGE_SIZE in use the
        following size of block mappings are supported -
      
                PMD	PUD
                ---	---
        4K:      2M	 1G
        16K:    32M
        64K:   512M
      
        For certain applications/usecases such as HPC and large enterprise
        workloads, folks are using 64k page size but the minimum hugepage size
        of 512MB isn't very practical.
      
      To overcome this ...
      
      * Using the Contiguous bit
      
        The architecture provides a contiguous bit in the translation table
        entry which acts as a hint to the mmu to indicate that it is one of a
        contiguous set of entries that can be cached in a single TLB entry.
      
        We use the contiguous bit in Linux to increase the mapping size at the
        pmd and pte (last) level.
      
        The number of supported contiguous entries varies by page size and
        level of the page table.
      
        Using the contiguous bit allows additional hugepage sizes -
      
                 CONT PTE    PMD    CONT PMD    PUD
                 --------    ---    --------    ---
          4K:         64K     2M         32M     1G
          16K:         2M    32M          1G
          64K:         2M   512M         16G
      
        Of these, 64K with 4K and 2M with 64K pages have been explicitly
        requested by a few different users.
      
      Entries with the contiguous bit set are required to be modified all
      together - which makes things like memory poisoning and migration
      impossible to do correctly without knowing the size of hugepage being
      dealt with - the reason for adding size parameter to a few of the
      hugepage helpers in this series.
      
      This patch (of 8):
      
      As we regularly check for contiguous pte's in the huge accessors, remove
      this extra check from find_num_contig.
      
      [punit.agrawal@arm.com: resolve rebase conflicts due to patch re-ordering]
      Link: http://lkml.kernel.org/r/20170524115409.31309-2-punit.agrawal@arm.com
      Signed-off-by: Steve Capper <steve.capper@arm.com>
      Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
      Cc: David Woods <dwoods@mellanox.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 23 June 2017, 3 commits
  16. 20 June 2017, 1 commit
  17. 15 June 2017, 1 commit
  18. 12 June 2017, 6 commits