1. 09 Dec 2021, 1 commit
  2. 22 Oct 2021, 1 commit
    • powerpc/32: Don't use a struct based type for pte_t · c7d19189
      Authored by Christophe Leroy
      A long time ago we had a config item called STRICT_MM_TYPECHECKS
      that built the kernel either with pte_t defined as a structure, in
      order to perform additional build-time checks, or with pte_t
      defined as a simple type, in order to get simpler generated code.
      
      Commit 670eea92 ("powerpc/mm: Always use STRICT_MM_TYPECHECKS")
      made the struct based definition the only one, considering that the
      generated code was similar in both cases.
      
      That's true on ppc64, because the ABI passes the content of a
      struct with a single simple-type member in a register, but on
      ppc32 such a structure is passed via the stack like any other
      structure.
      
      Simple test function:
      
      	pte_t test(pte_t pte)
      	{
      		return pte;
      	}
      
      Before this patch we get
      
      	c00108ec <test>:
      	c00108ec:	81 24 00 00 	lwz     r9,0(r4)
      	c00108f0:	91 23 00 00 	stw     r9,0(r3)
      	c00108f4:	4e 80 00 20 	blr
      
      So, for PPC32, restore the simple-type behaviour we had before
      commit 670eea92, but instead of adding a config option to
      activate type checking, do it when __CHECKER__ is set, so that
      type checking is performed by 'sparse' and provides feedback like:
      
      	arch/powerpc/mm/pgtable.c:466:16: warning: incorrect type in return expression (different base types)
      	arch/powerpc/mm/pgtable.c:466:16:    expected unsigned long
      	arch/powerpc/mm/pgtable.c:466:16:    got struct pte_t [usertype] x
      
      With this patch we now get
      
      	c0010890 <test>:
      	c0010890:	4e 80 00 20 	blr
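      
      As a rough illustration, the conditional typedef described above
      can look like the following sketch (pte_basic_t is the underlying
      simple type used by the powerpc headers; details here are
      approximate, not the exact patch):
      
      	/* Sketch only: struct-based pte_t under sparse (__CHECKER__)
      	 * for type checking, plain integer pte_t otherwise for better
      	 * ppc32 code generation. */
      	#ifdef __CHECKER__
      	#define STRICT_MM_TYPECHECKS
      	#endif
      
      	#ifdef STRICT_MM_TYPECHECKS
      	typedef struct { pte_basic_t pte; } pte_t;
      	#define __pte(x)	((pte_t) { (x) })
      	static inline pte_basic_t pte_val(pte_t x) { return x.pte; }
      	#else
      	typedef pte_basic_t pte_t;
      	#define __pte(x)	(x)
      	static inline pte_basic_t pte_val(pte_t x) { return x; }
      	#endif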
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      [mpe: Define STRICT_MM_TYPECHECKS rather than repeating the condition]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/c904599f33aaf6bb7ee2836a9ff8368509e0d78d.1631887042.git.christophe.leroy@csgroup.eu
  3. 24 Jun 2021, 1 commit
  4. 16 Jun 2021, 1 commit
  5. 11 Feb 2021, 1 commit
  6. 30 Jan 2021, 1 commit
  7. 17 Oct 2020, 1 commit
  8. 15 Sep 2020, 1 commit
  9. 08 Aug 2020, 1 commit
    • mm: remove unneeded includes of <asm/pgalloc.h> · ca15ca40
      Authored by Mike Rapoport
      Patch series "mm: cleanup usage of <asm/pgalloc.h>"
      
      Most architectures have very similar versions of pXd_alloc_one() and
      pXd_free_one() for intermediate levels of page table.  These patches add
      generic versions of these functions in <asm-generic/pgalloc.h> and enable
      use of the generic functions where appropriate.
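      
      As an illustration, a generic pmd_alloc_one() along these lines can
      look like this sketch (GFP_PGTABLE_USER/GFP_PGTABLE_KERNEL and
      pgtable_pmd_page_ctor() are the kernel's names from this era; the
      exact body is approximate):
      
      	/* Sketch of a generic pmd_alloc_one(): allocate one page for a
      	 * PMD table, with kernel vs. user GFP flags and the page-table
      	 * constructor. */
      	static inline pmd_t *pmd_alloc_one(struct mm_struct *mm,
      					   unsigned long addr)
      	{
      		struct page *page;
      		gfp_t gfp = GFP_PGTABLE_USER;
      
      		if (mm == &init_mm)
      			gfp = GFP_PGTABLE_KERNEL;
      		page = alloc_pages(gfp, 0);
      		if (!page)
      			return NULL;
      		if (!pgtable_pmd_page_ctor(page)) {
      			__free_pages(page, 0);
      			return NULL;
      		}
      		return (pmd_t *)page_address(page);
      	}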
      
      In addition, functions declared and defined in <asm/pgalloc.h>
      headers are used mostly by core mm and by early mm initialization
      in arch code, and there is no actual reason to have <asm/pgalloc.h>
      included all over the place. The first patch in this series removes
      unneeded includes of <asm/pgalloc.h>.
      
      In the end it didn't work out as neatly as I hoped and moving
      pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
      unnecessary changes to arches that have custom page table allocations, so
      I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
      to mm/.
      
      This patch (of 8):
      
      In most cases the <asm/pgalloc.h> header is required only for
      allocations of page table memory.  Most of the .c files that
      include that header do not use symbols declared in <asm/pgalloc.h>
      and do not require that header.
      
      As for the other header files that used to include <asm/pgalloc.h>, it is
      possible to move that include into the .c file that actually uses symbols
      from <asm/pgalloc.h> and drop the include from the header file.
      
      The process was somewhat automated using
      
      	sed -i -E '/[<"]asm\/pgalloc\.h/d' \
                      $(grep -L -w -f /tmp/xx \
                              $(git grep -E -l '[<"]asm/pgalloc\.h'))
      
      where /tmp/xx contains all the symbols defined in
      arch/*/include/asm/pgalloc.h.
      
      [rppt@linux.ibm.com: fix powerpc warning]
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Pekka Enberg <penberg@kernel.org>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
      Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 11 Jun 2020, 1 commit
  11. 10 Jun 2020, 1 commit
  12. 05 Jun 2020, 1 commit
  13. 26 May 2020, 2 commits
    • powerpc/8xx: Manage 512k huge pages as standard pages. · b250c8c0
      Authored by Christophe Leroy
      At present, 512k huge pages are handled through hugepd page
      tables. The PMD entry is flagged as a hugepd pointer, which means
      that only 512k hugepages can be managed in that 4M block. However,
      the hugepd table has the same size as a normal page table, so 512k
      page entries can just as well be nested with normal pages.
      
      On the 8xx, TLB loading is performed by software, and although the
      page tables are organised to match the L1 and L2 levels defined by
      the HW, each TLB entry has independent L1 and L2 parts. This means
      that even if two TLB entries are associated with the same PMD
      entry, they can be loaded with different values in the L1 part.
      
      The L1 entry contains the page size (PS field):
      - 00 for 4k and 16k pages
      - 01 for 512k pages
      - 11 for 8M pages
      
      By adding a flag for hugepages in the PTE (_PAGE_HUGE) and copying it
      into the lower bit of PS, we can then manage 512k pages with normal
      page tables:
      - PMD entry has PS=11 for 8M pages
      - PMD entry has PS=00 for other pages.
      
      As a PMD entry covers a 4M area, a PMD will either point to a
      hugepd table holding a single entry for an 8M page, or point to a
      standard page table whose entries map 4k, 16k or 512k pages. For
      512k pages, as the L1 entry cannot know it is a 512k page before
      the PTE is read, there will be 128 entries in the page table, as
      if they were 4k pages; but when loading the TLB, the entry will be
      flagged as a 512k page.
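      
      As a rough C illustration of that PS trick (the real logic lives
      in the 8xx TLB-miss assembly; _PMD_PAGE_8M and _PAGE_HUGE are the
      8xx flag names, but the helper itself is hypothetical):
      
      	/* Hypothetical helper: derive the 2-bit L1 PS field. The low
      	 * bit comes from the PTE's _PAGE_HUGE flag, so a 512k page
      	 * (PS=01) can live in a standard page table next to 4k PTEs. */
      	static inline unsigned int l1_ps_bits(unsigned long pmd_val,
      					      unsigned long pte_val)
      	{
      		if (pmd_val & _PMD_PAGE_8M)
      			return 0x3;	/* PS = 11: 8M page */
      
      		return (pte_val & _PAGE_HUGE) ? 0x1 : 0x0; /* 512k : 4k/16k */
      	}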
      
      Note that we can't use pmd_ptr() in asm/nohash/32/pgtable.h because
      it is not defined yet.
      
      In the ITLB miss handler, we keep the possibility to opt this out,
      as when kernel text is pinned and no user hugepages are used we
      can save several instructions by not using r11.
      
      In the DTLB miss handler, it is just one instruction, so it is not
      worth bothering with it.
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/002819e8e166bf81d24b24782d98de7c40905d8f.1589866984.git.christophe.leroy@csgroup.eu
    • powerpc/mm: Reduce hugepd size for 8M hugepages on 8xx · b12c07a4
      Authored by Christophe Leroy
      Commit 55c8fc3f ("powerpc/8xx: reintroduce 16K pages with HW
      assistance") redefined pte_t as a struct of 4 pte_basic_t, because
      in 16K pages mode there are four identical entries in the page
      table. But hugepd entries for 8M pages require only one entry of
      size pte_basic_t, so there is no point in creating a cache for
      4-entry page tables.
      
      Calculate PTE_T_ORDER using the size of pte_basic_t instead of pte_t.
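      
      A sketch of that calculation (illustrative; the point is that the
      cache order follows the size of a single pte_basic_t):
      
      	/* Sketch: hugepd cache order based on one pte_basic_t entry,
      	 * not a full pte_t (4 entries in 16k pages mode). */
      	#define PTE_T_ORDER	(__builtin_ffs(sizeof(pte_basic_t)) - \
      				 __builtin_ffs(sizeof(void *)))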
      
      Define specific huge_pte helpers (set_huge_pte_at(), huge_pte_clear(),
      huge_ptep_set_wrprotect()) to write the pte in a single entry instead
      of using set_pte_at(), which writes 4 identical entries in 16k pages
      mode. Also make sure that __ptep_set_access_flags() properly handles
      the huge_pte case; a sketch of one such helper follows.
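      
      A minimal sketch of one such helper (the actual patch also has to
      cope with the hugepd and 512k layouts; this shows only the
      single-entry write described above):
      
      	/* Sketch: write the pte value once, instead of replicating it
      	 * into four identical entries as set_pte_at() does in 16k
      	 * pages mode. */
      	void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
      			     pte_t *ptep, pte_t pte)
      	{
      		*(pte_basic_t *)ptep = pte_val(pte);
      	}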
      
      Define set_pte_filter() inline, otherwise GCC no longer inlines it
      now that it is used twice, and that gives pretty suboptimal code
      because pte_t is a struct of 4 entries.
      
      Those functions are also used for 512k pages, which likewise
      require only one entry, although replicating it four times was
      harmless there, as 512k page entries are spread every 128 bytes in
      the table.
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/43050d1a0c2d6e1541cab9c1126fc80bc7015ebd.1589866984.git.christophe.leroy@csgroup.eu
  14. 04 Jul 2019, 1 commit
  15. 07 Jun 2019, 1 commit
  16. 31 May 2019, 1 commit
  17. 02 May 2019, 4 commits
  18. 19 Dec 2018, 1 commit
  19. 25 Nov 2018, 1 commit
  20. 20 Oct 2018, 1 commit
    • powerpc/mm: Fix WARN_ON with THP NUMA migration · dd0e144a
      Authored by Aneesh Kumar K.V
      WARNING: CPU: 12 PID: 4322 at /arch/powerpc/mm/pgtable-book3s64.c:76 set_pmd_at+0x4c/0x2b0
       Modules linked in:
       CPU: 12 PID: 4322 Comm: qemu-system-ppc Tainted: G        W         4.19.0-rc3-00758-g8f0c636b0542 #36
       NIP:  c0000000000872fc LR: c000000000484eec CTR: 0000000000000000
       REGS: c000003fba876fe0 TRAP: 0700   Tainted: G        W          (4.19.0-rc3-00758-g8f0c636b0542)
       MSR:  900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 24282884  XER: 00000000
       CFAR: c000000000484ee8 IRQMASK: 0
       GPR00: c000000000484eec c000003fba877268 c000000001f0ec00 c000003fbd229f80
       GPR04: 00007c8fe8e00000 c000003f864c5a38 860300853e0000c0 0000000000000080
       GPR08: 0000000080000000 0000000000000001 0401000000000080 0000000000000001
       GPR12: 0000000000002000 c000003fffff5400 c000003fce292000 00007c9024570000
       GPR16: 0000000000000000 0000000000ffffff 0000000000000001 c000000001885950
       GPR20: 0000000000000000 001ffffc0004807c 0000000000000008 c000000001f49d05
       GPR24: 00007c8fe8e00000 c0000000020f2468 ffffffffffffffff c000003fcd33b090
       GPR28: 00007c8fe8e00000 c000003fbd229f80 c000003f864c5a38 860300853e0000c0
       NIP [c0000000000872fc] set_pmd_at+0x4c/0x2b0
       LR [c000000000484eec] do_huge_pmd_numa_page+0xb1c/0xc20
       Call Trace:
       [c000003fba877268] [c00000000045931c] mpol_misplaced+0x1bc/0x230 (unreliable)
       [c000003fba8772c8] [c000000000484eec] do_huge_pmd_numa_page+0xb1c/0xc20
       [c000003fba877398] [c00000000040d344] __handle_mm_fault+0x5e4/0x2300
       [c000003fba8774d8] [c00000000040f400] handle_mm_fault+0x3a0/0x420
       [c000003fba877528] [c0000000003ff6f4] __get_user_pages+0x2e4/0x560
       [c000003fba877628] [c000000000400314] get_user_pages_unlocked+0x104/0x2a0
       [c000003fba8776c8] [c000000000118f44] __gfn_to_pfn_memslot+0x284/0x6a0
       [c000003fba877748] [c0000000001463a0] kvmppc_book3s_radix_page_fault+0x360/0x12d0
       [c000003fba877838] [c000000000142228] kvmppc_book3s_hv_page_fault+0x48/0x1300
       [c000003fba877988] [c00000000013dc08] kvmppc_vcpu_run_hv+0x1808/0x1b50
       [c000003fba877af8] [c000000000126b44] kvmppc_vcpu_run+0x34/0x50
       [c000003fba877b18] [c000000000123268] kvm_arch_vcpu_ioctl_run+0x288/0x2d0
       [c000003fba877b98] [c00000000011253c] kvm_vcpu_ioctl+0x1fc/0x8c0
       [c000003fba877d08] [c0000000004e9b24] do_vfs_ioctl+0xa44/0xae0
       [c000003fba877db8] [c0000000004e9c44] ksys_ioctl+0x84/0xf0
       [c000003fba877e08] [c0000000004e9cd8] sys_ioctl+0x28/0x80
      
      We removed the pte_protnone check earlier with the understanding
      that we mark the pte invalid before the set_pte/set_pmd usage. But
      huge pmd autonuma still uses set_pmd_at() directly. This is OK
      because a protnone pte won't have a translation cached in the TLB.
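      
      A hedged sketch of the relaxed assertion this implies
      (pte_hw_valid() and pte_protnone() are the helpers involved; the
      exact WARN_ON form is approximate):
      
      	/* Sketch: only warn when overwriting a pmd that is valid in
      	 * hardware and is not a protnone (NUMA hinting) entry, since a
      	 * protnone entry has no translation cached in the TLB. */
      	void set_pmd_at(struct mm_struct *mm, unsigned long addr,
      			pmd_t *pmdp, pmd_t pmd)
      	{
      	#ifdef CONFIG_DEBUG_VM
      		WARN_ON(pte_hw_valid(pmd_pte(*pmdp)) &&
      			!pte_protnone(pmd_pte(*pmdp)));
      	#endif
      		set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd));
      	}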
      
      Fixes: da7ad366 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  21. 14 Oct 2018, 2 commits
  22. 03 Oct 2018, 1 commit
    • powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit · da7ad366
      Authored by Aneesh Kumar K.V
      With this patch we use 0x8000000000000000UL (_PAGE_PRESENT) to
      indicate a valid pgd/pud/pmd entry. We also switch p**_present()
      to look at this bit.
      
      With pmd_present, we have a special case: we need to make sure we
      consider a pmd marked invalid during a THP split as present. Right
      now we clear the _PAGE_PRESENT bit during pmdp_invalidate. In order
      to handle this special case we add a new pte bit, _PAGE_INVALID
      (mapped to _RPAGE_SW0). This bit is only used with _PAGE_PRESENT
      cleared, so we are not really losing a pte bit for this special
      case. pmd_present is also updated to look at _PAGE_INVALID.
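      
      A sketch of the resulting check (pmd_raw()/cpu_to_be64() follow
      the book3s64 style of reading the raw big-endian pte value):
      
      	/* Sketch: present if _PAGE_PRESENT is set, or if the pmd was
      	 * temporarily invalidated during a THP split (_PAGE_INVALID). */
      	static inline int pmd_present(pmd_t pmd)
      	{
      		return !!(pmd_raw(pmd) &
      			  cpu_to_be64(_PAGE_PRESENT | _PAGE_INVALID));
      	}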
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  23. 03 Jun 2018, 4 commits
  24. 16 Jan 2018, 1 commit
    • powerpc/mm: extend _PAGE_PRIVILEGED to all CPUs · 812fadcb
      Authored by Christophe Leroy
      Commit ac29c640 ("powerpc/mm: Replace _PAGE_USER with
      _PAGE_PRIVILEGED") introduced _PAGE_PRIVILEGED for BOOK3S/64.
      
      This patch generalises _PAGE_PRIVILEGED to all CPUs, allowing each
      platform to have either _PAGE_PRIVILEGED or _PAGE_USER, or both.
      
      PPC_8xx has a _PAGE_SHARED flag which is set for, and only for,
      all non-user pages. Let's rename it _PAGE_PRIVILEGED to remove
      confusion, as it has nothing to do with Linux shared pages.
      
      On BookE, there's a _PAGE_BAP_SR bit which has to be set for
      kernel pages; defining _PAGE_PRIVILEGED as _PAGE_BAP_SR makes
      this generic.
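      
      As a sketch, the per-platform definitions can look like this (the
      8xx bit value below is an illustrative placeholder, not the real
      definition):
      
      	/* Sketch: each platform expresses _PAGE_PRIVILEGED with what
      	 * it already has; the numeric value is a placeholder. */
      	#if defined(CONFIG_PPC_8xx)
      	#define _PAGE_PRIVILEGED	0x0020	/* formerly _PAGE_SHARED */
      	#elif defined(CONFIG_BOOKE)
      	#define _PAGE_PRIVILEGED	_PAGE_BAP_SR	/* supervisor read */
      	#endif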
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  25. 17 Feb 2017, 1 commit
  26. 28 Nov 2016, 1 commit
  27. 13 Sep 2016, 1 commit
  28. 01 Aug 2016, 1 commit
  29. 11 May 2016, 1 commit
  30. 01 May 2016, 3 commits
    • powerpc/mm: Drop WIMG in favour of new constants · 30bda41a
      Authored by Aneesh Kumar K.V
      PowerISA 3.0 introduces two pte bits with the below meaning for radix:
        00 -> Normal Memory
        01 -> Strong Access Order (SAO)
        10 -> Non idempotent I/O (Cache inhibited and guarded)
        11 -> Tolerant I/O (Cache inhibited)
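      
      As a sketch, the new Linux pte constants for that encoding look
      roughly like this (values recalled from the book3s64 headers of
      this era; treat them as approximate):
      
      	/* Sketch: PowerISA 3.0 cache-attribute encoding as pte flags.
      	 * Normal memory (00) needs no flag. */
      	#define _PAGE_SAO		0x00010	/* 01: strong access order */
      	#define _PAGE_NON_IDEMPOTENT	0x00020	/* 10: non-idempotent I/O */
      	#define _PAGE_TOLERANT		0x00030	/* 11: tolerant I/O */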
      
      We drop the existing WIMG bits in the Linux page table in favour of
      the above constants. We lose _PAGE_WRITETHRU with this conversion;
      we only use write-through via pgprot_cached_wthru(), which is used
      by fbdev/controlfb.c (the Apple 'control' display driver, PPC32
      only).
      
      With respect to _PAGE_COHERENCE, we have been marking hpte always
      coherent for some time now. htab_convert_pte_flags() always added
      HPTE_R_M.
      
      NOTE: KVM changes need closer review.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED · ac29c640
      Authored by Aneesh Kumar K.V
      _PAGE_PRIVILEGED means the page can be accessed only by the kernel.
      This is done to keep the pte bits similar to the PowerISA 3.0 Radix
      PTE format. User pages are now marked by clearing the
      _PAGE_PRIVILEGED bit.
      
      Previously we allowed the kernel to have a privileged page in the
      lower address range (USER_REGION). With this patch such access is
      denied.
      
      We also prevent kernel access to a non-privileged page in the
      higher address range (i.e., REGION_ID != 0).
      
      Both the above access scenarios should never happen.
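      
      A minimal sketch of the resulting convention (the helper is
      illustrative, not part of the patch):
      
      	/* Sketch: with _PAGE_USER gone, a user page is simply one
      	 * with _PAGE_PRIVILEGED clear. */
      	static inline bool pte_user(pte_t pte)
      	{
      		return !(pte_val(pte) & _PAGE_PRIVILEGED);
      	}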
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jeremy Kerr <jk@ozlabs.org>
      Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Acked-by: Ian Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: Use _PAGE_READ to indicate Read access · c7d54842
      Authored by Aneesh Kumar K.V
      This splits the _PAGE_RW bit into _PAGE_READ and _PAGE_WRITE. It
      also removes the dependency on _PAGE_USER for implying read-only.
      One thing to note here is that read is implied by write and execute
      permission, hence we should always find _PAGE_READ set on a hash
      pte fault.
      
      We still can't switch PROT_NONE to !(_PAGE_RWX). Autonuma depends
      on marking a prot-none pte _PAGE_WRITE. (For more details look at
      b191f9b1 "mm: numa: preserve PTE write permissions across a NUMA
      hinting fault".)
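      
      A sketch of the split (bit values are illustrative, not the exact
      definitions):
      
      	/* Sketch: _PAGE_RW becomes two independent bits; since read is
      	 * implied by write and execute, _PAGE_READ is always set on a
      	 * hash pte fault. */
      	#define _PAGE_WRITE	0x00002		/* illustrative value */
      	#define _PAGE_READ	0x00004		/* illustrative value */
      	#define _PAGE_RW	(_PAGE_READ | _PAGE_WRITE)
      	#define _PAGE_RWX	(_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC)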
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jeremy Kerr <jk@ozlabs.org>
      Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Acked-by: Ian Munsie <imunsie@au1.ibm.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>