1. 02 5月, 2019 2 次提交
  2. 13 3月, 2019 1 次提交
    • M
      treewide: add checks for the return value of memblock_alloc*() · 8a7f97b9
      Mike Rapoport 提交于
      Add check for the return value of memblock_alloc*() functions and call
      panic() in case of error.  The panic message repeats the one used by
      panicing memblock allocators with adjustment of parameters to include
      only relevant ones.
      
      The replacement was mostly automated with semantic patches like the one
      below with manual massaging of format strings.
      
        @@
        expression ptr, size, align;
        @@
        ptr = memblock_alloc(size, align);
        + if (!ptr)
        + 	panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);
      
      [anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
        Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
      [rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
        Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
      [rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
        Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
      [akpm@linux-foundation.org: fix xtensa printk warning]
      Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: NAnders Roxell <anders.roxell@linaro.org>
      Reviewed-by: Guo Ren <ren_guo@c-sky.com>		[c-sky]
      Acked-by: Paul Burton <paul.burton@mips.com>		[MIPS]
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>	[s390]
      Reviewed-by: Juergen Gross <jgross@suse.com>		[Xen]
      Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Acked-by: Max Filippov <jcmvbkbc@gmail.com>		[xtensa]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a7f97b9
  3. 08 3月, 2019 1 次提交
    • M
      powerpc: prefer memblock APIs returning virtual address · f806714f
      Mike Rapoport 提交于
      Patch series "memblock: simplify several early memory allocation", v4.
      
      These patches simplify some of the early memory allocations by replacing
      usage of older memblock APIs with newer and shinier ones.
      
      Quite a few places in the arch/ code allocated memory using a memblock
      API that returns a physical address of the allocated area, then
      converted this physical address to a virtual one and then used memset(0)
      to clear the allocated range.
      
      More recent memblock APIs do all the three steps in one call and their
      usage simplifies the code.
      
      It's important to note that regardless of API used, the core allocation
      is nearly identical for any set of memblock allocators: first it tries
      to find a free memory with all the constraints specified by the caller
      and then falls back to the allocation with some or all constraints
      disabled.
      
      The first three patches perform the conversion of call sites that have
      exact requirements for the node and the possible memory range.
      
      The fourth patch is a bit one-off as it simplifies openrisc's
      implementation of pte_alloc_one_kernel(), and not only the memblock
      usage.
      
      The fifth patch takes care of simpler cases when the allocation can be
      satisfied with a simple call to memblock_alloc().
      
      The sixth patch removes one-liner wrappers for memblock_alloc on arm and
      unicore32, as suggested by Christoph.
      
      This patch (of 6):
      
      There are a several places that allocate memory using memblock APIs that
      return a physical address, convert the returned address to the virtual
      address and frequently also memset(0) the allocated range.
      
      Update these places to use memblock allocators already returning a
      virtual address.  Use memblock functions that clear the allocated memory
      instead of calling memset(0) where appropriate.
      
      The calls to memblock_alloc_base() that were not followed by memset(0)
      are replaced with memblock_alloc_try_nid_raw().  Since the latter does
      not panic() when the allocation fails, the appropriate panic() calls are
      added to the call sites.
      
      Link: http://lkml.kernel.org/r/1546248566-14910-2-git-send-email-rppt@linux.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f806714f
  4. 06 3月, 2019 1 次提交
  5. 31 1月, 2019 1 次提交
    • A
      powerpc/radix: Fix kernel crash with mremap() · 579b9239
      Aneesh Kumar K.V 提交于
      With support for split pmd lock, we use pmd page pmd_huge_pte pointer
      to store the deposited page table. In those config when we move page
      tables we need to make sure we move the deposited page table to the
      correct pmd page. Otherwise this can result in crash when we withdraw
      of deposited page table because we can find the pmd_huge_pte NULL.
      
      eg:
      
        __split_huge_pmd+0x1070/0x1940
        __split_huge_pmd+0xe34/0x1940 (unreliable)
        vma_adjust_trans_huge+0x110/0x1c0
        __vma_adjust+0x2b4/0x9b0
        __split_vma+0x1b8/0x280
        __do_munmap+0x13c/0x550
        sys_mremap+0x220/0x7e0
        system_call+0x5c/0x70
      
      Fixes: 675d9952 ("powerpc/book3s64: Enable split pmd ptlock.")
      Cc: stable@vger.kernel.org # v4.18+
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      579b9239
  6. 04 12月, 2018 2 次提交
  7. 20 10月, 2018 1 次提交
    • A
      powerpc/mm: Fix WARN_ON with THP NUMA migration · dd0e144a
      Aneesh Kumar K.V 提交于
      WARNING: CPU: 12 PID: 4322 at /arch/powerpc/mm/pgtable-book3s64.c:76 set_pmd_at+0x4c/0x2b0
       Modules linked in:
       CPU: 12 PID: 4322 Comm: qemu-system-ppc Tainted: G        W         4.19.0-rc3-00758-g8f0c636b0542 #36
       NIP:  c0000000000872fc LR: c000000000484eec CTR: 0000000000000000
       REGS: c000003fba876fe0 TRAP: 0700   Tainted: G        W          (4.19.0-rc3-00758-g8f0c636b0542)
       MSR:  900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 24282884  XER: 00000000
       CFAR: c000000000484ee8 IRQMASK: 0
       GPR00: c000000000484eec c000003fba877268 c000000001f0ec00 c000003fbd229f80
       GPR04: 00007c8fe8e00000 c000003f864c5a38 860300853e0000c0 0000000000000080
       GPR08: 0000000080000000 0000000000000001 0401000000000080 0000000000000001
       GPR12: 0000000000002000 c000003fffff5400 c000003fce292000 00007c9024570000
       GPR16: 0000000000000000 0000000000ffffff 0000000000000001 c000000001885950
       GPR20: 0000000000000000 001ffffc0004807c 0000000000000008 c000000001f49d05
       GPR24: 00007c8fe8e00000 c0000000020f2468 ffffffffffffffff c000003fcd33b090
       GPR28: 00007c8fe8e00000 c000003fbd229f80 c000003f864c5a38 860300853e0000c0
       NIP [c0000000000872fc] set_pmd_at+0x4c/0x2b0
       LR [c000000000484eec] do_huge_pmd_numa_page+0xb1c/0xc20
       Call Trace:
       [c000003fba877268] [c00000000045931c] mpol_misplaced+0x1bc/0x230 (unreliable)
       [c000003fba8772c8] [c000000000484eec] do_huge_pmd_numa_page+0xb1c/0xc20
       [c000003fba877398] [c00000000040d344] __handle_mm_fault+0x5e4/0x2300
       [c000003fba8774d8] [c00000000040f400] handle_mm_fault+0x3a0/0x420
       [c000003fba877528] [c0000000003ff6f4] __get_user_pages+0x2e4/0x560
       [c000003fba877628] [c000000000400314] get_user_pages_unlocked+0x104/0x2a0
       [c000003fba8776c8] [c000000000118f44] __gfn_to_pfn_memslot+0x284/0x6a0
       [c000003fba877748] [c0000000001463a0] kvmppc_book3s_radix_page_fault+0x360/0x12d0
       [c000003fba877838] [c000000000142228] kvmppc_book3s_hv_page_fault+0x48/0x1300
       [c000003fba877988] [c00000000013dc08] kvmppc_vcpu_run_hv+0x1808/0x1b50
       [c000003fba877af8] [c000000000126b44] kvmppc_vcpu_run+0x34/0x50
       [c000003fba877b18] [c000000000123268] kvm_arch_vcpu_ioctl_run+0x288/0x2d0
       [c000003fba877b98] [c00000000011253c] kvm_vcpu_ioctl+0x1fc/0x8c0
       [c000003fba877d08] [c0000000004e9b24] do_vfs_ioctl+0xa44/0xae0
       [c000003fba877db8] [c0000000004e9c44] ksys_ioctl+0x84/0xf0
       [c000003fba877e08] [c0000000004e9cd8] sys_ioctl+0x28/0x80
      
      We removed the pte_protnone check earlier with the understanding that we
      mark the pte invalid before the set_pte/set_pmd usage. But the huge pmd
      autonuma still use the set_pmd_at directly. This is ok because a protnone pte
      won't have translation cache in TLB.
      
      Fixes: da7ad366 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit")
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      dd0e144a
  8. 03 10月, 2018 2 次提交
    • A
      powerpc/mm/book3s: Check for pmd_large instead of pmd_trans_huge · ae28f17b
      Aneesh Kumar K.V 提交于
      Update few code paths to check for pmd_large.
      
      set_pmd_at:
      We want to use this to store swap pte at pmd level. For swap ptes we don't want
      to set H_PAGE_THP_HUGE. Hence check for pmd_large in set_pmd_at. This remove
      the false WARN_ON when using this with swap pmd entry.
      
      pmd_page:
      We don't really use them on pmd migration entries. But they can also work with
      migration entries and we don't differentiate at the pte level. Hence update
      pmd_page to work with pmd migration entries too
      
      __find_linux_pte:
      lockless page table walk need to handle pmd migration entries. pmd_trans_huge
      check will return false on them. We don't set thp = 1 for such entries, but
      update hpage_shift correctly. Without this we will walk pmd migration entries
      as a pte page pointer which is wrong.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      ae28f17b
    • A
      powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit · da7ad366
      Aneesh Kumar K.V 提交于
      With this patch we use 0x8000000000000000UL (_PAGE_PRESENT) to indicate a valid
      pgd/pud/pmd entry. We also switch the p**_present() to look at this bit.
      
      With pmd_present, we have a special case. We need to make sure we consider a
      pmd marked invalid during THP split as present. Right now we clear the
      _PAGE_PRESENT bit during a pmdp_invalidate. Inorder to consider this special
      case we add a new pte bit _PAGE_INVALID (mapped to _RPAGE_SW0). This bit is
      only used with _PAGE_PRESENT cleared. Hence we are not really losing a pte bit
      for this special case. pmd_present is also updated to look at _PAGE_INVALID.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      da7ad366
  9. 13 8月, 2018 1 次提交
  10. 07 8月, 2018 1 次提交
  11. 20 6月, 2018 1 次提交
  12. 03 6月, 2018 3 次提交
  13. 15 5月, 2018 7 次提交
  14. 30 3月, 2018 1 次提交
  15. 27 3月, 2018 1 次提交
    • M
      powerpc/mm: Fix section mismatch warning in stop_machine_change_mapping() · bde709a7
      Mauricio Faria de Oliveira 提交于
      Fix the warning messages for stop_machine_change_mapping(), and a number
      of other affected functions in its call chain.
      
      All modified functions are under CONFIG_MEMORY_HOTPLUG, so __meminit
      is okay (keeps them / does not discard them).
      
      Boot-tested on powernv/power9/radix-mmu and pseries/power8/hash-mmu.
      
          $ make -j$(nproc) CONFIG_DEBUG_SECTION_MISMATCH=y vmlinux
          ...
            MODPOST vmlinux.o
          WARNING: vmlinux.o(.text+0x6b130): Section mismatch in reference from the function stop_machine_change_mapping() to the function .meminit.text:create_physical_mapping()
          The function stop_machine_change_mapping() references
          the function __meminit create_physical_mapping().
          This is often because stop_machine_change_mapping lacks a __meminit
          annotation or the annotation of create_physical_mapping is wrong.
      
          WARNING: vmlinux.o(.text+0x6b13c): Section mismatch in reference from the function stop_machine_change_mapping() to the function .meminit.text:create_physical_mapping()
          The function stop_machine_change_mapping() references
          the function __meminit create_physical_mapping().
          This is often because stop_machine_change_mapping lacks a __meminit
          annotation or the annotation of create_physical_mapping is wrong.
          ...
      Signed-off-by: NMauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
      Acked-by: NBalbir Singh <bsingharora@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      bde709a7
  16. 01 2月, 2018 1 次提交
  17. 17 8月, 2017 2 次提交
    • A
      powerpc/mm/cxl: Add the fault handling cpu to mm cpumask · 0f4bc093
      Aneesh Kumar K.V 提交于
      We use mm cpumask for serializing against lockless page table walk.
      Anybody who is doing a lockless page table walk is expected to disable
      irq and only cpus in mm cpumask is expected do the lockless walk. This
      ensure that a THP split can send IPI to only cpus in the mm cpumask,
      to make sure there are no parallel lockless page table walk.
      
      Add the CAPI fault handling cpu to the mm cpumask so that we can do
      the lockless page table walk while inserting hash page table entries.
      Reviewed-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      0f4bc093
    • A
      powerpc/mm: Don't send IPI to all cpus on THP updates · fa4531f7
      Aneesh Kumar K.V 提交于
      Now that we made sure that lockless walk of linux page table is mostly
      limitted to current task(current->mm->pgdir) we can update the THP
      update sequence to only send IPI to CPUs on which this task has run.
      This helps in reducing the IPI overload on systems with large number
      of CPUs.
      
      WRT kvm even though kvm is walking page table with vpc->arch.pgdir,
      it is done only on secondary CPUs and in that case we have primary CPU
      added to task's mm cpumask. Sending an IPI to primary will force the
      secondary to do a vm exit and hence this mm cpumask usage is safe
      here.
      
      WRT CAPI, we still end up walking linux page table with capi context
      MM. For now the pte lookup serialization sends an IPI to all CPUs in
      CPI is in use. We can further improve this by adding the CAPI
      interrupt handling CPU to task mm cpumask. That will be done in a
      later patch.
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      fa4531f7
  18. 02 7月, 2017 1 次提交
  19. 02 3月, 2017 1 次提交
  20. 31 1月, 2017 2 次提交
  21. 17 1月, 2017 1 次提交
  22. 28 11月, 2016 1 次提交
  23. 23 9月, 2016 1 次提交
  24. 13 9月, 2016 1 次提交
  25. 04 8月, 2016 1 次提交
    • M
      powerpc/mm: Move register_process_table() out of ppc_md · eea8148c
      Michael Ellerman 提交于
      We want to initialise register_process_table() before ppc_md is setup,
      so that it can be called as part of MMU init (at least on Radix ATM).
      
      That no longer works because probe_machine() requires that ppc_md be
      empty before it's called, and we now do probe_machine() much later.
      
      So make register_process_table a global for now. It will probably move
      into a mmu_radix_ops struct at some point in the future.
      
      This was broken by me when applying commit 7025776e "powerpc/mm:
      Move hash table ops to a separate structure" due to conflicts with other
      patches.
      
      Fixes: 7025776e ("powerpc/mm: Move hash table ops to a separate structure")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      eea8148c
  26. 01 8月, 2016 1 次提交
  27. 01 6月, 2016 1 次提交