1. 27 8月, 2020 4 次提交
    • A
      powerpc/perf: Fix reading of MSR[HV/PR] bits in trace-imc · 82715a0f
      Athira Rajeev 提交于
      IMC trace-mode uses MSR[HV/PR] bits to set the cpumode for the
      instruction pointer captured in each sample. The bits are fetched from
      the third double word of the trace record. Reading third double word
      from IMC trace record should use be64_to_cpu() along with READ_ONCE
      inorder to fetch correct MSR[HV/PR] bits. Patch addresses this change.
      
      Currently we are using PERF_RECORD_MISC_HYPERVISOR as cpumode if MSR
      HV is 1 and PR is 0 which means the address is from host counter. But
      using PERF_RECORD_MISC_HYPERVISOR for host counter data will fail to
      resolve the address -> symbol during "perf report" because perf tools
      side uses PERF_RECORD_MISC_KERNEL to represent the host counter data.
      Therefore, fix the trace imc sample data to use
      PERF_RECORD_MISC_KERNEL as cpumode for host kernel information.
      
      Fixes: 77ca3951 ("powerpc/perf: Add kernel support for new MSR[HV PR] bits in trace-imc")
      Signed-off-by: NAthira Rajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/1598424029-1662-1-git-send-email-atrajeev@linux.vnet.ibm.com
      82715a0f
    • A
      powerpc/perf: Fix crashes with generic_compat_pmu & BHRB · b460b512
      Alexey Kardashevskiy 提交于
      The bhrb_filter_map ("The Branch History Rolling Buffer") callback is
      only defined in raw CPUs' power_pmu structs. The "architected" CPUs
      use generic_compat_pmu, which does not have this callback, and crashes
      occur if a user tries to enable branch stack for an event.
      
      This add a NULL pointer check for bhrb_filter_map() which behaves as
      if the callback returned an error.
      
      This does not add the same check for config_bhrb() as the only caller
      checks for cpuhw->bhrb_users which remains zero if bhrb_filter_map==0.
      
      Fixes: be80e758 ("powerpc/perf: Add generic compat mode pmu driver")
      Cc: stable@vger.kernel.org # v5.2+
      Signed-off-by: NAlexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: NMadhavan Srinivasan <maddy@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200602025612.62707-1-aik@ozlabs.ru
      b460b512
    • M
      powerpc/64s: Fix crash in load_fp_state() due to fpexc_mode · b91eb518
      Michael Ellerman 提交于
      The recent commit 01eb0187 ("powerpc/64s: Fix restore_math
      unnecessarily changing MSR") changed some of the handling of floating
      point/vector restore.
      
      In particular it caused current->thread.fpexc_mode to be copied into
      the current MSR (via msr_check_and_set()), rather than just into
      regs->msr (which is moved into MSR on return to userspace).
      
      This can lead to a crash in the kernel if we take a floating point
      exception when restoring FPSCR:
      
        Oops: Exception in kernel mode, sig: 8 [#1]
        LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
        Modules linked in:
        CPU: 3 PID: 101213 Comm: ld64.so.2 Not tainted 5.9.0-rc1-00098-g18445bf4-dirty #9
        NIP:  c00000000000fbb4 LR: c00000000001a7ac CTR: c000000000183570
        REGS: c0000016b7cfb3b0 TRAP: 0700   Not tainted  (5.9.0-rc1-00098-g18445bf4-dirty)
        MSR:  900000000290b933 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44002444  XER: 00000000
        CFAR: c00000000001a7a8 IRQMASK: 1
        GPR00: c00000000001ae40 c0000016b7cfb640 c0000000011b7f00 c000001542a0f740
        GPR04: c000001542a0f720 c000001542a0eb00 0000000000000900 c000001542a0eb00
        GPR08: 000000000000000a 0000000000002000 9000000000009033 0000000000000000
        GPR12: 0000000000004000 c0000017ffffd900 0000000000000001 c000000000df5a58
        GPR16: c000000000e19c18 c0000000010e1123 0000000000000001 c000000000e1a638
        GPR20: 0000000000000000 c0000000044b1d00 0000000000000000 c000001542a0f2a0
        GPR24: 00000016c7fe0000 c000001542a0f720 c000000001c93da0 c000000000fe5f28
        GPR28: c000001542a0f720 0000000000800000 c0000016b7cfbe90 0000000002802900
        NIP load_fp_state+0x4/0x214
        LR  restore_math+0x17c/0x1f0
        Call Trace:
          0xc0000016b7cfb680 (unreliable)
          __switch_to+0x330/0x460
          __schedule+0x318/0x920
          schedule+0x74/0x140
          schedule_timeout+0x318/0x3f0
          wait_for_completion+0xc8/0x210
          call_usermodehelper_exec+0x234/0x280
          do_coredump+0xedc/0x13c0
          get_signal+0x1d4/0xbe0
          do_notify_resume+0x1a0/0x490
          interrupt_exit_user_prepare+0x1c4/0x230
          interrupt_return+0x14/0x1c0
        Instruction dump:
        ebe10168 e88101a0 7c8ff120 382101e0 e8010010 7c0803a6 4e800020 790605c4
        782905c4 7c0008a8 7c0008a8 c8030200 <fffe058e> 48000088 c8030000 c8230010
      
      Fix it by only loading the fpexc_mode value into regs->msr.
      
      Also add a comment to explain that although VSX is subject to the
      value of fpexc_mode, we don't have to handle that separately because
      we only allow VSX to be enabled if FP is also enabled.
      
      Fixes: 01eb0187 ("powerpc/64s: Fix restore_math unnecessarily changing MSR")
      Reported-by: NMilton Miller <miltonm@us.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: NNicholas Piggin <npiggin@gmail.com>
      Link: https://lore.kernel.org/r/20200825093424.3967813-1-mpe@ellerman.id.au
      b91eb518
    • N
      powerpc/64s: scv entry should set PPR · e5fe5609
      Nicholas Piggin 提交于
      Kernel entry sets PPR to HMT_MEDIUM by convention. The scv entry
      path missed this.
      
      Fixes: 7fa95f9a ("powerpc/64s: system call support for scv/rfscv instructions")
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200825075309.224184-1-npiggin@gmail.com
      e5fe5609
  2. 24 8月, 2020 2 次提交
  3. 21 8月, 2020 2 次提交
  4. 20 8月, 2020 3 次提交
  5. 18 8月, 2020 3 次提交
    • M
      powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death · 801980f6
      Michael Roth 提交于
      For a power9 KVM guest with XIVE enabled, running a test loop
      where we hotplug 384 vcpus and then unplug them, the following traces
      can be seen (generally within a few loops) either from the unplugged
      vcpu:
      
        cpu 65 (hwid 65) Ready to die...
        Querying DEAD? cpu 66 (66) shows 2
        list_del corruption. next->prev should be c00a000002470208, but was c00a000002470048
        ------------[ cut here ]------------
        kernel BUG at lib/list_debug.c:56!
        Oops: Exception in kernel mode, sig: 5 [#1]
        LE SMP NR_CPUS=2048 NUMA pSeries
        Modules linked in: fuse nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 ...
        CPU: 66 PID: 0 Comm: swapper/66 Kdump: loaded Not tainted 4.18.0-221.el8.ppc64le #1
        NIP:  c0000000007ab50c LR: c0000000007ab508 CTR: 00000000000003ac
        REGS: c0000009e5a17840 TRAP: 0700   Not tainted  (4.18.0-221.el8.ppc64le)
        MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28000842  XER: 20040000
        ...
        NIP __list_del_entry_valid+0xac/0x100
        LR  __list_del_entry_valid+0xa8/0x100
        Call Trace:
          __list_del_entry_valid+0xa8/0x100 (unreliable)
          free_pcppages_bulk+0x1f8/0x940
          free_unref_page+0xd0/0x100
          xive_spapr_cleanup_queue+0x148/0x1b0
          xive_teardown_cpu+0x1bc/0x240
          pseries_mach_cpu_die+0x78/0x2f0
          cpu_die+0x48/0x70
          arch_cpu_idle_dead+0x20/0x40
          do_idle+0x2f4/0x4c0
          cpu_startup_entry+0x38/0x40
          start_secondary+0x7bc/0x8f0
          start_secondary_prolog+0x10/0x14
      
      or on the worker thread handling the unplug:
      
        pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
        Querying DEAD? cpu 314 (314) shows 2
        BUG: Bad page state in process kworker/u768:3  pfn:95de1
        cpu 314 (hwid 314) Ready to die...
        page:c00a000002577840 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
        flags: 0x5ffffc00000000()
        raw: 005ffffc00000000 5deadbeef0000100 5deadbeef0000200 0000000000000000
        raw: 0000000000000000 0000000000000000 00000000ffffff7f 0000000000000000
        page dumped because: nonzero mapcount
        Modules linked in: kvm xt_CHECKSUM ipt_MASQUERADE xt_conntrack ...
        CPU: 0 PID: 548 Comm: kworker/u768:3 Kdump: loaded Not tainted 4.18.0-224.el8.bz1856588.ppc64le #1
        Workqueue: pseries hotplug workque pseries_hp_work_fn
        Call Trace:
          dump_stack+0xb0/0xf4 (unreliable)
          bad_page+0x12c/0x1b0
          free_pcppages_bulk+0x5bc/0x940
          page_alloc_cpu_dead+0x118/0x120
          cpuhp_invoke_callback.constprop.5+0xb8/0x760
          _cpu_down+0x188/0x340
          cpu_down+0x5c/0xa0
          cpu_subsys_offline+0x24/0x40
          device_offline+0xf0/0x130
          dlpar_offline_cpu+0x1c4/0x2a0
          dlpar_cpu_remove+0xb8/0x190
          dlpar_cpu_remove_by_index+0x12c/0x150
          dlpar_cpu+0x94/0x800
          pseries_hp_work_fn+0x128/0x1e0
          process_one_work+0x304/0x5d0
          worker_thread+0xcc/0x7a0
          kthread+0x1ac/0x1c0
          ret_from_kernel_thread+0x5c/0x80
      
      The latter trace is due to the following sequence:
      
        page_alloc_cpu_dead
          drain_pages
            drain_pages_zone
              free_pcppages_bulk
      
      where drain_pages() in this case is called under the assumption that
      the unplugged cpu is no longer executing. To ensure that is the case,
      and early call is made to __cpu_die()->pseries_cpu_die(), which runs a
      loop that waits for the cpu to reach a halted state by polling its
      status via query-cpu-stopped-state RTAS calls. It only polls for 25
      iterations before giving up, however, and in the trace above this
      results in the following being printed only .1 seconds after the
      hotplug worker thread begins processing the unplug request:
      
        pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
        Querying DEAD? cpu 314 (314) shows 2
      
      At that point the worker thread assumes the unplugged CPU is in some
      unknown/dead state and procedes with the cleanup, causing the race
      with the XIVE cleanup code executed by the unplugged CPU.
      
      Fix this by waiting indefinitely, but also making an effort to avoid
      spurious lockup messages by allowing for rescheduling after polling
      the CPU status and printing a warning if we wait for longer than 120s.
      
      Fixes: eac1e731 ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
      Suggested-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NMichael Roth <mdroth@linux.vnet.ibm.com>
      Tested-by: NGreg Kurz <groug@kaod.org>
      Reviewed-by: NThiago Jung Bauermann <bauerman@linux.ibm.com>
      Reviewed-by: NGreg Kurz <groug@kaod.org>
      [mpe: Trim oopses in change log slightly for readability]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200811161544.10513-1-mdroth@linux.vnet.ibm.com
      801980f6
    • C
      powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined · 7bee31ad
      Christophe Leroy 提交于
      When MODULES_VADDR is defined, is_module_segment() shall check the
      address against it instead of checking agains VMALLOC_START.
      
      Fixes: 6ca05532 ("powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX")
      Signed-off-by: NChristophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/07884ed033c31e074747b7eb8eaa329d15db07ec.1596641219.git.christophe.leroy@csgroup.eu
      7bee31ad
    • C
      powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32 · 48d2f040
      Christophe Leroy 提交于
      On BOOK3S_32, when we have modules and strict kernel RWX, modules
      are not in vmalloc space but in a dedicated segment that is
      below PAGE_OFFSET.
      
      So KASAN_SHADOW_START must take it into account.
      
      MODULES_VADDR can't be used because it is not defined yet
      in kasan.h
      
      Fixes: 6ca05532 ("powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX")
      Signed-off-by: NChristophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/6eddca2d5611fd57312a88eae31278c87a8fc99d.1596641224.git.christophe.leroy@csgroup.eu
      48d2f040
  6. 17 8月, 2020 7 次提交
  7. 15 8月, 2020 2 次提交
    • K
      iomap: constify ioreadX() iomem argument (as in generic implementation) · 8f28ca6b
      Krzysztof Kozlowski 提交于
      Patch series "iomap: Constify ioreadX() iomem argument", v3.
      
      The ioread8/16/32() and others have inconsistent interface among the
      architectures: some taking address as const, some not.
      
      It seems there is nothing really stopping all of them to take pointer to
      const.
      
      This patch (of 4):
      
      The ioreadX() and ioreadX_rep() helpers have inconsistent interface.  On
      some architectures void *__iomem address argument is a pointer to const,
      on some not.
      
      Implementations of ioreadX() do not modify the memory under the address so
      they can be converted to a "const" version for const-safety and
      consistency among architectures.
      
      [krzk@kernel.org: sh: clk: fix assignment from incompatible pointer type for ioreadX()]
        Link: http://lkml.kernel.org/r/20200723082017.24053-1-krzk@kernel.org
      [akpm@linux-foundation.org: fix drivers/mailbox/bcm-pdc-mailbox.c]
        Link: http://lkml.kernel.org/r/202007132209.Rxmv4QyS%25lkp@intel.comSuggested-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NKrzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NGeert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Kalle Valo <kvalo@codeaurora.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Jon Mason <jdmason@kudzu.us>
      Cc: Allen Hubbe <allenbh@gmail.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Link: http://lkml.kernel.org/r/20200709072837.5869-1-krzk@kernel.org
      Link: http://lkml.kernel.org/r/20200709072837.5869-2-krzk@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8f28ca6b
    • X
      all arch: remove system call sys_sysctl · 88db0aa2
      Xiaoming Ni 提交于
      Since commit 61a47c1a ("sysctl: Remove the sysctl system call"),
      sys_sysctl is actually unavailable: any input can only return an error.
      
      We have been warning about people using the sysctl system call for years
      and believe there are no more users.  Even if there are users of this
      interface if they have not complained or fixed their code by now they
      probably are not going to, so there is no point in warning them any
      longer.
      
      So completely remove sys_sysctl on all architectures.
      
      [nixiaoming@huawei.com: s390: fix build error for sys_call_table_emu]
       Link: http://lkml.kernel.org/r/20200618141426.16884-1-nixiaoming@huawei.comSigned-off-by: NXiaoming Ni <nixiaoming@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Acked-by: Will Deacon <will@kernel.org>		[arm/arm64]
      Acked-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Aleksa Sarai <cyphar@cyphar.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bin Meng <bin.meng@windriver.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: chenzefeng <chenzefeng2@huawei.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christian Brauner <christian@brauner.io>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Diego Elio Pettenò <flameeyes@flameeyes.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kars de Jong <jongk@linux-m68k.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Krzysztof Kozlowski <krzk@kernel.org>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Marco Elver <elver@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miklos Szeredi <mszeredi@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Nick Piggin <npiggin@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: Paul Burton <paulburton@kernel.org>
      Cc: "Paul E. McKenney" <paulmck@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Sami Tolvanen <samitolvanen@google.com>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Sven Schnelle <svens@stackframe.org>
      Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Zhou Yanjie <zhouyanjie@wanyeetech.com>
      Link: http://lkml.kernel.org/r/20200616030734.87257-1-nixiaoming@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      88db0aa2
  8. 13 8月, 2020 4 次提交
    • P
      mm: clean up the last pieces of page fault accountings · a2beb5f1
      Peter Xu 提交于
      Here're the last pieces of page fault accounting that were still done
      outside handle_mm_fault() where we still have regs==NULL when calling
      handle_mm_fault():
      
      arch/powerpc/mm/copro_fault.c:   copro_handle_mm_fault
      arch/sparc/mm/fault_32.c:        force_user_fault
      arch/um/kernel/trap.c:           handle_page_fault
      mm/gup.c:                        faultin_page
                                       fixup_user_fault
      mm/hmm.c:                        hmm_vma_fault
      mm/ksm.c:                        break_ksm
      
      Some of them has the issue of duplicated accounting for page fault
      retries.  Some of them didn't do the accounting at all.
      
      This patch cleans all these up by letting handle_mm_fault() to do per-task
      page fault accounting even if regs==NULL (though we'll still skip the perf
      event accountings).  With that, we can safely remove all the outliers now.
      
      There's another functional change in that now we account the page faults
      to the caller of gup, rather than the task_struct that passed into the gup
      code.  More information of this can be found at [1].
      
      After this patch, below things should never be touched again outside
      handle_mm_fault():
      
        - task_struct.[maj|min]_flt
        - PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]
      
      [1] https://lore.kernel.org/lkml/CAHk-=wj_V2Tps2QrMn20_W0OJF9xqNh52XSGA42s-ZJ8Y+GyKw@mail.gmail.com/Signed-off-by: NPeter Xu <peterx@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200707225021.200906-25-peterx@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a2beb5f1
    • P
      mm/powerpc: use general page fault accounting · 428fdc09
      Peter Xu 提交于
      Use the general page fault accounting by passing regs into
      handle_mm_fault().
      Signed-off-by: NPeter Xu <peterx@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Acked-by: NMichael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Link: http://lkml.kernel.org/r/20200707225021.200906-17-peterx@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      428fdc09
    • P
      mm: do page fault accounting in handle_mm_fault · bce617ed
      Peter Xu 提交于
      Patch series "mm: Page fault accounting cleanups", v5.
      
      This is v5 of the pf accounting cleanup series.  It originates from Gerald
      Schaefer's report on an issue a week ago regarding to incorrect page fault
      accountings for retried page fault after commit 4064b982 ("mm: allow
      VM_FAULT_RETRY for multiple times"):
      
        https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/
      
      What this series did:
      
        - Correct page fault accounting: we do accounting for a page fault
          (no matter whether it's from #PF handling, or gup, or anything else)
          only with the one that completed the fault.  For example, page fault
          retries should not be counted in page fault counters.  Same to the
          perf events.
      
        - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
          event is used in an adhoc way across different archs.
      
          Case (1): for many archs it's done at the entry of a page fault
          handler, so that it will also cover e.g.  errornous faults.
      
          Case (2): for some other archs, it is only accounted when the page
          fault is resolved successfully.
      
          Case (3): there're still quite some archs that have not enabled
          this perf event.
      
          Since this series will touch merely all the archs, we unify this
          perf event to always follow case (1), which is the one that makes most
          sense.  And since we moved the accounting into handle_mm_fault, the
          other two MAJ/MIN perf events are well taken care of naturally.
      
        - Unify definition of "major faults": the definition of "major
          fault" is slightly changed when used in accounting (not
          VM_FAULT_MAJOR).  More information in patch 1.
      
        - Always account the page fault onto the one that triggered the page
          fault.  This does not matter much for #PF handlings, but mostly for
          gup.  More information on this in patch 25.
      
      Patchset layout:
      
      Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
      Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
      Patch 24:    Enable the new accounting for the rest outliers (gup, iommu, etc.)
      Patch 25:    Cleanup GUP task_struct pointer since it's not needed any more
      
      This patch (of 25):
      
      This is a preparation patch to move page fault accountings into the
      general code in handle_mm_fault().  This includes both the per task
      flt_maj/flt_min counters, and the major/minor page fault perf events.  To
      do this, the pt_regs pointer is passed into handle_mm_fault().
      
      PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
      handlers.
      
      So far, all the pt_regs pointer that passed into handle_mm_fault() is
      NULL, which means this patch should have no intented functional change.
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NPeter Xu <peterx@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Cain <bcain@codeaurora.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
      Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bce617ed
    • C
      uaccess: remove segment_eq · 428e2976
      Christoph Hellwig 提交于
      segment_eq is only used to implement uaccess_kernel.  Just open code
      uaccess_kernel in the arch uaccess headers and remove one layer of
      indirection.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Acked-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Acked-by: NGreentime Hu <green.hu@gmail.com>
      Acked-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Vincent Chen <deanbo422@gmail.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Link: http://lkml.kernel.org/r/20200710135706.537715-5-hch@lst.deSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      428e2976
  9. 10 8月, 2020 2 次提交
    • A
      powerpc/pkeys: Fix boot failures with Nemo board (A-EON AmigaOne X1000) · 6553fb79
      Aneesh Kumar K.V 提交于
      On p6 and before we should avoid updating UAMOR SPRN. This resulted
      in boot failure on Nemo board.
      
      Fixes: 269e829f ("powerpc/book3s64/pkey: Disable pkey on POWER6 and before")
      Reported-by: NChristian Zigotzky <chzigotzky@xenosoft.de>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200810102623.685083-1-aneesh.kumar@linux.ibm.com
      6553fb79
    • M
      kbuild: introduce ccflags-remove-y and asflags-remove-y · 15d5761a
      Masahiro Yamada 提交于
      CFLAGS_REMOVE_<file>.o filters out flags when compiling a particular
      object, but there is no convenient way to do that for every object in
      a directory.
      
      Add ccflags-remove-y and asflags-remove-y to make it easily.
      
      Use ccflags-remove-y to clean up some Makefiles.
      
      The add/remove order works as follows:
      
       [1] KBUILD_CFLAGS specifies compiler flags used globally
      
       [2] ccflags-y adds compiler flags for all objects in the
           current Makefile
      
       [3] ccflags-remove-y removes compiler flags for all objects in the
           current Makefile (New feature)
      
       [4] CFLAGS_<file> adds compiler flags per file.
      
       [5] CFLAGS_REMOVE_<file> removes compiler flags per file.
      
      Having [3] before [4] allows us to remove flags from most (but not all)
      objects in the current Makefile.
      
      For example, kernel/trace/Makefile removes $(CC_FLAGS_FTRACE)
      from all objects in the directory, then adds it back to
      trace_selftest_dynamic.o and CFLAGS_trace_kprobe_selftest.o
      
      The same applies to lib/livepatch/Makefile.
      
      Please note ccflags-remove-y has no effect to the sub-directories.
      In contrast, the previous notation got rid of compiler flags also from
      all the sub-directories.
      
      The following are not affected because they have no sub-directories:
      
        arch/arm/boot/compressed/
        arch/powerpc/xmon/
        arch/sh/
        kernel/trace/
      
      However, lib/ has several sub-directories.
      
      To keep the behavior, I added ccflags-remove-y to all Makefiles
      in subdirectories of lib/, except the following:
      
        lib/vdso/Makefile        - Kbuild does not descend into this Makefile
        lib/raid/test/Makefile   - This is not used for the kernel build
      
      I think commit 2464a609 ("ftrace: do not trace library functions")
      excluded too much. In the next commit, I will remove ccflags-remove-y
      from the sub-directories of lib/.
      Suggested-by: NSami Tolvanen <samitolvanen@google.com>
      Signed-off-by: NMasahiro Yamada <masahiroy@kernel.org>
      Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Acked-by: Brendan Higgins <brendanhiggins@google.com> (KUnit)
      Tested-by: NAnders Roxell <anders.roxell@linaro.org>
      15d5761a
  10. 08 8月, 2020 4 次提交
    • M
      powerpc/ptrace: Fix build error in pkey_get() · 7b9de977
      Michael Ellerman 提交于
      The merge resolution in commit 25d8d4ee left ret no longer used,
      leading to:
      
        arch/powerpc/kernel/ptrace/ptrace-view.c: In function ‘pkey_get’:
        arch/powerpc/kernel/ptrace/ptrace-view.c:473:6: error: unused variable ‘ret’
          473 |  int ret;
      
      Fix it by removing ret.
      
      Fixes: 25d8d4ee ("Merge tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux")
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7b9de977
    • M
      mm/sparse: cleanup the code surrounding memory_present() · c89ab04f
      Mike Rapoport 提交于
      After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP we have two equivalent
      functions that call memory_present() for each region in memblock.memory:
      sparse_memory_present_with_active_regions() and membocks_present().
      
      Moreover, all architectures have a call to either of these functions
      preceding the call to sparse_init() and in the most cases they are called
      one after the other.
      
      Mark the regions from memblock.memory as present during sparce_init() by
      making sparse_init() call memblocks_present(), make memblocks_present()
      and memory_present() functions static and remove redundant
      sparse_memory_present_with_active_regions() function.
      
      Also remove no longer required HAVE_MEMORY_PRESENT configuration option.
      Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20200712083130.22919-1-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c89ab04f
    • A
      mm/sparsemem: enable vmem_altmap support in vmemmap_alloc_block_buf() · 56993b4e
      Anshuman Khandual 提交于
      There are many instances where vmemap allocation is often switched between
      regular memory and device memory just based on whether altmap is available
      or not.  vmemmap_alloc_block_buf() is used in various platforms to
      allocate vmemmap mappings.  Lets also enable it to handle altmap based
      device memory allocation along with existing regular memory allocations.
      This will help in avoiding the altmap based allocation switch in many
      places.  To summarize there are two different methods to call
      vmemmap_alloc_block_buf().
      
      vmemmap_alloc_block_buf(size, node, NULL)   /* Allocate from system RAM */
      vmemmap_alloc_block_buf(size, node, altmap) /* Allocate from altmap */
      
      This converts altmap_alloc_block_buf() into a static function, drops it's
      entry from the header and updates Documentation/vm/memory-model.rst.
      Suggested-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NAnshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Tested-by: NJia He <justin.he@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Will Deacon <will@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
      Cc: Steve Capper <steve.capper@arm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Link: http://lkml.kernel.org/r/1594004178-8861-3-git-send-email-anshuman.khandual@arm.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      56993b4e
    • M
      mm: remove unneeded includes of <asm/pgalloc.h> · ca15ca40
      Mike Rapoport 提交于
      Patch series "mm: cleanup usage of <asm/pgalloc.h>"
      
      Most architectures have very similar versions of pXd_alloc_one() and
      pXd_free_one() for intermediate levels of page table.  These patches add
      generic versions of these functions in <asm-generic/pgalloc.h> and enable
      use of the generic functions where appropriate.
      
      In addition, functions declared and defined in <asm/pgalloc.h> headers are
      used mostly by core mm and early mm initialization in arch and there is no
      actual reason to have the <asm/pgalloc.h> included all over the place.
      The first patch in this series removes unneeded includes of
      <asm/pgalloc.h>
      
      In the end it didn't work out as neatly as I hoped and moving
      pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
      unnecessary changes to arches that have custom page table allocations, so
      I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
      to mm/.
      
      This patch (of 8):
      
      In most cases <asm/pgalloc.h> header is required only for allocations of
      page table memory.  Most of the .c files that include that header do not
      use symbols declared in <asm/pgalloc.h> and do not require that header.
      
      As for the other header files that used to include <asm/pgalloc.h>, it is
      possible to move that include into the .c file that actually uses symbols
      from <asm/pgalloc.h> and drop the include from the header file.
      
      The process was somewhat automated using
      
      	sed -i -E '/[<"]asm\/pgalloc\.h/d' \
                      $(grep -L -w -f /tmp/xx \
                              $(git grep -E -l '[<"]asm/pgalloc\.h'))
      
      where /tmp/xx contains all the symbols defined in
      arch/*/include/asm/pgalloc.h.
      
      [rppt@linux.ibm.com: fix powerpc warning]
      Signed-off-by: NMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NPekka Enberg <penberg@kernel.org>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
      Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ca15ca40
  11. 04 8月, 2020 1 次提交
    • M
      powerpc: Fix circular dependency between percpu.h and mmu.h · 0c83b277
      Michael Ellerman 提交于
      Recently random.h started including percpu.h (see commit
      f227e3ec ("random32: update the net random state on interrupt and
      activity")), which broke corenet64_smp_defconfig:
      
        In file included from /linux/arch/powerpc/include/asm/paca.h:18,
                         from /linux/arch/powerpc/include/asm/percpu.h:13,
                         from /linux/include/linux/random.h:14,
                         from /linux/lib/uuid.c:14:
        /linux/arch/powerpc/include/asm/mmu.h:139:22: error: unknown type name 'next_tlbcam_idx'
          139 | DECLARE_PER_CPU(int, next_tlbcam_idx);
      
      This is due to a circular header dependency:
        asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which
        includes asm/mmu.h
      
      Which means DECLARE_PER_CPU() isn't defined when mmu.h needs it.
      
      We can fix it by moving the include of paca.h below the include of
      asm-generic/percpu.h.
      
      This moves the include of paca.h out of the #ifdef __powerpc64__, but
      that is OK because paca.h is almost entirely inside #ifdef
      CONFIG_PPC64 anyway.
      
      It also moves the include of paca.h out of the #ifdef CONFIG_SMP,
      which could possibly break something, but seems to have no ill
      effects.
      
      Fixes: f227e3ec ("random32: update the net random state on interrupt and activity")
      Cc: stable@vger.kernel.org # v5.8
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200804130558.292328-1-mpe@ellerman.id.au
      0c83b277
  12. 03 8月, 2020 2 次提交
  13. 31 7月, 2020 2 次提交
    • V
      powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric · af0870c4
      Vaibhav Jain 提交于
      We add support for reporting 'fuel-gauge' NVDIMM metric via
      PAPR_PDSM_HEALTH pdsm payload. 'fuel-gauge' metric indicates the usage
      life remaining of a papr-scm compatible NVDIMM. PHYP exposes this
      metric via the H_SCM_PERFORMANCE_STATS.
      
      The metric value is returned from the pdsm by extending the return
      payload 'struct nd_papr_pdsm_health' without breaking the ABI. A new
      field 'dimm_fuel_gauge' to hold the metric value is introduced at the
      end of the payload struct and its presence is indicated by by
      extension flag PDSM_DIMM_HEALTH_RUN_GAUGE_VALID.
      
      The patch introduces a new function papr_pdsm_fuel_gauge() that is
      called from papr_pdsm_health(). If fetching NVDIMM performance stats
      is supported then 'papr_pdsm_fuel_gauge()' allocated an output buffer
      large enough to hold the performance stat and passes it to
      drc_pmem_query_stats() that issues the HCALL to PHYP. The return value
      of the stat is then populated in the 'struct
      nd_papr_pdsm_health.dimm_fuel_gauge' field with extension flag
      'PDSM_DIMM_HEALTH_RUN_GAUGE_VALID' set in 'struct
      nd_papr_pdsm_health.extension_flags'
      Signed-off-by: NVaibhav Jain <vaibhav@linux.ibm.com>
      Reviewed-by: NAneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200731064153.182203-3-vaibhav@linux.ibm.com
      af0870c4
    • V
      powerpc/papr_scm: Fetch nvdimm performance stats from PHYP · 2d02bf83
      Vaibhav Jain 提交于
      Update papr_scm.c to query dimm performance statistics from PHYP via
      H_SCM_PERFORMANCE_STATS hcall and export them to user-space as PAPR
      specific NVDIMM attribute 'perf_stats' in sysfs. The patch also
      provide a sysfs ABI documentation for the stats being reported and
      their meanings.
      
      During NVDIMM probe time in papr_scm_nvdimm_init() a special variant
      of H_SCM_PERFORMANCE_STATS hcall is issued to check if collection of
      performance statistics is supported or not. If successful then a PHYP
      returns a maximum possible buffer length needed to read all
      performance stats. This returned value is stored in a per-nvdimm
      attribute 'stat_buffer_len'.
      
      The layout of request buffer for reading NVDIMM performance stats from
      PHYP is defined in 'struct papr_scm_perf_stats' and 'struct
      papr_scm_perf_stat'. These structs are used in newly introduced
      drc_pmem_query_stats() that issues the H_SCM_PERFORMANCE_STATS hcall.
      
      The sysfs access function perf_stats_show() uses value
      'stat_buffer_len' to allocate a buffer large enough to hold all
      possible NVDIMM performance stats and passes it to
      drc_pmem_query_stats() to populate. Finally statistics reported in the
      buffer are formatted into the sysfs access function output buffer.
      Signed-off-by: NVaibhav Jain <vaibhav@linux.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200731064153.182203-2-vaibhav@linux.ibm.com
      2d02bf83
  14. 30 7月, 2020 2 次提交