1. 06 2月, 2019 25 次提交
  2. 04 2月, 2019 15 次提交
    • E
      perf/ring_buffer: Convert ring_buffer.aux_refcount to refcount_t · ca3bb3d0
      Elena Reshetova 提交于
      atomic_t variables are currently used to implement reference
      counters with the following properties:
      
       - counter is initialized to 1 using atomic_set()
       - a resource is freed upon counter reaching zero
       - once counter reaches zero, its further
         increments aren't allowed
       - counter schema uses basic atomic operations
         (set, inc, inc_not_zero, dec_and_test, etc.)
      
      Such atomic variables should be converted to a newly provided
      refcount_t type and API that prevents accidental counter overflows
      and underflows. This is important since overflows and underflows
      can lead to use-after-free situation and be exploitable.
      
      The variable ring_buffer.aux_refcount is used as pure reference counter.
      Convert it to refcount_t and fix up the operations.
      
      ** Important note for maintainers:
      
      Some functions from refcount_t API defined in lib/refcount.c
      have different memory ordering guarantees than their atomic
      counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
      for more information.
      
      Normally the differences should not matter since refcount_t provides
      enough guarantees to satisfy the refcounting use cases, but in
      some rare cases it might matter.
      Please double check that you don't have some undocumented
      memory guarantees for this variable usage.
      
      For the ring_buffer.aux_refcount it might make a difference
      in following places:
      
       - perf_aux_output_begin(): increment in refcount_inc_not_zero() only
         guarantees control dependency on success vs. fully ordered
         atomic counterpart
       - rb_free_aux(): decrement in refcount_dec_and_test() only
         provides RELEASE ordering and ACQUIRE ordering + control dependency
         on success vs. fully ordered atomic counterpart
      Suggested-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NElena Reshetova <elena.reshetova@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NDavid Windsor <dwindsor@gmail.com>
      Reviewed-by: NHans Liljestrand <ishkamiel@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: namhyung@kernel.org
      Link: https://lkml.kernel.org/r/1548678448-24458-4-git-send-email-elena.reshetova@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ca3bb3d0
    • E
      perf/ring_buffer: Convert ring_buffer.refcount to refcount_t · fecb8ed2
      Elena Reshetova 提交于
      atomic_t variables are currently used to implement reference
      counters with the following properties:
      
       - counter is initialized to 1 using atomic_set()
       - a resource is freed upon counter reaching zero
       - once counter reaches zero, its further
         increments aren't allowed
       - counter schema uses basic atomic operations
         (set, inc, inc_not_zero, dec_and_test, etc.)
      
      Such atomic variables should be converted to a newly provided
      refcount_t type and API that prevents accidental counter overflows
      and underflows. This is important since overflows and underflows
      can lead to use-after-free situation and be exploitable.
      
      The variable ring_buffer.refcount is used as pure reference counter.
      Convert it to refcount_t and fix up the operations.
      
      ** Important note for maintainers:
      
      Some functions from refcount_t API defined in lib/refcount.c
      have different memory ordering guarantees than their atomic
      counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
      for more information.
      
      Normally the differences should not matter since refcount_t provides
      enough guarantees to satisfy the refcounting use cases, but in
      some rare cases it might matter.
      Please double check that you don't have some undocumented
      memory guarantees for this variable usage.
      
      For the ring_buffer.refcount it might make a difference
      in following places:
      
       - ring_buffer_get(): increment in refcount_inc_not_zero() only
         guarantees control dependency on success vs. fully ordered
         atomic counterpart
       - ring_buffer_put(): decrement in refcount_dec_and_test() only
         provides RELEASE ordering and ACQUIRE ordering + control dependency
         on success vs. fully ordered atomic counterpart
      Suggested-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NElena Reshetova <elena.reshetova@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NDavid Windsor <dwindsor@gmail.com>
      Reviewed-by: NHans Liljestrand <ishkamiel@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: namhyung@kernel.org
      Link: https://lkml.kernel.org/r/1548678448-24458-3-git-send-email-elena.reshetova@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      fecb8ed2
    • E
      perf: Convert perf_event_context.refcount to refcount_t · 8c94abbb
      Elena Reshetova 提交于
      atomic_t variables are currently used to implement reference
      counters with the following properties:
      
       - counter is initialized to 1 using atomic_set()
       - a resource is freed upon counter reaching zero
       - once counter reaches zero, its further
         increments aren't allowed
       - counter schema uses basic atomic operations
         (set, inc, inc_not_zero, dec_and_test, etc.)
      
      Such atomic variables should be converted to a newly provided
      refcount_t type and API that prevents accidental counter overflows
      and underflows. This is important since overflows and underflows
      can lead to use-after-free situation and be exploitable.
      
      The variable perf_event_context.refcount is used as pure reference counter.
      Convert it to refcount_t and fix up the operations.
      
      ** Important note for maintainers:
      
      Some functions from refcount_t API defined in lib/refcount.c
      have different memory ordering guarantees than their atomic
      counterparts. Please check Documentation/core-api/refcount-vs-atomic.rst
      for more information.
      
      Normally the differences should not matter since refcount_t provides
      enough guarantees to satisfy the refcounting use cases, but in
      some rare cases it might matter.
      Please double check that you don't have some undocumented
      memory guarantees for this variable usage.
      
      For the perf_event_context.refcount it might make a difference
      in following places:
      
       - get_ctx(), perf_event_ctx_lock_nested(), perf_lock_task_context()
         and __perf_event_ctx_lock_double(): increment in
         refcount_inc_not_zero() only guarantees control dependency
         on success vs. fully ordered atomic counterpart
       - put_ctx(): decrement in refcount_dec_and_test() provides
         RELEASE ordering and ACQUIRE ordering + control dependency on success
         vs. fully ordered atomic counterpart
      Suggested-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NElena Reshetova <elena.reshetova@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NDavid Windsor <dwindsor@gmail.com>
      Reviewed-by: NHans Liljestrand <ishkamiel@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: namhyung@kernel.org
      Link: https://lkml.kernel.org/r/1548678448-24458-2-git-send-email-elena.reshetova@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8c94abbb
    • T
      perf/uprobes: Convert to SPDX license identifier · 720e596a
      Thomas Gleixner 提交于
      Replace the license boiler plate with a SPDX license identifier.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NPaul McKenney <paulmck@linux.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20190116111308.211981422@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      720e596a
    • T
      perf/hw_breakpoints: Convert to SPDX license identifier · 469eb32e
      Thomas Gleixner 提交于
      Replace the license boiler plate with a SPDX license identifier.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NPaul McKenney <paulmck@linux.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20190116111308.105855650@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      469eb32e
    • T
      perf/core: Convert to SPDX license identifiers · 8e86e015
      Thomas Gleixner 提交于
      Use proper SPDX license identifiers instead of the bogus reference to
      kernel-base/COPYING.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20190116111308.012666937@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      8e86e015
    • I
      98cb6210
    • M
      perf/core: Don't WARN() for impossible ring-buffer sizes · 9dff0aa9
      Mark Rutland 提交于
      The perf tool uses /proc/sys/kernel/perf_event_mlock_kb to determine how
      large its ringbuffer mmap should be. This can be configured to arbitrary
      values, which can be larger than the maximum possible allocation from
      kmalloc.
      
      When this is configured to a suitably large value (e.g. thanks to the
      perf fuzzer), attempting to use perf record triggers a WARN_ON_ONCE() in
      __alloc_pages_nodemask():
      
         WARNING: CPU: 2 PID: 5666 at mm/page_alloc.c:4511 __alloc_pages_nodemask+0x3f8/0xbc8
      
      Let's avoid this by checking that the requested allocation is possible
      before calling kzalloc.
      Reported-by: NJulien Thierry <julien.thierry@arm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20190110142745.25495-1-mark.rutland@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9dff0aa9
    • P
      perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu() · 602cae04
      Peter Zijlstra 提交于
      intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other
      members of struct cpu_hw_events. This memory is released in
      intel_pmu_cpu_dying() which is wrong. The counterpart of the
      intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu().
      
      Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE
      and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but
      allocate new memory on the next attempt to online the CPU (leaking the
      old memory).
      Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and
      CPUHP_PERF_X86_PREPARE then the CPU will go back online but never
      allocate the memory that was released in x86_pmu_dying_cpu().
      
      Make the memory allocation/free symmetrical in regard to the CPU hotplug
      notifier by moving the deallocation to intel_pmu_cpu_dead().
      
      This started in commit:
      
         a7e3ed1e ("perf: Add support for supplementary event registers").
      
      In principle the bug was introduced in v2.6.39 (!), but it will almost
      certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15...
      
      [ bigeasy: Added patch description. ]
      [ mingo: Added backporting guidance. ]
      Reported-by: NHe Zhe <zhe.he@windriver.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With developer hat on
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With maintainer hat on
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: jolsa@kernel.org
      Cc: kan.liang@linux.intel.com
      Cc: namhyung@kernel.org
      Cc: <stable@vger.kernel.org>
      Fixes: a7e3ed1e ("perf: Add support for supplementary event registers").
      Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      602cae04
    • K
      perf/x86/intel/uncore: Add Node ID mask · 9e63a789
      Kan Liang 提交于
      Some PCI uncore PMUs cannot be registered on an 8-socket system (HPE
      Superdome Flex).
      
      To understand which Socket the PCI uncore PMUs belongs to, perf retrieves
      the local Node ID of the uncore device from CPUNODEID(0xC0) of the PCI
      configuration space, and the mapping between Socket ID and Node ID from
      GIDNIDMAP(0xD4). The Socket ID can be calculated accordingly.
      
      The local Node ID is only available at bit 2:0, but current code doesn't
      mask it. If a BIOS doesn't clear the rest of the bits, an incorrect Node ID
      will be fetched.
      
      Filter the Node ID by adding a mask.
      Reported-by: NSong Liu <songliubraving@fb.com>
      Tested-by: NSong Liu <songliubraving@fb.com>
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org> # v3.7+
      Fixes: 7c94ee2e ("perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support")
      Link: https://lkml.kernel.org/r/1548600794-33162-1-git-send-email-kan.liang@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      9e63a789
    • L
      Linux 5.0-rc5 · 8834f560
      Linus Torvalds 提交于
      8834f560
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 24b888d8
      Linus Torvalds 提交于
      Pull x86 fixes from Thomas Gleixner:
       "A few updates for x86:
      
         - Fix an unintended sign extension issue in the fault handling code
      
         - Rename the new resource control config switch so it's less
           confusing
      
         - Avoid setting up EFI info in kexec when the EFI runtime is
           disabled.
      
         - Fix the microcode version check in the AMD microcode loader so it
           only loads higher version numbers and never downgrades
      
         - Set EFER.LME in the 32bit trampoline before returning to long mode
           to handle older AMD/KVM behaviour properly.
      
         - Add Darren and Andy as x86/platform reviewers"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/resctrl: Avoid confusion over the new X86_RESCTRL config
        x86/kexec: Don't setup EFI info if EFI runtime is not enabled
        x86/microcode/amd: Don't falsely trick the late loading mechanism
        MAINTAINERS: Add Andy and Darren as arch/x86/platform/ reviewers
        x86/fault: Fix sign-extend unintended sign extension
        x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode
        x86/cpu: Add Atom Tremont (Jacobsville)
      24b888d8
    • L
      Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cc6810e3
      Linus Torvalds 提交于
      Pull cpu hotplug fixes from Thomas Gleixner:
       "Two fixes for the cpu hotplug machinery:
      
         - Replace the overly clever 'SMT disabled by BIOS' detection logic as
           it breaks KVM scenarios and prevents speculation control updates
           when the Hyperthreads are brought online late after boot.
      
         - Remove a redundant invocation of the speculation control update
           function"
      
      * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM
        x86/speculation: Remove redundant arch_smt_update() invocation
      cc6810e3
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 58f6d428
      Linus Torvalds 提交于
      Pull perf fixes from Thomas Gleixner:
       "A pile of perf updates:
      
         - Fix broken sanity check in the /proc/sys/kernel/perf_cpu_time_max_percent
           write handler
      
         - Cure a perf script crash which caused by an unitinialized data
           structure
      
         - Highlight the hottest instruction in perf top and not a random one
      
         - Cure yet another clang issue when building perf python
      
         - Handle topology entries with no CPU correctly in the tools
      
         - Handle perf data which contains both tracepoints and performance
           counter entries correctly.
      
         - Add a missing NULL pointer check in perf ordered_events_free()"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf script: Fix crash when processing recorded stat data
        perf top: Fix wrong hottest instruction highlighted
        perf tools: Handle TOPOLOGY headers with no CPU
        perf python: Remove -fstack-clash-protection when building with some clang versions
        perf core: Fix perf_proc_update_handler() bug
        perf script: Fix crash with printing mixed trace point and other events
        perf ordered_events: Fix crash in ordered_events__free
      58f6d428
    • L
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 89401be6
      Linus Torvalds 提交于
      Pull EFI fix from Thomas Gleixner:
       "The dump info for the efi page table debugging lacks a terminator
        which causes the kernel to crash when the debugfile is read"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi/arm64: Fix debugfs crash by adding a terminator for ptdump marker
      89401be6