1. 02 Oct, 2018 10 commits
    • x86/cpu: Sanitize FAM6_ATOM naming · f2c4db1b
      Peter Zijlstra committed
      Going primarily by:
      
        https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors
      
      with additional information gleaned from other related pages; notably:
      
       - Bonnell shrink was called Saltwell
       - Moorefield is the Merrifield refresh, which makes it Airmont
      
      The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE
      
        for i in `git grep -l FAM6_ATOM` ; do
      	sed -i  -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g'		\
      		-e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/'		\
      		-e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g'		\
      		-e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g'	\
      		-e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g'		\
      		-e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g'	\
      		-e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g'	\
      		-e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g'	\
      		-e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g'	\
      		-e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g'		\
      		-e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i}
        done
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: dave.hansen@linux.intel.com
      Cc: len.brown@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
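The sed loop in this commit message amounts to an ordered table of substring renames. A minimal Python sketch of that mapping follows. The helper name `rename_atom` is hypothetical, and the `INTEL_FAM6_` prefix in the usage example is an assumption, since the commit text only shows `FAM6_ATOM`. The sketch replaces every occurrence, whereas the `ATOM_LINCROFT` expression in the commit has no `g` flag and would only replace the first match on each line.

```python
# Ordered (old, new) pairs taken from the sed expressions in the commit message.
RENAMES = [
    ("ATOM_PINEVIEW", "ATOM_BONNELL"),
    ("ATOM_LINCROFT", "ATOM_BONNELL_MID"),
    ("ATOM_PENWELL", "ATOM_SALTWELL_MID"),
    ("ATOM_CLOVERVIEW", "ATOM_SALTWELL_TABLET"),
    ("ATOM_CEDARVIEW", "ATOM_SALTWELL"),
    ("ATOM_SILVERMONT1", "ATOM_SILVERMONT"),
    ("ATOM_SILVERMONT2", "ATOM_SILVERMONT_X"),
    ("ATOM_MERRIFIELD", "ATOM_SILVERMONT_MID"),
    ("ATOM_MOOREFIELD", "ATOM_AIRMONT_MID"),
    ("ATOM_DENVERTON", "ATOM_GOLDMONT_X"),
    ("ATOM_GEMINI_LAKE", "ATOM_GOLDMONT_PLUS"),
]


def rename_atom(text: str) -> str:
    """Apply the renames in the same order as the sed expressions."""
    for old, new in RENAMES:
        text = text.replace(old, new)
    return text


# Usage: the macro prefix below is assumed for illustration.
print(rename_atom("case INTEL_FAM6_ATOM_MOOREFIELD:"))
```

The result follows the FAM6_ATOM_UARCH_SOCTYPE scheme described in the message: the microarchitecture name comes first, followed by an optional SoC type such as `_MID`, `_TABLET` or `_X`.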
    • perf/x86/intel: Add a separate Arch Perfmon v4 PMI handler · af3bdb99
      Andi Kleen committed
      Implements counter freezing for Arch Perfmon v4 (Skylake and
      newer). This speeds up the PMI handler by avoiding unnecessary
      MSR writes and makes it more accurate.
      
      The Arch Perfmon v4 PMI handler is substantially different from
      the older PMI handler.
      
      Differences from the old handler:
      
      - It relies on counter freezing, which eliminates several MSR
        writes from the PMI handler and lowers the overhead significantly.
      
        It makes the PMI handler more accurate, as all counters get
        frozen atomically as soon as any counter overflows. So there is
        much less counting of the PMI handler itself.
      
        With the freezing we don't need to disable or enable counters or
        PEBS. Only BTS, which does not support auto-freezing, still needs
        to be explicitly managed.
      
      - The PMU acking is done at the end, not the beginning.
        This makes it possible to avoid manual enabling/disabling
        of the PMU; instead we just rely on the freezing/acking.
      
      - The APIC is acked before re-enabling the PMU, which avoids
        problems with LBRs occasionally not getting unfrozen on Skylake.
      
      - Looping is only needed to work around a corner case in which
        several PMIs arrive very close to each other. In the common case
        the counters are frozen during the PMI handler, so no re-check
        is needed.
      
      This patch:
      
      - Adds code to enable v4 counter freezing.
      - Forks the <=v3 and >=v4 PMI handlers into separate functions.
      - Adds a kernel parameter to disable counter freezing. It took some
        time to debug counter freezing, so in case there are new problems
        we added an option to turn it off. We do not expect it to be used
        unless new bugs appear.
      - Applies only to big cores. The patch for small cores will be
        posted separately.
      
      Performance:
      
      Results from profiling a kernel build on Kabylake with different
      perf options, measuring the duration of all NMI handlers with the
      NMI handler tracepoint:
      
      V3 is without counter freezing.
      V4 is with counter freezing.
      The value is the average cost of the PMI handler.
      (lower is better)
      
      perf options                  V3(ns)  V4(ns)  delta
      -c 100000                     1088    894     -18%
      -g -c 100000                  1862    1646    -12%
      --call-graph lbr -c 100000    3649    3367    -8%
      --call-graph dwarf -c 100000  2248    1982    -12%
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/1533712328-2834-2-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
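The delta column in the performance table of this commit can be cross-checked from the two timing columns. The sketch below assumes the delta is the relative change of V4 against V3, rounded to a whole percent. The function name `delta_percent` is hypothetical.

```python
# (perf options, V3 ns, V4 ns) as reported in the commit message.
MEASUREMENTS = [
    ("-c 100000", 1088, 894),
    ("-g -c 100000", 1862, 1646),
    ("--call-graph lbr -c 100000", 3649, 3367),
    ("--call-graph dwarf -c 100000", 2248, 1982),
]


def delta_percent(v3_ns: int, v4_ns: int) -> int:
    """Relative change of V4 against V3, rounded to a whole percent."""
    return round((v4_ns - v3_ns) / v3_ns * 100)


deltas = [delta_percent(v3, v4) for _, v3, v4 in MEASUREMENTS]
for (options, _, _), delta in zip(MEASUREMENTS, deltas):
    print(f"{options:30s} {delta}%")
```

Under that assumption the computed values match the reported column: -18%, -12%, -8% and -12%.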
    • perf/x86/intel: Factor out common code of PMI handler · ba12d20e
      Kan Liang committed
      The Arch Perfmon v4 PMI handler is substantially different from
      the older PMI handler. Instead of adding more and more ifs, cleanly
      fork the new handler into a new function, with the main common
      code factored out into a common function.
      
      Fix a checkpatch.pl complaint by removing the explicit "false"
      initializer from "static bool warned".
      
      No functional change.
      
      Based-on-code-from: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/1533712328-2834-1-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge branch 'x86/cache' into perf/core, to resolve conflicts · a4c9f265
      Ingo Molnar committed
      Avoid conflict with upcoming perf/core patches, merge in the RDT perf work.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • I
      97e831e1
    • perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events · d7cbbe49
      Natarajan, Janakarajan committed
      In Family 17h, some L3 Cache Performance events require the ThreadMask
      and SliceMask to be set. For other events, these fields do not affect
      the count either way.
      
      Set ThreadMask and SliceMask to 0xFF and 0xF respectively.
      Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H . Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suravee <Suravee.Suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/Message-ID:
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
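Setting both masks to all-ones is a plain bitwise OR into the event configuration word. The sketch below illustrates the idea only. The bit positions (SliceMask at bit 48, ThreadMask at bit 56) and the name `with_l3_masks` are assumptions made for the example, because the commit message gives the values 0xF and 0xFF but not the field layout.

```python
# Field values come from the commit message; the shifts are assumed.
SLICE_MASK_SHIFT = 48   # assumption, not stated in the commit message
THREAD_MASK_SHIFT = 56  # assumption, not stated in the commit message

SLICE_MASK = 0xF << SLICE_MASK_SHIFT
THREAD_MASK = 0xFF << THREAD_MASK_SHIFT


def with_l3_masks(config: int) -> int:
    """Return the event config with SliceMask and ThreadMask fully set."""
    return config | SLICE_MASK | THREAD_MASK


print(hex(with_l3_masks(0x01)))
```

Because the operation is an OR, applying it to events that ignore these fields leaves their event-select bits untouched, which is consistent with the statement that the fields do not affect the count for other events.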
    • perf/x86/intel/uncore: Fix PCI BDF address of M3UPI on SKX · 9d92cfea
      Kan Liang committed
      The counters on M3UPI Link 0 and Link 3 don't count properly, and writing
      0 to these counters may cause a system crash on some machines.
      
      The PCI BDF addresses of the M3UPI in the current code are incorrect.
      
      The correct addresses should be:
      
        D18:F1	0x204D
        D18:F2	0x204E
        D18:F5	0x204D
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: cd34cd97 ("perf/x86/intel/uncore: Add Skylake server uncore support")
      Link: http://lkml.kernel.org/r/1537538826-55489-1-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
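The corrected addresses in this commit can be kept as a small lookup table keyed by device and function number. The sketch below also shows the conventional packing of a PCI device and function into a single devfn byte (5 bits of device, 3 bits of function). Reading "D18" as decimal device 18 is an assumption, and the names `M3UPI_DEVICE_IDS` and `pci_devfn` are hypothetical.

```python
# (device, function) -> PCI device ID, as listed in the commit message.
# Reading "D18" as decimal device 18 is an assumption.
M3UPI_DEVICE_IDS = {
    (18, 1): 0x204D,
    (18, 2): 0x204E,
    (18, 5): 0x204D,
}


def pci_devfn(device: int, function: int) -> int:
    """Pack a PCI device (slot) and function number into one devfn byte."""
    return ((device & 0x1F) << 3) | (function & 0x07)


for (device, function), device_id in M3UPI_DEVICE_IDS.items():
    print(f"D{device}:F{function} devfn={pci_devfn(device, function):#04x} id={device_id:#06x}")
```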
    • perf/ring_buffer: Prevent concurrent ring buffer access · cd6fb677
      Jiri Olsa committed
      Some of the scheduling tracepoints allow the perf_tp_event
      code to write to the ring buffer of a different CPU than the
      one the code is running on.
      
      This results in corrupted ring buffer data, as demonstrated by
      the following perf commands:
      
        # perf record -e 'sched:sched_switch,sched:sched_wakeup' perf bench sched messaging
        # Running 'sched/messaging' benchmark:
        # 20 sender and receiver processes per group
        # 10 groups == 400 processes run
      
             Total time: 0.383 [sec]
        [ perf record: Woken up 8 times to write data ]
        0x42b890 [0]: failed to process type: -1765585640
        [ perf record: Captured and wrote 4.825 MB perf.data (29669 samples) ]
      
        # perf report --stdio
        0x42b890 [0]: failed to process type: -1765585640
      
      The corruption is caused by the scheduling tracepoints that
      have __perf_task defined and thus allow data to be stored in
      another CPU's ring buffer:
      
        sched_waking
        sched_wakeup
        sched_wakeup_new
        sched_stat_wait
        sched_stat_sleep
        sched_stat_iowait
        sched_stat_blocked
      
      The perf_tp_event function first stores samples for the
      current-CPU events defined for the tracepoint:
      
          hlist_for_each_entry_rcu(event, head, hlist_entry)
            perf_swevent_event(event, count, &data, regs);
      
      It then iterates over the events of the 'task' and stores the
      sample for any of the task's events that pass the tracepoint checks:
      
        ctx = rcu_dereference(task->perf_event_ctxp[perf_sw_context]);
      
        list_for_each_entry_rcu(event, &ctx->event_list, event_entry) {
          if (event->attr.type != PERF_TYPE_TRACEPOINT)
            continue;
          if (event->attr.config != entry->type)
            continue;
      
          perf_swevent_event(event, count, &data, regs);
        }
      
      The above code can race with the same code running on another CPU,
      ending up with two CPUs trying to store into the same ring
      buffer, which is specifically not allowed.
      
      This patch prevents the problem by allowing only events bound to
      the current CPU to receive the event.
      
      NOTE: this requires the use of (per-task-)per-cpu buffers for this
      feature to work; perf-record does this.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      [peterz: small edits to Changelog]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrew Vagin <avagin@openvz.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: e6dab5ff ("perf/trace: Add ability to set a target task for events")
      Link: http://lkml.kernel.org/r/20180923161343.GB15054@krava
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
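The fix described in this commit is to filter on the CPU before a sample is stored in a task's event. The Python model below is a simplified sketch of that idea and not the kernel code. The `Event` class, its fields and the function `deliver_to_task_events` are hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    cpu: int  # CPU whose ring buffer this event writes to
    samples: list = field(default_factory=list)


def deliver_to_task_events(events, current_cpu, sample):
    """Store `sample` only in events bound to the CPU running this code."""
    delivered = []
    for event in events:
        if event.cpu != current_cpu:
            # Writing here would touch another CPU's ring buffer.
            continue
        event.samples.append(sample)
        delivered.append(event.name)
    return delivered


events = [Event("wakeup@cpu0", cpu=0), Event("wakeup@cpu1", cpu=1)]
print(deliver_to_task_events(events, current_cpu=1, sample="sched_wakeup"))
```

With per-CPU buffers for each task, as the NOTE in the message requires, some event exists for every CPU, so the sample still lands in the buffer that belongs to the CPU running the tracepoint.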
    • perf/x86/intel/uncore: Use boot_cpu_data.phys_proc_id instead of hardcoded physical package ID 0 · 6265adb9
      Masayoshi Mizuma committed
      Physical package ID 0 doesn't always exist, so we should use
      boot_cpu_data.phys_proc_id here.
      Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masayoshi Mizuma <msys.mizuma@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20180910144750.6782-1-msys.mizuma@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf/core: Fix perf_pmu_unregister() locking · a9f97721
      Peter Zijlstra committed
      When we unregister a PMU, we fail to serialize the @pmu_idr properly.
      Fix that by doing the entire thing under pmu_lock.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 2e80a82a ("perf: Dynamic pmu types")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
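The locking pattern in this commit is to perform every step of unregistration while holding one lock. The Python model below is a sketch of that pattern and not the kernel implementation. `PmuRegistry` and its members are hypothetical: the lock stands in for pmu_lock, the dictionary stands in for the ID allocator (@pmu_idr), and the separate list of registered PMUs is an assumption added to show two structures being updated together.

```python
import threading


class PmuRegistry:
    def __init__(self):
        self._lock = threading.Lock()  # stands in for pmu_lock
        self._by_id = {}               # stands in for pmu_idr
        self._pmus = []                # assumed list of registered PMUs

    def register(self, pmu_id, name):
        with self._lock:
            self._by_id[pmu_id] = name
            self._pmus.append(name)

    def unregister(self, pmu_id):
        # Both structures are updated under the same lock, so no other
        # thread can observe the ID table and the list out of sync.
        with self._lock:
            name = self._by_id.pop(pmu_id)
            self._pmus.remove(name)
        return name

    def snapshot(self):
        with self._lock:
            return dict(self._by_id), list(self._pmus)


registry = PmuRegistry()
registry.register(1, "cpu")
registry.register(2, "uncore")
registry.unregister(1)
print(registry.snapshot())
```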
  2. 30 Sep, 2018 4 commits
    • Merge tag 'for-linus-20180929' of git://git.kernel.dk/linux-block · 291d0e5d
      Greg Kroah-Hartman committed
      Jens writes:
        "Block fixes for 4.19-rc6
      
         A set of fixes that should go into this release. This pull request
         contains:
      
         - A fix (hopefully) for the persistent grants for xen-blkfront. A
           previous fix from this series wasn't complete, hence reverted, and
           this one should hopefully be it. (Boris Ostrovsky)
      
         - Fix for an elevator drain warning with SMR devices, which is
           triggered when you switch schedulers (Damien)
      
         - bcache deadlock fix (Guoju Fang)
      
         - Fix for the block unplug tracepoint, which has had the
           timer/explicit flag reverted since 4.11 (Ilya)
      
         - Fix a regression in this series where the blk-mq timeout hook is
           invoked with the RCU read lock held, hence preventing it from
           blocking (Keith)
      
         - NVMe pull from Christoph, with a single multipath fix (Susobhan Dey)"
      
      * tag 'for-linus-20180929' of git://git.kernel.dk/linux-block:
        xen/blkfront: correct purging of persistent grants
        Revert "xen/blkfront: When purging persistent grants, keep them in the buffer"
        blk-mq: I/O and timer unplugs are inverted in blktrace
        bcache: add separate workqueue for journal_write to avoid deadlock
        xen/blkfront: When purging persistent grants, keep them in the buffer
        block: fix deadline elevator drain for zoned block devices
        blk-mq: Allow blocking queue tag iter callbacks
        nvme: properly propagate errors in nvme_mpath_init
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e7541773
      Greg Kroah-Hartman committed
      Thomas writes:
        "A single fix for the AMD memory encryption boot code so it does not
         read random garbage instead of the cached encryption bit when a kexec
         kernel is allocated above the 32bit address limit."
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot: Fix kexec booting failure in the SEV bit detection code
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e1ce697d
      Greg Kroah-Hartman committed
      Thomas writes:
        "Three small fixes for clocksource drivers:
         - Proper error handling in the Atmel PIT driver
         - Add CLOCK_SOURCE_SUSPEND_NONSTOP for TI SoCs so suspend works again
         - Fix the next event function for Facebook Backpack-CMM BMC chips so
           usleep(100) doesn't sleep several milliseconds"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource/drivers/timer-atmel-pit: Properly handle error cases
        clocksource/drivers/fttmr010: Fix set_next_event handler
        clocksource/drivers/ti-32k: Add CLOCK_SOURCE_SUSPEND_NONSTOP flag for non-am43 SoCs
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · af17b3aa
      Greg Kroah-Hartman committed
      Thomas writes:
        "A single fix for a missing sanity check when an attempt is made
        to read a pinned event on the wrong CPU due to a legitimate event
        scheduling failure."
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/core: Add sanity check to deal with pinned event failure
  3. 29 Sep, 2018 16 commits
  4. 28 Sep, 2018 10 commits