Commit 9d9420f1 authored by Linus Torvalds

Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
 "Kernel side updates:

   - Fix and enhance poll support (Jiri Olsa)

   - Re-enable inheritance optimization (Jiri Olsa)

   - Enhance Intel memory events support (Stephane Eranian)

   - Refactor the Intel uncore driver to be more maintainable (Zheng
     Yan)

   - Enhance and fix Intel CPU and uncore PMU drivers (Peter Zijlstra,
     Andi Kleen)

   - [ plus various smaller fixes/cleanups ]

  User visible tooling updates:

   - Add +field argument support for the --fields option, so that one can
     add fields to the default list of fields to show, i.e. now one can
     just do:

         perf report --fields +pid

     And the pid will appear in addition to the default fields (Jiri
     Olsa)

   - Add +field argument support for --sort option (Jiri Olsa)

   - Honour -w in the report tools (report, top), allowing one to specify
     the widths of the histogram entry columns (Namhyung Kim)

   - Properly show submicrosecond times in 'perf kvm stat' (Christian
     Borntraeger)

   - Add beautifier for mremap flags param in 'trace' (Alex Snast)

   - perf script: Allow callchains if any event samples them

   - Don't truncate Intel style addresses in 'annotate' (Alex Converse)

   - Allow profiling when kptr_restrict == 1 for non-root users; kernel
     samples will just remain unresolved (Andi Kleen)

   - Allow configuring default options for callchains in config file
     (Namhyung Kim)

   - Support operations for shared futexes (Davidlohr Bueso)

   - "perf kvm stat report" improvements by Alexander Yarygin:
       -  Save pid string in opts.target.pid
       -  Enable the target.system_wide flag
       -  Unify the title bar output

   - [ plus lots of other fixes and small improvements.  ]

  Tooling infrastructure changes:

   - Refactor unit and scale function parameters for PMU parsing
     routines (Matt Fleming)

   - Improve DSO long names lookup with rbtree, resulting in great
     speedup for workloads with lots of DSOs (Waiman Long)

   - We were not handling POLLHUP notifications for event file
     descriptors

     Fix it by filtering entries in the events file descriptor array
     after poll() returns, refcounting mmaps so that when the last fd
     pointing to a perf mmap goes away we do the unmap (Arnaldo Carvalho
     de Melo)

   - Intel PT prep work, from Adrian Hunter, including:
       - Let a user specify a PMU event without any config terms
       - Add perf-with-kcore script
       - Let default config be defined for a PMU
       - Add perf_pmu__scan_file()
       - Add a 'perf test' for tracking with sched_switch
       - Add 'flush' callback to scripting API

   - Use ring buffer consume method to look like other tools (Arnaldo
     Carvalho de Melo)

   - hists browser (used in top and report) refactorings, getting rid of
     unused variables and reducing source code size by handling similar
     cases in fewer functions (Namhyung Kim).

   - Replace thread-unsafe strerror() with strerror_r() across the
     whole tools/perf/ tree (Masami Hiramatsu)

   - Rename ordered_samples to ordered_events and allow setting a queue
     size for ordering events (Jiri Olsa)

   - [ plus lots of fixes, cleanups and other improvements ]"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (198 commits)
  perf/x86: Tone down kernel messages when the PMU check fails in a virtual environment
  perf/x86/intel/uncore: Fix minor race in box set up
  perf record: Fix error message for --filter option not coming after tracepoint
  perf tools: Fix build breakage on arm64 targets
  perf symbols: Improve DSO long names lookup speed with rbtree
  perf symbols: Encapsulate dsos list head into struct dsos
  perf bench futex: Sanitize -q option in requeue
  perf bench futex: Support operations for shared futexes
  perf trace: Fix mmap return address truncation to 32-bit
  perf tools: Refactor unit and scale function parameters
  perf tools: Fix line number in the config file error message
  perf tools: Convert {record,top}.call-graph option to call-graph.record-mode
  perf tools: Introduce perf_callchain_config()
  perf callchain: Move some parser functions to callchain.c
  perf tools: Move callchain config from record_opts to callchain_param
  perf hists browser: Fix callchain print bug on TUI
  perf tools: Use ACCESS_ONCE() instead of volatile cast
  perf tools: Modify error code for when perf_session__new() fails
  perf tools: Fix perf record as non root with kptr_restrict == 1
  perf stat: Fix --per-core on multi socket systems
  ...
......@@ -51,6 +51,14 @@
ARCH_PERFMON_EVENTSEL_EDGE | \
ARCH_PERFMON_EVENTSEL_INV | \
ARCH_PERFMON_EVENTSEL_CMASK)
#define X86_ALL_EVENT_FLAGS \
(ARCH_PERFMON_EVENTSEL_EDGE | \
ARCH_PERFMON_EVENTSEL_INV | \
ARCH_PERFMON_EVENTSEL_CMASK | \
ARCH_PERFMON_EVENTSEL_ANY | \
ARCH_PERFMON_EVENTSEL_PIN_CONTROL | \
HSW_IN_TX | \
HSW_IN_TX_CHECKPOINTED)
#define AMD64_RAW_EVENT_MASK \
(X86_RAW_EVENT_MASK | \
AMD64_EVENTSEL_EVENT)
......
......@@ -39,7 +39,9 @@ obj-$(CONFIG_CPU_SUP_AMD) += perf_event_amd_iommu.o
endif
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_p6.o perf_event_knc.o perf_event_p4.o
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_lbr.o perf_event_intel_ds.o perf_event_intel.o
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_uncore.o perf_event_intel_rapl.o
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_uncore.o perf_event_intel_uncore_snb.o
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_uncore_snbep.o perf_event_intel_uncore_nhmex.o
obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_intel_rapl.o
endif
......
......@@ -243,7 +243,8 @@ static bool check_hw_exists(void)
msr_fail:
printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n");
printk(KERN_ERR "Failed to access perfctr msr (MSR %x is %Lx)\n", reg, val_new);
printk(boot_cpu_has(X86_FEATURE_HYPERVISOR) ? KERN_INFO : KERN_ERR
"Failed to access perfctr msr (MSR %x is %Lx)\n", reg, val_new);
return false;
}
......@@ -387,7 +388,7 @@ int x86_pmu_hw_config(struct perf_event *event)
precise++;
/* Support for IP fixup */
if (x86_pmu.lbr_nr)
if (x86_pmu.lbr_nr || x86_pmu.intel_cap.pebs_format >= 2)
precise++;
}
......@@ -443,6 +444,12 @@ int x86_pmu_hw_config(struct perf_event *event)
if (event->attr.type == PERF_TYPE_RAW)
event->hw.config |= event->attr.config & X86_RAW_EVENT_MASK;
if (event->attr.sample_period && x86_pmu.limit_period) {
if (x86_pmu.limit_period(event, event->attr.sample_period) >
event->attr.sample_period)
return -EINVAL;
}
return x86_setup_perfctr(event);
}
......@@ -980,6 +987,9 @@ int x86_perf_event_set_period(struct perf_event *event)
if (left > x86_pmu.max_period)
left = x86_pmu.max_period;
if (x86_pmu.limit_period)
left = x86_pmu.limit_period(event, left);
per_cpu(pmc_prev_left[idx], smp_processor_id()) = left;
/*
......
......@@ -67,8 +67,10 @@ struct event_constraint {
*/
#define PERF_X86_EVENT_PEBS_LDLAT 0x1 /* ld+ldlat data address sampling */
#define PERF_X86_EVENT_PEBS_ST 0x2 /* st data address sampling */
#define PERF_X86_EVENT_PEBS_ST_HSW 0x4 /* haswell style st data sampling */
#define PERF_X86_EVENT_PEBS_ST_HSW 0x4 /* haswell style datala, store */
#define PERF_X86_EVENT_COMMITTED 0x8 /* event passed commit_txn */
#define PERF_X86_EVENT_PEBS_LD_HSW 0x10 /* haswell style datala, load */
#define PERF_X86_EVENT_PEBS_NA_HSW 0x20 /* haswell style datala, unknown */
struct amd_nb {
int nb_id; /* NorthBridge id */
......@@ -252,18 +254,52 @@ struct cpu_hw_events {
EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK)
#define INTEL_PLD_CONSTRAINT(c, n) \
__EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
__EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_LDLAT)
#define INTEL_PST_CONSTRAINT(c, n) \
__EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
__EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST)
/* DataLA version of store sampling without extra enable bit. */
#define INTEL_PST_HSW_CONSTRAINT(c, n) \
__EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
/* Event constraint, but match on all event flags too. */
#define INTEL_FLAGS_EVENT_CONSTRAINT(c, n) \
EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS)
/* Check only flags, but allow all event/umask */
#define INTEL_ALL_EVENT_CONSTRAINT(code, n) \
EVENT_CONSTRAINT(code, n, X86_ALL_EVENT_FLAGS)
/* Check flags and event code, and set the HSW store flag */
#define INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_ST(code, n) \
__EVENT_CONSTRAINT(code, n, \
ARCH_PERFMON_EVENTSEL_EVENT|X86_ALL_EVENT_FLAGS, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST_HSW)
/* Check flags and event code, and set the HSW load flag */
#define INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD(code, n) \
__EVENT_CONSTRAINT(code, n, \
ARCH_PERFMON_EVENTSEL_EVENT|X86_ALL_EVENT_FLAGS, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_LD_HSW)
/* Check flags and event code/umask, and set the HSW store flag */
#define INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(code, n) \
__EVENT_CONSTRAINT(code, n, \
INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST_HSW)
/* Check flags and event code/umask, and set the HSW load flag */
#define INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(code, n) \
__EVENT_CONSTRAINT(code, n, \
INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_LD_HSW)
/* Check flags and event code/umask, and set the HSW N/A flag */
#define INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA(code, n) \
__EVENT_CONSTRAINT(code, n, \
INTEL_ARCH_EVENT_MASK|INTEL_ARCH_EVENT_MASK, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_NA_HSW)
/*
* We define the end marker as having a weight of -1
* to enable blacklisting of events using a counter bitmask
......@@ -409,6 +445,7 @@ struct x86_pmu {
struct x86_pmu_quirk *quirks;
int perfctr_second_write;
bool late_ack;
unsigned (*limit_period)(struct perf_event *event, unsigned l);
/*
* sysfs attrs
......
......@@ -220,6 +220,15 @@ static struct event_constraint intel_hsw_event_constraints[] = {
EVENT_CONSTRAINT_END
};
static struct event_constraint intel_bdw_event_constraints[] = {
FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */
FIXED_EVENT_CONSTRAINT(0x0300, 2), /* CPU_CLK_UNHALTED.REF */
INTEL_UEVENT_CONSTRAINT(0x148, 0x4), /* L1D_PEND_MISS.PENDING */
INTEL_EVENT_CONSTRAINT(0xa3, 0x4), /* CYCLE_ACTIVITY.* */
EVENT_CONSTRAINT_END
};
static u64 intel_pmu_event_map(int hw_event)
{
return intel_perfmon_event_map[hw_event];
......@@ -415,6 +424,126 @@ static __initconst const u64 snb_hw_cache_event_ids
};
static __initconst const u64 hsw_hw_cache_event_ids
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX] =
{
[ C(L1D ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x81d0, /* MEM_UOPS_RETIRED.ALL_LOADS */
[ C(RESULT_MISS) ] = 0x151, /* L1D.REPLACEMENT */
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0x82d0, /* MEM_UOPS_RETIRED.ALL_STORES */
[ C(RESULT_MISS) ] = 0x0,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = 0x0,
[ C(RESULT_MISS) ] = 0x0,
},
},
[ C(L1I ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x0,
[ C(RESULT_MISS) ] = 0x280, /* ICACHE.MISSES */
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = 0x0,
[ C(RESULT_MISS) ] = 0x0,
},
},
[ C(LL ) ] = {
[ C(OP_READ) ] = {
/* OFFCORE_RESPONSE:ALL_DATA_RD|ALL_CODE_RD */
[ C(RESULT_ACCESS) ] = 0x1b7,
/* OFFCORE_RESPONSE:ALL_DATA_RD|ALL_CODE_RD|SUPPLIER_NONE|
L3_MISS|ANY_SNOOP */
[ C(RESULT_MISS) ] = 0x1b7,
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0x1b7, /* OFFCORE_RESPONSE:ALL_RFO */
/* OFFCORE_RESPONSE:ALL_RFO|SUPPLIER_NONE|L3_MISS|ANY_SNOOP */
[ C(RESULT_MISS) ] = 0x1b7,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = 0x0,
[ C(RESULT_MISS) ] = 0x0,
},
},
[ C(DTLB) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x81d0, /* MEM_UOPS_RETIRED.ALL_LOADS */
[ C(RESULT_MISS) ] = 0x108, /* DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK */
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0x82d0, /* MEM_UOPS_RETIRED.ALL_STORES */
[ C(RESULT_MISS) ] = 0x149, /* DTLB_STORE_MISSES.MISS_CAUSES_A_WALK */
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = 0x0,
[ C(RESULT_MISS) ] = 0x0,
},
},
[ C(ITLB) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x6085, /* ITLB_MISSES.STLB_HIT */
[ C(RESULT_MISS) ] = 0x185, /* ITLB_MISSES.MISS_CAUSES_A_WALK */
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
},
[ C(BPU ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0xc4, /* BR_INST_RETIRED.ALL_BRANCHES */
[ C(RESULT_MISS) ] = 0xc5, /* BR_MISP_RETIRED.ALL_BRANCHES */
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
},
};
static __initconst const u64 hsw_hw_cache_extra_regs
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX] =
{
[ C(LL ) ] = {
[ C(OP_READ) ] = {
/* OFFCORE_RESPONSE:ALL_DATA_RD|ALL_CODE_RD */
[ C(RESULT_ACCESS) ] = 0x2d5,
/* OFFCORE_RESPONSE:ALL_DATA_RD|ALL_CODE_RD|SUPPLIER_NONE|
L3_MISS|ANY_SNOOP */
[ C(RESULT_MISS) ] = 0x3fbc0202d5ull,
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0x122, /* OFFCORE_RESPONSE:ALL_RFO */
/* OFFCORE_RESPONSE:ALL_RFO|SUPPLIER_NONE|L3_MISS|ANY_SNOOP */
[ C(RESULT_MISS) ] = 0x3fbc020122ull,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = 0x0,
[ C(RESULT_MISS) ] = 0x0,
},
},
};
static __initconst const u64 westmere_hw_cache_event_ids
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
......@@ -1905,6 +2034,24 @@ hsw_get_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event)
return c;
}
/*
* Broadwell:
* The INST_RETIRED.ALL period always needs to have lowest
* 6bits cleared (BDM57). It shall not use a period smaller
* than 100 (BDM11). We combine the two to enforce
* a min-period of 128.
*/
static unsigned bdw_limit_period(struct perf_event *event, unsigned left)
{
if ((event->hw.config & INTEL_ARCH_EVENT_MASK) ==
X86_CONFIG(.event=0xc0, .umask=0x01)) {
if (left < 128)
left = 128;
left &= ~0x3fu;
}
return left;
}
PMU_FORMAT_ATTR(event, "config:0-7" );
PMU_FORMAT_ATTR(umask, "config:8-15" );
PMU_FORMAT_ATTR(edge, "config:18" );
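To make the BDM57/BDM11 workaround above concrete, here is a small user-space sketch (hypothetical demo code, not part of this diff) that mirrors the rounding rules of bdw_limit_period() and shows the effective period a few requested values would map to:

```c
#include <stdio.h>

/*
 * Mirror of the rounding rules bdw_limit_period() applies to
 * INST_RETIRED.ALL (event 0xc0, umask 0x01) in the hunk above:
 * raise periods below 128 to 128, then clear the low 6 bits (BDM57/BDM11).
 */
static unsigned int bdw_limit_period_demo(unsigned int left)
{
	if (left < 128)
		left = 128;
	return left & ~0x3fu;
}

int main(void)
{
	const unsigned int requested[] = { 100, 128, 200, 100000 };

	for (unsigned int i = 0; i < sizeof(requested) / sizeof(requested[0]); i++)
		printf("requested %-6u -> effective %u\n",
		       requested[i], bdw_limit_period_demo(requested[i]));
	/* requested 100 -> 128, 128 -> 128, 200 -> 192, 100000 -> 99968 */
	return 0;
}
```

Note that, per the x86_pmu_hw_config() hunk earlier in this commit, a requested sample_period that limit_period() would have to raise is rejected with -EINVAL at event-creation time, while x86_perf_event_set_period() applies the clamp to the values it actually programs at runtime.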
......@@ -2367,15 +2514,15 @@ __init int intel_pmu_init(void)
* Install the hw-cache-events table:
*/
switch (boot_cpu_data.x86_model) {
case 14: /* 65 nm core solo/duo, "Yonah" */
case 14: /* 65nm Core "Yonah" */
pr_cont("Core events, ");
break;
case 15: /* original 65 nm celeron/pentium/core2/xeon, "Merom"/"Conroe" */
case 15: /* 65nm Core2 "Merom" */
x86_add_quirk(intel_clovertown_quirk);
case 22: /* single-core 65 nm celeron/core2solo "Merom-L"/"Conroe-L" */
case 23: /* current 45 nm celeron/core2/xeon "Penryn"/"Wolfdale" */
case 29: /* six-core 45 nm xeon "Dunnington" */
case 22: /* 65nm Core2 "Merom-L" */
case 23: /* 45nm Core2 "Penryn" */
case 29: /* 45nm Core2 "Dunnington (MP) */
memcpy(hw_cache_event_ids, core2_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
......@@ -2386,9 +2533,9 @@ __init int intel_pmu_init(void)
pr_cont("Core2 events, ");
break;
case 26: /* 45 nm nehalem, "Bloomfield" */
case 30: /* 45 nm nehalem, "Lynnfield" */
case 46: /* 45 nm nehalem-ex, "Beckton" */
case 30: /* 45nm Nehalem */
case 26: /* 45nm Nehalem-EP */
case 46: /* 45nm Nehalem-EX */
memcpy(hw_cache_event_ids, nehalem_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, nehalem_hw_cache_extra_regs,
......@@ -2415,11 +2562,11 @@ __init int intel_pmu_init(void)
pr_cont("Nehalem events, ");
break;
case 28: /* Atom */
case 38: /* Lincroft */
case 39: /* Penwell */
case 53: /* Cloverview */
case 54: /* Cedarview */
case 28: /* 45nm Atom "Pineview" */
case 38: /* 45nm Atom "Lincroft" */
case 39: /* 32nm Atom "Penwell" */
case 53: /* 32nm Atom "Cloverview" */
case 54: /* 32nm Atom "Cedarview" */
memcpy(hw_cache_event_ids, atom_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
......@@ -2430,8 +2577,8 @@ __init int intel_pmu_init(void)
pr_cont("Atom events, ");
break;
case 55: /* Atom 22nm "Silvermont" */
case 77: /* Avoton "Silvermont" */
case 55: /* 22nm Atom "Silvermont" */
case 77: /* 22nm Atom "Silvermont Avoton/Rangely" */
memcpy(hw_cache_event_ids, slm_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, slm_hw_cache_extra_regs,
......@@ -2446,9 +2593,9 @@ __init int intel_pmu_init(void)
pr_cont("Silvermont events, ");
break;
case 37: /* 32 nm nehalem, "Clarkdale" */
case 44: /* 32 nm nehalem, "Gulftown" */
case 47: /* 32 nm Xeon E7 */
case 37: /* 32nm Westmere */
case 44: /* 32nm Westmere-EP */
case 47: /* 32nm Westmere-EX */
memcpy(hw_cache_event_ids, westmere_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, nehalem_hw_cache_extra_regs,
......@@ -2474,8 +2621,8 @@ __init int intel_pmu_init(void)
pr_cont("Westmere events, ");
break;
case 42: /* SandyBridge */
case 45: /* SandyBridge, "Romely-EP" */
case 42: /* 32nm SandyBridge */
case 45: /* 32nm SandyBridge-E/EN/EP */
x86_add_quirk(intel_sandybridge_quirk);
memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
......@@ -2506,8 +2653,9 @@ __init int intel_pmu_init(void)
pr_cont("SandyBridge events, ");
break;
case 58: /* IvyBridge */
case 62: /* IvyBridge EP */
case 58: /* 22nm IvyBridge */
case 62: /* 22nm IvyBridge-EP/EX */
memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
/* dTLB-load-misses on IVB is different than SNB */
......@@ -2539,20 +2687,19 @@ __init int intel_pmu_init(void)
break;
case 60: /* Haswell Client */
case 70:
case 71:
case 63:
case 69:
case 60: /* 22nm Haswell Core */
case 63: /* 22nm Haswell Server */
case 69: /* 22nm Haswell ULT */
case 70: /* 22nm Haswell + GT3e (Intel Iris Pro graphics) */
x86_pmu.late_ack = true;
memcpy(hw_cache_event_ids, snb_hw_cache_event_ids, sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, snb_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
memcpy(hw_cache_event_ids, hsw_hw_cache_event_ids, sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, hsw_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
intel_pmu_lbr_init_snb();
x86_pmu.event_constraints = intel_hsw_event_constraints;
x86_pmu.pebs_constraints = intel_hsw_pebs_event_constraints;
x86_pmu.extra_regs = intel_snb_extra_regs;
x86_pmu.extra_regs = intel_snbep_extra_regs;
x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
/* all extra regs are per-cpu when HT is on */
x86_pmu.er_flags |= ERF_HAS_RSP_1;
......@@ -2565,6 +2712,28 @@ __init int intel_pmu_init(void)
pr_cont("Haswell events, ");
break;
case 61: /* 14nm Broadwell Core-M */
x86_pmu.late_ack = true;
memcpy(hw_cache_event_ids, hsw_hw_cache_event_ids, sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, hsw_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
intel_pmu_lbr_init_snb();
x86_pmu.event_constraints = intel_bdw_event_constraints;
x86_pmu.pebs_constraints = intel_hsw_pebs_event_constraints;
x86_pmu.extra_regs = intel_snbep_extra_regs;
x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
/* all extra regs are per-cpu when HT is on */
x86_pmu.er_flags |= ERF_HAS_RSP_1;
x86_pmu.er_flags |= ERF_NO_HT_SHARING;
x86_pmu.hw_config = hsw_hw_config;
x86_pmu.get_event_constraints = hsw_get_event_constraints;
x86_pmu.cpu_events = hsw_events_attrs;
x86_pmu.limit_period = bdw_limit_period;
pr_cont("Broadwell events, ");
break;
default:
switch (x86_pmu.version) {
case 1:
......
......@@ -108,14 +108,16 @@ static u64 precise_store_data(u64 status)
return val;
}
static u64 precise_store_data_hsw(struct perf_event *event, u64 status)
static u64 precise_datala_hsw(struct perf_event *event, u64 status)
{
union perf_mem_data_src dse;
u64 cfg = event->hw.config & INTEL_ARCH_EVENT_MASK;
dse.val = 0;
dse.mem_op = PERF_MEM_OP_STORE;
dse.mem_lvl = PERF_MEM_LVL_NA;
dse.val = PERF_MEM_NA;
if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
dse.mem_op = PERF_MEM_OP_STORE;
else if (event->hw.flags & PERF_X86_EVENT_PEBS_LD_HSW)
dse.mem_op = PERF_MEM_OP_LOAD;
/*
* L1 info only valid for following events:
......@@ -125,15 +127,12 @@ static u64 precise_store_data_hsw(struct perf_event *event, u64 status)
* MEM_UOPS_RETIRED.SPLIT_STORES
* MEM_UOPS_RETIRED.ALL_STORES
*/
if (cfg != 0x12d0 && cfg != 0x22d0 && cfg != 0x42d0 && cfg != 0x82d0)
return dse.mem_lvl;
if (status & 1)
dse.mem_lvl = PERF_MEM_LVL_L1 | PERF_MEM_LVL_HIT;
else
dse.mem_lvl = PERF_MEM_LVL_L1 | PERF_MEM_LVL_MISS;
/* Nothing else supported. Sorry. */
if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW) {
if (status & 1)
dse.mem_lvl = PERF_MEM_LVL_L1 | PERF_MEM_LVL_HIT;
else
dse.mem_lvl = PERF_MEM_LVL_L1 | PERF_MEM_LVL_MISS;
}
return dse.val;
}
......@@ -569,28 +568,10 @@ struct event_constraint intel_atom_pebs_event_constraints[] = {
};
struct event_constraint intel_slm_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x0103, 0x1), /* REHABQ.LD_BLOCK_ST_FORWARD_PS */
INTEL_UEVENT_CONSTRAINT(0x0803, 0x1), /* REHABQ.LD_SPLITS_PS */
INTEL_UEVENT_CONSTRAINT(0x0204, 0x1), /* MEM_UOPS_RETIRED.L2_HIT_LOADS_PS */
INTEL_UEVENT_CONSTRAINT(0x0404, 0x1), /* MEM_UOPS_RETIRED.L2_MISS_LOADS_PS */
INTEL_UEVENT_CONSTRAINT(0x0804, 0x1), /* MEM_UOPS_RETIRED.DTLB_MISS_LOADS_PS */
INTEL_UEVENT_CONSTRAINT(0x2004, 0x1), /* MEM_UOPS_RETIRED.HITM_PS */
INTEL_UEVENT_CONSTRAINT(0x00c0, 0x1), /* INST_RETIRED.ANY_PS */
INTEL_UEVENT_CONSTRAINT(0x00c4, 0x1), /* BR_INST_RETIRED.ALL_BRANCHES_PS */
INTEL_UEVENT_CONSTRAINT(0x7ec4, 0x1), /* BR_INST_RETIRED.JCC_PS */
INTEL_UEVENT_CONSTRAINT(0xbfc4, 0x1), /* BR_INST_RETIRED.FAR_BRANCH_PS */
INTEL_UEVENT_CONSTRAINT(0xebc4, 0x1), /* BR_INST_RETIRED.NON_RETURN_IND_PS */
INTEL_UEVENT_CONSTRAINT(0xf7c4, 0x1), /* BR_INST_RETIRED.RETURN_PS */
INTEL_UEVENT_CONSTRAINT(0xf9c4, 0x1), /* BR_INST_RETIRED.CALL_PS */
INTEL_UEVENT_CONSTRAINT(0xfbc4, 0x1), /* BR_INST_RETIRED.IND_CALL_PS */
INTEL_UEVENT_CONSTRAINT(0xfdc4, 0x1), /* BR_INST_RETIRED.REL_CALL_PS */
INTEL_UEVENT_CONSTRAINT(0xfec4, 0x1), /* BR_INST_RETIRED.TAKEN_JCC_PS */
INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* BR_INST_MISP_RETIRED.ALL_BRANCHES_PS */
INTEL_UEVENT_CONSTRAINT(0x7ec5, 0x1), /* BR_INST_MISP_RETIRED.JCC_PS */
INTEL_UEVENT_CONSTRAINT(0xebc5, 0x1), /* BR_INST_MISP_RETIRED.NON_RETURN_IND_PS */
INTEL_UEVENT_CONSTRAINT(0xf7c5, 0x1), /* BR_INST_MISP_RETIRED.RETURN_PS */
INTEL_UEVENT_CONSTRAINT(0xfbc5, 0x1), /* BR_INST_MISP_RETIRED.IND_CALL_PS */
INTEL_UEVENT_CONSTRAINT(0xfec5, 0x1), /* BR_INST_MISP_RETIRED.TAKEN_JCC_PS */
/* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
/* Allow all events as PEBS with no flags */
INTEL_ALL_EVENT_CONSTRAINT(0, 0x1),
EVENT_CONSTRAINT_END
};
......@@ -626,68 +607,44 @@ struct event_constraint intel_westmere_pebs_event_constraints[] = {
struct event_constraint intel_snb_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */
INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */
INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
INTEL_UEVENT_CONSTRAINT(0x02d4, 0xf), /* MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS */
/* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
/* Allow all events as PEBS with no flags */
INTEL_ALL_EVENT_CONSTRAINT(0, 0xf),
EVENT_CONSTRAINT_END
};
struct event_constraint intel_ivb_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */
INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */
INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
/* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
/* Allow all events as PEBS with no flags */
INTEL_ALL_EVENT_CONSTRAINT(0, 0xf),
EVENT_CONSTRAINT_END
};
struct event_constraint intel_hsw_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
INTEL_PST_HSW_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
INTEL_UEVENT_CONSTRAINT(0x01c5, 0xf), /* BR_MISP_RETIRED.CONDITIONAL */
INTEL_UEVENT_CONSTRAINT(0x04c5, 0xf), /* BR_MISP_RETIRED.ALL_BRANCHES */
INTEL_UEVENT_CONSTRAINT(0x20c5, 0xf), /* BR_MISP_RETIRED.NEAR_TAKEN */
INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.* */
/* MEM_UOPS_RETIRED.STLB_MISS_LOADS */
INTEL_UEVENT_CONSTRAINT(0x11d0, 0xf),
/* MEM_UOPS_RETIRED.STLB_MISS_STORES */
INTEL_UEVENT_CONSTRAINT(0x12d0, 0xf),
INTEL_UEVENT_CONSTRAINT(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */
INTEL_UEVENT_CONSTRAINT(0x41d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_LOADS */
/* MEM_UOPS_RETIRED.SPLIT_STORES */
INTEL_UEVENT_CONSTRAINT(0x42d0, 0xf),
INTEL_UEVENT_CONSTRAINT(0x81d0, 0xf), /* MEM_UOPS_RETIRED.ALL_LOADS */
INTEL_PST_HSW_CONSTRAINT(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */
INTEL_UEVENT_CONSTRAINT(0x01d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L1_HIT */
INTEL_UEVENT_CONSTRAINT(0x02d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L2_HIT */
INTEL_UEVENT_CONSTRAINT(0x04d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L3_HIT */
/* MEM_LOAD_UOPS_RETIRED.HIT_LFB */
INTEL_UEVENT_CONSTRAINT(0x40d1, 0xf),
/* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS */
INTEL_UEVENT_CONSTRAINT(0x01d2, 0xf),
/* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT */
INTEL_UEVENT_CONSTRAINT(0x02d2, 0xf),
/* MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM */
INTEL_UEVENT_CONSTRAINT(0x01d3, 0xf),
INTEL_UEVENT_CONSTRAINT(0x04c8, 0xf), /* HLE_RETIRED.Abort */
INTEL_UEVENT_CONSTRAINT(0x04c9, 0xf), /* RTM_RETIRED.Abort */
INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.* */
/* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf), /* MEM_UOPS_RETIRED.STLB_MISS_LOADS */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x41d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_LOADS */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x81d0, 0xf), /* MEM_UOPS_RETIRED.ALL_LOADS */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x12d0, 0xf), /* MEM_UOPS_RETIRED.STLB_MISS_STORES */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x42d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_STORES */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */
INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD(0xd2, 0xf), /* MEM_LOAD_UOPS_L3_HIT_RETIRED.* */
INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD(0xd3, 0xf), /* MEM_LOAD_UOPS_L3_MISS_RETIRED.* */
/* Allow all events as PEBS with no flags */
INTEL_ALL_EVENT_CONSTRAINT(0, 0xf),
EVENT_CONSTRAINT_END
};
......@@ -864,6 +821,10 @@ static inline u64 intel_hsw_transaction(struct pebs_record_hsw *pebs)
static void __intel_pmu_pebs_event(struct perf_event *event,
struct pt_regs *iregs, void *__pebs)
{
#define PERF_X86_EVENT_PEBS_HSW_PREC \
(PERF_X86_EVENT_PEBS_ST_HSW | \
PERF_X86_EVENT_PEBS_LD_HSW | \
PERF_X86_EVENT_PEBS_NA_HSW)
/*
* We cast to the biggest pebs_record but are careful not to
* unconditionally access the 'extra' entries.
......@@ -873,42 +834,40 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
struct perf_sample_data data;
struct pt_regs regs;
u64 sample_type;
int fll, fst;
int fll, fst, dsrc;
int fl = event->hw.flags;
if (!intel_pmu_save_and_restart(event))
return;
fll = event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT;
fst = event->hw.flags & (PERF_X86_EVENT_PEBS_ST |
PERF_X86_EVENT_PEBS_ST_HSW);
sample_type = event->attr.sample_type;
dsrc = sample_type & PERF_SAMPLE_DATA_SRC;
fll = fl & PERF_X86_EVENT_PEBS_LDLAT;
fst = fl & (PERF_X86_EVENT_PEBS_ST | PERF_X86_EVENT_PEBS_HSW_PREC);
perf_sample_data_init(&data, 0, event->hw.last_period);
data.period = event->hw.last_period;
sample_type = event->attr.sample_type;
/*
* if PEBS-LL or PreciseStore
* Use latency for weight (only avail with PEBS-LL)
*/
if (fll || fst) {
/*
* Use latency for weight (only avail with PEBS-LL)
*/
if (fll && (sample_type & PERF_SAMPLE_WEIGHT))
data.weight = pebs->lat;
/*
* data.data_src encodes the data source
*/
if (sample_type & PERF_SAMPLE_DATA_SRC) {
if (fll)
data.data_src.val = load_latency_data(pebs->dse);
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
data.data_src.val =
precise_store_data_hsw(event, pebs->dse);
else
data.data_src.val = precise_store_data(pebs->dse);
}
if (fll && (sample_type & PERF_SAMPLE_WEIGHT))
data.weight = pebs->lat;
/*
* data.data_src encodes the data source
*/
if (dsrc) {
u64 val = PERF_MEM_NA;
if (fll)
val = load_latency_data(pebs->dse);
else if (fst && (fl & PERF_X86_EVENT_PEBS_HSW_PREC))
val = precise_datala_hsw(event, pebs->dse);
else if (fst)
val = precise_store_data(pebs->dse);
data.data_src.val = val;
}
/*
......@@ -935,16 +894,16 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
else
regs.flags &= ~PERF_EFLAGS_EXACT;
if ((event->attr.sample_type & PERF_SAMPLE_ADDR) &&
if ((sample_type & PERF_SAMPLE_ADDR) &&
x86_pmu.intel_cap.pebs_format >= 1)
data.addr = pebs->dla;
if (x86_pmu.intel_cap.pebs_format >= 2) {
/* Only set the TSX weight when no memory weight. */
if ((event->attr.sample_type & PERF_SAMPLE_WEIGHT) && !fll)
if ((sample_type & PERF_SAMPLE_WEIGHT) && !fll)
data.weight = intel_hsw_weight(pebs);
if (event->attr.sample_type & PERF_SAMPLE_TRANSACTION)
if (sample_type & PERF_SAMPLE_TRANSACTION)
data.txn = intel_hsw_transaction(pebs);
}
......@@ -1055,7 +1014,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
* BTS, PEBS probe and setup
*/
void intel_ds_init(void)
void __init intel_ds_init(void)
{
/*
* No support for 32bit formats
......
......@@ -697,7 +697,7 @@ static const int snb_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
};
/* core */
void intel_pmu_lbr_init_core(void)
void __init intel_pmu_lbr_init_core(void)
{
x86_pmu.lbr_nr = 4;
x86_pmu.lbr_tos = MSR_LBR_TOS;
......@@ -712,7 +712,7 @@ void intel_pmu_lbr_init_core(void)
}
/* nehalem/westmere */
void intel_pmu_lbr_init_nhm(void)
void __init intel_pmu_lbr_init_nhm(void)
{
x86_pmu.lbr_nr = 16;
x86_pmu.lbr_tos = MSR_LBR_TOS;
......@@ -733,7 +733,7 @@ void intel_pmu_lbr_init_nhm(void)
}
/* sandy bridge */
void intel_pmu_lbr_init_snb(void)
void __init intel_pmu_lbr_init_snb(void)
{
x86_pmu.lbr_nr = 16;
x86_pmu.lbr_tos = MSR_LBR_TOS;
......@@ -753,7 +753,7 @@ void intel_pmu_lbr_init_snb(void)
}
/* atom */
void intel_pmu_lbr_init_atom(void)
void __init intel_pmu_lbr_init_atom(void)
{
/*
* only models starting at stepping 10 seems
......
......@@ -24,395 +24,6 @@
#define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff)
/* SNB event control */
#define SNB_UNC_CTL_EV_SEL_MASK 0x000000ff
#define SNB_UNC_CTL_UMASK_MASK 0x0000ff00
#define SNB_UNC_CTL_EDGE_DET (1 << 18)
#define SNB_UNC_CTL_EN (1 << 22)
#define SNB_UNC_CTL_INVERT (1 << 23)
#define SNB_UNC_CTL_CMASK_MASK 0x1f000000
#define NHM_UNC_CTL_CMASK_MASK 0xff000000
#define NHM_UNC_FIXED_CTR_CTL_EN (1 << 0)
#define SNB_UNC_RAW_EVENT_MASK (SNB_UNC_CTL_EV_SEL_MASK | \
SNB_UNC_CTL_UMASK_MASK | \
SNB_UNC_CTL_EDGE_DET | \
SNB_UNC_CTL_INVERT | \
SNB_UNC_CTL_CMASK_MASK)
#define NHM_UNC_RAW_EVENT_MASK (SNB_UNC_CTL_EV_SEL_MASK | \
SNB_UNC_CTL_UMASK_MASK | \
SNB_UNC_CTL_EDGE_DET | \
SNB_UNC_CTL_INVERT | \
NHM_UNC_CTL_CMASK_MASK)
/* SNB global control register */
#define SNB_UNC_PERF_GLOBAL_CTL 0x391
#define SNB_UNC_FIXED_CTR_CTRL 0x394
#define SNB_UNC_FIXED_CTR 0x395
/* SNB uncore global control */
#define SNB_UNC_GLOBAL_CTL_CORE_ALL ((1 << 4) - 1)
#define SNB_UNC_GLOBAL_CTL_EN (1 << 29)
/* SNB Cbo register */
#define SNB_UNC_CBO_0_PERFEVTSEL0 0x700
#define SNB_UNC_CBO_0_PER_CTR0 0x706
#define SNB_UNC_CBO_MSR_OFFSET 0x10
/* NHM global control register */
#define NHM_UNC_PERF_GLOBAL_CTL 0x391
#define NHM_UNC_FIXED_CTR 0x394
#define NHM_UNC_FIXED_CTR_CTRL 0x395
/* NHM uncore global control */
#define NHM_UNC_GLOBAL_CTL_EN_PC_ALL ((1ULL << 8) - 1)
#define NHM_UNC_GLOBAL_CTL_EN_FC (1ULL << 32)
/* NHM uncore register */
#define NHM_UNC_PERFEVTSEL0 0x3c0
#define NHM_UNC_UNCORE_PMC0 0x3b0
/* SNB-EP Box level control */
#define SNBEP_PMON_BOX_CTL_RST_CTRL (1 << 0)
#define SNBEP_PMON_BOX_CTL_RST_CTRS (1 << 1)
#define SNBEP_PMON_BOX_CTL_FRZ (1 << 8)
#define SNBEP_PMON_BOX_CTL_FRZ_EN (1 << 16)
#define SNBEP_PMON_BOX_CTL_INT (SNBEP_PMON_BOX_CTL_RST_CTRL | \
SNBEP_PMON_BOX_CTL_RST_CTRS | \
SNBEP_PMON_BOX_CTL_FRZ_EN)
/* SNB-EP event control */
#define SNBEP_PMON_CTL_EV_SEL_MASK 0x000000ff
#define SNBEP_PMON_CTL_UMASK_MASK 0x0000ff00
#define SNBEP_PMON_CTL_RST (1 << 17)
#define SNBEP_PMON_CTL_EDGE_DET (1 << 18)
#define SNBEP_PMON_CTL_EV_SEL_EXT (1 << 21)
#define SNBEP_PMON_CTL_EN (1 << 22)
#define SNBEP_PMON_CTL_INVERT (1 << 23)
#define SNBEP_PMON_CTL_TRESH_MASK 0xff000000
#define SNBEP_PMON_RAW_EVENT_MASK (SNBEP_PMON_CTL_EV_SEL_MASK | \
SNBEP_PMON_CTL_UMASK_MASK | \
SNBEP_PMON_CTL_EDGE_DET | \
SNBEP_PMON_CTL_INVERT | \
SNBEP_PMON_CTL_TRESH_MASK)
/* SNB-EP Ubox event control */
#define SNBEP_U_MSR_PMON_CTL_TRESH_MASK 0x1f000000
#define SNBEP_U_MSR_PMON_RAW_EVENT_MASK \
(SNBEP_PMON_CTL_EV_SEL_MASK | \
SNBEP_PMON_CTL_UMASK_MASK | \
SNBEP_PMON_CTL_EDGE_DET | \
SNBEP_PMON_CTL_INVERT | \
SNBEP_U_MSR_PMON_CTL_TRESH_MASK)
#define SNBEP_CBO_PMON_CTL_TID_EN (1 << 19)
#define SNBEP_CBO_MSR_PMON_RAW_EVENT_MASK (SNBEP_PMON_RAW_EVENT_MASK | \
SNBEP_CBO_PMON_CTL_TID_EN)
/* SNB-EP PCU event control */
#define SNBEP_PCU_MSR_PMON_CTL_OCC_SEL_MASK 0x0000c000
#define SNBEP_PCU_MSR_PMON_CTL_TRESH_MASK 0x1f000000
#define SNBEP_PCU_MSR_PMON_CTL_OCC_INVERT (1 << 30)
#define SNBEP_PCU_MSR_PMON_CTL_OCC_EDGE_DET (1 << 31)
#define SNBEP_PCU_MSR_PMON_RAW_EVENT_MASK \
(SNBEP_PMON_CTL_EV_SEL_MASK | \
SNBEP_PCU_MSR_PMON_CTL_OCC_SEL_MASK | \
SNBEP_PMON_CTL_EDGE_DET | \
SNBEP_PMON_CTL_EV_SEL_EXT | \
SNBEP_PMON_CTL_INVERT | \
SNBEP_PCU_MSR_PMON_CTL_TRESH_MASK | \
SNBEP_PCU_MSR_PMON_CTL_OCC_INVERT | \
SNBEP_PCU_MSR_PMON_CTL_OCC_EDGE_DET)
#define SNBEP_QPI_PCI_PMON_RAW_EVENT_MASK \
(SNBEP_PMON_RAW_EVENT_MASK | \
SNBEP_PMON_CTL_EV_SEL_EXT)
/* SNB-EP pci control register */
#define SNBEP_PCI_PMON_BOX_CTL 0xf4
#define SNBEP_PCI_PMON_CTL0 0xd8
/* SNB-EP pci counter register */
#define SNBEP_PCI_PMON_CTR0 0xa0
/* SNB-EP home agent register */
#define SNBEP_HA_PCI_PMON_BOX_ADDRMATCH0 0x40
#define SNBEP_HA_PCI_PMON_BOX_ADDRMATCH1 0x44
#define SNBEP_HA_PCI_PMON_BOX_OPCODEMATCH 0x48
/* SNB-EP memory controller register */
#define SNBEP_MC_CHy_PCI_PMON_FIXED_CTL 0xf0
#define SNBEP_MC_CHy_PCI_PMON_FIXED_CTR 0xd0
/* SNB-EP QPI register */
#define SNBEP_Q_Py_PCI_PMON_PKT_MATCH0 0x228
#define SNBEP_Q_Py_PCI_PMON_PKT_MATCH1 0x22c
#define SNBEP_Q_Py_PCI_PMON_PKT_MASK0 0x238
#define SNBEP_Q_Py_PCI_PMON_PKT_MASK1 0x23c
/* SNB-EP Ubox register */
#define SNBEP_U_MSR_PMON_CTR0 0xc16
#define SNBEP_U_MSR_PMON_CTL0 0xc10
#define SNBEP_U_MSR_PMON_UCLK_FIXED_CTL 0xc08
#define SNBEP_U_MSR_PMON_UCLK_FIXED_CTR 0xc09
/* SNB-EP Cbo register */
#define SNBEP_C0_MSR_PMON_CTR0 0xd16
#define SNBEP_C0_MSR_PMON_CTL0 0xd10
#define SNBEP_C0_MSR_PMON_BOX_CTL 0xd04
#define SNBEP_C0_MSR_PMON_BOX_FILTER 0xd14
#define SNBEP_CBO_MSR_OFFSET 0x20
#define SNBEP_CB0_MSR_PMON_BOX_FILTER_TID 0x1f
#define SNBEP_CB0_MSR_PMON_BOX_FILTER_NID 0x3fc00
#define SNBEP_CB0_MSR_PMON_BOX_FILTER_STATE 0x7c0000
#define SNBEP_CB0_MSR_PMON_BOX_FILTER_OPC 0xff800000
#define SNBEP_CBO_EVENT_EXTRA_REG(e, m, i) { \
.event = (e), \
.msr = SNBEP_C0_MSR_PMON_BOX_FILTER, \
.config_mask = (m), \
.idx = (i) \
}
/* SNB-EP PCU register */
#define SNBEP_PCU_MSR_PMON_CTR0 0xc36
#define SNBEP_PCU_MSR_PMON_CTL0 0xc30
#define SNBEP_PCU_MSR_PMON_BOX_CTL 0xc24
#define SNBEP_PCU_MSR_PMON_BOX_FILTER 0xc34
#define SNBEP_PCU_MSR_PMON_BOX_FILTER_MASK 0xffffffff
#define SNBEP_PCU_MSR_CORE_C3_CTR 0x3fc
#define SNBEP_PCU_MSR_CORE_C6_CTR 0x3fd
/* IVT event control */
#define IVT_PMON_BOX_CTL_INT (SNBEP_PMON_BOX_CTL_RST_CTRL | \
SNBEP_PMON_BOX_CTL_RST_CTRS)
#define IVT_PMON_RAW_EVENT_MASK (SNBEP_PMON_CTL_EV_SEL_MASK | \
SNBEP_PMON_CTL_UMASK_MASK | \
SNBEP_PMON_CTL_EDGE_DET | \
SNBEP_PMON_CTL_TRESH_MASK)
/* IVT Ubox */
#define IVT_U_MSR_PMON_GLOBAL_CTL 0xc00
#define IVT_U_PMON_GLOBAL_FRZ_ALL (1 << 31)
#define IVT_U_PMON_GLOBAL_UNFRZ_ALL (1 << 29)
#define IVT_U_MSR_PMON_RAW_EVENT_MASK \
(SNBEP_PMON_CTL_EV_SEL_MASK | \
SNBEP_PMON_CTL_UMASK_MASK | \
SNBEP_PMON_CTL_EDGE_DET | \
SNBEP_U_MSR_PMON_CTL_TRESH_MASK)
/* IVT Cbo */
#define IVT_CBO_MSR_PMON_RAW_EVENT_MASK (IVT_PMON_RAW_EVENT_MASK | \
SNBEP_CBO_PMON_CTL_TID_EN)
#define IVT_CB0_MSR_PMON_BOX_FILTER_TID (0x1fULL << 0)
#define IVT_CB0_MSR_PMON_BOX_FILTER_LINK (0xfULL << 5)
#define IVT_CB0_MSR_PMON_BOX_FILTER_STATE (0x3fULL << 17)
#define IVT_CB0_MSR_PMON_BOX_FILTER_NID (0xffffULL << 32)
#define IVT_CB0_MSR_PMON_BOX_FILTER_OPC (0x1ffULL << 52)
#define IVT_CB0_MSR_PMON_BOX_FILTER_C6 (0x1ULL << 61)
#define IVT_CB0_MSR_PMON_BOX_FILTER_NC (0x1ULL << 62)
#define IVT_CB0_MSR_PMON_BOX_FILTER_IOSC (0x1ULL << 63)
/* IVT home agent */
#define IVT_HA_PCI_PMON_CTL_Q_OCC_RST (1 << 16)
#define IVT_HA_PCI_PMON_RAW_EVENT_MASK \
(IVT_PMON_RAW_EVENT_MASK | \
IVT_HA_PCI_PMON_CTL_Q_OCC_RST)
/* IVT PCU */
#define IVT_PCU_MSR_PMON_RAW_EVENT_MASK \
(SNBEP_PMON_CTL_EV_SEL_MASK | \
SNBEP_PMON_CTL_EV_SEL_EXT | \
SNBEP_PCU_MSR_PMON_CTL_OCC_SEL_MASK | \
SNBEP_PMON_CTL_EDGE_DET | \
SNBEP_PCU_MSR_PMON_CTL_TRESH_MASK | \
SNBEP_PCU_MSR_PMON_CTL_OCC_INVERT | \
SNBEP_PCU_MSR_PMON_CTL_OCC_EDGE_DET)
/* IVT QPI */
#define IVT_QPI_PCI_PMON_RAW_EVENT_MASK \
(IVT_PMON_RAW_EVENT_MASK | \
SNBEP_PMON_CTL_EV_SEL_EXT)
/* NHM-EX event control */
#define NHMEX_PMON_CTL_EV_SEL_MASK 0x000000ff
#define NHMEX_PMON_CTL_UMASK_MASK 0x0000ff00
#define NHMEX_PMON_CTL_EN_BIT0 (1 << 0)
#define NHMEX_PMON_CTL_EDGE_DET (1 << 18)
#define NHMEX_PMON_CTL_PMI_EN (1 << 20)
#define NHMEX_PMON_CTL_EN_BIT22 (1 << 22)
#define NHMEX_PMON_CTL_INVERT (1 << 23)
#define NHMEX_PMON_CTL_TRESH_MASK 0xff000000
#define NHMEX_PMON_RAW_EVENT_MASK (NHMEX_PMON_CTL_EV_SEL_MASK | \
NHMEX_PMON_CTL_UMASK_MASK | \
NHMEX_PMON_CTL_EDGE_DET | \
NHMEX_PMON_CTL_INVERT | \
NHMEX_PMON_CTL_TRESH_MASK)
/* NHM-EX Ubox */
#define NHMEX_U_MSR_PMON_GLOBAL_CTL 0xc00
#define NHMEX_U_MSR_PMON_CTR 0xc11
#define NHMEX_U_MSR_PMON_EV_SEL 0xc10
#define NHMEX_U_PMON_GLOBAL_EN (1 << 0)
#define NHMEX_U_PMON_GLOBAL_PMI_CORE_SEL 0x0000001e
#define NHMEX_U_PMON_GLOBAL_EN_ALL (1 << 28)
#define NHMEX_U_PMON_GLOBAL_RST_ALL (1 << 29)
#define NHMEX_U_PMON_GLOBAL_FRZ_ALL (1 << 31)
#define NHMEX_U_PMON_RAW_EVENT_MASK \
(NHMEX_PMON_CTL_EV_SEL_MASK | \
NHMEX_PMON_CTL_EDGE_DET)
/* NHM-EX Cbox */
#define NHMEX_C0_MSR_PMON_GLOBAL_CTL 0xd00
#define NHMEX_C0_MSR_PMON_CTR0 0xd11
#define NHMEX_C0_MSR_PMON_EV_SEL0 0xd10
#define NHMEX_C_MSR_OFFSET 0x20
/* NHM-EX Bbox */
#define NHMEX_B0_MSR_PMON_GLOBAL_CTL 0xc20
#define NHMEX_B0_MSR_PMON_CTR0 0xc31
#define NHMEX_B0_MSR_PMON_CTL0 0xc30
#define NHMEX_B_MSR_OFFSET 0x40
#define NHMEX_B0_MSR_MATCH 0xe45
#define NHMEX_B0_MSR_MASK 0xe46
#define NHMEX_B1_MSR_MATCH 0xe4d
#define NHMEX_B1_MSR_MASK 0xe4e
#define NHMEX_B_PMON_CTL_EN (1 << 0)
#define NHMEX_B_PMON_CTL_EV_SEL_SHIFT 1
#define NHMEX_B_PMON_CTL_EV_SEL_MASK \
(0x1f << NHMEX_B_PMON_CTL_EV_SEL_SHIFT)
#define NHMEX_B_PMON_CTR_SHIFT 6
#define NHMEX_B_PMON_CTR_MASK \
(0x3 << NHMEX_B_PMON_CTR_SHIFT)
#define NHMEX_B_PMON_RAW_EVENT_MASK \
(NHMEX_B_PMON_CTL_EV_SEL_MASK | \
NHMEX_B_PMON_CTR_MASK)
/* NHM-EX Sbox */
#define NHMEX_S0_MSR_PMON_GLOBAL_CTL 0xc40
#define NHMEX_S0_MSR_PMON_CTR0 0xc51
#define NHMEX_S0_MSR_PMON_CTL0 0xc50
#define NHMEX_S_MSR_OFFSET 0x80
#define NHMEX_S0_MSR_MM_CFG 0xe48
#define NHMEX_S0_MSR_MATCH 0xe49
#define NHMEX_S0_MSR_MASK 0xe4a
#define NHMEX_S1_MSR_MM_CFG 0xe58
#define NHMEX_S1_MSR_MATCH 0xe59
#define NHMEX_S1_MSR_MASK 0xe5a
#define NHMEX_S_PMON_MM_CFG_EN (0x1ULL << 63)
#define NHMEX_S_EVENT_TO_R_PROG_EV 0
/* NHM-EX Mbox */
#define NHMEX_M0_MSR_GLOBAL_CTL 0xca0
#define NHMEX_M0_MSR_PMU_DSP 0xca5
#define NHMEX_M0_MSR_PMU_ISS 0xca6
#define NHMEX_M0_MSR_PMU_MAP 0xca7
#define NHMEX_M0_MSR_PMU_MSC_THR 0xca8
#define NHMEX_M0_MSR_PMU_PGT 0xca9
#define NHMEX_M0_MSR_PMU_PLD 0xcaa
#define NHMEX_M0_MSR_PMU_ZDP_CTL_FVC 0xcab
#define NHMEX_M0_MSR_PMU_CTL0 0xcb0
#define NHMEX_M0_MSR_PMU_CNT0 0xcb1
#define NHMEX_M_MSR_OFFSET 0x40
#define NHMEX_M0_MSR_PMU_MM_CFG 0xe54
#define NHMEX_M1_MSR_PMU_MM_CFG 0xe5c
#define NHMEX_M_PMON_MM_CFG_EN (1ULL << 63)
#define NHMEX_M_PMON_ADDR_MATCH_MASK 0x3ffffffffULL
#define NHMEX_M_PMON_ADDR_MASK_MASK 0x7ffffffULL
#define NHMEX_M_PMON_ADDR_MASK_SHIFT 34
#define NHMEX_M_PMON_CTL_EN (1 << 0)
#define NHMEX_M_PMON_CTL_PMI_EN (1 << 1)
#define NHMEX_M_PMON_CTL_COUNT_MODE_SHIFT 2
#define NHMEX_M_PMON_CTL_COUNT_MODE_MASK \
(0x3 << NHMEX_M_PMON_CTL_COUNT_MODE_SHIFT)
#define NHMEX_M_PMON_CTL_STORAGE_MODE_SHIFT 4
#define NHMEX_M_PMON_CTL_STORAGE_MODE_MASK \
(0x3 << NHMEX_M_PMON_CTL_STORAGE_MODE_SHIFT)
#define NHMEX_M_PMON_CTL_WRAP_MODE (1 << 6)
#define NHMEX_M_PMON_CTL_FLAG_MODE (1 << 7)
#define NHMEX_M_PMON_CTL_INC_SEL_SHIFT 9
#define NHMEX_M_PMON_CTL_INC_SEL_MASK \
(0x1f << NHMEX_M_PMON_CTL_INC_SEL_SHIFT)
#define NHMEX_M_PMON_CTL_SET_FLAG_SEL_SHIFT 19
#define NHMEX_M_PMON_CTL_SET_FLAG_SEL_MASK \
(0x7 << NHMEX_M_PMON_CTL_SET_FLAG_SEL_SHIFT)
#define NHMEX_M_PMON_RAW_EVENT_MASK \
(NHMEX_M_PMON_CTL_COUNT_MODE_MASK | \
NHMEX_M_PMON_CTL_STORAGE_MODE_MASK | \
NHMEX_M_PMON_CTL_WRAP_MODE | \
NHMEX_M_PMON_CTL_FLAG_MODE | \
NHMEX_M_PMON_CTL_INC_SEL_MASK | \
NHMEX_M_PMON_CTL_SET_FLAG_SEL_MASK)
#define NHMEX_M_PMON_ZDP_CTL_FVC_MASK (((1 << 11) - 1) | (1 << 23))
#define NHMEX_M_PMON_ZDP_CTL_FVC_EVENT_MASK(n) (0x7ULL << (11 + 3 * (n)))
#define WSMEX_M_PMON_ZDP_CTL_FVC_MASK (((1 << 12) - 1) | (1 << 24))
#define WSMEX_M_PMON_ZDP_CTL_FVC_EVENT_MASK(n) (0x7ULL << (12 + 3 * (n)))
/*
* use the 9~13 bits to select event If the 7th bit is not set,
* otherwise use the 19~21 bits to select event.
*/
#define MBOX_INC_SEL(x) ((x) << NHMEX_M_PMON_CTL_INC_SEL_SHIFT)
#define MBOX_SET_FLAG_SEL(x) (((x) << NHMEX_M_PMON_CTL_SET_FLAG_SEL_SHIFT) | \
NHMEX_M_PMON_CTL_FLAG_MODE)
#define MBOX_INC_SEL_MASK (NHMEX_M_PMON_CTL_INC_SEL_MASK | \
NHMEX_M_PMON_CTL_FLAG_MODE)
#define MBOX_SET_FLAG_SEL_MASK (NHMEX_M_PMON_CTL_SET_FLAG_SEL_MASK | \
NHMEX_M_PMON_CTL_FLAG_MODE)
#define MBOX_INC_SEL_EXTAR_REG(c, r) \
EVENT_EXTRA_REG(MBOX_INC_SEL(c), NHMEX_M0_MSR_PMU_##r, \
MBOX_INC_SEL_MASK, (u64)-1, NHMEX_M_##r)
#define MBOX_SET_FLAG_SEL_EXTRA_REG(c, r) \
EVENT_EXTRA_REG(MBOX_SET_FLAG_SEL(c), NHMEX_M0_MSR_PMU_##r, \
MBOX_SET_FLAG_SEL_MASK, \
(u64)-1, NHMEX_M_##r)
/* NHM-EX Rbox */
#define NHMEX_R_MSR_GLOBAL_CTL 0xe00
#define NHMEX_R_MSR_PMON_CTL0 0xe10
#define NHMEX_R_MSR_PMON_CNT0 0xe11
#define NHMEX_R_MSR_OFFSET 0x20
#define NHMEX_R_MSR_PORTN_QLX_CFG(n) \
((n) < 4 ? (0xe0c + (n)) : (0xe2c + (n) - 4))
#define NHMEX_R_MSR_PORTN_IPERF_CFG0(n) (0xe04 + (n))
#define NHMEX_R_MSR_PORTN_IPERF_CFG1(n) (0xe24 + (n))
#define NHMEX_R_MSR_PORTN_XBR_OFFSET(n) \
(((n) < 4 ? 0 : 0x10) + (n) * 4)
#define NHMEX_R_MSR_PORTN_XBR_SET1_MM_CFG(n) \
(0xe60 + NHMEX_R_MSR_PORTN_XBR_OFFSET(n))
#define NHMEX_R_MSR_PORTN_XBR_SET1_MATCH(n) \
(NHMEX_R_MSR_PORTN_XBR_SET1_MM_CFG(n) + 1)
#define NHMEX_R_MSR_PORTN_XBR_SET1_MASK(n) \
(NHMEX_R_MSR_PORTN_XBR_SET1_MM_CFG(n) + 2)
#define NHMEX_R_MSR_PORTN_XBR_SET2_MM_CFG(n) \
(0xe70 + NHMEX_R_MSR_PORTN_XBR_OFFSET(n))
#define NHMEX_R_MSR_PORTN_XBR_SET2_MATCH(n) \
(NHMEX_R_MSR_PORTN_XBR_SET2_MM_CFG(n) + 1)
#define NHMEX_R_MSR_PORTN_XBR_SET2_MASK(n) \
(NHMEX_R_MSR_PORTN_XBR_SET2_MM_CFG(n) + 2)
#define NHMEX_R_PMON_CTL_EN (1 << 0)
#define NHMEX_R_PMON_CTL_EV_SEL_SHIFT 1
#define NHMEX_R_PMON_CTL_EV_SEL_MASK \
(0x1f << NHMEX_R_PMON_CTL_EV_SEL_SHIFT)
#define NHMEX_R_PMON_CTL_PMI_EN (1 << 6)
#define NHMEX_R_PMON_RAW_EVENT_MASK NHMEX_R_PMON_CTL_EV_SEL_MASK
/* NHM-EX Wbox */
#define NHMEX_W_MSR_GLOBAL_CTL 0xc80
#define NHMEX_W_MSR_PMON_CNT0 0xc90
#define NHMEX_W_MSR_PMON_EVT_SEL0 0xc91
#define NHMEX_W_MSR_PMON_FIXED_CTR 0x394
#define NHMEX_W_MSR_PMON_FIXED_CTL 0x395
#define NHMEX_W_PMON_GLOBAL_FIXED_EN (1ULL << 31)
struct intel_uncore_ops;
struct intel_uncore_pmu;
struct intel_uncore_box;
......@@ -505,6 +116,9 @@ struct uncore_event_desc {
const char *config;
};
ssize_t uncore_event_show(struct kobject *kobj,
struct kobj_attribute *attr, char *buf);
#define INTEL_UNCORE_EVENT_DESC(_name, _config) \
{ \
.attr = __ATTR(_name, 0444, uncore_event_show, NULL), \
......@@ -522,15 +136,6 @@ static ssize_t __uncore_##_var##_show(struct kobject *kobj, \
static struct kobj_attribute format_attr_##_var = \
__ATTR(_name, 0444, __uncore_##_var##_show, NULL)
static ssize_t uncore_event_show(struct kobject *kobj,
struct kobj_attribute *attr, char *buf)
{
struct uncore_event_desc *event =
container_of(attr, struct uncore_event_desc, attr);
return sprintf(buf, "%s", event->config);
}
static inline unsigned uncore_pci_box_ctl(struct intel_uncore_box *box)
{
return box->pmu->type->box_ctl;
......@@ -694,3 +299,41 @@ static inline bool uncore_box_is_fake(struct intel_uncore_box *box)
{
return (box->phys_id < 0);
}
struct intel_uncore_pmu *uncore_event_to_pmu(struct perf_event *event);
struct intel_uncore_box *uncore_pmu_to_box(struct intel_uncore_pmu *pmu, int cpu);
struct intel_uncore_box *uncore_event_to_box(struct perf_event *event);
u64 uncore_msr_read_counter(struct intel_uncore_box *box, struct perf_event *event);
void uncore_pmu_start_hrtimer(struct intel_uncore_box *box);
void uncore_pmu_cancel_hrtimer(struct intel_uncore_box *box);
void uncore_pmu_event_read(struct perf_event *event);
void uncore_perf_event_update(struct intel_uncore_box *box, struct perf_event *event);
struct event_constraint *
uncore_get_constraint(struct intel_uncore_box *box, struct perf_event *event);
void uncore_put_constraint(struct intel_uncore_box *box, struct perf_event *event);
u64 uncore_shared_reg_config(struct intel_uncore_box *box, int idx);
extern struct intel_uncore_type **uncore_msr_uncores;
extern struct intel_uncore_type **uncore_pci_uncores;
extern struct pci_driver *uncore_pci_driver;
extern int uncore_pcibus_to_physid[256];
extern struct pci_dev *uncore_extra_pci_dev[UNCORE_SOCKET_MAX][UNCORE_EXTRA_PCI_DEV_MAX];
extern struct event_constraint uncore_constraint_empty;
/* perf_event_intel_uncore_snb.c */
int snb_uncore_pci_init(void);
int ivb_uncore_pci_init(void);
int hsw_uncore_pci_init(void);
void snb_uncore_cpu_init(void);
void nhm_uncore_cpu_init(void);
/* perf_event_intel_uncore_snbep.c */
int snbep_uncore_pci_init(void);
void snbep_uncore_cpu_init(void);
int ivbep_uncore_pci_init(void);
void ivbep_uncore_cpu_init(void);
int hswep_uncore_pci_init(void);
void hswep_uncore_cpu_init(void);
/* perf_event_intel_uncore_nhmex.c */
void nhmex_uncore_cpu_init(void);
......@@ -8,28 +8,28 @@
* Copyright (C) 2011-2012 Peter Zijlstra <pzijlstr@redhat.com>
*
* Jump labels provide an interface to generate dynamic branches using
* self-modifying code. Assuming toolchain and architecture support the result
* of a "if (static_key_false(&key))" statement is a unconditional branch (which
* self-modifying code. Assuming toolchain and architecture support, the result
* of a "if (static_key_false(&key))" statement is an unconditional branch (which
* defaults to false - and the true block is placed out of line).
*
* However at runtime we can change the branch target using
* static_key_slow_{inc,dec}(). These function as a 'reference' count on the key
* object and for as long as there are references all branches referring to
* object, and for as long as there are references all branches referring to
* that particular key will point to the (out of line) true block.
*
* Since this relies on modifying code the static_key_slow_{inc,dec}() functions
* Since this relies on modifying code, the static_key_slow_{inc,dec}() functions
* must be considered absolute slow paths (machine wide synchronization etc.).
* OTOH, since the affected branches are unconditional their runtime overhead
* OTOH, since the affected branches are unconditional, their runtime overhead
* will be absolutely minimal, esp. in the default (off) case where the total
* effect is a single NOP of appropriate size. The on case will patch in a jump
* to the out-of-line block.
*
* When the control is directly exposed to userspace it is prudent to delay the
* When the control is directly exposed to userspace, it is prudent to delay the
* decrement to avoid high frequency code modifications which can (and do)
* cause significant performance degradation. Struct static_key_deferred and
* static_key_slow_dec_deferred() provide for this.
*
* Lacking toolchain and or architecture support, it falls back to a simple
* Lacking toolchain and or architecture support, jump labels fall back to a simple
* conditional branch.
*
* struct static_key my_key = STATIC_KEY_INIT_TRUE;
......@@ -43,8 +43,7 @@
*
* Not initializing the key (static data is initialized to 0s anyway) is the
* same as using STATIC_KEY_INIT_FALSE.
*
*/
*/
#include <linux/types.h>
#include <linux/compiler.h>
......
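The jump label comment above describes the API at a high level; the following minimal sketch (hypothetical kernel-style code, not part of this diff, using only the static_key_false()/static_key_slow_{inc,dec}() interfaces that comment documents) shows the usual pattern of a default-false key guarding a rarely enabled slow path:

```c
#include <linux/jump_label.h>
#include <linux/types.h>

/* Default-off key: the branch below is a NOP until the key is enabled. */
static struct static_key tracing_enabled = STATIC_KEY_INIT_FALSE;

static void do_expensive_tracing(void)
{
	/* placeholder for the rarely taken, out-of-line slow path */
}

static void hot_path(void)
{
	if (static_key_false(&tracing_enabled))
		do_expensive_tracing();		/* true block placed out of line */
	/* ... fast path continues with no conditional-branch overhead ... */
}

/* Control path, e.g. a sysfs toggle; this is the machine-wide slow path. */
static void set_tracing(bool on)
{
	if (on)
		static_key_slow_inc(&tracing_enabled);	/* patch in the jump */
	else
		static_key_slow_dec(&tracing_enabled);	/* restore the NOP */
}
```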
......@@ -2538,6 +2538,7 @@
#define PCI_DEVICE_ID_INTEL_EESSC 0x0008
#define PCI_DEVICE_ID_INTEL_SNB_IMC 0x0100
#define PCI_DEVICE_ID_INTEL_IVB_IMC 0x0154
#define PCI_DEVICE_ID_INTEL_IVB_E3_IMC 0x0150
#define PCI_DEVICE_ID_INTEL_HSW_IMC 0x0c00
#define PCI_DEVICE_ID_INTEL_PXHD_0 0x0320
#define PCI_DEVICE_ID_INTEL_PXHD_1 0x0321
......
......@@ -52,6 +52,7 @@ struct perf_guest_info_callbacks {
#include <linux/atomic.h>
#include <linux/sysfs.h>
#include <linux/perf_regs.h>
#include <linux/workqueue.h>
#include <asm/local.h>
struct perf_callchain_entry {
......@@ -268,6 +269,7 @@ struct pmu {
* enum perf_event_active_state - the states of a event
*/
enum perf_event_active_state {
PERF_EVENT_STATE_EXIT = -3,
PERF_EVENT_STATE_ERROR = -2,
PERF_EVENT_STATE_OFF = -1,
PERF_EVENT_STATE_INACTIVE = 0,
......@@ -507,6 +509,9 @@ struct perf_event_context {
int nr_cgroups; /* cgroup evts */
int nr_branch_stack; /* branch_stack evt */
struct rcu_head rcu_head;
struct delayed_work orphans_remove;
bool orphans_remove_sched;
};
/*
......@@ -604,6 +609,13 @@ struct perf_sample_data {
u64 txn;
};
/* default value for data source */
#define PERF_MEM_NA (PERF_MEM_S(OP, NA) |\
PERF_MEM_S(LVL, NA) |\
PERF_MEM_S(SNOOP, NA) |\
PERF_MEM_S(LOCK, NA) |\
PERF_MEM_S(TLB, NA))
static inline void perf_sample_data_init(struct perf_sample_data *data,
u64 addr, u64 period)
{
......@@ -616,7 +628,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
data->regs_user.regs = NULL;
data->stack_user_size = 0;
data->weight = 0;
data->data_src.val = 0;
data->data_src.val = PERF_MEM_NA;
data->txn = 0;
}
......
......@@ -52,7 +52,7 @@ static void release_callchain_buffers(void)
struct callchain_cpus_entries *entries;
entries = callchain_cpus_entries;
rcu_assign_pointer(callchain_cpus_entries, NULL);
RCU_INIT_POINTER(callchain_cpus_entries, NULL);
call_rcu(&entries->rcu_head, release_callchain_buffers_rcu);
}
......
......@@ -10,9 +10,14 @@ LIB_OBJS=
LIB_H += fs/debugfs.h
LIB_H += fs/fs.h
# See comment below about piggybacking...
LIB_H += fd/array.h
LIB_OBJS += $(OUTPUT)fs/debugfs.o
LIB_OBJS += $(OUTPUT)fs/fs.o
# XXX piggybacking here, need to introduce libapikfd, or rename this
# to plain libapik.a and make it have it all api goodies
LIB_OBJS += $(OUTPUT)fd/array.o
LIBFILE = libapikfs.a
......@@ -29,7 +34,7 @@ $(LIBFILE): $(LIB_OBJS)
$(LIB_OBJS): $(LIB_H)
libapi_dirs:
$(QUIET_MKDIR)mkdir -p $(OUTPUT)fs/
$(QUIET_MKDIR)mkdir -p $(OUTPUT)fd $(OUTPUT)fs
$(OUTPUT)%.o: %.c libapi_dirs
$(QUIET_CC)$(CC) -o $@ -c $(ALL_CFLAGS) $<
......
#ifndef __API_FD_ARRAY__
#define __API_FD_ARRAY__
#include <stdio.h>
struct pollfd;
/**
* struct fdarray: Array of file descriptors
*
* @priv: Per array entry priv area, users should access just its contents,
* not set it to anything, as it is kept in synch with @entries, being
* realloc'ed, * for instance, in fdarray__{grow,filter}.
*
* I.e. using 'fda->priv[N].idx = * value' where N < fda->nr is ok,
* but doing 'fda->priv = malloc(M)' is not allowed.
*/
struct fdarray {
int nr;
int nr_alloc;
int nr_autogrow;
struct pollfd *entries;
union {
int idx;
} *priv;
};
void fdarray__init(struct fdarray *fda, int nr_autogrow);
void fdarray__exit(struct fdarray *fda);
struct fdarray *fdarray__new(int nr_alloc, int nr_autogrow);
void fdarray__delete(struct fdarray *fda);
int fdarray__add(struct fdarray *fda, int fd, short revents);
int fdarray__poll(struct fdarray *fda, int timeout);
int fdarray__filter(struct fdarray *fda, short revents,
void (*entry_destructor)(struct fdarray *fda, int fd));
int fdarray__grow(struct fdarray *fda, int extra);
int fdarray__fprintf(struct fdarray *fda, FILE *fp);
static inline int fdarray__available_entries(struct fdarray *fda)
{
return fda->nr_alloc - fda->nr;
}
#endif /* __API_FD_ARRAY__ */
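As a rough illustration of how the tools are expected to drive this API (a sketch built only from the declarations above; the include path, descriptor choice, and timeout are assumptions made for the example):

```c
#include <poll.h>
#include <stdio.h>
#include <unistd.h>
#include "fd/array.h"	/* path depends on how libapikfs is consumed */

int main(void)
{
	/* Start with room for 2 entries, growing by 2 when the array fills up. */
	struct fdarray *fda = fdarray__new(2, 2);
	if (fda == NULL)
		return 1;

	/* Monitor stdin for input and for hangup. */
	fdarray__add(fda, STDIN_FILENO, POLLIN | POLLHUP);

	if (fdarray__poll(fda, 1000 /* ms */) > 0) {
		/* Drop entries whose descriptor hung up; no destructor needed here. */
		int left = fdarray__filter(fda, POLLHUP, NULL);
		printf("%d descriptor(s) left in the array\n", left);
	}

	fdarray__fprintf(fda, stdout);
	fdarray__delete(fda);
	return 0;
}
```

Filtering on POLLHUP after the poll() returns is the scheme the merge message describes for dropping event file descriptors whose other end has gone away.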
......@@ -15,6 +15,7 @@ perf.data
perf.data.old
output.svg
perf-archive
perf-with-kcore
tags
TAGS
cscope*
......
......@@ -104,6 +104,9 @@ OPTIONS
Specify path to the executable or shared library file for user
space tracing. Can also be used with --funcs option.
--demangle-kernel::
Demangle kernel symbols.
In absence of -m/-x options, perf probe checks if the first argument after
the options is an absolute path name. If its an absolute path, perf probe
uses it as a target module/target user space binary to probe.
......
......@@ -147,7 +147,7 @@ OPTIONS
-w::
--column-widths=<width[,width...]>::
Force each column width to the provided list, for large terminal
readability.
readability. 0 means no limit (default behavior).
-t::
--field-separator=::
......@@ -276,6 +276,9 @@ OPTIONS
Demangle symbol names to human readable form. It's enabled by default,
disable with --no-demangle.
--demangle-kernel::
Demangle kernel symbol names to human readable form (for C++ kernels).
--mem-mode::
Use the data addresses of samples in addition to instruction addresses
to build the histograms. To generate meaningful output, the perf.data
......
......@@ -98,6 +98,9 @@ Default is to monitor all CPUS.
--hide_user_symbols::
Hide user symbols.
--demangle-kernel::
Demangle kernel symbols.
-D::
--dump-symtab::
Dump the symbol table used for profiling.
......@@ -193,6 +196,12 @@ Default is to monitor all CPUS.
sum of shown entries will be always 100%. "absolute" means it retains
the original value before and after the filter is applied.
-w::
--column-widths=<width[,width...]>::
Force each column width to the provided list, for large terminal
readability. 0 means no limit (default behavior).
INTERACTIVE PROMPTING KEYS
--------------------------
......
......@@ -3,6 +3,7 @@
#include "thread.h"
#include "map.h"
#include "event.h"
#include "debug.h"
#include "tests/tests.h"
#define STACK_SIZE 8192
......
......@@ -3,6 +3,7 @@
#include <libunwind.h>
#include "perf_regs.h"
#include "../../util/unwind.h"
#include "../../util/debug.h"
int libunwind__arch_reg_id(int regnum)
{
......
......@@ -3,6 +3,7 @@
#include <libunwind.h>
#include "perf_regs.h"
#include "../../util/unwind.h"
#include "../../util/debug.h"
int libunwind__arch_reg_id(int regnum)
{
......