1. 26 1月, 2016 25 次提交
  2. 22 1月, 2016 15 次提交
    • A
      perf: Synchronously free aux pages in case of allocation failure · 45c815f0
      Alexander Shishkin 提交于
      We are currently using asynchronous deallocation in the error path in
      AUX mmap code, which is unnecessary and also presents a problem for users
      that wish to probe for the biggest possible buffer size they can get:
      they'll get -EINVAL on all subsequent attemts to allocate a smaller
      buffer before the asynchronous deallocation callback frees up the pages
      from the previous unsuccessful attempt.
      
      Currently, gdb does that for allocating AUX buffers for Intel PT traces.
      More specifically, overwrite mode of AUX pmus that don't support hardware
      sg (some implementations of Intel PT, for instance) is limited to only
      one contiguous high order allocation for its buffer and there is no way
      of knowing its size without trying.
      
      This patch changes error path freeing to be synchronous as there won't
      be any contenders for the AUX pages at that point.
      Reported-by: NMarkus Metzger <markus.t.metzger@intel.com>
      Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/1453216469-9509-1-git-send-email-alexander.shishkin@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      45c815f0
    • S
      perf/x86: add Intel SkyLake uncore IMC PMU support · 0e1eb0a1
      Stephane Eranian 提交于
      This patch enables the uncore_imc PMU for Intel
      SkyLake Desktop processors (Core i7-6700, model 94).
      
      It is possible to compute memory read/write bandwidth
      using:
      
        $ perf stat -a -e uncore_imc/data_reads/,uncore_imc/data_writes/ ....
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1452151546-8853-1-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0e1eb0a1
    • P
      perf: Fix perf_event_exit_task() race · 63b6da39
      Peter Zijlstra 提交于
      There is a race against perf_event_exit_task() vs
      event_function_call(),find_get_context(),perf_install_in_context()
      (iow, everyone).
      
      Since there is no permanent marker on a context that its dead, it is
      quite possible that we access (and even modify) a context after its
      passed through perf_event_exit_task().
      
      For instance, find_get_context() might find the context still
      installed, but by the time we get to perf_install_in_context() it
      might already have passed through perf_event_exit_task() and be
      considered dead, we will however still add the event to it.
      
      Solve this by marking a ctx dead by setting its ctx->task value to -1,
      it must be !0 so we still know its a (former) task context.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      63b6da39
    • P
      perf: Add more assertions · c97f4736
      Peter Zijlstra 提交于
      Try to trigger warnings before races do damage.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c97f4736
    • P
      perf: Collapse and fix event_function_call() users · fae3fde6
      Peter Zijlstra 提交于
      There is one common bug left in all the event_function_call() users,
      between loading ctx->task and getting to the remote_function(),
      ctx->task can already have been changed.
      
      Therefore we need to double check and retry if ctx->task != current.
      
      Insert another trampoline specific to event_function_call() that
      checks for this and further validates state. This also allows getting
      rid of the active/inactive functions.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      fae3fde6
    • P
      perf: Specialize perf_event_exit_task() · 32132a3d
      Peter Zijlstra 提交于
      The perf_remove_from_context() usage in __perf_event_exit_task() is
      different from the other usages in that this site has already
      detached and scheduled out the task context.
      
      This will stand in the way of stronger assertions checking the (task)
      context scheduling invariants.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      32132a3d
    • P
      perf: Fix task context scheduling · 39a43640
      Peter Zijlstra 提交于
      There is a very nasty problem wrt disabling the perf task scheduling
      hooks.
      
      Currently we {set,clear} ctx->is_active on every
      __perf_event_task_sched_{in,out}, _however_ this means that if we
      disable these calls we'll have task contexts with ->is_active set that
      are not active and 'active' task contexts without ->is_active set.
      
      This can result in event_function_call() looping on the ctx->is_active
      condition basically indefinitely.
      
      Resolve this by changing things such that contexts without events do
      not set ->is_active like we used to. From this invariant it trivially
      follows that if there are no (task) events, every task ctx is inactive
      and disabling the context switch hooks is harmless.
      
      This leaves two places that need attention (and already had
      accumulated weird and wonderful hacks to work around, without
      recognising this actual problem).
      
      Namely:
      
       - perf_install_in_context() will need to deal with installing events
         in an inactive context, meaning it cannot rely on ctx-is_active for
         its IPIs.
      
       - perf_remove_from_context() will have to mark a context as inactive
         when it removes the last event.
      
      For specific detail, see the patch/comments.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      39a43640
    • P
      perf: Make ctx->is_active and cpuctx->task_ctx consistent · 63e30d3e
      Peter Zijlstra 提交于
      For no apparent reason and to great confusion the rules for
      ctx->is_active and cpuctx->task_ctx are different. This means that its
      not always possible to find all active (task) contexts.
      
      Fix this such that if ctx->is_active gets set, we also set (or verify)
      cpuctx->task_ctx.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      63e30d3e
    • P
      perf: Optimize perf_sched_events() usage · 25432ae9
      Peter Zijlstra 提交于
      It doesn't make sense to take up-to _4_ references on
      perf_sched_events() per event, avoid doing this.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      25432ae9
    • P
      perf: Simplify/fix perf_event_enable() event scheduling · aee7dbc4
      Peter Zijlstra 提交于
      Like perf_enable_on_exec(), perf_event_enable() event scheduling has problems
      respecting the context hierarchy when trying to schedule events (for
      example, it will try and add a pinned event without first removing
      existing flexible events).
      
      So simplify it by using the new ctx_resched() call which will DTRT.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      aee7dbc4
    • P
      perf: Use task_ctx_sched_out() · 8833d0e2
      Peter Zijlstra 提交于
      We have a function that does exactly what we want here, use it. This
      reduces the amount of cpuctx->task_ctx muckery.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      8833d0e2
    • P
      perf: Fix perf_enable_on_exec() event scheduling · 3e349507
      Peter Zijlstra 提交于
      There are two problems with the current perf_enable_on_exec() event
      scheduling:
      
        - the newly enabled events will be immediately scheduled
          irrespective of their ctx event list order.
      
        - there's a hole in the ctx->lock between scheduling the events
          out and putting them back on.
      
      Esp. the latter issue is a real problem because a hole in event
      scheduling leaves the thing in an observable inconsistent state,
      confusing things.
      
      Fix both issues by first doing the enable iteration and at the end,
      when there are newly enabled events, reschedule the ctx in one go.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      3e349507
    • P
      perf: Remove stale comment · 5947f657
      Peter Zijlstra 提交于
      The comment here is horribly out of date, remove it.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      5947f657
    • P
      perf: Fix cgroup scheduling in perf_enable_on_exec() · 70a01657
      Peter Zijlstra 提交于
      There is a comment that states that perf_event_context_sched_in() will
      also switch in the cgroup events, I cannot find it does so. Therefore
      all the resulting logic goes out the window too.
      
      Clean that up.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      70a01657
    • P
      perf: Fix cgroup event scheduling · 7e41d177
      Peter Zijlstra 提交于
      There appears to be a problem in __perf_event_task_sched_in() wrt
      cgroup event scheduling.
      
      The normal event scheduling order is:
      
      	CPU pinned
      	Task pinned
      	CPU flexible
      	Task flexible
      
      And since perf_cgroup_sched*() only schedules the cpu context, we must
      call this _before_ adding the task events.
      
      Note: double check what happens on the ctx switch optimization where
      the task ctx isn't scheduled.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      7e41d177