1. 12 Feb, 2015 11 commits
  2. 11 Feb, 2015 4 commits
  3. 07 Feb, 2015 7 commits
  4. 06 Feb, 2015 9 commits
  5. 04 Feb, 2015 8 commits
    • M
      perf: Decouple unthrottling and rotating · 2fde4f94
      Mark Rutland committed
      Currently the adjustments made as part of perf_event_task_tick() use the
      percpu rotation lists to iterate over any active PMU contexts, but these
      are not used by the context rotation code, having been replaced by
      separate (per-context) hrtimer callbacks. However, some manipulation of
      the rotation lists (i.e. removal of contexts) has remained in
      perf_rotate_context(). This leads to the following issues:
      
      * Contexts are not always removed from the rotation lists. Removal of
        PMUs which have been placed in rotation lists, but have not been
        removed by a hrtimer callback can result in corruption of the rotation
        lists (when memory backing the context is freed).
      
        This has been observed to result in hangs when PMU drivers built as
        modules are inserted and removed around the creation of events for
        said PMUs.
      
      * Contexts which do not require rotation may be removed from the
        rotation lists as a result of a hrtimer, and will not be considered by
        the unthrottling code in perf_event_task_tick().
      
      This patch fixes the issue by updating the rotation list when events are
      scheduled in/out, ensuring that each rotation list stays in sync with
      the HW state. As each event holds a refcount on the module of its PMU,
      this ensures that when a PMU module is unloaded none of its CPU contexts
      can be in a rotation list. By maintaining a list of perf_event_contexts
      rather than perf_event_cpu_contexts, we don't need separate paths to
      handle the cpu and task contexts, which also makes the code a little
      simpler.
      
      As the rotation_list variables are not used for rotation, these are
      renamed to active_ctx_list, which better matches their current function.
      perf_pmu_rotate_{start,stop} are renamed to
      perf_pmu_ctx_{activate,deactivate}.
      Reported-by: Johannes Jensen <johannes.jensen@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Will Deacon <Will.Deacon@arm.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20150129134511.GR17721@leverpostej
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2fde4f94
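The idea of keeping the active context list in sync with scheduling can be sketched in user-space C. This is a minimal illustration, not the kernel code: `struct ctx`, `struct ctx_list`, and the `ctx_*` and `event_sched_*` helpers are hypothetical stand-ins. It assumes a context joins the list when its first event is scheduled in and leaves when its last event is scheduled out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a perf event context. */
struct ctx {
    int nr_active;            /* events currently scheduled in */
    struct ctx *prev, *next;  /* list links; NULL while not on a list */
};

/* Hypothetical stand-in for an active context list (circular, sentinel head). */
struct ctx_list {
    struct ctx head;
};

static void ctx_list_init(struct ctx_list *l)
{
    l->head.nr_active = 0;
    l->head.prev = l->head.next = &l->head;
}

static int ctx_is_listed(const struct ctx *c)
{
    return c->next != NULL;
}

/* Link the context at the front of the active list. */
static void ctx_activate(struct ctx_list *l, struct ctx *c)
{
    c->next = l->head.next;
    c->prev = &l->head;
    l->head.next->prev = c;
    l->head.next = c;
}

/* Unlink the context and mark it as not listed. */
static void ctx_deactivate(struct ctx *c)
{
    c->prev->next = c->next;
    c->next->prev = c->prev;
    c->prev = c->next = NULL;
}

/* First event scheduled in: the context becomes active. */
static void event_sched_in(struct ctx_list *l, struct ctx *c)
{
    if (c->nr_active++ == 0)
        ctx_activate(l, c);
}

/* Last event scheduled out: the context leaves the list. */
static void event_sched_out(struct ctx *c)
{
    if (--c->nr_active == 0)
        ctx_deactivate(c);
}

/* Count the contexts currently on the list. */
static size_t ctx_list_len(const struct ctx_list *l)
{
    size_t n = 0;

    for (const struct ctx *c = l->head.next; c != &l->head; c = c->next)
        n++;
    return n;
}
```

Because list membership follows `nr_active`, a context with no scheduled events is never on the list, so freeing it cannot leave a dangling list entry.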
    • M
      perf: Drop module reference on event init failure · cc34b98b
      Mark Rutland committed
      When initialising an event, perf_init_event will call try_module_get() to
      ensure that the PMU's module cannot be removed for the lifetime of the
      event, with __free_event() dropping the reference when the event is
      finally destroyed. If something fails after the event has been
      initialised, but before the event is installed, perf_event_alloc will
      drop the reference on the module.
      
      However, if we fail to initialise an event for some reason (e.g. we ask
      an uncore PMU to perform sampling, and it refuses to initialise the
      event), we do not drop the refcount. If we try to open such a bogus
      event without a precise IDR type, we will loop over each PMU in the pmus
      list, incrementing each of their refcounts without decrementing them.
      
      This patch adds a module_put when pmu->event_init(event) fails, ensuring
      that the refcounts are balanced in failure cases. As the innards of the
      precise and search based initialisation look very similar, this logic is
      hoisted out into a new helper function. While the early return for the
      failed try_module_get is removed from the search case, this is handled
      by the remaining return when ret is not -ENOENT.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1420642611-22667-1-git-send-email-mark.rutland@arm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cc34b98b
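The refcount balancing described above can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel implementation: `struct module`, `struct pmu`, `module_try_get()`, `module_put_ref()`, `try_init_event()`, and `init_event_search()` are hypothetical names that model only the get/put pairing and the search over PMUs.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a module and a PMU owned by it. */
struct module {
    int refcount;
    bool unloading;
};

struct pmu {
    struct module *owner;
    int (*event_init)(void *event);  /* 0 on success, negative errno on failure */
};

/* Take a reference unless the module is going away. */
static bool module_try_get(struct module *m)
{
    if (m->unloading)
        return false;
    m->refcount++;
    return true;
}

static void module_put_ref(struct module *m)
{
    m->refcount--;
}

/* Shared helper: the reference is kept only if initialisation succeeds. */
static int try_init_event(struct pmu *pmu, void *event)
{
    int ret;

    if (!module_try_get(pmu->owner))
        return -ENODEV;

    ret = pmu->event_init(event);
    if (ret)
        module_put_ref(pmu->owner);  /* balance the get on failure */
    return ret;
}

/* Search path: try each PMU, continuing only while it reports -ENOENT. */
static struct pmu *init_event_search(struct pmu **pmus, size_t n,
                                     void *event, int *err)
{
    for (size_t i = 0; i < n; i++) {
        int ret = try_init_event(pmus[i], event);

        if (!ret)
            return pmus[i];
        if (ret != -ENOENT) {
            *err = ret;
            return NULL;
        }
    }
    *err = -ENOENT;
    return NULL;
}

/* Example callbacks: one PMU refuses the event, one accepts it. */
static int reject_event(void *event) { (void)event; return -ENOENT; }
static int accept_event(void *event) { (void)event; return 0; }
```

After a search, only the PMU that accepted the event holds an extra reference. Every PMU that refused it is back at its original count.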
    • J
      perf: Use POLLIN instead of POLL_IN for perf poll data in flag · 7c60fc0e
      Jiri Olsa committed
      Currently we flag available data (via poll syscall) on perf fd with
      POLL_IN macro, which is normally used for SIGIO interface.
      
      We've been lucky, because POLLIN (0x1) is a subset of POLL_IN (0x20001)
      and sys_poll (do_pollfd function) cuts the extra bit out (0x20000).
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1422467678-22341-1-git-send-email-jolsa@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      7c60fc0e
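The arithmetic behind "we've been lucky" can be checked with a small user-space sketch. `KERNEL_POLL_IN` is a constant defined here from the value quoted in the commit message (user-space headers may define `POLL_IN` differently). `filter_revents()` is a hypothetical helper that assumes the poll path keeps only the requested events plus `POLLERR` and `POLLHUP`.

```c
#include <poll.h>

/* Assumption: the kernel-internal POLL_IN value quoted in the commit message. */
#define KERNEL_POLL_IN 0x20001

/*
 * Hypothetical model of the filtering step: only the events the caller
 * asked for, plus POLLERR and POLLHUP, survive in the returned mask.
 */
static int filter_revents(int mask, short events)
{
    return mask & (events | POLLERR | POLLHUP);
}
```

Under that assumption, `filter_revents(KERNEL_POLL_IN, POLLIN)` yields plain `POLLIN`, because the 0x20000 bit is not among the requested events. This is why the wrong macro went unnoticed by poll users.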
    • P
      perf: Fix put_event() ctx lock · a83fe28e
      Peter Zijlstra committed
      So what I suspect; but I'm in zombie mode today it seems; is that while
      I initially thought that it was impossible for ctx to change when
      refcount dropped to 0, I now suspect it's possible.
      
      Note that until perf_remove_from_context() the event is still active and
      visible on the lists. So a concurrent sys_perf_event_open() from another
      task into this task can race.
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: mark.rutland@arm.com
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20150129134434.GB26304@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a83fe28e
    • P
      perf: Fix move_group() order · 8f95b435
      Peter Zijlstra (Intel) committed
      Jiri reported triggering the new WARN_ON_ONCE in event_sched_out over
      the weekend:
      
        event_sched_out.isra.79+0x2b9/0x2d0
        group_sched_out+0x69/0xc0
        ctx_sched_out+0x106/0x130
        task_ctx_sched_out+0x37/0x70
        __perf_install_in_context+0x70/0x1a0
        remote_function+0x48/0x60
        generic_exec_single+0x15b/0x1d0
        smp_call_function_single+0x67/0xa0
        task_function_call+0x53/0x80
        perf_install_in_context+0x8b/0x110
      
      I think the below should cure this; if we install a group leader it
      will iterate the (still intact) group list and find its siblings and
      try and install those too -- even though those still have the old
      event->ctx -- in the new ctx.
      
      Upon installing the first group sibling we'd try and schedule out the
      group and trigger the above warn.
      
      Fix this by installing the group leader last. Installing the siblings
      first has no effect on the group: they're not reachable through the
      group lists and therefore we don't schedule them.
      
      Also delay resetting the state until we're absolutely sure the events
      are quiescent.
      Reported-by: Jiri Olsa <jolsa@redhat.com>
      Reported-by: vincent.weaver@maine.edu
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20150126162639.GA21418@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8f95b435
    • P
      perf: Fix event->ctx locking · f63a8daa
      Peter Zijlstra committed
      There have been a few reported issues wrt. the lack of locking around
      changing event->ctx. This patch tries to address those.
      
      It avoids the whole rwsem thing; and while it appears to work, please
      give it some thought in review.
      
      What I did fail at is adding sensible runtime checks on the use of
      event->ctx; the RCU use makes it very hard.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20150123125834.209535886@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f63a8daa
    • P
      perf: Add a bit of paranoia · 652884fe
      Peter Zijlstra committed
      Add a few WARN()s to catch things that should never happen.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20150123125834.150481799@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      652884fe
    • I
  6. 02 Feb, 2015 1 commit