1. 12 3月, 2018 1 次提交
    • P
      perf/core: Fix perf_output_read_group() · 9e5b127d
      Peter Zijlstra 提交于
      Mark reported his arm64 perf fuzzer runs sometimes splat like:
      
        armv8pmu_read_counter+0x1e8/0x2d8
        armpmu_event_update+0x8c/0x188
        armpmu_read+0xc/0x18
        perf_output_read+0x550/0x11e8
        perf_event_read_event+0x1d0/0x248
        perf_event_exit_task+0x468/0xbb8
        do_exit+0x690/0x1310
        do_group_exit+0xd0/0x2b0
        get_signal+0x2e8/0x17a8
        do_signal+0x144/0x4f8
        do_notify_resume+0x148/0x1e8
        work_pending+0x8/0x14
      
      which asserts that we only call pmu::read() on ACTIVE events.
      
      The above callchain does:
      
        perf_event_exit_task()
          perf_event_exit_task_context()
            task_ctx_sched_out() // INACTIVE
            perf_event_exit_event()
              perf_event_set_state(EXIT) // EXIT
              sync_child_event()
                perf_event_read_event()
                  perf_output_read()
                    perf_output_read_group()
                      leader->pmu->read()
      
      Which results in doing a pmu::read() on an !ACTIVE event.
      
      I _think_ this is 'new' since we added attr.inherit_stat, which added
      the perf_event_read_event() to the exit path, without that
      perf_event_read_output() would only trigger from samples and for
      @event to trigger a sample, it's leader _must_ be ACTIVE too.
      
      Still, adding this check makes it consistent with the @sub case for
      the siblings.
      Reported-and-Tested-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      9e5b127d
  2. 12 2月, 2018 1 次提交
    • L
      vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Linus Torvalds 提交于
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But they keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a9a08845
  3. 06 2月, 2018 2 次提交
  4. 25 1月, 2018 3 次提交
    • P
      perf/core: Fix ctx::mutex deadlock · 0c7296ca
      Peter Zijlstra 提交于
      Lockdep noticed the following 3-way lockup scenario:
      
      	sys_perf_event_open()
      	  perf_event_alloc()
      	    perf_try_init_event()
       #0	      ctx = perf_event_ctx_lock_nested(1)
      	      perf_swevent_init()
      		swevent_hlist_get()
       #1		  mutex_lock(&pmus_lock)
      
      	perf_event_init_cpu()
       #1	  mutex_lock(&pmus_lock)
       #2	  mutex_lock(&ctx->mutex)
      
      	sys_perf_event_open()
      	  mutex_lock_double()
       #2	   mutex_lock()
       #0	   mutex_lock_nested()
      
      And while we need that perf_event_ctx_lock_nested() for HW PMUs such
      that they can iterate the sibling list, trying to match it to the
      available counters, the software PMUs need do no such thing. Exclude
      them.
      
      In particular the swevent triggers the above invertion, while the
      tpevent PMU triggers a more elaborate one through their event_mutex.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      0c7296ca
    • P
      perf/core: Fix another perf,trace,cpuhp lock inversion · 43fa87f7
      Peter Zijlstra 提交于
      Lockdep noticed the following 3-way lockup race:
      
              perf_trace_init()
       #0       mutex_lock(&event_mutex)
                perf_trace_event_init()
                  perf_trace_event_reg()
                    tp_event->class->reg() := tracepoint_probe_register
       #1              mutex_lock(&tracepoints_mutex)
                        trace_point_add_func()
       #2                  static_key_enable()
      
       #2	do_cpu_up()
      	  perf_event_init_cpu()
       #3	    mutex_lock(&pmus_lock)
       #4	    mutex_lock(&ctx->mutex)
      
      	perf_ioctl()
       #4	  ctx = perf_event_ctx_lock()
      	  _perf_iotcl()
      	    ftrace_profile_set_filter()
       #0	      mutex_lock(&event_mutex)
      
      Fudge it for now by noting that the tracepoint state does not depend
      on the event <-> context relation. Ugly though :/
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      43fa87f7
    • P
      perf/core: Fix lock inversion between perf,trace,cpuhp · 82d94856
      Peter Zijlstra 提交于
      Lockdep gifted us with noticing the following 4-way lockup scenario:
      
              perf_trace_init()
       #0       mutex_lock(&event_mutex)
                perf_trace_event_init()
                  perf_trace_event_reg()
                    tp_event->class->reg() := tracepoint_probe_register
       #1             mutex_lock(&tracepoints_mutex)
                        trace_point_add_func()
       #2                 static_key_enable()
      
       #2     do_cpu_up()
                perf_event_init_cpu()
       #3         mutex_lock(&pmus_lock)
       #4         mutex_lock(&ctx->mutex)
      
              perf_event_task_disable()
                mutex_lock(&current->perf_event_mutex)
       #4       ctx = perf_event_ctx_lock()
       #5       perf_event_for_each_child()
      
              do_exit()
                task_work_run()
                  __fput()
                    perf_release()
                      perf_event_release_kernel()
       #4               mutex_lock(&ctx->mutex)
       #5               mutex_lock(&event->child_mutex)
                        free_event()
                          _free_event()
                            event->destroy() := perf_trace_destroy
       #0                     mutex_lock(&event_mutex);
      
      Fix that by moving the free_event() out from under the locks.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      82d94856
  5. 08 1月, 2018 3 次提交
  6. 03 1月, 2018 1 次提交
  7. 17 12月, 2017 1 次提交
  8. 14 12月, 2017 1 次提交
    • Y
      bpf/tracing: fix kernel/events/core.c compilation error · f4e2298e
      Yonghong Song 提交于
      Commit f371b304 ("bpf/tracing: allow user space to
      query prog array on the same tp") introduced a perf
      ioctl command to query prog array attached to the
      same perf tracepoint. The commit introduced a
      compilation error under certain config conditions, e.g.,
        (1). CONFIG_BPF_SYSCALL is not defined, or
        (2). CONFIG_TRACING is defined but neither CONFIG_UPROBE_EVENTS
             nor CONFIG_KPROBE_EVENTS is defined.
      
      Error message:
        kernel/events/core.o: In function `perf_ioctl':
        core.c:(.text+0x98c4): undefined reference to `bpf_event_query_prog_array'
      
      This patch fixed this error by guarding the real definition under
      CONFIG_BPF_EVENTS and provided static inline dummy function
      if CONFIG_BPF_EVENTS was not defined.
      It renamed the function from bpf_event_query_prog_array to
      perf_event_query_prog_array and moved the definition from linux/bpf.h
      to linux/trace_events.h so the definition is in proximity to
      other prog_array related functions.
      
      Fixes: f371b304 ("bpf/tracing: allow user space to query prog array on the same tp")
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NYonghong Song <yhs@fb.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      f4e2298e
  9. 13 12月, 2017 2 次提交
    • J
      bpf: add a bpf_override_function helper · 9802d865
      Josef Bacik 提交于
      Error injection is sloppy and very ad-hoc.  BPF could fill this niche
      perfectly with it's kprobe functionality.  We could make sure errors are
      only triggered in specific call chains that we care about with very
      specific situations.  Accomplish this with the bpf_override_funciton
      helper.  This will modify the probe'd callers return value to the
      specified value and set the PC to an override function that simply
      returns, bypassing the originally probed function.  This gives us a nice
      clean way to implement systematic error injection for all of our code
      paths.
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      9802d865
    • Y
      bpf/tracing: allow user space to query prog array on the same tp · f371b304
      Yonghong Song 提交于
      Commit e87c6bc3 ("bpf: permit multiple bpf attachments
      for a single perf event") added support to attach multiple
      bpf programs to a single perf event.
      Although this provides flexibility, users may want to know
      what other bpf programs attached to the same tp interface.
      Besides getting visibility for the underlying bpf system,
      such information may also help consolidate multiple bpf programs,
      understand potential performance issues due to a large array,
      and debug (e.g., one bpf program which overwrites return code
      may impact subsequent program results).
      
      Commit 2541517c ("tracing, perf: Implement BPF programs
      attached to kprobes") utilized the existing perf ioctl
      interface and added the command PERF_EVENT_IOC_SET_BPF
      to attach a bpf program to a tracepoint. This patch adds a new
      ioctl command, given a perf event fd, to query the bpf program
      array attached to the same perf tracepoint event.
      
      The new uapi ioctl command:
        PERF_EVENT_IOC_QUERY_BPF
      
      The new uapi/linux/perf_event.h structure:
        struct perf_event_query_bpf {
             __u32	ids_len;
             __u32	prog_cnt;
             __u32	ids[0];
        };
      
      User space provides buffer "ids" for kernel to copy to.
      When returning from the kernel, the number of available
      programs in the array is set in "prog_cnt".
      
      The usage:
        struct perf_event_query_bpf *query =
          malloc(sizeof(*query) + sizeof(u32) * ids_len);
        query.ids_len = ids_len;
        err = ioctl(pmu_efd, PERF_EVENT_IOC_QUERY_BPF, query);
        if (err == 0) {
          /* query.prog_cnt is the number of available progs,
           * number of progs in ids: (ids_len == 0) ? 0 : query.prog_cnt
           */
        } else if (errno == ENOSPC) {
          /* query.ids_len number of progs copied,
           * query.prog_cnt is the number of available progs
           */
        } else {
            /* other errors */
        }
      Signed-off-by: NYonghong Song <yhs@fb.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      f371b304
  10. 05 12月, 2017 1 次提交
    • H
      bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type · c895f6f7
      Hendrik Brueckner 提交于
      Commit 0515e599 ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT
      program type") introduced the bpf_perf_event_data structure which
      exports the pt_regs structure.  This is OK for multiple architectures
      but fail for s390 and arm64 which do not export pt_regs.  Programs
      using them, for example, the bpf selftest fail to compile on these
      architectures.
      
      For s390, exporting the pt_regs is not an option because s390 wants
      to allow changes to it.  For arm64, there is a user_pt_regs structure
      that covers parts of the pt_regs structure for use by user space.
      
      To solve the broken uapi for s390 and arm64, introduce an abstract
      type for pt_regs and add an asm/bpf_perf_event.h file that concretes
      the type.  An asm-generic header file covers the architectures that
      export pt_regs today.
      
      The arch-specific enablement for s390 and arm64 follows in separate
      commits.
      Reported-by: NThomas Richter <tmricht@linux.vnet.ibm.com>
      Fixes: 0515e599 ("bpf: introduce BPF_PROG_TYPE_PERF_EVENT program type")
      Signed-off-by: NHendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-and-tested-by: NThomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: NAlexei Starovoitov <ast@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      c895f6f7
  11. 29 11月, 2017 1 次提交
  12. 28 11月, 2017 1 次提交
  13. 17 11月, 2017 1 次提交
  14. 15 11月, 2017 2 次提交
  15. 11 11月, 2017 2 次提交
  16. 08 11月, 2017 1 次提交
  17. 30 10月, 2017 1 次提交
  18. 27 10月, 2017 9 次提交
  19. 25 10月, 2017 3 次提交
  20. 24 10月, 2017 1 次提交
  21. 17 10月, 2017 1 次提交
  22. 10 10月, 2017 1 次提交