1. 21 Mar, 2013 1 commit
    • perf: Fix ring_buffer perf_output_space() boundary calculation · dd9c086d
      Stephane Eranian committed
      This patch fixes a flaw in perf_output_space(). In case the size
      of the space needed is bigger than the actual buffer size, there
      may be situations where the function would return true (i.e.,
      there is space) when it should not. This happens when head >
      offset due to rounding in the masking logic.
      
      The problem can be tested by activating BTS on Intel processors.
      A BTS record can be as big as 16 pages. The following command
      fails:
      
        $ perf record -m 4 -c 1 -e branches:u my_test_program
      
      You will get a buffer corruption with this. Perf report won't be
      able to parse the perf.data.
      
      The fix is to first check that the requested space is smaller
      than the buffer size. If so, then the masking logic will work
      fine. If not, then there is no chance the record can be saved
      and it will be gracefully handled by upper code layers.
      
      [ In v2, we also make the logic for rb->writable more explicit by
        renaming it to rb->overwrite because it tells whether or not the
        buffer can overwrite its tail (suggested by PeterZ). ]
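The check described above can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the kernel source: `struct rb_sketch` and `output_space()` are hypothetical names, the buffer size is assumed to be a power of two, and the signed comparison relies on the usual two's-complement conversion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the ring buffer; only the fields used here. */
struct rb_sketch {
	unsigned long size;  /* data area size in bytes, a power of two */
	bool overwrite;      /* true: the writer may overwrite its own tail */
};

/*
 * Return true if a record occupying [offset, head) can be written
 * without running over unread data that starts at tail.
 */
static bool output_space(const struct rb_sketch *rb, unsigned long tail,
			 unsigned long offset, unsigned long head)
{
	unsigned long sz = rb->size;
	unsigned long mask = sz - 1;

	assert(sz != 0 && (sz & mask) == 0);

	/* An overwriting buffer never runs out of space. */
	if (rb->overwrite)
		return true;

	/*
	 * A record larger than the whole buffer can never fit. Without
	 * this check the masking below may wrap and hide the overflow.
	 */
	if (head - offset > sz)
		return false;

	offset = (offset - tail) & mask;
	head = (head - tail) & mask;

	/* After masking, head landing before offset means we wrapped. */
	return (long)(head - offset) >= 0;
}
```

With a 4096-byte buffer, a record of 8292 bytes starting at offset 0 masks down to 100 bytes, so the masking step alone would report that it fits; the size check rejects it first.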
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: peterz@infradead.org
      Cc: jolsa@redhat.com
      Cc: fweisbec@gmail.com
      Link: http://lkml.kernel.org/r/20130318133327.GA3056@quad
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 18 Mar, 2013 2 commits
    • perf: Generate EXIT event only once per task context · d610d98b
      Namhyung Kim committed
      perf_event_task_event() iterates over the pmu list and generates
      events for each eligible pmu context.  But if task_event has a
      task_ctx, as in the EXIT case, it generates events even when the
      pmu doesn't have an eligible context. Fix it by moving the code
      to the proper places.
      
      Before this patch:
      
        $ perf record -n true
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.006 MB perf.data (~248 samples) ]
      
        $ perf report -D | tail
        Aggregated stats:
                   TOTAL events:         73
                    MMAP events:         67
                    COMM events:          2
                    EXIT events:          4
        cycles stats:
                   TOTAL events:         73
                    MMAP events:         67
                    COMM events:          2
                    EXIT events:          4
      
      After this patch:
      
        $ perf report -D | tail
        Aggregated stats:
                   TOTAL events:         70
                    MMAP events:         67
                    COMM events:          2
                    EXIT events:          1
        cycles stats:
                   TOTAL events:         70
                    MMAP events:         67
                    COMM events:          2
                    EXIT events:          1
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1363332433-7637-1-git-send-email-namhyung@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • perf: Reset hwc->last_period on sw clock events · 778141e3
      Namhyung Kim committed
      When cpu/task clock events are initialized, their sampling
      frequencies are converted to a fixed period.  However, the
      conversion did not update hwc->last_period, which was set to 1
      for initial sampling frequency calibration.

      Because this hwc->last_period value is used as the period in
      perf_swevent_hrtimer(), every recorded sample has an incorrect
      period of 1.
      
        $ perf record -e task-clock noploop 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.158 MB perf.data (~6919 samples) ]
      
        $ perf report -n --show-total-period  --stdio
        # Samples: 4K of event 'task-clock'
        # Event count (approx.): 4000
        #
        # Overhead       Samples        Period  Command  Shared Object              Symbol
        # ........  ............  ............  .......  .............  ..................
        #
            99.95%          3998          3998  noploop  noploop        [.] main
             0.03%             1             1  noploop  libc-2.15.so   [.] init_cacheinfo
             0.03%             1             1  noploop  ld-2.15.so     [.] open_verify
      
      Note that this doesn't affect non-sampling events, so perf stat
      still gets the correct value with or without this patch.
      
        $ perf stat -e task-clock noploop 1
      
         Performance counter stats for 'noploop 1':
      
               1000.272525 task-clock                #    1.000 CPUs utilized
      
               1.000560605 seconds time elapsed
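The frequency-to-period conversion described in this commit can be sketched as follows. This is a simplified user-space illustration, not the kernel code: `struct hw_sketch`, `struct event_sketch` and `clock_event_init()` are hypothetical names, and the field layout is an assumption chosen to mirror the names used in the message.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical, trimmed-down stand-ins for the per-event state. */
struct hw_sketch {
	uint64_t sample_period;  /* period programmed into the timer, in ns */
	uint64_t last_period;    /* period reported with each sample */
	int64_t period_left;     /* time remaining until the next sample */
};

struct event_sketch {
	int freq;                /* non-zero: sample_freq is in use */
	uint64_t sample_freq;    /* requested samples per second */
	uint64_t sample_period;  /* fixed period derived from sample_freq */
	struct hw_sketch hw;
};

/* Convert a frequency-based clock event into a fixed-period one. */
static void clock_event_init(struct event_sketch *ev)
{
	if (!ev->freq)
		return;

	assert(ev->sample_freq != 0);

	ev->sample_period = NSEC_PER_SEC / ev->sample_freq;
	ev->hw.sample_period = ev->sample_period;
	ev->hw.period_left = (int64_t)ev->sample_period;

	/*
	 * The step this commit adds: without it last_period keeps the
	 * calibration value of 1 and every sample reports period 1.
	 */
	ev->hw.last_period = ev->sample_period;
	ev->freq = 0;
}
```

For a request of 4000 samples per second, the derived period is 250000 ns, and that is the value each sample should report instead of 1.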
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1363574507-18808-1-git-send-email-namhyung@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 28 Feb, 2013 2 commits
    • hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin committed
      I'm not sure why, but the hlist for-each-entry iterators were conceived
      differently from the list ones, which look like this:
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      do they not really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small number of places were using the 'node' parameter; these
         were modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
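To illustrate why the extra node parameter is unnecessary, here is a minimal user-space sketch of a three-argument iterator. It is not the code from linux/list.h: the list is simplified to a singly linked one, the names `entry_of`, `entry_or_null`, `hlist_for_each_entry_sketch`, `struct item`, `push_front` and `sum_items` are hypothetical, and `__typeof__` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified hash-list types; the real nodes also carry a back pointer. */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Recover the containing object from a pointer to its embedded node. */
#define entry_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same as entry_of(), but maps a NULL node to a NULL entry. */
#define entry_or_null(ptr, type, member) \
	((ptr) ? entry_of(ptr, type, member) : NULL)

/* Three-argument iterator: the entry cursor is the only loop variable. */
#define hlist_for_each_entry_sketch(pos, head, member)                       \
	for (pos = entry_or_null((head)->first, __typeof__(*(pos)), member);     \
	     pos;                                                                \
	     pos = entry_or_null((pos)->member.next, __typeof__(*(pos)), member))

struct item {
	int value;
	struct hlist_node link;
};

/* Insert a node at the front of the list. */
static void push_front(struct hlist_head *h, struct hlist_node *n)
{
	n->next = h->first;
	h->first = n;
}

/* Sum the values of all items, iterating without a separate node cursor. */
static int sum_items(struct hlist_head *h)
{
	struct item *it;
	int sum = 0;

	assert(h != NULL);

	hlist_for_each_entry_sketch(it, h, link)
		sum += it->value;
	return sum;
}
```

The next node is always reachable through the entry itself (`pos->member.next`), so callers no longer need to declare a second cursor variable.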
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foudnation.org: redo intrusive kvm changes]
      Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • events: convert to idr_alloc() · 0e9c3be2
      Tejun Heo committed
      Convert to the much saner new idr interface.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 23 Feb, 2013 1 commit
  5. 15 Feb, 2013 1 commit
  6. 09 Feb, 2013 27 commits
  7. 03 Feb, 2013 1 commit
  8. 25 Jan, 2013 1 commit
  9. 20 Nov, 2012 1 commit
  10. 19 Nov, 2012 1 commit
    • pidns: Use task_active_pid_ns where appropriate · 17cf22c3
      Eric W. Biederman committed
      The expressions tsk->nsproxy->pid_ns and task_active_pid_ns
      aka ns_of_pid(task_pid(tsk)) should have the same number of
      cache line misses with the practical difference that
      ns_of_pid(task_pid(tsk)) is released later in a process's life.

      Furthermore by using task_active_pid_ns it becomes trivial
      to write an unshare implementation for the pid namespace.
      
      So I have used task_active_pid_ns everywhere I can.
      
      In fork since the pid has not yet been attached to the
      process I use ns_of_pid, to achieve the same effect.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
  11. 16 Nov, 2012 1 commit
    • uprobes: Use percpu_rw_semaphore to fix register/unregister vs dup_mmap() race · 32cdba1e
      Oleg Nesterov committed
      This was always racy, but 26872090
      "uprobes: Rework register_for_each_vma() to make it O(n)" should be
      blamed anyway, it made everything worse and I didn't notice.
      
      register/unregister call build_map_info() and then do install/remove
      breakpoint for every mm which mmaps inode/offset. This can obviously
      race with fork()->dup_mmap() in between and we can miss the child.
      
      uprobe_register() could be easily fixed but unregister is much worse,
      the new mm inherits "int3" from parent and there is no way to detect
      this if uprobe goes away.
      
      So this patch simply adds percpu_down_read/up_read around dup_mmap(),
      and percpu_down_write/up_write into register_for_each_vma().
      
      This adds 2 new hooks into dup_mmap() but we can kill uprobe_dup_mmap()
      and fold it into uprobe_end_dup_mmap().
      Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
  12. 15 Nov, 2012 1 commit