1. 18 Oct, 2018 1 commit
    • M
      tracepoint: Fix tracepoint array element size mismatch · 9c0be3f6
      Mathieu Desnoyers authored
      Commit 46e0c9be ("kernel: tracepoints: add support for relative
      references") changes the layout of the __tracepoints_ptrs section on
      architectures supporting relative references. However, it does so
      without turning struct tracepoint * const into const int elsewhere in
      the tracepoint code, which has the following side effect:

      Setting mod->num_tracepoints is done by module.c:
      
          mod->tracepoints_ptrs = section_objs(info, "__tracepoints_ptrs",
                                               sizeof(*mod->tracepoints_ptrs),
                                               &mod->num_tracepoints);
      
      Since sizeof(*mod->tracepoints_ptrs) is the size of a pointer
      (rather than sizeof(int)), num_tracepoints is erroneously set to half
      the value it should have on 64-bit architectures. A module with an odd
      number of tracepoints therefore misses the last tracepoint because of
      integer division.
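
      The arithmetic can be illustrated with a minimal user-space sketch.
      The helper name section_entry_count() and the byte counts used with
      it are assumptions made for illustration; this is not the kernel's
      section_objs() code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring the division described above:
 * number of entries = section size in bytes / size of one element.
 */
static size_t section_entry_count(size_t section_bytes, size_t elem_size)
{
	assert(elem_size != 0);
	return section_bytes / elem_size;	/* integer division rounds down */
}
```

      For a section holding three 4-byte entries (12 bytes), an element
      size of 4 yields 3 entries, while an element size of 8 yields 1.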
      
      So in the module going notifier:
      
              for_each_tracepoint_range(mod->tracepoints_ptrs,
                      mod->tracepoints_ptrs + mod->num_tracepoints,
                      tp_module_going_check_quiescent, NULL);
      
      the expression (mod->tracepoints_ptrs + mod->num_tracepoints) actually
      evaluates to something within the bounds of the array, but misses the
      last tracepoint if the number of tracepoints is odd on a 64-bit arch.
      
      Fix this by introducing a new typedef, tracepoint_ptr_t, which
      is either "const int" on architectures that have PREL32 relocations,
      or "struct tracepoint * const" on architectures that do not have
      this feature.

      Also provide a new tracepoint_ptr_deref() static inline to
      encapsulate dereferencing this type rather than duplicating code and
      ugly ifdefs within the for_each_tracepoint_range() implementation.
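
      The idea of a 32-bit relative reference can be sketched in user space
      as follows. All names here (tracepoint_sketch, rel_ref_t,
      rel_ref_store(), rel_ref_deref()) are hypothetical, and the sketch
      assumes the target lies within a 32-bit offset of the slot; it is not
      the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

struct tracepoint_sketch {
	const char *name;
};

/* A 32-bit offset from the slot's own address to its target. */
typedef int32_t rel_ref_t;

/* Record where 'target' lives relative to 'slot'. */
static void rel_ref_store(rel_ref_t *slot, struct tracepoint_sketch *target)
{
	intptr_t delta = (intptr_t)target - (intptr_t)slot;

	assert(delta >= INT32_MIN && delta <= INT32_MAX);
	*slot = (rel_ref_t)delta;
}

/* Recover the absolute pointer: slot address plus stored offset. */
static struct tracepoint_sketch *rel_ref_deref(const rel_ref_t *slot)
{
	return (struct tracepoint_sketch *)((intptr_t)slot + *slot);
}
```

      Each slot is 4 bytes wide regardless of pointer size, which is why
      the element size used to count and walk the array has to follow the
      slot type rather than the pointer type.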
      
      This issue appears in 4.19-rc kernels, and should ideally be fixed
      before the end of the rc cycle.

      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Jessica Yu <jeyu@kernel.org>
      Link: http://lkml.kernel.org/r/20181013191050.22389-1-mathieu.desnoyers@efficios.com
      Link: http://lkml.kernel.org/r/20180704083651.24360-7-ard.biesheuvel@linaro.org
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morris <james.morris@microsoft.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Nicolas Pitre <nico@linaro.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Thomas Garnier <thgarnie@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  2. 23 Aug, 2018 1 commit
  3. 11 Aug, 2018 1 commit
    • S
      tracepoints: Free early tracepoints after RCU is initialized · f8a79d5c
      Steven Rostedt (VMware) authored
      When enabling trace events via the kernel command line, I hit this warning:
      
      WARNING: CPU: 0 PID: 13 at kernel/rcu/srcutree.c:236 check_init_srcu_struct+0xe/0x61
      Modules linked in:
      CPU: 0 PID: 13 Comm: watchdog/0 Not tainted 4.18.0-rc6-test+ #6
      Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
      RIP: 0010:check_init_srcu_struct+0xe/0x61
      Code: 48 c7 c6 ec 8a 65 b4 e8 ff 79 fe ff 48 89 df 31 f6 e8 f2 fa ff ff 5a
      5b 41 5c 5d c3 0f 1f 44 00 00 83 3d 68 94 b8 01 01 75 02 <0f> 0b 48 8b 87 f0
      0a 00 00 a8 03 74 45 55 48 89 e5 41 55 41 54 4c
      RSP: 0000:ffff96eb9ea03e68 EFLAGS: 00010246
      RAX: ffff96eb962b5b01 RBX: ffffffffb4a87420 RCX: 0000000000000001
      RDX: ffffffffb3107969 RSI: ffff96eb962b5b40 RDI: ffffffffb4a87420
      RBP: ffff96eb9ea03eb0 R08: ffffabbd00cd7f48 R09: 0000000000000000
      R10: ffff96eb9ea03e68 R11: ffffffffb4a6eec0 R12: ffff96eb962b5b40
      R13: ffff96eb9ea03ef8 R14: ffffffffb3107969 R15: ffffffffb3107948
      FS:  0000000000000000(0000) GS:ffff96eb9ea00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffff96eb13ab2000 CR3: 0000000192a1e001 CR4: 00000000001606f0
      Call Trace:
       <IRQ>
       ? __call_srcu+0x2d/0x290
       ? rcu_process_callbacks+0x26e/0x448
       ? allocate_probes+0x2b/0x2b
       call_srcu+0x13/0x15
       rcu_free_old_probes+0x1f/0x21
       rcu_process_callbacks+0x2ed/0x448
       __do_softirq+0x172/0x336
       irq_exit+0x62/0xb2
       smp_apic_timer_interrupt+0x161/0x19e
       apic_timer_interrupt+0xf/0x20
       </IRQ>
      
      The problem is that enabling trace events before RCU is set up causes
      SRCU to give this warning. To avoid this, add a list that holds the
      probes waiting to be freed until RCU is initialized, and free them at
      that point.
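
      The deferral pattern can be sketched in user space as follows. The
      names (pending_free, release_ready, release_probes(),
      release_late_init()) are hypothetical, and a plain free() stands in
      for the RCU/SRCU callback path used by the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct pending_free {
	struct pending_free *next;
	void *probes;
};

static bool release_ready;		/* stands in for "RCU is initialized" */
static struct pending_free *pending;	/* frees queued before that point */
static int released;			/* completed frees, for illustration */

static void release_now(void *probes)
{
	free(probes);
	released++;
}

/* Free immediately once ready, otherwise park the request on the list. */
static void release_probes(void *probes)
{
	struct pending_free *p;

	if (release_ready) {
		release_now(probes);
		return;
	}
	p = malloc(sizeof(*p));
	assert(p != NULL);
	p->probes = probes;
	p->next = pending;
	pending = p;
}

/* Called once late in initialization: drain everything that was parked. */
static void release_late_init(void)
{
	release_ready = true;
	while (pending) {
		struct pending_free *p = pending;

		pending = p->next;
		release_now(p->probes);
		free(p);
	}
}
```

      Requests made before release_late_init() runs are only queued; they
      are all processed when it runs, and later requests are handled
      immediately.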
      
      Link: http://lkml.kernel.org/r/20180810113554.1df28050@gandalf.local.home
      Link: http://lkml.kernel.org/r/20180810123517.5e9714ad@gandalf.local.home
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Fixes: e6753f23 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  4. 31 Jul, 2018 1 commit
    • J
      tracepoint: Make rcuidle tracepoint callers use SRCU · e6753f23
      Joel Fernandes (Google) authored
      In recent tests with IRQ on/off tracepoints, a large performance
      overhead of ~10% was noticed when running hackbench. The root cause is
      the calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the
      tracepoint code. Following a long discussion on the list [1] about this,
      we concluded that SRCU is a better alternative for use during RCU idle.
      Although it does involve extra barriers, it's lighter than the sched-rcu
      version, which has to do additional RCU calls to notify RCU idle about
      entry into RCU sections.

      In this patch, we change the underlying implementation of the
      trace_*_rcuidle API to use SRCU. This has been shown to improve
      performance a lot for the high-frequency irq enable/disable tracepoints.
      
      Test: Tested idle and preempt/irq tracepoints.
      
      Here are some performance numbers:
      
      With a run of the following 30 times on a single core x86 Qemu instance
      with 1GB memory:
      hackbench -g 4 -f 2 -l 3000
      
      Completion times in seconds. CONFIG_PROVE_LOCKING=y.
      
      No patches (without this series)
      Mean: 3.048
      Median: 3.025
      Std Dev: 0.064
      
      With Lockdep using irq tracepoints with RCU implementation:
      Mean: 3.451   (-11.66 %)
      Median: 3.447 (-12.22%)
      Std Dev: 0.049
      
      With Lockdep using irq tracepoints with SRCU implementation (this series):
      Mean: 3.020   (I would consider the improvement against the "without
      	       this series" case as just noise).
      Median: 3.013
      Std Dev: 0.033
      
      [1] https://patchwork.kernel.org/patch/10344297/
      
      [remove rcu_read_lock_sched_notrace as it's the equivalent of
      preempt_disable_notrace and is unnecessary to call in tracepoint code]
      Link: http://lkml.kernel.org/r/20180730222423.196630-3-joel@joelfernandes.org
      Cleaned-up-by: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      [ Simplified WARN_ON_ONCE() ]
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  5. 29 May, 2018 1 commit
  6. 01 May, 2018 1 commit
  7. 05 Dec, 2017 1 commit
  8. 02 Mar, 2017 2 commits
  9. 09 Dec, 2016 1 commit
    • S
      tracing: Have the reg function allow to fail · 8cf868af
      Steven Rostedt (Red Hat) authored
      Some tracepoints have a registration function that gets called when the
      tracepoint is enabled. There may be cases where the registration function
      must fail (for example, it can't allocate enough memory). In this case, the
      tracepoint should also fail to register; otherwise the user would not know
      why the tracepoint is not working.
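
      A minimal sketch of this behaviour, using hypothetical names
      (reg_tp_sketch, reg_tp_add_probe(), regfunc_ok(), regfunc_nomem())
      and assuming the hook runs when the first probe is attached; it is
      not the kernel code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct reg_tp_sketch {
	int (*regfunc)(void);	/* optional hook; nonzero return means failure */
	int nr_probes;
};

/* Example hooks, for illustration only. */
static int regfunc_ok(void)
{
	return 0;
}

static int regfunc_nomem(void)
{
	return -ENOMEM;
}

/* Attach a probe; propagate the hook's error instead of ignoring it. */
static int reg_tp_add_probe(struct reg_tp_sketch *tp)
{
	assert(tp != NULL);
	if (tp->nr_probes == 0 && tp->regfunc) {
		int ret = tp->regfunc();

		if (ret)
			return ret;	/* nothing is registered on failure */
	}
	tp->nr_probes++;
	return 0;
}
```

      The caller sees the error code from the hook and the probe count is
      left unchanged, so a failed enable is visible rather than silent.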
      
      Cc: David Howells <dhowells@redhat.com>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  10. 23 Mar, 2016 1 commit
  11. 26 Oct, 2015 1 commit
    • S
      tracepoint: Give priority to probes of tracepoints · 7904b5c4
      Steven Rostedt (Red Hat) authored
      In order to guarantee that a probe will be called before other probes that
      are attached to a tracepoint, there needs to be a mechanism to provide
      priority of one probe over the others.
      
      Add a prio field to struct tracepoint_func, which lets the probes be
      sorted by the priority set in the structure. If no priority is specified,
      then a priority of 10 is given (this is a macro, and may be changed
      in the future).
      
      Now probes may be added to affect other probes that are attached to a
      tracepoint with a guaranteed order.
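
      A priority-ordered insert can be sketched as below. The names
      (probe_sketch, probe_list, probe_list_add()) and the fixed-size array
      are illustrative assumptions, as is the convention that a larger
      prio value is called first and that equal priorities keep insertion
      order; this is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_DEFAULT_PRIO 10	/* default value named in the text above */
#define SKETCH_MAX_PROBES 8

struct probe_sketch {
	int id;
	int prio;
};

struct probe_list {
	struct probe_sketch slot[SKETCH_MAX_PROBES];
	size_t n;
};

/* Insert before the first entry with a strictly lower priority. */
static void probe_list_add(struct probe_list *l, int id, int prio)
{
	size_t pos = l->n;
	size_t i;

	assert(l->n < SKETCH_MAX_PROBES);
	for (i = 0; i < l->n; i++) {
		if (l->slot[i].prio < prio) {
			pos = i;
			break;
		}
	}
	/* Shift the tail up by one to open a gap at 'pos'. */
	for (i = l->n; i > pos; i--)
		l->slot[i] = l->slot[i - 1];
	l->slot[pos].id = id;
	l->slot[pos].prio = prio;
	l->n++;
}
```

      Adding two default-priority probes and then one with priority 20
      places the priority-20 probe at the front of the list.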
      
      One use case would be to let tracing of tracepoints filter by pid. A
      special (higher priority) probe may be added to the sched_switch
      tracepoint to set the necessary flags of the other tracepoints and notify
      them whether they should be traced or not. In case a tracepoint is also
      enabled at the sched_switch tracepoint, the order of the two is not random.
      
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  12. 21 Jun, 2014 2 commits
  13. 05 Jun, 2014 1 commit
  14. 08 May, 2014 1 commit
  15. 09 Apr, 2014 3 commits
  16. 22 Mar, 2014 1 commit
  17. 13 Mar, 2014 1 commit
    • M
      Fix: module signature vs tracepoints: add new TAINT_UNSIGNED_MODULE · 66cc69e3
      Mathieu Desnoyers authored
      Users have reported being unable to trace non-signed modules loaded
      within a kernel supporting module signature.
      
      This is caused by tracepoint.c:tracepoint_module_coming() refusing to
      take into account tracepoints sitting within force-loaded modules
      (TAINT_FORCED_MODULE). The reason for this check, in the first place, is
      that a force-loaded module may have a struct module incompatible with
      the layout expected by the kernel, and can thus cause a kernel crash
      upon forced load of that module on a kernel with CONFIG_TRACEPOINTS=y.
      
      Tracepoints, however, specifically accept TAINT_OOT_MODULE and
      TAINT_CRAP, since those modules do not lead to the "very likely system
      crash" issue cited above for force-loaded modules.
      
      With kernels having CONFIG_MODULE_SIG=y (signed modules), a non-signed
      module is tainted re-using the TAINT_FORCED_MODULE taint flag.
      Unfortunately, this means that Tracepoints treat that module as a
      force-loaded module, and thus silently refuse to consider any tracepoint
      within this module.
      
      Since an unsigned module does not fit within the "very likely system
      crash" category of tainting, add a new TAINT_UNSIGNED_MODULE taint flag
      to specifically address this taint behavior, and accept those modules
      within Tracepoints. We use the letter 'X' as a taint flag character for
      a module being loaded that doesn't know how to sign its name (proposed
      by Steven Rostedt).
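
      The acceptance rule can be sketched with a bitmask check. The enum
      values and names below (sketch_taint, module_sketch,
      sketch_module_has_bad_taint()) are illustrative assumptions and do
      not reproduce the kernel's actual taint bit numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit positions only. */
enum sketch_taint {
	SKETCH_TAINT_FORCED_MODULE,
	SKETCH_TAINT_CRAP,
	SKETCH_TAINT_OOT_MODULE,
	SKETCH_TAINT_UNSIGNED_MODULE,
};

struct module_sketch {
	unsigned long taints;
};

/* True if the module carries any taint outside the accepted set. */
static bool sketch_module_has_bad_taint(const struct module_sketch *mod)
{
	const unsigned long accepted =
		(1UL << SKETCH_TAINT_CRAP) |
		(1UL << SKETCH_TAINT_OOT_MODULE) |
		(1UL << SKETCH_TAINT_UNSIGNED_MODULE);

	assert(mod != NULL);
	return (mod->taints & ~accepted) != 0;
}
```

      With a dedicated flag for unsigned modules, such a module passes the
      check, while a force-loaded one is still rejected.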
      
      Also add the missing 'O' entry to trace event show_module_flags() list
      for the sake of completeness.

      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      NAKed-by: Ingo Molnar <mingo@redhat.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: David Howells <dhowells@redhat.com>
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  18. 12 Mar, 2014 2 commits
  19. 07 Mar, 2014 1 commit
    • S
      tracing: Warn if a tracepoint is not set via debugfs · b196e2b9
      Steven Rostedt authored
      Tracepoints were made to allow enabling a tracepoint in a module before that
      module was loaded. When a tracepoint is enabled and it does not exist, the
      name is stored and will be enabled when the tracepoint is created.
      
      The problem with this approach is that when a user enables a tracepoint
      that is expected to exist but does not, no warning is given.

      To add salt to the wound, if a module is added and sets the FORCED flag, which
      can happen if it isn't signed properly, the tracepoint code will not enable
      the tracepoints, but they will be created in the debugfs system! When a user
      goes to enable the tracepoint, the tracepoint code will not see it existing
      and will think it is to be enabled later AND WILL NOT GIVE A WARNING.
      
      The tracing will look like it succeeded but will actually be doing nothing.
      This will cause lots of confusion and headaches for developers trying to
      figure out why they are not seeing their tracepoints.
      
      Link: http://lkml.kernel.org/r/20140213154507.4040fb06@gandalf.local.home
      Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Reported-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  20. 04 Mar, 2014 2 commits
  21. 20 Apr, 2013 1 commit
  22. 28 Feb, 2013 1 commit
    • S
      hlist: drop the node parameter from iterators · b67bfe0d
      Sasha Levin authored
      I'm not sure why, but the hlist for each entry iterators were conceived
      differently from the list ones, which look like this:
      
              list_for_each_entry(pos, head, member)
      
      The hlist ones were greedy and wanted an extra parameter:
      
              hlist_for_each_entry(tpos, pos, head, member)
      
      Why did they need an extra pos parameter? I'm not quite sure. Not only
      do they not really need it, it also prevents the iterator from looking
      exactly like the list iterator, which is unfortunate.
      
      Besides the semantic patch, there was some manual work required:
      
       - Fix up the actual hlist iterators in linux/list.h
       - Fix up the declaration of other iterators based on the hlist ones.
       - A very small number of places were using the 'node' parameter; these
       were modified to use 'obj->member' instead.
       - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
       properly, so those had to be fixed up manually.
      
      The semantic patch which is mostly the work of Peter Senna Tschudin is here:
      
      @@
      iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
      
      type T;
      expression a,c,d,e;
      identifier b;
      statement S;
      @@
      
      -T b;
          <+... when != b
      (
      hlist_for_each_entry(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue(a,
      - b,
      c) S
      |
      hlist_for_each_entry_from(a,
      - b,
      c) S
      |
      hlist_for_each_entry_rcu(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_rcu_bh(a,
      - b,
      c, d) S
      |
      hlist_for_each_entry_continue_rcu_bh(a,
      - b,
      c) S
      |
      for_each_busy_worker(a, c,
      - b,
      d) S
      |
      ax25_uid_for_each(a,
      - b,
      c) S
      |
      ax25_for_each(a,
      - b,
      c) S
      |
      inet_bind_bucket_for_each(a,
      - b,
      c) S
      |
      sctp_for_each_hentry(a,
      - b,
      c) S
      |
      sk_for_each(a,
      - b,
      c) S
      |
      sk_for_each_rcu(a,
      - b,
      c) S
      |
      sk_for_each_from
      -(a, b)
      +(a)
      S
      + sk_for_each_from(a) S
      |
      sk_for_each_safe(a,
      - b,
      c, d) S
      |
      sk_for_each_bound(a,
      - b,
      c) S
      |
      hlist_for_each_entry_safe(a,
      - b,
      c, d, e) S
      |
      hlist_for_each_entry_continue_rcu(a,
      - b,
      c) S
      |
      nr_neigh_for_each(a,
      - b,
      c) S
      |
      nr_neigh_for_each_safe(a,
      - b,
      c, d) S
      |
      nr_node_for_each(a,
      - b,
      c) S
      |
      nr_node_for_each_safe(a,
      - b,
      c, d) S
      |
      - for_each_gfn_sp(a, c, d, b) S
      + for_each_gfn_sp(a, c, d) S
      |
      - for_each_gfn_indirect_valid_sp(a, c, d, b) S
      + for_each_gfn_indirect_valid_sp(a, c, d) S
      |
      for_each_host(a,
      - b,
      c) S
      |
      for_each_host_safe(a,
      - b,
      c, d) S
      |
      for_each_mesh_entry(a,
      - b,
      c, d) S
      )
          ...+>
      
      [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
      [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
      [akpm@linux-foundation.org: checkpatch fixes]
      [akpm@linux-foundation.org: fix warnings]
      [akpm@linux-foundation.org: redo intrusive kvm changes]
      Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
      Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 24 Feb, 2012 1 commit
    • I
      static keys: Introduce 'struct static_key', static_key_true()/false() and... · c5905afb
      Ingo Molnar authored
      static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]()
      
      So here's a boot tested patch on top of Jason's series that does
      all the cleanups I talked about and turns jump labels into a
      more intuitive to use facility. It should also address the
      various misconceptions and confusions that surround jump labels.
      
      Typical usage scenarios:
      
              #include <linux/static_key.h>
      
              struct static_key key = STATIC_KEY_INIT_TRUE;
      
              if (static_key_false(&key))
                      do unlikely code
              else
                      do likely code
      
      Or:
      
              if (static_key_true(&key))
                      do likely code
              else
                      do unlikely code
      
      The static key is modified via:
      
              static_key_slow_inc(&key);
              ...
              static_key_slow_dec(&key);
      
      The 'slow' prefix makes it abundantly clear that this is an
      expensive operation.
      
      I've updated all in-kernel code to use this everywhere. Note
      that I (intentionally) have not pushed through the rename
      blindly through to the lowest levels: the actual jump-label
      patching arch facility should be named like that, so we want to
      decouple jump labels from the static-key facility a bit.
      
      On non-jump-label enabled architectures static keys default to
      likely()/unlikely() branches.

      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Jason Baron <jbaron@redhat.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: a.p.zijlstra@chello.nl
      Cc: mathieu.desnoyers@efficios.com
      Cc: davem@davemloft.net
      Cc: ddaney.cavm@gmail.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20120222085809.GA26397@elte.hu
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  24. 17 Jan, 2012 1 commit
    • S
      tracepoints/module: Fix disabling tracepoints with taint CRAP or OOT · c10076c4
      Steven Rostedt authored
      Tracepoints are disabled for tainted modules, which is usually because the
      module is either proprietary or was forced, and we don't want either of them
      using kernel tracepoints.
      
      But, a module can also be tainted by being in the staging directory or
      compiled out of tree. Either is fine for use with tracepoints, no need
      to punish them.  I found this out when I noticed that my sample trace event
      module, when done out of tree, stopped working.
      
      Cc: stable@vger.kernel.org # 3.2
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  25. 11 Aug, 2011 1 commit
    • M
      Tracepoint: Dissociate from module mutex · b75ef8b4
      Mathieu Desnoyers authored
      Copy the information needed from struct module into a local module list
      held within tracepoint.c from within the module coming/going notifier.
      
      This vastly simplifies locking of tracepoint registration /
      unregistration, because we don't have to take the module mutex to
      register and unregister tracepoints anymore. Steven Rostedt ran into
      dependency problems related to the module mutex vs kprobes mutex vs ftrace
      mutex vs tracepoint mutex that seem hard to fix without removing
      this dependency between the tracepoint and module mutexes. (Note: it should
      be investigated whether kprobes could benefit from being dissociated from
      the module mutex too.)

      This also fixes module handling of tracepoint list iterators, because it
      was expecting the list to be sorted by pointer address. Given that we have
      control over our own list now, it's OK to sort this list, which has
      tracepoints as its only purpose. The reason why this sorting is required
      is to handle the fact that seq files (and any read() operation from
      user-space) cannot hold the tracepoint mutex across multiple calls, so
      list entries may vanish between calls. With sorting, the tracepoint
      iterator becomes usable even if the list doesn't contain the exact item
      pointed to by the iterator anymore.
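
      Why sorting helps can be sketched with a resume helper over an
      address-sorted array. The function name iter_resume() and the use of
      a plain array instead of a linked list are assumptions made for
      illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of the first entry whose address is >= 'saved',
 * or n if there is none. 'entries' must be sorted in ascending order.
 */
static size_t iter_resume(const uintptr_t *entries, size_t n, uintptr_t saved)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Sanity check on the ordering seen so far. */
		assert(i == 0 || entries[i - 1] <= entries[i]);
		if (entries[i] >= saved)
			break;
	}
	return i;
}
```

      If the entry the iterator last pointed to has vanished between two
      read() calls, the search still lands on the next surviving entry
      rather than on a stale one.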

      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Jason Baron <jbaron@redhat.com>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Lai Jiangshan <laijs@cn.fujitsu.com>
      CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Link: http://lkml.kernel.org/r/20110810191839.GC8525@Krystal
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  26. 05 Apr, 2011 1 commit
    • J
      jump label: Introduce static_branch() interface · d430d3d7
      Jason Baron authored
      Introduce:
      
      static __always_inline bool static_branch(struct jump_label_key *key);
      
      instead of the old JUMP_LABEL(key, label) macro.
      
      In this way, jump labels become really easy to use:
      
      Define:
      
              struct jump_label_key jump_key;
      
      Can be used as:
      
              if (static_branch(&jump_key))
                      do unlikely code
      
      enable/disable via:
      
              jump_label_inc(&jump_key);
              jump_label_dec(&jump_key);
      
      that's it!
      
      For the jump labels disabled case, the static_branch() becomes an
      atomic_read(), and jump_label_inc()/dec() are simply atomic_inc(),
      atomic_dec() operations. We show testing results for this change below.
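
      The fallback described above can be sketched in user space with C11
      atomics. The names (key_sketch, key_sketch_branch(), key_sketch_inc(),
      key_sketch_dec()) are hypothetical, and treating any positive count
      as "enabled" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct key_sketch {
	atomic_int enabled;	/* reference count of enablers */
};

/* Without code patching, the branch test is just an atomic read. */
static bool key_sketch_branch(struct key_sketch *key)
{
	return atomic_load(&key->enabled) > 0;
}

static void key_sketch_inc(struct key_sketch *key)
{
	atomic_fetch_add(&key->enabled, 1);
}

static void key_sketch_dec(struct key_sketch *key)
{
	int old = atomic_fetch_sub(&key->enabled, 1);

	assert(old > 0);	/* unbalanced decrement would be a bug */
}
```

      Because the key is a counter, the branch stays enabled until every
      increment has been matched by a decrement.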
      
      Thanks to H. Peter Anvin for suggesting the 'static_branch()' construct.
      
      Since we now require a 'struct jump_label_key *key', we can store a pointer into
      the jump table addresses. In this way, we can enable/disable jump labels, in
      basically constant time. This change allows us to completely remove the previous
      hashtable scheme. Thanks to Peter Zijlstra for this re-write.
      
      Testing:
      
      I ran a series of 'tbench 20' runs 5 times (with reboots) for 3
      configurations, where tracepoints were disabled.
      
      jump label configured in
      avg: 815.6
      
      jump label *not* configured in (using atomic reads)
      avg: 800.1
      
      jump label *not* configured in (regular reads)
      avg: 803.4

      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20110316212947.GA8792@redhat.com>
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Suggested-by: H. Peter Anvin <hpa@linux.intel.com>
      Tested-by: David Daney <ddaney@caviumnetworks.com>
      Acked-by: Ralf Baechle <ralf@linux-mips.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  27. 03 Feb, 2011 1 commit
    • M
      tracepoints: Fix section alignment using pointer array · 65498646
      Mathieu Desnoyers authored
      Make the tracepoints more robust, making them solid enough to handle compiler
      changes by not relying on anything based on compiler-specific behavior with
      respect to structure alignment. Implement an approach proposed by David Miller:
      use an array of const pointers to refer to the individual structures, and export
      this pointer array through the linker script rather than the structures per se.
      It will consume 32 extra bytes per tracepoint (24 for structure padding and 8
      for the pointers), but is less likely to break due to compiler changes.
      
      History:
      
      commit 7e066fb8 tracepoints: add DECLARE_TRACE() and DEFINE_TRACE()
      added the aligned(32) type and variable attribute to the tracepoint structures
      to deal with gcc happily aligning statically defined structures on 32-byte
      multiples.
      
      One attempt was to use a 8-byte alignment for tracepoint structures by applying
      both the variable and type attribute to tracepoint structures definitions and
      declarations. It worked fine with gcc 4.5.1, but broke with gcc 4.4.4 and 4.4.5.
      
      The reason is that the "aligned" attribute only specifies the _minimum_ alignment
      for a structure, leaving both the compiler and the linker free to align on
      larger multiples. Because tracepoint.c expects the structures to be placed as an
      array within each section, up-alignment causes NULL-pointer exceptions due to the
      extra unexpected padding.
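
      The pointer-array approach can be sketched as follows. The names
      (ptr_tp_sketch, count_enabled()) are hypothetical; the point is only
      that iteration strides over pointers of known size, so the spacing
      between the structures themselves no longer matters.

```c
#include <assert.h>
#include <stddef.h>

struct ptr_tp_sketch {
	const char *name;
	int state;
};

/*
 * Walk an array of const pointers rather than an array of structures.
 * Each step advances by sizeof(pointer), whatever padding the compiler
 * or linker placed between the pointed-to structures.
 */
static size_t count_enabled(struct ptr_tp_sketch * const *begin,
			    struct ptr_tp_sketch * const *end)
{
	struct ptr_tp_sketch * const *iter;
	size_t n = 0;

	assert(begin <= end);
	for (iter = begin; iter < end; iter++) {
		if ((*iter)->state)
			n++;
	}
	return n;
}
```

      The structures referenced by the array can sit at arbitrary
      addresses, with or without padding between them.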
      
      (this patch applies on top of -tip)

      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      LKML-Reference: <20110126222622.GA10794@Krystal>
      CC: Frederic Weisbecker <fweisbec@gmail.com>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Peter Zijlstra <peterz@infradead.org>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  28. 19 Oct, 2010 1 commit
  29. 23 Sep, 2010 1 commit
  30. 14 May, 2010 1 commit
    • S
      tracing: Let tracepoints have data passed to tracepoint callbacks · 38516ab5
      Steven Rostedt authored
      This patch adds data to be passed to tracepoint callbacks.
      
      The created functions from DECLARE_TRACE() now need a mandatory data
      parameter. For example:
      
      DECLARE_TRACE(mytracepoint, int value, value)
      
      Will create the register function:
      
      int register_trace_mytracepoint((void(*)(void *data, int value))probe,
                                      void *data);
      
      As the first argument, all callbacks (probes) must take a (void *data)
      parameter. So a callback for the above tracepoint will look like:
      
      void myprobe(void *data, int value)
      {
      }
      
      The callback may choose to ignore the data parameter.
      
      This change allows callbacks to register a private data pointer along
      with the function probe.
      
      	void mycallback(void *data, int value);
      
      	register_trace_mytracepoint(mycallback, mydata);
      
      Then the mycallback() will receive the "mydata" as the first parameter
      before the args.
      
      A more detailed example:
      
        DECLARE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status));
      
        /* In the C file */
      
        DEFINE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status));
      
        [...]
      
             trace_mytracepoint(status);
      
        /* In a file registering this tracepoint */
      
        void my_callback(void *data, int status)
        {
      	struct my_struct *my_data = data;
      	[...]
        }
      
        [...]
      	my_data = kmalloc(sizeof(*my_data), GFP_KERNEL);
      	init_my_data(my_data);
      	register_trace_mytracepoint(my_callback, my_data);
      
      The same callback can also be registered to the same tracepoint as long
      as the data registered is different. Note, the data must also be used
      to unregister the callback:
      
      	unregister_trace_mytracepoint(my_callback, my_data);
      
      Because of the data parameter, tracepoints declared this way cannot have
      an empty argument list. That is:
      
        DECLARE_TRACE(mytracepoint, TP_PROTO(void), TP_ARGS());
      
      will cause an error.
      
      If no arguments are needed, a new macro can be used instead:
      
        DECLARE_TRACE_NOARGS(mytracepoint);
      
      Since there are no arguments, the proto and args fields are left out.
      
      This is part of a series to make the tracepoint footprint smaller:
      
         text	   data	    bss	    dec	    hex	filename
      4913961	1088356	 861512	6863829	 68bbd5	vmlinux.orig
      4914025	1088868	 861512	6864405	 68be15	vmlinux.class
      4918492	1084612	 861512	6864616	 68bee8	vmlinux.tracepoint
      
      Again, this patch also increases the size of the kernel, but
      lays the ground work for decreasing it.
      
       v5: Fixed net/core/drop_monitor.c to handle these updates.
      
       v4: Moved the DECLARE_TRACE() and DECLARE_TRACE_NOARGS() out of the
           #ifdef CONFIG_TRACEPOINTS, since the two are the same in both
           cases. The __DECLARE_TRACE() is what changes.
           Thanks to Frederic Weisbecker for pointing this out.
      
       v3: Made all register_* functions require data to be passed and
           all callbacks to take a void * parameter as their first argument.
           This makes the calling functions comply with C standards.
      
           Also added more comments to the modifications of DECLARE_TRACE().
      
       v2: Made the DECLARE_TRACE() have the ability to pass arguments
           and added a new DECLARE_TRACE_NOARGS() for tracepoints that
           do not need any arguments.
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  31. 21 Sep, 2009 1 commit
  32. 27 Aug, 2009 1 commit
    • H
      tracing: Don't trace kernel thread syscalls · cc3b13c1
      Hendrik Brueckner authored
      Kernel threads don't call syscalls using the sysenter/sysexit
      path. Instead they directly call the sys_* or do_* functions
      that implement the syscalls inside the kernel.
      
      The current syscall tracepoints only hook the sysenter/sysexit
      path, so they never see the syscalls that kernel threads make.
      Setting the TIF_SYSCALL_TRACEPOINT flag is therefore useless for
      these threads.
      
      There is only one case in which a kernel thread can reach the
      usual syscall exit tracing path: when we create a kernel thread, the
      child comes through ret_from_fork and the fork() return is then traced.
      But this information alone is useless, so we don't want to set the
      TIF flags for these threads.
      
      Kernel threads have task_struct->mm set to NULL.
      (Thanks to Heiko for that hint ;-)
      The idea is then to check the mm field in syscall_regfunc() and
      set the flag accordingly.
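
      The check described above can be illustrated with a small userspace
      model. Everything here is an assumption made for the sketch: struct
      model_task, the boolean standing in for TIF_SYSCALL_TRACEPOINT, and
      model_syscall_regfunc() are hypothetical, and the real
      syscall_regfunc() walks the kernel's task list under locking that is
      not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's task_struct. */
struct model_mm {
	int unused;
};

struct model_task {
	const char *comm;
	struct model_mm *mm;     /* NULL for kernel threads */
	bool syscall_tracepoint; /* stands in for TIF_SYSCALL_TRACEPOINT */
};

/* Set the flag only on tasks that have a user address space. */
static void model_syscall_regfunc(struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (tasks[i].mm)
			tasks[i].syscall_tracepoint = true;
	}
}

/* Usage sketch: one user task and one kernel thread. */
static void model_regfunc_demo(void)
{
	struct model_mm mm = { 0 };
	struct model_task tasks[] = {
		{ "bash", &mm, false },
		{ "kthreadd", NULL, false },
	};

	model_syscall_regfunc(tasks, 2);
	assert(tasks[0].syscall_tracepoint);  /* user task is flagged */
	assert(!tasks[1].syscall_tracepoint); /* kernel thread is skipped */
}
```
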
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      LKML-Reference: <20090825160237.GG4639@cetus.boeblingen.de.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  33. 26 Aug, 2009 2 commits
    • J
      tracing: Move tracepoint callbacks from declaration to definition · 97419875
      Josh Stone authored
      It's not strictly correct for the tracepoint reg/unreg callbacks to
      occur when a client is hooking up, because the actual tracepoint may not
      be present yet.  This happens to be fine for syscall, since that's in
      the core kernel, but it would cause problems for tracepoints defined in
      a module that hasn't been loaded yet.  It also means the reg/unreg has
      to be EXPORTed for any modules to use the tracepoint (as in SystemTap).
      
      This patch removes DECLARE_TRACE_WITH_CALLBACK, and instead introduces
      DEFINE_TRACE_FN, which stores the callbacks in struct tracepoint.  The
      callbacks are now invoked when the active state of the tracepoint
      changes in set_tracepoint() and disable_tracepoint().
      
      This also introduces TRACE_EVENT_FN, so ftrace events can also provide
      registration callbacks if needed.
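
      The idea of storing the callbacks with the tracepoint definition and
      calling them on an active-state change can be sketched in userspace C.
      The struct layout, MODEL_DEFINE_TRACE_FN, model_attach() and
      model_detach() below are hypothetical illustrations, not the kernel's
      struct tracepoint or its set_tracepoint/disable_tracepoint code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; field and function names are illustrative only. */
struct model_tracepoint {
	const char *name;
	int active;              /* nonzero while a probe is attached */
	int nr_probes;
	void (*regfunc)(void);   /* called when the point becomes active */
	void (*unregfunc)(void); /* called when the point becomes inactive */
};

static int reg_calls, unreg_calls;

static void on_reg(void)
{
	reg_calls++;
}

static void on_unreg(void)
{
	unreg_calls++;
}

/* The callbacks live in the definition, so they exist with the point. */
#define MODEL_DEFINE_TRACE_FN(ident, reg, unreg) \
	static struct model_tracepoint ident = { #ident, 0, 0, reg, unreg }

MODEL_DEFINE_TRACE_FN(model_sys_enter, on_reg, on_unreg);

/* First probe attached: inactive -> active, so run regfunc once. */
static void model_attach(struct model_tracepoint *tp)
{
	if (tp->nr_probes++ == 0) {
		if (tp->regfunc)
			tp->regfunc();
		tp->active = 1;
	}
}

/* Last probe detached: active -> inactive, so run unregfunc once. */
static void model_detach(struct model_tracepoint *tp)
{
	if (tp->nr_probes == 0)
		return;
	if (--tp->nr_probes == 0) {
		if (tp->unregfunc)
			tp->unregfunc();
		tp->active = 0;
	}
}

/* Usage sketch: two probes trigger each callback exactly once. */
static void model_fn_demo(void)
{
	model_attach(&model_sys_enter);
	model_attach(&model_sys_enter);
	assert(reg_calls == 1 && model_sys_enter.active == 1);
	model_detach(&model_sys_enter);
	assert(unreg_calls == 0);
	model_detach(&model_sys_enter);
	assert(unreg_calls == 1 && model_sys_enter.active == 0);
}
```
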
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-4-git-send-email-jistone@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    • J
      tracing: Make syscall tracepoints conditional · 3d27d8cb
      Josh Stone authored
      The syscall enter/exit tracepoints are only supported on architectures
      that select HAVE_SYSCALL_TRACEPOINTS, so the declarations should be
      #ifdef'ed.
      Also, the definition of syscall_regfunc and syscall_unregfunc should
      depend on this same config, rather than the ftrace-specific one.
      Signed-off-by: Josh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <1251150194-1713-3-git-send-email-jistone@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>