1. 01 Apr 2017, 5 commits
    • ftrace: Create separate t_func_next() to simplify the function / hash logic · 5bd84629
      Authored by Steven Rostedt (VMware)
      I noticed that if I use dd to read the set_ftrace_filter file, the first
      hash command is repeated.
      
       # cd /sys/kernel/debug/tracing
       # echo schedule > set_ftrace_filter
       # echo do_IRQ >> set_ftrace_filter
       # echo schedule:traceoff >> set_ftrace_filter
       # echo do_IRQ:traceoff >> set_ftrace_filter
      
       # cat set_ftrace_filter
       schedule
       do_IRQ
       schedule:traceoff:unlimited
       do_IRQ:traceoff:unlimited
      
       # dd if=set_ftrace_filter bs=1
       schedule
       do_IRQ
       schedule:traceoff:unlimited
       schedule:traceoff:unlimited
       do_IRQ:traceoff:unlimited
       98+0 records in
       98+0 records out
       98 bytes copied, 0.00265011 s, 37.0 kB/s
      
      This is due to the way t_start() calls t_next(), while the seq_file code
      also calls t_next(), and the state is slightly different between the two.
      Namely, t_start() will call t_next() with a local "pos" variable.
      
      By separating out the function listing from t_next() into its own function,
      we can have better control of outputting the functions and the hash of
      triggers. This simplifies the code.
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Update func_pos in t_start() when all functions are enabled · 43ff926a
      Authored by Steven Rostedt (VMware)
      If all functions are enabled, there's a comment displayed in the file to
      denote that:
      
        # cd /sys/kernel/debug/tracing
        # cat set_ftrace_filter
       #### all functions enabled ####
      
      If a function trigger is set, those are displayed as well:
      
        # echo schedule:traceoff >> /debug/tracing/set_ftrace_filter
        # cat set_ftrace_filter
       #### all functions enabled ####
       schedule:traceoff:unlimited
      
      But if you read that file with dd, the output can change:
      
        # dd if=/debug/tracing/set_ftrace_filter bs=1
       #### all functions enabled ####
       32+0 records in
       32+0 records out
       32 bytes copied, 7.0237e-05 s, 456 kB/s
      
      This is because the "pos" variable is updated for the comment, but func_pos
      is not. "func_pos" is used by the triggers (or hashes) to know how many
      functions were printed, as it bases its index on pos - func_pos.
      func_pos should be 1 to account for the comment that was printed. But since
      it is not, t_hash_start() thinks that one trigger was already printed.

      The cat gets to t_hash_start() via t_next(), and not t_start(), which
      updates both pos and func_pos.
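      The fix is essentially a one-line accounting change in t_start(); a sketch
      (simplified, not the verbatim patch):

       	if (iter->flags & FTRACE_ITER_PRINTALL) {
       		/* the "#### all functions enabled ####" banner
       		 * consumed position 0, so triggers index from 1 */
       		iter->func_pos = 1;
       	}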
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Return NULL at end of t_start() instead of calling t_hash_start() · 2d71d989
      Authored by Steven Rostedt (VMware)
      The loop in t_start() that calls t_next() will call t_hash_start() if the
      pos is beyond the functions and has entered the hash items. There's no
      reason to then check if p is NULL and call t_hash_start() again, as that
      would be redundant.
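      Before and after, at the end of t_start() (a sketch):

       	/* before: redundant, t_next() already handled the
       	 * hand-off into the hash items */
       	if (!p)
       		return t_hash_start(m, pos);

       	/* after */
       	if (!p)
       		return NULL;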
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Assign iter->hash to filter or notrace hashes on seq read · c20489da
      Authored by Steven Rostedt (VMware)
      Instead of testing if the hash to use is the filter_hash or the notrace_hash
      at each iteration, do the test at open, and set the iter->hash to point to
      the corresponding filter or notrace hash. Then use that directly instead of
      testing which hash needs to be used each iteration.
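      Sketched, in the open path (a sketch assuming the usual ops->func_hash
      layout in this file):

       	struct ftrace_hash *hash;

       	if (iter->flags & FTRACE_ITER_NOTRACE)
       		hash = ops->func_hash->notrace_hash;
       	else
       		hash = ops->func_hash->filter_hash;

       	iter->hash = hash;	/* iteration uses iter->hash from here on */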
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Clean up __seq_open_private() return check · c1bc5919
      Authored by Steven Rostedt (VMware)
      The return status check of __seq_open_private() is rather strange:
      
      	iter = __seq_open_private();
      	if (iter) {
      		/* do stuff */
      	}
      
      	return iter ? 0 : -ENOMEM;
      
      It makes much more sense to return the failure right away:
      
      	iter = __seq_open_private();
      	if (!iter)
      		return -ENOMEM;
      
      	/* do stuff */
      
      	return 0;
      
      This cleanup will make updates to this code a bit nicer.
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  2. 25 Mar 2017, 5 commits
    • tracing: Move trace_handle_return() out of line · af0009fc
      Authored by Steven Rostedt (VMware)
      Currently trace_handle_return() looks like this:
      
       static inline enum print_line_t trace_handle_return(struct trace_seq *s)
       {
              return trace_seq_has_overflowed(s) ?
                      TRACE_TYPE_PARTIAL_LINE : TRACE_TYPE_HANDLED;
       }
      
      Where trace_seq_has_overflowed(s) is:
      
       static inline bool trace_seq_has_overflowed(struct trace_seq *s)
       {
      	return s->full || seq_buf_has_overflowed(&s->seq);
       }
      
      And seq_buf_has_overflowed(&s->seq) is:
      
       static inline bool
       seq_buf_has_overflowed(struct seq_buf *s)
       {
      	return s->len > s->size;
       }
      
      This makes trace_handle_return() expand into:

       return (s->full || (s->seq.len > s->seq.size)) ?
                 TRACE_TYPE_PARTIAL_LINE :
                 TRACE_TYPE_HANDLED;
      
      One would think this is not an issue to keep as an inline. But because this
      is used in the TRACE_EVENT() macro, it is expanded for every tracepoint in
      the system. Take a look at a single tracepoint, x86_irq_vector (the first
      one I randomly chose). As trace_handle_return() is used in the
      trace_raw_output_##call() function that the TRACE_EVENT() macro generates,
      we disassemble trace_raw_output_x86_irq_vector and do a diff:
      
      - is the original
      + is the out-of-line code
      
      I removed identical lines that were different just due to different
      addresses.
      
      --- /tmp/irq-vec-orig	2017-03-16 09:12:48.569384851 -0400
      +++ /tmp/irq-vec-ool	2017-03-16 09:13:39.378153385 -0400
      @@ -6,27 +6,23 @@
              53                      push   %rbx
              48 89 fb                mov    %rdi,%rbx
              4c 8b a7 c0 20 00 00    mov    0x20c0(%rdi),%r12
              e8 f7 72 13 00          callq  ffffffff81155c80 <trace_raw_output_prep>
              83 f8 01                cmp    $0x1,%eax
              74 05                   je     ffffffff8101e993 <trace_raw_output_x86_irq_vector+0x23>
              5b                      pop    %rbx
              41 5c                   pop    %r12
              5d                      pop    %rbp
              c3                      retq
              41 8b 54 24 08          mov    0x8(%r12),%edx
      -       48 8d bb 98 10 00 00    lea    0x1098(%rbx),%rdi
      +       48 81 c3 98 10 00 00    add    $0x1098,%rbx
      -       48 c7 c6 7b 8a a0 81    mov    $0xffffffff81a08a7b,%rsi
      +       48 c7 c6 ab 8a a0 81    mov    $0xffffffff81a08aab,%rsi
      -       e8 c5 85 13 00          callq  ffffffff81156f70 <trace_seq_printf>
      
       === here's the start of the main difference ===
      
      +       48 89 df                mov    %rbx,%rdi
      +       e8 62 7e 13 00          callq  ffffffff81156810 <trace_seq_printf>
      -       8b 93 b8 20 00 00       mov    0x20b8(%rbx),%edx
      -       31 c0                   xor    %eax,%eax
      -       85 d2                   test   %edx,%edx
      -       75 11                   jne    ffffffff8101e9c8 <trace_raw_output_x86_irq_vector+0x58>
      -       48 8b 83 a8 20 00 00    mov    0x20a8(%rbx),%rax
      -       48 39 83 a0 20 00 00    cmp    %rax,0x20a0(%rbx)
      -       0f 93 c0                setae  %al
      +       48 89 df                mov    %rbx,%rdi
      +       e8 4a c5 12 00          callq  ffffffff8114af00 <trace_handle_return>
              5b                      pop    %rbx
      -       0f b6 c0                movzbl %al,%eax
      
       === end ===
      
              41 5c                   pop    %r12
              5d                      pop    %rbp
              c3                      retq
      
      If you notice, the original has 22 more bytes of text than the out-of-line
      version. As this is for every TRACE_EVENT() defined in the system, this can
      become quite large.
      
         text	   data	    bss	    dec	    hex	filename
      8690305	5450490	1298432	15439227	 eb957b	vmlinux-orig
      8681725	5450490	1298432	15430647	 eb73f7	vmlinux-handle
      
      This change has a total of 8580 bytes in savings.
      
       $ objdump -dr /tmp/vmlinux-orig | grep '^[0-9a-f]* <trace_raw_output' | wc -l
      324
      
      That's 324 tracepoints. But this does not include modules (which contain
      many more tracepoints). For an allyesconfig build:
      
       $ objdump -dr vmlinux-allyes-orig | grep '^[0-9a-f]* <trace_raw_output' | wc -l
      1401
      
      That's 1401 tracepoints giving us:
      
         text    data     bss     dec     hex filename
      137920629       140221067       53264384        331406080       13c0db00 vmlinux-allyes-orig
      137827709       140221067       53264384        331313160       13bf7008 vmlinux-allyes-handle
      
      92920 bytes in savings!!!
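      The change itself is small; the inline body shown above becomes a single
      shared function (a sketch; the exact placement in kernel/trace/ is
      assumed):

       /* out of line: one copy, shared by every trace_raw_output_*() */
       enum print_line_t trace_handle_return(struct trace_seq *s)
       {
       	return trace_seq_has_overflowed(s) ?
       		TRACE_TYPE_PARTIAL_LINE : TRACE_TYPE_HANDLED;
       }
       EXPORT_SYMBOL_GPL(trace_handle_return);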
      
      Link: http://lkml.kernel.org/r/20170315021431.13107-2-andi@firstfloor.org
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Allow for function tracing to record init functions on boot up · 42c269c8
      Authored by Steven Rostedt (VMware)
      Adding a hook into free_reserved_area() that informs ftrace that boot-up
      init text is being freed lets ftrace safely remove those init functions
      from its records, which keeps ftrace from trying to modify text that no
      longer exists.
      
      Note, this still does not allow for tracing the .init text of modules, as
      modules require different work for freeing their init code.
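      On the ftrace side, the idea is to walk the records and drop those that
      point into the freed init range; a rough sketch (the helper name and the
      range symbols here are illustrative, not the exact kernel API):

       void ftrace_free_init_mem(void)
       {
       	unsigned long start = (unsigned long)&__init_begin;
       	unsigned long end = (unsigned long)&__init_end;
       	struct dyn_ftrace *rec;
       	struct ftrace_page *pg;

       	mutex_lock(&ftrace_lock);
       	do_for_each_ftrace_rec(pg, rec) {
       		if (rec->ip >= start && rec->ip < end)
       			ftrace_remove_rec(rec);	/* illustrative helper */
       	} while_for_each_ftrace_rec();
       	mutex_unlock(&ftrace_lock);
       }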
      
      Link: http://lkml.kernel.org/r/1488502497.7212.24.camel@linux.intel.com
      
      Cc: linux-mm@kvack.org
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Requested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Have function tracing start in early boot up · dbeafd0d
      Authored by Steven Rostedt (VMware)
      Register the function tracer right after the tracing buffers are initialized
      in early boot up. This will allow function tracing to begin early if it is
      enabled via the kernel command line.
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Postpone tracer start-up tests till the system is more robust · 9afecfbb
      Authored by Steven Rostedt (VMware)
      As tracing can now be enabled very early in boot up, even before some
      critical system services (like scheduling), do not run the tracer selftests
      until after early_initcall() is performed. If a tracer is registered before
      such time, it is saved off in a list and the test is run when the system is
      able to handle more diverse functions.
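      Sketched (simplified from the description above; list handling shown, the
      deferred test runner elided):

       struct trace_selftests {
       	struct list_head	list;
       	struct tracer		*type;
       };

       static LIST_HEAD(postponed_selftests);

       static int save_selftest(struct tracer *type)
       {
       	struct trace_selftests *selftest;

       	selftest = kmalloc(sizeof(*selftest), GFP_KERNEL);
       	if (!selftest)
       		return -ENOMEM;

       	selftest->type = type;
       	list_add(&selftest->list, &postponed_selftests);
       	return 0;
       }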
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Split tracing initialization into two for early initialization · e725c731
      Authored by Steven Rostedt (VMware)
      Create an early_trace_init() function that will initialize the buffers and
      allow for earlier use of trace_printk(). This will also allow future work
      to have function tracing start earlier at boot up.
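      The resulting boot ordering, sketched from start_kernel() (positions
      approximate, surrounding calls elided):

       asmlinkage __visible void __init start_kernel(void)
       {
       	/* ... early setup ... */
       	early_trace_init();	/* buffers ready; trace_printk() usable */
       	/* ... scheduler, timers, more setup ... */
       	trace_init();		/* the rest of tracing initialization */
       	/* ... */
       }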
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  3. 10 Mar 2017, 1 commit
  4. 03 Mar 2017, 2 commits
  5. 02 Mar 2017, 6 commits
  6. 01 Mar 2017, 1 commit
  7. 28 Feb 2017, 1 commit
  8. 23 Feb 2017, 1 commit
    • tracing: add __print_flags_u64() · d3213e8f
      Authored by Ross Zwisler
      Patch series "DAX tracepoints, mm argument simplification", v4.
      
      This contains both my DAX tracepoint code and Dave Jiang's MM argument
      simplifications.  Dave's code was written with my tracepoint code as a
      baseline, so it seemed simplest to keep them together in a single series.
      
      This patch (of 7):
      
      Add __print_flags_u64() and the helper trace_print_flags_seq_u64() in the
      same spirit as __print_symbolic_u64() and trace_print_symbols_seq_u64().
      These functions allow us to print symbols associated with flags that are
      64 bits wide even on 32 bit machines.
      
      These will be used by the DAX code so that we can print the flags set in a
      pfn_t such as PFN_SG_CHAIN, PFN_SG_LAST, PFN_DEV and PFN_MAP.
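      For example, in a tracepoint's TP_printk() (a usage sketch mirroring
      __print_flags(); the actual DAX tracepoints come later in the series):

       TP_printk("pfn_t flags %s",
       	__print_flags_u64(__entry->flags, "|",
       		{ PFN_SG_CHAIN,	"SG_CHAIN" },
       		{ PFN_SG_LAST,	"SG_LAST" },
       		{ PFN_DEV,	"DEV" },
       		{ PFN_MAP,	"MAP" }))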
      
      Without this new function I was getting errors like the following when
      compiling for i386:
      
        include/linux/pfn_t.h:13:22: warning: large integer implicitly truncated to unsigned type [-Woverflow]
         #define PFN_SG_CHAIN (1ULL << (BITS_PER_LONG_LONG - 1))
          ^
      
      Link: http://lkml.kernel.org/r/1484085142-2297-2-git-send-email-ross.zwisler@linux.intel.com
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 18 Feb 2017, 1 commit
  10. 17 Feb 2017, 1 commit
  11. 15 Feb 2017, 6 commits
  12. 11 Feb 2017, 1 commit
  13. 04 Feb 2017, 1 commit
  14. 03 Feb 2017, 8 commits
    • ftrace: Have set_graph_function handle multiple functions in one write · e704eff3
      Authored by Steven Rostedt (VMware)
      Currently, only one function can be written to set_graph_function and
      set_graph_notrace per write; only the last function in the list is saved,
      even though other functions may be added and then removed.

      Change the behavior to match set_ftrace_filter, allowing multiple
      functions to be written. If any one of them fails, none of them is added.
      The actual addition of the functions is done at the end, when the file is
      closed.
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Do not hold references of ftrace_graph_{notrace_}hash out of graph_lock · 649b988b
      Authored by Steven Rostedt (VMware)
      The hashes ftrace_graph_hash and ftrace_graph_notrace_hash are modified
      with the graph_lock held. Holding a pointer to them and passing it along
      can lead to the use of a stale pointer (fgd->hash). Move the assignment of
      the pointer and its use to within the holding of the lock. Note, this is
      rcu_sched-protected data, and other instances of referencing these hashes
      are done with preemption disabled. But the file manipulation code must be
      protected by the lock.
      
      The fgd->hash pointer is set to NULL when the lock is being released.
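      Sketched for the filter case (the notrace case is symmetric; a sketch, not
      the verbatim patch):

       	mutex_lock(&graph_lock);

       	/* take the pointer and use it only while the lock is held */
       	fgd->hash = rcu_dereference_protected(ftrace_graph_hash,
       				lockdep_is_held(&graph_lock));

       	/* ... file manipulation using fgd->hash ... */

       	fgd->hash = NULL;
       	mutex_unlock(&graph_lock);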
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Reset parser->buffer to allow multiple "puts" · 0e684b65
      Authored by Steven Rostedt (VMware)
      trace_parser_put() simply frees the allocated parser buffer. But it does not
      reset the pointer that was freed. This means that if trace_parser_put() is
      called on the same parser more than once, it will corrupt the allocation
      system. Setting parser->buffer to NULL after free allows it to be called
      more than once without any ill effect.
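      The fix, sketched:

       void trace_parser_put(struct trace_parser *parser)
       {
       	kfree(parser->buffer);
       	parser->buffer = NULL;	/* make a second put harmless */
       }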
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Have set_graph_functions handle write with RDWR · ae98d27a
      Authored by Steven Rostedt (VMware)
      Reading set_graph_function uses the seq functions, which set the
      file->private_data pointer to a seq_file descriptor. On writes, the
      ftrace_graph_data descriptor is taken from file->private_data. But if the
      file is opened for RDWR, ftrace_graph_write() will incorrectly use the
      file->private_data descriptor directly instead of the
      ((struct seq_file *)file->private_data)->private pointer, and this can
      crash the kernel.
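      The write path has to account for both open modes; a sketch:

       	struct ftrace_graph_data *fgd;

       	if (file->f_mode & FMODE_READ) {
       		struct seq_file *m = file->private_data;
       		fgd = m->private;	/* seq_file wraps our descriptor */
       	} else {
       		fgd = file->private_data;
       	}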
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Reset fgd->hash in ftrace_graph_write() · d4ad9a1c
      Authored by Steven Rostedt (VMware)
      fgd->hash is saved and then freed, but is never reset to either
      ftrace_graph_hash or ftrace_graph_notrace_hash. But if multiple writes are
      performed, then the freed hash could be accessed again.
      
       # cd /sys/kernel/debug/tracing
       # head -1000 available_filter_functions > /tmp/funcs
       # cat /tmp/funcs > set_graph_function
      
      Causes:
      
       general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
       Modules linked in:  [...]
       CPU: 2 PID: 1337 Comm: cat Not tainted 4.10.0-rc2-test-00010-g6b052e9 #32
       Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
       task: ffff880113a12200 task.stack: ffffc90001940000
       RIP: 0010:free_ftrace_hash+0x7c/0x160
       RSP: 0018:ffffc90001943db0 EFLAGS: 00010246
       RAX: 6b6b6b6b6b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: 6b6b6b6b6b6b6b6b
       RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffff8800ce1e1d40
       RBP: ffff8800ce1e1d50 R08: 0000000000000000 R09: 0000000000006400
       R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
       R13: ffff8800ce1e1d40 R14: 0000000000004000 R15: 0000000000000001
       FS:  00007f9408a07740(0000) GS:ffff88011e500000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000aee1f0 CR3: 0000000116bb4000 CR4: 00000000001406e0
       Call Trace:
        ? ftrace_graph_write+0x150/0x190
        ? __vfs_write+0x1f6/0x210
        ? __audit_syscall_entry+0x17f/0x200
        ? rw_verify_area+0xdb/0x210
        ? _cond_resched+0x2b/0x50
        ? __sb_start_write+0xb4/0x130
        ? vfs_write+0x1c8/0x330
        ? SyS_write+0x62/0xf0
        ? do_syscall_64+0xa3/0x1b0
        ? entry_SYSCALL64_slow_path+0x25/0x25
       Code: 01 48 85 db 0f 84 92 00 00 00 b8 01 00 00 00 d3 e0 85 c0 7e 3f 83 e8 01 48 8d 6f 10 45 31 e4 4c 8d 34 c5 08 00 00 00 49 8b 45 08 <4a> 8b 34 20 48 85 f6 74 13 48 8b 1e 48 89 ef e8 20 fa ff ff 48
       RIP: free_ftrace_hash+0x7c/0x160 RSP: ffffc90001943db0
       ---[ end trace 999b48216bf4b393 ]---
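      The fix re-reads the live global hash once the old one has been freed; a
      sketch:

       	/* in ftrace_graph_write(), after freeing the old hash */
       	mutex_lock(&graph_lock);
       	if (fgd->type == GRAPH_FILTER_FUNCTION)
       		fgd->hash = rcu_dereference_protected(ftrace_graph_hash,
       					lockdep_is_held(&graph_lock));
       	else
       		fgd->hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
       					lockdep_is_held(&graph_lock));
       	mutex_unlock(&graph_lock);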
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Replace (void *)1 with a meaningful macro name FTRACE_GRAPH_EMPTY · 555fc781
      Authored by Steven Rostedt (VMware)
      When set_graph_function or set_graph_notrace contains no records, a banner
      is displayed of either "#### all functions enabled ####" or
      "#### all functions disabled ####" respectively. To tell the seq operations
      to do this, (void *)1 is passed as a return value. Instead of using a
      hardcoded meaningless value, define it as a macro.
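      The macro (its value is given above; the name now says what the magic
      pointer means):

       /* a marker, not a real pointer: "hash is empty, print the banner" */
       #define FTRACE_GRAPH_EMPTY	((void *)1)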
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • ftrace: Create a slight optimization on searching the ftrace_hash · 2b2c279c
      Authored by Steven Rostedt (VMware)
      This is a micro-optimization, but as it has to deal with a fast path of the
      function tracer, these optimizations can be noticed.

      ftrace_lookup_ip() returns true if the given ip is found in the hash; if
      it's not found or the hash is NULL, it returns false. But there are cases
      where a NULL hash counts as true, and in those cases ftrace_hash_empty()
      is tested before calling ftrace_lookup_ip(). As ftrace_lookup_ip() performs
      that test first as well, those callers execute a few extra unneeded
      instructions.

      A new static "always_inlined" function is created that does not perform the
      hash-empty test. It must only be used by callers that do the check first
      anyway, as an empty or NULL hash could cause a crash if a lookup is
      performed on it.
      
      Also add kernel doc for the ftrace_lookup_ip() main function.
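      The split, sketched (kernel doc elided; a sketch of the shape described
      above, not the verbatim patch):

       /* only for callers that already checked for an empty hash */
       static __always_inline struct ftrace_func_entry *
       __ftrace_lookup_ip(struct ftrace_hash *hash, unsigned long ip)
       {
       	unsigned long key = ftrace_hash_key(hash, ip);
       	struct hlist_head *hhd = &hash->buckets[key];
       	struct ftrace_func_entry *entry;

       	hlist_for_each_entry_rcu_notrace(entry, hhd, hlist) {
       		if (entry->ip == ip)
       			return entry;
       	}
       	return NULL;
       }

       struct ftrace_func_entry *
       ftrace_lookup_ip(struct ftrace_hash *hash, unsigned long ip)
       {
       	if (ftrace_hash_empty(hash))
       		return NULL;
       	return __ftrace_lookup_ip(hash, ip);
       }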
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
    • tracing: Add ftrace_hash_key() helper function · 2b0cce0e
      Authored by Steven Rostedt (VMware)
      Replace the couple of places that open-code the small bit of logic
      producing the ftrace function key id with a helper function. No need for
      duplicate code.
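      The helper, sketched (a sketch; mapping an empty hash to bucket 0 is how
      the open-coded sites behave):

       static __always_inline unsigned long
       ftrace_hash_key(struct ftrace_hash *hash, unsigned long ip)
       {
       	if (hash->size_bits)
       		return hash_long(ip, hash->size_bits);
       	return 0;
       }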
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>