    ring-buffer: Add multiple iterations between benchmark timestamps · a6f0eb6a
    Committed by Steven Rostedt
    The ring_buffer_benchmark calls gettimeofday after every write to the
    ring buffer when taking its measurements. This adds the overhead of the
    gettimeofday call to the measurements and does not give an accurate
    picture of the time it takes to record a trace.
    
    This was first noticed with perf top:
    
    ------------------------------------------------------------------------------
       PerfTop:     679 irqs/sec  kernel:99.9% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
    ------------------------------------------------------------------------------
    
                 samples    pcnt   kernel function
                 _______   _____   _______________
    
                 1673.00 - 27.8% : trace_clock_local
                  806.00 - 13.4% : do_gettimeofday
                  590.00 -  9.8% : rb_reserve_next_event
                  554.00 -  9.2% : native_read_tsc
                  431.00 -  7.2% : ring_buffer_lock_reserve
                  365.00 -  6.1% : __rb_reserve_next
                  355.00 -  5.9% : rb_end_commit
                  322.00 -  5.4% : getnstimeofday
                  268.00 -  4.5% : ring_buffer_unlock_commit
                  262.00 -  4.4% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                  113.00 -  1.9% : read_tsc
                   91.00 -  1.5% : debug_smp_processor_id
                   69.00 -  1.1% : trace_recursive_unlock
                   66.00 -  1.1% : ring_buffer_event_data
                   25.00 -  0.4% : _spin_unlock_irq
    
    With that overhead included, each write to the ring buffer measured at 310ns.
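    
    For context, a rough sketch of the producer loop shape being described
    (not the literal benchmark code; start_tv/end_tv, hit/missed, RUN_TIME
    and kill_test stand in for the benchmark's run-control state):
    
	do_gettimeofday(&start_tv);
	do {
		struct ring_buffer_event *event;
		int *entry;

		event = ring_buffer_lock_reserve(buffer, 10);
		if (!event) {
			missed++;
		} else {
			hit++;
			entry = ring_buffer_event_data(event);
			*entry = smp_processor_id();
			ring_buffer_unlock_commit(buffer, event);
		}
		/* one gettimeofday per write: its cost lands in every sample */
		do_gettimeofday(&end_tv);
	} while (end_tv.tv_sec < start_tv.tv_sec + RUN_TIME && !kill_test);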
    
    This patch adds a new module parameter called "write_interval", which
    defaults to 50. This is the number of writes performed between
    timestamps. After this patch, perf top shows:
    
    ------------------------------------------------------------------------------
       PerfTop:     244 irqs/sec  kernel:100.0% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
    ------------------------------------------------------------------------------
    
                 samples    pcnt   kernel function
                 _______   _____   _______________
    
                 2842.00 - 40.4% : trace_clock_local
                 1043.00 - 14.8% : rb_reserve_next_event
                  784.00 - 11.1% : ring_buffer_lock_reserve
                  600.00 -  8.5% : __rb_reserve_next
                  579.00 -  8.2% : rb_end_commit
                  440.00 -  6.3% : ring_buffer_unlock_commit
                  290.00 -  4.1% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                  155.00 -  2.2% : debug_smp_processor_id
                  117.00 -  1.7% : trace_recursive_unlock
                  103.00 -  1.5% : ring_buffer_event_data
                   28.00 -  0.4% : do_gettimeofday
                   22.00 -  0.3% : _spin_unlock_irq
                   14.00 -  0.2% : native_read_tsc
                   11.00 -  0.2% : getnstimeofday
    
    do_gettimeofday dropped from 13.4% usage to a mere 0.4% (using the default
    interval of 50). The measured time per write went from 310ns to 210ns.
    That's 100ns (roughly a third) of overhead that the gettimeofday call was
    introducing.
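    
    For illustration, a rough sketch of the batched version (again not the
    literal diff; the parameter declaration is inferred from the description
    above, and the run-control names are the same stand-ins as before):
    
	static unsigned int write_interval = 50;
	module_param(write_interval, uint, 0644);
	MODULE_PARM_DESC(write_interval, "Number of writes between timestamps");

	/* ... inside the producer thread ... */

	do_gettimeofday(&start_tv);
	do {
		unsigned int i;

		/* write_interval writes per timestamp read: the gettimeofday
		 * cost is amortized across the whole batch */
		for (i = 0; i < write_interval; i++) {
			struct ring_buffer_event *event;
			int *entry;

			event = ring_buffer_lock_reserve(buffer, 10);
			if (!event) {
				missed++;
				continue;
			}
			hit++;
			entry = ring_buffer_event_data(event);
			*entry = smp_processor_id();
			ring_buffer_unlock_commit(buffer, event);
		}
		/* one gettimeofday per batch of write_interval writes */
		do_gettimeofday(&end_tv);
	} while (end_tv.tv_sec < start_tv.tv_sec + RUN_TIME && !kill_test);
    
    With the default of 50, the timestamp read is paid once per 50 writes
    instead of once per write, which is why its share in the profile drops
    from over 13% to under 1%.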
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>