1. 17 Nov, 2009 1 commit
    • ring-buffer: Move access to commit_page up into function used · 5a50e33c
      Authored by Steven Rostedt
      With the change in the way we process commits, a commit only happens
      at the outermost level, and we no longer need to worry about
      a commit ending after rb_start_commit() has been called. The code
      used to grab the commit page before the tail page to prevent a possible
      race, but this race no longer exists with the rb_start_commit()/
      rb_end_commit() interface.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 12 Nov, 2009 2 commits
    • tracing: do not disable interrupts for trace_clock_local · 8b2a5dac
      Authored by Steven Rostedt
      Disabling interrupts in trace_clock_local takes quite a performance
      hit to the recording of traces. Using perf top we see:
      
      ------------------------------------------------------------------------------
         PerfTop:     244 irqs/sec  kernel:100.0% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   2842.00 - 40.4% : trace_clock_local
                   1043.00 - 14.8% : rb_reserve_next_event
                    784.00 - 11.1% : ring_buffer_lock_reserve
                    600.00 -  8.5% : __rb_reserve_next
                    579.00 -  8.2% : rb_end_commit
                    440.00 -  6.3% : ring_buffer_unlock_commit
                    290.00 -  4.1% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    155.00 -  2.2% : debug_smp_processor_id
                    117.00 -  1.7% : trace_recursive_unlock
                    103.00 -  1.5% : ring_buffer_event_data
                     28.00 -  0.4% : do_gettimeofday
                     22.00 -  0.3% : _spin_unlock_irq
                     14.00 -  0.2% : native_read_tsc
                     11.00 -  0.2% : getnstimeofday
      
      Where trace_clock_local is 40% of the tracing, and the time for recording
      a trace according to ring_buffer_benchmark is 210ns. After converting
      the interrupts to preemption disabling we have from perf top:
      
      ------------------------------------------------------------------------------
         PerfTop:    1084 irqs/sec  kernel:99.9% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   1277.00 - 16.8% : native_read_tsc
                   1148.00 - 15.1% : rb_reserve_next_event
                    896.00 - 11.8% : ring_buffer_lock_reserve
                    688.00 -  9.1% : __rb_reserve_next
                    664.00 -  8.8% : rb_end_commit
                    563.00 -  7.4% : ring_buffer_unlock_commit
                    508.00 -  6.7% : _spin_unlock_irq
                    365.00 -  4.8% : debug_smp_processor_id
                    321.00 -  4.2% : trace_clock_local
                    303.00 -  4.0% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    273.00 -  3.6% : native_sched_clock
                    122.00 -  1.6% : trace_recursive_unlock
                    113.00 -  1.5% : sched_clock
                    101.00 -  1.3% : ring_buffer_event_data
                     53.00 -  0.7% : tick_nohz_stop_sched_tick
      
      Where trace_clock_local drops from 40% to only taking 4% of the total time.
      The trace time also goes from 210ns down to 179ns (a saving of 31ns).
      
      I talked with Peter Zijlstra about the impact that sched_clock may have
      without having interrupts disabled, and he told me that if a timer interrupt
      comes in, sched_clock may report a wrong time.
      
      Balancing a seldom incorrect timestamp with a 15% performance boost, I'll
      take the performance boost.
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
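The figures quoted in the trace_clock_local commit message can be checked with a little arithmetic. What follows is a minimal userspace C sketch, not kernel code; the helper names `percent_saved` and `throughput_gain_percent` are made up for illustration. It suggests that the quoted "15%" matches the time saved per event (31ns out of 210ns), while the gain in events per second comes out slightly higher.

```c
/* Percentage of per-event time saved when going from before_ns to after_ns. */
static double percent_saved(double before_ns, double after_ns)
{
	return (before_ns - after_ns) * 100.0 / before_ns;
}

/* Percentage increase in events per second for the same change. */
static double throughput_gain_percent(double before_ns, double after_ns)
{
	return (before_ns / after_ns - 1.0) * 100.0;
}
```

With the numbers from the message, `percent_saved(210.0, 179.0)` is roughly 14.8 and `throughput_gain_percent(210.0, 179.0)` is roughly 17.3.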
    • ring-buffer: Add multiple iterations between benchmark timestamps · a6f0eb6a
      Authored by Steven Rostedt
      The ring_buffer_benchmark does a gettimeofday after every write to the
      ring buffer in its measurements. This adds the overhead of the call
      to gettimeofday to the measurements and does not give an accurate picture
      of the length of time it takes to record a trace.
      
      This was first noticed with perf top:
      
      ------------------------------------------------------------------------------
         PerfTop:     679 irqs/sec  kernel:99.9% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   1673.00 - 27.8% : trace_clock_local
                    806.00 - 13.4% : do_gettimeofday
                    590.00 -  9.8% : rb_reserve_next_event
                    554.00 -  9.2% : native_read_tsc
                    431.00 -  7.2% : ring_buffer_lock_reserve
                    365.00 -  6.1% : __rb_reserve_next
                    355.00 -  5.9% : rb_end_commit
                    322.00 -  5.4% : getnstimeofday
                    268.00 -  4.5% : ring_buffer_unlock_commit
                    262.00 -  4.4% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    113.00 -  1.9% : read_tsc
                     91.00 -  1.5% : debug_smp_processor_id
                     69.00 -  1.1% : trace_recursive_unlock
                     66.00 -  1.1% : ring_buffer_event_data
                     25.00 -  0.4% : _spin_unlock_irq
      
      And the length of each write to the ring buffer measured at 310ns.
      
      This patch adds a new module parameter called "write_interval" which is
      defaulted to 50. This is the number of writes performed between
      timestamps. After this patch perf top shows:
      
      ------------------------------------------------------------------------------
         PerfTop:     244 irqs/sec  kernel:100.0% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   2842.00 - 40.4% : trace_clock_local
                   1043.00 - 14.8% : rb_reserve_next_event
                    784.00 - 11.1% : ring_buffer_lock_reserve
                    600.00 -  8.5% : __rb_reserve_next
                    579.00 -  8.2% : rb_end_commit
                    440.00 -  6.3% : ring_buffer_unlock_commit
                    290.00 -  4.1% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    155.00 -  2.2% : debug_smp_processor_id
                    117.00 -  1.7% : trace_recursive_unlock
                    103.00 -  1.5% : ring_buffer_event_data
                     28.00 -  0.4% : do_gettimeofday
                     22.00 -  0.3% : _spin_unlock_irq
                     14.00 -  0.2% : native_read_tsc
                     11.00 -  0.2% : getnstimeofday
      
      do_gettimeofday dropped from 13% usage to a mere 0.4%! (using the default
      50 interval)  The measurement for each timestamp went from 310ns to 210ns.
      That's 100ns (1/3rd) overhead that the gettimeofday call was introducing.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
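The idea behind "write_interval" (take timestamps around a batch of writes rather than around each one) can be sketched in userspace C. This is an illustration under stated assumptions, not the ring_buffer_benchmark code: `fake_write`, `run_bench` and `struct bench_result` are hypothetical names, `clock_gettime()` stands in for gettimeofday, and the sketch reads the clock twice per batch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds (userspace stand-in for the benchmark's clock). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

static volatile uint64_t sink;	/* keeps the compiler from dropping the loop */

/* Hypothetical stand-in for one ring buffer write. */
static void fake_write(void)
{
	sink++;
}

struct bench_result {
	uint64_t writes;	/* writes actually performed */
	uint64_t clock_reads;	/* how many times the clock was sampled */
	uint64_t total_ns;	/* time accumulated between timestamp pairs */
};

/* Time total_writes writes, sampling the clock once before and after each batch. */
static struct bench_result run_bench(uint64_t total_writes, uint64_t write_interval)
{
	struct bench_result r = { 0, 0, 0 };

	if (write_interval == 0)
		write_interval = 1;	/* avoid an endless loop on a zero interval */

	while (r.writes < total_writes) {
		uint64_t start = now_ns();
		uint64_t i;

		for (i = 0; i < write_interval && r.writes < total_writes; i++) {
			fake_write();
			r.writes++;
		}
		r.total_ns += now_ns() - start;
		r.clock_reads += 2;
	}
	return r;
}
```

For 1000 writes, `run_bench(1000, 1)` samples the clock 2000 times, while `run_bench(1000, 50)` samples it only 40 times, so far less of the measured time is spent in the clock itself. Dividing `total_ns` by `writes` gives the average cost per write.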
  3. 03 Nov, 2009 1 commit
  4. 02 Nov, 2009 2 commits
    • tracing: Fix to use __always_unused attribute · 5e9b3972
      Authored by Li Zefan
      ____ftrace_check_##name() is used for compile-time check on
      F_printk() only, so it should be marked as __unused instead
      of __used.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4AEE2D01.4010305@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • compiler: Introduce __always_unused · 7b2a3513
      Authored by Li Zefan
      I wrote some code which is used as compile-time checker, and the
      code should be elided after compile.
      
      So I need to annotate the code as "always unused", compared to
      "maybe unused".
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4AEE2CEC.8040206@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
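The two commits above describe code that exists only so the compiler can check it and that is never meant to be called. A rough userspace sketch of that pattern follows. It assumes a GCC-compatible compiler and that the attribute amounts to `__attribute__((unused))`; `my_always_unused`, `struct sample_event`, `check_sample_fmt` and `format_sample` are hypothetical names for illustration, not the kernel's own.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: stand-in for __always_unused on GCC-compatible compilers. */
#define my_always_unused __attribute__((unused))

#define SAMPLE_FMT "cpu=%d ip=%lx"

struct sample_event {
	int cpu;
	unsigned long ip;
};

/*
 * Never called. It exists only so the compiler checks SAMPLE_FMT against
 * the field types; the attribute silences the unused-function warning.
 */
static void my_always_unused check_sample_fmt(const struct sample_event *e)
{
	printf(SAMPLE_FMT "\n", e->cpu, e->ip);
}

/* The real user of the format string. */
static int format_sample(char *buf, size_t len, const struct sample_event *e)
{
	return snprintf(buf, len, SAMPLE_FMT, e->cpu, e->ip);
}
```

If a field type and its format specifier ever drift apart, a build with format warnings enabled should flag the checker function, even though nothing calls it and the compiler is free to drop it from the final binary.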
  5. 30 Oct, 2009 8 commits
  6. 29 Oct, 2009 1 commit
  7. 24 Oct, 2009 4 commits
  8. 23 Oct, 2009 2 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus · 964fe080
      Authored by Linus Torvalds
      * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
        move virtrng_remove to .devexit.text
        move virtballoon_remove to .devexit.text
        virtio_blk: Revert serial number support
        virtio: let header files include virtio_ids.h
        virtio_blk: revert QUEUE_FLAG_VIRT addition
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · 4848490c
      Authored by Linus Torvalds
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
        niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied to the head buffer in the Vlan packets case
        KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST
        KS8851: Fix MAC address write order
        KS8851: Add soft reset at probe time
        net: fix section mismatch in fec.c
        net: Fix struct inet_timewait_sock bitfield annotation
        tcp: Try to catch MSG_PEEK bug
        net: Fix IP_MULTICAST_IF
        bluetooth: static lock key fix
        bluetooth: scheduling while atomic bug fix
        tcp: fix TCP_DEFER_ACCEPT retrans calculation
        tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT
        tcp: accept socket after TCP_DEFER_ACCEPT period
        Revert "tcp: fix tcp_defer_accept to consider the timeout"
        AF_UNIX: Fix deadlock on connecting to shutdown socket
        ethoc: clear only pending irqs
        ethoc: inline regs access
        vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
        virtio_net: use dev_kfree_skb_any() in free_old_xmit_skbs()
        be2net: fix support for PCI hot plug
        ...
  9. 22 Oct, 2009 16 commits
  10. 21 Oct, 2009 3 commits