1. 26 Nov 2009, 2 commits
    • tracing: Separate raw syscall from syscall tracer · b8007ef7
      Authored by Lai Jiangshan
      The current syscall tracer mixes raw syscalls and real syscalls.
      
      echo 1 > events/syscalls/enable
      and we get output like the following:
      
      (XXXX stands in for "            grep-20914 [001] 588211.446347" etc.)
      
      XXXX: sys_read(fd: 3, buf: 80609a8, count: 7000)
      XXXX: sys_enter: NR 3 (3, 80609a8, 7000, a, 1000, bfce8ef8)
      XXXX: sys_read -> 0x138
      XXXX: sys_exit: NR 3 = 312
      XXXX: sys_read(fd: 3, buf: 8060ae0, count: 7000)
      XXXX: sys_enter: NR 3 (3, 8060ae0, 7000, a, 1000, bfce8ef8)
      XXXX: sys_read -> 0x138
      XXXX: sys_exit: NR 3 = 312
      
      There are two drawbacks here:
      A) Two almost identical records are saved in the ring buffer
         when a syscall enters or exits (4 records for every syscall).
         This wastes precious space in the ring buffer.
      B) The lines including "sys_enter/sys_exit" produce hardly any
         useful information in the output (no labels).
      
      The user can avoid these drawbacks with:
      echo 1 > events/syscalls/enable
      echo 0 > events/syscalls/sys_enter/enable
      echo 0 > events/syscalls/sys_exit/enable
      
      But this is not user friendly, so we separate the raw syscall
      events from the syscall tracer.
      
      With this fix applied, the syscall tracer's output
      (echo 1 > events/syscalls/enable) is:
      
      XXXX: sys_read(fd: 3, buf: bfe87d88, count: 200)
      XXXX: sys_read -> 0x200
      XXXX: sys_fstat64(fd: 3, statbuf: bfe87c98)
      XXXX: sys_fstat64 -> 0x0
      XXXX: sys_close(fd: 3)
      
      The raw syscall tracer's output (echo 1 > events/raw_syscalls/enable) is:
      
      XXXX: sys_enter: NR 175 (0, bf92bf18, bf92bf98, 8, b748cff4, bf92bef8)
      XXXX: sys_exit: NR 175 = 0
      XXXX: sys_enter: NR 175 (2, bf92bf98, 0, 8, b748cff4, bf92bef8)
      XXXX: sys_exit: NR 175 = 0
      XXXX: sys_enter: NR 3 (9, bf927f9c, 4000, b77e2518, b77dce60, bf92bff8)
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4AEFC37C.5080609@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer-benchmark: Add parameters to set produce/consumer priorities · 7ac07434
      Authored by Steven Rostedt
      Running the ring-buffer-benchmark's threads at the lowest priority may
      work well for keeping it in the background, but it is not appropriate
      for the benchmarks.
      
      This patch adds 4 parameters to the module:
      
        consumer_fifo
        consumer_nice
        producer_fifo
        producer_nice
      
      By default the consumer and producer still run at nice +19.
      
      If the *_fifo options are set, they will override the *_nice values.
      
       modprobe ring_buffer_benchmark consumer_nice=0 producer_fifo=10
      
      The above will set the consumer thread to a nice value of 0, and
      the producer thread to a RT SCHED_FIFO priority of 10.
      
      Note, this patch also fixes a bug where calling set_user_nice on the
      consumer thread would oops the kernel when the parameter "disable_reader"
      is set.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 23 Nov 2009, 2 commits
    • tracing, function tracer: Clean up strstrip() usage · 457dc928
      Authored by Ingo Molnar
      Clean up strstrip() usage - which also addresses this build warning:
      
        kernel/trace/ftrace.c: In function 'ftrace_pid_write':
        kernel/trace/ftrace.c:3004: warning: ignoring return value of 'strstrip', declared with attribute warn_unused_result
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer benchmark: Run producer/consumer threads at nice +19 · 98e4833b
      Authored by Ingo Molnar
      The ring-buffer benchmark threads run on nice 0 by default, using
      up a lot of CPU time and slowing down the system:
      
         PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
        1024 root      20   0     0    0    0 D 95.3  0.0   4:01.67 rb_producer
        1023 root      20   0     0    0    0 R 93.5  0.0   2:54.33 rb_consumer
       21569 mingo     40   0 14852 1048  772 R  3.6  0.1   0:00.05 top
           1 root      40   0  4080  928  668 S  0.0  0.0   0:23.98 init
      
      Renice them to +19 to make them less intrusive.
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 19 Nov 2009, 1 commit
  4. 18 Nov 2009, 2 commits
  5. 17 Nov 2009, 1 commit
    • ring-buffer: Move access to commit_page up into function used · 5a50e33c
      Authored by Steven Rostedt
      With the change to the way commits are processed, where a commit only
      happens at the outermost level and we no longer need to worry about a
      commit ending after rb_start_commit() has been called, the code that
      grabbed the commit page before the tail page to prevent a possible
      race is no longer needed. That race no longer exists with the
      rb_start_commit()/rb_end_commit() interface.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  6. 12 Nov 2009, 2 commits
    • tracing: do not disable interrupts for trace_clock_local · 8b2a5dac
      Authored by Steven Rostedt
      Disabling interrupts in trace_clock_local takes quite a performance
      hit to the recording of traces. Using perf top we see:
      
      ------------------------------------------------------------------------------
         PerfTop:     244 irqs/sec  kernel:100.0% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   2842.00 - 40.4% : trace_clock_local
                   1043.00 - 14.8% : rb_reserve_next_event
                    784.00 - 11.1% : ring_buffer_lock_reserve
                    600.00 -  8.5% : __rb_reserve_next
                    579.00 -  8.2% : rb_end_commit
                    440.00 -  6.3% : ring_buffer_unlock_commit
                    290.00 -  4.1% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    155.00 -  2.2% : debug_smp_processor_id
                    117.00 -  1.7% : trace_recursive_unlock
                    103.00 -  1.5% : ring_buffer_event_data
                     28.00 -  0.4% : do_gettimeofday
                     22.00 -  0.3% : _spin_unlock_irq
                     14.00 -  0.2% : native_read_tsc
                     11.00 -  0.2% : getnstimeofday
      
      Here trace_clock_local accounts for 40% of the tracing time, and the
      time for recording a trace according to ring_buffer_benchmark is 210ns.
      After converting the interrupt disabling to preemption disabling, perf
      top shows:
      
      ------------------------------------------------------------------------------
         PerfTop:    1084 irqs/sec  kernel:99.9% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   1277.00 - 16.8% : native_read_tsc
                   1148.00 - 15.1% : rb_reserve_next_event
                    896.00 - 11.8% : ring_buffer_lock_reserve
                    688.00 -  9.1% : __rb_reserve_next
                    664.00 -  8.8% : rb_end_commit
                    563.00 -  7.4% : ring_buffer_unlock_commit
                    508.00 -  6.7% : _spin_unlock_irq
                    365.00 -  4.8% : debug_smp_processor_id
                    321.00 -  4.2% : trace_clock_local
                    303.00 -  4.0% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    273.00 -  3.6% : native_sched_clock
                    122.00 -  1.6% : trace_recursive_unlock
                    113.00 -  1.5% : sched_clock
                    101.00 -  1.3% : ring_buffer_event_data
                     53.00 -  0.7% : tick_nohz_stop_sched_tick
      
      Here trace_clock_local drops from 40% to only 4% of the total time,
      and the trace time goes from 210ns down to 179ns (a 31ns improvement).
      
      I talked with Peter Zijlstra about the impact that sched_clock may have
      without having interrupts disabled, and he told me that if a timer interrupt
      comes in, sched_clock may report a wrong time.
      
      Balancing a seldom incorrect timestamp with a 15% performance boost, I'll
      take the performance boost.
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: Add multiple iterations between benchmark timestamps · a6f0eb6a
      Authored by Steven Rostedt
      The ring_buffer_benchmark does a gettimeofday after every write to the
      ring buffer in its measurements. This adds the overhead of the call
      to gettimeofday to the measurements and does not give an accurate picture
      of the length of time it takes to record a trace.
      
      This was first noticed with perf top:
      
      ------------------------------------------------------------------------------
         PerfTop:     679 irqs/sec  kernel:99.9% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   1673.00 - 27.8% : trace_clock_local
                    806.00 - 13.4% : do_gettimeofday
                    590.00 -  9.8% : rb_reserve_next_event
                    554.00 -  9.2% : native_read_tsc
                    431.00 -  7.2% : ring_buffer_lock_reserve
                    365.00 -  6.1% : __rb_reserve_next
                    355.00 -  5.9% : rb_end_commit
                    322.00 -  5.4% : getnstimeofday
                    268.00 -  4.5% : ring_buffer_unlock_commit
                    262.00 -  4.4% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    113.00 -  1.9% : read_tsc
                     91.00 -  1.5% : debug_smp_processor_id
                     69.00 -  1.1% : trace_recursive_unlock
                     66.00 -  1.1% : ring_buffer_event_data
                     25.00 -  0.4% : _spin_unlock_irq
      
      And the length of each write to the ring buffer measured at 310ns.
      
      This patch adds a new module parameter called "write_interval" which is
      defaulted to 50. This is the number of writes performed between
      timestamps. After this patch perf top shows:
      
      ------------------------------------------------------------------------------
         PerfTop:     244 irqs/sec  kernel:100.0% [1000Hz cpu-clock-msecs],  (all, 4 CPUs)
      ------------------------------------------------------------------------------
      
                   samples    pcnt   kernel function
                   _______   _____   _______________
      
                   2842.00 - 40.4% : trace_clock_local
                   1043.00 - 14.8% : rb_reserve_next_event
                    784.00 - 11.1% : ring_buffer_lock_reserve
                    600.00 -  8.5% : __rb_reserve_next
                    579.00 -  8.2% : rb_end_commit
                    440.00 -  6.3% : ring_buffer_unlock_commit
                    290.00 -  4.1% : ring_buffer_producer_thread 	[ring_buffer_benchmark]
                    155.00 -  2.2% : debug_smp_processor_id
                    117.00 -  1.7% : trace_recursive_unlock
                    103.00 -  1.5% : ring_buffer_event_data
                     28.00 -  0.4% : do_gettimeofday
                     22.00 -  0.3% : _spin_unlock_irq
                     14.00 -  0.2% : native_read_tsc
                     11.00 -  0.2% : getnstimeofday
      
      do_gettimeofday dropped from 13% usage to a mere 0.4%! (using the default
      50 interval)  The measurement for each timestamp went from 310ns to 210ns.
      That's 100ns (1/3rd) overhead that the gettimeofday call was introducing.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  7. 03 Nov 2009, 1 commit
  8. 02 Nov 2009, 2 commits
    • tracing: Fix to use __always_unused attribute · 5e9b3972
      Authored by Li Zefan
      ____ftrace_check_##name() is used for a compile-time check on
      F_printk() only, so it should be marked as __unused instead
      of __used.
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4AEE2D01.4010305@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • compiler: Introduce __always_unused · 7b2a3513
      Authored by Li Zefan
      I wrote some code that is used as a compile-time checker, and
      the code should be elided after compilation.
      
      So I need to annotate the code as "always unused", as opposed
      to "maybe unused".
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <4AEE2CEC.8040206@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 30 Oct 2009, 8 commits
  10. 29 Oct 2009, 1 commit
  11. 24 Oct 2009, 4 commits
  12. 23 Oct 2009, 2 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus · 964fe080
      Authored by Linus Torvalds
      * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
        move virtrng_remove to .devexit.text
        move virtballoon_remove to .devexit.text
        virtio_blk: Revert serial number support
        virtio: let header files include virtio_ids.h
        virtio_blk: revert QUEUE_FLAG_VIRT addition
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · 4848490c
      Authored by Linus Torvalds
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
        niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied to the head buffer in the Vlan packets case
        KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST
        KS8851: Fix MAC address write order
        KS8851: Add soft reset at probe time
        net: fix section mismatch in fec.c
        net: Fix struct inet_timewait_sock bitfield annotation
        tcp: Try to catch MSG_PEEK bug
        net: Fix IP_MULTICAST_IF
        bluetooth: static lock key fix
        bluetooth: scheduling while atomic bug fix
        tcp: fix TCP_DEFER_ACCEPT retrans calculation
        tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT
        tcp: accept socket after TCP_DEFER_ACCEPT period
        Revert "tcp: fix tcp_defer_accept to consider the timeout"
        AF_UNIX: Fix deadlock on connecting to shutdown socket
        ethoc: clear only pending irqs
        ethoc: inline regs access
        vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
        virtio_net: use dev_kfree_skb_any() in free_old_xmit_skbs()
        be2net: fix support for PCI hot plug
        ...
  13. 22 Oct 2009, 12 commits