1. 25 3月, 2009 1 次提交
    • J
      dynamic debug: combine dprintk and dynamic printk · e9d376f0
      Jason Baron 提交于
      This patch combines Greg Bank's dprintk() work with the existing dynamic
      printk patchset, we are now calling it 'dynamic debug'.
      
      The new feature of this patchset is a richer /debugfs control file interface,
      (an example output from my system is at the bottom), which allows fined grained
      control over the the debug output. The output can be controlled by function,
      file, module, format string, and line number.
      
      for example, enabled all debug messages in module 'nf_conntrack':
      
      echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control
      
      to disable them:
      
      echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control
      
      A further explanation can be found in the documentation patch.
      Signed-off-by: NGreg Banks <gnb@sgi.com>
      Signed-off-by: NJason Baron <jbaron@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
      e9d376f0
  2. 16 1月, 2009 1 次提交
    • I
      alpha: fix RTC on marvel · 5f7dc5d7
      Ivan Kokshaysky 提交于
      Unlike other alphas, marvel doesn't have real PC-style CMOS clock hardware
      - RTC accesses are emulated via PAL calls.  Unfortunately, for unknown
      reason these calls work only on CPU #0.  So current implementation for
      arbitrary CPU makes CMOS_READ/WRITE to be executed on CPU #0 via IPI.
      However, for obvious reason this doesn't work with standard
      get/set_rtc_time() functions, where a bunch of CMOS accesses is done with
      disabled interrupts.
      
      Solved by making the IPI calls for entire get/set_rtc_time() functions,
      not for individual CMOS accesses.  Which is also a lot more effective
      performance-wise.
      
      The patch is largely based on the code from Jay Estabrook.
      My changes:
      - tweak asm-generic/rtc.h by adding a couple of #defines to
        avoid a massive code duplication in arch/alpha/include/asm/rtc.h;
      - sys_marvel.c: fix get/set_rtc_time() return values (Jay's FIXMEs).
      
      NOTE: this fixes *only* LIB_RTC drivers.  Legacy (CONFIG_RTC) driver
      wont't work on marvel.  Actually I think that we should just disable
      CONFIG_RTC on alpha (maybe in 2.6.30?), like most other arches - AFAIK,
      all modern distributions use LIB_RTC anyway.
      Signed-off-by: NIvan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Richard Henderson <rth@twiddle.net>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5f7dc5d7
  3. 15 1月, 2009 1 次提交
  4. 14 1月, 2009 2 次提交
  5. 07 1月, 2009 3 次提交
  6. 25 12月, 2008 1 次提交
    • M
      [S390] __page_to_pfn warnings · 32272a26
      Martin Schwidefsky 提交于
      For CONFIG_SPARSEMEM_VMEMMAP=y on s390 I get warnings like
      
      init/main.c: In function 'start_kernel':
      init/main.c:641: warning: format '%08lx' expects type 'long unsigned int', but argument 2 has type 'int'
      
      The warning can be suppressed with a cast to unsigned long in the
      CONFIG_SPARSEMEM_VMEMMAP=y version of __page_to_pfn.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      32272a26
  7. 20 12月, 2008 1 次提交
  8. 19 12月, 2008 1 次提交
  9. 17 12月, 2008 1 次提交
  10. 13 12月, 2008 1 次提交
  11. 12 12月, 2008 1 次提交
  12. 11 12月, 2008 1 次提交
  13. 09 12月, 2008 1 次提交
  14. 29 11月, 2008 1 次提交
  15. 23 11月, 2008 2 次提交
    • S
      trace: profile all if conditionals · 2bcd521a
      Steven Rostedt 提交于
      Impact: feature to profile if statements
      
      This patch adds a branch profiler for all if () statements.
      The results will be found in:
      
        /debugfs/tracing/profile_branch
      
      For example:
      
         miss      hit    %        Function                  File              Line
       ------- ---------  -        --------                  ----              ----
             0        1 100 x86_64_start_reservations      head64.c             127
             0        1 100 copy_bootdata                  head64.c             69
             1        0   0 x86_64_start_kernel            head64.c             111
            32        0   0 set_intr_gate                  desc.h               319
             1        0   0 reserve_ebda_region            head.c               51
             1        0   0 reserve_ebda_region            head.c               47
             0        1 100 reserve_ebda_region            head.c               42
             0        0   X maxcpus                        main.c               165
      
      Miss means the branch was not taken. Hit means the branch was taken.
      The percent is the percentage the branch was taken.
      
      This adds a significant amount of overhead and should only be used
      by those analyzing their system.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      2bcd521a
    • S
      trace: consolidate unlikely and likely profiler · 45b79749
      Steven Rostedt 提交于
      Impact: clean up to make one profiler of like and unlikely tracer
      
      The likely and unlikely profiler prints out the file and line numbers
      of the annotated branches that it is profiling. It shows the number
      of times it was correct or incorrect in its guess. Having two
      different files or sections for that matter to tell us if it was a
      likely or unlikely is pretty pointless. We really only care if
      it was correct or not.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      45b79749
  16. 16 11月, 2008 1 次提交
    • M
      tracepoints: add DECLARE_TRACE() and DEFINE_TRACE() · 7e066fb8
      Mathieu Desnoyers 提交于
      Impact: API *CHANGE*. Must update all tracepoint users.
      
      Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint
      structure in a single spot for all the kernel. It helps reducing memory
      consumption, especially when declaring a lot of tracepoints, e.g. for
      kmalloc tracing.
      
      *API CHANGE WARNING*: now, DECLARE_TRACE() must be used in headers for
      tracepoint declarations rather than DEFINE_TRACE(). This is the sane way
      to do it. The name previously used was misleading.
      
      Updates scheduler instrumentation to follow this API change.
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7e066fb8
  17. 13 11月, 2008 1 次提交
  18. 12 11月, 2008 1 次提交
    • S
      tracing: profile likely and unlikely annotations · 1f0d69a9
      Steven Rostedt 提交于
      Impact: new unlikely/likely profiler
      
      Andrew Morton recently suggested having an in-kernel way to profile
      likely and unlikely macros. This patch achieves that goal.
      
      When configured, every(*) likely and unlikely macro gets a counter attached
      to it. When the condition is hit, the hit and misses of that condition
      are recorded. These numbers can later be retrieved by:
      
        /debugfs/tracing/profile_likely    - All likely markers
        /debugfs/tracing/profile_unlikely  - All unlikely markers.
      
      # cat /debug/tracing/profile_unlikely | head
       correct incorrect  %        Function                  File              Line
       ------- ---------  -        --------                  ----              ----
          2167        0   0 do_arch_prctl                  process_64.c         832
             0        0   0 do_arch_prctl                  process_64.c         804
          2670        0   0 IS_ERR                         err.h                34
         71230     5693   7 __switch_to                    process_64.c         673
         76919        0   0 __switch_to                    process_64.c         639
         43184    33743  43 __switch_to                    process_64.c         624
         12740    64181  83 __switch_to                    process_64.c         594
         12740    64174  83 __switch_to                    process_64.c         590
      
      # cat /debug/tracing/profile_unlikely | \
        awk '{ if ($3 > 25) print $0; }' |head -20
         44963    35259  43 __switch_to                    process_64.c         624
         12762    67454  84 __switch_to                    process_64.c         594
         12762    67447  84 __switch_to                    process_64.c         590
          1478      595  28 syscall_get_error              syscall.h            51
             0     2821 100 syscall_trace_leave            ptrace.c             1567
             0        1 100 native_smp_prepare_cpus        smpboot.c            1237
         86338   265881  75 calc_delta_fair                sched_fair.c         408
        210410   108540  34 calc_delta_mine                sched.c              1267
             0    54550 100 sched_info_queued              sched_stats.h        222
         51899    66435  56 pick_next_task_fair            sched_fair.c         1422
             6       10  62 yield_task_fair                sched_fair.c         982
          7325     2692  26 rt_policy                      sched.c              144
             0     1270 100 pre_schedule_rt                sched_rt.c           1261
          1268    48073  97 pick_next_task_rt              sched_rt.c           884
             0    45181 100 sched_info_dequeued            sched_stats.h        177
             0       15 100 sched_move_task                sched.c              8700
             0       15 100 sched_move_task                sched.c              8690
         53167    33217  38 schedule                       sched.c              4457
             0    80208 100 sched_info_switch              sched_stats.h        270
         30585    49631  61 context_switch                 sched.c              2619
      
      # cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }'
         39900    36577  47 pick_next_task                 sched.c              4397
         20824    15233  42 switch_mm                      mmu_context_64.h     18
             0        7 100 __cancel_work_timer            workqueue.c          560
           617    66484  99 clocksource_adjust             timekeeping.c        456
             0   346340 100 audit_syscall_exit             auditsc.c            1570
            38   347350  99 audit_get_context              auditsc.c            732
             0   345244 100 audit_syscall_entry            auditsc.c            1541
            38     1017  96 audit_free                     auditsc.c            1446
             0     1090 100 audit_alloc                    auditsc.c            862
          2618     1090  29 audit_alloc                    auditsc.c            858
             0        6 100 move_masked_irq                migration.c          9
             1      198  99 probe_sched_wakeup             trace_sched_switch.c 58
             2        2  50 probe_wakeup                   trace_sched_wakeup.c 227
             0        2 100 probe_wakeup_sched_switch      trace_sched_wakeup.c 144
          4514     2090  31 __grab_cache_page              filemap.c            2149
         12882   228786  94 mapping_unevictable            pagemap.h            50
             4       11  73 __flush_cpu_slab               slub.c               1466
        627757   330451  34 slab_free                      slub.c               1731
          2959    61245  95 dentry_lru_del_init            dcache.c             153
           946     1217  56 load_elf_binary                binfmt_elf.c         904
           102       82  44 disk_put_part                  genhd.h              206
             1        1  50 dst_gc_task                    dst.c                82
             0       19 100 tcp_mss_split_point            tcp_output.c         1126
      
      As you can see by the above, there's a bit of work to do in rethinking
      the use of some unlikelys and likelys. Note: the unlikely case had 71 hits
      that were more than 25%.
      
      Note:  After submitting my first version of this patch, Andrew Morton
        showed me a version written by Daniel Walker, where I picked up
        the following ideas from:
      
        1)  Using __builtin_constant_p to avoid profiling fixed values.
        2)  Using __FILE__ instead of instruction pointers.
        3)  Using the preprocessor to stop all profiling of likely
             annotations from vsyscall_64.c.
      
      Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar
      for their feed back on this patch.
      
      (*) Not ever unlikely is recorded, those that are used by vsyscalls
       (a few of them) had to have profiling disabled.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Theodore Tso <tytso@mit.edu>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1f0d69a9
  19. 09 11月, 2008 1 次提交
  20. 27 10月, 2008 1 次提交
  21. 24 10月, 2008 1 次提交
    • N
      mutex: speed up generic mutex implementations · a8ddac7e
      Nick Piggin 提交于
      - atomic operations which both modify the variable and return something imply
        full smp memory barriers before and after the memory operations involved
        (failing atomic_cmpxchg, atomic_add_unless, etc don't imply a barrier because
        they don't modify the target). See Documentation/atomic_ops.txt.
        So remove extra barriers and branches.
      
      - All architectures support atomic_cmpxchg. This has no relation to
        __HAVE_ARCH_CMPXCHG. We can just take the atomic_cmpxchg path unconditionally
      
      This reduces a simple single threaded fastpath lock+unlock test from 590 cycles
      to 203 cycles on a ppc970 system.
      Signed-off-by: NNick Piggin <npiggin@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a8ddac7e
  22. 21 10月, 2008 1 次提交
  23. 20 10月, 2008 1 次提交
  24. 17 10月, 2008 5 次提交
  25. 16 10月, 2008 3 次提交
    • T
      genirq: revert dynarray · d6c88a50
      Thomas Gleixner 提交于
      Revert the dynarray changes. They need more thought and polishing.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      d6c88a50
    • Y
      add per_cpu_dyn_array support · 1f3fcd4b
      Yinghai Lu 提交于
      allow dyn-array in per_cpu area, allocated dynamically.
      
      usage:
      
      |  /* in .h */
      | struct kernel_stat {
      |        struct cpu_usage_stat   cpustat;
      |        unsigned int *irqs;
      | };
      |
      |  /* in .c */
      | DEFINE_PER_CPU(struct kernel_stat, kstat);
      |
      | DEFINE_PER_CPU_DYN_ARRAY_ADDR(per_cpu__kstat_irqs, per_cpu__kstat.irqs, sizeof(unsigned int), nr_irqs, sizeof(unsigned long), NULL);
      
      after setup_percpu()/per_cpu_alloc_dyn_array(), the dyn_array in
      per_cpu area is ready to use.
      Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1f3fcd4b
    • Y
      generic: add dyn_array support · 3ddfda11
      Yinghai Lu 提交于
      Allow crazy big arrays via bootmem at init stage.
      Architectures use CONFIG_HAVE_DYN_ARRAY to enable it.
      
      usage:
      
      | static struct irq_desc irq_desc_init __initdata = {
      |        .status = IRQ_DISABLED,
      |        .chip = &no_irq_chip,
      |        .handle_irq = handle_bad_irq,
      |        .depth = 1,
      |        .lock = __SPIN_LOCK_UNLOCKED(irq_desc->lock),
      | #ifdef CONFIG_SMP
      |        .affinity = CPU_MASK_ALL
      | #endif
      | };
      |
      | static void __init init_work(void *data)
      | {
      |        struct dyn_array *da = data;
      |        struct  irq_desc *desc;
      |        int i;
      |
      |        desc = *da->name;
      |
      |        for (i = 0; i < *da->nr; i++)
      |                memcpy(&desc[i], &irq_desc_init, sizeof(struct irq_desc));
      | }
      |
      | struct irq_desc *irq_desc;
      | DEFINE_DYN_ARRAY(irq_desc, sizeof(struct irq_desc), nr_irqs, PAGE_SIZE, init_work);
      
      after pre_alloc_dyn_array() after setup_arch(), the array is ready to be
      used.
      
      Via this facility we can replace irq_desc[NR_IRQS] array with dyn_array
      irq_desc[nr_irqs].
      
      v2: remove _nopanic in pre_alloc_dyn_array()
      Signed-off-by: NYinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      3ddfda11
  26. 14 10月, 2008 2 次提交
    • S
      ftrace: create __mcount_loc section · 8da3821b
      Steven Rostedt 提交于
      This patch creates a section in the kernel called "__mcount_loc".
      This will hold a list of pointers to the mcount relocation for
      each call site of mcount.
      
      For example:
      
      objdump -dr init/main.o
      [...]
      Disassembly of section .text:
      
      0000000000000000 <do_one_initcall>:
         0:   55                      push   %rbp
      [...]
      000000000000017b <init_post>:
       17b:   55                      push   %rbp
       17c:   48 89 e5                mov    %rsp,%rbp
       17f:   53                      push   %rbx
       180:   48 83 ec 08             sub    $0x8,%rsp
       184:   e8 00 00 00 00          callq  189 <init_post+0xe>
                              185: R_X86_64_PC32      mcount+0xfffffffffffffffc
      [...]
      
      We will add a section to point to each function call.
      
         .section __mcount_loc,"a",@progbits
      [...]
         .quad .text + 0x185
      [...]
      
      The offset to of the mcount call site in init_post is an offset from
      the start of the section, and not the start of the function init_post.
      The mcount relocation is at the call site 0x185 from the start of the
      .text section.
      
        .text + 0x185  == init_post + 0xa
      
      We need a way to add this __mcount_loc section in a way that we do not
      lose the relocations after final link.  The .text section here will
      be attached to all other .text sections after final link and the
      offsets will be meaningless.  We need to keep track of where these
      .text sections are.
      
      To do this, we use the start of the first function in the section.
      do_one_initcall.  We can make a tmp.s file with this function as a reference
      to the start of the .text section.
      
         .section __mcount_loc,"a",@progbits
      [...]
         .quad do_one_initcall + 0x185
      [...]
      
      Then we can compile the tmp.s into a tmp.o
      
        gcc -c tmp.s -o tmp.o
      
      And link it into back into main.o.
      
        ld -r main.o tmp.o -o tmp_main.o
        mv tmp_main.o main.o
      
      But we have a problem.  What happens if the first function in a section
      is not exported, and is a static function. The linker will not let
      the tmp.o use it.  This case exists in main.o as well.
      
      Disassembly of section .init.text:
      
      0000000000000000 <set_reset_devices>:
         0:   55                      push   %rbp
         1:   48 89 e5                mov    %rsp,%rbp
         4:   e8 00 00 00 00          callq  9 <set_reset_devices+0x9>
                              5: R_X86_64_PC32        mcount+0xfffffffffffffffc
      
      The first function in .init.text is a static function.
      
      00000000000000a8 t __setup_set_reset_devices
      000000000000105f t __setup_str_set_reset_devices
      0000000000000000 t set_reset_devices
      
      The lowercase 't' means that set_reset_devices is local and is not exported.
      If we simply try to link the tmp.o with the set_reset_devices we end
      up with two symbols: one local and one global.
      
       .section __mcount_loc,"a",@progbits
       .quad set_reset_devices + 0x10
      
      00000000000000a8 t __setup_set_reset_devices
      000000000000105f t __setup_str_set_reset_devices
      0000000000000000 t set_reset_devices
                       U set_reset_devices
      
      We still have an undefined reference to set_reset_devices, and if we try
      to compile the kernel, we will end up with an undefined reference to
      set_reset_devices, or even worst, it could be exported someplace else,
      and then we will have a reference to the wrong location.
      
      To handle this case, we make an intermediate step using objcopy.
      We convert set_reset_devices into a global exported symbol before linking
      it with tmp.o and set it back afterwards.
      
      00000000000000a8 t __setup_set_reset_devices
      000000000000105f t __setup_str_set_reset_devices
      0000000000000000 T set_reset_devices
      
      00000000000000a8 t __setup_set_reset_devices
      000000000000105f t __setup_str_set_reset_devices
      0000000000000000 T set_reset_devices
      
      00000000000000a8 t __setup_set_reset_devices
      000000000000105f t __setup_str_set_reset_devices
      0000000000000000 t set_reset_devices
      
      Now we have a section in main.o called __mcount_loc that we can place
      somewhere in the kernel using vmlinux.ld.S and access it to convert
      all these locations that call mcount into nops before starting SMP
      and thus, eliminating the need to do this with kstop_machine.
      
      Note, A well documented perl script (scripts/recordmcount.pl) is used
      to do all this in one location.
      Signed-off-by: NSteven Rostedt <srostedt@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8da3821b
    • M
      tracing: Kernel Tracepoints · 97e1c18e
      Mathieu Desnoyers 提交于
      Implementation of kernel tracepoints. Inspired from the Linux Kernel
      Markers. Allows complete typing verification by declaring both tracing
      statement inline functions and probe registration/unregistration static
      inline functions within the same macro "DEFINE_TRACE". No format string
      is required. See the tracepoint Documentation and Samples patches for
      usage examples.
      
      Taken from the documentation patch :
      
      "A tracepoint placed in code provides a hook to call a function (probe)
      that you can provide at runtime. A tracepoint can be "on" (a probe is
      connected to it) or "off" (no probe is attached). When a tracepoint is
      "off" it has no effect, except for adding a tiny time penalty (checking
      a condition for a branch) and space penalty (adding a few bytes for the
      function call at the end of the instrumented function and adds a data
      structure in a separate section).  When a tracepoint is "on", the
      function you provide is called each time the tracepoint is executed, in
      the execution context of the caller. When the function provided ends its
      execution, it returns to the caller (continuing from the tracepoint
      site).
      
      You can put tracepoints at important locations in the code. They are
      lightweight hooks that can pass an arbitrary number of parameters, which
      prototypes are described in a tracepoint declaration placed in a header
      file."
      
      Addition and removal of tracepoints is synchronized by RCU using the
      scheduler (and preempt_disable) as guarantees to find a quiescent state
      (this is really RCU "classic"). The update side uses rcu_barrier_sched()
      with call_rcu_sched() and the read/execute side uses
      "preempt_disable()/preempt_enable()".
      
      We make sure the previous array containing probes, which has been
      scheduled for deletion by the rcu callback, is indeed freed before we
      proceed to the next update. It therefore limits the rate of modification
      of a single tracepoint to one update per RCU period. The objective here
      is to permit fast batch add/removal of probes on _different_
      tracepoints.
      
      Changelog :
      - Use #name ":" #proto as string to identify the tracepoint in the
        tracepoint table. This will make sure not type mismatch happens due to
        connexion of a probe with the wrong type to a tracepoint declared with
        the same name in a different header.
      - Add tracepoint_entry_free_old.
      - Change __TO_TRACE to get rid of the 'i' iterator.
      
      Masami Hiramatsu <mhiramat@redhat.com> :
      Tested on x86-64.
      
      Performance impact of a tracepoint : same as markers, except that it
      adds about 70 bytes of instructions in an unlikely branch of each
      instrumented function (the for loop, the stack setup and the function
      call). It currently adds a memory read, a test and a conditional branch
      at the instrumentation site (in the hot path). Immediate values will
      eventually change this into a load immediate, test and branch, which
      removes the memory read which will make the i-cache impact smaller
      (changing the memory read for a load immediate removes 3-4 bytes per
      site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it
      also saves the d-cache hit).
      
      About the performance impact of tracepoints (which is comparable to
      markers), even without immediate values optimizations, tests done by
      Hideo Aoki on ia64 show no regression. His test case was using hackbench
      on a kernel where scheduler instrumentation (about 5 events in code
      scheduler code) was added.
      
      Quoting Hideo Aoki about Markers :
      
      I evaluated overhead of kernel marker using linux-2.6-sched-fixes git
      tree, which includes several markers for LTTng, using an ia64 server.
      
      While the immediate trace mark feature isn't implemented on ia64, there
      is no major performance regression. So, I think that we don't have any
      issues to propose merging marker point patches into Linus's tree from
      the viewpoint of performance impact.
      
      I prepared two kernels to evaluate. The first one was compiled without
      CONFIG_MARKERS. The second one was enabled CONFIG_MARKERS.
      
      I downloaded the original hackbench from the following URL:
      http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c
      
      I ran hackbench 5 times in each condition and calculated the average and
      difference between the kernels.
      
          The parameter of hackbench: every 50 from 50 to 800
          The number of CPUs of the server: 2, 4, and 8
      
      Below is the results. As you can see, major performance regression
      wasn't found in any case. Even if number of processes increases,
      differences between marker-enabled kernel and marker- disabled kernel
      doesn't increase. Moreover, if number of CPUs increases, the differences
      doesn't increase either.
      
      Curiously, marker-enabled kernel is better than marker-disabled kernel
      in more than half cases, although I guess it comes from the difference
      of memory access pattern.
      
      * 2 CPUs
      
      Number of | without      | with         | diff     | diff    |
      processes | Marker [Sec] | Marker [Sec] |   [Sec]  |   [%]   |
      --------------------------------------------------------------
             50 |      4.811   |       4.872  |  +0.061  |  +1.27  |
            100 |      9.854   |      10.309  |  +0.454  |  +4.61  |
            150 |     15.602   |      15.040  |  -0.562  |  -3.6   |
            200 |     20.489   |      20.380  |  -0.109  |  -0.53  |
            250 |     25.798   |      25.652  |  -0.146  |  -0.56  |
            300 |     31.260   |      30.797  |  -0.463  |  -1.48  |
            350 |     36.121   |      35.770  |  -0.351  |  -0.97  |
            400 |     42.288   |      42.102  |  -0.186  |  -0.44  |
            450 |     47.778   |      47.253  |  -0.526  |  -1.1   |
            500 |     51.953   |      52.278  |  +0.325  |  +0.63  |
            550 |     58.401   |      57.700  |  -0.701  |  -1.2   |
            600 |     63.334   |      63.222  |  -0.112  |  -0.18  |
            650 |     68.816   |      68.511  |  -0.306  |  -0.44  |
            700 |     74.667   |      74.088  |  -0.579  |  -0.78  |
            750 |     78.612   |      79.582  |  +0.970  |  +1.23  |
            800 |     85.431   |      85.263  |  -0.168  |  -0.2   |
      --------------------------------------------------------------
      
      * 4 CPUs
      
      Number of | without      | with         | diff     | diff    |
      processes | Marker [Sec] | Marker [Sec] |   [Sec]  |   [%]   |
      --------------------------------------------------------------
             50 |      2.586   |       2.584  |  -0.003  |  -0.1   |
            100 |      5.254   |       5.283  |  +0.030  |  +0.56  |
            150 |      8.012   |       8.074  |  +0.061  |  +0.76  |
            200 |     11.172   |      11.000  |  -0.172  |  -1.54  |
            250 |     13.917   |      14.036  |  +0.119  |  +0.86  |
            300 |     16.905   |      16.543  |  -0.362  |  -2.14  |
            350 |     19.901   |      20.036  |  +0.135  |  +0.68  |
            400 |     22.908   |      23.094  |  +0.186  |  +0.81  |
            450 |     26.273   |      26.101  |  -0.172  |  -0.66  |
            500 |     29.554   |      29.092  |  -0.461  |  -1.56  |
            550 |     32.377   |      32.274  |  -0.103  |  -0.32  |
            600 |     35.855   |      35.322  |  -0.533  |  -1.49  |
            650 |     39.192   |      38.388  |  -0.804  |  -2.05  |
            700 |     41.744   |      41.719  |  -0.025  |  -0.06  |
            750 |     45.016   |      44.496  |  -0.520  |  -1.16  |
            800 |     48.212   |      47.603  |  -0.609  |  -1.26  |
      --------------------------------------------------------------
      
      * 8 CPUs
      
      Number of | without      | with         | diff     | diff    |
      processes | Marker [Sec] | Marker [Sec] |   [Sec]  |   [%]   |
      --------------------------------------------------------------
             50 |      2.094   |       2.072  |  -0.022  |  -1.07  |
            100 |      4.162   |       4.273  |  +0.111  |  +2.66  |
            150 |      6.485   |       6.540  |  +0.055  |  +0.84  |
            200 |      8.556   |       8.478  |  -0.078  |  -0.91  |
            250 |     10.458   |      10.258  |  -0.200  |  -1.91  |
            300 |     12.425   |      12.750  |  +0.325  |  +2.62  |
            350 |     14.807   |      14.839  |  +0.032  |  +0.22  |
            400 |     16.801   |      16.959  |  +0.158  |  +0.94  |
            450 |     19.478   |      19.009  |  -0.470  |  -2.41  |
            500 |     21.296   |      21.504  |  +0.208  |  +0.98  |
            550 |     23.842   |      23.979  |  +0.137  |  +0.57  |
            600 |     26.309   |      26.111  |  -0.198  |  -0.75  |
            650 |     28.705   |      28.446  |  -0.259  |  -0.9   |
            700 |     31.233   |      31.394  |  +0.161  |  +0.52  |
            750 |     34.064   |      33.720  |  -0.344  |  -1.01  |
            800 |     36.320   |      36.114  |  -0.206  |  -0.57  |
      --------------------------------------------------------------
      Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: NMasami Hiramatsu <mhiramat@redhat.com>
      Acked-by: N'Peter Zijlstra' <peterz@infradead.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      97e1c18e
  27. 23 9月, 2008 1 次提交
    • S
      signals: demultiplexing SIGTRAP signal · da654b74
      Srinivasa Ds 提交于
      Currently a SIGTRAP can denote any one of below reasons.
      	- Breakpoint hit
      	- H/W debug register hit
      	- Single step
      	- Signal sent through kill() or rasie()
      
      Architectures like powerpc/parisc provides infrastructure to demultiplex
      SIGTRAP signal by passing down the information for receiving SIGTRAP through
      si_code of siginfot_t structure. Here is an attempt is generalise this
      infrastructure by extending it to x86 and x86_64 archs.
      Signed-off-by: NSrinivasa DS <srinivasa@in.ibm.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: akpm@linux-foundation.org
      Cc: paulus@samba.org
      Cc: linuxppc-dev@ozlabs.org
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      da654b74
  28. 17 9月, 2008 1 次提交
  29. 10 9月, 2008 1 次提交