1. 23 6月, 2015 1 次提交
    • D
      bpf: BPF based latency tracing · 0fb1170e
      Daniel Wagner 提交于
      BPF offers another way to generate latency histograms. We attach
      kprobes at trace_preempt_off and trace_preempt_on and calculate the
      time it takes to from seeing the off/on transition.
      
      The first array is used to store the start time stamp. The key is the
      CPU id. The second array stores the log2(time diff). We need to use
      static allocation here (array and not hash tables). The kprobes
      hooking into trace_preempt_on|off should not calling any dynamic
      memory allocation or free path. We need to avoid recursivly
      getting called. Besides that, it reduces jitter in the measurement.
      
      CPU 0
            latency        : count     distribution
             1 -> 1        : 0        |                                        |
             2 -> 3        : 0        |                                        |
             4 -> 7        : 0        |                                        |
             8 -> 15       : 0        |                                        |
            16 -> 31       : 0        |                                        |
            32 -> 63       : 0        |                                        |
            64 -> 127      : 0        |                                        |
           128 -> 255      : 0        |                                        |
           256 -> 511      : 0        |                                        |
           512 -> 1023     : 0        |                                        |
          1024 -> 2047     : 0        |                                        |
          2048 -> 4095     : 166723   |*************************************** |
          4096 -> 8191     : 19870    |***                                     |
          8192 -> 16383    : 6324     |                                        |
         16384 -> 32767    : 1098     |                                        |
         32768 -> 65535    : 190      |                                        |
         65536 -> 131071   : 179      |                                        |
        131072 -> 262143   : 18       |                                        |
        262144 -> 524287   : 4        |                                        |
        524288 -> 1048575  : 1363     |                                        |
      CPU 1
            latency        : count     distribution
             1 -> 1        : 0        |                                        |
             2 -> 3        : 0        |                                        |
             4 -> 7        : 0        |                                        |
             8 -> 15       : 0        |                                        |
            16 -> 31       : 0        |                                        |
            32 -> 63       : 0        |                                        |
            64 -> 127      : 0        |                                        |
           128 -> 255      : 0        |                                        |
           256 -> 511      : 0        |                                        |
           512 -> 1023     : 0        |                                        |
          1024 -> 2047     : 0        |                                        |
          2048 -> 4095     : 114042   |*************************************** |
          4096 -> 8191     : 9587     |**                                      |
          8192 -> 16383    : 4140     |                                        |
         16384 -> 32767    : 673      |                                        |
         32768 -> 65535    : 179      |                                        |
         65536 -> 131071   : 29       |                                        |
        131072 -> 262143   : 4        |                                        |
        262144 -> 524287   : 1        |                                        |
        524288 -> 1048575  : 364      |                                        |
      CPU 2
            latency        : count     distribution
             1 -> 1        : 0        |                                        |
             2 -> 3        : 0        |                                        |
             4 -> 7        : 0        |                                        |
             8 -> 15       : 0        |                                        |
            16 -> 31       : 0        |                                        |
            32 -> 63       : 0        |                                        |
            64 -> 127      : 0        |                                        |
           128 -> 255      : 0        |                                        |
           256 -> 511      : 0        |                                        |
           512 -> 1023     : 0        |                                        |
          1024 -> 2047     : 0        |                                        |
          2048 -> 4095     : 40147    |*************************************** |
          4096 -> 8191     : 2300     |*                                       |
          8192 -> 16383    : 828      |                                        |
         16384 -> 32767    : 178      |                                        |
         32768 -> 65535    : 59       |                                        |
         65536 -> 131071   : 2        |                                        |
        131072 -> 262143   : 0        |                                        |
        262144 -> 524287   : 1        |                                        |
        524288 -> 1048575  : 174      |                                        |
      CPU 3
            latency        : count     distribution
             1 -> 1        : 0        |                                        |
             2 -> 3        : 0        |                                        |
             4 -> 7        : 0        |                                        |
             8 -> 15       : 0        |                                        |
            16 -> 31       : 0        |                                        |
            32 -> 63       : 0        |                                        |
            64 -> 127      : 0        |                                        |
           128 -> 255      : 0        |                                        |
           256 -> 511      : 0        |                                        |
           512 -> 1023     : 0        |                                        |
          1024 -> 2047     : 0        |                                        |
          2048 -> 4095     : 29626    |*************************************** |
          4096 -> 8191     : 2704     |**                                      |
          8192 -> 16383    : 1090     |                                        |
         16384 -> 32767    : 160      |                                        |
         32768 -> 65535    : 72       |                                        |
         65536 -> 131071   : 32       |                                        |
        131072 -> 262143   : 26       |                                        |
        262144 -> 524287   : 12       |                                        |
        524288 -> 1048575  : 298      |                                        |
      
      All this is based on the trace3 examples written by
      Alexei Starovoitov <ast@plumgrid.com>.
      Signed-off-by: NDaniel Wagner <daniel.wagner@bmw-carit.de>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0fb1170e