1. 27 Sep, 2017 1 commit
  2. 02 Sep, 2017 1 commit
  3. 29 Aug, 2017 1 commit
  4. 22 Aug, 2017 1 commit
  5. 20 Aug, 2017 1 commit
  6. 17 Aug, 2017 1 commit
    • J
      bpf: sockmap sample program · 69e8cc13
      John Fastabend committed
      This program binds a program to a cgroup and then matches hard
      coded IP addresses and adds these to a sockmap.
      
      This will receive messages from the backend and send them to
      the client.
      
           client:X <---> frontend:10000 client:X <---> backend:10001
      
      To keep things simple this is only designed for 1:1 connections
      using hard coded values. A more complete example would allow many
      backends and clients.
      
      To run,
      
       # sockmap <cgroup2_dir>
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      69e8cc13
  7. 10 Aug, 2017 1 commit
    • D
      bpf: add BPF_J{LT,LE,SLT,SLE} instructions · 92b31a9a
      Daniel Borkmann committed
      Currently, eBPF only understands the BPF_JGT (>), BPF_JGE (>=),
      BPF_JSGT (s>), and BPF_JSGE (s>=) instructions. This means that
      the *JLT/*JLE counterparts, particularly those involving
      immediates, need to be rewritten from e.g. X < [IMM] by swapping
      arguments into [IMM] > X. The immediate must first be loaded
      into a register Y := [IMM] so that we can then compare with
      Y > X. Note that the destination operand is always required to
      be a register.
      
      This has the downside of having unnecessarily increased register
      pressure, meaning complex programs would need to spill other
      registers temporarily to stack in order to obtain an unused
      register for the [IMM]. Loading to registers will thus also
      affect state pruning since we need to account for that register
      use and potentially those registers that had to be spilled/filled
      again. As a consequence slightly more stack space might have
      been used due to spilling, and BPF programs are a bit longer
      due to extra code involving the register load and potentially
      required spill/fills.
      
      Thus, add BPF_JLT (<), BPF_JLE (<=), BPF_JSLT (s<), BPF_JSLE (s<=)
      counterparts to the eBPF instruction set. Modifying LLVM to
      remove the NegateCC() workaround in a PoC patch at [1] and
      allowing it to also emit the new instructions resulted in
      cilium's BPF programs that are injected into the fast-path to
      have a reduced program length in the range of 2-3% (e.g.
      accumulated main and tail call sections from one of the object
      files reduced from 4864 to 4729 insns), reduced complexity in
      the range of 10-30% (e.g. accumulated sections reduced in one
      of the cases from 116432 to 88428 insns), and reduced stack
      usage in the range of 1-5% (e.g. accumulated sections from one
      of the object files reduced from 824 to 784b).
      
      The modification for LLVM will be incorporated in a backwards
      compatible way. Plan is for LLVM to have i) a target specific
      option to offer a possibility to explicitly enable the extension
      by the user (as we have with -m target specific extensions today
      for various CPU insns), and ii) have the kernel checked for
      presence of the extensions and enable them transparently when
      the user is selecting more aggressive options such as -march=native
      in a bpf target context. (Other frontends generating BPF byte
      code, e.g. ply can probe the kernel directly for its code
      generation.)
      
        [1] https://github.com/borkmann/llvm/tree/bpf-insns

      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      92b31a9a
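The rewrite described in the BPF_J{LT,LE,SLT,SLE} commit message can be sketched in C as follows. This is only an illustration under stated assumptions: `struct insn`, `emit_lt_imm_old()`, `emit_lt_imm_new()` and `jump_taken()` are hypothetical names invented here, and the instruction layout is simplified (the kernel's real `struct bpf_insn` packs the registers into 4-bit fields). The opcode constants use the values from the kernel's BPF uapi headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode bits, with the values used by the kernel's BPF uapi headers. */
#define BPF_JMP   0x05
#define BPF_ALU64 0x07
#define BPF_K     0x00 /* source operand is the immediate */
#define BPF_X     0x08 /* source operand is a register */
#define BPF_MOV   0xb0
#define BPF_JGT   0x20
#define BPF_JLT   0xa0 /* one of the four opcodes added by the commit */

/* Simplified instruction layout (hypothetical, for illustration only). */
struct insn {
	uint8_t code;
	uint8_t dst;
	uint8_t src;
	int16_t off;
	int32_t imm;
};

/* Without BPF_JLT: "if (x < imm)" becomes "tmp = imm; if (tmp > x)". */
static size_t emit_lt_imm_old(struct insn *out, uint8_t x, uint8_t tmp,
			      int32_t imm, int16_t off)
{
	out[0] = (struct insn){ BPF_ALU64 | BPF_MOV | BPF_K, tmp, 0, 0, imm };
	out[1] = (struct insn){ BPF_JMP | BPF_JGT | BPF_X, tmp, x, off, 0 };
	return 2;
}

/* With BPF_JLT: a single compare against the immediate, no scratch register. */
static size_t emit_lt_imm_new(struct insn *out, uint8_t x, int32_t imm,
			      int16_t off)
{
	out[0] = (struct insn){ BPF_JMP | BPF_JLT | BPF_K, x, 0, off, imm };
	return 1;
}

/*
 * Tiny evaluator for the three opcodes above. Returns 1 if the final jump
 * is taken, 0 if not, -1 on an opcode this sketch does not handle.
 * Immediates are sign-extended to 64 bits; the comparisons are unsigned.
 */
static int jump_taken(const struct insn *prog, size_t len, uint64_t *regs)
{
	for (size_t i = 0; i < len; i++) {
		const struct insn *in = &prog[i];

		switch (in->code) {
		case BPF_ALU64 | BPF_MOV | BPF_K:
			regs[in->dst] = (uint64_t)(int64_t)in->imm;
			break;
		case BPF_JMP | BPF_JGT | BPF_X:
			return regs[in->dst] > regs[in->src];
		case BPF_JMP | BPF_JLT | BPF_K:
			return regs[in->dst] < (uint64_t)(int64_t)in->imm;
		default:
			return -1;
		}
	}
	return 0;
}
```

Both sequences decide the jump identically, but the old form costs one extra instruction and one extra live register per comparison, which is the source of the register pressure the commit message describes.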
  8. 09 Aug, 2017 1 commit
  9. 02 Aug, 2017 3 commits
  10. 01 Aug, 2017 3 commits
  11. 31 Jul, 2017 1 commit
  12. 30 Jul, 2017 1 commit
  13. 20 Jul, 2017 1 commit
  14. 19 Jul, 2017 3 commits
    • J
      perf/core: Define the common branch type classification · eb0baf8a
      Jin Yao committed
      It is often useful to know the branch types while analyzing branch data.
      For example, a call is very different from a conditional branch.
      
      Currently we have to look it up in the binary, but the binary may not be
      available later, and even when it is available the lookup takes the user
      some time. It is very useful for the user to check it directly in perf report.
      
      Perf already has support for disassembling the branch instruction to get
      the x86 branch type.
      
      To keep the kernel and userspace consistent and to make the classification
      more general, the patch adds the common branch type classification
      to perf_event.h.
      
      The patch only defines a minimal set of the most common branch types.
      
      PERF_BR_UNKNOWN         : unknown
      PERF_BR_COND            : conditional
      PERF_BR_UNCOND          : unconditional
      PERF_BR_IND             : indirect
      PERF_BR_CALL            : function call
      PERF_BR_IND_CALL        : indirect function call
      PERF_BR_RET             : function return
      PERF_BR_SYSCALL         : syscall
      PERF_BR_SYSRET          : syscall return
      PERF_BR_COND_CALL       : conditional function call
      PERF_BR_COND_RET        : conditional function return
      
      The patch also adds a new field, type (4 bits), to perf_branch_entry
      to record the branch type.
      
      Since disassembling the branch instruction adds some overhead,
      a new PERF_SAMPLE_BRANCH_TYPE_SAVE flag is introduced to indicate
      whether the branch instruction should be disassembled and the
      branch type recorded.
      
      Change log:
      
      v10: Not changed.
      
      v9: Not changed.
      
      v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN.
          No other change.
      
      v7: Just keep the most common branch types.
          Others are removed.
      
      v6: Not changed.
      
      v5: Not changed. The v5 patch series only changes userspace.
      
      v4: Compared to the previous version, the major changes are:
      
      1. Remove the PERF_BR_JCC_FWD/PERF_BR_JCC_BWD, they will be
         computed later in userspace.
      
      2. Remove the "cross" field in perf_branch_entry. The cross page
         computing will be done later in userspace.
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: http://lkml.kernel.org/r/1500379995-6449-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb0baf8a
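The branch type classification from the commit message above can be sketched as a C enum plus a 4-bit field. The enumerator names and their order come from the list in the commit message, with values assumed to follow that order starting at zero. `PERF_BR_MAX`, `struct branch_entry_sketch` and `br_type_name()` are names assumed here for illustration, and the struct is a reduced stand-in rather than the actual `perf_branch_entry` layout.

```c
#include <stdint.h>

/* Branch types in the order the commit message lists them. */
enum {
	PERF_BR_UNKNOWN = 0,	/* unknown */
	PERF_BR_COND,		/* conditional */
	PERF_BR_UNCOND,		/* unconditional */
	PERF_BR_IND,		/* indirect */
	PERF_BR_CALL,		/* function call */
	PERF_BR_IND_CALL,	/* indirect function call */
	PERF_BR_RET,		/* function return */
	PERF_BR_SYSCALL,	/* syscall */
	PERF_BR_SYSRET,		/* syscall return */
	PERF_BR_COND_CALL,	/* conditional function call */
	PERF_BR_COND_RET,	/* conditional function return */
	PERF_BR_MAX,		/* number of types (assumed sentinel) */
};

/* Reduced stand-in for a branch record carrying a 4-bit type field. */
struct branch_entry_sketch {
	uint64_t from;
	uint64_t to;
	unsigned int type : 4;	/* 11 types fit comfortably in 4 bits */
};

/* Map a branch type to the description used in the commit message. */
static const char *br_type_name(unsigned int type)
{
	static const char * const names[PERF_BR_MAX] = {
		"unknown", "conditional", "unconditional", "indirect",
		"function call", "indirect function call", "function return",
		"syscall", "syscall return", "conditional function call",
		"conditional function return",
	};

	return type < PERF_BR_MAX ? names[type] : "unknown";
}
```

A 4-bit field holds values 0 through 15, so the eleven types defined here leave room for a few additions without changing the field width.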
    • A
      tools include uapi asm-generic: Grab a copy of fcntl.h · 84d1d8a1
      Arnaldo Carvalho de Melo committed
      We'll need defines for beautifying fcntl arguments that are not
      available in older distros, these:
      
        trace/beauty/fcntl.c: In function 'syscall_arg__scnprintf_fcntl_arg':
        trace/beauty/fcntl.c:93: error: 'F_OFD_SETLK' undeclared (first use in this function)
        trace/beauty/fcntl.c:93: error: (Each undeclared identifier is reported only once
        trace/beauty/fcntl.c:93: error: for each function it appears in.)
        trace/beauty/fcntl.c:93: error: 'F_OFD_SETLKW' undeclared (first use in this function)
        trace/beauty/fcntl.c:93: error: 'F_OFD_GETLK' undeclared (first use in this function)
        trace/beauty/fcntl.c:94: error: 'F_GETOWN_EX' undeclared (first use in this function)
        trace/beauty/fcntl.c:94: error: 'F_SETOWN_EX' undeclared (first use in this function)
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-gvlw67a47e9z65jdunj4je5s@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      84d1d8a1
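The build errors quoted in the fcntl.h commit come from older system headers that lack the newer fcntl commands. The commit solves this by carrying a copy of the kernel header. The sketch below shows a different, simpler technique, guarded fallback defines, purely to illustrate the problem. The numeric values are the generic Linux ones, and `fcntl_cmd_name()` is a hypothetical helper, not the function in trace/beauty/fcntl.c.

```c
#include <fcntl.h>
#include <stddef.h>

/* Fallbacks for systems whose <fcntl.h> predates these commands.
 * Values are the generic Linux ones (assumption: a generic-ABI target). */
#ifndef F_SETOWN_EX
#define F_SETOWN_EX	15
#endif
#ifndef F_GETOWN_EX
#define F_GETOWN_EX	16
#endif
#ifndef F_OFD_GETLK
#define F_OFD_GETLK	36
#endif
#ifndef F_OFD_SETLK
#define F_OFD_SETLK	37
#endif
#ifndef F_OFD_SETLKW
#define F_OFD_SETLKW	38
#endif

/* Hypothetical beautifier: return the symbolic name of a few fcntl
 * commands, or NULL for commands this sketch does not know about. */
static const char *fcntl_cmd_name(int cmd)
{
	switch (cmd) {
	case F_SETOWN_EX:	return "F_SETOWN_EX";
	case F_GETOWN_EX:	return "F_GETOWN_EX";
	case F_OFD_GETLK:	return "F_OFD_GETLK";
	case F_OFD_SETLK:	return "F_OFD_SETLK";
	case F_OFD_SETLKW:	return "F_OFD_SETLKW";
	default:		return NULL;
	}
}
```

Carrying a full header copy, as the commit does, avoids maintaining such a list by hand and keeps the tooling in step with the kernel's definitions.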
    • A
      tools: Update include/uapi/linux/fcntl.h copy from the kernel · ca3cf049
      Arnaldo Carvalho de Melo committed
      To get the changes in the commit c75b1d94 ("fs: add fcntl()
      interface for setting/getting write life time hints").
      
      Silencing this perf build warning:
      
        Warning: include/uapi/linux/fcntl.h differs from kernel
      
      We already beautify the fcntl cmd argument, so an upcoming cset will
      update the 'cmd' strarray to cover these new commands.
      
      The hints are in the 3rd arg, a pointer, so they are not yet supported in
      'perf trace'. For that we need to copy it somehow, probably using eBPF; a
      new attempt at doing that is planned.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-al471wzs3x48alql0tm3mnfa@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ca3cf049
  15. 03 Jul, 2017 1 commit
  16. 02 Jul, 2017 1 commit
  17. 11 Jun, 2017 1 commit
  18. 07 Jun, 2017 1 commit
  19. 05 Jun, 2017 1 commit
  20. 24 May, 2017 1 commit
    • I
      tools/include: Sync kernel ABI headers with tooling headers · 6e30437b
      Ingo Molnar committed
      Sync (copy) the following v4.12 kernel headers to the tooling headers:
      
        arch/x86/include/asm/disabled-features.h:
        arch/x86/include/uapi/asm/kvm.h:
        arch/powerpc/include/uapi/asm/kvm.h:
        arch/s390/include/uapi/asm/kvm.h:
        arch/arm/include/uapi/asm/kvm.h:
        arch/arm64/include/uapi/asm/kvm.h:
      
         - 'struct kvm_sync_regs' got changed in an ABI-incompatible way,
           fortunately none of the (in-kernel) tooling relied on it
      
         - new KVM_DEV calls added
      
        arch/x86/include/asm/required-features.h:
      
         - 5-level paging hardware ABI detail added
      
        arch/x86/include/asm/cpufeatures.h:
      
         - new CPU feature added
      
        arch/x86/include/uapi/asm/vmx.h:
      
         - new VMX exit conditions
      
      None of the changes requires fixes in the tooling source code.
      
      This addresses the following warnings:
      
        Warning: include/uapi/linux/stat.h differs from kernel
        Warning: arch/x86/include/asm/disabled-features.h differs from kernel
        Warning: arch/x86/include/asm/required-features.h differs from kernel
        Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
        Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
        Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
        Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel
      
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170524065721.j2mlch6bgk5klgbc@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6e30437b
  21. 12 May, 2017 1 commit
  22. 25 Apr, 2017 1 commit
  23. 22 Apr, 2017 1 commit
  24. 19 Apr, 2017 1 commit
    • S
      powerpc/perf: Define big-endian version of perf_mem_data_src · 8c5073db
      Sukadev Bhattiprolu committed
      perf_mem_data_src is a union that is initialized in the kernel via the ->val
      field and accessed by userspace via the mem_xxx bitfields. For this to work
      correctly on big endian platforms, we need a big-endian definition for the
      bitfields.
      
      Currently on a big endian system, if a user requests PERF_SAMPLE_DATA_SRC (perf
      report -d), they will get the default value from perf_sample_data_init(), which
      is PERF_MEM_NA. The value for PERF_MEM_NA is constructed using shifts:
      
        /* TLB access */
        #define PERF_MEM_TLB_NA		0x01 /* not available */
        ...
        #define PERF_MEM_TLB_SHIFT	26
      
        #define PERF_MEM_S(a, s) \
      	(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
      
        #define PERF_MEM_NA (PERF_MEM_S(OP, NA)   |\
      		    PERF_MEM_S(LVL, NA)   |\
      		    PERF_MEM_S(SNOOP, NA) |\
      		    PERF_MEM_S(LOCK, NA)  |\
      		    PERF_MEM_S(TLB, NA))
      
      Which works out as:
      
        ((0x01 << 0) | (0x01 << 5) | (0x01 << 19) | (0x01 << 24) | (0x01 << 26))
      
      Which means the PERF_MEM_NA value comes out of the kernel as 0x5080021
      in CPU endian.
      
      But then in the perf tool, the code uses the bitfields to inspect the value, and
      currently the bitfields are defined using little endian ordering.
      
      So eg. in perf_mem__tlb_scnprintf() we see:
        data_src->val = 0x5080021
                   op = 0x0
                  lvl = 0x0
                snoop = 0x0
                 lock = 0x0
                 dtlb = 0x0
                 rsvd = 0x5080021
      
      Because of the way the perf tool code is written this is still displayed to the
      user as "N/A", so there is no bug visible at the UI level.
      
      Currently there are no big endian architectures which export a meaningful
      value (ie. other than PERF_MEM_NA), so the extent of the bug on big endian
      platforms is that the PERF_MEM_NA value is exported incorrectly as described
      above. Subsequent patches will add support on big endian powerpc for populating
      the data source value.
      
      This patch does a minimal fix of adding big endian definition of the bitfields
      to match the values that are already exported by the kernel on big endian. And
      it makes no change on little endian.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      8c5073db
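The arithmetic in the perf_mem_data_src commit message can be reproduced with a small C sketch. The `PERF_MEM_*` macros repeat the definitions quoted in the message (with `uint64_t` standing in for `__u64`), and the NA values and shifts are those implied by its "works out as" line. The field widths (5, 14, 5, 2, 7, 31) and the helpers `take()`, `decode_lsb_first()` and `decode_msb_first()` are assumptions made here: since compiler bitfield layout is implementation-defined, the two layouts are simulated with shifts and masks rather than real bitfields.

```c
#include <stdint.h>

#define PERF_MEM_OP_NA		0x01
#define PERF_MEM_OP_SHIFT	0
#define PERF_MEM_LVL_NA		0x01
#define PERF_MEM_LVL_SHIFT	5
#define PERF_MEM_SNOOP_NA	0x01
#define PERF_MEM_SNOOP_SHIFT	19
#define PERF_MEM_LOCK_NA	0x01
#define PERF_MEM_LOCK_SHIFT	24
#define PERF_MEM_TLB_NA		0x01
#define PERF_MEM_TLB_SHIFT	26

#define PERF_MEM_S(a, s) \
	(((uint64_t)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)

#define PERF_MEM_NA (PERF_MEM_S(OP, NA)    | \
		     PERF_MEM_S(LVL, NA)   | \
		     PERF_MEM_S(SNOOP, NA) | \
		     PERF_MEM_S(LOCK, NA)  | \
		     PERF_MEM_S(TLB, NA))

struct mem_fields {
	uint64_t op, lvl, snoop, lock, dtlb, rsvd;
};

/* Field widths in declaration order: op, lvl, snoop, lock, dtlb, rsvd. */
static const unsigned int widths[6] = { 5, 14, 5, 2, 7, 31 };

static uint64_t take(uint64_t val, unsigned int shift, unsigned int width)
{
	return (val >> shift) & ((UINT64_C(1) << width) - 1);
}

/* Fields allocated from the least significant bit upwards, as on a
 * typical little-endian ABI. This matches the kernel's shift values. */
static struct mem_fields decode_lsb_first(uint64_t val)
{
	uint64_t f[6];
	unsigned int shift = 0;

	for (int i = 0; i < 6; i++) {
		f[i] = take(val, shift, widths[i]);
		shift += widths[i];
	}
	return (struct mem_fields){ f[0], f[1], f[2], f[3], f[4], f[5] };
}

/* The same declaration order allocated from the most significant bit
 * downwards, as on a typical big-endian ABI. */
static struct mem_fields decode_msb_first(uint64_t val)
{
	uint64_t f[6];
	unsigned int shift = 64;

	for (int i = 0; i < 6; i++) {
		shift -= widths[i];
		f[i] = take(val, shift, widths[i]);
	}
	return (struct mem_fields){ f[0], f[1], f[2], f[3], f[4], f[5] };
}
```

Decoding `PERF_MEM_NA` LSB-first yields 1 in each of the five named fields, while decoding it MSB-first leaves them all zero and puts the whole value in `rsvd`, which is the mismatch the commit message shows for big-endian systems.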
  25. 10 Apr, 2017 1 commit
  26. 02 Apr, 2017 1 commit
  27. 31 Mar, 2017 1 commit
  28. 24 Mar, 2017 2 commits
  29. 23 Mar, 2017 1 commit
  30. 17 Mar, 2017 1 commit
  31. 14 Mar, 2017 1 commit
    • H
      perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info · f3b3614a
      Hari Bathini committed
      Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
      by the kernel when fork, clone, setns or unshare are invoked. And update
      perf-record documentation with the new option to record namespace
      events.
      
      Committer notes:
      
      Combined it with a later patch to allow printing it via 'perf report -D'
      and be able to test the feature introduced in this patch. Had to move
      here also perf_ns__name(), that was introduced in another later patch.
      
      Also used PRIu64 and PRIx64 to fix the build in some environments wrt:
      
        util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
           ret  += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
                                               ^
      Testing it:
      
        # perf record --namespaces -a
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
        #
        # perf report -D
        <SNIP>
        3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
                      [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                       4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
      
        0x1151e0 [0x30]: event: 9
        .
        . ... raw event: size 48 bytes
        .  0000:  09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00  ......0..q.h....
        .  0010:  a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00  .9...9...(.c....
        .  0020:  03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00  ................
        <SNIP>
              NAMESPACES events:          1
        <SNIP>
        #
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f3b3614a
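The committer note about PRIu64 and PRIx64 in the PERF_RECORD_NAMESPACES commit can be illustrated with a short sketch. It formats one namespace entry in the `idx/name: dev/0xino` style seen in the 'perf report -D' output above. `struct ns_link_info`, `ns_names` and `format_ns_entry()` are hypothetical names for this example; the namespace order is taken from that output.

```c
#include <inttypes.h>
#include <stdio.h>

/* Device and inode number identifying one namespace (assumed layout). */
struct ns_link_info {
	uint64_t dev;
	uint64_t ino;
};

/* Namespace names in the order shown by the 'perf report -D' output. */
static const char * const ns_names[] = {
	"net", "uts", "ipc", "pid", "user", "mnt", "cgroup",
};

/*
 * Format one entry as "idx/name: dev/0xino". PRIu64 and PRIx64 expand to
 * the right length modifier for uint64_t on every platform, which avoids
 * the "%lx expects long unsigned int" error quoted in the commit message.
 */
static int format_ns_entry(char *buf, size_t size, unsigned int idx,
			   const struct ns_link_info *info)
{
	const char *name = idx < sizeof(ns_names) / sizeof(ns_names[0]) ?
			   ns_names[idx] : "unknown";

	return snprintf(buf, size, "%u/%s: %" PRIu64 "/0x%" PRIx64,
			idx, name, info->dev, info->ino);
}
```

Hard-coding `%lu` or `%lx` only works where `uint64_t` is `unsigned long`; on targets where it is `unsigned long long`, the format macros are what keep the build warning-free.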
  32. 13 Mar, 2017 1 commit
  33. 15 Feb, 2017 1 commit