1. 16 Mar, 2015 1 commit
    • A
      bpf: allow extended BPF programs access skb fields · 9bac3d6d
      Alexei Starovoitov committed
      introduce user accessible mirror of in-kernel 'struct sk_buff':
      struct __sk_buff {
          __u32 len;
          __u32 pkt_type;
          __u32 mark;
          __u32 queue_mapping;
      };
      
      bpf programs can do:
      
      int bpf_prog(struct __sk_buff *skb)
      {
          __u32 var = skb->pkt_type;
      
      which will be compiled to bpf assembler as:
      
      dst_reg = *(u32 *)(src_reg + 4) // 4 == offsetof(struct __sk_buff, pkt_type)
      
      bpf verifier will check validity of access and will convert it to:
      
      dst_reg = *(u8 *)(src_reg + offsetof(struct sk_buff, __pkt_type_offset))
      dst_reg &= 7
      
      since skb->pkt_type is a bitfield.
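      A minimal sketch of the two access forms described above. The mirror struct is taken from the commit message; `struct fake_sk_buff` and both helper functions are hypothetical stand-ins (the real `struct sk_buff` layout differs), and the sketch assumes the bitfield sits in the low three bits of its byte, as the `&= 7` mask implies.

```c
#include <stddef.h>
#include <stdint.h>

/* User-visible mirror, as listed in the commit message. */
struct __sk_buff {
    uint32_t len;
    uint32_t pkt_type;
    uint32_t mark;
    uint32_t queue_mapping;
};

/* Hypothetical stand-in for the kernel's struct sk_buff. Only the idea of
 * one byte carrying a 3-bit pkt_type next to other flag bits matters here. */
struct fake_sk_buff {
    uint32_t len;
    uint8_t  pkt_type_byte;  /* low 3 bits: pkt_type, upper bits: other flags */
};

/* Offset the program uses: a 4-byte load at offset 4 of the mirror. */
static size_t mirror_pkt_type_offset(void)
{
    return offsetof(struct __sk_buff, pkt_type);
}

/* Shape of the rewritten access: a 1-byte load followed by a mask with 7. */
static uint32_t rewritten_pkt_type_load(const struct fake_sk_buff *skb)
{
    const uint8_t *base = (const uint8_t *)skb;
    uint32_t dst = base[offsetof(struct fake_sk_buff, pkt_type_byte)];

    dst &= 7;
    return dst;
}
```

      With `pkt_type_byte` set to 0xFB, the masked load yields 3: the unrelated upper bits are discarded.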
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 13 Mar, 2015 1 commit
    • D
      ebpf: verifier: check that call reg with ARG_ANYTHING is initialized · 80f1d68c
      Daniel Borkmann committed
      I noticed that a helper function with argument type ARG_ANYTHING does
      not need to have an initialized value (register).
      
      In the worst case, this can lead to unintended stack memory leakage in future
      helper functions if they are not carefully designed, or unintended
      application behaviour in case the application developer was not careful
      enough to match a correct helper function signature in the API.
      
      The underlying issue is that ARG_ANYTHING should actually be split
      into two different semantics:
      
        1) ARG_DONTCARE for function arguments that the helper function
           does not care about (in other words: the default for unused
           function arguments), and
      
        2) ARG_ANYTHING that is an argument actually being used by a
           helper function and *guaranteed* to be an initialized register.
      
      The current risk is low: ARG_ANYTHING is only used for the 'flags'
      argument (r4) in bpf_map_update_elem() that internally does strict
      checking.
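      The split described above might look roughly like this. It is a reduced sketch, not the verifier's code: the enums are cut down to what the example needs, and `check_func_arg_sketch` is a hypothetical name.

```c
/* Reduced, illustrative versions of the verifier's register and argument types. */
enum sketch_reg_type { SK_NOT_INIT, SK_UNKNOWN_VALUE, SK_PTR_TO_STACK };
enum sketch_arg_type { SK_ARG_DONTCARE, SK_ARG_ANYTHING };

/* Returns 0 if the register may be passed for this argument, -1 otherwise. */
static int check_func_arg_sketch(enum sketch_reg_type reg, enum sketch_arg_type arg)
{
    if (arg == SK_ARG_DONTCARE)
        return 0;   /* the helper ignores this argument entirely */

    if (reg == SK_NOT_INIT)
        return -1;  /* ARG_ANYTHING: any value is fine, but it must be initialized */

    return 0;
}
```

      Under this split, an uninitialized register is accepted only in argument positions the helper never reads.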
      
      Fixes: 17a52670 ("bpf: verifier (add verifier core)")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 02 Mar, 2015 2 commits
    • D
      ebpf: move read-only fields to bpf_prog and shrink bpf_prog_aux · 24701ece
      Daniel Borkmann committed
      is_gpl_compatible and prog_type should be moved directly into bpf_prog
      as they stay immutable during bpf_prog's lifetime, are core attributes
      and they can be locked as read-only later on via bpf_prog_select_runtime().
      
      With a bit of rearranging, this also allows us to shrink bpf_prog_aux
      to exactly 1 cacheline.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      ebpf: add sched_cls_type and map it to sk_filter's verifier ops · 96be4325
      Daniel Borkmann committed
      As discussed recently and at netconf/netdev01, we want to prevent making
      bpf_verifier_ops registration available for modules, but have them at a
      controlled place inside the kernel instead.
      
      The reason for this is that out-of-tree modules can go crazy and define
      and register any verifier ops they want, doing all sorts of crap, even
      bypassing available GPLed eBPF helper functions. We don't want to offer
      such a shiny playground, of course, but keep strict control to ourselves
      inside the core kernel.
      
      This also encourages us to design eBPF user helpers carefully and
      generically, so they can be shared among various subsystems using eBPF.
      
      For the eBPF traffic classifier (cls_bpf), it's a good start to share
      the same helper facilities as we currently do in eBPF for socket filters.
      
      That way, we have BPF_PROG_TYPE_SCHED_CLS look like its own type, thus
      one day if there's a good reason to diverge the set of helper functions
      from the set available to socket filters, we keep ABI compatibility.
      
      In future, we could place all bpf_prog_type_list at a central place,
      perhaps.
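      One way to picture the arrangement: two distinct program types registered in-kernel, both pointing at the same verifier ops. The sketch below uses hypothetical, reduced types (`sketch_prog_type`, `sketch_verifier_ops`, `find_ops`) rather than the kernel's own structures.

```c
#include <stddef.h>

/* Hypothetical, reduced types for illustration only. */
enum sketch_prog_type { SK_PROG_SOCKET_FILTER, SK_PROG_SCHED_CLS };

struct sketch_verifier_ops { const char *name; };

struct sketch_type_entry {
    enum sketch_prog_type type;
    const struct sketch_verifier_ops *ops;
};

static const struct sketch_verifier_ops sk_filter_ops_sketch = { "sk_filter" };

/* Two separate program types, one shared set of verifier ops. */
static const struct sketch_type_entry sketch_types[] = {
    { SK_PROG_SOCKET_FILTER, &sk_filter_ops_sketch },
    { SK_PROG_SCHED_CLS,     &sk_filter_ops_sketch },
};

/* Look up the ops for a program type; NULL if the type is not registered. */
static const struct sketch_verifier_ops *find_ops(enum sketch_prog_type type)
{
    for (size_t i = 0; i < sizeof(sketch_types) / sizeof(sketch_types[0]); i++)
        if (sketch_types[i].type == type)
            return sketch_types[i].ops;
    return NULL;
}
```

      Because the classifier has its own table entry, its ops pointer could later be swapped for a different set without changing the type value userspace passes in.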
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 06 Dec, 2014 1 commit
  5. 31 Oct, 2014 1 commit
    • A
      bpf: reduce verifier memory consumption · 9c399760
      Alexei Starovoitov committed
      verifier keeps track of register state spilled to stack.
      registers are 8-byte wide and always aligned, so instead of tracking them
      in every byte-sized stack slot, use MAX_BPF_STACK / 8 array to track
      spilled register state.
      Though verifier runs in user context and its state is freed immediately
      after verification, it makes sense to reduce its memory usage.
      This optimization reduces sizeof(struct verifier_state)
      from 12464 to 1712 on 64-bit and from 6232 to 1112 on 32-bit.
      
      Note, this patch doesn't change existing limits, which are there to bound
      time and memory during verification: 4k total number of insns in a program,
      1k number of jumps (states to visit) and 32k number of processed insn
      (since an insn may be visited multiple times). Theoretical worst case memory
      during verification is 1712 * 1k = ~1.7 Mbyte. Out-of-memory situation triggers
      cleanup and rejects the program.
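      The saving comes from indexing spilled-register state per 8-byte slot rather than per stack byte. A rough sketch follows, assuming the 512-byte eBPF stack (MAX_BPF_STACK) and a hypothetical `sketch_slot_state` record that is much smaller than the verifier's real per-register state.

```c
#include <stddef.h>

#define SK_MAX_BPF_STACK 512  /* eBPF stack size in bytes */
#define SK_BPF_REG_SIZE  8    /* registers are 8 bytes wide */

/* Hypothetical per-slot record; the real register state holds more. */
struct sketch_slot_state { int type; long imm; };

/* Before: one record for every stack byte. */
struct stack_per_byte {
    struct sketch_slot_state slots[SK_MAX_BPF_STACK];
};

/* After: one record for every aligned 8-byte register slot. */
struct stack_per_reg {
    struct sketch_slot_state spilled[SK_MAX_BPF_STACK / SK_BPF_REG_SIZE];
};

/* Map a frame-pointer-relative offset (-512..-1) to its register slot. */
static int spill_slot_index(int off)
{
    return (SK_MAX_BPF_STACK + off) / SK_BPF_REG_SIZE;
}
```

      In this sketch the tracking array shrinks from 512 entries to 64, an eightfold reduction for that part of the state.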
      Suggested-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 22 Oct, 2014 1 commit
  7. 02 Oct, 2014 1 commit
    • A
      bpf: add search pruning optimization to verifier · f1bca824
      Alexei Starovoitov committed
      consider C program represented in eBPF:
      int filter(int arg)
      {
          int a, b, c, *ptr;
      
          if (arg == 1)
              ptr = &a;
          else if (arg == 2)
              ptr = &b;
          else
              ptr = &c;
      
          *ptr = 0;
          return 0;
      }
      eBPF verifier has to follow all possible paths through the program
      to recognize that '*ptr = 0' instruction would be safe to execute
      in all situations.
      It does this by picking a path towards the end and observing changes
      to registers and stack at every insn until it reaches bpf_exit.
      Then it comes back to one of the previous branches and goes towards
      the end again with potentially different values in registers.
      When program has a lot of branches, the number of possible combinations
      of branches is huge, so verifier has a hard limit of walking no more
      than 32k instructions. This limit can be reached and complex (but valid)
      programs could be rejected. Therefore it's important to recognize equivalent
      verifier states to prune this depth first search.
      
      Basic idea can be illustrated by the program (where .. are some eBPF insns):
          1: ..
          2: if (rX == rY) goto 4
          3: ..
          4: ..
          5: ..
          6: bpf_exit
      In the first pass towards bpf_exit the verifier will walk insns: 1, 2, 3, 4, 5, 6
      Since insn#2 is a branch the verifier will remember its state in verifier stack
      to come back to it later.
      Since insn#4 is marked as 'branch target', the verifier will remember its state
      in explored_states[4] linked list.
      Once it reaches insn#6 successfully it will pop the state recorded at insn#2 and
      will continue.
      Without search pruning optimization verifier would have to walk 4, 5, 6 again,
      effectively simulating execution of insns 1, 2, 4, 5, 6
      With search pruning it will check whether state at #4 after jumping from #2
      is equivalent to one recorded in explored_states[4] during first pass.
      If there is an equivalent state, verifier can prune the search at #4 and declare
      this path to be safe as well.
      In other words two states at #4 are equivalent if execution of 1, 2, 3, 4 insns
      and 1, 2, 4 insns produces equivalent registers and stack.
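      The pruning check can be sketched as a per-instruction list of previously seen states. Everything here is a simplification: `struct prune_state` keeps one integer tag per register and stack slot, the list is a fixed-size array rather than a linked list, and equivalence is plain equality, which is stricter than the verifier needs to be.

```c
#include <stdbool.h>
#include <string.h>

#define PRUNE_NREGS     11  /* R0..R10 */
#define PRUNE_NSLOTS    64  /* 8-byte stack slots */
#define PRUNE_MAX_INSNS 16
#define PRUNE_MAX_SEEN  4

/* Hypothetical, reduced verifier state: one type tag per register and slot. */
struct prune_state {
    int regs[PRUNE_NREGS];
    int stack[PRUNE_NSLOTS];
};

/* States already explored at each branch-target instruction. */
static struct prune_state explored[PRUNE_MAX_INSNS][PRUNE_MAX_SEEN];
static int explored_cnt[PRUNE_MAX_INSNS];

/* Simplification: two states are equivalent only if they are identical. */
static bool states_equal(const struct prune_state *a, const struct prune_state *b)
{
    return memcmp(a->regs, b->regs, sizeof(a->regs)) == 0 &&
           memcmp(a->stack, b->stack, sizeof(a->stack)) == 0;
}

/* Returns true if the search can be pruned at insn; otherwise records cur. */
static bool is_state_visited(int insn, const struct prune_state *cur)
{
    for (int i = 0; i < explored_cnt[insn]; i++)
        if (states_equal(&explored[insn][i], cur))
            return true;

    if (explored_cnt[insn] < PRUNE_MAX_SEEN)
        explored[insn][explored_cnt[insn]++] = *cur;
    return false;
}
```

      In the six-line example, the first arrival at insn 4 records its state; if the second arrival (via the jump from insn 2) carries an identical state, the walk over 4, 5, 6 is skipped.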
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 27 Sep, 2014 5 commits
    • A
      bpf: verifier (add verifier core) · 17a52670
      Alexei Starovoitov committed
      This patch adds verifier core which simulates execution of every insn and
      records the state of registers and program stack. Every branch instruction seen
      during simulation is pushed into state stack. When verifier reaches BPF_EXIT,
      it pops the state from the stack and continues until it reaches BPF_EXIT again.
      For program:
      1: bpf_mov r1, xxx
      2: if (r1 == 0) goto 5
      3: bpf_mov r0, 1
      4: goto 6
      5: bpf_mov r0, 2
      6: bpf_exit
      The verifier will walk insns: 1, 2, 3, 4, 6
      then it will pop the state recorded at insn#2 and will continue: 5, 6
      
      This way it walks all possible paths through the program and checks all
      possible values of registers. While doing so, it checks for:
      - invalid instructions
      - uninitialized register access
      - uninitialized stack access
      - misaligned stack access
      - out of range stack access
      - invalid calling convention
      - instruction encoding is not using reserved fields
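      The walk over the six-instruction program above can be reproduced with a small simulation. This is a sketch only: the instruction model (`walk_insn`) is hypothetical and the pending stack stores just a program counter, whereas the verifier saves full register and stack state with every pushed branch and runs its checks at each step.

```c
#include <stddef.h>

enum walk_kind { W_MOV, W_COND_JMP, W_GOTO, W_EXIT };

/* Hypothetical instruction model; target is a 1-based instruction number. */
struct walk_insn { enum walk_kind kind; int target; };

/* The example program from the commit message (index 0 is unused). */
static const struct walk_insn walk_prog[] = {
    { W_MOV, 0 },       /* 0: unused */
    { W_MOV, 0 },       /* 1: bpf_mov r1, xxx */
    { W_COND_JMP, 5 },  /* 2: if (r1 == 0) goto 5 */
    { W_MOV, 0 },       /* 3: bpf_mov r0, 1 */
    { W_GOTO, 6 },      /* 4: goto 6 */
    { W_MOV, 0 },       /* 5: bpf_mov r0, 2 */
    { W_EXIT, 0 },      /* 6: bpf_exit */
};

/* Walks every path; stores visited insn numbers in out, returns the count. */
static size_t walk_all_paths(int *out, size_t cap)
{
    int pending[8];  /* branch targets still to be explored */
    size_t sp = 0, n = 0;
    int pc = 1;

    for (;;) {
        if (n < cap)
            out[n++] = pc;

        switch (walk_prog[pc].kind) {
        case W_COND_JMP:
            if (sp < 8)
                pending[sp++] = walk_prog[pc].target;  /* revisit later */
            pc++;                                      /* fall through first */
            break;
        case W_GOTO:
            pc = walk_prog[pc].target;
            break;
        case W_EXIT:
            if (sp == 0)
                return n;        /* nothing left to explore */
            pc = pending[--sp];  /* pop a saved branch and continue */
            break;
        default:
            pc++;
            break;
        }
    }
}
```

      Calling `walk_all_paths` fills the buffer with 1, 2, 3, 4, 6, 5, 6, matching the order given in the message.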
      
      Kernel subsystem configures the verifier with two callbacks:
      
      - bool (*is_valid_access)(int off, int size, enum bpf_access_type type);
        that provides information to the verifier which fields of 'ctx'
        are accessible (remember 'ctx' is the first argument to eBPF program)
      
      - const struct bpf_func_proto *(*get_func_proto)(enum bpf_func_id func_id);
        returns argument constraints of kernel helper functions that eBPF program
        may call, so that verifier can check that R1-R5 types match the prototype
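      As an illustration of the first callback, a subsystem might allow only aligned 4-byte reads inside its context structure. The callback signature follows the message; `struct example_ctx`, the local enum definition, and the policy itself are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Defined locally so the sketch is self-contained. */
enum bpf_access_type { BPF_READ = 1, BPF_WRITE = 2 };

/* Hypothetical context layout, for illustration only. */
struct example_ctx {
    uint32_t len;
    uint32_t mark;
};

/* Allow only aligned 4-byte reads that fall inside struct example_ctx. */
static bool example_is_valid_access(int off, int size, enum bpf_access_type type)
{
    if (type != BPF_READ)
        return false;
    if (size != (int)sizeof(uint32_t))
        return false;
    if (off < 0 || off % size != 0)
        return false;
    return (size_t)off + (size_t)size <= sizeof(struct example_ctx);
}
```

      A read at offset 0 or 4 passes; offset 8 is past the end, offset 2 is misaligned, and any write is refused.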
      
      More details in Documentation/networking/filter.txt and in kernel/bpf/verifier.c
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      bpf: verifier (add branch/goto checks) · 475fb78f
      Alexei Starovoitov committed
      check that control flow graph of eBPF program is a directed acyclic graph
      
      check_cfg() does:
      - detect loops
      - detect unreachable instructions
      - check that program terminates with BPF_EXIT insn
      - check that all branches are within program boundary
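      The four checks listed above fit naturally into one depth-first search with three node colors. The sketch below is recursive and uses a hypothetical instruction model (`cfg_insn`); it is meant to show the idea, not to mirror how check_cfg() is written.

```c
#define CFG_MAX_INSNS 32

enum cfg_kind { C_OTHER, C_COND_JMP, C_GOTO, C_EXIT };

/* Hypothetical instruction model; off is relative to the next instruction. */
struct cfg_insn { enum cfg_kind kind; int off; };

enum cfg_color { CFG_WHITE, CFG_GREY, CFG_BLACK };

static int cfg_dfs(const struct cfg_insn *p, int n, int i, enum cfg_color *c)
{
    int err = 0;

    if (i < 0 || i >= n)
        return -1;  /* jump out of range, or fell off the end without exit */
    if (c[i] == CFG_GREY)
        return -1;  /* edge back into the current path: a loop */
    if (c[i] == CFG_BLACK)
        return 0;   /* already fully explored */

    c[i] = CFG_GREY;
    switch (p[i].kind) {
    case C_EXIT:
        break;
    case C_GOTO:
        err = cfg_dfs(p, n, i + 1 + p[i].off, c);
        break;
    case C_COND_JMP:
        err = cfg_dfs(p, n, i + 1, c);  /* fall-through edge */
        if (!err)
            err = cfg_dfs(p, n, i + 1 + p[i].off, c);  /* taken edge */
        break;
    default:
        err = cfg_dfs(p, n, i + 1, c);
        break;
    }
    c[i] = CFG_BLACK;
    return err;
}

/* Returns 0 if the program is a DAG in which every insn is reachable. */
static int check_cfg_sketch(const struct cfg_insn *p, int n)
{
    enum cfg_color c[CFG_MAX_INSNS] = { CFG_WHITE };

    if (n <= 0 || n > CFG_MAX_INSNS)
        return -1;
    if (cfg_dfs(p, n, 0, c))
        return -1;
    for (int i = 0; i < n; i++)
        if (c[i] == CFG_WHITE)
            return -1;  /* never visited: unreachable instruction */
    return 0;
}
```

      A grey node reached again signals a back-edge, a node still white after the search is unreachable, and an index outside the program covers both out-of-bounds jumps and a missing final exit.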
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      bpf: handle pseudo BPF_LD_IMM64 insn · 0246e64d
      Alexei Starovoitov committed
      eBPF programs passed from userspace are using pseudo BPF_LD_IMM64 instructions
      to refer to process-local map_fd. Scan the program for such instructions and
      if FDs are valid, convert them to 'struct bpf_map' pointers which will be used
      by verifier to check access to maps in bpf_map_lookup/update() calls.
      If program passes verifier, convert pseudo BPF_LD_IMM64 into generic by dropping
      BPF_PSEUDO_MAP_FD flag.
      
      Note that eBPF interpreter is generic and knows nothing about pseudo insns.
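      The two passes described above could be sketched as follows. The instruction struct is a reduced stand-in for `struct bpf_insn`, and the fd table (`fake_maps`, `fd_to_map`) is a hypothetical replacement for looking up a process's map file descriptors. The sketch assumes the 64-bit immediate is split across the `imm` fields of two consecutive instruction slots, low half first.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_OP_LD_IMM64   0x18  /* BPF_LD | BPF_IMM | BPF_DW */
#define SK_PSEUDO_MAP_FD 1     /* marks imm as a map fd, not a constant */

/* Reduced stand-in for struct bpf_insn. */
struct sk_insn {
    uint8_t code, dst_reg, src_reg;
    int16_t off;
    int32_t imm;
};

/* Hypothetical fd table standing in for the process's open map fds. */
struct fake_map { int id; };
static struct fake_map fake_maps[2] = { {100}, {101} };

static struct fake_map *fd_to_map(int fd)
{
    return (fd >= 0 && fd < 2) ? &fake_maps[fd] : NULL;
}

/* Pass 1 (before verification): replace map fds with map pointers. */
static int replace_map_fd_with_map_ptr(struct sk_insn *insn, int cnt)
{
    for (int i = 0; i < cnt; i++) {
        if (insn[i].code != SK_OP_LD_IMM64)
            continue;
        if (i + 1 >= cnt)
            return -1;  /* ld_imm64 needs a second slot */
        if (insn[i].src_reg == SK_PSEUDO_MAP_FD) {
            struct fake_map *m = fd_to_map(insn[i].imm);
            uint64_t addr = (uint64_t)(uintptr_t)m;

            if (!m)
                return -1;  /* invalid fd: reject the program */
            insn[i].imm = (int32_t)(uint32_t)addr;
            insn[i + 1].imm = (int32_t)(uint32_t)(addr >> 32);
        }
        i++;  /* skip the second half of the instruction */
    }
    return 0;
}

/* Pass 2 (after verification): drop the pseudo flag. */
static void convert_pseudo_ld_imm64(struct sk_insn *insn, int cnt)
{
    for (int i = 0; i + 1 < cnt; i++)
        if (insn[i].code == SK_OP_LD_IMM64) {
            insn[i].src_reg = 0;
            i++;
        }
}

/* Reassemble the 64-bit immediate of an ld_imm64 pair into a pointer. */
static struct fake_map *insn_map_ptr(const struct sk_insn *insn)
{
    uint64_t addr = (uint32_t)insn[0].imm |
                    ((uint64_t)(uint32_t)insn[1].imm << 32);
    return (struct fake_map *)(uintptr_t)addr;
}
```

      After the second pass the instruction is an ordinary 64-bit immediate load, which is why the interpreter never needs to know about the pseudo form.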
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      bpf: verifier (add ability to receive verification log) · cbd35700
      Alexei Starovoitov committed
      add optional attributes for BPF_PROG_LOAD syscall:
      union bpf_attr {
          struct {
      	...
      	__u32         log_level; /* verbosity level of eBPF verifier */
      	__u32         log_size;  /* size of user buffer */
      	__aligned_u64 log_buf;   /* user supplied 'char *buffer' */
          };
      };
      
      when log_level > 0 the verifier will return its verification log in the user
      supplied buffer 'log_buf' which can be used by program author to analyze why
      verifier rejected given program.
      
      'Understanding eBPF verifier messages' section of Documentation/networking/filter.txt
      provides several examples of these messages, like the program:
      
        BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
        BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
        BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
        BPF_LD_MAP_FD(BPF_REG_1, 0),
        BPF_CALL_FUNC(BPF_FUNC_map_lookup_elem),
        BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
        BPF_ST_MEM(BPF_DW, BPF_REG_0, 4, 0),
        BPF_EXIT_INSN(),
      
      will be rejected with the following multi-line message in log_buf:
      
        0: (7a) *(u64 *)(r10 -8) = 0
        1: (bf) r2 = r10
        2: (07) r2 += -8
        3: (b7) r1 = 0
        4: (85) call 1
        5: (15) if r0 == 0x0 goto pc+1
         R0=map_ptr R10=fp
        6: (7a) *(u64 *)(r0 +4) = 0
        misaligned access off 4 size 8
      
      The format of the output can change at any time as verifier evolves.
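      The final line of that log comes from an alignment check on the store at insn 6: offset 4 is not a multiple of the 8-byte access size. A minimal sketch of such a check, writing its message into a caller-supplied buffer, is shown here. The function name and signature are hypothetical; only the message text is taken from the log above.

```c
#include <stddef.h>
#include <stdio.h>

/* Returns 0 if the access is naturally aligned; otherwise writes a
 * message into log and returns -1. */
static int check_alignment_sketch(int off, int size, char *log, size_t log_size)
{
    if (off % size != 0) {
        snprintf(log, log_size, "misaligned access off %d size %d", off, size);
        return -1;
    }
    return 0;
}
```

      The store at insn 0 in the same program (offset -8, size 8) passes this check, while offset 4 with size 8 does not.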
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      bpf: verifier (add docs) · 51580e79
      Alexei Starovoitov committed
      this patch adds all of the eBPF verifier documentation and an empty bpf_check()
      
      The end goal for the verifier is to statically check safety of the program.
      
      Verifier will catch:
      - loops
      - out of range jumps
      - unreachable instructions
      - invalid instructions
      - uninitialized register access
      - uninitialized stack access
      - misaligned stack access
      - out of range stack access
      - invalid calling convention
      
      More details in Documentation/networking/filter.txt
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>