1. 03 Aug, 2018 15 commits
    • D
      Merge branch 'bpf-cgroup-local-storage' · 82c018d7
      Daniel Borkmann committed
      Roman Gushchin says:
      
      ====================
      This patchset implements cgroup local storage for bpf programs.
      The main idea is to provide a fast accessible memory for storing
      various per-cgroup data, e.g. number of transmitted packets.
      
      To userspace, cgroup local storage looks like a special type of map,
      and is accessible through the generic bpf maps API for reading and
      updating the data. The (cgroup inode id, attachment type) pair
      is used as a map key.
      
      A user can't create new entries or destroy existing entries;
      it happens automatically when a user attaches/detaches a bpf program
      to a cgroup.
      
      From a bpf program's point of view, cgroup storage is accessible
      without lookup using the special get_local_storage() helper function.
      It takes a map fd as an argument. It always returns a valid pointer
      to the corresponding memory area.
      
      To implement such lookup-free access, a pointer to the cgroup
      storage is saved for each attachment of a bpf program to a cgroup,
      if required by the program. Before the program runs, the pointer is
      stored in a special global per-cpu variable, which is accessible from
      the get_local_storage() helper.
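
The lookup-free access described above can be illustrated with a small userspace sketch. This is not the kernel code: a thread-local variable stands in for the per-cpu variable, and the names `struct storage`, `get_local_storage_sketch()`, `count_bytes_prog()` and `run_attached()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-cgroup storage area. */
struct storage {
	uint64_t bytes;
};

/* A thread-local variable stands in for the kernel's per-cpu variable. */
static _Thread_local struct storage *current_storage;

/* Stand-in for the helper: no lookup is done and NULL is never returned. */
struct storage *get_local_storage_sketch(void)
{
	assert(current_storage != NULL); /* published by the runner below */
	return current_storage;
}

/* A "program" that accounts bytes without any map lookup. */
int count_bytes_prog(uint32_t len)
{
	struct storage *s = get_local_storage_sketch();

	s->bytes += len;
	return 1; /* allow the packet */
}

/* Runner: publish the attachment's storage pointer, then run the program. */
int run_attached(struct storage *attached, uint32_t len)
{
	current_storage = attached;
	return count_bytes_prog(len);
}
```

Because the runner publishes the pointer before every run, the same program body updates a different storage area depending on which attachment it runs for.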
      
      This patchset implements only cgroup local storage; however, the API
      is intentionally made extensible to support other local storage types
      in the future: e.g. thread local storage, socket local storage, etc.
      
      v7->v6:
        - fixed a use-after-free bug, caused by not clearing
          prog->aux->cgroup_storage pointer after releasing the map
      
      v6->v5:
        - fixed an error with returning -EINVAL instead of a pointer
      
      v5->v4:
        - fixed an issue in verifier (test that flags == 0 properly)
        - added a corresponding test
        - added a note about synchronization, sync docs to tools/uapi/...
        - switched the cgroup test to use XADD
        - added a check for attr->max_entries to be 0, and attr->map_flags
          to be sane
        - use bpf_uncharge_memlock() in bpf_uncharge_memlock()
        - rebased to bpf-next
      
      v4->v3:
        - fixed a leak in cgroup attachment code (discovered by Daniel)
        - cgroup storage map will be released if the corresponding
          bpf program fails to load for any reason
        - introduced bpf_uncharge_memlock() helper
      
      v3->v2:
        - fixed more build and sparse issues
        - rebased to bpf-next
      
      v2->v1:
        - fixed build issues
        - removed explicit rlimit calls in patch 14
        - rebased to bpf-next
      ====================
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      samples/bpf: extend test_cgrp2_attach2 test to use cgroup storage · 28ba0687
      Roman Gushchin committed
      The test_cgrp2_attach2 test covers bpf cgroup attachment code well,
      so let's re-use it for testing allocation/releasing of cgroup storage.
      
      The extension is pretty straightforward: the bpf program will use
      the cgroup storage to save the number of transmitted bytes.
      
      Expected output:
        $ ./test_cgrp2_attach2
        Attached DROP prog. This ping in cgroup /foo should fail...
        ping: sendmsg: Operation not permitted
        Attached DROP prog. This ping in cgroup /foo/bar should fail...
        ping: sendmsg: Operation not permitted
        Attached PASS prog. This ping in cgroup /foo/bar should pass...
        Detached PASS from /foo/bar while DROP is attached to /foo.
        This ping in cgroup /foo/bar should fail...
        ping: sendmsg: Operation not permitted
        Attached PASS from /foo/bar and detached DROP from /foo.
        This ping in cgroup /foo/bar should pass...
        ### override:PASS
        ### multi:PASS
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      selftests/bpf: add a cgroup storage test · 68cfa3ac
      Roman Gushchin committed
      Implement a test to cover the cgroup storage functionality.
      The test implements a bpf program which drops every second packet
      by using the cgroup storage as a persistent storage.
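
The program logic described here can be sketched in plain C as follows. This is only an illustration, not the test's source: in a real bpf program the counter pointer would come from bpf_get_local_storage(), whereas here it is passed in. The function name and the choice of which of the two packets passes first are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Verdicts in the style of a cgroup skb program: 1 = pass, 0 = drop. */
enum { VERDICT_DROP = 0, VERDICT_PASS = 1 };

/*
 * Hypothetical program body. The counter lives in the cgroup storage,
 * so its value persists between invocations of the program.
 */
int every_second_packet(uint64_t *counter)
{
	assert(counter != NULL);
	*counter += 1;
	/* Odd counts pass, even counts are dropped. */
	return (*counter & 1) ? VERDICT_PASS : VERDICT_DROP;
}
```

Since the verdict depends only on the stored counter, rewriting the counter from outside changes the next verdict.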
      
      The test also uses the userspace API to check the data
      in the cgroup storage, alter it, and check that the loaded
      and attached bpf program sees the update.
      
      Expected output:
        $ ./test_cgroup_storage
        test_cgroup_storage:PASS
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      selftests/bpf: add verifier cgroup storage tests · d4c9f573
      Roman Gushchin committed
      Add the following verifier tests to cover the cgroup storage
      functionality:
      1) valid access to the cgroup storage
      2) invalid access: use regular hashmap instead of cgroup storage map
      3) invalid access: use invalid map fd
      4) invalid access: try to access memory after the cgroup storage
      5) invalid access: try to access memory before the cgroup storage
      6) invalid access: call get_local_storage() with non-zero flags
      
      For tests 2)-6) check returned error strings.
      
      Expected output:
        $ ./test_verifier
        #0/u add+sub+mul OK
        #0/p add+sub+mul OK
        #1/u DIV32 by 0, zero check 1 OK
        ...
        #280/p valid cgroup storage access OK
        #281/p invalid cgroup storage access 1 OK
        #282/p invalid cgroup storage access 2 OK
        #283/p invalid per-cgroup storage access 3 OK
        #284/p invalid cgroup storage access 4 OK
        #285/p invalid cgroup storage access 5 OK
        ...
        #649/p pass modified ctx pointer to helper, 2 OK
        #650/p pass modified ctx pointer to helper, 3 OK
        Summary: 901 PASSED, 0 SKIPPED, 0 FAILED
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpf/test_run: support cgroup local storage · f42ee093
      Roman Gushchin committed
      Allocate a temporary cgroup storage to use for bpf program test runs.
      
      Because the test program is not actually attached to a cgroup,
      the storage is allocated manually just for the execution
      of the bpf program.
      
      If the program is executed multiple times, the storage is not zeroed
      on each run, emulating multiple runs of the program, attached to
      a real cgroup.
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpftool: add support for CGROUP_STORAGE maps · 34a6bbb8
      Roman Gushchin committed
      Add BPF_MAP_TYPE_CGROUP_STORAGE maps to the list
      of map types which bpftool recognizes.
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpf: sync bpf.h to tools/ · c419cf52
      Roman Gushchin committed
      Sync cgroup storage related changes:
      1) new BPF_MAP_TYPE_CGROUP_STORAGE map type
      2) struct bpf_cgroup_storage_key definition
      3) get_local_storage() helper
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpf: introduce the bpf_get_local_storage() helper function · cd339431
      Roman Gushchin committed
      The bpf_get_local_storage() helper function is used
      to get a pointer to the bpf local storage from a bpf program.
      
      It takes a pointer to a storage map and flags as arguments.
      Right now it accepts only cgroup storage maps, and the flags
      argument has to be 0. In the future it can be extended to support
      other types of local storage: e.g. thread local storage etc.
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpf: don't allow create maps of cgroup local storages · 7b5dd2bd
      Roman Gushchin committed
      As there is a one-to-one relation between a bpf program
      and a cgroup local storage map, there is no sense in
      creating a map of cgroup local storage maps.
      
      Forbid it explicitly to avoid possible side effects.
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpf/verifier: introduce BPF_PTR_TO_MAP_VALUE · 3e6a4b3e
      Roman Gushchin committed
      BPF_MAP_TYPE_CGROUP_STORAGE maps are special in a way
      that the access from the bpf program side is lookup-free.
      That means the result is guaranteed to be a valid
      pointer to the cgroup storage; no NULL-check is required.
      
      This patch introduces the BPF_PTR_TO_MAP_VALUE return type,
      which is required to make the verifier accept programs
      that do not check the map value pointer for being NULL.
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpf: extend bpf_prog_array to store pointers to the cgroup storage · 394e40a2
      Roman Gushchin committed
      This patch converts bpf_prog_array from an array of prog pointers
      to the array of struct bpf_prog_array_item elements.
      
      This allows a cgroup storage pointer to be saved for each bpf program
      effectively attached to a cgroup.
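
A minimal sketch of the idea follows: each array element carries a program together with the storage for that attachment. It is a userspace illustration, not the kernel definition. The type and function names are hypothetical, programs are modelled as function pointers, and both the NULL-terminated array and the AND-ing of verdicts are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct storage {
	uint64_t hits;
};

/* A "program" is just a function pointer in this sketch. */
typedef int (*prog_fn)(struct storage *s);

/* Array element: the program plus the storage for this attachment. */
struct prog_array_item {
	prog_fn prog;
	struct storage *storage; /* NULL if the program uses no storage */
};

int count_hit(struct storage *s) { s->hits++; return 1; }
int always_pass(struct storage *s) { (void)s; return 1; }
int always_drop(struct storage *s) { (void)s; return 0; }

/* Run items until a NULL prog terminates the array; AND the verdicts. */
int run_prog_array(const struct prog_array_item *items)
{
	int ret = 1;

	assert(items != NULL);
	for (; items->prog != NULL; items++)
		ret &= items->prog(items->storage);
	return ret;
}
```

Keeping the storage pointer next to the program pointer means the runner can hand over the right storage while walking the array, with no separate lookup per program.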
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpf: allocate cgroup storage entries on attaching bpf programs · d7bf2c10
      Roman Gushchin committed
      If a bpf program is using cgroup local storage, allocate
      a bpf_cgroup_storage structure automatically on attaching the program
      to a cgroup and save the pointer into the corresponding bpf_prog_list
      entry.
      Similarly, release the cgroup local storage when the bpf program
      is detached.
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpf: pass a pointer to a cgroup storage using pcpu variable · aa0ad5b0
      Roman Gushchin committed
      This commit introduces the bpf_cgroup_storage_set() helper,
      which will be used to pass a pointer to a cgroup storage
      to the bpf helper.
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpf: introduce cgroup storage maps · de9cbbaa
      Roman Gushchin committed
      This commit introduces BPF_MAP_TYPE_CGROUP_STORAGE maps:
      a special type of map that implements the cgroup storage.
      
      From the userspace point of view it's almost a generic
      hash map with the (cgroup inode id, attachment type) pair
      used as a key.
      
      The only difference is that some operations are restricted:
        1) a user can't create new entries,
        2) a user can't remove existing entries.
      
      The lookup from userspace is O(log(n)).
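
To make the key layout and the logarithmic lookup concrete, here is a small sketch. It substitutes a binary search over a sorted array for whatever ordered structure the kernel actually uses. The struct field names and the functions `key_cmp()` and `storage_lookup()` are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the (cgroup inode id, attachment type) pair; names assumed. */
struct storage_key {
	uint64_t cgroup_inode_id;
	uint32_t attach_type;
};

struct storage_entry {
	struct storage_key key;
	uint64_t value;
};

/* Total order on keys: inode id first, then attachment type. */
int key_cmp(const struct storage_key *a, const struct storage_key *b)
{
	if (a->cgroup_inode_id != b->cgroup_inode_id)
		return a->cgroup_inode_id < b->cgroup_inode_id ? -1 : 1;
	if (a->attach_type != b->attach_type)
		return a->attach_type < b->attach_type ? -1 : 1;
	return 0;
}

/* Binary search over entries sorted by key_cmp(); NULL if absent. */
struct storage_entry *storage_lookup(struct storage_entry *entries, size_t n,
				     const struct storage_key *key)
{
	size_t lo = 0, hi = n;

	assert(entries != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int c = key_cmp(key, &entries[mid].key);

		if (c == 0)
			return &entries[mid];
		if (c < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return NULL;
}
```

A lookup that misses returns NULL rather than creating an entry, in line with the restriction that userspace cannot create new entries.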
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • R
      bpf: add ability to charge bpf maps memory dynamically · 0a4c58f5
      Roman Gushchin committed
      This commit extends the existing bpf maps memory charging API
      to support dynamic charging/uncharging.
      
      This is required to account memory used by maps,
      if all entries are created dynamically after
      the map initialization.
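
Dynamic charging can be pictured with the sketch below: pages are charged as entries appear and uncharged as they go away, always checked against a limit. This is not the kernel API. `struct mem_account`, `charge_pages()` and `uncharge_pages()` are hypothetical names, and returning -EPERM on failure is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accounting state for one user. */
struct mem_account {
	uint64_t locked_pages; /* pages currently charged */
	uint64_t limit_pages;  /* stand-in for the memlock limit */
};

/* Charge pages; fail and leave the state unchanged if over the limit. */
int charge_pages(struct mem_account *acct, uint64_t pages)
{
	assert(acct != NULL && acct->locked_pages <= acct->limit_pages);
	if (pages > acct->limit_pages - acct->locked_pages)
		return -EPERM;
	acct->locked_pages += pages;
	return 0;
}

/* Return previously charged pages, e.g. when an entry is released. */
void uncharge_pages(struct mem_account *acct, uint64_t pages)
{
	assert(acct != NULL && pages <= acct->locked_pages);
	acct->locked_pages -= pages;
}
```

With a limit of 4 pages, charging 3 succeeds, a further charge of 2 is refused, and the same charge succeeds once 2 pages have been uncharged.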
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  2. 01 Aug, 2018 1 commit
    • A
      bpf: verifier: MOV64 don't mark dst reg unbounded · fbeb1603
      Arthur Fabre committed
      When check_alu_op() handles a BPF_MOV64 between two registers,
      it calls check_reg_arg(DST_OP) on the dst register, marking it
      as unbounded. If the src and dst register are the same, this
      marks the src as unbounded, which can lead to unexpected errors
      for further checks that rely on bounds info. For example:
      
      	BPF_MOV64_IMM(BPF_REG_2, 0),
      	BPF_MOV64_REG(BPF_REG_2, BPF_REG_2),
      	BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_2),
      	BPF_MOV64_IMM(BPF_REG_0, 0),
      	BPF_EXIT_INSN(),
      
      Results in:
      
      	"math between ctx pointer and register with unbounded
      	min value is not allowed"
      
      check_alu_op() now uses check_reg_arg(DST_OP_NO_MARK), and MOVs
      that need to mark the dst register (MOVIMM, MOV32) do so.
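
The aliasing problem can be reproduced with a toy model of register state. This is a deliberately simplified sketch, not verifier code: `struct reg_state` tracks only whether bounds are known, and `mov64_mark_first()` and `mov64_no_mark()` are hypothetical names for the old and new orderings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Heavily simplified register state: "are bounds known" plus a value. */
struct reg_state {
	bool bounded;
	int64_t value;
};

void mark_unbounded(struct reg_state *r)
{
	r->bounded = false;
	r->value = 0;
}

/* Old ordering: mark dst first, then copy src. Breaks when dst == src. */
void mov64_mark_first(struct reg_state *regs, int dst, int src)
{
	assert(dst >= 0 && src >= 0);
	mark_unbounded(&regs[dst]);
	regs[dst] = regs[src]; /* src was just clobbered if it aliases dst */
}

/* New ordering: a 64-bit reg-to-reg MOV only copies the source state. */
void mov64_no_mark(struct reg_state *regs, int dst, int src)
{
	assert(dst >= 0 && src >= 0);
	regs[dst] = regs[src];
}
```

With distinct registers both orderings give the same result. Only the aliased case loses the bounds, which is why the added tests cover both dst == src and dst != src.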
      
      Added a test case for MOV64 dst == src, and dst != src.
      Signed-off-by: Arthur Fabre <afabre@cloudflare.com>
      Acked-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  3. 31 Jul, 2018 8 commits
  4. 27 Jul, 2018 16 commits