1. 16 1月, 2020 2 次提交
  2. 10 1月, 2020 2 次提交
  3. 20 12月, 2019 1 次提交
    • A
      bpf: Support replacing cgroup-bpf program in MULTI mode · 7dd68b32
      Andrey Ignatov 提交于
      The common use-case in production is to have multiple cgroup-bpf
      programs per attach type that cover multiple use-cases. Such programs
      are attached with BPF_F_ALLOW_MULTI and can be maintained by different
      people.
      
      Order of programs usually matters, for example imagine two egress
      programs: the first one drops packets and the second one counts packets.
      If they're swapped the result of counting program will be different.
      
      It brings operational challenges with updating cgroup-bpf program(s)
      attached with BPF_F_ALLOW_MULTI since there is no way to replace a
      program:
      
      * One way to update is to detach all programs first and then attach the
        new version(s) again in the right order. This introduces an
        interruption in the work a program is doing and may not be acceptable
        (e.g. if it's egress firewall);
      
      * Another way is attach the new version of a program first and only then
        detach the old version. This introduces the time interval when two
        versions of same program are working, what may not be acceptable if a
        program is not idempotent. It also imposes additional burden on
        program developers to make sure that two versions of their program can
        co-exist.
      
      Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH
      command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI
      flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag
      and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to
      replace
      
      That way user can replace any program among those attached with
      BPF_F_ALLOW_MULTI flag without the problems described above.
      
      Details of the new API:
      
      * If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid
        descriptor of BPF program, BPF_PROG_ATTACH will return corresponding
        error (EINVAL or EBADF).
      
      * If replace_bpf_fd has valid descriptor of BPF program but such a
        program is not attached to specified cgroup, BPF_PROG_ATTACH will
        return ENOENT.
      
      BPF_F_REPLACE is introduced to make the user intent clear, since
      replace_bpf_fd alone can't be used for this (its default value, 0, is a
      valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
      future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).
      Signed-off-by: NAndrey Ignatov <rdna@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Acked-by: NAndrii Narkyiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com
      7dd68b32
  4. 18 11月, 2019 1 次提交
    • A
      bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY · fc970227
      Andrii Nakryiko 提交于
      Add ability to memory-map contents of BPF array map. This is extremely useful
      for working with BPF global data from userspace programs. It allows to avoid
      typical bpf_map_{lookup,update}_elem operations, improving both performance
      and usability.
      
      There had to be special considerations for map freezing, to avoid having
      writable memory view into a frozen map. To solve this issue, map freezing and
      mmap-ing is happening under mutex now:
        - if map is already frozen, no writable mapping is allowed;
        - if map has writable memory mappings active (accounted in map->writecnt),
          map freezing will keep failing with -EBUSY;
        - once number of writable memory mappings drops to zero, map freezing can be
          performed again.
      
      Only non-per-CPU plain arrays are supported right now. Maps with spinlocks
      can't be memory mapped either.
      
      For BPF_F_MMAPABLE array, memory allocation has to be done through vmalloc()
      to be mmap()'able. We also need to make sure that array data memory is
      page-sized and page-aligned, so we over-allocate memory in such a way that
      struct bpf_array is at the end of a single page of memory with array->value
      being aligned with the start of the second page. On deallocation we need to
      accomodate this memory arrangement to free vmalloc()'ed memory correctly.
      
      One important consideration regarding how memory-mapping subsystem functions.
      Memory-mapping subsystem provides few optional callbacks, among them open()
      and close().  close() is called for each memory region that is unmapped, so
      that users can decrease their reference counters and free up resources, if
      necessary. open() is *almost* symmetrical: it's called for each memory region
      that is being mapped, **except** the very first one. So bpf_map_mmap does
      initial refcnt bump, while open() will do any extra ones after that. Thus
      number of close() calls is equal to number of open() calls plus one more.
      Signed-off-by: NAndrii Nakryiko <andriin@fb.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Link: https://lore.kernel.org/bpf/20191117172806.2195367-4-andriin@fb.com
      fc970227
  5. 16 11月, 2019 2 次提交
  6. 03 11月, 2019 1 次提交
    • D
      bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers · 6ae08ae3
      Daniel Borkmann 提交于
      The current bpf_probe_read() and bpf_probe_read_str() helpers are broken
      in that they assume they can be used for probing memory access for kernel
      space addresses /as well as/ user space addresses.
      
      However, plain use of probe_kernel_read() for both cases will attempt to
      always access kernel space address space given access is performed under
      KERNEL_DS and some archs in-fact have overlapping address spaces where a
      kernel pointer and user pointer would have the /same/ address value and
      therefore accessing application memory via bpf_probe_read{,_str}() would
      read garbage values.
      
      Lets fix BPF side by making use of recently added 3d708182 ("uaccess:
      Add non-pagefault user-space read functions"). Unfortunately, the only way
      to fix this status quo is to add dedicated bpf_probe_read_{user,kernel}()
      and bpf_probe_read_{user,kernel}_str() helpers. The bpf_probe_read{,_str}()
      helpers are kept as-is to retain their current behavior.
      
      The two *_user() variants attempt the access always under USER_DS set, the
      two *_kernel() variants will -EFAULT when accessing user memory if the
      underlying architecture has non-overlapping address ranges, also avoiding
      throwing the kernel warning via 00c42373 ("x86-64: add warning for
      non-canonical user access address dereferences").
      
      Fixes: a5e8c070 ("bpf: add bpf_probe_read_str helper")
      Fixes: 2541517c ("tracing, perf: Implement BPF programs attached to kprobes")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Acked-by: NAndrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/796ee46e948bc808d54891a1108435f8652c6ca4.1572649915.git.daniel@iogearbox.net
      6ae08ae3
  7. 31 10月, 2019 1 次提交
  8. 17 10月, 2019 2 次提交
  9. 07 10月, 2019 1 次提交
  10. 28 8月, 2019 1 次提交
  11. 22 8月, 2019 1 次提交
  12. 21 8月, 2019 1 次提交
  13. 18 8月, 2019 1 次提交
  14. 10 8月, 2019 1 次提交
  15. 31 7月, 2019 1 次提交
  16. 30 7月, 2019 1 次提交
  17. 26 7月, 2019 2 次提交
  18. 16 7月, 2019 1 次提交
  19. 15 7月, 2019 1 次提交
  20. 08 7月, 2019 1 次提交
  21. 03 7月, 2019 1 次提交
  22. 29 6月, 2019 1 次提交
  23. 28 6月, 2019 1 次提交
  24. 15 6月, 2019 2 次提交
  25. 14 6月, 2019 1 次提交
  26. 11 6月, 2019 1 次提交
  27. 07 6月, 2019 1 次提交
  28. 25 5月, 2019 2 次提交
  29. 13 5月, 2019 1 次提交
  30. 28 4月, 2019 1 次提交
  31. 27 4月, 2019 1 次提交
  32. 16 4月, 2019 1 次提交
  33. 13 4月, 2019 1 次提交