1. 17 8月, 2021 1 次提交
    • A
      bpf: Refactor BPF_PROG_RUN into a function · fb7dd8bc
      Andrii Nakryiko 提交于
      Turn BPF_PROG_RUN into a proper always inlined function. No functional and
      performance changes are intended, but it makes it much easier to understand
      what's going on with how BPF programs are actually get executed. It's more
      obvious what types and callbacks are expected. Also extra () around input
      parameters can be dropped, as well as `__` variable prefixes intended to avoid
      naming collisions, which makes the code simpler to read and write.
      
      This refactoring also highlighted one extra issue. BPF_PROG_RUN is both
      a macro and an enum value (BPF_PROG_RUN == BPF_PROG_TEST_RUN). Turning
      BPF_PROG_RUN into a function causes naming conflict compilation error. So
      rename BPF_PROG_RUN into lower-case bpf_prog_run(), similar to
      bpf_prog_run_xdp(), bpf_prog_run_pin_on_cpu(), etc. All existing callers of
      BPF_PROG_RUN, the macro, are switched to bpf_prog_run() explicitly.
      Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20210815070609.987780-2-andrii@kernel.org
      fb7dd8bc
  2. 10 8月, 2021 1 次提交
  3. 05 8月, 2021 1 次提交
  4. 17 7月, 2021 1 次提交
    • A
      bpf: Add ambient BPF runtime context stored in current · c7603cfa
      Andrii Nakryiko 提交于
      b910eaaa ("bpf: Fix NULL pointer dereference in bpf_get_local_storage()
      helper") fixed the problem with cgroup-local storage use in BPF by
      pre-allocating per-CPU array of 8 cgroup storage pointers to accommodate
      possible BPF program preemptions and nested executions.
      
      While this seems to work good in practice, it introduces new and unnecessary
      failure mode in which not all BPF programs might be executed if we fail to
      find an unused slot for cgroup storage, however unlikely it is. It might also
      not be so unlikely when/if we allow sleepable cgroup BPF programs in the
      future.
      
      Further, the way that cgroup storage is implemented as ambiently-available
      property during entire BPF program execution is a convenient way to pass extra
      information to BPF program and helpers without requiring user code to pass
      around extra arguments explicitly. So it would be good to have a generic
      solution that can allow implementing this without arbitrary restrictions.
      Ideally, such solution would work for both preemptable and sleepable BPF
      programs in exactly the same way.
      
      This patch introduces such solution, bpf_run_ctx. It adds one pointer field
      (bpf_ctx) to task_struct. This field is maintained by BPF_PROG_RUN family of
      macros in such a way that it always stays valid throughout BPF program
      execution. BPF program preemption is handled by remembering previous
      current->bpf_ctx value locally while executing nested BPF program and
      restoring old value after nested BPF program finishes. This is handled by two
      helper functions, bpf_set_run_ctx() and bpf_reset_run_ctx(), which are
      supposed to be used before and after BPF program runs, respectively.
      
      Restoring old value of the pointer handles preemption, while bpf_run_ctx
      pointer being a property of current task_struct naturally solves this problem
      for sleepable BPF programs by "following" BPF program execution as it is
      scheduled in and out of CPU. It would even allow CPU migration of BPF
      programs, even though it's not currently allowed by BPF infra.
      
      This patch cleans up cgroup local storage handling as a first application. The
      design itself is generic, though, with bpf_run_ctx being an empty struct that
      is supposed to be embedded into a specific struct for a given BPF program type
      (bpf_cg_run_ctx in this case). Follow up patches are planned that will expand
      this mechanism for other uses within tracing BPF programs.
      
      To verify that this change doesn't revert the fix to the original cgroup
      storage issue, I ran the same repro as in the original report ([0]) and didn't
      get any problems. Replacing bpf_reset_run_ctx(old_run_ctx) with
      bpf_reset_run_ctx(NULL) triggers the issue pretty quickly (so repro does work).
      
        [0] https://lore.kernel.org/bpf/YEEvBUiJl2pJkxTd@krava/
      
      Fixes: b910eaaa ("bpf: Fix NULL pointer dereference in bpf_get_local_storage() helper")
      Signed-off-by: NAndrii Nakryiko <andrii@kernel.org>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20210712230615.3525979-1-andrii@kernel.org
      c7603cfa
  5. 12 7月, 2021 1 次提交
    • X
      bpf, test: fix NULL pointer dereference on invalid expected_attach_type · 5e21bb4e
      Xuan Zhuo 提交于
      These two types of XDP progs (BPF_XDP_DEVMAP, BPF_XDP_CPUMAP) will not be
      executed directly in the driver, therefore we should also not directly
      run them from here. To run in these two situations, there must be further
      preparations done, otherwise these may cause a kernel panic.
      
      For more details, see also dev_xdp_attach().
      
        [   46.982479] BUG: kernel NULL pointer dereference, address: 0000000000000000
        [   46.984295] #PF: supervisor read access in kernel mode
        [   46.985777] #PF: error_code(0x0000) - not-present page
        [   46.987227] PGD 800000010dca4067 P4D 800000010dca4067 PUD 10dca6067 PMD 0
        [   46.989201] Oops: 0000 [#1] SMP PTI
        [   46.990304] CPU: 7 PID: 562 Comm: a.out Not tainted 5.13.0+ #44
        [   46.992001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/24
        [   46.995113] RIP: 0010:___bpf_prog_run+0x17b/0x1710
        [   46.996586] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02
        [   47.001562] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246
        [   47.003115] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000
        [   47.005163] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98
        [   47.007135] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff
        [   47.009171] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98
        [   47.011172] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8
        [   47.013244] FS:  00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000
        [   47.015705] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [   47.017475] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0
        [   47.019558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [   47.021595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [   47.023574] PKRU: 55555554
        [   47.024571] Call Trace:
        [   47.025424]  __bpf_prog_run32+0x32/0x50
        [   47.026296]  ? printk+0x53/0x6a
        [   47.027066]  ? ktime_get+0x39/0x90
        [   47.027895]  bpf_test_run.cold.28+0x23/0x123
        [   47.028866]  ? printk+0x53/0x6a
        [   47.029630]  bpf_prog_test_run_xdp+0x149/0x1d0
        [   47.030649]  __sys_bpf+0x1305/0x23d0
        [   47.031482]  __x64_sys_bpf+0x17/0x20
        [   47.032316]  do_syscall_64+0x3a/0x80
        [   47.033165]  entry_SYSCALL_64_after_hwframe+0x44/0xae
        [   47.034254] RIP: 0033:0x7f04a51364dd
        [   47.035133] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 48
        [   47.038768] RSP: 002b:00007fff8f9fc518 EFLAGS: 00000213 ORIG_RAX: 0000000000000141
        [   47.040344] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f04a51364dd
        [   47.041749] RDX: 0000000000000048 RSI: 0000000020002a80 RDI: 000000000000000a
        [   47.043171] RBP: 00007fff8f9fc530 R08: 0000000002049300 R09: 0000000020000100
        [   47.044626] R10: 0000000000000004 R11: 0000000000000213 R12: 0000000000401070
        [   47.046088] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
        [   47.047579] Modules linked in:
        [   47.048318] CR2: 0000000000000000
        [   47.049120] ---[ end trace 7ad34443d5be719a ]---
        [   47.050273] RIP: 0010:___bpf_prog_run+0x17b/0x1710
        [   47.051343] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02
        [   47.054943] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246
        [   47.056068] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000
        [   47.057522] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98
        [   47.058961] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff
        [   47.060390] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98
        [   47.061803] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8
        [   47.063249] FS:  00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000
        [   47.065070] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [   47.066307] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0
        [   47.067747] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [   47.069217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [   47.070652] PKRU: 55555554
        [   47.071318] Kernel panic - not syncing: Fatal exception
        [   47.072854] Kernel Offset: disabled
        [   47.073683] ---[ end Kernel panic - not syncing: Fatal exception ]---
      
      Fixes: 92164774 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap")
      Fixes: fbee97fe ("bpf: Add support to attach bpf program to a devmap entry")
      Reported-by: NAbaci <abaci@linux.alibaba.com>
      Signed-off-by: NXuan Zhuo <xuanzhuo@linux.alibaba.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: NDust Li <dust.li@linux.alibaba.com>
      Acked-by: NJesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: NDavid Ahern <dsahern@kernel.org>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20210708080409.73525-1-xuanzhuo@linux.alibaba.com
      5e21bb4e
  6. 08 7月, 2021 2 次提交
  7. 19 5月, 2021 2 次提交
  8. 27 3月, 2021 1 次提交
  9. 26 3月, 2021 1 次提交
    • Y
      bpf: Fix NULL pointer dereference in bpf_get_local_storage() helper · b910eaaa
      Yonghong Song 提交于
      Jiri Olsa reported a bug ([1]) in kernel where cgroup local
      storage pointer may be NULL in bpf_get_local_storage() helper.
      There are two issues uncovered by this bug:
        (1). kprobe or tracepoint prog incorrectly sets cgroup local storage
             before prog run,
        (2). due to change from preempt_disable to migrate_disable,
             preemption is possible and percpu storage might be overwritten
             by other tasks.
      
      This issue (1) is fixed in [2]. This patch tried to address issue (2).
      The following shows how things can go wrong:
        task 1:   bpf_cgroup_storage_set() for percpu local storage
               preemption happens
        task 2:   bpf_cgroup_storage_set() for percpu local storage
               preemption happens
        task 1:   run bpf program
      
      task 1 will effectively use the percpu local storage setting by task 2
      which will be either NULL or incorrect ones.
      
      Instead of just one common local storage per cpu, this patch fixed
      the issue by permitting 8 local storages per cpu and each local
      storage is identified by a task_struct pointer. This way, we
      allow at most 8 nested preemption between bpf_cgroup_storage_set()
      and bpf_cgroup_storage_unset(). The percpu local storage slot
      is released (calling bpf_cgroup_storage_unset()) by the same task
      after bpf program finished running.
      bpf_test_run() is also fixed to use the new bpf_cgroup_storage_set()
      interface.
      
      The patch is tested on top of [2] with reproducer in [1].
      Without this patch, kernel will emit error in 2-3 minutes.
      With this patch, after one hour, still no error.
      
       [1] https://lore.kernel.org/bpf/CAKH8qBuXCfUz=w8L+Fj74OaUpbosO29niYwTki7e3Ag044_aww@mail.gmail.com/T
       [2] https://lore.kernel.org/bpf/20210309185028.3763817-1-yhs@fb.comSigned-off-by: NYonghong Song <yhs@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Acked-by: NRoman Gushchin <guro@fb.com>
      Link: https://lore.kernel.org/bpf/20210323055146.3334476-1-yhs@fb.com
      b910eaaa
  10. 05 3月, 2021 2 次提交
  11. 14 1月, 2021 1 次提交
    • S
      bpf: Reject too big ctx_size_in for raw_tp test run · 7ac6ad05
      Song Liu 提交于
      syzbot reported a WARNING for allocating too big memory:
      
      WARNING: CPU: 1 PID: 8484 at mm/page_alloc.c:4976 __alloc_pages_nodemask+0x5f8/0x730 mm/page_alloc.c:5011
      Modules linked in:
      CPU: 1 PID: 8484 Comm: syz-executor862 Not tainted 5.11.0-rc2-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__alloc_pages_nodemask+0x5f8/0x730 mm/page_alloc.c:4976
      Code: 00 00 0c 00 0f 85 a7 00 00 00 8b 3c 24 4c 89 f2 44 89 e6 c6 44 24 70 00 48 89 6c 24 58 e8 d0 d7 ff ff 49 89 c5 e9 ea fc ff ff <0f> 0b e9 b5 fd ff ff 89 74 24 14 4c 89 4c 24 08 4c 89 74 24 18 e8
      RSP: 0018:ffffc900012efb10 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 1ffff9200025df66 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000140dc0
      RBP: 0000000000140dc0 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffff81b1f7e1 R11: 0000000000000000 R12: 0000000000000014
      R13: 0000000000000014 R14: 0000000000000000 R15: 0000000000000000
      FS:  000000000190c880(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f08b7f316c0 CR3: 0000000012073000 CR4: 00000000001506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      alloc_pages_current+0x18c/0x2a0 mm/mempolicy.c:2267
      alloc_pages include/linux/gfp.h:547 [inline]
      kmalloc_order+0x2e/0xb0 mm/slab_common.c:837
      kmalloc_order_trace+0x14/0x120 mm/slab_common.c:853
      kmalloc include/linux/slab.h:557 [inline]
      kzalloc include/linux/slab.h:682 [inline]
      bpf_prog_test_run_raw_tp+0x4b5/0x670 net/bpf/test_run.c:282
      bpf_prog_test_run kernel/bpf/syscall.c:3120 [inline]
      __do_sys_bpf+0x1ea9/0x4f10 kernel/bpf/syscall.c:4398
      do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
      entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x440499
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffe1f3bfb18 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440499
      RDX: 0000000000000048 RSI: 0000000020000600 RDI: 000000000000000a
      RBP: 00000000006ca018 R08: 0000000000000000 R09: 00000000004002c8
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401ca0
      R13: 0000000000401d30 R14: 0000000000000000 R15: 0000000000000000
      
      This is because we didn't filter out too big ctx_size_in. Fix it by
      rejecting ctx_size_in that are bigger than MAX_BPF_FUNC_ARGS (12) u64
      numbers.
      
      Fixes: 1b4d60ec ("bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint")
      Reported-by: syzbot+4f98876664c7337a4ae6@syzkaller.appspotmail.com
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Acked-by: NYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20210112234254.1906829-1-songliubraving@fb.com
      7ac6ad05
  12. 09 1月, 2021 2 次提交
  13. 30 9月, 2020 1 次提交
    • S
      bpf: fix raw_tp test run in preempt kernel · 963ec27a
      Song Liu 提交于
      In preempt kernel, BPF_PROG_TEST_RUN on raw_tp triggers:
      
      [   35.874974] BUG: using smp_processor_id() in preemptible [00000000]
      code: new_name/87
      [   35.893983] caller is bpf_prog_test_run_raw_tp+0xd4/0x1b0
      [   35.900124] CPU: 1 PID: 87 Comm: new_name Not tainted 5.9.0-rc6-g615bd02bf #1
      [   35.907358] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      BIOS 1.10.2-1ubuntu1 04/01/2014
      [   35.916941] Call Trace:
      [   35.919660]  dump_stack+0x77/0x9b
      [   35.923273]  check_preemption_disabled+0xb4/0xc0
      [   35.928376]  bpf_prog_test_run_raw_tp+0xd4/0x1b0
      [   35.933872]  ? selinux_bpf+0xd/0x70
      [   35.937532]  __do_sys_bpf+0x6bb/0x21e0
      [   35.941570]  ? find_held_lock+0x2d/0x90
      [   35.945687]  ? vfs_write+0x150/0x220
      [   35.949586]  do_syscall_64+0x2d/0x40
      [   35.953443]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fix this by calling migrate_disable() before smp_processor_id().
      
      Fixes: 1b4d60ec ("bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint")
      Reported-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      963ec27a
  14. 29 9月, 2020 1 次提交
  15. 24 8月, 2020 1 次提交
  16. 04 8月, 2020 2 次提交
  17. 01 7月, 2020 1 次提交
    • Y
      bpf: Add tests for PTR_TO_BTF_ID vs. null comparison · d923021c
      Yonghong Song 提交于
      Add two tests for PTR_TO_BTF_ID vs. null ptr comparison,
      one for PTR_TO_BTF_ID in the ctx structure and the
      other for PTR_TO_BTF_ID after one level pointer chasing.
      In both cases, the test ensures condition is not
      removed.
      
      For example, for this test
       struct bpf_fentry_test_t {
           struct bpf_fentry_test_t *a;
       };
       int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
       {
           if (arg == 0)
               test7_result = 1;
           return 0;
       }
      Before the previous verifier change, we have xlated codes:
        int test7(long long unsigned int * ctx):
        ; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
           0: (79) r1 = *(u64 *)(r1 +0)
        ; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
           1: (b4) w0 = 0
           2: (95) exit
      After the previous verifier change, we have:
        int test7(long long unsigned int * ctx):
        ; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
           0: (79) r1 = *(u64 *)(r1 +0)
        ; if (arg == 0)
           1: (55) if r1 != 0x0 goto pc+4
        ; test7_result = 1;
           2: (18) r1 = map[id:6][0]+48
           4: (b7) r2 = 1
           5: (7b) *(u64 *)(r1 +0) = r2
        ; int BPF_PROG(test7, struct bpf_fentry_test_t *arg)
           6: (b4) w0 = 0
           7: (95) exit
      Signed-off-by: NYonghong Song <yhs@fb.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NJohn Fastabend <john.fastabend@gmail.com>
      Acked-by: NAndrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200630171241.2523875-1-yhs@fb.com
      d923021c
  18. 19 5月, 2020 1 次提交
  19. 15 5月, 2020 1 次提交
  20. 29 3月, 2020 1 次提交
  21. 05 3月, 2020 2 次提交
  22. 04 3月, 2020 1 次提交
  23. 25 2月, 2020 1 次提交
  24. 19 12月, 2019 1 次提交
  25. 14 12月, 2019 2 次提交
  26. 11 12月, 2019 1 次提交
  27. 10 12月, 2019 1 次提交
  28. 19 11月, 2019 1 次提交
  29. 16 11月, 2019 1 次提交
  30. 16 10月, 2019 1 次提交
  31. 26 7月, 2019 2 次提交
  32. 31 5月, 2019 1 次提交