1. 01 Aug, 2018 2 commits
  2. 12 Jun, 2018 1 commit
    • xsk: silence warning on memory allocation failure · a343993c
      Björn Töpel committed
      syzkaller reported a warning from xdp_umem_pin_pages():
      
        WARNING: CPU: 1 PID: 4537 at mm/slab_common.c:996 kmalloc_slab+0x56/0x70 mm/slab_common.c:996
        ...
        __do_kmalloc mm/slab.c:3713 [inline]
        __kmalloc+0x25/0x760 mm/slab.c:3727
        kmalloc_array include/linux/slab.h:634 [inline]
        kcalloc include/linux/slab.h:645 [inline]
        xdp_umem_pin_pages net/xdp/xdp_umem.c:205 [inline]
        xdp_umem_reg net/xdp/xdp_umem.c:318 [inline]
        xdp_umem_create+0x5c9/0x10f0 net/xdp/xdp_umem.c:349
        xsk_setsockopt+0x443/0x550 net/xdp/xsk.c:531
        __sys_setsockopt+0x1bd/0x390 net/socket.c:1935
        __do_sys_setsockopt net/socket.c:1946 [inline]
        __se_sys_setsockopt net/socket.c:1943 [inline]
        __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1943
        do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      This is a warning about attempting to allocate more than
      KMALLOC_MAX_SIZE memory. The request originates from userspace, and if
      the request is too big, the kernel is free to deny its allocation. In
      this patch, the failed allocation attempt is silenced with
      __GFP_NOWARN.
      
      Fixes: c0c77d8f ("xsk: add user memory registration support sockopt")
      Reported-by: syzbot+4abadc5d69117b346506@syzkaller.appspotmail.com
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
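
The policy described in this commit (deny an oversized request from userspace without emitting a warning) can be illustrated with a userspace sketch. This is not the kernel change itself: `pin_array_alloc` and `SKETCH_MAX_ALLOC` are hypothetical names, and the limit is an assumed stand-in for KMALLOC_MAX_SIZE.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed stand-in for KMALLOC_MAX_SIZE; the real value is kernel-specific. */
#define SKETCH_MAX_ALLOC ((size_t)4 << 20)

/*
 * Allocate a zeroed array of n elements of elem_size bytes.
 * Requests above the limit (or that would overflow) are denied
 * quietly with -ENOMEM: no log message, no abort.
 */
static int pin_array_alloc(void **out, size_t n, size_t elem_size)
{
	*out = NULL;
	if (n == 0 || elem_size == 0)
		return -EINVAL;
	if (n > SKETCH_MAX_ALLOC / elem_size)
		return -ENOMEM; /* too large: refuse silently */
	*out = calloc(n, elem_size);
	return *out ? 0 : -ENOMEM;
}
```

A caller only sees the error code, which mirrors how a failed setsockopt would report the denial to userspace.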
  3. 08 Jun, 2018 1 commit
    • bpf, xdp: fix crash in xdp_umem_unaccount_pages · c09290c5
      Daniel Borkmann committed
      syzkaller was able to trigger the following panic for AF_XDP:
      
        BUG: KASAN: null-ptr-deref in atomic64_sub include/asm-generic/atomic-instrumented.h:144 [inline]
        BUG: KASAN: null-ptr-deref in atomic_long_sub include/asm-generic/atomic-long.h:199 [inline]
        BUG: KASAN: null-ptr-deref in xdp_umem_unaccount_pages.isra.4+0x3d/0x80 net/xdp/xdp_umem.c:135
        Write of size 8 at addr 0000000000000060 by task syz-executor246/4527
      
        CPU: 1 PID: 4527 Comm: syz-executor246 Not tainted 4.17.0+ #89
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0x1b9/0x294 lib/dump_stack.c:113
         kasan_report_error mm/kasan/report.c:352 [inline]
         kasan_report.cold.7+0x6d/0x2fe mm/kasan/report.c:412
         check_memory_region_inline mm/kasan/kasan.c:260 [inline]
         check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
         kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278
         atomic64_sub include/asm-generic/atomic-instrumented.h:144 [inline]
         atomic_long_sub include/asm-generic/atomic-long.h:199 [inline]
         xdp_umem_unaccount_pages.isra.4+0x3d/0x80 net/xdp/xdp_umem.c:135
         xdp_umem_reg net/xdp/xdp_umem.c:334 [inline]
         xdp_umem_create+0xd6c/0x10f0 net/xdp/xdp_umem.c:349
         xsk_setsockopt+0x443/0x550 net/xdp/xsk.c:531
         __sys_setsockopt+0x1bd/0x390 net/socket.c:1935
         __do_sys_setsockopt net/socket.c:1946 [inline]
         __se_sys_setsockopt net/socket.c:1943 [inline]
         __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1943
         do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      In xdp_umem_reg(), the call to xdp_umem_account_pages() succeeded
      because the caller had CAP_IPC_LOCK, so no memlock rlimit was charged
      to the current user and umem->user remained NULL. Later, through
      fault injection, syzkaller triggered a failure in either the
      umem->pgs or the umem->pages allocation, so we bail out and undo
      accounting in xdp_umem_unaccount_pages(), where we hit the panic
      because it dereferences umem->user.
      
      The code is pretty close to mm_account_pinned_pages() and
      mm_unaccount_pinned_pages() pair and potentially could reuse
      it even in a later cleanup, and it appears that the initial
      commit c0c77d8f ("xsk: add user memory registration support
      sockopt") got this right while later follow-up introduced the
      bug via a49049ea ("xsk: simplified umem setup").
      
      Fixes: a49049ea ("xsk: simplified umem setup")
      Reported-by: syzbot+979217770b09ebf5c407@syzkaller.appspotmail.com
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
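
The failure mode described above can be sketched in plain C. This is a simplified userspace model, not the kernel code: the `sketch_*` types and functions are hypothetical, and the NULL check in `sketch_unaccount` is an assumption about the kind of guard that avoids the dereference.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_user { long locked_vm; };
struct sketch_umem { struct sketch_user *user; long npgs; };

/*
 * Charge npgs pages to the current user. A privileged caller
 * (modelling CAP_IPC_LOCK) is not charged, so umem->user stays NULL.
 */
static int sketch_account(struct sketch_umem *umem, struct sketch_user *cur,
			  int privileged, long limit)
{
	if (privileged)
		return 0;
	if (cur->locked_vm + umem->npgs > limit)
		return -1;
	cur->locked_vm += umem->npgs;
	umem->user = cur;
	return 0;
}

/* Undo the charge; nothing to undo if no user was ever charged. */
static void sketch_unaccount(struct sketch_umem *umem)
{
	if (!umem->user)
		return; /* without this guard, the privileged path derefs NULL */
	umem->user->locked_vm -= umem->npgs;
	umem->user = NULL;
}
```

The error path after a failed allocation can then call `sketch_unaccount()` unconditionally, whether or not the caller was charged.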
  4. 05 Jun, 2018 4 commits
  5. 04 Jun, 2018 1 commit
    • xsk: new descriptor addressing scheme · bbff2f32
      Björn Töpel committed
      Currently, AF_XDP only supports a fixed frame-size memory scheme where
      each frame is referenced via an index (idx). A user passes the frame
      index to the kernel, and the kernel acts upon the data.  Some NICs,
      however, do not have a fixed frame-size model, instead they have a
      model where a memory window is passed to the hardware and multiple
      frames are filled into that window (referred to as the "type-writer"
      model).
      
      By changing the descriptor format from the current frame index
      addressing scheme, AF_XDP can in the future be extended to support
      these kinds of NICs.
      
      In the index-based model, an idx refers to a frame of size
      frame_size. Addressing a frame in the UMEM is done by offsetting the
      UMEM starting address by a global offset, idx * frame_size + offset.
      Communicating via the fill- and completion-rings is done by means of
      idx.
      
      In this commit, the idx is removed in favor of an address (addr),
      which is a relative address ranging over the UMEM. Converting an
      idx-based address to the new addr is simple: addr = idx * frame_size +
      offset.
      
      We also stop referring to the UMEM "frame" as a frame. Instead it is
      simply called a chunk.
      
      To transfer ownership of a chunk to the kernel, the addr of the chunk
      is passed in the fill-ring. Note that the kernel will mask addr to
      make it chunk aligned, so there is no need for userspace to do
      that. E.g., for a chunk size of 2k, passing an addr of 2048, 2050 or
      3000 to the fill-ring will refer to the same chunk.
      
      On the completion-ring, the addr will match that of the Tx descriptor,
      passed to the kernel.
      
      Changing the descriptor format to use chunks/addr will allow for
      future changes to move to a type-writer based model, where multiple
      frames can reside in one chunk. In this model, passing a single chunk
      into the fill-ring would potentially result in multiple Rx
      descriptors.
      
      This commit changes the uapi of AF_XDP sockets, and updates the
      documentation.
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
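
The conversion and masking described in this commit message can be worked through with a small C sketch. The helper names `idx_to_addr` and `chunk_base` are hypothetical, and the masking assumes the chunk size is a power of two.

```c
#include <stdint.h>

/* Convert an old-style frame index plus offset to the new relative addr. */
static inline uint64_t idx_to_addr(uint32_t idx, uint32_t frame_size,
				   uint32_t offset)
{
	return (uint64_t)idx * frame_size + offset;
}

/*
 * Mask an addr down to the start of its chunk, modelling the alignment
 * the kernel is described as applying to fill-ring entries.
 * Assumes chunk_size is a power of two.
 */
static inline uint64_t chunk_base(uint64_t addr, uint64_t chunk_size)
{
	return addr & ~(chunk_size - 1);
}
```

With a chunk size of 2048, `chunk_base()` maps 2048, 2050 and 3000 all to 2048, matching the example in the message, and `idx_to_addr(3, 2048, 10)` yields 6154.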
  6. 22 May, 2018 2 commits
  7. 18 May, 2018 2 commits
  8. 10 May, 2018 1 commit
  9. 04 May, 2018 4 commits
    • xsk: add umem completion queue support and mmap · fe230832
      Magnus Karlsson committed
      Here, we add another setsockopt for registered user memory (umem)
      called XDP_UMEM_COMPLETION_QUEUE. Using this socket option, the
      process can ask the kernel to allocate a queue (ring buffer) and also
      mmap it (XDP_UMEM_PGOFF_COMPLETION_QUEUE) into the process.
      
      The queue is used to explicitly pass ownership of umem frames from the
      kernel to the user process. This will be used by the TX path to tell user
      space that a certain frame has been transmitted and that user space can
      use it for something else, if it wishes.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • xsk: add support for bind for Rx · 965a9909
      Magnus Karlsson committed
      Here, the bind syscall is added. Binding an AF_XDP socket means
      associating the socket with a umem, a netdev and a queue index. This
      can be done in two ways.
      
      The first way is to create a "socket from scratch": create the umem
      using the XDP_UMEM_REG setsockopt and an associated fill queue with
      XDP_UMEM_FILL_QUEUE, create the Rx queue using the XDP_RX_QUEUE
      setsockopt, then call bind, passing the ifindex and queue index
      ("channel" in ethtool speak).
      
      The second way to bind a socket is to skip the umem/netdev/queue
      index and pass another, already set up AF_XDP socket. The new socket
      will then have the same umem/netdev/queue index as the parent, so it
      will share the same umem. You must also set the flags field in the
      socket address to XDP_SHARED_UMEM.
      
      v2: Use PTR_ERR instead of passing error variable explicitly.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • xsk: add umem fill queue support and mmap · 423f3832
      Magnus Karlsson committed
      Here, we add another setsockopt for registered user memory (umem)
      called XDP_UMEM_FILL_QUEUE. Using this socket option, the process can
      ask the kernel to allocate a queue (ring buffer) and also mmap it
      (XDP_UMEM_PGOFF_FILL_QUEUE) into the process.
      
      The queue is used to explicitly pass ownership of umem frames from the
      user process to the kernel. These frames will in a later patch be
      filled in with Rx packet data by the kernel.
      
      v2: Fixed potential crash in xsk_mmap.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
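
The idea of passing frame ownership through a queue can be sketched as a single-producer, single-consumer ring in C. This is a simplified single-threaded model with hypothetical names (`sketch_ring`, `ring_produce`, `ring_consume`) and an assumed ring size; the real rings are shared with the process via mmap and need memory barriers, which are omitted here.

```c
#include <stdint.h>

#define RING_SIZE 8u /* assumed size; must be a power of two */

struct sketch_ring {
	uint32_t producer;            /* advanced by the side giving up frames */
	uint32_t consumer;            /* advanced by the side taking frames */
	uint32_t entries[RING_SIZE];  /* frame identifiers in flight */
};

/* Hand a frame to the other side; returns -1 if the ring is full. */
static int ring_produce(struct sketch_ring *r, uint32_t frame)
{
	if (r->producer - r->consumer == RING_SIZE)
		return -1;
	r->entries[r->producer & (RING_SIZE - 1)] = frame;
	r->producer++;
	return 0;
}

/* Take ownership of the oldest frame; returns -1 if the ring is empty. */
static int ring_consume(struct sketch_ring *r, uint32_t *frame)
{
	if (r->producer == r->consumer)
		return -1;
	*frame = r->entries[r->consumer & (RING_SIZE - 1)];
	r->consumer++;
	return 0;
}
```

For a fill queue, the user process would play the producer role and the kernel the consumer; for a completion queue the roles are reversed.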
    • xsk: add user memory registration support sockopt · c0c77d8f
      Björn Töpel committed
      In this commit the base structure of the AF_XDP address family is set
      up. Further, we introduce the ability to register a window of user
      memory with the kernel via the XDP_UMEM_REG setsockopt syscall. The
      memory window is viewed by an AF_XDP socket as a set of equally large
      frames. After a user memory registration, all frames are "owned" by the
      user application, and not the kernel.
      
      v2: More robust checks on umem creation and unaccount on error.
          Call set_page_dirty_lock on cleanup.
          Simplified xdp_umem_reg.
      Co-authored-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>