  1. 21 Dec 2019, 1 commit
  2. 20 Dec 2019, 1 commit
  3. 25 Nov 2019, 1 commit
  4. 02 Nov 2019, 1 commit
  5. 03 Oct 2019, 1 commit
  6. 25 Sep 2019, 1 commit
  7. 05 Sep 2019, 3 commits
    • xsk: use state member for socket synchronization · 42fddcc7
      Committed by Björn Töpel
      Before the state variable was introduced by Ilya, the dev member was
      used to determine whether the socket was bound or not. However, when
      dev was read, proper SMP barriers and READ_ONCE were missing. In order
      to address the missing barriers and READ_ONCE, we start using the
      state variable as a point of synchronization. The state member
      read/write is paired with proper SMP barriers, and from this it
      follows that the members described above do not need READ_ONCE if
      used in conjunction with the state check.
      
      In all syscalls and the xsk_rcv path we check if state is
      XSK_BOUND. If that is the case we issue an SMP read barrier, and this
      implies that the dev, umem and all rings are correctly set up. Note
      that no READ_ONCE is needed for these variables if used when state is
      XSK_BOUND (plus the read barrier).
      
      To summarize: the struct xdp_sock members dev, queue_id, umem,
      fq, cq, tx, rx, and state were read lock-less, with incorrect barriers
      and missing {READ, WRITE}_ONCE. Now, umem, fq, cq, tx, rx, and state
      are read lock-less. When these members are updated, WRITE_ONCE is
      used. When read, READ_ONCE is only used when reading outside the
      control mutex (e.g. mmap) or when not synchronized with the state
      member (XSK_BOUND plus smp_rmb()).
      
      Note that dev and queue_id do not need a WRITE_ONCE or READ_ONCE, due
      to the introduced state synchronization (XSK_BOUND plus smp_rmb()).
      
      Introducing the state check also fixes a race, found by syzcaller, in
      xsk_poll() where umem could be accessed when stale.
      Suggested-by: Hillf Danton <hdanton@sina.com>
      Reported-by: syzbot+c82697e3043781e08802@syzkaller.appspotmail.com
      Fixes: 77cd0d7b ("xsk: add support for need_wakeup flag in AF_XDP rings")
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • xsk: avoid store-tearing when assigning umem · 9764f4b3
      Committed by Björn Töpel
      The umem member of struct xdp_sock is read outside of the control
      mutex, in the mmap implementation, and needs a WRITE_ONCE to avoid
      potential store-tearing.
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Fixes: 423f3832 ("xsk: add umem fill queue support and mmap")
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • xsk: avoid store-tearing when assigning queues · 94a99763
      Committed by Björn Töpel
      Use WRITE_ONCE when doing the store of tx, rx, fq, and cq, to avoid
      potential store-tearing. These members are read outside of the control
      mutex in the mmap implementation.
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Fixes: 37b07693 ("xsk: add missing write- and data-dependency barrier")
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  8. 31 Aug 2019, 1 commit
  9. 18 Aug 2019, 3 commits
    • xsk: remove AF_XDP socket from map when the socket is released · 0402acd6
      Committed by Björn Töpel
      When an AF_XDP socket is released/closed, the XSKMAP still holds a
      reference to the socket in a "released" state. The socket will still
      use the netdev queue resource and block newly created sockets from
      attaching to that queue, but no user application can access the
      fill/complete/rx/tx queues. As a result, every application needs
      to explicitly clear the map entry of the old "zombie state"
      socket. This should be done automatically.
      
      In this patch, the socket tracks, and holds a reference to, the maps
      it resides in. When the socket is released, it removes itself from
      all maps.
      Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • xsk: add support for need_wakeup flag in AF_XDP rings · 77cd0d7b
      Committed by Magnus Karlsson
      This commit adds support for a new flag called need_wakeup in the
      AF_XDP Tx and fill rings. When this flag is set, it means that the
      application has to explicitly wake up the kernel Rx (for the bit in
      the fill ring) or kernel Tx (for the bit in the Tx ring) processing
      by issuing a syscall. Poll() can wake up both depending on the flags
      submitted, and sendto() will wake up Tx processing only.
      
      The main reason for introducing this new flag is to be able to
      efficiently support the case when application and driver are executing
      on the same core. Previously, the driver was just busy-spinning on the
      fill ring if it ran out of buffers in the HW and there were none on
      the fill ring. This approach works when the application is running on
      another core, as it can replenish the fill ring while the driver is
      busy-spinning. However, it is a lousy approach if both of them are
      running on the same core, as the probability of the fill ring getting
      more entries while the driver is busy-spinning is zero. With this new
      feature the driver instead sets the need_wakeup flag and returns to
      the application. The application can then replenish the fill queue and
      explicitly wake up the Rx processing in the kernel using the syscall
      poll(). For Tx, the flag is only set to one if the driver has no
      outstanding Tx completion interrupts. If it has some, the flag is
      zero, as it will be woken up by a completion interrupt anyway.
      
      As a nice side effect, this new flag also improves the performance of
      the case where application and driver are running on two different
      cores as it reduces the number of syscalls to the kernel. The kernel
      tells user space if it needs to be woken up by a syscall, and this
      eliminates many of the syscalls.
      
      This flag needs some simple driver support. If the driver does not
      support this, the Rx flag is always zero and the Tx flag is always
      one. This makes any application relying on this feature default to the
      old behaviour of not requiring any syscalls in the Rx path and always
      having to call sendto() in the Tx path.
      
      For backwards-compatibility reasons, this feature has to be explicitly
      turned on using a new bind flag (XDP_USE_NEED_WAKEUP). I recommend
      that you always turn it on, as it has so far always had a positive
      performance impact.
      
      The name and inspiration of the flag has been taken from io_uring by
      Jens Axboe. Details about this feature in io_uring can be found in
      http://kernel.dk/io_uring.pdf, section 8.3.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • xsk: replace ndo_xsk_async_xmit with ndo_xsk_wakeup · 9116e5e2
      Committed by Magnus Karlsson
      This commit replaces ndo_xsk_async_xmit with ndo_xsk_wakeup. This new
      ndo provides the same functionality as before, but with the addition
      of a new flags field that is used to specify whether Rx, Tx or both
      should be woken up. The previous ndo only woke up Tx, as implied by
      the name. The i40e and ixgbe drivers (which are all the supported
      ones) are updated to this new interface.
      
      This new ndo will be used by the new need_wakeup functionality of XDP
      sockets that need to be able to wake up both Rx and Tx driver
      processing.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  10. 12 Jul 2019, 2 commits
  11. 09 Jul 2019, 1 commit
  12. 03 Jul 2019, 1 commit
    • xdp: fix hang while unregistering device bound to xdp socket · 455302d1
      Committed by Ilya Maximets
      A device bound to an XDP socket will not drop to a zero refcount until
      the userspace application closes it. This leads to a hang inside
      'netdev_wait_allrefs()' if unregistering of the device is requested:
      
        # ip link del p1
        < hang on recvmsg on netlink socket >
      
        # ps -x | grep ip
        5126  pts/0    D+   0:00 ip link del p1
      
        # journalctl -b
      
        Jun 05 07:19:16 kernel:
        unregister_netdevice: waiting for p1 to become free. Usage count = 1
      
        Jun 05 07:19:27 kernel:
        unregister_netdevice: waiting for p1 to become free. Usage count = 1
        ...
      
      Fix that by implementing a NETDEV_UNREGISTER event notification
      handler to properly clean up all the resources and unref the device.
      
      This should also allow socket killing via ss(8) utility.
      
      Fixes: 965a9909 ("xsk: add support for bind for Rx")
      Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
      Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  13. 28 Jun 2019, 3 commits
  14. 09 Mar 2019, 1 commit
    • xsk: fix to reject invalid flags in xsk_bind · f54ba391
      Committed by Björn Töpel
      Passing a non-existing flag in the sxdp_flags member of struct
      sockaddr_xdp was, incorrectly, silently ignored. This patch addresses
      that behavior, and rejects any non-existing flags.
      
      We have examined existing user space code, and to our best knowledge,
      no one is relying on the current incorrect behavior. AF_XDP is still
      in its infancy, so from our perspective, the risk of breakage is very
      low, and addressing this problem now is important.
      
      Fixes: 965a9909 ("xsk: add support for bind for Rx")
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  15. 21 Feb 2019, 1 commit
  16. 11 Feb 2019, 1 commit
    • xsk: add missing smp_rmb() in xsk_mmap · e6762c8b
      Committed by Magnus Karlsson
      All the setup code in AF_XDP is protected by a mutex, with the
      exception of the mmap code, which cannot use it. To make sure that a
      process banging on the mmap call at the same time as another process
      is setting up the socket sees consistent data, smp_wmb() calls were
      added in the umem registration code and the queue creation code, so
      that the published structures that xsk_mmap needs would be
      consistent. However, the corresponding smp_rmb() calls were not added
      to the xsk_mmap code. This patch adds these calls.
      
      Fixes: 37b07693 ("xsk: add missing write- and data-dependency barrier")
      Fixes: c0c77d8f ("xsk: add user memory registration support sockopt")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  17. 25 Jan 2019, 2 commits
  18. 20 Dec 2018, 1 commit
    • xsk: simplify AF_XDP socket teardown · e2ce3674
      Committed by Björn Töpel
      Prior to this commit, when the struct socket object was being
      released, the UMEM did not have its reference count decreased.
      Instead, this was done in the struct sock sk_destruct function.
      
      There is no reason to keep the UMEM reference around when the socket
      is being orphaned, so in this patch xdp_put_umem is called in the
      xsk_release function. As a result, the xsk_destruct function can be
      removed!
      
      Note that a struct xsk_sock reference might still linger in the
      XSKMAP after the UMEM is released, e.g. if a user does not clear the
      XSKMAP prior to closing the process. This sock will be in a
      "released" zombie-like state until the XSKMAP entry is removed.
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  19. 11 Oct 2018, 1 commit
  20. 08 Oct 2018, 1 commit
    • xsk: proper AF_XDP socket teardown ordering · 541d7fdd
      Committed by Björn Töpel
      The AF_XDP socket struct can exist in three different, implicit
      states: setup, bound and released. Setup is before the socket has been
      bound to a device. Bound is when the socket is active for receive and
      send. Released is when the process/userspace side of the socket is
      released, but the sock object is still lingering, e.g. when there is a
      reference to the socket in an XSKMAP after process termination.
      
      The Rx fast-path code uses the "dev" member of struct xdp_sock to
      check whether a socket is bound or released, and the Tx code uses the
      struct xdp_umem "xsk_list" member in conjunction with "dev" to
      determine the state of a socket.
      
      However, the transition from bound to released did not tear the socket
      down in correct order.
      
      On the Rx side, "dev" was cleared after synchronize_net(), making the
      synchronization useless. On the Tx side, the internal queues were
      destroyed prior to removing them from the "xsk_list".
      
      This commit corrects the cleanup order, and by doing so
      xdp_del_sk_umem() can be simplified and one synchronize_net() can be
      removed.
      
      Fixes: 965a9909 ("xsk: add support for bind for Rx")
      Fixes: ac98d8aa ("xsk: wire upp Tx zero-copy functions")
      Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  21. 05 Oct 2018, 1 commit
    • xsk: fix bug when trying to use both copy and zero-copy on one queue id · c9b47cc1
      Committed by Magnus Karlsson
      Previously, the xsk code did not record which umem was bound to a
      specific queue id. This was not required if all drivers were zero-copy
      enabled as this had to be recorded in the driver anyway. So if a user
      tried to bind two umems to the same queue, the driver would say
      no. But if copy-mode was first enabled and then zero-copy mode (or the
      reverse order), we mistakenly enabled both of them on the same umem
      leading to buggy behavior. The main culprit for this is that we did
      not store the association of umem to queue id in the copy case and
      only relied on the driver reporting this. As this relation was not
      stored in the driver for copy mode (it does not rely on the AF_XDP
      NDOs), this obviously could not work.
      
      This patch fixes the problem by always recording the umem to queue id
      relationship in the netdev_queue and netdev_rx_queue structs. This way
      we always know what kind of umem has been bound to a queue id and can
      act appropriately at bind time.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  22. 01 Sep 2018, 1 commit
  23. 30 Aug 2018, 1 commit
  24. 31 Jul 2018, 1 commit
  25. 13 Jul 2018, 4 commits
  26. 03 Jul 2018, 2 commits
    • xsk: fix potential race in SKB TX completion code · a9744f7c
      Committed by Magnus Karlsson
      There is a potential race in the TX completion code for the SKB
      case. One process enters the sendmsg code of an AF_XDP socket in order
      to send a frame. The execution eventually trickles down to the driver
      that is told to send the packet. However, it decides to drop the
      packet due to some error condition (e.g., rings full) and frees the
      SKB. This will trigger the SKB destructor and a completion will be
      sent to the AF_XDP user space through its
      single-producer/single-consumer queues.
      
      At the same time a TX interrupt has fired on another core and it
      dispatches the TX completion code in the driver. It does its HW
      specific things and ends up freeing the SKB associated with the
      transmitted packet. This will trigger the SKB destructor and a
      completion will be sent to the AF_XDP user space through its
      single-producer/single-consumer queues. With a pseudo call stack, it
      would look like this:
      
      Core 1:
      sendmsg() being called in the application
        netdev_start_xmit()
          Driver entered through ndo_start_xmit
            Driver decides to free the SKB for some reason (e.g., rings full)
              Destructor of SKB called
                xskq_produce_addr() is called to signal completion to user space
      
      Core 2:
      TX completion irq
        NAPI loop
          Driver irq handler for TX completions
            Frees the SKB
              Destructor of SKB called
                xskq_produce_addr() is called to signal completion to user space
      
      We now have a violation of the single-producer/single-consumer
      principle for our queues as there are two threads trying to produce at
      the same time on the same queue.
      
      This is fixed by introducing a spin lock in the destructor. With
      regard to performance, I get around 1.74 Mpps for txonly both before
      and after the introduction of the spinlock. There is of course some
      impact due to the spin lock, but it is in the less significant digits,
      which are too noisy for me to measure. But let us say that the version
      without the spin lock got 1.745 Mpps in the best case and the version
      with it 1.735 Mpps in the worst case; that would mean a maximum drop
      in performance of about 0.5%.
      
      Fixes: 35fcde7f ("xsk: support for Tx")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • xsk: frame could be completed more than once in SKB path · fe588685
      Committed by Magnus Karlsson
      This fixes a bug in which a frame could be completed more than once
      when an error was returned from dev_direct_xmit(). The code
      erroneously retried sending the message, leading to multiple
      calls to the SKB destructor and therefore multiple completions
      of the same buffer to user space.
      
      The error code in this case has been changed from EAGAIN to EBUSY
      in order to tell user space that the sending of the packet failed
      and the buffer has been returned to user space through the completion
      queue.
      
      Fixes: 35fcde7f ("xsk: support for Tx")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Reported-by: Pavel Odintsov <pavel@fastnetmon.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  27. 29 Jun 2018, 1 commit
    • Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL · a11e1d43
      Committed by Linus Torvalds
      The poll() changes were not well thought out, and completely
      unexplained.  They also caused a huge performance regression, because
      "->poll()" was no longer a trivial file operation that just called down
      to the underlying file operations, but instead did at least two indirect
      calls.
      
      Indirect calls are sadly slow now with the Spectre mitigation, but the
      performance problem could at least be largely mitigated by changing the
      "->get_poll_head()" operation to just have a per-file-descriptor pointer
      to the poll head instead.  That gets rid of one of the new indirections.
      
      But that doesn't fix the new complexity that is completely unwarranted
      for the regular case.  The (undocumented) reason for the poll() changes
      was some alleged AIO poll race fixing, but we don't make the common case
      slower and more complex for some uncommon special case, so this all
      really needs way more explanations and most likely a fundamental
      redesign.
      
      [ This revert is a revert of about 30 different commits, not reverted
        individually because that would just be unnecessarily messy  - Linus ]
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  28. 12 Jun 2018, 1 commit