1. 27 Jan 2020, 1 commit
  2. 17 Jan 2020, 1 commit
    • xdp: Use bulking for non-map XDP_REDIRECT and consolidate code paths · 1d233886
      Authored by Toke Høiland-Jørgensen
      Since the bulk queue used by XDP_REDIRECT now lives in struct net_device,
      we can re-use the bulking for the non-map version of the bpf_redirect()
      helper. This is a simple matter of having xdp_do_redirect_slow() queue the
      frame on the bulk queue instead of sending it out with __bpf_tx_xdp().
      
      Unfortunately we can't make the bpf_redirect() helper return an error if
      the ifindex doesn't exist (as bpf_redirect_map() does), because we don't
      have a reference to the network namespace of the ingress device at the time
      the helper is called. So we have to leave it as-is and keep the device
      lookup in xdp_do_redirect_slow().
      
      Since this leaves less reason to keep the non-map redirect code in a
      separate function, we get rid of the xdp_do_redirect_slow() function
      entirely. This does lose us the tracepoint disambiguation, but fortunately
      the xdp_redirect and xdp_redirect_map tracepoints use the same tracepoint
      entry structures. This means both can contain a map index, so we can just
      amend the tracepoint definitions so we always emit the xdp_redirect(_err)
      tracepoints, but with the map ID only populated if a map is present. This
      means we retire the xdp_redirect_map(_err) tracepoints entirely, but keep
      the definitions around in case someone is still listening for them.
      
      With this change, the performance of the xdp_redirect sample program goes
      from 5Mpps to 8.4Mpps (a 68% increase).
      
      Since the flush functions are no longer map-specific, rename the flush()
      functions to drop _map from their names. One of the renamed functions is
      the xdp_do_flush_map() callback used in all the xdp-enabled drivers. To
      avoid having to update every driver, use a #define to keep the old name
      working, and only update the virtual drivers in this patch (the aliasing
      pattern is sketched after this entry).
      Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/bpf/157918768505.1458396.17518057312953572912.stgit@toke.dk
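      A minimal, self-contained sketch of the compatibility-alias pattern
      described above; the declarations are illustrative stand-ins, not the
      kernel's actual headers:

        /* Rename the function but keep the old driver-facing name working by
         * aliasing it with a #define, so unconverted drivers still build. */
        #include <stdio.h>

        static void xdp_do_flush(void)
        {
                printf("flushing the per-device bulk queue\n");
        }

        /* Old name retained as an alias; only virtual drivers are converted
         * to the new name in this patch. */
        #define xdp_do_flush_map xdp_do_flush

        int main(void)
        {
                xdp_do_flush_map();     /* resolves to xdp_do_flush() */
                return 0;
        }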
  3. 18 Nov 2019, 1 commit
  4. 02 Oct 2019, 1 commit
    • netfilter: drop bridge nf reset from nf_reset · 895b5c9f
      Authored by Florian Westphal
      commit 174e2381 ("sk_buff: drop all skb extensions on free and skb
      scrubbing") made napi recycle always drop skb extensions. The additional
      skb_ext_del() that is performed via nf_reset on napi skb recycle is not
      needed anymore.
      
      Most nf_reset() calls in the stack are there so queued skb won't block
      'rmmod nf_conntrack' indefinitely.
      
      This removes the skb_ext_del() call from nf_reset() and renames the helper
      to the more fitting nf_reset_ct() (sketched after this entry).
      
      In a few selected places, add a call to skb_ext_reset to make sure that
      no active extensions remain.
      
      I am submitting this for "net", because we're still early in the release
      cycle.  The patch applies to net-next too, but I think the rename causes
      needless divergence between those trees.
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
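      A rough sketch of the split described above; the struct and helpers below
      are userspace stand-ins, while the real code operates on struct sk_buff,
      skb->_nfct and the skb extension area:

        #include <stddef.h>

        struct fake_skb {
                void *nfct;        /* stands in for the conntrack reference */
                void *extensions;  /* stands in for the skb extension area  */
        };

        /* New nf_reset_ct(): only drop the conntrack reference, so queued
         * skbs no longer block 'rmmod nf_conntrack'. */
        static void nf_reset_ct(struct fake_skb *skb)
        {
                skb->nfct = NULL;
        }

        /* Where extensions really must be cleared, callers now use
         * skb_ext_reset() explicitly instead of relying on nf_reset(). */
        static void skb_ext_reset(struct fake_skb *skb)
        {
                skb->extensions = NULL;
        }

        int main(void)
        {
                struct fake_skb skb = { .nfct = &skb, .extensions = &skb };
                nf_reset_ct(&skb);
                skb_ext_reset(&skb);
                return 0;
        }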
  5. 04 Sep 2019, 1 commit
    • virtio-net: lower min ring num_free for efficiency · 718be6ba
      Authored by jiang
      This change lowers the ring buffer reclaim threshold from 1/2*queue to
      budget for better performance (the heuristic is sketched after this
      entry). According to our test with qemu + dpdk, packet drops happen when
      the guest cannot provide free buffers in the avail ring in time with the
      default 1/2*queue threshold. The value in the patch has been tested and
      does show better performance.
      
      Test setup: iperf3 generating packets to the guest (30 minutes total, 400k pps, UDP)
      avg packet drops before: 2842
      avg packet drops after: 360 (-87.3%)
      
      Further, the current code suffers from a starvation problem: the amount of
      work done by try_fill_recv is not bounded by the budget parameter, so
      (with large queues) userspace occasionally gets blocked for a long time
      while the queue is being refilled. Trigger refills earlier to make sure
      the amount of work to do stays limited.
      Signed-off-by: jiangkidd <jiangkidd@hotmail.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
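      A simplified sketch of the refill heuristic described above; the names and
      the exact placement in the NAPI poll loop are illustrative, not the actual
      driver code:

        #include <stdbool.h>
        #include <stdio.h>

        /* Old heuristic: refill only once more than half the ring is free
         * (i.e. consumed). New heuristic: refill once more than `budget`
         * entries are free, which triggers earlier and keeps the per-poll
         * refill work bounded by the NAPI budget. */
        static bool should_refill_old(unsigned int num_free, unsigned int ring_size)
        {
                return num_free > ring_size / 2;
        }

        static bool should_refill_new(unsigned int num_free, unsigned int budget)
        {
                return num_free > budget;
        }

        int main(void)
        {
                unsigned int ring_size = 1024, budget = 64, num_free = 200;

                printf("old: %d, new: %d\n",
                       should_refill_old(num_free, ring_size),
                       should_refill_new(num_free, budget));
                return 0;
        }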
  6. 15 Jun 2019, 1 commit
  7. 21 May 2019, 1 commit
    • treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 · 1ccea77e
      Authored by Thomas Gleixner
      Based on 2 normalized pattern(s):
      
        this program is free software you can redistribute it and or modify
        it under the terms of the gnu general public license as published by
        the free software foundation either version 2 of the license or at
        your option any later version this program is distributed in the
        hope that it will be useful but without any warranty without even
        the implied warranty of merchantability or fitness for a particular
        purpose see the gnu general public license for more details you
        should have received a copy of the gnu general public license along
        with this program if not see http www gnu org licenses
      
        this program is free software you can redistribute it and or modify
        it under the terms of the gnu general public license as published by
        the free software foundation either version 2 of the license or at
        your option any later version this program is distributed in the
        hope that it will be useful but without any warranty without even
        the implied warranty of merchantability or fitness for a particular
        purpose see the gnu general public license for more details [based]
        [from] [clk] [highbank] [c] you should have received a copy of the
        gnu general public license along with this program if not see http
        www gnu org licenses
      
      extracted by the scancode license scanner the SPDX license identifier
      
        GPL-2.0-or-later
      
      has been chosen to replace the boilerplate/reference in 355 file(s).
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
      Reviewed-by: Steve Winslow <swinslow@gmail.com>
      Reviewed-by: Allison Randal <allison@lohutok.net>
      Cc: linux-spdx@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
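      For reference, the boilerplate quoted above is replaced in each affected
      file by a single SPDX header comment like the following (the comment style
      follows each file's existing conventions):

        // SPDX-License-Identifier: GPL-2.0-or-later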
  8. 07 Apr 2019, 2 commits
  9. 02 Apr 2019, 1 commit
    • net: move skb->xmit_more hint to softnet data · 6b16f9ee
      Authored by Florian Westphal
      There are two reasons for this.
      
      First, the xmit_more flag conceptually doesn't fit into the skb, as
      xmit_more is not a property related to the skb.
      It's only a hint to the driver that the stack is about to transmit another
      packet immediately.
      
      Second, it was only done this way to not have to pass another argument
      to ndo_start_xmit().
      
      We can place xmit_more in the softnet data, next to the device recursion.
      The recursion counter is already written to on each transmit. The "more"
      indicator is placed right next to it.
      
      Drivers can use the netdev_xmit_more() helper instead of skb->xmit_more
      to check the "more packets coming" hint (a sketch of the conversion
      follows this entry).
      
      skb->xmit_more is retained (but always 0) to not cause build breakage.
      
      This change takes care of the simple s/skb->xmit_more/netdev_xmit_more()/
      conversions.  Remaining drivers are converted in the next patches.
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
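      A minimal sketch of the driver-side conversion; the ring structure, the
      doorbell helper and the netdev_xmit_more_stub() stand-in are made up for
      illustration (in the kernel the real helper is netdev_xmit_more()):

        #include <stdbool.h>

        struct fake_ring { int tail; };

        static bool netdev_xmit_more_stub(void) { return false; }
        static void ring_doorbell(struct fake_ring *r) { (void)r; }

        /* Before: the driver tested skb->xmit_more to decide whether to ring
         * the doorbell now or batch with the next packet.  After: it calls the
         * netdev_xmit_more() helper, which reads the hint from softnet data. */
        static void example_start_xmit(struct fake_ring *ring)
        {
                /* ... queue the packet on the ring ... */
                if (!netdev_xmit_more_stub())
                        ring_doorbell(ring);  /* flush only when nothing follows */
        }

        int main(void)
        {
                struct fake_ring ring = { 0 };
                example_start_xmit(&ring);
                return 0;
        }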
  10. 19 Mar 2019, 1 commit
  11. 04 Feb 2019, 1 commit
  12. 31 Jan 2019, 7 commits
  13. 20 Jan 2019, 2 commits
  14. 21 Dec 2018, 1 commit
    • virtio-net: ethtool configurable LRO · a02e8964
      Authored by Willem de Bruijn
      Virtio-net devices negotiate LRO support with the host.
      Display the initially negotiated state with ethtool -k.
      
      Also allow configuring it with ethtool -K, reusing the existing
      virtnet_set_guest_offloads helper that configures LRO for XDP.
      This is conditional on VIRTIO_NET_F_CTRL_GUEST_OFFLOADS.
      
      Virtio-net negotiates TSO4 and TSO6 separately, but ethtool does not
      distinguish between the two. Display LRO as on only if any offload is
      active (see the sketch after this entry).
      
      RTNL is held while calling virtnet_set_features, same as on the path
      from virtnet_xdp_set.
      
      Changes v1 -> v2
        - allow ethtool config (-K) only if VIRTIO_NET_F_CTRL_GUEST_OFFLOADS
        - show LRO as enabled if any LRO variant is enabled
        - do not allow configuration while XDP is active
        - differentiate current features from the capable set, to restore
          on XDP down only those features that were active on XDP up
        - move test out of VIRTIO_NET_F_CSUM/TSO branch, which is tx only
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
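      A small sketch of the reporting rule described above; the feature-bit
      names are illustrative stand-ins for the negotiated guest TSO4/TSO6
      offload bits:

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        #define GUEST_TSO4 (1u << 0)   /* stand-in for the TSO4 guest offload */
        #define GUEST_TSO6 (1u << 1)   /* stand-in for the TSO6 guest offload */

        /* ethtool exposes a single LRO flag, so report it as enabled when
         * either negotiated receive-offload variant is active. */
        static bool lro_reported_on(uint32_t guest_offloads)
        {
                return guest_offloads & (GUEST_TSO4 | GUEST_TSO6);
        }

        int main(void)
        {
                printf("%d\n", lro_reported_on(GUEST_TSO6));
                return 0;
        }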
  15. 01 Dec 2018, 1 commit
  16. 24 Nov 2018, 2 commits
  17. 18 Oct 2018, 1 commit
  18. 11 Oct 2018, 1 commit
  19. 29 Sep 2018, 1 commit
    • virtio_net: remove ndo_poll_controller · 260dd2c3
      Authored by Eric Dumazet
      As diagnosed by Song Liu, ndo_poll_controller() can
      be very dangerous on loaded hosts, since the cpu
      calling ndo_poll_controller() might steal all NAPI
      contexts (for all RX/TX queues of the NIC). This capture
      can last for an unlimited amount of time, since one
      cpu is generally not able to drain all the queues under load.

      virtio_net uses NAPI for TX completions, so it is better to let the core
      networking stack call napi->poll() to avoid the capture.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  20. 14 Aug 2018, 1 commit
  21. 12 Aug 2018, 2 commits
  22. 10 Aug 2018, 1 commit
    • net: allow to call netif_reset_xps_queues() under cpus_read_lock · 4d99f660
      Authored by Andrei Vagin
      The definition of static_key_slow_inc() has cpus_read_lock in place. In the
      virtio_net driver, XPS queues are initialized after setting the queue:cpu
      affinity in virtnet_set_affinity(), which already runs within
      cpus_read_lock. Lockdep prints a warning when we try to acquire
      cpus_read_lock while it is already held.

      This patch adds the ability to call __netif_set_xps_queue under
      cpus_read_lock() (the resulting locked/caller-locked split is sketched
      after this entry).
      Acked-by: Jason Wang <jasowang@redhat.com>
      
      ============================================
      WARNING: possible recursive locking detected
      4.18.0-rc3-next-20180703+ #1 Not tainted
      --------------------------------------------
      swapper/0/1 is trying to acquire lock:
      00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: static_key_slow_inc+0xe/0x20
      
      but task is already holding lock:
      00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(cpu_hotplug_lock.rw_sem);
        lock(cpu_hotplug_lock.rw_sem);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      3 locks held by swapper/0/1:
       #0: 00000000244bc7da (&dev->mutex){....}, at: __driver_attach+0x5a/0x110
       #1: 00000000cf973d46 (cpu_hotplug_lock.rw_sem){++++}, at: init_vqs+0x513/0x5a0
       #2: 000000005cd8463f (xps_map_mutex){+.+.}, at: __netif_set_xps_queue+0x8d/0xc60
      
      v2: move cpus_read_lock() out of __netif_set_xps_queue()
      
      Cc: "Nambiar, Amritha" <amritha.nambiar@intel.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Fixes: 8af2c06f ("net-sysfs: Add interface for Rx queue(s) map per Tx queue")
      Signed-off-by: Andrei Vagin <avagin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
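      A hedged sketch of the locking pattern this enables, using a pthread
      rwlock as a stand-in for cpu_hotplug_lock (cpus_read_lock/unlock); the
      function names below are illustrative, not the kernel's:

        #include <pthread.h>
        #include <stdio.h>

        static pthread_rwlock_t hotplug_lock = PTHREAD_RWLOCK_INITIALIZER;

        /* Double-underscore variant: the caller must already hold the read
         * lock, so a path like virtnet_set_affinity(), which already runs
         * under cpus_read_lock(), can call it without recursing on the lock. */
        static void __set_xps_queue_locked(int queue, int cpu)
        {
                printf("map tx queue %d -> cpu %d\n", queue, cpu);
        }

        /* Public variant for ordinary callers: takes the read lock itself. */
        static void set_xps_queue(int queue, int cpu)
        {
                pthread_rwlock_rdlock(&hotplug_lock);
                __set_xps_queue_locked(queue, cpu);
                pthread_rwlock_unlock(&hotplug_lock);
        }

        int main(void)
        {
                set_xps_queue(0, 1);                  /* normal path */

                pthread_rwlock_rdlock(&hotplug_lock); /* lock already held... */
                __set_xps_queue_locked(1, 2);         /* ...so use the locked variant */
                pthread_rwlock_unlock(&hotplug_lock);
                return 0;
        }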
  23. 06 Aug 2018, 1 commit
  24. 01 Aug 2018, 2 commits
  25. 26 Jul 2018, 5 commits