1. 06 Jan, 2018 1 commit
    • virtio_net: setup xdp_rxq_info · 754b8a21
      Committed by Jesper Dangaard Brouer
      The virtio_net driver doesn't dynamically change the RX-ring queue
      layout and backing pages, but instead rejects XDP setup if not all the
      conditions for XDP are met.  Thus, the xdp_rxq_info also remains
      fairly static.  This allows us to simply add the reg/unreg to
      net_device open/close functions.
      
      Driver hook points for xdp_rxq_info:
       * reg  : virtnet_open
       * unreg: virtnet_close
      
      V3:
       - bugfix, also setup xdp.rxq in receive_mergeable()
       - Tested bpf-sample prog inside guest on a virtio_net device
      
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Cc: virtualization@lists.linux-foundation.org
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Reviewed-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  2. 09 Dec, 2017 1 commit
    • virtio_net: Disable interrupts if napi_complete_done rescheduled napi · fdaa767a
      Committed by Toshiaki Makita
      Since commit 39e6c820 ("net: solve a NAPI race") napi has been able
      to be rescheduled within napi_complete_done() even in non-busypoll case,
      but virtnet_poll() always enabled interrupts before complete, and when
      napi was rescheduled within napi_complete_done() it did not disable
      interrupts.
      This caused more interrupts when event idx is disabled.
      
      According to commit cbdadbbf ("virtio_net: fix race in RX VQ
      processing") we cannot place virtqueue_enable_cb_prepare() after
      NAPI_STATE_SCHED is cleared, so disable interrupts again if
      napi_complete_done() returned false.
      
      Tested with vhost-user of OVS 2.7 on host, which does not have the event
      idx feature.
      
      * Before patch:
      
      $ netperf -t UDP_STREAM -H 192.168.150.253 -l 60 -- -m 1472
      MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.150.253 () port 0 AF_INET
      Socket  Message  Elapsed      Messages
      Size    Size     Time         Okay Errors   Throughput
      bytes   bytes    secs            #      #   10^6bits/sec
      
      212992    1472   60.00     32763206      0    6430.32
      212992           60.00     23384299           4589.56
      
      Interrupts on guest: 9872369
      Packets/interrupt:   2.37
      
      * After patch:
      
      $ netperf -t UDP_STREAM -H 192.168.150.253 -l 60 -- -m 1472
      MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.150.253 () port 0 AF_INET
      Socket  Message  Elapsed      Messages
      Size    Size     Time         Okay Errors   Throughput
      bytes   bytes    secs            #      #   10^6bits/sec
      
      212992    1472   60.00     32794646      0    6436.49
      212992           60.00     32793501           6436.27
      
      Interrupts on guest: 4941299
      Packets/interrupt:   6.64
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 08 Dec, 2017 1 commit
  4. 16 Nov, 2017 1 commit
    • mm: remove __GFP_COLD · 453f85d4
      Committed by Mel Gorman
      As the page free path makes no distinction between cache hot and cold
      pages, there is no real useful ordering of pages in the free list that
      allocation requests can take advantage of.  Judging from the users of
      __GFP_COLD, it is likely that a number of them are the result of copying
      other sites instead of actually measuring the impact.  Remove the
      __GFP_COLD parameter which simplifies a number of paths in the page
      allocator.
      
      This is potentially controversial but bear in mind that the size of the
      per-cpu pagelists versus modern cache sizes means that the whole per-cpu
      list can often fit in the L3 cache.  Hence, there is only a potential
      benefit for microbenchmarks that alloc/free pages in a tight loop.  It's
      even worse when THP is taken into account which has little or no chance
      of getting a cache-hot page as the per-cpu list is bypassed and the
      zeroing of multiple pages will thrash the cache anyway.
      
      The truncate microbenchmarks are not shown as this patch affects the
      allocation path and not the free path.  A page fault microbenchmark was
      tested but it showed no significant difference which is not surprising
      given that the __GFP_COLD branches are a minuscule percentage of the
      fault path.
      
      Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 05 Nov, 2017 1 commit
  6. 27 Sep, 2017 1 commit
    • bpf: add meta pointer for direct access · de8f3a83
      Committed by Daniel Borkmann
      This work enables generic transfer of metadata from XDP into skb. The
      basic idea is that we can make use of the fact that the resulting skb
      must be linear and already comes with a larger headroom for supporting
      bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work
      on a similar principle and introduce a small helper bpf_xdp_adjust_meta()
      for adjusting a new pointer called xdp->data_meta. Thus, the packet has
      a flexible and programmable room for meta data, followed by the actual
      packet data. struct xdp_buff is therefore laid out such that we first point
      to data_hard_start, then data_meta directly prepended to data followed
      by data_end marking the end of packet. bpf_xdp_adjust_head() takes into
      account whether we have meta data already prepended and if so, memmove()s
      this along with the given offset provided there's enough room.
      
      xdp->data_meta is optional and programs are not required to use it. The
      rationale is that when we process the packet in XDP (e.g. as DoS filter),
      we can push further meta data along with it for the XDP_PASS case, and
      give the guarantee that a clsact ingress BPF program on the same device
      can pick this up for further post-processing. Since we work with skb
      there, we can also set skb->mark, skb->priority or other skb meta data
      out of BPF, thus having this scratch space generic and programmable
      allows for more flexibility than defining a direct 1:1 transfer of
      potentially new XDP members into skb (it's also more efficient as we
      don't need to initialize/handle each of such new members). The facility
      also works together with GRO aggregation. The scratch space at the head
      of the packet can be a multiple of 4 bytes, up to 32 bytes large. Drivers not
      yet supporting xdp->data_meta can simply be set up with xdp->data_meta
      as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out,
      such that the subsequent match against xdp->data for later access is
      guaranteed to fail.
      
      The verifier treats xdp->data_meta/xdp->data the same way as we treat
      xdp->data/xdp->data_end pointer comparisons. The requirement for doing
      the compare against xdp->data is that it hasn't been modified from its
      original address we got from ctx access. It may have a range marking
      already from prior successful xdp->data/xdp->data_end pointer comparisons
      though.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 23 Sep, 2017 1 commit
  8. 21 Sep, 2017 3 commits
  9. 25 Aug, 2017 1 commit
  10. 19 Aug, 2017 1 commit
  11. 17 Aug, 2017 1 commit
  12. 14 Aug, 2017 1 commit
  13. 01 Aug, 2017 1 commit
  14. 26 Jul, 2017 1 commit
    • virtio-net: mark PM functions as __maybe_unused · 67a75194
      Committed by Arnd Bergmann
      After removing the reset function, the freeze and restore functions
      are now unused when CONFIG_PM_SLEEP is disabled:
      
      drivers/net/virtio_net.c:1881:12: error: 'virtnet_restore_up' defined but not used [-Werror=unused-function]
       static int virtnet_restore_up(struct virtio_device *vdev)
      drivers/net/virtio_net.c:1859:13: error: 'virtnet_freeze_down' defined but not used [-Werror=unused-function]
       static void virtnet_freeze_down(struct virtio_device *vdev)
      
      A more robust way to do this is to remove the #ifdef around the callers
      and instead mark them as __maybe_unused. The compiler will now just
      silently drop the unused code.
      
      Fixes: 4941d472 ("virtio-net: do not reset during XDP set")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 25 Jul, 2017 5 commits
  16. 18 Jul, 2017 1 commit
  17. 08 Jul, 2017 1 commit
  18. 30 Jun, 2017 1 commit
  19. 16 Jun, 2017 2 commits
  20. 05 Jun, 2017 1 commit
  21. 03 Jun, 2017 1 commit
  22. 25 May, 2017 1 commit
  23. 09 May, 2017 6 commits
  24. 03 May, 2017 2 commits
  25. 01 May, 2017 2 commits
  26. 27 Apr, 2017 1 commit