  1. 24 Sep, 2009 (3 commits)
    • virtio_net: don't free buffers in xmit ring · b0c39dbd
      Committed by Rusty Russell
      The virtio_net driver is complicated by the two methods of freeing old
      xmit buffers (in addition to freeing old ones at the start of the xmit
      path).
      
      The original code used a 1/10 second timer attached to xmit_free(),
      reset on every xmit.  Before we orphaned skbs on xmit, the
      transmitting userspace could block with a full socket until the timer
      fired, the skb destructor was called, and they were re-woken.
      
      So we added the VIRTIO_F_NOTIFY_ON_EMPTY feature: supporting devices
      send an interrupt (even if normally suppressed) on an empty xmit ring
      which makes us schedule xmit_tasklet().  This was a benchmark win.
      
      Unfortunately, VIRTIO_F_NOTIFY_ON_EMPTY makes quite a lot of work: a
      host which is faster than the guest will fire the interrupt every xmit
      packet (slowing the guest down further).  Attempting mitigation in the
      host adds overhead of userspace timers (possibly with the additional
      pain of signals), and risks increasing latency anyway if you get it
      wrong.
      
      In practice, this effect was masked by benchmarks which take advantage
      of GSO (with its inherent transmit batching), but it's still there.
      
      Now that we orphan xmitted skbs, the pressure is off: remove both paths and
      no longer request VIRTIO_F_NOTIFY_ON_EMPTY.  Note that the current
      QEMU will notify us even if we don't negotiate this feature (legal,
      but suboptimal); a patch is outstanding to improve that.
      
      Move the skb_orphan/nf_reset to after we've done the send and notified
      the other end, for a slight optimization.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Mark McLoughlin <markmc@redhat.com>
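The single remaining reclaim point described above (free completed buffers at the start of the transmit path, with no timer and no tasklet) can be sketched in plain userspace C. This is only a toy model under stated assumptions: `toy_txq`, `toy_buf`, `free_old_xmit` and `toy_start_xmit` are hypothetical names, the `done` flag stands in for the host marking a buffer as used, and nothing here is the driver's actual code.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_RING 8 /* illustrative ring size, not the real one */

/* Hypothetical stand-in for an in-flight packet buffer. */
struct toy_buf {
    int done; /* set by the "host" once the buffer has been sent */
};

struct toy_txq {
    struct toy_buf *inflight[TOY_RING];
    int count;       /* buffers currently in the ring */
    int freed_total; /* running total of reclaimed buffers */
};

/* Free every buffer the host has finished with; returns how many. */
static int free_old_xmit(struct toy_txq *q)
{
    int kept = 0, freed = 0;

    assert(q->count <= TOY_RING);
    for (int i = 0; i < q->count; i++) {
        if (q->inflight[i]->done) {
            free(q->inflight[i]);
            freed++;
        } else {
            q->inflight[kept++] = q->inflight[i];
        }
    }
    q->count = kept;
    q->freed_total += freed;
    return freed;
}

/* Transmit one buffer: reclaim first, then queue.
 * Returns 0 on success, -1 if the ring is still full or allocation fails. */
int toy_start_xmit(struct toy_txq *q)
{
    struct toy_buf *b;

    free_old_xmit(q);
    if (q->count == TOY_RING)
        return -1;
    b = calloc(1, sizeof(*b));
    if (!b)
        return -1;
    q->inflight[q->count++] = b;
    return 0;
}
```

In this model, completed buffers linger until the next transmit call, which is only acceptable because (per the message above) the skbs have already been orphaned and no sender is waiting on them.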
    • virtio_net: return NETDEV_TX_BUSY instead of queueing an extra skb. · 8958f574
      Committed by Rusty Russell
      This effectively reverts 99ffc696
      "virtio: wean net driver off NETDEV_TX_BUSY".
      
      The complexity of queuing an skb (setting a tasklet to re-xmit) is
      questionable, especially once we get rid of the other reason for the
      tasklet in the next patch.
      
      If the skb won't fit in the tx queue, just return NETDEV_TX_BUSY.
      This is frowned upon, so a followup patch uses a more complex solution.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
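The "just return busy when it won't fit" behaviour can be illustrated with a small userspace model. It is a sketch under assumptions, not the driver: `TX_OK`/`TX_BUSY` are hypothetical stand-ins for the kernel's return codes, and `toy_ring`, `toy_xmit` and `toy_complete` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's transmit return codes. */
enum tx_status { TX_OK = 0, TX_BUSY = 1 };

#define RING_SIZE 4 /* illustrative; real rings are much larger */

struct toy_ring {
    const void *slots[RING_SIZE];
    size_t used;       /* entries currently owned by the "host" */
    int queue_stopped; /* set when the stack should stop sending */
};

/* Try to place one packet in the ring; never stash it for a later retry. */
enum tx_status toy_xmit(struct toy_ring *r, const void *pkt)
{
    assert(r->used <= RING_SIZE);
    if (r->used == RING_SIZE) {
        r->queue_stopped = 1;
        return TX_BUSY; /* caller keeps ownership and retries later */
    }
    r->slots[r->used++] = pkt;
    return TX_OK;
}

/* The host consumed the n oldest entries: make room and resume sending. */
void toy_complete(struct toy_ring *r, size_t n)
{
    if (n > r->used)
        n = r->used;
    for (size_t i = n; i < r->used; i++)
        r->slots[i - n] = r->slots[i];
    r->used -= n;
    if (r->used < RING_SIZE)
        r->queue_stopped = 0;
}
```

The point of the design is that the driver holds no extra packet of its own: on a full ring the caller still owns the packet, so no tasklet is needed to re-send a stashed one.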
    • virtio_net: skb_orphan() and nf_reset() in xmit path. · 2b5bbe3b
      Committed by Rusty Russell
      The complex transmit free logic was introduced to avoid hangs on
      removing the ip_conntrack module and also because drivers aren't
      generally supposed to keep stale skbs for unbounded times.
      
      After some debate, it was decided that while doing skb_orphan()
      generally is a rat's nest, we can do it in this driver.  Following
      patches take advantage of this.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  2. 23 Sep, 2009 (2 commits)
  3. 02 Sep, 2009 (1 commit)
  4. 01 Sep, 2009 (1 commit)
  5. 27 Aug, 2009 (1 commit)
  6. 18 Jul, 2009 (1 commit)
  7. 18 Jun, 2009 (1 commit)
    • net: group address list and its count · 31278e71
      Committed by Jiri Pirko
      This patch is inspired by a patch recently posted by Johannes Berg. Basically, it
      groups the address list and its count into a newly introduced
      structure, netdev_hw_addr_list. This brings two benefits:
      1) struct net_device becomes a bit nicer.
      2) In the future it will be possible to operate on lists independently
         of netdevices (by exporting the right functions).
      I wanted to introduce this patch before posting a multicast list conversion.
      Signed-off-by: Jiri Pirko <jpirko@redhat.com>
      
       drivers/net/bnx2.c              |    4 +-
       drivers/net/e1000/e1000_main.c  |    4 +-
       drivers/net/ixgbe/ixgbe_main.c  |    6 +-
       drivers/net/mv643xx_eth.c       |    2 +-
       drivers/net/niu.c               |    4 +-
       drivers/net/virtio_net.c        |   10 ++--
       drivers/s390/net/qeth_l2_main.c |    2 +-
       include/linux/netdevice.h       |   17 +++--
       net/core/dev.c                  |  130 ++++++++++++++++++--------------------
       9 files changed, 89 insertions(+), 90 deletions(-)
      Signed-off-by: David S. Miller <davem@davemloft.net>
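The idea of keeping an address list and its count together in one structure can be sketched as follows. This is a simplified userspace illustration, not the kernel definition: it uses a hand-rolled singly linked list rather than the kernel's list_head, and the `toy_*` names and the fixed 6-byte address length are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_ADDR_LEN 6 /* assumed Ethernet-style address length */

struct toy_hw_addr {
    unsigned char addr[TOY_ADDR_LEN];
    struct toy_hw_addr *next;
};

/* The list head and its element count travel together. */
struct toy_hw_addr_list {
    struct toy_hw_addr *head;
    int count;
};

/* Add an address; returns 0 on success, -1 on allocation failure. */
int toy_addr_add(struct toy_hw_addr_list *l, const unsigned char *addr)
{
    struct toy_hw_addr *ha = malloc(sizeof(*ha));

    if (!ha)
        return -1;
    memcpy(ha->addr, addr, TOY_ADDR_LEN);
    ha->next = l->head;
    l->head = ha;
    l->count++;
    return 0;
}

/* Delete an address; returns 0 if found and removed, -1 otherwise. */
int toy_addr_del(struct toy_hw_addr_list *l, const unsigned char *addr)
{
    for (struct toy_hw_addr **pp = &l->head; *pp; pp = &(*pp)->next) {
        if (memcmp((*pp)->addr, addr, TOY_ADDR_LEN) == 0) {
            struct toy_hw_addr *victim = *pp;

            *pp = victim->next;
            free(victim);
            l->count--;
            assert(l->count >= 0);
            return 0;
        }
    }
    return -1;
}
```

Because every helper takes the list structure rather than the device, the same functions could serve any list of this type, which is the second benefit the message mentions.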
  8. 12 Jun, 2009 (3 commits)
  9. 08 Jun, 2009 (1 commit)
  10. 30 May, 2009 (1 commit)
    • net: convert unicast addr list · ccffad25
      Committed by Jiri Pirko
      This patch converts the unicast address list to a standard list_head using
      the previously introduced struct netdev_hw_addr. It also relaxes the
      locking. The original spinlock (still used for multicast addresses) is not
      needed and no longer protects this list. All
      reading and writing takes place under rtnl (with no changes).
      
      I also removed the possibility of specifying the length of the address
      when adding or deleting a unicast address. It's always dev->addr_len.
      
      The conversion mainly touched the e1000 and ixgbe code, where the
      change is not so trivial.
      Signed-off-by: Jiri Pirko <jpirko@redhat.com>
      
       drivers/net/bnx2.c               |   13 +--
       drivers/net/e1000/e1000_main.c   |   24 +++--
       drivers/net/ixgbe/ixgbe_common.c |   14 ++--
       drivers/net/ixgbe/ixgbe_common.h |    4 +-
       drivers/net/ixgbe/ixgbe_main.c   |    6 +-
       drivers/net/ixgbe/ixgbe_type.h   |    4 +-
       drivers/net/macvlan.c            |   11 +-
       drivers/net/mv643xx_eth.c        |   11 +-
       drivers/net/niu.c                |    7 +-
       drivers/net/virtio_net.c         |    7 +-
       drivers/s390/net/qeth_l2_main.c  |    6 +-
       drivers/scsi/fcoe/fcoe.c         |   16 ++--
       include/linux/netdevice.h        |   18 ++--
       net/8021q/vlan.c                 |    4 +-
       net/8021q/vlan_dev.c             |   10 +-
       net/core/dev.c                   |  195 +++++++++++++++++++++++++++-----------
       net/dsa/slave.c                  |   10 +-
       net/packet/af_packet.c           |    4 +-
       18 files changed, 227 insertions(+), 137 deletions(-)
      Signed-off-by: David S. Miller <davem@davemloft.net>
  11. 02 May, 2009 (2 commits)
  12. 14 Apr, 2009 (1 commit)
  13. 05 Apr, 2009 (1 commit)
  14. 19 Mar, 2009 (1 commit)
  15. 05 Feb, 2009 (5 commits)
  16. 27 Jan, 2009 (1 commit)
  17. 26 Jan, 2009 (1 commit)
  18. 22 Jan, 2009 (2 commits)
  19. 07 Jan, 2009 (1 commit)
  20. 23 Dec, 2008 (1 commit)
  21. 02 Dec, 2008 (1 commit)
  22. 17 Nov, 2008 (3 commits)
    • virtio_net: VIRTIO_NET_F_MRG_RXBUF (improve rcv buffer allocation) · 3f2c31d9
      Committed by Mark McLoughlin
      If segmentation offload is enabled by the host, we currently allocate
      maximum sized packet buffers and pass them to the host. This uses up
      20 ring entries, allowing us to supply only 20 packet buffers to the
      host with a 256 entry ring. This is a huge overhead when receiving
      small packets, and is most keenly felt when receiving MTU sized
      packets from off-host.
      
      The VIRTIO_NET_F_MRG_RXBUF feature flag is set by hosts which support
      using receive buffers which are smaller than the maximum packet size.
      In order to transfer large packets to the guest, the host merges
      together multiple receive buffers to form a larger logical buffer.
      The number of merged buffers is returned to the guest via a field in
      the virtio_net_hdr.
      
      Make use of this support by supplying single page receive buffers to
      the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of
      the payload to the skb's linear data buffer and adjust the fragment
      offset to point to the remaining data. This ensures proper alignment
      and allows us to not use any paged data for small packets. If the
      payload occupies multiple pages, we simply append those pages as
      fragments and free the associated skbs.
      
      This scheme allows us to be efficient in our use of ring entries
      while still supporting large packets. Benchmarking using netperf from
      an external machine to a guest over a 10Gb/s network shows a 100%
      improvement from ~1Gb/s to ~2Gb/s. With a local host->guest benchmark
      with GSO disabled on the host side, throughput was seen to increase
      from 700Mb/s to 1.7Gb/s.
      
      Based on a patch from Herbert Xu.
      Signed-off-by: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv)
      Signed-off-by: David S. Miller <davem@davemloft.net>
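The receive-side arithmetic described above (page-sized buffers merged by the host, with the first 128 bytes of payload copied into the linear area) can be sketched as below. This is an illustration only: the 4096-byte page size and the 12-byte header length are assumptions, the `toy_*` names are hypothetical, and the sketch assumes the header sits at the start of the first buffer.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u /* assumed page size */
#define TOY_HDR_LEN   12u   /* assumed header size, including the buffer count */
#define TOY_COPY_LEN  128u  /* bytes copied to the linear area, per the message */

/* Number of page-sized receive buffers the host must merge for one packet,
 * assuming the header occupies the start of the first buffer. */
unsigned toy_num_buffers(size_t payload_len)
{
    size_t total = TOY_HDR_LEN + payload_len;

    return (unsigned)((total + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE);
}

struct toy_split {
    size_t linear; /* bytes copied into the linear data buffer */
    size_t paged;  /* bytes left in page fragments */
};

/* Split a payload between the linear area and page fragments. */
struct toy_split toy_split_payload(size_t payload_len)
{
    struct toy_split s;

    s.linear = payload_len < TOY_COPY_LEN ? payload_len : TOY_COPY_LEN;
    s.paged = payload_len - s.linear;
    assert(s.linear + s.paged == payload_len);
    return s;
}
```

Under these assumptions an MTU-sized packet needs a single buffer, and a payload of 128 bytes or less ends up entirely in the linear area with no paged data, which matches the small-packet case the message describes.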
    • virtio_net: hook up the set-tso ethtool op · 0276b497
      Committed by Mark McLoughlin
      Seems like an oversight that we have set-tx-csum and set-sg hooked
      up, but not set-tso.
      
      Also leads to the strange situation that if you e.g. disable tx-csum,
      then tso doesn't get disabled.
      Signed-off-by: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio_net: Recycle some more rx buffer pages · 0a888fd1
      Committed by Mark McLoughlin
      Each time we re-fill the recv queue with buffers, we allocate
      one too many skbs and free it again when adding fails. We should
      recycle the pages allocated in this case.
      
      A previous version of this patch made trim_pages() trim trailing
      unused pages from skbs with some paged data, but this actually
      caused a barely measurable slowdown.
      Signed-off-by: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv)
      Signed-off-by: David S. Miller <davem@davemloft.net>
  23. 13 Nov, 2008 (1 commit)
    • netdevice: safe convert to netdev_priv() #part-3 · 8f15ea42
      Committed by Wang Chen
      We have some reasons to kill netdev->priv:
      1. netdev->priv is equal to netdev_priv().
      2. netdev_priv() wraps the calculation of netdev->priv's offset, so
         netdev_priv() is obviously more flexible than netdev->priv.
      But we can't kill netdev->priv yet, because so many drivers reference it
      directly.
      
      This patch is a safe conversion from netdev->priv to netdev_priv(netdev),
      since every netdev->priv touched here is only read.
      The full change is too big to be sent in one mail, so
      I split it into 4 parts and made every part smaller than 100,000 bytes,
      which is the maximum size allowed by vger.
      Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
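The offset calculation that an accessor like netdev_priv() wraps can be modelled in userspace roughly as follows. This is a sketch under assumptions: `toy_netdev`, `toy_alloc_netdev`, `toy_netdev_priv` and the 32-byte alignment are all made up for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_ALIGN 32u /* assumed alignment of the private area */
#define TOY_ALIGN_UP(x) (((x) + TOY_ALIGN - 1) & ~(size_t)(TOY_ALIGN - 1))

/* Hypothetical, heavily reduced device structure. */
struct toy_netdev {
    char name[16];
    unsigned mtu;
};

/* Allocate the device and its private area as one zeroed block. */
struct toy_netdev *toy_alloc_netdev(size_t sizeof_priv)
{
    return calloc(1, TOY_ALIGN_UP(sizeof(struct toy_netdev)) + sizeof_priv);
}

/* The private area starts at a fixed, aligned offset past the device
 * structure, so no stored pointer is needed to find it. */
void *toy_netdev_priv(struct toy_netdev *dev)
{
    assert(dev != NULL);
    return (char *)dev + TOY_ALIGN_UP(sizeof(struct toy_netdev));
}
```

Because the private data is located by arithmetic alone, callers that only read the pointer can switch from a stored field to the accessor without any change in behaviour, which is why the message calls the conversion safe.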
  24. 28 Oct, 2008 (1 commit)
  25. 25 Jul, 2008 (3 commits)
    • virtio: Recycle unused recv buffer pages for large skbs in net driver · fb6813f4
      Committed by Rusty Russell
      If we hack the virtio_net driver to always allocate full-sized (64k+)
      skbuffs, the driver slows down (lguest numbers):
      
        Time to receive 1GB (small buffers): 10.85 seconds
        Time to receive 1GB (64k+ buffers): 24.75 seconds
      
      Of course, large buffers use up more space in the ring, so we increase
      that from 128 to 2048:
      
        Time to receive 1GB (64k+ buffers, 2k ring): 16.61 seconds
      
      If we recycle pages rather than using alloc_page/free_page:
      
        Time to receive 1GB (64k+ buffers, 2k ring, recycle pages): 10.81 seconds
      
      This demonstrates that with efficient allocation, we don't need to
      have a separate "small buffer" queue.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
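Recycling pages instead of allocating and freeing them each time can be sketched with a simple free list. This is a userspace illustration under assumptions: `toy_pool`, `toy_get_page` and `toy_give_page` are hypothetical names, malloc stands in for the page allocator, and the list link is stored inside the idle page itself purely to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096 /* assumed page size */

/* While a page sits idle in the pool, its first bytes hold the list link. */
struct toy_page {
    struct toy_page *next;
};

struct toy_pool {
    struct toy_page *free_list;
    unsigned long fresh_allocs; /* how often we fell back to the allocator */
};

/* Take a page from the pool, allocating a fresh one only if it is empty. */
void *toy_get_page(struct toy_pool *p)
{
    struct toy_page *pg = p->free_list;
    void *fresh;

    if (pg) {
        p->free_list = pg->next;
        return pg;
    }
    fresh = malloc(TOY_PAGE_SIZE);
    if (fresh)
        p->fresh_allocs++;
    return fresh;
}

/* Return an unused page to the pool instead of freeing it. */
void toy_give_page(struct toy_pool *p, void *page)
{
    struct toy_page *pg = page;

    assert(page != NULL);
    pg->next = p->free_list;
    p->free_list = pg;
}
```

In this model a page that goes unused on one receive is handed straight back out on the next, so the allocator is touched only when the pool runs dry.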
    • virtio net: Allow receiving SG packets · 97402b96
      Committed by Herbert Xu
      Finally this patch lets virtio_net receive GSO packets in addition
      to sending them.  This can definitely be optimised for the non-GSO
      case.  For comparison the Xen approach stores one page in each skb
      and uses subsequent skb's pages to construct an SG skb instead of
      preallocating the maximum amount of pages per skb.
      
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added feature bits)
    • virtio net: Add ethtool ops for SG/GSO · a9ea3fc6
      Committed by Herbert Xu
      This patch adds some basic ethtool operations to virtio_net so
      I could test SG without GSO (which was really useful because TSO
      turned out to be buggy :)
      
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (remove MTU setting)