1. 14 Oct, 2009 1 commit
  2. 02 Oct, 2009 1 commit
  3. 24 Sep, 2009 6 commits
    • virtio_net: Check for room in the vq before adding buffer · 0aea51c3
      Amit Shah authored
      Saves us one cycle of alloc-add-free if the queue was full.
      Signed-off-by: Amit Shah <amit.shah@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (modified)
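The saving described in this commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `vq_sketch`, `fill_alloc_first` and `fill_check_first` are hypothetical names, and allocation is simulated with counters rather than real kernel buffers.

```c
/* Hypothetical stand-in for a virtqueue: tracks free slots and counts work. */
struct vq_sketch {
    unsigned int num_free; /* free ring entries */
    unsigned int allocs;   /* simulated buffer allocations */
    unsigned int frees;    /* simulated frees of buffers that were never used */
};

/* Old pattern: allocate first, find the queue full, then free again. */
static unsigned int fill_alloc_first(struct vq_sketch *vq)
{
    unsigned int added = 0;

    for (;;) {
        vq->allocs++;            /* simulate allocating a buffer */
        if (vq->num_free == 0) { /* adding fails: the queue is full */
            vq->frees++;         /* the fresh buffer is thrown away */
            break;
        }
        vq->num_free--;
        added++;
    }
    return added;
}

/* New pattern: check for room before allocating anything. */
static unsigned int fill_check_first(struct vq_sketch *vq)
{
    unsigned int added = 0;

    while (vq->num_free > 0) {
        vq->allocs++;
        vq->num_free--;
        added++;
    }
    return added;
}
```

With four free entries, both variants add four buffers, but only the first one performs a fifth allocation that is immediately freed.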
    • virtio_net: avoid (most) NETDEV_TX_BUSY by stopping queue early. · 48925e37
      Rusty Russell authored
      Now we can tell the theoretical capacity remaining in the output
      queue, virtio_net can waste entries by stopping the queue early.
      
      It doesn't work in the case of indirect buffers and kmalloc failure,
      but that's rare (we could drop the packet in that case, but other
      drivers return TX_BUSY for similar reasons).
      
      For the record, I think this patch reflects poorly on the linux
      network API.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Dinesh Subhraveti <dineshs@us.ibm.com>
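The "stop early" idea can be sketched as a capacity check made after each packet is queued. The names below are hypothetical, and the fragment limit of 18 is an assumption (the kernel's MAX_SKB_FRAGS depends on the page size); the worst case is taken as one descriptor for the header, one for the linear data, and one per fragment.

```c
#include <stdbool.h>

/* Assumed value for illustration; not taken from the commit itself. */
#define MAX_SKB_FRAGS_SKETCH 18u

/* Worst-case descriptors for one packet: header + linear data + fragments. */
static unsigned int worst_case_descs(void)
{
    return 2u + MAX_SKB_FRAGS_SKETCH;
}

/*
 * After queueing a packet, stop the queue if the next worst-case packet
 * might not fit. This wastes some ring entries but avoids having to
 * return a busy status later.
 */
static bool should_stop_queue(unsigned int capacity_left)
{
    return capacity_left < worst_case_descs();
}
```

Under these assumptions the queue is stopped as soon as fewer than 20 entries remain, even though a small packet would still fit.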
    • virtio_net: formalize skb_vnet_hdr · b3f24698
      Rusty Russell authored
      We put the virtio_net_hdr into the skb's cb region; turn this into a
      union to clean up the code slightly and allow future expansion.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Cc: Dinesh Subhraveti <dineshs@us.ibm.com>
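A rough userspace sketch of such a union follows. The type names are hypothetical, the field layout follows the legacy virtio-net header as I understand it, and the 48-byte size of the skb control buffer is an assumption stated in the code.

```c
#include <stddef.h>
#include <stdint.h>

#define SKB_CB_SIZE_SKETCH 48 /* assumed size of the skb's cb region */

/* Basic header (field names modelled on the legacy virtio-net header). */
struct vnet_hdr_sketch {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
};

/* Extended header used with mergeable receive buffers. */
struct vnet_hdr_mrg_sketch {
    struct vnet_hdr_sketch hdr;
    uint16_t num_buffers;
};

/* One storage area viewed as either header; new variants can be added. */
union skb_vnet_hdr_sketch {
    struct vnet_hdr_sketch hdr;
    struct vnet_hdr_mrg_sketch mhdr;
};

/* The union must fit in the space the skb reserves for driver use. */
_Static_assert(sizeof(union skb_vnet_hdr_sketch) <= SKB_CB_SIZE_SKETCH,
               "header union must fit in the cb region");
```

Because both members start with the same basic header, code can fill in `hdr` and read it back through `mhdr.hdr` without a copy.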
    • virtio_net: don't free buffers in xmit ring · b0c39dbd
      Rusty Russell authored
      The virtio_net driver is complicated by the two methods of freeing old
      xmit buffers (in addition to freeing old ones at the start of the xmit
      path).
      
      The original code used a 1/10 second timer attached to xmit_free(),
      reset on every xmit.  Before we orphaned skbs on xmit, the
      transmitting userspace could block with a full socket until the timer
      fired, the skb destructor was called, and they were re-woken.
      
      So we added the VIRTIO_F_NOTIFY_ON_EMPTY feature: supporting devices
      send an interrupt (even if normally suppressed) on an empty xmit ring
      which makes us schedule xmit_tasklet().  This was a benchmark win.
      
      Unfortunately, VIRTIO_F_NOTIFY_ON_EMPTY makes quite a lot of work: a
      host which is faster than the guest will fire the interrupt every xmit
      packet (slowing the guest down further).  Attempting mitigation in the
      host adds overhead of userspace timers (possibly with the additional
      pain of signals), and risks increasing latency anyway if you get it
      wrong.
      
      In practice, this effect was masked by benchmarks which take advantage
      of GSO (with its inherent transmit batching), but it's still there.
      
      Now we orphan xmitted skbs, the pressure is off: remove both paths and
      no longer request VIRTIO_F_NOTIFY_ON_EMPTY.  Note that the current
      QEMU will notify us even if we don't negotiate this feature (legal,
      but suboptimal); a patch is outstanding to improve that.
      
      Move the skb_orphan/nf_reset to after we've done the send and notified
      the other end, for a slight optimization.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Mark McLoughlin <markmc@redhat.com>
    • virtio_net: return NETDEV_TX_BUSY instead of queueing an extra skb. · 8958f574
      Rusty Russell authored
      This effectively reverts 99ffc696
      "virtio: wean net driver off NETDEV_TX_BUSY".
      
      The complexity of queuing an skb (setting a tasklet to re-xmit) is
      questionable, especially once we get rid of the other reason for the
      tasklet in the next patch.
      
      If the skb won't fit in the tx queue, just return NETDEV_TX_BUSY.
      This is frowned upon, so a followup patch uses a more complex solution.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
    • virtio_net: skb_orphan() and nf_reset() in xmit path. · 2b5bbe3b
      Rusty Russell authored
      The complex transmit free logic was introduced to avoid hangs on
      removing the ip_conntrack module and also because drivers aren't
      generally supposed to keep stale skbs for unbounded times.
      
      After some debate, it was decided that while doing skb_orphan()
      generally is a rat's nest, we can do it in this driver.  Following
      patches take advantage of this.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  4. 23 Sep, 2009 2 commits
  5. 02 Sep, 2009 1 commit
  6. 01 Sep, 2009 1 commit
  7. 27 Aug, 2009 1 commit
  8. 18 Jul, 2009 1 commit
  9. 18 Jun, 2009 1 commit
    • net: group address list and its count · 31278e71
      Jiri Pirko authored
      This patch is inspired by a patch recently posted by Johannes Berg. Basically,
      it groups the address list and its count into a newly introduced structure,
      netdev_hw_addr_list. This brings two benefits:
      1) struct net_device becomes a bit nicer.
      2) In the future it will be possible to operate on lists independently of
         netdevices (by exporting the right functions).
      I wanted to introduce this patch before posting the multicast list conversion.
      Signed-off-by: Jiri Pirko <jpirko@redhat.com>
      
       drivers/net/bnx2.c              |    4 +-
       drivers/net/e1000/e1000_main.c  |    4 +-
       drivers/net/ixgbe/ixgbe_main.c  |    6 +-
       drivers/net/mv643xx_eth.c       |    2 +-
       drivers/net/niu.c               |    4 +-
       drivers/net/virtio_net.c        |   10 ++--
       drivers/s390/net/qeth_l2_main.c |    2 +-
       include/linux/netdevice.h       |   17 +++--
       net/core/dev.c                  |  130 ++++++++++++++++++--------------------
       9 files changed, 89 insertions(+), 90 deletions(-)
      Signed-off-by: David S. Miller <davem@davemloft.net>
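The grouping described here can be pictured with a small userspace sketch. It is not the kernel structure: the names carry a `_sketch` suffix to mark them as hypothetical, a singly linked list stands in for the kernel's list type, and the 6-byte address length is an assumption.

```c
#include <stdlib.h>
#include <string.h>

#define ADDR_LEN_SKETCH 6 /* assumed Ethernet-style address length */

/* One hardware address entry. */
struct hw_addr_sketch {
    unsigned char addr[ADDR_LEN_SKETCH];
    struct hw_addr_sketch *next;
};

/* The list and its count travel together instead of as two loose fields. */
struct hw_addr_list_sketch {
    struct hw_addr_sketch *head;
    int count;
};

/* Add an address; the count is updated in the same place as the list. */
static int hw_addr_list_add(struct hw_addr_list_sketch *list,
                            const unsigned char *addr)
{
    struct hw_addr_sketch *ha = malloc(sizeof(*ha));

    if (!ha)
        return -1;
    memcpy(ha->addr, addr, ADDR_LEN_SKETCH);
    ha->next = list->head;
    list->head = ha;
    list->count++;
    return 0;
}

/* Free every entry and reset the count. */
static void hw_addr_list_flush(struct hw_addr_list_sketch *list)
{
    while (list->head) {
        struct hw_addr_sketch *ha = list->head;

        list->head = ha->next;
        free(ha);
    }
    list->count = 0;
}
```

Keeping both fields in one structure means helper functions can take the list alone, which is what allows them to be used independently of a particular device.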
  10. 12 Jun, 2009 3 commits
  11. 08 Jun, 2009 1 commit
  12. 30 May, 2009 1 commit
    • net: convert unicast addr list · ccffad25
      Jiri Pirko authored
      This patch converts the unicast address list to a standard list_head using
      the previously introduced struct netdev_hw_addr. It also relaxes the
      locking. The original spinlock (still used for multicast addresses) is not
      needed and is no longer used to protect this list. All
      reading and writing takes place under rtnl (with no changes).
      
      I also removed the possibility of specifying the length of the address
      when adding or deleting a unicast address. It's always dev->addr_len.
      
      The conversion mainly touched the e1000 and ixgbe code, where the
      change is not so trivial.
      Signed-off-by: Jiri Pirko <jpirko@redhat.com>
      
       drivers/net/bnx2.c               |   13 +--
       drivers/net/e1000/e1000_main.c   |   24 +++--
       drivers/net/ixgbe/ixgbe_common.c |   14 ++--
       drivers/net/ixgbe/ixgbe_common.h |    4 +-
       drivers/net/ixgbe/ixgbe_main.c   |    6 +-
       drivers/net/ixgbe/ixgbe_type.h   |    4 +-
       drivers/net/macvlan.c            |   11 +-
       drivers/net/mv643xx_eth.c        |   11 +-
       drivers/net/niu.c                |    7 +-
       drivers/net/virtio_net.c         |    7 +-
       drivers/s390/net/qeth_l2_main.c  |    6 +-
       drivers/scsi/fcoe/fcoe.c         |   16 ++--
       include/linux/netdevice.h        |   18 ++--
       net/8021q/vlan.c                 |    4 +-
       net/8021q/vlan_dev.c             |   10 +-
       net/core/dev.c                   |  195 +++++++++++++++++++++++++++-----------
       net/dsa/slave.c                  |   10 +-
       net/packet/af_packet.c           |    4 +-
       18 files changed, 227 insertions(+), 137 deletions(-)
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 02 May, 2009 2 commits
  14. 14 Apr, 2009 1 commit
  15. 05 Apr, 2009 1 commit
  16. 19 Mar, 2009 1 commit
  17. 05 Feb, 2009 5 commits
  18. 27 Jan, 2009 1 commit
  19. 26 Jan, 2009 1 commit
  20. 22 Jan, 2009 2 commits
  21. 07 Jan, 2009 1 commit
  22. 23 Dec, 2008 1 commit
  23. 02 Dec, 2008 1 commit
  24. 17 Nov, 2008 3 commits
    • virtio_net: VIRTIO_NET_F_MRG_RXBUF (improve rcv buffer allocation) · 3f2c31d9
      Mark McLoughlin authored
      If segmentation offload is enabled by the host, we currently allocate
      maximum sized packet buffers and pass them to the host. This uses up
      20 ring entries, allowing us to supply only 20 packet buffers to the
      host with a 256 entry ring. This is a huge overhead when receiving
      small packets, and is most keenly felt when receiving MTU sized
      packets from off-host.
      
      The VIRTIO_NET_F_MRG_RXBUF feature flag is set by hosts which support
      using receive buffers which are smaller than the maximum packet size.
      In order to transfer large packets to the guest, the host merges
      together multiple receive buffers to form a larger logical buffer.
      The number of merged buffers is returned to the guest via a field in
      the virtio_net_hdr.
      
      Make use of this support by supplying single page receive buffers to
      the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of
      the payload to the skb's linear data buffer and adjust the fragment
      offset to point to the remaining data. This ensures proper alignment
      and allows us to not use any paged data for small packets. If the
      payload occupies multiple pages, we simply append those pages as
      fragments and free the associated skbs.
      
      This scheme allows us to be efficient in our use of ring entries
      while still supporting large packets. Benchmarking using netperf from
      an external machine to a guest over a 10Gb/s network shows a 100%
      improvement from ~1Gb/s to ~2Gb/s. With a local host->guest benchmark
      with GSO disabled on the host side, throughput was seen to increase
      from 700Mb/s to 1.7Gb/s.
      
      Based on a patch from Herbert Xu.
      Signed-off-by: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv)
      Signed-off-by: David S. Miller <davem@davemloft.net>
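The buffer arithmetic behind this scheme can be sketched as follows. The helper names are hypothetical, and the sketch assumes 4096-byte pages, a 12-byte header that appears only in the first buffer, and a host that fills each buffer completely before moving on to the next.

```c
#include <stddef.h>

#define PAGE_SIZE_SKETCH   4096u /* assumed page size */
#define MRG_HDR_LEN_SKETCH 12u   /* assumed size of the mergeable-buffer header */
#define COPY_LEN_SKETCH    128u  /* bytes copied into the linear area, per the commit text */

/* Number of single-page receive buffers the host merges for one payload. */
static unsigned int buffers_needed(size_t payload_len)
{
    size_t total = MRG_HDR_LEN_SKETCH + payload_len;

    return (unsigned int)((total + PAGE_SIZE_SKETCH - 1) / PAGE_SIZE_SKETCH);
}

/* Bytes of payload copied into the skb's linear data buffer. */
static size_t linear_copy_len(size_t payload_len)
{
    return payload_len < COPY_LEN_SKETCH ? payload_len : COPY_LEN_SKETCH;
}
```

Under these assumptions an MTU-sized packet consumes a single page-sized buffer, while only a maximum-sized GSO packet needs many, which is the efficiency gain the commit message describes.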
    • virtio_net: hook up the set-tso ethtool op · 0276b497
      Mark McLoughlin authored
      Seems like an oversight that we have set-tx-csum and set-sg hooked
      up, but not set-tso.
      
      Also leads to the strange situation that if you e.g. disable tx-csum,
      then tso doesn't get disabled.
      Signed-off-by: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio_net: Recycle some more rx buffer pages · 0a888fd1
      Mark McLoughlin authored
      Each time we re-fill the recv queue with buffers, we allocate
      one too many skbs and free it again when adding fails. We should
      recycle the pages allocated in this case.
      
      A previous version of this patch made trim_pages() trim trailing
      unused pages from skbs with some paged data, but this actually
      caused a barely measurable slowdown.
      Signed-off-by: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv)
      Signed-off-by: David S. Miller <davem@davemloft.net>
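Page recycling of this kind can be modelled with a simple free list. This is a userspace sketch with hypothetical names, not the driver's code: pages handed back are kept on a list and reused before anything new is allocated.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a page; only the free-list link matters here. */
struct page_sketch {
    struct page_sketch *next;
};

/* Pool of recycled pages plus a counter of fresh allocations. */
struct pool_sketch {
    struct page_sketch *free_list;
    unsigned int fresh_allocs;
};

/* Return an unused page to the pool instead of freeing it. */
static void give_page(struct pool_sketch *pool, struct page_sketch *page)
{
    page->next = pool->free_list;
    pool->free_list = page;
}

/* Take a recycled page if one exists, otherwise allocate a new one. */
static struct page_sketch *take_page(struct pool_sketch *pool)
{
    struct page_sketch *page = pool->free_list;

    if (page) {
        pool->free_list = page->next;
        page->next = NULL;
        return page;
    }
    page = calloc(1, sizeof(*page));
    if (page)
        pool->fresh_allocs++;
    return page;
}
```

When the refill loop over-allocates by one buffer, giving its pages back to such a pool means the next refill starts without touching the allocator.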