- 12 6月, 2009 1 次提交
-
-
由 Herbert Xu 提交于
We need to enforce the IP alignment on the non-mergeable RX path just like the other RX path. Not doing so results in misaligned IP headers. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 6月, 2009 1 次提交
-
-
由 Herbert Xu 提交于
Through a bug in the tun driver, I noticed that virtio_net is producing bogus hdr_len values. In particular, it only includes the IP header in the linear area, and excludes the entire TCP header. This causes the TCP header to be copied twice for each packet. (The bug omitted the second copy :) This patch corrects this. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 5月, 2009 1 次提交
-
-
由 Jiri Pirko 提交于
This patch converts unicast address list to standard list_head using previously introduced struct netdev_hw_addr. It also relaxes the locking. Original spinlock (still used for multicast addresses) is not needed and is no longer used for a protection of this list. All reading and writing takes place under rtnl (with no changes). I also removed a possibility to specify the length of the address while adding or deleting unicast address. It's always dev->addr_len. The convertion touched especially e1000 and ixgbe codes when the change is not so trivial. Signed-off-by: NJiri Pirko <jpirko@redhat.com> drivers/net/bnx2.c | 13 +-- drivers/net/e1000/e1000_main.c | 24 +++-- drivers/net/ixgbe/ixgbe_common.c | 14 ++-- drivers/net/ixgbe/ixgbe_common.h | 4 +- drivers/net/ixgbe/ixgbe_main.c | 6 +- drivers/net/ixgbe/ixgbe_type.h | 4 +- drivers/net/macvlan.c | 11 +- drivers/net/mv643xx_eth.c | 11 +- drivers/net/niu.c | 7 +- drivers/net/virtio_net.c | 7 +- drivers/s390/net/qeth_l2_main.c | 6 +- drivers/scsi/fcoe/fcoe.c | 16 ++-- include/linux/netdevice.h | 18 ++-- net/8021q/vlan.c | 4 +- net/8021q/vlan_dev.c | 10 +- net/core/dev.c | 195 +++++++++++++++++++++++++++----------- net/dsa/slave.c | 10 +- net/packet/af_packet.c | 4 +- 18 files changed, 227 insertions(+), 137 deletions(-) Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 5月, 2009 2 次提交
-
-
由 Alex Williamson 提交于
Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alex Williamson 提交于
We were avoiding calling sg_init* on scatterlists passed into virtnet_send_command to prevent extraneous end markers. This caused build warnings for uninitialized variables. Cleanup the code to create proper scatterlists. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 4月, 2009 1 次提交
-
-
由 Alexander Beregalov 提交于
Signed-off-by: NAlexander Beregalov <a.beregalov@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 4月, 2009 1 次提交
-
-
由 Alex Williamson 提交于
VIRTIO_NET_F_MAC indicates the presence of the mac field in config space, not the validity of the value it contains. Allow the mac to be changed at runtime, but only push the change into config space with the VIRTIO_NET_F_MAC feature present. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 3月, 2009 1 次提交
-
-
由 Pantelis Koukousoulas 提交于
Impact: Make NetworkManager work with virtio_net For now the semantics are simple: There is always carrier. This allows a seamless experience with e.g., qemu/kvm where NetworkManager just configures and sets up everything automagically. If/when a generally agreed-upon way to control carrier on/off in the emulator/hypervisor level emerges, it will be trivial to extend the driver to support that too, but for now even this 2-liner makes user experience that much better. Signed-off-by: NPantelis Koukousoulas <pktoss@gmail.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 2月, 2009 5 次提交
-
-
由 Alex Williamson 提交于
Many physical NICs let the OS re-program the "hardware" MAC address. Virtual NICs should allow this too. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alex Williamson 提交于
VLAN filtering allows the hypervisor to drop packets from VLANs that we're not a part of, further reducing the number of extraneous packets recieved. This makes use of the VLAN virtqueue command class. The CTRL_VLAN feature bit tells us whether the backend supports VLAN filtering. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alex Williamson 提交于
Make use of the MAC control virtqueue class to support a MAC filter table. The filter table is managed by the hypervisor. We consider the table to be available if the CTRL_RX feature bit is set. We leave it to the hypervisor to manage the table and enable promiscuous or all-multi mode as necessary depending on the resources available to it. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alex Williamson 提交于
Make use of the RX_MODE control virtqueue class to enable the set_rx_mode netdev interface. This allows us to selectively enable/disable promiscuous and allmulti mode so we don't see packets we don't want. For now, we automatically enable these as needed if additional unicast or multicast addresses are requested. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alex Williamson 提交于
This will be used for RX mode, MAC filter table, VLAN filtering, etc... The control transaction consists of one or more "out" sg entries and one or more "in" sg entries. The first out entry contains a header defining the class and command. Additional out entries may provide data for the command. The last in entry provides a status response back from the command. Virtqueues typically run asynchronous, running a callback function when there's data in the channel. We can't readily make use of this in the command paths where we need to use this. Instead, we kick the virtqueue and spin. The kick causes an I/O write, triggering an immediate trap into the hypervisor. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 1月, 2009 1 次提交
-
-
由 Ira W. Snyder 提交于
Without this fix, virtio_net makes incorrect usage of scatterlists. It sets the end of the scatterlist chain after the first element, despite the fact that more entries come after it. If you try to run dma_map_sg() on one of the scatterlists given to you by add_buf(), you will get a null pointer oops. Signed-off-by: NIra W. Snyder <iws@ovro.caltech.edu> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 1月, 2009 1 次提交
-
-
由 Alex Williamson 提交于
802.1Q expanded the maximum ethernet frame size by 4 bytes for the VLAN tag. We're not taking this into account in virtio_net, which means the buffers we provide to the backend in the virtqueue RX ring aren't big enough to hold a full MTU VLAN packet. For QEMU/KVM, this results in the backend exiting with a packet truncation error. Signed-off-by: NAlex Williamson <alex.williamson@hp.com> Acked-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 1月, 2009 2 次提交
-
-
由 Mark McLoughlin 提交于
Allow the host to inform us that the link is down by adding a VIRTIO_NET_F_STATUS which indicates that device status is available in virtio_net config. This is currently useful for simulating link down conditions (e.g. using proposed qemu 'set_link' monitor command) but would also be needed if we were to support device assignment via virtio. Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added future masking) Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
Following the removal of the unused struct net_device * parameter from the NAPI functions named *netif_rx_* in commit 908a7a16, they are exactly equivalent to the corresponding *napi_* functions and are therefore redundant. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 1月, 2009 1 次提交
-
-
由 Stephen Hemminger 提交于
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Acked-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 12月, 2008 1 次提交
-
-
由 Neil Horman 提交于
When the napi api was changed to separate its 1:1 binding to the net_device struct, the netif_rx_[prep|schedule|complete] api failed to remove the now vestigual net_device structure parameter. This patch cleans up that api by properly removing it.. Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 12月, 2008 1 次提交
-
-
由 Mark McLoughlin 提交于
We don't really have a max tx packet size limit, so allow configuring the device with up to 64k tx MTU. Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
- 17 11月, 2008 3 次提交
-
-
由 Mark McLoughlin 提交于
If segmentation offload is enabled by the host, we currently allocate maximum sized packet buffers and pass them to the host. This uses up 20 ring entries, allowing us to supply only 20 packet buffers to the host with a 256 entry ring. This is a huge overhead when receiving small packets, and is most keenly felt when receiving MTU sized packets from off-host. The VIRTIO_NET_F_MRG_RXBUF feature flag is set by hosts which support using receive buffers which are smaller than the maximum packet size. In order to transfer large packets to the guest, the host merges together multiple receive buffers to form a larger logical buffer. The number of merged buffers is returned to the guest via a field in the virtio_net_hdr. Make use of this support by supplying single page receive buffers to the host. On receive, we extract the virtio_net_hdr, copy 128 bytes of the payload to the skb's linear data buffer and adjust the fragment offset to point to the remaining data. This ensures proper alignment and allows us to not use any paged data for small packets. If the payload occupies multiple pages, we simply append those pages as fragments and free the associated skbs. This scheme allows us to be efficient in our use of ring entries while still supporting large packets. Benchmarking using netperf from an external machine to a guest over a 10Gb/s network shows a 100% improvement from ~1Gb/s to ~2Gb/s. With a local host->guest benchmark with GSO disabled on the host side, throughput was seen to increase from 700Mb/s to 1.7Gb/s. Based on a patch from Herbert Xu. Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv) Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mark McLoughlin 提交于
Seems like an oversight that we have set-tx-csum and set-sg hooked up, but not set-tso. Also leads to the strange situation that if you e.g. disable tx-csum, then tso doesn't get disabled. Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mark McLoughlin 提交于
Each time we re-fill the recv queue with buffers, we allocate one too many skbs and free it again when adding fails. We should recycle the pages allocated in this case. A previous version of this patch made trim_pages() trim trailing unused pages from skbs with some paged data, but this actually caused a barely measurable slowdown. Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (use netdev_priv) Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 11月, 2008 1 次提交
-
-
由 Wang Chen 提交于
We have some reasons to kill netdev->priv: 1. netdev->priv is equal to netdev_priv(). 2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously netdev_priv() is more flexible than netdev->priv. But we cann't kill netdev->priv, because so many drivers reference to it directly. This patch is a safe convert for netdev->priv to netdev_priv(netdev). Since all of the netdev->priv is only for read. But it is too big to be sent in one mail. I split it to 4 parts and make every part smaller than 100,000 bytes, which is max size allowed by vger. Signed-off-by: NWang Chen <wangchen@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 10月, 2008 1 次提交
-
-
由 Johannes Berg 提交于
This converts pretty much everything to print_mac. There were a few things that had conflicts which I have just dropped for now, no harm done. I've built an allyesconfig with this and looked at the files that weren't built very carefully, but it's a huge patch. Signed-off-by: NJohannes Berg <johannes@sipsolutions.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 7月, 2008 4 次提交
-
-
由 Rusty Russell 提交于
If we hack the virtio_net driver to always allocate full-sized (64k+) skbuffs, the driver slows down (lguest numbers): Time to receive 1GB (small buffers): 10.85 seconds Time to receive 1GB (64k+ buffers): 24.75 seconds Of course, large buffers use up more space in the ring, so we increase that from 128 to 2048: Time to receive 1GB (64k+ buffers, 2k ring): 16.61 seconds If we recycle pages rather than using alloc_page/free_page: Time to receive 1GB (64k+ buffers, 2k ring, recycle pages): 10.81 seconds This demonstrates that with efficient allocation, we don't need to have a separate "small buffer" queue. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Herbert Xu 提交于
Finally this patch lets virtio_net receive GSO packets in addition to sending them. This can definitely be optimised for the non-GSO case. For comparison the Xen approach stores one page in each skb and uses subsequent skb's pages to construct an SG skb instead of preallocating the maximum amount of pages per skb. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added feature bits)
-
由 Herbert Xu 提交于
This patch adds some basic ethtool operations to virtio_net so I could test SG without GSO (which was really useful because TSO turned out to be buggy :) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (remove MTU setting)
-
由 Mark McLoughlin 提交于
On Mon, 2008-05-26 at 17:42 +1000, Rusty Russell wrote: > If we fail to transmit a packet, we assume the queue is full and put > the skb into last_xmit_skb. However, if more space frees up before we > xmit it, we loop, and the result can be transmitting the same skb twice. > > Fix is simple: set skb to NULL if we've used it in some way, and check > before sending. ... > diff -r 564237b31993 drivers/net/virtio_net.c > --- a/drivers/net/virtio_net.c Mon May 19 12:22:00 2008 +1000 > +++ b/drivers/net/virtio_net.c Mon May 19 12:24:58 2008 +1000 > @@ -287,21 +287,25 @@ again: > free_old_xmit_skbs(vi); > > /* If we has a buffer left over from last time, send it now. */ > - if (vi->last_xmit_skb) { > + if (unlikely(vi->last_xmit_skb)) { > if (xmit_skb(vi, vi->last_xmit_skb) != 0) { > /* Drop this skb: we only queue one. */ > vi->dev->stats.tx_dropped++; > kfree_skb(skb); > + skb = NULL; > goto stop_queue; > } > vi->last_xmit_skb = NULL; With this, may drop an skb and then later in the function discover that we could have sent it after all. Poor wee skb :) How about the incremental patch below? Cheers, Mark. Subject: [PATCH] virtio_net: Delay dropping tx skbs Currently we drop the skb in start_xmit() if we have a queued buffer and fail to transmit it. However, if we delay dropping it until we've stopped the queue and enabled the tx notification callback, then there is a chance space might become available for it. Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
- 11 7月, 2008 1 次提交
-
-
由 Mark McLoughlin 提交于
We can handle receiving partial csums, so set the appropriate feature bit. Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 11 6月, 2008 3 次提交
-
-
由 Rusty Russell 提交于
virtio_net uses a timer to free old transmitted packets, rather than leaving callbacks enabled all the time. If the host promises to always notify us when the transmit ring is empty, we can free packets at that point and avoid the timer. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Mark McLoughlin 提交于
virtio_net currently only frees old transmit skbs just before queueing new ones. If the queue is full, it then enables interrupts and waits for notification that more work has been performed. However, a side-effect of this scheme is that there are always xmit skbs left dangling when no new packets are sent, against the Documentation/networking/driver.txt guideline: "... it is not allowed for your TX mitigation scheme to let TX packets "hang out" in the TX ring unreclaimed forever if no new TX packets are sent." Add a timer to ensure that any time we queue new TX skbs, we will shortly free them again. This fixes an easily reproduced hang at shutdown where iptables attempts to unload nf_conntrack and nf_conntrack waits for an skb it is tracking to be freed, but virtio_net never frees it. Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Mark McLoughlin 提交于
hdr->csum_start is the offset from the start of the ethernet header to the transport layer checksum field. skb->csum_start is the offset from skb->head. skb_partial_csum_set() assumes that skb->data points to the ethernet header - i.e. it computes skb->csum_start by adding the headroom to hdr->csum_start. Since eth_type_trans() skb_pull()s the ethernet header, skb_partial_csum_set() should be called before eth_type_trans(). (Without this patch, GSO packets from a guest to the world outside the host are corrupted). Signed-off-by: NMark McLoughlin <markmc@redhat.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 31 5月, 2008 2 次提交
-
-
由 Rusty Russell 提交于
Because we cache the last failed-to-xmit packet, if there are no packets queued behind that one we may never send it (reproduced here as TCP stalls, "cured" by an outgoing ping). Cc: Mark McLoughlin <markmc@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Rusty Russell 提交于
If we fail to transmit a packet, we assume the queue is full and put the skb into last_xmit_skb. However, if more space frees up before we xmit it, we loop, and the result can be transmitting the same skb twice. Fix is simple: set skb to NULL if we've used it in some way, and check before sending. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 23 5月, 2008 1 次提交
-
-
由 Wang Chen 提交于
Use standard routine for queue purging. Signed-off-by: NWang Chen <wangchen@cn.fujitsu.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 02 5月, 2008 4 次提交
-
-
由 Rusty Russell 提交于
A recent proposed feature addition to the virtio block driver revealed some flaws in the API: in particular, we assume that feature negotiation is complete once a driver's probe function returns. There is nothing in the API to require this, however, and even I didn't notice when it was violated. So instead, we require the driver to specify what features it supports in a table, we can then move the feature negotiation into the virtio core. The intersection of device and driver features are presented in a new 'features' bitmap in the struct virtio_device. Note that this highlights the difference between Linux unsigned-long bitmaps where each unsigned long is in native endian, and a straight-forward little-endian array of bytes. Drivers can still remove feature bits in their probe routine if they really have to. API changes: - dev->config->feature() no longer gets and acks a feature. - drivers should advertise their features in the 'feature_table' field - use virtio_has_feature() for extra sanity when checking feature bits Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
So, we previously had a 'VIRTIO_NET_F_GSO' bit which meant that 'the host can handle csum offload, and any TSO (v4&v6 incl ECN) or UFO packets you might want to send. I thought this was good enough for Linux, but it actually isn't, since we don't do UFO in software. So, add separate feature bits for what the host can handle. Add equivalent ones for the guest to say what it can handle, because LRO is coming too (thanks Herbert!). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-
由 Rusty Russell 提交于
Herbert tells me that returning NETDEV_TX_BUSY from hard_start_xmit is seen as a poor thing to do; we should cache the packet and stop the queue. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
-
由 Rusty Russell 提交于
Herbert Xu points out (within another patch) that my scatterlists are too short: one entry for the gso header, one for the skb->data, and MAX_SKB_FRAGS for all the fragments. Fix both xmit and recv sides (recv currently unused, coming in later patch). Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
-