- 26 3月, 2010 1 次提交
-
-
由 Eric Dumazet 提交于
RPS currently depends on SMP and SYSFS Adding a CONFIG_RPS makes sense in case this requirement changes in the future. This patch saves about 1500 bytes of kernel text in case SMP is on but SYSFS is off. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 3月, 2010 1 次提交
-
-
由 Jiri Pirko 提交于
After the type change, addresses in unicast and multicast lists wouldn't make sense, not to mention possible different lenghts. So flush both lists here. Note "dev_addr_discard" will be very soon replaced by "dev_mc_flush" (once mc_list conversion will be done). Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 3月, 2010 3 次提交
-
-
由 Eric Dumazet 提交于
When doing "ifenslave -d bond0 eth0", there is chance to get NULL dereference in netif_receive_skb(), because dev->master suddenly becomes NULL after we tested it. We should use ACCESS_ONCE() to avoid this (or rcu_dereference()) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
This patch adds the possibility to refuse the bonding type change for other subsystems (such as for example bridge, vlan, etc.) Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 3月, 2010 1 次提交
-
-
由 Tom Herbert 提交于
This patch implements software receive side packet steering (RPS). RPS distributes the load of received packet processing across multiple CPUs. Problem statement: Protocol processing done in the NAPI context for received packets is serialized per device queue and becomes a bottleneck under high packet load. This substantially limits pps that can be achieved on a single queue NIC and provides no scaling with multiple cores. This solution queues packets early on in the receive path on the backlog queues of other CPUs. This allows protocol processing (e.g. IP and TCP) to be performed on packets in parallel. For each device (or each receive queue in a multi-queue device) a mask of CPUs is set to indicate the CPUs that can process packets. A CPU is selected on a per packet basis by hashing contents of the packet header (e.g. the TCP or UDP 4-tuple) and using the result to index into the CPU mask. The IPI mechanism is used to raise networking receive softirqs between CPUs. This effectively emulates in software what a multi-queue NIC can provide, but is generic requiring no device support. Many devices now provide a hash over the 4-tuple on a per packet basis (e.g. the Toeplitz hash). This patch allow drivers to set the HW reported hash in an skb field, and that value in turn is used to index into the RPS maps. Using the HW generated hash can avoid cache misses on the packet when steering it to a remote CPU. The CPU mask is set on a per device and per queue basis in the sysfs variable /sys/class/net/<device>/queues/rx-<n>/rps_cpus. This is a set of canonical bit maps for receive queues in the device (numbered by <n>). If a device does not support multi-queue, a single variable is used for the device (rx-0). Generally, we have found this technique increases pps capabilities of a single queue device with good CPU utilization. Optimal settings for the CPU mask seem to depend on architectures and cache hierarcy. Below are some results running 500 instances of netperf TCP_RR test with 1 byte req. and resp. Results show cumulative transaction rate and system CPU utilization. e1000e on 8 core Intel Without RPS: 108K tps at 33% CPU With RPS: 311K tps at 64% CPU forcedeth on 16 core AMD Without RPS: 156K tps at 15% CPU With RPS: 404K tps at 49% CPU bnx2x on 16 core AMD Without RPS 567K tps at 61% CPU (4 HW RX queues) Without RPS 738K tps at 96% CPU (8 HW RX queues) With RPS: 854K tps at 76% CPU (4 HW RX queues) Caveats: - The benefits of this patch are dependent on architecture and cache hierarchy. Tuning the masks to get best performance is probably necessary. - This patch adds overhead in the path for processing a single packet. In a lightly loaded server this overhead may eliminate the advantages of increased parallelism, and possibly cause some relative performance degradation. We have found that masks that are cache aware (share same caches with the interrupting CPU) mitigate much of this. - The RPS masks can be changed dynamically, however whenever the mask is changed this introduces the possibility of generating out of order packets. It's probably best not change the masks too frequently. Signed-off-by: NTom Herbert <therbert@google.com> include/linux/netdevice.h | 32 ++++- include/linux/skbuff.h | 3 + net/core/dev.c | 335 +++++++++++++++++++++++++++++++++++++-------- net/core/net-sysfs.c | 225 ++++++++++++++++++++++++++++++- net/core/skbuff.c | 2 + 5 files changed, 538 insertions(+), 59 deletions(-) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 2月, 2010 3 次提交
-
-
由 Patrick McHardy 提交于
Split dev_change_flags() into two functions: __dev_change_flags() to perform the actual changes and __dev_notify_flags() to invoke netdevice notifiers. This will be used by rtnl_link to defer netlink notifications until the device has been fully configured. This changes ordering of some operations, in particular: - netlink notifications are sent after all changes have been performed. As a side effect this surpresses one unnecessary netlink message when the IFF_UP and other flags are changed simultaneously. - The NETDEV_UP/NETDEV_DOWN and NETDEV_CHANGE notifiers are invoked after all changes have been performed. Their relative is unchanged. - net_dmaengine_put() is invoked before the NETDEV_DOWN notifier instead of afterwards. This should not make any difference since both RX and TX are already shut down at this point. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
In order to support specifying device flags during device creation, we must be able to roll back device registration in case setting the flags fails without sending any notifications related to the device to userspace. This patch changes rollback_registered_many() and register_netdevice() to manually send netlink notifications for devices not handled by rtnl_link and allows to defer notifications for devices handled by rtnl_link until setup is complete. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John W. Linville 提交于
In "wireless: remove WLAN_80211 and WLAN_PRE80211 from Kconfig" I inadvertantly missed a line in include/linux/netdevice.h. I thereby effectively reverted "net: Set LL_MAX_HEADER properly for wireless." by accident. :-( Now we should check there for CONFIG_WLAN instead. Signed-off-by: NJohn W. Linville <linville@tuxdriver.com> Reported-by: NChristoph Egger <siccegge@stud.informatik.uni-erlangen.de> Cc: stable@kernel.org
-
- 13 2月, 2010 3 次提交
-
-
由 Williams, Mitch A 提交于
Add netdev ops for configuring SR-IOV VF devices through the PF driver. Signed-off-by: NMitch Williams <mitch.a.williams@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Perches 提交于
Add macros to test a private structure for msg_enable bits and the netif_msg_##bit to test and call netdev_printk if set Simplifies logic in callers and adds message logging consistency Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Perches 提交于
These netdev_printk routines take a struct net_device * and emit dev_printk logging messages adding "%s: " ... netdev->dev.parent to the dev_printk format and arguments. This can create some uniformity in the output message log. These helpers should not be used until a successful alloc_netdev. Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 2月, 2010 1 次提交
-
-
由 Peter P Waskiewicz Jr 提交于
This patchset enables the ethtool layer to program n-tuple filters to an underlying device. The idea is to allow capable hardware to have static rules applied that can assist steering flows into appropriate queues. Hardware that is known to support these types of filters today are ixgbe and niu. Signed-off-by: NPeter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 2月, 2010 1 次提交
-
-
由 Jiri Pirko 提交于
This patch introduces the similar helpers as those already done for uc list. However multicast lists are no list_head lists but "mademanually". The three macros added by this patch will make the transition of mc_list to list_head smooth in two steps: 1) convert all drivers to use these macros (with the original iterator of type "struct dev_mc_list") 2) once all drivers are converted, convert list type and iterators to "struct netdev_hw_addr" in one patch. >From now on, drivers can (and should) use "netdev_for_each_mc_addr" to iterate over the addresses with iterator of type "struct netdev_hw_addr". Also macros "netdev_mc_count" and "netdev_mc_empty" to read list's length. This is the state which should be reached in all drivers. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 2月, 2010 1 次提交
-
-
由 Arnd Bergmann 提交于
In the vlan and macvlan drivers, the start_xmit function forwards data to the dev_queue_xmit function for another device, which may potentially belong to a different namespace. To make sure that classification stays within a single namespace, this resets the potentially critical fields. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 1月, 2010 1 次提交
-
-
由 Jiri Pirko 提交于
This patch introduces three macros to work with uc list from net drivers. Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 1月, 2010 1 次提交
-
-
由 Alexey Dobriyan 提交于
After netdev_ops compat code HAVE_* macros aren't needed, in fact they _will_ result in compile breakage for out of tree drivers. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 1月, 2010 1 次提交
-
-
由 David S. Miller 提交于
Nothing outside of net/core/dev.c uses it. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 12月, 2009 1 次提交
-
-
由 Patrick Mullaney 提交于
Provide common routine for the transition of operational state for a leaf device during a root device transition. Signed-off-by: NPatrick Mullaney <pmullaney@novell.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 12月, 2009 1 次提交
-
-
由 Eric W. Biederman 提交于
I will need this shortly to implement network namespace shutdown batching. For sanity sake network devices should be removed in the reverse order they were created in. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 11月, 2009 1 次提交
-
-
由 Arnd Bergmann 提交于
The veth driver contains code to forward an skb from the start_xmit function of one network device into the receive path of another device. Moving that code into a common location lets us reuse the code for direct forwarding of data between macvlan ports, and possibly in other drivers. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 11月, 2009 2 次提交
-
-
由 Eric Dumazet 提交于
Herbert Xu a écrit : > On Tue, Nov 17, 2009 at 04:26:04AM -0800, David Miller wrote: >> Really, the link watch stuff is just due for a redesign. I don't >> think a simple hack is going to cut it this time, sorry Eric :-) > > I have no objections against any redesigns, but since the only > caller of linkwatch_forget_dev runs in process context with the > RTNL, it could also legally emit those events. Thanks guys, here an updated version then, before linkwatch surgery ? In this version, I force the event to be sent synchronously. [PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 real 0m0.266s user 0m0.000s sys 0m0.001s real 0m0.770s user 0m0.000s sys 0m0.000s real 0m1.022s user 0m0.000s sys 0m0.000s One problem of current schem in vlan dismantle phase is the holding of device done by following chain : vlan_dev_stop() -> netif_carrier_off(dev) -> linkwatch_fire_event(dev) -> dev_hold() ... And __linkwatch_run_queue() runs up to one second later... A generic fix to this problem is to add a linkwatch_forget_dev() method to unlink the device from the list of watched devices. dev->link_watch_next becomes dev->link_watch_list (and use a bit more memory), to be able to unlink device in O(1). After patch : time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 real 0m0.024s user 0m0.000s sys 0m0.000s real 0m0.032s user 0m0.000s sys 0m0.001s real 0m0.033s user 0m0.000s sys 0m0.000s Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Some drivers ndo_get_stats() method need to perform txqueue stats folding. Move folding from dev_get_stats() to a new dev_txq_stats_fold() function Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 11月, 2009 1 次提交
-
-
由 Jarek Poplawski 提交于
Recent changes in the TX error propagation require additional checking and masking of values returned from hard_start_xmit(), mainly to separate cases where skb was consumed. This aim can be simplified by changing the order of NETDEV_TX and NET_XMIT codes, because the latter are treated similarly to negative (ERRNO) values. After this change much simpler dev_xmit_complete() is also used in sch_direct_xmit(), so it is moved to netdevice.h. Additionally NET_RX definitions in netdevice.h are moved up from between TX codes to avoid confusion while reading the TX comment. Signed-off-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 11月, 2009 2 次提交
-
-
由 Eric Dumazet 提交于
No longer need read_lock(&dev_base_lock), use RCU instead. We also can avoid taking references on inet6_dev structs. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
Currently the ->ndo_hard_start_xmit() callbacks are only permitted to return one of the NETDEV_TX codes. This prevents any kind of error propagation for virtual devices, like queue congestion of the underlying device in case of layered devices, or unreachability in case of tunnels. This patches changes the NET_XMIT codes to avoid clashes with the NETDEV_TX codes and changes the two callers of dev_hard_start_xmit() to expect either errno codes, NET_XMIT codes or NETDEV_TX codes as return value. In case of qdisc_restart(), all non NETDEV_TX codes are mapped to NETDEV_TX_OK since no error propagation is possible when using qdiscs. In case of dev_queue_xmit(), the error is propagated upwards. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 11月, 2009 1 次提交
-
-
由 stephen hemminger 提交于
This adds an RCU macro for continuing search, useful for some network devices like vlan. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 11月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
This cleanup patch puts struct/union/enum opening braces, in first line to ease grep games. struct something { becomes : struct something { Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 11月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
Adds RCU management to the list of netdevices. Convert some for_each_netdev() users to RCU version, if it can avoid read_lock-ing dev_base_lock Ie: read_lock(&dev_base_loack); for_each_netdev(net, dev) some_action(); read_unlock(&dev_base_lock); becomes : rcu_read_lock(); for_each_netdev_rcu(net, dev) some_action(); rcu_read_unlock(); Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 11月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
Some workloads hit dev_base_lock rwlock pretty hard. We can use RCU lookups to avoid touching this rwlock (and avoid touching netdevice refcount) netdevices are already freed after a RCU grace period, so this patch adds no penalty at device dismantle time. However, it adds a synchronize_rcu() call in dev_change_name() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 10月, 2009 1 次提交
-
-
由 Eric W. Biederman 提交于
This isn't beautifully abstracted, but it is simple, simplifies uses and so far is only needed for the bonding driver. Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 10月, 2009 2 次提交
-
-
由 Ben Hutchings 提交于
This will allow drivers to adjust their receive path dynamically based on whether GRO is being applied successfully. Currently all in-tree callers ignore the return values of these functions and do not need to be changed. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
This clarifies which return and parameter types are GRO result codes and not RX result codes. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 10月, 2009 2 次提交
-
-
由 Eric Dumazet 提交于
Some workloads hit dev_base_lock rwlock pretty hard. We can use RCU lookups to avoid touching this rwlock. netdevices are already freed after a RCU grace period, so this patch adds no penalty at device dismantle time. dev_ifname() converted to dev_get_by_index_rcu() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yi Zou 提交于
Add ndo_fcoe_get_wwn so Fiber Channel over Ethernet (FCoE) can make use of the provided World Wide Port Name (WWPN) and World Wide Node Name (WWNN) from the underlying network interface driver. Signed-off-by: NYi Zou <yi.zou@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 10月, 2009 2 次提交
-
-
由 Eric Dumazet 提交于
Introduce rollback_registered_many() and unregister_netdevice_many() rollback_registered_many() is able to perform necessary steps at device dismantle time, factorizing two expensive synchronize_net() calls. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This patchs adds an unreg_list anchor to struct net_device, and introduces an unregister_netdevice_queue() function, able to queue a net_device to a list instead of immediately unregister it. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 10月, 2009 1 次提交
-
-
由 Wolfram Sang 提交于
nanodoc was missing an ndo_-prefix. Signed-off-by: NWolfram Sang <w.sang@pengutronix.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 10月, 2009 1 次提交
-
-
由 Stephen Hemminger 提交于
The data center bridging ops structure can be const Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Acked-by: NPeter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 9月, 2009 1 次提交
-
-
由 David Brownell 提交于
Let attribute group vectors be declared "const". We'd like to let most attribute metadata live in read-only sections... this is a start. Signed-off-by: NDavid Brownell <dbrownell@users.sourceforge.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-