- 13 4月, 2010 1 次提交
-
-
由 Eric Dumazet 提交于
Dont use netdev_warn() in dev_cap_txqueue() and get_rps_cpu() so that we can catch following warnings without crash. bond0.2240 received packet on queue 6, but number of RX queues is 1 bond0.2240 received packet on queue 11, but number of RX queues is 1 Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 4月, 2010 2 次提交
-
-
由 chavey 提交于
Fix coding style errors and warnings output while running checkpatch.pl on the files net/core/ethtool.c and include/linux/ethtool.h Signed-off-by: Nchavey <chavey@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
As pointed by Randy Dunlap, we must include linux/proc_fs.h in net/core/dev_addr_lists.c, regardless of CONFIG_PROC_FS Reported-by: Randy Dunlap <randy.dunlap@oracle.com>, Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Acked-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 4月, 2010 2 次提交
-
-
由 Timo Teräs 提交于
Speed up lookups by freeing flow cache entries later. After virtualizing flow cache entry operations, the flow cache may now end up calling policy or bundle destructor which can be slowish. As gc_list is more effective with double linked list, the flow cache is converted to use common hlist and list macroes where appropriate. Signed-off-by: NTimo Teras <timo.teras@iki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Timo Teräs 提交于
This allows to validate the cached object before returning it. It also allows to destruct object properly, if the last reference was held in flow cache. This is also a prepartion for caching bundles in the flow cache. In return for virtualizing the methods, we save on: - not having to regenerate the whole flow cache on policy removal: each flow matching a killed policy gets refreshed as the getter function notices it smartly. - we do not have to call flow_cache_flush from policy gc, since the flow cache now properly deletes the object if it had any references Signed-off-by: NTimo Teras <timo.teras@iki.fi> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 4月, 2010 2 次提交
-
-
由 Eric Dumazet 提交于
As noticed by Changli Gao, we must call local_irq_enable() after rps_unlock() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Fix spin_unlock_irq which needs to be rps_unlock. Signed-off-by: NTom Herbert <therbert@google.com> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 4月, 2010 2 次提交
-
-
由 Jiri Pirko 提交于
Converts the list and the core manipulating with it to be the same as uc_list. +uses two functions for adding/removing mc address (normal and "global" variant) instead of a function parameter. +removes dev_mcast.c completely. +exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for manipulation with lists on a sandbox (used in bonding and 80211 drivers) Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
+little renaming of unicast functions to be smooth with multicast ones Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 4月, 2010 1 次提交
-
-
由 Eric Dumazet 提交于
Followup to commit 5acbbd42 (net: change illegal_highdma to use dma_mask) If dev->dev.parent is NULL, we should not try to dereference it. Dont force inline illegal_highdma() as its pretty big now. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 4月, 2010 3 次提交
-
-
由 FUJITA Tomonori 提交于
Robert Hancock pointed out two problems about NETIF_F_HIGHDMA: -Many drivers only set the flag when they detect they can use 64-bit DMA, since otherwise they could receive DMA addresses that they can't handle (which on platforms without IOMMU/SWIOTLB support is fatal). This means that if 64-bit support isn't available, even buffers located below 4GB will get copied unnecessarily. -Some drivers set the flag even though they can't actually handle 64-bit DMA, which would mean that on platforms without IOMMU/SWIOTLB they would get a DMA mapping error if the memory they received happened to be located above 4GB. http://lkml.org/lkml/2010/3/3/530 We can use the dma_mask if we need bouncing or not here. Then we can safely fix drivers that misuse NETIF_F_HIGHDMA. Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Timo Teräs 提交于
Group all per-cpu data to one structure instead of having many globals. Also prepare the internals so that we can have multiple instances of the flow cache if needed. Only the kmem_cache is left as a global as all flow caches share the same element size, and benefit from using a common cache. Signed-off-by: NTimo Teras <timo.teras@iki.fi> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Changli Gao 提交于
keep the old behavior on SMP without rps RPS introduces a lock operation to per cpu variable input_pkt_queue on SMP whenever rps is enabled or not. On SMP without RPS, this lock isn't needed at all. Signed-off-by: NChangli Gao <xiaosuo@gmail.com> ---- net/core/dev.c | 42 ++++++++++++++++++++++++++++-------------- 1 file changed, 28 insertions(+), 14 deletions(-) Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 3月, 2010 2 次提交
-
-
由 laurent chavey 提交于
Fix coding style errors and warnings output while running checkpatch.pl on the file net/core/dst.c. Signed-off-by: Nchavey <chavey@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 stephen hemminger 提交于
This adds ethtool and device feature flag to allow control of receive hashing offload. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Acked-by: NJeff Garzik <jgarzik@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 3月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: NTejun Heo <tj@kernel.org> Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- 29 3月, 2010 2 次提交
-
-
由 Stephen Rothwell 提交于
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 3月, 2010 2 次提交
-
-
由 Jan Engelhardt 提交于
When more data is stuffed into an nlmsg than initially projected, an extra allocation needs to be done. Reserve enough for IFLA_STATS64 so that this does not to needlessy happen. Signed-off-by: NJan Engelhardt <jengelh@medozas.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jan Engelhardt 提交于
Tony Luck observes that the original IFLA_STATS64 submission causes unaligned accesses. This is because nla_data() returns a pointer to a memory region that is only aligned to 32 bits. Do some memcpying to workaround this. Signed-off-by: NJan Engelhardt <jengelh@medozas.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 3月, 2010 1 次提交
-
-
由 Eric Dumazet 提交于
RPS currently depends on SMP and SYSFS Adding a CONFIG_RPS makes sense in case this requirement changes in the future. This patch saves about 1500 bytes of kernel text in case SMP is on but SYSFS is off. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 3月, 2010 1 次提交
-
-
由 Tom Herbert 提交于
Need to take spinlocks when dequeuing from input_pkt_queue in flush_backlog. Also, flush_backlog can now be called directly from netdev_run_todo. Signed-off-by: NTom Herbert <therbert@google.com> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 3月, 2010 2 次提交
-
-
由 Amerigo Wang 提交于
v2: update according to Frans' comments. Currently, if we leave spaces before dst port, netconsole will silently accept it as 0. Warn about this. Also, when spaces appear in other places, make them visible in error messages. Signed-off-by: NWANG Cong <amwang@redhat.com> Cc: David Miller <davem@davemloft.net> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Fix build with CONFIG_SYSFS not enabled. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 3月, 2010 5 次提交
-
-
由 Robert Olsson 提交于
Here is patch to manipulate packet node allocation and implicitly how packets are DMA'd etc. The flag NODE_ALLOC enables the function and numa_node_id(); when enabled it can also be explicitly controlled via a new node parameter Tested this with 10 Intel 82599 ports w. TYAN S7025 E5520 CPU's. Was able to TX/DMA ~80 Gbit/s to Ethernet wires. Signed-off-by: NRobert Olsson <robert.olsson@its.uu.se> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Use RCU to avoid RTNL use in dev_getfirstbyhwtype() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We currently force a synchronize_net() in netdev_set_master() This seems necessary only when a slave had a master and we dismantle it. In the other case ("ifenslave bond0 ethO"), we dont need this long delay. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
After the type change, addresses in unicast and multicast lists wouldn't make sense, not to mention possible different lenghts. So flush both lists here. Note "dev_addr_discard" will be very soon replaced by "dev_mc_flush" (once mc_list conversion will be done). Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
Ignore the new NETDEV_PRE_TYPE_CHANGE event in rtnetlink_event() since there have been no changes userspace needs to be notified of. Also add a comment to the netdev notifier event definitions to remind people to update the exclusion list when adding new event types. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 3月, 2010 3 次提交
-
-
由 Eric Dumazet 提交于
When doing "ifenslave -d bond0 eth0", there is chance to get NULL dereference in netif_receive_skb(), because dev->master suddenly becomes NULL after we tested it. We should use ACCESS_ONCE() to avoid this (or rcu_dereference()) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
This patch adds the possibility to refuse the bonding type change for other subsystems (such as for example bridge, vlan, etc.) Signed-off-by: NJiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 3月, 2010 4 次提交
-
-
由 Jan Engelhardt 提交于
`ip -s link` shows interface counters truncated to 32 bit. This is because interface statistics are transported only in 32-bit quantity to userspace. This commit adds a new IFLA_STATS64 attribute that exports them in full 64 bit. References: http://lkml.indiana.edu/hypermail/linux/kernel/0307.3/0215.htmlSigned-off-by: NJan Engelhardt <jengelh@medozas.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We hold RTNL at this point and dont use RCU variants of list traversals, we dont need rcu_read_lock()/rcu_read_unlock() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
This patch implements software receive side packet steering (RPS). RPS distributes the load of received packet processing across multiple CPUs. Problem statement: Protocol processing done in the NAPI context for received packets is serialized per device queue and becomes a bottleneck under high packet load. This substantially limits pps that can be achieved on a single queue NIC and provides no scaling with multiple cores. This solution queues packets early on in the receive path on the backlog queues of other CPUs. This allows protocol processing (e.g. IP and TCP) to be performed on packets in parallel. For each device (or each receive queue in a multi-queue device) a mask of CPUs is set to indicate the CPUs that can process packets. A CPU is selected on a per packet basis by hashing contents of the packet header (e.g. the TCP or UDP 4-tuple) and using the result to index into the CPU mask. The IPI mechanism is used to raise networking receive softirqs between CPUs. This effectively emulates in software what a multi-queue NIC can provide, but is generic requiring no device support. Many devices now provide a hash over the 4-tuple on a per packet basis (e.g. the Toeplitz hash). This patch allow drivers to set the HW reported hash in an skb field, and that value in turn is used to index into the RPS maps. Using the HW generated hash can avoid cache misses on the packet when steering it to a remote CPU. The CPU mask is set on a per device and per queue basis in the sysfs variable /sys/class/net/<device>/queues/rx-<n>/rps_cpus. This is a set of canonical bit maps for receive queues in the device (numbered by <n>). If a device does not support multi-queue, a single variable is used for the device (rx-0). Generally, we have found this technique increases pps capabilities of a single queue device with good CPU utilization. Optimal settings for the CPU mask seem to depend on architectures and cache hierarcy. Below are some results running 500 instances of netperf TCP_RR test with 1 byte req. and resp. Results show cumulative transaction rate and system CPU utilization. e1000e on 8 core Intel Without RPS: 108K tps at 33% CPU With RPS: 311K tps at 64% CPU forcedeth on 16 core AMD Without RPS: 156K tps at 15% CPU With RPS: 404K tps at 49% CPU bnx2x on 16 core AMD Without RPS 567K tps at 61% CPU (4 HW RX queues) Without RPS 738K tps at 96% CPU (8 HW RX queues) With RPS: 854K tps at 76% CPU (4 HW RX queues) Caveats: - The benefits of this patch are dependent on architecture and cache hierarchy. Tuning the masks to get best performance is probably necessary. - This patch adds overhead in the path for processing a single packet. In a lightly loaded server this overhead may eliminate the advantages of increased parallelism, and possibly cause some relative performance degradation. We have found that masks that are cache aware (share same caches with the interrupting CPU) mitigate much of this. - The RPS masks can be changed dynamically, however whenever the mask is changed this introduces the possibility of generating out of order packets. It's probably best not change the masks too frequently. Signed-off-by: NTom Herbert <therbert@google.com> include/linux/netdevice.h | 32 ++++- include/linux/skbuff.h | 3 + net/core/dev.c | 335 +++++++++++++++++++++++++++++++++++++-------- net/core/net-sysfs.c | 225 ++++++++++++++++++++++++++++++- net/core/skbuff.c | 2 + 5 files changed, 538 insertions(+), 59 deletions(-) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Slaby 提交于
Stanse found that one error path in netpoll_setup dereferences npinfo even though it is NULL. Avoid that by adding new label and go to that instead. Signed-off-by: NJiri Slaby <jslaby@suse.cz> Cc: Daniel Borkmann <danborkmann@googlemail.com> Cc: David S. Miller <davem@davemloft.net> Acked-by: chavey@google.com Acked-by: NMatt Mackall <mpm@selenic.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 3月, 2010 2 次提交
-
-
由 Eric Dumazet 提交于
Commit 6e17d45a (net: add addr len check to dev_mc_add) added a bug in dev_mc_add(), since it can now exit with a lock imbalance. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: Jiri Pirko <jpirko@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Annotates neigh_invalidate() with __releases() and __acquires() for sparse sake. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 3月, 2010 1 次提交
-
-
由 Eric Dumazet 提交于
Use self documenting noinline_for_stack instead of duplicated comments. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 3月, 2010 1 次提交
-
-
由 Dan Carpenter 提交于
We test that "prot->rsk_prot" is non-null right before we dereference it on this line. Signed-off-by: NDan Carpenter <error27@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-