- 30 Nov, 2011 3 commits
-
-
Committed by Igor Maravic
Change rcu_dereference() to rcu_dereference_bh() to avoid the warning
[ INFO: suspicious RCU usage. ]
-------------------------------
net/core/dev.c:2459 suspicious rcu_dereference_check() usage!
because we are locking with rcu_read_lock_bh() in dev_queue_xmit(struct sk_buff *skb).
Signed-off-by: Igor Maravic <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
Networking stack support for byte queue limits (BQL), using the dynamic queue limits library. Byte queue limits are maintained per transmit queue, and a dql structure has been added to the netdev_queue structure for this purpose.
Configuration of BQL is in the tx-<n> sysfs directory for the queue, under the byte_queue_limits directory. Configuration includes:
  limit_min, bql minimum limit
  limit_max, bql maximum limit
  hold_time, bql slack hold time
Also under the directory are:
  limit, current byte limit
  inflight, current number of bytes on the queue
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
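The sysfs layout described above can be read with a few lines of Python. This is a minimal sketch, not part of the commit: the attribute names come from the commit message, while the `/sys/class/net/<dev>/queues/` prefix, the helper names `bql_dir` and `read_bql`, and the `sysfs_root` parameter are assumptions made for illustration.

```python
from pathlib import Path

# Attribute names as listed in the commit message.
BQL_ATTRS = ("limit", "limit_min", "limit_max", "hold_time", "inflight")


def bql_dir(dev, txq, sysfs_root="/sys/class/net"):
    """Return the byte_queue_limits directory of one transmit queue.

    The sysfs_root default is an assumption about where sysfs is mounted.
    """
    return Path(sysfs_root) / dev / "queues" / f"tx-{txq}" / "byte_queue_limits"


def read_bql(dev, txq, sysfs_root="/sys/class/net"):
    """Read every BQL attribute that exists and return the values as ints."""
    values = {}
    for name in BQL_ATTRS:
        path = bql_dir(dev, txq, sysfs_root) / name
        if path.is_file():
            values[name] = int(path.read_text().strip())
    return values
```

Calling `read_bql("eth0", 0)` on a kernel with BQL would return a dictionary keyed by the attribute names above; on a system without the directory it returns an empty dictionary.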
-
Committed by Tom Herbert
Create separate queue state flags so that either the stack or drivers can turn on XOFF. Add a set of functions used in the stack to determine whether a queue is really stopped (either by the stack or by the driver).
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
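The idea of separate XOFF bits can be modelled outside the kernel. The sketch below is illustrative only: the constant and function names are hypothetical and merely modelled on the description above, not copied from the kernel sources.

```python
# Hypothetical bit names, modelled on the commit description.
DRV_XOFF = 1 << 0     # queue stopped by the driver
STACK_XOFF = 1 << 1   # queue stopped by the stack
ANY_XOFF = DRV_XOFF | STACK_XOFF


def stop(state, who):
    """Return the state with the given XOFF bit set."""
    return state | who


def wake(state, who):
    """Return the state with the given XOFF bit cleared."""
    return state & ~who


def really_stopped(state):
    """A queue is stopped if either the stack or the driver stopped it."""
    return bool(state & ANY_XOFF)
```

Keeping one bit per owner means the driver waking its queue cannot accidentally undo a stop requested by the stack, and vice versa.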
-
- 29 Nov, 2011 2 commits
-
-
Committed by Eric Dumazet
Igor Maravic reported an error caused by jump_label_dec() being called from IRQ context:
BUG: sleeping function called from invalid context at kernel/mutex.c:271
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper
1 lock held by swapper/0:
 #0: (&n->timer){+.-...}, at: [<ffffffff8107ce90>] call_timer_fn+0x0/0x340
Pid: 0, comm: swapper Not tainted 3.2.0-rc2-net-next-mpls+ #1
Call Trace:
 <IRQ>
 [<ffffffff8104f417>] __might_sleep+0x137/0x1f0
 [<ffffffff816b9a2f>] mutex_lock_nested+0x2f/0x370
 [<ffffffff810a89fd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff8109a37f>] ? local_clock+0x6f/0x80
 [<ffffffff810a90a5>] ? lock_release_holdtime.part.22+0x15/0x1a0
 [<ffffffff81557929>] ? sock_def_write_space+0x59/0x160
 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
 [<ffffffff810969cd>] atomic_dec_and_mutex_lock+0x5d/0x80
 [<ffffffff8112fc1d>] jump_label_dec+0x1d/0x50
 [<ffffffff81566525>] net_disable_timestamp+0x15/0x20
 [<ffffffff81557a75>] sock_disable_timestamp+0x45/0x50
 [<ffffffff81557b00>] __sk_free+0x80/0x200
 [<ffffffff815578d0>] ? sk_send_sigurg+0x70/0x70
 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
 [<ffffffff81557cba>] sock_wfree+0x3a/0x70
 [<ffffffff8155c2b0>] skb_release_head_state+0x70/0x120
 [<ffffffff8155c0b6>] __kfree_skb+0x16/0x30
 [<ffffffff8155c119>] kfree_skb+0x49/0x170
 [<ffffffff815e936e>] arp_error_report+0x3e/0x90
 [<ffffffff81575bd9>] neigh_invalidate+0x89/0xc0
 [<ffffffff81578dbe>] neigh_timer_handler+0x9e/0x2a0
 [<ffffffff81578d20>] ? neigh_update+0x640/0x640
 [<ffffffff81073558>] __do_softirq+0xc8/0x3a0
Since jump_label_{inc|dec} must be called from process context only, we must defer jump_label_dec() if net_disable_timestamp() is called from interrupt context.
Reported-by: Igor Maravic <igorm@etf.rs>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
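The deferral pattern described above can be sketched in plain Python. This is a single-threaded toy model under stated assumptions, not the kernel's implementation: the class `TimestampSwitch`, its fields, and the `in_interrupt` argument are all hypothetical names chosen for illustration.

```python
class TimestampSwitch:
    """Toy model of a reference count whose decrement may have to be deferred."""

    def __init__(self):
        self.users = 0      # stands in for the jump-label reference count
        self.deferred = 0   # decrements requested from interrupt context

    def _flush(self):
        # Process context only: apply decrements that were postponed earlier.
        self.users -= self.deferred
        self.deferred = 0

    def enable(self):
        # Assumed to run in process context, so pending work can be drained here.
        self._flush()
        self.users += 1

    def disable(self, in_interrupt):
        if in_interrupt:
            # Only record the request; nothing here may sleep.
            self.deferred += 1
            return
        self._flush()
        self.users -= 1

    @property
    def active(self):
        return self.users > 0
```

The cost of this design is that the switch can stay enabled slightly longer than strictly needed, until the next process-context call drains the deferred decrements.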
-
Committed by Eric Dumazet
No functional changes. This uses the code we factored out into skb_flow_dissect().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Nov, 2011 1 commit
-
-
Committed by Neil Horman
This patch adds the infrastructure code to create the network priority cgroup. The cgroup, in addition to the standard processes file, creates two control files:
1) prioidx - This is a read-only file that exports the index of this cgroup. This is a value that is both arbitrary and unique to a cgroup in this subsystem, and is used to index the per-device priority map.
2) priomap - This is a writeable file. On read it reports a table of 2-tuples <name:priority>, where name is the name of a network interface and priority indicates the priority assigned to frames egressing on the named interface and originating from a pid in this cgroup.
This cgroup allows skb priority to be set before a root qdisc is selected. This is beneficial for DCB-enabled systems, in that it allows any application to use DCB-configured priorities without application modification.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
CC: Robert Love <robert.w.love@intel.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 18 Nov, 2011 1 commit
-
-
Committed by Eric Dumazet
Most machines don't use RPS/RFS, and pay a fair amount of instructions in netif_receive_skb() / netif_rx() / get_rps_cpu() just to discover RPS/RFS is not set up.
Add a jump_label named rps_needed. If no device rps_map or global rps_sock_flow_table is set up, netif_receive_skb() / netif_rx() execute a single instruction instead of many, including conditional jumps:
  jmp +0 (if CONFIG_JUMP_LABEL=y)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 17 Nov, 2011 4 commits
-
-
Committed by Michał Mirosław
The only distinct use is checking whether NETIF_F_NOCACHE_COPY should be enabled by default. The check heuristic is altered a bit here, so it hits different people than before. The default shouldn't be trusted for performance-critical cases anyway.
For all other uses NETIF_F_NO_CSUM is equivalent to NETIF_F_HW_CSUM.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Michał Mirosław
v2:
  add a couple of missing conversions in drivers
  split unexporting netdev_fix_features()
  implemented %pNF
  convert sock::sk_route_(no?)caps
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Michał Mirosław
As all drivers are converted, we may now remove discrete offload setting callback handling.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
netstamp_needed seems a good candidate for jump_label conversion. This avoids 3 conditional branches per incoming packet in the fast path.
No measurable difference, given that these conditional branches are predicted on modern CPUs. Only a small icache reduction, thanks to the unlikely() stuff.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 30 Oct, 2011 1 commit
-
-
Committed by Eric Dumazet
commit 2425717b (net: allow vlan traffic to be received under bond) broke ARP processing on vlan on top of bonding.
       +-------+
eth0 --| bond0 |---bond0.103
eth1 --|       |
       +-------+
52870.115435: skb_gro_reset_offset <-napi_gro_receive
52870.115435: dev_gro_receive <-napi_gro_receive
52870.115435: napi_skb_finish <-napi_gro_receive
52870.115435: netif_receive_skb <-napi_skb_finish
52870.115435: get_rps_cpu <-netif_receive_skb
52870.115435: __netif_receive_skb <-netif_receive_skb
52870.115436: vlan_do_receive <-__netif_receive_skb
52870.115436: bond_handle_frame <-__netif_receive_skb
52870.115436: vlan_do_receive <-__netif_receive_skb
52870.115436: arp_rcv <-__netif_receive_skb
52870.115436: kfree_skb <-arp_rcv
The packet is dropped in arp_rcv() because its pkt_type was set to PACKET_OTHERHOST in the first vlan_do_receive() call, since no eth0.103 exists.
We really need to change pkt_type only if no more rx_handler is about to be called for the packet.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 24 Oct, 2011 1 commit
-
-
Committed by Eric W. Biederman
Renato Westphal noticed that since commit a2835763 "rtnetlink: handle rtnl_link netlink notifications manually" was merged we no longer send a netlink message when a networking device is moved from one network namespace to another.
Fix this by adding the missing manual notification in dev_change_net_namespace.
Since all network devices that are processed by dev_change_net_namespace are in the initialized state, the complicated tests that guard the manual rtmsg_ifinfo calls in rollback_registered and register_netdevice are unnecessary and we can just perform a plain notification.
Cc: stable@kernel.org
Tested-by: Renato Westphal <renatowestphal@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 21 Oct, 2011 1 commit
-
-
Committed by Mihai Maruseac
Instead of using the dev->next chain and trying to resync at each call to dev_seq_start, use the name hash, keeping the bucket and the offset in the seq->private field.
Tests revealed the following results for ifconfig > /dev/null:
  * 1000 interfaces:
    * 0.114s without patch
    * 0.089s with patch
  * 3000 interfaces:
    * 0.489s without patch
    * 0.110s with patch
  * 5000 interfaces:
    * 1.363s without patch
    * 0.250s with patch
  * 128000 interfaces (other setup):
    * ~100s without patch
    * ~30s with patch
Signed-off-by: Mihai Maruseac <mmaruseac@ixiacom.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
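Resuming an iteration from a saved (bucket, offset) pair, rather than rescanning a linked list from the head, can be sketched as follows. This is an illustration of the general technique only: the toy hash, the table layout, and the names `make_table`, `seek` and `read_chunk` are assumptions, not the kernel's data structures.

```python
def make_table(names, nbuckets=8):
    """Build a toy name hash: a list of buckets, each a list of names."""
    table = [[] for _ in range(nbuckets)]
    for name in names:
        # Toy hash function, chosen only to keep the example deterministic.
        table[sum(name.encode()) % nbuckets].append(name)
    return table


def seek(table, bucket, offset):
    """Return (name, bucket, offset) for the first entry at or after the position."""
    while bucket < len(table):
        if offset < len(table[bucket]):
            return table[bucket][offset], bucket, offset
        bucket, offset = bucket + 1, 0
    return None


def read_chunk(table, pos, count):
    """Return up to `count` names starting at pos, plus the position to resume from."""
    bucket, offset = pos
    out = []
    while len(out) < count:
        hit = seek(table, bucket, offset)
        if hit is None:
            return out, None
        name, bucket, offset = hit
        out.append(name)
        offset += 1
    return out, (bucket, offset)
```

Each call to `read_chunk` only walks the buckets it actually returns entries from, so a reader that consumes the table in many small chunks does not pay for a rescan from the start every time.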
-
- 20 Oct, 2011 2 commits
-
-
Committed by Richard Cochran
This patch adds a sanity check on the values provided by user space for the hardware time stamping configuration. If the values lie outside of the absolute limits, then the ioctl request will be denied.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric W. Biederman
This patch moves the rcu_barrier from rollback_registered_many (inside the rtnl_lock) into netdev_run_todo (just outside the rtnl_lock). This allows us to gain the full benefit of synchronize_net calling synchronize_rcu_expedited when the rtnl_lock is held.
The rcu_barrier in rollback_registered_many was originally a synchronize_net but was promoted to be a rcu_barrier() when it was found that people were unnecessarily hitting the 250ms wait in netdev_wait_allrefs(). Changing the rcu_barrier back to a synchronize_net is therefore safe.
Since we only care about waiting for the rcu callbacks before we get to netdev_wait_allrefs(), it is also safe to move the wait into netdev_run_todo.
This was tested by creating and destroying 1000 tap devices and observing /proc/lock_stat. /proc/lock_stat reports this change reduces the hold times of the rtnl_lock by a factor of 10. There was no observable difference in the amount of time it takes to destroy a network device.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 19 Oct, 2011 2 commits
-
-
Committed by Eric Dumazet
To ease skb->truesize sanitization, it's better to be able to localize all references to skb frags size.
Define accessors: skb_frag_size() to fetch the frag size, and skb_frag_size_{set|add|sub}() to manipulate it.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by John Fastabend
The following configuration used to work as I expected. At least we could use the fcoe interfaces to do MPIO and the bond0 iface to do load balancing or failover.
     ---eth2.228-fcoe
     |
eth2 -----|
          |
          |---- bond0
          |
eth3 -----|
     |
     ---eth3.228-fcoe
This worked because of a change we added to allow inactive slaves to rx 'exact' matches. This functionality was kept intact with the rx_handler mechanism. However, now the vlan interface attached to the active slave never receives traffic because the bonding rx_handler updates the skb->dev and goto's another_round. Previously, the vlan_do_receive() logic was called before the bonding rx_handler.
Now by the time vlan_do_receive calls vlan_find_dev() the skb->dev is set to bond0 and it is clear no vlan is attached to this iface. The vlan lookup fails.
This patch moves the VLAN check above the rx_handler. A VLAN tagged frame is now routed to the eth2.228-fcoe iface in the above schematic. Untagged frames continue to the bond0 as normal. This case also remains intact:
eth2 --> bond0 --> vlan.228
Here the skb is VLAN tagged but the vlan lookup fails on eth2, causing the bonding rx_handler to be called. On the second pass the vlan lookup is on the bond0 iface and completes as expected.
Putting a VLAN.228 on both the bond0 and eth2 device will result in eth2.228 receiving the skb. I don't think this is completely unexpected and was the result prior to the rx_handler result.
Note, the same setup is also used for other storage traffic that MPIO is used with, e.g. iSCSI, and similar setups can be contrived without storage protocols.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Tested-by: Hans Schillstrom <hams.schillstrom@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Oct, 2011 1 commit
-
-
Committed by Ben Hutchings
Amir Vadai wrote:
> When a stream is paused, and its rule is expired while it is paused,
> no new rule will be configured to the HW when traffic resume.
[...]
> - When stream was resumed, traffic was steered again by RSS, and
> because current-cpu was equal to desired-cpu, ndo_rx_flow_steer
> wasn't called and no rule was configured to the HW.
Fix this by setting the flow's current CPU only in the table for the newly selected RX queue.
Reported-and-tested-by: Amir Vadai <amirv@dev.mellanox.co.il>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 29 Sep, 2011 1 commit
-
-
Committed by Changli Gao
The upper protocol numbers of PPPOE are different, and should be treated specially.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 16 Sep, 2011 2 commits
-
-
Committed by Jiri Pirko
This patch does several things:
  - introduces __ethtool_get_settings, which is called from ethtool code and from drivers as well. Put ASSERT_RTNL there.
  - dev_ethtool_get_settings() is replaced by __ethtool_get_settings()
  - changes calling in drivers so rtnl locking is respected. In iboe_get_rate, ->get_settings() was previously called unlocked. This fixes it. Also prb_calc_retire_blk_tmo() in af_packet.c had the same problem. Also fixed by calling __dev_get_by_index() instead of dev_get_by_index() and holding rtnl_lock for both calls.
  - introduces rtnl_lock in bnx2fc_vport_create() and fcoe_vport_create() so bnx2fc_if_create() and fcoe_if_create() are called locked as they are from other places.
  - use __ethtool_get_settings() in bonding code
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
v2->v3:
  - removed dev_ethtool_get_settings()
  - added ASSERT_RTNL into __ethtool_get_settings()
  - prb_calc_retire_blk_tmo - use __dev_get_by_index() and lock around it and __ethtool_get_settings() call
v1->v2:
  add missing export_symbol
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> [except FCoE bits]
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Michael S. Tsirkin
dev_forward_skb loops an skb back into the host networking stack, which might hang on to the memory indefinitely. In particular, this can happen in macvtap in bridged mode. Copy the userspace fragments to avoid blocking the sender in that case.
As this patch makes skb_copy_ubufs extern now, I also added some documentation and made it clear the SKBTX_DEV_ZEROCOPY flag automatically instead of doing it in all callers. This can be made into a separate patch if people feel it's worth it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 25 Aug, 2011 2 commits
-
-
Committed by Ian Campbell
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Eric Dumazet
Skip the IPIP header to get proper layer-4 information.
Like GRE tunnels, this only works if rxhash is not already provided by the device itself (ethtool -K ethX rxhash off), to allow the kernel to compute a software rxhash.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Aug, 2011 2 commits
-
-
Committed by Jason Baron
Previously, if dynamic debug was enabled, netdev_dbg() was using dynamic_dev_dbg() to print out the underlying msg. Fix this by making sure netdev_dbg() uses __netdev_printk().
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
Committed by Jiri Pirko
Now, when the vlan tag is stripped from the skb in the non-accelerated path (vlan_untag), headers are reset right away. Benefit from that: avoid calling __netif_receive_skb recursively and just use another_round.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 19 Aug, 2011 2 commits
-
-
Committed by Changli Gao
Inspect the payload of PPPOE session messages for the 4-tuple to generate skb->rxhash.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Changli Gao
For 802.1Q packets, if the NIC doesn't support hw-accel-vlan-rx, RPS won't inspect the inner 4-tuple to generate skb->rxhash, so this kind of traffic can't get any benefit from RPS.
This patch adds support for 802.1Q to RPS.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
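To reach the inner headers of a tagged frame, a parser has to step over the 4-byte 802.1Q tag that follows the MAC addresses. The sketch below shows that step on a raw Ethernet frame; it is a user-space illustration, not the kernel code, and the function name `l3_offset` is hypothetical. It handles a single tag only.

```python
import struct

ETH_P_8021Q = 0x8100  # 802.1Q tag protocol identifier
ETH_HLEN = 14         # destination MAC + source MAC + EtherType
VLAN_HLEN = 4         # tag control information + encapsulated EtherType


def l3_offset(frame):
    """Return (ethertype, offset of the network header), skipping one 802.1Q tag."""
    if len(frame) < ETH_HLEN:
        raise ValueError("truncated Ethernet header")
    (proto,) = struct.unpack_from("!H", frame, 12)
    offset = ETH_HLEN
    if proto == ETH_P_8021Q:
        if len(frame) < ETH_HLEN + VLAN_HLEN:
            raise ValueError("truncated VLAN tag")
        # The encapsulated EtherType sits after the 2-byte tag control field.
        (proto,) = struct.unpack_from("!H", frame, offset + 2)
        offset += VLAN_HLEN
    return proto, offset
```

Once the offset of the network header is known, the addresses and ports used for the hash can be read from the same position as in an untagged frame.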
-
- 18 Aug, 2011 6 commits
-
-
Committed by Jiri Pirko
Remove no longer used operation.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Jiri Pirko
Use IFF_UNICAST_FLT to find out whether the driver handles unicast address filtering. If it does not, promisc mode is entered.
The patch also fixes the following drivers:
  stmmac, niu: support uc filtering and yet propagated ndo_set_multicast_list
  bna, benet, pxa168_eth, ks8851, ks8851_mll, ksz884x: have set ndo_set_rx_mode but do not support uc filtering
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
Crack open GRE packets in __skb_get_rxhash to compute the 4-tuple hash on the encapsulated packet. Note that this is used only when the __skb_get_rxhash path is taken, in particular only when the device does not provide the rxhash (i.e. the feature is disabled).
This was tested by creating a single GRE tunnel between two 16-core AMD machines. 200 netperf TCP_RR streams were run with 1-byte request and response size.
Without patch: 157497 tps, 50/90/99% latencies 1250/1292/1364 usecs
With patch: 325896 tps, 50/90/99% latencies 603/848/1169 usecs
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
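Finding the encapsulated packet means stepping over the GRE header, whose length depends on which optional fields are present. The following is a minimal user-space sketch of that step, based on the standard GRE header layout; it is not taken from the patch, and the function name `gre_inner` and the decision to reject routing and non-zero versions are assumptions made for illustration.

```python
import struct

# Flag bits in the first 16 bits of a GRE header.
GRE_CSUM = 0x8000     # checksum (plus reserved) field present, 4 bytes
GRE_ROUTING = 0x4000  # routing present; not handled by this sketch
GRE_KEY = 0x2000      # key field present, 4 bytes
GRE_SEQ = 0x1000      # sequence number field present, 4 bytes
GRE_VERSION = 0x0007  # version bits; only version 0 is handled here


def gre_inner(buf):
    """Return (protocol, payload offset) for a GRE header, or None if unsupported."""
    if len(buf) < 4:
        return None
    flags, proto = struct.unpack_from("!HH", buf, 0)
    if flags & (GRE_VERSION | GRE_ROUTING):
        return None
    offset = 4
    for bit in (GRE_CSUM, GRE_KEY, GRE_SEQ):
        if flags & bit:
            offset += 4  # each optional field occupies four bytes
    if len(buf) < offset:
        return None
    return proto, offset
```

The returned protocol is an EtherType, so the caller can continue parsing at the returned offset as if it had just read an Ethernet header's type field.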
-
Committed by Tom Herbert
Basics for looking for ports in encapsulated packets in tunnels.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
The l4_rxhash flag was added to the skb structure to indicate that the rxhash value was computed over the 4-tuple for the packet, which includes the port information in the encapsulated transport packet. This is used by the stack to preserve the rxhash value in __skb_rx_tunnel.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Tom Herbert
Use some variables for clarity and extensibility.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 12 Aug, 2011 1 commit
-
-
Committed by Eric Dumazet
The RCU API has been completed, and rcu_access_pointer() or rcu_dereference_protected() are better than the generic rcu_dereference_raw().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 02 Aug, 2011 1 commit
-
-
Committed by Stephen Hemminger
When assigning a NULL value to an RCU protected pointer, no barrier is needed. rcu_assign_pointer used to handle that, but will soon change to not handle the special case.
Convert all rcu_assign_pointer of NULL value.
  // <smpl>
  @@ expression P; @@
  - rcu_assign_pointer(P, NULL)
  + RCU_INIT_POINTER(P, NULL)
  // </smpl>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 26 Jul, 2011 1 commit
-
-
Committed by Joe Perches
No need to use int, its uses are boolean. May save a few bytes one day.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 15 Jul, 2011 2 commits
-
-
Committed by Michał Mirosław
It is not used anywhere except net/core/dev.c now.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Michał Mirosław
vlan_features contains features inherited from the underlying device. NETIF_SOFT_FEATURES are not inherited but belong to the vlan device itself (ensured in vlan_dev_fix_features()).
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 06 Jul, 2011 1 commit
-
-
Committed by Shan Wei
Just add GSO to the vlan_features initialization, and update comments.
When we set offload features, vlan_dev_fix_features() performs further checks. In vlan_dev_fix_features(), the final features are decided by the features of the real device and the vlan_features of the real device.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-