- 17 7月, 2008 2 次提交
-
-
由 Pavel Emelyanov 提交于
The tcp_enter_memory_pressure calls NET_INC_STATS, but doesn't have where to get the net from. I decided to add a sk argument, not the net itself, only to factor all the required sock_net(sk) calls inside the enter_memory_pressure callback itself. Signed-off-by: NPavel Emelyanov <xemul@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Harvey Harrison 提交于
net/core/skbuff.c:1335:5: warning: symbol '__skb_splice_bits' was not declared. Should it be static? Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 7月, 2008 7 次提交
-
-
由 Octavian Purdila 提交于
- move all of the details on offsets, lengths and buffers into a single function instead of doing these operation from multiple places - use a bottom up approach: try to avoid details in the high level functions, introduce them gradually as we go deeper in the function call stack With helpful feedback from Jarek Poplawski. Signed-off-by: NOctavian Purdila <opurdila@ixiacom.com> Acked-by: NJarek Poplawski <jarkao2@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Now that we have a specific lock to protect the network device unicast and multicast lists, remove extraneous grabs of the TX lock in cases where the code only needs address list protection. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Add netif_addr_{lock,unlock}{,_bh}() helpers. Use them to protect operations that operate on or read the network device unicast and multicast address lists. Also use them in cases where the code simply wants to block calls into the driver's ->set_rx_mode() and ->set_multicast_list() methods. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This will be used to protect the per-device unicast and multicast address lists, as well as the callbacks into the drivers which configure such state such as ->set_rx_mode() and ->set_multicast_list(). Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
When VLAN header stripping is used, packets currently bypass packet sockets (and other network taps) completely. For locally existing VLANs, they appear directly on the VLAN device, for unknown VLANs they are silently dropped. Add a new function netif_nit_deliver() to deliver incoming packets to all network interface taps and use it in __vlan_hwaccel_rx() to make VLAN packets visible on the underlying device. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
Use a real skb member to store the skb to avoid clashes with qdiscs, which are allowed to use the cb area themselves. As currently only real devices that consume the skb set the NETIF_F_HW_VLAN_TX flag, no explicit invalidation is neccessary. The new member fills a hole on 64 bit, the skb layout changes from: __u32 mark; /* 172 4 */ sk_buff_data_t transport_header; /* 176 4 */ sk_buff_data_t network_header; /* 180 4 */ sk_buff_data_t mac_header; /* 184 4 */ sk_buff_data_t tail; /* 188 4 */ /* --- cacheline 3 boundary (192 bytes) --- */ sk_buff_data_t end; /* 192 4 */ /* XXX 4 bytes hole, try to pack */ to __u32 mark; /* 172 4 */ __u16 vlan_tci; /* 176 2 */ /* XXX 2 bytes hole, try to pack */ sk_buff_data_t transport_header; /* 180 4 */ sk_buff_data_t network_header; /* 184 4 */ Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
The /sys/class/net/*/wireless/ direcory is, as far as I know, not used by anyone. Additionally, the same data is available via wext ioctls. Hence the sysfs files are pretty much useless. This patch makes them optional and schedules them for removal. Signed-off-by: NJohannes Berg <johannes@sipsolutions.net> Cc: Jean Tourrilhes <jt@hpl.hp.com> Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>
-
- 09 7月, 2008 10 次提交
-
-
由 David S. Miller 提交于
Accesses are mostly structured such that when there are multiple TX queues the code transformations will be a little bit simpler. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This allows us to use this calling convention all the way down into qdisc_restart(). Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Only plain netif_schedule() remains taking a net_device, mostly as a compatability item while we transition the rest of these interfaces. Everything else calls netif_schedule_queue() or __netif_schedule(), both of which take a netdev_queue pointer. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
First, we add a qdisc_tx_changing() helper which returns true if the qdisc attachment is in transition. Second, we remove an assertion warning which is of limited value and is hard to express precisely in a multiqueue environment. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
We schedule queues, not the device, for output queue processing in BH. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Now that our qdisc management is bi-directional, per-queue, and fully orthogonal, there is no reason to have a special ingress qdisc pointer in struct net_device. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Now qdisc, qdisc_sleeping, and qdisc_list also live there. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Every qdisc is assosciated with a queue, and in the case of ingress qdiscs that will now be netdev->rx_queue so using that queue's lock is the thing to do. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
The lock is now an attribute of the device queue. One thing to notice is that "suspicious" places emerge which will need specific training about multiple queue handling. They are so marked with explicit "netdev->rx_queue" and "netdev->tx_queue" references. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
A netdev_queue is an entity managed by a qdisc. Currently there is one RX and one TX queue, and a netdev_queue merely contains a backpointer to the net_device. The Qdisc struct is augmented with a netdev_queue pointer as well. Eventually the 'dev' Qdisc member will go away and we will have the resulting hierarchy: net_device --> netdev_queue --> Qdisc Also, qdisc_alloc() and qdisc_create_dflt() now take a netdev_queue pointer argument. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 7月, 2008 1 次提交
-
-
由 Patrick McHardy 提交于
Commit dad9b335 (netdevice: Fix promiscuity and allmulti overflow) broke dev_set_promiscuity() by returning on success without reprogramming the device. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 7月, 2008 1 次提交
-
-
由 Denis V. Lunev 提交于
This is required to pass namespace context into rt_cache_flush called from ->flush_cache. Signed-off-by: NDenis V. Lunev <den@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 7月, 2008 4 次提交
-
-
由 Santwona Behera 提交于
Added new interfaces to ethtool to configure receive network flow distribution across multiple rx rings using hashing. Signed-off-by: NSantwona Behera <santwona.behera@sun.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
The errno code returned must be negative. Fixes "RTNETLINK answers: Unknown error 18446744073709551519". Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wang Chen 提交于
v1->v2: Use strlcpy() to ensure s[i].name be null-termination. 1. In netdev_boot_setup_add(), a long name will leak. ex. : dev=21,0x1234,0x1234,0x2345,eth123456789verylongname......... 2. In netdev_boot_setup_check(), mismatch will happen if s[i].name is a substring of dev->name. ex. : dev=...eth1 dev=...eth11 [ With feedback from Ben Hutchings. ] Signed-off-by: NWang Chen <wangchen@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wang Chen 提交于
Parameter "needlock" no long exists. Signed-off-by: NWang Chen <wangchen@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 6月, 2008 2 次提交
-
-
由 Wang Chen 提交于
Signed-off-by: NWang Chen <wangchen@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Octavian Purdila 提交于
If an skb has nr_frags set to zero but its frag_list is not empty (as it can happen if software LRO is enabled), and a previous tcp_read_sock has consumed the linear part of the skb, then __skb_splice_bits: (a) incorrectly reports an error and (b) forgets to update the offset to account for the linear part Any of the two problems will cause the subsequent __skb_splice_bits call (the one that handles the frag_list skbs) to either skip data, or, if the unadjusted offset is greater then the size of the next skb in the frag_list, make tcp_splice_read loop forever. Signed-off-by: NOctavian Purdila <opurdila@ixiacom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 6月, 2008 1 次提交
-
-
由 Eric W. Biederman 提交于
Alexey Dobriyan <adobriyan@gmail.com> writes: > Subject: ICMP sockets destruction vs ICMP packets oops > After icmp_sk_exit() nuked ICMP sockets, we get an interrupt. > icmp_reply() wants ICMP socket. > > Steps to reproduce: > > launch shell in new netns > move real NIC to netns > setup routing > ping -i 0 > exit from shell > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 > IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30 > PGD 17f3cd067 PUD 17f3ce067 PMD 0 > Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC > CPU 0 > Modules linked in: usblp usbcore > Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4 > RIP: 0010:[<ffffffff803fce17>] [<ffffffff803fce17>] icmp_sk+0x17/0x30 > RSP: 0018:ffffffff8057fc30 EFLAGS: 00010286 > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900 > RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800 > RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815 > R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28 > R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800 > FS: 0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b > CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0) > Stack: 0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4 > ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246 > 000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436 > Call Trace: > <IRQ> [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0 > [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360 > [<ffffffff803fd645>] icmp_echo+0x65/0x70 > [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0 > [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0 > [<ffffffff803d71bb>] ip_rcv+0x33b/0x650 > [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340 > [<ffffffff803be57d>] process_backlog+0x9d/0x100 > [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250 > [<ffffffff80237be5>] __do_softirq+0x75/0x100 > [<ffffffff8020c97c>] call_softirq+0x1c/0x30 > [<ffffffff8020f085>] do_softirq+0x65/0xa0 > [<ffffffff80237af7>] irq_exit+0x97/0xa0 > [<ffffffff8020f198>] do_IRQ+0xa8/0x130 > [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60 > [<ffffffff8020bc46>] ret_from_intr+0x0/0xf > <EOI> [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60 > [<ffffffff80212f23>] ? mwait_idle+0x43/0x60 > [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0 > [<ffffffff8040f380>] ? rest_init+0x70/0x80 > Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 > 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08 > 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00 > RIP [<ffffffff803fce17>] icmp_sk+0x17/0x30 > RSP <ffffffff8057fc30> > CR2: 0000000000000000 > ---[ end trace ea161157b76b33e8 ]--- > Kernel panic - not syncing: Aiee, killing interrupt handler! Receiving packets while we are cleaning up a network namespace is a racy proposition. It is possible when the packet arrives that we have removed some but not all of the state we need to fully process it. We have the choice of either playing wack-a-mole with the cleanup routines or simply dropping packets when we don't have a network namespace to handle them. Since the check looks inexpensive in netif_receive_skb let's just drop the incoming packets. Signed-off-by: NEric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 6月, 2008 2 次提交
-
-
由 Ben Hutchings 提交于
Add skb_warn_if_lro() to test whether an skb was received with LRO and warn if so. Change br_forward(), ip_forward() and ip6_forward() to call it) and discard the skb if it returns true. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
Large Receive Offload (LRO) is only appropriate for packets that are destined for the host, and should be disabled if received packets may be forwarded. It can also confuse the GSO on output. Add dev_disable_lro() function which uses the appropriate ethtool ops to disable LRO if enabled. Add calls to dev_disable_lro() in br_add_if() and functions that enable IPv4 and IPv6 forwarding. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 6月, 2008 4 次提交
-
-
由 Wang Chen 提交于
Max of promiscuity and allmulti plus positive @inc can cause overflow. Fox example: when allmulti=0xFFFFFFFF, any caller give dev_set_allmulti() a positive @inc will cause allmulti be off. This is not what we want, though it's rare case. The fix is that only negative @inc will cause allmulti or promiscuity be off and when any caller makes the counters touch the roof, we return error. Change of v2: Change void function dev_set_promiscuity/allmulti to return int. So callers can get the overflow error. Caller's fix will be done later. Change of v3: 1. Since we return error to caller, we don't need to print KERN_ERROR, KERN_WARNING is enough. 2. In dev_set_promiscuity(), if __dev_set_promiscuity() failed, we return at once. Signed-off-by: NWang Chen <wangchen@cn.fujitsu.com> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
In order to more easily grep for all things that set sk->sk_socket, add sk_set_socket() helper inline function. Suggested (although only half-seriously) by Evgeniy Polyakov. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jay Vosburgh 提交于
Permit bonding to function rationally if max_bonds is set to zero. This will load the module, but create no master devices (which can be created via sysfs). Requires some change to bond_create_sysfs; currently, the netdev sysfs directory is determined from the first bonding device created, but this is no longer possible. Instead, an interface from net/core is created to create and destroy files in net_class. Based on a patch submitted by Phil Oester <kernel@linuxaces.com>. Modified by Jay Vosburgh to fix the sysfs issue mentioned above and to update the documentation. Signed-off-by: NPhil Oester <kernel@linuxace.com> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Or Gerlitz 提交于
Add NETDEV_BONDING_FAILOVER event to be used in a successive patch by bonding to announce fail-over for the active-backup mode through the netdev events notifier chain mechanism. Such an event can be of use for the RDMA CM (communication manager) to let native RDMA ULPs (eg NFS-RDMA, iSER) always be aligned with the IP stack, in the sense that they use the same ports/links as the stack does. More usages can be done to allow monitoring tools based on netlink events being aware to bonding fail-over. Signed-off-by: NOr Gerlitz <ogerlitz@voltaire.com> Signed-off-by: NJay Vosburgh <fubar@us.ibm.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
- 17 6月, 2008 1 次提交
-
-
由 Ben Hutchings 提交于
Selected device feature bits can be propagated to VLAN devices, so we can make use of TX checksum offload and TSO on VLAN-tagged packets. However, if the physical device does not do VLAN tag insertion or generic checksum offload then the test for TX checksum offload in dev_queue_xmit() will see a protocol of htons(ETH_P_8021Q) and yield false. This splits the checksum offload test into two functions: - can_checksum_protocol() tests a given protocol against a feature bitmask - dev_can_checksum() first tests the skb protocol against the device features; if that fails and the protocol is htons(ETH_P_8021Q) then it tests the encapsulated protocol against the effective device features for VLANs Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 6月, 2008 1 次提交
-
-
由 Adrian Bunk 提交于
This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 6月, 2008 1 次提交
-
-
由 Octavian Purdila 提交于
skb_splice_bits temporary drops the socket lock while iterating over the socket queue in order to break a reverse locking condition which happens with sendfile. This, however, opens a window of opportunity for tcp_collapse() to aggregate skbs and thus potentially free the current skb used in skb_splice_bits and tcp_read_sock. This patch fixes the problem by (re-)getting the same "logical skb" after the lock has been temporary dropped. Based on idea and initial patch from Evgeniy Polyakov. Signed-off-by: NOctavian Purdila <opurdila@ixiacom.com> Acked-by: NEvgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 6月, 2008 3 次提交
-
-
由 Thomas Graf 提交于
Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and nla_nest_cancel() void functions. Return -EMSGSIZE instead of -1 if the provided message buffer is not big enough. Signed-off-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Brice Goglin 提交于
No need to compute copy twice in the frags loop in dma_skb_copy_datagram_iovec(). Signed-off-by: NBrice Goglin <Brice.Goglin@inria.fr> Acked-by: NShannon Nelson <shannon.nelson@intel.com> Signed-off-by: NMaciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
The neighbor table time of last use information is returned in the incorrect unit. Kernel to user space ABI's need to use USER_HZ (or milliseconds), otherwise the application has to try and discover the real system HZ value which is problematic. Linux has standardized on keeping USER_HZ consistent (100hz) even when kernel is running internally at some other value. This change is small, but it breaks the ABI for older version of iproute2 utilities. But these utilities are already broken since they are looking at the psched_hz values which are completely different. So let's just go ahead and fix both kernel and user space. Older utilities will just print wrong values. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-