- 13 6月, 2008 3 次提交
-
-
由 David S. Miller 提交于
This reverts two changesets, ec3c0982 ("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and the follow-on bug fix 9ae27e0a ("tcp: Fix slab corruption with ipv6 and tcp6fuzz"). This change causes several problems, first reported by Ingo Molnar as a distcc-over-loopback regression where connections were getting stuck. Ilpo Järvinen first spotted the locking problems. The new function added by this code, tcp_defer_accept_check(), only has the child socket locked, yet it is modifying state of the parent listening socket. Fixing that is non-trivial at best, because we can't simply just grab the parent listening socket lock at this point, because it would create an ABBA deadlock. The normal ordering is parent listening socket --> child socket, but this code path would require the reverse lock ordering. Next is a problem noticed by Vitaliy Gusev, he noted: ---------------------------------------- >--- a/net/ipv4/tcp_timer.c >+++ b/net/ipv4/tcp_timer.c >@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data) > goto death; > } > >+ if (tp->defer_tcp_accept.request && sk->sk_state == TCP_ESTABLISHED) { >+ tcp_send_active_reset(sk, GFP_ATOMIC); >+ goto death; Here socket sk is not attached to listening socket's request queue. tcp_done() will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should release this sk) as socket is not DEAD. Therefore socket sk will be lost for freeing. ---------------------------------------- Finally, Alexey Kuznetsov argues that there might not even be any real value or advantage to these new semantics even if we fix all of the bugs: ---------------------------------------- Hiding from accept() sockets with only out-of-order data only is the only thing which is impossible with old approach. Is this really so valuable? My opinion: no, this is nothing but a new loophole to consume memory without control. ---------------------------------------- So revert this thing for now. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
In changeset 22dd4850 ("raw: Raw socket leak.") code was added so that we flush pending frames on raw sockets to avoid leaks. The ipv4 part was fine, but the ipv6 part was not done correctly. Unlike the ipv4 side, the ipv6 code already has a .destroy method for rawv6_prot. So now there were two assignments to this member, and what the compiler does is use the last one, effectively making the ipv6 parts of that changeset a NOP. Fix this by removing the: .destroy = inet6_destroy_sock, line, and adding an inet6_destroy_sock() call to the end of raw6_destroy(). Noticed by Al Viro. Signed-off-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
-
由 Eilon Greenstein 提交于
I would like to thank Eliezer Tamir for writing and maintaining the driver for the past two years. I will take over maintaining the bnx2x driver from now on. Signed-off-by: NEilon Greenstein <eilong@broadcom.com> Signed-off-by: NEliezer Tamir <eliezert@broadcom.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 6月, 2008 14 次提交
-
-
由 David S. Miller 提交于
If the RTNL is held when we invoke flush_scheduled_work() we could deadlock. One such case is linkwatch, it is a work struct which tries to grab the RTNL semaphore. The most common case are net driver ->stop() methods. The simplest conversion is to instead use cancel_{delayed_}work_sync() explicitly on the various work struct the driver uses. This is an OK transformation because these work structs are doing things like resetting the chip, restarting link negotiation, and so forth. And if we're bringing down the device, we're about to turn the chip off and reset it anways. So if we cancel a pending work event, that's fine here. Some drivers were working around this deadlock by using a msleep() polling loop of some sort, and those cases are converted to instead use cancel_{delayed_}work_sync() as well. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
-
由 Christophe Jaillet 提交于
Compared to other places in the kernel, I think that this driver misuses the function round_jiffies. Signed-off-by: NChristophe Jaillet <christophe.jaillet@wanadoo.fr> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Roel Kluin 提交于
Duplicate NETIF_MSG_IFDOWN, 2nd should be NETIF_MSG_IFUP Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Acked-by: NSylvain Munaut <tnt@246tNt.com> Cc: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Roel Kluin 提交于
The branches are dead code. even when dev->flag IFF_MULTICAST (defined 0x1000) is set, dev->flags & IFF_MULTICAST & [boolean] always evaluates to 0. Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Cc: Francois Romieu <romieu@fr.zoreil.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
-
由 Patrick McHardy 提交于
When creation of a new conntrack entry in ctnetlink fails after having set up the NAT mappings, the conntrack has an extension area allocated that is not getting properly destroyed when freeing the conntrack again. This means the NAT extension is still in the bysource hash, causing a crash when walking over the hash chain the next time: BUG: unable to handle kernel paging request at 00120fbd IP: [<c03d394b>] nf_nat_setup_info+0x221/0x58a *pde = 00000000 Oops: 0000 [#1] PREEMPT SMP Pid: 2795, comm: conntrackd Not tainted (2.6.26-rc5 #1) EIP: 0060:[<c03d394b>] EFLAGS: 00010206 CPU: 1 EIP is at nf_nat_setup_info+0x221/0x58a EAX: 00120fbd EBX: 00120fbd ECX: 00000001 EDX: 00000000 ESI: 0000019e EDI: e853bbb4 EBP: e853bbc8 ESP: e853bb78 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process conntrackd (pid: 2795, ti=e853a000 task=f7de10f0 task.ti=e853a000) Stack: 00000000 e853bc2c e85672ec 00000008 c0561084 63c1db4a 00000000 00000000 00000000 0002e109 61d2b1c3 00000000 00000000 00000000 01114e22 61d2b1c3 00000000 00000000 f7444674 e853bc04 00000008 c038e728 0000000a f7444674 Call Trace: [<c038e728>] nla_parse+0x5c/0xb0 [<c0397c1b>] ctnetlink_change_status+0x190/0x1c6 [<c0397eec>] ctnetlink_new_conntrack+0x189/0x61f [<c0119aee>] update_curr+0x3d/0x52 [<c03902d1>] nfnetlink_rcv_msg+0xc1/0xd8 [<c0390228>] nfnetlink_rcv_msg+0x18/0xd8 [<c0390210>] nfnetlink_rcv_msg+0x0/0xd8 [<c038d2ce>] netlink_rcv_skb+0x2d/0x71 [<c0390205>] nfnetlink_rcv+0x19/0x24 [<c038d0f5>] netlink_unicast+0x1b3/0x216 ... Move invocation of the extension destructors to nf_conntrack_free() to fix this problem. Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10875Reported-and-Tested-by: NKrzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Leblond 提交于
The message "nf_log_packet: can't log since no backend logging module loaded in! Please either load one, or disable logging explicitly" was displayed for each logged packet when no userspace application is listening to nflog events. The message seems to warn for a problem with a kernel module missing but as said before this is not the case. I thus propose to suppress the message (I don't see any reason to flood the log because a user application has crashed.) Signed-off-by: NEric Leblond <eric@inl.fr> Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 YOSHIFUJI Hideaki 提交于
IPV6_MULTICAST_HOPS, for example, is not valid for stream sockets. Since they are virtually unavailable for stream sockets, we should return ENOPROTOOPT instead of EINVAL. Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
-
由 YOSHIFUJI Hideaki 提交于
Only 0 and 1 are valid for IPV6_MULTICAST_LOOP socket option, and we should return an error of EINVAL otherwise, per RFC3493. Based on patch from Shan Wei <shanwei@cn.fujitsu.com>. Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
-
由 Shan Wei 提交于
When specifing the outgoing hop limit as ancillary data for sendmsg(), the kernel doesn't check the integer hop limit value as specified in [RFC-3542] section 6.3. Signed-off-by: NShan Wei <shanwei@cn.fujitsu.com> Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
-
由 YOSHIFUJI Hideaki 提交于
1) We may have route lifetime larger than INT_MAX. In that case we had wired value in lifetime. Use INT_MAX if lifetime does not fit in s32. 2) Lifetime is valid iif RTF_EXPIRES is set. Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
-
由 YOSHIFUJI Hideaki 提交于
Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
-
- 11 6月, 2008 23 次提交
-
-
由 Gerrit Renker 提交于
Step 8.5 in RFC 4340 says for the newly cloned socket Initialize S.GAR := S.ISS, but what in fact the code (minisocks.c) does is Initialize S.GAR := S.ISR, which is wrong (typo?) -- fixed by the patch. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
-
由 Gerrit Renker 提交于
This fixes a bug in computing the inter-packet-interval t_ipi = s/X: scaled_div32(a, b) uses u32 for b, but in "scaled_div32(s, X)" the type of the sending rate `X' is u64. Since X is scaled by 2^6, this truncates rates greater than 2^26 Bps (~537 Mbps). Using full 64-bit division now. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
-
由 Gerrit Renker 提交于
This fixes a bug in the reverse lookup of p: given a value f(p), instead of p, the function returned the smallest tabulated value f(p). The smallest tabulated value of 10^6 * f(p) = sqrt(2*p/3) + 12 * sqrt(3*p/8) * (32 * p^3 + p) for p=0.0001 is 8172. Since this value is scaled by 10^6, the outcome of this bug is that a loss of 8172/10^6 = 0.8172% was reported whenever the input was below the table resolution of 0.01%. This means that the value was over 80 times too high, resulting in large spikes of the initial loss interval, thus unnecessarily reducing the throughput. Also corrected the printk format (%u for u32). Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
-
由 Gerrit Renker 提交于
This fixes an oversight from an earlier patch, ensuring that Ack Vectors are not processed on request sockets. The issue is that Ack Vectors must not be parsed on request sockets, since the Ack Vector feature depends on the selection of the (TX) CCID. During the initial handshake the CCIDs are undefined, and so RFC 4340, 10.3 applies: "Using CCID-specific options and feature options during a negotiation for the corresponding CCID feature is NOT RECOMMENDED [...]" And it is not even possible: when the server receives the Request from the client, the CCID and Ack vector features are undefined; when the Ack finalising the 3-way hanshake arrives, the request socket has not been cloned yet into a full socket. (This order is necessary, since otherwise the newly created socket would have to be destroyed whenever an option error occurred - a malicious hacker could simply send garbage options and exploit this.) Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
-
由 Gerrit Renker 提交于
This patch fixes the following sparse warnings: * nested min(max()) expression: net/dccp/ccids/ccid3.c:91:21: warning: symbol '__x' shadows an earlier one net/dccp/ccids/ccid3.c:91:21: warning: symbol '__y' shadows an earlier one * Declaration of function prototypes in .c instead of .h file, resulting in "should it be static?" warnings. * Declared "struct dccpw" static (local to dccp_probe). * Disabled dccp_delayed_ack() - not fully removed due to RFC 4340, 11.3 ("Receivers SHOULD implement delayed acknowledgement timers ..."). * Used a different local variable name to avoid net/dccp/ackvec.c:293:13: warning: symbol 'state' shadows an earlier one net/dccp/ackvec.c:238:33: originally declared here * Removed unused functions `dccp_ackvector_print' and `dccp_ackvec_print'. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
-
由 Gerrit Renker 提交于
In commit $(825de27d) (from 27th May, commit message `dccp ccid-3: Fix "t_ipi explosion" bug'), the CCID-3 window counter computation was fixed to cope with RTTs < 4 microseconds. Such RTTs can be found e.g. when running CCID-3 over loopback. The fix removed a check against RTT < 4, but introduced a divide-by-zero bug. All steady-state RTTs in DCCP are filtered using dccp_sample_rtt(), which ensures non-zero samples. However, a zero RTT is possible on initialisation, when there is no RTT sample from the Request/Response exchange. The fix is to use the fallback-RTT from RFC 4340, 3.4. This is also better than just fixing update_win_count() since it allows other parts of the code to always assume that the RTT is non-zero during the time that the CCID is used. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk>
-
由 David S. Miller 提交于
-
由 Krzysztof Piotr Oledzki 提交于
Most legacy software do not like tables > 255 as rtm_table is u8 so tb_id is sent &0xff and it is possible to mismatch for example table 510 with table 254 (main). This patch introduces RT_TABLE_COMPAT=252 so the code uses it if tb_id > 255. It makes such old applications happy, new ones are still able to use RTA_TABLE to get a proper table id. Signed-off-by: NKrzysztof Piotr Oledzki <ole@ans.pl> Acked-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ben Hutchings 提交于
dev_close() must be called holding the RTNL. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Jay Cliburn 提交于
Using vendor magic to force the PHY into power save mode breaks suspend. It isn't needed anyway, so remove it. Tested-by: NAvuton Olrich <avuton@gmail.com> Signed-off-by: NJay Cliburn <jacliburn@bellsouth.net> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Frank Blaschka 提交于
In case the xmit function drop out with an error, we have to wake the netdevice queue to start another xmit. Signed-off-by: NFrank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Peter Tiedemann 提交于
Prepare-function to call s390dbf was wrong handling variable arguments. This worked as macro but not as function any more. Now using va_list processing. Signed-off-by: NPeter Tiedemann <ptiedem@de.ibm.com> Signed-off-by: NFrank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Frank Blaschka 提交于
Remove unnecessary messages. Write important debug information to s390dbf. Signed-off-by: NFrank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Cornelia Huck 提交于
Get the devno from the ccw device via ccw_device_get_id() instead of parsing the bus_id. Signed-off-by: NCornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: NFrank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Frank Blaschka 提交于
The ip event handler may present us non qeth network interfaces. Add qeth card pointer check. Signed-off-by: NFrank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Rusty Russell 提交于
virtio_net uses a timer to free old transmitted packets, rather than leaving callbacks enabled all the time. If the host promises to always notify us when the transmit ring is empty, we can free packets at that point and avoid the timer. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Mark McLoughlin 提交于
virtio_net currently only frees old transmit skbs just before queueing new ones. If the queue is full, it then enables interrupts and waits for notification that more work has been performed. However, a side-effect of this scheme is that there are always xmit skbs left dangling when no new packets are sent, against the Documentation/networking/driver.txt guideline: "... it is not allowed for your TX mitigation scheme to let TX packets "hang out" in the TX ring unreclaimed forever if no new TX packets are sent." Add a timer to ensure that any time we queue new TX skbs, we will shortly free them again. This fixes an easily reproduced hang at shutdown where iptables attempts to unload nf_conntrack and nf_conntrack waits for an skb it is tracking to be freed, but virtio_net never frees it. Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Mark McLoughlin 提交于
Signed-off-by: NMark McLoughlin <markmc@redhat.com> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Mark McLoughlin 提交于
hdr->csum_start is the offset from the start of the ethernet header to the transport layer checksum field. skb->csum_start is the offset from skb->head. skb_partial_csum_set() assumes that skb->data points to the ethernet header - i.e. it computes skb->csum_start by adding the headroom to hdr->csum_start. Since eth_type_trans() skb_pull()s the ethernet header, skb_partial_csum_set() should be called before eth_type_trans(). (Without this patch, GSO packets from a guest to the world outside the host are corrupted). Signed-off-by: NMark McLoughlin <markmc@redhat.com> Acked-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Jan-Bernd Themann 提交于
eHEA has to call firmware functions in order to change the mac address of a logical port. This patch checks if the logical port is up when calling the register / deregister mac address calls. If the port is down these firmware calls would fail and are therefore not executed. Signed-off-by: NJan-Bernd Themann <themann@de.ibm.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Steve Hodgson 提交于
RX queue flush can fail if traffic continues to arrive. Recover by performing an invisible reset. Signed-off-by: NBen Hutchings <bhutchings@solarflare.com> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-
由 Adrian Bunk 提交于
This patch fixes the following build error: <-- snip --> ... Building modules, stage 2. MODPOST 1203 modules ERROR: "lance_open" [drivers/net/mvme147.ko] undefined! ERROR: "lance_close" [drivers/net/mvme147.ko] undefined! ERROR: "lance_tx_timeout" [drivers/net/mvme147.ko] undefined! ERROR: "lance_set_multicast" [drivers/net/mvme147.ko] undefined! ERROR: "lance_start_xmit" [drivers/net/mvme147.ko] undefined! ... make[2]: *** [__modpost] Error 1 <-- snip --> Reported-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NAdrian Bunk <bunk@kernel.org> Signed-off-by: NJeff Garzik <jgarzik@redhat.com>
-