- 14 2月, 2013 1 次提交
-
-
由 Andrey Vagin 提交于
This functionality is used for restoring tcp sockets. A tcp timestamp depends on how long a system has been running, so it's differ for each host. The solution is to set a per-socket offset. A per-socket offset for a TIME_WAIT socket is inherited from a proper tcp socket. tcp_request_sock doesn't have a timestamp offset, because the repair mode for them are not implemented. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: NAndrey Vagin <avagin@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 2月, 2013 1 次提交
-
-
由 Eric Dumazet 提交于
We should call skb_share_check() before pskb_may_pull(), or we can crash in pskb_expand_head() Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 2月, 2013 1 次提交
-
-
由 Ilpo Järvinen 提交于
There are transients during normal FRTO procedure during which the packets_in_flight can go to zero between write_queue state updates and firing the resulting segments out. As FRTO processing occurs during that window the check must be more precise to not match "spuriously" :-). More specificly, e.g., when packets_in_flight is zero but FLAG_DATA_ACKED is true the problematic branch that set cwnd into zero would not be taken and new segments might be sent out later. Signed-off-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 2月, 2013 2 次提交
-
-
由 Stephen Hemminger 提交于
TCP Appropriate Byte Count was added by me, but later disabled. There is no point in maintaining it since it is a potential source of bugs and Linux already implements other better window protection heuristics. Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
All in-tree ipv4 protocol implementations are now namespace aware. Therefore all the run-time checks are superfluous. Reject registry of any non-namespace aware ipv4 protocol. Eventually we'll remove prot->netns_ok and this registry time check as well. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 2月, 2013 3 次提交
-
-
由 David S. Miller 提交于
This module is namespace aware, netns_ok was just disabled by default for sanity. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ying Xue 提交于
As in del_timer() there has already placed a timer_pending() function to check whether the timer to be deleted is pending or not, it's unnecessary to check timer pending state again before del_timer() is called. Signed-off-by: NYing Xue <ying.xue@windriver.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vijay Subramanian 提交于
This patch updates LINUX_MIB_LISTENDROPS in tcp_v4_conn_request() and tcp_v4_err(). tcp_v4_conn_request() in particular can drop SYNs for various reasons which are not currently tracked. Signed-off-by: NVijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 2月, 2013 2 次提交
-
-
由 Eric Dumazet 提交于
Commit 9dc27415 (tcp: fix ABC in tcp_slow_start()) uncovered a bug in FRTO code : tcp_process_frto() is setting snd_cwnd to 0 if the number of in flight packets is 0. As Neal pointed out, if no packet is in flight we lost our chance to disambiguate whether a loss timeout was spurious. We should assume it was a proper loss. Reported-by: NPasi Kärkkäinen <pasik@iki.fi> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Since commit 9dc27415 (tcp: fix ABC in tcp_slow_start()), a nul snd_cwnd triggers an infinite loop in tcp_slow_start() Avoid this infinite loop and log a one time error for further analysis. FRTO code is suspected to cause this bug. Reported-by: NPasi Kärkkäinen <pasik@iki.fi> Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 2月, 2013 1 次提交
-
-
由 Yuchung Cheng 提交于
On receiving the SYN-ACK, Fast Open checks icsk_retransmit for SYN retransmission to detect SYN/data drops. But if F-RTO is disabled, icsk_retransmit is reset at step D of tcp_fastretrans_alert() ( under tcp_ack()) before tcp_rcv_fastopen_synack(). The fix is to use total_retrans instead which accounts for SYN retransmission regardless the use of F-RTO. Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 1月, 2013 6 次提交
-
-
由 Nivedita Singhvi 提交于
We drop a connection request if the accept backlog is full and there are sufficient packets in the syn queue to warrant starting drops. Increment the appropriate counters so this isn't silent, for accurate stats and help in debugging. This patch assumes LINUX_MIB_LISTENDROPS is a superset of/includes the counter LINUX_MIB_LISTENOVERFLOWS. Signed-off-by: NNivedita Singhvi <niv@us.ibm.com> Acked-by: NVijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ward 提交于
A GRE tunnel can be configured so that outgoing tunnel packets inherit the value of the TOS field from the inner IP header. In doing so, when a non-IP packet is transmitted through the tunnel, the TOS field will always be set to 0. Instead, the user should be able to configure a different TOS value as the fallback to use for non-IP packets. This is helpful when the non-IP packets are all control packets and should be handled by routers outside the tunnel as having Internet Control precedence. One example of this is the NHRP packets that control a DMVPN-compatible mGRE tunnel; they are encapsulated directly by GRE and do not contain an inner IP header. Under the existing behavior, the IFLA_GRE_TOS parameter must be set to '1' for the TOS value to be inherited. Now, only the least significant bit of this parameter must be set to '1', and when a non-IP packet is sent through the tunnel, the upper 6 bits of this same parameter will be copied into the TOS field. (The ECN bits get masked off as before.) This behavior is backwards-compatible with existing configurations and iproute2 versions. Signed-off-by: NDavid Ward <david.ward@ll.mit.edu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
There are some usecase when lifetime of ipv4 addresses might be helpful. For example: 1) initramfs networkmanager uses a DHCP daemon to learn network configuration parameters 2) initramfs networkmanager addresses, routes and DNS configuration 3) initramfs networkmanager is requested to stop 4) initramfs networkmanager stops all daemons including dhclient 5) there are addresses and routes configured but no daemon running. If the system doesn't start networkmanager for some reason, addresses and routes will be used forever, which violates RFC 2131. This patch is essentially a backport of ivp6 address lifetime mechanism for ipv4 addresses. Current "ip" tool supports this without any patch (since it does not distinguish between ipv4 and ipv6 addresses in this perspective. Also, this should be back-compatible with all current netlink users. Reported-by: NPavel Šimerda <psimerda@redhat.com> Signed-off-by: NJiri Pirko <jiri@resnulli.us> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
Updating the fragmentation queues LRU (Least-Recently-Used) list, required taking the hash writer lock. However, the LRU list isn't tied to the hash at all, so we can use a separate lock for it. Original-idea-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
Replace the per network namespace shared atomic "mem" accounting variable, in the fragmentation code, with a lib/percpu_counter. Getting percpu_counter to scale to the fragmentation code usage requires some tweaks. At first view, percpu_counter looks superfast, but it does not scale on multi-CPU/NUMA machines, because the default batch size is too small, for frag code usage. Thus, I have adjusted the batch size by using __percpu_counter_add() directly, instead of percpu_counter_sub() and percpu_counter_add(). The batch size is increased to 130.000, based on the largest 64K fragment memory usage. This does introduce some imprecise memory accounting, but its does not need to be strict for this use-case. It is also essential, that the percpu_counter, does not share cacheline with other writers, to make this scale. Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
This change is primarily a preparation to ease the extension of memory limit tracking. The change does reduce the number atomic operation, during freeing of a frag queue. This does introduce a some performance improvement, as these atomic operations are at the core of the performance problems seen on NUMA systems. Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 1月, 2013 1 次提交
-
-
由 bingtian.ly@taobao.com 提交于
I found if we write a larger than 4GB value to some sysctl variables, the sending syscall will hang up forever, because these variables are 32 bits, such large values make them overflow to 0 or negative. This patch try to fix overflow or prevent from zero value setup of below sysctl variables: net.core.wmem_default net.core.rmem_default net.core.rmem_max net.core.wmem_max net.ipv4.udp_rmem_min net.ipv4.udp_wmem_min net.ipv4.tcp_wmem net.ipv4.tcp_rmem Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NLi Yu <raise.sail@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 1月, 2013 2 次提交
-
-
由 Eric Dumazet 提交于
Pravin Shelar mentioned that GSO could potentially generate wrong TX checksum if skb has fragments that are overwritten by the user between the checksum computation and transmit. He suggested to linearize skbs but this extra copy can be avoided for normal tcp skbs cooked by tcp_sendmsg(). This patch introduces a new SKB_GSO_SHARED_FRAG flag, set in skb_shinfo(skb)->gso_type if at least one frag can be modified by the user. Typical sources of such possible overwrites are {vm}splice(), sendfile(), and macvtap/tun/virtio_net drivers. Tested: $ netperf -H 7.7.8.84 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.8.84 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 3959.52 $ netperf -H 7.7.8.84 -t TCP_SENDFILE TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.8.84 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 3216.80 Performance of the SENDFILE is impacted by the extra allocation and copy, and because we use order-0 pages, while the TCP_STREAM uses bigger pages. Reported-by: NPravin Shelar <pshelar@nicira.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pravin B Shelar 提交于
Due to IP_GRE GSO support, GRE can recieve non linear skb which results in panic in case of GRE_CSUM. Following patch fixes it by using correct csum API. Bug introduced in commit 6b78f16e (gre: add GSO support) Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 1月, 2013 1 次提交
-
-
由 Tom Herbert 提交于
Fix a reported compilation error where ia variable of type kuid_t was being set to zero. Eliminate two instances of setting tb->fastuid to zero. tb->fastuid is only used if tb->fastreuseport is set, so there should be no problem if tb->fastuid is not initialized (when tb->fastreuesport is zero). Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 1月, 2013 2 次提交
-
-
由 Tom Herbert 提交于
Allow multiple UDP sockets to bind to the same port. Motivation soreuseport would be something like a DNS server. An alternative would be to recv on the same socket from multiple threads. As in the case of TCP, the load across these threads tends to be disproportionate and we also see a lot of contection on the socketlock. Note that SO_REUSEADDR already allows multiple UDP sockets to bind to the same port, however there is no provision to prevent hijacking and nothing to distribute packets across all the sockets sharing the same bound port. This patch does not change the semantics of SO_REUSEADDR, but provides usable functionality of it for unicast. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Allow multiple listener sockets to bind to the same port. Motivation for soresuseport would be something like a web server binding to port 80 running with multiple threads, where each thread might have it's own listener socket. This could be done as an alternative to other models: 1) have one listener thread which dispatches completed connections to workers. 2) accept on a single listener socket from multiple threads. In case #1 the listener thread can easily become the bottleneck with high connection turn-over rate. In case #2, the proportion of connections accepted per thread tends to be uneven under high connection load (assuming simple event loop: while (1) { accept(); process() }, wakeup does not promote fairness among the sockets. We have seen the disproportion to be as high as 3:1 ratio between thread accepting most connections and the one accepting the fewest. With so_reusport the distribution is uniform. Signed-off-by: NTom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 1月, 2013 6 次提交
-
-
由 Gao feng 提交于
Move the code that register/unregister l4proto to the module_init/exit context. Given that we have to modify some interfaces to accomodate these changes, it is a good time to use shorter function names for this using the nf_ct_* prefix instead of nf_conntrack_*, that is: nf_ct_l4proto_register nf_ct_l4proto_pernet_register nf_ct_l4proto_unregister nf_ct_l4proto_pernet_unregister We same many line breaks with it. Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Gao feng 提交于
Move the code that register/unregister l3proto to the module_init/exit context. Given that we have to modify some interfaces to accomodate these changes, it is a good time to use shorter function names for this using the nf_ct_* prefix instead of nf_conntrack_*, that is: nf_ct_l3proto_register nf_ct_l3proto_pernet_register nf_ct_l3proto_unregister nf_ct_l3proto_pernet_unregister We same many line breaks with it. Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 YOSHIFUJI Hideaki / 吉藤英明 提交于
Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 YOSHIFUJI Hideaki / 吉藤英明 提交于
Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
mfc_mcastgrp and mfc_origin are __be32, thus we need to convert INADDR_ANY. Because INADDR_ANY is 0, this patch just fix sparse warnings. Reported-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steffen Klassert 提交于
git commit 9cb3a50c (ipv4: Invalidate the socket cached route on pmtu events if possible) introduced a refcount problem. We don't get a refcount on the route if we get it from__sk_dst_get(), but we need one if we want to reuse this route because __sk_dst_set() releases the refcount of the old route. This patch adds proper refcount handling for that case. We introduce a 'new' flag to indicate that we are going to use a new route and we release the old route only if we replace it by a new one. Reported-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 1月, 2013 3 次提交
-
-
由 Steffen Klassert 提交于
This implements a socket release callback function to check if the socket cached route got invalid during the time we owned the socket. The function is used from udp, raw and ping sockets. Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steffen Klassert 提交于
The route lookup in ipv4_sk_update_pmtu() might return a route different from the route we cached at the socket. This is because standart routes are per cpu, so each cpu has it's own struct rtable. This means that we do not invalidate the socket cached route if the NET_RX_SOFTIRQ is not served by the same cpu that the sending socket uses. As a result, the cached route reused until we disconnect. With this patch we invalidate the socket cached route if possible. If the socket is owened by the user, we can't update the cached route directly. A followup patch will implement socket release callback functions for datagram sockets to handle this case. Reported-by: NYurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
This patch add the support of proxy multicast, ie being able to build a static multicast tree. It adds the support of (*,*) and (*,G) entries. The user should define an (*,*) entry which is not used for real forwarding. This entry defines the upstream in iif and contains all interfaces from the static tree in its oifs. It will be used to forward packet upstream when they come from an interface belonging to the static tree. Hence, the user should define (*,G) entries to build its static tree. Note that upstream interface must be part of oifs: packets are sent to all oifs interfaces except the input interface. This ensures to always join the whole static tree, even if the packet is not coming from the upstream interface. Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: NDavid L Stevens <dlstevens@us.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 1月, 2013 2 次提交
-
-
由 Steffen Klassert 提交于
On IPsec pmtu events we can't access the transport headers of the original packet, so we can't find the socket that sent the packet. The only chance to notify the socket about the pmtu change is to force a relookup for all routes. This patch implenents this for the IPsec protocols. Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
由 Eric Dumazet 提交于
commit 563d34d0 (tcp: dont drop MTU reduction indications) added an error leading to incorrect accounting of LINUX_MIB_LOCKDROPPEDICMPS If socket is owned by the user, we want to increment this SNMP counter, unless the message is a (ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED) one. Reported-by: NMaciej Żenczykowski <maze@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: NMaciej Żenczykowski <maze@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 1月, 2013 1 次提交
-
-
由 Jesper Dangaard Brouer 提交于
Increase the amount of memory usage limits for incomplete IP fragments. Arguing for new thresh high/low values: High threshold = 4 MBytes Low threshold = 3 MBytes The fragmentation memory accounting code, tries to account for the real memory usage, by measuring both the size of frag queue struct (inet_frag_queue (ipv4:ipq/ipv6:frag_queue)) and the SKB's truesize. We want to be able to handle/hold-on-to enough fragments, to ensure good performance, without causing incomplete fragments to hurt scalability, by causing the number of inet_frag_queue to grow too much (resulting longer searches for frag queues). For IPv4, how much memory does the largest frag consume. Maximum size fragment is 64K, which is approx 44 fragments with MTU(1500) sized packets. Sizeof(struct ipq) is 200. A 1500 byte packet results in a truesize of 2944 (not 2048 as I first assumed) (44*2944)+200 = 129736 bytes The current default high thresh of 262144 bytes, is obviously problematic, as only two 64K fragments can fit in the queue at the same time. How many 64K fragment can we fit into 4 MBytes: 4*2^20/((44*2944)+200) = 32.34 fragment in queues An attacker could send a separate/distinct fake fragment packets per queue, causing us to allocate one inet_frag_queue per packet, and thus attacking the hash table and its lists. How many frag queue do we need to store, and given a current hash size of 64, what is the average list length. Using one MTU sized fragment per inet_frag_queue, each consuming (2944+200) 3144 bytes. 4*2^20/(2944+200) = 1334 frag queues -> 21 avg list length An attack could send small fragments, the smallest packet I could send resulted in a truesize of 896 bytes (I'm a little surprised by this). 4*2^20/(896+200) = 3827 frag queues -> 59 avg list length When increasing these number, we also need to followup with improvements, that is going to help scalability. Simply increasing the hash size, is not enough as the current implementation does not have a per hash bucket locking. Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 1月, 2013 2 次提交
-
-
由 Steffen Klassert 提交于
Routes with locked mtu should not use learned pmtu informations, so do not update the pmtu on these routes. Reported-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Steffen Klassert 提交于
The output route check was introduced with git commit 261663b0 (ipv4: Don't use the cached pmtu informations for input routes) during times when we cached the pmtu informations on the inetpeer. Now the pmtu informations are back in the routes, so this check is obsolete. It also had some unwanted side effects, as reported by Timo Teras and Lukas Tribus. Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Acked-by: NTimo Teräs <timo.teras@iki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 1月, 2013 1 次提交
-
-
由 Rami Rosen 提交于
In fib_frontend.c, there is a confusing comment; NETLINK_CB(skb).portid does not refer to a pid of sending process, but rather to a netlink portid. Signed-off-by: NRami Rosen <ramirose@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 1月, 2013 2 次提交
-
-
由 Eric Dumazet 提交于
commit c3ae62af (tcp: should drop incoming frames without ACK flag set) added a regression on the handling of RST messages. RST should be allowed to come even without ACK bit set. We validate the RST by checking the exact sequence, as requested by RFC 793 and 5961 3.2, in tcp_validate_incoming() Reported-by: NEric Wong <normalperson@yhbt.net> Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Tested-by: NEric Wong <normalperson@yhbt.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Under unusual circumstances, TCP collapse can split a big GRO TCP packet while its being used in a splice(socket->pipe) operation. skb_splice_bits() releases the socket lock before calling splice_to_pipe(). [ 1081.353685] WARNING: at net/ipv4/tcp.c:1330 tcp_cleanup_rbuf+0x4d/0xfc() [ 1081.371956] Hardware name: System x3690 X5 -[7148Z68]- [ 1081.391820] cleanup rbuf bug: copied AD3BCF1 seq AD370AF rcvnxt AD3CF13 To fix this problem, we must eat skbs in tcp_recv_skb(). Remove the inline keyword from tcp_recv_skb() definition since it has three call sites. Reported-by: NChristian Becker <c.becker@traviangames.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: NEric Dumazet <edumazet@google.com> Tested-by: NWilly Tarreau <w@1wt.eu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-