- 06 7月, 2005 22 次提交
-
-
由 David S. Miller 提交于
This makes it easier to understand, and allows easier tweaking of the heuristic later on. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
In tcp_clean_rtx_queue(), if the TSO packet is not even partially acked, do not waste time calling tcp_tso_acked(). Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Everything stated there is out of data. tcp_trim_skb() does adjust the available socket send buffer space and skb->truesize now. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Only put user data purely to pages when doing TSO. The extra page allocations cause two problems: 1) Add the overhead of the page allocations themselves. 2) Make us do small user copies when we get to the end of the TCP socket cache page. It is still beneficial to purely use pages for TSO, so we will do it for that case. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
tcp_snd_test() is run for every packet output by a single call to tcp_write_xmit(), but this is not necessary. For one, the congestion window space needs to only be calculated one time, then used throughout the duration of the loop. This cleanup also makes experimenting with different TSO packetization schemes much easier. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
tcp_snd_test() does several different things, use inline functions to express this more clearly. 1) It initializes the TSO count of SKB, if necessary. 2) It performs the Nagle test. 3) It makes sure the congestion window is adhered to. 4) It makes sure SKB fits into the send window. This cleanup also sets things up so that things like the available packets in the congestion window does not need to be calculated multiple times by packet sending loops such as tcp_write_xmit(). Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
'nonagle' should be passed to the tcp_snd_test() function as 'TCP_NAGLE_PUSH' if we are checking an SKB not at the tail of the write_queue. This is because Nagle does not apply to such frames since we cannot possibly tack more data onto them. However, while doing this __tcp_push_pending_frames() makes all of the packets in the write_queue use this modified 'nonagle' value. Fix the bug and simplify this function by just calling tcp_write_xmit() directly if sk_send_head is non-NULL. As a result, we can now make tcp_data_snd_check() just call tcp_push_pending_frames() instead of the specialized __tcp_data_snd_check(). Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
tcp_write_xmit() uses tcp_current_mss(), but some of it's callers, namely __tcp_push_pending_frames(), already has this value available already. While we're here, fix the "cur_mss" argument to be "unsigned int" instead of plain "unsigned". Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Put the main basic block of work at the top-level of tabbing, and mark the TCP_CLOSE test with unlikely(). Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
The tcp_cwnd_validate() function should only be invoked if we actually send some frames, yet __tcp_push_pending_frames() will always invoke it. tcp_write_xmit() does the call for us, so the call here can simply be removed. Also, tcp_write_xmit() can be marked static. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
When we add any new packet to the TCP socket write queue, we must call skb_header_release() on it in order for the TSO sharing checks in the drivers to work. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
It reimplements portions of tcp_snd_check(), so it we move it to tcp_output.c we can consolidate it's logic much easier in a later change. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This just moves the code into tcp_output.c, no code logic changes are made by this patch. Using this as a baseline, we can begin to untangle the mess of comparisons for the Nagle test et al. We will also be able to reduce all of the redundant computation that occurs when outputting data packets. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
On each packet output, we call tcp_dec_quickack_mode() if the ACK flag is set. It drops tp->ack.quick until it hits zero, at which time we deflate the ATO value. When doing TSO, we are emitting multiple packets with ACK set, so we should decrement tp->ack.quick that many segments. Note that, unlike this case, tcp_enter_cwr() should not take the tcp_skb_pcount(skb) into consideration. That function, one time, readjusts tp->snd_cwnd and moves into TCP_CA_CWR state. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
The ideal and most optimal layout for an SKB when doing scatter-gather is to put all the headers at skb->data, and all the user data in the page array. This makes SKB splitting and combining extremely simple, especially before a packet goes onto the wire the first time. So, when sk_stream_alloc_pskb() is given a zero size, make sure there is no skb_tailroom(). This is achieved by applying SKB_DATA_ALIGN() to the header length used here. Next, make select_size() in TCP output segmentation use a length of zero when NETIF_F_SG is true on the outgoing interface. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Robert Olsson 提交于
Below a patch to preallocate memory when doing resize of trie (inflate halve) If preallocations fails it just skips the resize of this tnode for this time. The oops we got when killing bgpd (with full routing) is now gone. Patrick memory patch is also used. Signed-off-by: NRobert Olsson <robert.olsson@its.uu.se> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
- rt_check_expire() fixes (an overflow occured if size of the hash was >= 65536) reminder of the bugfix: The rt_check_expire() has a serious problem on machines with large route caches, and a standard HZ value of 1000. With default values, ie ip_rt_gc_interval = 60*HZ = 60000 ; the loop count : for (t = ip_rt_gc_interval << rt_hash_log; t >= 0; overflows (t is a 31 bit value) as soon rt_hash_log is >= 16 (65536 slots in route cache hash table). In this case, rt_check_expire() does nothing at all Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
- rt hash table allocated using alloc_large_system_hash() function. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
- Locking abstraction - Spinlocks moved out of rt hash table : Less memory (50%) used by rt hash table. it's a win even on UP. - Sizing of spinlocks table depends on NR_CPUS Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
Inflating a node a couple of times makes it exceed the 128k kmalloc limit. Use __get_free_pages for allocations > PAGE_SIZE, as in fib_hash. Signed-off-by: NPatrick McHardy <kaber@trash.net> Acked-by: NRobert Olsson <Robert.Olsson@data.slu.se> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
Makes IPv4 ip_rcv registration happen last in af_inet. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Thomas Graf 提交于
Signed-off-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 6月, 2005 7 次提交
-
-
由 Patrick McHardy 提交于
In 2.6.12 we started dropping the conntrack reference when a packet leaves the IP layer. This broke connection tracking on a bridge, because bridge-netfilter defers calling some NF_IP_* hooks to the bridge layer for locally generated packets going out a bridge, where the conntrack reference is no longer available. This patch keeps the reference in this case as a temporary solution, long term we will remove the defered hook calling. No attempt is made to drop the reference in the bridge-code when it is no longer needed, tc actions could already have sent the packet anywhere. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neil Horman 提交于
In an smp system, it is possible for an connection timer to expire, calling ip_vs_conn_expire while the connection table is being flushed, before ct_write_lock_bh is acquired. Since the list iterator loop in ip_vs_con_flush releases and re-acquires the spinlock (even though it doesn't re-enable softirqs), it is possible for the expiration function to modify the connection list, while it is being traversed in ip_vs_conn_flush. The result is that the next pointer gets set to NULL, and subsequently dereferenced, resulting in an oops. Signed-off-by: NNeil Horman <nhorman@redhat.com> Acked-by: JulianAnastasov Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Robert Olsson 提交于
This should help up the insertion... but the resize is more crucial. and complex and needs some thinking. Signed-off-by: NRobert Olsson <robert.olsson@its.uu.se> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Maxime Bizon 提交于
I think there is a small bug in ipconfig.c in case IPCONFIG_DHCP is set and dhcp is used. When a DHCPOFFER is received, ip address is kept until we get DHCPACK. If no ack is received, ic_dynamic() returns negatively, but leaves the offered ip address in ic_myaddr. This makes the main loop in ip_auto_config() break and uses the maybe incomplete configuration. Not sure if it's the best way to do, but the following trivial patch correct this. Signed-off-by: NMaxime Bizon <mbizon@freebox.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dietmar Eggemann 提交于
I followed Thomas' proposal to see every martian destination as a case where the ipInAddrErrors counter has to be incremented. There are two advantages by doing so: (1) The relation between the ipInReceive counter and all the other ipInXXX counters is more accurate in the case the RTN_UNICAST code check fails and (2) it makes the code in ip_route_input_slow easier. Signed-off-by: NDietmar Eggemann <dietmar.eggemann@gmx.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Patrick McHardy 提交于
Mostly missing initialization of padding fields of 1 or 2 bytes length, two instances of uninitialized nlmsgerr->msg of 16 bytes length. Signed-off-by: NPatrick McHardy <kaber@trash.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Harald Welte 提交于
This patch adds mangling of ARP requests (in addition to replies), since ARP caches are made from snooping both requests and replies. Signed-off-by: NHarald Welte <laforge@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 6月, 2005 3 次提交
-
-
由 pageexec 提交于
From: <pageexec@freemail.hu> $subject was fixed in 2.4 already, 2.6 needs it as well. The impact of the bugs is a kernel stack overflow and privilege escalation from CAP_NET_ADMIN via the IP_VS_SO_SET_STARTDAEMON/IP_VS_SO_GET_DAEMON ioctls. People running with 'root=all caps' (i.e., most users) are not really affected (there's nothing to escalate), but SELinux and similar users should take it seriously if they grant CAP_NET_ADMIN to other users. Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Adrian Bunk 提交于
It doesn't seem to make much sense to let an "If unsure, say N." option default to y. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Since it is tristate when we offer it as a choice, we should definte it also as tristate when forcing it as the default. Otherwise kconfig warns. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 6月, 2005 2 次提交
-
-
由 David S. Miller 提交于
Create TCP_CONG_ADVANCED option, akin to IP_ADVANCED_ROUTER, which when disabled will bypass all of the congestion control Kconfig options and leave the user with a safe default. That safe default is currently BIC-TCP with new Reno as a fallback. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Most users need not be concerned with a complex choice of what FIB lookup algorithm to use. So give them the safe default of IP_FIB_HASH if IP_ADVANCED_ROUTING is disabled. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 6月, 2005 6 次提交
-
-
由 Stephen Hemminger 提交于
Allow using setsockopt to set TCP congestion control to use on a per socket basis. Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Heffner 提交于
This patch implements Tom Kelly's Scalable TCP congestion control algorithm for the modular framework. The algorithm has some nice scaling properties, and has been used a fair bit in research, though is known to have significant fairness issues, so it's not really suitable for general purpose use. Signed-off-by: NJohn Heffner <jheffner@psc.edu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Baruch Even 提交于
H-TCP is a congestion control algorithm developed at the Hamilton Institute, by Douglas Leith and Robert Shorten. It is extending the standard Reno algorithm with mode switching is thus a relatively simple modification. H-TCP is defined in a layered manner as it is still a research platform. The basic form includes the modification of beta according to the ratio of maxRTT to min RTT and the alpha=2*factor*(1-beta) relation, where factor is dependant on the time since last congestion. The other layers improve convergence by adding appropriate factors to alpha. The following patch implements the H-TCP algorithm in it's basic form. Signed-Off-By: NBaruch Even <baruch@ev-en.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
TCP Vegas code modified for the new TCP infrastructure. Vegas now uses microsecond resolution timestamps for better estimation of performance over higher speed links. Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniele Lacamera 提交于
TCP Hybla congestion avoidance. - "In heterogeneous networks, TCP connections that incorporate a terrestrial or satellite radio link are greatly disadvantaged with respect to entirely wired connections, because of their longer round trip times (RTTs). To cope with this problem, a new TCP proposal, the TCP Hybla, is presented and discussed in the paper[1]. It stems from an analytical evaluation of the congestion window dynamics in the TCP standard versions (Tahoe, Reno, NewReno), which suggests the necessary modifications to remove the performance dependence on RTT.[...]"[1] [1]: Carlo Caini, Rosario Firrincieli, "TCP Hybla: a TCP enhancement for heterogeneous networks", International Journal of Satellite Communications and Networking Volume 22, Issue 5 , Pages 547 - 566. September 2004. Signed-off-by: Daniele Lacamera (root at danielinux.net)net Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Heffner 提交于
Sally Floyd's high speed TCP congestion control. This is useful for comparison and research. Signed-off-by: NJohn Heffner <jheffner@psc.edu> Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-