- 03 12月, 2006 3 次提交
-
-
由 Stephen Hemminger 提交于
Allow normal users to only choose among a restricted set of congestion control choices. The default is reno and what ever has been configured as default. But the policy can be changed by administrator at any time. For example, to allow any choice: cp /proc/sys/net/ipv4/tcp_available_congestion_control \ /proc/sys/net/ipv4/tcp_allowed_congestion_control Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Create /proc/sys/net/ipv4/tcp_available_congestion_control that reflects currently available TCP choices. Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for each LISTEN socket, regardless of various parameters (listen backlog for example) On x86_64, this means order-1 allocations (might fail), even for 'small' sockets, expecting few connections. On the contrary, a huge server wanting a backlog of 50000 is slowed down a bit because of this fixed limit. This patch makes the sizing of listen hash table a dynamic parameter, depending of : - net.core.somaxconn tunable (default is 128) - net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128) - backlog value given by user application (2nd parameter of listen()) For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of kmalloc(). We still limit memory allocation with the two existing tunables (somaxconn & tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM usage. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 8月, 2006 1 次提交
-
-
由 Wei Yongjun 提交于
Refer to RFC2012, tcpAttemptFails is defined as following: tcpAttemptFails OBJECT-TYPE SYNTAX Counter32 MAX-ACCESS read-only STATUS current DESCRIPTION "The number of times TCP connections have made a direct transition to the CLOSED state from either the SYN-SENT state or the SYN-RCVD state, plus the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state." ::= { tcp 7 } When I lookup into RFC793, I found that the state change should occured under following condition: 1. SYN-SENT -> CLOSED a) Received ACK,RST segment when SYN-SENT state. 2. SYN-RCVD -> CLOSED b) Received SYN segment when SYN-RCVD state(came from LISTEN). c) Received RST segment when SYN-RCVD state(came from SYN-SENT). d) Received SYN segment when SYN-RCVD state(came from SYN-SENT). 3. SYN-RCVD -> LISTEN e) Received RST segment when SYN-RCVD state(came from LISTEN). In my test, those direct state transition can not be counted to tcpAttemptFails. Signed-off-by: NWei Yongjun <yjwei@nanjing-fnst.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 7月, 2006 1 次提交
-
-
由 Herbert Xu 提交于
Certain subsystems in the stack (e.g., netfilter) can break the partial checksum on GSO packets. Until they're fixed, this patch allows this to work by recomputing the partial checksums through the GSO mechanism. Once they've all been converted to update the partial checksum instead of clearing it, this workaround can be removed. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 7月, 2006 1 次提交
-
-
由 Herbert Xu 提交于
This patch generalises the TSO-specific bits from sk_setup_caps by adding the sk_gso_type member to struct sock. This makes sk_setup_caps generic so that it can be used by TCPv6 or UFO. The only catch is that whoever uses this must provide a GSO implementation for their protocol which I think is a fair deal :) For now UFO continues to live without a GSO implementation which is OK since it doesn't use the sock caps field at the moment. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 6月, 2006 1 次提交
-
-
由 Herbert Xu 提交于
When GSO packets come from an untrusted source (e.g., a Xen guest domain), we need to verify the header integrity before passing it to the hardware. Since the first step in GSO is to verify the header, we can reuse that code by adding a new bit to gso_type: SKB_GSO_DODGY. Packets with this bit set can only be fed directly to devices with the corresponding bit NETIF_F_GSO_ROBUST. If the device doesn't have that bit, then the skb is fed to the GSO engine which will allow the packet to be sent to the hardware if it passes the header check. This patch changes the sg flag to a full features flag. The same method can be used to implement TSO ECN support. We simply have to mark packets with CWR set with SKB_GSO_ECN so that only hardware with a corresponding NETIF_F_TSO_ECN can accept them. The GSO engine can either fully segment the packet, or segment the first MTU and pass the rest to the hardware for further segmentation. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 6月, 2006 2 次提交
-
-
由 Herbert Xu 提交于
This patch adds the GSO implementation for IPv4 TCP. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not going to scale if we add any more segmentation methods (e.g., DCCP). So let's merge them. They were used to tell the protocol of a packet. This function has been subsumed by the new gso_type field. This is essentially a set of netdev feature bits (shifted by 16 bits) that are required to process a specific skb. As such it's easy to tell whether a given device can process a GSO skb: you just have to and the gso_type field and the netdev's features field. I've made gso_type a conjunction. The idea is that you have a base type (e.g., SKB_GSO_TCPV4) that can be modified further to support new features. For example, if we add a hardware TSO type that supports ECN, they would declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO packets would be SKB_GSO_TCPV4. This means that only the CWR packets need to be emulated in software. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 6月, 2006 5 次提交
-
-
由 David S. Miller 提交于
A lot of people have asked for a way to disable tcp_cwnd_restart(), and it seems reasonable to add a sysctl to do that. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Many of the TCP congestion methods all just use ssthresh as the minimum congestion window on decrease. Rather than duplicating the code, just have that be the default if that handle in the ops structure is not set. Minor behaviour change to TCP compound. It probably wants to use this (ssthresh) as lower bound, rather than ssthresh/2 because the latter causes undershoot on loss. Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Leech 提交于
Any socket recv of less than this ammount will not be offloaded Signed-off-by: NChris Leech <christopher.leech@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Leech 提交于
Needed to be able to call tcp_cleanup_rbuf in tcp_input.c for I/OAT Signed-off-by: NChris Leech <christopher.leech@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chris Leech 提交于
Adds an async_wait_queue and some additional fields to tcp_sock, and a dma_cookie_t to sk_buff. Signed-off-by: NChris Leech <christopher.leech@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 4月, 2006 1 次提交
-
-
由 David Woodhouse 提交于
Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org>
-
- 31 3月, 2006 1 次提交
-
-
由 David S. Miller 提交于
Noticed by Alan Menegotto. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 3月, 2006 3 次提交
-
-
由 Dmitry Mishin 提交于
This patch extends {get|set}sockopt compatibility layer in order to move protocol specific parts to their place and avoid huge universal net/compat.c file in the future. Signed-off-by: NDmitry Mishin <dim@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Rick Jones 提交于
Back in the dark ages, we had to be conservative and only allow 15-bit window fields if the window scale option was not negotiated. Some ancient stacks used a signed 16-bit quantity for the window field of the TCP header and would get confused. Those days are long gone, so we can use the full 16-bits by default now. There is a sysctl added so that we can still interact with such old stacks Signed-off-by: NRick Jones <rick.jones2@hp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Heffner 提交于
Implementation of packetization layer path mtu discovery for TCP, based on the internet-draft currently found at <http://www.ietf.org/internet-drafts/draft-ietf-pmtud-method-05.txt>. Signed-off-by: NJohn Heffner <jheffner@psc.edu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 1月, 2006 4 次提交
-
-
由 Stephen Hemminger 提交于
TCP inline usage cleanup: * get rid of inline in several places * replace __inline__ with inline where possible * move functions used in one file out of tcp.h * let compiler decide on used once cases On x86_64: text data bss dec hex filename 3594701 648348 567400 4810449 4966d1 vmlinux.orig 3593133 648580 567400 4809113 496199 vmlinux On sparc64: text data bss dec hex filename 2538278 406152 530392 3474822 350586 vmlinux.ORIG 2536382 406384 530392 3473158 34ff06 vmlinux Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
So that we can share several timewait sockets related functions and make the timewait mini sockets infrastructure closer to the request mini sockets one. Next changesets will take advantage of this, moving more code out of TCP and DCCP v4 and v6 to common infrastructure. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
And move it to struct inet_connection_sock. DCCP will use it in the upcoming changesets. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 11月, 2005 1 次提交
-
-
由 Stephen Hemminger 提交于
From Joe Perches Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 11月, 2005 6 次提交
-
-
由 Stephen Hemminger 提交于
Use "hints" to speed up the SACK processing. Various forms of this have been used by TCP developers (Web100, STCP, BIC) to avoid the 2x linear search of outstanding segments. Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Minor spelling fixes for TCP code. Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
This is an updated version of the RFC3465 ABC patch originally for Linux 2.6.11-rc4 by Yee-Ting Li. ABC is a way of counting bytes ack'd rather than packets when updating congestion control. The orignal ABC described in the RFC applied to a Reno style algorithm. For advanced congestion control there is little change after leaving slow start. Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
Move all the code that does linear TCP slowstart to one inline function to ease later patch to add ABC support. Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Stephen Hemminger 提交于
TCP peformance with TSO over networks with delay is awful. On a 100Mbit link with 150ms delay, we get 4Mbits/sec with TSO and 50Mbits/sec without TSO. The problem is with TSO, we intentionally do not keep the maximum number of packets in flight to fill the window, we hold out to until we can send a MSS chunk. But, we also don't update the congestion window unless we have filled, as per RFC2861. This patch replaces the check for the congestion window being full with something smarter that accounts for TSO. Signed-off-by: NStephen Hemminger <shemminger@osdl.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Herbert Xu 提交于
Here is the patch that introduces the generic skb_checksum_complete which also checks for hardware RX checksum faults. If that happens, it'll call netdev_rx_csum_fault which currently prints out a stack trace with the device name. In future it can turn off RX checksum. I've converted every spot under net/ that does RX checksum checks to use skb_checksum_complete or __skb_checksum_complete with the exceptions of: * Those places where checksums are done bit by bit. These will call netdev_rx_csum_fault directly. * The following have not been completely checked/converted: ipmr ip_vs netfilter dccp This patch is based on patches and suggestions from Stephen Hemminger and David S. Miller. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 10月, 2005 1 次提交
-
-
由 Al Viro 提交于
- added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 02 9月, 2005 1 次提交
-
-
由 David S. Miller 提交于
All we need to do is resegment the queue so that we record SACK information accurately. The edges of the SACK blocks guide our resegmenting decisions. With help from Herbert Xu. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 8月, 2005 8 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Of this type, mostly: CHECK net/ipv6/netfilter.c net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static? net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static? Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
With this the previous setup is back, i.e. tcp_diag can be built as a module, as dccp_diag and both share the infrastructure available in inet_diag. If one selects CONFIG_INET_DIAG as module CONFIG_INET_TCP_DIAG will also be built as a module, as will CONFIG_INET_DCCP_DIAG, if CONFIG_IP_DCCP was selected static or as a module, if CONFIG_INET_DIAG is y, being statically linked CONFIG_INET_TCP_DIAG will follow suit and CONFIG_INET_DCCP_DIAG will be built in the same manner as CONFIG_IP_DCCP. Now to aim at UDP, converting it to use inet_hashinfo, so that we can use iproute2 for UDP sockets as well. Ah, just to show an example of this new infrastructure working for DCCP :-) [root@qemu ~]# ./ss -dane State Recv-Q Send-Q Local Address:Port Peer Address:Port LISTEN 0 0 *:5001 *:* ino:942 sk:cfd503a0 ESTAB 0 0 127.0.0.1:5001 127.0.0.1:32770 ino:943 sk:cfd50a60 ESTAB 0 0 127.0.0.1:32770 127.0.0.1:5001 ino:947 sk:cfd50700 TIME-WAIT 0 0 127.0.0.1:32769 127.0.0.1:5001 timer:(timewait,3.430ms,0) ino:0 sk:cf209620 Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
Next changeset will rename tcp_diag.[ch] to inet_diag.[ch]. I'm taking this longer route so as to easy review, making clear the changes made all along the way. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(), minimal renaming/moving done in this changeset to ease review. Most of it is just changes of struct tcp_sock * to struct sock * parameters. With this we move to a state closer to two interesting goals: 1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used for any INET transport protocol that has struct inet_hashinfo and are derived from struct inet_connection_sock. Keeps the userspace API, that will just not display DCCP sockets, while newer versions of tools can support DCCP. 2. INET generic transport pluggable Congestion Avoidance infrastructure, using the current TCP CA infrastructure with DCCP. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
Also export the ones that will be used in the next changeset, when DCCP uses this infrastructure. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
That groups all of the tables and variables associated to the TCP timewait schedulling/recycling/killing code, that now can be isolated from the TCP specific code and used by other transport protocols, such as DCCP. Next changeset will move this code to net/ipv4/inet_timewait_sock.c Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
This also improves reqsk_queue_prune and renames it to inet_csk_reqsk_queue_prune, as it deals with both inet_connection_sock and inet_request_sock objects, not just with request_sock ones thus belonging to inet_request_sock. Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
With this we're very close to getting all of the current TCP refactorings in my dccp-2.6 tree merged, next changeset will export some functions needed by the current DCCP code and then dccp-2.6.git will be born! Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-