- 22 4月, 2012 2 次提交
-
-
由 Pavel Emelyanov 提交于
This includes (according the the previous description): * TCP_REPAIR sockoption This one just puts the socket in/out of the repair mode. Allowed for CAP_NET_ADMIN and for closed/establised sockets only. When repair mode is turned off and the socket happens to be in the established state the window probe is sent to the peer to 'unlock' the connection. * TCP_REPAIR_QUEUE sockoption This one sets the queue which we're about to repair. The 'no-queue' is set by default. * TCP_QUEUE_SEQ socoption Sets the write_seq/rcv_nxt of a selected repaired queue. Allowed for TCP_CLOSE-d sockets only. When the socket changes its state the other seq-s are changed by the kernel according to the protocol rules (most of the existing code is actually reused). * Ability to forcibly bind a socket to a port The sk->sk_reuse is set to SK_FORCE_REUSE. * Immediate connect modification The connect syscall initializes the connection, then directly jumps to the code which finalizes it. * Silent close modification The close just aborts the connection (similar to SO_LINGER with 0 time) but without sending any FIN/RST-s to peer. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
This is just the preparation patch, which makes the needed for TCP repair code ready for use. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 4月, 2012 1 次提交
-
-
由 Neal Cardwell 提交于
Commit b82d1bb4 inadvertendly placed unrelated new code between TCPCB_EVER_RETRANS and TCPCB_RETRANS and the other macros that refer to the sacked field in the struct tcp_skb_cb (probably because there was a misleading empty line there). This commit fixes up the formatting so that all macros related to the sacked field are adjacent again. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 4月, 2012 1 次提交
-
-
由 Eric Dumazet 提交于
Use of "unsigned int" is preferred to bare "unsigned" in net tree. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 4月, 2012 1 次提交
-
-
由 Eric Dumazet 提交于
Updates some comments to track RFC6298 Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 3月, 2012 1 次提交
-
-
由 Paul Gortmaker 提交于
If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any other BUG variant in a static inline (i.e. not in a #define) then that header really should be including <linux/bug.h> and not just expecting it to be implicitly present. We can make this change risk-free, since if the files using these headers didn't have exposure to linux/bug.h already, they would have been causing compile failures/warnings. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 29 2月, 2012 1 次提交
-
-
由 Neal Cardwell 提交于
There was an off-by-one error in the comments describing the highest_sack field in struct tcp_sock. The comments previously claimed that it was the "start sequence of the highest skb with SACKed bit". This commit fixes the comments to note that it is the "start sequence of the skb just *after* the highest skb with SACKed bit". Signed-off-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 2月, 2012 1 次提交
-
-
由 Arun Sharma 提交于
Some of our machines were reporting: TCP: too many of orphaned sockets even when the number of orphaned sockets was well below the limit. We print a different message depending on whether we're out of TCP memory or there are too many orphaned sockets. Also move the check out of line and cleanup the messages that were printed. Signed-off-by: NArun Sharma <asharma@fb.com> Suggested-by: NMohan Srinivasan <mohan@fb.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: David Miller <davem@davemloft.net> Cc: Glauber Costa <glommer@parallels.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 2月, 2012 3 次提交
-
-
由 Eric Dumazet 提交于
This patch makes sure we use appropriate memory barriers before publishing tp->md5sig_info, allowing tcp_md5_do_lookup() being used from tcp_v4_send_reset() without holding socket lock (upcoming patch from Shawn Lu) Note we also need to respect rcu grace period before its freeing, since we can free socket without this grace period thanks to SLAB_DESTROY_BY_RCU Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Cc: Shawn Lu <shawn.lu@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
In order to be able to support proper RST messages for TCP MD5 flows, we need to allow access to MD5 keys without locking listener socket. This conversion is a nice cleanup, and shrinks size of timewait sockets by 80 bytes. IPv6 code reuses generic code found in IPv4 instead of duplicating it. Control path uses GFP_KERNEL allocations instead of GFP_ATOMIC. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Cc: Shawn Lu <shawn.lu@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We no longer use md5_add() method from struct tcp_sock_af_ops Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 1月, 2012 1 次提交
-
-
由 Glauber Costa 提交于
sysctl_tcp_mem() initialization was moved to sysctl_tcp_ipv4.c in commit 3dc43e3e, since it became a per-ns value. That code, however, will never run when CONFIG_SYSCTL is disabled, leading to bogus values on those fields - causing hung TCP sockets. This patch fixes it by keeping an initialization code in tcp_init(). It will be overwritten by the first net namespace init if CONFIG_SYSCTL is compiled in, and do the right thing if it is compiled out. It is also named properly as tcp_init_mem(), to properly signal its non-sysctl side effect on TCP limits. Reported-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NGlauber Costa <glommer@parallels.com> Cc: David S. Miller <davem@davemloft.net> Link: http://lkml.kernel.org/r/4F22D05A.8030604@parallels.com [ renamed the function, tidied up the changelog a bit ] Signed-off-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 12月, 2011 1 次提交
-
-
由 Vijay Subramanian 提交于
to record the state of SACK/FACK and DSACK for better readability and maintenance. Signed-off-by: NVijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 12月, 2011 2 次提交
-
-
由 Glauber Costa 提交于
This patch allows each namespace to independently set up its levels for tcp memory pressure thresholds. This patch alone does not buy much: we need to make this values per group of process somehow. This is achieved in the patches that follows in this patchset. Signed-off-by: NGlauber Costa <glommer@parallels.com> Reviewed-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> CC: David S. Miller <davem@davemloft.net> CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Glauber Costa 提交于
This patch replaces all uses of struct sock fields' memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem to acessor macros. Those macros can either receive a socket argument, or a mem_cgroup argument, depending on the context they live in. Since we're only doing a macro wrapping here, no performance impact at all is expected in the case where we don't have cgroups disabled. Signed-off-by: NGlauber Costa <glommer@parallels.com> Reviewed-by: NHiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com> CC: David S. Miller <davem@davemloft.net> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 12月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 11月, 2011 1 次提交
-
-
由 Neal Cardwell 提交于
Since 2005 (c1b4a7e6) tcp_tso_should_defer has been using tcp_max_burst() as a target limit for deciding how large to make outgoing TSO packets when not using sysctl_tcp_tso_win_divisor. But since 2008 (dd9e0dda) tcp_max_burst() returns the reordering degree. We should not have tcp_tso_should_defer attempt to build larger segments just because there is more reordering. This commit splits the notion of deferral size used in TSO from the notion of burst size used in cwnd moderation, and returns the TSO deferral limit to its original value. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 11月, 2011 1 次提交
-
-
由 Michał Mirosław 提交于
v2: add couple missing conversions in drivers split unexporting netdev_fix_features() implemented %pNF convert sock::sk_route_(no?)caps Signed-off-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 11月, 2011 1 次提交
-
-
由 Arjan van de Ven 提交于
the tcp and udp code creates a set of struct file_operations at runtime while it can also be done at compile time, with the added benefit of then having these file operations be const. the trickiest part was to get the "THIS_MODULE" reference right; the naive method of declaring a struct in the place of registration would not work for this reason. Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 10月, 2011 1 次提交
-
-
由 Flavio Leitner 提交于
It was enabled by default and the messages guarded by the define are useful. Signed-off-by: NFlavio Leitner <fbl@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 10月, 2011 2 次提交
-
-
由 Eric Dumazet 提交于
Now tcp_md5_hash_header() has a const tcphdr argument, we can add more const attributes to callers. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
tcp_md5_hash_header() writes into skb header a temporary zero value, this might confuse other users of this area. Since tcphdr is small (20 bytes), copy it in a temporary variable and make the change in the copy. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 10月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
Adding const qualifiers to pointers can ease code review, and spot some bugs. It might allow compiler to optimize code further. For example, is it legal to temporary write a null cksum into tcphdr in tcp_md5_hash_header() ? I am afraid a sniffer could catch the temporary null value... Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 9月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
Rename struct tcp_skb_cb "flags" to "tcp_flags" to ease code review and maintenance. Its content is a combination of FIN/SYN/RST/PSH/ACK/URG/ECE/CWR flags Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 9月, 2011 2 次提交
-
-
由 Eric Dumazet 提交于
struct tcp_skb_cb contains a "flags" field containing either tcp flags or IP dsfield depending on context (input or output path) Introduce ip_dsfield to make the difference clear and ease maintenance. If later we want to save space, we can union flags/ip_dsfield Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
While playing with a new ADSL box at home, I discovered that ECN blackhole can trigger suboptimal quickack mode on linux : We send one ACK for each incoming data frame, without any delay and eventual piggyback. This is because TCP_ECN_check_ce() considers that if no ECT is seen on a segment, this is because this segment was a retransmit. Refine this heuristic and apply it only if we seen ECT in a previous segment, to detect ECN blackhole at IP level. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: Jamal Hadi Salim <jhs@mojatatu.com> CC: Jerry Chu <hkchu@google.com> CC: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> CC: Jim Gettys <jg@freedesktop.org> CC: Dave Taht <dave.taht@gmail.com> Acked-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 9月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
commit 946cedcc (tcp: Change possible SYN flooding messages) added a build error if CONFIG_SYN_COOKIES=n Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 9月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
tcp_md5sig_pool is currently an 'array' (a percpu object) of pointers to struct tcp_md5sig_pool. Only the pointers are NUMA aware, but objects themselves are all allocated on a single node. Remove this extra indirection to get proper percpu memory (NUMA aware) and make code simpler. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 9月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
"Possible SYN flooding on port xxxx " messages can fill logs on servers. Change logic to log the message only once per listener, and add two new SNMP counters to track : TCPReqQFullDoCookies : number of times a SYNCOOKIE was replied to client TCPReqQFullDrop : number of times a SYN request was dropped because syncookies were not enabled. Based on a prior patch from Tom Herbert, and suggestions from David. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: Tom Herbert <therbert@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 6月, 2011 1 次提交
-
-
由 Jerry Chu 提交于
This patch lowers the default initRTO from 3secs to 1sec per RFC2988bis. It falls back to 3secs if the SYN or SYN-ACK packet has been retransmitted, AND the TCP timestamp option is not on. It also adds support to take RTT sample during 3WHS on the passive open side, just like its active open counterpart, and uses it, if valid, to seed the initRTO for the data transmission phase. The patch also resets ssthresh to its initial default at the beginning of the data transmission phase, and reduces cwnd to 1 if there has been MORE THAN ONE retransmission during 3WHS per RFC5681. Signed-off-by: NH.K. Jerry Chu <hkchu@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 2月, 2011 1 次提交
-
-
由 Shan Wei 提交于
Now, TCP_CHECK_TIMER is not used for debuging, it does nothing. And, it has been there for several years, maybe 6 years. Remove it to keep code clearer. Signed-off-by: NShan Wei <shanwei@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 2月, 2011 1 次提交
-
-
由 David S. Miller 提交于
Suggested by Alexander Zimmermann Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 2月, 2011 1 次提交
-
-
由 David S. Miller 提交于
Signed-off-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NNandita Dukkipati <nanditad@google.com>
-
- 25 1月, 2011 1 次提交
-
-
由 Michał Mirosław 提交于
Quoting Ben Hutchings: we presumably won't be defining features that can only be enabled on 64-bit architectures. Occurences found by `grep -r` on net/, drivers/net, include/ [ Move features and vlan_features next to each other in struct netdev, as per Eric Dumazet's suggestion -DaveM ] Signed-off-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 12月, 2010 1 次提交
-
-
由 Nandita Dukkipati 提交于
This patch changes the default initial receive window to 10 mss (defined constant). The default window is limited to the maximum of 10*1460 and 2*mss (when mss > 1460). draft-ietf-tcpm-initcwnd-00 is a proposal to the IETF that recommends increasing TCP's initial congestion window to 10 mss or about 15KB. Leading up to this proposal were several large-scale live Internet experiments with an initial congestion window of 10 mss (IW10), where we showed that the average latency of HTTP responses improved by approximately 10%. This was accompanied by a slight increase in retransmission rate (0.5%), most of which is coming from applications opening multiple simultaneous connections. To understand the extreme worst case scenarios, and fairness issues (IW10 versus IW3), we further conducted controlled testbed experiments. We came away finding minimal negative impact even under low link bandwidths (dial-ups) and small buffers. These results are extremely encouraging to adopting IW10. However, an initial congestion window of 10 mss is useless unless a TCP receiver advertises an initial receive window of at least 10 mss. Fortunately, in the large-scale Internet experiments we found that most widely used operating systems advertised large initial receive windows of 64KB, allowing us to experiment with a wide range of initial congestion windows. Linux systems were among the few exceptions that advertised a small receive window of 6KB. The purpose of this patch is to fix this shortcoming. References: 1. A comprehensive list of all IW10 references to date. http://code.google.com/speed/protocols/tcpm-IW10.html 2. Paper describing results from large-scale Internet experiments with IW10. http://ccr.sigcomm.org/drupal/?q=node/621 3. Controlled testbed experiments under worst case scenarios and a fairness study. http://www.ietf.org/proceedings/79/slides/tcpm-0.pdf 4. Raw test data from testbed experiments (Linux senders/receivers) with initial congestion and receive windows of both 10 mss. http://research.csc.ncsu.edu/netsrv/?q=content/iw10 5. Internet-Draft. Increasing TCP's Initial Window. https://datatracker.ietf.org/doc/draft-ietf-tcpm-initcwnd/Signed-off-by: NNandita Dukkipati <nanditad@google.com> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 12月, 2010 1 次提交
-
-
由 Shan Wei 提交于
These macros never be used, so remove them. Signed-off-by: NShan Wei <shanwei@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 12月, 2010 1 次提交
-
-
由 Eric Dumazet 提交于
Some windows versions have wrong RFC1323 implementations, with SYN and SYNACKS messages containing zero tcp timestamps. We relaxed in commit fc1ad92d the passive connection case (Windows connects to a linux machine), but the reverse case (linux connects to a Windows machine) has an analogue problem when tsvals from windows machine are 'negative' (high order bit set) : PAWS triggers and we drops incoming messages. Fix this by making zero ts_recent value special, allowing frame to be processed. Based on a report and initial patch from Dmitiy Balakin Bugzilla reference : https://bugzilla.kernel.org/show_bug.cgi?id=24842 Reported-by: dmitriy.balakin@nicneiron.ru Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 12月, 2010 1 次提交
-
-
由 Shan Wei 提交于
These macros have been defined for several years since v2.6.12-rc2(tracing by git), but never be used. So remove them. Signed-off-by: NShan Wei <shanwei@cn.fujitsu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 12月, 2010 1 次提交
-
-
由 David S. Miller 提交于
The only thing AF-specific about remembering the timestamp for a time-wait TCP socket is getting the peer. Abstract that behind a new timewait_sock_ops vector. Support for real IPV6 sockets is not filled in yet, but curiously this makes timewait recycling start to work for v4-mapped ipv6 sockets. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 12月, 2010 1 次提交
-
-
由 David S. Miller 提交于
Then we can make a completely generic tcp_remember_stamp() that uses ->get_peer() as a helper, minimizing the AF specific code and minimizing the eventual code duplication when we implement the ipv6 side of TW recycling. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-