- 10 12月, 2011 1 次提交
-
-
由 Pavel Emelyanov 提交于
There's an info_size value stored on inet_diag_handler, but for existing code this value is effectively constant, so just use sizeof(struct tcp_info) where required. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 12月, 2011 11 次提交
-
-
由 Pavel Emelyanov 提交于
This patch moves the sock_ code from inet_diag.c to generic sock_diag.c file and provides necessary request_module-s calls and a pointer on inet_diag_compat dumping routine. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Now all the code works with sock_diag_req-compatible structs, so it's possible to stop using the inet_diag_type2proto in inet_csk_diag_fill. Pass the inet_diag_req into it and use the sdiag_protocol field. At the same time remove the explicit ext argument, since it's also on the req. However, this conversion is still required in _compat code, so just move this routine, not remove. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
The new API will specify family to work with. Teach the existing socket walking code to bypass not interesting ones. To preserve compatibility with existing behavior the _compat code sets interesting family to AF_UNSPEC to dump them all. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Make inet_diag_dumo work with given header instead of calculating one from the nl message. The SOCK_DIAG_BY_FAMILY just passes skb's one through, the compat code converts the old header to new one. Also fix the bytecode calculation to find one at proper offset. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Make inet_diag_get_exact work with given header instead of calculating one from the nl message. The SOCK_DIAG_BY_FAMILY just passes skb's one through, the compat code converts the old header to new one. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
This one coinsides with the sock_diag_req in the beginning and contains only used fields from its previous analogue. The existing code is patched to use the _compat version of it for now. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
When receiving the SOCK_DIAG_BY_FAMILY message we have to find the handler for provided family and pass the nl message to it. This patch describes an infrastructure to work with such nandlers and implements stubs for AF_INET(6) ones. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Sorry, but the vger didn't let this message go to the list. Re-sending it with less spam-filter-prone subject. When dumping the AF_INET/AF_INET6 sockets user will also specify the protocol, so prepare the protocol diag handlers to work with IPPROTO_ constants. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
Current code calculates it at fixed offset. This offset will change, so move the BC calculation upper to make the further patching simpler. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
This type will run the family+protocol based socket dumping. Also prepare the stub function for it. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pavel Emelyanov 提交于
The ultimate goal is to get the sock_diag module, that works in family+protocol terms. Currently this is suitable to do on the inet_diag basis, so rename parts of the code. It will be moved to sock_diag.c later. Signed-off-by: NPavel Emelyanov <xemul@parallels.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 12月, 2011 5 次提交
-
-
由 Igor Maravic 提交于
Use "IS_ENABLED(CONFIG_FOO)" macro instead of "defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE)" Signed-off-by: NIgor Maravic <igorm@etf.rs> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
As mentioned by Joe Perches, TCP_OFF() and TCP_PAGE() macros are useless. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
commit f07d960d (tcp: avoid frag allocation for small frames) breaked assumption in tcp stack that skb is either linear (skb->data_len == 0), or fully fragged (skb->data_len == skb->len) tcp_trim_head() made this assumption, we must fix it. Thanks to Vijay for providing a very detailed explanation. Reported-by: NVijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Miller 提交于
To reflect the fact that a refrence is not obtained to the resulting neighbour entry. Signed-off-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NRoland Dreier <roland@purestorage.com>
-
由 David S. Miller 提交于
If ipv4_valdiate_peer() fails during a cached entry lookup, we'll NULL derer since the loop iterator assumes rth is not NULL. Letting this be handled as a failure is just bogus, so just make it not fail. If we have trouble getting a non-NULL neighbour for the redirected gateway, just restore the original gateway and continue. The very next use of this cached route will try again. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 12月, 2011 2 次提交
-
-
由 Eric Dumazet 提交于
If our TCP_PAGE(sk) is not shared (page_count() == 1), we can set page offset to 0. This permits better filling of the pages on small to medium tcp writes. "tbench 16" results on my dev server (2x4x2 machine) : Before : 3072 MB/s After : 3146 MB/s (2.4 % gain) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We discovered that TCP stack could retransmit misaligned skbs if a malicious peer acknowledged sub MSS frame. This currently can happen only if output interface is non SG enabled : If SG is enabled, tcp builds headless skbs (all payload is included in fragments), so the tcp trimming process only removes parts of skb fragments, header stay aligned. Some arches cant handle misalignments, so force a head reallocation and shrink headroom to MAX_TCP_HEADER. Dont care about misaligments on x86 and PPC (or other arches setting NET_IP_ALIGN to 0) This patch introduces __pskb_copy() which can specify the headroom of new head, and pskb_copy() becomes a wrapper on top of __pskb_copy() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 12月, 2011 1 次提交
-
-
由 Eric Dumazet 提交于
Denys Fedoryshchenko reported that SYN+FIN attacks were bringing his linux machines to their limits. Dont call conn_request() if the TCP flags includes SYN flag Reported-by: NDenys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 12月, 2011 1 次提交
-
-
由 Julian Anastasov 提交于
__mkroute_output fails to work with the original tos and uses value with stripped RTO_ONLINK bit. Make sure we put the original TOS bits into rt_key_tos because it used to match cached route. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 12月, 2011 4 次提交
-
-
由 Peter Pan(潘卫平) 提交于
After reset ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0, we should flush route cache, or it will continue receive packets with local source address, which should be dropped. Signed-off-by: NWeiping Pan <panweiping3@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This reverts commit 81d54ec8. If we take the "try_again" goto, due to a checksum error, the 'len' has already been truncated. So we won't compute the same values as the original code did. Reported-by: Npaul bilke <fsmail@conspiracy.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Otherwise we won't notice the peer GENID change. Reported-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
gcc compiler is smart enough to use a single load/store if we memcpy(dptr, sptr, 8) on x86_64, regardless of CONFIG_CC_OPTIMIZE_FOR_SIZE In IP header, daddr immediately follows saddr, this wont change in the future. We only need to make sure our flowi4 (saddr,daddr) fields wont break the rule. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 12月, 2011 5 次提交
-
-
由 Jun Zhao 提交于
Need not to used 'delta' flag when add single-source to interface filter source list. Signed-off-by: NJun Zhao <mypopydev@gmail.com> Signed-off-by: NDavid S. Miller <davem@drr.davemloft.net>
-
由 David Miller 提交于
Instead of instantiating an entire new neigh_table instance just for ATM handling, use the neigh device private facility. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Miller 提交于
Let the core self-size the neigh entry based upon the key length. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
After commit f2c31e32 (fix NULL dereferences in check_peer_redir()), dst_get_neighbour() should be guarded by rcu_read_lock() / rcu_read_unlock() section. Reported-by: NMiles Lane <miles.lane@gmail.com> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Rick Jones reported that TCP_CONGESTION sockopt performed on a listener was ignored for its children sockets : right after accept() the congestion control for new socket is the system default one. This seems an oversight of the initial design (quoted from Stephen) Based on prior investigation and patch from Rick. Reported-by: NRick Jones <rick.jones2@hp.com> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Yuchung Cheng <ycheng@google.com> Tested-by: NRick Jones <rick.jones2@hp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 11月, 2011 2 次提交
-
-
由 RongQing.Li 提交于
Commit 7dc00c82 added a 'notify' parameter for vif_delete() to distinguish whether to unregister the device. When notify=1 means we does not need to unregister the device, so calling unregister_netdevice_many is useless. Signed-off-by: NRongQing.Li <roy.qing.li@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
tcp_sendmsg() uses select_size() helper to choose skb head size when a new skb must be allocated. If GSO is enabled for the socket, current strategy is to force all payload data to be outside of headroom, in PAGE fragments. This strategy is not welcome for small packets, wasting memory. Experiments show that best results are obtained when using 2048 bytes for skb head (This includes the skb overhead and various headers) This patch provides better len/truesize ratios for packets sent to loopback device, and reduce memory needs for in-flight loopback packets, particularly on arches with big pages. If a sender sends many 1-byte packets to an unresponsive application, receiver rmem_alloc will grow faster and will stop queuing these packets sooner, or will collapse its receive queue to free excess memory. netperf -t TCP_RR results are improved by ~4 %, and many workloads are improved as well (tbench, mysql...) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 11月, 2011 3 次提交
-
-
由 Neal Cardwell 提交于
Since 2005 (c1b4a7e6) tcp_tso_should_defer has been using tcp_max_burst() as a target limit for deciding how large to make outgoing TSO packets when not using sysctl_tcp_tso_win_divisor. But since 2008 (dd9e0dda) tcp_max_burst() returns the reordering degree. We should not have tcp_tso_should_defer attempt to build larger segments just because there is more reordering. This commit splits the notion of deferral size used in TSO from the notion of burst size used in cwnd moderation, and returns the TSO deferral limit to its original value. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Igor Maravic reported an error caused by jump_label_dec() being called from IRQ context : BUG: sleeping function called from invalid context at kernel/mutex.c:271 in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper 1 lock held by swapper/0: #0: (&n->timer){+.-...}, at: [<ffffffff8107ce90>] call_timer_fn+0x0/0x340 Pid: 0, comm: swapper Not tainted 3.2.0-rc2-net-next-mpls+ #1 Call Trace: <IRQ> [<ffffffff8104f417>] __might_sleep+0x137/0x1f0 [<ffffffff816b9a2f>] mutex_lock_nested+0x2f/0x370 [<ffffffff810a89fd>] ? trace_hardirqs_off+0xd/0x10 [<ffffffff8109a37f>] ? local_clock+0x6f/0x80 [<ffffffff810a90a5>] ? lock_release_holdtime.part.22+0x15/0x1a0 [<ffffffff81557929>] ? sock_def_write_space+0x59/0x160 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90 [<ffffffff810969cd>] atomic_dec_and_mutex_lock+0x5d/0x80 [<ffffffff8112fc1d>] jump_label_dec+0x1d/0x50 [<ffffffff81566525>] net_disable_timestamp+0x15/0x20 [<ffffffff81557a75>] sock_disable_timestamp+0x45/0x50 [<ffffffff81557b00>] __sk_free+0x80/0x200 [<ffffffff815578d0>] ? sk_send_sigurg+0x70/0x70 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90 [<ffffffff81557cba>] sock_wfree+0x3a/0x70 [<ffffffff8155c2b0>] skb_release_head_state+0x70/0x120 [<ffffffff8155c0b6>] __kfree_skb+0x16/0x30 [<ffffffff8155c119>] kfree_skb+0x49/0x170 [<ffffffff815e936e>] arp_error_report+0x3e/0x90 [<ffffffff81575bd9>] neigh_invalidate+0x89/0xc0 [<ffffffff81578dbe>] neigh_timer_handler+0x9e/0x2a0 [<ffffffff81578d20>] ? neigh_update+0x640/0x640 [<ffffffff81073558>] __do_softirq+0xc8/0x3a0 Since jump_label_{inc|dec} must be called from process context only, we must defer jump_label_dec() if net_disable_timestamp() is called from interrupt context. Reported-by: NIgor Maravic <igorm@etf.rs> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Now sk_route_caps is u64, its dangerous to use an integer to store result of an AND operator. It wont work if NETIF_F_SG is moved on the upper part of u64. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> CC: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 11月, 2011 5 次提交
-
-
由 Neal Cardwell 提交于
The problem: Senders were overriding cwnd values picked during an undo by calling tcp_moderate_cwnd() in tcp_try_to_open(). The fix: Don't moderate cwnd in tcp_try_to_open() if we're in TCP_CA_Open, since doing so is generally unnecessary and specifically would override a DSACK-based undo of a cwnd reduction made in fast recovery. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neal Cardwell 提交于
Previously, SACK-enabled connections hung around in TCP_CA_Disorder state while snd_una==high_seq, just waiting to accumulate DSACKs and hopefully undo a cwnd reduction. This could and did lead to the following unfortunate scenario: if some incoming ACKs advance snd_una beyond high_seq then we were setting undo_marker to 0 and moving to TCP_CA_Open, so if (due to reordering in the ACK return path) we shortly thereafter received a DSACK then we were no longer able to undo the cwnd reduction. The change: Simplify the congestion avoidance state machine by removing the behavior where SACK-enabled connections hung around in the TCP_CA_Disorder state just waiting for DSACKs. Instead, when snd_una advances to high_seq or beyond we typically move to TCP_CA_Open immediately and allow an undo in either TCP_CA_Open or TCP_CA_Disorder if we later receive enough DSACKs. Other patches in this series will provide other changes that are necessary to fully fix this problem. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neal Cardwell 提交于
The bug: When the ACK field is below snd_una (which can happen when ACKs are reordered), senders ignored DSACKs (preventing undo) and did not call tcp_fastretrans_alert, so they did not increment prr_delivered to reflect newly-SACKed sequence ranges, and did not call tcp_xmit_retransmit_queue, thus passing up chances to send out more retransmitted and new packets based on any newly-SACKed packets. The change: When the ACK field is below snd_una (the "old_ack" goto label), call tcp_fastretrans_alert to allow undo based on any newly-arrived DSACKs and try to send out more packets based on newly-SACKed packets. Other patches in this series will provide other changes that are necessary to fully fix this problem. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neal Cardwell 提交于
The bug: Senders ignored DSACKs after recovery when there were no outstanding packets (a common scenario for HTTP servers). The change: when there are no outstanding packets (the "no_queue" goto label), call tcp_fastretrans_alert() in order to use DSACKs to undo congestion window reductions. Other patches in this series will provide other changes that are necessary to fully fix this problem. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neal Cardwell 提交于
Allow callers to decide whether an ACK is a duplicate ACK. This is a prerequisite to allowing fastretrans_alert to be called from new contexts, such as the no_queue and old_ack code paths, from which we have extra info that tells us whether an ACK is a dupack. Signed-off-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NIlpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-