- 07 6月, 2010 2 次提交
-
-
由 Eric Dumazet 提交于
Avoid two atomic ops per raw_send_hdrinc() call Avoid two atomic ops per raw6_send_hdrinc() call Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
This patch address a serious performance issue in reading the TCP sockets table (/proc/net/tcp). Reading the full table is done by a number of sequential read operations. At each read operation, a seek is done to find the last socket that was previously read. This seek operation requires that the sockets in the table need to be counted up to the current file position, and to count each of these requires taking a lock for each non-empty bucket. The whole algorithm is O(n^2). The fix is to cache the last bucket value, offset within the bucket, and the file position returned by the last read operation. On the next sequential read, the bucket and offset are used to find the last read socket immediately without needing ot scan the previous buckets the table. This algorithm t read the whole table is O(n). The improvement offered by this patch is easily show by performing cat'ing /proc/net/tcp on a machine with a lot of connections. With about 182K connections in the table, I see the following: - Without patch time cat /proc/net/tcp > /dev/null real 1m56.729s user 0m0.214s sys 1m56.344s - With patch time cat /proc/net/tcp > /dev/null real 0m0.894s user 0m0.290s sys 0m0.594s Signed-off-by: NTom Herbert <therbert@google.com> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 6月, 2010 6 次提交
-
-
由 Florian Westphal 提交于
- ipv6 msstab: account for ipv6 header size - ipv4 msstab: add mss for Jumbograms. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
caller: if (!th->rst && !th->syn && th->ack) callee: if (!th->ack) make the caller only check for !syn (common for 3whs), and move the !rst / ack test to the callee. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
both syn_flood_warning functions print a message, but ipv4 version only prints a warning if CONFIG_SYN_COOKIES=y. Make the v4 one behave like the v6 one. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Its better to make a route lookup in appropriate namespace. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
I believe a moderate SYN flood attack can corrupt RFS flow table (rps_sock_flow_table), making RPS/RFS much less effective. Even in a normal situation, server handling short lived sessions suffer from bad steering for the first data packet of a session, if another SYN packet is received for another session. We do following action in tcp_v4_rcv() : sock_rps_save_rxhash(sk, skb->rxhash); We should _not_ do this if sk is a LISTEN socket, as about each packet received on a LISTEN socket has a different rxhash than previous one. -> RPS_NO_CPU markers are spread all over rps_sock_flow_table. Also, it makes sense to protect sk->rxhash field changes with socket lock (We currently can change it even if user thread owns the lock and might use rxhash) This patch moves sock_rps_save_rxhash() to a sock locked section, and only for non LISTEN sockets. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
syncookies default to on since e994b7c9 (tcp: Don't make syn cookies initial setting depend on CONFIG_SYSCTL). Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 6月, 2010 2 次提交
-
-
由 Eric Dumazet 提交于
Avoid two atomic ops in arp_fwd_proxy() Avoid two atomic ops in arp_process() Valid optims since arp_rcv() is run under rcu_read_lock() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Avoid two atomic ops on output device refcount Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 6月, 2010 2 次提交
-
-
由 Eric Dumazet 提交于
Avoid two atomic ops on struct in_device refcount per incoming packet, if slow path taken, (or route cache disabled) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Christoph Lameter mentioned that packets could be dropped in input path because of rp_filter settings, without any SNMP counter being incremented. System administrator can have a hard time to track the problem. This patch introduces a new counter, LINUX_MIB_IPRPFILTER, incremented each time we drop a packet because Reverse Path Filter triggers. (We receive an IPv4 datagram on a given interface, and find the route to send an answer would use another interface) netstat -s | grep IPReversePathFilter IPReversePathFilter: 21714 Reported-by: NChristoph Lameter <cl@linux-foundation.org> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 6月, 2010 2 次提交
-
-
由 Daniele Lacamera 提交于
For large values of rtt, 2^rho operation may overflow u32. Clamp down the increment to 2^16. Signed-off-by: NDaniele Lacamera <root@danielinux.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wu Fengguang 提交于
Normally dhclient can be configured to send the "host-name" option in DHCP requests to update the client's DNS record. However for an NFSROOT system, dhclient shall never be called (which may change the IP addr and therefore lose your root NFS mount connection). So enable updating the DNS record with kernel parameter ip=::::$HOST_NAME::dhcp Signed-off-by: NWu Fengguang <fengguang.wu@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 6月, 2010 3 次提交
-
-
由 Joe Perches 提交于
Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Perches 提交于
Commit: c720c7e8 missed these. Signed-off-by: NJoe Perches <joe@perches.com> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Correct sk_forward_alloc handling for error_queue would need to use a backlog of frames that softirq handler could not deliver because socket is owned by user thread. Or extend backlog processing to be able to process normal and error packets. Another possibility is to not use mem charge for error queue, this is what I implemented in this patch. Note: this reverts commit 29030374 (net: fix sk_forward_alloc corruptions), since we dont need to lock socket anymore. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 5月, 2010 4 次提交
-
-
由 Eric Dumazet 提交于
commit f3c5c1bf (netfilter: xtables: make ip_tables reentrant) introduced a performance regression, because stackptr array is shared by all cpus, adding cache line ping pongs. (16 cpus share a 64 bytes cache line) Fix this using alloc_percpu() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Acked-By: NJan Engelhardt <jengelh@medozas.de> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 Ian Campbell 提交于
Currently such notifications are only generated when the device comes up or the address changes. However one use case for these notifications is to enable faster network recovery after a virtual machine migration (by causing switches to relearn their MAC tables). A migration appears to the network stack as a temporary loss of carrier and therefore does not trigger either of the current conditions. Rather than adding carrier up as a trigger (which can cause issues when interfaces a flapping) simply add an interface which the driver can use to explicitly trigger the notification. Signed-off-by: NIan Campbell <ian.campbell@citrix.com> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: stable@kernel.org Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
tcp_md5_hash_skb_data() should handle skb->frag_list, and eventually recurse. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
This patch saves 224 bytes of text on my machine. __this_cpu_inc() generates a single instruction, using no scratch registers : 65 ff 04 25 a8 30 01 00 incl %gs:0x130a8 instead of : 48 c7 c2 80 30 01 00 mov $0x13080,%rdx 65 48 8b 04 25 88 ea 00 00 mov %gs:0xea88,%rax 83 44 10 28 01 addl $0x1,0x28(%rax,%rdx,1) Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 5月, 2010 1 次提交
-
-
由 Eric Dumazet 提交于
As David found out, sock_queue_err_skb() should be called with socket lock hold, or we risk sk_forward_alloc corruption, since we use non atomic operations to update this field. This patch adds bh_lock_sock()/bh_unlock_sock() pair to three spots. (BH already disabled) 1) skb_tstamp_tx() 2) Before calling ip_icmp_error(), in __udp4_lib_err() 3) Before calling ipv6_icmp_error(), in __udp6_lib_err() Reported-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 5月, 2010 1 次提交
-
-
由 Eric Dumazet 提交于
This new sock lock primitive was introduced to speedup some user context socket manipulation. But it is unsafe to protect two threads, one using regular lock_sock/release_sock, one using lock_sock_bh/unlock_sock_bh This patch changes lock_sock_bh to be careful against 'owned' state. If owned is found to be set, we must take the slow path. lock_sock_bh() now returns a boolean to say if the slow path was taken, and this boolean is used at unlock_sock_bh time to call the appropriate unlock function. After this change, BH are either disabled or enabled during the lock_sock_bh/unlock_sock_bh protected section. This might be misleading, so we rename these functions to lock_sock_fast()/unlock_sock_fast(). Reported-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Tested-by: NAnton Blanchard <anton@samba.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 5月, 2010 1 次提交
-
-
由 Dan Carpenter 提交于
This fixes a smatch warning: net/ipv4/ipmr.c +1917 __ipmr_fill_mroute(12) error: buffer overflow '(mrt)->vif_table' 32 <= 32 The ipv6 version had the same issue. Signed-off-by: NDan Carpenter <error27@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 5月, 2010 1 次提交
-
-
由 Alexey Dobriyan 提交于
- C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not USHORT_MAX/SHORT_MAX/SHORT_MIN. - Make SHRT_MIN of type s16, not int, for consistency. [akpm@linux-foundation.org: fix drivers/dma/timb_dma.c] [akpm@linux-foundation.org: fix security/keys/keyring.c] Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Acked-by: NWANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 5月, 2010 8 次提交
-
-
由 Joe Perches 提交于
This patch removes from net/ (but not any netfilter files) all the unnecessary return; statements that precede the last closing brace of void functions. It does not remove the returns that are immediately preceded by a label as gcc doesn't like that. Done via: $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \ xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }' Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
skb rxhash should be cleared when a skb is handled by a tunnel before being delivered again, so that correct packet steering can take place. There are other cleanups and accounting that we can factorize in a new helper, skb_tunnel_rx() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Commit 33ad798c (tcp: options clean up) introduced a problem if MD5+SACK+timestamps were used in initial SYN message. Some stacks (old linux for example) try to negotiate MD5+SACK+TSTAMP sessions, but since 40 bytes of tcp options space are not enough to store all the bits needed, we chose to disable timestamps in this case. We send a SYN-ACK _without_ timestamp option, but socket has timestamps enabled and all further outgoing messages contain a TS block, all with the initial timestamp of the remote peer. Fix is to really disable timestamps option for the whole session. Reported-by: NBijay Singh <Bijay.Singh@guavus.com> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Perches 提交于
Also added an explicit break; to avoid a fallthrough in net/ipv4/tcp_input.c Signed-off-by: NJoe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
TCP outgoing packets can avoid two atomic ops, and dirtying of previously higly contended cache line using new refdst infrastructure. Note 1: loopback device excluded because of !IFF_XMIT_DST_RELEASE Note 2: UDP packets dsts are built before ip_queue_xmit(). Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Use ip_route_input_noref() in ip fast path, to avoid two atomic ops per incoming packet. Note: loopback is excluded from this optimization in ip_rcv_finish() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
ip_route_input() is the version returning a refcounted dst, while ip_route_input_noref() returns a non refcounted one. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Use low order bit of skb->_skb_dst to tell dst is not refcounted. Change _skb_dst to _skb_refdst to make sure all uses are catched. skb_dst() returns the dst, regardless of noref bit set or not, but with a lockdep check to make sure a noref dst is not given if current user is not rcu protected. New skb_dst_set_noref() helper to set an notrefcounted dst on a skb. (with lockdep check) skb_dst_drop() drops a reference only if skb dst was refcounted. skb_dst_force() helper is used to force a refcount on dst, when skb is queued and not anymore RCU protected. Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if !IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in sock_queue_rcv_skb(), in __nf_queue(). Use skb_dst_force() in dev_requeue_skb(). Note: dst_use_noref() still dirties dst, we might transform it later to do one dirtying per jiffies. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 5月, 2010 3 次提交
-
-
由 Eric Dumazet 提交于
TCP-MD5 sessions have intermittent failures, when route cache is invalidated. ip_queue_xmit() has to find a new route, calls sk_setup_caps(sk, &rt->u.dst), destroying the sk->sk_route_caps &= ~NETIF_F_GSO_MASK that MD5 desperately try to make all over its way (from tcp_transmit_skb() for example) So we send few bad packets, and everything is fine when tcp_transmit_skb() is called again for this socket. Since ip_queue_xmit() is at a lower level than TCP-MD5, I chose to use a socket field, sk_route_nocaps, containing bits to mask on sk_route_caps. Reported-by: NBhaskar Dutta <bhaskie@gmail.com> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
TCP MD5 support uses percpu data for temporary storage. It currently disables preemption so that same storage cannot be reclaimed by another thread on same cpu. We also have to make sure a softirq handler wont try to use also same context. Various bug reports demonstrated corruptions. Fix is to disable preemption and BH. Reported-by: NBhaskar Dutta <bhaskie@gmail.com> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Amerigo Wang 提交于
(Dropped the infiniband part, because Tetsuo modified the related code, I will send a separate patch for it once this is accepted.) This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which allows users to reserve ports for third-party applications. The reserved ports will not be used by automatic port assignments (e.g. when calling connect() or bind() with port number 0). Explicit port allocation behavior is unchanged. Signed-off-by: NOctavian Purdila <opurdila@ixiacom.com> Signed-off-by: NWANG Cong <amwang@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 5月, 2010 3 次提交
-
-
由 Joe Perches 提交于
This patch removes from net/ netfilter files all the unnecessary return; statements that precede the last closing brace of void functions. It does not remove the returns that are immediately preceded by a label as gcc doesn't like that. Done via: $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \ xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }' Signed-off-by: NJoe Perches <joe@perches.com> [Patrick: changed to keep return statements in otherwise empty function bodies] Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 Stephen Hemminger 提交于
Make sure all printk messages have a severity level. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
由 Stephen Hemminger 提交于
Change netfilter asserts to standard WARN_ON. This has the benefit of backtrace info and also causes netfilter errors to show up on kerneloops.org. Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NPatrick McHardy <kaber@trash.net>
-
- 12 5月, 2010 1 次提交
-
-
由 Jan Engelhardt 提交于
Prepare the arrays for use with the multiregister function. The future layer-3 xt matches can then be easily added to it without needing more (un)register code. Signed-off-by: NJan Engelhardt <jengelh@medozas.de>
-