- 05 11月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
As pointed by Stephen Rothwell, commit c6d14c84 added a warning : net/ipv4/devinet.c: In function 'inet_select_addr': net/ipv4/devinet.c:902: warning: label 'out' defined but not used delete unused 'out' label and do some cleanups as well Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 11月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
Adds RCU management to the list of netdevices. Convert some for_each_netdev() users to RCU version, if it can avoid read_lock-ing dev_base_lock Ie: read_lock(&dev_base_loack); for_each_netdev(net, dev) some_action(); read_unlock(&dev_base_lock); becomes : rcu_read_lock(); for_each_netdev_rcu(net, dev) some_action(); rcu_read_unlock(); Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 11月, 2009 2 次提交
-
-
由 Eric Dumazet 提交于
We can avoid touching device refcount in icmp_send(), using dev_get_by_index_rcu() Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Use dev_get_by_index_rcu() instead of __dev_get_by_index() and dev_base_lock rwlock Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 10月, 2009 10 次提交
-
-
由 Cyrill Gorcunov 提交于
proto_ops->getname implies copying protocol specific data into storage unit (particulary to __kernel_sockaddr_storage). So when we implement new protocol support we should keep such a detail in mind (which is easy to forget about). Lets introduce DECLARE_SOCKADDR helper which check if storage unit is not overfowed at build time. Eventually inet_getname is switched to use DECLARE_SOCKADDR (to show example of usage). Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 roel kluin 提交于
optlen is unsigned so the `< 0' test is never true. Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gilad Ben-Yossef 提交于
Add and use no DSCAK bit in the features field. Signed-off-by: NGilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: NOri Finkelman <ori@comsleep.com> Sigend-off-by: NYony Amit <yony@comsleep.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gilad Ben-Yossef 提交于
Add and use no window scale bit in the features field. Note that this is not the same as setting a window scale of 0 as would happen with window limit on route. Signed-off-by: NGilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: NOri Finkelman <ori@comsleep.com> Sigend-off-by: NYony Amit <yony@comsleep.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gilad Ben-Yossef 提交于
Implement querying and acting upon the no timestamp bit in the feature field. Signed-off-by: NGilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: NOri Finkelman <ori@comsleep.com> Sigend-off-by: NYony Amit <yony@comsleep.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gilad Ben-Yossef 提交于
Implement querying and acting upon the no sack bit in the features field. Signed-off-by: NGilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: NOri Finkelman <ori@comsleep.com> Sigend-off-by: NYony Amit <yony@comsleep.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gilad Ben-Yossef 提交于
We need tcp_parse_options to be aware of dst_entry to take into account per dst_entry TCP options settings Signed-off-by: NGilad Ben-Yossef <gilad@codefidence.com> Sigend-off-by: NOri Finkelman <ori@comsleep.com> Sigend-off-by: NYony Amit <yony@comsleep.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gilad Ben-Yossef 提交于
Since we only use tcp_parse_options here to check for the exietence of TCP timestamp option in the header, it is better to call with the "established" flag on. Signed-off-by: NGilad Ben-Yossef <gilad@codefidence.com> Signed-off-by: NOri Finkelman <ori@comsleep.com> Signed-off-by: NYony Amit <yony@comsleep.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Speedup module unloading by factorizing synchronize_rcu() calls Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neil Horman 提交于
Augment raw_send_hdrinc to correct for incorrect ip header length values A series of oopses was reported to me recently. Apparently when using AF_RAW sockets to send data to peers that were reachable via ipsec encapsulation, people could panic or BUG halt their systems. I've tracked the problem down to user space sending an invalid ip header over an AF_RAW socket with IP_HDRINCL set to 1. Basically what happens is that userspace sends down an ip frame that includes only the header (no data), but sets the ip header ihl value to a large number, one that is larger than the total amount of data passed to the sendmsg call. In raw_send_hdrincl, we allocate an skb based on the size of the data in the msghdr that was passed in, but assume the data is all valid. Later during ipsec encapsulation, xfrm4_tranport_output moves the entire frame back in the skbuff to provide headroom for the ipsec headers. During this operation, the skb->transport_header is repointed to a spot computed by skb->network_header + the ip header length (ihl). Since so little data was passed in relative to the value of ihl provided by the raw socket, we point transport header to an unknown location, resulting in various crashes. This fix for this is pretty straightforward, simply validate the value of of iph->ihl when sending over a raw socket. If (iph->ihl*4U) > user data buffer size, drop the frame and return -EINVAL. I just confirmed this fixes the reported crashes. Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 10月, 2009 3 次提交
-
-
由 Andreas Petlund 提交于
Corrected a spelling error in a function name. Signed-off-by: NAndreas Petlund <apetlund@simula.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Speedup module unloading by factorizing synchronize_rcu() calls Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Speedup module unloading by factorizing synchronize_rcu() calls Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 10月, 2009 2 次提交
-
-
由 Eric Dumazet 提交于
GRE tunnels use one rwlock to protect their hash tables. This locking scheme can be converted to RCU for free, since netdevice already must wait for a RCU grace period at dismantle time. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
IPIP tunnels use one rwlock to protect their hash tables. This locking scheme can be converted to RCU for free, since netdevice already must wait for a RCU grace period at dismantle time. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 10月, 2009 1 次提交
-
-
由 Arjan van de Ven 提交于
Commit b6b39e8f (tcp: Try to catch MSG_PEEK bug) added a printk() to the WARN_ON() that's in tcp.c. This patch changes this combination to WARN(); the advantage of WARN() is that the printk message shows up inside the message, so that kerneloops.org will collect the message. In addition, this gets rid of an extra if() statement. Signed-off-by: NArjan van de Ven <arjan@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 10月, 2009 1 次提交
-
-
由 Krishna Kumar 提交于
dst_negative_advice() should check for changed dst and reset sk_tx_queue_mapping accordingly. Pass sock to the callers of dst_negative_advice. (sk_reset_txq is defined just for use by dst_negative_advice. The only way I could find to get around this is to move dst_negative_() from dst.h to dst.c, include sock.h in dst.c, etc) Signed-off-by: NKrishna Kumar <krkumar2@in.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 10月, 2009 7 次提交
-
-
由 Herbert Xu 提交于
This patch tries to print out more information when we hit the MSG_PEEK bug in tcp_recvmsg. It's been around since at least 2005 and it's about time that we finally fix it. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 John Dykstra 提交于
Use symbols instead of magic constants while checking PMTU discovery setsockopt. Remove redundant test in ip_rt_frag_needed() (done by caller). Signed-off-by: NJohn Dykstra <john.dykstra1@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
ipv4/ipv6 setsockopt(IP_MULTICAST_IF) have dubious __dev_get_by_index() calls. This function should be called only with RTNL or dev_base_lock held, or reader could see a corrupt hash chain and eventually enter an endless loop. Fix is to call dev_get_by_index()/dev_put(). If this happens to be performance critical, we could define a new dev_exist_by_index() function to avoid touching dev refcount. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Julian Anastasov 提交于
Fix TCP_DEFER_ACCEPT conversion between seconds and retransmission to match the TCP SYN-ACK retransmission periods because the time is converted to such retransmissions. The old algorithm selects one more retransmission in some cases. Allow up to 255 retransmissions. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Julian Anastasov 提交于
Change SYN-ACK retransmitting code for the TCP_DEFER_ACCEPT users to not retransmit SYN-ACKs during the deferring period if ACK from client was received. The goal is to reduce traffic during the deferring period. When the period is finished we continue with sending SYN-ACKs (at least one) but this time any traffic from client will change the request to established socket allowing application to terminate it properly. Also, do not drop acked request if sending of SYN-ACK fails. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Julian Anastasov 提交于
Willy Tarreau and many other folks in recent years were concerned what happens when the TCP_DEFER_ACCEPT period expires for clients which sent ACK packet. They prefer clients that actively resend ACK on our SYN-ACK retransmissions to be converted from open requests to sockets and queued to the listener for accepting after the deferring period is finished. Then application server can decide to wait longer for data or to properly terminate the connection with FIN if read() returns EAGAIN which is an indication for accepting after the deferring period. This change still can have side effects for applications that expect always to see data on the accepted socket. Others can be prepared to work in both modes (with or without TCP_DEFER_ACCEPT period) and their data processing can ignore the read=EAGAIN notification and to allocate resources for clients which proved to have no data to send during the deferring period. OTOH, servers that use TCP_DEFER_ACCEPT=1 as flag (not as a timeout) to wait for data will notice clients that didn't send data for 3 seconds but that still resend ACKs. Thanks to Willy Tarreau for the initial idea and to Eric Dumazet for the review and testing the change. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This reverts commit 6d01a026. Julian Anastasov, Willy Tarreau and Eric Dumazet have come up with a more correct way to deal with this. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 10月, 2009 3 次提交
-
-
由 Steffen Klassert 提交于
This patch converts ah4 to the new ahash interface. Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
- skb_kill_datagram() can increment sk->sk_drops itself, not callers. - UDP on IPV4 & IPV6 dropped frames (because of bad checksum or policy checks) increment sk_drops Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
In order to have better cache layouts of struct sock (separate zones for rx/tx paths), we need this preliminary patch. Goal is to transfert fields used at lookup time in the first read-mostly cache line (inside struct sock_common) and move sk_refcnt to a separate cache line (only written by rx path) This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr, sport and id fields. This allows a future patch to define these fields as macros, like sk_refcnt, without name clashes. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 10月, 2009 1 次提交
-
-
由 Eric Dumazet 提交于
sock_queue_rcv_skb() can update sk_drops itself, removing need for callers to take care of it. This is more consistent since sock_queue_rcv_skb() also reads sk_drops when queueing a skb. This adds sk_drops managment to many protocols that not cared yet. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 10月, 2009 4 次提交
-
-
由 Eric Dumazet 提交于
Storing the mask (size - 1) instead of the size allows fast path to be a bit faster. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
udp_poll() can in some circumstances drop frames with incorrect checksums. Problem is we now have to lock the socket while dropping frames, or risk sk_forward corruption. This bug is present since commit 95766fff ([UDP]: Add memory accounting.) While we are at it, we can correct ioctl(SIOCINQ) to also drop bad frames. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willy Tarreau 提交于
I was trying to use TCP_DEFER_ACCEPT and noticed that if the client does not talk, the connection is never accepted and remains in SYN_RECV state until the retransmits expire, where it finally is deleted. This is bad when some firewall such as netfilter sits between the client and the server because the firewall sees the connection in ESTABLISHED state while the server will finally silently drop it without sending an RST. This behaviour contradicts the man page which says it should wait only for some time : TCP_DEFER_ACCEPT (since Linux 2.4) Allows a listener to be awakened only when data arrives on the socket. Takes an integer value (seconds), this can bound the maximum number of attempts TCP will make to complete the connection. This option should not be used in code intended to be portable. Also, looking at ipv4/tcp.c, a retransmit counter is correctly computed : case TCP_DEFER_ACCEPT: icsk->icsk_accept_queue.rskq_defer_accept = 0; if (val > 0) { /* Translate value in seconds to number of * retransmits */ while (icsk->icsk_accept_queue.rskq_defer_accept < 32 && val > ((TCP_TIMEOUT_INIT / HZ) << icsk->icsk_accept_queue.rskq_defer_accept)) icsk->icsk_accept_queue.rskq_defer_accept++; icsk->icsk_accept_queue.rskq_defer_accept++; } break; ==> rskq_defer_accept is used as a counter of retransmits. But in tcp_minisocks.c, this counter is only checked. And in fact, I have found no location which updates it. So I think that what was intended was to decrease it in tcp_minisocks whenever it is checked, which the trivial patch below does. Signed-off-by: NWilly Tarreau <w@1wt.eu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neil Horman 提交于
Create a new socket level option to report number of queue overflows Recently I augmented the AF_PACKET protocol to report the number of frames lost on the socket receive queue between any two enqueued frames. This value was exported via a SOL_PACKET level cmsg. AFter I completed that work it was requested that this feature be generalized so that any datagram oriented socket could make use of this option. As such I've created this patch, It creates a new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue overflowed between any two given frames. It also augments the AF_PACKET protocol to take advantage of this new feature (as it previously did not touch sk->sk_drops, which this patch uses to record the overflow count). Tested successfully by me. Notes: 1) Unlike my previous patch, this patch simply records the sk_drops value, which is not a number of drops between packets, but rather a total number of drops. Deltas must be computed in user space. 2) While this patch currently works with datagram oriented protocols, it will also be accepted by non-datagram oriented protocols. I'm not sure if thats agreeable to everyone, but my argument in favor of doing so is that, for those protocols which aren't applicable to this option, sk_drops will always be zero, and reporting no drops on a receive queue that isn't used for those non-participating protocols seems reasonable to me. This also saves us having to code in a per-protocol opt in mechanism. 3) This applies cleanly to net-next assuming that commit 97775007 (my af packet cmsg patch) is reverted Signed-off-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 10月, 2009 3 次提交
-
-
由 Eric Dumazet 提交于
UDP_HTABLE_SIZE was initialy defined to 128, which is a bit small for several setups. 4000 active UDP sockets -> 32 sockets per chain in average. An incoming frame has to lookup all sockets to find best match, so long chains hurt latency. Instead of a fixed size hash table that cant be perfect for every needs, let UDP stack choose its table size at boot time like tcp/ip route, using alloc_large_system_hash() helper Add an optional boot parameter, uhash_entries=x so that an admin can force a size between 256 and 65536 if needed, like thash_entries and rhash_entries. dmesg logs two new lines : [ 0.647039] UDP hash table entries: 512 (order: 0, 4096 bytes) [ 0.647099] UDP Lite hash table entries: 512 (order: 0, 4096 bytes) Maximal size on 64bit arches would be 65536 slots, ie 1 MBytes for non debugging spinlocks. Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hagen Paul Pfeifer 提交于
There is no reason that cipso_v4_delopt() is not defined as a static function. Signed-off-by: NHagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Atis Elsts 提交于
Add support for route lookup using sk_mark on IPv4 listening sockets. Signed-off-by: NAtis Elsts <atis@mikrotik.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 10月, 2009 1 次提交
-
-
由 Stephen Hemminger 提交于
This fixes a bug with arp_notify. If arp_notify is enabled, kernel will crash if address is changed and no IP address is assigned. http://bugzilla.kernel.org/show_bug.cgi?id=14330Reported-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NStephen Hemminger <shemminger@vyatta.com> Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-