- 11 10月, 2007 4 次提交
-
-
由 Gerrit Renker 提交于
Since DCCP requires to close both ends of a connection simultaneously, permission to write in state DCCP_CLOSING is removed in dccp_sendmsg(): * if the sending end closed, it would encounter a write error anyhow; * if the other end has closed the connection, it accepts no more data. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: NIan McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
-
由 Gerrit Renker 提交于
This implements a SHOULD from RFC 4340, 7.5.4: "To protect against denial-of-service attacks, DCCP implementations SHOULD impose a rate limit on DCCP-Syncs sent in response to sequence-invalid packets, such as not more than eight DCCP-Syncs per second." The rate-limit is maintained on a per-socket basis. This is a more stringent policy than enforcing the rate-limit on a per-source-address basis and protects against attacks with forged source addresses. Moreover, the mechanism is deliberately kept simple. In contrast to xrlim_allow(), bursts of Sync packets in reply to sequence-invalid packets are not supported. This foils such attacks where the receipt of a Sync triggers further sequence-invalid packets. (I have tested this mechanism against xrlim_allow algorithm for Syncs, permitting bursts just increases the problems.) In order to keep flexibility, the timeout parameter can be set via sysctl; and the whole mechanism can even be disabled (which is however not recommended). The algorithm in this patch has been improved with regard to wrapping issues thanks to a suggestion by Arnaldo. Commiter note: Rate limited the step 6 DCCP_WARN too, as it says we're sending a sync. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: NIan McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net>
-
由 Gerrit Renker 提交于
This provides a timesource, conveniently used for DCCP timestamps, which returns the elapsed time in 10s of microseconds since initialisation. This makes for a wrap-around time of about 11.9 hours, which should be sufficient for most applications. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 7月, 2007 1 次提交
-
-
由 Paul Mundt 提交于
Slab destructors were no longer supported after Christoph's c59def9f change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
-
- 29 3月, 2007 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
We were only checking if there was enough space to put the int, but left len as specified by the (malicious) user, sigh, fix it by setting len to sizeof(val) and transfering just one int worth of data, the one asked for. Also check for negative len values. Signed-off-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 2月, 2007 1 次提交
-
-
由 YOSHIFUJI Hideaki 提交于
Signed-off-by: NYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 2月, 2007 1 次提交
-
-
由 Eric Dumazet 提交于
ehash table layout is currently this one : First half of this table is used by sockets not in TIME_WAIT state Second half of it is used by sockets in TIME_WAIT state. This is non optimal because of for a given hash or socket, the two chain heads are located in separate cache lines. Moreover the locks of the second half are never used. If instead of this halving, we use two list heads in inet_ehash_bucket instead of only one, we probably can avoid one cache miss, and reduce ram usage, particularly if sizeof(rwlock_t) is big (various CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_LOCK_ALLOC settings). So we still halves the table but we keep together related chains to speedup lookups and socket state change. In this patch I did not try to align struct inet_ehash_bucket, but a future patch could try to make this structure have a convenient size (a power of two or a multiple of L1_CACHE_SIZE). I guess rwlock will just vanish as soon as RCU is plugged into ehash :) , so maybe we dont need to scratch our heads to align the bucket... Note : In case struct inet_ehash_bucket is not a power of two, we could probably change alloc_large_system_hash() (in case it use __get_free_pages()) to free the unused space. It currently allocates a big zone, but the last quarter of it could be freed. Again, this should be a temporary 'problem'. Patch tested on ipv4 tcp only, but should be OK for IPV6 and DCCP. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 12月, 2006 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
That accumulated over the last months hackaton, shame on me for not using git-apply whitespace helping hand, will do that from now on. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
-
- 03 12月, 2006 8 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To reflect the fact that this now is of no effect, not making apps stop working, just be warned in the system log. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
-
由 Gerrit Renker 提交于
This removes and cleans up unused variables and structures which have become unnecessary following the introduction of the EWMA patch to automatically track the CCID 3 receiver/sender packet sizes `s'. It deprecates the PACKET_SIZE socket option by returning an error code and printing a deprecation warning if an application tries to read or write this socket option. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
-
由 Gerrit Renker 提交于
This reaps the benefit of the earlier patch, which changed the type of CCID 3 states to use enums, in that many conditions are now simplified and the number of possible (unexpected) values is greatly reduced. In a few instances, this also allowed to simplify pre-conditions; where care has been taken to retain logical equivalence. [DCCP]: Introduce a consistent BUG/WARN message scheme This refines the existing set of DCCP messages so that * BUG(), BUG_ON(), WARN_ON() have meaningful DCCP-specific counterparts * DCCP_CRIT (for severe warnings) is not rate-limited * DCCP_WARN() is introduced as rate-limited wrapper Using these allows a faster and cleaner transition to their original counterparts once the code has matured into a full DCCP implementation. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
-
由 Ian McDonald 提交于
Previously the transmit queue was unbounded. This patch: * puts a limit on transmit queue length and sends back EAGAIN if the buffer is full * sets the TX queue length to a sensible default * implements tx buffer sysctls for DCCP Signed-off-by: NIan McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
-
由 Gerrit Renker 提交于
This patch does not change code; it performs some trivial clean/tidy-ups: * removal of a `debug_prefix' string in favour of the already existing dccp_role(sk) * add documentation of structures and constants * separated out the cases for invalid packets (step 1 of the packet validation) * removing duplicate statements * combining declaration & initialisation Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
-
由 Gerrit Renker 提交于
This adds 3 sysctls which govern the retransmission behaviour of DCCP control packets (3way handshake, feature negotiation). It removes 4 FIXMEs from the code. The close resemblance of sysctl variables to their TCP analogues is emphasised not only by their name, but also by giving them the same initial values. This is useful since there is not much practical experience with DCCP yet. Furthermore, with regard to the previous patch, it is now possible to limit the number of keepalive-Responses by setting net.dccp.default.request_retries (also a bit like in TCP). Lastly, added documentation of all existing DCCP sysctls. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
-
由 Gerrit Renker 提交于
This patch does the following: a) introduces variable-length checksums as specified in [RFC 4340, sec. 9.2] b) provides necessary socket options and documentation as to how to use them c) basic support and infrastructure for the Minimum Checksum Coverage feature [RFC 4340, sec. 9.2.1]: acceptability tests, user notification and user interface In addition, it (1) fixes two bugs in the DCCPv4 checksum computation: * pseudo-header used checksum_len instead of skb->len * incorrect checksum coverage calculation based on dccph_x (2) removes dccp_v4_verify_checksum() since it reduplicates code of the checksum computation; code calling this function is updated accordingly. (3) now uses skb_checksum(), which is safer than checksum_partial() if the sk_buff has is a non-linear buffer (has pages attached to it). (4) fixes an outstanding TODO item: * If P.CsCov is too large for the packet size, drop packet and return. The code has been tested with applications, the latest version of tcpdump now comes with support for partial DCCP checksums. Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
-
由 Eric Dumazet 提交于
We currently allocate a fixed size (TCP_SYNQ_HSIZE=512) slots hash table for each LISTEN socket, regardless of various parameters (listen backlog for example) On x86_64, this means order-1 allocations (might fail), even for 'small' sockets, expecting few connections. On the contrary, a huge server wanting a backlog of 50000 is slowed down a bit because of this fixed limit. This patch makes the sizing of listen hash table a dynamic parameter, depending of : - net.core.somaxconn tunable (default is 128) - net.ipv4.tcp_max_syn_backlog tunable (default : 256, 1024 or 128) - backlog value given by user application (2nd parameter of listen()) For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of kmalloc(). We still limit memory allocation with the two existing tunables (somaxconn & tcp_max_syn_backlog). So for standard setups, this patch actually reduce RAM usage. Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 9月, 2006 1 次提交
-
-
由 Gerrit Renker 提交于
This has been discussed on dccp@vger and removes the necessity for applications to supply service codes in each and every case. If an application does not want to provide a service code, that's fine, it will be given 0. Otherwise, service codes can be set via socket options as before. This patch has been tested using various client/server configurations (including listening on multiple service codes). Signed-off-by: NGerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com>
-
- 23 9月, 2006 1 次提交
-
-
由 Ian McDonald 提交于
This adds transmit buffering to DCCP. I have tested with CCID2/3 and with loss and rate limiting. Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 7月, 2006 1 次提交
-
-
由 Alan Cox 提交于
No actual bugs that I can see just a couple of unmarked casts getting annoying in my debug log files. Signed-off-by: NAlan Cox <alan@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 7月, 2006 1 次提交
-
-
由 Jörn Engel 提交于
Signed-off-by: NJörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: NAdrian Bunk <bunk@stusta.de>
-
- 18 6月, 2006 1 次提交
-
-
由 Chris Leech 提交于
Add an extra argument to sk_eat_skb, and make it move early copied packets to the async_wait_queue instead of freeing them. Signed-off-by: NChris Leech <christopher.leech@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 5月, 2006 1 次提交
-
-
由 Herbert Xu 提交于
Calling sock_orphan inside bh_lock_sock in dccp_close can lead to dead locks. For example, the inet_diag code holds sk_callback_lock without disabling BH. If an inbound packet arrives during that admittedly tiny window, it will cause a dead lock on bh_lock_sock. Another possible path would be through sock_wfree if the network device driver frees the tx skb in process context with BH enabled. We can fix this by moving sock_orphan out of bh_lock_sock. The tricky bit is to work out when we need to destroy the socket ourselves and when it has already been destroyed by someone else. By moving sock_orphan before the release_sock we can solve this problem. This is because as long as we own the socket lock its state cannot change. So we simply record the socket state before the release_sock and then check the state again after we regain the socket lock. If the socket state has transitioned to DCCP_CLOSED in the time being, we know that the socket has been destroyed. Otherwise the socket is still ours to keep. This problem was discoverd by Ingo Molnar using his lock validator. Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 3月, 2006 1 次提交
-
-
由 Davide Libenzi 提交于
Implement the half-closed devices notifiation, by adding a new POLLRDHUP (and its alias EPOLLRDHUP) bit to the existing poll/select sets. Since the existing POLLHUP handling, that does not report correctly half-closed devices, was feared to be changed, this implementation leaves the current POLLHUP reporting unchanged and simply add a new bit that is set in the few places where it makes sense. The same thing was discussed and conceptually agreed quite some time ago: http://lkml.org/lkml/2003/7/12/116 Since this new event bit is added to the existing Linux poll infrastruture, even the existing poll/select system calls will be able to use it. As far as the existing POLLHUP handling, the patch leaves it as is. The pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing archs and sets the bit in the six relevant files. The other attached diff is the simple change required to sys/epoll.h to add the EPOLLRDHUP definition. There is "a stupid program" to test POLLRDHUP delivery here: http://www.xmailserver.org/pollrdhup-test.c It tests poll(2), but since the delivery is same epoll(2) will work equally. Signed-off-by: NDavide Libenzi <davidel@xmailserver.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 21 3月, 2006 16 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
This is in preparation for having a dccp_minisock embedded into dccp_request_sock so that feature negotiation can be done prior to creating the full blown dccp_sock. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
This will later be included in struct dccp_request_sock so that we can have per connection feature negotiation state while in the 3way handshake, when we clone the DCCP_ROLE_LISTEN socket (in dccp_create_openreq_child) we'll just copy this state from dreq_minisock to dccps_minisock. Also the feature negotiation and option parsing code will mostly touch dccps_minisock, which will simplify some stuff. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
No code changes, just tidying up, in some cases moving EXPORT_SYMBOLs to just after the function exported, etc. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dmitry Mishin 提交于
This patch extends {get|set}sockopt compatibility layer in order to move protocol specific parts to their place and avoid huge universal net/compat.c file in the future. Signed-off-by: NDmitry Mishin <dim@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
I guess I forgot to add it, nah, now it just works: 18:04:33.274066 IP6 ::1.1476 > ::1.5001: request (service=0) 18:04:33.334482 IP6 ::1.5001 > ::1.1476: reset (code=bad_service_code) Ditched IP_DCCP_UNLOAD_HACK, as now we would have to do it for both IPv6 and IPv4, so I'll come up with another way for freeing the control sockets in upcoming changesets. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
With this patch in place we can break down the complexity by better compartmentalizing the code that is common to ipv6 and ipv4. Now we have these modules: Module Size Used by dccp_diag 1344 0 inet_diag 9448 1 dccp_diag dccp_ccid3 15856 0 dccp_tfrc_lib 12320 1 dccp_ccid3 dccp_ccid2 5764 0 dccp_ipv4 16996 2 dccp 48208 4 dccp_diag,dccp_ccid3,dccp_ccid2,dccp_ipv4 dccp_ipv6 still requires dccp_ipv4 due to dccp_ipv6_mapped, that is the next target to work on the "hey, ipv4 is legacy, I only want ipv6 dude!" direction. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
And introduce dccp_mib_exit grouping previously open coded sequence. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
As it is used by both ipv4 and ipv6. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
As this is used by both ipv4 and ipv6 and is not ipv4 specific. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
Removing one more ipv6 uses ipv4 stuff case in dccp land. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
[root@qemu ~]# for a in /proc/sys/net/dccp/default/* ; do echo $a ; cat $a ; done /proc/sys/net/dccp/default/ack_ratio 2 /proc/sys/net/dccp/default/rx_ccid 3 /proc/sys/net/dccp/default/send_ackvec 1 /proc/sys/net/dccp/default/send_ndp 1 /proc/sys/net/dccp/default/seq_window 100 /proc/sys/net/dccp/default/tx_ccid 3 [root@qemu ~]# So if wanting to test ccid3 as the tx CCID one can just do: [root@qemu ~]# echo 3 > /proc/sys/net/dccp/default/tx_ccid [root@qemu ~]# echo 2 > /proc/sys/net/dccp/default/rx_ccid [root@qemu ~]# cat /proc/sys/net/dccp/default/[tr]x_ccid 2 3 [root@qemu ~]# Of course we also need the setsockopt for each app to tell its preferences, but for testing or defining something other than CCID2 as the default for apps that don't explicitely set their preference the sysctl interface is handy. Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andrea Bittau 提交于
This also fixes the layout of dccp_hdr short sequence numbers, problem was not fatal now as we only support long (48 bits) sequence numbers. Signed-off-by: NAndrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andrea Bittau 提交于
Still needs more work, but boots and doesn't crashes, even does some negotiation! 18:38:52.174934 127.0.0.1.43458 > 127.0.0.1.5001: request <change_l ack_ratio 2, change_r ccid 2, change_l ccid 2> 18:38:52.218526 127.0.0.1.5001 > 127.0.0.1.43458: response <nop, nop, change_l ack_ratio 2, confirm_r ccid 2 2, confirm_l ccid 2 2, confirm_r ack_ratio 2> 18:38:52.185398 127.0.0.1.43458 > 127.0.0.1.5001: <nop, confirm_r ack_ratio 2, ack_vector0 0x00, elapsed_time 212> :-) Signed-off-by: NAndrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnaldo Carvalho de Melo 提交于
Signed-off-by: NArnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-