- 28 8月, 2013 6 次提交
-
-
由 Patrick McHardy 提交于
Add a SYNPROXY for netfilter. The code is split into two parts, the synproxy core with common functions and an address family specific target. The SYNPROXY receives the connection request from the client, responds with a SYN/ACK containing a SYN cookie and announcing a zero window and checks whether the final ACK from the client contains a valid cookie. It then establishes a connection to the original destination and, if successful, sends a window update to the client with the window size announced by the server. Support for timestamps, SACK, window scaling and MSS options can be statically configured as target parameters if the features of the server are known. If timestamps are used, the timestamp value sent back to the client in the SYN/ACK will be different from the real timestamp of the server. In order to now break PAWS, the timestamps are translated in the direction server->client. Signed-off-by: NPatrick McHardy <kaber@trash.net> Tested-by: NMartin Topholm <mph@one.com> Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Patrick McHardy 提交于
Extract the local TCP stack independant parts of tcp_v4_init_sequence() and cookie_v4_check() and export them for use by the upcoming SYNPROXY target. Signed-off-by: NPatrick McHardy <kaber@trash.net> Acked-by: NDavid S. Miller <davem@davemloft.net> Tested-by: NMartin Topholm <mph@one.com> Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Patrick McHardy 提交于
Split out sequence number adjustments from NAT and move them to the conntrack core to make them usable for SYN proxying. The sequence number adjustment information is moved to a seperate extend. The extend is added to new conntracks when a NAT mapping is set up for a connection using a helper. As a side effect, this saves 24 bytes per connection with NAT in the common case that a connection does not have a helper assigned. Signed-off-by: NPatrick McHardy <kaber@trash.net> Tested-by: NMartin Topholm <mph@one.com> Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Nathan Hintz 提交于
'nf_defrag_ipv6' is built as a separate module; it shouldn't be included in the 'nf_conntrack_ipv6' module as well. Signed-off-by: NNathan Hintz <nlhintz@hotmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Phil Oester 提交于
As reported by Casper Gripenberg, in a bridged setup, using ip[6]t_REJECT with the tcp-reset option sends out reset packets with the src MAC address of the local bridge interface, instead of the MAC address of the intended destination. This causes some routers/firewalls to drop the reset packet as it appears to be spoofed. Fix this by bypassing ip[6]_local_out and setting the MAC of the sender in the tcp reset packet. This closes netfilter bugzilla #531. Signed-off-by: NPhil Oester <kernel@linuxace.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Daniel Borkmann 提交于
Currently, the tcp_probe snooper can either filter packets by a given port (handed to the module via module parameter e.g. port=80) or lets all TCP traffic pass (port=0, default). When a port is specified, the port number is tested against the sk's source/destination port. Thus, if one of them matches, the information will be further processed for the log. As this is quite limited, allow for more advanced filtering possibilities which can facilitate debugging/analysis with the help of the tcp_probe snooper. Therefore, similarly as added to BPF machine in commit 7e75f93e ("pkt_sched: ingress socket filter by mark"), add the possibility to use skb->mark as a filter. If the mark is not being used otherwise, this allows ingress filtering by flow (e.g. in order to track updates from only a single flow, or a subset of all flows for a given port) and other things such as dynamic logging and reconfiguration without removing/re-inserting the tcp_probe module, etc. Simple example: insmod net/ipv4/tcp_probe.ko fwmark=8888 full=1 ... iptables -A INPUT -i eth4 -t mangle -p tcp --dport 22 \ --sport 60952 -j MARK --set-mark 8888 [... sampling interval ...] iptables -D INPUT -i eth4 -t mangle -p tcp --dport 22 \ --sport 60952 -j MARK --set-mark 8888 The current option to filter by a given port is still being preserved. A similar approach could be done for the sctp_probe module as a follow-up. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 8月, 2013 1 次提交
-
-
由 Dan Carpenter 提交于
Eric Dumazet says that my previous fix for an ERR_PTR dereference (ea857f28 'ipip: dereferencing an ERR_PTR in ip_tunnel_init_net()') could be racy and suggests the following fix instead. Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 8月, 2013 7 次提交
-
-
由 Daniel Borkmann 提交于
We can simply use the %pISc format specifier that was recently added and thus remove some code that distinguishes between IPv4 and IPv6. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Duan Jiong 提交于
rfc 4861 says the Redirected Header option is optional, so the kernel should not drop the Redirect Message that has no Redirected Header option. In this patch, the function ip6_redirect_no_header() is introduced to deal with that condition. Signed-off-by: NDuan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
-
由 Daniel Borkmann 提交于
The tcp_probe currently only supports analysis of IPv4 connections. Therefore, it would be nice to have IPv6 supported as well. Since we have the recently added %pISpc specifier that is IPv4/IPv6 generic, build related sockaddress structures from the flow information and pass this to our format string. Tested with SSH and HTTP sessions on IPv4 and IPv6. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
This patches fixes a rather unproblematic function signature mismatch as the const specifier was missing for the th variable; and next to that it adds a build-time assertion so that future function signature mismatches for kprobes will not end badly, similarly as commit 22222997 ("net: sctp: add build check for sctp_sf_eat_sack_6_2/jsctp_sf_eat_sack") did it for SCTP. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
It is helpful to sometimes know the TCP window sizes of an established socket e.g. to confirm that window scaling is working or to tweak the window size to improve high-latency connections, etc etc. Currently the TCP snooper only exports the send window size, but not the receive window size. Therefore, also add the receive window size to the end of the output line. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
The stack currently detects reordering and avoid spurious retransmission very well. However the throughput is sub-optimal under high reordering because cwnd is increased only if the data is deliverd in order. I.e., FLAG_DATA_ACKED check in tcp_ack(). The more packet are reordered the worse the throughput is. Therefore when reordering is proven high, cwnd should advance whenever the data is delivered regardless of its ordering. If reordering is low, conservatively advance cwnd only on ordered deliveries in Open state, and retain cwnd in Disordered state (RFC5681). Using netperf on a qdisc setup of 20Mbps BW and random RTT from 45ms to 55ms (for reordering effect). This change increases TCP throughput by 20 - 25% to near bottleneck BW. A special case is the stretched ACK with new SACK and/or ECE mark. For example, a receiver may receive an out of order or ECN packet with unacked data buffered because of LRO or delayed ACK. The principle on such an ACK is to advance cwnd on the cummulative acked part first, then reduce cwnd in tcp_fastretrans_alert(). Signed-off-by: NYuchung Cheng <ycheng@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
This reverts commit 58ad436f. It turns out that the change introduced a potential deadlock by causing a locking dependency with netlink's cb_mutex. I can't seem to find a way to resolve this without doing major changes to the locking, so revert this. Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Acked-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 8月, 2013 19 次提交
-
-
由 Daniel Borkmann 提交于
Instead of hard-coding length values, use a define to make it clear where those lengths come from. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
For the functions mld_gq_start_timer(), mld_ifc_start_timer(), and mld_dad_start_timer(), rather use unsigned long than int as we operate only on unsigned values anyway. This seems more appropriate as there is no good reason to do type conversions to int, that could lead to future errors. Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
Use proper API functions to calculate jiffies from milliseconds and not the crude method of dividing HZ by a value. This ensures more accurate values even in the case of strange HZ values. While at it, also simplify code in the mlh2 case by using max(). Signed-off-by: NDaniel Borkmann <dborkman@redhat.com> Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nicolas Dichtel 提交于
When an Xin6 tunnel is set up, we check other netdevices to inherit the link- local address. If none is available, the interface will not have any link-local address. RFC4862 expects that each interface has a link local address. Now than this kind of tunnels supports x-netns, it's easy to fall in this case (by creating the tunnel in a netns where ethernet interfaces stand and then moving it to a other netns where no ethernet interface is available). RFC4291, Appendix A suggests two methods: the first is the one currently implemented, the second is to generate a unique identifier, so that we can always generate the link-local address. Let's use eth_random_addr() to generate this interface indentifier. I remove completly the previous method, hence for the whole life of the interface, the link-local address remains the same (previously, it depends on which ethernet interfaces were up when the tunnel interface was set up). Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This reverts commit df8372ca. These changes are buggy and make unintended semantic changes to ip6_tnl_add_linklocal(). Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Toshiaki Makita 提交于
The VLAN code needs to know the length of the per-port VLAN bitmap to perform its most basic operations (retrieving VLAN informations, removing VLANs, forwarding database manipulation, etc). Unfortunately, in the current implementation we are using a macro that indicates the bitmap size in longs in places where the size in bits is expected, which in some cases can cause what appear to be random failures. Use the correct macro. Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
getsockopt PACKET_STATISTICS returns tp_packets + tp_drops. Commit ee80fbf3 ("packet: account statistics only in tpacket_stats_u") cleaned up the getsockopt PACKET_STATISTICS code. This also changed semantics. Historically, tp_packets included tp_drops on return. The commit removed the line that adds tp_drops into tp_packets. This patch reinstates the old semantics. Signed-off-by: NWillem de Bruijn <willemb@google.com> Acked-by: NDaniel Borkmann <dborkman@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dan Carpenter 提交于
We need to move the derefernce after the IS_ERR() check. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
As discussed last year [1], there is no compelling reason to limit IPv4 MTU to 0xFFF0, while real limit is 0xFFFF [1] : http://marc.info/?l=linux-netdev&m=135607247609434&w=2 Willem raised this issue again because some of our internal regression tests broke after lo mtu being set to 65536. IP_MTU reports 0xFFF0, and the test attempts to send a RAW datagram of mtu + 1 bytes, expecting the send() to fail, but it does not. Alexey raised interesting points about TCP MSS, that should be addressed in follow-up patches in TCP stack if needed, as someone could also set an odd mtu anyway. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christoph Paasch 提交于
The nocache-argument was used in tcp_v4_send_synack as an argument to inet_csk_route_req. However, since ba3f7f04 (ipv4: Kill FLOWI_FLAG_RT_NOCACHE and associated code.) this is no more used. This patch removes the unsued argument from tcp_v4_send_synack. Signed-off-by: NChristoph Paasch <christoph.paasch@uclouvain.be> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 dingtianhong 提交于
ERROR: code indent should use tabs where possible: fix 2. ERROR: do not use assignment in if condition: fix 5. Signed-off-by: NDing Tianhong <dingtianhong@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 dingtianhong 提交于
Just follow the Joe Perches's opinion, it is a better way to fix the style errors. Suggested-by: NJoe Perches <joe@perches.com> Signed-off-by: NDing Tianhong <dingtianhong@huawei.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
Handle context based address when an unspecified address is given. For other context based address we print a warning and drop the packet because we don't support it right now. Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Reviewed-by: NWerner Almesberger <werner@almesberger.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch drops the pre and postcount calculation from the lowpan_uncompress_addr function.We use instead a switch/case over address_mode value. The original implementation has several bugs in this function and it was hard to decrypt how it works. To make it maintainable and fix these bugs this patch basically reimplements lowpan_uncompress_addr from scratch. A list of bugs we found in the current implementation: 1) Properly support uncompression of short-address based IPv6 addresses (instead of basically copying garbage) 2) Fix use and uncompression of long-addresses based IPv6 addresses 3) Add missing ff:fe00 in the case of SAM/DAM = 2 and M = 0 Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Reviewed-by: NWerner Almesberger <werner@almesberger.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
Add function to uncompress multicast address. This function split the uncompress function for a multicast address in a seperate function. To uncompress a multicast address is different than a other non-multicasts addresses according to rfc6282. Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Reviewed-by: NWerner Almesberger <werner@almesberger.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds a helper function to parse the ipv6 header to a 6lowpan header in stream. This function checks first if we can pull data with a specific length from a skb. If this seems to be okay, we copy skb data to a destination pointer and run skb_pull. Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Reviewed-by: NWerner Almesberger <werner@almesberger.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Hauweele 提交于
When a new 6lowpan fragment is received, a skbuff is allocated for the reassembled packet. However when a 6lowpan packet compresses link-local addresses based on link-layer addresses, the processing function relies on the skb mac control block to find the related link-layer address. This patch copies the control block from the first fragment into the newly allocated skb to keep a trace of the link-layer addresses in case of a link-local compressed address. Edit: small changes on comment issue Signed-off-by: NDavid Hauweele <david@hauweele.net> Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Reviewed-by: NWerner Almesberger <werner@almesberger.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch simplify the handling to set fields inside of struct ipv6hdr to zero. Instead of setting some memory regions with memset to zero we initialize the whole ipv6hdr to zero. This is a simplification for parsing the 6lowpan header for the upcomming patches. Signed-off-by: NAlexander Aring <alex.aring@gmail.com> Reviewed-by: NWerner Almesberger <werner@almesberger.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Andrey Vagin 提交于
When the repair mode is turned off, the write queue seqs are updated so that the whole queue is considered to be 'already sent. The "when" field must be set for such skb. It's used in tcp_rearm_rto for example. If the "when" field isn't set, the retransmit timeout can be calculated incorrectly and a tcp connected can stop for two minutes (TCP_RTO_MAX). Acked-by: NPavel Emelyanov <xemul@parallels.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: NAndrey Vagin <avagin@openvz.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 8月, 2013 3 次提交
-
-
由 Pravin B Shelar 提交于
Following patch adds vxlan vport type for openvswitch using vxlan api. So now there is vxlan dependency for openvswitch. CC: Jesse Gross <jesse@nicira.com> Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Acked-by: NJesse Gross <jesse@nicira.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hannes Frederic Sowa 提交于
It is not allowed for an ipv6 packet to contain multiple fragmentation headers. So discard packets which were already reassembled by fragmentation logic and send back a parameter problem icmp. The updates for RFC 6980 will come in later, I have to do a bit more research here. Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hannes Frederic Sowa 提交于
Because of the max_addresses check attackers were able to disable privacy extensions on an interface by creating enough autoconfigured addresses: <http://seclists.org/oss-sec/2012/q4/292> But the check is not actually needed: max_addresses protects the kernel to install too many ipv6 addresses on an interface and guards addrconf_prefix_rcv to install further addresses as soon as this limit is reached. We only generate temporary addresses in direct response of a new address showing up. As soon as we filled up the maximum number of addresses of an interface, we stop installing more addresses and thus also stop generating more temp addresses. Even if the attacker tries to generate a lot of temporary addresses by announcing a prefix and removing it again (lifetime == 0) we won't install more temp addresses, because the temporary addresses do count to the maximum number of addresses, thus we would stop installing new autoconfigured addresses when the limit is reached. This patch fixes CVE-2013-0343 (but other layer-2 attacks are still possible). Thanks to Ding Tianhong to bring this topic up again. Cc: Ding Tianhong <dingtianhong@huawei.com> Cc: George Kargiotakis <kargig@void.gr> Cc: P J P <ppandit@redhat.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: NDing Tianhong <dingtianhong@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 8月, 2013 1 次提交
-
-
由 Rami Rosen 提交于
This patch removes a comment in xfrm_input() which became irrelevant due to commit 2774c131, "xfrm: Handle blackhole route creation via afinfo". That commit removed returning -EREMOTE in the xfrm_lookup() method when the packet should be discarded and also removed the correspoinding -EREMOTE handlers. This was replaced by calling the make_blackhole() method. Therefore the comment about -EREMOTE is not relevant anymore. Signed-off-by: NRami Rosen <ramirose@gmail.com> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
- 18 8月, 2013 1 次提交
-
-
由 Linus Lüssing 提交于
batadv_unicast(_4addr)_prepare_skb might reallocate the skb's data. And if it tries to do so then this can potentially fail. We shouldn't continue working on this skb in such a case. Signed-off-by: NLinus Lüssing <linus.luessing@web.de> Signed-off-by: NMarek Lindner <lindner_marek@yahoo.de> Acked-by: NAntonio Quartulli <ordex@autistici.org> Signed-off-by: NAntonio Quartulli <ordex@autistici.org>
-
- 16 8月, 2013 2 次提交
-
-
由 Fan Du 提交于
xfrm_state timer should be independent of system clock change, so switch to CLOCK_BOOTTIME base which is not only monotonic but also counting suspend time. Thus issue reported in commit: 9e0d57fd ("xfrm: SAD entries do not expire correctly after suspend-resume") could ALSO be avoided. v2: Use CLOCK_BOOTTIME to count suspend time, but still monotonic. Signed-off-by: NFan Du <fan.du@windriver.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
-
由 Pravin B Shelar 提交于
Following patch stores struct netlink_callback in netlink_sock to avoid allocating and freeing it on every netlink dump msg. Only one dump operation is allowed for a given socket at a time therefore we can safely convert cb pointer to cb struct inside netlink_sock. Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-