- 18 7月, 2018 13 次提交
-
-
由 Florian Westphal 提交于
always call this function, followup patch can use this to aquire a per-netns transaction log to guard the entire batch instead of using the nfnl susbsys mutex (which is shared among all namespaces). Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
module autoload is problematic, it requires dropping the mutex that protects the transaction. Once the mutex has been dropped, another client can start a new transaction before we had a chance to abort current transaction log. This helper makes sure we first zap the transaction log, then drop mutex for module autoload. In case autload is successful, the caller has to reply entire message anyway. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Gao Feng 提交于
The param helper of nf_ct_helper_ext_add is useless now, then remove it now. Signed-off-by: NGao Feng <gfree.wind@vip.163.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Julian Anastasov 提交于
Before now, connection templates were ignored by the random dropentry procedure. But Michal Koutný suggests that we should add exception for connections under SYN attack. He provided patch that implements it for TCP: <quote> IPVS includes protection against filling the ip_vs_conn_tab by dropping 1/32 of feasible entries every second. The template entries (for persistent services) are never directly deleted by this mechanism but when a picked TCP connection entry is being dropped (1), the respective template entry is dropped too (realized by expiring 60 seconds after the connection entry being dropped). There is another mechanism that removes connection entries when they time out (2), in this case the associated template entry is not deleted. Under SYN flood template entries would accumulate (due to their entry longer timeout). The accumulation takes place also with drop_entry being enabled. Roughly 15% ((31/32)^60) of SYN_RECV connections survive the dropping mechanism (1) and are removed by the timeout mechanism (2)(defaults to 60 seconds for SYN_RECV), thus template entries would still accumulate. The patch ensures that when a connection entry times out, we also remove the template entry from the table. To prevent breaking persistent services (since the connection may time out in already established state) we add a new entry flag to protect templates what spawned at least one established TCP connection. </quote> We already added ASSURED flag for the templates in previous patch, so that we can use it now to decide which connection templates should be dropped under attack. But we also have some cases that need special handling. We modify the dropentry procedure as follows: - Linux timers currently use LIFO ordering but we can not rely on this to drop controlling connections. So, set cp->timeout to 0 to indicate that connection was dropped and that on expiration we should try to drop our controlling connections. As result, we can now avoid the ip_vs_conn_expire_now call. - move the cp->n_control check above, so that it avoids restarting the timer for controlling connections when not needed. - drop unassured connection templates here if they are not referred by any connections. On connection expiration: if connection was dropped (cp->timeout=0) try to drop our controlling connection except if it is a template in assured state. In ip_vs_conn_flush change order of ip_vs_conn_expire_now calls according to the LIFO timer expiration order. It should work faster for controlling connections with single controlled one. Suggested-by: NMichal Koutný <mkoutny@suse.com> Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Julian Anastasov 提交于
cp->state was not used for templates. Add support for state bits and for the first "assured" bit which indicates that some connection controlled by this template was established or assured by the real server. In a followup patch we will use it to drop templates under SYN attack. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Julian Anastasov 提交于
In preparation for followup patches, provide just the cp ptr to ip_vs_state_name. Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Martynas Pumputis 提交于
This patch enables the clash resolution for NAT (disabled in "590b52e1") if clashing conntracks match (i.e. both tuples are equal) and a protocol allows it. The clash might happen for a connections-less protocol (e.g. UDP) when two threads in parallel writes to the same socket and consequent calls to "get_unique_tuple" return the same tuples (incl. reply tuples). In this case it is safe to perform the resolution, as the losing CT describes the same mangling as the winning CT, so no modifications to the packet are needed, and the result of rules traversal for the loser's packet stays valid. Signed-off-by: NMartynas Pumputis <martynas@weave.works> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Yi-Hung Wei 提交于
This patch is originally from Florian Westphal. This patch does the following 3 main tasks. 1) Add list lock to 'struct nf_conncount_list' so that we can alter the lists containing the individual connections without holding the main tree lock. It would be useful when we only need to add/remove to/from a list without allocate/remove a node in the tree. With this change, we update nft_connlimit accordingly since we longer need to maintain a list lock in nft_connlimit now. 2) Use RCU for the initial tree search to improve tree look up performance. 3) Add a garbage collection worker. This worker is schedule when there are excessive tree node that needed to be recycled. Moreover,the rbnode reclaim logic is moved from search tree to insert tree to avoid race condition. Signed-off-by: NYi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Yi-Hung Wei 提交于
This patch is originally from Florian Westphal. When we have a very coarse grouping, e.g. by large subnets, zone id, etc, it's likely that we do not need to do tree rotation because we'll find a node where we can attach new entry. Based on this observation, we split tree traversal and insertion. Later on, we can make traversal lockless (tree protected by RCU), and add extra lock in the individual nodes to protect list insertion/deletion, thereby allowing parallel insert/delete in different tree nodes. Signed-off-by: NYi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Yi-Hung Wei 提交于
This patch is originally from Florian Westphal. This is a preparation patch to allow lockless traversal of the tree via RCU. Signed-off-by: NYi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Yi-Hung Wei 提交于
This patch is originally from Florian Westphal. This patch does the following three tasks. It applies the same early exit technique for nf_conncount_lookup(). Since now we keep the number of connections in 'struct nf_conncount_list', we no longer need to return the count in nf_conncount_lookup(). Moreover, we expose the garbage collection function nf_conncount_gc_list() for nft_connlimit. Signed-off-by: NYi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Yi-Hung Wei 提交于
Original patch is from Florian Westphal. This patch switches from hlist to plain list to store the list of connections with the same filtering key in nf_conncount. With the plain list, we can insert new connections at the tail, so over time the beginning of list holds long-running connections and those are expired, while the newly creates ones are at the end. Later on, we could probably move checked ones to the end of the list, so the next run has higher chance to reclaim stale entries in the front. Signed-off-by: NYi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Yi-Hung Wei 提交于
This patch is originally from Florian Westphal. We use an extra function with early exit for garbage collection. It is not necessary to traverse the full list for every node since it is enough to zap a couple of entries for garbage collection. Signed-off-by: NYi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 17 7月, 2018 2 次提交
-
-
由 Máté Eckl 提交于
... from IPV6 to NF_TABLES_IPV6 and IP6_NF_IPTABLES. In some cases module selects depend on IPV6, but this means that they select another module even if eg. NF_TABLES_IPV6 is not set in which case the selected module is useless due to the lack of IPv6 nf_tables functionality. The same applies for IP6_NF_IPTABLES and iptables. Joint work with: Arnd Bermann <arnd@arndb.de> Signed-off-by: NMáté Eckl <ecklm94@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
This unifies ipv4 and ipv6 protocol trackers and removes the l3proto abstraction. This gets rid of all l3proto indirect calls and the need to do a lookup on the function to call for l3 demux. It increases module size by only a small amount (12kbyte), so this reduces size because nf_conntrack.ko is useless without either nf_conntrack_ipv4 or nf_conntrack_ipv6 module. before: text data bss dec hex filename 7357 1088 0 8445 20fd nf_conntrack_ipv4.ko 7405 1084 4 8493 212d nf_conntrack_ipv6.ko 72614 13689 236 86539 1520b nf_conntrack.ko 19K nf_conntrack_ipv4.ko 19K nf_conntrack_ipv6.ko 179K nf_conntrack.ko after: text data bss dec hex filename 79277 13937 236 93450 16d0a nf_conntrack.ko 191K nf_conntrack.ko Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 16 7月, 2018 23 次提交
-
-
由 Florian Westphal 提交于
Not needed, we can have the l4trackers fetch it themselvs. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Handle common protocols (udp, tcp, ..), in the core and only do the call if needed by the l4proto tracker. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Handle the common cases (tcp, udp, etc). in the core and only do the indirect call for the protocols that need it (GRE for instance). Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Handle it in the core instead. ipv6_skip_exthdr() is built-in even if ipv6 is a module, i.e. this doesn't create an ipv6 dependency. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Its simpler to just handle it directly in nf_ct_invert_tuple(). Also gets rid of need to pass l3proto pointer to resolve_conntrack(). Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
handle everything from ctnetlink directly. After all these years we still only support ipv4 and ipv6, so it seems reasonable to remove l3 protocol tracker support and instead handle ipv4/ipv6 from a common, always builtin inet tracker. Step 1: Get rid of all the l3proto->func() calls. Start with ctnetlink, then move on to packet-path ones. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Máté Eckl 提交于
Instead of depending on it. Signed-off-by: NMáté Eckl <ecklm94@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
These versions deal with the l3proto/l4proto details internally. It removes only caller of nf_ct_get_tuple, so make it static. After this, l3proto->get_l4proto() can be removed in a followup patch. Signed-off-by: NFlorian Westphal <fw@strlen.de> Acked-by: NPravin B Shelar <pshelar@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
similar to previous change, this also allows to remove it from nf_ipv6_ops and avoid the indirection. It also removes the bogus dependency of nf_conntrack_ipv6 on ipv6 module: ipv6 checksum functions are built into kernel even if CONFIG_IPV6=m, but ipv6/netfilter.o isn't. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
allows to make nf_ip_checksum_partial static, it no longer has an external caller. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Máté Eckl 提交于
This function is also necessary to implement nft tproxy support Fixes: 45ca4e0c ("netfilter: Libify xt_TPROXY") Signed-off-by: NMáté Eckl <ecklm94@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
This is one of the very few external callers of ->get_timeouts(), We can use a fixed timeout instead, conntrack core will refresh this in case a new packet comes within this period. Use of ESTABLISHED timeout seems way too huge anyway. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Taehee Yoo 提交于
In the nft_reject_br_send_v4_tcp_reset(), a ttl is set by the nf_reject_iphdr_put(). so, below code is unnecessary. Signed-off-by: NTaehee Yoo <ap420073@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Boris Pismenny 提交于
zerocopy_from_iter iterates over the message, but it doesn't revert the updates made by the iov iteration. This patch fixes it. Now, the iov can be used after calling zerocopy_from_iter. Fixes: 3c4d7559 ("tls: kernel TLS support") Signed-off-by: NBoris Pismenny <borisp@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Boris Pismenny 提交于
This patch completes the generic infrastructure to offload TLS crypto to a network device. It enables the kernel to skip decryption and authentication of some skbs marked as decrypted by the NIC. In the fast path, all packets received are decrypted by the NIC and the performance is comparable to plain TCP. This infrastructure doesn't require a TCP offload engine. Instead, the NIC only decrypts packets that contain the expected TCP sequence number. Out-Of-Order TCP packets are provided unmodified. As a result, at the worst case a received TLS record consists of both plaintext and ciphertext packets. These partially decrypted records must be reencrypted, only to be decrypted. The notable differences between SW KTLS Rx and this offload are as follows: 1. Partial decryption - Software must handle the case of a TLS record that was only partially decrypted by HW. This can happen due to packet reordering. 2. Resynchronization - tls_read_size calls the device driver to resynchronize HW after HW lost track of TLS record framing in the TCP stream. Signed-off-by: NBoris Pismenny <borisp@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Boris Pismenny 提交于
This patch allows tls_set_sw_offload to fill the context in case it was already allocated previously. We will use it in TLS_DEVICE to fill the RX software context. Signed-off-by: NBoris Pismenny <borisp@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Boris Pismenny 提交于
This patch splits tls_sw_release_resources_rx into two functions one which releases all inner software tls structures and another that also frees the containing structure. In TLS_DEVICE we will need to release the software structures without freeeing the containing structure, which contains other information. Signed-off-by: NBoris Pismenny <borisp@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Boris Pismenny 提交于
Previously, decrypt_skb also updated the TLS context. Now, decrypt_skb only decrypts the payload using the current context, while decrypt_skb_update also updates the state. Later, in the tls_device Rx flow, we will use decrypt_skb directly. Signed-off-by: NBoris Pismenny <borisp@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Boris Pismenny 提交于
For symmetry, we rename tls_offload_context to tls_offload_context_tx before we add tls_offload_context_rx. Signed-off-by: NBoris Pismenny <borisp@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Boris Pismenny 提交于
Prevent coalescing of decrypted and encrypted SKBs in GRO and TCP layer. Signed-off-by: NBoris Pismenny <borisp@mellanox.com> Signed-off-by: NIlya Lesokhin <ilyal@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ilya Lesokhin 提交于
This patch adds a netdev feature to configure TLS RX inline crypto offload. Signed-off-by: NIlya Lesokhin <ilyal@mellanox.com> Signed-off-by: NBoris Pismenny <borisp@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Boris Pismenny 提交于
The decrypted bit is propogated to cloned/copied skbs. This will be used later by the inline crypto receive side offload of tls. Signed-off-by: NBoris Pismenny <borisp@mellanox.com> Signed-off-by: NIlya Lesokhin <ilyal@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 7月, 2018 2 次提交
-
-
由 Andrey Ignatov 提交于
Add new TCP-BPF callback that is called on listen(2) right after socket transition to TCP_LISTEN state. It fills the gap for listening sockets in TCP-BPF. For example BPF program can set BPF_SOCK_OPS_STATE_CB_FLAG when socket becomes listening and track later transition from TCP_LISTEN to TCP_CLOSE with BPF_SOCK_OPS_STATE_CB callback. Before there was no way to do it with TCP-BPF and other options were much harder to work with. E.g. socket state tracking can be done with tracepoints (either raw or regular) but they can't be attached to cgroup and their lifetime has to be managed separately. Signed-off-by: NAndrey Ignatov <rdna@fb.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Yafang Shao 提交于
tcp_rcv_nxt_update() is already executed in tcp_data_queue(). This line is redundant. See bellow, tcp_queue_rcv tcp_rcv_nxt_update(tcp_sk(sk), TCP_SKB_CB(skb)->end_seq); tcp_rcv_nxt_update(tp, TCP_SKB_CB(skb)->end_seq); <<<< redundant Signed-off-by: NYafang Shao <laoar.shao@gmail.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-