- 08 2月, 2017 12 次提交
-
-
由 Manuel Messner 提交于
This patch implements the kernel side of the TCP option patch. Signed-off-by: NManuel Messner <mm@skelett.io> Reviewed-by: NFlorian Westphal <fw@strlen.de> Acked-by: NPhil Sutter <phil@nwl.cc> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
zones allow tracking multiple connections sharing identical tuples, this is needed e.g. when tracking distinct vlans with overlapping ip addresses (conntrack is l2 agnostic). Thus the zone has to be set before the packet is picked up by the connection tracker. This is done by means of 'conntrack templates' which are conntrack structures used solely to pass this info from one netfilter hook to the next. The iptables CT target instantiates these connection tracking templates once per rule, i.e. the template is fixed/tied to particular zone, can be read-only and therefore be re-used by as many skbs simultaneously as needed. We can't follow this model because we want to take the zone id from an sreg at rule eval time so we could e.g. fill in the zone id from the packets vlan id or a e.g. nftables key : value maps. To avoid cost of per packet alloc/free of the template, use a percpu template 'scratch' object and use the refcount to detect the (unlikely) case where the template is still attached to another skb (i.e., previous skb was nfqueued ...). Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Next patch will add ZONE_ID set support which will need similar error unwind (put operation) as conntrack labels. Prepare for this: remove the 'label_got' boolean in favor of a switch statement that can be extended in next patch. As we already have that in the set_destroy function place that in a separate function and call it from the set init function. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Just like with counters the direction attribute is optional. We set priv->dir to MAX unconditionally to avoid duplicating the assignment for all keys with optional direction. For keys where direction is mandatory, existing code already returns an error. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
This patch adds a new bitmap set type. This bitmap uses two bits to represent one element. These two bits determine the element state in the current and the future generation that fits into the nf_tables commit protocol. When dumping elements back to userspace, the two bits are expanded into a struct nft_set_ext object. If no NFTA_SET_DESC_SIZE is specified, the existing automatic set backend selection prefers bitmap over hash in case of keys whose size is <= 16 bit. If the set size is know, the bitmap set type is selected if with 16 bit kets and more than 390 elements in the set, otherwise the hash table set implementation is used. For 8 bit keys, the bitmap consumes 66 bytes. For 16 bit keys, the bitmap takes 16388 bytes. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
The space notation allows us to classify the set backend implementation based on the amount of required memory. This provides an order of the set representation scalability in terms of memory. The size field is still left in place so use this if the userspace provides no explicit number of elements, so we cannot calculate the real memory that this set needs. This also helps us break ties in the set backend selection routine, eg. two backend implementations provide the same performance. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
Use lookup as field name instead, to prepare the introduction of the memory class in a follow up patch. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
This provides context to walk callback iterator, thus, we know if the walk happens from the set flush path. This is required by the new bitmap set type coming in a follow up patch which has no real struct nft_set_ext, so it has to allocate it based on the two bit compact element representation. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
Although semantics are similar to deactivate() with no implicit element lookup, this is only called from the set flush path, so better rename this to flush(). Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
Instead of struct nft_set_dump_args, remove unnecessary wrapper structure. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
This new parameter is required by the new bitmap set type that comes in a follow up patch. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Phil Sutter 提交于
If NFT_EXTHDR_F_PRESENT is set, exthdr will not copy any header field data into *dest, but instead set it to 1 if the header is found and 0 otherwise. Signed-off-by: NPhil Sutter <phil@nwl.cc> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 04 2月, 2017 28 次提交
-
-
由 Eric Dumazet 提交于
My recent change missed fact that UFO would perform a complete UDP checksum before segmenting in frags. In this case skb->ip_summed is set to CHECKSUM_NONE. We need to add this valid case to skb_needs_check() Fixes: b2504a5d ("net: reduce skb_warn_bad_offload() noise") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We added generic support for busy polling in NAPI layer in linux-4.5 No network driver uses ndo_busy_poll() anymore, we can get rid of the pointer in struct net_device_ops, and its use in sk_busy_loop() Saves NETIF_F_BUSY_POLL features bit. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
We do polling generically these days. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
In linux-4.5, busy polling was implemented in core NAPI stack, meaning that all custom implementation can be removed from drivers. Not only we remove lot's of code, we also remove one lock operation in fast path, and allow GRO to do its job. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
In linux-4.5, busy polling was implemented in core NAPI stack, meaning that all custom implementation can be removed from drivers. Not only we remove lot's of code, we also remove one lock operation in fast path, and allow GRO to do its job. Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next由 David S. Miller 提交于
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree, they are: 1) Stash ctinfo 3-bit field into pointer to nf_conntrack object from sk_buff so we only access one single cacheline in the conntrack hotpath. Patchset from Florian Westphal. 2) Don't leak pointer to internal structures when exporting x_tables ruleset back to userspace, from Willem DeBruijn. This includes new helper functions to copy data to userspace such as xt_data_to_user() as well as conversions of our ip_tables, ip6_tables and arp_tables clients to use it. Not surprinsingly, ebtables requires an ad-hoc update. There is also a new field in x_tables extensions to indicate the amount of bytes that we copy to userspace. 3) Add nf_log_all_netns sysctl: This new knob allows you to enable logging via nf_log infrastructure for all existing netnamespaces. Given the effort to provide pernet syslog has been discontinued, let's provide a way to restore logging using netfilter kernel logging facilities in trusted environments. Patch from Michal Kubecek. 4) Validate SCTP checksum from conntrack helper, from Davide Caratti. 5) Merge UDPlite conntrack and NAT helpers into UDP, this was mostly a copy&paste from the original helper, from Florian Westphal. 6) Reset netfilter state when duplicating packets, also from Florian. 7) Remove unnecessary check for broadcast in IPv6 in pkttype match and nft_meta, from Liping Zhang. 8) Add missing code to deal with loopback packets from nft_meta when used by the netdev family, also from Liping. 9) Several cleanups on nf_tables, one to remove unnecessary check from the netlink control plane path to add table, set and stateful objects and code consolidation when unregister chain hooks, from Gao Feng. 10) Fix harmless reference counter underflow in IPVS that, however, results in problems with the introduction of the new refcount_t type, from David Windsor. 11) Enable LIBCRC32C from nf_ct_sctp instead of nf_nat_sctp, from Davide Caratti. 12) Missing documentation on nf_tables uapi header, from Liping Zhang. 13) Use rb_entry() helper in xt_connlimit, from Geliang Tang. ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Jiri Pirko says: ==================== mlxsw: Introduce TC Flower offload using TCAM This patchset introduces support for offloading TC cls_flower and actions to Spectrum TCAM-base policy engine. The patchset contains patches to allow work with flexible keys and actions which are used in Spectrum TCAM. It also contains in-driver infrastructure for offloading TC rules to TCAM HW. The TCAM management code is simple and limited for now. It is going to be extended as a follow-up work. The last patch uses the previously introduced infra to allow to implement cls_flower offloading. Initially, only limited set of match-keys and only a drop and forward actions are supported. As a dependency, this patchset introduces parman - priority array area manager - as a library. v1->v2: - patch11: - use __set_bit and __test_and_clear_bit as suggested by DaveM - patch16: - Added documentation to the API functions as suggested by Tom Herbert - patch17: - use __set_bit and __clear_bit as suggested by DaveM ==================== Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Extend the existing setup_tc ndo call and allow to offload cls_flower rules. Only limited set of dissector keys and actions are supported now. Use previously introduced ACL infrastructure to offload cls_flower rules to be processed in the HW. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
The driver that offloads flower rules needs to know with which priority user inserted the rules. So add this information into offload struct. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Add ACL core infrastructure for Spectrum ASIC. This infra provides an abstraction layer over specific HW implementations. There are two basic objects used. One is "rule" and the second is "ruleset" which serves as a container of multiple rules. In general, within one ruleset the rules are allowed to have multiple priorities and masks. Each ruleset is bound to either ingress or egress a of port netdevice. The initial TCAM implementation is very simple and limited. It utilizes parman lsort manager to take care of TCAM region layout. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
This introduces a infrastructure for management of linear priority areas. Priority order in an array matters, however order of items inside a priority group does not matter. As an initial implementation, L-sort algorithm is used. It is quite trivial. More advanced algorithm called P-sort will be introduced as a follow-up. The infrastructure is prepared for other algos. Alongside this, a testing module is introduced as well. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Similar to list_for_each_entry_continue and its reverse variant list_for_each_entry_continue_reverse, introduce reverse helper for list_for_each_entry_from. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Add couple of resource limits related to ACL. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Introduce basic set of Spectrum flexible key blocks. It contains blocks needed to carry all elements defined so far. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Each entry which is matched during ACL lookup points to an action set. This action set contains up to three separate actions. If more actions are needed to be chained, the extended set is created to hold them in KVD linear area. This patch implements handling of sets and encoding of actions. Currectly, only two actions are supported. Drop and forward. Forward action uses PBS pointer to KVD linear area, so the action code needs to take care of this as well. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Hardware supports matching on so called "flexible keys". The idea is to assemble an optimal key to use for matching according to the fields in packet (elements) requested by user. Certain sets of elements are combined into pre-defined blocks. There is a picker to find needed blocks. Keys consist of 1..n blocks. Alongside with that, an initial portion of elements is introduced in order to be able to offload basic cls_flower rules. Picked keys are cached so multiple rules could share them. There is an encode function provided that takes care of encoding key and mask values according to given key. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
PEFA register is used for accessing an extended flexible action entry in the central KVD Linear Database. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
The PPBS register retrieves and sets Policy Based Switching Table entries. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
The PRCR register is used for accessing rules within a TCAM region. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
The PPBT is used for configuration of the Port Binding Table. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
The PTCE-V2 register is used for accessing rules within a TCAM region. It is a new version of PTCE in order to support wider key, mask and action within a TCAM region. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
The PTAR register is used for allocation of regions in the TCAM. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
The PAGT register is used for configuration of the ACL Group Table. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
The PACL register is used for configuration of the ACL. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Sometimes it is handy to get a pointer to a char buffer item and use it direcly to write/read data. So add these helpers. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Item heplers for 8bit values are needed, let's add them. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Zhu Yanjun 提交于
The function bond_info_query alwarys returns 0. As such, in the function bond_do_ioctl, it is not necessary to check the returned value. So the interface type of the function bond_info_query is changed to void. The redundant check is removed. Signed-off-by: NZhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Josef Bacik diagnosed following problem : I was seeing random disconnects while testing NBD over loopback. This turned out to be because NBD sets pfmemalloc on it's socket, however the receiving side is a user space application so does not have pfmemalloc set on its socket. This means that sk_filter_trim_cap will simply drop this packet, under the assumption that the other side will simply retransmit. Well we do retransmit, and then the packet is just dropped again for the same reason. It seems the better way to address this problem is to clear pfmemalloc in the TCP transmit path. pfmemalloc strict control really makes sense on the receive path. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NJosef Bacik <jbacik@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-