- 07 2月, 2017 1 次提交
-
-
由 Vivien Didelot 提交于
Add a notifier block per DSA switch, registered against a notifier head in the switch fabric they belong to. This infrastructure will allow to propagate fabric-wide events such as port bridging, VLAN configuration, etc. If a DSA switch driver cares about cross-chip configuration, such events can be caught. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 2月, 2017 2 次提交
-
-
由 David Ahern 提交于
Change ip6_route_multipath_add to send one notifciation with the full route encoded with RTA_MULTIPATH instead of a series of individual routes. This is done by adding a skip_notify flag to the nl_info struct. The flag is used to skip sending of the notification in the fib code that actually inserts the route. Once the full route has been added, a notification is generated with all nexthops. ip6_route_multipath_add handles 3 use cases: new routes, route replace, and route append. The multipath notification generated needs to be consistent with the order of the nexthops and it should be consistent with the order in a FIB dump which means the route with the first nexthop needs to be used as the route reference. For the first 2 cases (new and replace), a reference to the route used to send the notification is obtained by saving the first route added. For the append case, the last route added is used to loop back to its first sibling route which is the first nexthop in the multipath route. Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
IPv4 allows multipath routes to be deleted using just the prefix and length. For example: $ ip ro ls vrf red unreachable default metric 8192 1.1.1.0/24 nexthop via 10.100.1.254 dev eth1 weight 1 nexthop via 10.11.200.2 dev eth11.200 weight 1 10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3 10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3 $ ip ro del 1.1.1.0/24 vrf red $ ip ro ls vrf red unreachable default metric 8192 10.11.200.0/24 dev eth11.200 proto kernel scope link src 10.11.200.3 10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.3 The same notation does not work with IPv6 because of how multipath routes are implemented for IPv6. For IPv6 only the first nexthop of a multipath route is deleted if the request contains only a prefix and length. This leads to unnecessary complexity in userspace dealing with IPv6 multipath routes. This patch allows all nexthops to be deleted without specifying each one in the delete request. Internally, this is done by walking the sibling list of the route matching the specifications given (prefix, length, metric, protocol, etc). $ ip -6 ro ls vrf red 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium 2001:db8:200::/120 via 2001:db8:1::2 dev eth1 metric 1024 pref medium 2001:db8:200::/120 via 2001:db8:2::2 dev eth2 metric 1024 pref medium ... $ ip -6 ro del vrf red 2001:db8:200::/120 $ ip -6 ro ls vrf red 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium ... Because IPv6 allows individual nexthops to be deleted without deleting the entire route, the ip6_route_multipath_del and non-multipath code path (ip6_route_del) have to be discriminated so that all nexthops are only deleted for the latter case. This is done by making the existing fc_type in fib6_config a u16 and then adding a new u16 field with fc_delete_all_nh as the first bit. Suggested-by: NDinesh Dutt <ddutt@cumulusnetworks.com> Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 2月, 2017 5 次提交
-
-
由 Jiri Pirko 提交于
The driver that offloads flower rules needs to know with which priority user inserted the rules. So add this information into offload struct. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Roopa Prabhu 提交于
New ip_tunnel_info flag to represent bridged tunnel metadata. Used by bridge driver later in the series to pass per vlan dst metadata to bridge ports. Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yotam Gigi 提交于
Use the encode/decode functionality from the ife module instead of using implementation inside the act_ife. Reviewed-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NYotam Gigi <yotamg@mellanox.com> Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NRoman Mashak <mrv@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yotam Gigi 提交于
This module is responsible for the ife encapsulation protocol encode/decode logics. That module can: - ife_encode: encode skb and reserve space for the ife meta header - ife_decode: decode skb and extract the meta header size - ife_tlv_meta_encode - encodes one tlv entry into the reserved ife header space. - ife_tlv_meta_decode - decodes one tlv entry from the packet - ife_tlv_meta_next - advance to the next tlv Reviewed-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NYotam Gigi <yotamg@mellanox.com> Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NRoman Mashak <mrv@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yotam Gigi 提交于
As the function ife_tlv_meta_encode is not used by any other module, unexport it and make it static for the act_ife module. Signed-off-by: NYotam Gigi <yotamg@mellanox.com> Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NRoman Mashak <mrv@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 2月, 2017 8 次提交
-
-
由 Michal Kubeček 提交于
Commit 69b34fb9 ("netfilter: xt_LOG: add net namespace support for xt_LOG") disabled logging packets using the LOG target from non-init namespaces. The motivation was to prevent containers from flooding kernel log of the host. The plan was to keep it that way until syslog namespace implementation allows containers to log in a safe way. However, the work on syslog namespace seems to have hit a dead end somewhere in 2013 and there are users who want to use xt_LOG in all network namespaces. This patch allows to do so by setting /proc/sys/net/netfilter/nf_log_all_netns to a nonzero value. This sysctl is only accessible from init_net so that one cannot switch the behaviour from inside a container. Signed-off-by: NMichal Kubecek <mkubecek@suse.cz> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 David Windsor 提交于
Currently, the ip_vs_dest cache frees ip_vs_dest objects when their reference count becomes < 0. Aside from not being semantically sound, this is problematic for the new type refcount_t, which will be introduced shortly in a separate patch. refcount_t is the new kernel type for holding reference counts, and provides overflow protection and a constrained interface relative to atomic_t (the type currently being used for kernel reference counts). Per Julian Anastasov: "The problem is that dest_trash currently holds deleted dests (unlinked from RCU lists) with refcnt=0." Changing dest_trash to hold dest with refcnt=1 will allow us to free ip_vs_dest structs when their refcnt=0, in ip_vs_dest_put_and_free(). Signed-off-by: NDavid Windsor <dwindsor@gmail.com> Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
After this change conntrack operations (lookup, creation, matching from ruleset) only access one instead of two sk_buff cache lines. This works for normal conntracks because those are allocated from a slab that guarantees hw cacheline or 8byte alignment (whatever is larger) so the 3 bits needed for ctinfo won't overlap with nf_conn addresses. Template allocation now does manual address alignment (see previous change) on arches that don't have sufficent kmalloc min alignment. Some spots intentionally use skb->_nfct instead of skb_nfct() helpers, this is to avoid undoing the skb_nfct() use when we remove untracked conntrack object in the future. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
The next change will merge skb->nfct pointer and skb->nfctinfo status bits into single skb->_nfct (unsigned long) area. For this to work nf_conn addresses must always be aligned at least on an 8 byte boundary since we will need the lower 3bits to store nfctinfo. Conntrack templates are allocated via kmalloc. kbuild test robot reported BUILD_BUG_ON failed: NFCT_INFOMASK >= ARCH_KMALLOC_MINALIGN on v1 of this patchset, so not all platforms meet this requirement. Do manual alignment if needed, the alignment offset is stored in the nf_conn entry protocol area. This works because templates are not handed off to L4 protocol trackers. Reported-by: Nkbuild test robot <fengguang.wu@intel.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Add a helper to assign a nf_conn entry and the ctinfo bits to an sk_buff. This avoids changing code in followup patch that merges skb->nfct and skb->nfctinfo into skb->_nfct. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Followup patch renames skb->nfct and changes its type so add a helper to avoid intrusive rename change later. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
Next patch makes direct skb->nfct access illegal, reduce noise in next patch by using accessors we already have. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
It is never accessed for reading and the only places that write to it are the icmp(6) handlers, which also set skb->nfct (and skb->nfctinfo). The conntrack core specifically checks for attached skb->nfct after ->error() invocation and returns early in this case. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 01 2月, 2017 1 次提交
-
-
由 Dimitris Michailidis 提交于
ip6_make_flowlabel() determines the flow label for IPv6 packets. It's supposed to be passed a flow label, which it returns as is if non-0 and in some other cases, otherwise it calculates a new value. The problem is callers often pass a flowi6.flowlabel, which may also contain traffic class bits. If the traffic class is non-0 ip6_make_flowlabel() mistakes the non-0 it gets as a flow label and returns the whole thing. Thus it can return a 'flow label' longer than 20b and the low 20b of that is typically 0 resulting in packets with 0 label. Moreover, different packets of a flow may be labeled differently. For a TCP flow with ECN non-payload and payload packets get different labels as exemplified by this pair of consecutive packets: (pure ACK) Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0) .... .... .... 0001 1100 1110 0100 1001 = Flow Label: 0x1ce49 Payload Length: 32 Next Header: TCP (6) (payload) Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0)) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2) .... .... .... 0000 0000 0000 0000 0000 = Flow Label: 0x00000 Payload Length: 688 Next Header: TCP (6) This patch allows ip6_make_flowlabel() to be passed more than just a flow label and has it extract the part it really wants. This was simpler than modifying the callers. With this patch packets like the above become Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0) .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e Payload Length: 32 Next Header: TCP (6) Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2:: 0110 .... = Version: 6 .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0)) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2) .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e Payload Length: 688 Next Header: TCP (6) Signed-off-by: NDimitris Michailidis <dmichail@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 1月, 2017 4 次提交
-
-
由 Florian Fainelli 提交于
Add necessary plumbing at the slave network device level to have switch drivers implement ndo_setup_tc() and most particularly the cls_matchall classifier. We add support for two switch operations: port_add_mirror and port_del_mirror() which configure, on a per-port basis the mirror parameters requested from the cls_matchall classifier. Code is largely borrowed from the Mellanox Spectrum switch driver. Reviewed-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Nothing about lwt state requires a device reference, so remove the input argument. Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Robert Shearman 提交于
Packets arriving in a VRF currently are delivered to UDP sockets that aren't bound to any interface. TCP defaults to not delivering packets arriving in a VRF to unbound sockets. IP route lookup and socket transmit both assume that unbound means using the default table and UDP applications that haven't been changed to be aware of VRFs may not function correctly in this case since they may not be able to handle overlapping IP address ranges, or be able to send packets back to the original sender if required. So add a sysctl, udp_l3mdev_accept, to control this behaviour with it being analgous to the existing tcp_l3mdev_accept, namely to allow a process to have a VRF-global listen socket. Have this default to off as this is the behaviour that users will expect, given that there is no explicit mechanism to set unmodified VRF-unaware application into a default VRF. Signed-off-by: NRobert Shearman <rshearma@brocade.com> Acked-by: NDavid Ahern <dsa@cumulusnetworks.com> Tested-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Fainelli 提交于
In preparation for adding support for CFP/TCAMP in the bcm_sf2 driver add the plumbing to call into driver specific {get,set}_rxnfc operations. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 1月, 2017 4 次提交
-
-
由 Vivien Didelot 提交于
Upon reception of the NETDEV_CHANGEUPPER, a leaving port is already unbridged, so reflect this by assigning the port's bridge_dev pointer to NULL before calling the port_bridge_leave DSA driver operation. Now that the bridge_dev pointer is exposed to the drivers, reflecting the current state of the DSA switch fabric is necessary for the drivers to adjust their port based VLANs correctly. Pass the bridge device pointer to the port_bridge_leave operation so that drivers have all information to re-program their chips properly, and do not need to cache it anymore. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vivien Didelot 提交于
Move the bridge_dev pointer from dsa_slave_priv to dsa_port so that DSA drivers can access this information and remove the need to cache it. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vivien Didelot 提交于
Add the physical switch instance and port index a DSA port belongs to to the dsa_port structure. That can be used later to retrieve information about a physical port when configuring a switch fabric, or lighten up struct dsa_slave_priv. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vivien Didelot 提交于
Change the ports[DSA_MAX_PORTS] array of the dsa_switch structure for a zero-length array, allocated at the same time as the dsa_switch structure itself. A dsa_switch_alloc() helper is provided for that. This commit brings no functional change yet since we pass DSA_MAX_PORTS as the number of ports for the moment. Future patches can update the DSA drivers separately to support dynamic number of ports. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 1月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
Slava Shwartsman reported a warning in skb_try_coalesce(), when we detect skb->truesize is completely wrong. In his case, issue came from IPv6 reassembly coping with malicious datagrams, that forced various pskb_may_pull() to reallocate a bigger skb->head than the one allocated by NIC driver before entering GRO layer. Current code does not change skb->truesize, leaving this burden to callers if they care enough. Blindly changing skb->truesize in pskb_expand_head() is not easy, as some producers might track skb->truesize, for example in xmit path for back pressure feedback (sk->sk_wmem_alloc) We can detect the cases where it should be safe to change skb->truesize : 1) skb is not attached to a socket. 2) If it is attached to a socket, destructor is sock_edemux() My audit gave only two callers doing their own skb->truesize manipulation. I had to remove skb parameter in sock_edemux macro when CONFIG_INET is not set to avoid a compile error. Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NSlava Shwartsman <slavash@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 1月, 2017 3 次提交
-
-
由 Pablo Neira 提交于
Unlike ipv4, this control socket is shared by all cpus so we cannot use it as scratchpad area to annotate the mark that we pass to ip6_xmit(). Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket family caches the flowi6 structure in the sctp_transport structure, so we cannot use to carry the mark unless we later on reset it back, which I discarded since it looks ugly to me. Fixes: bf99b4de ("tcp: fix mark propagation with fwmark_reflect enabled") Suggested-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Felix Jia 提交于
The address generation mode for IPv6 link-local can only be configured by netlink messages. This patch adds the ability to change the address generation mode via sysctl. v1 -> v2 Removed the rtnl lock and switch to use RCU lock to iterate through the netdev list. v2 -> v3 Removed the addrgenmode variable from the idev structure and use the systcl storage for the flag. Simplifed the logic for sysctl handling by removing the supported for all operation. Added support for more types of tunnel interfaces for link-local address generation. Based the patches from net-next. v3 -> v4 Removed unnecessary whitespace changes. Signed-off-by: NFelix Jia <felix.jia@alliedtelesis.co.nz> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Fainelli 提交于
In preparation for allowing dsa_register_switch() to be supplied with device/platform data, pass down a struct device pointer instead of a struct device_node. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 1月, 2017 5 次提交
-
-
由 Andrew Lunn 提交于
Previous patches have moved the temperature sensor code into the Marvell PHYs. A few now dead references to NET_DSA_HWMON were left behind. Go reap them. Reported-by: NValentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willy Tarreau 提交于
Without TFO, any subsequent connect() call after a successful one returns -1 EISCONN. The last API update ensured that __inet_stream_connect() can return -1 EINPROGRESS in response to sendmsg() when TFO is in use to indicate that the connection is now in progress. Unfortunately since this function is used both for connect() and sendmsg(), it has the undesired side effect of making connect() now return -1 EINPROGRESS as well after a successful call, while at the same time poll() returns POLLOUT. This can confuse some applications which happen to call connect() and to check for -1 EISCONN to ensure the connection is usable, and for which EINPROGRESS indicates a need to poll, causing a loop. This problem was encountered in haproxy where a call to connect() is precisely used in certain cases to confirm a connection's readiness. While arguably haproxy's behaviour should be improved here, it seems important to aim at a more robust behaviour when the goal of the new API is to make it easier to implement TFO in existing applications. This patch simply ensures that we preserve the same semantics as in the non-TFO case on the connect() syscall when using TFO, while still returning -1 EINPROGRESS on sendmsg(). For this we simply tell __inet_stream_connect() whether we're doing a regular connect() or in fact connecting for a sendmsg() call. Cc: Wei Wang <weiwan@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NWilly Tarreau <w@1wt.eu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Wang 提交于
This patch adds a new socket option, TCP_FASTOPEN_CONNECT, as an alternative way to perform Fast Open on the active side (client). Prior to this patch, a client needs to replace the connect() call with sendto(MSG_FASTOPEN). This can be cumbersome for applications who want to use Fast Open: these socket operations are often done in lower layer libraries used by many other applications. Changing these libraries and/or the socket call sequences are not trivial. A more convenient approach is to perform Fast Open by simply enabling a socket option when the socket is created w/o changing other socket calls sequence: s = socket() create a new socket setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN_CONNECT …); newly introduced sockopt If set, new functionality described below will be used. Return ENOTSUPP if TFO is not supported or not enabled in the kernel. connect() With cookie present, return 0 immediately. With no cookie, initiate 3WHS with TFO cookie-request option and return -1 with errno = EINPROGRESS. write()/sendmsg() With cookie present, send out SYN with data and return the number of bytes buffered. With no cookie, and 3WHS not yet completed, return -1 with errno = EINPROGRESS. No MSG_FASTOPEN flag is needed. read() Return -1 with errno = EWOULDBLOCK/EAGAIN if connect() is called but write() is not called yet. Return -1 with errno = EWOULDBLOCK/EAGAIN if connection is established but no msg is received yet. Return number of bytes read if socket is established and there is msg received. The new API simplifies life for applications that always perform a write() immediately after a successful connect(). Such applications can now take advantage of Fast Open by merely making one new setsockopt() call at the time of creating the socket. Nothing else about the application's socket call sequence needs to change. Signed-off-by: NWei Wang <weiwan@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Wang 提交于
Refactor the cookie check logic in tcp_send_syn_data() into a function. This function will be called else where in later changes. Signed-off-by: NWei Wang <weiwan@google.com> Acked-by: NEric Dumazet <edumazet@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jamal Hadi Salim 提交于
Introduce optional 128-bit action cookie. Like all other cookie schemes in the networking world (eg in protocols like http or existing kernel fib protocol field, etc) the idea is to save user state that when retrieved serves as a correlator. The kernel _should not_ intepret it. The user can store whatever they wish in the 128 bits. Sample exercise(showing variable length use of cookie) .. create an accept action with cookie a1b2c3d4 sudo $TC actions add action ok index 1 cookie a1b2c3d4 .. dump all gact actions.. sudo $TC -s actions ls action gact action order 0: gact action pass random type none pass val 0 index 1 ref 1 bind 0 installed 5 sec used 5 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 cookie a1b2c3d4 .. bind the accept action to a filter.. sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \ u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 1 ... send some traffic.. $ ping 127.0.0.1 -c 3 PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.020 ms 64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.027 ms 64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.038 ms Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 1月, 2017 5 次提交
-
-
由 Robert Shearman 提交于
Modules implementing lwtunnel ops should not be allowed to unload while there is state alive using those ops, so specify the owning module for all lwtunnel ops. Signed-off-by: NRobert Shearman <rshearma@brocade.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pablo Neira Ayuso 提交于
The flush operation needs to modify set and element objects, so let's deconstify this. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Yotam Gigi 提交于
This action allows the user to sample traffic matched by tc classifier. The sampling consists of choosing packets randomly and sampling them using the psample module. The user can configure the psample group number, the sampling rate and the packet's truncation (to save kernel-user traffic). Example: To sample ingress traffic from interface eth1, one may use the commands: tc qdisc add dev eth1 handle ffff: ingress tc filter add dev eth1 parent ffff: \ matchall action sample rate 12 group 4 Where the first command adds an ingress qdisc and the second starts sampling randomly with an average of one sampled packet per 12 packets on dev eth1 to psample group 4. Signed-off-by: NYotam Gigi <yotamg@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: NSimon Horman <simon.horman@netronome.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yotam Gigi 提交于
Add a general way for kernel modules to sample packets, without being tied to any specific subsystem. This netlink channel can be used by tc, iptables, etc. and allow to standardize packet sampling in the kernel. For every sampled packet, the psample module adds the following metadata fields: PSAMPLE_ATTR_IIFINDEX - the packets input ifindex, if applicable PSAMPLE_ATTR_OIFINDEX - the packet output ifindex, if applicable PSAMPLE_ATTR_ORIGSIZE - the packet's original size, in case it has been truncated during sampling PSAMPLE_ATTR_SAMPLE_GROUP - the packet's sample group, which is set by the user who initiated the sampling. This field allows the user to differentiate between several samplers working simultaneously and filter packets relevant to him PSAMPLE_ATTR_GROUP_SEQ - sequence counter of last sent packet. The sequence is kept for each group PSAMPLE_ATTR_SAMPLE_RATE - the sampling rate used for sampling the packets PSAMPLE_ATTR_DATA - the actual packet bits The sampled packets are sent to the PSAMPLE_NL_MCGRP_SAMPLE multicast group. In addition, add the GET_GROUPS netlink command which allows the user to see the current sample groups, their refcount and sequence number. This command currently supports only netlink dump mode. Signed-off-by: NYotam Gigi <yotamg@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NJamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: NSimon Horman <simon.horman@netronome.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Krister Johansen 提交于
Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl that denotes the first unprivileged inet port in the namespace. To disable all privileged ports set this to zero. It also checks for overlap with the local port range. The privileged and local range may not overlap. The use case for this change is to allow containerized processes to bind to priviliged ports, but prevent them from ever being allowed to modify their container's network configuration. The latter is accomplished by ensuring that the network namespace is not a child of the user namespace. This modification was needed to allow the container manager to disable a namespace's priviliged port restrictions without exposing control of the network namespace to processes in the user namespace. Signed-off-by: NKrister Johansen <kjlx@templeofstupid.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 1月, 2017 1 次提交
-
-
由 Andrew Lunn 提交于
Only the Marvell mv88e6xxx DSA driver made use of the HWMON support in DSA. The temperature sensor registers are actually in the embedded PHYs, and the PHY driver now supports it. So remove all HWMON support from DSA and drivers. Signed-off-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-