- 19 3月, 2016 3 次提交
-
-
由 Daniel Borkmann 提交于
eBPF defines this as BPF_TUNLEN_MAX and OVS just uses the hard-coded value inside struct sw_flow_key. Thus, add and use IP_TUNNEL_OPTS_MAX for this, which makes the code a bit more generic and allows to remove BPF_TUNLEN_MAX from eBPF code. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Simon Horman 提交于
Currently output of MPLS packets on tunnel vports is not allowed by Open vSwitch. This is because historically encapsulation was done in such a way that the inner_protocol field of the skb needed to hold the inner protocol for both MPLS and tunnel encapsulation in order for GSO segmentation to be performed correctly. Since b2acd1dc ("openvswitch: Use regular GRE net_device instead of vport") Open vSwitch makes use of lwt to output to tunnel netdevs which perform encapsulation. As no drivers expose support for MPLS offloads this means that GSO packets are segmented in software by validate_xmit_skb(), which is called from __dev_queue_xmit(), before tunnel encapsulation occurs. This means that the inner protocol of MPLS is no longer needed by the time encapsulation occurs and the contention on the inner_protocol field of the skb no longer occurs. Thus it is now safe to output MPLS to tunnel vports. Signed-off-by: NSimon Horman <simon.horman@netronome.com> Reviewed-by: NJesse Gross <jesse@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wu Fengguang 提交于
Signed-off-by: NFengguang Wu <fengguang.wu@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 3月, 2016 7 次提交
-
-
由 Jarno Rajahalme 提交于
Extend OVS conntrack interface to cover NAT. New nested OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action. A bare OVS_CT_ATTR_NAT only mangles existing and expected connections. If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested attributes, new (non-committed/non-confirmed) connections are mangled according to the rest of the nested attributes. The corresponding OVS userspace patch series includes test cases (in tests/system-traffic.at) that also serve as example uses. This work extends on a branch by Thomas Graf at https://github.com/tgraf/ovs/tree/nat. Signed-off-by: NJarno Rajahalme <jarno@ovn.org> Acked-by: NThomas Graf <tgraf@suug.ch> Acked-by: NJoe Stringer <joe@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Jarno Rajahalme 提交于
There is no need to help connections that are not confirmed, so we can delay helping new connections to the time when they are confirmed. This change is needed for NAT support, and having this as a separate patch will make the following NAT patch a bit easier to review. Signed-off-by: NJarno Rajahalme <jarno@ovn.org> Acked-by: NJoe Stringer <joe@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Jarno Rajahalme 提交于
Repeat the nf_conntrack_in() call when it returns NF_REPEAT. This avoids dropping a SYN packet re-opening an existing TCP connection. Signed-off-by: NJarno Rajahalme <jarno@ovn.org> Acked-by: NJoe Stringer <joe@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Jarno Rajahalme 提交于
Add a new function ovs_ct_find_existing() to find an existing conntrack entry for which this packet was already applied to. This is only to be called when there is evidence that the packet was already tracked and committed, but we lost the ct reference due to an userspace upcall. ovs_ct_find_existing() is called from skb_nfct_cached(), which can now hide the fact that the ct reference may have been lost due to an upcall. This allows ovs_ct_commit() to be simplified. This patch is needed by later "openvswitch: Interface with NAT" patch, as we need to be able to pass the packet through NAT using the original ct reference also after the reference is lost after an upcall. Signed-off-by: NJarno Rajahalme <jarno@ovn.org> Acked-by: NJoe Stringer <joe@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Jarno Rajahalme 提交于
Only a successful nf_conntrack_in() call can effect a connection state change, so it suffices to update the key only after the nf_conntrack_in() returns. This change is needed for the later NAT patches. Signed-off-by: NJarno Rajahalme <jarno@ovn.org> Acked-by: NJoe Stringer <joe@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Jarno Rajahalme 提交于
This makes the code easier to understand and the following patches more focused. Signed-off-by: NJarno Rajahalme <jarno@ovn.org> Acked-by: NJoe Stringer <joe@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Jarno Rajahalme 提交于
Remove the definition of IP_CT_NEW_REPLY from the kernel as it does not make sense. This allows the definition of IP_CT_NUMBER to be simplified as well. Signed-off-by: NJarno Rajahalme <jarno@ovn.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 14 3月, 2016 1 次提交
-
-
由 Samuel Gauthier 提交于
When we want to change a flow using netlink, we have to identify it to be able to perform a lookup. Both the flow key and unique flow ID (ufid) are valid identifiers, but we always have to specify the flow key in the netlink message. When both attributes are there, the ufid is used. The flow key is used to validate the actions provided by the userland. This commit allows to use the ufid without having to provide the flow key, as it is already done in the netlink 'flow get' and 'flow del' path. The flow key remains mandatory when an action is provided. Signed-off-by: NSamuel Gauthier <samuel.gauthier@6wind.com> Reviewed-by: NSimon Horman <simon.horman@netronome.com> Acked-by: NPravin B Shelar <pshelar@ovn.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 3月, 2016 1 次提交
-
-
由 Paolo Abeni 提交于
This patch implements bookkeeping support to compute the maximum headroom for all the devices in each datapath. When said value changes, the underlying devs are notified via the ndo_set_rx_headroom method. This also increases the internal vports xmit performance. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 2月, 2016 2 次提交
-
-
由 Daniel Borkmann 提交于
Replace individual implementations with the recently introduced skb_postpush_rcsum() helper. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Acked-by: NTom Herbert <tom@herbertland.com> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
the commit 35e2d115 ("tunnels: Allow IPv6 UDP checksums to be correctly controlled.") changed the default xmit checksum setting for lwt vxlan/geneve ipv6 tunnels, so that now the checksum is not set into external UDP header. This commit changes the rx checksum setting for both lwt vxlan/geneve devices created by openvswitch accordingly, so that lwt over ipv6 tunnel pairs are again able to communicate with default values. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Acked-by: NJiri Benc <jbenc@redhat.com> Acked-by: NJesse Gross <jesse@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 2月, 2016 2 次提交
-
-
由 Florian Westphal 提交于
This reverts commit bb9b18fb ("genl: Add genlmsg_new_unicast() for unicast message allocation")'. Nothing wrong with it; its no longer needed since this was only for mmapped netlink support. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
revert commit 795449d8 ("openvswitch: Enable memory mapped Netlink i/o"). Following the mmaped netlink removal this code can be removed. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 2月, 2016 1 次提交
-
-
由 Paolo Abeni 提交于
In case of UDP traffic with datagram length below MTU this give about 2% performance increase when tunneling over ipv4 and about 60% when tunneling over ipv6 Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Suggested-and-acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 2月, 2016 1 次提交
-
-
由 Tycho Andersen 提交于
Operations with the GENL_ADMIN_PERM flag fail permissions checks because this flag means we call netlink_capable, which uses the init user ns. Instead, let's introduce a new flag, GENL_UNS_ADMIN_PERM for operations which should be allowed inside a user namespace. The motivation for this is to be able to run openvswitch in unprivileged containers. I've tested this and it seems to work, but I really have no idea about the security consequences of this patch, so thoughts would be much appreciated. v2: use the GENL_UNS_ADMIN_PERM flag instead of a check in each function v3: use separate ifs for UNS_ADMIN_PERM and ADMIN_PERM, instead of one massive one Reported-by: NJames Page <james.page@canonical.com> Signed-off-by: NTycho Andersen <tycho.andersen@canonical.com> CC: Eric Biederman <ebiederm@xmission.com> CC: Pravin Shelar <pshelar@ovn.org> CC: Justin Pettit <jpettit@nicira.com> CC: "David S. Miller" <davem@davemloft.net> Acked-by: NPravin B Shelar <pshelar@ovn.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 2月, 2016 1 次提交
-
-
由 David Wragg 提交于
Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could transmit vxlan packets of any size, constrained only by the ability to send out the resulting packets. 4.3 introduced netdevs corresponding to tunnel vports. These netdevs have an MTU, which limits the size of a packet that can be successfully encapsulated. The default MTU values are low (1500 or less), which is awkwardly small in the context of physical networks supporting jumbo frames, and leads to a conspicuous change in behaviour for userspace. Instead, set the MTU on openvswitch-created netdevs to be the relevant maximum (i.e. the maximum IP packet size minus any relevant overhead), effectively restoring the behaviour prior to 4.3. Signed-off-by: NDavid Wragg <david@weave.works> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 1月, 2016 1 次提交
-
-
由 Hannes Frederic Sowa 提交于
It was seen that defective configurations of openvswitch could overwrite the STACK_END_MAGIC and cause a hard crash of the kernel because of too many recursions within ovs. This problem arises due to the high stack usage of openvswitch. The rest of the kernel is fine with the current limit of 10 (RECURSION_LIMIT). We use the already existing recursion counter in ovs_execute_actions to implement an upper bound of 5 recursions. Cc: Pravin Shelar <pshelar@ovn.org> Cc: Simon Horman <simon.horman@netronome.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Simon Horman <simon.horman@netronome.com> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 1月, 2016 1 次提交
-
-
由 Konstantin Khlebnikov 提交于
Skb_gso_segment() uses skb control block during segmentation. This patch adds 32-bytes room for previous control block which will be copied into all resulting segments. This patch fixes kernel crash during fragmenting forwarded packets. Fragmentation requires valid IP CB in skb for clearing ip options. Also patch removes custom save/restore in ovs code, now it's redundant. Signed-off-by: NKonstantin Khlebnikov <koct9i@gmail.com> Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.comSigned-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 1月, 2016 3 次提交
-
-
由 Jean Sacren 提交于
commit be4ace6e ("openvswitch: Move dev pointer into vport itself") The commit above added @dev and moved @rcu to the bottom of struct vport, but the change was not reflected in the kernel doc. So let's update the kernel doc as well. Signed-off-by: NJean Sacren <sakiwit@gmail.com> Cc: Thomas Graf <tgraf@suug.ch> Acked-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jean Sacren 提交于
commit 6b001e68 ("openvswitch: Use Geneve device.") The commit above introduced 'port_no' as the name for the member of struct geneve_port. The correct name should be 'dst_port' as described in the kernel doc. Let's fix that member name and all the pertinent instances so that both doc and code would be consistent. Signed-off-by: NJean Sacren <sakiwit@gmail.com> Acked-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jean Sacren 提交于
commit 6b001e68 ("openvswitch: Use Geneve device.") The commit above deleted the only call site of ovs_tunnel_route_lookup() and now that function is not used any more. So let's delete the function definition as well. Signed-off-by: NJean Sacren <sakiwit@gmail.com> Acked-by: NThomas Graf <tgraf@suug.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 12月, 2015 1 次提交
-
-
由 Joe Stringer 提交于
Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a reference leak on helper objects, but inadvertently introduced a leak on the ct template. Previously, ct_info.ct->general.use was initialized to 0 by nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action() returned successful. If an error occurred while adding the helper or adding the action to the actions buffer, the __ovs_ct_free_action() cleanup would use nf_ct_put() to free the entry; However, this relies on atomic_dec_and_test(ct_info.ct->general.use). This reference must be incremented first, or nf_ct_put() will never free it. Fix the issue by acquiring a reference to the template immediately after allocation. Fixes: cae3a262 ("openvswitch: Allow attaching helpers to ct action") Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak") Signed-off-by: NJoe Stringer <joe@ovn.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 12月, 2015 1 次提交
-
-
由 Simon Horman 提交于
In a set action tunnel attributes should be encoded in a nested action. I noticed this because ovs-dpctl was reporting an error when dumping flows due to the incorrect encoding of tunnel attributes in a set action. Fixes: fc4099f1 ("openvswitch: Fix egress tunnel info.") Signed-off-by: NSimon Horman <simon.horman@netronome.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 12月, 2015 2 次提交
-
-
由 Joe Stringer 提交于
If userspace executes ct(zone=1), and the connection tracker determines that the packet is invalid, then the ct_zone flow key field is populated with the default zone rather than the zone that was specified. Even though connection tracking failed, this field should be updated with the value that the action specified. Fix the issue. Fixes: 7f8a436e ("openvswitch: Add conntrack action") Signed-off-by: NJoe Stringer <joe@ovn.org> Acked-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Stringer 提交于
If the actions (re)allocation fails, or the actions list is larger than the maximum size, and the conntrack action is the last action when these problems are hit, then references to helper modules may be leaked. Fix the issue. Fixes: cae3a262 ("openvswitch: Allow attaching helpers to ct action") Signed-off-by: NJoe Stringer <joe@ovn.org> Acked-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 12月, 2015 3 次提交
-
-
由 Paolo Abeni 提交于
Each openvswitch tunnel vport (vxlan,gre,geneve) holds a reference to the underlying tunnel device, but never released it when such device is deleted. Deleting the underlying device via the ip tool cause the kernel to hangup in the netdev_wait_allrefs() loop. This commit ensure that on device unregistration dp_detach_port_notify() is called for all vports that hold the device reference, properly releasing it. Fixes: 614732ea ("openvswitch: Use regular VXLAN net_device device") Fixes: b2acd1dc ("openvswitch: Use regular GRE net_device instead of vport") Fixes: 6b001e68 ("openvswitch: Use Geneve device.") Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Acked-by: NFlavio Leitner <fbl@sysclose.org> Acked-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Sometimes the drivers and other code would find it handy to know some internal information about upper device being changed. So allow upper-code to pass information down to notifier listeners during linking. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Eliminate netdev_master_upper_dev_link_private and pass priv directly as a parameter of netdev_master_upper_dev_link. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 12月, 2015 1 次提交
-
-
由 Paolo Abeni 提交于
After 614732ea, no refcount is maintained for the vport-vxlan module. This allows the userspace to remove such module while vport-vxlan devices still exist, which leads to later oops. v1 -> v2: - move vport 'owner' initialization in ovs_vport_ops_register() and make such function a macro Fixes: 614732ea ("openvswitch: Use regular VXLAN net_device device") Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 11月, 2015 1 次提交
-
-
由 Aaron Conole 提交于
During pre-upstream development, the openvswitch datapath used a custom hashtable to store vports that could fail on delete due to lack of memory. However, prior to upstream submission, this code was reworked to use an hlist based hastable with flexible-array based buckets. As such the failure condition was eliminated from the vport_del path, rendering this comment invalid. Signed-off-by: NAaron Conole <aconole@bytheb.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 11月, 2015 2 次提交
-
-
由 Florian Westphal 提交于
The previous patch changed nf_ct_frag6_gather() to morph reassembled skb with the previous one. This means that the return value is always NULL or the skb argument. So change it to an err value. Instead of invoking NF_HOOK recursively with threshold to skip already-called hooks we can now just return NF_ACCEPT to move on to the next hook except for -EINPROGRESS (which means skb has been queued for reassembly), in which case we return NF_STOLEN. Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Florian Westphal 提交于
commit 6aafeef0 ("netfilter: push reasm skb through instead of original frag skbs") changed ipv6 defrag to not use the original skbs anymore. So rather than keeping the original skbs around just to discard them afterwards just use the original skbs directly for the fraglist of the newly assembled skb and remove the extra clone/free operations. The skb that completes the fragment queue is morphed into a the reassembled one instead, just like ipv4 defrag. openvswitch doesn't need any additional skb_morph magic anymore to deal with this situation so just remove that. A followup patch can then also remove the NF_HOOK (re)invocation in the ipv6 netfilter defrag hook. Cc: Joe Stringer <joestringer@nicira.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 28 10月, 2015 2 次提交
-
-
由 Joe Stringer 提交于
nf_ct_frag6_gather() makes a clone of each skb passed to it, and if the reassembly is successful, expects the caller to free all of the original skbs using nf_ct_frag6_consume_orig(). This call was previously missing, meaning that the original fragments were never freed (with the exception of the last fragment to arrive). Fix this by ensuring that all original fragments except for the last fragment are freed via nf_ct_frag6_consume_orig(). The last fragment will be morphed into the head, so it must not be freed yet. Furthermore, retain the ->next pointer for the head after skb_morph(). Fixes: 7f8a436e ("openvswitch: Add conntrack action") Reported-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NJoe Stringer <joestringer@nicira.com> Acked-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Joe Stringer 提交于
If ip_defrag() returns an error other than -EINPROGRESS, then the skb is freed. When handle_fragments() passes this back up to do_execute_actions(), it will be freed again. Prevent this double free by never freeing the skb in do_execute_actions() for errors returned by ovs_ct_execute. Always free it in ovs_ct_execute() error paths instead. Fixes: 7f8a436e ("openvswitch: Add conntrack action") Reported-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NJoe Stringer <joestringer@nicira.com> Acked-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 10月, 2015 1 次提交
-
-
由 Pravin B Shelar 提交于
While transitioning to netdev based vport we broke OVS feature which allows user to retrieve tunnel packet egress information for lwtunnel devices. Following patch fixes it by introducing ndo operation to get the tunnel egress info. Same ndo operation can be used for lwtunnel devices and compat ovs-tnl-vport devices. So after adding such device operation we can remove similar operation from ovs-vport. Fixes: 614732ea ("openvswitch: Use regular VXLAN net_device device"). Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 10月, 2015 2 次提交
-
-
由 Pravin B Shelar 提交于
With use of lwtunnel, we can directly call dev_queue_xmit() rather than calling netdev vport send operation. Following change make tunnel vport code bit cleaner. Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Acked-by: NThomas Graf <tgraf@suug.ch> Acked-by: NJiri Benc <jbenc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pravin B Shelar 提交于
Patch fixes following sparse warning. net/openvswitch/flow_netlink.c:583:30: warning: incorrect type in assignment (different base types) net/openvswitch/flow_netlink.c:583:30: expected restricted __be16 [usertype] ipv4 net/openvswitch/flow_netlink.c:583:30: got int Fixes: 6b26ba3a ("openvswitch: netlink attributes for IPv6 tunneling") Signed-off-by: NPravin B Shelar <pshelar@nicira.com> Acked-by: NThomas Graf <tgraf@suug.ch> Acked-by: NJiri Benc <jbenc@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-