提交 · 2790460e145d198fe6925d405301a49b2d8cb03f · openanolis / cloud-kernel

09 4月, 2015 1 次提交

vxlan: fix a shadow local variable · 2790460e

由 WANG Cong 提交于 4月 08, 2015

Commit 79b16aad
("udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb()")
introduce 'sk' but we already have one inner 'sk'.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2790460e

08 4月, 2015 1 次提交

udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb(). · 79b16aad

由 David Miller 提交于 4月 05, 2015

That was we can make sure the output path of ipv4/ipv6 operate on
the UDP socket rather than whatever random thing happens to be in
skb->sk.

Based upon a patch by Jiri Pirko.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>

79b16aad

02 4月, 2015 1 次提交

vxlan: correct spelling in comments · c4b49512

由 Simon Horman 提交于 4月 02, 2015

Fix some spelling / typos:
* droppped -> dropped
* asddress -> address
* compatbility -> compatibility
Signed-off-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c4b49512

01 4月, 2015 3 次提交

netlink: implement nla_get_in_addr and nla_get_in6_addr · 67b61f6c

由 Jiri Benc 提交于 3月 29, 2015

Those are counterparts to nla_put_in_addr and nla_put_in6_addr.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

67b61f6c

netlink: implement nla_put_in_addr and nla_put_in6_addr · 930345ea

由 Jiri Benc 提交于 3月 29, 2015

IP addresses are often stored in netlink attributes. Add generic functions
to do that.

For nla_put_in_addr, it would be nicer to pass struct in_addr but this is
not used universally throughout the kernel, in way too many places __be32 is
used to store IPv4 address.
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

930345ea

vxlan: fix indentation · f0ef3126

由 Jiri Benc 提交于 3月 29, 2015

Some lines in vxlan code are indented by 7 spaces instead of a tab.

Fixes: e4c7ed41 ("vxlan: add ipv6 support")
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f0ef3126

24 3月, 2015 1 次提交

vxlan: simplify if clause in dev_close · 24c0e683

由 Marcelo Ricardo Leitner 提交于 3月 23, 2015

Dan Carpenter's static checker warned that in vxlan_stop we are checking
if 'vs' can be NULL while later we simply derreference it.

As after commit 56ef9c90 ("vxlan: Move socket initialization to
within rtnl scope") 'vs' just cannot be NULL in vxlan_stop() anymore, as
the interface won't go up if the socket initialization fails. So we are
good to just remove the check and make it consistent.
Reported-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

24c0e683

21 3月, 2015 1 次提交

vxlan: fix possible use of uninitialized in vxlan_igmp_{join, leave} · 149d7549

由 Marcelo Ricardo Leitner 提交于 3月 20, 2015

Test robot noticed that we check the return of vxlan_igmp_join and leave
but inside them there was a path that it could be used initialized.

It's not really possible because those if() inside these igmp functions
would always match as we can't have sockets of other type in there, but
this way we keep the compiler happy.

Fixes: 56ef9c90 ("vxlan: Move socket initialization to within rtnl scope")
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

149d7549

19 3月, 2015 2 次提交

vxlan: Move socket initialization to within rtnl scope · 56ef9c90

由 Marcelo Ricardo Leitner 提交于 3月 18, 2015

Currently, if a multicast join operation fail, the vxlan interface will
be UP but not functional, without even a log message informing the user.

Now that we can grab socket lock after already having rntl, we don't
need to defer socket creation and multicast operations. By not deferring
we can do proper error reporting to the user through ip exit code.

This patch thus removes all deferred work that vxlan had and put it back
inline. Now the socket will only be created, bound and join multicast
group when one bring the interface up, and will undo all that as soon as
one put the interface down.

As vxlan_sock_hold() is not used after this patch, it was removed too.
Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

56ef9c90

ipv4, ipv6: kill ip_mc_{join, leave}_group and ipv6_sock_mc_{join, drop} · 54ff9ef3

由 Marcelo Ricardo Leitner 提交于 3月 18, 2015

in favor of their inner __ ones, which doesn't grab rtnl.

As these functions need to operate on a locked socket, we can't be
grabbing rtnl by then. It's too late and doing so causes reversed
locking.

So this patch:
- move rtnl handling to callers instead while already fixing some
  reversed locking situations, like on vxlan and ipvs code.
- renames __ ones to not have the __ mark:
  __ip_mc_{join,leave}_group -> ip_mc_{join,leave}_group
  __ipv6_sock_mc_{join,drop} -> ipv6_sock_mc_{join,drop}
Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

54ff9ef3

14 3月, 2015 1 次提交

vxlan: fix wrong usage of VXLAN_VID_MASK · 40fb70f3

由 Alexey Kodanev 提交于 3月 13, 2015

commit dfd8645e wrongly assumes that VXLAN_VDI_MASK includes
eight lower order reserved bits of VNI field that are using for remote
checksum offload.

Right now, when VNI number greater then 0xffff, vxlan_udp_encap_recv()
will always return with 'bad_flag' error, reducing the usable vni range
from 0..16777215 to 0..65535. Also, it doesn't really check whether RCO
bits processed or not.

Fix it by adding new VNI mask which has all 32 bits of VNI field:
24 bits for id and 8 bits for other usage.
Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

40fb70f3

13 3月, 2015 1 次提交

vxlan: Don't set s_addr in vxlan_create_sock · 719a11cd

由 Simon Horman 提交于 3月 13, 2015

In the case of AF_INET s_addr was set to INADDR_ANY (0) which which both
symmetric with the AF_INET6 case, where s_addr is not set, and unnecessary
as udp_conf is zeroed out earlier in the same function.

I suspect this change does not have any run-time effect due to compiler
optimisations. But it does make the code a little easier on the/my eyes.

Cc: Tom Herbert <therbert@google.com>
Signed-off-by: NSimon Horman <simon.horman@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

719a11cd

12 2月, 2015 3 次提交

vxlan: Use checksum partial with remote checksum offload · 0ace2ca8

由 Tom Herbert 提交于 2月 10, 2015

Change remote checksum handling to set checksum partial as default
behavior. Added an iflink parameter to configure not using
checksum partial (calling csum_partial to update checksum).
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0ace2ca8

net: Infrastructure for CHECKSUM_PARTIAL with remote checsum offload · 15e2396d

由 Tom Herbert 提交于 2月 10, 2015

This patch adds infrastructure so that remote checksum offload can
set CHECKSUM_PARTIAL instead of calling csum_partial and writing
the modfied checksum field.

Add skb_remcsum_adjust_partial function to set an skb for using
CHECKSUM_PARTIAL with remote checksum offload.  Changed
skb_remcsum_process and skb_gro_remcsum_process to take a boolean
argument to indicate if checksum partial can be set or the
checksum needs to be modified using the normal algorithm.
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

15e2396d

net: Fix remcsum in GRO path to not change packet · 26c4f7da

由 Tom Herbert 提交于 2月 10, 2015

Remote checksum offload processing is currently the same for both
the GRO and non-GRO path. When the remote checksum offload option
is encountered, the checksum field referred to is modified in
the packet. So in the GRO case, the packet is modified in the
GRO path and then the operation is skipped when the packet goes
through the normal path based on skb->remcsum_offload. There is
a problem in that the packet may be modified in the GRO path, but
then forwarded off host still containing the remote checksum option.
A remote host will again perform RCO but now the checksum verification
will fail since GRO RCO already modified the checksum.

To fix this, we ensure that GRO restores a packet to it's original
state before returning. In this model, when GRO processes a remote
checksum option it still changes the checksum per the algorithm
but on return from lower layer processing the checksum is restored
to its original value.

In this patch we add define gro_remcsum structure which is passed
to skb_gro_remcsum_process to save offset and delta for the checksum
being changed. After lower layer processing, skb_gro_remcsum_cleanup
is called to restore the checksum before returning from GRO.
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

26c4f7da

09 2月, 2015 1 次提交

vxlan: Wrong type passed to %pIS · a4870f79

由 Rasmus Villemoes 提交于 2月 07, 2015

src_ip is a pointer to a union vxlan_addr, one member of which is a
struct sockaddr. Passing a pointer to src_ip is wrong; one should pass
the value of src_ip itself. Since %pIS formally expects something of
type struct sockaddr*, let's pass a pointer to the appropriate union
member, though this of course doesn't change the generated code.

Fixes: e4c7ed41 ("vxlan: add ipv6 support")
Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a4870f79

05 2月, 2015 2 次提交

vxlan: Only set has-GBP bit in header if any other bits would be set · db79a621

由 Thomas Graf 提交于 2月 04, 2015

This allows for a VXLAN-GBP socket to talk to a Linux VXLAN socket by
not setting any of the bits.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

db79a621

net: add skb functions to process remote checksum offload · dcdc8994

由 Tom Herbert 提交于 2月 02, 2015

This patch adds skb_remcsum_process and skb_gro_remcsum_process to
perform the appropriate adjustments to the skb when receiving
remote checksum offload.

Updated vxlan and gue to use these functions.

Tested: Ran TCP_RR and TCP_STREAM netperf for VXLAN and GUE, did
not see any change in performance.
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dcdc8994

30 1月, 2015 1 次提交

vxlan: setup the right link netns in newlink hdlr · 33564bbb

由 Nicolas Dichtel 提交于 1月 26, 2015

Rename the netns to src_net to avoid confusion with the netns where the
interface stands. The user may specify IFLA_NET_NS_[PID|FD] to create
a x-netns netndevice: IFLA_NET_NS_[PID|FD] points to the netns where the
netdevice stands and src_net to the link netns.

Note that before commit f01ec1c0 ("vxlan: add x-netns support"), it was
possible to create a x-netns vxlan netdevice, but the netdevice was not
operational.

Fixes: f01ec1c0 ("vxlan: add x-netns support")
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

33564bbb

28 1月, 2015 1 次提交

vxlan: advertise link netns in fdb messages · 4967082b

由 Nicolas Dichtel 提交于 1月 26, 2015

Previous commit is based on a wrong assumption, fdb messages are always sent
into the netns where the interface stands (see vxlan_fdb_notify()).

These fdb messages doesn't embed the rtnl attribute IFLA_LINK_NETNSID, thus we
need to add it (useful to interpret NDA_IFINDEX or NDA_DST for example).

Note also that vxlan_nlmsg_size() was not updated.

Fixes: 193523bf ("vxlan: advertise netns of vxlan dev in fdb msg")
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4967082b

25 1月, 2015 2 次提交

vxlan: Eliminate dependency on UDP socket in transmit path · af33c1ad

由 Tom Herbert 提交于 1月 20, 2015

In the vxlan transmit path there is no need to reference the socket
for a tunnel which is needed for the receive side. We do, however,
need the vxlan_dev flags. This patch eliminate references
to the socket in the transmit path, and changes VXLAN_F_UNSHAREABLE
to be VXLAN_F_RCV_FLAGS. This mask is used to store the flags
applicable to receive (GBP, CSUM6_RX, and REMCSUM_RX) in the
vxlan_sock flags.
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

af33c1ad

udp: Do not require sock in udp_tunnel_xmit_skb · d998f8ef

由 Tom Herbert 提交于 1月 20, 2015

The UDP tunnel transmit functions udp_tunnel_xmit_skb and
udp_tunnel6_xmit_skb include a socket argument. The socket being
passed to the functions (from VXLAN) is a UDP created for receive
side. The only thing that the socket is used for in the transmit
functions is to get the setting for checksum (enabled or zero).
This patch removes the argument and and adds a nocheck argument
for checksum setting. This eliminates the unnecessary dependency
on a UDP socket for UDP tunnel transmit.
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d998f8ef

24 1月, 2015 1 次提交

vxlan: advertise netns of vxlan dev in fdb msg · 193523bf

由 Nicolas Dichtel 提交于 1月 20, 2015

Netlink FDB messages are sent in the link netns. The header of these messages
contains the ifindex (ndm_ifindex) of the netdevice, but this ifindex is
unusable in case of x-netns vxlan.
I named the new attribute NDA_NDM_IFINDEX_NETNSID, to avoid confusion with
NDA_IFINDEX.
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

193523bf

20 1月, 2015 1 次提交

tunnels: advertise link netns via netlink · 1728d4fa

由 Nicolas Dichtel 提交于 1月 15, 2015

Implement rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is
added to rtnetlink messages.
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1728d4fa

18 1月, 2015 1 次提交

netlink: make nlmsg_end() and genlmsg_end() void · 053c095a

由 Johannes Berg 提交于 1月 16, 2015

Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.

This makes the very common pattern of

  if (genlmsg_end(...) < 0) { ... }

be a whole bunch of dead code. Many places also simply do

  return nlmsg_end(...);

and the caller is expected to deal with it.

This also commonly (at least for me) causes errors, because it is very
common to write

  if (my_function(...))
    /* error condition */

and if my_function() does "return nlmsg_end()" this is of course wrong.

Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.

Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did

-	return nlmsg_end(...);
+	nlmsg_end(...);
+	return 0;

I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with <= 0 in dump functionality, but that could just
be changed to < 0 with no change in behaviour, so I opted for the more
efficient version.

One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for <0 or <=0 and thus broke out of the loop every single time.
I've preserved this since it will (I think) have caused the messages to
userspace to be formatted differently with just a single message for
every SKB returned to userspace. It's possible that this isn't needed
for the tools that actually use this, but I don't even know what they
are so couldn't test that changing this behaviour would be acceptable.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

053c095a

15 1月, 2015 4 次提交

vxlan: Only bind to sockets with compatible flags enabled · ac5132d1

由 Thomas Graf 提交于 1月 15, 2015

A VXLAN net_device looking for an appropriate socket may only consider
a socket which has a matching set of flags/extensions enabled. If
incompatible flags are enabled, return a conflict to have the caller
create a distinct socket with distinct port.

The OVS VXLAN port is kept unaware of extensions at this point.
Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ac5132d1

vxlan: Group Policy extension · 3511494c

由 Thomas Graf 提交于 1月 15, 2015

Implements supports for the Group Policy VXLAN extension [0] to provide
a lightweight and simple security label mechanism across network peers
based on VXLAN. The security context and associated metadata is mapped
to/from skb->mark. This allows further mapping to a SELinux context
using SECMARK, to implement ACLs directly with nftables, iptables, OVS,
tc, etc.

The group membership is defined by the lower 16 bits of skb->mark, the
upper 16 bits are used for flags.

SELinux allows to manage label to secure local resources. However,
distributed applications require ACLs to implemented across hosts. This
is typically achieved by matching on L2-L4 fields to identify the
original sending host and process on the receiver. On top of that,
netlabel and specifically CIPSO [1] allow to map security contexts to
universal labels.  However, netlabel and CIPSO are relatively complex.
This patch provides a lightweight alternative for overlay network
environments with a trusted underlay. No additional control protocol
is required.

           Host 1:                       Host 2:

      Group A        Group B        Group B     Group A
      +-----+   +-------------+    +-------+   +-----+
      | lxc |   | SELinux CTX |    | httpd |   | VM  |
      +--+--+   +--+----------+    +---+---+   +--+--+
	  \---+---/                     \----+---/
	      |                              |
	  +---+---+                      +---+---+
	  | vxlan |                      | vxlan |
	  +---+---+                      +---+---+
	      +------------------------------+

Backwards compatibility:
A VXLAN-GBP socket can receive standard VXLAN frames and will assign
the default group 0x0000 to such frames. A Linux VXLAN socket will
drop VXLAN-GBP  frames. The extension is therefore disabled by default
and needs to be specifically enabled:

   ip link add [...] type vxlan [...] gbp

In a mixed environment with VXLAN and VXLAN-GBP sockets, the GBP socket
must run on a separate port number.

Examples:
 iptables:
  host1# iptables -I OUTPUT -m owner --uid-owner 101 -j MARK --set-mark 0x200
  host2# iptables -I INPUT -m mark --mark 0x200 -j DROP

 OVS:
  # ovs-ofctl add-flow br0 'in_port=1,actions=load:0x200->NXM_NX_TUN_GBP_ID[],NORMAL'
  # ovs-ofctl add-flow br0 'in_port=2,tun_gbp_id=0x200,actions=drop'

[0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy
[1] http://lwn.net/Articles/204905/Signed-off-by: NThomas Graf <tgraf@suug.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3511494c

vxlan: Remote checksum offload · dfd8645e

由 Tom Herbert 提交于 1月 12, 2015

Add support for remote checksum offload in VXLAN. This uses a
reserved bit to indicate that RCO is being done, and uses the low order
reserved eight bits of the VNI to hold the start and offset values in a
compressed manner.

Start is encoded in the low order seven bits of VNI. This is start >> 1
so that the checksum start offset is 0-254 using even values only.
Checksum offset (transport checksum field) is indicated in the high
order bit in the low order byte of the VNI. If the bit is set, the
checksum field is for UDP (so offset = start + 6), else checksum
field is for TCP (so offset = start + 16). Only TCP and UDP are
supported in this implementation.

Remote checksum offload for VXLAN is described in:

https://tools.ietf.org/html/draft-herbert-vxlan-rco-00

Tested by running 200 TCP_STREAM connections with VXLAN (over IPv4).

With UDP checksums and Remote Checksum Offload
  IPv4
      Client
        11.84% CPU utilization
      Server
        12.96% CPU utilization
      9197 Mbps
  IPv6
      Client
        12.46% CPU utilization
      Server
        14.48% CPU utilization
      8963 Mbps

With UDP checksums, no remote checksum offload
  IPv4
      Client
        15.67% CPU utilization
      Server
        14.83% CPU utilization
      9094 Mbps
  IPv6
      Client
        16.21% CPU utilization
      Server
        14.32% CPU utilization
      9058 Mbps

No UDP checksums
  IPv4
      Client
        15.03% CPU utilization
      Server
        23.09% CPU utilization
      9089 Mbps
  IPv6
      Client
        16.18% CPU utilization
      Server
        26.57% CPU utilization
       8954 Mbps
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dfd8645e

udp: pass udp_offload struct to UDP gro callbacks · a2b12f3c

由 Tom Herbert 提交于 1月 12, 2015

This patch introduces udp_offload_callbacks which has the same
GRO functions (but not a GSO function) as offload_callbacks,
except there is an argument to a udp_offload struct passed to
gro_receive and gro_complete functions. This additional argument
can be used to retrieve the per port structure of the encapsulation
for use in gro processing (mostly by doing container_of on the
structure).
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a2b12f3c

14 1月, 2015 1 次提交

net: rename vlan_tx_* helpers since "tx" is misleading there · df8a39de

由 Jiri Pirko 提交于 1月 13, 2015

The same macros are used for rx as well. So rename it.
Signed-off-by: NJiri Pirko <jiri@resnulli.us>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

df8a39de

13 1月, 2015 1 次提交

vxlan: Improve support for header flags · 3bf39475

由 Tom Herbert 提交于 1月 08, 2015

This patch cleans up the header flags of VXLAN in anticipation of
defining some new ones:

- Move header related definitions from vxlan.c to vxlan.h
- Change VXLAN_FLAGS to be VXLAN_HF_VNI (only currently defined flag)
- Move check for unknown flags to after we find vxlan_sock, this
  assumes that some flags may be processed based on tunnel
  configuration
- Add a comment about why the stack treating unknown set flags as an
  error instead of ignoring them
Signed-off-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3bf39475

03 1月, 2015 1 次提交

net: Add Transparent Ethernet Bridging GRO support. · 9b174d88

由 Jesse Gross 提交于 12月 30, 2014

Currently the only tunnel protocol that supports GRO with encapsulated
Ethernet is VXLAN. This pulls out the Ethernet code into a proper layer
so that it can be used by other tunnel protocols such as GRE and Geneve.
Signed-off-by: NJesse Gross <jesse@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9b174d88

24 12月, 2014 1 次提交

vxlan: Fix double free of skb. · 74f47278

由 Pravin B Shelar 提交于 12月 23, 2014

In case of error vxlan_xmit_one() can free already freed skb.
Also fixes memory leak of dst-entry.

Fixes: acbf74a7 ("vxlan: Refactor vxlan driver to make use
of the common UDP tunnel functions").
Signed-off-by: NPravin B Shelar <pshelar@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

74f47278

12 12月, 2014 1 次提交

Fix race condition between vxlan_sock_add and vxlan_sock_release · 00c83b01

由 Marcelo Leitner 提交于 12月 11, 2014

Currently, when trying to reuse a socket, vxlan_sock_add will grab
vn->sock_lock, locate a reusable socket, inc refcount and release
vn->sock_lock.

But vxlan_sock_release() will first decrement refcount, and then grab
that lock. refcnt operations are atomic but as currently we have
deferred works which hold vs->refcnt each, this might happen, leading to
a use after free (specially after vxlan_igmp_leave):

  CPU 1                            CPU 2

deferred work                    vxlan_sock_add
  ...                              ...
                                   spin_lock(&vn->sock_lock)
                                   vs = vxlan_find_sock();
  vxlan_sock_release
    dec vs->refcnt, reaches 0
    spin_lock(&vn->sock_lock)
                                   vxlan_sock_hold(vs), refcnt=1
                                   spin_unlock(&vn->sock_lock)
    hlist_del_rcu(&vs->hlist);
    vxlan_notify_del_rx_port(vs)
    spin_unlock(&vn->sock_lock)

So when we look for a reusable socket, we check if it wasn't freed
already before reusing it.
Signed-off-by: NMarcelo Ricardo Leitner <mleitner@redhat.com>
Fixes: 7c47cedf ("vxlan: move IGMP join/leave to work queue")
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

00c83b01

03 12月, 2014 1 次提交

net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del · f6f6424b

由 Jiri Pirko 提交于 11月 28, 2014

Do the work of parsing NDA_VLAN directly in rtnetlink code, pass simple
u16 vid to drivers from there.
Signed-off-by: NJiri Pirko <jiri@resnulli.us>
Acked-by: NAndy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
Acked-by: NJohn Fastabend <john.r.fastabend@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f6f6424b

26 11月, 2014 1 次提交

vxlan: Fix boolean flip in VXLAN_F_UDP_ZERO_CSUM6_[TX|RX] · 3dc2b6a8

由 Alexander Duyck 提交于 11月 24, 2014

In "vxlan: Call udp_sock_create" there was a logic error that resulted in
the default for IPv6 VXLAN tunnels going from using checksums to not using
checksums.  Since there is currently no support in iproute2 for setting
these values it means that a kernel after the change cannot talk over a IPv6
VXLAN tunnel to a kernel prior the change.

Fixes: 3ee64f39 ("vxlan: Call udp_sock_create")

Cc: Tom Herbert <therbert@google.com>
Signed-off-by: NAlexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: NTom Herbert <therbert@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3dc2b6a8

22 11月, 2014 2 次提交

vlan: introduce *vlan_hwaccel_push_inside helpers · 5968250c

由 Jiri Pirko 提交于 11月 19, 2014

Use them to push skb->vlan_tci into the payload and avoid code
duplication.
Signed-off-by: NJiri Pirko <jiri@resnulli.us>
Acked-by: NPravin B Shelar <pshelar@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5968250c

vlan: rename __vlan_put_tag to vlan_insert_tag_set_proto · 62749e2c

由 Jiri Pirko 提交于 11月 19, 2014

Name fits better. Plus there's going to be introduced
__vlan_insert_tag later on.
Signed-off-by: NJiri Pirko <jiri@resnulli.us>
Acked-by: NPravin B Shelar <pshelar@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

62749e2c

19 11月, 2014 1 次提交

vxlan: Inline vxlan_gso_check(). · 11bf7828

由 Joe Stringer 提交于 11月 17, 2014

Suggested-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NJoe Stringer <joestringer@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

11bf7828

15 11月, 2014 1 次提交

net: Add vxlan_gso_check() helper · 23e62de3

由 Joe Stringer 提交于 11月 13, 2014

Most NICs that report NETIF_F_GSO_UDP_TUNNEL support VXLAN, and not
other UDP-based encapsulation protocols where the format and size of the
header differs. This patch implements a generic ndo_gso_check() for
VXLAN which will only advertise GSO support when the skb looks like it
contains VXLAN (or no UDP tunnelling at all).

Implementation shamelessly stolen from Tom Herbert:
http://thread.gmane.org/gmane.linux.network/332428/focus=333111Signed-off-by: NJoe Stringer <joestringer@nicira.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

23e62de3

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功