Commits · aeb073241fe7a2b932e04e20c60e47718332877f · openeuler / Kernel

01 May, 2017 2 commits

vxlan: do not output confusing error message · baf4d786

Committed by Jiri Benc on Apr 27, 2017

The message "Cannot bind port X, err=Y" creates only confusion. In metadata
based mode, failure of IPv6 socket creation is okay if IPv6 is disabled and
no error message should be printed. But when IPv6 tunnel was requested, such
failure is fatal. The vxlan_socket_create does not know when the error is
harmless and when it's not.

Instead of passing such information down to vxlan_socket_create, remove the
message completely. It's not useful. We propagate the error code up to the
user space and the port number comes from the user space. There's nothing in
the message that the process creating vxlan interface does not know.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: correctly handle ipv6.disable module parameter · d074bf96

Committed by Jiri Benc on Apr 27, 2017

When IPv6 is compiled but disabled at runtime, __vxlan_sock_add returns
-EAFNOSUPPORT. For metadata based tunnels, this causes failure of the whole
operation of bringing up the tunnel.

Ignore failure of IPv6 socket creation for metadata based tunnels caused by
IPv6 not being available.

Fixes: b1be00a6 ("vxlan: support both IPv4 and IPv6 sockets in a single vxlan device")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
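
The handling described in d074bf96 can be sketched in plain userspace C. This is an illustration only, not the driver code: `sketch_sock_add`, `sock_add_fn` and `stub_ipv6_disabled` are hypothetical names, and the real control flow in the kernel may differ. The sketch shows the one idea from the commit message: -EAFNOSUPPORT from the IPv6 side is ignored for metadata based tunnels, and stays fatal when an IPv6 tunnel was explicitly requested.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-family socket creation: 0 or -errno. */
typedef int (*sock_add_fn)(bool ipv6);

/* Example stub: behaves like a host where IPv6 is disabled at runtime. */
int stub_ipv6_disabled(bool ipv6)
{
	return ipv6 ? -EAFNOSUPPORT : 0;
}

/*
 * Open the sockets a tunnel needs. Metadata based tunnels try both
 * families; an IPv6 failure with -EAFNOSUPPORT is then harmless because
 * IPv4 can still carry traffic. Without metadata mode, a requested IPv6
 * socket that cannot be created is a fatal error.
 */
int sketch_sock_add(sock_add_fn add, bool metadata, bool want_ipv6)
{
	bool ipv6 = want_ipv6 || metadata;
	bool ipv4 = !want_ipv6 || metadata;
	int ret = 0;

	assert(add != NULL);
	if (ipv6) {
		ret = add(true);
		if (ret == -EAFNOSUPPORT && metadata)
			ret = 0;	/* IPv6 unavailable: continue with IPv4 only */
		else if (ret < 0)
			return ret;	/* IPv6 was required, or a real error */
	}
	if (ipv4)
		ret = add(false);
	return ret;
}
```

With the stub above, a metadata based tunnel comes up on IPv4 alone, while an explicit IPv6 tunnel still reports -EAFNOSUPPORT to the caller.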

04 Apr, 2017 1 commit

vxlan: fix ND proxy when skb doesn't have transport header offset · f1fb08f6

Committed by Vincent Bernat on Apr 02, 2017

When an incoming frame is tagged or when GRO is disabled, the skb
handed to vxlan_xmit() doesn't contain a valid transport header
offset. This makes ND proxying fail.

We combine two changes: replace use of skb_transport_offset() and ensure
the necessary amount of skb is linear just before using it:

 - In vxlan_xmit(), when determining if we have an ICMPv6 neighbor
   discovery packet, just check if it is an ICMPv6 packet and rely on
   neigh_reduce() to do more checks if this is the case. The use of
   pskb_may_pull() is replaced by skb_header_pointer() for just the IPv6
   header.

 - In neigh_reduce(), add pskb_may_pull() for IPv6 header and neighbor
   discovery message since this was removed from vxlan_xmit(). Replace
   skb_transport_header() with ipv6_hdr() + 1.

 - In vxlan_na_create(), replace first skb_transport_offset() with
   ipv6_hdr() + 1 and second with skb_network_offset() + sizeof(struct
   ipv6hdr). Additionally, ensure we pskb_may_pull() the whole skb as we
   need it to iterate over the options.
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Apr, 2017 1 commit

vxlan: vxlan dev should inherit lowerdev's gso_max_size · d6acfeb1

Committed by Felix Manlunas on Mar 29, 2017

vxlan dev currently ignores lowerdev's gso_max_size, which adversely
affects TSO performance of liquidio if it's the lowerdev.  Egress TCP
packets' skb->len often exceed liquidio's advertised gso_max_size.  This
may happen on other NIC drivers.

Fix it by assigning lowerdev's gso_max_size to that of vxlan dev.  Might as
well do likewise for gso_max_segs.

Single flow TSO throughput of liquidio as lowerdev (using iperf3):

    Before the patch:    139 Mbps
    After the patch :   8.68 Gbps
    Percent increase:  6,144 %
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Mar, 2017 1 commit

vxlan: don't age NTF_EXT_LEARNED fdb entries · def499c9

Committed by Roopa Prabhu on Mar 27, 2017

vxlan driver already implicitly supports installing
of external fdb entries with NTF_EXT_LEARNED. This
patch just makes sure these entries are not aged
by the vxlan driver. An external entity managing these
entries will age them out. This is consistent with
the use of NTF_EXT_LEARNED in the bridge driver.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
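
The aging rule from def499c9 can be illustrated with a small predicate. This is a sketch under stated assumptions, not the driver's cleanup routine: the struct, the `SK_` constants and `sk_fdb_should_age` are hypothetical names, the flag values are placeholders chosen for the example, and the extra check for permanent or no-ARP entries is an assumption about how static entries are marked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only. */
#define SK_STATE_PERMANENT	0x80u
#define SK_STATE_NOARP		0x40u
#define SK_FLAG_EXT_LEARNED	0x10u

/* Hypothetical, cut-down fdb entry: state, flags and last-used time. */
struct sk_fdb {
	uint16_t state;
	uint8_t flags;
	unsigned long used;
};

/*
 * Decide whether the aging timer may expire an entry. Entries installed
 * by an external entity (SK_FLAG_EXT_LEARNED) are never aged here, since
 * that entity is expected to remove them itself.
 */
bool sk_fdb_should_age(const struct sk_fdb *f, unsigned long now,
		       unsigned long age_interval)
{
	assert(f != NULL);
	if (f->state & (SK_STATE_PERMANENT | SK_STATE_NOARP))
		return false;	/* assumed marking for static entries */
	if (f->flags & SK_FLAG_EXT_LEARNED)
		return false;	/* owned by an external entity */
	return now - f->used >= age_interval;
}
```

Only a dynamically learnt entry that has been idle for at least `age_interval` is reported as eligible for removal.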

14 Mar, 2017 1 commit

vxlan: fix ovs support · c80498e3

Committed by Nicolas Dichtel on Mar 13, 2017

The required changes in the function vxlan_dev_create() were missing
in commit 8bcdc4f3.
The vxlan device is not registered anymore after that commit and the error
path causes a stack dump:
 WARNING: CPU: 3 PID: 1498 at net/core/dev.c:6713 rollback_registered_many+0x9d/0x3f0

Fixes: 8bcdc4f3 ("vxlan: add changelink support")
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

13 Mar, 2017 1 commit

vxlan: use appropriate family on L3 miss · 8f48ba71

Committed by Vincent Bernat on Mar 10, 2017

When sending an L3 miss, the family is set to AF_INET even for IPv6. This
causes userland (e.g. "ip monitor") to be confused. Ensure we send the
appropriate family in this case. For an L2 miss, keep using AF_INET.
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Mar, 2017 1 commit

vxlan: lock RCU on TX path · 56de859e

Committed by Jakub Kicinski on Feb 24, 2017

There are no guarantees that callers of the TX path will hold
the RCU lock.  Grab it explicitly.

Fixes: c6fcc4fc ("vxlan: avoid using stale vxlan socket.")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Feb, 2017 2 commits

vxlan: don't allow overwrite of config src addr · 1158632b

Committed by Brian Russell on Feb 24, 2017

When using IPv6 transport and a default dst, a pointer to the configured
source address is passed into the route lookup. If no source address is
configured, then the value is overwritten.

IPv6 route lookup ignores egress ifindex match if the source address is set,
so if egress ifindex match is desired, the source address must be passed
as any. The overwrite breaks this for subsequent lookups.

Avoid this by copying the configured address to an existing stack variable
and pass a pointer to that instead.

Fixes: 272d96a5 ("net: vxlan: lwt: Use source ip address during route lookup.")
Signed-off-by: Brian Russell <brussell@brocade.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
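
The fix in 1158632b follows a common pattern: hand the callee a copy when it may write through the pointer. The sketch below is a userspace illustration with hypothetical names (`sk_addr`, `sk_cfg`, `sk_route_lookup`, `sk_xmit`); the lookup is a stand-in that fills in a source address when given "any", which is the behaviour the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 128-bit address; all zeroes means "any". */
struct sk_addr { uint8_t b[16]; };

/* Hypothetical device configuration holding the configured source. */
struct sk_cfg { struct sk_addr saddr; };

bool sk_addr_is_any(const struct sk_addr *a)
{
	static const struct sk_addr any;	/* zero-initialised */
	return memcmp(a, &any, sizeof(any)) == 0;
}

/* Stand-in lookup: if no source is set, write the selected one back. */
void sk_route_lookup(struct sk_addr *saddr)
{
	if (sk_addr_is_any(saddr))
		saddr->b[15] = 1;	/* pretend ::1 was selected */
}

/*
 * Transmit path: copy the configured address to a stack variable and
 * pass a pointer to the copy, so the configuration keeps "any" and
 * later lookups can still match on the egress interface.
 */
void sk_xmit(const struct sk_cfg *cfg, struct sk_addr *used)
{
	struct sk_addr local = cfg->saddr;

	assert(used != NULL);
	sk_route_lookup(&local);
	*used = local;
}
```

After `sk_xmit()` returns, the address actually used is available to the caller while `cfg->saddr` is unchanged.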

vxlan: correctly validate VXLAN ID against VXLAN_N_VID · 4e37d691

Committed by Matthias Schiffer on Feb 23, 2017

The incorrect check caused an off-by-one error: the maximum VID 0xffffff
was unusable.

Fixes: d342894c ("vxlan: virtual extensible lan")
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
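
The off-by-one in 4e37d691 is easy to reproduce in isolation. The sketch below assumes the VNI is a 24-bit value, so that the count of valid IDs is 1 << 24 and the largest valid ID is 0xffffff. The `SK_` macros and both function names are hypothetical, and the "buggy" comparison is an assumption about what such an incorrect check could look like, not a quote of the old code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SK_VXLAN_N_VID		(1u << 24)		/* number of valid IDs */
#define SK_VXLAN_VID_MASK	(SK_VXLAN_N_VID - 1)	/* largest ID, 0xffffff */

/* Assumed shape of the bug: comparing against the mask rejects 0xffffff. */
int sk_check_vid_buggy(uint32_t id)
{
	return id >= SK_VXLAN_VID_MASK ? -ERANGE : 0;
}

/* Validating against the count accepts every ID from 0 to 0xffffff. */
int sk_check_vid_fixed(uint32_t id)
{
	return id >= SK_VXLAN_N_VID ? -ERANGE : 0;
}
```

The two checks differ for exactly one input, the maximum VID, which is the value the commit message says was unusable.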

22 Feb, 2017 2 commits

vxlan: remove unused variable saddr in neigh_reduce · 8dcd81a9

Committed by Roopa Prabhu on Feb 20, 2017

Silences the warning below:
    drivers/net/vxlan.c: In function ‘neigh_reduce’:
    drivers/net/vxlan.c:1599:25: warning: variable ‘saddr’ set but not used
    [-Wunused-but-set-variable]
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: add changelink support · 8bcdc4f3

Committed by Roopa Prabhu on Feb 20, 2017

This patch adds changelink rtnl op support for vxlan netdevs.
Code changes involve:
    - refactor vxlan_newlink into vxlan_nl2conf to be
      used by vxlan_newlink and vxlan_changelink
    - vxlan_nl2conf and vxlan_dev_configure take a
      changelink argument to isolate changelink checks
      and updates.
    - Allow changing only a few attributes:
        - return -EOPNOTSUPP for attributes that cannot
          be changed for now. Incremental patches can
          make the unsupported ones available in the future
          if needed.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Feb, 2017 1 commit

vxlan: fix oops in dev_fill_metadata_dst · 22f0708a

Committed by Paolo Abeni on Feb 17, 2017

Since the commit 0c1d70af ("net: use dst_cache for vxlan device")
vxlan_fill_metadata_dst() calls vxlan_get_route() passing a NULL
dst_cache pointer, so the latter should explicitly check for
valid dst_cache ptr. Unfortunately the commit d71785ff ("net: add
dst_cache to ovs vxlan lwtunnel") removed said check.

As a result, it is possible to trigger a null pointer access by calling
vxlan_fill_metadata_dst(), e.g. with:

ovs-vsctl add-br ovs-br0
ovs-vsctl add-port ovs-br0 vxlan0 -- set interface vxlan0 \
	type=vxlan options:remote_ip=192.168.1.1 \
	options:key=1234 options:dst_port=4789 ofport_request=10
ip address add dev ovs-br0 172.16.1.2/24
ovs-vsctl set Bridge ovs-br0 ipfix=@i -- --id=@i create IPFIX \
	targets=\"172.16.1.1:1234\" sampling=1
iperf -c 172.16.1.1 -u -l 1000 -b 10M -t 1 -p 1234

This commit addresses the issue by passing to vxlan_get_route() the
dst_cache already available in the lwt info processed by
vxlan_fill_metadata_dst().

Fixes: d71785ff ("net: add dst_cache to ovs vxlan lwtunnel")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Feb, 2017 1 commit

vxlan: remove vni zero check and drop for COLLECT_METADATA · 98eb253c

Committed by Roopa Prabhu on Feb 10, 2017

This patch drops the vni zero check for COLLECT_METADATA mode.
It is not really needed: vni zero is a valid vni.

Fixes: 3ad7a4b1 ("vxlan: support fdb and learning in COLLECT_METADATA mode")
Reported-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Feb, 2017 1 commit

vxlan: support fdb and learning in COLLECT_METADATA mode · 3ad7a4b1

Committed by Roopa Prabhu on Jan 31, 2017

Vxlan COLLECT_METADATA mode today solves the per-vni netdev
scalability problem in l3 networks. It expects all forwarding
information to be present in dst_metadata. This patch series
enhances collect metadata mode to include the case where only
vni is present in dst_metadata, and the vxlan driver can then use
the rest of the forwarding information database to make forwarding
decisions. There is no change to default COLLECT_METADATA
behaviour. These changes only apply to COLLECT_METADATA when
used with the bridging use-case with a special dst_metadata
tunnel info flag (eg: where vxlan device is part of a bridge).
For all this to work, the vxlan driver will need to now support a
single fdb table hashed by mac + vni. This series essentially makes
this happen.

use-case and workflow:
vxlan collect metadata device participates in bridging vlan
to vn-segments. Bridge driver above the vxlan device,
sends the vni corresponding to the vlan in the dst_metadata.
vxlan driver will lookup forwarding database with (mac + vni)
for the required remote destination information to forward the
packet.

Changes introduced by this patch:
    - allow learning and forwarding database state in vxlan netdev in
      COLLECT_METADATA mode. Current behaviour is not changed
      by default. tunnel info flag IP_TUNNEL_INFO_BRIDGE is used
      to support the new bridge friendly mode.
    - A single fdb table hashed by (mac, vni) to allow fdb entries with
      multiple vnis in the same fdb table
    - rx path already has the vni
    - tx path expects a vni in the packet with dst_metadata
    - prior to this series, fdb remote_dsts carried remote vni and
      the vxlan device carrying the fdb table represented the
      source vni. With the vxlan device now representing multiple vnis,
      this patch adds a src vni attribute to the fdb entry. The remote
      vni already uses NDA_VNI attribute. This patch introduces
      NDA_SRC_VNI netlink attribute to represent the src vni in a multi
      vni fdb table.

iproute2 example (patched and pruned iproute2 output to just show
relevant fdb entries):
example shows same host mac learnt on two vni's.

before (netdev per vni):
$bridge fdb show | grep "00:02:00:00:00:03"
00:02:00:00:00:03 dev vxlan1001 dst 12.0.0.8 self
00:02:00:00:00:03 dev vxlan1000 dst 12.0.0.8 self

after this patch with collect metadata in bridged mode (single netdev):
$bridge fdb show | grep "00:02:00:00:00:03"
00:02:00:00:00:03 dev vxlan0 src_vni 1001 dst 12.0.0.8 self
00:02:00:00:00:03 dev vxlan0 src_vni 1000 dst 12.0.0.8 self
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
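
The "single fdb table hashed by (mac, vni)" idea from 3ad7a4b1 can be sketched as a small chained hash table. This is a userspace illustration, not the driver's data structure: all `sk_` names are hypothetical, and the FNV-1a hash is a stand-in chosen for simplicity rather than the hash the kernel uses. The point is only that the source VNI is part of the key, so the same MAC can appear once per VNI in one table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_FDB_BUCKETS 256u

/* Hypothetical fdb entry keyed by (mac, src_vni). */
struct sk_fdb_entry {
	uint8_t mac[6];
	uint32_t src_vni;
	uint32_t remote_ip;		/* remote destination, host order */
	struct sk_fdb_entry *next;	/* bucket chain */
};

struct sk_fdb_table {
	struct sk_fdb_entry *bucket[SK_FDB_BUCKETS];
};

/* FNV-1a over the six MAC bytes followed by the four VNI bytes. */
unsigned int sk_fdb_hash(const uint8_t mac[6], uint32_t vni)
{
	uint32_t h = 2166136261u;
	int i;

	for (i = 0; i < 6; i++) {
		h ^= mac[i];
		h *= 16777619u;
	}
	for (i = 0; i < 4; i++) {
		h ^= (vni >> (8 * i)) & 0xffu;
		h *= 16777619u;
	}
	return h % SK_FDB_BUCKETS;
}

/* Link an entry at the head of its bucket. */
void sk_fdb_insert(struct sk_fdb_table *t, struct sk_fdb_entry *e)
{
	unsigned int b;

	assert(t != NULL && e != NULL);
	b = sk_fdb_hash(e->mac, e->src_vni);
	e->next = t->bucket[b];
	t->bucket[b] = e;
}

/* Look up by MAC and source VNI; both must match. */
struct sk_fdb_entry *sk_fdb_find(const struct sk_fdb_table *t,
				 const uint8_t mac[6], uint32_t vni)
{
	struct sk_fdb_entry *e = t->bucket[sk_fdb_hash(mac, vni)];

	for (; e; e = e->next)
		if (e->src_vni == vni && memcmp(e->mac, mac, 6) == 0)
			return e;
	return NULL;
}
```

Inserting 00:02:00:00:00:03 once with src_vni 1000 and once with src_vni 1001 yields two distinct entries in the same table, which mirrors the "after" output shown in the commit message.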

25 Jan, 2017 2 commits

vxlan: do not age static remote mac entries · efb5f68f

Committed by Balakrishnan Raman on Jan 23, 2017

Mac aging is applicable only for dynamically learnt remote mac
entries. Check for user configured static remote mac entries
and skip aging.
Signed-off-by: Balakrishnan Raman <ramanb@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: don't flush static fdb entries on admin down · 8b3f9337

Committed by Roopa Prabhu on Jan 23, 2017

This patch skips flushing static fdb entries in
ndo_stop, but flushes all fdb entries during vxlan
device delete. This is consistent with the bridge
driver fdb.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Jan, 2017 1 commit

vxlan: preserve type of dst_port parm for encap_bypass_if_local() · 1f6cc07e

Committed by Lance Richardson on Jan 18, 2017

Eliminate sparse warning by maintaining type of dst_port
as __be16.
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Jan, 2017 1 commit

vxlan: fix byte order of vxlan-gpe port number · d5ff72d9

Committed by Lance Richardson on Jan 16, 2017

vxlan->cfg.dst_port is in network byte order, so an htons()
is needed here. Also reduced comment length to stay closer
to 80 column width (still slightly over, however).

Fixes: e1e5314d ("vxlan: implement GPE")
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
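
The byte-order point in d5ff72d9 can be shown with a short userspace sketch. It assumes the VXLAN-GPE port in question is the IANA-assigned UDP port 4790; the `SK_GPE_PORT` macro and both function names are hypothetical. A field kept in network byte order must be assigned through htons(), so that the bytes sent on the wire are the same on little-endian and big-endian hosts.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_GPE_PORT 4790	/* assumed GPE port, host byte order */

/* Default destination port, returned in network byte order. */
uint16_t sk_default_gpe_port_be(void)
{
	return htons(SK_GPE_PORT);
}

/* Copy a network-order port into the two bytes that go on the wire. */
void sk_port_wire_bytes(uint16_t port_be, uint8_t out[2])
{
	assert(out != NULL);
	memcpy(out, &port_be, 2);
}
```

Storing the plain constant 4790 instead would produce swapped wire bytes on a little-endian machine, which is the class of bug the commit fixes.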

12 Jan, 2017 1 commit

vxlan: Set ports in flow key when doing route lookups · 4ecb1d83

Committed by Martynas Pumputis on Jan 11, 2017

Otherwise, a xfrm policy with sport/dport being set cannot be matched.
Signed-off-by: Martynas Pumputis <martynas@weave.works>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Dec, 2016 1 commit

vxlan: fix a potential issue when create a new vxlan fdb entry. · 17b46365

Committed by Haishuang Yan on Nov 29, 2016

vxlan_fdb_append may return an error, so add the proper check;
otherwise it will cause a memory leak.
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>

Changes in v2:
  - Unnecessary to initialize rc to zero.
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
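
The leak described in 17b46365 comes from allocating an entry and then ignoring a failure of the step that attaches data to it. A minimal userspace sketch of the checked version follows. The names `sk_fdb`, `sk_fdb_append` and `sk_fdb_create` are hypothetical, and the append stub fails on a negative argument purely so the error path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fdb entry that only counts its remotes. */
struct sk_fdb { int nremotes; };

/* Stand-in append: fails with -EINVAL for a negative remote id. */
int sk_fdb_append(struct sk_fdb *f, int remote)
{
	if (remote < 0)
		return -EINVAL;
	f->nremotes++;
	return 1;
}

/*
 * Create an entry with one remote. If the append fails, the freshly
 * allocated entry is freed before the error is returned, so nothing
 * is leaked and *out is left untouched.
 */
int sk_fdb_create(int remote, struct sk_fdb **out)
{
	struct sk_fdb *f = calloc(1, sizeof(*f));
	int rc;

	assert(out != NULL);
	if (!f)
		return -ENOMEM;
	rc = sk_fdb_append(f, remote);
	if (rc < 0) {
		free(f);
		return rc;
	}
	*out = f;
	return 0;
}
```

On success the caller owns the entry and is responsible for freeing it.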

18 Nov, 2016 1 commit

netns: make struct pernet_operations::id unsigned int · c7d03a00

Committed by Alexey Dobriyan on Nov 17, 2016

Make struct pernet_operations::id unsigned.

There are 2 reasons to do so:

1)
This field is really an index into a zero-based array and
thus is an unsigned entity. Using a negative value is an out-of-bounds
access by definition.

2)
On x86_64, unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added to or subtracted from pointers
are preferred to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

	void f(long *p, int i)
	{
		g(p[i]);
	}

  roughly translates to

	movsx	rsi, esi
	mov	rdi, [rsi+...]
	call 	g

MOVSX is a 3-byte instruction which isn't necessary if the variable is
unsigned because x86_64 zero-extends by default.

Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:

	static inline void *net_generic(const struct net *net, int id)
	{
		...
		ptr = ng->ptr[id - 1];
		...
	}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a seemingly random artefact of code generation with the register
allocator being used differently. gcc decides that some variable
needs to live in the new r8+ registers and every access now requires a REX
prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be
used, which is longer than [r8].

However, overall balance is in negative direction:

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
	function                                     old     new   delta
	nfsd4_lock                                  3886    3959     +73
	tipc_link_build_proto_msg                   1096    1140     +44
	mac80211_hwsim_new_radio                    2776    2808     +32
	tipc_mon_rcv                                1032    1058     +26
	svcauth_gss_legacy_init                     1413    1429     +16
	tipc_bcbase_select_primary                   379     392     +13
	nfsd4_exchange_id                           1247    1260     +13
	nfsd4_setclientid_confirm                    782     793     +11
		...
	put_client_renew_locked                      494     480     -14
	ip_set_sockfn_get                            730     716     -14
	geneve_sock_add                              829     813     -16
	nfsd4_sequence_done                          721     703     -18
	nlmclnt_lookup_host                          708     686     -22
	nfsd4_lockt                                 1085    1063     -22
	nfs_get_client                              1077    1050     -27
	tcf_bpf_init                                1106    1076     -30
	nfsd4_encode_fattr                          5997    5930     -67
	Total: Before=154856051, After=154854321, chg -0.00%
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Nov, 2016 7 commits

vxlan: Fix uninitialized variable warnings. · 8ebd115b

Committed by David S. Miller on Nov 15, 2016

drivers/net/vxlan.c: In function ‘vxlan_xmit_one’:
drivers/net/vxlan.c:2141:10: warning: ‘err’ may be used uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: simplify vxlan xmit · 0770b53b