提交 · c65a6a930b70ce99bd34ea3835184d2deccd35a3 · openanolis / cloud-kernel

29 6月, 2018 1 次提交

net: check tunnel option type in tunnel flags · 256c87c1

由 Pieter Jansen van Vuuren 提交于 6月 26, 2018

Check the tunnel option type stored in tunnel flags when creating options
for tunnels. Thereby ensuring we do not set geneve, vxlan or erspan tunnel
options on interfaces that are not associated with them.

Make sure all users of the infrastructure set correct flags, for the BPF
helper we have to set all bits to keep backward compatibility.
Signed-off-by: NPieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

256c87c1

26 6月, 2018 1 次提交

net: Convert GRO SKB handling to list_head. · d4546c25

由 David Miller 提交于 6月 24, 2018

Manage pending per-NAPI GRO packets via list_head.

Return an SKB pointer from the GRO receive handlers.  When GRO receive
handlers return non-NULL, it means that this SKB needs to be completed
at this time and removed from the NAPI queue.

Several operations are greatly simplified by this transformation,
especially timing out the oldest SKB in the list when gro_count
exceeds MAX_GRO_SKBS, and napi_gro_flush() which walks the queue
in reverse order.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4546c25

20 4月, 2018 4 次提交

geneve: configure MTU based on a lower device · c40e89fd

由 Alexey Kodanev 提交于 4月 19, 2018

Currently, on a new link creation or when 'remote' address parameter
is updated, an MTU is not changed and always equals 1500. When a lower
device has a larger MTU, it might not be efficient, e.g. for UDP, and
requires the manual MTU adjustments to match the MTU of the lower
device.

This patch tries to automate this process, finds a lower device using
the 'remote' address parameter, then uses its MTU to tune GENEVE's MTU:
  * on a new link creation
  * when 'remote' parameter is changed

Also with this patch, the MTU from a user, on a new link creation, is
passed to geneve_change_mtu() where it is verified, and MTU adjustments
with a lower device is skipped in that case. Prior that change, it was
possible to set the invalid MTU values on a new link creation.
Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c40e89fd

geneve: check MTU for a minimum in geneve_change_mtu() · 321acc1c

由 Alexey Kodanev 提交于 4月 19, 2018

geneve_change_mtu() will be used not only as ndo_change_mtu() callback,
but also to verify a user specified MTU on a new link creation in the
next patch.
Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

321acc1c

geneve: cleanup hard coded value for Ethernet header length · 5edbea69

由 Alexey Kodanev 提交于 4月 19, 2018

Use ETH_HLEN instead and introduce two new macros: GENEVE_IPV4_HLEN
and GENEVE_IPV6_HLEN that include Ethernet header length, corresponded
IP header length and GENEVE_BASE_HLEN.
Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5edbea69

A
geneve: remove white-space before '#if IS_ENABLED(CONFIG_IPV6)' · 4c52a889
由 Alexey Kodanev 提交于 4月 19, 2018
```
Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
4c52a889

28 3月, 2018 1 次提交

net: Drop pernet_operations::async · 2f635cee

由 Kirill Tkhai 提交于 3月 27, 2018

Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.
Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2f635cee

28 2月, 2018 1 次提交

net: Convert geneve_net_ops · f60f3346

由 Kirill Tkhai 提交于 2月 26, 2018

These pernet_operations are similar to bond_net_ops. Exit method
unregisters all net geneve devices, and it looks like another
pernet_operations are not interested in foreign net geneve list.
So, it's possible to mark them async.
Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f60f3346

26 1月, 2018 1 次提交

net: don't call update_pmtu unconditionally · f15ca723

由 Nicolas Dichtel 提交于 1月 25, 2018

Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to:
"BUG: unable to handle kernel NULL pointer dereference at           (null)"

Let's add a helper to check if update_pmtu is available before calling it.

Fixes: 52a589d5 ("geneve: update skb dst pmtu on tx path")
Fixes: a93bf0ff ("vxlan: update skb dst pmtu on tx path")
CC: Roman Kapl <code@rkapl.cz>
CC: Xin Long <lucien.xin@gmail.com>
Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f15ca723

03 1月, 2018 1 次提交

geneve: update skb dst pmtu on tx path · 52a589d5

由 Xin Long 提交于 12月 25, 2017

Commit a93bf0ff ("vxlan: update skb dst pmtu on tx path") has fixed
a performance issue caused by the change of lower dev's mtu for vxlan.

The same thing needs to be done for geneve as well.

Note that geneve cannot adjust it's mtu according to lower dev's mtu
when creating it. The performance is very low later when netperfing
over it without fixing the mtu manually. This patch could also avoid
this issue.
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

52a589d5

19 12月, 2017 1 次提交

geneve: speedup geneve tunnels dismantle · 2843a253

由 Haishuang Yan 提交于 12月 16, 2017

Since we now hold RTNL lock in geneve_exit_net, it's better batch them
to speedup geneve tunnel dismantle.
Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2843a253

24 11月, 2017 1 次提交

geneve: only configure or fill UDP_ZERO_CSUM6_RX/TX info when CONFIG_IPV6 · f9094b76

由 Hangbin Liu 提交于 11月 23, 2017

Stefano pointed that configure or show UDP_ZERO_CSUM6_RX/TX info doesn't
make sense if we haven't enabled CONFIG_IPV6. Fix it by adding
if IS_ENABLED(CONFIG_IPV6) check.

Fixes: abe492b4 ("geneve: UDP checksum configuration via netlink")
Fixes: fd7eafd0 ("geneve: fix fill_info when link down")
Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f9094b76

15 11月, 2017 1 次提交

geneve: fix fill_info when link down · fd7eafd0

由 Hangbin Liu 提交于 11月 15, 2017

geneve->sock4/6 were added with geneve_open and released with geneve_stop.
So when geneve link down, we will not able to show remote address and
checksum info after commit 11387fe4 ("geneve: fix fill_info when using
collect_metadata").

Fix this by avoid passing *_REMOTE{,6} for COLLECT_METADATA since they are
mutually exclusive, and always show UDP_ZERO_CSUM6_RX info.

Fixes: 11387fe4 ("geneve: fix fill_info when using collect_metadata")
Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fd7eafd0

14 11月, 2017 1 次提交

geneve: exit_net cleanup check added · ab384b63

由 Vasily Averin 提交于 11月 12, 2017

Be sure that sock_list initialized in net_init hook was return
to initial state.
Signed-off-by: NVasily Averin <vvs@virtuozzo.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ab384b63

22 10月, 2017 1 次提交

geneve: Get rid of is_all_zero(), streamline is_tnl_info_zero() · 3fa5f11d

由 Stefano Brivio 提交于 10月 20, 2017

No need to re-invent memchr_inv() with !is_all_zero(). While at
it, replace conditional and return clauses with a single return
clause in is_tnl_info_zero().
Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3fa5f11d

21 10月, 2017 1 次提交

geneve: Fix function matching VNI and tunnel ID on big-endian · 772e97b5

由 Stefano Brivio 提交于 10月 19, 2017

On big-endian machines, functions converting between tunnel ID
and VNI use the three LSBs of tunnel ID storage to map VNI.

The comparison function eq_tun_id_and_vni(), on the other hand,
attempted to map the VNI from the three MSBs. Fix it by using
the same check implemented on LE, which maps VNI from the three
LSBs of tunnel ID.

Fixes: 2e0b26e1 ("geneve: Optimize geneve device lookup.")
Signed-off-by: NStefano Brivio <sbrivio@redhat.com>
Reviewed-by: NJakub Sitnicki <jkbs@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

772e97b5

12 8月, 2017 1 次提交

geneve: use netlink_ext_ack for error reporting in rtnl operations · c5ebc440

由 Girish Moodalbail 提交于 8月 09, 2017

Add extack error messages for failure paths while creating/modifying
geneve devices. Once extack support is added to iproute2, more
meaningful and helpful error messages will be displayed making it easy
for users to discern what went wrong.

Before:

=======
$ ip link add gen1 address 0:1:2:3:4:5:6 type geneve id 200 \
  remote 192.168.13.2
RTNETLINK answers: Invalid argument

After:
======
$ ip link add gen1 address 0:1:2:3:4:5:6 type geneve id 200 \
  remote 192.168.13.2
Error: Provided link layer address is not Ethernet

Also, netdev_dbg() calls used to log errors associated with Netlink
request have been removed.
Signed-off-by: NGirish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c5ebc440

10 8月, 2017 1 次提交

geneve: maximum value of VNI cannot be used · 04db70d9

由 Girish Moodalbail 提交于 8月 08, 2017

Geneve's Virtual Network Identifier (VNI) is 24 bit long, so the range
of values for it would be from 0 to 16777215 (2^24 -1).  However, one
cannot create a geneve device with VNI set to 16777215. This patch fixes
this issue.
Signed-off-by: NGirish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

04db70d9

25 7月, 2017 3 次提交

geneve/vxlan: offload ports on register/unregister events · 04584957

由 Sabrina Dubroca 提交于 7月 21, 2017

This improves consistency of handling when moving a netdev to another
netns. Most drivers currently do a full reset when the device goes up,
so that will flush the offload state anyway.
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

04584957

S
geneve/vxlan: add support for NETDEV_UDP_TUNNEL_DROP_INFO · 2d2b13fc
由 Sabrina Dubroca 提交于 7月 21, 2017
```
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
2d2b13fc

geneve: add rtnl changelink support · 5b861f6b

由 Girish Moodalbail 提交于 7月 20, 2017

This patch adds changelink rtnl operation support for geneve devices
and the code changes involve:

  - added geneve_quiesce() which quiesces the geneve device data path
    for both TX and RX. This lets us perform the changelink operation
    atomically w.r.t data path. Also added geneve_unquiesce() to
    reverse the operation of geneve_quiesce().

  - refactor geneve_newlink into geneve_nl2info to be used by both
    geneve_newlink and geneve_changelink

  - geneve_nl2info takes a changelink boolean argument to isolate
    changelink checks.

  - Allow changing only a few attributes (ttl, tos, and remote tunnel
    endpoint IP address (within the same address family)):
    - return -EOPNOTSUPP for attributes that cannot be changed for
      now. Incremental patches can make the non-supported one
      available in the future if needed.
Signed-off-by: NGirish Moodalbail <girish.moodalbail@oracle.com>
Acked-by: NPravin B Shelar <pshelar@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5b861f6b

03 7月, 2017 1 次提交

geneve: fix hlist corruption · 4b4c21fa

由 Jiri Benc 提交于 7月 02, 2017

It's not a good idea to add the same hlist_node to two different hash lists.
This leads to various hard to debug memory corruptions.

Fixes: 8ed66f0e ("geneve: implement support for IPv6-based tunnels")
Cc: John W. Linville <linville@tuxdriver.com>
Signed-off-by: NJiri Benc <jbenc@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4b4c21fa

27 6月, 2017 2 次提交

net: add netlink_ext_ack argument to rtnl_link_ops.validate · a8b8a889

由 Matthias Schiffer 提交于 6月 25, 2017

Add support for extended error reporting.
Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net>
Acked-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a8b8a889

net: add netlink_ext_ack argument to rtnl_link_ops.newlink · 7a3f4a18

由 Matthias Schiffer 提交于 6月 25, 2017

Add support for extended error reporting.
Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net>
Acked-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7a3f4a18

16 6月, 2017 1 次提交

networking: make skb_push & __skb_push return void pointers · d58ff351

由 Johannes Berg 提交于 6月 16, 2017

It seems like a historic accident that these return unsigned char *,
and in many places that means casts are required, more often than not.

Make these functions return void * and remove all the casts across
the tree, adding a (u8 *) cast only where the unsigned char pointer
was used directly, all done with the following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    @@
    expression SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - fn(SKB, LEN)[0]
    + *(u8 *)fn(SKB, LEN)

Note that the last part there converts from push(...)[0] to the
more idiomatic *(u8 *)push(...).
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d58ff351

10 6月, 2017 1 次提交

geneve: add missing rx stats accounting · fe741e23

由 Girish Moodalbail 提交于 6月 08, 2017

There are few places on the receive path where packet drops and packet
errors were not accounted for. This patch fixes that issue.
Signed-off-by: NGirish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fe741e23

08 6月, 2017 1 次提交

net: Fix inconsistent teardown and release of private netdev state. · cf124db5

由 David S. Miller 提交于 5月 08, 2017

Network devices can allocate reasources and private memory using
netdev_ops->ndo_init().  However, the release of these resources
can occur in one of two different places.

Either netdev_ops->ndo_uninit() or netdev->destructor().

The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.

netdev_ops->ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.

netdev->destructor(), on the other hand, does not run until the
netdev references all go away.

Further complicating the situation is that netdev->destructor()
almost universally does also a free_netdev().

This creates a problem for the logic in register_netdevice().
Because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.

If netdev_ops->ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
it is not able to invoke netdev->destructor().

This is because netdev->destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.

However, this means that the resources that would normally be released
by netdev->destructor() will not be.

Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.

Many drivers do not try to deal with this, and instead we have leaks.

Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev->destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().

netdev->priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev->destructor(), except for
free_netdev().

netdev->needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().

Now, register_netdevice() can sanely release all resources after
ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
and netdev->priv_destructor().

And at the end of unregister_netdevice(), we invoke
netdev->priv_destructor() and optionally call free_netdev().
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cf124db5

05 6月, 2017 1 次提交

geneve: fix needed_headroom and max_mtu for collect_metadata · 9a1c44d9

由 Eric Garver 提交于 6月 02, 2017

Since commit 9b4437a5 ("geneve: Unify LWT and netdev handling.")
when using COLLECT_METADATA geneve devices are created with too small of
a needed_headroom and too large of a max_mtu. This is because
ip_tunnel_info_af() is not valid with the device level info when using
COLLECT_METADATA and we mistakenly fall into the IPv4 case.

For COLLECT_METADATA, always use the worst case of ipv6 since both
sockets are created.

Fixes: 9b4437a5 ("geneve: Unify LWT and netdev handling.")
Signed-off-by: NEric Garver <e@erig.me>
Acked-by: NPravin B Shelar <pshelar@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9a1c44d9

26 5月, 2017 1 次提交

geneve: fix fill_info when using collect_metadata · 11387fe4

由 Eric Garver 提交于 5月 23, 2017

Since 9b4437a5 ("geneve: Unify LWT and netdev handling.") fill_info
does not return UDP_ZERO_CSUM6_RX when using COLLECT_METADATA. This is
because it uses ip_tunnel_info_af() with the device level info, which is
not valid for COLLECT_METADATA.

Fix by checking for the presence of the actual sockets.

Fixes: 9b4437a5 ("geneve: Unify LWT and netdev handling.")
Signed-off-by: NEric Garver <e@erig.me>
Acked-by: NPravin B Shelar <pshelar@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

11387fe4

01 5月, 2017 1 次提交

geneve: fix incorrect setting of UDP checksum flag · 5e0740c4

由 Girish Moodalbail 提交于 4月 27, 2017

Creating a geneve link with 'udpcsum' set results in a creation of link
for which UDP checksum will NOT be computed on outbound packets, as can
be seen below.

11: gen0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether c2:85:27:b6:b4:15 brd ff:ff:ff:ff:ff:ff promiscuity 0
    geneve id 200 remote 192.168.13.1 dstport 6081 noudpcsum

Similarly, creating a link with 'noudpcsum' set results in a creation
of link for which UDP checksum will be computed on outbound packets.

Fixes: 9b4437a5 ("geneve: Unify LWT and netdev handling.")
Signed-off-by: NGirish Moodalbail <girish.moodalbail@oracle.com>
Acked-by: NPravin B Shelar <pshelar@ovn.org>
Acked-by: NLance Richardson <lrichard@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5e0740c4

02 3月, 2017 1 次提交

geneve: lock RCU on TX path · a717e3f7

由 Jakub Kicinski 提交于 2月 24, 2017

There is no guarantees that callers of the TX path will hold
the RCU lock.  Grab it explicitly.

Fixes: fceb9c3e ("geneve: avoid using stale geneve socket.")
Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a717e3f7

03 12月, 2016 1 次提交

geneve: avoid use-after-free of skb->data · 5b010147

由 Sabrina Dubroca 提交于 12月 02, 2016

geneve{,6}_build_skb can end up doing a pskb_expand_head(), which
makes the ip_hdr(skb) reference we stashed earlier stale. Since it's
only needed as an argument to ip_tunnel_ecn_encap(), move this
directly in the function call.

Fixes: 08399efc ("geneve: ensure ECN info is handled properly in all tx/rx paths")
Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
Reviewed-by: NJohn W. Linville <linville@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5b010147

29 11月, 2016 1 次提交

geneve: fix ip_hdr_len reserved for geneve6 tunnel. · 31ac1c19

由 Haishuang Yan 提交于 11月 28, 2016

It shold reserved sizeof(ipv6hdr) for geneve in ipv6 tunnel.

Fixes: c3ef5aa5 ('geneve: Merge ipv4 and ipv6 geneve_build_skb()')
Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: NPravin B Shelar <pshelar@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

31ac1c19

22 11月, 2016 4 次提交

geneve: Optimize geneve device lookup. · 2e0b26e1

由 pravin shelar 提交于 11月 21, 2016

Rather than comparing 64-bit tunnel-id, compare tunnel vni
which is 24-bit id. This also save conversion from vni
to tunnel id on each tunnel packet receive.
Signed-off-by: NPravin B Shelar <pshelar@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2e0b26e1

geneve: Remove redundant socket checks. · bcceeec3

由 pravin shelar 提交于 11月 21, 2016

Geneve already has check for device socket in route
lookup function. So no need to check it in xmit
function.
Signed-off-by: NPravin B Shelar <pshelar@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bcceeec3

geneve: Merge ipv4 and ipv6 geneve_build_skb() · c3ef5aa5

由 pravin shelar 提交于 11月 21, 2016

There are minimal difference in building Geneve header
between ipv4 and ipv6 geneve tunnels. Following patch
refactors code to unify it.
Signed-off-by: NPravin B Shelar <pshelar@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c3ef5aa5

geneve: Unify LWT and netdev handling. · 9b4437a5

由 pravin shelar 提交于 11月 21, 2016

Current geneve implementation has two separate cases to handle.
1. netdev xmit
2. LWT xmit.

In case of netdev, geneve configuration is stored in various
struct geneve_dev members. For example geneve_addr, ttl, tos,
label, flags, dst_cache, etc. For LWT ip_tunnel_info is passed
to the device in ip_tunnel_info.

Following patch uses ip_tunnel_info struct to store almost all
of configuration of a geneve netdevice. This allows us to unify
most of geneve driver code around ip_tunnel_info struct.
This dramatically simplify geneve code, since it does not
need to handle two different configuration cases. Removes
duplicate code, single code path can handle either type
of geneve devices.
Signed-off-by: NPravin B Shelar <pshelar@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9b4437a5

18 11月, 2016 1 次提交

netns: make struct pernet_operations::id unsigned int · c7d03a00

由 Alexey Dobriyan 提交于 11月 17, 2016

Make struct pernet_operations::id unsigned.

There are 2 reasons to do so:

1)
This field is really an index into an zero based array and
thus is unsigned entity. Using negative value is out-of-bound
access by definition.

2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preffered to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

	void f(long *p, int i)
	{
		g(p[i]);
	}

  roughly translates to

	movsx	rsi, esi
	mov	rdi, [rsi+...]
	call 	g

MOVSX is 3 byte instruction which isn't necessary if the variable is
unsigned because x86_64 is zero extending by default.

Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:

	static inline void *net_generic(const struct net *net, int id)
	{
		...
		ptr = ng->ptr[id - 1];
		...
	}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a semmingly random artefact of code generation with register
allocator being used differently. gcc decides that some variable
needs to live in new r8+ registers and every access now requires REX
prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
used which is longer than [r8]

However, overall balance is in negative direction:

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
	function                                     old     new   delta
	nfsd4_lock                                  3886    3959     +73
	tipc_link_build_proto_msg                   1096    1140     +44
	mac80211_hwsim_new_radio                    2776    2808     +32
	tipc_mon_rcv                                1032    1058     +26
	svcauth_gss_legacy_init                     1413    1429     +16
	tipc_bcbase_select_primary                   379     392     +13
	nfsd4_exchange_id                           1247    1260     +13
	nfsd4_setclientid_confirm                    782     793     +11
		...
	put_client_renew_locked                      494     480     -14
	ip_set_sockfn_get                            730     716     -14
	geneve_sock_add                              829     813     -16
	nfsd4_sequence_done                          721     703     -18
	nlmclnt_lookup_host                          708     686     -22
	nfsd4_lockt                                 1085    1063     -22
	nfs_get_client                              1077    1050     -27
	tcf_bpf_init                                1106    1076     -30
	nfsd4_encode_fattr                          5997    5930     -67
	Total: Before=154856051, After=154854321, chg -0.00%
Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c7d03a00

30 10月, 2016 1 次提交

geneve: avoid using stale geneve socket. · fceb9c3e

由 pravin shelar 提交于 10月 28, 2016

This patch is similar to earlier vxlan patch.
Geneve device close operation frees geneve socket. This
operation can race with geneve-xmit function which
dereferences geneve socket. Following patch uses RCU
mechanism to avoid this situation.
Signed-off-by: NPravin B Shelar <pshelar@ovn.org>
Acked-by: NJohn W. Linville <linville@tuxdriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fceb9c3e

21 10月, 2016 1 次提交

net: use core MTU range checking in core net infra · 91572088

由 Jarod Wilson 提交于 10月 20, 2016

geneve:
- Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu
- This one isn't quite as straight-forward as others, could use some
  closer inspection and testing

macvlan:
- set min/max_mtu

tun:
- set min/max_mtu, remove tun_net_change_mtu

vxlan:
- Merge __vxlan_change_mtu back into vxlan_change_mtu
- Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in
  change_mtu function
- This one is also not as straight-forward and could use closer inspection
  and testing from vxlan folks

bridge:
- set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in
  change_mtu function

openvswitch:
- set min/max_mtu, remove internal_dev_change_mtu
- note: max_mtu wasn't checked previously, it's been set to 65535, which
  is the largest possible size supported

sch_teql:
- set min/max_mtu (note: max_mtu previously unchecked, used max of 65535)

macsec:
- min_mtu = 0, max_mtu = 65535

macvlan:
- min_mtu = 0, max_mtu = 65535

ntb_netdev:
- min_mtu = 0, max_mtu = 65535

veth:
- min_mtu = 68, max_mtu = 65535

8021q:
- min_mtu = 0, max_mtu = 65535

CC: netdev@vger.kernel.org
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
CC: Tom Herbert <tom@herbertland.com>
CC: Daniel Borkmann <daniel@iogearbox.net>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Paolo Abeni <pabeni@redhat.com>
CC: Jiri Benc <jbenc@redhat.com>
CC: WANG Cong <xiyou.wangcong@gmail.com>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Pravin B Shelar <pshelar@ovn.org>
CC: Sabrina Dubroca <sd@queasysnail.net>
CC: Patrick McHardy <kaber@trash.net>
CC: Stephen Hemminger <stephen@networkplumber.org>
CC: Pravin Shelar <pshelar@nicira.com>
CC: Maxim Krasnyansky <maxk@qti.qualcomm.com>
Signed-off-by: NJarod Wilson <jarod@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

91572088

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功