提交 · d1f20c03f48102e52eb98b8651d129b83134cae4 · openeuler / Kernel

31 1月, 2019 1 次提交

ipvlan, l3mdev: fix broken l3s mode wrt local routes · d5256083

由 Daniel Borkmann 提交于 1月 30, 2019

While implementing ipvlan l3 and l3s mode for kubernetes CNI plugin,
I ran into the issue that while l3 mode is working fine, l3s mode
does not have any connectivity to kube-apiserver and hence all pods
end up in Error state as well. The ipvlan master device sits on
top of a bond device and hostns traffic to kube-apiserver (also running
in hostns) is DNATed from 10.152.183.1:443 to 139.178.29.207:37573
where the latter is the address of the bond0. While in l3 mode, a
curl to https://10.152.183.1:443 or to https://139.178.29.207:37573
works fine from hostns, neither of them do in case of l3s. In the
latter only a curl to https://127.0.0.1:37573 appeared to work where
for local addresses of bond0 I saw kernel suddenly starting to emit
ARP requests to query HW address of bond0 which remained unanswered
and neighbor entries in INCOMPLETE state. These ARP requests only
happen while in l3s.

Debugging this further, I found the issue is that l3s mode is piggy-
backing on l3 master device, and in this case local routes are using
l3mdev_master_dev_rcu(dev) instead of net->loopback_dev as per commit
f5a0aab8 ("net: ipv4: dst for local input routes should use l3mdev
if relevant") and 5f02ce24 ("net: l3mdev: Allow the l3mdev to be
a loopback"). I found that reverting them back into using the
net->loopback_dev fixed ipvlan l3s connectivity and got everything
working for the CNI.

Now judging from 4fbae7d8 ("ipvlan: Introduce l3s mode") and the
l3mdev paper in [0] the only sole reason why ipvlan l3s is relying
on l3 master device is to get the l3mdev_ip_rcv() receive hook for
setting the dst entry of the input route without adding its own
ipvlan specific hacks into the receive path, however, any l3 domain
semantics beyond just that are breaking l3s operation. Note that
ipvlan also has the ability to dynamically switch its internal
operation from l3 to l3s for all ports via ipvlan_set_port_mode()
at runtime. In any case, l3 vs l3s soley distinguishes itself by
'de-confusing' netfilter through switching skb->dev to ipvlan slave
device late in NF_INET_LOCAL_IN before handing the skb to L4.

Minimal fix taken here is to add a IFF_L3MDEV_RX_HANDLER flag which,
if set from ipvlan setup, gets us only the wanted l3mdev_l3_rcv() hook
without any additional l3mdev semantics on top. This should also have
minimal impact since dev->priv_flags is already hot in cache. With
this set, l3s mode is working fine and I also get things like
masquerading pod traffic on the ipvlan master properly working.

  [0] https://netdevconf.org/1.2/papers/ahern-what-is-l3mdev-paper.pdf

Fixes: f5a0aab8 ("net: ipv4: dst for local input routes should use l3mdev if relevant")
Fixes: 5f02ce24 ("net: l3mdev: Allow the l3mdev to be a loopback")
Fixes: 4fbae7d8 ("ipvlan: Introduce l3s mode")
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: David Ahern <dsa@cumulusnetworks.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Martynas Pumputis <m@lambda.lt>
Acked-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d5256083

14 12月, 2018 1 次提交

net: ipvlan: Issue NETDEV_PRE_CHANGEADDR · 61345fab

由 Petr Machata 提交于 12月 13, 2018

A NETDEV_CHANGEADDR event implies a change of address of each of the
IPVLANs of this IPVLAN device. Therefore propagate NETDEV_PRE_CHANGEADDR
to all the IPVLANs.
Signed-off-by: NPetr Machata <petrm@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

61345fab

11 12月, 2018 1 次提交

ipvlan: Remove a useless comparison · b1dd054d

由 YueHaibing 提交于 12月 10, 2018

Fix following gcc warning:

drivers/net/ipvlan/ipvlan_main.c:543:12: warning:
 comparison is always false due to limited range of data type [-Wtype-limits]

'mode' is a u16 variable, IPVLAN_MODE_L2 is zero,
the comparison is always false
Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b1dd054d

07 12月, 2018 2 次提交

net: core: dev: Add extack argument to dev_change_flags() · 567c5e13

由 Petr Machata 提交于 12月 06, 2018

In order to pass extack together with NETDEV_PRE_UP notifications, it's
necessary to route the extack to __dev_open() from diverse (possibly
indirect) callers. One prominent API through which the notification is
invoked is dev_change_flags().

Therefore extend dev_change_flags() with and extra extack argument and
update all users. Most of the calls end up just encoding NULL, but
several sites (VLAN, ipvlan, VRF, rtnetlink) do have extack available.

Since the function declaration line is changed anyway, name the other
function arguments to placate checkpatch.
Signed-off-by: NPetr Machata <petrm@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Reviewed-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

567c5e13

net: ipvlan: ipvlan_set_port_mode(): Add an extack argument · cf7686a0

由 Petr Machata 提交于 12月 06, 2018

A follow-up patch will extend dev_change_flags() with an extack
argument. Extend ipvlan_set_port_mode() to have that argument available
for the conversion.
Signed-off-by: NPetr Machata <petrm@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cf7686a0

02 7月, 2018 1 次提交

ipvlan: call dev_change_flags when ipvlan mode is reset · 5dc2d399

由 Hangbin Liu 提交于 7月 01, 2018

After we change the ipvlan mode from l3 to l2, or vice versa, we only
reset IFF_NOARP flag, but don't flush the ARP table cache, which will
cause eth->h_dest to be equal to eth->h_source in ipvlan_xmit_mode_l2().
Then the message will not come out of host.

Here is the reproducer on local host:

ip link set eth1 up
ip addr add 192.168.1.1/24 dev eth1
ip link add link eth1 ipvlan1 type ipvlan mode l3

ip netns add net1
ip link set ipvlan1 netns net1
ip netns exec net1 ip link set ipvlan1 up
ip netns exec net1 ip addr add 192.168.2.1/24 dev ipvlan1

ip route add 192.168.2.0/24 via 192.168.1.2
ping 192.168.2.2 -c 2

ip netns exec net1 ip link set ipvlan1 type ipvlan mode l2
ping 192.168.2.2 -c 2

Add the same configuration on remote host. After we set the mode to l2,
we could find that the src/dst MAC addresses are the same on eth1:

21:26:06.648565 00:b7:13:ad:d3:05 > 00:b7:13:ad:d3:05, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 58356, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.2.1 > 192.168.2.2: ICMP echo request, id 22686, seq 1, length 64

Fix this by calling dev_change_flags(), which will call netdevice notifier
with flag change info.

v2:
a) As pointed out by Wang Cong, check return value for dev_change_flags() when
change dev flags.
b) As suggested by Stefano and Sabrina, move flags setting before l3mdev_ops.
So we don't need to redo ipvlan_{, un}register_nf_hook() again in err path.
Reported-by: NJianlin Shi <jishi@redhat.com>
Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
Reviewed-by: NSabrina Dubroca <sd@queasysnail.net>
Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5dc2d399

21 6月, 2018 1 次提交

ipvlan: fix IFLA_MTU ignored on NEWLINK · 30877961

由 Xin Long 提交于 6月 21, 2018

Commit 296d4856 ("ipvlan: inherit MTU from master device") adjusted
the mtu from the master device when creating a ipvlan device, but it
would also override the mtu value set in rtnl_create_link. It causes
IFLA_MTU param not to take effect.

So this patch is to not adjust the mtu if IFLA_MTU param is set when
creating a ipvlan device.

Fixes: 296d4856 ("ipvlan: inherit MTU from master device")
Reported-by: NJianlin Shi <jishi@redhat.com>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

30877961

20 6月, 2018 1 次提交

ipvlan: use ETH_MAX_MTU as max mtu · 548feb33

由 Xin Long 提交于 6月 18, 2018

Similar to the fixes on team and bonding, this restores the ability
to set an ipvlan device's mtu to anything higher than 1500.

Fixes: 91572088 ("net: use core MTU range checking in core net infra")
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

548feb33

16 5月, 2018 1 次提交

ipvlan: call netdevice notifier when master mac address changed · ab452c3c

由 Keefe Liu 提交于 5月 14, 2018

When master device's mac has been changed, the commit
32c10bbf ("ipvlan: always use the current L2 addr of the
master") makes the IPVlan devices's mac changed also, but it
doesn't do related works such as flush the IPVlan devices's
arp table.
Signed-off-by: NKeefe Liu <liuqifa@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ab452c3c

28 3月, 2018 1 次提交

net: Drop pernet_operations::async · 2f635cee

由 Kirill Tkhai 提交于 3月 27, 2018

Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.
Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2f635cee

10 3月, 2018 1 次提交

net: introduce IFF_NO_RX_HANDLER · f5426250

由 Paolo Abeni 提交于 3月 09, 2018

Some network devices - notably ipvlan slave - are not compatible with
any kind of rx_handler. Currently the hook can be installed but any
configuration (bridge, bond, macsec, ...) is nonfunctional.

This change allocates a priv_flag bit to mark such devices and explicitly
forbid installing a rx_handler if such bit is set. The new bit is used
by ipvlan slave device.
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f5426250

08 3月, 2018 1 次提交

net: unpollute priv_flags space · 1ec54cb4

由 Paolo Abeni 提交于 3月 06, 2018

the ipvlan device driver defines and uses 2 bits inside the priv_flags
net_device field. Such bits and the related helper are used only
inside the ipvlan device driver, and the core networking does not
need to be aware of them.

This change moves netif_is_ipvlan* helper in the ipvlan driver and
re-implement them looking for ipvlan specific symbols instead of
using priv_flags.

Overall this frees two bits inside priv_flags - and move the following
ones to avoid gaps - without any intended functional change.
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1ec54cb4

05 3月, 2018 1 次提交

ipvlan: forbid vlan devices on top of ipvlan · 3518e40b

由 Paolo Abeni 提交于 3月 02, 2018

Currently we allow the creation of 8021q devices on top of
ipvlan, but such devices are nonfunctional, as the underlying
ipvlan rx_hanlder hook can't match the relevant traffic.

Be explicit and forbid the creation of such nonfunctional devices.
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3518e40b

01 3月, 2018 1 次提交

ipvlan: use per device spinlock to protect addrs list updates · 82308194

由 Paolo Abeni 提交于 2月 28, 2018

This changeset moves ipvlan address under RCU protection, using
a per ipvlan device spinlock to protect list mutation and RCU
read access to protect list traversal.

Also explicitly use RCU read lock to traverse the per port
ipvlans list, so that we can now perform a full address lookup
without asserting the RTNL lock.

Overall this allows the ipvlan driver to check fully for duplicate
addresses - before this commit ipv6 addresses assigned by autoconf
via prefix delegation where accepted without any check - and avoid
the following rntl assertion failure still in the same code path:

 RTNL: assertion failed at drivers/net/ipvlan/ipvlan_core.c (124)
 WARNING: CPU: 15 PID: 0 at drivers/net/ipvlan/ipvlan_core.c:124 ipvlan_addr_busy+0x97/0xa0 [ipvlan]
 Modules linked in: ipvlan(E) ixgbe
 CPU: 15 PID: 0 Comm: swapper/15 Tainted: G            E    4.16.0-rc2.ipvlan+ #1782
 Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
 RIP: 0010:ipvlan_addr_busy+0x97/0xa0 [ipvlan]
 RSP: 0018:ffff881ff9e03768 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff881fdf2a9000 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: 00000000000000f6 RDI: 0000000000000300
 RBP: ffff881fdf2a8000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: ffff881ff9e034c0 R12: ffff881fe07bcc00
 R13: 0000000000000001 R14: ffffffffa02002b0 R15: 0000000000000001
 FS:  0000000000000000(0000) GS:ffff881ff9e00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007fc5c1a4f248 CR3: 000000207e012005 CR4: 00000000001606e0
 Call Trace:
  <IRQ>
  ipvlan_addr6_event+0x6c/0xd0 [ipvlan]
  notifier_call_chain+0x49/0x90
  atomic_notifier_call_chain+0x6a/0x100
  ipv6_add_addr+0x5f9/0x720
  addrconf_prefix_rcv_add_addr+0x244/0x3c0
  addrconf_prefix_rcv+0x2f3/0x790
  ndisc_router_discovery+0x633/0xb70
  ndisc_rcv+0x155/0x180
  icmpv6_rcv+0x4ac/0x5f0
  ip6_input_finish+0x138/0x6a0
  ip6_input+0x41/0x1f0
  ipv6_rcv+0x4db/0x8d0
  __netif_receive_skb_core+0x3d5/0xe40
  netif_receive_skb_internal+0x89/0x370
  napi_gro_receive+0x14f/0x1e0
  ixgbe_clean_rx_irq+0x4ce/0x1020 [ixgbe]
  ixgbe_poll+0x31a/0x7a0 [ixgbe]
  net_rx_action+0x296/0x4f0
  __do_softirq+0xcf/0x4f5
  irq_exit+0xf5/0x110
  do_IRQ+0x62/0x110
  common_interrupt+0x91/0x91
  </IRQ>

 v1 -> v2: drop unneeded in_softirq check in ipvlan_addr6_validator_event()

Fixes: e9997c29 ("ipvlan: fix check for IP addresses in control path")
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

82308194

28 2月, 2018 1 次提交

net: Convert ipvlan_net_ops · 68eabe8b

由 Kirill Tkhai 提交于 2月 26, 2018

These pernet_operations unregister ipvlan net hooks.
nf_unregister_net_hooks() removes hooks one-by-one,
and then frees the memory via rcu. This looks similar
to that happens, when a new hooks is added: allocation
of bigger memory region, copy of old content, and rcu
freeing the old memory. So, all of net code should be
well with this behavior. Also at the time of hook
unregistering, there are no packets, and foreign net
pernet_operations are not interested in others hooks.
So, we mark them as async.
Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

68eabe8b

22 2月, 2018 1 次提交

ipvlan: drop ipv6 dependency · 94333fac

由 Matteo Croce 提交于 2月 21, 2018

IPVlan has an hard dependency on IPv6, refactor the ipvlan code to allow
compiling it with IPv6 disabled, move duplicate code into addr_equal()
and refactor series of if-else into a switch.
Signed-off-by: NMatteo Croce <mcroce@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

94333fac

03 12月, 2017 1 次提交

ipvlan: Add new func ipvlan_is_valid_dev instead of duplicated codes · 5e51fe6f

由 Gao Feng 提交于 12月 01, 2017

There are multiple duplicated condition checks in the current codes, so
I add the new func ipvlan_is_valid_dev instead of the duplicated codes to
check if the netdev is real ipvlan dev.
Signed-off-by: NGao Feng <gfree.wind@vip.163.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5e51fe6f

18 11月, 2017 1 次提交

ipvlan: NULL pointer dereference panic in ipvlan_port_destroy · fe18da60

由 Girish Moodalbail 提交于 11月 16, 2017

When call to register_netdevice() (called from ipvlan_link_new()) fails,
we call ipvlan_uninit() (through ndo_uninit()) to destroy the ipvlan
port. After returning unsuccessfully from register_netdevice() we go
ahead and call ipvlan_port_destroy() again which causes NULL pointer
dereference panic. Fix the issue by making ipvlan_init() and
ipvlan_uninit() call symmetric.

The ipvlan port will now be created inside ipvlan_init() and will be
destroyed in ipvlan_uninit().

Fixes: 2ad7bf36 (ipvlan: Initial check-in of the IPVLAN driver)
Signed-off-by: NGirish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fe18da60

29 10月, 2017 2 次提交

ipvlan: implement VEPA mode · fe89aa6b

由 Mahesh Bandewar 提交于 10月 26, 2017

This is very similar to the Macvlan VEPA mode, however, there is some
difference. IPvlan uses the mac-address of the lower device, so the VEPA
mode has implications of ICMP-redirects for packets destined for its
immediate neighbors sharing same master since the packets will have same
source and dest mac. The external switch/router will send redirect msg.

Having said that, this will be useful tool in terms of debugging
since IPvlan will not switch packets within its slaves and rely completely
on the external entity as intended in 802.1Qbg.
Signed-off-by: NMahesh Bandewar <maheshb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fe89aa6b

ipvlan: introduce 'private' attribute for all existing modes. · a190d04d

由 Mahesh Bandewar 提交于 10月 26, 2017

IPvlan has always operated in bridge mode. However there are scenarios
where each slave should be able to talk through the master device but
not necessarily across each other. Think of an environment where each
of a namespace is a private and independant customer. In this scenario
the machine which is hosting these namespaces neither want to tell who
their neighbor is nor the individual namespaces care to talk to neighbor
on short-circuited network path.

This patch implements the mode that is very similar to the 'private' mode
in macvlan where individual slaves can send and receive traffic through
the master device, just that they can not talk among slave devices.
Signed-off-by: NMahesh Bandewar <maheshb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a190d04d

20 10月, 2017 2 次提交

net: Add extack to validator_info structs used for address notifier · de95e047

由 David Ahern 提交于 10月 18, 2017

Add extack to in_validator_info and in6_validator_info. Update the one
user of each, ipvlan, to return an error message for failures.

Only manual configuration of an address is plumbed in the IPv6 code path.
Signed-off-by: NDavid Ahern <dsahern@gmail.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

de95e047

net: ipv6: Make inet6addr_validator a blocking notifier · ff7883ea

由 David Ahern 提交于 10月 18, 2017

inet6addr_validator chain was added by commit 3ad7d246 ("Ipvlan
should return an error when an address is already in use") to allow
address validation before changes are committed and to be able to
fail the address change with an error back to the user. The address
validation is not done for addresses received from router
advertisements.

Handling RAs in softirq context is the only reason for the notifier
chain to be atomic versus blocking. Since the only current user, ipvlan,
of the validator chain ignores softirq context, the notifier can be made
blocking and simply not invoked for softirq path.

The blocking option is needed by spectrum for example to validate
resources for an adding an address to an interface.
Signed-off-by: NDavid Ahern <dsahern@gmail.com>
Reviewed-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ff7883ea

13 10月, 2017 1 次提交

ipvlan: always use the current L2 addr of the master · 32c10bbf

由 Mahesh Bandewar 提交于 10月 11, 2017

If the underlying master ever changes its L2 (e.g. bonding device),
then make sure that the IPvlan slaves always emit packets with the
current L2 of the master instead of the stale mac addr which was
copied during the device creation. The problem can be seen with
following script -

  #!/bin/bash
  # Create a vEth pair
  ip link add dev veth0 type veth peer name veth1
  ip link set veth0 up
  ip link set veth1 up
  ip link show veth0
  ip link show veth1
  # Create an IPvlan device on one end of this vEth pair.
  ip link add link veth0 dev ipvl0 type ipvlan mode l2
  ip link show ipvl0
  # Change the mac-address of the vEth master.
  ip link set veth0 address 02:11:22:33:44:55

Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: NMahesh Bandewar <maheshb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

32c10bbf

05 10月, 2017 1 次提交

net: Add extack to upper device linking · 42ab19ee

由 David Ahern 提交于 10月 04, 2017

Add extack arg to netdev_upper_dev_link and netdev_master_upper_dev_link
Signed-off-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

42ab19ee

02 8月, 2017 1 次提交

ipvlan: Fix 64-bit statistics seqcount initialization · 87173cd6

由 Florian Fainelli 提交于 8月 01, 2017

On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a
lockdep splat indicating this seqcount is not correctly initialized, fix
that by using the proper helper function: netdev_alloc_pcpu_stats().

Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

87173cd6

01 8月, 2017 1 次提交

netfilter: nf_hook_ops structs can be const · 591bb278

由 Florian Westphal 提交于 7月 26, 2017

We no longer place these on a list so they can be const.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

591bb278

18 7月, 2017 1 次提交
- D
  ipvlan: Stop advertising NETIF_F_UFO support. · 182e0b6b
  由 David S. Miller 提交于 7月 03, 2017
```
It is going away.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  182e0b6b
27 6月, 2017 3 次提交

net: add netlink_ext_ack argument to rtnl_link_ops.validate · a8b8a889

由 Matthias Schiffer 提交于 6月 25, 2017

Add support for extended error reporting.
Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net>
Acked-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a8b8a889

net: add netlink_ext_ack argument to rtnl_link_ops.changelink · ad744b22

由 Matthias Schiffer 提交于 6月 25, 2017

Add support for extended error reporting.
Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net>
Acked-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ad744b22

net: add netlink_ext_ack argument to rtnl_link_ops.newlink · 7a3f4a18

由 Matthias Schiffer 提交于 6月 25, 2017

Add support for extended error reporting.
Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net>
Acked-by: NDavid Ahern <dsahern@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7a3f4a18

10 6月, 2017 1 次提交

Ipvlan should return an error when an address is already in use. · 3ad7d246

由 Krister Johansen 提交于 6月 08, 2017

The ipvlan code already knows how to detect when a duplicate address is
about to be assigned to an ipvlan device. However, that failure is not
propogated outward and leads to a silent failure.

Introduce a validation step at ip address creation time and allow device
drivers to register to validate the incoming ip addresses. The ipvlan
code is the first consumer. If it detects an address in use, we can
return an error to the user before beginning to commit the new ifa in
the networking code.

This can be especially useful if it is necessary to provision many
ipvlans in containers. The provisioning software (or operator) can use
this to detect situations where an ip address is unexpectedly in use.
Signed-off-by: NKrister Johansen <kjlx@templeofstupid.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3ad7d246

08 6月, 2017 1 次提交

net: Fix inconsistent teardown and release of private netdev state. · cf124db5

由 David S. Miller 提交于 5月 08, 2017

Network devices can allocate reasources and private memory using
netdev_ops->ndo_init().  However, the release of these resources
can occur in one of two different places.

Either netdev_ops->ndo_uninit() or netdev->destructor().

The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.

netdev_ops->ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.

netdev->destructor(), on the other hand, does not run until the
netdev references all go away.

Further complicating the situation is that netdev->destructor()
almost universally does also a free_netdev().

This creates a problem for the logic in register_netdevice().
Because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.

If netdev_ops->ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
it is not able to invoke netdev->destructor().

This is because netdev->destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.

However, this means that the resources that would normally be released
by netdev->destructor() will not be.

Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.

Many drivers do not try to deal with this, and instead we have leaks.

Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev->destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().

netdev->priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev->destructor(), except for
free_netdev().

netdev->needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().

Now, register_netdevice() can sanely release all resources after
ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
and netdev->priv_destructor().

And at the end of unregister_netdevice(), we invoke
netdev->priv_destructor() and optionally call free_netdev().
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cf124db5

25 4月, 2017 1 次提交

ipvlan: use pernet operations and restrict l3s hooks to master netns · 3133822f

由 Florian Westphal 提交于 4月 20, 2017

commit 4fbae7d8 ("ipvlan: Introduce l3s mode") added
registration of netfilter hooks via nf_register_hooks().

This API provides the illusion of 'global' netfilter hooks by placing the
hooks in all current and future network namespaces.

In case of ipvlan the hook appears to be only needed in the namespace
that contains the ipvlan master device (i.e., usually init_net), so
placing them in all namespaces is not needed.

This switches ipvlan driver to pernet operations, and then only registers
hooks in namespaces where a ipvlan master device is set to l3s mode.

Extra care has to be taken when the master device is moved to another
namespace, as we might have to 'move' the netfilter hooks too.

This is done by storing the namespace the ipvlan port was created in.
On REGISTER event, do (un)register operations in the old/new namespaces.

This will also allow removal of the nf_register_hooks() in a future patch.

Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3133822f

12 2月, 2017 1 次提交

ipvtap: IP-VLAN based tap driver · 235a9d89

由 Sainath Grandhi 提交于 2月 10, 2017

This patch adds a tap character device driver that is based on the
IP-VLAN network interface, called ipvtap. An ipvtap device can be created
in the same way as an ipvlan device, using 'type ipvtap', and then accessed
using the tap user space interface.
Signed-off-by: NSainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

235a9d89

21 1月, 2017 1 次提交

ipvlan: use netdev_is_rx_handler_busy instead of checking specific type · c3262d9d

由 Mahesh Bandewar 提交于 1月 18, 2017

IPvlan checks if the master device is already used by checking a
specific device (here it's macvlan device). This is technically not
sufficient and it should just ensure the rx_handler is busy or not.
This would be a super check that includes macvlan and any other that
has already registered rx-handler.
Signed-off-by: NMahesh Bandewar <maheshb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c3262d9d

17 1月, 2017 1 次提交

ipvlan: fix dev_id creation corner case. · 019ec003

由 Mahesh Bandewar 提交于 1月 13, 2017

In the last patch da36e13c ("ipvlan: improvise dev_id generation
logic in IPvlan") I missed some part of Dave's suggestion and because
of that the dev_id creation could fail in a corner case scenario. This
would happen when more or less 64k devices have been already created and
several have been deleted. If the devices that are still sticking around
are the last n bits from the bitmap. So in this scenario even if lower
bits are available, the dev_id search is so narrow that it always fails.

Fixes: da36e13c ("ipvlan: improvise dev_id generation logic in IPvlan")
CC: David Miller <davem@davemloft.org>
CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: NMahesh Bandewar <maheshb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

019ec003

11 1月, 2017 1 次提交

ipvlan: improvise dev_id generation logic in IPvlan · da36e13c

由 Mahesh Bandewar 提交于 1月 09, 2017

The patch 009146d1 ("ipvlan: assign unique dev-id for each slave
device.") used ida_simple_get() to generate dev_ids assigned to the
slave devices. However (Eric has pointed out that) there is a shortcoming
with that approach as it always uses the first available ID. This
becomes a problem when a slave gets deleted and a new slave gets added.
The ID gets reassigned causing the new slave to get the same link-local
address. This side-effect is undesirable.

This patch adds a per-port variable that keeps track of the IDs
assigned and used as the stat-base for the IDR api. This base will be
wrapped around when it reaches the MAX (0xFFFE) value possibly on a
busy system where slaves are added and deleted routinely.

Fixes: 009146d1 ("ipvlan: assign unique dev-id for each slave device.")
Signed-off-by: NMahesh Bandewar <maheshb@google.com>
CC: Eric Dumazet <edumazet@google.com>
CC: David Miller <davem@davemloft.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

da36e13c

09 1月, 2017 1 次提交

net: make ndo_get_stats64 a void function · bc1f4470

由 stephen hemminger 提交于 1月 06, 2017

The network device operation for reading statistics is only called
in one place, and it ignores the return value. Having a structure
return value is potentially confusing because some future driver could
incorrectly assume that the return value was used.

Fix all drivers with ndo_get_stats64 to have a void function.
Signed-off-by: NStephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bc1f4470

05 1月, 2017 1 次提交

ipvlan: assign unique dev-id for each slave device. · 009146d1

由 Mahesh Bandewar 提交于 1月 03, 2017

IPvlan setup uses one mac-address (of master). The IPv6 link-local
addresses are derived using the mac-address on the link. Lack of
dev-ids makes these link-local addresses same for all slaves including
that of master device. dev-ids are necessary to add differentiation
when L2 address is shared.
Signed-off-by: NMahesh Bandewar <maheshb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

009146d1

29 12月, 2016 1 次提交

driver: ipvlan: Define common functions to decrease duplicated codes used to add or del IP address · 86673982

由 Gao Feng 提交于 12月 28, 2016

There are some duplicated codes in ipvlan_add_addr6/4 and
ipvlan_del_addr6/4. Now define two common functions ipvlan_add_addr
and ipvlan_del_addr to decrease the duplicated codes.
It could be helful to maintain the codes.
Signed-off-by: NGao Feng <fgao@ikuai8.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

86673982

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功