提交 · 41d1c8839e5f8cb781cc635f12791decee8271b7 · openeuler / Kernel

12 1月, 2019 15 次提交

net: clear skb->tstamp in bridge forwarding path · 41d1c883

由 Paolo Abeni 提交于 1月 08, 2019

Matteo reported forwarding issues inside the linux bridge,
if the enslaved interfaces use the fq qdisc.

Similar to commit 8203e2d8 ("net: clear skb->tstamp in
forwarding paths"), we need to clear the tstamp field in
the bridge forwarding path.

Fixes: 80b14dee ("net: Add a new socket option for a future transmit time.")
Fixes: fb420d5d ("tcp/fq: move back to CLOCK_MONOTONIC")
Reported-and-tested-by: NMatteo Croce <mcroce@redhat.com>
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

41d1c883

Merge branch 'bpfilter-fixes' · 3f4261d4

由 David S. Miller 提交于 1月 11, 2019

Taehee Yoo says:

====================
net: bpfilter: fix two bugs in bpfilter

This patches fix two bugs in the bpfilter_umh which are related in
iptables command.

The first patch adds an exit code for UMH process.
This provides an opportunity to cleanup members of the umh_info
to modules which use the UMH.
In order to identify UMH processes, a new flag PF_UMH is added.

The second patch makes the bpfilter_umh use UMH cleanup callback.

The third patch adds re-start routine for the bpfilter_umh.
The bpfilter_umh does not re-start after error occurred.
because there is no re-start routine in the module.

The fourth patch ensures that the bpfilter.ko module will not removed while
it's being used.
The bpfilter.ko is not protected by locks or module reference counter.
Therefore that can be removed while module is being used.
In order to protect that, mutex is used.

The first and second patch are preparation patches for the third and
fourth patch.

TEST #1
   while :
   do
	modprobe bpfilter
	kill -9 <pid of the bpfilter_umh>
	iptables -vnL
   done

TEST #2
   while :
   do
	iptables -I FORWARD -m string --string ap --algo kmp &
	iptables -F &
	modprobe -rv bpfilter &
   done

TEST #3
   while :
   do
	modprobe bpfilter &
	modprobe -rv bpfilter &
   done

The TEST1 makes a failure of iptables command.
This is fixed by the third patch.

The TEST2 makes a panic because of a race condition in the bpfilter_umh
module.
This is fixed by the fourth patch.

The TEST3 makes a double-create UMH process.
This is fixed by the third and fourth patch.

v4 :
 - declare the exit_umh() as static inline
 - check stop flag in the load_umh() to avoid a double-create UMH
v3 :
 - Avoid unnecessary list lookup for non-UMH processes
 - Add a new PF_UMH flag
v2 : add the first and second patch
v1 : Initial patch
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3f4261d4

net: bpfilter: disallow to remove bpfilter module while being used · 71a85084

由 Taehee Yoo 提交于 1月 09, 2019

The bpfilter.ko module can be removed while functions of the bpfilter.ko
are executing. so panic can occurred. in order to protect that, locks can
be used. a bpfilter_lock protects routines in the
__bpfilter_process_sockopt() but it's not enough because __exit routine
can be executed concurrently.

Now, the bpfilter_umh can not run in parallel.
So, the module do not removed while it's being used and it do not
double-create UMH process.
The members of the umh_info and the bpfilter_umh_ops are protected by
the bpfilter_umh_ops.lock.

test commands:
   while :
   do
	iptables -I FORWARD -m string --string ap --algo kmp &
	modprobe -rv bpfilter &
   done

splat looks like:
[  298.623435] BUG: unable to handle kernel paging request at fffffbfff807440b
[  298.628512] #PF error: [normal kernel read fault]
[  298.633018] PGD 124327067 P4D 124327067 PUD 11c1a3067 PMD 119eb2067 PTE 0
[  298.638859] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  298.638859] CPU: 0 PID: 2997 Comm: iptables Not tainted 4.20.0+ #154
[  298.638859] RIP: 0010:__mutex_lock+0x6b9/0x16a0
[  298.638859] Code: c0 00 00 e8 89 82 ff ff 80 bd 8f fc ff ff 00 0f 85 d9 05 00 00 48 8b 85 80 fc ff ff 48 bf 00 00 00 00 00 fc ff df 48 c1 e8 03 <80> 3c 38 00 0f 85 1d 0e 00 00 48 8b 85 c8 fc ff ff 49 39 47 58 c6
[  298.638859] RSP: 0018:ffff88810e7777a0 EFLAGS: 00010202
[  298.638859] RAX: 1ffffffff807440b RBX: ffff888111bd4d80 RCX: 0000000000000000
[  298.638859] RDX: 1ffff110235ff806 RSI: ffff888111bd5538 RDI: dffffc0000000000
[  298.638859] RBP: ffff88810e777b30 R08: 0000000080000002 R09: 0000000000000000
[  298.638859] R10: 0000000000000000 R11: 0000000000000000 R12: fffffbfff168a42c
[  298.638859] R13: ffff888111bd4d80 R14: ffff8881040e9a05 R15: ffffffffc03a2000
[  298.638859] FS:  00007f39e3758700(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
[  298.638859] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  298.638859] CR2: fffffbfff807440b CR3: 000000011243e000 CR4: 00000000001006f0
[  298.638859] Call Trace:
[  298.638859]  ? mutex_lock_io_nested+0x1560/0x1560
[  298.638859]  ? kasan_kmalloc+0xa0/0xd0
[  298.638859]  ? kmem_cache_alloc+0x1c2/0x260
[  298.638859]  ? __alloc_file+0x92/0x3c0
[  298.638859]  ? alloc_empty_file+0x43/0x120
[  298.638859]  ? alloc_file_pseudo+0x220/0x330
[  298.638859]  ? sock_alloc_file+0x39/0x160
[  298.638859]  ? __sys_socket+0x113/0x1d0
[  298.638859]  ? __x64_sys_socket+0x6f/0xb0
[  298.638859]  ? do_syscall_64+0x138/0x560
[  298.638859]  ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  298.638859]  ? __alloc_file+0x92/0x3c0
[  298.638859]  ? init_object+0x6b/0x80
[  298.638859]  ? cyc2ns_read_end+0x10/0x10
[  298.638859]  ? cyc2ns_read_end+0x10/0x10
[  298.638859]  ? hlock_class+0x140/0x140
[  298.638859]  ? sched_clock_local+0xd4/0x140
[  298.638859]  ? sched_clock_local+0xd4/0x140
[  298.638859]  ? check_flags.part.37+0x440/0x440
[  298.638859]  ? __lock_acquire+0x4f90/0x4f90
[  298.638859]  ? set_rq_offline.part.89+0x140/0x140
[ ... ]

Fixes: d2ba09c1 ("net: add skeleton of bpfilter kernel module")
Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

71a85084

net: bpfilter: restart bpfilter_umh when error occurred · 61fbf593

由 Taehee Yoo 提交于 1月 09, 2019

The bpfilter_umh will be stopped via __stop_umh() when the bpfilter
error occurred.
The bpfilter_umh() couldn't start again because there is no restart
routine.

The section of the bpfilter_umh_{start/end} is no longer .init.rodata
because these area should be reused in the restart routine. hence
the section name is changed to .bpfilter_umh.

The bpfilter_ops->start() is restart callback. it will be called when
bpfilter_umh is stopped.
The stop bit means bpfilter_umh is stopped. this bit is set by both
start and stop routine.

Before this patch,
Test commands:
   $ iptables -vnL
   $ kill -9 <pid of bpfilter_umh>
   $ iptables -vnL
   [  480.045136] bpfilter: write fail -32
   $ iptables -vnL

All iptables commands will fail.

After this patch,
Test commands:
   $ iptables -vnL
   $ kill -9 <pid of bpfilter_umh>
   $ iptables -vnL
   $ iptables -vnL

Now, all iptables commands will work.

Fixes: d2ba09c1 ("net: add skeleton of bpfilter kernel module")
Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

61fbf593

net: bpfilter: use cleanup callback to release umh_info · 5b4cb650

由 Taehee Yoo 提交于 1月 09, 2019

Now, UMH process is killed, do_exit() calls the umh_info->cleanup callback
to release members of the umh_info.
This patch makes bpfilter_umh's cleanup routine to use the
umh_info->cleanup callback.
Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5b4cb650

umh: add exit routine for UMH process · 73ab1cb2

由 Taehee Yoo 提交于 1月 09, 2019

A UMH process which is created by the fork_usermode_blob() such as
bpfilter needs to release members of the umh_info when process is
terminated.
But the do_exit() does not release members of the umh_info. hence module
which uses UMH needs own code to detect whether UMH process is
terminated or not.
But this implementation needs extra code for checking the status of
UMH process. it eventually makes the code more complex.

The new PF_UMH flag is added and it is used to identify UMH processes.
The exit_umh() does not release members of the umh_info.
Hence umh_info->cleanup callback should release both members of the
umh_info and the private data.
Suggested-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73ab1cb2

isdn: i4l: isdn_tty: Fix some concurrency double-free bugs · 2ff33d66

由 Jia-Ju Bai 提交于 1月 08, 2019

The functions isdn_tty_tiocmset() and isdn_tty_set_termios() may be
concurrently executed.

isdn_tty_tiocmset
  isdn_tty_modem_hup
    line 719: kfree(info->dtmf_state);
    line 721: kfree(info->silence_state);
    line 723: kfree(info->adpcms);
    line 725: kfree(info->adpcmr);

isdn_tty_set_termios
  isdn_tty_modem_hup
    line 719: kfree(info->dtmf_state);
    line 721: kfree(info->silence_state);
    line 723: kfree(info->adpcms);
    line 725: kfree(info->adpcmr);

Thus, some concurrency double-free bugs may occur.

These possible bugs are found by a static tool written by myself and
my manual code review.

To fix these possible bugs, the mutex lock "modem_info_mutex" used in
isdn_tty_tiocmset() is added in isdn_tty_set_termios().
Signed-off-by: NJia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2ff33d66

vhost/vsock: fix vhost vsock cid hashing inconsistent · 7fbe078c

由 Zha Bin 提交于 1月 08, 2019

The vsock core only supports 32bit CID, but the Virtio-vsock spec define
CID (dst_cid and src_cid) as u64 and the upper 32bits is reserved as
zero. This inconsistency causes one bug in vhost vsock driver. The
scenarios is:

  0. A hash table (vhost_vsock_hash) is used to map an CID to a vsock
  object. And hash_min() is used to compute the hash key. hash_min() is
  defined as:
  (sizeof(val) <= 4 ? hash_32(val, bits) : hash_long(val, bits)).
  That means the hash algorithm has dependency on the size of macro
  argument 'val'.
  0. In function vhost_vsock_set_cid(), a 64bit CID is passed to
  hash_min() to compute the hash key when inserting a vsock object into
  the hash table.
  0. In function vhost_vsock_get(), a 32bit CID is passed to hash_min()
  to compute the hash key when looking up a vsock for an CID.

Because the different size of the CID, hash_min() returns different hash
key, thus fails to look up the vsock object for an CID.

To fix this bug, we keep CID as u64 in the IOCTLs and virtio message
headers, but explicitly convert u64 to u32 when deal with the hash table
and vsock core.

Fixes: 834e772c ("vhost/vsock: fix use-after-free in network stack callers")
Link: https://github.com/stefanha/virtio/blob/vsock/trunk/content.texSigned-off-by: NZha Bin <zhabin@linux.alibaba.com>
Reviewed-by: NLiu Jiang <gerry@linux.alibaba.com>
Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
Acked-by: NJason Wang <jasowang@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7fbe078c

Merge branch 'stmmac-fixes' · 5fea7f10

由 David S. Miller 提交于 1月 11, 2019

Jose Abreu says:

====================
net: stmmac: Misc Fixes

Some small fixes for stmmac targeting -net. Detailed info in commit log.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5fea7f10

net: stmmac: Prevent RX starvation in stmmac_napi_poll() · fa0be0a4

由 Jose Abreu 提交于 1月 09, 2019

Currently, TX is given a budget which is consumed by stmmac_tx_clean()
and stmmac_rx() is given the remaining non-consumed budget.

This is wrong and in case we are sending a large number of packets this
can starve RX because remaining budget will be low.

Let's give always the same budget for RX and TX clean.

While at it, check if we missed any interrupts while we were in NAPI
callback by looking at DMA interrupt status.

Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: NJose Abreu <joabreu@synopsys.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fa0be0a4

net: stmmac: Fix the logic of checking if RX Watchdog must be enabled · 3b509466

由 Jose Abreu 提交于 1月 09, 2019

RX Watchdog can be disabled by platform definitions but currently we are
initializing the descriptors before checking if Watchdog must be
disabled or not.

Fix this by checking earlier if user wants Watchdog disabled or not.

Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: NJose Abreu <joabreu@synopsys.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3b509466

net: stmmac: Check if CBS is supported before configuring · 0650d401

由 Jose Abreu 提交于 1月 09, 2019

Check if CBS is currently supported before trying to configure it in HW.

Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: NJose Abreu <joabreu@synopsys.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0650d401

net: stmmac: dwxgmac2: Only clear interrupts that are active · fcc509eb

由 Jose Abreu 提交于 1月 09, 2019

In DMA interrupt handler we were clearing all interrupts status, even
the ones that were not active. Fix this and only clear the active
interrupts.

Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: NJose Abreu <joabreu@synopsys.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fcc509eb

net: stmmac: Fix PCI module removal leak · 6dea7e18

由 Jose Abreu 提交于 1月 09, 2019

Since commit b7d0f08e, the enable / disable of PCI device is not
managed which will result in IO regions not being automatically unmapped.
As regions continue mapped it is currently not possible to remove and
then probe again the PCI module of stmmac.

Fix this by manually unmapping regions on remove callback.

Changes from v1:
- Fix build error

Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Fixes: b7d0f08e ("net: stmmac: Fix WoL for PCI-based setups")
Signed-off-by: NJose Abreu <joabreu@synopsys.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6dea7e18

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · e8b108b0

由 David S. Miller 提交于 1月 11, 2019

Daniel Borkmann says:

====================
pull-request: bpf 2019-01-11

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix TCP-BPF support for correctly setting the initial window
   via TCP_BPF_IW on an active TFO sender, from Yuchung.

2) Fix a panic in BPF's stack_map_get_build_id()'s ELF parsing on
   32 bit archs caused by page_address() returning NULL, from Song.

3) Fix BTF pretty print in kernel and bpftool when bitfield member
   offset is greater than 256. Also add test cases, from Yonghong.

4) Fix improper argument handling in xdp1 sample, from Ioana.

5) Install missing tcp_server.py and tcp_client.py files from
   BPF selftests, from Anders.

6) Add test_libbpf to gitignore in libbpf and BPF selftests,
   from Stanislav.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e8b108b0

11 1月, 2019 8 次提交

Merge branch 'bpf-fix-bitfield-printing' · fb4129b9

由 Daniel Borkmann 提交于 1月 11, 2019

Yonghong Song says:

====================
The previous BTF kind_flag support patch set introduced a bug
for kernel bpffs pretty printing and another bug for bpftool
map pretty printing. If a bitfield struct member offset is
greater than 256 bits, printed value for that struct
member will be incorrect.

- Patch #1 fixed the bug in kernel bpffs pretty printing.
- Patch #2 enhanced the test_btf test case to cover the
           issue exposed by patch #1.
- Patch #3 fixed the bug in bpftool map pretty printing.
====================
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

fb4129b9

tools/bpf: fix bpftool map dump with bitfields · 298e59d3

由 Yonghong Song 提交于 1月 10, 2019

Commit 8772c8bc ("tools: bpftool: support pretty print
with kind_flag set") added bpftool map dump with kind_flag
support. When bitfield_size can be retrieved directly from
btf_member, function btf_dumper_bitfield() is called to
dump the bitfield. The implementation passed the
wrong parameter "bit_offset" to the function. The excepted
value is the bit_offset within a byte while the passed-in
value is the struct member offset.

This commit fixed the bug with passing correct "bit_offset"
with adjusted data pointer.

Fixes: 8772c8bc ("tools: bpftool: support pretty print with kind_flag set")
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NYonghong Song <yhs@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

298e59d3

tools/bpf: test btf bitfield with >=256 struct member offset · e43207fa

由 Yonghong Song 提交于 1月 10, 2019

This patch modified test_btf pretty print test to cover
the bitfield with struct member equal to or greater 256.

Without the previous kernel patch fix, the modified test will fail:

  $ test_btf -p
  ......
  BTF pretty print array(#1)......unexpected pprint output
  expected: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x1}
      read: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x0}

  BTF pretty print array(#2)......unexpected pprint output
  expected: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x1}
      read: 0: {0,0,0,0x3,0x0,0x3,{0|[0,0,0,0,0,0,0,0]},ENUM_ZERO,4,0x0}

  PASS:6 SKIP:0 FAIL:2

With the kernel fix, the modified test will succeed:
  $ test_btf -p
  ......
  BTF pretty print array(#1)......OK
  BTF pretty print array(#2)......OK
  PASS:8 SKIP:0 FAIL:0

Fixes: 9d5f9f70 ("bpf: btf: fix struct/union/fwd types with kind_flag")
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NYonghong Song <yhs@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

e43207fa

bpf: fix bpffs bitfield pretty print · 17e3ac81

由 Yonghong Song 提交于 1月 10, 2019

Commit 9d5f9f70 ("bpf: btf: fix struct/union/fwd types
with kind_flag") introduced kind_flag and used bitfield_size
in the btf_member to directly pretty print member values.

The commit contained a bug where the incorrect parameters could be
passed to function btf_bitfield_seq_show(). The bits_offset
parameter in the function expects a value less than 8.
Instead, the member offset in the structure is passed.

The below is btf_bitfield_seq_show() func signature:
  void btf_bitfield_seq_show(void *data, u8 bits_offset,
                             u8 nr_bits, struct seq_file *m)
both bits_offset and nr_bits are u8 type. If the bitfield
member offset is greater than 256, incorrect value will
be printed.

This patch fixed the issue by calculating correct proper
data offset and bits_offset similar to non kind_flag case.

Fixes: 9d5f9f70 ("bpf: btf: fix struct/union/fwd types with kind_flag")
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NYonghong Song <yhs@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

17e3ac81

net: ethernet: mediatek: fix warning in phy_start_aneg · b19bce03

由 Heiner Kallweit 提交于 1月 09, 2019

linux 5.0-rc1 shows following warning on bpi-r2/mt7623 bootup:

[ 5.170597] WARNING: CPU: 3 PID: 1 at drivers/net/phy/phy.c:548 phy_start_aneg+0x110/0x144
[ 5.178826] called from state READY
....
[ 5.264111] [<c0629fd4>] (phy_start_aneg) from [<c0e3e720>] (mtk_init+0x414/0x47c)
[ 5.271630] r7:df5f5eec r6:c0f08c48 r5:00000000 r4:dea67800
[ 5.277256] [<c0e3e30c>] (mtk_init) from [<c07dabbc>] (register_netdevice+0x98/0x51c)
[ 5.285035] r8:00000000 r7:00000000 r6:c0f97080 r5:c0f08c48 r4:dea67800
[ 5.291693] [<c07dab24>] (register_netdevice) from [<c07db06c>] (register_netdev+0x2c/0x44)
[ 5.299989] r8:00000000 r7:dea2e608 r6:deacea00 r5:dea2e604 r4:dea67800
[ 5.306646] [<c07db040>] (register_netdev) from [<c06326d8>] (mtk_probe+0x668/0x7ac)
[ 5.314336] r5:dea2e604 r4:dea2e040
[ 5.317890] [<c0632070>] (mtk_probe) from [<c05a78fc>] (platform_drv_probe+0x58/0xa8)
[ 5.325670] r10:c0f86bac r9:00000000 r8:c0fbe578 r7:00000000 r6:c0f86bac r5:00000000
[ 5.333445] r4:deacea10
[ 5.335963] [<c05a78a4>] (platform_drv_probe) from [<c05a5248>] (really_probe+0x2d8/0x424)

maybe other boards using this generic driver are affected

v2:
optimization:

- phy_set_max_speed() is only needed if you want to reduce the
  max speed, typically if the PHY supports 1Gbps but the MAC
  supports 100Mbps only.

- The pause parameters are autonegotiated. Except you have a specific
  need you normally don't need to manually fiddle with this.

- phy_start_aneg() is called implicitly by the phylib state machine,
  you shouldn't call it manually except you have a good excuse.

- netif_carrier_on/netif_carrier_off in mtk_phy_link_adjust() isn't
  needed. It's done by phy_link_change() in phylib.
Signed-off-by: NFrank Wunderlich <frank-w@public-files.de>
Reviewed-by: NHeiner Kallweit <hkallweit1@gmail.com>
Acked-by: NSean Wang <sean.wang@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b19bce03

tcp: change txhash on SYN-data timeout · c5715b8f

由 Yuchung Cheng 提交于 1月 08, 2019

Previously upon SYN timeouts the sender recomputes the txhash to
try a different path. However this does not apply on the initial
timeout of SYN-data (active Fast Open). Therefore an active IPv6
Fast Open connection may incur one second RTO penalty to take on
a new path after the second SYN retransmission uses a new flow label.

This patch removes this undesirable behavior so Fast Open changes
the flow label just like the regular connections. This also helps
avoid falsely disabling Fast Open on the sender which triggers
after two consecutive SYN timeouts on Fast Open.
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Reviewed-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c5715b8f

net: dsa: mv88x6xxx: mv88e6390 errata · ea89098e

由 Andrew Lunn 提交于 1月 09, 2019

The 6390 copper ports have an errata which require poking magic values
into undocumented magic registers and then performing a software
reset.
Signed-off-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ea89098e

bonding: update nest level on unlink · 001e465f

由 Willem de Bruijn 提交于 1月 08, 2019

A network device stack with multiple layers of bonding devices can
trigger a false positive lockdep warning. Adding lockdep nest levels
fixes this. Update the level on both enslave and unlink, to avoid the
following series of events ..

    ip netns add test
    ip netns exec test bash
    ip link set dev lo addr 00:11:22:33:44:55
    ip link set dev lo down

    ip link add dev bond1 type bond
    ip link add dev bond2 type bond

    ip link set dev lo master bond1
    ip link set dev bond1 master bond2

    ip link set dev bond1 nomaster
    ip link set dev bond2 master bond1

.. from still generating a splat:

    [  193.652127] ======================================================
    [  193.658231] WARNING: possible circular locking dependency detected
    [  193.664350] 4.20.0 #8 Not tainted
    [  193.668310] ------------------------------------------------------
    [  193.674417] ip/15577 is trying to acquire lock:
    [  193.678897] 00000000a40e3b69 (&(&bond->stats_lock)->rlock#3/3){+.+.}, at: bond_get_stats+0x58/0x290
    [  193.687851]
    	       but task is already holding lock:
    [  193.693625] 00000000807b9d9f (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0x58/0x290

    [..]

    [  193.851092]        lock_acquire+0xa7/0x190
    [  193.855138]        _raw_spin_lock_nested+0x2d/0x40
    [  193.859878]        bond_get_stats+0x58/0x290
    [  193.864093]        dev_get_stats+0x5a/0xc0
    [  193.868140]        bond_get_stats+0x105/0x290
    [  193.872444]        dev_get_stats+0x5a/0xc0
    [  193.876493]        rtnl_fill_stats+0x40/0x130
    [  193.880797]        rtnl_fill_ifinfo+0x6c5/0xdc0
    [  193.885271]        rtmsg_ifinfo_build_skb+0x86/0xe0
    [  193.890091]        rtnetlink_event+0x5b/0xa0
    [  193.894320]        raw_notifier_call_chain+0x43/0x60
    [  193.899225]        netdev_change_features+0x50/0xa0
    [  193.904044]        bond_compute_features.isra.46+0x1ab/0x270
    [  193.909640]        bond_enslave+0x141d/0x15b0
    [  193.913946]        do_set_master+0x89/0xa0
    [  193.918016]        do_setlink+0x37c/0xda0
    [  193.921980]        __rtnl_newlink+0x499/0x890
    [  193.926281]        rtnl_newlink+0x48/0x70
    [  193.930238]        rtnetlink_rcv_msg+0x171/0x4b0
    [  193.934801]        netlink_rcv_skb+0xd1/0x110
    [  193.939103]        rtnetlink_rcv+0x15/0x20
    [  193.943151]        netlink_unicast+0x3b5/0x520
    [  193.947544]        netlink_sendmsg+0x2fd/0x3f0
    [  193.951942]        sock_sendmsg+0x38/0x50
    [  193.955899]        ___sys_sendmsg+0x2ba/0x2d0
    [  193.960205]        __x64_sys_sendmsg+0xad/0x100
    [  193.964687]        do_syscall_64+0x5a/0x460
    [  193.968823]        entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 7e2556e4 ("bonding: avoid lockdep confusion in bond_get_stats()")
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

001e465f

10 1月, 2019 11 次提交

bpf: fix panic in stack_map_get_build_id() on i386 and arm32 · beaf3d19

由 Song Liu 提交于 1月 08, 2019

As Naresh reported, test_stacktrace_build_id() causes panic on i386 and
arm32 systems. This is caused by page_address() returns NULL in certain
cases.

This patch fixes this error by using kmap_atomic/kunmap_atomic instead
of page_address.

Fixes: 615755a7 (" bpf: extend stackmap to save binary_build_id+offset instead of address")
Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: NSong Liu <songliubraving@fb.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

beaf3d19

selftests: bpf: install files tcp_(server|client)*.py · f98937c6

由 Anders Roxell 提交于 1月 08, 2019

When test_tcpbpf_user runs it complains that it can't find files
tcp_server.py and tcp_client.py.

Rework so that tcp_server.py and tcp_client.py gets installed, added them
to the variable TEST_PROGS_EXTENDED.

Fixes: d6d4f60c ("bpf: add selftest for tcpbpf")
Signed-off-by: NAnders Roxell <anders.roxell@linaro.org>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

f98937c6

samples: bpf: user proper argument index · 11b36abc

由 Ioana Ciornei 提交于 1月 09, 2019

Use optind as index for argv instead of a hardcoded value.
When the program has options this leads to improper parameter handling.

Fixes: dc378a1a ("samples: bpf: get ifindex from ifname")
Signed-off-by: NIoana Ciornei <ioana.ciornei@nxp.com>
Acked-by: NMatteo Croce <mcroce@redhat.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

11b36abc

selftests/bpf: add missing executables to .gitignore · e3ca63de

由 Stanislav Fomichev 提交于 1月 08, 2019

We build test_libbpf with CXX to make sure linking against C++ works.

$ make -s -C tools/lib/bpf
$ git status -sb
? tools/lib/bpf/test_libbpf
$ make -s -C tools/testing/selftests/bpf
$ git status -sb
? tools/lib/bpf/test_libbpf
? tools/testing/selftests/bpf/test_libbpf

Fixes: 8c4905b9 ("libbpf: make sure bpf headers are c++ include-able")
Signed-off-by: NStanislav Fomichev <sdf@google.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>

e3ca63de

ipv6: fix kernel-infoleak in ipv6_local_error() · 7d033c9f

由 Eric Dumazet 提交于 1月 08, 2019

This patch makes sure the flow label in the IPv6 header
forged in ipv6_local_error() is initialized.

BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
CPU: 1 PID: 24675 Comm: syz-executor1 Not tainted 4.20.0-rc7+ #4
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x173/0x1d0 lib/dump_stack.c:113
 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
 kmsan_internal_check_memory+0x455/0xb00 mm/kmsan/kmsan.c:675
 kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601
 _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
 copy_to_user include/linux/uaccess.h:177 [inline]
 move_addr_to_user+0x2e9/0x4f0 net/socket.c:227
 ___sys_recvmsg+0x5d7/0x1140 net/socket.c:2284
 __sys_recvmsg net/socket.c:2327 [inline]
 __do_sys_recvmsg net/socket.c:2337 [inline]
 __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
 __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x457ec9
Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f8750c06c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9
RDX: 0000000000002000 RSI: 0000000020000400 RDI: 0000000000000005
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8750c076d4
R13: 00000000004c4a60 R14: 00000000004d8140 R15: 00000000ffffffff

Uninit was stored to memory at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
 kmsan_save_stack mm/kmsan/kmsan.c:219 [inline]
 kmsan_internal_chain_origin+0x134/0x230 mm/kmsan/kmsan.c:439
 __msan_chain_origin+0x70/0xe0 mm/kmsan/kmsan_instr.c:200
 ipv6_recv_error+0x1e3f/0x1eb0 net/ipv6/datagram.c:475
 udpv6_recvmsg+0x398/0x2ab0 net/ipv6/udp.c:335
 inet_recvmsg+0x4fb/0x600 net/ipv4/af_inet.c:830
 sock_recvmsg_nosec net/socket.c:794 [inline]
 sock_recvmsg+0x1d1/0x230 net/socket.c:801
 ___sys_recvmsg+0x4d5/0x1140 net/socket.c:2278
 __sys_recvmsg net/socket.c:2327 [inline]
 __do_sys_recvmsg net/socket.c:2337 [inline]
 __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
 __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
 kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158
 kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
 kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
 slab_post_alloc_hook mm/slab.h:446 [inline]
 slab_alloc_node mm/slub.c:2759 [inline]
 __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383
 __kmalloc_reserve net/core/skbuff.c:137 [inline]
 __alloc_skb+0x309/0xa20 net/core/skbuff.c:205
 alloc_skb include/linux/skbuff.h:998 [inline]
 ipv6_local_error+0x1a7/0x9e0 net/ipv6/datagram.c:334
 __ip6_append_data+0x129f/0x4fd0 net/ipv6/ip6_output.c:1311
 ip6_make_skb+0x6cc/0xcf0 net/ipv6/ip6_output.c:1775
 udpv6_sendmsg+0x3f8e/0x45d0 net/ipv6/udp.c:1384
 inet_sendmsg+0x54a/0x720 net/ipv4/af_inet.c:798
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg net/socket.c:631 [inline]
 __sys_sendto+0x8c4/0xac0 net/socket.c:1788
 __do_sys_sendto net/socket.c:1800 [inline]
 __se_sys_sendto+0x107/0x130 net/socket.c:1796
 __x64_sys_sendto+0x6e/0x90 net/socket.c:1796
 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7

Bytes 4-7 of 28 are uninitialized
Memory access of size 28 starts at ffff8881937bfce0
Data copied to user address 0000000020000000
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7d033c9f

net/core/neighbour: tell kmemleak about hash tables · 85704cb8

由 Konstantin Khlebnikov 提交于 1月 08, 2019

This fixes false-positive kmemleak reports about leaked neighbour entries:

unreferenced object 0xffff8885c6e4d0a8 (size 1024):
  comm "softirq", pid 0, jiffies 4294922664 (age 167640.804s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 20 2c f3 83 ff ff ff ff  ........ ,......
    08 c0 ef 5f 84 88 ff ff 01 8c 7d 02 01 00 00 00  ..._......}.....
  backtrace:
    [<00000000748509fe>] ip6_finish_output2+0x887/0x1e40
    [<0000000036d7a0d8>] ip6_output+0x1ba/0x600
    [<0000000027ea7dba>] ip6_send_skb+0x92/0x2f0
    [<00000000d6e2111d>] udp_v6_send_skb.isra.24+0x680/0x15e0
    [<000000000668a8be>] udpv6_sendmsg+0x18c9/0x27a0
    [<000000004bd5fa90>] sock_sendmsg+0xb3/0xf0
    [<000000008227b29f>] ___sys_sendmsg+0x745/0x8f0
    [<000000008698009d>] __sys_sendmsg+0xde/0x170
    [<00000000889dacf1>] do_syscall_64+0x9b/0x400
    [<0000000081cdb353>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<000000005767ed39>] 0xffffffffffffffff
Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

85704cb8

net: cxgb4: fix various indentation issues · fd21c89b

由 Colin Ian King 提交于 1月 07, 2019

There are some lines that have indentation issues, fix these.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fd21c89b

net: cxgb3: fix various indentation issues · 2acc0abc

由 Colin Ian King 提交于 1月 07, 2019

There are handful of lines that have indentation issues, fix these.
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2acc0abc

ip: on queued skb use skb_header_pointer instead of pskb_may_pull · 4a06fa67

由 Willem de Bruijn 提交于 1月 07, 2019

Commit 2efd4fca ("ip: in cmsg IP(V6)_ORIGDSTADDR call
pskb_may_pull") avoided a read beyond the end of the skb linear
segment by calling pskb_may_pull.

That function can trigger a BUG_ON in pskb_expand_head if the skb is
shared, which it is when when peeking. It can also return ENOMEM.

Avoid both by switching to safer skb_header_pointer.

Fixes: 2efd4fca ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull")
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Suggested-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4a06fa67

tun: publish tfile after it's fully initialized · 0b7959b6

由 Stanislav Fomichev 提交于 1月 07, 2019

BUG: unable to handle kernel NULL pointer dereference at 00000000000000d1
Call Trace:
 ? napi_gro_frags+0xa7/0x2c0
 tun_get_user+0xb50/0xf20
 tun_chr_write_iter+0x53/0x70
 new_sync_write+0xff/0x160
 vfs_write+0x191/0x1e0
 __x64_sys_write+0x5e/0xd0
 do_syscall_64+0x47/0xf0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

I think there is a subtle race between sending a packet via tap and
attaching it:

CPU0:                    CPU1:
tun_chr_ioctl(TUNSETIFF)
  tun_set_iff
    tun_attach
      rcu_assign_pointer(tfile->tun, tun);
                         tun_fops->write_iter()
                           tun_chr_write_iter
                             tun_napi_alloc_frags
                               napi_get_frags
                                 napi->skb = napi_alloc_skb
      tun_napi_init
        netif_napi_add
          napi->skb = NULL
                              napi->skb is NULL here
                              napi_gro_frags
                                napi_frags_skb
				  skb = napi->skb
				  skb_reset_mac_header(skb)
				  panic()

Move rcu_assign_pointer(tfile->tun) and rcu_assign_pointer(tun->tfiles) to
be the last thing we do in tun_attach(); this should guarantee that when we
call tun_get() we always get an initialized object.

v2 changes:
* remove extra napi_mutex locks/unlocks for napi operations
Reported-by: Nsyzbot <syzkaller@googlegroups.com>
Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: NStanislav Fomichev <sdf@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0b7959b6

bpf: correctly set initial window on active Fast Open sender · 31aa6503

由 Yuchung Cheng 提交于 1月 08, 2019

The existing BPF TCP initial congestion window (TCP_BPF_IW) does not
to work on (active) Fast Open sender. This is because it changes the
(initial) window only if data_segs_out is zero -- but data_segs_out
is also incremented on SYN-data.  This patch fixes the issue by
proerly accounting for SYN-data additionally.

Fixes: fc747810 ("bpf: Adds support for setting initial cwnd")
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Reviewed-by: NNeal Cardwell <ncardwell@google.com>
Acked-by: NLawrence Brakmo <brakmo@fb.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

31aa6503

09 1月, 2019 6 次提交

packet: Do not leak dev refcounts on error exit · d972f3dc

由 Jason Gunthorpe 提交于 1月 08, 2019

'dev' is non NULL when the addr_len check triggers so it must goto a label
that does the dev_put otherwise dev will have a leaked refcount.

This bug causes the ib_ipoib module to become unloadable when using
systemd-network as it triggers this check on InfiniBand links.

Fixes: 99137b78 ("packet: validate address length")
Reported-by: NLeon Romanovsky <leonro@mellanox.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d972f3dc

Merge branch 'mlxsw-fixes' · 4314b1f6

由 David S. Miller 提交于 1月 08, 2019

Daniel Borkmann says:

====================
pull-request: bpf 2019-01-08

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix BSD'ism in sendmsg(2) to rewrite unspecified IPv6 dst for
   unconnected UDP sockets with [::1] _after_ cgroup BPF invocation,
   from Andrey.

2) Follow-up fix to the speculation fix where we need to reject a
   corner case for sanitation when ptr and scalars are mixed in the
   same alu op. Also, some unrelated minor doc fixes, from Daniel.

3) Fix BPF kselftest's incorrect uses of create_and_get_cgroup()
   by not assuming fd of zero value to be the result of an error
   case, from Stanislav.
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4314b1f6

selftests: forwarding: Add a test for VLAN deletion · 4fabf3bf

由 Ido Schimmel 提交于 1月 08, 2019

Add a VLAN on a bridge port, delete it and make sure the PVID VLAN is
not affected.
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4fabf3bf

mlxsw: spectrum_switchdev: Set PVID correctly during VLAN deletion · 674bed5d

由 Ido Schimmel 提交于 1月 08, 2019

When a VLAN is deleted from a bridge port we should not change the PVID
unless the deleted VLAN is the PVID.

Fixes: fe9ccc78 ("mlxsw: spectrum_switchdev: Don't batch VLAN operations")
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

674bed5d

selftests: forwarding: Fix test for different devices · 289fb44d

由 Ido Schimmel 提交于 1月 08, 2019

When running the test on the Spectrum ASIC the generated packets are
counted on the ingress filter and injected back to the pipeline because
of the 'pass' action. The router block then drops the packets due to
checksum error, as the test generates packets with zero checksum.

When running the test on an emulator that is not as strict about
checksum errors the test fails since packets are counted twice. Once by
the emulated ASIC on its ingress filter and again by the kernel as the
emulator does not perform checksum validation and allows the packets to
be trapped by a matching host route.

Fix this by changing the action to 'drop', which will prevent the packet
from continuing further in the pipeline to the router block.

For veth pairs this change is essentially a NOP given packets are only
processed once (by the kernel).

Fixes: a0b61f3d ("selftests: forwarding: vxlan_bridge_1d: Add an ECN decap test")
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Reviewed-by: NPetr Machata <petrm@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

289fb44d

net: bridge: Fix VLANs memory leak · 27973793

由 Ido Schimmel 提交于 1月 08, 2019

When adding / deleting VLANs to / from a bridge port, the bridge driver
first tries to propagate the information via switchdev and falls back to
the 8021q driver in case the underlying driver does not support
switchdev. This can result in a memory leak [1] when VXLAN and mlxsw
ports are enslaved to the bridge:

$ ip link set dev vxlan0 master br0
# No mlxsw ports are enslaved to 'br0', so mlxsw ignores the switchdev
# notification and the bridge driver adds the VLAN on 'vxlan0' via the
# 8021q driver
$ bridge vlan add vid 10 dev vxlan0 pvid untagged
# mlxsw port is enslaved to the bridge
$ ip link set dev swp1 master br0
# mlxsw processes the switchdev notification and the 8021q driver is
# skipped
$ bridge vlan del vid 10 dev vxlan0

This results in 'struct vlan_info' and 'struct vlan_vid_info' being
leaked, as they were allocated by the 8021q driver during VLAN addition,
but never freed as the 8021q driver was skipped during deletion.

Fix this by introducing a new VLAN private flag that indicates whether
the VLAN was added on the port by switchdev or the 8021q driver. If the
VLAN was added by the 8021q driver, then we make sure to delete it via
the 8021q driver as well.

[1]
unreferenced object 0xffff88822d20b1e8 (size 256):
  comm "bridge", pid 2532, jiffies 4295216998 (age 1188.830s)
  hex dump (first 32 bytes):
    e0 42 97 ce 81 88 ff ff 00 00 00 00 00 00 00 00  .B..............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000f82d851d>] kmem_cache_alloc_trace+0x1be/0x330
    [<00000000e0178b02>] vlan_vid_add+0x661/0x920
    [<00000000218ebd5f>] __vlan_add+0x1be9/0x3a00
    [<000000006eafa1ca>] nbp_vlan_add+0x8b3/0xd90
    [<000000003535392c>] br_vlan_info+0x132/0x410
    [<00000000aedaa9dc>] br_afspec+0x75c/0x870
    [<00000000f5716133>] br_setlink+0x3dc/0x6d0
    [<00000000aceca5e2>] rtnl_bridge_setlink+0x615/0xb30
    [<00000000a2f2d23e>] rtnetlink_rcv_msg+0x3a3/0xa80
    [<0000000064097e69>] netlink_rcv_skb+0x152/0x3c0
    [<000000008be8d614>] rtnetlink_rcv+0x21/0x30
    [<000000009ab2ca25>] netlink_unicast+0x52f/0x740
    [<00000000e7d9ac96>] netlink_sendmsg+0x9c7/0xf50
    [<000000005d1e2050>] sock_sendmsg+0xbe/0x120
    [<00000000d51426bc>] ___sys_sendmsg+0x778/0x8f0
    [<00000000b9d7b2cc>] __sys_sendmsg+0x112/0x270
unreferenced object 0xffff888227454308 (size 32):
  comm "bridge", pid 2532, jiffies 4295216998 (age 1188.882s)
  hex dump (first 32 bytes):
    88 b2 20 2d 82 88 ff ff 88 b2 20 2d 82 88 ff ff  .. -...... -....
    81 00 0a 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000f82d851d>] kmem_cache_alloc_trace+0x1be/0x330
    [<0000000018050631>] vlan_vid_add+0x3e6/0x920
    [<00000000218ebd5f>] __vlan_add+0x1be9/0x3a00
    [<000000006eafa1ca>] nbp_vlan_add+0x8b3/0xd90
    [<000000003535392c>] br_vlan_info+0x132/0x410
    [<00000000aedaa9dc>] br_afspec+0x75c/0x870
    [<00000000f5716133>] br_setlink+0x3dc/0x6d0
    [<00000000aceca5e2>] rtnl_bridge_setlink+0x615/0xb30
    [<00000000a2f2d23e>] rtnetlink_rcv_msg+0x3a3/0xa80
    [<0000000064097e69>] netlink_rcv_skb+0x152/0x3c0
    [<000000008be8d614>] rtnetlink_rcv+0x21/0x30
    [<000000009ab2ca25>] netlink_unicast+0x52f/0x740
    [<00000000e7d9ac96>] netlink_sendmsg+0x9c7/0xf50
    [<000000005d1e2050>] sock_sendmsg+0xbe/0x120
    [<00000000d51426bc>] ___sys_sendmsg+0x778/0x8f0
    [<00000000b9d7b2cc>] __sys_sendmsg+0x112/0x270

Fixes: d70e42b2 ("mlxsw: spectrum: Enable VxLAN enslavement to VLAN-aware bridges")
Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
Reviewed-by: NPetr Machata <petrm@mellanox.com>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: bridge@lists.linux-foundation.org
Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

27973793

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功