Commits · e28587cc491ef0f3c51258fdc87fbc386b1d4c59 · openeuler / Kernel

17 Dec, 2021 3 commits

sit: do not call ipip6_dev_free() from sit_init_net() · e28587cc

Committed by Eric Dumazet on Dec 16, 2021

ipip6_dev_free() is the sit dev->priv_destructor, which is already
called by register_netdevice() if something goes wrong.

An alternative would be to make ipip6_dev_free() robust against
multiple invocations, but other drivers do not implement this
strategy.

syzbot reported:

dst_release underflow
WARNING: CPU: 0 PID: 5059 at net/core/dst.c:173 dst_release+0xd8/0xe0 net/core/dst.c:173
Modules linked in:
CPU: 1 PID: 5059 Comm: syz-executor.4 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:dst_release+0xd8/0xe0 net/core/dst.c:173
Code: 4c 89 f2 89 d9 31 c0 5b 41 5e 5d e9 da d5 44 f9 e8 1d 90 5f f9 c6 05 87 48 c6 05 01 48 c7 c7 80 44 99 8b 31 c0 e8 e8 67 29 f9 <0f> 0b eb 85 0f 1f 40 00 53 48 89 fb e8 f7 8f 5f f9 48 83 c3 a8 48
RSP: 0018:ffffc9000aa5faa0 EFLAGS: 00010246
RAX: d6894a925dd15a00 RBX: 00000000ffffffff RCX: 0000000000040000
RDX: ffffc90005e19000 RSI: 000000000003ffff RDI: 0000000000040000
RBP: 0000000000000000 R08: ffffffff816a1f42 R09: ffffed1017344f2c
R10: ffffed1017344f2c R11: 0000000000000000 R12: 0000607f462b1358
R13: 1ffffffff1bfd305 R14: ffffe8ffffcb1358 R15: dffffc0000000000
FS:  00007f66c71a2700(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f88aaed5058 CR3: 0000000023e0f000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 dst_cache_destroy+0x107/0x1e0 net/core/dst_cache.c:160
 ipip6_dev_free net/ipv6/sit.c:1414 [inline]
 sit_init_net+0x229/0x550 net/ipv6/sit.c:1936
 ops_init+0x313/0x430 net/core/net_namespace.c:140
 setup_net+0x35b/0x9d0 net/core/net_namespace.c:326
 copy_net_ns+0x359/0x5c0 net/core/net_namespace.c:470
 create_new_namespaces+0x4ce/0xa00 kernel/nsproxy.c:110
 unshare_nsproxy_namespaces+0x11e/0x180 kernel/nsproxy.c:226
 ksys_unshare+0x57d/0xb50 kernel/fork.c:3075
 __do_sys_unshare kernel/fork.c:3146 [inline]
 __se_sys_unshare kernel/fork.c:3144 [inline]
 __x64_sys_unshare+0x34/0x40 kernel/fork.c:3144
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f66c882ce99
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f66c71a2168 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00007f66c893ff60 RCX: 00007f66c882ce99
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000048040200
RBP: 00007f66c8886ff1 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff6634832f R14: 00007f66c71a2300 R15: 0000000000022000
 </TASK>

Fixes: cf124db5 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20211216111741.1387540-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
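
The ownership rule described in the sit commit above can be sketched in user space. This is only an illustrative model, not kernel code: `toy_dev`, `toy_register()` and the two `init_*` callers are hypothetical names, and the integer `cache_refs` merely stands in for the reference count that underflowed in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with a destructor callback. */
struct toy_dev {
    int cache_refs;                               /* models a reference count */
    void (*priv_destructor)(struct toy_dev *dev);
};

/* Not idempotent, like most destructors: each call drops one reference. */
static void toy_dev_free(struct toy_dev *dev)
{
    assert(dev != NULL);
    dev->cache_refs--;
}

/* Models a register call that runs the destructor itself on failure. */
static int toy_register(struct toy_dev *dev, bool fail)
{
    if (fail) {
        if (dev->priv_destructor)
            dev->priv_destructor(dev);
        return -1;
    }
    return 0;
}

/* Buggy caller: frees again on the error path, so the count underflows. */
static int init_buggy(struct toy_dev *dev, bool fail)
{
    int err = toy_register(dev, fail);

    if (err)
        toy_dev_free(dev);
    return err;
}

/* Fixed caller: relies on the destructor call made by toy_register(). */
static int init_fixed(struct toy_dev *dev, bool fail)
{
    return toy_register(dev, fail);
}
```

With a device that starts at one reference, a failed `init_buggy()` leaves the count at -1, while a failed `init_fixed()` leaves it at 0.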

net: systemport: Add global locking for descriptor lifecycle · 8b8e6e78

Committed by Florian Fainelli on Dec 15, 2021

The descriptor list is a shared resource across all of the transmit queues, and
the locking mechanism used today only protects concurrency across a given
transmit queue between the transmit and reclaiming. This creates an opportunity
for the SYSTEMPORT hardware to work on corrupted descriptors if we have
multiple producers at once which is the case when using multiple transmit
queues.

This was particularly noticeable when using multiple flows/transmit queues and
it showed up in interesting ways in that UDP packets would get a correct UDP
header checksum being calculated over an incorrect packet length. Similarly TCP
packets would get an equally correct checksum computed by the hardware over an
incorrect packet length.

The SYSTEMPORT hardware maintains an internal descriptor list that it re-arranges
whenever the driver produces a new descriptor by writing to the
WRITE_PORT_{HI,LO} registers. There is, however, some delay in the hardware to
re-organize its descriptors, and it is possible that concurrent TX queues
eventually break this internal allocation scheme to the point where the
length/status part of the descriptor gets used for an incorrect data buffer.

The fix is to impose a global serialization for all TX queues in the short
section where we are writing to the WRITE_PORT_{HI,LO} registers which solves
the corruption even with multiple concurrent TX queues being used.

Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20211215202450.4086240-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
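
The idea of serializing the paired register writes across all TX queues can be sketched as follows. This is a user-space model under stated assumptions, not the driver's code: `toy_port` and `toy_write_desc()` are hypothetical, and a C11 `atomic_flag` spin lock stands in for whatever kernel lock the real fix uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a device that latches a descriptor as a HI/LO pair. */
struct toy_port {
    atomic_flag desc_lock;   /* one lock shared by every TX queue */
    uint32_t hi, lo;         /* last values written to the two registers */
    unsigned int pairs;      /* number of complete HI/LO pairs written */
};

/* Write one descriptor; the lock keeps the HI and LO writes adjacent. */
static void toy_write_desc(struct toy_port *p, uint32_t hi, uint32_t lo)
{
    assert(p != NULL);
    while (atomic_flag_test_and_set_explicit(&p->desc_lock,
                                             memory_order_acquire))
        ;   /* spin: the critical section is only a few stores long */
    p->hi = hi;
    p->lo = lo;
    p->pairs++;
    atomic_flag_clear_explicit(&p->desc_lock, memory_order_release);
}
```

The lock is deliberately global rather than per queue: the resource being protected is the single descriptor list, so every producer has to take the same lock. A `toy_port` should be initialized with `ATOMIC_FLAG_INIT` for `desc_lock`.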

net/smc: Prevent smc_release() from long blocking · 5c15b312

Committed by D. Wythe on Dec 15, 2021

In the nginx/wrk benchmark, a hang occurs with high probability in the
following case (the client takes several minutes to exit):

server: smc_run nginx

client: smc_run wrk -c 10000 -t 1 http://server

Client hangs with the following backtrace:

0 [ffffa7ce80f3bbf8] __schedule at ffffffff9f9e0d5f
1 [ffffa7ce80f3bc88] schedule at ffffffff9f9e10e6
2 [ffffa7ce80f3bca0] schedule_timeout at ffffffff9f9e3f3c
3 [ffffa7ce80f3bd20] wait_for_common at ffffffff9f9e19de
4 [ffffa7ce80f3bd80] __flush_work at ffffffff9f0fe013
5 [ffffa7ce80f3bdf0] smc_release at ffffffffc0697d24 [smc]
6 [ffffa7ce80f3be20] __sock_release at ffffffff9f802e2d
7 [ffffa7ce80f3be40] sock_close at ffffffff9f802eb1
8 [ffffa7ce80f3be48] __fput at ffffffff9f334f93
9 [ffffa7ce80f3be78] task_work_run at ffffffff9f101ff5
10 [ffffa7ce80f3bea0] do_exit at ffffffff9f0e5012
11 [ffffa7ce80f3bf10] do_group_exit at ffffffff9f0e592a
12 [ffffa7ce80f3bf38] __x64_sys_exit_group at ffffffff9f0e5994
13 [ffffa7ce80f3bf40] do_syscall_64 at ffffffff9f9d4373
14 [ffffa7ce80f3bf50] entry_SYSCALL_64_after_hwframe at ffffffff9fa0007c

This issue is due to flush_work(), which smc_release() uses to wait for
smc_connect_work() to finish. When many smc_connect_work() items are
pending, or all workers are busy executing them, smc_release() has to
block until a worker becomes free, which is equivalent to waiting for
another smc_connect_work() to finish.

To fix this, there are two changes:

1. For idle smc_connect_work(), cancel it from the workqueue; for
   executing smc_connect_work(), wait for it to finish. For that
   purpose, replace flush_work() with cancel_work_sync().

2. Since smc_connect() holds a reference for passive closing, release
   the reference if smc_connect_work() has been cancelled.

Fixes: 24ac3a08 ("net/smc: rebuild nonblocking connect")
Reported-by: Tony Lu <tonylu@linux.alibaba.com>
Tested-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/1639571361-101128-1-git-send-email-alibuda@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

16 Dec, 2021 15 commits

net: Fix double 0x prefix print in SKB dump · 8a03ef67

Committed by Gal Pressman on Dec 16, 2021

When printing netdev features, %pNF already takes care of the 0x prefix,
so remove the explicit one.

Fixes: 6413139d ("skbuff: increase verbosity when dumping skb data")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
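
The doubled prefix can be reproduced in user space. `%pNF` exists only in the kernel's printk, so this sketch uses the standard `%#llx` conversion as a stand-in, since it also emits its own `0x` for non-zero values; `fmt_features()` is a hypothetical helper written for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a feature mask; "%#llx" already emits the "0x" prefix itself. */
static void fmt_features(char *buf, size_t len, unsigned long long mask,
                         int explicit_prefix)
{
    assert(buf != NULL && len > 0);
    if (explicit_prefix)
        snprintf(buf, len, "features=0x%#llx", mask);   /* doubled prefix */
    else
        snprintf(buf, len, "features=%#llx", mask);     /* single prefix */
}
```

For a mask of `0x1f`, the first form produces `features=0x0x1f` and the second produces `features=0x1f`.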

virtio_net: fix rx_drops stat for small pkts · 053c9e18

Committed by Wenliang Wang on Dec 16, 2021

We found that the rx drops stat for small packets does not increment when
build_skb fails, which is inconsistent with the rx drops stat of the other
modes.

Signed-off-by: Wenliang Wang <wangwenliang.1995@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dsa: mv88e6xxx: fix debug print for SPEED_UNFORCED · e08cdf63

Committed by Andrey Eremeev on Dec 15, 2021

The debug print uses an invalid check to detect whether the speed is
unforced: (speed != SPEED_UNFORCED) should be used instead of (!speed).

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Andrey Eremeev <Axtone4all@yandex.ru>
Fixes: 96a2b40c ("net: dsa: mv88e6xxx: add port's MAC speed setter")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
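
A small sketch shows why the sentinel comparison matters. The value `-2` for `SPEED_UNFORCED` and the helper `describe_speed()` are assumptions made for this example; the point is only that `!speed` is true for 0 alone, so a non-zero sentinel is never recognized by it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SPEED_UNFORCED (-2)   /* assumed sentinel value for this sketch */

/* Describe a port speed setting, distinguishing the "unforced" sentinel. */
static void describe_speed(char *buf, size_t len, int speed)
{
    assert(buf != NULL && len > 0);
    if (speed != SPEED_UNFORCED)   /* a plain (!speed) test misses the sentinel */
        snprintf(buf, len, "Speed set to %d Mbps", speed);
    else
        snprintf(buf, len, "Speed unforced");
}
```

With a `(!speed)` test instead, the unforced case would be reported as a speed of -2 Mbps.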

sfc_ef100: potential dereference of null pointer · 407ecd1b

Committed by Jiasheng Jiang on Dec 15, 2021

The return value of kmalloc() needs to be checked, to avoid using a NULL
pointer in efx_nic_update_stats() if the allocation fails.

Fixes: b593b6f1 ("sfc_ef100: statistics gathering")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: dwmac-rk: fix oob read in rk_gmac_setup · 0546b224

Committed by John Keeping on Dec 14, 2021

KASAN reports an out-of-bounds read in rk_gmac_setup on the line:

	while (ops->regs[i]) {

This happens for most platforms since the regs flexible array member is
empty, so the memory after the ops structure is being read here.  It
seems that mostly this happens to contain zero anyway, so we get lucky
and everything still works.

To avoid adding redundant data to nearly all the ops structures, add a
new flag to indicate whether the regs field is valid and avoid this loop
when it is not.

Fixes: 3bb3d6b1 ("net: stmmac: Add RK3566/RK3568 SoC support")
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
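
The "validity flag in front of a flexible array member" approach described above can be sketched like this. The structure and helper names (`toy_ops`, `toy_ops_new()`, `toy_count_regs()`) are hypothetical and the register values are arbitrary; the sketch only shows that the loop is skipped when no array was provided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical ops table: regs[] is only meaningful when regs_valid is set. */
struct toy_ops {
    bool regs_valid;
    uint32_t regs[];      /* zero-terminated when present */
};

/* Build a table with n register entries followed by a zero terminator. */
static struct toy_ops *toy_ops_new(const uint32_t *regs, size_t n)
{
    struct toy_ops *ops = malloc(sizeof(*ops) + (n + 1) * sizeof(ops->regs[0]));

    if (!ops)
        return NULL;
    ops->regs_valid = n > 0;
    for (size_t i = 0; i < n; i++)
        ops->regs[i] = regs[i];
    ops->regs[n] = 0;
    return ops;
}

/* Count register entries without reading past a table that has none. */
static size_t toy_count_regs(const struct toy_ops *ops)
{
    size_t n = 0;

    assert(ops != NULL);
    if (!ops->regs_valid)
        return 0;         /* never touch regs[] for tables without it */
    while (ops->regs[n])
        n++;
    return n;
}
```

A table declared without any array storage, such as `struct toy_ops none = { false };`, is then safe to pass to `toy_count_regs()`.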

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 6209dd77

Committed by David S. Miller on Dec 16, 2021

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-15

This series contains updates to igb, igbvf, igc and ixgbe drivers.

Karen moves checks for invalid VF MAC filters to occur earlier for
igb.

Letu Ren fixes a double free issue in igbvf probe.

Sasha fixes incorrect min value being used when calculating for max for
igc.

Robert Schlabbach adds documentation on enabling NBASE-T support for
ixgbe.

Cyril Novikov adds missing initialization of MDIO bus speed for ixgbe.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: usb: lan78xx: add Allied Telesis AT29M2-AF · ef8a0f6e

Committed by Greg Jesionowski on Dec 14, 2021

This adds the vendor and product IDs for the AT29M2-AF which is a
lan7801-based device.

Signed-off-by: Greg Jesionowski <jesionowskigreg@gmail.com>
Link: https://lore.kernel.org/r/20211214221027.305784-1-jesionowskigreg@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/packet: rx_owner_map depends on pg_vec · ec6af094

Committed by Willem de Bruijn on Dec 15, 2021

Packet sockets may switch ring versions. Avoid misinterpreting state
between versions, whose fields share a union. rx_owner_map is only
allocated with a packet ring (pg_vec) and both are swapped together.
If pg_vec is NULL, meaning no packet ring was allocated, then neither
was rx_owner_map. And the field may be old state from a tpacket_v3.

Fixes: 61fad681 ("net/packet: tpacket_rcv: avoid a producer race condition")
Reported-by: Syzbot <syzbot+1ac0994a0a0c55151121@syzkaller.appspotmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211215143937.106178-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

netdevsim: Zero-initialize memory for new map's value in function nsim_bpf_map_alloc · 48122177

Committed by Haimin Zhang on Dec 15, 2021

Zero-initialize memory for new map's value in function nsim_bpf_map_alloc
since it may cause a potential kernel information leak issue, as follows:
1. nsim_bpf_map_alloc calls nsim_map_alloc_elem to allocate elements for
a new map.
2. nsim_map_alloc_elem uses kmalloc to allocate map's value, but doesn't
zero it.
3. A user application can use IOCTL BPF_MAP_LOOKUP_ELEM to get specific
element's information in the map.
4. The kernel function map_lookup_elem will call bpf_map_copy_value to get
the information allocated at step-2, then use copy_to_user to copy to the
user buffer.
This can only leak information for an array map.

Fixes: 395cacb5 ("netdevsim: bpf: support fake map offload")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Haimin Zhang <tcs.kernel@gmail.com>
Link: https://lore.kernel.org/r/20211215111530.72103-1-tcs.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
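
The leak pattern in the numbered steps above comes down to handing out memory that was never initialized. A minimal user-space sketch, assuming a hypothetical `toy_elem` type, uses `calloc()` as the analogue of a zeroing kernel allocation so that a later copy-out can only ever expose zeros.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical map element; value is later copied out to the caller as-is. */
struct toy_elem {
    unsigned char *value;
    size_t value_size;
};

/* Allocate an element whose value starts out zeroed, not uninitialized. */
static int toy_elem_alloc(struct toy_elem *e, size_t value_size)
{
    assert(e != NULL);
    e->value = calloc(1, value_size);   /* calloc zero-fills, unlike malloc */
    if (!e->value)
        return -1;
    e->value_size = value_size;
    return 0;
}

/* Release the value buffer and reset the element. */
static void toy_elem_free(struct toy_elem *e)
{
    free(e->value);
    e->value = NULL;
    e->value_size = 0;
}
```

Zeroing at allocation time is the simplest place to close the hole, because every later reader benefits without needing its own check.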

dpaa2-eth: fix ethtool statistics · 972ce7e3

Committed by Ioana Ciornei on Dec 15, 2021

Unfortunately, the blamed commit also introduced a side effect in the
ethtool stats shown. Because I added two more fields to the per-channel
structure without verifying whether its size is used in any way, part of
the ethtool statistics were off by 2.
Fix this by relying on a fixed value kept in a macro instead of looking
up the size of the structure.

Fixes: fc398bec ("net: dpaa2: add adaptive interrupt coalescing")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20211215105831.290070-1-ioana.ciornei@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
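
The off-by-2 effect described above is easy to reproduce in isolation. In this sketch the structure layout, field names and the macro `TOY_CH_STATS_NUM` are all hypothetical; it only illustrates how a count derived from `sizeof` silently grows when unrelated fields are appended, while a macro stays fixed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CH_STATS_NUM 3   /* number of counters actually reported */

/* Hypothetical per-channel structure: counters first, then unrelated state. */
struct toy_ch_stats {
    uint64_t dequeue;
    uint64_t frames;
    uint64_t cdan;
    uint64_t extra_state_a;   /* added later; not a reported statistic */
    uint64_t extra_state_b;   /* added later; not a reported statistic */
};

/* Fragile: grows whenever any field is appended to the structure. */
static size_t toy_stats_count_sizeof(void)
{
    return sizeof(struct toy_ch_stats) / sizeof(uint64_t);
}

/* Robust: fixed by the macro, independent of later additions. */
static size_t toy_stats_count_macro(void)
{
    return TOY_CH_STATS_NUM;
}
```

Here the `sizeof`-based count is 5 while the macro-based count is 3, which is exactly a discrepancy of two entries.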

ixgbe: set X550 MDIO speed before talking to PHY · bf0a3750

Committed by Cyril Novikov on Nov 01, 2021

The MDIO bus speed must be initialized before talking to the PHY the first
time in order to avoid talking to it using a speed that the PHY doesn't
support.

This fixes HW initialization error -17 (IXGBE_ERR_PHY_ADDR_INVALID) on
Denverton CPUs (a.k.a. the Atom C3000 family) on ports with a 10Gb network
plugged in. On those devices, HLREG0[MDCSPD] resets to 1, which combined
with the 10Gb network results in a 24MHz MDIO speed, which is apparently
too fast for the connected PHY. PHY register reads over MDIO bus return
garbage, leading to initialization failure.

Reproduced with Linux kernel 4.19 and 5.15-rc7. Can be reproduced using
the following setup:

* Use an Atom C3000 family system with at least one X552 LAN on the SoC
* Disable PXE or other BIOS network initialization if possible
  (the interface must not be initialized before Linux boots)
* Connect a live 10Gb Ethernet cable to an X550 port
* Power cycle (not reset, doesn't always work) the system and boot Linux
* Observe: ixgbe interfaces w/ 10GbE cables plugged in fail with error -17

Fixes: e84db727 ("ixgbe: Introduce function to control MDIO speed")
Signed-off-by: Cyril Novikov <cnovikov@lynx.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ixgbe: Document how to enable NBASE-T support · 271225fd

Committed by Robert Schlabbach on Oct 26, 2021

Commit a296d665 ("ixgbe: Add ethtool support to enable 2.5 and 5.0
Gbps support") introduced suppression of the advertisement of NBASE-T
speeds by default, according to Todd Fujinaka to accommodate customers
with network switches which could not cope with advertised NBASE-T
speeds, as posted in the E1000-devel mailing list:

https://sourceforge.net/p/e1000/mailman/message/37106269/

However, the suppression was not documented at all, nor was how to
enable NBASE-T support.

Properly document the NBASE-T suppression and how to enable NBASE-T
support.

Fixes: a296d665 ("ixgbe: Add ethtool support to enable 2.5 and 5.0 Gbps support")
Reported-by: Robert Schlabbach <robert_s@gmx.net>
Signed-off-by: Robert Schlabbach <robert_s@gmx.net>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igc: Fix typo in i225 LTR functions · 0182d1f3

Committed by Sasha Neftin on Nov 02, 2021

The LTR maximum value was incorrectly written using the scale from
the LTR minimum value. This would cause incorrect values to be sent
in cases where the initial calculation led to different min/max scales.

Fixes: 707abf06 ("igc: Add initial LTR support")
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igbvf: fix double free in `igbvf_probe` · b6d335a6

Committed by Letu Ren on Nov 13, 2021

In `igbvf_probe`, if register_netdev() fails, the program will go to
label err_hw_init, and then to label err_ioremap. In free_netdev() which
is just below label err_ioremap, there is `list_for_each_entry_safe` and
`netif_napi_del` which aims to delete all entries in `dev->napi_list`.
The program has added an entry `adapter->rx_ring->napi` which is added by
`netif_napi_add` in igbvf_alloc_queues(). However, adapter->rx_ring has
been freed below label err_hw_init. So this is a UAF.

In terms of how to patch the problem, we can refer to igbvf_remove() and
delete the entry before `adapter->rx_ring`.

The KASAN logs are as follows:

[   35.126075] BUG: KASAN: use-after-free in free_netdev+0x1fd/0x450
[   35.127170] Read of size 8 at addr ffff88810126d990 by task modprobe/366
[   35.128360]
[   35.128643] CPU: 1 PID: 366 Comm: modprobe Not tainted 5.15.0-rc2+ #14
[   35.129789] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[   35.131749] Call Trace:
[   35.132199]  dump_stack_lvl+0x59/0x7b
[   35.132865]  print_address_description+0x7c/0x3b0
[   35.133707]  ? free_netdev+0x1fd/0x450
[   35.134378]  __kasan_report+0x160/0x1c0
[   35.135063]  ? free_netdev+0x1fd/0x450
[   35.135738]  kasan_report+0x4b/0x70
[   35.136367]  free_netdev+0x1fd/0x450
[   35.137006]  igbvf_probe+0x121d/0x1a10 [igbvf]
[   35.137808]  ? igbvf_vlan_rx_add_vid+0x100/0x100 [igbvf]
[   35.138751]  local_pci_probe+0x13c/0x1f0
[   35.139461]  pci_device_probe+0x37e/0x6c0
[   35.165526]
[   35.165806] Allocated by task 366:
[   35.166414]  ____kasan_kmalloc+0xc4/0xf0
[   35.167117]  foo_kmem_cache_alloc_trace+0x3c/0x50 [igbvf]
[   35.168078]  igbvf_probe+0x9c5/0x1a10 [igbvf]
[   35.168866]  local_pci_probe+0x13c/0x1f0
[   35.169565]  pci_device_probe+0x37e/0x6c0
[   35.179713]
[   35.179993] Freed by task 366:
[   35.180539]  kasan_set_track+0x4c/0x80
[   35.181211]  kasan_set_free_info+0x1f/0x40
[   35.181942]  ____kasan_slab_free+0x103/0x140
[   35.182703]  kfree+0xe3/0x250
[   35.183239]  igbvf_probe+0x1173/0x1a10 [igbvf]
[   35.184040]  local_pci_probe+0x13c/0x1f0

Fixes: d4e0fe01 (igbvf: add new driver to support 82576 virtual functions)
Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Letu Ren <fantasquex@gmail.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igb: Fix removal of unicast MAC filters of VFs · 584af821

Committed by Karen Sornek on Aug 31, 2021

Move checking condition of VF MAC filter before clearing
or adding MAC filter to VF to prevent potential blackout caused
by removal of necessary and working VF's MAC filter.

Fixes: 1b8b062a ("igb: add VF trust infrastructure")
Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

15 Dec, 2021 9 commits

Merge tag 'wireless-drivers-2021-12-15' of... · 1d1c950f

Committed by David S. Miller on Dec 15, 2021

Merge tag 'wireless-drivers-2021-12-15' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for v5.16

Second set of fixes for v5.16, hopefully also the last one. I changed
my email in MAINTAINERS, one crash fix in iwlwifi and some build
problems fixed.

iwlwifi

* fix crash caused by a warning

* fix LED linking problem

brcmsmac

* rework LED dependencies for being consistent with other drivers

mt76

* mt7921: fix build regression
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 7c8089f9

Committed by David S. Miller on Dec 15, 2021

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-14

This series contains updates to ice driver only.

Karol corrects division that was causing incorrect calculations and
adds a check to ensure stale timestamps are not being used.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mptcp-fixes-for-ulp-a-deadlock-and-netlink-docs' · 500f3720

Committed by Jakub Kicinski on Dec 14, 2021

Mat Martineau says:

====================
mptcp: Fixes for ULP, a deadlock, and netlink docs

Two of the MPTCP fixes in this set are related to the TCP_ULP socket
option with MPTCP sockets operating in "fallback" mode (the connection
has reverted to regular TCP). The other issues are an observed deadlock
and missing parameter documentation in the MPTCP netlink API.

Patch 1 marks TCP_ULP as unsupported earlier in MPTCP setsockopt code,
so the fallback code path in the MPTCP layer does not pass the TCP_ULP
option down to the subflow TCP socket.

Patch 2 makes sure a TCP fallback socket returned to userspace by
accept()ing on a MPTCP listening socket does not allow use of the
"mptcp" TCP_ULP type. That ULP is intended only for use by in-kernel
MPTCP subflows.

Patch 3 fixes the possible deadlock when sending data and there are
socket option changes to sync to the subflows.

Patch 4 makes sure all MPTCP netlink event parameters are documented
in the MPTCP uapi header.
====================

Link: https://lore.kernel.org/r/20211214231604.211016-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: add missing documented NL params · 6813b192

Committed by Matthieu Baerts on Dec 14, 2021

'loc_id' and 'rem_id' are set in all events linked to subflows but those
were missing in the events description in the comments.

Fixes: b911c97c ("mptcp: add netlink event support")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: fix deadlock in __mptcp_push_pending() · 3d79e375

Committed by Maxim Galaganov on Dec 14, 2021

__mptcp_push_pending() may call mptcp_flush_join_list() with subflow
socket lock held. If such call hits mptcp_sockopt_sync_all() then
subsequently __mptcp_sockopt_sync() could try to lock the subflow
socket for itself, causing a deadlock.

sysrq: Show Blocked State
task:ss-server       state:D stack:    0 pid:  938 ppid:     1 flags:0x00000000
Call Trace:
 <TASK>
 __schedule+0x2d6/0x10c0
 ? __mod_memcg_state+0x4d/0x70
 ? csum_partial+0xd/0x20
 ? _raw_spin_lock_irqsave+0x26/0x50
 schedule+0x4e/0xc0
 __lock_sock+0x69/0x90
 ? do_wait_intr_irq+0xa0/0xa0
 __lock_sock_fast+0x35/0x50
 mptcp_sockopt_sync_all+0x38/0xc0
 __mptcp_push_pending+0x105/0x200
 mptcp_sendmsg+0x466/0x490
 sock_sendmsg+0x57/0x60
 __sys_sendto+0xf0/0x160
 ? do_wait_intr_irq+0xa0/0xa0
 ? fpregs_restore_userregs+0x12/0xd0
 __x64_sys_sendto+0x20/0x30
 do_syscall_64+0x38/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f9ba546c2d0
RSP: 002b:00007ffdc3b762d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f9ba56c8060 RCX: 00007f9ba546c2d0
RDX: 000000000000077a RSI: 0000000000e5e180 RDI: 0000000000000234
RBP: 0000000000cc57f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9ba56c8060
R13: 0000000000b6ba60 R14: 0000000000cc7840 R15: 41d8685b1d7901b8
 </TASK>

Fix the issue by using __mptcp_flush_join_list() instead of plain
mptcp_flush_join_list() inside __mptcp_push_pending(), as suggested by
Florian. The sockopt sync will be deferred to the workqueue.

Fixes: 1b3e7ede ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/244
Suggested-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Maxim Galaganov <max@internet.ru>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: clear 'kern' flag from fallback sockets · d6692b3b

Committed by Florian Westphal on Dec 14, 2021

The mptcp ULP extension relies on sk->sk_sock_kern being set correctly:
It prevents setsockopt(fd, IPPROTO_TCP, TCP_ULP, "mptcp", 6); from
working for plain tcp sockets (any userspace-exposed socket).

But in case of fallback, accept() can return a plain tcp sk.
In such case, sk is still tagged as 'kernel' and setsockopt will work.

This will crash the kernel, because the subflow extension has a NULL
ctx->conn mptcp socket:

BUG: KASAN: null-ptr-deref in subflow_data_ready+0x181/0x2b0
Call Trace:
 tcp_data_ready+0xf8/0x370
 [..]

Fixes: cf7da0d6 ("mptcp: Create SUBFLOW socket for incoming connections")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

mptcp: remove tcp ulp setsockopt support · 404cd9a2

Committed by Florian Westphal on Dec 14, 2021

TCP_ULP setsockopt cannot be used for mptcp because it is already
used internally to plumb subflow (tcp) sockets to the mptcp layer.

syzbot managed to trigger a crash for mptcp connections that are
in fallback mode:

KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
CPU: 1 PID: 1083 Comm: syz-executor.3 Not tainted 5.16.0-rc2-syzkaller #0
RIP: 0010:tls_build_proto net/tls/tls_main.c:776 [inline]
[..]
 __tcp_set_ulp net/ipv4/tcp_ulp.c:139 [inline]
 tcp_set_ulp+0x428/0x4c0 net/ipv4/tcp_ulp.c:160
 do_tcp_setsockopt+0x455/0x37c0 net/ipv4/tcp.c:3391
 mptcp_setsockopt+0x1b47/0x2400 net/mptcp/sockopt.c:638

Remove support for TCP_ULP setsockopt.

Fixes: d9e4c129 ("mptcp: only admit explicitly supported sockopt")
Reported-by: syzbot+1fd9b69cde42967d1add@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ice: Don't put stale timestamps in the skb · 37e738b6

Committed by Karol Kolacinski on Nov 16, 2021

The driver has to make sure it does not accidentally put a stale timestamp
in the SKB before the previous timestamp gets overwritten.
Timestamp values in the PHY are read only and do not get cleared except
at hardware reset or when a new timestamp value is captured.
The cached_tstamp field is used to detect the case where a new timestamp
has not yet been captured, ensuring that we avoid sending stale
timestamp data to the stack.

Fixes: ea9b847c ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
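
The stale-timestamp check can be sketched as a compare-against-cache step. Only the field name `cached_tstamp` comes from the commit message; the structure `toy_tx_slot` and the helper `toy_tstamp_is_fresh()` are hypothetical, and this is a simplified model rather than the driver's logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-slot state: the last raw value already handed up. */
struct toy_tx_slot {
    uint64_t cached_tstamp;
};

/*
 * Decide whether a raw timestamp read from hardware is new. The register
 * keeps its old value until a new capture happens, so an unchanged value
 * means "not captured yet" and must not be reported.
 */
static bool toy_tstamp_is_fresh(struct toy_tx_slot *slot, uint64_t raw)
{
    assert(slot != NULL);
    if (raw == slot->cached_tstamp)
        return false;            /* stale: same value as last time */
    slot->cached_tstamp = raw;   /* remember it for the next comparison */
    return true;
}
```

A caller would attach the timestamp to the packet only when the helper returns true, and retry later otherwise.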

ice: Use div64_u64 instead of div_u64 in adjfine · 0013881c

Committed by Karol Kolacinski on Nov 04, 2021

Change the division in ice_ptp_adjfine from div_u64 to div64_u64.
div_u64 is used when the divisor is 32 bit but in this case incval is
64 bit and it caused incorrect calculations and incval adjustments.

Fixes: 06c16d89 ("ice: register 1588 PTP clock device object for E810 devices")
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
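
The effect of passing a 64-bit divisor to a helper that takes a 32-bit one can be shown with two user-space models. `toy_div_u64()` and `toy_div64_u64()` are hypothetical stand-ins that only mirror the parameter widths described above; they are not the kernel helpers themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model of a helper that takes a 32-bit divisor. */
static uint64_t toy_div_u64(uint64_t dividend, uint32_t divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}

/* Userspace model of a helper that takes a full 64-bit divisor. */
static uint64_t toy_div64_u64(uint64_t dividend, uint64_t divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}
```

For a divisor of `(1ULL << 32) + 2`, conversion to 32 bits keeps only the low word, 2. Dividing `1ULL << 40` by it therefore yields `1ULL << 39` through the 32-bit model, while the 64-bit model yields 255.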

14 Dec, 2021 13 commits

Merge branch 'mlxsw-fixes' · 3dd7d40b

Committed by David S. Miller on Dec 14, 2021

Ido Schimmel says:

====================
mlxsw: MAC profiles occupancy fix

Patch #1 fixes a router interface (RIF) MAC profiles occupancy bug that
was merged in the last cycle.

Patch #2 adds a selftest that fails without the fix.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: mlxsw: Add a test case for MAC profiles consolidation · 20617717

Committed by Danielle Ratson on Dec 14, 2021

Add a test case to cover the bug fixed by the previous patch.

Edit the MAC address of one netdev so that it matches the MAC address of
the second netdev. Verify that the two MAC profiles were consolidated by
testing that the MAC profiles occupancy decreased by one.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Consolidate MAC profiles when possible · b442f2ea

Committed by Danielle Ratson on Dec 14, 2021

Currently, when setting a router interface (RIF) MAC address while the
MAC profile is not shared with other RIFs, the profile is edited so that
the new MAC address is assigned to it.

This does not take into account a situation in which the new MAC address
already matches an existing MAC profile. In that situation, two MAC
profiles will be occupied even though they hold MAC addresses from the
same profile.

In order to prevent that, add a check to ensure that editing a MAC
profile takes place only when the new MAC address does not match an
existing profile.

Fixes: 605d25cd ("mlxsw: spectrum_router: Add RIF MAC profiles support")
Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Tested-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rds: memory leak in __rds_conn_create() · 5f9562eb

Committed by Hangyu Hua on Dec 14, 2021

__rds_conn_create() did not release conn->c_path when loop_trans != 0 and
trans->t_prefer_loopback != 0 and is_outgoing == 0.

Fixes: aced3ce5 ("RDS tcp loopback connection can hang")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Sharath Srinivasan <sharath.srinivasan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mac80211-for-net-2021-12-14' of... · d971650e

Committed by David S. Miller on Dec 14, 2021

Merge tag 'mac80211-for-net-2021-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
A fairly large number of fixes this time:
 * fix a station info memory leak on insert collisions
 * a rate control fix for retransmissions
 * two aggregation setup fixes
 * reload current regdomain when reloading database
 * a locking fix in regulatory work
 * a probe request allocation size fix in mac80211
 * apply TCP vs. aggregation (sk pacing) on mesh
 * fix ordering of channel context update vs. station
   state
 * set up skb->dev for mesh forwarding properly
 * track QoS data frames only for admission control to
   avoid out-of-bounds read (found by syzbot)
 * validate extended element ID vs. existing data to
   avoid out-of-bounds read (found by syzbot)
 * fix locking in mac80211 aggregation TX setup
 * fix traffic stall after HW restart when TXQs are used
 * fix ordering of reconfig/restart after HW restart
 * fix interface type for extended aggregation capability
   lookup
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · a41c4d96

Committed by David S. Miller on Dec 14, 2021

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-12-13

This series contains updates to iavf driver only.

Dan Carpenter fixes some missing mutex unlocking.

Stefan Assmann restores stopping watchdog from overriding to reset state.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

a41c4d96

flow_offload: return EOPNOTSUPP for the unsupported mpls action type · 166b6a46

Committed by Baowen Zheng on Dec 13, 2021

We need to return EOPNOTSUPP for an unsupported mpls action type when
setting up the flow action.

In the original implementation, we returned 0 for an unsupported mpls
action type, although neither it nor the following actions were actually
set up in the flow action entry.

Fixes: 9838b20a ("net: sched: take rtnl lock in tc_setup_flow_action()")
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

166b6a46
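
The behaviour change in the flow_offload fix can be sketched in plain C. This is not the kernel's `tc_setup_flow_action()`; the enum values, `struct flow_entry`, and both functions are hypothetical stand-ins that only show why an unsupported action type should produce `-EOPNOTSUPP` instead of a silent 0.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical action kinds; not the kernel's real MPLS action enums. */
enum mpls_act {
	MPLS_ACT_PUSH,
	MPLS_ACT_POP,
	MPLS_ACT_MANGLE,
	MPLS_ACT_UNKNOWN,
};

/* Hypothetical flow action entry: 0 means "not set up". */
struct flow_entry {
	int id;
};

/* Translate one action; returns 0 on success or -EOPNOTSUPP. */
static int setup_mpls_action(struct flow_entry *entry, enum mpls_act act)
{
	switch (act) {
	case MPLS_ACT_PUSH:
		entry->id = 1;
		return 0;
	case MPLS_ACT_POP:
		entry->id = 2;
		return 0;
	case MPLS_ACT_MANGLE:
		entry->id = 3;
		return 0;
	default:
		/* Report the problem instead of pretending the entry was set up. */
		return -EOPNOTSUPP;
	}
}

/* Set up all actions; stop at the first failure and propagate it. */
static int setup_flow_actions(struct flow_entry *entries,
			      const enum mpls_act *acts, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int err = setup_mpls_action(&entries[i], acts[i]);

		if (err)
			return err;
	}
	return 0;
}
```

With a 0 return in the default case, a caller would believe every entry had been filled in even though the unsupported action and everything after it were skipped.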

net: stmmac: fix tc flower deletion for VLAN priority Rx steering · aeb7c75c

Committed by Ong Boon Leong on Dec 11, 2021

To replicate the issue:-

1) Add 1 flower filter for VLAN Priority based frame steering:-
$ IFDEVNAME=eth0
$ tc qdisc add dev $IFDEVNAME ingress
$ tc qdisc add dev $IFDEVNAME root mqprio num_tc 8 \
   map 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 \
   queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc filter add dev $IFDEVNAME parent ffff: protocol 802.1Q \
   flower vlan_prio 0 hw_tc 0

2) Get the 'pref' id
$ tc filter show dev $IFDEVNAME ingress

3) Delete a specific tc flower record (say pref 49151)
$ tc filter del dev $IFDEVNAME parent ffff: pref 49151

From dmesg, we will observe kernel NULL pointer ooops

[  197.170464] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  197.171367] #PF: supervisor read access in kernel mode
[  197.171367] #PF: error_code(0x0000) - not-present page
[  197.171367] PGD 0 P4D 0
[  197.171367] Oops: 0000 [#1] PREEMPT SMP NOPTI

<snip>

[  197.171367] RIP: 0010:tc_setup_cls+0x20b/0x4a0 [stmmac]

<snip>

[  197.171367] Call Trace:
[  197.171367]  <TASK>
[  197.171367]  ? __stmmac_disable_all_queues+0xa8/0xe0 [stmmac]
[  197.171367]  stmmac_setup_tc_block_cb+0x70/0x110 [stmmac]
[  197.171367]  tc_setup_cb_destroy+0xb3/0x180
[  197.171367]  fl_hw_destroy_filter+0x94/0xc0 [cls_flower]

The above issue is due to previous incorrect implementation of
tc_del_vlan_flow(), shown below, that uses flow_cls_offload_flow_rule()
to get struct flow_rule *rule which is no longer valid for tc filter
delete operation.

  struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
  struct flow_dissector *dissector = rule->match.dissector;

So, to ensure tc_del_vlan_flow() deletes the right VLAN cls record for
earlier configured RX queue (configured by hw_tc) in tc_add_vlan_flow(),
this patch introduces stmmac_rfs_entry as driver-side flow_cls_offload
record for 'RX frame steering' tc flower, currently used for VLAN
priority. The implementation has taken consideration for future extension
to include other type RX frame steering such as EtherType based.

v2:
 - Clean up overly extensive backtrace and rewrite git message to better
   explain the kernel NULL pointer issue.

Fixes: 0e039f5c ("net: stmmac: add RX frame steering based on VLAN priority in tc flower")
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aeb7c75c
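
The idea of keeping a driver-side record, so that a delete does not have to dereference a rule that is no longer valid, can be sketched as follows. This is a simplified user-space illustration, not the layout of the real `stmmac_rfs_entry`; the table size, field names, and the `rfs_*` helpers are all assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>

#define RFS_MAX 8	/* hypothetical table size */

/* Hypothetical driver-side record, keyed by the filter cookie. */
struct rfs_entry {
	unsigned long cookie;
	int in_use;
	int tc;		/* RX queue chosen when the filter was added */
};

static struct rfs_entry rfs_table[RFS_MAX];

/* Look up the record stored for a cookie; NULL if there is none. */
static struct rfs_entry *rfs_find(unsigned long cookie)
{
	size_t i;

	for (i = 0; i < RFS_MAX; i++)
		if (rfs_table[i].in_use && rfs_table[i].cookie == cookie)
			return &rfs_table[i];
	return NULL;
}

/* Remember which queue a filter was steered to at add time. */
static int rfs_add(unsigned long cookie, int tc)
{
	size_t i;

	if (rfs_find(cookie))
		return -EEXIST;

	for (i = 0; i < RFS_MAX; i++) {
		if (!rfs_table[i].in_use) {
			rfs_table[i].cookie = cookie;
			rfs_table[i].tc = tc;
			rfs_table[i].in_use = 1;
			return 0;
		}
	}
	return -ENOSPC;
}

/*
 * Delete by cookie only. The queue to undo comes from the stored
 * record, so no rule pointer is needed on the delete path.
 */
static int rfs_del(unsigned long cookie, int *tc_out)
{
	struct rfs_entry *e = rfs_find(cookie);

	if (!e)
		return -ENOENT;

	*tc_out = e->tc;
	e->in_use = 0;
	return 0;
}
```

The delete path consults only state the driver saved itself, which is why it does not need a `struct flow_rule` on the delete operation.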

mac80211: do drv_reconfig_complete() before restarting all · 13dee10b

Committed by Johannes Berg on Nov 29, 2021

When we reconfigure, the driver might do some things to complete
the reconfiguration. It's strange and could be broken in some
cases because we restart other works (e.g. remain-on-channel and
TX) before this happens, yet only start queues later.

Change this to do the reconfig complete when reconfiguration is
actually complete, not when we've already started doing other
things again.

For iwlwifi, this should fix a race where the reconfig can race
with TX, for ath10k and ath11k that also use this it won't make
a difference because they just start queues there, and mac80211
also stopped the queues and will restart them later as before.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211129152938.cab99f22fe19.Iefe494687f15fd85f77c1b989d1149c8efdfdc36@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

13dee10b

mac80211: mark TX-during-stop for TX in in_reconfig · db7205af

Committed by Johannes Berg on Nov 29, 2021

Mark TXQs as having seen transmit while they were stopped if
we bail out of drv_wake_tx_queue() due to reconfig, so that
the queue wake after this will make them catch up. This is
particularly necessary for when TXQs are used for management
packets since those TXQs won't see a lot of traffic that'd
make them catch up later.

Cc: stable@vger.kernel.org
Fixes: 4856bfd2 ("mac80211: do not call driver wake_tx_queue op during reconfig")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211129152938.4573a221c0e1.I0d1d5daea3089be3fc0dccc92991b0f8c5677f0c@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

db7205af
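
The "mark TX-during-stop" behaviour from the commit above can be modelled with a toy queue. This is only a sketch of the idea, not mac80211 code: `struct txq`, `struct hw`, `wake_tx_queue()`, and `reconfig_done()` are hypothetical names, and "sending" is reduced to moving a counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical TX queue: pending frames, frames sent, and a catch-up mark. */
struct txq {
	int pending;
	int sent;
	bool tx_while_stopped;
};

/* Hypothetical device state. */
struct hw {
	bool in_reconfig;
};

/* Try to push the pending frames of one queue to the "driver". */
static void wake_tx_queue(struct hw *hw, struct txq *q)
{
	if (hw->in_reconfig) {
		/* Bail out, but remember that this queue wanted to transmit. */
		q->tx_while_stopped = true;
		return;
	}
	q->sent += q->pending;
	q->pending = 0;
}

/* After reconfiguration, let every marked queue catch up. */
static void reconfig_done(struct hw *hw, struct txq *queues, size_t n)
{
	size_t i;

	hw->in_reconfig = false;
	for (i = 0; i < n; i++) {
		if (queues[i].tx_while_stopped) {
			queues[i].tx_while_stopped = false;
			wake_tx_queue(hw, &queues[i]);
		}
	}
}
```

Without the mark, a low-traffic queue (such as one carrying only management frames) would have nothing to trigger it again, and its pending frames would sit there after the reconfiguration ends.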

mac80211: update channel context before station state · 4dde3c36

Committed by Mordechay Goodstein on Nov 29, 2021

Currently channel context is updated only after station got an update about
new assoc state, this results in station using the old channel context.

Fix this by moving the update channel context before updating station,
enabling the driver to immediately use the updated channel context in
the new assoc state.

Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211129152938.1c80c17ffd8a.I94ae31378b363c1182cfdca46c4b7e7165cff984@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

4dde3c36

mac80211: Fix the size used for building probe request · f22d9813

Committed by Ilan Peer on Nov 29, 2021

Instead of using the hard-coded value of '100' use the correct
scan IEs length as calculated during HW registration to mac80211.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211129152938.0a82d6891719.I8ded1f2e0bccb9e71222c945666bcd86537f2e35@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

f22d9813
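
As a rough illustration of the probe request sizing fix, the sketch below sizes a buffer from a length computed once at registration time instead of from a fixed constant. It is not the mac80211 implementation: `struct hw_caps`, `PROBE_HDR_LEN`, and `build_probe_req()` are hypothetical, and the real frame layout is considerably more involved.

```c
#include <stdlib.h>
#include <string.h>

#define PROBE_HDR_LEN 24	/* hypothetical fixed header size for this sketch */

/* Hypothetical capabilities recorded once, when the hardware registers. */
struct hw_caps {
	size_t scan_ie_len;	/* maximum room needed for scan IEs */
};

/*
 * Build a frame with room for the IEs. The allocation is derived from
 * caps->scan_ie_len, not from a hard-coded guess, so IEs that fit the
 * registered limit always fit the buffer. The caller frees the result.
 */
static unsigned char *build_probe_req(const struct hw_caps *caps,
				      const unsigned char *ies, size_t ies_len,
				      size_t *out_len)
{
	unsigned char *buf;

	if (ies_len > caps->scan_ie_len)
		return NULL;

	buf = calloc(1, PROBE_HDR_LEN + caps->scan_ie_len);
	if (!buf)
		return NULL;

	memcpy(buf + PROBE_HDR_LEN, ies, ies_len);
	*out_len = PROBE_HDR_LEN + ies_len;
	return buf;
}
```

In this model, 150 bytes of IEs fit when the registered limit is 200 and are rejected when the limit is 100, so the buffer size and the acceptance check always come from the same number.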

mac80211: fix lookup when adding AddBA extension element · 511ab0c1

Committed by Johannes Berg on Nov 29, 2021

We should be doing the HE capabilities lookup based on the full
interface type so if P2P doesn't have HE but client has it doesn't
get confused. Fix that.

Fixes: 2ab45876 ("mac80211: add support for the ADDBA extension element")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20211129152938.010fc1d61137.If3a468145f29d670cb00a693bed559d8290ba693@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

511ab0c1

openeuler / Kernel: synced successfully nearly 2 years ago