Commits · 36268983e90316b37000a005642af42234dabb36 · openeuler / Kernel

Jan 27, 2022 · 5 commits

Revert "ipv6: Honor all IPv6 PIO Valid Lifetime values" · 36268983

Committed by Guillaume Nault on Jan 26, 2022

This reverts commit b75326c2.

This commit breaks Linux compatibility with USGv6 tests. The RFC this
commit was based on is actually an expired draft: no published RFC
currently allows the new behaviour it introduced.

Without full IETF endorsement, the flash renumbering scenario this
patch was supposed to enable is never going to work, as other IPv6
equipment on the same LAN will keep the 2-hour limit.

Fixes: b75326c2 ("ipv6: Honor all IPv6 PIO Valid Lifetime values")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'pid-introduce-helper-task_is_in_root_ns' · c7ec845f

Committed by Jakub Kicinski on Jan 26, 2022

Leo Yan says:

====================
pid: Introduce helper task_is_in_root_ns()

This patch series introduces a helper function task_is_in_init_pid_ns()
to replace open code.  The two patches are extracted from the original
series [1] for network subsystem.

The plan is to land this patch set in kernel 5.18 first. Five patches
from the original series [1] are left out; as a next step, I will
resend them for merging through linux-next as appropriate.

[1] https://lore.kernel.org/lkml/20211208083320.472503-1-leo.yan@linaro.org/
====================

Link: https://lore.kernel.org/r/20220126050427.605628-1-leo.yan@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

connector/cn_proc: Use task_is_in_init_pid_ns() · 42c66d16

Committed by Leo Yan on Jan 26, 2022

This patch replaces open code with task_is_in_init_pid_ns() to check if
a task is in the root PID namespace.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

pid: Introduce helper task_is_in_init_pid_ns() · d7e4f854

Committed by Leo Yan on Jan 26, 2022

Currently the kernel uses open code in multiple places to check if a
task is in the root PID namespace with the kind of format:

  if (task_active_pid_ns(current) == &init_pid_ns)
      do_something();

This patch creates a new helper function, task_is_in_init_pid_ns(), which
returns true if the passed task is in the root PID namespace and false
otherwise. It will be used to replace the open-coded checks.

Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
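
The helper introduced by d7e4f854 can be modelled outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: `struct task` and `struct pid_namespace` are simplified stand-ins invented for illustration, and only the helper name and the comparison against `init_pid_ns` come from the commit message.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel types (assumptions for illustration). */
struct pid_namespace {
	unsigned int level;
};

struct task {
	struct pid_namespace *active_ns;
};

/* Models the kernel's global root PID namespace. */
static struct pid_namespace init_pid_ns = { .level = 0 };

/* Models task_active_pid_ns(): returns the namespace the task lives in. */
static struct pid_namespace *task_active_pid_ns(const struct task *tsk)
{
	return tsk->active_ns;
}

/* The helper: true only when the task is in the root PID namespace. */
static bool task_is_in_init_pid_ns(const struct task *tsk)
{
	return task_active_pid_ns(tsk) == &init_pid_ns;
}
```

A caller then writes `if (task_is_in_init_pid_ns(current))` in place of the open-coded pointer comparison, which keeps the namespace check in one place.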

gve: Fix GFP flags when allocing pages · a92f7a6f

Committed by Catherine Sullivan on Jan 25, 2022

Use GFP_ATOMIC when allocating pages in the hotpath,
continue to use GFP_KERNEL when allocating pages during setup.

GFP_KERNEL allows blocking, which lets the allocation succeed
more often in a low-memory environment, but in the hotpath we do
not want to allow the allocation to block.

Fixes: f5cedc84 ("gve: Add transmit and receive support")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Link: https://lore.kernel.org/r/20220126003843.3584521-1-awogbemila@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Jan 26, 2022 · 12 commits

Merge branch 'lan966x-fixes' · 2f651328

Committed by David S. Miller on Jan 26, 2022

Horatiu Vultur says:

====================
net: lan966x: Fixes for sleep in atomic context

This patch series contains two fixes for lan966x sleeping in atomic
context. The first patch fixes the injection of frames, while the second
one fixes the updating of the MAC table.

v1->v2:
 - correct the Fixes tag in the second patch; it was using the wrong SHA.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: lan966x: Fix sleep in atomic context when updating MAC table · 77bdaf39

Committed by Horatiu Vultur on Jan 25, 2022

The function lan966x_mac_wait_for_completion is used to poll the status
of the MAC table using the function readx_poll_timeout. The problem with
this function is that it is also called from atomic context. Therefore
update the function to use readx_poll_timeout_atomic.

Fixes: e18aba89 ("net: lan966x: add mactable support")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: lan966x: Fix sleep in atomic context when injecting frames · b6ab1496

Committed by Horatiu Vultur on Jan 25, 2022

On lan966x, when injecting a frame it was polling the register
QS_INJ_STATUS to see if it can continue with the injection of the frame.
The problem was that it was using readx_poll_timeout which could sleep
in atomic context.
This patch fixes this issue by using readx_poll_timeout_atomic.

Fixes: d28d6d2e ("net: lan966x: add port module support")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
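
Both lan966x fixes replace a polling loop that may sleep with one that only busy-waits. The sketch below is a userspace illustration of that shape, under stated assumptions: `struct fake_reg`, `fake_reg_ready()` and `poll_ready_atomic()` are hypothetical names, and the loop is bounded by a poll count rather than by the microsecond timeout that the kernel macros take.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical register model: reports "ready" after a number of reads. */
struct fake_reg {
	int reads_until_ready;
};

static bool fake_reg_ready(struct fake_reg *reg)
{
	if (reg->reads_until_ready > 0) {
		reg->reads_until_ready--;
		return false;
	}
	return true;
}

/*
 * Busy-wait poll: re-reads the register up to max_polls times and never
 * sleeps or yields, which is the property required in atomic context.
 */
static int poll_ready_atomic(struct fake_reg *reg, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (fake_reg_ready(reg))
			return 0;
		/* A real driver would busy-delay here instead of sleeping. */
	}
	return -ETIMEDOUT;
}
```

The trade-off is that the CPU spins for the whole wait, so this form is only appropriate when the expected wait is short.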

Merge branch 'dev_addr-const-fixes' · 8199d0c6

Committed by David S. Miller on Jan 26, 2022

Jakub Kicinski says:

====================
ethernet: fix some esoteric drivers after netdev->dev_addr constification

Looking at recent fixes for drivers which don't get included with
allmodconfig builds I thought it's worth grepping for more instances of:

  dev->dev_addr\[.*\] =

This set contains the fixes.

v2: add last 3 patches which fix drivers for the RiscPC ARM platform.
Thanks to Arnd Bergmann for explaining how to build test that.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet: seeq/ether3: don't write directly to netdev->dev_addr · 8eb86fc2

Committed by Jakub Kicinski on Jan 25, 2022

netdev->dev_addr is const now.

Compile tested rpc_defconfig w/ GCC 8.5.

Fixes: adeef3e3 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet: 8390/etherh: don't write directly to netdev->dev_addr · 5518c524

Committed by Jakub Kicinski on Jan 25, 2022

netdev->dev_addr is const now.

Compile tested rpc_defconfig w/ GCC 8.5.

Fixes: adeef3e3 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet: i825xx: don't write directly to netdev->dev_addr · 98ef22bb

Committed by Jakub Kicinski on Jan 25, 2022

netdev->dev_addr is const now.

Compile tested rpc_defconfig w/ GCC 8.5.

Fixes: adeef3e3 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet: broadcom/sb1250-mac: don't write directly to netdev->dev_addr · 7f6ec2b2

Committed by Jakub Kicinski on Jan 25, 2022

netdev->dev_addr is const now.

Compile tested bigsur_defconfig and sb1250_swarm_defconfig.

Fixes: adeef3e3 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet: tundra: don't write directly to netdev->dev_addr · 14ba66a6

Committed by Jakub Kicinski on Jan 25, 2022

netdev->dev_addr is const now.

Maintain the questionable offsetting in ndo_set_mac_address.

Compile tested holly_defconfig and mpc7448_hpc2_defconfig.

Fixes: adeef3e3 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet: 3com/typhoon: don't write directly to netdev->dev_addr · 007c9512

Committed by Jakub Kicinski on Jan 25, 2022

This driver casts off the const and writes directly to netdev->dev_addr.
This will result in a MAC address tree corruption and a warning.

Compile tested ppc6xx_defconfig.

Fixes: adeef3e3 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

sch_htb: Fail on unsupported parameters when offload is requested · 429c3be8

Committed by Maxim Mikityanskiy on Jan 25, 2022

The current implementation of HTB offload doesn't support some
parameters. Instead of ignoring them, actively return the EINVAL error
when they are set to non-defaults.

As this patch goes to stable, the driver API is not changed here. If
future drivers support more offload parameters, the checks can be moved
to the driver side.

Note that the buffer and cbuffer parameters are also not supported, but
the tc userspace tool assigns some default values derived from rate and
ceil, and identifying these defaults in sch_htb would be unreliable, so
they are still ignored.

Fixes: d03b195b ("sch_htb: Hierarchical QoS hardware offload")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220125100654.424570-1-maximmi@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
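
The approach in 429c3be8 (reject unsupported settings instead of silently ignoring them) can be sketched as a validation function. This is an illustration only: the commit message does not list which parameters are rejected, so the struct and its field names below are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical subset of class options; field names are illustrative. */
struct htb_opts_model {
	unsigned int overhead;
	unsigned int mpu;
	unsigned int quantum;
	unsigned int prio;
};

/*
 * Returns 0 when the options are acceptable, -EINVAL when offload is
 * requested and any option the offload cannot honour is non-default.
 */
static int htb_check_opts(const struct htb_opts_model *opt, bool offload)
{
	if (!offload)
		return 0;	/* the software path accepts everything */

	if (opt->overhead || opt->mpu || opt->quantum || opt->prio)
		return -EINVAL;

	return 0;
}
```

Failing early gives the user an explicit error at configuration time, rather than a qdisc that behaves differently from what was requested.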

amd: declance: use eth_hw_addr_set() · 8bdd2494

Committed by Thomas Bogendoerfer on Jan 25, 2022

Copy the scattered MAC address octets into an array, then call eth_hw_addr_set().

Fixes: adeef3e3 ("net: constify netdev->dev_addr")
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: https://lore.kernel.org/r/20220125144007.64407-1-tsbogend@alpha.franken.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
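
The pattern used by the dev_addr fixes in this series (assemble the address in a local buffer, then hand it to a setter) can be sketched in userspace as follows. This is a model under assumptions, not driver code: `struct netdev_model`, `model_eth_hw_addr_set()` and `model_probe()` are hypothetical stand-ins, and the ROM layout is invented for the example.

```c
#include <string.h>

#define ETH_ALEN 6

/* Simplified stand-in for struct net_device (assumption for illustration). */
struct netdev_model {
	const unsigned char *dev_addr;		/* read-only view for drivers */
	unsigned char addr_storage[ETH_ALEN];	/* backing store owned by the core */
};

static void netdev_model_init(struct netdev_model *dev)
{
	memset(dev->addr_storage, 0, ETH_ALEN);
	dev->dev_addr = dev->addr_storage;
}

/* Models eth_hw_addr_set(): the single place that writes the address. */
static void model_eth_hw_addr_set(struct netdev_model *dev,
				  const unsigned char *addr)
{
	memcpy(dev->addr_storage, addr, ETH_ALEN);
}

/* Driver side: gather scattered octets into a local array, then set once. */
static void model_probe(struct netdev_model *dev, const unsigned short *rom)
{
	unsigned char addr[ETH_ALEN];

	for (int i = 0; i < ETH_ALEN; i++)
		addr[i] = (unsigned char)(rom[i] & 0xff);

	model_eth_hw_addr_set(dev, addr);
}
```

Because `dev_addr` is a pointer to const, a driver that tries `dev->dev_addr[i] = ...` fails to compile, which is how the constification exposed the drivers fixed here.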

Jan 25, 2022 · 11 commits

net: hns3: handle empty unknown interrupt for VF · 2f61353c

Committed by Yufeng Mo on Jan 25, 2022

Since some interrupt states may be cleared by hardware, the driver
may receive an empty interrupt. Currently, the VF driver directly
disables the vector0 interrupt in this case. As a result, the VF
is unavailable. Therefore, the vector0 interrupt should be enabled
in this case.

Fixes: b90fcc5b ("net: hns3: add reset handling for VF when doing Core/Global/IMP reset")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec_mpc52xx: don't discard const from netdev->dev_addr · 74afa306

Committed by Jakub Kicinski on Jan 24, 2022

Recent changes made netdev->dev_addr const, and it's passed
directly to mpc52xx_fec_set_paddr().

A similar problem exists on the probe path; the driver needs
to call eth_hw_addr_set().

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: adeef3e3 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cpsw: Properly initialise struct page_pool_params · c63003e3

Committed by Toke Høiland-Jørgensen on Jan 24, 2022

The cpsw driver didn't properly initialise the struct page_pool_params
before calling page_pool_create(), which leads to crashes after the struct
has been expanded with new parameters.

The second Fixes tag below is where the buggy code was introduced, but
because the code was moved around this patch will only apply on top of the
commit in the first Fixes tag.

Fixes: c5013ac1 ("net: ethernet: ti: cpsw: move set of common functions in cpsw_priv")
Fixes: 9ed4050c ("net: ethernet: ti: cpsw: add XDP support")
Reported-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
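
The class of bug fixed by c63003e3 is a parameter struct that is filled field by field without being zeroed first, so a member added later holds stack garbage. A minimal sketch of the safe form, assuming a hypothetical `struct pool_params_model` in which `new_field` stands for a member added after the calling code was written:

```c
/* Hypothetical parameter struct; new_field models a later addition. */
struct pool_params_model {
	int order;
	unsigned int pool_size;
	unsigned int new_field;
};

static struct pool_params_model make_params(unsigned int size)
{
	/*
	 * Initializing at declaration zeroes every member that is not
	 * named, so members added to the struct later start out as 0.
	 */
	struct pool_params_model pp = {
		.order = 0,
		.pool_size = size,
	};

	return pp;
}
```

With this form the caller does not need to be updated each time the struct grows, as long as zero is a valid default for the new member.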

yam: fix a memory leak in yam_siocdevprivate() · 29eb3154

Committed by Hangyu Hua on Jan 24, 2022

ym needs to be freed when ym->cmd != SIOCYAMSMCS.

Fixes: 0781168e ("yam: fix a missing-check bug")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: reduce unnecessary wakeups from eee sw timer · c74ead22

Committed by Jisheng Zhang on Jan 23, 2022

Currently, on EEE capable platforms, if the EEE SW timer is used, the SW
timer causes 1 wakeup/s even if the TX has successfully entered EEE.
Remove this unnecessary wakeup by only calling mod_timer() if we
haven't successfully entered EEE.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-fixes-for-5.17-20220124' of... · e52984be

Committed by Jakub Kicinski on Jan 24, 2022

Merge tag 'linux-can-fixes-for-5.17-20220124' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2022-01-24

The first patch updates the email address of Brian Silverman from his
former employer to his private address.

The next patch fixes DT bindings information for the tcan4x5x SPI CAN
driver.

The following patch targets the m_can driver and fixes the
introduction of FIFO bulk read support.

Another patch for the tcan4x5x driver, which fixes the max register
value for the regmap config.

The last patch for the flexcan driver marks the RX mailbox support for
the MCF5441X as supported.

* tag 'linux-can-fixes-for-5.17-20220124' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: flexcan: mark RX via mailboxes as supported on MCF5441X
  can: tcan4x5x: regmap: fix max register value
  can: m_can: m_can_fifo_{read,write}: don't read or write from/to FIFO if length is 0
  dt-bindings: can: tcan4x5x: fix mram-cfg RX FIFO config
  mailmap: update email address of Brian Silverman
====================

Link: https://lore.kernel.org/r/20220124175955.3464134-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

can: flexcan: mark RX via mailboxes as supported on MCF5441X · f04aefd4

Committed by Marc Kleine-Budde on Jan 21, 2022

Most flexcan IP cores support 2 RX modes:
- FIFO
- mailbox

The flexcan IP core on the MCF5441X cannot receive CAN RTR messages
via mailboxes. However the mailbox mode is more performant. The commit

| 1c45f577 ("can: flexcan: add ethtool support to change rx-rtr setting during runtime")

added support to switch from FIFO to mailbox mode on these cores.

After Angelo Dureghello tested the mailbox mode on the MCF5441X,
this patch marks it (without RTR capability) as supported. Further, the
IP core overview table is updated to note that RTR reception via
mailboxes is not supported.

Link: https://lore.kernel.org/all/20220121084425.3141218-1-mkl@pengutronix.de
Tested-by: Angelo Dureghello <angelo@kernel-space.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: tcan4x5x: regmap: fix max register value · e59986de

Committed by Marc Kleine-Budde on Jan 14, 2022

The MRAM of the tcan4x5x has a size of 2K and starts at 0x8000. There
are no further registers in the tcan4x5x making 0x87fc the biggest
addressable register.

This patch fixes the max register value of the regmap config from
0x8ffc to 0x87fc.

Fixes: 6e1caaf8 ("can: tcan4x5x: fix max register value")
Link: https://lore.kernel.org/all/20220119064011.2943292-1-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
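
The 0x87fc value in e59986de follows from the MRAM start address and size given in the message. The arithmetic can be checked with a short sketch; the 4-byte register spacing is an assumption (it is what makes the last register land on 0x87fc), and the constant names are illustrative.

```c
enum {
	MRAM_START = 0x8000,	/* start address, from the commit message */
	MRAM_SIZE  = 2048,	/* 2K of MRAM, from the commit message */
	REG_STRIDE = 4,		/* assumed spacing between 32-bit registers */
};

/* Address of the last register that still lies inside the MRAM. */
static unsigned int max_register(void)
{
	return MRAM_START + MRAM_SIZE - REG_STRIDE;
}
```

0x8000 + 0x800 is 0x8800, the first address past the MRAM, so the last addressable register sits one stride below it.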

can: m_can: m_can_fifo_{read,write}: don't read or write from/to FIFO if length is 0 · db72589c

Committed by Marc Kleine-Budde on Jan 14, 2022

In order to optimize FIFO access, especially on m_can cores attached
to slow busses like SPI, in patch

| e3938177 ("can: m_can: Disable IRQs on FIFO bus errors")

bulk read/write support has been added to the m_can_fifo_{read,write}
functions.

That change leads the tcan driver to call
regmap_bulk_{read,write}() with a length of 0 (for CAN frames with 0
data length). regmap treats this as an error:

| tcan4x5x spi1.0 tcan4x5x0: FIFO write returned -22

This patch fixes the problem by not calling the
cdev->ops->{read,write}_fifo() in case of a 0 length read/write.

Fixes: e3938177 ("can: m_can: Disable IRQs on FIFO bus errors")
Link: https://lore.kernel.org/all/20220114155751.2651888-1-mkl@pengutronix.de
Cc: stable@vger.kernel.org
Cc: Matt Kline <matt@bitbashing.io>
Cc: Chandrasekar Ramakrishnan <rcsekar@samsung.com>
Reported-by: Michael Anochin <anochin@photo-meter.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
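
The fix in db72589c amounts to a zero-length guard in front of a backend that rejects empty transfers. A userspace sketch of that guard, with hypothetical names (`backend_bulk_read()` models a bulk accessor that returns -EINVAL for a count of 0, matching the -22 in the log line above; `fifo_read()` models the wrapper):

```c
#include <errno.h>
#include <stddef.h>

/* Counts how often the backend is actually invoked (for illustration). */
static int fifo_backend_calls;

/* Models a bulk-read backend that rejects zero-length transfers. */
static int backend_bulk_read(void *buf, size_t count)
{
	(void)buf;
	fifo_backend_calls++;
	if (count == 0)
		return -EINVAL;
	return 0;
}

/* Wrapper: a zero-length read is a successful no-op, not an error. */
static int fifo_read(void *buf, size_t count)
{
	if (count == 0)
		return 0;

	return backend_bulk_read(buf, count);
}
```

Placing the guard in the wrapper keeps every bus backend free of special cases for frames without payload.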

dt-bindings: can: tcan4x5x: fix mram-cfg RX FIFO config · 17a30422

Committed by Marc Kleine-Budde on Jan 14, 2022

The tcan4x5x only comes with 2K of MRAM; an RX FIFO with a depth of 32
doesn't fit into the MRAM. Use a depth of 16 instead.

Fixes: 4edd396a ("dt-bindings: can: tcan4x5x: Add DT bindings for TCAN4x5X driver")
Link: https://lore.kernel.org/all/20220119062951.2939851-1-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

mailmap: update email address of Brian Silverman · 984d1eff

Committed by Marc Kleine-Budde on Jan 10, 2022

Brian Silverman's address at bluerivertech.com is not valid anymore,
use Brian's private email address instead.

Link: https://lore.kernel.org/all/20220110082359.2019735-1-mkl@pengutronix.de
Cc: Brian Silverman <bsilver16384@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

Jan 24, 2022 · 12 commits

net: stmmac: remove unused members in struct stmmac_priv · de8a820d

Committed by Jisheng Zhang on Jan 23, 2022

The tx_coalesce and mii_irq are not used at all now, so remove them.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: atlantic: Use the bitmap API instead of hand-writing it · ebe0582b

Committed by Christophe JAILLET on Jan 23, 2022

Simplify code by using bitmap_weight() and bitmap_zero() instead of
hand-writing these functions.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ping: fix the sk_bound_dev_if match in ping_lookup · 2afc3b5a

Committed by Xin Long on Jan 22, 2022

When 'ping' changes to use PING socket instead of RAW socket by:

   # sysctl -w net.ipv4.ping_group_range="0 100"

the selftests 'router_broadcast.sh' will fail, as such command

  # ip vrf exec vrf-h1 ping -I veth0 198.51.100.255 -b

can't receive the response skb by the PING socket. It's caused by mismatch
of sk_bound_dev_if and dif in ping_rcv() when looking up the PING socket,
as dif is vrf-h1 if dif's master was set to vrf-h1.

This patch fixes this regression by also checking the sk_bound_dev_if
against sdif so that the packets can still be received even if the socket
is not bound to the vrf device but to the real iif.

Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
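
The matching rule described in 2afc3b5a can be written as a small predicate. This is a sketch of the rule as the message states it, not the code of ping_lookup(): the function name is hypothetical, and it assumes that a bound index of 0 means the socket is not bound to any device.

```c
#include <stdbool.h>

/*
 * bound_dev_if: ifindex the socket is bound to (0 = unbound, assumed)
 * dif:          ifindex the packet is attributed to (the VRF master
 *               when the ingress device is enslaved to a VRF)
 * sdif:         ifindex of the real ingress device in that case
 */
static bool bound_dev_matches(int bound_dev_if, int dif, int sdif)
{
	if (bound_dev_if == 0)
		return true;

	return bound_dev_if == dif || bound_dev_if == sdif;
}
```

Before the fix only the comparison against dif existed, so a socket bound to the real device (as with `ping -I veth0` inside the VRF) never matched a reply whose dif was the VRF device.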

net/smc: Transitional solution for clcsock race issue · c0bf3d8a

Committed by Wen Gu on Jan 22, 2022

We encountered a crash in smc_setsockopt() and it is caused by
accessing smc->clcsock after clcsock was released.

 BUG: kernel NULL pointer dereference, address: 0000000000000020
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E     5.16.0-rc4+ #53
 RIP: 0010:smc_setsockopt+0x59/0x280 [smc]
 Call Trace:
  <TASK>
  __sys_setsockopt+0xfc/0x190
  __x64_sys_setsockopt+0x20/0x30
  do_syscall_64+0x34/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f16ba83918e
  </TASK>

This patch tries to fix it by holding clcsock_release_lock and
checking whether clcsock has already been released before access.

Since a crash for the same reason could happen in smc_getsockopt()
or smc_switch_to_fallback(), this patch also checks smc->clcsock
in them. The caller of smc_switch_to_fallback() will identify
whether the fallback succeeded according to the return value.

Fixes: fd57770d ("net/smc: wait for pending work before clcsock release_sock")
Link: https://lore.kernel.org/lkml/5dd7ffd1-28e2-24cc-9442-1defec27375e@linux.ibm.com/T/
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: remove unused ->wait_capability · 3a5d9db7

Committed by Sukadev Bhattiprolu on Jan 21, 2022

With the previous bug fix, the ->wait_capability flag is no longer needed
and can be removed.

Fixes: 249168ad ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: don't spin in tasklet · 48079e7f

Committed by Sukadev Bhattiprolu on Jan 21, 2022

ibmvnic_tasklet() continuously spins waiting for responses to all
capability requests. It does this to avoid encountering an error
during initialization of the vnic. However, if there is a bug in the
VIOS and we do not receive a response to one or more queries, the
tasklet ends up spinning continuously, leading to hard lockups.

If we fail to receive a message from the VIOS it is reasonable to
time out the login attempt rather than spin indefinitely in the tasklet.

Fixes: 249168ad ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: init ->running_cap_crqs early · 151b6a5c

Committed by Sukadev Bhattiprolu on Jan 21, 2022

We use ->running_cap_crqs to determine when the ibmvnic_tasklet() should
send out the next protocol message type, i.e. when we get back responses
to all our QUERY_CAPABILITY CRQs we send out REQUEST_CAPABILITY CRQs.
Similarly, when we get responses to all the REQUEST_CAPABILITY CRQs, we
send out the QUERY_IP_OFFLOAD CRQ.

We currently increment ->running_cap_crqs as we send out each CRQ and
have the ibmvnic_tasklet() send out the next message type, when this
running_cap_crqs count drops to 0.

This assumes that all the CRQs of the current type were sent out before
the count drops to 0. However it is possible that we send out say 6 CRQs,
get preempted and receive all the 6 responses before we send out the
remaining CRQs. This can result in ->running_cap_crqs count dropping to
zero before all messages of the current type were sent and we end up
sending the next protocol message too early.

Instead initialize the ->running_cap_crqs upfront so the tasklet will
only send the next protocol message after all responses are received.

Use the cap_reqs local variable to also detect any discrepancy (either
now or in future) in the number of capability requests we actually send.

Currently only send_query_cap() is affected by this behavior (of sending
next message early) since it is called from the worker thread (during
reset) and from application thread (during ->ndo_open()) and they can be
preempted. send_request_cap() is only called from the tasklet, which
processes CRQ responses sequentially, so it is not affected. But to
maintain the existing symmetry with send_query_capability() we update
send_request_capability() also.

ibmvnic: Allow extra failures before disabling · db9f0e8b

Committed by Sukadev Bhattiprolu on Jan 21, 2022

If auto-priority-failover (APF) is enabled and there are at least two
backing devices of different priorities, some resets like fail-over,
change-param etc can cause at least two back to back failovers. (Failover
from high priority backing device to lower priority one and then back
to the higher priority one if that is still functional).

Depending on the timing of the two failovers it is possible to trigger
a "hard" reset and for the hard reset to fail due to failovers. When this
occurs, the driver assumes that the network is unstable and disables the
VNIC for a 60-second "settling time". This in turn can cause the ethtool
command to fail with "No such device" while the vnic automatically recovers
a little while later.

Given that it's possible to have two back to back failures, allow for extra
failures before disabling the vnic for the settling time.

Fixes: f15fde9d ("ibmvnic: delay next reset if hard reset fails")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: fix ip option filtering for locally generated fragments · 27a8caa5

Committed by Jakub Kicinski on Jan 21, 2022

During IP fragmentation we sanitize IP options. This means overwriting
options which should not be copied with NOPs. Only the first fragment
has the original, full options.

ip_fraglist_prepare() copies the IP header and options from previous
fragment to the next one. Commit 19c3401a ("net: ipv4: place control
buffer handling away from fragmentation iterators") moved sanitizing
options before ip_fraglist_prepare() which means options are sanitized
and then overwritten again with the old values.

Fixing this is not enough, however, nor did the sanitization work
prior to aforementioned commit.

ip_options_fragment() (which does the sanitization) uses ipcb->opt.optlen
for the length of the options. ipcb->opt of fragments is not populated
(it's 0), only the head skb has the state properly built. So even when
called at the right time ip_options_fragment() does nothing. This seems
to date back all the way to v2.5.44 when the fast path for pre-fragmented
skbs had been introduced. Prior to that ip_options_build() would have been
called for every fragment (in fact ever since v2.5.44 the fragmentation
handling in ip_options_build() has been dead code, I'll clean it up in
-next).

In the original patch (see Link) caixf mentions fixing the handling
for fragments other than the second one, but I'm not sure how _any_
fragment could have had their options sanitized with the code
as it stood.

Tested with python (MTU on lo lowered to 1000 to force fragmentation):

import socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.IPPROTO_IP, socket.IP_OPTIONS,
bytearray([7,4,5,192, 20|0x80,4,1,0]))
s.sendto(b'1'*2000, ('127.0.0.1', 1234))

Before:

IP (tos 0x0, ttl 64, id 1053, offset 0, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256))
localhost.36500 > localhost.search-agent: UDP, length 2000
IP (tos 0x0, ttl 64, id 1053, offset 968, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256))
localhost > localhost: udp
IP (tos 0x0, ttl 64, id 1053, offset 1936, flags [none], proto UDP (17), length 100, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256))
localhost > localhost: udp

After:

IP (tos 0x0, ttl 96, id 42549, offset 0, flags [+], proto UDP (17), length 996, options (RR [bad length 4] [bad ptr 5] 192.148.4.1,,RA value 256))
localhost.51607 > localhost.search-agent: UDP, bad length 2000 > 960
IP (tos 0x0, ttl 96, id 42549, offset 968, flags [+], proto UDP (17), length 996, options (NOP,NOP,NOP,NOP,RA value 256))
localhost > localhost: udp
IP (tos 0x0, ttl 96, id 42549, offset 1936, flags [none], proto UDP (17), length 100, options (NOP,NOP,NOP,NOP,RA value 256))
localhost > localhost: udp

RA (20 | 0x80) is now copied as expected, RR (7) is "NOPed out".

Link: https://lore.kernel.org/netdev/20220107080559.122713-1-ooppublic@163.com/
Fixes: 19c3401a ("net: ipv4: place control buffer handling away from fragmentation iterators")
Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: caixf <ooppublic@163.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
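
The sanitization that 27a8caa5 describes (overwrite options that must not be copied into later fragments with NOPs) can be sketched in userspace. This is an illustration, not the kernel's ip_options_fragment(): the function and constant names are hypothetical, and it relies on the IPv4 convention that bit 0x80 of the option type is the "copied" flag, type 0 ends the list and type 1 is a one-byte NOP.

```c
#include <stddef.h>
#include <string.h>

#define OPT_END		0	/* end of option list */
#define OPT_NOOP	1	/* one-byte no-operation */
#define OPT_COPY_FLAG	0x80	/* set when the option is copied on fragmentation */

/* Replace every option without the copy flag by NOPs, in place. */
static void sanitize_options(unsigned char *opt, size_t len)
{
	size_t i = 0;

	while (i < len) {
		unsigned char type = opt[i];
		size_t optlen;

		if (type == OPT_END)
			break;
		if (type == OPT_NOOP) {
			i++;
			continue;
		}
		if (i + 1 >= len)
			break;		/* truncated option, stop */
		optlen = opt[i + 1];
		if (optlen < 2 || optlen > len - i)
			break;		/* malformed length, stop */
		if (!(type & OPT_COPY_FLAG))
			memset(opt + i, OPT_NOOP, optlen);
		i += optlen;
	}
}
```

Applied to the option bytes from the Python test in the commit message, the 4-byte RR option (type 7) becomes four NOPs while the RA option (20 | 0x80) is left intact, which corresponds to the "After" capture for the second and third fragments.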

net-procfs: show net devices bound packet types · 1d10f8a1

Committed by Jianguo Wu on Jan 21, 2022

After commit 7866a621 ("dev: add per net_device packet type chains"),
we cannot get packet types that are bound to a specified net device from
/proc/net/ptype. This patch fixes the regression.

Run "tcpdump -i ens192 udp -nns0" before and after applying this patch:

Before:
  [root@localhost ~]# cat /proc/net/ptype
  Type Device      Function
  0800          ip_rcv
  0806          arp_rcv
  86dd          ipv6_rcv

After:
  [root@localhost ~]# cat /proc/net/ptype
  Type Device      Function
  ALL  ens192   tpacket_rcv
  0800          ip_rcv
  0806          arp_rcv
  86dd          ipv6_rcv

v1 -> v2:
  - fix the regression rather than adding new /proc API as
    suggested by Stephen Hemminger.

Fixes: 7866a621 ("dev: add per net_device packet type chains")
Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: use rcu_dereference_rtnl when get bonding active slave · aa603467

Committed by Hangbin Liu on Jan 21, 2022

bond_option_active_slave_get_rcu() should not be used under rtnl_mutex as it
uses rcu_dereference(). Replace it with rcu_dereference_rtnl() so we can also
use this function in rtnl protected context.

With this update, we can remove the rcu_read_lock/unlock in
bonding .ndo_eth_ioctl and .get_ts_info.

Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Fixes: 94dd016a ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to active device")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sfp: ignore disabled SFP node · 2148927e

Committed by Marek Behún on Jan 19, 2022

Commit ce0aa27f ("sfp: add sfp-bus to bridge between network devices
and sfp cages") added code which finds the SFP bus DT node even if the node
is disabled with status = "disabled". Because of this, when phylink is
created, it ends up with a non-null .sfp_bus member, even though the SFP
module is not probed (because the node is disabled).

We need to ignore a disabled SFP bus node.

Fixes: ce0aa27f ("sfp: add sfp-bus to bridge between network devices and sfp cages")
Signed-off-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org # 2203cbf2 ("net: sfp: move fwnode parsing into sfp-bus layer")
Signed-off-by: David S. Miller <davem@davemloft.net>
