提交 · bc53f67d247a38d43e081faa7e63690a1279f5c7 · openeuler / Kernel

26 6月, 2020 18 次提交

kdb: Switch to use safer dbg_io_ops over console APIs · 5946d1f5

由 Sumit Garg 提交于 6月 04, 2020

In kgdb context, calling console handlers aren't safe due to locks used
in those handlers which could in turn lead to a deadlock. Although, using
oops_in_progress increases the chance to bypass locks in most console
handlers but it might not be sufficient enough in case a console uses
more locks (VT/TTY is good example).

Currently when a driver provides both polling I/O and a console then kdb
will output using the console. We can increase robustness by using the
currently active polling I/O driver (which should be lockless) instead
of the corresponding console. For several common cases (e.g. an
embedded system with a single serial port that is used both for console
output and debugger I/O) this will result in no console handler being
used.

In order to achieve this we need to reverse the order of preference to
use dbg_io_ops (uses polling I/O mode) over console APIs. So we just
store "struct console" that represents debugger I/O in dbg_io_ops and
while emitting kdb messages, skip console that matches dbg_io_ops
console in order to avoid duplicate messages. After this change,
"is_console" param becomes redundant and hence removed.
Suggested-by: NDaniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: NSumit Garg <sumit.garg@linaro.org>
Link: https://lore.kernel.org/r/1591264879-25920-5-git-send-email-sumit.garg@linaro.orgReviewed-by: NDouglas Anderson <dianders@chromium.org>
Reviewed-by: NPetr Mladek <pmladek@suse.com>
Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>

5946d1f5

i2c: core: check returned size of emulated smbus block read · 40e05200

由 Mans Rullgard 提交于 6月 13, 2020

If the i2c bus driver ignores the I2C_M_RECV_LEN flag (as some of
them do), it is possible for an I2C_SMBUS_BLOCK_DATA read issued
on some random device to return an arbitrary value in the first
byte (and nothing else).  When this happens, i2c_smbus_xfer_emulated()
will happily write past the end of the supplied data buffer, thus
causing Bad Things to happen.  To prevent this, check the size
before copying the data block and return an error if it is too large.

Fixes: 209d27c3 ("i2c: Emulate SMBus block read over I2C")
Signed-off-by: NMans Rullgard <mans@mansr.com>
[wsa: use better errno]
Signed-off-by: NWolfram Sang <wsa@kernel.org>

40e05200

media: omap3isp: remove cacheflush.h · 3c785826

由 Nathan Chancellor 提交于 6月 25, 2020

After mm.h was removed from the asm-generic version of cacheflush.h,
s390 allyesconfig shows several warnings of the following nature:

In file included from arch/s390/include/generated/asm/cacheflush.h:1,
from drivers/media/platform/omap3isp/isp.c:42:
include/asm-generic/cacheflush.h:16:42: warning: 'struct mm_struct' declared inside parameter list will not be visible outside of this definition or declaration

As Geert and Laurent point out, this driver does not need this header in
the two files that include it. Remove it so there are no warnings.

Link: http://lkml.kernel.org/r/20200622234740.72825-2-natechancellor@gmail.com
Fixes: e0cf615d ("asm-generic: don't include <linux/mm.h> in cacheflush.h")
Signed-off-by: NNathan Chancellor <natechancellor@gmail.com>
Suggested-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Suggested-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3c785826

wil6210: account for napi_gro_receive never returning GRO_DROP · 045790b7

由 Jason A. Donenfeld 提交于 6月 24, 2020

The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands. In this case, too, the non-gro path didn't bother checking
the return value. Plus, this had some clunky debugging functions that
duplicated code from elsewhere and was generally pretty messy. So, this
commit cleans that all up too.

Fixes: 6570bc79 ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

045790b7

hns: do not cast return value of napi_gro_receive to null · 93ab48a9

由 Jason A. Donenfeld 提交于 6月 24, 2020

Basically no drivers care about the return value here, and there's no
__must_check that would make casting to void sensible, so remove it.
Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

93ab48a9

socionext: account for napi_gro_receive never returning GRO_DROP · e5e7d805

由 Jason A. Donenfeld 提交于 6月 24, 2020

The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands.

Fixes: 6570bc79 ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e5e7d805

wireguard: receive: account for napi_gro_receive never returning GRO_DROP · df08126e

由 Jason A. Donenfeld 提交于 6月 24, 2020

The napi_gro_receive function no longer returns GRO_DROP ever, making
handling GRO_DROP dead code. This commit removes that dead code.
Further, it's not even clear that device drivers have any business in
taking action after passing off received packets; that's arguably out of
their hands.

Fixes: e7096c13 ("net: WireGuard secure network tunnel")
Fixes: 6570bc79 ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()")
Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

df08126e

vxlan: fix last fdb index during dump of fdb with nhid · b18e9834

由 Roopa Prabhu 提交于 6月 24, 2020

This patch fixes last saved fdb index in fdb dump handler when
handling fdb's with nhid.

Fixes: 1274e1cc ("vxlan: ecmp support for mac fdb entries")
Signed-off-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b18e9834

net: dsa: sja1105: fix tc-gate schedule with single element · 43ce887c

由 Vladimir Oltean 提交于 6月 24, 2020

The sja1105_gating_cfg_time_to_interval function does this, as per the
comments:

/* The gate entries contain absolute times in their e->interval field. Convert
 * that to proper intervals (i.e. "0, 5, 10, 15" to "5, 5, 5, 5").
 */

To perform that task, it iterates over gating_cfg->entries, at each step
updating the interval of the _previous_ entry. So one interval remains
to be updated at the end of the loop: the last one (since it isn't
"prev" for anyone else).

But there was an erroneous check, that the last element's interval
should not be updated if it's also the only element. I'm not quite sure
why that check was there, but it's clearly incorrect, as a tc-gate
schedule with a single element would get an e->interval of zero,
regardless of the duration requested by the user. The switch wouldn't
even consider this configuration as valid: it will just drop all traffic
that matches the rule.

Fixes: 834f8933 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Reported-by: NXiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

43ce887c

net: dsa: sja1105: recalculate gating subschedule after deleting tc-gate rules · 82f6896a

由 Vladimir Oltean 提交于 6月 24, 2020

Currently, tas_data->enabled would remain true even after deleting all
tc-gate rules from the switch ports, which would cause the
sja1105_tas_state_machine to get unnecessarily scheduled.

Also, if there were any errors which would prevent the hardware from
enabling the gating schedule, the sja1105_tas_state_machine would
continuously detect and print that, spamming the kernel log, even if the
rules were subsequently deleted.

The rules themselves are _not_ active, because sja1105_init_scheduling
does enough of a job to not install the gating schedule in the static
config. But the virtual link rules themselves are still present.

So call the functions that remove the tc-gate configuration from
priv->tas_data.gating_cfg, so that tas_data->enabled can be set to
false, and sja1105_tas_state_machine will stop from being scheduled.

Fixes: 834f8933 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

82f6896a

net: dsa: sja1105: unconditionally free old gating config · 026bdb2b

由 Vladimir Oltean 提交于 6月 24, 2020

Currently sja1105_compose_gating_subschedule is not prepared to be
called for the case where we want to recompute the global tc-gate
configuration after we've deleted those actions on a port.

After deleting the tc-gate actions on the last port, max_cycle_time
would become zero, and that would incorrectly prevent
sja1105_free_gating_config from getting called.

So move the freeing function above the check for the need to apply a new
configuration.

Fixes: 834f8933 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

026bdb2b

net: dsa: sja1105: move sja1105_compose_gating_subschedule at the top · e39109f5

由 Vladimir Oltean 提交于 6月 24, 2020

It turns out that sja1105_compose_gating_subschedule must also be called
from sja1105_vl_delete, to recalculate the overall tc-gate
configuration. Currently this is not possible without introducing a
forward declaration. So move the function at the top of the file, along
with its dependencies.
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e39109f5

net: macb: free resources on failure path of at91ether_open() · 33fdef24

由 Claudiu Beznea 提交于 6月 24, 2020

DMA buffers were not freed on failure path of at91ether_open().
Along with changes for freeing the DMA buffers the enable/disable
interrupt instructions were moved to at91ether_start()/at91ether_stop()
functions and the operations on at91ether_stop() were done in
their reverse order (compared with how is done in at91ether_start()):
before this patch the operation order on interface open path
was as follows:
1/ alloc DMA buffers
2/ enable tx, rx
3/ enable interrupts
and the order on interface close path was as follows:
1/ disable tx, rx
2/ disable interrupts
3/ free dma buffers.

Fixes: 7897b071 ("net: macb: convert to phylink")
Signed-off-by: NClaudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

33fdef24

net: macb: call pm_runtime_put_sync on failure path · 0eaf228d

由 Claudiu Beznea 提交于 6月 24, 2020

Call pm_runtime_put_sync() on failure path of at91ether_open.

Fixes: e6a41c23 ("net: macb: ensure interface is not suspended on at91rm9200")
Signed-off-by: NClaudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0eaf228d

i2c: fsi: Fix the port number field in status register · 502035e2

由 Eddie James 提交于 6月 09, 2020

The port number field in the status register was not correct, so fix it.

Fixes: d6ffb630 ("i2c: Add FSI-attached I2C master algorithm")
Signed-off-by: NEddie James <eajames@linux.ibm.com>
Signed-off-by: NJoel Stanley <joel@jms.id.au>
Signed-off-by: NWolfram Sang <wsa@kernel.org>

502035e2

clk: sifive: allocate sufficient memory for struct __prci_data · d0a5fdf4

由 Vincent Chen 提交于 6月 23, 2020

The (struct __prci_data).hw_clks.hws is an array with dynamic elements.
Using struct_size(pd, hw_clks.hws, ARRAY_SIZE(__prci_init_clocks))
instead of sizeof(*pd) to get the correct memory size of
struct __prci_data for sifive/fu540-prci. After applying this
modifications, the kernel runs smoothly with CONFIG_SLAB_FREELIST_RANDOM
enabled on the HiFive unleashed board.

Fixes: 30b8e27e ("clk: sifive: add a driver for the SiFive FU540 PRCI IP block")
Signed-off-by: NVincent Chen <vincent.chen@sifive.com>
Signed-off-by: NPalmer Dabbelt <palmerdabbelt@google.com>

d0a5fdf4

net: phy: mscc: avoid skcipher API for single block AES encryption · 5a3235e5

由 Ard Biesheuvel 提交于 6月 25, 2020

The skcipher API dynamically instantiates the transformation object
on request that implements the requested algorithm optimally on the
given platform. This notion of optimality only matters for cases like
bulk network or disk encryption, where performance can be a bottleneck,
or in cases where the algorithm itself is not known at compile time.

In the mscc case, we are dealing with AES encryption of a single
block, and so neither concern applies, and we are better off using
the AES library interface, which is lightweight and safe for this
kind of use.

Note that the scatterlist API does not permit references to buffers
that are located on the stack, so the existing code is incorrect in
any case, but avoiding the skcipher and scatterlist APIs entirely is
the most straight-forward approach to fixing this.

Cc: Antoine Tenart <antoine.tenart@bootlin.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Fixes: 28c5107a ("net: phy: mscc: macsec support")
Reviewed-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NArd Biesheuvel <ardb@kernel.org>
Tested-by: NAntoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5a3235e5

vfio/pci: Fix SR-IOV VF handling with MMIO blocking · ebfa440c

由 Alex Williamson 提交于 6月 25, 2020

SR-IOV VFs do not implement the memory enable bit of the command
register, therefore this bit is not set in config space after
pci_enable_device(). This leads to an unintended difference
between PF and VF in hand-off state to the user. We can correct
this by setting the initial value of the memory enable bit in our
virtualized config space. There's really no need however to
ever fault a user on a VF though as this would only indicate an
error in the user's management of the enable bit, versus a PF
where the same access could trigger hardware faults.

Fixes: abafbc55 ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory")
Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>

ebfa440c

25 6月, 2020 22 次提交

cpuidle: Rearrange s2idle-specific idle state entry code · 10e8b11e

由 Rafael J. Wysocki 提交于 6月 25, 2020

Implement call_cpuidle_s2idle() in analogy with call_cpuidle()
for the s2idle-specific idle state entry and invoke it from
cpuidle_idle_call() to make the s2idle-specific idle entry code
path look more similar to the "regular" idle entry one.

No intentional functional impact.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NChen Yu <yu.c.chen@intel.com>

10e8b11e

net: bcmgenet: use hardware padding of runt frames · 20d1f2d1

由 Doug Berger 提交于 6月 24, 2020

When commit 474ea9ca ("net: bcmgenet: correctly pad short
packets") added the call to skb_padto() it should have been
located before the nr_frags parameter was read since that value
could be changed when padding packets with lengths between 55
and 59 bytes (inclusive).

The use of a stale nr_frags value can cause corruption of the
pad data when tx-scatter-gather is enabled. This corruption of
the pad can cause invalid checksum computation when hardware
offload of tx-checksum is also enabled.

Since the original reason for the padding was corrected by
commit 7dd39913 ("net: bcmgenet: fix skb_len in
bcmgenet_xmit_single()") we can remove the software padding all
together and make use of hardware padding of short frames as
long as the hardware also always appends the FCS value to the
frame.

Fixes: 474ea9ca ("net: bcmgenet: correctly pad short packets")
Signed-off-by: NDoug Berger <opendmb@gmail.com>
Acked-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

20d1f2d1

net: bcmgenet: use __be16 for htons(ETH_P_IP) · d966d2ef

由 Doug Berger 提交于 6月 24, 2020

The 16-bit value that holds a short in network byte order should
be declared as a restricted big endian type to allow type checks
to succeed during assignment.

Fixes: 3e370952 ("net: bcmgenet: add support for ethtool rxnfc flows")
Reported-by: Nkbuild test robot <lkp@intel.com>
Signed-off-by: NDoug Berger <opendmb@gmail.com>
Acked-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d966d2ef

net: bcmgenet: re-remove bcmgenet_hfb_add_filter · 673bafd5

由 Doug Berger 提交于 6月 24, 2020

This function was originally removed by Baoyou Xie in
commit e2072600 ("net: bcmgenet: remove unused function in
bcmgenet.c") to prevent a build warning.

Some of the functions removed by Baoyou Xie are now used for
WAKE_FILTER support so his commit was reverted, but this function
is still unused and the kbuild test robot dutifully reported the
warning.

This commit once again removes the remaining unused hfb functions.

Fixes: 14da1510 ("Revert "net: bcmgenet: remove unused function in bcmgenet.c"")
Reported-by: Nkbuild test robot <lkp@intel.com>
Signed-off-by: NDoug Berger <opendmb@gmail.com>
Acked-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

673bafd5

drm/amd: fix potential memleak in err branch · b5b78a6c

由 Bernard Zhao 提交于 6月 20, 2020

The function kobject_init_and_add alloc memory like:
kobject_init_and_add->kobject_add_varg->kobject_set_name_vargs
->kvasprintf_const->kstrdup_const->kstrdup->kmalloc_track_caller
->kmalloc_slab, in err branch this memory not free. If use
kmemleak, this path maybe catched.
These changes are to add kobject_put in kobject_init_and_add
failed branch, fix potential memleak.
Signed-off-by: NBernard Zhao <bernard@vivo.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

b5b78a6c

drm/amd/display: Fix ineffective setting of max bpc property · fa7041d9

由 Stylon Wang 提交于 6月 12, 2020

[Why]
Regression was introduced where setting max bpc property has no effect
on the atomic check and final commit. It has the same effect as max bpc
being stuck at 8.

[How]
Correctly propagate max bpc with the new connector state.
Signed-off-by: NStylon Wang <stylon.wang@amd.com>
Reviewed-by: NNicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: NRodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

fa7041d9

drm/amd/display: Enable output_bpc property on all outputs · 5ae9c378

由 Stylon Wang 提交于 6月 01, 2020

[Why]
Connector property output_bpc is available on DP/eDP only. New IGT tests
would benifit if this property works on HDMI.

[How]
Enable this read-only property on all types of connectors.
Signed-off-by: NStylon Wang <stylon.wang@amd.com>
Reviewed-by: NNicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: NRodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

5ae9c378

drm/amdgpu: add fw release for sdma v5_0 · edfaf6fa

由 Wenhui Sheng 提交于 6月 18, 2020

sdma fw isn't released when module exit
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NWenhui Sheng <Wenhui.Sheng@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

edfaf6fa

qed: add missing error test for DBG_STATUS_NO_MATCHING_FRAMING_MODE · a5124386

由 Colin Ian King 提交于 6月 24, 2020

The error DBG_STATUS_NO_MATCHING_FRAMING_MODE was added to the enum
enum dbg_status however there is a missing corresponding entry for
this in the array s_status_str. This causes an out-of-bounds read when
indexing into the last entry of s_status_str.  Fix this by adding in
the missing entry.

Addresses-Coverity: ("Out-of-bounds read").
Fixes: 2d22bc83 ("qed: FW 8.42.2.0 debug features")
Signed-off-by: NColin Ian King <colin.king@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a5124386

net: phy: call phy_disable_interrupts() in phy_init_hw() · 9886a4db

由 Jisheng Zhang 提交于 6月 24, 2020

Call phy_disable_interrupts() in phy_init_hw() to "have a defined init
state as we don't know in which state the PHY is if the PHY driver is
loaded. We shouldn't assume that it's the chip power-on defaults, BIOS
or boot loader could have changed this. Or in case of dual-boot
systems the other OS could leave the PHY in whatever state." as pointed
out by Heiner.
Suggested-by: NHeiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: NJisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9886a4db

net: phy: make phy_disable_interrupts() non-static · 3dd4ef1b

由 Jisheng Zhang 提交于 6月 24, 2020

We face an issue with rtl8211f, a pin is shared between INTB and PMEB,
and the PHY Register Accessible Interrupt is enabled by default, so
the INTB/PMEB pin is always active in polling mode case.

As Heiner pointed out "I was thinking about calling
phy_disable_interrupts() in phy_init_hw(), to have a defined init
state as we don't know in which state the PHY is if the PHY driver is
loaded. We shouldn't assume that it's the chip power-on defaults, BIOS
or boot loader could have changed this. Or in case of dual-boot
systems the other OS could leave the PHY in whatever state."

Make phy_disable_interrupts() non-static so that it could be used in
phy_init_hw() to have a defined init state.
Suggested-by: NHeiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: NJisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3dd4ef1b

net: ethernet: mvneta: Add back interface mode validation · 41c2b6b4

由 Sascha Hauer 提交于 6月 24, 2020

When writing the serdes configuration register was moved to
mvneta_config_interface() the whole code block was removed from
mvneta_port_power_up() in the assumption that its only purpose was to
write the serdes configuration register. As mentioned by Russell King
its purpose was also to check for valid interface modes early so that
later in the driver we do not have to care for unexpected interface
modes.
Add back the test to let the driver bail out early on unhandled
interface modes.

Fixes: b4748553 ("net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy")
Signed-off-by: NSascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: NRussell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

41c2b6b4

net: ethernet: mvneta: Do not error out in non serdes modes · d3d239dc

由 Sascha Hauer 提交于 6月 24, 2020

In mvneta_config_interface() the RGMII modes are catched by the default
case which is an error return. The RGMII modes are valid modes for the
driver, so instead of returning an error add a break statement to return
successfully.

This avoids this warning for non comphy SoCs which use RGMII, like
SolidRun Clearfog:

WARNING: CPU: 0 PID: 268 at drivers/net/ethernet/marvell/mvneta.c:3512 mvneta_start_dev+0x220/0x23c

Fixes: b4748553 ("net: ethernet: mvneta: Fix Serdes configuration for SoCs without comphy")
Signed-off-by: NSascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: NRussell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d3d239dc

drm/fb-helper: Fix vt restore · dc5bdb68

由 Daniel Vetter 提交于 6月 24, 2020

In the past we had a pile of hacks to orchestrate access between fbdev
emulation and native kms clients. We've tried to streamline this, by
always preferring the kms side above fbdev calls when a drm master
exists, because drm master controls access to the display resources.

Unfortunately this breaks existing userspace, specifically Xorg. When
exiting Xorg first restores the console to text mode using the KDSET
ioctl on the vt. This does nothing, because a drm master is still
around. Then it drops the drm master status, which again does nothing,
because logind is keeping additional drm fd open to be able to
orchestrate vt switches. In the past this is the point where fbdev was
restored, as part of the ->lastclose hook on the drm side.

Now to fix this regression we don't want to go back to letting fbdev
restore things whenever it feels like, or to the pile of hacks we've
had before. Instead try and go with a minimal exception to make the
KDSET case work again, and nothing else.

This means that if userspace does a KDSET call when switching between
graphical compositors, there will be some flickering with fbcon
showing up for a bit. But a) that's not a regression and b) userspace
can fix it by improving the vt switching dance - logind should have
all the information it needs.

While pondering all this I'm also wondering wheter we should have a
SWITCH_MASTER ioctl to allow race-free master status handover. But
that's for another day.

v2: Somehow forgot to cc all the fbdev people.

v3: Fix typo Alex spotted.
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208179
Cc: shlomo@fastmail.com
Reported-and-Tested-by: shlomo@fastmail.com
Cc: Michel Dänzer <michel@daenzer.net>
Fixes: 64914da2 ("drm/fbdev-helper: don't force restores")
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.7+
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Qiujun Huang <hqjagain@gmail.com>
Cc: Peter Rosin <peda@axentia.se>
Cc: linux-fbdev@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200624092910.3280448-1-daniel.vetter@ffwll.ch

dc5bdb68

IB/hfi1: Add atomic triggered sleep/wakeup · 38fd98af

由 Mike Marciniszyn 提交于 6月 23, 2020

When running iperf in a two host configuration the following trace can
occur:

[  319.728730] NETDEV WATCHDOG: ib0 (hfi1): transmit queue 0 timed out

The issue happens because the current implementation relies on the netif
txq being stopped to control the flushing of the tx list.

There are two resources that the transmit logic can wait on and stop the
txq:
- SDMA descriptors
- Ring space to hold completions

The ring space is tested on the sending side and relieved when the ring is
consumed in the napi tx reaping.

Unfortunately, that reaping can run conncurrently with the workqueue
flushing of the txlist.  If the txq is started just before the workitem
executes, the txlist will never be flushed, leading to the txq being
stuck.

Fix by:
- Adding sleep/wakeup wrappers
  * Use an atomic to control the call to the netif routines inside the
    wrappers

- Use another atomic to record ring space exhaustion
  * Only wakeup when the a ring space exhaustion has happened and it
    relieved

Add additional wrappers to clarify the ring space resource handling.

Fixes: d99dc602 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20200623204327.108092.4024.stgit@awfm-01.aw.intel.comReviewed-by: NKaike Wan <kaike.wan@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

38fd98af

IB/hfi1: Correct -EBUSY handling in tx code · 82172b76

由 Mike Marciniszyn 提交于 6月 23, 2020

The current code mishandles -EBUSY in two ways:
- The flow change doesn't test the return from the flush and runs on to
  process the current packet racing with the wakeup processing
- The -EBUSY handling for a single packet inserts the tx into the txlist
  after the submit call, racing with the same wakeup processing

Fix the first by dropping the skb and returning NETDEV_TX_OK.

Fix the second by insuring the the list entry within the txreq is inited
when allocated.  This enables the sleep routine to detect that the txreq
has used the non-list api and queue the packet to the txlist.

Both flaws can lead to having the flushing thread executing in causing two
threads to manipulate the txlist.

Fixes: d99dc602 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20200623204321.108092.83898.stgit@awfm-01.aw.intel.comReviewed-by: NKaike Wan <kaike.wan@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

82172b76

IB/hfi1: Fix module use count flaw due to leftover module put calls · 822fbd37

由 Dennis Dalessandro 提交于 6月 23, 2020

When the try_module_get calls were removed from opening and closing of the
i2c debugfs file, the corresponding module_put calls were missed.  This
results in an inaccurate module use count that requires a power cycle to
fix.

Fixes: 09fbca8e ("IB/hfi1: No need to use try_module_get for debugfs")
Link: https://lore.kernel.org/r/20200623203230.106975.76240.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: NKaike Wan <kaike.wan@intel.com>
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

822fbd37

IB/hfi1: Restore kfree in dummy_netdev cleanup · b46925a2

由 Dennis Dalessandro 提交于 6月 23, 2020

We need to do some rework on the dummy netdev. Calling the free_netdev()
would normally make sense, and that will be addressed in an upcoming
patch. For now just revert the behavior to what it was before keeping the
unused variable removal part of the patch.

The dd->dumm_netdev is mainly used for packet receiving through
alloc_netdev_mqs() for typical net devices. A a result, it should be freed
with kfree instead of free_netdev() that leads to a crash when unloading
the hfi1 module:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 8000000855b54067 P4D 8000000855b54067 PUD 84a4f5067 PMD 0
  Oops: 0000 [#1] SMP PTI
  CPU: 73 PID: 10299 Comm: modprobe Not tainted 5.6.0-rc5+ #1
  Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
  RIP: 0010:__hw_addr_flush+0x12/0x80
  Code: 40 00 48 83 c4 08 4c 89 e7 5b 5d 41 5c e9 76 77 18 00 66 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 fc 55 53 48 8b 1f 48 39 df <48> 8b 2b 75 08 eb 4a 48 89 eb 48 89 c5 48 89 df e8 99 bf d0 ff 84
  RSP: 0018:ffffb40e08783db8 EFLAGS: 00010282
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
  RDX: ffffb40e00000000 RSI: 0000000000000246 RDI: ffff88ab13662298
  RBP: ffff88ab13662000 R08: 0000000000001549 R09: 0000000000001549
  R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff88ab13662298
  R13: ffff88ab1b259e20 R14: ffff88ab1b259e42 R15: 0000000000000000
  FS:  00007fb39b534740(0000) GS:ffff88b31f940000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 000000084d3ea004 CR4: 00000000003606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   dev_addr_flush+0x15/0x30
   free_netdev+0x7e/0x130
   hfi1_netdev_free+0x59/0x70 [hfi1]
   remove_one+0x65/0x110 [hfi1]
   pci_device_remove+0x3b/0xc0
   device_release_driver_internal+0xec/0x1b0
   driver_detach+0x46/0x90
   bus_remove_driver+0x58/0xd0
   pci_unregister_driver+0x26/0xa0
   hfi1_mod_cleanup+0xc/0xd54 [hfi1]
   __x64_sys_delete_module+0x16c/0x260
   ? exit_to_usermode_loop+0xa4/0xc0
   do_syscall_64+0x5b/0x200
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 193ba031 ("IB/hfi1: Use free_netdev() in hfi1_netdev_free()")
Link: https://lore.kernel.org/r/20200623203224.106975.16926.stgit@awfm-01.aw.intel.comReviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NKaike Wan <kaike.wan@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>

b46925a2

nvme-multipath: fix bogus request queue reference put · c3124466

由 Sagi Grimberg 提交于 6月 24, 2020

The mpath disk node takes a reference on the request mpath
request queue when adding live path to the mpath gendisk.
However if we connected to an inaccessible path device_add_disk
is not called, so if we disconnect and remove the mpath gendisk
we endup putting an reference on the request queue that was
never taken [1].

Fix that to check if we ever added a live path (using
NVME_NS_HEAD_HAS_DISK flag) and if not, clear the disk->queue
reference.

[1]:
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 1 PID: 1372 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
CPU: 1 PID: 1372 Comm: nvme Tainted: G           O      5.7.0-rc2+ #3
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1 04/01/2014
RIP: 0010:refcount_warn_saturate+0xa6/0xf0
RSP: 0018:ffffb29e8053bdc0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8b7a2f4fc060 RCX: 0000000000000007
RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff8b7a3ec99980
RBP: ffff8b7a2f4fc000 R08: 00000000000002e1 R09: 0000000000000004
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: fffffffffffffff2 R14: ffffb29e8053bf08 R15: ffff8b7a320e2da0
FS:  00007f135d4ca800(0000) GS:ffff8b7a3ec80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005651178c0c30 CR3: 000000003b650005 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 disk_release+0xa2/0xc0
 device_release+0x28/0x80
 kobject_put+0xa5/0x1b0
 nvme_put_ns_head+0x26/0x70 [nvme_core]
 nvme_put_ns+0x30/0x60 [nvme_core]
 nvme_remove_namespaces+0x9b/0xe0 [nvme_core]
 nvme_do_delete_ctrl+0x43/0x5c [nvme_core]
 nvme_sysfs_delete.cold+0x8/0xd [nvme_core]
 kernfs_fop_write+0xc1/0x1a0
 vfs_write+0xb6/0x1a0
 ksys_write+0x5f/0xe0
 do_syscall_64+0x52/0x1a0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported-by: NAnton Eidelman <anton@lightbitslabs.com>
Tested-by: NAnton Eidelman <anton@lightbitslabs.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

c3124466

nvme-multipath: fix deadlock due to head->lock · d8a22f85

由 Anton Eidelman 提交于 6月 24, 2020

In the following scenario scan_work and ana_work will deadlock:

When scan_work calls nvme_mpath_add_disk() this holds ana_lock
and invokes nvme_parse_ana_log(), which may issue IO
in device_add_disk() and hang waiting for an accessible path.

While nvme_mpath_set_live() only called when nvme_state_is_live(),
a transition may cause NVME_SC_ANA_TRANSITION and requeue the IO.

Since nvme_mpath_set_live() holds ns->head->lock, an ana_work on
ANY ctrl will not be able to complete nvme_mpath_set_live()
on the same ns->head, which is required in order to update
the new accessible path and remove NVME_NS_ANA_PENDING..
Therefore IO never completes: deadlock [1].

Fix:
Move device_add_disk out of the head->lock and protect it with an
atomic test_and_set for a new NVME_NS_HEAD_HAS_DISK bit.

[1]:
kernel: INFO: task kworker/u8:2:160 blocked for more than 120 seconds.
kernel:       Tainted: G           OE     5.3.5-050305-generic #201910071830
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: kworker/u8:2    D    0   160      2 0x80004000
kernel: Workqueue: nvme-wq nvme_ana_work [nvme_core]
kernel: Call Trace:
kernel:  __schedule+0x2b9/0x6c0
kernel:  schedule+0x42/0xb0
kernel:  schedule_preempt_disabled+0xe/0x10
kernel:  __mutex_lock.isra.0+0x182/0x4f0
kernel:  __mutex_lock_slowpath+0x13/0x20
kernel:  mutex_lock+0x2e/0x40
kernel:  nvme_update_ns_ana_state+0x22/0x60 [nvme_core]
kernel:  nvme_update_ana_state+0xca/0xe0 [nvme_core]
kernel:  nvme_parse_ana_log+0xa1/0x180 [nvme_core]
kernel:  nvme_read_ana_log+0x76/0x100 [nvme_core]
kernel:  nvme_ana_work+0x15/0x20 [nvme_core]
kernel:  process_one_work+0x1db/0x380
kernel:  worker_thread+0x4d/0x400
kernel:  kthread+0x104/0x140
kernel:  ret_from_fork+0x35/0x40
kernel: INFO: task kworker/u8:4:439 blocked for more than 120 seconds.
kernel:       Tainted: G           OE     5.3.5-050305-generic #201910071830
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: kworker/u8:4    D    0   439      2 0x80004000
kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core]
kernel: Call Trace:
kernel:  __schedule+0x2b9/0x6c0
kernel:  schedule+0x42/0xb0
kernel:  io_schedule+0x16/0x40
kernel:  do_read_cache_page+0x438/0x830
kernel:  read_cache_page+0x12/0x20
kernel:  read_dev_sector+0x27/0xc0
kernel:  read_lba+0xc1/0x220
kernel:  efi_partition+0x1e6/0x708
kernel:  check_partition+0x154/0x244
kernel:  rescan_partitions+0xae/0x280
kernel:  __blkdev_get+0x40f/0x560
kernel:  blkdev_get+0x3d/0x140
kernel:  __device_add_disk+0x388/0x480
kernel:  device_add_disk+0x13/0x20
kernel:  nvme_mpath_set_live+0x119/0x140 [nvme_core]
kernel:  nvme_update_ns_ana_state+0x5c/0x60 [nvme_core]
kernel:  nvme_mpath_add_disk+0xbe/0x100 [nvme_core]
kernel:  nvme_validate_ns+0x396/0x940 [nvme_core]
kernel:  nvme_scan_work+0x256/0x390 [nvme_core]
kernel:  process_one_work+0x1db/0x380
kernel:  worker_thread+0x4d/0x400
kernel:  kthread+0x104/0x140
kernel:  ret_from_fork+0x35/0x40

Fixes: 0d0b660f ("nvme: add ANA support")
Signed-off-by: NAnton Eidelman <anton@lightbitslabs.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

d8a22f85

nvme: don't protect ns mutation with ns->head->lock · e164471d

由 Sagi Grimberg 提交于 6月 24, 2020

Right now ns->head->lock is protecting namespace mutation
which is wrong and unneeded. Move it to only protect
against head mutations. While we're at it, remove unnecessary
ns->head reference as we already have head pointer.

The problem with this is that the head->lock spans
mpath disk node I/O that may block under some conditions (if
for example the controller is disconnecting or the path
became inaccessible), The locking scheme does not allow any
other path to enable itself, preventing blocked I/O to complete
and forward-progress from there.

This is a preparation patch for the fix in a subsequent patch
where the disk I/O will also be done outside the head->lock.

Fixes: 0d0b660f ("nvme: add ANA support")
Signed-off-by: NAnton Eidelman <anton@lightbitslabs.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

e164471d

nvme-multipath: fix deadlock between ana_work and scan_work · 489dd102

由 Anton Eidelman 提交于 6月 24, 2020

When scan_work calls nvme_mpath_add_disk() this holds ana_lock
and invokes nvme_parse_ana_log(), which may issue IO
in device_add_disk() and hang waiting for an accessible path.
While nvme_mpath_set_live() only called when nvme_state_is_live(),
a transition may cause NVME_SC_ANA_TRANSITION and requeue the IO.

In order to recover and complete the IO ana_work on the same ctrl
should be able to update the path state and remove NVME_NS_ANA_PENDING.

The deadlock occurs because scan_work keeps holding ana_lock,
so ana_work hangs [1].

Fix:
Now nvme_mpath_add_disk() uses nvme_parse_ana_log() to obtain a copy
of the ANA group desc, and then calls nvme_update_ns_ana_state() without
holding ana_lock.

[1]:
kernel: Workqueue: nvme-wq nvme_scan_work [nvme_core]
kernel: Call Trace:
kernel:  __schedule+0x2b9/0x6c0
kernel:  schedule+0x42/0xb0
kernel:  io_schedule+0x16/0x40
kernel:  do_read_cache_page+0x438/0x830
kernel:  read_cache_page+0x12/0x20
kernel:  read_dev_sector+0x27/0xc0
kernel:  read_lba+0xc1/0x220
kernel:  efi_partition+0x1e6/0x708
kernel:  check_partition+0x154/0x244
kernel:  rescan_partitions+0xae/0x280
kernel:  __blkdev_get+0x40f/0x560
kernel:  blkdev_get+0x3d/0x140
kernel:  __device_add_disk+0x388/0x480
kernel:  device_add_disk+0x13/0x20
kernel:  nvme_mpath_set_live+0x119/0x140 [nvme_core]
kernel:  nvme_update_ns_ana_state+0x5c/0x60 [nvme_core]
kernel:  nvme_set_ns_ana_state+0x1e/0x30 [nvme_core]
kernel:  nvme_parse_ana_log+0xa1/0x180 [nvme_core]
kernel:  nvme_mpath_add_disk+0x47/0x90 [nvme_core]
kernel:  nvme_validate_ns+0x396/0x940 [nvme_core]
kernel:  nvme_scan_work+0x24f/0x380 [nvme_core]
kernel:  process_one_work+0x1db/0x380
kernel:  worker_thread+0x249/0x400
kernel:  kthread+0x104/0x140

kernel: Workqueue: nvme-wq nvme_ana_work [nvme_core]
kernel: Call Trace:
kernel:  __schedule+0x2b9/0x6c0
kernel:  schedule+0x42/0xb0
kernel:  schedule_preempt_disabled+0xe/0x10
kernel:  __mutex_lock.isra.0+0x182/0x4f0
kernel:  ? __switch_to_asm+0x34/0x70
kernel:  ? select_task_rq_fair+0x1aa/0x5c0
kernel:  ? kvm_sched_clock_read+0x11/0x20
kernel:  ? sched_clock+0x9/0x10
kernel:  __mutex_lock_slowpath+0x13/0x20
kernel:  mutex_lock+0x2e/0x40
kernel:  nvme_read_ana_log+0x3a/0x100 [nvme_core]
kernel:  nvme_ana_work+0x15/0x20 [nvme_core]
kernel:  process_one_work+0x1db/0x380
kernel:  worker_thread+0x4d/0x400
kernel:  kthread+0x104/0x140
kernel:  ? process_one_work+0x380/0x380
kernel:  ? kthread_park+0x80/0x80
kernel:  ret_from_fork+0x35/0x40

Fixes: 0d0b660f ("nvme: add ANA support")
Signed-off-by: NAnton Eidelman <anton@lightbitslabs.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

489dd102

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功