1. 30 September 2022 (3 commits)
    • random: add 8-bit and 16-bit batches · 585cd5fe
      Jason A. Donenfeld committed
      There are numerous places in the kernel that would be sped up by having
      smaller batches. Currently those callsites do `get_random_u32() & 0xff`
      or similar. Since these are pretty spread out, and will require patches
      to multiple different trees, let's get ahead of the curve and lay the
      foundation for `get_random_u8()` and `get_random_u16()`, so that it's
      then possible to start submitting conversion patches at leisure (see
      the sketch after this entry).
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
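      A minimal sketch of the conversion this lays the groundwork for; the
      two callsites below are hypothetical, only the get_random_*() calls
      come from the commit:

          #include <linux/random.h>

          /* Before: a full 32-bit batched value is pulled and 24 of its
           * bits are thrown away. */
          static u8 random_byte_before(void)
          {
                  return get_random_u32() & 0xff;
          }

          /* After: the dedicated 8-bit batch serves the request directly,
           * so the batched entropy stretches four times as far. */
          static u8 random_byte_after(void)
          {
                  return get_random_u8();
          }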
    • random: use init_utsname() instead of utsname() · dd54fd7d
      Jason A. Donenfeld committed
      Rather than going through the current-> indirection for utsname: at
      this point in boot, init_utsname() == utsname(), so just use
      init_utsname() directly. Additionally, init_utsname() appears to be
      available nearly always, so move its use into random_init_early()
      (see the sketch after this entry).
      Suggested-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
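      For context, the two accessors differ only in where the name lives;
      the definitions below mirror include/linux/utsname.h, and the final
      function is a sketch of the call site (_mix_pool_bytes() is
      random.c's internal mixer):

          #include <linux/utsname.h>

          static inline struct new_utsname *utsname(void)
          {
                  return &current->nsproxy->uts_ns->name; /* current-> indirection */
          }

          static inline struct new_utsname *init_utsname(void)
          {
                  return &init_uts_ns.name;               /* static, always there */
          }

          /* Sketch of the usage in random_init_early(): */
          static void __init mix_utsname(void)
          {
                  _mix_pool_bytes(init_utsname(), sizeof(*(init_utsname())));
          }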
    • random: split initialization into early step and later step · f6238499
      Jason A. Donenfeld committed
      The full RNG initialization relies on some timestamps, made possible
      with initialization functions like time_init() and timekeeping_init().
      However, these are only available rather late in initialization.
      Meanwhile, other things, such as memory allocator functions, make use of
      the RNG much earlier.
      
      So split RNG initialization into two phases. We can provide arch
      randomness very early on, and then later, after timekeeping and such
      are available, initialize the rest (see the sketch after this entry).
      
      This ensures that, for example, slabs are properly randomized if RDRAND
      is available. Without this, CONFIG_SLAB_FREELIST_RANDOM=y loses a degree
      of its security, because its random seed is potentially deterministic,
      since it hasn't yet incorporated RDRAND. It also makes it possible to
      use a better seed in kfence, which currently relies on only the cycle
      counter.
      
      Another positive consequence is that on systems with RDRAND, running
      with CONFIG_WARN_ALL_UNSEEDED_RANDOM=y results in no warnings at all.
      
      One subtle side effect of this change is that on systems with no RDRAND,
      RDTSC is now only queried by random_init() once, committing the moment
      of the function call, instead of multiple times as before. This is
      intentional, as the multiple RDTSCs in a loop before weren't
      accomplishing very much, with jitter being better provided by
      try_to_generate_entropy(). Plus, filling blocks with RDTSC is still
      being done in extract_entropy(), which is necessarily called before
      random bytes are served anyway.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
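      A simplified sketch of the split; the function names follow the
      commit, but the bodies here are illustrative assumptions, not the
      exact upstream code:

          /* Phase 1, very early: arch randomness (e.g. RDRAND) only, so
           * early users such as the slab freelist randomizer already get
           * a non-deterministic seed. */
          void __init random_init_early(const char *command_line)
          {
                  unsigned long v;
                  int i;

                  for (i = 0; i < 4; i++)
                          if (arch_get_random_seed_long(&v) ||
                              arch_get_random_long(&v))
                                  _mix_pool_bytes(&v, sizeof(v));
                  _mix_pool_bytes(command_line, strlen(command_line));
          }

          /* Phase 2, after time_init()/timekeeping_init(): fold in clock
           * entropy, committing the moment of the call with one counter
           * read instead of the old RDTSC loop. */
          void __init random_init(void)
          {
                  unsigned long entropy = random_get_entropy();
                  ktime_t now = ktime_get_real();

                  _mix_pool_bytes(&now, sizeof(now));
                  _mix_pool_bytes(&entropy, sizeof(entropy));
          }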
  2. 29 September 2022 (2 commits)
    • random: use expired timer rather than wq for mixing fast pool · 748bc4dd
      Jason A. Donenfeld committed
      Previously, the fast pool was dumped into the main pool periodically in
      the fast pool's hard IRQ handler. This worked fine and there weren't
      problems with it, until RT came around. Since RT converts spinlocks into
      sleeping locks, problems cropped up. Rather than switching to raw
      spinlocks, the RT developers preferred we make the transformation from
      originally doing:
      
          do_some_stuff()
          spin_lock()
          do_some_other_stuff()
          spin_unlock()
      
      to doing:
      
          do_some_stuff()
          queue_work_on(some_other_stuff_worker)
      
      This is an ordinary pattern done all over the kernel. However, Sherry
      noticed a 10% performance regression in qperf TCP over a 40gbps
      InfiniBand card. Quoting her message:
      
      > MT27500 Family [ConnectX-3] cards:
      > Infiniband device 'mlx4_0' port 1 status:
      > default gid: fe80:0000:0000:0000:0010:e000:0178:9eb1
      > base lid: 0x6
      > sm lid: 0x1
      > state: 4: ACTIVE
      > phys state: 5: LinkUp
      > rate: 40 Gb/sec (4X QDR)
      > link_layer: InfiniBand
      >
      > Cards are configured with IP addresses on private subnet for IPoIB
      > performance testing.
      > Regression identified in this bug is in TCP latency in this stack as reported
      > by qperf tcp_lat metric:
      >
      > We have one system listen as a qperf server:
      > [root@yourQperfServer ~]# qperf
      >
      > Have the other system connect to qperf server as a client (in this
      > case, it’s X7 server with Mellanox card):
      > [root@yourQperfClient ~]# numactl -m0 -N0 qperf 20.20.20.101 -v -uu -ub --time 60 --wait_server 20 -oo msg_size:4K:1024K:*2 tcp_lat
      
      Rather than incur the scheduling latency of queue_work_on(), switch
      to running on the next timer tick, on the same core. This also
      batches things a bit more (once per jiffy), which is okay now that
      mix_interrupt_randomness() can credit multiple bits at once (see the
      sketch after this entry).
      Reported-by: Sherry Yang <sherry.yang@oracle.com>
      Tested-by: Paul Webb <paul.x.webb@oracle.com>
      Cc: Sherry Yang <sherry.yang@oracle.com>
      Cc: Phillip Goerl <phillip.goerl@oracle.com>
      Cc: Jack Vogel <jack.vogel@oracle.com>
      Cc: Nicky Veitch <nicky.veitch@oracle.com>
      Cc: Colm Harrington <colm.harrington@oracle.com>
      Cc: Ramanan Govindarajan <ramanan.govindarajan@oracle.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Sultan Alsawaf <sultan@kerneltoast.com>
      Cc: stable@vger.kernel.org
      Fixes: 58340f8e ("random: defer fast pool mixing to worker")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
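      A sketch of the replacement, with assumed field and helper names:
      arming a timer whose expiry is "now" makes it fire on the next tick,
      on the same core, without going through the scheduler at all:

          #include <linux/timer.h>

          struct fast_pool {
                  unsigned long pool[4];
                  unsigned long last;
                  unsigned int count;
                  struct timer_list mix;  /* was: struct work_struct mix */
          };

          /* Called from the hard IRQ handler once the pool is full. */
          static void defer_mixing(struct fast_pool *fast_pool)
          {
                  if (!timer_pending(&fast_pool->mix)) {
                          fast_pool->mix.expires = jiffies; /* already expired */
                          add_timer_on(&fast_pool->mix, raw_smp_processor_id());
                  }
          }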
    • random: avoid reading two cache lines on irq randomness · 9ee0507e
      Jason A. Donenfeld committed
      In order to avoid reading and dirtying two cache lines on every IRQ,
      move the work_struct to the bottom of the fast_pool struct.
      add_interrupt_randomness() always touches .pool and .count, which are
      currently split across cache lines because .mix pushes everything
      down. Instead, move .mix to the bottom, so that .pool and .count
      always share the first cache line; .mix is only accessed when the
      pool is full (see the layout sketch after this entry).
      
      Fixes: 58340f8e ("random: defer fast pool mixing to worker")
      Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
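      The layouts before and after, abridged; field names follow the
      commit text:

          #include <linux/workqueue.h>

          /* Before: the work_struct at the top pushes the hot members
           * apart, so the IRQ path touches two cache lines. */
          struct fast_pool_before {
                  struct work_struct mix;
                  unsigned long pool[4];
                  unsigned long last;
                  unsigned int count;
          };

          /* After: .pool and .count sit together in the first cache line;
           * .mix only matters when the pool fills and must be flushed. */
          struct fast_pool_after {
                  unsigned long pool[4];
                  unsigned long last;
                  unsigned int count;
                  struct work_struct mix;
          };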
  3. 23 September 2022 (4 commits)
  4. 22 September 2022 (8 commits)
    • bnxt: prevent skb UAF after handing over to PTP worker · c31f26c8
      Jakub Kicinski committed
      When reading the timestamp is required, bnxt_tx_int() hands
      ownership of the completed skb over to the PTP worker. The skb must
      not be used afterwards, as the worker may run before the rest of our
      code and free it, leading to a use-after-free.
      
      Since dev_kfree_skb_any() accepts NULL, make the loss of ownership
      more obvious by setting skb to NULL (see the sketch after this
      entry).
      
      Fixes: 83bb623c ("bnxt_en: Transmit and retrieve packet timestamps")
      Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20220921201005.335390-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
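      A sketch of the pattern; the hand-off helper is hypothetical, only
      the NULL-ing and the dev_kfree_skb_any() behavior come from the
      commit:

          #include <linux/netdevice.h>

          void ptp_hand_off_skb(struct sk_buff *skb); /* hypothetical hand-off */

          static void tx_complete_fragment(struct sk_buff *skb, bool needs_tstamp)
          {
                  if (needs_tstamp) {
                          ptp_hand_off_skb(skb);  /* worker owns the skb now */
                          skb = NULL;             /* any later use would be a UAF */
                  }

                  /* Common cleanup: dev_kfree_skb_any() accepts NULL, so
                   * the lost ownership needs no extra branching here. */
                  dev_kfree_skb_any(skb);
          }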
    • net: marvell: Fix refcounting bugs in prestera_port_sfp_bind() · 3aac7ada
      Liang He committed
      In prestera_port_sfp_bind(), there are two refcounting bugs:
      (1) we should call of_node_get() before of_find_node_by_name(), as
      that function automatically decrements the refcount of its 'from'
      argument;
      (2) we should call of_node_put() on 'child' when breaking out of
      for_each_child_of_node(), as the iterator automatically takes and
      drops a reference on each child, and a break skips the final drop
      (see the sketch after this entry).
      
      Fixes: 52323ef7 ("net: marvell: prestera: add phylink support")
      Signed-off-by: Liang He <windhl@126.com>
      Reviewed-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
      Link: https://lore.kernel.org/r/20220921133245.4111672-1-windhl@126.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
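      A sketch of the corrected pattern; the node and property names are
      illustrative, not the driver's actual ones:

          #include <linux/of.h>

          static int sfp_bind_sketch(struct device_node *np)
          {
                  struct device_node *ports, *child;

                  /* (1) of_find_node_by_name() drops a reference on its
                   * 'from' argument, so take one first if 'np' must
                   * survive the call. */
                  of_node_get(np);
                  ports = of_find_node_by_name(np, "ports");

                  for_each_child_of_node(ports, child) {
                          if (of_property_read_bool(child, "sfp")) {
                                  /* (2) the iterator holds a reference on
                                   * 'child' during each pass and drops it
                                   * on the next; breaking out skips that
                                   * drop, so do it by hand. */
                                  of_node_put(child);
                                  break;
                          }
                  }

                  of_node_put(ports);
                  return 0;
          }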
    • net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD · 878e2405
      Sean Anderson committed
      There is a separate receive path for small packets (under 256 bytes).
      Instead of allocating a new DMA-capable skb to be used for the next
      packet, this path allocates an skb and copies the data into it
      (reusing the existing skb for the next packet). There are two bytes
      of junk data at the beginning of every packet; I believe these are
      inserted to allow aligned DMA and IP headers, and we skip over them
      with skb_reserve. Before copying over the data, we must sync the
      buffer so that we see the whole packet. The current code only
      synchronizes len bytes, starting from the beginning of the packet
      and including the junk bytes, which leaves off the final two bytes
      of the packet. Synchronize the whole packet (see the sketch after
      this entry).
      
      To reproduce this problem, ping a HME with a payload size between 17
      and 214:
      
      	$ ping -s 17 <hme_address>
      
      which will complain rather loudly about the data mismatch. Small packets
      (below 60 bytes on the wire) do not have this issue. I suspect this is
      related to the padding added to increase the minimum packet size.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Sean Anderson <seanga2@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220920235018.1675956-1-seanga2@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
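      A sketch of the copy path with the fix applied; names are simplified,
      and the two-byte pad size comes from the commit text:

          #include <linux/dma-mapping.h>
          #include <linux/skbuff.h>

          /* The packet occupies len + 2 bytes in the DMA buffer: two pad
           * bytes for alignment plus the payload. The CPU sync must cover
           * len + 2, not len, or the last two payload bytes may be stale. */
          static struct sk_buff *copy_small_packet(struct net_device *ndev,
                                                   struct device *dev,
                                                   const u8 *buf,
                                                   dma_addr_t mapping, int len)
          {
                  struct sk_buff *copy = netdev_alloc_skb(ndev, len + 2);

                  if (!copy)
                          return NULL;

                  /* was: ... mapping, len, DMA_FROM_DEVICE); */
                  dma_sync_single_for_cpu(dev, mapping, len + 2, DMA_FROM_DEVICE);

                  skb_reserve(copy, 2);              /* skip pad, align IP header */
                  skb_put_data(copy, buf + 2, len);  /* copy the payload only */

                  dma_sync_single_for_device(dev, mapping, len + 2, DMA_FROM_DEVICE);
                  return copy;
          }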
    • bonding: fix NULL deref in bond_rr_gen_slave_id · 0e400d60
      Jonathan Toppins committed
      Fix a NULL dereference of the struct bonding rr_tx_counter member:
      if a bond is initially created with a mode other than zero (Round
      Robin), the memory backing the counter is never allocated, and
      nothing verifies or allocates it when the mode is later switched.
      
      This causes the following Oops on an aarch64 machine:
          [  334.686773] Unable to handle kernel paging request at virtual address ffff2c91ac905000
          [  334.694703] Mem abort info:
          [  334.697486]   ESR = 0x0000000096000004
          [  334.701234]   EC = 0x25: DABT (current EL), IL = 32 bits
          [  334.706536]   SET = 0, FnV = 0
          [  334.709579]   EA = 0, S1PTW = 0
          [  334.712719]   FSC = 0x04: level 0 translation fault
          [  334.717586] Data abort info:
          [  334.720454]   ISV = 0, ISS = 0x00000004
          [  334.724288]   CM = 0, WnR = 0
          [  334.727244] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000008044d662000
          [  334.733944] [ffff2c91ac905000] pgd=0000000000000000, p4d=0000000000000000
          [  334.740734] Internal error: Oops: 96000004 [#1] SMP
          [  334.745602] Modules linked in: bonding tls veth rfkill sunrpc arm_spe_pmu vfat fat acpi_ipmi ipmi_ssif ixgbe igb i40e mdio ipmi_devintf ipmi_msghandler arm_cmn arm_dsu_pmu cppc_cpufreq acpi_tad fuse zram crct10dif_ce ast ghash_ce sbsa_gwdt nvme drm_vram_helper drm_ttm_helper nvme_core ttm xgene_hwmon
          [  334.772217] CPU: 7 PID: 2214 Comm: ping Not tainted 6.0.0-rc4-00133-g64ae13ed #4
          [  334.779950] Hardware name: GIGABYTE R272-P31-00/MP32-AR1-00, BIOS F18v (SCP: 1.08.20211002) 12/01/2021
          [  334.789244] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
          [  334.796196] pc : bond_rr_gen_slave_id+0x40/0x124 [bonding]
          [  334.801691] lr : bond_xmit_roundrobin_slave_get+0x38/0xdc [bonding]
          [  334.807962] sp : ffff8000221733e0
          [  334.811265] x29: ffff8000221733e0 x28: ffffdbac8572d198 x27: ffff80002217357c
          [  334.818392] x26: 000000000000002a x25: ffffdbacb33ee000 x24: ffff07ff980fa000
          [  334.825519] x23: ffffdbacb2e398ba x22: ffff07ff98102000 x21: ffff07ff981029c0
          [  334.832646] x20: 0000000000000001 x19: ffff07ff981029c0 x18: 0000000000000014
          [  334.839773] x17: 0000000000000000 x16: ffffdbacb1004364 x15: 0000aaaabe2f5a62
          [  334.846899] x14: ffff07ff8e55d968 x13: ffff07ff8e55db30 x12: 0000000000000000
          [  334.854026] x11: ffffdbacb21532e8 x10: 0000000000000001 x9 : ffffdbac857178ec
          [  334.861153] x8 : ffff07ff9f6e5a28 x7 : 0000000000000000 x6 : 000000007c2b3742
          [  334.868279] x5 : ffff2c91ac905000 x4 : ffff2c91ac905000 x3 : ffff07ff9f554400
          [  334.875406] x2 : ffff2c91ac905000 x1 : 0000000000000001 x0 : ffff07ff981029c0
          [  334.882532] Call trace:
          [  334.884967]  bond_rr_gen_slave_id+0x40/0x124 [bonding]
          [  334.890109]  bond_xmit_roundrobin_slave_get+0x38/0xdc [bonding]
          [  334.896033]  __bond_start_xmit+0x128/0x3a0 [bonding]
          [  334.901001]  bond_start_xmit+0x54/0xb0 [bonding]
          [  334.905622]  dev_hard_start_xmit+0xb4/0x220
          [  334.909798]  __dev_queue_xmit+0x1a0/0x720
          [  334.913799]  arp_xmit+0x3c/0xbc
          [  334.916932]  arp_send_dst+0x98/0xd0
          [  334.920410]  arp_solicit+0xe8/0x230
          [  334.923888]  neigh_probe+0x60/0xb0
          [  334.927279]  __neigh_event_send+0x3b0/0x470
          [  334.931453]  neigh_resolve_output+0x70/0x90
          [  334.935626]  ip_finish_output2+0x158/0x514
          [  334.939714]  __ip_finish_output+0xac/0x1a4
          [  334.943800]  ip_finish_output+0x40/0xfc
          [  334.947626]  ip_output+0xf8/0x1a4
          [  334.950931]  ip_send_skb+0x5c/0x100
          [  334.954410]  ip_push_pending_frames+0x3c/0x60
          [  334.958758]  raw_sendmsg+0x458/0x6d0
          [  334.962325]  inet_sendmsg+0x50/0x80
          [  334.965805]  sock_sendmsg+0x60/0x6c
          [  334.969286]  __sys_sendto+0xc8/0x134
          [  334.972853]  __arm64_sys_sendto+0x34/0x4c
          [  334.976854]  invoke_syscall+0x78/0x100
          [  334.980594]  el0_svc_common.constprop.0+0x4c/0xf4
          [  334.985287]  do_el0_svc+0x38/0x4c
          [  334.988591]  el0_svc+0x34/0x10c
          [  334.991724]  el0t_64_sync_handler+0x11c/0x150
          [  334.996072]  el0t_64_sync+0x190/0x194
          [  334.999726] Code: b9001062 f9403c02 d53cd044 8b040042 (b8210040)
          [  335.005810] ---[ end trace 0000000000000000 ]---
          [  335.010416] Kernel panic - not syncing: Oops: Fatal exception in interrupt
          [  335.017279] SMP: stopping secondary CPUs
          [  335.021374] Kernel Offset: 0x5baca8eb0000 from 0xffff800008000000
          [  335.027456] PHYS_OFFSET: 0x80000000
          [  335.030932] CPU features: 0x0000,0085c029,19805c82
          [  335.035713] Memory Limit: none
          [  335.038756] Rebooting in 180 seconds..
      
      The fix is to allocate the counter in bond_open(), which is
      guaranteed to be called before any packets are processed (see the
      sketch after this entry).
      
      Fixes: 848ca918 ("net: bonding: Use per-cpu rr_tx_counter")
      CC: Jussi Maki <joamaki@gmail.com>
      Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
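      A sketch of the fix as described: allocate the per-CPU counter
      unconditionally on open, so a later switch to round-robin mode
      always finds it present; the bodies are abridged:

          #include <linux/netdevice.h>
          #include <linux/percpu.h>

          static int bond_open(struct net_device *bond_dev)
          {
                  struct bonding *bond = netdev_priv(bond_dev);

                  bond->rr_tx_counter = alloc_percpu(u32);
                  if (!bond->rr_tx_counter)
                          return -ENOMEM;

                  /* ... rest of bond_open() ... */
                  return 0;
          }

          static int bond_close(struct net_device *bond_dev)
          {
                  struct bonding *bond = netdev_priv(bond_dev);

                  free_percpu(bond->rr_tx_counter);
                  bond->rr_tx_counter = NULL;
                  /* ... */
                  return 0;
          }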
    • net: phy: micrel: fix shared interrupt on LAN8814 · 2002fbac
      Michael Walle committed
      Since commit ece19502 ("net: phy: micrel: 1588 support for LAN8814
      phy") the handler always returns IRQ_HANDLED, except in an error case.
      Before that commit, the interrupt status register was checked, and if
      it was empty, IRQ_NONE was returned. Restore that behavior to play
      nice with other devices sharing the interrupt line (see the sketch
      after this entry).
      
      Fixes: ece19502 ("net: phy: micrel: 1588 support for LAN8814 phy")
      Signed-off-by: Michael Walle <michael@walle.cc>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Reviewed-by: Divya Koppera <Divya.Koppera@microchip.com>
      Link: https://lore.kernel.org/r/20220920141619.808117-1-michael@walle.cc
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
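      A sketch of the restored convention; the status register name and
      address below are placeholders, not the driver's actual ones:

          #include <linux/phy.h>

          #define INT_STATUS_REG 0x1d /* placeholder register address */

          /* Shared-IRQ contract: return IRQ_NONE when our status register
           * is empty, so the IRQ core can try the other handlers on the
           * line (and detect a genuinely stuck interrupt). */
          static irqreturn_t handle_interrupt_sketch(struct phy_device *phydev)
          {
                  int status = phy_read(phydev, INT_STATUS_REG);

                  if (status < 0)         /* read error */
                          return IRQ_NONE;

                  if (status == 0)        /* not our interrupt */
                          return IRQ_NONE;

                  /* ... acknowledge and process the pending events ... */
                  return IRQ_HANDLED;
          }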
    • efi: libstub: check Shim mode using MokSBStateRT · 5f56a74c
      Ard Biesheuvel committed
      We currently check the MokSBState variable to decide whether we should
      treat UEFI secure boot as being disabled, even if the firmware thinks
      otherwise. This is used by shim to indicate that it is not checking
      signatures on boot images. In the kernel, we use this to relax lockdown
      policies.
      
      However, in cases where shim is not even being used, we don't want this
      variable to interfere with lockdown, given that the variable may be
      non-volatile and therefore persist across a reboot. This means setting
      it once will persistently disable lockdown checks on a given system.
      
      So switch to the mirrored version of this variable, called MokSBStateRT,
      which is supposed to be volatile, a property we can actually check
      (see the sketch after this entry).
      
      Cc: <stable@vger.kernel.org> # v4.19+
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Reviewed-by: Peter Jones <pjones@redhat.com>
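      A sketch of the check, using libstub's get_efi_var() helper; the
      exact attribute test is an assumption based on the commit's reasoning
      that the RT variable must be volatile to be trusted:

          #include <linux/efi.h>
          /* get_efi_var() is the libstub-internal wrapper from efistub.h */

          static bool shim_validation_disabled(void)
          {
                  efi_guid_t shim_guid = EFI_SHIM_LOCK_GUID;
                  unsigned long size = sizeof(u8);
                  efi_status_t status;
                  u32 attr;
                  u8 val;

                  status = get_efi_var(L"MokSBStateRT", &shim_guid,
                                       &attr, &size, &val);

                  return status == EFI_SUCCESS &&
                         !(attr & EFI_VARIABLE_NON_VOLATILE) && /* must be volatile */
                         val == 1;
          }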
    • efi: x86: Wipe setup_data on pure EFI boot · 63bf28ce
      Ard Biesheuvel committed
      When booting the x86 kernel via EFI using the LoadImage/StartImage boot
      services [as opposed to the deprecated EFI handover protocol], the setup
      header is taken from the image directly, and given that EFI's LoadImage
      has no Linux/x86-specific knowledge of struct boot_params or
      struct setup_header, any absolute addresses in the setup header must
      originate from the file and not from a prior loading stage.
      
      Since we cannot generally predict where LoadImage() decides to load an
      image (*), such absolute addresses must be treated as suspect: even if a
      prior boot stage intended to make them point somewhere inside the
      [signed] image, there is no way to validate that, and if they point at
      an arbitrary location in memory, the setup_data nodes will not be
      covered by any signatures or TPM measurements either, and could be made
      to contain an arbitrary sequence of SETUP_xxx nodes, which could
      interfere quite badly with the early x86 boot sequence (see the
      sketch after this entry).
      
      (*) Note that, while LoadImage() does take a buffer/size tuple in
      addition to a device path, which can be used to provide the image
      contents directly, it will re-allocate such images, as the memory
      footprint of an image is generally larger than the PE/COFF file
      representation.
      
      Cc: <stable@vger.kernel.org> # v5.10+
      Link: https://lore.kernel.org/all/20220904165321.1140894-1-Jason@zx2c4.com/
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
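      A sketch of the hardening; the placement in the stub's PE entry path
      is assumed from the commit text, and the function shown is a
      fragment, not the full entry point:

          #include <linux/efi.h>
          #include <asm/bootparam.h>

          static efi_status_t pe_entry_fragment(struct boot_params *boot_params)
          {
                  /* The SETUP_xxx chain is rooted here; on a pure
                   * LoadImage/StartImage boot nothing vouches for it,
                   * so drop it. */
                  boot_params->hdr.setup_data = 0;

                  return EFI_SUCCESS;
          }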
    • ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficient · 114f398d
      Larysa Zaremba committed
      The original patch added the static branch to handle the situation,
      when assigning an XDP TX queue to every CPU is not possible,
      so they have to be shared.
      
      However, in the XDP transmit handler ice_xdp_xmit(), an error was
      returned in such cases even before the static condition was checked,
      which still made queue sharing impossible (see the sketch after this
      entry).
      
      Fixes: 22bf877e ("ice: introduce XDP_TX fallback path")
      Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
      Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Link: https://lore.kernel.org/r/20220919134346.25030-1-larysa.zaremba@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
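      A sketch of the corrected ordering; the static key here stands in
      for the driver's ice_xdp_locking_key, and the helper is illustrative:

          #include <linux/jump_label.h>
          #include <linux/errno.h>

          DEFINE_STATIC_KEY_FALSE(xdp_locking_key);

          /* A CPU index beyond the queue count is only an error when
           * queue sharing (the static branch) is off; with sharing on,
           * remap onto the smaller queue set instead. */
          static int pick_xdp_queue(unsigned int cpu, unsigned int num_xdp_txq)
          {
                  if (static_branch_unlikely(&xdp_locking_key))
                          return cpu % num_xdp_txq; /* shared queues, lock on xmit */

                  if (cpu >= num_xdp_txq)
                          return -ENXIO; /* previously hit before the check above */

                  return cpu;
          }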
  5. 21 September 2022 (19 commits)
  6. 20 September 2022 (4 commits)