提交 · d3f6bacfca86f6cf6bf85be1e8b54083d68d8195 · openeuler / Kernel

31 10月, 2022 2 次提交

drm/i915: stop abusing swiotlb_max_segment · d3f6bacf

由 Robert Beckett 提交于 10月 20, 2022

swiotlb_max_segment used to return either the maximum size that swiotlb
could bounce, or for Xen PV PAGE_SIZE even if swiotlb could bounce buffer
larger mappings.  This made i915 on Xen PV work as it bypasses the
coherency aspect of the DMA API and can't cope with bounce buffering
and this avoided bounce buffering for the Xen/PV case.

So instead of adding this hack back, check for Xen/PV directly in i915
for the Xen case and otherwise use the proper DMA API helper to query
the maximum mapping size.

Replace swiotlb_max_segment() calls with dma_max_mapping_size().
In i915_gem_object_get_pages_internal() no longer consider max_segment
only if CONFIG_SWIOTLB is enabled. There can be other (iommu related)
causes of specific max segment sizes.

Fixes: a2daa27c ("swiotlb: simplify swiotlb_max_segment")
Reported-by: NMarek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: NRobert Beckett <bob.beckett@collabora.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
[hch: added the Xen hack, rewrote the changelog]
Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221020110308.1582518-1-hch@lst.de
(cherry picked from commit 78a07fe7)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>

d3f6bacf

drm/i915/tgl+: Add locking around DKL PHY register accesses · d7164a50

由 Imre Deak 提交于 10月 25, 2022

Accessing the TypeC DKL PHY registers during modeset-commit,
-verification, DP link-retraining and AUX power well toggling is racy
due to these code paths being concurrent and the PHY register bank
selection register (HIP_INDEX_REG) being shared between PHY instances
(aka TC ports) and the bank selection being not atomic wrt. the actual
PHY register access.

Add the required locking around each PHY register bank selection->
register access sequence.

Kudos to Ville for noticing the race conditions.

v2:
- Add the DKL PHY register accessors to intel_dkl_phy.[ch]. (Jani)
- Make the DKL_REG_TC_PORT macro independent of PHY internals.
- Move initing the DKL PHY lock to a more logical place.

v3:
- Fix parameter reuse in the DKL_REG_TC_PORT definition.
- Document the usage of phy_lock.

v4:
- Fix adding TC_PORT_1 offset in the DKL_REG_TC_PORT definition.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Acked-by: NJani Nikula <jani.nikula@intel.com>
Signed-off-by: NImre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221025114457.2191004-1-imre.deak@intel.com
(cherry picked from commit 89cb0ba4)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>

d7164a50

29 10月, 2022 3 次提交

platform/loongarch: laptop: Fix possible UAF and simplify generic_acpi_laptop_init() · d8191691

由 Yang Yingliang 提交于 10月 29, 2022

Currently the return value of 'sub_driver->init' is not checked. If
sparse_keymap_setup() called in the init function fails, 'generic_
inputdev' is freed, then it will lead a UAF when using it in generic_
acpi_laptop_init(). Fix it by checking the return value and setting
generic_inputdev to NULL after free, so as to avoid double free it.

The error code in generic_subdriver_init() is always negative, so the
return of generic_subdriver_init() can be simplified.

Fixes: 6246ed09 ("LoongArch: Add ACPI-based generic laptop driver")
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NHuacai Chen <chenhuacai@loongson.cn>

d8191691

platform/loongarch: laptop: Adjust resume order for loongson_hotkey_resume() · fbe605ab

由 Huacai Chen 提交于 10月 29, 2022

Some laptops don't support SW_LID, but still have backlight control,
move backlight resuming before SW_LID event handling so as to avoid
backlight mistake due to early return.
Signed-off-by: NHuacai Chen <chenhuacai@loongson.cn>

fbe605ab

random: use arch_get_random*_early() in random_init() · f5e4ec15

由 Jean-Philippe Brucker 提交于 10月 28, 2022

While reworking the archrandom handling, commit d349ab99 ("random:
handle archrandom with multiple longs") switched to the non-early
archrandom helpers in random_init(), which broke initialization of the
entropy pool from the arm64 random generator.

Indeed at that point the arm64 CPU features, which verify that all CPUs
have compatible capabilities, are not finalized so arch_get_random_seed_longs()
is unsuccessful. Instead random_init() should use the _early functions,
which check only the boot CPU on arm64. On other architectures the
_early functions directly call the normal ones.

Fixes: d349ab99 ("random: handle archrandom with multiple longs")
Cc: stable@vger.kernel.org
Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>

f5e4ec15

28 10月, 2022 21 次提交

net: enetc: survive memory pressure without crashing · 84ce1ca3

由 Vladimir Oltean 提交于 10月 27, 2022

Under memory pressure, enetc_refill_rx_ring() may fail, and when called
during the enetc_open() -> enetc_setup_rxbdr() procedure, this is not
checked for.

An extreme case of memory pressure will result in exactly zero buffers
being allocated for the RX ring, and in such a case it is expected that
hardware drops all RX packets due to lack of buffers.

This does not happen, because the reset-default value of the consumer
and produces index is 0, and this makes the ENETC think that all buffers
have been initialized and that it owns them (when in reality none were).

The hardware guide explains this best:

| Configure the receive ring producer index register RBaPIR with a value
| of 0. The producer index is initially configured by software but owned
| by hardware after the ring has been enabled. Hardware increments the
| index when a frame is received which may consume one or more BDs.
| Hardware is not allowed to increment the producer index to match the
| consumer index since it is used to indicate an empty condition. The ring
| can hold at most RBLENR[LENGTH]-1 received BDs.
|
| Configure the receive ring consumer index register RBaCIR. The
| consumer index is owned by software and updated during operation of the
| of the BD ring by software, to indicate that any receive data occupied
| in the BD has been processed and it has been prepared for new data.
| - If consumer index and producer index are initialized to the same
|   value, it indicates that all BDs in the ring have been prepared and
|   hardware owns all of the entries.
| - If consumer index is initialized to producer index plus N, it would
|   indicate N BDs have been prepared. Note that hardware cannot start if
|   only a single buffer is prepared due to the restrictions described in
|   (2).
| - Software may write consumer index to match producer index anytime
|   while the ring is operational to indicate all received BDs prior have
|   been processed and new BDs prepared for hardware.

Normally, the value of rx_ring->rcir (consumer index) is brought in sync
with the rx_ring->next_to_use software index, but this only happens if
page allocation ever succeeded.

When PI==CI==0, the hardware appears to receive frames and write them to
DMA address 0x0 (?!), then set the READY bit in the BD.

The enetc_clean_rx_ring() function (and its XDP derivative) is naturally
not prepared to handle such a condition. It will attempt to process
those frames using the rx_swbd structure associated with index i of the
RX ring, but that structure is not fully initialized (enetc_new_page()
does all of that). So what happens next is undefined behavior.

To operate using no buffer, we must initialize the CI to PI + 1, which
will block the hardware from advancing the CI any further, and drop
everything.

The issue was seen while adding support for zero-copy AF_XDP sockets,
where buffer memory comes from user space, which can even decide to
supply no buffers at all (example: "xdpsock --txonly"). However, the bug
is present also with the network stack code, even though it would take a
very determined person to trigger a page allocation failure at the
perfect time (a series of ifup/ifdown under memory pressure should
eventually reproduce it given enough retries).

Fixes: d4fd0404 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: NVladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: NClaudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20221027182925.3256653-1-vladimir.oltean@nxp.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

84ce1ca3

fbdev: cyber2000fb: fix missing pci_disable_device() · 3c6bf6bd

由 Yang Yingliang 提交于 10月 24, 2022

Add missing pci_disable_device() in error path of probe() and remove() path.

Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: NHelge Deller <deller@gmx.de>

3c6bf6bd

net/mlx5e: Fix macsec sci endianness at rx sa update · 12ba40ba

由 Raed Salem 提交于 10月 26, 2022

The cited commit at rx sa update operation passes the sci object
attribute, in the wrong endianness and not as expected by the HW
effectively create malformed hw sa context in case of update rx sa
consequently, HW produces unexpected MACsec packets which uses this
sa.

Fix by passing sci to create macsec object with the correct endianness,
while at it add __force u64 to prevent sparse check error of type
"sparse: error: incorrect type in assignment".

Fixes: aae3454e ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: NRaed Salem <raeds@nvidia.com>
Reviewed-by: NTariq Toukan <tariqt@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-16-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

12ba40ba

net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function · d5509564

由 Raed Salem 提交于 10月 26, 2022

The cited commit produces a sparse check error of type
"sparse: error: restricted __be64 degrades to integer". The
offending line wrongly did a bitwise operation between two different
storage types one of 64 bit when the other smaller side is 16 bit
which caused the above sparse error, furthermore bitwise operation
usage here is wrong in the first place as the constant MACSEC_PORT_ES
is not a bitwise field.

Fix by using the right mask to get the lower 16 bit if the sci number,
and use comparison operator '==' instead of bitwise '&' operator.

Fixes: 3b20949c ("net/mlx5e: Add MACsec RX steering rules")
Signed-off-by: NRaed Salem <raeds@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-15-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

d5509564

net/mlx5e: Fix macsec rx security association (SA) update/delete · 74573e38

由 Raed Salem 提交于 10月 26, 2022

The cited commit adds the support for update/delete MACsec Rx SA,
naturally, these operations need to check if the SA in question exists
to update/delete the SA and return error code otherwise, however they
do just the opposite i.e. return with error if the SA exists

Fix by change the check to return error in case the SA in question does
not exist, adjust error message and code accordingly.

Fixes: aae3454e ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: NRaed Salem <raeds@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-14-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

74573e38

net/mlx5e: Fix macsec coverity issue at rx sa update · d3ecf037

由 Raed Salem 提交于 10月 26, 2022

The cited commit at update rx sa operation passes object attributes
to MACsec object create function without initializing/setting all
attributes fields leaving some of them with garbage values, therefore
violating the implicit assumption at create object function, which
assumes that all input object attributes fields are set.

Fix by initializing the object attributes struct to zero, thus leaving
unset fields with the legal zero value.

Fixes: aae3454e ("net/mlx5e: Add MACsec offload Rx command support")
Signed-off-by: NRaed Salem <raeds@nvidia.com>
Reviewed-by: NLior Nahmanson <liorna@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-13-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

d3ecf037

net/mlx5: Fix crash during sync firmware reset · aefb62a9

由 Suresh Devarakonda 提交于 10月 26, 2022

When setting Bluefield to DPU NIC mode using mlxconfig tool + sync
firmware reset flow, we run into scenario where the host was not
eswitch manager at the time of mlx5 driver load but becomes eswitch manager
after the sync firmware reset flow. This results in null pointer
access of mpfs structure during mac filter add. This change prevents null
pointer access but mpfs table entries will not be added.

Fixes: 5ec69744 ("net/mlx5: Add support for devlink reload action fw activate")
Signed-off-by: NSuresh Devarakonda <ramad@nvidia.com>
Reviewed-by: NMoshe Shemesh <moshe@nvidia.com>
Reviewed-by: NBodong Wang <bodong@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-12-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

aefb62a9

net/mlx5: Update fw fatal reporter state on PCI handlers successful recover · 416ef713

由 Roy Novich 提交于 10月 26, 2022

Update devlink health fw fatal reporter state to "healthy" is needed by
strictly calling devlink_health_reporter_state_update() after recovery
was done by PCI error handler. This is needed when fw_fatal reporter was
triggered due to PCI error. Poll health is called and set reporter state
to error. Health recovery failed (since EEH didn't re-enable the PCI).
PCI handlers keep on recover flow and succeed later without devlink
acknowledgment. Fix this by adding devlink state update at the end of
the PCI handler recovery process.

Fixes: 6181e5cb ("devlink: add support for reporter recovery completion")
Signed-off-by: NRoy Novich <royno@nvidia.com>
Reviewed-by: NMoshe Shemesh <moshe@nvidia.com>
Reviewed-by: NAya Levin <ayal@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-11-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

416ef713

net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed · 94d65173

由 Roi Dayan 提交于 10月 26, 2022

On multi table split the driver creates a new attr instance with
data being copied from prev attr instance zeroing action flags.
Also need to reset dests properties to avoid incorrect dests per attr.

Fixes: 8300f225 ("net/mlx5e: Create new flow attr for multi table actions")
Signed-off-by: NRoi Dayan <roid@nvidia.com>
Reviewed-by: NMaor Dickman <maord@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-10-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

94d65173

net/mlx5e: TC, Reject forwarding from internal port to internal port · f382a241

由 Ariel Levkovich 提交于 10月 26, 2022

Reject TC rules that forward from internal port to internal port
as it is not supported.

This include rules that are explicitly have internal port as
the filter device as well as rules that apply on tunnel interfaces
as the route device for the tunnel interface can be an internal
port.

Fixes: 27484f71 ("net/mlx5e: Offload tc rules that redirect to ovs internal port")
Signed-off-by: NAriel Levkovich <lariel@nvidia.com>
Reviewed-by: NMaor Dickman <maord@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-9-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

f382a241

net/mlx5: Fix possible use-after-free in async command interface · bacd22df

由 Tariq Toukan 提交于 10月 26, 2022

mlx5_cmd_cleanup_async_ctx should return only after all its callback
handlers were completed. Before this patch, the below race between
mlx5_cmd_cleanup_async_ctx and mlx5_cmd_exec_cb_handler was possible and
lead to a use-after-free:

1. mlx5_cmd_cleanup_async_ctx is called while num_inflight is 2 (i.e.
   elevated by 1, a single inflight callback).
2. mlx5_cmd_cleanup_async_ctx decreases num_inflight to 1.
3. mlx5_cmd_exec_cb_handler is called, decreases num_inflight to 0 and
   is about to call wake_up().
4. mlx5_cmd_cleanup_async_ctx calls wait_event, which returns
   immediately as the condition (num_inflight == 0) holds.
5. mlx5_cmd_cleanup_async_ctx returns.
6. The caller of mlx5_cmd_cleanup_async_ctx frees the mlx5_async_ctx
   object.
7. mlx5_cmd_exec_cb_handler goes on and calls wake_up() on the freed
   object.

Fix it by syncing using a completion object. Mark it completed when
num_inflight reaches 0.

Trace:

BUG: KASAN: use-after-free in do_raw_spin_lock+0x23d/0x270
Read of size 4 at addr ffff888139cd12f4 by task swapper/5/0

CPU: 5 PID: 0 Comm: swapper/5 Not tainted 6.0.0-rc3_for_upstream_debug_2022_08_30_13_10 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 dump_stack_lvl+0x57/0x7d
 print_report.cold+0x2d5/0x684
 ? do_raw_spin_lock+0x23d/0x270
 kasan_report+0xb1/0x1a0
 ? do_raw_spin_lock+0x23d/0x270
 do_raw_spin_lock+0x23d/0x270
 ? rwlock_bug.part.0+0x90/0x90
 ? __delete_object+0xb8/0x100
 ? lock_downgrade+0x6e0/0x6e0
 _raw_spin_lock_irqsave+0x43/0x60
 ? __wake_up_common_lock+0xb9/0x140
 __wake_up_common_lock+0xb9/0x140
 ? __wake_up_common+0x650/0x650
 ? destroy_tis_callback+0x53/0x70 [mlx5_core]
 ? kasan_set_track+0x21/0x30
 ? destroy_tis_callback+0x53/0x70 [mlx5_core]
 ? kfree+0x1ba/0x520
 ? do_raw_spin_unlock+0x54/0x220
 mlx5_cmd_exec_cb_handler+0x136/0x1a0 [mlx5_core]
 ? mlx5_cmd_cleanup_async_ctx+0x220/0x220 [mlx5_core]
 ? mlx5_cmd_cleanup_async_ctx+0x220/0x220 [mlx5_core]
 mlx5_cmd_comp_handler+0x65a/0x12b0 [mlx5_core]
 ? dump_command+0xcc0/0xcc0 [mlx5_core]
 ? lockdep_hardirqs_on_prepare+0x400/0x400
 ? cmd_comp_notifier+0x7e/0xb0 [mlx5_core]
 cmd_comp_notifier+0x7e/0xb0 [mlx5_core]
 atomic_notifier_call_chain+0xd7/0x1d0
 mlx5_eq_async_int+0x3ce/0xa20 [mlx5_core]
 atomic_notifier_call_chain+0xd7/0x1d0
 ? irq_release+0x140/0x140 [mlx5_core]
 irq_int_handler+0x19/0x30 [mlx5_core]
 __handle_irq_event_percpu+0x1f2/0x620
 handle_irq_event+0xb2/0x1d0
 handle_edge_irq+0x21e/0xb00
 __common_interrupt+0x79/0x1a0
 common_interrupt+0x78/0xa0
 </IRQ>
 <TASK>
 asm_common_interrupt+0x22/0x40
RIP: 0010:default_idle+0x42/0x60
Code: c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 04 84 d2 75 14 8b 05 eb 47 22 02 85 c0 7e 07 0f 00 2d e0 9f 48 00 fb f4 <c3> 48 c7 c7 80 08 7f 85 e8 d1 d3 3e fe eb de 66 66 2e 0f 1f 84 00
RSP: 0018:ffff888100dbfdf0 EFLAGS: 00000242
RAX: 0000000000000001 RBX: ffffffff84ecbd48 RCX: 1ffffffff0afe110
RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffff835cc9bc
RBP: 0000000000000005 R08: 0000000000000001 R09: ffff88881dec4ac3
R10: ffffed1103bd8958 R11: 0000017d0ca571c9 R12: 0000000000000005
R13: ffffffff84f024e0 R14: 0000000000000000 R15: dffffc0000000000
 ? default_idle_call+0xcc/0x450
 default_idle_call+0xec/0x450
 do_idle+0x394/0x450
 ? arch_cpu_idle_exit+0x40/0x40
 ? do_idle+0x17/0x450
 cpu_startup_entry+0x19/0x20
 start_secondary+0x221/0x2b0
 ? set_cpu_sibling_map+0x2070/0x2070
 secondary_startup_64_no_verify+0xcd/0xdb
 </TASK>

Allocated by task 49502:
 kasan_save_stack+0x1e/0x40
 __kasan_kmalloc+0x81/0xa0
 kvmalloc_node+0x48/0xe0
 mlx5e_bulk_async_init+0x35/0x110 [mlx5_core]
 mlx5e_tls_priv_tx_list_cleanup+0x84/0x3e0 [mlx5_core]
 mlx5e_ktls_cleanup_tx+0x38f/0x760 [mlx5_core]
 mlx5e_cleanup_nic_tx+0xa7/0x100 [mlx5_core]
 mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core]
 mlx5e_suspend+0xdb/0x140 [mlx5_core]
 mlx5e_remove+0x89/0x190 [mlx5_core]
 auxiliary_bus_remove+0x52/0x70
 device_release_driver_internal+0x40f/0x650
 driver_detach+0xc1/0x180
 bus_remove_driver+0x125/0x2f0
 auxiliary_driver_unregister+0x16/0x50
 mlx5e_cleanup+0x26/0x30 [mlx5_core]
 cleanup+0xc/0x4e [mlx5_core]
 __x64_sys_delete_module+0x2b5/0x450
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Freed by task 49502:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_set_free_info+0x20/0x30
 ____kasan_slab_free+0x11d/0x1b0
 kfree+0x1ba/0x520
 mlx5e_tls_priv_tx_list_cleanup+0x2e7/0x3e0 [mlx5_core]
 mlx5e_ktls_cleanup_tx+0x38f/0x760 [mlx5_core]
 mlx5e_cleanup_nic_tx+0xa7/0x100 [mlx5_core]
 mlx5e_detach_netdev+0x1ca/0x2b0 [mlx5_core]
 mlx5e_suspend+0xdb/0x140 [mlx5_core]
 mlx5e_remove+0x89/0x190 [mlx5_core]
 auxiliary_bus_remove+0x52/0x70
 device_release_driver_internal+0x40f/0x650
 driver_detach+0xc1/0x180
 bus_remove_driver+0x125/0x2f0
 auxiliary_driver_unregister+0x16/0x50
 mlx5e_cleanup+0x26/0x30 [mlx5_core]
 cleanup+0xc/0x4e [mlx5_core]
 __x64_sys_delete_module+0x2b5/0x450
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: e355477e ("net/mlx5: Make mlx5_cmd_exec_cb() a safe API")
Signed-off-by: NTariq Toukan <tariqt@nvidia.com>
Reviewed-by: NMoshe Shemesh <moshe@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-8-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

bacd22df

net/mlx5: ASO, Create the ASO SQ with the correct timestamp format · 0f3caaa2

由 Saeed Mahameed 提交于 10月 26, 2022

mlx5 SQs must select the timestamp format explicitly according to the
active clock mode, select the current active timestamp mode so ASO SQ create
will succeed.

This fixes the following error prints when trying to create ipsec ASO SQ
while the timestamp format is real time mode.

mlx5_cmd_out_err:778:(pid 34874): CREATE_SQ(0x904) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xd61c0b), err(-22)
mlx5_aso_create_sq:285:(pid 34874): Failed to open aso wq sq, err=-22
mlx5e_ipsec_init:436:(pid 34874): IPSec initialization failed, -22

Fixes: cdd04f4d ("net/mlx5: Add support to create SQ and CQ for ASO")
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Reported-by: NLeon Romanovsky <leonro@nvidia.com>
Reviewed-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-7-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

0f3caaa2

net/mlx5e: Update restore chain id for slow path packets · 8dc47c05

由 Paul Blakey 提交于 10月 26, 2022

Currently encap slow path rules just forward to software without
setting the chain id miss register, so driver doesn't restore
the chain, and packets hitting this rule will restart from tc chain
0 instead of continuing to the chain the encap rule was on.

Fix this by setting the chain id miss register to the chain id mapping.

Fixes: 8f1e0b97 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping")
Signed-off-by: NPaul Blakey <paulb@nvidia.com>
Reviewed-by: NOz Shlomo <ozsh@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-6-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

8dc47c05

net/mlx5e: Extend SKB room check to include PTP-SQ · 19b43a43

由 Aya Levin 提交于 10月 26, 2022

When tx_port_ts is set, the driver diverts all UPD traffic over PTP port
to a dedicated PTP-SQ. The SKBs are cached until the wire-CQE arrives.
When the packet size is greater then MTU, the firmware might drop it and
the packet won't be transmitted to the wire, hence the wire-CQE won't
reach the driver. In this case the SKBs are accumulated in the SKB fifo.
Add room check to consider the PTP-SQ SKB fifo, when the SKB fifo is
full, driver stops the queue resulting in a TX timeout. Devlink
TX-reporter can recover from it.

Fixes: 1880bc4e ("net/mlx5e: Add TX port timestamp support")
Signed-off-by: NAya Levin <ayal@nvidia.com>
Reviewed-by: NTariq Toukan <tariqt@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-5-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

19b43a43

net/mlx5: DR, Fix matcher disconnect error flow · 4ea9891d

由 Rongwei Liu 提交于 10月 26, 2022

When 2nd flow rules arrives, it will merge together with the
1st one if matcher criteria is the same.

If merge fails, driver will rollback the merge contents, and
reject the 2nd rule. At rollback stage, matcher can't be
disconnected unconditionally, otherise the 1st rule can't be
hit anymore.

Add logic to check if the matcher should be disconnected or not.

Fixes: cc2295cd ("net/mlx5: DR, Improve steering for empty or RX/TX-only matchers")
Signed-off-by: NRongwei Liu <rongweil@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-4-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

4ea9891d

net/mlx5: Wait for firmware to enable CRS before pci_restore_state · 212b4d72

由 Moshe Shemesh 提交于 10月 26, 2022

After firmware reset driver should verify firmware already enabled CRS
and became responsive to pci config cycles before restoring pci state.
Fix that by waiting till device_id is readable through PCI again.

Fixes: eabe8e5e ("net/mlx5: Handle sync reset now event")
Signed-off-by: NMoshe Shemesh <moshe@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-3-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

212b4d72

net/mlx5e: Do not increment ESN when updating IPsec ESN state · 888be6b2

由 Hyong Youb Kim 提交于 10月 26, 2022

An offloaded SA stops receiving after about 2^32 + replay_window
packets. For example, when SA reaches <seq-hi 0x1, seq 0x2c>, all
subsequent packets get dropped with SA-icv-failure (integrity_failed).

To reproduce the bug:
- ConnectX-6 Dx with crypto enabled (FW 22.30.1004)
- ipsec.conf:
  nic-offload = yes
  replay-window = 32
  esn = yes
  salifetime=24h
- Run netperf for a long time to send more than 2^32 packets
  netperf -H <device-under-test> -t TCP_STREAM -l 20000

When 2^32 + replay_window packets are received, the replay window
moves from the 2nd half of subspace (overlap=1) to the 1st half
(overlap=0). The driver then updates the 'esn' value in NIC
(i.e. seq_hi) as follows.

 seq_hi = xfrm_replay_seqhi(seq_bottom)
 new esn in NIC = seq_hi + 1

The +1 increment is wrong, as seq_hi already contains the correct
seq_hi. For example, when seq_hi=1, the driver actually tells NIC to
use seq_hi=2 (esn). This incorrect esn value causes all subsequent
packets to fail integrity checks (SA-icv-failure). So, do not
increment.

Fixes: cb010083 ("net/mlx5: IPSec, Add support for ESN")
Signed-off-by: NHyong Youb Kim <hyonkim@cisco.com>
Acked-by: NLeon Romanovsky <leonro@nvidia.com>
Signed-off-by: NSaeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20221026135153.154807-2-saeed@kernel.orgSigned-off-by: NJakub Kicinski <kuba@kernel.org>

888be6b2

netdevsim: remove dir in nsim_dev_debugfs_init() when creating ports dir failed · a6aa8d0c

由 Zhengchao Shao 提交于 10月 26, 2022

Remove dir in nsim_dev_debugfs_init() when creating ports dir failed.
Otherwise, the netdevsim device will not be created next time. Kernel
reports an error: debugfs: Directory 'netdevsim1' with parent 'netdevsim'
already present!

Fixes: ab1d0cc0 ("netdevsim: change debugfs tree topology")
Signed-off-by: NZhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

a6aa8d0c

netdevsim: fix memory leak in nsim_drv_probe() when nsim_dev_resources_register() failed · 6b1da9f7

由 Zhengchao Shao 提交于 10月 26, 2022

If some items in nsim_dev_resources_register() fail, memory leak will
occur. The following is the memory leak information.

unreferenced object 0xffff888074c02600 (size 128):
  comm "echo", pid 8159, jiffies 4294945184 (age 493.530s)
  hex dump (first 32 bytes):
    40 47 ea 89 ff ff ff ff 01 00 00 00 00 00 00 00  @G..............
    ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
  backtrace:
    [<0000000011a31c98>] kmalloc_trace+0x22/0x60
    [<0000000027384c69>] devl_resource_register+0x144/0x4e0
    [<00000000a16db248>] nsim_drv_probe+0x37a/0x1260
    [<000000007d1f448c>] really_probe+0x20b/0xb10
    [<00000000c416848a>] __driver_probe_device+0x1b3/0x4a0
    [<00000000077e0351>] driver_probe_device+0x49/0x140
    [<0000000054f2465a>] __device_attach_driver+0x18c/0x2a0
    [<000000008538f359>] bus_for_each_drv+0x151/0x1d0
    [<0000000038e09747>] __device_attach+0x1c9/0x4e0
    [<00000000dd86e533>] bus_probe_device+0x1d5/0x280
    [<00000000839bea35>] device_add+0xae0/0x1cb0
    [<000000009c2abf46>] new_device_store+0x3b6/0x5f0
    [<00000000fb823d7f>] bus_attr_store+0x72/0xa0
    [<000000007acc4295>] sysfs_kf_write+0x106/0x160
    [<000000005f50cb4d>] kernfs_fop_write_iter+0x3a8/0x5a0
    [<0000000075eb41bf>] vfs_write+0x8f0/0xc80

Fixes: 37923ed6 ("netdevsim: Add simple FIB resource controller via devlink")
Signed-off-by: NZhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: NJakub Kicinski <kuba@kernel.org>

6b1da9f7

netdevsim: fix memory leak in nsim_bus_dev_new() · cf2010aa

由 Zhengchao Shao 提交于 10月 26, 2022

If device_register() failed in nsim_bus_dev_new(), the value of reference
in nsim_bus_dev->dev is 1. obj->name in nsim_bus_dev->dev will not be
released.

unreferenced object 0xffff88810352c480 (size 16):
  comm "echo", pid 5691, jiffies 4294945921 (age 133.270s)
  hex dump (first 16 bytes):
    6e 65 74 64 65 76 73 69 6d 31 00 00 00 00 00 00  netdevsim1......
  backtrace:
    [<000000005e2e5e26>] __kmalloc_node_track_caller+0x3a/0xb0
    [<0000000094ca4fc8>] kvasprintf+0xc3/0x160
    [<00000000aad09bcc>] kvasprintf_const+0x55/0x180
    [<000000009bac868d>] kobject_set_name_vargs+0x56/0x150
    [<000000007c1a5d70>] dev_set_name+0xbb/0xf0
    [<00000000ad0d126b>] device_add+0x1f8/0x1cb0
    [<00000000c222ae24>] new_device_store+0x3b6/0x5e0
    [<0000000043593421>] bus_attr_store+0x72/0xa0
    [<00000000cbb1833a>] sysfs_kf_write+0x106/0x160
    [<00000000d0dedb8a>] kernfs_fop_write_iter+0x3a8/0x5a0
    [<00000000770b66e2>] vfs_write+0x8f0/0xc80
    [<0000000078bb39be>] ksys_write+0x106/0x210
    [<00000000005e55a4>] do_syscall_64+0x35/0x80
    [<00000000eaa40bbc>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: 40e4fe4c ("netdevsim: move device registration and related code to bus.c")
Signed-off-by: NZhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221026015405.128795-1-shaozhengchao@huawei.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

cf2010aa

net: broadcom: bcm4908_enet: update TX stats after actual transmission · ef3556ee

由 Rafał Miłecki 提交于 10月 27, 2022

Queueing packets doesn't guarantee their transmission. Update TX stats
after hardware confirms consuming submitted data.

This also fixes a possible race and NULL dereference.
bcm4908_enet_start_xmit() could try to access skb after freeing it in
the bcm4908_enet_poll_tx().
Reported-by: NFlorian Fainelli <f.fainelli@gmail.com>
Fixes: 4feffead ("net: broadcom: bcm4908enet: add BCM4908 controller driver")
Signed-off-by: NRafał Miłecki <rafal@milecki.pl>
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221027112430.8696-1-zajec5@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

ef3556ee

27 10月, 2022 12 次提交

net: bcmsysport: Indicate MAC is in charge of PHY PM · 9f172134

由 Florian Fainelli 提交于 10月 25, 2022

Avoid the PHY library call unnecessarily into the suspend/resume
functions by setting phydev->mac_managed_pm to true. The SYSTEMPORT
driver essentially does exactly what mdio_bus_phy_resume() does by
calling phy_resume().

Fixes: fba863b8 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221025234201.2549360-1-f.fainelli@gmail.comSigned-off-by: NPaolo Abeni <pabeni@redhat.com>

9f172134

rbd: fix possible memory leak in rbd_sysfs_init() · 7f21735f

由 Yang Yingliang 提交于 10月 27, 2022

If device_register() returns error in rbd_sysfs_init(), name of kobject
which is allocated in dev_set_name() called in device_add() is leaked.

As comment of device_add() says, it should call put_device() to drop
the reference count that was set in device_initialize() when it fails,
so the name can be freed in kobject_cleanup().

Fault injection test can trigger this problem:

unreferenced object 0xffff88810173aa78 (size 8):
  comm "modprobe", pid 247, jiffies 4294714278 (age 31.789s)
  hex dump (first 8 bytes):
    72 62 64 00 81 88 ff ff                          rbd.....
  backtrace:
    [<00000000f58fae56>] __kmalloc_node_track_caller+0x44/0x1b0
    [<00000000bdd44fe7>] kstrdup+0x3a/0x70
    [<00000000f7844d0b>] kstrdup_const+0x63/0x80
    [<000000001b0a0eeb>] kvasprintf_const+0x10b/0x190
    [<00000000a47bd894>] kobject_set_name_vargs+0x56/0x150
    [<00000000d5edbf18>] dev_set_name+0xab/0xe0
    [<00000000f5153e80>] device_add+0x106/0x1f20

Fixes: dfc5606d ("rbd: replace the rbd sysfs interface")
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Reviewed-by: NAlex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20221027091918.2294132-1-yangyingliang@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>

7f21735f

net: ehea: fix possible memory leak in ehea_register_port() · 0e7ce23a

由 Yang Yingliang 提交于 10月 25, 2022

If of_device_register() returns error, the of node and the
name allocated in dev_set_name() is leaked, call put_device()
to give up the reference that was set in device_initialize(),
so that of node is put in logical_port_release() and the name
is freed in kobject_cleanup().

Fixes: 1acf2318 ("ehea: dynamic add / remove port")
Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221025130011.1071357-1-yangyingliang@huawei.comSigned-off-by: NPaolo Abeni <pabeni@redhat.com>

0e7ce23a

can: rcar_canfd: fix channel specific IRQ handling for RZ/G2L · d887087c

由 Biju Das 提交于 10月 25, 2022

RZ/G2L has separate channel specific IRQs for transmit and error
interrupts. But the IRQ handler processes both channels, even if there
no interrupt occurred on one of the channels.

This patch fixes the issue by passing a channel specific context
parameter instead of global one for the IRQ register and the IRQ
handler, it just handles the channel which is triggered the interrupt.

Fixes: 76e9353a ("can: rcar_canfd: Add support for RZ/G2L family")
Signed-off-by: NBiju Das <biju.das.jz@bp.renesas.com>
Link: https://lore.kernel.org/all/20221025155657.1426948-3-biju.das.jz@bp.renesas.com
Cc: stable@vger.kernel.org
[mkl: adjust commit message]
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

d887087c

can: rcar_canfd: rcar_canfd_handle_global_receive(): fix IRQ storm on global FIFO receive · 702de2c2

由 Biju Das 提交于 10月 25, 2022

We are seeing an IRQ storm on the global receive IRQ line under heavy
CAN bus load conditions with both CAN channels enabled.

Conditions:

The global receive IRQ line is shared between can0 and can1, either of
the channels can trigger interrupt while the other channel's IRQ line
is disabled (RFIE).

When global a receive IRQ interrupt occurs, we mask the interrupt in
the IRQ handler. Clearing and unmasking of the interrupt is happening
in rx_poll(). There is a race condition where rx_poll() unmasks the
interrupt, but the next IRQ handler does not mask the IRQ due to
NAPIF_STATE_MISSED flag (e.g.: can0 RX FIFO interrupt is disabled and
can1 is triggering RX interrupt, the delay in rx_poll() processing
results in setting NAPIF_STATE_MISSED flag) leading to an IRQ storm.

This patch fixes the issue by checking IRQ active and enabled before
handling the IRQ on a particular channel.

Fixes: dd3bd23e ("can: rcar_canfd: Add Renesas R-Car CAN FD driver")
Suggested-by: NMarc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: NBiju Das <biju.das.jz@bp.renesas.com>
Link: https://lore.kernel.org/all/20221025155657.1426948-2-biju.das.jz@bp.renesas.com
Cc: stable@vger.kernel.org
[mkl: adjust commit message]
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

702de2c2

fbdev/core: Avoid uninitialized read in aperture_remove_conflicting_pci_device() · e0ba1a39

由 Michał Mirosław 提交于 10月 27, 2022

Return on error directly from the BAR-iterating loop instead of
break+return.

This is actually a cosmetic fix, since it would be highly unusual to
have this called for a PCI device without any memory BARs.

Fixes: 9d69ef18 ("fbdev/core: Remove remove_conflicting_pci_framebuffers()")
Signed-off-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: NThomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/e75323732bedc46d613d72ecb40f97e3bc75eea8.1666829073.git.mirq-linux@rere.qmqm.pl

e0ba1a39

can: kvaser_usb: Fix possible completions during init_completion · 2871edb3

由 Anssi Hannula 提交于 10月 10, 2022

kvaser_usb uses completions to signal when a response event is received
for outgoing commands.

However, it uses init_completion() to reinitialize the start_comp and
stop_comp completions before sending the start/stop commands.

In case the device sends the corresponding response just before the
actual command is sent, complete() may be called concurrently with
init_completion() which is not safe.

This might be triggerable even with a properly functioning device by
stopping the interface (CMD_STOP_CHIP) just after it goes bus-off (which
also causes the driver to send CMD_STOP_CHIP when restart-ms is off),
but that was not tested.

Fix the issue by using reinit_completion() instead.

Fixes: 080f40a6 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices")
Tested-by: NJimmy Assarsson <extja@kvaser.com>
Signed-off-by: NAnssi Hannula <anssi.hannula@bitwise.fi>
Signed-off-by: NJimmy Assarsson <extja@kvaser.com>
Link: https://lore.kernel.org/all/20221010185237.319219-2-extja@kvaser.com
Cc: stable@vger.kernel.org
Signed-off-by: NMarc Kleine-Budde <mkl@pengutronix.de>

2871edb3

net: ethernet: ave: Fix MAC to be in charge of PHY PM · e2badb4b

由 Kunihiko Hayashi 提交于 10月 24, 2022

The phylib callback is called after MAC driver's own resume callback is
called. For AVE driver, after resuming immediately, PHY state machine is
in PHY_NOLINK because there is a time lag from link-down to link-up due to
autoneg. The result is WARN_ON() dump in mdio_bus_phy_resume().

Since ave_resume() itself calls phy_resume(), AVE driver should manage
PHY PM. To indicate that MAC driver manages PHY PM, set
phydev->mac_managed_pm to true to avoid the unnecessary phylib call and
add missing phy_init_hw() to ave_resume().
Suggested-by: NHeiner Kallweit <hkallweit1@gmail.com>
Fixes: fba863b8 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: NKunihiko Hayashi <hayashi.kunihiko@socionext.com>
Link: https://lore.kernel.org/r/20221024072227.24769-1-hayashi.kunihiko@socionext.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>

e2badb4b

net: fec: limit register access on i.MX6UL · 0a8b43b1

由 Juergen Borleis 提交于 10月 24, 2022

Using 'ethtool -d […]' on an i.MX6UL leads to a kernel crash:

Unhandled fault: external abort on non-linefetch (0x1008) at […]

due to this SoC has less registers in its FEC implementation compared to other
i.MX6 variants. Thus, a run-time decision is required to avoid access to
non-existing registers.

Fixes: a51d3ab5 ("net: fec: use a more proper compatible string for i.MX6UL type device")
Signed-off-by: NJuergen Borleis <jbe@pengutronix.de>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20221024080552.21004-1-jbe@pengutronix.deSigned-off-by: NJakub Kicinski <kuba@kernel.org>

0a8b43b1

drm/amdgpu: disallow gfxoff until GC IP blocks complete s2idle resume · d61e1d1d

由 Prike Liang 提交于 10月 21, 2022

In the S2idle suspend/resume phase the gfxoff is keeping functional so
some IP blocks will be likely to reinitialize at gfxoff entry and that
will result in failing to program GC registers.Therefore, let disallow
gfxoff until AMDGPU IPs reinitialized completely.
Signed-off-by: NPrike Liang <Prike.Liang@amd.com>
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.15.x

d61e1d1d

usb: dwc3: gadget: Don't set IMI for no_interrupt · 308c316d

由 Thinh Nguyen 提交于 10月 25, 2022

The gadget driver may have a certain expectation of how the request
completion flow should be from to its configuration. Make sure the
controller driver respect that. That is, don't set IMI (Interrupt on
Missed Isoc) when usb_request->no_interrupt is set. Also, the driver
should only set IMI to the last TRB of a chain.

Fixes: 72246da4 ("usb: Introduce DesignWare USB3 DRD Driver")
Cc: stable@vger.kernel.org
Signed-off-by: NThinh Nguyen <Thinh.Nguyen@synopsys.com>
Reviewed-by: NJeff Vanhoof <jdv1029@gmail.com>
Tested-by: NJeff Vanhoof <jdv1029@gmail.com>
Link: https://lore.kernel.org/r/ced336c84434571340c07994e3667a0ee284fefe.1666735451.git.Thinh.Nguyen@synopsys.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

308c316d

usb: dwc3: gadget: Stop processing more requests on IMI · f78961f8

由 Thinh Nguyen 提交于 10月 25, 2022

When servicing a transfer completion event, the dwc3 driver will reclaim
TRBs of started requests up to the request associated with the interrupt
event. Currently we don't check for interrupt due to missed isoc, and
the driver may attempt to reclaim TRBs beyond the associated event. This
causes invalid memory access when the hardware still owns the TRB. If
there's a missed isoc TRB with IMI (interrupt on missed isoc), make sure
to stop servicing further.

Note that only the last TRB of chained TRBs has its status updated with
missed isoc.

Fixes: 72246da4 ("usb: Introduce DesignWare USB3 DRD Driver")
Cc: stable@vger.kernel.org
Reported-by: NJeff Vanhoof <jdv1029@gmail.com>
Reported-by: NDan Vacura <w36195@motorola.com>
Signed-off-by: NThinh Nguyen <Thinh.Nguyen@synopsys.com>
Reviewed-by: NJeff Vanhoof <jdv1029@gmail.com>
Tested-by: NJeff Vanhoof <jdv1029@gmail.com>
Link: https://lore.kernel.org/r/b29acbeab531b666095dfdafd8cb5c7654fbb3e1.1666735451.git.Thinh.Nguyen@synopsys.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

f78961f8

26 10月, 2022 2 次提交

s390/vfio-ap: Fix memory allocation for mdev_types array · e38de480

由 Jason J. Herne 提交于 10月 21, 2022

The vfio-ap crypto driver fails to allocate memory for an array of
pointers used to pass supported mdev types to mdev_register_parent().

Since we only support a single mdev type, the fix is to allocate a
single entry in the ap_matrix_dev->mdev_types array.

Link: https://lore.kernel.org/r/20221021145905.15100-1-jjherne@linux.ibm.com
Fixes: da44c340 ("vfio/mdev: simplify mdev_type handling")
Cc: stable@vger.kernel.org
Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Reported-by: NChristian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: NMatthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: NJason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

e38de480

s390/cio: fix out-of-bounds access on cio_ignore free · 1b607411

由 Peter Oberparleiter 提交于 10月 14, 2022

The channel-subsystem-driver scans for newly available devices whenever
device-IDs are removed from the cio_ignore list using a command such as:

  echo free >/proc/cio_ignore

Since an I/O device scan might interfer with running I/Os, commit
172da89e ("s390/cio: avoid excessive path-verification requests")
introduced an optimization to exclude online devices from the scan.

The newly added check for online devices incorrectly assumes that
an I/O-subchannel's drvdata points to a struct io_subchannel_private.
For devices that are bound to a non-default I/O subchannel driver, such
as the vfio_ccw driver, this results in an out-of-bounds read access
during each scan.

Fix this by changing the scan logic to rely on a driver-independent
online indication. For this we can use struct subchannel->config.ena,
which is the driver's requested subchannel-enabled state. Since I/Os
can only be started on enabled subchannels, this matches the intent
of the original optimization of not scanning devices where I/O might
be running.

Fixes: 172da89e ("s390/cio: avoid excessive path-verification requests")
Fixes: 0c3812c3 ("s390/cio: derive cdev information only for IO-subchannels")
Cc: <stable@vger.kernel.org> # v5.15
Reported-by: NAlexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: NVineeth Vijayan <vneethv@linux.ibm.com>
Signed-off-by: NPeter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

1b607411

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功