- 26 April 2021, 40 commits
-
Submitted by Eyal Birger
stable inclusion from stable-5.10.30 commit 3b74ce529ece253fb70efa69548ef8e160fb6b2d bugzilla: 51791 -------------------------------- [ Upstream commit 8fc0e3b6 ] Frag needed should only be sent if the header enables DF. This fix allows packets larger than MTU to pass the xfrm interface and be fragmented after encapsulation, aligning behavior with non-interface xfrm. Fixes: f203b76d ("xfrm: Add virtual xfrm interfaces") Signed-off-by: NEyal Birger <eyal.birger@gmail.com> Reviewed-by: NSabrina Dubroca <sd@queasysnail.net> Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
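A hedged sketch of the IPv4 rule the patch describes (illustrative only, not the exact upstream hunk; variable names are assumed):

    if (skb->len > mtu && (ip_hdr(skb)->frag_off & htons(IP_DF))) {
            /* sender forbade fragmentation: report the path MTU */
            icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
            return -EMSGSIZE;
    }
    /* no DF bit: transmit and let the packet be fragmented after
     * encapsulation, matching non-interface xfrm behaviour */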
-
Submitted by Chinh T Cao
stable inclusion from stable-5.10.30 commit 2d62d6980c2bbb3346c5345d9a5a30953cc56700 bugzilla: 51791 -------------------------------- [ Upstream commit aeac8ce8 ] iSCSI can use both TCP ports 860 and 3260. However, in our current implementation, the ice_aqc_opc_get_cee_dcb_cfg (0x0A07) AQ command doesn't provide a way to communicate the protocol port number to the AQ's caller. Thus, we assume that 3260 is the iSCSI port number at the AQ's caller layer. Rely on the dcbx-willing mode, desired QoS and remote QoS configuration to determine which port number that iSCSI will use. Fixes: 0ebd3ff1 ("ice: Add code for DCB initialization part 2/4") Signed-off-by: NChinh T Cao <chinh.t.cao@intel.com> Tested-by: NTony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Chinh T Cao
stable inclusion from stable-5.10.30 commit fd92e7aacc16de471fde6afa8b49d6dfff0dc029 bugzilla: 51791 -------------------------------- [ Upstream commit fc2d1165 ] Refactor the DCB related variables out of the ice_port_info_struct. The goal is to make the ice_port_info struct cleaner. Signed-off-by: NChinh T Cao <chinh.t.cao@intel.com> Co-developed-by: NDave Ertman <david.m.ertman@intel.com> Signed-off-by: NDave Ertman <david.m.ertman@intel.com> Tested-by: NTony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Vlad Buslov
stable inclusion from stable-5.10.30 commit 4a78ae12780394cae396ef4e8e94252271363363 bugzilla: 51791 -------------------------------- [ Upstream commit b3650bf7 ] With recent changes that separated action module load from action initialization tcf_action_init() function error handling code was modified to manually release the loaded modules if loading/initialization of any further action in same batch failed. For the case when all modules successfully loaded and some of the actions were initialized before one of them failed in init handler. In this case for all previous actions the module will be released twice by the error handler: First time by the loop that manually calls module_put() for all ops, and second time by the action destroy code that puts the module after destroying the action. Reproduction: $ sudo tc actions add action simple sdata \"2\" index 2 $ sudo tc actions add action simple sdata \"1\" index 1 \ action simple sdata \"2\" index 2 RTNETLINK answers: File exists We have an error talking to the kernel $ sudo tc actions ls action simple total acts 1 action order 0: Simple <"2"> index 2 ref 1 bind 0 $ sudo tc actions flush action simple $ sudo tc actions ls action simple $ sudo tc actions add action simple sdata \"2\" index 2 Error: Failed to load TC action module. We have an error talking to the kernel $ lsmod | grep simple act_simple 20480 -1 Fix the issue by modifying module reference counting handling in action initialization code: - Get module reference in tcf_idr_create() and put it in tcf_idr_release() instead of taking over the reference held by the caller. - Modify users of tcf_action_init_1() to always release the module reference which they obtain before calling init function instead of assuming that created action takes over the reference. - Finally, modify tcf_action_init_1() to not release the module reference when overwriting existing action as this is no longer necessary since both upper and lower layers obtain and manage their own module references independently. Fixes: d349f997 ("net_sched: fix RTNL deadlock again caused by request_module()") Suggested-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NVlad Buslov <vladbu@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Paolo Bonzini
stable inclusion from stable-5.10.30 commit 3c7d3d188ca799805fe6894b0b525c23364ee21c bugzilla: 51791 -------------------------------- [ Upstream commit 315f02c6 ] Right now, if a call to kvm_tdp_mmu_zap_sp returns false, the caller will skip the TLB flush, which is wrong. There are two ways to fix it: - since kvm_tdp_mmu_zap_sp will not yield and therefore will not flush the TLB itself, we could change the call to kvm_tdp_mmu_zap_sp to use "flush |= ..." - or we can chain the flush argument through kvm_tdp_mmu_zap_sp down to __kvm_tdp_mmu_zap_gfn_range. Note that kvm_tdp_mmu_zap_sp will neither yield nor flush, so flush would never go from true to false. This patch does the former to simplify application to stable kernels, and to make it further clearer that kvm_tdp_mmu_zap_sp will not flush. Cc: seanjc@google.com Fixes: 048f4980 ("KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping") Cc: <stable@vger.kernel.org> # 5.10.x: 048f4980: KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping Cc: <stable@vger.kernel.org> # 5.10.x: 33a31641: KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages Cc: <stable@vger.kernel.org> Reviewed-by: NSean Christopherson <seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
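As a hedged sketch of the first option (the one taken here), the assumed NX-recovery call site accumulates the zap result instead of overwriting it:

    /* kvm_tdp_mmu_zap_sp() neither yields nor flushes, so its return value
     * must never clear an already-pending flush */
    flush |= kvm_tdp_mmu_zap_sp(kvm, sp);
    /* ... more shadow pages zapped in the recovery loop ... */
    if (flush)
            kvm_flush_remote_tlbs(kvm);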
-
Submitted by Sean Christopherson
stable inclusion from stable-5.10.30 commit 25fc773b21cef7b9c43ad9e58e374678222954f3 bugzilla: 51791 -------------------------------- [ Upstream commit 33a31641 ] Prevent the TDP MMU from yielding when zapping a gfn range during NX page recovery. If a flush is pending from a previous invocation of the zapping helper, either in the TDP MMU or the legacy MMU, but the TDP MMU has not accumulated a flush for the current invocation, then yielding will release mmu_lock with stale TLB entries. That being said, this isn't technically a bug fix in the current code, as the TDP MMU will never yield in this case. tdp_mmu_iter_cond_resched() will yield if and only if it has made forward progress, as defined by the current gfn vs. the last yielded (or starting) gfn. Because zapping a single shadow page is guaranteed to (a) find that page and (b) step sideways at the level of the shadow page, the TDP iter will break its loop before getting a chance to yield. But that is all very, very subtle, and will break at the slightest sneeze, e.g. zapping while holding mmu_lock for read would break as the TDP MMU wouldn't be guaranteed to see the present shadow page, and thus could step sideways at a lower level. Cc: Ben Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210325200119.1359384-4-seanjc@google.com> [Add lockdep assertion. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Sean Christopherson
stable inclusion from stable-5.10.30 commit be2c527b5d392d9395dea992b0db4087de3c993d bugzilla: 51791 -------------------------------- [ Upstream commit 048f4980 ] Honor the "flush needed" return from kvm_tdp_mmu_zap_gfn_range(), which does the flush itself if and only if it yields (which it will never do in this particular scenario), and otherwise expects the caller to do the flush. If pages are zapped from the TDP MMU but not the legacy MMU, then no flush will occur. Fixes: 29cf0f50 ("kvm: x86/mmu: NX largepage recovery for TDP MMU") Cc: stable@vger.kernel.org Cc: Ben Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210325200119.1359384-3-seanjc@google.com> Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Sean Christopherson
stable inclusion from stable-5.10.30 commit 0aa4dd9e513209819579296f5d3d240f288cd423 bugzilla: 51791 -------------------------------- [ Upstream commit a835429c ] When flushing a range of GFNs across multiple roots, ensure any pending flush from a previous root is honored before yielding while walking the tables of the current root. Note, kvm_tdp_mmu_zap_gfn_range() now intentionally overwrites its local "flush" with the result to avoid redundant flushes. zap_gfn_range() preserves and return the incoming "flush", unless of course the flush was performed prior to yielding and no new flush was triggered. Fixes: 1af4a960 ("KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed") Cc: stable@vger.kernel.org Reviewed-by: NBen Gardon <bgardon@google.com> Signed-off-by: NSean Christopherson <seanjc@google.com> Message-Id: <20210325200119.1359384-2-seanjc@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Ben Gardon
stable inclusion from stable-5.10.30 commit 3c7a18440638b1c5a4645e2de1670cee32df7307 bugzilla: 51791 -------------------------------- [ Upstream commit 1af4a960 ] Given certain conditions, some TDP MMU functions may not yield reliably / frequently enough. For example, if a paging structure was very large but had few, if any writable entries, wrprot_gfn_range could traverse many entries before finding a writable entry and yielding because the check for yielding only happens after an SPTE is modified. Fix this issue by moving the yield to the beginning of the loop. Fixes: a6a0b05d ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Reviewed-by: NPeter Feiner <pfeiner@google.com> Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210202185734.1680553-15-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Ben Gardon
stable inclusion from stable-5.10.30 commit 85f4ff2b06af7618329571ef68470ef26f527697 bugzilla: 51791 -------------------------------- [ Upstream commit ed5e484b ] In some functions the TDP iter risks not making forward progress if two threads livelock yielding to one another. This is possible if two threads are trying to execute wrprot_gfn_range. Each could write protect an entry and then yield. This would reset the tdp_iter's walk over the paging structure and the loop would end up repeating the same entry over and over, preventing either thread from making forward progress. Fix this issue by only yielding if the loop has made forward progress since the last yield. Fixes: a6a0b05d ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Reviewed-by: NPeter Feiner <pfeiner@google.com> Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210202185734.1680553-14-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Ben Gardon
stable inclusion from stable-5.10.30 commit 1cd17c5c9b8a6c2d550c9dbc934dffa813fccfb6 bugzilla: 51791 -------------------------------- [ Upstream commit 74953d35 ] The goal_gfn field in tdp_iter can be misleading as it implies that it is the iterator's final goal. It is really a target for the lowest gfn mapped by the leaf level SPTE the iterator will traverse towards. Change the field's name to be more precise. Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210202185734.1680553-13-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Ben Gardon
stable inclusion from stable-5.10.30 commit b4a3a0d279249bb68fb4f4079ed4db8b0e633213 bugzilla: 51791 -------------------------------- [ Upstream commit e139a34e ] The flushing and non-flushing variants of tdp_mmu_iter_cond_resched have almost identical implementations. Merge the two functions and add a flush parameter. Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210202185734.1680553-12-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Ben Gardon
stable inclusion from stable-5.10.30 commit 8f90432d7f5967567aff390891661cc81e7879a8 bugzilla: 51791 -------------------------------- [ Upstream commit e28a436c ] Currently the TDP MMU yield / cond_resched functions either return nothing or return true if the TLBs were not flushed. These are confusing semantics, especially when making control flow decisions in calling functions. To clean things up, change both functions to have the same return value semantics as cond_resched: true if the thread yielded, false if it did not. If the function yielded in the _flush_ version, then the TLBs will have been flushed. Reviewed-by: NPeter Feiner <pfeiner@google.com> Acked-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NBen Gardon <bgardon@google.com> Message-Id: <20210202185734.1680553-2-bgardon@google.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Wolfram Sang
stable inclusion from stable-5.10.30 commit 5ea9e6038d2926ef4ed456c346f8cd1d0d20749f bugzilla: 51791 -------------------------------- commit e409a6a3 upstream. In some configurations, recovery is optional. So, don't throw an error when it is not used because e.g. pinctrl settings for recovery are not provided. Reword the message and make it debug output. Reported-by: NKlaus Kudielka <klaus.kudielka@gmail.com> Tested-by: NKlaus Kudielka <klaus.kudielka@gmail.com> Signed-off-by: NWolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: NWolfram Sang <wsa@kernel.org> Cc: stable@kernel.org Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Roman Gushchin
stable inclusion from stable-5.10.30 commit efa869b68be99eff8af03ba2802ae746306036bc bugzilla: 51791 -------------------------------- commit 0760fa3d upstream. nr_empty_pop_pages is used to guarantee that there are some free populated pages to satisfy atomic allocations. Accounted and non-accounted allocations are using separate sets of chunks, so both need to have a surplus of empty pages. This commit makes pcpu_nr_empty_pop_pages and the corresponding logic per chunk type. [Dennis] This issue came up as I was reviewing [1] and realized I missed this. Simultaneously, it was reported btrfs was seeing failed atomic allocations in fsstress tests [2] and [3]. [1] https://lore.kernel.org/linux-mm/20210324190626.564297-1-guro@fb.com/ [2] https://lore.kernel.org/linux-mm/20210401185158.3275.409509F4@e16-tech.com/ [3] https://lore.kernel.org/linux-mm/CAL3q7H5RNBjCi708GH7jnczAOe0BLnacT9C+OBgA-Dx9jhB6SQ@mail.gmail.com/ Fixes: 3c7be18a ("mm: memcg/percpu: account percpu memory to memory cgroups") Cc: stable@vger.kernel.org # 5.9+ Signed-off-by: NRoman Gushchin <guro@fb.com> Tested-by: NFilipe Manana <fdmanana@suse.com> Signed-off-by: NDennis Zhou <dennis@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Roman Bolshakov
stable inclusion from stable-5.10.30 commit c441949184a9e316fb7682fb78d264fb05c6f260 bugzilla: 51791 -------------------------------- commit 0352c3d3 upstream. target_sequencer_start event is triggered inside target_cmd_init_cdb(). se_cmd.tag is not initialized with ITT at the moment so the event always prints zero tag. Link: https://lore.kernel.org/r/20210403215415.95077-1-r.bolshakov@yadro.com Cc: stable@vger.kernel.org # 5.10+ Reviewed-by: NMike Christie <michael.christie@oracle.com> Signed-off-by: NRoman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Viswas G
stable inclusion from stable-5.10.30 commit d8e7fa8509d7ff335acfe2b1764dae7dd272ec77 bugzilla: 51791 -------------------------------- commit 65df7d19 upstream. Inbound and outbound queues were not properly configured and that led to MPI configuration failure. Fixes: 05c6c029 ("scsi: pm80xx: Increase number of supported queues") Cc: stable@vger.kernel.org # 5.10+ Link: https://lore.kernel.org/r/20210402054212.17834-1-Viswas.G@microchip.com.com Reported-and-tested-by: Ash Izat <ash@ai0.uk> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Saravana Kannan
stable inclusion from stable-5.10.30 commit 0c47d8a55f7f5522b04b75e511f74dcb71043663 bugzilla: 51791 -------------------------------- commit eed6e418 upstream. list_for_each_entry_safe() is only useful if we are deleting nodes in a linked list within the loop. It doesn't protect against other threads adding/deleting nodes to the list in parallel. We need to grab deferred_probe_mutex when traversing the deferred_probe_pending_list. Cc: stable@vger.kernel.org Fixes: 25b4e70d ("driver core: allow stopping deferred probe after init") Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210402040342.2944858-2-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
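A minimal sketch of the locking rule being added, using the names from the message (the loop body is illustrative):

    struct device_private *p;

    mutex_lock(&deferred_probe_mutex);
    list_for_each_entry(p, &deferred_probe_pending_list, deferred_probe)
            dev_info(p->device, "deferred probe pending\n");
    mutex_unlock(&deferred_probe_mutex);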
-
Submitted by Shuah Khan
stable inclusion from stable-5.10.30 commit f06cb4641b15e55c2c7d47c44a57a8c9208b92ef bugzilla: 51791 -------------------------------- commit 363eaa3a upstream. Fuzzing uncovered race condition between sysfs code paths in usbip drivers. Device connect/disconnect code paths initiated through sysfs interface are prone to races if disconnect happens during connect and vice versa. Use sysfs_lock to synchronize event handler with sysfs paths in usbip drivers. Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/c5c8723d3f29dfe3d759cfaafa7dd16b0dfe2918.1616807117.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Shuah Khan
stable inclusion from stable-5.10.30 commit 28dc9237fe83602b00d43d40673bbfc1f00d8971 bugzilla: 51791 -------------------------------- commit bd8b8204 upstream. Fuzzing uncovered race condition between sysfs code paths in usbip drivers. Device connect/disconnect code paths initiated through sysfs interface are prone to races if disconnect happens during connect and vice versa. Use sysfs_lock to protect sysfs paths in vudc. Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/caabcf3fc87bdae970509b5ff32d05bb7ce2fb15.1616807117.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Shuah Khan
stable inclusion from stable-5.10.30 commit 513765b186c99c09b245067b2ee28cec27a2982d bugzilla: 51791 -------------------------------- commit 9dbf34a8 upstream. Fuzzing uncovered race condition between sysfs code paths in usbip drivers. Device connect/disconnect code paths initiated through sysfs interface are prone to races if disconnect happens during connect and vice versa. Use sysfs_lock to protect sysfs paths in stub-dev. Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/2b182f3561b4a065bf3bf6dce3b0e9944ba17b3f.1616807117.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Shuah Khan
stable inclusion from stable-5.10.30 commit 68be610c19a5467b0631eb291bf18379e7a14516 bugzilla: 51791 -------------------------------- commit 4e9c93af upstream. Fuzzing uncovered race condition between sysfs code paths in usbip drivers. Device connect/disconnect code paths initiated through sysfs interface are prone to races if disconnect happens during connect and vice versa. This problem is common to all drivers while it can be reproduced easily in vhci_hcd. Add a sysfs_lock to usbip_device struct to protect the paths. Use this in vhci_hcd to protect sysfs paths. For a complete fix, usbip_host and usbip-vudc drivers and the event handler will have to use this lock to protect the paths. These changes will be done in subsequent patches. Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/b6568f7beae702bbc236a545d3c020106ca75eac.1616807117.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Dan Carpenter
stable inclusion from stable-5.10.30 commit 126ce97d39cf2839454994aa29478259e453ef5f bugzilla: 51791 -------------------------------- commit 08fe7ae1 upstream. This array uses 1-based indexing so it corrupts memory one element beyond of the array. Fix it by making the array one element larger. Fixes: dacb1287 ("thunderbolt: Add support for on-board retimers") Cc: stable@vger.kernel.org Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
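The underlying pattern is generic; a small self-contained illustration with made-up names:

    #define MAX_UNITS 6                         /* highest valid 1-based index */
    static struct unit *units[MAX_UNITS + 1];   /* slot 0 unused, 1..MAX_UNITS valid */

    static void store_unit(int index, struct unit *u)
    {
            if (index < 1 || index > MAX_UNITS)
                    return;
            units[index] = u;                    /* no longer writes past the end */
    }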
-
Submitted by Dan Carpenter
stable inclusion from stable-5.10.30 commit 256ece954961411c9dc77a08566cda47fabc8d71 bugzilla: 51791 -------------------------------- commit bec4d7c9 upstream. After the device_register() succeeds, then the correct way to clean up is to call device_unregister(). The unregister calls both device_del() and device_put(). Since this code was only device_del() it results in a memory leak. Fixes: dacb1287 ("thunderbolt: Add support for on-board retimers") Cc: stable@vger.kernel.org Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Reviewed-by: NJason Gunthorpe <jgg@nvidia.com> Signed-off-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
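A hedged sketch of the cleanup rule (the device variable and the later failure are illustrative):

    ret = device_register(&rt->dev);
    if (ret) {
            put_device(&rt->dev);                /* registration failed: only drop the reference */
            return ret;
    }

    if (later_setup_failed) {                    /* hypothetical follow-up failure */
            device_unregister(&rt->dev);         /* device_del() plus the final put */
            return -EIO;
    }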
-
Submitted by Paolo Abeni
stable inclusion from stable-5.10.30 commit b830650c1a0c665a48a24b7cf4bf46f7e5b10119 bugzilla: 51791 -------------------------------- commit 9adc89af upstream. Currently the mentioned helper can end-up freeing the socket wmem without waking-up any processes waiting for more write memory. If the partially orphaned skb is attached to an UDP (or raw) socket, the lack of wake-up can hang the user-space. Even for TCP sockets not calling the sk destructor could have bad effects on TSQ. Address the issue using skb_orphan to release the sk wmem before setting the new sock_efree destructor. Additionally bundle the whole ownership update in a new helper, so that later other potential users could avoid duplicate code. v1 -> v2: - use skb_orphan() instead of sort of open coding it (Eric) - provide an helper for the ownership change (Eric) Fixes: f6ba8d33 ("netem: fix skb_orphan_partial()") Suggested-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Reviewed-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
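The bundled ownership update reads roughly like the helper below (a sketch inferred from the description; the helper name and exact placement are assumptions):

    /* take a reference on sk, release the old owner (waking any writers
     * blocked on wmem), then install the lightweight destructor */
    static bool skb_set_owner_sk_safe(struct sk_buff *skb, struct sock *sk)
    {
            if (sk && refcount_inc_not_zero(&sk->sk_refcnt)) {
                    skb_orphan(skb);
                    skb->destructor = sock_efree;
                    skb->sk = sk;
                    return true;
            }
            return false;
    }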
-
Submitted by Maciej Żenczykowski
stable inclusion from stable-5.10.30 commit 5d9216b851009617e9841340d1702c55c430946b bugzilla: 51791 -------------------------------- commit 630e4576 upstream. Found by virtue of ipv6 raw sockets not honouring the per-socket IP{,V6}_FREEBIND setting. Based on hits found via: git grep '[.]ip_nonlocal_bind' We fix both raw ipv6 sockets to honour IP{,V6}_FREEBIND and IP{,V6}_TRANSPARENT, and we fix sctp sockets to honour IP{,V6}_TRANSPARENT (they already honoured FREEBIND), and not just the ipv6 'ip_nonlocal_bind' sysctl. The helper is defined as: static inline bool ipv6_can_nonlocal_bind(struct net *net, struct inet_sock *inet) { return net->ipv6.sysctl.ip_nonlocal_bind || inet->freebind || inet->transparent; } so this change only widens the accepted opt-outs and is thus a clean bugfix. I'm not entirely sure what 'fixes' tag to add, since this is AFAICT an ancient bug, but IMHO this should be applied to stable kernels as far back as possible. As such I'm adding a 'fixes' tag with the commit that originally added the helper, which happened in 4.19. Backporting to older LTS kernels (at least 4.9 and 4.14) would presumably require open-coding it or backporting the helper as well. Other possibly relevant commits: v4.18-rc6-1502-g83ba4645 net: add helpers checking if socket can be bound to nonlocal address v4.18-rc6-1431-gd0c1f011 net/ipv6: allow any source address for sendmsg pktinfo with ip_nonlocal_bind v4.14-rc5-271-gb71d21c2 sctp: full support for ipv6 ip_nonlocal_bind & IP_FREEBIND v4.7-rc7-1883-g9b974202 sctp: support ipv6 nonlocal bind v4.1-12247-g35a256fe ipv6: Nonlocal bind Cc: Lorenzo Colitti <lorenzo@google.com> Fixes: 83ba4645 ("net: add helpers checking if socket can be bound to nonlocal address") Signed-off-by: NMaciej Żenczykowski <maze@google.com> Reviewed-By: NLorenzo Colitti <lorenzo@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Kurt Kanzenbach
stable inclusion from stable-5.10.30 commit b82816d778750d13786d46a17183178e616dceb0 bugzilla: 51791 -------------------------------- commit 9d680392 upstream. Reset MAC header in HSR Tx path. This is needed, because direct packet transmission, e.g. by specifying PACKET_QDISC_BYPASS does not reset the MAC header. This has been observed using the following setup: |$ ip link add name hsr0 type hsr slave1 lan0 slave2 lan1 supervision 45 version 1 |$ ifconfig hsr0 up |$ ./test hsr0 The test binary is using mmap'ed sockets and is specifying the PACKET_QDISC_BYPASS socket option. This patch resolves the following warning on a non-patched kernel: |[ 112.725394] ------------[ cut here ]------------ |[ 112.731418] WARNING: CPU: 1 PID: 257 at net/hsr/hsr_forward.c:560 hsr_forward_skb+0x484/0x568 |[ 112.739962] net/hsr/hsr_forward.c:560: Malformed frame (port_src hsr0) The warning can be safely removed, because the other call sites of hsr_forward_skb() make sure that the skb is prepared correctly. Fixes: d346a3fa ("packet: introduce PACKET_QDISC_BYPASS socket option") Signed-off-by: NKurt Kanzenbach <kurt@linutronix.de> Reviewed-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
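The change itself is small; a hedged sketch of the HSR transmit path with the added reset (placement assumed):

    /* direct sends via PACKET_QDISC_BYPASS skip the point where the MAC
     * header offset is normally set, so establish it before forwarding */
    skb_reset_mac_header(skb);
    hsr_forward_skb(skb, master);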
-
Submitted by Johannes Berg
stable inclusion from stable-5.10.30 commit 9b9c910ccc1995e0251ff8a6696f22682032998a bugzilla: 51791 -------------------------------- commit 1153a747 upstream. Normally, TXQs have txq->tid = tid; txq->ac = ieee80211_ac_from_tid(tid); However, the special management TXQ actually has txq->tid = IEEE80211_NUM_TIDS; // 16 txq->ac = IEEE80211_AC_VO; This makes sense, but ieee80211_ac_from_tid(16) is the same as ieee80211_ac_from_tid(0) which is just IEEE80211_AC_BE. Now, normally this is fine. However, if the netdev queues were stopped, then the code in ieee80211_tx_dequeue() will propagate the stop from the interface (vif->txqs_stopped[]) if the AC 2 (ieee80211_ac_from_tid(txq->tid)) is marked as stopped. On wake, however, __ieee80211_wake_txqs() will wake the TXQ if AC 0 (txq->ac) is woken up. If a driver stops all queues with ieee80211_stop_tx_queues() and then wakes them again with ieee80211_wake_tx_queues(), the ieee80211_wake_txqs() tasklet will run to resync queue and TXQ state. If all queues were woken, then what'll happen is that _ieee80211_wake_txqs() will run in order of HW queues 0-3, typically (and certainly for iwlwifi) corresponding to ACs 0-3, so it'll call __ieee80211_wake_txqs() for each AC in order 0-3. When __ieee80211_wake_txqs() is called for AC 0 (VO) that'll wake up the management TXQ (remember its tid is 16), and the driver's wake_tx_queue() will be called. That tries to get a frame, which will immediately *stop* the TXQ again, because now we check against AC 2, and AC 2 hasn't yet been marked as woken up again in sdata->vif.txqs_stopped[] since we're only in the __ieee80211_wake_txqs() call for AC 0. Thus, the management TXQ will never be started again. Fix this by checking txq->ac directly instead of calculating the AC as ieee80211_ac_from_tid(txq->tid). Fixes: adf8ed01 ("mac80211: add an optional TXQ for other PS-buffered frames") Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20210323210500.bf4d50afea4a.I136ffde910486301f8818f5442e3c9bf8670a9c4@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
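In sketch form, the described change to the stop/wake check looks like this (assumed context, not the literal hunk):

    /* the special management TXQ has tid == IEEE80211_NUM_TIDS but
     * ac == IEEE80211_AC_VO, so stop/wake checks must use the stored AC */
    if (vif->txqs_stopped[txq->ac])              /* was: ...[ieee80211_ac_from_tid(txq->tid)] */
            goto out;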
-
Submitted by Ben Greear
stable inclusion from stable-5.10.30 commit cc357c29358df1c6a82ccc19c461dfc424e2e689 bugzilla: 51791 -------------------------------- commit 7d73cd94 upstream. The incorrect timeout check caused probing to happen when it did not need to happen. This in turn caused tx performance drop for around 5 seconds in ath10k-ct driver. Possibly that tx drop is due to a secondary issue, but fixing the probe to not happen when traffic is running fixes the symptom. Signed-off-by: Ben Greear <greearb@candelatech.com> Fixes: 9abf4e49 ("mac80211: optimize station connection monitor") Acked-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20210330230749.14097-1-greearb@candelatech.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Johannes Berg
stable inclusion from stable-5.10.30 commit cc1a702e6ec03b4f950fe6b62fd3dd2ef0fad749 bugzilla: 51791 -------------------------------- commit b5ac0146 upstream. We need to check the length of this element so that we don't access data beyond its end. Fix that. Fixes: 9eaffe50 ("cfg80211: convert S1G beacon to scan results") Link: https://lore.kernel.org/r/20210408142826.f6f4525012de.I9fdeff0afdc683a6024e5ea49d2daa3cd2459d11@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Johannes Berg
stable inclusion from stable-5.10.30 commit fea52345f422517baadc49425bf8c944b4c425c0 bugzilla: 51791 -------------------------------- commit abaf94ec upstream. In case nl80211_parse_unsol_bcast_probe_resp() results in an error, need to "goto out" instead of just returning to free possibly allocated data. Fixes: 7443dcd1 ("nl80211: Unsolicited broadcast probe response support") Link: https://lore.kernel.org/r/20210408142833.d8bc2e2e454a.If290b1ba85789726a671ff0b237726d4851b5b0f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Johannes Berg
stable inclusion from stable-5.10.30 commit 42e4450e37905c8765b770c490cd6c64713e5de9 bugzilla: 51791 -------------------------------- commit 9a6847ba upstream. If the beacon head attribute (NL80211_ATTR_BEACON_HEAD) is too short to even contain the frame control field, we access uninitialized data beyond the buffer. Fix this by checking the minimal required size first. We used to do this until S1G support was added, where the fixed data portion has a different size. Reported-and-tested-by: syzbot+72b99dcf4607e8c770f3@syzkaller.appspotmail.com Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: 1d47f119 ("nl80211: correctly validate S1G beacon head") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20210408154518.d9b06d39b4ee.Iff908997b2a4067e8d456b3cb96cab9771d252b8@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Vlad Buslov
stable inclusion from stable-5.10.30 commit 81692c6add7e55309bdc275f59301efd4b49e958 bugzilla: 51791 -------------------------------- commit 87c750e8 upstream. Action init code increments reference counter when it changes an action. This is the desired behavior for cls API which needs to obtain action reference for every classifier that points to action. However, act API just needs to change the action and releases the reference before returning. This sequence breaks when the requested action doesn't exist, which causes act API init code to create new action with specified index, but action is still released before returning and is deleted (unless it was referenced concurrently by cls API). Reproduction: $ sudo tc actions ls action gact $ sudo tc actions change action gact drop index 1 $ sudo tc actions ls action gact Extend tcf_action_init() to accept 'init_res' array and initialize it with action->ops->init() result. In tcf_action_add() remove pointers to created actions from actions array before passing it to tcf_action_put_many(). Fixes: cae422f3 ("net: sched: use reference counting action init") Reported-by: NKumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: NVlad Buslov <vladbu@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Pavel Tikhomirov
stable inclusion from stable-5.10.30 commit cdcf3829f418d9d85c720bbd25ab9fb4b6c2e056 bugzilla: 51791 -------------------------------- commit 1ffbc7ea upstream. Reproduce: modprobe sch_teql tc qdisc add dev teql0 root teql0 This leads to (for instance in Centos 7 VM) OOPS: [ 532.366633] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 532.366733] IP: [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.366825] PGD 80000001376d5067 PUD 137e37067 PMD 0 [ 532.366906] Oops: 0000 [#1] SMP [ 532.366987] Modules linked in: sch_teql ... [ 532.367945] CPU: 1 PID: 3026 Comm: tc Kdump: loaded Tainted: G ------------ T 3.10.0-1062.7.1.el7.x86_64 #1 [ 532.368041] Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.2 04/01/2014 [ 532.368125] task: ffff8b7d37d31070 ti: ffff8b7c9fdbc000 task.ti: ffff8b7c9fdbc000 [ 532.368224] RIP: 0010:[<ffffffffc06124a8>] [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.368320] RSP: 0018:ffff8b7c9fdbf8e0 EFLAGS: 00010286 [ 532.368394] RAX: ffffffffc0612490 RBX: ffff8b7cb1565e00 RCX: ffff8b7d35ba2000 [ 532.368476] RDX: ffff8b7d35ba2000 RSI: 0000000000000000 RDI: ffff8b7cb1565e00 [ 532.368557] RBP: ffff8b7c9fdbf8f8 R08: ffff8b7d3fd1f140 R09: ffff8b7d3b001600 [ 532.368638] R10: ffff8b7d3b001600 R11: ffffffff84c7d65b R12: 00000000ffffffd8 [ 532.368719] R13: 0000000000008000 R14: ffff8b7d35ba2000 R15: ffff8b7c9fdbf9a8 [ 532.368800] FS: 00007f6a4e872740(0000) GS:ffff8b7d3fd00000(0000) knlGS:0000000000000000 [ 532.368885] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 532.368961] CR2: 00000000000000a8 CR3: 00000001396ee000 CR4: 00000000000206e0 [ 532.369046] Call Trace: [ 532.369159] [<ffffffff84c8192e>] qdisc_create+0x36e/0x450 [ 532.369268] [<ffffffff846a9b49>] ? ns_capable+0x29/0x50 [ 532.369366] [<ffffffff849afde2>] ? nla_parse+0x32/0x120 [ 532.369442] [<ffffffff84c81b4c>] tc_modify_qdisc+0x13c/0x610 [ 532.371508] [<ffffffff84c693e7>] rtnetlink_rcv_msg+0xa7/0x260 [ 532.372668] [<ffffffff84907b65>] ? sock_has_perm+0x75/0x90 [ 532.373790] [<ffffffff84c69340>] ? rtnl_newlink+0x890/0x890 [ 532.374914] [<ffffffff84c8da7b>] netlink_rcv_skb+0xab/0xc0 [ 532.376055] [<ffffffff84c63708>] rtnetlink_rcv+0x28/0x30 [ 532.377204] [<ffffffff84c8d400>] netlink_unicast+0x170/0x210 [ 532.378333] [<ffffffff84c8d7a8>] netlink_sendmsg+0x308/0x420 [ 532.379465] [<ffffffff84c2f3a6>] sock_sendmsg+0xb6/0xf0 [ 532.380710] [<ffffffffc034a56e>] ? __xfs_filemap_fault+0x8e/0x1d0 [xfs] [ 532.381868] [<ffffffffc034a75c>] ? xfs_filemap_fault+0x2c/0x30 [xfs] [ 532.383037] [<ffffffff847ec23a>] ? __do_fault.isra.61+0x8a/0x100 [ 532.384144] [<ffffffff84c30269>] ___sys_sendmsg+0x3e9/0x400 [ 532.385268] [<ffffffff847f3fad>] ? handle_mm_fault+0x39d/0x9b0 [ 532.386387] [<ffffffff84d88678>] ? __do_page_fault+0x238/0x500 [ 532.387472] [<ffffffff84c31921>] __sys_sendmsg+0x51/0x90 [ 532.388560] [<ffffffff84c31972>] SyS_sendmsg+0x12/0x20 [ 532.389636] [<ffffffff84d8dede>] system_call_fastpath+0x25/0x2a [ 532.390704] [<ffffffff84d8de21>] ? system_call_after_swapgs+0xae/0x146 [ 532.391753] Code: 00 00 00 00 00 00 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b b7 48 01 00 00 48 89 fb <48> 8b 8e a8 00 00 00 48 85 c9 74 43 48 89 ca eb 0f 0f 1f 80 00 [ 532.394036] RIP [<ffffffffc06124a8>] teql_destroy+0x18/0x100 [sch_teql] [ 532.395127] RSP <ffff8b7c9fdbf8e0> [ 532.396179] CR2: 00000000000000a8 Null pointer dereference happens on master->slaves dereference in teql_destroy() as master is null-pointer. 
When qdisc_create() calls teql_qdisc_init() it imediately fails after check "if (m->dev == dev)" because both devices are teql0, and it does not set qdisc_priv(sch)->m leaving it zero on error path, then qdisc_create() imediately calls teql_destroy() which does not expect zero master pointer and we get OOPS. Fixes: 87b60cfa ("net_sched: fix error recovery at qdisc creation") Signed-off-by: NPavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
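A hedged sketch of the defensive check described (struct names as used in sch_teql; exact placement assumed):

    static void teql_destroy(struct Qdisc *sch)
    {
            struct teql_sched_data *dat = qdisc_priv(sch);
            struct teql_master *master = dat->m;

            /* init failed before dat->m was set (e.g. attaching teql0 to
             * itself), so there is nothing to unlink */
            if (!master)
                    return;
            /* normal teardown continues here */
    }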
-
Submitted by Eli Cohen
stable inclusion from stable-5.10.30 commit 422eda6255161dbd615f7fc58e0d1abb506dc12f bugzilla: 51791 -------------------------------- commit bc04d93e upstream. When we suspend the VM, the VDPA interface will be reset. When the VM is resumed again, clear_virtqueues() will clear the available and used indices resulting in hardware virtqueue objects becoming out of sync. We can avoid this function altogether since qemu will clear them if required, e.g. when the VM went through a reboot. Moreover, since the hw available and used indices should always be identical on query and should be restored to the same value for virtqueues that complete in order, we set the single value provided by set_vq_state(). In get_vq_state() we return the value of hardware used index. Fixes: b35ccebe ("vdpa/mlx5: Restore the hardware used index after change map") Fixes: 1a86b377 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20210408091047.4269-6-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Acked-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Arkadiusz Kubalewski
stable inclusion from stable-5.10.30 commit 89e406e95278628ae0337ae3fa3c85659c88514f bugzilla: 51791 -------------------------------- commit 12738ac4 upstream. Remove error handling through pointers. Instead use plain int to return value from i40e_run_xdp(...). Previously: - sparse errors were produced during compilation: i40e_txrx.c:2338 i40e_run_xdp() error: (-2147483647) too low for ERR_PTR i40e_txrx.c:2558 i40e_clean_rx_irq() error: 'skb' dereferencing possible ERR_PTR() - sk_buff* was used to return value, but it has never had valid pointer to sk_buff. Returned value was always int handled as a pointer. Fixes: 0c8493d9 ("i40e: add XDP support for pass and drop actions") Fixes: 2e689312 ("i40e: split XDP_TX tail and XDP_REDIRECT map flushing") Signed-off-by: NAleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: NArkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: NDave Switzer <david.switzer@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Arkadiusz Kubalewski
stable inclusion from stable-5.10.30 commit 12e1438a0946434906303b3682d363e0d0efc3a2 bugzilla: 51791 -------------------------------- commit d6d04ee6 upstream. Init pointer with NULL in default switch case statement. Previously the error was produced when compiling against sparse. i40e_debugfs.c:582 i40e_dbg_dump_desc() error: uninitialized symbol 'ring'. Fixes: 44ea803e ("i40e: introduce new dump desc XDP command") Signed-off-by: NAleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: NArkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: NDave Switzer <david.switzer@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Arkadiusz Kubalewski
stable inclusion from stable-5.10.30 commit 2472ba1c46b4c33b4010042a3e83a46ea59915b1 bugzilla: 51791 -------------------------------- commit 6b5674fe upstream. Remove vsi->netdev->name from the trace. This is redundant information. With the devinfo trace, the adapter is already identifiable. Previously following error was produced when compiling against sparse. i40e_main.c:2571 i40e_sync_vsi_filters() error: we previously assumed 'vsi->netdev' could be null (see line 2323) Fixes: b603f9dc ("i40e: Log info when PF is entering and leaving Allmulti mode.") Signed-off-by: NAleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: NArkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: NDave Switzer <david.switzer@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Arkadiusz Kubalewski
stable inclusion from stable-5.10.30 commit 792387118204fd67f1dbbf2c3da1a9c08c541593 bugzilla: 51791 -------------------------------- commit 8a1e918d upstream. Set proper return values inside error checking if-statements. Previously following warning was produced when compiling against sparse. i40e_main.c:15162 i40e_init_recovery_mode() warn: missing error code 'err' Fixes: 4ff0ee1a ("i40e: Introduce recovery mode support") Signed-off-by: NAleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: NArkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: NDave Switzer <david.switzer@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
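The warning class is about error paths that branch without assigning the status variable; a generic hedged illustration (the helper and label are made up):

    vsi = alloc_recovery_vsi(pf);                /* hypothetical helper */
    if (!vsi) {
            err = -ENOMEM;                       /* the assignment the warning asked for */
            goto err_out;
    }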
-
Submitted by Eric Dumazet
stable inclusion from stable-5.10.30 commit f0b4c9acf5fe09d4deb3e217afdfff525ebc4b97 bugzilla: 51791 -------------------------------- commit 61431a59 upstream. Commit 924a9bc3 ("net: check if protocol extracted by virtio_net_hdr_set_proto is correct") added a call to dev_parse_header_protocol() but mac_header is not yet set. This means that eth_hdr() reads complete garbage, and syzbot complained about it [1] This patch resets mac_header earlier, to get more coverage about this change. Audit of virtio_net_hdr_to_skb() callers shows that this change should be safe. [1] BUG: KASAN: use-after-free in eth_header_parse_protocol+0xdc/0xe0 net/ethernet/eth.c:282 Read of size 2 at addr ffff888017a6200b by task syz-executor313/8409 CPU: 1 PID: 8409 Comm: syz-executor313 Not tainted 5.12.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416 eth_header_parse_protocol+0xdc/0xe0 net/ethernet/eth.c:282 dev_parse_header_protocol include/linux/netdevice.h:3177 [inline] virtio_net_hdr_to_skb.constprop.0+0x99d/0xcd0 include/linux/virtio_net.h:83 packet_snd net/packet/af_packet.c:2994 [inline] packet_sendmsg+0x2325/0x52b0 net/packet/af_packet.c:3031 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 sock_no_sendpage+0xf3/0x130 net/core/sock.c:2860 kernel_sendpage.part.0+0x1ab/0x350 net/socket.c:3631 kernel_sendpage net/socket.c:3628 [inline] sock_sendpage+0xe5/0x140 net/socket.c:947 pipe_to_sendpage+0x2ad/0x380 fs/splice.c:364 splice_from_pipe_feed fs/splice.c:418 [inline] __splice_from_pipe+0x43e/0x8a0 fs/splice.c:562 splice_from_pipe fs/splice.c:597 [inline] generic_splice_sendpage+0xd4/0x140 fs/splice.c:746 do_splice_from fs/splice.c:767 [inline] do_splice+0xb7e/0x1940 fs/splice.c:1079 __do_splice+0x134/0x250 fs/splice.c:1144 __do_sys_splice fs/splice.c:1350 [inline] __se_sys_splice fs/splice.c:1332 [inline] __x64_sys_splice+0x198/0x250 fs/splice.c:1332 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 Fixes: 924a9bc3 ("net: check if protocol extracted by virtio_net_hdr_set_proto is correct") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Balazs Nemeth <bnemeth@redhat.com> Cc: Willem de Bruijn <willemb@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
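A hedged sketch of the ordering inside virtio_net_hdr_to_skb() after the change (illustrative, not the literal hunk):

    /* make the MAC header offset valid before any L2 parsing */
    skb_reset_mac_header(skb);

    if (!skb->protocol) {
            __be16 protocol = dev_parse_header_protocol(skb);  /* now reads real header bytes */
            /* virtio_net_hdr_set_proto() and the remaining checks follow */
    }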
-