Commits · 2d47dc83dbfa0fa7cc1e5a0b44db62b6c4b8c4f2 · openeuler / Kernel

21 Apr, 2022 · 20 commits

mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task · 2d47dc83

Committed by yanghui on Apr 21, 2022

mainline inclusion
from mainline-v5.15-rc1
commit 276aeee1
category: bugfix
bugzilla: 181417 https://gitee.com/openeuler/kernel/issues/I53CSV
backport: openEuler-22.03-LTS

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=276aeee1c5fc00df700f0782060beae126600472

-------------------------------

Servers hit the following panic:

  Kernel version:5.4.56
  BUG: unable to handle page fault for address: 0000000000002c48
  RIP: 0010:__next_zones_zonelist+0x1d/0x40
  Call Trace:
    __alloc_pages_nodemask+0x277/0x310
    alloc_page_interleave+0x13/0x70
    handle_mm_fault+0xf99/0x1390
    __do_page_fault+0x288/0x500
    do_page_fault+0x30/0x110
    page_fault+0x3e/0x50

The panic occurs because MAX_NUMNODES is passed as the third parameter
(preferred_nid) of __alloc_pages_nodemask().  The subsequent access to
zonelist->zoneref->zone_idx in __next_zones_zonelist() then faults.

In offset_il_node(), first_node() returns a nid from pol->v.nodes; other
threads may then change pol->v.nodes before next_node() runs.  This race
condition can make next_node() return MAX_NUMNODES.  So put pol->nodes in
a local variable.

The race condition is between offset_il_node and cpuset_change_task_nodemask:

  CPU0:                                     CPU1:
  alloc_pages_vma()
    interleave_nid(pol,)
      offset_il_node(pol,)
        first_node(pol->v.nodes)            cpuset_change_task_nodemask
                        //nodes==0xc          mpol_rebind_task
                                                mpol_rebind_policy
                                                  mpol_rebind_nodemask(pol,nodes)
                        //nodes==0x3
        next_node(nid, pol->v.nodes)//return MAX_NUMNODES
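
The interleaving above can be reproduced deterministically in user space.
The following is a minimal sketch, not the kernel code: nodemask_t is
reduced to a 64-bit integer, MAX_NUMNODES is fixed at an illustrative
value, and second_node_racy() / second_node_snapshot() are hypothetical
helpers in which the "rebound" argument stands in for the concurrent
mpol_rebind_nodemask() on CPU1.

```c
#define MAX_NUMNODES 64			/* illustrative; config-dependent in the kernel */

typedef unsigned long long nodemask_t;	/* simplified: one bit per NUMA node */

/* Next set bit strictly after nid, or MAX_NUMNODES if there is none. */
static int next_node(int nid, nodemask_t mask)
{
	for (int n = nid + 1; n < MAX_NUMNODES; n++)
		if (mask & (1ULL << n))
			return n;
	return MAX_NUMNODES;
}

/* Lowest set bit, or MAX_NUMNODES if the mask is empty. */
static int first_node(nodemask_t mask)
{
	return next_node(-1, mask);
}

/* Racy pattern: the shared mask is read twice and changes in between. */
static int second_node_racy(nodemask_t *shared, nodemask_t rebound)
{
	int nid = first_node(*shared);

	*shared = rebound;		/* simulated rebind by another CPU */
	return next_node(nid, *shared);
}

/* Snapshot pattern: both lookups use one local copy of the mask. */
static int second_node_snapshot(nodemask_t *shared, nodemask_t rebound)
{
	nodemask_t nodes = *shared;	/* local copy, taken once */
	int nid = first_node(nodes);

	*shared = rebound;		/* the rebind no longer affects the walk */
	return next_node(nid, nodes);
}
```

With the shared mask starting at 0xc and rebound to 0x3, the racy helper
returns MAX_NUMNODES while the snapshot helper returns node 3.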

Link: https://lkml.kernel.org/r/20210906034658.48721-1-yanghui.def@bytedance.com
Signed-off-by: yanghui <yanghui.def@bytedance.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 276aeee1)
conflicts:
	mm/mempolicy.c
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI: fix kabi change in struct pci_dev · 62f62cfc

Committed by Jiefeng Ou on Apr 21, 2022

driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T

--------------------------------------------------------------------------

Fix the kABI change in struct pci_dev introduced by the following patches:
- commit 8eb7b6ca203f ("PCI/ERR: Cache RCEC EA Capability offset in
  pci_init_capabilities()")
- commit 1345ecf47242 ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")

Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/RCEC: Fix RCiEP device to RCEC association · 84014230

Committed by Qiuxu Zhuo on Apr 21, 2022

mainline inclusion
from mainline-v5.13-rc1
commit d9b7eae8
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d9b7eae8e3424c3480fe9f40ebafbb0c96426e4c

--------------------------------------------------------------------------

rcec_assoc_rciep() used "rciep->devfn" (a single byte encoding both the
device and function number) as the device number to check whether the
corresponding bit was set in the RCEC's Association Bitmap for RCiEPs.

But per PCIe r5.0, sec 7.9.10.2, "Association Bitmap for RCiEPs", the
32-bit bitmap contains one bit per device.  That bit applies to all
functions of the device.
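
The difference between indexing the bitmap by devfn and by device number
can be sketched in user space as follows.  This is an illustration under
assumptions, not the kernel implementation: PCI_DEVFN() and PCI_SLOT()
are redefined locally with the usual 5-bit device / 3-bit function
layout, and assoc_by_devfn() / assoc_by_device() are hypothetical
helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* devfn packs a 5-bit device number and a 3-bit function number. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)

/* Behaviour described as buggy: the whole devfn byte is the bit index. */
static bool assoc_by_devfn(uint32_t bitmap, uint8_t devfn)
{
	if (devfn >= 32)	/* a shift by >= 32 bits would be undefined */
		return false;
	return (bitmap & (UINT32_C(1) << devfn)) != 0;
}

/* Behaviour described as the fix: one bit per device, any function. */
static bool assoc_by_device(uint32_t bitmap, uint8_t devfn)
{
	return (bitmap & (UINT32_C(1) << PCI_SLOT(devfn))) != 0;
}
```

For a bitmap of 0x2 (device 1 associated), device 1 function 0 has devfn
8, so the devfn-indexed check tests bit 8 and misses the association,
while the device-indexed check tests bit 1 for every function of the
device.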

Fix rcec_assoc_rciep() to convert the value of "rciep->devfn" to a device
number to ensure that RCiEP devices are correctly associated with the RCEC.

Reported-and-tested-by: Wen Jin <wen.jin@intel.com>
Fixes: 507b460f ("PCI/ERR: Add pcie_link_rcec() to associate RCiEPs")
Link: https://lore.kernel.org/r/20210222011717.43266-1-qiuxu.zhuo@intel.com
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Sean V Kelley <sean.v.kelley@intel.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/AER: Add RCEC AER error injection support · 08480b7a

Committed by Qiuxu Zhuo on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit d292dd0e
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d292dd0eb3ac6ce6ea66715bb9f6b8e2ae70747c

--------------------------------------------------------------------------

Root Complex Event Collectors (RCEC) appear as peers to Root Ports and may
also have the AER capability.

Add RCEC support to the AER error injection driver.

Co-developed-by: Sean V Kelley <sean.v.kelley@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-16-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/PME: Add pcie_walk_rcec() to RCEC PME handling · f5a09c3d

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 9a2f604f
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9a2f604f44979e0effa8cf067e5a8ecda729f23b

--------------------------------------------------------------------------

Root Complex Event Collectors (RCEC) appear as peers of Root Ports and also
have the PME capability. As with AER, there is a need to be able to walk
the RCiEPs associated with their RCEC for purposes of acting upon them with
callbacks.

Add RCEC support through the use of pcie_walk_rcec() to the current PME
service driver and attach the PME service driver to the RCEC device.

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-15-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/AER: Add pcie_walk_rcec() to RCEC AER handling · 3a3ebae6

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit af113553
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=af113553d9610b2d811d05da96263b4f666f44f0

--------------------------------------------------------------------------

Root Complex Event Collectors (RCEC) appear as peers to Root Ports and also
have the AER capability. In addition, actions need to be taken for
associated RCiEPs. In such cases the RCECs will need to be walked in order
to find and act upon their respective RCiEPs.

Extend the existing ability to link the RCECs with a walking function
pcie_walk_rcec(). Add RCEC support to the current AER service driver and
attach the AER service driver to the RCEC device.

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-14-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Recover from RCiEP AER errors · d2a69954

Committed by Qiuxu Zhuo on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 57908622
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5790862255028c831761e13014ee87a06df828f1

-----------------------------------------------------------------------------

Add support for handling AER errors detected by Root Complex Integrated
Endpoints (RCiEPs).  These errors are signaled to software natively via a
Root Complex Event Collector (RCEC) or non-natively via ACPI APEI if the
platform retains control of AER or uses a non-standard RCEC-like device.

When recovering from RCiEP errors, the Root Error Command and Status
registers are in the AER Capability of an associated RCEC (if any), not in
a Root Port.  In the non-native case, the platform is responsible for those
registers and we can't touch them.

[bhelgaas: commit log, etc]
Co-developed-by: Sean V Kelley <sean.v.kelley@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-13-sean.v.kelley@intel.com
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Add pcie_link_rcec() to associate RCiEPs · 735504a0

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 507b460f
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=507b460f814458605c47b0ed03c11e49a712fc08

--------------------------------------------------------------------------

A Root Complex Event Collector terminates error and PME messages from
associated RCiEPs.

Use the RCEC Endpoint Association Extended Capability to identify
associated RCiEPs. Link the associated RCiEPs as the RCECs are enumerated.

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-12-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Recover from RCEC AER errors · a28efa17

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit a175102b
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a175102b0a82fc57853a9e611c42d1d6172e5180

----------------------------------------------------------------------------

A Root Complex Event Collector (RCEC) collects and signals AER errors that
were detected by Root Complex Integrated Endpoints (RCiEPs), but it may
also signal errors it detects itself.  This is analogous to errors detected
and signaled by a Root Port.

Update the AER service driver to claim RCECs in addition to Root Ports.
Add support for handling RCEC-detected AER errors.  This does not
include handling RCiEP-detected errors that are signaled by the RCEC.

Note that we expect these errors only from the native AER and APEI paths,
not from DPC or EDR.

[bhelgaas: split from combined RCEC/RCiEP patch, commit log]
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Clear AER status only when we control AER · ce37c219

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit aa344bc8
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=aa344bc8b727b47b4350b59d8166216a3f351e55

--------------------------------------------------------------------------

In some cases a bridge may not be visible to the OS because the hardware
is controlled only by firmware. The same can happen in future use cases
involving non-native use of RCECs by firmware. In these scenarios, we
expect the platform to retain control of the bridge and to clear error
status itself.

Clear error status only when the OS has native control of AER.

Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Add pci_walk_bridge() to pcie_do_recovery() · 8dcc59b7

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 05e9ae19
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=05e9ae19ab83881a0f33025bd1288e41e552a34b

--------------------------------------------------------------------------

Consolidate subordinate bus checks with pci_walk_bus() into
pci_walk_bridge() for walking below potentially AER affected bridges.

Link: https://lore.kernel.org/r/20201121001036.8560-10-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Avoid negated conditional for clarity · 0ac22224

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 3d7d8fc7
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3d7d8fc78f4b504819882278fcfe10784eb985fa

--------------------------------------------------------------------------

Reverse the sense of the Root Port/Downstream Port conditional for clarity.
No functional change intended.

Link: https://lore.kernel.org/r/20201121001036.8560-9-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Use "bridge" for clarity in pcie_do_recovery() · 1c714692

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 0791721d
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0791721d800790e6e533bd8467df67f0dc4f2fec

--------------------------------------------------------------------------

pcie_do_recovery() may be called with "dev" being either a bridge (Root
Port or Switch Downstream Port) or an Endpoint.  The bulk of the function
deals with the bridge, so if we start with an Endpoint, we reset "dev" to
be the bridge leading to it.

For clarity, replace "dev" in the body of the function with "bridge".  No
functional change intended.

Link: https://lore.kernel.org/r/20201121001036.8560-8-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Simplify by computing pci_pcie_type() once · 9a1f8e24

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 480ef7cb
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=480ef7cb9fcebda7b28cbed4f6cdcf0a02f4a6ca

--------------------------------------------------------------------------

Instead of calling pci_pcie_type(dev) twice, call it once and save the
result.  No functional change intended.

Link: https://lore.kernel.org/r/20201121001036.8560-7-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Simplify by using pci_upstream_bridge() · 0b33f0c9

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 5d69dcc9
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5d69dcc9f839bd2d5cac7a098712f52149e1673f

--------------------------------------------------------------------------

Use pci_upstream_bridge() in place of dev->bus->self.  No functional change
intended.

Link: https://lore.kernel.org/r/20201121001036.8560-6-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Rename reset_link() to reset_subordinates() · 5dbfb814

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 8f1bbfbc
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8f1bbfbc3596d401b60d1562b27ec28c2724f60d

--------------------------------------------------------------------------

reset_link() appears to be misnamed.  The point is to reset any devices
below a given bridge, so rename it to reset_subordinates() to make it clear
that we are passing a bridge with the intent to reset the devices below it.

Link: https://lore.kernel.org/r/20201121001036.8560-5-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Cache RCEC EA Capability offset in pci_init_capabilities() · 74f5f078

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 90655631
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=90655631988f8f501529e6de5f13614389717ead

--------------------------------------------------------------------------

Extend support for Root Complex Event Collectors by decoding and caching
the RCEC Endpoint Association Extended Capabilities when enumerating. Use
that cached information for later error source reporting. See PCIe r5.0,
sec 7.9.10.

Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-4-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/ERR: Bind RCEC devices to the Root Port driver · 487e77c6

Committed by Qiuxu Zhuo on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit c9d659b6
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c9d659b60770db94b898f94947192a94bbf95c5c

--------------------------------------------------------------------------

If a Root Complex Integrated Endpoint (RCiEP) is implemented, it may signal
errors through a Root Complex Event Collector (RCEC).  Each RCiEP must be
associated with no more than one RCEC.

For an RCEC (which is technically not a Bridge), error messages "received"
from associated RCiEPs must be enabled for "transmission" in order to cause
a System Error via the Root Control register or (when the Advanced Error
Reporting Capability is present) reporting via the Root Error Command
register and logging in the Root Error Status register and Error Source
Identification register.

Given the commonality with Root Ports and the need to also support AER and
PME services for RCECs, extend the Root Port driver to support RCEC devices
by adding the RCEC Class ID to the driver structure.

Co-developed-by: Sean V Kelley <sean.v.kelley@intel.com>
Link: https://lore.kernel.org/r/20201121001036.8560-3-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


PCI/AER: Write AER Capability only when we control it · fe6e1c23

Committed by Sean V Kelley on Apr 21, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 50cc18fc
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I51U4T
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=50cc18fcd3053fb46a09db5a39e6516e9560f765

------------------------------------------------------------------------

If an OS has not been granted AER control via _OSC, it should not make
changes to PCI_ERR_ROOT_COMMAND and PCI_ERR_ROOT_STATUS related registers.
Per section 4.5.1 of the System Firmware Intermediary (SFI) _OSC and DPC
Updates ECN [1], this bit also covers these aspects of the PCI Express
Advanced Error Reporting. Based on the above and earlier discussion [2],
make the following changes:

Add a check for the native case (i.e., AER control via _OSC)

Note that the previous "clear, reset, enable" order suggests that the reset
might cause errors that we should ignore. After this commit, those errors
(if any) will remain logged in the PCI_ERR_ROOT_STATUS register.

[1] System Firmware Intermediary (SFI) _OSC and DPC Updates ECN, Feb 24,
    2020, affecting PCI Firmware Specification, Rev. 3.2
    https://members.pcisig.com/wg/PCI-SIG/document/14076
[2] https://lore.kernel.org/linux-pci/20201020162820.GA370938@bjorn-Precision-5520/

Link: https://lore.kernel.org/r/20201121001036.8560-2-sean.v.kelley@intel.com
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # non-native/no RCEC
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jiefeng Ou <oujiefeng@h-partners.com>
Reviewed-by: Jay Fang <f.fangjian@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


af_key: add __GFP_ZERO flag for compose_sadb_supported in function pfkey_register · ea91adda

Committed by Haimin Zhang on Apr 21, 2022

stable inclusion
from stable-v5.10.110
commit 8d3f4ad43054619379ccc697cfcbdb2c266800d8
category: bugfix
bugzilla: 186606, https://gitee.com/src-openeuler/kernel/issues/I53SSV
CVE: CVE-2022-1353

--------------------------------

[ Upstream commit 9a564bcc ]

Add __GFP_ZERO flag for compose_sadb_supported in function pfkey_register
to initialize the buffer of supp_skb to fix a kernel-info-leak issue.

1) Function pfkey_register calls compose_sadb_supported to request
   a sk_buff.
2) compose_sadb_supported calls alloc_skb to allocate a sk_buff, but it
   doesn't zero it.
3) If auth_len is greater than 0, compose_sadb_supported treats the
   memory as a struct sadb_supported and begins to initialize it, but it
   only initializes the fields sadb_supported_len and
   sadb_supported_exttype, leaving sadb_supported_reserved untouched.
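
The leak pattern described in the steps above can be sketched in user
space.  This is an illustration under assumptions, not the af_key code:
struct supported_hdr is a hypothetical stand-in for struct
sadb_supported, compose() is a hypothetical helper, and the caller
pre-fills the buffer with a known byte pattern to play the role of stale
heap contents.  The zero_first argument plays the role of __GFP_ZERO.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a header with a reserved field. */
struct supported_hdr {
	uint16_t len;
	uint16_t exttype;
	uint32_t reserved;
};

/*
 * Write a header into buf, setting only len and exttype.  When
 * zero_first is nonzero the buffer is cleared beforehand.  Returns the
 * value of the reserved field that a reader of buf would observe.
 * The caller must pass size >= sizeof(struct supported_hdr).
 */
static uint32_t compose(unsigned char *buf, size_t size, int zero_first)
{
	struct supported_hdr hdr;

	if (zero_first)
		memset(buf, 0, size);

	memcpy(&hdr, buf, sizeof(hdr));	/* view the existing bytes as a header */
	hdr.len = 1;			/* illustrative values */
	hdr.exttype = 14;
	/* hdr.reserved is deliberately left as found */
	memcpy(buf, &hdr, sizeof(hdr));

	return hdr.reserved;
}
```

If the buffer previously held 0xAA bytes, the reserved field still reads
0xAAAAAAAA after compose() unless the buffer was zeroed first.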

Reported-by: TCS Robot <tcs_robot@tencent.com>
Signed-off-by: Haimin Zhang <tcs_kernel@tencent.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Xu Jia <xujia39@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


19 Apr, 2022 · 20 commits

SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() · 89804594

Committed by Trond Myklebust on Apr 19, 2022

mainline inclusion
from mainline-v5.18-rc2
commit f0043206
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I52Y3C
CVE: CVE-2022-28893
backport: openEuler-22.03-LTS

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f00432063db1a0db484e85193eccc6845435b80e

--------------------------------

We must ensure that all sockets are closed before we call xprt_free()
and release the reference to the net namespace. The problem is that
calling fput() will defer closing the socket until delayed_fput() gets
called.

Let's fix the situation by allowing rpciod and the transport teardown
code (which runs on the system wq) to call __fput_sync(), and directly
close the socket.

Reported-by: Felix Fu <foyjog@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: a73881c9 ("SUNRPC: Fix an Oops in udp_poll()")
Cc: stable@vger.kernel.org # 5.1.x: 3be232f1: SUNRPC: Prevent immediate close+reconnect
Cc: stable@vger.kernel.org # 5.1.x: 89f42494: SUNRPC: Don't call connect() more than once on a TCP socket
Cc: stable@vger.kernel.org # 5.1.x
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


scsi: hisi_sas: Use autosuspend for the host controller · ff6bb789

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit b4cc0949
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=b4cc09492263e07bad4fc4bf34fed3246fa95057

--------------------------------

The controller may enter and exit suspend for every I/O it has to handle.
This is inefficient and may cause too much suspend and resume activity
for the controller.  To avoid this, use a default 5s autosuspend delay
for the controller so that it stops suspending and resuming so often.
This value may still be modified via sysfs interfaces.

Link: https://lore.kernel.org/r/1639999298-244569-16-git-send-email-chenxiang66@hisilicon.com
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>


scsi: libsas: Keep host active while processing events · 96433875

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 307d9f49
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=307d9f49cce966c2ba969f58bd6227bc0092afaa

--------------------------------

Processing events such as PORTE_BROADCAST_RCVD may cause dependency issues
for runtime power management support.  One such problem is that handling
a PORTE_BROADCAST_RCVD event requires the host to be resumed in order to
send SMP commands. However, in resuming the host, the phyup events
generated from re-enabling the phys are processed in the same workqueue as
the original PORTE_BROADCAST_RCVD event. As such, the host will never
finish resuming (as it waits for the phyup event processing), and then the
PORTE_BROADCAST_RCVD event can't be processed as the SMP commands are
blocked, and so we have a deadlock.  Solve this problem by ensuring that
libsas keeps the host active until it has completely finished processing
phy or port events, such as PORTE_BYTES_DMAED. As such, we don't have to
worry about resuming the host for processing individual SMP commands in
this example.

Link: https://lore.kernel.org/r/1639999298-244569-15-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

96433875

scsi: hisi_sas: Keep controller active between ISR of phyup and the event being processed · c4e7a0bd

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit ae9b69e8
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=ae9b69e85eb7ecb32ddce7c04a10a3c69ad60e52

--------------------------------

It is possible that the controller may become suspended between processing
a phyup interrupt and the event being processed by libsas. As such, we
can't ensure the controller is active when processing the phyup event,
which may cause the phyup event to be lost or other issues. To avoid any
possible issues, add pm_runtime_get_noresume() in the phyup interrupt
handler and pm_runtime_put_sync() at the work handler exit to ensure that
the controller stays active. Since we only want to call
pm_runtime_get_noresume() for v3 hw, signal this with a new event,
HISI_PHYE_PHY_UP_PM.
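
The get/put pairing described above can be modelled in user space. The
following is a minimal sketch and not the hisi_sas code: struct pm_model,
model_get_noresume(), model_put() and model_try_suspend() are hypothetical
names that only imitate how a non-zero usage count holds off suspend
between the interrupt handler and the work handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of a runtime-PM usage counter. */
struct pm_model {
    int usage_count;   /* references that forbid suspend */
    bool suspended;    /* current power state of the model */
};

/* Models the "get" in the phyup interrupt handler: take a reference
 * without resuming anything. */
void model_get_noresume(struct pm_model *pm)
{
    pm->usage_count++;
}

/* Models the "put" at the end of the work handler: drop the reference. */
void model_put(struct pm_model *pm)
{
    assert(pm->usage_count > 0);
    pm->usage_count--;
}

/* Suspend is only permitted while nobody holds a reference. */
bool model_try_suspend(struct pm_model *pm)
{
    if (pm->usage_count > 0)
        return false;
    pm->suspended = true;
    return true;
}
```

In this model, a suspend attempt made between model_get_noresume() and
model_put() is refused, which is the window the commit closes.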

Link: https://lore.kernel.org/r/1639999298-244569-14-git-send-email-chenxiang66@hisilicon.com
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

c4e7a0bd

scsi: libsas: Defer works of new phys during suspend · 738915b1

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit bf19aea4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=bf19aea4607cb5f4a652ab70d8d8035a72a6b8da

--------------------------------

During the processing of event PORTE_BYTES_DMAED, the driver queues work
DISCE_DISCOVER_DOMAIN and then flushes workqueue ha->disco_q. If a new
phyup event occurs while the controller is resuming, the PORTE_BYTES_DMAED
work of the new phy occurs before that of the suspended phys. The
DISCE_DISCOVER_DOMAIN work of the new phy requires an active SAS controller
(it needs to resume the SAS controller through scsi_sysfs_add_sdev() and
other functions such as add_device_link()). However, activating the SAS
controller requires completion of the PORTE_BYTES_DMAED work of the
suspended phys, which is blocked by the new phy's work on ha->event_q. So
there is a deadlock, and it is released only after the resume timeout.

To solve the issue, defer works of new phys during suspend and queue the
deferred works after the SAS controller becomes active.
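
The defer-then-queue idea can be illustrated with a small user-space model.
This is a sketch under stated assumptions rather than the libsas
implementation: struct ha_model, model_queue_work() and model_resume_done()
are hypothetical names, work items are plain integers, and works that do
not fit into the fixed-size arrays are simply rejected or dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORKS 8

/* Hypothetical model of a host adapter with an event queue and a defer list. */
struct ha_model {
    bool suspended;            /* controller is not yet active */
    int event_q[MAX_WORKS];    /* work ids queued for execution */
    size_t n_event;
    int deferred[MAX_WORKS];   /* work ids parked until the controller is active */
    size_t n_deferred;
};

/* Queue a work item. Work from a new phy is parked while the controller is
 * suspended; everything else goes straight to the event queue.
 * Returns false if the target array is full. */
bool model_queue_work(struct ha_model *ha, int work_id, bool new_phy)
{
    if (ha->suspended && new_phy) {
        if (ha->n_deferred == MAX_WORKS)
            return false;
        ha->deferred[ha->n_deferred++] = work_id;
        return true;
    }
    if (ha->n_event == MAX_WORKS)
        return false;
    ha->event_q[ha->n_event++] = work_id;
    return true;
}

/* Called once the controller is active: move the parked works to the event
 * queue in their original order (works that do not fit are dropped here). */
void model_resume_done(struct ha_model *ha)
{
    ha->suspended = false;
    for (size_t i = 0; i < ha->n_deferred && ha->n_event < MAX_WORKS; i++)
        ha->event_q[ha->n_event++] = ha->deferred[i];
    ha->n_deferred = 0;
}
```

Because the new phy's work never sits ahead of the suspended phys' work on
the event queue in this model, the ordering that produced the deadlock
cannot arise.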

Link: https://lore.kernel.org/r/1639999298-244569-13-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

738915b1

scsi: libsas: Refactor sas_queue_deferred_work() · fa9af4b7

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 1bc35475
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=1bc35475c6bf6d078b3800e516978f37c1ecda36

--------------------------------

In the second part of function __sas_drain_work(), deferred work is queued.
This functionality is required in other places, so factor it out into the
function sas_queue_deferred_work().

Link: https://lore.kernel.org/r/1639999298-244569-12-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

fa9af4b7

scsi: libsas: Add flag SAS_HA_RESUMING · 7d74ad70

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 4ea775ab
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=4ea775abbb5c50c26edbf043d5a2ae7fde407f4a

--------------------------------

Add a flag SAS_HA_RESUMING and use it to indicate the state of resuming the
host controller.

Link: https://lore.kernel.org/r/1639999298-244569-11-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

7d74ad70

scsi: libsas: Resume host while sending SMP I/Os · 386ce48f

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 0da7ca4c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=0da7ca4c4fd95d70d473dc07488ad94ba3ee9b82

--------------------------------

When sending SMP I/Os to the host we need to ensure that the host is not
suspended and can process the commands. This is a better approach than
relying on the host to resume itself to handle such commands. Use
pm_runtime_get_sync() and pm_runtime_put_sync() calls for the host when
executing SMP I/Os.

Link: https://lore.kernel.org/r/1639999298-244569-10-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

386ce48f

scsi: hisi_sas: Add more logs for runtime suspend/resume · 2d0cadc2

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 97f41009
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=97f4100939844a6381ba61b99d6d2b1f2fccb79f

--------------------------------

Add some logs at the beginning and end of suspend/resume.

Link: https://lore.kernel.org/r/1639999298-244569-9-git-send-email-chenxiang66@hisilicon.com
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

2d0cadc2

scsi: libsas: Insert PORTE_BROADCAST_RCVD event for resuming host · 5eeadee8

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit e31e1812
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=e31e18128eb9dbcda8c169cb33421ae4813afa71

--------------------------------

If a new disk is inserted through an expander when the host was suspended,
it will not necessarily be detected as the topology is not re-scanned
during resume.  To detect possible changes in topology during suspension,
insert a PORTE_BROADCAST_RCVD event per port when resuming to trigger a
revalidation.

Link: https://lore.kernel.org/r/1639999298-244569-8-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

5eeadee8

scsi: mvsas: Add spin_lock/unlock() to protect asd_sas_port->phy_list · 03cd624f

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 133b688b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=133b688b2d03f7ae2a6c9d344f92c1949ec05a51

--------------------------------

phy_list_lock is not held when using asd_sas_port->phy_list in the mvsas
driver. Add spin_lock/unlock in those places.

Link: https://lore.kernel.org/r/1639999298-244569-7-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

03cd624f

scsi: hisi_sas: Fix some issues related to asd_sas_port->phy_list · dc480061

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 29e2bac8
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=29e2bac87421c613782ccb510c76c5efbecac0cf

--------------------------------

Most places that use asd_sas_port->phy_list are protected by spinlock
asd_sas_port->phy_list_lock; however, there are still some places which
miss grabbing the lock. Add it in function hisi_sas_refresh_port_id() when
accessing asd_sas_port->phy_list. This carries a risk that the list mutates
while the lock is dropped in function hisi_sas_send_ata_reset_each_phy().
Read asd_sas_port->phy_mask instead of accessing asd_sas_port->phy_list to
avoid this risk.
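
Reading a bitmask instead of walking a list can be sketched as follows.
This is an illustrative user-space example, not the driver code:
model_phys_from_mask() is a hypothetical helper, and the mask is assumed to
be a 32-bit value in which bit i stands for phy i. Because the mask is
copied by value, the loop needs no lock and cannot be disturbed by a
concurrent list update.

```c
#include <assert.h>
#include <stdint.h>

/* Collect the indices of the set bits in a snapshot of a phy mask.
 * Writes at most max_ids entries to phy_ids and returns how many were
 * written. */
int model_phys_from_mask(uint32_t phy_mask, int *phy_ids, int max_ids)
{
    int n = 0;

    for (int i = 0; i < 32 && n < max_ids; i++) {
        if (phy_mask & (UINT32_C(1) << i))
            phy_ids[n++] = i;
    }
    return n;
}
```

A caller would take the snapshot once and then act on each returned phy
index without holding the list lock.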

Link: https://lore.kernel.org/r/1639999298-244569-6-git-send-email-chenxiang66@hisilicon.com
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

dc480061

scsi: libsas: Add spin_lock/unlock() to protect asd_sas_port->phy_list · b9b53ed3

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 42159d3c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=42159d3c8d879e8d5fc225733f0cedc8baf19002

--------------------------------

Most places that use asd_sas_port->phy_list in libsas are protected by
spinlock asd_sas_port->phy_list_lock. However, there are still a few places
which miss the lock. Add it in those places.

Link: https://lore.kernel.org/r/1639999298-244569-5-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

b9b53ed3

scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume() · ad42a605

Committed by Alan Stern on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 6e1fcab0
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=6e1fcab00a23f7fe9f4fe9704905a790efa1eeab

--------------------------------

John Garry reported a deadlock that occurs when trying to access a
runtime-suspended SATA device. For obscure reasons, the rescan procedure
causes the link to be hard-reset, which disconnects the device.

The rescan tries to carry out a runtime resume when accessing the device.
scsi_rescan_device() holds the SCSI device lock and won't release it until
it can put commands onto the device's block queue. This can't happen until
the queue is successfully runtime-resumed or the device is unregistered.
But the runtime resume fails because the device is disconnected, and
__scsi_remove_device() can't do the unregistration because it can't get the
device lock.

The best way to resolve this deadlock appears to be to allow the block
queue to start running again even after an unsuccessful runtime resume.
The idea is that the driver or the SCSI error handler will need to be able
to use the queue to resolve the runtime resume failure.

This patch removes the err argument to blk_post_runtime_resume() and makes
the routine act as though the resume was successful always. This fixes the
deadlock.

Link: https://lore.kernel.org/r/1639999298-244569-4-git-send-email-chenxiang66@hisilicon.com
Fixes: e27829dc ("scsi: serialize ->rescan against ->remove")
Reported-and-tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

ad42a605

scsi: Revert "scsi: hisi_sas: Filter out new PHY up events during suspend" · e6ac2331

Committed by John Garry on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 6cc73908
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=6cc739087784160eff296c7fbd7a95b209f44ba5

--------------------------------

This reverts commit b14a37e0.

In that commit, we had to filter out phy-up events during suspend, as they
would cause a deadlock between processing the phyup event and the resume HA
function trying to drain the HA event workqueue to complete the resume
process.

Now that we no longer try to drain the HA event queue during the HA resume
process, the deadlock would not occur, so remove the special handling for
it.

Link: https://lore.kernel.org/r/1639999298-244569-3-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

e6ac2331

scsi: libsas: Don't always drain event workqueue for HA resume · 9e6fbd2e

Committed by John Garry on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit fbefe228
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=fbefe22811c3140a686e407e114789ebf328a9a2

--------------------------------

For the hisi_sas driver, if a directly attached disk is removed during
suspend, a hang will occur in the resume process:

The background is that in commit 16fd4a7c ("scsi: hisi_sas: Add device
link between SCSI devices and hisi_hba"), it is ensured that the HBA device
cannot be runtime suspended when any SCSI device associated is active.

Other drivers which use libsas don't worry about this as none support
runtime suspend.

The mentioned hang occurs when a disk is removed during suspend. In the
removal process - from PHYE_RESUME_TIMEOUT event processing - we call into
scsi_remove_device(), which is being processed in the HA event workqueue.
Here we wait for all suppliers of the SCSI device to resume, which includes
the HBA device (from the above commit). However the HBA device cannot
resume, as it is waiting for the PHYE_RESUME_TIMEOUT to be processed (from
calling sas_resume_ha() -> sas_drain_work()). This is the deadlock.

There does not appear to be any need for sas_drain_work() to be called at
all in sas_resume_ha() as it is not syncing against anything, so allow
LLDDs to avoid this by providing a variant of sas_resume_ha() which does
not "sync", i.e. doesn't drain the event workqueue.

Link: https://lore.kernel.org/r/1639999298-244569-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

9e6fbd2e

scsi: hisi_sas: Wait for phyup in hisi_sas_control_phy() · c782471c

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 046ab7d0
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=046ab7d0f5943dd74c351e1f3a771dea785fe25d

--------------------------------

When issuing a hardreset/linkreset/phy_set_linkrate from sysfs, the phy
will be disabled and re-enabled for the directly attached scenario.

It takes some time for the phy to come back up after re-enabling the phy.
If the controller becomes suspended while waiting for the phy to come back,
the phy up may be lost (along with the disk).

To solve this problem, wait for the phy up to occur with a timeout. Indeed
this is already done in hisi_sas_debug_I_T_nexus_reset() for local phys, so
just relocate the functionality to hisi_sas_control_phy().

Since the HA workqueue is drained when suspending the controller, and the
phy control function is called from the same workqueue, we can guarantee
that the controller will not be suspended during this period.

Link: https://lore.kernel.org/r/1634041588-74824-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

c782471c

scsi: hisi_sas: Initialise devices in .slave_alloc callback · 088d588a

Committed by Xiang Chen on Apr 19, 2022

mainline inclusion
from mainline-v5.17-rc1
commit 36c6b761
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZHSV
CVE: NA

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/hisi_sas?id=36c6b7613ef1ffd88637315f11c71896f3ce4856

--------------------------------

Perform driver-specific SCSI device initialization in the designated SCSI
midlayer callback instead of relying on the libsas "device found" callback.

The SCSI midlayer .slave_alloc interface is called prior to sending any I/O
to the device.

Link: https://lore.kernel.org/r/1634041588-74824-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Fujai Ni <nifuijia1@hisilicon.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

088d588a

can: ems_usb: ems_usb_start_xmit(): fix double dev_kfree_skb() in error path · 1cbe33bc

Committed by Hangyu Hua on Apr 19, 2022

mainline inclusion
from mainline-v5.18-rc1
commit c7022275
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I51YBP
CVE: CVE-2022-28390
backport: openEuler-22.03-LTS

--------------------------------

There is no need to call dev_kfree_skb() when usb_submit_urb() fails
because can_put_echo_skb() deletes the original skb and
can_free_echo_skb() deletes the cloned skb.

Link: https://lore.kernel.org/all/20220228083639.38183-1-hbh25y@gmail.com
Fixes: 702171ad ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface")
Cc: stable@vger.kernel.org
Cc: Sebastian Haas <haas@ems-wuensche.com>
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

1cbe33bc

drivers: hamradio: 6pack: fix UAF bug caused by mod_timer() · 28c20e81

Committed by Duoming Zhou on Apr 19, 2022

stable inclusion
from stable-v5.10.110
commit f67a1400788f550d201c71aeaf56706afe57f0da
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5224J
CVE: CVE-2022-1198

--------------------------------

commit efe4186e upstream.

When a 6pack device is detaching, sixpack_close() cleans up the necessary
resources. Although del_timer_sync() in sixpack_close() won't return while
a timer is active, mod_timer() in sp_xmit_on_air() can re-arm the timer
afterwards when userspace issues syscalls that reach ax25_sendmsg(),
ax25_connect() or ax25_ioctl().

The unexpectedly woken handler, sp_xmit_on_air(), knows nothing about the
ongoing cleanup and may still call pty_write() to use driver layer
resources that have already been released.

One of the possible race conditions is shown below:

      (USE)                      |      (FREE)
ax25_sendmsg()                   |
 ax25_queue_xmit()               |
  ...                            |
  sp_xmit()                      |
   sp_encaps()                   | sixpack_close()
    sp_xmit_on_air()             |  del_timer_sync(&sp->tx_t)
     mod_timer(&sp->tx_t,...)    |  ...
                                 |  unregister_netdev()
                                 |  ...
     (wait a while)              | tty_release()
                                 |  tty_release_struct()
                                 |   release_tty()
    sp_xmit_on_air()             |    tty_kref_put(tty_struct) //FREE
     pty_write(tty_struct) //USE |    ...

The corresponding failure log is shown below:

===============================================================
BUG: KASAN: use-after-free in __run_timers.part.0+0x170/0x470
Write of size 8 at addr ffff88800a652ab8 by task swapper/2/0
...
Call Trace:
  ...
  queue_work_on+0x3f/0x50
  pty_write+0xcd/0xe0
  sp_xmit_on_air+0xb2/0x1f0
  call_timer_fn+0x28/0x150
  __run_timers.part.0+0x3c2/0x470
  run_timer_softirq+0x3b/0x80
  __do_softirq+0xf1/0x380
  ...

This patch moves del_timer_sync() after unregister_netdev() to avoid the
UAF bug. unregister_netdev() is well synchronized: it flushes any pending
queues, waits for the refcount of net_device to drop to zero and removes
net_device from the kernel. No routines are running after
unregister_netdev() has executed, so the timer can no longer be re-armed
from userspace.
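
The effect of the reordering can be shown with a small sequential model.
It is a sketch only and not the 6pack driver: struct dev_model,
model_xmit(), model_close_fixed() and model_uaf_possible() are hypothetical
names, and real concurrency is replaced by explicit call ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device with one transmit timer. */
struct dev_model {
    bool registered;    /* transmit path is reachable from userspace */
    bool timer_armed;   /* the transmit timer is pending */
    bool resources_ok;  /* resources used by the timer handler still exist */
};

/* Transmit path: re-arms the timer, but only while the device is registered. */
bool model_xmit(struct dev_model *d)
{
    if (!d->registered)
        return false;
    d->timer_armed = true;
    return true;
}

/* Close in the fixed order: cut off the transmit path first, then stop the
 * timer, then release the resources. */
void model_close_fixed(struct dev_model *d)
{
    d->registered = false;    /* models unregister_netdev() */
    d->timer_armed = false;   /* models del_timer_sync() */
    d->resources_ok = false;  /* resources are released afterwards */
}

/* True if a pending timer could fire after its resources are gone. */
bool model_uaf_possible(const struct dev_model *d)
{
    return d->timer_armed && !d->resources_ok;
}
```

Once the transmit path is unreachable, nothing in the model can set
timer_armed again, so stopping the timer at that point is final.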

Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Lin Ma <linma@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Xu Jia <xujia39@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

28c20e81
