Commits · 947019a923b91df1ab3b73f333ebb77d9c58a0cd · openeuler / Kernel

16 Jul, 2021 · 16 commits

iommu/arm-smmu-v3: Realize switch_dirty_log iommu ops · 947019a9

Committed by Kunkun Jiang on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

This implements the switch_dirty_log iommu op. To get a finer dirty
granule, it invokes arm_smmu_split_block() when dirty logging starts,
and invokes arm_smmu_merge_page() to recover the block mappings when
dirty logging stops.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

947019a9

iommu/arm-smmu-v3: Add feature detection for BBML · 829e1611

Committed by Kunkun Jiang on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

This detects the BBML feature and, if the SMMU supports it, passes the
BBMLx quirk to io-pgtable.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

829e1611

iommu/arm-smmu-v3: Enable HTTU for stage1 with io-pgtable mapping · c2f77dfa

Committed by Kunkun Jiang on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

As nested mode is not upstreamed yet, we only aim to support dirty
log tracking for stage 1 with io-pgtable mapping (that is, SVA mapping
is not supported). If HTTU is supported, we enable the HA/HD bits in the
SMMU CD and pass the ARM_HD quirk to io-pgtable.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

c2f77dfa

iommu/io-pgtable-arm: Add and realize clear_dirty_log ops · bdc8c00c

Committed by Kunkun Jiang on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

After the dirty log is retrieved, the user should clear it to re-enable
dirty log tracking for the dirtied pages. This clears the dirty state
of the leaf TTDs that are specified by the user-provided bitmap. As we
only set the DBM bit for stage 1 mappings, clearing the dirty state
means setting the AP[2] bit.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

bdc8c00c

iommu/io-pgtable-arm: Add and realize sync_dirty_log ops · f92820e8

Committed by Kunkun Jiang on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

During dirty log tracking, the user will try to retrieve the dirty log
from the iommu if it supports hardware dirty logging. This scans the
leaf TTDs and treats a TTD as dirty if it is writable. As we only set
the DBM bit for stage 1 mappings, this means checking whether AP[2] is
clear.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

f92820e8

iommu/io-pgtable-arm: Add and realize merge_page ops · bca6b146

Committed by Kunkun Jiang on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

If block (large page) mappings were split when dirty logging started,
we need to recover them when dirty logging stops, for better DMA
performance.

This recovers the block mappings and unmaps the span of page mappings.
The BBML1 or BBML2 feature is required.

Page merging is designed to be used only by dirty log tracking, which
does not work concurrently with other pgtable ops that access the
underlying page table, so no race condition exists.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

bca6b146

iommu/io-pgtable-arm: Add and realize split_block ops · 8b8bda8e

Committed by Kunkun Jiang on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

A block (large page) mapping is not a proper granule for dirty log
tracking. To take an extreme example, if DMA writes one byte under a 1G
mapping, the dirty amount reported is 1G, but under a 4K mapping the
dirty amount is just 4K.

This splits a block descriptor into a span of page descriptors. The
BBML1 or BBML2 feature is required.

Block splitting is designed to be used only by dirty log tracking, which
does not work concurrently with other pgtable ops that access the
underlying page table, so no race condition exists.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

8b8bda8e
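
The granule argument in the split_block commit above can be made concrete with a small sketch. This is plain user-space C, not code from the patch; `dirty_bytes_reported` is a hypothetical helper that rounds a DMA write out to the mapping granule, which is the amount a per-mapping dirty log would have to report.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): number of bytes a dirty log
 * reports when a DMA write of `len` bytes at `iova` lands in mappings of
 * size `granule`. `granule` is assumed to be a power of two.
 */
static uint64_t dirty_bytes_reported(uint64_t iova, uint64_t len,
                                     uint64_t granule)
{
    uint64_t mask = granule - 1;
    uint64_t start = iova & ~mask;              /* round down to granule */
    uint64_t end = (iova + len + mask) & ~mask; /* round up to granule */

    return end - start;
}
```

Under these assumptions, a one-byte write inside a 1G block is reported as 1G of dirty memory, while the same write inside a 4K page is reported as 4K, which is why the block is split before tracking starts.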

iommu/io-pgtable-arm: Add quirk ARM_HD and ARM_BBMLx · 341497bb

Committed by Kunkun Jiang on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

These features are essential to support dirty log tracking for an
SMMU with io-pgtable mapping.

The dirty state information is encoded using the access permission
bits AP[2] (stage 1) or S2AP[1] (stage 2) in conjunction with the
DBM (Dirty Bit Modifier) bit, where DBM marks the mapping as writable
and AP[2]/S2AP[1] records the dirty state.

When ARM_HD is set, we set the DBM bit for stage 1 mappings. As SMMU
nested mode is not upstreamed yet, we only aim to support dirty
log tracking for stage 1 with io-pgtable mapping (that is, SVA is not
supported).

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

341497bb
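
The stage 1 encoding described in the ARM_HD commit above can be sketched as follows. This is an illustration only, not the patch's code: the macro and function names are hypothetical, and the bit positions (AP[2] at descriptor bit 7, DBM at bit 51) are taken from the Arm architecture's stage 1 descriptor layout. A mapping with DBM set starts out read-only (AP[2] set); hardware clears AP[2] on the first write, so "DBM set and AP[2] clear" reads as dirty, and setting AP[2] again makes the entry clean.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; bit positions follow the Arm stage 1 descriptor. */
#define PTE_AP_RDONLY (UINT64_C(1) << 7)   /* AP[2]: set means read-only */
#define PTE_DBM       (UINT64_C(1) << 51)  /* Dirty Bit Modifier */

/* Dirty: tracked by hardware (DBM set) and already written (AP[2] clear). */
static bool pte_is_dirty(uint64_t pte)
{
    return (pte & PTE_DBM) && !(pte & PTE_AP_RDONLY);
}

/* Clearing the dirty state re-arms tracking by setting AP[2] again. */
static uint64_t pte_mkclean(uint64_t pte)
{
    return (pte & PTE_DBM) ? (pte | PTE_AP_RDONLY) : pte;
}
```

This mirrors what the sync_dirty_log and clear_dirty_log commit messages describe (check whether AP[2] is clear, then set AP[2]); the real ops additionally walk the page table and update a bitmap, which is omitted here.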

iommu: Introduce dirty log tracking framework · bbf3b39e

Committed by Keqian Zhu on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

Some types of IOMMU are capable of tracking the DMA dirty log, such as
ARM SMMU with HTTU or Intel IOMMU with SLADE. This introduces the
dirty log tracking framework in the IOMMU base layer.

Four new essential interfaces are added, and we maintain the status
of dirty log tracking in iommu_domain.
1. iommu_support_dirty_log: Check whether the domain supports dirty log tracking
2. iommu_switch_dirty_log: Perform actions to start|stop dirty log tracking
3. iommu_sync_dirty_log: Sync the dirty log from the IOMMU into a dirty bitmap
4. iommu_clear_dirty_log: Clear the dirty log of the IOMMU by a mask bitmap

Note: Don't call these interfaces concurrently with other ops that
access the underlying page table.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

bbf3b39e

vfio/iommu_type1: Mantain a counter for non_pinned_groups · b5ea3305

Committed by Keqian Zhu on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

With this counter, we never need to traverse all groups to update
pinned_scope of vfio_iommu.

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

b5ea3305

fs/filescontrol.c: fix warning:large integer implicitly truncated to unsigned type · a466f5dd

Committed by Lu Jialin on Jul 16, 2021

hulk inclusion
category: bugfix
bugzilla: 50779
CVE: NA

--------

page_counter_set_max(struct page_counter *counter, unsigned long nr_pages)
takes nr_pages as an unsigned long, so change FILES_MAX to ULONG_MAX.

Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

a466f5dd

irqchip/gic-v4.1: Reduce the delay when polling GICR_VPENDBASER.Dirty · 319e568b

Committed by Shenming Lu on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I40TDK
CVE: NA

---------------------------

The 10us delay of the poll on the GICR_VPENDBASER.Dirty bit is too
high, which might greatly affect the total scheduling latency of a
vCPU in our measurement. So we reduce it to 1us to lessen the impact.

Signed-off-by: Shenming Lu <lushenming@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201128141857.983-2-lushenming@huawei.com
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

319e568b

KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty bit · 19339c51

Committed by Shenming Lu on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I40TDK
CVE: NA

---------------------------

In order to reduce the impact of the VPT parsing happening on the GIC,
we can split the vcpu residency in two phases:

- programming GICR_VPENDBASER: this still happens in vcpu_load()
- checking for the VPT parsing to be complete: this can happen
  on vcpu entry (in kvm_vgic_flush_hwstate())

This allows the GIC and the CPU to work in parallel, removing some
of the entry overhead.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Shenming Lu <lushenming@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201128141857.983-3-lushenming@huawei.com
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

19339c51

KVM: arm64: Make use of TWED feature · 9c8b91e8

Committed by Jingyi Wang on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I40FGG
CVE: NA

-----------------------------

For HCR_EL2, TWEDEn(bit[59]) decides whether TWED is enabled, and
when the configurable delay is enabled, TWEDEL (bits[63:60]) encodes
the minimum delay in taking a trap of WFE caused by the TWE bit in
this register as 2^(TWEDEL + 8) cycles.

We use two kernel parameters "twed_enable" and "twedel" to configure
the register.

Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

9c8b91e8
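
The HCR_EL2 field layout described in the TWED commit above can be worked through in a short sketch. The bit positions (TWEDEn at bit 59, TWEDEL at bits 63:60) and the 2^(TWEDEL + 8) formula come from the commit message; the macro and function names are hypothetical and are not claimed to match the patch.

```c
#include <stdint.h>

/* Field layout as stated in the commit message; names are illustrative. */
#define HCR_TWEDEN        (UINT64_C(1) << 59)
#define HCR_TWEDEL_SHIFT  60
#define HCR_TWEDEL_MASK   (UINT64_C(0xf) << HCR_TWEDEL_SHIFT)

/* Minimum delay, in cycles, before a WFE trap is taken: 2^(TWEDEL + 8). */
static uint64_t twed_min_delay_cycles(unsigned int twedel)
{
    return UINT64_C(1) << ((twedel & 0xf) + 8);
}

/* Return `hcr` with the TWED fields rewritten from the two parameters. */
static uint64_t hcr_set_twed(uint64_t hcr, int enable, unsigned int twedel)
{
    hcr &= ~(HCR_TWEDEN | HCR_TWEDEL_MASK);
    if (enable)
        hcr |= HCR_TWEDEN |
               ((uint64_t)(twedel & 0xf) << HCR_TWEDEL_SHIFT);
    return hcr;
}
```

With a 4-bit TWEDEL, the configurable minimum delay ranges from 2^8 = 256 cycles (TWEDEL = 0) to 2^23 cycles (TWEDEL = 15).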

arm64: cpufeature: TWED support detection · 1d939330

Committed by Zengruan Ye on Jul 15, 2021

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I40FGG
CVE: NA

-----------------------------

TWE Delay is an optional feature in the ARMv8.6 extensions. This patch
detects this feature.

Signed-off-by: Zengruan Ye <yezengruan@huawei.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

1d939330

move ETMEM feature CONFIG to mm/Kconfig and add architecture dependency · d7b8dcbe

Committed by liubo on Jul 16, 2021

euleros inclusion
category: feature
feature: etmem
bugzilla: 48246

-------------------------------------------------

The original etmem feature failed to compile on some architectures,
for example powerpc, because no architecture dependency was specified.

This patch moves the ETMEM feature CONFIG to mm/Kconfig and adds the
architecture dependency.

Signed-off-by: liubo <liubo254@huawei.com>
Reviewed-by: jingxiangfeng 00447129 <jingxiangfeng@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

d7b8dcbe

15 Jul, 2021 · 13 commits

x86/config: Set CONFIG_TXGBE=m by default · e4bfcab8

Committed by zhenpengzheng on Jul 13, 2021

driver inclusion
category: feature
bugzilla: 50777
CVE: NA

-------------------------------------------------------------------------

Ensure the Netswift 10G NIC driver ko can be distributed in the ISO on x86.

Signed-off-by: zhenpengzheng <zhenpengzheng@net-swift.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

e4bfcab8

net: txgbe: Add support for Netswift 10G NIC · a493f74a

Committed by zhenpengzheng on Jul 13, 2021

driver inclusion
category: feature
bugzilla: 50777
CVE: NA

-------------------------------------------------------------------------

This patch contains the main code of the Netswift 10G NIC driver, which
supports the following devices:
1) Netswift SP1000A 8088:1001[VID:DID]
2) Netswift WX1820AL 8088:2001[VID:DID]

Signed-off-by: zhenpengzheng <zhenpengzheng@net-swift.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

a493f74a

net: hns3: fix spelling mistake "memroy" -> "memory" · abdcf183

Committed by Colin Ian King on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit be419fca
category: bugfix
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=be419fcacf251423afc530b8964a355eb96e4040

----------------------------------------------------------------------

There are spelling mistakes in two dev_err messages. Fix them.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201123103452.197708-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

abdcf183

net: hns3: adds debugfs to dump more info of shaping parameters · 795694eb

Committed by Yonglong Liu on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit c331ecf1
category: feature
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c331ecf1afc1211ce927cc4bd3a978b3655c0854

----------------------------------------------------------------------

Adds debugfs to dump new shaping parameters: rate and flag.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

795694eb

net: hns3: add support to utilize the firmware calculated shaping parameters · 9c7964b3

Committed by Yonglong Liu on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit e364ad30
category: feature
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e364ad303fe3e96ff30fb05c031774ecbbce4af1

----------------------------------------------------------------------

Since the calculation in the driver is fixed, the calculated result
may be inaccurate if the number of queues or the clock changes.

So, for compatibility and maintainability, add a new flag to tell the
firmware to calculate the shaping parameters with the specified
rate.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

9c7964b3

net: hns3: add support for pf querying new interrupt resources · 3161efe8

Committed by Yufeng Mo on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit 3a6863e4
category: feature
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3a6863e4e8ee212c7f86594299d9ff0d6a15ecbc

----------------------------------------------------------------------

For HNAE3_DEVICE_VERSION_V3, a maximum of 1281 interrupt
resources are supported. To utilize these new resources,
extend the corresponding fields and variables to a 16-bit type,
and remove the restriction that the NIC client can only use a
maximum of 65 interrupt vectors. In addition, the I/O address
of the extended interrupt resources is different, so an extra
handler is needed.

Currently, the total number of interrupts is the sum of RoCE's
number and RoCE's offset (RoCE is in front of NIC), since
the numbers of NIC and RoCE interrupts are the same. For readability,
rewrite the corresponding field of the command: rename the
RoCE offset field to the number of NIC interrupts, so that
the total number of interrupts is the sum of the RoCE and NIC
numbers. Also replace vport->back with hdev in
hclge_init_roce_base_info() to simplify the code.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

3161efe8

net: hns3: add support for mapping device memory · cf182111

Committed by Huazhong Tan on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit 30ae7f8a
category: feature
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=30ae7f8a6aa730e6dab8d86ccbbacdcbec1c389f

----------------------------------------------------------------------

For devices that have device memory accessed through PCI BAR4, the
IO descriptor push of the NIC and the direct WQE (Work Queue Element)
of RoCE will use this device memory. So add support for mapping
this device memory, and pass this info to the RoCE client, whose
new feature needs it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

cf182111

net: hns3: add support for 1280 queues · 31758273

Committed by Yonglong Liu on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit 9a5ef4aa
category: feature
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9a5ef4aa5457ceab3ad9772fa7360b34192f9463

----------------------------------------------------------------------

For DEVICE_VERSION_V1/2, there are 1024 queues and queue sets in
total. For DEVICE_VERSION_V3, this increases to 1280, all of which
can be assigned to one PF, so remove the limitation of 1024.

To stay compatible with DEVICE_VERSION_V1/2 and old driver
versions, the queue number is split into two parts:
tqp_num (range 0~1023) and ext_tqp_num (range 1024~1279).

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

31758273
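
The two-part queue count in the 1280-queue commit above can be illustrated with a minimal sketch. It assumes, without confirmation from the commit message, that the quoted ranges refer to queue indices, so the legacy field counts up to 1024 queues and the extended field counts any queues beyond that. `struct tqp_count`, `split_tqp_count` and `total_tqps` are hypothetical names, not the driver's.

```c
#include <stdint.h>

#define LEGACY_TQP_MAX 1024u  /* limit understood by V1/V2 and old drivers */

/* Hypothetical container for the two fields named in the commit message. */
struct tqp_count {
    uint16_t tqp_num;      /* queues with indices 0..1023 */
    uint16_t ext_tqp_num;  /* queues with indices 1024..1279 */
};

/* Split a total queue count into the legacy and extended parts. */
static struct tqp_count split_tqp_count(uint16_t total)
{
    struct tqp_count c;

    c.tqp_num = total < LEGACY_TQP_MAX ? total : LEGACY_TQP_MAX;
    c.ext_tqp_num = (uint16_t)(total - c.tqp_num);
    return c;
}

/* A reader that knows both fields recovers the total by summing them. */
static uint16_t total_tqps(struct tqp_count c)
{
    return (uint16_t)(c.tqp_num + c.ext_tqp_num);
}
```

The point of such a split is that an older reader that only knows `tqp_num` still sees a value within its 1024 limit, while a newer reader adds the extended part.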

net: hns3: rename gl_adapt_enable in struct hns3_enet_coalesce · 15bdd209

Committed by Huazhong Tan on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit de25bcc4
category: feature
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=de25bcc47fba49a848764fdfab76741b7e17ca2f

----------------------------------------------------------------------

Besides GL (Gap Limiting), QL (Quantity Limiting) can also be modified
dynamically when DIM is supported. So rename gl_adapt_enable to
adapt_enable in struct hns3_enet_coalesce.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

15bdd209

net: hns3: add support for 1us unit GL configuration · 11e78c39

Committed by Huazhong Tan on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit 5ac84b02
category: feature
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ac84b02d372ff45bce48c78beedbffe7c9158c0

----------------------------------------------------------------------

For devices whose version is V3 or above, GL can be configured
in units of 1us, so add support for configuring this field.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

11e78c39

net: hns3: add support for querying maximum value of GL · 64b9f616

Committed by Huazhong Tan on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit ab16b49c
category: feature
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ab16b49cdf986172373afc16b4039f058aa3b22d

----------------------------------------------------------------------

For maintainability and compatibility, add support for querying
the maximum value of GL.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

64b9f616

net: hns3: add support for configuring interrupt quantity limiting · 1b5ad5eb

Committed by Huazhong Tan on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit 91bfae25
category: feature
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=91bfae25eedd981b384339c7b12bef9eeaba0f34

----------------------------------------------------------------------

QL (quantity limiting) means that the hardware supports interrupt
coalescing based on the frame quantity. QL can be configured when
int_ql_max in the device's specification is non-zero, so add support
for configuring it. Also, rename two coalesce init functions to fit
their purpose.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

1b5ad5eb

net: hns3: Remove duplicated include · fb701258

Committed by YueHaibing on Jul 14, 2021

mainline inclusion
from mainline-v5.11-rc1
commit 36ed77cd
category: bugfix
bugzilla: 173966
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=36ed77cd0535728709d38e3f676db7958188765c

----------------------------------------------------------------------

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20201031024940.29716-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

fb701258

14 Jul, 2021 · 11 commits

locking/qspinlock: Disable CNA by default · f2198ddb

Committed by Wei Li on Jul 06, 2021

hulk inclusion
category: feature
bugzilla: 169576
CVE: NA

-------------------------------------------------

Disable CNA by default. This default behavior can be overridden with
the kernel boot command-line option "numa_spinlock=on/off/auto".

Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

f2198ddb

locking/qspinlock: Add CNA support for ARM64 · 0532ec6d

Committed by Wei Li on Jul 06, 2021

hulk inclusion
category: feature
bugzilla: 169576
CVE: NA

-------------------------------------------------

Enabling CNA is controlled via a new configuration option
(NUMA_AWARE_SPINLOCKS). Add it for arm64.

Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

0532ec6d

KVM: arm64: Rename 'struct pv_sched_ops' · 5a868a49

Committed by Wei Li on Jul 06, 2021

hulk inclusion
category: feature
bugzilla: 169576
CVE: NA

-------------------------------------------------

Refer to x86, rename 'struct pv_sched_ops sched' to
'struct pv_lock_ops lock' to prepare for supporting CNA on arm64.

Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

5a868a49

locking/qspinlock: Introduce the shuffle reduction optimization into CNA · 72492082

Committed by Alex Kogan on Jul 06, 2021

maillist inclusion
category: feature
bugzilla: 169576
CVE: NA

Reference: https://lore.kernel.org/patchwork/patch/1406529/

-------------------------------------------------

This performance optimization chooses probabilistically to avoid moving
threads from the main queue into the secondary one when the secondary queue
is empty.

It is helpful when the lock is only lightly contended. In particular, it
makes CNA less eager to create a secondary queue, but does not introduce
any extra delays for threads waiting in that queue once it is created.

Signed-off-by: Alex Kogan <alex.kogan@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

72492082

locking/qspinlock: Avoid moving certain threads between waiting queues in CNA · 96c56947

Committed by Alex Kogan on Jul 06, 2021

maillist inclusion
category: feature
bugzilla: 169576
CVE: NA

Reference: https://lore.kernel.org/patchwork/patch/1406548/

-------------------------------------------------

Prohibit moving certain threads (e.g., in irq and nmi contexts)
to the secondary queue. Those prioritized threads will always stay
in the primary queue, and so will have a shorter wait time for the lock.

Signed-off-by: Alex Kogan <alex.kogan@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

96c56947

locking/qspinlock: Introduce starvation avoidance into CNA · 1a560b7d

Committed by Alex Kogan on Jul 06, 2021

maillist inclusion
category: feature
bugzilla: 169576
CVE: NA

Reference: https://lore.kernel.org/patchwork/patch/1406285/

-------------------------------------------------

Keep track of the time the thread at the head of the secondary queue
has been waiting, and force inter-node handoff once this time passes
a preset threshold. The default value for the threshold (1ms) can be
overridden with the new kernel boot command-line option
"qspinlock.numa_spinlock_threshold_ns".

Signed-off-by: Alex Kogan <alex.kogan@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

1a560b7d

locking/qspinlock: Introduce CNA into the slow path of qspinlock · 53a2c235

Committed by Alex Kogan on Jul 06, 2021

maillist inclusion
category: feature
bugzilla: 169576
CVE: NA

Reference: https://lore.kernel.org/patchwork/patch/1406329/

-------------------------------------------------

In CNA, spinning threads are organized in two queues, a primary queue for
threads running on the same node as the current lock holder, and a
secondary queue for threads running on other nodes. After acquiring the
MCS lock and before acquiring the spinlock, the MCS lock
holder checks whether the next waiter in the primary queue (if exists) is
running on the same NUMA node. If it is not, that waiter is detached from
the main queue and moved into the tail of the secondary queue. This way,
we gradually filter the primary queue, leaving only waiters running on
the same preferred NUMA node. For more details, see
https://arxiv.org/abs/1810.05600.

Note that this variant of CNA may introduce starvation by continuously
passing the lock between waiters in the main queue. This issue will be
addressed later in the series.

Enabling CNA is controlled via a new configuration option
(NUMA_AWARE_SPINLOCKS). By default, the CNA variant is patched in at the
boot time only if we run on a multi-node machine in native environment and
the new config is enabled. (For the time being, the patching requires
CONFIG_PARAVIRT_SPINLOCKS to be enabled as well. However, this should be
resolved once static_call() is available.) This default behavior can be
overridden with the new kernel boot command-line option
"numa_spinlock=on/off" (default is "auto").

Signed-off-by: Alex Kogan <alex.kogan@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

53a2c235
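
The queue filtering step described in the CNA commit above can be sketched with two singly linked lists. This is a simplified, single-threaded illustration and not the kernel's implementation: the real code works on MCS nodes with atomic operations, and `struct waiter`, `struct cna_queues` and `cna_filter_next` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical, simplified waiter node (no atomics, no spinning). */
struct waiter {
    int numa_node;
    struct waiter *next;
};

struct cna_queues {
    struct waiter *primary_head;  /* next waiter after the lock holder */
    struct waiter *sec_head;      /* secondary queue: other-node waiters */
    struct waiter *sec_tail;
};

/*
 * One filtering step: if the next primary waiter runs on a different
 * node than the holder, detach it and append it to the secondary queue.
 */
static void cna_filter_next(struct cna_queues *q, int holder_node)
{
    struct waiter *w = q->primary_head;

    if (!w || w->numa_node == holder_node)
        return;

    q->primary_head = w->next;
    w->next = NULL;
    if (q->sec_tail)
        q->sec_tail->next = w;
    else
        q->sec_head = w;
    q->sec_tail = w;
}
```

Applying this step on successive lock acquisitions gradually leaves only same-node waiters in the primary queue, which is also why a starvation bound for the secondary queue is needed, as the commit message notes.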

locking/qspinlock: Refactor the qspinlock slow path · 0fc0c83f

Committed by Alex Kogan on Jul 06, 2021

maillist inclusion
category: feature
bugzilla: 169576
CVE: NA

Reference: https://lore.kernel.org/patchwork/patch/1406535/

-------------------------------------------------

Move some of the code manipulating the spin lock into separate functions.
This would allow easier integration of alternative ways to manipulate
that lock.

Signed-off-by: Alex Kogan <alex.kogan@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

0fc0c83f

locking/qspinlock: Rename mcs lock/unlock macros and make them more generic · 910e5a16

Committed by Alex Kogan on Jul 06, 2021

maillist inclusion
category: feature
bugzilla: 169576
CVE: NA

Reference: https://lore.kernel.org/patchwork/patch/1406513/

-------------------------------------------------

The mcs unlock macro (arch_mcs_lock_handoff) should accept the value to be
stored into the lock argument as another argument. This allows using the
same macro in cases where the value to be stored when passing the lock is
different from 1.

Signed-off-by: Alex Kogan <alex.kogan@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

910e5a16

mm/page_alloc: do bulk array bounds check after checking populated elements · deecd417

Committed by Mel Gorman on Jul 14, 2021

mainline inclusion
from mainline-5.13
commit b3b64ebd
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZVL2
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b3b64ebd38225d8032b5db42938d969b602040c2

-------------------------------------------------

Dan Carpenter reported the following

  The patch 0f87d9d3: "mm/page_alloc: add an array-based interface
  to the bulk page allocator" from Apr 29, 2021, leads to the following
  static checker warning:

        mm/page_alloc.c:5338 __alloc_pages_bulk()
        warn: potentially one past the end of array 'page_array[nr_populated]'

The problem can occur if an array is passed in that is fully populated.
That potentially ends up allocating a single page and storing it past
the end of the array.  This patch returns 0 if the array is fully
populated.

Link: https://lkml.kernel.org/r/20210618125102.GU30378@techsingularity.net
Fixes: 0f87d9d3 ("mm/page_alloc: add an array-based interface to the bulk page allocator")
Signed-off-by: Mel Gorman <mgorman@techsinguliarity.net>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b3b64ebd)
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

deecd417

mm/page_alloc: __alloc_pages_bulk(): do bounds check before accessing array · 7491a8e9

Committed by Rasmus Villemoes on Jul 14, 2021

mainline inclusion
from mainline-5.13
commit b08e50dd
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZVL2
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b08e50dd64489e3997029d204f761cb57a3762d2

-------------------------------------------------

In the event that somebody would call this with an already fully
populated page_array, the last loop iteration would do an access beyond
the end of page_array.

It's of course extremely unlikely that would ever be done, but this
triggers my internal static analyzer.  Also, if it really is not
supposed to be invoked this way (i.e., with no NULL entries in
page_array), the nr_populated<nr_pages check could simply be removed
instead.

Link: https://lkml.kernel.org/r/20210507064504.1712559-1-linux@rasmusvillemoes.dk
Fixes: 0f87d9d3 ("mm/page_alloc: add an array-based interface to the bulk page allocator")
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b08e50dd)
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Reviewed-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

7491a8e9
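
The ordering issue behind the bounds-check commit above can be shown with a minimal user-space sketch. It is not the kernel function: `count_populated` is a hypothetical helper that only reproduces the "skip already populated entries" loop. The index is compared against `nr_pages` before the array is read, so a fully populated array never causes a read of `page_array[nr_pages]`.

```c
#include <stddef.h>

/*
 * Hypothetical helper: count the leading non-NULL entries of page_array.
 * The bounds check comes first, so short-circuit evaluation guarantees
 * page_array[nr_populated] is only read while nr_populated < nr_pages.
 */
static size_t count_populated(void *const *page_array, size_t nr_pages)
{
    size_t nr_populated = 0;

    while (nr_populated < nr_pages && page_array[nr_populated])
        nr_populated++;

    return nr_populated;
}
```

With the two conditions in the opposite order, the final iteration on a full array would evaluate `page_array[nr_pages]` before noticing that the index is out of range, which is the access the commit message describes.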
