Commits · 73d3e679bef2ddbd8e31228fa59c3ec34d382f66 · openeuler / Kernel

07 Jan, 2022 37 commits

xfs: remove xfs_trans_unreserve_quota_nblks completely · 73d3e679

Committed by Darrick J. Wong on Jan 07, 2022

mainline-inclusion
from mainline-v5.11-rc4
commit 35b11010
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=35b1101099e85af74a46b8e36f4d1fdac0367ffd

-------------------------------------------------

xfs_trans_cancel will release all the quota resources that were reserved
on behalf of the transaction, so get rid of the explicit unreserve step.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Lihong Kou <koulihong@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

73d3e679

xfs: create convenience wrappers for incore quota block reservations · 86d7a720

Committed by Darrick J. Wong on Jan 07, 2022

mainline-inclusion
from mainline-v5.13-rc4
commit 85546500
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/xfs?h=v5.16-rc4&id=8554650003b8a66f3dd357692ab73101d088d938

----------------------------------------------------------------------

Create a couple of convenience wrappers for creating and deleting quota
block reservations against future changes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Lihong Kou <koulihong@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

86d7a720

xfs: clean up quota reservation callsites · ba803695

Committed by Darrick J. Wong on Jan 07, 2022

mainline-inclusion
from mainline-v5.13-rc4
commit 4abe21ad
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/xfs?h=v5.16-rc4&id=4abe21ad67a7b9dc6844f55e91a6e3ef81879d42

----------------------------------------------------------------------

Convert a few xfs_trans_*reserve* callsites that are open-coding other
convenience functions.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Lihong Kou <koulihong@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

ba803695

xfs: reduce quota reservation when doing a dax unwritten extent conversion · 8d6f9ca3

Committed by Darrick J. Wong on Jan 07, 2022

mainline-inclusion
from mainline-v5.13-rc4
commit b8055ed6
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4KIAO
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/xfs?h=v5.16-rc4&id=b8055ed6779d675e30f019ba3b7141848a4d6558

----------------------------------------------------------------------
In commit 3b0fe478, we reduced the free space requirement to
perform a pre-write unwritten extent conversion on an S_DAX file.  Since
we're not actually allocating any space, the logic goes, we only need
enough reservation to handle shape changes in the bmbt.

The same logic should have been applied to quota -- we're not allocating
any space, so we only need to reserve enough quota to handle the bmbt
shape changes.

Fixes: 3b0fe478 ("xfs: Don't use reserved blocks for data blocks with DAX")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Lihong Kou <koulihong@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

8d6f9ca3

scsi:spraid: use bsg module to replace with ioctrl · f45e5836

Committed by Yanling Song on Jan 07, 2022

Ramaxel inclusion
category: features
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JXCG
CVE: NA

Changes:
1. Use the bsg module instead of ioctl.
2. Split scmd_tmout_nonpt into two parameters:
   scmd_tmout_vd/scmd_tmout_rawdisk
3. Return -ETIME instead of -EINVAL when a command times out.
4. Add one module parameter: max_io_force.
5. Remove some unnecessary module parameters.
6. Report disks in order of channel/target id.
7. Add host_reset handler.
8. Use get_unaligned_be24.

Signed-off-by: Yanling Song <songyl@ramaxel.com>
Reviewed-by: Jiang Yu <yujiang@ramaxel.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

f45e5836

KVM: vmx/pmu: Fix dummy check if lbr_desc->event is created · 9d122be4

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 67b45af9
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=67b45af946ec3148b64e6a3a1ee2ea8f79c5bc07

-------------------

If lbr_desc->event is successfully created,
intel_pmu_create_guest_lbr_event() will return 0, otherwise it will
return -ENOENT, and then jump to LBR msrs dummy handling.

Fixes: 1b5ac322 ("KVM: vmx/pmu: Pass-through LBR msrs when the guest LBR event is ACTIVE")
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210223013958.1280444-1-like.xu@linux.intel.com>
[Add "< 0" and PTR_ERR to make the code clearer. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

9d122be4

KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES · 1f3b1e63

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit be635e34
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=be635e34c284d08b1da7f93ddd6a2110617d15e7

-------------------

Userspace could enable guest LBR feature when the exactly supported
LBR format value is initialized to the MSR_IA32_PERF_CAPABILITIES
and the LBR is also compatible with vPMU version and host cpu model.

The LBR could be enabled on the guest if host perf supports LBR
(checked via x86_perf_get_lbr()) and the vcpu model is compatible
with the host one.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210201051039.255478-11-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

1f3b1e63

KVM: vmx/pmu: Release guest LBR event via lazy release mechanism · 04876382

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 9aa4f622
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9aa4f622460f9287e57804dbeb219bfef29f04a1

-------------------

The vPMU uses GUEST_LBR_IN_USE_IDX (bit 58) in 'pmu->pmc_in_use' to
indicate whether a guest LBR event is still needed by the vcpu. If the
vcpu no longer accesses LBR related registers within a scheduling time
slice, and the enable bit of LBR has been unset, vPMU will treat the
guest LBR event as a bland event of a vPMC counter and release it
as usual. Also, the pass-through state of LBR records msrs is cancelled.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210201051039.255478-10-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

04876382
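
The lazy release rule described in the commit above can be illustrated with a small user-space sketch. Only the bit index 58 (GUEST_LBR_IN_USE_IDX) and the field name pmc_in_use come from the commit message; the struct, the helper names, and the idea of calling the check once per time slice are assumptions made for illustration and are not the kernel's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

#define GUEST_LBR_IN_USE_IDX 58 /* bit index named in the commit message */

/* Hypothetical stand-in for the per-vcpu PMU state. */
struct vpmu_sketch {
    uint64_t pmc_in_use;   /* bit 58 is set when the guest touches LBR MSRs */
    bool lbr_enabled;      /* mirrors the guest's LBR enable bit */
    bool lbr_event_alive;  /* whether a guest LBR event currently exists */
};

/* Record that the guest accessed an LBR register in this time slice. */
static void mark_lbr_in_use(struct vpmu_sketch *p)
{
    p->pmc_in_use |= 1ULL << GUEST_LBR_IN_USE_IDX;
}

/*
 * Assumed to run once per scheduling time slice. Releases the event only
 * if it was not used in the slice and LBR is disabled. Returns true when
 * the event was released.
 */
static bool lazy_release_lbr(struct vpmu_sketch *p)
{
    bool used = (p->pmc_in_use & (1ULL << GUEST_LBR_IN_USE_IDX)) != 0;
    bool released = false;

    if (p->lbr_event_alive && !used && !p->lbr_enabled) {
        p->lbr_event_alive = false;
        released = true;
    }
    /* Clear the bit so the next slice starts a fresh observation window. */
    p->pmc_in_use &= ~(1ULL << GUEST_LBR_IN_USE_IDX);
    return released;
}
```

In this sketch an event survives one slice after its last use and is dropped at the end of the first slice in which it stays untouched with LBR disabled.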

KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI · 82d2ccbb

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit e6209a3b
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e6209a3bef793e8fe29c873a7612023916eaa611

-------------------

The current vPMU only supports Architecture Version 2. According to
Intel SDM "17.4.7 Freezing LBR and Performance Counters on PMI", if
IA32_DEBUGCTL.Freeze_LBR_On_PMI = 1, the LBR is frozen on the virtual
PMI and KVM emulates this by clearing the LBR bit (bit 0) in
IA32_DEBUGCTL. Also, the guest needs to re-enable IA32_DEBUGCTL.LBR
to resume recording branches.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-9-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

82d2ccbb
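
As a rough illustration of the freeze behavior described in the commit above, the following sketch computes the guest's IA32_DEBUGCTL value after a virtual PMI. The bit positions (bit 0 for LBR, bit 11 for Freeze_LBR_On_PMI) follow the Intel SDM; the macro and function names are hypothetical and chosen for this example only.

```c
#include <stdint.h>

/* Bit positions in IA32_DEBUGCTL as documented in the Intel SDM. */
#define SKETCH_DEBUGCTL_LBR               (1ULL << 0)  /* LBR enable */
#define SKETCH_DEBUGCTL_FREEZE_LBR_ON_PMI (1ULL << 11) /* Freeze_LBR_On_PMI */

/*
 * Hypothetical helper: returns the IA32_DEBUGCTL value the guest would
 * observe after a virtual PMI has been delivered.
 */
static uint64_t debugctl_after_virtual_pmi(uint64_t debugctl)
{
    /* Legacy freeze: the LBR enable bit is cleared when freezing is on. */
    if (debugctl & SKETCH_DEBUGCTL_FREEZE_LBR_ON_PMI)
        debugctl &= ~SKETCH_DEBUGCTL_LBR;
    return debugctl;
}
```

After such a PMI the guest has to set bit 0 again before branch recording resumes, which matches the last sentence of the commit message.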

KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation · 58632199

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 9254beaa
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9254beaafd12e27d48149fab3b16db372bc90ad7

-------------------

When the LBR records msrs have already been passed through, there is no
need to call vmx_update_intercept_for_lbr_msrs() again and again, and
vice versa.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-8-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

58632199
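
The optimization in the commit above amounts to remembering the current interception state and skipping redundant updates. A minimal sketch, assuming a cached boolean flag; the struct, field, and function names below are hypothetical, and update_intercept() is only a stand-in for the expensive vmx_update_intercept_for_lbr_msrs() call.

```c
#include <stdbool.h>

/* Hypothetical per-vcpu LBR descriptor holding the cached state. */
struct lbr_desc_sketch {
    bool msr_passthrough; /* true while LBR MSRs are passed through */
    int updates;          /* counts how often the expensive path ran */
};

/* Stand-in for the expensive per-MSR intercept update. */
static void update_intercept(struct lbr_desc_sketch *d, bool passthrough)
{
    d->updates++;
    d->msr_passthrough = passthrough;
}

/* Pass the LBR MSRs through unless they already are. */
static void lbr_passthrough(struct lbr_desc_sketch *d)
{
    if (d->msr_passthrough)
        return;
    update_intercept(d, true);
}

/* Re-intercept the LBR MSRs unless they are already intercepted. */
static void lbr_intercept(struct lbr_desc_sketch *d)
{
    if (!d->msr_passthrough)
        return;
    update_intercept(d, false);
}
```

Calling either helper repeatedly with no state change in between runs the expensive update only once.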

KVM: vmx/pmu: Pass-through LBR msrs when the guest LBR event is ACTIVE · e6aea025

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 1b5ac322
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1b5ac3226a1aa071135fe0ee5d1055d9e88b717c

-------------------

In addition to DEBUGCTLMSR_LBR, any KVM trap caused by LBR msrs access
will result in a creation of guest LBR event per-vcpu.

If the guest LBR event is scheduled on with the corresponding vcpu context,
KVM will pass-through all LBR records msrs to the guest. The LBR callstack
mechanism implemented in the host could help save/restore the guest LBR
records during the event context switches, which reduces a lot of overhead
if we save/restore tens of LBR msrs (e.g. 32 LBR records entries) in the
much more frequent VMX transitions.

To avoid reclaiming LBR resources from any higher priority event on host,
KVM would always check the existence of the guest LBR event and its state before
vm-entry as late as possible. A negative result would cancel the
pass-through state, and it also prevents real registers accesses and
potential data leakage. If host reclaims the LBR between two checks, the
interception state and LBR records can be safely preserved due to native
save/restore support from guest LBR event.

KVM emits a pr_warn() when the LBR hardware is unavailable to the
guest LBR event. The administrator is supposed to remind users that the
guest result may be inaccurate if someone is using LBR to record the
hypervisor on the host side.

Suggested-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-7-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

e6aea025

KVM: vmx/pmu: Create a guest LBR event when vcpu sets DEBUGCTLMSR_LBR · 157add0e

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 8e12911b
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8e12911b243e485f5e4c7c5fbc79cdf185728700

-------------------

When vcpu sets DEBUGCTLMSR_LBR in the MSR_IA32_DEBUGCTLMSR, the KVM handler
would create a guest LBR event which enables the callstack mode and none of
hardware counter is assigned. The host perf would schedule and enable this
event as usual but in an exclusive way.

The guest LBR event will be released when the vPMU is reset but soon,
the lazy release mechanism would be applied to this event like a vPMC.

Suggested-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-6-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

157add0e

KVM: vmx/pmu: Add PMU_CAP_LBR_FMT check when guest LBR is enabled · 1c78a8dc

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit c6462363
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c646236344e9054cc84cd5a9f763163b9654cf7e

-------------------

Userspace could set the bits [0, 5] of the IA32_PERF_CAPABILITIES
MSR which tells about the record format stored in the LBR records.

The LBR will be enabled on the guest if host perf supports LBR
(checked via x86_perf_get_lbr()) and the vcpu model is compatible
with the host one.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210201051039.255478-4-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

1c78a8dc
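
To make the bits [0, 5] check in the commit above concrete, here is a small sketch that extracts the LBR format field from an IA32_PERF_CAPABILITIES value and compares it with the host's. The mask covers bits 0 through 5 as stated in the commit message. The acceptance rule (the guest value must either be zero or equal the host format) and all names below are assumptions for illustration; the vcpu model compatibility check mentioned in the message is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits [0, 5] of IA32_PERF_CAPABILITIES hold the LBR record format. */
#define SKETCH_PMU_CAP_LBR_FMT 0x3fULL

/*
 * Hypothetical check: accept the guest's requested capabilities if it
 * does not ask for LBR at all (format 0) or asks for exactly the
 * format the host supports.
 */
static bool guest_lbr_fmt_acceptable(uint64_t guest_perf_cap,
                                     uint64_t host_perf_cap)
{
    uint64_t guest_fmt = guest_perf_cap & SKETCH_PMU_CAP_LBR_FMT;
    uint64_t host_fmt = host_perf_cap & SKETCH_PMU_CAP_LBR_FMT;

    return guest_fmt == 0 || guest_fmt == host_fmt;
}
```

Bits outside the mask are ignored by this helper, so it only answers the LBR format question and nothing else about the MSR value.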

KVM: vmx/pmu: Add PMU_CAP_LBR_FMT check when guest LBR is enabled · 81d99ef6

Committed by Paolo Bonzini on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 9c9520ce
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9c9520ce883386dc3794c7d60204487ff1db09cb

-------------------

Userspace could set the bits [0, 5] of the IA32_PERF_CAPABILITIES
MSR which tells about the record format stored in the LBR records.

The LBR will be enabled on the guest if host perf supports LBR
(checked via x86_perf_get_lbr()) and the vcpu model is compatible
with the host one.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210201051039.255478-4-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

81d99ef6

KVM: x86/pmu: preserve IA32_PERF_CAPABILITIES across CPUID refresh · c22843e8

Committed by Paolo Bonzini on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit a7557539
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a755753903a40d982f6dd23d65eb96b248a2577a

-------------------

Once MSR_IA32_PERF_CAPABILITIES is changed via vmx_set_msr(), the
value should not be changed by cpuid(). To ensure that the new value
is kept, the default initialization path is moved to intel_pmu_init().
The effective value of the MSR will be 0 if PDCM is clear, however.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

c22843e8

KVM: x86/vmx: Make vmx_set_intercept_for_msr() non-static · a7beb17f

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 252e365e
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=252e365eb28ddf49eb31cec1a5d99e708c73f57b

-------------------

To make code responsibilities clear, we may reuse and invoke the
vmx_set_intercept_for_msr() in other vmx-specific files (e.g. pmu_intel.c),
so expose it to passthrough LBR msrs later.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210201051039.255478-2-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

a7beb17f

KVM: VMX: read/write MSR_IA32_DEBUGCTLMSR from GUEST_IA32_DEBUGCTL · 52622a7d

Committed by Like Xu on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit d855066f
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d855066f81726155caf766e47eea58ae10b1fd57

-------------------

SVM already has specific handlers of MSR_IA32_DEBUGCTLMSR in the
svm_get/set_msr, so the x86 common part can be safely moved to VMX.
This allows KVM to store the bits it supports in GUEST_IA32_DEBUGCTL.

Add vmx_supported_debugctl() to refactor the throwing logic of #GP.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Message-Id: <20210108013704.134985-2-like.xu@linux.intel.com>
[Merge parts of Chenyi Qiang's "KVM: X86: Expose bus lock debug exception
 to guest". - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: WangJian <wangjian161@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

52622a7d

openeuler_defconfig: Enable sharepool feature in defconfig · e24758a4

Committed by Wang Wensheng on Jan 07, 2022

ascend inclusion
category: Feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW
CVE: NA

-------------------

Add these configs to openeuler_defconfig:
CONFIG_ASCEND_CHARGE_MIGRATE_HUGEPAGES=y
CONFIG_ASCEND_SHARE_POOL=y

Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

e24758a4

net/spnic:The reset command flags modification. · 09b1db85

Committed by Yanling Song on Jan 07, 2022

Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OENF
CVE: NA

Rearrange the order and add some flag bits for future use.

Signed-off-by: Yanling Song <songyl@ramaxel.com>
Reviewed-by: Yang Gan <yanggan@ramaxel.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

09b1db85

net/spnic:Attribute negotiation and optimization. · 1ac36fde

Committed by Yanling Song on Jan 07, 2022

Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OENF
CVE: NA

Optimize attribute negotiation between the driver and the firmware.
After obtaining the attributes from the firmware, the driver intersects
them with the attributes it supports.

Signed-off-by: Yanling Song <songyl@ramaxel.com>
Reviewed-by: Yang Gan <yanggan@ramaxel.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

1ac36fde
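
The negotiation described in the commit above is an intersection of two attribute sets. Assuming the attributes are represented as bit flags (the commit message does not say how spnic encodes them), it could look like the sketch below. The flag names and the function are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical attribute bits, for illustration only. */
#define SKETCH_ATTR_CSUM (1ULL << 0)
#define SKETCH_ATTR_TSO  (1ULL << 1)
#define SKETCH_ATTR_RSS  (1ULL << 2)
#define SKETCH_ATTR_LRO  (1ULL << 3)

/*
 * Keep only the attributes that both the firmware reports and the
 * driver supports.
 */
static uint64_t negotiate_attrs(uint64_t fw_attrs, uint64_t drv_attrs)
{
    return fw_attrs & drv_attrs;
}
```

With this representation, an attribute that only one side knows about is dropped silently, so newer firmware and older drivers can still agree on a common subset.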

net/spnic:RSS initialization process optimization · 1bd29bc5

Committed by Yanling Song on Jan 07, 2022

Ramaxel inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OENF
CVE: NA

Firmware is now responsible for allocating and releasing the RSS module
ID, and the driver no longer tracks it.

Signed-off-by: Yanling Song <songyl@ramaxel.com>
Reviewed-by: Yang Gan <yanggan@ramaxel.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

1bd29bc5

arm64: Fix conflict for capability when cpu hotplug · edaf6cd4

Committed by Weilong Chen on Jan 07, 2022

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4LGV4
CVE: NA

---------------------------

Patch "cache: Workaround HiSilicon Taishan DC CVAU" breaks the
verification of cpu capabilities when hot-plugging cpus: it turns the
system-scope capability on while the local cpu capability stays off.
This patch fixes it in two steps:
1. Unset the CTR_IDC_SHIFT bit in strict_mask to skip the check.
2. Add special treatment in read_cpuid_effective_cachetype.

Signed-off-by: Weilong Chen <chenweilong@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

edaf6cd4

memcg: Add static key for memcg kswapd · 84355bcc

Committed by Lu Jialin on Jan 07, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue
CVE: NA

--------

This patch adds a default-false static key to disable the memcg kswapd
feature. Users can enable it by setting memcg_kswapd on the kernel
command line.

Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: weiyang wang <wangweiyang2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

84355bcc

memcg: make memcg kswapd deal with dirty · 70d020ae

Committed by Lu Jialin on Jan 07, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue
CVE: NA

--------

The memcg kswapd can set the dirty state on the memcg if the current
scan finds that all pages in the memcg are unqueued dirty. kswapd would
then write out dirty pages.

Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: weiyang wang <wangweiyang2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

70d020ae

memcg: support memcg sync reclaim work as kswapd · 1496d67c

Committed by Lu Jialin on Jan 07, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue
CVE: NA

--------

Since memory.high reclaim is synchronous regardless of whether it runs
in interrupt context, it could do more work than direct reclaim, e.g.
write out dirty pages.

So, add PF_KSWAPD flag, so that current_is_kswapd() would return true
for memcg kswapd.

Memcg kswapd should stop when the usage of the memcg drops to the memcg
kswapd stop flag. When userland sets memcg->memory.max, the stop_flag is
(memcg->memory.high - memcg->memory.max * 10 / 1000), which is similar
to global kswapd. Otherwise, the stop_flag is (memcg->memory.high -
memcg->memory.high / 6), which is similar to the largest difference
between watermark_low and watermark_high.

And, memcg kswapd should not break memory.low protection for now.

Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: weiyang wang <wangweiyang2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

1496d67c
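
The two stop_flag formulas quoted in the commit above can be written out as a small helper. The arithmetic follows the commit message; the function name, the max_is_set parameter, and the clamp to zero on underflow are assumptions added for this sketch and are not taken from the patch. Values are treated as page counts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Subtract without wrapping below zero (an addition of this sketch). */
static uint64_t sub_saturating(uint64_t a, uint64_t b)
{
    return a > b ? a - b : 0;
}

/*
 * Hypothetical helper returning the usage at which memcg kswapd stops.
 *   max set by userland: high - max * 10 / 1000
 *   otherwise:           high - high / 6
 * Overflow of max * 10 is ignored here for simplicity.
 */
static uint64_t memcg_kswapd_stop_flag(uint64_t high, uint64_t max,
                                       bool max_is_set)
{
    if (max_is_set)
        return sub_saturating(high, max * 10 / 1000);
    return sub_saturating(high, high / 6);
}
```

For example, with high at 6000 pages and no max configured, this sketch stops reclaim at 5000 pages; with max set to 10000 pages it stops at 5900 pages.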

memcg: Export memcg.high from cgroupv2 to cgroupv1 · 6a7b3e98

Committed by Lu Jialin on Jan 07, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue
CVE: NA

--------

Export memory.high from cgroupv2 to cgroupv1. Therefore, when the usage
of the memcg is larger than memory.high, some pages will be reclaimed
before return to userland, which will throttle the process.

Only export the memory.high number in mem_cgroup_legacy_files and move
related functions in front of mem_cgroup_legacy_files. No other changes
are needed.

Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: weiyang wang <wangweiyang2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

6a7b3e98

memcg: Export memcg.{min/low} from cgroupv2 to cgroupv1 · 27c047f4

Committed by Lu Jialin on Jan 07, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue
CVE: NA

--------

Export memcg.min and memcg.low from cgroupv2 to cgroupv1, in order to reduce
the negative impact between cgroups when the system memory is insufficient.

Only export the memory.{min/low} numbers in mem_cgroup_legacy_files and
move related functions in front of mem_cgroup_legacy_files. No other
changes are needed.

Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: weiyang wang <wangweiyang2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

27c047f4

kabi: Add reserved page and gfp flags for future extension · afdf2a6c

Committed by Peng Liang on Jan 07, 2022

euleros inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OELI
CVE: NA

-------------------------------------------------

24 page flags are used in 32-bit architectures and 27 page flags are
used in 64-bit ones currently.  And 23 gfp flags are used currently.
Add 2 reserved page and gfp flags for internal extension.  For new
flags that are backported from the upstream kernel, place them behind
the reserved flags.

Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

afdf2a6c

kabi: reserve space for cgroup_bpf_attach_type and bpf_cgroup_storage_type · 7623fa5d

Committed by Lu Jialin on Jan 07, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue
CVE: NA

--------

We reserve some fields beforehand for cgroup_bpf_attach_type and
bpf_cgroup_storage_type, which are prone to change, so that we can hot
add/change features of bpf cgroup with this enhancement.

After reserving, normally cache does not matter as the reserved fields
are not accessed at all.

Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: weiyang wang <wangweiyang2@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

7623fa5d

bpf: Migrate cgroup_bpf to internal cgroup_bpf_attach_type enum · fac4c1ea

Committed by Dave Marchevsky on Jan 07, 2022

mainline inclusion
from mainline-v5.15-rc1
commit 6fc88c35
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue
CVE: NA

----------

Add an enum (cgroup_bpf_attach_type) containing only valid cgroup_bpf
attach types and a function to map bpf_attach_type values to the new
enum. Inspired by netns_bpf_attach_type.

Then, migrate cgroup_bpf to use cgroup_bpf_attach_type wherever
possible.  Functionality is unchanged as attach_type_to_prog_type
switches in bpf/syscall.c were preventing non-cgroup programs from
making use of the invalid cgroup_bpf array slots.

As a result struct cgroup_bpf uses 504 fewer bytes relative to when its
arrays were sized using MAX_BPF_ATTACH_TYPE.

bpf_cgroup_storage is notably not migrated as struct
bpf_cgroup_storage_key is part of uapi and contains a bpf_attach_type
member which is not meant to be opaque. Similarly, bpf_cgroup_link
continues to report its bpf_attach_type member to userspace via fdinfo
and bpf_link_info.

To ease disambiguation, bpf_attach_type variables are renamed from
'type' to 'atype' when changed to cgroup_bpf_attach_type.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210819092420.1984861-2-davemarchevsky@fb.com
Conflicts:
	include/linux/bpf-cgroup.h
	kernel/bpf/cgroup.c
	net/ipv4/af_inet.c
	net/ipv6/af_inet6.c
Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

fac4c1ea
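
The idea behind the commit above, a compact enum with only the cgroup-relevant attach types plus a mapping function from the full enum, can be sketched as follows. The enums here are deliberately tiny and use made-up names; they are not the kernel's bpf_attach_type or cgroup_bpf_attach_type definitions, and the mapping function is a hypothetical stand-in.

```c
#include <stddef.h>

/* Made-up "full" attach type enum mixing cgroup and non-cgroup hooks. */
enum attach_type_sketch {
    ATTACH_CGROUP_INET_INGRESS,
    ATTACH_CGROUP_INET_EGRESS,
    ATTACH_SK_SKB_STREAM_PARSER, /* not a cgroup hook */
    ATTACH_CGROUP_DEVICE,
    MAX_ATTACH_TYPE_SKETCH
};

/* Compact enum that holds only the cgroup hooks. */
enum cgroup_attach_type_sketch {
    CG_ATTACH_INVALID = -1,
    CG_INET_INGRESS = 0,
    CG_INET_EGRESS,
    CG_DEVICE,
    MAX_CG_ATTACH_TYPE_SKETCH
};

/* Map a full attach type to its compact cgroup index, or INVALID. */
static enum cgroup_attach_type_sketch
to_cgroup_attach_type(enum attach_type_sketch t)
{
    switch (t) {
    case ATTACH_CGROUP_INET_INGRESS: return CG_INET_INGRESS;
    case ATTACH_CGROUP_INET_EGRESS:  return CG_INET_EGRESS;
    case ATTACH_CGROUP_DEVICE:       return CG_DEVICE;
    default:                         return CG_ATTACH_INVALID;
    }
}

/* Arrays sized by the compact enum waste no slots on invalid types. */
struct cgroup_bpf_sketch {
    void *progs[MAX_CG_ATTACH_TYPE_SKETCH];
};
```

Sizing per-cgroup arrays by the compact enum is what yields the memory saving the commit message reports; in this toy version the array shrinks from four slots to three.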

bpf: Split cgroup_bpf_enabled per attach type · e960ab23

Committed by Stanislav Fomichev on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit a9ed15da
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue
CVE: NA

------------

When we attach any cgroup hook, the rest (even if unused/unattached) start
to contribute small overhead. In particular, the one we want to avoid is
__cgroup_bpf_run_filter_skb which does two redirections to get to
the cgroup and pushes/pulls skb.

Let's split cgroup_bpf_enabled to be per-attach to make sure
only used attach types trigger.

I've dropped some existing high-level cgroup_bpf_enabled in some
places because BPF_PROG_CGROUP_XXX_RUN macros usually have another
cgroup_bpf_enabled check.

I also had to copy-paste BPF_CGROUP_RUN_SA_PROG_LOCK for
GETPEERNAME/GETSOCKNAME because type for cgroup_bpf_enabled[type]
has to be constant and known at compile time.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210115163501.805133-4-sdf@google.com
Conflict:
	include/linux/bpf-cgroup.h
Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: weiyang wang <wangweiyang2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

e960ab23
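
A user-space sketch of the per-attach-type split described in the commit above: instead of one global "any cgroup hook attached" switch, each hook gets its own counter, so an unattached hook stays on the fast path. The kernel uses static keys for this; the plain counters, the enum, and the function names below are simplifications and assumptions made for illustration.

```c
/* Made-up hook identifiers standing in for attach types. */
enum hook_sketch { HOOK_INGRESS, HOOK_EGRESS, HOOK_GETSOCKOPT, HOOK_MAX };

/* Number of programs attached per hook (stand-in for static keys). */
static int hook_attached[HOOK_MAX];

/* Counts how often the expensive filter path was entered. */
static int slow_path_calls;

#define hook_enabled(h) (hook_attached[(h)] > 0)

/* Attach one program to a single hook. */
static void attach_hook(enum hook_sketch h)
{
    hook_attached[h]++;
}

/* Returns 1 if the slow path ran, 0 if the hook was skipped. */
static int run_hook(enum hook_sketch h)
{
    if (!hook_enabled(h))
        return 0; /* fast path: nothing attached to this hook */
    slow_path_calls++;
    return 1;
}
```

With a single global switch, attaching to the getsockopt hook would also send ingress traffic down the slow path; with per-hook state only the attached hook pays the cost.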

bpf: Try to avoid kzalloc in cgroup/{s,g}etsockopt · 5bf7da9d

Committed by Stanislav Fomichev on Jan 07, 2022

mainline inclusion
from mainline-v5.12-rc1
commit 20f2505f
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue
CVE: NA

--------------

When we attach a bpf program to cgroup/getsockopt any other getsockopt()
syscall starts incurring kzalloc/kfree cost.

Let's add a small buffer on the stack and use it for small (majority)
{s,g}etsockopt values. The buffer is small enough to fit into
the cache line and cover the majority of simple options (most
of them are 4 byte ints).

It seems natural to do the same for setsockopt, but it's a bit more
involved when the BPF program modifies the data (where we have to
kmalloc). The assumption is that for the majority of setsockopt
calls (which are doing pure BPF options or apply policy) this
will bring some benefit as well.

Without this patch (we remove about 1% __kmalloc):
     3.38%     0.07%  tcp_mmap  [kernel.kallsyms]  [k] __cgroup_bpf_run_filter_getsockopt
            |
             --3.30%--__cgroup_bpf_run_filter_getsockopt
                       |
                        --0.81%--__kmalloc
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210115163501.805133-3-sdf@google.com
Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: weiyang wang <wangweiyang2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

5bf7da9d
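
The small-buffer approach from the commit above can be shown in user space: use an embedded buffer for small option values and fall back to the heap only for large ones. The 64-byte size, the struct, and the function names are assumptions for this sketch (the commit message only says the buffer fits in a cache line), and calloc/free stand in for kzalloc/kfree.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SOCKOPT_SMALL_BUF 64 /* assumed size, one typical cache line */

/* Hypothetical buffer descriptor with embedded small storage. */
struct sockopt_buf_sketch {
    unsigned char small[SOCKOPT_SMALL_BUF];
    void *data;  /* points at small[] or at a heap allocation */
    int on_heap; /* nonzero when data must be freed */
};

/* Prepare a zeroed buffer of len bytes; returns 0 on success. */
static int sockopt_buf_init(struct sockopt_buf_sketch *b, size_t len)
{
    if (len <= sizeof(b->small)) {
        /* Common case: most options are 4-byte ints, no allocation. */
        memset(b->small, 0, sizeof(b->small));
        b->data = b->small;
        b->on_heap = 0;
        return 0;
    }
    b->data = calloc(1, len);
    if (!b->data)
        return -1;
    b->on_heap = 1;
    return 0;
}

/* Release the heap allocation, if there was one. */
static void sockopt_buf_free(struct sockopt_buf_sketch *b)
{
    if (b->on_heap)
        free(b->data);
    b->data = NULL;
    b->on_heap = 0;
}
```

A caller would typically declare the descriptor as a local variable, so the small case costs no allocator call at all.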

bpf: Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks · b55b39ad

Committed by Stanislav Fomichev on Jan 07, 2022

mainline inclusion
from mainline-v5.11-rc1
commit 427167c0
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue
CVE: NA

---------

I have to now lock/unlock socket for the bind hook execution.
That shouldn't cause any overhead because the socket is unbound
and shouldn't receive any traffic.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20201202172516.3483656-3-sdf@google.com
Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: weiyang wang <wangweiyang2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

b55b39ad

KABI: Add KABI_AUX_PTR extenstions to some more base structures · a62969d3

Committed by Zheng Zengkai on Jan 07, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JBL0
CVE: NA

------------------------------

Add KABI_AUX_PTR extensions to the following base structures
before KABI freeze:

struct device_driver
struct class
struct device
struct hrtimer
struct ipmi_smi_handlers
struct net_device
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

a62969d3

kabi: Generalize naming of kabi helper macros · e146a64e

Committed by Zheng Zengkai on Jan 07, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4K3S5

--------------------------

Generalize naming of some kabi helper macros.
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

e146a64e

arm64: Request resources for reserved memory via memmap · 374db2be

Committed by Peng Liu on Jan 07, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NYPZ
CVE: NA

-------------------------------------------------

A new flag, MEMBLOCK_MEMMAP, is added to memblock_flags to identify
memory reserved for memmap. This flag is limited to arm64. When memmap
memory is reserved by memblock_reserve, it is subsequently marked with
the MEMBLOCK_MEMMAP flag. for_each_mem_region can therefore find memmap
memory and request resources for it.
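
The mark-then-iterate pattern described above can be modelled in plain C
roughly as follows. This is a simplified userspace sketch, not the
memblock code: struct region, REGION_MEMMAP, mark_memmap and
count_memmap_regions are hypothetical stand-ins for the memblock region
table, the MEMBLOCK_MEMMAP flag, the marking step and the
for_each_mem_region walk.

```c
#include <stddef.h>
#include <stdint.h>

enum region_flags {
	REGION_NONE   = 0x0,
	REGION_MEMMAP = 0x1, /* hypothetical stand-in for MEMBLOCK_MEMMAP */
};

struct region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
};

/* Flag every region that lies entirely inside [base, base + size). */
static void mark_memmap(struct region *regions, size_t n,
			uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < n; i++) {
		struct region *r = &regions[i];

		if (r->base >= base && r->base + r->size <= base + size)
			r->flags |= REGION_MEMMAP;
	}
}

/* Walk the table and count flagged regions; a real caller would request
 * a resource for each one instead of counting. */
static size_t count_memmap_regions(const struct region *regions, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (regions[i].flags & REGION_MEMMAP)
			count++;
	return count;
}
```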
Signed-off-by: Peng Liu <liupeng256@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

374db2be

arm64: Add support for memmap kernel parameters · d05cfbd9

Committed by Peng Liu on Jan 07, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NYPZ
CVE: NA

-------------------------------------------------

Add support for memmap kernel parameters on ARM64. The following three
modes are supported:

memmap=exactmap
Enable setting of an exact memory map, as specified by the user.

memmap=nn[KMG]@ss[KMG]
Force usage of a specific region of memory.

memmap=nn[KMG]$ss[KMG]
Reserve the region of memory from ss to ss+nn. The region must lie
within existing memory; otherwise it is ignored.

If users set memmap=exactmap before memmap=nn[KMG]@ss[KMG], they get
exactly the memory specified by memmap=nn[KMG]@ss[KMG]. For example,
on a machine with 4GB of memory, "memmap=exactmap memmap=1G@1G" makes
the kernel use only the memory from 1GB to 2GB.
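
As an illustration of the three syntaxes above, the following userspace
sketch parses a single memmap value into a kind, a start and a size. It
is not the arm64 implementation: parse_size, parse_memmap, struct
memmap_arg and the MEMMAP_* names are hypothetical, and the suffix
handling only approximates what the kernel's own size parser does.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum memmap_kind { MEMMAP_INVALID, MEMMAP_EXACTMAP, MEMMAP_USE, MEMMAP_RESERVE };

struct memmap_arg {
	enum memmap_kind kind;
	uint64_t start;
	uint64_t size;
};

/* Parse a number with an optional K, M or G suffix.
 * *end is left equal to s when no digits were consumed. */
static uint64_t parse_size(const char *s, char **end)
{
	uint64_t value = strtoull(s, end, 0);
	unsigned int shift;

	if (*end == s)
		return 0;

	switch (**end) {
	case 'K': case 'k': shift = 10; break;
	case 'M': case 'm': shift = 20; break;
	case 'G': case 'g': shift = 30; break;
	default: return value;
	}
	(*end)++;
	return value << shift;
}

/* Classify one value: "exactmap", "nn@ss" (use) or "nn$ss" (reserve). */
static struct memmap_arg parse_memmap(const char *arg)
{
	struct memmap_arg out = { MEMMAP_INVALID, 0, 0 };
	char *p;

	if (strcmp(arg, "exactmap") == 0) {
		out.kind = MEMMAP_EXACTMAP;
		return out;
	}

	uint64_t size = parse_size(arg, &p);
	if (p == arg || (*p != '@' && *p != '$'))
		return out;

	char sep = *p;
	const char *start_str = p + 1;
	uint64_t start = parse_size(start_str, &p);
	if (p == start_str || *p != '\0')
		return out;

	out.kind = (sep == '@') ? MEMMAP_USE : MEMMAP_RESERVE;
	out.start = start;
	out.size = size;
	return out;
}
```

Under these assumptions, "1G@1G" yields a MEMMAP_USE entry starting at
1GB with a size of 1GB, which matches the 1GB to 2GB range in the
example above.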
Signed-off-by: Peng Liu <liupeng256@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

d05cfbd9

06 Jan, 2022 3 commits

openeuler_defconfig: Enable CONFIG_KABI_RESERVE for x86 and arm64 · 8c34e5f6

Committed by Zheng Zengkai on Jan 06, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4K3S5

---------------------------

Enable CONFIG_KABI_RESERVE for x86 and arm64 architectures in
openeuler_defconfig by default.
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

8c34e5f6

KABI: Add CONFIG_KABI_RESERVE to control KABI padding reserve · d332bb38

Committed by Zheng Zengkai on Jan 06, 2022

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4K3S5

-----------------------

Add CONFIG_KABI_RESERVE to control whether KABI padding is reserved.
On some embedded systems, reserving KABI padding may be unnecessary.

Also change unsigned long to u64 so that the basic reserved length is
the same on 32-bit and 64-bit architectures.
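
A config-gated padding macro of this kind might look roughly like the
sketch below. It is an assumption-laden illustration rather than the
contents of kabi.h: the macro body, the kabi_reserved field naming and
the demo_ops structures are hypothetical, uint64_t stands in for the
kernel's u64, and CONFIG_KABI_RESERVE is defined by hand where a real
build would take it from the generated config.

```c
#include <stdint.h>

/* Stand-in for the Kconfig symbol (assumption for this example). */
#define CONFIG_KABI_RESERVE 1

#ifdef CONFIG_KABI_RESERVE
/* Fixed 64-bit slot, so the padding is the same size on 32-bit and 64-bit. */
#define KABI_RESERVE(n) uint64_t kabi_reserved##n;
#else
/* Padding disabled: the macro expands to nothing. */
#define KABI_RESERVE(n)
#endif

/* Original layout with two reserved slots. */
struct demo_ops_v1 {
	uint64_t flags;
	KABI_RESERVE(1)
	KABI_RESERVE(2)
};

/* Later revision: one reserved slot is repurposed, the layout is unchanged. */
struct demo_ops_v2 {
	uint64_t flags;
	uint64_t new_feature_mask; /* takes the place of kabi_reserved1 */
	KABI_RESERVE(2)
};

_Static_assert(sizeof(struct demo_ops_v1) == sizeof(struct demo_ops_v2),
	       "repurposing a reserved slot must not change the struct size");
```

Using a fixed-width type instead of unsigned long is what keeps each
slot at 8 bytes regardless of the architecture's word size.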
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>

d332bb38

KABI: Fix allmodconfig build error · 65805061

Committed by Zheng Zengkai on Jan 06, 2022

hulk inclusion
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MZU1
CVE: NA

---------------------------

On the x86 platform, running make allmodconfig followed by make -j64
reports the following two build errors.

First error:
- build failed:
- In file included from <command-line>:32:0:
./usr/include/asm/bootparam.h:45:10: fatal error: linux/kabi.h: No such file or directory
 #include <linux/kabi.h>
          ^~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [usr/include/asm/bootparam.hdrtest] Error 1

Second error:
./arch/x86/include/asm/paravirt_types.h:198:2: error: expected specifier-qualifier-list before ‘KABI_RESERVE’
  KABI_RESERVE(1)
  ^~~~~~~~~~~~
./arch/x86/include/asm/paravirt_types.h:286:2: error: expected specifier-qualifier-list before ‘KABI_RESERVE’
  KABI_RESERVE(1)
  ^~~~~~~~~~~~
./arch/x86/include/asm/paravirt_types.h:309:2: error: expected specifier-qualifier-list before ‘KABI_RESERVE’
  KABI_RESERVE(1)
  ^~~~~~~~~~~~
make[1]: *** [scripts/Makefile.build:117: arch/x86/kernel/asm-offsets.s] Error 1

To fix the first error, revert commit 3ba63bac ("bootparam: Add kabi_reserve in bootparam").
To fix the second error, include kabi.h in arch/x86/include/asm/paravirt_types.h.
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Xie Xiuqi <xiexiuqi@huawei.com>

65805061
