- 07 Jan 2022, 30 commits
-
By Like Xu
mainline inclusion from mainline-v5.12-rc1 commit 9aa4f622 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9aa4f622460f9287e57804dbeb219bfef29f04a1 ------------------- The vPMU uses GUEST_LBR_IN_USE_IDX (bit 58) in 'pmu->pmc_in_use' to indicate whether a guest LBR event is still needed by the vcpu. If the vcpu no longer accesses the LBR-related registers within a scheduling time slice, and the LBR enable bit has been unset, the vPMU treats the guest LBR event as a bland event of a vPMC counter and releases it as usual. The pass-through state of the LBR records MSRs is cancelled as well. Signed-off-by: Like Xu <like.xu@linux.intel.com> Message-Id: <20210201051039.255478-10-like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: WangJian <wangjian161@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Like Xu
mainline inclusion from mainline-v5.12-rc1 commit e6209a3b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e6209a3bef793e8fe29c873a7612023916eaa611 ------------------- The current vPMU only supports Architecture Version 2. According to Intel SDM "17.4.7 Freezing LBR and Performance Counters on PMI", if IA32_DEBUGCTL.Freeze_LBR_On_PMI = 1, the LBR is frozen on the virtual PMI, and KVM emulates this by clearing the LBR bit (bit 0) in IA32_DEBUGCTL. The guest then needs to re-enable IA32_DEBUGCTL.LBR to resume recording branches. Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Message-Id: <20210201051039.255478-9-like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: WangJian <wangjian161@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
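A minimal sketch of the emulation described above, assuming the standard VMCS accessors and the existing DEBUGCTL bit definitions; the helper name is illustrative, not necessarily the one used in the patch:

    /* On a virtual PMI with Freeze_LBR_On_PMI set, clear the LBR
     * enable bit in the guest's IA32_DEBUGCTL, as real hardware would. */
    static void freeze_lbrs_on_pmi(struct kvm_vcpu *vcpu)
    {
        u64 debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);

        if (debugctl & DEBUGCTLMSR_FREEZE_LBRS_ON_PMI) {
            debugctl &= ~DEBUGCTLMSR_LBR;   /* bit 0: LBR enable */
            vmcs_write64(GUEST_IA32_DEBUGCTL, debugctl);
        }
    }

The guest observes the cleared bit and must write DEBUGCTLMSR_LBR back to resume branch recording.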
-
By Like Xu
mainline inclusion from mainline-v5.12-rc1 commit 9254beaa category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9254beaafd12e27d48149fab3b16db372bc90ad7 ------------------- When the LBR records MSRs have already been passed through, there is no need to call vmx_update_intercept_for_lbr_msrs() again and again, and vice versa. Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Message-Id: <20210201051039.255478-8-like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: WangJian <wangjian161@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
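A guard like the following is enough; the field and helper names here are assumptions based on the description:

    static void lbr_msrs_pass_through(struct kvm_vcpu *vcpu, bool set)
    {
        struct lbr_desc *lbr = vcpu_to_lbr_desc(vcpu);

        if (lbr->msr_passthrough == set)    /* already in desired state */
            return;

        vmx_update_intercept_for_lbr_msrs(vcpu, set);
        lbr->msr_passthrough = set;
    }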
-
By Like Xu
mainline inclusion from mainline-v5.12-rc1 commit 1b5ac322 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1b5ac3226a1aa071135fe0ee5d1055d9e88b717c ------------------- In addition to DEBUGCTLMSR_LBR, any KVM trap caused by an access to the LBR MSRs results in the creation of a per-vcpu guest LBR event. If the guest LBR event is scheduled in with the corresponding vcpu context, KVM passes through all LBR records MSRs to the guest. The LBR callstack mechanism implemented in the host helps save/restore the guest LBR records during the event context switches, which saves a lot of overhead compared with saving/restoring tens of LBR MSRs (e.g. 32 LBR records entries) in the much more frequent VMX transitions. To avoid reclaiming LBR resources from any higher-priority event on the host, KVM always checks the existence and state of the guest LBR event before vm-entry, as late as possible. A negative result cancels the pass-through state, and it also prevents real register accesses and potential data leakage. If the host reclaims the LBR between two checks, the interception state and LBR records can be safely preserved thanks to the native save/restore support of the guest LBR event. KVM emits a pr_warn() when the LBR hardware is unavailable to the guest LBR event. The administrator is supposed to remind users that the guest result may be inaccurate if someone is using LBR to record the hypervisor on the host side. Suggested-by: Andi Kleen <ak@linux.intel.com> Co-developed-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Message-Id: <20210201051039.255478-7-like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: WangJian <wangjian161@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Like Xu
mainline inclusion from mainline-v5.12-rc1 commit 8e12911b category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8e12911b243e485f5e4c7c5fbc79cdf185728700 ------------------- When the vcpu sets DEBUGCTLMSR_LBR in MSR_IA32_DEBUGCTLMSR, the KVM handler creates a guest LBR event which enables the callstack mode; no hardware counter is assigned to it. The host perf schedules and enables this event as usual, but in an exclusive way. The guest LBR event is released when the vPMU is reset; soon, the lazy release mechanism will be applied to this event just like to a vPMC. Suggested-by: Andi Kleen <ak@linux.intel.com> Co-developed-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Message-Id: <20210201051039.255478-6-like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: WangJian <wangjian161@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
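Roughly, such an event can be requested through the ordinary perf kernel-counter API; a sketch under the assumption that the fixed VLBR config and the callstack branch-sample flags shown here are the relevant ones:

    struct perf_event_attr attr = {
        .type   = PERF_TYPE_RAW,
        .size   = sizeof(attr),
        .pinned = true,                     /* scheduled exclusively */
        .exclude_host = true,
        .config = INTEL_FIXED_VLBR_EVENT,   /* no generic counter used */
        .sample_type = PERF_SAMPLE_BRANCH_STACK,
        .branch_sample_type = PERF_SAMPLE_BRANCH_CALL_STACK |
                              PERF_SAMPLE_BRANCH_USER,
    };
    struct perf_event *event =
        perf_event_create_kernel_counter(&attr, -1, current, NULL, NULL);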
-
By Like Xu
mainline inclusion from mainline-v5.12-rc1 commit c6462363 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c646236344e9054cc84cd5a9f763163b9654cf7e ------------------- Userspace could set bits [0, 5] of the IA32_PERF_CAPABILITIES MSR, which describe the record format stored in the LBR records. LBR will be enabled on the guest if host perf supports LBR (checked via x86_perf_get_lbr()) and the vcpu model is compatible with the host one. Signed-off-by: Like Xu <like.xu@linux.intel.com> Message-Id: <20210201051039.255478-4-like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: WangJian <wangjian161@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Paolo Bonzini
mainline inclusion from mainline-v5.12-rc1 commit 9c9520ce category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9c9520ce883386dc3794c7d60204487ff1db09cb ------------------- Userspace could set bits [0, 5] of the IA32_PERF_CAPABILITIES MSR, which describe the record format stored in the LBR records. LBR will be enabled on the guest if host perf supports LBR (checked via x86_perf_get_lbr()) and the vcpu model is compatible with the host one. Signed-off-by: Like Xu <like.xu@linux.intel.com> Message-Id: <20210201051039.255478-4-like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: WangJian <wangjian161@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Paolo Bonzini
mainline inclusion from mainline-v5.12-rc1 commit a7557539 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a755753903a40d982f6dd23d65eb96b248a2577a ------------------- Once MSR_IA32_PERF_CAPABILITIES is changed via vmx_set_msr(), the value should not be changed by cpuid(). To ensure that the new value is kept, the default initialization path is moved to intel_pmu_init(). The effective value of the MSR will be 0 if PDCM is clear, however. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: WangJian <wangjian161@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Like Xu
mainline inclusion from mainline-v5.12-rc1 commit 252e365e category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=252e365eb28ddf49eb31cec1a5d99e708c73f57b ------------------- To make code responsibilities clear, we may reuse and invoke vmx_set_intercept_for_msr() in other VMX-specific files (e.g. pmu_intel.c), so expose it in order to pass through the LBR MSRs later. Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Message-Id: <20210201051039.255478-2-like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: WangJian <wangjian161@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Like Xu
mainline inclusion from mainline-v5.12-rc1 commit d855066f category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d855066f81726155caf766e47eea58ae10b1fd57 ------------------- SVM already has specific handlers for MSR_IA32_DEBUGCTLMSR in svm_get/set_msr, so the x86 common part can be safely moved to VMX. This allows KVM to store the bits it supports in GUEST_IA32_DEBUGCTL. Add vmx_supported_debugctl() to refactor the #GP injection logic. Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Message-Id: <20210108013704.134985-2-like.xu@linux.intel.com> [Merge parts of Chenyi Qiang's "KVM: X86: Expose bus lock debug exception to guest". - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: WangJian <wangjian161@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Wang Wensheng
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- Add these configs to openeuler_defconfig: CONFIG_ASCEND_CHARGE_MIGRATE_HUGEPAGES=y CONFIG_ASCEND_SHARE_POOL=y Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com> Reviewed-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Yanling Song
Ramaxel inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4OENF CVE: NA Rearrange the order and add some flag bits for future use. Signed-off-by: Yanling Song <songyl@ramaxel.com> Reviewed-by: Yang Gan <yanggan@ramaxel.com> Acked-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Yanling Song
Ramaxel inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4OENF CVE: NA Optimize the negotiation of driver and firmware properties: after obtaining the attributes from the firmware, intersect them with the attributes supported by the driver. Signed-off-by: Yanling Song <songyl@ramaxel.com> Reviewed-by: Yang Gan <yanggan@ramaxel.com> Acked-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Yanling Song
Ramaxel inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4OENF CVE: NA The firmware is responsible for allocating and releasing the RSS module ID, and the driver no longer perceives it. Signed-off-by: Yanling Song <songyl@ramaxel.com> Reviewed-by: Yang Gan <yanggan@ramaxel.com> Acked-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Weilong Chen
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4LGV4 CVE: NA --------------------------- Patch "cache: Workaround HiSilicon Taishan DC CVAU" breaks the verification of CPU capability when hot-plugging CPUs: it turns the system-scope capability on while the local CPU capability is still off. This patch fixes it in two steps: 1. Unset the CTR_IDC_SHIFT bit from strict_mask to skip the check. 2. Treat it specially in read_cpuid_effective_cachetype. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Lu Jialin
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue CVE: NA -------- This patch adds a default-false static key to disable the memcg kswapd feature. Users can enable it by setting memcg_kswapd on the kernel command line. Signed-off-by: Lu Jialin <lujialin4@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
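A minimal sketch of such a default-false key plus its command-line switch; the key and handler names are assumptions, only the pattern is the point:

    #include <linux/init.h>
    #include <linux/jump_label.h>

    DEFINE_STATIC_KEY_FALSE(memcg_kswapd_key);

    static int __init enable_memcg_kswapd(char *s)
    {
        static_branch_enable(&memcg_kswapd_key);
        return 1;
    }
    __setup("memcg_kswapd", enable_memcg_kswapd);

    /* Fast path: compiles to a nop until the key is enabled at boot. */
    static inline bool memcg_kswapd_enabled(void)
    {
        return static_branch_unlikely(&memcg_kswapd_key);
    }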
-
By Lu Jialin
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue CVE: NA -------- The memcg kswapd can mark the memcg dirty if the current scan finds that all pages in the memcg are unqueued dirty; kswapd then writes out the dirty pages. Signed-off-by: Lu Jialin <lujialin4@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Lu Jialin
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue CVE: NA -------- Since memory.high reclaim is synchronous regardless of interrupt context, it can do more work than direct reclaim, e.g. writing out dirty pages. So add the PF_KSWAPD flag, so that current_is_kswapd() returns true for the memcg kswapd. The memcg kswapd should stop when the usage of the memcg reaches the memcg kswapd stop threshold. When userland sets memcg->memory.max, the stop threshold is (memcg->memory.high - memcg->memory.max * 10 / 1000), similar to the global kswapd. Otherwise, the stop threshold is (memcg->memory.high - memcg->memory.high / 6), which resembles the typical gap between watermark_low and watermark_high. Also, the memcg kswapd should not break memory.low protection for now. Signed-off-by: Lu Jialin <lujialin4@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
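Expressed as code, the stop threshold described above would look roughly like this; a sketch only, with field names taken from the description rather than the actual patch:

    unsigned long high = READ_ONCE(memcg->memory.high);
    unsigned long max  = READ_ONCE(memcg->memory.max);
    unsigned long stop;

    if (max != PAGE_COUNTER_MAX)        /* userland set memory.max */
        stop = high - max * 10 / 1000;  /* ~1% of max below high */
    else                                /* like the low/high watermark gap */
        stop = high - high / 6;

So kswapd keeps reclaiming until usage drops below `stop`, leaving a safety gap under memory.high.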
-
By Lu Jialin
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue CVE: NA -------- Export memory.high from cgroup v2 to cgroup v1. Therefore, when the usage of the memcg is larger than memory.high, some pages will be reclaimed before returning to userland, which throttles the process. Only export the memory.high number in mem_cgroup_legacy_files and move the related functions in front of mem_cgroup_legacy_files. No other changes are needed. Signed-off-by: Lu Jialin <lujialin4@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Lu Jialin
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue CVE: NA -------- Export memcg.min and memcg.low from cgroup v2 to cgroup v1, in order to reduce the negative impact between cgroups when the system memory is insufficient. Only export the memory.{min,low} numbers in mem_cgroup_legacy_files and move the related functions in front of mem_cgroup_legacy_files. No other changes are needed. Signed-off-by: Lu Jialin <lujialin4@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Peng Liang
euleros inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4OELI CVE: NA ------------------------------------------------- Currently 24 page flags are used on 32-bit architectures and 27 on 64-bit ones, and 23 gfp flags are used. Add 2 reserved page and gfp flags for internal extension. New flags backported from upstream are placed behind the reserved flags. Signed-off-by: Peng Liang <liangpeng10@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Lu Jialin
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue CVE: NA -------- We reserve some fields beforehand for cgroup_bpf_attach_type and bpf_cgroup_storage_type, which are prone to change; therefore, we can hot add/change features of bpf cgroup with this enhancement. After reserving, cache normally does not matter, as the reserved fields are not accessed at all. Signed-off-by: Lu Jialin <lujialin4@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Dave Marchevsky
mainline inclusion from mainline-v5.15-rc1 commit 6fc88c35 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue CVE: NA ---------- Add an enum (cgroup_bpf_attach_type) containing only valid cgroup_bpf attach types and a function to map bpf_attach_type values to the new enum. Inspired by netns_bpf_attach_type. Then, migrate cgroup_bpf to use cgroup_bpf_attach_type wherever possible. Functionality is unchanged, as the attach_type_to_prog_type switches in bpf/syscall.c were preventing non-cgroup programs from making use of the invalid cgroup_bpf array slots. As a result, struct cgroup_bpf uses 504 fewer bytes relative to when its arrays were sized using MAX_BPF_ATTACH_TYPE. bpf_cgroup_storage is notably not migrated, as struct bpf_cgroup_storage_key is part of uapi and contains a bpf_attach_type member which is not meant to be opaque. Similarly, bpf_cgroup_link continues to report its bpf_attach_type member to userspace via fdinfo and bpf_link_info. To ease disambiguation, bpf_attach_type variables are renamed from 'type' to 'atype' when changed to cgroup_bpf_attach_type. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210819092420.1984861-2-davemarchevsky@fb.com Conflicts: include/linux/bpf-cgroup.h kernel/bpf/cgroup.c net/ipv4/af_inet.c net/ipv6/af_inet6.c Signed-off-by: Lu Jialin <lujialin4@huawei.com> Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
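The shape of the new enum and its mapping helper, abridged here to two attach types; the full version covers every valid cgroup attach point:

    enum cgroup_bpf_attach_type {
        CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
        CGROUP_INET_INGRESS = 0,
        CGROUP_INET_EGRESS,
        /* ... one entry per valid cgroup attach point ... */
        MAX_CGROUP_BPF_ATTACH_TYPE
    };

    static inline enum cgroup_bpf_attach_type
    to_cgroup_bpf_attach_type(enum bpf_attach_type attach_type)
    {
        switch (attach_type) {
        case BPF_CGROUP_INET_INGRESS: return CGROUP_INET_INGRESS;
        case BPF_CGROUP_INET_EGRESS:  return CGROUP_INET_EGRESS;
        /* ... */
        default: return CGROUP_BPF_ATTACH_TYPE_INVALID;
        }
    }

Sizing the cgroup_bpf arrays with MAX_CGROUP_BPF_ATTACH_TYPE instead of MAX_BPF_ATTACH_TYPE is what yields the 504-byte saving quoted above.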
-
By Stanislav Fomichev
mainline inclusion from mainline-v5.12-rc1 commit a9ed15da category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue CVE: NA ------------ When we attach any cgroup hook, the rest (even if unused/unattached) start to contribute small overhead. In particular, the one we want to avoid is __cgroup_bpf_run_filter_skb, which does two redirections to get to the cgroup and pushes/pulls the skb. Let's split cgroup_bpf_enabled to be per-attach-type, to make sure only used attach types trigger. I've dropped some existing high-level cgroup_bpf_enabled checks in some places, because the BPF_PROG_CGROUP_XXX_RUN macros usually have another cgroup_bpf_enabled check. I also had to copy-paste BPF_CGROUP_RUN_SA_PROG_LOCK for GETPEERNAME/GETSOCKNAME, because the type for cgroup_bpf_enabled[type] has to be constant and known at compile time. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210115163501.805133-4-sdf@google.com Conflict: include/linux/bpf-cgroup.h Signed-off-by: Lu Jialin <lujialin4@huawei.com> Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
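The core of the change is turning the single enabled key into a per-attach-type array, roughly:

    /* before: one key guards every cgroup hook */
    #define cgroup_bpf_enabled \
        static_branch_unlikely(&cgroup_bpf_enabled_key)

    /* after: one key per attach type, so only used hooks pay the cost */
    extern struct static_key_false
        cgroup_bpf_enabled_key[MAX_BPF_ATTACH_TYPE];
    #define cgroup_bpf_enabled(type) \
        static_branch_unlikely(&cgroup_bpf_enabled_key[type])

This is also why `type` must be a compile-time constant at every call site, which forced the BPF_CGROUP_RUN_SA_PROG_LOCK copy for GETPEERNAME/GETSOCKNAME mentioned above.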
-
By Stanislav Fomichev
mainline inclusion from mainline-v5.12-rc1 commit 20f2505f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue CVE: NA -------------- When we attach a bpf program to cgroup/getsockopt, any other getsockopt() syscall starts incurring kzalloc/kfree cost. Let's add a small buffer on the stack and use it for small (the majority of) {s,g}etsockopt values. The buffer is small enough to fit into a cache line and covers the majority of simple options (most of them are 4-byte ints). It seems natural to do the same for setsockopt, but it's a bit more involved when the BPF program modifies the data (where we have to kmalloc). The assumption is that for the majority of setsockopt calls (which are doing pure BPF options or applying policy) this will bring some benefit as well. Without this patch (we remove about 1% __kmalloc): 3.38% 0.07% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt | --3.30%--__cgroup_bpf_run_filter_getsockopt | --0.81%--__kmalloc Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210115163501.805133-3-sdf@google.com Signed-off-by: Lu Jialin <lujialin4@huawei.com> Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
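The idea in miniature; the buffer size and names below follow the description (small options stay on the stack, larger ones still take the kmalloc path):

    #define BPF_SOCKOPT_KERN_BUF_SIZE 32

    struct bpf_sockopt_buf {
        u8 data[BPF_SOCKOPT_KERN_BUF_SIZE];
    };

    /* Use the caller's on-stack buffer for small options. */
    if (max_optlen <= sizeof(buf->data)) {
        ctx->optval = buf->data;
    } else {
        ctx->optval = kzalloc(max_optlen, GFP_USER);
        if (!ctx->optval)
            return -ENOMEM;
    }

Since most getsockopt values are 4-byte ints, the 32-byte buffer covers the common case without touching the allocator.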
-
By Stanislav Fomichev
mainline inclusion from mainline-v5.11-rc1 commit 427167c0 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4GII8?from=project-issue CVE: NA --------- The socket now has to be locked/unlocked around the bind hook execution. That shouldn't cause any overhead, because the socket is unbound and shouldn't receive any traffic. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrey Ignatov <rdna@fb.com> Link: https://lore.kernel.org/bpf/20201202172516.3483656-3-sdf@google.com Signed-off-by: Lu Jialin <lujialin4@huawei.com> Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Zheng Zengkai
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JBL0 CVE: NA ------------------------------ Add KABI_AUX_PTR extensions to the following base structures before the KABI freeze: struct device_driver, struct class, struct device, struct hrtimer, struct ipmi_smi_handlers, struct net_device. Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
-
By Zheng Zengkai
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4K3S5 -------------------------- Generalize the naming of some KABI helper macros. Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
-
By Peng Liu
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NYPZ CVE: NA ------------------------------------------------- A new flag, MEMBLOCK_MEMMAP, is added to memblock_flags; it identifies memory reserved for memmap. This flag is limited to arm64. When memmap memory is reserved by memblock_reserve, it is subsequently marked with the MEMBLOCK_MEMMAP flag, so that for_each_mem_region can find memmap memory and request resources for it. Signed-off-by: Peng Liu <liupeng256@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
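The flag slots into memblock_flags beside the existing ones; the exact bit value chosen here is an assumption:

    enum memblock_flags {
        MEMBLOCK_NONE    = 0x0,  /* no special request */
        MEMBLOCK_HOTPLUG = 0x1,  /* hotpluggable region */
        MEMBLOCK_MIRROR  = 0x2,  /* mirrored region */
        MEMBLOCK_NOMAP   = 0x4,  /* don't add to kernel direct mapping */
        MEMBLOCK_MEMMAP  = 0x8,  /* reserved for memmap (assumed value) */
    };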
-
By Peng Liu
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NYPZ CVE: NA ------------------------------------------------- Add support for the memmap kernel parameter on ARM64. The three modes below are supported: memmap=exactmap Enable setting of an exact memory map, as specified by the user. memmap=nn[KMG]@ss[KMG] Force usage of a specific region of memory. memmap=nn[KMG]$ss[KMG] The region of memory to be reserved is from ss to ss+nn; the region must lie within existing memory, otherwise it will be ignored. If users set memmap=exactmap before memmap=nn[KMG]@ss[KMG], they will get exactly the memory specified by memmap=nn[KMG]@ss[KMG]. For example, on a machine with 4GB of memory, "memmap=exactmap memmap=1G@1G" makes the kernel use only the memory from 1GB to 2GB. Signed-off-by: Peng Liu <liupeng256@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
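Parsing such a parameter builds naturally on memparse(); the sketch below simplifies error handling and the exactmap bookkeeping, and the function name is illustrative:

    static int __init parse_memmap_one(char *p)
    {
        u64 size, start;

        if (!strncmp(p, "exactmap", 8)) {
            /* drop the firmware-provided map; honor user regions only */
            return 0;
        }

        size = memparse(p, &p);
        if (*p == '@') {            /* memmap=nn@ss: force usage */
            start = memparse(p + 1, &p);
            memblock_add(start, size);
        } else if (*p == '$') {     /* memmap=nn$ss: reserve */
            start = memparse(p + 1, &p);
            memblock_reserve(start, size);
        } else {
            return -EINVAL;
        }
        return 0;
    }
    early_param("memmap", parse_memmap_one);

early_param handlers run once per occurrence, which is what lets "memmap=exactmap memmap=1G@1G" compose as described.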
-
- 06 Jan 2022, 3 commits
-
By Zheng Zengkai
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4K3S5 --------------------------- Enable CONFIG_KABI_RESERVE for the x86 and arm64 architectures in openeuler_defconfig by default. Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
-
By Zheng Zengkai
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4K3S5 ----------------------- Add CONFIG_KABI_RESERVE to control whether KABI padding is reserved; for some embedded systems the KABI padding reserve may be unnecessary. Also change unsigned long to u64 to unify the basic reserve length for both 32-bit and 64-bit architectures. Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
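Conceptually the macro pair looks like the following; the exact openEuler spelling may differ, so treat this as a sketch:

    #ifdef CONFIG_KABI_RESERVE
    # define KABI_RESERVE(n)   u64 kabi_reserved##n;
    #else
    # define KABI_RESERVE(n)
    #endif

    /* usage: pad a structure with slots that later patches may reuse */
    struct example {
        int x;
        KABI_RESERVE(1)
        KABI_RESERVE(2)
    };

Using u64 rather than unsigned long keeps each reserved slot 8 bytes on both 32-bit and 64-bit builds, which is the unification the commit describes.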
-
By Zheng Zengkai
hulk inclusion bugzilla: https://gitee.com/openeuler/kernel/issues/I4MZU1 CVE: NA --------------------------- For the x86 platform, with make allmodconfig & make -j64, the following two build errors are reported. First error: - build failed: - In file included from <command-line>:32:0: ./usr/include/asm/bootparam.h:45:10: fatal error: linux/kabi.h: No such file or directory #include <linux/kabi.h> ^~~~~~~~~~~~~~ compilation terminated. make[2]: *** [usr/include/asm/bootparam.hdrtest] Error 1 Second error: ./arch/x86/include/asm/paravirt_types.h:198:2: error: expected specifier-qualifier-list before ‘KABI_RESERVE’ KABI_RESERVE(1) ^~~~~~~~~~~~ ./arch/x86/include/asm/paravirt_types.h:286:2: error: expected specifier-qualifier-list before ‘KABI_RESERVE’ KABI_RESERVE(1) ^~~~~~~~~~~~ ./arch/x86/include/asm/paravirt_types.h:309:2: error: expected specifier-qualifier-list before ‘KABI_RESERVE’ KABI_RESERVE(1) ^~~~~~~~~~~~ make[1]: *** [scripts/Makefile.build:117: arch/x86/kernel/asm-offsets.s] Error 1 To fix the first error, revert commit 3ba63bac ("bootparam: Add kabi_reserve in bootparam"). To fix the second error, include kabi.h in arch/x86/include/asm/paravirt_types.h. Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: Xie Xiuqi <xiexiuqi@huawei.com>
-
- 04 Jan 2022, 2 commits
-
By Zheng Zengkai
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4KFY7?from=project-issue CVE: NA ------------------------------- Revert this patch to fix the following compile error: In file included from <command-line>:32: ./usr/include/linux/ptp_clock.h:135:2: error: expected specifier-qualifier-list before ‘KABI_RESERVE’ KABI_RESERVE(1) ^~~~~~~~~~~~ make[2]: *** [usr/include/Makefile:108: usr/include/linux/ptp_clock.hdrtest] Error 1 This reverts commit af9f9bb9. Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
-
By Jialin Zhang
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JBL0 CVE: NA ------------------------------- Reserve space for the structures cpu_hwcap_keys and cpu_hwcaps. Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> Reviewed-by: wangxiongfeng <wangxiongfeng2@huawei.com> Reviewed-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
- 31 Dec 2021, 5 commits
-
By Gustavo A. R. Silva
mainline inclusion from mainline-v5.16-rc1 category: feature bugzilla: 185747 https://gitee.com/openeuler/kernel/issues/I4OUFN CVE: NA ------------------------------- There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Use an anonymous union with a couple of anonymous structs in order to keep userspace unchanged: $ pahole -C nfs_fhbase_new fs/nfsd/nfsfh.o struct nfs_fhbase_new { union { struct { __u8 fb_version_aux; /* 0 1 */ __u8 fb_auth_type_aux; /* 1 1 */ __u8 fb_fsid_type_aux; /* 2 1 */ __u8 fb_fileid_type_aux; /* 3 1 */ __u32 fb_auth[1]; /* 4 4 */ }; /* 0 8 */ struct { __u8 fb_version; /* 0 1 */ __u8 fb_auth_type; /* 1 1 */ __u8 fb_fsid_type; /* 2 1 */ __u8 fb_fileid_type; /* 3 1 */ __u32 fb_auth_flex[0]; /* 4 0 */ }; /* 0 4 */ }; /* 0 8 */ /* size: 8, cachelines: 1, members: 1 */ /* last cacheline: 8 bytes */ }; Also, this helps with the ongoing efforts to enable -Warray-bounds by fixing the following warnings: fs/nfsd/nfsfh.c: In function ‘nfsd_set_fh_dentry’: fs/nfsd/nfsfh.c:191:41: warning: array subscript 1 is above array bounds of ‘__u32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds] 191 | ntohl((__force __be32)fh->fh_fsid[1]))); | ~~~~~~~~~~~^~~ ./include/linux/kdev_t.h:12:46: note: in definition of macro ‘MKDEV’ 12 | #define MKDEV(ma,mi) (((ma) << MINORBITS) | (mi)) | ^~ ./include/uapi/linux/byteorder/little_endian.h:40:26: note: in expansion of macro ‘__swab32’ 40 | #define __be32_to_cpu(x) __swab32((__force __u32)(__be32)(x)) | ^~~~~~~~ ./include/linux/byteorder/generic.h:136:21: note: in expansion of macro ‘__be32_to_cpu’ 136 | #define ___ntohl(x) __be32_to_cpu(x) | ^~~~~~~~~~~~~ ./include/linux/byteorder/generic.h:140:18: note: in expansion of macro ‘___ntohl’ 140 | #define ntohl(x) ___ntohl(x) | ^~~~~~~~ fs/nfsd/nfsfh.c:191:8: note: in expansion of macro ‘ntohl’ 191 | ntohl((__force __be32)fh->fh_fsid[1]))); | ^~~~~ fs/nfsd/nfsfh.c:192:32: warning: array subscript 2 is above array bounds of ‘__u32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds] 192 | fh->fh_fsid[1] = fh->fh_fsid[2]; | ~~~~~~~~~~~^~~ fs/nfsd/nfsfh.c:192:15: warning: array subscript 1 is above array bounds of ‘__u32[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds] 192 | fh->fh_fsid[1] = fh->fh_fsid[2]; | ~~~~~~~~~~~^~~ [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/79 Link: https://github.com/KSPP/linux/issues/109 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Acked-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Zhihao Cheng
hulk inclusion category: feature bugzilla: 185747 https://gitee.com/openeuler/kernel/issues/I4OUFN CVE: NA ------------------------------- Introduce KABI reservations for the storage module. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Acked-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Guan Jing
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4KAP1?from=project-issue CVE: NA -------- We reserve some fields beforehand for sched structures prone to change; therefore, we can hot add/change sched features with this enhancement. After reserving, cache normally does not matter, as the reserved fields are not accessed at all. Signed-off-by: Guan Jing <guanjing6@huawei.com> Reviewed-by: Chen Hui <judy.chenhui@huawei.com> Reviewed-by: Cheng Jian <cj.chengjian@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By Guo Zihua
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4GK6B CVE: NA --------------------------- Reserve some fields for future IMA IPE development. Signed-off-by: Guo Zihua <guozihua@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
By GONG, Ruiqi
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4MYYH CVE: N/A ----------------------- Reserve space in struct cred and struct user_namespace in advance, to prepare for merging a promising new feature [1] from kernel 5.14 and some others in the future. [1]: https://lkml.kernel.org/r/94d1dbecab060a6b116b0a2d1accd8ca1bbb4f5f.1619094428.git.legion@kernel.org Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com> Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: weiyang wang <wangweiyang2@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-