- 20 5月, 2023 1 次提交
-
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @sun_nanyong patch 1~6: backport mainline patchset(mm: process/cgroup ksm support)and patches it depends on. patch 7:Add control file "memory.ksm" to enable ksm per cgroup. Link:https://gitee.com/openeuler/kernel/pulls/790 Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
- 19 5月, 2023 8 次提交
-
-
由 Nanyong Sun 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B CVE: NA ---------------------------------------------------------------------- Add control file "memory.ksm" to enable ksm per cgroup. Echo to 1 will set all tasks currently in the cgroup to ksm merge any mode, which means ksm gets enabled for all vma's of a process. Meanwhile echo to 0 will disable ksm for them and unmerge the merged pages. Cat the file will show the above state and ksm related profits of this cgroup. Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
-
由 David Hildenbrand 提交于
mainline inclusion from mainline-v6.4-rc1 commit 24139c07 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=24139c07f413ef4b555482c758343d71392a19bc ---------------------------------------------------------------------- Patch series "mm/ksm: improve PR_SET_MEMORY_MERGE=0 handling and cleanup disabling KSM", v2. (1) Make PR_SET_MEMORY_MERGE=0 unmerge pages like setting MADV_UNMERGEABLE does, (2) add a selftest for it and (3) factor out disabling of KSM from s390/gmap code. This patch (of 3): Let's unmerge any KSM pages when setting PR_SET_MEMORY_MERGE=0, and clear the VM_MERGEABLE flag from all VMAs -- just like KSM would. Of course, only do that if we previously set PR_SET_MEMORY_MERGE=1. Link: https://lkml.kernel.org/r/20230422205420.30372-1-david@redhat.com Link: https://lkml.kernel.org/r/20230422205420.30372-2-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NStefan Roesch <shr@devkernel.io> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Rik van Riel <riel@surriel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflicts: mm/ksm.c Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
-
由 Stefan Roesch 提交于
mainline inclusion from mainline-v6.4-rc1 commit d21077fb category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d21077fbc2fc987c2e593c34dc3b4d84e546dc9f ---------------------------------------------------------------------- This adds the general_profit KSM sysfs knob and the process profit metric knobs to ksm_stat. 1) expose general_profit metric The documentation mentions a general profit metric, however this metric is not calculated. In addition the formula depends on the size of internal structures, which makes it more difficult for an administrator to make the calculation. Adding the metric for a better user experience. 2) document general_profit sysfs knob 3) calculate ksm process profit metric The ksm documentation mentions the process profit metric and how to calculate it. This adds the calculation of the metric. 4) mm: expose ksm process profit metric in ksm_stat This exposes the ksm process profit metric in /proc/<pid>/ksm_stat. The documentation mentions the formula for the ksm process profit metric, however it does not calculate it. In addition the formula depends on the size of internal structures. So it makes sense to expose it. 5) document new procfs ksm knobs Link: https://lkml.kernel.org/r/20230418051342.1919757-3-shr@devkernel.ioSigned-off-by: NStefan Roesch <shr@devkernel.io> Reviewed-by: NBagas Sanjaya <bagasdotme@gmail.com> Acked-by: NDavid Hildenbrand <david@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
-
由 Stefan Roesch 提交于
mainline inclusion from mainline-v6.4-rc1 commit d7597f59 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d7597f59d1d33e9efbffa7060deb9ee5bd119e62 ---------------------------------------------------------------------- Patch series "mm: process/cgroup ksm support", v9. So far KSM can only be enabled by calling madvise for memory regions. To be able to use KSM for more workloads, KSM needs to have the ability to be enabled / disabled at the process / cgroup level. Use case 1: The madvise call is not available in the programming language. An example for this are programs with forked workloads using a garbage collected language without pointers. In such a language madvise cannot be made available. In addition the addresses of objects get moved around as they are garbage collected. KSM sharing needs to be enabled "from the outside" for these type of workloads. Use case 2: The same interpreter can also be used for workloads where KSM brings no benefit or even has overhead. We'd like to be able to enable KSM on a workload by workload basis. Use case 3: With the madvise call sharing opportunities are only enabled for the current process: it is a workload-local decision. A considerable number of sharing opportunities may exist across multiple workloads or jobs (if they are part of the same security domain). Only a higler level entity like a job scheduler or container can know for certain if its running one or more instances of a job. That job scheduler however doesn't have the necessary internal workload knowledge to make targeted madvise calls. Security concerns: In previous discussions security concerns have been brought up. The problem is that an individual workload does not have the knowledge about what else is running on a machine. Therefore it has to be very conservative in what memory areas can be shared or not. However, if the system is dedicated to running multiple jobs within the same security domain, its the job scheduler that has the knowledge that sharing can be safely enabled and is even desirable. Performance: Experiments with using UKSM have shown a capacity increase of around 20%. Here are the metrics from an instagram workload (taken from a machine with 64GB main memory): full_scans: 445 general_profit: 20158298048 max_page_sharing: 256 merge_across_nodes: 1 pages_shared: 129547 pages_sharing: 5119146 pages_to_scan: 4000 pages_unshared: 1760924 pages_volatile: 10761341 run: 1 sleep_millisecs: 20 stable_node_chains: 167 stable_node_chains_prune_millisecs: 2000 stable_node_dups: 2751 use_zero_pages: 0 zero_pages_sharing: 0 After the service is running for 30 minutes to an hour, 4 to 5 million shared pages are common for this workload when using KSM. Detailed changes: 1. New options for prctl system command This patch series adds two new options to the prctl system call. The first one allows to enable KSM at the process level and the second one to query the setting. The setting will be inherited by child processes. With the above setting, KSM can be enabled for the seed process of a cgroup and all processes in the cgroup will inherit the setting. 2. Changes to KSM processing When KSM is enabled at the process level, the KSM code will iterate over all the VMA's and enable KSM for the eligible VMA's. When forking a process that has KSM enabled, the setting will be inherited by the new child process. 3. Add general_profit metric The general_profit metric of KSM is specified in the documentation, but not calculated. This adds the general profit metric to /sys/kernel/debug/mm/ksm. 4. Add more metrics to ksm_stat This adds the process profit metric to /proc/<pid>/ksm_stat. 5. Add more tests to ksm_tests and ksm_functional_tests This adds an option to specify the merge type to the ksm_tests. This allows to test madvise and prctl KSM. It also adds a two new tests to ksm_functional_tests: one to test the new prctl options and the other one is a fork test to verify that the KSM process setting is inherited by client processes. This patch (of 3): So far KSM can only be enabled by calling madvise for memory regions. To be able to use KSM for more workloads, KSM needs to have the ability to be enabled / disabled at the process / cgroup level. 1. New options for prctl system command This patch series adds two new options to the prctl system call. The first one allows to enable KSM at the process level and the second one to query the setting. The setting will be inherited by child processes. With the above setting, KSM can be enabled for the seed process of a cgroup and all processes in the cgroup will inherit the setting. 2. Changes to KSM processing When KSM is enabled at the process level, the KSM code will iterate over all the VMA's and enable KSM for the eligible VMA's. When forking a process that has KSM enabled, the setting will be inherited by the new child process. 1) Introduce new MMF_VM_MERGE_ANY flag This introduces the new flag MMF_VM_MERGE_ANY flag. When this flag is set, kernel samepage merging (ksm) gets enabled for all vma's of a process. 2) Setting VM_MERGEABLE on VMA creation When a VMA is created, if the MMF_VM_MERGE_ANY flag is set, the VM_MERGEABLE flag will be set for this VMA. 3) support disabling of ksm for a process This adds the ability to disable ksm for a process if ksm has been enabled for the process with prctl. 4) add new prctl option to get and set ksm for a process This adds two new options to the prctl system call - enable ksm for all vmas of a process (if the vmas support it). - query if ksm has been enabled for a process. 3. Disabling MMF_VM_MERGE_ANY for storage keys in s390 In the s390 architecture when storage keys are used, the MMF_VM_MERGE_ANY will be disabled. Link: https://lkml.kernel.org/r/20230418051342.1919757-1-shr@devkernel.io Link: https://lkml.kernel.org/r/20230418051342.1919757-2-shr@devkernel.ioSigned-off-by: NStefan Roesch <shr@devkernel.io> Acked-by: NDavid Hildenbrand <david@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Rik van Riel <riel@surriel.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflicts: kernel/sys.c mm/ksm.c mm/mmap.c Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
-
由 xu xin 提交于
mainline inclusion from mainline-v6.1-rc1 commit 21b7bdb5 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=21b7bdb504ae6b0a795c8d63818611ce02b532c1 ---------------------------------------------------------------------- Add the description of KSM profit and how to determine it separately in system-wide range and inner a single process. Link: https://lkml.kernel.org/r/20220830144003.299870-1-xu.xin16@zte.com.cnSigned-off-by: Nxu xin <xu.xin16@zte.com.cn> Reviewed-by: NXiaokai Ran <ran.xiaokai@zte.com.cn> Reviewed-by: NYang Yang <yang.yang29@zte.com.cn> Reviewed-by: NBagas Sanjaya <bagasdotme@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflicts: Documentation/admin-guide/mm/ksm.rst Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
-
由 xu xin 提交于
mainline inclusion from mainline-v6.1-rc1 commit cb4df4ca category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cb4df4cae4f2bd8cf7a32eff81178fce31600f7c ---------------------------------------------------------------------- Patch series "ksm: count allocated rmap_items and update documentation", v5. KSM can save memory by merging identical pages, but also can consume additional memory, because it needs to generate rmap_items to save each scanned page's brief rmap information. To determine how beneficial the ksm-policy (like madvise), they are using brings, so we add a new interface /proc/<pid>/ksm_stat for each process The value "ksm_rmap_items" in it indicates the total allocated ksm rmap_items of this process. The detailed description can be seen in the following patches' commit message. This patch (of 2): KSM can save memory by merging identical pages, but also can consume additional memory, because it needs to generate rmap_items to save each scanned page's brief rmap information. Some of these pages may be merged, but some may not be abled to be merged after being checked several times, which are unprofitable memory consumed. The information about whether KSM save memory or consume memory in system-wide range can be determined by the comprehensive calculation of pages_sharing, pages_shared, pages_unshared and pages_volatile. A simple approximate calculation: profit =~ pages_sharing * sizeof(page) - (all_rmap_items) * sizeof(rmap_item); where all_rmap_items equals to the sum of pages_sharing, pages_shared, pages_unshared and pages_volatile. But we cannot calculate this kind of ksm profit inner single-process wide because the information of ksm rmap_item's number of a process is lacked. For user applications, if this kind of information could be obtained, it helps upper users know how beneficial the ksm-policy (like madvise) they are using brings, and then optimize their app code. For example, one application madvise 1000 pages as MERGEABLE, while only a few pages are really merged, then it's not cost-efficient. So we add a new interface /proc/<pid>/ksm_stat for each process in which the value of ksm_rmap_itmes is only shown now and so more values can be added in future. So similarly, we can calculate the ksm profit approximately for a single process by: profit =~ ksm_merging_pages * sizeof(page) - ksm_rmap_items * sizeof(rmap_item); where ksm_merging_pages is shown at /proc/<pid>/ksm_merging_pages, and ksm_rmap_items is shown in /proc/<pid>/ksm_stat. Link: https://lkml.kernel.org/r/20220830143731.299702-1-xu.xin16@zte.com.cn Link: https://lkml.kernel.org/r/20220830143838.299758-1-xu.xin16@zte.com.cnSigned-off-by: Nxu xin <xu.xin16@zte.com.cn> Reviewed-by: NXiaokai Ran <ran.xiaokai@zte.com.cn> Reviewed-by: NYang Yang <yang.yang29@zte.com.cn> Signed-off-by: NCGEL ZTE <cgel.zte@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflicts: include/linux/mm_types.h Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
-
由 xu xin 提交于
mainline inclusion from mainline-v5.19-rc1 commit 76093853 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I72R0B CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7609385337a4feb6236e42dcd0df2185683ce839 ---------------------------------------------------------------------- Some applications or containers want to use KSM by calling madvise() to advise areas of address space to be MERGEABLE. But they may not know which applications are more likely to cause real merges in the deployment. If this patch is applied, it helps them know their corresponding number of merged pages, and then optimize their app code. As current KSM only counts the number of KSM merging pages(e.g. ksm_pages_sharing and ksm_pages_shared) of the whole system, we cannot see the more fine-grained KSM merging, for the upper application optimization, the merging area cannot be set easily according to the KSM page merging probability of each process. Therefore, it is necessary to add extra statistical means so that the upper level users can know the detailed KSM merging information of each process. We add a new proc file named as ksm_merging_pages under /proc/<pid>/ to indicate the involved ksm merging pages of this process. [akpm@linux-foundation.org: fix comment typo, remove BUG_ON()s] Link: https://lkml.kernel.org/r/20220325082318.2352853-1-xu.xin16@zte.com.cnSigned-off-by: Nxu xin <xu.xin16@zte.com.cn> Reported-by: Nkernel test robot <lkp@intel.com> Reviewed-by: NYang Yang <yang.yang29@zte.com.cn> Reviewed-by: NRan Xiaokai <ran.xiaokai@zte.com.cn> Reported-by: NZeal Robot <zealci@zte.com.cn> Cc: Kees Cook <keescook@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Ohhoon Kwon <ohoono.kwon@samsung.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Feng Tang <feng.tang@intel.com> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn> Cc: Zeal Robot <zealci@zte.com.cn> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Conflicts: include/linux/mm_types.h Signed-off-by: NNanyong Sun <sunnanyong@huawei.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @openeuler-sync-bot Origin pull request: https://gitee.com/openeuler/kernel/pulls/774 Pull new CVEs: CVE-2023-32269 CVE-2023-2002 CVE-2023-26544 CVE-2023-0459 mm bugfixes from Yu Kuai fs bugfix from yangerkun fs perfs from Zhihao Cheng Link:https://gitee.com/openeuler/kernel/pulls/778 Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
- 18 5月, 2023 2 次提交
-
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @Hongchen_Zhang ``` This series of patches adds the kexec/kdump feature of the LoongArch architecture. The production kernel and capture kernel can be the same kernel with the same binary implementation added. However, this implementation depends on the kernel relocation function, so the kernel relocation implementation and KASLR features are added to this series of patches. In order to be able to be used normally in machines compatible with the old interface specification, compatibility with the old interface specification has also been added. Manual command line test: kexec: $ sudo kexec -l vmlinuz --reuse-cmdline --initrd=initrd $ sudo kexec -e kdump: Add crashkernel=512M parameter in grub.cfg, $ sudo kexec -p vmlinuz --reuse-cmdline --initrd=initrd # echo c > /proc/sysrq-trigger kdump service mode test: kexec: $ sudo kexec -l vmlinuz --reuse-cmdline --initrd=initrd $ sudo kexec -e kdump: Add crashkernel=512M parameter in grub.cfg, $ sudo systemctl enable kdump $ sudo systemctl restart kdump # echo c > /proc/sysrq-trigger ``` Link:https://gitee.com/openeuler/kernel/pulls/766 Reviewed-by: Guo Dongtai <guodongtai@kylinos.cn> Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @svishen The patch sorted out the differences between the wol feature in the Linux mainline version and the openeuler and fix some bugfix issue: https://gitee.com/openeuler/kernel/issues/I72OPH Link:https://gitee.com/openeuler/kernel/pulls/758 Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
- 17 5月, 2023 22 次提交
-
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @nebula_matrix Add m1600-driver for nebula-matrix m1600 series smart NIC. M1600-NIC is a series of network interface card for the Data Center Area. The driver supports link-speed 10GbE. M1600 devices support SR-IOV.This driver is used for both of Physical Function(PF) and Virtual Function(VF). M1600 devices support MSI-X interrupt vector for each Tx/Rx queue and interrupt moderation. M1600 devices support also various offload features such as checksum offload, Receive-Side Scaling(RSS) Link:https://gitee.com/openeuler/kernel/pulls/570 Reviewed-by: Liu Chao <liuchao173@huawei.com> Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @did-you-collect-the-wool-today The ARMv8.7 WFxT feature is a new take on the good old WFI/WFE instructions as they behave the same way, only taking an extra timeout parameter. This small series aims at adding the minimal support for this feature, enabling it for both the kernel and KVM. Link:https://gitee.com/openeuler/kernel/pulls/629 Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
由 nebula_matrix 提交于
m1600-dirver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6RNK1 CVE: NA -------------------------------- The m1600 smart NIC is a series of network interface card for the Data Center Area. The m1600 network card supports checksum, sriov and other hardware offloading. Signed-off-by: NYi Chen <open@nebula-matrix.com>
-
由 Jijie Shao 提交于
mainlist inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I72OPH CVE: NA Reference: https://patchwork.kernel.org/project/netdevbpf/patch/20230512100014.2522-5-lanhao@huawei.com/ ---------------------------------------------------------------------- The timeout of the cmdq reset command has been increased to resolve the reset timeout issue in the full VF scenario. The timeout of other cmdq commands remains unchanged. Fixes: 8d307f8e ("net: hns3: create new set of unified hclge_comm_cmd_send APIs") Signed-off-by: NJijie Shao <shaojijie@huawei.com> Signed-off-by: NHao Lan <lanhao@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJiantao Xiao <xiaojiantao1@h-partners.com>
-
由 Jie Wang 提交于
mainlist inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I72OPH CVE: NA Reference: https://patchwork.kernel.org/project/netdevbpf/patch/20230512100014.2522-2-lanhao@huawei.com/ ---------------------------------------------------------------------- In function hns3_dump_tx_queue_info, The print buffer is not enough when the tx BD number is configured to 32760. As a result several BD information wouldn't be displayed. So fix it by increasing the tx queue print buffer length. Fixes: 630a6738 ("net: hns3: adjust string spaces of some parameters of tx bd info in debugfs") Signed-off-by: NJie Wang <wangjie125@huawei.com> Signed-off-by: NHao Lan <lanhao@huawei.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NJiantao Xiao <xiaojiantao1@h-partners.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @liaoyu15 Currently, only x86 architecture supports the CLOCKSOURCE_VALIDATE_LAST_CYCLE option. This option ensures that the timestamps returned by the clocksource are monotonically increasing, and helps avoid issues caused by hardware failures. This commit makes CLOCKSOURCE_VALIDATE_LAST_CYCLE configurable on the arm64 architecture, helps increase system stability and reliability. Link:https://gitee.com/openeuler/kernel/pulls/772 Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Reviewed-by: Wei Li <liwei391@huawei.com> Reviewed-by: Liu Chao <liuchao173@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
由 Yu Liao 提交于
openeuler inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I7393E CVE: NA ------------------------------- The ARM ARM states the following (where a CPU is a device): The system counter must provide a uniform view of system time. More precisely, it must be impossible for the following sequence of events to show system time going backwards: 1) Device A reads the time from the system counter. 2) Device A communicates with another agent in the system, Device B. 3) After recognizing the communication from Device A, Device B reads the time from the system counter. So make CLOCKSOURCE_VALIDATE_LAST_CYCLE unset by default. Signed-off-by: NYu Liao <liaoyu15@huawei.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @sanglipeng Backport 5.10.151 LTS patches from upstream. Total patches: 5 Link:https://gitee.com/openeuler/kernel/pulls/768 Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
由 Matthew Wilcox (Oracle) 提交于
mainline inclusion from mainline-5.19-rc4 commit 5ccc944d category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6YDHU CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=5ccc944dce3df5fd2fd683a7df4fd49d1068eba2 ------------------------------------------------- We had an off-by-one error which meant that we never marked the first page in a read as accessed. This was visible as a slowdown when re-reading a file as pages were being evicted from cache too soon. In reviewing this code, we noticed a second bug where a multi-page folio would be marked as accessed multiple times when doing reads that were less than the size of the folio. Abstract the comparison of whether two file positions are in the same folio into a new function, fixing both of these bugs. Reported-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NKent Overstreet <kent.overstreet@gmail.com> Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Conflict: folios is not supported yet Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 16dd0b34)
-
由 Yu Kuai 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6YDHU CVE: NA -------------------------------- This reverts commit 939325bb. Because this commit make a mistake to judge if the page is the same. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 728c0094)
-
由 Hyunwoo Kim 提交于
stable inclusion from stable-v5.10.168 commit dd6991251a1382a9b4984962a0c7a467e9d71812 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I70OFF CVE: CVE-2023-32269 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=dd6991251a1382a9b4984962a0c7a467e9d71812 -------------------------------- [ Upstream commit 61179292 ] If you call listen() and accept() on an already connect()ed AF_NETROM socket, accept() can successfully connect. This is because when the peer socket sends data to sendmsg, the skb with its own sk stored in the connected socket's sk->sk_receive_queue is connected, and nr_accept() dequeues the skb waiting in the sk->sk_receive_queue. As a result, nr_accept() allocates and returns a sock with the sk of the parent AF_NETROM socket. And here use-after-free can happen through complex race conditions: ``` cpu0 cpu1 1. socket_2 = socket(AF_NETROM) . . listen(socket_2) accepted_socket = accept(socket_2) 2. socket_1 = socket(AF_NETROM) nr_create() // sk refcount : 1 connect(socket_1) 3. write(accepted_socket) nr_sendmsg() nr_output() nr_kick() nr_send_iframe() nr_transmit_buffer() nr_route_frame() nr_loopback_queue() nr_loopback_timer() nr_rx_frame() nr_process_rx_frame(sk, skb); // sk : socket_1's sk nr_state3_machine() nr_queue_rx_frame() sock_queue_rcv_skb() sock_queue_rcv_skb_reason() __sock_queue_rcv_skb() __skb_queue_tail(list, skb); // list : socket_1's sk->sk_receive_queue 4. listen(socket_1) nr_listen() uaf_socket = accept(socket_1) nr_accept() skb_dequeue(&sk->sk_receive_queue); 5. close(accepted_socket) nr_release() nr_write_internal(sk, NR_DISCREQ) nr_transmit_buffer() // NR_DISCREQ nr_route_frame() nr_loopback_queue() nr_loopback_timer() nr_rx_frame() // sk : socket_1's sk nr_process_rx_frame() // NR_STATE_3 nr_state3_machine() // NR_DISCREQ nr_disconnect() nr_sk(sk)->state = NR_STATE_0; 6. close(socket_1) // sk refcount : 3 nr_release() // NR_STATE_0 sock_put(sk); // sk refcount : 0 sk_free(sk); close(uaf_socket) nr_release() sock_hold(sk); // UAF ``` KASAN report by syzbot: ``` BUG: KASAN: use-after-free in nr_release+0x66/0x460 net/netrom/af_netrom.c:520 Write of size 4 at addr ffff8880235d8080 by task syz-executor564/5128 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:306 [inline] print_report+0x15e/0x461 mm/kasan/report.c:417 kasan_report+0xbf/0x1f0 mm/kasan/report.c:517 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0x141/0x190 mm/kasan/generic.c:189 instrument_atomic_read_write include/linux/instrumented.h:102 [inline] atomic_fetch_add_relaxed include/linux/atomic/atomic-instrumented.h:116 [inline] __refcount_add include/linux/refcount.h:193 [inline] __refcount_inc include/linux/refcount.h:250 [inline] refcount_inc include/linux/refcount.h:267 [inline] sock_hold include/net/sock.h:775 [inline] nr_release+0x66/0x460 net/netrom/af_netrom.c:520 __sock_release+0xcd/0x280 net/socket.c:650 sock_close+0x1c/0x20 net/socket.c:1365 __fput+0x27c/0xa90 fs/file_table.c:320 task_work_run+0x16f/0x270 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xaa8/0x2950 kernel/exit.c:867 do_group_exit+0xd4/0x2a0 kernel/exit.c:1012 get_signal+0x21c3/0x2450 kernel/signal.c:2859 arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306 exit_to_user_mode_loop kernel/entry/common.c:168 [inline] exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296 do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f6c19e3c9b9 Code: Unable to access opcode bytes at 0x7f6c19e3c98f. RSP: 002b:00007fffd4ba2ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: 0000000000000116 RBX: 0000000000000003 RCX: 00007f6c19e3c9b9 RDX: 0000000000000318 RSI: 00000000200bd000 RDI: 0000000000000006 RBP: 0000000000000003 R08: 000000000000000d R09: 000000000000000d R10: 0000000000000000 R11: 0000000000000246 R12: 000055555566a2c0 R13: 0000000000000011 R14: 0000000000000000 R15: 0000000000000000 </TASK> Allocated by task 5128: kasan_save_stack+0x22/0x40 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 ____kasan_kmalloc mm/kasan/common.c:371 [inline] ____kasan_kmalloc mm/kasan/common.c:330 [inline] __kasan_kmalloc+0xa3/0xb0 mm/kasan/common.c:380 kasan_kmalloc include/linux/kasan.h:211 [inline] __do_kmalloc_node mm/slab_common.c:968 [inline] __kmalloc+0x5a/0xd0 mm/slab_common.c:981 kmalloc include/linux/slab.h:584 [inline] sk_prot_alloc+0x140/0x290 net/core/sock.c:2038 sk_alloc+0x3a/0x7a0 net/core/sock.c:2091 nr_create+0xb6/0x5f0 net/netrom/af_netrom.c:433 __sock_create+0x359/0x790 net/socket.c:1515 sock_create net/socket.c:1566 [inline] __sys_socket_create net/socket.c:1603 [inline] __sys_socket_create net/socket.c:1588 [inline] __sys_socket+0x133/0x250 net/socket.c:1636 __do_sys_socket net/socket.c:1649 [inline] __se_sys_socket net/socket.c:1647 [inline] __x64_sys_socket+0x73/0xb0 net/socket.c:1647 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Freed by task 5128: kasan_save_stack+0x22/0x40 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:518 ____kasan_slab_free mm/kasan/common.c:236 [inline] ____kasan_slab_free+0x13b/0x1a0 mm/kasan/common.c:200 kasan_slab_free include/linux/kasan.h:177 [inline] __cache_free mm/slab.c:3394 [inline] __do_kmem_cache_free mm/slab.c:3580 [inline] __kmem_cache_free+0xcd/0x3b0 mm/slab.c:3587 sk_prot_free net/core/sock.c:2074 [inline] __sk_destruct+0x5df/0x750 net/core/sock.c:2166 sk_destruct net/core/sock.c:2181 [inline] __sk_free+0x175/0x460 net/core/sock.c:2192 sk_free+0x7c/0xa0 net/core/sock.c:2203 sock_put include/net/sock.h:1991 [inline] nr_release+0x39e/0x460 net/netrom/af_netrom.c:554 __sock_release+0xcd/0x280 net/socket.c:650 sock_close+0x1c/0x20 net/socket.c:1365 __fput+0x27c/0xa90 fs/file_table.c:320 task_work_run+0x16f/0x270 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xaa8/0x2950 kernel/exit.c:867 do_group_exit+0xd4/0x2a0 kernel/exit.c:1012 get_signal+0x21c3/0x2450 kernel/signal.c:2859 arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306 exit_to_user_mode_loop kernel/entry/common.c:168 [inline] exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296 do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x63/0xcd ``` To fix this issue, nr_listen() returns -EINVAL for sockets that successfully nr_connect(). Reported-by: syzbot+caa188bdfc1eeafeb418@syzkaller.appspotmail.com Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: NHyunwoo Kim <v4bel@theori.io> Reviewed-by: NKuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 40307a4a)
-
由 Ruihan Li 提交于
mainline inclusion from mainline-v6.4-rc1 commit 25c150ac category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I6WHKQ CVE: CVE-2023-2002 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=25c150ac103a4ebeed0319994c742a90634ddf18 ---------------------------------------- Previously, capability was checked using capable(), which verified that the caller of the ioctl system call had the required capability. In addition, the result of the check would be stored in the HCI_SOCK_TRUSTED flag, making it persistent for the socket. However, malicious programs can abuse this approach by deliberately sharing an HCI socket with a privileged task. The HCI socket will be marked as trusted when the privileged task occasionally makes an ioctl call. This problem can be solved by using sk_capable() to check capability, which ensures that not only the current task but also the socket opener has the specified capability, thus reducing the risk of privilege escalation through the previously identified vulnerability. Cc: stable@vger.kernel.org Fixes: f81f5b2d ("Bluetooth: Send control open and close messages for HCI raw sockets") Signed-off-by: NRuihan Li <lrh2000@pku.edu.cn> Signed-off-by: NLuiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: NZiyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: NLiu Jian <liujian56@huawei.com> Reviewed-by: NWang Weiyang <wangweiyang2@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit ec557f34)
-
由 Dan Carpenter 提交于
mainline inclusion from mainline-v6.2-rc1 commit 65801516 category: bugfix bugzilla: 188526, https://gitee.com/src-openeuler/kernel/issues/I71SYO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=658015167a8432b88f5d032e9d85d8fd50e5bf2c -------------------------------- There were two patches which addressed the same bug and added the same condition: commit 6db62086 ("fs/ntfs3: Validate data run offset") commit 887bfc54 ("fs/ntfs3: Fix slab-out-of-bounds read in run_unpack") Delete one condition. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NKonstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: NZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 1ca5b2ca)
-
由 Hawkins Jiawei 提交于
mainline inclusion from mainline-v6.2-rc1 commit 887bfc54 category: bugfix bugzilla: 188526, https://gitee.com/src-openeuler/kernel/issues/I71SYO CVE: CVE-2023-26544 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=887bfc546097fbe8071dac13b2fef73b77920899 -------------------------------- Syzkaller reports slab-out-of-bounds bug as follows: ================================================================== BUG: KASAN: slab-out-of-bounds in run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944 Read of size 1 at addr ffff88801bbdff02 by task syz-executor131/3611 [...] Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:317 [inline] print_report.cold+0x2ba/0x719 mm/kasan/report.c:433 kasan_report+0xb1/0x1e0 mm/kasan/report.c:495 run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944 run_unpack_ex+0xb0/0x7c0 fs/ntfs3/run.c:1057 ntfs_read_mft fs/ntfs3/inode.c:368 [inline] ntfs_iget5+0xc20/0x3280 fs/ntfs3/inode.c:501 ntfs_loadlog_and_replay+0x124/0x5d0 fs/ntfs3/fsntfs.c:272 ntfs_fill_super+0x1eff/0x37f0 fs/ntfs3/super.c:1018 get_tree_bdev+0x440/0x760 fs/super.c:1323 vfs_get_tree+0x89/0x2f0 fs/super.c:1530 do_new_mount fs/namespace.c:3040 [inline] path_mount+0x1326/0x1e20 fs/namespace.c:3370 do_mount fs/namespace.c:3383 [inline] __do_sys_mount fs/namespace.c:3591 [inline] __se_sys_mount fs/namespace.c:3568 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd [...] </TASK> The buggy address belongs to the physical page: page:ffffea00006ef600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1bbd8 head:ffffea00006ef600 order:3 compound_mapcount:0 compound_pincount:0 flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff) page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88801bbdfe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88801bbdfe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88801bbdff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff88801bbdff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88801bbe0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Kernel will tries to read record and parse MFT from disk in ntfs_read_mft(). Yet the problem is that during enumerating attributes in record, kernel doesn't check whether run_off field loading from the disk is a valid value. To be more specific, if attr->nres.run_off is larger than attr->size, kernel will passes an invalid argument run_buf_size in run_unpack_ex(), which having an integer overflow. Then this invalid argument will triggers the slab-out-of-bounds Read bug as above. This patch solves it by adding the sanity check between the offset to packed runs and attribute size. link: https://lore.kernel.org/all/0000000000009145fc05e94bd5c3@google.com/#t Reported-and-tested-by: syzbot+8d6fbb27a6aded64b25b@syzkaller.appspotmail.com Signed-off-by: NHawkins Jiawei <yin31149@gmail.com> Signed-off-by: NKonstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: NZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit cf4fe9cf)
-
由 Edward Lo 提交于
mainline inclusion from mainline-v6.2-rc1 commit 6db62086 category: bugfix bugzilla: 188526, https://gitee.com/src-openeuler/kernel/issues/I71SYO CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6db620863f8528ed9a9aa5ad323b26554a17881d -------------------------------- This adds sanity checks for data run offset. We should make sure data run offset is legit before trying to unpack them, otherwise we may encounter use-after-free or some unexpected memory access behaviors. [ 82.940342] BUG: KASAN: use-after-free in run_unpack+0x2e3/0x570 [ 82.941180] Read of size 1 at addr ffff888008a8487f by task mount/240 [ 82.941670] [ 82.942069] CPU: 0 PID: 240 Comm: mount Not tainted 5.19.0+ #15 [ 82.942482] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 82.943720] Call Trace: [ 82.944204] <TASK> [ 82.944471] dump_stack_lvl+0x49/0x63 [ 82.944908] print_report.cold+0xf5/0x67b [ 82.945141] ? __wait_on_bit+0x106/0x120 [ 82.945750] ? run_unpack+0x2e3/0x570 [ 82.946626] kasan_report+0xa7/0x120 [ 82.947046] ? run_unpack+0x2e3/0x570 [ 82.947280] __asan_load1+0x51/0x60 [ 82.947483] run_unpack+0x2e3/0x570 [ 82.947709] ? memcpy+0x4e/0x70 [ 82.947927] ? run_pack+0x7a0/0x7a0 [ 82.948158] run_unpack_ex+0xad/0x3f0 [ 82.948399] ? mi_enum_attr+0x14a/0x200 [ 82.948717] ? run_unpack+0x570/0x570 [ 82.949072] ? ni_enum_attr_ex+0x1b2/0x1c0 [ 82.949332] ? ni_fname_type.part.0+0xd0/0xd0 [ 82.949611] ? mi_read+0x262/0x2c0 [ 82.949970] ? ntfs_cmp_names_cpu+0x125/0x180 [ 82.950249] ntfs_iget5+0x632/0x1870 [ 82.950621] ? ntfs_get_block_bmap+0x70/0x70 [ 82.951192] ? evict+0x223/0x280 [ 82.951525] ? iput.part.0+0x286/0x320 [ 82.951969] ntfs_fill_super+0x1321/0x1e20 [ 82.952436] ? put_ntfs+0x1d0/0x1d0 [ 82.952822] ? vsprintf+0x20/0x20 [ 82.953188] ? mutex_unlock+0x81/0xd0 [ 82.953379] ? set_blocksize+0x95/0x150 [ 82.954001] get_tree_bdev+0x232/0x370 [ 82.954438] ? put_ntfs+0x1d0/0x1d0 [ 82.954700] ntfs_fs_get_tree+0x15/0x20 [ 82.955049] vfs_get_tree+0x4c/0x130 [ 82.955292] path_mount+0x645/0xfd0 [ 82.955615] ? putname+0x80/0xa0 [ 82.955955] ? finish_automount+0x2e0/0x2e0 [ 82.956310] ? kmem_cache_free+0x110/0x390 [ 82.956723] ? putname+0x80/0xa0 [ 82.957023] do_mount+0xd6/0xf0 [ 82.957411] ? path_mount+0xfd0/0xfd0 [ 82.957638] ? __kasan_check_write+0x14/0x20 [ 82.957948] __x64_sys_mount+0xca/0x110 [ 82.958310] do_syscall_64+0x3b/0x90 [ 82.958719] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 82.959341] RIP: 0033:0x7fd0d1ce948a [ 82.960193] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008 [ 82.961532] RSP: 002b:00007ffe59ff69a8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5 [ 82.962527] RAX: ffffffffffffffda RBX: 0000564dcc107060 RCX: 00007fd0d1ce948a [ 82.963266] RDX: 0000564dcc107260 RSI: 0000564dcc1072e0 RDI: 0000564dcc10fce0 [ 82.963686] RBP: 0000000000000000 R08: 0000564dcc107280 R09: 0000000000000020 [ 82.964272] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000564dcc10fce0 [ 82.964785] R13: 0000564dcc107260 R14: 0000000000000000 R15: 00000000ffffffff Signed-off-by: NEdward Lo <edward.lo@ambergroup.io> Signed-off-by: NKonstantin Komarov <almaz.alexandrovich@paragon-software.com> Signed-off-by: NZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 9dfa90f9)
-
由 Dave Chinner 提交于
maillist inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I6ZXOM CVE: NA Reference: https://patchwork.kernel.org/project/xfs/patch/20230420033550.339934-1-yangerkun@huaweicloud.com/ -------------------------------- When a buffer is unpinned by xfs_buf_item_unpin(), we need to access the buffer after we've dropped the buffer log item reference count. This opens a window where we can have two racing unpins for the buffer item (e.g. shutdown checkpoint context callback processing racing with journal IO iclog completion processing) and both attempt to access the buffer after dropping the BLI reference count. If we are unlucky, the "BLI freed" context wins the race and frees the buffer before the "BLI still active" case checks the buffer pin count. This results in a use after free that can only be triggered in active filesystem shutdown situations. To fix this, we need to ensure that buffer existence extends beyond the BLI reference count checks and until the unpin processing is complete. This implies that a buffer pin operation must also take a buffer reference to ensure that the buffer cannot be freed until the buffer unpin processing is complete. Reported-by: Nyangerkun <yangerkun@huawei.com> Signed-off-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: Nyangerkun <yangerkun@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit e05df008)
-
由 Zhihao Cheng 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I70MZX CVE: NA -------------------------------- Following process: P1 P2 path_openat link_path_walk may_lookup inode_permission(rcu) ovl_permission acl_permission_check check_acl get_cached_acl_rcu ovl_get_inode_acl realinode = ovl_inode_real(ovl_inode) drop_cache __dentry_kill(ovl_dentry) iput(ovl_inode) ovl_destroy_inode(ovl_inode) dput(oi->__upperdentry) dentry_kill(upperdentry) dentry_unlink_inode upperdentry->d_inode = NULL ovl_inode_upper upperdentry = ovl_i_dentry_upper(ovl_inode) d_inode(upperdentry) // returns NULL IS_POSIXACL(realinode) // NULL pointer dereference , will trigger an null pointer dereference at realinode: [ 205.472797] BUG: kernel NULL pointer dereference, address: 0000000000000028 [ 205.476701] CPU: 2 PID: 2713 Comm: ls Not tainted 6.3.0-12064-g2edfa098e750-dirty #1216 [ 205.478754] RIP: 0010:do_ovl_get_acl+0x5d/0x300 [ 205.489584] Call Trace: [ 205.489812] <TASK> [ 205.490014] ovl_get_inode_acl+0x26/0x30 [ 205.490466] get_cached_acl_rcu+0x61/0xa0 [ 205.490908] generic_permission+0x1bf/0x4e0 [ 205.491447] ovl_permission+0x79/0x1b0 [ 205.491917] inode_permission+0x15e/0x2c0 [ 205.492425] link_path_walk+0x115/0x550 [ 205.493311] path_lookupat.isra.0+0xb2/0x200 [ 205.493803] filename_lookup+0xda/0x240 [ 205.495747] vfs_fstatat+0x7b/0xb0 Fetch a reproducer in [Link]. Fix it by checking realinode whether to be NULL before accessing it. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217404 Fixes: 332f606b ("ovl: enable RCU'd ->get_acl()") Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 1686eed6)
-
由 Zhihao Cheng 提交于
hulk inclusion category: perf bugzilla: https://gitee.com/openeuler/kernel/issues/I6ZCW0 CVE: NA -------------------------------- Fix kabi broken for importing new inode operation get_inode_acl. Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit b0ae4965)
-
由 Miklos Szeredi 提交于
mainline inclusion from mainline-v5.15-rc1 commit 332f606b category: perf bugzilla: https://gitee.com/openeuler/kernel/issues/I6ZCW0 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=332f606b32b6291a944c8cf23b91f53a6e676525 -------------------------------- Overlayfs does not cache ACL's (to avoid double caching). Instead it just calls the underlying filesystem's i_op->get_acl(), which will return the cached value, if possible. In rcu path walk, however, get_cached_acl_rcu() is employed to get the value from the cache, which will fail on overlayfs resulting in dropping out of rcu walk mode. This can result in a big performance hit in certain situations. Fix by calling ->get_acl() with rcu=true in case of ACL_DONT_CACHE (which indicates pass-through) Reported-by: Ngaryhuang <zjh.20052005@163.com> Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> Conflicts: fs/posix_acl.c Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 488d4279)
-
由 Miklos Szeredi 提交于
mainline inclusion from mainline-v5.15-rc1 commit 0cad6246 category: perf bugzilla: https://gitee.com/openeuler/kernel/issues/I6ZCW0 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0cad6246621b5887d5b33fea84219d2a71f2f99a -------------------------------- Add a rcu argument to the ->get_acl() callback to allow get_cached_acl_rcu() to call the ->get_acl() method in the next patch. Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> [chengzhihao: rename get_acl to get_acl2 to prevent KABI changes, and only backport(realize) overlayfs] Conflicts: fs/overlayfs/dir.c fs/overlayfs/inode.c fs/overlayfs/overlayfs.h fs/posix_acl.c include/linux/fs.h Signed-off-by: NZhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: NZhang Yi <yi.zhang@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 67cd4976)
-
由 Linus Torvalds 提交于
stable inclusion from stable-v5.10.170 commit 12e3119a87627741bd3871c895ce198f21529eb3 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I71N8L Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=12e3119a87627741bd3871c895ce198f21529eb3 -------------------------------- commit f3dd0c53 upstream. Commit 74e19ef0 ("uaccess: Add speculation barrier to copy_from_user()") built fine on x86-64 and arm64, and that's the extent of my local build testing. It turns out those got the <linux/nospec.h> include incidentally through other header files (<linux/kvm_host.h> in particular), but that was not true of other architectures, resulting in build errors kernel/bpf/core.c: In function ‘___bpf_prog_run’: kernel/bpf/core.c:1913:3: error: implicit declaration of function ‘barrier_nospec’ so just make sure to explicitly include the proper <linux/nospec.h> header file to make everybody see it. Fixes: 74e19ef0 ("uaccess: Add speculation barrier to copy_from_user()") Reported-by: Nkernel test robot <lkp@intel.com> Reported-by: NViresh Kumar <viresh.kumar@linaro.org> Reported-by: NHuacai Chen <chenhuacai@loongson.cn> Tested-by: NGeert Uytterhoeven <geert@linux-m68k.org> Tested-by: NDave Hansen <dave.hansen@linux.intel.com> Acked-by: NAlexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: kernel/bpf/core.c Signed-off-by: NMa Wupeng <mawupeng1@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: NNanyong Sun <sunnanyong@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit b169caaa)
-
由 Dave Hansen 提交于
stable inclusion from stable-v5.10.170 commit 3b6ce54cfa2c04f0636fd0c985913af8703b408d category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I71N8L CVE: CVE-2023-0459 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3b6ce54cfa2c04f0636fd0c985913af8703b408d -------------------------------- commit 74e19ef0 upstream. The results of "access_ok()" can be mis-speculated. The result is that you can end speculatively: if (access_ok(from, size)) // Right here even for bad from/size combinations. On first glance, it would be ideal to just add a speculation barrier to "access_ok()" so that its results can never be mis-speculated. But there are lots of system calls just doing access_ok() via "copy_to_user()" and friends (example: fstat() and friends). Those are generally not problematic because they do not _consume_ data from userspace other than the pointer. They are also very quick and common system calls that should not be needlessly slowed down. "copy_from_user()" on the other hand uses a user-controller pointer and is frequently followed up with code that might affect caches. Take something like this: if (!copy_from_user(&kernelvar, uptr, size)) do_something_with(kernelvar); If userspace passes in an evil 'uptr' that *actually* points to a kernel addresses, and then do_something_with() has cache (or other) side-effects, it could allow userspace to infer kernel data values. Add a barrier to the common copy_from_user() code to prevent mis-speculated values which happen after the copy. Also add a stub for architectures that do not define barrier_nospec(). This makes the macro usable in generic code. Since the barrier is now usable in generic code, the x86 #ifdef in the BPF code can also go away. Reported-by: NJordy Zomer <jordyzomer@google.com> Suggested-by: NLinus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: Daniel Borkmann <daniel@iogearbox.net> # BPF bits Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: lib/usercopy.c Signed-off-by: NMa Wupeng <mawupeng1@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: NNanyong Sun <sunnanyong@huawei.com> Reviewed-by: NXiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com> (cherry picked from commit 4e0f3c2f)
-
- 16 5月, 2023 7 次提交
-
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @Tom_zc Enabling bcache causes interface changes and needs to be rolled back。 Link:https://gitee.com/openeuler/kernel/pulls/771 Reviewed-by: Jialin Zhang <zhangjialin11@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
由 Yu Liao 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I7393E CVE: NA ------------------------------- Currently, only x86 architecture supports the CLOCKSOURCE_VALIDATE_LAST_CYCLE option. This option ensures that the timestamps returned by the clocksource are monotonically increasing, and helps avoid issues caused by hardware failures. This commit makes CLOCKSOURCE_VALIDATE_LAST_CYCLE configurable on the arm64 architecture, helps increase system stability and reliability. Signed-off-by: NYu Liao <liaoyu15@huawei.com>
-
由 Tom_zc 提交于
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6N4EP CVE: NA -------------------------------- This reverts commit fc3bfef5. This commit causes the interface to change. Fixes: fc3bfef5 ("config: enable bcache for x86 by default") Signed-off-by: NChao Zhu <tom_toworld@163.com>
-
由 openeuler-ci-bot 提交于
Merge Pull Request from: @Tom_zc To fix issue about https://gitee.com/openeuler/kernel/issues/I6N4EP?from=project-issue and test report is in here: https://gitee.com/Tom_zc/my_doc/blob/master/kernel%E7%A4%BE%E5%8C%BA/%E9%AA%8C%E8%AF%81openEuler%E7%BC%96%E8%AF%91%E6%94%AF%E6%8C%81bcache.md Link:https://gitee.com/openeuler/kernel/pulls/757 Reviewed-by: Liu Chao <liuchao173@huawei.com> Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com>
-
由 Marc Zyngier 提交于
mainline inclusion from mainline-v5.19-rc1 commit 7d26b051 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6YAMV CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7d26b0516a0df5888fd1486054bc5159f6c0b88f ------------------------------- Marginally optimise __delay() by using a WFIT/WFET sequence. It probably is a win if no interrupt fires during the delay. Signed-off-by: NMarc Zyngier <maz@kernel.org> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419182755.601427-11-maz@kernel.orgSigned-off-by: NKunkun Jiang <jiangkunkun@huawei.com>
-
由 Marc Zyngier 提交于
mainline inclusion from mainline-v5.19-rc1 commit 9eae5885 category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6YAMV CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9eae588529751f95834bca775b30b66291def7f6 ------------------------------- Just like we have helpers for WFI and WFE, add the WFxT versions. Note that the encoding is that reported by objdump, as no currrent toolchain knows about these instructions yet. Signed-off-by: NMarc Zyngier <maz@kernel.org> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419182755.601427-10-maz@kernel.orgSigned-off-by: NKunkun Jiang <jiangkunkun@huawei.com>
-
由 Marc Zyngier 提交于
mainline inclusion from mainline-v5.19-rc1 commit 69bb02eb category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I6YAMV CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=69bb02ebc38ace438c9cd7c5315cfe43862b51fe ------------------------------- In order to allow userspace to enjoy WFET, add a new HWCAP that advertises it when available. Signed-off-by: NMarc Zyngier <maz@kernel.org> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419182755.601427-9-maz@kernel.orgSigned-off-by: NKunkun Jiang <jiangkunkun@huawei.com>
-