- 18 1月, 2023 1 次提交
-
-
由 Guo Mengqi 提交于
hulk inclusion category: other bugzilla: https://gitee.com/openeuler/kernel/issues/I69JDF CVE: NA ------------------------------- Delete svm driver, as it is specially designed for Hisilicon platform. Signed-off-by: NGuo Mengqi <guomengqi3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NJialin Zhang <zhangjialin11@huawei.com>
-
- 13 12月, 2022 1 次提交
-
-
由 David Hildenbrand 提交于
stable inclusion from stable-v5.10.140 commit 62af37c5cd7f5fd071086cab645844bf5bcdc0ef category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I63FTT Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=62af37c5cd7f5fd071086cab645844bf5bcdc0ef -------------------------------- commit f96f7a40 upstream. Patch series "mm/hugetlb: fix write-fault handling for shared mappings", v2. I observed that hugetlb does not support/expect write-faults in shared mappings that would have to map the R/O-mapped page writable -- and I found two case where we could currently get such faults and would erroneously map an anon page into a shared mapping. Reproducers part of the patches. I propose to backport both fixes to stable trees. The first fix needs a small adjustment. This patch (of 2): Staring at hugetlb_wp(), one might wonder where all the logic for shared mappings is when stumbling over a write-protected page in a shared mapping. In fact, there is none, and so far we thought we could get away with that because e.g., mprotect() should always do the right thing and map all pages directly writable. Looks like we were wrong: -------------------------------------------------------------------------- #include <stdio.h> #include <stdlib.h> #include <string.h> #include <fcntl.h> #include <unistd.h> #include <errno.h> #include <sys/mman.h> #define HUGETLB_SIZE (2 * 1024 * 1024u) static void clear_softdirty(void) { int fd = open("/proc/self/clear_refs", O_WRONLY); const char *ctrl = "4"; int ret; if (fd < 0) { fprintf(stderr, "open(clear_refs) failed\n"); exit(1); } ret = write(fd, ctrl, strlen(ctrl)); if (ret != strlen(ctrl)) { fprintf(stderr, "write(clear_refs) failed\n"); exit(1); } close(fd); } int main(int argc, char **argv) { char *map; int fd; fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT); if (!fd) { fprintf(stderr, "open() failed\n"); return -errno; } if (ftruncate(fd, HUGETLB_SIZE)) { fprintf(stderr, "ftruncate() failed\n"); return -errno; } map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); if (map == MAP_FAILED) { fprintf(stderr, "mmap() failed\n"); return -errno; } *map = 0; if (mprotect(map, HUGETLB_SIZE, PROT_READ)) { fprintf(stderr, "mmprotect() failed\n"); return -errno; } clear_softdirty(); if (mprotect(map, HUGETLB_SIZE, PROT_READ|PROT_WRITE)) { fprintf(stderr, "mmprotect() failed\n"); return -errno; } *map = 0; return 0; } -------------------------------------------------------------------------- Above test fails with SIGBUS when there is only a single free hugetlb page. # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages # ./test Bus error (core dumped) And worse, with sufficient free hugetlb pages it will map an anonymous page into a shared mapping, for example, messing up accounting during unmap and breaking MAP_SHARED semantics: # echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages # ./test # cat /proc/meminfo | grep HugePages_ HugePages_Total: 2 HugePages_Free: 1 HugePages_Rsvd: 18446744073709551615 HugePages_Surp: 0 Reason in this particular case is that vma_wants_writenotify() will return "true", removing VM_SHARED in vma_set_page_prot() to map pages write-protected. Let's teach vma_wants_writenotify() that hugetlb does not support softdirty tracking. Link: https://lkml.kernel.org/r/20220811103435.188481-1-david@redhat.com Link: https://lkml.kernel.org/r/20220811103435.188481-2-david@redhat.com Fixes: 64e45507 ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared") Signed-off-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Peter Feiner <pfeiner@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Jamie Liu <jamieliu@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: <stable@vger.kernel.org> [3.18+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
- 18 11月, 2022 1 次提交
-
-
由 Miaohe Lin 提交于
stable inclusion from stable-v5.10.137 commit 4ffa6cecb53d46af8f869cc7a5a376341ebef79f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I60PLB Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4ffa6cecb53d46af8f869cc7a5a376341ebef79f -------------------------------- [ Upstream commit 7f82f922 ] Since the beginning, charged is set to 0 to avoid calling vm_unacct_memory twice because vm_unacct_memory will be called by above unmap_region. But since commit 4f74d2c8 ("vm: remove 'nr_accounted' calculations from the unmap_vmas() interfaces"), unmap_region doesn't call vm_unacct_memory anymore. So charged shouldn't be set to 0 now otherwise the calling to paired vm_unacct_memory will be missed and leads to imbalanced account. Link: https://lkml.kernel.org/r/20220618082027.43391-1-linmiaohe@huawei.com Fixes: 4f74d2c8 ("vm: remove 'nr_accounted' calculations from the unmap_vmas() interfaces") Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com>
-
- 27 9月, 2022 1 次提交
-
-
由 Jann Horn 提交于
stable inclusion from stable-v5.10.142 commit 895428ee124ad70b9763259308354877b725c31d category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5PE9S CVE: CVE-2022-39188 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=895428ee124ad70b9763259308354877b725c31d -------------------------------- commit b67fbebd upstream. Some drivers rely on having all VMAs through which a PFN might be accessible listed in the rmap for correctness. However, on X86, it was possible for a VMA with stale TLB entries to not be listed in the rmap. This was fixed in mainline with commit b67fbebd ("mmu_gather: Force tlb-flush VM_PFNMAP vmas"), but that commit relies on preceding refactoring in commit 18ba064e ("mmu_gather: Let there be one tlb_{start,end}_vma() implementation") and commit 1e9fdf21 ("mmu_gather: Remove per arch tlb_{start,end}_vma()"). This patch provides equivalent protection without needing that refactoring, by forcing a TLB flush between removing PTEs in unmap_vmas() and the call to unlink_file_vma() in free_pgtables(). [This is a stable-specific rewrite of the upstream commit!] Signed-off-by: NJann Horn <jannh@google.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Nze zuo <zuoze1@huawei.com> Reviewed-by: NChen Wandun <chenwandun@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 26 7月, 2022 1 次提交
-
-
由 Christophe Leroy 提交于
stable inclusion from stable-v5.10.113 commit 6b932920b96fc3002352fe8225ec63a1cd1717ec category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5ISAH Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6b932920b96fc3002352fe8225ec63a1cd1717ec -------------------------------- commit 5f24d5a5 upstream. This is a fix for commit f6795053 ("mm: mmap: Allow for "high" userspace addresses") for hugetlb. This patch adds support for "high" userspace addresses that are optionally supported on the system and have to be requested via a hint mechanism ("high" addr parameter to mmap). Architectures such as powerpc and x86 achieve this by making changes to their architectural versions of hugetlb_get_unmapped_area() function. However, arm64 uses the generic version of that function. So take into account arch_get_mmap_base() and arch_get_mmap_end() in hugetlb_get_unmapped_area(). To allow that, move those two macros out of mm/mmap.c into include/linux/sched/mm.h If these macros are not defined in architectural code then they default to (TASK_SIZE) and (base) so should not introduce any behavioural changes to architectures that do not define them. For the time being, only ARM64 is affected by this change. Catalin (ARM64) said "We should have fixed hugetlb_get_unmapped_area() as well when we added support for 52-bit VA. The reason for commit f6795053 was to prevent normal mmap() from returning addresses above 48-bit by default as some user-space had hard assumptions about this. It's a slight ABI change if you do this for hugetlb_get_unmapped_area() but I doubt anyone would notice. It's more likely that the current behaviour would cause issues, so I'd rather have them consistent. Basically when arm64 gained support for 52-bit addresses we did not want user-space calling mmap() to suddenly get such high addresses, otherwise we could have inadvertently broken some programs (similar behaviour to x86 here). Hence we added commit f6795053. But we missed hugetlbfs which could still get such high mmap() addresses. So in theory that's a potential regression that should have bee addressed at the same time as commit f6795053 (and before arm64 enabled 52-bit addresses)" Link: https://lkml.kernel.org/r/ab847b6edb197bffdfe189e70fb4ac76bfe79e0d.1650033747.git.christophe.leroy@csgroup.eu Fixes: f6795053 ("mm: mmap: Allow for "high" userspace addresses") Signed-off-by: NChristophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Cc: Steve Capper <steve.capper@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> [5.0.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com> Conflicts: fs/hugetlbfs/inode.c Acked-by: NXie XiuQi <xiexiuqi@huawei.com>
-
- 05 7月, 2022 1 次提交
-
-
由 Randy Dunlap 提交于
stable inclusion from stable-v5.10.110 commit d57feed3b11404ff7fca19b3576aff30459d68bf bugzilla: https://gitee.com/openeuler/kernel/issues/I574AL Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d57feed3b11404ff7fca19b3576aff30459d68bf -------------------------------- commit e6d09493 upstream. __setup() handlers should return 1 if the command line option is handled and 0 if not (or maybe never return 0; it just pollutes init's environment). This prevents: Unknown kernel command line parameters \ "BOOT_IMAGE=/boot/bzImage-517rc5 stack_guard_gap=100", will be \ passed to user space. Run /sbin/init as init process with arguments: /sbin/init with environment: HOME=/ TERM=linux BOOT_IMAGE=/boot/bzImage-517rc5 stack_guard_gap=100 Return 1 to indicate that the boot option has been handled. Note that there is no warning message if someone enters: stack_guard_gap=anything_invalid and 'val' and stack_guard_gap are both set to 0 due to the use of simple_strtoul(). This could be improved by using kstrtoxxx() and checking for an error. It appears that having stack_guard_gap == 0 is valid (if unexpected) since using "stack_guard_gap=0" on the kernel command line does that. Link: https://lkml.kernel.org/r/20220222005817.11087-1-rdunlap@infradead.org Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru Fixes: 1be7107f ("mm: larger stack guard gap, between vmas") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Reported-by: NIgor Zhbanov <i.zhbanov@omprussia.ru> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NYu Liao <liaoyu15@huawei.com> Reviewed-by: NWei Li <liwei391@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 17 1月, 2022 1 次提交
-
-
由 Vladimir Murzin 提交于
mainline inclusion from mainline-v5.13-rc1 commit 18107f8a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4QUF2 CVE: NA ---------------------- Enhanced Privileged Access Never (EPAN) allows Privileged Access Never to be used with Execute-only mappings. Absence of such support was a reason for 24cecc37 ("arm64: Revert support for execute-only user mappings"). Thus now it can be revisited and re-enabled. Cc: Kees Cook <keescook@chromium.org> Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com> Acked-by: NWill Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210312173811.58284-2-vladimir.murzin@arm.comSigned-off-by: NCatalin Marinas <catalin.marinas@arm.com> Conflicts: arch/arm64/Kconfig arch/arm64/include/asm/cpucaps.h [wangxiongfeng: fix conflicts caused by context mismatch.] Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 30 12月, 2021 4 次提交
-
-
由 Wang Wensheng 提交于
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- The situation below is not allowed: int *result = mmap(ADDR, sizeof(int), PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0); As share pool uses an independent UVA allocation algorithm, it may produce an address that is conflicted with user-specified address. Signed-off-by: NTang Yizhou <tangyizhou@huawei.com> Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wang Wensheng 提交于
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- The fork() will create the new mm for new process, the mm should not take any information from the parent process, so need to clean it. The exit() will mmput the mm and free the memory, if the mm is alrready be used for sp_group, need to clean the group first. Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Signed-off-by: NTang Yizhou <tangyizhou@huawei.com> Signed-off-by: NPeng Wu <wupeng58@huawei.com> Signed-off-by: NZhou Guanghui <zhouguanghui1@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wang Wensheng 提交于
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- Change the mmap_base in mm_struct and check the limit in get_unmapped_area. Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Signed-off-by: NPeng Wu <wupeng58@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wang Wensheng 提交于
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- The do_mmap/mmap_region/__mm_populate could only be used to handle the current process, now the share pool need to handle the other process and create memory mmaping, so need to export new function to distinguish different process and handle it, it would not break the current logic and only valid for share pool. Signed-off-by: NTang Yizhou <tangyizhou@huawei.com> Signed-off-by: NZhou Guanghui <zhouguanghui1@huawei.com> Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 23 12月, 2021 1 次提交
-
-
由 Lijun Fang 提交于
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4M24Q CVE: NA ------------------------------------------------- The DvPP means Davinci Video Pre-Processor, add new config ASCEND_FEATURES and DVPP_MMAP to enable the DvPP features for Ascend platform. The DvPP could only use a limit range of virtual address, just like the Ascend310/910 could only use the 4 GB range of virtual address, so add a new mmap flag which is named MAP_DVPP to use the DvPP processor by mmap syscall, the new flag is only valid for Ascend platform. You should alloc the memory for dvpp like this: addr = mmap(NULL, length, PROT_READ, MAP_ANONYMOUS | MAP_DVPP, -1, 0); Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 10 12月, 2021 1 次提交
-
-
由 Fang Lijun 提交于
ascend inclusion category: Bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMLR CVE: NA -------------- System cann't use the cdm nodes memory, but it can mmap all nodes huge pages, so it will cause Bus error when mmap succeed but the huge pages were not enough. When set the cdmmask, users will transfer the numa id by mmap flag to map the specific numa node hugepages, if there was not enough hugepages on this node, return -ENOMEM. Dvpp use flags MAP_CHECKNODE to enable check node hugetlb. The global variable numanode will cause the mmap not be reenterable, so use the flags BITS[26:31] directly. v2: fix a compiling error on platforms such as mips Signed-off-by: NFang Lijun <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 03 12月, 2021 1 次提交
-
-
由 Lijun Fang 提交于
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0 CVE: NA ------------------------------------------------- Add alloc and release memory functions in svm. And the physical address of the memory is within 4GB. For example: /* alloc */ fd = open("dev/svm0",); mmap(0, ALLOC_SIZE,, MAP_PA32BIT, fd, 0); /* free */ ioctl(fd, SVM_IOCTL_RELEASE_PHYS32,); close(fd); Signed-off-by: NLijun Fang <fanglijun3@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 30 10月, 2021 2 次提交
-
-
由 Xiongfeng Wang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4AHP2 CVE: NA ------------------------------------------------- Fix some format issues in mm/mmap.c. This patch also fix the wrong address range of mmu_notifier_range_init() in do_user_swap(). Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Xiongfeng Wang 提交于
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4AHP2 CVE: NA ------------------------------------------------- When userswap is enabled, the memory pointed by 'pages' is not freed in abnormal branch in do_mmap(). To fix the issue and keep do_mmap() mostly unchanged, we rename do_mmap() to __do_mmap() and extract the memory alloc and free code out of __do_mmap(). When __do_mmap() returns a error value, we goto the error label to free the memory. Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 02 9月, 2021 1 次提交
-
-
由 Xiongfeng Wang 提交于
hulk inclusion category: bugfix bugzilla: 175146 CVE: NA ------------------------------------ Disable userswap by default and add a kernel parameter to enable it. Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Conflicts: include/linux/userfaultfd_k.h Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 19 7月, 2021 1 次提交
-
-
由 Guo Fan 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I40AXF CVE: NA -------------------------------------- To make sure there are no other userspace threads access the memory region we are swapping out, we need unmmap the memory region, map it to a new address and use the new address to perform the swapout. We add a new flag 'MAP_REPLACE' for mmap() to unmap the pages of the input parameter 'VA' and remap them to a new tmpVA. Signed-off-by: NGuo Fan <guofan5@huawei.com> Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NYang Yingliang <yangyingliang@huawei.com> Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: Ntong tiangen <tongtiangen@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 07 12月, 2020 1 次提交
-
-
由 Liu Zixian 提交于
On success, mmap should return the begin address of newly mapped area, but patch "mm: mmap: merge vma after call_mmap() if possible" set vm_start of newly merged vma to return value addr. Users of mmap will get wrong address if vma is merged after call_mmap(). We fix this by moving the assignment to addr before merging vma. We have a driver which changes vm_flags, and this bug is found by our testcases. Fixes: d70cec89 ("mm: mmap: merge vma after call_mmap() if possible") Signed-off-by: NLiu Zixian <liuzixian4@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJason Gunthorpe <jgg@nvidia.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Hongxiang Lou <louhongxiang@huawei.com> Cc: Hu Shiyuan <hushiyuan@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Link: https://lkml.kernel.org/r/20201203085350.22624-1-liuzixian4@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 10月, 2020 2 次提交
-
-
由 Liam R. Howlett 提交于
There are two locations that have a block of code for munmapping a vma range. Change those two locations to use a function and add meaningful comments about what happens to the arguments, which was unclear in the previous code. Signed-off-by: NLiam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200818154707.2515169-2-Liam.Howlett@Oracle.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Liam R. Howlett 提交于
There are three places that the next vma is required which uses the same block of code. Replace the block with a function and add comments on what happens in the case where NULL is encountered. Signed-off-by: NLiam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200818154707.2515169-1-Liam.Howlett@Oracle.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 10月, 2020 2 次提交
-
-
由 Jann Horn 提交于
The preceding patches have ensured that core dumping properly takes the mmap_lock. Thanks to that, we can now remove mmget_still_valid() and all its users. Signed-off-by: NJann Horn <jannh@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-8-jannh@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miaohe Lin 提交于
In commit 1da177e4 ("Linux-2.6.12-rc2"), the helper put_write_access() came with the atomic_dec operation of the i_writecount field. But it forgot to use this helper in __vma_link_file() and dup_mmap(). Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200924115235.5111-1-linmiaohe@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 10月, 2020 8 次提交
-
-
由 Liao Pingfang 提交于
Replace do_brk with do_brk_flags in comment of insert_vm_struct(), since do_brk was removed in following commit. Fixes: bb177a73 ("mm: do not bug_on on incorrect length in __mm_populate()") Signed-off-by: NLiao Pingfang <liao.pingfang@zte.com.cn> Signed-off-by: NYi Wang <wang.yi59@zte.com.cn> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/1600650778-43230-1-git-send-email-wang.yi59@zte.com.cnSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miaohe Lin 提交于
In commit 1da177e4 ("Linux-2.6.12-rc2"), the helper allow_write_access came with the atomic_inc operation of the i_writecount field in the func __remove_shared_vm_struct(). But it forgot to use this helper function. Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200921115814.39680-1-linmiaohe@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miaohe Lin 提交于
Commit 4bb5f5d9 ("mm: allow drivers to prevent new writable mappings") changed i_mmap_writable from unsigned int to atomic_t and add the helper function mapping_allow_writable() to atomic_inc i_mmap_writable. But it forgot to use this helper function in dup_mmap() and __vma_link_file(). Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Christian Kellner <christian@kellner.me> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Adrian Reber <areber@redhat.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20200917112736.7789-1-linmiaohe@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wei Yang 提交于
In __vma_adjust(), we do the check on *root* to decide whether to adjust the address_space. It seems to be more meaningful to do the check on *file* itself. This means we are adjusting some data because it is a file backed vma. Since we seem to assume the address_space is valid if it is a file backed vma, let's just replace *root* with *file* here. Signed-off-by: NWei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200913133631.37781-2-richard.weiyang@linux.alibaba.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wei Yang 提交于
*root* with type of struct rb_root_cached is an element of *mapping* with type of struct address_space. This implies when we have a valid *root* it must be a part of valid *mapping*. So we can merge these two checks together to make the code more easy to read and to save some cpu cycles. Signed-off-by: NWei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200913133631.37781-1-richard.weiyang@linux.alibaba.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wei Yang 提交于
Instead of converting adjust_next between bytes and pages number, let's just store the virtual address into adjust_next. Also, this patch fixes one typo in the comment of vma_adjust_trans_huge(). [vbabka@suse.cz: changelog tweak] Signed-off-by: NWei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Mike Kravetz <mike.kravetz@oracle.com> Link: http://lkml.kernel.org/r/20200828081031.11306-1-richard.weiyang@linux.alibaba.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wei Yang 提交于
These two functions share the same logic except ignore a different vma. Let's reuse the code. Signed-off-by: NWei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200809232057.23477-2-richard.weiyang@linux.alibaba.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Wei Yang 提交于
__vma_unlink_common() and __vma_unlink() are counterparts. Since there is no function named __vma_unlink(), let's rename __vma_unlink_common() to __vma_unlink() to make the code more self-explanatory and easy for audience to understand. Otherwise we may expect there are several variants of vma_unlink() and __vma_unlink_common() is used by them. Signed-off-by: NWei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200809232057.23477-1-richard.weiyang@linux.alibaba.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 10月, 2020 1 次提交
-
-
由 Miaohe Lin 提交于
The syzbot reported the below general protection fault: general protection fault, probably for non-canonical address 0xe00eeaee0000003b: 0000 [#1] PREEMPT SMP KASAN KASAN: maybe wild-memory-access in range [0x00777770000001d8-0x00777770000001df] CPU: 1 PID: 10488 Comm: syz-executor721 Not tainted 5.9.0-rc3-syzkaller #0 RIP: 0010:unlink_file_vma+0x57/0xb0 mm/mmap.c:164 Call Trace: free_pgtables+0x1b3/0x2f0 mm/memory.c:415 exit_mmap+0x2c0/0x530 mm/mmap.c:3184 __mmput+0x122/0x470 kernel/fork.c:1076 mmput+0x53/0x60 kernel/fork.c:1097 exit_mm kernel/exit.c:483 [inline] do_exit+0xa8b/0x29f0 kernel/exit.c:793 do_group_exit+0x125/0x310 kernel/exit.c:903 get_signal+0x428/0x1f00 kernel/signal.c:2757 arch_do_signal+0x82/0x2520 arch/x86/kernel/signal.c:811 exit_to_user_mode_loop kernel/entry/common.c:136 [inline] exit_to_user_mode_prepare+0x1ae/0x200 kernel/entry/common.c:167 syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:242 entry_SYSCALL_64_after_hwframe+0x44/0xa9 It's because the ->mmap() callback can change vma->vm_file and fput the original file. But the commit d70cec89 ("mm: mmap: merge vma after call_mmap() if possible") failed to catch this case and always fput() the original file, hence add an extra fput(). [ Thanks Hillf for pointing this extra fput() out. ] Fixes: d70cec89 ("mm: mmap: merge vma after call_mmap() if possible") Reported-by: syzbot+c5d5a51dcbb558ca0cb5@syzkaller.appspotmail.com Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Christian König <ckoenig.leichtzumerken@gmail.com> Cc: Hongxiang Lou <louhongxiang@huawei.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Airlie <airlied@redhat.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: John Hubbard <jhubbard@nvidia.com> Link: https://lkml.kernel.org/r/20200916090733.31427-1-linmiaohe@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 9月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
Replace the two negative flags that are always used together with a single positive flag that indicates the writeback capability instead of two related non-capabilities. Also remove the pointless wrappers to just check the flag. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 04 9月, 2020 1 次提交
-
-
由 Catalin Marinas 提交于
Similarly to arch_validate_prot() called from do_mprotect_pkey(), an architecture may need to sanity-check the new vm_flags. Define a dummy function always returning true. In addition to do_mprotect_pkey(), also invoke it from mmap_region() prior to updating vma->vm_page_prot to allow the architecture code to veto potentially inconsistent vm_flags. Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Acked-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 08 8月, 2020 3 次提交
-
-
由 Peter Collingbourne 提交于
The current split between do_mmap() and do_mmap_pgoff() was introduced in commit 1fcfd8db ("mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff()") to support MPX. The wrapper function do_mmap_pgoff() always passed 0 as the value of the vm_flags argument to do_mmap(). However, MPX support has subsequently been removed from the kernel and there were no more direct callers of do_mmap(); all calls were going via do_mmap_pgoff(). Simplify the code by removing do_mmap_pgoff() and changing all callers to directly call do_mmap(), which now no longer takes a vm_flags argument. Signed-off-by: NPeter Collingbourne <pcc@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Link: http://lkml.kernel.org/r/20200727194109.1371462-1-pcc@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miaohe Lin 提交于
The vm_flags may be changed after call_mmap() because drivers may set some flags for their own purpose. As a result, we failed to merge the adjacent vma due to the different vm_flags as userspace can't pass in the same one. Try to merge vma after call_mmap() to fix this issue. Signed-off-by: NHongxiang Lou <louhongxiang@huawei.com> Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1594954065-23733-1-git-send-email-linmiaohe@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zhen Lei 提交于
Look at the pseudo code below. It's very clear that, the judgement "!is_file_hugepages(file)" at 3) is duplicated to the one at 1), we can use "else if" to avoid it. And the assignment "retval = -EINVAL" at 2) is only needed by the branch 3), because "retval" will be overwritten at 4). No functional change, but it can reduce the code size. Maybe more clearer? Before: text data bss dec hex filename 28733 1590 1 30324 7674 mm/mmap.o After: text data bss dec hex filename 28701 1590 1 30292 7654 mm/mmap.o ====pseudo code====: if (!(flags & MAP_ANONYMOUS)) { ... 1) if (is_file_hugepages(file)) len = ALIGN(len, huge_page_size(hstate_file(file))); 2) retval = -EINVAL; 3) if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file))) goto out_fput; } else if (flags & MAP_HUGETLB) { ... } ... 4) retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff); out_fput: ... return retval; Signed-off-by: NZhen Lei <thunder.leizhen@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200705080112.1405-1-thunder.leizhen@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 7月, 2020 1 次提交
-
-
由 Kirill A. Shutemov 提交于
VMA with VM_GROWSDOWN or VM_GROWSUP flag set can change their size under mmap_read_lock(). It can lead to race with __do_munmap(): Thread A Thread B __do_munmap() detach_vmas_to_be_unmapped() mmap_write_downgrade() expand_downwards() vma->vm_start = address; // The VMA now overlaps with // VMAs detached by the Thread A // page fault populates expanded part // of the VMA unmap_region() // Zaps pagetables partly // populated by Thread B Similar race exists for expand_upwards(). The fix is to avoid downgrading mmap_lock in __do_munmap() if detached VMAs are next to VM_GROWSDOWN or VM_GROWSUP VMA. [akpm@linux-foundation.org: s/mmap_sem/mmap_lock/ in comment] Fixes: dd2283f2 ("mm: mmap: zap pages with read mmap_sem in munmap") Reported-by: NJann Horn <jannh@google.com> Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NYang Shi <yang.shi@linux.alibaba.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> [4.20+] Link: http://lkml.kernel.org/r/20200709105309.42495-1-kirill.shutemov@linux.intel.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 6月, 2020 1 次提交
-
-
由 Paul E. McKenney 提交于
A large process running on a heavily loaded system can encounter the following RCU CPU stall warning: rcu: INFO: rcu_sched self-detected stall on CPU rcu: 3-....: (20998 ticks this GP) idle=4ea/1/0x4000000000000002 softirq=556558/556558 fqs=5190 (t=21013 jiffies g=1005461 q=132576) NMI backtrace for cpu 3 CPU: 3 PID: 501900 Comm: aio-free-ring-w Kdump: loaded Not tainted 5.2.9-108_fbk12_rc3_3858_gb83b75af7909 #1 Hardware name: Wiwynn HoneyBadger/PantherPlus, BIOS HBM6.71 02/03/2016 Call Trace: <IRQ> dump_stack+0x46/0x60 nmi_cpu_backtrace.cold.3+0x13/0x50 ? lapic_can_unplug_cpu.cold.27+0x34/0x34 nmi_trigger_cpumask_backtrace+0xba/0xca rcu_dump_cpu_stacks+0x99/0xc7 rcu_sched_clock_irq.cold.87+0x1aa/0x397 ? tick_sched_do_timer+0x60/0x60 update_process_times+0x28/0x60 tick_sched_timer+0x37/0x70 __hrtimer_run_queues+0xfe/0x270 hrtimer_interrupt+0xf4/0x210 smp_apic_timer_interrupt+0x5e/0x120 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:kmem_cache_free+0x223/0x300 Code: 88 00 00 00 0f 85 ca 00 00 00 41 8b 55 18 31 f6 f7 da 41 f6 45 0a 02 40 0f 94 c6 83 c6 05 9c 41 5e fa e8 a0 a7 01 00 41 56 9d <49> 8b 47 08 a8 03 0f 85 87 00 00 00 65 48 ff 08 e9 3d fe ff ff 65 RSP: 0018:ffffc9000e8e3da8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13 RAX: 0000000000020000 RBX: ffff88861b9de960 RCX: 0000000000000030 RDX: fffffffffffe41e8 RSI: 000060777fe3a100 RDI: 000000000001be18 RBP: ffffea00186e7780 R08: ffffffffffffffff R09: ffffffffffffffff R10: ffff88861b9dea28 R11: ffff88887ffde000 R12: ffffffff81230a1f R13: ffff888854684dc0 R14: 0000000000000206 R15: ffff8888547dbc00 ? remove_vma+0x4f/0x60 remove_vma+0x4f/0x60 exit_mmap+0xd6/0x160 mmput+0x4a/0x110 do_exit+0x278/0xae0 ? syscall_trace_enter+0x1d3/0x2b0 ? handle_mm_fault+0xaa/0x1c0 do_group_exit+0x3a/0xa0 __x64_sys_exit_group+0x14/0x20 do_syscall_64+0x42/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 And on a PREEMPT=n kernel, the "while (vma)" loop in exit_mmap() can run for a very long time given a large process. This commit therefore adds a cond_resched() to this loop, providing RCU any needed quiescent states. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <linux-mm@kvack.org> Reviewed-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NJoel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: NPaul E. McKenney <paulmck@kernel.org>
-
- 10 6月, 2020 1 次提交
-
-
由 Michel Lespinasse 提交于
Convert comments that reference mmap_sem to reference mmap_lock instead. [akpm@linux-foundation.org: fix up linux-next leftovers] [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil] [akpm@linux-foundation.org: more linux-next fixups, per Michel] Signed-off-by: NMichel Lespinasse <walken@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NVlastimil Babka <vbabka@suse.cz> Reviewed-by: NDaniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-