- 19 1月, 2022 15 次提交
-
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- When THP is enabled, the allocation of a page(order=0) may be converted to an allocation of pages(order>0). In this case, the allocation will skip the dhugetlb_pool. When we want to use dynamic hugetlb feature, we have to disable THP for now. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- Add tracepoints for dynamic_hugetlb to track the process of page split, page merge, page migration, page allocation and page free. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- Add function to free huge page to dhugetlb_pool. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- Add function to alloc huge page from dhugetlb_pool. When process is bound to a mem_cgroup configured with dhugetlb_pool, only allowed to alloc huge page from dhugetlb_pool. If there is no huge pages in dhugetlb_pool, the mmap() will failed due to the reserve count introduced in previous patch. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- The dynamic hugetlb feature is based on hugetlb. There is a reserve count in hugetlb to determine if there were enough free huge pages to satisfy the requirement while mmap() to avoid SIGBUS at the next page fault time. Add similar count for dhugetlb_pool to avoid same problem. References: Documentation/vm/hugetlbfs_reserv.rst Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- Add new interface "dhugetlb.disable_normal_pages" to disable the allocation of normal pages from a hpool. This makes dynamic hugetlb more flexible. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- Add function to free page to dhugetlb_pool. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- Add function to alloc page from dhugetlb_pool. When process is bound to a mem_cgroup configured with dhugtlb_pool, alloc page from dhugetlb_pool firstly. If there is no page in dhugetlb_pool, fallback to alloc page from buddy system. As the process will alloc pages from dhugetlb_pool in the mem_cgroup, process is not allowed to migrate to other mem_cgroup. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- Sometimes, page merge may failed because some pages are still in use. Add migration function to enhance the merge function. This function relies on memory hotremove, so it only works when config MEMORY_HOTREMOVE is selected. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- When destroying hpool or alloc huge pages, the pages has been split may need to be merged to huge pages. Add functions to merge pages in dhugetlb_pool. The information about split huge pages has been recorded in hugepage_splitlists and can traverse it to merge huge pages. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- Currently, dynamic hugetlb support 1G/2M/4K pages. In the beginning, there were only 1G pages in the hpool. Add function to split pages in dhugetlb_pool. If 4K pages are insufficient, try to split 2M pages, and if 2M pages are insufficient, try to split 1G pages. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- Add two interfaces in mem_cgroup to configure the count of 1G/2M hugepages in dhugetlb_pool. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- Dynamic hugetlb is a self-developed feature based on the hugetlb and memcontrol. It supports to split huge page dynamically in a memory cgroup. There is a new structure dhugetlb_pool in every mem_cgroup to manage the pages configured to the mem_cgroup. For the mem_cgroup configured with dhugetlb_pool, processes in the mem_cgroup will preferentially use the pages in dhugetlb_pool. Dynamic hugetlb supports three types of pages, including 1G/2M huge pages and 4K pages. For the mem_cgroup configured with dhugetlb_pool, processes will be limited to alloc 1G/2M huge pages only from dhugetlb_pool. But there is no such constraint for 4K pages. If there are insufficient 4K pages in the dhugetlb_pool, pages can also be allocated from buddy system. So before using dynamic hugetlb, user must know how many huge pages they need. Usage: 1. Add 'dynamic_hugetlb=on' in cmdline to enable dynamic hugetlb feature. 2. Prealloc some 1G hugepages through hugetlb. 3. Create a mem_cgroup and configure dhugetlb_pool to mem_cgroup. 4. Configure the count of 1G/2M hugepages, and the remaining pages in dhugetlb_pool will be used as basic pages. 5. Bound a process to mem_cgroup. then the memory for it will be allocated from dhugetlb_pool. This patch add the corresponding structure dhugetlb_pool for dynamic hugetlb feature, the interface 'dhugetlb.nr_pages' in mem_cgroup to configure dhugetlb_pool and the cmdline 'dynamic_hugetlb=on' to enable dynamic hugetlb feature. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- In next patches, struct hugetlbfs_inode_info will be used to check whether a hugetlbfs file has memory in hpool, so add paramter hugetlbfs_inode_info to related functions, including hugetlb_acct_memory/hugepage_subpool_get_pages/ hugepage_subpool_put_pages. No functional changes. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
hulk inclusion category: feature bugzilla: 46904, https://gitee.com/openeuler/kernel/issues/I4QSHG CVE: NA -------------------------------- There are several functions that will be used in next patches for dynamic hugetlb feature. Declare them. No functional changes. Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 17 1月, 2022 3 次提交
-
-
由 Zhenguo Yao 提交于
mainline inclusion from mainline-v5.16-rc5 commit 4178158e category: bugfix bugzilla: 186043, https://gitee.com/openeuler/kernel/issues/I4QSF4 CVE: NA -------------------------------- Preallocation of gigantic pages can't work bacause of commit b5389086 ("hugetlbfs: extend the definition of hugepages parameter to support node allocation"). When nid is NUMA_NO_NODE(-1), alloc_bootmem_huge_page will always return without doing allocation. Fix this by adding more check. Link: https://lkml.kernel.org/r/20211129133803.15653-1-yaozhenguo1@gmail.com Fixes: b5389086 ("hugetlbfs: extend the definition of hugepages parameter to support node allocation") Signed-off-by: NZhenguo Yao <yaozhenguo1@gmail.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Tested-by: NMaxim Levitsky <mlevitsk@redhat.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Reviewed-by: NBaolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Zhenguo Yao 提交于
mainline inclusion from mainline-v5.16-rc1 commit b5389086 category: feature bugzilla: 186043, https://gitee.com/openeuler/kernel/issues/I4QSF4 CVE: NA -------------------------------- We can specify the number of hugepages to allocate at boot. But the hugepages is balanced in all nodes at present. In some scenarios, we only need hugepages in one node. For example: DPDK needs hugepages which are in the same node as NIC. If DPDK needs four hugepages of 1G size in node1 and system has 16 numa nodes we must reserve 64 hugepages on the kernel cmdline. But only four hugepages are used. The others should be free after boot. If the system memory is low(for example: 64G), it will be an impossible task. So extend the hugepages parameter to support specifying hugepages on a specific node. For example add following parameter: hugepagesz=1G hugepages=0:1,1:3 It will allocate 1 hugepage in node0 and 3 hugepages in node1. Link: https://lkml.kernel.org/r/20211005054729.86457-1-yaozhenguo1@gmail.comSigned-off-by: NZhenguo Yao <yaozhenguo1@gmail.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Zhenguo Yao <yaozhenguo1@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mike Rapoport <rppt@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Conflicts: mm/hugetlb.c Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Vladimir Murzin 提交于
mainline inclusion from mainline-v5.13-rc1 commit 18107f8a category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4QUF2 CVE: NA ---------------------- Enhanced Privileged Access Never (EPAN) allows Privileged Access Never to be used with Execute-only mappings. Absence of such support was a reason for 24cecc37 ("arm64: Revert support for execute-only user mappings"). Thus now it can be revisited and re-enabled. Cc: Kees Cook <keescook@chromium.org> Signed-off-by: NVladimir Murzin <vladimir.murzin@arm.com> Acked-by: NWill Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210312173811.58284-2-vladimir.murzin@arm.comSigned-off-by: NCatalin Marinas <catalin.marinas@arm.com> Conflicts: arch/arm64/Kconfig arch/arm64/include/asm/cpucaps.h [wangxiongfeng: fix conflicts caused by context mismatch.] Signed-off-by: NXiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by: NHanjun Guo <guohanjun@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 15 1月, 2022 1 次提交
-
-
由 Kemeng Shi 提交于
euleros inclusion category: feature feature: etmem bugzilla: https://gitee.com/openeuler/kernel/issues/I4OODH?from=project-issue CVE: NA ------------------------------------------------- Add proc/sys/vm/hugepage_nocache_copy switch. Set 1 to copy hugepage with movnt SSE instructoin if cpu support it. Set 0 to copy hugepage as usual. Signed-off-by: NKemeng Shi <shikemeng@huawei.com> Reviewed-by: Nlouhongxiang <louhongxiang@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 14 1月, 2022 7 次提交
-
-
由 Manjong Lee 提交于
stable inclusion from stable-v5.10.85 commit c581090228e3aeabb5081c1db8b2024ae8478f5b bugzilla: 186032 https://gitee.com/openeuler/kernel/issues/I4QVI4 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c581090228e3aeabb5081c1db8b2024ae8478f5b -------------------------------- commit 3c376dfa upstream. Initialize min_ratio if it is set during bdi unregistration. This can prevent problems that may occur a when bdi is removed without resetting min_ratio. For example. 1) insert external sdcard 2) set external sdcard's min_ratio 70 3) remove external sdcard without setting min_ratio 0 4) insert external sdcard 5) set external sdcard's min_ratio 70 << error occur(can't set) Because when an sdcard is removed, the present bdi_min_ratio value will remain. Currently, the only way to reset bdi_min_ratio is to reboot. [akpm@linux-foundation.org: tweak comment and coding style] Link: https://lkml.kernel.org/r/20211021161942.5983-1-mj0123.lee@samsung.comSigned-off-by: NManjong Lee <mj0123.lee@samsung.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Changheun Lee <nanich.lee@samsung.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: <seunghwan.hyun@samsung.com> Cc: <sookwan7.kim@samsung.com> Cc: <yt0928.kim@samsung.com> Cc: <junho89.kim@samsung.com> Cc: <jisoo2146.oh@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kefeng Wang 提交于
hulk inclusion category: bugfix bugzilla: 186017 https://gitee.com/openeuler/kernel/issues/I4DDEL -------------------------------- virt_addr_valid() could be insufficient to validate the virt addr on some architecture, which could lead to potential BUG which has been found on arm64/powerpc64. Let's add WARN_ON to check if the virt addr is passed virt_addr_valid() but is a vmalloc/module address. Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NYuanzheng Song <songyuanzheng@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Liu Shixin 提交于
mainline inclusion from mainline-v5.16-rc7 commit 2a57d83c category: bugfix bugzilla: 185855 https://gitee.com/openeuler/kernel/issues/I4DDEL Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a57d83c78f889bf3f54eede908d0643c40d5418 -------------------------------- Hulk Robot reported a panic in put_page_testzero() when testing madvise() with MADV_SOFT_OFFLINE. The BUG() is triggered when retrying get_any_page(). This is because we keep MF_COUNT_INCREASED flag in second try but the refcnt is not increased. page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:737! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 5 PID: 2135 Comm: sshd Tainted: G B 5.16.0-rc6-dirty #373 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: release_pages+0x53f/0x840 Call Trace: free_pages_and_swap_cache+0x64/0x80 tlb_flush_mmu+0x6f/0x220 unmap_page_range+0xe6c/0x12c0 unmap_single_vma+0x90/0x170 unmap_vmas+0xc4/0x180 exit_mmap+0xde/0x3a0 mmput+0xa3/0x250 do_exit+0x564/0x1470 do_group_exit+0x3b/0x100 __do_sys_exit_group+0x13/0x20 __x64_sys_exit_group+0x16/0x20 do_syscall_64+0x34/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Modules linked in: ---[ end trace e99579b570fe0649 ]--- RIP: 0010:release_pages+0x53f/0x840 Link: https://lkml.kernel.org/r/20211221074908.3910286-1-liushixin2@huawei.com Fixes: b94e0282 ("mm,hwpoison: try to narrow window race for free pages") Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Reported-by: NHulk Robot <hulkci@huawei.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Acked-by: NNaoya Horiguchi <naoya.horiguchi@nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Conflicts: mm/memory-failure.c Signed-off-by: NLiu Shixin <liushixin2@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Mike Kravetz 提交于
mainline inclusion from mainline-v5.14-rc1 commit 7118fc29 category: bugfix bugzilla:171843 ----------------------------------------------- In [1], Jann Horn points out a possible race between prep_compound_gigantic_page and __page_cache_add_speculative. The root cause of the possible race is prep_compound_gigantic_page uncondittionally setting the ref count of pages to zero. It does this because prep_compound_gigantic_page is handed a 'group' of pages from an allocator and needs to convert that group of pages to a compound page. The ref count of each page in this 'group' is one as set by the allocator. However, the ref count of compound page tail pages must be zero. The potential race comes about when ref counted pages are returned from the allocator. When this happens, other mm code could also take a reference on the page. __page_cache_add_speculative is one such example. Therefore, prep_compound_gigantic_page can not just set the ref count of pages to zero as it does today. Doing so would lose the reference taken by any other code. This would lead to BUGs in code checking ref counts and could possibly even lead to memory corruption. There are two possible ways to address this issue. 1) Make all allocators of gigantic groups of pages be able to return a properly constructed compound page. 2) Make prep_compound_gigantic_page be more careful when constructing a compound page. This patch takes approach 2. In prep_compound_gigantic_page, use cmpxchg to only set ref count to zero if it is one. If the cmpxchg fails, call synchronize_rcu() in the hope that the extra ref count will be driopped during a rcu grace period. This is not a performance critical code path and the wait should be accceptable. If the ref count is still inflated after the grace period, then undo any modifications made and return an error. Currently prep_compound_gigantic_page is type void and does not return errors. Modify the two callers to check for and handle error returns. On error, the caller must free the 'group' of pages as they can not be used to form a gigantic page. After freeing pages, the runtime caller (alloc_fresh_huge_page) will retry the allocation once. Boot time allocations can not be retried. The routine prep_compound_page also unconditionally sets the ref count of compound page tail pages to zero. However, in this case the buddy allocator is constructing a compound page from freshly allocated pages. The ref count on those freshly allocated pages is already zero, so the set_page_count(p, 0) is unnecessary and could lead to confusion. Just remove it. [1] https://lore.kernel.org/linux-mm/CAG48ez23q0Jy9cuVnwAe7t_fdhMk2S7N5Hdi-GLcCeq5bsfLxw@mail.gmail.com/ Link: https://lkml.kernel.org/r/20210622021423.154662-3-mike.kravetz@oracle.com Fixes: 58a84aa9 ("thp: set compound tail page _count to zero") Signed-off-by: NMike Kravetz <mike.kravetz@oracle.com> Reported-by: NJann Horn <jannh@google.com> Cc: Youquan Song <youquan.song@intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NChen Huang <chenhuang5@huawei.com> Reviewed-by: NChen Wandun <chenwandun@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Mike Rapoport 提交于
mainline inclusion from mainline-v5.14-rc1 commit 023accf5 category: bugfix bugzilla: 172285 https://gitee.com/openeuler/kernel/issues/I4DDEL ----------------------------------------------- There maybe an overflow in memblock_overlaps_region() if it is called with base and size such that base + size > PHYS_ADDR_MAX Make sure that memblock_overlaps_region() caps the size to prevent such overflow and remove now duplicated call to memblock_cap_size() from memblock_is_region_reserved(). Signed-off-by: NMike Rapoport <rppt@linux.ibm.com> Tested-by: NTony Lindgren <tony@atomide.com> Signed-off-by: NChen Huang <chenhuang5@huawei.com> Reviewed-by: NChen Wandun <chenwandun@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Rustam Kovhaev 提交于
stable inclusion form stable-v5.10.82 commit b2e2fb64071a00df54d904858b591590de369108 bugzilla: 185877 https://gitee.com/openeuler/kernel/issues/I4QU6V Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b2e2fb64071a00df54d904858b591590de369108 -------------------------------- commit 34dbc3aa upstream. When kmemleak is enabled for SLOB, system does not boot and does not print anything to the console. At the very early stage in the boot process we hit infinite recursion from kmemleak_init() and eventually kernel crashes. kmemleak_init() specifies SLAB_NOLEAKTRACE for KMEM_CACHE(), but kmem_cache_create_usercopy() removes it because CACHE_CREATE_MASK is not valid for SLOB. Let's fix CACHE_CREATE_MASK and make kmemleak work with SLOB Link: https://lkml.kernel.org/r/20211115020850.3154366-1-rkovhaev@gmail.com Fixes: d8843922 ("slab: Ignore internal flags in cache creation") Signed-off-by: NRustam Kovhaev <rkovhaev@gmail.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Glauber Costa <glommer@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Baokun Li 提交于
hulk inclusion category: bugfix bugzilla: 185858 https://gitee.com/openeuler/kernel/issues/I4DDEL ------------------------------------------------- Hulk robot reported a kmemleak problem: ----------------------------------------------------------------------- unreferenced object 0xffff93d1d8cc02e8 (size 248): comm "cat", pid 23327, jiffies 4624670141 (age 495992.217s) hex dump (first 32 bytes): 00 40 85 19 d4 93 ff ff 00 10 00 00 00 00 00 00 .@.............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000db5610b3>] seq_open+0x2a/0x80 [<00000000d66ac99d>] full_proxy_open+0x167/0x1e0 [<00000000d58ef917>] do_dentry_open+0x1e1/0x3a0 [<0000000016c91867>] path_openat+0x961/0xa20 [<00000000909c9564>] do_filp_open+0xae/0x120 [<0000000059c761e6>] do_sys_openat2+0x216/0x2f0 [<00000000b7a7b239>] do_sys_open+0x57/0x80 [<00000000e559d671>] do_syscall_64+0x33/0x40 [<000000000ea1fbfd>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 unreferenced object 0xffff93d419854000 (size 4096): comm "cat", pid 23327, jiffies 4624670141 (age 495992.217s) hex dump (first 32 bytes): 6b 66 65 6e 63 65 2d 23 32 35 30 3a 20 30 78 30 kfence-#250: 0x0 30 30 30 30 30 30 30 37 35 34 62 64 61 31 32 2d 0000000754bda12- backtrace: [<000000008162c6f2>] seq_read_iter+0x313/0x440 [<0000000020b1b3e3>] seq_read+0x14b/0x1a0 [<00000000af248fbc>] full_proxy_read+0x56/0x80 [<00000000f97679d1>] vfs_read+0xa5/0x1b0 [<000000000ed8a36f>] ksys_read+0xa0/0xf0 [<00000000e559d671>] do_syscall_64+0x33/0x40 [<000000000ea1fbfd>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 ----------------------------------------------------------------------- I find that we can easily reproduce this problem with the following commands: `cat /sys/kernel/debug/kfence/objects` `echo scan > /sys/kernel/debug/kmemleak` `cat /sys/kernel/debug/kmemleak` The leaked memory is allocated in the stack below: ---------------------------------- do_syscall_64 do_sys_open do_dentry_open full_proxy_open seq_open ---> alloc seq_file vfs_read full_proxy_read seq_read seq_read_iter traverse ---> alloc seq_buf ---------------------------------- And it should have been released in the following process: ---------------------------------- do_syscall_64 syscall_exit_to_user_mode exit_to_user_mode_prepare task_work_run ____fput __fput full_proxy_release ---> free here ---------------------------------- However, the release function corresponding to file_operations is not implemented in kfence. As a result, a memory leak occurs. Therefore, the solution to this problem is to implement the corresponding release function. Link: https://lkml.kernel.org/r/20211206133628.2822545-1-libaokun1@huawei.com Fixes: 0ce20dd8 ("mm: add Kernel Electric-Fence infrastructure") Signed-off-by: NBaokun Li <libaokun1@huawei.com> Reported-by: NHulk Robot <hulkci@huawei.com> Acked-by: NMarco Elver <elver@google.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Yu Kuai <yukuai3@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NPeng Liu <liupeng256@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NChen Jun <chenjun102@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 08 1月, 2022 2 次提交
-
-
由 Kemeng Shi 提交于
euleros inclusion category: feature feature: etmem bugzilla: https://gitee.com/openeuler/kernel/issues/I4OODH?from=project-issue CVE: NA ------------------------------------------------- Add /proc/sys/kernel/hugepage_pmem_allocall switch. Set 1 to allowed all memory in pmem could alloc for hugepage. Set 0(default) hugepage alloc is limited by zone watermark as usual. Add /proc/sys/kernel/hugepage_mig_noalloc switch. Set 1 to forbid new hugepage alloc in hugepage migration when hugepage in dest node runs out. Set 0(default) to allow hugepage alloc in hugepage migration as usual. Signed-off-by: NKemeng Shi <shikemeng@huawei.com> Reviewed-by: Nlouhongxiang <louhongxiang@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Kemeng Shi 提交于
euleros inclusion category: feature feature: etmem bugzilla: https://gitee.com/openeuler/kernel/issues/I4OODH?from=project-issue CVE: NA ------------------------------------------------- Driver dax_kmem will export pmem as a NUMA node. This patch will record node consists of persistent memory for futher use. Signed-off-by: NKemeng Shi <shikemeng@huawei.com> Reviewed-by: Nlouhongxiang <louhongxiang@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 07 1月, 2022 6 次提交
-
-
由 Lu Jialin 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue CVE: NA -------- This patch adds a default-false static key to disable memcg kswapd feature. User can enable by set memcg_kswapd in cmdline. Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: Nweiyang wang <wangweiyang2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Lu Jialin 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue CVE: NA -------- The memcg kswapd could set dirty state to memcg if current scan find all pages are unqueued dirty in the memcg. Then kswapd would write out dirty pages. Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Nweiyang wang <wangweiyang2@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Lu Jialin 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue CVE: NA -------- Since memory.high reclaim is sync whether is in interrupt, it could do more work than direct reclaim, i.e. write out dirty page, etc. So, add PF_KSWAPD flag, so that current_is_kswapd() would return true for memcg kswapd. Memcg kswapd should stop when usage of memcg fit the memcg kswapd stop flag. When the userland sets the memcg->memory.max, the stop_flag is (memcg->memory.high - memcg->memory.max * 10 / 1000), which is similar with global kswapd. Otherwise, the stop_flag is (memcg->memory.high - memcg->memory.high / 6), which is similar with most difference between watermark_low and watermark_high. And, memcg kswapd should not break memory.low protection for now. Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Nweiyang wang <wangweiyang2@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Lu Jialin 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue CVE: NA -------- Export memory.high from cgroupv2 to cgroupv1. Therefore, when the usage of the memcg is larger than memory.high, some pages will be reclaimed before return to userland, which will throttle the process. Only export memory.high number in mem_cgroup_legacy_files and move related functions in front of mem_cgroup_legacy_files. There is no need to other changes. Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: Nweiyang wang <wangweiyang2@huawei.com> Reviewed-by: Nweiyang wang <wangweiyang2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Lu Jialin 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IMAK?from=project-issue CVE: NA -------- Export memcg.min and memcg.low from cgroupv2 to cgroupv1, in order to reduce the negtive impact between cgroups when the system memory is insufficient. Only export memory.{min/low} numbers in mem_cgroup_legacy_files and move related functions in front of mem_cgroup_legacy_files. There is no need to other changes. Signed-off-by: NLu Jialin <lujialin4@huawei.com> Reviewed-by: Nweiyang wang <wangweiyang2@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Peng Liu 提交于
hulk inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NYPZ CVE: NA ------------------------------------------------- A new flag MEMBLOCK_MEMMAP is added into memblock_flags, which is used to identify reserved memory for memmap. This flag is limited for arm64. When memmap memory is reserved by memblock_reserve, it is subsequently marked with flag MEMBLOCK_MEMMAP. Therefore, for_each_mem_region can find memmap memory and request resources for it. Signed-off-by: NPeng Liu <liupeng256@huawei.com> Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 30 12月, 2021 6 次提交
-
-
由 Wang Wensheng 提交于
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- Sharepool applies for a dedicated interface for large pages, which optimizes the efficiency of memory application Signed-off-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wang Wensheng 提交于
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- In the share pool scenario, when the shared memory is applied for, the do_mm_populate function is performed at the same time, that is, the corresponding pages are allocated. In the current share pool implementation, the memory is charged to the memcg of the first task added to this share pool group. This is unreasonable and may cause memcg of first task oom. So, we should charge the pages to the memcg of current task. Signed-off-by: NZhou Guanghui <zhouguanghui1@huawei.com> Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wang Wensheng 提交于
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- We store the preferred node_id in sp_area in sp_alloc() and use it for memory alloc in shmem_fault. Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Signed-off-by: NPeng Wu <wupeng58@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wang Wensheng 提交于
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- The situation below is not allowed: int *result = mmap(ADDR, sizeof(int), PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0); As share pool uses an independent UVA allocation algorithm, it may produce an address that is conflicted with user-specified address. Signed-off-by: NTang Yizhou <tangyizhou@huawei.com> Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wang Wensheng 提交于
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- The fork() will create the new mm for new process, the mm should not take any information from the parent process, so need to clean it. The exit() will mmput the mm and free the memory, if the mm is alrready be used for sp_group, need to clean the group first. Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Signed-off-by: NTang Yizhou <tangyizhou@huawei.com> Signed-off-by: NPeng Wu <wupeng58@huawei.com> Signed-off-by: NZhou Guanghui <zhouguanghui1@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Wang Wensheng 提交于
ascend inclusion category: Feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW CVE: NA ------------------- This interface is added to support the function of exiting a process from a sharing group. Signed-off-by: NWang Wensheng <wangwensheng4@huawei.com> Signed-off-by: NGuo Mengqi <guomengqi3@huawei.com> Signed-off-by: NZhou Guanghui <zhouguanghui1@huawei.com> Reviewed-by: Kefeng Wang<wangkefeng.wang@huawei.com> Reviewed-by: NWeilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-