- 28 May, 2022 17 commits
-
-
Committed by Miaohe Lin
Patch series "A few fixup patches for mm", v4.

This series contains a few patches to avoid mapping random data if swap read fails and fix lost swap bits in unuse_pte. Also we free hwpoison and swapin error entry in madvise_free_pte_range and so on. More details can be found in the respective changelogs.

This patch (of 5):

There is a bug in unuse_pte(): when the swap page happens to be unreadable, a page filled with random data is mapped into user address space. In case of error, a special swap entry indicating that the swap read failed is set in the page table. So the swapcache page can be freed and the user won't end up with a permanently mounted swap because a sector is bad. And if the page is accessed later, the user process will be killed so that corrupted data is never consumed. On the other hand, if the page is never accessed, the user won't even notice it.

Link: https://lkml.kernel.org/r/20220519125030.21486-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20220519125030.21486-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Howells <dhowells@redhat.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Michal Koutný
The memory protection test setup and runtime is almost equal for the memory.low and memory.min cases. This makes modification of the common parts prone to mistakes. Since the protections are similar not only in setup but also in principle, factor the common part out.

Past exceptions between the tests:
- missing memory.min is fine (kept),
- test_memcg_low protected orphaned pagecache (adapted like test_memcg_min and we keep the processes of protected memory running).

The evaluation in the two tests is different (OOM of allocator vs low events of protégés); this is kept different.

Link: https://lkml.kernel.org/r/20220518161859.21565-6-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: David Vernet <void@manifault.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Michal Koutný
The reclaim is triggered by a memory limit in a subtree, therefore the testcase does not need configured protection against external reclaim. Also, correct the respective comments.

Link: https://lkml.kernel.org/r/20220518161859.21565-5-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: David Vernet <void@manifault.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Michal Koutný
The numbers are not easy to derive in a closed form (certainly mere protection ratios do not apply), therefore use a simulation to obtain the expected numbers.

Link: https://lkml.kernel.org/r/20220518161859.21565-4-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: David Vernet <void@manifault.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Michal Koutný
This is effectively a revert of commit cdc69458 ("cgroup: account for memory_recursiveprot in test_memcg_low()"). The test_memcg_low case will fail with memory_recursiveprot until this is resolved in the reclaim code. However, this patch preserves the existing helpers and variables for later use.

Link: https://lkml.kernel.org/r/20220518161859.21565-3-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: David Vernet <void@manifault.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Michal Koutný
Patch series "memcontrol selftests fixups", v2.

Flushing the patches to make memcontrol selftests check the events behavior we had consensus about (test_memcg_low fails).

(test_memcg_reclaim, test_memcg_swap_max fail for me now but it's present even before the refactoring.)

The two bigger changes are:
- adjustment of the protected values to make tests succeed with the given tolerance,
- both test_memcg_low and test_memcg_min check protection of memory in populated cgroups (actually as per Documentation/admin-guide/cgroup-v2.rst memory.min should not apply to empty cgroups, which is not the case currently. Therefore I unified tests with the populated case in order to bring more broken tests).

This patch (of 5):

This fixes mis-applied changes from commit 72b1e03a ("cgroup: account for memory_localevents in test_memcg_oom_group_leaf_events()").

Link: https://lkml.kernel.org/r/20220518161859.21565-1-mkoutny@suse.com
Link: https://lkml.kernel.org/r/20220518161859.21565-2-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Richard Palethorpe <rpalethorpe@suse.de>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Miaohe Lin
Think about the below scenario:

    CPU1                                    CPU2
     z3fold_page_migrate                     z3fold_map
      z3fold_page_trylock
      ...
      z3fold_page_unlock
      /* slots still points to old zhdr*/
                                             get_z3fold_header
                                              get slots from handle
                                              get old zhdr from slots
                                              z3fold_page_trylock
                                              return *old* zhdr
      encode_handle(new_zhdr, FIRST|LAST|MIDDLE)
      put_page(page) /* zhdr is freed! */
                                             but zhdr is still used by caller!

z3fold_map can map a freed z3fold page and lead to a use-after-free bug. To fix it, we add PAGE_MIGRATED to indicate that the z3fold page is migrated and soon to be released. So get_z3fold_header won't return such a page.

Link: https://lkml.kernel.org/r/20220429064051.61552-10-linmiaohe@huawei.com
Fixes: 1f862989 ("mm/z3fold.c: support page migration")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Miaohe Lin
Think about the below scenario:

    CPU1                                    CPU2
    z3fold_reclaim_page                     z3fold_free
     spin_lock(&pool->lock)                  get_z3fold_header -- hold page_lock
     kref_get_unless_zero
                                             kref_put--zhdr->refcount can be 1 now
     !z3fold_page_trylock
      kref_put -- zhdr->refcount is 0 now
       release_z3fold_page
        WARN_ON(!list_empty(&zhdr->buddy)); -- we're on buddy now!
        spin_lock(&pool->lock); -- deadlock here!

z3fold_reclaim_page might race with z3fold_free and will lead to a pool lock deadlock and a zhdr buddy non-empty warning. To fix this, defer getting the refcount until page_lock is held, just like what __z3fold_alloc does. Note this has the side effect that we won't break the reclaim if we meet a soon to be released z3fold page now.

Link: https://lkml.kernel.org/r/20220429064051.61552-9-linmiaohe@huawei.com
Fixes: dcf5aedb ("z3fold: stricter locking and more careful reclaim")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Miaohe Lin
Think about the below race window:

    CPU1                                    CPU2
    z3fold_reclaim_page                     z3fold_free
     test_and_set_bit PAGE_CLAIMED
     failed to reclaim page
     z3fold_page_lock(zhdr);
     add back to the lru list;
     z3fold_page_unlock(zhdr);
                                             get_z3fold_header
                                             page_claimed=test_and_set_bit PAGE_CLAIMED
     clear_bit(PAGE_CLAIMED, &page->private);
                                             if (!page_claimed) /* it's false true */
                                              free_handle is not called

free_handle won't be called in this case. So z3fold_buddy_slots will leak. Fix it by always clearing PAGE_CLAIMED under the z3fold page lock.

Link: https://lkml.kernel.org/r/20220429064051.61552-8-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Miaohe Lin
When doing z3fold page reclaim or migration, the page is removed from the unbuddied list. If reclaim or migration succeeds, it's fine as the page is released. But in case it fails, the page is not put back into the unbuddied list now. The page will be leaked until the next compaction work, reclaim or migration is done.

Link: https://lkml.kernel.org/r/20220429064051.61552-7-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Miaohe Lin
Revert commit f1549cb5 ("mm/z3fold.c: allow __GFP_HIGHMEM in z3fold_alloc").

z3fold can't support GFP_HIGHMEM pages now. page_address is used directly at all places. Moreover, z3fold_header is on the per cpu unbuddied list which could be accessed anytime. So we should remove the support of GFP_HIGHMEM allocation for z3fold.

Link: https://lkml.kernel.org/r/20220429064051.61552-6-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Miaohe Lin
If trylock_page fails, the page won't be a non-lru movable page. When this page is freed via free_z3fold_page, it will trigger a bug on the PageMovable check in __ClearPageMovable. Throw a warning on failure of trylock_page to guard against such a rare case, just as zsmalloc does.

Link: https://lkml.kernel.org/r/20220429064051.61552-5-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Miaohe Lin
Currently if z3fold couldn't find an unbuddied page it would first try to pull a page off the stale list. But this approach is problematic. If init z3fold page fails later, the page should be freed via free_z3fold_page to clean up the relevant resource instead of using __free_page directly. And if the page is successfully reused, it will BUG_ON later in __SetPageMovable because it's already a non-lru movable page, i.e. PAGE_MAPPING_MOVABLE is already set in page->mapping. In order to fix all of these issues, we can simply remove the buggy use of the stale list for allocation because can_sleep should always be false and we never really hit the reusing code path now.

Link: https://lkml.kernel.org/r/20220429064051.61552-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Miaohe Lin
alloc_slots could fail to allocate memory under heavy memory pressure. So we should check zhdr->slots against NULL to avoid a future null pointer dereference.

Link: https://lkml.kernel.org/r/20220429064051.61552-3-linmiaohe@huawei.com
Fixes: fc548865 ("z3fold: simplify freeing slots")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Miaohe Lin
Patch series "A few fixup patches for z3fold".

This series contains a few fixup patches to fix scheduling while atomic, fix possible null pointer dereferencing, fix various race conditions and so on. More details can be found in the respective changelogs.

This patch (of 9):

z3fold's page_lock is always held when calling alloc_slots. So gfp should be GFP_ATOMIC to avoid a "scheduling while atomic" bug.

Link: https://lkml.kernel.org/r/20220429064051.61552-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20220429064051.61552-2-linmiaohe@huawei.com
Fixes: fc548865 ("z3fold: simplify freeing slots")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Zi Yan
In isolate_single_pageblock(), free pages are checked without holding the zone lock, but they can go away in split_free_page() when the zone lock is held. Check the free page and its order again in split_free_page() when the zone lock is held. Recheck the page if the free page is gone under the zone lock.

In addition, in split_free_page(), the free page was deleted from the page list without changing free page accounting. Add the missing free page accounting code.

Fix the type of the order parameter in split_free_page().

Link: https://lore.kernel.org/lkml/20220525103621.987185e2ca0079f7b97b856d@linux-foundation.org/
Link: https://lkml.kernel.org/r/20220526231531.2404977-2-zi.yan@sent.com
Fixes: b2c9e2fb ("mm: make alloc_contig_range work at pageblock granularity")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reported-by: Doug Berger <opendmb@gmail.com>
Link: https://lore.kernel.org/linux-mm/c3932a6f-77fe-29f7-0c29-fe6b1c67ab7b@gmail.com/
Cc: David Hildenbrand <david@redhat.com>
Cc: Qian Cai <quic_qiancai@quicinc.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Eric Ren <renzhengeek@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Michael Walle <michael@walle.cc>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Zi Yan
start_isolate_page_range() first isolates the first and the last pageblocks in the range and ensures pages across range boundaries are split during isolation. But it missed the case when the range is <= a pageblock and the first and the last pageblocks are the same one, so the second isolate_single_pageblock() will always fail. To fix it, skip the pageblock isolation in the second isolate_single_pageblock().

Link: https://lkml.kernel.org/r/20220526231531.2404977-1-zi.yan@sent.com
Fixes: 88ee1343 ("mm: fix a potential infinite loop in start_isolate_page_range()")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/linux-mm/ac65adc0-a7e4-cdfe-a0d8-757195b86293@samsung.com/
Reported-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc>
Link: https://lore.kernel.org/linux-mm/8ca048ca8b547e0dd1c95387ee05c23d@walle.cc/
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: Doug Berger <opendmb@gmail.com>
Cc: Eric Ren <renzhengeek@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Qian Cai <quic_qiancai@quicinc.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
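
The single-pageblock case described in this changelog can be illustrated with a small model. This is a minimal Python sketch, not the kernel implementation: `PAGEBLOCK_NR_PAGES = 512` is an assumed pageblock size, and `boundary_pageblocks` is a hypothetical helper that only computes which pageblocks sit at the two ends of a range and whether they coincide.

```python
PAGEBLOCK_NR_PAGES = 512  # assumption: 512 pages per pageblock


def pageblock_start_pfn(pfn):
    """Round a page frame number down to the start of its pageblock."""
    return pfn & ~(PAGEBLOCK_NR_PAGES - 1)


def pageblock_align(pfn):
    """Round a page frame number up to the next pageblock boundary."""
    return (pfn + PAGEBLOCK_NR_PAGES - 1) & ~(PAGEBLOCK_NR_PAGES - 1)


def boundary_pageblocks(start_pfn, end_pfn):
    """Return (first_block, last_block, same_block) for [start_pfn, end_pfn).

    same_block is True when the range fits in one pageblock, which is the
    case where isolating the "last" pageblock a second time cannot succeed
    because it was already isolated as the "first" one.
    """
    first_block = pageblock_start_pfn(start_pfn)
    last_block = pageblock_align(end_pfn) - PAGEBLOCK_NR_PAGES
    return first_block, last_block, first_block == last_block
```

For a range such as `[1040, 1088)` both ends fall into the pageblock starting at pfn 1024, so a caller following the fix would skip the second pageblock isolation; for `[1024, 2048)` the two boundary pageblocks differ and both are isolated.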
-
- 26 May, 2022 17 commits
-
-
Committed by Kefeng Wang
Use the PAGE_ALIGNED macro instead of IS_ALIGNED and passing PAGE_SIZE.

Link: https://lkml.kernel.org/r/20220520021833.121405-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
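
The equivalence behind this cleanup can be shown with a small model. This is a Python sketch of the alignment arithmetic only, assuming a 4096-byte page; `is_aligned` and `page_aligned` are illustrative stand-ins for the kernel's IS_ALIGNED and PAGE_ALIGNED macros, not their actual definitions.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def is_aligned(value, alignment):
    """True if value is a multiple of alignment (a power of two)."""
    return (value & (alignment - 1)) == 0


def page_aligned(addr):
    """Shorthand for is_aligned(addr, PAGE_SIZE)."""
    return is_aligned(addr, PAGE_SIZE)
```

Both spellings give the same answer for every address; the shorter form simply removes the repeated PAGE_SIZE argument at each call site.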
-
Committed by Patrick Wang
The default "timeout" for one kselftest is 45 seconds, while some cases in run_vmtests.sh require more time. This will cause a testing timeout like:

    not ok 4 selftests: vm: run_vmtests.sh # TIMEOUT 45 seconds

Therefore, add the "settings" file with a timeout variable so users can set the "timeout" value.

Link: https://lkml.kernel.org/r/20220521083825.319654-4-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Patrick Wang
The "test_hmm.sh" file used by run_vmtests.sh is not installed into INSTALL_PATH. Thus run_vmtests.sh cannot call it in INSTALL_PATH:

    ---------------------------
    running ./test_hmm.sh smoke
    ---------------------------
    ./run_vmtests.sh: line 74: ./test_hmm.sh: No such file or directory
    [FAIL]
    -----------------------

Add "test_hmm.sh" to TEST_FILES so that it will be installed.

Link: https://lkml.kernel.org/r/20220521083825.319654-3-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Patrick Wang
Patch series "selftests: vm: a few fixup patches".

This series contains three fixup patches for vm selftests. They are independent. Please see the patches.

This patch (of 3):

Currently, ksm_tests operates on "merge_across_nodes" with NUMA either enabled or disabled. In a system with NUMA disabled, these operations will fail and output a misleading report given that "merge_across_nodes" does not exist in sysfs:

    ----------------------------
    running ./ksm_tests -M -p 10
    ----------------------------
    f /sys/kernel/mm/ksm/merge_across_nodes
    fopen: No such file or directory
    Cannot save default tunables
    [FAIL]
    ----------------------

So check numa_available() before those operations to skip them if NUMA is disabled.

Link: https://lkml.kernel.org/r/20220521083825.319654-1-patrick.wang.shcn@gmail.com
Link: https://lkml.kernel.org/r/20220521083825.319654-2-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Muhammad Usama Anjum
Add the newly added migration test object to the .gitignore file.

Link: https://lkml.kernel.org/r/20220521094313.166505-1-usama.anjum@collabora.com
Fixes: 0c2d0872 ("mm: add selftests for migration entries")
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Julia Lawall
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle.

Link: https://lkml.kernel.org/r/20220521111145.81697-80-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Julia Lawall
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle.

Link: https://lkml.kernel.org/r/20220521111145.81697-94-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Suren Baghdasaryan
Introduce process_mrelease syscall sanity tests which include tests which expect to fail:

- process_mrelease with invalid pidfd and flags inputs
- process_mrelease on a live process with no pending signals

and valid process_mrelease usage which is expected to succeed.

Because process_mrelease has to be used against a process with a pending SIGKILL, it's possible that the process exits before process_mrelease gets called. In such cases we retry the test with a victim that allocates twice as much memory, up to 1GB. This would require the victim process to spend more time during exit, so process_mrelease has a better chance of catching the process before it exits and succeeding.

On success the test reports the amount of memory the child had to allocate for reaping to succeed. Sample output:

    $ mrelease_test
    Success reaping a child with 1MB of memory allocations

On failure the test reports the failure. Sample outputs:

    $ mrelease_test
    All process_mrelease attempts failed!

    $ mrelease_test
    process_mrelease: Invalid argument

Link: https://lkml.kernel.org/r/20220518204316.13131-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
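
The retry strategy described in this changelog (double the victim's allocation until reaping succeeds or 1GB is reached) can be sketched as follows. This is a Python model under stated assumptions, not the selftest itself: the 1MB starting size is inferred from the sample output, and `try_reap` is a hypothetical callback standing in for "spawn a victim of this size, kill it, and call process_mrelease".

```python
MB = 1 << 20
MAX_SIZE = 1024 * MB  # the 1GB ceiling mentioned in the changelog


def victim_sizes(start=MB, limit=MAX_SIZE):
    """Yield allocation sizes, doubling each time, up to and including limit."""
    size = start
    while size <= limit:
        yield size
        size *= 2


def reap_with_retry(try_reap, start=MB, limit=MAX_SIZE):
    """Return the first size for which try_reap(size) succeeds, else None.

    try_reap is a hypothetical callable returning True when the victim was
    still alive at reap time and the reap succeeded.
    """
    for size in victim_sizes(start, limit):
        if try_reap(size):
            return size
    return None
```

With these defaults the schedule has eleven steps (1MB, 2MB, ... 1024MB), so a run makes at most eleven attempts before reporting that all of them failed.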
-
Committed by Johannes Weiner
This reverts commit 3a235693.

Its premise was that cgroup reclaim cares about freeing memory inside the cgroup, and demotion just moves them around within the cgroup limit. Hence, pages from toptier nodes should be reclaimed directly.

However, with NUMA balancing now doing tier promotions, demotion is part of the page aging process. Global reclaim demotes the coldest toptier pages to secondary memory, where their life continues and from which they have a chance to get promoted back. Essentially, tiered memory systems have an LRU order that spans multiple nodes.

When cgroup reclaims pages coming off the toptier directly, there can be colder pages on lower tier nodes that were demoted by global reclaim. This is an aging inversion, not unlike if cgroups were to reclaim directly from the active lists while there are inactive pages.

Proactive reclaim is another factor. The goal of that is to offload colder pages from expensive RAM to cheaper storage. When lower tier memory is available as an intermediate layer, we want offloading to take advantage of it instead of bypassing to storage.

Revert the patch so that cgroups respect the LRU order spanning the memory hierarchy.

Of note is a specific undercommit scenario, where all cgroup limits in the system add up to <= available toptier memory. In that case, shuffling pages out to lower tiers first to reclaim them from there is inefficient. This is something that could be optimized/short-circuited later on (although care must be taken not to accidentally recreate the aging inversion). Let's ensure correctness first.

Link: https://lkml.kernel.org/r/20220518190911.82400-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Jackie Liu
Print a message when the kfence status changes, so that the change is visible in dmesg and recorded by syslog. Also, set kfence_enabled to false only when needed.

Link: https://lkml.kernel.org/r/20220518073105.3160335-1-liu.yun@linux.dev
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Co-developed-by: Marco Elver <elver@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Vasily Averin
Fix sparse warning about incorrect gfp_t cast.

Link: https://lkml.kernel.org/r/001979f3-e978-0998-cbed-61a4a2ac87b8@openvz.org
Fixes: f67bed13 ("percpu: improve percpu_alloc_percpu event trace")
Signed-off-by: Vasily Averin <vvs@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Vasily Averin
Redefine the __def_gfpflag_names array according to the recommendations of akpm@, willy@ and Joe Perches.

Link: https://lkml.kernel.org/r/6f811e19-41c6-f3e8-fca6-23a19a62e313@openvz.org
Fixes: fe573327 ("tracing: incorrect gfp_t conversion")
Signed-off-by: Vasily Averin <vvs@openvz.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Joe Perches <joe@perches.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Zi Yan
In isolate_single_pageblock() called by start_isolate_page_range(), there are some pageblock isolation issues causing a potential infinite loop when isolating a page range. This was reported by Qian Cai.

1. The pageblock was isolated by just changing the pageblock migratetype without checking for unmovable pages. Call set_migratetype_isolate() to isolate the pageblock properly.
2. An off-by-one error caused pages to be migrated unnecessarily, since the page is not crossing the pageblock boundary.
3. Migrating a compound page across a pageblock boundary and then splitting the free page later has a small race window in which the free page might be allocated again, so that the code will try again, causing a potential infinite loop. Temporarily set the to-be-migrated page's pageblock to MIGRATE_ISOLATE to prevent that and bail out early if no free page is found after page migration.

An additional fix to split_free_page() aims to avoid crashing in __free_one_page(). When the free page is split at the specified split_pfn_offset, free_page_order should check both the first bit of free_page_pfn and the last bit of split_pfn_offset and use the smaller one. For example, if free_page_pfn=0x10000 and split_pfn_offset=0xc000, free_page_order should first be 0x8000 then 0x4000, instead of 0x4000 then 0x8000, which the original algorithm did.

[akpm@linux-foundation.org: suppress min() warning]
Link: https://lkml.kernel.org/r/20220524194756.1698351-1-zi.yan@sent.com
Fixes: b2c9e2fb ("mm: make alloc_contig_range work at pageblock granularity")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric Ren <renzhengeek@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
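
The split-order rule from this changelog can be checked with a small model. This is a minimal Python sketch of the arithmetic only, not the kernel code: `ffs`, `fls` and `split_chunks` are hypothetical helpers (lowest set bit, highest set bit, and the split loop), and the free page is assumed to be order 16 (0x10000 pages) so that the example values from the changelog fit inside it.

```python
def ffs(x):
    """Index of the lowest set bit of x (x must be non-zero)."""
    return (x & -x).bit_length() - 1


def fls(x):
    """Index of the highest set bit of x (x must be non-zero)."""
    return x.bit_length() - 1


def split_chunks(free_page_pfn, order, split_pfn_offset):
    """Return the chunk sizes (in pages) produced by splitting a free page.

    Requires 0 < split_pfn_offset < (1 << order). Each chunk order is the
    smaller of the alignment of the current pfn and the highest bit of the
    remaining offset, which is the rule stated in the changelog.
    """
    chunks = []
    pfn = free_page_pfn
    end = free_page_pfn + (1 << order)
    while pfn < end:
        align_order = ffs(pfn) if pfn else order
        chunk_order = min(align_order, fls(split_pfn_offset))
        chunks.append(1 << chunk_order)
        pfn += 1 << chunk_order
        split_pfn_offset -= 1 << chunk_order
        # First part done: the rest of the page becomes the second part.
        if split_pfn_offset == 0:
            split_pfn_offset = end - pfn
    return chunks
```

Calling `split_chunks(0x10000, 16, 0xc000)` yields chunks of 0x8000, 0x4000 and 0x4000 pages: the part before the split point is freed as 0x8000 then 0x4000, matching the order given in the changelog, and every chunk stays naturally aligned to its own size.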
-
Committed by Muchun Song
I have been focusing on mm for the past two years, e.g. developing, fixing bugs and reviewing work related to the HugeTLB system. I would like to help Mike and other people working on HugeTLB by reviewing their work.

When I first introduced the vmemmap reduction, I forgot to update the MAINTAINERS file. Let's update it as well. And rename "HUGETLB FILESYSTEM" to "HUGETLB SUBSYSTEM" since some files are not only related to the filesystem but also to memory management (the name FILESYSTEM cannot cover this area).

Link: https://lkml.kernel.org/r/20220521074103.79468-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Randy Dunlap
ZSMALLOC depends on MMU so ZRAM should also depend on MMU since 'select' does not follow any dependency chains.

Fixes this Kconfig warning:

    WARNING: unmet direct dependencies detected for ZSMALLOC
      Depends on [n]: MMU [=n]
      Selected by [y]:
      - ZRAM [=y] && BLK_DEV [=y] && BLOCK [=y] && SYSFS [=y] && (CRYPTO_LZO [=y] || CRYPTO_ZSTD [=m] || CRYPTO_LZ4 [=m] || CRYPTO_LZ4HC [=n] || CRYPTO_842 [=n])

Link: https://lkml.kernel.org/r/20220522204027.22964-1-rdunlap@infradead.org
Fixes: b3fbd58f ("mm: Kconfig: simplify zswap configuration")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Hugh Dickins
Shmem swapoff makes no progress: the index to indices is not incremented. But "ret" is no longer a return value, so use folio_batch_count() instead.

Link: https://lkml.kernel.org/r/c32bee8a-f0aa-245-f94e-24dd271924fa@google.com
Fixes: da08e9b7 ("mm/shmem: convert shmem_swapin_page() to shmem_swapin_folio()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Tested-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Christophe JAILLET
If the first goto is taken, 'fd' is not opened yet (and is uninitialized). So a direct return is safer.

Link: https://lkml.kernel.org/r/628312312eb40e0e39463a2c06415fde5295c716.1653229120.git.christophe.jaillet@wanadoo.fr
Fixes: c1a31a2f ("cgroup: fix racy check in alloc_pagecache_max_30M() helper function")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: David Vernet <void@manifault.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
- 20 May, 2022 6 commits
-
-
Committed by Kefeng Wang
Use HPAGE_PMD_SIZE instead of open coding.

Link: https://lkml.kernel.org/r/20220517145120.118523-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Vasily Averin
Fixes the following sparse warnings:

    CHECK mm/vmscan.c
    mm/vmscan.c: note: in included file (through include/trace/trace_events.h, include/trace/define_trace.h, include/trace/events/vmscan.h):
    ./include/trace/events/vmscan.h:281:1: sparse: warning: cast to restricted isolate_mode_t
    ./include/trace/events/vmscan.h:281:1: sparse: warning: restricted isolate_mode_t degrades to integer

Link: https://lkml.kernel.org/r/e85d7ff2-fd10-53f8-c24e-ba0458439c1b@openvz.org
Signed-off-by: Vasily Averin <vvs@openvz.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
由 Christophe de Dinechin 提交于
With gcc version 12.0.1 20220401 (Red Hat 12.0.1-0), building with defconfig results in the following compilation error:

| CC      mm/swapfile.o
| mm/swapfile.c: In function `setup_swap_info':
| mm/swapfile.c:2291:47: error: array subscript -1 is below array bounds
|  of `struct plist_node[]' [-Werror=array-bounds]
|  2291 |                                 p->avail_lists[i].prio = 1;
|       |                                 ~~~~~~~~~~~~~~^~~
| In file included from mm/swapfile.c:16:
| ./include/linux/swap.h:292:27: note: while referencing `avail_lists'
|   292 |         struct plist_node avail_lists[]; /*
|       |                           ^~~~~~~~~~~

This is due to the compiler detecting that the mask in node_states[__state] could theoretically be zero, which would lead to first_node() returning -1 through find_first_bit.

I believe that the warning/error is legitimate. I first tried adding a test to check that the node mask is not empty, since a similar test exists in the case where MAX_NUMNODES == 1.

However, adding the if statement causes other warnings to appear in for_each_cpu_node_but, because it introduces a dangling else ambiguity. And unfortunately, GCC is not smart enough to detect that the added test makes the case where (node) == -1 impossible, so it still complains with the same message.

This is why I settled on replacing that with a harmless, but relatively useless (node) >= 0 test. Based on the warning for the dangling else, I also decided to fix the case where MAX_NUMNODES == 1 by moving the condition inside the for loop. It will still only be tested once. This ensures that an else following for_each_node_mask or its derivatives does not silently take on a different meaning depending on the configuration.

Link: https://lkml.kernel.org/r/20220414150855.2407137-3-dinechin@redhat.com
Signed-off-by: Christophe de Dinechin <christophe@dinechin.org>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ben Segall <bsegall@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
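The two ideas in this message, an explicit `(node) >= 0` bound in the loop condition and keeping the emptiness test inside the `for` rather than in a wrapping `if`, can be illustrated with a small userspace sketch. It is not the kernel's nodemask.h: every `demo_` name is hypothetical, and the mask is a plain `unsigned int` instead of a `nodemask_t`.

```c
#define DEMO_MAX_NODES 8

/* Return the next set bit after 'node', or DEMO_MAX_NODES if none. */
static int demo_next_node(int node, unsigned int mask)
{
    for (int n = node + 1; n < DEMO_MAX_NODES; n++)
        if (mask & (1u << n))
            return n;
    return DEMO_MAX_NODES;
}

static int demo_first_node(unsigned int mask)
{
    return demo_next_node(-1, mask);
}

/* Multi-node form: the explicit lower bound tells the compiler that the
 * loop body never sees a negative index. */
#define demo_for_each_node_mask(node, mask)              \
    for ((node) = demo_first_node(mask);                 \
         (node) >= 0 && (node) < DEMO_MAX_NODES;         \
         (node) = demo_next_node((node), (mask)))

/* Single-node form: the emptiness test is part of the loop condition, so
 * the macro expands to a bare 'for' and cannot capture a caller's 'else'. */
#define demo_for_each_single_node(node, mask)            \
    for ((node) = 0; (node) < 1 && (mask) != 0; (node)++)

static int demo_count_nodes(unsigned int mask)
{
    int node, count = 0;

    demo_for_each_node_mask(node, mask)
        count++;
    return count;
}

/* The 'else' here pairs with 'if (enabled)'. If the macro expanded to
 * 'if (mask) for (...)', the 'else' would pair with that hidden 'if'. */
static int demo_else_binds_to_caller(int enabled, unsigned int mask)
{
    int node, visited = 0;

    if (enabled)
        demo_for_each_single_node(node, mask)
            visited++;
    else
        visited = -1;
    return visited;
}
```

With an `if`-wrapped macro, `demo_else_binds_to_caller(1, 0)` would take the `else` branch even though `enabled` is true; with the condition inside the `for`, the `else` branch runs only when `enabled` is false.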
-
Committed by Qi Zheng
We expect no warnings to be issued when we specify __GFP_NOWARN, but currently, in paths like alloc_pages() and kmalloc(), some warnings are still printed. Fix it.

Warnings that report usage problems are deliberately left alone. If such warnings are printed, the usage problem itself should be fixed, as in the following case:

WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));

[zhengqi.arch@bytedance.com: v2]
Link: https://lkml.kernel.org/r/20220511061951.1114-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20220510113809.80626-1-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
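The distinction drawn above, between failure warnings that a no-warn flag should silence and usage warnings that should fire regardless, can be sketched in userspace as follows. This is an illustration only, not the kernel code touched by the patch: the `demo_` types, flag values, counters, and functions are all hypothetical.

```c
#include <stdio.h>

typedef unsigned int demo_gfp_t;

/* Hypothetical flag bits standing in for __GFP_NOWARN and __GFP_NOFAIL. */
#define DEMO_GFP_NOWARN 0x1u
#define DEMO_GFP_NOFAIL 0x2u

/* Counters so callers can observe which warnings were emitted. */
static int demo_failure_warnings;
static int demo_usage_warnings;

/* Failure report: suppressed entirely when the caller asked for no warnings. */
static void demo_report_alloc_failure(demo_gfp_t gfp, unsigned int order)
{
    if (gfp & DEMO_GFP_NOWARN)
        return;
    demo_failure_warnings++;
    fprintf(stderr, "demo: allocation failure, order %u\n", order);
}

/* Usage check: reported even with the no-warn flag, because it points at
 * a caller bug that should be fixed rather than hidden. */
static void demo_check_usage(demo_gfp_t gfp, unsigned int order)
{
    if ((gfp & DEMO_GFP_NOFAIL) && order > 1) {
        demo_usage_warnings++;
        fprintf(stderr, "demo: NOFAIL requested with order %u > 1\n", order);
    }
}
```

A caller passing `DEMO_GFP_NOWARN` gets no failure report, while combining it with `DEMO_GFP_NOFAIL` at order 2 still triggers the usage warning.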
-
Committed by Wonhyuk Yang
Currently, the tracepoint mm_page_alloc_zone_locked() doesn't show correct information.

First, when alloc_flag has ALLOC_HARDER/ALLOC_CMA, a page can be allocated from MIGRATE_HIGHATOMIC/MIGRATE_CMA. Nevertheless, the tracepoint reports the requested migration type rather than MIGRATE_HIGHATOMIC or MIGRATE_CMA.

Second, after commit 44042b44 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists"), the percpu-list can store high-order pages. But the tracepoint determines whether it is a refill of the percpu-list by comparing the requested order with 0.

To handle these problems, make mm_page_alloc_zone_locked() only be called by __rmqueue_smallest with the correct migration type. With a new argument called percpu_refill, it can show roughly whether it is a refill of the percpu-list.

Link: https://lkml.kernel.org/r/20220512025307.57924-1-vvghjk1234@gmail.com
Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Baik Song An <bsahn@etri.re.kr>
Cc: Hong Yeon Kim <kimhy@etri.re.kr>
Cc: Taeung Song <taeung@reallinux.co.kr>
Cc: <linuxgeek@linuxgeek.io>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Committed by Fanjun Kong
This patch fixes two issues:

1. Add the __initdata attribute, following include/linux/init.h:

   For initialized data:
   You should insert __initdata between the variable name and equal sign followed by value

2. Fix the error below reported by checkpatch.pl:

   ERROR: do not initialise statics to false

Special thanks to Muchun Song :)

Link: https://lkml.kernel.org/r/20220516030039.1487005-1-bh1scw@gmail.com
Signed-off-by: Fanjun Kong <bh1scw@gmail.com>
Suggested-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
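Both conventions mentioned above can be shown in a short userspace sketch. It is not the patched kernel source: `__initdata` is stubbed to nothing because a normal program has no init-data section, and `demo_feature_enabled`, `demo_retries`, and `demo_init_summary()` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/* Outside the kernel the annotation has no meaning, so stub it out. */
#ifndef __initdata
#define __initdata
#endif

/* Statics are zero-initialized by the language, so "= false" is redundant;
 * this is the pattern checkpatch.pl asks for. */
static bool demo_feature_enabled __initdata;

/* For initialized data, the annotation sits between the variable name and
 * the equal sign. */
static int demo_retries __initdata = 3;

/* Small helper so the values can be observed. */
static int demo_init_summary(void)
{
    return demo_feature_enabled ? -1 : demo_retries;
}
```

Because `demo_feature_enabled` starts out false without an explicit initializer, `demo_init_summary()` returns the value of `demo_retries`.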
-