- 29 10月, 2022 1 次提交
-
-
由 Maria Yu 提交于
When !CONFIG_VM_BUG_ON, there is warning of clang-analyzer-deadcode.DeadStores: Value stored to 'mt' during its initialization is never read. Link: https://lkml.kernel.org/r/20221021101555.7992-2-quic_aiquny@quicinc.comSigned-off-by: NMaria Yu <quic_aiquny@quicinc.com> Cc: David Hildenbrand <david@redhat.com> Cc: Doug Berger <opendmb@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 04 10月, 2022 3 次提交
-
-
由 Kefeng Wang 提交于
Add pageblock_aligned() and use it to simplify code. Link: https://lkml.kernel.org/r/20220907060844.126891-3-wangkefeng.wang@huawei.comSigned-off-by: NKefeng Wang <wangkefeng.wang@huawei.com> Acked-by: NMike Rapoport <rppt@linux.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Kefeng Wang 提交于
Add pageblock_align() macro and use it to simplify code. Link: https://lkml.kernel.org/r/20220907060844.126891-2-wangkefeng.wang@huawei.comSigned-off-by: NKefeng Wang <wangkefeng.wang@huawei.com> Acked-by: NMike Rapoport <rppt@linux.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Kefeng Wang 提交于
Move pageblock_start_pfn/pageblock_end_pfn() into pageblock-flags.h, then they could be used somewhere else, not only in compaction, also use ALIGN_DOWN() instead of round_down() to be pair with ALIGN(), which should be same for pageblock usage. Link: https://lkml.kernel.org/r/20220907060844.126891-1-wangkefeng.wang@huawei.comSigned-off-by: NKefeng Wang <wangkefeng.wang@huawei.com> Acked-by: NMike Rapoport <rppt@linux.ibm.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 27 9月, 2022 1 次提交
-
-
由 Zi Yan 提交于
set_migratetype_isolate() does not allow isolating MIGRATE_CMA pageblocks unless it is used for CMA allocation. isolate_single_pageblock() did not have the same behavior when it is used together with set_migratetype_isolate() in start_isolate_page_range(). This allows alloc_contig_range() with migratetype other than MIGRATE_CMA, like MIGRATE_MOVABLE (used by alloc_contig_pages()), to isolate first and last pageblock but fail the rest. The failure leads to changing migratetype of the first and last pageblock to MIGRATE_MOVABLE from MIGRATE_CMA, corrupting the CMA region. This can happen during gigantic page allocations. Like Doug said here: https://lore.kernel.org/linux-mm/a3363a52-883b-dcd1-b77f-f2bb378d6f2d@gmail.com/T/#u, for gigantic page allocations, the user would notice no difference, since the allocation on CMA region will fail as well as it did before. But it might hurt the performance of device drivers that use CMA, since CMA region size decreases. Fix it by passing migratetype into isolate_single_pageblock(), so that set_migratetype_isolate() used by isolate_single_pageblock() will prevent the isolation happening. Link: https://lkml.kernel.org/r/20220914023913.1855924-1-zi.yan@sent.com Fixes: b2c9e2fb ("mm: make alloc_contig_range work at pageblock granularity") Signed-off-by: NZi Yan <ziy@nvidia.com> Reported-by: NDoug Berger <opendmb@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Doug Berger <opendmb@gmail.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 17 6月, 2022 1 次提交
-
-
由 Yang Li 提交于
Remove one warning found by running scripts/kernel-doc, which is caused by using 'make W=1': mm/page_isolation.c:304: warning: Function parameter or member 'skip_isolation' not described in 'isolate_single_pageblock' Link: https://lkml.kernel.org/r/20220602062116.61199-1-yang.lee@linux.alibaba.comSigned-off-by: NYang Li <yang.lee@linux.alibaba.com> Reported-by: NAbaci Robot <abaci@linux.alibaba.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 02 6月, 2022 1 次提交
-
-
由 Zi Yan 提交于
When compound_nr(page) was used, page was not guaranteed to be the head of the compound page and it could cause an infinite loop. Fix it by calling it on the head page. Link: https://lkml.kernel.org/r/20220531024450.2498431-1-zi.yan@sent.com Fixes: b2c9e2fb ("mm: make alloc_contig_range work at pageblock granularity") Signed-off-by: NZi Yan <ziy@nvidia.com> Reported-by: NAnshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/linux-mm/20220530115027.123341-1-anshuman.khandual@arm.com/Reviewed-by: NAnshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Acked-by: NMuchun Song <songmuchun@bytedance.com> Cc: David Hildenbrand <david@redhat.com> Cc: Qian Cai <quic_qiancai@quicinc.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Eric Ren <renzhengeek@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 28 5月, 2022 2 次提交
-
-
由 Zi Yan 提交于
In isolate_single_pageblock(), free pages are checked without holding zone lock, but they can go away in split_free_page() when zone lock is held. Check the free page and its order again in split_free_page() when zone lock is held. Recheck the page if the free page is gone under zone lock. In addition, in split_free_page(), the free page was deleted from the page list without changing free page accounting. Add the missing free page accounting code. Fix the type of order parameter in split_free_page(). Link: https://lore.kernel.org/lkml/20220525103621.987185e2ca0079f7b97b856d@linux-foundation.org/ Link: https://lkml.kernel.org/r/20220526231531.2404977-2-zi.yan@sent.com Fixes: b2c9e2fb ("mm: make alloc_contig_range work at pageblock granularity") Signed-off-by: NZi Yan <ziy@nvidia.com> Reported-by: NDoug Berger <opendmb@gmail.com> Link: https://lore.kernel.org/linux-mm/c3932a6f-77fe-29f7-0c29-fe6b1c67ab7b@gmail.com/ Cc: David Hildenbrand <david@redhat.com> Cc: Qian Cai <quic_qiancai@quicinc.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Eric Ren <renzhengeek@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Michael Walle <michael@walle.cc> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Zi Yan 提交于
start_isolate_page_range() first isolates the first and the last pageblocks in the range and ensure pages across range boundaries are split during isolation. But it missed the case when the range is <= a pageblock and the first and the last pageblocks are the same one, so the second isolate_single_pageblock() will always fail. To fix it, skip the pageblock isolation in second isolate_single_pageblock(). Link: https://lkml.kernel.org/r/20220526231531.2404977-1-zi.yan@sent.com Fixes: 88ee1343 ("mm: fix a potential infinite loop in start_isolate_page_range()") Signed-off-by: NZi Yan <ziy@nvidia.com> Reported-by: NMarek Szyprowski <m.szyprowski@samsung.com> Tested-by: NMarek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/linux-mm/ac65adc0-a7e4-cdfe-a0d8-757195b86293@samsung.com/Reported-by: NMichael Walle <michael@walle.cc> Tested-by: NMichael Walle <michael@walle.cc> Link: https://lore.kernel.org/linux-mm/8ca048ca8b547e0dd1c95387ee05c23d@walle.cc/ Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand <david@redhat.com> Cc: Doug Berger <opendmb@gmail.com> Cc: Eric Ren <renzhengeek@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Rapoport <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Qian Cai <quic_qiancai@quicinc.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 26 5月, 2022 1 次提交
-
-
由 Zi Yan 提交于
In isolate_single_pageblock() called by start_isolate_page_range(), there are some pageblock isolation issues causing a potential infinite loop when isolating a page range. This is reported by Qian Cai. 1. the pageblock was isolated by just changing pageblock migratetype without checking unmovable pages. Calling set_migratetype_isolate() to isolate pageblock properly. 2. an off-by-one error caused migrating pages unnecessarily, since the page is not crossing pageblock boundary. 3. migrating a compound page across pageblock boundary then splitting the free page later has a small race window that the free page might be allocated again, so that the code will try again, causing an potential infinite loop. Temporarily set the to-be-migrated page's pageblock to MIGRATE_ISOLATE to prevent that and bail out early if no free page is found after page migration. An additional fix to split_free_page() aims to avoid crashing in __free_one_page(). When the free page is split at the specified split_pfn_offset, free_page_order should check both the first bit of free_page_pfn and the last bit of split_pfn_offset and use the smaller one. For example, if free_page_pfn=0x10000, split_pfn_offset=0xc000, free_page_order should first be 0x8000 then 0x4000, instead of 0x4000 then 0x8000, which the original algorithm did. [akpm@linux-foundation.org: suppress min() warning] Link: https://lkml.kernel.org/r/20220524194756.1698351-1-zi.yan@sent.com Fixes: b2c9e2fb ("mm: make alloc_contig_range work at pageblock granularity") Signed-off-by: NZi Yan <ziy@nvidia.com> Reported-by: NQian Cai <quic_qiancai@quicinc.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Ren <renzhengeek@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 13 5月, 2022 4 次提交
-
-
由 Zi Yan 提交于
Now start_isolate_page_range() is ready to handle arbitrary range isolation, so move the alignment check/adjustment into the function body. Do the same for its counterpart undo_isolate_page_range(). alloc_contig_range(), its caller, can pass an arbitrary range instead of a MAX_ORDER_NR_PAGES aligned one. Link: https://lkml.kernel.org/r/20220425143118.2850746-5-zi.yan@sent.comSigned-off-by: NZi Yan <ziy@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Ren <renzhengeek@gmail.com> Cc: kernel test robot <lkp@intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Zi Yan 提交于
alloc_contig_range() worked at MAX_ORDER_NR_PAGES granularity to avoid merging pageblocks with different migratetypes. It might unnecessarily convert extra pageblocks at the beginning and at the end of the range. Change alloc_contig_range() to work at pageblock granularity. Special handling is needed for free pages and in-use pages across the boundaries of the range specified by alloc_contig_range(). Because these= Partially isolated pages causes free page accounting issues. The free pages will be split and freed into separate migratetype lists; the in-use= Pages will be migrated then the freed pages will be handled in the aforementioned way. [ziy@nvidia.com: fix deadlock/crash] Link: https://lkml.kernel.org/r/23A7297E-6C84-4138-A9FE-3598234004E6@nvidia.com Link: https://lkml.kernel.org/r/20220425143118.2850746-4-zi.yan@sent.comSigned-off-by: NZi Yan <ziy@nvidia.com> Reported-by: Nkernel test robot <lkp@intel.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Ren <renzhengeek@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Zi Yan 提交于
Enable set_migratetype_isolate() to check specified range for unmovable pages during isolation to prepare arbitrary range page isolation. The functionality will take effect in upcoming commits by adjusting the callers of start_isolate_page_range(), which uses set_migratetype_isolate(). For example, alloc_contig_range(), which calls start_isolate_page_range(), accepts unaligned ranges, but because page isolation is currently done at MAX_ORDER_NR_PAEGS granularity, pages that are out of the specified range but withint MAX_ORDER_NR_PAEGS alignment might be attempted for isolation and the failure of isolating these unrelated pages fails the whole operation undesirably. Link: https://lkml.kernel.org/r/20220425143118.2850746-3-zi.yan@sent.comSigned-off-by: NZi Yan <ziy@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand <david@redhat.com> Cc: Eric Ren <renzhengeek@gmail.com> Cc: kernel test robot <lkp@intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Zi Yan 提交于
Patch series "Use pageblock_order for cma and alloc_contig_range alignment", v11. This patchset tries to remove the MAX_ORDER-1 alignment requirement for CMA and alloc_contig_range(). It prepares for my upcoming changes to make MAX_ORDER adjustable at boot time[1]. The MAX_ORDER - 1 alignment requirement comes from that alloc_contig_range() isolates pageblocks to remove free memory from buddy allocator but isolating only a subset of pageblocks within a page spanning across multiple pageblocks causes free page accounting issues. Isolated page might not be put into the right free list, since the code assumes the migratetype of the first pageblock as the whole free page migratetype. This is based on the discussion at [2]. To remove the requirement, this patchset: 1. isolates pages at pageblock granularity instead of max(MAX_ORDER_NR_PAEGS, pageblock_nr_pages); 2. splits free pages across the specified range or migrates in-use pages across the specified range then splits the freed page to avoid free page accounting issues (it happens when multiple pageblocks within a single page have different migratetypes); 3. only checks unmovable pages within the range instead of MAX_ORDER - 1 aligned range during isolation to avoid alloc_contig_range() failure when pageblocks within a MAX_ORDER - 1 aligned range are allocated separately. 4. returns pages not in the range as it did before. One optimization might come later: 1. make MIGRATE_ISOLATE a separate bit to be able to restore the original migratetypes when isolation fails in the middle of the range. [1] https://lore.kernel.org/linux-mm/20210805190253.2795604-1-zi.yan@sent.com/ [2] https://lore.kernel.org/linux-mm/d19fb078-cb9b-f60f-e310-fdeea1b947d2@redhat.com/ This patch (of 6): has_unmovable_pages() is only used in mm/page_isolation.c. Move it from mm/page_alloc.c and make it static. Link: https://lkml.kernel.org/r/20220425143118.2850746-2-zi.yan@sent.comSigned-off-by: NZi Yan <ziy@nvidia.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Reviewed-by: NMike Rapoport <rppt@linux.ibm.com> Acked-by: NDavid Hildenbrand <david@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Eric Ren <renzhengeek@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Minchan Kim <minchan@kernel.org> Cc: kernel test robot <lkp@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 29 4月, 2022 1 次提交
-
-
由 Zi Yan 提交于
Whenever the buddy of a page is found from __find_buddy_pfn(), page_is_buddy() should be used to check its validity. Add a helper function find_buddy_page_pfn() to find the buddy page and do the check together. [ziy@nvidia.com: updates per David] Link: https://lkml.kernel.org/r/20220401230804.1658207-2-zi.yan@sent.com Link: https://lore.kernel.org/linux-mm/CAHk-=wji_AmYygZMTsPMdJ7XksMt7kOur8oDfDdniBRMjm4VkQ@mail.gmail.com/ Link: https://lkml.kernel.org/r/7236E7CA-B5F1-4C04-AB85-E86FA3E9A54B@nvidia.comSuggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NZi Yan <ziy@nvidia.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NDavid Hildenbrand <david@redhat.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Rapoport <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 05 2月, 2022 1 次提交
-
-
由 Chen Wandun 提交于
This reverts commit 721fb891. Commit 721fb891 ("mm/page_isolation: unset migratetype directly for non Buddy page") will result memory that should in buddy disappear by mistake. move_freepages_block moves all pages in pageblock instead of pages indicated by input parameter, so if input pages is not in buddy but other pages in pageblock is in buddy, it will result in page out of control. Link: https://lkml.kernel.org/r/20220126024436.13921-1-chenwandun@huawei.com Fixes: 721fb891 ("mm/page_isolation: unset migratetype directly for non Buddy page") Signed-off-by: NChen Wandun <chenwandun@huawei.com> Reported-by: N"kernelci.org bot" <bot@kernelci.org> Acked-by: NDavid Hildenbrand <david@redhat.com> Tested-by: NDong Aisheng <aisheng.dong@nxp.com> Tested-by: NFrancesco Dolcini <francesco.dolcini@toradex.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Tested-by: NGuenter Roeck <linux@roeck-us.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 1月, 2022 1 次提交
-
-
由 Chen Wandun 提交于
In unset_migratetype_isolate(), we can bypass the call to move_freepages_block() for non-buddy pages. It will save a few cpu cycles for some situations such as cma and hugetlb when allocating continue pages, in these situation function alloc_contig_pages will be called. alloc_contig_pages __alloc_contig_migrate_range isolate_freepages_range ==> pages has been remove from buddy undo_isolate_page_range unset_migratetype_isolate ==> can directly set migratetype [osalvador@suse.de: changelog tweak] Link: https://lkml.kernel.org/r/20211229033649.2760586-1-chenwandun@huawei.com Fixes: 3c605096 ("mm/page_alloc: restrict max order of merging on isolated pageblock") Signed-off-by: NChen Wandun <chenwandun@huawei.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Wang Kefeng <wangkefeng.wang@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 11月, 2021 2 次提交
-
-
由 Miaohe Lin 提交于
Isolating a free page in an isolated pageblock is expected to always work as watermarks don't apply here. But if __isolate_free_page() failed, due to condition changes, the page will be left on the free list. And the page will be put back to free list again via __putback_isolated_page(). This may trigger VM_BUG_ON_PAGE() on page->flags checking in __free_one_page() if PageReported is set. Or we will corrupt the free list because list_add() will be called for pages already on another list. Add a VM_WARN_ON() to complain about this change. Link: https://lkml.kernel.org/r/20210914114508.23725-1-linmiaohe@huawei.com Fixes: 3c605096 ("mm/page_alloc: restrict max order of merging on isolated pageblock") Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miaohe Lin 提交于
In start_isolate_page_range() undo path, pfn_to_online_page() just checks the first pfn in a pageblock while __first_valid_page() will traverse the pageblock until the first online pfn is found. So we may miss the call to unset_migratetype_isolate() in undo path and pages will remain isolated unexpectedly. Fix this by calling undo_isolate_page_range() and this will also help to simplify the code further. Note we shouldn't ever trigger it because MAX_ORDER-1 aligned pfn ranges shouldn't contain memory holes now. Link: https://lkml.kernel.org/r/20210914114348.15569-1-linmiaohe@huawei.com Fixes: 2ce13640 ("mm: __first_valid_page skip over offline pages") Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 9月, 2021 1 次提交
-
-
由 Mike Rapoport 提交于
Patch series "mm: remove pfn_valid_within() and CONFIG_HOLES_IN_ZONE". After recent updates to freeing unused parts of the memory map, no architecture can have holes in the memory map within a pageblock. This makes pfn_valid_within() check and CONFIG_HOLES_IN_ZONE configuration option redundant. The first patch removes them both in a mechanical way and the second patch simplifies memory_hotplug::test_pages_in_a_zone() that had pfn_valid_within() surrounded by more logic than simple if. This patch (of 2): After recent changes in freeing of the unused parts of the memory map and rework of pfn_valid() in arm and arm64 there are no architectures that can have holes in the memory map within a pageblock and so nothing can enable CONFIG_HOLES_IN_ZONE which guards non trivial implementation of pfn_valid_within(). With that, pfn_valid_within() is always hardwired to 1 and can be completely removed. Remove calls to pfn_valid_within() and CONFIG_HOLES_IN_ZONE. Link: https://lkml.kernel.org/r/20210713080035.7464-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20210713080035.7464-2-rppt@kernel.orgSigned-off-by: NMike Rapoport <rppt@linux.ibm.com> Acked-by: NDavid Hildenbrand <david@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 9月, 2021 1 次提交
-
-
由 George G. Davis 提交于
Some test_pages_isolated failure conditions don't include trace points. For debugging issues caused by "pinned" pages, make sure to trace all calls whether they succeed or fail. In this case, a failure case did not result in a trace point. So add the missing failure case in test_pages_isolated traces. Link: https://lkml.kernel.org/r/20210823202823.13765-1-george_davis@mentor.comSigned-off-by: NGeorge G. Davis <davis.george@siemens.com> Cc: Eugeniu Rosca <erosca@de.adit-jv.com> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 12月, 2020 3 次提交
-
-
由 Muchun Song 提交于
A max order page has no buddy page and never merges to another order. So isolating and then freeing it is pointless. Link: https://lkml.kernel.org/r/20201202122114.75316-1-songmuchun@bytedance.com Fixes: 3c605096 ("mm/page_alloc: restrict max order of merging on isolated pageblock") Signed-off-by: NMuchun Song <songmuchun@bytedance.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NVlastimil Babka <vbabka@suse.cz> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vlastimil Babka 提交于
Memory offlining relies on page isolation to guarantee a forward progress because pages cannot be reused while they are isolated. But the page isolation itself doesn't prevent from races while freed pages are stored on pcp lists and thus can be reused. This can be worked around by repeated draining of pcplists, as done by commit 96831826 ("mm/memory_hotplug: drain per-cpu pages again during memory offline"). David and Michal would prefer that this race was closed in a way that callers of page isolation who need stronger guarantees don't need to repeatedly drain. David suggested disabling pcplists usage completely during page isolation, instead of repeatedly draining them. To achieve this without adding special cases in alloc/free fastpath, we can use the same approach as boot pagesets - when pcp->high is 0, any pcplist addition will be immediately flushed. The race can thus be closed by setting pcp->high to 0 and draining pcplists once, before calling start_isolate_page_range(). The draining will serialize after processes that already disabled interrupts and read the old value of pcp->high in free_unref_page_commit(), and processes that have not yet disabled interrupts, will observe pcp->high == 0 when they are rescheduled, and skip pcplists. This guarantees no stray pages on pcplists in zones where isolation happens. This patch thus adds zone_pcp_disable() and zone_pcp_enable() functions that page isolation users can call before start_isolate_page_range() and after unisolating (or offlining) the isolated pages. Also, drain_all_pages() is optimized to only execute on cpus where pcplists are not empty. The check can however race with a free to pcplist that has not yet increased the pcp->count from 0 to 1. Thus make the drain optionally skip the racy check and drain on all cpus, and use this option in zone_pcp_disable(). As we have to avoid external updates to high and batch while pcplists are disabled, we take pcp_batch_high_lock in zone_pcp_disable() and release it in zone_pcp_enable(). This also synchronizes multiple users of zone_pcp_disable()/enable(). Currently the only user of this functionality is offline_pages(). [vbabka@suse.cz: add comment, per David] Link: https://lkml.kernel.org/r/527480ef-ed72-e1c1-52a0-1c5b0113df45@suse.cz Link: https://lkml.kernel.org/r/20201111092812.11329-8-vbabka@suse.czSigned-off-by: NVlastimil Babka <vbabka@suse.cz> Suggested-by: NDavid Hildenbrand <david@redhat.com> Suggested-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Vlastimil Babka 提交于
Currently, pcplists are drained during set_migratetype_isolate() which means once per pageblock processed start_isolate_page_range(). This is somewhat wasteful. Moreover, the callers might need different guarantees, and the draining is currently prone to races and does not guarantee that no page from isolated pageblock will end up on the pcplist after the drain. Better guarantees are added by later patches and require explicit actions by page isolation users that need them. Thus it makes sense to move the current imperfect draining to the callers also as a preparation step. Link: https://lkml.kernel.org/r/20201111092812.11329-7-vbabka@suse.czSuggested-by: NDavid Hildenbrand <david@redhat.com> Suggested-by: NPavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: NVlastimil Babka <vbabka@suse.cz> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Acked-by: NMichal Hocko <mhocko@suse.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 10月, 2020 3 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
The current page_order() can only be called on pages in the buddy allocator. For compound pages, you have to use compound_order(). This is confusing and led to a bug, so rename page_order() to buddy_order(). Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20201001152259.14932-2-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
Whenever we move pages between freelists via move_to_free_list()/ move_freepages_block(), we don't actually touch the pages: 1. Page isolation doesn't actually touch the pages, it simply isolates pageblocks and moves all free pages to the MIGRATE_ISOLATE freelist. When undoing isolation, we move the pages back to the target list. 2. Page stealing (steal_suitable_fallback()) moves free pages directly between lists without touching them. 3. reserve_highatomic_pageblock()/unreserve_highatomic_pageblock() moves free pages directly between freelists without touching them. We already place pages to the tail of the freelists when undoing isolation via __putback_isolated_page(), let's do it in any case (e.g., if order <= pageblock_order) and document the behavior. To simplify, let's move the pages to the tail for all move_to_free_list()/move_freepages_block() users. In 2., the target list is empty, so there should be no change. In 3., we might observe a change, however, highatomic is more concerned about allocations succeeding than cache hotness - if we ever realize this change degrades a workload, we can special-case this instance and add a proper comment. This change results in all pages getting onlined via online_pages() to be placed to the tail of the freelist. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NOscar Salvador <osalvador@suse.de> Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Acked-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20201005121534.15649-4-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
Callers no longer need the number of isolated pageblocks. Let's simplify. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NOscar Salvador <osalvador@suse.de> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Charan Teja Reddy <charante@codeaurora.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michel Lespinasse <walken@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200819175957.28465-7-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 10月, 2020 3 次提交
-
-
由 David Hildenbrand 提交于
Let's clean it up a bit, simplifying the exit paths. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NBaoquan He <bhe@redhat.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qian Cai <cai@lca.pw> Link: http://lkml.kernel.org/r/20200816125333.7434-5-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
Inside has_unmovable_pages(), we have a comment describing how unmovable data could end up in ZONE_MOVABLE - via "movablecore". Also, besides checking if the first page in the pageblock is reserved, we don't perform any further checks in case of ZONE_MOVABLE. In case of memory offlining, we set REPORT_FAILURE, properly dump_page() the page and handle the error gracefully. alloc_contig_pages() users currently never allocate from ZONE_MOVABLE. E.g., hugetlb uses alloc_contig_pages() for the allocation of gigantic pages only, which will never end up on the MOVABLE zone (see htlb_alloc_mask()). Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NBaoquan He <bhe@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qian Cai <cai@lca.pw> Link: http://lkml.kernel.org/r/20200816125333.7434-4-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
Right now, if we have two isolations racing on a pageblock that's in the MOVABLE zone, we would trigger the WARN_ON_ONCE(). Let's just return directly, simplifying error handling. The change was introduced in commit 3d680bdf ("mm/page_isolation: fix potential warning from user"). As far as I can see, we currently don't have alloc_contig_range() users that use the ZONE_MOVABLE (anymore), so it's currently more a cleanup and a preparation for the future than a fix. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NBaoquan He <bhe@redhat.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Qian Cai <cai@lca.pw> Cc: Jason Wang <jasowang@redhat.com> Cc: Mike Rapoport <rppt@kernel.org> Link: http://lkml.kernel.org/r/20200816125333.7434-3-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 9月, 2020 1 次提交
-
-
由 Pavel Tatashin 提交于
There is a race during page offline that can lead to infinite loop: a page never ends up on a buddy list and __offline_pages() keeps retrying infinitely or until a termination signal is received. Thread#1 - a new process: load_elf_binary begin_new_exec exec_mmap mmput exit_mmap tlb_finish_mmu tlb_flush_mmu release_pages free_unref_page_list free_unref_page_prepare set_pcppage_migratetype(page, migratetype); // Set page->index migration type below MIGRATE_PCPTYPES Thread#2 - hot-removes memory __offline_pages start_isolate_page_range set_migratetype_isolate set_pageblock_migratetype(page, MIGRATE_ISOLATE); Set migration type to MIGRATE_ISOLATE-> set drain_all_pages(zone); // drain per-cpu page lists to buddy allocator. Thread#1 - continue free_unref_page_commit migratetype = get_pcppage_migratetype(page); // get old migration type list_add(&page->lru, &pcp->lists[migratetype]); // add new page to already drained pcp list Thread#2 Never drains pcp again, and therefore gets stuck in the loop. The fix is to try to drain per-cpu lists again after check_pages_isolated_cb() fails. Fixes: c52e7593 ("mm: remove extra drain pages on pcp list") Signed-off-by: NPavel Tatashin <pasha.tatashin@soleen.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NDavid Rientjes <rientjes@google.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NDavid Hildenbrand <david@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200903140032.380431-1-pasha.tatashin@soleen.com Link: https://lkml.kernel.org/r/20200904151448.100489-2-pasha.tatashin@soleen.com Link: http://lkml.kernel.org/r/20200904070235.GA15277@dhcp22.suse.czSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 8月, 2020 3 次提交
-
-
由 Joonsoo Kim 提交于
There is a well-defined standard migration target callback. Use it directly. Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Roman Gushchin <guro@fb.com> Link: http://lkml.kernel.org/r/1594622517-20681-8-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joonsoo Kim 提交于
There are some similar functions for migration target allocation. Since there is no fundamental difference, it's better to keep just one rather than keeping all variants. This patch implements base migration target allocation function. In the following patches, variants will be converted to use this function. Changes should be mechanical, but, unfortunately, there are some differences. First, some callers' nodemask is assgined to NULL since NULL nodemask will be considered as all available nodes, that is, &node_states[N_MEMORY]. Second, for hugetlb page allocation, gfp_mask is redefined as regular hugetlb allocation gfp_mask plus __GFP_THISNODE if user provided gfp_mask has it. This is because future caller of this function requires to set this node constaint. Lastly, if provided nodeid is NUMA_NO_NODE, nodeid is set up to the node where migration source lives. It helps to remove simple wrappers for setting up the nodeid. Note that PageHighmem() call in previous function is changed to open-code "is_highmem_idx()" since it provides more readability. [akpm@linux-foundation.org: tweak patch title, per Vlastimil] [akpm@linux-foundation.org: fix typo in comment] Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Roman Gushchin <guro@fb.com> Link: http://lkml.kernel.org/r/1594622517-20681-6-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joonsoo Kim 提交于
Patch series "clean-up the migration target allocation functions", v5. This patch (of 9): For locality, it's better to migrate the page to the same node rather than the node of the current caller's cpu. Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NRoman Gushchin <guro@fb.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Link: http://lkml.kernel.org/r/1594622517-20681-1-git-send-email-iamjoonsoo.kim@lge.com Link: http://lkml.kernel.org/r/1594622517-20681-2-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 6月, 2020 1 次提交
-
-
由 David Hildenbrand 提交于
virtio-mem wants to allow to offline memory blocks of which some parts were unplugged (allocated via alloc_contig_range()), especially, to later offline and remove completely unplugged memory blocks. The important part is that PageOffline() has to remain set until the section is offline, so these pages will never get accessed (e.g., when dumping). The pages should not be handed back to the buddy (which would require clearing PageOffline() and result in issues if offlining fails and the pages are suddenly in the buddy). Let's allow to do that by allowing to isolate any PageOffline() page when offlining. This way, we can reach the memory hotplug notifier MEM_GOING_OFFLINE, where the driver can signal that he is fine with offlining this page by dropping its reference count. PageOffline() pages with a reference count of 0 can then be skipped when offlining the pages (like if they were free, however they are not in the buddy). Anybody who uses PageOffline() pages and does not agree to offline them (e.g., Hyper-V balloon, XEN balloon, VMWare balloon for 2MB pages) will not decrement the reference count and make offlining fail when trying to migrate such an unmovable page. So there should be no observable change. Same applies to balloon compaction users (movable PageOffline() pages), the pages will simply be migrated. Note 1: If offlining fails, a driver has to increment the reference count again in MEM_CANCEL_OFFLINE. Note 2: A driver that makes use of this has to be aware that re-onlining the memory block has to be handled by hooking into onlining code (online_page_callback_t), resetting the page PageOffline() and not giving them to the buddy. Reviewed-by: NAlexander Duyck <alexander.h.duyck@linux.intel.com> Acked-by: NMichal Hocko <mhocko@suse.com> Tested-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Anthony Yznaga <anthony.yznaga@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Qian Cai <cai@lca.pw> Cc: Pingfan Liu <kernelfans@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-7-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 08 4月, 2020 1 次提交
-
-
由 Alexander Duyck 提交于
There are cases where we would benefit from avoiding having to go through the allocation and free cycle to return an isolated page. Examples for this might include page poisoning in which we isolate a page and then put it back in the free list without ever having actually allocated it. This will enable us to also avoid notifiers for the future free page reporting which will need to avoid retriggering page reporting when returning pages that have been reported on. Signed-off-by: NAlexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NMel Gorman <mgorman@techsingularity.net> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Nitesh Narayan Lal <nitesh@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pankaj Gupta <pagupta@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Wang <wei.w.wang@intel.com> Cc: Yang Zhang <yang.zhang.wz@gmail.com> Cc: wei qi <weiqi4@huawei.com> Link: http://lkml.kernel.org/r/20200211224624.29318.89287.stgit@localhost.localdomainSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 01 2月, 2020 4 次提交
-
-
由 Qian Cai 提交于
It makes sense to call the WARN_ON_ONCE(zone_idx(zone) == ZONE_MOVABLE) from start_isolate_page_range(), but should avoid triggering it from userspace, i.e, from is_mem_section_removable() because it could crash the system by a non-root user if warn_on_panic is set. While at it, simplify the code a bit by removing an unnecessary jump label. Link: http://lkml.kernel.org/r/20200120163915.1469-1-cai@lca.pwSigned-off-by: NQian Cai <cai@lca.pw> Suggested-by: NMichal Hocko <mhocko@kernel.org> Acked-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Qian Cai 提交于
It is not that hard to trigger lockdep splats by calling printk from under zone->lock. Most of them are false positives caused by lock chains introduced early in the boot process and they do not cause any real problems (although most of the early boot lock dependencies could happen after boot as well). There are some console drivers which do allocate from the printk context as well and those should be fixed. In any case, false positives are not that trivial to workaround and it is far from optimal to lose lockdep functionality for something that is a non-issue. So change has_unmovable_pages() so that it no longer calls dump_page() itself - instead it returns a "struct page *" of the unmovable page back to the caller so that in the case of a has_unmovable_pages() failure, the caller can call dump_page() after releasing zone->lock. Also, make dump_page() is able to report a CMA page as well, so the reason string from has_unmovable_pages() can be removed. Even though has_unmovable_pages doesn't hold any reference to the returned page this should be reasonably safe for the purpose of reporting the page (dump_page) because it cannot be hotremoved in the context of memory unplug. The state of the page might change but that is the case even with the existing code as zone->lock only plays role for free pages. While at it, remove a similar but unnecessary debug-only printk() as well. A sample of one of those lockdep splats is, WARNING: possible circular locking dependency detected ------------------------------------------------------ test.sh/8653 is trying to acquire lock: ffffffff865a4460 (console_owner){-.-.}, at: console_unlock+0x207/0x750 but task is already holding lock: ffff88883fff3c58 (&(&zone->lock)->rlock){-.-.}, at: __offline_isolated_pages+0x179/0x3e0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&(&zone->lock)->rlock){-.-.}: __lock_acquire+0x5b3/0xb40 lock_acquire+0x126/0x280 _raw_spin_lock+0x2f/0x40 rmqueue_bulk.constprop.21+0xb6/0x1160 get_page_from_freelist+0x898/0x22c0 __alloc_pages_nodemask+0x2f3/0x1cd0 alloc_pages_current+0x9c/0x110 allocate_slab+0x4c6/0x19c0 new_slab+0x46/0x70 ___slab_alloc+0x58b/0x960 __slab_alloc+0x43/0x70 __kmalloc+0x3ad/0x4b0 __tty_buffer_request_room+0x100/0x250 tty_insert_flip_string_fixed_flag+0x67/0x110 pty_write+0xa2/0xf0 n_tty_write+0x36b/0x7b0 tty_write+0x284/0x4c0 __vfs_write+0x50/0xa0 vfs_write+0x105/0x290 redirected_tty_write+0x6a/0xc0 do_iter_write+0x248/0x2a0 vfs_writev+0x106/0x1e0 do_writev+0xd4/0x180 __x64_sys_writev+0x45/0x50 do_syscall_64+0xcc/0x76c entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #2 (&(&port->lock)->rlock){-.-.}: __lock_acquire+0x5b3/0xb40 lock_acquire+0x126/0x280 _raw_spin_lock_irqsave+0x3a/0x50 tty_port_tty_get+0x20/0x60 tty_port_default_wakeup+0xf/0x30 tty_port_tty_wakeup+0x39/0x40 uart_write_wakeup+0x2a/0x40 serial8250_tx_chars+0x22e/0x440 serial8250_handle_irq.part.8+0x14a/0x170 serial8250_default_handle_irq+0x5c/0x90 serial8250_interrupt+0xa6/0x130 __handle_irq_event_percpu+0x78/0x4f0 handle_irq_event_percpu+0x70/0x100 handle_irq_event+0x5a/0x8b handle_edge_irq+0x117/0x370 do_IRQ+0x9e/0x1e0 ret_from_intr+0x0/0x2a cpuidle_enter_state+0x156/0x8e0 cpuidle_enter+0x41/0x70 call_cpuidle+0x5e/0x90 do_idle+0x333/0x370 cpu_startup_entry+0x1d/0x1f start_secondary+0x290/0x330 secondary_startup_64+0xb6/0xc0 -> #1 (&port_lock_key){-.-.}: __lock_acquire+0x5b3/0xb40 lock_acquire+0x126/0x280 _raw_spin_lock_irqsave+0x3a/0x50 serial8250_console_write+0x3e4/0x450 univ8250_console_write+0x4b/0x60 console_unlock+0x501/0x750 vprintk_emit+0x10d/0x340 vprintk_default+0x1f/0x30 vprintk_func+0x44/0xd4 printk+0x9f/0xc5 -> #0 (console_owner){-.-.}: check_prev_add+0x107/0xea0 validate_chain+0x8fc/0x1200 __lock_acquire+0x5b3/0xb40 lock_acquire+0x126/0x280 console_unlock+0x269/0x750 vprintk_emit+0x10d/0x340 vprintk_default+0x1f/0x30 vprintk_func+0x44/0xd4 printk+0x9f/0xc5 __offline_isolated_pages.cold.52+0x2f/0x30a offline_isolated_pages_cb+0x17/0x30 walk_system_ram_range+0xda/0x160 __offline_pages+0x79c/0xa10 offline_pages+0x11/0x20 memory_subsys_offline+0x7e/0xc0 device_offline+0xd5/0x110 state_store+0xc6/0xe0 dev_attr_store+0x3f/0x60 sysfs_kf_write+0x89/0xb0 kernfs_fop_write+0x188/0x240 __vfs_write+0x50/0xa0 vfs_write+0x105/0x290 ksys_write+0xc6/0x160 __x64_sys_write+0x43/0x50 do_syscall_64+0xcc/0x76c entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: console_owner --> &(&port->lock)->rlock --> &(&zone->lock)->rlock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&zone->lock)->rlock); lock(&(&port->lock)->rlock); lock(&(&zone->lock)->rlock); lock(console_owner); *** DEADLOCK *** 9 locks held by test.sh/8653: #0: ffff88839ba7d408 (sb_writers#4){.+.+}, at: vfs_write+0x25f/0x290 #1: ffff888277618880 (&of->mutex){+.+.}, at: kernfs_fop_write+0x128/0x240 #2: ffff8898131fc218 (kn->count#115){.+.+}, at: kernfs_fop_write+0x138/0x240 #3: ffffffff86962a80 (device_hotplug_lock){+.+.}, at: lock_device_hotplug_sysfs+0x16/0x50 #4: ffff8884374f4990 (&dev->mutex){....}, at: device_offline+0x70/0x110 #5: ffffffff86515250 (cpu_hotplug_lock.rw_sem){++++}, at: __offline_pages+0xbf/0xa10 #6: ffffffff867405f0 (mem_hotplug_lock.rw_sem){++++}, at: percpu_down_write+0x87/0x2f0 #7: ffff88883fff3c58 (&(&zone->lock)->rlock){-.-.}, at: __offline_isolated_pages+0x179/0x3e0 #8: ffffffff865a4920 (console_lock){+.+.}, at: vprintk_emit+0x100/0x340 stack backtrace: Hardware name: HPE ProLiant DL560 Gen10/ProLiant DL560 Gen10, BIOS U34 05/21/2019 Call Trace: dump_stack+0x86/0xca print_circular_bug.cold.31+0x243/0x26e check_noncircular+0x29e/0x2e0 check_prev_add+0x107/0xea0 validate_chain+0x8fc/0x1200 __lock_acquire+0x5b3/0xb40 lock_acquire+0x126/0x280 console_unlock+0x269/0x750 vprintk_emit+0x10d/0x340 vprintk_default+0x1f/0x30 vprintk_func+0x44/0xd4 printk+0x9f/0xc5 __offline_isolated_pages.cold.52+0x2f/0x30a offline_isolated_pages_cb+0x17/0x30 walk_system_ram_range+0xda/0x160 __offline_pages+0x79c/0xa10 offline_pages+0x11/0x20 memory_subsys_offline+0x7e/0xc0 device_offline+0xd5/0x110 state_store+0xc6/0xe0 dev_attr_store+0x3f/0x60 sysfs_kf_write+0x89/0xb0 kernfs_fop_write+0x188/0x240 __vfs_write+0x50/0xa0 vfs_write+0x105/0x290 ksys_write+0xc6/0x160 __x64_sys_write+0x43/0x50 do_syscall_64+0xcc/0x76c entry_SYSCALL_64_after_hwframe+0x49/0xbe Link: http://lkml.kernel.org/r/20200117181200.20299-1-cai@lca.pwSigned-off-by: NQian Cai <cai@lca.pw> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
Now that the memory isolate notifier is gone, the parameter is always 0. Drop it and cleanup has_unmovable_pages(). Link: http://lkml.kernel.org/r/20191114131911.11783-3-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Qian Cai <cai@lca.pw> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Arun KS <arunks@codeaurora.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
Luckily, we have no users left, so we can get rid of it. Cleanup set_migratetype_isolate() a little bit. Link: http://lkml.kernel.org/r/20191114131911.11783-2-david@redhat.comSigned-off-by: NDavid Hildenbrand <david@redhat.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Qian Cai <cai@lca.pw> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-