- 25 1月, 2021 2 次提交
-
-
由 Shakeel Butt 提交于
Currently the kernel is not correctly updating the numa stats for NR_FILE_PAGES and NR_SHMEM on THP migration. Fix that. For NR_FILE_DIRTY and NR_ZONE_WRITE_PENDING, although at the moment there is no need to handle THP migration as kernel still does not have write support for file THP but to be more future proof, this patch adds the THP support for those stats as well. Link: https://lkml.kernel.org/r/20210108155813.2914586-2-shakeelb@google.com Fixes: e71769ae ("mm: enable thp migration for shmem thp") Signed-off-by: NShakeel Butt <shakeelb@google.com> Acked-by: NYang Shi <shy828301@gmail.com> Reviewed-by: NRoman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shakeel Butt 提交于
The kernel updates the per-node NR_FILE_DIRTY stats on page migration but not the memcg numa stats. That was not an issue until recently the commit 5f9a4f4a ("mm: memcontrol: add the missing numa_stat interface for cgroup v2") exposed numa stats for the memcg. So fix the file_dirty per-memcg numa stat. Link: https://lkml.kernel.org/r/20210108155813.2914586-1-shakeelb@google.com Fixes: 5f9a4f4a ("mm: memcontrol: add the missing numa_stat interface for cgroup v2") Signed-off-by: NShakeel Butt <shakeelb@google.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Acked-by: NYang Shi <shy828301@gmail.com> Reviewed-by: NRoman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 12月, 2020 10 次提交
-
-
由 Haitao Shi 提交于
Fix some spelling mistakes in comments: udpate ==> update succesful ==> successful exmaple ==> example unneccessary ==> unnecessary stoping ==> stopping uknown ==> unknown Link: https://lkml.kernel.org/r/20201127011747.86005-1-shihaitao1@huawei.comSigned-off-by: NHaitao Shi <shihaitao1@huawei.com> Reviewed-by: NMike Rapoport <rppt@linux.ibm.com> Reviewed-by: NSouptick Joarder <jrdr.linux@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Stephen Zhang 提交于
"dst" parameter to migrate_vma_insert_page() is not used anymore. Link: https://lkml.kernel.org/r/CANubcdUwCAMuUyamG2dkWP=cqSR9MAS=tHLDc95kQkqU-rEnAg@mail.gmail.comSigned-off-by: NStephen Zhang <starzhangzsd@gmail.com> Reviewed-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yang Shi 提交于
In the current implementation unmap_and_move() would return -ENOMEM if THP migration is unsupported, then the THP will be split. If split is failed just exit without trying to migrate other pages. It doesn't make too much sense since there may be enough free memory to migrate other pages and there may be a lot base pages on the list. Return -ENOSYS to make consistent with hugetlb. And if THP split is failed just skip and try other pages on the list. Just skip the whole list and exit when free memory is really low. Link: https://lkml.kernel.org/r/20201113205359.556831-6-shy828301@gmail.comSigned-off-by: NYang Shi <shy828301@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Song Liu <songliubraving@fb.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yang Shi 提交于
The migrate_prep{_local} never fails, so it is pointless to have return value and check the return value. Link: https://lkml.kernel.org/r/20201113205359.556831-5-shy828301@gmail.comSigned-off-by: NYang Shi <shy828301@gmail.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Song Liu <songliubraving@fb.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yang Shi 提交于
The NUMA balancing skip shared exec base page. Since CONFIG_READ_ONLY_THP_FOR_FS was introduced, there are probably shared exec THP, so skip such THPs for NUMA balancing as well. And Willy's regular filesystem THP support patches could create shared exec THP wven without that config. In addition, the page_is_file_lru() is used to tell if the page is file cache or not, but it filters out shmem page. It sounds like a typical usecase by putting executables in shmem to achieve performance gain via using shmem-THP, so it sounds worth skipping migration for such case too. Link: https://lkml.kernel.org/r/20201113205359.556831-4-shy828301@gmail.comSigned-off-by: NYang Shi <shy828301@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Jan Kara <jack@suse.cz> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Song Liu <songliubraving@fb.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yang Shi 提交于
When unmap_and_move{_huge_page}() returns !-EAGAIN and !MIGRATEPAGE_SUCCESS, the page would be put back to LRU or proper list if it is non-LRU movable page. But, the callers always call putback_movable_pages() to put the failed pages back later on, so it seems not very efficient to put every single page back immediately, and the code looks convoluted. Put the failed page on a separate list, then splice the list to migrate list when all pages are tried. It is the caller's responsibility to call putback_movable_pages() to handle failures. This also makes the code simpler and more readable. After the change the rules are: * Success: non hugetlb page will be freed, hugetlb page will be put back * -EAGAIN: stay on the from list * -ENOMEM: stay on the from list * Other errno: put on ret_pages list then splice to from list The from list would be empty iff all pages are migrated successfully, it was not so before. This has no impact to current existing callsites. Link: https://lkml.kernel.org/r/20201113205359.556831-3-shy828301@gmail.comSigned-off-by: NYang Shi <shy828301@gmail.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Song Liu <songliubraving@fb.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yang Shi 提交于
Patch series "mm: misc migrate cleanup and improvement", v3. This patch (of 5): The commit 9f4e41f4 ("mm: refactor truncate_complete_page()") refactored truncate_complete_page(), and it is not existed anymore, correct the comment in vmscan and migrate to avoid confusion. Link: https://lkml.kernel.org/r/20201113205359.556831-1-shy828301@gmail.com Link: https://lkml.kernel.org/r/20201113205359.556831-2-shy828301@gmail.comSigned-off-by: NYang Shi <shy828301@gmail.com> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Song Liu <songliubraving@fb.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ralph Campbell 提交于
When migrating a zero page or pte_none() anonymous page to device private memory, migrate_vma_setup() will initialize the src[] array with a NULL PFN. This lets the device driver allocate device private memory and clear it instead of DMAing a page of zeros over the device bus. Since the source page didn't exist at the time, no struct page was locked nor a migration PTE inserted into the CPU page tables. The actual PTE insertion happens in migrate_vma_pages() when it tries to insert the device private struct page PTE into the CPU page tables. migrate_vma_pages() has to call the mmu notifiers again since another device could fault on the same page before the page table locks are acquired. Allow device drivers to optimize the invalidation similar to migrate_vma_setup() by calling mmu_notifier_range_init() which sets struct mmu_notifier_range event type to MMU_NOTIFY_MIGRATE and the migrate_pgmap_owner field. Link: https://lkml.kernel.org/r/20201021191335.10916-1-rcampbell@nvidia.comSigned-off-by: NRalph Campbell <rcampbell@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Long Li 提交于
The word in the comment is misspelled, it should be "include". Link: https://lkml.kernel.org/r/20201024114144.GA20552@lilongSigned-off-by: NLong Li <lonuxli.64@gmail.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Shakeel Butt 提交于
Since commit 369ea824 ("mm/rmap: update to new mmu_notifier semantic v2"), the code to check the secondary MMU's page table access bit is broken for !(TTU_IGNORE_ACCESS) because the page is unmapped from the secondary MMU's page table before the check. More specifically for those secondary MMUs which unmap the memory in mmu_notifier_invalidate_range_start() like kvm. However memory reclaim is the only user of !(TTU_IGNORE_ACCESS) or the absence of TTU_IGNORE_ACCESS and it explicitly performs the page table access check before trying to unmap the page. So, at worst the reclaim will miss accesses in a very short window if we remove page table access check in unmapping code. There is an unintented consequence of !(TTU_IGNORE_ACCESS) for the memcg reclaim. From memcg reclaim the page_referenced() only account the accesses from the processes which are in the same memcg of the target page but the unmapping code is considering accesses from all the processes, so, decreasing the effectiveness of memcg reclaim. The simplest solution is to always assume TTU_IGNORE_ACCESS in unmapping code. Link: https://lkml.kernel.org/r/20201104231928.1494083-1-shakeelb@google.com Fixes: 369ea824 ("mm/rmap: update to new mmu_notifier semantic v2") Signed-off-by: NShakeel Butt <shakeelb@google.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 11月, 2020 1 次提交
-
-
由 Mike Kravetz 提交于
Qian Cai reported the following BUG in [1] LTP: starting move_pages12 BUG: unable to handle page fault for address: ffffffffffffffe0 ... RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170 avc_start_pgoff at mm/interval_tree.c:63 Call Trace: rmap_walk_anon+0x141/0xa30 rmap_walk_anon at mm/rmap.c:1864 try_to_unmap+0x209/0x2d0 try_to_unmap at mm/rmap.c:1763 migrate_pages+0x1005/0x1fb0 move_pages_and_store_status.isra.47+0xd7/0x1a0 __x64_sys_move_pages+0xa5c/0x1100 do_syscall_64+0x5f/0x310 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Hugh Dickins diagnosed this as a migration bug caused by code introduced to use i_mmap_rwsem for pmd sharing synchronization. Specifically, the routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED flag to try_to_unmap() while holding i_mmap_rwsem. This is wrong for anon pages as the anon_vma_lock should be held in this case. Further analysis suggested that i_mmap_rwsem was not required to he held at all when calling try_to_unmap for anon pages as an anon page could never be part of a shared pmd mapping. Discussion also revealed that the hack in hugetlb_page_mapping_lock_write to drop page lock and acquire i_mmap_rwsem is wrong. There is no way to keep mapping valid while dropping page lock. This patch does the following: - Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when calling try_to_unmap. - Remove the hacky code in hugetlb_page_mapping_lock_write. The routine will now simply do a 'trylock' while still holding the page lock. If the trylock fails, it will return NULL. This could impact the callers: - migration calling code will receive -EAGAIN and retry up to the hard coded limit (10). - memory error code will treat the page as BUSY. This will force killing (SIGKILL) instead of SIGBUS any mapping tasks. Do note that this change in behavior only happens when there is a race. None of the standard kernel testing suites actually hit this race, but it is possible. [1] https://lore.kernel.org/lkml/20200708012044.GC992@lca.pw/ [2] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.anvils/ Fixes: c0d0381a ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") Reported-by: NQian Cai <cai@lca.pw> Suggested-by: NHugh Dickins <hughd@google.com> Signed-off-by: NMike Kravetz <mike.kravetz@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NNaoya Horiguchi <naoya.horiguchi@nec.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201105195058.78401-1-mike.kravetz@oracle.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 10月, 2020 1 次提交
-
-
由 Miaohe Lin 提交于
There is no need to check if this process has the right to modify the specified process when they are same. And we could also skip the security hook call if a process is modifying its own pages. Add helper function to handle these. Suggested-by: NMatthew Wilcox <willy@infradead.org> Signed-off-by: NHongxiang Lou <louhongxiang@huawei.com> Signed-off-by: NMiaohe Lin <linmiaohe@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Christopher Lameter <cl@linux.com> Link: https://lkml.kernel.org/r/20200819083331.19012-1-linmiaohe@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 10月, 2020 1 次提交
-
-
由 Oscar Salvador 提交于
This patch changes the way we set and handle in-use poisoned pages. Until now, poisoned pages were released to the buddy allocator, trusting that the checks that take place at allocation time would act as a safe net and would skip that page. This has proved to be wrong, as we got some pfn walkers out there, like compaction, that all they care is the page to be in a buddy freelist. Although this might not be the only user, having poisoned pages in the buddy allocator seems a bad idea as we should only have free pages that are ready and meant to be used as such. Before explaining the taken approach, let us break down the kind of pages we can soft offline. - Anonymous THP (after the split, they end up being 4K pages) - Hugetlb - Order-0 pages (that can be either migrated or invalited) * Normal pages (order-0 and anon-THP) - If they are clean and unmapped page cache pages, we invalidate then by means of invalidate_inode_page(). - If they are mapped/dirty, we do the isolate-and-migrate dance. Either way, do not call put_page directly from those paths. Instead, we keep the page and send it to page_handle_poison to perform the right handling. page_handle_poison sets the HWPoison flag and does the last put_page. Down the chain, we placed a check for HWPoison page in free_pages_prepare, that just skips any poisoned page, so those pages do not end up in any pcplist/freelist. After that, we set the refcount on the page to 1 and we increment the poisoned pages counter. If we see that the check in free_pages_prepare creates trouble, we can always do what we do for free pages: - wait until the page hits buddy's freelists - take it off, and flag it The downside of the above approach is that we could race with an allocation, so by the time we want to take the page off the buddy, the page has been already allocated so we cannot soft offline it. But the user could always retry it. * Hugetlb pages - We isolate-and-migrate them After the migration has been successful, we call dissolve_free_huge_page, and we set HWPoison on the page if we succeed. Hugetlb has a slightly different handling though. While for non-hugetlb pages we cared about closing the race with an allocation, doing so for hugetlb pages requires quite some additional and intrusive code (we would need to hook in free_huge_page and some other places). So I decided to not make the code overly complicated and just fail normally if the page we allocated in the meantime. We can always build on top of this. As a bonus, because of the way we handle now in-use pages, we no longer need the put-as-isolation-migratetype dance, that was guarding for poisoned pages to end up in pcplists. Signed-off-by: NOscar Salvador <osalvador@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NNaoya Horiguchi <naoya.horiguchi@nec.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Aristeu Rozanski <aris@ruivo.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Yakunin <zeil@yandex-team.ru> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.com> Cc: Qian Cai <cai@lca.pw> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200922135650.1634-10-osalvador@suse.deSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 10月, 2020 2 次提交
-
-
由 Ralph Campbell 提交于
Device public memory never had an in tree consumer and was removed in commit 25b2995a ("mm: remove MEMORY_DEVICE_PUBLIC support"). Delete the obsolete comment. Signed-off-by: NRalph Campbell <rcampbell@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NJason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20200827190735.12752-2-rcampbell@nvidia.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ralph Campbell 提交于
The variable struct migrate_vma->cpages is only used in migrate_vma_setup(). There is no need to decrement it in migrate_vma_finalize() since it is never checked. Signed-off-by: NRalph Campbell <rcampbell@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20200827190735.12752-1-rcampbell@nvidia.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 27 9月, 2020 1 次提交
-
-
由 Zi Yan 提交于
PageTransHuge returns true for both thp and hugetlb, so thp stats was counting both thp and hugetlb migrations. Exclude hugetlb migration by setting is_thp variable right. Clean up thp handling code too when we are there. Fixes: 1a5bae25 ("mm/vmstat: add events for THP migration without split") Signed-off-by: NZi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDaniel Jordan <daniel.m.jordan@oracle.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lkml.kernel.org/r/20200917210413.1462975-1-zi.yan@sent.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 9月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
Replace the two negative flags that are always used together with a single positive flag that indicates the writeback capability instead of two related non-capabilities. Also remove the pointless wrappers to just check the flag. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 20 9月, 2020 1 次提交
-
-
由 Hugh Dickins 提交于
hugetlbfs pages do not participate in memcg: so although they do find most of migrate_page_states() useful, it would be better if they did not call into mem_cgroup_migrate() - where Qian Cai reported that LTP's move_pages12 triggers the warning in Alex Shi's prospective commit "mm/memcg: warning on !memcg after readahead page charged". Signed-off-by: NHugh Dickins <hughd@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NShakeel Butt <shakeelb@google.com> Acked-by: NJohannes Weiner <hannes@cmpxch.org> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Qian Cai <cai@lca.pw> Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301359460.5954@eggly.anvilsSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 9月, 2020 4 次提交
-
-
由 Ralph Campbell 提交于
The code to remove a migration PTE and replace it with a device private PTE was not copying the soft dirty bit from the migration entry. This could lead to page contents not being marked dirty when faulting the page back from device private memory. Signed-off-by: NRalph Campbell <rcampbell@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Bharata B Rao <bharata@linux.ibm.com> Link: https://lkml.kernel.org/r/20200831212222.22409-3-rcampbell@nvidia.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ralph Campbell 提交于
Patch series "mm/migrate: preserve soft dirty in remove_migration_pte()". I happened to notice this from code inspection after seeing Alistair Popple's patch ("mm/rmap: Fixup copying of soft dirty and uffd ptes"). This patch (of 2): The check for is_zone_device_page() and is_device_private_page() is unnecessary since the latter is sufficient to determine if the page is a device private page. Simplify the code for easier reading. Signed-off-by: NRalph Campbell <rcampbell@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Bharata B Rao <bharata@linux.ibm.com> Link: https://lkml.kernel.org/r/20200831212222.22409-1-rcampbell@nvidia.com Link: https://lkml.kernel.org/r/20200831212222.22409-2-rcampbell@nvidia.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alistair Popple 提交于
During memory migration a pte is temporarily replaced with a migration swap pte. Some pte bits from the existing mapping such as the soft-dirty and uffd write-protect bits are preserved by copying these to the temporary migration swap pte. However these bits are not stored at the same location for swap and non-swap ptes. Therefore testing these bits requires using the appropriate helper function for the given pte type. Unfortunately several code locations were found where the wrong helper function is being used to test soft_dirty and uffd_wp bits which leads to them getting incorrectly set or cleared during page-migration. Fix these by using the correct tests based on pte type. Fixes: a5430dda ("mm/migrate: support un-addressable ZONE_DEVICE page in migration") Fixes: 8c3328f1 ("mm/migrate: migrate_vma() unmap page from vma while collecting pages") Fixes: f45ec5ff ("userfaultfd: wp: support swap and page migration") Signed-off-by: NAlistair Popple <alistair@popple.id.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NPeter Xu <peterx@redhat.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Alistair Popple <alistair@popple.id.au> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200825064232.10023-2-alistair@popple.id.auSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alistair Popple 提交于
Commit f45ec5ff ("userfaultfd: wp: support swap and page migration") introduced support for tracking the uffd wp bit during page migration. However the non-swap PTE variant was used to set the flag for zone device private pages which are a type of swap page. This leads to corruption of the swap offset if the original PTE has the uffd_wp flag set. Fixes: f45ec5ff ("userfaultfd: wp: support swap and page migration") Signed-off-by: NAlistair Popple <alistair@popple.id.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NPeter Xu <peterx@redhat.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Ralph Campbell <rcampbell@nvidia.com> Link: https://lkml.kernel.org/r/20200825064232.10023-1-alistair@popple.id.auSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 8月, 2020 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
The thp prefix is more frequently used than hpage and we should be consistent between the various functions. [akpm@linux-foundation.org: fix mm/migrate.c] Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NWilliam Kucharski <william.kucharski@oracle.com> Reviewed-by: NZi Yan <ziy@nvidia.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20200629151959.15779-6-willy@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 8月, 2020 9 次提交
-
-
由 Joonsoo Kim 提交于
There is a well-defined migration target allocation callback. Use it. Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Roman Gushchin <guro@fb.com> Link: http://lkml.kernel.org/r/1594622517-20681-7-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joonsoo Kim 提交于
There are some similar functions for migration target allocation. Since there is no fundamental difference, it's better to keep just one rather than keeping all variants. This patch implements base migration target allocation function. In the following patches, variants will be converted to use this function. Changes should be mechanical, but, unfortunately, there are some differences. First, some callers' nodemask is assgined to NULL since NULL nodemask will be considered as all available nodes, that is, &node_states[N_MEMORY]. Second, for hugetlb page allocation, gfp_mask is redefined as regular hugetlb allocation gfp_mask plus __GFP_THISNODE if user provided gfp_mask has it. This is because future caller of this function requires to set this node constaint. Lastly, if provided nodeid is NUMA_NO_NODE, nodeid is set up to the node where migration source lives. It helps to remove simple wrappers for setting up the nodeid. Note that PageHighmem() call in previous function is changed to open-code "is_highmem_idx()" since it provides more readability. [akpm@linux-foundation.org: tweak patch title, per Vlastimil] [akpm@linux-foundation.org: fix typo in comment] Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Roman Gushchin <guro@fb.com> Link: http://lkml.kernel.org/r/1594622517-20681-6-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joonsoo Kim 提交于
mm/migrate: clear __GFP_RECLAIM to make the migration callback consistent with regular THP allocations new_page_nodemask is a migration callback and it tries to use a common gfp flags for the target page allocation whether it is a base page or a THP. The later only adds GFP_TRANSHUGE to the given mask. This results in the allocation being slightly more aggressive than necessary because the resulting gfp mask will contain also __GFP_RECLAIM_KSWAPD. THP allocations usually exclude this flag to reduce over eager background reclaim during a high THP allocation load which has been seen during large mmaps initialization. There is no indication that this is a problem for migration as well but theoretically the same might happen when migrating large mappings to a different node. Make the migration callback consistent with regular THP allocations. [akpm@linux-foundation.org: fix comment typo, per Vlastimil] Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Roman Gushchin <guro@fb.com> Link: http://lkml.kernel.org/r/1594622517-20681-5-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joonsoo Kim 提交于
There is no difference between two migration callback functions, alloc_huge_page_node() and alloc_huge_page_nodemask(), except __GFP_THISNODE handling. It's redundant to have two almost similar functions in order to handle this flag. So, this patch tries to remove one by introducing a new argument, gfp_mask, to alloc_huge_page_nodemask(). After introducing gfp_mask argument, it's caller's job to provide correct gfp_mask. So, every callsites for alloc_huge_page_nodemask() are changed to provide gfp_mask. Note that it's safe to remove a node id check in alloc_huge_page_node() since there is no caller passing NUMA_NO_NODE as a node id. Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Roman Gushchin <guro@fb.com> Link: http://lkml.kernel.org/r/1594622517-20681-4-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joonsoo Kim 提交于
It's not performance sensitive function. Move it to .c. This is a preparation step for future change. Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMike Kravetz <mike.kravetz@oracle.com> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Roman Gushchin <guro@fb.com> Link: http://lkml.kernel.org/r/1594622517-20681-3-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Randy Dunlap 提交于
Drop the repeated word "and". Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NZi Yan <ziy@nvidia.com> Link: http://lkml.kernel.org/r/20200801173822.14973-8-rdunlap@infradead.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Anshuman Khandual 提交于
Add following new vmstat events which will help in validating THP migration without split. Statistics reported through these new VM events will help in performance debugging. 1. THP_MIGRATION_SUCCESS 2. THP_MIGRATION_FAILURE 3. THP_MIGRATION_SPLIT In addition, these new events also update normal page migration statistics appropriately via PGMIGRATE_SUCCESS and PGMIGRATE_FAILURE. While here, this updates current trace event 'mm_migrate_pages' to accommodate now available THP statistics. [akpm@linux-foundation.org: s/hpage_nr_pages/thp_nr_pages/] [ziy@nvidia.com: v2] Link: http://lkml.kernel.org/r/C5E3C65C-8253-4638-9D3C-71A61858BB8B@nvidia.com [anshuman.khandual@arm.com: s/thp_nr_pages/hpage_nr_pages/] Link: http://lkml.kernel.org/r/1594287583-16568-1-git-send-email-anshuman.khandual@arm.comSigned-off-by: NAnshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: NZi Yan <ziy@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NDaniel Jordan <daniel.m.jordan@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Zi Yan <ziy@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Link: http://lkml.kernel.org/r/1594080415-27924-1-git-send-email-anshuman.khandual@arm.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ralph Campbell 提交于
Patch series "mm/migrate: optimize migrate_vma_setup() for holes". A simple optimization for migrate_vma_*() when the source vma is not an anonymous vma and a new test case to exercise it. This patch (of 2): When migrating system memory to device private memory, if the source address range is a valid VMA range and there is no memory or a zero page, the source PFN array is marked as valid but with no PFN. This lets the device driver allocate private memory and clear it, then insert the new device private struct page into the CPU's page tables when migrate_vma_pages() is called. migrate_vma_pages() only inserts the new page if the VMA is an anonymous range. There is no point in telling the device driver to allocate device private memory and then not migrate the page. Instead, mark the source PFN array entries as not migrating to avoid this overhead. [rcampbell@nvidia.com: v2] Link: http://lkml.kernel.org/r/20200710194840.7602-2-rcampbell@nvidia.comSigned-off-by: NRalph Campbell <rcampbell@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: "Bharata B Rao" <bharata@linux.ibm.com> Cc: Shuah Khan <shuah@kernel.org> Link: http://lkml.kernel.org/r/20200710194840.7602-1-rcampbell@nvidia.com Link: http://lkml.kernel.org/r/20200709165711.26584-1-rcampbell@nvidia.com Link: http://lkml.kernel.org/r/20200709165711.26584-2-rcampbell@nvidia.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joonsoo Kim 提交于
In current implementation, newly created or swap-in anonymous page is started on active list. Growing active list results in rebalancing active/inactive list so old pages on active list are demoted to inactive list. Hence, the page on active list isn't protected at all. Following is an example of this situation. Assume that 50 hot pages on active list. Numbers denote the number of pages on active/inactive list (active | inactive). 1. 50 hot pages on active list 50(h) | 0 2. workload: 50 newly created (used-once) pages 50(uo) | 50(h) 3. workload: another 50 newly created (used-once) pages 50(uo) | 50(uo), swap-out 50(h) This patch tries to fix this issue. Like as file LRU, newly created or swap-in anonymous pages will be inserted to the inactive list. They are promoted to active list if enough reference happens. This simple modification changes the above example as following. 1. 50 hot pages on active list 50(h) | 0 2. workload: 50 newly created (used-once) pages 50(h) | 50(uo) 3. workload: another 50 newly created (used-once) pages 50(h) | 50(uo), swap-out 50(uo) As you can see, hot pages on active list would be protected. Note that, this implementation has a drawback that the page cannot be promoted and will be swapped-out if re-access interval is greater than the size of inactive list but less than the size of total(active+inactive). To solve this potential issue, following patch will apply workingset detection similar to the one that's already applied to file LRU. Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Link: http://lkml.kernel.org/r/1595490560-15117-3-git-send-email-iamjoonsoo.kim@lge.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 8月, 2020 1 次提交
-
-
由 Ralph Campbell 提交于
On x86_64, when CONFIG_MMU_NOTIFIER is not set/enabled, there is a compiler error: mm/migrate.c: In function 'migrate_vma_collect': mm/migrate.c:2481:7: error: 'struct mmu_notifier_range' has no member named 'migrate_pgmap_owner' range.migrate_pgmap_owner = migrate->pgmap_owner; ^ Fixes: 998427b3 ("mm/notifier: add migration invalidation type") Reported-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NRalph Campbell <rcampbell@nvidia.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Tested-by: NRandy Dunlap <rdunlap@infradead.org> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "Jason Gunthorpe" <jgg@mellanox.com> Link: http://lkml.kernel.org/r/20200806193353.7124-1-rcampbell@nvidia.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 29 7月, 2020 2 次提交
-
-
由 Ralph Campbell 提交于
Currently migrate_vma_setup() calls mmu_notifier_invalidate_range_start() which flushes all device private page mappings whether or not a page is being migrated to/from device private memory. In order to not disrupt device mappings that are not being migrated, shift the responsibility for clearing device private mappings to the device driver and leave CPU page table unmapping handled by migrate_vma_setup(). To support this, the caller of migrate_vma_setup() should always set struct migrate_vma::pgmap_owner to a non NULL value that matches the device private page->pgmap->owner. This value is then passed to the struct mmu_notifier_range with a new event type which the driver's invalidation function can use to avoid device MMU invalidations. Link: https://lore.kernel.org/r/20200723223004.9586-4-rcampbell@nvidia.comSigned-off-by: NRalph Campbell <rcampbell@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
由 Ralph Campbell 提交于
The src_owner field in struct migrate_vma is being used for two purposes, it acts as a selection filter for which types of pages are to be migrated and it identifies device private pages owned by the caller. Split this into separate parameters so the src_owner field can be used just to identify device private pages owned by the caller of migrate_vma_setup(). Rename the src_owner field to pgmap_owner to reflect it is now used only to identify which device private pages to migrate. Link: https://lore.kernel.org/r/20200723223004.9586-3-rcampbell@nvidia.comSigned-off-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NBharata B Rao <bharata@linux.ibm.com> Signed-off-by: NJason Gunthorpe <jgg@nvidia.com>
-
- 09 7月, 2020 1 次提交
-
-
由 Linus Torvalds 提交于
I realize that we fairly recently raised it to 4.8, but the fact is, 4.9 is a much better minimum version to target. We have a number of workarounds for actual bugs in pre-4.9 gcc versions (including things like internal compiler errors on ARM), but we also have some syntactic workarounds for lacking features. In particular, raising the minimum to 4.9 means that we can now just assume _Generic() exists, which is likely the much better replacement for a lot of very convoluted built-time magic with conditionals on sizeof and/or __builtin_choose_expr() with same_type() etc. Using _Generic also means that you will need to have a very recent version of 'sparse', but thats easy to build yourself, and much less of a hassle than some old gcc version can be. The latest (in a long string) of reasons for minimum compiler version upgrades was commit 5435f73d ("efi/x86: Fix build with gcc 4"). Ard points out that RHEL 7 uses gcc-4.8, but the people who stay back on old RHEL versions persumably also don't build their own kernels anyway. And maybe they should cross-built or just have a little side affair with a newer compiler? Acked-by: NArd Biesheuvel <ardb@kernel.org> Acked-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 6月, 2020 2 次提交
-
-
由 Michel Lespinasse 提交于
Convert comments that reference mmap_sem to reference mmap_lock instead. [akpm@linux-foundation.org: fix up linux-next leftovers] [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil] [akpm@linux-foundation.org: more linux-next fixups, per Michel] Signed-off-by: NMichel Lespinasse <walken@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NVlastimil Babka <vbabka@suse.cz> Reviewed-by: NDaniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michel Lespinasse 提交于
Convert comments that reference old mmap_sem APIs to reference corresponding new mmap locking APIs instead. Signed-off-by: NMichel Lespinasse <walken@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NVlastimil Babka <vbabka@suse.cz> Reviewed-by: NDavidlohr Bueso <dbueso@suse.de> Reviewed-by: NDaniel Jordan <daniel.m.jordan@oracle.com> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-12-walken@google.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-