- 29 April 2022, 40 commits
-
Submitted by Anshuman Khandual

This defines and exports a platform-specific custom vm_get_page_prot() via subscribing to ARCH_HAS_VM_GET_PAGE_PROT. It localizes arch_vm_get_page_prot() and moves it near vm_get_page_prot().

Link: https://lkml.kernel.org/r/20220414062125.609297-4-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Anshuman Khandual

This defines and exports a platform-specific custom vm_get_page_prot() via subscribing to ARCH_HAS_VM_GET_PAGE_PROT. While here, this also localizes arch_vm_get_page_prot() as __vm_get_page_prot() and moves it near vm_get_page_prot().

Link: https://lkml.kernel.org/r/20220414062125.609297-3-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Anshuman Khandual

Patch series "mm/mmap: Drop arch_vm_get_page_prot() and arch_filter_pgprot()", v7.

protection_map[] is an array-based construct that translates a given vm_flags combination into a page protection value. The array is populated by the platform via the exported [__S000 .. __S111] and [__P000 .. __P111] macros. The primary user of protection_map[] is vm_get_page_prot(), which determines the page protection value for a given vm_flags. The vm_get_page_prot() implementation could in turn call the platform overrides arch_vm_get_page_prot() and arch_filter_pgprot(), and some platforms override the protection_map[] entries originally built with __SXXX/__PXXX with different runtime values.

Currently there are multiple layers of abstraction between the platform and generic MM, i.e. the __SXXX/__PXXX macros, protection_map[], arch_vm_get_page_prot() and arch_filter_pgprot(), which together define vm_get_page_prot(). Hence this series proposes to drop the latter two abstraction levels and instead move the responsibility of defining vm_get_page_prot() to the platform itself (still utilizing the generic protection_map[] array), making it clean and simple.

This first introduces ARCH_HAS_VM_GET_PAGE_PROT, which enables platforms to define a custom vm_get_page_prot(), and then starts converting the platforms that define the overrides arch_filter_pgprot() or arch_vm_get_page_prot(), enabling those constructs to be dropped completely.

The series has been inspired by an earlier discussion with Christoph Hellwig: https://lore.kernel.org/all/1632712920-8171-1-git-send-email-anshuman.khandual@arm.com/

This patch (of 7):

Add a new config ARCH_HAS_VM_GET_PAGE_PROT which, when subscribed to, enables a given platform to define its own vm_get_page_prot() while still utilizing the generic protection_map[] array.

Link: https://lkml.kernel.org/r/20220414062125.609297-1-anshuman.khandual@arm.com
Link: https://lkml.kernel.org/r/20220414062125.609297-2-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

Use the helper mlock_future_check() to check whether it is safe to enlarge locked_vm, simplifying the code. Minor readability improvement.

Link: https://lkml.kernel.org/r/20220402032231.64974-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Anshuman Khandual

protection_map[] maps vm_flags access combinations into page protection values, as defined by the platform via the __PXXX and __SXXX macros. The array indices in protection_map[] represent vm_flags access combinations, but they are not very intuitive to derive. Make them clear and explicit.

Link: https://lkml.kernel.org/r/20220404031840.588321-3-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
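The clarified indexing reads like the following sketch (hedged: it reflects the changelog's intent; the exact upstream initializer may differ in qualifiers and spacing). Each slot is named by the vm_flags combination it translates instead of a bare numeric position:

```c
/* protection_map[] with explicit, self-describing indices */
pgprot_t protection_map[16] = {
	[VM_NONE]					= __P000,
	[VM_READ]					= __P001,
	[VM_WRITE]					= __P010,
	[VM_WRITE | VM_READ]				= __P011,
	[VM_EXEC]					= __P100,
	[VM_EXEC | VM_READ]				= __P101,
	[VM_EXEC | VM_WRITE]				= __P110,
	[VM_EXEC | VM_WRITE | VM_READ]			= __P111,
	[VM_SHARED]					= __S000,
	[VM_SHARED | VM_READ]				= __S001,
	[VM_SHARED | VM_WRITE]				= __S010,
	[VM_SHARED | VM_WRITE | VM_READ]		= __S011,
	[VM_SHARED | VM_EXEC]				= __S100,
	[VM_SHARED | VM_EXEC | VM_READ]			= __S101,
	[VM_SHARED | VM_EXEC | VM_WRITE]		= __S110,
	[VM_SHARED | VM_EXEC | VM_WRITE | VM_READ]	= __S111
};
```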
-
Submitted by Anshuman Khandual

Patch series "mm: protection_map[] cleanups".

This patch (of 2):

Although protection_map[] contains the platform-defined page protection map for a given vm_flags combination, vm_get_page_prot() is the right interface to use. This also reduces the dependency on protection_map[], which is going to be dropped completely later on.

Link: https://lkml.kernel.org/r/20220404031840.588321-1-anshuman.khandual@arm.com
Link: https://lkml.kernel.org/r/20220404031840.588321-2-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Jianxing Wang

Freeing a large list of pages can cause rcu_sched starvation on non-preemptible kernels. However, free_unref_page_list() cannot simply cond_resched(), as it may be called in interrupt or atomic context; in particular, atomic context cannot be detected with CONFIG_PREEMPTION=n.

The issue was detected in a guest with 200% KVM CPU overcommit, but I did not see the warning on the host running the same application. I am sure the patch is needed for the guest kernel, but not sure about the host.

To reproduce: set up two virtual machines on one host, each VM with the same number of CPUs as the host and half its memory, then run ltpstress.sh in each VM, and the RCU stall warning appears. The kernel has preemption disabled; append 'preempt=none' to the kernel command line if dynamic preemption is enabled. It could be detected on a Loongson machine (32 cores, 128G memory) and on a ProLiant DL380 Gen9 (x86 E5-2680, 28 cores, 64G memory).

The TLB flush batch count depends on PAGE_SIZE and is too large when PAGE_SIZE > 4K, so limit the free batch count to 512 and add a scheduling point in tlb_batch_pages_flush().

  rcu: rcu_sched kthread starved for 5359 jiffies! g454793 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=19
  [...]
  Call Trace:
    free_unref_page_list+0x19c/0x270
    release_pages+0x3cc/0x498
    tlb_flush_mmu_free+0x44/0x70
    zap_pte_range+0x450/0x738
    unmap_page_range+0x108/0x240
    unmap_vmas+0x74/0xf0
    unmap_region+0xb0/0x120
    do_munmap+0x264/0x438
    vm_munmap+0x58/0xa0
    sys_munmap+0x10/0x20
    syscall_common+0x24/0x38

Link: https://lkml.kernel.org/r/20220317072857.2635262-1-wangjianxing@loongson.cn
Signed-off-by: Jianxing Wang <wangjianxing@loongson.cn>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
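A sketch of the resulting loop shape (hedged: abbreviated; the 512 batch limit and the cond_resched() placement are the substance of the change, the surrounding structure is an assumption):

```c
static void tlb_batch_pages_flush(struct mmu_gather *tlb)
{
	struct mmu_gather_batch *batch;

	for (batch = &tlb->local; batch && batch->nr; batch = batch->next) {
		struct page **pages = batch->pages;

		do {
			/*
			 * Limit each free batch so a scheduling point
			 * is reached regularly even for huge batches.
			 */
			unsigned int nr = min(512U, batch->nr);

			free_pages_and_swap_cache(pages, nr);
			pages += nr;
			batch->nr -= nr;

			cond_resched();
		} while (batch->nr);
	}
	tlb->active = &tlb->local;
}
```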
-
Submitted by Rolf Eike Beer

In case the lock is actually not held at this point.

Link: https://lkml.kernel.org/r/5827758.TJ1SttVevJ@mobilepool36.emlix.com
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Axel Rasmussen

These might not be issues yet, but they make the script more fragile. Also, by fixing them we give a better example to future readers, who might copy/paste or otherwise reuse snippets from our script.

- Use "read -r", since we don't ever want read to interpret '\' characters as escape sequences.
- Quote variables, to deal with spaces properly.
- Use $() instead of the older and harder-to-nest ``.
- Get rid of superfluous "$" prefixes inside arithmetic $(()).

Link: https://lkml.kernel.org/r/20220421224928.1848230-2-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Axel Rasmussen

Previously, each test printed out its own header, dealt with its own return code, etc. By just putting this standard stuff in a function, we can delete more than 300 lines from the script. This also makes adding future tests easier. And it gets rid of various inconsistencies that already exist:

- Some tests correctly deal with ksft_skip, but others don't.
- Some tests just print the executable name, others print arguments, and yet others print some comment in the header.
- Most tests print out a header with two separator lines, but not the HMM smoke test or the memfd_secret test, which only print one.
- We had a redundant "exit" at the end; with all the boilerplate, it's an easy oversight.

Link: https://lkml.kernel.org/r/20220421224928.1848230-1-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Gabriel Krisman Bertazi

This introduces three tests:

1) Sanity check soft-dirty basic semantics: allocate an area, clean it, dirty it, then check whether the soft-dirty bit is flipped.
2) Check VMA reuse: validate the VM_SOFTDIRTY usage.
3) Check soft-dirty on huge pages.

This was motivated by Will Deacon's fix commit 912efa17 ("mm: proc: Invalidate TLB after clearing soft-dirty page state"). I was tracking the same issue that he fixed, and this test would have caught it.

Link: https://lkml.kernel.org/r/20220420084036.4101604-2-usama.anjum@collabora.com
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Will Deacon <will@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
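For readers unfamiliar with the mechanism these tests exercise, here is a minimal standalone sketch, not the selftest itself (the helper name page_soft_dirty() is made up): soft-dirty bits are cleared by writing "4" to /proc/self/clear_refs, and read back as bit 55 of the page's /proc/self/pagemap entry.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int page_soft_dirty(void *addr)
{
	uint64_t entry;
	int fd = open("/proc/self/pagemap", O_RDONLY);
	off_t off = ((uintptr_t)addr / getpagesize()) * sizeof(entry);

	if (fd < 0 || pread(fd, &entry, sizeof(entry), off) != sizeof(entry))
		exit(1);
	close(fd);
	return (entry >> 55) & 1;	/* bit 55: soft-dirty */
}

int main(void)
{
	char *page;
	int fd = open("/proc/self/clear_refs", O_WRONLY);

	if (fd < 0 || posix_memalign((void **)&page, getpagesize(),
				     getpagesize()))
		return 1;
	page[0] = 1;			/* fault the page in */
	write(fd, "4", 1);		/* clear all soft-dirty bits */
	printf("after clear: %d\n", page_soft_dirty(page));	/* 0 */
	page[0] = 2;			/* dirty it again */
	printf("after write: %d\n", page_soft_dirty(page));	/* 1 */
	return 0;
}
```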
-
Submitted by Muhammad Usama Anjum

Bring common functions into a new file while keeping the code as similar as possible. These functions can be used in the new tests, which helps avoid code duplication.

Link: https://lkml.kernel.org/r/20220420084036.4101604-1-usama.anjum@collabora.com
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Sidhartha Kumar

Print three possible reasons why /sys/kernel/debug/gup_test cannot be opened, to help users of this test diagnose failures.

Link: https://lkml.kernel.org/r/20220405214809.3351223-1-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
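A hedged sketch of the diagnostic; the exact message text in the selftest may differ:

```c
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
	int gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);

	if (gup_fd == -1) {
		perror("open /sys/kernel/debug/gup_test");
		fprintf(stderr,
			"Possible causes:\n"
			" - debugfs is not mounted (mount -t debugfs none /sys/kernel/debug)\n"
			" - CONFIG_GUP_TEST is not enabled in this kernel\n"
			" - insufficient permissions (run as root)\n");
		return 1;
	}
	return 0;
}
```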
-
Submitted by Muchun Song

The only user (DAX) of the range and pmdpp parameters of follow_invalidate_pte() is gone, so it is safe to remove them and make the function static, simplifying the code. This partially reverts the following commits:

09796395 ("mm: add follow_pte_pmd()")
a4d1a885 ("dax: update to new mmu_notifier semantic")

There is only one caller of follow_invalidate_pte() left, so just fold it into follow_pte() and remove it.

Link: https://lkml.kernel.org/r/20220403053957.10770-7-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
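What survives after the fold is the long-standing follow_pte() interface (a hedged sketch of the declaration, per mm/memory.c):

```c
/* Look up the pte mapping 'address', returning it mapped and locked. */
int follow_pte(struct mm_struct *mm, unsigned long address,
	       pte_t **ptepp, spinlock_t **ptlp);
```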
-
Submitted by Muchun Song

Currently dax_mapping_entry_mkclean() fails to clean and write-protect the pte entry within a DAX PMD entry during a *sync operation. This can result in data loss in the following sequence:

1) Process A mmap-writes to a DAX PMD, dirtying the PMD radix tree entry and making the pmd entry dirty and writeable.
2) Process B mmaps with @offset (e.g. 4K) and @length (e.g. 4K) and writes to the same file, dirtying the PMD radix tree entry (already done in 1)) and making the pte entry dirty and writeable.
3) fsync, flushing out the PMD data and cleaning the radix tree entry. We currently fail to mark the pte entry as clean and write-protected, since the vma of process B is not covered in dax_entry_mkclean().
4) Process B writes to the pte. This causes no page faults since the pte entry is dirty and writeable. The radix tree entry remains clean.
5) fsync, which fails to flush the dirty PMD data because the radix tree entry was clean.
6) Crash: dirty data that should have been fsync'd as part of 5) could still have been in the processor cache, and is lost.

Use pfn_mkclean_range() to clean the pfns and fix this issue.

Link: https://lkml.kernel.org/r/20220403053957.10770-6-songmuchun@bytedance.com
Fixes: 4b4bb46d ("dax: clear dirty entry tags on cache flush")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Muchun Song

Devmap pages cannot currently use page_vma_mapped_walk() to check whether a huge devmap page is mapped into a vma. Add support for walking huge devmap pages so that DAX can use it in the next patch.

Link: https://lkml.kernel.org/r/20220403053957.10770-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Muchun Song

page_mkclean_one() is supposed to be used with a pfn that has an associated struct page, but not all pfns (e.g. DAX) have one. Introduce a new function, pfn_mkclean_range(), to clean the PTEs (including PMDs) mapping a range of pfns that have no struct page associated with them. This helper will be used by the DAX device in the next patch to make pfns clean.

Link: https://lkml.kernel.org/r/20220403053957.10770-4-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
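The new helper's interface, per the changelog (a hedged sketch; the parameter names and the return convention are assumptions):

```c
/*
 * Clean and write-protect the PTEs (including PMDs) that map the pfn
 * range [pfn, pfn + nr_pages) within @vma; the pfns need not have
 * struct pages behind them. @pgoff is the file offset the range maps.
 */
int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages,
		      pgoff_t pgoff, struct vm_area_struct *vma);
```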
-
Submitted by Muchun Song

flush_cache_page() only removes a PAGE_SIZE-sized range from the cache, so it does not cover all the pages of a THP except the head page. Replace it with flush_cache_range() to fix this. This is just a documentation issue with respect to properly documenting the expected usage of cache flushing before modifying the pmd. In practice this is not a problem, because DAX is not available on architectures with virtually indexed caches, per commit d92576f1 ("dax: does not work correctly with virtual aliasing caches").

Link: https://lkml.kernel.org/r/20220403053957.10770-3-songmuchun@bytedance.com
Fixes: f729c8c9 ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
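The shape of the fix (hedged sketch):

```c
/* Before: only one PAGE_SIZE page of the PMD mapping was flushed. */
flush_cache_page(vma, address, pfn);

/* After: cover the full PMD-sized range before wrprotecting the pmd. */
flush_cache_range(vma, address, address + HPAGE_PMD_SIZE);
```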
-
Submitted by Muchun Song

Patch series "Fix some bugs related to rmap and dax", v7.

Patches 1-2 fix a cache flush bug; because subsequent patches depend on those changes, they are placed in this series. Patches 3-4 are preparation for fixing a dax bug in patch 5. Patch 6 is a code cleanup, since the previous patch removes the usage of follow_invalidate_pte().

This patch (of 6):

flush_cache_page() only removes a PAGE_SIZE-sized range from the cache, so it does not cover all the pages of a THP except the head page. Replace it with flush_cache_range() to fix this. At least, no problems were found due to this, perhaps because few architectures have virtually indexed caches.

Link: https://lkml.kernel.org/r/20220403053957.10770-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20220403053957.10770-2-songmuchun@bytedance.com
Fixes: f27176cf ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

We can't assume pte_offset_map_lock() will return the same orig_pte value, so it is necessary to reacquire orig_pte, or pte_unmap_unlock() will unmap the stale pte.

Link: https://lkml.kernel.org/r/20220416081416.23304-1-linmiaohe@huawei.com
Fixes: 9c276cc6 ("mm: introduce MADV_COLD")
Fixes: 854e9ed0 ("mm: support madvise(MADV_FREE)")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
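The fix amounts to keeping orig_pte in sync whenever the page table lock is retaken (a hedged sketch of the pattern in mm/madvise.c):

```c
/* On the initial map/lock and on every re-take after a THP split: */
orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);

/* ... walk the ptes ... */

pte_unmap_unlock(orig_pte, ptl);	/* now unmaps the pte it mapped */
```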
-
Submitted by Oscar Salvador

At the time demote-on-reclaim was introduced, it was tied to CONFIG_HOTPLUG_CPU + CONFIG_MIGRATE, but that is not really accurate. The only two things we need to depend on are CONFIG_NUMA + CONFIG_MIGRATE, so clean this up. Furthermore, we only register the hotplug memory notifier when the system has CONFIG_MEMORY_HOTPLUG.

Link: https://lkml.kernel.org/r/20220322224016.4574-1-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Suggested-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Abhishek Goel <huntbag@linux.vnet.ibm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Baolin Wang

There is no need to validate the hugetlb page's refcount before trying to freeze it to the expected refcount; we can simply rely on page_ref_freeze() for that validation. Moreover, we are always under the page lock when migrating the hugetlb page mapping, which means nowhere else can remove it from the page cache, so we can also remove the xas_load() validation under the i_pages lock.

Link: https://lkml.kernel.org/r/eb2fbbeaef2b1714097b9dec457426d682ee0635.1649676424.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
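A hedged sketch of the simplified path in migrate_huge_page_move_mapping():

```c
xas_lock_irq(&xas);
/*
 * page_ref_freeze() fails unless the refcount is exactly
 * expected_count, so no separate refcount comparison is needed,
 * and the page lock already guarantees the mapping cannot be
 * removed from the page cache underneath us.
 */
if (!page_ref_freeze(page, expected_count)) {
	xas_unlock_irq(&xas);
	return -EAGAIN;
}
```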
-
Submitted by Miaohe Lin

When follow_page() peeks a page, the page could be migrated and then offlined while it is still being used by do_pages_stat_array(). Use FOLL_GET to hold the page refcount and fix this potential race.

Link: https://lkml.kernel.org/r/20220318111709.60311-12-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
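A hedged sketch of the change in do_pages_stat_array():

```c
/*
 * Hold a reference so the page cannot be migrated and offlined
 * while we inspect it.
 */
page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
if (!IS_ERR_OR_NULL(page)) {
	err = page_to_nid(page);
	put_page(page);		/* drop the reference FOLL_GET took */
}
```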
-
Submitted by Miaohe Lin

If we fail to set up the hotplug state callbacks for mm/demotion:online in some corner cases, node_demotion will be left uninitialized, and an invalid node might be returned from next_demotion_node() when doing reclaim-based migration. Use kcalloc to allocate node_demotion to fix the issue.

Link: https://lkml.kernel.org/r/20220318111709.60311-11-linmiaohe@huawei.com
Fixes: ac16ec83 ("mm: migrate: support multiple target nodes demotion")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
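A hedged sketch of the allocation; the exact init-path placement is an assumption:

```c
/*
 * Zero-initialized, so next_demotion_node() sees no stale targets
 * even if the hotplug callbacks were never set up.
 */
node_demotion = kcalloc(nr_node_ids, sizeof(struct demotion_nodes),
			GFP_KERNEL);
if (!node_demotion)
	return;
```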
-
Submitted by Miaohe Lin

In the -ENOMEM case, there might be some subpages of failed-to-migrate THPs left on the thp_split_pages list. Move them back to the migration list so that they can be put back on the right list by the caller; otherwise the page refcounts will be leaked here. Also adjust nr_failed and nr_thp_failed accordingly to make the vm events accounting more accurate.

Link: https://lkml.kernel.org/r/20220318111709.60311-10-linmiaohe@huawei.com
Fixes: b5bade97 ("mm: migrate: fix the return value of migrate_pages()")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

Remove the duplicated code in migrate_pages() to simplify it. Minor readability improvement. No functional change intended.

Link: https://lkml.kernel.org/r/20220318111709.60311-9-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

Avoid unneeded next_pass and this_pass initialization, as they are always set before being used. This saves some possible cpu cycles when there are plenty of nodes in the system.

Link: https://lkml.kernel.org/r/20220318111709.60311-8-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

Use the helper macro min() to set chunk_nr and simplify the code.

Link: https://lkml.kernel.org/r/20220318111709.60311-7-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

Use the helper function vma_lookup() to look up the needed vma and simplify the code.

Link: https://lkml.kernel.org/r/20220318111709.60311-6-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
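vma_lookup() folds the usual find_vma() plus start-address check into a single call (a hedged sketch of the pattern):

```c
/*
 * Before: find_vma() can return a vma *above* addr, so an extra
 * range check was required.
 */
vma = find_vma(mm, addr);
if (!vma || addr < vma->vm_start)
	goto set_status;

/* After: vma_lookup() returns the vma only if addr falls inside it. */
vma = vma_lookup(mm, addr);
if (!vma)
	goto set_status;
```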
-
Submitted by Miaohe Lin

We can use page_is_file_lru() directly to help account the isolated pages, simplifying the code a bit.

Link: https://lkml.kernel.org/r/20220318111709.60311-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

Patch series "A few cleanup and fixup patches for migration", v2.

This series contains a few patches to remove unneeded variables and jump labels, and to use helpers to simplify the code. It also fixes some bugs, such as a page refcount leak, invalid node access, and so on. More details can be found in the respective changelogs.

This patch (of 11):

When mapping_locked is true, TTU_RMAP_LOCKED is always set in ttu. We can check ttu instead, so mapping_locked can be removed. And ttu is either 0 or TTU_RMAP_LOCKED now; change '|=' to '=' to reflect that.

Link: https://lkml.kernel.org/r/20220318111709.60311-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20220318111709.60311-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Alistair Popple

Add some basic migration tests, in particular tests that will stress both the pte and pmd migration entry wait paths.

Link: https://lkml.kernel.org/r/20220324014349.229253-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

Since commit e5947d23 ("mm: mempolicy: don't have to split pmd for huge zero page"), THPs are never split in queue_pages_pmd(), so 2 is never returned now. Remove the unnecessary ret != 2 check and clean up the relevant comment. Minor improvements in readability.

Link: https://lkml.kernel.org/r/20220419122234.45083-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

The compaction sysfs file is created via compaction_register_node() in register_node(), but we forgot to remove it in unregister_node(), so the compaction sysfs file is leaked. Use compaction_unregister_node() to fix this issue.

Link: https://lkml.kernel.org/r/20220401070905.43679-1-linmiaohe@huawei.com
Fixes: ed4a6d7f ("mm: compaction: add /sys trigger for per-node memory compaction")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
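A hedged sketch of the fix in drivers/base/node.c, with the unrelated per-node teardown elided:

```c
void unregister_node(struct node *node)
{
	compaction_unregister_node(node);	/* the missing teardown */
	/* ... other per-node teardown elided ... */
	device_unregister(&node->dev);
}
```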
-
Submitted by Miaohe Lin

Use the helper isolation_suitable() to check whether the page is suitable for isolation, simplifying the code. Minor readability improvement.

Link: https://lkml.kernel.org/r/20220322110750.60311-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

The only caller, z3fold_free(), never calls free_handle() in the PAGE_HEADLESS case. Remove this unneeded check.

Link: https://lkml.kernel.org/r/20220308134311.59086-9-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

do_compact_page() will do list_del_init(&zhdr->buddy) for us. Remove this extra one to save some possible cpu cycles.

Link: https://lkml.kernel.org/r/20220308134311.59086-8-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

z3fold always does atomic64_dec(&pool->pages_nr) when __release_z3fold_page() is called, so we can move the decrement of pool->pages_nr into __release_z3fold_page() to simplify the code. This also reduces the size of z3fold.o by about 1k.

Without this patch:
   text	   data	    bss	    dec	    hex	filename
  15444	   1376	      8	  16828	   41bc	mm/z3fold.o

With this patch:
   text	   data	    bss	    dec	    hex	filename
  15044	   1248	      8	  16300	   3fac	mm/z3fold.o

Link: https://lkml.kernel.org/r/20220308134311.59086-7-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
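A hedged sketch of the refactor: the decrement moves into the release helper so no caller has to remember it.

```c
static void __release_z3fold_page(struct z3fold_header *zhdr, bool locked)
{
	struct z3fold_pool *pool = zhdr_to_pool(zhdr);

	/* ... existing release logic elided ... */

	/* Every caller used to do this decrement itself. */
	atomic64_dec(&pool->pages_nr);
}
```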
-
Submitted by Miaohe Lin

The local variable l holds the address of unbuddied[i], which won't change after we take the pool lock. Remove it to avoid confusion.

Link: https://lkml.kernel.org/r/20220308134311.59086-6-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Submitted by Miaohe Lin

page->page_type and PagePrivate are not used in z3fold. Remove these confusing, unneeded operations; z3fold did them because it borrowed from zsmalloc's migration code, which does need them.

Link: https://lkml.kernel.org/r/20220308134311.59086-5-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-