- 01 12月, 2022 1 次提交
-
-
由 Li zeming 提交于
The ei pointer does not need to cast the type. Link: https://lkml.kernel.org/r/20221107015659.3221-1-zeming@nfschina.comSigned-off-by: NLi zeming <zeming@nfschina.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 09 11月, 2022 4 次提交
-
-
由 Hawkins Jiawei 提交于
Syzkaller reports a null-ptr-deref bug as follows: ====================================================== KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] RIP: 0010:hugetlbfs_parse_param+0x1dd/0x8e0 fs/hugetlbfs/inode.c:1380 [...] Call Trace: <TASK> vfs_parse_fs_param fs/fs_context.c:148 [inline] vfs_parse_fs_param+0x1f9/0x3c0 fs/fs_context.c:129 vfs_parse_fs_string+0xdb/0x170 fs/fs_context.c:191 generic_parse_monolithic+0x16f/0x1f0 fs/fs_context.c:231 do_new_mount fs/namespace.c:3036 [inline] path_mount+0x12de/0x1e20 fs/namespace.c:3370 do_mount fs/namespace.c:3383 [inline] __do_sys_mount fs/namespace.c:3591 [inline] __se_sys_mount fs/namespace.c:3568 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd [...] </TASK> ====================================================== According to commit "vfs: parse: deal with zero length string value", kernel will set the param->string to null pointer in vfs_parse_fs_string() if fs string has zero length. Yet the problem is that, hugetlbfs_parse_param() will dereference the param->string, without checking whether it is a null pointer. To be more specific, if hugetlbfs_parse_param() parses an illegal mount parameter, such as "size=,", kernel will constructs struct fs_parameter with null pointer in vfs_parse_fs_string(), then passes this struct fs_parameter to hugetlbfs_parse_param(), which triggers the above null-ptr-deref bug. This patch solves it by adding sanity check on param->string in hugetlbfs_parse_param(). Link: https://lkml.kernel.org/r/20221020231609.4810-1-yin31149@gmail.com Reported-by: syzbot+a3e6acd85ded5c16a709@syzkaller.appspotmail.com Tested-by: syzbot+a3e6acd85ded5c16a709@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/0000000000005ad00405eb7148c6@google.com/Signed-off-by: NHawkins Jiawei <yin31149@gmail.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Hawkins Jiawei <yin31149@gmail.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Sidhartha Kumar 提交于
Remove the last caller of delete_from_page_cache() by converting the code to its folio equivalent. Link: https://lkml.kernel.org/r/20220922154207.1575343-5-sidhartha.kumar@oracle.comSigned-off-by: NSidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Colin Cross <ccross@google.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Sidhartha Kumar 提交于
Allow hugetlbfs_migrate_folio to check and read subpool information by passing in a folio. Link: https://lkml.kernel.org/r/20220922154207.1575343-4-sidhartha.kumar@oracle.comSigned-off-by: NSidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Colin Cross <ccross@google.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Hugh Dickins <hughd@google.com> Cc: kernel test robot <lkp@intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Peter Xu <peterx@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 James Houghton 提交于
This change is very similar to the change that was made for shmem [1], and it solves the same problem but for HugeTLBFS instead. Currently, when poison is found in a HugeTLB page, the page is removed from the page cache. That means that attempting to map or read that hugepage in the future will result in a new hugepage being allocated instead of notifying the user that the page was poisoned. As [1] states, this is effectively memory corruption. The fix is to leave the page in the page cache. If the user attempts to use a poisoned HugeTLB page with a syscall, the syscall will fail with EIO, the same error code that shmem uses. For attempts to map the page, the thread will get a BUS_MCEERR_AR SIGBUS. [1]: commit a7605426 ("mm: shmem: don't truncate page if memory failure happens") Link: https://lkml.kernel.org/r/20221018200125.848471-1-jthoughton@google.comSigned-off-by: NJames Houghton <jthoughton@google.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NNaoya Horiguchi <naoya.horiguchi@nec.com> Tested-by: NNaoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by: NYang Shi <shy828301@gmail.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: James Houghton <jthoughton@google.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 04 10月, 2022 7 次提交
-
-
由 Mike Kravetz 提交于
With the new hugetlb vma lock in place, it can also be used to handle page fault races with file truncation. The lock is taken at the beginning of the code fault path in read mode. During truncation, it is taken in write mode for each vma which has the file mapped. The file's size (i_size) is modified before taking the vma lock to unmap. How are races handled? The page fault code checks i_size early in processing after taking the vma lock. If the fault is beyond i_size, the fault is aborted. If the fault is not beyond i_size the fault will continue and a new page will be added to the file. It could be that truncation code modifies i_size after the check in fault code. That is OK, as truncation code will soon remove the page. The truncation code will wait until the fault is finished, as it must obtain the vma lock in write mode. This patch cleans up/removes late checks in the fault paths that try to back out pages racing with truncation. As noted above, we just let the truncation code remove the pages. [mike.kravetz@oracle.com: fix reserve_alloc set but not used compiler warning] Link: https://lkml.kernel.org/r/Yyj7HsJWfHDoU24U@monkey Link: https://lkml.kernel.org/r/20220914221810.95771-10-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Mike Kravetz 提交于
The new hugetlb vma lock is used to address this race: Faulting thread Unsharing thread ... ... ptep = huge_pte_offset() or ptep = huge_pte_alloc() ... i_mmap_lock_write lock page table ptep invalid <------------------------ huge_pmd_unshare() Could be in a previously unlock_page_table sharing process or worse i_mmap_unlock_write ... The vma_lock is used as follows: - During fault processing. The lock is acquired in read mode before doing a page table lock and allocation (huge_pte_alloc). The lock is held until code is finished with the page table entry (ptep). - The lock must be held in write mode whenever huge_pmd_unshare is called. Lock ordering issues come into play when unmapping a page from all vmas mapping the page. The i_mmap_rwsem must be held to search for the vmas, and the vma lock must be held before calling unmap which will call huge_pmd_unshare. This is done today in: - try_to_migrate_one and try_to_unmap_ for page migration and memory error handling. In these routines we 'try' to obtain the vma lock and fail to unmap if unsuccessful. Calling routines already deal with the failure of unmapping. - hugetlb_vmdelete_list for truncation and hole punch. This routine also tries to acquire the vma lock. If it fails, it skips the unmapping. However, we can not have file truncation or hole punch fail because of contention. After hugetlb_vmdelete_list, truncation and hole punch call remove_inode_hugepages. remove_inode_hugepages checks for mapped pages and call hugetlb_unmap_file_page to unmap them. hugetlb_unmap_file_page is designed to drop locks and reacquire in the correct order to guarantee unmap success. Link: https://lkml.kernel.org/r/20220914221810.95771-9-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> -
由 Mike Kravetz 提交于
Create the new routine hugetlb_unmap_file_folio that will unmap a single file folio. This is refactored code from hugetlb_vmdelete_list. It is modified to do locking within the routine itself and check whether the page is mapped within a specific vma before unmapping. This refactoring will be put to use and expanded upon in a subsequent patch adding vma specific locking. Link: https://lkml.kernel.org/r/20220914221810.95771-8-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Mike Kravetz 提交于
Create the new routine remove_inode_single_folio that will remove a single folio from a file. This is refactored code from remove_inode_hugepages. It checks for the uncommon case in which the folio is still mapped and unmaps. No functional change. This refactoring will be put to use and expanded upon in a subsequent patches. Link: https://lkml.kernel.org/r/20220914221810.95771-5-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Mike Kravetz 提交于
remove_huge_page removes a hugetlb page from the page cache. Change to hugetlb_delete_from_page_cache as it is a more descriptive name. huge_add_to_page_cache is global in scope, but only deals with hugetlb pages. For consistency and clarity, rename to hugetlb_add_to_page_cache. Link: https://lkml.kernel.org/r/20220914221810.95771-4-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Mike Kravetz 提交于
Commit c0d0381a ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") added code to take i_mmap_rwsem in read mode for the duration of fault processing. However, this has been shown to cause performance/scaling issues. Revert the code and go back to only taking the semaphore in huge_pmd_share during the fault path. Keep the code that takes i_mmap_rwsem in write mode before calling try_to_unmap as this is required if huge_pmd_unshare is called. NOTE: Reverting this code does expose the following race condition. Faulting thread Unsharing thread ... ... ptep = huge_pte_offset() or ptep = huge_pte_alloc() ... i_mmap_lock_write lock page table ptep invalid <------------------------ huge_pmd_unshare() Could be in a previously unlock_page_table sharing process or worse i_mmap_unlock_write ... ptl = huge_pte_lock(ptep) get/update pte set_pte_at(pte, ptep) It is unknown if the above race was ever experienced by a user. It was discovered via code inspection when initially addressed. In subsequent patches, a new synchronization mechanism will be added to coordinate pmd sharing and eliminate this race. Link: https://lkml.kernel.org/r/20220914221810.95771-3-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Mike Kravetz 提交于
Patch series "hugetlb: Use new vma lock for huge pmd sharing synchronization", v2. hugetlb fault scalability regressions have recently been reported [1]. This is not the first such report, as regressions were also noted when commit c0d0381a ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") was added [2] in v5.7. At that time, a proposal to address the regression was suggested [3] but went nowhere. The regression and benefit of this patch series is not evident when using the vm_scalability benchmark reported in [2] on a recent kernel. Results from running, "./usemem -n 48 --prealloc --prefault -O -U 3448054972" 48 sample Avg next-20220913 next-20220913 next-20220913 unmodified revert i_mmap_sema locking vma sema locking, this series ----------------------------------------------------------------------------- 498150 KB/s 501934 KB/s 504793 KB/s The recent regression report [1] notes page fault and fork latency of shared hugetlb mappings. To measure this, I created two simple programs: 1) map a shared hugetlb area, write fault all pages, unmap area Do this in a continuous loop to measure faults per second 2) map a shared hugetlb area, write fault a few pages, fork and exit Do this in a continuous loop to measure forks per second These programs were run on a 48 CPU VM with 320GB memory. The shared mapping size was 250GB. For comparison, a single instance of the program was run. Then, multiple instances were run in parallel to introduce lock contention. Changing the locking scheme results in a significant performance benefit. test instances unmodified revert vma -------------------------------------------------------------------------- faults per sec 1 393043 395680 389932 faults per sec 24 71405 81191 79048 forks per sec 1 2802 2747 2725 forks per sec 24 439 536 500 Combined faults 24 1621 68070 53662 Combined forks 24 358 67 142 Combined test is when running both faulting program and forking program simultaneously. Patches 1 and 2 of this series revert c0d0381a and 87bf91d3 which depends on c0d0381a. Acquisition of i_mmap_rwsem is still required in the fault path to establish pmd sharing, so this is moved back to huge_pmd_share. With c0d0381a reverted, this race is exposed: Faulting thread Unsharing thread ... ... ptep = huge_pte_offset() or ptep = huge_pte_alloc() ... i_mmap_lock_write lock page table ptep invalid <------------------------ huge_pmd_unshare() Could be in a previously unlock_page_table sharing process or worse i_mmap_unlock_write ... ptl = huge_pte_lock(ptep) get/update pte set_pte_at(pte, ptep) Reverting 87bf91d3 exposes races in page fault/file truncation. When the new vma lock is put to use in patch 8, this will handle the fault/file truncation races. This is explained in patch 9 where code associated with these races is cleaned up. Patches 3 - 5 restructure existing code in preparation for using the new vma lock (rw semaphore) for pmd sharing synchronization. The idea is that this semaphore will be held in read mode for the duration of fault processing, and held in write mode for unmap operations which may call huge_pmd_unshare. Acquiring i_mmap_rwsem is also still required to synchronize huge pmd sharing. However it is only required in the fault path when setting up sharing, and will be acquired in huge_pmd_share(). Patch 6 adds the new vma lock and all supporting routines, but does not actually change code to use the new lock. Patch 7 refactors code in preparation for using the new lock. And, patch 8 finally adds code to make use of this new vma lock. Unfortunately, the fault code and truncate/hole punch code would naturally take locks in the opposite order which could lead to deadlock. Since the performance of page faults is more important, the truncation/hole punch code is modified to back out and take locks in the correct order if necessary. [1] https://lore.kernel.org/linux-mm/43faf292-245b-5db5-cce9-369d8fb6bd21@infradead.org/ [2] https://lore.kernel.org/lkml/20200622005551.GK5535@shao2-debian/ [3] https://lore.kernel.org/linux-mm/20200706202615.32111-1-mike.kravetz@oracle.com/ This patch (of 9): Commit c0d0381a ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") added code to take i_mmap_rwsem in read mode for the duration of fault processing. The use of i_mmap_rwsem to prevent fault/truncate races depends on this. However, this has been shown to cause performance/scaling issues. As a result, that code will be reverted. Since the use i_mmap_rwsem to address page fault/truncate races depends on this, it must also be reverted. In a subsequent patch, code will be added to detect the fault/truncate race and back out operations as required. Link: https://lkml.kernel.org/r/20220914221810.95771-1-mike.kravetz@oracle.com Link: https://lkml.kernel.org/r/20220914221810.95771-2-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 24 9月, 2022 2 次提交
-
-
由 Miklos Szeredi 提交于
This is in preparation for adding tmpfile support to fuse, which requires that the tmpfile creation and opening are done as a single operation. Replace the 'struct dentry *' argument of i_op->tmpfile with 'struct file *'. Call finish_open_simple() as the last thing in ->tmpfile() instances (may be omitted in the error case). Change d_tmpfile() argument to 'struct file *' as well to make callers more readable. Reviewed-by: NChristian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-
由 Al Viro 提交于
Duplicate the few lines that are shared between hugetlbfs_mknod() and hugetlbfs_tmpfile(). This is a prerequisite for sanely changing the signature of ->tmpfile(). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Reviewed-by: NChristian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-
- 09 8月, 2022 1 次提交
-
-
由 Al Viro 提交于
... since April 2021 Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 03 8月, 2022 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
This involves converting migrate_huge_page_move_mapping(). We also need a folio variant of hugetlb_set_page_subpool(), but that's for a later patch. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Acked-by: NMuchun Song <songmuchun@bytedance.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com>
-
- 30 7月, 2022 5 次提交
-
-
由 Miaohe Lin 提交于
In some cases, e.g. when size option is not specified, f_blocks, f_bavail and f_bfree will be set to -1 instead of 0. Likewise, when nr_inodes isn't specified, f_files and f_ffree will be set to -1 too. Update the comment to make this clear. Link: https://lkml.kernel.org/r/20220726142918.51693-6-linmiaohe@huawei.comSigned-off-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Miaohe Lin 提交于
The function generic_file_buffered_read has been renamed to filemap_read since commit 87fa0f3e ("mm/filemap: rename generic_file_buffered_read to filemap_read"). Update the corresponding comment. And duplicated taken in hugetlbfs_fill_super is removed. Link: https://lkml.kernel.org/r/20220726142918.51693-5-linmiaohe@huawei.comSigned-off-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Miaohe Lin 提交于
The header file signal.h is unneeded now. Remove it. Link: https://lkml.kernel.org/r/20220726142918.51693-4-linmiaohe@huawei.comSigned-off-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Miaohe Lin 提交于
The forward declaration for hugetlbfs_ops is unnecessary. Remove it. Link: https://lkml.kernel.org/r/20220726142918.51693-3-linmiaohe@huawei.comSigned-off-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Miaohe Lin 提交于
Patch series "A few cleanup and fixup patches for hugetlbfs", v2. This series contains a few cleaup patches to remove unneeded forward declaration, use helper macro and so on. More details can be found in the respective changelogs. This patch (of 5): Use helper macro SZ_1K and SZ_1M to do the size conversion. Minor readability improvement. Link: https://lkml.kernel.org/r/20220726142918.51693-1-linmiaohe@huawei.com Link: https://lkml.kernel.org/r/20220726142918.51693-2-linmiaohe@huawei.comSigned-off-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 29 6月, 2022 2 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
Use folios throughout this function. That removes the last caller of huge_pagevec_release(), so delete that too. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Acked-by: NChristian Brauner (Microsoft) <brauner@kernel.org>
-
由 Matthew Wilcox (Oracle) 提交于
Remove the last caller of add_to_page_cache() Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com>
-
- 17 6月, 2022 1 次提交
-
-
由 Mike Kravetz 提交于
hugetlbfs fallocate support was originally added with commit 70c3547e ("hugetlbfs: add hugetlbfs_fallocate()"). Initial support only operated on whole hugetlb pages. This makes sense for populating files as other interfaces such as mmap and truncate require hugetlb page size alignment. Only operating on whole hugetlb pages for the hole punch case was a simplification and there was no compelling use case to zero partial pages. In a recent discussion[1] it was assumed that hugetlbfs hole punch would zero partial hugetlb pages as that is in line with the man page description saying 'partial filesystem blocks are zeroed'. However, the hugetlbfs hole punch code actually does this: hole_start = round_up(offset, hpage_size); hole_end = round_down(offset + len, hpage_size); Modify code to zero partial hugetlb pages in hole punch range. It is possible that application code could note a change in behavior. However, that would imply the code is passing in an unaligned range and expecting only whole pages be removed. This is unlikely as the fallocate documentation states the opposite. The current hugetlbfs fallocate hole punch behavior is tested with the libhugetlbfs test fallocate_align[2]. This test will be updated to validate partial page zeroing. [1] https://lore.kernel.org/linux-mm/20571829-9d3d-0b48-817c-b6b15565f651@redhat.com/ [2] https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/fallocate_align.c Link: https://lkml.kernel.org/r/YqeiMlZDKI1Kabfe@monkeySigned-off-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMuchun Song <songmuchun@bytedance.com> Cc: David Hildenbrand <david@redhat.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 13 5月, 2022 2 次提交
-
-
由 Peter Xu 提交于
As with shmem uffd-wp special ptes, only drop the uffd-wp special swap pte if unmapping an entire vma or synchronized such that faults can not race with the unmap operation. This requires passing zap_flags all the way to the lowest level hugetlb unmap routine: __unmap_hugepage_range. In general, unmap calls originated in hugetlbfs code will pass the ZAP_FLAG_DROP_MARKER flag as synchronization is in place to prevent faults. The exception is hole punch which will first unmap without any synchronization. Later when hole punch actually removes the page from the file, it will check to see if there was a subsequent fault and if so take the hugetlb fault mutex while unmapping again. This second unmap will pass in ZAP_FLAG_DROP_MARKER. The justification of "whether to apply ZAP_FLAG_DROP_MARKER flag when unmap a hugetlb range" is (IMHO): we should never reach a state when a page fault could errornously fault in a page-cache page that was wr-protected to be writable, even in an extremely short period. That could happen if e.g. we pass ZAP_FLAG_DROP_MARKER when hugetlbfs_punch_hole() calls hugetlb_vmdelete_list(), because if a page faults after that call and before remove_inode_hugepages() is executed, the page cache can be mapped writable again in the small racy window, that can cause unexpected data overwritten. [peterx@redhat.com: fix sparse warning] Link: https://lkml.kernel.org/r/Ylcdw8I1L5iAoWhb@xz-m1.local [akpm@linux-foundation.org: move zap_flags_t from mm.h to mm_types.h to fix build issues] Link: https://lkml.kernel.org/r/20220405014915.14873-1-peterx@redhat.comSigned-off-by: NPeter Xu <peterx@redhat.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Mina Almasry 提交于
After commit db71ef79 ("hugetlb: make free_huge_page irq safe"), the subpool lock should be locked with spin_lock_irq() and all call sites was modified as such, except for the ones in hugetlbfs_statfs(). Link: https://lkml.kernel.org/r/20220429202207.3045-1-almasrymina@google.com Fixes: db71ef79 ("hugetlb: make free_huge_page irq safe") Signed-off-by: NMina Almasry <almasrymina@google.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
- 09 5月, 2022 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
There are no more aop flags left, so remove the parameter. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: NChristoph Hellwig <hch@lst.de>
-
- 05 5月, 2022 2 次提交
-
-
由 Christophe Leroy 提交于
Powerpc needs flags and len to make decision on arch_get_mmap_end(). So add them as parameters to arch_get_mmap_end(). Signed-off-by: NChristophe Leroy <christophe.leroy@csgroup.eu> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Acked-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b556daabe7d2bdb2361c4d6130280da7c1ba2c14.1649523076.git.christophe.leroy@csgroup.eu
-
由 Christophe Leroy 提交于
Unlike most architectures, powerpc can only define at runtime if it is going to use the generic arch_get_unmapped_area() or not. Today, powerpc has a copy of the generic arch_get_unmapped_area() because when selection HAVE_ARCH_UNMAPPED_AREA the generic arch_get_unmapped_area() is not available. Rename it generic_get_unmapped_area() and make it independent of HAVE_ARCH_UNMAPPED_AREA. Do the same for arch_get_unmapped_area_topdown() versus HAVE_ARCH_UNMAPPED_AREA_TOPDOWN. Do the same for hugetlb_get_unmapped_area() versus HAVE_ARCH_HUGETLB_UNMAPPED_AREA. Signed-off-by: NChristophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: NNicholas Piggin <npiggin@gmail.com> Acked-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/77f9d3e592f1c8511df9381aa1c4e754651da4d1.1649523076.git.christophe.leroy@csgroup.eu
-
- 22 4月, 2022 1 次提交
-
-
由 Christophe Leroy 提交于
This is a fix for commit f6795053 ("mm: mmap: Allow for "high" userspace addresses") for hugetlb. This patch adds support for "high" userspace addresses that are optionally supported on the system and have to be requested via a hint mechanism ("high" addr parameter to mmap). Architectures such as powerpc and x86 achieve this by making changes to their architectural versions of hugetlb_get_unmapped_area() function. However, arm64 uses the generic version of that function. So take into account arch_get_mmap_base() and arch_get_mmap_end() in hugetlb_get_unmapped_area(). To allow that, move those two macros out of mm/mmap.c into include/linux/sched/mm.h If these macros are not defined in architectural code then they default to (TASK_SIZE) and (base) so should not introduce any behavioural changes to architectures that do not define them. For the time being, only ARM64 is affected by this change. Catalin (ARM64) said "We should have fixed hugetlb_get_unmapped_area() as well when we added support for 52-bit VA. The reason for commit f6795053 was to prevent normal mmap() from returning addresses above 48-bit by default as some user-space had hard assumptions about this. It's a slight ABI change if you do this for hugetlb_get_unmapped_area() but I doubt anyone would notice. It's more likely that the current behaviour would cause issues, so I'd rather have them consistent. Basically when arm64 gained support for 52-bit addresses we did not want user-space calling mmap() to suddenly get such high addresses, otherwise we could have inadvertently broken some programs (similar behaviour to x86 here). Hence we added commit f6795053. But we missed hugetlbfs which could still get such high mmap() addresses. So in theory that's a potential regression that should have bee addressed at the same time as commit f6795053 (and before arm64 enabled 52-bit addresses)" Link: https://lkml.kernel.org/r/ab847b6edb197bffdfe189e70fb4ac76bfe79e0d.1650033747.git.christophe.leroy@csgroup.eu Fixes: f6795053 ("mm: mmap: Allow for "high" userspace addresses") Signed-off-by: NChristophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Cc: Steve Capper <steve.capper@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> [5.0.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 3月, 2022 1 次提交
-
-
由 Muchun Song 提交于
The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb(). Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.comSigned-off-by: NMuchun Song <songmuchun@bytedance.com> Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4] Acked-by: NRoman Gushchin <roman.gushchin@linux.dev> Cc: Alex Shi <alexs@kernel.org> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kari Argillander <kari.argillander@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 3月, 2022 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
This is a mechanical change. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Tested-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs Tested-by: David Howells <dhowells@redhat.com> # afs
-
- 15 1月, 2022 1 次提交
-
-
由 Sean Christopherson 提交于
Pass "end - 1" instead of "end" when walking the interval tree in hugetlb_vmdelete_list() to fix an inclusive vs. exclusive bug. The two callers that pass a non-zero "end" treat it as exclusive, whereas the interval tree iterator expects an inclusive "last". E.g. punching a hole in a file that precisely matches the size of a single hugepage, with a vma starting right on the boundary, will result in unmap_hugepage_range() being called twice, with the second call having start==end. The off-by-one error doesn't cause functional problems as __unmap_hugepage_range() turns into a massive nop due to short-circuiting its for-loop on "address < end". But, the mmu_notifier invocations to invalid_range_{start,end}() are passed a bogus zero-sized range, which may be unexpected behavior for secondary MMUs. The bug was exposed by commit ed922739 ("KVM: Use interval tree to do fast hva lookup in memslots"), currently queued in the KVM tree for 5.17, which added a WARN to detect ranges with start==end. Link: https://lkml.kernel.org/r/20211228234257.1926057-1-seanjc@google.com Fixes: 1bfad99a ("hugetlbfs: hugetlb_vmtruncate_list() needs to take a range to delete") Signed-off-by: NSean Christopherson <seanjc@google.com> Reported-by: syzbot+4e697fe80a31aa7efe21@syzkaller.appspotmail.com Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 10 11月, 2021 1 次提交
-
-
由 zhangyiru 提交于
Commit 21a3c273 ("mm, hugetlb: add thread name and pid to SHM_HUGETLB mlock rlimit warning") marked this as deprecated in 2012, but it is not deleted yet. Mike says he still sees that message in log files on occasion, so maybe we should preserve this warning. Also remove hugetlbfs related user_shm_unlock in ipc/shm.c and remove the user_shm_unlock after out. Link: https://lkml.kernel.org/r/20211103105857.25041-1-zhangyiru3@huawei.comSigned-off-by: Nzhangyiru <zhangyiru3@huawei.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Liu Zixian <liuzixian4@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: wuxu.wu <wuxu.wu@huawei.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 7月, 2021 1 次提交
-
-
由 Mike Kravetz 提交于
In commit 32021982 ("hugetlbfs: Convert to fs_context") processing of the mount mode string was changed from match_octal() to fsparam_u32. This changed existing behavior as match_octal does not require octal values to have a '0' prefix, but fsparam_u32 does. Use fsparam_u32oct which provides the same behavior as match_octal. Link: https://lkml.kernel.org/r/20210721183326.102716-1-mike.kravetz@oracle.com Fixes: 32021982 ("hugetlbfs: Convert to fs_context") Signed-off-by: NMike Kravetz <mike.kravetz@oracle.com> Reported-by: NDennis Camera <bugs+kernel.org@dtnr.ch> Reviewed-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 6月, 2021 1 次提交
-
-
由 Mike Kravetz 提交于
The routine restore_reserve_on_error is called to restore reservation information when an error occurs after page allocation. The routine alloc_huge_page modifies the mapping reserve map and potentially the reserve count during allocation. If code calling alloc_huge_page encounters an error after allocation and needs to free the page, the reservation information needs to be adjusted. Currently, restore_reserve_on_error only takes action on pages for which the reserve count was adjusted(HPageRestoreReserve flag). There is nothing wrong with these adjustments. However, alloc_huge_page ALWAYS modifies the reserve map during allocation even if the reserve count is not adjusted. This can cause issues as observed during development of this patch [1]. One specific series of operations causing an issue is: - Create a shared hugetlb mapping Reservations for all pages created by default - Fault in a page in the mapping Reservation exists so reservation count is decremented - Punch a hole in the file/mapping at index previously faulted Reservation and any associated pages will be removed - Allocate a page to fill the hole No reservation entry, so reserve count unmodified Reservation entry added to map by alloc_huge_page - Error after allocation and before instantiating the page Reservation entry remains in map - Allocate a page to fill the hole Reservation entry exists, so decrement reservation count This will cause a reservation count underflow as the reservation count was decremented twice for the same index. A user would observe a very large number for HugePages_Rsvd in /proc/meminfo. This would also likely cause subsequent allocations of hugetlb pages to fail as it would 'appear' that all pages are reserved. This sequence of operations is unlikely to happen, however they were easily reproduced and observed using hacked up code as described in [1]. Address the issue by having the routine restore_reserve_on_error take action on pages where HPageRestoreReserve is not set. In this case, we need to remove any reserve map entry created by alloc_huge_page. A new helper routine vma_del_reservation assists with this operation. There are three callers of alloc_huge_page which do not currently call restore_reserve_on error before freeing a page on error paths. Add those missing calls. [1] https://lore.kernel.org/linux-mm/20210528005029.88088-1-almasrymina@google.com/ Link: https://lkml.kernel.org/r/20210607204510.22617-1-mike.kravetz@oracle.com Fixes: 96b96a96 ("mm/hugetlb: fix huge page reservation leak in private mapping error paths" Signed-off-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMina Almasry <almasrymina@google.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 23 5月, 2021 1 次提交
-
-
由 Mike Kravetz 提交于
In commit d6995da3 ("hugetlb: use page.private for hugetlb specific page flags") the use of PagePrivate to indicate a reservation count should be restored at free time was changed to the hugetlb specific flag HPageRestoreReserve. Changes to a userfaultfd error path as well as a VM_BUG_ON() in remove_inode_hugepages() were overlooked. Users could see incorrect hugetlb reserve counts if they experience an error with a UFFDIO_COPY operation. Specifically, this would be the result of an unlikely copy_huge_page_from_user error. There is not an increased chance of hitting the VM_BUG_ON. Link: https://lkml.kernel.org/r/20210521233952.236434-1-mike.kravetz@oracle.com Fixes: d6995da3 ("hugetlb: use page.private for hugetlb specific page flags") Signed-off-by: NMike Kravetz <mike.kravetz@oracle.com> Reviewed-by: NMina Almasry <almasry.mina@google.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Mina Almasry <almasrymina@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 15 5月, 2021 1 次提交
-
-
由 Peter Xu 提交于
Patch series "mm/hugetlb: Fix issues on file sealing and fork", v2. Hugh reported issue with F_SEAL_FUTURE_WRITE not applied correctly to hugetlbfs, which I can easily verify using the memfd_test program, which seems that the program is hardly run with hugetlbfs pages (as by default shmem). Meanwhile I found another probably even more severe issue on that hugetlb fork won't wr-protect child cow pages, so child can potentially write to parent private pages. Patch 2 addresses that. After this series applied, "memfd_test hugetlbfs" should start to pass. This patch (of 2): F_SEAL_FUTURE_WRITE is missing for hugetlb starting from the first day. There is a test program for that and it fails constantly. $ ./memfd_test hugetlbfs memfd-hugetlb: CREATE memfd-hugetlb: BASIC memfd-hugetlb: SEAL-WRITE memfd-hugetlb: SEAL-FUTURE-WRITE mmap() didn't fail as expected Aborted (core dumped) I think it's probably because no one is really running the hugetlbfs test. Fix it by checking FUTURE_WRITE also in hugetlbfs_file_mmap() as what we do in shmem_mmap(). Generalize a helper for that. Link: https://lkml.kernel.org/r/20210503234356.9097-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20210503234356.9097-2-peterx@redhat.com Fixes: ab3948f5 ("mm/memfd: add an F_SEAL_FUTURE_WRITE seal to memfd") Signed-off-by: NPeter Xu <peterx@redhat.com> Reported-by: NHugh Dickins <hughd@google.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 5月, 2021 2 次提交
-
-
由 Miaohe Lin 提交于
The local variable pseudo_vma is not used anymore. Link: https://lkml.kernel.org/r/20210410072348.20437-6-linmiaohe@huawei.comSigned-off-by: NMiaohe Lin <linmiaohe@huawei.com> Cc: Feilong Lin <linfeilong@huawei.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miaohe Lin 提交于
The fault_mutex hashing overhead can be avoided in truncate_op case because page faults can not race with truncation in this routine. So calculate hash for fault_mutex only in !truncate_op case to save some cpu cycles. Link: https://lkml.kernel.org/r/20210308112809.26107-6-linmiaohe@huawei.comSigned-off-by: NMiaohe Lin <linmiaohe@huawei.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-