1. 21 6月, 2022 1 次提交
  2. 24 3月, 2022 1 次提交
  3. 08 3月, 2022 1 次提交
  4. 17 1月, 2022 2 次提交
  5. 16 12月, 2021 1 次提交
    • N
      hugetlbfs: flush TLBs correctly after huge_pmd_unshare · 09e1590f
      Nadav Amit 提交于
      stable inclusion
      from linux-v4.19.219
      commit b0313bc7f5fbb6beee327af39d818ffdc921821a
      category: bugfix
      bugzilla: 185854
      CVE: CVE-2021-4002
      
      -----------------------------------------------
      
      commit a4a118f2 upstream.
      
      When __unmap_hugepage_range() calls to huge_pmd_unshare() succeed, a TLB
      flush is missing.  This TLB flush must be performed before releasing the
      i_mmap_rwsem, in order to prevent an unshared PMDs page from being
      released and reused before the TLB flush took place.
      
      Arguably, a comprehensive solution would use mmu_gather interface to
      batch the TLB flushes and the PMDs page release, however it is not an
      easy solution: (1) try_to_unmap_one() and try_to_migrate_one() also call
      huge_pmd_unshare() and they cannot use the mmu_gather interface; and (2)
      deferring the release of the page reference for the PMDs page until
      after i_mmap_rwsem is dropeed can confuse huge_pmd_unshare() into
      thinking PMDs are shared when they are not.
      
      Fix __unmap_hugepage_range() by adding the missing TLB flush, and
      forcing a flush when unshare is successful.
      
      Fixes: 24669e58 ("hugetlb: use mmu_gather instead of a temporary linked list for accumulating pages)" # 3.6
      Signed-off-by: NNadav Amit <namit@vmware.com>
      Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Conflicts:
      	include/asm-generic/tlb.h
      	mm/mmu_gather.c
      Signed-off-by: NLiu Shixin <liushixin2@huawei.com>
      Reviewed-by: Ntong tiangen <tongtiangen@huawei.com>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      09e1590f
  6. 02 12月, 2021 1 次提交
    • M
      hugetlb: before freeing hugetlb page set dtor to appropriate value · c65d13f3
      Mike Kravetz 提交于
      mainline inclusion
      from mainline-5.15-rc1
      commit e32d20c0
      category: bugfix
      bugzilla: 180680
      CVE: NA
      
      ---------------------------
      
      When removing a hugetlb page from the pool the ref count is set to one (as
      the free page has no ref count) and compound page destructor is set to
      NULL_COMPOUND_DTOR.  Since a subsequent call to free the hugetlb page will
      call __free_pages for non-gigantic pages and free_gigantic_page for
      gigantic pages the destructor is not used.
      
      However, consider the following race with code taking a speculative
      reference on the page:
      
      Thread 0				Thread 1
      --------				--------
      remove_hugetlb_page
        set_page_refcounted(page);
        set_compound_page_dtor(page,
                 NULL_COMPOUND_DTOR);
      					get_page_unless_zero(page)
      __update_and_free_page
        __free_pages(page,
                 huge_page_order(h));
      
      		/* Note that __free_pages() will simply drop
      		   the reference to the page. */
      
      					put_page(page)
      					  __put_compound_page()
      					    destroy_compound_page
      					      NULL_COMPOUND_DTOR
      						BUG: kernel NULL pointer
      						dereference, address:
      						0000000000000000
      
      To address this race, set the dtor to the normal compound page dtor for
      non-gigantic pages.  The dtor for gigantic pages does not matter as
      gigantic pages are changed from a compound page to 'just a group of pages'
      before freeing.  Hence, the destructor is not used.
      
      Link: https://lkml.kernel.org/r/20210809184832.18342-4-mike.kravetz@oracle.comSigned-off-by: NMike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: NMuchun Song <songmuchun@bytedance.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
      Cc: Mina Almasry <almasrymina@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Conflicts:
      	mm/hugetlb.c
      Signed-off-by: NChen Wandun <chenwandun@huawei.com>
      Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      c65d13f3
  7. 30 10月, 2021 4 次提交
  8. 22 10月, 2021 3 次提交
  9. 02 8月, 2021 1 次提交
  10. 29 7月, 2021 1 次提交
  11. 19 7月, 2021 1 次提交
    • M
      mm, hugetlb: fix simple resv_huge_pages underflow on UFFDIO_COPY · bf555e76
      Mina Almasry 提交于
      stable inclusion
      from linux-4.19.194
      commit 7de60c2d5a2a66ef2c5d76952a5a3a9a4ea4d436
      
      --------------------------------
      
      [ Upstream commit d84cf06e ]
      
      The userfaultfd hugetlb tests cause a resv_huge_pages underflow.  This
      happens when hugetlb_mcopy_atomic_pte() is called with !is_continue on
      an index for which we already have a page in the cache.  When this
      happens, we allocate a second page, double consuming the reservation,
      and then fail to insert the page into the cache and return -EEXIST.
      
      To fix this, we first check if there is a page in the cache which
      already consumed the reservation, and return -EEXIST immediately if so.
      
      There is still a rare condition where we fail to copy the page contents
      AND race with a call for hugetlb_no_page() for this index and again we
      will underflow resv_huge_pages.  That is fixed in a more complicated
      patch not targeted for -stable.
      
      Test:
      
        Hacked the code locally such that resv_huge_pages underflows produce a
        warning, then:
      
        ./tools/testing/selftests/vm/userfaultfd hugetlb_shared 10
      	2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success
        ./tools/testing/selftests/vm/userfaultfd hugetlb 10
      	2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success
      
      Both tests succeed and produce no warnings.  After the test runs number
      of free/resv hugepages is correct.
      
      [mike.kravetz@oracle.com: changelog fixes]
      
      Link: https://lkml.kernel.org/r/20210528004649.85298-1-almasrymina@google.com
      Fixes: 8fb5debc ("userfaultfd: hugetlbfs: add hugetlb_mcopy_atomic_pte for userfaultfd support")
      Signed-off-by: NMina Almasry <almasrymina@google.com>
      Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      bf555e76
  12. 30 6月, 2021 2 次提交
  13. 14 4月, 2021 4 次提交
  14. 11 3月, 2021 4 次提交
  15. 22 2月, 2021 3 次提交
  16. 15 10月, 2020 1 次提交
  17. 22 9月, 2020 4 次提交
    • M
      mm/hugetlb: fix a race between hugetlb sysctl handlers · 403e3ba3
      Muchun Song 提交于
      mainline inclusion
      from mainline-v5.9-rc4
      commit 17743798
      category: bugfix
      bugzilla: NA
      CVE: CVE-2020-25285
      
      --------------------------------
      
      There is a race between the assignment of `table->data` and write value
      to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on
      the other thread.
      
        CPU0:                                 CPU1:
                                              proc_sys_write
        hugetlb_sysctl_handler                  proc_sys_call_handler
        hugetlb_sysctl_handler_common             hugetlb_sysctl_handler
          table->data = &tmp;                       hugetlb_sysctl_handler_common
                                                      table->data = &tmp;
            proc_doulongvec_minmax
              do_proc_doulongvec_minmax           sysctl_head_finish
                __do_proc_doulongvec_minmax         unuse_table
                  i = table->data;
                  *i = val;  // corrupt CPU1's stack
      
      Fix this by duplicating the `table`, and only update the duplicate of
      it.  And introduce a helper of proc_hugetlb_doulongvec_minmax() to
      simplify the code.
      
      The following oops was seen:
      
          BUG: kernel NULL pointer dereference, address: 0000000000000000
          #PF: supervisor instruction fetch in kernel mode
          #PF: error_code(0x0010) - not-present page
          Code: Bad RIP value.
          ...
          Call Trace:
           ? set_max_huge_pages+0x3da/0x4f0
           ? alloc_pool_huge_page+0x150/0x150
           ? proc_doulongvec_minmax+0x46/0x60
           ? hugetlb_sysctl_handler_common+0x1c7/0x200
           ? nr_hugepages_store+0x20/0x20
           ? copy_fd_bitmaps+0x170/0x170
           ? hugetlb_sysctl_handler+0x1e/0x20
           ? proc_sys_call_handler+0x2f1/0x300
           ? unregister_sysctl_table+0xb0/0xb0
           ? __fd_install+0x78/0x100
           ? proc_sys_write+0x14/0x20
           ? __vfs_write+0x4d/0x90
           ? vfs_write+0xef/0x240
           ? ksys_write+0xc0/0x160
           ? __ia32_sys_read+0x50/0x50
           ? __close_fd+0x129/0x150
           ? __x64_sys_write+0x43/0x50
           ? do_syscall_64+0x6c/0x200
           ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: e5ff2159 ("hugetlb: multiple hstates for multiple page sizes")
      Signed-off-by: NMuchun Song <songmuchun@bytedance.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Link: http://lkml.kernel.org/r/20200828031146.43035-1-songmuchun@bytedance.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      Reviewed-by: NJason Yan <yanaijie@huawei.com>
      Reviewed-by: NKefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      403e3ba3
    • W
      arm64/ascend: Add hugepage flags change interface · db1d159b
      Weilong Chen 提交于
      ascend inclusion
      category: feature
      bugzilla: NA
      CVE: NA
      
      -------------------------------------------------
      
      Add a variable to change alloc hugepage gfp flags, and export it out.
      Hugepage need to be accounted by cgroup. We use this to set ACCOUNT
      flag to memory subsystem.
      Signed-off-by: NWeilong Chen <chenweilong@huawei.com>
      Reviewed-by: NDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      db1d159b
    • W
      arm64/ascend: Add set hugepage number helper function · b6bcd500
      Weilong Chen 提交于
      ascend inclusion
      category: feature
      bugzilla: NA
      CVE: NA
      
      -------------------------------------------------
      
      Add helper function for change system hugepage nr, and export it out.
      Signed-off-by: NWeilong Chen <chenweilong@huawei.com>
      Reviewed-by: NDing Tianhong <dingtianhong@huawei.com>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      b6bcd500
    • L
      mm/hugetlb: fix a addressing exception caused by huge_pte_offset · f053d5db
      Longpeng 提交于
      stable inclusion
      from linux-4.19.119
      commit dcca7d2f751014d8caa3711e93b2119305e837df
      
      --------------------------------
      
      commit 3c1d7e6c upstream.
      
      Our machine encountered a panic(addressing exception) after run for a
      long time and the calltrace is:
      
          RIP: hugetlb_fault+0x307/0xbe0
          RSP: 0018:ffff9567fc27f808  EFLAGS: 00010286
          RAX: e800c03ff1258d48 RBX: ffffd3bb003b69c0 RCX: e800c03ff1258d48
          RDX: 17ff3fc00eda72b7 RSI: 00003ffffffff000 RDI: e800c03ff1258d48
          RBP: ffff9567fc27f8c8 R08: e800c03ff1258d48 R09: 0000000000000080
          R10: ffffaba0704c22a8 R11: 0000000000000001 R12: ffff95c87b4b60d8
          R13: 00005fff00000000 R14: 0000000000000000 R15: ffff9567face8074
          FS:  00007fe2d9ffb700(0000) GS:ffff956900e40000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: ffffd3bb003b69c0 CR3: 000000be67374000 CR4: 00000000003627e0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
            follow_hugetlb_page+0x175/0x540
            __get_user_pages+0x2a0/0x7e0
            __get_user_pages_unlocked+0x15d/0x210
            __gfn_to_pfn_memslot+0x3c5/0x460 [kvm]
            try_async_pf+0x6e/0x2a0 [kvm]
            tdp_page_fault+0x151/0x2d0 [kvm]
           ...
            kvm_arch_vcpu_ioctl_run+0x330/0x490 [kvm]
            kvm_vcpu_ioctl+0x309/0x6d0 [kvm]
            do_vfs_ioctl+0x3f0/0x540
            SyS_ioctl+0xa1/0xc0
            system_call_fastpath+0x22/0x27
      
      For 1G hugepages, huge_pte_offset() wants to return NULL or pudp, but it
      may return a wrong 'pmdp' if there is a race.  Please look at the
      following code snippet:
      
          ...
          pud = pud_offset(p4d, addr);
          if (sz != PUD_SIZE && pud_none(*pud))
              return NULL;
          /* hugepage or swap? */
          if (pud_huge(*pud) || !pud_present(*pud))
              return (pte_t *)pud;
      
          pmd = pmd_offset(pud, addr);
          if (sz != PMD_SIZE && pmd_none(*pmd))
              return NULL;
          /* hugepage or swap? */
          if (pmd_huge(*pmd) || !pmd_present(*pmd))
              return (pte_t *)pmd;
          ...
      
      The following sequence would trigger this bug:
      
       - CPU0: sz = PUD_SIZE and *pud = 0 , continue
       - CPU0: "pud_huge(*pud)" is false
       - CPU1: calling hugetlb_no_page and set *pud to xxxx8e7(PRESENT)
       - CPU0: "!pud_present(*pud)" is false, continue
       - CPU0: pmd = pmd_offset(pud, addr) and maybe return a wrong pmdp
      
      However, we want CPU0 to return NULL or pudp in this case.
      
      We must make sure there is exactly one dereference of pud and pmd.
      Signed-off-by: NLongpeng <longpeng2@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: NJason Gunthorpe <jgg@mellanox.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200413010342.771-1-longpeng2@huawei.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NYang Yingliang <yangyingliang@huawei.com>
      f053d5db
  18. 31 8月, 2020 3 次提交
  19. 27 12月, 2019 2 次提交