Commit 6168fe8b authored by Mike Kravetz, committed by Xie XiuQi

hugetlbfs: fix potential over/underflow setting node specific nr_hugepages

mainline inclusion
from mainline-5.1
commit 3ca7a26777dc05b0eb72e233a26400c227ac5d58
category: bugfix
bugzilla: NA
CVE: NA

------------------------------

The number of node specific huge pages can be set via a file such as:
/sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
When a node specific value is specified, the global number of huge pages
must also be adjusted.  This adjustment is calculated as the specified
node specific value + (global value - current node value).  If the node
specific value provided by the user is large enough, this calculation
could overflow an unsigned long leading to a smaller than expected number
of huge pages.

To fix, check the calculation for overflow.  If overflow is detected, use
ULONG_MAX as the requested value.  This is in line with the user request to
allocate as many huge pages as possible.
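
For illustration, here is a minimal userspace-style sketch of that adjustment
and the overflow clamp.  The names global_pages and node_pages are simplified
stand-ins for h->nr_huge_pages and h->nr_huge_pages_node[nid]; this is not the
kernel code itself:

#include <limits.h>
#include <stdio.h>

/*
 * Translate a per-node request into a global target:
 *     count + (global_pages - node_pages)
 * A very large count can wrap the unsigned addition and yield a target far
 * smaller than intended, so clamp the result to ULONG_MAX instead.
 */
static unsigned long adjusted_count(unsigned long count,
				    unsigned long global_pages,
				    unsigned long node_pages)
{
	unsigned long old_count = count;

	count += global_pages - node_pages;
	if (count < old_count)		/* unsigned wraparound detected */
		count = ULONG_MAX;
	return count;
}

int main(void)
{
	/* Normal case: node holds 10 pages, pool holds 100, user asks for 20. */
	printf("%lu\n", adjusted_count(20, 100, 10));		/* prints 110 */

	/* A near-ULONG_MAX request wraps the addition; the clamp restores it. */
	printf("%lu\n", adjusted_count(ULONG_MAX - 5, 100, 10));
	return 0;
}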

It was also noticed that the above calculation was done outside the
hugetlb_lock.  Therefore, the values could be inconsistent and result in
underflow.  To fix, the calculation is moved within the routine
set_max_huge_pages() where the lock is held.
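
The locking point can be sketched in the same userspace style, with a pthread
mutex standing in for hugetlb_lock; again this is only illustrative, not the
kernel code:

#include <limits.h>
#include <pthread.h>

/* Illustrative stand-ins for hugetlb_lock and the hstate counters. */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long global_pages;
static unsigned long node_pages[2];

/*
 * Sample the global and per-node counters and derive the target while the
 * lock is held.  If the counters were read before taking the lock, a
 * concurrent pool update could change them in between, and the computed
 * target would no longer match the pool it is applied to.
 */
static void set_node_target(int nid, unsigned long count)
{
	unsigned long old_count = count;

	pthread_mutex_lock(&pool_lock);
	count += global_pages - node_pages[nid];
	if (count < old_count)
		count = ULONG_MAX;

	/* ... grow or shrink the pool toward 'count' under this same lock ... */

	pthread_mutex_unlock(&pool_lock);
}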

In addition, the code in __nr_hugepages_store_common() which tries to
handle the case of not being able to allocate a node mask would likely
result in incorrect behavior.  Luckily, it is very unlikely we will ever
take this path.  If we do, simply return ENOMEM.

Link: http://lkml.kernel.org/r/8f3aede3-c07e-ac15-1577-7667e5b70d2f@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jing Xiangfeng <jingxiangfeng@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Conflicts:
    mm/hugetlb.c
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Parent 349a9f0c
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2276,13 +2276,35 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
 
 #define persistent_huge_pages(h) (h->nr_huge_pages - h->surplus_huge_pages)
 static unsigned long set_max_huge_pages(struct hstate *h, unsigned long count,
-						nodemask_t *nodes_allowed)
+					int nid, nodemask_t *nodes_allowed)
 {
 	unsigned long min_count, ret;
 
 	if (hstate_is_gigantic(h) && !gigantic_page_supported())
 		return h->max_huge_pages;
 
+	spin_lock(&hugetlb_lock);
+
+	/*
+	 * Check for a node specific request.
+	 * Changing node specific huge page count may require a corresponding
+	 * change to the global count. In any case, the passed node mask
+	 * (nodes_allowed) will restrict alloc/free to the specified node.
+	 */
+	if (nid != NUMA_NO_NODE) {
+		unsigned long old_count = count;
+
+		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
+		/*
+		 * User may have specified a large count value which caused the
+		 * above calculation to overflow. In this case, they wanted
+		 * to allocate as many huge pages as possible. Set count to
+		 * largest possible value to align with their intention.
+		 */
+		if (count < old_count)
+			count = ULONG_MAX;
+	}
+
 	/*
 	 * Increase the pool size
 	 * First take pages out of surplus state. Then make up the
@@ -2294,7 +2316,6 @@ static unsigned long set_max_huge_pages(struct hstate *h, unsigned long count,
 	 * pool might be one hugepage larger than it needs to be, but
 	 * within all the constraints specified by the sysctls.
 	 */
-	spin_lock(&hugetlb_lock);
 	while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
 		if (!adjust_pool_surplus(h, nodes_allowed, -1))
 			break;
@@ -2418,16 +2439,20 @@ static ssize_t __nr_hugepages_store_common(bool obey_mempolicy,
 			nodes_allowed = &node_states[N_MEMORY];
 		}
 	} else if (nodes_allowed) {
+		/* Node specific request */
+		init_nodemask_of_node(nodes_allowed, nid);
+	} else {
 		/*
-		 * per node hstate attribute: adjust count to global,
-		 * but restrict alloc/free to the specified node.
+		 * Node specific request, but we could not allocate the few
+		 * words required for a node mask. We are unlikely to hit
+		 * this condition. Since we can not pass down the appropriate
+		 * node mask, just return ENOMEM.
 		 */
-		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
-		init_nodemask_of_node(nodes_allowed, nid);
-	} else
-		nodes_allowed = &node_states[N_MEMORY];
+		err = -ENOMEM;
+		goto out;
+	}
 
-	h->max_huge_pages = set_max_huge_pages(h, count, nodes_allowed);
+	h->max_huge_pages = set_max_huge_pages(h, count, nid, nodes_allowed);
 
 	if (nodes_allowed != &node_states[N_MEMORY])
 		NODEMASK_FREE(nodes_allowed);