Commit 792568ec, authored by Rik van Riel, committed by Ingo Molnar

sched/numa: Count pages on active node as local

The NUMA code is smart enough to distribute the memory of workloads
that span multiple NUMA nodes across those NUMA nodes.

However, it still has a pretty high scan rate for such workloads,
because any memory that is left on a node other than the node of
the CPU that faulted on the memory is counted as non-local, which
causes the scan rate to go up.
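
The feedback described above can be pictured with a toy model. This is only a sketch: the struct, the function names, the 70% threshold, and the doubling/halving policy are all assumptions made for illustration, not the kernel's actual scan-period logic, which is more involved.

```c
/*
 * Toy model of locality-driven scan pacing.
 * All names and constants here are hypothetical, not kernel values.
 */
#define TOY_PERIOD_MIN_MS 1000u
#define TOY_PERIOD_MAX_MS 60000u

struct toy_scan_state {
	unsigned long faults_locality[2];	/* [0] = remote pages, [1] = local pages */
	unsigned int scan_period_ms;		/* longer period = lower scan rate */
};

/* Account a fault of 'pages' pages as local (nonzero) or remote (zero). */
static void toy_record_fault(struct toy_scan_state *s, int local,
			     unsigned long pages)
{
	s->faults_locality[local ? 1 : 0] += pages;
}

/*
 * Scan less often when at least 70% of the recorded pages were local,
 * more often otherwise, then reset the counters for the next window.
 */
static void toy_update_scan_period(struct toy_scan_state *s)
{
	unsigned long local = s->faults_locality[1];
	unsigned long remote = s->faults_locality[0];

	if (local + remote == 0)
		return;

	if (local * 10 >= (local + remote) * 7)
		s->scan_period_ms *= 2;
	else
		s->scan_period_ms /= 2;

	if (s->scan_period_ms < TOY_PERIOD_MIN_MS)
		s->scan_period_ms = TOY_PERIOD_MIN_MS;
	if (s->scan_period_ms > TOY_PERIOD_MAX_MS)
		s->scan_period_ms = TOY_PERIOD_MAX_MS;

	s->faults_locality[0] = 0;
	s->faults_locality[1] = 0;
}
```

In this model, a workload whose memory is spread evenly over two nodes records about half of its faults as remote, so the period keeps shrinking even though the placement is already as good as it can get.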

Counting the memory on any node where the task's numa group is
actively running as local allows the scan rate to slow down
once the application is settled in.

This should reduce the overhead of the automatic NUMA placement
code, when a workload spans multiple NUMA nodes.

Signed-off-by: Rik van Riel <riel@redhat.com>
Tested-by: Vinod Chegu <chegu_vinod@hp.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1397235629-16328-2-git-send-email-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Parent 2fe5de9c
@@ -1738,6 +1738,7 @@ void task_numa_fault(int last_cpupid, int mem_node, int pages, int flags)
 	struct task_struct *p = current;
 	bool migrated = flags & TNF_MIGRATED;
 	int cpu_node = task_node(current);
+	int local = !!(flags & TNF_FAULT_LOCAL);
 	int priv;
 
 	if (!numabalancing_enabled)
@@ -1786,6 +1787,17 @@ void task_numa_fault(int last_cpupid, int mem_node, int pages, int flags)
 		task_numa_group(p, last_cpupid, flags, &priv);
 	}
 
+	/*
+	 * If a workload spans multiple NUMA nodes, a shared fault that
+	 * occurs wholly within the set of nodes that the workload is
+	 * actively using should be counted as local. This allows the
+	 * scan rate to slow down when a workload has settled down.
+	 */
+	if (!priv && !local && p->numa_group &&
+			node_isset(cpu_node, p->numa_group->active_nodes) &&
+			node_isset(mem_node, p->numa_group->active_nodes))
+		local = 1;
+
 	task_numa_placement(p);
 
 	/*
@@ -1800,7 +1812,7 @@ void task_numa_fault(int last_cpupid, int mem_node, int pages, int flags)
 
 	p->numa_faults_buffer_memory[task_faults_idx(mem_node, priv)] += pages;
 	p->numa_faults_buffer_cpu[task_faults_idx(cpu_node, priv)] += pages;
-	p->numa_faults_locality[!!(flags & TNF_FAULT_LOCAL)] += pages;
+	p->numa_faults_locality[local] += pages;
 }
 
 static void reset_ptenuma_scan(struct task_struct *p)
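
The new check in the second hunk can be exercised outside the kernel with a small user-space model. This is a minimal sketch under stated assumptions: `node_mask`, `mask_has_node()`, and `fault_counts_as_local()` are hypothetical stand-ins for the kernel's `nodemask_t`, `node_isset()`, and the inline logic in `task_numa_fault()`. The boolean parameters replace `TNF_FAULT_LOCAL`, `priv`, and the `p->numa_group` pointer test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: bit n set means node n is active. */
typedef uint64_t node_mask;

/* Hypothetical stand-in for node_isset(); out-of-range nodes are never set. */
static bool mask_has_node(int node, node_mask mask)
{
	return node >= 0 && node < 64 && ((mask >> node) & 1u);
}

/*
 * Returns 1 if the fault should be accounted as local, 0 otherwise.
 *
 * fault_local:  models (flags & TNF_FAULT_LOCAL)
 * priv:         true for a private fault, false for a shared one
 * in_group:     models p->numa_group being non-NULL
 * active_nodes: models p->numa_group->active_nodes
 */
static int fault_counts_as_local(bool fault_local, bool priv, bool in_group,
				 node_mask active_nodes,
				 int cpu_node, int mem_node)
{
	int local = fault_local ? 1 : 0;

	/*
	 * A shared, non-local fault is promoted to local only when both
	 * the faulting CPU's node and the memory's node are in the set of
	 * nodes the group is actively using.
	 */
	if (!priv && !local && in_group &&
	    mask_has_node(cpu_node, active_nodes) &&
	    mask_has_node(mem_node, active_nodes))
		local = 1;

	return local;
}
```

With `active_nodes = 0x3` (nodes 0 and 1 active), a shared fault from a CPU on node 0 to memory on node 1 is counted as local, while the same fault to memory on node 2 stays remote. Private faults and tasks without a numa group keep their original classification.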