Commit 4cda25d3, author: Anshuman Khandual, committer: Zheng Zengkai

mm: Define coherent device memory (CDM) node

ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMLR
CVE: NA
-------------------

Certain devices, such as specialized accelerators, GPU cards, network
cards and FPGA cards, may contain onboard memory that is coherent with the
existing system RAM whether it is accessed from the CPU or from the
device. This memory shares some properties with normal system RAM but can
also differ from it in other respects.

User applications might want to use this kind of coherent device memory,
explicitly or implicitly, alongside the system RAM, utilizing all
possible core memory functions like anon mapping (LRU), file mapping (LRU),
page cache (LRU), driver managed (non LRU), HW poisoning, NUMA migrations
etc. To achieve this kind of tight integration with the core memory
subsystem, the device's onboard coherent memory must be represented as a
memory-only NUMA node. At the same time, the arch must export a function
that identifies this node as coherent device memory rather than any other
regular CPU-less, memory-only NUMA node.

Even after being integrated with the core memory subsystem, coherent device
memory might still need special consideration inside the kernel. There
can be a variety of coherent memory nodes with different expectations of
the core kernel memory subsystem. Right now only one kind of special
treatment is considered: isolation.

Now consider the case of a coherent device memory node type that requires
isolation. This kind of coherent memory sits on an external device
attached to the system through a link, so a link failure can always take
down the entire memory node with it. Moreover, the memory might also have
a higher chance of ECC failure than the system RAM. Hence allocation
into this kind of coherent memory node should be regulated. Kernel
allocations must not land here. Normal user space allocations must not
land here implicitly either (that is, without the user application knowing
about it). This summarizes the isolation requirement of one kind of
coherent device memory node as an example. Other kinds of isolation
requirement are possible as well.
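
The isolation rules described above can be sketched as a small user-space
model. This is only an illustration, not code from this patch: the
struct alloc_request type, its fields and the allowed_nodes() helper are
hypothetical names, and node masks are modelled as plain 64-bit integers
rather than the kernel's nodemask_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t nodemask;    /* bit n set => node n is in the mask */

/* Hypothetical request descriptor, not a kernel structure. */
struct alloc_request {
    bool from_kernel;         /* allocation made on behalf of the kernel */
    bool explicit_cdm;        /* user space explicitly asked for CDM memory */
    nodemask requested;       /* nodes the caller is willing to use */
};

/* Return the nodes an allocation may use under the isolation rules. */
static nodemask allowed_nodes(const struct alloc_request *req,
                              nodemask memory_nodes, nodemask cdm_nodes)
{
    nodemask candidates;

    /* CDM nodes are memory nodes too, so they must be a subset. */
    assert((cdm_nodes & ~memory_nodes) == 0);

    candidates = req->requested & memory_nodes;

    /* Kernel allocations never land on CDM nodes. */
    if (req->from_kernel)
        return candidates & ~cdm_nodes;

    /* User allocations reach CDM nodes only when explicitly requested. */
    if (!req->explicit_cdm)
        return candidates & ~cdm_nodes;

    return candidates;
}
```

With nodes 0-2 holding memory and node 2 being CDM, a kernel request or an
implicit user request is narrowed to nodes 0-1, while an explicit user
request for node 2 is let through.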

Some coherent memory devices might not require isolation at all. Others
might require some other special treatment after becoming part of the core
memory representation. For now, this work looks only at coherent device
memory nodes that seek isolation, not the other kinds.

To implement the integration as well as the isolation, the coherent memory
node must be present in both N_MEMORY and a new N_COHERENT_DEVICE node mask
inside the node_states[] array. During memory hotplug operations, the new
nodemask N_COHERENT_DEVICE is updated along with N_MEMORY for these
coherent device memory nodes. This also creates the following new sysfs
interface, which lists all the coherent memory nodes of the system.

	/sys/devices/system/node/is_cdm_node
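
A user-space consumer could turn the contents of this file into a bitmask
along the following lines. This sketch assumes the file uses the same
range-list text format as the other node state files (for example
"0-1,4"), and it caps node numbers at 63 to fit a 64-bit mask. The
parse_node_list() helper is a hypothetical name, not part of this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a node list such as "1-2,4\n" into a bitmask (bit n = node n).
 * Returns 0 on success and -1 on malformed input or a node number >= 64.
 * An empty list (no CDM nodes) yields an empty mask.
 */
static int parse_node_list(const char *s, uint64_t *mask)
{
    assert(s && mask);
    *mask = 0;

    while (*s && *s != '\n') {
        char *end;
        unsigned long lo, hi, n;

        if (*s < '0' || *s > '9')
            return -1;
        lo = hi = strtoul(s, &end, 10);

        if (*end == '-') {            /* a range "lo-hi" */
            s = end + 1;
            if (*s < '0' || *s > '9')
                return -1;
            hi = strtoul(s, &end, 10);
        }
        if (hi < lo || hi >= 64)
            return -1;

        for (n = lo; n <= hi; n++)
            *mask |= UINT64_C(1) << n;

        s = end;
        if (*s == ',')
            s++;                      /* move on to the next entry */
        else if (*s && *s != '\n')
            return -1;
    }
    return 0;
}
```

A caller would read one line from the sysfs file into a buffer and pass it
to parse_node_list(); a missing file simply means the kernel was built
without CONFIG_COHERENT_DEVICE.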

Architectures that enable CONFIG_COHERENT_DEVICE must export the function
arch_check_node_cdm(), which identifies coherent device memory nodes.
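
How the arch hook feeds the two node masks can be modelled in user space as
follows. This is a simplified sketch, not the kernel code: node masks are
64-bit integers, the arch hook is a stand-in that arbitrarily treats node 2
as device memory, and node_has_memory() is a hypothetical helper that
mirrors the order of calls the patch makes when a node is found to have
pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 64    /* limit of this model, not a kernel constant */

enum node_state { N_MEMORY, N_COHERENT_DEVICE, NR_NODE_STATES };

static uint64_t node_states[NR_NODE_STATES];

/* Stand-in for the arch hook: assume node 2 is coherent device memory. */
static int arch_check_node_cdm(int nid)
{
    return nid == 2;
}

static void node_set_state(int nid, enum node_state state)
{
    assert(nid >= 0 && nid < MAX_NODES);
    node_states[state] |= UINT64_C(1) << nid;
}

/* Mark the node as CDM only if the architecture says so. */
static void node_set_state_cdm(int nid)
{
    if (arch_check_node_cdm(nid))
        node_set_state(nid, N_COHERENT_DEVICE);
}

/* Hypothetical helper: what happens once a node is known to have pages. */
static void node_has_memory(int nid)
{
    node_set_state_cdm(nid);
    node_set_state(nid, N_MEMORY);
}

static bool is_cdm_node(int nid)
{
    return (node_states[N_COHERENT_DEVICE] >> nid) & 1;
}

/* System RAM nodes are the memory nodes that are not CDM nodes. */
static uint64_t system_mem_nodemask(void)
{
    return node_states[N_MEMORY] & ~node_states[N_COHERENT_DEVICE];
}
```

After node_has_memory() has been called for nodes 0, 1 and 2, all three are
in N_MEMORY, only node 2 is in N_COHERENT_DEVICE, and
system_mem_nodemask() covers nodes 0 and 1.
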
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Lijun Fang <fanglijun3@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Parent c44e1f78
@@ -25,6 +25,13 @@ static int numa_distance_cnt;
 static u8 *numa_distance;
 bool numa_off;
 
+#ifdef CONFIG_COHERENT_DEVICE
+inline int arch_check_node_cdm(int nid)
+{
+	return 0;
+}
+#endif
+
 static __init int numa_parse_early_param(char *opt)
 {
 	if (!opt)
......
@@ -1016,6 +1016,9 @@ static struct node_attr node_state_attr[] = {
 	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
 	[N_GENERIC_INITIATOR] = _NODE_ATTR(has_generic_initiator,
 					   N_GENERIC_INITIATOR),
+#ifdef CONFIG_COHERENT_DEVICE
+	[N_COHERENT_DEVICE] = _NODE_ATTR(is_cdm_node, N_COHERENT_DEVICE),
+#endif
 };
 
 static struct attribute *node_state_attrs[] = {
@@ -1028,6 +1031,9 @@ static struct attribute *node_state_attrs[] = {
 	&node_state_attr[N_MEMORY].attr.attr,
 	&node_state_attr[N_CPU].attr.attr,
 	&node_state_attr[N_GENERIC_INITIATOR].attr.attr,
+#ifdef CONFIG_COHERENT_DEVICE
+	&node_state_attr[N_COHERENT_DEVICE].attr.attr,
+#endif
 	NULL
 };
......
@@ -397,9 +397,12 @@ enum node_states {
 #else
 	N_HIGH_MEMORY = N_NORMAL_MEMORY,
 #endif
-	N_MEMORY,		/* The node has memory(regular, high, movable) */
+	N_MEMORY,		/* The node has memory(regular, high, movable, cdm) */
 	N_CPU,		/* The node has one or more cpus */
 	N_GENERIC_INITIATOR,	/* The node has one or more Generic Initiators */
+#ifdef CONFIG_COHERENT_DEVICE
+	N_COHERENT_DEVICE,	/* The node has CDM memory */
+#endif
 	NR_NODE_STATES
 };
@@ -503,6 +506,77 @@ static inline int node_random(const nodemask_t *mask)
 }
 #endif
 
+#ifdef CONFIG_COHERENT_DEVICE
+extern int arch_check_node_cdm(int nid);
+
+static inline nodemask_t system_mem_nodemask(void)
+{
+	nodemask_t system_mem;
+
+	nodes_clear(system_mem);
+	nodes_andnot(system_mem, node_states[N_MEMORY],
+			node_states[N_COHERENT_DEVICE]);
+	return system_mem;
+}
+
+static inline bool is_cdm_node(int node)
+{
+	return node_isset(node, node_states[N_COHERENT_DEVICE]);
+}
+
+static inline bool nodemask_has_cdm(nodemask_t mask)
+{
+	int node, i;
+
+	node = first_node(mask);
+	for (i = 0; i < nodes_weight(mask); i++) {
+		if (is_cdm_node(node))
+			return true;
+		node = next_node(node, mask);
+	}
+	return false;
+}
+
+static inline void node_set_state_cdm(int node)
+{
+	if (arch_check_node_cdm(node))
+		node_set_state(node, N_COHERENT_DEVICE);
+}
+
+static inline void node_clear_state_cdm(int node)
+{
+	if (arch_check_node_cdm(node))
+		node_clear_state(node, N_COHERENT_DEVICE);
+}
+
+#else
+static inline int arch_check_node_cdm(int nid) { return 0; }
+
+static inline nodemask_t system_mem_nodemask(void)
+{
+	return node_states[N_MEMORY];
+}
+
+static inline bool is_cdm_node(int node)
+{
+	return false;
+}
+
+static inline bool nodemask_has_cdm(nodemask_t mask)
+{
+	return false;
+}
+
+static inline void node_set_state_cdm(int node)
+{
+}
+
+static inline void node_clear_state_cdm(int node)
+{
+}
+
+#endif	/* CONFIG_COHERENT_DEVICE */
+
 #define node_online_map 	node_states[N_ONLINE]
 #define node_possible_map 	node_states[N_POSSIBLE]
......
@@ -145,6 +145,13 @@ config NUMA_KEEP_MEMINFO
 config MEMORY_ISOLATION
 	bool
 
+config COHERENT_DEVICE
+	bool "coherent device memory"
+	def_bool n
+	depends on CPUSETS && ARM64 && NUMA
+	help
+	  Enable coherent device memory (CDM) support.
+
 #
 # Only be set on architectures that have completely implemented memory hotplug
 # feature. If you are not sure, don't touch it.
......
@@ -7355,8 +7355,10 @@ static unsigned long __init early_calculate_totalpages(void)
 		unsigned long pages = end_pfn - start_pfn;
 
 		totalpages += pages;
-		if (pages)
+		if (pages) {
+			node_set_state_cdm(nid);
 			node_set_state(nid, N_MEMORY);
+		}
 	}
 	return totalpages;
 }
@@ -7694,8 +7696,10 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 		free_area_init_node(nid);
 
 		/* Any memory on that node */
-		if (pgdat->node_present_pages)
+		if (pgdat->node_present_pages) {
+			node_set_state_cdm(nid);
 			node_set_state(nid, N_MEMORY);
+		}
 		check_for_memory(pgdat, nid);
 	}
......