Commit 98376843, authored by Haozhong Zhang, committed by Eduardo Habkost

hostmem-file: add "align" option

When mmap(2)ing the backend files, QEMU uses the host page size
(getpagesize(2)) by default as the alignment of the mapping address.
However, some backends may require an alignment different from the page
size. For example, mmapping a device DAX (e.g., /dev/dax0.0) on Linux
kernel 4.13 to an address that is 4K-aligned but not 2M-aligned
fails with a kernel message like

[617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff)

Because there is no common way to query such an alignment requirement,
we add the 'align' option to 'memory-backend-file', so that users or
management tools that know enough about the backend can specify a
proper alignment via this option.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[ehabkost: fixed typo, fixed error_setg() format string]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Parent 1e2bdd2e
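
The kernel message quoted in the commit message can be checked by hand. The following is a minimal sketch, not code from the patch: `is_aligned` is a hypothetical helper, and the constants are taken from the log line (start address 0x7fa37c579000, mask 0x1fffff, i.e. 2 MiB minus one).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): returns true if addr is a
 * multiple of align. align must be a non-zero power of two.
 */
static bool is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}
```

With the start address from the log, `is_aligned(0x7fa37c579000, 4096)` holds but `is_aligned(0x7fa37c579000, 2 * 1024 * 1024)` does not, because the low 21 bits of the address are 0x179000. That is the "4K-aligned but not 2M-aligned" case the commit message describes.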
@@ -34,6 +34,7 @@ struct HostMemoryBackendFile {
     bool share;
     bool discard_data;
     char *mem_path;
+    uint64_t align;
 };

 static void
@@ -58,7 +59,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
         path = object_get_canonical_path(OBJECT(backend));
         memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
                                  path,
-                                 backend->size, fb->share,
+                                 backend->size, fb->align, fb->share,
                                  fb->mem_path, errp);
         g_free(path);
     }
@@ -115,6 +116,40 @@ static void file_memory_backend_set_discard_data(Object *o, bool value,
     MEMORY_BACKEND_FILE(o)->discard_data = value;
 }

+static void file_memory_backend_get_align(Object *o, Visitor *v,
+                                          const char *name, void *opaque,
+                                          Error **errp)
+{
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
+    uint64_t val = fb->align;
+
+    visit_type_size(v, name, &val, errp);
+}
+
+static void file_memory_backend_set_align(Object *o, Visitor *v,
+                                          const char *name, void *opaque,
+                                          Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(o);
+    HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
+    Error *local_err = NULL;
+    uint64_t val;
+
+    if (host_memory_backend_mr_inited(backend)) {
+        error_setg(&local_err, "cannot change property value");
+        goto out;
+    }
+
+    visit_type_size(v, name, &val, &local_err);
+    if (local_err) {
+        goto out;
+    }
+    fb->align = val;
+
+ out:
+    error_propagate(errp, local_err);
+}
+
 static void file_backend_unparent(Object *obj)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
@@ -145,6 +180,10 @@ file_backend_class_init(ObjectClass *oc, void *data)
     object_class_property_add_str(oc, "mem-path",
         get_mem_path, set_mem_path,
         &error_abort);
+    object_class_property_add(oc, "align", "int",
+        file_memory_backend_get_align,
+        file_memory_backend_set_align,
+        NULL, NULL, &error_abort);
 }

 static void file_backend_instance_finalize(Object *o)
......
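
The setter in this file reads the value through visit_type_size(), and the option documentation in this commit says that 'align' accepts common suffixes such as 2M. As a rough illustration of what that means for a value like `align=2M`, here is a small stand-alone sketch. `parse_size` is a hypothetical stand-in written for this example; it is not QEMU's parser and only handles plain decimal numbers with an optional K, M or G suffix.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for size parsing (not QEMU's implementation).
 * Accepts "<decimal>[K|M|G]" and stores the value in bytes in *out.
 * Returns false on malformed input or overflow.
 */
static bool parse_size(const char *s, uint64_t *out)
{
    assert(s != NULL && out != NULL);

    /* strtoull() would accept whitespace and a sign; require a digit. */
    if (*s < '0' || *s > '9') {
        return false;
    }

    char *end;
    errno = 0;
    unsigned long long n = strtoull(s, &end, 10);
    if (errno == ERANGE) {
        return false;
    }

    unsigned shift = 0;
    switch (*end) {
    case '\0':                               break;
    case 'K': case 'k': shift = 10; end++;   break;
    case 'M': case 'm': shift = 20; end++;   break;
    case 'G': case 'g': shift = 30; end++;   break;
    default:
        return false;
    }
    if (*end != '\0') {
        return false;                        /* trailing garbage */
    }
    if (shift != 0 && n > (UINT64_MAX >> shift)) {
        return false;                        /* would overflow 64 bits */
    }

    *out = (uint64_t)n << shift;
    return true;
}
```

Under these assumptions, "2M" yields 2097152 bytes, which is the alignment used in the device DAX example of this commit.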
@@ -122,3 +122,19 @@ Note:
       M >= size of RAM devices +
            size of statically plugged vNVDIMM devices +
            size of hotplugged vNVDIMM devices
+
+Alignment
+---------
+
+QEMU uses mmap(2) to maps vNVDIMM backends and aligns the mapping
+address to the page size (getpagesize(2)) by default. However, some
+types of backends may require an alignment different than the page
+size. In that case, QEMU v2.12.0 and later provide 'align' option to
+memory-backend-file to allow users to specify the proper alignment.
+
+For example, device dax require the 2 MB alignment, so we can use
+following QEMU command line options to use it (/dev/dax0.0) as the
+backend of vNVDIMM:
+
+ -object memory-backend-file,id=mem1,share=on,mem-path=/dev/dax0.0,size=4G,align=2M
+ -device nvdimm,id=nvdimm1,memdev=mem1
@@ -1612,7 +1612,13 @@ static void *file_ram_alloc(RAMBlock *block,
     void *area;

     block->page_size = qemu_fd_getpagesize(fd);
-    block->mr->align = block->page_size;
+    if (block->mr->align % block->page_size) {
+        error_setg(errp, "alignment 0x%" PRIx64
+                   " must be multiples of page size 0x%zx",
+                   block->mr->align, block->page_size);
+        return NULL;
+    }
+    block->mr->align = MAX(block->page_size, block->mr->align);
 #if defined(__s390x__)
     if (kvm_enabled()) {
         block->mr->align = MAX(block->mr->align, QEMU_VMALLOC_ALIGN);
......
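
The check added to file_ram_alloc() can be read in isolation as "reject anything that is not a multiple of the backend's page size, otherwise use the larger of the two". The sketch below restates that rule as a stand-alone function. `effective_align` is a hypothetical name chosen for this example, and returning 0 for the error case is an assumption of the sketch (the patch reports the error through error_setg() and returns NULL).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical restatement of the alignment rule in the hunk above.
 * Returns the alignment that would be used, or 0 if 'requested' is
 * not a multiple of 'page_size'. A request of 0 selects the default,
 * i.e. the page size.
 */
static uint64_t effective_align(uint64_t requested, size_t page_size)
{
    assert(page_size != 0);

    if (requested % page_size) {
        return 0;                     /* invalid: not a page multiple */
    }
    return requested > page_size ? requested : page_size;
}
```

One consequence worth noting: with a 2 MiB backend page size, a request of 4 KiB is rejected rather than rounded up, because 4096 is not a multiple of 2 MiB.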
@@ -465,6 +465,8 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
  * @name: Region name, becomes part of RAMBlock name used in migration stream
  *        must be unique within any device
  * @size: size of the region.
+ * @align: alignment of the region base address; if 0, the default alignment
+ *         (getpagesize()) will be used.
  * @share: %true if memory must be mmaped with the MAP_SHARED flag
  * @path: the path in which to allocate the RAM.
  * @errp: pointer to Error*, to store an error if it happens.
@@ -476,6 +478,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       struct Object *owner,
                                       const char *name,
                                       uint64_t size,
+                                      uint64_t align,
                                       bool share,
                                       const char *path,
                                       Error **errp);
......
@@ -1570,6 +1570,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       struct Object *owner,
                                       const char *name,
                                       uint64_t size,
+                                      uint64_t align,
                                       bool share,
                                       const char *path,
                                       Error **errp)
@@ -1578,6 +1579,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
     mr->ram = true;
     mr->terminates = true;
     mr->destructor = memory_region_destructor_ram;
+    mr->align = align;
     mr->ram_block = qemu_ram_alloc_from_file(size, mr, share, path, errp);
     mr->dirty_log_mask = tcg_enabled() ? (1 << DIRTY_MEMORY_CODE) : 0;
 }
......
@@ -456,7 +456,7 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
     if (mem_path) {
 #ifdef __linux__
         Error *err = NULL;
-        memory_region_init_ram_from_file(mr, owner, name, ram_size, false,
+        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0, false,
                                          mem_path, &err);
         if (err) {
             error_report_err(err);
......
@@ -3974,7 +3974,7 @@ property must be set. These objects are placed in the

 @table @option

-@item -object memory-backend-file,id=@var{id},size=@var{size},mem-path=@var{dir},share=@var{on|off},discard-data=@var{on|off},merge=@var{on|off},dump=@var{on|off},prealloc=@var{on|off},host-nodes=@var{host-nodes},policy=@var{default|preferred|bind|interleave}
+@item -object memory-backend-file,id=@var{id},size=@var{size},mem-path=@var{dir},share=@var{on|off},discard-data=@var{on|off},merge=@var{on|off},dump=@var{on|off},prealloc=@var{on|off},host-nodes=@var{host-nodes},policy=@var{default|preferred|bind|interleave},align=@var{align}

 Creates a memory file backend object, which can be used to back
 the guest RAM with huge pages.
@@ -4027,6 +4027,13 @@ restrict memory allocation to the given host node list
 interleave memory allocations across the given host node list
 @end table

+The @option{align} option specifies the base address alignment when
+QEMU mmap(2) @option{mem-path}, and accepts common suffixes, eg
+@option{2M}. Some backend store specified by @option{mem-path}
+requires an alignment different than the default one used by QEMU, eg
+the device DAX /dev/dax0.0 requires 2M alignment rather than 4K. In
+such cases, users can specify the required alignment via this option.
+
 @item -object memory-backend-ram,id=@var{id},merge=@var{on|off},dump=@var{on|off},prealloc=@var{on|off},size=@var{size},host-nodes=@var{host-nodes},policy=@var{default|preferred|bind|interleave}

 Creates a memory backend object, which can be used to back the guest RAM.
......
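
The diff passes the alignment down to the allocation code but does not show how a mapping ends up at an aligned address. One common technique, sketched here under stated assumptions and not taken from this commit, is to reserve `size + align` bytes of address space with PROT_NONE, pick the first aligned address inside that reservation, map the file there with MAP_FIXED, and release the unused head and tail. `mmap_aligned` is a hypothetical helper; it assumes `align` is a power of two and that both `align` and `size` are multiples of the host page size. Passing `fd = -1` maps anonymous memory, which is only there so the sketch can be tried without a DAX device.

```c
#define _DEFAULT_SOURCE 1   /* expose MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not QEMU's function): map 'size' bytes of 'fd'
 * at an address that is a multiple of 'align'. Returns MAP_FAILED on
 * error. With fd < 0 an anonymous private mapping is created instead.
 */
static void *mmap_aligned(int fd, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    /* 1. Reserve enough address space to contain an aligned region. */
    size_t total = size + align;
    uint8_t *guard = mmap(NULL, total, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guard == MAP_FAILED) {
        return MAP_FAILED;
    }

    /* 2. Distance from the reservation start to the next aligned address. */
    size_t offset = (align - ((uintptr_t)guard & (align - 1))) & (align - 1);

    /* 3. Place the real mapping on top of the reservation. */
    int flags = MAP_FIXED |
                (fd >= 0 ? MAP_SHARED : MAP_PRIVATE | MAP_ANONYMOUS);
    void *ptr = mmap(guard + offset, size, PROT_READ | PROT_WRITE,
                     flags, fd, 0);
    if (ptr == MAP_FAILED) {
        munmap(guard, total);
        return MAP_FAILED;
    }

    /* 4. Give back the unused head and tail of the reservation. */
    if (offset > 0) {
        munmap(guard, offset);
    }
    munmap((uint8_t *)ptr + size, align - offset);

    return ptr;
}
```

For a device DAX backend as in the example from this commit, the call would look like `mmap_aligned(fd, size, 2 * 1024 * 1024)` with `fd` opened on /dev/dax0.0; the MAP_SHARED choice mirrors the `share=on` setting in that example.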