- 19 12月, 2020 22 次提交
-
-
由 David Hildenbrand 提交于
Let's rename them accordingly. virtio_mem_plug_request() and virtio_mem_unplug_request() will be handled separately. Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-23-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's move first_mb_id/next_mb_id/last_usable_mb_id accordingly. Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-22-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's rename to "sbs_per_mb" and "sb_size" and move accordingly. Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-21-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's rename and move accordingly. While at it, rename sb_bitmap to "sb_states". Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-20-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
let's use a new "sbm" sub-struct to hold SBM-specific state and rename + move applicable definitions, functions, and variables (related to memory block states). While at it: - Drop the "_STATE" part from memory block states - Rename "nb_mb_state" to "mb_count" - "set_mb_state" / "get_mb_state" vs. "mb_set_state" / "mb_get_state" - Don't use lengthy "enum virtio_mem_smb_mb_state", simply use "uint8_t" Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-19-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's add some documentation for the current mode - Sub Block Mode (SBM) - to prepare for a new mode - Big Block Mode (BBM). Follow-up patches will properly factor out the existing Sub Block Mode (SBM) and implement Big Block Mode (BBM). Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-18-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
We don't want to add too much memory when it's not getting onlined immediately, to avoid running OOM. Generalize the handling, to avoid making use of memory block states. Use a threshold of 1 GiB for now. Properly adjust the offline size when adding/removing memory. As we are not always protected by a lock when touching the offline size, use an atomic64_t. We don't care about races (e.g., someone offlining memory while we are adding more), only about consistent values. (1 GiB needs a memmap of ~16MiB - which sounds reasonable even for setups with little boot memory and (possibly) one virtio-mem device per node) We don't want to retrigger when onlining is caused immediately by our action (e.g., adding memory which immediately gets onlined), so use a flag to indicate if the workqueue is active and use that as an indicator whether to trigger a retry. This will also be especially relevant for Big Block Mode (BBM), whereby we might re-online memory in case offlining of another memory block failed. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-17-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's trigger from offlining code only when we're not allowed to unplug online memory. Handle the other case (memmap possibly freeing up another memory block) when actually removing memory. We now also properly handle the case when removing already offline memory blocks via virtio_mem_mb_remove(). When removing via virtio_mem_remove(), when unloading the driver, virtio_mem_retry() is a NOP and safe to use. While at it, move retry handling when offlining out of virtio_mem_notify_offline(), to share it with Big Block Mode (BBM) soon. This is a preparation for Big Block Mode (BBM), whereby we can see some temporary offlining of memory blocks without actually making progress. Imagine you have a Big Block that spans to Linux memory blocks. Assume the first Linux memory blocks has no unmovable data on it. When we would call offline_and_remove_memory() on the big block, we would 1. Try to offline the first block. Works, notifiers triggered. virtio_mem_retry() called. 2. Try to offline the second block. Does not work. 3. Re-online first block. 4. Exit to main loop, exit workqueue. 5. Retry immediately (due to virtio_mem_retry()), go to 1. The result are endless retries. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-16-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
No longer used, let's drop it. Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-15-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Avoid using memory block ids. While at it, use uint64_t for address/size. This is a preparation for Big Block Mode (BBM). Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-14-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Avoid using memory block ids. Rename it to virtio_mem_contains_range(). This is a preparation for Big Block Mode (BBM). Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-13-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's check by traversing busy system RAM resources instead, to avoid relying on memory block states. Don't use walk_system_ram_range(), as that works on pages and we want to use the bare addresses we have easily at hand. This is a preparation for Big Block Mode (BBM), which won't have memory block states. Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-12-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
ZONE_MOVABLE is supposed to give some guarantees, yet, alloc_contig_range() isn't prepared to properly deal with some racy cases properly (e.g., temporary page pinning when exiting processed, PCP). Retry 5 times for now. There is certainly room for improvement in the future. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-11-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's factor out the core pieces and place the implementation next to virtio_mem_fake_offline(). We'll reuse this functionality soon. Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-10-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
... which now matches virtio_mem_fake_online(). We'll reuse this functionality soon. Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-9-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's move the existing dev_dbg() into the functions, print if something went wrong, and also print for virtio_mem_send_unplug_all_request(). Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-8-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
The calculation is already complicated enough, let's limit it to one location. Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-7-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
No harm done, but let's be consistent. Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-6-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
We can drop rc2, we don't actually need the value. Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-5-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's use pageblock_nr_pages and MAX_ORDER_NR_PAGES instead where possible to simplify. Add a comment why we have that restriction for now. Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-4-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
We actually need one byte less (next_mb_id is exclusive, first_mb_id is inclusive). While at it, compact the code. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-3-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's determine the target nid only once in case we have none specified - usually, we'll end up with node 0 either way. Reviewed-by: NWei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20201112133815.13332-2-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 21 10月, 2020 1 次提交
-
-
由 Rikard Falkeborn 提交于
mem_id_table is not modified, so make it const to allow the compiler to put it in read-only memory. Signed-off-by: NRikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lore.kernel.org/r/20200911203509.26505-4-rikard.falkeborn@gmail.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NDavid Hildenbrand <david@redhat.com>
-
- 17 10月, 2020 2 次提交
-
-
由 David Hildenbrand 提交于
virtio-mem adds memory in memory block granularity, to be able to remove it in the same granularity again later, and to grow slowly on demand. This, however, results in quite a lot of resources when adding a lot of memory. Resources are effectively stored in a list-based tree. Having a lot of resources not only wastes memory, it also makes traversing that tree more expensive, and makes /proc/iomem explode in size (e.g., requiring kexec-tools to manually merge resources later when e.g., trying to create a kdump header). Before this patch, we get (/proc/iomem) when hotplugging 2G via virtio-mem on x86-64: [...] 100000000-13fffffff : System RAM 140000000-33fffffff : virtio0 140000000-147ffffff : System RAM (virtio_mem) 148000000-14fffffff : System RAM (virtio_mem) 150000000-157ffffff : System RAM (virtio_mem) 158000000-15fffffff : System RAM (virtio_mem) 160000000-167ffffff : System RAM (virtio_mem) 168000000-16fffffff : System RAM (virtio_mem) 170000000-177ffffff : System RAM (virtio_mem) 178000000-17fffffff : System RAM (virtio_mem) 180000000-187ffffff : System RAM (virtio_mem) 188000000-18fffffff : System RAM (virtio_mem) 190000000-197ffffff : System RAM (virtio_mem) 198000000-19fffffff : System RAM (virtio_mem) 1a0000000-1a7ffffff : System RAM (virtio_mem) 1a8000000-1afffffff : System RAM (virtio_mem) 1b0000000-1b7ffffff : System RAM (virtio_mem) 1b8000000-1bfffffff : System RAM (virtio_mem) 3280000000-32ffffffff : PCI Bus 0000:00 With this patch, we get (/proc/iomem): [...] fffc0000-ffffffff : Reserved 100000000-13fffffff : System RAM 140000000-33fffffff : virtio0 140000000-1bfffffff : System RAM (virtio_mem) 3280000000-32ffffffff : PCI Bus 0000:00 Of course, with more hotplugged memory, it gets worse. When unplugging memory blocks again, try_remove_memory() (via offline_and_remove_memory()) will properly split the resource up again. Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Juergen Gross <jgross@suse.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <lenb@kernel.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20200911103459.10306-7-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Hildenbrand 提交于
We soon want to pass flags, e.g., to mark added System RAM resources. mergeable. Prepare for that. This patch is based on a similar patch by Oscar Salvador: https://lkml.kernel.org/r/20190625075227.15193-3-osalvador@suse.deSigned-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Reviewed-by: Juergen Gross <jgross@suse.com> # Xen related part Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: NWei Liu <wei.liu@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richardw.yang@linux.intel.com> Link: https://lkml.kernel.org/r/20200911103459.10306-5-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 10月, 2020 1 次提交
-
-
由 David Hildenbrand 提交于
When introducing virtio-mem, the semantics of ZONE_MOVABLE were rather unclear, which is why we special-cased ZONE_MOVABLE such that partially plugged blocks would never end up in ZONE_MOVABLE. Now that the semantics are much clearer (and will be documented in a follow-up patch including the new virtio-mem behavior), let's allow to online partially plugged memory blocks to ZONE_MOVABLE and also consider memory blocks that were onlined to ZONE_MOVABLE when unplugging memory. While unplugged memory pages are, in general, unmovable, they can be skipped when offlining memory. virtio-mem only unplugs fairly big chunks (in the megabyte range) and rather tries to shrink the memory region than randomly choosing memory. In theory, if all other pages in the movable zone would be movable, virtio-mem would only shrink that zone and not create any kind of fragmentation. In the future, we might want to remember the zone again and use the information when (un)plugging memory. For now, let's keep it simple. Note: Support for defragmentation is planned, to deal with fragmentation after unplug due to memory chunks within memory blocks that could not get unplugged before (e.g., somebody pinning pages within ZONE_MOVABLE for a longer time). Signed-off-by: NDavid Hildenbrand <david@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Qian Cai <cai@lca.pw> Link: http://lkml.kernel.org/r/20200816125333.7434-6-david@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 05 8月, 2020 1 次提交
-
-
由 Michael S. Tsirkin 提交于
Virtio mem is modern-only. Use LE accessors for config space. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 23 6月, 2020 2 次提交
-
-
由 David Hildenbrand 提交于
Virtio-mem managed memory is always detected and added by the virtio-mem driver, never using something like the firmware-provided memory map. This is the case after an ordinary system reboot, and has to be guaranteed after kexec. Especially, virtio-mem added memory resources can contain inaccessible parts ("unblocked memory blocks"), blindly forwarding them to a kexec kernel is dangerous, as unplugged memory will get accessed (esp. written). Let's use the new way of adding special driver-managed memory introduced in commit 7b7b2721 ("mm/memory_hotplug: introduce add_memory_driver_managed()"). This will result in no entries in /sys/firmware/memmap ("raw firmware- provided memory map"), the memory resource will be flagged IORESOURCE_MEM_DRIVER_MANAGED (esp., kexec_file_load() will not place kexec images on this memory), and it is exposed as "System RAM (virtio_mem)" in /proc/iomem, so esp. kexec-tools can properly handle it. Example /proc/iomem before this change: [...] 140000000-333ffffff : virtio0 140000000-147ffffff : System RAM 334000000-533ffffff : virtio1 338000000-33fffffff : System RAM 340000000-347ffffff : System RAM 348000000-34fffffff : System RAM [...] Example /proc/iomem after this change: [...] 140000000-333ffffff : virtio0 140000000-147ffffff : System RAM (virtio_mem) 334000000-533ffffff : virtio1 338000000-33fffffff : System RAM (virtio_mem) 340000000-347ffffff : System RAM (virtio_mem) 348000000-34fffffff : System RAM (virtio_mem) [...] Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: teawater <teawaterz@linux.alibaba.com> Fixes: 5f1f79bb ("virtio-mem: Paravirtualized memory hotplug") Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200611093518.5737-1-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com>
-
由 Dan Carpenter 提交于
Smatch complains that "rc" can be uninitialized if we hit the "break;" statement on the first iteration through the loop. I suspect that this can't happen in real life, but returning a zero literal is cleaner and silence the static checker warning. Fixes: 5f1f79bb ("virtio-mem: Paravirtualized memory hotplug") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20200610085911.GC5439@mwandaSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 09 6月, 2020 1 次提交
-
-
由 Michael S. Tsirkin 提交于
If subblock size is large (e.g. 1G) 32 bit math involving it can overflow. Rather than try to catch all instances of that, let's tweak block size to 64 bit. It ripples through UAPI which is an ABI change, but it's not too late to make it, and it will allow supporting >4Gbyte blocks while might become necessary down the road. Fixes: 5f1f79bb ("virtio-mem: Paravirtualized memory hotplug") Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NDavid Hildenbrand <david@redhat.com>
-
- 08 6月, 2020 1 次提交
-
-
由 Michael S. Tsirkin 提交于
rc is initialized to -ENIVAL but that's never used. Drop it. Fixes: 5f1f79bb ("virtio-mem: Paravirtualized memory hotplug") Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NDavid Hildenbrand <david@redhat.com>
-
- 05 6月, 2020 9 次提交
-
-
由 David Hildenbrand 提交于
The compiler will add padding after the last member, make that explicit. The size of a request is always 24 bytes. The size of a response always 10 bytes. Add compile-time checks. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: teawater <teawaterz@linux.alibaba.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200515101402.16597-1-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Right now, we always try to unplug single subblocks when processing an online memory block. Let's try to unplug the complete online memory block first, in case it is fully plugged and the unplug request is large enough. Fallback to single subblocks in case the memory block cannot get unplugged as a whole. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-16-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's be able to distinguish if the device or if memory is busy. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-15-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
We unplug blocks right-to-left, let's also unplug subblocks within a block right-to-left. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-14-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Registering our parent resource will fail if any memory is still present (e.g., because somebody unloaded the driver and tries to reload it). No need for the manual check. Move our "unplug all" handling to after registering the resource. Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-13-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's add a parent resource, named after the virtio device (inspired by drivers/dax/kmem.c). This allows user space to identify which memory belongs to which virtio-mem device. With this change and two virtio-mem devices: :/# cat /proc/iomem 00000000-00000fff : Reserved 00001000-0009fbff : System RAM [...] 140000000-333ffffff : virtio0 140000000-147ffffff : System RAM 148000000-14fffffff : System RAM 150000000-157ffffff : System RAM [...] 334000000-3033ffffff : virtio1 338000000-33fffffff : System RAM 340000000-347ffffff : System RAM 348000000-34fffffff : System RAM [...] Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-12-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com>
-
由 David Hildenbrand 提交于
Let's start with a retry interval of 5 seconds and double the time until we reach 5 minutes, in case we keep getting errors. Reset the retry interval in case we succeeded. The two main reasons for having to retry are - The hypervisor is busy and cannot process our request - We cannot reach the desired requested_size (esp., not enough memory can get unplugged because we can't allocate any subblocks). Tested-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-11-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Let's offline+remove memory blocks once all subblocks are unplugged. We can use the new Linux MM interface for that. As no memory is in use anymore, this shouldn't take a long time and shouldn't fail. There might be corner cases where the offlining could still fail (especially, if another notifier NACKs the offlining request). Acked-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Tested-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-10-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 David Hildenbrand 提交于
Dropping the reference count of PageOffline() pages during MEM_GOING_ONLINE allows offlining code to skip them. However, we also have to clear PG_reserved, because PG_reserved pages get detected as unmovable right away. Take care of restoring the reference count when offlining is canceled. Clarify why we don't have to perform any action when unloading the driver. Also, let's add a warning if anybody is still holding a reference to unplugged pages when offlining. Tested-by: NPankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: NDavid Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200507140139.17083-8-david@redhat.comSigned-off-by: NMichael S. Tsirkin <mst@redhat.com>
-