Commits · 83ef73b27eb2363f44faf9c3ee28a3fe752cfd15 · openeuler / Kernel

19 Dec, 2020 · 40 commits

vdpa/mlx5: Use write memory barrier after updating CQ index · 83ef73b2

Committed by Eli Cohen on Dec 09, 2020

Make sure to put a DMA write memory barrier after updating the CQ consumer
index so the hardware knows that there are available CQE slots in the
queue.

Failure to do this can cause the RX doorbell record to be updated
before the CQ consumer index, resulting in CQ overrun.
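
The ordering requirement can be illustrated with a user-space sketch. The kernel's DMA write barrier (dma_wmb()) is not available outside the kernel, so this sketch substitutes a C11 release fence; the structure and function names are hypothetical and are not taken from the mlx5 driver.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the CQ consumer index and the RX doorbell record. */
struct demo_queue {
	_Atomic uint32_t cq_consumer_index;
	_Atomic uint32_t rx_doorbell;
};

/*
 * Publish the new consumer index first, then the doorbell. The release
 * fence plays the role the commit assigns to the DMA write barrier:
 * the store before the fence may not be reordered after the store
 * that follows it.
 */
static void demo_complete_and_ring(struct demo_queue *q, uint32_t new_ci,
				   uint32_t new_db)
{
	atomic_store_explicit(&q->cq_consumer_index, new_ci,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&q->rx_doorbell, new_db, memory_order_relaxed);
}
```

Without the fence, an observer of the doorbell could see new receive buffers posted while still reading a stale consumer index, which is the overrun scenario described above.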

Fixes: 1a86b377 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20201209140004.15892-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

83ef73b2

vdpa: split vdpasim to core and net modules · db1e8bb6

Committed by Max Gurtovoy on Dec 15, 2020

Introduce new vdpa_sim_net and vdpa_sim (core) drivers. This is a
preparation for adding a vdpa simulator module for block devices.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
[sgarzare: various cleanups/fixes]
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-19-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

db1e8bb6

vdpa_sim: split vdpasim_virtqueue's iov field in out_iov and in_iov · 275900df

Committed by Stefano Garzarella on Dec 15, 2020

vringh_getdesc_iotlb() manages 2 iovs for writable and readable
descriptors. This is very useful for the block device, where for
each request we have both types of descriptor.

Let's split the vdpasim_virtqueue's iov field into out_iov and
in_iov to use them with vringh_getdesc_iotlb().

We are using VIRTIO terminology for "out" (readable by the device)
and "in" (writable by the device) descriptors.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-18-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

275900df

vdpa_sim: make vdpasim->buffer size configurable · da7af696

Committed by Stefano Garzarella on Dec 15, 2020

Allow each device to specify the size of the buffer allocated
in vdpa_sim.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-17-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

da7af696

vdpa_sim: use kvmalloc to allocate vdpasim->buffer · 165be1f8

Committed by Stefano Garzarella on Dec 15, 2020

The next patch will make the buffer size configurable from each
device.
Since the buffer could be larger than a page, we use kvmalloc()
instead of kmalloc().

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-16-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

165be1f8

vdpa_sim: set vringh notify callback · b240491b

Committed by Stefano Garzarella on Dec 15, 2020

Instead of calling the vq callback directly, we can leverage the
vringh_notify() function, adding vdpasim_vq_notify() and setting it
in the vringh notify callback.

Suggested-by: Jason Wang <jasowang@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-15-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

b240491b

vdpa_sim: add set_config callback in vdpasim_dev_attr · c124a95e

Committed by Stefano Garzarella on Dec 15, 2020

The set_config callback can be used by the device to parse the
config structure modified by the driver.

The callback will be invoked, if set, in vdpasim_set_config() after
copying bytes from caller buffer into vdpasim->config buffer.
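
The "copy first, then invoke the hook if set" order can be sketched in plain C as follows. All names here (demo_sim, demo_dev_attr, demo_set_config, demo_parse) are hypothetical, and the bounds check is an addition for the sake of a safe example rather than something the commit message describes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct demo_sim;

/* Hypothetical per-device attributes; only the hook relevant here is shown. */
struct demo_dev_attr {
	size_t config_size;
	void (*set_config)(struct demo_sim *sim, const void *config);
};

struct demo_sim {
	struct demo_dev_attr attr;
	uint8_t config[16];
	int parsed;		/* counts how often the hook ran */
};

/* Copy the caller's bytes into the config buffer, then let the device parse it. */
static int demo_set_config(struct demo_sim *sim, size_t offset,
			   const void *buf, size_t len)
{
	size_t size = sim->attr.config_size;

	/* Bounds check added for this sketch. */
	if (size > sizeof(sim->config) || offset > size || len > size - offset)
		return -1;

	memcpy(sim->config + offset, buf, len);

	/* The hook is optional: invoke it only if the device set one. */
	if (sim->attr.set_config)
		sim->attr.set_config(sim, sim->config);
	return 0;
}

/* Example device hook: a real device would inspect the updated fields. */
static void demo_parse(struct demo_sim *sim, const void *config)
{
	(void)config;
	sim->parsed++;
}
```

A device that does not care about driver writes simply leaves the hook unset, and the generic code still performs the copy.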

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-14-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

c124a95e

vdpa_sim: add get_config callback in vdpasim_dev_attr · 65b70958

Committed by Stefano Garzarella on Dec 15, 2020

The get_config callback can be used by the device to fill the
config structure.
The callback will be invoked in vdpasim_get_config() before copying
bytes into caller buffer.

Move vDPA-net config updates from vdpasim_set_features() into the
new vdpasim_net_get_config() callback.
This is safe since in vdpa_get_config() we already check that
.set_features() callback is called before .get_config().

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-13-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

65b70958

vdpa_sim: make 'config' generic and usable for any device type · f37cbbc6

Committed by Stefano Garzarella on Dec 15, 2020

Add a new 'config_size' attribute in 'vdpasim_dev_attr' and allocate
'config' dynamically to support any device type.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-12-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

f37cbbc6

vdpa_sim: store parsed MAC address in a buffer · cf1a3b35

Committed by Stefano Garzarella on Dec 15, 2020

As preparation for the next patches, we store the MAC address,
parsed during the vdpasim_create(), in a buffer that will be used
to fill 'config' together with other configurations.

Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-11-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

cf1a3b35

vdpa_sim: add work_fn in vdpasim_dev_attr · a13b5918

Committed by Stefano Garzarella on Dec 15, 2020

Rename vdpasim_work() to vdpasim_net_work() and add it to
the vdpasim_dev_attr structure.

Co-developed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-10-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

a13b5918

vdpa_sim: add supported_features field in vdpasim_dev_attr · 011c35ba

Committed by Stefano Garzarella on Dec 15, 2020

Introduce a new VDPASIM_FEATURES macro with the generic features
supported by the vDPA simulator, and VDPASIM_NET_FEATURES macro with
vDPA-net features.

Add 'supported_features' field in vdpasim_dev_attr, to allow devices
to specify their features.
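
As a rough illustration of a generic feature mask extended by a device-specific one, consider the sketch below. The DEMO_* names and demo_negotiate() are hypothetical; the bit numbers mirror the VIRTIO feature bits for ANY_LAYOUT (27), VERSION_1 (32), ACCESS_PLATFORM (33) and NET_F_MAC (5), but which bits the simulator actually advertises is an assumption here.

```c
#include <stdint.h>

/* Bit positions used for illustration only. */
#define DEMO_F_ANY_LAYOUT	27
#define DEMO_F_VERSION_1	32
#define DEMO_F_ACCESS_PLATFORM	33
#define DEMO_NET_F_MAC		5

/* Features every simulated device offers. */
#define DEMO_FEATURES	((1ULL << DEMO_F_ANY_LAYOUT) | \
			 (1ULL << DEMO_F_VERSION_1)  | \
			 (1ULL << DEMO_F_ACCESS_PLATFORM))

/* A net device adds its own bits on top of the generic set. */
#define DEMO_NET_FEATURES	(DEMO_FEATURES | (1ULL << DEMO_NET_F_MAC))

/* Hypothetical attribute structure carrying the per-device mask. */
struct demo_feature_attr {
	uint64_t supported_features;
};

/* Keep only the driver-requested bits that this device supports. */
static uint64_t demo_negotiate(const struct demo_feature_attr *attr,
			       uint64_t driver_features)
{
	return driver_features & attr->supported_features;
}
```

With this layout, a future block device would define its own mask on top of DEMO_FEATURES without touching the generic code.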

Co-developed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-9-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

011c35ba

vdpa_sim: add device id field in vdpasim_dev_attr · 2f8f4618

Committed by Stefano Garzarella on Dec 15, 2020

Remove VDPASIM_DEVICE_ID macro and add 'id' field in vdpasim_dev_attr,
that will be returned by vdpasim_get_device_id().

Use VIRTIO_ID_NET for vDPA-net simulator device id.

Co-developed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-8-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

2f8f4618

vdpa_sim: add struct vdpasim_dev_attr for device attributes · 6c6e28fe

Committed by Stefano Garzarella on Dec 15, 2020

vdpasim_dev_attr will contain device-specific attributes. We start by
moving the number of virtqueues (i.e. nvqs) to vdpasim_dev_attr.

vdpasim_create() creates a new vDPA simulator following the device
attributes defined in the vdpasim_dev_attr parameter.

Co-developed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-7-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

6c6e28fe

vdpa_sim: rename vdpasim_config_ops variables · 36a9c306

Committed by Stefano Garzarella on Dec 15, 2020

These variables store generic callbacks used by the vDPA simulator
core, so we can remove the 'net' word in their names.

Co-developed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-6-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

36a9c306

vdpa_sim: make IOTLB entries limit configurable · 2fc0ebfa

Committed by Stefano Garzarella on Dec 15, 2020

Some devices may require a higher limit for the number of IOTLB
entries, so let's make it configurable through a module parameter.

By default, it's initialized with the current limit (2048).

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-5-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

2fc0ebfa

vdpa_sim: remove hard-coded virtq count · 423248d6

Committed by Max Gurtovoy on Dec 15, 2020

Add a new attribute that will define the number of virt queues to be
created for the vdpasim device.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
[sgarzare: replace kmalloc_array() with kcalloc()]
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-4-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

423248d6

vdpa_sim: remove unnecessary headers inclusion · cc3d4238

Committed by Stefano Garzarella on Dec 15, 2020

Some headers are not necessary, so let's remove them to do
some cleaning.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-3-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

cc3d4238

vdpa: remove unnecessary 'default n' in Kconfig entries · 29b90f92

Committed by Stefano Garzarella on Dec 15, 2020

'default n' is not necessary since it is already the default when
nothing is specified.

Suggested-by: Jason Wang <jasowang@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20201215144256.155342-2-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

29b90f92

vdpa: ifcvf: Use dma_set_mask_and_coherent to simplify code · 4d10367f

Committed by Christophe JAILLET on Nov 29, 2020

'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

While at it, fix a typo (s/confiugration/configuration)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20201129125434.1462638-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>

4d10367f

vhost_vdpa: switch to vmemdup_user() · 0ab4b890

Committed by Tian Tao on Nov 11, 2020

Replace open-coded alloc and copy with vmemdup_user().

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Link: https://lore.kernel.org/r/1605057288-60400-1-git-send-email-tiantao6@hisilicon.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

0ab4b890

virtio-mem: Big Block Mode (BBM) - safe memory hotunplug · 3711387a

Committed by David Hildenbrand on Nov 12, 2020

Let's add a safe mechanism to unplug memory, avoiding long/endless loops
when trying to offline memory - similar to SBM.

Fake-offline all memory (via alloc_contig_range()) before trying to
offline+remove it. Use this mode as the default, but allow enabling the
other mode explicitly (which could give better memory hotunplug guarantees
in some environments).

The "unsafe" mode can be enabled e.g., via virtio_mem.bbm_safe_unplug=0
on the cmdline.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-30-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

3711387a

virtio-mem: Big Block Mode (BBM) - basic memory hotunplug · 269ac938

Committed by David Hildenbrand on Nov 12, 2020

Let's try to unplug completely offline big blocks first. Then, (if
enabled via unplug_offline) try to offline and remove whole big blocks.

No locking necessary - we can deal with concurrent onlining/offlining
just fine.

Note1: This is sub-optimal and might be dangerous in some environments: we
could end up in an infinite loop when offlining (e.g., long-term pinnings),
similar to DIMMs. We'll introduce safe memory hotunplug via
fake-offlining next, and use this basic mode only when explicitly enabled.

Note2: Without ZONE_MOVABLE, memory unplug will be extremely unreliable
with bigger block sizes.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-29-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

269ac938

mm/memory_hotplug: extend offline_and_remove_memory() to handle more than one memory block · 8dc4bb58

Committed by David Hildenbrand on Nov 12, 2020

virtio-mem soon wants to use offline_and_remove_memory() on memory that
exceeds a single Linux memory block (memory_block_size_bytes()). Let's
remove that restriction.

Let's remember the old state and try to restore that if anything goes
wrong. While re-onlining can, in general, fail, it's highly unlikely to
happen (usually only when a notifier fails to allocate memory, and these
are rather rare).

This will be used by virtio-mem to offline+remove memory ranges that are
bigger than a single memory block - for example, with a device block
size of 1 GiB (e.g., gigantic pages in the hypervisor) and a Linux memory
block size of 128 MiB.

While we could compress the state into 2 bits, using 8 bits is much
easier.

This handling is similar to, but different from, acpi_scan_try_to_offline():

a) We don't try to offline twice. I am not sure if this CONFIG_MEMCG
optimization is still relevant - it should only apply to ZONE_NORMAL
(where we have no guarantees). If relevant, we can always add it.

b) acpi_scan_try_to_offline() simply onlines all memory in case
something goes wrong. It doesn't restore previous online type. Let's do
that, so we won't overwrite what e.g., user space configured.
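
The "remember the old state, restore it on failure" idea can be sketched in user space as below. This is a simulation under stated assumptions, not the kernel implementation: the demo_* names, the three state values and the can_offline flag are all hypothetical, and one byte per block is used for the saved state as the message suggests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { DEMO_OFFLINE = 0, DEMO_ONLINE_KERNEL = 1, DEMO_ONLINE_MOVABLE = 2 };

struct demo_block {
	uint8_t state;		/* one of the DEMO_* values */
	bool can_offline;	/* simulates whether offlining would succeed */
};

/*
 * Try to offline every block in [0, n). On failure, put the blocks that
 * were already offlined back into the state they had before the call,
 * instead of blindly onlining everything.
 */
static int demo_offline_all(struct demo_block *blocks, size_t n,
			    uint8_t *saved)
{
	size_t i;

	for (i = 0; i < n; i++)
		saved[i] = blocks[i].state;	/* one byte of state per block */

	for (i = 0; i < n; i++) {
		if (blocks[i].state == DEMO_OFFLINE)
			continue;
		if (!blocks[i].can_offline)
			goto rollback;
		blocks[i].state = DEMO_OFFLINE;
	}
	return 0;

rollback:
	/* Block i itself was never changed; restore blocks i-1 down to 0. */
	while (i-- > 0)
		blocks[i].state = saved[i];
	return -1;
}
```

Restoring the saved value rather than a fixed "online" type is what keeps a user-configured online type (for example movable) intact after a failed attempt.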

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-28-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>

8dc4bb58

virtio-mem: allow to force Big Block Mode (BBM) and set the big block size · faa45ff4

Committed by David Hildenbrand on Nov 12, 2020

Let's allow forcing BBM, even if subblocks would be possible. Take care
of properly calculating the first big block id, because the start
address might no longer be aligned to the big block size.

Also, allow manually configuring the size of Big Blocks.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-27-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

faa45ff4

virtio-mem: Big Block Mode (BBM) memory hotplug · 4ba50cd3

Committed by David Hildenbrand on Nov 12, 2020

Currently, we do not support device block sizes that exceed the Linux
memory block size. For example, having a device block size of 1 GiB (e.g.,
gigantic pages in the hypervisor) won't work with 128 MiB Linux memory
blocks.

Let's implement Big Block Mode (BBM), whereby we add/remove at least
one Linux memory block at a time. With a 1 GiB device block size, a Big
Block (BB) will cover 8 Linux memory blocks.
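
The "1 GiB big block covers 8 Linux memory blocks" figure follows from a simple division, shown in this small sketch. The helper names are hypothetical, and the sketch assumes the big block size is a whole multiple of the Linux memory block size.

```c
#include <stdint.h>

#define DEMO_MIB	(1024ULL * 1024ULL)
#define DEMO_GIB	(1024ULL * DEMO_MIB)

/* Number of Linux memory blocks covered by one big block. */
static uint64_t demo_mbs_per_bb(uint64_t bb_size, uint64_t mb_size)
{
	return bb_size / mb_size;
}

/* BBM is only needed when the device block size exceeds the memory block size. */
static int demo_needs_bbm(uint64_t device_block_size, uint64_t mb_size)
{
	return device_block_size > mb_size;
}
```

For example, demo_mbs_per_bb(DEMO_GIB, 128 * DEMO_MIB) evaluates to 8, matching the case described above.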

We'll keep registering the online_page_callback machinery, it will be used
for safe memory hotunplug in BBM next.

Note: BBM is properly prepared for variable-sized Linux memory
blocks that we might see in the future. So we won't care how many Linux
memory blocks a big block actually spans, and how the memory notifier is
called.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-26-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

4ba50cd3

virtio-mem: factor out adding/removing memory from Linux · 01afdee2

Committed by David Hildenbrand on Nov 12, 2020

Let's use wrappers for the low-level functions that call dev_dbg()/dev_warn()
and work on addr + size, such that we can reuse them for adding/removing
at other granularities.

We only warn when adding memory failed, because that's something to pay
attention to. We won't warn when removing failed, we'll reuse that in
racy context soon (and we do have proper BUG_ON() statements in the
current cases where it must never happen).

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-25-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

01afdee2

virtio-mem: memory notifier callbacks are specific to Sub Block Mode (SBM) · d46dfb62

Committed by David Hildenbrand on Nov 12, 2020

Let's rename accordingly.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-24-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

d46dfb62

virito-mem: existing (un)plug functions are specific to Sub Block Mode (SBM) · 602ef894

Committed by David Hildenbrand on Nov 12, 2020

Let's rename them accordingly. virtio_mem_plug_request() and
virtio_mem_unplug_request() will be handled separately.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-23-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

602ef894

virtio-mem: memory block ids are specific to Sub Block Mode (SBM) · 8a6f082b

Committed by David Hildenbrand on Nov 12, 2020

Let's move first_mb_id/next_mb_id/last_usable_mb_id accordingly.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-22-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

8a6f082b

virtio-mem: nb_sb_per_mb and subblock_size are specific to Sub Block Mode (SBM) · 905c4c51

Committed by David Hildenbrand on Nov 12, 2020

Let's rename to "sbs_per_mb" and "sb_size" and move accordingly.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-21-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

905c4c51

virito-mem: subblock states are specific to Sub Block Mode (SBM) · 54c6a6ba

Committed by David Hildenbrand on Nov 12, 2020

Let's rename and move accordingly. While at it, rename sb_bitmap to
"sb_states".

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-20-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

54c6a6ba

virtio-mem: memory block states are specific to Sub Block Mode (SBM) · 99f0b55e

Committed by David Hildenbrand on Nov 12, 2020

Let's use a new "sbm" sub-struct to hold SBM-specific state and rename +
move applicable definitions, functions, and variables (related to
memory block states).

While at it:
- Drop the "_STATE" part from memory block states
- Rename "nb_mb_state" to "mb_count"
- "set_mb_state" / "get_mb_state" vs. "mb_set_state" / "mb_get_state"
- Don't use lengthy "enum virtio_mem_smb_mb_state", simply use "uint8_t"

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-19-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

99f0b55e

virito-mem: document Sub Block Mode (SBM) · d5614944

Committed by David Hildenbrand on Nov 12, 2020

Let's add some documentation for the current mode - Sub Block Mode (SBM) -
to prepare for a new mode - Big Block Mode (BBM).

Follow-up patches will properly factor out the existing Sub Block Mode
(SBM) and implement Big Block Mode (BBM).

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-18-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

d5614944

virtio-mem: generalize handling when memory is getting onlined deferred · 98ff9f94

Committed by David Hildenbrand on Nov 12, 2020

We don't want to add too much memory when it's not getting onlined
immediately, to avoid running OOM. Generalize the handling, to avoid
making use of memory block states. Use a threshold of 1 GiB for now.

Properly adjust the offline size when adding/removing memory. As we are
not always protected by a lock when touching the offline size, use an
atomic64_t. We don't care about races (e.g., someone offlining memory
while we are adding more), only about consistent values.

(1 GiB needs a memmap of ~16MiB - which sounds reasonable even for
setups with little boot memory and (possibly) one virtio-mem device per
node)
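
The ~16 MiB memmap estimate for 1 GiB can be reproduced with the arithmetic below. The page size and struct page size are assumptions (4 KiB base pages and a 64-byte struct page, as is common on x86-64); other configurations would give different numbers, and the helper name is hypothetical.

```c
#include <stdint.h>

/* Assumed values: 4 KiB base pages and a 64-byte struct page. */
#define DEMO_PAGE_SIZE		4096ULL
#define DEMO_STRUCT_PAGE_SIZE	64ULL

/* Bytes of memmap needed to describe mem_bytes of memory. */
static uint64_t demo_memmap_bytes(uint64_t mem_bytes)
{
	return mem_bytes / DEMO_PAGE_SIZE * DEMO_STRUCT_PAGE_SIZE;
}
```

Under these assumptions, 1 GiB is 262144 pages, and 262144 * 64 bytes is exactly 16 MiB.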

We don't want to retrigger when onlining is caused immediately by our
action (e.g., adding memory which immediately gets onlined), so use a
flag to indicate if the workqueue is active and use that as an
indicator whether to trigger a retry. This will also be especially relevant
for Big Block Mode (BBM), whereby we might re-online memory in case
offlining of another memory block failed.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-17-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

98ff9f94

virtio-mem: don't always trigger the workqueue when offlining memory · 1d33c2ca

Committed by David Hildenbrand on Nov 12, 2020

Let's trigger from offlining code only when we're not allowed to unplug
online memory. Handle the other case (memmap possibly freeing up another
memory block) when actually removing memory. We now also properly handle
the case when removing already offline memory blocks via
virtio_mem_mb_remove(). When removing via virtio_mem_remove(), when
unloading the driver, virtio_mem_retry() is a NOP and safe to use.

While at it, move retry handling when offlining out of
virtio_mem_notify_offline(), to share it with Big Block Mode (BBM)
soon.

This is a preparation for Big Block Mode (BBM), whereby we can see some
temporary offlining of memory blocks without actually making progress.
Imagine you have a Big Block that spans two Linux memory blocks. Assume
the first Linux memory block has no unmovable data on it. When we would
call offline_and_remove_memory() on the big block, we would
	1. Try to offline the first block. Works, notifiers triggered.
	   virtio_mem_retry() called.
	2. Try to offline the second block. Does not work.
	3. Re-online first block.
	4. Exit to main loop, exit workqueue.
	5. Retry immediately (due to virtio_mem_retry()), go to 1.
The result is endless retries.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-16-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

1d33c2ca

virtio-mem: drop last_mb_id · 42006682

Committed by David Hildenbrand on Nov 12, 2020

No longer used, let's drop it.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-15-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

42006682

virtio-mem: generalize virtio_mem_overlaps_range() · 835491c5

Committed by David Hildenbrand on Nov 12, 2020

Avoid using memory block ids. While at it, use uint64_t for
address/size.

This is a preparation for Big Block Mode (BBM).
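
An overlap check that works on a plain address and size, rather than on memory block ids, might look like the sketch below. The demo_region structure and function name are hypothetical, and the sketch assumes that start + size and addr + size do not overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the device-managed region. */
struct demo_region {
	uint64_t addr;
	uint64_t size;
};

/* True if [start, start + size) intersects [r->addr, r->addr + r->size). */
static bool demo_overlaps_range(const struct demo_region *r,
				uint64_t start, uint64_t size)
{
	return start < r->addr + r->size && r->addr < start + size;
}
```

Because both ranges are half-open, a range that merely touches the end of the region does not count as overlapping.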

Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-14-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

835491c5

virtio-mem: generalize virtio_mem_owned_mb() · 8464e3bd

Committed by David Hildenbrand on Nov 12, 2020

Avoid using memory block ids. Rename it to virtio_mem_contains_range().

This is a preparation for Big Block Mode (BBM).
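
A containment check expressed on address and size, as the rename to virtio_mem_contains_range() suggests, could be sketched as follows. The demo_dev_region structure and function name are hypothetical, and the sketch assumes the address arithmetic does not overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the device-managed region. */
struct demo_dev_region {
	uint64_t addr;
	uint64_t size;
};

/* True if [start, start + size) lies completely inside the region. */
static bool demo_contains_range(const struct demo_dev_region *r,
				uint64_t start, uint64_t size)
{
	return start >= r->addr && start + size <= r->addr + r->size;
}
```

A memory block is then "owned" by the device exactly when its whole address range passes this check, independent of how blocks are numbered.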

Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-13-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

8464e3bd

virtio-mem: generalize check for added memory · 989ff825

Committed by David Hildenbrand on Nov 12, 2020

Let's check by traversing busy system RAM resources instead, to avoid
relying on memory block states.

Don't use walk_system_ram_range(), as that works on pages and we want to
use the bare addresses we have easily at hand.

This is a preparation for Big Block Mode (BBM), which won't have memory
block states.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-12-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

989ff825

openeuler / Kernel · synced successfully more than 1 year ago
