Commits · 09102704c67457c6cdea6c0394c34843484a852c · openeuler / Kernel

10 Jun, 2020 1 commit

mmap locking API: use coccinelle to convert mmap_sem rwsem call sites · d8ed45c5

Committed by Michel Lespinasse on Jun 08, 2020

This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.

The change is generated using coccinelle with the following rule:

// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d8ed45c5

09 Jun, 2020 2 commits

vhost/test: fix up after API change · 044e4b09

Committed by Michael S. Tsirkin on Jun 08, 2020

Pass a flag to request kernel thread use.

Fixes: 01fcb1cb ("vhost: allow device that does not depend on vhost worker")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

044e4b09

vhost: convert get_user_pages() --> pin_user_pages() · 690623e1

Committed by John Hubbard on Jun 07, 2020

This code was using get_user_pages*(), in approximately a "Case 5"
scenario (accessing the data within a page), using the categorization
from [1].  That means that it's time to convert the get_user_pages*() +
put_page() calls to pin_user_pages*() + unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small part
of fixing a long-standing disconnect between pinning pages, and file
systems' use of those pages.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
    https://lwn.net/Articles/807108/

Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: http://lkml.kernel.org/r/20200529234309.484480-3-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

690623e1

07 Jun, 2020 2 commits

vhost: replace -1 with VHOST_FILE_UNBIND in ioctls · e0136c16

Committed by Zhu Lingshan on Jun 05, 2020

This commit replaces -1 with VHOST_FILE_UNBIND in ioctls since
we have added such a macro in the uapi header for vdpa_host.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/1591352835-22441-5-git-send-email-lingshan.zhu@intel.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

e0136c16

vhost_vdpa: Support config interrupt in vdpa · 776f3950

Committed by Zhu Lingshan on Jun 05, 2020

This commit implements config interrupt support in
the vhost_vdpa layer.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/1591352835-22441-4-git-send-email-lingshan.zhu@intel.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

776f3950

05 Jun, 2020 5 commits

vhost: (cosmetic) remove a superfluous variable initialisation · 002ef18e

Committed by Guennadi Liakhovetski on May 27, 2020

Even the compiler is able to figure out that in this case the
initialisation is superfluous.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Link: https://lore.kernel.org/r/20200527180541.5570-3-guennadi.liakhovetski@linux.intel.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

002ef18e

vhost_vdpa: disable doorbell mapping for !MMU · 4b4e4867

Committed by Michael S. Tsirkin on Jun 04, 2020

There could be ways to support doorbell mapping with !MMU, but things
like pgprot_noncached are not universally supported.
Fixable, but just disable this for now.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

4b4e4867

vhost_vdpa: support doorbell mapping via mmap · ddd89d0a

Committed by Jason Wang on May 29, 2020

Currently the doorbell is relayed via eventfd, which may have
significant overhead because of the cost of vmexits or syscalls. This
patch introduces mmap() based doorbell mapping, which can eliminate the
overhead caused by vmexits or syscalls.

To ease the userspace modeling of the doorbell layout (usually
virtio-pci), this patch starts from a doorbell-per-page
model. Vhost-vdpa supports only hardware doorbells that sit at the
boundary of a page and do not share the page with other registers.

The doorbell of each virtqueue must be mapped separately; pgoff is the
index of the virtqueue. This allows userspace to map a subset of the
doorbells, which may be useful for the implementation of a software
assisted virtqueue (control vq) in the future.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200529080303.15449-5-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

ddd89d0a
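
As a rough userspace illustration of the doorbell-per-page model in ddd89d0a, the sketch below maps one doorbell page per virtqueue, using the virtqueue index as the page offset. The helper names `doorbell_offset` and `map_doorbell` are hypothetical, and the write-only shared mapping is an assumption about what the driver accepts; `vdpa_fd` is assumed to be an already opened vhost-vdpa character device.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* pgoff is the virtqueue index, so the byte offset is qidx whole pages. */
static off_t doorbell_offset(unsigned int qidx, long page_size)
{
    assert(page_size > 0);
    return (off_t)qidx * (off_t)page_size;
}

/*
 * Map the doorbell page of virtqueue qidx (hypothetical helper).
 * Returns MAP_FAILED on error, exactly as mmap() does.
 */
static void *map_doorbell(int vdpa_fd, unsigned int qidx)
{
    long page_size = sysconf(_SC_PAGESIZE);

    if (page_size <= 0)
        return MAP_FAILED;

    /* Assumption: one page per doorbell, write-only and shared. */
    return mmap(NULL, (size_t)page_size, PROT_WRITE, MAP_SHARED,
                vdpa_fd, doorbell_offset(qidx, page_size));
}
```

Because each doorbell is mapped on its own, a caller can map only the queues it wants to kick directly and keep relaying the others through eventfd.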

vhost: use mmgrab() instead of mmget() for non worker device · 5ce995f3

Committed by Jason Wang on May 29, 2020

For a device that doesn't use the vhost worker and use_mm(), mmget() is
too heavyweight and may cause trouble when implementing mmap()
support for vDPA devices.

This is because a reference to the address space was held via
mmget() in vhost_dev_set_owner() and a reference to the file was
held in mmap(). This means that when the process exits, the mm cannot be
released, and thus we cannot release the file.

This patch tries to use mmgrab() instead of mmget(), which allows the
address space to be destroyed on process exit without releasing the mm
structure itself. This is sufficient for vDPA devices, which pin user
pages and do not depend on the address space to work.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200529080303.15449-3-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

5ce995f3
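
The difference between the two reference types in 5ce995f3 can be sketched with a toy model. This is not kernel code: `struct mm_model` and the `*_model` functions are hypothetical, and they only mimic the idea that one counter keeps the address space alive while the other keeps the structure itself alive.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two mm reference counts (assumption, not kernel code). */
struct mm_model {
    int mm_users;             /* references to the address space */
    int mm_count;             /* references to the structure itself */
    bool address_space_alive;
    bool struct_alive;
};

static void mm_model_init(struct mm_model *mm)
{
    mm->mm_users = 1;         /* the owning process */
    mm->mm_count = 1;         /* held on behalf of all mm_users */
    mm->address_space_alive = true;
    mm->struct_alive = true;
}

static void mmgrab_model(struct mm_model *mm)
{
    mm->mm_count++;
}

static void mmdrop_model(struct mm_model *mm)
{
    assert(mm->mm_count > 0);
    if (--mm->mm_count == 0)
        mm->struct_alive = false;
}

static void mmget_model(struct mm_model *mm)
{
    mm->mm_users++;
}

static void mmput_model(struct mm_model *mm)
{
    assert(mm->mm_users > 0);
    if (--mm->mm_users == 0) {
        mm->address_space_alive = false; /* mappings are torn down */
        mmdrop_model(mm);
    }
}
```

In this model, a device that called `mmget_model()` keeps the address space alive after the process drops its own reference, whereas a device that called `mmgrab_model()` only keeps the structure around, which is the behaviour the commit relies on.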

vhost: allow device that does not depend on vhost worker · 01fcb1cb

Committed by Jason Wang on May 29, 2020

The vDPA device currently relays the eventfd via the vhost worker. This is
inefficient due to the latency of wakeup and scheduling, so this patch
introduces a use_worker attribute for the vhost device. When
use_worker is not set with vhost_dev_init(), vhost won't try to
allocate a worker thread and the vhost_poll will be processed directly
in the wakeup function.

This helps vDPA since it reduces the latency caused by the vhost worker.

In my testing, it saves 0.2 ms in pings between VMs on a mutual host.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200529080303.15449-2-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

01fcb1cb
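
The dispatch decision described in 01fcb1cb can be modelled in a few lines. This is a simplified sketch, not the vhost code: `struct dev_model`, `poll_wakeup_model` and `worker_run_model` are hypothetical names, and the single-slot queue stands in for the real work list. Only the `use_worker` attribute name is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*work_fn)(void *ctx);

struct dev_model {
    bool use_worker;   /* mirrors the attribute named in the commit */
    work_fn queued;    /* single-slot "work list" for this sketch */
    void *queued_ctx;
};

/* Wakeup path: either defer to the worker or handle the work in place. */
static void poll_wakeup_model(struct dev_model *dev, work_fn fn, void *ctx)
{
    assert(fn != NULL);
    if (dev->use_worker) {
        dev->queued = fn;      /* runs later, after a scheduling delay */
        dev->queued_ctx = ctx;
    } else {
        fn(ctx);               /* processed directly in the wakeup function */
    }
}

/* What the worker thread would do once it gets scheduled. */
static void worker_run_model(struct dev_model *dev)
{
    if (dev->queued != NULL) {
        work_fn fn = dev->queued;

        dev->queued = NULL;
        fn(dev->queued_ctx);
    }
}

/* Example work item: counts how many times it ran. */
static void count_kick(void *ctx)
{
    ++*(int *)ctx;
}
```

The latency saving reported in the commit corresponds to skipping the gap between `poll_wakeup_model()` and `worker_run_model()` in this model.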

02 Jun, 2020 2 commits

vhost: revert "vhost: disable for OABI" · 213e7721

Committed by Michael S. Tsirkin on Apr 23, 2020

This reverts commit d085eb8c ("vhost: disable for OABI").
With commit "virtio: force spec specified alignment on types"
in place, we force proper alignment for all structures,
so there's no longer a reason to blacklist OABI.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

213e7721

virtio: force spec specified alignment on types · a865e420

Committed by Michael S. Tsirkin on Apr 06, 2020

The ring element addresses are passed between components with different
alignment assumptions. Thus, if guest/userspace selects a pointer and
the host then gets and dereferences it, we might need to decrease the
compiler-selected alignment to prevent the compiler on the host from
assuming the pointer is aligned.

This actually triggers on ARM with -mabi=apcs-gnu - which is a
deprecated configuration, but it seems safer to handle this
generally.

Note that userspace that allocates the memory is actually OK and does
not need to be fixed, but userspace that gets it from the guest or another
process does need to be fixed. The latter doesn't generally talk to the
kernel, so while it might be buggy it's not talking to the kernel in the
buggy way - it's just using the header in the buggy way - so fixing the
header and asking userspace to recompile is the best we can do.

I verified that the produced kernel binary on x86 is exactly identical
before and after the change.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>

a865e420

27 May, 2020 1 commit

scsi: vhost: Notify TCM about the maximum sg entries supported per command · 5ae6a6a9

Committed by Sudhakar Panneerselvam on May 22, 2020

vhost-scsi pre-allocates the maximum number of sg entries per command and
fails any command that requires more than VHOST_SCSI_PREALLOC_SGLS
entries. This patch lets vhost communicate the max sg limit
when it registers vhost_scsi_ops with TCM. With this change, TCM would
report the max sg entries through "Block Limits" VPD page which will be
typically queried by the SCSI initiator during device discovery. By knowing
this limit, the initiator could ensure the maximum transfer length is less
than or equal to what is reported by vhost-scsi.

Link: https://lore.kernel.org/r/1590166317-953-1-git-send-email-sudhakar.panneerselvam@oracle.com
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

5ae6a6a9

15 May, 2020 2 commits

vhost: missing __user tags · 1b0be99f

Committed by Michael S. Tsirkin on May 15, 2020

sparse warns about converting void * to void __user *. This is not new
but only got noticed now that vhost is built on more systems.
This is just a question of __user tags missing in a couple of places,
so fix it up.

Fixes: f8894913 ("vhost: introduce O(1) vq metadata cache")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

1b0be99f

vhost_net: Also populate XDP frame size · 05afee29

Committed by Jesper Dangaard Brouer on May 14, 2020

In vhost_net_build_xdp() the 'buf' that gets queued via an xdp_buff
has an embedded struct tun_xdp_hdr (located at xdp->data_hard_start)
which contains the buffer length 'buflen' (with tailroom for
skb_shared_info). Also storing this buflen in xdp->frame_sz does not
obsolete struct tun_xdp_hdr, as it also contains a struct
virtio_net_hdr with other information.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/bpf/158945343928.97035.4620233649151726289.stgit@firesoul

05afee29

02 May, 2020 1 commit

vhost: vsock: kick send_pkt worker once device is started · 0b841030

Committed by Jia He on May 01, 2020

Ning Bo reported an abnormal 2-second gap when booting Kata container [1].
The unconditional timeout was caused by VSOCK_DEFAULT_CONNECT_TIMEOUT of
connecting from the client side. The vhost vsock client tries to connect
an initializing virtio vsock server.

The abnormal flow looks like:
host-userspace           vhost vsock                       guest vsock
==============           ===========                       ============
connect()     -------->  vhost_transport_send_pkt_work()   initializing
   |                     vq->private_data==NULL
   |                     will not be queued
   V
schedule_timeout(2s)
                         vhost_vsock_start()  <---------   device ready
                         set vq->private_data

wait for 2s and failed
connect() again          vq->private_data!=NULL         recv connecting pkt

Details:
1. Host userspace sends a connect pkt while the guest vsock is still
   initializing, so vhost_vsock_start has not been called yet. Thus
   vq->private_data==NULL, and the pkt is not queued for sending to the guest.
2. Then it sleeps for 2s.
3. After the guest vsock finishes initializing, vq->private_data is set.
4. When host userspace wakes up after 2s, it sends the connecting pkt again
   and everything is fine.

As suggested by Stefano Garzarella, this fixes it by additionally kicking the
send_pkt worker in vhost_vsock_start once the virtio device is started. This
makes the pending pkt get sent again.

After this patch, kata-runtime (with vsock enabled) boot time is reduced
from 3s to 1s on a ThunderX2 arm64 server.

[1] https://github.com/kata-containers/runtime/issues/1917

Reported-by: Ning Bo <n.b@live.com>
Suggested-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jia He <justin.he@arm.com>
Link: https://lore.kernel.org/r/20200501043840.186557-1-justin.he@arm.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

0b841030
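
The race described in 0b841030 can be reproduced in a small model. Everything here is hypothetical (`struct vsock_model` and the `*_model` functions); it only mirrors the sequence from the commit message: a packet is sent while `private_data` is still NULL, and the device start either kicks the send worker (patched) or does not (original).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct vsock_model {
    void *private_data; /* NULL until the device is started */
    int pending;        /* packets waiting on the send list */
    int delivered;      /* packets handed over to the guest */
};

/* Send worker: does nothing while the virtqueue is not started. */
static void send_pkt_work_model(struct vsock_model *v)
{
    assert(v != NULL);
    if (v->private_data == NULL)
        return;
    v->delivered += v->pending;
    v->pending = 0;
}

/* Host side queues a packet and schedules the worker. */
static void send_pkt_model(struct vsock_model *v)
{
    v->pending++;
    send_pkt_work_model(v);
}

/* Device start; 'kick' selects the patched behaviour. */
static void start_model(struct vsock_model *v, bool kick)
{
    v->private_data = v;
    if (kick)
        send_pkt_work_model(v);
}
```

Without the kick, the packet stays pending until something else triggers the worker, which in the reported case was the retry after the 2-second connect timeout.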

28 Apr, 2020 2 commits

vsock/virtio: fix multiple packet delivery to monitoring devices · a78d1639

Committed by Stefano Garzarella on Apr 24, 2020

In virtio_transport.c, if the virtqueue is full, the transmitting
packet is queued up and it will be sent in the next iteration.
This causes the same packet to be delivered multiple times to
monitoring devices.

We want to continue delivering each packet to monitoring devices before
it is put in the virtqueue, so that replies cannot appear in the
packet capture before the transmitted packet.

This patch fixes the issue, adding a new flag (tap_delivered) in
struct virtio_vsock_pkt, to check if the packet is already delivered
to monitoring devices.

In vhost/vsock.c, we are splitting packets, so we must set
'tap_delivered' to false when we queue up the same virtio_vsock_pkt
to handle the remaining bytes.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a78d1639
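
The flag-based fix in a78d1639 can be sketched as follows. The structure and function names are hypothetical; only the `tap_delivered` field name comes from the commit message. The model retries a packet when the virtqueue is full and shows that the monitoring device sees it once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct pkt_model {
    bool tap_delivered; /* set once the packet reached monitoring devices */
};

/* Deliver to monitoring devices at most once per packet. */
static void deliver_tap_model(struct pkt_model *pkt, int *tap_count)
{
    assert(pkt != NULL && tap_count != NULL);
    if (pkt->tap_delivered)
        return;
    ++*tap_count;
    pkt->tap_delivered = true;
}

/*
 * One transmit attempt: tap first, then try to enqueue.
 * Returns false when the virtqueue is full and the packet must be retried.
 */
static bool try_send_model(struct pkt_model *pkt, bool vq_full, int *tap_count)
{
    deliver_tap_model(pkt, tap_count);
    return !vq_full;
}
```

When a packet is split and the same object is requeued for its remaining bytes, clearing `tap_delivered` makes the remainder visible to the capture again, which matches the vhost/vsock.c note in the commit message.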

vhost/vsock: fix packet delivery order to monitoring devices · 107bc076

Committed by Stefano Garzarella on Apr 24, 2020

We want to deliver each packet to monitoring devices before it is
put in the virtqueue, so that replies cannot appear in the
packet capture before the transmitted packet.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

107bc076

20 Apr, 2020 1 commit

vhost: disable for OABI · d085eb8c

Committed by Michael S. Tsirkin on Apr 06, 2020

vhost is currently broken on some ARM configs.

The reason is that the ring element addresses are passed between
components with different alignment assumptions. Thus, if the
guest selects a pointer and the host then gets and dereferences
it, the alignment assumed by the host's compiler might be
greater than the actual alignment of the pointer.

This actually triggers on ARM with -mabi=apcs-gnu - which is a
deprecated configuration. With this OABI, compiler assumes that
all structures are 4 byte aligned - which is stronger than
virtio guarantees for available and used rings, which are
merely 2 bytes. Thus a guest without -mabi=apcs-gnu running
on top of host with -mabi=apcs-gnu will be broken.

The correct fix is to force alignment of structures - however
that is an intrusive fix that's best deferred until the next release.

We didn't previously support such ancient systems at all - this surfaced
after vdpa support prompted removing the dependency of vhost on
VIRTUALIZATION. So for now, let's just add something along the lines of

	depends on !ARM || AEABI

to the virtio Kconfig declaration, and add a comment that it has to do
with struct member alignment.

Note: we can't make VHOST and VHOST_RING themselves have
a dependency since these are selected. Add a new symbol for that.

We should be able to drop this dependency down the road.

Fixes: 20c384f1 ("vhost: refine vhost and vringh kconfig")
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Suggested-by: Richard Earnshaw <Richard.Earnshaw@arm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

d085eb8c

17 Apr, 2020 5 commits

vdpa: make vhost, virtio depend on menu · 58ad1372

Committed by Michael S. Tsirkin on Apr 12, 2020

If the user did not configure any vdpa drivers, neither vhost
nor virtio vdpa are going to be useful. So there's no point
in prompting for these and selecting vdpa core automatically.
Simplify configuration by making virtio and vhost vdpa
drivers depend on vdpa menu entry. Once done, we no longer
need a separate menu entry, so also get rid of this.
While at it, fix up the IFC entry: VDPA->vDPA for consistency
with other places.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>

58ad1372

virtio/test: fix up after IOTLB changes · 3302363a

Committed by Michael S. Tsirkin on Apr 01, 2020

Allow building vringh without IOTLB (that's the case for userspace
builds, will be useful for CAIF/VOD down the road too).
Update for API tweaks.
Don't include vringh with userspace builds.

Cc: Jason Wang <jasowang@redhat.com>
Cc: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>

3302363a

vhost: Create accessors for virtqueues private_data · 247643f8

Committed by Eugenio Pérez on Mar 31, 2020

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Link: https://lore.kernel.org/r/20200331192804.6019-2-eperezma@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

247643f8

vhost: remove set but not used variable 'status' · e373f3d7

Committed by Jason Yan on Apr 02, 2020

Fix the following gcc warning:
drivers/vhost/vdpa.c:299:5: warning: variable 'status' set but not used [-Wunused-but-set-variable]
  u8 status;
     ^~~~~~

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Link: https://lore.kernel.org/r/20200402065106.20108-1-yanaijie@huawei.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

e373f3d7

vhost: vdpa: remove unnecessary null check · aa21c2e7

Committed by Gustavo A. R. Silva on Mar 30, 2020

container_of is never null, so this null check is
unnecessary.

Addresses-Coverity-ID: 1492006 ("Logically dead code")
Fixes: 20453a45fb06 ("vhost: introduce vDPA-based backend")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://lore.kernel.org/r/20200330235040.GA9997@embeddedor
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>

aa21c2e7
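
The reasoning behind aa21c2e7 is easier to see with a simplified version of the macro. The sketch below uses a hypothetical `container_of_sketch` built on `offsetof`, without the kernel's type checking, plus made-up structure names. The result is pure pointer arithmetic on a member address, so for a valid member pointer it always points at the enclosing object and cannot be NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of without the kernel's type checks (sketch only). */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct inner_sketch {
    int id;
};

struct outer_sketch {
    long pad;                   /* pushes 'inner' to a non-zero offset */
    struct inner_sketch inner;
};

/* Recover the enclosing object from a pointer to its embedded member. */
static struct outer_sketch *to_outer(struct inner_sketch *in)
{
    assert(in != NULL);
    return container_of_sketch(in, struct outer_sketch, inner);
}
```

A NULL check on the returned pointer is therefore dead code, which is what Coverity reported.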

02 Apr, 2020 5 commits

vhost: introduce vDPA-based backend · 4c8cf318

Committed by Tiwei Bie on Mar 26, 2020

This patch introduces a vDPA-based vhost backend. This backend is
built on top of the same interface defined in virtio-vDPA and provides
a generic vhost interface for userspace to accelerate the virtio
devices in guest.

This backend is implemented as a vDPA device driver on top of the same
ops used in virtio-vDPA. It will create char device entry named
vhost-vdpa-$index for userspace to use. Userspace can use vhost ioctls
on top of this char device to setup the backend.

Vhost ioctls are extended to make them type agnostic and behave like a
virtio device. This helps to eliminate type-specific APIs like those
of vhost_net/scsi/vsock:

- VHOST_VDPA_GET_DEVICE_ID: get the virtio device ID, which is defined
by the virtio specification to distinguish between types of devices
- VHOST_VDPA_GET_VRING_NUM: get the maximum size of virtqueue
supported by the vDPA device
- VHOST_VDPA_SET/GET_STATUS: set and get virtio status of vDPA device
- VHOST_VDPA_SET/GET_CONFIG: access virtio config space
- VHOST_VDPA_SET_VRING_ENABLE: enable a specific virtqueue

For memory mapping, IOTLB API is mandated for vhost-vDPA which means
userspace drivers are required to use
VHOST_IOTLB_UPDATE/VHOST_IOTLB_INVALIDATE to add or remove mapping for
a specific userspace memory region.

The vhost-vDPA API is designed to be type agnostic, but it allows only net
devices at the current stage. Due to the lack of control virtqueue
support, some features are filtered out by vhost-vdpa.

We will enable more features and devices in the near future.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200326140125.19794-8-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

4c8cf318

vringh: IOTLB support · 9ad9c49c

Committed by Jason Wang on Mar 26, 2020

This patch implements the third memory accessor for vringh besides the
current kernel and userspace accessors. The idea is to allow vringh
to do the address translation through an IOTLB which is implemented
via the vhost_map interval tree. Users should set up an IOVA to PA mapping
in this IOTLB.

This allows us to:

- Use vringh to access virtqueues with vIOMMU
- Use vringh to implement software virtqueues for vDPA devices

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200326140125.19794-5-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

9ad9c49c

vhost: factor out IOTLB · 0bbe3066

Committed by Jason Wang on Mar 26, 2020

This patch factors out IOTLB into a dedicated module in order to be
reused by other modules like vringh. Users may choose to enable
automatic retiring by specifying the VHOST_IOTLB_FLAG_RETIRE flag to fit
the case of the vhost device IOTLB implementation.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200326140125.19794-4-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

0bbe3066

vhost: allow per device message handler · 792a4f2e

Committed by Jason Wang on Mar 26, 2020

This patch allows a device to register its own message handler during
vhost_dev_init(). The vDPA device will use it to implement its own DMA
mapping logic.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200326140125.19794-3-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

792a4f2e

vhost: refine vhost and vringh kconfig · 20c384f1

Committed by Jason Wang on Mar 26, 2020

Currently, CONFIG_VHOST depends on CONFIG_VIRTUALIZATION. But vhost is
not necessarily for VM since it's a generic userspace and kernel
communication protocol. Such dependency may prevent archs without
virtualization support from using vhost.

To solve this, a dedicated vhost menu is created under drivers so
CONFIG_VHOST can be decoupled from CONFIG_VIRTUALIZATION.

While at it, also squash Kconfig.vringh into vhost Kconfig file. This
avoids the trick of conditional inclusion from VOP or CAIF. Then it
will be easier to introduce new vringh users and common dependency for
both vringh and vhost.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20200326140125.19794-2-jasowang@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

20c384f1

23 Feb, 2020 1 commit

vhost: Check socket sk_family instead of call getname · 42d84c84

Committed by Eugenio Pérez on Feb 21, 2020

Doing so, we save one call to get data we already have in the struct.

Also, since there is no guarantee that getname does not write beyond the
size of the sockaddr_ll parameter, we add a little bit of security here.
It should not go beyond MAX_ADDR_LEN, but syzbot found that
ax25_getname writes more (72 bytes, the size of full_sockaddr_ax25,
versus 20 + 32 bytes of sockaddr_ll + MAX_ADDR_LEN in syzbot repro).

Fixes: 3a4d5c94 ("vhost_net: a kernel-level virtio server")
Reported-by: syzbot+f2a62d07a5198c819c7b@syzkaller.appspotmail.com
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

42d84c84

08 Dec, 2019 1 commit

vhost/vsock: accept only packets with the right dst_cid · 8a3cc29c

Committed by Stefano Garzarella on Dec 06, 2019

When we receive a new packet from the guest, we check if the
src_cid is correct, but we forgot to check the dst_cid.

The host should accept only packets where dst_cid is
equal to the host CID.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8a3cc29c
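
The acceptance check described in 8a3cc29c boils down to comparing both CIDs in the packet header. The following is a minimal sketch with hypothetical names (`struct pkt_hdr_sketch`, `accept_from_guest`, `HOST_CID_SKETCH`); it assumes the host CID is 2, the value of `VMADDR_CID_HOST`, and ignores the byte-order conversion a real header would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HOST_CID_SKETCH 2ULL /* assumption: same value as VMADDR_CID_HOST */

/* Only the two fields relevant to the check (hypothetical layout). */
struct pkt_hdr_sketch {
    uint64_t src_cid;
    uint64_t dst_cid;
};

/* Accept a packet only if it comes from our guest and targets the host. */
static bool accept_from_guest(const struct pkt_hdr_sketch *hdr,
                              uint64_t guest_cid)
{
    assert(hdr != NULL);
    return hdr->src_cid == guest_cid && hdr->dst_cid == HOST_CID_SKETCH;
}
```

Before the fix, only the first half of this condition was evaluated, so a guest could address packets to a CID other than the host's.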

05 Dec, 2019 1 commit

vhost, kcov: collect coverage from vhost_worker · 8f6a7f96

Committed by Andrey Konovalov on Dec 04, 2019

Add kcov_remote_start()/kcov_remote_stop() annotations to the
vhost_worker() function, which is responsible for processing vhost
works.

Since vhost_worker() threads are spawned per vhost device instance, the
common kcov handle is used for kcov_remote_start()/stop() annotations
(see Documentation/dev-tools/kcov.rst for details).  As a result, kcov
can now be used to collect coverage from vhost worker threads.

Link: http://lkml.kernel.org/r/e49d5d154e5da6c9ada521d2b7ce10a49ce9f98b.1572366574.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Alexander Potapenko <glider@google.com>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8f6a7f96

15 Nov, 2019 5 commits

vhost/vsock: refuse CID assigned to the guest->host transport · ed8640a9

Committed by Stefano Garzarella on Nov 14, 2019

In a nested VM environment, we have to refuse to assign to a nested
guest the same CID assigned to our guest->host transport.
In this way, the user can use the local CID for loopback.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ed8640a9

vsock: prevent transport modules unloading · 6a2c0962

Committed by Stefano Garzarella on Nov 14, 2019

This patch adds a 'module' member to the 'struct vsock_transport'
in order to get/put the transport module. This prevents the
module from unloading while sockets are assigned to it.

We increase the module refcnt when a socket is assigned to a
transport, and we decrease the module refcnt when the socket
is destructed.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6a2c0962

vsock: add multi-transports support · c0cfa2d8

Committed by Stefano Garzarella on Nov 14, 2019

This patch adds the support of multiple transports in the
VSOCK core.

With the multi-transports support, we can use vsock with nested VMs
(using also different hypervisors) loading both guest->host and
host->guest transports at the same time.

Major changes:
- vsock core module can be loaded regardless of the transports
- vsock_core_init() and vsock_core_exit() are renamed to
  vsock_core_register() and vsock_core_unregister()
- vsock_core_register() has a feature parameter (H2G, G2H, DGRAM)
  to identify which directions the transport can handle and whether it
  supports DGRAM (only vmci)
- each stream socket is assigned to a transport when the remote CID
  is set (during the connect() or when we receive a connection request
  on a listener socket).
  The remote CID is used to decide which transport to use:
  - remote CID <= VMADDR_CID_HOST will use guest->host transport;
  - remote CID == local_cid (guest->host transport) will use guest->host
    transport for loopback (host->guest transports don't support loopback);
  - remote CID > VMADDR_CID_HOST will use host->guest transport;
- listener sockets are not bound to any transports since no transport
  operations are done on it. In this way we can create a listener
  socket, also if the transports are not loaded or with VMADDR_CID_ANY
  to listen on all transports.
- DGRAM sockets are handled as before, since only the vmci_transport
  provides this feature.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c0cfa2d8
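
The transport selection rules listed in c0cfa2d8 for stream sockets can be written down as one small function. This is a sketch of the rules as stated in the commit message, not the vsock core code: the enum, the function name and `VMADDR_CID_HOST_SKETCH` are hypothetical, and the constant assumes that `VMADDR_CID_HOST` is 2.

```c
#include <assert.h>
#include <stdint.h>

#define VMADDR_CID_HOST_SKETCH 2U /* assumption: value of VMADDR_CID_HOST */

enum transport_sketch {
    TRANSPORT_G2H, /* guest->host transport */
    TRANSPORT_H2G  /* host->guest transport */
};

/*
 * Pick the transport for a stream socket from its remote CID.
 * g2h_local_cid is the local CID reported by the guest->host transport.
 */
static enum transport_sketch pick_stream_transport(uint32_t remote_cid,
                                                   uint32_t g2h_local_cid)
{
    if (remote_cid <= VMADDR_CID_HOST_SKETCH)
        return TRANSPORT_G2H;
    if (remote_cid == g2h_local_cid)
        return TRANSPORT_G2H; /* loopback goes through guest->host */
    return TRANSPORT_H2G;
}
```

In a nested setup where the L1 guest has CID 5, for example, a connection to CID 2 or to CID 5 would use the guest->host transport, while a connection to a nested guest with CID 7 would use the host->guest transport.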

vsock: handle buffer_size sockopts in the core · b9f2b0ff

Committed by Stefano Garzarella on Nov 14, 2019

virtio_transport and vmci_transport handle the buffer_size
sockopts in a very similar way.

In order to support multiple transports, this patch moves this
handling into the core to allow the user to change the options
even if the socket is not yet assigned to any transport.

This patch also adds the '.notify_buffer_size' callback in the
'struct virtio_transport' in order to inform the transport,
when the buffer_size is changed by the user. It is also useful
to limit the 'buffer_size' requested (e.g. virtio transports).

Acked-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9f2b0ff

vsock/virtio: add transport parameter to the virtio_transport_reset_no_sock() · 4c7246dc

Committed by Stefano Garzarella on Nov 14, 2019

We are going to add a 'struct vsock_sock *' parameter to
virtio_transport_get_ops().

In some cases, like in the virtio_transport_reset_no_sock(),
we don't have any socket assigned to the packet received,
so we can't use the virtio_transport_get_ops().

In order to allow virtio_transport_reset_no_sock() to use the
'.send_pkt' callback from the 'vhost_transport' or 'virtio_transport',
we add the 'struct virtio_transport *' to it and to its caller:
virtio_transport_recv_pkt().

We moved the 'vhost_transport' and 'virtio_transport' definition,
to pass their address to the virtio_transport_recv_pkt().

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c7246dc

28 Oct, 2019 1 commit

vringh: fix copy direction of vringh_iov_push_kern() · b3683dee

Committed by Jason Wang on Oct 24, 2019

We want to copy from iov to buf, so the direction was wrong.

Note: no real user for the helper, but it will be used by future
features.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

b3683dee

23 Oct, 2019 1 commit

compat_ioctl: move drivers to compat_ptr_ioctl · 407e9ef7

Committed by Arnd Bergmann on Sep 11, 2018

Each of these drivers has a copy of the same trivial helper function to
convert the pointer argument and then call the native ioctl handler.

We now have a generic implementation of that, so use it.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>

407e9ef7

13 Oct, 2019 1 commit

vhost/test: stop device before reset · 245cdd9f

Committed by Michael S. Tsirkin on Oct 07, 2019

When device stop was moved out of reset, the test device wasn't updated to
stop before reset, which resulted in a use after free.  Fix by invoking
stop appropriately.

Fixes: b211616d ("vhost: move -net specific code out")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

245cdd9f
