- 09 5月, 2017 1 次提交
-
-
由 Michal Hocko 提交于
vhost code uses __GFP_REPEAT when allocating vhost_virtqueue resp. vhost_vsock because it would really like to prefer kmalloc to the vmalloc fallback - see 23cc5a99 ("vhost-net: extend device allocation to vmalloc") for more context. Michael Tsirkin has also noted: "__GFP_REPEAT overhead is during allocation time. Using vmalloc means all accesses are slowed down. Allocation is not on data path, accesses are." The similar applies to other vhost_kvzalloc users. Let's teach kvmalloc_node to handle __GFP_REPEAT properly. There are two things to be careful about. First we should prevent from the OOM killer and so have to involve __GFP_NORETRY by default and secondly override __GFP_REPEAT for !costly order requests as the __GFP_REPEAT is ignored for !costly orders. Supporting __GFP_REPEAT like semantic for !costly request is possible it would require changes in the page allocator. This is out of scope of this patch. This patch shouldn't introduce any functional change. Link: http://lkml.kernel.org/r/20170306103032.2540-3-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 4月, 2017 1 次提交
-
-
由 Gerard Garcia 提交于
The virtio drivers deal with struct virtio_vsock_pkt. Add virtio_transport_deliver_tap_pkt(pkt) for handing packets to the vsockmon device. We call virtio_transport_deliver_tap_pkt(pkt) from net/vmw_vsock/virtio_transport.c and drivers/vhost/vsock.c instead of common code. This is because the drivers may drop packets before handing them to common code - we still want to capture them. Signed-off-by: NGerard Garcia <ggarcia@deic.uab.cat> Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Reviewed-by: NJorgen Hansen <jhansen@vmware.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 3月, 2017 1 次提交
-
-
由 Peng Tao 提交于
To allow canceling all packets of a connection. Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Reviewed-by: NJorgen Hansen <jhansen@vmware.com> Signed-off-by: NPeng Tao <bergwolf@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 3月, 2017 4 次提交
-
-
由 Ingo Molnar 提交于
sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ingo Molnar 提交于
We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/mm.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. The APIs that are going to be moved first are: mm_alloc() __mmdrop() mmdrop() mmdrop_async_fn() mmdrop_async() mmget_not_zero() mmput() mmput_async() get_task_mm() mm_access() mm_release() Include the new header in the files that are going to need it. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ingo Molnar 提交于
We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which will have to be picked up from other headers and .c files. Create a trivial placeholder <linux/sched/clock.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jason Wang 提交于
When device IOTLB is enabled, all address translations were stored in interval tree. O(lgN) searching time could be slow for virtqueue metadata (avail, used and descriptors) since they were accessed much often than other addresses. So this patch introduces an O(1) array which points to the interval tree nodes that store the translations of vq metadata. Those array were update during vq IOTLB prefetching and were reset during each invalidation and tlb update. Each time we want to access vq metadata, this small array were queried before interval tree. This would be sufficient for static mappings but not dynamic mappings, we could do optimizations on top. Test were done with l2fwd in guest (2M hugepage): noiommu | before | after tx 1.32Mpps | 1.06Mpps(82%) | 1.30Mpps(98%) rx 2.33Mpps | 1.46Mpps(63%) | 2.29Mpps(98%) We can almost reach the same performance as noiommu mode. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 28 2月, 2017 1 次提交
-
-
由 Jason Wang 提交于
If last avail idx is not equal to cached avail idx, we're sure there's still available buffers in the virtqueue so there's no need to re-read avail idx. So let's skip this to avoid unnecessary userspace memory access and memory barrier. Pktgen test show about 3% improvement on rx pps. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 12 2月, 2017 2 次提交
-
-
由 Sainath Grandhi 提交于
This patch makes tap a separate module for other types of virtual interfaces, for example, ipvlan to use. Signed-off-by: NSainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sainath Grandhi 提交于
Renaming tap related APIs, data structures and macros in tap.c from macvtap_.* to tap_.* Signed-off-by: NSainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 2月, 2017 1 次提交
-
-
由 Halil Pasic 提交于
Currently, under certain circumstances vhost_init_is_le does just a part of the initialization job, and depends on vhost_reset_is_le being called too. For this reason vhost_vq_init_access used to call vhost_reset_is_le when vq->private_data is NULL. This is not only counter intuitive, but also real a problem because it breaks vhost_net. The bug was introduced to vhost_net with commit 2751c988 ("vhost: cross-endian support for legacy devices"). The symptom is corruption of the vq's used.idx field (virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost shutdown on a vq with pending descriptors. Let us make sure the outcome of vhost_init_is_le never depend on the state it is actually supposed to initialize, and fix virtio_net by removing the reset from vhost_vq_init_access. With the above, there is no reason for vhost_reset_is_le to do just half of the job. Let us make vhost_reset_is_le reinitialize is_le. Signed-off-by: NHalil Pasic <pasic@linux.vnet.ibm.com> Reported-by: NMichael A. Tebolt <miket@us.ibm.com> Reported-by: NDr. David Alan Gilbert <dgilbert@redhat.com> Fixes: commit 2751c988 ("vhost: cross-endian support for legacy devices") Cc: <stable@vger.kernel.org> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NGreg Kurz <groug@kaod.org> Tested-by: NMichael A. Tebolt <miket@us.ibm.com>
-
- 25 1月, 2017 1 次提交
-
-
由 Stefan Hajnoczi 提交于
Propagate the error when vhost_vq_init_access() fails and set vq->private_data to NULL. Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 20 1月, 2017 2 次提交
-
-
由 Dan Carpenter 提交于
This is to silence an uninitialized variable warning in debug output. The problem is this line: pr_debug("vhost_get_vq_desc: head: %d, out: %u in: %u\n", head, out, in); If "head == vq->num" is true on the first iteration then "out" and "in" aren't initialized. We handle that a few lines after the printk. I was tempted to just delete the pr_debug() but I decided to just initialize them to zero instead. Also checkpatch.pl complains if variables are declared as just "unsigned" without the "int". Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Bhumika Goyal 提交于
Declare target_core_fabric_ops strucrues as const as they are only passed as an argument to the functions target_register_template and target_unregister_template. The arguments are of type const struct target_core_fabric_ops *, so target_core_fabric_ops structures having this property can be declared const. Done using Coccinelle: @r disable optional_qualifier@ identifier i; position p; @@ static struct target_core_fabric_ops i@p={...}; @ok@ position p; identifier r.i; @@ ( target_register_template(&i@p) | target_unregister_template(&i@p) ) @bad@ position p!={r.p,ok.p}; identifier r.i; @@ i@p @depends on !bad disable optional_qualifier@ identifier r.i; @@ +const struct target_core_fabric_ops i; File size before: drivers/vhost/scsi.o text data bss dec hex filename 18063 2985 40 21088 5260 drivers/vhost/scsi.o File size after: drivers/vhost/scsi.o text data bss dec hex filename 18479 2601 40 21120 5280 drivers/vhost/scsi.o Signed-off-by: NBhumika Goyal <bhumirks@gmail.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com>
-
- 19 1月, 2017 2 次提交
-
-
由 Jason Wang 提交于
This patch tries to utilize tuntap rx batching by peeking the tx virtqueue during transmission, if there's more available buffers in the virtqueue, set MSG_MORE flag for a hint for backend (e.g tuntap) to batch the packets. Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
This patch tries to do several tweaks on vhost_vq_avail_empty() for a better performance: - check cached avail index first which could avoid userspace memory access. - using unlikely() for the failure of userspace access - check vq->last_avail_idx instead of cached avail index as the last step. This patch is need for batching supports which needs to peek whether or not there's still available buffers in the ring. Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 12月, 2016 4 次提交
-
-
由 Tobias Klauser 提交于
Remove the unused but set variable se_tpg in vhost_scsi_nexus_cb() to fix the following GCC warning when building with 'W=1': drivers/vhost/scsi.c:1752:26: warning: variable ‘se_tpg’ set but not used Signed-off-by: NTobias Klauser <tklauser@distanz.ch> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Mark Rutland 提交于
Despite living under drivers/ vringh.c is also used as part of the userspace virtio tools. Before we can kill off the ACCESS_ONCE()definition in the tools, we must convert vringh.c to use {READ,WRITE}_ONCE(). This patch does so, along with the required include of <linux/compiler.h> for the relevant definitions. The userspace tools provide their own definitions in their own <linux/compiler.h>. Signed-off-by: NMark Rutland <mark.rutland@arm.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NChristian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: NCornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: NJason Wang <jasowang@redhat.com>
-
由 Jason Wang 提交于
When event index was enabled, we need to fetch used event from userspace memory each time. This userspace fetch (with memory barrier) could be saved sometime when 1) caching used event and 2) if used event is ahead of new and old to new updating does not cross it, we're sure there's no need to notify guest. This will be useful for heavy tx load e.g guest pktgen test with Linux driver shows ~3.5% improvement. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Gao feng 提交于
Multi vsocks may setup the same cid at the same time. Signed-off-by: NGao feng <omarapazanadi@gmail.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
-
- 15 12月, 2016 2 次提交
-
-
由 Michael S. Tsirkin 提交于
Several vhost functions were missing __user annotations on pointers, causing sparse warnings. Fix this up. sparse also warns about vhost_process_iotlb_msg which is local and should be static. Fix that up as well. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Michael S. Tsirkin 提交于
vhost_umem_interval_tree is only used locally within vhost.c, mark it static. As some functions generated go unused, this triggers warnings unless we also mark it inline. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 09 12月, 2016 3 次提交
-
-
由 Peng Tao 提交于
local_addr.svm_cid is host cid. We should check guest cid instead, which is remote_addr.svm_cid. Otherwise we end up resetting all connections to all guests. Cc: stable@vger.kernel.org [4.8+] Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NPeng Tao <bergwolf@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Peng Tao 提交于
test_and_set_bit() already implies a memory barrier. Signed-off-by: NPeng Tao <bergwolf@gmail.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Peng Tao 提交于
Signed-off-by: NPeng Tao <bergwolf@gmail.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 12月, 2016 1 次提交
-
-
由 Al Viro 提交于
copy_from_iter_full(), copy_from_iter_full_nocache() and csum_and_copy_from_iter_full() - counterparts of copy_from_iter() et.al., advancing iterator only in case of successful full copy and returning whether it had been successful or not. Convert some obvious users. *NOTE* - do not blindly assume that something is a good candidate for those unless you are sure that not advancing iov_iter in failure case is the right thing in this case. Anything that does short read/short write kind of stuff (or is in a loop, etc.) is unlikely to be a good one. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 16 11月, 2016 1 次提交
-
-
由 Christian Borntraeger 提交于
With the s390 special case of a yielding cpu_relax() implementation gone, we can now remove all users of cpu_relax_lowlatency() and replace them with cpu_relax(). Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1477386195-32736-5-git-send-email-borntraeger@de.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 31 8月, 2016 1 次提交
-
-
Many modules call misc_register and misc_deregister in its module init and exit methods without any additional code. This ends up being boilerplate. This patch adds helper macro module_misc_device(), that replaces module_init()/ module_exit() with template functions. This patch also converts drivers to use new macro. Change since v1: Add device.h include in miscdevice.h as module_driver macro was not available from other include files in some architectures. Signed-off-by: NPrasannaKumar Muralidharan <prasannatsmkumar@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 23 8月, 2016 1 次提交
-
-
由 Benjamin Coddington 提交于
The address of the iovec &vq->iov[out] is not guaranteed to contain the scsi command's response iovec throughout the lifetime of the command. Rather, it is more likely to contain an iovec from an immediately following command after looping back around to vhost_get_vq_desc(). Pass along the iovec entirely instead. Fixes: 79c14141 ("vhost/scsi: Convert completion path to use copy_to_iter") Cc: stable@vger.kernel.org Signed-off-by: NBenjamin Coddington <bcodding@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 15 8月, 2016 1 次提交
-
-
由 Michael S. Tsirkin 提交于
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 09 8月, 2016 1 次提交
-
-
由 Stefan Hajnoczi 提交于
Stash the packet length in a local variable before handing over ownership of the packet to virtio_transport_recv_pkt() or virtio_transport_free_pkt(). This patch solves the use-after-free since pkt is no longer guaranteed to be alive. Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 02 8月, 2016 9 次提交
-
-
由 Wei Yongjun 提交于
Use kvfree() instead of open-coding it. Signed-off-by: NWei Yongjun <weiyj.lk@gmail.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Michael S. Tsirkin 提交于
vringh is pulled in by caif and mic, but the other vhost config does not need to be there. In particular, it makes no sense to have vhost net/scsi/sock under caif/mic. Create a separate Kconfig file and put vringh bits there. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Michael S. Tsirkin 提交于
Detect and fail early if long wrap around is triggered. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Jason Wang 提交于
This patch tries to implement an device IOTLB for vhost. This could be used with userspace(qemu) implementation of DMA remapping to emulate an IOMMU for the guest. The idea is simple, cache the translation in a software device IOTLB (which is implemented as an interval tree) in vhost and use vhost_net file descriptor for reporting IOTLB miss and IOTLB update/invalidation. When vhost meets an IOTLB miss, the fault address, size and access can be read from the file. After userspace finishes the translation, it writes the translated address to the vhost_net file to update the device IOTLB. When device IOTLB is enabled by setting VIRTIO_F_IOMMU_PLATFORM all vq addresses set by ioctl are treated as iova instead of virtual address and the accessing can only be done through IOTLB instead of direct userspace memory access. Before each round or vq processing, all vq metadata is prefetched in device IOTLB to make sure no translation fault happens during vq processing. In most cases, virtqueues are contiguous even in virtual address space. The IOTLB translation for virtqueue itself may make it a little slower. We might add fast path cache on top of this patch. Signed-off-by: NJason Wang <jasowang@redhat.com> [mst: use virtio feature bit: VHOST_F_DEVICE_IOTLB -> VIRTIO_F_IOMMU_PLATFORM ] [mst: fix build warnings ] Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> [ weiyj.lk: missing unlock on error ] Signed-off-by: NWei Yongjun <weiyj.lk@gmail.com>
-
由 Jason Wang 提交于
Current pre-sorted memory region array has some limitations for future device IOTLB conversion: 1) need extra work for adding and removing a single region, and it's expected to be slow because of sorting or memory re-allocation. 2) need extra work of removing a large range which may intersect several regions with different size. 3) need trick for a replacement policy like LRU To overcome the above shortcomings, this patch convert it to interval tree which can easily address the above issue with almost no extra work. The patch could be used for: - Extend the current API and only let the userspace to send diffs of memory table. - Simplify Device IOTLB implementation. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Jason Wang 提交于
This patch introduces vhost memory accessors which were just wrappers for userspace address access helpers. This is a requirement for vhost device iotlb implementation which will add iotlb translations in those accessors. Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Asias He 提交于
Enable virtio-vsock and vhost-vsock. Signed-off-by: NAsias He <asias@redhat.com> Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Asias He 提交于
VM sockets vhost transport implementation. This driver runs on the host. Signed-off-by: NAsias He <asias@redhat.com> Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Michael S. Tsirkin 提交于
vringh isn't used by vhost net or scsi - it's used by CAIF only at the moment. Drop the dependency. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-