- 18 6月, 2019 1 次提交
-
-
由 Jason Wang 提交于
Vhost_net was known to suffer from HOL[1] issues which is not easy to fix. Several downstream disable the feature by default. What's more, the datapath was split and datacopy path got the support of batching and XDP support recently which makes it faster than zerocopy part for small packets transmission. It looks to me that disable zerocopy by default is more appropriate. It cold be enabled by default again in the future if we fix the above issues. [1] https://patchwork.kernel.org/patch/3787671/Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 5月, 2019 4 次提交
-
-
由 Jason Wang 提交于
This patch will check the weight and exit the loop if we exceeds the weight. This is useful for preventing scsi kthread from hogging cpu which is guest triggerable. This addresses CVE-2019-3900. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Fixes: 057cbf49 ("tcm_vhost: Initial merge for vhost level target fabric driver") Signed-off-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com>
-
由 Jason Wang 提交于
This patch will check the weight and exit the loop if we exceeds the weight. This is useful for preventing vsock kthread from hogging cpu which is guest triggerable. The weight can help to avoid starving the request from on direction while another direction is being processed. The value of weight is picked from vhost-net. This addresses CVE-2019-3900. Cc: Stefan Hajnoczi <stefanha@redhat.com> Fixes: 433fc58e ("VSOCK: Introduce vhost_vsock.ko") Signed-off-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Jason Wang 提交于
When the rx buffer is too small for a packet, we will discard the vq descriptor and retry it for the next packet: while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk, &busyloop_intr))) { ... /* On overrun, truncate and discard */ if (unlikely(headcount > UIO_MAXIOV)) { iov_iter_init(&msg.msg_iter, READ, vq->iov, 1, 1); err = sock->ops->recvmsg(sock, &msg, 1, MSG_DONTWAIT | MSG_TRUNC); pr_debug("Discarded rx packet: len %zd\n", sock_len); continue; } ... } This makes it possible to trigger a infinite while..continue loop through the co-opreation of two VMs like: 1) Malicious VM1 allocate 1 byte rx buffer and try to slow down the vhost process as much as possible e.g using indirect descriptors or other. 2) Malicious VM2 generate packets to VM1 as fast as possible Fixing this by checking against weight at the end of RX and TX loop. This also eliminate other similar cases when: - userspace is consuming the packets in the meanwhile - theoretical TOCTOU attack if guest moving avail index back and forth to hit the continue after vhost find guest just add new buffers This addresses CVE-2019-3900. Fixes: d8316f39 ("vhost: fix total length when packets are too short") Fixes: 3a4d5c94 ("vhost_net: a kernel-level virtio server") Signed-off-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Jason Wang 提交于
We used to have vhost_exceeds_weight() for vhost-net to: - prevent vhost kthread from hogging the cpu - balance the time spent between TX and RX This function could be useful for vsock and scsi as well. So move it to vhost.c. Device must specify a weight which counts the number of requests, or it can also specific a byte_weight which counts the number of bytes that has been processed. Signed-off-by: NJason Wang <jasowang@redhat.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 21 5月, 2019 2 次提交
-
-
由 Thomas Gleixner 提交于
Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Thomas Gleixner 提交于
Add SPDX license identifiers to all files which: - Have no license information of any form - Have MODULE_LICENCE("GPL*") inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 15 5月, 2019 1 次提交
-
-
由 Ira Weiny 提交于
To facilitate additional options to get_user_pages_fast() change the singular write parameter to be gup_flags. This patch does not change any functionality. New functionality will follow in subsequent patches. Some of the get_user_pages_fast() call sites were unchanged because they already passed FOLL_WRITE or 0 for the write parameter. NOTE: It was suggested to change the ordering of the get_user_pages_fast() arguments to ensure that callers were converted. This breaks the current GUP call site convention of having the returned pages be the final parameter. So the suggestion was rejected. Link: http://lkml.kernel.org/r/20190328084422.29911-4-ira.weiny@intel.com Link: http://lkml.kernel.org/r/20190317183438.2057-4-ira.weiny@intel.comSigned-off-by: NIra Weiny <ira.weiny@intel.com> Reviewed-by: NMike Marshall <hubcap@omnibond.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Hogan <jhogan@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Rich Felker <dalias@libc.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 5月, 2019 1 次提交
-
-
由 Paolo Bonzini 提交于
At this point, vs_tpg is not public at all; tv_tpg_vhost_count is accessed under tpg->tv_tpg_mutex; tpg->vhost_scsi is accessed under vhost_scsi_mutex. Therefor there are no atomic operations involved at all here, just remove the barrier. Reported-by: NAndrea Parri <andrea.parri@amarulasolutions.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com>
-
- 11 4月, 2019 1 次提交
-
-
由 Jason Wang 提交于
We used to accept zero size iova range which will lead a infinite loop in translate_desc(). Fixing this by failing the request in this case. Reported-by: syzbot+d21e6e297322a900c128@syzkaller.appspotmail.com Fixes: 6b1e6cc7 ("vhost: new device IOTLB API") Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 3月, 2019 1 次提交
-
-
由 Arnd Bergmann 提交于
On some architectures, the MMU can be disabled, leading to access_ok() becoming an empty macro that does not evaluate its size argument, which in turn produces an unused-variable warning: drivers/vhost/vhost.c:1191:9: error: unused variable 's' [-Werror,-Wunused-variable] size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0; Mark the variable as __maybe_unused to shut up that warning. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 3月, 2019 1 次提交
-
-
由 Arnd Bergmann 提交于
On some architectures, the MMU can be disabled, leading to access_ok() becoming an empty macro that does not evaluate its size argument, which in turn produces an unused-variable warning: drivers/vhost/vhost.c:1191:9: error: unused variable 's' [-Werror,-Wunused-variable] size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0; Mark the variable as __maybe_unused to shut up that warning. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 20 2月, 2019 1 次提交
-
-
由 Jason Wang 提交于
When fail, translate_desc() returns negative value, otherwise the number of iovs. So we should fail when the return value is negative instead of a blindly check against zero. Detected by CoverityScan, CID# 1442593: Control flow issues (DEADCODE) Fixes: cc5e7107 ("vhost: log dirty page correctly") Acked-by: NMichael S. Tsirkin <mst@redhat.com> Reported-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 2月, 2019 1 次提交
-
-
由 Bart Van Assche 提交于
Due to the patch that makes TMF handling synchronous the write_pending_status() callback function is no longer called. Hence remove it. Acked-by: NFelipe Balbi <balbi@ti.com> Reviewed-by: NSagi Grimberg <sagig@mellanox.com> Reviewed-by: NAndy Grover <agrover@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NBryant G. Ly <bryantly@linux.vnet.ibm.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Saurav Kashyap <saurav.kashyap@qlogic.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Juergen Gross <jgross@suse.com> Signed-off-by: NBart Van Assche <bvanassche@acm.org> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 29 1月, 2019 1 次提交
-
-
由 Jason Wang 提交于
After batched used ring updating was introduced in commit e2b3b35e ("vhost_net: batch used ring update in rx"). We tend to batch heads in vq->heads for more than one packet. But the quota passed to get_rx_bufs() was not correctly limited, which can result a OOB write in vq->heads. headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx, vhost_len, &in, vq_log, &log, likely(mergeable) ? UIO_MAXIOV : 1); UIO_MAXIOV was still used which is wrong since we could have batched used in vq->heads, this will cause OOB if the next buffer needs more than 960 (1024 (UIO_MAXIOV) - 64 (VHOST_NET_BATCH)) heads after we've batched 64 (VHOST_NET_BATCH) heads: Acked-by: NStefan Hajnoczi <stefanha@redhat.com> ============================================================================= BUG kmalloc-8k (Tainted: G B ): Redzone overwritten ----------------------------------------------------------------------------- INFO: 0x00000000fd93b7a2-0x00000000f0713384. First byte 0xa9 instead of 0xcc INFO: Allocated in alloc_pd+0x22/0x60 age=3933677 cpu=2 pid=2674 kmem_cache_alloc_trace+0xbb/0x140 alloc_pd+0x22/0x60 gen8_ppgtt_create+0x11d/0x5f0 i915_ppgtt_create+0x16/0x80 i915_gem_create_context+0x248/0x390 i915_gem_context_create_ioctl+0x4b/0xe0 drm_ioctl_kernel+0xa5/0xf0 drm_ioctl+0x2ed/0x3a0 do_vfs_ioctl+0x9f/0x620 ksys_ioctl+0x6b/0x80 __x64_sys_ioctl+0x11/0x20 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 INFO: Slab 0x00000000d13e87af objects=3 used=3 fp=0x (null) flags=0x200000000010201 INFO: Object 0x0000000003278802 @offset=17064 fp=0x00000000e2e6652b Fixing this by allocating UIO_MAXIOV + VHOST_NET_BATCH iovs for vhost-net. This is done through set the limitation through vhost_dev_init(), then set_owner can allocate the number of iov in a per device manner. This fixes CVE-2018-16880. Fixes: e2b3b35e ("vhost_net: batch used ring update in rx") Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 1月, 2019 1 次提交
-
-
由 Jason Wang 提交于
Vhost dirty page logging API is designed to sync through GPA. But we try to log GIOVA when device IOTLB is enabled. This is wrong and may lead to missing data after migration. To solve this issue, when logging with device IOTLB enabled, we will: 1) reuse the device IOTLB translation result of GIOVA->HVA mapping to get HVA, for writable descriptor, get HVA through iovec. For used ring update, translate its GIOVA to HVA 2) traverse the GPA->HVA mapping to get the possible GPA and log through GPA. Pay attention this reverse mapping is not guaranteed to be unique, so we should log each possible GPA in this case. This fix the failure of scp to guest during migration. In -next, we will probably support passing GIOVA->GPA instead of GIOVA->HVA. Fixes: 6b1e6cc7 ("vhost: new device IOTLB API") Reported-by: NJintack Lim <jintack@cs.columbia.edu> Cc: Jintack Lim <jintack@cs.columbia.edu> Signed-off-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 1月, 2019 2 次提交
-
-
由 Bijan Mottahedeh 提交于
Uses copy_to_iter() instead of __copy_to_user() in order to ensure we support arbitrary layouts and an input buffer split across iov entries. Fixes: 0d02dbd6 ("vhost/scsi: Respond to control queue operations") Signed-off-by: NBijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Pavel Tikhomirov 提交于
We've failed to copy and process vhost_iotlb_msg so let userspace at least know about it. For instance before these patch the code below runs without any error: int main() { struct vhost_msg msg; struct iovec iov; int fd; fd = open("/dev/vhost-net", O_RDWR); if (fd == -1) { perror("open"); return 1; } iov.iov_base = &msg; iov.iov_len = sizeof(msg)-4; if (writev(fd, &iov,1) == -1) { perror("writev"); return 1; } return 0; } Signed-off-by: NPavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 12 1月, 2019 1 次提交
-
-
由 Zha Bin 提交于
The vsock core only supports 32bit CID, but the Virtio-vsock spec define CID (dst_cid and src_cid) as u64 and the upper 32bits is reserved as zero. This inconsistency causes one bug in vhost vsock driver. The scenarios is: 0. A hash table (vhost_vsock_hash) is used to map an CID to a vsock object. And hash_min() is used to compute the hash key. hash_min() is defined as: (sizeof(val) <= 4 ? hash_32(val, bits) : hash_long(val, bits)). That means the hash algorithm has dependency on the size of macro argument 'val'. 0. In function vhost_vsock_set_cid(), a 64bit CID is passed to hash_min() to compute the hash key when inserting a vsock object into the hash table. 0. In function vhost_vsock_get(), a 32bit CID is passed to hash_min() to compute the hash key when looking up a vsock for an CID. Because the different size of the CID, hash_min() returns different hash key, thus fails to look up the vsock object for an CID. To fix this bug, we keep CID as u64 in the IOCTLs and virtio message headers, but explicitly convert u64 to u32 when deal with the hash table and vsock core. Fixes: 834e772c ("vhost/vsock: fix use-after-free in network stack callers") Link: https://github.com/stefanha/virtio/blob/vsock/trunk/content.texSigned-off-by: NZha Bin <zhabin@linux.alibaba.com> Reviewed-by: NLiu Jiang <gerry@linux.alibaba.com> Reviewed-by: NStefan Hajnoczi <stefanha@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 1月, 2019 1 次提交
-
-
由 Linus Torvalds 提交于
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 12月, 2018 2 次提交
-
-
由 wangyan 提交于
Fixes: 'commit d588cf8f ("target: Fix se_tpg_tfo->tf_subsys regression + remove tf_subsystem")' 'commit cbbd26b8 ("[iov_iter] new primitives - copy_from_iter_full() and friends")' Signed-off-by: NYan Wang <wangyan122@huawei.com> Reviewed-by: NJun Piao <piaojun@huawei.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Stefan Hajnoczi 提交于
Now that there are no more data path users of vhost_vsock_lock, it can be turned into a mutex. It's only used by .release() and in the .ioctl() path. Depends-on: <20181105103547.22018-1-stefanha@redhat.com> Suggested-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com>
-
- 13 12月, 2018 3 次提交
-
-
由 Jason Wang 提交于
This reverts commit 78139c94. We don't protect device IOTLB with vq mutex, which will lead e.g use after free for device IOTLB entries. And since we've switched to use mutex_trylock() in previous patch, it's safe to revert it without having deadlock. Fixes: commit 78139c94 ("net: vhost: lock the vqs one by one") Cc: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
We used to hold the mutex of paired virtqueue in vhost_net_busy_poll(). But this will results an inconsistent lock order which may cause deadlock if we try to bring back the protection of device IOTLB with vq mutex that requires to hold mutex of all virtqueues at the same time. Fix this simply by switching to use mutex_trylock(), when fail just skip the busy polling. This can happen when device IOTLB is under updating which should be rare. Fixes: commit 78139c94 ("net: vhost: lock the vqs one by one") Cc: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason Wang 提交于
We miss a write barrier that guarantees used idx is updated and seen before log. This will let userspace sync and copy used ring before used idx is update. Fix this by adding a barrier before log_write(). Fixes: 8dd014ad ("vhost-net: mergeable buffers support") Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 12月, 2018 2 次提交
-
-
由 Stefan Hajnoczi 提交于
If the network stack calls .send_pkt()/.cancel_pkt() during .release(), a struct vhost_vsock use-after-free is possible. This occurs because .release() does not wait for other CPUs to stop using struct vhost_vsock. Switch to an RCU-enabled hashtable (indexed by guest CID) so that .release() can wait for other CPUs by calling synchronize_rcu(). This also eliminates vhost_vsock_lock acquisition in the data path so it could have a positive effect on performance. This is CVE-2018-14625 "kernel: use-after-free Read in vhost_transport_send_pkt". Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+bd391451452fb0b93039@syzkaller.appspotmail.com Reported-by: syzbot+e3e074963495f92a89ed@syzkaller.appspotmail.com Reported-by: syzbot+d5a0a170c5069658b141@syzkaller.appspotmail.com Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Acked-by: NJason Wang <jasowang@redhat.com>
-
由 Stefan Hajnoczi 提交于
If a local process has closed a connected socket and hasn't received a RST packet yet, then the socket remains in the table until a timeout expires. When a vhost_vsock instance is released with the timeout still pending, the socket is never freed because vhost_vsock has already set the SOCK_DONE flag. Check if the close timer is pending and let it close the socket. This prevents the race which can leak sockets. Reported-by: NMaximilian Riemensberger <riemensberger@cadami.net> Cc: Graham Whaley <graham.whaley@gmail.com> Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 04 12月, 2018 1 次提交
-
-
由 Jean-Philippe Brucker 提交于
Commit 78139c94 ("net: vhost: lock the vqs one by one") moved the vq lock to improve scalability, but introduced a possible deadlock in vhost-iotlb. vhost_iotlb_notify_vq() now takes vq->mutex while holding the device's IOTLB spinlock. And on the vhost_iotlb_miss() path, the spinlock is taken while holding vq->mutex. Since calling vhost_poll_queue() doesn't require any lock, avoid the deadlock by not taking vq->mutex. Fixes: 78139c94 ("net: vhost: lock the vqs one by one") Acked-by: NJason Wang <jasowang@redhat.com> Acked-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NJean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 11月, 2018 2 次提交
-
-
由 David Disseldorp 提交于
iscsi_target_mod is the only LIO fabric where fabric_ops.name differs from the fabric_ops.fabric_name string. fabric_ops.name is used when matching target/$fabric ConfigFS create paths, so rename it .fabric_alias and fallback to target/$fabric vs .fabric_name comparison if .fabric_alias isn't initialised. iscsi_target_mod is the only fabric module to set .fabric_alias . All other fabric modules rely on .fabric_name matching and can drop the duplicate string. Signed-off-by: NDavid Disseldorp <ddiss@suse.de> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 David Disseldorp 提交于
All fabrics return a const string. In all cases *except* iSCSI the get_fabric_name() string matches fabric_ops.name. Both fabric_ops.get_fabric_name() and fabric_ops.name are user-facing, with the former being used for PR/ALUA state and the latter for ConfigFS (config/target/$name), so we unfortunately need to keep both strings around for now. Replace the useless .get_fabric_name() accessor function with a const string fabric_name member variable. Signed-off-by: NDavid Disseldorp <ddiss@suse.de> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 28 11月, 2018 1 次提交
-
-
由 Paul E. McKenney 提交于
Now that synchronize_rcu() waits for bh-disable regions of code as well as RCU read-side critical sections, synchronize_rcu_bh() can be replaced by synchronize_rcu(). This commit therefore makes this change. Signed-off-by: NPaul E. McKenney <paulmck@linux.ibm.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: <kvm@vger.kernel.org> Cc: <virtualization@lists.linux-foundation.org> Cc: <netdev@vger.kernel.org>
-
- 18 11月, 2018 1 次提交
-
-
由 Jason Wang 提交于
We do a get_page() which involves a atomic operation. This patch tries to mitigate a per packet atomic operation by maintaining a reference bias which is initially USHRT_MAX. Each time a page is got, instead of calling get_page() we decrease the bias and when we find it's time to use a new page we will decrease the bias at one time through __page_cache_drain_cache(). Testpmd(virtio_user + vhost_net) + XDP_DROP on TAP shows about 1.6% improvement. Before: 4.63Mpps After: 4.71Mpps Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 11月, 2018 1 次提交
-
-
由 Jason Wang 提交于
The idx in vhost_vring_ioctl() was controlled by userspace, hence a potential exploitation of the Spectre variant 1 vulnerability. Fixing this by sanitizing idx before using it to index d->vqs. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: NJason Wang <jasowang@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 10月, 2018 4 次提交
-
-
由 Bijan Mottahedeh 提交于
Change the request queue handler to use common handling routines same as the control queue handler. Signed-off-by: NBijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Bijan Mottahedeh 提交于
Prepare to change the request queue handler to use common handling routines. Signed-off-by: NBijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Bijan Mottahedeh 提交于
The vhost-scsi driver currently does not handle any control queue operations. In particular, vhost_scsi_ctl_handle_kick, merely prints out a debug message but does nothing else. This can cause guest VMs to hang. As part of SCSI recovery from an error, e.g., an I/O timeout, the SCSI midlayer attempts to abort the failed operation. The SCSI virtio driver translates the abort to a SCSI TMF request that gets put on the control queue (virtscsi_abort -> virtscsi_tmf). The SCSI virtio driver then waits indefinitely for this request to be completed, but it never will because vhost-scsi never responds to that request. To avoid a hang, always respond to control queue operations; explicitly reject TMF requests, and return a no-op response to event requests. Signed-off-by: NBijan Mottahedeh <bijan.mottahedeh@oracle.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
由 Greg Edwards 提交于
Commands with protection information included were not truncating the protection iov_iter to the number of protection bytes in the command. This resulted in vhost_scsi mis-calculating the size of the protection SGL in vhost_scsi_calc_sgls(), and including both the protection and data SG entries in the protection SGL. Fixes: 09b13fa8 ("vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq") Signed-off-by: NGreg Edwards <gedwards@ddn.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Fixes: 09b13fa8 Cc: stable@vger.kernel.org Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 08 10月, 2018 1 次提交
-
-
由 Tonghao Zhang 提交于
Signed-off-by: NTonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 9月, 2018 2 次提交
-
-
由 Tonghao Zhang 提交于
This patch improves the guest receive performance. On the handle_tx side, we poll the sock receive queue at the same time. handle_rx do that in the same way. We set the poll-us=100us and use the netperf to test throughput and mean latency. When running the tests, the vhost-net kthread of that VM, is alway 100% CPU. The commands are shown as below. Rx performance is greatly improved by this patch. There is not notable performance change on tx with this series though. This patch is useful for bi-directional traffic. netperf -H IP -t TCP_STREAM -l 20 -- -O "THROUGHPUT, THROUGHPUT_UNITS, MEAN_LATENCY" Topology: [Host] ->linux bridge -> tap vhost-net ->[Guest] TCP_STREAM: * Without the patch: 19842.95 Mbps, 6.50 us mean latency * With the patch: 37598.20 Mbps, 3.43 us mean latency Signed-off-by: NTonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tonghao Zhang 提交于
Factor out generic busy polling logic and will be used for in tx path in the next patch. And with the patch, qemu can set differently the busyloop_timeout for rx queue. To avoid duplicate codes, introduce the helper functions: * sock_has_rx_data(changed from sk_has_rx_data) * vhost_net_busy_poll_try_queue Signed-off-by: NTonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-