- 23 Oct, 2019 2 commits
-
-
Committed by Miklos Szeredi
If writeback cache is enabled, then writes might get reordered with chmod/chown/utimes. The problem with this is that performing the write in the fuse daemon might itself change some of these attributes. In such a case the following sequence of operations will result in the file ending up with the wrong mode, for example: int fd = open ("suid", O_WRONLY|O_CREAT|O_EXCL); write (fd, "1", 1); fchown (fd, 0, 0); fchmod (fd, 04755); close (fd); This patch fixes this by flushing pending writes before performing chown/chmod/utimes. Reported-by: Giuseppe Scrivano <gscrivan@redhat.com> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Fixes: 4d99ff8f ("fuse: Turn writeback cache on") Cc: <stable@vger.kernel.org> # v3.15+ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
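The sequence in the message above can be wrapped into a small userspace reproducer. This is only a sketch: the function name `write_then_chmod` is hypothetical, and the `0600` creation mode is an addition (the snippet in the commit message omits the mode argument that `open()` requires with `O_CREAT`). It returns the permission bits observed after the sequence, so a caller can compare them with what was requested.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a file, write one byte, then change owner
 * and mode, in the same order as the commit message. Returns the
 * permission bits seen by fstat() afterwards, or -1 on any error.
 */
int write_then_chmod(const char *path, uid_t uid, gid_t gid, mode_t mode)
{
    struct stat st;
    /* 0600 is an assumed creation mode; O_CREAT needs a mode argument. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, "1", 1) != 1)
        goto fail;
    if (fchown(fd, uid, gid) != 0)
        goto fail;
    if (fchmod(fd, mode) != 0)
        goto fail;
    if (fstat(fd, &st) != 0)
        goto fail;
    close(fd);
    return (int)(st.st_mode & 07777);

fail:
    close(fd);
    return -1;
}
```

On a local file system the returned value should equal the requested mode. The commit message uses `fchown(fd, 0, 0)`, which needs privileges; passing `(uid_t)-1` and `(gid_t)-1` leaves the owner unchanged and still exercises the write-then-chmod ordering.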
-
Committed by zhengbin
Fixes gcc '-Wunused-but-set-variable' warning: fs/fuse/virtio_fs.c: In function virtio_fs_wake_pending_and_unlock: fs/fuse/virtio_fs.c:983:20: warning: variable fc set but not used [-Wunused-but-set-variable] It is not used since commit 7ee1e2e6 ("virtiofs: No need to check fpq->connected state") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 21 Oct, 2019 7 commits
-
-
Committed by Vivek Goyal
If the regular request queue gets full, currently we sleep for a bit and retry submission in the submitter's context. This assumes the submitter is not holding any spin lock. But this assumption is not true for background requests. For background requests, we are called with fc->bg_lock held. This can lead to deadlock where one thread is trying submission with fc->bg_lock held while the request completion thread has called fuse_request_end() which tries to acquire fc->bg_lock and gets blocked. As the request completion thread gets blocked, it does not make further progress and that means the queue does not get empty and the submitter can't submit more requests. To solve this issue, retry submission with the help of a worker, instead of retrying in the submitter's context. We already do this for hiprio/forget requests. Reported-by: Chirantan Ekbote <chirantan@chromium.org> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
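The idea of deferring to a worker rather than waiting in the submitter can be modelled in a few lines of plain C. This is a single-threaded sketch, not the virtio-fs code: `vq_model`, `submit`, `complete_one`, `worker_retry` and the capacities are all hypothetical names and values chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS   2   /* hypothetical, deliberately tiny ring capacity */
#define MAX_DEFERRED 8   /* hypothetical bound on the deferred list */

struct vq_model {
    int ring_used;               /* ring slots currently occupied */
    int deferred[MAX_DEFERRED];  /* request ids waiting for the worker */
    size_t n_deferred;
};

/* Try to take a ring slot; never sleeps. */
static bool ring_try_add(struct vq_model *q)
{
    if (q->ring_used == RING_SLOTS)
        return false;
    q->ring_used++;
    return true;
}

/*
 * Submitter path. It may run with a spinlock held, so it must not wait:
 * returns 0 if queued in the ring, 1 if deferred, -1 if nothing fits.
 */
int submit(struct vq_model *q, int req_id)
{
    if (ring_try_add(q))
        return 0;
    if (q->n_deferred == MAX_DEFERRED)
        return -1;
    q->deferred[q->n_deferred++] = req_id;
    return 1;
}

/* Completion path: frees one ring slot. */
void complete_one(struct vq_model *q)
{
    if (q->ring_used > 0)
        q->ring_used--;
}

/* Worker: move as many deferred requests into the ring as now fit. */
size_t worker_retry(struct vq_model *q)
{
    size_t moved = 0;

    while (moved < q->n_deferred && ring_try_add(q))
        moved++;
    /* Shift the requests that still do not fit to the front. */
    for (size_t i = moved; i < q->n_deferred; i++)
        q->deferred[i - moved] = q->deferred[i];
    q->n_deferred -= moved;
    return moved;
}
```

Because `submit()` never blocks, a completion path that needs the same lock as the submitter can always make progress, which is the property the commit message is after.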
-
Committed by Vivek Goyal
If virtqueue is full, we put forget requests on a list and these forgets are dispatched later using a worker. As of now we don't count these forgets in fsvq->in_flight variable. This means when queue is being drained, we have to have special logic to first drain these pending requests and then wait for fsvq->in_flight to go to zero. By counting pending forgets in fsvq->in_flight, we can get rid of special logic and just wait for in_flight to go to zero. Worker thread will kick and drain all the forgets anyway, leading in_flight to zero. I also need similar logic for normal request queue in next patch where I am about to defer request submission in the worker context if queue is full. This simplifies the code a bit. Also add two helper functions to inc/dec in_flight. The decrement in_flight helper will later be used to call completion when in_flight reaches zero. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
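A minimal sketch of the two counting helpers described above, under the assumption that a request is counted from the moment it is accepted (whether it sits in the ring or on the deferred list) until it completes. The struct and function names are hypothetical, and the `drained` flag only stands in for a real completion object.

```c
#include <stdbool.h>

struct fsvq_model {
    unsigned int in_flight;  /* accepted but not yet completed requests */
    bool drained;            /* stand-in for signalling a completion */
};

/* Count a request as soon as it is accepted, deferred or not. */
void inc_in_flight(struct fsvq_model *q)
{
    q->in_flight++;
    q->drained = false;
}

/* Drop a request; report "drained" when the last one is gone. */
void dec_in_flight(struct fsvq_model *q)
{
    if (q->in_flight == 0)
        return;              /* unbalanced call, ignored in this model */
    if (--q->in_flight == 0)
        q->drained = true;
}
```

With deferred forgets included in the counter, a drain routine only has to wait for `in_flight` to reach zero and needs no separate pass over the pending list.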
-
Committed by Vivek Goyal
FR_SENT flag should be set when the request has been successfully sent over the virtqueue. This is used by interrupt logic to figure out if an interrupt request should be sent or not. Also add it to the fpq->processing list after sending it successfully. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Vivek Goyal
In virtiofs we keep per queue connected state in virtio_fs_vq->connected and use that to end request if queue is not connected. And virtiofs does not even touch fpq->connected state. We probably need to merge these two at some point of time. For now, simplify the code a bit and do not worry about checking state of fpq->connected. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Vivek Goyal
Submission context can hold some locks which end request code tries to hold again and deadlock can occur. For example, fc->bg_lock. If a background request is being submitted, it might hold fc->bg_lock and if we could not submit request (because device went away) and tried to end request, then deadlock happens. During testing, I also got a warning from deadlock detection code. So put requests on a list and end requests from a worker thread. I got the following warning from the deadlock detector. [ 603.137138] WARNING: possible recursive locking detected [ 603.137142] -------------------------------------------- [ 603.137144] blogbench/2036 is trying to acquire lock: [ 603.137149] 00000000f0f51107 (&(&fc->bg_lock)->rlock){+.+.}, at: fuse_request_end+0xdf/0x1c0 [fuse] [ 603.140701] [ 603.140701] but task is already holding lock: [ 603.140703] 00000000f0f51107 (&(&fc->bg_lock)->rlock){+.+.}, at: fuse_simple_background+0x92/0x1d0 [fuse] [ 603.140713] [ 603.140713] other info that might help us debug this: [ 603.140714] Possible unsafe locking scenario: [ 603.140714] [ 603.140715] CPU0 [ 603.140716] ---- [ 603.140716] lock(&(&fc->bg_lock)->rlock); [ 603.140718] lock(&(&fc->bg_lock)->rlock); [ 603.140719] [ 603.140719] *** DEADLOCK *** Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
If the FUSE_READDIRPLUS_AUTO feature is enabled, then lookups on a directory before/during readdir are used as an indication that READDIRPLUS should be used instead of READDIR. However if the lookup turns out to be negative, then selecting READDIRPLUS makes no sense. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
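The heuristic can be pictured as a per-directory hint that only positive lookups are allowed to set. The sketch below is a deliberately reduced model: `dir_model`, `note_lookup` and `take_readdirplus_hint` are hypothetical names, and the real decision in the kernel takes further conditions into account.

```c
#include <stdbool.h>

struct dir_model {
    bool advise_rdplus;  /* hint: the next readdir should use READDIRPLUS */
};

/*
 * Called after a lookup in the directory. Only a positive lookup (the
 * entry exists) suggests that attributes of the entries will be needed.
 */
void note_lookup(struct dir_model *dir, bool entry_exists)
{
    if (entry_exists)
        dir->advise_rdplus = true;
}

/* Consume the hint when the next readdir is issued. */
bool take_readdirplus_hint(struct dir_model *dir)
{
    bool hint = dir->advise_rdplus;

    dir->advise_rdplus = false;
    return hint;
}
```

A negative lookup leaves the hint untouched, so a run of failed lookups no longer pushes the directory towards the more expensive request type.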
-
Committed by Miklos Szeredi
Move the check for async request after check for the request being already finished and done with. Reported-by: syzbot+ae0bb7aae3de6b4594e2@syzkaller.appspotmail.com Fixes: d4993774 ("fuse: stop copying args to fuse_req") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 15 Oct, 2019 1 commit
-
-
Committed by Miklos Szeredi
Virtio-fs does not accept any mount options, so it's confusing and wrong to show any in /proc/mounts. Reported-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 14 Oct, 2019 1 commit
-
-
Committed by Vivek Goyal
We have been calling it virtio_fs and even the file name is virtio_fs.c. The module name is virtio_fs.ko, but when registering the file system the user is supposed to specify the filesystem type as "virtiofs". Masayoshi Mizuma reported that he specified the filesystem type as "virtio_fs" and got this warning on the console. ------------[ cut here ]------------ request_module fs-virtio_fs succeeded, but still no fs? WARNING: CPU: 1 PID: 1234 at fs/filesystems.c:274 get_fs_type+0x12c/0x138 Modules linked in: ... virtio_fs fuse virtio_net net_failover ... CPU: 1 PID: 1234 Comm: mount Not tainted 5.4.0-rc1 #1 So it looks like the kernel could find the module virtio_fs.ko but could not find the filesystem type after that. It is probably better to rename the module to virtiofs.ko so that the above warning goes away in case the user ends up specifying the wrong fs name. Reported-by: Masayoshi Mizuma <msys.mizuma@gmail.com> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 24 Sep, 2019 7 commits
-
-
Committed by YueHaibing
Fix sparse warning: fs/fuse/dev.c:468:6: warning: symbol 'fuse_args_to_req' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Fixes: 68583165 ("fuse: add pages to fuse_args") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by zhengbin
If cuse_send_init fails, need to fuse_conn_put cc->fc. cuse_channel_open->fuse_conn_init->refcount_set(&fc->count, 1) ->fuse_dev_alloc->fuse_conn_get ->fuse_dev_free->fuse_conn_put Fixes: cc080e9e ("fuse: introduce per-instance fuse_dev structure") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Tejun Heo
With DEBUG_PAGEALLOC on, the following triggers. BUG: unable to handle page fault for address: ffff88859367c000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 3001067 P4D 3001067 PUD 406d3a8067 PMD 406d30c067 PTE 800ffffa6c983060 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC CPU: 38 PID: 3110657 Comm: python2.7 RIP: 0010:fuse_readdir+0x88f/0xe7a [fuse] Code: 49 8b 4d 08 49 39 4e 60 0f 84 44 04 00 00 48 8b 43 08 43 8d 1c 3c 4d 01 7e 68 49 89 dc 48 03 5c 24 38 49 89 46 60 8b 44 24 30 <8b> 4b 10 44 29 e0 48 89 ca 48 83 c1 1f 48 83 e1 f8 83 f8 17 49 89 RSP: 0018:ffffc90035edbde0 EFLAGS: 00010286 RAX: 0000000000001000 RBX: ffff88859367bff0 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88859367bfed RDI: 0000000000920907 RBP: ffffc90035edbe90 R08: 000000000000014b R09: 0000000000000004 R10: ffff88859367b000 R11: 0000000000000000 R12: 0000000000000ff0 R13: ffffc90035edbee0 R14: ffff889fb8546180 R15: 0000000000000020 FS: 00007f80b5f4a740(0000) GS:ffff889fffa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff88859367c000 CR3: 0000001c170c2001 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: iterate_dir+0x122/0x180 __x64_sys_getdents+0xa6/0x140 do_syscall_64+0x42/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 It's in fuse_parse_cache(). %rbx (ffff88859367bff0) is fuse_dirent pointer - addr + offset. FUSE_DIRENT_SIZE() is trying to dereference namelen off of it but that derefs into the next page which is disabled by pagealloc debug causing a PF. This is caused by dirent->namelen being accessed before ensuring that there's enough bytes in the page for the dirent. Fix it by pushing down reclen calculation. 
Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 5d7bc7e8 ("fuse: allow using readdir cache") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
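The ordering problem described in this commit (reading namelen before knowing that the header lies inside the page) can be shown with a small parser. This is a sketch under assumptions: `dirent_hdr` is a simplified stand-in for the cached record layout, and `count_records` is a hypothetical function, not the kernel's fuse_parse_cache().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a cached directory record header. */
struct dirent_hdr {
    uint64_t ino;
    uint64_t off;
    uint32_t namelen;
    uint32_t type;
};

#define HDR_SIZE sizeof(struct dirent_hdr)
/* Record length: header plus name, rounded up to 8 bytes. */
#define REC_SIZE(namelen) ((HDR_SIZE + (namelen) + 7) & ~(size_t)7)

/*
 * Count the complete records in buf[0..size). The header is only read
 * after checking that it fits, and the record length is only computed
 * from namelen after that check.
 */
size_t count_records(const unsigned char *buf, size_t size)
{
    size_t offset = 0, n = 0;

    for (;;) {
        struct dirent_hdr hdr;
        size_t reclen;

        if (size - offset < HDR_SIZE)       /* header would cross the end */
            break;
        memcpy(&hdr, buf + offset, HDR_SIZE);
        if (hdr.namelen == 0)               /* treat as end marker */
            break;
        reclen = REC_SIZE((size_t)hdr.namelen);
        if (reclen > size - offset)         /* name would cross the end */
            break;
        offset += reclen;
        n++;
    }
    return n;
}
```

Computing the record length before the first check would read `namelen` from bytes that may belong to the next page, which is the fault shown in the trace above.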
-
Committed by Arnd Bergmann
This function has been made static, which now causes a compile-time warning: WARNING: "fuse_put_request" [vmlinux] is a static EXPORT_SYMBOL_GPL Remove the unneeded export. Fixes: 66abc359 ("fuse: unexport request ops") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Khazhismel Kumykov
Account per-file, dentry, and inode data. Blockdev/superblock and temporary per-request data was left alone, as this usually isn't accounted. Reviewed-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Khazhismel Kumykov <khazhy@google.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Khazhismel Kumykov
Implements the optimization noted in commit f75fdf22 ("fuse: don't use ->d_time"), as the additional memory can be significant. (In particular, on SLAB configurations this 8-byte alloc becomes 32 bytes). Per-dentry, this can consume significant memory. Reviewed-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Khazhismel Kumykov <khazhy@google.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
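The message does not spell the optimization out. One plausible reading, taken here as an assumption, is that on 64-bit builds the 8-byte timestamp is stored directly in the pointer-sized per-dentry field instead of in a separately allocated 8-byte object. The sketch below illustrates that reading only; `dentry_model` and the `entry_*` helpers are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

struct dentry_model {
    void *d_fsdata;  /* pointer-sized per-dentry private field */
};

#if UINTPTR_MAX >= UINT64_MAX
/* 64-bit: the value fits in the pointer field, no allocation needed. */
int entry_init(struct dentry_model *d) { d->d_fsdata = NULL; return 0; }
void entry_free(struct dentry_model *d) { (void)d; }
void entry_set_time(struct dentry_model *d, uint64_t t)
{
    d->d_fsdata = (void *)(uintptr_t)t;
}
uint64_t entry_time(const struct dentry_model *d)
{
    return (uint64_t)(uintptr_t)d->d_fsdata;
}
#else
/* 32-bit: a 64-bit value does not fit, keep the separate allocation. */
int entry_init(struct dentry_model *d)
{
    d->d_fsdata = calloc(1, sizeof(uint64_t));
    return d->d_fsdata ? 0 : -1;
}
void entry_free(struct dentry_model *d) { free(d->d_fsdata); }
void entry_set_time(struct dentry_model *d, uint64_t t)
{
    *(uint64_t *)d->d_fsdata = t;
}
uint64_t entry_time(const struct dentry_model *d)
{
    return *(const uint64_t *)d->d_fsdata;
}
#endif
```

Callers use the same four helpers on either word size, so the saving on 64-bit does not leak into the rest of the code.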
-
Committed by Vasily Averin
unlock_page() was missing in case of an already in-flight write against the same page. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Fixes: ff17be08 ("fuse: writepage: skip already in flight") Cc: <stable@vger.kernel.org> # v3.13 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 19 Sep, 2019 1 commit
-
-
Committed by Stefan Hajnoczi
Add a basic file system module for virtio-fs. This does not yet contain shared data support between host and guest or metadata coherency speedups. However it is already significantly faster than virtio-9p.

Design Overview
===============

With the goal of designing something with better performance and local file system semantics, a bunch of ideas were proposed.

- Use fuse protocol (instead of 9p) for communication between guest and host. Guest kernel will be fuse client and a fuse server will run on host to serve the requests.
- For data access inside guest, mmap portion of file in QEMU address space and guest accesses this memory using dax. That way guest page cache is bypassed and there is only one copy of data (on host). This will also enable mmap(MAP_SHARED) between guests.
- For metadata coherency, there is a shared memory region which contains version number associated with metadata and any guest changing metadata updates version number and other guests refresh metadata on next access. This is yet to be implemented.

How virtio-fs differs from existing approaches
==============================================

The unique idea behind virtio-fs is to take advantage of the co-location of the virtual machine and hypervisor to avoid communication (vmexits). DAX allows file contents to be accessed without communication with the hypervisor. The shared memory region for metadata avoids communication in the common case where metadata is unchanged.

By replacing expensive communication with cheaper shared memory accesses, we expect to achieve better performance than approaches based on network file system protocols. In addition, this also makes it easier to achieve local file system semantics (coherency).

These techniques are not applicable to network file system protocols since the communications channel is bypassed by taking advantage of shared memory on a local machine. This is why we decided to build virtio-fs rather than focus on 9P or NFS.

Caching Modes
=============

Like virtio-9p, different caching modes are supported which determine the coherency level as well. The "cache=FOO" and "writeback" options control the level of coherence between the guest and host filesystems.

- cache=none
  metadata, data and pathname lookup are not cached in guest. They are always fetched from host and any changes are immediately pushed to host.
- cache=always
  metadata, data and pathname lookup are cached in guest and never expire.
- cache=auto
  metadata and pathname lookup cache expires after a configured amount of time (default is 1 second). Data is cached while the file is open (close to open consistency).
- writeback/no_writeback
  These options control the writeback strategy. If writeback is disabled, then normal writes will immediately be synchronized with the host fs. If writeback is enabled, then writes may be cached in the guest until the file is closed or an fsync(2) performed. This option has no effect on mmap-ed writes or writes going through the DAX mechanism.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
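The three cache= values described above map naturally onto a metadata timeout. The sketch below encodes only what the text states (no caching, a 1 second default, never expiring); the enum, the function names and the `-1` sentinel are hypothetical and not taken from any daemon's option parser.

```c
#include <string.h>

enum cache_mode { CACHE_NONE, CACHE_AUTO, CACHE_ALWAYS, CACHE_UNKNOWN };

#define TIMEOUT_NEVER (-1L)  /* hypothetical sentinel: entries never expire */

/* Map the value of a "cache=" option to a mode. */
enum cache_mode parse_cache_mode(const char *value)
{
    if (strcmp(value, "none") == 0)
        return CACHE_NONE;
    if (strcmp(value, "auto") == 0)
        return CACHE_AUTO;
    if (strcmp(value, "always") == 0)
        return CACHE_ALWAYS;
    return CACHE_UNKNOWN;
}

/* Metadata and lookup cache lifetime in seconds for each mode. */
long metadata_timeout_sec(enum cache_mode mode)
{
    switch (mode) {
    case CACHE_AUTO:
        return 1;              /* default stated in the text */
    case CACHE_ALWAYS:
        return TIMEOUT_NEVER;
    case CACHE_NONE:
    default:
        return 0;              /* always revalidate with the host */
    }
}
```

Falling back to a zero timeout for an unrecognized value is a choice made for this sketch: it trades performance for coherency when the configuration is unclear.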
-
- 12 Sep, 2019 12 commits
-
-
Committed by Vivek Goyal
virtio-fs does not support aborting requests which are being processed. That is, requests which have been sent to the fuse daemon on the host. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Allow virtio-fs to also send DESTROY request. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Don't hold onto a dentry in the lru list if it needs to be looked up again anyway at next access. Only do this if explicitly enabled, otherwise it could result in a performance regression. A more advanced version of this patch would periodically flush out dentries from the lru which have gone stale. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Vivek Goyal
As of now fuse_dev_alloc() both allocates a fuse device and installs it in fuse_conn list. fuse_dev_alloc() can fail if fuse_device allocation fails. virtio-fs needs to initialize multiple fuse devices (one per virtio queue). It initializes one fuse device as part of call to fuse_fill_super_common() and rest of the devices are allocated and installed after that. But, we can't afford to fail after calling fuse_fill_super_common() as we don't have a way to undo all the actions done by fuse_fill_super_common(). So to avoid failures after the call to fuse_fill_super_common(), pre-allocate all fuse devices early and install them into fuse connection later. This patch provides two separate helpers for fuse device allocation and fuse device installation in fuse_conn. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
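The split into "allocate, which can fail" and "install, which cannot" is a general two-phase pattern. The following userspace sketch shows the shape of it; all names (`dev_model`, `conn_model`, `dev_alloc`, `dev_install`, `setup_queues`) and the limit of 8 queues are hypothetical.

```c
#include <stdlib.h>

#define MAX_QUEUES 8  /* hypothetical upper bound for this model */

struct dev_model { int installed; };

struct conn_model {
    struct dev_model *devices[MAX_QUEUES];
    size_t n;
};

/* Phase 1 helper: allocation only, may fail. */
struct dev_model *dev_alloc(void)
{
    return calloc(1, sizeof(struct dev_model));
}

/* Phase 2 helper: installation only, cannot fail. */
void dev_install(struct conn_model *c, struct dev_model *d)
{
    d->installed = 1;
    c->devices[c->n++] = d;
}

/* Allocate everything first; touch the connection only afterwards. */
int setup_queues(struct conn_model *c, size_t nqueues)
{
    struct dev_model *pre[MAX_QUEUES];
    size_t i;

    if (nqueues > MAX_QUEUES - c->n)
        return -1;
    for (i = 0; i < nqueues; i++) {
        pre[i] = dev_alloc();
        if (!pre[i]) {
            while (i > 0)
                free(pre[--i]);  /* nothing installed yet, easy to undo */
            return -1;
        }
    }
    /* Point of no return: from here on nothing can fail. */
    for (i = 0; i < nqueues; i++)
        dev_install(c, pre[i]);
    return 0;
}
```

Every failure exit happens before the connection is modified, so no undo path for the installed state is ever needed.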
-
Committed by Stefan Hajnoczi
The /dev/fuse device uses fiq->waitq and fasync to signal that requests are available. These mechanisms do not apply to virtio-fs. This patch introduces callbacks so alternative behavior can be used. Note that queue_interrupt() changes along these lines: spin_lock(&fiq->waitq.lock); wake_up_locked(&fiq->waitq); + kill_fasync(&fiq->fasync, SIGIO, POLL_IN); spin_unlock(&fiq->waitq.lock); - kill_fasync(&fiq->fasync, SIGIO, POLL_IN); Since queue_request() and queue_forget() also call kill_fasync() inside the spinlock this should be safe. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
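Introducing callbacks here means the shared queueing code calls through a table of function pointers and each transport supplies its own way of signalling new work. The sketch below shows that shape with a single callback; the type and function names are hypothetical and much simpler than the kernel's interface.

```c
#include <stddef.h>

struct iqueue_model;

/* Per-transport operations; one callback is enough for this sketch. */
struct iqueue_ops {
    void (*wake_pending)(struct iqueue_model *q);
};

struct iqueue_model {
    const struct iqueue_ops *ops;
    int pending;   /* queued requests */
    int wakeups;   /* times a reader on a device file was woken */
    int kicks;     /* times a virtqueue was kicked */
};

/* Character-device style transport: wake a sleeping reader. */
static void dev_wake_pending(struct iqueue_model *q) { q->wakeups++; }

/* Virtqueue style transport: notify the device instead. */
static void virtio_wake_pending(struct iqueue_model *q) { q->kicks++; }

const struct iqueue_ops dev_ops = { .wake_pending = dev_wake_pending };
const struct iqueue_ops virtio_ops = { .wake_pending = virtio_wake_pending };

/* Shared code: queue a request and let the transport signal it. */
void queue_request(struct iqueue_model *q)
{
    q->pending++;
    q->ops->wake_pending(q);
}
```

The shared `queue_request()` stays identical for both transports; only the table pointed to by `ops` differs.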
-
Committed by Stefan Hajnoczi
fuse_fill_super() includes code to process the fd= option and link the struct fuse_dev to the fd's struct file. In virtio-fs there is no file descriptor because /dev/fuse is not used. This patch extracts fuse_fill_super_common() so that both classic fuse and virtio-fs can share the code to initialize a mount. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Vivek Goyal
File systems like virtio-fs should not have to play directly with forget list data structures. There is a helper function; use that instead. Rename dequeue_forget() to fuse_dequeue_forget() and export it so that stacked filesystems can use it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Stefan Hajnoczi
virtio-fs will need unique IDs for FORGET requests from outside fs/fuse/dev.c. Make the symbol visible. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Vivek Goyal
This will be used by virtio-fs to send init request to fuse server after initialization of virt queues. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Stefan Hajnoczi
virtio-fs will need to query the length of fuse_arg lists. Make the symbol visible. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Stefan Hajnoczi
virtio-fs will need to complete requests from outside fs/fuse/dev.c. Make the symbol visible. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
The size of struct fuse_req was reduced from 392B to 144B on a non-debug config, thus the sanitize_global_limit() helper was setting a larger default limit. This doesn't really reflect reduction in the memory used by requests, since the fields removed from fuse_req were added to fuse_args derived structs; e.g. sizeof(struct fuse_writepages_args) is 248B, thus resulting in slightly more memory being used for writepage requests overall (due to using 256B slabs). Make the calculation ignore the size of fuse_req and use the old 392B value. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
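The effect of the per-request size on the default limit is easy to see with a small calculation. The sketch below assumes the limit is derived as a fixed fraction of total memory divided by the per-request size; the 1/2^13 fraction and the 65535 cap are assumptions made for illustration, and `default_limit` is a hypothetical function. Only the 392B and 144B sizes come from the message above.

```c
#include <stdint.h>

#define MEM_SHIFT 13u               /* assumed: budget is 1/2^13 of RAM */
#define LIMIT_CAP ((1u << 16) - 1)  /* assumed upper bound on the limit */

/* Default request limit for a given amount of RAM and request size. */
unsigned int default_limit(uint64_t total_ram_bytes, unsigned int req_size)
{
    uint64_t limit = (total_ram_bytes >> MEM_SHIFT) / req_size;

    if (limit > LIMIT_CAP)
        limit = LIMIT_CAP;
    return (unsigned int)limit;
}
```

Under these assumptions a machine with 1 GiB of RAM gets a limit of 334 with the 392B size and 910 with the 144B size. Pinning the divisor to 392 keeps the default where it was before struct fuse_req shrank.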
-
- 10 Sep, 2019 9 commits
-
-
Committed by Miklos Szeredi
The page array pointers are also duplicated across fuse_args_pages and fuse_req. Get rid of the fuse_req ones. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
No need to duplicate the argument arrays in fuse_req, so just dereference req->args instead of copying to the fuse_req internal ones. This allows further cleanup of the fuse_req structure. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Get rid of request specific fields in fuse_req that are not used anymore. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Page arrays are not allocated together with the request anymore. Get rid of the dead code. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
All requests are now sent with one of the fuse_simple_... helpers. Get rid of the old api from the fuse internal header. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Rename fuse_request_send_notify_reply() to fuse_simple_notify_reply() and convert to passing fuse_args instead of fuse_req. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Since we cannot reserve the request structure up-front, make sure that the request allocation doesn't fail using __GFP_NOFAIL. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
This is a straightforward conversion. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Bypass the fc->initialized check by setting the force flag. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-