- 02 Sep, 2020 24 commits
-
-
Committed by Liu Bo
task #28910367

Rather than requiring "-o default_permissions,allow_other" to be specified explicitly, virtiofs can set default values for them. With this, we can simply do "mount -t virtio_fs atest /mnt/test/ -otag=myfs-1,dax".

Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Liu Bo
task #28910367

virtio-fs will need to use it from outside fs/fuse/dev.c. Make the symbol visible.

Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit 519525fa47b5a8155f0b203e49a3a6a2319f75ae upstream

Allow fuse to pass RENAME_WHITEOUT to the fuse server. Overlayfs on top of virtiofs uses RENAME_WHITEOUT. Without this patch, renaming a directory in overlayfs (dir is on lower) fails with -EINVAL. With this patch it works.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 519525fa47b5a8155f0b203e49a3a6a2319f75ae)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit 724c15a43e2c7ac26e2d07abef99191162498fa9 upstream

While we wait for the queue to finish draining, use completions instead of usleep_range(). This is a better way of waiting for an event.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 724c15a43e2c7ac26e2d07abef99191162498fa9)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit 1efcf39eb627573f8d543ea396cf36b0651b1e56 upstream

We are sending the whole virtio_fs_forget struct to the other end over the virtqueue. The other end does not need to see elements like "struct list". That is an internal detail of the guest kernel. Fix it.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 1efcf39eb627573f8d543ea396cf36b0651b1e56)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
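
The idea in the message above can be sketched in userspace C. This is only an illustration, not the kernel code: the type names (in_header, forget_in, list_node, forget_req, forget) and the helper forget_wire_copy() are hypothetical stand-ins, and the field layout is an assumption. It shows the wire-visible part being split into its own struct so that list linkage never leaves the guest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the FUSE wire structures. */
struct in_header {
    uint32_t len, opcode;
    uint64_t unique, nodeid;
    uint32_t uid, gid, pid, padding;
};
struct forget_in { uint64_t nlookup; };

/* Guest-internal list linkage; must not be sent to the other end. */
struct list_node { struct list_node *next, *prev; };

/* Only this part is meant to travel over the queue. */
struct forget_req {
    struct in_header ih;
    struct forget_in arg;
};

/* Guest-side bookkeeping wraps the wire part. */
struct forget {
    struct list_node list;
    struct forget_req req;
};

/* Copy just the wire part into buf; returns bytes copied, or 0 if buf is too small. */
size_t forget_wire_copy(const struct forget *f, void *buf, size_t buflen)
{
    assert(f != NULL && buf != NULL);
    if (buflen < sizeof(f->req))
        return 0;
    memcpy(buf, &f->req, sizeof(f->req));
    return sizeof(f->req);
}
```

With this split, the number of bytes handed to the transport is sizeof(struct forget_req), independent of pointer size in the guest.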
-
Committed by Vivek Goyal
task #28910367

commit 58ada94f95f71d4f73197ab0e9603dbba6e47fe3 upstream

Currently we duplicate the logic to send forgets in two places. Consolidate the code by calling one helper function. This also uses virtqueue_add_outbuf() instead of virtqueue_add_sgs(). The former is simpler to call.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 58ada94f95f71d4f73197ab0e9603dbba6e47fe3)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by YueHaibing
task #28910367

commit 00929447f5758c4f64c74d0a4b40a6eb3d9df0e3 upstream

The 'static' keyword is expected to come first in a declaration, and we get warnings like this with "make W=1":

fs/fuse/virtio_fs.c:687:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
fs/fuse/virtio_fs.c:692:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
fs/fuse/virtio_fs.c:1029:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 00929447f5758c4f64c74d0a4b40a6eb3d9df0e3)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by zhengbin
task #28910367

commit 80da5a809d193c60d090cbdf4fe316781bc07965 upstream

Fixes gcc '-Wunused-but-set-variable' warning:

fs/fuse/virtio_fs.c: In function virtio_fs_wake_pending_and_unlock:
fs/fuse/virtio_fs.c:983:20: warning: variable fc set but not used [-Wunused-but-set-variable]

It is not used since commit 7ee1e2e631db ("virtiofs: No need to check fpq->connected state")

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit a9bfd9dd3417561d06c81de04f6d6c1e0c9b3d44 upstream

If the regular request queue gets full, currently we sleep for a bit and retry submission in the submitter's context. This assumes the submitter is not holding any spin lock. But this assumption is not true for background requests. For background requests, we are called with fc->bg_lock held.

This can lead to a deadlock where one thread is trying submission with fc->bg_lock held while the request completion thread has called fuse_request_end(), which tries to acquire fc->bg_lock and gets blocked. As the request completion thread is blocked, it does not make further progress, which means the queue does not empty and the submitter can't submit more requests.

To solve this issue, retry submission with the help of a worker, instead of retrying in the submitter's context. We already do this for hiprio/forget requests.

Reported-by: Chirantan Ekbote <chirantan@chromium.org>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit a9bfd9dd3417561d06c81de04f6d6c1e0c9b3d44)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit c17ea009610366146ec409fd6dc277e0f2510b10 upstream

If the virtqueue is full, we put forget requests on a list and these forgets are dispatched later using a worker. As of now we don't count these forgets in the fsvq->in_flight variable. This means that when the queue is being drained, we need special logic to first drain these pending requests and then wait for fsvq->in_flight to go to zero.

By counting pending forgets in fsvq->in_flight, we can get rid of the special logic and just wait for in_flight to go to zero. The worker thread will kick and drain all the forgets anyway, leading in_flight to zero.

I also need similar logic for the normal request queue in the next patch, where I am about to defer request submission to the worker context if the queue is full.

This simplifies the code a bit.

Also add two helper functions to inc/dec in_flight. The decrement helper will later be used to call completion when in_flight reaches zero.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit c17ea009610366146ec409fd6dc277e0f2510b10)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
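
The counting scheme described above can be modelled in a few lines of single-threaded userspace C. This is a sketch under assumptions, not the kernel helpers: vq_model, inc_in_flight() and dec_in_flight() are hypothetical names, locking is omitted, and the boolean `drained` merely stands in for signalling a completion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one queue's accounting (no locking shown). */
struct vq_model {
    unsigned int in_flight; /* requests submitted or parked for the worker */
    bool drained;           /* stand-in for a completion being signalled */
};

/* Count a request as soon as it is accepted, even if it is only parked. */
void inc_in_flight(struct vq_model *vq)
{
    vq->in_flight++;
    vq->drained = false;
}

/* Drop a request; whoever brings the count to zero signals the waiter. */
void dec_in_flight(struct vq_model *vq)
{
    assert(vq->in_flight > 0);
    if (--vq->in_flight == 0)
        vq->drained = true;
}
```

Because parked requests are counted too, a drain routine in this model only has to wait for `drained`; it needs no separate pass over a pending list.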
-
Committed by Vivek Goyal
task #28910367

commit 5dbe190f341206a7896f7e40c1e3a36933d812f3 upstream

The FR_SENT flag should be set when a request has been successfully sent over the virtqueue. This is used by the interrupt logic to figure out whether an interrupt request should be sent or not.

Also add it to the fpq->processing list after sending it successfully.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit 7ee1e2e631dbf0ff0df2a67a1e01ba3c1dce7a46 upstream

In virtiofs we keep per-queue connected state in virtio_fs_vq->connected and use that to end a request if the queue is not connected. virtiofs does not even touch the fpq->connected state.

We probably need to merge these two at some point. For now, simplify the code a bit and do not worry about checking the state of fpq->connected.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit 51fecdd2555b3e0e05a78d30093c638d164a32f9 upstream

The submission context can hold some locks which the end-request code tries to take again, and a deadlock can occur. For example, fc->bg_lock. If a background request is being submitted, it might hold fc->bg_lock, and if we could not submit the request (because the device went away) and tried to end the request, then a deadlock happens. During testing, I also got a warning from the deadlock detection code.

So put requests on a list and end requests from a worker thread.

I got the following warning from the deadlock detector.

[ 603.137138] WARNING: possible recursive locking detected
[ 603.137142] --------------------------------------------
[ 603.137144] blogbench/2036 is trying to acquire lock:
[ 603.137149] 00000000f0f51107 (&(&fc->bg_lock)->rlock){+.+.}, at: fuse_request_end+0xdf/0x1c0 [fuse]
[ 603.140701]
[ 603.140701] but task is already holding lock:
[ 603.140703] 00000000f0f51107 (&(&fc->bg_lock)->rlock){+.+.}, at: fuse_simple_background+0x92/0x1d0 [fuse]
[ 603.140713]
[ 603.140713] other info that might help us debug this:
[ 603.140714] Possible unsafe locking scenario:
[ 603.140714]
[ 603.140715] CPU0
[ 603.140716] ----
[ 603.140716] lock(&(&fc->bg_lock)->rlock);
[ 603.140718] lock(&(&fc->bg_lock)->rlock);
[ 603.140719]
[ 603.140719] *** DEADLOCK ***

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit 112e72373d1f60f1e4558d0a7f0de5da39a1224d upstream

We have been calling it virtio_fs and even the file name is virtio_fs.c. The module name is virtio_fs.ko, but when registering the file system the user is supposed to specify the filesystem type as "virtiofs".

Masayoshi Mizuma reported that he specified the filesystem type as "virtio_fs" and got this warning on the console.

------------[ cut here ]------------
request_module fs-virtio_fs succeeded, but still no fs?
WARNING: CPU: 1 PID: 1234 at fs/filesystems.c:274 get_fs_type+0x12c/0x138
Modules linked in: ... virtio_fs fuse virtio_net net_failover ...
CPU: 1 PID: 1234 Comm: mount Not tainted 5.4.0-rc1 #1

So it looks like the kernel could find the module virtio_fs.ko but could not find the filesystem type after that.

It is probably better to rename the module to virtiofs.ko so that the above warning goes away in case the user ends up specifying the wrong fs name.

Reported-by: Masayoshi Mizuma <msys.mizuma@gmail.com>
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit 112e72373d1f60f1e4558d0a7f0de5da39a1224d)
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Stefan Hajnoczi
task #28910367

commit a62a8ef9d97da23762a588592c8b8eb50a8deb6a upstream

Add a basic file system module for virtio-fs. This does not yet contain shared data support between host and guest or metadata coherency speedups. However it is already significantly faster than virtio-9p.

Design Overview
===============

With the goal of designing something with better performance and local file system semantics, a bunch of ideas were proposed.

- Use fuse protocol (instead of 9p) for communication between guest and host. Guest kernel will be fuse client and a fuse server will run on host to serve the requests.

- For data access inside guest, mmap portion of file in QEMU address space and guest accesses this memory using dax. That way guest page cache is bypassed and there is only one copy of data (on host). This will also enable mmap(MAP_SHARED) between guests.

- For metadata coherency, there is a shared memory region which contains version number associated with metadata and any guest changing metadata updates version number and other guests refresh metadata on next access. This is yet to be implemented.

How virtio-fs differs from existing approaches
==============================================

The unique idea behind virtio-fs is to take advantage of the co-location of the virtual machine and hypervisor to avoid communication (vmexits).

DAX allows file contents to be accessed without communication with the hypervisor. The shared memory region for metadata avoids communication in the common case where metadata is unchanged.

By replacing expensive communication with cheaper shared memory accesses, we expect to achieve better performance than approaches based on network file system protocols. In addition, this also makes it easier to achieve local file system semantics (coherency).

These techniques are not applicable to network file system protocols since the communications channel is bypassed by taking advantage of shared memory on a local machine. This is why we decided to build virtio-fs rather than focus on 9P or NFS.

Caching Modes
=============

Like virtio-9p, different caching modes are supported which determine the coherency level as well. The "cache=FOO" and "writeback" options control the level of coherence between the guest and host filesystems.

- cache=none
  metadata, data and pathname lookup are not cached in guest. They are always fetched from host and any changes are immediately pushed to host.

- cache=always
  metadata, data and pathname lookup are cached in guest and never expire.

- cache=auto
  metadata and pathname lookup cache expires after a configured amount of time (default is 1 second). Data is cached while the file is open (close to open consistency).

- writeback/no_writeback
  These options control the writeback strategy. If writeback is disabled, then normal writes will immediately be synchronized with the host fs. If writeback is enabled, then writes may be cached in the guest until the file is closed or an fsync(2) performed. This option has no effect on mmap-ed writes or writes going through the DAX mechanism.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
(cherry picked from commit a62a8ef9d97da23762a588592c8b8eb50a8deb6a)
[Liubo: given that 4.19 lacks fs_context support for parsing mount options, I changed this back to the 4.19 way, so we still use -o tag=myfs-1 to get a virtiofs mount.]
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Miklos Szeredi
task #28910367

commit 8fab010644363f8f80194322aa7a81e38c867af3 upstream

Don't hold onto a dentry in the lru list if it needs to be looked up again at the next access anyway.

Only do this if explicitly enabled, otherwise it could result in a performance regression.

A more advanced version of this patch would periodically flush out dentries from the lru which have gone stale.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit 4388c5aac4bae5c83a2c66882043942002ba09a2 upstream

Stacked file systems like virtio-fs do not have to play directly with the forget list data structures. There is a helper function; use that instead.

Rename dequeue_forget() to fuse_dequeue_forget() and export it so that stacked filesystems can use it.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Stefan Hajnoczi
task #28910367

commit 79d96efffda7597b41968d5d8813b39fc2965f1b upstream

virtio-fs will need unique IDs for FORGET requests from outside fs/fuse/dev.c. Make the symbol visible.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit 0cd1eb9a4160a96e0ec9b93b2e7b489f449bf22d upstream

As of now fuse_dev_alloc() both allocates a fuse device and installs it in the fuse_conn list. fuse_dev_alloc() can fail if fuse_device allocation fails.

virtio-fs needs to initialize multiple fuse devices (one per virtio queue). It initializes one fuse device as part of the call to fuse_fill_super_common() and the rest of the devices are allocated and installed after that.

But we can't afford to fail after calling fuse_fill_super_common(), as we don't have a way to undo all the actions done by fuse_fill_super_common(). So to avoid failures after the call to fuse_fill_super_common(), pre-allocate all fuse devices early and install them into the fuse connection later.

This patch provides two separate helpers for fuse device allocation and fuse device installation in fuse_conn.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
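
The "allocate first, install later" split described above is a general pattern and can be sketched in userspace C. The names below (dev_model, conn_model, devs_prealloc(), devs_install(), devs_free()) are hypothetical and do not correspond to the kernel helpers; the point is only that every step that can fail happens before the step that cannot be undone.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical models of a device and the connection that owns devices. */
struct dev_model { int installed; };
struct conn_model { struct dev_model **devs; size_t ndevs; };

/* Free a (possibly partially filled) device array. */
void devs_free(struct dev_model **devs, size_t n)
{
    if (!devs)
        return;
    for (size_t i = 0; i < n; i++)
        free(devs[i]); /* free(NULL) is a no-op */
    free(devs);
}

/* Phase 1: may fail; on failure nothing has been published yet. */
struct dev_model **devs_prealloc(size_t n)
{
    struct dev_model **devs = calloc(n, sizeof(*devs));
    if (!devs)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        devs[i] = calloc(1, sizeof(**devs));
        if (!devs[i]) {
            devs_free(devs, n);
            return NULL;
        }
    }
    return devs;
}

/* Phase 2: cannot fail, so it is safe after the point of no return. */
void devs_install(struct conn_model *c, struct dev_model **devs, size_t n)
{
    assert(c != NULL && devs != NULL);
    for (size_t i = 0; i < n; i++)
        devs[i]->installed = 1;
    c->devs = devs;
    c->ndevs = n;
}
```

A caller in this model would run devs_prealloc() for all queues, bail out early on NULL, perform the irreversible setup, and only then call devs_install().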
-
Committed by Stefan Hajnoczi
task #28910367

commit ae3aad77f46fbba56eff7141b2fc49870b60827e upstream

The /dev/fuse device uses fiq->waitq and fasync to signal that requests are available. These mechanisms do not apply to virtio-fs. This patch introduces callbacks so alternative behavior can be used.

Note that queue_interrupt() changes along these lines:

    spin_lock(&fiq->waitq.lock);
    wake_up_locked(&fiq->waitq);
  + kill_fasync(&fiq->fasync, SIGIO, POLL_IN);
    spin_unlock(&fiq->waitq.lock);
  - kill_fasync(&fiq->fasync, SIGIO, POLL_IN);

Since queue_request() and queue_forget() also call kill_fasync() inside the spinlock this should be safe.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Vivek Goyal
task #28910367

commit 95a84cdb11c26315a6d34664846f82c438c961a1 upstream

This will be used by virtio-fs to send the init request to the fuse server after initialization of the virt queues.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Stefan Hajnoczi
task #28910367

commit 14d46d7abc3973a47e8eb0eb5eb87ee8d910a505 upstream

virtio-fs will need to query the length of fuse_arg lists. Make the symbol visible.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Stefan Hajnoczi
task #28910367

commit 04ec5af0776e9baefed59891f12adbcb5fa71a23 upstream

virtio-fs will need to complete requests from outside fs/fuse/dev.c. Make the symbol visible.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
Committed by Stefan Hajnoczi
task #28910367

commit 0cc2656cdb0b1f234e6d29378cb061e29d7522bc upstream

fuse_fill_super() includes code to process the fd= option and link the struct fuse_dev to the fd's struct file. In virtio-fs there is no file descriptor because /dev/fuse is not used.

This patch extracts fuse_fill_super_common() so that both classic fuse and virtio-fs can share the code to initialize a mount.

parse_fuse_opt() is also extracted so that the fuse_fill_super_common() caller has access to the mount options. This allows classic fuse to handle the fd= option outside fuse_fill_super_common().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
-
- 17 Jan, 2020 1 commit
-
-
Committed by David Howells
commit 00e23707442a75b404392cef1405ab4fd498de6b upstream.

Use accessor functions to access an iterator's type and direction. This allows for the possibility of using some other method of determining the type of iterator than if-chains with bitwise-AND conditions.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
- 15 Jan, 2020 1 commit
-
-
Committed by Ma Jie Yue
Failover of the fuse userspace daemon reuses the existing fuse conn, without unmounting it, during the daemon crash and recovery procedure. But when the crash happens, some requests might still be in process in the daemon, with no reply sent yet. This leaves the application stuck, since it will never get the reply after the failover.

We add a sysfs API to flush these requests after the daemon crash and before recovery. The issue is easy to reproduce in the fuse userspace daemon: just exit after receiving a request and before sending the reply back. The application will hang in some read/write operation until echo 1 > /sys/fs/fuse/connection/xxx/flush is run. The flush operation makes the I/O fail and returns the error to the application.

Signed-off-by: Ma Jie Yue <majieyue@linux.alibaba.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Acked-by: Caspar Zhang <caspar@linux.alibaba.com>
-
- 13 Dec, 2019 2 commits
-
-
Committed by Miklos Szeredi
commit eb59bd17d2fa6e5e84fba61a5ebdea984222e6d5 upstream.

If a filesystem returns negative inode sizes, future reads on the file were causing the cpu to spin on truncate_pagecache.

Create a helper to validate the attributes. This now does two things:

- check the file mode
- check if the file size fits in i_size without overflowing

Reported-by: Arijit Banerjee <arijit@rubrik.com>
Fixes: d8a5ba45 ("[PATCH] FUSE - core")
Cc: <stable@vger.kernel.org> # v2.6.14
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
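
The two checks listed above can be illustrated with a small userspace sketch. It is not the kernel helper: attr_valid_type() and attr_is_invalid() are hypothetical names, the file-type constants are the octal values used on Linux (written out here as an assumption so the block needs no platform headers), and i_size is assumed to be a signed 64-bit quantity.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption in this sketch: i_size is a signed 64-bit value. */
static_assert(sizeof(long long) == 8, "sketch assumes 64-bit long long");

#define MODE_FMT 0170000u /* file-type bits of a mode, as on Linux */

/* True if the type bits name one of the seven known file types. */
bool attr_valid_type(uint32_t mode)
{
    static const uint32_t known[] = {
        0100000u, /* regular file */
        0040000u, /* directory */
        0120000u, /* symlink */
        0020000u, /* character device */
        0060000u, /* block device */
        0010000u, /* fifo */
        0140000u, /* socket */
    };
    for (size_t i = 0; i < sizeof(known) / sizeof(known[0]); i++)
        if ((mode & MODE_FMT) == known[i])
            return true;
    return false;
}

/* Reject attributes with an unknown type or a size that would turn
 * negative when stored in a signed 64-bit i_size. */
bool attr_is_invalid(uint32_t mode, uint64_t size)
{
    return !attr_valid_type(mode) || size > (uint64_t)LLONG_MAX;
}
```

A server that reports a size of 0xffffffffffffffff would be rejected here instead of being stored as -1.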
-
Committed by Miklos Szeredi
commit c634da718db9b2fac201df2ae1b1b095344ce5eb upstream.

When adding a new hard link, make sure that i_nlink doesn't overflow.

Fixes: ac45d613 ("fuse: fix nlink after unlink")
Cc: <stable@vger.kernel.org> # v3.4
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
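
A guard of this kind can be sketched as follows, assuming the link count is an unsigned int. The function name link_count_inc() is hypothetical and the sketch is only a userspace illustration of saturating instead of wrapping.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Increment a link count, but never let it wrap around to zero. */
void link_count_inc(unsigned int *nlink)
{
    assert(nlink != NULL);
    if (*nlink < UINT_MAX)
        (*nlink)++;
}
```

Without the comparison, an increment at UINT_MAX would wrap to 0, which would make a still-linked inode look deleted.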
-
- 21 Nov, 2019 1 commit
-
-
Committed by Kirill Tkhai
[ Upstream commit 2a23f2b8adbe4bd584f936f7ac17a99750eed9d7 ]

Since they are of unsigned int type, it's allowed to read them unlocked during reporting to userspace. Let's underline this fact with READ_ONCE() macros.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
-
- 06 Nov, 2019 2 commits
-
-
Committed by Miklos Szeredi
commit e4648309b85a78f8c787457832269a8712a8673e upstream.

Make sure cached writes are not reordered around open(..., O_TRUNC), with the obvious wrong results.

Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Miklos Szeredi
commit b24e7598db62386a95a3c8b9c75630c5d56fe077 upstream.

If writeback cache is enabled, then writes might get reordered with chmod/chown/utimes. The problem with this is that performing the write in the fuse daemon might itself change some of these attributes. In such a case the following sequence of operations will result in the file ending up with the wrong mode, for example:

    int fd = open ("suid", O_WRONLY|O_CREAT|O_EXCL);
    write (fd, "1", 1);
    fchown (fd, 0, 0);
    fchmod (fd, 04755);
    close (fd);

This patch fixes this by flushing pending writes before performing chown/chmod/utimes.

Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 12 Oct, 2019 1 commit
-
-
Committed by zhengbin
[ Upstream commit 9ad09b1976c562061636ff1e01bfc3a57aebe56b ]

If cuse_send_init fails, we need to fuse_conn_put cc->fc.

cuse_channel_open->fuse_conn_init->refcount_set(&fc->count, 1)
                 ->fuse_dev_alloc->fuse_conn_get
                 ->fuse_dev_free->fuse_conn_put

Fixes: cc080e9e ("fuse: introduce per-instance fuse_dev structure")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
-
- 05 Oct, 2019 2 commits
-
-
Committed by Eric Biggers
[ Upstream commit 76e43c8ccaa35c30d5df853013561145a0f750a5 ]

When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock.

This may have to wait for fuse_iqueue::waitq.lock to be released by one of many places that take it with IRQs enabled. Since the IRQ handler may take kioctx::ctx_lock, lockdep reports that a deadlock is possible.

Fix it by protecting the state of struct fuse_iqueue with a separate spinlock, and only accessing fuse_iqueue::waitq using the versions of the waitqueue functions which do IRQ-safe locking internally.

Reproducer:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mount.h>
    #include <sys/stat.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <linux/aio_abi.h>

    int main()
    {
            char opts[128];
            int fd = open("/dev/fuse", O_RDWR);
            aio_context_t ctx = 0;
            struct iocb cb = { .aio_lio_opcode = IOCB_CMD_POLL, .aio_fildes = fd };
            struct iocb *cbp = &cb;

            sprintf(opts, "fd=%d,rootmode=040000,user_id=0,group_id=0", fd);
            mkdir("mnt", 0700);
            mount("foo", "mnt", "fuse", 0, opts);
            syscall(__NR_io_setup, 1, &ctx);
            syscall(__NR_io_submit, ctx, 1, &cbp);
    }

Beginning of lockdep output:

=====================================================
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.3.0-rc5 #9 Not tainted
-----------------------------------------------------
syz_fuse/135 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
000000003590ceda (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
000000003590ceda (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1751 [inline]
000000003590ceda (&fiq->waitq){+.+.}, at: __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825

and this task is already holding:
0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline]
0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline]
0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825
which would create a new lock dependency:
(&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.}

but this new dependency connects a SOFTIRQ-irq-safe lock:
(&(&ctx->ctx_lock)->rlock){..-.}

[...]

Reported-by: syzbot+af05535bb79520f95431@syzkaller.appspotmail.com
Reported-by: syzbot+d86c4426a01f60feddc7@syzkaller.appspotmail.com
Fixes: bfe4037e ("aio: implement IOCB_CMD_POLL")
Cc: <stable@vger.kernel.org> # v4.19+
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Vasily Averin
commit d5880c7a8620290a6c90ced7a0e8bd0ad9419601 upstream.

unlock_page() was missing in case of an already in-flight write against the same page.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Fixes: ff17be08 ("fuse: writepage: skip already in flight")
Cc: <stable@vger.kernel.org> # v3.13
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 15 Jun, 2019 1 commit
-
-
Committed by Kirill Smelkov
[ Upstream commit 7640682e67b33cab8628729afec8ca92b851394f ]

The FUSE filesystem server and kernel client negotiate, during the initialization phase, the maximum write size the client will ever issue. Correspondingly, the filesystem server then queues sys_read calls to read requests with buffer capacity large enough to carry the request header plus max_write bytes. A filesystem server is free to set its max_write anywhere in the range [1*page, fc->max_pages*page]. In particular go-fuse[2] sets max_write to 64K by default, whereas the default fc->max_pages corresponds to 128K. Libfuse also allows users to configure max_write, but by default presets it to the possible maximum.

If max_write is < fc->max_pages*page, and in the NOTIFY_RETRIEVE handler we allow retrieving more than max_write bytes, the corresponding prepared NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the filesystem server, in full correspondence with the server/client contract, will only be queuing sys_read with ~max_write buffer capacity, and fuse_dev_do_read throws away requests that cannot fit into the server request buffer. In turn the filesystem server could get stuck waiting indefinitely for NOTIFY_REPLY, since the NOTIFY_RETRIEVE handler returned OK, which is understood by clients to mean that NOTIFY_REPLY was queued and will be sent back.

Cap the requested size to the negotiated max_write to avoid the problem. This aligns with the way the NOTIFY_RETRIEVE handler works, which already unconditionally caps the requested retrieve size to fuse_conn->max_pages. This way it should not hurt NOTIFY_RETRIEVE semantics if we return less data than was originally requested.

Please see [1] for context where the problem of a stuck filesystem was hit for real, how the situation was traced, and for a more involved patch that did not make it into the tree.

[1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2
[2] https://github.com/hanwen/go-fuse

Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Cc: Han-Wen Nienhuys <hanwen@google.com>
Cc: Jakob Unterwurzacher <jakobunt@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
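
The capping rule described in this message amounts to clamping the requested length first to max_write and then to what remains in the file. A minimal userspace sketch follows; retrieve_size() and its parameters are hypothetical names, not the kernel function, and the file-size clamp is included only as an assumed second step to make the example complete.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes to return for a retrieve request.
 * requested:  size asked for by the server
 * max_write:  limit negotiated at init time (assumed non-zero here)
 * file_size, offset: used to clamp to the end of the file */
uint64_t retrieve_size(uint32_t requested, uint32_t max_write,
                       uint64_t file_size, uint64_t offset)
{
    assert(max_write > 0);

    /* Never prepare a reply larger than the server's read buffer. */
    uint64_t num = requested < max_write ? requested : max_write;

    if (offset > file_size)
        return 0;
    if (num > file_size - offset)
        num = file_size - offset;
    return num;
}
```

For the go-fuse defaults quoted above, a 128K request against a 64K max_write yields a 64K reply, which fits the buffer the server has queued.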
-
- 11 Jun, 2019 1 commit
-
-
Committed by Miklos Szeredi
commit 35d6fcbb7c3e296a52136347346a698a35af3fda upstream.

Do the proper cleanup in case the size check fails.

Tested with xfstests:generic/228

Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 0cbade024ba5 ("fuse: honor RLIMIT_FSIZE in fuse_file_fallocate")
Cc: Liu Bo <bo.liu@linux.alibaba.com>
Cc: <stable@vger.kernel.org> # v3.5
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 26 May, 2019 3 commits
-
-
Committed by Kirill Smelkov
commit bbd84f33652f852ce5992d65db4d020aba21f882 upstream.

Starting from commit 9c225f26 ("vfs: atomic f_pos accesses as per POSIX"), files opened even via nonseekable_open gate read and write via a lock and do not allow them to run simultaneously. This can create a read vs write deadlock if a filesystem is trying to implement a socket-like file which is intended to be used simultaneously for both read and write from the filesystem client. See commit 10dce8af3422 ("fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock") for details and e.g. commit 581d21a2 ("xenbus: fix deadlock on writes to /proc/xen/xenbus") for a similar deadlock example on /proc/xen/xenbus.

To avoid such a deadlock it was tempting to adjust fuse_finish_open to use stream_open instead of nonseekable_open on just the FOPEN_NONSEEKABLE flag, but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE, and in particular GVFS, which actually uses the offset in its read and write handlers

https://codesearch.debian.net/search?q=-%3Enonseekable+%3D
https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080
https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346
https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481

so if we made such a change it would break a real user.

Add another flag (FOPEN_STREAM) for filesystem servers to indicate that the opened handle has stream-like semantics: it does not use the file position, and thus the kernel is free to issue simultaneous read and write requests on the opened file handle.

This patch together with stream_open() should be added to stable kernels starting from v3.14+. This will allow patching OSSPD and other FUSE filesystems that provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE in the open handler and this way avoid the deadlock on all kernel versions. This should work because fuse_finish_open ignores unknown open flags returned from a filesystem, and so passing FOPEN_STREAM to a kernel that is not aware of this flag cannot hurt. In turn, a kernel that is not aware of FOPEN_STREAM will be < v3.14, where just FOPEN_NONSEEKABLE is sufficient to implement streams without the read vs write deadlock.

Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Liu Bo
commit 0cbade024ba501313da3b7e5dd2a188a6bc491b5 upstream.

fstests generic/228 reported a failure: fuse fallocate does not honor what 'ulimit -f' has set.

This adds the necessary inode_newsize_ok() check.

Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Fixes: 05ba1f08 ("fuse: add FALLOCATE operation")
Cc: <stable@vger.kernel.org> # v3.5
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Committed by Miklos Szeredi
commit 9de5be06d0a89ca97b5ab902694d42dfd2bb77d2 upstream.

Writepage requests were cropped to i_size & 0xffffffff, which meant that mmaped writes to any file larger than 4G might be silently discarded.

Fix by storing the file size in a properly sized variable (loff_t instead of size_t).

Reported-by: Antonio SJ Musumeci <trapexit@spawn.link>
Fixes: 6eaf4782 ("fuse: writepages: crop secondary requests")
Cc: <stable@vger.kernel.org> # v3.13
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
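
The effect of cropping against i_size & 0xffffffff can be reproduced with a small userspace model. The helper cropped_write_len() is a hypothetical name and a simplification; it only shows how a write beyond the truncated size collapses to zero bytes while the same write against the full 64-bit size goes through.

```c
#include <assert.h>
#include <stdint.h>

/* How many of data_size bytes at offset survive cropping to size. */
uint64_t cropped_write_len(uint64_t offset, uint64_t data_size, uint64_t size)
{
    assert(data_size > 0); /* zero-length writes are not modelled */
    if (offset >= size)
        return 0;                /* entirely past the end: discarded */
    if (data_size > size - offset)
        return size - offset;    /* straddles the end: cropped */
    return data_size;
}
```

For a 5 GiB file, the low 32 bits of the size are 1 GiB, so a page written at 4.5 GiB is kept when the full size is used and dropped when the truncated size is used.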
-
- 04 May, 2019 1 commit
-
-
Committed by Matthew Wilcox
commit 15fab63e1e57be9fdb5eec1bbc5916e9825e9acb upstream.

Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers are converted to handle a failure.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
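
The general shape of a "get" operation that can refuse is sketched below in userspace C. The name try_get_ref() and the exact refusal conditions are assumptions for illustration, not the kernel's implementation: the model refuses when the count is not positive or when one more reference would overflow a signed int.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Take a reference if that is safe; report failure instead of overflowing. */
bool try_get_ref(int *refcount)
{
    assert(refcount != NULL);
    if (*refcount <= 0 || *refcount == INT_MAX)
        return false;
    (*refcount)++;
    return true;
}
```

Callers in this model must check the return value and back out, which is the conversion the message above describes for every caller.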
-