- 13 Feb, 2019 1 commit
-
-
Committed by Miklos Szeredi
commit a2ebba824106dabe79937a9f29a875f837e1b6d4 upstream.

NR_WRITEBACK_TEMP is accounted on the temporary page in the request, not the page cache page.

Fixes: 8b284dc4 ("fuse: writepages: handle same page rewrites")
Cc: <stable@vger.kernel.org> # v3.13
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 20 Dec, 2018 1 commit
-
-
Committed by Chad Austin
commit 2e64ff154ce6ce9a8dc0f9556463916efa6ff460 upstream.

When FUSE_OPEN returns ENOSYS, the no_open bit is set on the connection. Because the FUSE_RELEASE and FUSE_RELEASEDIR paths share code, this incorrectly caused the FUSE_RELEASEDIR request to be dropped and never sent to userspace. Pass an isdir bool to distinguish between FUSE_RELEASE and FUSE_RELEASEDIR inside of fuse_file_put.

Fixes: 7678ac50 ("fuse: support clients that don't implement 'open'")
Cc: <stable@vger.kernel.org> # v3.14
Signed-off-by: Chad Austin <chadaustin@fb.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
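The decision described in this message can be sketched in plain userspace C. This is only an illustrative model, not the kernel code: `struct conn_model` and `must_send_release()` are hypothetical names, and only `no_open` and `isdir` come from the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-connection state. Only the field
 * name no_open is taken from the commit message. */
struct conn_model {
    bool no_open;   /* set once FUSE_OPEN has returned ENOSYS */
};

/* Decide whether a release-type request has to reach userspace.
 * A FUSE_RELEASE may be skipped when the server never implemented open,
 * but a FUSE_RELEASEDIR must still be sent, hence the isdir argument. */
bool must_send_release(const struct conn_model *fc, bool isdir)
{
    if (fc->no_open && !isdir)
        return false;   /* plain file, server has no open: nothing to send */
    return true;
}
```

Without the `isdir` argument the first branch would also swallow directory releases, which is the bug the commit describes.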
-
- 21 Nov, 2018 1 commit
-
-
Committed by Lukas Czerner
commit ebacb81273599555a7a19f7754a1451206a5fc4f upstream.

In the async IO blocking case, an additional reference to the io is taken so that it survives fuse_aio_complete(). In the non-blocking case this additional reference is not needed, yet we still dereference io to figure out whether to wait for completion or not. This is wrong and will lead to a use-after-free. Fix it by storing the blocking information in a separate variable.

This was spotted by KASAN when running the generic/208 fstest.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 744742d6 ("fuse: Add reference counting for fuse_io_priv")
Cc: <stable@vger.kernel.org> # v4.6
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
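A minimal userspace sketch of the pattern behind this fix, assuming a simple non-atomic reference count. The names `io_model`, `io_put()` and `submit()` are hypothetical and do not mirror the kernel functions; the point is only that the flag is copied out before the object can be freed.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a reference-counted io descriptor. */
struct io_model {
    int  refs;      /* object lifetime */
    bool blocking;  /* caller wants to wait for completion */
};

/* Drop one reference and free the object on the last put. */
void io_put(struct io_model *io)
{
    if (--io->refs == 0)
        free(io);
}

/* Returns true when the caller should wait for completion. */
bool submit(struct io_model *io)
{
    bool blocking = io->blocking;   /* snapshot while io is still valid */

    if (blocking)
        io->refs++;                 /* extra reference for the waiter */

    io_put(io);                     /* models completion dropping its reference */

    /* In the non-blocking case io may already be freed here, so only
     * the local copy is consulted. */
    return blocking;
}
```

In the blocking case the caller still owns one reference after `submit()` returns and is expected to call `io_put()` once it has finished waiting.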
-
- 26 Jul, 2018 2 commits
-
-
Committed by Souptick Joarder
Use new return type vm_fault_t for fault handler in struct vm_operations_struct. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type.

commit 1c8f4220 ("mm: change return type to vm_fault_t")

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Kirill Tkhai
The error path above returns with the page unlocked, so this path should behave the same way.

Fixes: f8dbdf81 ("fuse: rework fuse_readpages()")
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 21 Jul, 2018 1 commit
-
-
Committed by Eric W. Biederman
The cost is the same and this removes the need to worry about complications that come from de_thread and group_leader changing.

__task_pid_nr_ns has been updated to take advantage of this change.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-
- 12 Feb, 2018 1 commit
-
-
Committed by Linus Torvalds
This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script:

    for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
        L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
        for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
    done

with de-mangling cleanups yet to come.

NOTE! On almost all architectures, the EPOLL* constants have the same values as the POLL* constants do. But the keyword here is "almost". For various bad reasons they aren't the same, and epoll() doesn't actually work quite correctly in some cases due to this on Sparc et al.

The next patch from Al will sort out the final differences, and we should be all done.

Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 30 Nov, 2017 1 commit
-
-
Committed by Al Viro
mangle/demangle on the way to/from userland

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 28 Nov, 2017 1 commit
-
-
Committed by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 12 Sep, 2017 3 commits
-
-
Committed by Miklos Szeredi
The refreshed argument isn't used by any caller, get rid of it.

Use a helper for just updating the inode (no need to fill in a kstat).

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
If the IOCB_DSYNC flag is set, a sync is not being performed by fuse_file_write_iter.

Honor IOCB_DSYNC/IOCB_SYNC by setting O_DSYNC/O_SYNC respectively in the flags field of the write request.

We don't need to sync data or metadata, since fuse_perform_write() does write-through and the filesystem is responsible for updating file times.

Original patch by Vitaly Zolotusky.

Reported-by: Nate Clark <nate@neworld.us>
Cc: Vitaly Zolotusky <vitaly@unitc.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
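The flag mapping in this message can be sketched in userspace C as follows. This is a sketch under stated assumptions: `MODEL_IOCB_DSYNC`, `MODEL_IOCB_SYNC` and `write_request_flags()` are hypothetical names with made-up bit values, while `O_DSYNC` and `O_SYNC` are the real constants from `<fcntl.h>`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* make sure <fcntl.h> exposes O_DSYNC */
#endif
#include <fcntl.h>

/* Hypothetical stand-ins for the kernel's per-iocb flags; the bit values
 * are invented for this sketch. */
#define MODEL_IOCB_DSYNC 0x1
#define MODEL_IOCB_SYNC  0x2

/* Translate per-call sync flags into open-style flags that would be
 * placed in the flags field of the write request. */
int write_request_flags(int iocb_flags)
{
    int flags = 0;

    if (iocb_flags & MODEL_IOCB_DSYNC)
        flags |= O_DSYNC;
    if (iocb_flags & MODEL_IOCB_SYNC)
        flags |= O_SYNC;
    return flags;
}
```

The userspace filesystem can then inspect the flags it receives and decide how to persist the data, which matches the message's point that the kernel side does not need to sync by itself.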
-
Committed by Miklos Szeredi
Commit 0b6e9ea0 ("fuse: Add support for pid namespaces") broke Sandstorm.io development tools, which have been sending FUSE file descriptors across PID namespace boundaries since early 2014.

The above patch added a check that prevented I/O on the fuse device file descriptor if the pid namespace of the reader/writer was different from the pid namespace of the mounter. With this change passing the device file descriptor to a different pid namespace simply doesn't work. The check was added because pids are transferred to/from the fuse userspace server in the namespace registered at mount time.

To fix this regression, remove the checks and do the following:

1) The pid in the request header (the pid of the task that initiated the filesystem operation) is translated to the reader's pid namespace. If a mapping doesn't exist for this pid, then a zero pid is used. Note: even if a mapping would exist between the initiator task's pid namespace and the reader's pid namespace, the pid will be zero if either the mapping from the initiator's to the mounter's namespace or the mapping from the mounter's to the reader's namespace doesn't exist.

2) The lk.pid value in setlk/setlkw requests and getlk reply is left alone. Userspace should not interpret this value anyway. Also allow the setlk/setlkw operations if the pid of the task cannot be represented in the mounter's namespace (pid being zero in that case).

Reported-by: Kenton Varda <kenton@sandstorm.io>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 0b6e9ea0 ("fuse: Add support for pid namespaces")
Cc: <stable@vger.kernel.org> # v4.12+
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Seth Forshee <seth.forshee@canonical.com>
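The two-step translation in point 1 can be illustrated with a toy model. Everything here is an assumption made for illustration: namespaces are small integer indices, each task carries one pid per namespace (0 meaning "not visible there"), and `task_model`, `pid_in_ns()`, `find_task_by_pid()` and `header_pid_for_reader()` are hypothetical names, not kernel APIs.

```c
#include <stddef.h>

#define NS_COUNT 3   /* toy number of pid namespaces */

/* Toy task: its pid as seen from each namespace, 0 if not visible. */
struct task_model {
    int pid_in[NS_COUNT];
};

int pid_in_ns(const struct task_model *t, int ns)
{
    return t->pid_in[ns];
}

/* Look a task up by the pid it has inside one namespace. */
const struct task_model *find_task_by_pid(const struct task_model *tasks,
                                          size_t n, int ns, int pid)
{
    if (pid == 0)
        return NULL;
    for (size_t i = 0; i < n; i++)
        if (tasks[i].pid_in[ns] == pid)
            return &tasks[i];
    return NULL;
}

/* Step 1: express the initiator's pid in the mounter's namespace.
 * Step 2: translate that pid into the reader's namespace.
 * A missing mapping at either step yields 0. */
int header_pid_for_reader(const struct task_model *tasks, size_t n,
                          const struct task_model *initiator,
                          int mounter_ns, int reader_ns)
{
    int pid = pid_in_ns(initiator, mounter_ns);
    const struct task_model *t = find_task_by_pid(tasks, n, mounter_ns, pid);

    return t ? pid_in_ns(t, reader_ns) : 0;
}
```

The model reproduces the note in point 1: a task that is visible in the reader's namespace but not in the mounter's namespace still ends up with pid 0, because the first step already fails.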
-
- 11 Aug, 2017 1 commit
-
-
Committed by Jeff Layton
This ensures that we see errors on fsync when writeback fails.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 03 Aug, 2017 1 commit
-
-
Committed by Ashish Samant
Commit 8fba54ae ("fuse: direct-io: don't dirty ITER_BVEC pages") fixes the ITER_BVEC page deadlock for direct io in fuse by checking in fuse_direct_io(), whether the page is a bvec page or not, before locking it. However, this check is missed when the "async_dio" mount option is enabled. In this case, set_page_dirty_lock() is called from the req->end callback in request_end(), when the fuse thread is returning from userspace to respond to the read request. This will cause the same deadlock because the bvec condition is not checked in this path.

Here is the stack of the deadlocked thread, while returning from userspace:

[13706.656686] INFO: task glusterfs:3006 blocked for more than 120 seconds.
[13706.657808] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13706.658788] glusterfs D ffffffff816c80f0 0 3006 1 0x00000080
[13706.658797] ffff8800d6713a58 0000000000000086 ffff8800d9ad7000 ffff8800d9ad5400
[13706.658799] ffff88011ffd5cc0 ffff8800d6710008 ffff88011fd176c0 7fffffffffffffff
[13706.658801] 0000000000000002 ffffffff816c80f0 ffff8800d6713a78 ffffffff816c790e
[13706.658803] Call Trace:
[13706.658809] [<ffffffff816c80f0>] ? bit_wait_io_timeout+0x80/0x80
[13706.658811] [<ffffffff816c790e>] schedule+0x3e/0x90
[13706.658813] [<ffffffff816ca7e5>] schedule_timeout+0x1b5/0x210
[13706.658816] [<ffffffff81073ffb>] ? gup_pud_range+0x1db/0x1f0
[13706.658817] [<ffffffff810668fe>] ? kvm_clock_read+0x1e/0x20
[13706.658819] [<ffffffff81066909>] ? kvm_clock_get_cycles+0x9/0x10
[13706.658822] [<ffffffff810f5792>] ? ktime_get+0x52/0xc0
[13706.658824] [<ffffffff816c6f04>] io_schedule_timeout+0xa4/0x110
[13706.658826] [<ffffffff816c8126>] bit_wait_io+0x36/0x50
[13706.658828] [<ffffffff816c7d06>] __wait_on_bit_lock+0x76/0xb0
[13706.658831] [<ffffffffa0545636>] ? lock_request+0x46/0x70 [fuse]
[13706.658834] [<ffffffff8118800a>] __lock_page+0xaa/0xb0
[13706.658836] [<ffffffff810c8500>] ? wake_atomic_t_function+0x40/0x40
[13706.658838] [<ffffffff81194d08>] set_page_dirty_lock+0x58/0x60
[13706.658841] [<ffffffffa054d968>] fuse_release_user_pages+0x58/0x70 [fuse]
[13706.658844] [<ffffffffa0551430>] ? fuse_aio_complete+0x190/0x190 [fuse]
[13706.658847] [<ffffffffa0551459>] fuse_aio_complete_req+0x29/0x90 [fuse]
[13706.658849] [<ffffffffa05471e9>] request_end+0xd9/0x190 [fuse]
[13706.658852] [<ffffffffa0549126>] fuse_dev_do_write+0x336/0x490 [fuse]
[13706.658854] [<ffffffffa054963e>] fuse_dev_write+0x6e/0xa0 [fuse]
[13706.658857] [<ffffffff812a9ef3>] ? security_file_permission+0x23/0x90
[13706.658859] [<ffffffff81205300>] do_iter_readv_writev+0x60/0x90
[13706.658862] [<ffffffffa05495d0>] ? fuse_dev_splice_write+0x350/0x350 [fuse]
[13706.658863] [<ffffffff812062a1>] do_readv_writev+0x171/0x1f0
[13706.658866] [<ffffffff810b3d00>] ? try_to_wake_up+0x210/0x210
[13706.658868] [<ffffffff81206361>] vfs_writev+0x41/0x50
[13706.658870] [<ffffffff81206496>] SyS_writev+0x56/0xf0
[13706.658872] [<ffffffff810257a1>] ? syscall_trace_leave+0xf1/0x160
[13706.658874] [<ffffffff816cbb2e>] system_call_fastpath+0x12/0x71

Fix this by making should_dirty a fuse_io_priv parameter that can be checked in fuse_aio_complete_req().

Reported-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 01 Aug, 2017 1 commit
-
-
Committed by Jeff Layton
Change to file_write_and_wait_range and file_check_and_advance_wb_err

Signed-off-by: Jeff Layton <jlayton@redhat.com>
-
- 16 Jul, 2017 1 commit
-
-
Committed by Benjamin Coddington
Since commit c69899a1 "NFSv4: Update of VFS byte range lock must be atomic with the stateid update", NFSv4 has been inserting locks in rpciod worker context. The result is that the file_lock's fl_nspid is the kworker's pid instead of the original userspace pid.

The fl_nspid is only used to represent the namespaced virtual pid number when displaying locks or returning from F_GETLK. There's no reason to set it for every inserted lock, since we can usually just look it up from fl_pid. So, instead of looking up and holding struct pid for every lock, let's just look up the virtual pid number from fl_pid when it is needed. That means we can remove fl_nspid entirely.

The translation and presentation of fl_pid should handle the following four cases:

1 - F_GETLK on a remote file with a remote lock: In this case, the filesystem should determine the l_pid to return here. Filesystems should indicate that the fl_pid represents a non-local pid value that should not be translated by returning an fl_pid <= 0.
2 - F_GETLK on a local file with a remote lock: This should be the l_pid of the lock manager process, and translated.
3 - F_GETLK on a remote file with a local lock, and
4 - F_GETLK on a local file with a local lock: These should be the translated l_pid of the local locking process.

Fuse was already doing the correct thing by translating the pid into the caller's namespace. With this change we must update fuse to translate to init's pid namespace, so that the locks API can then translate from init's pid namespace into the pid namespace of the caller.

With this change, the locks API will expect that if a filesystem returns a remote pid as opposed to a local pid for F_GETLK, that remote pid will be <= 0. This signifies that the pid is remote, and the locks API will forego translating that pid into the pid namespace of the local calling process.

Finally, we convert remote filesystems to present remote pids using negative numbers. Have lustre, 9p, ceph, cifs, and dlm negate the remote pid returned for F_GETLK lock requests.

Since local pids will never be larger than PID_MAX_LIMIT (which is currently defined as <= 4 million), but pid_t is an unsigned int, we should have plenty of room to represent remote pids with negative numbers if we assume that remote pid numbers are similarly limited.

If this is not the case, then we run the risk of having a remote pid returned for which there is also a corresponding local pid. This is a problem we have now, but this patch should reduce the chances of that occurring, while also returning those remote pid numbers, for whatever that may be worth.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
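The sign convention described above can be sketched as a small userspace model. The function names and the toy translation are hypothetical. Only the rule itself (an fl_pid <= 0 is remote and is passed through untranslated, a positive fl_pid is local and gets translated) comes from the commit message.

```c
/* Toy translation from init's pid namespace into the caller's namespace.
 * The fixed offset is invented so that the sketch has something to do. */
int toy_translate_to_caller(int init_ns_pid)
{
    return init_ns_pid - 1000;
}

/* What a remote filesystem would store: the negated remote pid. */
int remote_pid_to_fl_pid(int remote_pid)
{
    return -remote_pid;
}

/* Value reported as l_pid for F_GETLK under the convention above. */
int reported_l_pid(int fl_pid)
{
    if (fl_pid <= 0)
        return fl_pid;                       /* remote: do not translate */
    return toy_translate_to_caller(fl_pid);  /* local: translate for the caller */
}
```

Keeping the remote pid negative means a reader of l_pid can tell the two kinds apart without any extra field.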
-
- 09 Jun, 2017 1 commit
-
-
Committed by Mateusz Jurczyk
Before the patch, the flock flag could remain uninitialized for the lifespan of the fuse_file allocation. Unless set to true in fuse_file_flock(), it would remain in an indeterminate state until read in an if statement in fuse_release_common(). This could consequently lead to taking an unexpected branch in the code.

The bug was discovered by a runtime instrumentation designed to detect use of uninitialized memory in the kernel.

Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Fixes: 37fb3a30 ("fuse: fix flock")
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 21 Apr, 2017 1 commit
-
-
Committed by Benjamin Coddington
Set FL_CLOSE in fl_flags as in locks_remove_posix() when clearing locks. NFS will check for this flag to ensure an unlock is sent in a following patch.

Fuse handles flock and posix locks differently for FL_CLOSE, and so requires a fixup to retain the existing behavior for flock.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 18 Apr, 2017 2 commits
-
-
Committed by Seth Forshee
When the userspace process servicing fuse requests is running in a pid namespace then pids passed via the fuse fd are not being translated into that process' namespace. Translation is necessary for the pid to be useful to that process.

Since no use case currently exists for changing namespaces all translations can be done relative to the pid namespace in use when fuse_conn_init() is called. For fuse this translates to mount time, and for cuse this is when /dev/cuse is opened. IO for this connection from another namespace will return errors.

Requests from processes whose pid cannot be translated into the target namespace will have a value of 0 for in.h.pid.

File locking changes based on previous work done by Eric Biederman.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Elena Reshetova
The refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This avoids accidental refcounter overflows that might lead to use-after-free situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
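Why a dedicated reference-count type helps against overflow can be sketched with a simplified, single-threaded model. This is not the kernel's refcount_t implementation: the `refcount_model` type, its helpers and the choice of `UINT_MAX` as the saturation value are assumptions made for illustration, and no atomics are used.

```c
#include <limits.h>
#include <stdbool.h>

/* Simplified saturating counter. Once it reaches the saturation value
 * it stays there, so it can never wrap around to zero. */
struct refcount_model {
    unsigned int refs;
};

#define REFCOUNT_MODEL_SATURATED UINT_MAX

void refcount_model_inc(struct refcount_model *r)
{
    if (r->refs != REFCOUNT_MODEL_SATURATED)
        r->refs++;          /* may reach the saturation value, never passes it */
}

/* Returns true when the last reference was dropped and the object may be
 * freed. A saturated counter is never released: leaking the object is
 * preferred over a possible use-after-free. */
bool refcount_model_dec_and_test(struct refcount_model *r)
{
    if (r->refs == REFCOUNT_MODEL_SATURATED)
        return false;
    return --r->refs == 0;
}
```

A plain wrapping counter would go from its maximum back to 0 and let the object be freed while references still exist, which is the situation the commit message warns about.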
-
- 25 Feb, 2017 1 commit
-
-
Committed by Dave Jiang
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf.

Remove the vma parameter to simplify things.

[arnd@arndb.de: fix ARM build]
Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 23 Feb, 2017 3 commits
-
-
Committed by Miklos Szeredi
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
struct fuse_file is stored in file->private_data. Make this always be a counting reference for consistency.

This also allows fuse_sync_release() to call fuse_file_put() instead of partially duplicating its functionality.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
fuse_file_put() was missing the "force" flag for the RELEASE request when sending synchronously (fuseblk).

If this flag is not set, then a sync request may be interrupted before it is dequeued by the userspace filesystem. In this case the OPEN won't be balanced with a RELEASE.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 5a18ec17 ("fuse: fix hang of single threaded fuseblk filesystem")
Cc: <stable@vger.kernel.org> # v2.6.38+
-
- 15 Nov, 2016 1 commit
-
-
Committed by Miklos Szeredi
If pos is at the beginning of a page and copied is zero, then the page is not zeroed but is marked uptodate.

Fix by skipping everything except unlock/put of the page if zero bytes were copied.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: 6b12c1b3 ("fuse: Implement write_begin/write_end callbacks")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
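The control flow of this fix can be modelled in a few lines of userspace C. The sketch is an assumption-laden illustration: `page_model` and `write_end_model()` are hypothetical names, and recording `zeroed_from` stands in for actually zeroing the tail of the page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the page state touched at the end of a write. */
struct page_model {
    bool   uptodate;
    bool   locked;
    size_t zeroed_from;   /* offset from which the tail was zeroed, 0 if none */
};

size_t write_end_model(struct page_model *page, size_t pos, size_t copied,
                       size_t page_size)
{
    if (copied == 0)
        goto unlock;      /* nothing was written: leave the page state alone */

    if (!page->uptodate) {
        size_t endoff = (pos + copied) % page_size;

        if (endoff)
            page->zeroed_from = endoff;   /* stands in for zeroing the tail */
        page->uptodate = true;
    }

unlock:
    page->locked = false; /* unlock happens on every path */
    return copied;
}
```

Without the early jump, a zero-byte copy at the start of a page would compute an end offset of 0, skip the zeroing, and still mark the page uptodate, which is the bug described above.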
-
- 01 Oct, 2016 2 commits
-
-
Committed by Miklos Szeredi
The two invocations share little code.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 22 Sep, 2016 1 commit
-
-
Committed by Jan Kara
To avoid clearing of capabilities or security related extended attributes too early, inode_change_ok() will need to take dentry instead of inode. Propagate it down to fuse_do_setattr().

Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
-
- 25 Aug, 2016 1 commit
-
-
Committed by Miklos Szeredi
When reading from a loop device backed by a fuse file it deadlocks on lock_page().

This is because the page is already locked by the read() operation done on the loop device. In this case we don't want to either lock the page or dirty it.

So do what fs/direct-io.c does: only dirty the page for ITER_IOVEC vectors.

Reported-by: Sheng Yang <sheng@yasker.org>
Fixes: aa4d8616 ("block: loop: switch to VFS ITER_BVEC")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.1+
Reviewed-by: Sheng Yang <sheng@yasker.org>
Reviewed-by: Ashish Samant <ashish.samant@oracle.com>
Tested-by: Sheng Yang <sheng@yasker.org>
Tested-by: Ashish Samant <ashish.samant@oracle.com>
-
- 29 Jul, 2016 4 commits
-
-
Committed by Miklos Szeredi
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Maxim Patlasov
fuse_flush() calls write_inode_now(), which triggers writeback, but the actual writeback happens later, in fuse_sync_writes(). If an error happens, fuse_writepage_end() will set the error bit in mapping->flags. So, we have to check mapping->flags after fuse_sync_writes().

Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
Cc: <stable@vger.kernel.org> # v3.15+
-
Committed by Alexey Kuznetsov
Due to the implementation of fuse writeback, filemap_write_and_wait_range() does not catch errors. We have to do this directly after fuse_sync_writes().

Signed-off-by: Alexey Kuznetsov <kuznet@virtuozzo.com>
Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
Cc: <stable@vger.kernel.org> # v3.15+
-
Committed by Mel Gorman
There are now a number of accounting oddities such as mapped file pages being accounted for on the node while the total number of file pages are accounted on the zone. This can be coped with to some extent but it's confusing so this patch moves the relevant file-based accounting to the node. Due to throttling logic in the page allocator for reliable OOM detection, it is still necessary to track dirty and writeback pages on a per-zone basis.

[mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting]
Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net
Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 30 Jun, 2016 1 commit
-
-
Committed by Ashish Sangwan
While sending blocking direct IO in fuse, the write request is broken into sub-requests, each of default size 128k, and all the requests are sent in non-blocking background mode if async_dio mode is supported by libfuse. The process that issues the write waits for the completion of all the sub-requests. Sending multiple requests in parallel gives the userspace fuse implementation a chance to perform parallel writes if it is multi-threaded, and hence improves performance.

When there is a size-extending aio dio write, we switch to blocking mode so that we can properly update the size of the file after completion of the writes. However, in this situation all the sub-requests are sent in a serialized manner, where the next request is sent only after receiving the reply to the current request. Hence the multi-threaded userspace implementation is not utilized properly.

This patch changes the size-extending aio dio behavior to exactly follow blocking dio. For a multi-threaded fuse implementation having 10 threads and using a buffer size of 64MB to perform async direct IO, we are getting double the speed.

Signed-off-by: Ashish Sangwan <ashishsangwan2@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 02 May, 2016 2 commits
-
-
Committed by Christoph Hellwig
Including blkdev_direct_IO and dax_do_io. It has to be ki_pos to actually work, so eliminate the superfluous argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Committed by Christoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 25 Apr, 2016 1 commit
-
-
Committed by Ashish Samant
fuse_get_user_pages() should return an error or 0. Otherwise a fuse_direct_io read will not return 0 to indicate that the read has completed.

Fixes: 742f9927 ("fuse: return patrial success from fuse_direct_io()")
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 05 Apr, 2016 1 commit
-
-
Committed by Kirill A. Shutemov
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time ago with the promise that one day it would be possible to implement the page cache with bigger chunks than PAGE_SIZE.

This promise never materialized. And it is unlikely that it will.

We have many places where PAGE_CACHE_SIZE is assumed to be equal to PAGE_SIZE. And it's a constant source of confusion on whether PAGE_CACHE_* or PAGE_* constants should be used in a particular case, especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much breakage to be doable.

Let's stop pretending that pages in the page cache are special. They are not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
 - page_cache_get() -> get_page();
 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using the script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually.

The only adjustment after coccinelle is a revert of the changes to the PAGE_CACHE_ALIGN definition: we are going to drop it later.

There are a few places in the code that coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation will also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 16 Mar, 2016 1 commit
-
-
Committed by Ashish Samant
If a user calls writev/readv in direct io mode with partially valid data in the iovec array such that any vector other than the first one in the array contains invalid data, we currently return the error for the invalid iovec.

Instead, we should return the number of bytes already written/read and not the error, as we do in the non-direct_io case.

Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
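The intended return-value rule can be sketched as a small userspace function. The per-segment result array and the name `direct_io_model()` are hypothetical. The rule itself (report bytes already transferred, and only report the error when nothing was transferred) follows the commit message.

```c
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* seg_result[i] is the outcome of processing iovec segment i:
 * a byte count on success or a negative errno-style value on failure. */
ssize_t direct_io_model(const ssize_t *seg_result, size_t nsegs)
{
    ssize_t done = 0;

    for (size_t i = 0; i < nsegs; i++) {
        if (seg_result[i] < 0)
            /* Partial success wins over the error. */
            return done > 0 ? done : seg_result[i];
        done += seg_result[i];
    }
    return done;
}
```

With this rule a caller that passes one valid and one invalid vector sees a short write or read and gets the error only when it retries with the remaining data.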
-
- 14 Mar, 2016 1 commit
-
-
Committed by Seth Forshee
The 'reqs' member of fuse_io_priv serves two purposes. The first is to track the number of outstanding async requests to the server and to signal that the io request is completed. The second is to be a reference count on the structure to know when it can be freed.

For sync io requests these purposes can be at odds. fuse_direct_IO() wants to block until the request is done, and since the signal is sent when 'reqs' reaches 0 it cannot keep a reference to the object. Yet it needs to use the object after the userspace server has completed processing requests. This leads to some handshaking and special casing that is needlessly complicated and responsible for at least one race condition.

It's much cleaner and safer to maintain a separate reference count for the object lifecycle and to let 'reqs' just be a count of outstanding requests to the userspace server. Then we can know for sure when it is safe to free the object without any handshaking or special cases.

The catch here is that most of the time these objects are stack allocated and should not be freed. Initializing these objects with a single reference that is never released prevents accidental attempts to free the objects.

Fixes: 9d5722b7 ("fuse: handle synchronous iocbs internally")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
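The separation of the two counters can be sketched with a small single-threaded model. All names here (`io_priv_model`, `io_submit()`, `io_complete()`, `io_put()`) are hypothetical, the counters are plain ints rather than atomics, and a `released` flag stands in for freeing so that the sketch also works for stack objects.

```c
#include <stdbool.h>

/* Hypothetical model: refs tracks object lifetime, reqs tracks
 * outstanding requests to the userspace server. */
struct io_priv_model {
    int  refs;
    int  reqs;
    bool done;       /* completion has been signalled */
    bool released;   /* last reference dropped (stands in for free) */
};

/* Each submitted request pins the object and counts as outstanding. */
void io_submit(struct io_priv_model *io)
{
    io->refs++;
    io->reqs++;
}

void io_put(struct io_priv_model *io)
{
    if (--io->refs == 0)
        io->released = true;
}

/* Called once per finished request. Completion is signalled when no
 * requests remain, independently of how many references are held. */
void io_complete(struct io_priv_model *io)
{
    if (--io->reqs == 0)
        io->done = true;
    io_put(io);
}
```

A stack-allocated object starts with one reference that is never dropped, so `released` stays false however many requests complete, while `done` still flips as soon as the last request finishes.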
-