- 25 5月, 2010 3 次提交
-
-
由 Miklos Szeredi 提交于
When splicing buffers to the fuse device with SPLICE_F_MOVE, try to move pages from the pipe buffer into the page cache. This allows populating the fuse filesystem's cache without ever touching the page contents, i.e. zero copy read capability. The following steps are performed when trying to move a page into the page cache: - buf->ops->confirm() to make sure the new page is uptodate - buf->ops->steal() to try to remove the new page from it's previous place - remove_from_page_cache() on the old page - add_to_page_cache_locked() on the new page If any of the above steps fail (non fatally) then the code falls back to copying the page. In particular ->steal() will fail if there are external references (other than the page cache and the pipe buffer) to the page. Also since the remove_from_page_cache() + add_to_page_cache_locked() are non-atomic it is possible that the page cache is repopulated in between the two and add_to_page_cache_locked() will fail. This could be fixed by creating a new atomic replace_page_cache_page() function. fuse_readpages_end() needed to be reworked so it works even if page->mapping is NULL for some or all pages which can happen if the add_to_page_cache_locked() failed. A number of sanity checks were added to make sure the stolen pages don't have weird flags set, etc... These could be moved into generic splice/steal code. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Miklos Szeredi 提交于
Allow userspace filesystem implementation to use splice() to write to the fuse device. The semantics of using splice() are: 1) buffer the message header and data in a temporary pipe 2) with a *single* splice() call move the message from the temporary pipe to the fuse device The READ reply message has the most interesting use for this, since now the data from an arbitrary file descriptor (which could be a regular file, a block device or a socket) can be tranferred into the fuse device without having to go through a userspace buffer. It will also allow zero copy moving of pages. One caveat is that the protocol on the fuse device requires the length of the whole message to be written into the header. But the length of the data transferred into the temporary pipe may not be known in advance. The current library implementation works around this by using vmplice to write the header and modifying the header after splicing the data into the pipe (error handling omitted): struct fuse_out_header out; iov.iov_base = &out; iov.iov_len = sizeof(struct fuse_out_header); vmsplice(pip[1], &iov, 1, 0); len = splice(input_fd, input_offset, pip[1], NULL, len, 0); /* retrospectively modify the header: */ out.len = len + sizeof(struct fuse_out_header); splice(pip[0], NULL, fuse_chan_fd(req->ch), NULL, out.len, flags); This works since vmsplice only saves a pointer to the data, it does not copy the data itself. Since pipes are currently limited to 16 pages and messages need to be spliced atomically, the length of the data is limited to 15 pages (or 60kB for 4k pages). Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Miklos Szeredi 提交于
Replace uses of get_user_pages() with get_user_pages_fast(). It looks nicer and should be faster in most cases. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
- 05 2月, 2010 2 次提交
-
-
由 Fang Wenqi 提交于
gcc 4.4 warns about: fs/fuse/dev.c: In function ‘fuse_notify_inval_entry’: fs/fuse/dev.c:925: warning: the frame size of 1060 bytes is larger than 1024 bytes The problem is we declare two structures and a large array on the stack, I move the array alway from the stack and allocate memory for it dynamically. Signed-off-by: NFang Wenqi <antonf@turbolinux.com.cn> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Miklos Szeredi 提交于
Small cleanup in fuse_notify_inval_inode() and fuse_notify_inval_entry(). Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
- 12 7月, 2009 1 次提交
-
-
由 Linus Torvalds 提交于
This reverts commit 097041e5. Trond had a better fix, which is the parent of this one ("Fix compile error due to congestion_wait() changes") Requested-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NLarry Finger <Larry.Finger@lwfinger.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 7月, 2009 2 次提交
-
-
由 Larry Finger 提交于
When building v2.6.31-rc2-344-g69ca06c9, the following build errors are found due to missing includes: CC [M] fs/fuse/dev.o fs/fuse/dev.c: In function ‘request_end’: fs/fuse/dev.c:289: error: ‘BLK_RW_SYNC’ undeclared (first use in this function) ... fs/nfs/write.c: In function ‘nfs_set_page_writeback’: fs/nfs/write.c:207: error: ‘BLK_RW_ASYNC’ undeclared (first use in this function) Signed-off-by: Larry Finger@lwfinger.net> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jens Axboe 提交于
Commit 1faa16d2 accidentally broke the bdi congestion wait queue logic, causing us to wait on congestion for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 07 7月, 2009 1 次提交
-
-
由 Csaba Henk 提交于
The practical values for these limits depend on the design of the filesystem server so let userspace set them at initialization time. Signed-off-by: NCsaba Henk <csaba@gluster.com> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
- 01 7月, 2009 2 次提交
-
-
由 John Muir 提交于
Add notification messages that allow the filesystem to invalidate VFS caches. Two notifications are added: 1) inode invalidation - invalidate cached attributes - invalidate a range of pages in the page cache (this is optional) 2) dentry invalidation - try to invalidate a subtree in the dentry cache Care must be taken while accessing the 'struct super_block' for the mount, as it can go away while an invalidation is in progress. To prevent this, introduce a rw-semaphore, that is taken for read during the invalidation and taken for write in the ->kill_sb callback. Cc: Csaba Henk <csaba@gluster.com> Cc: Anand Avati <avati@zresearch.com> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Csaba Henk 提交于
On 64 bit systems -- where sizeof(ssize_t) > sizeof(int) -- the following test exposes a bug due to a non-careful return of an int or unsigned value: implement a FUSE filesystem which sends an unsolicited notification to the kernel with invalid opcode. The respective write to /dev/fuse will return (1 << 32) - EINVAL with errno == 0 instead of -1 with errno == EINVAL. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org
-
- 28 4月, 2009 2 次提交
-
-
由 Tejun Heo 提交于
Export the following symbols for CUSE. fuse_conn_put() fuse_conn_get() fuse_conn_kill() fuse_send_init() fuse_do_open() fuse_sync_release() fuse_direct_io() fuse_do_ioctl() fuse_file_poll() fuse_request_alloc() fuse_get_req() fuse_put_request() fuse_request_send() fuse_abort_conn() fuse_dev_release() fuse_dev_operations Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Tejun Heo 提交于
Update fuse_conn_init() such that it doesn't take @sb and move bdi registration into a separate function. Also separate out fuse_conn_kill() from fuse_put_super(). These will be used to implement cuse. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
- 26 1月, 2009 2 次提交
-
-
由 Miklos Szeredi 提交于
Move fuse_copy_finish() to before calling fuse_notify_poll_wakeup(). This is not a big issue because fuse_notify_poll_wakeup() should be atomic, but it's cleaner this way, and later uses of notification will need to be able to finish the copying before performing some actions. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Miklos Szeredi 提交于
If a fuse filesystem is unmounted but the device file descriptor remains open and a new mount reuses the old device number, then the mount fails with EEXIST and the following warning is printed in the kernel log: WARNING: at fs/sysfs/dir.c:462 sysfs_add_one+0x35/0x3d() sysfs: duplicate filename '0:15' can not be created The cause is that the bdi belonging to the fuse filesystem was destoryed only after the device file was released. Fix this by calling bdi_destroy() from fuse_put_super() instead. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> CC: stable@kernel.org
-
- 02 12月, 2008 1 次提交
-
-
由 Harvey Harrison 提交于
Makes the existing annotations match the more common one per line style and adds a few missing annotations. Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
- 26 11月, 2008 5 次提交
-
-
由 Tejun Heo 提交于
Add fuse_ prefix to request_send*() and get_root_inode() as some of those functions will be exported for CUSE. With or without CUSE export, having the function names scoped is a good idea for debuggability. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Tejun Heo 提交于
Implement poll support. Polled files are indexed using kh in a RB tree rooted at fuse_conn->polled_files. Client should send FUSE_NOTIFY_POLL notification once after processing FUSE_POLL which has FUSE_POLL_SCHEDULE_NOTIFY set. Sending notification unconditionally after the latest poll or everytime file content might have changed is inefficient but won't cause malfunction. fuse_file_poll() can sleep and requires patches from the following thread which allows f_op->poll() to sleep. http://thread.gmane.org/gmane.linux.kernel/726176Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Tejun Heo 提交于
Clients always used to write only in response to read requests. To implement poll efficiently, clients should be able to issue unsolicited notifications. This patch implements basic notification support. Zero fuse_out_header.unique is now accepted and considered unsolicited notification and the error field contains notification code. This patch doesn't implement any actual notification. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Tejun Heo 提交于
fuse_req->end() was supposed to be put the base reference but there's no reason why it should. It only makes things more complex. Move it out of ->end() and make it the responsibility of request_end(). Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
由 Miklos Szeredi 提交于
Fix coding style errors reported by checkpatch and others. Uptdate copyright date to 2008. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
-
- 14 11月, 2008 1 次提交
-
-
由 David Howells 提交于
Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: NDavid Howells <dhowells@redhat.com> Reviewed-by: NJames Morris <jmorris@namei.org> Acked-by: NSerge Hallyn <serue@us.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: fuse-devel@lists.sourceforge.net Signed-off-by: NJames Morris <jmorris@namei.org>
-
- 02 11月, 2008 1 次提交
-
-
由 Al Viro 提交于
As it is, all instances of ->release() for files that have ->fasync() need to remember to evict file from fasync lists; forgetting that creates a hole and we actually have a bunch that *does* forget. So let's keep our lives simple - let __fput() check FASYNC in file->f_flags and call ->fasync() there if it's been set. And lose that crap in ->release() instances - leaving it there is still valid, but we don't have to bother anymore. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 30 4月, 2008 2 次提交
-
-
由 Miklos Szeredi 提交于
fs/fuse/dev.c:306:2: warning: context imbalance in 'wait_answer_interruptible' - unexpected unlock fs/fuse/dev.c:361:2: warning: context imbalance in 'request_wait_answer' - unexpected unlock fs/fuse/dev.c:1002:4: warning: context imbalance in 'end_io_requests' - unexpected unlock Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
Quoting Linus (3 years ago, FUSE inclusion discussions): "User-space filesystems are hard to get right. I'd claim that they are almost impossible, unless you limit them somehow (shared writable mappings are the nastiest part - if you don't have those, you can reasonably limit your problems by limiting the number of dirty pages you accept through normal "write()" calls)." Instead of attempting the impossible, I've just waited for the dirty page accounting infrastructure to materialize (thanks to Peter Zijlstra and others). This nicely solved the biggest problem: limiting the number of pages used for write caching. Some small details remained, however, which this largish patch attempts to address. It provides a page writeback implementation for fuse, which is completely safe against VM related deadlocks. Performance may not be very good for certain usage patterns, but generally it should be acceptable. It has been tested extensively with fsx-linux and bash-shared-mapping. Fuse page writeback design -------------------------- fuse_writepage() allocates a new temporary page with GFP_NOFS|__GFP_HIGHMEM. It copies the contents of the original page, and queues a WRITE request to the userspace filesystem using this temp page. The writeback is finished instantly from the MM's point of view: the page is removed from the radix trees, and the PageDirty and PageWriteback flags are cleared. For the duration of the actual write, the NR_WRITEBACK_TEMP counter is incremented. The per-bdi writeback count is not decremented until the actual write completes. On dirtying the page, fuse waits for a previous write to finish before proceeding. This makes sure, there can only be one temporary page used at a time for one cached page. This approach is wasteful in both memory and CPU bandwidth, so why is this complication needed? The basic problem is that there can be no guarantee about the time in which the userspace filesystem will complete a write. It may be buggy or even malicious, and fail to complete WRITE requests. We don't want unrelated parts of the system to grind to a halt in such cases. Also a filesystem may need additional resources (particularly memory) to complete a WRITE request. There's a great danger of a deadlock if that allocation may wait for the writepage to finish. Currently there are several cases where the kernel can block on page writeback: - allocation order is larger than PAGE_ALLOC_COSTLY_ORDER - page migration - throttle_vm_writeout (through NR_WRITEBACK) - sync(2) Of course in some cases (fsync, msync) we explicitly want to allow blocking. So for these cases new code has to be added to fuse, since the VM is not tracking writeback pages for us any more. As an extra safetly measure, the maximum dirty ratio allocated to a single fuse filesystem is set to 1% by default. This way one (or several) buggy or malicious fuse filesystems cannot slow down the rest of the system by hogging dirty memory. With appropriate privileges, this limit can be raised through '/sys/class/bdi/<bdi>/max_ratio'. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 2月, 2008 1 次提交
-
-
由 Miklos Szeredi 提交于
Libfuse basically creates a new thread for each new request. This is fine for synchronous requests, which are naturally limited. However background requests (especially writepage) can cause a thread creation storm. To avoid this, limit the number of background requests available to userspace. This is done by introducing another queue for background requests, and a counter for the number of "active" requests, which are currently available for userspace. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 10月, 2007 6 次提交
-
-
由 Miklos Szeredi 提交于
Don't return -ENOENT for a read() on the fuse device when the request was aborted. Instead return -ENODEV, meaning the filesystem has been force-umounted or aborted. Previously ENOENT meant that the request was interrupted, but now the 'aborted' flag is not set in case of interrupts. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
Don't set 'aborted' flag on a request if it's interrupted. We have to wait for the answer anyway, and this would only a very little time while copying the reply. This means, that write() on the fuse device will not return -ENOENT during normal operation, only if the filesystem is aborted by a forced umount or through the fusectl interface. This could simplify userspace code somewhat when backward compatibility with earlier kernel versions is not required. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
Move dput/mntput pair from request_end() to fuse_release_end(), because there's no other place they are used. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
Make lifetime of 'struct fuse_file' independent from 'struct file' by adding a reference counter and destructor. This will enable asynchronous page writeback, where it cannot be guaranteed, that the file is not released while a request with this file handle is being served. The actual RELEASE request is only sent when there are no more references to the fuse_file. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
Use wake_up_all instead of wake_up in put_reserved_req(), otherwise it is possible that the right task is not woken up. Also create a separate reserved_req_waitq in addition to the blocked_waitq, since they fulfill totally separate functions. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
Set the read and write congestion state if the request queue is close to blocking, and clear it when it's not. This prevents unnecessary blocking in readahead and (when writable mmaps are allowed) writeback. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 7月, 2007 1 次提交
-
-
由 Paul Mundt 提交于
Slab destructors were no longer supported after Christoph's c59def9f change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
-
- 08 12月, 2006 2 次提交
-
-
由 Christoph Lameter 提交于
Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
SLAB_KERNEL is an alias of GFP_KERNEL. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 01 10月, 2006 1 次提交
-
-
由 Badari Pulavarty 提交于
This patch removes readv() and writev() methods and replaces them with aio_read()/aio_write() methods. Signed-off-by: NBadari Pulavarty <pbadari@us.ibm.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 30 9月, 2006 1 次提交
-
-
由 Josh Triplett 提交于
request_end and fuse_read_interrupt release fc->lock. Add lock annotations to these two functions so that sparse can check callers for lock pairing, and so that sparse will not complain about these functions since they intentionally use locks in this manner. Signed-off-by: NJosh Triplett <josh@freedesktop.org> Acked-by: NMiklos Szeredi <miklos@szeredi.hu> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 26 6月, 2006 3 次提交
-
-
由 Miklos Szeredi 提交于
Add synchronous request interruption. This is needed for file locking operations which have to be interruptible. However filesystem may implement interruptibility of other operations (e.g. like NFS 'intr' mount option). Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Miklos Szeredi 提交于
Rename the 'interrupted' flag to 'aborted', since it indicates exactly that, and next patch will introduce an 'interrupted' flag for a Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Miklos Szeredi 提交于
All POSIX locks owned by the current task are removed on close(). If the FLUSH request resulting initiated by close() fails to reach userspace, there might be locks remaining, which cannot be removed. The only reason it could fail, is if allocating the request fails. In this case use the request reserved for RELEASE, or if that is currently used by another FLUSH, wait for it to become available. Signed-off-by: NMiklos Szeredi <miklos@szeredi.hu> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-