- 10 Sep 2019, 14 commits
-
-
Committed by Miklos Szeredi
All requests are now sent with one of the fuse_simple_... helpers. Get rid of the old API from the fuse internal header. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Rename fuse_request_send_notify_reply() to fuse_simple_notify_reply() and convert it to passing fuse_args instead of fuse_req. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Derive fuse_writepage_args from fuse_io_args. Sending the request is tricky since it was done with fi->lock held, hence we must either use atomic allocation or release the lock. Both are possible, so try atomic first and, if it fails, release the lock and do the regular allocation with GFP_NOFS and __GFP_NOFAIL. Both flags are necessary for correct operation. Move the page realloc function from dev.c to file.c and convert it to using fuse_writepage_args. The last caller of fuse_write_fill() is gone, so get rid of it. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
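A minimal sketch of the try-atomic-then-blocking allocation pattern described above, assuming the fuse_writepage_args and fi->lock names from the commit text (the surrounding code is illustrative, not copied from a specific release):

    struct fuse_writepage_args *wpa;

    wpa = kzalloc(sizeof(*wpa), GFP_NOWAIT | __GFP_NOWARN);
    if (!wpa) {
        spin_unlock(&fi->lock);
        /* GFP_NOFS prevents filesystem reentry; __GFP_NOFAIL moves the
           retry loop into the allocator so this cannot return NULL */
        wpa = kzalloc(sizeof(*wpa), GFP_NOFS | __GFP_NOFAIL);
        spin_lock(&fi->lock);
    }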
-
Committed by Miklos Szeredi
Create a helper named fuse_simple_background() that is similar to fuse_simple_request(). Unlike the latter, it returns immediately and calls the supplied 'end' callback when the reply is received. The supplied 'args' pointer is stored in 'fuse_req', which allows the callback to interpret the output arguments decoded from the reply. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
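A hedged sketch of how a caller might use the helper, with the signature inferred from the commit description (the gfp argument and the failure handling are assumptions, not confirmed by the text):

    static void my_end(struct fuse_conn *fc, struct fuse_args *args, int error)
    {
        /* interpret the decoded output arguments stored in args */
    }

    args->end = my_end;
    err = fuse_simple_background(fc, args, GFP_KERNEL);
    if (err)
        my_end(fc, args, err);    /* request was never queued */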
-
Committed by Miklos Szeredi
fuse_simple_request() is converted to return the length of the last (instead of the single) out arg, since FUSE_IOCTL_OUT has two out args, the second of which is variable length. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
fuse_req_pages_alloc() is moved to file.c, since its internal use by the device code will eventually be removed. Rename it to fuse_pages_alloc() to signify that it's not only usable for the fuse_req page array. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Derive fuse_args_pages from fuse_args. This is used to handle requests which use pages for input or output. The related flags are added to fuse_args. A new FR_ALLOC_PAGES flag is added to indicate whether the page arrays in fuse_req need to be freed by fuse_put_request() or not. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
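A sketch of the derived structure, with field names assumed from the commit text rather than copied from a particular kernel release:

    struct fuse_args_pages {
        struct fuse_args args;          /* base in/out argument descriptor */
        struct page **pages;            /* pages carrying input or output data */
        struct fuse_page_desc *descs;   /* per-page <offset, length> */
        unsigned int num_pages;
    };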
-
Committed by Miklos Szeredi
In some cases it makes no sense to set the pid/uid/gid fields in the request header. Allow fuse_simple_background() to omit these. This is only required in the "force" case, so for now just WARN if set otherwise. Fold fuse_get_req_nofail_nopages() into its only caller. The comment is obsolete anyway. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Move this function to readdir.c, where its only caller resides. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
This will be used by fuse_force_forget(). We can expand fuse_request_send() into fuse_simple_request(). The FR_WAITING bit has already been set, so there is no need to check. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Add 'force' to fuse_args and use fuse_get_req_nofail_nopages() to allocate the request in that case. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Instead of complex games with a reserved request, just use __GFP_NOFAIL. Both callers (flush, readdir) guarantee that the connection was already initialized, so there is no need to wait for fc->initialized. Also remove the unneeded clearing of the FR_BACKGROUND flag. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
...to make future expansion simpler. The hierarchical structure is a historical thing that does not serve any practical purpose. The generated code is exactly the same before and after the patch. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Eric Biggers
When IOCB_CMD_POLL is used on the FUSE device, aio_poll() disables IRQs and takes kioctx::ctx_lock, then fuse_iqueue::waitq.lock. This may have to wait for fuse_iqueue::waitq.lock to be released by one of many places that take it with IRQs enabled. Since the IRQ handler may take kioctx::ctx_lock, lockdep reports that a deadlock is possible. Fix it by protecting the state of struct fuse_iqueue with a separate spinlock, and only accessing fuse_iqueue::waitq using the versions of the waitqueue functions which do IRQ-safe locking internally. Reproducer:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mount.h>
    #include <sys/stat.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <linux/aio_abi.h>

    int main()
    {
        char opts[128];
        int fd = open("/dev/fuse", O_RDWR);
        aio_context_t ctx = 0;
        struct iocb cb = { .aio_lio_opcode = IOCB_CMD_POLL, .aio_fildes = fd };
        struct iocb *cbp = &cb;

        sprintf(opts, "fd=%d,rootmode=040000,user_id=0,group_id=0", fd);
        mkdir("mnt", 0700);
        mount("foo", "mnt", "fuse", 0, opts);
        syscall(__NR_io_setup, 1, &ctx);
        syscall(__NR_io_submit, ctx, 1, &cbp);
    }

Beginning of lockdep output:

    =====================================================
    WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
    5.3.0-rc5 #9 Not tainted
    -----------------------------------------------------
    syz_fuse/135 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
    000000003590ceda (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
    000000003590ceda (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1751 [inline]
    000000003590ceda (&fiq->waitq){+.+.}, at: __io_submit_one.constprop.0+0x203/0x5b0 fs/aio.c:1825

    and this task is already holding:
    0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:363 [inline]
    0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1749 [inline]
    0000000075037284 (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one.constprop.0+0x1f4/0x5b0 fs/aio.c:1825
    which would create a new lock dependency:
     (&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.}

    but this new dependency connects a SOFTIRQ-irq-safe lock:
     (&(&ctx->ctx_lock)->rlock){..-.}
    [...]

Reported-by: syzbot+af05535bb79520f95431@syzkaller.appspotmail.com Reported-by: syzbot+d86c4426a01f60feddc7@syzkaller.appspotmail.com Fixes: bfe4037e ("aio: implement IOCB_CMD_POLL") Cc: <stable@vger.kernel.org> # v4.19+ Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 02 Sep 2019, 1 commit
-
-
Committed by Kirill Smelkov
[ This retries commit d4b13963 ("fuse: require /dev/fuse reads to have enough buffer capacity"), which was reverted. In this version we require only `sizeof(fuse_in_header) + sizeof(fuse_write_in)` instead of 4K for the FUSE request header room, because, contrary to libfuse and kernel client behaviour, GlusterFS actually provides only that much room for the request header. ] A FUSE filesystem server queues /dev/fuse sys_read calls to get filesystem requests to handle. It does not know in advance what that request will be, as it can be anything the client issues - LOOKUP, READ, WRITE, ... Many requests are short and retrieve data from the filesystem. However, WRITE and NOTIFY_REPLY write data into the filesystem. Before getting into the operation phase, the FUSE filesystem server and the kernel client negotiate what should be the maximum write size the client will ever issue. After negotiation, the contract between server and client is that the filesystem server should queue /dev/fuse sys_read calls with enough buffer capacity to receive any client request - WRITE in particular - while the FUSE client should not, in particular, send WRITE requests with more than the negotiated max_write payload. The FUSE client in the kernel and libfuse historically reserve 4K for the request header. However an existing filesystem server - GlusterFS - was found which reserves only 80 bytes for header room (= `sizeof(fuse_in_header) + sizeof(fuse_write_in)`). Since `sizeof(fuse_in_header) + sizeof(fuse_write_in)` == `sizeof(fuse_in_header) + sizeof(fuse_read_in)` == `sizeof(fuse_in_header) + sizeof(fuse_notify_retrieve_in)` is the absolute minimum any sane filesystem should be using for header room, the contract is that the filesystem server should queue sys_reads with a `sizeof(fuse_in_header) + sizeof(fuse_write_in)` + max_write buffer. If the filesystem server does not follow this contract, what can happen is that fuse_dev_do_read will see that the request size is larger than the buffer size, and then it will return EIO to the client who issued the request, but won't indicate in any way to the filesystem server that there is a problem. This can be hard to diagnose because for some requests, e.g. for NOTIFY_REPLY which mimics WRITE, there is no client thread waiting for request completion and that EIO goes nowhere, while on the filesystem server side things look like the kernel is not replying back after a successful NOTIFY_RETRIEVE request made by the server. We can make the problem easy to diagnose if we indicate via an error return to the filesystem server when it is violating the contract. This should not cause problems in practice, because if a filesystem server is using a shorter buffer, writes to it were already very likely to cause EIO, and if the filesystem is read-only it should already be following the FUSE_MIN_READ_BUFFER minimum buffer size. Please see [1] for context where the problem of a stuck filesystem was hit for real (because the kernel client was incorrectly sending more than max_write data with NOTIFY_REPLY; see also the previous patch), how the situation was traced, and for a more involved patch that did not make it into the tree. [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2 Signed-off-by: Kirill Smelkov <kirr@nexedi.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Cc: Han-Wen Nienhuys <hanwen@google.com> Cc: Jakob Unterwurzacher <jakobunt@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
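A small, self-contained userspace demo of the minimum header room the contract above describes, using the UAPI <linux/fuse.h> header. Per the commit text both structs are 40 bytes, so the minimum is 80 bytes; a conforming server must queue reads of at least this plus the negotiated max_write:

    #include <stdio.h>
    #include <linux/fuse.h>

    int main(void)
    {
        size_t min_hdr = sizeof(struct fuse_in_header) +
                         sizeof(struct fuse_write_in);

        /* per the commit: 80 bytes on current headers */
        printf("minimum header room: %zu bytes\n", min_hdr);
        return 0;
    }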
-
- 11 Jun 2019, 1 commit
-
-
Committed by Miklos Szeredi
This reverts commit d4b13963. The commit introduced a regression in glusterfs-fuse. Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 24 Apr 2019, 3 commits
-
-
Committed by Kirill Smelkov
A FUSE filesystem server queues /dev/fuse sys_read calls to get filesystem requests to handle. It does not know in advance what that request will be, as it can be anything the client issues - LOOKUP, READ, WRITE, ... Many requests are short and retrieve data from the filesystem. However, WRITE and NOTIFY_REPLY write data into the filesystem. Before getting into the operation phase, the FUSE filesystem server and the kernel client negotiate what should be the maximum write size the client will ever issue. After negotiation, the contract between server and client is that the filesystem server should queue /dev/fuse sys_read calls with enough buffer capacity to receive any client request - WRITE in particular - while the FUSE client should not, in particular, send WRITE requests with more than the negotiated max_write payload. The FUSE client in the kernel and libfuse historically reserve 4K for the request header. This way the contract is that the filesystem server should queue sys_reads with a 4K+max_write buffer. If the filesystem server does not follow this contract, what can happen is that fuse_dev_do_read will see that the request size is larger than the buffer size, and then it will return EIO to the client who issued the request, but won't indicate in any way to the filesystem server that there is a problem. This can be hard to diagnose because for some requests, e.g. for NOTIFY_REPLY which mimics WRITE, there is no client thread waiting for request completion and that EIO goes nowhere, while on the filesystem server side things look like the kernel is not replying back after a successful NOTIFY_RETRIEVE request made by the server. We can make the problem easy to diagnose if we indicate via an error return to the filesystem server when it is violating the contract. This should not cause problems in practice, because if a filesystem server is using a shorter buffer, writes to it were already very likely to cause EIO, and if the filesystem is read-only it should already be following the FUSE_MIN_READ_BUFFER minimum buffer size. Please see [1] for context where the problem of a stuck filesystem was hit for real (because the kernel client was incorrectly sending more than max_write data with NOTIFY_REPLY; see also the previous patch), how the situation was traced, and for a more involved patch that did not make it into the tree. [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2 Signed-off-by: Kirill Smelkov <kirr@nexedi.com> Cc: Han-Wen Nienhuys <hanwen@google.com> Cc: Jakob Unterwurzacher <jakobunt@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Kirill Smelkov
The FUSE filesystem server and kernel client negotiate during the initialization phase what should be the maximum write size the client will ever issue. Correspondingly, the filesystem server then queues sys_read calls to read requests with buffer capacity large enough to carry the request header + that many max_write bytes. A filesystem server is free to set its max_write anywhere in the range [1*page, fc->max_pages*page]. In particular go-fuse[2] sets max_write to 64K by default, whereas the default fc->max_pages corresponds to 128K. Libfuse also allows users to configure max_write, but by default presets it to the possible maximum. If max_write is < fc->max_pages*page, and the NOTIFY_RETRIEVE handler allows retrieving more than max_write bytes, the corresponding prepared NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the filesystem server, in full correspondence with the server/client contract, will only be queuing sys_read with ~max_write buffer capacity, and fuse_dev_do_read throws away requests that cannot fit into the server request buffer. In turn, the filesystem server could get stuck waiting indefinitely for the NOTIFY_REPLY, since the NOTIFY_RETRIEVE handler returned OK, which clients understand as meaning that the NOTIFY_REPLY was queued and will be sent back. Cap the requested size to the negotiated max_write to avoid the problem. This aligns with the way the NOTIFY_RETRIEVE handler works, which already unconditionally caps the requested retrieve size to fuse_conn->max_pages. This way it should not hurt NOTIFY_RETRIEVE semantics if we return less data than was originally requested. Please see [1] for context where the problem of a stuck filesystem was hit for real, how the situation was traced, and for a more involved patch that did not make it into the tree. [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2 [2] https://github.com/hanwen/go-fuse Signed-off-by: Kirill Smelkov <kirr@nexedi.com> Cc: Han-Wen Nienhuys <hanwen@google.com> Cc: Jakob Unterwurzacher <jakobunt@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
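The fix reduces to a single clamp in the NOTIFY_RETRIEVE path; a one-line sketch assuming the negotiated limit is stored as fc->max_write and the requested size arrives as outarg->size (both names per the commit text, the fragment itself is illustrative):

    /* never prepare more retrieve data than the server can read back */
    num = min_t(unsigned int, outarg->size, fc->max_write);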
-
Committed by Kirill Smelkov
Functions like pr_err() are a more modern variant of printing compared to printk(). They can be used to denoise the source by encoding the needed level in the print function's name, and by automatically inserting a per-driver/per-file print prefix as defined by the pr_fmt macro. The pr_* functions are also recommended by Documentation/process/coding-style.rst, and more recent code - for example overlayfs - uses them instead of printk. Convert CUSE and FUSE to use the new pr_* functions. CUSE output stays completely unchanged, while FUSE output is amended a bit for the "trying to steal weird page" warning - the second line now also comes with the "fuse:" prefix. I hope that is ok. Suggested-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Kirill Smelkov <kirr@nexedi.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
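A sketch of the conversion pattern. Defining pr_fmt before the includes is the standard kernel idiom; the "fuse: " prefix string is taken from the commit text:

    /* must be defined before the first include that pulls in printk.h */
    #define pr_fmt(fmt) "fuse: " fmt

    #include <linux/kernel.h>

    /* before: printk(KERN_ERR "fuse: trying to steal weird page\n"); */
    pr_err("trying to steal weird page\n");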
-
- 15 Apr 2019, 1 commit
-
-
Committed by Matthew Wilcox
Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers are converted to handle a failure. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Matthew Wilcox <willy@infradead.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
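A sketch of what a converted caller looks like after this change; the error value chosen on failure is illustrative, not necessarily the one any particular caller uses:

    if (!pipe_buf_get(pipe, buf)) {
        /* raising the page refcount would have overflowed it */
        ret = -EIO;
        goto out;
    }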
-
- 13 Feb 2019, 8 commits
-
-
Committed by Miklos Szeredi
The only caller that needs fc->aborted set is fuse_conn_abort_write(). Setting fc->aborted is now racy (fuse_abort_conn() may already be in progress or finished), but there's no reason to care. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Kirill Tkhai
This is a natural action after the previous patches, and it simply decreases the load on fc->lock. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Kirill Tkhai
When queue_interrupt() is called from fuse_dev_do_write(), the ID came directly from userspace. Userspace may pass any request id, even the id of a request we have not interrupted (or even of a background request). This patch adds a sanity check to make the kernel safe against that. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Kirill Tkhai
This is needed for the next patch. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Kirill Tkhai
Currently, we wait on req->waitq in the request_wait_answer() function only, and it's never used for background requests. Since wake_up() is not a lightweight macro but expands into a real function call whose locking operations cost CPU cycles, avoid calling it in the case where we definitely know it's completely useless. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Kirill Tkhai
We take the global fiq->waitq.lock every time we are in this function, but interrupted requests are just a small subset of all requests. This patch optimizes request_end() to take the lock only when it's really needed. queue_interrupt() needs a small change for that: after req is linked to the interrupt list, we do smp_mb() and check FR_FINISHED again. In case the FR_FINISHED bit has appeared, we remove req and leave the function. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
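A sketch of the re-check described above, using the flag, list, and lock names from the surrounding commits (the wakeup call is illustrative of the pre-fiq->lock era this series targets):

    spin_lock(&fiq->waitq.lock);
    if (list_empty(&req->intr_entry)) {
        list_add_tail(&req->intr_entry, &fiq->interrupts);
        /*
         * Pairs with the barrier on the request_end() side: if
         * FR_FINISHED was set concurrently, the request must not be
         * left dangling on the interrupt list.
         */
        smp_mb();
        if (test_bit(FR_FINISHED, &req->flags)) {
            list_del_init(&req->intr_entry);
            spin_unlock(&fiq->waitq.lock);
            return;
        }
        wake_up_locked(&fiq->waitq);
    }
    spin_unlock(&fiq->waitq.lock);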
-
Committed by Kirill Tkhai
We should send a signal only in case an interrupt is really queued. Not a real problem, but this makes the code clearer and more intuitive. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Kirill Tkhai
Function end_requests() does not take fc->lock. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 16 Jan 2019, 2 commits
-
-
Committed by Jann Horn
Some of the pipe_buf_release() handlers seem to assume that the pipe is locked - in particular, anon_pipe_buf_release() accesses pipe->tmp_page without taking any extra locks. From a glance through the callers of pipe_buf_release(), it looks like FUSE is the only one that calls pipe_buf_release() without having the pipe locked. This bug should only lead to a memory leak, nothing terrible. Fixes: dd3bb14f ("fuse: support splice() writing to fuse device") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
Dereferencing req->page_descs[0] will Oops if req->max_pages is zero. Reported-by: syzbot+c1e36d30ee3416289cc0@syzkaller.appspotmail.com Tested-by: syzbot+c1e36d30ee3416289cc0@syzkaller.appspotmail.com Fixes: b2430d75 ("fuse: add per-page descriptor <offset, length> to fuse_req") Cc: <stable@vger.kernel.org> # v3.9 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
- 09 Nov 2018, 2 commits
-
-
Committed by Miklos Szeredi
In the current fuse_drop_waiting() implementation it's possible that fuse_wait_aborted() will not be woken up in the unlikely case that fuse_abort_conn() + fuse_wait_aborted() runs in between checking fc->connected and calling atomic_dec(&fc->num_waiting). Do the atomic_dec_and_test() unconditionally, which also provides the necessary barrier against reordering with the fc->connected check. The explicit smp_mb() in fuse_wait_aborted() is not actually needed, since the spin_unlock() in fuse_abort_conn() provides the necessary RELEASE barrier after resetting fc->connected. However, this is not a performance sensitive path, and adding the explicit barrier makes it easier to document. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: b8f95e5d ("fuse: umount should wait for all requests") Cc: <stable@vger.kernel.org> #v4.19
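A sketch of the fixed ordering described above; the function and counter names come from the commit text, while the wakeup target is an assumption:

    static void fuse_drop_waiting(struct fuse_conn *fc)
    {
        /*
         * atomic_dec_and_test() provides the barrier that makes the
         * lockless fc->connected check safe against reordering.
         */
        if (atomic_dec_and_test(&fc->num_waiting) &&
            !READ_ONCE(fc->connected)) {
            /* wake up aborters */
            wake_up_all(&fc->blocked_waitq);
        }
    }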
-
Committed by Miklos Szeredi
fuse_request_send_notify_reply() may fail if the connection was reset for some reason (e.g. the fs was unmounted). Don't leak the request reference in this case. Besides leaking memory, this resulted in fc->num_waiting not being decremented and hence fuse_wait_aborted() being left in a hanging and unkillable state. Fixes: 2d45ba38 ("fuse: add retrieve request") Fixes: b8f95e5d ("fuse: umount should wait for all requests") Reported-and-tested-by: syzbot+6339eda9cb4ebbc4c37b@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org> #v2.6.36
-
- 01 Oct 2018, 3 commits
-
-
Committed by Miklos Szeredi
Writeback caching currently allocates requests with the maximum number of possible pages, while the actual number of pages per request depends on a couple of factors that cannot be determined when the request is allocated (whether the page is already under writeback, whether the page is contiguous with previous pages already added to a request). This patch allows such requests to start with no page allocation (all pages inline) and grow the page array on demand. If the max_pages tunable remains the default value, then this will mean just one allocation that is the same size as before. If the tunable is larger, then this adds at most 3 additional memory allocations (which is generously compensated by the improved performance from the larger request). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Constantine Shulyupin
Replace FUSE_MAX_PAGES_PER_REQ with the configurable parameter max_pages to improve performance. Old RFC with a detailed description of the problem and many fixes by Mitsuo Hayasaka (mitsuo.hayasaka.hu@hitachi.com): https://lkml.org/lkml/2012/7/5/136 We've encountered performance degradation and fixed it on a big and complex virtual environment. Environment to reproduce the degradation and improvement: 1. Add lag to user mode FUSE: add nanosleep(&(struct timespec){ 0, 1000 }, NULL); to xmp_write_buf in passthrough_fh.c 2. Patch UM fuse with the configurable max_pages parameter. The patch will be provided later. 3. Run the test script and perform the test on tmpfs:

    fuse_test()
    {
        cd /tmp
        mkdir -p fusemnt
        passthrough_fh -o max_pages=$1 /tmp/fusemnt
        grep fuse /proc/self/mounts
        dd conv=fdatasync oflag=dsync if=/dev/zero of=fusemnt/tmp/tmp \
            count=1K bs=1M 2>&1 | grep -v records
        rm fusemnt/tmp/tmp
        killall passthrough_fh
    }

Test results:

    passthrough_fh /tmp/fusemnt fuse.passthrough_fh \
        rw,nosuid,nodev,relatime,user_id=0,group_id=0 0 0
    1073741824 bytes (1.1 GB) copied, 1.73867 s, 618 MB/s

    passthrough_fh /tmp/fusemnt fuse.passthrough_fh \
        rw,nosuid,nodev,relatime,user_id=0,group_id=0,max_pages=256 0 0
    1073741824 bytes (1.1 GB) copied, 1.15643 s, 928 MB/s

Obviously with a bigger lag the difference between 'before' and 'after' will be more significant. Mitsuo Hayasaka, in 2012 (https://lkml.org/lkml/2012/7/5/136), observed an improvement from 400-550 to 520-740. Signed-off-by: Constantine Shulyupin <const@MakeLinux.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Miklos Szeredi
When allocating the page array for a request, the array for the page pointers and the array for page descriptors are allocated by two separate kmalloc() calls. Merge these into one allocation. Also, instead of initializing the request and the page arrays with memset(), use the zeroing allocation variants. Reserved requests never carry pages (page array size is zero). Make that explicit by initializing the page array pointers to NULL, and make sure the assumption remains true by adding a WARN_ON(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
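A sketch of the merged allocation, essentially the shape that is later renamed fuse_pages_alloc() in the Sep 2019 series above (the helper name here is hypothetical):

    static struct page **alloc_pages_and_descs(unsigned int npages, gfp_t flags,
                                               struct fuse_page_desc **descs)
    {
        /* one zeroed allocation holds both arrays back to back */
        struct page **pages;

        pages = kzalloc(npages * (sizeof(struct page *) +
                                  sizeof(struct fuse_page_desc)), flags);
        if (!pages)
            return NULL;
        *descs = (struct fuse_page_desc *)(pages + npages);
        return pages;
    }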
-
- 28 Sep 2018, 5 commits
-
-
Committed by Kirill Tkhai
We noticed a performance bottleneck in FUSE running our Virtuozzo storage over rdma. On some types of workload we observe 20% of time spent in request_find() in the profiler. This function iterates over a long list of requests, and it scales badly. The patch introduces a hash table to reduce the number of iterations we do in this function. The hash generating algorithm is taken from the hash_add() function, while a 256-bucket table is used to store pending requests. This fixes the problem and improves the performance. Reported-by: Alexey Kuznetsov <kuznet@virtuozzo.com> Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
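A sketch of the bucket selection, combining this commit with the ID-encoding one below; the macro names are illustrative, and FUSE_INT_REQ_BIT is repeated from that later entry so the fragment stands alone:

    #include <linux/hash.h>

    #define FUSE_INT_REQ_BIT    (1ULL << 0)
    #define FUSE_REQ_HASH_BITS  8
    #define FUSE_REQ_HASH_SIZE  (1 << FUSE_REQ_HASH_BITS)  /* 256 buckets */

    static unsigned int fuse_req_hash(u64 unique)
    {
        /* plain and interrupt requests hash to the same bucket */
        return hash_long(unique & ~FUSE_INT_REQ_BIT, FUSE_REQ_HASH_BITS);
    }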
-
Committed by Kirill Tkhai
This field is not needed after the previous patch, since we can easily convert a request ID to an interrupt request ID and vice versa. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Kirill Tkhai
Using two unconnected IDs, req->in.h.unique and req->intr_unique, does not allow linking requests into a hash table: neither of them can be used alone as a key to calculate the hash. This patch changes the algorithm of ID allocation for a request. Plain requests obtain an even ID, while interrupt requests are marked in the low bit. So, in the next patches we will be able to use the rest of the ID bits to calculate the hash, and the hash will be the same for plain and interrupt requests. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
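A sketch of the encoding: plain IDs advance in steps of two and stay even, and the matching interrupt ID just sets the low bit (the names follow the kernel's convention but should be treated as illustrative here):

    #define FUSE_INT_REQ_BIT    (1ULL << 0)
    #define FUSE_REQ_ID_STEP    (1ULL << 1)

    static u64 fuse_get_unique(struct fuse_iqueue *fiq)
    {
        fiq->reqctr += FUSE_REQ_ID_STEP;  /* 2, 4, 6, ... */
        return fiq->reqctr;
    }

    /* the interrupt request for a plain request reuses its ID: */
    u64 intr_unique = req->in.h.unique | FUSE_INT_REQ_BIT;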
-
Committed by Kirill Tkhai
Currently, we take fc->lock there only to check fc->connected. But this flag is changed only on connection abort, which is a very rare operation. So allow checking fc->connected under just fc->bg_lock, and use this lock (as well as fc->lock) when resetting fc->connected. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-
Committed by Kirill Tkhai
To reduce contention on fc->lock, this patch introduces bg_lock for protection of the fields related to the background queue. These are: max_background, congestion_threshold, num_background, active_background, bg_queue and blocked. This allows the next patch to make async reads not require fc->lock, so async reads and writes will have better performance when executed in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
-