- 06 Feb, 2008 4 commits
-
-
Committed by Davide Libenzi
This is the new timerfd API as it is implemented by the following patch:

int timerfd_create(int clockid, int flags);
int timerfd_settime(int ufd, int flags, const struct itimerspec *utmr, struct itimerspec *otmr);
int timerfd_gettime(int ufd, struct itimerspec *otmr);

The timerfd_create() API creates an un-programmed timerfd fd. The "clockid" parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME.

The timerfd_settime() API gives new settings to the timerfd fd, optionally retrieving the previous expiration time (in case the "otmr" parameter is not NULL). The time value specified in "utmr" is absolute if the TFD_TIMER_ABSTIME bit is set in the "flags" parameter. Otherwise it's a relative time.

The timerfd_gettime() API returns the next expiration time of the timer, or {0, 0} if the timerfd has not been set yet.

Like the previous timerfd API implementation, read(2) and poll(2) are supported (with the same interface). Here's a simple test program I used to exercise the new timerfd APIs:

http://www.xmailserver.org/timerfd-test2.c

[akpm@linux-foundation.org: coding-style cleanups]
[akpm@linux-foundation.org: fix ia64 build]
[akpm@linux-foundation.org: fix m68k build]
[akpm@linux-foundation.org: fix mips build]
[akpm@linux-foundation.org: fix alpha, arm, blackfin, cris, m68k, s390, sparc and sparc64 builds]
[heiko.carstens@de.ibm.com: fix s390]
[akpm@linux-foundation.org: fix powerpc build]
[akpm@linux-foundation.org: fix sparc64 more]
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Oleg Nesterov
As Roland pointed out, we have a very old problem with exec. de_thread() sets SIGNAL_GROUP_EXIT, kills other threads, changes ->group_leader and then clears signal->flags. All signals (even fatal ones) sent in this window (which is not too small) will be lost.

With this patch exec doesn't abuse SIGNAL_GROUP_EXIT. signal_group_exit(), the new helper, should be used to detect exit_group() or exec() in progress. It can have more users, but this patch makes only the strictly necessary changes.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Robin Holt <holt@sgi.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Andrew Morton
It was dumb to make get_task_comm() return void. Change it to return a pointer to the resulting output for caller convenience.

Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Peter Zijlstra
On Sat, 2008-01-05 at 13:35 -0800, Davide Libenzi wrote:

> I remember I talked with Arjan about this some time ago. Basically, since 1)
> you can drop an epoll fd inside another epoll fd 2) callback-based wakeups
> are used, you can see a wake_up() from inside another wake_up(), but they
> will never refer to the same lock instance.
> Think about:
>
>     dfd = socket(...);
>     efd1 = epoll_create();
>     efd2 = epoll_create();
>     epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...);
>     epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
>
> When a packet arrives to the device underneath "dfd", the net code will
> issue a wake_up() on its poll wake list. Epoll (efd1) has installed a
> callback wakeup entry on that queue, and the wake_up() performed by the
> "dfd" net code will end up in ep_poll_callback(). At this point epoll
> (efd1) notices that it may have some event ready, so it needs to wake up
> the waiters on its poll wait list (efd2). So it calls ep_poll_safewake()
> that ends up in another wake_up(), after having checked about the
> recursion constraints. That is, no more than EP_MAX_POLLWAKE_NESTS, to
> avoid stack blasting. Never hit the same queue, to avoid loops like:
>
>     epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
>     epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...);
>     epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...);
>     epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...);
>
> The code "if (tncur->wq == wq || ..." prevents re-entering the same
> queue/lock.

Since the epoll code is very careful not to nest locks of the same instance, allow the recursion.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
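The nested setup from the quoted mail (an epoll fd registered inside another epoll fd) can be reproduced from user space. The sketch below is an illustration only: it uses a pipe in place of the socket "dfd" so that it is self-contained, and the function name `nested_epoll_wakeup()` is made up. It shows the user-visible effect of the chained wake_up() calls, not the kernel locking itself.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Register a pipe's read end in efd1, register efd1 in efd2, make the pipe
 * readable, and wait on efd2. Returns the number of ready events seen on
 * efd2 (expected: 1, reporting efd1), or -1 on error. */
static int nested_epoll_wakeup(void)
{
    int dfd[2] = { -1, -1 };   /* pipe standing in for the socket "dfd" */
    int efd1 = -1, efd2 = -1, ready = -1;
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event got;

    if (pipe(dfd) < 0)
        return -1;
    efd1 = epoll_create(1);
    efd2 = epoll_create(1);
    if (efd1 < 0 || efd2 < 0)
        goto done;

    ev.data.fd = dfd[0];
    if (epoll_ctl(efd1, EPOLL_CTL_ADD, dfd[0], &ev) < 0)
        goto done;
    ev.data.fd = efd1;
    if (epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, &ev) < 0)
        goto done;

    /* Data arrives on the inner fd: this wakes efd1, which in turn wakes efd2. */
    if (write(dfd[1], "x", 1) != 1)
        goto done;

    ready = epoll_wait(efd2, &got, 1, 1000);
    if (ready == 1 && got.data.fd != efd1)
        ready = -1;
done:
    if (efd2 >= 0)
        close(efd2);
    if (efd1 >= 0)
        close(efd1);
    close(dfd[0]);
    close(dfd[1]);
    return ready;
}
```

The outer epoll instance reports the inner one as readable, which is the case where one wake_up() runs from inside another.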
-
- 04 Feb, 2008 5 commits
-
-
Committed by Nick Piggin
Drivers that register a ->fault handler, but do not range-check the offset argument, must set VM_DONTEXPAND in the vm_flags in order to prevent an expanding mremap from overflowing the resource.

I've audited the tree and attempted to fix these problems (usually by adding VM_DONTEXPAND where it is not obvious).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Vitaliy Gusev
fcntl(F_GETLK, ...) can return the pid of a process as seen from a pid namespace other than the current one (if the process belongs to several namespaces). The same is true for pids in /proc/locks. So the correct behavior is to save a pointer to the struct pid of the process that owns the lock.

Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
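For context, this is how a caller obtains the lock owner's pid through F_GETLK; the value in `l_pid` is the one the commit above wants reported relative to the caller's pid namespace. The sketch is a plain single-namespace illustration, with the helper names `lock_holder_pid()` and `getlk_reports_child()` invented for the example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ask the kernel who holds a lock that would conflict with a whole-file
 * write lock on fd. Returns the owner's pid, 0 if there is no conflict,
 * or -1 on error. */
static pid_t lock_holder_pid(int fd)
{
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET,
                        .l_start = 0, .l_len = 0 };

    if (fcntl(fd, F_GETLK, &fl) < 0)
        return -1;
    return fl.l_type == F_UNLCK ? 0 : fl.l_pid;
}

/* Demo: a child takes a write lock, then the parent queries the owner.
 * Returns 1 if F_GETLK reported the child's pid, 0 if not, -1 on error. */
static int getlk_reports_child(void)
{
    int locked[2], release[2], result = -1;
    char c = 0;
    FILE *tmp = tmpfile();
    pid_t child;

    if (!tmp || pipe(locked) < 0 || pipe(release) < 0)
        return -1;

    child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        char ok = fcntl(fileno(tmp), F_SETLK, &fl) == 0 ? '1' : '0';

        close(release[1]);
        /* Report the lock attempt, then hold the lock until the parent is done. */
        if (write(locked[1], &ok, 1) != 1 || read(release[0], &c, 1) < 0)
            _exit(1);
        _exit(0);
    }

    close(locked[1]);
    if (read(locked[0], &c, 1) == 1 && c == '1')
        result = (lock_holder_pid(fileno(tmp)) == child);

    close(release[1]);          /* EOF on the pipe lets the child exit */
    waitpid(child, NULL, 0);
    close(locked[0]);
    close(release[0]);
    fclose(tmp);
    return result;
}
```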
-
Committed by Matthew Wilcox
interruptible_sleep_on_locked() is just an open-coded wait_event_interruptible_timeout(), with the one difference that interruptible_sleep_on_locked() doesn't bother to check the condition on which it is waiting, depending instead on the BKL to avoid the case where it blocks after the wakeup has already been called.

locks_block_on_timeout() is only used in one place, so it's actually simpler to inline it into its caller.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
For such a short function (with such a long comment), posix_locks_deadlock() seems to cause a lot of confusion. Attempt to make it a bit clearer:

  - Remove the initial posix_same_owner() check, which can never pass (since this is only called in the case that block_fl and caller_fl conflict)
  - Use an explicit loop (and a helper function) instead of a goto.
  - Rewrite the comment, attempting a clearer explanation, and removing some uninteresting historical detail.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
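The "explicit loop plus helper" shape mentioned in the posix_locks_deadlock() commit above can be sketched in user space on a simplified model. Everything here is an assumption made for illustration: owners are small integers, `waits_on[o]` names the owner that `o` is currently blocked on (or `NO_OWNER`), and the step limit is an extra guard that is not taken from the patch.

```c
#define NO_OWNER (-1)

/* Helper: which owner is `owner` currently blocked on, if any?
 * (Simplified stand-in for walking the kernel's list of blocked locks.) */
static int what_owner_is_waiting_for(const int *waits_on, int owner)
{
    return waits_on[owner];
}

/* Would `caller` deadlock if it blocked on a lock held by `blocker`?
 * Follow the chain of waiters starting at `blocker`; if it leads back
 * to `caller`, blocking would close a cycle. */
static int would_deadlock(const int *waits_on, int n_owners,
                          int caller, int blocker)
{
    int steps = 0;

    while ((blocker = what_owner_is_waiting_for(waits_on, blocker)) != NO_OWNER) {
        if (blocker == caller)
            return 1;
        if (++steps > n_owners)   /* guard: a cycle that excludes caller */
            break;
    }
    return 0;
}
```

For example, with owner 2 waiting on owner 1 and owner 1 waiting on owner 0, a request by owner 0 for a lock held by owner 2 is reported as a deadlock.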
-
Committed by Ohad Ben-Cohen
s/litle/little

Signed-off-by: Ohad Ben-Cohen <ohad@bencohen.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
-
- 03 Feb, 2008 4 commits
-
-
Committed by Joe Perches
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
-
Committed by Paulius Zaleckas
Signed-off-by: Paulius Zaleckas <pauliusz@yahoo.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
-
Committed by Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
-
Committed by Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
-
- 02 Feb, 2008 27 commits
-
-
Committed by Jeff Layton
It's possible for an RPC to outlive the lockd daemon that created it, so we need to make sure that all RPCs are killed when lockd is coming down. When nlm_shutdown_hosts is called, kill off all RPC tasks associated with the host. Since we need to wait until they have all gone away, we might as well just shut down the RPC client altogether.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
Neil Brown points out that we're checking buf[size-1] in a couple of places without first checking whether size is zero.

Actually, given the implementation of simple_transaction_get(), buf[-1] is zero, so in both of these cases the subsequent check of the value of buf[size-1] will catch this case.

But it seems fragile to depend on that, so add explicit checks for this case.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: NeilBrown <neilb@suse.de>
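The kind of explicit check added by the commit above looks roughly like this in isolation. This is a sketch, not the nfsd code: `check_line()` is a hypothetical function, and the assumption that the expected last byte is a newline is made up for the example.

```c
#include <stddef.h>

/* Validate that buf holds a non-empty, newline-terminated line.
 * Returns 0 if valid, -1 otherwise (kernel code would typically
 * return -EINVAL here). */
static int check_line(const char *buf, size_t size)
{
    if (size == 0)              /* explicit check: never touch buf[-1] */
        return -1;
    if (buf[size - 1] != '\n')
        return -1;
    return 0;
}
```

With the size test first, the function no longer relies on whatever happens to be stored just before the buffer.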
-
Committed by J. Bruce Fields
Wendy Cheng noticed that the function name doesn't agree here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Wendy Cheng <wcheng@redhat.com>
-
Committed by J. Bruce Fields
Neither EPERM nor ENOENT maps to a valid error for PUTROOTFH according to RFC 3530, and, if anything, ENOENT is likely to be slightly more informative; so don't bother mapping ENOENT to EPERM. (Probably this was originally done because one likely cause was that there is an fsid=0 export but that it isn't permitted to this particular client. Now that we allow WRONGSEC returns, this is somewhat less likely.)

In the long term we should work to make this situation less likely, perhaps by turning off nfsv4 service entirely in the absence of the pseudofs root, or constructing a pseudofilesystem root ourselves in the kernel as necessary.

Thanks to Benny Halevy <bhalevy@panasas.com> for pointing out this problem.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Benny Halevy <bhalevy@panasas.com>
-
Committed by Tom Tucker
Create a transport independent version of the svc_sock_names function.

The toclose capability of the svc_sock_names service can be implemented using the svc_xprt_find and svc_xprt_close services.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by Tom Tucker
Update the write handler for the portlist file to allow creating new listening endpoints on a transport. The general form of the string is:

<transport_name><space><port number>

For example:

echo "tcp 2049" > /proc/fs/nfsd/portlist

This is intended to support the creation of a listening endpoint for RDMA transports without adding #ifdef code to the nfssvc.c file.

Transports can also be removed as follows:

'-'<transport_name><space><port number>

For example:

echo "-tcp 2049" > /proc/fs/nfsd/portlist

Attempting to add a listener with an invalid transport string results in EPROTONOSUPPORT and a perror string of "Protocol not supported".

Attempting to remove a non-existent listener (e.g., bad proto or port) results in ENOTCONN and a perror string of "Transport endpoint is not connected".

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
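The two string forms accepted by the portlist file can be parsed along these lines. This is a user-space sketch of the grammar only, not the nfsd write handler: `struct portlist_cmd` and `parse_portlist_cmd()` are hypothetical names, the 15-character transport limit is an assumption, and the EPROTONOSUPPORT/ENOTCONN results described above come from the later transport lookup, which is not modeled here.

```c
#include <stdio.h>
#include <string.h>

/* Parsed form of "<transport> <port>" or "-<transport> <port>". */
struct portlist_cmd {
    int remove;             /* 1 if the string started with '-' */
    char transport[16];     /* e.g. "tcp" */
    int port;               /* e.g. 2049 */
};

/* Returns 0 on success, -1 if the string does not match either form. */
static int parse_portlist_cmd(const char *s, struct portlist_cmd *cmd)
{
    char extra;

    memset(cmd, 0, sizeof(*cmd));
    if (*s == '-') {
        cmd->remove = 1;
        s++;
    }
    /* Exactly two fields are expected; a third non-blank token is an error.
     * A trailing newline (as written by echo) is skipped as whitespace. */
    if (sscanf(s, "%15s %d %c", cmd->transport, &cmd->port, &extra) != 2)
        return -1;
    if (cmd->port < 1 || cmd->port > 65535)
        return -1;
    return 0;
}
```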
-
Committed by Tom Tucker
Add a new svc function that allows a service to query whether a transport instance has already been created. This is used in lockd to determine whether or not a transport needs to be created when a lockd instance is brought up.

Specifying 0 for the address family or port is effectively a wild-card, and will result in matching the first transport in the service's list that has a matching class name.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
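The wild-card matching rule described above (0 for address family or port matches anything, the class name must always match) can be illustrated on a plain array. The struct and function below are hypothetical stand-ins, not the sunrpc data structures or the new svc function itself.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>   /* AF_INET etc. for callers */

/* Simplified stand-in for one transport in a service's list. */
struct xprt_entry {
    const char *class_name;   /* e.g. "tcp", "udp" */
    int family;               /* e.g. AF_INET */
    unsigned short port;
};

/* Return the first entry whose class name matches and whose family and
 * port match the request, treating 0 as "any"; NULL if none matches. */
static const struct xprt_entry *find_xprt(const struct xprt_entry *list,
                                          size_t n, const char *class_name,
                                          int family, unsigned short port)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(list[i].class_name, class_name) != 0)
            continue;
        if (family != 0 && list[i].family != family)
            continue;
        if (port != 0 && list[i].port != port)
            continue;
        return &list[i];
    }
    return NULL;
}
```

Passing 0 for both family and port therefore answers the question "does any transport of this class exist yet?", which is how the commit describes its use in lockd.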
-
Committed by Tom Tucker
Move sk_list and sk_ready to svc_xprt. This involves close because these lists are walked by svcs when closing all their transports. So I combined the moving of these lists to svc_xprt with making close transport independent.

The svc_force_sock_close has been changed to svc_close_all and takes a list as an argument. This removes some svc internals knowledge from the svcs.

This code races with module removal and transport addition.

Thanks to Simon Holm Thøgersen for a compile fix.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Simon Holm Thøgersen <odie@cs.aau.dk>
-
Committed by Tom Tucker
Modify the various kernel RPC svcs to use the svc_create_xprt service.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by Oleg Drokin
Fix nlm_block leak for the case of supplied blocking lock info.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
Document these checks a little better and inline, as suggested by Neil Brown (note both functions have two callers). Remove an obviously bogus check while we're there (checking whether an unsigned value is negative).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Neil Brown <neilb@suse.de>
-
Committed by J. Bruce Fields
The server silently ignores attempts to set the uid and gid on create. Based on the comment, this appears to have been done to prevent some overly-clever IRIX client from causing itself problems.

Perhaps we should remove that hack completely. For now, at least, it makes sense to allow root (when no_root_squash is set) to set uid and gid.

While we're there, since nfsd_create and nfsd_create_v3 share the same logic, pull that out into a separate function. And spell out the individual modifications of ia_valid instead of doing them both at once inside a conditional.

Thanks to Roger Willcocks <roger@filmlight.ltd.uk> for the bug report and original patch on which this is based.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by Oleg Drokin
Without the patch, there is a leak of the nlmblock structure refcount, which holds a reference to the nlmfile structure, which in turn holds a reference to struct file, when async GETFL is used (-EINPROGRESS return from file_ops->lock()), and also in some error cases.

Fix up a style nit while we're here.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by Frank Filz
This patch addresses a compatibility issue with a Linux NFS server and AIX NFS client.

I have exported /export as fsid=0 with sec=krb5:krb5i
I have mount --bind /home onto /export/home
I have exported /export/home with sec=krb5i

The AIX client mounts / -o sec=krb5:krb5i onto /mnt

If I do an ls /mnt, the AIX client gets a permission error. Looking at the network trace, we see a READDIR looking for attributes FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID. The response gives a NFS4ERR_WRONGSEC which the AIX client is not expecting.

Since the AIX client is only asking for an attribute that is an attribute of the parent file system (pseudo root in my example), it seems reasonable that there should not be an error.

In discussing this issue with Bruce Fields, I initially proposed ignoring the error in nfsd4_encode_dirent_fattr() if all that was being asked for was FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID; however, Bruce suggested that we avoid calling cross_mnt() if only these attributes are requested.

The following patch implements bypassing cross_mnt() if only FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID are requested. Since there is some complexity in the code in nfsd4_encode_fattr(), I didn't want to duplicate code (and introduce a maintenance nightmare), so I added a parameter to nfsd4_encode_fattr() that indicates whether it should ignore cross mounts and simply fill in the attribute using the passed-in dentry as opposed to its parent.

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
The failure to return a stateowner from nfs4_preprocess_seqid_op() in the case where a lock request is of a type incompatible with an open (due to, e.g., an application attempting a write lock on a file open for read) means that fs/nfsd/nfs4xdr.c:ENCODE_SEQID_OP_TAIL() never bumps the seqid as it should. The client, attempting to close the file afterwards, then gets an (incorrect) bad sequence id error. Worse, this prevents the open file from ever being closed, so we leak state.

Thanks to Benny Halevy and Trond Myklebust for analysis, and to Steven Wilton for the report and extensive data-gathering.

Cc: Benny Halevy <bhalevy@panasas.com>
Cc: Steven Wilton <steven.wilton@team.eftel.com.au>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by Oleg Drokin
In a number of places where we wish only to translate nlm_drop_reply to rpc_drop_reply errors we instead return early with rpc_drop_reply, skipping some important end-of-function cleanup.

This results in reference count leaks when lockd is doing posix locking on GFS2.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
When the callback channel fails, we inform the client of that by returning a cb_path_down error the next time it tries to renew its lease.

If we wait most of a lease period before deciding that a callback has failed and that the callback channel is down, then we decrease the chances that the client will find out in time to do anything about it.

So, mark the channel down as soon as we recognize that an rpc has failed. However, continue trying to recall delegations anyway, in hopes it will come back up. This will prevent more delegations from being given out, and ensure cb_path_down is returned to renew calls earlier, while still making the best effort to deliver recalls of existing delegations.

Also fix a couple of comments and remove a dprintk that doesn't seem likely to be useful.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
Fix various minor style violations.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
Declare this variable in the one function where it's used, and clean up some minor style problems.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
Fix bizarre indentation.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
We generate a unique cl_confirm for every new client; so if we've already checked that this cl_confirm agrees with the cl_confirm of unconf, then we already know that it does not agree with the cl_confirm of conf.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
Again, the only way conf and unconf can have the same clientid is if they were created in the "probable callback update" case of setclientid, in which case we already know that the cl_verifier fields must agree.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
If conf and unconf are both found in the lookup by cl_clientid, then they share the same cl_clientid. We always create a unique new cl_clientid field when creating a new client; the only exception is the "probable callback update" case in setclientid, where we copy the old cl_clientid from another clientid with the same name.

Therefore two clients with the same cl_clientid field also always share the same cl_name field, and a couple of the checks here are redundant.

Thanks to Simon Holm Thøgersen for a compile fix.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Simon Holm Thøgersen <odie@cs.aau.dk>
-
Committed by J. Bruce Fields
Using a counter instead of the nanoseconds value seems more likely to produce a unique cl_confirm.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
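The idea of pairing a timestamp with a counter, rather than with a nanoseconds value, can be sketched as follows. This is an illustrative user-space analogue, not the nfsd code: `struct verifier` and this `gen_confirm()` are hypothetical, the 8-byte layout is an assumption, and the static counter is not thread-safe.

```c
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Opaque 8-byte verifier, assumed layout: seconds followed by a counter. */
struct verifier {
    unsigned char data[8];
};

/* Fill v with the current time in seconds plus a per-call counter.
 * Two calls within the same second still differ in the counter half,
 * whereas two nanosecond readings could in principle collide. */
static void gen_confirm(struct verifier *v)
{
    static uint32_t counter;   /* not thread-safe; sketch only */
    uint32_t parts[2];

    parts[0] = (uint32_t)time(NULL);
    parts[1] = counter++;
    memcpy(v->data, parts, sizeof(parts));
}
```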
-
Committed by J. Bruce Fields
We're supposed to generate a different cl_confirm verifier for each new client, so these two cl_confirm values should never be the same.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
Most of these comments just summarize the code.

The matching of code to the cases described in the RFC may still be useful, though; add specific section references to make that easier to follow. Also update references to the outdated RFC 3010.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-
Committed by J. Bruce Fields
While we're here, let's remove the redundant (and now wrong) pathname in the comment, and the #ifdef __KERNEL__'s.

Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
-