- 11 5月, 2011 1 次提交
-
-
由 Eric W. Biederman 提交于
Create files under /proc/<pid>/ns/ to allow controlling the namespaces of a process. This addresses three specific problems that can make namespaces hard to work with. - Namespaces require a dedicated process to pin them in memory. - It is not possible to use a namespace unless you are the child of the original creator. - Namespaces don't have names that userspace can use to talk about them. The namespace files under /proc/<pid>/ns/ can be opened and the file descriptor can be used to talk about a specific namespace, and to keep the specified namespace alive. A namespace can be kept alive by either holding the file descriptor open or bind mounting the file someplace else. aka: mount --bind /proc/self/ns/net /some/filesystem/path mount --bind /proc/self/fd/<N> /some/filesystem/path This allows namespaces to be named with userspace policy. It requires additional support to make use of these filedescriptors and that will be comming in the following patches. Acked-by: NDaniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
-
- 04 5月, 2011 1 次提交
-
-
由 Linus Torvalds 提交于
In particular, s_freeing_list needs to be initialized early, since it is used on some of the error paths when mounts fail. The mapping inode, for example, would be initialized and then free'd on an error path before s_freeing_list was initialized, but the inode drop operation needs the s_freeing_list to be set up. Normally you'd never see this, because not only is logfs fairly rare, but a successful mount will never have any issues. Reported-by: Nwerner <w.landgraf@ru.ru> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 5月, 2011 2 次提交
-
-
由 Artem Bityutskiy 提交于
This is the second fix of the following symptom: UBIFS error (pid 34456): could not find an empty LEB which sometimes happens after power cuts when we mount the file-system - UBIFS refuses it with the above error message which comes from the 'ubifs_rcvry_gc_commit()' function. I can reproduce this using the integck test with the UBIFS power cut emulation enabled. Analysis of the problem. Currently UBIFS replay seeks the journal heads to the last _replayed_ bud. But the buds are replayed out-of-order, so the replay basically seeks journal heads to the "random" bud belonging to this head, and not to the _last_ one. The result of this is that the GC head may be seeked to a full LEB with no free space, or very little free space. And 'ubifs_rcvry_gc_commit()' tries to find a fully or mostly dirty LEB to match the current GC head (because we need to garbage-collect that dirty LEB at one go, because we do not have @c->gc_lnum). So 'ubifs_find_dirty_leb()' fails and we fall back to finding an empty LEB and also fail. As a result - recovery fails and mounting fails. This patch teaches the replay to initialize the GC heads exactly to the latest buds, i.e. the buds which have the largest sequence number in corresponding log reference nodes. Signed-off-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com> Cc: stable@kernel.org
-
由 Artem Bityutskiy 提交于
Currently UBIFS has a small optimization - it frees write-buffers when it is re-mounted from R/W mode to R/O mode. Of course, when it is mounted R/O, it does not allocate write-buffers as well. This optimization is nice but it leads to subtle problems and complications in recovery, which I can reproduce using the integck test. The symptoms are that after a power cut the file-system cannot be mounted if we first mount it R/O, and then re-mount R/W - 'ubifs_rcvry_gc_commit()' prints: UBIFS error (pid 34456): could not find an empty LEB Analysis of the problem. When mounting R/W, the reply process sets journal heads to buds [1], but when mounting R/O - it does not do this, because the write-buffers are not allocated. So 'ubifs_rcvry_gc_commit()' works completely differently for the same file-system but for the following 2 cases: 1. mounting R/W after a power cut and recover 2. mounting R/O after a power cut, re-mounting R/W and run deferred recovery In the former case, we have journal heads seeked to the a bud, in the latter case, they are non-seeked (wbuf->lnum == -1). So in the latter case we do not try to recover the GC LEB by garbage-collecting to the GC head, but we just try to find an empty LEB, and there may be no empty LEBs, so we just fail. On the other hand, in the former case (mount R/W), we are able to make a GC LEB (@c->gc_lnum) by garbage-collecting. Thus, let's remove this small nice optimization and always allocate write-buffers. This should not make too big difference - we have only 3 of them, each of max. write unit size, which is usually 2KiB. So this is about 6KiB of RAM for the typical case, and only when mounted R/O. [1]: Note, currently the replay process is setting (seeking) the journal heads to _some_ buds, not necessarily to the buds which had been the journal heads before the power cut happened. This will be fixed separately. Signed-off-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com> Cc: stable@kernel.org
-
- 29 4月, 2011 1 次提交
-
-
由 Andrew Morton 提交于
Azurit reports large increases in system time after 2.6.36 when running Apache. It was bisected down to a892e2d7 ("vfs: use kmalloc() to allocate fdmem if possible"). That patch caused the vfs to use kmalloc() for very large allocations and this is causing excessive work (and presumably excessive reclaim) within the page allocator. Fix it by falling back to vmalloc() earlier - when the allocation attempt would have been considered "costly" by reclaim. Reported-by: NazurIt <azurit@pobox.sk> Tested-by: NazurIt <azurit@pobox.sk> Acked-by: NChangli Gao <xiaosuo@gmail.com> Cc: Americo Wang <xiyou.wangcong@gmail.com> Cc: Jiri Slaby <jslaby@suse.cz> Acked-by: NEric Dumazet <eric.dumazet@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 4月, 2011 3 次提交
-
-
由 Jeff Layton 提交于
On a remount, the VFS layer will clear the MS_SYNCHRONOUS bit on the assumption that the flags on the mount syscall will have it set if the remounted fs is supposed to keep it. In the case of "noac" though, MS_SYNCHRONOUS is implied. A remount of such a mount will lose the MS_SYNCHRONOUS flag since "sync" isn't part of the mount options. Reported-by: NMax Matveev <makc@redhat.com> Signed-off-by: NJeff Layton <jlayton@redhat.com> Cc: stable@kernel.org Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Bryan Schumaker 提交于
When compiling, I was getting this warning: fs/nfs/nfs4xdr.c: In function ‘decode_secinfo’: fs/nfs/nfs4xdr.c:4839:6: warning: variable ‘status’ set but not used [-Wunused-but-set-variable] We were unconditionally returning 0 as long as there wasn't an error coming out of xdr_inline_decode(). We probably want to check the error status coming out of decode_op_hdr() and decode_secinfo_gss(), rather than assuming that everything is OK all the time. Signed-off-by: NBryan Schumaker <bjschuma@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
When readdir() returns a directory entry for the root of a mounted filesystem, Linux follows the old convention of returning the inode number of the covered directory (despite newer versions of POSIX declaring that this is a bug). To ensure this continues to work, the NFSv4 readdir implementation requests the 'mounted-on-fileid' from the server. However, readdirplus also needs to instantiate an inode for this entry, and for that, we also need to request the real fileid as per this patch. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 27 4月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
These changes were incorrectly fixed by codespell. They were now manually corrected. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi>
-
- 26 4月, 2011 14 次提交
-
-
由 Christoph Hellwig 提交于
Now that the whole dcache_hash_bucket crap is gone, go all the way and also remove the weird locking layering violations for locking the hash buckets. Add hlist_bl_lock/unlock helpers to move the locking into the list abstraction instead of requiring each caller to open code it. After all allowing for the bit locks is the whole point of these helpers over the plain hlist variant. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Tyler Hicks 提交于
After 57db4e8d changed eCryptfs to write-back caching, eCryptfs page writeback updates the lower inode times due to the use of vfs_write() on the lower file. To preserve inode metadata changes, such as 'cp -p' does with utimensat(), we need to flush all dirty pages early in ecryptfs_setattr() so that the user-updated lower inode metadata isn't clobbered later in writeback. https://bugzilla.kernel.org/show_bug.cgi?id=33372Reported-by: NRocko <rockorequin@hotmail.com> Signed-off-by: NTyler Hicks <tyhicks@linux.vnet.ibm.com>
-
由 Tyler Hicks 提交于
When failing to read the lower file's crypto metadata during a lookup, eCryptfs must continue on without throwing an error. For example, there may be a plaintext file in the lower mount point that the user wants to delete through the eCryptfs mount. If an error is encountered while reading the metadata in lookup(), the eCryptfs inode's size could be incorrect. We must be sure to reread the plaintext inode size from the metadata when performing an open() or setattr(). The metadata is already being read in those paths, so this adds minimal performance overhead. This patch introduces a flag which will track whether or not the plaintext inode size has been read so that an incorrect i_size can be fixed in the open() or setattr() paths. https://bugs.launchpad.net/bugs/509180 Cc: <stable@kernel.org> Signed-off-by: NTyler Hicks <tyhicks@linux.vnet.ibm.com>
-
由 Tsutomu Itoh 提交于
The error processing of several places is changed like setting the error number only at the error. Signed-off-by: NTsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Josef Bacik 提交于
In btrfs_submit_direct_hook if the first btrfs_map_block fails we need to put the orig_bio, not bio. Signed-off-by: NJosef Bacik <josef@redhat.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Josef Bacik 提交于
If our space cache is wrong, we do the right thing and free up everything that we loaded, however we don't reset the total_bitmaps counter or the thresholds or anything. So in btrfs_remove_free_space_cache make sure to call free_bitmap() if it's a bitmap, this will keep us from panicing when we check to make sure we don't have too many bitmaps. Thanks, Signed-off-by: NJosef Bacik <josef@redhat.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Li Zefan 提交于
Since commit dc89e982, we've changed to use a specific slab for alocation of free_space items. Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 David Sterba 提交于
Signed-off-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Tsutomu Itoh 提交于
The check on the return value of kmalloc() is added to some places. Signed-off-by: NTsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Itaru Kitayama 提交于
the space cache use extent_readpages() to read free space information, so we can not use GFP_KERNEL flag to allocate memory, or it may lead to deadlock. Signed-off-by: NItaru Kitayama <kitayama@cl.bb4u.ne.jp> Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Tsutomu Itoh 提交于
It is necessary to unlock mutex_lock before it return an error when btrfs_alloc_path() fails. Signed-off-by: NTsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 Tyler Hicks 提交于
For any given lower inode, eCryptfs keeps only one lower file open and multiplexes all eCryptfs file operations through that lower file. The lower file was considered "persistent" and stayed open from the first lookup through the lifetime of the inode. This patch keeps the notion of a single, per-inode lower file, but adds reference counting around the lower file so that it is closed when not currently in use. If the reference count is at 0 when an operation (such as open, create, etc.) needs to use the lower file, a new lower file is opened. Since the file is no longer persistent, all references to the term persistent file are changed to lower file. Locking is added around the sections of code that opens the lower file and assign the pointer in the inode info, as well as the code the fputs the lower file when all eCryptfs users are done with it. This patch is needed to fix issues, when mounted on top of the NFSv3 client, where the lower file is left silly renamed until the eCryptfs inode is destroyed. Signed-off-by: NTyler Hicks <tyhicks@linux.vnet.ibm.com>
-
由 Tyler Hicks 提交于
Call dput on the dentries previously returned by dget_parent() in ecryptfs_rename(). This is needed for supported eCryptfs mounts on top of the NFSv3 client. Signed-off-by: NTyler Hicks <tyhicks@linux.vnet.ibm.com>
-
由 Tyler Hicks 提交于
vfs_rmdir() already calls d_delete() on the lower dentry. That was being duplicated in ecryptfs_rmdir() and caused a NULL pointer dereference when NFSv3 was the lower filesystem. Signed-off-by: NTyler Hicks <tyhicks@linux.vnet.ibm.com>
-
- 25 4月, 2011 2 次提交
-
-
由 Trond Myklebust 提交于
The following patch ensures that we do not get permanently trapped in the RPC layer when trying to establish a new client id or session. This again ensures that the state manager can finish in a timely fashion when the last filesystem to reference the nfs_client exits. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Trond Myklebust 提交于
If a server for some reason keeps sending NFS4ERR_DELAY errors, we can end up looping forever inside nfs4_proc_create_session, and so the usual mechanisms for detecting if the nfs_client is dead don't work. Fix this by ensuring that we loop inside the nfs4_state_manager thread instead. Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
- 24 4月, 2011 2 次提交
-
-
由 Linus Torvalds 提交于
The dentry hashing rules have been really quite complicated for a long while, in odd ways. That made functions like __d_drop() very fragile and non-obvious. In particular, whether a dentry was hashed or not was indicated with an explicit DCACHE_UNHASHED bit. That's despite the fact that the hash abstraction that the dentries use actually have a 'is this entry hashed or not' model (which is a simple test of the 'pprev' pointer). The reason that was done is because we used the normal 'is this entry unhashed' model to mark whether the dentry had _ever_ been hashed in the dentry hash tables, and that logic goes back many years (commit b3423415: "dcache: avoid RCU for never-hashed dentries"). That, in turn, meant that __d_drop had totally different unhashing logic for the dentry hash table case and for the anonymous dcache case, because in order to use the "is this dentry hashed" logic as a flag for whether it had ever been on the RCU hash table, we had to unhash such a dentry differently so that we'd never think that it wasn't 'unhashed' and wouldn't be free'd correctly. That's just insane. It made the logic really hard to follow, when there were two different kinds of "unhashed" states, and one of them (the one that used "list_bl_unhashed()") really had nothing at all to do with being unhashed per se, but with a very subtle lifetime rule instead. So turn all of it around, and make it logical. Instead of having a DENTRY_UNHASHED bit in d_flags to indicate whether the dentry is on the hash chains or not, use the hash chain unhashed logic for that. Suddenly "d_unhashed()" just uses "list_bl_unhashed()", and everything makes sense. And for the lifetime rule, just use an explicit DENTRY_RCUACCEES bit. If we ever insert the dentry into the dentry hash table so that it is visible to RCU lookup, we mark it DENTRY_RCUACCESS to show that it now needs the RCU lifetime rules. Now suddently that test at dentry free time makes sense too. And because unhashing now is sane and doesn't depend on where the dentry got unhashed from (because the dentry hash chain details doesn't have some subtle side effects), we can re-unify the __d_drop() logic and use common code for the unhashing. Also fix one more open-coded hash chain bit_spin_lock() that I missed in the previous chain locking cleanup commit. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
It's a useless abstraction for 'hlist_bl_head', and it doesn't actually help anything - quite the reverse. All the users end up having to know about the hlist_bl_head details anyway, using 'struct hlist_bl_node *' etc. So it just makes the code look confusing. And the cost of it is extra '&b->head' syntactic noise, but more importantly it spuriously makes the hash table dentry list look different from the per-superblock DCACHE_DISCONNECTED dentry list. As a result, the code ended up using ad-hoc locking for one case and special helper functions for what is really another totally identical case in the very same function. Make it all look and work the same. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 4月, 2011 1 次提交
-
-
由 Pavel Shilovsky 提交于
While password processing we can get out of options array bound if the next character after array is delimiter. The patch adds a check if we reach the end. Signed-off-by: NPavel Shilovsky <piastry@etersoft.ru> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NSteve French <sfrench@us.ibm.com>
-
- 21 4月, 2011 4 次提交
-
-
由 Jan Kara 提交于
For some reason generic_setxattr() did not pass flags (XATTR_CREATE, XATTR_REPLACE) to the filesystem specific helper. This caused that setxattr(2) syscall just ignored these flags. Fix the bug by passing flags correctly. Signed-off-by: NJan Kara <jack@suse.cz> Acked-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Artem Bityutskiy 提交于
This patch fixes the following symptoms: 1. Unmount UBIFS cleanly. 2. Start mounting UBIFS R/W and have a power cut immediately 3. Start mounting UBIFS R/O, this succeeds 4. Try to re-mount UBIFS R/W - this fails immediately or later on, because UBIFS will write the master node to the flash area which has been written before. The analysis of the problem: 1. UBIFS is unmounted cleanly, both copies of the master node are clean. 2. UBIFS is being mounter R/W, starts changing master node copy 1, and a power cut happens. The copy N1 becomes corrupted. 3. UBIFS is being mounted R/O. It notices the copy N1 is corrupted and reads copy N2. Copy N2 is clean. 4. Because of R/O mode, UBIFS cannot recover copy 1. 5. The mount code (ubifs_mount()) sees that the master node is clean, so it decides that no recovery is needed. 6. We are re-mounting R/W. UBIFS believes no recovery is needed and starts updating the master node, but copy N1 is still corrupted and was not recovered! Fix this problem by marking the master node as dirty every time we recover it and we are in R/O mode. This forces further recovery and the UBIFS cleans-up the corruptions and recovers the copy N1 when re-mounting R/W later. Signed-off-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com> Cc: stable@kernel.org
-
由 Artem Bityutskiy 提交于
When UBIFS switches to R/O mode because it detects I/O failures, then when we unmount, we still may have allocated budget, and the assertions which verify that we have not budget will fire. But it is expected to have the budget in case of I/O failures, so the assertion warnings will be false. Suppress them for the I/O failure case. Signed-off-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com>
-
由 Dave Chinner 提交于
Commit 957935dc ("xfs: fix xfs_debug warnings" broke the logic in __xfs_printk(). Instead of only printing one of two possible output strings based on whether the fs has a name or not, it outputs both. Fix it to only output one message again. Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NAlex Elder <aelder@sgi.com>
-
- 20 4月, 2011 4 次提交
-
-
由 Artem Bityutskiy 提交于
This patch fixes UBIFS mount failure when the debugging support is enabled, we are recovering from a power cut, we were first mounter R/O and we are re-mounting R/W. In this case we should not assume that the amount of free space before we have re-mounted R/W and after are equivalent, because when we have mounted R/O the file-system is in a non-committed state so the amount of free space is slightly smaller, due to the fact that we cannot predict the amount of free space precisely before we commit. This patch fixes the issue by skipping the debugging check in case of recovery. This issue was reported by Caizhiyong <caizhiyong@huawei.com> here: http://thread.gmane.org/gmane.linux.drivers.mtd/34350/focus=34387Signed-off-by: NArtem Bityutskiy <Artem.Bityutskiy@nokia.com> Reported-by: NCaizhiyong <caizhiyong@huawei.com> Cc: stable@kernel.org [2.6.30+]
-
由 Sachin Prabhu 提交于
An open on a NFS4 share using the O_CREAT flag on an existing file for which we have permissions to open but contained in a directory with no write permissions will fail with EACCES. A tcpdump shows that the client had set the open mode to UNCHECKED which indicates that the file should be created if it doesn't exist and encountering an existing flag is not an error. Since in this case the file exists and can be opened by the user, the NFS server is wrong in attempting to check create permissions on the parent directory. The patch adds a conditional statement to check for create permissions only if the file doesn't exist. Signed-off-by: NSachin S. Prabhu <sprabhu@redhat.com> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
由 Chris Mason 提交于
The Btrfs submit bio threads have a small number of threads responsible for pushing down bios we've collected for a large number of devices. Since we do all the bios for a single device at once, we want to make sure we unplug and send down the bios for each device as we're done processing them. The new plugging API removed the btrfs code to unplug while processing bios, this adds it back with the new API. Signed-off-by: NChris Mason <chris.mason@oracle.com>
-
由 OGAWA Hirofumi 提交于
23fcf2ec (nfsd4: fix oops on lock failure) The above patch breaks free path for stp->st_file. If stp was inserted into sop->so_stateids, we have to free stp->st_file refcount. Because stp->st_file refcount itself is taken whether or not any refcounts are taken on the stp->st_file->fi_fds[]. Signed-off-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: stable@kernel.org Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-
- 19 4月, 2011 4 次提交
-
-
由 Bryan Schumaker 提交于
I only want to try other secflavors during an initial mount if NFS4ERR_WRONGSEC is returned. nfs4_handle_exception() could potentially map other errors to EPERM, so we should handle this error specially for correctness. Signed-off-by: NBryan Schumaker <bjschuma@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Bryan Schumaker 提交于
If we fail to contact the gss upcall program, then no message will be sent to the server. The client still updated the sequence number, however, and this lead to NFS4ERR_SEQ_MISMATCH for the next several RPC calls. Signed-off-by: NBryan Schumaker <bjschuma@netapp.com> Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
-
由 Linus Torvalds 提交于
Rather than pass in some random truncated offset to the pid-related functions, check that the offset is in range up-front. This is just cleanup, the previous commit fixed the real problem. Cc: stable@kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 J. Bruce Fields 提交于
Introduced by acfdf5c3. Cc: stable@kernel.org Reported-by: NGerhard Heift <ml-nfs-linux-20110412-ef47@gheift.de> Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
-