- 04 May, 2010 12 commits
-
-
Committed by Joel Becker
gcc warns that a variable is uninitialized. It's actually handled, but an early return fools gcc. Let's just initialize the variable to a garbage value that will crash if the usage is ever broken. Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Sage Weil
It's useless, since our allocations are already a power of 2. And it was allocated per-instance (not globally), which caused a name collision when we tried to mount a second file system with auth_x enabled. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
The __ variant requires the caller to hold i_lock. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
If a rename operation is resent to the MDS following an MDS restart, the client does not get a full reply (containing the resulting metadata) back. In that case, ceph_rename() needs to compensate by doing anything useful that fill_inode() would have done, like d_move(). It also needs to invalidate the dentry (to work around the vfs_rename_dir() bug) and clear the dir complete flag, just like fill_trace(). Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
truncate_inode_pages_range wants the end offset to align with the last byte in a page. Signed-off-by: Sage Weil <sage@newdream.net>
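The alignment requirement can be illustrated with a small user-space sketch. It assumes 4 KiB pages; `SK_PAGE_SIZE` and `sk_last_byte_of_page()` are hypothetical names for illustration and are not taken from the ceph code.

```c
#include <stdint.h>

/* Assumed page size for this sketch; kernel code would use PAGE_SIZE. */
#define SK_PAGE_SIZE 4096ULL

/*
 * Round an inclusive end offset up to the last byte of the page that
 * contains it (a multiple of the page size, minus one).
 */
static uint64_t sk_last_byte_of_page(uint64_t end)
{
    return end | (SK_PAGE_SIZE - 1);
}
```

An offset that already sits on the last byte of a page is returned unchanged, so the helper is safe to apply unconditionally.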
-
Committed by Sage Weil
We can get old message seq #'s after a tcp reconnect for stateful sessions (i.e., the MDS). If we get a higher seq #, that is an error, and we shouldn't see any bad seq #'s for stateless (mon, osd) connections. Signed-off-by: Sage Weil <sage@newdream.net>
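The rule described in this message can be sketched as a small classification function. This is one reading of the message, not the messenger code itself; the enum, the function name, and the `in_seq + 1` expectation are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

enum sk_seq_verdict {
    SK_SEQ_ACCEPT,   /* the next expected message: process it */
    SK_SEQ_SKIP_OLD, /* already seen, replayed after a reconnect: drop it */
    SK_SEQ_ERROR     /* a gap, or any bad seq on a stateless connection */
};

/*
 * in_seq   - sequence number of the last message accepted so far
 * msg_seq  - sequence number carried by the incoming message
 * stateful - true for sessions that replay messages after a reconnect
 */
static enum sk_seq_verdict sk_classify_seq(uint64_t in_seq, uint64_t msg_seq,
                                           bool stateful)
{
    if (msg_seq == in_seq + 1)
        return SK_SEQ_ACCEPT;
    if (msg_seq <= in_seq)
        return stateful ? SK_SEQ_SKIP_OLD : SK_SEQ_ERROR;
    return SK_SEQ_ERROR; /* higher than expected is always an error */
}
```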
-
Committed by Sage Weil
Increment in_seq even when the message is skipped for some reason. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
Decouple the client version from the server side. Print relevant protocol and map version info instead. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
The snap realm split was checking i_snap_realm, not the list_head, to determine if an inode belonged in the new realm. The check always failed, which meant we always moved the inode, corrupting the old realm's list and causing various crashes. Also wait to release the old realm reference to avoid the possibility of a use after free. Signed-off-by: Sage Weil <sage@newdream.net>
-
Committed by Sage Weil
d_move() reorders the d_subdirs list, breaking the readdir result caching. Unless/until d_move preserves that ordering, clear CEPH_I_COMPLETE on rename. Signed-off-by: Sage Weil <sage@newdream.net>
-
- 03 May, 2010 1 commit
-
-
Committed by Ryusuke Konishi
As of 32a88aa1, __sync_filesystem() will return 0 if s_bdi is not set. And nilfs does not set s_bdi anywhere. I noticed this problem through the warning introduced by the recent commit 5129a469 ("Catch filesystem lacking s_bdi").

    WARNING: at fs/super.c:959 vfs_kern_mount+0xc5/0x14e()
    Hardware name: PowerEdge 2850
    Modules linked in: nilfs2 loop tpm_tis tpm tpm_bios video shpchp pci_hotplug output dcdbas
    Pid: 3773, comm: mount.nilfs2 Not tainted 2.6.34-rc6-debug #38
    Call Trace:
     [<c1028422>] warn_slowpath_common+0x60/0x90
     [<c102845f>] warn_slowpath_null+0xd/0x10
     [<c1095936>] vfs_kern_mount+0xc5/0x14e
     [<c1095a03>] do_kern_mount+0x32/0xbd
     [<c10a811e>] do_mount+0x671/0x6d0
     [<c1073794>] ? __get_free_pages+0x1f/0x21
     [<c10a684f>] ? copy_mount_options+0x2b/0xe2
     [<c107b634>] ? strndup_user+0x48/0x67
     [<c10a81de>] sys_mount+0x61/0x8f
     [<c100280c>] sysenter_do_call+0x12/0x32

This patch sets s_bdi for nilfs and fixes the silent sync failure. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 01 May, 2010 2 commits
-
-
Committed by Li Dongyang
When we fall back to buffered write from direct write, we call __generic_file_aio_write(), but that will end up doing a direct write even though we are only prepared to do a buffered write, because the file has the O_DIRECT flag set. This is a fix for https://bugzilla.novell.com/show_bug.cgi?id=591039 revised with Joel's comments. Signed-off-by: Li Dongyang <lidongyang@novell.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Ralf Baechle
CONFIG_INOTIFY_USER defined but CONFIG_ANON_INODES undefined will result in the following build failure:

    LD vmlinux
    fs/built-in.o: In function 'sys_inotify_init1':
    (.text.sys_inotify_init1+0x22c): undefined reference to 'anon_inode_getfd'
    fs/built-in.o: In function `sys_inotify_init1':
    (.text.sys_inotify_init1+0x22c): relocation truncated to fit: R_MIPS_26 against 'anon_inode_getfd'
    make[2]: *** [vmlinux] Error 1
    make[1]: *** [sub-make] Error 2
    make: *** [all] Error 2

Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 30 April, 2010 3 commits
-
-
Committed by Dave Chinner
On low memory boxes or those with highmem, the kernel can OOM before the background reclaims inodes via xfssyncd. Add a shrinker to run inode reclaim so that inode reclaim is expedited when memory is low. This is more complex than it needs to be because the VM folk don't want a context added to the shrinker infrastructure. Hence we need to add a global list of XFS mount structures so the shrinker can traverse them. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Committed by Boaz Harrosh
The patch "add bdi backing to mount session" (b3d0ab7e) has a bug in the placement of the bdi member in struct exofs_sb_info. The layout member must be kept last. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Jens Axboe
When CONFIG_BLOCK is set, it ends up getting backing-dev.h included. But for !CONFIG_BLOCK, it isn't so lucky. The proper thing to do is include <linux/backing-dev.h> directly from the file it's used from, so do that. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 29 April, 2010 5 commits
-
-
Committed by Arnd Bergmann
The pktcdvd driver uses proper locking and does not need the BKL in the ioctl and llseek functions of the character device, so kill both. Moving the compat_ioctl handling from common code into the driver itself fixes build problems when CONFIG_BLOCK is disabled. Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Boaz Harrosh
Commit b3d0ab7e ("exofs: add bdi backing to mount session") has a bug in the placement of the bdi member in struct exofs_sb_info. The layout member must be kept last. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Al Viro
If a dentry found stale happens to be the root of a disconnected tree, we can't d_drop() it; its d_hash is actually part of s_anon and d_drop() would simply hide it from shrink_dcache_for_umount(), leading to all sorts of fun, including busy inodes on umount and oopsen after that. The bug has been there since at least 2006 (commit c636eb already has it), so it's definitely -stable fodder. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Xiaotian Feng
With CONFIG_NFS_V4 and data version 4, nfs_get_sb will allocate memory for export_path in nfs4_validate_text_mount_data, so we need to free it afterwards. This addresses the following kmemleak report:

    unreferenced object 0xffff88016bf48a50 (size 16):
      comm "mount.nfs", pid 22567, jiffies 4651574704 (age 175471.200s)
      hex dump (first 16 bytes):
        2f 6f 70 74 2f 77 6f 72 6b 00 6b 6b 6b 6b 6b a5  /opt/work.kkkkk.
      backtrace:
        [<ffffffff814b34f9>] kmemleak_alloc+0x60/0xa7
        [<ffffffff81102c76>] kmemleak_alloc_recursive.clone.5+0x1b/0x1d
        [<ffffffff811046b3>] __kmalloc_track_caller+0x18f/0x1b7
        [<ffffffff810e1b08>] kstrndup+0x37/0x54
        [<ffffffffa0336971>] nfs_parse_devname+0x152/0x204 [nfs]
        [<ffffffffa0336af3>] nfs4_validate_text_mount_data+0xd0/0xdc [nfs]
        [<ffffffffa0338deb>] nfs_get_sb+0x325/0x736 [nfs]
        [<ffffffff81113671>] vfs_kern_mount+0xbd/0x17c
        [<ffffffff81113798>] do_kern_mount+0x4d/0xed
        [<ffffffff81129a87>] do_mount+0x787/0x7fe
        [<ffffffff81129b86>] sys_mount+0x88/0xc2
        [<ffffffff81009b42>] system_call_fastpath+0x16/0x1b

Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Benny Halevy <bhalevy@panasas.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Committed by Dan Carpenter
The original code passed an ERR_PTR() to rpc_put_task() and instead of returning zero on success it returned -ENOMEM. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
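The two mistakes named here (releasing an error pointer, and returning an error code on success) can be shown with a user-space sketch of the error-pointer idiom. Everything prefixed `sk_` is a hypothetical stand-in written for this example; it mimics the shape of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers but is not the kernel or sunrpc code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SK_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, and decode it again. */
static void *sk_err_ptr(long err) { return (void *)(intptr_t)err; }
static long sk_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int sk_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-SK_MAX_ERRNO);
}

struct sk_task { int refs; };

/* Stand-in for a call that returns either a task or an error pointer. */
static struct sk_task *sk_run_task(int fail)
{
    struct sk_task *t;

    if (fail)
        return sk_err_ptr(-ENOMEM);
    t = calloc(1, sizeof(*t));
    if (!t)
        return sk_err_ptr(-ENOMEM);
    t->refs = 1;
    return t;
}

static void sk_put_task(struct sk_task *t)
{
    if (--t->refs == 0)
        free(t);
}

static int sk_do_call(int fail)
{
    struct sk_task *task = sk_run_task(fail);

    if (sk_is_err(task))
        return (int)sk_ptr_err(task); /* never "put" an error pointer */
    sk_put_task(task);
    return 0;                         /* success must return zero */
}
```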
-
- 28 April, 2010 3 commits
-
-
Committed by Jerome Marchand
Correct the file_operations struct in the fdinfo entry of tid_base_stuff[]. Presently /proc/*/task/*/fdinfo contains symlinks to opened files like /proc/*/fd/. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Miklos Szeredi <mszeredi@suse.cz> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Trond Myklebust
Neil Brown reports that he is seeing the BUG_ON(ret == 0) trigger in nfs_page_async_flush. According to the trace in https://bugzilla.novell.com/show_bug.cgi?id=599628 the problem appears to be due to nfs_wb_page() not waiting for the PG_writeback flag to clear. The same problem exists in nfs_wb_page_cancel(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Committed by Christoph Egger
The checks for CONFIG_MMU at this location are duplicated as all the code is located inside a #ifndef CONFIG_MMU block. So the first conditional block will always be included while the second never will. Signed-off-by: Christoph Egger <siccegge@stud.informatik.uni-erlangen.de> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 27 April, 2010 2 commits
-
-
Committed by Neil Brown
When read_buf is called to move over to the next page in the pagelist of an NFSv4 request, it sets argp->end to essentially a random number, certainly not an address within the page which argp->p now points to. So subsequent calls to READ_BUF will think there is much more than a page of spare space (the cast to u32 ensures an unsigned comparison) so we can expect to fall off the end of the second page. We never encountered this in testing because typically the only operations which use more than two pages are write-like operations, which have their own decoding logic. Something like a getattr after a write may cross a page boundary, but it would be very unusual for it to cross another boundary after that. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
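The effect of the unsigned comparison can be reproduced in user space. The sketch below assumes a space check of the general form "nbytes <= (u32)(end - p)", as the message describes; `sk_has_room()`, `sk_pages`, and the stale `end` value are hypothetical and chosen only to make the wraparound visible.

```c
#include <stdint.h>

#define SK_PAGE 4096

/* Two adjacent "pages" so that pointer arithmetic stays inside one array. */
static char sk_pages[2 * SK_PAGE];

/*
 * Space check with the difference cast to an unsigned 32-bit value.
 * If `end` is stale and lies below `p`, the negative difference wraps
 * to a huge unsigned number and every request appears to fit.
 */
static int sk_has_room(const char *p, const char *end, uint32_t nbytes)
{
    return nbytes <= (uint32_t)(end - p);
}
```

With `p` at the start of the second page and a correct `end` at the end of that page, a 5000-byte request is refused. With `end` left pointing into the first page, the same request is accepted, which is the overrun the message warns about.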
-
Committed by Dave Chinner
A new xfsqa test (226) with a prototype xfs_fsr change to try to handle dynamic fork offsets better triggers an assertion failure where the inode data fork is in btree format, yet there is room in the inode for it to be in extent format. The two inodes look like:

    before: ino 0x101 (target), num_extents 11, Max in-fork extents 6, broot size 40, fork offset 96
    before: ino 0x115 (temp), num_extents 5, Max in-fork extents 3, broot size 40, fork offset 56
    after: ino 0x101 (target), num_extents 5, Max in-fork extents 6, broot size 40, fork offset 96
    after: ino 0x115 (temp), num_extents 11, Max in-fork extents 3, broot size 40, fork offset 56

Basically the target inode ends up with 5 extents in btree format, but it had space for 6 extents in extent format, so it ends up incorrect. Notably here the broot size is the same, and that is where the kernel code is going wrong: the btree root will fit, so it lets the swap go ahead. The check should not allow the swap to take place if the number of extents while in btree format is less than the number of extents that can fit in the inode in extent format. Adding that check will prevent this swap and corruption from occurring. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
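The additional check described in the last paragraph can be sketched as a predicate. This follows the wording of the message (a strict "less than" comparison) and is not the XFS implementation; the enum and function names are hypothetical.

```c
#include <stdbool.h>

enum sk_fork_format { SK_FMT_EXTENTS, SK_FMT_BTREE };

/*
 * Would a data fork in format `fmt`, holding `num_extents` extents, be
 * acceptable in an inode that can hold `max_in_fork` extents in extent
 * format? A btree-format fork with fewer extents than would fit
 * in-fork is rejected, as worded in the commit message.
 */
static bool sk_swap_allowed(enum sk_fork_format fmt, unsigned int num_extents,
                            unsigned int max_in_fork)
{
    if (fmt == SK_FMT_BTREE && num_extents < max_in_fork)
        return false;
    return true;
}
```

Applied to the numbers above, moving the temp inode's btree-format fork (5 extents) into the target inode (room for 6 in-fork extents) is refused, so the swap would not go ahead.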
-
- 26 April, 2010 1 commit
-
-
Committed by Jens Axboe
It's now a provided helper, so get rid of the internal setup and btrfs atomic_t bdi enumerator. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
- 25 April, 2010 7 commits
-
-
Committed by Jörn Engel
noop_backing_dev_info is used only as a flag to mark filesystems that don't have any backing store, like tmpfs, procfs, spufs, etc. Signed-off-by: Joern Engel <joern@logfs.org> Changed the BUG_ON() to a WARN_ON(). Note that adding dirty inodes to the noop_backing_dev_info is not legal and will not result in them being flushed, but we already catch this condition in __mark_inode_dirty() when checking for a registered bdi. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
-
Committed by Phillip Lougher
Sizing the buffer based on block size is incorrect, leading to a potential buffer over-run on 4K block size file systems (because the metadata block size is always 8K). This bug doesn't seem to have been triggered because 4K block size file systems are not the default, and also because metadata blocks after compression tend to be less than 4K. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
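The sizing problem comes down to simple arithmetic, sketched below. The 8K metadata size is taken from the message; the function names and the "take the larger of the two" rule are assumptions for illustration, not necessarily how the squashfs patch sizes its buffer.

```c
#include <stddef.h>

/* Metadata blocks are always 8K, according to the commit message. */
#define SK_METADATA_SIZE 8192u

/* Sizing that follows the data block size alone: too small for 4K blocks. */
static size_t sk_buf_size_by_block(size_t block_size)
{
    return block_size;
}

/* Sizing that is never smaller than one metadata block. */
static size_t sk_buf_size_safe(size_t block_size)
{
    return block_size > SK_METADATA_SIZE ? block_size : SK_METADATA_SIZE;
}
```

For a 4K block size the first function yields a 4096-byte buffer, which an uncompressed 8K metadata block would overrun; the second yields 8192 bytes.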
-
Committed by Phillip Lougher
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
-
Committed by Phillip Lougher
Fix warn_on triggered by mounting a fsfuzzer corrupted file system, where the root inode has been corrupted. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk> Reported-by: Steve Grubb <sgrubb@redhat.com>
-
Committed by Anton Blanchard
We are seeing a large regression in database performance on recent kernels. The database opens a block device with O_DIRECT|O_SYNC and a number of threads write to different regions of the file at the same time. A simple test case is below. I haven't defined DEVICE since getting it wrong will destroy your data :)

On a 3 disk LVM with a 64k chunk size we see about 17MB/sec and only a few threads in IO wait:

    procs  -----io---- -system-- -----cpu------
     r  b    bi    bo   in   cs us sy id wa st
     0  3     0 16170  656 2259  0  0 86 14  0
     0  2     0 16704  695 2408  0  0 92  8  0
     0  2     0 17308  744 2653  0  0 86 14  0
     0  2     0 17933  759 2777  0  0 89 10  0

Most threads are blocking in vfs_fsync_range, which has:

    mutex_lock(&mapping->host->i_mutex);
    err = fop->fsync(file, dentry, datasync);
    if (!ret)
            ret = err;
    mutex_unlock(&mapping->host->i_mutex);

commit 148f948b (vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode) offers some explanation of what is going on:

    Use these new helpers for syncing from generic VFS functions. This makes
    O_SYNC writes to block devices acquire i_mutex for syncing. If we really
    care about this, we can make block_fsync() drop the i_mutex and reacquire
    it before it returns.

Thanks Jan for such a good commit message! As well as dropping i_mutex, Christoph suggests we should remove the call to sync_blockdev():

    > sync_blockdev is an overcomplicated alias for filemap_write_and_wait on
    > the block device inode, which is exactly what we did just before calling
    > into ->fsync

The patch below incorporates both suggestions. With it the testcase improves from 17MB/sec to 68MB/sec:

    procs  -----io---- -system-- -----cpu------
     r  b    bi    bo   in   cs us sy id wa st
     0  7     0 65536 1000 3878  0  0 70 30  0
     0 34     0 69632 1016 3921  0  1 46 53  0
     0 57     0 69632 1000 3921  0  0 55 45  0
     0 53     0 69640  754 4111  0  0 81 19  0

Testcase:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <pthread.h>
    #include <unistd.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    #define NR_THREADS 64
    #define BUFSIZE (64 * 1024)

    #define DEVICE "/dev/mapper/XXXXXX"

    #define ALIGN(VAL, SIZE) (((VAL)+(SIZE)-1) & ~((SIZE)-1))

    static int fd;

    static void *doit(void *arg)
    {
            unsigned long offset = (long)arg;
            char *b, *buf;

            b = malloc(BUFSIZE + 1024);
            buf = (char *)ALIGN((unsigned long)b, 1024);
            memset(buf, 0, BUFSIZE);

            while (1)
                    pwrite(fd, buf, BUFSIZE, offset);
    }

    int main(int argc, char *argv[])
    {
            int flags = O_RDWR|O_DIRECT;
            int i;
            unsigned long offset = 0;

            if (argc > 1 && !strcmp(argv[1], "O_SYNC"))
                    flags |= O_SYNC;

            fd = open(DEVICE, flags);
            if (fd == -1) {
                    perror("open");
                    exit(1);
            }

            for (i = 0; i < NR_THREADS-1; i++) {
                    pthread_t tid;
                    pthread_create(&tid, NULL, doit, (void *)offset);
                    offset += BUFSIZE;
            }
            doit((void *)offset);

            return 0;
    }

Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Jeff Mahoney
Commit 48b32a35 ("reiserfs: use generic xattr handlers") introduced a problem that causes corruption when extended attributes are replaced with a smaller value. The issue is that the reiserfs_setattr to shrink the xattr file was moved from before the write to after the write. The root issue has always been in the reiserfs xattr code, but was papered over by the fact that in the shrink case, the file would just be expanded again while the xattr was written. The end result is that the last 8 bytes of xattr data are lost. This patch fixes it to use new_size. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=14826 Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reported-by: Christian Kujau <lists@nerdbynature.de> Tested-by: Christian Kujau <lists@nerdbynature.de> Cc: Edward Shishkin <edward.shishkin@gmail.com> Cc: Jethro Beekman <kernel@jbeekman.nl> Cc: Greg Surbey <gregsurbey@hotmail.com> Cc: Marco Gatti <marco.gatti@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Jeff Mahoney
Commit 677c9b2e ("reiserfs: remove privroot hiding in lookup") removed the magic from the lookup code to hide the .reiserfs_priv directory since it was getting loaded at mount-time instead. The intent was that the entry would be hidden from the user via a poisoned d_compare, but this was faulty. This introduced a security issue where unprivileged users could access and modify extended attributes or ACLs belonging to other users, including root. This patch resolves the issue by properly hiding .reiserfs_priv. This was the intent of the xattr poisoning code, but it appears to have never worked as expected. This is fixed by using d_revalidate instead of d_compare. This patch makes -oexpose_privroot a no-op. I'm fine leaving it this way. The effort involved in working out the corner cases wrt permissions and caching outweighs the benefit of the feature. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Acked-by: Edward Shishkin <edward.shishkin@gmail.com> Reported-by: Matt McCutchen <matt@mattmccutchen.net> Tested-by: Matt McCutchen <matt@mattmccutchen.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 24 April, 2010 4 commits
-
-
Committed by Joel Becker
When asked for a partial read of the LVB in a dlmfs file, we can accidentally calculate a negative count. Reported-by: Dan Carpenter <error27@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Joel Becker <joel.becker@oracle.com>
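The message does not say how the negative count arose, so the following is only a generic sketch of a partial-read length calculation that cannot go negative. It uses unsigned sizes and checks the offset before subtracting; `sk_lvb_read_len()` and its parameters are hypothetical and not taken from dlmfs.

```c
#include <stddef.h>

/*
 * Number of bytes to copy for a read of `count` bytes at offset `pos`
 * from a value block that is `lvb_len` bytes long. Returns 0 when the
 * offset is at or past the end, so the subtraction never underflows.
 */
static size_t sk_lvb_read_len(size_t lvb_len, size_t pos, size_t count)
{
    size_t remaining;

    if (pos >= lvb_len)
        return 0;
    remaining = lvb_len - pos;
    return count < remaining ? count : remaining;
}
```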
-
Committed by Tao Ma
In reflink we update the id info on the disk but forgot to update the corresponding information in the VFS inode. Update them accordingly when we want to preserve the attributes. Reported-by: Jeff Liu <jeff.liu@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com> Cc: <stable@kernel.org> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Dan Carpenter
If "handle" is non null at the end of the function then we assume it's a valid pointer and pass it to ocfs2_commit_trans(); Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Joel Becker <joel.becker@oracle.com>
-
Committed by Mark Fasheh
If we get a failure during creation of an inode we'll allow the orphan code to remove the inode, which is correct. However, we need to ensure that we don't get any errors after the call to ocfs2_add_entry(), otherwise we could leave a dangling directory reference. The solution is simple - in both cases, all I had to do was move ocfs2_dentry_attach_lock() above the ocfs2_add_entry() call. Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-