- 01 May, 2009 1 commit
-
-
Committed by Theodore Ts'o
Ext4's on-line resizing adds a new block group and then, only at the last step, adjusts s_groups_count. However, it's possible on SMP systems that another CPU could see the updated s_groups_count and not see the newly initialized data structures for the just-added block group. For this reason, it's important to insert an SMP read barrier after reading s_groups_count and before reading (for example) the new block group descriptors allowed by the increased value of s_groups_count.

Unfortunately, we rather blatantly violate this locking protocol documented in fs/ext4/resize.c. Fortunately, (1) on-line resizes happen relatively rarely, and (2) it seems rare that the filesystem code will immediately try to use a just-added block group before any memory ordering issues resolve themselves. So apparently problems here are relatively hard to hit, since ext3 has been vulnerable to the same issue for years with no one apparently complaining.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
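
The publish-then-read ordering this commit message describes can be illustrated in userspace with C11 acquire/release atomics. This is only a sketch of the idea, not the ext4 code: `struct fs_info`, `add_group()` and `last_group_desc()` are hypothetical names, and the kernel uses its own barrier primitives rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>

#define MAX_GROUPS 8  /* hypothetical capacity for this sketch */

/* Hypothetical stand-in for the superblock info and group descriptors. */
struct fs_info {
    int group_desc[MAX_GROUPS];  /* per-group data, written before publishing */
    atomic_uint groups_count;    /* published last, like s_groups_count */
};

/* Writer (resize path): initialize the new group, then publish the count. */
static void add_group(struct fs_info *fs, int desc)
{
    unsigned int n = atomic_load_explicit(&fs->groups_count,
                                          memory_order_relaxed);
    if (n >= MAX_GROUPS)
        return;
    fs->group_desc[n] = desc;
    /* Release ordering plays the role of the writer-side barrier. */
    atomic_store_explicit(&fs->groups_count, n + 1, memory_order_release);
}

/* Reader: load the count with acquire ordering (the "read barrier"),
 * and only then touch descriptors below that count. */
static int last_group_desc(struct fs_info *fs)
{
    unsigned int n = atomic_load_explicit(&fs->groups_count,
                                          memory_order_acquire);
    return n ? fs->group_desc[n - 1] : -1;
}
```

A reader that loaded the count without acquire ordering could, on a weakly ordered CPU, observe the new count and still read a stale descriptor, which is the hazard the message describes.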
-
- 02 May, 2009 1 commit
-
-
Committed by Theodore Ts'o
By using a separate super_operations structure for filesystems that have and don't have journals, we can simplify ext4_write_super() --- which is only needed when no journal is present --- and ext4_freeze(), ext4_unfreeze(), and ext4_sync_fs(), which are only needed when the journal is present.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
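
The idea of choosing an operations table once, rather than testing for a journal inside each callback, might look roughly like this. It is a cut-down sketch: `struct sk_super_ops` and every `sk_` name are hypothetical, and the real `struct super_operations` has many more members.

```c
#include <stddef.h>

/* Hypothetical, cut-down operations table. */
struct sk_super_ops {
    int (*write_super)(void);  /* only needed without a journal */
    int (*sync_fs)(void);      /* only needed with a journal */
};

/* Placeholder callbacks that return distinct values for illustration. */
static int sk_write_super(void) { return 1; }
static int sk_sync_fs(void)     { return 2; }

/* One table per configuration; unused callbacks stay NULL. */
static const struct sk_super_ops sk_journal_ops   = { .sync_fs = sk_sync_fs };
static const struct sk_super_ops sk_nojournal_ops = { .write_super = sk_write_super };

/* Pick the table once, at mount time. */
static const struct sk_super_ops *sk_pick_ops(int has_journal)
{
    return has_journal ? &sk_journal_ops : &sk_nojournal_ops;
}
```

Each callback can then assume its configuration instead of re-checking it on every call.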
-
- 01 May, 2009 2 commits
-
-
Committed by Theodore Ts'o
The s_dirt flag wasn't completely handled correctly, but it didn't really matter when journalling was enabled. It turns out that when ext4 runs without a journal, we don't clear s_dirt in places where we should have, with the result that the high-level write_super() function was writing the superblock when it wasn't necessary.

So we fix this by making ext4_commit_super() clear the s_dirt flag, and removing many of the other places where s_dirt is manipulated. When journalling is enabled, the s_dirt flag might be left set more often, but s_dirt really doesn't matter when journalling is enabled.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Theodore Ts'o
The ext4_commit_super() function took both a struct super_block * and a struct ext4_super_block *, but the struct ext4_super_block can be derived from the struct super_block.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 25 April, 2009 1 commit
-
-
Committed by Theodore Ts'o
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
- 28 April, 2009 1 commit
-
-
Committed by Theodore Ts'o
For very large filesystems, the s_flex_groups array can get quite big. For example, a filesystem that can be resized up to 16TB will have 8192 flex groups (assuming the default flex_bg size of 16), so the array is 96k, which is *very* marginal for kmalloc(). On the other hand, a 160GB filesystem without the resize_inode feature will only require 960 bytes. So we try to allocate the array first using kmalloc(), and if that fails, we'll try to use vmalloc() instead.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
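
The two sizes quoted in this message can be reproduced with a little arithmetic. The sketch below rests on assumptions the message does not state: 4 KiB blocks, 32768 blocks per block group, and 12 bytes per array entry (the last inferred from 96k / 8192). The function and macro names are hypothetical.

```c
#include <stdint.h>

/* Assumed geometry for this sketch (not stated in the commit message). */
#define FS_BLOCK_BYTES    4096ULL
#define BLOCKS_PER_GROUP  32768ULL
#define FLEX_ENTRY_BYTES  12ULL   /* inferred from 96k / 8192 entries */

/* Size in bytes of a flex-group array covering fs_bytes of storage. */
static uint64_t flex_array_bytes(uint64_t fs_bytes, uint64_t flex_bg_size)
{
    uint64_t group_bytes = FS_BLOCK_BYTES * BLOCKS_PER_GROUP;      /* 128 MiB */
    uint64_t groups = (fs_bytes + group_bytes - 1) / group_bytes;  /* round up */
    uint64_t flex_groups = (groups + flex_bg_size - 1) / flex_bg_size;

    return flex_groups * FLEX_ENTRY_BYTES;
}
```

With these assumptions, `flex_array_bytes(16ULL << 40, 16)` gives 98304 bytes (96k, from 8192 flex groups) and `flex_array_bytes(160ULL << 30, 16)` gives 960 bytes (80 flex groups), matching the figures in the message.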
-
- 13 May, 2009 2 commits
-
-
Committed by Aneesh Kumar K.V
Setting BH_Unwritten buffer_heads as BH_Mapped avoids multiple (unnecessary) calls to get_block() during the call to the write(2) system call. Setting BH_Unwritten buffer_heads as BH_Mapped requires that the writepages() functions can handle BH_Unwritten buffer_heads.

After this commit, things work as follows:

ext4_ext_get_block() returns an unmapped, unwritten buffer_head when called with create = 0 for prealloc space. This makes sure we handle the read path and the non-delayed allocation case correctly. Even though the buffer_head is marked unmapped, we have valid b_blocknr and b_bdev values in the buffer_head.

ext4_da_get_block_prep(), called for block reservation, will now return a mapped, unwritten, new buffer_head for prealloc space. This avoids multiple calls to get_block() for writes to the same offset. By marking such buffers as BH_New, we also ensure that sub-block zeroing of buffered writes happens correctly.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Aneesh Kumar K.V
The BH_Delay and BH_Unwritten flags should never leak out to submit_bh(). So add some BUG_ON() checks to submit_bh() so we can get a stack trace and determine how and why this might have happened. (Note that only XFS and ext4 use these buffer head flags, and XFS does not use submit_bh(). So this patch should only modify behavior for ext4.)

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org
-
- 14 May, 2009 1 commit
-
-
Committed by Aneesh Kumar K.V
These struct buffer_heads are allocated on the stack (and hence are initialized with stack garbage). They are only used to call a get_blocks() function, so that's mostly OK, but b_state must be initialized to 0 so we don't have any unexpected BH_* flags set by accident, such as BH_Unwritten or BH_Delay.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
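
The hazard described here, stale flag bits in an uninitialized stack structure, can be sketched in userspace. Everything below is hypothetical: `struct bh_sketch` is a simplified stand-in for struct buffer_head, and the `SK_BH_*` bit values are made up for illustration.

```c
#include <string.h>

/* Hypothetical, simplified stand-in for struct buffer_head. */
struct bh_sketch {
    unsigned long b_state;  /* bit flags, e.g. delayed / unwritten */
    unsigned long b_size;
};

/* Made-up bit values; the real BH_* flags live in the kernel headers. */
#define SK_BH_DELAY      (1UL << 0)
#define SK_BH_UNWRITTEN  (1UL << 1)

/* Simulate "stack garbage" by filling the structure with 0xff bytes. */
static void fill_with_garbage(struct bh_sketch *bh)
{
    memset(bh, 0xff, sizeof(*bh));
}

/* Initialize only what the callee needs, and always clear b_state. */
static void init_stack_bh(struct bh_sketch *bh, unsigned long size)
{
    bh->b_state = 0;   /* no stale flag bits survive */
    bh->b_size = size;
}

/* True if either flag is set without anyone having set it on purpose. */
static int has_stale_flags(const struct bh_sketch *bh)
{
    return (bh->b_state & (SK_BH_DELAY | SK_BH_UNWRITTEN)) != 0;
}
```

Without the explicit `b_state = 0`, a callee that tests the flags would act on whatever bits happened to be on the stack.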
-
- 02 June, 2009 3 commits
-
-
Committed by Felix Blyakher
It's possible to recurse into the filesystem from the memory allocation, which deadlocks in xfs_qm_shake(). Add a check for __GFP_FS, and bail out if it is not set.

Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Hedi Berriche <hedi@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
-
Committed by Eric Sandeen
In the case where growing a filesystem would leave the last AG too small, the fixup code has an overflow in the calculation of the new size with one fewer AG, because "nagcount" is a 32-bit number. If the new filesystem has > 2^32 blocks in it, this causes a problem resulting in an EINVAL return from growfs:

    # xfs_io -f -c "truncate 19998630180864" fsfile
    # mkfs.xfs -f -bsize=4096 -dagsize=76288719b,size=3905982455b fsfile
    # mount -o loop fsfile /mnt
    # xfs_growfs /mnt
    meta-data=/dev/loop0     isize=256    agcount=52, agsize=76288719 blks
             =               sectsz=512   attr=2
    data     =               bsize=4096   blocks=3905982455, imaxpct=5
             =               sunit=0      swidth=0 blks
    naming   =version 2      bsize=4096   ascii-ci=0
    log      =internal       bsize=4096   blocks=32768, version=2
             =               sectsz=512   sunit=0 blks, lazy-count=0
    realtime =none           extsz=4096   blocks=0, rtextents=0
    xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Invalid argument

Reported-by: richard.ems@cape-horn-eng.com
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
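
The wraparound can be reproduced with the numbers from this report. Assuming 4096-byte blocks, the 19998630180864-byte file holds 4882478071 whole blocks, which is 64 full AGs of 76288719 blocks plus a 55-block remainder; dropping that undersized last AG leaves 64 AGs. The function names below are hypothetical, and the widening cast is one common way to avoid the overflow, not necessarily the exact change made in the patch.

```c
#include <stdint.h>

/* Buggy form: both operands are 32-bit, so the product wraps modulo 2^32
 * before it is widened for the return value. */
static uint64_t new_blocks_32bit(uint32_t nagcount, uint32_t agsize)
{
    return nagcount * agsize;
}

/* Widening one operand first keeps the full 64-bit product. */
static uint64_t new_blocks_64bit(uint32_t nagcount, uint32_t agsize)
{
    return (uint64_t)nagcount * agsize;
}
```

For 64 AGs of 76288719 blocks, the 32-bit form yields 587510720, while the widened form yields 4882478016. The wrapped value is smaller than the current 3905982455 blocks, so the request looks like a shrink, which is consistent with the EINVAL shown in the report.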
-
Committed by Felix Blyakher
Regression from commit ef8f7fc5, which rearranged the code in xfs_swap_extents(), leading to a double unlock of the xfs inode ilock. That resulted in xfs_fsr deadlocking itself on platforms that don't handle a double unlock of an rw_semaphore nicely. It caused the count to go negative, which represents a write holder, without there really being one. ia64 is one of the platforms where the deadlock was easily reproduced and the fix was tested.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
-
- 30 May, 2009 1 commit
-
-
Committed by Ryusuke Konishi
nilfs_cpfile_delete_checkpoints() wrongly skips brelse() for the header block of the checkpoint file in case of errors. This fixes the leak bug.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
- 29 May, 2009 4 commits
-
-
Committed by Oskar Schirmer
The flat loader uses an architecture's flat_stack_align() to align the stack but assumes word-alignment is enough for the data sections. However, on the Xtensa S6000 we have registers up to 128 bits wide which can be used from userspace and therefore need userspace stack and data-section alignment of at least this size.

This patch drops flat_stack_align() and uses the same alignment that is required for slab caches, ARCH_SLAB_MINALIGN, or wordsize if it's not defined by the architecture. It also fixes m32r, which was obviously kaput, aligning an uninitialized stack entry instead of the stack pointer.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Oskar Schirmer <os@emlix.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Johannes Weiner <jw@emlix.com>
Acked-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
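
The alignment rule described in this message can be sketched with ordinary power-of-two masking. This is an illustration only: `FLAT_MIN_ALIGN`, `align_down()` and `align_up()` are hypothetical names, and the fallback to the size of `unsigned long` is an assumption standing in for "wordsize".

```c
#include <stdint.h>

/* Use the slab minimum alignment if the architecture defines one,
 * otherwise fall back to the machine word size (assumed here to be
 * sizeof(unsigned long)). */
#ifdef ARCH_SLAB_MINALIGN
#define FLAT_MIN_ALIGN ((uintptr_t)ARCH_SLAB_MINALIGN)
#else
#define FLAT_MIN_ALIGN ((uintptr_t)sizeof(unsigned long))
#endif

/* Round down to a power-of-two alignment (for a downward-growing stack). */
static uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    return addr & ~(align - 1);
}

/* Round up to a power-of-two alignment (for the start of a data section). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

With a 16-byte (128-bit) requirement, `align_down(0x1007, 16)` is `0x1000` and `align_up(0x1001, 16)` is `0x1010`. Note that the value being aligned must be the stack pointer itself, which is the m32r mistake the message mentions.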
-
Committed by KOSAKI Motohiro
proc_pident_instantiate() has the following call flow:

    proc_pident_lookup()
      proc_pident_instantiate()
        proc_pid_make_inode()

And proc_pident_lookup() has the following error handling:

    const struct pid_entry *p, *last;
    error = ERR_PTR(-ENOENT);
    if (!task)
        goto out_no_task;

So proc_pident_instantiate() should return ENOENT too when a race against exit(2) occurs.

EINVAL is bad for two reasons:
- it implies the caller is wrong, but the race isn't the caller's mistake.
- man 2 open doesn't explain EINVAL, so users often don't handle it.

Note: other proc_pid_make_inode() callers already use ENOENT properly.

Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Joakim Tjernlund
Erase errors such as "Newly-erased block contained word 0xa4ef223e at offset 0x0296a014", and failure to write the clean marker, move the offending erase block to the erasing list before calling jffs2_erase_failed(). This is bad, as jffs2_erase_failed() will also move the block to the bad_list, but it is now moving the wrong block, causing FS corruption.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
-
Committed by Andrew Morton
We don't need a kernel thread per CPU for this application.

Acked-by: Alex Chiang <achiang@hp.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-
- 28 May, 2009 3 commits
-
-
Committed by Wei Yongjun
Commit 'Short write in nfsd becomes a full write to the client' (31dec253) broke sync writes. To reproduce, run the following commands:

    $ mount -t nfs -o sync 192.168.0.21:/nfsroot /mnt
    $ cd /mnt
    $ echo aaaa > temp.txt

The NFS client then hangs.

In SYNC mode the server always returns a write count of 0 to the client. This is because the value of host_err in nfsd_vfs_write() is overwritten in SYNC mode by 'host_err=nfsd_sync(file);', and we then return host_err (which is now 0) as the write count.

This patch fixes the problem.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
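
The bug pattern described here, one variable doubling as byte count and error code, can be shown in isolation. The sketch below is a simplification with hypothetical names: `written` stands in for the result of the write call (bytes written or a negative errno) and `sync_ret` for the result of the sync call (0 or a negative errno). It illustrates the pattern and one way to avoid it, not the actual nfsd patch.

```c
/* Buggy pattern: the sync result overwrites the byte count, so a
 * successful synchronous write reports 0 bytes written. */
static long write_result_buggy(long written, int sync_ret)
{
    long host_err = written;

    if (host_err >= 0)
        host_err = sync_ret;  /* byte count is lost here */
    return host_err;
}

/* One way to avoid it: remember the count separately and only let a
 * failing sync replace it. */
static long write_result_fixed(long written, int sync_ret)
{
    long host_err = written;
    long cnt = written;

    if (host_err >= 0) {
        host_err = sync_ret;
        if (host_err >= 0)
            host_err = cnt;   /* sync succeeded: report the byte count */
    }
    return host_err;
}
```

For a 5-byte write followed by a successful sync, the buggy form returns 0 and the fixed form returns 5; both still propagate a negative error from either step.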
-
Committed by David Howells
Fix up renamed filenames in comments in fs/cachefiles/internal.h. Originally, the files were all called cf-xxx.c, but they got renamed to just xxx.c.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by David Howells
Fix up renamed filenames in comments in fs/fscache/internal.h. Originally, the files were all called fsc-xxx.c, but they got renamed to just xxx.c.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 27 May, 2009 2 commits
-
-
Committed by Trond Myklebust
If the asynchronous lease renewal fails (usually due to a soft timeout), then we _must_ schedule state recovery in order to ensure that we don't lose the lease unnecessarily or, if the lease is already lost, that we recover the locking state promptly...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
Committed by Sam Ravnborg
Fix a build error with the latest kbuild adjustments to initconst. Commit a447c093 ("vfs: Use const for kernel parser table") changed:

    static match_table_t __initdata tokens = {

to

    static match_table_t __initconst tokens = {

But the missing const causes powerpc to fail with the latest updates to __initconst, like this:

    fs/nfs/nfsroot.c:400: error: __setup_str_nfs_root_setup causes a section type conflict
    fs/nfs/nfsroot.c:400: error: __setup_str_nfs_root_setup causes a section type conflict

The bug is only present with kbuild-next. The following patch has been build-tested.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 24 May, 2009 1 commit
-
-
Committed by Steve French
Small change (mostly formatting) to limit lookup-based open calls to file create only.

After discussion yesterday on samba-technical about the posix lookup regression, and looking at a problem with cifs posix open to one particular Samba version, Jeff and JRA realized that Samba server's behavior changed in this area (posix open behavior on files vs. directories). To make this behavior consistent, JRA just made a fix to Samba server to alter how it handles open of directories (now returning the equivalent of EISDIR instead of success). Since we don't know at lookup time whether the inode is a directory or file (and thus whether posix open will succeed with most current Samba servers), this change avoids the posix open code on lookup open (it just issues posix open on creates). This gets the semantic benefits we want (atomicity, posix byte range locks, improved write semantics on newly created files), file create is still fast, we avoid the problem that Jeff noticed yesterday with "openat" (and some open directory calls) of non-cached directories to one version of Samba server, and it will work with future Samba versions (which include the fix JRA just pushed into Samba server). I confirmed this approach with JRA yesterday and with Shirish today.

Posix open is only called (at lookup time) for file create now. For opens (rather than creates), because we do not know if it is a file or directory yet, and current Samba no longer allows us to do posix open on dirs, we could end up wasting an open call on what turns out to be a dir. For file opens, we wait to call posix open till cifs_open. It could be added here (lookup) in the future, but the performance tradeoff of the extra network request when EISDIR or EACCES is returned would have to be weighed against the 50% reduction in network traffic in the other paths.

Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Tested-by: Jeff Layton <jlayton@redhat.com>
CC: Jeremy Allison <jra@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>
-
- 22 May, 2009 2 commits
-
-
Committed by Ryusuke Konishi
This fixes a new memory leak problem in garbage collection. The problem was introduced by the bugfix patch ("nilfs2: fix lock order reversal in nilfs_clean_segments ioctl"). Thanks to Kentaro Suzuki for finding this problem.

Reported-by: Kentaro Suzuki <k_suzuki@ms.sylc.co.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
-
Committed by Steve French
Posix open code was not properly adding the file to the list of open files. Fix allocating cifsFileInfo more than once, and adding twice to flist and tlist. Also fix mode setting to be done in one place in these paths.

Signed-off-by: Steve French <sfrench@us.ibm.com>
Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Tested-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Luca Tettamanti <kronos.it@gmail.com>
-
- 19 May, 2009 2 commits
-
-
Committed by Jeff Layton
This is the third respin of the patch posted yesterday to fix the error handling in cifs_follow_symlink. It also includes a fix for a bogus NULL pointer check in CIFSSMBQueryUnixSymLink that Jeff Moyer spotted.

It's possible for CIFSSMBQueryUnixSymLink to return without setting target_path to a valid pointer. If that happens, then the current value to which we're initializing this pointer could cause an oops when it's kfree'd.

This patch is a little more comprehensive than the last patches. It reorganizes cifs_follow_link a bit for (hopefully) better readability. It should also eliminate the unneeded allocation of full_path on servers without unix extensions (assuming they can get to this point anyway, of which I'm not convinced).

On a side note, I'm not sure I agree with the logic of enabling this query even when unix extensions are disabled on the client. It seems like that should disable this as well. But changing that is outside the scope of this fix, so I've left it alone for now.

Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Christoph Hellwig <hch@inraded.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
-
Committed by Frank Filz
The problem is that permission checking is skipped if atomic open is possible, but when exec opens a file, it just opens it O_RDONLY, which means EXEC permission will not be checked at that time.

This problem is observed by the following sequence (executed as root):

    mount -t nfs4 server:/ /mnt4
    echo "ls" >/mnt4/foo
    chmod 744 /mnt4/foo
    su guest -c "mnt4/foo"

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Tested-by: Eugene Teo <eugeneteo@kernel.sg>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 18 May, 2009 3 commits
-
-
Committed by Jeff Mahoney
This adds CONFIG_REISERFS_FS_XATTR protection from reiserfs_permission. This is needed to avoid warnings during file deletions and chowns with xattrs disabled.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Jeff Mahoney
This avoids an Oops in open_xa_root that can occur when deleting a file with xattrs disabled. It assumes that the xattr root will be there, and that is not guaranteed.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Jeff Mahoney
With xattr cleanup even with xattrs disabled, much of the initial setup is still performed. Some #ifdefs are just not needed since the options they protect wouldn't be available anyway. This cleans those up.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 15 May, 2009 9 commits
-
-
Committed by Sukadev Bhattiprolu
devpts_get_sb() calls memset(0) to clear mount options and calls parse_mount_options() if the user specified any mount options. The memset(0) is bogus since the 'mode' and 'ptmxmode' options are non-zero by default. parse_mount_options() restores options to default anyway and can properly deal with NULL mount options.

So in devpts_get_sb(), remove memset(0) and call parse_mount_options() even for NULL mount options.

Bug reported by Eric Paris: http://lkml.org/lkml/2009/5/7/448.

Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Reported-by: Eric Paris <eparis@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Reviewed-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Theodore Ts'o
If two CPUs call ext4_ext_get_blocks() at the same time, there is nothing protecting the i_cached_extent structure from being used and updated at the same time. This could potentially cause the wrong location on disk to be read or written to, including potentially causing the corruption of the block group descriptors and/or inode table.

This bug has been in the ext4 code since almost the very beginning of ext4's development. Fortunately, once the data is stored in the page cache, ext4_get_blocks() doesn't need to be called, so trying to replicate this problem to the point where we could identify its root cause was *extremely* difficult. Many thanks to Kevin Shanahan for working over several months to be able to reproduce this easily so we could finally nail down the cause of the corruption.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
-
Committed by Aneesh Kumar K.V
The BH_Unwritten flag indicates that the buffer is allocated on disk but has not been written; that is, the disk block was part of a persistent preallocation area. That flag should only be set when a get_blocks() function is looking up an inode's logical to physical block mapping.

When ext4_get_blocks_wrap() is called with create=1, the uninitialized extent is converted into an initialized one, so the BH_Unwritten flag is no longer appropriate. Hence, we need to make sure BH_Unwritten is not left set, since the combination of BH_Mapped and BH_Unwritten is not allowed; among other things, it will cause ext4's get_block() to be called over and over again during the write_begin phase of write(2).

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
-
Committed by Sankar P
Signed-off-by: Sankar P <sankar.curiosity@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Committed by Sage Weil
The notreelog and flushoncommit mount options were being printed slightly differently.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Committed by Li Hong
In Li Zefan's commit dae7b665, a combined call of kmalloc() and copy_from_user() is replaced by memdup_user(). So btrfs_ioctl_resize() doesn't use GFP_NOFS any more.

Signed-off-by: Li Hong <lihong.hi@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Committed by Chris Mason
These debugging WARN_ONs make too much console noise during regular IO failures. An IO failure will still generate a number of messages as we verify checksums etc., but these two are not needed.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Committed by Chris Mason
When a btrfs metadata read fails, the first thing we try to do is find a good copy on another mirror of the block. If this fails, read_tree_block() ends up returning a buffer that isn't up to date.

The btrfs btree reading code was reworked to drop locks and repeat the search when IO was done, but the changes didn't add a check for failed reads. The end result was looping forever on buffers that were never going to become up to date.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Committed by Chris Mason
This flag is used to decide when we need to send a given file through the ordered code to make sure it is fully written before a transaction commits. It was not being properly set to zero when the inode was being set up.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
- 14 May, 2009 1 commit
-
-
Committed by Jeff Layton
cifs_strndup_from_ucs returns NULL on error, not an ERR_PTR.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
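
Why the distinction matters: an error-pointer test does not catch NULL, so a caller using the wrong check treats a failed call as success. The sketch below is a userspace re-creation of the kernel's error-pointer convention for illustration; the `sk_` names are hypothetical, and 4095 is used as the maximum errno value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer near the top of the
 * address space. */
static void *sk_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True only for pointers in the last SK_MAX_ERRNO bytes of the address
 * space; NULL is NOT an error pointer under this test. */
static bool sk_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SK_MAX_ERRNO;
}
```

Under this convention `sk_is_err(NULL)` is false, so a function that reports failure by returning NULL has to be checked with a plain NULL test rather than an error-pointer test.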
-