- 12 4月, 2015 3 次提交
-
-
由 Theodore Ts'o 提交于
Enforce the following inheritance policy: 1) An unencrypted directory may contain encrypted or unencrypted files or directories. 2) All files or directories in a directory must be protected using the same key as their containing directory. As a result, assuming the following setup: mke2fs -t ext4 -Fq -O encrypt /dev/vdc mount -t ext4 /dev/vdc /vdc mkdir /vdc/a /vdc/b /vdc/c echo foo | e4crypt add_key /vdc/a echo bar | e4crypt add_key /vdc/b for i in a b c ; do cp /etc/motd /vdc/$i/motd-$i ; done Then we will see the following results: cd /vdc mv a b # will fail; /vdc/a and /vdc/b have different keys mv b/motd-b a # will fail, see above ln a/motd-a b # will fail, see above mv c a # will fail; all inodes in an encrypted directory # must be encrypted ln c/motd-c b # will fail, see above mv a/motd-a c # will succeed mv c/motd-a a # will succeed Signed-off-by: NMichael Halcrow <mhalcrow@google.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Michael Halcrow 提交于
Signed-off-by: NMichael Halcrow <mhalcrow@google.com> Signed-off-by: NIldar Muslukhov <muslukhovi@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Michael Halcrow 提交于
On encrypt, we will re-assign the buffer_heads to point to a bounce page rather than the control_page (which is the original page to write that contains the plaintext). The block I/O occurs against the bounce page. On write completion, we re-assign the buffer_heads to the original plaintext page. On decrypt, we will attach a read completion callback to the bio struct. This read completion will decrypt the read contents in-place prior to setting the page up-to-date. The current encryption mode, AES-256-XTS, lacks cryptographic integrity. AES-256-GCM is in-plan, but we will need to devise a mechanism for handling the integrity data. Signed-off-by: NMichael Halcrow <mhalcrow@google.com> Signed-off-by: NIldar Muslukhov <ildarm@google.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 11 4月, 2015 5 次提交
-
-
由 Michael Halcrow 提交于
Signed-off-by: NMichael Halcrow <mhalcrow@google.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NIldar Muslukhov <muslukhovi@gmail.com>
-
由 Michael Halcrow 提交于
Signed-off-by: NMichael Halcrow <mhalcrow@google.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Michael Halcrow 提交于
Required for future encryption xattr changes. Signed-off-by: NMichael Halcrow <mhalcrow@google.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: NMichael Halcrow <mhalcrow@google.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 08 4月, 2015 1 次提交
-
-
由 Theodore Ts'o 提交于
This takes code from fs/mpage.c and optimizes it for ext4. Its primary reason is to allow us to more easily add encryption to ext4's read path in an efficient manner. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 03 4月, 2015 11 次提交
-
-
由 Lukas Czerner 提交于
Previously commit 14ece102 added a support for for syncing parent directory of newly created inodes to make sure that the inode is not lost after a power failure in no-journal mode. However this does not work in majority of cases, namely: - if the directory has inline data - if the directory is already indexed - if the directory already has at least one block and: - the new entry fits into it - or we've successfully converted it to indexed So in those cases we might lose the inode entirely even after fsync in the no-journal mode. This also includes ext2 default mode obviously. I've noticed this while running xfstest generic/321 and even though the test should fail (we need to run fsck after a crash in no-journal mode) I could not find a newly created entries even when if it was fsynced before. Fix this by adjusting the ext4_add_entry() successful exit paths to set the inode EXT4_STATE_NEWENTRY so that fsync has the chance to fsync the parent directory as well. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Frank Mayhar <fmayhar@google.com> Cc: stable@vger.kernel.org
-
由 Eric Whitney 提交于
When xfstests' auto group is run on a bigalloc filesystem with a 4.0-rc3 kernel, e2fsck failures and kernel warnings occur for some tests. e2fsck reports incorrect iblocks values, and the warnings indicate that the space reserved for delayed allocation is being overdrawn at allocation time. Some of these errors occur because the reserved space is incorrectly decreased by one cluster when ext4_ext_map_blocks satisfies an allocation request by mapping an unused portion of a previously allocated cluster. Because a cluster's worth of reserved space was already released when it was first allocated, it should not be released again. This patch appears to correct the e2fsck failure reported for generic/232 and the kernel warnings produced by ext4/001, generic/009, and generic/033. Failures and warnings for some other tests remain to be addressed. Signed-off-by: NEric Whitney <enwlinux@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Eric Whitney 提交于
In ext4_zero_range(), removing a file's entire block range from the extent status tree removes all records of that file's delalloc extents. The delalloc accounting code uses this information, and its loss can then lead to accounting errors and kernel warnings at writeback time and subsequent file system damage. This is most noticeable on bigalloc file systems where code in ext4_ext_map_blocks() handles cases where delalloc extents share clusters with a newly allocated extent. Because we're not deleting a block range and are correctly updating the status of its associated extent, there is no need to remove anything from the extent status tree. When this patch is combined with an unrelated bug fix for ext4_zero_range(), kernel warnings and e2fsck errors reported during xfstests runs on bigalloc filesystems are greatly reduced without introducing regressions on other xfstests-bld test scenarios. Signed-off-by: NEric Whitney <enwlinux@gmail.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Lukas Czerner 提交于
Currently there is a bug in zero range code which causes zero range calls to only allocate block aligned portion of the range, while ignoring the rest in some cases. In some cases, namely if the end of the range is past i_size, we do attempt to preallocate the last nonaligned block. However this might cause kernel to BUG() in some carefully designed zero range requests on setups where page size > block size. Fix this problem by first preallocating the entire range, including the nonaligned edges and converting the written extents to unwritten in the next step. This approach will also give us the advantage of having the range to be as linearly contiguous as possible. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Maurizio Lombardi 提交于
This is a leftover of commit 71d4f7d0Signed-off-by: NMaurizio Lombardi <mlombard@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NLukas Czerner <lczerner@redhat.com>
-
由 Christoph Hellwig 提交于
bdi->dev now never goes away, so this function became useless. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Wei Yuan 提交于
In this if statement, the previous condition is useless, the later one has covered it. Signed-off-by: NWeiyuan <weiyuan.wei@huawei.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NLukas Czerner <lczerner@redhat.com>
-
由 Sheng Yong 提交于
Remove unused header files and header files which are included in ext4.h. Signed-off-by: NSheng Yong <shengyong1@huawei.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Xiaoguang Wang 提交于
Since commit a9b82415, we are allowed to merge unwritten extents, so here these comments are wrong, remove it. Signed-off-by: NXiaoguang Wang <wangxg.fnst@cn.fujitsu.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Rasmus Villemoes 提交于
According to C99, %*.s means the same as %*.0s, in other words, print as many spaces as the field width argument says and effectively ignore the string argument. That is certainly not what was meant here. The kernel's printf implementation, however, treats it as if the . was not there, i.e. as %*s. I don't know if de->name is nul-terminated or not, but in any case I'm guessing the intention was to use de->name_len as precision instead of field width. [ Note: this is debugging code which is commented out, so this is not security issue; a developer would have to explicitly enable INLINE_DIR_DEBUG before this would be an issue. ] Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Konstantin Khlebnikov 提交于
Release references to buffer-heads if ext4_journal_start() fails. Fixes: 5b61de75 ("ext4: start handle at least possible moment when renaming files") Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
- 18 3月, 2015 2 次提交
-
-
由 Theodore Ts'o 提交于
Add a tuning knob so we can adjust the dirtytime expiration timeout, which is very useful for testing lazytime. Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Theodore Ts'o 提交于
Jan Kara pointed out that if there is an inode which is constantly getting dirtied with I_DIRTY_PAGES, an inode with an updated timestamp will never be written since inode->dirtied_when is constantly getting updated. We fix this by adding an extra field to the inode, dirtied_time_when, so inodes with a stale dirtytime can get detected and handled. In addition, if we have a dirtytime inode caused by an atime update, and there is no write activity on the file system, we need to have a secondary system to make sure these inodes get written out. We do this by setting up a second delayed work structure which wakes up the CPU much more rarely compared to writeback_expire_centisecs. Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NJan Kara <jack@suse.cz>
-
- 01 3月, 2015 1 次提交
-
-
由 Ryusuke Konishi 提交于
Each inode of nilfs2 stores a root node of a b-tree, and it turned out to have a memory overrun issue: Each b-tree node of nilfs2 stores a set of key-value pairs and the number of them (in "bn_nchildren" member of nilfs_btree_node struct), as well as a few other "bn_*" members. Since the value of "bn_nchildren" is used for operations on the key-values within the b-tree node, it can cause memory access overrun if a large number is incorrectly set to "bn_nchildren". For instance, nilfs_btree_node_lookup() function determines the range of binary search with it, and too large "bn_nchildren" leads nilfs_btree_node_get_key() in that function to overrun. As for intermediate b-tree nodes, this is prevented by a sanity check performed when each node is read from a drive, however, no sanity check has been done for root nodes stored in inodes. This patch fixes the issue by adding missing sanity check against b-tree root nodes so that it's called when on-memory inodes are read from ifile, inode metadata file. Signed-off-by: NRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 2月, 2015 2 次提交
-
-
由 Eric Sandeen 提交于
If xfs_trans_reserve fails we don't cancel the transaction, and we'll leak the allocated transaction pointer. Spotted by Coverity. Signed-off-by: NEric Sandeen <ssandeen@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Eric Sandeen 提交于
We shouldn't get here with RENAME_EXCHANGE set and no target_ip, but let's be defensive, because xfs_cross_rename() will dereference it. Spotted by Coverity. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
- 23 2月, 2015 11 次提交
-
-
由 Dave Chinner 提交于
A new fsync vs power fail test in xfstests indicated that XFS can have unreliable data consistency when doing extending truncates that require block zeroing. The blocks beyond EOF get zeroed in memory, but we never force those changes to disk before we run the transaction that extends the file size and exposes those blocks to userspace. This can result in the blocks not being correctly zeroed after a crash. Because in-memory behaviour is correct, tools like fsx don't pick up any coherency problems - it's not until the filesystem is shutdown or the system crashes after writing the truncate transaction to the journal but before the zeroed data in the page cache is flushed that the issue is exposed. Fix this by also flushing the dirty data in memory region between the old size and new size when we've found blocks that need zeroing in the truncate process. Reported-by: NLiu Bo <bo.li.liu@oracle.com> cc: <stable@vger.kernel.org> Signed-off-by: NDave Chinner <dchinner@redhat.com> Reviewed-by: NBrian Foster <bfoster@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Jan Kara 提交于
For filesystems without separate project quota inode field in the superblock we just reuse project quota file for group quotas (and vice versa) if project quota file is allocated and we need group quota file. When we reuse the file, quota structures on disk suddenly have wrong type stored in d_flags though. Nobody really cares about this (although structure type reported to userspace was wrong as well) except that after commit 14bf61ff (quota: Switch ->get_dqblk() and ->set_dqblk() to use bytes as space units) assertion in xfs_qm_scall_getquota() started to trigger on xfs/106 test (apparently I was testing without XFS_DEBUG so I didn't notice when submitting the above commit). Fix the problem by properly resetting ddq->d_flags when running quotacheck for a quota file. CC: stable@vger.kernel.org Reported-by: NAl Viro <viro@ZenIV.linux.org.uk> Signed-off-by: NJan Kara <jack@suse.cz> Reviewed-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDave Chinner <david@fromorbit.com>
-
由 Al Viro 提交于
X-Coverup: just ask spender Cc: stable@vger.kernel.org Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
use_pde()/unuse_pde() in ->follow_link()/->put_link() resp. Cc: stable@vger.kernel.org Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
As it is, we have debugfs_remove() racing with symlink traversals. Supply ->evict_inode() and do freeing there - inode will remain pinned until we are done with the symlink body. And rip the idiocy with checking if dentry is positive right after we'd verified debugfs_positive(), which is a stronger check... Cc: stable@vger.kernel.org Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Konstantin Khlebnikov 提交于
I've noticed significant locking contention in memory reclaimer around sb_lock inside grab_super_passive(). Grab_super_passive() is called from two places: in icache/dcache shrinkers (function super_cache_scan) and from writeback (function __writeback_inodes_wb). Both are required for progress in memory allocator. Grab_super_passive() acquires sb_lock to increment sb->s_count and check sb->s_instances. It seems sb->s_umount locked for read is enough here: super-block deactivation always runs under sb->s_umount locked for write. Protecting super-block itself isn't a problem: in super_cache_scan() sb is protected by shrinker_rwsem: it cannot be freed if its slab shrinkers are still active. Inside writeback super-block comes from inode from bdi writeback list under wb->list_lock. This patch removes locking sb_lock and checks s_instances under s_umount: generic_shutdown_super() unlinks it under sb->s_umount locked for write. New variant is called trylock_super() and since it only locks semaphore, callers must call up_read(&sb->s_umount) instead of drop_super(sb) when they're done. Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 David Howells 提交于
Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup() rather than d_is_dir() when checking a dir watch and give an error on fake directories. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 David Howells 提交于
Fix up the following scripted S_ISDIR/S_ISREG/S_ISLNK conversions (or lack thereof) in cachefiles: (1) Cachefiles mostly wants to use d_can_lookup() rather than d_is_dir() as it doesn't want to deal with automounts in its cache. (2) Coccinelle didn't find S_IS* expressions in ASSERT() statements in cachefiles. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 David Howells 提交于
Convert the following where appropriate: (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry). (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry). (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more complicated than it appears as some calls should be converted to d_can_lookup() instead. The difference is whether the directory in question is a real dir with a ->lookup op or whether it's a fake dir with a ->d_automount op. In some circumstances, we can subsume checks for dentry->d_inode not being NULL into this, provided we the code isn't in a filesystem that expects d_inode to be NULL if the dirent really *is* negative (ie. if we're going to use d_inode() rather than d_backing_inode() to get the inode pointer). Note that the dentry type field may be set to something other than DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS manages the fall-through from a negative dentry to a lower layer. In such a case, the dentry type of the negative union dentry is set to the same as the type of the lower dentry. However, if you know d_inode is not NULL at the call site, then you can use the d_is_xxx() functions even in a filesystem. There is one further complication: a 0,0 chardev dentry may be labelled DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was intended for special directory entry types that don't have attached inodes. The following perl+coccinelle script was used: use strict; my @callers; open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') || die "Can't grep for S_ISDIR and co. callers"; @callers = <$fd>; close($fd); unless (@callers) { print "No matches\n"; exit(0); } my @cocci = ( '@@', 'expression E;', '@@', '', '- S_ISLNK(E->d_inode->i_mode)', '+ d_is_symlink(E)', '', '@@', 'expression E;', '@@', '', '- S_ISDIR(E->d_inode->i_mode)', '+ d_is_dir(E)', '', '@@', 'expression E;', '@@', '', '- S_ISREG(E->d_inode->i_mode)', '+ d_is_reg(E)' ); my $coccifile = "tmp.sp.cocci"; open($fd, ">$coccifile") || die $coccifile; print($fd "$_\n") || die $coccifile foreach (@cocci); close($fd); foreach my $file (@callers) { chomp $file; print "Processing ", $file, "\n"; system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 || die "spatch failed"; } [AV: overlayfs parts skipped] Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 David Howells 提交于
Split DCACHE_FILE_TYPE into DCACHE_REGULAR_TYPE (dentries representing regular files) and DCACHE_SPECIAL_TYPE (representing blockdev, chardev, FIFO and socket files). d_is_reg() and d_is_special() are added to detect these subtypes and d_is_file() is left as the union of the two. This allows a number of places that use S_ISREG(dentry->d_inode->i_mode) to use d_is_reg(dentry) instead. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 David Howells 提交于
Add a DCACHE_FALLTHRU flag to indicate that, in a layered filesystem, this is a virtual dentry that covers another one in a lower layer that should be used instead. This may be recorded on medium if directory integration is stored there. The flag can be set with d_set_fallthru() and tested with d_is_fallthru(). Original-author: Valerie Aurora <vaurora@redhat.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 20 2月, 2015 4 次提交
-
-
由 Chris Mason 提交于
Since commit 8e5cfb55 (Btrfs: Make raid_map array be inlined in btrfs_bio structure), the raid map array is allocated along with the btrfs bio in alloc_btrfs_bio. The calculation used to decide how much we need to allocate was using the wrong parameter passed into the allocation function. The passed in real_stripes will be zero if a target replace operation is not currently running. We want to use total_stripes instead. Signed-off-by: NChris Mason <clm@fb.com> Reported-by: NDavid Sterba <dsterba@suse.cz> Tested-by: NDavid Sterba <dsterba@suse.cz>
-
由 Omar Sandoval 提交于
get_acl gets a reference which we must release in the error cases. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NOmar Sandoval <osandov@osandov.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Rasmus Villemoes 提交于
%pD for struct file*, %pd for struct dentry*. Fixes: a455589f ("assorted conversions to %p[dD]") Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Bastien Nocera 提交于
Signed-off-by: NBastien Nocera <hadess@hadess.net> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-