- 09 3月, 2016 1 次提交
-
-
由 Jan Kara 提交于
Currently we've used hashed aio_mutex to serialize unaligned AIO DIO. However the code cleanups that happened after 2011 when the lock was introduced made aio_mutex acquired at almost the same places where we already have exclusion using i_mutex. So just use i_mutex for the exclusion of unaligned AIO DIO. The change moves waiting for pending unwritten extent conversion under i_mutex. That makes special handling of O_APPEND writes unnecessary and also avoids possible livelocking of unaligned AIO DIO with aligned one (nothing was preventing contiguous stream of aligned AIO DIOs to let unaligned AIO DIO wait forever). Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 23 2月, 2016 2 次提交
-
-
由 Andreas Gruenbacher 提交于
This variable, introduced in commit 9c191f70, is unnecessary: it is set once the module has been initialized correctly, and ext4_fill_super cannot run unless the module has been initialized correctly. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Jan Kara 提交于
The conversion is generally straightforward. The only tricky part is that xattr block corresponding to found mbcache entry can get freed before we get buffer lock for that block. So we have to check whether the entry is still valid after getting buffer lock. Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 23 1月, 2016 1 次提交
-
-
由 Al Viro 提交于
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 15 1月, 2016 1 次提交
-
-
由 Vladimir Davydov 提交于
Mark those kmem allocations that are known to be easily triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to memcg. For the list, see below: - threadinfo - task_struct - task_delay_info - pid - cred - mm_struct - vm_area_struct and vm_region (nommu) - anon_vma and anon_vma_chain - signal_struct - sighand_struct - fs_struct - files_struct - fdtable and fdtable->full_fds_bits - dentry and external_name - inode for all filesystems. This is the most tedious part, because most filesystems overwrite the alloc_inode method. The list is far from complete, so feel free to add more objects. Nevertheless, it should be close to "account everything" approach and keep most workloads within bounds. Malevolent users will be able to breach the limit, but this was possible even with the former "account everything" approach (simply because it did not account everything in fact). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 1月, 2016 2 次提交
-
-
由 Li Xi 提交于
This patch adds mount options for enabling/disabling project quota accounting and enforcement. A new specific inode is also used for project quota accounting. [ Includes fix from Dan Carpenter to crrect error checking from dqget(). ] Signed-off-by: NLi Xi <lixi@ddn.com> Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NAndreas Dilger <adilger@dilger.ca> Reviewed-by: NJan Kara <jack@suse.cz>
-
由 Li Xi 提交于
Signed-off-by: NLi Xi <lixi@ddn.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NAndreas Dilger <adilger@dilger.ca> Reviewed-by: NJan Kara <jack@suse.cz>
-
- 08 12月, 2015 2 次提交
-
-
由 Jan Kara 提交于
We have enough locks that it's probably worth documenting the lock ordering rules we have in ext4. Signed-off-by: NJan Kara <jack@suse.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Jan Kara 提交于
Currently, page faults and hole punching are completely unsynchronized. This can result in page fault faulting in a page into a range that we are punching after truncate_pagecache_range() has been called and thus we can end up with a page mapped to disk blocks that will be shortly freed. Filesystem corruption will shortly follow. Note that the same race is avoided for truncate by checking page fault offset against i_size but there isn't similar mechanism available for punching holes. Fix the problem by creating new rw semaphore i_mmap_sem in inode and grab it for writing over truncate, hole punching, and other functions removing blocks from extent tree and for read over page faults. We cannot easily use i_data_sem for this since that ranks below transaction start and we need something ranking above it so that it can be held over the whole truncate / hole punching operation. Also remove various workarounds we had in the code to reduce race window when page fault could have created pages with stale mapping information. Signed-off-by: NJan Kara <jack@suse.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 17 11月, 2015 1 次提交
-
-
由 Dan Williams 提交于
Similar to XFS warn when mounting DAX while it is still considered under development. Also, aspects of the DAX implementation, for example synchronization against multiple faults and faults causing block allocation, depend on the correct implementation in the filesystem. The maturity of a given DAX implementation is filesystem specific. Cc: <stable@vger.kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: linux-ext4@vger.kernel.org Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: NDave Chinner <david@fromorbit.com> Acked-by: NJan Kara <jack@suse.com> Signed-off-by: NDan Williams <dan.j.williams@intel.com>
-
- 07 11月, 2015 1 次提交
-
-
由 Mel Gorman 提交于
mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd __GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access one of two watermarks lower than "min" which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve". Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT leading to a situation where an optimisitic allocation with a fallback option can access atomic reserves. This patch uses __GFP_ATOMIC to identify callers that are truely atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim. This patch then converts a number of sites o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag. o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category where kswapd will still be woken but atomic reserves are not used as there is a one-entry mempool to guarantee progress. o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically now can trigger false positives. Some exceptions like dm-crypt.c exist where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations. o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM. The first key hazard to watch out for is callers that removed __GFP_WAIT and was depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH. The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases as other activity will wake kswapd. Signed-off-by: NMel Gorman <mgorman@techsingularity.net> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Vitaly Wool <vitalywool@gmail.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 10月, 2015 3 次提交
-
-
由 Dmitry Monakhov 提交于
It is appeared that we can pass journal related mount options and such options be shown in /proc/mounts Example: #mkfs.ext4 -F /dev/vdb #tune2fs -O ^has_journal /dev/vdb #mount /dev/vdb /mnt/ -ocommit=20,journal_async_commit #cat /proc/mounts | grep /mnt /dev/vdb /mnt ext4 rw,relatime,journal_checksum,journal_async_commit,commit=20,data=ordered 0 0 But options:"journal_checksum,journal_async_commit,commit=20,data=ordered" has nothing with reality because there is no journal at all. This patch disallow following options for journalless configurations: - journal_checksum - journal_async_commit - commit=%ld - data={writeback,ordered,journal} Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
-
由 Dmitry Monakhov 提交于
Currently MOPT_EXPLICIT treated as EXPLICIT_DELALLOC which may be changed in future. Let's fix it now. Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Daeho Jeong 提交于
If a EXT4 filesystem utilizes JBD2 journaling and an error occurs, the journaling will be aborted first and the error number will be recorded into JBD2 superblock and, finally, the system will enter into the panic state in "errors=panic" option. But, in the rare case, this sequence is little twisted like the below figure and it will happen that the system enters into panic state, which means the system reset in mobile environment, before completion of recording an error in the journal superblock. In this case, e2fsck cannot recognize that the filesystem failure occurred in the previous run and the corruption wouldn't be fixed. Task A Task B ext4_handle_error() -> jbd2_journal_abort() -> __journal_abort_soft() -> __jbd2_journal_abort_hard() | -> journal->j_flags |= JBD2_ABORT; | | __ext4_abort() | -> jbd2_journal_abort() | | -> __journal_abort_soft() | | -> if (journal->j_flags & JBD2_ABORT) | | return; | -> panic() | -> jbd2_journal_update_sb_errno() Tested-by: NHobin Woo <hobin.woo@samsung.com> Signed-off-by: NDaeho Jeong <daeho.jeong@samsung.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 18 10月, 2015 3 次提交
-
-
由 Darrick J. Wong 提交于
Create separate predicate functions to test/set/clear feature flags, thereby replacing the wordy old macros. Furthermore, clean out the places where we open-coded feature tests. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
-
由 Darrick J. Wong 提交于
Instead of overloading EIO for CRC errors and corrupt structures, return the same error codes that XFS returns for the same issues. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Darrick J. Wong 提交于
Allow the filesystem to store the metadata checksum seed in the superblock and add an incompat feature to say that we're using it. This enables tune2fs to change the UUID on a mounted metadata_csum FS without having to (racy!) rewrite all disk metadata. Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 24 9月, 2015 2 次提交
-
-
由 Theodore Ts'o 提交于
This allows us to refactor the procfs code, which saves a bit of compiled space. More importantly it isolates most of the procfs support code into a single file, so it's easier to #ifdef it out if the proc file system has been disabled. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Also statically allocate the ext4_kset and ext4_feat objects, since we only need exactly one of each, and it's simpler and less code if we drop the dynamic allocation and deallocation when it's not needed. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 05 9月, 2015 1 次提交
-
-
由 Kees Cook 提交于
Many file systems that implement the show_options hook fail to correctly escape their output which could lead to unescaped characters (e.g. new lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This could lead to confusion, spoofed entries (resulting in things like systemd issuing false d-bus "mount" notifications), and who knows what else. This looks like it would only be the root user stepping on themselves, but it's possible weird things could happen in containers or in other situations with delegated mount privileges. Here's an example using overlay with setuid fusermount trusting the contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use of "sudo" is something more sneaky: $ BASE="ovl" $ MNT="$BASE/mnt" $ LOW="$BASE/lower" $ UP="$BASE/upper" $ WORK="$BASE/work/ 0 0 none /proc fuse.pwn user_id=1000" $ mkdir -p "$LOW" "$UP" "$WORK" $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt $ cat /proc/mounts none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0 none /proc fuse.pwn user_id=1000 0 0 $ fusermount -u /proc $ cat /proc/mounts cat: /proc/mounts: No such file or directory This fixes the problem by adding new seq_show_option and seq_show_option_n helpers, and updating the vulnerable show_option handlers to use them as needed. Some, like SELinux, need to be open coded due to unusual existing escape mechanisms. [akpm@linux-foundation.org: add lost chunk, per Kees] [keescook@chromium.org: seq_show_option should be using const parameters] Signed-off-by: NKees Cook <keescook@chromium.org> Acked-by: NSerge Hallyn <serge.hallyn@canonical.com> Acked-by: NJan Kara <jack@suse.com> Acked-by: NPaul Moore <paul@paul-moore.com> Cc: J. R. Okajima <hooanon05g@gmail.com> Signed-off-by: NKees Cook <keescook@chromium.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 8月, 2015 2 次提交
-
-
由 Theodore Ts'o 提交于
This reverts commit 08439fec. Unfortunately we still need to test for bdi->dev to avoid a crash when a USB stick is yanked out while a file system is mounted: usb 2-2: USB disconnect, device number 2 Buffer I/O error on dev sdb1, logical block 15237120, lost sync page write JBD2: Error -5 detected when updating journal superblock for sdb1-8. BUG: unable to handle kernel paging request at 34beb000 IP: [<c136ce88>] __percpu_counter_add+0x18/0xc0 *pdpt = 0000000023db9001 *pde = 0000000000000000 Oops: 0000 [#1] SMP CPU: 0 PID: 4083 Comm: umount Tainted: G U OE 4.1.1-040101-generic #201507011435 Hardware name: LENOVO 7675CTO/7675CTO, BIOS 7NETC2WW (2.22 ) 03/22/2011 task: ebf06b50 ti: ebebc000 task.ti: ebebc000 EIP: 0060:[<c136ce88>] EFLAGS: 00010082 CPU: 0 EIP is at __percpu_counter_add+0x18/0xc0 EAX: f21c8e88 EBX: f21c8e88 ECX: 00000000 EDX: 00000001 ESI: 00000001 EDI: 00000000 EBP: ebebde60 ESP: ebebde40 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 34beb000 CR3: 33354200 CR4: 000007f0 Stack: c1abe100 edcb0098 edcb00ec ffffffff f21c8e68 ffffffff f21c8e68 f286d160 ebebde84 c1160454 00000010 00000282 f72a77f8 00000984 f72a77f8 f286d160 f286d170 ebebdea0 c11e613f 00000000 00000282 f72a77f8 edd7f4d0 00000000 Call Trace: [<c1160454>] account_page_dirtied+0x74/0x110 [<c11e613f>] __set_page_dirty+0x3f/0xb0 [<c11e6203>] mark_buffer_dirty+0x53/0xc0 [<c124a0cb>] ext4_commit_super+0x17b/0x250 [<c124ac71>] ext4_put_super+0xc1/0x320 [<c11f04ba>] ? fsnotify_unmount_inodes+0x1aa/0x1c0 [<c11cfeda>] ? evict_inodes+0xca/0xe0 [<c11b925a>] generic_shutdown_super+0x6a/0xe0 [<c10a1df0>] ? prepare_to_wait_event+0xd0/0xd0 [<c1165a50>] ? unregister_shrinker+0x40/0x50 [<c11b92f6>] kill_block_super+0x26/0x70 [<c11b94f5>] deactivate_locked_super+0x45/0x80 [<c11ba007>] deactivate_super+0x47/0x60 [<c11d2b39>] cleanup_mnt+0x39/0x80 [<c11d2bc0>] __cleanup_mnt+0x10/0x20 [<c1080b51>] task_work_run+0x91/0xd0 [<c1011e3c>] do_notify_resume+0x7c/0x90 [<c1720da5>] work_notify Code: 8b 55 e8 e9 f4 fe ff ff 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83 ec 20 89 5d f4 89 c3 89 75 f8 89 d6 89 7d fc 89 cf 8b 48 14 <64> 8b 01 89 45 ec 89 c2 8b 45 08 c1 fa 1f 01 75 ec 89 55 f0 89 EIP: [<c136ce88>] __percpu_counter_add+0x18/0xc0 SS:ESP 0068:ebebde40 CR2: 0000000034beb000 ---[ end trace dd564a7bea834ecd ]--- Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=101011Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
由 Theodore Ts'o 提交于
The xfstests ext4/305 will mount and unmount the same file system over 4,000 times, and each one of these will cause a system log message. Ratelimit this message since if we are getting more than a few dozen of these messages, they probably aren't going to be helpful. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 15 8月, 2015 1 次提交
-
-
由 Eric Sandeen 提交于
At some point along this sequence of changes: f6e63f90 ext4: fold ext4_nojournal_sops into ext4_sops bb044576 ext4: support freezing ext2 (nojournal) file systems 9ca92389 ext4: Use separate super_operations structure for no_journal filesystems ext4 started setting needs_recovery on filesystems without journals when they are unfrozen. This makes no sense, and in fact confuses blkid to the point where it doesn't recognize the filesystem at all. (freeze ext2; unfreeze ext2; run blkid; see no output; run dumpe2fs, see needs_recovery set on fs w/ no journal). To fix this, don't manipulate the INCOMPAT_RECOVER feature on filesystems without journals. Reported-by: NStu Mark <smark@datto.com> Reviewed-by: NJan Kara <jack@suse.com> Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 24 7月, 2015 1 次提交
-
-
由 Jan Kara 提交于
The functionality of ext3 is fully supported by ext4 driver. Major distributions (SUSE, RedHat) already use ext4 driver to handle ext3 filesystems for quite some time. There is some ugliness in mm resulting from jbd cleaning buffers in a dirty page without cleaning page dirty bit and also support for buffer bouncing in the block layer when stable pages are required is there only because of jbd. So let's remove the ext3 driver. This saves us some 28k lines of duplicated code. Acked-by: NTheodore Ts'o <tytso@mit.edu> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 23 7月, 2015 1 次提交
-
-
由 Daeho Jeong 提交于
When an error condition is detected, an error status should be recorded into superblocks of EXT4 or JBD2. However, the write request is submitted now without REQ_FUA flag, even in "barrier=1" mode, which is followed by panic() function in "errors=panic" mode. On mobile devices which make whole system reset as soon as kernel panic occurs, this write request containing an error flag will disappear just from storage cache without written to the physical cells. Therefore, when next start, even forever, the error flag cannot be shown in both superblocks, and e2fsck cannot fix the filesystem problems automatically, unless e2fsck is executed in force checking mode. [ Changed use test_opt(sb, BARRIER) of checking the journal flags -- TYT ] Signed-off-by: NDaeho Jeong <daeho.jeong@samsung.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 22 7月, 2015 2 次提交
-
-
由 Carlos Maiolino 提交于
There is no reason to allow ext2 filesystems be mounted with journal mount options. So, this patch adds them to the MOPT_NO_EXT2 mount options list. Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Tejun Heo 提交于
For ordered and writeback data modes, all data IOs go through ext4_io_submit. This patch adds cgroup writeback support by invoking wbc_init_bio() from io_submit_init_bio() and wbc_account_io() in io_submit_add_bh(). Journal data which is written by jbd2 worker is left alone by this patch and will always be written out from the root cgroup. ext4_fill_super() is updated to set MS_CGROUPWB when data mode is either ordered or writeback. In journaled data mode, most IOs become synchronous through the journal and enabling cgroup writeback support doesn't make much sense or difference. Journaled data mode is left alone. Lightly tested with sequential data write workload. Behaves as expected. Signed-off-by: NTejun Heo <tj@kernel.org> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 26 6月, 2015 1 次提交
-
-
由 Rasmus Villemoes 提交于
This makes a very large function a little smaller. Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 24 6月, 2015 1 次提交
-
-
由 Miklos Szeredi 提交于
Turn d_path(&file->f_path, ...); into file_path(file, ...); Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 23 6月, 2015 1 次提交
-
-
由 Theodore Ts'o 提交于
Newer versions of mount parse the lazytime feature and pass it to the mount system call via the flags field in the mount system call, removing the lazytime string from the mount options list. So we need to check for the presence of MS_LAZYTIME and set it in sb->s_flags in order for this flag to be set on a remount. Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 21 6月, 2015 2 次提交
-
-
由 Theodore Ts'o 提交于
In order to prevent quota block tracking to be inaccurate when ext4_quota_write() fails with ENOSPC, we make two changes. The quota file can now use the reserved block (since the quota file is arguably file system metadata), and ext4_quota_write() now uses ext4_should_retry_alloc() to retry the block allocation after a commit has completed and released some blocks for allocation. This fixes failures of xfstests generic/270: Quota error (device vdc): write_blk: dquota write failed Quota error (device vdc): qtree_write_dquot: Error -28 occurred while creating quota Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Normally all of the buffers will have been forced out to disk before we call invalidate_bdev(), but there will be some cases, where a file system operation was aborted due to an ext4_error(), where there may still be some dirty buffers in the buffer cache for the device. So try to force them out to memory before calling invalidate_bdev(). This fixes a warning triggered by generic/081: WARNING: CPU: 1 PID: 3473 at /usr/projects/linux/ext4/fs/block_dev.c:56 __blkdev_put+0xb5/0x16f() Signed-off-by: NTheodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 16 6月, 2015 1 次提交
-
-
由 Andreas Dilger 提交于
Several ext4_warning() messages in the directory handling code do not report the inode number of the (potentially corrupt) directory where a problem is seen, and others report this in an ad-hoc manner. Add an ext4_warning_inode() helper to print the inode number and command name consistent with ext4_error_inode(). Consolidate the place in ext4.h that these macros are defined. Clean up some other directory error and warning messages to print the calling function name. Minor code style fixes in nearby lines. Signed-off-by: NAndreas Dilger <adilger@dilger.ca> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 13 6月, 2015 1 次提交
-
-
由 Theodore Ts'o 提交于
We currently don't correctly handle the case where blocksize != pagesize, so disallow the mount in those cases. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 02 6月, 2015 1 次提交
-
-
由 Tejun Heo 提交于
With the planned cgroup writeback support, backing-dev related declarations will be more widely used across block and cgroup; unfortunately, including backing-dev.h from include/linux/blkdev.h makes cyclic include dependency quite likely. This patch separates out backing-dev-defs.h which only has the essential definitions and updates blkdev.h to include it. c files which need access to more backing-dev details now include backing-dev.h directly. This takes backing-dev.h off the common include dependency chain making it a lot easier to use it across block and cgroup. v2: fs/fat build failure fixed. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 01 6月, 2015 2 次提交
-
-
由 Chao Yu 提交于
Crypto resource should be released when ext4 module exits, otherwise it will cause memory leak. Signed-off-by: NChao Yu <chao2.yu@samsung.com> Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
As suggested by Herbert Xu, we shouldn't allocate a new tfm each time we read or write a page. Instead we can use a single tfm hanging off the inode's crypt_info structure for all of our encryption needs for that inode, since the tfm can be used by multiple crypto requests in parallel. Also use cmpxchg() to avoid races that could result in crypt_info structure getting doubly allocated or doubly freed. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 19 5月, 2015 3 次提交
-
-
由 Theodore Ts'o 提交于
The superblock fields s_file_encryption_mode and s_dir_encryption_mode are vestigal, so remove them as a cleanup. While we're at it, allow file systems with both encryption and inline_data enabled at the same time to work correctly. We can't have encrypted inodes with inline data, but there's no reason to prohibit unencrypted inodes from using the inline data feature. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
This is a pretty massive patch which does a number of different things: 1) The per-inode encryption information is now stored in an allocated data structure, ext4_crypt_info, instead of directly in the node. This reduces the size usage of an in-memory inode when it is not using encryption. 2) We drop the ext4_fname_crypto_ctx entirely, and use the per-inode encryption structure instead. This remove an unnecessary memory allocation and free for the fname_crypto_ctx as well as allowing us to reuse the ctfm in a directory for multiple lookups and file creations. 3) We also cache the inode's policy information in the ext4_crypt_info structure so we don't have to continually read it out of the extended attributes. 4) We now keep the keyring key in the inode's encryption structure instead of releasing it after we are done using it to derive the per-inode key. This allows us to test to see if the key has been revoked; if it has, we prevent the use of the derived key and free it. 5) When an inode is released (or when the derived key is freed), we will use memset_explicit() to zero out the derived key, so it's not left hanging around in memory. This implies that when a user logs out, it is important to first revoke the key, and then unlink it, and then finally, to use "echo 3 > /proc/sys/vm/drop_caches" to release any decrypted pages and dcache entries from the system caches. 6) All this, and we also shrink the number of lines of code by around 100. :-) Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-
由 Theodore Ts'o 提交于
Use struct ext4_encryption_key only for the master key passed via the kernel keyring. For internal kernel space users, we now use struct ext4_crypt_info. This will allow us to put information from the policy structure so we can cache it and avoid needing to constantly looking up the extended attribute. We will do this in a spearate patch. This patch is mostly mechnical to make it easier for patch review. Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
-