Commits · c86d8db33a922da808a5560aa15ed663a9569b37 · gsplhtlxg / clone-Linux

08 Dec, 2015 5 commits

ext4: implement allocation of pre-zeroed blocks · c86d8db3

Committed by Jan Kara on Dec 07, 2015

The DAX page fault path needs to get blocks that are pre-zeroed to avoid
races when two concurrent page faults happen in the same block of a
file. Implement support for this in ext4_map_blocks().
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4: provide ext4_issue_zeroout() · 53085fac

Committed by Jan Kara on Dec 07, 2015

Create a new function ext4_issue_zeroout() to zero out a contiguous (both
logically and physically) part of inode data. We will need to issue
zeroout when the extent structure is not readily available, and this function
will allow us to do it without making up fake extent structures.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4: get rid of EXT4_GET_BLOCKS_NO_LOCK flag · 2dcba478

Committed by Jan Kara on Dec 07, 2015

When dioread_nolock mode is enabled, we grab i_data_sem in
ext4_ext_direct_IO() and therefore we need to instruct _ext4_get_block()
not to grab i_data_sem again using EXT4_GET_BLOCKS_NO_LOCK. However
holding i_data_sem over overwrite direct IO isn't needed these days. We
have exclusion against truncate / hole punching because we increase
i_dio_count under i_mutex in ext4_ext_direct_IO() so once
ext4_file_write_iter() verifies blocks are allocated & written, they are
guaranteed to stay so during the whole direct IO even after we drop
i_mutex.

So we can just remove this locking abuse and the no longer necessary
EXT4_GET_BLOCKS_NO_LOCK flag.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4: fix races of writeback with punch hole and zero range · 01127848

Committed by Jan Kara on Dec 07, 2015

When doing delayed allocation, the update of the on-disk inode size is
postponed until IO submission time. However, hole punch or zero range
fallocate calls can end up discarding the tail page cache page, and thus
the on-disk inode size would never be properly updated.

Make sure the on-disk inode size is updated before truncating page
cache.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4: fix races between page faults and hole punching · ea3d7209

Committed by Jan Kara on Dec 07, 2015

Currently, page faults and hole punching are completely unsynchronized.
This can result in a page fault faulting a page into a range that we
are punching after truncate_pagecache_range() has been called, and thus
we can end up with a page mapped to disk blocks that will shortly be
freed. Filesystem corruption will shortly follow. Note that the same
race is avoided for truncate by checking the page fault offset against
i_size, but there is no similar mechanism available for punching holes.

Fix the problem by creating a new rw semaphore i_mmap_sem in the inode and
grabbing it for writing over truncate, hole punching, and other functions
removing blocks from the extent tree, and for reading over page faults. We
cannot easily use i_data_sem for this since that ranks below transaction
start and we need something ranking above it so that it can be held over
the whole truncate / hole punching operation. Also remove various
workarounds we had in the code to reduce the race window when a page fault
could have created pages with stale mapping information.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

25 Nov, 2015 1 commit

ext4: Fix handling of extended tv_sec · a4dad1ae

Committed by David Turner on Nov 24, 2015

In ext4, the bottom two bits of {a,c,m}time_extra are used to extend
the {a,c,m}time fields, deferring the year 2038 problem to the year
2446.

When decoding these extended fields, for times whose bottom 32 bits
would represent a negative number, sign extension causes the 64-bit
extended timestamp to be negative as well, which is not what's
intended.  This patch corrects that issue, so that the only negative
{a,c,m}times are those between 1901 and 1970 (as per 32-bit signed
timestamps).

Some older kernels might have written pre-1970 dates with 1,1 in the
extra bits.  This patch treats those incorrectly-encoded dates as
pre-1970, instead of post-2311, until kernel 4.20 is released.
Hopefully by then e2fsck will have fixed up the bad data.

Also add a comment explaining the encoding of ext4's extra {a,c,m}time
bits.
Signed-off-by: David Turner <novalis@novalis.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Mark Harris <mh8928@yahoo.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23732
Cc: stable@vger.kernel.org
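
The extended timestamp encoding described in this commit can be modeled outside the kernel. The Python sketch below is an illustration only, not the kernel code: the names `encode_time`, `decode_time`, and `legacy_compat` are invented here, and it assumes that the on-disk 32-bit field holds the low 32 bits of the seconds value while the bits of the extra field above the two epoch bits hold nanoseconds.

```python
EPOCH_BITS = 2                       # bottom two bits of the extra field
EPOCH_MASK = (1 << EPOCH_BITS) - 1


def to_signed32(value):
    """Interpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def encode_time(seconds, nanoseconds=0):
    """Split a 64-bit seconds value into (low 32 bits, extra field)."""
    low = seconds & 0xFFFFFFFF
    # The epoch bits count how many multiples of 2**32 must be added
    # to the sign-extended low word to get the original value back.
    epoch = ((seconds - to_signed32(low)) >> 32) & EPOCH_MASK
    return low, epoch | (nanoseconds << EPOCH_BITS)


def decode_time(low, extra, legacy_compat=False):
    """Rebuild (seconds, nanoseconds) from the two stored fields."""
    epoch = extra & EPOCH_MASK
    # Optional workaround for pre-1970 dates that older kernels may
    # have written with both epoch bits set.
    if legacy_compat and epoch == 3 and (low & 0x80000000):
        epoch = 0
    seconds = to_signed32(low) + (epoch << 32)
    return seconds, extra >> EPOCH_BITS
```

Under these assumptions, the first second past the 32-bit limit (`2**31`) is stored as low word `0x80000000` with epoch bits `1`, and the only negative decoded values are those whose epoch bits are zero and whose low word has its top bit set.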

19 Oct, 2015 1 commit

ext4: do not allow journal_opts for fs w/o journal · 1e381f60

Committed by Dmitry Monakhov on Oct 18, 2015

It turns out that we can pass journal-related mount options for a file
system without a journal, and such options are then shown in /proc/mounts.

Example:
#mkfs.ext4 -F /dev/vdb
#tune2fs -O ^has_journal /dev/vdb
#mount /dev/vdb /mnt/  -ocommit=20,journal_async_commit
#cat /proc/mounts  | grep /mnt
 /dev/vdb /mnt ext4 rw,relatime,journal_checksum,journal_async_commit,commit=20,data=ordered 0 0

But the options "journal_checksum,journal_async_commit,commit=20,data=ordered"
have nothing to do with reality, because there is no journal at all.

This patch disallows the following options for journalless configurations:
 - journal_checksum
 - journal_async_commit
 - commit=%ld
 - data={writeback,ordered,journal}
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
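
The rule this commit introduces can be sketched as a small option filter. This is a userspace illustration in Python, not the kernel's mount option parser; the function name `check_mount_options` and its return convention (a list of offending options) are assumptions made for the example.

```python
# Options that only make sense when the file system has a journal,
# taken from the list in the commit message above.
JOURNAL_ONLY_FLAGS = {"journal_checksum", "journal_async_commit"}
JOURNAL_ONLY_KEYS = {"commit", "data"}   # used as commit=N, data=MODE


def check_mount_options(option_string, has_journal):
    """Return the options that must be refused for this file system."""
    if has_journal:
        return []
    rejected = []
    for opt in option_string.split(","):
        opt = opt.strip()
        if not opt:
            continue
        key = opt.split("=", 1)[0]
        if opt in JOURNAL_ONLY_FLAGS:
            rejected.append(opt)
        elif "=" in opt and key in JOURNAL_ONLY_KEYS:
            rejected.append(opt)
    return rejected
```

With the mount command from the example, `check_mount_options("commit=20,journal_async_commit", has_journal=False)` reports both options, while the same string passes for a file system that has a journal.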

18 Oct, 2015 4 commits

ext4: clean up feature test macros with predicate functions · e2b911c5

Committed by Darrick J. Wong on Oct 17, 2015

Create separate predicate functions to test/set/clear feature flags,
thereby replacing the wordy old macros.  Furthermore, clean out the
places where we open-coded feature tests.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

ext4: call out CRC and corruption errors with specific error codes · 6a797d27

Committed by Darrick J. Wong on Oct 17, 2015

Instead of overloading EIO for CRC errors and corrupt structures,
return the same error codes that XFS returns for the same issues.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
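
From userspace, the practical effect is that a caller can tell a checksum failure or a corrupt structure apart from a plain I/O error. The Python sketch below assumes that the XFS codes in question are EFSBADCRC (an alias of EBADMSG) and EFSCORRUPTED (an alias of EUCLEAN); the commit message does not name them, so treat that mapping and the helper `describe_fs_error` as assumptions.

```python
import errno
import os

# Assumed aliases; EUCLEAN is Linux-specific, so fall back to its
# Linux value (117) when the running platform does not define it.
EFSBADCRC = errno.EBADMSG
EFSCORRUPTED = getattr(errno, "EUCLEAN", 117)


def describe_fs_error(err):
    """Map an errno value from a failed file system call to a category."""
    if err == EFSBADCRC:
        return "metadata checksum mismatch"
    if err == EFSCORRUPTED:
        return "corrupt on-disk structure"
    if err == errno.EIO:
        return "I/O error"
    return os.strerror(err)
```

A caller would typically pass `exc.errno` from a caught `OSError` to this helper.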

ext4: store checksum seed in superblock · 8c81bd8f

Committed by Darrick J. Wong on Oct 17, 2015

Allow the filesystem to store the metadata checksum seed in the
superblock and add an incompat feature to say that we're using it.
This enables tune2fs to change the UUID on a mounted metadata_csum
FS without having to (racy!) rewrite all disk metadata.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
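
Why a stored seed helps can be shown with a toy model. In the Python sketch below, the function `checksum_seed` and its parameters are hypothetical, and `zlib.crc32` is only a stand-in for the crc32c checksum that metadata_csum uses, since the Python standard library has no crc32c. The point is the selection logic: without a stored seed the seed follows the UUID, so changing the UUID would invalidate every metadata checksum.

```python
import zlib


def checksum_seed(uuid, stored_seed=None):
    """Pick the seed that metadata checksums are computed with.

    uuid        -- the 16-byte file system UUID
    stored_seed -- seed kept in the superblock, or None when the
                   feature is not enabled
    """
    if stored_seed is not None:
        # Feature enabled: the seed no longer depends on the UUID.
        return stored_seed
    # Feature disabled: derive the seed from the UUID (stand-in hash).
    return zlib.crc32(uuid) & 0xFFFFFFFF


old_uuid = bytes(range(16))
new_uuid = bytes(range(16, 32))

# Freeze the seed derived from the old UUID, then change the UUID.
seed = checksum_seed(old_uuid)
seed_after_change = checksum_seed(new_uuid, stored_seed=seed)
```

In this model `seed_after_change` equals `seed`, so existing checksums stay valid after the UUID changes.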

ext4: reserve code points for the project quota feature · 8b4953e1

Committed by Theodore Ts'o on Oct 17, 2015

Signed-off-by: Theodore Ts'o <tytso@mit.edu>

03 Oct, 2015 1 commit

ext4 crypto: ext4_page_crypto() doesn't need a encryption context · 3684de8c

Committed by Theodore Ts'o on Oct 03, 2015

Since ext4_page_crypto() doesn't need an encryption context (at least
not any more), this allows us to simplify a number of function signatures
and also allows us to avoid needing to allocate a context in
ext4_block_write_begin().  It also means we no longer need a separate
ext4_decrypt_one() function.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

24 Sep, 2015 2 commits

ext4: move procfs registration code to fs/ext4/sysfs.c · ebd173be

Committed by Theodore Ts'o on Sep 23, 2015

This allows us to refactor the procfs code, which saves a bit of
compiled space.  More importantly it isolates most of the procfs
support code into a single file, so it's easier to #ifdef it out if
the proc file system has been disabled.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4: move sysfs code from super.c to fs/ext4/sysfs.c · b5799018

Committed by Theodore Ts'o on Sep 23, 2015

Also statically allocate the ext4_kset and ext4_feat objects, since we
only need exactly one of each, and it's simpler and less code if we
drop the dynamic allocation and deallocation when it's not needed.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

09 Sep, 2015 1 commit

ext4: add ext4_get_block_dax() · ed923b57

Committed by Matthew Wilcox on Sep 08, 2015

DAX wants different semantics from any currently-existing ext4 get_block
callback.  Unlike ext4_get_block_write(), it needs to honour the
'create' flag, and unlike ext4_get_block(), it needs to be able to
return unwritten extents.  So introduce a new ext4_get_block_dax() which
has those semantics.

We could also change ext4_get_block_write() to honour the 'create' flag,
but that might have consequences on other users that I do not currently
understand.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

22 Jul, 2015 1 commit

ext4: replace ext4_io_submit->io_op with ->io_wbc · 5a33911f

Committed by Tejun Heo on Jul 21, 2015

ext4_io_submit_init() takes the pointer to writeback_control to test
its sync_mode and determine between WRITE and WRITE_SYNC and records
the result in ->io_op.  This patch makes it record the pointer
directly and moves the test to ext4_io_submit().

This doesn't cause any noticeable differences now but having
writeback_control available throughout IO submission path will be
depended upon by the planned cgroup writeback support.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

16 Jun, 2015 1 commit

ext4: improve warning directory handling messages · b03a2f7e

Committed by Andreas Dilger on Jun 15, 2015

Several ext4_warning() messages in the directory handling code do not
report the inode number of the (potentially corrupt) directory where a
problem is seen, and others report this in an ad-hoc manner.  Add an
ext4_warning_inode() helper to print the inode number and command name
consistent with ext4_error_inode().

Consolidate the place in ext4.h that these macros are defined.

Clean up some other directory error and warning messages to print the
calling function name.

Minor code style fixes in nearby lines.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

09 Jun, 2015 1 commit

ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate · 331573fe

Committed by Namjae Jeon on Jun 09, 2015

This patch implements fallocate's FALLOC_FL_INSERT_RANGE for Ext4.

1) Make sure that both offset and len are block size aligned.
2) Update the i_size of inode by len bytes.
3) Compute the file's logical block number against offset. If the computed
   block number is not the starting block of the extent, split the extent
   such that the block number is the starting block of the extent.
4) Shift all the extents which are lying between [offset, last allocated extent]
   towards right by len bytes. This step will make a hole of len bytes
   at offset.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
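
The four steps listed in this commit can be sketched on a toy extent list. The Python model below is not the ext4 implementation: the function `insert_range`, the `(logical_block, physical_block, block_count)` tuple layout, and the 4096-byte default block size are assumptions made for illustration, and all error handling beyond the alignment check is omitted.

```python
def insert_range(extents, i_size, offset, length, block_size=4096):
    """Model FALLOC_FL_INSERT_RANGE on a list of extents.

    extents -- list of (logical_block, physical_block, block_count)
    Returns the new extent list and the new file size in bytes.
    """
    # Step 1: offset and len must be block size aligned.
    if offset % block_size or length % block_size:
        raise ValueError("offset and len must be block size aligned")

    start = offset // block_size      # first logical block of the hole
    shift = length // block_size      # hole size in blocks

    result = []
    for logical, physical, count in sorted(extents):
        end = logical + count
        if end <= start:
            # Entirely before the insertion point: unchanged.
            result.append((logical, physical, count))
        elif logical >= start:
            # Step 4: shift right; physical blocks do not move.
            result.append((logical + shift, physical, count))
        else:
            # Step 3: split so that a new extent begins at `start`,
            # then shift only the right-hand part.
            left = start - logical
            result.append((logical, physical, left))
            result.append((start + shift, physical + left, count - left))

    # Step 2: the file grows by len bytes.
    return result, i_size + length
```

For example, inserting three blocks at block 2 into a file with extents `(0, 100, 4)` and `(8, 200, 2)` splits the first extent and moves everything from block 2 onward three blocks to the right, while the physical block numbers stay where they were.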

01 Jun, 2015 3 commits

ext4 crypto: allocate the right amount of memory for the on-disk symlink · 4d3c4e5b

Committed by Theodore Ts'o on May 31, 2015

Previously we were not taking the required padding into account when
allocating space for the on-disk symlink.  This caused a buffer overrun
which could trigger a kernel crash when running fsstress.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: encrypt tmpfile located in encryption protected directory · e709e9df

Committed by Theodore Ts'o on May 31, 2015

Factor out calls to ext4_inherit_context() and move them to
__ext4_new_inode(); this fixes a problem where ext4_tmpfile() wasn't
calling ext4_inherit_context(), so the temporary file wasn't
getting protected.  Since the blocks for the tmpfile could end up on
disk, they really should be protected if the tmpfile is created within
the context of an encrypted directory.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: use per-inode tfm structure · c936e1ec

Committed by Theodore Ts'o on May 31, 2015

As suggested by Herbert Xu, we shouldn't allocate a new tfm each time
we read or write a page.  Instead we can use a single tfm hanging off
the inode's crypt_info structure for all of our encryption needs for
that inode, since the tfm can be used by multiple crypto requests in
parallel.

Also use cmpxchg() to avoid races that could result in crypt_info
structure getting doubly allocated or doubly freed.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

19 May, 2015 5 commits

ext4 crypto: use slab caches · 8ee03714

Committed by Theodore Ts'o on May 18, 2015

Use slab caches for the ext4_crypto_ctx and ext4_crypt_info structures for
slightly better memory efficiency and debuggability.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4: clean up superblock encryption mode fields · f5aed2c2

Committed by Theodore Ts'o on May 18, 2015

The superblock fields s_file_encryption_mode and s_dir_encryption_mode
are vestigial, so remove them as a cleanup.  While we're at it, allow
file systems with both encryption and inline_data enabled at the same
time to work correctly.  We can't have encrypted inodes with inline
data, but there's no reason to prohibit unencrypted inodes from using
the inline data feature.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: reorganize how we store keys in the inode · b7236e21

Committed by Theodore Ts'o on May 18, 2015

This is a pretty massive patch which does a number of different things:

1) The per-inode encryption information is now stored in an allocated
   data structure, ext4_crypt_info, instead of directly in the inode.
   This reduces the memory usage of an in-memory inode when it is not
   using encryption.

2) We drop the ext4_fname_crypto_ctx entirely, and use the per-inode
   encryption structure instead.  This removes an unnecessary memory
   allocation and free for the fname_crypto_ctx as well as allowing us
   to reuse the ctfm in a directory for multiple lookups and file
   creations.

3) We also cache the inode's policy information in the ext4_crypt_info
   structure so we don't have to continually read it out of the
   extended attributes.

4) We now keep the keyring key in the inode's encryption structure
   instead of releasing it after we are done using it to derive the
   per-inode key.  This allows us to test to see if the key has been
   revoked; if it has, we prevent the use of the derived key and free
   it.

5) When an inode is released (or when the derived key is freed), we
   will use memzero_explicit() to zero out the derived key, so it's not
   left hanging around in memory.  This implies that when a user logs
   out, it is important to first revoke the key, and then unlink it,
   and then finally, to use "echo 3 > /proc/sys/vm/drop_caches" to
   release any decrypted pages and dcache entries from the system
   caches.

6) All this, and we also shrink the number of lines of code by around
   100.  :-)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: separate kernel and userspace structure for the key · e2881b1b

Committed by Theodore Ts'o on May 18, 2015

Use struct ext4_encryption_key only for the master key passed via the
kernel keyring.

For internal kernel space users, we now use struct ext4_crypt_info.
This will allow us to put information from the policy structure into it
so we can cache it and avoid needing to constantly look up the extended
attribute.  We will do this in a separate patch.  This patch is mostly
mechanical to make it easier for patch review.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: optimize filename encryption · 5b643f9c

Committed by Theodore Ts'o on May 18, 2015

Encrypt the filename as soon as it is passed in by the user.  This avoids
our needing to encrypt the filename 2 or 3 times while in the process
of creating a filename.

Similarly, when looking up a directory entry, encrypt the filename
early, or if the encryption key is not available, base-64 decode the
filename so that the hash value and the last 16 bytes of the
encrypted filename are available in the new struct ext4_filename data
structure.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

15 May, 2015 1 commit

ext4: remove unused function prototype from ext4.h · 92c82639

Committed by Theodore Ts'o on May 14, 2015

The ext4_extent_tree_init() function hasn't been in the ext4 code for
a long time, except in an unused function prototype in ext4.h.

Google-Bug-Id: 4530137
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

11 May, 2015 1 commit

ext4: split inode_operations for encrypted symlinks off the rest · a7a67e8a

Committed by Al Viro on Apr 27, 2015

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

02 May, 2015 3 commits

ext4 crypto: remove duplicated encryption mode definitions · 9402bdca

Committed by Chanho Park on May 02, 2015

This patch removes duplicated encryption modes which were already in
ext4.h. They were duplicated from commit 3edc18d8 and commit f542fb.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Michael Halcrow <mhalcrow@google.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: add padding to filenames before encrypting · a44cd7a0

Committed by Theodore Ts'o on May 01, 2015

This obscures the length of the filenames, to decrease the amount of
information leakage. By default, we pad the filenames to the next 4-byte
boundary. This costs nothing, since the directory entries are
aligned to 4-byte boundaries anyway. Filenames can also be padded to
8, 16, or 32 bytes, which will consume more directory space.

Change-Id: Ibb7a0fb76d2c48e2061240a709358ff40b14f322
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
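
The padding rule amounts to rounding the name length up to a multiple of the chosen boundary. The Python sketch below uses a hypothetical helper, `padded_length`, and deliberately ignores any minimum ciphertext length that the cipher itself may impose; it only illustrates how much length information each padding choice hides.

```python
ALLOWED_PADDING = (4, 8, 16, 32)   # boundaries named in the commit message


def padded_length(name_len, padding=4):
    """Round name_len up to the next multiple of padding."""
    if padding not in ALLOWED_PADDING:
        raise ValueError("padding must be one of 4, 8, 16 or 32")
    return (name_len + padding - 1) // padding * padding


# Names of 17 to 32 bytes all look the same with 32-byte padding,
# but fall into four distinct length classes with the 4-byte default.
classes_default = sorted({padded_length(n) for n in range(17, 33)})
classes_32 = sorted({padded_length(n, 32) for n in range(17, 33)})
```

Larger boundaries merge more name lengths into one class, at the cost of the extra directory space the commit message mentions.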

ext4 crypto: simplify and speed up filename encryption · 5de0b4d0

Committed by Theodore Ts'o on May 01, 2015

Avoid using SHA-1 when calculating the user-visible filename when the
encryption key is available, and avoid decrypting lots of filenames
when searching for a directory entry in a directory block.

Change-Id: If4655f144784978ba0305b597bfa1c8d7bb69e63
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

16 Apr, 2015 3 commits

ext4 crypto: enable encryption feature flag · 6ddb2447

Committed by Theodore Ts'o on Apr 16, 2015

Also add the test dummy encryption mode flag so we can more easily
test the encryption patches using xfstests.
Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: add symlink encryption · f348c252

Committed by Theodore Ts'o on Apr 16, 2015

Signed-off-by: Uday Savagaonkar <savagaon@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

dax: unify ext2/4_{dax,}_file_operations · be64f884

Committed by Boaz Harrosh on Apr 15, 2015

The original dax patchset split the ext2/4_file_operations because of the
two NULL splice_read/splice_write in the dax case.

In the vfs if splice_read/splice_write are NULL we then call
default_splice_read/write.

What we do here is make generic_file_splice_read aware of IS_DAX() so the
original ext2/4_file_operations can be used as is.

For write it appears that iter_file_splice_write is just fine.  It uses
the regular f_op->write(file,..) or new_sync_write(file, ...).
Signed-off-by: Boaz Harrosh <boaz@plexistor.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

12 Apr, 2015 6 commits

ext4 crypto: insert encrypted filenames into a leaf directory block · 4bdfc873

Committed by Michael Halcrow on Apr 12, 2015

Signed-off-by: Uday Savagaonkar <savagaon@google.com>
Signed-off-by: Ildar Muslukhov <ildarm@google.com>
Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: teach ext4_htree_store_dirent() to store decrypted filenames · 2f61830a

Committed by Theodore Ts'o on Apr 12, 2015

For encrypted directories, we need to pass in a separate parameter for
the decrypted filename, since the directory entry contains the
encrypted filename.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: filename encryption facilities · d5d0e8c7

Committed by Michael Halcrow on Apr 12, 2015

Signed-off-by: Uday Savagaonkar <savagaon@google.com>
Signed-off-by: Ildar Muslukhov <ildarm@google.com>
Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: add encryption key management facilities · 88bd6ccd

Committed by Michael Halcrow on Apr 12, 2015

Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Ildar Muslukhov <muslukhovi@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

ext4 crypto: add ext4 encryption facilities · b30ab0e0

Committed by Michael Halcrow on Apr 12, 2015

On encrypt, we will re-assign the buffer_heads to point to a bounce
page rather than the control_page (which is the original page to write
that contains the plaintext). The block I/O occurs against the bounce
page.  On write completion, we re-assign the buffer_heads to the
original plaintext page.

On decrypt, we will attach a read completion callback to the bio
struct. This read completion will decrypt the read contents in-place
prior to setting the page up-to-date.

The current encryption mode, AES-256-XTS, lacks cryptographic
integrity. AES-256-GCM is in-plan, but we will need to devise a
mechanism for handling the integrity data.
Signed-off-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Ildar Muslukhov <ildarm@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

direct_IO: use iov_iter_rw() instead of rw everywhere · 6f673763

Committed by Omar Sandoval on Mar 16, 2015

The rw parameter to direct_IO is redundant with iov_iter->type, and
treated slightly differently just about everywhere it's used: some users
do rw & WRITE, and others do rw == WRITE where they should be doing a
bitwise check. Simplify this with the new iov_iter_rw() helper, which
always returns either READ or WRITE.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
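
The difference between the two checks named in this commit can be shown with plain integers. The Python sketch below is a model of the bit arithmetic only: the constant values, the extra flag `SYNC_FLAG`, and the Python function `iov_iter_rw` are hypothetical stand-ins, not the kernel's definitions.

```python
READ = 0
WRITE = 1
RW_MASK = 1          # assumed: the direction lives in the lowest bit
SYNC_FLAG = 1 << 4   # hypothetical extra flag sharing the same word


def iov_iter_rw(iter_type):
    """Return READ or WRITE regardless of any other bits that are set."""
    return iter_type & RW_MASK


rw = WRITE | SYNC_FLAG

equality_check = (rw == WRITE)            # False: the extra bit breaks it
bitwise_check = bool(rw & WRITE)          # True
helper_check = (iov_iter_rw(rw) == WRITE) # True, and safe to compare with ==
```

Because the helper masks off everything except the direction, callers can use a plain equality test again without caring which other bits happen to be set.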