1. 04 5月, 2017 4 次提交
    • E
      ext4: clean up ext4_match() and callers · d9b9f8d5
      Eric Biggers 提交于
      When ext4 encryption was originally merged, we were encrypting the
      user-specified filename in ext4_match(), introducing a lot of additional
      complexity into ext4_match() and its callers.  This has since been
      changed to encrypt the filename earlier, so we can remove the gunk
      that's no longer needed.  This more or less reverts ext4_search_dir()
      and ext4_find_dest_de() to the way they were in the v4.0 kernel.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      d9b9f8d5
    • E
      ext4: switch to using fscrypt_match_name() · 067d1023
      Eric Biggers 提交于
      Switch ext4 directory searches to use the fscrypt_match_name() helper
      function.  There should be no functional change.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      067d1023
    • E
      fscrypt: avoid collisions when presenting long encrypted filenames · 6b06cdee
      Eric Biggers 提交于
      When accessing an encrypted directory without the key, userspace must
      operate on filenames derived from the ciphertext names, which contain
      arbitrary bytes.  Since we must support filenames as long as NAME_MAX,
      we can't always just base64-encode the ciphertext, since that may make
      it too long.  Currently, this is solved by presenting long names in an
      abbreviated form containing any needed filesystem-specific hashes (e.g.
      to identify a directory block), then the last 16 bytes of ciphertext.
      This needs to be sufficient to identify the actual name on lookup.
      
      However, there is a bug.  It seems to have been assumed that due to the
      use of a CBC (ciphertext block chaining)-based encryption mode, the last
      16 bytes (i.e. the AES block size) of ciphertext would depend on the
      full plaintext, preventing collisions.  However, we actually use CBC
      with ciphertext stealing (CTS), which handles the last two blocks
      specially, causing them to appear "flipped".  Thus, it's actually the
      second-to-last block which depends on the full plaintext.
      
      This caused long filenames that differ only near the end of their
      plaintexts to, when observed without the key, point to the wrong inode
      and be undeletable.  For example, with ext4:
      
          # echo pass | e4crypt add_key -p 16 edir/
          # seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 | xargs touch
          # find edir/ -type f | xargs stat -c %i | sort | uniq | wc -l
          100000
          # sync
          # echo 3 > /proc/sys/vm/drop_caches
          # keyctl new_session
          # find edir/ -type f | xargs stat -c %i | sort | uniq | wc -l
          2004
          # rm -rf edir/
          rm: cannot remove 'edir/_A7nNFi3rhkEQlJ6P,hdzluhODKOeWx5V': Structure needs cleaning
          ...
      
      To fix this, when presenting long encrypted filenames, encode the
      second-to-last block of ciphertext rather than the last 16 bytes.
      
      Although it would be nice to solve this without depending on a specific
      encryption mode, that would mean doing a cryptographic hash like SHA-256
      which would be much less efficient.  This way is sufficient for now, and
      it's still compatible with encryption modes like HEH which are strong
      pseudorandom permutations.  Also, changing the presented names is still
      allowed at any time because they are only provided to allow applications
      to do things like delete encrypted directories.  They're not designed to
      be used to persistently identify files --- which would be hard to do
      anyway, given that they're encrypted after all.
      
      For ease of backports, this patch only makes the minimal fix to both
      ext4 and f2fs.  It leaves ubifs as-is, since ubifs doesn't compare the
      ciphertext block yet.  Follow-on patches will clean things up properly
      and make the filesystems use a shared helper function.
      
      Fixes: 5de0b4d0 ("ext4 crypto: simplify and speed up filename encryption")
      Reported-by: NGwendal Grignou <gwendal@chromium.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      6b06cdee
    • E
      ext4: remove "nokey" check from ext4_lookup() · 8c68084b
      Eric Biggers 提交于
      Now that fscrypt_has_permitted_context() correctly handles the case
      where we have the key for the parent directory but not the child, we
      don't need to try to work around this in ext4_lookup().
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      8c68084b
  2. 03 4月, 2017 1 次提交
    • D
      ext4: Add statx support · 99652ea5
      David Howells 提交于
      Return enhanced file attributes from the Ext4 filesystem.  This includes
      the following:
      
       (1) The inode creation time (i_crtime) as stx_btime, setting STATX_BTIME.
      
       (2) Certain FS_xxx_FL flags are mapped to stx_attribute flags.
      
      This requires that all ext4 inodes have a getattr call, not just some of
      them, so to this end, split the ext4_getattr() function and only call part
      of it where appropriate.
      
      Example output:
      
      	[root@andromeda ~]# touch foo
      	[root@andromeda ~]# chattr +ai foo
      	[root@andromeda ~]# /tmp/test-statx foo
      	statx(foo) = 0
      	results=fff
      	  Size: 0               Blocks: 0          IO Block: 4096    regular file
      	Device: 08:12           Inode: 2101950     Links: 1
      	Access: (0644/-rw-r--r--)  Uid:     0   Gid:     0
      	Access: 2016-02-11 17:08:29.031795451+0000
      	Modify: 2016-02-11 17:08:29.031795451+0000
      	Change: 2016-02-11 17:11:11.987790114+0000
      	 Birth: 2016-02-11 17:08:29.031795451+0000
      	Attributes: 0000000000000030 (-------- -------- -------- -------- -------- -------- -------- --ai----)
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      99652ea5
  3. 05 2月, 2017 1 次提交
  4. 02 2月, 2017 1 次提交
  5. 08 1月, 2017 1 次提交
    • T
      ext4: don't allow encrypted operations without keys · 173b8439
      Theodore Ts'o 提交于
      While we allow deletes without the key, the following should not be
      permitted:
      
      # cd /vdc/encrypted-dir-without-key
      # ls -l
      total 4
      -rw-r--r-- 1 root root   0 Dec 27 22:35 6,LKNRJsp209FbXoSvJWzB
      -rw-r--r-- 1 root root 286 Dec 27 22:35 uRJ5vJh9gE7vcomYMqTAyD
      # mv uRJ5vJh9gE7vcomYMqTAyD  6,LKNRJsp209FbXoSvJWzB
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      173b8439
  6. 01 1月, 2017 1 次提交
    • E
      fscrypt: use ENOKEY when file cannot be created w/o key · 54475f53
      Eric Biggers 提交于
      As part of an effort to clean up fscrypt-related error codes, make
      attempting to create a file in an encrypted directory that hasn't been
      "unlocked" fail with ENOKEY.  Previously, several error codes were used
      for this case, including ENOENT, EACCES, and EPERM, and they were not
      consistent between and within filesystems.  ENOKEY is a better choice
      because it expresses that the failure is due to lacking the encryption
      key.  It also matches the error code returned when trying to open an
      encrypted regular file without the key.
      
      I am not aware of any users who might be relying on the previous
      inconsistent error codes, which were never documented anywhere.
      
      This failure case will be exercised by an xfstest.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      54475f53
  7. 15 11月, 2016 1 次提交
    • D
      ext4: use current_time() for inode timestamps · eeca7ea1
      Deepa Dinamani 提交于
      CURRENT_TIME_SEC and CURRENT_TIME are not y2038 safe.
      current_time() will be transitioned to be y2038 safe
      along with vfs.
      
      current_time() returns timestamps according to the
      granularities set in the super_block.
      The granularity check in ext4_current_time() to call
      current_time() or CURRENT_TIME_SEC is not required.
      Use current_time() directly to obtain timestamps
      unconditionally, and remove ext4_current_time().
      
      Quota files are assumed to be on the same filesystem.
      Hence, use current_time() for these files as well.
      Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NArnd Bergmann <arnd@arndb.de>
      eeca7ea1
  8. 15 10月, 2016 1 次提交
  9. 08 10月, 2016 1 次提交
  10. 30 9月, 2016 1 次提交
  11. 27 9月, 2016 1 次提交
  12. 16 9月, 2016 1 次提交
    • E
      fscrypto: make filename crypto functions return 0 on success · ef1eb3aa
      Eric Biggers 提交于
      Several filename crypto functions: fname_decrypt(),
      fscrypt_fname_disk_to_usr(), and fscrypt_fname_usr_to_disk(), returned
      the output length on success or -errno on failure.  However, the output
      length was redundant with the value written to 'oname->len'.  It is also
      potentially error-prone to make callers have to check for '< 0' instead
      of '!= 0'.
      
      Therefore, make these functions return 0 instead of a length, and make
      the callers who cared about the return value being a length use
      'oname->len' instead.  For consistency also make other callers check for
      a nonzero result rather than a negative result.
      
      This change also fixes the inconsistency of fname_encrypt() actually
      already returning 0 on success, not a length like the other filename
      crypto functions and as documented in its function comment.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
      Acked-by: NJaegeuk Kim <jaegeuk@kernel.org>
      ef1eb3aa
  13. 11 7月, 2016 1 次提交
  14. 04 7月, 2016 2 次提交
  15. 08 6月, 2016 1 次提交
  16. 05 5月, 2016 1 次提交
    • J
      ext4: fix oops on corrupted filesystem · 74177f55
      Jan Kara 提交于
      When filesystem is corrupted in the right way, it can happen
      ext4_mark_iloc_dirty() in ext4_orphan_add() returns error and we
      subsequently remove inode from the in-memory orphan list. However this
      deletion is done with list_del(&EXT4_I(inode)->i_orphan) and thus we
      leave i_orphan list_head with a stale content. Later we can look at this
      content causing list corruption, oops, or other issues. The reported
      trace looked like:
      
      WARNING: CPU: 0 PID: 46 at lib/list_debug.c:53 __list_del_entry+0x6b/0x100()
      list_del corruption, 0000000061c1d6e0->next is LIST_POISON1
      0000000000100100)
      CPU: 0 PID: 46 Comm: ext4.exe Not tainted 4.1.0-rc4+ #250
      Stack:
       60462947 62219960 602ede24 62219960
       602ede24 603ca293 622198f0 602f02eb
       62219950 6002c12c 62219900 601b4d6b
      Call Trace:
       [<6005769c>] ? vprintk_emit+0x2dc/0x5c0
       [<602ede24>] ? printk+0x0/0x94
       [<600190bc>] show_stack+0xdc/0x1a0
       [<602ede24>] ? printk+0x0/0x94
       [<602ede24>] ? printk+0x0/0x94
       [<602f02eb>] dump_stack+0x2a/0x2c
       [<6002c12c>] warn_slowpath_common+0x9c/0xf0
       [<601b4d6b>] ? __list_del_entry+0x6b/0x100
       [<6002c254>] warn_slowpath_fmt+0x94/0xa0
       [<602f4d09>] ? __mutex_lock_slowpath+0x239/0x3a0
       [<6002c1c0>] ? warn_slowpath_fmt+0x0/0xa0
       [<60023ebf>] ? set_signals+0x3f/0x50
       [<600a205a>] ? kmem_cache_free+0x10a/0x180
       [<602f4e88>] ? mutex_lock+0x18/0x30
       [<601b4d6b>] __list_del_entry+0x6b/0x100
       [<601177ec>] ext4_orphan_del+0x22c/0x2f0
       [<6012f27c>] ? __ext4_journal_start_sb+0x2c/0xa0
       [<6010b973>] ? ext4_truncate+0x383/0x390
       [<6010bc8b>] ext4_write_begin+0x30b/0x4b0
       [<6001bb50>] ? copy_from_user+0x0/0xb0
       [<601aa840>] ? iov_iter_fault_in_readable+0xa0/0xc0
       [<60072c4f>] generic_perform_write+0xaf/0x1e0
       [<600c4166>] ? file_update_time+0x46/0x110
       [<60072f0f>] __generic_file_write_iter+0x18f/0x1b0
       [<6010030f>] ext4_file_write_iter+0x15f/0x470
       [<60094e10>] ? unlink_file_vma+0x0/0x70
       [<6009b180>] ? unlink_anon_vmas+0x0/0x260
       [<6008f169>] ? free_pgtables+0xb9/0x100
       [<600a6030>] __vfs_write+0xb0/0x130
       [<600a61d5>] vfs_write+0xa5/0x170
       [<600a63d6>] SyS_write+0x56/0xe0
       [<6029fcb0>] ? __libc_waitpid+0x0/0xa0
       [<6001b698>] handle_syscall+0x68/0x90
       [<6002633d>] userspace+0x4fd/0x600
       [<6002274f>] ? save_registers+0x1f/0x40
       [<60028bd7>] ? arch_prctl+0x177/0x1b0
       [<60017bd5>] fork_handler+0x85/0x90
      
      Fix the problem by using list_del_init() as we always should with
      i_orphan list.
      
      CC: stable@vger.kernel.org
      Reported-by: NVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      74177f55
  17. 27 4月, 2016 1 次提交
  18. 24 4月, 2016 1 次提交
    • T
      ext4: allow readdir()'s of large empty directories to be interrupted · 1f60fbe7
      Theodore Ts'o 提交于
      If a directory has a large number of empty blocks, iterating over all
      of them can take a long time, leading to scheduler warnings and users
      getting irritated when they can't kill a process in the middle of one
      of these long-running readdir operations.  Fix this by adding checks to
      ext4_readdir() and ext4_htree_fill_tree().
      
      This was reverted earlier due to a typo in the original commit where I
      experimented with using signal_pending() instead of
      fatal_signal_pending().  The test was in the wrong place if we were
      going to return signal_pending() since we would end up returning
      duplicant entries.  See 9f2394c9 for a more detailed explanation.
      
      Added fix as suggested by Linus to check for signal_pending() in
      in the filldir() functions.
      Reported-by: NBenjamin LaHaise <bcrl@kvack.org>
      Google-Bug-Id: 27880676
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      1f60fbe7
  19. 11 4月, 2016 2 次提交
    • L
      Revert "ext4: allow readdir()'s of large empty directories to be interrupted" · 9f2394c9
      Linus Torvalds 提交于
      This reverts commit 1028b55b.
      
      It's broken: it makes ext4 return an error at an invalid point, causing
      the readdir wrappers to write the the position of the last successful
      directory entry into the position field, which means that the next
      readdir will now return that last successful entry _again_.
      
      You can only return fatal errors (that terminate the readdir directory
      walk) from within the filesystem readdir functions, the "normal" errors
      (that happen when the readdir buffer fills up, for example) happen in
      the iterorator where we know the position of the actual failing entry.
      
      I do have a very different patch that does the "signal_pending()"
      handling inside the iterator function where it is allowable, but while
      that one passes all the sanity checks, I screwed up something like four
      times while emailing it out, so I'm not going to commit it today.
      
      So my track record is not good enough, and the stars will have to align
      better before that one gets committed.  And it would be good to get some
      review too, of course, since celestial alignments are always an iffy
      debugging model.
      
      IOW, let's just revert the commit that caused the problem for now.
      Reported-by: NGreg Thelen <gthelen@google.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9f2394c9
    • A
      don't bother with ->d_inode->i_sb - it's always equal to ->d_sb · fc64005c
      Al Viro 提交于
      ... and neither can ever be NULL
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      fc64005c
  20. 31 3月, 2016 1 次提交
  21. 08 2月, 2016 2 次提交
  22. 23 1月, 2016 1 次提交
    • A
      wrappers for ->i_mutex access · 5955102c
      Al Viro 提交于
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      5955102c
  23. 09 1月, 2016 2 次提交
  24. 09 12月, 2015 1 次提交
    • A
      don't put symlink bodies in pagecache into highmem · 21fc61c7
      Al Viro 提交于
      kmap() in page_follow_link_light() needed to go - allowing to hold
      an arbitrary number of kmaps for long is a great way to deadlocking
      the system.
      
      new helper (inode_nohighmem(inode)) needs to be used for pagecache
      symlinks inodes; done for all in-tree cases.  page_follow_link_light()
      instrumented to yell about anything missed.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      21fc61c7
  25. 30 10月, 2015 1 次提交
  26. 18 10月, 2015 2 次提交
  27. 29 9月, 2015 1 次提交
  28. 27 7月, 2015 1 次提交
  29. 24 7月, 2015 1 次提交
  30. 21 6月, 2015 1 次提交
    • T
      ext4: prevent ext4_quota_write() from failing due to ENOSPC · c5e298ae
      Theodore Ts'o 提交于
      In order to prevent quota block tracking to be inaccurate when
      ext4_quota_write() fails with ENOSPC, we make two changes.  The quota
      file can now use the reserved block (since the quota file is arguably
      file system metadata), and ext4_quota_write() now uses
      ext4_should_retry_alloc() to retry the block allocation after a commit
      has completed and released some blocks for allocation.
      
      This fixes failures of xfstests generic/270:
      
      Quota error (device vdc): write_blk: dquota write failed
      Quota error (device vdc): qtree_write_dquot: Error -28 occurred while creating quota
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      c5e298ae
  31. 16 6月, 2015 1 次提交
    • A
      ext4: improve warning directory handling messages · b03a2f7e
      Andreas Dilger 提交于
      Several ext4_warning() messages in the directory handling code do not
      report the inode number of the (potentially corrupt) directory where a
      problem is seen, and others report this in an ad-hoc manner.  Add an
      ext4_warning_inode() helper to print the inode number and command name
      consistent with ext4_error_inode().
      
      Consolidate the place in ext4.h that these macros are defined.
      
      Clean up some other directory error and warning messages to print the
      calling function name.
      
      Minor code style fixes in nearby lines.
      Signed-off-by: NAndreas Dilger <adilger@dilger.ca>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      b03a2f7e
  32. 01 6月, 2015 1 次提交