1. 24 5月, 2017 16 次提交
  2. 17 5月, 2017 1 次提交
  3. 06 5月, 2017 2 次提交
    • B
      GFS2: Allow glocks to be unlocked after withdraw · ed17545d
      Bob Peterson 提交于
      This bug fixes a regression introduced by patch 0d1c7ae9.
      
      The intent of the patch was to stop promoting glocks after a
      file system is withdrawn due to a variety of errors, because doing
      so results in a BUG(). (You should be able to unmount after a
      withdraw rather than having the kernel panic.)
      
      Unfortunately, it also stopped demotions, so glocks could not be
      unlocked after withdraw, which means the unmount would hang.
      
      This patch allows function do_xmote to demote locks to an
      unlocked state after a withdraw, but not promote them.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      ed17545d
    • E
      xfs: fix use-after-free in xfs_finish_page_writeback · 161f55ef
      Eryu Guan 提交于
      Commit 28b783e4 ("xfs: bufferhead chains are invalid after
      end_page_writeback") fixed one use-after-free issue by
      pre-calculating the loop conditionals before calling bh->b_end_io()
      in the end_io processing loop, but it assigned 'next' pointer before
      checking end offset boundary & breaking the loop, at which point the
      bh might be freed already, and caused use-after-free.
      
      This is caught by KASAN when running fstests generic/127 on sub-page
      block size XFS.
      
      [ 2517.244502] run fstests generic/127 at 2017-04-27 07:30:50
      [ 2747.868840] ==================================================================
      [ 2747.876949] BUG: KASAN: use-after-free in xfs_destroy_ioend+0x3d3/0x4e0 [xfs] at addr ffff8801395ae698
      ...
      [ 2747.918245] Call Trace:
      [ 2747.920975]  dump_stack+0x63/0x84
      [ 2747.924673]  kasan_object_err+0x21/0x70
      [ 2747.928950]  kasan_report+0x271/0x530
      [ 2747.933064]  ? xfs_destroy_ioend+0x3d3/0x4e0 [xfs]
      [ 2747.938409]  ? end_page_writeback+0xce/0x110
      [ 2747.943171]  __asan_report_load8_noabort+0x19/0x20
      [ 2747.948545]  xfs_destroy_ioend+0x3d3/0x4e0 [xfs]
      [ 2747.953724]  xfs_end_io+0x1af/0x2b0 [xfs]
      [ 2747.958197]  process_one_work+0x5ff/0x1000
      [ 2747.962766]  worker_thread+0xe4/0x10e0
      [ 2747.966946]  kthread+0x2d3/0x3d0
      [ 2747.970546]  ? process_one_work+0x1000/0x1000
      [ 2747.975405]  ? kthread_create_on_node+0xc0/0xc0
      [ 2747.980457]  ? syscall_return_slowpath+0xe6/0x140
      [ 2747.985706]  ? do_page_fault+0x30/0x80
      [ 2747.989887]  ret_from_fork+0x2c/0x40
      [ 2747.993874] Object at ffff8801395ae690, in cache buffer_head size: 104
      [ 2748.001155] Allocated:
      [ 2748.003782] PID = 8327
      [ 2748.006411]  save_stack_trace+0x1b/0x20
      [ 2748.010688]  save_stack+0x46/0xd0
      [ 2748.014383]  kasan_kmalloc+0xad/0xe0
      [ 2748.018370]  kasan_slab_alloc+0x12/0x20
      [ 2748.022648]  kmem_cache_alloc+0xb8/0x1b0
      [ 2748.027024]  alloc_buffer_head+0x22/0xc0
      [ 2748.031399]  alloc_page_buffers+0xd1/0x250
      [ 2748.035968]  create_empty_buffers+0x30/0x410
      [ 2748.040730]  create_page_buffers+0x120/0x1b0
      [ 2748.045493]  __block_write_begin_int+0x17a/0x1800
      [ 2748.050740]  iomap_write_begin+0x100/0x2f0
      [ 2748.055308]  iomap_zero_range_actor+0x253/0x5c0
      [ 2748.060362]  iomap_apply+0x157/0x270
      [ 2748.064347]  iomap_zero_range+0x5a/0x80
      [ 2748.068624]  iomap_truncate_page+0x6b/0xa0
      [ 2748.073227]  xfs_setattr_size+0x1f7/0xa10 [xfs]
      [ 2748.078312]  xfs_vn_setattr_size+0x68/0x140 [xfs]
      [ 2748.083589]  xfs_file_fallocate+0x4ac/0x820 [xfs]
      [ 2748.088838]  vfs_fallocate+0x2cf/0x780
      [ 2748.093021]  SyS_fallocate+0x48/0x80
      [ 2748.097006]  do_syscall_64+0x18a/0x430
      [ 2748.101186]  return_from_SYSCALL_64+0x0/0x6a
      [ 2748.105948] Freed:
      [ 2748.108189] PID = 8327
      [ 2748.110816]  save_stack_trace+0x1b/0x20
      [ 2748.115093]  save_stack+0x46/0xd0
      [ 2748.118788]  kasan_slab_free+0x73/0xc0
      [ 2748.122969]  kmem_cache_free+0x7a/0x200
      [ 2748.127247]  free_buffer_head+0x41/0x80
      [ 2748.131524]  try_to_free_buffers+0x178/0x250
      [ 2748.136316]  xfs_vm_releasepage+0x2e9/0x3d0 [xfs]
      [ 2748.141563]  try_to_release_page+0x100/0x180
      [ 2748.146325]  invalidate_inode_pages2_range+0x7da/0xcf0
      [ 2748.152087]  xfs_shift_file_space+0x37d/0x6e0 [xfs]
      [ 2748.157557]  xfs_collapse_file_space+0x49/0x120 [xfs]
      [ 2748.163223]  xfs_file_fallocate+0x2a7/0x820 [xfs]
      [ 2748.168462]  vfs_fallocate+0x2cf/0x780
      [ 2748.172642]  SyS_fallocate+0x48/0x80
      [ 2748.176629]  do_syscall_64+0x18a/0x430
      [ 2748.180810]  return_from_SYSCALL_64+0x0/0x6a
      
      Fixed it by checking on offset against end & breaking out first,
      dereference bh only if there're still bufferheads to process.
      Signed-off-by: NEryu Guan <eguan@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      161f55ef
  4. 05 5月, 2017 5 次提交
  5. 04 5月, 2017 16 次提交
    • E
      ext4: clean up ext4_match() and callers · d9b9f8d5
      Eric Biggers 提交于
      When ext4 encryption was originally merged, we were encrypting the
      user-specified filename in ext4_match(), introducing a lot of additional
      complexity into ext4_match() and its callers.  This has since been
      changed to encrypt the filename earlier, so we can remove the gunk
      that's no longer needed.  This more or less reverts ext4_search_dir()
      and ext4_find_dest_de() to the way they were in the v4.0 kernel.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      d9b9f8d5
    • E
      f2fs: switch to using fscrypt_match_name() · 1f73d491
      Eric Biggers 提交于
      Switch f2fs directory searches to use the fscrypt_match_name() helper
      function.  There should be no functional change.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Acked-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      1f73d491
    • E
      ext4: switch to using fscrypt_match_name() · 067d1023
      Eric Biggers 提交于
      Switch ext4 directory searches to use the fscrypt_match_name() helper
      function.  There should be no functional change.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      067d1023
    • E
      fscrypt: introduce helper function for filename matching · 17159420
      Eric Biggers 提交于
      Introduce a helper function fscrypt_match_name() which tests whether a
      fscrypt_name matches a directory entry.  Also clean up the magic numbers
      and document things properly.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      17159420
    • E
      fscrypt: avoid collisions when presenting long encrypted filenames · 6b06cdee
      Eric Biggers 提交于
      When accessing an encrypted directory without the key, userspace must
      operate on filenames derived from the ciphertext names, which contain
      arbitrary bytes.  Since we must support filenames as long as NAME_MAX,
      we can't always just base64-encode the ciphertext, since that may make
      it too long.  Currently, this is solved by presenting long names in an
      abbreviated form containing any needed filesystem-specific hashes (e.g.
      to identify a directory block), then the last 16 bytes of ciphertext.
      This needs to be sufficient to identify the actual name on lookup.
      
      However, there is a bug.  It seems to have been assumed that due to the
      use of a CBC (ciphertext block chaining)-based encryption mode, the last
      16 bytes (i.e. the AES block size) of ciphertext would depend on the
      full plaintext, preventing collisions.  However, we actually use CBC
      with ciphertext stealing (CTS), which handles the last two blocks
      specially, causing them to appear "flipped".  Thus, it's actually the
      second-to-last block which depends on the full plaintext.
      
      This caused long filenames that differ only near the end of their
      plaintexts to, when observed without the key, point to the wrong inode
      and be undeletable.  For example, with ext4:
      
          # echo pass | e4crypt add_key -p 16 edir/
          # seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 | xargs touch
          # find edir/ -type f | xargs stat -c %i | sort | uniq | wc -l
          100000
          # sync
          # echo 3 > /proc/sys/vm/drop_caches
          # keyctl new_session
          # find edir/ -type f | xargs stat -c %i | sort | uniq | wc -l
          2004
          # rm -rf edir/
          rm: cannot remove 'edir/_A7nNFi3rhkEQlJ6P,hdzluhODKOeWx5V': Structure needs cleaning
          ...
      
      To fix this, when presenting long encrypted filenames, encode the
      second-to-last block of ciphertext rather than the last 16 bytes.
      
      Although it would be nice to solve this without depending on a specific
      encryption mode, that would mean doing a cryptographic hash like SHA-256
      which would be much less efficient.  This way is sufficient for now, and
      it's still compatible with encryption modes like HEH which are strong
      pseudorandom permutations.  Also, changing the presented names is still
      allowed at any time because they are only provided to allow applications
      to do things like delete encrypted directories.  They're not designed to
      be used to persistently identify files --- which would be hard to do
      anyway, given that they're encrypted after all.
      
      For ease of backports, this patch only makes the minimal fix to both
      ext4 and f2fs.  It leaves ubifs as-is, since ubifs doesn't compare the
      ciphertext block yet.  Follow-on patches will clean things up properly
      and make the filesystems use a shared helper function.
      
      Fixes: 5de0b4d0 ("ext4 crypto: simplify and speed up filename encryption")
      Reported-by: NGwendal Grignou <gwendal@chromium.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      6b06cdee
    • J
      f2fs: check entire encrypted bigname when finding a dentry · 6332cd32
      Jaegeuk Kim 提交于
      If user has no key under an encrypted dir, fscrypt gives digested dentries.
      Previously, when looking up a dentry, f2fs only checks its hash value with
      first 4 bytes of the digested dentry, which didn't handle hash collisions fully.
      This patch enhances to check entire dentry bytes likewise ext4.
      
      Eric reported how to reproduce this issue by:
      
       # seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 | xargs touch
       # find edir -type f | xargs stat -c %i | sort | uniq | wc -l
      100000
       # sync
       # echo 3 > /proc/sys/vm/drop_caches
       # keyctl new_session
       # find edir -type f | xargs stat -c %i | sort | uniq | wc -l
      99999
      
      Cc: <stable@vger.kernel.org>
      Reported-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      (fixed f2fs_dentry_hash() to work even when the hash is 0)
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      6332cd32
    • E
      ubifs: check for consistent encryption contexts in ubifs_lookup() · 413d5a9e
      Eric Biggers 提交于
      As ext4 and f2fs do, ubifs should check for consistent encryption
      contexts during ->lookup() in an encrypted directory.  This protects
      certain users of filesystem encryption against certain types of offline
      attacks.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      413d5a9e
    • E
      f2fs: sync f2fs_lookup() with ext4_lookup() · faac7fd9
      Eric Biggers 提交于
      As for ext4, now that fscrypt_has_permitted_context() correctly handles
      the case where we have the key for the parent directory but not the
      child, f2fs_lookup() no longer has to work around it.  Also add the same
      warning message that ext4 uses.
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      faac7fd9
    • E
      ext4: remove "nokey" check from ext4_lookup() · 8c68084b
      Eric Biggers 提交于
      Now that fscrypt_has_permitted_context() correctly handles the case
      where we have the key for the parent directory but not the child, we
      don't need to try to work around this in ext4_lookup().
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      8c68084b
    • E
      fscrypt: fix context consistency check when key(s) unavailable · 272f98f6
      Eric Biggers 提交于
      To mitigate some types of offline attacks, filesystem encryption is
      designed to enforce that all files in an encrypted directory tree use
      the same encryption policy (i.e. the same encryption context excluding
      the nonce).  However, the fscrypt_has_permitted_context() function which
      enforces this relies on comparing struct fscrypt_info's, which are only
      available when we have the encryption keys.  This can cause two
      incorrect behaviors:
      
      1. If we have the parent directory's key but not the child's key, or
         vice versa, then fscrypt_has_permitted_context() returned false,
         causing applications to see EPERM or ENOKEY.  This is incorrect if
         the encryption contexts are in fact consistent.  Although we'd
         normally have either both keys or neither key in that case since the
         master_key_descriptors would be the same, this is not guaranteed
         because keys can be added or removed from keyrings at any time.
      
      2. If we have neither the parent's key nor the child's key, then
         fscrypt_has_permitted_context() returned true, causing applications
         to see no error (or else an error for some other reason).  This is
         incorrect if the encryption contexts are in fact inconsistent, since
         in that case we should deny access.
      
      To fix this, retrieve and compare the fscrypt_contexts if we are unable
      to set up both fscrypt_infos.
      
      While this slightly hurts performance when accessing an encrypted
      directory tree without the key, this isn't a case we really need to be
      optimizing for; access *with* the key is much more important.
      Furthermore, the performance hit is barely noticeable given that we are
      already retrieving the fscrypt_context and doing two keyring searches in
      fscrypt_get_encryption_info().  If we ever actually wanted to optimize
      this case we might start by caching the fscrypt_contexts.
      
      Cc: stable@vger.kernel.org # 4.0+
      Signed-off-by: NEric Biggers <ebiggers@google.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      272f98f6
    • J
      jbd2: cleanup write flags handling from jbd2_write_superblock() · 17f423b5
      Jan Kara 提交于
      Currently jbd2_write_superblock() silently adds REQ_SYNC to flags with
      which journal superblock is written. Make this explicit by making flags
      passed down to jbd2_write_superblock() contain REQ_SYNC.
      
      CC: linux-ext4@vger.kernel.org
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      17f423b5
    • J
      ext4: mark superblock writes synchronous for nobarrier mounts · 00473374
      Jan Kara 提交于
      Commit b685d3d6 "block: treat REQ_FUA and REQ_PREFLUSH as
      synchronous" removed REQ_SYNC flag from WRITE_FUA implementation.
      generic_make_request_checks() however strips REQ_FUA flag from a bio
      when the storage doesn't report volatile write cache and thus write
      effectively becomes asynchronous which can lead to performance
      regressions. This affects superblock writes for ext4. Fix the problem
      by marking superblock writes always as synchronous.
      
      Fixes: b685d3d6
      CC: linux-ext4@vger.kernel.org
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      00473374
    • J
      nfs: Fix bdi handling for cloned superblocks · 9052c7cf
      Jan Kara 提交于
      In commit 0d3b12584972 "nfs: Convert to separately allocated bdi" I have
      wrongly cloned bdi reference in nfs_clone_super(). Further inspection
      has shown that originally the code was actually allocating a new bdi (in
      ->clone_server callback) which was later registered in
      nfs_fs_mount_common() and used for sb->s_bdi in nfs_initialise_sb().
      This could later result in bdi for the original superblock not getting
      unregistered when that superblock got shutdown (as the cloned sb still
      held bdi reference) and later when a new superblock was created under
      the same anonymous device number, a clash in sysfs has happened on bdi
      registration:
      
      ------------[ cut here ]------------
      WARNING: CPU: 1 PID: 10284 at /linux-next/fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x74
      sysfs: cannot create duplicate filename '/devices/virtual/bdi/0:32'
      Modules linked in: axp20x_usb_power gpio_axp209 nvmem_sunxi_sid sun4i_dma sun4i_ss virt_dma
      CPU: 1 PID: 10284 Comm: mount.nfs Not tainted 4.11.0-rc4+ #14
      Hardware name: Allwinner sun7i (A20) Family
      [<c010f19c>] (unwind_backtrace) from [<c010bc74>] (show_stack+0x10/0x14)
      [<c010bc74>] (show_stack) from [<c03c6e24>] (dump_stack+0x78/0x8c)
      [<c03c6e24>] (dump_stack) from [<c0122200>] (__warn+0xe8/0x100)
      [<c0122200>] (__warn) from [<c0122250>] (warn_slowpath_fmt+0x38/0x48)
      [<c0122250>] (warn_slowpath_fmt) from [<c02ac178>] (sysfs_warn_dup+0x64/0x74)
      [<c02ac178>] (sysfs_warn_dup) from [<c02ac254>] (sysfs_create_dir_ns+0x84/0x94)
      [<c02ac254>] (sysfs_create_dir_ns) from [<c03c8b8c>] (kobject_add_internal+0x9c/0x2ec)
      [<c03c8b8c>] (kobject_add_internal) from [<c03c8e24>] (kobject_add+0x48/0x98)
      [<c03c8e24>] (kobject_add) from [<c048d75c>] (device_add+0xe4/0x5a0)
      [<c048d75c>] (device_add) from [<c048ddb4>] (device_create_groups_vargs+0xac/0xbc)
      [<c048ddb4>] (device_create_groups_vargs) from [<c048dde4>] (device_create_vargs+0x20/0x28)
      [<c048dde4>] (device_create_vargs) from [<c02075c8>] (bdi_register_va+0x44/0xfc)
      [<c02075c8>] (bdi_register_va) from [<c023d378>] (super_setup_bdi_name+0x48/0xa4)
      [<c023d378>] (super_setup_bdi_name) from [<c0312ef4>] (nfs_fill_super+0x1a4/0x204)
      [<c0312ef4>] (nfs_fill_super) from [<c03133f0>] (nfs_fs_mount_common+0x140/0x1e8)
      [<c03133f0>] (nfs_fs_mount_common) from [<c03335cc>] (nfs4_remote_mount+0x50/0x58)
      [<c03335cc>] (nfs4_remote_mount) from [<c023ef98>] (mount_fs+0x14/0xa4)
      [<c023ef98>] (mount_fs) from [<c025cba0>] (vfs_kern_mount+0x54/0x128)
      [<c025cba0>] (vfs_kern_mount) from [<c033352c>] (nfs_do_root_mount+0x80/0xa0)
      [<c033352c>] (nfs_do_root_mount) from [<c0333818>] (nfs4_try_mount+0x28/0x3c)
      [<c0333818>] (nfs4_try_mount) from [<c0313874>] (nfs_fs_mount+0x2cc/0x8c4)
      [<c0313874>] (nfs_fs_mount) from [<c023ef98>] (mount_fs+0x14/0xa4)
      [<c023ef98>] (mount_fs) from [<c025cba0>] (vfs_kern_mount+0x54/0x128)
      [<c025cba0>] (vfs_kern_mount) from [<c02600f0>] (do_mount+0x158/0xc7c)
      [<c02600f0>] (do_mount) from [<c0260f98>] (SyS_mount+0x8c/0xb4)
      [<c0260f98>] (SyS_mount) from [<c0107840>] (ret_fast_syscall+0x0/0x3c)
      
      Fix the problem by always creating new bdi for a superblock as we used
      to do.
      Reported-and-tested-by: NCorentin Labbe <clabbe.montjoie@gmail.com>
      Fixes: 0d3b12584972ce5781179ad3f15cca3cdb5cae05
      Signed-off-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      9052c7cf
    • S
      SMB3: Work around mount failure when using SMB3 dialect to Macs · 7db0a6ef
      Steve French 提交于
      Macs send the maximum buffer size in response on ioctl to validate
      negotiate security information, which causes us to fail the mount
      as the response buffer is larger than the expected response.
      
      Changed ioctl response processing to allow for padding of validate
      negotiate ioctl response and limit the maximum response size to
      maximum buffer size.
      Signed-off-by: NSteve French <steve.french@primarydata.com>
      CC: Stable <stable@vger.kernel.org>
      7db0a6ef
    • Y
      f2fs: fix a mount fail for wrong next_scan_nid · e9cdd307
      Yunlei He 提交于
      -write_checkpoint
         -do_checkpoint
            -next_free_nid    <--- something wrong with next free nid
      
      -f2fs_fill_super
         -build_node_manager
            -build_free_nids
                -get_current_nat_page
                   -__get_meta_page   <--- attempt to access beyond end of device
      Signed-off-by: NYunlei He <heyunlei@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      e9cdd307
    • D
      cifs: fix CIFS_IOC_GET_MNT_INFO oops · d8a6e505
      David Disseldorp 提交于
      An open directory may have a NULL private_data pointer prior to readdir.
      
      Fixes: 0de1f4c6 ("Add way to query server fs info for smb3")
      Cc: stable@vger.kernel.org
      Signed-off-by: NDavid Disseldorp <ddiss@suse.de>
      Signed-off-by: NSteve French <smfrench@gmail.com>
      d8a6e505