1. 28 Sep, 2016 10 commits
    • fs/aio.c: eliminate redundant loads in put_aio_ring_file · de04e769
      Rasmus Villemoes authored
      Using a local variable we can prevent gcc from reloading
      aio_ring_file->f_inode->i_mapping twice, eliminating 2x2 dependent
      loads.
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
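The idea in this commit can be sketched in user space. The block below is only an illustration under stated assumptions: `mapping_like`, `inode_like`, `file_like`, `lock_like()` and `unlock_like()` are hypothetical stand-ins, not the kernel's types or locking primitives. It contrasts spelling out the pointer chain at every use with walking it once into a local variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures involved. */
struct mapping_like { int lock_depth; void *private_data; };
struct inode_like   { struct mapping_like *i_mapping; };
struct file_like    { struct inode_like *f_inode; };

/* Stand-ins for lock/unlock calls. Real locking primitives act as
 * barriers, so a chain spelled out again after them has to be re-read. */
static void lock_like(struct mapping_like *m)   { m->lock_depth++; }
static void unlock_like(struct mapping_like *m) { m->lock_depth--; }

/* Before: file->f_inode->i_mapping is spelled out three times, so the
 * two dependent loads may be repeated for the second and third use. */
static void detach_reloading(struct file_like *file)
{
    assert(file != NULL);
    lock_like(file->f_inode->i_mapping);
    file->f_inode->i_mapping->private_data = NULL;
    unlock_like(file->f_inode->i_mapping);
}

/* After: walk the chain once and reuse the cached pointer. */
static void detach_cached(struct file_like *file)
{
    struct mapping_like *mapping;

    assert(file != NULL);
    mapping = file->f_inode->i_mapping;
    lock_like(mapping);
    mapping->private_data = NULL;
    unlock_like(mapping);
}
```

Both variants have the same observable effect; the difference is only in how many loads the compiler is forced to emit.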
    • fs/internal.h: add const to ns_dentry_operations declaration · be218aa2
      Rasmus Villemoes authored
      The actual definition in fs/nsfs.c is already const.
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • compat: remove compat_printk() · 9dcfcda5
      Arnd Bergmann authored
      After 7e8e385a ("x86/compat: Remove sys32_vm86_warning"), this
      function has become unused, so we can remove it as well.
      
      Link: http://lkml.kernel.org/r/20160617142903.3070388-1-arnd@arndb.de
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • fs/buffer.c: make __getblk_slow() static · 0026ba40
      Eric Biggers authored
      __getblk_slow() was exported to modules in commit 3b5e6454
      ("fs/buffer.c: support buffer cache allocations with gfp modifiers").
      This seems to have been a mistake, as no users were introduced nor was
      the function declared in a header.  Change it back to 'static'.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • proc: unsigned file descriptors · 771187d6
      Alexey Dobriyan authored
      Make struct proc_inode::fd unsigned.
      
      This allows better code generation on x86_64 (fewer sign extensions).
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs/file: more unsigned file descriptors · 9b80a184
      Alexey Dobriyan authored
      Propagate unsignedness for a grand total of 149 bytes:
      
      	$ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux
      	add/remove: 0/0 grow/shrink: 0/10 up/down: 0/-149 (-149)
      	function                                     old     new   delta
      	set_close_on_exec                             99      98      -1
      	put_files_struct                             201     200      -1
      	get_close_on_exec                             59      58      -1
      	do_prlimit                                   498     497      -1
      	do_execveat_common.isra                     1662    1661      -1
      	__close_fd                                   178     173      -5
      	do_dup2                                      219     204     -15
      	seq_show                                     685     660     -25
      	__alloc_fd                                   384     357     -27
      	dup_fd                                       718     646     -72
      
      It mostly comes from converting "unsigned int" to "long" for bit operations.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fs: compat: remove redundant check of nr_segs · 85e7340f
      Shawn Lin authored
      nr_segs should never be less than zero as its type
      is unsigned long, so let's remove this check.
      Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
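A short sketch of why a "less than zero" test on an unsigned count is dead code. The function name and the `MAX_SEGS_LIKE` cap are illustrative assumptions, not the kernel's identifiers; the point is only that a "negative" argument arrives as a very large unsigned value, which the upper-bound check already rejects.

```c
#include <assert.h>
#include <limits.h>

/* Illustrative cap on the number of segments (an assumption made for
 * this sketch, standing in for whatever limit the real code enforces). */
#define MAX_SEGS_LIKE 1024UL

/* A removed check of the form
 *
 *     if (nr_segs < 0) ...
 *
 * can never fire: unsigned long has no negative values. Passing -1
 * converts it to ULONG_MAX, which the range check below catches. */
static int segs_invalid(unsigned long nr_segs)
{
    return nr_segs > MAX_SEGS_LIKE;
}
```

For example, `segs_invalid((unsigned long)-1)` is rejected by the upper bound alone, so no separate sign check is needed.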
    • cachefiles: Fix attempt to read i_blocks after deleting file [ver #2] · a818101d
      David Howells authored
      A NULL-pointer dereference happens in cachefiles_mark_object_inactive()
      when it tries to read i_blocks so that it can tell the cachefilesd daemon
      how much space it's making available.
      
      The problem is that cachefiles_drop_object() calls
      cachefiles_mark_object_inactive() after calling cachefiles_delete_object()
      because the object being marked active staves off attempts to (re-)use the
      file at that filename until after it has been deleted.  This means that
      d_inode is NULL by the time we come to try to access it.
      
      To fix the problem, have the caller of cachefiles_mark_object_inactive()
      supply the number of blocks freed up.
      
      Without this, the following oops may occur:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
      IP: [<ffffffffa06c5cc1>] cachefiles_mark_object_inactive+0x61/0xb0 [cachefiles]
      ...
      CPU: 11 PID: 527 Comm: kworker/u64:4 Tainted: G          I    ------------   3.10.0-470.el7.x86_64 #1
      Hardware name: Hewlett-Packard HP Z600 Workstation/0B54h, BIOS 786G4 v03.19 03/11/2011
      Workqueue: fscache_object fscache_object_work_func [fscache]
      task: ffff880035edaf10 ti: ffff8800b77c0000 task.ti: ffff8800b77c0000
      RIP: 0010:[<ffffffffa06c5cc1>] cachefiles_mark_object_inactive+0x61/0xb0 [cachefiles]
      RSP: 0018:ffff8800b77c3d70  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff8800bf6cc400 RCX: 0000000000000034
      RDX: 0000000000000000 RSI: ffff880090ffc710 RDI: ffff8800bf761ef8
      RBP: ffff8800b77c3d88 R08: 2000000000000000 R09: 0090ffc710000000
      R10: ff51005d2ff1c400 R11: 0000000000000000 R12: ffff880090ffc600
      R13: ffff8800bf6cc520 R14: ffff8800bf6cc400 R15: ffff8800bf6cc498
      FS:  0000000000000000(0000) GS:ffff8800bb8c0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000098 CR3: 00000000019ba000 CR4: 00000000000007e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Stack:
       ffff880090ffc600 ffff8800bf6cc400 ffff8800867df140 ffff8800b77c3db0
       ffffffffa06c48cb ffff880090ffc600 ffff880090ffc180 ffff880090ffc658
       ffff8800b77c3df0 ffffffffa085d846 ffff8800a96b8150 ffff880090ffc600
      Call Trace:
       [<ffffffffa06c48cb>] cachefiles_drop_object+0x6b/0xf0 [cachefiles]
       [<ffffffffa085d846>] fscache_drop_object+0xd6/0x1e0 [fscache]
       [<ffffffffa085d615>] fscache_object_work_func+0xa5/0x200 [fscache]
       [<ffffffff810a605b>] process_one_work+0x17b/0x470
       [<ffffffff810a6e96>] worker_thread+0x126/0x410
       [<ffffffff810a6d70>] ? rescuer_thread+0x460/0x460
       [<ffffffff810ae64f>] kthread+0xcf/0xe0
       [<ffffffff810ae580>] ? kthread_create_on_node+0x140/0x140
       [<ffffffff81695418>] ret_from_fork+0x58/0x90
       [<ffffffff810ae580>] ? kthread_create_on_node+0x140/0x140
      
      The oopsing code shows:
      
      	callq  0xffffffff810af6a0 <wake_up_bit>
      	mov    0xf8(%r12),%rax
      	mov    0x30(%rax),%rax
      	mov    0x98(%rax),%rax   <---- oops here
      	lock add %rax,0x130(%rbx)
      
      where this is:
      
      	d_backing_inode(object->dentry)->i_blocks
      
      Fixes: a5b3a80b (CacheFiles: Provide read-and-reset release counters for cachefilesd)
      Reported-by: Jianhong Yin <jiyin@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Steve Dickson <steved@redhat.com>
      cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • cifs: don't use memcpy() to copy struct iov_iter · fc56b983
      Al Viro authored
      it's not 70s anymore.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
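The commit title contrasts memcpy() with ordinary struct copying. As a minimal sketch, assuming a hypothetical `iter_like` structure (the real struct iov_iter has different fields), C has supported whole-struct assignment for a long time, and it keeps the copy type-checked:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in; the real struct iov_iter has different fields. */
struct iter_like {
    int type;
    size_t iov_offset;
    size_t count;
    const void *iov;
};

/* Works, but the compiler cannot check that both sides have the same type
 * or that the size argument matches. */
static struct iter_like copy_with_memcpy(const struct iter_like *src)
{
    struct iter_like dst;

    assert(src != NULL);
    memcpy(&dst, src, sizeof(dst));
    return dst;
}

/* Plain struct assignment: same result, checked by the compiler. */
static struct iter_like copy_with_assignment(const struct iter_like *src)
{
    assert(src != NULL);
    struct iter_like dst = *src;
    return dst;
}
```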
    • get rid of separate multipage fault-in primitives · 4bce9f6e
      Al Viro authored
      * the only remaining callers of "short" fault-ins are just as happy with generic
      variants (both in lib/iov_iter.c); switch them to multipage variants, kill the
      "short" ones
      * rename the multipage variants to now available plain ones.
      * get rid of compat macro defining iov_iter_fault_in_multipage_readable by
      expanding it in its only user.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  2. 22 Sep, 2016 2 commits
  3. 21 Sep, 2016 2 commits
  4. 20 Sep, 2016 10 commits
  5. 16 Sep, 2016 4 commits
    • configfs: Return -EFBIG from configfs_write_bin_file. · 42857cf5
      Phil Turnbull authored
      The check for writing more than cb_max_size bytes does not 'goto out' so
      it is a no-op which allows users to vmalloc an arbitrary amount.
      
      Fixes: 03607ace ("configfs: implement binary attributes")
      Cc: stable@kernel.org
      Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
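The bug pattern described here (an error code that is set but never acted on) can be sketched as follows. The function names and parameters are hypothetical; this is not the configfs code, only the control-flow shape the message describes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Buggy shape: the size check records -EFBIG but does not leave the
 * function, so the later assignment overwrites the error and the
 * oversized write proceeds. */
static long write_buggy(size_t pos, size_t count, size_t max_size)
{
    long len;

    if (max_size && pos + count > max_size) {
        len = -EFBIG;        /* recorded, then lost */
    }
    /* ... the buffer would be grown to pos + count here ... */
    len = (long)count;
    return len;
}

/* Fixed shape: jump to the exit path as soon as the limit is exceeded. */
static long write_fixed(size_t pos, size_t count, size_t max_size)
{
    long len;

    if (max_size && pos + count > max_size) {
        len = -EFBIG;
        goto out;            /* skip the allocation and the copy */
    }
    /* ... the buffer would be grown to pos + count here ... */
    len = (long)count;
out:
    return len;
}
```

With a limit of 10 bytes, a 100-byte write is accepted by the buggy shape and rejected with -EFBIG by the fixed one.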
    • aio: mark AIO pseudo-fs noexec · 22f6b4d3
      Jann Horn authored
      This ensures that do_mmap() won't implicitly make AIO memory mappings
      executable if the READ_IMPLIES_EXEC personality flag is set.  Such
      behavior is problematic because the security_mmap_file LSM hook doesn't
      catch this case, potentially permitting an attacker to bypass a W^X
      policy enforced by SELinux.
      
      I have tested the patch on my machine.
      
      To test the behavior, compile and run this:
      
          #define _GNU_SOURCE
          #include <unistd.h>
          #include <sys/personality.h>
          #include <linux/aio_abi.h>
          #include <err.h>
          #include <stdlib.h>
          #include <stdio.h>
          #include <sys/syscall.h>
      
          int main(void) {
              personality(READ_IMPLIES_EXEC);
              aio_context_t ctx = 0;
              if (syscall(__NR_io_setup, 1, &ctx))
                  err(1, "io_setup");
      
              char cmd[1000];
              sprintf(cmd, "cat /proc/%d/maps | grep -F '/[aio]'",
                  (int)getpid());
              system(cmd);
              return 0;
          }
      
      In the output, "rw-s" is good, "rwxs" is bad.
      Signed-off-by: Jann Horn <jann@thejh.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vfs: cap dedupe request structure size at PAGE_SIZE · b71dbf10
      Darrick J. Wong authored
      Kirill A Shutemov reports that the kernel doesn't try to cap dest_count
      in any way, and uses the number to allocate kernel memory.  This causes
      high order allocation warnings in the kernel log if someone passes in a
      big enough value.  We should clamp the allocation at PAGE_SIZE to avoid
      stressing the VM.
      
      The two existing users of the dedupe ioctl never send more than 120
      requests, so we can safely clamp dest_range at PAGE_SIZE, because with
      4k pages we can handle up to 127 dedupe candidates.  Given the max
      extent length of 16MB, we can end up doing 2GB of IO which is plenty.
      
      [ Note: the "offsetof()" can't overflow, because 'count' is just a
        16-bit integer.  That's not obvious in the limited context of the
        patch, so I'm noting it here because it made me go look.  - Linus ]
      Reported-by: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
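The "127 candidates" and "2GB" figures above can be reproduced with a little arithmetic. The two structures below are assumed mirrors of the request header and per-destination entry (24 and 32 bytes respectively); the names with the `_like` suffix are made up for this sketch and the field layout should be checked against the real UAPI header before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed mirror of the fixed part of the dedupe request header. */
struct dedupe_range_like {
    uint64_t src_offset;
    uint64_t src_length;
    uint16_t dest_count;
    uint16_t reserved1;
    uint32_t reserved2;
};

/* Assumed mirror of one per-destination entry. */
struct dedupe_info_like {
    int64_t  dest_fd;
    uint64_t dest_offset;
    uint64_t bytes_deduped;
    int32_t  status;
    uint32_t reserved;
};

_Static_assert(sizeof(struct dedupe_range_like) == 24, "header size");
_Static_assert(sizeof(struct dedupe_info_like) == 32, "entry size");

/* Number of entries that fit after the header within one page. */
static size_t max_dedupe_entries(size_t page_size)
{
    assert(page_size >= sizeof(struct dedupe_range_like));
    return (page_size - sizeof(struct dedupe_range_like))
           / sizeof(struct dedupe_info_like);
}

/* Upper bound on I/O per request in MiB for a given max extent length. */
static size_t max_io_mib(size_t entries, size_t extent_mib)
{
    return entries * extent_mib;
}
```

With 4k pages this gives (4096 - 24) / 32 = 127 entries, and 127 entries of 16MB each come to 2032MB, which is the "2GB" quoted in the message.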
    • vfs: fix return type of ioctl_file_dedupe_range · 5297e0f0
      Darrick J. Wong authored
      All the VFS functions in the dedupe ioctl path return int status, so
      the ioctl handler ought to as well.
      
      Found by Coverity, CID 1350952.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 12 Sep, 2016 1 commit
  7. 10 Sep, 2016 7 commits
    • fscrypto: require write access to mount to set encryption policy · ba63f23d
      Eric Biggers authored
      Since setting an encryption policy requires writing metadata to the
      filesystem, it should be guarded by mnt_want_write/mnt_drop_write.
      Otherwise, a user could cause a write to a frozen or readonly
      filesystem.  This was handled correctly by f2fs but not by ext4.  Make
      fscrypt_process_policy() handle it rather than relying on the filesystem
      to get it right.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs}
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • S
    • Compare prepaths when comparing superblocks · c1d8b24d
      Sachin Prabhu authored
      The patch
      fs/cifs: make share unaccessible at root level mountable
      makes use of prepaths when any component of the underlying path is
      inaccessible.
      
      When mounting 2 separate shares that have different prepaths but are
      otherwise similar, we end up sharing superblocks when we shouldn't be
      doing so.
      Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
      Tested-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
    • Fix memory leaks in cifs_do_mount() · 4214ebf4
      Sachin Prabhu authored
      Fix memory leaks introduced by the patch
      fs/cifs: make share unaccessible at root level mountable
      
      Also move allocation of cifs_sb->prepath to cifs_setup_cifs_sb().
      Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
      Tested-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
    • fscrypto: only allow setting encryption policy on directories · 002ced4b
      Eric Biggers authored
      The FS_IOC_SET_ENCRYPTION_POLICY ioctl allowed setting an encryption
      policy on nondirectory files.  This was unintentional, and in the case
      of nonempty regular files did not behave as expected because existing
      data was not actually encrypted by the ioctl.
      
      In the case of ext4, the user could also trigger filesystem errors in
      ->empty_dir(), e.g. due to mismatched "directory" checksums when the
      kernel incorrectly tried to interpret a regular file as a directory.
      
      This bug affected ext4 with kernels v4.8-rc1 or later and f2fs with
      kernels v4.6 and later.  It appears that older kernels only permitted
      directories and that the check was accidentally lost during the
      refactoring to share the file encryption code between ext4 and f2fs.
      
      This patch restores the !S_ISDIR() check that was present in older
      kernels.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • fscrypto: add authorization check for setting encryption policy · 163ae1c6
      Eric Biggers authored
      On an ext4 or f2fs filesystem with file encryption supported, a user
      could set an encryption policy on any empty directory(*) to which they
      had readonly access.  This is obviously problematic, since such a
      directory might be owned by another user and the new encryption policy
      would prevent that other user from creating files in their own directory
      (for example).
      
      Fix this by requiring inode_owner_or_capable() permission to set an
      encryption policy.  This means that either the caller must own the file,
      or the caller must have the capability CAP_FOWNER.
      
      (*) Or also on any regular file, for f2fs v4.6 and later and ext4
          v4.8-rc1 and later; a separate bug fix is coming for that.
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs}
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • mm: fix show_smap() for zone_device-pmd ranges · ca120cf6
      Dan Williams authored
      Attempting to dump /proc/<pid>/smaps for a process with pmd dax mappings
      currently results in the following VM_BUG_ONs:
      
       kernel BUG at mm/huge_memory.c:1105!
       task: ffff88045f16b140 task.stack: ffff88045be14000
       RIP: 0010:[<ffffffff81268f9b>]  [<ffffffff81268f9b>] follow_trans_huge_pmd+0x2cb/0x340
       [..]
       Call Trace:
        [<ffffffff81306030>] smaps_pte_range+0xa0/0x4b0
        [<ffffffff814c2755>] ? vsnprintf+0x255/0x4c0
        [<ffffffff8123c46e>] __walk_page_range+0x1fe/0x4d0
        [<ffffffff8123c8a2>] walk_page_vma+0x62/0x80
        [<ffffffff81307656>] show_smap+0xa6/0x2b0
      
       kernel BUG at fs/proc/task_mmu.c:585!
       RIP: 0010:[<ffffffff81306469>]  [<ffffffff81306469>] smaps_pte_range+0x499/0x4b0
       Call Trace:
        [<ffffffff814c2795>] ? vsnprintf+0x255/0x4c0
        [<ffffffff8123c46e>] __walk_page_range+0x1fe/0x4d0
        [<ffffffff8123c8a2>] walk_page_vma+0x62/0x80
        [<ffffffff81307696>] show_smap+0xa6/0x2b0
      
      These locations are sanity checking page flags that must be set for an
      anonymous transparent huge page, but are not set for the zone_device
      pages associated with dax mappings.
      
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  8. 06 Sep, 2016 2 commits
    • btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress · ce129655
      Wang Xiaoguang authored
      In btrfs_async_reclaim_metadata_space(), we use ticket's address to
      determine whether asynchronous metadata reclaim work is making progress.
      
      	ticket = list_first_entry(&space_info->tickets,
      				  struct reserve_ticket, list);
      	if (last_ticket == ticket) {
      		flush_state++;
      	} else {
      		last_ticket = ticket;
      		flush_state = FLUSH_DELAYED_ITEMS_NR;
      		if (commit_cycles)
      			commit_cycles--;
      	}
      
      But this is wrong: we should not rely on a local variable's address for
      this check, because addresses may repeat. In my test environment, I
      dd one 168MB file into a 256MB fs and found that, for this file, every
      time wait_reserve_ticket() is called, the local variable ticket has the
      same address.
      
      For the code above, assume a previous ticket's address is addrA, so
      last_ticket is addrA. btrfs_async_reclaim_metadata_space() finishes this
      ticket and wakes it up, then another ticket is added with the same
      address addrA. Now last_ticket equals the current ticket, so the current
      ticket's flush work starts from the current flush_state rather than the
      initial FLUSH_DELAYED_ITEMS_NR, which may result in ENOSPC issues (I
      have seen this on my test machine).
      Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
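The address-reuse problem and the counter-based alternative named in the commit title can be sketched in user space. Everything below is a simplified assumption: `ticket_like`, `space_info_like` and the helper functions are hypothetical, and the exact points where the real code updates its tickets_id are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ticket_like { uint64_t bytes; };

struct space_info_like {
    struct ticket_like *first;   /* head of the pending-ticket list */
    uint64_t tickets_id;         /* assumed: bumped when a ticket completes */
};

/* Finish the ticket at the head of the list and record the progress. */
static void complete_first_ticket(struct space_info_like *si)
{
    assert(si != NULL && si->first != NULL);
    si->first = NULL;
    si->tickets_id++;
}

/* Unreliable: a new ticket may live at the address of the one that was
 * just completed, so "same pointer" does not mean "same ticket". */
static int no_progress_by_address(const struct ticket_like *last,
                                  const struct space_info_like *si)
{
    return last == si->first;
}

/* Robust: the counter changes even when the address is reused. */
static int no_progress_by_id(uint64_t last_id,
                             const struct space_info_like *si)
{
    return last_id == si->tickets_id;
}
```

If a second ticket is queued in the same storage as a completed one, the pointer comparison reports "no progress" while the counter comparison correctly reports progress.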
    • Btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns · cbd60aa7
      Chris Mason authored
      We use a btrfs_log_ctx structure to pass information into the
      tree log commit, and get error values out.  It gets added to a per
      log-transaction list which we walk when things go bad.
      
      Commit d1433deb added an optimization to skip waiting for the log
      commit, but didn't take root_log_ctx out of the list.  This
      patch makes sure we remove things before exiting.
      Signed-off-by: Chris Mason <clm@fb.com>
      Fixes: d1433deb
      cc: stable@vger.kernel.org # 3.15+
  9. 05 Sep, 2016 2 commits
    • btrfs: do not decrease bytes_may_use when replaying extents · ed7a6948
      Wang Xiaoguang authored
      When replaying extents, there is no need to update bytes_may_use
      in btrfs_alloc_logged_file_extent(), otherwise it'll trigger a
      WARN_ON about bytes_may_use.
      
      Fixes: ("btrfs: update btrfs_space_info's bytes_may_use timely")
      Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • ceph: do not modify fi->frag in need_reset_readdir() · 0f5aa88a
      Nicolas Iooss authored
      Commit f3c4ebe6 ("ceph: using hash value to compose dentry offset")
      modified "if (fpos_frag(new_pos) != fi->frag)" to "if (fi->frag |=
      fpos_frag(new_pos))" in need_reset_readdir(), thus replacing a
      comparison operator with an assignment one.
      
      This looks like a typo which is reported by clang when building the
      kernel with some warning flags:
      
          fs/ceph/dir.c:600:22: error: using the result of an assignment as a
          condition without parentheses [-Werror,-Wparentheses]
                  } else if (fi->frag |= fpos_frag(new_pos)) {
                             ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
          fs/ceph/dir.c:600:22: note: place parentheses around the assignment
          to silence this warning
                  } else if (fi->frag |= fpos_frag(new_pos)) {
                                      ^
                             (                             )
          fs/ceph/dir.c:600:22: note: use '!=' to turn this compound
          assignment into an inequality comparison
                  } else if (fi->frag |= fpos_frag(new_pos)) {
                                      ^~
                                      !=
      
      Fixes: f3c4ebe6 ("ceph: using hash value to compose dentry offset")
      Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
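The difference between the two operators is easy to demonstrate in isolation. The helpers below are hypothetical and operate on a plain unsigned int rather than the ceph structures; they only show that `|=` both mutates its left operand and yields a truth value unrelated to inequality.

```c
#include <assert.h>
#include <stddef.h>

/* Buggy shape: "|=" ORs the new value into *frag and tests the result.
 * The extra parentheses only silence the compiler warning. */
static int frag_changed_buggy(unsigned int *frag, unsigned int new_frag)
{
    assert(frag != NULL);
    if ((*frag |= new_frag))
        return 1;
    return 0;
}

/* Intended shape: a pure comparison that leaves *frag untouched. */
static int frag_changed_fixed(const unsigned int *frag, unsigned int new_frag)
{
    assert(frag != NULL);
    return *frag != new_frag;
}
```

With `*frag == 2` and `new_frag == 2`, the comparison reports no change, while the compound assignment reports a change simply because the OR result is nonzero. With `*frag == 1` and `new_frag == 2`, the buggy variant also silently rewrites `*frag` to 3.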