1. 20 Apr, 2015 1 commit
  2. 09 Apr, 2015 3 commits
  3. 07 Apr, 2015 2 commits
    • ioctx_alloc(): fix vma (and file) leak on failure · deeb8525
      Al Viro committed
      If we fail past the aio_setup_ring(), we need to destroy the
      mapping.  We don't need to care about anybody having found ctx,
      or added requests to it, since the last failure exit is exactly
      the failure to make ctx visible to lookups.
      
      Reproducer (based on one by Joe Mario <jmario@redhat.com>):
      
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>
      #include <libaio.h>

      void count(char *p)
      {
      	char s[80];
      	printf("%s: ", p);
      	fflush(stdout);
      	sprintf(s, "/bin/cat /proc/%d/maps|/bin/fgrep -c '/[aio] (deleted)'", getpid());
      	system(s);
      }
      
      int main()
      {
      	io_context_t *ctx;
      	int created, limit, i, destroyed;
      	FILE *f;
      
      	count("before");
      	if ((f = fopen("/proc/sys/fs/aio-max-nr", "r")) == NULL)
      		perror("opening aio-max-nr");
      	else if (fscanf(f, "%d", &limit) != 1)
      		fprintf(stderr, "can't parse aio-max-nr\n");
      	else if ((ctx = calloc(limit, sizeof(io_context_t))) == NULL)
      		perror("allocating aio_context_t array");
      	else {
      		for (i = 0, created = 0; i < limit; i++) {
      			if (io_setup(1000, ctx + created) == 0)
      				created++;
      		}
      		for (i = 0, destroyed = 0; i < created; i++)
      			if (io_destroy(ctx[i]) == 0)
      				destroyed++;
      		printf("created %d, failed %d, destroyed %d\n",
      			created, limit - created, destroyed);
      		count("after");
      	}
      }
      Found-by: Joe Mario <jmario@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fix mremap() vs. ioctx_kill() race · b2edffdd
      Al Viro committed
      Teach the ->mremap() method to return an error and have it fail for
      aio mappings that are in the process of being killed.
      
      Note that in case of ->mremap() failure we need to undo move_page_tables()
      we'd already done; we could call ->mremap() first, but then the failure of
      move_page_tables() would require undoing whatever _successful_ ->mremap()
      has done, which would be a lot more headache in general.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  4. 01 Apr, 2015 8 commits
  5. 31 Mar, 2015 1 commit
  6. 27 Mar, 2015 1 commit
  7. 26 Mar, 2015 7 commits
  8. 22 Mar, 2015 1 commit
  9. 21 Mar, 2015 7 commits
    • cifs: fix use-after-free bug in find_writable_file · e1e9bda2
      David Disseldorp committed
      Under intermittent network outages, find_writable_file() is susceptible
      to the following race condition, which results in a use-after-free in
      the cifs_writepages code-path:
      
      Thread 1                                        Thread 2
      ========                                        ========
      
      inv_file = NULL
      refind = 0
      spin_lock(&cifs_file_list_lock)
      
      // invalidHandle found on openFileList
      
      inv_file = open_file
      // inv_file->count currently 1
      
      cifsFileInfo_get(inv_file)
      // inv_file->count = 2
      
      spin_unlock(&cifs_file_list_lock);
      
      cifs_reopen_file()                              cifs_close()
      // fails (rc != 0)                              ->cifsFileInfo_put()
                                                      spin_lock(&cifs_file_list_lock)
                                                      // inv_file->count = 1
                                                      spin_unlock(&cifs_file_list_lock)
      
      spin_lock(&cifs_file_list_lock);
      list_move_tail(&inv_file->flist,
            &cifs_inode->openFileList);
      spin_unlock(&cifs_file_list_lock);
      
      cifsFileInfo_put(inv_file);
      ->spin_lock(&cifs_file_list_lock)
      
        // inv_file->count = 0
        list_del(&cifs_file->flist);
        // cleanup!!
        kfree(cifs_file);
      
        spin_unlock(&cifs_file_list_lock);
      
      spin_lock(&cifs_file_list_lock);
      ++refind;
      // refind = 1
      goto refind_writable;
      
      At this point we loop back through with an invalid inv_file pointer
      and a refind value of 1. On second pass, inv_file is not overwritten on
      openFileList traversal, and is subsequently dereferenced.
      Signed-off-by: David Disseldorp <ddiss@suse.de>
      Reviewed-by: Jeff Layton <jlayton@samba.org>
      CC: <stable@vger.kernel.org>
      Signed-off-by: Steve French <smfrench@gmail.com>
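
      The interleaving described in this message can be replayed in a single
      thread. The sketch below is a simplified user-space model, not kernel
      code: struct file_info, file_get(), file_put() and replay_race() are
      hypothetical names, and the "freed" flag stands in for kfree(). Clearing
      the pointer before the retry is shown as one possible remedy, which is an
      assumption rather than a statement of what the patch does.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical stand-in for cifsFileInfo: a refcount plus a "freed"
       * flag so the model can observe the free instead of calling kfree(). */
      struct file_info {
      	int count;
      	bool freed;
      };

      /* Takes one reference. */
      static void file_get(struct file_info *f)
      {
      	f->count++;
      }

      /* Drops one reference; the last put marks the object as freed. */
      static void file_put(struct file_info *f)
      {
      	if (--f->count == 0)
      		f->freed = true;
      }

      /*
       * Replays the steps from the commit message in order and returns the
       * inv_file value that would be carried into the second pass.
       * With reset_on_retry set, the pointer is cleared before retrying
       * (an assumed remedy, labeled as such).
       */
      static struct file_info *replay_race(struct file_info *open_file,
      				     bool reset_on_retry)
      {
      	struct file_info *inv_file = open_file;	/* count == 1 */

      	file_get(inv_file);	/* thread 1: count = 2 */
      	/* thread 1: cifs_reopen_file() fails here */
      	file_put(open_file);	/* thread 2: cifs_close(), count = 1 */
      	file_put(inv_file);	/* thread 1: count = 0, object freed */

      	if (reset_on_retry)
      		inv_file = NULL;
      	return inv_file;	/* seen after "goto refind_writable" */
      }
      ```

      Without the reset, the returned pointer still refers to an object whose
      freed flag is set, which is the stale dereference the message describes.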
    • cifs: smb2_clone_range() - exit on unhandled error · 2477bc58
      Sachin Prabhu committed
      While attempting to clone a file on a Samba server, we receive a
      STATUS_INVALID_DEVICE_REQUEST. This is mapped to -EOPNOTSUPP, which
      isn't handled in smb2_clone_range(). We end up spinning in the while
      loop, making the same call to the Samba server over and over again.

      The proposed fix is to exit and return the error value when an
      unhandled error is encountered.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
      Signed-off-by: Steve French <steve.french@primarydata.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
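
      The looping problem can be illustrated with a generic chunked-copy loop.
      This is a minimal sketch under stated assumptions: clone_in_chunks(),
      copy_fn and always_unsupported() are hypothetical names, and treating
      -EINVAL as the only retryable error is an assumption made for the
      example, not taken from smb2_clone_range().

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <stddef.h>

      /* Hypothetical callback standing in for the request sent to the server. */
      typedef int (*copy_fn)(size_t offset, size_t len, void *ctx);

      /*
       * Copies `total` bytes in chunks. Only -EINVAL is treated as handled
       * here (retry with a smaller chunk); that choice is an assumption.
       * Any other error ends the loop and is returned to the caller.
       */
      static int clone_in_chunks(size_t total, size_t chunk, copy_fn copy,
      			   void *ctx)
      {
      	size_t done = 0;

      	if (chunk == 0)
      		return -EINVAL;

      	while (done < total) {
      		size_t len = total - done < chunk ? total - done : chunk;
      		int rc = copy(done, len, ctx);

      		if (rc == 0)
      			done += len;
      		else if (rc == -EINVAL && chunk > 1)
      			chunk /= 2;	/* handled error: retry smaller */
      		else
      			return rc;	/* unhandled error: stop looping */
      	}
      	return 0;
      }

      /* Test double: always fails with -EOPNOTSUPP and counts its calls. */
      static int always_unsupported(size_t offset, size_t len, void *ctx)
      {
      	(void)offset;
      	(void)len;
      	++*(int *)ctx;
      	return -EOPNOTSUPP;
      }
      ```

      With the final else branch in place, a server that keeps answering
      -EOPNOTSUPP is contacted once and the error reaches the caller.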
    • NFSD: Put exports after nfsd4_layout_verify fail · a1420384
      Kinglong Mee committed
      Fix commit 9cf514cc (nfsd: implement pNFS operations).
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • NFSD: Error out when register_shrinker() fail · a68465c9
      Kinglong Mee committed
      If register_shrinker() fails, nfsd later hits a NULL pointer dereference, as shown below:
      
      [ 9250.875465] nfsd: last server has exited, flushing export cache
      [ 9251.427270] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [ 9251.427393] IP: [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0
      [ 9251.427579] PGD 13e4d067 PUD 13e4c067 PMD 0
      [ 9251.427633] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ 9251.427706] Modules linked in: ip6t_rpfilter ip6t_REJECT bnep bluetooth xt_conntrack cfg80211 rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw btrfs xfs microcode ppdev serio_raw pcspkr xor libcrc32c raid6_pq e1000 parport_pc parport i2c_piix4 i2c_core nfsd(OE-) auth_rpcgss nfs_acl lockd sunrpc(E) ata_generic pata_acpi
      [ 9251.428240] CPU: 0 PID: 1557 Comm: rmmod Tainted: G           OE 3.16.0-rc2+ #22
      [ 9251.428366] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
      [ 9251.428496] task: ffff880000849540 ti: ffff8800136f4000 task.ti: ffff8800136f4000
      [ 9251.428593] RIP: 0010:[<ffffffff8136fc29>]  [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0
      [ 9251.428696] RSP: 0018:ffff8800136f7ea0  EFLAGS: 00010207
      [ 9251.428751] RAX: 0000000000000000 RBX: ffffffffa0116d48 RCX: dead000000200200
      [ 9251.428814] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa0116d48
      [ 9251.428876] RBP: ffff8800136f7ea0 R08: ffff8800136f4000 R09: 0000000000000001
      [ 9251.428939] R10: 8080808080808080 R11: 0000000000000000 R12: ffffffffa011a5a0
      [ 9251.429002] R13: 0000000000000800 R14: 0000000000000000 R15: 00000000018ac090
      [ 9251.429064] FS:  00007fb9acef0740(0000) GS:ffff88003fa00000(0000) knlGS:0000000000000000
      [ 9251.429164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 9251.429221] CR2: 0000000000000000 CR3: 0000000031a17000 CR4: 00000000001407f0
      [ 9251.429306] Stack:
      [ 9251.429410]  ffff8800136f7eb8 ffffffff8136fcdd ffffffffa0116d20 ffff8800136f7ed0
      [ 9251.429511]  ffffffff8118a0f2 0000000000000000 ffff8800136f7ee0 ffffffffa00eb765
      [ 9251.429610]  ffff8800136f7ef0 ffffffffa010e93c ffff8800136f7f78 ffffffff81104ac2
      [ 9251.429709] Call Trace:
      [ 9251.429755]  [<ffffffff8136fcdd>] list_del+0xd/0x30
      [ 9251.429896]  [<ffffffff8118a0f2>] unregister_shrinker+0x22/0x40
      [ 9251.430037]  [<ffffffffa00eb765>] nfsd_reply_cache_shutdown+0x15/0x90 [nfsd]
      [ 9251.430106]  [<ffffffffa010e93c>] exit_nfsd+0x9/0x6cd [nfsd]
      [ 9251.430192]  [<ffffffff81104ac2>] SyS_delete_module+0x162/0x200
      [ 9251.430280]  [<ffffffff81013b69>] ? do_notify_resume+0x59/0x90
      [ 9251.430395]  [<ffffffff816f2369>] system_call_fastpath+0x16/0x1b
      [ 9251.430457] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08
      [ 9251.430691] RIP  [<ffffffff8136fc29>] __list_del_entry+0x29/0xd0
      [ 9251.430755]  RSP <ffff8800136f7ea0>
      [ 9251.430805] CR2: 0000000000000000
      [ 9251.431033] ---[ end trace 080f3050d082b4ea ]---
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
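
      The general pattern behind this fix, propagating a registration failure
      from init so that shutdown never unregisters something that was never
      registered, can be sketched in user space. Everything below is a
      hypothetical model: struct cache_state, fake_register(), cache_init()
      and cache_shutdown() are illustrative names, not kernel APIs.

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <stdbool.h>

      /* Hypothetical state; none of these names are kernel APIs. */
      struct cache_state {
      	bool table_allocated;
      	bool shrinker_registered;
      };

      /* Simulated registration that can be forced to fail. */
      static int fake_register(struct cache_state *s, bool fail)
      {
      	if (fail)
      		return -ENOMEM;
      	s->shrinker_registered = true;
      	return 0;
      }

      /* Init propagates the registration error and unwinds its own setup. */
      static int cache_init(struct cache_state *s, bool fail_register)
      {
      	int rc;

      	s->table_allocated = true;
      	rc = fake_register(s, fail_register);
      	if (rc) {
      		s->table_allocated = false;	/* undo earlier setup */
      		return rc;			/* caller must not run shutdown */
      	}
      	return 0;
      }

      /* Shutdown is only valid after a successful init. */
      static void cache_shutdown(struct cache_state *s)
      {
      	assert(s->shrinker_registered);
      	s->shrinker_registered = false;
      	s->table_allocated = false;
      }
      ```

      A caller that checks the return value of cache_init() never reaches
      cache_shutdown() in the failed case, which is the situation the oops
      above corresponds to.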
    • NFSD: Take care the return value from nfsd4_decode_stateid · db59c0ef
      Kinglong Mee committed
      Return the status when nfsd4_decode_stateid() fails.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • NFSD: Check layout type when returning client layouts · 6f8f28ec
      Kinglong Mee committed
      According to RFC5661:
      " When lr_returntype is LAYOUTRETURN4_FSID, the current filehandle is used
         to identify the file system and all layouts matching the client ID,
         the fsid of the file system, lora_layout_type, and lora_iomode are
         returned.  When lr_returntype is LAYOUTRETURN4_ALL, all layouts
         matching the client ID, lora_layout_type, and lora_iomode are
         returned and the current filehandle is not used. "
      
      When returning client layouts, always check layout type.
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    • NFSD: restore trace event lost in mismerge · 715a03d2
      Kinglong Mee committed
      31ef83dc "nfsd: add trace events" had a typo that dropped a trace
      event and replaced it by an incorrect recursive call to
      nfsd4_cb_layout_fail.  133d5582 "Subject: nfsd: don't recursively
      call nfsd4_cb_layout_fail" fixed the crash; this commit restores the
      tracepoint.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
  10. 20 Mar, 2015 1 commit
    • Subject: nfsd: don't recursively call nfsd4_cb_layout_fail · 133d5582
      Christoph Hellwig committed
      Due to a merge error when creating c5c707f9 ("nfsd: implement pNFS
      layout recalls"), we recursively call nfsd4_cb_layout_fail from itself,
      leading to stack overflows.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Fixes: c5c707f9 ("nfsd: implement pNFS layout recalls")
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      ---
       fs/nfsd/nfs4layouts.c | 2 --
       1 file changed, 2 deletions(-)
      
      diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
      index 3c1bfa1..1028a06 100644
      --- a/fs/nfsd/nfs4layouts.c
      +++ b/fs/nfsd/nfs4layouts.c
      @@ -587,8 +587,6 @@ nfsd4_cb_layout_fail(struct nfs4_layout_stateid *ls)
      
       	rpc_ntop((struct sockaddr *)&clp->cl_addr, addr_str, sizeof(addr_str));
      
      -	nfsd4_cb_layout_fail(ls);
      -
       	printk(KERN_WARNING
       		"nfsd: client %s failed to respond to layout recall. "
       		"  Fencing..\n", addr_str);
      --
      1.9.1
  11. 19 Mar, 2015 1 commit
    • fuse: explicitly set /dev/fuse file's private_data · 94e4fe2c
      Tom Van Braeckel committed
      The misc subsystem (which is used for /dev/fuse) initializes private_data to
      point to the misc device when a driver has registered a custom open file
      operation, and initializes it to NULL when a custom open file operation has
      *not* been provided.
      
      This subtle quirk is confusing, to the point where kernel code registers
      *empty* file open operations to have private_data point to the misc device
      structure. And it leads to bugs, where the addition or removal of a custom open
      file operation surprisingly changes the initial contents of a file's
      private_data structure.
      
      So to simplify things in the misc subsystem, a patch [1] has been proposed to
      *always* set the private_data to point to the misc device, instead of only
      doing this when a custom open file operation has been registered.
      
      But before this patch can be applied we need to modify drivers that make the
      assumption that a misc device file's private_data is initialized to NULL
      because they didn't register a custom open file operation, so they don't rely
      on this assumption anymore. FUSE uses private_data to store the fuse_conn and
      errors out if this is not initialized to NULL at mount time.
      
      Hence, we now set a file's private_data to NULL explicitly, to be independent
      of whatever value the misc subsystem initializes it to by default.
      
      [1] https://lkml.org/lkml/2014/12/4/939

      Reported-by: Giedrius Statkevicius <giedriuswork@gmail.com>
      Reported-by: Thierry Reding <thierry.reding@gmail.com>
      Signed-off-by: Tom Van Braeckel <tomvanbraeckel@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
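
      The idea of not depending on the caller's default can be shown with a
      reduced user-space model. This is only a sketch: struct file_model,
      dev_open_model() and mount_check_model() are hypothetical names that
      mirror the description above, not the FUSE or misc subsystem code.

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <stddef.h>

      /* Hypothetical, heavily reduced stand-in for the kernel's struct file. */
      struct file_model {
      	void *private_data;
      };

      /* Open handler: clear private_data whatever the caller preset. */
      static int dev_open_model(struct file_model *file)
      {
      	file->private_data = NULL;
      	return 0;
      }

      /* Mount-time check as described above: a non-NULL value is rejected. */
      static int mount_check_model(const struct file_model *file)
      {
      	return file->private_data ? -EINVAL : 0;
      }
      ```

      If the framework presets private_data to point at its own device
      structure, the mount-time check passes only after the open handler has
      cleared it.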
  12. 18 Mar, 2015 7 commits
    • ovl: upper fs should not be R/O · 71cbad7e
      hujianyang committed
      Now that multi-lower layer support has been introduced, users can mount
      a r/o partition as the leftmost lowerdir instead of using it as upperdir.
      A r/o upperdir may cause an error like
      
      	overlayfs: failed to create directory ./workdir/work
      
      during mount.
      
      This patch checks the *s_flags* of the upper fs and returns an error if
      it is a r/o partition. The check of *upper_mnt->mnt_sb->s_flags*
      can be removed now.

      This patch also removes
      
      	/* FIXME: workdir is not needed for a R/O mount */
      
      from ovl_fill_super() because:
      
      1) for the upper fs r/o case
      Setting a r/o partition as upper is prevented, so there is no need to
      care about workdir in this case.

      2) for the "mount overlay -o ro" with a r/w upper fs case
      Users could remount overlayfs r/w in this case, so workdir should
      not be omitted.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • ovl: check lowerdir amount for non-upper mount · 6be4506e
      hujianyang committed
      The recent multi-lower layer mount support allows upperdir and workdir
      to be omitted, which means overlayfs can be mounted with only one
      lowerdir directory. This makes no sense and is potentially risky.

      This patch checks the total number of lower directories to prevent
      mounting overlayfs with only one directory.

      Also, an error message is added to indicate that the lower directories
      exceed the OVL_MAX_STACK limit.
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
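
      The two checks described in this message can be sketched as a small
      validation function. This is an illustration under assumptions:
      check_layer_count() is a hypothetical name, MAX_STACK is set to 500
      purely for the example (the message only names OVL_MAX_STACK), and the
      error text is invented. Other option validation is out of scope.

      ```c
      #include <assert.h>
      #include <errno.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* Assumed limit for illustration; the commit only names OVL_MAX_STACK. */
      #define MAX_STACK 500

      /*
       * Validates the number of lower directories for a mount.
       * Returns 0 when acceptable, -EINVAL otherwise.
       */
      static int check_layer_count(unsigned int numlower, bool has_upper)
      {
      	if (numlower > MAX_STACK) {
      		/* Hypothetical message text. */
      		fprintf(stderr, "too many lower directories, limit is %d\n",
      			MAX_STACK);
      		return -EINVAL;
      	}
      	if (!has_upper && numlower < 2) {
      		/* No upperdir: a single directory is not a meaningful stack. */
      		fprintf(stderr, "at least 2 lowerdir are needed without upperdir\n");
      		return -EINVAL;
      	}
      	return 0;
      }
      ```

      With an upperdir present, a single lowerdir remains acceptable in this
      model; only the mount without an upperdir needs two or more.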
    • ovl: print error message for invalid mount options · bead55ef
      hujianyang committed
      Like other filesystems, overlayfs should print an error message when it
      catches an incorrect mount option.

      With this patch, an invalid option is reported clearly.
      Reported-by: Fabian Sturm <fabian.sturm@aduu.de>
      Signed-off-by: hujianyang <hujianyang@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • Btrfs: fix outstanding_extents accounting in DIO · e1cbbfa5
      Josef Bacik committed
      We are keeping track of how many extents we need to reserve properly based on
      the amount we want to write, but we were still incrementing outstanding_extents
      if we wrote less than what we requested.  This isn't quite right since we will
      be limited to our max extent size.  So instead let's do something horrible!  Keep
      track of how many outstanding_extents we reserved, and decrement each time we
      allocate an extent.  If we use our entire reserve make sure to jack up
      outstanding_extents on the inode so the accounting works out properly.  Thanks,
      Reported-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Josef Bacik <jbacik@fb.com>
    • Btrfs: add sanity test for outstanding_extents accounting · 6a3891c5
      Josef Bacik committed
      I introduced a regression wrt outstanding_extents accounting.  These are tricky
      areas that aren't easily covered by xfstests as we could change MAX_EXTENT_SIZE
      at any time.  So add sanity tests to cover the various conditions that are
      tricky in order to make sure we don't introduce regressions in the future.
      Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
    • Btrfs: just free dummy extent buffers · bcb7e449
      Josef Bacik committed
      If we fail during our sanity tests we could get NULL deref's because we unload
      the module before the dummy extent buffers are free'd via RCU.  So check for
      this case and just free the things directly.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
    • Btrfs: account merges/splits properly · ba117213
      Josef Bacik committed
      My fix
      
      Btrfs: fix merge delalloc logic
      
      only fixed half of the problems; it didn't fix the case where we have two large
      extents on either side and then join them together with a new small extent.  We
      need to instead keep track of how many extents we have accounted for with each
      side of the new extent, and then see how many extents we need for the new large
      extent.  If they match then we know we need to keep our reservation, otherwise
      we need to drop our reservation.  This shows up with a case like this
      
      [BTRFS_MAX_EXTENT_SIZE+4K][4K HOLE][BTRFS_MAX_EXTENT_SIZE+4K]
      
      Previously the logic would have said that the number of extents required for the
      new size (3) is larger than the number of extents required for the largest side
      (2) therefore we need to keep our reservation.  But this isn't the case, since
      both sides require a reservation of 2 which leads to 4 for the whole range
      currently reserved, but we only need 3, so we need to drop one of the
      reservations.  The same problem existed for splits, we'd think we only need 3
      extents when creating the hole but in reality we need 4.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
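
      The arithmetic in this message can be checked with a few lines of C.
      This sketch only reproduces the counting described above and is not the
      kernel's merge or split hook: extents_needed() and surplus_after_merge()
      are hypothetical helpers, and MAX_EXTENT_SIZE is assumed to be 128 MiB
      for illustration (the message only names BTRFS_MAX_EXTENT_SIZE).

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Assumed value for illustration; the commit only names the constant. */
      #define MAX_EXTENT_SIZE (128ULL * 1024 * 1024)
      #define SIZE_4K 4096ULL

      /* Number of extent reservations needed to cover `size` bytes (round up). */
      static uint64_t extents_needed(uint64_t size)
      {
      	return (size + MAX_EXTENT_SIZE - 1) / MAX_EXTENT_SIZE;
      }

      /*
       * Reservations held for the two sides minus what the merged range
       * needs once a `hole`-sized extent joins them. A positive result is
       * the number of reservations that can be dropped.
       */
      static int64_t surplus_after_merge(uint64_t left, uint64_t hole,
      				   uint64_t right)
      {
      	uint64_t sides = extents_needed(left) + extents_needed(right);
      	uint64_t merged = extents_needed(left + hole + right);

      	return (int64_t)sides - (int64_t)merged;
      }
      ```

      For the layout in the message, each side of BTRFS_MAX_EXTENT_SIZE+4K
      needs 2 reservations (4 in total), the merged range needs 3, and the
      surplus is 1. Read in the other direction, splitting the merged range
      by punching the 4K hole goes from 3 back to 4.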