1. 22 May 2014, 9 commits
  2. 21 May 2014, 2 commits
    • Btrfs: send, fix incorrect ref access when using extrefs · 51a60253
      Committed by Filipe Manana
      When running send, if an inode only has extended reference items
      associated to it and no regular references, send.c:get_first_ref()
      was incorrectly assuming the reference it found was of type
      BTRFS_INODE_REF_KEY due to use of the wrong key variable.
      This caused weird behaviour when using the found item as a regular
      reference, such as a weird path string, and occasionally (when lucky)
      a crash:
      
      [  190.600652] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      [  190.600994] Modules linked in: btrfs xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc psmouse serio_raw evbug pcspkr i2c_piix4 e1000 floppy
      [  190.602565] CPU: 2 PID: 14520 Comm: btrfs Not tainted 3.13.0-fdm-btrfs-next-26+ #1
      [  190.602728] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [  190.602868] task: ffff8800d447c920 ti: ffff8801fa79e000 task.ti: ffff8801fa79e000
      [  190.603030] RIP: 0010:[<ffffffff813266b4>]  [<ffffffff813266b4>] memcpy+0x54/0x110
      [  190.603262] RSP: 0018:ffff8801fa79f880  EFLAGS: 00010202
      [  190.603395] RAX: ffff8800d4326e3f RBX: 000000000000036a RCX: ffff880000000000
      [  190.603553] RDX: 000000000000032a RSI: ffe708844042936a RDI: ffff8800d43271a9
      [  190.603710] RBP: ffff8801fa79f8c8 R08: 00000000003a4ef0 R09: 0000000000000000
      [  190.603867] R10: 793a4ef09f000000 R11: 9f0000000053726f R12: ffff8800d43271a9
      [  190.604020] R13: 0000160000000000 R14: ffff8802110134f0 R15: 000000000000036a
      [  190.604020] FS:  00007fb423d09b80(0000) GS:ffff880216200000(0000) knlGS:0000000000000000
      [  190.604020] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  190.604020] CR2: 00007fb4229d4b78 CR3: 00000001f5d76000 CR4: 00000000000006e0
      [  190.604020] Stack:
      [  190.604020]  ffffffffa01f4d49 ffff8801fa79f8f0 00000000000009f9 ffff8801fa79f8c8
      [  190.604020]  00000000000009f9 ffff880211013260 000000000000f971 ffff88021147dba8
      [  190.604020]  00000000000009f9 ffff8801fa79f918 ffffffffa02367f5 ffff8801fa79f928
      [  190.604020] Call Trace:
      [  190.604020]  [<ffffffffa01f4d49>] ? read_extent_buffer+0xb9/0x120 [btrfs]
      [  190.604020]  [<ffffffffa02367f5>] fs_path_add_from_extent_buffer+0x45/0x60 [btrfs]
      [  190.604020]  [<ffffffffa0238806>] get_first_ref+0x1f6/0x210 [btrfs]
      [  190.604020]  [<ffffffffa0238994>] __get_cur_name_and_parent+0x174/0x3a0 [btrfs]
      [  190.604020]  [<ffffffff8118df3d>] ? kmem_cache_alloc_trace+0x11d/0x1e0
      [  190.604020]  [<ffffffffa0236674>] ? fs_path_alloc+0x24/0x60 [btrfs]
      [  190.604020]  [<ffffffffa0238c91>] get_cur_path+0xd1/0x240 [btrfs]
      (...)
      
      Steps to reproduce (either crash or some weirdness like an odd path string):
      
          mkfs.btrfs -f -O extref /dev/sdd
          mount /dev/sdd /mnt
      
          mkdir /mnt/testdir
          touch /mnt/testdir/foobar
      
          for i in `seq 1 2550`; do
              ln /mnt/testdir/foobar /mnt/testdir/foobar_link_`printf "%04d" $i`
          done
      
          ln /mnt/testdir/foobar /mnt/testdir/final_foobar_name
      
          rm -f /mnt/testdir/foobar
          for i in `seq 1 2550`; do
              rm -f /mnt/testdir/foobar_link_`printf "%04d" $i`
          done
      
          btrfs subvolume snapshot -r /mnt /mnt/mysnap
          btrfs send /mnt/mysnap -f /tmp/mysnap.send
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
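The failure mode described in this commit can be illustrated with a small userspace model. This is only a sketch, not the send.c code: the helper names are hypothetical, the numeric type labels are assumed, and the two payload layouts are assumed to mirror a regular inode ref (index, name_len, name) and an extended ref (parent_objectid, index, name_len, name). It shows how decoding an extref payload with the regular-ref layout produces a bogus name length.

```python
import struct

# Labels for the two item types; the numeric values are assumed for illustration.
INODE_REF_KEY = 12
INODE_EXTREF_KEY = 13

REF_HEADER = struct.Struct("<QH")      # index, name_len
EXTREF_HEADER = struct.Struct("<QQH")  # parent_objectid, index, name_len


def pack_extref(parent, index, name):
    """Build an extended-ref payload: fixed header followed by the name."""
    return EXTREF_HEADER.pack(parent, index, len(name)) + name


def read_name(item_type, payload):
    """Decode (name_len, name) using the layout selected by item_type."""
    if item_type == INODE_REF_KEY:
        _index, name_len = REF_HEADER.unpack_from(payload)
        start = REF_HEADER.size
    else:
        _parent, _index, name_len = EXTREF_HEADER.unpack_from(payload)
        start = EXTREF_HEADER.size
    return name_len, payload[start:start + name_len]


search_type = INODE_REF_KEY    # type placed in the search key
found_type = INODE_EXTREF_KEY  # type of the item the search actually returned
payload = pack_extref(256, 2552, b"final_foobar_name")

wrong_len, _ = read_name(search_type, payload)    # buggy: trusts the search key
right_len, name = read_name(found_type, payload)  # fixed: trusts the found key
print(wrong_len, right_len, name)
```

In this model the wrong layout reads part of the index field as the name length, so the decoder asks for far more bytes than the name holds. Python slicing just truncates; a raw memory copy would not.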
    • Btrfs: fix EIO on reading file after ioctl clone works on it · d3ecfcdf
      Committed by Liu Bo
      For an inline data extent, we need to align its length; otherwise
      we can get a phantom extent map, which confuses readpages() into
      returning -EIO.
      
      This can be detected by xfstests/btrfs/035.
      Reported-by: David Disseldorp <ddiss@suse.de>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
  3. 20 May 2014, 1 commit
    • sysfs: make sure read buffer is zeroed · f5c16f29
      Committed by Tejun Heo
      13c589d5 ("sysfs: use seq_file when reading regular files")
      switched sysfs from custom read implementation to seq_file to enable
      later transition to kernfs.  After the change, the buffer passed to
      ->show() is acquired through seq_get_buf(); unfortunately, this
      introduces a subtle behavior change.  Before the commit, the buffer
      passed to ->show() was always zero as it was allocated using
      get_zeroed_page().  Because seq_file doesn't clear buffers on
      allocation and neither does seq_get_buf(), after the commit, depending
      on the behavior of ->show(), we may end up exposing uninitialized data
      to userland thus possibly altering userland visible behavior and
      leaking information.
      
      Fix it by explicitly clearing the buffer.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Ron <ron@debian.org>
      Fixes: 13c589d5 ("sysfs: use seq_file when reading regular files")
      Cc: stable <stable@vger.kernel.org> # 3.13+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
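The leak described in this commit can be sketched with a toy model. Everything here is hypothetical: `show()` stands in for a ->show() callback that reports a larger length than it actually wrote, and `read_attr()` stands in for the read path that hands it a reused buffer. It is not the sysfs code, only an illustration of why clearing the buffer first matters.

```python
def show(buf):
    """Hypothetical ->show(): writes 2 bytes but reports 8 as its length."""
    buf[0:2] = b"1\n"
    return 8


def read_attr(buf, clear_first):
    """Call show() on buf and return what would be copied to the reader."""
    if clear_first:
        buf[:] = bytes(len(buf))  # the fix: zero the buffer before ->show()
    count = show(buf)
    return bytes(buf[:count])


stale = bytearray(b"SECRETSECRETSECR")  # leftover contents of a reused buffer
leaky = read_attr(bytearray(stale), clear_first=False)
clean = read_attr(bytearray(stale), clear_first=True)
print(leaky, clean)
```

Without the clearing step the reader sees leftover bytes after the two that `show()` wrote. With it, the tail is all zeroes, as it was when the buffer came from a zeroed page.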
  4. 15 May 2014, 10 commits
  5. 13 May 2014, 1 commit
    • kernfs, sysfs, cgroup: restrict extra perm check on open to sysfs · 555724a8
      Committed by Tejun Heo
      The kernfs open method - kernfs_fop_open() - inherited extra
      permission checks from sysfs.  While the vfs layer allows ignoring the
      read/write permissions checks if the issuer has CAP_DAC_OVERRIDE,
      sysfs explicitly denied open regardless of the cap if the file doesn't
      have any of the UGO perms of the requested access or doesn't implement
      the requested operation.  It can be debated whether this was a good
      idea or not but the behavior is too subtle and dangerous to change at
      this point.
      
      After cgroup got converted to kernfs, this extra perm check also got
      applied to cgroup breaking libcgroup which opens write-only files with
      O_RDWR as root.  This patch gates the extra open permission check with
      a new flag KERNFS_ROOT_EXTRA_OPEN_PERM_CHECK and enables it for sysfs.
      For sysfs, nothing changes.  For cgroup, root now can perform any
      operation regardless of the permissions as it was before kernfs
      conversion.  Note that kernfs still fails unimplemented operations
      with -EINVAL.
      
      While at it, add comments explaining KERNFS_ROOT flags.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Andrey Wagin <avagin@gmail.com>
      Tested-by: Andrey Wagin <avagin@gmail.com>
      Cc: Li Zefan <lizefan@huawei.com>
      References: http://lkml.kernel.org/g/CANaxB-xUm3rJ-Cbp72q-rQJO5mZe1qK6qXsQM=vh0U8upJ44+A@mail.gmail.com
      Fixes: 2bd59d48 ("cgroup: convert to kernfs")
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
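The gating described in this commit can be modelled roughly as follows. This is a sketch under stated assumptions, not the kernfs_fop_open() code: the flag value, the function name, and the choice of EACCES as the error are all assumptions, and the model covers only the extra check, taking for granted that the caller already passed the normal VFS permission checks (for example as root with CAP_DAC_OVERRIDE).

```python
import errno

ROOT_EXTRA_OPEN_PERM_CHECK = 0x1  # hypothetical stand-in for the new root flag
ANY_READ = 0o444   # read permission for user, group, or other
ANY_WRITE = 0o222  # write permission for user, group, or other


def extra_open_check(root_flags, mode, want_read, want_write,
                     has_read_op, has_write_op):
    """Return 0 if the open may proceed, or a negative errno."""
    if root_flags & ROOT_EXTRA_OPEN_PERM_CHECK:
        if want_write and not (mode & ANY_WRITE and has_write_op):
            return -errno.EACCES  # errno choice is an assumption
        if want_read and not (mode & ANY_READ and has_read_op):
            return -errno.EACCES
    return 0


# A write-only file (mode 0200) opened for read and write, as libcgroup does.
sysfs_result = extra_open_check(ROOT_EXTRA_OPEN_PERM_CHECK, 0o200,
                                True, True, False, True)
cgroup_result = extra_open_check(0, 0o200, True, True, False, True)
print(sysfs_result, cgroup_result)
```

With the flag set (the sysfs case) the open is refused because no read permission bit is present; without it (the cgroup case) the open goes through, and an unimplemented read would fail later at the operation itself.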
  6. 09 May 2014, 2 commits
    • locks: only validate the lock vs. f_mode in F_SETLK codepaths · cf01f4ee
      Committed by Jeff Layton
      v2: replace missing break in switch statement (as pointed out by Dave
          Jones)
      
      commit bce7560d (locks: consolidate checks for compatible
      filp->f_mode values in setlk handlers) introduced a regression in the
      F_GETLK handler.
      
      flock64_to_posix_lock is a shared codepath between F_GETLK and F_SETLK,
      but the f_mode checks should only be applicable to the F_SETLK codepaths
      according to POSIX.
      
      Instead of just reverting the patch, add a new function to do this
      checking and have the F_SETLK handlers call it.
      
      Cc: Dave Jones <davej@redhat.com>
      Reported-and-Tested-by: Reuben Farrelly <reuben@reub.net>
      Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
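The split described in this commit, where only the F_SETLK side validates the lock type against the file mode, can be sketched like this. The function names, constants, and bit values are illustrative assumptions rather than the fs/locks.c code; EBADF is the error POSIX specifies for fcntl() when the descriptor is not open in a mode compatible with the requested lock.

```python
import errno

FMODE_READ, FMODE_WRITE = 0x1, 0x2  # illustrative bit values
F_RDLCK, F_WRLCK, F_UNLCK = "F_RDLCK", "F_WRLCK", "F_UNLCK"


def check_fmode_for_setlk(f_mode, lock_type):
    """Mode check applied only on the F_SETLK side in this model."""
    if lock_type == F_RDLCK and not f_mode & FMODE_READ:
        return -errno.EBADF
    if lock_type == F_WRLCK and not f_mode & FMODE_WRITE:
        return -errno.EBADF
    return 0


def do_getlk(f_mode, lock_type):
    """Querying for a conflicting lock does not depend on f_mode."""
    return 0


def do_setlk(f_mode, lock_type):
    """Setting a lock requires a compatible open mode."""
    return check_fmode_for_setlk(f_mode, lock_type)


# A file opened read-only: asking about a write lock is fine, taking one is not.
print(do_getlk(FMODE_READ, F_WRLCK), do_setlk(FMODE_READ, F_WRLCK))
```

Keeping the check in its own helper means the shared conversion path stays mode-agnostic, which is the shape the commit message describes.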
    • NFSD: Call ->set_acl with a NULL ACL structure if no entries · aa07c713
      Committed by Kinglong Mee
      After setting an ACL on a directory, I hit two problems caused by
      the cached zero-length default POSIX ACL.
      
      This patch makes sure nfsd4_set_nfs4_acl calls ->set_acl
      with a NULL ACL structure if there are no entries.
      
      Thanks to Christoph Hellwig for the advice.
      
      First problem:
      ............ hang ...........
      
      Second problem:
      [ 1610.167668] ------------[ cut here ]------------
      [ 1610.168320] kernel BUG at /root/nfs/linux/fs/nfsd/nfs4acl.c:239!
      [ 1610.168320] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ 1610.168320] Modules linked in: nfsv4(OE) nfs(OE) nfsd(OE)
      rpcsec_gss_krb5 fscache ip6t_rpfilter ip6t_REJECT cfg80211 xt_conntrack
      rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
      ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
      ip6table_mangle ip6table_security ip6table_raw ip6table_filter
      ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
      nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
      auth_rpcgss nfs_acl snd_intel8x0 ppdev lockd snd_ac97_codec ac97_bus
      snd_pcm snd_timer e1000 pcspkr parport_pc snd parport serio_raw joydev
      i2c_piix4 sunrpc(OE) microcode soundcore i2c_core ata_generic pata_acpi
      [last unloaded: nfsd]
      [ 1610.168320] CPU: 0 PID: 27397 Comm: nfsd Tainted: G           OE
      3.15.0-rc1+ #15
      [ 1610.168320] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
      VirtualBox 12/01/2006
      [ 1610.168320] task: ffff88005ab653d0 ti: ffff88005a944000 task.ti:
      ffff88005a944000
      [ 1610.168320] RIP: 0010:[<ffffffffa034d5ed>]  [<ffffffffa034d5ed>]
      _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd]
      [ 1610.168320] RSP: 0018:ffff88005a945b00  EFLAGS: 00010293
      [ 1610.168320] RAX: 0000000000000001 RBX: ffff88006700bac0 RCX:
      0000000000000000
      [ 1610.168320] RDX: 0000000000000000 RSI: ffff880067c83f00 RDI:
      ffff880068233300
      [ 1610.168320] RBP: ffff88005a945b48 R08: ffffffff81c64830 R09:
      0000000000000000
      [ 1610.168320] R10: ffff88004ea85be0 R11: 000000000000f475 R12:
      ffff880068233300
      [ 1610.168320] R13: 0000000000000003 R14: 0000000000000002 R15:
      ffff880068233300
      [ 1610.168320] FS:  0000000000000000(0000) GS:ffff880077800000(0000)
      knlGS:0000000000000000
      [ 1610.168320] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 1610.168320] CR2: 00007f5bcbd3b0b9 CR3: 0000000001c0f000 CR4:
      00000000000006f0
      [ 1610.168320] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [ 1610.168320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
      0000000000000400
      [ 1610.168320] Stack:
      [ 1610.168320]  ffffffff00000000 0000000b67c83500 000000076700bac0
      0000000000000000
      [ 1610.168320]  ffff88006700bac0 ffff880068233300 ffff88005a945c08
      0000000000000002
      [ 1610.168320]  0000000000000000 ffff88005a945b88 ffffffffa034e2d5
      000000065a945b68
      [ 1610.168320] Call Trace:
      [ 1610.168320]  [<ffffffffa034e2d5>] nfsd4_get_nfs4_acl+0x95/0x150 [nfsd]
      [ 1610.168320]  [<ffffffffa03400d6>] nfsd4_encode_fattr+0x646/0x1e70 [nfsd]
      [ 1610.168320]  [<ffffffff816a6e6e>] ? kmemleak_alloc+0x4e/0xb0
      [ 1610.168320]  [<ffffffffa0327962>] ?
      nfsd_setuser_and_check_port+0x52/0x80 [nfsd]
      [ 1610.168320]  [<ffffffff812cd4bb>] ? selinux_cred_prepare+0x1b/0x30
      [ 1610.168320]  [<ffffffffa0341caa>] nfsd4_encode_getattr+0x5a/0x60 [nfsd]
      [ 1610.168320]  [<ffffffffa0341e07>] nfsd4_encode_operation+0x67/0x110
      [nfsd]
      [ 1610.168320]  [<ffffffffa033844d>] nfsd4_proc_compound+0x21d/0x810 [nfsd]
      [ 1610.168320]  [<ffffffffa0324d9b>] nfsd_dispatch+0xbb/0x200 [nfsd]
      [ 1610.168320]  [<ffffffffa00850cd>] svc_process_common+0x46d/0x6d0 [sunrpc]
      [ 1610.168320]  [<ffffffffa0085433>] svc_process+0x103/0x170 [sunrpc]
      [ 1610.168320]  [<ffffffffa032472f>] nfsd+0xbf/0x130 [nfsd]
      [ 1610.168320]  [<ffffffffa0324670>] ? nfsd_destroy+0x80/0x80 [nfsd]
      [ 1610.168320]  [<ffffffff810a5202>] kthread+0xd2/0xf0
      [ 1610.168320]  [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40
      [ 1610.168320]  [<ffffffff816c1ebc>] ret_from_fork+0x7c/0xb0
      [ 1610.168320]  [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40
      [ 1610.168320] Code: 78 02 e9 e7 fc ff ff 31 c0 31 d2 31 c9 66 89 45 ce
      41 8b 04 24 66 89 55 d0 66 89 4d d2 48 8d 04 80 49 8d 5c 84 04 e9 37 fd
      ff ff <0f> 0b 90 0f 1f 44 00 00 55 8b 56 08 c7 07 00 00 00 00 8b 46 0c
      [ 1610.168320] RIP  [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0
      [nfsd]
      [ 1610.168320]  RSP <ffff88005a945b00>
      [ 1610.257313] ---[ end trace 838254e3e352285b ]---
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
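The rule this commit introduces, passing no ACL at all rather than an empty one, can be reduced to a very small sketch. The names `acl_for_set` and `fake_set_acl` are hypothetical, and Python's `None` stands in for a NULL ACL pointer; this is not the nfsd code.

```python
def acl_for_set(entries):
    """Return the value to hand to a set_acl-style callback."""
    return list(entries) if entries else None


calls = []


def fake_set_acl(acl_type, acl):
    """Record what a set_acl-style callback would receive."""
    calls.append((acl_type, acl))


fake_set_acl("default", acl_for_set([]))                 # no entries: pass None
fake_set_acl("access", acl_for_set([("user", "rwx")]))   # entries: pass the list
print(calls)
```

Handing over `None` means no zero-length ACL object exists to be cached, which is the condition the commit message blames for both problems.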
  7. 07 May 2014, 7 commits
  8. 06 May 2014, 2 commits
    • xfs: remote attribute overwrite causes transaction overrun · 8275cdd0
      Committed by Dave Chinner
      Commit e461fcb1 ("xfs: remote attribute lookups require the value
      length") passes the remote attribute length in the xfs_da_args
      structure on lookup so that CRC calculations and validity checking
      can be performed correctly by related code. This unfortunately has
      the side effect of changing the args->valuelen parameter in cases
      where it shouldn't.
      
      That is, when we replace a remote attribute, the incoming
      replacement stores the value and length in args->value and
      args->valuelen, but then the lookup which finds the existing remote
      attribute overwrites args->valuelen with the length of the remote
      attribute being replaced. Hence when we go to create the new
      attribute, we create it of the size of the existing remote
      attribute, not the size it is supposed to be. When the new attribute
      is much smaller than the old attribute, this results in a
      transaction overrun and an ASSERT() failure on a debug kernel:
      
      XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, file: fs/xfs/xfs_trans.c, line: 331
      
      Fix this by keeping the remote attribute value length separate to
      the attribute value length in the xfs_da_args structure. This enables
      us to pass the length of the remote attribute to be removed without
      overwriting the new attribute's length.
      
      Also, ensure that when we save remote block contexts for a later
      rename we zero the original state variables so that we don't confuse
      the state of the attribute to be removed with the state of the new
      attribute that we just added. [Spotted by Brian Foster.]
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
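The length clobbering described in this commit can be shown with a toy structure. This is a sketch only: `Args` is a stand-in for xfs_da_args, the field name `rmtvaluelen` and both lookup helpers are hypothetical, and the sizes are made up.

```python
from dataclasses import dataclass


@dataclass
class Args:
    valuelen: int         # length of the new value supplied by the caller
    rmtvaluelen: int = 0  # hypothetical field: length of the existing remote value


def lookup_buggy(args, existing_len):
    """Lookup that reuses valuelen, clobbering the caller's length."""
    args.valuelen = existing_len


def lookup_fixed(args, existing_len):
    """Lookup that records the existing length in its own field."""
    args.rmtvaluelen = existing_len


old_len, new_len = 65536, 64  # replacing a large attribute with a small one

buggy = Args(valuelen=new_len)
lookup_buggy(buggy, old_len)

fixed = Args(valuelen=new_len)
lookup_fixed(fixed, old_len)

print(buggy.valuelen, fixed.valuelen, fixed.rmtvaluelen)
```

In the buggy variant the create step would size the new attribute from the old one's length, which mirrors the mismatch the commit message ties to the transaction overrun.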
    • xfs: initialize default acls for ->tmpfile() · d540e43b
      Committed by Brian Foster
      The current tmpfile handler does not initialize default ACLs. Doing so
      within xfs_vn_tmpfile() makes it roughly equivalent to xfs_vn_mknod(),
      which is already used as a common create handler.
      
      xfs_vn_mknod() does not currently have a mechanism to determine whether
      to link the file into the namespace. Therefore, further abstract
      xfs_vn_mknod() into a new xfs_generic_create() handler with a tmpfile
      parameter. This new handler calls xfs_create_tmpfile() and d_tmpfile()
      on the dentry when called via ->tmpfile().
      Signed-off-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
  9. 05 May 2014, 2 commits
  10. 04 May 2014, 3 commits
    • dcache: don't need rcu in shrink_dentry_list() · 60942f2f
      Committed by Miklos Szeredi
      Since now the shrink list is private and nobody can free the dentry while
      it is on the shrink list, we can remove RCU protection from this.
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • more graceful recovery in umount_collect() · 9c8c10e2
      Committed by Al Viro
      Start with shrink_dcache_parent(), then scan what remains.
      
      First of all, BUG() is very much an overkill here; we are holding
      ->s_umount, and hitting BUG() means that a lot of interesting stuff
      will be hanging after that point (sync(2), for example).  Moreover,
      in cases when there had been more than one leak, we'll be better
      off reporting all of them.  And more than just the last component
      of pathname - %pd is there for just such uses...
      
      That was the last user of dentry_lru_del(), so kill it off...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • don't remove from shrink list in select_collect() · fe91522a
      Committed by Al Viro
      	If we find something already on a shrink list, just increment
      data->found and do nothing else.  Loops in shrink_dcache_parent() and
      check_submounts_and_drop() will do the right thing - everything we
      did put into our list will be evicted and if there had been nothing,
      but data->found got non-zero, well, we have somebody else shrinking
      those guys; just try again.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  11. 01 May 2014, 1 commit
    • dentry_kill(): don't try to remove from shrink list · 41edf278
      Committed by Al Viro
      If the victim is on the shrink list, don't remove it from there.
      If shrink_dentry_list() manages to remove it from the list before
      we are done - fine, we'll just free it as usual.  If not - mark
      it with new flag (DCACHE_MAY_FREE) and leave it there.
      
      Eventually, shrink_dentry_list() will get to it, remove the sucker
      from shrink list and call dentry_kill(dentry, 0).  Which is where
      we'll deal with freeing.
      
      Since now dentry_kill(dentry, 0) may happen after or during
      dentry_kill(dentry, 1), we need to recognize that (by seeing
      DCACHE_DENTRY_KILLED already set), unlock everything
      and either free the sucker (in case DCACHE_MAY_FREE has been
      set) or leave it for ongoing dentry_kill(dentry, 1) to deal with.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
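The handshake described in this commit can be walked through with a sequential toy model. It is a sketch under heavy simplification: there is no locking or real concurrency, the class and function names are hypothetical, `begin_kill()` plus `finish_kill()` stand in for the dentry_kill(dentry, 1) path, and `shrink()` stands in for shrink_dentry_list() calling dentry_kill(dentry, 0). The point is only that each ordering frees the dentry exactly once.

```python
class Dentry:
    def __init__(self):
        self.on_shrink_list = False
        self.killed = False    # models DCACHE_DENTRY_KILLED
        self.may_free = False  # models DCACHE_MAY_FREE
        self.free_count = 0


def begin_kill(d):
    """Start of a kill: mark the dentry as killed."""
    d.killed = True


def finish_kill(d):
    """End of the dentry_kill(dentry, 1) path in this model."""
    if d.on_shrink_list:
        d.may_free = True  # leave it for the shrinker to free
    else:
        d.free_count += 1


def shrink(shrink_list):
    """Models shrink_dentry_list() calling dentry_kill(dentry, 0)."""
    while shrink_list:
        d = shrink_list.pop()
        d.on_shrink_list = False
        if d.killed:
            if d.may_free:
                d.free_count += 1
            continue  # otherwise the ongoing kill will free it
        begin_kill(d)
        finish_kill(d)


def make_listed():
    """Create a dentry that is already on a shrink list."""
    d = Dentry()
    d.on_shrink_list = True
    return d, [d]


# Case A: the kill finishes before the shrinker reaches the dentry.
a, a_list = make_listed()
begin_kill(a)
finish_kill(a)
shrink(a_list)

# Case B: the shrinker unlists the dentry while the kill is still in progress.
b, b_list = make_listed()
begin_kill(b)
shrink(b_list)
finish_kill(b)

print(a.free_count, b.free_count)
```

In case A the shrinker does the freeing because the flag was set; in case B the original kill frees as usual because the dentry is no longer listed by the time it finishes.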