1. 29 7月, 2015 2 次提交
  2. 03 6月, 2015 8 次提交
    • O
      Btrfs: show subvol= and subvolid= in /proc/mounts · c8d3fe02
      Omar Sandoval 提交于
      Now that we're guaranteed to have a meaningful root dentry, we can just
      export seq_dentry() and use it in btrfs_show_options(). The subvolume ID
      is easy to get and can also be useful, so put that in there, too.
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NOmar Sandoval <osandov@osandov.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      c8d3fe02
    • O
      Btrfs: unify subvol= and subvolid= mounting · 05dbe683
      Omar Sandoval 提交于
      Currently, mounting a subvolume with subvolid= takes a different code
      path than mounting with subvol=. This isn't really a big deal except for
      the fact that mounts done with subvolid= or the default subvolume don't
      have a dentry that's connected to the dentry tree like in the subvol=
      case. To unify the code paths, when given subvolid= or using the default
      subvolume ID, translate it into a subvolume name by walking
      ROOT_BACKREFs in the root tree and INODE_REFs in the filesystem trees.
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NOmar Sandoval <osandov@osandov.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      05dbe683
    • O
      Btrfs: fail on mismatched subvol and subvolid mount options · bb289b7b
      Omar Sandoval 提交于
      There's nothing to stop a user from passing both subvol= and subvolid=
      to mount, but if they don't refer to the same subvolume, someone is
      going to be surprised at some point. Error out on this case, but allow
      users to pass in both if they do match (which they could, for example,
      get out of /proc/mounts).
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NOmar Sandoval <osandov@osandov.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      bb289b7b
    • O
      Btrfs: clean up error handling in mount_subvol() · fa330659
      Omar Sandoval 提交于
      In preparation for new functionality in mount_subvol(), give it
      ownership of subvol_name and tidy up the error paths.
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NOmar Sandoval <osandov@osandov.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      fa330659
    • O
      Btrfs: remove all subvol options before mounting top-level · e6e4dbe8
      Omar Sandoval 提交于
      Currently, setup_root_args() substitutes 's/subvol=[^,]*/subvolid=0/'.
      But, this means that if the user passes both a subvol and subvolid for
      some reason, we won't actually mount the top-level when we recursively
      mount. For example, consider:
      
      mkfs.btrfs -f /dev/sdb
      mount /dev/sdb /mnt
      btrfs subvol create /mnt/subvol1 # subvolid=257
      btrfs subvol create /mnt/subvol2 # subvolid=258
      umount /mnt
      mount -osubvol=/subvol1,subvolid=258 /dev/sdb /mnt
      
      In the final mount, subvol=/subvol1,subvolid=258 becomes
      subvolid=0,subvolid=258, and the last option takes precedence, so we
      mount subvol2 and try to look up subvol1 inside of it, which fails.
      
      So, instead, do a thorough scan through the argument list and remove any
      subvol= and subvolid= options, then append subvolid=0 to the end. This
      implicitly makes subvol= take precedence over subvolid=, but we're about
      to add a stricter check for that. This also makes setup_root_args() more
      generic, which we'll need soon.
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NOmar Sandoval <osandov@osandov.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      e6e4dbe8
    • O
      Btrfs: lock superblock before remounting for rw subvol · 773cd04e
      Omar Sandoval 提交于
      Since commit 0723a047 ("btrfs: allow mounting btrfs subvolumes with
      different ro/rw options"), when mounting a subvolume read/write when
      another subvolume has previously been mounted read-only, we first do a
      remount. However, this should be done with the superblock locked, as per
      sync_filesystem():
      
      	/*
      	 * We need to be protected against the filesystem going from
      	 * r/o to r/w or vice versa.
      	 */
      	WARN_ON(!rwsem_is_locked(&sb->s_umount));
      
      This WARN_ON can easily be hit with:
      
      mkfs.btrfs -f /dev/vdb
      mount /dev/vdb /mnt
      btrfs subvol create /mnt/vol1
      btrfs subvol create /mnt/vol2
      umount /mnt
      mount -oro,subvol=/vol1 /dev/vdb /mnt
      mount -orw,subvol=/vol2 /dev/vdb /mnt2
      
      Fixes: 0723a047 ("btrfs: allow mounting btrfs subvolumes with different ro/rw options")
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NOmar Sandoval <osandov@osandov.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      773cd04e
    • D
      btrfs: add 'cold' compiler annotations to all error handling functions · c0d19e2b
      David Sterba 提交于
      The annotated functios will be placed into .text.unlikely section. The
      annotation also hints compiler to move the code out of the hot paths,
      and may implicitly mark if-statement leading to that block as unlikely.
      
      This is a heuristic, the impact on the generated code is not
      significant.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      c0d19e2b
    • D
      btrfs: report exact callsite where transaction abort occurs · 1a9a8a71
      David Sterba 提交于
      WARN is called from a single location and all bugreports say that's in
      super.c __btrfs_abort_transaction. This is slightly confusing as we'd
      rather want to know the exact callsite. Whereas this information is
      printed in the syslog below the stacktrace, this requires further look
      and we usually see only the headline from WARNING.
      
      Moving the WARN into the macro has to inline some code and increases
      code by a few kilobytes:
      
        text    data     bss     dec     hex filename
      835481   20305   14120  869906   d4612 btrfs.ko.before
      842883   20305   14120  877308   d62fc btrfs.ko.after
      
      The delta is +7k (130+ calls), measured on 3.19 x86_64, distro config.
      The increase is not small and could lead to worse icache use. The code
      is on error/exit paths that can be recognized by compiler as cold and
      moved out of the way so the impact is speculated to be low, if
      measurable at all.
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      1a9a8a71
  3. 16 4月, 2015 1 次提交
  4. 27 3月, 2015 2 次提交
    • J
      btrfs: cleanup orphans while looking up default subvolume · 727b9784
      Jeff Mahoney 提交于
      Orphans in the fs tree are cleaned up via open_ctree and subvolume
      orphans are cleaned via btrfs_lookup_dentry -- except when a default
      subvolume is in use.  The name for the default subvolume uses a manual
      lookup that doesn't trigger orphan cleanup and needs to trigger it
      manually as well. This doesn't apply to the remount case since the
      subvolumes are cleaned up by walking the root radix tree.
      Signed-off-by: NJeff Mahoney <jeffm@suse.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      727b9784
    • T
      btrfs: explicitly set control file's private_data · d8620958
      Tom Van Braeckel 提交于
      The private_data member of the Btrfs control device file
      (/dev/btrfs-control) is used to hold the current transaction and needs
      to be initialized to NULL to signify that no transaction is in progress.
      
      We explicitly set the control file's private_data to NULL to be
      independent of whatever value the misc subsystem initializes it to.
      
      Backstory:
      ----------
      
      The misc subsystem (which is used by /dev/btrfs-control) initializes
      a file's private_data to point to the misc device when a driver has
      registered a custom open file operation and initializes it to NULL
      when a custom open file operation has *not* been provided.
      
      This subtle quirk is confusing, to the point where kernel code registers
      *empty* file open operations to have private_data point to the misc
      device structure.
      
      And it leads to bugs, where the addition or removal of a custom open
      file operation surprisingly changes the initial contents of a file's
      private_data structure.
      
      To simplify things in the misc subsystem, a patch [1] has been proposed
      to *always* set private_data to point to the misc device instead of
      only doing this when a custom open file operation has been registered.
      
      But before we can fix this in the misc subsystem itself, we need to
      modify the (few) drivers that rely on this very subtle behavior.
      
      [1] https://lkml.org/lkml/2014/12/4/939Signed-off-by: NMartin Kepplinger <martink@posteo.de>
      Signed-off-by: NTom Van Braeckel <tomvanbraeckel@gmail.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      d8620958
  5. 04 3月, 2015 1 次提交
  6. 21 2月, 2015 1 次提交
  7. 22 1月, 2015 1 次提交
  8. 21 1月, 2015 1 次提交
    • Q
      btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock. · a53f4f8e
      Qu Wenruo 提交于
      Commit 6b5fe46d (btrfs: do commit in sync_fs if there are pending
      changes) will call btrfs_start_transaction() in sync_fs(), to handle
      some operations needed to be done in next transaction.
      
      However this can cause deadlock if the filesystem is frozen, with the
      following sys_r+w output:
      [  143.255932] Call Trace:
      [  143.255936]  [<ffffffff816c0e09>] schedule+0x29/0x70
      [  143.255939]  [<ffffffff811cb7f3>] __sb_start_write+0xb3/0x100
      [  143.255971]  [<ffffffffa040ec06>] start_transaction+0x2e6/0x5a0
      [btrfs]
      [  143.255992]  [<ffffffffa040f1eb>] btrfs_start_transaction+0x1b/0x20
      [btrfs]
      [  143.256003]  [<ffffffffa03dc0ba>] btrfs_sync_fs+0xca/0xd0 [btrfs]
      [  143.256007]  [<ffffffff811f7be0>] sync_fs_one_sb+0x20/0x30
      [  143.256011]  [<ffffffff811cbd01>] iterate_supers+0xe1/0xf0
      [  143.256014]  [<ffffffff811f7d75>] sys_sync+0x55/0x90
      [  143.256017]  [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17
      [  143.256111] Call Trace:
      [  143.256114]  [<ffffffff816c0e09>] schedule+0x29/0x70
      [  143.256119]  [<ffffffff816c3405>] rwsem_down_write_failed+0x1c5/0x2d0
      [  143.256123]  [<ffffffff8133f013>] call_rwsem_down_write_failed+0x13/0x20
      [  143.256131]  [<ffffffff811caae8>] thaw_super+0x28/0xc0
      [  143.256135]  [<ffffffff811db3e5>] do_vfs_ioctl+0x3f5/0x540
      [  143.256187]  [<ffffffff811db5c1>] SyS_ioctl+0x91/0xb0
      [  143.256213]  [<ffffffff816c49d2>] system_call_fastpath+0x12/0x17
      
      The reason is like the following:
      (Holding s_umount)
      VFS sync_fs staff:
      |- btrfs_sync_fs()
         |- btrfs_start_transaction()
            |- sb_start_intwrite()
            (Waiting thaw_fs to unfreeze)
      					VFS thaw_fs staff:
      					thaw_fs()
      					(Waiting sync_fs to release
      					 s_umount)
      
      So deadlock happens.
      This can be easily triggered by fstest/generic/068 with inode_cache
      mount option.
      
      The fix is to check if the fs is frozen, if the fs is frozen, just
      return and waiting for the next transaction.
      
      Cc: David Sterba <dsterba@suse.cz>
      Reported-by: NGui Hecheng <guihc.fnst@cn.fujitsu.com>
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      [enhanced comment, changed to SB_FREEZE_WRITE]
      Signed-off-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NChris Mason <clm@fb.com>
      a53f4f8e
  9. 20 1月, 2015 1 次提交
  10. 03 12月, 2014 1 次提交
    • F
      Btrfs: make btrfs_abort_transaction consider existence of new block groups · c92f6be3
      Filipe Manana 提交于
      If the transaction handle doesn't have used blocks but has created new block
      groups make sure we turn the fs into readonly mode too. This is because the
      new block groups didn't get all their metadata persisted into the chunk and
      device trees, and therefore if a subsequent transaction starts, allocates
      space from the new block groups, writes data or metadata into that space,
      commits successfully and then after we unmount and mount the filesystem
      again, the same space can be allocated again for a new block group,
      resulting in file data or metadata corruption.
      
      Example where we don't abort the transaction when we fail to finish the
      chunk allocation (add items to the chunk and device trees) and later a
      future transaction where the block group is removed fails because it can't
      find the chunk item in the chunk tree:
      
      [25230.404300] WARNING: CPU: 0 PID: 7721 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x50/0xfc [btrfs]()
      [25230.404301] BTRFS: Transaction aborted (error -28)
      [25230.404302] Modules linked in: btrfs dm_flakey nls_utf8 fuse xor raid6_pq ntfs vfat msdos fat xfs crc32c_generic libcrc32c ext3 jbd ext2 dm_mod nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc loop psmouse i2c_piix4 i2ccore parport_pc parport processor button pcspkr serio_raw thermal_sys evdev microcode ext4 crc16 jbd2 mbcache sr_mod cdrom ata_generic sg sd_mod crc_t10dif crct10dif_generic crct10dif_common virtio_scsi floppy e1000 ata_piix libata virtio_pci virtio_ring scsi_mod virtio [last unloaded: btrfs]
      [25230.404325] CPU: 0 PID: 7721 Comm: xfs_io Not tainted 3.17.0-rc5-btrfs-next-1+ #1
      [25230.404326] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      [25230.404328]  0000000000000000 ffff88004581bb08 ffffffff813e7a13 ffff88004581bb50
      [25230.404330]  ffff88004581bb40 ffffffff810423aa ffffffffa049386a 00000000ffffffe4
      [25230.404332]  ffffffffa05214c0 000000000000240c ffff88010fc8f800 ffff88004581bba8
      [25230.404334] Call Trace:
      [25230.404338]  [<ffffffff813e7a13>] dump_stack+0x45/0x56
      [25230.404342]  [<ffffffff810423aa>] warn_slowpath_common+0x7f/0x98
      [25230.404351]  [<ffffffffa049386a>] ? __btrfs_abort_transaction+0x50/0xfc [btrfs]
      [25230.404353]  [<ffffffff8104240b>] warn_slowpath_fmt+0x48/0x50
      [25230.404362]  [<ffffffffa049386a>] __btrfs_abort_transaction+0x50/0xfc [btrfs]
      [25230.404374]  [<ffffffffa04a8c43>] btrfs_create_pending_block_groups+0x10c/0x135 [btrfs]
      [25230.404387]  [<ffffffffa04b77fd>] __btrfs_end_transaction+0x7e/0x2de [btrfs]
      [25230.404398]  [<ffffffffa04b7a6d>] btrfs_end_transaction+0x10/0x12 [btrfs]
      [25230.404408]  [<ffffffffa04a3d64>] btrfs_check_data_free_space+0x111/0x1f0 [btrfs]
      [25230.404421]  [<ffffffffa04c53bd>] __btrfs_buffered_write+0x160/0x48d [btrfs]
      [25230.404425]  [<ffffffff811a9268>] ? cap_inode_need_killpriv+0x2d/0x37
      [25230.404429]  [<ffffffff810f6501>] ? get_page+0x1a/0x2b
      [25230.404441]  [<ffffffffa04c7c95>] btrfs_file_write_iter+0x321/0x42f [btrfs]
      [25230.404443]  [<ffffffff8110f5d9>] ? handle_mm_fault+0x7f3/0x846
      [25230.404446]  [<ffffffff813e98c5>] ? mutex_unlock+0x16/0x18
      [25230.404449]  [<ffffffff81138d68>] new_sync_write+0x7c/0xa0
      [25230.404450]  [<ffffffff81139401>] vfs_write+0xb0/0x112
      [25230.404452]  [<ffffffff81139c9d>] SyS_pwrite64+0x66/0x84
      [25230.404454]  [<ffffffff813ebf52>] system_call_fastpath+0x16/0x1b
      [25230.404455] ---[ end trace 5aa5684fdf47ab38 ]---
      [25230.404458] BTRFS warning (device sdc): btrfs_create_pending_block_groups:9228: Aborting unused transaction(No space left).
      [25288.084814] BTRFS: error (device sdc) in btrfs_free_chunk:2509: errno=-2 No such entry (Failed lookup while freeing chunk.)
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      c92f6be3
  11. 21 11月, 2014 2 次提交
  12. 12 11月, 2014 2 次提交
  13. 28 10月, 2014 1 次提交
  14. 08 10月, 2014 1 次提交
  15. 06 10月, 2014 1 次提交
    • Q
      btrfs: Make btrfs handle security mount options internally to avoid losing security label. · f667aef6
      Qu Wenruo 提交于
      [BUG]
      Originally when mount btrfs with "-o subvol=" mount option, btrfs will
      lose all security lable.
      And if the btrfs fs is mounted somewhere else, due to the lost of
      security lable, SELinux will refuse to mount since the same super block
      is being mounted using different security lable.
      
      [REPRODUCER]
      With SELinux enabled:
       #mkfs -t btrfs /dev/sda5
       #mount -o context=system_u:object_r:nfs_t:s0 /dev/sda5 /mnt/btrfs
       #btrfs subvolume create /mnt/btrfs/subvol
       #mount -o subvol=subvol,context=system_u:object_r:nfs_t:s0 /dev/sda5
        /mnt/test
      
      kernel message:
      SELinux: mount invalid.  Same superblock, different security settings
      for (dev sda5, type btrfs)
      
      [REASON]
      This happens because btrfs will call vfs_kern_mount() and then
      mount_subtree() to handle subvolume name lookup.
      First mount will cut off all the security lables and when it comes to
      the second vfs_kern_mount(), it has no security label now.
      
      [FIX]
      This patch will makes btrfs behavior much more like nfs,
      which has the type flag FS_BINARY_MOUNTDATA,
      making btrfs handles the security label internally.
      So security label will be set in the real mount time and won't lose
      label when use with "subvol=" mount option.
      Reported-by: NEryu Guan <guaneryu@gmail.com>
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      f667aef6
  16. 02 10月, 2014 4 次提交
  17. 18 9月, 2014 4 次提交
  18. 15 8月, 2014 1 次提交
  19. 08 8月, 2014 1 次提交
    • J
      dcache: d_obtain_alias callers don't all want DISCONNECTED · 1a0a397e
      J. Bruce Fields 提交于
      There are a few d_obtain_alias callers that are using it to get the
      root of a filesystem which may already have an alias somewhere else.
      
      This is not the same as the filehandle-lookup case, and none of them
      actually need DCACHE_DISCONNECTED set.
      
      It isn't really a serious problem, but it would really be clearer if we
      reserved DCACHE_DISCONNECTED for those cases where it's actually needed.
      
      In the btrfs case this was causing a spurious printk from
      nfsd/nfsfh.c:fh_verify when it found an unexpected DCACHE_DISCONNECTED
      dentry.  Josef worked around this by unsetting DCACHE_DISCONNECTED
      manually in 3a0dfa6a "Btrfs: unset DCACHE_DISCONNECTED when mounting
      default subvol", and this replaces that workaround.
      
      Cc: Josef Bacik <jbacik@fb.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      1a0a397e
  20. 03 7月, 2014 3 次提交
    • A
      btrfs: fix null pointer dereference in btrfs_show_devname when name is null · 0aeb8a6e
      Anand Jain 提交于
      dev->name is null but missing flag is not set.
      Strictly speaking the missing flag should have been set, but there
      are more places where code just checks if name is null. For now this
      patch does the same.
      
      stack:
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000064
      IP: [<ffffffffa0228908>] btrfs_show_devname+0x58/0xf0 [btrfs]
      
      [<ffffffff81198879>] show_vfsmnt+0x39/0x130
      [<ffffffff81178056>] m_show+0x16/0x20
      [<ffffffff8117d706>] seq_read+0x296/0x390
      [<ffffffff8115aa7d>] vfs_read+0x9d/0x160
      [<ffffffff8115b549>] SyS_read+0x49/0x90
      [<ffffffff817abe52>] system_call_fastpath+0x16/0x1b
      
      reproducer:
      mkfs.btrfs -draid1 -mraid1 /dev/sdg1 /dev/sdg2
      btrfstune -S 1 /dev/sdg1
      modprobe -r btrfs && modprobe btrfs
      mount -o degraded /dev/sdg1 /btrfs
      btrfs dev add /dev/sdg3 /btrfs
      Signed-off-by: NAnand Jain <Anand.Jain@oracle.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      0aeb8a6e
    • E
      btrfs: fix nossd and ssd_spread mount option regression · 2aa06a35
      Eric Sandeen 提交于
      The commit
      
      07802534 btrfs: Cleanup the btrfs_parse_options for remount.
      
      broke ssd options quite badly; it stopped making ssd_spread
      imply ssd, and it made "nossd" unsettable.
      
      Put things back at least as well as they were before
      (though ssd mount option handling is still pretty odd:
      # mount -o "nossd,ssd_spread" works?)
      Reported-by: NRoman Mamedov <rm@romanrm.net>
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      2aa06a35
    • W
      Btrfs: fix race between balance recovery and root deletion · 5f316481
      Wang Shilong 提交于
      Balance recovery is called when RW mounting or remounting from
      RO to RW, it is called to finish roots merging.
      
      When doing balance recovery, relocation root's corresponding
      fs root(whose root refs is 0) might be destroyed by cleaner
      thread, this will make btrfs fail to mount.
      
      Fix this problem by holding @cleaner_mutex when doing balance
      recovery.
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      5f316481
  21. 10 6月, 2014 1 次提交