1. 24 5月, 2011 2 次提交
    • J
      Btrfs: kill trans_mutex · a4abeea4
      Josef Bacik 提交于
      We use trans_mutex for lots of things, here's a basic list
      
      1) To serialize trans_handles joining the currently running transaction
      2) To make sure that no new trans handles are started while we are committing
      3) To protect the dead_roots list and the transaction lists
      
      Really the serializing trans_handles joining is not too hard, and can really get
      bogged down in acquiring a reference to the transaction.  So replace the
      trans_mutex with a trans_lock spinlock and use it to do the following
      
      1) Protect fs_info->running_transaction.  All trans handles have to do is check
      this, and then take a reference of the transaction and keep on going.
      2) Protect the fs_info->trans_list.  This doesn't get used too much, basically
      it just holds the current transactions, which will usually just be the currently
      committing transaction and the currently running transaction at most.
      3) Protect the dead roots list.  This is only ever processed by splicing the
      list so this is relatively simple.
      4) Protect the fs_info->reloc_ctl stuff.  This is very lightweight and was using
      the trans_mutex before, so this is a pretty straightforward change.
      5) Protect fs_info->no_trans_join.  Because we don't hold the trans_lock over
      the entirety of the commit we need to have a way to block new people from
      creating a new transaction while we're doing our work.  So we set no_trans_join
      and in join_transaction we test to see if that is set, and if it is we do a
      wait_on_commit.
      6) Make the transaction use count atomic so we don't need to take locks to
      modify it when we're dropping references.
      7) Add a commit_lock to the transaction to make sure multiple people trying to
      commit the same transaction don't race and commit at the same time.
      8) Make open_ioctl_trans an atomic so we don't have to take any locks for ioctl
      trans.
      
      I have tested this with xfstests, but obviously it is a pretty hairy change so
      lots of testing is greatly appreciated.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      a4abeea4
    • J
      Btrfs: take away the num_items argument from btrfs_join_transaction · 7a7eaa40
      Josef Bacik 提交于
      I keep forgetting that btrfs_join_transaction() just ignores the num_items
      argument, which leads me to sending pointless patches and looking stupid :).  So
      just kill the num_items argument from btrfs_join_transaction and
      btrfs_start_ioctl_transaction, since neither of them use it.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      7a7eaa40
  2. 15 5月, 2011 3 次提交
  3. 12 4月, 2011 1 次提交
  4. 05 4月, 2011 2 次提交
  5. 28 3月, 2011 4 次提交
  6. 24 3月, 2011 1 次提交
  7. 18 3月, 2011 1 次提交
    • J
      Btrfs: handle errors in btrfs_orphan_cleanup · 66b4ffd1
      Josef Bacik 提交于
      If we cannot truncate an inode for some reason we will never delete the orphan
      item associated with that inode, which means that we will loop forever in
      btrfs_orphan_cleanup.  Instead of doing this just return error so we fail to
      mount.  It sucks, but hey it's better than hanging.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      66b4ffd1
  8. 17 2月, 2011 1 次提交
  9. 15 2月, 2011 1 次提交
    • D
      btrfs: prevent heap corruption in btrfs_ioctl_space_info() · 51788b1b
      Dan Rosenberg 提交于
      Commit bf5fc093 refactored
      btrfs_ioctl_space_info() and introduced several security issues.
      
      space_args.space_slots is an unsigned 64-bit type controlled by a
      possibly unprivileged caller.  The comparison as a signed int type
      allows providing values that are treated as negative and cause the
      subsequent allocation size calculation to wrap, or be truncated to 0.
      By providing a size that's truncated to 0, kmalloc() will return
      ZERO_SIZE_PTR.  It's also possible to provide a value smaller than the
      slot count.  The subsequent loop ignores the allocation size when
      copying data in, resulting in a heap overflow or write to ZERO_SIZE_PTR.
      
      The fix changes the slot count type and comparison typecast to u64,
      which prevents truncation or signedness errors, and also ensures that we
      don't copy more data than we've allocated in the subsequent loop.  Note
      that zero-size allocations are no longer possible since there is already
      an explicit check for space_args.space_slots being 0 and truncation of
      this value is no longer an issue.
      Signed-off-by: NDan Rosenberg <drosenberg@vsecurity.com>
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      Reviewed-by: NJosef Bacik <josef@redhat.com>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      51788b1b
  10. 01 2月, 2011 1 次提交
  11. 29 1月, 2011 2 次提交
  12. 27 1月, 2011 1 次提交
    • L
      Btrfs: Fix file clone when source offset is not 0 · 4d728ec7
      Li Zefan 提交于
      Suppose:
      - the source extent is: [0, 100]
      - the src offset is 10
      - the clone length is 90
      - the dest offset is 0
      
      This statement:
      
      	new_key.offset = key.offset + destoff - off
      
      will produce such an extent for the dest file:
      
      	[ino, BTRFS_EXTENT_DATA_KEY, -10]
      
      , which is obviously wrong.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      4d728ec7
  13. 23 12月, 2010 3 次提交
    • L
      Btrfs: Add BTRFS_IOC_SUBVOL_GETFLAGS/SETFLAGS ioctls · 0caa102d
      Li Zefan 提交于
      This allows us to set a snapshot or a subvolume readonly or writable
      on the fly.
      
      Usage:
      
      Set BTRFS_SUBVOL_RDONLY of btrfs_ioctl_vol_arg_v2->flags, and then
      call ioctl(BTRFS_IOCTL_SUBVOL_SETFLAGS);
      
      Changelog for v3:
      
      - Change to pass __u64 as ioctl parameter.
      
      Changelog for v2:
      
      - Add _GETFLAGS ioctl.
      - Check if the passed fd is the root of a subvolume.
      - Change the name from _SNAP_SETFLAGS to _SUBVOL_SETFLAGS.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      0caa102d
    • L
      Btrfs: Add readonly snapshots support · b83cc969
      Li Zefan 提交于
      Usage:
      
      Set BTRFS_SUBVOL_RDONLY of btrfs_ioctl_vol_arg_v2->flags, and call
      ioctl(BTRFS_I0CTL_SNAP_CREATE_V2).
      
      Implementation:
      
      - Set readonly bit of btrfs_root_item->flags.
      - Add readonly checks in btrfs_permission (inode_permission),
      btrfs_setattr, btrfs_set/remove_xattr and some ioctls.
      
      Changelog for v3:
      
      - Eliminate btrfs_root->readonly, but check btrfs_root->root_item.flags.
      - Rename BTRFS_ROOT_SNAP_RDONLY to BTRFS_ROOT_SUBVOL_RDONLY.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      b83cc969
    • L
      Btrfs: Refactor btrfs_ioctl_snap_create() · fa0d2b9b
      Li Zefan 提交于
      Split it into two functions for two different ioctls, since they
      share no common code.
      Signed-off-by: NLi Zefan <lizf@cn.fujitsu.com>
      fa0d2b9b
  14. 22 12月, 2010 2 次提交
  15. 11 12月, 2010 2 次提交
  16. 22 11月, 2010 3 次提交
  17. 30 10月, 2010 9 次提交
    • S
      Btrfs: allow subvol deletion by unprivileged user with -o user_subvol_rm_allowed · 4260f7c7
      Sage Weil 提交于
      Add a mount option user_subvol_rm_allowed that allows users to delete a
      (potentially non-empty!) subvol when they would otherwise we allowed to do
      an rmdir(2).  We duplicate the may_delete() checks from the core VFS code
      to implement identical security checks (minus the directory size check).
      We additionally require that the user has write+exec permission on the
      subvol root inode.
      Signed-off-by: NSage Weil <sage@newdream.net>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      4260f7c7
    • S
      Btrfs: make SNAP_DESTROY async · 531cb13f
      Sage Weil 提交于
      There is no reason to force an immediate commit when deleting a snapshot.
      Users have some expectation that space from a deleted snapshot be freed
      immediately, but even if we do commit the reclaim is a background process.
      
      If users _do_ want the deletion to be durable, they can call 'sync'.
      Signed-off-by: NSage Weil <sage@newdream.net>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      531cb13f
    • S
      Btrfs: add SNAP_CREATE_ASYNC ioctl · 72fd032e
      Sage Weil 提交于
      Create a snap without waiting for it to commit to disk.  The ioctl is
      ordered such that subsequent operations will not be contained by the
      created snapshot, and the commit is initiated, but the ioctl does not
      wait for the snapshot to commit to disk.
      
      We return the specific transid to userspace so that an application can wait
      for this specific snapshot creation to commit via the WAIT_SYNC ioctl.
      Signed-off-by: NSage Weil <sage@newdream.net>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      72fd032e
    • S
      Btrfs: add START_SYNC, WAIT_SYNC ioctls · 46204592
      Sage Weil 提交于
      START_SYNC will start a sync/commit, but not wait for it to
      complete.  Any modification started after the ioctl returns is
      guaranteed not to be included in the commit.  If a non-NULL
      pointer is passed, the transaction id will be returned to
      userspace.
      
      WAIT_SYNC will wait for any in-progress commit to complete.  If a
      transaction id is specified, the ioctl will block and then
      return (success) when the specified transaction has committed.
      If it has already committed when we call the ioctl, it returns
      immediately.  If the specified transaction doesn't exist, it
      returns EINVAL.
      
      If no transaction id is specified, WAIT_SYNC will wait for the
      currently committing transaction to finish it's commit to disk.
      If there is no currently committing transaction, it returns
      success.
      
      These ioctls are useful for applications which want to impose an
      ordering on when fs modifications reach disk, but do not want to
      wait for the full (slow) commit process to do so.
      
      Picky callers can take the transid returned by START_SYNC and
      feed it to WAIT_SYNC, and be certain to wait only as long as
      necessary for the transaction _they_ started to reach disk.
      
      Sloppy callers can START_SYNC and WAIT_SYNC without a transid,
      and provided they didn't wait too long between the calls, they
      will get the same result.  However, if a second commit starts
      before they call WAIT_SYNC, they may end up waiting longer for
      it to commit as well.  Even so, a START_SYNC+WAIT_SYNC still
      guarantees that any operation completed before the START_SYNC
      reaches disk.
      Signed-off-by: NSage Weil <sage@newdream.net>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      46204592
    • S
      Btrfs: fix lockdep warning on clone ioctl · fccdae43
      Sage Weil 提交于
      I'm no lockdep expert, but this appears to make the lockdep warning go
      away for the i_mutex locking in the clone ioctl.
      Signed-off-by: NSage Weil <sage@newdream.net>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      fccdae43
    • S
      Btrfs: fix clone ioctl where range is adjacent to extent · 050006a7
      Sage Weil 提交于
      We had an edge case issue where the requested range was just
      following an existing extent. Instead of skipping to the next
      extent, we used the previous one which lead to having zero
      sized extents.
      Signed-off-by: NYehuda Sadeh <yehuda@hq.newdream.net>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      050006a7
    • S
      Btrfs: fix delalloc checks in clone ioctl · 9a019196
      Sage Weil 提交于
      The lookup_first_ordered_extent() was done on the wrong inode, and the
      ->delalloc_bytes test was wrong, as the following
      btrfs_wait_ordered_range() would only invoke a range write and wouldn't
      write the entire file data range. Also, a bad parameter was passed to
      btrfs_wait_ordered_range().
      Signed-off-by: NYehuda Sadeh <yehuda@hq.newdream.net>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      9a019196
    • A
      Btrfs: cleanup warnings from gcc 4.6 (nonbugs) · 559af821
      Andi Kleen 提交于
      These are all the cases where a variable is set, but not read which are
      not bugs as far as I can see, but simply leftovers.
      
      Still needs more review.
      
      Found by gcc 4.6's new warnings
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      559af821
    • J
      Btrfs: use memdup_user helpers · 2354d08f
      Julia Lawall 提交于
      Use memdup_user when user data is immediately copied into the
      allocated region.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression from,to,size,flag;
      position p;
      identifier l1,l2;
      @@
      
      -  to = \(kmalloc@p\|kzalloc@p\)(size,flag);
      +  to = memdup_user(from,size);
         if (
      -      to==NULL
      +      IS_ERR(to)
                       || ...) {
         <+... when != goto l1;
      -  -ENOMEM
      +  PTR_ERR(to)
         ...+>
         }
      -  if (copy_from_user(to, from, size) != 0) {
      -    <+... when != goto l2;
      -    -EFAULT
      -    ...+>
      -  }
      // </smpl>
      Signed-off-by: NJulia Lawall <julia@diku.dk>
      Cc: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      2354d08f
  18. 23 10月, 2010 1 次提交
    • J
      Btrfs: fix the df ioctl to report raid types · bf5fc093
      Josef Bacik 提交于
      The new ENOSPC stuff broke the df ioctl since we no longer create seperate space
      info's for each RAID type.  So instead, loop through each space info's raid
      lists so we can get the right RAID information which will allow the df ioctl to
      tell us RAID types again.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      bf5fc093