1. 28 7月, 2012 5 次提交
  2. 27 7月, 2012 1 次提交
  3. 25 7月, 2012 5 次提交
  4. 11 7月, 2012 4 次提交
    • J
      nfsd: share some function prototypes · 7f2e7dc0
      J. Bruce Fields 提交于
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      7f2e7dc0
    • J
      nfsd: allow owner_override only for regular files · d91d0b56
      J. Bruce Fields 提交于
      We normally allow the owner of a file to override permissions checks on
      IO operations, since:
      	- the client will take responsibility for doing an access check
      	  on open;
      	- the permission checks offer no protection against malicious
      	  clients--if they can authenticate as the file's owner then
      	  they can always just change its permissions;
      	- checking permission on each IO operation breaks the usual
      	  posix rule that permission is checked only on open.
      
      However, we've never allowed the owner to override permissions on
      readdir operations, even though the above logic would also apply to
      directories.  I've never heard of this causing a problem, probably
      because a) simultaneously opening and creating a directory (with
      restricted mode) isn't possible, and b) opening a directory, then
      chmod'ing it, is rare.
      
      Our disallowal of owner-override on directories appears to be an
      accident, though--the readdir itself succeeds, and then we fail just
      because lookup_one_len() calls in our filldir methods fail.
      
      I'm not sure what the easiest fix for that would be.  For now, just make
      this behavior obvious by denying the override right at the start.
      
      This also fixes some odd v4 behavior: with the rdattr_error attribute
      requested, it would perform the readdir but return an ACCES error with
      each entry.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      d91d0b56
    • J
      nfsd4: release openowners on free in >=4.1 case · 74dbafaf
      J. Bruce Fields 提交于
      We don't need to keep openowners around in the >=4.1 case, because they
      aren't needed to handle CLOSE replays any more (that's a problem for
      sessions).  And doing so causes unexpected failures on a subsequent
      destroy_clientid to fail.
      
      We probably also need something comparable for lock owners on last
      unlock.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      74dbafaf
    • J
      nfsd4: our filesystems are normally case sensitive · 2930d381
      J. Bruce Fields 提交于
      Actually, xfs and jfs can optionally be case insensitive; we'll handle
      that case in later patches.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      2930d381
  5. 20 6月, 2012 5 次提交
    • J
      nfsd4: process_open2 cleanup · 4af82504
      J. Bruce Fields 提交于
      Note we can simplify the error handling a little by doing the truncate
      earlier.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      4af82504
    • J
      nfsd4: nfsd4_lock() cleanup · e1aaa891
      J. Bruce Fields 提交于
      Share a little common logic.  And note the comments here are a little
      out of date (e.g. we don't always create new state in the "new" case any
      more.)
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      e1aaa891
    • J
      nfsd4: remove unnecessary comment · 9068bed1
      J. Bruce Fields 提交于
      For the most part readers of cl_cb_state only need a value that is
      "eventually" right.  And the value is set only either 1) in response to
      some change of state, in which case it's set to UNKNOWN and then a
      callback rpc is sent to probe the real state, or b) in the handling of a
      response to such a callback.  UNKNOWN is therefore always a "temporary"
      state, and for the other states we're happy to accept last writer wins.
      
      So I think we're OK here.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      9068bed1
    • C
      NFSD: TEST_STATEID should not return NFS4ERR_STALE_STATEID · 7df302f7
      Chuck Lever 提交于
      According to RFC 5661, the TEST_STATEID operation is not allowed to
      return NFS4ERR_STALE_STATEID.  In addition, RFC 5661 says:
      
      15.1.16.5.  NFS4ERR_STALE_STATEID (Error Code 10023)
      
         A stateid generated by an earlier server instance was used.  This
         error is moot in NFSv4.1 because all operations that take a stateid
         MUST be preceded by the SEQUENCE operation, and the earlier server
         instance is detected by the session infrastructure that supports
         SEQUENCE.
      
      I triggered NFS4ERR_STALE_STATEID while testing the Linux client's
      NOGRACE recovery.  Bruce suggested an additional test that could be
      useful to client developers.
      
      Lastly, RFC 5661, section 18.48.3 has this:
      
       o  Special stateids are always considered invalid (they result in the
          error code NFS4ERR_BAD_STATEID).
      
      An explicit check is made for those state IDs to avoid printk noise.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      7df302f7
    • W
      nfsd: probe the back channel on new connections · 24119673
      Weston Andros Adamson 提交于
      Initiate a CB probe when a new connection with the correct direction is added
      to a session (IFF backchannel is marked as down).  Without this a
      BIND_CONN_TO_SESSION has no effect on the internal backchannel state, which
      causes the server to reply to every SEQUENCE op with the
      SEQ4_STATUS_CB_PATH_DOWN flag set until DESTROY_SESSION.
      Signed-off-by: NWeston Andros Adamson <dros@netapp.com>
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      24119673
  6. 18 6月, 2012 2 次提交
  7. 16 6月, 2012 2 次提交
  8. 15 6月, 2012 16 次提交
    • M
      Btrfs: destroy the items of the delayed inodes in error handling routine · 67cde344
      Miao Xie 提交于
      the items of the delayed inodes were forgotten to be freed, this patch
      fixes it.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      67cde344
    • L
      Btrfs: make sure that we've made everything in pinned tree clean · ed0eaa14
      Liu Bo 提交于
      Since we have two trees for recording pinned extents, we need to go through
      both of them to make sure that we've done everything clean.
      Signed-off-by: NLiu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      ed0eaa14
    • L
      Btrfs: avoid memory leak of extent state in error handling routine · 6e841e32
      Liu Bo 提交于
      We've forgotten to clear extent states in pinned tree, which will results in
      space counter mismatch and memory leak:
      
      WARNING: at fs/btrfs/extent-tree.c:7537 btrfs_free_block_groups+0x1f3/0x2e0 [btrfs]()
      ...
      space_info 2 has 8380416 free, is not full
      space_info total=12582912, used=4096, pinned=4096, reserved=0, may_use=0, readonly=4194304
      btrfs state leak: start 29364224 end 29376511 state 1 in tree ffff880075f20090 refs 1
      ...
      Signed-off-by: NLiu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      6e841e32
    • L
      Btrfs: do not resize a seeding device · 4e42ae1b
      Liu Bo 提交于
      Seeding devices are not supposed to change any more.
      Signed-off-by: NLiu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      4e42ae1b
    • L
      Btrfs: fix missing inherited flag in rename · bc178237
      Liu Bo 提交于
      When we move a file into a directory with compression flag, we need to
      inherite BTRFS_INODE_COMPRESS and clear BTRFS_INODE_NOCOMPRESS as well.
      But if we move a file into a directory without compression flag, we need
      to clear both of them.
      
      It is the way how our setflags deals with compression flag, so keep
      the same behaviour here.
      Signed-off-by: NLiu Bo <liubo2009@cn.fujitsu.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      bc178237
    • L
      Btrfs: fix incompat flags setting · 69e380d1
      Li Zefan 提交于
      It's a bug, but it happens to work, as BTRFS_COMPRESS_LZO == 2, which
      has only one bit set.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      69e380d1
    • L
      Btrfs: fix defrag regression · 6c282eb4
      Li Zefan 提交于
      If a file has 3 small extents:
      
      | ext1 | ext2 | ext3 |
      
      Running "btrfs fi defrag" will only defrag the last two extents, if those
      extent mappings hasn't been read into memory from disk.
      
      This bug was introduced by commit 17ce6ef8
      ("Btrfs: add a check to decide if we should defrag the range")
      
      The cause is, that commit looked into previous and next extents using
      lookup_extent_mapping() only.
      
      While at it, remove the code that checks the previous extent, since
      it's sufficient to check the next extent.
      Signed-off-by: NLi Zefan <lizefan@huawei.com>
      6c282eb4
    • J
      Btrfs: call filemap_fdatawrite twice for compression · 7ddf5a42
      Josef Bacik 提交于
      I removed this in an earlier commit and I was wrong.  Because compression
      can return from filemap_fdatawrite() without having actually set any of it's
      pages as writeback() it can make filemap_fdatawait() do essentially nothing,
      and then we won't find any ordered extents because they may not have been
      created yet.  So not only does this make fsync() completely useless, but it
      will also screw up if you truncate on a non-page aligned offset since we
      zero out the end and then wait on ordered extents and then call drop caches.
      We can drop the cache before the io completes and then we try to unpin the
      extent we just wrote we won't find it and everything goes sideways.  So fix
      this by putting it back and put a giant comment there to keep me from trying
      to remove it in the future.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      7ddf5a42
    • J
      Btrfs: keep inode pinned when compressing writes · 8180ef88
      Josef Bacik 提交于
      A user reported lots of problems using compression on the new code and it
      turns out part of the problem was that igrab() was failing when we added a
      new ordered extent.  This is because when writing out an inode under
      compression we immediately return without actually doing anything to the
      pages, and then in another thread at some point down the line actually do
      the ordered dance.  The problem is between the point that we start writeback
      and we actually add the ordered extent we could be trying to reclaim the
      inode, which makes igrab() return NULL.  So we need to do an igrab() when we
      create the async extent and then drop it when we are done with it.  This
      makes sure we stay pinned in memory until the ordered extent can get a
      reference on it and we are good to go.  With this patch we no longer panic
      in btrfs_finish_ordered_io().  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      8180ef88
    • J
      Btrfs: implement ->show_devname · 9c5085c1
      Josef Bacik 提交于
      Because btrfs can remove the device that was mounted we need to have a
      ->show_devname so that in this case we can print out some other device in
      the file system to /proc/mount.  So if there are multiple devices in a btrfs
      file system we will just print the device with the lowest devid that we can
      find.  This will make everything consistent and deal with device removal
      properly.  The drawback is if you mount with a device that is higher than
      the lowest devicd it won't show up as the mounted device in /proc/mounts,
      but this is a small price to pay. This was inspired by Miao Xie's patch.
      Thanks,
      Reviewed-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      9c5085c1
    • J
      Btrfs: use rcu to protect device->name · 606686ee
      Josef Bacik 提交于
      Al pointed out that we can just toss out the old name on a device and add a
      new one arbitrarily, so anybody who uses device->name in printk could
      possibly use free'd memory.  Instead of adding locking around all of this he
      suggested doing it with RCU, so I've introduced a struct rcu_string that
      does just that and have gone through and protected all accesses to
      device->name that aren't under the uuid_mutex with rcu_read_lock().  This
      protects us and I will use it for dealing with removing the device that we
      used to mount the file system in a later patch.  Thanks,
      Reviewed-by: NDavid Sterba <dsterba@suse.cz>
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      606686ee
    • J
      Btrfs: unlock everything properly in the error case for nocow · 17ca04af
      Josef Bacik 提交于
      I was getting hung on umount when a transaction was aborted because a range
      of one of the free space inodes was still locked.  This is because the nocow
      stuff doesn't unlock anything on error.  This fixed the problem and I
      verified that is what was happening.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      17ca04af
    • J
      Btrfs: fix btrfs_destroy_marked_extents · ee670f0a
      Josef Bacik 提交于
      So we're forcing the eb's to have their ref count set to 1 so invalidatepage
      works but this breaks lots of things, for example root nodes, and is just
      plain wrong, we don't need to just evict all of this stuff.  Also drop the
      invalidatepage altogether and add a page_cache_release().  With this patch
      we no longer hang when trying to access the root nodes after an aborted
      transaction and we no longer leak memory.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      ee670f0a
    • J
      Btrfs: abort the transaction if the commit fails · 7b8b92af
      Josef Bacik 提交于
      If a transaction commit fails we don't abort it so we don't set an error on
      the file system.  This patch fixes that by actually calling the abort stuff
      and then adding a check for a fs error in the transaction start stuff to
      make sure it is caught properly.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      7b8b92af
    • J
      Btrfs: wake up transaction waiters when aborting a transaction · d7096fc3
      Josef Bacik 提交于
      I was getting lots of hung tasks and a NULL pointer dereference because we
      are not cleaning up the transaction properly when it aborts.  First we need
      to reset the running_transaction to NULL so we don't get a bad dereference
      for any start_transaction callers after this.  Also we cannot rely on
      waitqueue_active() since it's just a list_empty(), so just call wake_up()
      directly since that will do the barrier for us and such.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      d7096fc3
    • J
      Btrfs: fix locking in btrfs_destroy_delayed_refs · b939d1ab
      Josef Bacik 提交于
      The transaction abort stuff was throwing warnings from the list debugging
      code because we do a list_del_init outside of the delayed_refs spin lock.
      The delayed refs locking makes baby Jesus cry so it's not hard to get wrong,
      but we need to take the ref head mutex to make sure it's not being processed
      currently, and so if it is we need to drop the spin lock and then take and
      drop the mutex and do the search again.  If we can take the mutex then we
      can safely remove the head from the list and carry on.  Now when the
      transaction aborts I don't get the list debugging warnings.  Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      b939d1ab