1. 13 Sep 2021, 2 commits
    • afs: Fix updating of i_blocks on file/dir extension · 9d37e1ca
      Committed by David Howells
      When an afs file or directory is modified locally such that the total file
      size is extended, i_blocks needs to be recalculated too.
      
      Fix this by making afs_write_end() and afs_edit_dir_add() call
      afs_set_i_size() rather than setting inode->i_size directly as that also
      recalculates inode->i_blocks.
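
      A minimal sketch of what such a helper can look like, assuming the vnode
      embeds its VFS inode as vfs_inode; i_blocks counts 512-byte units, so the
      size is rounded up to 1K and then doubled:

          static void afs_set_i_size(struct afs_vnode *vnode, u64 size)
          {
                  /* Publish the new size... */
                  i_size_write(&vnode->vfs_inode, size);

                  /* ...and derive i_blocks from it: round up to the next 1K,
                   * then convert to 512-byte units.
                   */
                  vnode->vfs_inode.i_blocks = ((size + 1023) >> 10) << 1;
          }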
      
      This can be tested by creating and writing into directories and files and
      then examining them with du.  Without this change, directories show 4
      blocks (they start out at 2048 bytes) and files show 0 blocks; with this
      change, they should show a number of blocks proportional to the file
      size, rounded up to the nearest 1024 bytes.
      
      Fixes: 31143d5d ("AFS: implement basic file write support")
      Fixes: 63a4681f ("afs: Locally edit directory data for mkdir/create/unlink/...")
      Reported-by: Markus Suvanto <markus.suvanto@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/
      9d37e1ca
    • afs: Try to avoid taking RCU read lock when checking vnode validity · 4fe6a946
      Committed by David Howells
      Try to avoid taking the RCU read lock when checking the validity of a
      vnode's callback state.  The only thing it's needed for is to pin the
      parent volume's server list whilst we search it to find the record of the
      server we're currently using to see if it has been reinitialised (ie. it
      sent us a CB.InitCallBackState* RPC).
      
      Do this by the following means:
      
       (1) Keep an additional per-cell counter (fs_s_break) that's incremented
           each time any of the fileservers in the cell reinitialises.
      
           Since the new counter can be accessed without RCU from the vnode, we
           can check that first - and only if it differs, get the RCU read lock
           and check the volume's server list.
      
       (2) Replace afs_get_s_break_rcu() with afs_check_server_good(), which now
           indicates whether the callback promise is still expected to be present
           on the server.  This does the checks described in (1); a sketch follows
           this list.
      
       (3) Restructure afs_check_validity() to take account of the change in (2).
      
           We can also get rid of the valid variable and just use the need_clear
           variable with the addition of the afs_cb_break_no_promise reason.
      
       (4) afs_check_validity() probably shouldn't be altering vnode->cb_v_break
           and vnode->cb_s_break when it doesn't have cb_lock exclusively locked.
      
           Move the change to vnode->cb_v_break to __afs_break_callback().
      
           Delegate the change to vnode->cb_s_break to afs_select_fileserver()
           and set vnode->cb_fs_s_break there also.
      
       (5) afs_validate() no longer needs to get the RCU read lock around its
           call to afs_check_validity() - and can skip the call entirely if we
           don't have a promise.
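
      A sketch of the check described in (1) and (2); the field names here
      (cb_fs_s_break on the vnode, fs_s_break on the cell, and the server-list
      layout) are assumptions drawn from the description above rather than
      quotes from the patch:

          static bool afs_check_server_good(struct afs_vnode *vnode)
          {
                  struct afs_server_list *slist;
                  struct afs_server *server;
                  bool good;
                  int i;

                  /* Fast path: no fileserver in the cell has reinitialised
                   * since we last looked, so the promise should still stand.
                   */
                  if (vnode->cb_fs_s_break ==
                      READ_ONCE(vnode->volume->cell->fs_s_break))
                          return true;

                  /* Slow path: pin the volume's server list with the RCU read
                   * lock and check the server we're currently using.
                   */
                  rcu_read_lock();
                  slist = rcu_dereference(vnode->volume->servers);
                  for (i = 0; i < slist->nr_servers; i++) {
                          server = slist->servers[i].server;
                          if (server == vnode->cb_server) {
                                  good = (vnode->cb_s_break == server->cb_s_break);
                                  rcu_read_unlock();
                                  return good;
                          }
                  }
                  rcu_read_unlock();
                  return false;
          }
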
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/163111669583.283156.1397603105683094563.stgit@warthog.procyon.org.uk/
      4fe6a946
  2. 02 May 2021, 1 commit
    • afs: Fix speculative status fetches · 22650f14
      Committed by David Howells
      The generic/464 xfstest causes kAFS to emit occasional warnings of the
      form:
      
              kAFS: vnode modified {100055:8a} 30->31 YFS.StoreData64 (c=6015)
      
      This indicates that the data version received back from the server did not
      match the expected value (the DV should be incremented monotonically for
      each individual modification op committed to a vnode).
      
      What is happening is that a lookup call is doing a bulk status fetch
      speculatively on a bunch of vnodes in a directory besides getting the
      status of the vnode it's actually interested in.  This is racing with a
      StoreData operation (though it could also occur with, say, a MakeDir op).
      
      On the client, a modification operation locks the vnode, but the bulk
      status fetch only locks the parent directory, so no ordering is imposed
      there (thereby avoiding an avenue to deadlock).
      
      On the server, the StoreData op handler doesn't lock the vnode until it's
      received all the request data, and downgrades the lock after committing the
      data until it has finished sending change notifications to other clients -
      which allows the status fetch to occur before it has finished.
      
      This means that:
      
       - a status fetch can access the target vnode either side of the exclusive
         section of the modification
      
       - the status fetch could start before the modification, yet finish after,
         and vice-versa.
      
       - the status fetch and the modification RPCs can complete in either order.
      
       - the status fetch can return either the before or the after DV from the
         modification.
      
       - the status fetch might regress the locally cached DV.
      
      Some of these are handled by the previous fix[1], but that's not sufficient
      because it checks the DV it received against the DV it cached at the start
      of the op, and the DV might've been updated in the meantime by a locally
      generated modification op.
      
      Fix this by the following means:
      
       (1) Keep track of when we're performing a modification operation on a
           vnode.  This is done by marking vnode parameters with a 'modification'
           note that causes the AFS_VNODE_MODIFYING flag to be set on the vnode
           for the duration.
      
       (2) Alter the speculation race detection to ignore speculative status
           fetches if either the vnode is marked as being modified or the data
           version number is not what we expected (see the sketch after this
           list).
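
      A sketch of the guard described in (1) and (2); the flag name comes from
      the text above, but the helper names and call sites are illustrative only:

          /* (1) Mark the vnode whilst a modification op is in flight. */
          static void afs_begin_modification(struct afs_vnode *vnode)
          {
                  set_bit(AFS_VNODE_MODIFYING, &vnode->flags);
                  smp_mb__after_atomic();
          }

          static void afs_end_modification(struct afs_vnode *vnode)
          {
                  clear_bit_unlock(AFS_VNODE_MODIFYING, &vnode->flags);
          }

          /* (2) Decide whether a speculative bulk-status result may be applied. */
          static bool afs_speculative_status_usable(struct afs_vnode *vnode,
                                                    u64 expected_dv, u64 reported_dv)
          {
                  if (test_bit(AFS_VNODE_MODIFYING, &vnode->flags))
                          return false;
                  return reported_dv == expected_dv;
          }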
      
      Note that whilst the client does recover from the "vnode modified" warning
      by refetching the status at the next opportunity, doing so also invalidates
      the pagecache, so changes might get lost.
      
      Fixes: a9e5c87c ("afs: Fix speculative status fetch going out of order wrt to modifications")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-and-reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/160605082531.252452.14708077925602709042.stgit@warthog.procyon.org.uk/ [1]
      Link: https://lore.kernel.org/linux-fsdevel/161961335926.39335.2552653972195467566.stgit@warthog.procyon.org.uk/ # v1
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      22650f14
  3. 23 Apr 2021, 2 commits
  4. 16 Mar 2021, 1 commit
  5. 08 Mar 2021, 1 commit
  6. 24 Jan 2021, 2 commits
  7. 23 Nov 2020, 1 commit
    • afs: Fix speculative status fetch going out of order wrt to modifications · a9e5c87c
      Committed by David Howells
      When doing a lookup in a directory, the afs filesystem uses a bulk
      status fetch to speculatively retrieve the statuses of up to 48 other
      vnodes found in the same directory and it will then either update extant
      inodes or create new ones - effectively doing 'lookup ahead'.
      
      To avoid the possibility of deadlocking itself, however, the filesystem
      doesn't lock all of those inodes; rather just the directory inode is
      locked (by the VFS).
      
      When the operation completes, afs_inode_init_from_status() or
      afs_apply_status() is called, depending on whether the inode already
      exists, to commit the new status.
      
      A case exists, however, where the speculative status fetch operation may
      straddle a modification operation on one of those vnodes.  What can then
      happen is that the speculative bulk status RPC retrieves the old status
      and, whilst that is in flight, the modification completes and returns an
      updated status; the modification status is committed, and then we attempt
      to commit the stale speculative status.
      
      This results in something like the following being seen in dmesg:
      
      	kAFS: vnode modified {100058:861} 8->9 YFS.InlineBulkStatus
      
      showing that for vnode 861 on volume 100058, we saw YFS.InlineBulkStatus
      say that the vnode had data version 8 when we'd already recorded version
      9 due to a local modification.  This was causing the cache to be
      invalidated for that vnode when it shouldn't have been.  If it happens
      on a data file, this might lead to local changes being lost.
      
      Fix this by ignoring speculative status updates if the data version
      doesn't match the expected value.
      
      Note that it is possible to get a DV regression if a volume gets
      restored from a backup - but we should get a callback break in such a
      case that should trigger a recheck anyway.  It might be worth checking
      the volume creation time in the volsync info and, if a change is
      observed in that (as would happen on a restore), invalidate all caches
      associated with the volume.
      
      Fixes: 5cf9dd55 ("afs: Prospectively look up extra files when doing a single lookup")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a9e5c87c
  8. 09 Oct 2020, 1 commit
    • afs: Fix deadlock between writeback and truncate · ec0fa0b6
      Committed by David Howells
      The afs filesystem has a lock[*] that it uses to serialise I/O operations
      going to the server (vnode->io_lock), as the server will only perform one
      modification operation at a time on any given file or directory.  This
      prevents the the filesystem from filling up all the call slots to a server
      with calls that aren't going to be executed in parallel anyway, thereby
      allowing operations on other files to obtain slots.
      
        [*] Note that this is probably redundant for directories at least, since
            i_rwsem is used to serialise directory modifications and
            lookup/reading vs modification.  The server does allow parallel
            non-modification ops, however.
      
      When a file truncation op completes, we truncate the in-memory copy of the
      file to match - but we do it whilst still holding the io_lock, the idea
      being to prevent races with other operations.
      
      However, if writeback starts in a worker thread simultaneously with
      truncation (whilst notify_change() is called with i_rwsem locked, writeback
      pays it no heed), it may manage to set PG_writeback bits on the pages that
      will get truncated before afs_setattr_success() manages to call
      truncate_pagecache().  Truncate will then wait for those pages - whilst
      still inside io_lock:
      
          # cat /proc/8837/stack
          [<0>] wait_on_page_bit_common+0x184/0x1e7
          [<0>] truncate_inode_pages_range+0x37f/0x3eb
          [<0>] truncate_pagecache+0x3c/0x53
          [<0>] afs_setattr_success+0x4d/0x6e
          [<0>] afs_wait_for_operation+0xd8/0x169
          [<0>] afs_do_sync_operation+0x16/0x1f
          [<0>] afs_setattr+0x1fb/0x25d
          [<0>] notify_change+0x2cf/0x3c4
          [<0>] do_truncate+0x7f/0xb2
          [<0>] do_sys_ftruncate+0xd1/0x104
          [<0>] do_syscall_64+0x2d/0x3a
          [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The writeback operation, however, stalls indefinitely because it needs to
      get the io_lock to proceed:
      
          # cat /proc/5940/stack
          [<0>] afs_get_io_locks+0x58/0x1ae
          [<0>] afs_begin_vnode_operation+0xc7/0xd1
          [<0>] afs_store_data+0x1b2/0x2a3
          [<0>] afs_write_back_from_locked_page+0x418/0x57c
          [<0>] afs_writepages_region+0x196/0x224
          [<0>] afs_writepages+0x74/0x156
          [<0>] do_writepages+0x2d/0x56
          [<0>] __writeback_single_inode+0x84/0x207
          [<0>] writeback_sb_inodes+0x238/0x3cf
          [<0>] __writeback_inodes_wb+0x68/0x9f
          [<0>] wb_writeback+0x145/0x26c
          [<0>] wb_do_writeback+0x16a/0x194
          [<0>] wb_workfn+0x74/0x177
          [<0>] process_one_work+0x174/0x264
          [<0>] worker_thread+0x117/0x1b9
          [<0>] kthread+0xec/0xf1
          [<0>] ret_from_fork+0x1f/0x30
      
      and thus deadlock has occurred.
      
      Note that whilst afs_setattr() calls filemap_write_and_wait(), the fact
      that the caller is holding i_rwsem doesn't preclude more pages being
      dirtied through an mmap'd region.
      
      Fix this by:
      
       (1) Use the vnode validate_lock to mediate access between afs_setattr()
           and afs_writepages():
      
           (a) Exclusively lock validate_lock in afs_setattr() around the whole
           	 RPC operation.
      
           (b) If WB_SYNC_ALL isn't set on entry to afs_writepages(), try to
               take a shared lock on validate_lock and return immediately if it
               can't be obtained (see the sketch after this list).
      
           (c) If WB_SYNC_ALL is set, wait for the lock.
      
           The validate_lock is also used to validate a file and to zap its cache
           if the file was altered by a third party, so it's probably a good fit
           for this.
      
       (2) Move the truncation outside of the io_lock in setattr, using the same
           hook as is used for local directory editing.
      
           This requires the old i_size to be retained in the operation record as
           we commit the revised status to the inode members inside the io_lock
           still, but we still need to know if we reduced the file size.
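
      A sketch of the locking scheme from (1), assuming afs_writepages_region()
      takes the range to write back; only the validate_lock handling matters
      here:

          static int afs_writepages(struct address_space *mapping,
                                    struct writeback_control *wbc)
          {
                  struct afs_vnode *vnode = AFS_FS_I(mapping->host);
                  int ret;

                  /* Data-integrity writeback must wait for the lock (1c);
                   * background writeback just gives up if setattr is busy
                   * truncating (1b), which avoids the deadlock shown above.
                   */
                  if (wbc->sync_mode == WB_SYNC_ALL)
                          down_read(&vnode->validate_lock);
                  else if (!down_read_trylock(&vnode->validate_lock))
                          return 0;

                  ret = afs_writepages_region(mapping, wbc, 0, LLONG_MAX);
                  up_read(&vnode->validate_lock);
                  return ret;
          }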
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ec0fa0b6
  9. 17 Jun 2020, 1 commit
    • afs: Fix silly rename · b6489a49
      Committed by David Howells
      Fix AFS's silly rename by the following means:
      
       (1) Set the destination directory in afs_do_silly_rename() so as to avoid
           misbehaviour and indicate that the directory data version will
           increment by 1 so as to avoid warnings about unexpected changes in the
           DV.  Also indicate that the ctime should be updated to avoid xfstest
           grumbling.
      
       (2) Note when the server indicates that a directory changed more than we
           expected (AFS_OPERATION_DIR_CONFLICT), indicating a conflict with a
           third party change, checking on successful completion of unlink and
           rename.
      
           The problem is that the FS.RemoveFile RPC op doesn't report the status
           of the unlinked file, though YFS.RemoveFile2 does.  This can be
           mitigated by the assumption that if the directory DV cranked by
           exactly 1, we can be sure we removed one link from the file; further,
           ordinarily in AFS, files cannot be hardlinked across directories, so
           if we reduce nlink to 0, the file is deleted.
      
           However, if the directory DV jumps by more than 1, we cannot know if a
           third party intervened by adding or removing a link on the file we
           just removed a link from.
      
           The same also goes for any vnode that is at the destination of the
           FS.Rename RPC op.
      
       (3) Make afs_vnode_commit_status() apply the nlink drop inside the cb_lock
           section along with the other attribute updates if ->op_unlinked is set
           on the descriptor for the appropriate vnode.
      
       (4) Issue a follow up status fetch to the unlinked file in the event of a
           third party conflict that makes it impossible for us to know if we
           actually deleted the file or not.
      
       (5) Provide a flag, AFS_VNODE_SILLY_DELETED, to make afs_getattr() lie to
           the user about the nlink of a silly deleted file so that it appears as
           0, not 1.
      
      Found with the generic/035 and generic/084 xfstests.
      
      Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      b6489a49
  10. 16 Jun 2020, 2 commits
    • afs: afs_vnode_commit_status() doesn't need to check the RPC error · 7c295eec
      Committed by David Howells
      afs_vnode_commit_status() is only ever called if op->error is 0, so remove
      the op->error checks from the function.
      
      Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept")
      Signed-off-by: David Howells <dhowells@redhat.com>
      7c295eec
    • afs: Fix use of afs_check_for_remote_deletion() · 728279a5
      Committed by David Howells
      afs_check_for_remote_deletion() checks to see if error ENOENT is returned
      by the server in response to an operation and, if so, marks the primary
      vnode as having been deleted as the FID is no longer valid.
      
      However, it's being called from the operation success functions, where no
      abort has happened - and if an inline abort is recorded, it's handled by
      afs_vnode_commit_status().
      
      Fix this by actually calling the operation aborted method if provided and
      having that point to afs_check_for_remote_deletion().
      
      Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept")
      Signed-off-by: David Howells <dhowells@redhat.com>
      728279a5
  11. 15 Jun 2020, 3 commits
    • afs: Fix truncation issues and mmap writeback size · 793fe82e
      Committed by David Howells
      Fix the following issues:
      
       (1) Fix writeback to reduce the size of a store operation to i_size,
           effectively discarding the extra data.
      
           The problem comes when afs_page_mkwrite() records that a page is about
           to be modified by mmap().  It doesn't know what bits of the page are
           going to be modified, so it records the whole page as being dirty
           (this is stored in page->private as start and end offsets).
      
           Without this, the marshalling for the store to the server extends the
           size of the file to the end of the page (in afs_fs_store_data() and
           yfs_fs_store_data()).
      
       (2) Fix setattr to actually truncate the pagecache, thereby clearing
           the discarded part of a file.
      
       (3) Fix setattr to check that the new size is okay and to disable
           ATTR_SIZE if i_size wouldn't change.
      
       (4) Force i_size to be updated as the result of a truncate.
      
       (5) Don't truncate if ATTR_SIZE is not set.
      
       (6) Call pagecache_isize_extended() if the file was enlarged.
      
      Note that truncate_setsize() isn't used because the setting of i_size is
      done inside afs_vnode_commit_status() under the vnode->cb_lock.
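
      A sketch of the pagecache side of this, run after the status has been
      committed and outside cb_lock; the helper and its parameters are
      illustrative rather than taken from the patch:

          static void afs_setattr_apply_to_pagecache(struct inode *inode,
                                                     struct iattr *attr,
                                                     loff_t old_i_size)
          {
                  loff_t new_size;

                  if (!(attr->ia_valid & ATTR_SIZE))
                          return;                  /* (5): not a size change */

                  new_size = attr->ia_size;
                  if (new_size > old_i_size)       /* (6): the file was enlarged */
                          pagecache_isize_extended(inode, old_i_size, new_size);
                  else if (new_size < old_i_size)  /* (2): discard truncated data */
                          truncate_pagecache(inode, new_size);
          }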
      
      Found with the generic/029 and generic/393 xfstests.
      
      Fixes: 31143d5d ("AFS: implement basic file write support")
      Fixes: 4343d008 ("afs: Get rid of the afs_writeback record")
      Signed-off-by: David Howells <dhowells@redhat.com>
      793fe82e
    • afs: Concoct ctimes · da8d0755
      Committed by David Howells
      The in-kernel afs filesystem ignores ctime because the AFS fileserver
      protocol doesn't support ctimes.  This, however, causes various xfstests to
      fail.
      
      Work around this by:
      
       (1) Setting ctime to attr->ia_ctime in afs_setattr().
      
       (2) Not ignoring ATTR_MTIME_SET, ATTR_TIMES_SET and ATTR_TOUCH settings.
      
       (3) Setting the ctime on the target file from the server-returned mtime
           when creating a hard link to it (see the sketch after this list).
      
       (4) Setting the ctime on directories from their revised mtimes when
           renaming/moving a file.
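
      A sketch of (3), assuming the revised mtime has already been decoded from
      the server's reply; the helper itself is illustrative:

          static void afs_concoct_link_ctime(struct inode *target,
                                             struct timespec64 server_mtime)
          {
                  /* The protocol has no ctime of its own, so borrow the
                   * server's revised mtime for the link target.
                   */
                  target->i_ctime = server_mtime;
          }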
      
      Found by the generic/221 and generic/309 xfstests.
      Signed-off-by: David Howells <dhowells@redhat.com>
      da8d0755
    • afs: Fix EOF corruption · 3f4aa981
      Committed by David Howells
      When doing a partial writeback, afs_write_back_from_locked_page() may
      generate an FS.StoreData RPC request that writes out part of a file when a
      file has been constructed from pieces by doing seek, write, seek, write,
      ... as is done by ld.
      
      The FS.StoreData RPC is given the current i_size as the file length, but
      the server basically ignores it unless the data length is 0 (in which case
      it's just a truncate operation).  The revised file length returned in the
      result of the RPC may then not reflect what we suggested - and this leads
      to i_size getting moved backwards - which causes issues later.
      
      Fix the client to take account of this by ignoring the returned file size
      unless the data version number jumped unexpectedly - in which case we're
      going to have to clear the pagecache and reload anyway.
      
      This can be observed when doing a kernel build on an AFS mount.  The
      following pair of commands produce the issue:
      
        ld -m elf_x86_64 -z max-page-size=0x200000 --emit-relocs \
            -T arch/x86/realmode/rm/realmode.lds \
            arch/x86/realmode/rm/header.o \
            arch/x86/realmode/rm/trampoline_64.o \
            arch/x86/realmode/rm/stack.o \
            arch/x86/realmode/rm/reboot.o \
            -o arch/x86/realmode/rm/realmode.elf
        arch/x86/tools/relocs --realmode \
            arch/x86/realmode/rm/realmode.elf \
            >arch/x86/realmode/rm/realmode.relocs
      
      This results in the latter giving:
      
      	Cannot read ELF section headers 0/18: Success
      
      as the realmode.elf file got corrupted.
      
      The sequence of events can also be driven with:
      
      	xfs_io -t -f \
      		-c "pwrite -S 0x58 0 0x58" \
      		-c "pwrite -S 0x59 10000 1000" \
      		-c "close" \
      		/afs/example.com/scratch/a
      
      Fixes: 31143d5d ("AFS: implement basic file write support")
      Signed-off-by: David Howells <dhowells@redhat.com>
      3f4aa981
  12. 10 Jun 2020, 1 commit
  13. 04 Jun 2020, 2 commits
    • afs: Reorganise volume and server trees to be rooted on the cell · 20325960
      Committed by David Howells
      Reorganise afs_volume objects such that they're in a tree keyed on volume
      ID, rooted on an afs_cell object rather than being in multiple trees,
      each of which is rooted on an afs_server object.
      
      afs_server structs become per-cell and acquire a pointer to the cell.
      
      The process of breaking a callback then starts with finding the server by
      its network address, following that to the cell and then looking up each
      volume ID in the volume tree.
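
      A sketch of that final lookup step; the member names (cell->volumes,
      volume->cell_node) are assumptions based on the description rather than
      quotes from the patch:

          static struct afs_volume *afs_find_volume_in_cell(struct afs_cell *cell,
                                                            afs_volid_t vid)
          {
                  struct rb_node *p = cell->volumes.rb_node;
                  struct afs_volume *volume;

                  /* The caller is assumed to hold whatever lock protects the
                   * cell's volume tree.
                   */
                  while (p) {
                          volume = rb_entry(p, struct afs_volume, cell_node);
                          if (vid < volume->vid)
                                  p = p->rb_left;
                          else if (vid > volume->vid)
                                  p = p->rb_right;
                          else
                                  return volume;  /* break callbacks on this one */
                  }
                  return NULL;
          }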
      
      This is simpler than the afs_vol_interest/afs_cb_interest N:M mapping web
      and allows those structs and the code for maintaining them to be simplified
      or removed.
      
      It does make a couple of things a bit more tricky, though:
      
       (1) Operations now start with a volume, not a server, so there can be more
           than one answer as to whether or not the server we'll end up using
           supports the FS.InlineBulkStatus RPC.
      
       (2) CB RPC operations that specify the server UUID.  There's still a tree
           of servers by UUID on the afs_net struct, but the UUIDs in it aren't
           guaranteed unique.
      Signed-off-by: David Howells <dhowells@redhat.com>
      20325960
    • afs: Build an abstraction around an "operation" concept · e49c7b2f
      Committed by David Howells
      Turn the afs_operation struct into the main way that most fileserver
      operations are managed.  Various things are added to the struct, including
      the following:
      
       (1) All the parameters and results of the relevant operations are moved
           into it, removing corresponding fields from the afs_call struct.
           afs_call gets a pointer to the op.
      
       (2) The target volume is made the main focus of the operation, rather than
           the target vnode(s), and a bunch of op->vnode->volume are made
           op->volume instead.
      
       (3) Two vnode records are defined (op->file[]) for the vnode(s) involved
           in most operations.  The vnode record (struct afs_vnode_param)
           contains:
      
      	- The vnode pointer.
      
      	- The fid of the vnode to be included in the parameters or that was
                returned in the reply (eg. FS.MakeDir).
      
      	- The status and callback information that may be returned in the
           	  reply about the vnode.
      
      	- Callback break and data version tracking for detecting
                simultaneous third-party changes.
      
       (4) Pointers to dentries to be updated with new inodes.
      
       (5) An operations table pointer.  The table includes pointers to functions
           for issuing AFS and YFS-variant RPCs, handling the success and abort
           of an operation and handling post-I/O-lock local editing of a
           directory.
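
      A sketch of what one entry in such a table might look like; the member and
      function names here are assumptions illustrating the shape of (5) rather
      than quotes from the patch:

          static const struct afs_operation_ops afs_mkdir_operation = {
                  .issue_afs_rpc  = afs_fs_make_dir,     /* marshal the AFS-variant RPC */
                  .issue_yfs_rpc  = yfs_fs_make_dir,     /* marshal the YFS-variant RPC */
                  .success        = afs_create_success,  /* commit returned status/callback */
                  .aborted        = afs_check_for_remote_deletion,
                  .edit_dir       = afs_create_edit_dir, /* post-I/O-lock local dir edit */
          };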
      
      To make this work, the following function restructuring is made:
      
       (A) The rotation loop that issues calls to fileservers that can be found
           in each function that wants to issue an RPC (such as afs_mkdir()) is
           extracted out into common code, in a new file called fs_operation.c.
      
       (B) The rotation loops, such as the one in afs_mkdir(), are replaced with
           a much smaller piece of code that allocates an operation, sets the
           parameters and then calls out to the common code to do the actual
           work.
      
       (C) The code for handling the success and failure of an operation are
           moved into operation functions (as (5) above) and these are called
           from the core code at appropriate times.
      
       (D) The pseudo inode getting stuff used by the dynamic root code is moved
           over into dynroot.c.
      
       (E) struct afs_iget_data is absorbed into the operation struct and
           afs_iget() expects to be given an op pointer and a vnode record.
      
       (F) Point (E) doesn't work for the root dir of a volume, but we know the
           FID in advance (it's always vnode 1, unique 1), so a separate inode
           getter, afs_root_iget(), is provided to special-case that.
      
       (G) The inode status init/update functions now also take an op and a vnode
           record.
      
       (H) The RPC marshalling functions now, for the most part, just take an
           afs_operation struct as their only argument.  All the data they need
           is held there.  The result delivery functions write their answers
           there as well.
      
       (I) The call is attached to the operation and then the operation core does
           the waiting.
      
      And then the new operation code is, for the moment, made to just initialise
      the operation, get the appropriate vnode I/O locks and do the same rotation
      loop as before.
      
      This lays the foundation for the following changes in the future:
      
       (*) Overhauling the rotation (again).
      
       (*) Support for asynchronous I/O, where the fileserver rotation must be
           done asynchronously also.
      Signed-off-by: David Howells <dhowells@redhat.com>
      e49c7b2f
  14. 31 May 2020, 2 commits
  15. 18 Oct 2019, 1 commit
  16. 16 Sep 2019, 1 commit
  17. 21 Jun 2019, 2 commits
    • afs: Add some callback management tracepoints · 051d2525
      Committed by David Howells
      Add a couple of tracepoints to track callback management:
      
       (1) afs_cb_miss - Logs when we were unable to apply a callback, either due
           to the inode being discarded or due to a competing thread applying a
           callback first.
      
       (2) afs_cb_break - Logs when we attempted to clear the noted callback
           promise, either due to the server explicitly breaking the callback,
           the callback promise lapsing or a local event obsoleting it.
      Signed-off-by: David Howells <dhowells@redhat.com>
      051d2525
    • afs: Fix setting of i_blocks · 2cd42d19
      Committed by David Howells
      The setting of i_blocks, which is calculated from i_size, has got
      accidentally misordered relative to the setting of i_size when initially
      setting up an inode.  Further, i_blocks isn't updated by afs_apply_status()
      when the size is updated.
      
      To fix this, break the i_size/i_blocks setting out into a helper function
      and call it from both places.
      
      Fixes: a58823ac ("afs: Fix application of status and callback to be under same lock")
      Signed-off-by: David Howells <dhowells@redhat.com>
      2cd42d19
  18. 20 Jun 2019, 1 commit
    • afs: Fix over zealous "vnode modified" warnings · 3647e42b
      Committed by David Howells
      Occasionally, warnings like this:
      
      	vnode modified 2af7 on {10000b:1} [exp 2af2] YFS.FetchStatus(vnode)
      
      are emitted into the kernel log.  This indicates that when we were applying
      the updated vnode (file) status retrieved from the server to an inode we
      saw that the data version number wasn't what we were expecting (in this
      case it's 0x2af7 rather than 0x2af2).
      
      We've usually received a callback from the server prior to this point - or
      the callback promise has lapsed - so the warning is merely informative and
      the state is to be expected.
      
      Fix this by only emitting the warning if we still think that we have a
      valid callback promise and haven't received a callback.
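
      A sketch of the guard; the flag name AFS_VNODE_CB_PROMISED and the fid
      layout are assumptions here, the point being simply that the warning is
      skipped when no callback promise is held:

          static void afs_warn_unexpected_dv(struct afs_vnode *vnode,
                                             u64 expected_dv, u64 new_dv,
                                             const char *rpc_name)
          {
                  /* If we don't believe we hold a promise, a changed DV is
                   * entirely expected, so stay quiet.
                   */
                  if (!test_bit(AFS_VNODE_CB_PROMISED, &vnode->flags))
                          return;

                  pr_warn("kAFS: vnode modified %llx on {%llx:%llx} [exp %llx] %s\n",
                          (unsigned long long)new_dv,
                          (unsigned long long)vnode->fid.vid,
                          (unsigned long long)vnode->fid.vnode,
                          (unsigned long long)expected_dv, rpc_name);
          }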
      
      Also change the format slightly so that the new data version doesn't look
      like part of the text; the line is prefixed with "kAFS: " and the message
      is ranked as a warning.
      
      Fixes: 31143d5d ("AFS: implement basic file write support")
      Reported-by: Ian Wienand <iwienand@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      3647e42b
  19. 17 May 2019, 6 commits
  20. 16 May 2019, 3 commits
    • afs: Fix application of status and callback to be under same lock · a58823ac
      Committed by David Howells
      When applying the status and callback in the response of an operation,
      apply them in the same critical section so that there's no race between
      checking the callback state and checking status-dependent state (such as
      the data version).
      
      Fix this by:
      
       (1) Allocating a joint {status,callback} record (afs_status_cb) before
           calling the RPC function for each vnode for which the RPC reply
           contains a status or a status plus a callback.  A flag is set in the
           record to indicate if a callback was actually received (a sketch of
           the record follows this list).
      
       (2) These records are passed into the RPC functions to be filled in.  The
           afs_decode_status() and yfs_decode_status() functions are removed and
           the cb_lock is no longer taken.
      
       (3) xdr_decode_AFSFetchStatus() and xdr_decode_YFSFetchStatus() no longer
           update the vnode.
      
       (4) xdr_decode_AFSCallBack() and xdr_decode_YFSCallBack() no longer update
           the vnode.
      
       (5) vnodes, expected data-version numbers and callback break counters
           (cb_break) no longer need to be passed to the reply delivery
           functions.
      
           Note that, for the moment, the file locking functions still need
           access to both the call and the vnode at the same time.
      
       (6) afs_vnode_commit_status() is now given the cb_break value and the
           expected data_version and the task of applying the status and the
           callback to the vnode are now done here.
      
           This is done under a single taking of vnode->cb_lock.
      
       (7) afs_pages_written_back() is now called by afs_store_data() rather than
           by the reply delivery function.
      
           afs_pages_written_back() has been moved to before the call point and
           is now given the first and last page numbers rather than a pointer to
           the call.
      
       (8) The indicator from YFS.RemoveFile2 as to whether the target file
           actually got removed (status.abort_code == VNOVNODE) rather than
           merely dropping a link is now checked in afs_unlink rather than in
           xdr_decode_YFSFetchStatus().
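
      A sketch of the joint record referred to in (1); the member names beyond
      status and callback are assumptions:

          struct afs_status_cb {
                  struct afs_file_status  status;      /* decoded status record */
                  struct afs_callback     callback;    /* decoded callback record */
                  unsigned int            cb_break;    /* vnode's cb_break at fetch time */
                  bool                    have_status; /* true if a status was returned */
                  bool                    have_cb;     /* true if a callback was returned */
                  bool                    have_error;  /* true if status carries an abort code */
          };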
      
      Supplementary fixes:
      
       (*) afs_cache_permit() now gets the caller_access mask from the
           afs_status_cb object rather than picking it out of the vnode's status
           record.  afs_fetch_status() returns caller_access through its argument
           list for this purpose also.
      
       (*) afs_inode_init_from_status() now uses a write lock on cb_lock rather
           than a read lock and now sets the callback inside the same critical
           section.
      
      Fixes: c435ee34 ("afs: Overhaul the callback handling")
      Signed-off-by: David Howells <dhowells@redhat.com>
      a58823ac
    • afs: Don't invalidate callback if AFS_VNODE_DIR_VALID not set · d9052dda
      Committed by David Howells
      Don't invalidate the callback promise on a directory if the
      AFS_VNODE_DIR_VALID flag is not set (which indicates that the directory
      contents are invalid, due to edit failure, callback break, page reclaim).
      
      The directory will be reloaded the next time it is accessed, so clearing
      the callback flag at this point may race with a reload of the directory
      and cancel its recorded callback promise.
      
      Fixes: f3ddee8d ("afs: Fix directory handling")
      Signed-off-by: David Howells <dhowells@redhat.com>
      d9052dda
    • afs: Make some RPC operations non-interruptible · 20b8391f
      Committed by David Howells
      Make certain RPC operations non-interruptible, including:
      
       (*) Set attributes
       (*) Store data
      
           We don't want to get interrupted during a flush on close, flush on
           unlock, writeback or an inode update, leaving us in a state where we
           still need to do the writeback or update.
      
       (*) Extend lock
       (*) Release lock
      
           We don't want to get lock extension interrupted as the file locks on
           the server are time-limited.  Interruption during lock release is less
           of an issue since the lock is time-limited, but it's better to
           complete the release to avoid a several-minute wait to recover it.
      
           *Setting* the lock isn't a problem if it's interrupted since we can
            just return to the user and tell them they were interrupted - at
            which point they can elect to retry.
      
       (*) Silly unlink
      
           We want to remove silly unlink files if we can, rather than leaving
           them for the salvager to clear up.
      
      Note that whilst these calls are no longer interruptible, they do have
      timeouts on them, so if the server stops responding the call will fail with
      something like ETIME or ECONNRESET.
      
      Without this, the following:
      
      	kAFS: Unexpected error from FS.StoreData -512
      
      appears in dmesg when a pending store data gets interrupted and some
      processes may just hang.
      
      Additionally, make the code that checks/updates the server record ignore
      failure due to interruption if the main call is uninterruptible and if the
      server has an address list.  The next op will check it again since the
      expiration time on the old list has passed.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      20b8391f
  21. 15 May 2019, 1 commit
    • afs: Fix key leak in afs_release() and afs_evict_inode() · a1b879ee
      Committed by David Howells
      Fix afs_release() to go through the cleanup part of the function if
      FMODE_WRITE is set rather than exiting through vfs_fsync() (which skips the
      cleanup).  The cleanup involves discarding the refs on the key used for
      file ops and the writeback key record.
      
      Also fix afs_evict_inode() to clean up any left over wb keys attached to
      the inode/vnode when it is removed.
      
      Fixes: 5a813276 ("afs: Do better accretion of small writes on newly created content")
      Signed-off-by: David Howells <dhowells@redhat.com>
      a1b879ee
  22. 07 May 2019, 2 commits
    • afs: Calculate i_blocks based on file size · c0abbb57
      Committed by Marc Dionne
      While it's not possible to give an accurate number for the blocks
      used on the server, populate i_blocks based on the file size so
      that 'du' can give a reasonable estimate.
      
      The value is rounded up to 1K granularity, for consistency with
      what other AFS clients report, and the servers' 1K usage quota
      unit.  Note that the value calculated by 'du' at the root of a
      volume can still be slightly lower than the quota usage on the
      server, as 0-length files are charged 1 quota block, but are
      reported as occupying 0 blocks.  Again, this is consistent with
      other AFS clients.
      Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      c0abbb57
    • afs: Log more information for "kAFS: AFS vnode with undefined type\n" · b134d687
      Committed by David Howells
      Log more information when "kAFS: AFS vnode with undefined type\n" is
      displayed due to a vnode record being retrieved from the server that
      appears to have a duff file type (usually 0).  This prints more information
      to try and help pin down the problem.
      Signed-off-by: David Howells <dhowells@redhat.com>
      b134d687
  23. 25 Apr 2019, 1 commit
    • afs: Implement sillyrename for unlink and rename · 79ddbfa5
      Committed by David Howells
      Implement sillyrename for AFS unlink and rename, using the NFS variant
      implementation as a basis.
      
      Note that the asynchronous file locking extender/releaser has to be
      notified with a state change to stop it complaining if there's a race
      between that and the actual file deletion.
      
      A tracepoint, afs_silly_rename, is also added to note the silly rename and
      the cleanup.  The afs_edit_dir tracepoint is given some extra reason
      indicators and the afs_flock_ev tracepoint is given a silly-delete file
      lock cancellation indicator.
      Signed-off-by: David Howells <dhowells@redhat.com>
      79ddbfa5