1. 17 May, 2019 4 commits
  2. 16 May, 2019 8 commits
    • afs: Fix application of status and callback to be under same lock · a58823ac
      Committed by David Howells
      When applying the status and callback in the response of an operation,
      apply them in the same critical section so that there's no race between
      checking the callback state and checking status-dependent state (such as
      the data version).
      
      Fix this by:
      
       (1) Allocating a joint {status,callback} record (afs_status_cb) before
           calling the RPC function for each vnode for which the RPC reply
           contains a status or a status plus a callback.  A flag is set in the
           record to indicate if a callback was actually received.
      
       (2) These records are passed into the RPC functions to be filled in.  The
           afs_decode_status() and yfs_decode_status() functions are removed and
           the cb_lock is no longer taken.
      
       (3) xdr_decode_AFSFetchStatus() and xdr_decode_YFSFetchStatus() no longer
           update the vnode.
      
       (4) xdr_decode_AFSCallBack() and xdr_decode_YFSCallBack() no longer update
           the vnode.
      
       (5) vnodes, expected data-version numbers and callback break counters
           (cb_break) no longer need to be passed to the reply delivery
           functions.
      
           Note that, for the moment, the file locking functions still need
           access to both the call and the vnode at the same time.
      
       (6) afs_vnode_commit_status() is now given the cb_break value and the
           expected data_version, and the task of applying the status and the
           callback to the vnode is now done here.
      
           This is done under a single taking of vnode->cb_lock.
      
       (7) afs_pages_written_back() is now called by afs_store_data() rather than
           by the reply delivery function.
      
           afs_pages_written_back() has been moved to before the call point and
           is now given the first and last page numbers rather than a pointer to
           the call.
      
       (8) The indicator from YFS.RemoveFile2 as to whether the target file
           actually got removed (status.abort_code == VNOVNODE) rather than
           merely dropping a link is now checked in afs_unlink rather than in
           xdr_decode_YFSFetchStatus().
      
      Supplementary fixes:
      
       (*) afs_cache_permit() now gets the caller_access mask from the
           afs_status_cb object rather than picking it out of the vnode's status
           record.  afs_fetch_status() returns caller_access through its argument
           list for this purpose also.
      
       (*) afs_inode_init_from_status() now uses a write lock on cb_lock rather
           than a read lock and now sets the callback inside the same critical
           section.
      
      Fixes: c435ee34 ("afs: Overhaul the callback handling")
      Signed-off-by: David Howells <dhowells@redhat.com>
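
      The single-critical-section idea in this commit can be sketched in
      userspace C as follows.  This is only an illustration under stated
      assumptions: status_cb, vnode_sketch and commit_status() are hypothetical
      stand-ins for the kernel structures and functions, and a pthread mutex
      stands in for vnode->cb_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical joint {status,callback} record filled in by the RPC decoder. */
struct status_cb {
    uint64_t data_version;   /* status part of the reply */
    int64_t  cb_expires_at;  /* callback part of the reply */
    bool     have_cb;        /* set only if a callback was actually received */
};

/* Hypothetical, heavily simplified vnode. */
struct vnode_sketch {
    pthread_mutex_t cb_lock;
    uint64_t data_version;
    int64_t  cb_expires_at;
    unsigned int cb_break;   /* bumped whenever the callback is broken */
    bool     cb_promised;
};

/*
 * Apply status and callback in ONE critical section.  The callback is only
 * installed if no break happened since the caller sampled cb_break before
 * issuing the RPC.  Returns true if the data version differs from what the
 * caller expected, i.e. cached data may need invalidating.
 */
bool commit_status(struct vnode_sketch *v, const struct status_cb *scb,
                   unsigned int cb_break_before, uint64_t expected_dv)
{
    bool data_changed;

    assert(v && scb);

    pthread_mutex_lock(&v->cb_lock);
    data_changed = (scb->data_version != expected_dv);
    v->data_version = scb->data_version;
    if (scb->have_cb && v->cb_break == cb_break_before) {
        v->cb_expires_at = scb->cb_expires_at;
        v->cb_promised = true;
    }
    pthread_mutex_unlock(&v->cb_lock);
    return data_changed;
}
```

      Because both updates happen under one lock acquisition, a reader taking
      the same lock cannot observe the new status with the old callback state
      or vice versa.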
    • afs: Always get the reply time · 4571577f
      Committed by David Howells
      Always ask for the reply time from AF_RXRPC as it's used to calculate the
      callback expiry time and lock expiry times, so it's needed by most FS
      operations.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix order-1 allocation in afs_do_lookup() · 87182759
      Committed by David Howells
      afs_do_lookup() will do an order-1 allocation to allocate status records if
      there are more than 39 vnodes to stat.
      
      Fix this by allocating an array of {status,callback} records for each vnode
      we want to examine using vmalloc() if larger than a page.
      
      This not only gets rid of the order-1 allocation, but makes it easier to
      grow beyond 50 records for YFS servers.  It also allows us to move to
      {status,callback} tuples for other calls too and makes it easier to lock
      across the application of the status and the callback to the vnode.
      
      Fixes: 5cf9dd55 ("afs: Prospectively look up extra files when doing a single lookup")
      Signed-off-by: David Howells <dhowells@redhat.com>
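
      To see why the record count matters, here is a small sketch of how the
      allocation order grows with array size.  The 4096-byte page and the
      104-byte record size are assumptions picked purely for illustration (the
      real structure size is not given in the commit message), and
      alloc_order() and records_order() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */
#define RECORD_SIZE     104u    /* hypothetical record size, illustration only */

/* Smallest order n such that (PAGE_SIZE_BYTES << n) >= nbytes. */
unsigned int alloc_order(size_t nbytes)
{
    unsigned int order = 0;
    size_t span = PAGE_SIZE_BYTES;

    assert(nbytes > 0);
    while (span < nbytes) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Order needed for a physically contiguous array of nr records. */
unsigned int records_order(size_t nr)
{
    assert(nr > 0);
    return alloc_order(nr * RECORD_SIZE);
}
```

      With these assumed sizes, 39 records fit in a single page (order 0) but
      a 40th tips the array into an order-1 request, which is the kind of
      allocation the commit avoids by falling back to vmalloc().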
    • afs: Get rid of afs_call::reply[] · ffba718e
      Committed by David Howells
      Replace the afs_call::reply[] array with a bunch of typed members so that
      the compiler can use type-checking on them.  It's also easier for the eye
      to see what's going on.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Make some RPC operations non-interruptible · 20b8391f
      Committed by David Howells
      Make certain RPC operations non-interruptible, including:
      
       (*) Set attributes
       (*) Store data
      
           We don't want to get interrupted during a flush on close, flush on
           unlock, writeback or an inode update, leaving us in a state where we
           still need to do the writeback or update.
      
       (*) Extend lock
       (*) Release lock
      
           We don't want to get lock extension interrupted as the file locks on
           the server are time-limited.  Interruption during lock release is less
           of an issue since the lock is time-limited, but it's better to
           complete the release to avoid a several-minute wait to recover it.
      
           *Setting* the lock isn't a problem if it's interrupted since we can
            just return to the user and tell them they were interrupted - at
            which point they can elect to retry.
      
       (*) Silly unlink
      
           We want to remove silly unlink files if we can, rather than leaving
           them for the salvager to clear up.
      
      Note that whilst these calls are no longer interruptible, they do have
      timeouts on them, so if the server stops responding the call will fail with
      something like ETIME or ECONNRESET.
      
      Without this, the following:
      
      	kAFS: Unexpected error from FS.StoreData -512
      
      appears in dmesg when a pending store data gets interrupted and some
      processes may just hang.
      
      Additionally, make the code that checks/updates the server record ignore
      failure due to interruption if the main call is uninterruptible and if the
      server has an address list.  The next op will check it again since the
      expiration time on the old list has passed.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix the maximum lifespan of VL and probe calls · 94f699c9
      Committed by David Howells
      If an older AFS server doesn't support an operation, it may accept the call
      and then sit on it forever, happily responding to pings that make kafs
      think that the call is still alive.
      
      Fix this by setting the maximum lifespan of Volume Location service calls
      in particular and probe calls in general so that they don't run on
      endlessly if they're not supported.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix cell DNS lookup · d5c32c89
      Committed by David Howells
      Currently, once configured, AFS cells are looked up in the DNS at regular
      intervals - which is a waste of resources if those cells aren't being
      used.  It also leads to a problem where cells preloaded, but not
      configured, before the network is brought up end up effectively statically
      configured with no VL servers and are unable to get any.
      
      Fix this by not doing the DNS lookup until the first time a cell is
      touched.  The caller waits for the lookup only if there are no cached
      records yet; otherwise the lookup that refreshes the record is done in the
      background.
      
      This has the downside that the first time you touch a cell, you now have to
      wait for the upcall to do the required DNS lookups rather than them already
      being cached.
      
      Further, the record is not replaced if the old record has at least one
      server in it and the new record doesn't have any.
      
      Fixes: 0a5143f2 ("afs: Implement VL server rotation")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix afs_xattr_get_yfs() to not try freeing an error value · 773e0c40
      Committed by David Howells
      afs_xattr_get_yfs() tries to free yacl, which may hold an error value (say
      if yfs_fs_fetch_opaque_acl() failed and returned an error).
      
      Fix this by allocating yacl up front (since it's a fixed-length struct,
      unlike afs_acl) and passing it in to the RPC function.  This also allows
      the flags to be placed in the object rather than passing them through to
      the RPC function.
      
      Fixes: ae46578b ("afs: Get YFS ACLs and information through xattrs")
      Signed-off-by: David Howells <dhowells@redhat.com>
  3. 07 May, 2019 5 commits
  4. 25 Apr, 2019 5 commits
    • afs: Provide mount-time configurable byte-range file locking emulation · 6c6c1d63
      Committed by David Howells
      Provide byte-range file locking emulation that can be configured at mount
      time to one of four modes:
      
       (1) flock=local.  Locking is done locally only and no reference is made to
           the server.
      
       (2) flock=openafs.  Byte-range locking is done locally only; whole-file
           locking is done with reference to the server.  Whole-file locks cannot
           be upgraded unless the client holds an exclusive lock.
      
       (3) flock=strict.  Byte-range and whole-file locking both require a
           sufficient whole-file lock on the server.
      
       (4) flock=write.  As strict, but the client always gets an exclusive
           whole-file lock on the server.
      Signed-off-by: David Howells <dhowells@redhat.com>
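
      The four modes listed above map naturally onto an enum.  The sketch
      below parses a "flock=<mode>" string and answers one policy question
      taken from the descriptions; the enum, parse_flock_option() and
      byte_range_needs_server() are hypothetical names, not the kernel's
      option-parsing code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum flock_mode {
    FLOCK_MODE_UNSET = 0,
    FLOCK_MODE_LOCAL,    /* all locking done locally */
    FLOCK_MODE_OPENAFS,  /* byte-range local, whole-file via the server */
    FLOCK_MODE_STRICT,   /* both kinds need a whole-file server lock */
    FLOCK_MODE_WRITE,    /* as strict, but always an exclusive server lock */
};

/* Parse "flock=<mode>".  Returns 0 on success, -1 if not recognised. */
int parse_flock_option(const char *opt, enum flock_mode *mode)
{
    static const struct {
        const char *name;
        enum flock_mode mode;
    } table[] = {
        { "local",   FLOCK_MODE_LOCAL },
        { "openafs", FLOCK_MODE_OPENAFS },
        { "strict",  FLOCK_MODE_STRICT },
        { "write",   FLOCK_MODE_WRITE },
    };
    static const char prefix[] = "flock=";
    const size_t plen = sizeof(prefix) - 1;
    size_t i;

    assert(opt && mode);

    if (strncmp(opt, prefix, plen) != 0)
        return -1;
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(opt + plen, table[i].name) == 0) {
            *mode = table[i].mode;
            return 0;
        }
    }
    return -1;
}

/* Does a byte-range lock require a whole-file lock on the server? */
int byte_range_needs_server(enum flock_mode mode)
{
    return mode == FLOCK_MODE_STRICT || mode == FLOCK_MODE_WRITE;
}
```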
    • afs: Implement sillyrename for unlink and rename · 79ddbfa5
      Committed by David Howells
      Implement sillyrename for AFS unlink and rename, using the NFS variant
      implementation as a basis.
      
      Note that the asynchronous file locking extender/releaser has to be
      notified with a state change to stop it complaining if there's a race
      between that and the actual file deletion.
      
      A tracepoint, afs_silly_rename, is also added to note the silly rename and
      the cleanup.  The afs_edit_dir tracepoint is given some extra reason
      indicators and the afs_flock_ev tracepoint is given a silly-delete file
      lock cancellation indicator.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Handle lock rpc ops failing on a file that got deleted · cdfb26b4
      Committed by David Howells
      Holding a file lock on an AFS file does not prevent it from being deleted
      on the server, so we need to handle an error resulting from that when we
      try setting, extending or releasing a lock.
      
      Fix this by adding a "deleted" lock state and cancelling the lock extension
      process for that file and aborting all waiters for the lock.
      
      Fixes: 0fafdc9f ("afs: Fix file locking")
      Reported-by: Jonathan Billings <jsbillin@umich.edu>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Calculate lock extend timer from set/extend reply reception · a690f60a
      Committed by David Howells
      Record the timestamp on the first reply DATA packet received in response to
      a set- or extend-lock operation, then use this to calculate the time
      remaining till the lock expires rather than using whatever time the
      requesting process wakes up and finishes processing the operation as a
      base.
      Signed-off-by: David Howells <dhowells@redhat.com>
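
      The arithmetic behind this change is small enough to sketch.  The
      function below measures the remaining lock lifetime from the reply
      timestamp rather than from whenever the caller happens to run;
      lock_time_remaining() and the lease length used in the test are
      hypothetical, and times are plain seconds for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Seconds left before the server-side lock expires, using the time the
 * reply was received as the base.  Returns 0 if it has already expired.
 */
int64_t lock_time_remaining(int64_t reply_time, int64_t lease_secs, int64_t now)
{
    int64_t expires_at;

    assert(lease_secs > 0);
    assert(now >= reply_time);

    expires_at = reply_time + lease_secs;
    return expires_at > now ? expires_at - now : 0;
}
```

      If the requesting process is scheduled late, basing the timer on its
      wake-up time would overestimate the remaining lifetime by exactly that
      delay; anchoring on the reply time avoids this.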
    • afs: Split wait from afs_make_call() · 0b9bf381
      Committed by David Howells
      Split the call to afs_wait_for_call_to_complete() from afs_make_call() to
      make it easier to handle asynchronous calls and to make it easier to
      convert a synchronous call to an asynchronous one in future, for instance
      when someone tries to interrupt an operation by pressing Ctrl-C.
      Signed-off-by: David Howells <dhowells@redhat.com>
  5. 13 Apr, 2019 1 commit
    • afs: Fix in-progess ops to ignore server-level callback invalidation · eeba1e9c
      Committed by David Howells
      The in-kernel afs filesystem client counts the number of server-level
      callback invalidation events (CB.InitCallBackState* RPC operations) that it
      receives from the server.  This is stored in cb_s_break in various
      structures, including afs_server and afs_vnode.
      
      If an inode is examined by afs_validate(), say, the afs_server copy is
      compared, along with other break counters, to those in afs_vnode, and if
      one or more of the counters do not match, it is considered that the
      server's callback promise is broken.  At points where this happens,
      AFS_VNODE_CB_PROMISED is cleared to indicate that the status must be
      refetched from the server.
      
      afs_validate() issues an FS.FetchStatus operation to get updated metadata -
      and based on the updated data_version may invalidate the pagecache too.
      
      However, the break counters are also used to determine whether to note a
      new callback in the vnode (which would set the AFS_VNODE_CB_PROMISED flag)
      and whether to cache the permit data included in the YFSFetchStatus record
      by the server.
      
      
      The problem comes when the server sends us a CB.InitCallBackState op.  The
      first such instance doesn't cause cb_s_break to be incremented, but rather
      causes AFS_SERVER_FL_NEW to be cleared - but thereafter, say some hours
      after last use and all the volumes have been automatically unmounted and
      the server has forgotten about the client[*], this *will* likely cause an
      increment.
      
       [*] There are other circumstances too, such as the server restarting or
           needing to make space in its callback table.
      
      Note that the server won't send us a CB.InitCallBackState op until we talk
      to it again.
      
      So what happens is:
      
       (1) A mount for a new volume is attempted, an inode is created for the root
           vnode and vnode->cb_s_break and AFS_VNODE_CB_PROMISED aren't set
           immediately, as we don't have a nominated server to talk to yet - and
           we may iterate through a few to find one.
      
       (2) Before the operation happens, afs_fetch_status(), say, notes in the
           cursor (fc.cb_break) the break counter sum from the vnode, volume and
           server counters, but the server->cb_s_break is currently 0.
      
       (3) We send FS.FetchStatus to the server.  The server sends us back
           CB.InitCallBackState.  We increment server->cb_s_break.
      
       (4) Our FS.FetchStatus completes.  The reply includes a callback record.
      
       (5) xdr_decode_AFSCallBack()/xdr_decode_YFSCallBack() check to see whether
           the callback promise was broken by checking the break counter sum from
           step (2) against the current sum.
      
           This fails because of step (3), so we don't set the callback record
           and, importantly, don't set AFS_VNODE_CB_PROMISED on the vnode.
      
      This does not preclude the syscall from progressing, and we don't loop here
      rechecking the status, but rather assume it's good enough for one round
      only and will need to be rechecked next time.
      
       (6) afs_validate() is triggered on the vnode, probably called from
           d_revalidate() checking the parent directory.
      
       (7) afs_validate() notes that AFS_VNODE_CB_PROMISED isn't set, so doesn't
           update vnode->cb_s_break and assumes the vnode to be invalid.
      
       (8) afs_validate() needs to call afs_fetch_status().  Go back to step (2)
           and repeat, every time the vnode is validated.
      
      This primarily affects volume root dir vnodes.  Everything subsequent to
      those inherits an already incremented cb_s_break upon mounting.
      
      
      The issue is that we assume that the callback record and the cached permit
      information in a reply from the server can't be trusted after getting a
      server break - but this is wrong since the server makes sure things are
      done in the right order, holding up our ops if necessary[*].
      
       [*] There is an extremely unlikely scenario where a reply from before the
           CB.InitCallBackState could get its delivery deferred till after - at
           which point we think we have a promise when we don't.  This, however,
           requires unlucky mass packet loss to one call.
      
      AFS_SERVER_FL_NEW tries to paper over the cracks for the initial mount from
      a server we've never contacted before, but this should be unnecessary.
      It's also further insulated from the problem on an initial mount by
      querying the server first with FS.GetCapabilities, which triggers the
      CB.InitCallBackState.
      
      
      Fix this by
      
       (1) Remove AFS_SERVER_FL_NEW.
      
       (2) In afs_calc_vnode_cb_break(), don't include cb_s_break in the
           calculation.
      
       (3) In afs_cb_is_broken(), don't include cb_s_break in the check.
      Signed-off-by: David Howells <dhowells@redhat.com>
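
      The failure sequence in steps (2) to (5) and the effect of the fix can be
      reproduced with a toy model of the break counters.  The structure and
      function names here are hypothetical; only the idea of summing counters
      before the call and comparing afterwards comes from the message above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the three break counters involved. */
struct break_counters {
    unsigned int vnode_break;   /* per-vnode callback breaks */
    unsigned int volume_break;  /* per-volume callback breaks */
    unsigned int server_break;  /* server-level CB.InitCallBackState events */
};

/* Old behaviour: the server-level counter is part of the sum. */
unsigned int calc_break_with_server(const struct break_counters *c)
{
    assert(c);
    return c->vnode_break + c->volume_break + c->server_break;
}

/* Fixed behaviour: the server-level counter is left out. */
unsigned int calc_break_without_server(const struct break_counters *c)
{
    assert(c);
    return c->vnode_break + c->volume_break;
}

/* A reply's callback is accepted only if the sum did not change in flight. */
bool reply_callback_usable(unsigned int sum_before, unsigned int sum_after)
{
    return sum_before == sum_after;
}
```

      Sampling both sums, bumping server_break to model a CB.InitCallBackState
      arriving mid-call, and comparing again shows the old sum rejecting the
      callback while the new one accepts it.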
  6. 28 Feb, 2019 2 commits
    • afs: Use fs_context to pass parameters over automount · c99c2171
      Committed by David Howells
      Alter the AFS automounting code to create and modify an fs_context struct
      when parameterising a new mount triggered by an AFS mountpoint rather than
      constructing device name and option strings.
      
      Also remove the cell=, vol= and rwpath options as they are then redundant.
      The reason they existed is because the 'device name' may be derived
      literally from a mountpoint object in the filesystem, so default cell and
      parent-type information needed to be passed in by some other method from
      the automount routines.  The vol= option didn't end up being used.
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Eric W. Biederman <ebiederm@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • afs: Add fs_context support · 13fcc683
      Committed by David Howells
      Add fs_context support to the AFS filesystem, converting the parameter
      parsing to store options there.
      
      This will form the basis for namespace propagation over mountpoints within
      the AFS model, thereby allowing AFS to be used in containers more easily.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  7. 30 Nov, 2018 1 commit
    • afs: Fix missing net error handling · 4584ae96
      Committed by David Howells
      kAFS can be given certain network errors (EADDRNOTAVAIL, EHOSTDOWN and
      ERFKILL) that it doesn't handle in its server/address rotation algorithms.
      They cause the probing and rotation to abort immediately rather than
      rotating.
      
      Fix this by:
      
       (1) Abstracting out the error prioritisation from the VL and FS rotation
           algorithms into a common function and expanding its use into the
           server probing code.
      
           When multiple errors are available, this code selects the one we'd
           prefer to return.
      
       (2) Adding handling for EADDRNOTAVAIL, EHOSTDOWN and ERFKILL.
      
      Fixes: 0fafdc9f ("afs: Fix file locking")
      Fixes: 0338747d8454 ("afs: Probe multiple fileservers simultaneously")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
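
      A common prioritisation function of the kind described in (1) might look
      roughly like this.  The ranking is an assumption made for illustration
      (timeouts lose to anything more specific; otherwise the first error seen
      is kept) and is not the kernel's actual table; error_rank() and
      prioritise_error() are hypothetical names, and errors are positive errno
      values here.

```c
#include <assert.h>
#include <errno.h>

/*
 * Rank an error by how useful it is to report; higher wins.
 * Assumed ranking, for illustration only.
 */
int error_rank(int err)
{
    assert(err >= 0);

    if (err == 0)
        return 0;       /* nothing recorded yet */
    if (err == ETIMEDOUT)
        return 1;       /* a timeout says least about the cause */
#ifdef ETIME
    if (err == ETIME)
        return 1;
#endif
    return 2;           /* e.g. ENETUNREACH, EHOSTDOWN, EADDRNOTAVAIL */
}

/* Fold a newly seen error into the cached one, keeping the preferred error. */
int prioritise_error(int cached, int new_err)
{
    return error_rank(new_err) > error_rank(cached) ? new_err : cached;
}
```

      Because the rotation and probing paths all fold their results through
      one function, adding a new error code means updating a single ranking
      rather than several loops.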
  8. 24 Oct, 2018 11 commits
    • afs: Probe multiple fileservers simultaneously · 3bf0fb6f
      Committed by David Howells
      Send probes to all the unprobed fileservers in a fileserver list on all
      addresses simultaneously in an attempt to find out the fastest route whilst
      not getting stuck for 20s on any server or address that we don't get a
      reply from.
      
      This alleviates the problem whereby attempting to access a new server can
      take a long time because the rotation algorithm ends up rotating through
      all servers and addresses until it finds one that responds.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix callback handling · 18ac6185
      Committed by David Howells
      In some circumstances, the callback interest pointer is NULL, so in such a
      case we can't dereference it when checking to see if the callback is
      broken.  This causes an oops in some circumstances.
      
      Fix this by replacing the function that worked out the aggregate break
      counter with one that actually does the comparison, and then make that
      return true (ie. broken) if there is no callback interest as yet (ie. the
      pointer is NULL).
      
      Fixes: 68251f0a ("afs: Fix whole-volume callback handling")
      Signed-off-by: David Howells <dhowells@redhat.com>
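
      The shape of this fix, a predicate that does the comparison itself and
      treats a missing callback interest as "broken", can be sketched as below.
      The types and the cb_is_broken() signature are hypothetical
      simplifications, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback-interest record attached to a vnode. */
struct cb_interest_sketch {
    unsigned int break_count;
};

struct cb_vnode_sketch {
    struct cb_interest_sketch *cbi;  /* may be NULL before first contact */
    unsigned int cb_break;
};

/*
 * Return true if the callback must be considered broken relative to the
 * counter sum sampled earlier.  A NULL interest pointer is never
 * dereferenced; it simply means "broken".
 */
bool cb_is_broken(const struct cb_vnode_sketch *v, unsigned int sampled_sum)
{
    assert(v);

    if (!v->cbi)
        return true;
    return sampled_sum != v->cb_break + v->cbi->break_count;
}
```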
    • afs: Eliminate the address pointer from the address list cursor · 2feeaf84
      Committed by David Howells
      Eliminate the address pointer from the address list cursor.  It is
      redundant (ac->addrs[ac->index] can be used to find the same address) and,
      because address lists must be replaced rather than rearranged, it is of
      limited value.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Allow dumping of server cursor on operation failure · 744bcd71
      Committed by David Howells
      Provide an option to allow the file or volume location server cursor to be
      dumped if the rotation routine falls off the end without managing to
      contact a server.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Implement YFS support in the fs client · 30062bd1
      Committed by David Howells
      Implement support for talking to YFS-variant fileservers in the cache
      manager and the filesystem client.  These implement upgraded services on
      the same port as their AFS services.
      
      YFS fileservers provide expanded capabilities over AFS.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Calc callback expiry in op reply delivery · 12d8e95a
      Committed by David Howells
      Calculate the callback expiration time at the point of operation reply
      delivery, using the reply time queried from AF_RXRPC on that call as a
      base.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Add a couple of tracepoints to log I/O errors · f51375cd
      Committed by David Howells
      Add a couple of tracepoints to log the production of I/O errors within the AFS
      filesystem.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Implement VL server rotation · 0a5143f2
      Committed by David Howells
      Track VL servers as independent entities rather than lumping all their
      addresses together into one set and implement server-level rotation by:
      
       (1) Add the concept of a VL server list, where each server has its own
           separate address list.  This code is similar to the FS server list.
      
       (2) Use the DNS resolver to retrieve a set of servers and their associated
           addresses, ports, preference and weight ratings.
      
       (3) In the case of a legacy DNS resolver or an address list given directly
           through /proc/net/afs/cells, create a list containing just a dummy
           server record and attach all the addresses to that.
      
       (4) Implement a simple rotation policy, for the moment ignoring the
           priorities and weights assigned to the servers.
      
       (5) Show the address list through /proc/net/afs/<cell>/vlservers.  This
           also displays the source and status of the data as indicated by the
           upcall.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Improve FS server rotation error handling · e7f680f4
      Committed by David Howells
      Improve the error handling in FS server rotation by:
      
       (1) Cache the latest useful error value for the fs operation as a whole in
           struct afs_fs_cursor separately from the error cached in the
           afs_addr_cursor struct.  The one in the address cursor gets clobbered
           occasionally.  Copy over the error to the fs operation only when it's
           something we'd be interested in passing to userspace.
      
       (2) Make it so that EDESTADDRREQ is the default that is seen only if no
           addresses are available to be accessed.
      
       (3) When calling utility functions, such as checking a volume status or
           probing a fileserver, don't let a successful result clobber the cached
           error in the cursor; instead, stash the result in a temporary variable
           until it has been assessed.
      
       (4) Don't return ETIMEDOUT or ETIME if a better error, such as
           ENETUNREACH, is already cached.
      
       (5) On leaving the rotation loop, turn any remote abort code into a more
           useful error than ECONNABORTED.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Set up the iov_iter before calling afs_extract_data() · 12bdcf33
      Committed by David Howells
      afs_extract_data sets up a temporary iov_iter and passes it to AF_RXRPC
      each time it is called to describe the remaining buffer to be filled.
      
      Instead:
      
       (1) Put an iterator in the afs_call struct.
      
       (2) Set the iterator for each marshalling stage to load data into the
           appropriate places.  A number of convenience functions are provided to
           this end (eg. afs_extract_to_buf()).
      
           This iterator is then passed to afs_extract_data().
      
       (3) Use the new ITER_DISCARD iterator to discard any excess data provided
           by FetchData.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Better tracing of protocol errors · 160cb957
      Committed by David Howells
      Include the site of detection of AFS protocol errors in trace lines to
      better be able to determine what went wrong.
      Signed-off-by: David Howells <dhowells@redhat.com>
  9. 12 Oct, 2018 1 commit
    • afs: Fix cell proc list · 6b3944e4
      Committed by David Howells
      Access to the list of cells by /proc/net/afs/cells has a couple of
      problems:
      
       (1) It should be checking against SEQ_START_TOKEN for keying the header
           line.
      
       (2) It's only holding the RCU read lock, so it can't just walk over the
           list without following the proper RCU methods.
      
      Fix these by using an hlist instead of an ordinary list and using the
      appropriate accessor functions to follow it with RCU.
      
      Since the code that adds a cell to the list must also necessarily change,
      sort the list on insertion whilst we're at it.
      
      Fixes: 989782dc ("afs: Overhaul cell database management")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
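
      The "sort the list on insertion" step can be illustrated with a plain
      singly linked list.  This sketch deliberately leaves out the hlist and
      RCU machinery the commit is really about, and it assumes cells are
      ordered by name, which the message does not state; cell_sketch and
      cell_insert_sorted() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified cell record. */
struct cell_sketch {
    const char *name;
    struct cell_sketch *next;
};

/* Insert new_cell so that the list stays sorted by name (ascending). */
void cell_insert_sorted(struct cell_sketch **head, struct cell_sketch *new_cell)
{
    struct cell_sketch **pp = head;

    assert(head && new_cell && new_cell->name);

    /* Walk to the first entry that should come after the new one. */
    while (*pp && strcmp((*pp)->name, new_cell->name) < 0)
        pp = &(*pp)->next;

    new_cell->next = *pp;
    *pp = new_cell;
}
```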
  10. 04 Oct, 2018 1 commit
  11. 24 Aug, 2018 1 commit