1. 04 1月, 2021 1 次提交
    • D
      afs: Fix directory entry size calculation · 366911cd
      David Howells 提交于
      The number of dirent records used by an AFS directory entry should be
      calculated using the assumption that there is a 16-byte name field in the
      first block, rather than a 20-byte name field (which is actually the case).
      This miscalculation is historic and effectively standard, so we have to use
      it.
      
      The calculation we need to use is:
      
      	1 + (((strlen(name) + 1) + 15) >> 5)
      
      where we are adding one to the strlen() result to account for the NUL
      termination.
      
      Fix this by the following means:
      
       (1) Create an inline function to do the calculation for a given name
           length.
      
       (2) Use the function to calculate the number of records used for a dirent
           in afs_dir_iterate_block().
      
           Use this to move the over-end check out of the loop since it only
           needs to be done once.
      
           Further use this to only go through the loop for the 2nd+ records
           composing an entry.  The only test there now is for if the record is
           allocated - and we already checked the first block at the top of the
           outer loop.
      
       (3) Add a max name length check in afs_dir_iterate_block().
      
       (4) Make afs_edit_dir_add() and afs_edit_dir_remove() use the function
           from (1) to calculate the number of blocks rather than doing it
           incorrectly themselves.
      
      Fixes: 63a4681f ("afs: Locally edit directory data for mkdir/create/unlink/...")
      Fixes: ^1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NMarc Dionne <marc.dionne@auristor.com>
      366911cd
  2. 29 10月, 2020 2 次提交
    • D
      afs: Fix afs_invalidatepage to adjust the dirty region · f86726a6
      David Howells 提交于
      Fix afs_invalidatepage() to adjust the dirty region recorded in
      page->private when truncating a page.  If the dirty region is entirely
      removed, then the private data is cleared and the page dirty state is
      cleared.
      
      Without this, if the page is truncated and then expanded again by truncate,
      zeros from the expanded, but no-longer dirty region may get written back to
      the server if the page gets laundered due to a conflicting 3rd-party write.
      
      It mustn't, however, shorten the dirty region of the page if that page is
      still mmapped and has been marked dirty by afs_page_mkwrite(), so a flag is
      stored in page->private to record this.
      
      Fixes: 4343d008 ("afs: Get rid of the afs_writeback record")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      f86726a6
    • D
      afs: Wrap page->private manipulations in inline functions · 185f0c70
      David Howells 提交于
      The afs filesystem uses page->private to store the dirty range within a
      page such that in the event of a conflicting 3rd-party write to the server,
      we write back just the bits that got changed locally.
      
      However, there are a couple of problems with this:
      
       (1) I need a bit to note if the page might be mapped so that partial
           invalidation doesn't shrink the range.
      
       (2) There aren't necessarily sufficient bits to store the entire range of
           data altered (say it's a 32-bit system with 64KiB pages or transparent
           huge pages are in use).
      
      So wrap the accesses in inline functions so that future commits can change
      how this works.
      
      Also move them out of the tracing header into the in-directory header.
      There's not really any need for them to be in the tracing header.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      185f0c70
  3. 16 10月, 2020 2 次提交
  4. 04 6月, 2020 3 次提交
    • D
      afs: Add a tracepoint to track the lifetime of the afs_volume struct · cca37d45
      David Howells 提交于
      Add a tracepoint to track the lifetime of the afs_volume struct.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      cca37d45
    • D
      afs: Implement client support for the YFSVL.GetCellName RPC op · c3e9f888
      David Howells 提交于
      Implement client support for the YFSVL.GetCellName RPC operation by which
      YFS permits the canonical cell name to be queried from a VL server.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      c3e9f888
    • D
      afs: Build an abstraction around an "operation" concept · e49c7b2f
      David Howells 提交于
      Turn the afs_operation struct into the main way that most fileserver
      operations are managed.  Various things are added to the struct, including
      the following:
      
       (1) All the parameters and results of the relevant operations are moved
           into it, removing corresponding fields from the afs_call struct.
           afs_call gets a pointer to the op.
      
       (2) The target volume is made the main focus of the operation, rather than
           the target vnode(s), and a bunch of op->vnode->volume are made
           op->volume instead.
      
       (3) Two vnode records are defined (op->file[]) for the vnode(s) involved
           in most operations.  The vnode record (struct afs_vnode_param)
           contains:
      
      	- The vnode pointer.
      
      	- The fid of the vnode to be included in the parameters or that was
                returned in the reply (eg. FS.MakeDir).
      
      	- The status and callback information that may be returned in the
           	  reply about the vnode.
      
      	- Callback break and data version tracking for detecting
                simultaneous third-parth changes.
      
       (4) Pointers to dentries to be updated with new inodes.
      
       (5) An operations table pointer.  The table includes pointers to functions
           for issuing AFS and YFS-variant RPCs, handling the success and abort
           of an operation and handling post-I/O-lock local editing of a
           directory.
      
      To make this work, the following function restructuring is made:
      
       (A) The rotation loop that issues calls to fileservers that can be found
           in each function that wants to issue an RPC (such as afs_mkdir()) is
           extracted out into common code, in a new file called fs_operation.c.
      
       (B) The rotation loops, such as the one in afs_mkdir(), are replaced with
           a much smaller piece of code that allocates an operation, sets the
           parameters and then calls out to the common code to do the actual
           work.
      
       (C) The code for handling the success and failure of an operation are
           moved into operation functions (as (5) above) and these are called
           from the core code at appropriate times.
      
       (D) The pseudo inode getting stuff used by the dynamic root code is moved
           over into dynroot.c.
      
       (E) struct afs_iget_data is absorbed into the operation struct and
           afs_iget() expects to be given an op pointer and a vnode record.
      
       (F) Point (E) doesn't work for the root dir of a volume, but we know the
           FID in advance (it's always vnode 1, unique 1), so a separate inode
           getter, afs_root_iget(), is provided to special-case that.
      
       (G) The inode status init/update functions now also take an op and a vnode
           record.
      
       (H) The RPC marshalling functions now, for the most part, just take an
           afs_operation struct as their only argument.  All the data they need
           is held there.  The result delivery functions write their answers
           there as well.
      
       (I) The call is attached to the operation and then the operation core does
           the waiting.
      
      And then the new operation code is, for the moment, made to just initialise
      the operation, get the appropriate vnode I/O locks and do the same rotation
      loop as before.
      
      This lays the foundation for the following changes in the future:
      
       (*) Overhauling the rotation (again).
      
       (*) Support for asynchronous I/O, where the fileserver rotation must be
           done asynchronously also.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      e49c7b2f
  5. 31 5月, 2020 3 次提交
    • D
      afs: Remove the error argument from afs_protocol_error() · 7126ead9
      David Howells 提交于
      Remove the error argument from afs_protocol_error() as it's always
      -EBADMSG.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      7126ead9
    • D
      afs: Actively poll fileservers to maintain NAT or firewall openings · f6cbb368
      David Howells 提交于
      When an AFS client accesses a file, it receives a limited-duration callback
      promise that the server will notify it if another client changes a file.
      This callback duration can be a few hours in length.
      
      If a client mounts a volume and then an application prevents it from being
      unmounted, say by chdir'ing into it, but then does nothing for some time,
      the rxrpc_peer record will expire and rxrpc-level keepalive will cease.
      
      If there is NAT or a firewall between the client and the server, the route
      back for the server may close after a comparatively short duration, meaning
      that attempts by the server to notify the client may then bounce.
      
      The client, however, may (so far as it knows) still have a valid unexpired
      promise and will then rely on its cached data and will not see changes made
      on the server by a third party until it incidentally rechecks the status or
      the promise needs renewal.
      
      To deal with this, the client needs to regularly probe the server.  This
      has two effects: firstly, it keeps a route open back for the server, and
      secondly, it causes the server to disgorge any notifications that got
      queued up because they couldn't be sent.
      
      Fix this by adding a mechanism to emit regular probes.
      
      Two levels of probing are made available: Under normal circumstances the
      'slow' queue will be used for a fileserver - this just probes the preferred
      address once every 5 mins or so; however, if server fails to respond to any
      probes, the server will shift to the 'fast' queue from which all its
      interfaces will be probed every 30s.  When it finally responds, the record
      will switch back to the slow queue.
      
      Further notes:
      
       (1) Probing is now no longer driven from the fileserver rotation
           algorithm.
      
       (2) Probes are dispatched to all interfaces on a fileserver when that an
           afs_server object is set up to record it.
      
       (3) The afs_server object is removed from the probe queues when we start
           to probe it.  afs_is_probing_server() returns true if it's not listed
           - ie. it's undergoing probing.
      
       (4) The afs_server object is added back on to the probe queue when the
           final outstanding probe completes, but the probed_at time is set when
           we're about to launch a probe so that it's not dependent on the probe
           duration.
      
       (5) The timer and the work item added for this must be handed a count on
           net->servers_outstanding, which they hand on or release.  This makes
           sure that network namespace cleanup waits for them.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Reported-by: NDave Botsch <botsch@cnf.cornell.edu>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      f6cbb368
    • D
      afs: Split the usage count on struct afs_server · 977e5f8e
      David Howells 提交于
      Split the usage count on the afs_server struct to have an active count that
      registers who's actually using it separately from the reference count on
      the object.
      
      This allows a future patch to dispatch polling probes without advancing the
      "unuse" time into the future each time we emit a probe, which would
      otherwise prevent unused server records from expiring.
      
      Included in this:
      
       (1) The latter part of afs_destroy_server() in which the RCU destruction
           of afs_server objects is invoked and the outstanding server count is
           decremented is split out into __afs_put_server().
      
       (2) afs_put_server() now calls __afs_put_server() rather then setting the
           management timer.
      
       (3) The calls begun by afs_fs_give_up_all_callbacks() and
           afs_fs_get_capabilities() can now take a ref on the server record, so
           afs_destroy_server() can just drop its ref and needn't wait for the
           completion of these calls.  They'll put the ref when they're done.
      
       (4) Because of (3), afs_fs_probe_done() no longer needs to wake up
           afs_destroy_server() with server->probe_outstanding.
      
       (5) afs_gc_servers can be simplified.  It only needs to check if
           server->active is 0 rather than playing games with the refcount.
      
       (6) afs_manage_servers() can propose a server for gc if usage == 0 rather
           than if ref == 1.  The gc is effected by (5).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      977e5f8e
  6. 14 3月, 2020 1 次提交
    • D
      afs: Fix some tracing details · 4636cf18
      David Howells 提交于
      Fix a couple of tracelines to indicate the usage count after the atomic op,
      not the usage count before it to be consistent with other afs and rxrpc
      trace lines.
      
      Change the wording of the afs_call_trace_work trace ID label from "WORK" to
      "QUEUE" to reflect the fact that it's queueing work, not doing work.
      
      Fixes: 341f741f ("afs: Refcount the afs_call struct")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      4636cf18
  7. 15 1月, 2020 1 次提交
  8. 21 6月, 2019 2 次提交
  9. 24 5月, 2019 1 次提交
  10. 07 5月, 2019 2 次提交
  11. 25 4月, 2019 6 次提交
    • D
      afs: Provide mount-time configurable byte-range file locking emulation · 6c6c1d63
      David Howells 提交于
      Provide byte-range file locking emulation that can be configured at mount
      time to one of four modes:
      
       (1) flock=local.  Locking is done locally only and no reference is made to
           the server.
      
       (2) flock=openafs.  Byte-range locking is done locally only; whole-file
           locking is done with reference to the server.  Whole-file locks cannot
           be upgraded unless the client holds an exclusive lock.
      
       (3) flock=strict.  Byte-range and whole-file locking both require a
           sufficient whole-file lock on the server.
      
       (4) flock=write.  As strict, but the client always gets an exclusive
           whole-file lock on the server.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      6c6c1d63
    • D
      afs: Add more tracepoints · 80548b03
      David Howells 提交于
      Add four more tracepoints:
      
       (1) afs_make_fs_call1 - Split from afs_make_fs_call but takes a filename
           to log also.
      
       (2) afs_make_fs_call2 - Like the above but takes two filenames to log.
      
       (3) afs_lookup - Log the result of doing a successful lookup, including a
           negative result (fid 0:0).
      
       (4) afs_get_tree - Log the set up of a volume for mounting.
      
      It also extends the name buffer on the afs_edit_dir tracepoint to 24 chars
      and puts quotes around the filename in the text representation.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      80548b03
    • D
      afs: Implement sillyrename for unlink and rename · 79ddbfa5
      David Howells 提交于
      Implement sillyrename for AFS unlink and rename, using the NFS variant
      implementation as a basis.
      
      Note that the asynchronous file locking extender/releaser has to be
      notified with a state change to stop it complaining if there's a race
      between that and the actual file deletion.
      
      A tracepoint, afs_silly_rename, is also added to note the silly rename and
      the cleanup.  The afs_edit_dir tracepoint is given some extra reason
      indicators and the afs_flock_ev tracepoint is given a silly-delete file
      lock cancellation indicator.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      79ddbfa5
    • D
      afs: Add directory reload tracepoint · 99987c56
      David Howells 提交于
      Add a tracepoint (afs_reload_dir) to indicate when a directory is being
      reloaded.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      99987c56
    • D
      afs: Handle lock rpc ops failing on a file that got deleted · cdfb26b4
      David Howells 提交于
      Holding a file lock on an AFS file does not prevent it from being deleted
      on the server, so we need to handle an error resulting from that when we
      try setting, extending or releasing a lock.
      
      Fix this by adding a "deleted" lock state and cancelling the lock extension
      process for that file and aborting all waiters for the lock.
      
      Fixes: 0fafdc9f ("afs: Fix file locking")
      Reported-by: NJonathan Billings <jsbillin@umich.edu>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      cdfb26b4
    • D
      afs: Add file locking tracepoints · d4696601
      David Howells 提交于
      Add two tracepoints for monitoring AFS file locking.  Firstly, add one that
      follows the operational part:
      
          echo 1 >/sys/kernel/debug/tracing/events/afs/afs_flock_op/enable
      
      And add a second that more follows the event-driven part:
      
          echo 1 >/sys/kernel/debug/tracing/events/afs/afs_flock_ev/enable
      
      Individual file_lock structs seen by afs are tagged with debugging IDs that
      are displayed in the trace log to make it easier to see what's going on,
      especially as setting the first lock always seems to involve copying the
      file_lock twice.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      d4696601
  12. 17 1月, 2019 1 次提交
    • D
      afs: Fix race in async call refcounting · 34fa4761
      David Howells 提交于
      There's a race between afs_make_call() and afs_wake_up_async_call() in the
      case that an error is returned from rxrpc_kernel_send_data() after it has
      queued the final packet.
      
      afs_make_call() will try and clean up the mess, but the call state may have
      been moved on thereby causing afs_process_async_call() to also try and to
      delete the call.
      
      Fix this by:
      
       (1) Getting an extra ref for an asynchronous call for the call itself to
           hold.  This makes sure the call doesn't evaporate on us accidentally
           and will allow the call to be retained by the caller in a future
           patch.  The ref is released on leaving afs_make_call() or
           afs_wait_for_call_to_complete().
      
       (2) In the event of an error from rxrpc_kernel_send_data():
      
           (a) Don't set the call state to AFS_CALL_COMPLETE until *after* the
           	 call has been aborted and ended.  This prevents
           	 afs_deliver_to_call() from doing anything with any notifications
           	 it gets.
      
           (b) Explicitly end the call immediately to prevent further callbacks.
      
           (c) Cancel any queued async_work and wait for the work if it's
           	 executing.  This allows us to be sure the race won't recur when we
           	 change the state.  We put the work queue's ref on the call if we
           	 managed to cancel it.
      
           (d) Put the call's ref that we got in (1).  This belongs to us as long
           	 as the call is in state AFS_CALL_CL_REQUESTING.
      
      Fixes: 341f741f ("afs: Refcount the afs_call struct")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      34fa4761
  13. 24 10月, 2018 6 次提交
  14. 14 5月, 2018 1 次提交
  15. 10 4月, 2018 3 次提交
    • D
      afs: Trace protocol errors · 5f702c8e
      David Howells 提交于
      Trace protocol errors detected in afs.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      5f702c8e
    • D
      afs: Locally edit directory data for mkdir/create/unlink/... · 63a4681f
      David Howells 提交于
      Locally edit the contents of an AFS directory upon a successful inode
      operation that modifies that directory (such as mkdir, create and unlink)
      so that we can avoid the current practice of re-downloading the directory
      after each change.
      
      This is viable provided that the directory version number we get back from
      the modifying RPC op is exactly incremented by 1 from what we had
      previously.  The data in the directory contents is in a defined format that
      we have to parse locally to perform lookups and readdir, so modifying isn't
      a problem.
      
      If the edit fails, we just clear the VALID flag on the directory and it
      will be reloaded next time it is needed.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      63a4681f
    • D
      afs: Prospectively look up extra files when doing a single lookup · 5cf9dd55
      David Howells 提交于
      When afs_lookup() is called, prospectively look up the next 50 uncached
      fids also from that same directory and cache the results, rather than just
      looking up the one file requested.
      
      This allows us to use the FS.InlineBulkStatus RPC op to increase efficiency
      by fetching up to 50 file statuses at a time.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      5cf9dd55
  16. 28 3月, 2018 1 次提交
    • D
      rxrpc, afs: Use debug_ids rather than pointers in traces · a25e21f0
      David Howells 提交于
      In rxrpc and afs, use the debug_ids that are monotonically allocated to
      various objects as they're allocated rather than pointers as kernel
      pointers are now hashed making them less useful.  Further, the debug ids
      aren't reused anywhere nearly as quickly.
      
      In addition, allow kernel services that use rxrpc, such as afs, to take
      numbers from the rxrpc counter, assign them to their own call struct and
      pass them in to rxrpc for both client and service calls so that the trace
      lines for each will have the same ID tag.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      a25e21f0
  17. 13 11月, 2017 4 次提交
    • D
      afs: Protect call->state changes against signals · 98bf40cd
      David Howells 提交于
      Protect call->state changes against the call being prematurely terminated
      due to a signal.
      
      What can happen is that a signal causes afs_wait_for_call_to_complete() to
      abort an afs_call because it's not yet complete whilst afs_deliver_to_call()
      is delivering data to that call.
      
      If the data delivery causes the state to change, this may overwrite the state
      of the afs_call, making it not-yet-complete again - but no further
      notifications will be forthcoming from AF_RXRPC as the rxrpc call has been
      aborted and completed, so kAFS will just hang in various places waiting for
      that call or on page bits that need clearing by that call.
      
      A tracepoint to monitor call state changes is also provided.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      98bf40cd
    • D
      afs: Trace page dirty/clean · 13524ab3
      David Howells 提交于
      Add a trace event that logs the dirtying and cleaning of pages attached to
      AFS inodes.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      13524ab3
    • D
      afs: Fix directory read/modify race · dab17c1a
      David Howells 提交于
      Because parsing of the directory wasn't being done under any sort of lock,
      the pages holding the directory content can get invalidated whilst the
      parsing is ongoing.
      
      Further, the directory page check function gets called outside of the page
      lock, so if the page gets cleared or updated, this may return reports of
      bad magic numbers in the directory page.
      
      Also, the directory may change size whilst checking and parsing are
      ongoing, so more care needs to be taken here.
      
      Fix this by:
      
       (1) Perform the page check from the page filling function before we set
           PageUptodate and drop the page lock.
      
       (2) Check for the file having shrunk and the page having been abandoned
           before checking the page contents.
      
       (3) Lock the page whilst parsing it for the directory iterator.
      
      Whilst we're at it, add a tracepoint to report check failure.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      dab17c1a
    • D
      afs: Trace the sending of pages · 2c099014
      David Howells 提交于
      Add a pair of tracepoints to log the sending of pages for an FS.StoreData
      or FS.StoreData64 operation.
      
      Tracepoint afs_send_pages notes each set of pages added to the operation.
      There may be several of these per operation as we get up at most 8
      contiguous pages in one go because the bvec we're using is on the stack.
      
      Tracepoint afs_sent_pages notes the end of adding data from a whole run of
      pages to the operation and the completion of the request phase.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      2c099014