1. 24 Oct, 2018 7 commits
    • afs: Implement YFS support in the fs client · 30062bd1
      David Howells committed
      Implement support for talking to YFS-variant fileservers in the cache
      manager and the filesystem client.  These implement upgraded services on
      the same port as their AFS services.
      
      YFS fileservers provide expanded capabilities over AFS.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Calc callback expiry in op reply delivery · 12d8e95a
      David Howells committed
      Calculate the callback expiration time at the point of operation reply
      delivery, using the reply time queried from AF_RXRPC on that call as a
      base.
      Signed-off-by: David Howells <dhowells@redhat.com>
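As a rough illustration of the calculation (a user-space Python sketch, not the kernel code): the absolute expiry is the time at which the reply was received plus the relative lifetime carried in the reply. The names `reply_time` and `expiry_secs` are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

def callback_expires_at(reply_time: datetime, expiry_secs: int) -> datetime:
    """Absolute expiry = time the reply was received + relative lifetime."""
    return reply_time + timedelta(seconds=expiry_secs)

# Hypothetical values: a reply received at noon UTC with a one-hour promise.
reply_time = datetime(2018, 10, 24, 12, 0, 0, tzinfo=timezone.utc)
expires = callback_expires_at(reply_time, 3600)
print(expires.isoformat())
```

Basing the result on the reply time rather than on "now" means that any delay between delivery and processing does not stretch the promise.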
    • afs: Add a couple of tracepoints to log I/O errors · f51375cd
      David Howells committed
      Add a couple of tracepoints to log the production of I/O errors within the AFS
      filesystem.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Implement VL server rotation · 0a5143f2
      David Howells committed
      Track VL servers as independent entities rather than lumping all their
      addresses together into one set and implement server-level rotation by:
      
       (1) Add the concept of a VL server list, where each server has its own
           separate address list.  This code is similar to the FS server list.
      
       (2) Use the DNS resolver to retrieve a set of servers and their associated
           addresses, ports, preference and weight ratings.
      
       (3) In the case of a legacy DNS resolver or an address list given directly
           through /proc/net/afs/cells, create a list containing just a dummy
           server record and attach all the addresses to that.
      
       (4) Implement a simple rotation policy, for the moment ignoring the
           priorities and weights assigned to the servers.
      
       (5) Show the address list through /proc/net/afs/<cell>/vlservers.  This
           also displays the source and status of the data as indicated by the
           upcall.
      Signed-off-by: David Howells <dhowells@redhat.com>
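The server-level rotation described in points (1) and (4) can be modelled in user space roughly as follows. This is a minimal Python sketch, not the kernel implementation: `VLServer` and `rotation_order()` are hypothetical names, the hostnames and addresses are documentation placeholders, and the policy is a plain round-robin that records but ignores priority and weight.

```python
from dataclasses import dataclass

@dataclass
class VLServer:
    name: str
    addresses: list          # each server keeps its own address list
    priority: int = 0        # recorded but ignored by this simple policy
    weight: int = 0          # recorded but ignored by this simple policy

def rotation_order(servers, start=0):
    """Yield (server name, address) pairs, visiting servers in turn from `start`."""
    n = len(servers)
    for i in range(n):
        server = servers[(start + i) % n]
        for addr in server.addresses:
            yield server.name, addr

servers = [
    VLServer("vl1.example.org", ["192.0.2.1", "192.0.2.2"], priority=10),
    VLServer("vl2.example.org", ["198.51.100.7"], priority=5),
]
order = list(rotation_order(servers, start=1))
print(order)
```

The legacy case in point (3) fits the same model: a single dummy `VLServer` holding every address.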
    • afs: Improve FS server rotation error handling · e7f680f4
      David Howells committed
      Improve the error handling in FS server rotation by:
      
       (1) Cache the latest useful error value for the fs operation as a whole in
           struct afs_fs_cursor separately from the error cached in the
           afs_addr_cursor struct.  The one in the address cursor gets clobbered
           occasionally.  Copy over the error to the fs operation only when it's
           something we'd be interested in passing to userspace.
      
       (2) Make it so that EDESTADDRREQ is the default that is seen only if no
           addresses are available to be accessed.
      
       (3) When calling utility functions, such as checking a volume status or
           probing a fileserver, don't let a successful result clobber the cached
           error in the cursor; instead, stash the result in a temporary variable
           until it has been assessed.
      
       (4) Don't return ETIMEDOUT or ETIME if a better error, such as
           ENETUNREACH, is already cached.
      
       (5) On leaving the rotation loop, turn any remote abort code into a more
           useful error than ECONNABORTED.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: David Howells <dhowells@redhat.com>
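The error-caching rules in points (2) to (4) amount to a small priority scheme. The following Python sketch models that scheme under stated assumptions: `update_cached_error()` is a hypothetical helper, and treating only ETIMEDOUT and ETIME as "weak" errors is a simplification of whatever the kernel actually does.

```python
import errno

# Errors that say little on their own; a more specific cached error wins.
WEAK_ERRORS = {errno.ETIMEDOUT, errno.ETIME}

def update_cached_error(cached: int, new: int) -> int:
    """Return the error to keep after seeing `new` (0 means success)."""
    if new == 0:
        return cached        # a successful utility call must not clobber the cache
    if new in WEAK_ERRORS and cached not in (errno.EDESTADDRREQ, *WEAK_ERRORS):
        return cached        # keep e.g. ENETUNREACH in preference to a timeout
    return new

err = errno.EDESTADDRREQ     # default: seen only if no address was usable
for result in (errno.ENETUNREACH, errno.ETIMEDOUT, 0):
    err = update_cached_error(err, result)
print(errno.errorcode[err])
```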
    • afs: Set up the iov_iter before calling afs_extract_data() · 12bdcf33
      David Howells committed
      afs_extract_data sets up a temporary iov_iter and passes it to AF_RXRPC
      each time it is called to describe the remaining buffer to be filled.
      
      Instead:
      
       (1) Put an iterator in the afs_call struct.
      
       (2) Set the iterator for each marshalling stage to load data into the
           appropriate places.  A number of convenience functions are provided to
           this end (eg. afs_extract_to_buf()).
      
           This iterator is then passed to afs_extract_data().
      
       (3) Use the new ITER_DISCARD iterator to discard any excess data provided
           by FetchData.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Better tracing of protocol errors · 160cb957
      David Howells committed
      Include the site of detection of AFS protocol errors in trace lines to
      better be able to determine what went wrong.
      Signed-off-by: David Howells <dhowells@redhat.com>
  2. 12 Oct, 2018 1 commit
    • afs: Fix cell proc list · 6b3944e4
      David Howells committed
      Access to the list of cells by /proc/net/afs/cells has a couple of
      problems:
      
       (1) It should be checking against SEQ_START_TOKEN for keying the
           header line.
      
       (2) It's only holding the RCU read lock, so it can't just walk over the
           list without following the proper RCU methods.
      
      Fix these by using an hlist instead of an ordinary list and using the
      appropriate accessor functions to follow it with RCU.
      
      Since the code that adds a cell to the list must also necessarily change,
      sort the list on insertion whilst we're at it.
      
      Fixes: 989782dc ("afs: Overhaul cell database management")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
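Two of the ideas above, keying the header line off a start token and keeping the list sorted on insertion, can be sketched in user-space Python. This does not model RCU or hlists at all; `add_cell()` and `show_cells()` are hypothetical names, the `SEQ_START_TOKEN` object merely stands in for the kernel sentinel, and the header text is a placeholder rather than the real /proc output.

```python
import bisect

SEQ_START_TOKEN = object()   # stand-in for the kernel's header sentinel

cells = []                   # kept sorted by name at all times

def add_cell(name: str) -> bool:
    """Insert `name` in sorted position; return False if already present."""
    i = bisect.bisect_left(cells, name)
    if i < len(cells) and cells[i] == name:
        return False
    cells.insert(i, name)
    return True

def show_cells():
    """Yield a header for the start token, then one line per cell."""
    for item in [SEQ_START_TOKEN, *cells]:
        if item is SEQ_START_TOKEN:
            yield "# name"   # placeholder header text
        else:
            yield item

for name in ("grand.central.org", "example.org", "zeta.example.net"):
    add_cell(name)
output = list(show_cells())
print(output)
```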
  3. 04 Oct, 2018 1 commit
  4. 24 Aug, 2018 1 commit
  5. 15 Jun, 2018 3 commits
    • afs: Optimise callback breaking by not repeating volume lookup · 47ea0f2e
      David Howells committed
      At the moment, afs_break_callbacks calls afs_break_one_callback() for each
      separate FID it was given, and the latter looks up the volume individually
      for each one.
      
      However, this is inefficient if two or more FIDs have the same vid as we
      could reuse the volume.  This is complicated by cell aliasing whereby we
      may have multiple cells sharing a volume and can therefore have multiple
      callback interests for any particular volume ID.
      
      At the moment afs_break_one_callback() scans the entire list of volumes
      we're getting from a server and breaks the appropriate callback in every
      matching volume, regardless of cell.  This scan is done for every FID.
      
      Optimise callback breaking by the following means:
      
       (1) Sort the FID list by vid so that all FIDs belonging to the same volume
           are clumped together.
      
           This is done through an indirection table: we cannot insertion-sort
           the afs_callback_break array while decoding FIDs into it, because we
           subsequently decode callback info into the same array, and that info
           corresponds to the FIDs by array index only.
      
           We also don't really want to bubblesort afterwards if we can avoid it.
      
       (2) Sort the server->cb_interests array by vid so that all the matching
           volumes are grouped together.  This permits the scan to stop after
           finding a record that has a higher vid.
      
       (3) When breaking FIDs, we try to keep server->cb_break_lock as long as
           possible, caching the start point in the array for that volume group
           as long as possible.
      
           It might make sense to add another layer in that list and have a
           refcounted volume ID anchor that has the matching interests attached
           to it rather than being in the list.  This would allow the lock to be
           dropped without losing the cursor.
      Signed-off-by: David Howells <dhowells@redhat.com>
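The indirection-table approach from point (1) can be illustrated in user-space Python. This is only a sketch: the tuples, the `callbacks` list and the lookup counter are made up for the example, and the real code works on kernel structures under a lock.

```python
# Each FID is (vid, vnode, unique).  Callback info is decoded later into a
# parallel array matched by index, so the FID array itself must not be reordered.
fids = [(7, 10, 1), (3, 4, 1), (7, 2, 9), (3, 8, 2)]
callbacks = ["cb0", "cb1", "cb2", "cb3"]

# Indirection table: indices into `fids`, ordered by volume ID.
order = sorted(range(len(fids)), key=lambda i: fids[i][0])

# Walk the table, doing the (hypothetical) volume lookup once per group of
# equal vid rather than once per FID.
lookups = 0
groups = {}
prev_vid = None
for i in order:
    vid = fids[i][0]
    if vid != prev_vid:
        lookups += 1
        prev_vid = vid
    groups.setdefault(vid, []).append((fids[i], callbacks[i]))

print(order, lookups)
```

Because only the index table is sorted, each FID still lines up with its callback record by array index.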
    • afs: Display manually added cells in dynamic root mount · 0da0b7fd
      David Howells committed
      Alter the dynroot mount so that cells created by manipulation of
      /proc/fs/afs/cells and /proc/fs/afs/rootcell and by specification of a root
      cell as a module parameter will cause directories for those cells to be
      created in the dynamic root superblock for the network namespace[*].
      
      To this end:
      
       (1) Only one dynamic root superblock is now created per network namespace
           and this is shared between all attempts to mount it.  This makes it
           easier to find the superblock to modify.
      
       (2) When a dynamic root superblock is created, the list of cells is walked
           and directories created for each cell already defined.
      
       (3) When a new cell is added, if a dynamic root superblock exists, a
           directory is created for it.
      
       (4) When a cell is destroyed, the directory is removed.
      
       (5) These directories are created by calling lookup_one_len() on the root
           dir which automatically creates them if they don't exist.
      
      [*] Inasmuch as network namespaces are currently supported here.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Handle CONFIG_PROC_FS=n · b6cfbeca
      David Howells committed
      The AFS filesystem depends at the moment on /proc for configuration and
      also presents information that way - however, this causes a compilation
      failure if procfs is disabled.
      
      Fix it so that the procfs bits aren't compiled in if procfs is disabled.
      
      This means that you can't configure the AFS filesystem directly, but it is
      still usable provided that an up-to-date keyutils is installed to look up
      cells by SRV or AFSDB DNS records.
      Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: David Howells <dhowells@redhat.com>
  6. 23 May, 2018 2 commits
  7. 14 May, 2018 4 commits
    • afs: Fix whole-volume callback handling · 68251f0a
      David Howells committed
      It's possible for an AFS file server to issue a whole-volume notification
      that callbacks on all the vnodes in the volume have been broken.  This is
      done for R/O and backup volumes (which don't have per-file callbacks) and
      for things like a volume being taken offline.
      
      Fix callback handling to detect whole-volume notifications, to track it
      across operations and to check it during inode validation.
      
      Fixes: c435ee34 ("afs: Overhaul the callback handling")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix refcounting in callback registration · d4a96bec
      David Howells committed
      The refcounting on afs_cb_interest struct objects in
      afs_register_server_cb_interest() is wrong as it uses the server list
      entry's call back interest pointer without regard for the fact that it
      might be replaced at any time and the object thrown away.
      
      Fix this by:
      
       (1) Put a lock on the afs_server_list struct that can be used to
           mediate access to the callback interest pointers in the servers array.
      
       (2) Keep a ref on the callback interest that we get from the entry.
      
       (3) Drop the old reference held by vnode->cb_interest if we replace
           the pointer.
      
      Fixes: c435ee34 ("afs: Overhaul the callback handling")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix giving up callbacks on server destruction · f2686b09
      David Howells committed
      When a server record is destroyed, we want to send a message to the server
      telling it that we're giving up all the callbacks it has promised us.
      
      Apply two fixes to this:
      
       (1) Only send the FS.GiveUpAllCallBacks message if we actually got a
           callback from that server.  We assume this to be the case if we
           performed at least one successful FS operation on that server.
      
       (2) Send it to the address last used for that server rather than always
           picking the first address in the list (which might be unreachable).
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix directory page locking · b61f7dcf
      David Howells committed
      The afs directory loading code (primarily afs_read_dir()) locks all the
      pages that hold a directory's content blob to defend against
      getdents/getdents races and getdents/lookup races where the competitors
      issue conflicting reads on the same data.  As the reads will complete
      consecutively, they may retrieve different versions of the data and
      one may overwrite the data that the other is busy parsing.
      
      Fix this by not locking the pages at all, but rather by turning the
      validation lock into an rwsem and getting an exclusive lock on it whilst
      reading the data or validating the attributes and a shared lock whilst
      parsing the data.  Sharing the attribute validation lock should be fine as
      the data fetch will retrieve the attributes also.
      
      The individual page locks aren't needed at all as the only place they're
      being used is to serialise data loading.
      
      Without this patch, the:
      
       	if (!test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags)) {
      		...
      	}
      
      part of afs_read_dir() may be skipped, leaving the pages unlocked when we
      hit the success: clause - in which case we try to unlock the not-locked
      pages, leading to the following oops:
      
        page:ffffe38b405b4300 count:3 mapcount:0 mapping:ffff98156c83a978 index:0x0
        flags: 0xfffe000001004(referenced|private)
        raw: 000fffe000001004 ffff98156c83a978 0000000000000000 00000003ffffffff
        raw: dead000000000100 dead000000000200 0000000000000001 ffff98156b27c000
        page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
        page->mem_cgroup:ffff98156b27c000
        ------------[ cut here ]------------
        kernel BUG at mm/filemap.c:1205!
        ...
        RIP: 0010:unlock_page+0x43/0x50
        ...
        Call Trace:
         afs_dir_iterate+0x789/0x8f0 [kafs]
         ? _cond_resched+0x15/0x30
         ? kmem_cache_alloc_trace+0x166/0x1d0
         ? afs_do_lookup+0x69/0x490 [kafs]
         ? afs_do_lookup+0x101/0x490 [kafs]
         ? key_default_cmp+0x20/0x20
         ? request_key+0x3c/0x80
         ? afs_lookup+0xf1/0x340 [kafs]
         ? __lookup_slow+0x97/0x150
         ? lookup_slow+0x35/0x50
         ? walk_component+0x1bf/0x490
         ? path_lookupat.isra.52+0x75/0x200
         ? filename_lookup.part.66+0xa0/0x170
         ? afs_end_vnode_operation+0x41/0x60 [kafs]
         ? __check_object_size+0x9c/0x171
         ? strncpy_from_user+0x4a/0x170
         ? vfs_statx+0x73/0xe0
         ? __do_sys_newlstat+0x39/0x70
         ? __x64_sys_getdents+0xc9/0x140
         ? __x64_sys_getdents+0x140/0x140
         ? do_syscall_64+0x5b/0x160
         ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: f3ddee8d ("afs: Fix directory handling")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
  8. 10 Apr, 2018 13 commits
    • afs: Do better accretion of small writes on newly created content · 5a813276
      David Howells committed
      Processes like ld that do lots of small writes that aren't necessarily
      contiguous result in a lot of small StoreData operations to the server, the
      idea being that if someone else changes the data on the server, we only
      write our changes over that and not the space between.  Further, we don't
      want to write back empty space if we can avoid it to make it easier for the
      server to do sparse files.
      
      However, making lots of tiny RPC ops is a lot less efficient for the server
      than one big one because each op requires allocation of resources and the
      taking of locks, so we want to compromise a bit.
      
      Reduce the load by the following:
      
       (1) If a file is just created locally or has just been truncated with
           O_TRUNC locally, allow subsequent writes to the file to be merged with
           intervening space if that space doesn't cross an entire intervening
           page.
      
       (2) Don't flush the file on ->flush() but rather on ->release() if the
           file was open for writing.
      
      Just linking vmlinux.o, without this patch, looking in /proc/fs/afs/stats:
      
      	file-wr : n=441 nb=513581204
      
      and after the patch:
      
      	file-wr : n=62 nb=513668555
      
      there were 379 fewer StoreData RPC operations at the expense of an extra
      87K being written.
      Signed-off-by: David Howells <dhowells@redhat.com>
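The merge rule in point (1) and the quoted figures can be checked with a short Python sketch. The interpretation here is an assumption: two writes are merged unless at least one complete page lies wholly inside the gap between them. `PAGE_SIZE`, `gap_holds_whole_page()` and `can_merge()` are illustrative names, not kernel symbols.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only

def gap_holds_whole_page(prev_end: int, next_start: int) -> bool:
    """True if at least one complete page lies inside [prev_end, next_start)."""
    first_boundary = -(-prev_end // PAGE_SIZE) * PAGE_SIZE  # round up to a page
    return next_start - first_boundary >= PAGE_SIZE

def can_merge(prev_end: int, next_start: int) -> bool:
    """Merge two writes unless the gap between them spans an entire page."""
    return next_start >= prev_end and not gap_holds_whole_page(prev_end, next_start)

# Figures quoted in the commit message for linking vmlinux.o.
ops_saved = 441 - 62                    # StoreData operations avoided
extra_bytes = 513668555 - 513581204     # additional bytes written
print(ops_saved, extra_bytes)
```

The subtraction gives 379 operations saved for 87351 extra bytes, which matches the "379 fewer" and "87K" in the message.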
    • afs: Add stats for data transfer operations · 76a5cb6f
      David Howells committed
      Add statistics to /proc/fs/afs/stats for data transfer RPC operations.  New
      lines are added that look like:
      
      	file-rd : n=55794 nb=10252282150
      	file-wr : n=9789 nb=3247763645
      
      where n= indicates the number of ops completed and nb= indicates the number
      of bytes successfully transferred.  file-rd is the counts for read/fetch
      operations and file-wr the counts for write/store operations.
      
      Note that directory and symlink downloading are included in the file-rd
      stats at the moment.
      Signed-off-by: David Howells <dhowells@redhat.com>
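Lines in the format shown above are easy to consume from user space. The following Python sketch parses the two sample lines from this message; `parse_transfer_stats()` is a hypothetical helper, and the regular expression assumes only the `name : n=<ops> nb=<bytes>` layout quoted here, not any other lines the file may contain.

```python
import re

LINE_RE = re.compile(r"^\s*(file-rd|file-wr)\s*:\s*n=(\d+)\s+nb=(\d+)\s*$")

def parse_transfer_stats(text: str) -> dict:
    """Map each counter name to its op count and byte count."""
    stats = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            stats[m.group(1)] = {"ops": int(m.group(2)), "bytes": int(m.group(3))}
    return stats

# Sample text copied from the commit message above.
sample = "\tfile-rd : n=55794 nb=10252282150\n\tfile-wr : n=9789 nb=3247763645\n"
stats = parse_transfer_stats(sample)
print(stats)
```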
    • afs: Trace protocol errors · 5f702c8e
      David Howells committed
      Trace protocol errors detected in afs.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Locally edit directory data for mkdir/create/unlink/... · 63a4681f
      David Howells committed
      Locally edit the contents of an AFS directory upon a successful inode
      operation that modifies that directory (such as mkdir, create and unlink)
      so that we can avoid the current practice of re-downloading the directory
      after each change.
      
      This is viable provided that the directory version number we get back from
      the modifying RPC op is exactly incremented by 1 from what we had
      previously.  The data in the directory contents is in a defined format that
      we have to parse locally to perform lookups and readdir, so modifying isn't
      a problem.
      
      If the edit fails, we just clear the VALID flag on the directory and it
      will be reloaded next time it is needed.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Fix directory handling · f3ddee8d
      David Howells committed
      AFS directories are structured blobs that are downloaded just like files
      and then parsed by the lookup and readdir code and, as such, are currently
      handled in the pagecache like any other file, with the entire directory
      content being thrown away each time the directory changes.
      
      However, since the blob is a known structure and since the data version
      counter on a directory increases by exactly one for each change committed
      to that directory, we can actually edit the directory locally rather than
      fetching it from the server after each locally-induced change.
      
      What we can't do, though, is mix data from the server and data from the
      client since the server is technically at liberty to rearrange or compress
      a directory if it sees fit, provided it updates the data version number
      when it does so and breaks the callback (ie. sends a notification).
      
      Further, lookup with lookup-ahead, readdir and, when it arrives, local
      editing are likely to want to scan the whole of a directory.
      
      So directory handling needs to be improved to maintain the coherency of the
      directory blob prior to permitting local directory editing.
      
      To this end:
      
       (1) If any directory page gets discarded, invalidate and reread the entire
           directory.
      
       (2) If readpage notes, when it fetches a single page, that the version
           number has changed, the entire directory is flagged for
           invalidation.
      
       (3) Read as much of the directory in one go as we can.
      
      Note that this removes local caching of directories in fscache for the
      moment as we can't pass the pages to fscache_read_or_alloc_pages() since
      page->lru is in use by the LRU.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Split the dynroot stuff out and give it its own ops tables · 66c7e1d3
      David Howells committed
      Split the AFS dynamic root stuff out of the main directory handling file
      and into its own file as they share little in common.
      
      The dynamic root code also gets its own dentry and inode ops tables.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Keep track of invalid-before version for dentry coherency · a4ff7401
      David Howells committed
      Each afs dentry is tagged with the version that the parent directory was at
      last time it was validated and, currently, if this differs, the directory
      is scanned and the dentry is refreshed.
      
      However, this leads to an excessive amount of revalidation on directories
      that get modified on the client without conflict with another client.  We
      know there's no conflict because the parent directory's data version number
      got incremented by exactly 1 on any create, mkdir, unlink, etc., therefore
      we can trust the current state of the unaffected dentries when we perform a
      local directory modification.
      
      Optimise by keeping track of the last version of the parent directory that
      was changed outside of the client in the parent directory's vnode and using
      that to validate the dentries rather than the current version.
      Signed-off-by: David Howells <dhowells@redhat.com>
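The invalid-before check can be modelled in a few lines of user-space Python. This is a sketch under assumptions: `DirVnode`, `apply_change()` and `dentry_needs_scan()` are hypothetical names, and initialising `invalid_before` to the first version seen is a guess at reasonable behaviour rather than something stated in the message.

```python
class DirVnode:
    def __init__(self, data_version: int):
        self.data_version = data_version
        # Last version known to have been changed from outside this client.
        self.invalid_before = data_version

    def apply_change(self, new_version: int) -> None:
        if new_version != self.data_version + 1:
            # The version jumped, so someone else changed the directory too.
            self.invalid_before = new_version
        self.data_version = new_version

def dentry_needs_scan(dentry_version: int, dvnode: DirVnode) -> bool:
    """A dentry tagged before the last outside change must be rechecked."""
    return dentry_version < dvnode.invalid_before

d = DirVnode(10)
d.apply_change(11)                 # local mkdir: exactly +1, no conflict
local_ok = not dentry_needs_scan(10, d)
d.apply_change(15)                 # version jumped: a third party intervened
stale = dentry_needs_scan(10, d)
print(local_ok, stale)
```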
    • afs: Rearrange status mapping · dd9fbcb8
      David Howells committed
      Rearrange the AFSFetchStatus to inode attribute mapping code in a number of
      ways:
      
       (1) Use an XDR structure rather than a series of incremented pointer
           accesses when decoding an AFSFetchStatus object.  This allows
           out-of-order decode.
      
       (2) Don't store the if_version value but rather just check it and abort if
           it's not something we can handle.
      
       (3) Store the owner and group in the status record as raw values rather
           than converting them to kuid/kgid.  Do that when they're mapped into
           i_uid/i_gid.
      
       (4) Validate the type and abort code up front and abort if they're wrong.
      
       (5) Split the inode attribute setting out into its own function from the
           XDR decode of an AFSFetchStatus object.  This allows it to be called
           from elsewhere too.
      
       (6) Differentiate changes to data from changes to metadata.
      
       (7) Use the split-out attribute mapping function from afs_iget().
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Make it possible to get the data version in readpage · 0c3a5ac2
      David Howells committed
      Store the data version number indicated by an FS.FetchData op into the read
      request structure so that it's accessible by the page reader.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Introduce a statistics proc file · d55b4da4
      David Howells committed
      Introduce a proc file that displays a bunch of statistics for the AFS
      filesystem in the current network namespace.
      Signed-off-by: David Howells <dhowells@redhat.com>
    • afs: Implement @sys substitution handling · 6f8880d8
      David Howells committed
      Implement the AFS feature by which @sys at the end of a pathname component
      may be substituted for one of a list of values, typically naming the
      operating system.  Up to 16 alternatives may be specified and these are
      tried in turn until one works.  Each network namespace has[*] a separate
      independent list.
      
      Upon creation of a new network namespace, the list of values is
      initialised[*] to a single OpenAFS-compatible string representing arch type
      plus "_linux26".  For example, on x86_64, the sysname is "amd64_linux26".
      
      [*] Or will, once network namespace support is finalised in kAFS.
      
      The list may be set by:
      
      	# for i in foo bar linux-x86_64; do echo $i; done >/proc/fs/afs/sysname
      
      for which separate writes to the same fd are amalgamated and applied on
      close.  The LF character may be used as a separator to specify multiple
      items in the same write() call.
      
      The list may be cleared by:
      
      	# echo >/proc/fs/afs/sysname
      
      and read by:
      
      	# cat /proc/fs/afs/sysname
      	foo
      	bar
      	linux-x86_64
      Signed-off-by: David Howells <dhowells@redhat.com>
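The substitution itself is simple to model. The Python sketch below tries each configured sysname in turn for a component ending in @sys; `expand_sys()` and `lookup()` are hypothetical helpers, and the set of existing names stands in for a real directory lookup.

```python
MAX_SYSNAMES = 16   # limit stated in the commit message

def expand_sys(component: str, sysnames: list) -> list:
    """Return candidate names for a component, in the order they are tried."""
    if not component.endswith("@sys"):
        return [component]
    prefix = component[: -len("@sys")]
    return [prefix + name for name in sysnames[:MAX_SYSNAMES]]

def lookup(component: str, sysnames: list, existing: set):
    """Return the first candidate that exists, or None if none does."""
    for candidate in expand_sys(component, sysnames):
        if candidate in existing:
            return candidate
    return None

sysnames = ["foo", "bar", "linux-x86_64"]
found = lookup("bin.@sys", sysnames, {"bin.linux-x86_64", "bin.bar"})
print(found)
```

With the list from the example above, `bin.@sys` resolves to `bin.bar` because "bar" comes before "linux-x86_64" in the list.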
    • afs: Prospectively look up extra files when doing a single lookup · 5cf9dd55
      David Howells committed
      When afs_lookup() is called, prospectively look up the next 50 uncached
      fids also from that same directory and cache the results, rather than just
      looking up the one file requested.
      
      This allows us to use the FS.InlineBulkStatus RPC op to increase efficiency
      by fetching up to 50 file statuses at a time.
      Signed-off-by: David Howells <dhowells@redhat.com>
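Batch selection for the prospective lookup might look like the following Python sketch. It is an illustration only: `pick_bulk_fids()` is a hypothetical helper, FIDs are plain integers, and whether the requested file itself counts toward the limit of 50 is an assumption (here it does not).

```python
MAX_EXTRA = 50   # batch size named in the commit message

def pick_bulk_fids(dir_fids: list, wanted, cached: set, extra: int = MAX_EXTRA) -> list:
    """Return the wanted fid plus up to `extra` following uncached fids."""
    batch = [wanted]
    start = dir_fids.index(wanted) + 1
    for fid in dir_fids[start:]:
        if len(batch) > extra:     # wanted + `extra` others already chosen
            break
        if fid not in cached:
            batch.append(fid)
    return batch

dir_fids = list(range(100))        # made-up directory contents
batch = pick_bulk_fids(dir_fids, wanted=10, cached={11, 12})
print(len(batch), batch[:3], batch[-1])
```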
    • afs: Fix checker warnings · fe342cf7
      David Howells committed
      Fix warnings raised by checker, including:
      
       (*) Warnings raised by unequal comparison for the purposes of sorting,
           where the endianness doesn't matter:
      
      fs/afs/addr_list.c:246:21: warning: restricted __be16 degrades to integer
      fs/afs/addr_list.c:246:30: warning: restricted __be16 degrades to integer
      fs/afs/addr_list.c:248:21: warning: restricted __be32 degrades to integer
      fs/afs/addr_list.c:248:49: warning: restricted __be32 degrades to integer
      fs/afs/addr_list.c:283:21: warning: restricted __be16 degrades to integer
      fs/afs/addr_list.c:283:30: warning: restricted __be16 degrades to integer
      
       (*) afs_set_cb_interest() is not actually used and can be removed.
      
       (*) afs_cell_gc_delay() should be provided with a sysctl.
      
       (*) afs_cell_destroy() needs to use rcu_access_pointer() to read
           cell->vl_addrs.
      
       (*) afs_init_fs_cursor() should be static.
      
       (*) struct afs_vnode::permit_cache needs to be marked __rcu.
      
       (*) afs_server_rcu() needs to use rcu_access_pointer().
      
       (*) afs_destroy_server() should use rcu_access_pointer() on
           server->addresses as the server object is no longer accessible.
      
       (*) afs_find_server() casts __be16/__be32 values to int in order to
           directly compare them for the purpose of finding a match in a list,
           but it should also annotate the cast with __force to avoid checker
           warnings.
      
       (*) afs_check_permit() accesses vnode->permit_cache outside of the RCU
           readlock, though it doesn't then access the value; the extraneous
           access is deleted.
      
      False positives:
      
       (*) Conditional locking around the code in xdr_decode_AFSFetchStatus.  This
           can be dealt with in a separate patch.
      
      fs/afs/fsclient.c:148:9: warning: context imbalance in 'xdr_decode_AFSFetchStatus' - different lock contexts for basic block
      
       (*) Incorrect handling of seq-retry lock context balance:
      
      fs/afs/inode.c:455:38: warning: context imbalance in 'afs_getattr' - different lock contexts for basic block
      fs/afs/server.c:52:17: warning: context imbalance in 'afs_find_server' - different lock contexts for basic block
      fs/afs/server.c:128:17: warning: context imbalance in 'afs_find_server_by_uuid' - different lock contexts for basic block
      
      Errors:
      
       (*) afs_lookup_cell_rcu() needs to break out of the seq-retry loop, not go
           round again if it successfully found the workstation cell.
      
       (*) Fix UUID decode in afs_deliver_cb_probe_uuid().
      
       (*) afs_cache_permit() has a missing rcu_read_unlock() before one of the
           jumps to the someone_else_changed_it label.  Move the unlock to after
           the label.
      
       (*) afs_vl_get_addrs_u() is using ntohl() rather than htonl() when
           encoding to XDR.
      
       (*) afs_deliver_yfsvl_get_endpoints() is using htonl() rather than ntohl()
           when decoding from XDR.
      Signed-off-by: David Howells <dhowells@redhat.com>
  9. 04 Apr, 2018 1 commit
    • fscache: Attach the index key and aux data to the cookie · 402cb8dd
      David Howells committed
      Attach copies of the index key and auxiliary data to the fscache cookie so
      that:
      
       (1) The callbacks to the netfs for this stuff can be eliminated.  This
           can simplify things in the cache as the information is still
           available, even after the cache has relinquished the cookie.
      
       (2) Simplifies the locking requirements of accessing the information as we
           don't have to worry about the netfs object going away on us.
      
       (3) The cache can do lazy updating of the coherency information on disk.
           As long as the cache is flushed before reboot/poweroff, there's no
           need to update the coherency info on disk every time it changes.
      
       (4) Cookies can be hashed or put in a tree as the index key is easily
           available.  This allows:
      
           (a) Checks for duplicate cookies can be made at the top fscache layer
           	 rather than down in the bowels of the cache backend.
      
           (b) Caching can be added to a netfs object that has a cookie if the
           	 cache is brought online after the netfs object is allocated.
      
      A certain amount of space is made in the cookie for inline copies of the
      data, but if it won't fit there, extra memory will be allocated for it.
      
      The downside of this is that live cache operation requires more memory.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Anna Schumaker <anna.schumaker@netapp.com>
      Tested-by: Steve Dickson <steved@redhat.com>
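Point (4)(a), duplicate detection at the top layer, follows naturally once the cookie owns a copy of its index key. A minimal Python sketch of the idea, with a hypothetical `CookieTable` class and a dict standing in for the hash table:

```python
class CookieTable:
    """Toy model: cookies are found by (parent, index key) without any
    callback into the network filesystem."""

    def __init__(self):
        self._by_key = {}

    def acquire(self, parent_id: int, index_key: bytes, aux: bytes) -> dict:
        key = (parent_id, bytes(index_key))      # private copy of the key
        if key in self._by_key:
            raise ValueError("duplicate cookie")
        cookie = {"key": key, "aux": bytes(aux)} # private copy of the aux data
        self._by_key[key] = cookie
        return cookie

table = CookieTable()
first = table.acquire(1, b"vol-0001", b"\x00\x01")
try:
    table.acquire(1, b"vol-0001", b"\x00\x02")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Because the key and aux data are copied, the table remains usable even after the object that supplied them has gone away, at the cost of the extra memory noted above.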
  10. 28 Mar, 2018 1 commit
    • rxrpc, afs: Use debug_ids rather than pointers in traces · a25e21f0
      David Howells committed
      In rxrpc and afs, use the debug_ids that are monotonically allocated to
      various objects as they're allocated rather than pointers as kernel
      pointers are now hashed making them less useful.  Further, the debug ids
      aren't reused anywhere nearly as quickly.
      
      In addition, allow kernel services that use rxrpc, such as afs, to take
      numbers from the rxrpc counter, assign them to their own call struct and
      pass them in to rxrpc for both client and service calls so that the trace
      lines for each will have the same ID tag.
      Signed-off-by: David Howells <dhowells@redhat.com>
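The shared-ID arrangement can be pictured with a small Python model. Everything here is illustrative: `RxrpcCall`, `AfsCall`, `trace()` and the `c=%08x` tag format are assumptions chosen for the sketch, not the kernel's definitions.

```python
import itertools

rxrpc_debug_id = itertools.count(1)    # shared, monotonically increasing counter

class RxrpcCall:
    def __init__(self, debug_id=None):
        # Use the caller's ID if one is passed in, else allocate a fresh one.
        self.debug_id = next(rxrpc_debug_id) if debug_id is None else debug_id

class AfsCall:
    def __init__(self):
        self.debug_id = next(rxrpc_debug_id)     # taken from the rxrpc counter
        self.rxcall = RxrpcCall(self.debug_id)   # passed down so the tags match

def trace(layer: str, call) -> str:
    return f"{layer} c={call.debug_id:08x}"

call = AfsCall()
lines = [trace("afs", call), trace("rxrpc", call.rxcall)]
print(lines)
```

Both trace lines carry the same tag, so output from the two layers can be correlated without exposing a pointer.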
  11. 06 Feb, 2018 1 commit
    • afs: Support the AFS dynamic root · 4d673da1
      David Howells committed
      Support the AFS dynamic root which is a pseudo-volume that doesn't connect
      to any server resource, but rather is just a root directory that
      dynamically creates mountpoint directories where the name of such a
      directory is the name of the cell.
      
      Such a mount can be created thus:
      
      	mount -t afs none /afs -o dyn
      
      Dynamic root superblocks aren't shared except by bind mounts and
      propagation.  Cell root volumes can then be mounted by referring to them by
      name, e.g.:
      
      	ls /afs/grand.central.org/
      	ls /afs/.grand.central.org/
      
      The kernel will upcall to consult the DNS if the address wasn't supplied
      directly.
      Signed-off-by: David Howells <dhowells@redhat.com>
      4d673da1
  12. 01 Dec, 2017 1 commit
    • D
      afs: Properly reset afs_vnode (inode) fields · f8de483e
      Committed by David Howells
      When an AFS inode is allocated by afs_alloc_inode(), the allocated
      afs_vnode struct isn't necessarily reset from the last time it was used as
      an inode because the slab constructor is only invoked once when the memory
      is obtained from the page allocator.
      
      This means that information can leak from one inode to the next because
      we're not calling kmem_cache_zalloc().  Some of the information isn't
      reset, in particular the permit cache pointer.
      
      Bring the clearances up to date.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Marc Dionne <marc.dionne@auristor.com>
      f8de483e
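
      The hazard described in the commit above can be reproduced with a toy
      object cache in userspace C.  This is a sketch under stated
      assumptions: the free list stands in for the slab cache, demo_ctor()
      for the slab constructor, and the struct fields are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for afs_vnode. */
struct demo_vnode {
	int ctor_done;			/* set once by the constructor */
	void *permit_cache;		/* per-use state that must not leak */
	unsigned long flags;		/* per-use state that must not leak */
	struct demo_vnode *next_free;
};

static struct demo_vnode *demo_free_list;

/* Runs only when fresh memory enters the cache, like a slab constructor. */
static void demo_ctor(struct demo_vnode *v)
{
	v->ctor_done = 1;
	v->permit_cache = NULL;
	v->flags = 0;
	v->next_free = NULL;
}

/* Allocate, reusing a freed object if one is available. */
static struct demo_vnode *demo_alloc_vnode(void)
{
	struct demo_vnode *v = demo_free_list;

	if (v) {
		demo_free_list = v->next_free;	/* recycled: ctor NOT rerun */
	} else {
		v = malloc(sizeof(*v));
		if (!v)
			return NULL;
		demo_ctor(v);
	}
	/* Without these resets a recycled object keeps its old values. */
	v->permit_cache = NULL;
	v->flags = 0;
	v->next_free = NULL;
	return v;
}

/* Return an object to the cache without scrubbing it. */
static void demo_free_vnode(struct demo_vnode *v)
{
	v->next_free = demo_free_list;
	demo_free_list = v;
}
```

      Deleting the three reset lines in demo_alloc_vnode() makes the second
      allocation see the first user's permit_cache pointer, which is the
      kind of leak the commit closes.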
  13. 17 Nov, 2017 1 commit
    • D
      afs: Fix file locking · 0fafdc9f
      Committed by David Howells
      Fix AFS file locking, which broke when the big kernel lock (which could
      be held across a sleep) was replaced by a spinlock (which can't).  The
      problem is that the AFS code does things inside the critical section
      that might call schedule(), so that transformation was invalid.
      
      Fix this by the following means:
      
       (1) Use a state machine with a proper state that can only be changed under
           the spinlock rather than using a collection of bit flags.
      
       (2) Cache the key used for the lock and the lock type in the afs_vnode
           struct so that the manager work function doesn't have to refer to a
           file_lock struct that's been dequeued.  This makes signal handling
           safer.
      
       (4) Move the unlock from afs_do_unlk() to afs_fl_release_private() which
           means that unlock is achieved in other circumstances too.
      
       (5) Unlock the file on the server before taking the next conflicting lock.
      
      Also change:
      
       (1) Check the permits on a file before actually trying the lock.
      
       (2) fsync the file before effecting an explicit unlock operation.  We
           don't fsync if the lock is erased otherwise as we might not be in a
           context where we can actually do that.
      
      Further fixes:
      
       (1) Fixed-fileserver address rotation is made to work.  It's only used by
           the locking functions, so couldn't be tested before.
      
      Fixes: 72f98e72 ("locks: turn lock_flocks into a spinlock")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: jlayton@redhat.com
      0fafdc9f
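
      Points (1) and (2) of the commit above, a single state variable that
      only changes under the spinlock plus a cached key, can be sketched in
      userspace C as below.  The state names, the struct and the
      atomic_flag-based spinlock are all assumptions made for the example;
      the real kAFS states and locking primitives may differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock states. */
enum demo_lock_state {
	DEMO_LOCK_NONE,
	DEMO_LOCK_SETTING,	/* request to the server in flight */
	DEMO_LOCK_GRANTED,
	DEMO_LOCK_UNLOCKING,	/* release to the server in flight */
};

struct demo_vnode_lock {
	atomic_flag spin;		/* minimal userspace spinlock */
	enum demo_lock_state state;	/* only changed with spin held */
	int key_id;			/* cached so later work needs no file_lock */
};

static void demo_spin_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* busy-wait; the critical section never sleeps */
}

static void demo_spin_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Change state only if the current state is the one the caller expects. */
static bool demo_lock_transition(struct demo_vnode_lock *vl,
				 enum demo_lock_state from,
				 enum demo_lock_state to)
{
	bool ok;

	demo_spin_lock(&vl->spin);
	ok = (vl->state == from);
	if (ok)
		vl->state = to;
	demo_spin_unlock(&vl->spin);
	return ok;
}
```

      Anything that may sleep, such as the RPC to the fileserver, would
      happen between two transitions with the spinlock dropped, so the
      critical section itself stays short and non-blocking.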
  14. 13 Nov, 2017 3 commits
    • D
      afs: Protect call->state changes against signals · 98bf40cd
      Committed by David Howells
      Protect call->state changes against the call being prematurely terminated
      due to a signal.
      
      What can happen is that a signal causes afs_wait_for_call_to_complete() to
      abort an afs_call because it's not yet complete whilst afs_deliver_to_call()
      is delivering data to that call.
      
      If the data delivery causes the state to change, this may overwrite the state
      of the afs_call, making it not-yet-complete again - but no further
      notifications will be forthcoming from AF_RXRPC as the rxrpc call has been
      aborted and completed, so kAFS will just hang in various places waiting for
      that call or on page bits that need clearing by that call.
      
      A tracepoint to monitor call state changes is also provided.
      Signed-off-by: David Howells <dhowells@redhat.com>
      98bf40cd
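
      The race described in the commit above comes down to making
      "complete" a terminal state that a concurrent delivery cannot
      overwrite.  The sketch below shows one way to get that property in
      userspace C using compare-and-swap; this is a swapped-in technique
      chosen for brevity, not necessarily the mechanism the patch uses, and
      the states and demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical call states; DEMO_CALL_COMPLETE is terminal. */
enum demo_call_state {
	DEMO_CALL_AWAIT_REPLY,
	DEMO_CALL_REPLYING,
	DEMO_CALL_COMPLETE,
};

struct demo_call {
	_Atomic int state;
	int error;	/* written only by the caller that completes the call */
};

/* Delivery path: advance only if the call is still in the expected state. */
static bool demo_set_call_state(struct demo_call *call, int from, int to)
{
	return atomic_compare_exchange_strong(&call->state, &from, to);
}

/* Abort/finish path: the first completer wins; later attempts are no-ops. */
static bool demo_set_call_complete(struct demo_call *call, int error)
{
	int old = atomic_load(&call->state);

	while (old != DEMO_CALL_COMPLETE) {
		/* On failure, old is refreshed with the current state. */
		if (atomic_compare_exchange_weak(&call->state, &old,
						 DEMO_CALL_COMPLETE)) {
			call->error = error;
			return true;
		}
	}
	return false;
}
```

      If a signal completes the call first, a later
      demo_set_call_state(call, DEMO_CALL_AWAIT_REPLY, DEMO_CALL_REPLYING)
      from the delivery path simply fails, so the call cannot become
      not-yet-complete again.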
    • D
      afs: Implement shared-writeable mmap · 1cf7a151
      Committed by David Howells
      Implement shared-writeable mmap for AFS.
      Signed-off-by: David Howells <dhowells@redhat.com>
      1cf7a151
    • D
      afs: Get rid of the afs_writeback record · 4343d008
      Committed by David Howells
      Get rid of the afs_writeback record that kAFS is using to match keys with
      writes made by that key.
      
      Instead, keep a list of keys that have a file open for writing and/or
      sync'ing and iterate through those.
      Signed-off-by: David Howells <dhowells@redhat.com>
      4343d008