提交 · 428edade4e6c70e5b51fcd4188d944fbb744d84c · openanolis / cloud-kernel

14 5月, 2018 13 次提交

afs: Fix CB.CallBack handling · 428edade

由 David Howells 提交于 5月 12, 2018

The handling of CB.CallBack messages sent by the fileserver to the client
is broken in that they are currently being processed after the reply has
been transmitted.

This is not what the fileserver expects, however.  It holds up change
visibility until the reply comes so as to maintain cache coherency, and so
expects the client to have to refetch the state on the affected files.

Fix CB.CallBack handling to perform the callback break before sending the
reply.

The fileserver is free to hold up status fetches issued by other threads on
the same client that occur in reponse to the callback until any pending
changes have been committed.

Fixes: d001648e ("rxrpc: Don't expose skbs to in-kernel users [ver #2]")
Signed-off-by: NDavid Howells <dhowells@redhat.com>

428edade

afs: Fix whole-volume callback handling · 68251f0a

由 David Howells 提交于 5月 12, 2018

It's possible for an AFS file server to issue a whole-volume notification
that callbacks on all the vnodes in the file have been broken. This is
done for R/O and backup volumes (which don't have per-file callbacks) and
for things like a volume being taken offline.

Fix callback handling to detect whole-volume notifications, to track it
across operations and to check it during inode validation.

Fixes: c435ee34 ("afs: Overhaul the callback handling")
Signed-off-by: NDavid Howells <dhowells@redhat.com>

68251f0a

afs: Fix afs_find_server search loop · f9c1bba3

由 Marc Dionne 提交于 5月 11, 2018

The code that looks up servers by addresses makes the assumption
that the list of addresses for a server is sorted. It exits the
loop if it finds that the target address is larger than the
current candidate. As the list is not currently sorted, this
can lead to a failure to find a matching server, which can cause
callbacks from that server to be ignored.

Remove the early exit case so that the complete list is searched.

Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: NMarc Dionne <marc.dionne@auristor.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

f9c1bba3

afs: Fix the handling of an unfound server in CM operations · a86b06d1

由 David Howells 提交于 5月 11, 2018

If the client cache manager operations that need the server record
(CB.Callback, CB.InitCallBackState, and CB.InitCallBackState3) can't find
the server record, they abort the call from the file server with
RX_CALL_DEAD when they should return okay.

Fixes: c35eccb1 ("[AFS]: Implement the CB.InitCallBackState3 operation.")
Signed-off-by: NDavid Howells <dhowells@redhat.com>

a86b06d1

afs: Add a tracepoint to record callbacks from unlisted servers · 3709a399

由 David Howells 提交于 5月 11, 2018

Add a tracepoint to record callbacks from servers for which we don't have a
record.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

3709a399

afs: Fix the handling of CB.InitCallBackState3 to find the server by UUID · 001ab5a6

由 David Howells 提交于 5月 11, 2018

Fix the handling of the CB.InitCallBackState3 service call to find the
record of a server that we're using by looking it up by the UUID passed as
the parameter rather than by its address (of which it might have many, and
which may change).

Fixes: c35eccb1 ("[AFS]: Implement the CB.InitCallBackState3 operation.")
Signed-off-by: NDavid Howells <dhowells@redhat.com>

001ab5a6

afs: Fix VNOVOL handling in address rotation · 3d9fa911

由 David Howells 提交于 5月 11, 2018

If a volume location record lists multiple file servers for a volume, then
it's possible that due to a misconfiguration or a changing configuration
that one of the file servers doesn't know about it yet and will abort
VNOVOL. Currently, the rotation algorithm will stop with EREMOTEIO.

Fix this by moving on to try the next server if VNOVOL is returned. Once
all the servers have been tried and the record rechecked, the algorithm
will stop with EREMOTEIO or ENOMEDIUM.

Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
Reported-by: NMarc Dionne <marc.dionne@auristor.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

3d9fa911

afs: Fix AFSFetchStatus decoder to provide OpenAFS compatibility · 684b0f68

由 David Howells 提交于 5月 10, 2018

The OpenAFS server's RXAFS_InlineBulkStatus implementation has a bug
whereby if an error occurs on one of the vnodes being queried, then the
errorCode field is set correctly in the corresponding status, but the
interfaceVersion field is left unset.

Fix kAFS to deal with this by evaluating the AFSFetchStatus blob against
the following cases when called from FS.InlineBulkStatus delivery:

 (1) If InterfaceVersion == 0 then:

     (a) If errorCode != 0 then it indicates the abort code for the
         corresponding vnode.

     (b) If errorCode == 0 then the status record is invalid.

 (2) If InterfaceVersion == 1 then:

     (a) If errorCode != 0 then it indicates the abort code for the
         corresponding vnode.

     (b) If errorCode == 0 then the status record is valid and can be
     	 parsed.

 (3) If InterfaceVersion is anything else then the status record is
     invalid.

Fixes: dd9fbcb8 ("afs: Rearrange status mapping")
Reported-by: NJeffrey Altman <jaltman@auristor.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

684b0f68

afs: Fix server rotation's handling of fileserver probe failure · ec5a3b4b

由 David Howells 提交于 5月 10, 2018

The server rotation algorithm just gives up if it fails to probe a
fileserver. Fix this by rotating to the next fileserver instead.

Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: NDavid Howells <dhowells@redhat.com>

ec5a3b4b

afs: Fix refcounting in callback registration · d4a96bec

由 David Howells 提交于 5月 10, 2018

The refcounting on afs_cb_interest struct objects in
afs_register_server_cb_interest() is wrong as it uses the server list
entry's call back interest pointer without regard for the fact that it
might be replaced at any time and the object thrown away.

Fix this by:

 (1) Put a lock on the afs_server_list struct that can be used to
     mediate access to the callback interest pointers in the servers array.

 (2) Keep a ref on the callback interest that we get from the entry.

 (3) Dropping the old reference held by vnode->cb_interest if we replace
     the pointer.

Fixes: c435ee34 ("afs: Overhaul the callback handling")
Signed-off-by: NDavid Howells <dhowells@redhat.com>

d4a96bec

afs: Fix giving up callbacks on server destruction · f2686b09

由 David Howells 提交于 5月 10, 2018

When a server record is destroyed, we want to send a message to the server
telling it that we're giving up all the callbacks it has promised us.

Apply two fixes to this:

 (1) Only send the FS.GiveUpAllCallBacks message if we actually got a
     callback from that server.  We assume this to be the case if we
     performed at least one successful FS operation on that server.

 (2) Send it to the address last used for that server rather than always
     picking the first address in the list (which might be unreachable).

Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: NDavid Howells <dhowells@redhat.com>

f2686b09

afs: Fix address list parsing · 01fd79e6

由 David Howells 提交于 5月 09, 2018

The parsing of port specifiers in the address list obtained from the DNS
resolution upcall doesn't work as in4_pton() and in6_pton() will fail on
encountering an unexpected delimiter (in this case, the '+' marking the
port number). However, in*_pton() can't be given multiple specifiers.

Fix this by finding the delimiter in advance and not relying on in*_pton()
to find the end of the address for us.

Fixes: 8b2a464c ("afs: Add an address list concept")
Signed-off-by: NDavid Howells <dhowells@redhat.com>

01fd79e6

afs: Fix directory page locking · b61f7dcf

由 David Howells 提交于 4月 27, 2018

The afs directory loading code (primarily afs_read_dir()) locks all the
pages that hold a directory's content blob to defend against
getdents/getdents races and getdents/lookup races where the competitors
issue conflicting reads on the same data.  As the reads will complete
consecutively, they may retrieve different versions of the data and
one may overwrite the data that the other is busy parsing.

Fix this by not locking the pages at all, but rather by turning the
validation lock into an rwsem and getting an exclusive lock on it whilst
reading the data or validating the attributes and a shared lock whilst
parsing the data.  Sharing the attribute validation lock should be fine as
the data fetch will retrieve the attributes also.

The individual page locks aren't needed at all as the only place they're
being used is to serialise data loading.

Without this patch, the:

 	if (!test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags)) {
		...
	}

part of afs_read_dir() may be skipped, leaving the pages unlocked when we
hit the success: clause - in which case we try to unlock the not-locked
pages, leading to the following oops:

  page:ffffe38b405b4300 count:3 mapcount:0 mapping:ffff98156c83a978 index:0x0
  flags: 0xfffe000001004(referenced|private)
  raw: 000fffe000001004 ffff98156c83a978 0000000000000000 00000003ffffffff
  raw: dead000000000100 dead000000000200 0000000000000001 ffff98156b27c000
  page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
  page->mem_cgroup:ffff98156b27c000
  ------------[ cut here ]------------
  kernel BUG at mm/filemap.c:1205!
  ...
  RIP: 0010:unlock_page+0x43/0x50
  ...
  Call Trace:
   afs_dir_iterate+0x789/0x8f0 [kafs]
   ? _cond_resched+0x15/0x30
   ? kmem_cache_alloc_trace+0x166/0x1d0
   ? afs_do_lookup+0x69/0x490 [kafs]
   ? afs_do_lookup+0x101/0x490 [kafs]
   ? key_default_cmp+0x20/0x20
   ? request_key+0x3c/0x80
   ? afs_lookup+0xf1/0x340 [kafs]
   ? __lookup_slow+0x97/0x150
   ? lookup_slow+0x35/0x50
   ? walk_component+0x1bf/0x490
   ? path_lookupat.isra.52+0x75/0x200
   ? filename_lookup.part.66+0xa0/0x170
   ? afs_end_vnode_operation+0x41/0x60 [kafs]
   ? __check_object_size+0x9c/0x171
   ? strncpy_from_user+0x4a/0x170
   ? vfs_statx+0x73/0xe0
   ? __do_sys_newlstat+0x39/0x70
   ? __x64_sys_getdents+0xc9/0x140
   ? __x64_sys_getdents+0x140/0x140
   ? do_syscall_64+0x5b/0x160
   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: f3ddee8d ("afs: Fix directory handling")
Reported-by: NMarc Dionne <marc.dionne@auristor.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>

b61f7dcf

21 4月, 2018 1 次提交

afs: Fix server record deletion · 66062592

由 David Howells 提交于 4月 18, 2018

AFS server records get removed from the net->fs_servers tree when
they're deleted, but not from the net->fs_addresses{4,6} lists, which
can lead to an oops in afs_find_server() when a server record has been
removed, for instance during rmmod.

Fix this by deleting the record from the by-address lists before posting
it for RCU destruction.

The reason this hasn't been noticed before is that the fileserver keeps
probing the local cache manager, thereby keeping the service record
alive, so the oops would only happen when a fileserver eventually gets
bored and stops pinging or if the module gets rmmod'd and a call comes
in from the fileserver during the window between the server records
being destroyed and the socket being closed.

The oops looks something like:

  BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
  ...
  Workqueue: kafsd afs_process_async_call [kafs]
  RIP: 0010:afs_find_server+0x271/0x36f [kafs]
  ...
  Call Trace:
   afs_deliver_cb_init_call_back_state3+0x1f2/0x21f [kafs]
   afs_deliver_to_call+0x1ee/0x5e8 [kafs]
   afs_process_async_call+0x5b/0xd0 [kafs]
   process_one_work+0x2c2/0x504
   worker_thread+0x1d4/0x2ac
   kthread+0x11f/0x127
   ret_from_fork+0x24/0x30

Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

66062592

12 4月, 2018 1 次提交

page cache: use xa_lock · b93b0163

由 Matthew Wilcox 提交于 4月 10, 2018

Remove the address_space ->tree_lock and use the xa_lock newly added to
the radix_tree_root.  Rename the address_space ->page_tree to ->i_pages,
since we don't really care that it's a tree.

[willy@infradead.org: fix nds32, fs/dax.c]
  Link: http://lkml.kernel.org/r/20180406145415.GB20605@bombadil.infradead.orgLink: http://lkml.kernel.org/r/20180313132639.17387-9-willy@infradead.orgSigned-off-by: NMatthew Wilcox <mawilcox@microsoft.com>
Acked-by: NJeff Layton <jlayton@redhat.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b93b0163

10 4月, 2018 19 次提交

afs: Do better accretion of small writes on newly created content · 5a813276

由 David Howells 提交于 4月 06, 2018

Processes like ld that do lots of small writes that aren't necessarily
contiguous result in a lot of small StoreData operations to the server, the
idea being that if someone else changes the data on the server, we only
write our changes over that and not the space between.  Further, we don't
want to write back empty space if we can avoid it to make it easier for the
server to do sparse files.

However, making lots of tiny RPC ops is a lot less efficient for the server
than one big one because each op requires allocation of resources and the
taking of locks, so we want to compromise a bit.

Reduce the load by the following:

 (1) If a file is just created locally or has just been truncated with
     O_TRUNC locally, allow subsequent writes to the file to be merged with
     intervening space if that space doesn't cross an entire intervening
     page.

 (2) Don't flush the file on ->flush() but rather on ->release() if the
     file was open for writing.

Just linking vmlinux.o, without this patch, looking in /proc/fs/afs/stats:

	file-wr : n=441 nb=513581204

and after the patch:

	file-wr : n=62 nb=513668555

there were 379 fewer StoreData RPC operations at the expense of an extra
87K being written.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

5a813276

afs: Add stats for data transfer operations · 76a5cb6f

由 David Howells 提交于 4月 06, 2018

Add statistics to /proc/fs/afs/stats for data transfer RPC operations.  New
lines are added that look like:

	file-rd : n=55794 nb=10252282150
	file-wr : n=9789 nb=3247763645

where n= indicates the number of ops completed and nb= indicates the number
of bytes successfully transferred.  file-rd is the counts for read/fetch
operations and file-wr the counts for write/store operations.

Note that directory and symlink downloading are included in the file-rd
stats at the moment.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

76a5cb6f

afs: Trace protocol errors · 5f702c8e

由 David Howells 提交于 4月 06, 2018

Trace protocol errors detected in afs.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

5f702c8e

afs: Locally edit directory data for mkdir/create/unlink/... · 63a4681f

由 David Howells 提交于 4月 06, 2018

Locally edit the contents of an AFS directory upon a successful inode
operation that modifies that directory (such as mkdir, create and unlink)
so that we can avoid the current practice of re-downloading the directory
after each change.

This is viable provided that the directory version number we get back from
the modifying RPC op is exactly incremented by 1 from what we had
previously.  The data in the directory contents is in a defined format that
we have to parse locally to perform lookups and readdir, so modifying isn't
a problem.

If the edit fails, we just clear the VALID flag on the directory and it
will be reloaded next time it is needed.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

63a4681f

afs: Adjust the directory XDR structures · 00317636

由 David Howells 提交于 4月 06, 2018

Adjust the AFS directory XDR structures in a number of superficial ways:

 (1) Rename them to all begin afs_xdr_.

 (2) Use u8 instead of uint8_t.

 (3) Mark the structures as __packed so they don't get rearranged by the
     compiler.

 (4) Rename the hdr member of afs_xdr_dir_block to meta.

 (5) Rename the pagehdr member of afs_xdr_dir_block to hdr.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

00317636

afs: Split the directory content defs into a header · 4ea219a8

由 David Howells 提交于 4月 06, 2018

Split the directory content definitions into a header file so that they can
be used by multiple .c files.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

4ea219a8

afs: Fix directory handling · f3ddee8d

由 David Howells 提交于 4月 06, 2018

AFS directories are structured blobs that are downloaded just like files
and then parsed by the lookup and readdir code and, as such, are currently
handled in the pagecache like any other file, with the entire directory
content being thrown away each time the directory changes.

However, since the blob is a known structure and since the data version
counter on a directory increases by exactly one for each change committed
to that directory, we can actually edit the directory locally rather than
fetching it from the server after each locally-induced change.

What we can't do, though, is mix data from the server and data from the
client since the server is technically at liberty to rearrange or compress
a directory if it sees fit, provided it updates the data version number
when it does so and breaks the callback (ie. sends a notification).

Further, lookup with lookup-ahead, readdir and, when it arrives, local
editing are likely want to scan the whole of a directory.

So directory handling needs to be improved to maintain the coherency of the
directory blob prior to permitting local directory editing.

To this end:

 (1) If any directory page gets discarded, invalidate and reread the entire
     directory.

 (2) If readpage notes that if when it fetches a single page that the
     version number has changed, the entire directory is flagged for
     invalidation.

 (3) Read as much of the directory in one go as we can.

Note that this removes local caching of directories in fscache for the
moment as we can't pass the pages to fscache_read_or_alloc_pages() since
page->lru is in use by the LRU.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

f3ddee8d

afs: Split the dynroot stuff out and give it its own ops tables · 66c7e1d3

由 David Howells 提交于 4月 06, 2018

Split the AFS dynamic root stuff out of the main directory handling file
and into its own file as they share little in common.

The dynamic root code also gets its own dentry and inode ops tables.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

66c7e1d3

afs: Keep track of invalid-before version for dentry coherency · a4ff7401

由 David Howells 提交于 4月 06, 2018

Each afs dentry is tagged with the version that the parent directory was at
last time it was validated and, currently, if this differs, the directory
is scanned and the dentry is refreshed.

However, this leads to an excessive amount of revalidation on directories
that get modified on the client without conflict with another client. We
know there's no conflict because the parent directory's data version number
got incremented by exactly 1 on any create, mkdir, unlink, etc., therefore
we can trust the current state of the unaffected dentries when we perform a
local directory modification.

Optimise by keeping track of the last version of the parent directory that
was changed outside of the client in the parent directory's vnode and using
that to validate the dentries rather than the current version.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

a4ff7401

afs: Rearrange status mapping · dd9fbcb8

由 David Howells 提交于 4月 06, 2018

Rearrange the AFSFetchStatus to inode attribute mapping code in a number of
ways:

 (1) Use an XDR structure rather than a series of incremented pointer
     accesses when decoding an AFSFetchStatus object.  This allows
     out-of-order decode.

 (2) Don't store the if_version value but rather just check it and abort if
     it's not something we can handle.

 (3) Store the owner and group in the status record as raw values rather
     than converting them to kuid/kgid.  Do that when they're mapped into
     i_uid/i_gid.

 (4) Validate the type and abort code up front and abort if they're wrong.

 (5) Split the inode attribute setting out into its own function from the
     XDR decode of an AFSFetchStatus object.  This allows it to be called
     from elsewhere too.

 (6) Differentiate changes to data from changes to metadata.

 (7) Use the split-out attribute mapping function from afs_iget().
Signed-off-by: NDavid Howells <dhowells@redhat.com>

dd9fbcb8

afs: Make it possible to get the data version in readpage · 0c3a5ac2

由 David Howells 提交于 4月 06, 2018

Store the data version number indicated by an FS.FetchData op into the read
request structure so that it's accessible by the page reader.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

0c3a5ac2

afs: Init inode before accessing cache · 5800db81

由 David Howells 提交于 4月 06, 2018

We no longer parse symlinks when we get the inode to determine if this
symlink is actually a mountpoint as we detect that by examining the mode
instead (symlinks are always 0777 and mountpoints 0644).

Access the cache after mapping the status so that we don't have to manually
set the inode size now.

Note that this may need adjusting if the disconnected operation is
implemented as the file metadata may have to be obtained from the cache.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

5800db81

afs: Introduce a statistics proc file · d55b4da4

由 David Howells 提交于 4月 06, 2018

Introduce a proc file that displays a bunch of statistics for the AFS
filesystem in the current network namespace.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

d55b4da4

afs: Dump bad status record · 888b3384

由 David Howells 提交于 4月 06, 2018

Dump an AFS FileStatus record that is detected as invalid.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

888b3384

afs: Implement @cell substitution handling · 37ab6368

由 David Howells 提交于 4月 06, 2018

Implement @cell substitution handling such that if @cell is seen as a name
in a dynamic root mount, then the name of the root cell for that network
namespace will be substituted for @cell during lookup.

The substitution of @cell for the current net namespace is set by writing
the cell name to /proc/fs/afs/rootcell.  The value can be obtained by
reading the file.

For example:

	# mount -t afs none /kafs -o dyn
	# echo grand.central.org >/proc/fs/afs/rootcell
	# ls /kafs/@cell
	archive/  cvs/  doc/  local/  project/  service/  software/  user/  www/
	# cat /proc/fs/afs/rootcell
	grand.central.org
Signed-off-by: NDavid Howells <dhowells@redhat.com>

37ab6368

afs: Implement @sys substitution handling · 6f8880d8

由 David Howells 提交于 4月 09, 2018

Implement the AFS feature by which @sys at the end of a pathname component
may be substituted for one of a list of values, typically naming the
operating system.  Up to 16 alternatives may be specified and these are
tried in turn until one works.  Each network namespace has[*] a separate
independent list.

Upon creation of a new network namespace, the list of values is
initialised[*] to a single OpenAFS-compatible string representing arch type
plus "_linux26".  For example, on x86_64, the sysname is "amd64_linux26".

[*] Or will, once network namespace support is finalised in kAFS.

The list may be set by:

	# for i in foo bar linux-x86_64; do echo $i; done >/proc/fs/afs/sysname

for which separate writes to the same fd are amalgamated and applied on
close.  The LF character may be used as a separator to specify multiple
items in the same write() call.

The list may be cleared by:

	# echo >/proc/fs/afs/sysname

and read by:

	# cat /proc/fs/afs/sysname
	foo
	bar
	linux-x86_64
Signed-off-by: NDavid Howells <dhowells@redhat.com>

6f8880d8

afs: Prospectively look up extra files when doing a single lookup · 5cf9dd55

由 David Howells 提交于 4月 09, 2018

When afs_lookup() is called, prospectively look up the next 50 uncached
fids also from that same directory and cache the results, rather than just
looking up the one file requested.

This allows us to use the FS.InlineBulkStatus RPC op to increase efficiency
by fetching up to 50 file statuses at a time.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

5cf9dd55

afs: Don't over-increment the cell usage count when pinning it · 17814aef

由 David Howells 提交于 4月 09, 2018

AFS cells that are added or set as the workstation cell through /proc are
pinned against removal by setting the AFS_CELL_FL_NO_GC flag on them and
taking a ref.  The ref should be only taken if the flag wasn't already set.

Fix this by making it conditional.

Without this an assertion failure will occur during module removal
indicating that the refcount is too elevated.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

17814aef

afs: Fix checker warnings · fe342cf7

由 David Howells 提交于 4月 09, 2018

Fix warnings raised by checker, including:

 (*) Warnings raised by unequal comparison for the purposes of sorting,
     where the endianness doesn't matter:

fs/afs/addr_list.c:246:21: warning: restricted __be16 degrades to integer
fs/afs/addr_list.c:246:30: warning: restricted __be16 degrades to integer
fs/afs/addr_list.c:248:21: warning: restricted __be32 degrades to integer
fs/afs/addr_list.c:248:49: warning: restricted __be32 degrades to integer
fs/afs/addr_list.c:283:21: warning: restricted __be16 degrades to integer
fs/afs/addr_list.c:283:30: warning: restricted __be16 degrades to integer

 (*) afs_set_cb_interest() is not actually used and can be removed.

 (*) afs_cell_gc_delay() should be provided with a sysctl.

 (*) afs_cell_destroy() needs to use rcu_access_pointer() to read
     cell->vl_addrs.

 (*) afs_init_fs_cursor() should be static.

 (*) struct afs_vnode::permit_cache needs to be marked __rcu.

 (*) afs_server_rcu() needs to use rcu_access_pointer().

 (*) afs_destroy_server() should use rcu_access_pointer() on
     server->addresses as the server object is no longer accessible.

 (*) afs_find_server() casts __be16/__be32 values to int in order to
     directly compare them for the purpose of finding a match in a list,
     but is should also annotate the cast with __force to avoid checker
     warnings.

 (*) afs_check_permit() accesses vnode->permit_cache outside of the RCU
     readlock, though it doesn't then access the value; the extraneous
     access is deleted.

False positives:

 (*) Conditional locking around the code in xdr_decode_AFSFetchStatus.  This
     can be dealt with in a separate patch.

fs/afs/fsclient.c:148:9: warning: context imbalance in 'xdr_decode_AFSFetchStatus' - different lock contexts for basic block

 (*) Incorrect handling of seq-retry lock context balance:

fs/afs/inode.c:455:38: warning: context imbalance in 'afs_getattr' - different
lock contexts for basic block
fs/afs/server.c:52:17: warning: context imbalance in 'afs_find_server' - different lock contexts for basic block
fs/afs/server.c:128:17: warning: context imbalance in 'afs_find_server_by_uuid' - different lock contexts for basic block

Errors:

 (*) afs_lookup_cell_rcu() needs to break out of the seq-retry loop, not go
     round again if it successfully found the workstation cell.

 (*) Fix UUID decode in afs_deliver_cb_probe_uuid().

 (*) afs_cache_permit() has a missing rcu_read_unlock() before one of the
     jumps to the someone_else_changed_it label.  Move the unlock to after
     the label.

 (*) afs_vl_get_addrs_u() is using ntohl() rather than htonl() when
     encoding to XDR.

 (*) afs_deliver_yfsvl_get_endpoints() is using htonl() rather than ntohl()
     when decoding from XDR.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

fe342cf7

06 4月, 2018 1 次提交

fscache: Pass object size in rather than calling back for it · ee1235a9

由 David Howells 提交于 4月 04, 2018

Pass the object size in to fscache_acquire_cookie() and
fscache_write_page() rather than the netfs providing a callback by which it
can be received.  This makes it easier to update the size of the object
when a new page is written that extends the object.

The current object size is also passed by fscache to the check_aux
function, obviating the need to store it in the aux data.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NAnna Schumaker <anna.schumaker@netapp.com>
Tested-by: NSteve Dickson <steved@redhat.com>

ee1235a9

04 4月, 2018 4 次提交

fscache: Attach the index key and aux data to the cookie · 402cb8dd

由 David Howells 提交于 4月 04, 2018

Attach copies of the index key and auxiliary data to the fscache cookie so
that:

 (1) The callbacks to the netfs for this stuff can be eliminated.  This
     can simplify things in the cache as the information is still
     available, even after the cache has relinquished the cookie.

 (2) Simplifies the locking requirements of accessing the information as we
     don't have to worry about the netfs object going away on us.

 (3) The cache can do lazy updating of the coherency information on disk.
     As long as the cache is flushed before reboot/poweroff, there's no
     need to update the coherency info on disk every time it changes.

 (4) Cookies can be hashed or put in a tree as the index key is easily
     available.  This allows:

     (a) Checks for duplicate cookies can be made at the top fscache layer
     	 rather than down in the bowels of the cache backend.

     (b) Caching can be added to a netfs object that has a cookie if the
     	 cache is brought online after the netfs object is allocated.

A certain amount of space is made in the cookie for inline copies of the
data, but if it won't fit there, extra memory will be allocated for it.

The downside of this is that live cache operation requires more memory.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NAnna Schumaker <anna.schumaker@netapp.com>
Tested-by: NSteve Dickson <steved@redhat.com>

402cb8dd

afs: Be more aggressive in retiring cached vnodes · 678edd09

由 David Howells 提交于 4月 04, 2018

When relinquishing cookies, either due to iget failure or to inode
eviction, retire a cookie if we think the corresponding vnode got deleted
on the server rather than just letting it lie in the cache.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

678edd09

afs: Use the vnode ID uniquifier in the cache key not the aux data · 27a3ee3a

由 David Howells 提交于 4月 04, 2018

AFS vnodes (files) are referenced by a triplet of { volume ID, vnode ID,
uniquifier }. Currently, kafs is only using the vnode ID as the file key
in the volume fscache index and checking the uniquifier on cookie
acquisition against the contents of the auxiliary data stored in the cache.

Unfortunately, this is subject to a race in which an FS.RemoveFile or
FS.RemoveDir op is issued against the server but the local afs inode isn't
torn down and disposed off before another thread issues something like
FS.CreateFile. The latter then gets given the vnode ID that just got
removed, but with a new uniquifier and a cookie collision occurs in the
cache because the cookie is only keyed on the vnode ID whereas the inode is
keyed on the vnode ID plus the uniquifier.

Fix this by keying the cookie on the uniquifier in addition to the vnode ID
and dropping the uniquifier from the auxiliary data supplied.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

27a3ee3a

afs: Invalidate cache on server data change · c1515999

由 David Howells 提交于 4月 04, 2018

Invalidate any data stored in fscache for a vnode that changes on the
server so that we don't end up with the cache in a bad state locally.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

c1515999

28 3月, 2018 1 次提交

rxrpc, afs: Use debug_ids rather than pointers in traces · a25e21f0

由 David Howells 提交于 3月 27, 2018

In rxrpc and afs, use the debug_ids that are monotonically allocated to
various objects as they're allocated rather than pointers as kernel
pointers are now hashed making them less useful. Further, the debug ids
aren't reused anywhere nearly as quickly.

In addition, allow kernel services that use rxrpc, such as afs, to take
numbers from the rxrpc counter, assign them to their own call struct and
pass them in to rxrpc for both client and service calls so that the trace
lines for each will have the same ID tag.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

a25e21f0

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功