1. 11 6月, 2022 1 次提交
  2. 10 6月, 2022 1 次提交
    • D
      netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context · 874c8ca1
      David Howells 提交于
      While randstruct was satisfied with using an open-coded "void *" offset
      cast for the netfs_i_context <-> inode casting, __builtin_object_size() as
      used by FORTIFY_SOURCE was not as easily fooled.  This was causing the
      following complaint[1] from gcc v12:
      
        In file included from include/linux/string.h:253,
                         from include/linux/ceph/ceph_debug.h:7,
                         from fs/ceph/inode.c:2:
        In function 'fortify_memset_chk',
            inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2,
            inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2:
        include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
          242 |                         __write_overflow_field(p_size_field, size);
              |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Fix this by embedding a struct inode into struct netfs_i_context (which
      should perhaps be renamed to struct netfs_inode).  The struct inode
      vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode
      structs and vfs_inode is then simply changed to "netfs.inode" in those
      filesystems.
      
      Further, rename netfs_i_context to netfs_inode, get rid of the
      netfs_inode() function that converted a netfs_i_context pointer to an
      inode pointer (that can now be done with &ctx->inode) and rename the
      netfs_i_context() function to netfs_inode() (which is now a wrapper
      around container_of()).
      
      Most of the changes were done with:
      
        perl -p -i -e 's/vfs_inode/netfs.inode/'g \
              `git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]`
      
      Kees suggested doing it with a pair structure[2] and a special
      declarator to insert that into the network filesystem's inode
      wrapper[3], but I think it's cleaner to embed it - and then it doesn't
      matter if struct randomisation reorders things.
      
      Dave Chinner suggested using a filesystem-specific VFS_I() function in
      each filesystem to convert that filesystem's own inode wrapper struct
      into the VFS inode struct[4].
      
      Version #2:
       - Fix a couple of missed name changes due to a disabled cifs option.
       - Rename nfs_i_context to nfs_inode
       - Use "netfs" instead of "nic" as the member name in per-fs inode wrapper
         structs.
      
      [ This also undoes commit 507160f4 ("netfs: gcc-12: temporarily
        disable '-Wattribute-warning' for now") that is no longer needed ]
      
      Fixes: bc899ee1 ("netfs: Add a netfs inode context")
      Reported-by: NJeff Layton <jlayton@kernel.org>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Reviewed-by: NJeff Layton <jlayton@kernel.org>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NXiubo Li <xiubli@redhat.com>
      cc: Jonathan Corbet <corbet@lwn.net>
      cc: Eric Van Hensbergen <ericvh@gmail.com>
      cc: Latchesar Ionkov <lucho@ionkov.net>
      cc: Dominique Martinet <asmadeus@codewreck.org>
      cc: Christian Schoenebeck <linux_oss@crudebyte.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: Ilya Dryomov <idryomov@gmail.com>
      cc: Steve French <smfrench@gmail.com>
      cc: William Kucharski <william.kucharski@oracle.com>
      cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      cc: Dave Chinner <david@fromorbit.com>
      cc: linux-doc@vger.kernel.org
      cc: v9fs-developer@lists.sourceforge.net
      cc: linux-afs@lists.infradead.org
      cc: ceph-devel@vger.kernel.org
      cc: linux-cifs@vger.kernel.org
      cc: samba-technical@lists.samba.org
      cc: linux-fsdevel@vger.kernel.org
      cc: linux-hardening@vger.kernel.org
      Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1]
      Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2]
      Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3]
      Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4]
      Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1
      Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      874c8ca1
  3. 10 5月, 2022 3 次提交
  4. 18 3月, 2022 4 次提交
  5. 15 3月, 2022 3 次提交
  6. 12 1月, 2022 1 次提交
  7. 07 1月, 2022 3 次提交
  8. 17 12月, 2021 1 次提交
  9. 11 11月, 2021 1 次提交
  10. 02 11月, 2021 1 次提交
  11. 13 9月, 2021 2 次提交
    • D
      afs: Fix mmap coherency vs 3rd-party changes · 6e0e99d5
      David Howells 提交于
      Fix the coherency management of mmap'd data such that 3rd-party changes
      become visible as soon as possible after the callback notification is
      delivered by the fileserver.  This is done by the following means:
      
       (1) When we break a callback on a vnode specified by the CB.CallBack call
           from the server, we queue a work item (vnode->cb_work) to go and
           clobber all the PTEs mapping to that inode.
      
           This causes the CPU to trip through the ->map_pages() and
           ->page_mkwrite() handlers if userspace attempts to access the page(s)
           again.
      
           (Ideally, this would be done in the service handler for CB.CallBack,
           but the server is waiting for our reply before considering, and we
           have a list of vnodes, all of which need breaking - and the process of
           getting the mmap_lock and stripping the PTEs on all CPUs could be
           quite slow.)
      
       (2) Call afs_validate() from the ->map_pages() handler to check to see if
           the file has changed and to get a new callback promise from the
           server.
      
      Also handle the fileserver telling us that it's dropping all callbacks,
      possibly after it's been restarted by sending us a CB.InitCallBackState*
      call by the following means:
      
       (3) Maintain a per-cell list of afs files that are currently mmap'd
           (cell->fs_open_mmaps).
      
       (4) Add a work item to each server that is invoked if there are any open
           mmaps when CB.InitCallBackState happens.  This work item goes through
           the aforementioned list and invokes the vnode->cb_work work item for
           each one that is currently using this server.
      
           This causes the PTEs to be cleared, causing ->map_pages() or
           ->page_mkwrite() to be called again, thereby calling afs_validate()
           again.
      
      I've chosen to simply strip the PTEs at the point of notification reception
      rather than invalidate all the pages as well because (a) it's faster, (b)
      we may get a notification for other reasons than the data being altered (in
      which case we don't want to clobber the pagecache) and (c) we need to ask
      the server to find out - and I don't want to wait for the reply before
      holding up userspace.
      
      This was tested using the attached test program:
      
      	#include <stdbool.h>
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <unistd.h>
      	#include <fcntl.h>
      	#include <sys/mman.h>
      	int main(int argc, char *argv[])
      	{
      		size_t size = getpagesize();
      		unsigned char *p;
      		bool mod = (argc == 3);
      		int fd;
      		if (argc != 2 && argc != 3) {
      			fprintf(stderr, "Format: %s <file> [mod]\n", argv[0]);
      			exit(2);
      		}
      		fd = open(argv[1], mod ? O_RDWR : O_RDONLY);
      		if (fd < 0) {
      			perror(argv[1]);
      			exit(1);
      		}
      
      		p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ,
      			 MAP_SHARED, fd, 0);
      		if (p == MAP_FAILED) {
      			perror("mmap");
      			exit(1);
      		}
      		for (;;) {
      			if (mod) {
      				p[0]++;
      				msync(p, size, MS_ASYNC);
      				fsync(fd);
      			}
      			printf("%02x", p[0]);
      			fflush(stdout);
      			sleep(1);
      		}
      	}
      
      It runs in two modes: in one mode, it mmaps a file, then sits in a loop
      reading the first byte, printing it and sleeping for a second; in the
      second mode it mmaps a file, then sits in a loop incrementing the first
      byte and flushing, then printing and sleeping.
      
      Two instances of this program can be run on different machines, one doing
      the reading and one doing the writing.  The reader should see the changes
      made by the writer, but without this patch, they aren't because validity
      checking is being done lazily - only on entry to the filesystem.
      
      Testing the InitCallBackState change is more complicated.  The server has
      to be taken offline, the saved callback state file removed and then the
      server restarted whilst the reading-mode program continues to run.  The
      client machine then has to poke the server to trigger the InitCallBackState
      call.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NMarkus Suvanto <markus.suvanto@gmail.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
      6e0e99d5
    • D
      afs: Add missing vnode validation checks · 3978d816
      David Howells 提交于
      afs_d_revalidate() should only be validating the directory entry it is
      given and the directory to which that belongs; it shouldn't be validating
      the inode/vnode to which that dentry points.  Besides, validation need to
      be done even if we don't call afs_d_revalidate() - which might be the case
      if we're starting from a file descriptor.
      
      In order for afs_d_revalidate() to be fixed, validation points must be
      added in some other places.  Certain directory operations, such as
      afs_unlink(), already check this, but not all and not all file operations
      either.
      
      Note that the validation of a vnode not only checks to see if the
      attributes we have are correct, but also gets a promise from the server to
      notify us if that file gets changed by a third party.
      
      Add the following checks:
      
       - Check the vnode we're going to make a hard link to.
       - Check the vnode we're going to move/rename.
       - Check the vnode we're going to read from.
       - Check the vnode we're going to write to.
       - Check the vnode we're going to sync.
       - Check the vnode we're going to make a mapped page writable for.
      
      Some of these aren't strictly necessary as we're going to perform a server
      operation that might get the attributes anyway from which we can determine
      if something changed - though it might not get us a callback promise.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NMarkus Suvanto <markus.suvanto@gmail.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/163111667354.283156.12720698333342917516.stgit@warthog.procyon.org.uk/
      3978d816
  12. 11 9月, 2021 1 次提交
  13. 23 4月, 2021 9 次提交
  14. 16 3月, 2021 1 次提交
  15. 29 10月, 2020 2 次提交
    • D
      afs: Fix afs_invalidatepage to adjust the dirty region · f86726a6
      David Howells 提交于
      Fix afs_invalidatepage() to adjust the dirty region recorded in
      page->private when truncating a page.  If the dirty region is entirely
      removed, then the private data is cleared and the page dirty state is
      cleared.
      
      Without this, if the page is truncated and then expanded again by truncate,
      zeros from the expanded, but no-longer dirty region may get written back to
      the server if the page gets laundered due to a conflicting 3rd-party write.
      
      It mustn't, however, shorten the dirty region of the page if that page is
      still mmapped and has been marked dirty by afs_page_mkwrite(), so a flag is
      stored in page->private to record this.
      
      Fixes: 4343d008 ("afs: Get rid of the afs_writeback record")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      f86726a6
    • D
      afs: Fix to take ref on page when PG_private is set · fa04a40b
      David Howells 提交于
      Fix afs to take a ref on a page when it sets PG_private on it and to drop
      the ref when removing the flag.
      
      Note that in afs_write_begin(), a lot of the time, PG_private is already
      set on a page to which we're going to add some data.  In such a case, we
      leave the bit set and mustn't increment the page count.
      
      As suggested by Matthew Wilcox, use attach/detach_page_private() where
      possible.
      
      Fixes: 31143d5d ("AFS: implement basic file write support")
      Reported-by: NMatthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Reviewed-by: NMatthew Wilcox (Oracle) <willy@infradead.org>
      fa04a40b
  16. 28 10月, 2020 1 次提交
  17. 24 8月, 2020 1 次提交
  18. 16 6月, 2020 1 次提交
    • D
      afs: Fix use of afs_check_for_remote_deletion() · 728279a5
      David Howells 提交于
      afs_check_for_remote_deletion() checks to see if error ENOENT is returned
      by the server in response to an operation and, if so, marks the primary
      vnode as having been deleted as the FID is no longer valid.
      
      However, it's being called from the operation success functions, where no
      abort has happened - and if an inline abort is recorded, it's handled by
      afs_vnode_commit_status().
      
      Fix this by actually calling the operation aborted method if provided and
      having that point to afs_check_for_remote_deletion().
      
      Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      728279a5
  19. 04 6月, 2020 1 次提交
    • D
      afs: Build an abstraction around an "operation" concept · e49c7b2f
      David Howells 提交于
      Turn the afs_operation struct into the main way that most fileserver
      operations are managed.  Various things are added to the struct, including
      the following:
      
       (1) All the parameters and results of the relevant operations are moved
           into it, removing corresponding fields from the afs_call struct.
           afs_call gets a pointer to the op.
      
       (2) The target volume is made the main focus of the operation, rather than
           the target vnode(s), and a bunch of op->vnode->volume are made
           op->volume instead.
      
       (3) Two vnode records are defined (op->file[]) for the vnode(s) involved
           in most operations.  The vnode record (struct afs_vnode_param)
           contains:
      
      	- The vnode pointer.
      
      	- The fid of the vnode to be included in the parameters or that was
                returned in the reply (eg. FS.MakeDir).
      
      	- The status and callback information that may be returned in the
           	  reply about the vnode.
      
      	- Callback break and data version tracking for detecting
                simultaneous third-parth changes.
      
       (4) Pointers to dentries to be updated with new inodes.
      
       (5) An operations table pointer.  The table includes pointers to functions
           for issuing AFS and YFS-variant RPCs, handling the success and abort
           of an operation and handling post-I/O-lock local editing of a
           directory.
      
      To make this work, the following function restructuring is made:
      
       (A) The rotation loop that issues calls to fileservers that can be found
           in each function that wants to issue an RPC (such as afs_mkdir()) is
           extracted out into common code, in a new file called fs_operation.c.
      
       (B) The rotation loops, such as the one in afs_mkdir(), are replaced with
           a much smaller piece of code that allocates an operation, sets the
           parameters and then calls out to the common code to do the actual
           work.
      
       (C) The code for handling the success and failure of an operation are
           moved into operation functions (as (5) above) and these are called
           from the core code at appropriate times.
      
       (D) The pseudo inode getting stuff used by the dynamic root code is moved
           over into dynroot.c.
      
       (E) struct afs_iget_data is absorbed into the operation struct and
           afs_iget() expects to be given an op pointer and a vnode record.
      
       (F) Point (E) doesn't work for the root dir of a volume, but we know the
           FID in advance (it's always vnode 1, unique 1), so a separate inode
           getter, afs_root_iget(), is provided to special-case that.
      
       (G) The inode status init/update functions now also take an op and a vnode
           record.
      
       (H) The RPC marshalling functions now, for the most part, just take an
           afs_operation struct as their only argument.  All the data they need
           is held there.  The result delivery functions write their answers
           there as well.
      
       (I) The call is attached to the operation and then the operation core does
           the waiting.
      
      And then the new operation code is, for the moment, made to just initialise
      the operation, get the appropriate vnode I/O locks and do the same rotation
      loop as before.
      
      This lays the foundation for the following changes in the future:
      
       (*) Overhauling the rotation (again).
      
       (*) Support for asynchronous I/O, where the fileserver rotation must be
           done asynchronously also.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      e49c7b2f
  20. 31 5月, 2020 1 次提交
  21. 21 11月, 2019 1 次提交