1. 24 10月, 2018 18 次提交
    • D
      afs: Implement YFS support in the fs client · 30062bd1
      David Howells 提交于
      Implement support for talking to YFS-variant fileservers in the cache
      manager and the filesystem client.  These implement upgraded services on
      the same port as their AFS services.
      
      YFS fileservers provide expanded capabilities over AFS.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      30062bd1
    • D
      afs: Expand data structure fields to support YFS · d4936803
      David Howells 提交于
      Expand fields in various data structures to support the expanded
      information that YFS is capable of returning.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      d4936803
    • D
      afs: Get the target vnode in afs_rmdir() and get a callback on it · f58db83f
      David Howells 提交于
      Get the target vnode in afs_rmdir() and validate it before we attempt the
      deletion, The vnode pointer will be passed through to the delivery function
      in a later patch so that the delivery function can mark it deleted.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      f58db83f
    • D
      afs: Calc callback expiry in op reply delivery · 12d8e95a
      David Howells 提交于
      Calculate the callback expiration time at the point of operation reply
      delivery, using the reply time queried from AF_RXRPC on that call as a
      base.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      12d8e95a
    • D
      afs: Fix FS.FetchStatus delivery from updating wrong vnode · 36bb5f49
      David Howells 提交于
      The FS.FetchStatus reply delivery function was updating inode of the
      directory in which a lookup had been done with the status of the looked up
      file.  This corrupts some of the directory state.
      
      Fixes: 5cf9dd55 ("afs: Prospectively look up extra files when doing a single lookup")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      36bb5f49
    • D
      afs: Implement the YFS cache manager service · 35dbfba3
      David Howells 提交于
      Implement the YFS cache manager service which gives extra capabilities on
      top of AFS.  This is done by listening for an additional service on the
      same port and indicating that anyone requesting an upgrade should be
      upgraded to the YFS port.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      35dbfba3
    • D
      afs: Remove callback details from afs_callback_break struct · 06aeb297
      David Howells 提交于
      Remove unnecessary details of a broken callback, such as version, expiry
      and type, from the afs_callback_break struct as they're not actually used
      and make the list take more memory.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      06aeb297
    • D
      afs: Commit the status on a new file/dir/symlink · 00671912
      David Howells 提交于
      Call the function to commit the status on a new file, dir or symlink so
      that the access rights for the caller's key are cached for that object.
      
      Without this, the next access to the file will cause a FetchStatus
      operation to be emitted to retrieve the access rights.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      00671912
    • D
      afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS · 3b6492df
      David Howells 提交于
      Increase the sizes of the volume ID to 64 bits and the vnode ID (inode
      number equivalent) to 96 bits to allow the support of YFS.
      
      This requires the iget comparator to check the vnode->fid rather than i_ino
      and i_generation as i_ino is not sufficiently capacious.  It also requires
      this data to be placed into the vnode cache key for fscache.
      
      For the moment, just discard the top 32 bits of the vnode ID when returning
      it though stat.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3b6492df
    • D
      afs: Don't invoke the server to read data beyond EOF · 2a0b4f64
      David Howells 提交于
      When writing a new page, clear space in the page rather than attempting to
      load it from the server if the space is beyond the EOF.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      2a0b4f64
    • D
      afs: Add a couple of tracepoints to log I/O errors · f51375cd
      David Howells 提交于
      Add a couple of tracepoints to log the production of I/O errors within the AFS
      filesystem.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      f51375cd
    • D
      afs: Handle EIO from delivery function · 4ac15ea5
      David Howells 提交于
      Fix afs_deliver_to_call() to handle -EIO being returned by the operation
      delivery function, indicating that the call found itself in the wrong
      state, by printing an error and aborting the call.
      
      Currently, an assertion failure will occur.  This can happen, say, if the
      delivery function falls off the end without calling afs_extract_data() with
      the want_more parameter set to false to collect the end of the Rx phase of
      a call.
      
      The assertion failure looks like:
      
      	AFS: Assertion failed
      	4 == 7 is false
      	0x4 == 0x7 is false
      	------------[ cut here ]------------
      	kernel BUG at fs/afs/rxrpc.c:462!
      
      and is matched in the trace buffer by a line like:
      
      kworker/7:3-3226 [007] ...1 85158.030203: afs_io_error: c=0003be0c r=-5 CM_REPLY
      
      Fixes: 98bf40cd ("afs: Protect call->state changes against signals")
      Reported-by: NMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      4ac15ea5
    • D
      afs: Fix TTL on VL server and address lists · ded2f4c5
      David Howells 提交于
      Currently the TTL on VL server and address lists isn't set in all
      circumstances and may be set to poor choices in others, since the TTL is
      derived from the SRV/AFSDB DNS record if and when available.
      
      Fix the TTL by limiting the range to a minimum and maximum from the current
      time.  At some point these can be made into sysctl knobs.  Further, use the
      TTL we obtained from the upcall to set the expiry on negative results too;
      in future a mechanism can be added to force reloading of such data.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      ded2f4c5
    • D
      afs: Implement VL server rotation · 0a5143f2
      David Howells 提交于
      Track VL servers as independent entities rather than lumping all their
      addresses together into one set and implement server-level rotation by:
      
       (1) Add the concept of a VL server list, where each server has its own
           separate address list.  This code is similar to the FS server list.
      
       (2) Use the DNS resolver to retrieve a set of servers and their associated
           addresses, ports, preference and weight ratings.
      
       (3) In the case of a legacy DNS resolver or an address list given directly
           through /proc/net/afs/cells, create a list containing just a dummy
           server record and attach all the addresses to that.
      
       (4) Implement a simple rotation policy, for the moment ignoring the
           priorities and weights assigned to the servers.
      
       (5) Show the address list through /proc/net/afs/<cell>/vlservers.  This
           also displays the source and status of the data as indicated by the
           upcall.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      0a5143f2
    • D
      afs: Improve FS server rotation error handling · e7f680f4
      David Howells 提交于
      Improve the error handling in FS server rotation by:
      
       (1) Cache the latest useful error value for the fs operation as a whole in
           struct afs_fs_cursor separately from the error cached in the
           afs_addr_cursor struct.  The one in the address cursor gets clobbered
           occasionally.  Copy over the error to the fs operation only when it's
           something we'd be interested in passing to userspace.
      
       (2) Make it so that EDESTADDRREQ is the default that is seen only if no
           addresses are available to be accessed.
      
       (3) When calling utility functions, such as checking a volume status or
           probing a fileserver, don't let a successful result clobber the cached
           error in the cursor; instead, stash the result in a temporary variable
           until it has been assessed.
      
       (4) Don't return ETIMEDOUT or ETIME if a better error, such as
           ENETUNREACH, is already cached.
      
       (5) On leaving the rotation loop, turn any remote abort code into a more
           useful error than ECONNABORTED.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      e7f680f4
    • D
      afs: Set up the iov_iter before calling afs_extract_data() · 12bdcf33
      David Howells 提交于
      afs_extract_data sets up a temporary iov_iter and passes it to AF_RXRPC
      each time it is called to describe the remaining buffer to be filled.
      
      Instead:
      
       (1) Put an iterator in the afs_call struct.
      
       (2) Set the iterator for each marshalling stage to load data into the
           appropriate places.  A number of convenience functions are provided to
           this end (eg. afs_extract_to_buf()).
      
           This iterator is then passed to afs_extract_data().
      
       (3) Use the new ITER_DISCARD iterator to discard any excess data provided
           by FetchData.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      12bdcf33
    • D
      afs: Better tracing of protocol errors · 160cb957
      David Howells 提交于
      Include the site of detection of AFS protocol errors in trace lines to
      better be able to determine what went wrong.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      160cb957
    • D
      iov_iter: Separate type from direction and use accessor functions · aa563d7b
      David Howells 提交于
      In the iov_iter struct, separate the iterator type from the iterator
      direction and use accessor functions to access them in most places.
      
      Convert a bunch of places to use switch-statements to access them rather
      then chains of bitwise-AND statements.  This makes it easier to add further
      iterator types.  Also, this can be more efficient as to implement a switch
      of small contiguous integers, the compiler can use ~50% fewer compare
      instructions than it has to use bitwise-and instructions.
      
      Further, cease passing the iterator type into the iterator setup function.
      The iterator function can set that itself.  Only the direction is required.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      aa563d7b
  2. 15 10月, 2018 1 次提交
    • D
      afs: Fix clearance of reply · f0a7d188
      David Howells 提交于
      The recent patch to fix the afs_server struct leak didn't actually fix the
      bug, but rather fixed some of the symptoms.  The problem is that an
      asynchronous call that holds a resource pointed to by call->reply[0] will
      find the pointer cleared in the call destructor, thereby preventing the
      resource from being cleaned up.
      
      In the case of the server record leak, the afs_fs_get_capabilities()
      function in devel code sets up a call with reply[0] pointing at the server
      record that should be altered when the result is obtained, but this was
      being cleared before the destructor was called, so the put in the
      destructor does nothing and the record is leaked.
      
      Commit f014ffb0 removed the additional ref obtained by
      afs_install_server(), but the removal of this ref is actually used by the
      garbage collector to mark a server record as being defunct after the record
      has expired through lack of use.
      
      The offending clearance of call->reply[0] upon completion in
      afs_process_async_call() has been there from the origin of the code, but
      none of the asynchronous calls actually use that pointer currently, so it
      should be safe to remove (note that synchronous calls don't involve this
      function).
      
      Fix this by the following means:
      
       (1) Revert commit f014ffb0.
      
       (2) Remove the clearance of reply[0] from afs_process_async_call().
      
      Without this, afs_manage_servers() will suffer an assertion failure if it
      sees a server record that didn't get used because the usage count is not 1.
      
      Fixes: f014ffb0 ("afs: Fix afs_server struct leak")
      Fixes: 08e0e7c8 ("[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f0a7d188
  3. 12 10月, 2018 2 次提交
    • D
      afs: Fix afs_server struct leak · f014ffb0
      David Howells 提交于
      Fix a leak of afs_server structs.  The routine that installs them in the
      various lookup lists and trees gets a ref on leaving the function, whether
      it added the server or a server already exists.  It shouldn't increment
      the refcount if it added the server.
      
      The effect of this that "rmmod kafs" will hang waiting for the leaked
      server to become unused.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f014ffb0
    • D
      afs: Fix cell proc list · 6b3944e4
      David Howells 提交于
      Access to the list of cells by /proc/net/afs/cells has a couple of
      problems:
      
       (1) It should be checking against SEQ_START_TOKEN for the keying the
           header line.
      
       (2) It's only holding the RCU read lock, so it can't just walk over the
           list without following the proper RCU methods.
      
      Fix these by using an hlist instead of an ordinary list and using the
      appropriate accessor functions to follow it with RCU.
      
      Since the code that adds a cell to the list must also necessarily change,
      sort the list on insertion whilst we're at it.
      
      Fixes: 989782dc ("afs: Overhaul cell database management")
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b3944e4
  4. 04 10月, 2018 4 次提交
  5. 08 9月, 2018 1 次提交
    • D
      afs: Fix cell specification to permit an empty address list · ecfe951f
      David Howells 提交于
      Fix the cell specification mechanism to allow cells to be pre-created
      without having to specify at least one address (the addresses will be
      upcalled for).
      
      This allows the cell information preload service to avoid the need to issue
      loads of DNS lookups during boot to get the addresses for each cell (500+
      lookups for the 'standard' cell list[*]).  The lookups can be done later as
      each cell is accessed through the filesystem.
      
      Also remove the print statement that prints a line every time a new cell is
      added.
      
      [*] There are 144 cells in the list.  Each cell is first looked up for an
          SRV record, and if that fails, for an AFSDB record.  These get a list
          of server names, each of which then has to be looked up to get the
          addresses for that server.  E.g.:
      
      	dig srv _afs3-vlserver._udp.grand.central.org
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ecfe951f
  6. 24 8月, 2018 1 次提交
  7. 06 8月, 2018 3 次提交
  8. 04 8月, 2018 1 次提交
  9. 21 6月, 2018 1 次提交
  10. 15 6月, 2018 5 次提交
    • D
      afs: Optimise callback breaking by not repeating volume lookup · 47ea0f2e
      David Howells 提交于
      At the moment, afs_break_callbacks calls afs_break_one_callback() for each
      separate FID it was given, and the latter looks up the volume individually
      for each one.
      
      However, this is inefficient if two or more FIDs have the same vid as we
      could reuse the volume.  This is complicated by cell aliasing whereby we
      may have multiple cells sharing a volume and can therefore have multiple
      callback interests for any particular volume ID.
      
      At the moment afs_break_one_callback() scans the entire list of volumes
      we're getting from a server and breaks the appropriate callback in every
      matching volume, regardless of cell.  This scan is done for every FID.
      
      Optimise callback breaking by the following means:
      
       (1) Sort the FID list by vid so that all FIDs belonging to the same volume
           are clumped together.
      
           This is done through the use of an indirection table as we cannot do
           an insertion sort on the afs_callback_break array as we decode FIDs
           into it as we subsequently also have to decode callback info into it
           that corresponds by array index only.
      
           We also don't really want to bubblesort afterwards if we can avoid it.
      
       (2) Sort the server->cb_interests array by vid so that all the matching
           volumes are grouped together.  This permits the scan to stop after
           finding a record that has a higher vid.
      
       (3) When breaking FIDs, we try to keep server->cb_break_lock as long as
           possible, caching the start point in the array for that volume group
           as long as possible.
      
           It might make sense to add another layer in that list and have a
           refcounted volume ID anchor that has the matching interests attached
           to it rather than being in the list.  This would allow the lock to be
           dropped without losing the cursor.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      47ea0f2e
    • D
      afs: Display manually added cells in dynamic root mount · 0da0b7fd
      David Howells 提交于
      Alter the dynroot mount so that cells created by manipulation of
      /proc/fs/afs/cells and /proc/fs/afs/rootcell and by specification of a root
      cell as a module parameter will cause directories for those cells to be
      created in the dynamic root superblock for the network namespace[*].
      
      To this end:
      
       (1) Only one dynamic root superblock is now created per network namespace
           and this is shared between all attempts to mount it.  This makes it
           easier to find the superblock to modify.
      
       (2) When a dynamic root superblock is created, the list of cells is walked
           and directories created for each cell already defined.
      
       (3) When a new cell is added, if a dynamic root superblock exists, a
           directory is created for it.
      
       (4) When a cell is destroyed, the directory is removed.
      
       (5) These directories are created by calling lookup_one_len() on the root
           dir which automatically creates them if they don't exist.
      
      [*] Inasmuch as network namespaces are currently supported here.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      0da0b7fd
    • D
      afs: Enable IPv6 DNS lookups · c88d5a7f
      David Howells 提交于
      Remove the restriction on DNS lookup upcalls that prevents ipv6 addresses
      from being looked up.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      c88d5a7f
    • D
      afs: Show all of a server's addresses in /proc/fs/afs/servers · 0aac4bce
      David Howells 提交于
      Show all of a server's addresses in /proc/fs/afs/servers, placing the
      second plus addresses on padded lines of their own.  The current address is
      marked with a star.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      0aac4bce
    • D
      afs: Handle CONFIG_PROC_FS=n · b6cfbeca
      David Howells 提交于
      The AFS filesystem depends at the moment on /proc for configuration and
      also presents information that way - however, this causes a compilation
      failure if procfs is disabled.
      
      Fix it so that the procfs bits aren't compiled in if procfs is disabled.
      
      This means that you can't configure the AFS filesystem directly, but it is
      still usable provided that an up-to-date keyutils is installed to look up
      cells by SRV or AFSDB DNS records.
      Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      b6cfbeca
  11. 13 6月, 2018 1 次提交
    • K
      treewide: kmalloc() -> kmalloc_array() · 6da2ec56
      Kees Cook 提交于
      The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
      patch replaces cases of:
      
              kmalloc(a * b, gfp)
      
      with:
              kmalloc_array(a * b, gfp)
      
      as well as handling cases of:
      
              kmalloc(a * b * c, gfp)
      
      with:
      
              kmalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kmalloc_array(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kmalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
      
      The tools/ directory was manually excluded, since it has its own
      implementation of kmalloc().
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kmalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kmalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        kmalloc(
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        kmalloc(
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * (COUNT_ID)
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * COUNT_ID
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * COUNT_CONST
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * (COUNT_ID)
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * COUNT_ID
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(THING)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * COUNT_CONST
      +	COUNT_CONST, sizeof(THING)
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      identifier SIZE, COUNT;
      @@
      
      - kmalloc
      + kmalloc_array
        (
      -	SIZE * COUNT
      +	COUNT, SIZE
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        kmalloc(
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kmalloc(
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kmalloc(
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kmalloc(
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kmalloc(
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        kmalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kmalloc(
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kmalloc(
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        kmalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        kmalloc(
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kmalloc(
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products,
      // when they're not all constants...
      @@
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        kmalloc(C1 * C2 * C3, ...)
      |
        kmalloc(
      -	(E1) * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kmalloc(
      -	(E1) * (E2) * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kmalloc(
      -	(E1) * (E2) * (E3)
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kmalloc(
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants,
      // keeping sizeof() as the second factor argument.
      @@
      expression THING, E1, E2;
      type TYPE;
      constant C1, C2, C3;
      @@
      
      (
        kmalloc(sizeof(THING) * C2, ...)
      |
        kmalloc(sizeof(TYPE) * C2, ...)
      |
        kmalloc(C1 * C2 * C3, ...)
      |
        kmalloc(C1 * C2, ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * (E2)
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(TYPE) * E2
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * (E2)
      +	E2, sizeof(THING)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	sizeof(THING) * E2
      +	E2, sizeof(THING)
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	(E1) * E2
      +	E1, E2
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	(E1) * (E2)
      +	E1, E2
        , ...)
      |
      - kmalloc
      + kmalloc_array
        (
      -	E1 * E2
      +	E1, E2
        , ...)
      )
      Signed-off-by: NKees Cook <keescook@chromium.org>
      6da2ec56
  12. 07 6月, 2018 1 次提交
    • K
      treewide: Use struct_size() for kmalloc()-family · acafe7e3
      Kees Cook 提交于
      One of the more common cases of allocation size calculations is finding
      the size of a structure that has a zero-sized array at the end, along
      with memory for some number of elements for that array. For example:
      
      struct foo {
          int stuff;
          void *entry[];
      };
      
      instance = kmalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
      
      Instead of leaving these open-coded and prone to type mistakes, we can
      now use the new struct_size() helper:
      
      instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);
      
      This patch makes the changes for kmalloc()-family (and kvmalloc()-family)
      uses. It was done via automatic conversion with manual review for the
      "CHECKME" non-standard cases noted below, using the following Coccinelle
      script:
      
      // pkey_cache = kmalloc(sizeof *pkey_cache + tprops->pkey_tbl_len *
      //                      sizeof *pkey_cache->table, GFP_KERNEL);
      @@
      identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
      expression GFP;
      identifier VAR, ELEMENT;
      expression COUNT;
      @@
      
      - alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT), GFP)
      + alloc(struct_size(VAR, ELEMENT, COUNT), GFP)
      
      // mr = kzalloc(sizeof(*mr) + m * sizeof(mr->map[0]), GFP_KERNEL);
      @@
      identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
      expression GFP;
      identifier VAR, ELEMENT;
      expression COUNT;
      @@
      
      - alloc(sizeof(*VAR) + COUNT * sizeof(VAR->ELEMENT[0]), GFP)
      + alloc(struct_size(VAR, ELEMENT, COUNT), GFP)
      
      // Same pattern, but can't trivially locate the trailing element name,
      // or variable name.
      @@
      identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
      expression GFP;
      expression SOMETHING, COUNT, ELEMENT;
      @@
      
      - alloc(sizeof(SOMETHING) + COUNT * sizeof(ELEMENT), GFP)
      + alloc(CHECKME_struct_size(&SOMETHING, ELEMENT, COUNT), GFP)
      Signed-off-by: NKees Cook <keescook@chromium.org>
      acafe7e3
  13. 06 6月, 2018 1 次提交
    • D
      vfs: change inode times to use struct timespec64 · 95582b00
      Deepa Dinamani 提交于
      struct timespec is not y2038 safe. Transition vfs to use
      y2038 safe struct timespec64 instead.
      
      The change was made with the help of the following cocinelle
      script. This catches about 80% of the changes.
      All the header file and logic changes are included in the
      first 5 rules. The rest are trivial substitutions.
      I avoid changing any of the function signatures or any other
      filesystem specific data structures to keep the patch simple
      for review.
      
      The script can be a little shorter by combining different cases.
      But, this version was sufficient for my usecase.
      
      virtual patch
      
      @ depends on patch @
      identifier now;
      @@
      - struct timespec
      + struct timespec64
        current_time ( ... )
        {
      - struct timespec now = current_kernel_time();
      + struct timespec64 now = current_kernel_time64();
        ...
      - return timespec_trunc(
      + return timespec64_trunc(
        ... );
        }
      
      @ depends on patch @
      identifier xtime;
      @@
       struct \( iattr \| inode \| kstat \) {
       ...
      -       struct timespec xtime;
      +       struct timespec64 xtime;
       ...
       }
      
      @ depends on patch @
      identifier t;
      @@
       struct inode_operations {
       ...
      int (*update_time) (...,
      -       struct timespec t,
      +       struct timespec64 t,
      ...);
       ...
       }
      
      @ depends on patch @
      identifier t;
      identifier fn_update_time =~ "update_time$";
      @@
       fn_update_time (...,
      - struct timespec *t,
      + struct timespec64 *t,
       ...) { ... }
      
      @ depends on patch @
      identifier t;
      @@
      lease_get_mtime( ... ,
      - struct timespec *t
      + struct timespec64 *t
        ) { ... }
      
      @te depends on patch forall@
      identifier ts;
      local idexpression struct inode *inode_node;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn_update_time =~ "update_time$";
      identifier fn;
      expression e, E3;
      local idexpression struct inode *node1;
      local idexpression struct inode *node2;
      local idexpression struct iattr *attr1;
      local idexpression struct iattr *attr2;
      local idexpression struct iattr attr;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      @@
      (
      (
      - struct timespec ts;
      + struct timespec64 ts;
      |
      - struct timespec ts = current_time(inode_node);
      + struct timespec64 ts = current_time(inode_node);
      )
      
      <+... when != ts
      (
      - timespec_equal(&inode_node->i_xtime, &ts)
      + timespec64_equal(&inode_node->i_xtime, &ts)
      |
      - timespec_equal(&ts, &inode_node->i_xtime)
      + timespec64_equal(&ts, &inode_node->i_xtime)
      |
      - timespec_compare(&inode_node->i_xtime, &ts)
      + timespec64_compare(&inode_node->i_xtime, &ts)
      |
      - timespec_compare(&ts, &inode_node->i_xtime)
      + timespec64_compare(&ts, &inode_node->i_xtime)
      |
      ts = current_time(e)
      |
      fn_update_time(..., &ts,...)
      |
      inode_node->i_xtime = ts
      |
      node1->i_xtime = ts
      |
      ts = inode_node->i_xtime
      |
      <+... attr1->ia_xtime ...+> = ts
      |
      ts = attr1->ia_xtime
      |
      ts.tv_sec
      |
      ts.tv_nsec
      |
      btrfs_set_stack_timespec_sec(..., ts.tv_sec)
      |
      btrfs_set_stack_timespec_nsec(..., ts.tv_nsec)
      |
      - ts = timespec64_to_timespec(
      + ts =
      ...
      -)
      |
      - ts = ktime_to_timespec(
      + ts = ktime_to_timespec64(
      ...)
      |
      - ts = E3
      + ts = timespec_to_timespec64(E3)
      |
      - ktime_get_real_ts(&ts)
      + ktime_get_real_ts64(&ts)
      |
      fn(...,
      - ts
      + timespec64_to_timespec(ts)
      ,...)
      )
      ...+>
      (
      <... when != ts
      - return ts;
      + return timespec64_to_timespec(ts);
      ...>
      )
      |
      - timespec_equal(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_equal(&node1->i_xtime2, &node2->i_xtime2)
      |
      - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2)
      + timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2)
      |
      - timespec_compare(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2)
      |
      node1->i_xtime1 =
      - timespec_trunc(attr1->ia_xtime1,
      + timespec64_trunc(attr1->ia_xtime1,
      ...)
      |
      - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2,
      + attr1->ia_xtime1 =  timespec64_trunc(attr2->ia_xtime2,
      ...)
      |
      - ktime_get_real_ts(&attr1->ia_xtime1)
      + ktime_get_real_ts64(&attr1->ia_xtime1)
      |
      - ktime_get_real_ts(&attr.ia_xtime1)
      + ktime_get_real_ts64(&attr.ia_xtime1)
      )
      
      @ depends on patch @
      struct inode *node;
      struct iattr *attr;
      identifier fn;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      expression e;
      @@
      (
      - fn(node->i_xtime);
      + fn(timespec64_to_timespec(node->i_xtime));
      |
       fn(...,
      - node->i_xtime);
      + timespec64_to_timespec(node->i_xtime));
      |
      - e = fn(attr->ia_xtime);
      + e = fn(timespec64_to_timespec(attr->ia_xtime));
      )
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      )
      ...+>
      }
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      struct kstat *stat;
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier i_xtime =~ "^i_[acm]time$";
      identifier xtime =~ "^[acm]time$";
      identifier fn, ret;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(stat->xtime);
      ret = fn (...,
      - &stat->xtime);
      + &ts);
      )
      ...+>
      }
      
      @ depends on patch @
      struct inode *node;
      struct inode *node2;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier i_xtime3 =~ "^i_[acm]time$";
      struct iattr *attrp;
      struct iattr *attrp2;
      struct iattr attr ;
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      struct kstat *stat;
      struct kstat stat1;
      struct timespec64 ts;
      identifier xtime =~ "^[acmb]time$";
      expression e;
      @@
      (
      ( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1  ;
      |
       node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       stat->xtime = node2->i_xtime1;
      |
       stat1.xtime = node2->i_xtime1;
      |
      ( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1  ;
      |
      ( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2;
      |
      - e = node->i_xtime1;
      + e = timespec64_to_timespec( node->i_xtime1 );
      |
      - e = attrp->ia_xtime1;
      + e = timespec64_to_timespec( attrp->ia_xtime1 );
      |
      node->i_xtime1 = current_time(...);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
       node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
      - node->i_xtime1 = e;
      + node->i_xtime1 = timespec_to_timespec64(e);
      )
      Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com>
      Cc: <anton@tuxera.com>
      Cc: <balbi@kernel.org>
      Cc: <bfields@fieldses.org>
      Cc: <darrick.wong@oracle.com>
      Cc: <dhowells@redhat.com>
      Cc: <dsterba@suse.com>
      Cc: <dwmw2@infradead.org>
      Cc: <hch@lst.de>
      Cc: <hirofumi@mail.parknet.co.jp>
      Cc: <hubcap@omnibond.com>
      Cc: <jack@suse.com>
      Cc: <jaegeuk@kernel.org>
      Cc: <jaharkes@cs.cmu.edu>
      Cc: <jslaby@suse.com>
      Cc: <keescook@chromium.org>
      Cc: <mark@fasheh.com>
      Cc: <miklos@szeredi.hu>
      Cc: <nico@linaro.org>
      Cc: <reiserfs-devel@vger.kernel.org>
      Cc: <richard@nod.at>
      Cc: <sage@redhat.com>
      Cc: <sfrench@samba.org>
      Cc: <swhiteho@redhat.com>
      Cc: <tj@kernel.org>
      Cc: <trond.myklebust@primarydata.com>
      Cc: <tytso@mit.edu>
      Cc: <viro@zeniv.linux.org.uk>
      95582b00