1. 03 Jun, 2020 1 commit
    • ovl: fix out of bounds access warning in ovl_check_fb_len() · 522f6e6c
      Amir Goldstein committed
      syzbot reported out of bounds memory access from open_by_handle_at()
      with a crafted file handle that looks like this:
      
        { .handle_bytes = 2, .handle_type = OVL_FILEID_V1 }
      
      handle_bytes gets rounded down to 0 and we end up calling:
        ovl_check_fh_len(fh, 0) => ovl_check_fb_len(fh + 3, -3)
      
      But fh buffer is only 2 bytes long, so accessing struct ovl_fb at
      fh + 3 is illegal.
      
      Fixes: cbe7fba8 ("ovl: make sure that real fid is 32bit aligned in memory")
      Reported-and-tested-by: syzbot+61958888b1c60361a791@syzkaller.appspotmail.com
      Cc: <stable@vger.kernel.org> # v5.5
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
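The failure mode in this commit message can be sketched in userspace. This is a minimal illustration, not the kernel's code: `check_handle_len`, `WIRE_OFFSET`, `INNER_HDR_LEN`, and `LEN_FIELD_POS` are hypothetical names, and the position of the length byte inside the inner header is an assumption. The idea is to validate the total length before subtracting the offset or reading the inner header.

```c
#include <stdint.h>

#define WIRE_OFFSET   3   /* padding bytes before the inner header */
#define INNER_HDR_LEN 21  /* assumed fixed size of the inner header */
#define LEN_FIELD_POS 2   /* assumed position of the length byte in it */

/*
 * Return 0 if buf (buf_len valid bytes) is long enough to hold the
 * padding, the inner header, and the length the header claims;
 * return -1 otherwise.
 */
int check_handle_len(const uint8_t *buf, int buf_len)
{
    int inner_len;

    /* Check the total first, so that a short buffer is never read. */
    if (buf_len < WIRE_OFFSET + INNER_HDR_LEN)
        return -1;

    inner_len = buf_len - WIRE_OFFSET;
    if (inner_len < buf[WIRE_OFFSET + LEN_FIELD_POS])
        return -1;

    return 0;
}
```

With a 2-byte buffer whose length was rounded down to 0, the first check rejects the handle before any byte at offset 3 is touched.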
  2. 13 May, 2020 2 commits
  3. 27 Mar, 2020 1 commit
    • ovl: enable xino automatically in more cases · 926e94d7
      Amir Goldstein committed
      So far, with xino=auto, we only enable xino if we know that all
      underlying filesystems use 32bit inode numbers.
      
      When users configure overlay with xino=auto, they already declare that
      they are ready to handle 64bit inode numbers from overlay.
      
      It is a very common case that the underlying filesystem uses 64bit ino,
      but rarely or never uses the high inode number bits (e.g. tmpfs, xfs).
      Leaving it to users to declare that high ino bits are unused with
      xino=on is not a recipe for many users to enjoy the benefits of xino.
      
      There appears to be very little reason not to enable xino when users
      declare xino=auto even if we do not know how many bits underlying
      filesystem uses for inode numbers.
      
      In the worst case of xino bits overflow by real inode number, we
      already fall back to the non-xino behavior - real inode number with
      unique pseudo dev or to non persistent inode number and overlay st_dev
      (for directories).
      
      The only annoyance from auto enabling xino is that xino bits overflow
      emits a warning to kmsg. Suppress those warnings unless users explicitly
      asked for xino=on, suggesting that they expected high ino bits to be
      unused by underlying filesystem.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
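The overflow fallback described in this commit message can be illustrated with a small sketch. It assumes that xino works by folding a small per-filesystem id into the otherwise unused high bits of a 64-bit inode number. The function name `xino_encode` and the exact bit layout are hypothetical, not taken from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Try to fold a small filesystem id into the top xinobits bits of a
 * 64-bit inode number.  Returns false when the real inode number
 * already uses those bits (overflow) or the id does not fit, in which
 * case the caller falls back to the non-xino behavior.
 */
bool xino_encode(uint64_t real_ino, uint32_t fsid, unsigned int xinobits,
                 uint64_t *out)
{
    unsigned int shift;

    if (xinobits == 0 || xinobits >= 64)
        return false;

    shift = 64 - xinobits;
    if (real_ino >> shift)
        return false;   /* high bits in use: overflow */
    if ((uint64_t)fsid >> xinobits)
        return false;   /* id needs more than xinobits bits */

    *out = real_ino | ((uint64_t)fsid << shift);
    return true;
}
```

A caller that gets `false` back would report the real inode number unchanged, which corresponds to the fallback path the message mentions.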
  4. 17 Mar, 2020 4 commits
  5. 12 Mar, 2020 1 commit
  6. 24 Jan, 2020 3 commits
    • ovl: implement async IO routines · 2406a307
      Jiufei Xue committed
      A performance regression was observed since Linux v4.19 with an aio test
      using fio with iodepth 128 on overlayfs.  The queue depth of the device
      was always 1, which is unexpected.
      
      After investigation, it was found that commit 16914e6f ("ovl: add
      ovl_read_iter()") and commit 2a92e07e ("ovl: add ovl_write_iter()")
      resulted in vfs_iter_{read,write} being called on underlying filesystem,
      which always results in synchronous IO.
      
      Implement async IO for stacked reading and writing.  This resolves the
      performance regression.
      
      This is implemented by allocating a new kiocb for submitting the AIO
      request on the underlying filesystem.  When the request is completed, the
      new kiocb is freed and the completion callback is called on the original
      iocb.
      Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
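The wrap-and-forward pattern in the last paragraph of this commit message can be sketched in plain C. This is only an analogy under stated assumptions: `struct req` stands in for a kiocb, and every name below (`stacked_req`, `submit_stacked`, `fake_lower_submit`, `record_result`) is hypothetical. The lower layer here completes immediately, whereas a real one would complete later.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a kiocb: a request with a completion callback. */
struct req {
    void (*complete)(struct req *r, long res);
    long result;
    int done;
};

/* Wrapper allocated per submission; remembers the caller's request. */
struct stacked_req {
    struct req inner;   /* must stay first: inner_complete() casts back */
    struct req *orig;
};

/* Runs when the lower layer finishes: free the wrapper, then notify. */
void inner_complete(struct req *inner, long res)
{
    struct stacked_req *w = (struct stacked_req *)inner;
    struct req *orig = w->orig;

    free(w);
    orig->complete(orig, res);
}

/* Submit orig to the lower layer through a freshly allocated wrapper. */
int submit_stacked(struct req *orig, void (*lower_submit)(struct req *))
{
    struct stacked_req *w = malloc(sizeof(*w));

    if (!w)
        return -1;
    w->inner.complete = inner_complete;
    w->inner.result = 0;
    w->inner.done = 0;
    w->orig = orig;
    lower_submit(&w->inner);
    return 0;
}

/* Demo helper: a lower layer that completes at once with 4096 bytes. */
void fake_lower_submit(struct req *r)
{
    r->complete(r, 4096);
}

/* Demo helper: completion callback of the original request. */
void record_result(struct req *r, long res)
{
    r->result = res;
    r->done = 1;
}
```

Calling `submit_stacked(&orig, fake_lower_submit)` on a request whose `complete` is `record_result` leaves `orig.done` set and `orig.result` holding the lower layer's result, with the wrapper already freed.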
    • ovl: layer is const · 13464165
      Miklos Szeredi committed
      The ovl_layer struct is never modified except at initialization.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: simplify ovl_same_sb() helper · 0f831ec8
      Amir Goldstein committed
      No code uses the sb returned from this helper, so make it return a boolean
      and rename it to ovl_same_fs().
      
      The xino mode is irrelevant when all layers are on same fs, so instead of
      describing samefs with mode OVL_XINO_OFF, use a new xino_mode state, which
      is 0 in the case of samefs, -1 in the case of xino=off and > 0 with xino
      enabled.
      
      Create a new helper ovl_same_dev(), to use instead of the common check for
      (ovl_same_fs() || xinobits).
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
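The three-valued xino_mode state from this commit message can be sketched as follows. The struct and function names are hypothetical stand-ins (the real helpers take a superblock), and the bodies are one plausible reading of the message: 0 means samefs, -1 means xino=off, and a positive value is the number of xino bits.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-mount overlay state. */
struct ovl_fs_sketch {
    int xino_mode;  /* 0: samefs, -1: xino=off, > 0: xino bits in use */
};

/* All layers are on the same filesystem. */
bool same_fs(const struct ovl_fs_sketch *ofs)
{
    return ofs->xino_mode == 0;
}

/* Number of xino bits, or 0 when xino is not in use. */
unsigned int xino_bits(const struct ovl_fs_sketch *ofs)
{
    return ofs->xino_mode > 0 ? (unsigned int)ofs->xino_mode : 0;
}

/* Replaces the common check (same_fs() || xinobits). */
bool same_dev(const struct ovl_fs_sketch *ofs)
{
    return ofs->xino_mode >= 0;
}
```

Encoding the state in one signed integer makes `same_dev()` a single comparison, which is equivalent to the older two-part check.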
  7. 23 Jan, 2020 1 commit
  8. 10 Dec, 2019 1 commit
    • ovl: make sure that real fid is 32bit aligned in memory · cbe7fba8
      Amir Goldstein committed
      Separate on-disk encoding from in-memory and on-wire representation
      of overlay file handle.
      
      In-memory and on-wire we only ever pass around pointers to struct
      ovl_fh, which encapsulates at offset 3 the on-disk format struct
      ovl_fb. struct ovl_fb encapsulates at offset 21 the real file handle.
      That makes sure that the real file handle is always 32bit aligned
      in-memory when passed down to the underlying filesystem.
      
      On-disk format remains the same and store/load are done into
      correctly aligned buffer.
      
      New nfs exported file handles are exported with aligned real fid.
      Old nfs file handles are copied to an aligned buffer before being
      decoded.
      Reported-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
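The alignment arithmetic in this commit message (offset 3 plus offset 21 gives 24, a multiple of 4) can be checked with a pair of packed structs. The field names and the 16-byte uuid are assumptions made for illustration, the struct names are hypothetical, and `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the on-disk part: 5 one-byte fields + 16-byte uuid = 21. */
struct fb_sketch {
    uint8_t version;
    uint8_t magic;
    uint8_t len;
    uint8_t flags;
    uint8_t type;
    uint8_t uuid[16];
    uint32_t fid[2];    /* stands in for the real file handle */
} __attribute__((packed));

/* Sketch of the in-memory/on-wire wrapper: 3 bytes of padding first. */
struct fh_sketch {
    uint8_t padding[3];
    struct fb_sketch fb;
} __attribute__((packed));

_Static_assert(offsetof(struct fh_sketch, fb) == 3,
               "on-disk part starts at offset 3");
_Static_assert(offsetof(struct fb_sketch, fid) == 21,
               "real fid starts at offset 21 of the on-disk part");

/* Offset of the real fid from the start of the wrapper. */
size_t real_fid_offset(void)
{
    return offsetof(struct fh_sketch, fb) + offsetof(struct fb_sketch, fid);
}
```

If the wrapper itself is allocated on a 4-byte boundary, the real fid at offset 24 is 32bit aligned, which is the property the commit is after.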
  9. 19 Jun, 2019 1 commit
  10. 29 May, 2019 1 commit
    • ovl: detect overlapping layers · 146d62e5
      Amir Goldstein committed
      Overlapping overlay layers are not supported and can cause unexpected
      behavior, but overlayfs does not currently check or warn about these
      configurations.
      
      Users are not supposed to specify the same directory for upper and
      lower dirs or for different lower layers, and users are not supposed to
      specify directories that are descendants of each other for overlay
      layers, but that is exactly what this syzbot repro did:
      
          https://syzkaller.appspot.com/x/repro.syz?x=12c7a94f400000
      
      Moving layer root directories into other layers while overlayfs
      is mounted could also result in unexpected behavior.
      
      This commit places "traps" in the overlay inode hash table.
      Those traps are dummy overlay inodes that are hashed by the layers
      root inodes.
      
      On mount, the hash table trap entries are used to verify that overlay
      layers are not overlapping.  While at it, we also verify that overlay
      layers are not overlapping with directories "in-use" by other overlay
      instances as upperdir/workdir.
      
      On lookup, the trap entries are used to verify that overlay layers
      root inodes have not been moved into other layers after mount.
      
      Some examples:
      
      $ ./run --ov --samefs -s
      ...
      ( mkdir -p base/upper/0/u base/upper/0/w base/lower lower upper mnt
        mount -o bind base/lower lower
        mount -o bind base/upper upper
        mount -t overlay none mnt ...
              -o lowerdir=lower,upperdir=upper/0/u,workdir=upper/0/w)
      
      $ umount mnt
      $ mount -t overlay none mnt ...
              -o lowerdir=base,upperdir=upper/0/u,workdir=upper/0/w
      
        [   94.434900] overlayfs: overlapping upperdir path
        mount: mount overlay on mnt failed: Too many levels of symbolic links
      
      $ mount -t overlay none mnt ...
              -o lowerdir=upper/0/u,upperdir=upper/0/u,workdir=upper/0/w
      
        [  151.350132] overlayfs: conflicting lowerdir path
        mount: none is already mounted or mnt busy
      
      $ mount -t overlay none mnt ...
              -o lowerdir=lower:lower/a,upperdir=upper/0/u,workdir=upper/0/w
      
        [  201.205045] overlayfs: overlapping lowerdir path
        mount: mount overlay on mnt failed: Too many levels of symbolic links
      
      $ mount -t overlay none mnt ...
              -o lowerdir=lower,upperdir=upper/0/u,workdir=upper/0/w
      $ mv base/upper/0/ base/lower/
      $ find mnt/0
        mnt/0
        mnt/0/w
        find: 'mnt/0/w/work': Too many levels of symbolic links
        find: 'mnt/0/u': Too many levels of symbolic links
      
      Reported-by: syzbot+9c69c282adc4edd2b540@syzkaller.appspotmail.com
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  11. 06 May, 2019 1 commit
    • ovl: fix missing upper fs freeze protection on copy up for ioctl · 3428030d
      Amir Goldstein committed
      Generalize the helper ovl_open_maybe_copy_up() and use it to copy up file
      with data before FS_IOC_SETFLAGS ioctl.
      
      The FS_IOC_SETFLAGS ioctl is a bit of an oddball in vfs, which probably
      caused the confusion.  The file may be open O_RDONLY, but the ioctl
      modifies the file.  VFS does not call mnt_want_write_file() nor lock the
      inode mutex, but fs-specific code for FS_IOC_SETFLAGS does.  So
      ovl_ioctl() calls mnt_want_write_file() for the overlay file, and
      fs-specific code calls mnt_want_write_file() for the upper fs file, but
      there was no call to ovl_want_write() for the duration of the copy up,
      which is what prevents overlayfs from copying up on a frozen upper fs.
      
      Fixes: dab5ca8f ("ovl: add lsattr/chattr support")
      Cc: <stable@vger.kernel.org> # v4.19
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  12. 13 Feb, 2019 1 commit
  13. 27 Oct, 2018 2 commits
  14. 04 Oct, 2018 1 commit
  15. 20 Jul, 2018 13 commits
    • ovl: add helper to force data copy-up · d1e6f6a9
      Vivek Goyal committed
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: Check redirect on index as well · 0a2d0d3f
      Vivek Goyal committed
      Right now we seem to check redirect only if upperdentry is found.  But it
      is possible that there is no upperdentry but we later find an index.
      
      We need to check redirect on index as well and set it in
      ovl_inode->redirect.  Otherwise link code can assume that dentry does not
      have redirect and place a new one, which breaks things.  In my testing,
      the overlay/033 test started failing in xfstests.  Details follow.
      
      For example, do the following.
      
      $ mkdir lower upper work merged
      
       - Make lower dir with 4 links.
        $ echo "foo" > lower/l0.txt
        $ ln  lower/l0.txt lower/l1.txt
        $ ln  lower/l0.txt lower/l2.txt
        $ ln  lower/l0.txt lower/l3.txt
      
       - Mount with index on and metacopy on.
      
        $ mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work,\
                              index=on,metacopy=on none merged
      
       - Link lower
      
        $ ln merged/l0.txt merged/l4.txt
          (This will metadata copy up of l0.txt and put an absolute redirect
           /l0.txt)
      
        $ echo 2 > /proc/sys/vm/drop_caches
      
        $ ls merged/l1.txt
        (Now l1.txt will be looked up.  There is no upper dentry but there is
         lower dentry and index will be found.  We don't check for redirect on
         index, hence ovl_inode->redirect will be NULL.)
      
       - Link Upper
      
        $ ln merged/l4.txt merged/l5.txt
        (Lookup of l4.txt will use inode from l1.txt lookup which is still in
         cache.  It has ovl_inode->redirect NULL, hence link will put a new
         redirect and replace /l0.txt with /l4.txt.)
      
       - Drop caches.
        $ echo 2 > /proc/sys/vm/drop_caches
      
       - List l1.txt and it returns -ESTALE
      
        $ ls merged/l0.txt
      
        (It returns stale because, we found a metacopy of l0.txt in upper and it
         has redirect l4.txt but there is no file named l4.txt in lower layer.
         So lower data copy is not found and -ESTALE is returned.)
      
      So problem here is that we did not process redirect on index.  Check
      redirect on index as well and then problem is fixed.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: Add an inode flag OVL_CONST_INO · a00c2d59
      Vivek Goyal committed
      Add an ovl_inode flag OVL_CONST_INO.  This flag signifies whether the inode
      number will remain constant over copy up.  The flag is not updated over
      copy up and remains unmodified once set.
      
      The next patch in the series will make use of this flag.  It will
      figure out if a dentry is of type ORIGIN or not, which can be derived
      from this flag.
      
      ORIGIN = (upperdentry && ovl_test_flag(OVL_CONST_INO, inode)).
      Suggested-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: Add helper ovl_inode_realdata() · 4823d49c
      Vivek Goyal committed
      Add a helper to retrieve the real data inode associated with an overlay
      inode.  This helper will ignore all metacopy inodes and will return only
      the real inode which has data.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: Store lower data inode in ovl_inode · 2664bd08
      Vivek Goyal committed
      Right now ovl_inode stores inode pointer for lower inode.  This helps with
      quickly getting lower inode given overlay inode (ovl_inode_lower()).
      
      Now with metadata only copy-up, we can have metacopy inode in middle layer
      as well and inode containing data can be different from ->lower.  I need to
      be able to open the real file in ovl_open_realfile() and for that I need to
      quickly find the lower data inode.
      
      Hence store lower data inode also in ovl_inode.  Also provide a helper
      ovl_inode_lowerdata() to access this field.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: Fix ovl_getattr() to get number of blocks from lower · 67d756c2
      Vivek Goyal committed
      If an inode has been copied up metadata only, then we need to query the
      number of blocks from lower and fill up the stat->st_blocks.
      
      We need to be careful about races where we are doing stat on one cpu and
      data copy up is taking place on other cpu.  We want to return
      stat->st_blocks either from lower or stable upper and not something in
      between.  Hence, ovl_has_upperdata() is called first to figure out whether
      block reporting will take place from lower or upper.
      
      We now support metacopy dentries in middle layer.  That means number of
      blocks reporting needs to come from lowest data dentry and this could be
      different from lower dentry.  Hence we end up making a separate
      vfs_getxattr() call for metacopy dentries to get number of blocks.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry · 647d253f
      Vivek Goyal committed
      Now we have the notion of data dentry and metacopy dentry.
      ovl_dentry_lower() will return uppermost lower dentry, but it could be
      either data or metacopy dentry.  Now we support metacopy dentries in lower
      layers so it is possible that lowerstack[0] is metacopy dentry while
      lowerstack[1] is actual data dentry.
      
      So add a helper which returns the lowest dentry, which is supposed to be
      the data dentry.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
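The difference between the uppermost lower dentry and the lower data dentry in this commit message can be sketched with an array standing in for the lower stack. `struct lower_entry`, `lower_top`, and `lower_data` are hypothetical names. The sketch assumes, as the message suggests, that the last entry of the stack is the one holding data.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the lower stack. */
struct lower_entry {
    const char *layer;  /* label of the layer the entry comes from */
    bool metacopy;      /* true if the entry holds metadata only */
};

/* Uppermost lower entry: may be data or metacopy. */
const struct lower_entry *lower_top(const struct lower_entry *stack, size_t n)
{
    return n ? &stack[0] : NULL;
}

/* Lowest entry: assumed to be the one that holds the data. */
const struct lower_entry *lower_data(const struct lower_entry *stack, size_t n)
{
    return n ? &stack[n - 1] : NULL;
}
```

For a two-entry stack where entry 0 is a metacopy and entry 1 holds the data, `lower_top()` returns entry 0 and `lower_data()` returns entry 1.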
    • ovl: Copy up meta inode data from lowest data inode · 4f93b426
      Vivek Goyal committed
      So far lower could not be a meta inode.  So whenever it was time to copy up
      data of a meta inode, we could copy it up from top most lower dentry.
      
      But now lower itself can be a metacopy inode.  That means data copy up
      needs to take place from a data inode in metacopy inode chain.  Find lower
      data inode in the chain and use that for data copy up.
      
      Introduced a helper called ovl_path_lowerdata() to find the lower data
      inode chain.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: Modify ovl_lookup() and friends to lookup metacopy dentry · 9d3dfea3
      Vivek Goyal committed
      This patch modifies ovl_lookup() and friends to lookup metacopy dentries.
      It also allows for presence of metacopy dentries in lower layer.
      
      During lookup, check for presence of OVL_XATTR_METACOPY and if not present,
      set OVL_UPPERDATA bit in flags.
      
      We don't support metacopy feature with nfs_export.  So in nfs_export code,
      we set the OVL_UPPERDATA flag unconditionally if upper inode exists.
      
      Do not follow metacopy origin if we find a metacopy only inode and metacopy
      feature is not enabled for that mount.  Like redirect, this can have
      security implications where an attacker could hand craft upper and try to
      gain access to file on lower which it should not have to begin with.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: A new xattr OVL_XATTR_METACOPY for file on upper · 0c288874
      Vivek Goyal committed
      Now we will have the capability to have upper inodes which might be only
      metadata copy up and data is still on lower inode.  So add a new xattr
      OVL_XATTR_METACOPY to distinguish between two cases.
      
      Presence of OVL_XATTR_METACOPY reflects that file has been copied up
      metadata only and data will be copied up later from lower origin.  So
      this xattr is set when a metadata copy takes place and cleared when data
      copy takes place.
      
      We also use a bit in ovl_inode->flags to cache OVL_UPPERDATA which reflects
      whether ovl inode has data or not (as opposed to metadata only copy up).
      
      If a file is copied up metadata only and later when same file is opened for
      WRITE, then data copy up takes place.  We copy up data, remove METACOPY
      xattr and then set the UPPERDATA flag in ovl_inode->flags.  While all these
      operations happen with oi->lock held, read side of oi->flags can be
      lockless.  That is another thread on another cpu can check if UPPERDATA
      flag is set or not.
      
      So this gives us an ordering requirement w.r.t UPPERDATA flag.  That is, if
      another cpu sees UPPERDATA flag set, then it should be guaranteed that
      effects of data copy up and remove xattr operations are also visible.
      
      For example.
      
      CPU1                                    CPU2
      ovl_open()                              acquire(oi->lock)
       ovl_open_maybe_copy_up()                ovl_copy_up_data()
        ovl_open_need_copy_up()                vfs_removexattr()
         ovl_already_copied_up()
          ovl_dentry_needs_data_copy_up()      ovl_set_flag(OVL_UPPERDATA)
           ovl_test_flag(OVL_UPPERDATA)       release(oi->lock)
      
      Say CPU2 is copying up data and in the end sets UPPERDATA flag.  But if
      CPU1 perceives the effects of setting UPPERDATA flag but not the effects of
      preceding operations (ex. upper that is not fully copied up), it will be a
      problem.
      
      Hence this patch introduces smp_wmb() on setting UPPERDATA flag operation
      and smp_rmb() on UPPERDATA flag test operation.
      
      Maybe some other lock or barrier is already covering it.  But I am not
      sure what that is, or whether it is obvious enough that we will not break
      it in future.
      
      Hence I am trying to be safe here and introducing barriers explicitly for
      the UPPERDATA flag/bit.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
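The ordering requirement on the UPPERDATA flag can be illustrated in userspace. The commit uses smp_wmb() and smp_rmb(). The sketch below swaps in C11 release/acquire atomics, which give a comparable guarantee for this publish-then-test pattern. `struct inode_sketch`, `publish_data`, and `read_data` are hypothetical names, and an `int` stands in for the copied-up data.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the overlay inode. */
struct inode_sketch {
    int data;               /* stands in for the copied-up data */
    atomic_bool upperdata;  /* stands in for the UPPERDATA flag */
};

/* Writer side: make the data visible, then set the flag (release). */
void publish_data(struct inode_sketch *oi, int value)
{
    oi->data = value;
    atomic_store_explicit(&oi->upperdata, true, memory_order_release);
}

/*
 * Lockless reader side: test the flag (acquire) and read the data only
 * if it is set.  Returns false if the data has not been published yet.
 */
bool read_data(struct inode_sketch *oi, int *out)
{
    if (!atomic_load_explicit(&oi->upperdata, memory_order_acquire))
        return false;
    *out = oi->data;
    return true;
}
```

A reader that observes the flag as set is then guaranteed to also observe the write to `data` that preceded the release store, which mirrors the guarantee the message asks for.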
    • ovl: Add helper ovl_already_copied_up() · 2002df85
      Vivek Goyal committed
      There are a couple of places where we need to know if a file is already
      copied up (in lockless manner).  Right now it is open coded and there are
      only two conditions to check.  Soon this patch series will introduce
      another condition to check and Amir wants to introduce one more.  So
      introduce a helper to check this so that the code is easier to read.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: Move the copy up helpers to copy_up.c · d6eac039
      Vivek Goyal committed
      Right now two copy up helpers are in inode.c.  Amir suggested it might be
      better to move these to copy_up.c.
      
      There will be one more related function which will come in a later patch.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: Initialize ovl_inode->redirect in ovl_get_inode() · 9cec54c8
      Vivek Goyal committed
      ovl_inode->redirect is an inode property and should be initialized in
      ovl_get_inode() only when we are adding a new inode to cache.  If inode is
      already in cache, it is already initialized and we should not be touching
      ovl_inode->redirect field.
      
      As of now this is not a problem as redirects are used only for directories
      which don't share inode.  But soon I want to use redirects for regular
      files also and there it can become an issue.
      
      Hence, move ->redirect initialization in ovl_get_inode().
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  16. 18 Jul, 2018 4 commits
    • ovl: stack file ops · d1d04ef8
      Miklos Szeredi committed
      Implement file operations on a regular overlay file.  The underlying file
      is opened separately and cached in ->private_data.
      
      It might be worth making an exception for such files when accounting in
      nr_file to conform to userspace expectations.  We are only adding a small
      overhead (248 bytes for the struct file) since the real inode and dentry are
      pinned by overlayfs anyway.
      
      This patch doesn't have any effect, since the vfs will use d_real() to find
      the real underlying file to open.  The patch at the end of the series will
      actually enable this functionality.
      
      AV: make it use open_with_fake_path(), don't mess with override_creds
      
      SzM: still need to mess with override_creds() until no fs uses
      current_cred() in their open method.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ovl: copy up file size as well · 46e5d0a3
      Miklos Szeredi committed
      Copy i_size of the underlying inode to the overlay inode in ovl_copyattr().
      
      This is in preparation for stacking I/O operations on overlay files.
      
      This patch shouldn't have any observable effect.
      
      Remove stale comment from ovl_setattr() [spotted by Vivek Goyal].
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: copy up inode flags · 4f357295
      Miklos Szeredi committed
      On inode creation copy certain inode flags from the underlying real inode
      to the overlay inode.
      
      This is in preparation for moving overlay functionality out of the VFS.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: copy up times · d9854c87
      Miklos Szeredi committed
      Copy up mtime and ctime to overlay inode after times in real object are
      modified.  Be careful not to dirty cachelines when not necessary.
      
      This is in preparation for moving overlay functionality out of the VFS.
      
      This patch shouldn't have any observable effect.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  17. 06 Jun, 2018 1 commit
    • vfs: change inode times to use struct timespec64 · 95582b00
      Deepa Dinamani committed
      struct timespec is not y2038 safe. Transition vfs to use
      y2038 safe struct timespec64 instead.
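The y2038 limit can be made concrete with a small check. This sketch assumes a seconds field that is 32 bits wide and signed, as it is for struct timespec on 32-bit platforms. The function name `fits_in_32bit_seconds` is hypothetical. The largest representable value, 2147483647 seconds after the epoch, falls on 19 January 2038 at 03:14:07 UTC.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Report whether a 64-bit seconds value could be stored in a signed
 * 32-bit seconds field without overflow.
 */
bool fits_in_32bit_seconds(int64_t sec)
{
    return sec >= INT32_MIN && sec <= INT32_MAX;
}
```

A timestamp one second past that limit no longer fits, which is why the seconds field is widened to 64 bits.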
      
      The change was made with the help of the following Coccinelle
      script. This catches about 80% of the changes.
      All the header file and logic changes are included in the
      first 5 rules. The rest are trivial substitutions.
      I avoid changing any of the function signatures or any other
      filesystem specific data structures to keep the patch simple
      for review.
      
      The script can be a little shorter by combining different cases.
      But, this version was sufficient for my use case.
      
      virtual patch
      
      @ depends on patch @
      identifier now;
      @@
      - struct timespec
      + struct timespec64
        current_time ( ... )
        {
      - struct timespec now = current_kernel_time();
      + struct timespec64 now = current_kernel_time64();
        ...
      - return timespec_trunc(
      + return timespec64_trunc(
        ... );
        }
      
      @ depends on patch @
      identifier xtime;
      @@
       struct \( iattr \| inode \| kstat \) {
       ...
      -       struct timespec xtime;
      +       struct timespec64 xtime;
       ...
       }
      
      @ depends on patch @
      identifier t;
      @@
       struct inode_operations {
       ...
      int (*update_time) (...,
      -       struct timespec t,
      +       struct timespec64 t,
      ...);
       ...
       }
      
      @ depends on patch @
      identifier t;
      identifier fn_update_time =~ "update_time$";
      @@
       fn_update_time (...,
      - struct timespec *t,
      + struct timespec64 *t,
       ...) { ... }
      
      @ depends on patch @
      identifier t;
      @@
      lease_get_mtime( ... ,
      - struct timespec *t
      + struct timespec64 *t
        ) { ... }
      
      @te depends on patch forall@
      identifier ts;
      local idexpression struct inode *inode_node;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn_update_time =~ "update_time$";
      identifier fn;
      expression e, E3;
      local idexpression struct inode *node1;
      local idexpression struct inode *node2;
      local idexpression struct iattr *attr1;
      local idexpression struct iattr *attr2;
      local idexpression struct iattr attr;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      @@
      (
      (
      - struct timespec ts;
      + struct timespec64 ts;
      |
      - struct timespec ts = current_time(inode_node);
      + struct timespec64 ts = current_time(inode_node);
      )
      
      <+... when != ts
      (
      - timespec_equal(&inode_node->i_xtime, &ts)
      + timespec64_equal(&inode_node->i_xtime, &ts)
      |
      - timespec_equal(&ts, &inode_node->i_xtime)
      + timespec64_equal(&ts, &inode_node->i_xtime)
      |
      - timespec_compare(&inode_node->i_xtime, &ts)
      + timespec64_compare(&inode_node->i_xtime, &ts)
      |
      - timespec_compare(&ts, &inode_node->i_xtime)
      + timespec64_compare(&ts, &inode_node->i_xtime)
      |
      ts = current_time(e)
      |
      fn_update_time(..., &ts,...)
      |
      inode_node->i_xtime = ts
      |
      node1->i_xtime = ts
      |
      ts = inode_node->i_xtime
      |
      <+... attr1->ia_xtime ...+> = ts
      |
      ts = attr1->ia_xtime
      |
      ts.tv_sec
      |
      ts.tv_nsec
      |
      btrfs_set_stack_timespec_sec(..., ts.tv_sec)
      |
      btrfs_set_stack_timespec_nsec(..., ts.tv_nsec)
      |
      - ts = timespec64_to_timespec(
      + ts =
      ...
      -)
      |
      - ts = ktime_to_timespec(
      + ts = ktime_to_timespec64(
      ...)
      |
      - ts = E3
      + ts = timespec_to_timespec64(E3)
      |
      - ktime_get_real_ts(&ts)
      + ktime_get_real_ts64(&ts)
      |
      fn(...,
      - ts
      + timespec64_to_timespec(ts)
      ,...)
      )
      ...+>
      (
      <... when != ts
      - return ts;
      + return timespec64_to_timespec(ts);
      ...>
      )
      |
      - timespec_equal(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_equal(&node1->i_xtime2, &node2->i_xtime2)
      |
      - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2)
      + timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2)
      |
      - timespec_compare(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2)
      |
      node1->i_xtime1 =
      - timespec_trunc(attr1->ia_xtime1,
      + timespec64_trunc(attr1->ia_xtime1,
      ...)
      |
      - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2,
      + attr1->ia_xtime1 =  timespec64_trunc(attr2->ia_xtime2,
      ...)
      |
      - ktime_get_real_ts(&attr1->ia_xtime1)
      + ktime_get_real_ts64(&attr1->ia_xtime1)
      |
      - ktime_get_real_ts(&attr.ia_xtime1)
      + ktime_get_real_ts64(&attr.ia_xtime1)
      )
      
      @ depends on patch @
      struct inode *node;
      struct iattr *attr;
      identifier fn;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      expression e;
      @@
      (
      - fn(node->i_xtime);
      + fn(timespec64_to_timespec(node->i_xtime));
      |
       fn(...,
      - node->i_xtime);
      + timespec64_to_timespec(node->i_xtime));
      |
      - e = fn(attr->ia_xtime);
      + e = fn(timespec64_to_timespec(attr->ia_xtime));
      )
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      )
      ...+>
      }
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      struct kstat *stat;
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier i_xtime =~ "^i_[acm]time$";
      identifier xtime =~ "^[acm]time$";
      identifier fn, ret;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(stat->xtime);
      ret = fn (...,
      - &stat->xtime);
      + &ts);
      )
      ...+>
      }
      
      @ depends on patch @
      struct inode *node;
      struct inode *node2;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier i_xtime3 =~ "^i_[acm]time$";
      struct iattr *attrp;
      struct iattr *attrp2;
      struct iattr attr ;
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      struct kstat *stat;
      struct kstat stat1;
      struct timespec64 ts;
      identifier xtime =~ "^[acmb]time$";
      expression e;
      @@
      (
       \( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1;
      |
       node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       stat->xtime = node2->i_xtime1;
      |
       stat1.xtime = node2->i_xtime1;
      |
       \( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1;
      |
       \( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2;
      |
      - e = node->i_xtime1;
      + e = timespec64_to_timespec( node->i_xtime1 );
      |
      - e = attrp->ia_xtime1;
      + e = timespec64_to_timespec( attrp->ia_xtime1 );
      |
      node->i_xtime1 = current_time(...);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
       node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
      - node->i_xtime1 = e;
      + node->i_xtime1 = timespec_to_timespec64(e);
      )
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Cc: <anton@tuxera.com>
      Cc: <balbi@kernel.org>
      Cc: <bfields@fieldses.org>
      Cc: <darrick.wong@oracle.com>
      Cc: <dhowells@redhat.com>
      Cc: <dsterba@suse.com>
      Cc: <dwmw2@infradead.org>
      Cc: <hch@lst.de>
      Cc: <hirofumi@mail.parknet.co.jp>
      Cc: <hubcap@omnibond.com>
      Cc: <jack@suse.com>
      Cc: <jaegeuk@kernel.org>
      Cc: <jaharkes@cs.cmu.edu>
      Cc: <jslaby@suse.com>
      Cc: <keescook@chromium.org>
      Cc: <mark@fasheh.com>
      Cc: <miklos@szeredi.hu>
      Cc: <nico@linaro.org>
      Cc: <reiserfs-devel@vger.kernel.org>
      Cc: <richard@nod.at>
      Cc: <sage@redhat.com>
      Cc: <sfrench@samba.org>
      Cc: <swhiteho@redhat.com>
      Cc: <tj@kernel.org>
      Cc: <trond.myklebust@primarydata.com>
      Cc: <tytso@mit.edu>
      Cc: <viro@zeniv.linux.org.uk>
      95582b00
  18. 31 May, 2018 1 commit
    • A
      ovl: use inode_insert5() to hash a newly created inode · 01b39dcc
      Amir Goldstein committed
      Currently, there is a small window where ovl_obtain_alias() can
      race with ovl_instantiate() and create two different overlay inodes
      with the same underlying real non-dir non-hardlink inode.
      
      The race requires an adversary to guess the file handle of the
      yet to be created upper inode and decode the guessed file handle
      after ovl_create_real(), but before ovl_instantiate().
      This race does not affect overlay directory inodes, because those
      are decoded via ovl_lookup_real() and not with ovl_obtain_alias().
      
      This patch fixes the race by using inode_insert5() to add a newly
      created inode to the cache.
      
      If the newly created inode appears to already exist in the cache
      (hashed by the same real upper inode), we instantiate the dentry with
      the old inode and drop the new inode, instead of silently not hashing
      the new inode.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      01b39dcc