1. 17 Mar, 2020 13 commits
  2. 13 Mar, 2020 2 commits
    • ovl: fix lockdep warning for async write · c8536804
      Miklos Szeredi committed
      Lockdep reports "WARNING: lock held when returning to user space!" due to
      async write holding freeze lock over the write.  Apparently aio.c already
      deals with this by lying to lockdep about the state of the lock.
      
      Do the same here.  No need to check for S_IFREG() here since these file ops
      are regular-only.
      
      Reported-by: syzbot+9331a354f4f624a52a55@syzkaller.appspotmail.com
      Fixes: 2406a307 ("ovl: implement async IO routines")
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: fix some xino configurations · 53afcd31
      Amir Goldstein committed
      Fix up two bugs in the conversion to xino_mode:
      1. xino=off does not always end up in disabled mode
      2. xino=auto on 32bit arch should end up in disabled mode
      
      Take a proactive approach to disabling xino on 32bit kernel:
      1. Disable XINO_AUTO config during build time
      2. Disable xino with a warning on mount time
      
      As a by-product, xino=on on 32bit arch also ends up in disabled mode.
      We never intended to enable xino on 32bit arch and this will make the
      rest of the logic simpler.
      
      Fixes: 0f831ec8 ("ovl: simplify ovl_same_sb() helper")
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  3. 12 Mar, 2020 1 commit
  4. 03 Feb, 2020 1 commit
  5. 24 Jan, 2020 7 commits
    • ovl: add splice file read write helper · 1a980b8c
      Murphy Zhou committed
      Overlayfs currently falls back to the default file splice read
      and write, which are not compatible with overlayfs and return
      EFAULT. xfstests generic/591 can reproduce part of this.
      
      Tested this patch with xfstests auto group tests.
      Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: implement async IO routines · 2406a307
      Jiufei Xue committed
      A performance regression was observed since linux v4.19 with aio test using
      fio with iodepth 128 on overlayfs.  The queue depth of the device was
      always 1 which is unexpected.
      
      After investigation, it was found that commit 16914e6f ("ovl: add
      ovl_read_iter()") and commit 2a92e07e ("ovl: add ovl_write_iter()")
      resulted in vfs_iter_{read,write} being called on underlying filesystem,
      which always results in synchronous IO.
      
      Implement async IO for stacked reading and writing.  This resolves the
      performance regression.
      
      This is implemented by allocating a new kiocb for submitting the AIO
      request on the underlying filesystem.  When the request is completed, the
      new kiocb is freed and the completion callback is called on the original
      iocb.
      Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
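The completion chaining described in the commit above can be sketched in userspace C. This is only an illustration of the pattern (allocate a second request, free it on completion, then complete the original); `sketch_kiocb`, `sketch_aio_req`, `sketch_aio_prepare`, `sketch_aio_done` and `sketch_record_result` are hypothetical names, not the kernel's types or functions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the kernel's struct kiocb. */
struct sketch_kiocb {
    void (*ki_complete)(struct sketch_kiocb *iocb, long res);
    void *private_data;
};

/* Wrapper that ties the newly allocated request to the original one. */
struct sketch_aio_req {
    struct sketch_kiocb iocb;       /* submitted to the underlying fs */
    struct sketch_kiocb *orig_iocb; /* request issued on the overlay file */
};

/* Completion for the new request: free it, then complete the original. */
static void sketch_aio_done(struct sketch_kiocb *iocb, long res)
{
    struct sketch_aio_req *req = (struct sketch_aio_req *)
        ((char *)iocb - offsetof(struct sketch_aio_req, iocb));
    struct sketch_kiocb *orig = req->orig_iocb;

    free(req);
    orig->ki_complete(orig, res);
}

/* Allocate the request that would be handed to the underlying filesystem. */
static struct sketch_kiocb *sketch_aio_prepare(struct sketch_kiocb *orig)
{
    struct sketch_aio_req *req = calloc(1, sizeof(*req));

    if (!req)
        return NULL;
    req->orig_iocb = orig;
    req->iocb.ki_complete = sketch_aio_done;
    return &req->iocb;
}

/* Example completion callback: stores the result where private_data points. */
static void sketch_record_result(struct sketch_kiocb *iocb, long res)
{
    *(long *)iocb->private_data = res;
}
```

A caller would pass its own request to `sketch_aio_prepare()`, submit the returned request, and see its own `ki_complete` invoked once the lower request finishes.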
    • ovl: layer is const · 13464165
      Miklos Szeredi committed
      The ovl_layer struct is never modified except at initialization.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: fix corner case of non-constant st_dev;st_ino · b7bf9908
      Amir Goldstein committed
      On non-samefs overlay without xino, non pure upper inodes should use a
      pseudo_dev assigned to each unique lower fs, but if a lower layer is on the
      same fs as the upper layer, it has no pseudo_dev assigned.
      
      In this overlay layers setup:
       - two filesystems, A and B
       - upper layer is on A
       - lower layer 1 is also on A
       - lower layer 2 is on B
      
      A non pure upper overlay inode whose origin is in layer 1 will have the
      st_dev;st_ino values of the real lower inode before copy up and the
      st_dev;st_ino values of the real upper inode after copy up.
      
      Fix this inconsistency by assigning a unique pseudo_dev also for upper fs,
      that will be used as st_dev value along with the lower inode st_dev for
      overlay inodes in the case above.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: fix corner case of conflicting lower layer uuid · 1b81dddd
      Amir Goldstein committed
      This fixes ovl_lower_uuid_ok() to correctly detect the corner case:
       - two filesystems, A and B, both have null uuid
       - upper layer is on A
       - lower layer 1 is also on A
       - lower layer 2 is on B
      
      In this case, bad_uuid would not have been set for B, because the check
      only involved the list of lower fs.  Hence we'll try to decode a layer 2
      origin on layer 1 and fail.
      
      We check for conflicting (and null) uuid among all lower layers, including
      those layers that are on the same fs as the upper layer.
      Reported-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: generalize the lower_fs[] array · 07f1e596
      Amir Goldstein committed
      Rename lower_fs[] array to fs[], extend its size by one and use index fsid
      (instead of fsid-1) to access the fs[] array.
      
      Initialize fs[0] with upper fs values. fsid 0 is reserved even with lower
      only overlay, so fs[0] remains null in this case.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: simplify ovl_same_sb() helper · 0f831ec8
      Amir Goldstein committed
      No code uses the sb returned from this helper, so make it return a boolean
      and rename it to ovl_same_fs().
      
      The xino mode is irrelevant when all layers are on same fs, so instead of
      describing samefs with mode OVL_XINO_OFF, use a new xino_mode state, which
      is 0 in the case of samefs, -1 in the case of xino=off and > 0 with xino
      enabled.
      
      Create a new helper ovl_same_dev(), to use instead of the common check for
      (ovl_same_fs() || xinobits).
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
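The three xino_mode states described in the commit above (0 for samefs, -1 for xino=off, > 0 with xino enabled) can be sketched as small predicates. This is a userspace illustration based only on the commit message; the `sketch_` names and the struct are hypothetical, and the positive value is assumed here to carry the number of xino bits.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-mount overlay state. */
struct sketch_ovl_fs {
    int xino_mode; /* -1: xino=off, 0: samefs, > 0: xino enabled */
};

/* All layers are on the same filesystem. */
static bool sketch_same_fs(const struct sketch_ovl_fs *ofs)
{
    return ofs->xino_mode == 0;
}

/* Number of xino bits in use, or 0 when xino is not enabled. */
static unsigned int sketch_xino_bits(const struct sketch_ovl_fs *ofs)
{
    return ofs->xino_mode > 0 ? (unsigned int)ofs->xino_mode : 0;
}

/* Replaces the open-coded check (same_fs || xinobits). */
static bool sketch_same_dev(const struct sketch_ovl_fs *ofs)
{
    return sketch_same_fs(ofs) || sketch_xino_bits(ofs) != 0;
}
```

With this encoding the combined check reduces to `xino_mode >= 0`, which is one reason a single signed field is convenient.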
  6. 23 Jan, 2020 5 commits
    • ovl: generalize the lower_layers[] array · 94375f9d
      Amir Goldstein committed
      Rename lower_layers[] array to layers[], extend its size by one and
      initialize layers[0] with upper layer values.  Lower layers are now
      addressed with index 1..numlower.  layers[0] is reserved even with lower
      only overlay.
      
      [SzM: replace ofs->numlower with ofs->numlayer, the latter's value is
      incremented by one]
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: improving copy-up efficiency for big sparse file · b504c654
      Chengguang Xu committed
      The current copy-up is inefficient for big sparse files.
      It is not only slow but also wastes disk space
      when the target lower file contains a huge hole.
      This patch tries to recognize file holes and skip them
      during copy-up.
      
      The hole detection logic is as follows:
      when the next data position is larger than the current
      position, we skip that hole; otherwise we copy
      data in chunks of OVL_COPY_UP_CHUNK_SIZE. This
      may not recognize every kind of hole and sometimes
      skips only part of a hole area. However, it is
      enough for most use cases.
      
      Additionally, this optimization relies on the lseek(2)
      SEEK_DATA implementation, so filesystems
      that do not support this feature
      behave as before on copy-up.
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
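The hole-skipping loop described in the commit above can be approximated in userspace with lseek(2) and SEEK_DATA. This is a simplified sketch, not the kernel code: `sketch_copy_sparse` and `SKETCH_CHUNK_SIZE` are hypothetical names, the chunk size is arbitrary, and unlike the description it probes for data before every chunk rather than only when needed.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3 /* Linux value; assumption for libcs that hide it */
#endif

#define SKETCH_CHUNK_SIZE (64 * 1024) /* hypothetical chunk size */

/* Copy the first len bytes of in_fd to out_fd, skipping detected holes. */
static int sketch_copy_sparse(int in_fd, int out_fd, off_t len)
{
    static char buf[SKETCH_CHUNK_SIZE];
    off_t pos = 0;
    int skip_hole = 1;

    while (pos < len) {
        if (skip_hole) {
            off_t data = lseek(in_fd, pos, SEEK_DATA);

            if (data > pos) {
                pos = data;    /* next data is further on: skip the hole */
                continue;
            }
            if (data < 0 && errno == ENXIO)
                break;         /* only a hole remains up to end of file */
            if (data < 0)
                skip_hole = 0; /* SEEK_DATA unsupported: plain copy */
        }

        off_t left = len - pos;
        size_t want = left < SKETCH_CHUNK_SIZE ? (size_t)left : SKETCH_CHUNK_SIZE;
        ssize_t n = pread(in_fd, buf, want, pos);

        if (n < 0)
            return -1;
        if (n == 0)
            break;
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = pwrite(out_fd, buf + off, (size_t)(n - off), pos + off);

            if (w < 0)
                return -1;
            off += w;
        }
        pos += n;
    }
    /* Extend the target to the full length so trailing holes are kept. */
    return ftruncate(out_fd, len);
}
```

On a filesystem without SEEK_DATA support the sketch degrades to an ordinary chunked copy, which matches the fallback behavior the commit message describes.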
    • ovl: use ovl_inode_lock in ovl_llseek() · b1f9d385
      Amir Goldstein committed
      In ovl_llseek() we use the overlay inode rwsem to protect against
      concurrent modifications to real file f_pos, because we copy the overlay
      file f_pos to/from the real file f_pos.
      
      This caused a lockdep warning of locking order violation when the
      ovl_llseek() operation was called on a lower nested overlay layer while the
      upper layer fs sb_writers is held (with patch improving copy-up efficiency
      for big sparse file).
      
      Use the internal ovl_inode_lock() instead of the overlay inode rwsem in
      those cases. It is meant to be used for protecting against concurrent
      changes to overlay inode internal state.
      
      The locking order rules are documented to explain this case.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: use pr_fmt auto generate prefix · 1bd0a3ae
      lijiazi committed
      Use pr_fmt to auto-generate the "overlayfs: " prefix.
      Signed-off-by: lijiazi <lijiazi@xiaomi.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
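The pr_fmt technique named in the commit above relies on string literal concatenation at compile time. The sketch below shows the idea in userspace; `sketch_pr_warn` is a hypothetical stand-in for the kernel's printing macros, and `##__VA_ARGS__` is a GCC/Clang extension.

```c
#include <stdio.h>

/* Prefix applied to every message emitted through the macro below. */
#define pr_fmt(fmt) "overlayfs: " fmt

/* Hypothetical userspace stand-in for pr_warn(); formats into a buffer. */
#define sketch_pr_warn(buf, len, fmt, ...) \
    snprintf((buf), (len), pr_fmt(fmt), ##__VA_ARGS__)
```

Call sites then pass only the message text, for example `sketch_pr_warn(buf, sizeof(buf), "mount failed (%d)\n", err)`, and the prefix is added once in a single place.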
    • ovl: fix wrong WARN_ON() in ovl_cache_update_ino() · 4c37e71b
      Amir Goldstein committed
      The WARN_ON() that child entry is always on overlay st_dev became wrong
      when we allowed this function to update d_ino in non-samefs setup with xino
      enabled.
      
      It is not true in case of xino bits overflow on a non-dir inode.  Leave the
      WARN_ON() only for directories, where assertion is still true.
      
      Fixes: adbf4f7e ("ovl: consistent d_ino for non-samefs with xino")
      Cc: <stable@vger.kernel.org> # v4.17+
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  7. 10 Dec, 2019 5 commits
    • ovl: relax WARN_ON() on rename to self · 6889ee5a
      Amir Goldstein committed
      In ovl_rename(), if new upper is hardlinked to old upper underneath
      overlayfs before upper dirs are locked, user will get an ESTALE error
      and a WARN_ON will be printed.
      
      Changes to underlying layers while overlayfs is mounted may result in
      unexpected behavior, but it shouldn't crash the kernel and it shouldn't
      trigger WARN_ON() either, so relax this WARN_ON().
      
      Reported-by: syzbot+bb1836a212e69f8e201a@syzkaller.appspotmail.com
      Fixes: 804032fa ("ovl: don't check rename to self")
      Cc: <stable@vger.kernel.org> # v4.9+
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: fix corner case of non-unique st_dev;st_ino · 9c6d8f13
      Amir Goldstein committed
      On non-samefs overlay without xino, non pure upper inodes should use a
      pseudo_dev assigned to each unique lower fs and pure upper inodes use the
      real upper st_dev.
      
      It is fine for an overlay pure upper inode to use the same st_dev;st_ino
      values as the real upper inode, because the content of those two different
      filesystem objects is always the same.
      
      In this case, however:
       - two filesystems, A and B
       - upper layer is on A
       - lower layer 1 is also on A
       - lower layer 2 is on B
      
      A non pure upper overlay inode whose origin is in layer 1 will have the same
      st_dev;st_ino values as the real lower inode. This may result in a false
      positive result of 'diff' between the real lower and copied up overlay
      inode.
      
      Fix this by using the upper st_dev;st_ino values in this case.  This breaks
      the property of constant st_dev;st_ino across copy up of this case. This
      breakage will be fixed by a later patch.
      
      Fixes: 5148626b ("ovl: allocate anon bdev per unique lower fs")
      Cc: stable@vger.kernel.org # v4.17+
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: don't use a temp buf for encoding real fh · ec7bbb53
      Amir Goldstein committed
      We can allocate maximum fh size and encode into it directly.
      Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • ovl: make sure that real fid is 32bit aligned in memory · cbe7fba8
      Amir Goldstein committed
      Separate on-disk encoding from in-memory and on-wire representation
      of overlay file handle.
      
      In-memory and on-wire we only ever pass around pointers to struct
      ovl_fh, which encapsulates at offset 3 the on-disk format struct
      ovl_fb. struct ovl_fb encapsulates at offset 21 the real file handle.
      That makes sure that the real file handle is always 32bit aligned
      in-memory when passed down to the underlying filesystem.
      
      On-disk format remains the same and store/load are done into
      correctly aligned buffer.
      
      New nfs exported file handles are exported with aligned real fid.
      Old nfs file handles are copied to an aligned buffer before being
      decoded.
      Reported-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
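The offsets quoted in the commit above (on-disk header at offset 3, real file handle at offset 21 within it) add up to 24, a multiple of 4. The sketch below checks that arithmetic with packed structs. The field layout is an assumption chosen to produce a 21-byte header (five one-byte fields plus a 16-byte uuid), the `sketch_` names are hypothetical, and `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed on-disk header: 5 single bytes + 16-byte uuid = 21 bytes. */
struct sketch_fb {
    uint8_t version;
    uint8_t magic;
    uint8_t len;
    uint8_t flags;
    uint8_t type;
    uint8_t uuid[16];
    uint32_t fid[1]; /* placeholder for the variable-length real fid */
} __attribute__((packed));

/* In-memory wrapper: 3 bytes of padding push fid to offset 24. */
struct sketch_fh {
    uint8_t padding[3];
    struct sketch_fb fb;
} __attribute__((packed));

/* Offset of the real fid from the start of the in-memory handle. */
static size_t sketch_fid_offset(void)
{
    return offsetof(struct sketch_fh, fb) + offsetof(struct sketch_fb, fid);
}

/* Compile-time check that the real fid lands on a 32bit boundary. */
_Static_assert((offsetof(struct sketch_fh, fb) +
                offsetof(struct sketch_fb, fid)) % 4 == 0,
               "real fid must be 32bit aligned");
```

As long as the wrapper itself is allocated with at least 4-byte alignment, the real fid passed to the underlying filesystem is aligned too, while the bytes stored on disk start at `fb` and are unchanged.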
    • ovl: fix lookup failure on multi lower squashfs · 7e63c87f
      Amir Goldstein committed
      In the past, overlayfs required that lower fs have non null uuid in
      order to support nfs export and decode copy up origin file handles.
      
      Commit 9df085f3 ("ovl: relax requirement for non null uuid of
      lower fs") relaxed this requirement for nfs export support, as long
      as uuid (even if null) is unique among all lower fs.
      
      However, said commit unintentionally also relaxed the non null uuid
      requirement for decoding copy up origin file handles, regardless of
      the unique uuid requirement.
      
      Amend this mistake by disabling decoding of copy up origin file handle
      from lower fs with a conflicting uuid.
      
      We still encode copy up origin file handles from those fs, because
      file handles like those already exist in the wild and because they
      might provide useful information in the future.
      
      There is an unhandled corner case described by Miklos this way:
      - two filesystems, A and B, both have null uuid
      - upper layer is on A
      - lower layer 1 is also on A
      - lower layer 2 is on B
      
      In this case bad_uuid won't be set for B, because the check only
      involves the list of lower fs.  Hence we'll try to decode a layer 2
      origin on layer 1 and fail.
      
      We will deal with this corner case later.
      Reported-by: Colin Ian King <colin.king@canonical.com>
      Tested-by: Colin Ian King <colin.king@canonical.com>
      Link: https://lore.kernel.org/lkml/20191106234301.283006-1-colin.king@canonical.com/
      Fixes: 9df085f3 ("ovl: relax requirement for non null uuid ...")
      Cc: stable@vger.kernel.org # v4.20+
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  8. 16 Nov, 2019 1 commit
    • new helper: lookup_positive_unlocked() · 6c2d4798
      Al Viro committed
      Most of the callers of lookup_one_len_unlocked() treat negatives as
      ERR_PTR(-ENOENT).  Provide a helper that would do just that.  Note
      that a pinned positive dentry remains positive - its ->d_inode is
      stable, etc.; a pinned _negative_ dentry can become positive at any
      point as long as you are not holding its parent at least shared.
      So using lookup_one_len_unlocked() needs to be careful;
      lookup_positive_unlocked() is safer and that's what the callers
      end up open-coding anyway.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
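The wrapper pattern described in the commit above (look up, then turn a negative result into an ENOENT error pointer) can be sketched in userspace. Everything here is a mock: `sketch_dentry`, the table-based `sketch_lookup_unlocked()` and the error-pointer helpers are hypothetical stand-ins for the kernel's dentry, lookup_one_len_unlocked() and ERR_PTR() machinery.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_ERRNO 4095

struct sketch_dentry {
    const char *name;
    bool positive; /* true when the entry has an inode */
    int refcount;  /* pins held by callers */
};

/* Encode a negative errno value in a pointer, and decode it again. */
static void *sketch_err_ptr(long err) { return (void *)(intptr_t)err; }
static long sketch_ptr_err(const void *p) { return (long)(intptr_t)p; }
static bool sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Mock lookup: pins and returns the entry, whether positive or negative. */
static struct sketch_dentry *sketch_lookup_unlocked(struct sketch_dentry *tbl,
                                                    size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].name, name) == 0) {
            tbl[i].refcount++;
            return &tbl[i];
        }
    }
    return sketch_err_ptr(-ENOMEM); /* stands in for any lookup failure */
}

/* Wrapper: a negative result is unpinned and reported as -ENOENT. */
static struct sketch_dentry *sketch_lookup_positive_unlocked(
        struct sketch_dentry *tbl, size_t n, const char *name)
{
    struct sketch_dentry *ret = sketch_lookup_unlocked(tbl, n, name);

    if (!sketch_is_err(ret) && !ret->positive) {
        ret->refcount--; /* drop the pin taken by the lookup */
        ret = sketch_err_ptr(-ENOENT);
    }
    return ret;
}
```

Callers of the wrapper only ever see a pinned positive entry or an error pointer, so they never hold a negative entry that could turn positive behind their back.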
  9. 11 Sep, 2019 2 commits
  10. 16 Jul, 2019 1 commit
  11. 19 Jun, 2019 2 commits