1. 12 Apr, 2018 2 commits
    • A
      ovl: do not try to reconnect a disconnected origin dentry · 8a22efa1
      Amir Goldstein committed
      On lookup of non directory, we try to decode the origin file handle
      stored in upper inode. The origin file handle is supposed to be decoded
      to a disconnected non-dir dentry, which is fine, because we only need
      the lower inode of a copy up origin.
      
      However, if the origin file handle somehow turns out to be a directory
      we pay the expensive cost of reconnecting the directory dentry, only to
      get a mismatched file type and drop the dentry.
      
      Optimize this case by explicitly opting out of reconnecting the dentry.
      Opting-out of reconnect is done by passing a NULL acceptable callback
      to exportfs_decode_fh().
      
      While the case described above is a strange corner case that does not
      really need to be optimized, the API added for this optimization will
      be used by a following patch to optimize a more common case of decoding
      an overlayfs file handle.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
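The opt-out described in commit 8a22efa1 can be pictured with a small userspace model. This is only a sketch under stated assumptions: `decode_handle`, `struct fake_dentry`, `accept_fn` and `reconnect_calls` are hypothetical names and not the exportfs API. The one detail taken from the commit message is that a NULL acceptable callback means "do not reconnect".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a decoded dentry. */
struct fake_dentry {
    bool is_dir;
    bool connected;   /* true once the (expensive) reconnect has run */
};

/* Hypothetical callback type: returns true if the dentry is acceptable. */
typedef bool (*accept_fn)(void *ctx, const struct fake_dentry *d);

static int reconnect_calls;   /* counts how often the expensive path runs */

/*
 * Model of a decode helper: a directory is reconnected only when the
 * caller supplies an acceptable callback. Returns false if the callback
 * rejects the dentry.
 */
bool decode_handle(struct fake_dentry *d, accept_fn acceptable, void *ctx)
{
    assert(d != NULL);
    if (acceptable == NULL)
        return true;              /* caller opted out: hand back as-is */
    if (d->is_dir && !d->connected) {
        reconnect_calls++;        /* the costly step in this model */
        d->connected = true;
    }
    return acceptable(ctx, d);
}

/* Trivial callback that accepts every dentry. */
bool accept_any(void *ctx, const struct fake_dentry *d)
{
    (void)ctx;
    (void)d;
    return true;
}
```

In this model, calling `decode_handle(&d, NULL, NULL)` on a directory leaves `d.connected` false and never touches `reconnect_calls`, which is the saving the commit is after.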
    • A
      ovl: disambiguate ovl_encode_fh() · 5b2cccd3
      Amir Goldstein committed
      Rename ovl_encode_fh() to ovl_encode_real_fh() to differentiate from the
      exportfs function ovl_encode_inode_fh() and change the latter to
      ovl_encode_fh() to match the exportfs method name.
      
      Rename ovl_decode_fh() to ovl_decode_real_fh() for consistency.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  2. 16 Feb, 2018 1 commit
    • A
      ovl: check lower ancestry on encode of lower dir file handle · 2ca3c148
      Amir Goldstein committed
      This change relaxes copy up on encode of merge dir with lower layer > 1
      and handles the case of encoding a merge dir with lower layer 1, where an
      ancestor is a non-indexed merge dir. In that case, decode of the lower
      file handle will not have been possible if the non-indexed ancestor is
      redirected before or after encode.
      
      Before encoding a non-upper directory file handle from real layer N, we
      need to check if it will be possible to reconnect an overlay dentry from
      the real lower decoded dentry. This is done by following the overlay
      ancestry up to a "layer N connected" ancestor and verifying that all
      parents along the way are "layer N connectable". If an ancestor that is
      NOT "layer N connectable" is found, we need to copy up an ancestor, which
      is "layer N connectable", thus making that ancestor "layer N connected".
      For example:
      
       layer 1: /a
       layer 2: /a/b/c
      
      The overlay dentry /a is NOT "layer 2 connectable", because if dir /a is
      copied up and renamed, upper dir /a will be indexed by lower dir /a from
      layer 1. The dir /a from layer 2 will never be indexed, so the algorithm
      in ovl_lookup_real_ancestor() (*) will not be able to lookup a connected
      overlay dentry from the connected lower dentry /a/b/c.
      
      To avoid this problem at decode time, we need to copy up an ancestor of
      /a/b/c, which is "layer 2 connectable", at encode time. That ancestor is
      /a/b. After copy up (and index) of /a/b, it will become "layer 2 connected"
      and when the time comes to decode the file handle from lower dentry /a/b/c,
      ovl_lookup_real_ancestor() will find the indexed ancestor /a/b and decoding
      a connected overlay dentry will be accomplished.
      
      (*) the algorithm in ovl_lookup_real_ancestor() can be improved to lookup
      an entry /a in the lower layers above layer N and find the indexed dir /a
      from layer 1. If that improvement is made, then the check for "layer N
      connected" will need to verify there are no redirects in lower layers above
      layer N. In the example above, /a will be "layer 2 connectable". However,
      if layer 2 dir /a is a target of a layer 1 redirect, then /a will NOT be
      "layer 2 connectable":
      
       layer 1: /A (redirect = /a)
       layer 2: /a/b/c
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
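The ancestry walk described in commit 2ca3c148 can be sketched as a userspace model. Everything here is an assumption made for illustration: `struct dir_node`, its fields and `ancestor_to_copy_up` are hypothetical, "layer N connectable" is reduced to "the topmost lower dir comes from layer N", and the overlay root is treated as reachable from any layer. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of an overlay directory. */
struct dir_node {
    const char *name;
    int top_lower_layer;      /* layer of its topmost lower dir */
    bool connected;           /* copied up and indexed */
    struct dir_node *parent;  /* NULL for the overlay root */
};

/*
 * Walk up from @dir, which comes from real layer @layer. Return the
 * ancestor that would need a copy up before encoding, or NULL if a
 * "layer N connected" ancestor (or the root) is reached first.
 */
struct dir_node *ancestor_to_copy_up(struct dir_node *dir, int layer)
{
    struct dir_node *next = dir;

    assert(dir != NULL);
    while (next->parent != NULL) {
        struct dir_node *parent = next->parent;

        if (parent->parent == NULL)
            return NULL;          /* root: reachable from any layer */
        if (parent->top_lower_layer != layer)
            return next;          /* last "layer N connectable" node */
        if (parent->connected)
            return NULL;          /* "layer N connected" ancestor found */
        next = parent;
    }
    return NULL;
}
```

With the commit's example (layer 1: /a, layer 2: /a/b/c), encoding /a/b/c from layer 2 returns the node for /a/b. Once /a/b is marked connected, the same call returns NULL.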
  3. 24 Jan, 2018 17 commits
  4. 11 Dec, 2017 1 commit
  5. 10 Nov, 2017 1 commit
  6. 09 Nov, 2017 3 commits
    • C
      ovl: re-structure overlay lower layers in-memory · b9343632
      Chandan Rajendra committed
      Define new structures to represent overlay instance lower layers and
      overlay merge dir lower layers to make room for storing more per layer
      information in-memory.
      
      Instead of keeping the fs instance lower layers in an array of struct
      vfsmount, keep them in an array of new struct ovl_layer, that has a
      pointer to struct vfsmount.
      
      Instead of keeping the dentry lower layers in an array of struct path,
      keep them in an array of new struct ovl_path, that has a pointer to
      struct dentry and to struct ovl_layer.
      
      Add a small helper to find the fs layer id that corresponds to a lower
      struct ovl_path and use it in ovl_lookup().
      
      [amir: split re-structure from anonymous bdev patch]
      Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
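The data layout described in commit b9343632 might look roughly like the sketch below. The struct names `ovl_layer` and `ovl_path` come from the commit message; the field names, the opaque forward declarations, the helper name `find_layer_id` and its 1-based return convention are assumptions made for this userspace illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-ins for the kernel types named in the commit message. */
struct vfsmount;
struct dentry;

/* One lower layer of the overlay instance (field name assumed). */
struct ovl_layer {
    struct vfsmount *mnt;
};

/* One lower entry of an overlay dentry (field names assumed). */
struct ovl_path {
    struct ovl_layer *layer;
    struct dentry *dentry;
};

/*
 * Hypothetical helper: return the 1-based id of the layer that @path
 * points to, or 0 if it is not one of the @numlower entries in @layers.
 */
int find_layer_id(const struct ovl_layer *layers, size_t numlower,
                  const struct ovl_path *path)
{
    assert(layers != NULL && path != NULL);
    for (size_t i = 0; i < numlower; i++) {
        if (path->layer == &layers[i])
            return (int)i + 1;
    }
    return 0;
}
```

Keeping a pointer to the layer in each path, rather than a bare mount, is what leaves room for more per-layer data later without touching every dentry.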
    • A
      ovl: move include of ovl_entry.h into overlayfs.h · ee023c30
      Amir Goldstein committed
      Most overlayfs c files already explicitly include ovl_entry.h
      to use overlay entry struct definitions and upcoming changes
      are going to require even more c files to include this header.
      
      All overlayfs c files include overlayfs.h and overlayfs.h itself
      refers to some structs defined in ovl_entry.h, so it seems more
      logical to include ovl_entry.h from overlayfs.h than from c files.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • A
      ovl: no direct iteration for dir with origin xattr · b79e05aa
      Amir Goldstein committed
      If a non-merge dir in an overlay mount has an overlay.origin xattr, it
      means it was once an upper merge dir, which may contain whiteouts and
      then the lower dir was removed under it.
      
      Do not iterate real dir directly in this case to avoid exposing whiteouts.
      
      [SzM] Set OVL_WHITEOUTS for all merge directories as well.
      
      [amir] A directory that was just copied up does not have the OVL_WHITEOUTS
      flag. We need to set it to fix merge dir iteration.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  7. 24 Oct, 2017 1 commit
    • A
      ovl: fix EIO from lookup of non-indexed upper · 6eaf0111
      Amir Goldstein committed
      Commit fbaf94ee ("ovl: don't set origin on broken lower hardlink")
      attempts to avoid the condition of a non-indexed upper inode with a lower
      hardlink as origin. If this condition is found, lookup returns EIO.
      
      The protection added by that commit does not cover the case of a lower
      file that is not a hardlink when it is copied up (with index=off or
      index=on) and is then hardlinked while the overlay is offline.
      
      Changes to lower layer while overlayfs is offline should not result in
      unexpected behavior, so a permanent EIO error after creating a link in
      lower layer should not be considered as correct behavior.
      
      This fix replaces EIO error with success in cases where upper has origin
      but no index is found, or index is found that does not match upper
      inode. In those cases, lookup will not fail and the returned overlay inode
      will be hashed by upper inode instead of by lower origin inode.
      
      Fixes: 359f392c ("ovl: lookup index entry for copy up origin")
      Cc: <stable@vger.kernel.org> # v4.13
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  8. 05 Oct, 2017 1 commit
  9. 28 Jul, 2017 1 commit
    • M
      ovl: constant d_ino for non-merge dirs · 4edb83bb
      Miklos Szeredi committed
      Impure directories are ones which contain objects with origins (i.e. those
      that have been copied up).  These are relevant to readdir operation only
      because of the d_ino field, no other transformation is necessary.  Also a
      directory can become impure between two getdents(2) calls.
      
      This patch creates a cache for impure directories.  Unlike the cache for
      merged directories, this one only contains entries with origin and is not
      refcounted but has its lifetime tied to that of the dentry.
      
      Similarly to the merged cache, the impure cache is invalidated based on a
      version number.  This version number is incremented when an entry with
      origin is added or removed from the directory.
      
      If the cache is empty, then the impure xattr is removed from the directory.
      
      This patch also fixes up handling of d_ino for the ".." entry if the parent
      directory is merged.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  10. 20 Jul, 2017 1 commit
  11. 14 Jul, 2017 1 commit
  12. 05 Jul, 2017 10 commits
    • A
      ovl: cleanup orphan index entries · caf70cb2
      Amir Goldstein committed
      An index entry should live only as long as there are upper or lower
      hardlinks.
      
      Cleanup orphan index entries on mount and when dropping the last
      overlay inode nlink.
      
      When about to cleanup or link up to orphan index and the index inode
      nlink > 1, admit that something went wrong and adjust overlay nlink
      to index inode nlink - 1 to prevent it from dropping below zero.
      This could happen when adding lower hardlinks underneath a mounted
      overlay and then trying to unlink them.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • A
      ovl: persistent overlay inode nlink for indexed inodes · 5f8415d6
      Amir Goldstein committed
      With inodes index enabled, an overlay inode nlink counts the union of upper
      and non-covered lower hardlinks. During the lifetime of a non-pure upper
      inode, the following nlink modifying operations can happen:
      
      1. Lower hardlink copy up
      2. Upper hardlink created, unlinked or renamed over
      3. Lower hardlink whiteout or renamed over
      
      For the first, copy up case, the union nlink does not change, whether the
      operation succeeds or fails, but the upper inode nlink may change.
      Therefore, before copy up, we store the union nlink value relative to the
      lower inode nlink in the index inode xattr trusted.overlay.nlink.
      
      For the second, upper hardlink case, the union nlink should be incremented
      or decremented IFF the operation succeeds, aligned with nlink change of the
      upper inode. Therefore, before link/unlink/rename, we store the union nlink
      value relative to the upper inode nlink in the index inode.
      
      For the last, lower cover up case, we simplify things by preceding the
      whiteout or cover up with copy up. This makes sure that there is an index
      upper inode where the nlink xattr can be stored before the copied up upper
      entry is unlinked.
      
      Return the overlay inode nlinks for indexed upper inodes on stat(2).
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
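The "union nlink stored relative to a real inode" idea from commit 5f8415d6 comes down to simple arithmetic, sketched below. The function names are hypothetical, the xattr encoding is not modelled, and clamping the restored value to at least 1 is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>

/* Delta to persist: union nlink relative to a real (upper or lower) inode. */
int nlink_delta(unsigned int union_nlink, unsigned int real_nlink)
{
    return (int)union_nlink - (int)real_nlink;
}

/*
 * Recover the union nlink from the current real nlink and the stored
 * delta. Clamping to 1 is an assumption of this sketch.
 */
unsigned int nlink_restore(unsigned int real_nlink, int delta)
{
    long n = (long)real_nlink + delta;

    return n < 1 ? 1u : (unsigned int)n;
}
```

For the upper hardlink case: with a union nlink of 3 and an upper nlink of 2, the stored delta is +1. If a later unlink succeeds and the upper nlink drops to 1, the restored union nlink is 2; if the unlink fails and the upper nlink stays at 2, the restored value is still 3. That is the "changes if and only if the operation succeeds" property the commit describes.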
    • M
      ovl: add flag for upper in ovl_entry · 55acc661
      Miklos Szeredi committed
      For rename, we need to ensure that an upper alias exists for hard links
      before attempting the operation.  Introduce a flag in ovl_entry to track
      the state of the upper alias.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • A
      ovl: cleanup bad and stale index entries on mount · 415543d5
      Amir Goldstein committed
      Bad index entries are entries whose name does not match the
      origin file handle stored in trusted.overlay.origin xattr.
      Bad index entries could be a result of a system power off in
      the middle of copy up.
      
      Stale index entries are entries whose origin file handle is
      stale. Stale index entries could be a result of copying layers
      or removing lower entries while the overlay is not mounted.
      The case of copying layers should be detected earlier by the
      verification of upper root dir origin and index dir origin.
      
      Both bad and stale index entries are detected and removed
      on mount.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • A
      ovl: lookup index entry for copy up origin · 359f392c
      Amir Goldstein committed
      When inodes index feature is enabled, lookup in indexdir for the index
      entry of lower real inode or copy up origin inode. The index entry name
      is the hex representation of the lower inode file handle.
      
      If the index dentry is negative, then either no lower aliases have been
      copied up yet, or aliases have been copied up in older kernels and are
      not indexed.
      
      If the index dentry for a copy up origin inode is positive, but points
      to an inode different than the upper inode, then either the upper inode
      has been copied up and not indexed or it was indexed, but since then
      index dir was cleared. Either way, that index cannot be used to identify
      the overlay inode.
      
      If a positive dentry that matches the upper inode was found, then it is
      safe to use the copy up origin st_ino for upper hardlinks, because all
      indexed upper hardlinks are represented by the same overlay inode as the
      copy up origin.
      
      Set the INDEX type flag on an indexed upper dentry. A non-upper dentry
      may also have a positive index from copy up of another lower hardlink.
      This situation will be handled by following patches.
      
      Index lookup is going to be used to prevent breaking hardlinks on copy up.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
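Commit 359f392c says the index entry name is the hex representation of the lower inode file handle. A minimal userspace sketch of that naming step follows. The function name `fh_to_index_name` is hypothetical, and the choice of lowercase digits is an assumption since the commit message does not specify the case.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write the hex form of the @len bytes at @fh into @out, which must hold
 * at least 2 * len + 1 bytes. Returns the length of the resulting name.
 * Lowercase digits are an assumption of this sketch.
 */
size_t fh_to_index_name(const unsigned char *fh, size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";

    assert(fh != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = digits[fh[i] >> 4];     /* high nibble */
        out[2 * i + 1] = digits[fh[i] & 0x0f];   /* low nibble */
    }
    out[2 * len] = '\0';
    return 2 * len;
}
```

Because the name is derived only from the handle bytes, every lower alias of the same inode maps to the same index entry, which is what lets lookup find an existing copy up.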
    • A
      ovl: verify index dir matches upper dir · 54fb347e
      Amir Goldstein committed
      An index dir contains persistent hardlinks to files in upper dir.
      Therefore, we must never mount an existing index dir with a different
      upper dir.
      
      Store the upper root dir file handle in index dir inode when index
      dir is created and verify the file handle before using an existing
      index dir on mount.
      
      Add an 'is_upper' flag to the overlay file handle encoding and set it
      when encoding the upper root file handle. This is not critical for index
      dir verification, but it is good practice towards a standard overlayfs
      file handle format for NFS export.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • A
      ovl: verify upper root dir matches lower root dir · 8b88a2e6
      Amir Goldstein committed
      When inodes index feature is enabled, verify that the file handle stored
      in upper root dir matches the lower root dir or fail to mount.
      
      If upper root dir has no stored file handle, encode and store the lower
      root dir file handle in overlay.origin xattr.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
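The "verify, or store if absent" behaviour from commit 8b88a2e6 can be sketched as follows. The types and names (`struct stored_fh`, `verify_or_store`, `FH_MAX`, the result enum) are hypothetical, and the xattr is modelled as a plain in-memory buffer; the sketch only shows the three possible outcomes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FH_MAX 32   /* arbitrary buffer size for this model */

/* Hypothetical in-memory stand-in for the overlay.origin xattr. */
struct stored_fh {
    bool present;
    size_t len;
    unsigned char bytes[FH_MAX];
};

enum verify_result { VERIFY_MATCH, VERIFY_STORED, VERIFY_MISMATCH };

/*
 * Compare the lower root handle @fh with the one stored on the upper
 * root. If nothing is stored yet, store @fh instead.
 */
enum verify_result verify_or_store(struct stored_fh *xattr,
                                   const unsigned char *fh, size_t len)
{
    assert(xattr != NULL && fh != NULL && len <= FH_MAX);
    if (!xattr->present) {
        memcpy(xattr->bytes, fh, len);
        xattr->len = len;
        xattr->present = true;
        return VERIFY_STORED;
    }
    if (xattr->len == len && memcmp(xattr->bytes, fh, len) == 0)
        return VERIFY_MATCH;
    return VERIFY_MISMATCH;   /* in the commit's terms: fail to mount */
}
```

The first mount records the lower root, later mounts with the same lower root match, and a mount with a different lower root is reported as a mismatch.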
    • A
      ovl: introduce the inodes index dir feature · 02bcd157
      Amir Goldstein committed
      Create the index dir on mount. The index dir will contain hardlinks to
      upper inodes, named after the hex representation of their origin lower
      inodes.
      
      The index dir is going to be used to prevent breaking lower hardlinks
      on copy up and to implement overlayfs NFS export.
      
      Because the feature is not fully backward compat, enabling the feature
      is opt-in by config/module/mount option.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • A
      vfs: introduce inode 'inuse' lock · ad0af710
      Amir Goldstein committed
      Added an i_state flag I_INUSE and helpers to set/clear/test the bit.
      
      The 'inuse' lock is an 'advisory' inode lock, that can be used to extend
      exclusive create protection beyond parent->i_mutex lock among cooperating
      users.
      
      This is going to be used by overlayfs to get exclusive ownership on upper
      and work dirs among overlayfs mounts.
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
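The advisory 'inuse' lock from commit ad0af710 is essentially a test-and-set on a state bit. The userspace sketch below models it with C11 atomics. `struct fake_inode`, `FAKE_I_INUSE` and the three helper names are hypothetical, and the use of atomics instead of the kernel's own inode locking is an assumption of this model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define FAKE_I_INUSE (1u << 0)   /* hypothetical bit position */

/* Hypothetical stand-in for an inode with a state word. */
struct fake_inode {
    atomic_uint i_state;
};

/* Try to take the advisory lock; true if this caller now owns it. */
bool inuse_trylock(struct fake_inode *inode)
{
    assert(inode != NULL);
    unsigned int old = atomic_fetch_or(&inode->i_state, FAKE_I_INUSE);

    return (old & FAKE_I_INUSE) == 0;
}

/* Release the advisory lock by clearing the bit. */
void inuse_unlock(struct fake_inode *inode)
{
    assert(inode != NULL);
    atomic_fetch_and(&inode->i_state, ~FAKE_I_INUSE);
}

/* Report whether some cooperating user currently holds the lock. */
bool inuse_test(struct fake_inode *inode)
{
    assert(inode != NULL);
    return (atomic_load(&inode->i_state) & FAKE_I_INUSE) != 0;
}
```

Because the lock is advisory, it only protects against other users that also call the trylock helper, which matches the "cooperating users" wording in the commit message.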
    • M
      ovl: move impure to ovl_inode · 13c72075
      Miklos Szeredi committed
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>