1. 12 Nov, 2013 1 commit
  2. 01 Sep, 2013 2 commits
    • Btrfs: check UUID tree during mount if required · 70f80175
      Stefan Behrens committed
      If the filesystem was mounted with an old kernel that was not
      aware of the UUID tree, this is detected by looking at the
      uuid_tree_generation field of the superblock (similar to how
      the free space cache does it). If a mismatch is detected
      at mount time, a thread is started that does two things:
      1. Iterate through the UUID tree, check each entry, and delete
         those entries that are no longer valid (i.e., the subvolume no
         longer exists or the value changed).
      2. Iterate through the root tree, for each found subvolume, add
         the UUID tree entries for the subvolume (if they are not
         already there).
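
The two passes above can be sketched as follows. This is a minimal in-memory model, not the kernel implementation: the struct names, the fixed-size array and uuid_tree_check_and_repair() are hypothetical stand-ins for the on-disk root tree and UUID tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES 16

/* Hypothetical stand-ins for root tree items and UUID tree items. */
struct subvol { uint64_t id; uint8_t uuid[16]; };
struct uuid_entry { uint8_t uuid[16]; uint64_t subvol_id; };

struct uuid_tree {
    struct uuid_entry entries[MAX_ENTRIES];
    size_t count;
};

static const struct subvol *find_subvol(const struct subvol *roots, size_t n,
                                        uint64_t id)
{
    for (size_t i = 0; i < n; i++)
        if (roots[i].id == id)
            return &roots[i];
    return NULL;
}

static bool tree_has(const struct uuid_tree *t, const struct subvol *s)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->entries[i].subvol_id == s->id &&
            memcmp(t->entries[i].uuid, s->uuid, 16) == 0)
            return true;
    return false;
}

/* Returns true if the tree matches the subvolume list afterwards. */
static bool uuid_tree_check_and_repair(struct uuid_tree *t,
                                       const struct subvol *roots, size_t n)
{
    size_t kept = 0;

    assert(t->count <= MAX_ENTRIES);

    /* Pass 1: drop entries whose subvolume is gone or whose UUID changed. */
    for (size_t i = 0; i < t->count; i++) {
        const struct subvol *s = find_subvol(roots, n, t->entries[i].subvol_id);

        if (s && memcmp(s->uuid, t->entries[i].uuid, 16) == 0)
            t->entries[kept++] = t->entries[i];
    }
    t->count = kept;

    /* Pass 2: add an entry for every subvolume that has none yet. */
    for (size_t i = 0; i < n; i++) {
        if (tree_has(t, &roots[i]))
            continue;
        if (t->count == MAX_ENTRIES)
            return false; /* model limit reached, repair incomplete */
        memcpy(t->entries[t->count].uuid, roots[i].uuid, 16);
        t->entries[t->count].subvol_id = roots[i].id;
        t->count++;
    }
    return true;
}
```

Running the deletion pass first means the second pass only ever compares against entries already known to be valid.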
      
      This mechanism is also used to handle and repair errors that
      happened during the initial creation and filling of the tree.
      The update of the uuid_tree_generation field (which indicates
      that the state of the UUID tree is up to date) is blocked until
      all create and repair operations are successfully completed.
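
The generation check described above can be illustrated with a small model. The struct and both function names are hypothetical; the sketch only assumes that the filesystem generation advances on every commit and that uuid_tree_generation is copied from it only while the UUID tree is known to be good.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two superblock fields involved. */
struct super_block_model {
    uint64_t generation;           /* bumped on every commit */
    uint64_t uuid_tree_generation; /* follows only if the UUID tree is good */
};

/* Mount-time decision: a mismatch means the UUID tree must be checked. */
static bool uuid_tree_needs_check(const struct super_block_model *sb)
{
    assert(sb != NULL);
    return sb->uuid_tree_generation != sb->generation;
}

/*
 * Commit: uuid_tree_ok is false for a kernel that does not know the UUID
 * tree, or while create/repair operations are still outstanding.
 */
static void commit_super(struct super_block_model *sb, bool uuid_tree_ok)
{
    assert(sb != NULL);
    sb->generation++;
    if (uuid_tree_ok)
        sb->uuid_tree_generation = sb->generation;
}
```

A commit made with uuid_tree_ok == false leaves the two fields different, so the next mount detects the mismatch and starts the check.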
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • Btrfs: introduce a tree for items that map UUIDs to something · 07b30a49
      Stefan Behrens committed
      Mapping UUIDs to subvolume IDs is an expensive operation today.
      The current algorithm even has quadratic cost (in the number of
      existing subvolumes), which means that it takes minutes to
      send/receive a single subvolume if 10,000 subvolumes exist. Even
      linear cost would be wasteful, because the data structures that
      map UUIDs to subvolume IDs are rebuilt every time a btrfs
      send/receive instance is started.
      
      It is much more efficient to maintain a searchable persistent data
      structure in the filesystem, one that is updated whenever a
      subvolume/snapshot is created or deleted, and when the received
      subvolume UUID is set by the btrfs-receive tool.
      
      This commit therefore adds kernel code that maintains data
      structures in the filesystem which allow a given UUID to be looked
      up quickly and the data assigned to it (such as the related
      subvolume ID) to be retrieved.
      
      This commit adds a new tree to hold UUID-to-data mapping items. The
      key of the items is the full UUID plus the key type BTRFS_UUID_KEY.
      Multiple data blocks can be stored for a given UUID using a
      type/length/value scheme.
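
As an illustration of a key that carries the full UUID, the sketch below packs the 16 UUID bytes into the two 64-bit key fields. It assumes the UUID is split into two little-endian halves stored in objectid and offset. The struct, the helper names and the numeric value of SKETCH_UUID_KEY_TYPE are hypothetical placeholders and are not taken from this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for BTRFS_UUID_KEY; the value here is an assumption. */
#define SKETCH_UUID_KEY_TYPE 251u

/* Hypothetical model of a (objectid, type, offset) tree key. */
struct sketch_key {
    uint64_t objectid;
    uint8_t type;
    uint64_t offset;
};

/* Read 8 bytes as a little-endian 64-bit value, independent of host order. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Assumed layout: first UUID half in objectid, second half in offset. */
static struct sketch_key uuid_to_key(const uint8_t uuid[16])
{
    struct sketch_key key;

    assert(uuid != NULL);
    key.objectid = load_le64(uuid);
    key.type = SKETCH_UUID_KEY_TYPE;
    key.offset = load_le64(uuid + 8);
    return key;
}
```

Because all 128 bits of the UUID end up in the key, two different UUIDs can never produce the same key, so no collision handling is needed on lookup.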
      
      What follows is the lengthy justification for adding a new tree
      instead of using the existing root tree:
      
      The first approach was not to create another tree for UUID items
      and instead to put the items into the top root tree.
      Unfortunately, this confused the algorithm that assigns the
      objectid of subvolumes and snapshots. The reason is that
      btrfs_find_free_objectid() calls btrfs_find_highest_objectid() for
      the first created subvol or snapshot after mounting a filesystem,
      and this function simply searches for the largest used objectid in
      the root tree keys to pick the next objectid to assign. Of course,
      the UUID keys have always been the ones with the highest offset
      value, and the next assigned subvol ID was wastefully huge.
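
The effect can be reproduced with a toy model of the allocation rule described above. next_objectid() is a hypothetical simplification that takes the largest objectid present and adds one; it is not the kernel function. The only constant assumed is that regular subvolume IDs start at 256.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First objectid available for subvolumes and snapshots. */
#define FIRST_FREE_OBJECTID 256ULL

/*
 * Hypothetical simplification: pick the next subvolume ID as
 * "largest objectid seen in the root tree keys, plus one".
 * Unsigned arithmetic wraps if the largest value is UINT64_MAX.
 */
static uint64_t next_objectid(const uint64_t *objectids, size_t n)
{
    uint64_t highest = FIRST_FREE_OBJECTID - 1;

    assert(objectids != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (objectids[i] > highest)
            highest = objectids[i];
    return highest + 1;
}
```

With only subvolume keys 256, 257 and 258 present, the next ID is 259. Adding a single key whose objectid is derived from UUID bytes, for example 0x1122334455667788, makes the next ID 0x1122334455667789, which skips almost the entire usable ID range.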
      
      Using any other existing tree did not look appropriate. Applying a
      workaround such as setting the objectid to zero in the UUID item
      key and implementing collision handling would either add
      limitations (in the case of a btrfs_extend_item() approach to
      handle the collisions) or a lot of complexity and source code (in
      the case of looking up a key that is free of collisions). Adding
      new code that introduces limitations is not good, and adding code
      that is complex and lengthy for no good reason is also not good.
      That is the justification for introducing a completely new tree.
      Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>