    Btrfs: tree logging unlink/rename fixes · 12fcfd22
    Committed by Chris Mason
    The tree logging code allows individual files or directories to be logged
    without including operations on other files and directories in the FS.
    It tries to commit the minimal set of changes to disk in order to
    fsync the single file or directory that was sent to fsync or O_SYNC.
    
    The tree logging code was allowing files and directories to be unlinked
    if they were part of a rename operation where only one directory
    in the rename was in the fsync log.  This patch adds a few new rules
    to the tree logging.
    
    1) on rename or unlink, if the inode being unlinked isn't in the fsync
    log, we must force a full commit before doing an fsync of the directory
    where the unlink was done.  The commit isn't done during the unlink,
    but it is forced the next time we try to log the parent directory.
    
    Solution: record transid of last unlink/rename per directory when the
    directory wasn't already logged.  For renames this is only done when
    renaming to a different directory.
    
    mkdir foo/some_dir
    normal commit
    rename foo/some_dir foo2/some_dir
    mkdir foo/some_dir
    fsync foo/some_dir/some_file
    
    The fsync above will unlink the original some_dir without recording
    it in its new location (foo2).  After a crash, some_dir will be gone
    unless the fsync of some_file forces a full commit.
    
    2) we must log any new names for any file or dir that is in the fsync
    log.  This way we make sure not to lose files that are unlinked during
    the same transaction.
    
    2a) we must log any new names for any file or dir during rename
    when the directory they are being removed from was logged.
    
    2a is actually the more important variant.  Without the extra logging,
    a crash might unlink the old name without creating the new one.
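
The decision in rules 2 and 2a can be sketched as a single predicate. This is a hedged userspace model, assuming that "in the fsync log" means "logged in a transaction newer than the last committed one"; `struct model_inode` and `model_must_log_new_name` are hypothetical names and are not taken from the header below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the per-inode state; not the kernel's struct. */
struct model_inode {
	uint64_t logged_trans; /* transid in which the inode was last logged, 0 = never */
};

/*
 * Decide whether a new name created by link or rename has to be written to
 * the fsync log.
 *
 * Rule 2:  the inode itself is in the log.
 * Rule 2a: the directory the old name is removed from is in the log.
 *
 * old_dir may be NULL when no name is being removed (a plain link).
 */
static bool model_must_log_new_name(const struct model_inode *inode,
				    const struct model_inode *old_dir,
				    uint64_t last_committed)
{
	bool inode_logged;
	bool old_dir_logged;

	assert(inode != NULL);

	inode_logged = inode->logged_trans > last_committed;
	old_dir_logged = old_dir != NULL &&
			 old_dir->logged_trans > last_committed;

	return inode_logged || old_dir_logged;
}
```

When both checks fail, nothing about the file or its old directory is in the log, so replay cannot remove the old name and there is nothing extra to record.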
    
    3) after a crash, we must go through any directories with a link count
    of zero and redo the rm -rf.
    
    mkdir f1/foo
    normal commit
    rm -rf f1/foo
    fsync(f1)
    
    The directory f1/foo was fully removed from the FS, but fsync was never
    called on f1/foo, only on its parent dir f1.  After a crash the rm -rf must
    be replayed.  This must be able to recurse down the entire
    directory tree.  The inode link count fixup code takes care of the
    ugly details.
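
The recursive replay in rule 3 can be illustrated with a toy in-memory tree. This is a sketch under stated assumptions, not the link count fixup code: `struct model_node`, `model_remove_recursive` and `model_fixup_zero_link_dirs` are hypothetical, and the model assumes each child loses exactly one link when its parent directory goes away.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_CHILDREN 8

/* Hypothetical stand-in for an inode plus its directory entries. */
struct model_node {
	int is_dir;
	unsigned int nlink; /* link count as seen after log replay */
	int present;        /* 1 while the item still exists */
	struct model_node *children[MODEL_MAX_CHILDREN];
	size_t nr_children;
};

/* Remove a node and everything below it; returns the number of items removed. */
static size_t model_remove_recursive(struct model_node *node)
{
	size_t removed = 0;
	size_t i;

	assert(node != NULL);
	if (!node->present)
		return 0;

	for (i = 0; i < node->nr_children; i++) {
		struct model_node *child = node->children[i];

		/* the name inside the dying directory disappears */
		if (child->nlink > 0)
			child->nlink--;
		if (child->nlink == 0)
			removed += model_remove_recursive(child);
	}
	node->nr_children = 0;
	node->present = 0;
	return removed + 1;
}

/*
 * Walk the tree and finish the rm -rf for every directory whose link count
 * is zero. Returns the number of items removed.
 */
static size_t model_fixup_zero_link_dirs(struct model_node *node)
{
	size_t removed = 0;
	size_t i;

	assert(node != NULL);
	if (!node->present)
		return 0;
	if (node->is_dir && node->nlink == 0)
		return model_remove_recursive(node);

	for (i = 0; i < node->nr_children; i++)
		removed += model_fixup_zero_link_dirs(node->children[i]);
	return removed;
}
```

In terms of the example above, f1 keeps its link and survives, while foo has a link count of zero after replay, so foo and any file still recorded under it are removed in one pass.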

    Signed-off-by: Chris Mason <chris.mason@oracle.com>
tree-log.h 2.0 KB
/*
 * Copyright (C) 2008 Oracle.  All rights reserved.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public
 * License v2 as published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * General Public License for more details.
 *
 * You should have received a copy of the GNU General Public
 * License along with this program; if not, write to the
 * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
 * Boston, MA 02111-1307, USA.
 */

#ifndef __TREE_LOG_
#define __TREE_LOG_

int btrfs_sync_log(struct btrfs_trans_handle *trans,
		   struct btrfs_root *root);
int btrfs_free_log(struct btrfs_trans_handle *trans, struct btrfs_root *root);
int btrfs_recover_log_trees(struct btrfs_root *tree_root);
int btrfs_log_dentry_safe(struct btrfs_trans_handle *trans,
			  struct btrfs_root *root, struct dentry *dentry);
int btrfs_del_dir_entries_in_log(struct btrfs_trans_handle *trans,
				 struct btrfs_root *root,
				 const char *name, int name_len,
				 struct inode *dir, u64 index);
int btrfs_del_inode_ref_in_log(struct btrfs_trans_handle *trans,
			       struct btrfs_root *root,
			       const char *name, int name_len,
			       struct inode *inode, u64 dirid);
int btrfs_join_running_log_trans(struct btrfs_root *root);
int btrfs_end_log_trans(struct btrfs_root *root);
int btrfs_pin_log_trans(struct btrfs_root *root);
int btrfs_log_inode_parent(struct btrfs_trans_handle *trans,
		    struct btrfs_root *root, struct inode *inode,
		    struct dentry *parent, int exists_only);
void btrfs_record_unlink_dir(struct btrfs_trans_handle *trans,
			     struct inode *dir, struct inode *inode,
			     int for_rename);
int btrfs_log_new_name(struct btrfs_trans_handle *trans,
			struct inode *inode, struct inode *old_dir,
			struct dentry *parent);
#endif