- 26 Oct, 2012 2 commits
-
-
Committed by Josef Bacik
We BUG if we fail to commit the transaction when creating a snapshot, which is just obnoxious. Remove the BUG_ON(). Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Committed by Lukas Czerner
Currently, if the len argument in btrfs_ioctl_fitrim() is smaller than one FSB (file system block), we continue and finally return 0 bytes discarded. However, if the length to discard is smaller than a file system block, we should really return EINVAL.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
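The length check described in this commit can be sketched in userspace C. This is only an illustration: `check_trim_len` and its parameters are hypothetical names, and the real check sits inside btrfs_ioctl_fitrim(), which is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel function): reject a trim request
 * whose length is shorter than one file system block.
 */
static int check_trim_len(uint64_t len, uint64_t block_size)
{
    assert(block_size > 0);

    if (len < block_size)
        return -EINVAL;   /* nothing could be discarded, so say so */
    return 0;
}
```

With a 4096-byte block, a 512-byte request is rejected with -EINVAL rather than reporting a successful discard of 0 bytes.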
-
- 09 Oct, 2012 2 commits
-
-
Committed by Stefan Behrens
So far the return code of barrier_all_devices() is ignored, which means that errors are ignored. The result can be a corrupt filesystem which is not consistent. This commit adds code to evaluate the return code of barrier_all_devices(). The normal btrfs_error() mechanism is used to switch the filesystem into read-only mode when errors are detected. In order to decide whether barrier_all_devices() should return error or success, the number of disks that are allowed to fail the barrier submission is calculated. This calculation accounts for the worst RAID level of metadata, system and data. If single, dup or RAID0 is in use, a single disk error is already considered to be fatal. Otherwise a single disk error is tolerated. The calculation of the number of disks that are tolerated to fail the barrier operation is performed when the filesystem gets mounted, when a balance operation is started and finished, and when devices are added or removed.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
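The tolerance calculation described in this commit can be sketched as follows. The enum, the function names and the choice of -EIO as the error value are assumptions made for illustration; the kernel works on block group flags and is not reproduced here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical profile names standing in for the kernel's block group flags. */
enum profile {
    PROFILE_SINGLE,
    PROFILE_DUP,
    PROFILE_RAID0,
    PROFILE_RAID1,
    PROFILE_RAID10
};

/* Number of devices that may fail a barrier for one profile. */
static int tolerated_for_profile(enum profile p)
{
    switch (p) {
    case PROFILE_SINGLE:
    case PROFILE_DUP:
    case PROFILE_RAID0:
        return 0;   /* a single disk error is already fatal */
    default:
        return 1;   /* a single disk error is tolerated */
    }
}

/* The worst (least tolerant) profile among data, metadata and system decides. */
static int tolerated_barrier_failures(enum profile data, enum profile metadata,
                                      enum profile sys)
{
    int t = tolerated_for_profile(data);
    int m = tolerated_for_profile(metadata);
    int s = tolerated_for_profile(sys);

    if (m < t)
        t = m;
    if (s < t)
        t = s;
    return t;
}

/* Returns 0 on success, -EIO (assumed) when too many devices failed. */
static int barrier_result(int failed_devices, int tolerated)
{
    assert(failed_devices >= 0 && tolerated >= 0);
    return failed_devices > tolerated ? -EIO : 0;
}
```

In this sketch a filesystem with RAID1 metadata but RAID0 data tolerates no failed device, because the least tolerant profile wins.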
-
Committed by Liu Bo
Btrfs uses an inclusive range end for lock_extent(), unlock_extent() and related functions, so we made off-by-one errors in file clone. This fixes it and also fixes some style problems.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
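The inclusive-end convention mentioned in this commit is easy to get wrong, so here is a minimal sketch of the arithmetic. `struct byte_range` and `inclusive_range` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* A byte range whose end is the last byte covered, not one past it. */
struct byte_range {
    uint64_t start;
    uint64_t end;   /* inclusive */
};

/* Convert (offset, length) into an inclusive range. */
static struct byte_range inclusive_range(uint64_t off, uint64_t len)
{
    struct byte_range r;

    assert(len > 0);
    r.start = off;
    r.end = off + len - 1;   /* off + len would lock one byte too many */
    return r;
}
```

For a 4096-byte region starting at offset 0, the end passed to a function with inclusive semantics is 4095, not 4096.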
-
- 04 Oct, 2012 1 commit
-
-
Committed by David Sterba
Hi, the patch is simple, but it has user-visible impact and I'm not quite sure how to resolve it. In short, $subj says it, chattr -C supports it and we want to use it. The conditions that actually allow changing the NOCOW flag are clear. What if I try to set the flag on a file that is not empty? Options:
1) whole ioctl will fail, EINVAL
2.1) ioctl will succeed, the NOCOW flag will be silently removed, but the file will stay COW-ed and checksummed
2.2) ioctl will succeed, flag will not be removed and a syslog message will warn that the COW flag has not been changed
2.2.1) ditto, no syslog message
Man page of chattr states that "If it is set on a file which already has data blocks, it is undefined when the blocks assigned to the file will be fully stable." Yes, it's undefined and with the current implementation it'll never happen. So from this end, the user cannot expect anything. I'm trying to find a reasonable behaviour, so that a command like 'chattr -R -aijS +C' to tweak a broad set of flags in a deep directory does not fail unnecessarily and does not pollute the log. My personal preference is 2.2.1, but my dev's opinion is skewed, not counting the fact that I know the code and otherwise would look there before consulting the documentation. The patch implements 2.2.1.
david
-------------8<-------------------
From: David Sterba <dsterba@suse.cz>
It's safe to turn off checksums for a zero sized file.
http://thread.gmane.org/gmane.comp.file-systems.btrfs/18030
"We cannot switch on NODATASUM for a file that already has extents that are checksummed. The invariant here is that either all the extents or none are checksummed.
Theoretically it's possible to add/remove all checksums from a given file, but it's a potentially long-running operation, the file has to be in some intermediate state where the checksums partially exist but have to be ignored (for the csum->nocsum) until the file is fully converted, this brings more special cases to extent handling, it has to survive power failure and remain consistent, and probably needs to be restarted after next mount."
Signed-off-by: David Sterba <dsterba@suse.cz>
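Option 2.2.1 from this message can be sketched in userspace C as follows. The struct, the flag values and the function name are all hypothetical; the sketch only models the decision, not the real ioctl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch only. */
#define FLAG_NODATACOW (1u << 0)
#define FLAG_NODATASUM (1u << 1)

/* Minimal stand-in for the inode state the decision depends on. */
struct inode_model {
    uint64_t size;
    unsigned int flags;
};

/*
 * Option 2.2.1: always report success. The flags change only for a
 * zero-sized file; a non-empty file is left untouched and nothing is logged.
 */
static int request_nocow(struct inode_model *inode)
{
    assert(inode != NULL);

    if (inode->size == 0)
        inode->flags |= FLAG_NODATACOW | FLAG_NODATASUM;
    return 0;
}
```

Returning 0 in both cases is what lets a recursive chattr run continue over a tree that mixes empty and non-empty files.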
-
- 02 Oct, 2012 8 commits
-
-
Committed by Liu Bo
This is the kernel-side change. Translation of logical to inode used to have an upper limit of 4k on the inode container's size, but the limit is not large enough for data with a great many refs, so when resolving a logical address, we can end up with
"ioctl ret=0, bytes_left=0, bytes_missing=19944, cnt=510, missed=2493"
This changes the upper limit to 64k and uses vmalloc instead of kmalloc to get memory more easily.
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
-
Committed by Liu Bo
We already have a helper, iterate_inodes_from_logical(), for logical resolve, so just use it.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
-
Committed by Liu Bo
In logical resolve, we parse extent_from_logical()'s 'ret' as a kind of flag. It is possible to lose our errors because (-EXXXX & BTRFS_EXTENT_FLAG_TREE_BLOCK) is true. I'm not sure if it is on purpose; it just looks too hacky if it is. I'd rather use a real flag and a 'ret' to catch errors.
Acked-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Liu Bo <liub.liubo@gmail.com>
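The hazard described in this commit can be reproduced in a few lines of userspace C. The flag value below is an assumption made for the example, and both function names are hypothetical; the point is only that a negative errno has many bits set, so testing it against a flag bit can succeed by accident.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for this sketch; any single low bit shows the same effect. */
#define EXTENT_FLAG_TREE_BLOCK (1ULL << 1)

/* Old pattern: one int carries both the flags and the error code. */
static int old_is_tree_block(int ret)
{
    return ((unsigned long long)ret & EXTENT_FLAG_TREE_BLOCK) != 0;
}

/* New pattern: errors travel in 'ret', flags travel separately. */
static int lookup_is_tree_block(int ret, uint64_t flags, int *is_tree_block)
{
    assert(is_tree_block != NULL);

    if (ret < 0)
        return ret;   /* the error is reported, never read as a flag */
    *is_tree_block = (flags & EXTENT_FLAG_TREE_BLOCK) != 0;
    return 0;
}
```

On Linux, -EIO is -5, whose two's complement representation has bit 1 set, so the old pattern treats the error as "tree block" while the new one returns the error unchanged.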
-
Committed by Liu Bo
We're going to use the flag EXTENT_DEFRAG to indicate which range belongs to defragmentation so that we can implement snapshot-aware defrag: we set the EXTENT_DEFRAG flag when dirtying the extents that need to be defragmented, so that later on the writeback thread can differentiate between normal writeback and writeback started by defragmentation.
Original-Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
-
Committed by Miao Xie
We should insert/update 6 items (root ref, root backref, dir item, dir index, root item and parent inode) when creating a snapshot, not 5 items. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
-
Committed by Miao Xie
Sometimes we need to choose the method of reservation according to the type of the block reservation, such as the reservation for the delayed inode update. Now we identify the type just by comparing the address of the reservation variants, which is very ugly if it is a temporary one because we need to compare it with all the common reservation variants. So we add a new "type" field to keep the type of the reservation variants.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
-
Committed by Josef Bacik
I audited all users of btrfs_drop_extents and found that nobody actually uses the hint_byte argument. I'm sure it was used for something at some point but it's not used now, and the way the pinning works the disk bytenr would never be immediately useful anyway, so let's just remove it. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Committed by Josef Bacik
At least for the vm workload. Currently on fsync we will
1) Truncate all items in the log tree for the given inode if they exist, and
2) Copy all items for a given inode into the log
The problem with this is that for things like VMs you can have lots of extents from the fragmented writing behavior, and worse yet you may have only modified a few extents, not the entire thing. This patch fixes this problem by tracking which transid modified our extent, and then when we do the tree logging we find all of the extents we've modified in our current transaction, sort them and commit them. We also only truncate up to the xattrs of the inode and copy that stuff in normally, and then just drop any extents in the range we have that exist in the log already. Here are some numbers of a 50 meg fio job that does random writes and fsync()s after every write
              Original   Patched
SATA drive    82KB/s     140KB/s
Fusion drive  431KB/s    2532KB/s
So around 2-6 times faster depending on your hardware. There are a few corner cases, for example if you truncate at all we have to do it the old way since there is no way to be sure what is in the log is ok. This probably could be done smarter, but if you write-fsync-truncate-write-fsync you deserve what you get. All this work is in RAM of course, so if your inode gets evicted from cache and you read it in and fsync it we'll do it the slow way if we are still in the same transaction that we last modified the inode in. The biggest cool part of this is that it requires no changes to the recovery code, so if you fsync with this patch and crash and load an old kernel, it will run the recovery and be a-ok. I have tested this pretty thoroughly with an fsync tester and everything comes back fine, as well as xfstests. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
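The core idea of this patch, logging only the extents touched in the current transaction and in sorted order, can be sketched in userspace C. This is a simplified model under stated assumptions: `struct extent`, its `generation` field and `collect_modified` are hypothetical names, and the exact-match test on the transaction id stands in for whatever bookkeeping the kernel really does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified in-memory extent record; 'generation' is the transid that last modified it. */
struct extent {
    uint64_t offset;
    uint64_t len;
    uint64_t generation;
};

/* Order extents by file offset for qsort(). */
static int cmp_offset(const void *a, const void *b)
{
    const struct extent *x = a;
    const struct extent *y = b;

    if (x->offset < y->offset)
        return -1;
    if (x->offset > y->offset)
        return 1;
    return 0;
}

/*
 * Copy the extents modified in transaction 'transid' into 'out'
 * (capacity at least n), sort them by offset and return how many there are.
 */
static size_t collect_modified(const struct extent *all, size_t n,
                               uint64_t transid, struct extent *out)
{
    size_t count = 0;

    assert(all != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (all[i].generation == transid)
            out[count++] = all[i];
    }
    qsort(out, count, sizeof(*out), cmp_offset);
    return count;
}
```

For a heavily fragmented file where only a handful of extents changed, the amount of work then scales with the number of modified extents rather than with the total extent count.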
-
- 29 Aug, 2012 1 commit
-
-
Committed by Dan Carpenter
"trans->transid" is cpu endian but we want to store the data as little endian. "item->ctime.nsec" is only 32 bits, not 64.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
-
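The endianness fix in Dan Carpenter's commit above is about storing values in a fixed on-disk byte order with the right width. In the kernel this conversion is done with helpers such as cpu_to_le64() and cpu_to_le32(); the userspace sketch below spells out the byte order by hand instead. `put_le64` and `put_le32` are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value as little endian, regardless of host byte order. */
static void put_le64(uint8_t *out, uint64_t v)
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(v >> (8 * i));   /* least significant byte first */
}

/* Store a 32-bit value as little endian; a 32-bit field must not get 8 bytes. */
static void put_le32(uint8_t *out, uint32_t v)
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(v >> (8 * i));
}
```

Using the 64-bit variant on a 32-bit field would write four bytes past the field, which is the second problem the commit message points out.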
- 09 Aug, 2012 1 commit
-
-
Committed by Alexander Block
We got a recursive lock in mksubvol because the caller already held a lock. I think we got into this due to a merge error. Commit a874a63e removed the mnt_want_write call from btrfs_mksubvol and added a replacement call to mnt_want_write_file in btrfs_ioctl_snap_create_transid. Commit e7848683 however tried to move all calls to mnt_want_write above i_mutex. So somewhere while merging this, it got mixed up. The solution is to remove the mnt_want_write call completely from mksubvol.
Reported-by: David Sterba <dave@jikos.cz>
Signed-off-by: Alexander Block <ablock84@googlemail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
-
- 31 Jul, 2012 1 commit
-
-
Committed by Jan Kara
When mnt_want_write() starts to handle freezing it will get full lock semantics requiring proper lock ordering. So push the mnt_want_write() call consistently outside of i_mutex.
CC: Chris Mason <chris.mason@oracle.com>
CC: linux-btrfs@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 26 Jul, 2012 3 commits
-
-
Committed by Alexander Block
This patch introduces the BTRFS_IOC_SEND ioctl that is required for send. It allows btrfs-progs to implement full and incremental sends. Patches for btrfs-progs will follow.
Signed-off-by: Alexander Block <ablock84@googlemail.com>
Reviewed-by: David Sterba <dave@jikos.cz>
Reviewed-by: Arne Jansen <sensille@gmx.net>
Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Reviewed-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
-
Committed by Alexander Block
This patch introduces uuids for subvolumes. Each subvolume has its own uuid. In case it was snapshotted, it also contains parent_uuid. In case it was received, it also contains received_uuid. It also introduces subvolume ctime/otime/stime/rtime. The first two are comparable to the times found in inodes. otime is the origin/creation time and ctime is the change time. stime/rtime are only valid on received subvolumes. stime is the time of the subvolume when it was sent. rtime is the time of the subvolume when it was received. In addition to the times, we have a transid for each time. They are updated at the same place as the times. btrfs receive uses stransid and rtransid to find out if a received subvolume changed in the meantime. If an older kernel mounts a filesystem with the extended fields, all fields become invalid. The next mount with a new kernel will detect this and reset the fields.
Signed-off-by: Alexander Block <ablock84@googlemail.com>
Reviewed-by: David Sterba <dave@jikos.cz>
Reviewed-by: Arne Jansen <sensille@gmx.net>
Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Reviewed-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
-
Committed by Mitch Harder
In support of the recently added capability to remount with lzo compression, provide a helper function to check the compression INCOMPAT flags when remounting with lzo compression, and set the flags if necessary. Also, use the new helper function when defragmenting with explicit lzo compression and when setting the default subvolume.
Signed-off-by: Mitch Harder <mitch.harder@sabayonlinux.org>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
-
- 25 Jul, 2012 1 commit
-
-
Committed by David Sterba
Lift the EXDEV condition and allow different root trees for files being cloned, then pass the source inode's root when searching for extents. Cloning is not allowed to cross vfsmounts, i.e. when two subvolumes from one filesystem are mounted separately.
Signed-off-by: David Sterba <dsterba@suse.cz>
-
- 24 Jul, 2012 6 commits
-
-
Committed by Liu Bo
$ mkfs.btrfs /dev/sdb7
$ btrfstune -S1 /dev/sdb7
$ mount /dev/sdb7 /mnt/btrfs
mount: block device /dev/sdb7 is write-protected, mounting read-only
$ btrfs dev add /dev/sdb8 /mnt/btrfs/
Now we get a btrfs in which the mnt flags are read-only but the sb flags are not. So for those ioctls that only check sb flags with MS_RDONLY, it is going to be a problem. Setting subvolume flags is such an ioctl; we should use mnt_want_write_file() to check RO flags.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
-
Committed by Liu Bo
mnt_want_write_file is faster when the file has been opened for write.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
-
Committed by Liu Bo
mnt_want_write() and mnt_want_write_file() will check sb->s_flags with MS_RDONLY, and we don't need to do it ourselves.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
-
Committed by Liu Bo
Move the check of write access to the mount into upper functions so that we can use mnt_want_write_file instead, which is faster than mnt_want_write.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
-
Committed by David Sterba
Commit c11d2c23 (Btrfs: add ioctl to get and reset the device stats) introduced two ioctls doing almost the same thing, distinguished by just the ioctl number which encodes "do reset after read". I have suggested http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg16604.html to implement it via the ioctl args. This hasn't happened, and I think we should use a cleaner way to pass flags and should not waste ioctl numbers.
CC: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: David Sterba <dsterba@suse.cz>
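The design choice argued for here, one request with a flags field instead of two nearly identical ioctl numbers, can be sketched in userspace C. The struct layout, the flag name and the counter count are assumptions made for this example and do not reproduce the kernel's ioctl argument structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes and flag values for this sketch only. */
#define NR_COUNTERS 5
#define STATS_FLAG_RESET (1ULL << 0)

/* Argument block: the caller sets 'flags', the handler fills 'values'. */
struct stats_args {
    uint64_t flags;
    uint64_t values[NR_COUNTERS];
};

/* One handler serves both "get" and "get and reset", selected by a flag. */
static void get_stats(uint64_t *counters, struct stats_args *args)
{
    assert(counters != NULL && args != NULL);

    memcpy(args->values, counters, sizeof(args->values));
    if (args->flags & STATS_FLAG_RESET)
        memset(counters, 0, sizeof(args->values));
}
```

Further behaviours can later be added as new flag bits without consuming another ioctl number.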
-
Committed by Andrew Mahone
Rebased on btrfs-next and retested. Inform should_defrag_range if BTRFS_DEFRAG_RANGE_COMPRESS is set. If so, skip checks for adjacent extents and extent size when deciding whether to defrag, as these can prevent an uncompressed and unfragmented file from being compressed as requested.
Signed-off-by: Andrew Mahone <andrew.mahone@gmail.com>
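The decision change in this commit can be modelled with a small function. This is a deliberately reduced sketch: `should_defrag`, its parameters and the two remaining checks are assumptions for illustration, and the real should_defrag_range() considers more state than is shown here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reduced model of the defrag decision. When compression was explicitly
 * requested, the size and neighbour checks are skipped so that an
 * unfragmented, uncompressed file still gets rewritten.
 */
static int should_defrag(uint64_t extent_len, uint64_t size_thresh,
                         int next_is_mergeable, int compress_requested)
{
    assert(size_thresh > 0);

    if (compress_requested)
        return 1;                 /* always rewrite to apply compression */
    if (extent_len >= size_thresh)
        return 0;                 /* already large enough */
    return next_is_mergeable;     /* small extent: only if it can be merged */
}
```

Without the early return, a single large extent would be skipped by the size check and never be compressed.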
-
- 23 Jul, 2012 1 commit
-
-
Committed by Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 12 Jul, 2012 2 commits
-
-
Committed by Arne Jansen
When creating a subvolume or snapshot, it is necessary to initialize the qgroup account with a copy of some other (tracking) qgroup. This patch adds parameters to the ioctls to pass the information from which qgroup to inherit.
Signed-off-by: Arne Jansen <sensille@gmx.net>
-
Committed by Arne Jansen
Ioctls to control the qgroup feature, such as adding and removing qgroups and assigning qgroups.
Signed-off-by: Arne Jansen <sensille@gmx.net>
-
- 16 Jun, 2012 1 commit
-
-
Committed by Chris Mason
Avoid warning on 32-bit machines.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
-
- 15 Jun, 2012 3 commits
-
-
Committed by Liu Bo
Seeding devices are not supposed to change any more.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
-
Committed by Li Zefan
If a file has 3 small extents:
| ext1 | ext2 | ext3 |
Running "btrfs fi defrag" will only defrag the last two extents, if those extent mappings haven't been read into memory from disk. This bug was introduced by commit 17ce6ef8 ("Btrfs: add a check to decide if we should defrag the range"). The cause is that the commit looked into previous and next extents using lookup_extent_mapping() only. While at it, remove the code that checks the previous extent, since it's sufficient to check the next extent.
Signed-off-by: Li Zefan <lizefan@huawei.com>
-
Committed by Josef Bacik
Al pointed out that we can just toss out the old name on a device and add a new one arbitrarily, so anybody who uses device->name in printk could possibly use freed memory. Instead of adding locking around all of this he suggested doing it with RCU, so I've introduced a struct rcu_string that does just that and have gone through and protected all accesses to device->name that aren't under the uuid_mutex with rcu_read_lock(). This protects us and I will use it for dealing with removing the device that we used to mount the file system in a later patch. Thanks,
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Josef Bacik <josef@redhat.com>
-
- 30 May, 2012 5 commits
-
-
Committed by Stefan Behrens
An ioctl interface is added to get the device statistic counters. A second ioctl is added to atomically get and reset these counters.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
-
Committed by Liu Bo
In normal cases, we would not be allowed to do balance in RO mode. However, when we're using a seeding device and adding another device to sprout, things will change:
$ mkfs.btrfs /dev/sdb7
$ btrfstune -S 1 /dev/sdb7
$ mount /dev/sdb7 /mnt/btrfs -o ro
$ btrfs fi bal /mnt/btrfs -----------------------> fail.
$ btrfs dev add /dev/sdb8 /mnt/btrfs
$ btrfs fi bal /mnt/btrfs -----------------------> works!
It should not be designed as an exception, and we'd better add another check for mnt flags.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Reviewed-by: Josef Bacik <josef@redhat.com>
-
Committed by Jim Meyering
A device with a name of length BTRFS_DEVICE_PATH_NAME_MAX or longer would not be NUL-terminated in the DEV_INFO ioctl result buffer.
Signed-off-by: Jim Meyering <meyering@redhat.com>
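The termination problem described in this commit is the classic strncpy() pitfall: when the source fills the whole buffer, no NUL is written. A minimal userspace sketch of the usual remedy follows. `DEV_PATH_MAX` is a deliberately small hypothetical size and `copy_dev_path` a hypothetical helper; neither is taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, deliberately small buffer size so truncation is easy to see. */
#define DEV_PATH_MAX 16

/* Copy a device path into a fixed-size buffer and always terminate it. */
static void copy_dev_path(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    strncpy(dst, src, DEV_PATH_MAX);
    dst[DEV_PATH_MAX - 1] = '\0';   /* strncpy() leaves no NUL if src is too long */
}
```

A name of DEV_PATH_MAX characters or more is truncated to DEV_PATH_MAX - 1 characters, so a reader of the buffer never runs past its end.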
-
Committed by Daniel J Blueman
Address some minor type issues identified by the sparse checker.
Signed-off-by: Daniel J Blueman <daniel@quora.org>
-
Committed by Josef Bacik
We've been keeping around the inode sequence number in hopes that somebody would use it, but nobody uses it and people actually use i_version, which serves the same purpose. So use i_version where we used the incore inode's sequence number; that way the sequence is updated properly across the board, and not just in file write. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
-
- 26 May, 2012 1 commit
-
-
Committed by Jan Schmidt
Three callers of btrfs_free_tree_block or btrfs_alloc_tree_block passed parameter for_cow = 1. In fact, these two functions should never mark their tree modification operations as for_cow, because they can change the number of blocks referenced by a tree. Hence, we remove the extra for_cow parameter from these functions and make them pass a zero down.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
-
- 19 Apr, 2012 1 commit
-
-
Committed by Stefan Behrens
When a filesystem is mounted with the degraded option, it is possible that some of the devices are not there. btrfs_ioctl_dev_info() crashes in this case because the device name is a NULL pointer. This ioctl was only used for scrub.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
-