- 07 5月, 2013 7 次提交
-
-
由 Eric Sandeen 提交于
Big patch, but all it does is add statics to functions which are in fact static, then remove the associated dead-code fallout. removed functions: btrfs_iref_to_path() __btrfs_lookup_delayed_deletion_item() __btrfs_search_delayed_insertion_item() __btrfs_search_delayed_deletion_item() find_eb_for_page() btrfs_find_block_group() range_straddles_pages() extent_range_uptodate() btrfs_file_extent_length() btrfs_scrub_cancel_devid() btrfs_start_transaction_lflush() btrfs_print_tree() is left because it is used for debugging. btrfs_start_transaction_lflush() and btrfs_reada_detach() are left for symmetry. ulist.c functions are left, another patch will take care of those. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Josef Bacik 提交于
A user sent me a btrfs-image that was panicing because of some corruption. This is because we pass in a bogus value to btrfs_num_copies, and it panics. Instead just return 1. We only call btrfs_num_copies to see if there are other copies to try and read for things, so if we just return 1 it will make the callers exit out with an appropriate error value. Thanks, Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Josef Bacik 提交于
Martin Steigerwald reported a BUG_ON() in btrfs_map_block where we didn't find a chunk for a particular block we were trying to map. This happened because the block was bogus. We shouldn't be BUG_ON()'ing in this case, just print a message and return an error. This came from reada_add_block and it appears to deal with an error fine so we should be good there. Thanks, Reported-by: NMartin Steigerwald <Martin@lichtvoll.de> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Miao Xie 提交于
The following case will make the incompat/compat flag of the super block be recovered. Task1 |Task2 flags = btrfs_super_incompat_flags(); | |flags = btrfs_super_incompat_flags(); flags |= new_flag1; | |flags |= new_flag2; btrfs_set_super_incompat_flags(flags); | |btrfs_set_super_incompat_flags(flags); the new_flag1 is recovered. In order to avoid this problem, we introduce a lock named super_lock into the btrfs_fs_info structure. If we want to update incompat/compat flags of the super block, we must hold it. Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Eric Sandeen 提交于
This: # mkfs.btrfs /dev/sdb{1,2} ; wipefs -a /dev/sdb1; mount /dev/sdb2 /mnt/test would lead to a blkdev open/close mismatch when the mount fails, and a permanently busy (opened O_EXCL) sdb2: # wipefs -a /dev/sdb2 wipefs: error: /dev/sdb2: probing initialization failed: Device or resource busy It's because btrfs_open_devices() may open some devices, fail on the last one, and return that failure stored in "ret." The mount then fails, but the caller then does not clean up the open devices. Chris assures me that: "btrfs_open_devices just means: go off and open every bdev you can from this uuid. It should return success if we opened any of them at all." So change the logic to ignore any open failures; just skip processing of that device. Later on it's decided whether we have enough devices to continue. Reported-by: NJan Safranek <jsafrane@redhat.com> Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Josef Bacik 提交于
A user sent me a btrfs-image of a file system that was panicing on mount during the log recovery. I had originally thought these problems were from a bug in the free space cache code, but that was just a symptom of the problem. The problem is if your application does something like this [prealloc][prealloc][prealloc] the internal extent maps will merge those all together into one extent map, even though on disk they are 3 separate extents. So if you go to write into one of these ranges the extent map will be right since we use the physical extent when doing the write, but when we log the extents they will use the wrong sizes for the remainder prealloc space. If this doesn't happen to trip up the free space cache (which it won't in a lot of cases) then you will get bogus entries in your extent tree which will screw stuff up later. The data and such will still work, but everything else is broken. This patch fixes this by not allowing extents that are on the modified list to be merged. This has the side effect that we are no longer adding everything to the modified list all the time, which means we now have to call btrfs_drop_extents every time we log an extent into the tree. So this allows me to drop all this speciality code I was using to get around calling btrfs_drop_extents. With this patch the testcase I've created no longer creates a bogus file system after replaying the log. Thanks, Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Simon Kirby 提交于
With more than one btrfs volume mounted, it can be very difficult to find out which volume is hitting an error. btrfs_error() will print this, but it is currently rigged as more of a fatal error handler, while many of the printk()s are currently for debugging and yet-unhandled cases. This patch just changes the functions where the device information is already available. Some cases remain where the root or fs_info is not passed to the function emitting the error. This may introduce some confusion with volumes backed by multiple devices emitting errors referring to the primary device in the set instead of the one on which the error occurred. Use btrfs_printk(fs_info, format, ...) rather than writing the device string every time, and introduce macro wrappers ala XFS for brevity. Since the function already cannot be used for continuations, print a newline as part of the btrfs_printk() message rather than at each caller. Signed-off-by: NSimon Kirby <sim@hostway.ca> Reviewed-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
- 22 3月, 2013 1 次提交
-
-
由 Josef Bacik 提交于
If you restore a btrfs-image file system and try to mount that file system we'll panic. That's because btrfs-image restores and just makes one big chunk to envelope the whole disk, since they are really only meant to be messed with by our btrfs-progs. So fix up btrfs_rmap_block and the callers of it for mount so that we no longer panic but instead just return an error and fail to mount. Thanks, Signed-off-by: NJosef Bacik <jbacik@fusionio.com> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
- 15 3月, 2013 1 次提交
-
-
由 Eric Sandeen 提交于
Doing this would reliably fail with -EBUSY for me: # mount /dev/sdb2 /mnt/scratch; umount /mnt/scratch; mkfs.btrfs -f /dev/sdb2 ... unable to open /dev/sdb2: Device or resource busy because mkfs.btrfs tries to open the device O_EXCL, and somebody still has it. Using systemtap to track bdev gets & puts shows a kworker thread doing a blkdev put after mkfs attempts a get; this is left over from the unmount path: btrfs_close_devices __btrfs_close_devices call_rcu(&device->rcu, free_device); free_device INIT_WORK(&device->rcu_work, __free_device); schedule_work(&device->rcu_work); so unmount might complete before __free_device fires & does its blkdev_put. Adding an rcu_barrier() to btrfs_close_devices() causes unmount to wait until all blkdev_put()s are done, and the device is truly free once unmount completes. Cc: stable@vger.kernel.org Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
- 07 3月, 2013 1 次提交
-
-
由 Ilya Dryomov 提交于
Raid56 merge (merge commit e942f883) had mistakenly removed a call to __cancel_balance(), which resulted in balance not cleaning up after itself after a successful finish. (Cleanup includes switching the state, removing the balance item and releasing mut_ex_op testnset lock.) Bring it back. Reported-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
- 05 3月, 2013 1 次提交
-
-
由 Liu Bo 提交于
Btrfs balance can easily hit BUG_ON in these places, but we want to it bail out gracefully after we force the whole filesystem to readonly. So we use btrfs_std_error hook in place of BUG_ON. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
- 01 3月, 2013 1 次提交
-
-
由 David Sterba 提交于
Dave pointed out that he saw messages from btrfs although there was no such filesystem on his computers. The automatic device scan is called on every new blockdevice if the usual distro udev rule set is used. The printk introduced in 6f60cbd3 was a remainder from copying portions of code from btrfs_get_bdev_and_sb which is used under different conditions and the warning makes sense there. Reported-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
- 27 2月, 2013 1 次提交
-
-
由 Qu Wenruo 提交于
Though most of the btrfs codes are using ALIGN macro for page alignment, there are still some codes using open-coded alignment like the following: ------ u64 mask = ((u64)root->stripesize - 1); u64 ret = (val + mask) & ~mask; ------ Or even hidden one: ------ num_bytes = (end - start + blocksize) & ~(blocksize - 1); ------ Sometimes these open-coded alignment is not so easy to understand for newbie like me. This commit changes the open-coded alignment to the ALIGN macro for a better readability. Also there is a previous patch from David Sterba with similar changes, but the patch is for 3.2 kernel and seems not merged. http://www.spinics.net/lists/linux-btrfs/msg12747.html Cc: David Sterba <dave@jikos.cz> Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
- 21 2月, 2013 9 次提交
-
-
由 Chris Mason 提交于
We try to limit the size of a chunk to 10GB, which keeps the unit of work reasonable during balance and resize operations. The limit checks were taking into account the number of copies of the data we had but what they really should be doing is comparing against the logical size of the chunk we're creating. This moves the code around a little to use the count of data stripes from raid5/6. Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Thomas Gleixner 提交于
__btrfs_close_devices() clones btrfs device structs with memcpy(). Some of the fields in the clone are reinitialized, but it's missing to init io_lock. In mainline this goes unnoticed, but on RT it leaves the plist pointing to the original about to be freed lock struct. Initialize io_lock after cloning, so no references to the original struct are left. Reported-and-tested-by: NMike Galbraith <efault@gmx.de> Cc: stable@vger.kernel.org Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Zach Brown 提交于
super.magic is an le64 but it's treated as an unterminated string when compared against BTRFS_MAGIC which is defined as a string. Instead define BTRFS_MAGIC as a normal hex value and use endian helpers to compare it to the super's magic. I tested this by mounting an fs made before the change and made sure that it didn't introduce sparse errors. This matches a similar cleanup that is pending in btrfs-progs. David Sterba pointed out that we should fix the kernel side as well :). Signed-off-by: NZach Brown <zab@redhat.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Ilya Dryomov 提交于
Enhance balance usage filter by making it possible to balance out only completely empty chunks. Today, usage filter properly acts on values from 1 to 99 inclusive, usage=100 selects all chunks, and usage=0 selects no chunks. This commit changes the usage=0 case: the new meaning is to restripe only completely empty chunks and nothing else. Suggested-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Ilya Dryomov 提交于
Commit 5af3e8cc introduced a use-after-free at volumes.c:3139: bctl is freed above in __cancel_balance() in all cases except for balance pause. Fix this by moving the offending check a couple statements above, the meaning of the check is preserved. Reported-by: NChris Mason <chris.mason@fusionio.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Eric Sandeen 提交于
WARN_ON isn't enough, we need to stop the loop if for any reason we would overrun the devices_info array. I tried to track down the connection between the length of the alloc_devices list and the rw_devices counter but it wasn't immediately obvious, so be defensive about it. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Josef Bacik 提交于
I got a double free error when unmounting a file system that failed to add a chunk during its operation. This is because we will kfree the mapping that we created but leave the extent_map in the em_tree for chunks. So to fix this just remove the extent_map when we error out so we don't run into this problem. Thanks, Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Josef Bacik 提交于
If we error out allocating a dev extent we will have already created the block group and such which will cause problems since the allocator may have tried to allocate out of the block group that no longer exists. This will cause BUG_ON()'s in the bio submission path. This also makes a failure to allocate a dev extent a non-abort error, we will just clean up the dev extents we did allocate and exit. Now if we fail to delete the dev extents we will abort since we can't have half of the dev extents hanging around, but this will make us much less likely to abort. Thanks, Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
由 Miao Xie 提交于
There is no lock to protect fs_info->avail_{data, metadata, system}_alloc_bits, it may introduce some problem, such as the wrong profile information, so we add a seqlock to protect them. Signed-off-by: NZhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
- 20 2月, 2013 1 次提交
-
-
由 Miao Xie 提交于
The current code of raid attr arry is hard to understand and it is easy to introduce some problem if we modify the array. So I changed it and made it more readable. Cc: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
- 16 2月, 2013 1 次提交
-
-
由 David Sterba 提交于
btrfs_scan_one_device is calling set_blocksize() which can race with a concurrent process making dirty page cache pages. It can end up dropping dirty page cache pages on the floor, which isn't very nice when someone is just running btrfs dev scan to find filesystems on the box. Now that udev is registering btrfs devices as it discovers them, we can actually end up racing with our own mkfs program too. When this happens, we drop some of the important blocks written by mkfs. This commit changes scan_one_device to read the super out of the page cache instead of trying to use bread. This way we don't have to care about the blocksize of the device. This also drops the invalidate_bdev() call. It wasn't very polite to invalidate during the scan either. mkfs is putting the super into the page cache, there's no reason to invalidate at this point. Signed-off-by: NDavid Sterba <dsterba@suse.cz> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
- 05 2月, 2013 1 次提交
-
-
由 Chris Mason 提交于
The device removal code was incorrectly checking against two different limits for raid5 and raid6. Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
- 02 2月, 2013 2 次提交
-
-
由 David Woodhouse 提交于
This builds on David Woodhouse's original Btrfs raid5/6 implementation. The code has changed quite a bit, blame Chris Mason for any bugs. Read/modify/write is done after the higher levels of the filesystem have prepared a given bio. This means the higher layers are not responsible for building full stripes, and they don't need to query for the topology of the extents that may get allocated during delayed allocation runs. It also means different files can easily share the same stripe. But, it does expose us to incorrect parity if we crash or lose power while doing a read/modify/write cycle. This will be addressed in a later commit. Scrub is unable to repair crc errors on raid5/6 chunks. Discard does not work on raid5/6 (yet) The stripe size is fixed at 64KiB per disk. This will be tunable in a later commit. Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Eric Sandeen 提交于
If we remove a missing device, bdev is null, and if we send that off to btrfs_kobject_uevent we'll panic. Signed-off-by: NEric Sandeen <sandeen@redhat.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
- 25 1月, 2013 1 次提交
-
-
由 Miao Xie 提交于
The max device number of single profile is 1, not 0 (0 means 'as many as possible'). Fix it. Cc: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com> Reviewed-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
-
- 22 1月, 2013 1 次提交
-
-
由 Ilya Dryomov 提交于
Commit 3fed40cc ("Btrfs: cleanup duplicated division functions"), which was merged into 3.8-rc1, has introduced a regression by removing logic that was guarding us against bad user input. Bring it back. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
- 20 1月, 2013 1 次提交
-
-
由 Ilya Dryomov 提交于
Balance pause/resume logic got broken by 5ac00add (went in into 3.8-rc1 as part of dev-replace merge). Offending commit took a stab at making mutually exclusive volume operations (add_dev, rm_dev, resize, balance, replace_dev) not block behind volume_mutex if another such operation is in progress and instead return an error right away. Balancing front-end relied on the blocking behaviour, so the fix is ugly, but short of a complete rework, it's the best we can do. Reported-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 15 1月, 2013 1 次提交
-
-
由 Lukas Czerner 提交于
When we're deleting the device we should get it in write mode since we're going to re-write the super block magic on that device. And it should fail if the device is read-only. Signed-off-by: NLukas Czerner <lczerner@redhat.com>
-
- 17 12月, 2012 4 次提交
-
-
由 Liu Bo 提交于
Raid properties can be shared among raid calculation code, we can put them into a global table to keep it simple. Signed-off-by: NLiu Bo <bo.li.liu@oracle.com> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Josef Bacik 提交于
We don't really need to copy extents from the source tree since we have all of the information already available to us in the extent_map tree. So instead just write the extents straight to the log tree and don't bother to copy the extent items from the source tree. Signed-off-by: NJosef Bacik <jbacik@fusionio.com> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Lukas Czerner 提交于
Currently udev does not know about the device being removed from the file system. This may result in the situation where we're unable to mount the file system by UUID or by LABEL because the by-uuid and by-label links may still point to the device which is no longer part of the btrfs file system and hence does not have any btrfs super block. It can be easily reproduced by the following: mkfs.btrfs -L bugfs /dev/loop[0-6] mount /dev/loop0 /mnt/test btrfs device delete /dev/loop0 /mnt/test umount /mnt/test mount LABEL=bugfs /mnt/test <---- this fails then see: ls -l /dev/disk/by-label/bugfs which will still point to the /dev/loop0 We did not noticed this before because libblkid would send the udev event for us when it notice that the link does not fit the reality, however it does not do that anymore and completely relies on udev information. Fix this by sending the KOBJ_CHANGE event to the bdev kobject after successful device removal. Note that this does not affect device addition, because we will open the device prior the addition from userspace and udev will notice that and reread the device afterwards. Signed-off-by: NLukas Czerner <lczerner@redhat.com> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Stefan Behrens 提交于
This issue was detected by the "0-DAY kernel build testing". fs/btrfs/volumes.c: In function 'btrfs_rm_device': fs/btrfs/volumes.c:1505:1: warning: label 'error_close' defined but not used [-Wunused-label] Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
- 13 12月, 2012 5 次提交
-
-
由 Stefan Behrens 提交于
Make the target disk of a running device replace operation available for reading. This is only used as a last ressort for the defect repair procedure. And it is dependent on the location of the data block to read, because during an ongoing device replace operation, the target drive is only partially filled with the filesystem data. Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Stefan Behrens 提交于
It is desirable to be able to configure the device replace procedure to avoid reading the source drive (the one to be copied) whenever possible. This is useful when the number of read errors on this disk is high, because it would delay the copy procedure alot. Therefore there is an option to avoid reading from the source disk unless the repair procedure really needs to access it. The regular read req asks for mapping the block with mirror_num == 0, in this case the source disk is avoided whenever possible. The repair code selects the mirror_num explicitly (mirror_num != 0), this case is not changed by this commit. Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Stefan Behrens 提交于
During a running dev replace operation, all write requests to the live filesystem are duplicated to also write to the target drive. Therefore btrfs_map_block() is changed to duplicate stripes that are written to the source disk of a device replace procedure to be written to the target disk as well. Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Stefan Behrens 提交于
Before this commit, btrfs_map_block() was called with REQ_WRITE in order to retrieve the list of mirrors for a disk block. This needs to be changed for the device replace procedure since it makes a difference whether you are asking for read mirrors or for locations to write to. GET_READ_MIRRORS is introduced as a new interface to call btrfs_map_block(). In the current commit, the functionality is not yet changed, only the interface for GET_READ_MIRRORS is introduced and all the places that should use this new interface are adapted. The reason that REQ_WRITE cannot be abused anymore to retrieve a list of read mirrors is that during a running dev replace operation all write requests to the live filesystem are duplicated to also write to the target drive. Keep in mind that the target disk is only partially a valid copy of the source disk while the operation is ongoing. All writes go to the target disk, but not all reads would return valid data on the target disk. Therefore it is not possible anymore to abuse a REQ_WRITE interface to find valid mirrors for a REQ_READ. Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-
由 Stefan Behrens 提交于
This commit contains all the essential changes to the core code of Btrfs for support of the device replace procedure. Signed-off-by: NStefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: NChris Mason <chris.mason@fusionio.com>
-