- 09 Sep 2014, 8 commits
-
-
By Eric Sandeen

Move the resblks test out of xfs_dir_canenter and into the caller. This makes a little more sense on the face of it; xfs_dir_canenter immediately returns if resblks != 0, and given some of the comments preceding the calls:

    * Check for ability to enter directory entry, if no space reserved.

even more so. It also facilitates the next patch.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
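The shape of the change, as a hedged sketch (prototypes simplified, and the worker function below is hypothetical; the real helper takes more context):

```c
/* Before: the helper short-circuits internally. */
int
xfs_dir_canenter(struct xfs_trans *tp, struct xfs_inode *dp,
		 struct xfs_name *name, uint resblks)
{
	if (resblks)
		return 0;	/* space already reserved, nothing to check */
	return xfs_dir_canenter_check(tp, dp, name);	/* hypothetical worker */
}

/* After: the caller decides whether a check is needed at all, and the
 * resblks argument disappears from the helper entirely. */
	if (!resblks) {
		error = xfs_dir_canenter(tp, dp, name);
		if (error)
			goto out_trans_cancel;
	}
```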
-
By Eric Sandeen

In xlog_do_recovery_pass(), there are 2 distinct cases: non-wrapped and wrapped log recovery. If we find a wrapped log, we recover around the end of the log, and then handle the rest of recovery exactly as in the non-wrapped case - using exactly the same (duplicated) code. Rather than having the same code in both cases, we can get the wrapped portion out of the way first if needed, and then recover the non-wrapped portion of the log. There should be no functional change here, just code reorganization & deduplication.

The patch looks a bit bigger than it really is; the last hunk is whitespace changes (un-indenting).

Tested with xfstests "check -g log" on a stock configuration.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
-
By Eric Sandeen

For some reason, the older commit:

    965c8e59 lseek: the "whence" argument is called "whence"

        But the kernel decided to call it "origin" instead.
        Fix most of the sites.

left out xfs. So fix xfs.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
-
By Eric Sandeen

xfs_seek_hole & xfs_seek_data are remarkably similar; so much so that they can be combined, saving a fair bit of semi-complex code duplication.

The following patch passes generic/285 and generic/286, which specifically test seek behavior.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
-
By Brian Foster

XFS log recovery has been discovered to have race conditions with buffers when I/O errors occur. External tools are available to simulate I/O errors to XFS, but this alone is not sufficient for testing log recovery. XFS unconditionally resets the inactive region of the log prior to log recovery to avoid confusion over processing any partially written log records that might have been written before an unclean shutdown. Therefore, unconditional write I/O failures at mount time are caught by the reset sequence rather than log recovery and hinder the ability to test the latter.

The device-mapper dm-flakey module uses an up/down timer to define a cycle for when to fail I/Os. Create a pre log recovery delay tunable that can be used to coordinate XFS log recovery with I/O errors simulated by dm-flakey. This facilitates coordination in userspace that allows the reset of stale log blocks to succeed and writes due to log recovery to fail. For example, define a dm-flakey instance with an uptime long enough to allow log reset to succeed and a log recovery delay long enough to allow the dm-flakey uptime to expire.

The 'log_recovery_delay' sysfs tunable is exported under /sys/fs/xfs/debug and is only enabled for kernels compiled in XFS debug mode. The value is exported in units of seconds and allows for a delay of up to 60 seconds. Note that this is for XFS debug and test instrumentation purposes only and should not be used by applications. No delay is enabled by default.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
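A hedged sketch of how such a delay hook can look at the top of the mount-time recovery path; the tunable name follows the commit text, the surrounding code is illustrative:

```c
#ifdef DEBUG
	/*
	 * Debug-only: stall before replaying the log so a dm-flakey
	 * "up" window can expire and subsequent recovery writes fail.
	 */
	if (xfs_globals.log_recovery_delay) {
		xfs_notice(log->l_mp, "Delaying log recovery for %d seconds.",
			   xfs_globals.log_recovery_delay);
		msleep(xfs_globals.log_recovery_delay * 1000);
	}
#endif
```

In use, the delay value just needs to exceed the dm-flakey uptime so that the stale-log reset writes land inside the "up" window and the recovery writes land after it.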
-
By Brian Foster

Create a top-level debug directory for global debug sysfs attributes. This directory is added and removed on XFS module initialization and removal respectively, for DEBUG mode kernels only. It typically resides at /sys/fs/xfs/debug. It is located at the top level of the xfs sysfs hierarchy as attributes might define global behavior or behavior that must be configured before an xfs mount is available (e.g., log recovery behavior).

Define the global debug kobject that represents the debug sysfs directory and add generic attribute show/store helpers to support future attributes. No debug attributes are exported as of yet.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
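A minimal sketch of generic show/store plumbing of the kind described, with illustrative names (the real structure layout and helpers may differ):

```c
struct xfs_sysfs_attr {
	struct attribute	attr;
	ssize_t			(*show)(char *buf, void *data);
	ssize_t			(*store)(const char *buf, size_t count,
					 void *data);
};

#define to_xfs_attr(a)	container_of((a), struct xfs_sysfs_attr, attr)

/* Generic helpers: dispatch to the per-attribute callbacks, if any. */
static ssize_t
xfs_sysfs_show(struct kobject *kobj, struct attribute *attr, char *buf)
{
	struct xfs_sysfs_attr *xfs_attr = to_xfs_attr(attr);

	return xfs_attr->show ? xfs_attr->show(buf, NULL) : 0;
}

static ssize_t
xfs_sysfs_store(struct kobject *kobj, struct attribute *attr,
		const char *buf, size_t count)
{
	struct xfs_sysfs_attr *xfs_attr = to_xfs_attr(attr);

	return xfs_attr->store ? xfs_attr->store(buf, count, NULL) : count;
}

static const struct sysfs_ops xfs_sysfs_ops = {
	.show	= xfs_sysfs_show,
	.store	= xfs_sysfs_store,
};
```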
-
By Eric Sandeen

These were exposed by fsfuzzer runs; without them we fail in various exciting and sometimes convoluted ways when we encounter disk corruption.

Without the MAXLEVELS tests we tend to walk off the end of an array in a loop like this:

	for (i = 0; i < cur->bc_nlevels; i++) {
		if (cur->bc_bufs[i])

Without the dirblklog test we try to allocate more memory than we could possibly hope for and loop forever:

	xfs_dabuf_map()
		nfsb = mp->m_dir_geo->fsbcount;
		irecs = kmem_zalloc(sizeof(irec) * nfsb, KM_SLEEP...

As for the logbsize check, that's the convoluted one. If logbsize is specified at mount time, it's sanitized in xfs_parseargs; in particular it makes sure that it's not > XLOG_MAX_RECORD_BSIZE. If not specified at mount time, it comes from the superblock via sb_logsunit; this is limited to 256k at mkfs time as well, and it's copied into m_logbsize in xfs_finish_flags(). However, if for some reason the on-disk value is corrupt and too large, nothing catches it. It's a circuitous path, but that size eventually finds its way to places that make the kernel very unhappy, leading to oopses in xlog_pack_data() because we use the size as an index into iclog->ic_data, but the array is not necessarily that big.

Anyway - bounds checking when we read from disk is a good thing!

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
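As a hedged illustration of the kind of verifier logic involved (the field and macro names are existing XFS identifiers, but the exact checks the patch adds may differ):

```c
	/*
	 * Illustrative on-disk bounds checks of the sort described
	 * above; not the literal patch.
	 */
	if (sbp->sb_blocklog + sbp->sb_dirblklog > XFS_MAX_BLOCKSIZE_LOG)
		return false;		/* dir block size insanely large */

	if (sbp->sb_logsunit > XLOG_MAX_RECORD_BSIZE)
		return false;		/* would over-index iclog->ic_data */
```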
-
By Brian Foster

Workqueues must be explicitly set as freezable to ensure they are frozen in the associated part of the hibernation/suspend sequence. Freezing of workqueues and kernel threads is important to ensure that modifications are not made on-disk after the hibernation image has been created. Otherwise, the in-memory state can become inconsistent with what is on disk and eventually lead to filesystem corruption. We have reports of free space btree corruptions that occur immediately after restore from hibernate that suggest the xfs-eofblocks workqueue could be causing such problems if it races with hibernation.

Mark all of the internal XFS workqueues as freezable to ensure nothing changes on-disk once the freezer infrastructure freezes kernel threads and creates the hibernation image.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reported-by: Carlos E. R. <carlos.e.r@opensuse.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
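The mechanism is the WQ_FREEZABLE flag at workqueue allocation time; a hedged example of the pattern (the actual queue names and flag combinations in the patch may differ):

```c
	/*
	 * Freezable: the freezer drains and parks this queue before
	 * the hibernation image is written, so no work item can dirty
	 * the filesystem behind the image's back.
	 */
	mp->m_eofblocks_workqueue = alloc_workqueue("xfs-eofblocks/%s",
			WQ_FREEZABLE | WQ_MEM_RECLAIM, 0, mp->m_fsname);
	if (!mp->m_eofblocks_workqueue)
		goto out_destroy_log;
```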
-
- 25 Aug 2014, 1 commit
-
-
By Benjamin LaHaise

As reported by Dan Aloni, commit f8567a38 ("aio: fix aio request leak when events are reaped by userspace") introduces a regression when user code attempts to perform io_submit() with more events than are available in the ring buffer. Reverting that commit would reintroduce a regression when user space event reaping is used.

Fixing this bug is a bit more involved than the previous attempts to fix this regression. Since we do not have a single point at which we can count events as being reaped by user space and io_getevents(), we have to track event completion by looking at the number of events left in the event ring. So long as there are as many events in the ring buffer as there have been completion events generated, we cannot call put_reqs_available(). The code to check for this is now placed in refill_reqs_available().

A test program from Dan and modified by me for verifying this bug is available at http://www.kvack.org/~bcrl/20140824-aio_bug.c .

Reported-by: Dan Aloni <dan@kernelim.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Acked-by: Dan Aloni <dan@kernelim.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: stable@vger.kernel.org # v3.16 and anything that f8567a38 was backported to
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
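A sketch of the accounting this describes, assuming a simple ring with head/tail indices (simplified from the description, not necessarily the literal merged code):

```c
/*
 * Only hand request slots back once the corresponding completion
 * events have actually left the ring, whether reaped by the kernel
 * or by user space.  ctx->completed_events counts completions posted
 * so far.
 */
static void refill_reqs_available(struct kioctx *ctx, unsigned head,
				  unsigned tail)
{
	unsigned events_in_ring, completed;

	head %= ctx->nr_events;		/* clamp: userland can write head */
	if (head <= tail)
		events_in_ring = tail - head;
	else
		events_in_ring = ctx->nr_events - (head - tail);

	completed = ctx->completed_events;
	if (completed > events_in_ring)
		completed -= events_in_ring;	/* these have been reaped */
	else
		completed = 0;			/* all still in the ring */

	if (completed) {
		ctx->completed_events -= completed;
		put_reqs_available(ctx, completed);
	}
}
```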
-
- 23 Aug 2014, 8 commits
-
-
By David Jeffery

If a SIGKILL is sent to a task waiting in __nfs_iocounter_wait, it will busy-wait or soft lockup in its while loop. nfs_wait_bit_killable won't sleep, and the loop won't exit on the error return. Stop the busy-wait by breaking out of the loop when nfs_wait_bit_killable returns an error.

Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
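A sketch of the fixed loop shape (simplified from the description; the real function sets up a wait-bit queue around this):

```c
	/*
	 * Illustrative: exit the wait loop when nfs_wait_bit_killable()
	 * reports a fatal signal instead of spinning on io_count.
	 */
	do {
		ret = nfs_wait_bit_killable(&q.key);
		if (ret)
			break;		/* SIGKILL: stop busy-waiting */
	} while (atomic_read(&c->io_count) != 0);
```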
-
By Weston Andros Adamson

Commit 6094f838 ("nfs: allow coalescing of subpage requests") got rid of the requirement that requests cover whole pages, but it made some incorrect assumptions. It turns out that callers of this interface can map adjacent requests (by file position as seen by req_offset + req->wb_bytes) to different pages, even when they could share a page. An example is the direct I/O interface - iov_iter_get_pages_alloc may return one segment with a partial page filled and the next segment (which is adjacent in the file position) starts with a new page.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
By Weston Andros Adamson

Adjacent requests that share the same page are allowed, but should only use one entry in the page vector. This avoids overrunning the page vector - it is sized based on how many bytes there are, not by request count.

This fixes issues that manifest as "Redzone overwritten" bugs (the vector overrun) and hangs waiting on page read / write, as it waits on the same page more than once.

This also adds bounds checking to the page vector with a graceful failure (WARN_ON_ONCE and pgio error returned to application).

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
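A hedged sketch of the page-vector accounting and the graceful bounds check (variable names assumed from the description):

```c
	/*
	 * Inside the per-request loop: only consume a new page-vector
	 * slot when the request starts a page we haven't seen yet.
	 */
	if (!last_page || last_page != req->wb_page) {
		pageused++;
		if (pageused > pagecount)
			break;	/* accounting went wrong: bail out */
		*pages++ = last_page = req->wb_page;
	}

	/* After the loop: fail gracefully instead of overrunning. */
	if (WARN_ON_ONCE(pageused != pagecount))
		return nfs_pgio_error(desc, hdr);
```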
-
By Weston Andros Adamson

This handles the 'nonblock=false' case in nfs_lock_and_join_requests. If the group is already locked and blocking is allowed, drop the inode lock and wait for the group lock to be cleared before trying it all again. This should fix warnings found in peterz's tree (sched/wait branch), where might_sleep() checks are added to wait.[ch].

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
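A hedged sketch of the retry shape this describes; nfs_page_group_lock_wait is assumed here as the blocking helper, and the lookup details are elided:

```c
try_again:
	spin_lock(&inode->i_lock);
	/* (head request lookup happens here) */

	ret = nfs_page_group_lock(head, true);	/* always try nonblocking */
	if (ret < 0) {
		spin_unlock(&inode->i_lock);	/* never sleep under i_lock */

		if (!nonblock && ret == -EAGAIN) {
			nfs_page_group_lock_wait(head);	/* block until free */
			nfs_release_request(head);
			goto try_again;		/* re-validate from scratch */
		}

		nfs_release_request(head);
		return ERR_PTR(ret);
	}
```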
-
By Weston Andros Adamson

This fixes handling of errors from nfs_page_group_lock in nfs_lock_and_join_requests. It now releases the inode lock and the reference to the head request.

Reported-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
By Weston Andros Adamson

__nfs_pageio_add_request was calling nfs_page_group_lock nonblocking, but this can return -EAGAIN which would end up passing -EIO to the application. There is no reason not to block in this path, so change the two calls to do so. Also, there is no need to check the return value of nfs_page_group_lock when nonblock=false, so remove the error handling code.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
By Weston Andros Adamson

nfs_page_group_lock was calling wait_on_bit_lock even when told not to block. Fix by first trying test_and_set_bit, followed by wait_on_bit_lock if and only if blocking is allowed. Return -EAGAIN if nonblocking and the test_and_set of the bit was already locked.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
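A sketch of the ordering the text describes (simplified; close to what the description implies, but not guaranteed to match the merged diff):

```c
/*
 * Try the uncontended fast path first; fall back to the blocking wait
 * only when the caller allows it.  Returns -EAGAIN for a contended
 * nonblocking attempt.
 */
int
nfs_page_group_lock(struct nfs_page *req, bool nonblock)
{
	struct nfs_page *head = req->wb_head;

	if (!test_and_set_bit(PG_HEADLOCK, &head->wb_flags))
		return 0;	/* got the lock without waiting */

	if (!nonblock)
		return wait_on_bit_lock(&head->wb_flags, PG_HEADLOCK,
					TASK_UNINTERRUPTIBLE);

	return -EAGAIN;
}
```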
-
By Weston Andros Adamson

Flip the meaning of the second argument from 'wait' to 'nonblock' to match related functions. Update all five calls to reflect this change.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
-
- 20 Aug 2014, 3 commits
-
-
By Chin-Tsung Cheng

The journal blocks of an external journal device should not be counted as overhead.

Signed-off-by: Chin-Tsung Cheng <chintzung@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
-
By Jan Kara

We did not check the relocated directory in any way when processing the Rock Ridge 'CL' tag. Thus a corrupted isofs image can possibly have a CL entry pointing to another CL entry, leading to possibly unbounded recursion in kernel code and thus stack overflow or deadlocks (if there is a loop created from CL entries). Fix the problem by not allowing a CL entry to point to a directory entry with a CL entry (such use makes no good sense anyway) and by checking whether a CL entry doesn't point to itself.

CC: stable@vger.kernel.org
Reported-by: Chris Evans <cevans@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
-
By Chao Yu

We have released the ->i_data_sem before invoking udf_add_entry(), so in the following error path we should not release this lock again.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jan Kara <jack@suse.cz>
-
- 18 Aug 2014, 3 commits
-
-
By Steve French

fallocate -z (FALLOC_FL_ZERO_RANGE) can map to the SMB3 FSCTL_SET_ZERO_DATA fsctl. However, when FALLOC_FL_ZERO_RANGE is called without the FALLOC_FL_KEEP_SIZE flag set it may need to change the file size, so we can not support that subcase unless the file is cached (and thus we know the file size).

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org>
-
By Steve French

Implement FALLOC_FL_PUNCH_HOLE (which fortunately does not change the file size, so this matches the behavior of the equivalent SMB3 fsctl call) for SMB3 mounts. This allows "fallocate -p" to work. It requires that the server support setting files as sparse (which Windows allows).

Signed-off-by: Steve French <smfrench@gmail.com>
-
By Steve French

When the server (for an SMB2 or SMB3 mount) doesn't support an ioctl (such as setting the compressed flag on a file) we were incorrectly returning EIO instead of EOPNOTSUPP. This was confusing, e.g. when doing chattr +c to a file on a non-btrfs Samba partition; now the error returned is more intuitive to the user. Also fixes the error mapping when setting a hardlink to servers which don't support that.

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
-
- 17 Aug 2014, 3 commits
-
-
By Pavel Shilovsky

When we request a rename we also need to update the attributes of both the source and target parent directories. Not doing so causes the generic/309 xfstest to fail on SMB2 mounts. Fix this by marking these directories for forced revalidation.

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
-
By Pavel Shilovsky

SMB2 servers indicate the end of a directory search with the STATUS_NO_MORE_FILES error code, which is not processed now. This causes the generic/257 xfstest to fail. Fix this by triggering the end of search on this error code in SMB2_query_directory.

Also, when negotiating the CIFS protocol we tell the server to close the search automatically at the end, so there is no need to do it ourselves. In the case of the SMB2 protocol we need to close it explicitly - separate the close-directory checks for the different protocols.

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
-
By Steve French

As Raphael Geissert pointed out, tcon_error_exit can dereference tcon, and there is one path in which tcon can be null.

Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org> # v3.7+
Reported-by: Raphael Geissert <geissert@debian.org>
-
- 16 Aug 2014, 3 commits
-
-
By Steve French

Writes fail to Mac servers with SMB2.1 mounts (they work with cifs, though) because the Mac sends an incorrect RFC1001 length for the SMB2.1 write response. Work around this problem: the MacOS server sends a write response with 3 bytes of pad beyond the end of the SMB itself, so the RFC1001 length is 3 bytes more than the sum of the SMB2.1 header length + the write response.

Incorporate feedback from Jeff and JRA to allow servers to send a tcp frame that is even more than three bytes too long (ie much longer than the SMB2/SMB3 request that it contains), but we do log it once now. In the earlier version of the patch I had limited how far off the length field could be before we fail the request.

Signed-off-by: Steve French <smfrench@gmail.com>
-
By Jeff Layton

Currently any F_UNLCK request for a lease just gets back -EAGAIN. Allow them to go immediately to generic_setlease instead.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Steve French <smfrench@gmail.com>
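A hedged sketch of the dispatch this implies, using the 2014-era three-argument generic_setlease prototype and eliding the existing lease checks:

```c
/*
 * Illustrative: let unlock requests fall through to the generic
 * lease code instead of bouncing them with -EAGAIN.
 */
static int
cifs_setlease(struct file *file, long arg, struct file_lock **lease)
{
	if (arg == F_UNLCK)
		return generic_setlease(file, arg, lease);

	/* (existing F_RDLCK/F_WRLCK cache checks go here) */
	return -EAGAIN;
}
```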
-
By Steve French

Simply move code to a new function (for clarity). The function sets or clears the sparse file attribute flag.

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: David Disseldorp <ddiss@samba.org>
-
- 15 Aug 2014, 9 commits
-
-
By Chris Mason

Truncates and renames are often used to replace old versions of a file with new versions. Applications often expect this to be an atomic replacement, even if they haven't done anything to make sure the new version is fully on disk.

Btrfs has strict flushing in place to make sure that renaming over an old file with a new file will fully flush out the new file before allowing the transaction commit with the rename to complete. This ordering means the commit code needs to be able to lock file pages, and there are a few paths in the filesystem where we will try to end a transaction with the page lock held. It's rare, but these things can deadlock.

This patch removes the ordered flushes and switches to a best-effort filemap_flush like ext4 uses. It's not perfect, but it should fix the deadlocks.

Signed-off-by: Chris Mason <clm@fb.com>
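A sketch of the best-effort replacement in the rename path, assuming the runtime flag and helper named below (illustrative, not the literal diff):

```c
	/*
	 * Best effort: if the "flush on rename" flag was set on the
	 * source inode, kick off async writeback instead of forcing a
	 * full ordered flush inside the transaction commit.
	 */
	if (test_and_clear_bit(BTRFS_INODE_ORDERED_DATA_CLOSE,
			       &BTRFS_I(old_inode)->runtime_flags))
		filemap_flush(old_inode->i_mapping);
```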
-
By Filipe Manana

Under rare circumstances we can end up leaving 2 versions of a checksum for the same file extent range.

The reason for this is that after calling btrfs_next_leaf we process slot 0 of the leaf it returns, instead of processing the slot set in path->slots[0]. Most of the time (by far) path->slots[0] is 0, but after btrfs_next_leaf() releases the path and before it searches for the next leaf, another task might cause a split of the next leaf, which migrates some of its keys to the leaf we were processing before calling btrfs_next_leaf(). In this case btrfs_next_leaf() returns again the same leaf but with path->slots[0] having a slot number corresponding to the first new key it got, that is, a slot number that didn't exist before calling btrfs_next_leaf(), as the leaf now has more keys than it had before. So we must really process the returned leaf starting at path->slots[0] always, as it isn't always 0, and the key at slot 0 can have an offset much lower than our search offset/bytenr.

For example, consider the following scenario, where we have:

    sums->bytenr: 40157184, sums->len: 16384, sums end: 40173568
    four 4kb file data blocks with offsets 40157184, 40161280, 40165376, 40169472

    Leaf N:

     slot = 0                           slot = btrfs_header_nritems() - 1
    |-------------------------------------------------------------------|
    | [(CSUM CSUM 39239680), size 8] ... [(CSUM CSUM 40116224), size 4] |
    |-------------------------------------------------------------------|

    Leaf N + 1:

     slot = 0                            slot = btrfs_header_nritems() - 1
    |--------------------------------------------------------------------|
    | [(CSUM CSUM 40161280), size 32] ... [(CSUM CSUM 40615936), size 8] |
    |--------------------------------------------------------------------|

Because we are at the last slot of leaf N, we call btrfs_next_leaf() to find the next highest key, which releases the current path and then searches for that next key. However after releasing the path and before finding that next key, the item at slot 0 of leaf N + 1 gets moved to leaf N, due to a call to ctree.c:push_leaf_left() (via ctree.c:split_leaf()), and therefore btrfs_next_leaf() will return us a path again with leaf N but with the slot pointing to its new last key (CSUM CSUM 40161280). This new version of leaf N is then:

     slot = 0                     slot = btrfs_header_nritems() - 2  slot = btrfs_header_nritems() - 1
    |----------------------------------------------------------------------------------------------------|
    | [(CSUM CSUM 39239680), size 8] ... [(CSUM CSUM 40116224), size 4]  [(CSUM CSUM 40161280), size 32] |
    |----------------------------------------------------------------------------------------------------|

And incorrectly using slot 0 makes us set next_offset to 39239680 and we jump into the "insert:" label, which will set tmp to:

    tmp = min((sums->len - total_bytes) >> blocksize_bits,
              (next_offset - file_key.offset) >> blocksize_bits)
        = min((16384 - 0) >> 12, (39239680 - 40157184) >> 12)
        = min(4, (u64)-917504 = 18446744073708634112 >> 12)
        = 4

and ins_size = csum_size * tmp = 4 * 4 = 16 bytes.

In other words, we insert a new csum item in the tree with key (CSUM_OBJECTID CSUM_KEY 40157184 = sums->bytenr) that contains the checksums for all the data (4 blocks of 4096 bytes each = sums->len). Which is wrong, because the item with key (CSUM CSUM 40161280) (the one that was moved from leaf N + 1 to the end of leaf N) contains the old checksums of the last 12288 bytes of our data, and those old checksums won't get removed.

So this leaves us 2 different checksums for 3 4kb blocks of data in the tree, and breaks the logical rule:

    Key_N+1.offset >= Key_N.offset + length_of_data_its_checksums_cover

An obvious bad effect of this is that a subsequent csum tree lookup to get the checksum of any of the blocks with logical offset of 40161280, 40165376 or 40169472 (the last 3 4kb blocks of file data) will get the old checksums.

Cc: stable@vger.kernel.org
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
By Takashi Iwai

We've got bug reports that btrfs crashes when quota is enabled on a 32bit kernel, typically with an Oops like below:

    BUG: unable to handle kernel NULL pointer dereference at 00000004
    IP: [<f9234590>] find_parent_nodes+0x360/0x1380 [btrfs]
    *pde = 00000000
    Oops: 0000 [#1] SMP
    CPU: 0 PID: 151 Comm: kworker/u8:2 Tainted: G S W 3.15.2-1.gd43d97e-default #1
    Workqueue: btrfs-qgroup-rescan normal_work_helper [btrfs]
    task: f1478130 ti: f147c000 task.ti: f147c000
    EIP: 0060:[<f9234590>] EFLAGS: 00010213 CPU: 0
    EIP is at find_parent_nodes+0x360/0x1380 [btrfs]
    EAX: f147dda8 EBX: f147ddb0 ECX: 00000011 EDX: 00000000
    ESI: 00000000 EDI: f147dda4 EBP: f147ddf8 ESP: f147dd38
    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    CR0: 8005003b CR2: 00000004 CR3: 00bf3000 CR4: 00000690
    Stack:
     00000000 00000000 f147dda4 00000050 00000001 00000000 00000001 00000050
     00000001 00000000 d3059000 00000001 00000022 000000a8 00000000 00000000
     00000000 000000a1 00000000 00000000 00000001 00000000 00000000 11800000
    Call Trace:
     [<f923564d>] __btrfs_find_all_roots+0x9d/0xf0 [btrfs]
     [<f9237bb1>] btrfs_qgroup_rescan_worker+0x401/0x760 [btrfs]
     [<f9206148>] normal_work_helper+0xc8/0x270 [btrfs]
     [<c025e38b>] process_one_work+0x11b/0x390
     [<c025eea1>] worker_thread+0x101/0x340
     [<c026432b>] kthread+0x9b/0xb0
     [<c0712a71>] ret_from_kernel_thread+0x21/0x30
     [<c0264290>] kthread_create_on_node+0x110/0x110

This indicates a NULL corruption in the prefs_delayed list. Further investigation and bisection pointed to the call of ulist_add_merge() resulting in the corruption.

ulist_add_merge() takes a u64 as aux and writes a 64bit value into old_aux. The callers of this function in backref.c, however, pass a pointer of a pointer as old_aux. That is, the function overwrites a 64bit value through a 32bit pointer. This caused a NULL in the adjacent variable, in this case, prefs_delayed.

Here is a quick attempt to band-aid over this: a new function, ulist_add_merge_ptr(), is introduced to pass/store a pointer value properly instead of a u64. There are still ugly void ** casts remaining in the callers because void ** cannot be taken implicitly. But it's safer than an explicit cast to u64, anyway.

Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=887046
Cc: <stable@vger.kernel.org> [v3.11+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Chris Mason <clm@fb.com>
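A sketch of what such a pointer-safe wrapper can look like (a hedged reconstruction from the description, not necessarily the merged code):

```c
/*
 * Wrap ulist_add_merge() so callers can pass real pointers: on 32bit,
 * bounce the aux value through a full u64 temporary instead of letting
 * the callee write 64 bits through a 32bit pointer slot.
 */
static inline int ulist_add_merge_ptr(struct ulist *ulist, u64 val,
				      void *aux, void **old_aux,
				      gfp_t gfp_mask)
{
#if BITS_PER_LONG == 32
	u64 old64 = (uintptr_t)*old_aux;
	int ret = ulist_add_merge(ulist, val, (uintptr_t)aux, &old64,
				  gfp_mask);
	*old_aux = (void *)((uintptr_t)old64);
	return ret;
#else
	return ulist_add_merge(ulist, val, (u64)aux, (u64 *)old_aux,
			       gfp_mask);
#endif
}
```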
-
By Liu Bo

When failing to allocate space for the whole compressed extent, we'll fall back to uncompressed IO, but we've forgotten to redirty the pages which belong to this compressed extent, and these 'clean' pages will simply skip the 'submit' part and go to endio directly. In the end we get data corruption, as we write nothing.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Tested-By: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Chris Mason <clm@fb.com>
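A sketch of where the redirtying has to happen, assuming a range-redirty helper like the one named below:

```c
	/*
	 * Falling back to uncompressed IO: the pages were cleaned when
	 * compression started, so mark them dirty again or writeback
	 * will silently skip them.
	 */
	extent_range_redirty_for_io(inode, async_extent->start,
				    async_extent->start +
				    async_extent->ram_size - 1);
```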
-
By Mark Fasheh

ulist_add() can return '1' on success, which qgroup_subtree_accounting() doesn't take into account. As a result, that value can be bubbled up to callers, causing an error to be printed. Fix this by only returning the value of ulist_add() when it indicates an error.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Chris Mason <clm@fb.com>
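A sketch of the return-value handling the fix implies (the surrounding context and names are illustrative):

```c
	/*
	 * ulist_add() returns 1 when the element was newly added and 0
	 * when it already existed; only negative values are errors.
	 */
	ret = ulist_add(roots, root_id, (uintptr_t)root_node, GFP_ATOMIC);
	if (ret < 0)
		goto out;
	ret = 0;	/* don't leak the '1' to our caller as an error */
```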
-
By Mark Fasheh

During its tree walk, btrfs_drop_snapshot() will skip any shared subtrees it encounters. This is incorrect when we have qgroups turned on as those subtrees need to have their contents accounted. In particular, the case we're concerned with is when removing our snapshot root leaves the subtree with only one root reference.

In those cases we need to find the last remaining root and add each extent in the subtree to the corresponding qgroup exclusive counts.

This patch implements the shared subtree walk and a new qgroup operation, BTRFS_QGROUP_OPER_SUB_SUBTREE. When an operation of this type is encountered during qgroup accounting, we search for any root references to that extent and in the case that we find only one reference left, we go ahead and do the math on its exclusive counts.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
By Filipe Manana

Before processing the extent buffer, acquire a read lock on it, so that we're safe against concurrent updates on the extent buffer.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
By Josef Bacik

Earlier I extended the no_quota arg to btrfs_dec/inc_ref because I didn't understand how snapshot delete was using it and assumed that we needed the quota operations there. With Mark's work this has turned out not to be the case: we _always_ need to use no_quota for btrfs_dec/inc_ref, so just drop the argument and make __btrfs_mod_ref call its process function with no_quota set always. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
-
By David Sterba

This has been discussed in this thread: http://thread.gmane.org/gmane.comp.file-systems.btrfs/32528 and this patch implements this proposal: http://thread.gmane.org/gmane.comp.file-systems.btrfs/32536

Works fine for "clean" raid profiles where the raid factor correction does the right job. Otherwise it's pessimistic and may show low space although there's still some left.

The df numbers are slightly wrong in case of mixed block groups, but this is not a major usecase and can be addressed later. The RAID56 numbers are wrong almost the same way as before and will be addressed separately.

CC: Hugo Mills <hugo@carfax.org.uk>
CC: cwillu <cwillu@cwillu.com>
CC: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
-
- 14 Aug 2014, 2 commits
-
-
By Jeff Layton

There's no need to call locks_free_lock here while still holding the i_lock. Defer that until the lock has been dropped.

Acked-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
-
By Jeff Layton

In commit 72f98e72 ("locks: turn lock_flocks into a spinlock"), we moved from using the BKL to a global spinlock. With this change, we lost the ability to block in the fl_release_private operation. This is problematic for NFS (and probably some other filesystems as well).

Add a new list_head argument to locks_delete_lock. If that argument is non-NULL, then queue any locks that we want to free to the list instead of freeing them. Then, add a new locks_dispose_list function that will walk such a list and call locks_free_lock on them after the i_lock has been dropped.

Finally, change all of the callers of locks_delete_lock to pass in a list_head, except for lease_modify. That function can be called long after the i_lock has been acquired. Deferring the freeing of a lease after unlocking it in that function is non-trivial until we overhaul some of the spinlocking in the lease code.

Currently though, no filesystem that sets fl_release_private supports leases, so this is not currently a problem. We'll eventually want to make the same change in the lease code, but it needs a lot more work before we can reasonably do so.

Acked-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
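A sketch of the deferred-free pattern this describes (the list field name is assumed; details may differ from the merged patch):

```c
/*
 * Free locks queued on 'dispose' now that the i_lock has been
 * dropped; fl_release_private is then free to block.
 */
static void locks_dispose_list(struct list_head *dispose)
{
	struct file_lock *fl;

	while (!list_empty(dispose)) {
		fl = list_first_entry(dispose, struct file_lock, fl_block);
		list_del_init(&fl->fl_block);
		locks_free_lock(fl);
	}
}

/* Caller pattern: queue under the lock, dispose outside it. */
	LIST_HEAD(dispose);

	spin_lock(&inode->i_lock);
	locks_delete_lock(before, &dispose);	/* queue instead of free */
	spin_unlock(&inode->i_lock);
	locks_dispose_list(&dispose);
```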
-