Commits · 9f3959c53d57d010ae6f4205fbd0159cb7976a83 · openeuler / Kernel

13 Dec, 2012 1 commit

Btrfs: get right arguments for btrfs_wait_ordered_range · 9f3959c5

Authored by Liu Bo on Nov 01, 2012

btrfs_wait_ordered_range expects 'len' instead of 'end'.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
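
The mismatch described above can be sketched as follows. This is a minimal illustration, assuming the caller holds an inclusive [start, end] byte range; range_len() is a hypothetical helper and not a btrfs function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn an inclusive [start, end] byte range into the
 * length that a (start, len) style interface expects.
 */
static uint64_t range_len(uint64_t start, uint64_t end)
{
	assert(end >= start);	/* an inverted range is a caller bug */
	return end - start + 1;
}
```

Passing 'end' where 'len' is expected makes the callee cover far more than intended whenever start is non-zero, which is why the conversion belongs at the call site.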

09 Oct, 2012 1 commit

mm: kill vma flag VM_CAN_NONLINEAR · 0b173bc4

Authored by Konstantin Khlebnikov on Oct 08, 2012

Move actual pte filling for non-linear file mappings into the new special
vma operation: ->remap_pages().

Filesystems must implement this method to get non-linear mapping support;
if a filesystem uses filemap_fault() then generic_file_remap_pages() can be used.

Now device drivers can implement this method and obtain nonlinear vma support.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>	#arch/tile
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Matt Helsley <matthltc@us.ibm.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Venkatesh Pallipadi <venki@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

04 Oct, 2012 1 commit

Btrfs: fix punch hole when no extent exists · c3308f84

Authored by Josef Bacik on Sep 14, 2012

I saw the warning in btrfs_drop_extent_cache where our end is less than our
start while running xfstests 68 in a loop.  This is because we
unconditionally do drop_end = min(end, extent_end) in
__btrfs_drop_extents(), even though we may not have found an extent in the
range we were looking to drop.  So keep track of whether or not we found
something, and if we didn't, just use our end.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
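
The idea of only clamping to extent_end when an extent was actually found can be sketched like this. It is a simplified illustration, not the code in __btrfs_drop_extents(); pick_drop_end() and the found_extent flag are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose where the dropped range ends.
 * extent_end is only meaningful when found_extent is true; otherwise it may
 * be stale (for example 0), and min(end, extent_end) could land before start.
 */
static uint64_t pick_drop_end(uint64_t start, uint64_t end,
			      uint64_t extent_end, bool found_extent)
{
	assert(end >= start);
	if (!found_extent)
		return end;	/* nothing found: just use the requested end */
	return extent_end < end ? extent_end : end;
}
```

With the unconditional min(), a stale extent_end of 0 would give a drop_end below start, which matches the "end is less than start" warning described in the message.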

02 Oct, 2012 10 commits

Revert "Btrfs: do not do filemap_write_and_wait_range in fsync" · 90abccf2

Authored by Miao Xie on Sep 13, 2012

This reverts commit 0885ef5b

After applying the above patch, the performance slowed down because the dirty
page flush can only be done by one task, so revert it.

The following is the test result of sysbench:
	Before		After
	24MB/s		39MB/s
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

Btrfs: use flag EXTENT_DEFRAG for snapshot-aware defrag · 9e8a4a8b

Authored by Liu Bo on Sep 05, 2012

We're going to use this flag EXTENT_DEFRAG to indicate which range
belongs to defragmentation so that we can implement snapshot-aware defrag:

We set the EXTENT_DEFRAG flag when dirtying the extents that need to be
defragmented, so that later on the writeback thread can differentiate between
normal writeback and writeback started by defragmentation.
Original-Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

Btrfs: fix wrong size for the reservation when doing, file pre-allocation. · 903889f4

Authored by Miao Xie on Sep 06, 2012

When we ran fsstress (a program in xfstests), the filesystem hung when it
was full. The cause was that the space reserved in btrfs_fallocate() was wrong:
btrfs_fallocate() just used the size of the pre-allocation to reserve the
space and didn't take block size alignment into account, so the size of
the reserved space was less than the allocated space. This caused the over
reserve problem and made the filesystem hang when invoking cow_file_range().
Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
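
The arithmetic behind the alignment problem can be sketched as below. This is an illustration only, assuming a power-of-two block size; aligned_reserve_bytes() is a hypothetical helper and does not reproduce the actual change in btrfs_fallocate().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round the requested [offset, offset + len) range out
 * to block boundaries and return how many bytes must be reserved for it.
 */
static uint64_t aligned_reserve_bytes(uint64_t offset, uint64_t len,
				      uint64_t blocksize)
{
	/* blocksize must be a non-zero power of two for the mask trick */
	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);

	uint64_t mask = blocksize - 1;
	uint64_t start = offset & ~mask;		/* round down */
	uint64_t end = (offset + len + mask) & ~mask;	/* round up */

	return end - start;
}
```

For example, a 200-byte request at offset 4000 with 4096-byte blocks touches two blocks, so 8192 bytes are needed even though the raw length is only 200.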

Btrfs: fix unprotected ->log_batch · 2ecb7923

Authored by Miao Xie on Sep 06, 2012

We forgot to protect ->log_batch when syncing a file; this patch fixes
this problem with atomic operations. And ->log_batch is used to check
if there are parallel sync operations or not, so it is unnecessary to
reset it to 0 after the sync operation of the current log tree completes.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
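
A user-space sketch of the same idea, using C11 atomics in place of the kernel's atomic_t. The struct and function names here are hypothetical and only illustrate why a monotonically increasing counter is enough to detect parallel syncs without ever resetting it.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the log root; only the counter is modeled. */
struct log_root {
	atomic_int log_batch;
};

/* Record that one more task entered the sync path; returns the new value. */
static int note_sync(struct log_root *root)
{
	assert(root);
	return atomic_fetch_add(&root->log_batch, 1) + 1;
}

/* True if some other task bumped the counter since batch_seen was read. */
static int saw_parallel_sync(struct log_root *root, int batch_seen)
{
	assert(root);
	return atomic_load(&root->log_batch) != batch_seen;
}
```

Because callers only compare the value they saw against the current one, the absolute value does not matter, which is consistent with dropping the reset to 0.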

Btrfs: add a new "type" field into the block reservation structure · 66d8f3dd

Authored by Miao Xie on Sep 06, 2012

Sometimes we need to choose the reservation method according to the type
of the block reservation, such as the reservation for the delayed inode update.
Currently we identify the type just by comparing the address of the reservation
variants, which is very ugly for a temporary one because we need to compare it
with all the common reservation variants. So we add a new "type" field to keep
the type of the reservation variants.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
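
A small sketch of tagging a reservation with its type instead of comparing addresses. All names and the flush policy shown are hypothetical, chosen only to illustrate the pattern; they are not the identifiers or rules used in btrfs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reservation kinds. */
enum rsv_type {
	RSV_TYPE_GLOBAL,
	RSV_TYPE_DELAYED_OPS,
	RSV_TYPE_TEMP,
};

struct block_rsv {
	uint64_t size;
	uint64_t reserved;
	enum rsv_type type;	/* set once at init, read by the policy code */
};

static void rsv_init(struct block_rsv *rsv, enum rsv_type type)
{
	assert(rsv);
	rsv->size = 0;
	rsv->reserved = 0;
	rsv->type = type;
}

/* Made-up policy: delayed-ops reservations must not trigger a flush. */
static int rsv_may_flush(const struct block_rsv *rsv)
{
	assert(rsv);
	switch (rsv->type) {
	case RSV_TYPE_DELAYED_OPS:
		return 0;
	default:
		return 1;
	}
}
```

A temporary reservation carries its own tag, so the policy code no longer needs to compare the pointer against every well-known reservation to find out what it is.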

Btrfs: btrfs_drop_extent_cache should never fail · 7014cdb4

Authored by Josef Bacik on Aug 30, 2012

I noticed this when I was doing the fsync stuff, we allocate split extents if we
drop an extent range that is in the middle of an existing extent. This BUG()'s
if we fail to allocate memory, but the fact is this is just a cache, we will
just regenerate the cache if we need it, the important part is that we free the
range we are given. This can be done without allocations, so if we fail to
allocate splits just skip the splitting stage and free our em and look for more
extents to drop. This also makes btrfs_drop_extent_cache a void since nobody
was checking the return value anyway. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: add hole punching · 2aaa6655

Authored by Josef Bacik on Aug 29, 2012

This patch adds hole punching via fallocate.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: remove unused hint byte argument for btrfs_drop_extents · 2671485d

Authored by Josef Bacik on Aug 29, 2012

I audited all users of btrfs_drop_extents and found that nobody actually uses
the hint_byte argument. I'm sure it was used for something at some point but
it's not used now, and the way the pinning works the disk bytenr would never be
immediately useful anyway so let's just remove it. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: turbo charge fsync · 5dc562c5

Authored by Josef Bacik on Aug 17, 2012

At least for the vm workload.  Currently on fsync we will

1) Truncate all items in the log tree for the given inode if they exist

and

2) Copy all items for a given inode into the log

The problem with this is that for things like VMs you can have lots of
extents from the fragmented writing behavior, and worse yet you may have
only modified a few extents, not the entire thing.  This patch fixes this
problem by tracking which transid modified our extent, and then when we do
the tree logging we find all of the extents we've modified in our current
transaction, sort them and commit them.  We also only truncate up to the
xattrs of the inode and copy that stuff in normally, and then just drop any
extents in the range we have that exist in the log already.  Here are some
numbers of a 50 meg fio job that does random writes and fsync()s after every
write

		Original	Patched
SATA drive	82KB/s		140KB/s
Fusion drive	431KB/s		2532KB/s

So around 2-6 times faster depending on your hardware.  There are a few
corner cases, for example if you truncate at all we have to do it the old
way since there is no way to be sure what is in the log is ok.  This
probably could be done smarter, but if you write-fsync-truncate-write-fsync
you deserve what you get.  All this work is in RAM of course so if your
inode gets evicted from cache and you read it in and fsync it we'll do it
the slow way if we are still in the same transaction that we last modified
the inode in.

The biggest cool part of this is that it requires no changes to the recovery
code, so if you fsync with this patch and crash and load an old kernel, it
will run the recovery and be a-ok.  I have tested this pretty thoroughly
with an fsync tester and everything comes back fine, as well as xfstests.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
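
The "find the extents modified in the current transaction, sort them and commit them" step can be sketched in user space as follows. This is a simplified model, not the tree-log code: struct extent, collect_modified() and sorting by file offset are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical in-memory extent record tagged with the modifying transid. */
struct extent {
	uint64_t start;
	uint64_t len;
	uint64_t transid;
};

/* qsort comparator: order extents by starting file offset. */
static int cmp_start(const void *a, const void *b)
{
	const struct extent *x = a, *y = b;

	return (x->start > y->start) - (x->start < y->start);
}

/*
 * Copy the extents last modified in cur_transid into out (which must have
 * room for n entries), sort them by offset, and return how many were kept.
 */
static size_t collect_modified(const struct extent *all, size_t n,
			       uint64_t cur_transid, struct extent *out)
{
	size_t m = 0;

	assert(n == 0 || (all != NULL && out != NULL));
	for (size_t i = 0; i < n; i++)
		if (all[i].transid == cur_transid)
			out[m++] = all[i];
	if (m > 1)
		qsort(out, m, sizeof(*out), cmp_start);
	return m;
}
```

Only the extents touched in the running transaction are logged, so the cost of an fsync scales with what changed rather than with the total number of extents in the file.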

Btrfs: fix possible corruption when fsyncing written prealloced extents · 224ecce5

Authored by Josef Bacik on Aug 16, 2012

While working on my fsync patch my fsync tester kept hitting mismatching
md5sums when I would randomly write to a prealloc'ed region, syncfs() and
then write to the prealloced region some more and then fsync() and then
immediately reboot. This is because the tree logging code will skip writing
csums for file extents whose generation is less than the current running
transaction. When we mark extents as written we haven't been updating their
generation so they were always being skipped. This wouldn't happen if you
were to preallocate and then write in the same transaction, but if you for
example prealloced a VM you could definitely run into this problem. This
patch makes my fsync tester happy again. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

31 Jul, 2012 1 commit

btrfs: Convert to new freezing mechanism · b2b5ef5c

Authored by Jan Kara on Jun 12, 2012

We convert btrfs_file_aio_write() to use new freeze check.  We also add proper
freeze protection to btrfs_page_mkwrite(). We also add freeze protection to
the transaction mechanism to avoid starting transactions on frozen filesystem.
At minimum this is necessary to stop iput() of unlinked file to change frozen
filesystem during truncation.

Checks in cleaner_kthread() and transaction_kthread() can be safely removed
since btrfs_freeze() will lock the mutexes and thus block the threads (and they
shouldn't have anything to do anyway).

CC: linux-btrfs@vger.kernel.org
CC: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

03 Jul, 2012 1 commit

Btrfs: fix dio write vs buffered read race · c3473e83

Authored by Josef Bacik on Jun 19, 2012

Miao pointed out there's a problem with mixing dio writes and buffered
reads. If the read happens between us invalidating the page range and
actually locking the extent we can bring in pages into page cache. Then
once the write finishes if somebody tries to read again it will just find
uptodate pages and we'll read stale data. So we need to lock the extent and
check for uptodate bits in the range. If there are uptodate bits we need to
unlock and invalidate again. This will keep this race from happening since
we will hold the extent locked until we create the ordered extent, and then
the read side always waits for ordered extents. There was also a race in
how we updated i_size, previously we were relying on the generic DIO stuff
to adjust the i_size after the DIO had completed, but this happens outside
of the extent lock which means reads could come in and not see the updated
i_size. So instead move this work into where we create the extents, and
then this way the update ordered i_size stuff works properly in the endio
handlers. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

02 Jun, 2012 1 commit

Btrfs: move over to use ->update_time · e41f941a

Authored by Josef Bacik on Mar 26, 2012

Btrfs had been doing its own file_update_time so we could catch ENOSPC
properly, so just update our btrfs_update_time to work with the new stuff and
then we'll be fancy later.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

30 May, 2012 5 commits

Btrfs: check to see if the inode is in the log before fsyncing · 22ee6985

Authored by Josef Bacik on May 29, 2012

We have this check down in the actual logging code, but this is after we
start a transaction and all that good stuff. So move the helper
inode_in_log() out so we can call it in fsync() and avoid starting a
transaction altogether and just exit if we've already fsync()'ed this file
recently. You would notice this issue if you fsync()'ed a file over and
over again until the transaction committed. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: fix the same inode id problem when doing auto defragment · 762f2263

Authored by Miao Xie on May 24, 2012

Two files in different subvolumes may have the same inode id, so
the rb-tree which is used to manage the defragment objects must take it
into account. This patch fixes this problem.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

Btrfs: convert the inode bit field to use the actual bit operations · 72ac3c0d

Authored by Josef Bacik on May 23, 2012

Miao pointed this out while I was working on an orphan problem that messing
with a bitfield where different ranges are protected by different locks
doesn't work out right. Turns out we've been doing this forever where we
have different parts of the bit field protected by either no lock at all or
different locks which could cause all sorts of weird problems including the
issue I was hitting. So instead make a runtime_flags thing that we use the
normal bit operations on that are all atomic so we can keep having our
no/different locking for the different flags and then make force_compress
its own thing so it can be treated normally. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
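
The move from a C bit field to atomic bit operations on a single flags word can be sketched with C11 atomics. The flag names and struct below are hypothetical; the kernel uses its own set_bit()/test_bit() family rather than these helpers.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag numbers, one bit each in runtime_flags. */
enum {
	FLAG_ORPHAN = 0,
	FLAG_IN_DEFRAG = 1,
};

struct inode_state {
	atomic_ulong runtime_flags;
};

/* Atomically set a bit; returns 1 if it was already set, 0 otherwise. */
static int test_and_set_flag(struct inode_state *s, unsigned int bit)
{
	assert(bit < 8 * sizeof(unsigned long));
	unsigned long mask = 1UL << bit;

	return (atomic_fetch_or(&s->runtime_flags, mask) & mask) != 0;
}

/* Atomically clear a bit without disturbing its neighbours. */
static void clear_flag(struct inode_state *s, unsigned int bit)
{
	assert(bit < 8 * sizeof(unsigned long));
	atomic_fetch_and(&s->runtime_flags, ~(1UL << bit));
}

static int test_flag(struct inode_state *s, unsigned int bit)
{
	assert(bit < 8 * sizeof(unsigned long));
	return (atomic_load(&s->runtime_flags) >> bit) & 1UL;
}
```

With a plain bit field, updating one member is a non-atomic read-modify-write of the shared word, so two writers holding different locks can overwrite each other's bits; atomic per-bit updates avoid that without requiring a common lock.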

Btrfs: do not do filemap_write_and_wait_range in fsync · 0885ef5b

Authored by Josef Bacik on Apr 23, 2012

We already do the btrfs_wait_ordered_range which will do this for us, so
just remove this call so we don't call it twice.  Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: use i_version instead of our own sequence · 0c4d2d95

Authored by Josef Bacik on Apr 05, 2012

We've been keeping around the inode sequence number in hopes that somebody
would use it, but nobody uses it and people actually use i_version which
serves the same purpose, so use i_version where we used the incore inode's
sequence number and that way the sequence is updated properly across the
board, and not just in file write. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

28 Apr, 2012 1 commit

Btrfs: reduce lock contention during extent insertion · dc7fdde3

Authored by Chris Mason on Apr 27, 2012

We're spending huge amounts of time on lock contention during
end_io processing because we unconditionally assume we are overwriting
an existing extent in the file for each IO.

This checks to see if we are outside i_size, and if so, it uses a
less expensive readonly search of the btree to look for existing
extents.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

22 Mar, 2012 2 commits

btrfs: replace many BUG_ONs with proper error handling · 79787eaa

Authored by Jeff Mahoney on Mar 12, 2012

btrfs currently handles most errors with BUG_ON. This patch is a
work-in-progress but aims to handle most errors other than internal logic
errors and ENOMEM more gracefully.

This iteration prevents most crashes but can run into lockups with
the page lock on occasion when the timing "works out."
Signed-off-by: Jeff Mahoney <jeffm@suse.com>

btrfs: drop gfp_t from lock_extent · d0082371

Authored by Jeff Mahoney on Mar 01, 2012

lock_extent and unlock_extent are always called with GFP_NOFS, drop the
argument and use GFP_NOFS consistently.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>

15 Feb, 2012 1 commit

Btrfs: return the internal error unchanged if btrfs_get_extent_fiemap() call... · 6af021d8

Authored by Jeff Liu on Feb 09, 2012

Btrfs: return the internal error unchanged if btrfs_get_extent_fiemap() call failed for SEEK_DATA/SEEK_HOLE inquiry

Given that ENXIO only means "offset beyond EOF" for either a SEEK_DATA or SEEK_HOLE inquiry
in a desired file range, we should return the internal error unchanged if the btrfs_get_extent_fiemap()
call failed, rather than ENXIO.

Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>

01 Feb, 2012 1 commit

Btrfs: don't reserve data with extents locked in btrfs_fallocate · d98456fc

Authored by Chris Mason on Jan 31, 2012

btrfs_fallocate tries to allocate space only if ranges in the file don't
already exist.  But the enospc checks it does are not allowed with
extents locked.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

17 Jan, 2012 1 commit

Btrfs: don't call btrfs_throttle in file write · 45a8090e

Authored by Josef Bacik on Jan 12, 2012

Btrfs_throttle will make us wait if there is a currently committing transaction
until we can open new transactions, which is ridiculous since we don't actually
start any transactions within the file write path anyway, so all this does is
introduce big latencies if we have a sync/fsync heavy workload going on while
somebody else is trying to do work. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

11 Jan, 2012 1 commit

btrfs: pass __GFP_WRITE for buffered write page allocations · e3a41a5b

Authored by Johannes Weiner on Jan 10, 2012

Tell the page allocator that pages allocated for a buffered write are
expected to become dirty soon.
Signed-off-by: Johannes Weiner <jweiner@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

22 Dec, 2011 1 commit

Btrfs: mark delayed refs as for cow · 66d7e7f0

Authored by Arne Jansen on Sep 12, 2011

Add a for_cow parameter to add_delayed_*_ref and pass the appropriate value
from every call site. The for_cow parameter will later on be used to
determine if a ref will change anything with respect to qgroups.

Delayed refs coming from relocation are always counted as for_cow, as they
don't change subvol quota.

Also pass in the fs_info for later use.

btrfs_find_all_roots() will use this as an optimization, as changes that are
for_cow will not change anything with respect to which root points to a
certain leaf. Thus, we don't need to add the current sequence number to
those delayed refs.
Signed-off-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>

18 Dec, 2011 1 commit

btrfs: fix dirtied pages accounting on sub-page writes · 32c7f202

Authored by Wu Fengguang on Aug 08, 2011

When doing 1KB sequential writes to the same page,
balance_dirty_pages_ratelimited_nr() should be called once instead of 4
times, the latter makes the dirtier tasks be throttled much too heavily.

Fix it with proper de-accounting on clear_page_dirty_for_io().

CC: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
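
The intended outcome (four 1KB writes to one 4KB page count as one dirtied page, not four) can be modeled in user space as below. This sketch only counts clean-to-dirty transitions; it is not the kernel's mechanism, which the message says works through de-accounting in clear_page_dirty_for_io(). PAGE_SZ and dirty_range() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u	/* assumed page size for the illustration */

/*
 * Mark every page touched by [pos, pos + len) dirty and return how many of
 * them were clean before, i.e. how many pages this write newly dirtied.
 */
static size_t dirty_range(bool *dirty, size_t npages, uint64_t pos, uint64_t len)
{
	assert(len > 0 && (pos + len - 1) / PAGE_SZ < npages);

	size_t newly = 0;
	uint64_t last = (pos + len - 1) / PAGE_SZ;

	for (uint64_t i = pos / PAGE_SZ; i <= last; i++) {
		if (!dirty[i]) {
			dirty[i] = true;
			newly++;
		}
	}
	return newly;
}
```

Feeding the throttling code the number of newly dirtied pages, rather than one per write call, keeps sub-page sequential writers from being charged several times for the same page.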

17 Dec, 2011 1 commit

btrfs: lower the dirty balance poll interval · 142349f5

Authored by Wu Fengguang on Dec 16, 2011

Tests show that the original large intervals can easily make the dirty
limit exceeded on 100 concurrent dd's. So adapt to as large as the
next check point selected by the dirty throttling algorithm.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

16 Dec, 2011 1 commit

Btrfs: deal with enospc from dirtying inodes properly · 22c44fe6

Authored by Josef Bacik on Nov 30, 2011

Now that we're properly keeping track of delayed inode space we've been getting
a lot of warnings out of btrfs_dirty_inode() when running xfstest 83. This is
because a bunch of people call mark_inode_dirty, which is void so we can't
return ENOSPC. This needs to be fixed in a few areas

1) file_update_time - this updates the mtime and such when writing to a file,
which will call mark_inode_dirty. So copy file_update_time into btrfs so we can
call btrfs_dirty_inode directly and return an error if we get one appropriately.

2) fix symlinks to use btrfs_setattr for ->setattr. For some reason we weren't
setting ->setattr for symlinks, even though we should have been. This catches
one of the cases where we were getting errors in mark_inode_dirty.

3) Fix btrfs_setattr and btrfs_setsize to call btrfs_dirty_inode directly
instead of mark_inode_dirty. This lets us return errors properly for truncate
and chown/anything related to setattr.

4) Add a new btrfs_fs_dirty_inode which will just call btrfs_dirty_inode and
print an error if we have one. The only remaining user we can't control for
this is touch_atime(), but we don't really want to keep people from walking
down the tree if we don't have space to save the atime update, so just complain
but don't worry about it.

With this patch xfstests 83 complains a handful of times instead of hundreds of
times. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>

28 Oct, 2011 1 commit

vfs: do (nearly) lockless generic_file_llseek · ef3d0fd2

Authored by Andi Kleen on Sep 15, 2011

The i_mutex lock use of generic_file_llseek hurts.  Independent processes
accessing the same file synchronize over a single lock, even though
they have no need for synchronization at all.

Under high utilization this can cause llseek to scale very poorly on larger
systems.

This patch does some rethinking of the llseek locking model:

First the 64bit f_pos is not necessarily atomic without locks
on 32bit systems. This can already cause races with read() today.
This was discussed on linux-kernel in the past and deemed acceptable.
The patch does not change that.

Let's look at the different seek variants:

SEEK_SET: Doesn't really need any locking.
If there's a race one writer wins, the other loses.

For 32bit the non atomic update races against read()
stay the same. Without a lock they can also happen
against write() now.  The read() race was deemed
acceptable in past discussions, and I think if it's
ok for read it's ok for write too.

=> Don't need a lock.

SEEK_END: This behaves like SEEK_SET plus it reads
the maximum size too. Reading the maximum size would have the
32bit atomic problem. But luckily we already have a way to read
the maximum size without locking (i_size_read), so we
can just use that instead.

Without i_mutex there is no synchronization with write() anymore,
however since the write() update is atomic on 64bit it just behaves
like another racy SEEK_SET.  On non atomic 32bit it's the same
as SEEK_SET.

=> Don't need a lock, but need to use i_size_read()

SEEK_CUR: This has a read-modify-write race window
on the same file. One could argue that any application
doing unsynchronized seeks on the same file is already broken.
But for the sake of not adding a regression here I'm
using the file->f_lock to synchronize this. Using this
lock is much better than the inode mutex because it doesn't
synchronize between processes.

=> So still need a lock, but can use a f_lock.

This patch implements this new scheme in generic_file_llseek.
I dropped generic_file_llseek_unlocked and changed all callers.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

20 Oct, 2011 2 commits

Btrfs: use the inode's mapping mask for allocating pages · 3b16a4e3

Authored by Josef Bacik on Sep 21, 2011

Johannes pointed out we were allocating only kernel pages for doing writes,
which is kind of a big deal if you are on 32bit and have more than a gig of ram.
So fix our allocations to use the mapping's gfp but still clear __GFP_FS so we
don't re-enter.  Thanks,
Reported-by: Johannes Weiner <jweiner@redhat.com>
Signed-off-by: Josef Bacik <josef@redhat.com>

Btrfs: only reserve space in fallocate if we have to do a preallocate · 1b9c332b

Authored by Josef Bacik on Aug 17, 2011

Lukas found a problem where if he tries to fallocate over the same region twice
and the first fallocate took up all the space we would fail with ENOSPC. This
is because we reserve the total space we want to use for fallocate, regardless
of whether or not we will have to actually preallocate. So instead move the
check into the loop where we actually have to do the preallocate. Thanks,
Tested-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Josef Bacik <josef@redhat.com>

01 Oct, 2011 1 commit

Btrfs: force a page fault if we have a shorty copy on a page boundary · b6316429

Authored by Josef Bacik on Sep 30, 2011

A user reported a problem where ceph was getting into 100% cpu usage while doing
some writing. It turns out it's because we were doing a short write on a not
uptodate page, which means we'd fall back at one page at a time and fault the
page in. The problem is our position is on the page boundary, so our fault in
logic wasn't actually reading the page, so we'd just spin forever or until the
page got read in by somebody else. This will force a readpage if we end up
doing a short copy. Alexandre could reproduce this easily with ceph and reports
it fixes his problem. I also wrote a reproducer that no longer hangs my box
with this patch. Thanks,
Reported-and-tested-by: Alexandre Oliva <aoliva@redhat.com>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

18 Sep, 2011 1 commit

BTRFS: Fix lseek return value for error · 48802c8a

Authored by Jeff Liu on Sep 18, 2011

The recent reworking of btrfs' lseek led to incorrect
values being returned.  This adds checks for seeking
beyond EOF in SEEK_HOLE and makes sure the error
values come back correct.

Andi Kleen also sent in similar patches.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

11 Sep, 2011 1 commit

Btrfs: fix the file extent gap when doing direct IO · 0c1a98c8

Authored by Miao Xie on Sep 11, 2011

When we write some data to a place that is beyond the end of the file
in direct I/O mode, a data hole will be created. And Btrfs should insert
a file extent item that points to this hole into the fs tree. But unfortunately
Btrfs forgets to do it.

The following is a simple way to reproduce it:
 # mkfs.btrfs /dev/sdc2
 # mount /dev/sdc2 /test4
 # touch /test4/a
 # dd if=/dev/zero of=/test4/a seek=8 count=1 bs=4K oflag=direct conv=nocreat,notrunc
 # umount /test4
 # btrfsck /dev/sdc2
 root 5 inode 257 errors 100
Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Tested-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

18 Aug, 2011 2 commits

Btrfs: set i_size properly when fallocating and we already · f1e490a7

Authored by Josef Bacik on Aug 18, 2011

xfstests exposed a problem with preallocate when it fallocates a range that
already has an extent. We don't set the new i_size properly because we see that
we already have an extent. This isn't right and we should update i_size if the
space already exists. With this patch we now pass xfstests 075. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: unlock on error in btrfs_file_llseek() · 9a4327ca

Authored by Dan Carpenter on Aug 18, 2011

There were some unlocks on error missing in a recent patch to
btrfs_file_llseek().
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>