Commits · 06d3d22b456c2f87aeb1eb4517eeabb47e21fcc9 · openanolis / cloud-kernel

02 Oct, 2012 31 commits

Btrfs: cleanup extents after we finish logging inode · 06d3d22b

Authored by Liu Bo on Aug 27, 2012

This is based on Josef's "Btrfs: turbo charge fsync".

We should cleanup those extents after we've finished logging inode,
otherwise we may do redundant work on them.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>

06d3d22b

Btrfs: only warn if we hit an error when doing the tree logging · 0fa83cdb

Authored by Josef Bacik on Aug 24, 2012

I hit this a couple times while working on my fsync patch (all my bugs, not
normal operation), but with my new stuff we could have new errors from cases
I have not encountered, so instead of BUG()'ing we should be WARN()'ing so
that we are notified there is a problem but the user doesn't lose their
data. We can easily commit the transaction in the case that the tree
logging fails and still be fine, so let's try and be as nice to the user as
possible. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

0fa83cdb

Btrfs: turbo charge fsync · 5dc562c5

Authored by Josef Bacik on Aug 17, 2012

At least for the vm workload.  Currently on fsync we will

1) Truncate all items in the log tree for the given inode if they exist

and

2) Copy all items for a given inode into the log

The problem with this is that for things like VMs you can have lots of
extents from the fragmented writing behavior, and worst yet you may have
only modified a few extents, not the entire thing.  This patch fixes this
problem by tracking which transid modified our extent, and then when we do
the tree logging we find all of the extents we've modified in our current
transaction, sort them and commit them.  We also only truncate up to the
xattrs of the inode and copy that stuff in normally, and then just drop any
extents in the range we have that exist in the log already.  Here are some
numbers of a 50 meg fio job that does random writes and fsync()s after every
write

		Original	Patched
SATA drive	82KB/s		140KB/s
Fusion drive	431KB/s		2532KB/s

So around 2-6 times faster depending on your hardware.  There are a few
corner cases, for example if you truncate at all we have to do it the old
way since there is no way to be sure what is in the log is ok.  This
probably could be done smarter, but if you write-fsync-truncate-write-fsync
you deserve what you get.  All this work is in RAM of course so if your
inode gets evicted from cache and you read it in and fsync it we'll do it
the slow way if we are still in the same transaction that we last modified
the inode in.

The biggest cool part of this is that it requires no changes to the recovery
code, so if you fsync with this patch and crash and load an old kernel, it
will run the recovery and be a-ok.  I have tested this pretty thoroughly
with an fsync tester and everything comes back fine, as well as xfstests.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

5dc562c5
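
The idea described in "Btrfs: turbo charge fsync" (remember which transaction last modified each extent, then log only the extents touched in the current transaction, in sorted order) can be illustrated with a small user-space sketch. This is not the kernel code: `Extent`, `extents_to_log`, and all field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """Hypothetical stand-in for a file extent; not a btrfs structure."""
    offset: int      # byte offset of the extent within the file
    length: int      # extent length in bytes
    generation: int  # id of the transaction that last modified the extent


def extents_to_log(extents, current_transid):
    """Return the extents modified in the current transaction, sorted by offset.

    Extents last touched by an older transaction are skipped, so the
    logging work scales with what changed rather than with file size.
    """
    modified = [e for e in extents if e.generation == current_transid]
    return sorted(modified, key=lambda e: e.offset)


if __name__ == "__main__":
    file_extents = [
        Extent(offset=8192, length=4096, generation=7),
        Extent(offset=0, length=4096, generation=5),
        Extent(offset=4096, length=4096, generation=7),
    ]
    for extent in extents_to_log(file_extents, current_transid=7):
        print(extent.offset, extent.length)
```

For a heavily fragmented file such as a VM image, only the few extents written since the last commit pass the filter, which is the case the commit message says it targets.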

Btrfs: fix possible corruption when fsyncing written prealloced extents · 224ecce5

Authored by Josef Bacik on Aug 16, 2012

While working on my fsync patch my fsync tester kept hitting mismatching
md5sums when I would randomly write to a prealloc'ed region, syncfs() and
then write to the prealloced region some more and then fsync() and then
immediately reboot. This is because the tree logging code will skip writing
csums for file extents whose generation is less than the current running
transaction. When we mark extents as written we haven't been updating their
generation so they were always being skipped. This wouldn't happen if you
were to preallocate and then write in the same transaction, but if you for
example prealloced a VM you could definitely run into this problem. This
patch makes my fsync tester happy again. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

224ecce5

Btrfs: do not allocate chunks as agressively · 54338b5c

Authored by Josef Bacik on Aug 14, 2012

Swinging this pendulum back the other way. We've been allocating chunks up
to 2% of the disk no matter how much we actually have allocated. So instead
fix this calculation to only allocate chunks if we have more than 80% of the
space available allocated. Please test this as it will likely cause all
sorts of ENOSPC problems to pop up suddenly. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

54338b5c

Btrfs: update last trans if we don't update the inode · 7c735313

Authored by Josef Bacik on Aug 13, 2012

There is a completely impossible situation to hit where you can preallocate
a file, fsync it, write into the preallocated region, have the transaction
commit twice and then fsync and then immediately lose power and lose all of
the contents of the write. This patch fixes this just so I feel better
about the situation and because it is lightweight, we just update the
last_trans when we finish an ordered IO and we don't update the inode
itself. This way we are completely safe and I feel better. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

7c735313

Btrfs: fix gcc warnings for 32bit compiles · 995e01b7

Authored by Jan Schmidt on Aug 13, 2012

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

995e01b7

Btrfs: fix btrfs send for inline items and compression · 74dd17fb

Authored by Chris Mason on Aug 07, 2012

The btrfs send code was assuming the offset of the file item into the
extent translated to bytes on disk.  If we're compressed, this isn't
true, and so it was off into extents owned by other files.

It was also improperly handling inline extents.  This solves a crash
where we may have gone past the end of the file extent item by not
testing early enough for an inline extent.  It also solves problems
where we have a hole between the end of the inline item and the start
of the full extent.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

74dd17fb

Btrfs: don't treat top/root directory inode as deleted/reused · 6d85ed05

Authored by Alexander Block on Aug 01, 2012

We can't do the deleted/reused logic for top/root inodes as it would
create a stream that tries to delete and recreate the root dir.
Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

6d85ed05

Btrfs: ignore non-FS inodes for send/receive · 2981e225

Authored by Alexander Block on Aug 01, 2012

We have to ignore inode/space cache objects in send/receive.
Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

2981e225

Btrfs: pass root instead of parent_root to iterate_inode_ref · 2f28f478

Authored by Alexander Block on Aug 01, 2012

We need to pass the root that we determined earlier to iterate_inode_ref.
Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

2f28f478

Btrfs: use <= instead of < in is_extent_unchanged · d8347fa4

Authored by Alexander Block on Aug 01, 2012

Used the wrong compare operator here.
Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

d8347fa4

Btrfs: fix check for changed extent in is_extent_unchanged · 3954096d

Authored by Alexander Block on Aug 01, 2012

The previous check was working fine, but this check should be
easier to read. Also, we could theoretically have some exotic
bugs with the previous checks.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

3954096d

Btrfs: free nce and nce_head on error in name_cache_insert · 5dc67d0b

Authored by Alexander Block on Aug 01, 2012

Both were leaked in case of error.
Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

5dc67d0b

Btrfs: remove unused tmp_path from iterate_dir_item · 3e126f32

Authored by Alexander Block on Aug 01, 2012

A leftover from older code and unused now.
Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

3e126f32

Btrfs: code cleanups for send/receive · e938c8ad

Authored by Alexander Block on Jul 28, 2012

Doing some code cleanups as suggested by Arne.
Changes do not change any logic.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

e938c8ad

Btrfs: add/fix comments/documentation for send/receive · 766702ef

Authored by Alexander Block on Jul 28, 2012

As the subject already said, add/fix comments.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

766702ef

Btrfs: update send_progress at correct places · e479d9bb

Authored by Alexander Block on Jul 28, 2012

Updating send_progress in process_recorded_refs was not correct.
It got updated too early in the cur_inode_new_gen case.
Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Reported-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

e479d9bb

Btrfs: make aux field of ulist 64 bit · 34d73f54

Authored by Alexander Block on Jul 28, 2012

Btrfs send/receive uses the aux field to store inode numbers. On
32 bit machines this may become a problem.

Also fix all users of ulist_add and ulist_add_merged.
Reported-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

34d73f54

Btrfs: fix use of radix_tree for name_cache in send/receive · 7e0926fe

Authored by Alexander Block on Jul 28, 2012

We can't easily use the index of the radix tree for inums as the
radix tree uses 32bit indexes on 32bit kernels. For 32bit kernels,
we now use the lower 32bit of the inum as index and an additional
list to store multiple entries per radix tree entry.
Reported-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

7e0926fe
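
The scheme described in "Btrfs: fix use of radix_tree for name_cache in send/receive" (index by the lower 32 bits of the inode number and keep a list per slot for entries that share an index) can be sketched in user space as follows. A plain dictionary stands in for the radix tree, and `NameCache`, `insert`, and `lookup` are hypothetical names, not the kernel's API.

```python
from collections import defaultdict

INDEX_MASK = 0xFFFFFFFF  # keep only the lower 32 bits of the inode number


class NameCache:
    """Hypothetical cache keyed by a 32-bit index with per-slot lists."""

    def __init__(self):
        # Each slot holds every entry whose inum shares the same low 32 bits.
        self._slots = defaultdict(list)

    def insert(self, inum, name):
        self._slots[inum & INDEX_MASK].append((inum, name))

    def lookup(self, inum):
        # Walk the slot's list and compare the full 64-bit inode number.
        for stored_inum, name in self._slots.get(inum & INDEX_MASK, []):
            if stored_inum == inum:
                return name
        return None


if __name__ == "__main__":
    cache = NameCache()
    cache.insert(5, "small")
    cache.insert(5 + (1 << 32), "large")  # same low 32 bits as inum 5
    print(cache.lookup(5), cache.lookup(5 + (1 << 32)))
```

Comparing the full inode number inside the slot is what keeps two inodes with identical low 32 bits from being confused with each other.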

Btrfs: fix memory leak for name_cache in send/receive · 17589bd9

Authored by Alexander Block on Jul 28, 2012

When everything is done, name_cache_free is called which however
forgot to call kfree on the cache entries.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

17589bd9

Btrfs: don't break in the final loop of find_extent_clone · adbe7fb6

Authored by Alexander Block on Jul 28, 2012

If we break, we may miss the clone from send_root which we prefer
over all other clones.

Commit is a result of Arne's review.
Reported-by: Arne Jansen <sensille@gmx.net>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

adbe7fb6

Btrfs: use normal return path for root == send_root case · 52f9e53e

Authored by Alexander Block on Jul 28, 2012

Don't have a separate return path for the mentioned case. Now
we do the same "take lowest inode/offset" logic for all found clones.

Commit is a result of Arne's review.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

52f9e53e

Btrfs: use kmalloc instead of stack for backref_ctx · 35075bb0

Authored by Alexander Block on Jul 28, 2012

Make sure to never get in trouble due to the backref_ctx
which was on the stack before.

Commit is a result of Arne's review.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

35075bb0

Btrfs: rename backref_ctx::found_in_send_root to found_itself · ee849c04

Authored by Alexander Block on Jul 28, 2012

The new name should be easier to understand/read.

Commit is a result of Arne's review.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

ee849c04

Btrfs: remove unused use_list from send/receive code · d27aed5e

Authored by Alexander Block on Jul 28, 2012

use_list is a leftover and unused.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

d27aed5e

Btrfs: add correct parent to check_dirs when dir got moved · ccf1626b

Authored by Alexander Block on Jul 28, 2012

We only added the parent for the new position of a moved dir.
We also need to add the old parent of the moved dir.
Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

ccf1626b

Btrfs: remove unused code with #if 0 · 9ea3ef51

Authored by Alexander Block on Jul 28, 2012

fs_path_remove is not used at the moment due to a previous patch.
Remove it for now (with #if 0) to avoid compile warnings.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

9ea3ef51

Btrfs: add missing check for dir != tmp_dir to is_first_ref · b9291aff

Authored by Alexander Block on Jul 28, 2012

We missed that check which resulted in all refs with the same name
being reported as first_ref.
Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

b9291aff

Btrfs: fix cur_ino < parent_ino case for send/receive · 1f4692da

Authored by Alexander Block on Jul 28, 2012

When the current inode's inum is smaller than the inum of the
parent directory, strange things were happening due to wrong
path resolution and other bugs. Fix this with a new approach
for the problem.
Reported-by: Alex Lyakas <alex.bolshoy.btrfs@gmail.com>
Signed-off-by: Alexander Block <ablock84@googlemail.com>

1f4692da

Btrfs: add rdev to get_inode_info in send/receive · 85a7b33b

Authored by Alexander Block on Jul 26, 2012

We need rdev in the next commit.
Signed-off-by: Alexander Block <ablock84@googlemail.com>

85a7b33b

01 Oct, 2012 1 commit

Linux 3.6 · a0d271cb

Authored by Linus Torvalds on Sep 30, 2012

a0d271cb

30 Sep, 2012 3 commits

vfs: dcache: fix deadlock in tree traversal · 8110e16d

Authored by Miklos Szeredi on Sep 17, 2012

IBM reported a deadlock in select_parent().  This was found to be caused
by taking rename_lock when already locked when restarting the tree
traversal.

There are two cases when the traversal needs to be restarted:

 1) concurrent d_move(); this can only happen when not already locked,
    since taking rename_lock protects against concurrent d_move().

 2) racing with final d_put() on child just at the moment of ascending
    to parent; rename_lock doesn't protect against this rare race, so it
    can happen when already locked.

Because of case 2, we need to be able to handle restarting the traversal
when rename_lock is already held.  This patch fixes all three callers of
try_to_ascend().

IBM reported that the deadlock is gone with this patch.

[ I rewrote the patch to be smaller and just do the "goto again" if the
  lock was already held, but credit goes to Miklos for the real work.
   - Linus ]
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8110e16d

Merge tag 'iommu-fixes-v3.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 6a3e3dbe

Authored by Linus Torvalds on Sep 29, 2012

Pull IOMMU fixes from Joerg Roedel:
 "Two small patches:

	* One patch to fix the function declarations for
	  !CONFIG_IOMMU_API. This is causing build errors
	  in linux-next and should be fixed for v3.6.

	* Another patch to fix an IOMMU group related NULL pointer
	  dereference."

* tag 'iommu-fixes-v3.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/amd: Fix wrong assumption in iommu-group specific code
  iommu: static inline iommu group stub functions

6a3e3dbe

Merge git://git.infradead.org/users/willy/linux-nvme · 21e98932

Authored by Linus Torvalds on Sep 29, 2012

Pull NVMe driver fixes from Matthew Wilcox:
 "Now that actual hardware has been released (don't have any yet
  myself), people are starting to want some of these fixes merged."

Willy doesn't have hardware? Guys...

* git://git.infradead.org/users/willy/linux-nvme:
  NVMe: Cancel outstanding IOs on queue deletion
  NVMe: Free admin queue memory on initialisation failure
  NVMe: Use ida for nvme device instance
  NVMe: Fix whitespace damage in nvme_init
  NVMe: handle allocation failure in nvme_map_user_pages()
  NVMe: Fix uninitialized iod compiler warning
  NVMe: Do not set IO queue depth beyond device max
  NVMe: Set block queue max sectors
  NVMe: use namespace id for nvme_get_features
  NVMe: replace nvme_ns with nvme_dev for user admin
  NVMe: Fix nvme module init when nvme_major is set
  NVMe: Set request queue logical block size

21e98932

29 Sep, 2012 5 commits

mtdchar: fix offset overflow detection · 9c603e53

Authored by Linus Torvalds on Sep 08, 2012

Sasha Levin has been running trinity in a KVM tools guest, and was able
to trigger the BUG_ON() at arch/x86/mm/pat.c:279 (verifying the range of
the memory type).  The call trace showed that it was mtdchar_mmap() that
created an invalid remap_pfn_range().

The problem is that mtdchar_mmap() does various really odd and subtle
things with the vma page offset etc, and uses the wrong types (and the
wrong overflow) detection for it.

For example, the page offset may well be 32-bit on a 32-bit
architecture, but after shifting it up by PAGE_SHIFT, we need to use a
potentially 64-bit resource_size_t to correctly hold the full value.

Also, we need to check that the vma length plus offset doesn't overflow
before we check that it is smaller than the length of the mtdmap region.

This fixes things up and tries to make the code a bit easier to read.
Reported-and-tested-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Artem Bityutskiy <dedekind1@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: linux-mtd@lists.infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9c603e53
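
The two checks described in "mtdchar: fix offset overflow detection" (shift the page offset in a type wide enough to hold the result, and test offset plus length for wraparound before comparing against the region size) can be illustrated with masked integer arithmetic. This is a sketch, not the driver code: `PAGE_SHIFT = 12`, the function names, and the inclusive bounds comparison are assumptions made for the example.

```python
PAGE_SHIFT = 12            # assumption: 4 KiB pages
U32_MASK = (1 << 32) - 1
U64_MASK = (1 << 64) - 1


def byte_offset_32(pgoff):
    """Shift in a 32-bit type: the high bits are silently lost."""
    return (pgoff << PAGE_SHIFT) & U32_MASK


def byte_offset_64(pgoff):
    """Shift in a 64-bit type: the full byte offset is kept."""
    return (pgoff << PAGE_SHIFT) & U64_MASK


def range_is_valid(pgoff, vma_len, region_len):
    """Check offset + length for wraparound before the bounds check."""
    off = byte_offset_64(pgoff)
    end = (off + vma_len) & U64_MASK
    if end < off:          # the 64-bit sum wrapped around
        return False
    return end <= region_len


if __name__ == "__main__":
    pgoff = 0x100000       # page offset whose byte offset is exactly 4 GiB
    print(hex(byte_offset_32(pgoff)), hex(byte_offset_64(pgoff)))
    print(range_is_valid(pgoff, 4096, 1 << 20))
```

With the 32-bit shift the 4 GiB offset collapses to zero and would look like a valid mapping at the start of the region; the 64-bit version keeps the real offset, so the request is rejected.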

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 6672d90f

Authored by Linus Torvalds on Sep 28, 2012

Pull networking fixes from David S Miller:

 1) Netfilter xt_limit module can use uninitialized rules, from Jan
    Engelhardt.

 2) Wei Yongjun has found several more spots where error pointers were
    treated as NULL/non-NULL and vice versa.

 3) bnx2x was converted to pci_io{,un}map() but one remaining plain
    iounmap() got missed.  From Neil Horman.

 4) Due to a fence-post type error in initialization of inetpeer entries
    (which is where we store the ICMP rate limiting information), we can
    erroneously drop ICMPs if the inetpeer was created right around when
    jiffies wraps.

    Fix from Nicolas Dichtel.

 5) smsc75xx resume fix from Steve Glendinnig.

 6) LAN87xx smsc chips need an explicit hardware init, from Marek Vasut.

 7) qlcnic uses msleep() with locks held, fix from Narendra K.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  netdev: octeon: fix return value check in octeon_mgmt_init_phy()
  inetpeer: fix token initialization
  qlcnic: Fix scheduling while atomic bug
  bnx2: Clean up remaining iounmap
  net: phy: smsc: Implement PHY config_init for LAN87xx
  smsc75xx: fix resume after device reset
  netdev: pasemi: fix return value check in pasemi_mac_phy_init()
  team: fix return value check
  l2tp: fix return value check
  netfilter: xt_limit: have r->cost != 0 case work

6672d90f

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 7596824e

Authored by Linus Torvalds on Sep 28, 2012

Pull vfs fixes from Al Viro:
 "A couple of fixes; one for automount/lazy umount race, another a
  classic "we don't protect the refcount transition to zero with the
  lock that protects looking for object in hash" kind of crap in lockd."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  close the race in nlmsvc_free_block()
  do_add_mount()/umount -l races

7596824e

Merge branch 'for-linus-3.6-rc-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · 97956605

Authored by Linus Torvalds on Sep 28, 2012

Pull UML fixes from Richard Weinberger.

* 'for-linus-3.6-rc-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Preinclude include/linux/kern_levels.h
  um: Fix IPC on um
  um: kill thread->forking
  um: let signal_delivered() do SIGTRAP on singlestepping into handler
  um: don't leak floating point state and segment registers on execve()
  um: take cleaning singlestep to start_thread()

97956605

Merge tag 'dm-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm · c3a086e6

Authored by Linus Torvalds on Sep 28, 2012

Pull dm fixes from Alasdair G Kergon:
 "A few fixes for problems discovered during the 3.6 cycle.

  Of particular note, are fixes to the thin target's discard support,
  which I hope is finally working correctly; and fixes for multipath
  ioctls and device limits when there are no paths."

* tag 'dm-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
  dm verity: fix overflow check
  dm thin: fix discard support for data devices
  dm thin: tidy discard support
  dm: retain table limits when swapping to new table with no devices
  dm table: clear add_random unless all devices have it set
  dm: handle requests beyond end of device instead of using BUG_ON
  dm mpath: only retry ioctl when no paths if queue_if_no_path set
  dm thin: do not set discard_zeroes_data

c3a086e6

openanolis / cloud-kernel synced successfully more than 1 year ago