提交 · 5598e9005a4076d6700bbd89d0cdbe5b2922a846 · openeuler / raspberrypi-kernel

18 2月, 2016 7 次提交

btrfs: drop null testing before destroy functions · 5598e900

由 Kinglong Mee 提交于 1月 29, 2016

Cleanup.

kmem_cache_destroy has support NULL argument checking,
so drop the double null testing before calling it.
Signed-off-by: NKinglong Mee <kinglongmee@gmail.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

5598e900

btrfs: fix build warning · 89771cc9

由 Sudip Mukherjee 提交于 2月 16, 2016

We were getting build warning about:
fs/btrfs/extent-tree.c:7021:34: warning: ‘used_bg’ may be used
	uninitialized in this function

It is not a valid warning as used_bg is never used uninitilized since
locked is initially false so we can never be in the section where
'used_bg' is used. But gcc is not able to understand that and we can
initialize it while declaring to silence the warning.
Signed-off-by: NSudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

89771cc9

btrfs: use proper type for failrec in extent_state · 47dc196a

由 David Sterba 提交于 2月 11, 2016

We use the private member of extent_state to store the failrec and play
pointless pointer games.
Signed-off-by: NDavid Sterba <dsterba@suse.com>

47dc196a

btrfs: Replace CURRENT_TIME by current_fs_time() · 04b285f3

由 Deepa Dinamani 提交于 2月 06, 2016

CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_fs_time() instead.
Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: linux-btrfs@vger.kernel.org
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

04b285f3

btrfs: remove open-coded swap() in backref.c:__merge_refs · 8f682f69

由 Dave Jones 提交于 1月 28, 2016

The kernel provides a swap() that does the same thing as this code.
Signed-off-by: NDave Jones <dsj@fb.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

8f682f69

btrfs: remove redundant error check · ac1407ba

由 Byongho Lee 提交于 1月 27, 2016

While running btrfs_mksubvol(), d_really_is_positive() is called twice.
First in btrfs_mksubvol() and second inside btrfs_may_create().  So I
remove the first one.
Signed-off-by: NByongho Lee <bhlee.kernel@gmail.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

ac1407ba

btrfs: simplify expression in btrfs_calc_trans_metadata_size() · 0138b6fe

由 Byongho Lee 提交于 1月 27, 2016

Simplify expression in btrfs_calc_trans_metadata_size().
Signed-off-by: NByongho Lee <bhlee.kernel@gmail.com>
Reviewed-by: NStefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: NDavid Sterba <dsterba@suse.com>

0138b6fe

06 2月, 2016 4 次提交

epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT · b6a515c8

由 Jason Baron 提交于 2月 05, 2016

In the current implementation of the EPOLLEXCLUSIVE flag (added for
4.5-rc1), if epoll waiters create different POLL* sets and register them
as exclusive against the same target fd, the current implementation will
stop waking any further waiters once it finds the first idle waiter.
This means that waiters could miss wakeups in certain cases.

For example, when we wake up a pipe for reading we do:
wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM); So if
one epoll set or epfd is added to pipe p with POLLIN and a second set
epfd2 is added to pipe p with POLLRDNORM, only epfd may receive the
wakeup since the current implementation will stop after it finds any
intersection of events with a waiter that is blocked in epoll_wait().

We could potentially address this by requiring all epoll waiters that
are added to p be required to pass the same set of POLL* events.  IE the
first EPOLL_CTL_ADD that passes EPOLLEXCLUSIVE establishes the set POLL*
flags to be used by any other epfds that are added as EPOLLEXCLUSIVE.
However, I think it might be somewhat confusing interface as we would
have to reference count the number of users for that set, and so
userspace would have to keep track of that count, or we would need a
more involved interface.  It also adds some shared state that we'd have
store somewhere.  I don't think anybody will want to bloat
__wait_queue_head for this.

I think what we could do instead, is to simply restrict EPOLLEXCLUSIVE
such that it can only be specified with EPOLLIN and/or EPOLLOUT.  So
that way if the wakeup includes 'POLLIN' and not 'POLLOUT', we can stop
once we hit the first idle waiter that specifies the EPOLLIN bit, since
any remaining waiters that only have 'POLLOUT' set wouldn't need to be
woken.  Likewise, we can do the same thing if 'POLLOUT' is in the wakeup
bit set and not 'POLLIN'.  If both 'POLLOUT' and 'POLLIN' are set in the
wake bit set (there is at least one example of this I saw in fs/pipe.c),
then we just wake the entire exclusive list.  Having both 'POLLOUT' and
'POLLIN' both set should not be on any performance critical path, so I
think that's ok (in fs/pipe.c its in pipe_release()).  We also continue
to include EPOLLERR and EPOLLHUP by default in any exclusive set.  Thus,
the user can specify EPOLLERR and/or EPOLLHUP but is not required to do
so.

Since epoll waiters may be interested in other events as well besides
EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP, these can still be added by
doing a 'dup' call on the target fd and adding that as one normally
would with EPOLL_CTL_ADD.  Since I think that the POLLIN and POLLOUT
events are what we are interest in balancing, I think that the 'dup'
thing could perhaps be added to only one of the waiter threads.
However, I think that EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP should be
sufficient for the majority of use-cases.

Since EPOLLEXCLUSIVE is intended to be used with a target fd shared
among multiple epfds, where between 1 and n of the epfds may receive an
event, it does not satisfy the semantics of EPOLLONESHOT where only 1
epfd would get an event.  Thus, it is not allowed to be specified in
conjunction with EPOLLEXCLUSIVE.

EPOLL_CTL_MOD is also not allowed if the fd was previously added as
EPOLLEXCLUSIVE.  It seems with the limited number of flags to not be as
interesting, but this could be relaxed at some further point.
Signed-off-by: NJason Baron <jbaron@akamai.com>
Tested-by: NMadars Vitolins <m@silodev.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b6a515c8

dax: dirty inode only if required · d2b2a28e

由 Dmitry Monakhov 提交于 2月 05, 2016

Signed-off-by: NDmitry Monakhov <dmonakhov@openvz.org>
Reviewed-by: NJan Kara <jack@suse.cz>
Reviewed-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d2b2a28e

ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup · c95a5180

由 xuejiufei 提交于 2月 05, 2016

When recovery master down, dlm_do_local_recovery_cleanup() only remove
the $RECOVERY lock owned by dead node, but do not clear the refmap bit.
Which will make umount thread falling in dead loop migrating $RECOVERY
to the dead node.
Signed-off-by: Nxuejiufei <xuejiufei@huawei.com>
Reviewed-by: NJoseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c95a5180

block: fix pfn_mkwrite() DAX fault handler · 9c5a05bc

由 Ross Zwisler 提交于 2月 05, 2016

Previously the pfn_mkwrite() fault handler for raw block devices called
bldev_dax_fault() -> __dax_fault() to do a full DAX page fault.

Really what the pfn_mkwrite() fault handler needs to do is call
dax_pfn_mkwrite() to make sure that the radix tree entry for the given
PTE is marked as dirty so that a follow-up fsync or msync call will
flush it durably to media.

Fixes: 5a023cdb ("block: enable dax for raw block devices")
Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Reviewed-by: NJan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9c5a05bc

05 2月, 2016 2 次提交

Y
ceph: fix snap context leak in error path · db6aed70
由 Yan, Zheng 提交于 1月 26, 2016
```
Signed-off-by: NYan, Zheng <zyan@redhat.com>
```
db6aed70

ceph: checking for IS_ERR instead of NULL · 1418bf07

由 Dan Carpenter 提交于 1月 26, 2016

ceph_osdc_alloc_request() returns NULL on error, it never returns error
pointers.

Fixes: 5be0389d ('ceph: re-send AIO write request when getting -EOLDSNAP error')
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

1418bf07

04 2月, 2016 3 次提交

proc: revert /proc/<pid>/maps [stack:TID] annotation · 65376df5

由 Johannes Weiner 提交于 2月 02, 2016

Commit b7643757 ("procfs: mark thread stack correctly in
proc/<pid>/maps") added [stack:TID] annotation to /proc/<pid>/maps.

Finding the task of a stack VMA requires walking the entire thread list,
turning this into quadratic behavior: a thousand threads means a
thousand stacks, so the rendering of /proc/<pid>/maps needs to look at a
million combinations.

The cost is not in proportion to the usefulness as described in the
patch.

Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
/proc/<pid>/numa_maps) usable again for higher thread counts.

The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained, as
identifying the stack VMA there is an O(1) operation.

Siddesh said:
 "The end users needed a way to identify thread stacks programmatically and
  there wasn't a way to do that.  I'm afraid I no longer remember (or have
  access to the resources that would aid my memory since I changed
  employers) the details of their requirement.  However, I did do this on my
  own time because I thought it was an interesting project for me and nobody
  really gave any feedback then as to its utility, so as far as I am
  concerned you could roll back the main thread maps information since the
  information is available in the thread-specific files"
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
Cc: Shaohua Li <shli@fb.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

65376df5

numa: fix /proc/<pid>/numa_maps for hugetlbfs on s390 · 5c2ff95e

由 Michael Holzheu 提交于 2月 02, 2016

When working with hugetlbfs ptes (which are actually pmds) is not valid to
directly use pte functions like pte_present() because the hardware bit
layout of pmds and ptes can be different.  This is the case on s390.
Therefore we have to convert the hugetlbfs ptes first into a valid pte
encoding with huge_ptep_get().

Currently the /proc/<pid>/numa_maps code uses hugetlbfs ptes without
huge_ptep_get().  On s390 this leads to the following two problems:

1) The pte_present() function returns false (instead of true) for
   PROT_NONE hugetlb ptes. Therefore PROT_NONE vmas are missing
   completely in the "numa_maps" output.

2) The pte_dirty() function always returns false for all hugetlb ptes.
   Therefore these pages are reported as "mapped=xxx" instead of
   "dirty=xxx".

Therefore use huge_ptep_get() to correctly convert the hugetlb ptes.
Signed-off-by: NMichael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: <stable@vger.kernel.org>	[4.3+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5c2ff95e

ocfs2/cluster: fix memory leak in o2hb_region_release · a4a1dfa4

由 Joseph Qi 提交于 2月 02, 2016

o2hb_region_release currently doesn't free o2hb_debug_buf
hr_db_elapsed_time and hr_db_pinned malloced in o2hb_debug_create.  Also
we should call debugfs_remove before freeing its data, to prevent the risk
accessing debugfs rightly after its data has been freed.
Signed-off-by: NJoseph Qi <joseph.qi@huawei.com>
Reviewed-by: NJiufei Xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a4a1dfa4

31 1月, 2016 2 次提交

block: use DAX for partition table reads · d1a5f2b4

由 Dan Williams 提交于 1月 28, 2016

Avoid populating pagecache when the block device is in DAX mode.
Otherwise these page cache entries collide with the fsync/msync
implementation and break data durability guarantees.

Cc: Jan Kara <jack@suse.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reported-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Tested-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: NMatthew Wilcox <willy@linux.intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

d1a5f2b4

block: revert runtime dax control of the raw block device · 9f4736fe

由 Dan Williams 提交于 1月 28, 2016

Dynamically enabling DAX requires that the page cache first be flushed
and invalidated.  This must occur atomically with the change of DAX mode
otherwise we confuse the fsync/msync tracking and violate data
durability guarantees.  Eliminate the possibilty of DAX-disabled to
DAX-enabled transitions for now and revisit this for the next cycle.

Cc: Jan Kara <jack@suse.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

9f4736fe

30 1月, 2016 1 次提交

Revert "btrfs: synchronize incompat feature bits with sysfs files" · e410e34f

由 Chris Mason 提交于 1月 29, 2016

This reverts commit 14e46e04.

This ends up doing sysfs operations from deep in balance (where we
should be GFP_NOFS) and under heavy balance load, we're making races
against sysfs internals.

Revert it for now while we figure things out.
Signed-off-by: NChris Mason <clm@fb.com>

e410e34f

28 1月, 2016 1 次提交

NFS: Cleanup - rename NFS_LAYOUT_RETURN_BEFORE_CLOSE · 2370abda

由 Trond Myklebust 提交于 1月 27, 2016

NFS_LAYOUT_RETURN_BEFORE_CLOSE is being used to signal that a
layoutreturn is needed, either due to a layout recall or to a
layout error. Rename it to NFS_LAYOUT_RETURN_REQUESTED in order
to clarify its purpose.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

2370abda

27 1月, 2016 4 次提交

Bluetooth: Add missing COMPATIBLE_IOCTL for UART line discipline · d10d34aa

由 Marcel Holtmann 提交于 1月 25, 2016

The HCIUARTGETDEVICE, HCIUARTSETFLAGS and HCIUARTGETFLAGS ioctl are
missing the COMPATIBLE_IOCTL declaration.
Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>

d10d34aa

C
btrfs: don't use GFP_HIGHMEM for free-space-tree bitmap kzalloc · e1c0ebad
由 Chris Mason 提交于 1月 27, 2016
```
This was copied incorrectly from the __vmalloc call.
Signed-off-by: NChris Mason <clm@fb.com>
```
e1c0ebad

btrfs: sysfs: check initialization state before updating features · bf609206

由 David Sterba 提交于 1月 27, 2016

If the mount phase is not finished, we can't update the sysfs files.
Reported-by: NChris Mason <clm@fb.com>
Signed-off-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

bf609206

pNFS: Fix missing layoutreturn calls · 13c13a6a

由 Trond Myklebust 提交于 1月 26, 2016

The layoutreturn code currently relies on pnfs_put_lseg() to initiate the
RPC call when conditions are right. A problem arises when we want to
free the layout segment from inside an inode->i_lock section (e.g. in
pnfs_clear_request_commit()), since we cannot sleep.

The workaround is to move the actual call to pnfs_send_layoutreturn()
to pnfs_put_layout_hdr(), which doesn't have this restriction.
Signed-off-by: NTrond Myklebust <trond.myklebust@primarydata.com>

13c13a6a

26 1月, 2016 3 次提交

Revert "btrfs: clear PF_NOFREEZE in cleaner_kthread()" · 80ad623e

由 David Sterba 提交于 1月 25, 2016

This reverts commit 69624913. The
cleaner thread can block freezing when there's a snapshot cleaning in
progress and the other threads get suspended first. From the logs
provided by Martin we're waiting for reading extent pages:

kernel: PM: Syncing filesystems ... done.
kernel: Freezing user space processes ... (elapsed 0.015 seconds) done.
kernel: Freezing remaining freezable tasks ...
kernel: Freezing of tasks failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
kernel: btrfs-cleaner   D ffff88033dd13bc0     0   152      2 0x00000000
kernel: ffff88032ebc2e00 ffff88032e750000 ffff88032e74fa50 7fffffffffffffff
kernel: ffffffff814a58df 0000000000000002 ffffea000934d580 ffffffff814a5451
kernel: 7fffffffffffffff ffffffff814a6e8f 0000000000000000 0000000000000020
kernel: Call Trace:
kernel: [<ffffffff814a58df>] ? bit_wait+0x2c/0x2c
kernel: [<ffffffff814a5451>] ? schedule+0x6f/0x7c
kernel: [<ffffffff814a6e8f>] ? schedule_timeout+0x2f/0xd8
kernel: [<ffffffff81076f94>] ? timekeeping_get_ns+0xa/0x2e
kernel: [<ffffffff81077603>] ? ktime_get+0x36/0x44
kernel: [<ffffffff814a4f6c>] ? io_schedule_timeout+0x94/0xf2
kernel: [<ffffffff814a4f6c>] ? io_schedule_timeout+0x94/0xf2
kernel: [<ffffffff814a590b>] ? bit_wait_io+0x2c/0x30
kernel: [<ffffffff814a5694>] ? __wait_on_bit+0x41/0x73
kernel: [<ffffffff8109eba8>] ? wait_on_page_bit+0x6d/0x72
kernel: [<ffffffff8105d718>] ? autoremove_wake_function+0x2a/0x2a
kernel: [<ffffffff811a02d7>] ? read_extent_buffer_pages+0x1bd/0x203
kernel: [<ffffffff8117d9e9>] ? free_root_pointers+0x4c/0x4c
kernel: [<ffffffff8117e831>] ? btree_read_extent_buffer_pages.constprop.57+0x5a/0xe9
kernel: [<ffffffff8117f4f3>] ? read_tree_block+0x2d/0x45
kernel: [<ffffffff8116782a>] ? read_block_for_search.isra.34+0x22a/0x26b
kernel: [<ffffffff811656c3>] ? btrfs_set_path_blocking+0x1e/0x4a
kernel: [<ffffffff8116919b>] ? btrfs_search_slot+0x648/0x736
kernel: [<ffffffff81170559>] ? btrfs_lookup_extent_info+0xb7/0x2c7
kernel: [<ffffffff81170ee5>] ? walk_down_proc+0x9c/0x1ae
kernel: [<ffffffff81171c9d>] ? walk_down_tree+0x40/0xa4
kernel: [<ffffffff8117375f>] ? btrfs_drop_snapshot+0x2da/0x664
kernel: [<ffffffff8104ff21>] ? finish_task_switch+0x126/0x167
kernel: [<ffffffff811850f8>] ? btrfs_clean_one_deleted_snapshot+0xa6/0xb0
kernel: [<ffffffff8117eaba>] ? cleaner_kthread+0x13e/0x17b
kernel: [<ffffffff8117e97c>] ? btrfs_item_end+0x33/0x33
kernel: [<ffffffff8104d256>] ? kthread+0x95/0x9d
kernel: [<ffffffff8104d1c1>] ? kthread_parkme+0x16/0x16
kernel: [<ffffffff814a7b5f>] ? ret_from_fork+0x3f/0x70
kernel: [<ffffffff8104d1c1>] ? kthread_parkme+0x16/0x16

As this affects a released kernel (4.4) we need a minimal fix for
stable kernels.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=108361Reported-by: NMartin Ziegler <ziegler@uni-freiburg.de>
CC: stable@vger.kernel.org # 4.4
CC: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

80ad623e

btrfs: async-thread: Fix a use-after-free error for trace · 0a95b851

由 Qu Wenruo 提交于 1月 22, 2016

Parameter of trace_btrfs_work_queued() can be freed in its workqueue.
So no one use use that pointer after queue_work().

Fix the user-after-free bug by move the trace line before queue_work().
Reported-by: NDave Jones <davej@codemonkey.org.uk>
Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: NDavid Sterba <dsterba@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

0a95b851

Btrfs: fix race between fsync and lockless direct IO writes · de0ee0ed

由 Filipe Manana 提交于 1月 21, 2016

An fsync, using the fast path, can race with a concurrent lockless direct
IO write and end up logging a file extent item that points to an extent
that wasn't written to yet. This is because the fast fsync path collects
ordered extents into a local list and then collects all the new extent
maps to log file extent items based on them, while the direct IO write
path creates the new extent map before it creates the corresponding
ordered extent (and submitting the respective bio(s)).

So fix this by making the direct IO write path create ordered extents
before the extent maps and make the fast fsync path collect any new
ordered extents after it collects the extent maps.
Note that making the fsync handler call inode_dio_wait() (after acquiring
the inode's i_mutex) would not work and lead to a deadlock when doing
AIO, as through AIO we end up in a path where the fsync handler is called
(through dio_aio_complete_work() -> dio_complete() -> vfs_fsync_range())
before the inode's dio counter is decremented (inode_dio_wait() waits
for this counter to have a value of zero).
Signed-off-by: NFilipe Manana <fdmanana@suse.com>
Signed-off-by: NChris Mason <clm@fb.com>

de0ee0ed

25 1月, 2016 2 次提交
- D
  btrfs: add free space tree to the cow-only list · 3e4c5efb
  由 David Sterba 提交于 1月 25, 2016
```
Signed-off-by: NDavid Sterba <dsterba@suse.com>
```
  3e4c5efb
- D
  btrfs: add free space tree to lockdep classes · 6b20e0ad
  由 David Sterba 提交于 1月 25, 2016
```
Signed-off-by: NDavid Sterba <dsterba@suse.com>
```
  6b20e0ad
23 1月, 2016 11 次提交

vfs: abort dedupe loop if fatal signals are pending · e62e560f

由 Darrick J. Wong 提交于 1月 22, 2016

If the program running dedupe receives a fatal signal during the
dedupe loop, we should bail out to avoid tying up the system.
Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

e62e560f

tree wide: use kvfree() than conditional kfree()/vfree() · 1d5cfdb0

由 Tetsuo Handa 提交于 1月 22, 2016

There are many locations that do

  if (memory_was_allocated_by_vmalloc)
    vfree(ptr);
  else
    kfree(ptr);

but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory
using is_vmalloc_addr().  Unless callers have special reasons, we can
replace this branch with kvfree().  Please check and reply if you found
problems.
Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: NMichal Hocko <mhocko@suse.com>
Acked-by: NJan Kara <jack@suse.com>
Acked-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: NAndreas Dilger <andreas.dilger@intel.com>
Acked-by: N"Rafael J. Wysocki" <rjw@rjwysocki.net>
Acked-by: NDavid Rientjes <rientjes@google.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Boris Petkov <bp@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1d5cfdb0

dax: never rely on bh.b_dev being set by get_block() · eab95db6