- 07 Sep 2017, 27 commits
-
-
Committed by Yan, Zheng
Otherwise, the page is left in a state where it is associated with a snapc, but (PageDirty(page) || PageWriteback(page)) is false. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
A capsnap's size is set by __ceph_finish_cap_snap(). If the capsnap is still being written, its size is zero; in this case, get_oldest_context() should read i_size. Besides, ceph_writepages_start() should re-check the capsnap's size after the dirty pages get locked. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
Both set_page_dirty and truncate_complete_page should be called on a locked page, so they can't race with each other. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
If we create a capsnap while the snap realm's context does not change, the new capsnap's snapc is equal to ci->i_head_snapc. The page writeback code then can't differentiate dirty pages associated with the new capsnap from dirty pages associated with i_head_snapc. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
It's possible that we create a cap snap while there is a pending vmtruncate (the truncate hasn't been processed by the worker thread yet). We should truncate dirty pages beyond capsnap->size in that case. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
If caps for the importer MDS exist but the cap ID mismatches, the client should have received the corresponding import message, because a cap ID does not change as long as the client holds the caps. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Markus Elfring
The script “checkpatch.pl” pointed out issues like the following: Comparison to NULL could be written ... Fix the affected places in the source code accordingly. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
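For illustration only (this snippet is not from the patch; the pointer name is hypothetical), the style checkpatch.pl warns about and the preferred kernel idiom look like this:

    /* Flagged by checkpatch.pl: "Comparison to NULL could be written !req" */
    if (req == NULL)
            return -ENOMEM;

    /* Preferred kernel style */
    if (!req)
            return -ENOMEM;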
-
Committed by Markus Elfring
The script "checkpatch.pl" pointed out issues like the following: WARNING: void function return statements are not generally useful. Thus remove such a statement in the affected function. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Markus Elfring
Omit an extra message for a memory allocation failure in this function. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Yan, Zheng <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Luis Henriques
When a user requests SEEK_HOLE or SEEK_DATA with a negative offset, ceph_llseek should return -ENXIO. Currently -EINVAL is returned for SEEK_DATA and 0 for SEEK_HOLE. Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
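A minimal sketch of the semantics described above (the function name is illustrative; this is not the actual fs/ceph/file.c code):

    /* Sketch: negative or past-EOF offsets have no hole or data to seek to. */
    static loff_t llseek_hole_data_sketch(loff_t offset, loff_t i_size)
    {
            if (offset < 0 || offset >= i_size)
                    return -ENXIO;
            return offset;
    }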
-
Committed by Douglas Fuller
Improve the accuracy of statfs reporting for Ceph filesystems comprising exactly one data pool. In this case, the Ceph monitor can now report the space usage for the single data pool instead of the global data for the entire Ceph cluster. Include support for this message in mon_client and leverage it in ceph/super. Signed-off-by: Douglas Fuller <dfuller@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
An inode can be moved between snap realms. It's possible for an inode to be moved into a snap realm whose seq number is smaller than the old snap realm's, so there is no guarantee that the seq number of the inode's snap context always increases. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
Before sending a new flushsnap message, check whether there are old flushsnap messages that need to be re-sent. If there are, re-send the old messages first. This guarantees the ordering of flushsnap messages. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
We need to drop the cap reference before retrying. Besides, it's better to redo the file write checks for each retry because we re-lock the inode. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
The snapdir inode has no capabilities. __choose_mds() should choose the MDS based on the capabilities of the snapdir's parent inode. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
In the LSSNAP case, req->r_dentry is already set to the snapdir dentry. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Jeff Layton
Ensure that when writeback errors are marked, we report them to all file descriptions that were open at the time of the error. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
These flags explicitly tell the MDS whether there is a pending capsnap. Without this explicit notification, the MDS can only infer whether the client has a pending capsnap, and the method it uses for that is inefficient and error-prone. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yanhu Cao
startsync is a no-op, and has been for years. Remove it. Link: http://tracker.ceph.com/issues/20604 Signed-off-by: Yanhu Cao <gmayyyha@gmail.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
The OSD has a configurable limit on the maximum write size and returns an error if a write request is larger than that limit. For now, set the client's maximum write size to CEPH_MSG_MAX_DATA_LEN, which should be small enough. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
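A one-line sketch of the kind of clamp this implies (the variable name is illustrative; CEPH_MSG_MAX_DATA_LEN is the existing libceph constant):

    /* Never issue a single OSD write larger than CEPH_MSG_MAX_DATA_LEN. */
    len = min_t(u64, len, CEPH_MSG_MAX_DATA_LEN);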
-
Committed by Yan, Zheng
libceph returns -EIO when the read size > CEPH_MSG_MAX_DATA_LEN. Link: http://tracker.ceph.com/issues/20528 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yan, Zheng
Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 02 Sep 2017, 1 commit
-
-
Committed by Oleg Nesterov
The race was introduced by me in commit 971316f0 ("epoll: ep_unregister_pollwait() can use the freed pwq->whead"). I did not realize that nothing can protect eventpoll after ep_poll_callback() sets ->whead = NULL; only whead->lock can save us from the race with ep_free() or ep_remove(). Move ->whead = NULL to the end of ep_poll_callback() and add the necessary barriers. TODO: clean up the ewake/EPOLLEXCLUSIVE logic, it was confusing even before this patch. Hopefully this explains the use-after-free reported by syzkaller: BUG: KASAN: use-after-free in debug_spin_lock_before ... _raw_spin_lock_irqsave+0x4a/0x60 kernel/locking/spinlock.c:159 ep_poll_callback+0x29f/0xff0 fs/eventpoll.c:1148 this is spin_lock(eventpoll->lock), ... Freed by task 17774: ... kfree+0xe8/0x2c0 mm/slub.c:3883 ep_free+0x22c/0x2a0 fs/eventpoll.c:865 Fixes: 971316f0 ("epoll: ep_unregister_pollwait() can use the freed pwq->whead") Reported-by: 范龙飞 <long7573@126.com> Cc: stable@vger.kernel.org Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 01 Sep 2017, 4 commits
-
-
Committed by Steve French
When mounting to older servers, such as Windows XP (or even Windows 7), the limited error messages that can be passed back to user space can get confusing, since the default dialect has changed from SMB1 (CIFS) to the more secure SMB3 dialect. Log additional information when the user chooses to use the default dialects and when the server does not support the dialect requested. Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
-
Committed by Dave Kleikamp
jfs had previously avoided the use of MAX_LFS_FILESIZE because it hadn't accounted for the whole 32-bit index range on 32-bit systems. That has been fixed by commit 0cc3b0ec ("Clarify (and fix) MAX_LFS_FILESIZE macros"), so we can simplify the code now. Suggested by Andreas Dilger. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Cc: jfs-discussion@lists.sourceforge.net Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Jérôme Glisse
Replace all mmu_notifier_invalidate_page() calls by *_invalidate_range() and make sure it is bracketed by calls to *_invalidate_range_start()/end(). Note that because we cannot presume the pmd value or pte value, we have to assume the worst and unconditionally report an invalidation as happening. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Bernhard Held <berny156@gmx.de> Cc: Adam Borowski <kilobyte@angband.pl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: axie <axie@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
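Schematically, the calling pattern moves from a single per-page notification to a bracketed range notification (illustrative sketch only; the real call sites live in mm/ and the affected drivers):

    /* Old: one notification per page, removed by this series. */
    mmu_notifier_invalidate_page(mm, address);

    /* New: bracket the page-table change and report the range unconditionally,
     * since the pte/pmd value cannot be presumed at this point.
     */
    mmu_notifier_invalidate_range_start(mm, start, end);
    /* ... clear or modify the page table entries covering [start, end) ... */
    mmu_notifier_invalidate_range(mm, start, end);
    mmu_notifier_invalidate_range_end(mm, start, end);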
-
Committed by Yan, Zheng
ceph_readpage() unlocks the page prematurely in the case that the page is being read from fscache. The caller of readpage expects that the page is uptodate when it gets unlocked, so the page should stay locked until the completion callback of fscache_read_or_alloc_pages() runs. Cc: stable@vger.kernel.org # 4.1+, needs backporting for < 4.7 Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 31 Aug 2017, 2 commits
-
-
Committed by Steve French
A recent patch, "cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()", introduced an endian warning. Signed-off-by: Steve French <smfrench@gmail.com> CC: Ronnie Sahlberg <lsahlber@redhat.com> CC: Stable <stable@vger.kernel.org> Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
-
Committed by Pavel Shilovsky
Currently the maximum size of the SMB2/3 header is set incorrectly, which leads to hangs in directory listing operations on encrypted SMB3 connections. Fix this by setting the maximum size to 170 bytes, calculated as the RFC1002 length field size (4) + transform header size (52) + SMB2 header size (64) + create response size (56). Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Acked-by: Sachin Prabhu <sprabhu@redhat.com>
-
- 29 Aug 2017, 1 commit
-
-
Committed by Helge Deller
Commit 464d6242 ("select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap()") changed the calculation of how many bytes need to be zeroed when userspace hands over a NULL pointer for an fdset array in the select syscall. The calculation in compat_get_fd_set() was wrongly changed from memset(fdset, 0, ((nr + 1) & ~1)*sizeof(compat_ulong_t)); to memset(fdset, 0, ALIGN(nr, BITS_PER_LONG));. ALIGN(nr, BITS_PER_LONG) calculates the number of _bits_ which need to be zeroed in the target fdset array (rounded up to a full multiple of BITS_PER_LONG), but the memset() call expects the number of _bytes_ to be zeroed. This leads to clearing more memory than wanted (on the stack area or even in kmalloc()ed memory areas) and to random kernel crashes, as we have seen them on the parisc platform. The correct change should have been memset(fdset, 0, (ALIGN(nr, BITS_PER_LONG) / BITS_PER_LONG) * BYTES_PER_LONG); which is the same as can be achieved with a call to zero_fd_set(nr, fdset). Fixes: 464d6242 ("select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap()") Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
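For illustration, here is a small standalone calculation of the two expressions discussed above, assuming 64-bit longs (a sketch, not the kernel code; ALIGN is re-defined locally to mirror the kernel macro):

    #include <stdio.h>

    #define BITS_PER_LONG  64
    #define BYTES_PER_LONG 8
    #define ALIGN(x, a)    (((x) + (a) - 1) & ~((unsigned long)(a) - 1))

    int main(void)
    {
            unsigned long nr = 100;  /* hypothetical number of fds in the set */

            /* Broken: ALIGN() yields a bit count, not a byte count. */
            printf("bits  zeroed: %lu\n", ALIGN(nr, BITS_PER_LONG));  /* 128 */

            /* Correct: convert the rounded-up bit count back to bytes. */
            printf("bytes zeroed: %lu\n",
                   (ALIGN(nr, BITS_PER_LONG) / BITS_PER_LONG) * BYTES_PER_LONG);  /* 16 */
            return 0;
    }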
-
- 26 Aug 2017, 1 commit
-
-
Committed by Ross Zwisler
In DAX there are two separate places where the 2MiB range of a PMD is defined. The first is in the page tables, where a PMD mapping inserted for a given address spans from (vmf->address & PMD_MASK) to ((vmf->address & PMD_MASK) + PMD_SIZE - 1). That is, from the 2MiB boundary below the address to the 2MiB boundary above the address. So, for example, a fault at address 3MiB (0x30 0000) falls within the PMD that ranges from 2MiB (0x20 0000) to 4MiB (0x40 0000). The second PMD range is in the mapping->page_tree, where a given file offset is covered by a radix tree entry that spans from one 2MiB aligned file offset to another 2MiB aligned file offset. So, for example, the file offset for 3MiB (pgoff 768) falls within the PMD range for the order 9 radix tree entry that ranges from 2MiB (pgoff 512) to 4MiB (pgoff 1024). This system works so long as the addresses and file offsets for a given mapping both have the same offsets relative to the start of each PMD. Consider the case where the starting address for a given file isn't 2MiB aligned - say our faulting address is 3 MiB (0x30 0000), but that corresponds to the beginning of our file (pgoff 0). Now all the PMDs in the mapping are misaligned so that the 2MiB range defined in the page tables never matches up with the 2MiB range defined in the radix tree. The current code notices this case for DAX faults to storage with the following test in dax_pmd_insert_mapping(): if (pfn_t_to_pfn(pfn) & PG_PMD_COLOUR) goto unlock_fallback; This test makes sure that the pfn we get from the driver is 2MiB aligned, and relies on the assumption that the 2MiB alignment of the pfn we get back from the driver matches the 2MiB alignment of the faulting address. However, faults to holes were not checked and we could hit the problem described above. This was reported in response to the NVML nvml/src/test/pmempool_sync TEST5: $ cd nvml/src/test/pmempool_sync $ make TEST5 You can grab NVML here: https://github.com/pmem/nvml/ The dmesg warning you see when you hit this error is: WARNING: CPU: 13 PID: 2900 at fs/dax.c:641 dax_insert_mapping_entry+0x2df/0x310 Where we notice in dax_insert_mapping_entry() that the radix tree entry we are about to replace doesn't match the locked entry that we had previously inserted into the tree. This happens because the initial insertion was done in grab_mapping_entry() using a pgoff calculated from the faulting address (vmf->address), and the replacement in dax_pmd_load_hole() => dax_insert_mapping_entry() is done using vmf->pgoff. In our failure case those two page offsets (one calculated from vmf->address, one using vmf->pgoff) point to different order 9 radix tree entries. This failure case can result in a deadlock because the radix tree unlock also happens on the pgoff calculated from vmf->address. This means that the locked radix tree entry that we swapped in to the tree in dax_insert_mapping_entry() using vmf->pgoff is never unlocked, so all future faults to that 2MiB range will block forever. Fix this by validating that the faulting address's PMD offset matches the PMD offset from the start of the file. This check is done at the very beginning of the fault and covers faults that would have mapped to storage as well as faults to holes. I left the COLOUR check in dax_pmd_insert_mapping() in place in case we ever hit the insanity condition where the alignment of the pfn we get from the driver doesn't match the alignment of the userspace address. 
Link: http://lkml.kernel.org/r/20170822222436.18926-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: "Slusarz, Marcin" <marcin.slusarz@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
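A hedged sketch of the alignment check the message describes, using the commit's own terminology (the exact placement and identifier names in fs/dax.c are assumptions):

    /* Mask of PTE slots within a PMD; the "colour" of an address or offset. */
    #define PG_PMD_COLOUR   ((PMD_SIZE >> PAGE_SHIFT) - 1)

    /*
     * If the file offset and the faulting virtual address do not share the
     * same offset within their 2MiB PMD, a huge mapping can never line up,
     * so fall back to PTE-sized faults right away. This covers faults to
     * holes as well as faults that would have mapped to storage.
     */
    if ((vmf->pgoff & PG_PMD_COLOUR) !=
        ((vmf->address >> PAGE_SHIFT) & PG_PMD_COLOUR))
            goto fallback;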
-
- 25 Aug 2017, 2 commits
-
-
Committed by Chuck Lever
When processing an NFSv4 WRITE operation, argp->end should never point past the end of the data in the final page of the page list. Otherwise, nfsd4_decode_compound can walk into uninitialized memory. More critically, nfsd4_decode_write is failing to increment argp->pagelen when it increments argp->pagelist. This can cause later xdr decoders to assume more data is available than really is, which can cause server crashes on malformed requests. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-
Committed by Eric W. Biederman
The implementation of TIOCGPTPEER has two issues. First, when /dev/ptmx (as opposed to /dev/pts/ptmx) is opened, the wrong vfsmount is passed to dentry_open, which results in the kernel displaying the wrong pathname for the peer. Second, simply by caching the vfsmount and dentry of the peer it leaves them open in a way they were not previously, which, because of the increased reference counts, can cause unnecessary behaviour differences resulting in regressions. To fix these, move the ioctl into tty_io.c at a generic level, allowing the ioctl to have access to the struct file on which it is being called. This allows the path of the slave to be derived when opening the slave through TIOCGPTPEER instead of requiring the path to the slave to be cached, thus removing the need for caching the path. A new function devpts_ptmx_path is factored out of devpts_acquire and used to implement a function devpts_mntget. The new function devpts_mntget takes a filp to perform the lookup on, and fsi so that it can confirm that the superblock found by devpts_ptmx_path is the proper superblock. v2: Lots of fixes to make the code actually work v3: Suggestions by Linus - Removed the unnecessary initialization of filp in ptm_open_peer - Simplified devpts_ptmx_path as gotos are no longer required [ This is the fix for the issue that was reverted in commit 143c97cc, but this time without breaking 'pbuilder' due to increased reference counts - Linus ] Fixes: 54ebbfb1 ("tty: add TIOCGPTPEER ioctl") Reported-by: Christian Brauner <christian.brauner@canonical.com> Reported-and-tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 24 Aug 2017, 2 commits
-
-
Committed by Omar Sandoval
This fixes several instances of blk_status_t and bare errno ints being mixed up, some of which are real bugs. In the normal case, 0 matches BLK_STS_OK, so we don't observe any effects of the missing conversion, but in case of errors or passes through the repair/retry paths, the errors get mixed up. The changes were identified using 'sparse'; we don't have reports of the buggy behaviour. Fixes: 4e4cbee9 ("block: switch bios to blk_status_t") Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
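As a generic illustration of the kind of mix-up being fixed (a sketch, not one of the actual btrfs hunks; the function name is hypothetical): a blk_status_t is not an -E* errno, so values must go through the explicit conversion helpers rather than plain assignment.

    #include <linux/bio.h>
    #include <linux/blk_types.h>

    static void endio_sketch(struct bio *bio)
    {
            /* Wrong (the bug pattern): int err = bio->bi_status; */

            /* Right: convert explicitly in both directions. */
            int err = blk_status_to_errno(bio->bi_status);

            if (err)
                    bio->bi_status = errno_to_blk_status(err);
    }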
-
Committed by Linus Torvalds
This reverts commit c8c03f18. It turns out that while fixing the ptmx file descriptor to have the correct 'struct path' to the associated slave pty is a really good thing, it breaks some user space tools for a very annoying reason. The problem is that /dev/ptmx and its associated slave pty (/dev/pts/X) are on different mounts. That was what caused us to have the wrong path in the first place (we would mix up the vfsmount of the 'ptmx' node with the dentry of the pty slave node), but it also means that now, while we use the right vfsmount, having the pty master open also keeps the pts mount busy. And it turns out that that makes 'pbuilder' very unhappy, as noted by Stefan Lippers-Hollmann: "This patch introduces a regression for me when using pbuilder 0.228.7[2] (a helper to build Debian packages in a chroot and to create and update its chroots) when trying to umount /dev/ptmx (inside the chroot) on Debian/ unstable (full log and pbuilder configuration file[3] attached). [...] Setting up build-essential (12.3) ... Processing triggers for libc-bin (2.24-15) ... I: unmounting dev/ptmx filesystem W: Could not unmount dev/ptmx: umount: /var/cache/pbuilder/build/1340/dev/ptmx: target is busy (In some cases useful info about processes that use the device is found by lsof(8) or fuser(1).)" Apparently pbuilder tries to unmount the /dev/pts filesystem while still holding at least one master node open, which is arguably not very nice, but we don't break user space even when fixing other bugs. So this commit has to be reverted. I'll try to figure out a way to avoid caching the path to the slave pty in the master pty. The only thing that actually wants that slave pty path is the "TIOCGPTPEER" ioctl, and I think we could just recreate the path at that time. Reported-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Cc: Eric W Biederman <ebiederm@xmission.com> Cc: Christian Brauner <christian.brauner@canonical.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-