Commits · 1f02071358301e376f5a54e40531db37a1d7c7ab · openeuler / Kernel

20 Mar, 2017 5 commits

f2fs: combine nat_bits and free_nid_bitmap cache · 7041d5d2

Authored by Chao Yu on Mar 08, 2017

Both the nat_bits cache and the free_nid_bitmap cache serve as an
intermediate cache between the free nid cache and disk, but they differ in
the granularity with which they indicate free nid ranges and in their
persistence policy. The nat_bits cache provides better persistence,
and free_nid_bitmap provides finer granularity.

This patch combines the advantages of both caches, so the resulting policy
of the intermediate cache is:
- init: load free nid status from nat_bits into free_nid_bitmap
- lookup: scan free_nid_bitmap before loading NAT blocks
- update: update free_nid_bitmap in real time
- persistence: update and persist nat_bits at checkpoint

This patch also resolves performance regression reported by lkp-robot.

commit:
  4ac91242 ("f2fs: introduce free nid bitmap")
  d00030cf9cd0bb96fdccc41e33d3c91dcbb672ba ("f2fs: use __set{__clear}_bit_le")
  1382c0f3f9d3f936c8bc42ed1591cf7a593ef9f7 ("f2fs: combine nat_bits and free_nid_bitmap cache")

4ac91242 d00030cf9cd0bb96fdccc41e33 1382c0f3f9d3f936c8bc42ed15
---------------- -------------------------- --------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
     77863 ±  0%      +2.1%      79485 ±  1%     +50.8%     117404 ±  0%  aim7.jobs-per-min
    231.63 ±  0%      -2.0%     227.01 ±  1%     -33.6%     153.80 ±  0%  aim7.time.elapsed_time
    231.63 ±  0%      -2.0%     227.01 ±  1%     -33.6%     153.80 ±  0%  aim7.time.elapsed_time.max
    896604 ±  0%      -0.8%     889221 ±  3%     -20.2%     715260 ±  1%  aim7.time.involuntary_context_switches
      2394 ±  1%      +4.6%       2503 ±  1%      +3.7%       2481 ±  2%  aim7.time.maximum_resident_set_size
      6240 ±  0%      -1.5%       6145 ±  1%     -14.1%       5360 ±  1%  aim7.time.system_time
   1111357 ±  3%      +1.9%    1132509 ±  2%      -6.2%    1041932 ±  2%  aim7.time.voluntary_context_switches
...
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Tested-by: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

7041d5d2
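
The four-step policy described in the commit above (init, lookup, update, persistence) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the f2fs implementation: the structure names, the toy sizes, and the `cache_*` helpers are all hypothetical, and the coarse summary is modelled as one "all free" and one "none free" flag per NAT block.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BLOCKS 4          /* toy number of NAT blocks (assumption) */
#define NIDS_PER_BLOCK 8  /* toy number of nids per NAT block (assumption) */
#define NIDS (BLOCKS * NIDS_PER_BLOCK)

/* Coarse, persisted summary: two flags per block (hypothetical layout). */
struct coarse_bits {
    bool all_free[BLOCKS];
    bool none_free[BLOCKS];
};

/* Fine, in-memory cache: one flag per nid, plus "block state is known". */
struct fine_cache {
    bool free_nid[NIDS];
    bool block_known[BLOCKS];
};

/* init: seed the fine cache from the coarse summary. */
void cache_init(struct fine_cache *fc, const struct coarse_bits *cb)
{
    memset(fc, 0, sizeof(*fc));
    for (int b = 0; b < BLOCKS; b++) {
        if (!cb->all_free[b] && !cb->none_free[b])
            continue; /* mixed block: would have to be read from disk */
        fc->block_known[b] = true;
        for (int i = 0; i < NIDS_PER_BLOCK; i++)
            fc->free_nid[b * NIDS_PER_BLOCK + i] = cb->all_free[b];
    }
}

/* lookup: return a free nid from known blocks, or -1 if a disk scan is needed. */
int cache_lookup(const struct fine_cache *fc)
{
    for (int nid = 0; nid < NIDS; nid++)
        if (fc->block_known[nid / NIDS_PER_BLOCK] && fc->free_nid[nid])
            return nid;
    return -1;
}

/* update: record an allocation or release immediately. */
void cache_update(struct fine_cache *fc, int nid, bool is_free)
{
    assert(nid >= 0 && nid < NIDS);
    fc->free_nid[nid] = is_free;
}

/* persistence: at checkpoint time, recompute the coarse summary. */
void cache_checkpoint(const struct fine_cache *fc, struct coarse_bits *cb)
{
    for (int b = 0; b < BLOCKS; b++) {
        int free_cnt = 0;

        if (!fc->block_known[b])
            continue; /* keep the old summary for blocks never loaded */
        for (int i = 0; i < NIDS_PER_BLOCK; i++)
            free_cnt += fc->free_nid[b * NIDS_PER_BLOCK + i];
        cb->all_free[b] = (free_cnt == NIDS_PER_BLOCK);
        cb->none_free[b] = (free_cnt == 0);
    }
}
```

In this model a lookup never touches "disk" for blocks whose state was recovered from the coarse summary, which is the effect the commit message attributes to combining the two caches.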

f2fs: skip scanning free nid bitmap of full NAT blocks · 586d1492

Authored by Chao Yu on Mar 01, 2017

This patch adds accounting of free nids for each NAT block and, while
scanning the whole free nid bitmap, checks the count and skips lookups in
full NAT blocks.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

586d1492
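
The idea in the commit above, keeping a per-block count of free nids so that full blocks can be skipped during a scan, can be sketched as follows. This is a simplified user-space model, not the kernel code; `struct nid_cache`, `nid_set_free()`, `nid_find_free()`, and the toy sizes are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define BLOCKS 4          /* toy number of NAT blocks (assumption) */
#define NIDS_PER_BLOCK 8  /* toy number of nids per NAT block (assumption) */

struct nid_cache {
    bool free_nid[BLOCKS][NIDS_PER_BLOCK]; /* per-nid free flags */
    unsigned free_count[BLOCKS];           /* number of free nids per block */
};

/* Keep the per-block counter in sync with every bitmap change. */
void nid_set_free(struct nid_cache *c, unsigned nid, bool is_free)
{
    unsigned b = nid / NIDS_PER_BLOCK;
    unsigned i = nid % NIDS_PER_BLOCK;

    assert(b < BLOCKS);
    if (c->free_nid[b][i] == is_free)
        return; /* no state change, counter stays as it is */
    c->free_nid[b][i] = is_free;
    if (is_free)
        c->free_count[b]++;
    else
        c->free_count[b]--;
}

/*
 * Return the first free nid, or -1 if none exists.
 * *scanned reports how many per-block bitmaps were actually walked.
 */
int nid_find_free(const struct nid_cache *c, unsigned *scanned)
{
    *scanned = 0;
    for (unsigned b = 0; b < BLOCKS; b++) {
        if (c->free_count[b] == 0)
            continue; /* full block: skip its bitmap entirely */
        (*scanned)++;
        for (unsigned i = 0; i < NIDS_PER_BLOCK; i++)
            if (c->free_nid[b][i])
                return (int)(b * NIDS_PER_BLOCK + i);
    }
    return -1;
}
```

The counter costs one increment or decrement per update, and in exchange a scan pays only one comparison for each block that has no free nid.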

f2fs: use __set{__clear}_bit_le · 23380b85

Authored by Jaegeuk Kim on Mar 07, 2017

This patch uses __set{__clear}_bit_le for higher speed.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

23380b85

f2fs: declare static functions · 9f7e4a2c

Authored by Jaegeuk Kim on Mar 10, 2017

This is to avoid a build warning reported by the kbuild test robot.
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

9f7e4a2c

f2fs: don't overwrite node block by SSR · 720037f9

Authored by Jaegeuk Kim on Mar 06, 2017

This patch fixes an issue where SSR can overwrite a previous warm node block
that forms part of a node chain since the last checkpoint.

Fixes: 5b6c6be2 ("f2fs: use SSR for warm node as well")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

720037f9

18 Mar, 2017 6 commits

pNFS/flexfiles: never nfs4_mark_deviceid_unavailable · da066f3f

Authored by Weston Andros Adamson on Mar 09, 2017

The flexfiles layout should never mark a device unavailable.

Move nfs4_mark_deviceid_unavailable out of nfs4_pnfs_ds_connect and call
directly from files layout where it's still needed.

The flexfiles driver still handles marked devices in error paths, but will
now print a rate-limited warning.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

da066f3f

pNFS: return status from nfs4_pnfs_ds_connect · a33e4b03

Authored by Weston Andros Adamson on Mar 09, 2017

The nfs4_pnfs_ds_connect path can call rpc_create which can fail or it
can wait on another context to reach the same failure.

This checks that the rpc_create succeeded and returns the error to the
caller.

When an error is returned, both the files and flexfiles layouts will return
NULL from _prepare_ds(). The flexfiles layout will also return the layout
with the error NFS4ERR_NXIO.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

a33e4b03

NFSv4.1 respect server's max size in CREATE_SESSION · 03385332

Authored by Olga Kornievskaia on Mar 08, 2017

Currently the client doesn't respect the max sizes the server returns in
CREATE_SESSION. nfs4_session_set_rwsize() gets called while server->rsize and
server->wsize are 0, so they never get set to the sizes returned by the server.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

03385332

NFS prevent double free in async nfs4_exchange_id · 63513232

Authored by Olga Kornievskaia on Mar 13, 2017

Since rpc_task is async, the release function should be called which
will free the impl_id, scope, and owner.

Trond pointed at 2 more problems:
-- use of client pointer after free in the nfs4_exchangeid_release() function
-- cl_count mismatch if rpc_run_task() isn't run

Fixes: 8d89bd70 ("NFS setup async exchange_id")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Cc: stable@vger.kernel.org # 4.9
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

63513232

nfs: make nfs4_cb_sv_ops static · 05fae7bb

Authored by Jason Yan on Mar 10, 2017

Fixes the following sparse warning:

fs/nfs/callback.c:235:21: warning: symbol 'nfs4_cb_sv_ops' was not
declared. Should it be static?
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

05fae7bb

NFS: fix the fault nrequests decreasing for nfs_inode COPY · 38a33101

Authored by Kinglong Mee on Mar 09, 2017

The nfs_commit_file for NFSv4.2's COPY operation goes through
the commit path for normal WRITE, but without increasing nrequests,
so the decrement of nrequests in nfs_commit_release_pages is wrong.
After that, nrequests will be incorrect.

[ 5670.299881] ------------[ cut here ]------------
[ 5670.300295] WARNING: CPU: 0 PID: 27656 at fs/nfs/inode.c:127 nfs_clear_inode+0x66/0x90 [nfs]
[ 5670.300558] Modules linked in: nfsv4(E) nfs(E) fscache(E) tun bridge stp llc fuse ip_set nfnetlink vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event ppdev f2fs coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_ens1371 intel_rapl_perf gameport snd_ac97_codec vmw_balloon ac97_bus snd_seq snd_pcm joydev snd_rawmidi snd_timer snd_seq_device snd soundcore nfit parport_pc parport acpi_cpufreq tpm_tis tpm_tis_core tpm i2c_piix4 vmw_vmci shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm e1000 crc32c_intel mptspi scsi_transport_spi serio_raw mptscsih mptbase ata_generic pata_acpi fjes [last unloaded: fscache]
[ 5670.302925] CPU: 0 PID: 27656 Comm: umount.nfs4 Tainted: G        W   E   4.11.0-rc1+ #519
[ 5670.303292] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[ 5670.304094] Call Trace:
[ 5670.304510]  dump_stack+0x63/0x86
[ 5670.304917]  __warn+0xcb/0xf0
[ 5670.305276]  warn_slowpath_null+0x1d/0x20
[ 5670.305661]  nfs_clear_inode+0x66/0x90 [nfs]
[ 5670.306093]  nfs4_evict_inode+0x61/0x70 [nfsv4]
[ 5670.306480]  evict+0xbb/0x1c0
[ 5670.306888]  dispose_list+0x4d/0x70
[ 5670.307233]  evict_inodes+0x178/0x1a0
[ 5670.307579]  generic_shutdown_super+0x44/0xf0
[ 5670.307985]  nfs_kill_super+0x21/0x40 [nfs]
[ 5670.308325]  deactivate_locked_super+0x43/0x70
[ 5670.308698]  deactivate_super+0x5a/0x60
[ 5670.309036]  cleanup_mnt+0x3f/0x90
[ 5670.309407]  __cleanup_mnt+0x12/0x20
[ 5670.309837]  task_work_run+0x80/0xa0
[ 5670.310162]  exit_to_usermode_loop+0x89/0x90
[ 5670.310497]  syscall_return_slowpath+0xaa/0xb0
[ 5670.310875]  entry_SYSCALL_64_fastpath+0xa7/0xa9
[ 5670.311197] RIP: 0033:0x7f1bb3617fe7
[ 5670.311545] RSP: 002b:00007ffecbabb828 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
[ 5670.311906] RAX: 0000000000000000 RBX: 0000000001dca1f0 RCX: 00007f1bb3617fe7
[ 5670.312239] RDX: 000000000000000c RSI: 0000000000000001 RDI: 0000000001dc83c0
[ 5670.312653] RBP: 0000000001dc83c0 R08: 0000000000000001 R09: 0000000000000000
[ 5670.312998] R10: 0000000000000755 R11: 0000000000000206 R12: 00007ffecbabc66a
[ 5670.313335] R13: 0000000001dc83a0 R14: 0000000000000000 R15: 0000000000000000
[ 5670.313758] ---[ end trace bf4bfe7764e4eb40 ]---

Cc: linux-kernel@vger.kernel.org
Fixes: 67911c8f ("NFS: Add nfs_commit_file()")
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

38a33101

17 Mar, 2017 28 commits

afs: Don't wait for page writeback with the page lock held · c5051c7b

Authored by David Howells on Mar 16, 2017

Drop the page lock before waiting for page writeback.
Signed-off-by: David Howells <dhowells@redhat.com>

c5051c7b

afs: ->writepage() shouldn't call clear_page_dirty_for_io() · 65a15109

Authored by David Howells on Mar 16, 2017

The ->writepage() op shouldn't call clear_page_dirty_for_io() as that has
already been called by the caller.

Fix afs_writepage() by moving the call out of
afs_write_back_from_locked_page() to afs_writepages_region() where it is
needed.
Signed-off-by: David Howells <dhowells@redhat.com>

65a15109

afs: Fix abort on signal while waiting for call completion · 954cd6dc

Authored by David Howells on Mar 16, 2017

Fix the way in which a call that's in progress and being waited for is
aborted in the case that EINTR is detected. We should be sending
RX_USER_ABORT rather than RX_CALL_DEAD as the abort code.

Note that since the only two ways out of the loop are if the call completes
or if a signal happens, the kill-the-call clause after the loop has
finished can only happen in the case of EINTR. This means that we only
have one abort case to deal with, not two, and the "KWC" case can never
happen and so can be deleted.

Note further that simply aborting the call isn't necessarily the best thing
here since at this point: the request has been entirely sent and it's
likely the server will do the operation anyway - whether we abort it or
not. In future, we should punt the handling of the remainder of the call
off to a background thread.
Reported-by: Marc Dionne <marc.c.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>

954cd6dc

afs: Fix an off-by-one error in afs_send_pages() · 445783d0

Authored by David Howells on Mar 16, 2017

afs_send_pages() should only put the call into the AFS_CALL_AWAIT_REPLY
state if it has sent all the pages - but the check it makes is incorrect
and sometimes it will finish the loop early.
Signed-off-by: David Howells <dhowells@redhat.com>

445783d0

afs: Fix afs_kill_pages() · 7286a35e

Authored by David Howells on Mar 16, 2017

Fix afs_kill_pages() in two ways:

 (1) If a writeback has been partially flushed, then if we try and kill the
     pages it contains, some of them may no longer be undergoing writeback
     and end_page_writeback() will assert.

     Fix this by checking to see whether the page in question is actually
     undergoing writeback before ending that writeback.

 (2) The loop that scans for pages to kill doesn't increase the first page
     index, and so the loop may not terminate, but it will try to process
     the same pages over and over again.

     Fix this by increasing the first page index to one after the last page
     we processed.
Signed-off-by: David Howells <dhowells@redhat.com>

7286a35e
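
Point (2) of the commit above, advancing the first page index past the last page processed, can be sketched with a generic batch-scanning loop. This is a user-space illustration only: `lookup_batch()` is a hypothetical stand-in for a page lookup, `kill_range()` is not the kernel function, and the batch size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH 4 /* toy batch size (assumption) */

/*
 * Hypothetical stand-in for a page lookup: fills out[] with up to max
 * consecutive indices starting at first and not exceeding last.
 * Returns the number of indices stored.
 */
static size_t lookup_batch(unsigned long first, unsigned long last,
                           unsigned long *out, size_t max)
{
    size_t n = 0;

    while (n < max && first + n <= last) {
        out[n] = first + n;
        n++;
    }
    return n;
}

/* Visit every index in [first, last]; return how many were processed. */
size_t kill_range(unsigned long first, unsigned long last)
{
    unsigned long batch[BATCH];
    size_t total = 0;

    while (first <= last) {
        size_t n = lookup_batch(first, last, batch, BATCH);

        if (n == 0)
            break;
        total += n;
        /* Without this step the same batch would be fetched forever. */
        first = batch[n - 1] + 1;
    }
    return total;
}
```

With the advance in place, a range of ten indices is handled in three batches and each index is visited exactly once.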

afs: Fix page leak in afs_write_begin() · 6d06b0d2

Authored by David Howells on Mar 16, 2017

afs_write_begin() leaks a ref and a lock on a page if afs_fill_page()
fails. Fix the leak by unlocking and releasing the page in the error path.
Signed-off-by: David Howells <dhowells@redhat.com>

6d06b0d2

afs: Don't set PG_error on local EINTR or ENOMEM when filling a page · 68ae849d

Authored by David Howells on Mar 16, 2017

Don't set PG_error on a page if we get local EINTR or ENOMEM when filling a
page for writing.
Signed-off-by: David Howells <dhowells@redhat.com>

68ae849d

afs: Populate and use client modification time · ab94f5d0

Authored by Marc Dionne on Mar 16, 2017

The inode timestamps should be set from the client time
in the status received from the server, rather than the
server time which is meant for internal server use.

Set AFS_SET_MTIME and populate the mtime for operations
that take an input status, such as file/dir creation
and StoreData.  If an input time is not provided the
server will set the vnode times based on the current server
time.

In a situation where the server has some skew with the
client, this could lead to the client seeing a timestamp
in the future for a file that it just created or wrote.
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>

ab94f5d0

afs: Better abort and net error handling · 70af0e3b

Authored by David Howells on Mar 16, 2017

If we receive a network error, a remote abort or a protocol error whilst
we're still transmitting data, make sure we return an appropriate error to
the caller rather than ESHUTDOWN or ECONNABORTED.
Signed-off-by: David Howells <dhowells@redhat.com>

70af0e3b

afs: Invalid op ID should abort with RXGEN_OPCODE · 1157f153

Authored by David Howells on Mar 16, 2017

When we are given an invalid operation ID, we should abort that with
RXGEN_OPCODE rather than RX_INVALID_OPERATION.

Also map RXGEN_OPCODE to -ENOTSUPP.
Signed-off-by: David Howells <dhowells@redhat.com>

1157f153

afs: Fix the maths in afs_fs_store_data() · 146a1192

Authored by David Howells on Mar 16, 2017

afs_fs_store_data() works out the size of the write it's going to make,
but it uses 32-bit unsigned subtraction in one place that gets
automatically cast to loff_t.

However, if to < offset, then the number goes negative, but as the result
isn't signed, this doesn't get sign-extended to 64-bits when placed in a
loff_t.

Fix by casting the operands to loff_t.
Signed-off-by: David Howells <dhowells@redhat.com>

146a1192
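
The arithmetic problem described in the commit above can be reproduced in a few lines of user-space C. This is a sketch, not the kernel code: `int64_t` stands in for `loff_t`, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Buggy form: the subtraction happens in 32-bit unsigned arithmetic and
 * wraps modulo 2^32 before the result is widened to 64 bits.
 */
int64_t delta_buggy(uint32_t to, uint32_t offset)
{
    return to - offset;
}

/* Fixed form: widen both operands first, then subtract as signed 64-bit. */
int64_t delta_fixed(uint32_t to, uint32_t offset)
{
    return (int64_t)to - (int64_t)offset;
}
```

With `to = 10` and `offset = 20`, the buggy form yields 4294967286 (2^32 - 10) while the fixed form yields -10. When `to >= offset` the two agree, which is why the problem only shows up for some writes.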

afs: Use a bvec rather than a kvec in afs_send_pages() · 2f5705a5

Authored by David Howells on Mar 16, 2017

Use a bvec rather than a kvec in afs_send_pages() as we don't then have to
call kmap() in advance.  This allows us to pass the array of contiguous
pages that we extracted through to rxrpc in one go rather than passing a
single page at a time.
Signed-off-by: David Howells <dhowells@redhat.com>

2f5705a5

afs: Make struct afs_read::remain 64-bit · 6a0e3999

Authored by David Howells on Mar 16, 2017

Make struct afs_read::remain 64-bit so that it can handle huge transfers if
we ever request them or the server decides to give us a bit of extra data
(the other fields there are already 64-bit).
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>

6a0e3999

afs: Fix AFS read bug · 29f06985

Authored by David Howells on Mar 16, 2017

Fix a bug in AFS read whereby the request page afs_read::index isn't
incremented after calling ->page_done() if ->remain reaches 0, indicating
that the data read is complete.

Without this a NULL pointer exception happens when ->page_done() is called
twice for the last page because the page clearing loop will call it also
and afs_readpages_page_done() clears the current entry in the page list.

BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: afs_readpages_page_done+0x21/0xa4 [kafs]
PGD 0
Oops: 0002 [#1] SMP
Modules linked in: kafs(E)
CPU: 2 PID: 3002 Comm: md5sum Tainted: G            E   4.10.0-fscache #485
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
task: ffff8804017d86c0 task.stack: ffff8803fc1d8000
RIP: 0010:afs_readpages_page_done+0x21/0xa4 [kafs]
RSP: 0018:ffff8803fc1db978 EFLAGS: 00010282
RAX: ffff880405d39af8 RBX: 0000000000000000 RCX: ffff880407d83ed4
RDX: 0000000000000000 RSI: ffff880405d39a00 RDI: ffff880405c6f400
RBP: ffff8803fc1db988 R08: 0000000000000000 R09: 0000000000000001
R10: ffff8803fc1db820 R11: ffff88040cf56000 R12: ffff8804088f1780
R13: ffff8804017d86c0 R14: ffff8804088f1780 R15: 0000000000003840
FS:  00007f8154469700(0000) GS:ffff88041fb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000004016ec000 CR4: 00000000001406e0
Call Trace:
 afs_deliver_fs_fetch_data+0x5b9/0x60e [kafs]
 ? afs_make_call+0x316/0x4e8 [kafs]
 ? afs_make_call+0x359/0x4e8 [kafs]
 afs_deliver_to_call+0x173/0x2e8 [kafs]
 ? afs_make_call+0x316/0x4e8 [kafs]
 afs_make_call+0x37a/0x4e8 [kafs]
 ? wake_up_q+0x4f/0x4f
 ? __init_waitqueue_head+0x36/0x49
 afs_fs_fetch_data+0x21c/0x227 [kafs]
 ? afs_fs_fetch_data+0x21c/0x227 [kafs]
 afs_vnode_fetch_data+0xf3/0x1d2 [kafs]
 afs_readpages+0x314/0x3fd [kafs]
 __do_page_cache_readahead+0x208/0x2c5
 ondemand_readahead+0x3a2/0x3b7
 ? ondemand_readahead+0x3a2/0x3b7
 page_cache_async_readahead+0x5e/0x67
 generic_file_read_iter+0x23b/0x70c
 ? __inode_security_revalidate+0x2f/0x62
 __vfs_read+0xc4/0xe8
 vfs_read+0xd1/0x15a
 SyS_read+0x4c/0x89
 do_syscall_64+0x80/0x191
 entry_SYSCALL64_slow_path+0x25/0x25
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>

29f06985

afs: Prevent callback expiry timer overflow · 56e71431

Authored by Tina Ruchandani on Mar 16, 2017

get_seconds() returns real wall-clock seconds. On 32-bit systems
this value will overflow in year 2038 and beyond. This patch changes
afs_vnode record to use ktime_get_real_seconds() instead, for the
fields cb_expires and cb_expires_at.
Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>

56e71431
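
The overflow that the commit above guards against can be illustrated in user space. The sketch below assumes an expiry time stored in a signed 32-bit field versus a 64-bit one. The function names are hypothetical, and the narrowing conversion relies on the wrap-around behaviour of mainstream compilers, which the C standard (before C23) leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Expiry time kept in 64 bits: no wrap for any realistic date. */
int64_t expiry_64(int64_t now, uint32_t lifetime)
{
    return now + (int64_t)lifetime;
}

/*
 * Expiry time squeezed into a signed 32-bit field (assumption: this is
 * how a 32-bit seconds counter would be stored). The value is reduced
 * modulo 2^32 and then reinterpreted as signed.
 */
int32_t expiry_truncated(int64_t now, uint32_t lifetime)
{
    return (int32_t)(uint32_t)(now + (int64_t)lifetime);
}
```

At `now = INT32_MAX` (2038-01-19 03:14:07 UTC), adding one second gives 2147483648 in 64 bits, but the truncated field wraps to `INT32_MIN`, so an expiry that should lie in the future appears to lie far in the past.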

afs: Migrate vlocation fields to 64-bit · 8a79790b

Authored by Tina Ruchandani on Mar 16, 2017

get_seconds() returns real wall-clock seconds. On 32-bit systems
this value will overflow in year 2038 and beyond. This patch changes
afs's vlocation record to use ktime_get_real_seconds() instead, for the
fields time_of_death and update_at.
Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>

8a79790b

afs: security: Replace rcu_assign_pointer() with RCU_INIT_POINTER() · df8a09d1

Authored by Andreea-Cristina Bernat on Mar 16, 2017

The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1.   This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.

The following Coccinelle semantic patch was used:
@@
@@

- rcu_assign_pointer
+ RCU_INIT_POINTER
  (..., NULL)
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>

df8a09d1

afs: inode: Replace rcu_assign_pointer() with RCU_INIT_POINTER() · 1d7e4ebf

Authored by Andreea-Cristina Bernat on Mar 16, 2017

The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1.   This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.

The following Coccinelle semantic patch was used:
@@
@@

- rcu_assign_pointer
+ RCU_INIT_POINTER
  (..., NULL)
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>

1d7e4ebf

afs: Distinguish mountpoints from symlinks by file mode alone · 944c74f4

Authored by David Howells on Mar 16, 2017

In AFS, mountpoints appear as symlinks with mode 0644 and normal symlinks
have mode 0777, so use this to distinguish them rather than reading the
content and parsing it. In the case of a mountpoint, the symlink body is a
formatted string indicating the location of the target volume.

Note that with this, kAFS no longer 'pre-fetches' the contents of symlinks,
so afs_readpage() may fail with an access-denial because when the VFS calls
d_automount(), it wraps the call in a credentials override that sets the
initial creds - thereby preventing access to the caller's keyrings and the
authentication keys held therein.

To this end, a patch reverting that change to the VFS is required also.
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>

944c74f4
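
The mode-based distinction described in the commit above amounts to a single comparison on the permission bits. A minimal sketch, using hypothetical enum and function names and treating any symlink whose mode is not 0644 as a normal symlink (an assumption; the commit message only names 0644 and 0777):

```c
#include <assert.h>

/* Hypothetical stand-ins for the file type reported by the server. */
enum ftype { FTYPE_FILE, FTYPE_DIR, FTYPE_SYMLINK };

/* How the client would present the object locally. */
enum kind { KIND_FILE, KIND_DIR, KIND_SYMLINK, KIND_MOUNTPOINT };

enum kind classify(enum ftype type, unsigned mode)
{
    switch (type) {
    case FTYPE_FILE:
        return KIND_FILE;
    case FTYPE_DIR:
        return KIND_DIR;
    case FTYPE_SYMLINK:
        /* Only the permission bits matter; no symlink body is read. */
        return (mode & 0777) == 0644 ? KIND_MOUNTPOINT : KIND_SYMLINK;
    }
    return KIND_FILE; /* not reached for valid enum values */
}
```

Because the decision uses only status information, the symlink body does not need to be fetched at classification time, which is the behaviour change the commit message points out.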

afs: Flush outstanding writes when an fd is closed · 58fed94d

Authored by David Howells on Mar 16, 2017

Flush outstanding writes in afs when an fd is closed.  This is what NFS and
CIFS do.
Reported-by: Marc Dionne <marc.c.dionne@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>

58fed94d

afs: Handle a short write to an AFS page · e8e581a8

Authored by David Howells on Mar 16, 2017

Handle the situation where afs_write_begin() is told to expect that a
full-page write will be made, but this doesn't happen (EFAULT, CTRL-C,
etc.), and so afs_write_end() sees a partial write took place.  Currently,
no attempt is made to deal with the discrepancy.

Fix this by loading the gap from the server.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>

e8e581a8

afs: Kill struct afs_read::pg_offset · 3448e652

Authored by David Howells on Mar 16, 2017

Kill struct afs_read::pg_offset as nothing uses it.  It's unnecessary as pos
can be masked off.
Signed-off-by: David Howells <dhowells@redhat.com>

3448e652

afs: Handle better the server returning excess or short data · 6db3ac3c

Authored by David Howells on Mar 16, 2017

When an AFS server is given an FS.FetchData{,64} request to read data from
a file, it is permitted by the protocol to return more or less than was
requested.  kafs currently relies on the latter behaviour in readpage{,s}
to handle a partial page at the end of the file (we just ask for a whole
page and clear space beyond the short read).

However, we don't handle all cases.  Add:

 (1) Handle excess data by discarding it rather than aborting.  Note that
     we use a common static buffer to discard into so that the decryption
     algorithm advances the PCBC state.

 (2) Handle a short read that affects more than just the last page.

Note that if a read comes up unexpectedly short or long, it's possible that
the server's copy of the file changed - in which case the data version
number will have been incremented and the callback will have been broken -
in which case all the pages currently attached to the inode will be zapped
anyway at some point.
Signed-off-by: David Howells <dhowells@redhat.com>

6db3ac3c

afs: Deal with an empty callback array · bcd89270

Authored by Marc Dionne on Mar 16, 2017

Servers may send a callback array that is the same size as
the FID array, or an empty array.  If the callback count is
0, the code would attempt to read (fid_count * 12) bytes of
data, which would fail and result in an unmarshalling error.
This would lead to stale data for remotely modified files
or directories.

Store the callback array size in the internal afs_call
structure and use that to determine the amount of data to
read.
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>

bcd89270
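
The sizing rule from the commit above can be sketched as a small helper that derives the number of bytes to read from the callback count instead of the FID count. This is an illustration only: the function name is hypothetical, and rejecting counts that are neither zero nor equal to the FID count is an assumption based on the first sentence of the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define CB_ENTRY_SIZE 12u /* bytes per callback entry, per the commit message */

/*
 * Return the number of callback bytes to read, or -1 if the callback
 * count is neither 0 nor equal to the FID count (treated here as a
 * protocol error, which is an assumption of this sketch).
 */
int64_t cb_payload_size(uint32_t fid_count, uint32_t cb_count)
{
    if (cb_count != 0 && cb_count != fid_count)
        return -1;
    return (int64_t)cb_count * CB_ENTRY_SIZE;
}
```

For three FIDs, a full callback array needs 36 bytes and an empty one needs none. Sizing the read from `fid_count` alone would wait for 36 bytes that an empty array never supplies.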

afs: Adjust mode bits processing · 627f4694

Authored by Marc Dionne on Mar 16, 2017

Mode bits for an afs file should not be enforced in the usual
way.

For files, the absence of user bits can restrict file access
with respect to what is granted by the server.

These bits apply regardless of the owner or the current uid; the
rest of the mode bits (group, other) are ignored.
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>

627f4694

afs: Populate group ID from vnode status · 6186f078

Authored by Marc Dionne on Mar 16, 2017

The group was hard coded to GLOBAL_ROOT_GID; use the group
ID that was received from the server.
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>

6186f078

afs: Fix page overput in afs_fill_page() · 5611ef28

Authored by David Howells on Mar 16, 2017

afs_fill_page() loads the page it wants to fill into the afs_read request
without incrementing its refcount - but then calls afs_put_read() to clean
up afterwards, which then releases a ref on the page.

Fix this by getting a ref on the page before calling
afs_vnode_fetch_data().

This causes sync after a write to hang in afs_writepages_region() because
find_get_pages_tag() gets confused and doesn't return.

Fixes: 196ee9cd ("afs: Make afs_fs_fetch_data() take a list of pages")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>

5611ef28

afs: Fix missing put_page() · 29c8bbbd

Authored by David Howells on Mar 16, 2017

In afs_writepages_region(), inside the loop where we find dirty pages to
deal with, one of the if-statements is missing a put_page().
Signed-off-by: David Howells <dhowells@redhat.com>

29c8bbbd

15 Mar, 2017 1 commit

gfs2: Avoid alignment hole in struct lm_lockname · 28ea06c4

Authored by Andreas Gruenbacher on Mar 06, 2017

Commit 88ffbf3e switches to using rhashtables for glocks, hashing over
the entire struct lm_lockname instead of its individual fields.  On some
architectures, struct lm_lockname contains a hole of uninitialized
memory due to alignment rules, which now leads to incorrect hash values.
Get rid of that hole.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
CC: <stable@vger.kernel.org> #v4.3+

28ea06c4
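
The hazard described in the commit above, hashing over a whole structure that contains padding, can be demonstrated in user space. The sketch below does not use the real `struct lm_lockname` layout. `struct key`, the FNV-1a hash, and the helper names are assumptions chosen for illustration. On common ABIs `struct key` carries four bytes of tail padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy key: 8-byte field followed by a 4-byte field (hypothetical layout). */
struct key {
    uint64_t number;
    uint32_t type;
    /* any bytes after 'type' up to sizeof(struct key) are padding */
};

/* Number of bytes that actually carry key data. */
#define KEY_LEN (offsetof(struct key, type) + sizeof(uint32_t))

/* 32-bit FNV-1a over an arbitrary byte range. */
static uint32_t hash_bytes(const void *p, size_t len)
{
    const unsigned char *b = p;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= b[i];
        h *= 16777619u;
    }
    return h;
}

/* Risky: includes whatever happens to sit in the padding bytes. */
uint32_t hash_whole(const struct key *k)
{
    return hash_bytes(k, sizeof(*k));
}

/* Safe: covers only the bytes that belong to real fields. */
uint32_t hash_fields(const struct key *k)
{
    return hash_bytes(k, KEY_LEN);
}
```

Two keys with equal `number` and `type` always produce the same `hash_fields()` value, whereas `hash_whole()` may differ if their padding bytes differ. The commit itself takes a different route to the same end by removing the hole from the structure.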

openeuler / Kernel synced successfully more than 1 year ago