Commits · e27f49c33b7410f4db065bc4382a8e03769eecc3 · openeuler / Kernel

07 Mar, 2012 6 commits

nfsd41: refactor nfsd4_deleg_xgrade_none_ext logic out of nfsd4_process_open2 · e27f49c3

Committed by Benny Halevy on Feb 21, 2012

Handle the case where the NFSv4.1 client asked to upgrade or downgrade
its delegations and the server returns no delegation.

In this case, op_delegate_type is set to NFS4_OPEN_DELEGATE_NONE_EXT
and op_why_no_deleg is set respectively to WND4_NOT_SUPP_{UP,DOWN}GRADE.

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: refactor nfs4_open_deleg_none_ext logic out of nfs4_open_delegation · 4aa8913c

Committed by Benny Halevy on Feb 21, 2012

When a 4.1 client asks for a delegation and the server returns none,
op_delegate_type is set to NFS4_OPEN_DELEGATE_NONE_EXT
and op_why_no_deleg is set to either WND4_CONTENTION or WND4_RESOURCE.
Or, if the client sent NFS4_SHARE_WANT_CANCEL (which it is not supposed
to ever do until our server supports delegation signaling),
op_why_no_deleg is set to WND4_CANCELLED.

Note that for WND4_CONTENTION and WND4_RESOURCE, the xdr layer is hard coded
at this time to encode boolean FALSE for ond_server_will_push_deleg /
ond_server_will_signal_avail.

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: fix recovery-entry leak nfsd startup failure · a8ae08eb

Committed by J. Bruce Fields on Mar 06, 2012

Another leak on error.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: fix recovery-dir leak on nfsd startup failure · a6d6b781

Committed by Jeff Layton on Mar 05, 2012

The current code never calls nfsd4_shutdown_recdir if nfs4_state_start
returns an error. Also, it's better to go ahead and consolidate these
functions since one is just a trivial wrapper around the other.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: purge stable client records with insufficient state · 393d8ed8

Committed by J. Bruce Fields on Mar 06, 2012

To escape having your stable storage record purged at the end of the
grace period, it's not sufficient to simply have performed a
setclientid_confirm; you also need to meet the same requirements as
someone creating a new record: either you should have done an open or
open reclaim (in the 4.0 case) or a reclaim_complete (in the 4.1 case).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: don't set cl_firststate on first reclaim in 4.1 case · 1255a8f3

Committed by J. Bruce Fields on Mar 06, 2012

We set cl_firststate when we first decide that a client will be
permitted to reclaim state on next boot.  This happens:

	- for new 4.0 clients, when they confirm their first open
	- for returning 4.0 clients, when they reclaim their first open
	- for 4.1+ clients, when they perform reclaim_complete

We also use cl_firststate to decide whether a reclaim_complete has
already been performed, in the 4.1+ case.

We were setting it on 4.1 open reclaims, which caused spurious
COMPLETE_ALREADY errors on RECLAIM_COMPLETE from an NFSv4.1 client with
anything to reclaim.

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
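The three rules listed in commit 1255a8f3 can be written as a small decision function. This is only a sketch under stated assumptions: the enum, the event names and `marks_first_state()` are hypothetical and are not taken from the nfsd source.

```c
#include <stdbool.h>

/* Hypothetical event names, for illustration only. */
enum client_event {
	EV_OPEN_CONFIRM,      /* client confirms its first open */
	EV_OPEN_RECLAIM,      /* returning client reclaims an open */
	EV_RECLAIM_COMPLETE   /* client performs reclaim_complete */
};

/*
 * Returns true when the event should mark the client as permitted to
 * reclaim state on the next boot, following the rules in the message:
 *   - 4.0 clients: first open confirm, or first open reclaim
 *   - 4.1+ clients: reclaim_complete only
 */
bool marks_first_state(unsigned int minorversion, enum client_event ev)
{
	if (minorversion == 0)
		return ev == EV_OPEN_CONFIRM || ev == EV_OPEN_RECLAIM;
	return ev == EV_RECLAIM_COMPLETE;
}
```

Under this sketch a 4.1 open reclaim returns false, which is the behavior the commit describes restoring.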

18 Feb, 2012 5 commits

nfsd41: implement NFS4_SHARE_WANT_NO_DELEG, NFS4_OPEN_DELEGATE_NONE_EXT, why_no_deleg · d24433cd

Committed by Benny Halevy on Feb 16, 2012

Respect a client's request not to get a delegation in NFSv4.1.
Return the delegation "type" NFS4_OPEN_DELEGATE_NONE_EXT
and the WND4_NOT_WANTED reason as appropriate.

[nfsd41: add missing break when encoding op_why_no_deleg]

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

NFSD: Clean up the test_stateid function · 03cfb420

Committed by Bryan Schumaker on Jan 27, 2012

When I initially wrote it, I didn't understand how lists worked so I
wrote something that didn't use them.  I think making a list of stateids
to test is a more straightforward implementation, especially compared to
decoding stateids while simultaneously encoding a reply to the client.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

lockd: fix arg parsing for grace_period and timeout. · de5b8e8e

Committed by NeilBrown on Feb 07, 2012

If you try to set grace_period or timeout via a module parameter
to lockd, and do this on a big-endian machine where

   sizeof(int) != sizeof(unsigned long)

it won't work.  The number given will be effectively shifted right
by the difference in those two sizes.

So cast kp->arg properly to get the correct result.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: split out share_access want and signal flags while decoding · 2c8bd7e0

Committed by Benny Halevy on Feb 16, 2012

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: share_access_to_flags should consider only nfs4.x share_access flags · 00b5f95a

Committed by Benny Halevy on Feb 16, 2012

Currently, it will not correctly ignore any NFSv4.1 signal flags
if the client sends them.

Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
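Commits 2c8bd7e0 and 00b5f95a both concern separating the basic share_access bits from the NFSv4.1 "want" and signal bits carried in the same word. A minimal sketch of such a split follows. The `EX_` mask names, the struct and `split_share_access()` are hypothetical. The mask widths are assumptions modelled on the RFC 5661 OPEN4_SHARE_ACCESS constants, not copied from the nfsd source.

```c
#include <stdint.h>

/* Assumed layout of the share_access word (modelled on RFC 5661). */
#define EX_SHARE_ACCESS_MASK 0x0000000Fu /* READ = 1, WRITE = 2, BOTH = 3 */
#define EX_SHARE_WANT_MASK   0x0000FF00u /* delegation "want" hints */
#define EX_SHARE_WHEN_MASK   0x000F0000u /* signal / push flags */

struct share_access_parts {
	uint32_t access; /* basic access bits only */
	uint32_t want;   /* want bits, still in place */
	uint32_t when;   /* signal bits, still in place */
};

/* Split the decoded word so later code can look at each part alone. */
struct share_access_parts split_share_access(uint32_t word)
{
	struct share_access_parts p;

	p.access = word & EX_SHARE_ACCESS_MASK;
	p.want = word & EX_SHARE_WANT_MASK;
	p.when = word & EX_SHARE_WHEN_MASK;
	return p;
}
```

With the parts separated, code that maps share_access to open flags can consult only `access` and is unaffected by whatever want or signal bits the client sent.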

16 Feb, 2012 10 commits

nfsd41: use current stateid by value · 37c593c5

Committed by Tigran Mkrtchyan on Feb 13, 2012

Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: consume current stateid on DELEGRETURN and OPENDOWNGRADE · 9428fe1a

Committed by Tigran Mkrtchyan on Feb 13, 2012

Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: handle current stateid in SETATTR and FREE_STATEID · 1e97b519

Committed by Tigran Mkrtchyan on Feb 13, 2012

Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: mark LOOKUP, LOOKUPP and CREATE to invalidate current stateid · d1471053

Committed by Tigran Mkrtchyan on Feb 13, 2012

Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: save and restore current stateid with current fh · 83071114

Committed by Tigran Mkrtchyan on Feb 13, 2012

Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: mark PUTFH, PUTPUBFH and PUTROOTFH to clear current stateid · 80e01cc1

Committed by Tigran Mkrtchyan on Feb 13, 2012

Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: consume current stateid on read and write · 30813e27

Committed by Tigran Mkrtchyan on Feb 13, 2012

Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: handle current stateid on lock and locku · 62cd4a59

Committed by Tigran Mkrtchyan on Feb 13, 2012

Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd41: handle current stateid in open and close · 8b70484c

Committed by Tigran Mkrtchyan on Feb 13, 2012

Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: initialize current stateid at compile time · 19ff0f28

Committed by Tigran Mkrtchyan on Feb 13, 2012

Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

15 Feb, 2012 2 commits

nfsd4: check for uninitialized slot · bf5c43c8

Committed by J. Bruce Fields on Feb 13, 2012

This fixes an oops when a buggy client tries to use an initial seqid of
0 on a new slot, which we may misinterpret as a replay.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: rearrange struct nfsd4_slot · 73e79482

Committed by J. Bruce Fields on Feb 13, 2012

Combine two booleans into a single flag field, move the smaller fields
to the end.

(In practice this doesn't make the struct any smaller.  But we'll be
adding another flag here soon.)

Remove some debugging code that doesn't look useful, while we're in the
neighborhood.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
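The rearrangement described in commit 73e79482 (two booleans folded into one flag field, smaller fields moved to the end) follows a common C pattern. The sketch below illustrates that pattern only. `struct example_slot`, its members and the `SLOT_` bit names are hypothetical and do not reproduce the real struct nfsd4_slot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; adding another flag later costs no extra space. */
#define SLOT_INUSE     (1u << 0)
#define SLOT_CACHETHIS (1u << 1)

/* Hypothetical layout: wider fields first, the small ones at the end. */
struct example_slot {
	uint32_t seqid;
	uint16_t datalen;
	uint8_t flags; /* replaces two separate bool members */
};

void slot_set(struct example_slot *s, uint8_t bit)
{
	s->flags |= bit;
}

void slot_clear(struct example_slot *s, uint8_t bit)
{
	s->flags &= (uint8_t)~bit;
}

bool slot_test(const struct example_slot *s, uint8_t bit)
{
	return (s->flags & bit) != 0;
}
```

As the commit message notes for the real struct, a change like this need not shrink the structure, since padding can absorb the difference. The benefit is room for further flags.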

14 Feb, 2012 1 commit

nfsd4: fix sessions slotid wraparound logic · f6d82485

Committed by J. Bruce Fields on Feb 13, 2012

From RFC 5661 2.10.6.1: "If the previous sequence ID was 0xFFFFFFFF,
then the next request for the slot MUST have the sequence ID set to
zero."

While we're there, delete some redundant comments.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
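The RFC 5661 rule quoted in commit f6d82485 falls out of ordinary unsigned 32-bit arithmetic, where 0xFFFFFFFF + 1 wraps to 0. The sketch below shows that behavior. `next_seqid()`, `classify_seqid()` and the enum are hypothetical helpers written for illustration, not the kernel's slot-checking code.

```c
#include <stdint.h>

/*
 * Sequence ID a slot expects next. Unsigned 32-bit addition wraps
 * modulo 2^32, so the successor of 0xFFFFFFFF is 0 with no special case.
 */
uint32_t next_seqid(uint32_t prev)
{
	return (uint32_t)(prev + 1u);
}

/* Hypothetical classification of an incoming request on a slot. */
enum seq_result { SEQ_NEW, SEQ_REPLAY, SEQ_MISORDERED };

enum seq_result classify_seqid(uint32_t seqid, uint32_t slot_seqid)
{
	if (seqid == next_seqid(slot_seqid))
		return SEQ_NEW;    /* next in order, including the wrap to 0 */
	if (seqid == slot_seqid)
		return SEQ_REPLAY; /* same ID as the previous request */
	return SEQ_MISORDERED;
}
```

With a slot whose previous sequence ID is 0xFFFFFFFF, a request carrying 0 is classified as new, while a request carrying 1 is treated as misordered.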

04 Feb, 2012 3 commits

nfsd: fix default iosize calculation on 32bit · 508f9227

Committed by J. Bruce Fields on Jan 30, 2012

The rpc buffers will be allocated out of low memory, so we should really
only be taking that into account.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: cleanup setting of default max_block_size · 87b0fc7d

Committed by J. Bruce Fields on Jan 30, 2012

Move calculation of the default into a helper function.

Get rid of an unused variable "err" while we're there.

Thanks to Mi Jinlong for catching an arithmetic error in a previous
version.

Cc: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd: remove some unneeded checks · 3476964d

Committed by Dan Carpenter on Jan 20, 2012

We check for zero length strings in the caller now, so these aren't
needed.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

28 Jan, 2012 9 commits

Logfs: Allow NULL block_isbad() methods · f2933e86

Committed by Joern Engel on Aug 05, 2011

Not all mtd drivers define block_isbad().  Let's assume no bad blocks
instead of refusing to mount.

Signed-off-by: Joern Engel <joern@logfs.org>
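The approach in commit f2933e86, treating an absent block_isbad() method as "no bad blocks", is the usual optional-callback pattern. The sketch below illustrates it. `struct example_mtd`, `example_is_bad()` and `always_bad()` are hypothetical stand-ins and do not mirror the real mtd or logfs interfaces.

```c
#include <stddef.h>

/* Hypothetical stand-in for a flash driver's operations. */
struct example_mtd {
	/* Optional: returns nonzero if the block at ofs is bad. */
	int (*block_isbad)(struct example_mtd *mtd, long long ofs);
};

/* A missing method means "assume the block is good", not an error. */
int example_is_bad(struct example_mtd *mtd, long long ofs)
{
	if (mtd->block_isbad == NULL)
		return 0;
	return mtd->block_isbad(mtd, ofs);
}

/* Example driver method that reports every block as bad. */
int always_bad(struct example_mtd *mtd, long long ofs)
{
	(void)mtd;
	(void)ofs;
	return 1;
}
```

A driver that leaves the pointer NULL is then usable, while one that supplies the method still has its answer honored.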

logfs: Grow inode in delete path · bbe01387

Committed by Joern Engel on Aug 05, 2011

Can be necessary if an inode gets deleted (through -ENOSPC) before being
written.  Might be better to move this into logfs_write_rec(), but for
now go with the stupid&safe patch.

Signed-off-by: Joern Engel <joern@logfs.org>

logfs: Free areas before calling generic_shutdown_super() · 1bcceaff

Committed by Joern Engel on Aug 05, 2011

Or hit an assertion in map_invalidatepage() instead.

Signed-off-by: Joern Engel <joern@logfs.org>

logfs: remove useless BUG_ON · 6c69494f

Committed by Joern Engel on Sep 12, 2011

It prevents write sizes >4k.

Signed-off-by: Joern Engel <joern@logfs.org>

logfs: Propagate page parameter to __logfs_write_inode · 0bd90387

Committed by Prasad Joshi on Oct 02, 2011

During GC, LogFS has to rewrite each valid block to a separate segment.
The rewrite operation reads data from an old segment and writes it to a
newly allocated segment. Since every write operation changes the data
block pointers maintained in the inode, the inode should also be rewritten.

In the GC path, to avoid an AB-BA deadlock, LogFS marks a page with
PG_pre_locked in addition to locking the page (PG_locked). The page
lock is ignored iff the page is pre-locked.

LogFS uses a special file called the segment file. The segment file
maintains an 8-byte entry for every segment. It keeps track of erase
count, level etc. for every segment.

Bad things happen when a segment belonging to the segment file is GCed:

 ------------[ cut here ]------------
kernel BUG at /home/prasad/logfs/readwrite.c:297!
invalid opcode: 0000 [#1] SMP
Modules linked in: logfs joydev usbhid hid psmouse e1000 i2c_piix4
		serio_raw [last unloaded: logfs]
Pid: 20161, comm: mount Not tainted 3.1.0-rc3+ #3 innotek GmbH
		VirtualBox
EIP: 0060:[<f809132a>] EFLAGS: 00010292 CPU: 0
EIP is at logfs_lock_write_page+0x6a/0x70 [logfs]
EAX: 00000027 EBX: f73f5b20 ECX: c16007c8 EDX: 00000094
ESI: 00000000 EDI: e59be6e4 EBP: c7337b28 ESP: c7337b18
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process mount (pid: 20161, ti=c7336000 task=eb323f70 task.ti=c7336000)
Stack:
f8099a3d c7337b24 f73f5b20 00001002 c7337b50 f8091f6d f8099a4d f80994e4
00000003 00000000 c7337b68 00000000 c67e4400 00001000 c7337b80 f80935e5
00000000 00000000 00000000 00000000 e1fcf000 0000000f e59be618 c70bf900
Call Trace:
[<f8091f6d>] logfs_get_write_page.clone.16+0xdd/0x100 [logfs]
[<f80935e5>] logfs_mod_segment_entry+0x55/0x110 [logfs]
[<f809460d>] logfs_get_segment_entry+0x1d/0x20 [logfs]
[<f8091060>] ? logfs_cleanup_journal+0x50/0x50 [logfs]
[<f809521b>] ostore_get_erase_count+0x1b/0x40 [logfs]
[<f80965b8>] logfs_open_area+0xc8/0x150 [logfs]
[<c141a7ec>] ? kmemleak_alloc+0x2c/0x60
[<f809668e>] __logfs_segment_write.clone.16+0x4e/0x1b0 [logfs]
[<c10dd563>] ? mempool_kmalloc+0x13/0x20
[<c10dd563>] ? mempool_kmalloc+0x13/0x20
[<f809696f>] logfs_segment_write+0x17f/0x1d0 [logfs]
[<f8092e8c>] logfs_write_i0+0x11c/0x180 [logfs]
[<f8092f35>] logfs_write_direct+0x45/0x90 [logfs]
[<f80934cd>] __logfs_write_buf+0xbd/0xf0 [logfs]
[<c102900e>] ? kmap_atomic_prot+0x4e/0xe0
[<f809424b>] logfs_write_buf+0x3b/0x60 [logfs]
[<f80947a9>] __logfs_write_inode+0xa9/0x110 [logfs]
[<f8094cb0>] logfs_rewrite_block+0xc0/0x110 [logfs]
[<f8095300>] ? get_mapping_page+0x10/0x60 [logfs]
[<f8095aa0>] ? logfs_load_object_aliases+0x2e0/0x2f0 [logfs]
[<f808e57d>] logfs_gc_segment+0x2ad/0x310 [logfs]
[<f808e62a>] __logfs_gc_once+0x4a/0x80 [logfs]
[<f808ed43>] logfs_gc_pass+0x683/0x6a0 [logfs]
[<f8097a89>] logfs_mount+0x5a9/0x680 [logfs]
[<c1126b21>] mount_fs+0x21/0xd0
[<c10f6f6f>] ? __alloc_percpu+0xf/0x20
[<c113da41>] ? alloc_vfsmnt+0xb1/0x130
[<c113db4b>] vfs_kern_mount+0x4b/0xa0
[<c113e06e>] do_kern_mount+0x3e/0xe0
[<c113f60d>] do_mount+0x34d/0x670
[<c10f2749>] ? strndup_user+0x49/0x70
[<c113fcab>] sys_mount+0x6b/0xa0
[<c142d87c>] syscall_call+0x7/0xb
Code: f8 e8 8b 93 39 c9 8b 45 f8 3e 0f ba 28 00 19 d2 85 d2 74 ca eb d0 0f 0b 8d 45 fc 89 44 24 04 c7 04 24 3d 9a 09 f8 e8 09 92 39 c9 <0f> 0b 8d 74 26 00 55 89 e5 3e 8d 74 26 00 8b 10 80 e6 01 74 09
EIP: [<f809132a>] logfs_lock_write_page+0x6a/0x70 [logfs] SS:ESP 0068:c7337b18
---[ end trace 96e67d5b3aa3d6ca ]---

The patch passes the locked page to __logfs_write_inode. It calls the function
logfs_get_wblocks() to pre-lock the page. This ensures any further
attempts to lock the page are ignored (esp. from get_erase_count).

Acked-by: Joern Engel <joern@logfs.org>
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>

logfs: set superblock shutdown flag after generic sb shutdown · ecfd8909

Committed by Prasad Joshi on Oct 30, 2011

While unmounting the file system, LogFS calls generic_shutdown_super.
The function does file system independent superblock shutdown.
However, it might result in a call to file system specific inode eviction.

LogFS marks the FS as shutting down by setting the bit LOGFS_SB_FLAG_SHUTDOWN
in super->s_flags. Since inode eviction might call truncate on an inode,
the following BUG is observed when the file system is unmounted:

------------[ cut here ]------------
kernel BUG at /home/prasad/logfs/segment.c:362!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU 3
Modules linked in: logfs binfmt_misc ppdev virtio_blk parport_pc lp
	parport psmouse floppy virtio_pci serio_raw virtio_ring virtio

Pid: 1933, comm: umount Not tainted 3.0.0+ #4 Bochs Bochs
RIP: 0010:[<ffffffffa008c841>]  [<ffffffffa008c841>]
		logfs_segment_write+0x211/0x230 [logfs]
RSP: 0018:ffff880062d7b9e8  EFLAGS: 00010202
RAX: 000000000000000e RBX: ffff88006eca9000 RCX: 0000000000000000
RDX: ffff88006fd87c40 RSI: ffffea00014ff468 RDI: ffff88007b68e000
RBP: ffff880062d7ba48 R08: 8000000020451430 R09: 0000000000000000
R10: dead000000100100 R11: 0000000000000000 R12: ffff88006fd87c40
R13: ffffea00014ff468 R14: ffff88005ad0a460 R15: 0000000000000000
FS:  00007f25d50ea760(0000) GS:ffff88007fd80000(0000)
	knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000d05e48 CR3: 0000000062c72000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 1933, threadinfo ffff880062d7a000,
	task ffff880070b44500)
Stack:
ffff880062d7ba38 ffff88005ad0a508 0000000000001000 0000000000000000
8000000020451430 ffffea00014ff468 ffff880062d7ba48 ffff88005ad0a460
ffff880062d7bad8 ffffea00014ff468 ffff88006fd87c40 0000000000000000
Call Trace:
[<ffffffffa0088fee>] logfs_write_i0+0x12e/0x190 [logfs]
[<ffffffffa0089360>] __logfs_write_rec+0x140/0x220 [logfs]
[<ffffffffa0089312>] __logfs_write_rec+0xf2/0x220 [logfs]
[<ffffffffa00894a4>] logfs_write_rec+0x64/0xd0 [logfs]
[<ffffffffa0089616>] __logfs_write_buf+0x106/0x110 [logfs]
[<ffffffffa008a19e>] logfs_write_buf+0x4e/0x80 [logfs]
[<ffffffffa008a6b8>] __logfs_write_inode+0x98/0x110 [logfs]
[<ffffffffa008a7c4>] logfs_truncate+0x54/0x290 [logfs]
[<ffffffffa008abfc>] logfs_evict_inode+0xdc/0x190 [logfs]
[<ffffffff8115eef5>] evict+0x85/0x170
[<ffffffff8115f126>] iput+0xe6/0x1b0
[<ffffffff8115b4a8>] shrink_dcache_for_umount_subtree+0x218/0x280
[<ffffffff8115ce91>] shrink_dcache_for_umount+0x51/0x90
[<ffffffff8114796c>] generic_shutdown_super+0x2c/0x100
[<ffffffffa008cc47>] logfs_kill_sb+0x57/0xf0 [logfs]
[<ffffffff81147de5>] deactivate_locked_super+0x45/0x70
[<ffffffff811487ea>] deactivate_super+0x4a/0x70
[<ffffffff81163934>] mntput_no_expire+0xa4/0xf0
[<ffffffff8116469f>] sys_umount+0x6f/0x380
[<ffffffff814dd46b>] system_call_fastpath+0x16/0x1b
Code: 55 c8 49 8d b6 a8 00 00 00 45 89 f9 45 89 e8 4c 89 e1 4c 89 55
b8 c7 04 24 00 00 00 00 e8 68 fc ff ff 4c 8b 55 b8 e9 3c ff ff ff <0f>
0b 0f 0b c7 45 c0 00 00 00 00 e9 44 fe ff ff 66 66 66 66 66
RIP  [<ffffffffa008c841>] logfs_segment_write+0x211/0x230 [logfs]
RSP <ffff880062d7b9e8>
---[ end trace fe6b040cea952290 ]---

Therefore, move the super->s_flags setting to after the fs-independent work
has finished.

Reviewed-by: Joern Engel <joern@logfs.org>
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>

logfs: take write mutex lock during fsync and sync · 13ced29c

Committed by Prasad Joshi on Jan 28, 2012

LogFS uses super->s_write_mutex while writing data to disk. Taking the
same mutex lock in sync and fsync code path solves the following BUG:

------------[ cut here ]------------
kernel BUG at /home/prasad/logfs/dev_bdev.c:134!

Pid: 2387, comm: flush-253:16 Not tainted 3.0.0+ #4 Bochs Bochs
RIP: 0010:[<ffffffffa007deed>]  [<ffffffffa007deed>]
                bdev_writeseg+0x25d/0x270 [logfs]
Call Trace:
[<ffffffffa007c381>] logfs_open_area+0x91/0x150 [logfs]
[<ffffffff8128dcb2>] ? find_level.clone.9+0x62/0x100
[<ffffffffa007c49c>] __logfs_segment_write.clone.20+0x5c/0x190 [logfs]
[<ffffffff810ef005>] ? mempool_kmalloc+0x15/0x20
[<ffffffff810ef383>] ? mempool_alloc+0x53/0x130
[<ffffffffa007c7a4>] logfs_segment_write+0x1d4/0x230 [logfs]
[<ffffffffa0078f8e>] logfs_write_i0+0x12e/0x190 [logfs]
[<ffffffffa0079300>] __logfs_write_rec+0x140/0x220 [logfs]
[<ffffffffa0079444>] logfs_write_rec+0x64/0xd0 [logfs]
[<ffffffffa00795b6>] __logfs_write_buf+0x106/0x110 [logfs]
[<ffffffffa007a13e>] logfs_write_buf+0x4e/0x80 [logfs]
[<ffffffffa0073e33>] __logfs_writepage+0x23/0x80 [logfs]
[<ffffffffa007410c>] logfs_writepage+0xdc/0x110 [logfs]
[<ffffffff810f5ba7>] __writepage+0x17/0x40
[<ffffffff810f6208>] write_cache_pages+0x208/0x4f0
[<ffffffff810f5b90>] ? set_page_dirty+0x70/0x70
[<ffffffff810f653a>] generic_writepages+0x4a/0x70
[<ffffffff810f75d1>] do_writepages+0x21/0x40
[<ffffffff8116b9d1>] writeback_single_inode+0x101/0x250
[<ffffffff8116bdbd>] writeback_sb_inodes+0xed/0x1c0
[<ffffffff8116c5fb>] writeback_inodes_wb+0x7b/0x1e0
[<ffffffff8116cc23>] wb_writeback+0x4c3/0x530
[<ffffffff814d984d>] ? sub_preempt_count+0x9d/0xd0
[<ffffffff8116cd6b>] wb_do_writeback+0xdb/0x290
[<ffffffff814d984d>] ? sub_preempt_count+0x9d/0xd0
[<ffffffff814d6208>] ? _raw_spin_unlock_irqrestore+0x18/0x40
[<ffffffff8105aa5a>] ? del_timer+0x8a/0x120
[<ffffffff8116cfac>] bdi_writeback_thread+0x8c/0x2e0
[<ffffffff8116cf20>] ? wb_do_writeback+0x290/0x290
[<ffffffff8106d2e6>] kthread+0x96/0xa0
[<ffffffff814de514>] kernel_thread_helper+0x4/0x10
[<ffffffff8106d250>] ? kthread_worker_fn+0x190/0x190
[<ffffffff814de510>] ? gs_change+0xb/0xb
RIP  [<ffffffffa007deed>] bdev_writeseg+0x25d/0x270 [logfs]
---[ end trace 0211ad60a57657c4 ]---

Reviewed-by: Joern Engel <joern@logfs.org>
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>

logfs: Prevent memory corruption · 934eed39

Committed by Joern Engel on Nov 20, 2011

This is a bad one.  I wonder whether we were so far protected by
no_free_segments(sb) usually being smaller than LOGFS_NO_AREAS.

Found by Dan Carpenter <dan.carpenter@oracle.com> using smatch.

Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>

logfs: update page reference count for pined pages · 96150606

Committed by Prasad Joshi on Nov 26, 2011

LogFS sets the PG_private flag to indicate a pinned page. We assumed that
marking a page as private is enough to ensure its existence. But
instead it is necessary to hold a reference count to the page.

The change resolves the following BUG:

BUG: Bad page state in process flush-253:16  pfn:6a6d0
page flags: 0x100000000000808(uptodate|private)

Suggested-and-Acked-by: Joern Engel <joern@logfs.org>
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>

27 Jan, 2012 4 commits

Btrfs: fix reservations in btrfs_page_mkwrite · 9998eb70

Committed by Chris Mason on Jan 25, 2012

Josef fixed btrfs_page_mkwrite to properly release reserved
extents if there was an error.  But if we fail to get a reservation
and we fail to dirty the inode (for ENOSPC reasons), we'll end up
trying to release a reservation we never had.

This makes sure we only release if we were able to reserve.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: advance window_start if we're using a bitmap · 9b230628

Committed by Josef Bacik on Jan 26, 2012

If we span a long area in a bitmap we could end up taking a lot of time
searching to the next free area if we're searching from the original
window_start, so advance window_start in order to make sure we don't do any
superficial searching.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

btrfs: mask out gfp flags in releasepage · 0c4e538b

Committed by David Sterba on Jan 26, 2012

btree_releasepage is a callback and can be passed unknown gfp flags, which
may then end up in kmem_cache_alloc called from alloc_extent_state. The slab
allocator will BUG_ON when the HIGHMEM or DMA32 flag is set.

This may happen when btrfs is mounted from a loop device, which masks out the
__GFP_IO flag. The check in try_release_extent_state

	if ((mask & GFP_NOFS) == GFP_NOFS)
		mask = GFP_NOFS;

will not work and passes unfiltered flags further, resulting in a crash at
mm/slab.c:2963

 [<000000000024ae4c>] cache_alloc_refill+0x3b4/0x5c8
 [<000000000024c810>] kmem_cache_alloc+0x204/0x294
 [<00000000001fd3c2>] mempool_alloc+0x52/0x170
 [<000003c000ced0b0>] alloc_extent_state+0x40/0xd4 [btrfs]
 [<000003c000cee5ae>] __clear_extent_bit+0x38a/0x4cc [btrfs]
 [<000003c000cee78c>] try_release_extent_state+0x9c/0xd4 [btrfs]
 [<000003c000cc4c66>] btree_releasepage+0x7e/0xd0 [btrfs]
 [<0000000000210d84>] shrink_page_list+0x6a0/0x724
 [<0000000000211394>] shrink_inactive_list+0x230/0x578
 [<0000000000211bb8>] shrink_list+0x6c/0x120
 [<0000000000211e4e>] shrink_zone+0x1e2/0x228
 [<0000000000211f24>] shrink_zones+0x90/0x254
 [<0000000000213410>] do_try_to_free_pages+0xac/0x420
 [<0000000000213ae0>] try_to_free_pages+0x13c/0x1b0
 [<0000000000204e6c>] __alloc_pages_nodemask+0x5b4/0x9a8
 [<00000000001fb04a>] grab_cache_page_write_begin+0x7e/0xe8

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
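The failure mode described in commit 0c4e538b can be reproduced with plain bit arithmetic. The sketch below uses made-up `X_` flag values (the real kernel gfp bits differ) and two hypothetical helpers. It compares the quoted conditional check with an unconditional mask of the bits the slab allocator rejects. Which bits the actual patch clears is an assumption here.

```c
#include <stdint.h>

/* Illustrative flag bits only; the real kernel values differ. */
#define X_GFP_WAIT      0x01u
#define X_GFP_IO        0x02u
#define X_GFP_HIGHMEM   0x10u
#define X_GFP_DMA32     0x20u
#define X_GFP_NOFS      (X_GFP_WAIT | X_GFP_IO)
#define X_SLAB_BUG_MASK (X_GFP_HIGHMEM | X_GFP_DMA32)

/* The quoted check: normalizes only when both NOFS bits are present. */
uint32_t normalize_if_nofs(uint32_t mask)
{
	if ((mask & X_GFP_NOFS) == X_GFP_NOFS)
		mask = X_GFP_NOFS;
	return mask;
}

/* Unconditional masking: the zone bits never reach the allocator. */
uint32_t mask_out_zone_bits(uint32_t mask)
{
	return mask & ~X_SLAB_BUG_MASK;
}
```

When the IO bit has been stripped upstream, `normalize_if_nofs()` leaves a HIGHMEM bit in place, whereas `mask_out_zone_bits()` removes it regardless of the other bits.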

Btrfs: fix enospc error caused by wrong checks of the chunk · 9e622d6b

Committed by Miao Xie on Jan 26, 2012

When we ran a sysbench test on inline files, enospc errors occurred easily even
though there was plenty of free disk space that could be allocated for new chunks.

Reproduce steps:
 # mkfs.btrfs -b $((2 * 1024 * 1024 * 1024)) <test partition>
 # mount <test partition> /mnt
 # ulimit -n 102400
 # cd /mnt
 # sysbench --num-threads=1 --test=fileio --file-num=81920 \
 > --file-total-size=80M --file-block-size=1K --file-io-mode=sync \
 > --file-test-mode=seqwr prepare
 # sysbench --num-threads=1 --test=fileio --file-num=81920 \
 > --file-total-size=80M --file-block-size=1K --file-io-mode=sync \
 > --file-test-mode=seqwr run
 <soon later, BUG_ON() was triggered by enospc error>

The cause of this bug:
We can now reserve more space than is free in the existing chunks, as long as
there is enough free disk space for new chunks. The space allocator should
therefore force the allocation of a new chunk when there is no free space in
the free space cache. But two wrong checks break this operation.

One is
	if (ret == -ENOSPC && num_bytes > min_alloc_size)
in btrfs_reserve_extent(). It is wrong: we should try to allocate a new chunk
even if we fail to allocate free space of the minimum allocable size.

The other is
	if (space_info->force_alloc)
		force = space_info->force_alloc;
in do_chunk_alloc(). It makes the allocator ignore CHUNK_ALLOC_FORCE if someone
has set ->force_alloc to CHUNK_ALLOC_LIMITED, which causes the enospc error.

Fix these two wrong checks. We fix the second one by changing the values of
CHUNK_ALLOC_LIMITED and CHUNK_ALLOC_FORCE so that CHUNK_ALLOC_FORCE is greater
than CHUNK_ALLOC_LIMITED, since CHUNK_ALLOC_FORCE has higher priority. If the
value passed in by the caller is greater than ->force_alloc, use the passed
value.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
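The second fix in commit 9e622d6b amounts to ordering the force levels and taking the stronger of two requests. The sketch below compares the old and new selection. The numeric values, the CHUNK_ALLOC_NO_FORCE name and both helper functions are assumptions chosen to match the ordering described in the message, not code from the patch.

```c
/* FORCE must compare greater than LIMITED; exact values are assumed. */
enum chunk_alloc_level {
	CHUNK_ALLOC_NO_FORCE = 0,
	CHUNK_ALLOC_LIMITED = 1,
	CHUNK_ALLOC_FORCE = 2
};

/* Old behavior per the quoted check: any pending value wins outright. */
int old_effective_force(int requested, int pending_force_alloc)
{
	if (pending_force_alloc)
		return pending_force_alloc;
	return requested;
}

/* New behavior: keep whichever of the two levels is stronger. */
int effective_force(int requested, int pending_force_alloc)
{
	if (requested < pending_force_alloc)
		return pending_force_alloc;
	return requested;
}
```

With a pending CHUNK_ALLOC_LIMITED and a caller asking for CHUNK_ALLOC_FORCE, the old selection yields LIMITED and the new one yields FORCE.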
