1. 15 October 2020 (2 commits)
    • gfs2: Wipe jdata and ail1 in gfs2_journal_wipe, formerly gfs2_meta_wipe · 68942870
      Committed by Bob Peterson
      Before this patch, when blocks were freed, gfs2 called gfs2_meta_wipe to
      take the metadata out of the pending journal blocks. It did this mostly
      by calling another function, gfs2_remove_from_journal. This was
      shortsighted because it did nothing with jdata blocks, which may also
      be in the journal.
      
      This patch expands the function so that it wipes out jdata blocks from
      the journal as well, and removes them from the ail1 list if they haven't
      been written back yet. Since the function now processes jdata blocks too,
      it has been renamed from gfs2_meta_wipe to gfs2_journal_wipe.
      
      New function gfs2_ail1_wipe needs a stable view of the ail list, so it
      holds sd_ail_lock while removing items. To make that possible, function
      gfs2_remove_from_journal no longer takes sd_ail_lock itself; it is now
      the caller's responsibility to do so.
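
      A minimal sketch of the resulting locking convention (illustrative only,
      not the actual gfs2 code; the helper names and bodies here are
      hypothetical variants):

          /* Before (sketch): the helper took the lock itself. */
          static void remove_from_journal(struct gfs2_sbd *sdp,
                                          struct buffer_head *bh)
          {
                  spin_lock(&sdp->sd_ail_lock);
                  /* ... unlink bh from the journal/ail lists ... */
                  spin_unlock(&sdp->sd_ail_lock);
          }

          /* After (sketch): the caller holds sd_ail_lock across the whole
           * ail1 scan, so the helper runs with the lock already held. */
          static void remove_from_journal_locked(struct gfs2_sbd *sdp,
                                                 struct buffer_head *bh)
          {
                  lockdep_assert_held(&sdp->sd_ail_lock);
                  /* ... unlink bh from the journal/ail lists ... */
          }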
      
      I was going to make sd_ail_lock locking conditional, but the practice is
      generally frowned upon. For details, see: https://lwn.net/Articles/109066/
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Fix NULL pointer dereference in gfs2_rgrp_dump · 0e539ca1
      Committed by Andrew Price
      When an rindex entry is found to be corrupt, compute_bitstructs() calls
      gfs2_consist_rgrpd() which calls gfs2_rgrp_dump() like this:
      
          gfs2_rgrp_dump(NULL, rgd->rd_gl, fs_id_buf);
      
      gfs2_rgrp_dump then dereferences the gl without checking it and we get
      
          BUG: KASAN: null-ptr-deref in gfs2_rgrp_dump+0x28/0x280
      
      because there's no rgrp glock involved while reading the rindex on mount.
      
      Fix this by changing gfs2_rgrp_dump to take an rgrp argument.
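
      A sketch of the interface change (hedged; the exact prototypes in the
      tree may differ slightly):

          /* Before: took a glock and looked up the rgrp through gl_object,
           * which is NULL while the rindex is being read at mount time. */
          void gfs2_rgrp_dump(struct seq_file *seq, struct gfs2_glock *gl,
                              const char *fs_id_buf);

          /* After: takes the rgrp directly, so gfs2_consist_rgrpd can pass
           * the rgd it already has and no rgrp glock needs to exist. */
          void gfs2_rgrp_dump(struct seq_file *seq, struct gfs2_rgrpd *rgd,
                              const char *fs_id_buf);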
      
      Reported-by: syzbot+43fa87986bdd31df9de6@syzkaller.appspotmail.com
      Signed-off-by: Andrew Price <anprice@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  2. 06 June 2020 (1 commit)
    • gfs2: Turn gl_delete into a delayed work · a0e3cc65
      Committed by Andreas Gruenbacher
      This requires flushing delayed work items in gfs2_make_fs_ro (which is called
      before unmounting a filesystem).
      
      When inodes are deleted and then recreated, pending gl_delete work items
      would have no effect because the inode generations will have changed, so
      we can cancel any pending gl_delete work before reusing iopen glocks.
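
      The underlying kernel pattern, as a hedged sketch (the structure and
      helper names below are placeholders, not the gfs2 code itself):

          #include <linux/kernel.h>
          #include <linux/workqueue.h>

          /* Placeholder: the real gfs2_glock embeds the delayed work item. */
          struct example_glock {
                  struct delayed_work gl_delete;
          };

          static void delete_work_func(struct work_struct *work)
          {
                  struct example_glock *gl =
                          container_of(to_delayed_work(work),
                                       struct example_glock, gl_delete);

                  pr_info("delayed delete fired for glock %p\n", gl);
                  /* ... the real handler would evict the inode here ... */
          }

          static void example_usage(struct example_glock *gl)
          {
                  INIT_DELAYED_WORK(&gl->gl_delete, delete_work_func);

                  /* queue with a delay instead of running immediately */
                  queue_delayed_work(system_wq, &gl->gl_delete, 5 * HZ);

                  /* before reusing an iopen glock for a recreated inode,
                   * cancel work queued for the old inode generation */
                  cancel_delayed_work(&gl->gl_delete);

                  /* on remount read-only / unmount, wait for pending work */
                  flush_delayed_work(&gl->gl_delete);
          }
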
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  3. 28 March 2020 (4 commits)
    • gfs2: don't lock sd_log_flush_lock in try_rgrp_unlink · e04d339b
      Committed by Bob Peterson
      In function try_rgrp_unlink, we added a temporary lock of the
      sd_log_flush_lock while searching the bitmaps. This protected us from
      problems in which dinodes being freed were still in a state of flux
      because the rgrp was in an active transaction. It was a kludge.
      Now that we've straightened out the code for inode eviction, deletes,
      and all the recovery mess, we no longer need this kludge.
      This patch removes it, and should improve performance.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: Split gfs2_rsqa_delete into gfs2_rs_delete and gfs2_qa_put · 1595548f
      Committed by Andreas Gruenbacher
      Keeping reservations and quotas separate makes the code easier to review.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: Change inode qa_data to allow multiple users · 2fba46a0
      Committed by Bob Peterson
      Before this patch, multiple users called gfs2_qa_alloc, which allocated
      a qadata structure for the inode if quotas were turned on. Later, on
      file close or evict, the structure was deleted with gfs2_qa_delete.
      But several competing processes may need access to the structure, and
      there were races between file close (release) and these other users.
      Thus, a release could delete the structure out from under a process
      that relied upon its existence, for example chown.
      
      This patch changes the management of the qadata structures to a
      get/put scheme. Function gfs2_qa_alloc has been changed to gfs2_qa_get,
      and if the structure needs to be allocated, its count starts out at 1.
      Function gfs2_qa_delete has been renamed to gfs2_qa_put, and the
      last user to decrement the count to zero frees the memory.
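
      A generic sketch of the get/put scheme described above (illustrative
      only; the real qadata code and its serialization against the inode are
      more involved, and the example_* names are hypothetical):

          #include <linux/refcount.h>
          #include <linux/slab.h>

          struct example_qadata {
                  refcount_t qa_ref;
                  /* ... per-inode quota bookkeeping would live here ... */
          };

          /* "get": allocate on first use, otherwise take another reference.
           * Callers are assumed to serialize access to *slot. */
          static struct example_qadata *example_qa_get(struct example_qadata **slot)
          {
                  if (*slot) {
                          refcount_inc(&(*slot)->qa_ref);
                          return *slot;
                  }
                  *slot = kzalloc(sizeof(**slot), GFP_NOFS);
                  if (*slot)
                          refcount_set(&(*slot)->qa_ref, 1);  /* count starts at 1 */
                  return *slot;
          }

          /* "put": the last user to drop the count to zero frees the memory. */
          static void example_qa_put(struct example_qadata **slot)
          {
                  if (*slot && refcount_dec_and_test(&(*slot)->qa_ref)) {
                          kfree(*slot);
                          *slot = NULL;
                  }
          }
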
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: eliminate gfs2_rsqa_alloc in favor of gfs2_qa_alloc · d580712a
      Committed by Bob Peterson
      Before this patch, multiple callers called gfs2_rsqa_alloc to force
      the existence of a reservations structure and a quota data structure
      if needed. Now that reservations are handled separately, the wrapper
      only deals with quota data, so we eliminate it in favor of calling
      gfs2_qa_alloc directly.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
  4. 10 February 2020 (2 commits)
    • gfs2: Rework how rgrp buffer_heads are managed · b3422cac
      Committed by Bob Peterson
      Before this patch, the rgrp code had a serious problem related to
      how it managed buffer_heads for resource groups. The problem caused
      file system corruption, especially in cases of journal replay.
      
      When an rgrp glock was demoted to transfer ownership to a
      different cluster node, do_xmote() first called rgrp_go_sync and then
      rgrp_go_inval, as expected. rgrp_go_sync called gfs2_rgrp_brelse(),
      which dropped the buffer_head reference count. In most cases, the
      reference count went to zero, which is right. However, there were
      other places where the buffers were handled differently.
      
      After rgrp_go_sync, do_xmote called rgrp_go_inval, which called
      gfs2_rgrp_brelse a second time; rgrp_go_inval's call to
      truncate_inode_pages_range would then get rid of the pages in memory,
      but only if the reference count had dropped to 0.
      
      Unfortunately, gfs2_rgrp_brelse was setting bi->bi_bh = NULL.
      So when rgrp_go_sync called gfs2_rgrp_brelse, it lost the pointer
      to the buffer_heads in cases where the reference count was still 1.
      Therefore, when rgrp_go_inval called gfs2_rgrp_brelse a second time,
      it failed the check for "if (bi->bi_bh)" and thus failed to call
      brelse a second time. Because of that, the reference count on those
      buffers sometimes failed to drop from 1 to 0. And that caused
      function truncate_inode_pages_range to keep the pages in page cache
      rather than freeing them.
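
      The problematic pattern, reduced to a sketch (not the literal gfs2
      code):

          /* First release path (sync): drops one reference but also loses
           * the pointer, even when the buffer still has another reference. */
          if (bi->bi_bh) {
                  brelse(bi->bi_bh);      /* refcount may go 2 -> 1, not 1 -> 0 */
                  bi->bi_bh = NULL;
          }

          /* Second release path (invalidate): the NULL check now fails, so
           * the remaining reference is never dropped and
           * truncate_inode_pages_range cannot free the pages. */
          if (bi->bi_bh) {
                  brelse(bi->bi_bh);
                  bi->bi_bh = NULL;
          }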
      
      The next time the rgrp glock was acquired, the metadata read of
      the rgrp buffers re-used the pages in memory, which were now
      wrong because they were likely modified by the other node that
      acquired the glock in EX (which is why we demoted the glock).
      This re-use of the page cache caused corruption because changes
      made by the other nodes were never seen, so the bitmaps were
      inaccurate.
      
      For some reason, the problem became most apparent when journal
      replay forced the replay of rgrps in memory, which caused newer
      rgrp data to be overwritten by the older in-core pages.
      
      A big part of the problem was that the rgrp buffers were released
      in multiple places: the go_unlock function would release them when
      the glock was released rather than when the glock was demoted,
      which is clearly wrong because our intent was to cache them until
      the glock is demoted from SH or EX.
      
      This patch attempts to clean up the mess and make one consistent
      and centralized mechanism for managing the rgrp buffer_heads by
      implementing several changes:
      
      1. It eliminates the call to gfs2_rgrp_brelse() from rgrp_go_sync.
         We don't want to release the buffers or zero the pointers when
         syncing for the reasons stated above. It only makes sense to
         release them when the glock is actually invalidated (go_inval).
         And when we do, then we set the bh pointers to NULL.
      2. The go_unlock function (which was only used for rgrps) is
         eliminated, as we've talked about doing many times before.
         The go_unlock function was called too early in the glock dq
         process, and should not happen until the glock is invalidated.
      3. It also eliminates the call to rgrp_brelse in gfs2_clear_rgrpd.
         That will now happen automatically when the rgrp glocks are
         demoted, and shouldn't happen any sooner or later than that.
         Instead, function gfs2_clear_rgrpd has been modified to demote
         the rgrp glocks, and therefore, free those pages, before the
         remaining glocks are culled by gfs2_gl_hash_clear. This
         prevents the gl_object from hanging around when the glocks are
         culled.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Report errors before withdraw · 8dc88ac6
      Committed by Andreas Gruenbacher
      In gfs2_rgrp_verify and compute_bitstructs, make sure to report errors before
      withdrawing the filesystem: otherwise, when we withdraw first and withdraw is
      configured to panic, we'll never get to the error reporting.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
  5. 21 January 2020 (1 commit)
  6. 03 September 2019 (1 commit)
  7. 28 June 2019 (2 commits)
  8. 05 June 2019 (1 commit)
  9. 08 May 2019 (2 commits)
    • gfs2: Rename gfs2_trans_{add_unrevoke => remove_revoke} · fbb27873
      Committed by Andreas Gruenbacher
      Rename gfs2_trans_add_unrevoke to gfs2_trans_remove_revoke: there is no
      such thing as an "unrevoke" object; all this function does is remove
      existing revoke objects plus some bookkeeping.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Fix loop in gfs2_rbm_find (v2) · 71921ef8
      Committed by Andreas Gruenbacher
      Fix the resource group wrap-around logic in gfs2_rbm_find that commit
      e579ed4f broke.  The bug can lead to unnecessary repeated scanning of the
      same bitmaps; there is a risk that future changes will turn this into an
      endless loop.
      
      This is an updated version of commit 2d29f6b9 ("gfs2: Fix loop in
      gfs2_rbm_find") which ended up being reverted because it introduced a
      performance regression in iozone (see commit e74c98ca).  Changes since v1:
      
       - Simplify the wrap-around logic.
      
       - Handle the case where each resource group only has a single bitmap block
         (small filesystem).
      
       - Update rd_extfail_pt whenever we scan the entire bitmap, even when we don't
         start the scan at the very beginning of the bitmap.
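
      The wrap-around structure described above, as a standalone sketch
      (a hypothetical helper, not the gfs2_rbm_find code itself):

          /* Scan every bitmap exactly once, starting at start_bi and
           * wrapping to bitmap 0 after the last one.  The do/while form
           * also covers resource groups with a single bitmap block. */
          static int scan_all_bitmaps(unsigned int nr_bitmaps,
                                      unsigned int start_bi,
                                      int (*scan_one)(unsigned int bi, void *priv),
                                      void *priv)
          {
                  unsigned int bi = start_bi;

                  do {
                          int ret = scan_one(bi, priv);

                          if (ret)        /* found something, or an error */
                                  return ret;
                          bi = (bi + 1) % nr_bitmaps;     /* wrap around */
                  } while (bi != start_bi);

                  return 0;       /* every bitmap scanned, nothing found */
          }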
      
      Fixes: e579ed4f ("GFS2: Introduce rbm field bii")
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  10. 01 February 2019 (1 commit)
  11. 12 December 2018 (2 commits)
  12. 09 November 2018 (1 commit)
    • gfs2: Put bitmap buffers in put_super · 10283ea5
      Committed by Andreas Gruenbacher
      gfs2_put_super calls gfs2_clear_rgrpd to destroy the gfs2_rgrpd objects
      attached to the resource group glocks.  That function should release the
      buffers attached to the gfs2_bitmap objects (bi_bh), but the call to
      gfs2_rgrp_brelse for doing that is missing.
      
      When gfs2_releasepage later runs across these buffers which are still
      referenced, it refuses to free them.  This causes the pages the buffers
      are attached to to remain referenced as well.  With enough mount/unmount
      cycles, the system will eventually run out of memory.
      
      Fix this by adding the missing call to gfs2_rgrp_brelse in
      gfs2_clear_rgrpd.
      
      (Also fix a gfs2_rgrp_relse -> gfs2_rgrp_brelse typo in a comment.)
      
      Fixes: 39b0f1e9 ("GFS2: Don't brelse rgrp buffer_heads every allocation")
      Cc: stable@vger.kernel.org # v4.2+
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  13. 12 October 2018 (9 commits)
  14. 06 October 2018 (1 commit)
  15. 29 August 2018 (2 commits)
    • gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated · 4f36cb36
      Committed by Bob Peterson
      The GFS2_RDF_UPTODATE flag in the rgrp is used to determine when
      an rgrp buffer is valid. It's cleared when the glock is invalidated,
      signifying that the buffer data is now invalid. But before this
      patch, function update_rgrp_lvb was setting the flag whenever it
      determined it had a valid lvb. That's an invalid assumption:
      just because you have a valid lvb doesn't mean you have valid
      buffers. After all, another node may have made the lvb valid,
      and this node just fetched it from the glock via dlm.
      
      Consider this scenario:
      1. The file system is mounted with RGRPLVB option.
      2. In gfs2_inplace_reserve it locks the rgrp glock EX, but thanks
         to GL_SKIP, it skips the gfs2_rgrp_bh_get.
      3. Since loops == 0 and the allocation target (ap->target) is
         bigger than the largest known chunk of blocks in the rgrp
         (rs->rs_rbm.rgd->rd_extfail_pt) it skips that rgrp and bypasses
         the call to gfs2_rgrp_bh_get there as well.
      4. update_rgrp_lvb sees the lvb MAGIC number is valid, so it bypasses
         gfs2_rgrp_bh_get, but it still sets GFS2_RDF_UPTODATE due
         to this invalid assumption.
      5. The next time update_rgrp_lvb is called, it sees the bit is set
         and just returns 0, assuming both the lvb and the rgrp are
         uptodate. But since this is a smaller allocation, or space has
         been freed by another node (adjusting the lvb values),
         it decides to use the rgrp for allocations, with an invalid rd_free
         because the rgrp was never updated.
      
      This patch changes update_rgrp_lvb so it doesn't set the UPTODATE
      flag anymore. That way, it has no choice but to fetch the latest
      values.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: improve debug information when lvb mismatches are found · 72244b6b
      Committed by Bob Peterson
      Before this patch, gfs2_rgrp_bh_get would check for lvb mismatches,
      but it wouldn't tell you what was actually wrong. This patch adds
      more information to help us debug it. It also makes rgrp consistency
      checks dump any bad rgrps, and the rgrp dump code dump any lvbs
      as well as the rgrp itself.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Acked-by: Steven Whitehouse <swhiteho@redhat.com>
  16. 08 August 2018 (1 commit)
  17. 07 August 2018 (1 commit)
    • gfs2: Fix gfs2_testbit to use clone bitmaps · dffe12a8
      Committed by Bob Peterson
      Function gfs2_testbit is called in three places. Two of those places,
      gfs2_alloc_extent and gfs2_unaligned_extlen, should be using the clone
      bitmaps, not the "real" bitmaps. Function gfs2_unaligned_extlen is used
      by the block reservations scheme to determine the length of an extent of
      free blocks. Before this patch, it wasn't using the clone bitmap, which
      means recently-freed blocks were treated as free blocks for the purposes
      of an allocation.
      
      This patch adds a new parameter to gfs2_testbit to indicate whether or
      not the clone bitmaps should be used (if available).
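
      A standalone model of the new parameter's effect (hypothetical helper;
      it assumes gfs2's two-bit-per-block bitmap layout):

          /* Return the 2-bit allocation state of 'block', preferring the
           * clone bitmap (where recently freed blocks still look allocated)
           * when asked to and when a clone exists. */
          static unsigned int test_block_state(const unsigned char *real_bitmap,
                                               const unsigned char *clone_bitmap,
                                               unsigned int block, int use_clone)
          {
                  const unsigned char *buf =
                          (use_clone && clone_bitmap) ? clone_bitmap : real_bitmap;

                  /* four 2-bit block states per byte */
                  return (buf[block / 4] >> ((block % 4) * 2)) & 0x3;
          }
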
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
  18. 27 July 2018 (1 commit)
  19. 25 July 2018 (2 commits)
    • GFS2: rgrp free blocks used incorrectly · f6753df3
      Committed by Bob Peterson
      Before this patch, several functions in rgrp.c checked the value of
      rgd->rd_free_clone. That does not take into account blocks that were
      reserved by a multi-block reservation. This causes a problem when
      space gets tight in the file system. For example, when function
      gfs2_inplace_reserve checks to see if a rgrp has enough blocks to
      satisfy the request, it can accept a rgrp that it should reject
      because, although there are enough blocks to satisfy the request
      _now_, those blocks may be reserved for another running process.
      
      A second problem with this occurs when we've reserved the remaining
      blocks in an rgrp: function rg_mblk_search() can reject an rgrp
      improperly because it calculates:
      
         u32 free_blocks = rgd->rd_free_clone - rgd->rd_reserved;
      
      But rd_reserved includes blocks that the current process just
      reserved in its own call to inplace_reserve. For example, it can
      reserve the last 128 blocks of an rgrp, then reject that same rgrp
      because the above calculation works out to free_blocks = 0.
      
      Consequences include, but are not limited to, (1) leaving holes,
      and thus increasing file system fragmentation, and (2) reporting that the
      file system is full long before it actually is.
      
      This patch introduces a new function, rgd_free, which returns the
      number of clone-free blocks (blocks that are truly free as opposed
      to blocks that are still being used because an unlinked file is
      still open) minus the number of blocks reserved by processes, but
      not counting the blocks we ourselves reserved (because obviously
      we need to allocate them).
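
      The intended calculation, as a standalone model (a hypothetical helper;
      the in-tree rgd_free may differ in detail):

          /* Blocks genuinely available to this process: clone-free blocks
           * minus blocks reserved by *other* processes.  Our own reservation
           * is excluded because we intend to allocate from it. */
          static unsigned int model_rgd_free(unsigned int free_clone,     /* rd_free_clone */
                                             unsigned int total_reserved, /* rd_reserved */
                                             unsigned int our_reserved)   /* our own rs */
          {
                  unsigned int other_reserved;

                  if (total_reserved < our_reserved)      /* defensive */
                          return 0;
                  other_reserved = total_reserved - our_reserved;

                  if (free_clone < other_reserved)
                          return 0;
                  return free_clone - other_reserved;
          }
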
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Don't reject a supposedly full bitmap if we have blocks reserved · e79e0e14
      Committed by Bob Peterson
      Before this patch, you could get into situations like this:
      
      1. Process 1 searches for X free blocks, finds them, makes a reservation
      2. Process 2 searches for free blocks in the same rgrp, but now the
         bitmap appears full because process 1's reservation is skipped over.
         So it marks the bitmap as GBF_FULL.
      3. Process 1 tries to allocate blocks from its own reservation, but
         since the GBF_FULL bit is set, it skips over the rgrp and searches
         elsewhere, thus not using its own reservation.
      
      This patch adds an additional check to allow processes to use their
      own reservations.
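
      The shape of the added check, sketched (hedged; not the literal
      condition in gfs2_rbm_find, and have_active_reservation is a
      placeholder for the real test):

          /* Only honor the GBF_FULL shortcut when we do not already hold an
           * active reservation in this rgrp; otherwise keep scanning so we
           * can allocate from our own reserved blocks. */
          if (test_bit(GBF_FULL, &bi->bi_flags) && !have_active_reservation)
                  goto next_bitmap;
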
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
  20. 05 July 2018 (2 commits)
  21. 21 June 2018 (1 commit)