Commits · d14e1ca305fc27dbceabad64bf5158b35d8864c8 · openeuler / Kernel

28 Jun, 2019 1 commit

gfs2: eliminate tr_num_revoke_rm · e955537e

Authored by Bob Peterson on Mar 26, 2019

For its journal processing, gfs2 kept track of the number of buffers
added and removed on a per-transaction basis. These values are used
to calculate space needed in the journal. But while these calculations
make sense for the number of buffers, they make no sense for revokes.
Revokes are managed in their own list, linked from the superblock.
So it's entirely unnecessary to keep separate per-transaction counts
for revokes added and removed. A single count will do the same job.
Therefore, this patch combines the transaction revokes into a single
count.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

e955537e

06 Jun, 2019 1 commit

Revert "gfs2: Replace gl_revokes with a GLF flag" · 638803d4

Authored by Bob Peterson on Jun 06, 2019

Commit 73118ca8 introduced a glock reference counting bug in
gfs2_trans_remove_revoke.  Given that, replacing gl_revokes with a GLF flag is
no longer useful, so revert that commit.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

638803d4

05 Jun, 2019 1 commit

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 398 · 7336d0e6

Authored by Thomas Gleixner on May 31, 2019

Based on 1 normalized pattern(s):

  this copyrighted material is made available to anyone wishing to use
  modify copy or redistribute it subject to the terms and conditions
  of the gnu general public license version 2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 44 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531081038.653000175@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7336d0e6

08 May, 2019 4 commits

gfs2: fix race between gfs2_freeze_func and unmount · 8f918219

Authored by Abhi Das on Apr 30, 2019

As part of the freeze operation, gfs2_freeze_func() is left blocking
on a request to hold the sd_freeze_gl in SH. This glock is held in EX
by the gfs2_freeze() code.

A subsequent call to gfs2_unfreeze() releases the EXclusively held
sd_freeze_gl, which allows gfs2_freeze_func() to acquire it in SH and
resume its operation.

gfs2_unfreeze(), however, doesn't wait for gfs2_freeze_func() to complete.
If a umount is issued right after unfreeze, it could result in an
inconsistent filesystem because some journal data (statfs update) isn't
written out.

Refer to commit 24972557 for a more detailed explanation of how
freeze/unfreeze work.

This patch causes gfs2_unfreeze() to wait for gfs2_freeze_func() to
complete before returning to the user.
Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

8f918219

gfs2: Rename sd_log_le_{revoke,ordered} · a5b1d3fc

Authored by Andreas Gruenbacher on Apr 05, 2019

Rename sd_log_le_revoke to sd_log_revokes and sd_log_le_ordered to
sd_log_ordered: not sure what le stands for here, but it doesn't add
clarity, and if it stands for list entry, it's actually confusing as
those are both list heads but not list entries.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

a5b1d3fc

gfs2: Replace gl_revokes with a GLF flag · 73118ca8

Authored by Bob Peterson on Apr 05, 2019

The gl_revokes value determines how many outstanding revokes a glock has
on the superblock revokes list; this is used to avoid unnecessary log
flushes. However, gl_revokes is only ever tested for being zero, and it's
only decremented in revoke_lo_after_commit, which removes all revokes
from the list, so we know that the gl_revokes values of all the glocks on
the list will reach zero. Therefore, we can replace gl_revokes with a
bit flag. This saves an atomic counter in struct gfs2_glock.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

73118ca8

gfs2: clean_journal improperly set sd_log_flush_head · 7c70b896

Authored by Bob Peterson on Mar 25, 2019

This patch fixes regressions in 588bff95.
Due to that patch, function clean_journal was setting the value of
sd_log_flush_head, but that's only valid if it is replaying the node's
own journal. If it's replaying another node's journal, that's completely
wrong and will lead to multiple problems. This patch tries to clean up
the mess by passing the value of the logical journal block number into
gfs2_write_log_header so the function can treat non-owned journals
generically. For the local journal, the journal extent map is used for
best performance. For other nodes from other journals, new function
gfs2_lblk_to_dblk is called to figure it out using gfs2_iomap_get.

This patch also tries to establish more consistency when passing journal
block parameters by changing several unsigned int types to a consistent
u32.

Fixes: 588bff95 ("GFS2: Reduce code redundancy writing log headers")
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

7c70b896

23 Jan, 2019 1 commit

gfs: no need to check return value of debugfs_create functions · 2abbf9a4

Authored by Greg Kroah-Hartman on Jan 22, 2019

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

There is no need to save the dentries for the debugfs files, so drop
those variables to save a bit of space and make the code simpler.

Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: cluster-devel@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

2abbf9a4

12 Dec, 2018 2 commits

gfs2: Dump nrpages for inodes and their glocks · 27a2660f

Authored by Bob Peterson on Apr 18, 2018

This patch is based on an idea from Steve Whitehouse. The idea is
to dump the number of pages for inodes in the glock dumps.
The additional locking required me to drop const from quite a few
places.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

27a2660f

gfs2: Remove vestigial bd_ops · cbbe76c8

Authored by Bob Peterson on Nov 16, 2018

Field bd_ops was set but never used, so I removed it, and all
code supporting it.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

cbbe76c8

12 Oct, 2018 2 commits

gfs2: Rename bitmap.bi_{len => bytes} · 281b4952

Authored by Andreas Gruenbacher on Sep 26, 2018

This field indicates the size of the bitmap in bytes, similar to how the
bi_blocks field indicates the size of the bitmap in blocks.

In count_unlinked, replace an instance of bi_bytes * GFS2_NBBY by
bi_blocks.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Steven Whitehouse <swhiteho@redhat.com>

281b4952

gfs2: Move rs_{sizehint, rgd_gh} fields into the inode · 21f09c43

Authored by Andreas Gruenbacher on Aug 30, 2018

Move the rs_sizehint and rs_rgd_gh fields from struct gfs2_blkreserv
into the inode: they are more closely related to the inode than to a
particular reservation.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Steven Whitehouse <swhiteho@redhat.com>

21f09c43

05 Oct, 2018 1 commit

gfs2: slow the deluge of io error messages · b524abcc

Authored by Bob Peterson on Oct 04, 2018

When gfs2 hits an io error, it calls gfs2_io_error_bh_i for every
journal buffer it can't write. Since we changed gfs2_io_error_bh_i
recently to withdraw later in the cycle, it sends a flood of
errors to the console. This patch checks for the file system already
being withdrawn, and if so, doesn't send more messages. It doesn't
stop the flood of messages, but it slows it down and keeps it more
reasonable.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

b524abcc

07 Aug, 2018 1 commit

gfs2: Fix gfs2_testbit to use clone bitmaps · dffe12a8

Authored by Bob Peterson on Aug 07, 2018

Function gfs2_testbit is called in three places. Two of those places,
gfs2_alloc_extent and gfs2_unaligned_extlen, should be using the clone
bitmaps, not the "real" bitmaps. Function gfs2_unaligned_extlen is used
by the block reservations scheme to determine the length of an extent of
free blocks. Before this patch, it wasn't using the clone bitmap, which
means recently-freed blocks were treated as free blocks for the purposes
of an allocation.

This patch adds a new parameter to gfs2_testbit to indicate whether or
not the clone bitmaps should be used (if available).
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>

dffe12a8

05 Jul, 2018 1 commit

gfs2: Eliminate redundant ip->i_rgd · b7eba890

Authored by Andreas Gruenbacher on Jun 21, 2018

GFS2 remembers the last rgrp used for allocations in ip->i_rgd.
However, block allocations are made by way of a reservations structure,
ip->i_res, which keeps the last rgrp in ip->i_res.rs_rgd, and ip->i_rgd
is kept in sync with ip->i_res.rs_rgd, so it's redundant.  Get rid of
ip->i_rgd and just use ip->i_res.rs_rgd in its place.

Based on patches by Robert Peterson.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

b7eba890

21 Jun, 2018 1 commit

gfs2: eliminate rs_inum and reduce the size of gfs2 inodes · f85c10e2

Authored by Bob Peterson on Jun 13, 2018

Before this patch, block reservations kept track of the inode
number. At one point, that was a valid thing to do. However, since
we made the reservation a part of the inode (rather than a pointer
to a separate allocated object) the reservation can determine the
inode number by using container_of. This saves us a little memory
in our inode.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>

f85c10e2
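
The container_of idea behind f85c10e2 above can be imitated in Python with ctypes: when a reservation is embedded in the inode rather than pointed to, the inode's address is the reservation's address minus the field offset. The two structures below are simplified stand-ins with only one field each, and the helper name inode_of is hypothetical; this is a sketch of the technique, not the kernel's definitions.

```python
import ctypes


class Reservation(ctypes.Structure):
    # Simplified stand-in for a block reservation; no inode number stored.
    _fields_ = [("rs_free", ctypes.c_uint32)]


class Inode(ctypes.Structure):
    # Simplified stand-in for an in-memory inode with an embedded reservation.
    _fields_ = [("i_no_addr", ctypes.c_uint64),
                ("i_res", Reservation)]


def inode_of(res):
    """Recover the enclosing Inode from its embedded Reservation,
    in the spirit of container_of(res, struct inode, i_res)."""
    base = ctypes.addressof(res) - Inode.i_res.offset
    return Inode.from_address(base)


inode = Inode(i_no_addr=12345)
recovered = inode_of(inode.i_res)  # views the same memory as `inode`
```

Because the reservation lives inside the inode, no back-pointer or copied inode number is needed, which is where the memory saving mentioned in the commit comes from.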

04 Jun, 2018 1 commit

GFS2: gfs2_free_extlen can return an extent that is too long · dc8fbb03

Authored by Bob Peterson on Jun 01, 2018

Function gfs2_free_extlen calculates the length of an extent of
free blocks that may be reserved. The end pointer was calculated as
end = start + bh->b_size but b_size is incorrect because the
bitmap usually stops prior to the end of the buffer data on
the last bitmap.

What this means is that when you do a write, you can reserve a
chunk of blocks that runs off the end of the last bitmap. For
example, I've got a file system where there is only one bitmap
for each rgrp, so ri_length==1. I saw cases in which iozone
tried to do a big write, grabbed a large block reservation,
chose rgrp 5464152, which has ri_data0 5464153 and ri_data 8188.
So 5464153 + 8188 = 5472341 which is the end of the rgrp.

When it grabbed a reservation it got back: 5470936, length 7229.
But 5470936 + 7229 = 5478165. So the reservation starts inside
the rgrp but runs 5824 blocks past the end of the bitmap.

This patch fixes the calculation so it won't exceed the last
bitmap. It also adds a BUG_ON to guard against overflows in the
future.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

dc8fbb03
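
The numbers quoted in dc8fbb03 above can be checked with a few lines of Python. The helper names (rgrp_end, overrun, clamp_extent) are made up for this sketch, and clamping the length is only one way to picture the intended result; the actual patch corrects how the end pointer of the last bitmap is computed.

```python
def rgrp_end(data0, data_blocks):
    """First block number past the end of the resource group's data."""
    return data0 + data_blocks


def overrun(start, length, data0, data_blocks):
    """How many blocks of [start, start + length) lie past the rgrp end."""
    return max(0, (start + length) - rgrp_end(data0, data_blocks))


def clamp_extent(start, length, data0, data_blocks):
    """Longest extent length from `start` that stays inside the rgrp."""
    return min(length, rgrp_end(data0, data_blocks) - start)


# Values from the commit message: ri_data0 5464153, ri_data 8188,
# reservation starting at 5470936 with length 7229.
end = rgrp_end(5464153, 8188)
extra = overrun(5470936, 7229, 5464153, 8188)
safe_len = clamp_extent(5470936, 7229, 5464153, 8188)
```

Under these inputs the reservation would have to shrink from 7229 blocks to 1405 blocks to stay inside the resource group.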

17 Apr, 2018 1 commit

gfs2: Remove sdp->sd_jheightsize · 9a38662b

Authored by Andreas Gruenbacher on Apr 16, 2018

GFS2 keeps two arrays in the superblock that define the maximum size of
an inode depending on the inode's height: sdp->sd_heightsize defines the
heights in units of sb->s_blocksize; sdp->sd_jheightsize defines them in
units of sb->s_blocksize - sizeof(struct gfs2_meta_header).  These
arrays are used to determine when additional layers of indirect blocks
are needed.  The second array is used for directories which have an
additional gfs2_meta_header at the beginning of each block.

Distinguishing between these two cases makes no sense: the height
required for representing N blocks will come out the same no matter if
the calculation is done in gross (sb->s_blocksize) or net
(sb->s_blocksize - sizeof(struct gfs2_meta_header)) units.

Stuffed directories don't have an additional gfs2_meta_header, but the
stuffed case is handled separately for both files and directories,
anyway.

Remove the unnecessary sdp->sd_jheightsize array.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

9a38662b

29 Mar, 2018 1 commit

gfs2: Zero out fallocated blocks in fallocate_chunk · fffb6412

Authored by Andreas Gruenbacher on Mar 29, 2018

Instead of zeroing out fallocated blocks in gfs2_iomap_alloc, zero them
out in fallocate_chunk, much higher up the call stack. This gets rid of
gfs2's abuse of the IOMAP_ZERO flag as well as the gfs2 specific zeronew
buffer flag. I can't think of a reason why zeroing out the blocks in
gfs2_iomap_alloc would have any benefits: there is no additional locking
at that level that would add protection to the newly allocated blocks.

While at it, change fallocate over from gfs2_block_map to gfs2_iomap_begin.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Christoph Hellwig <hch@lst.de>

fffb6412

22 Jan, 2018 1 commit

gfs2: Get rid of gfs2_log_header_in · 0ff5916a

Authored by Andreas Gruenbacher on Jan 16, 2018

Get rid of gfs2_log_header_in by integrating it into get_log_header.
Clean up the crc32 computations and use the same functions for encoding
and decoding to make things less confusing.  Eliminate lh_hash from
gfs2_log_header_host which is completely useless.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

0ff5916a

19 Jan, 2018 1 commit

gfs2: Add gfs2_max_stuffed_size · 235628c5

Authored by Andreas Gruenbacher on Nov 14, 2017

Add a small inline function for computing the maximum size of a stuffed
inode instead of open coding that in several places throughout the code.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

235628c5

25 Aug, 2017 2 commits

gfs2: Silence gcc format-truncation warning · 561b7969

Authored by Andreas Gruenbacher on Aug 22, 2017

Enlarge sd_fsname to be big enough for the longest lock table name
and an arbitrary journal number.  This silences two -Wformat-truncation
warnings with gcc 7.1.1.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

561b7969

GFS2: Withdraw for IO errors writing to the journal or statfs · 942b0cdd

Authored by Bob Peterson on Aug 16, 2017

Before this patch, if GFS2 encountered IO errors while writing to
the journal, it would not report the problem, so they would go
unnoticed, sometimes for many hours. Sometimes this would only be
noticed later, when recovery tried to do journal replay and failed
due to invalid metadata at the blocks that resulted in IO errors.

This patch makes GFS2's log daemon check for IO errors. If it
encounters one, it withdraws from the file system and reports
why in dmesg. A similar action is taken when IO errors occur when
writing to the system statfs file.

These errors are also reported back to any callers of fsync, since
that requires the journal to be flushed. Therefore, any IO errors
that would previously go unnoticed are now noticed and the file
system is withdrawn as early as possible, thus preventing further
file system damage.

Also note that this reintroduces superblock variable sd_log_error,
which Christoph removed with commit f729b66f.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

942b0cdd

10 Aug, 2017 1 commit

gfs2: forcibly flush ail to relieve memory pressure · b066a4ee

Authored by Abhi Das on Aug 04, 2017

On systems with low memory, it is possible for gfs2 to infinitely
loop in balance_dirty_pages() under heavy IO (creating sparse files).

balance_dirty_pages() attempts to write out the dirty pages via
gfs2_writepages() but none are found because these dirty pages are
being used by the journaling code in the ail. Normally, the journal
has an upper threshold which when hit triggers an automatic flush
of the ail. But this threshold can be higher than the number of
allowable dirty pages and result in the ail never being flushed.

This patch forces an ail flush when gfs2_writepages() fails to write
anything. This is a good indication that the ail might be holding
some dirty pages.
Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

b066a4ee

08 Jul, 2017 1 commit

gfs2: Fix glock rhashtable rcu bug · 961ae1d8

Authored by Andreas Gruenbacher on Jul 07, 2017

Before commit 88ffbf3e "GFS2: Use resizable hash table for glocks",
glocks were freed via call_rcu to allow reading the glock hashtable
locklessly using rcu.  This was then changed to free glocks immediately,
which made reading the glock hashtable unsafe.  Bring back the original
code for freeing glocks via call_rcu.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Cc: stable@vger.kernel.org # 4.3+

961ae1d8

05 Jul, 2017 2 commits

gfs2: Protect gl->gl_object by spin lock · 6f6597ba

Authored by Andreas Gruenbacher on Jun 30, 2017

Put all remaining accesses to gl->gl_object under the
gl->gl_lockref.lock spinlock to prevent races.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

6f6597ba

gfs2: Get rid of flush_delayed_work in gfs2_evict_inode · 4fd1a579

Authored by Andreas Gruenbacher on Jun 30, 2017

So far, gfs2_evict_inode clears gl->gl_object and then flushes the glock
work queue to make sure that inode glops which dereference gl->gl_object
have finished running before the inode is destroyed.  However, flushing
the work queue may do more work than needed, and in particular, it may
call into DLM, which we want to avoid here.  Use a bit lock
(GIF_GLOP_PENDING) to synchronize between the inode glops and
gfs2_evict_inode instead to get rid of the flushing.

In addition, flush the work queues of existing glocks before reusing
them for new inodes to get those glocks into a known state: the glock
state engine currently doesn't handle glock re-appropriation correctly.
(We may be able to fix the glock state engine instead later.)

Based on a patch by Steven Whitehouse <swhiteho@redhat.com>.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

4fd1a579

20 Jun, 2017 1 commit

GFS2: Eliminate vestigial sd_log_flush_wrapped · 722f6f62

Authored by Bob Peterson on Jun 20, 2017

Superblock variable sd_log_flush_wrapped is set, but never referenced,
so this patch eliminates it.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

722f6f62

13 Jun, 2017 1 commit

GFS2: Remove gl_list from glock structure · df68f20f

Authored by Bob Peterson on Jun 05, 2017

The gl_list is no longer used nor needed in the glock structure,
so this patch eliminates it.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

df68f20f

09 Jun, 2017 1 commit

gfs2: remove the unused sd_log_error field · f729b66f

Authored by Christoph Hellwig on Jun 03, 2017

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

f729b66f

16 Mar, 2017 1 commit

gfs2: Don't pack struct lm_lockname · 972b044e

Authored by Andreas Gruenbacher on Mar 16, 2017

As per a suggestion by Linus, don't pack struct lm_lockname: we did that
because the struct is used as a rhashtable key, but packing tells the
compiler that the 64-bit fields in the struct may be unaligned, causing
it to generate worse code on some architectures. Instead, rearrange the
fields in the struct so that there is no padding between fields, and
exclude any tail padding from the hash key size.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

972b044e
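
The layout argument in 972b044e above (no padding between fields, tail padding left out of the hash key) can be explored with ctypes, which follows the platform's native alignment rules. The two structures below are reduced, hypothetical versions with only a 64-bit number and a 32-bit type; the real struct lm_lockname has more members, and the helper names interior_padding and key_len are invented for this sketch.

```python
import ctypes


class HoleyName(ctypes.Structure):
    # 32-bit field first: the 64-bit field after it may need alignment
    # padding on some architectures, leaving an uninitialized hole.
    _fields_ = [("ln_type", ctypes.c_uint32),
                ("ln_number", ctypes.c_uint64)]


class LockName(ctypes.Structure):
    # Widest field first: nothing between the fields, only tail padding.
    _fields_ = [("ln_number", ctypes.c_uint64),
                ("ln_type", ctypes.c_uint32)]


def interior_padding(struct_type):
    """Total bytes of padding between consecutive fields."""
    pad = 0
    end = 0
    for name, ctype in struct_type._fields_:
        offset = getattr(struct_type, name).offset
        pad += offset - end
        end = offset + ctypes.sizeof(ctype)
    return pad


def key_len(struct_type):
    """Bytes from the start of the struct to the end of its last field,
    that is, the struct size without any tail padding."""
    name, ctype = struct_type._fields_[-1]
    return getattr(struct_type, name).offset + ctypes.sizeof(ctype)
```

Hashing only the first key_len(LockName) bytes covers every field while skipping whatever tail padding the compiler adds, so the struct does not need to be packed. The size of the hole in HoleyName depends on the architecture, which matches the commit's remark that the problem appears only on some of them.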

15 Mar, 2017 1 commit

gfs2: Avoid alignment hole in struct lm_lockname · 28ea06c4

Authored by Andreas Gruenbacher on Mar 06, 2017

Commit 88ffbf3e switches to using rhashtables for glocks, hashing over
the entire struct lm_lockname instead of its individual fields.  On some
architectures, struct lm_lockname contains a hole of uninitialized
memory due to alignment rules, which now leads to incorrect hash values.
Get rid of that hole.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
CC: <stable@vger.kernel.org> #v4.3+

28ea06c4

27 Jan, 2017 1 commit

GFS2: Switch tr_touched to flag in transaction · 9862ca05

Authored by Bob Peterson on Jan 25, 2017

This patch eliminates the int variable tr_touched in favor of a
new flag in the transaction. This is a step toward reducing contention
on the gfs2_log_lock spin_lock.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

9862ca05

06 Jan, 2017 1 commit

GFS2: Made logd daemon take into account log demand · f07b3520

Authored by Bob Peterson on Jan 05, 2017

Before this patch, the logd daemon only tried to flush things when
the log blocks pinned exceeded a certain threshold. But when we're
deleting very large files, it may require a huge number of journal
blocks, and that, in turn, may exceed the threshold. This patch
factors that into account.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

f07b3520

15 Mar, 2016 1 commit

GFS2: Prevent delete work from occurring on glocks used for create · a4923865

Authored by Bob Peterson on Dec 07, 2015

This patch tries to prevent delete work (queued via iopen callback)
from executing if the glock is currently being used to create
a new inode.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>

a4923865

15 Dec, 2015 3 commits

gfs2: change gfs2 readdir cookie · 471f3db2

Authored by Benjamin Marzinski on Dec 01, 2015

gfs2 currently returns 31 bits of filename hash as a cookie that readdir
uses for an offset into the directory.  When there are a large number of
directory entries, the likelihood of a collision goes up way too
quickly.  GFS2 will now return cookies that are guaranteed unique for a
while, and then fall back to using 30 bits of filename hash.
Specifically, the directory leaf blocks are divided up into chunks based
on the minimum size of a gfs2 directory entry (48 bytes). Each entry's
cookie is based off the chunk where it starts, in the linked list of
leaf blocks that it hashes to (there are 131072 hash buckets). Directory
entries will have unique names until they reach chunk 8192.
Assuming the largest filenames possible, and the least efficient spacing
possible, this new method will still be able to return unique names when
the previous method has statistically more than a 99% chance of a
collision.  The non-unique names it falls back to are guaranteed to not
collide with the unique names.

unique cookies will be in this format:
- 1 bit "0" to make sure the returned cookie is positive
- 17 bits for the hash table index
- 1 bit for the mode "0"
- 13 bits for the offset

non-unique cookies will be in this format:
- 1 bit "0" to make sure the returned cookie is positive
- 17 bits for the hash table index
- 1 bit for the mode "1"
- 13 more bits of the name hash

Another benefit of location based cookies, is that once a directory's
exhash table is fully extended (so that multiple hash table indexes do
not use the same leaf blocks), gfs2 can skip sorting the directory
entries until it reaches the non-unique ones, and then it only needs to
sort these. This provides a significant speed up for directory reads of
very large directories.

The only issue is that for these cookies to continue to point to the
correct entry as files are added and removed from the directory, gfs2
must keep the entries at the same offset in the leaf block when they are
split (see my previous patch). This means that until all the nodes in a
cluster are running with code that will split the directory leaf blocks
this way, none of the nodes can use the new cookie code. To deal with
this, gfs2 now has the mount option loccookie, which, if set, will make
it return these new location based cookies.  This option must not be set
until all nodes in the cluster are at least running this version of the
kernel code, and you have guaranteed that there are no outstanding
cookies required by other software, such as NFS.

gfs2 uses some of the extra space at the end of the gfs2_dirent
structure to store the calculated readdir cookies. This keeps us from
needing to allocate a separate array to hold these values.  gfs2
recomputes the cookie stored in de_cookie for every readdir call.  The
time it takes to do so is small, and if gfs2 expected this value to be
saved on disk, the new code wouldn't work correctly on filesystems
created with an earlier version of gfs2.

One issue with adding de_cookie to the union in the gfs2_dirent
structure is that it caused the union to align itself to a 4 byte
boundary, instead of its previous 2 byte boundary. This changed the
offset of de_rahead. To solve that, I pulled de_rahead out of the union,
since it does not need to be there.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

471f3db2
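
The two cookie formats listed in 471f3db2 above add up to 32 bits: one sign bit, 17 bits of hash table index, one mode bit, and 13 low bits. A small Python sketch of packing and unpacking such a value follows. It assumes the fields are laid out from most significant to least significant in the order the commit lists them; the function and constant names are illustrative and not taken from the kernel source.

```python
HASH_INDEX_BITS = 17  # 131072 hash buckets = 2**17
LOW_BITS = 13         # chunk offset (mode 0) or extra name-hash bits (mode 1)
MODE_UNIQUE = 0       # location-based cookie
MODE_HASHED = 1       # fallback cookie built from the name hash


def pack_cookie(hash_index, mode, low):
    """Build a non-negative 31-bit cookie; the top (sign) bit stays 0."""
    if not 0 <= hash_index < (1 << HASH_INDEX_BITS):
        raise ValueError("hash_index out of range")
    if mode not in (MODE_UNIQUE, MODE_HASHED):
        raise ValueError("mode must be 0 or 1")
    if not 0 <= low < (1 << LOW_BITS):
        raise ValueError("low field out of range")
    return (hash_index << (LOW_BITS + 1)) | (mode << LOW_BITS) | low


def unpack_cookie(cookie):
    """Split a cookie back into (hash_index, mode, low)."""
    low = cookie & ((1 << LOW_BITS) - 1)
    mode = (cookie >> LOW_BITS) & 1
    hash_index = cookie >> (LOW_BITS + 1)
    return hash_index, mode, low
```

Since the mode bit differs between the two formats, a unique cookie and a hashed cookie with the same hash table index can never be equal, which is consistent with the commit's statement that the fallback values do not collide with the unique ones. The 13-bit low field also matches the limit of 8192 chunks mentioned in the message.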

GFS2: Reduce size of incore inode · b58bf407

Authored by Bob Peterson on Jul 24, 2015

This patch makes no functional changes. Its goal is to reduce the
size of the gfs2 inode in memory by rearranging structures and
changing the size of some variables within the structure.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

b58bf407

GFS2: Make rgrp reservations part of the gfs2_inode structure · a097dc7e

Authored by Bob Peterson on Jul 16, 2015

Before this patch, multi-block reservation structures were allocated
from a special slab. This patch folds the structure into the gfs2_inode
structure. The disadvantage is that the gfs2_inode needs more memory,
even when a file is opened read-only. The advantages are: (a) we don't
need the special slab and the extra time it takes to allocate and
deallocate from it. (b) we no longer need to worry that the structure
exists for things like quota management. (c) This also allows us to
remove the calls to get_write_access and put_write_access since we
know the structure will exist.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

a097dc7e

24 Nov, 2015 1 commit

GFS2: Extract quota data from reservations structure (revert ) · b54e9a0b

Authored by Bob Peterson on Oct 26, 2015

This patch basically reverts the majority of patch 5407e242.
That patch eliminated the gfs2_qadata structure in favor of just
using the reservations structure. The problem with doing that is that
it increases the size of the reservations structure. That is not an
issue until it comes time to fold the reservations structure into the
inode in memory so we know it's always there. By separating out the
quota structure again, we aren't punishing the non-quota users by
making all the inodes bigger, requiring more slab space. This patch
creates a new slab area to allocate the quota stuff so it's managed
a little more sanely.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

b54e9a0b

17 Nov, 2015 1 commit

gfs2: Extended attribute readahead · c8d57703

Authored by Andreas Gruenbacher on Nov 11, 2015

When gfs2 allocates an inode and its extended attribute block next to
each other at inode create time, the inode's directory entry indicates
that in de_rahead.  In that case, we can readahead the extended
attribute block when we read in the inode.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

c8d57703

openeuler / Kernel synced successfully more than 1 year ago
