提交 · eccba068915feece2868c502787037e244db3376 · openeuler / raspberrypi-kernel

08 2月, 2008 2 次提交

gfs2: make gfs2_glock.gl_owner_pid be a struct pid * · eccba068

由 Pavel Emelyanov 提交于 2月 07, 2008

The gl_owner_pid field is used to get the lock owning task by its pid, so make
it in a proper manner, i.e.  by using the struct pid pointer and pid_task()
function.

The pid_task() becomes exported for the gfs2 module.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

eccba068

gfs2: make gfs2_holder.gh_owner_pid be a struct pid * · b1e058da

由 Pavel Emelyanov 提交于 2月 07, 2008

The gl_owner_pid field is used to get the holder task by its pid and check
whether the current is a holder, so make it in a proper manner, i.e.  via the
struct pid * manipulations.
Signed-off-by: NPavel Emelyanov <xemul@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b1e058da

25 1月, 2008 16 次提交

[GFS2] Remove unneeded i_spin · 598278bd

由 Bob Peterson 提交于 1月 11, 2008

This patch removes a vestigial variable "i_spin" from the gfs2_inode
structure. This not only saves us memory (>300000 of these in memory
for the oom test) it also saves us time because we don't have to
spend time initializing it (i.e. slightly better performance).
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

598278bd

[GFS2] Reduce inode size by moving i_alloc out of line · 6dbd8224

由 Steven Whitehouse 提交于 1月 10, 2008

It is possible to reduce the size of GFS2 inodes by taking the i_alloc
structure out of the gfs2_inode. This patch allocates the i_alloc
structure whenever its needed, and frees it afterward. This decreases
the amount of low memory we use at the expense of requiring a memory
allocation for each page or partial page that we write. A quick test
with postmark shows that the overhead is not measurable and I also note
that OCFS2 use the same approach.

In the future I'd like to solve the problem by shrinking down the size
of the members of the i_alloc structure, but for now, this reduces the
immediate problem of using too much low-memory on x86 and doesn't add
too much overhead.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

6dbd8224

[GFS2] Remove unused variable · 65a62909

由 Steven Whitehouse 提交于 1月 02, 2008

The go_drop_th function is never called or referenced.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

65a62909

[GFS2] Eliminate the no longer needed sd_statfs_mutex · c3f60b6e

由 Bob Peterson 提交于 12月 12, 2007

This patch eliminates the unneeded sd_statfs_mutex mutex but preserves
the ordering as discussed.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c3f60b6e

[GFS2] Journal extent mapping · da6dd40d

由 Bob Peterson 提交于 12月 11, 2007

This patch saves a little time when gfs2 writes to the journals by
keeping a mapping between logical and physical blocks on disk.
That's better than constantly looking up indirect pointers in
buffers, when the journals are several levels of indirection
(which they typically are).
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

da6dd40d

[GFS2] Don't periodically update the jindex · e35b9211

由 Steven Whitehouse 提交于 11月 09, 2007

We only care about the content of the jindex in two cases,
one is when we mount the fs and the other is when we need
to recover another journal. In both cases we have to update
the jindex anyway, so there is no point in updating it
periodically between times, so this removes it to simplify
gfs2_logd.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

e35b9211

[GFS2] Use atomic_t for journal free blocks counter · fd041f0b

由 Steven Whitehouse 提交于 11月 08, 2007

This patch changes the counter which keeps track of the free
blocks in the journal to an atomic_t in preparation for the
following patch which will update the log reservation code.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

fd041f0b

[GFS2] Don't add glocks to the journal · 2bcd610d

由 Steven Whitehouse 提交于 11月 08, 2007

The only reason for adding glocks to the journal was to keep track
of which locks required a log flush prior to release. We add a
flag to the glock to allow this check to be made in a simpler way.

This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64)
and means that we can avoid extra work during the journal flush.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

2bcd610d

[GFS2] Remove flags no longer required · e589665e

由 Steven Whitehouse 提交于 11月 02, 2007

The HIF_MUTEX and HIF_PROMOTE flags were set on the glock holders
depending upon which of the two waiters lists they were going to
be queued upon. They were then tested when the holders were taken
off the lists to ensure that the right type of holder was being
dequeued.

Since we are already using separate lists, there doesn't seem a
lot of point having these flags as well, and since setting them
and testing them is in the fast path for locking and unlocking
glock, this patch removes them.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

e589665e

[GFS2] Remove "reclaim limit" · c2932e03

由 Steven Whitehouse 提交于 11月 01, 2007

This call to reclaim glocks is not needed, and in particular we don't want it
in the fast path for locking glocks. The limit was entirely arbitrary anyway
and we can't expect users to adjust things like this, the remaining code will
do the right thing on its own.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c2932e03

[GFS2] Remove unused variables · 60b0d087

由 Steven Whitehouse 提交于 10月 31, 2007

These haven't been used for some time, remove them.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

60b0d087

[GFS2] Add gfs2_is_writeback() · bf36a713

由 Steven Whitehouse 提交于 10月 17, 2007

This adds a function "gfs2_is_writeback()" along the lines of the
existing "gfs2_is_jdata()" in order to clean up the code and make
the various tests for the inode mode more obvious. It also fixes
the PageChecked() logic where we were resetting the flag too early
in the case of an error path.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

bf36a713

S
[GFS2] Remove unused field in struct gfs2_inode · e7e36f14
由 Steven Whitehouse 提交于 10月 16, 2007
```
Removes a field that is not used.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
```
e7e36f14

[GFS2] Remove useless i_cache from inodes · f91a0d3e

由 Steven Whitehouse 提交于 10月 15, 2007

The i_cache was designed to keep references to the indirect blocks
used during block mapping so that they didn't have to be looked
up continually. The idea failed because there are too many places
where the i_cache needs to be freed, and this has in the past been
the cause of many bugs.

In addition there was no performance benefit being gained since the
disk blocks in question were cached anyway. So this patch removes
it in order to simplify the code to prepare for other changes which
would otherwise have had to add further support for this feature.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

f91a0d3e

[GFS2] Use ->page_mkwrite() for mmap() · 3cc3f710

由 Steven Whitehouse 提交于 10月 15, 2007

This cleans up the mmap() code path for GFS2 by implementing the
page_mkwrite function for GFS2. We are thus able to use the
generic filemap_fault function for our ->fault() implementation.

This now means that shared writable mappings will be much more
efficiently shared across the cluster if there is a reasonable
proportion of read activity (the greater proportion, the better).

As a side effect, it also reduces the size of the code, removes
special cases from readpage and readpages, and makes the code
path easier to follow.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

3cc3f710

[GFS2] Handle multiple glock demote requests · cc7e79b1

由 Wendy Cheng 提交于 10月 05, 2007

Fix a race condition where multiple glock demote requests are sent to
a node back-to-back. This patch does a check inside handle_callback()
to see whether a demote request is in progress. If true, it sets a flag
to make sure run_queue() will loop again to handle the new request,
instead of erronously setting gl_demote_state to a different state.
Signed-off-by: NS. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

cc7e79b1

10 10月, 2007 5 次提交

[GFS2] Clean up journaled data writing · 16615be1

由 Steven Whitehouse 提交于 9月 17, 2007

This patch cleans up the code for writing journaled data into the log.
It also removes the need to allocate a small "tag" structure for each
block written into the log. Instead we just keep count of the outstanding
I/O so that we can be sure that its all been written at the correct time.
Another result of this patch is that a number of ll_rw_block() calls
have become submit_bh() calls, closing some races at the same time.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

16615be1

[GFS2] Replace revoke structure with bufdata structure · 82e86087

由 Steven Whitehouse 提交于 9月 02, 2007

Both the revoke structure and the bufdata structure are quite similar.
They are basically small tags which are put on lists. In addition to
which the revoke structure is always allocated when there is a bufdata
structure which is (or can be) freed. As such it should be possible to
reduce the number of frees and allocations by using the same structure
for both purposes.

This patch is the first step along that path. It replaces existing uses
of the revoke structure with the bufdata structure.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

82e86087

[GFS2] Clean up ordered write code · d7b616e2

由 Steven Whitehouse 提交于 9月 02, 2007

The following patch removes the ordered write processing from
databuf_lo_before_commit() and moves it to log.c. This has the effect of
greatly simplyfying databuf_lo_before_commit() and well as potentially
making the ordered write code more efficient.

As a side effect of this, its now possible to remove ordered buffers
from the ordered buffer list at any time, so we now make use of this in
invalidatepage and releasepage to ensure timely release of these
buffers.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

d7b616e2

[GFS2] delay glock demote for a minimum hold time · c4f68a13

由 Benjamin Marzinski 提交于 8月 23, 2007

When a lot of IO, with some distributed mmap IO, is run on a GFS2 filesystem in
a cluster, it will deadlock. The reason is that do_no_page() will repeatedly
call gfs2_sharewrite_nopage(), because each node keeps giving up the glock
too early, and is forced to call unmap_mapping_range(). This bumps the
mapping->truncate_count sequence count, forcing do_no_page() to retry. This
patch institutes a minimum glock hold time a tenth a second. This insures
that even in heavy contention cases, the node has enough time to get some
useful work done before it gives up the glock.

A second issue is that when gfs2_glock_dq() is called from within a page fault
to demote a lock, and the associated page needs to be written out, it will
try to acqire a lock on it, but it has already been locked at a higher level.
This patch puts makes gfs2_glock_dq() use the work queue as well, to avoid this
issue. This is the same patch as Steve Whitehouse originally proposed to fix
this issue, execpt that gfs2_glock_dq() now grabs a reference to the glock
before it queues up the work on it.
Signed-off-by: NBenjamin E. Marzinski <bmarzins@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c4f68a13

[GFS2] Reduce number of gfs2_scand processes to one · 8fbbfd21

由 Steven Whitehouse 提交于 8月 01, 2007

We only need a single gfs2_scand process rather than the one
per filesystem which we had previously. As a result the parameter
determining the frequency of gfs2_scand runs becomes a module
parameter rather than a mount parameter as it was before.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

8fbbfd21

09 7月, 2007 5 次提交

[GFS2] assertion failure after writing to journaled file, umount · 2332c443

由 Robert Peterson 提交于 6月 18, 2007

This patch passes all my nasty tests that were causing the code to
fail under one circumstance or another. Here is a complete summary
of all changes from today's git tree, in order of appearance:

1. There are now separate variables for metadata buffer accounting.
2. Variable sd_log_num_hdrs is no longer needed, since the header
accounting is taken care of by the reserve/refund sequence.
3. Fixed a tiny grammatical problem in a comment.
4. Added a new function "calc_reserved" to calculate the reserved
log space. This isn't entirely necessary, but it has two benefits:
First, it simplifies the gfs2_log_refund function greatly.
Second, it allows for easier debugging because I could sprinkle the
code with calls to this function to make sure the accounting is
proper (by adding asserts and printks) at strategic point of the code.
5. In log_pull_tail there apparently was a kludge to fix up the
accounting based on a "pull" parameter. The buffer accounting is
now done properly, so the kludge was removed.
6. File sync operations were making a call to gfs2_log_flush that
writes another journal header. Since that header was unplanned
for (reserved) by the reserve/refund sequence, the free space had
to be decremented so that when log_pull_tail gets called, the free
space is be adjusted properly. (Did I hear you call that a kludge?
well, maybe, but a lot more justifiable than the one I removed).
7. In the gfs2_log_shutdown code, it optionally syncs the log by
specifying the PULL parameter to log_write_header. I'm not sure
this is necessary anymore. It just seems to me there could be
cases where shutdown is called while there are outstanding log
buffers.
8. In the (data)buf_lo_before_commit functions, I changed some offset
values from being calculated on the fly to being constants. That
simplified some code and we might as well let the compiler do the
calculation once rather than redoing those cycles at run time.
9. This version has my rewritten databuf_lo_add function.
This version is much more like its predecessor, buf_lo_add, which
makes it easier to understand. Again, this might not be necessary,
but it seems as if this one works as well as the previous one,
maybe even better, so I decided to leave it in.
10. In databuf_lo_before_commit, a previous data corruption problem
was caused by going off the end of the buffer. The proper solution
is to have the proper limit in place, rather than stopping earlier.
(Thus my previous attempt to fix it is wrong).
If you don't wrap the buffer, you're stopping too early and that
causes more log buffer accounting problems.
11. In lops.h there are two new (previously mentioned) constants for
figuring out the data offset for the journal buffers.
12. There are also two new functions, buf_limit and databuf_limit to
calculate how many entries will fit in the buffer.
13. In function gfs2_meta_wipe, it needs to distinguish between pinned
metadata buffers and journaled data buffers for proper journal buffer
accounting. It can't use the JDATA gfs2_inode flag because it's
sometimes passed the "real" inode and sometimes the "metadata
inode" and the inode flags will be random bits in a metadata
gfs2_inode. It needs to base its decision on which was passed in.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

2332c443

[GFS2] Recovery for lost unlinked inodes · c8cdf479

由 Steven Whitehouse 提交于 6月 08, 2007

Under certain circumstances its possible (though rather unlikely) that
inodes which were unlinked by one node while still open on another might
get "lost" in the sense that they don't get deallocated if the node
which held the inode open crashed before it was unlinked.

This patch adds the recovery code which allows automatic deallocation of
the inode if its found during block allocation (the sensible time to
look for such inodes since we are scanning the rgrp's bitmaps anyway at
this time, so it adds no overhead to do this).

Since the inode will have had its i_nlink set to zero, all we need to
trigger recovery is a lookup and an iput(), and the normal deallocation
code takes care of the rest.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

c8cdf479

[GFS2] Fix sign problem in quota/statfs and cleanup _host structures · bb8d8a6f

由 Steven Whitehouse 提交于 6月 01, 2007

This patch fixes some sign issues which were accidentally introduced
into the quota & statfs code during the endianess annotation process.
Also included is a general clean up which moves all of the _host
structures out of gfs2_ondisk.h (where they should not have been to
start with) and into the places where they are actually used (often only
one place). Also those _host structures which are not required any more
are removed entirely (which is the eventual plan for all of them).

The conversion routines from ondisk.c are also moved into the places
where they are actually used, which for almost every one, was just one
single place, so all those are now static functions. This also cleans up
the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__.

The net result is a reduction of about 100 lines of code, many functions
now marked static plus the bug fixes as mentioned above. For good
measure I ran the code through sparse after making these changes to
check that there are no warnings generated.

This fixes Red Hat bz #239686
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

bb8d8a6f

[GFS2] Quotas non-functional - fix bug · 2a87ab08

由 Abhijith Das 提交于 5月 16, 2007

This patch fixes an error in the quota code where a 'struct
gfs2_quota_lvb*' was being passed to gfs2_adjust_quota() instead of a
'struct gfs2_quota_data*'. Also moved 'struct gfs2_quota_lvb' from
fs/gfs2/incore.h to include/linux/gfs2_ondisk.h as per Steve's suggestion.
Signed-off-by: NAbhijith Das <adas@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

2a87ab08

[GFS2] Clean up inode number handling · dbb7cae2

由 Steven Whitehouse 提交于 5月 15, 2007

This patch cleans up the inode number handling code. The main difference
is that instead of looking up the inodes using a struct gfs2_inum_host
we now use just the no_addr member of this structure. The tests relating
to no_formal_ino can then be done by the calling code. This has
advantages in that we want to do different things in different code
paths if the no_formal_ino doesn't match. In the NFS patch we want to
return -ESTALE, but in the ->lookup() path, its a bug in the fs if the
no_formal_ino doesn't match and thus we can withdraw in this case.

In order to later fix bz #201012, we need to be able to look up an inode
without knowing no_formal_ino, as the only information that is known to
us is the on-disk location of the inode in question.

This patch will also help us to fix bz #236099 at a later date by
cleaning up a lot of the code in that area.

There are no user visible changes as a result of this patch and there
are no changes to the on-disk format either.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

dbb7cae2

01 5月, 2007 4 次提交

[GFS2] lockdump improvements · 5f882096

由 Robert Peterson 提交于 4月 18, 2007

The patch below consists of the following changes (in code order):

1. I fixed a minor compiler warning regarding the printing of
   a kernel symbol address.
2. I implemented a suggestion from Dave Teigland that moves
   the debugfs information for gfs2 into a subdirectory so
   we can easily expand our use of debugfs in the future.
   The current code keeps the glock information in:
   /debug/gfs2/<fs>
   With the patch, the new code keeps the glock information in:
   /debug/gfs2/<fs>/glock
   That will allow us to create more debugfs files in the future.
3. This fixes a bug whereby a failed mount attempt causes the
   debugfs file to not be deleted.  Failed mount attempts should
   always clean up after themselves, including deleting the
   debugfs file and/or directory.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

5f882096

[GFS2] Red Hat bz 228540: owner references · 04b933f2

由 Robert Peterson 提交于 3月 23, 2007

In Testing the previously posted and accepted patch for
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228540
I uncovered some gfs2 badness.  It turns out that the current
gfs2 code saves off a process pointer when glocks is taken
in both the glock and glock holder structures.  Those
structures will persist in memory long after the process has
ended; pointers to poisoned memory.

This problem isn't caused by the 228540 fix; the new capability
introduced by the fix just uncovered the problem.

I wrote this patch that avoids saving process pointers
and instead saves off the process pid.  Rather than
referencing the bad pointers, it now does process lookups.
There is special code that makes the output nicer for
printing holder information for processes that have ended.

This patch also adds a stub for the new "sprint_symbol"
function that exists in Andrew Morton's -mm patch set, but
won't go into the base kernel until 2.6.22, since it adds
functionality but doesn't fix a bug.
Signed-off-by: NBob Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

04b933f2

[GFS2] Fix bz 224480 and cleanup glock demotion code · 3b8249f6

由 Steven Whitehouse 提交于 3月 16, 2007

This patch prevents the printing of a warning message in cases where
the fs is functioning normally by handing off responsibility for
unlinked, but still open inodes, to another node for eventual deallocation.
Also, there is now an improved system for ensuring that such requests
to other nodes do not get lost. The callback on the iopen lock is
only ever called when i_nlink == 0 and when a node is unable to deallocate
it due to it still being in use on another node. When a node receives
the callback therefore, it knows that i_nlink must be zero, so we mark
it as such (in gfs2_drop_inode) in order that it will then attempt
deallocation of the inode itself.

As an additional benefit, queuing a demote request no longer requires
a memory allocation. This simplifies the code for dealing with gfs2_holders
as it removes one special case.

There are two new fields in struct gfs2_glock. gl_demote_state is the
state which the remote node has requested and gl_demote_time is the
time when the request came in. Both fields are only valid when the
GLF_DEMOTE flag is set in gl_flags.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

3b8249f6

[GFS2] Add gfs2_tool lockdump support to gfs2 (bz 228540) · 7c52b166

由 Robert Peterson 提交于 3月 16, 2007

The attached patch resolves bz 228540.  This adds the capability
for gfs2 to dump gfs2 locks through the debugfs file system.
This used to exist in gfs1 as "gfs_tool lockdump" but it's missing from
gfs2 because all the ioctls were stripped out.  Please see the bugzilla
for more history about the fix.  This patch is also attached to the bugzilla
record.

The patch is against Steve Whitehouse's latest nmw git tree kernel
(2.6.21-rc1) and has been tested on system trin-10.
Signed-off-by: NRobert Peterson <rpeterso@redhat.com>
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

7c52b166

08 3月, 2007 2 次提交

[GFS2] go_drop_bh is never used, so remove it · 631c42e1

由 Steven Whitehouse 提交于 3月 01, 2007

The ->go_drop_bh function is never used, so this removes it and the single
caller,
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

631c42e1

S
[GFS2] Remove unused variable · 04b159b1
由 Steven Whitehouse 提交于 3月 01, 2007
```
Remove an unused variable.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
```
04b159b1

06 2月, 2007 6 次提交

[GFS2] Tidy up glops calls · b5d32bea

由 Steven Whitehouse 提交于 1月 22, 2007

This patch doesn't make any changes to the ordering of the various
operations related to glocking, but it does tidy up the calls to the
glops.c functions to make the structure more obvious.

The two functions: gfs2_glock_xmote_th() and gfs2_glock_drop_th() can be
made static within glock.c since they are called by every set of glock
operations. The xmote_th and drop_th glock operations are then made
conditional upon those two routines existing and called from the
previously mentioned functions in glock.c respectively.

Also it can be seen that the go_sync operation isn't needed since it can
easily be replaced by calls to xmote_bh and drop_bh respectively. This
results in no longer (confusingly) calling back into routines in glock.c
from glops.c and also reducing the glock operations by one member.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

b5d32bea

[GFS2] Remove unused go_callback operation · 6bd9c8c2

由 Steven Whitehouse 提交于 1月 19, 2007

This is never used, so we might as well remove it.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

6bd9c8c2

[GFS2] Remove the "greedy" function from glock.[ch] · e5dab552

由 Steven Whitehouse 提交于 1月 18, 2007

The "greedy" code was an attempt to retain glocks for a minimum length
of time when they relate to mmap()ed files. The current implementation
of this feature is not, however, ideal in that it required allocating
memory in order to do this and its overly complicated.

It also misses the mark by ignoring the other I/O operations which are
just as likely to suffer from the same problem. So the plan is to remove
this now and then add the functionality back as part of the glock state
machine at a later date (and thus take into account all the possible
users of this feature)
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

e5dab552

[GFS2] Shrink gfs2_inode memory by half · fee852e3

由 Steven Whitehouse 提交于 1月 17, 2007

Here is something I spotted (while looking for something entirely
different) the other day.

Rather than using a completion in each and every struct gfs2_holder,
this removes it in favour of hashed wait queues, thus saving a
considerable amount of memory both on the stack (where a number of
gfs2_holder structures are allocated) and in particular in the
gfs2_inode which has 8 gfs2_holder structures embedded within it.

As a result on x86_64 the gfs2_inode shrinks from 2488 bytes to
1912 bytes, a saving of 576 bytes per inode (no thats not a typo!).
In actual practice we get a much better result than that since
now that a gfs2_inode is under the 2048 byte barrier, we get two
per 4k slab page effectively halving the amount of memory required
to store gfs2_inodes.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

fee852e3

[GFS2] Remove max_atomic_write tunable · 330005c2

由 Steven Whitehouse 提交于 1月 15, 2007

This removes an unused sysfs tunable parameter.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

330005c2

[GFS2] Clean up/speed up readdir · 3699e3a4

由 Steven Whitehouse 提交于 1月 17, 2007

This removes the extra filldir callback which gfs2 was using to
enclose an attempt at readahead for inodes during readdir. The
code was too complicated and also hurts performance badly in the
case that the getdents64/readdir call isn't being followed by
stat() and it wasn't even getting it right all the time when it
was.

As a result, on my test box an "ls" of a directory containing 250000
files fell from about 7mins (freshly mounted, so nothing cached) to
between about 15 to 25 seconds. When the directory content was cached,
the time taken fell from about 3mins to about 4 or 5 seconds.

Interestingly in the cached case, running "ls -l" once reduced the time
taken for subsequent runs of "ls" to about 6 secs even without this
patch. Now it turns out that there was a special case of glocks being
used for prefetching the metadata, but because of the timeouts for these
locks (set to 10 secs) the metadata was being timed out before it was
being used and this the prefetch code was constantly trying to prefetch
the same data over and over.

Calling "ls -l" meant that the inodes were brought into memory and once
the inodes are cached, the glocks are not disposed of until the inodes
are pushed out of the cache, thus extending the lifetime of the glocks,
and thus bringing down the time for subsequent runs of "ls"
considerably.
Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>

3699e3a4