Commits · 60a0b8f93664621a07b93273fc8ebc29590c62f5 · openeuler / Kernel

21 May, 2009 1 commit

GFS2: Add a rgrp bitmap full flag · 60a0b8f9

Authored by Steven Whitehouse on May 21, 2009

During block allocation, it is useful to know if sections of disk
are full on a finer grained basis than a single resource group.
This can make a performance difference when resource groups have
larger numbers of bitmap blocks, since we no longer have to search
them all block by block in each individual bitmap.

The full flag is set on a per-bitmap basis when it has been
searched and found to have no free space. It is then skipped in
subsequent searches until the flag is reset. The resetting
occurs if we have to drop the glock on the resource group for any
reason, or if we deallocate some blocks within that resource
group and thus free up some space.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

60a0b8f9
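
The per-bitmap full flag described in the commit above can be illustrated with a small user-space sketch. This is not the GFS2 code: `struct bitmap`, `struct rgrp` and all function names are hypothetical, the bitmap uses one byte per block for readability, and clearing only the bitmap that holds the freed block is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a resource group and its bitmap blocks. */
struct bitmap {
    uint8_t *bits;    /* one byte per block in this sketch: 0 = free, 1 = in use */
    size_t   nblocks;
    bool     full;    /* set once a search has found no free block */
};

struct rgrp {
    struct bitmap *bi;
    size_t         nbitmaps;
};

/* Return the index of a free block in bm, or -1 if there is none. */
static long bitmap_find_free(const struct bitmap *bm)
{
    for (size_t i = 0; i < bm->nblocks; i++)
        if (bm->bits[i] == 0)
            return (long)i;
    return -1;
}

/* Allocate one block; on success report which bitmap and block were used. */
static bool rgrp_alloc(struct rgrp *rg, size_t *bmap_idx, size_t *blk_idx)
{
    for (size_t b = 0; b < rg->nbitmaps; b++) {
        struct bitmap *bm = &rg->bi[b];

        if (bm->full)
            continue;            /* known to be full: skip without scanning */

        long i = bitmap_find_free(bm);
        if (i < 0) {
            bm->full = true;     /* remember the result of this scan */
            continue;
        }
        bm->bits[i] = 1;
        *bmap_idx = b;
        *blk_idx = (size_t)i;
        return true;
    }
    return false;
}

/* Freeing a block creates free space again, so the flag must be cleared. */
static void rgrp_free(struct rgrp *rg, size_t bmap_idx, size_t blk_idx)
{
    rg->bi[bmap_idx].bits[blk_idx] = 0;
    rg->bi[bmap_idx].full = false;
}

/* Dropping the glock invalidates cached knowledge: reset every flag. */
static void rgrp_glock_dropped(struct rgrp *rg)
{
    for (size_t b = 0; b < rg->nbitmaps; b++)
        rg->bi[b].full = false;
}
```

The benefit grows with the number of bitmap blocks per resource group: once a bitmap has been marked full, later allocations pay one flag test for it instead of a block-by-block scan.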

20 May, 2009 1 commit

GFS2: Improve resource group error handling · 09010978

Authored by Steven Whitehouse on May 20, 2009

This patch improves the error handling in the case where we
discover that the summary information in the resource group
doesn't match the bitmap information while in the process of
allocating blocks. Originally this resulted in a kernel bug,
but this patch changes that so that we return -EIO and print
some messages explaining what went wrong, and how to fix it.

We also remember locally not to try and allocate from the
same rgrp again, so that a subsequent allocation in a
different rgrp should succeed.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

09010978
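
A rough user-space sketch of the behaviour described in the commit above: a mismatch between the summary and the bitmaps yields -EIO plus a message instead of a crash, and the resource group is remembered locally so that later allocations skip it. The structure, the field names and the wording of the message are all hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of a resource group. */
struct rgrp {
    uint32_t free_summary;    /* free-block count claimed by the summary */
    uint32_t free_in_bitmap;  /* free blocks actually present in the bitmaps */
    bool     error;           /* local marker: do not allocate from here again */
};

/* Try to take one block from a single resource group. */
static int rgrp_try_alloc(struct rgrp *rg)
{
    if (rg->free_summary == 0)
        return -ENOSPC;

    if (rg->free_in_bitmap == 0) {
        /* Summary and bitmaps disagree: report it instead of crashing. */
        fprintf(stderr,
                "rgrp: summary claims %u free blocks, bitmaps have none; "
                "a filesystem check is needed\n",
                (unsigned)rg->free_summary);
        rg->error = true;
        return -EIO;
    }

    rg->free_summary--;
    rg->free_in_bitmap--;
    return 0;
}

/* Allocate from the first usable resource group, skipping flagged ones. */
static int alloc_block(struct rgrp *rgs, size_t n, size_t *which)
{
    for (size_t i = 0; i < n; i++) {
        if (rgs[i].error)
            continue;
        int ret = rgrp_try_alloc(&rgs[i]);
        if (ret == -ENOSPC)
            continue;
        if (ret == 0)
            *which = i;
        return ret;   /* 0 on success, -EIO on an inconsistency */
    }
    return -ENOSPC;
}
```

In this model the first call that hits the inconsistent group fails with -EIO, and the next call succeeds from a different group because the bad one is now skipped.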

19 May, 2009 2 commits

GFS2: Don't warn when delete inode fails on ro filesystem · ef9e8b14

Authored by Steven Whitehouse on May 19, 2009

If the filesystem is read-only, then we expect that delete inode
will fail, so there is no need to warn about it.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

ef9e8b14

GFS2: Umount recovery race fix · fe64d517

Authored by Steven Whitehouse on May 19, 2009

This patch fixes a race condition where we can receive recovery
requests part way through processing a umount. This was causing
problems since the recovery thread had already gone away.

Looking in more detail at the recovery code, it was really trying
to implement a slight variation on a work queue, and that happens to
align nicely with the recently introduced slow-work subsystem. As a
result I've updated the code to use slow-work, rather than its own home
grown variety of work queue.

When using the wait_on_bit() function, I noticed that the wait function
that was supplied as an argument was appearing in the WCHAN field, so
I've updated the function names in order to produce more meaningful
output.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

fe64d517

13 May, 2009 3 commits

GFS2: Remove a couple of unused sysfs entries · 9582d411

Authored by Steven Whitehouse on May 13, 2009

These two tunables are pointless and would never need to be
changed anyway. There is also a race between them and umount
as the daemons which they refer to might have gone away. The
easiest way to fix the race is to remove the interface.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

9582d411

GFS2: Add commit= mount option · 48c2b613

Authored by Steven Whitehouse on May 13, 2009

It has always been possible to adjust the gfs2 log commit
interval, but only from the sysfs interface. This adds a
mount option, commit=<nn>, which will be familiar to ext3
users.

The sysfs interface continues to be available as well, although
this might be removed in the future.

Also this patch cleans up some duplicated structures in the GFS2
sysfs code.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

48c2b613
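
As an illustration of the `commit=<nn>` syntax, here is a small user-space parser for a single mount option token. It is a sketch only: the function name and the error conventions are hypothetical, and treating the value as a number of seconds is an assumption carried over from the ext3 option of the same name.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one mount option token. On "commit=<nn>" store the interval and
 * return 0. Return -ENOENT if the token is some other option and -EINVAL
 * if the value is missing or not a plain decimal number.
 */
static int parse_commit_opt(const char *opt, unsigned int *interval)
{
    static const char key[] = "commit=";
    const size_t klen = sizeof(key) - 1;
    char *end;
    unsigned long v;

    if (strncmp(opt, key, klen) != 0)
        return -ENOENT;
    opt += klen;

    /* Require a leading digit so that "", "-5" and "abc" are rejected. */
    if (*opt < '0' || *opt > '9')
        return -EINVAL;

    errno = 0;
    v = strtoul(opt, &end, 10);
    if (errno != 0 || *end != '\0' || v > UINT_MAX)
        return -EINVAL;

    *interval = (unsigned int)v;
    return 0;
}
```

With this sketch, a token such as `commit=30` yields an interval of 30, while `commit=30x` or a bare `commit=` is rejected.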

GFS2: Move journal live test at transaction start · a1c0643f

Authored by Steven Whitehouse on May 13, 2009

There seems little point grabbing the transaction glock
only to have to release it again if the journal isn't
live. This moves the test earlier to avoid grabbing the lock
when we don't need it in the first place.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

a1c0643f

12 May, 2009 1 commit

GFS2: Fix timestamps on write · 7537d81a

Authored by Abhijith Das on May 12, 2009

This patch copies the timestamps from the vfs inode into gfs2 and syncs
them to the disk inode during writes.
Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

7537d81a

11 May, 2009 3 commits

GFS2: Something nonlinear this way comes! · 48bf2b17

Authored by Steven Whitehouse on April 29, 2009

For some reason GFS2 has been missing support for non-linear
mappings. This patch fixes that, and also avoids taking any
locks for mmap in the O_NOATIME case. In fact we don't actually need
to take the lock here at all - just doing file_accessed() would be
enough, but we have to take the lock eventually and this helps
it hit disk (and thus be seen by other nodes) faster.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

48bf2b17

GFS2: Optimise writepage for metadata · 4a0f9a32

Authored by Steven Whitehouse on April 20, 2009

This adds a GFS2 specific writepage for metadata, rather than
continuing to use the VFS function. As a result we now tag all
our metadata I/O with the correct flag so that blktraces will
now be less confusing.

Also, the generic function was checking for a number of corner
cases which cannot happen on the metadata address spaces so that
this should be faster too.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

4a0f9a32

GFS2: Update the rw flags · c969f58c

Authored by Steven Whitehouse on April 07, 2009

After Jens' recent updates:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a1f242524c3c1f5d40f1c9c343427e34d1aadd6e
et al. this is a patch to bring gfs2 up to date with the core
code. Also I've managed to squash another call to ll_rw_block()
along the way.

There is still one part of the GFS2 I/O paths which is not correctly
annotated and that is due to the sharing of the writeback code between
the data and metadata address spaces. I would like to change that too,
but this patch is still worth doing on its own, I think.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

c969f58c

09 May, 2009 2 commits

Reduce path_lookup() abuses · e24977d4

Authored by Al Viro on April 02, 2009

... use kern_path() where possible

[folded a fix from rdd]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e24977d4

GFS2: Fix glock ref counting bug · 0c7a531a

Authored by Steven Whitehouse on April 30, 2009

Depending on the ordering of events as we go around the
glock shrinker loop, it is possible to drop the ref count
of a glock incorrectly. It doesn't happen very often. This
patch corrects the got_ref variable, fixing the problem.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

0c7a531a

23 April, 2009 2 commits

GFS2: Ensure that the inode goal block settings are updated · d9ba7615

Authored by Steven Whitehouse on April 23, 2009

GFS2 has a goal block associated with each inode indicating the
search start position for future block allocations (in fact there
are two, but that's for backward compatibility with GFS1 as they
are set to identical locations in GFS2).

In some circumstances, depending on the ordering of updates to
the inode, it was possible for the goal block settings to not
be updated on disk. This patch ensures that the goal block will
always get updated, thus reducing the potential for searching
the same (already allocated) blocks again when looking for free
space during block allocation.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

d9ba7615

GFS2: Fix bug in block allocation · d8bd504a

Authored by Steven Whitehouse on April 23, 2009

The new bitfit algorithm was counting from the wrong end of
64 bit words in the bitfield. This fixes it by using __ffs64
instead of fls64.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

d8bd504a
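
The difference between the two helpers named in the commit above is easy to show in user space. `__ffs64()` gives the 0-based index of the lowest set bit, while `fls64()` gives the 1-based position of the highest set bit. The sketch below uses GCC/Clang builtins as stand-ins for the kernel helpers and assumes, for illustration, a layout of two bits per block with entry 0 in the least significant bits. The function names are hypothetical and this is not the actual gfs2_bitfit() code.

```c
#include <stdint.h>

#define LOW_BITS 0x5555555555555555ULL  /* low bit of every 2-bit entry */

/* Stand-in for __ffs64(): 0-based index of the lowest set bit (w != 0). */
static unsigned lowest_set_bit(uint64_t w)
{
    return (unsigned)__builtin_ctzll(w);
}

/* Stand-in for fls64(): 1-based position of the highest set bit (w != 0). */
static unsigned highest_set_bit_1based(uint64_t w)
{
    return 64u - (unsigned)__builtin_clzll(w);
}

/*
 * Return the index (0..31) of the first 2-bit entry in w that equals
 * state, or -1 if no entry matches.
 */
static int first_entry_with_state(uint64_t w, unsigned state)
{
    /* XOR with the repeated pattern: matching entries become 00. */
    uint64_t x = w ^ (LOW_BITS * (state & 3u));
    /* Set the low bit of each entry whose two bits are both zero. */
    uint64_t m = ~(x | (x >> 1)) & LOW_BITS;

    if (m == 0)
        return -1;
    /* Lowest set bit = first matching entry; two bits per entry. */
    return (int)(lowest_set_bit(m) / 2);
}
```

If the highest set bit were used in the last step instead, the search would report the last matching entry in the word rather than the first, which is the kind of wrong-end counting the commit describes.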

20 April, 2009 2 commits

GFS2: Fix page_mkwrite() return code · e56985da

Authored by Steven Whitehouse on April 20, 2009

This allows for the possibility of returning VM_FAULT_OOM as
well as VM_FAULT_SIGBUS. This ensures that the correct action
is taken.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

e56985da
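
One plausible shape for such a return-code translation is sketched below. The enum values are hypothetical stand-ins for the kernel's VM_FAULT_* flags, and mapping -ENOMEM to the out-of-memory code while treating every other error as SIGBUS is an assumption of the sketch, not something the commit message spells out.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel's VM_FAULT_* flags. */
enum fault_code {
    FAULT_OK = 0,
    FAULT_OOM,
    FAULT_SIGBUS
};

/* Translate a negative errno from the write path into a fault code. */
static enum fault_code errno_to_fault(int err)
{
    if (err == 0)
        return FAULT_OK;
    if (err == -ENOMEM)
        return FAULT_OOM;     /* let the VM handle memory pressure */
    return FAULT_SIGBUS;      /* any other failure: signal the task */
}
```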

GFS2: Clear dirty bit at end of inode glock sync · 52fcd11c

Authored by Steven Whitehouse on April 20, 2009

The dirty bit can get set during the inode glock sync. It's too
complicated to change that at the moment, so this is the quick
fix - to clear the bit again at the end of the function.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

52fcd11c

15 April, 2009 6 commits

gfs2: Remove code handling bio_alloc failure with __GFP_WAIT · b1fffc9c

Authored by Nikanth Karthikesan on April 15, 2009

Remove code handling bio_alloc failure with __GFP_WAIT.
GFP_NOFS implies __GFP_WAIT.
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

b1fffc9c

GFS2: Use DEFINE_SPINLOCK · 1328df72

Authored by Xu Gang on April 14, 2009

SPIN_LOCK_UNLOCKED is deprecated, use DEFINE_SPINLOCK instead.
(as suggested in Documentation/spinlocks.txt)
Signed-off-by: Xu Gang <xug@cn.fujitsu.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

1328df72

GFS2: cleanup file_operations mess · 10d21988

Authored by Christoph Hellwig on April 07, 2009

Remove the weird pointer to file_operations mess and replace it with
straightforward defining of the locking instance names to the _nolock
variants.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

10d21988

GFS2: Move umount flush rwsem · a228df63

Authored by Steven Whitehouse on April 07, 2009

The rwsem, used only on umount, is in the wrong place in glock.c.
This patch moves it up a bit so that it does not get called under
a spinlock.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

a228df63

GFS2: Fix symlink creation race · 5cf32524

Authored by Steven Whitehouse on March 31, 2009

In certain cases symlinks can appear to have zero size if a lookup
on the inode occurs within a certain (very short) time after the
symlink has been created. The symlink is correctly created on disk
but appears to have zero size when stat()ed. This patch closes the
race and prevents incorrect sizes appearing.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

5cf32524

GFS2: Make quotad's waiting interruptible · 7fa5d20d

Authored by Steven Whitehouse on March 31, 2009

So we don't count its D state in the loadavg.
Reported-by: Nathan Straz <nstraz@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

7fa5d20d

01 April, 2009 2 commits

mm: page_mkwrite change prototype to match fault · c2ec175c

Authored by Nick Piggin on March 31, 2009

Change the page_mkwrite prototype to take a struct vm_fault, and return
VM_FAULT_xxx flags.  There should be no functional change.

This makes it possible to return much more detailed error information to
the VM (and also can provide more information eg.  virtual_address to the
driver, which might be important in some special cases).

This is required for a subsequent fix.  And will also make it easier to
merge page_mkwrite() with fault() in future.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c2ec175c

New helper - current_umask() · ce3b0f8d

Authored by Al Viro on March 29, 2009

current->fs->umask is what most of fs_struct users are doing.
Put that into a helper function.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ce3b0f8d

28 March, 2009 1 commit

constify dentry_operations: GFS2 · 92cecbbf

Authored by Al Viro on February 20, 2009

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

92cecbbf

24 March, 2009 14 commits

GFS2: Fix freeze issue · df3647b2

Authored by Steven Whitehouse on March 23, 2009

This removes some old code that was causing issues during
filesystem freeze.
Reported-by: Andrew Price <andy@andrewprice.me.uk>
Tested-by: Andrew Price <andy@andrewprice.me.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

df3647b2

Fix a minor bug in the previous patch · 9c538837

Authored by Steven Whitehouse on March 19, 2009

The logic requires that we mark the glock dirty in page_mkwrite,
otherwise we might not flush correctly in the case that no
allocation was required in the process of dirtying the page.
Also we need to set the shared write flag early for the same
reason.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

9c538837

GFS2: Clean up of glops.c · 6bac243f

Authored by Steven Whitehouse on March 09, 2009

This cleans up a number of bits of code mostly based in glops.c.
A couple of simple functions have been merged into the callers
to make it more obvious what is going on, the mysterious raising
of i_writecount around the truncate_inode_pages() call has been
removed. The meta_go_* operations have been renamed rgrp_go_*
since that is the only lock type that they are used with.

The unused argument of gfs2_read_sb has been removed. Also
a bug has been fixed where a check for the rindex inode was
in the wrong callback. More comments are added, and the
debugging code is improved too.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

6bac243f

GFS2: Fix locking bug in failed shared to exclusive conversion · 02ffad08

Authored by Benjamin Marzinski on March 06, 2009

After calling out to the dlm, GFS2 sets the new state of a glock to
gl_target in gdlm_ast(). However, gl_target is not always the lock
state that was requested. If a conversion from shared to exclusive
fails, finish_xmote() will call do_xmote() with LM_ST_UNLOCKED, instead
of gl->gl_target, so that it can reacquire the lock in exclusive the next
time around. In this case, setting the lock to gl_target in gdlm_ast()
will make GFS2 think that it has the glock in exclusive mode, when
really, it doesn't have the glock locked at all. This patch adds a new
field to the gfs2_glock structure, gl_req, to track the mode that was
requested.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

02ffad08

GFS2: Pagecache usage optimization on GFS2 · 229615de

Authored by Hisashi Hifumi on March 03, 2009

I introduced the "is_partially_uptodate" aop for GFS2.

A page can have multiple buffers, and when pagesize != blocksize some of
those buffers can be uptodate even if the page as a whole is not.
This aop checks whether all buffers that correspond to the part of the file
we want to read are uptodate. If so, we do not have to issue an actual
read I/O to the disk even if the page is not uptodate, because the portion
we want to read is uptodate.
The "block_is_partially_uptodate" function is already used by ext2/3/4.
With this patch, mixed random read/write workloads and random read after
random write workloads can be optimized, giving a performance improvement.

I did a performance test using sysbench.

#sysbench --num-threads=16 --max-requests=200000 --test=fileio --file-num=1
--file-block-size=8K --file-total-size=2G --file-test-mode=rndrw --file-fsync-freq=0
--file-rw-ratio=1 run

-2.6.29-rc6
Test execution summary:
    total time:                          202.6389s
    total number of events:              200000
    total time taken by event execution: 2580.0480
    per-request statistics:
         min:                            0.0000s
         avg:                            0.0129s
         max:                            49.5852s
         approx.  95 percentile:         0.0462s

-2.6.29-rc6-patched
Test execution summary:
    total time:                          177.8639s
    total number of events:              200000
    total time taken by event execution: 2419.0199
    per-request statistics:
         min:                            0.0000s
         avg:                            0.0121s
         max:                            52.4306s
         approx.  95 percentile:         0.0444s

arch: ia64
pagesize: 16k
blocksize: 4k
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

229615de
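
The check described in the commit above can be modelled in user space as follows. This is only a sketch of the idea, not the kernel's block_is_partially_uptodate(): the page is represented as an array of per-buffer uptodate flags, and the function name and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if every block-sized buffer overlapping the byte range
 * [from, from + count) within the page is up to date.
 */
static bool range_is_uptodate(const bool *buf_uptodate, size_t nbufs,
                              size_t blocksize, size_t from, size_t count)
{
    if (count == 0 || blocksize == 0)
        return false;

    size_t first = from / blocksize;                /* first buffer touched */
    size_t last = (from + count - 1) / blocksize;   /* last buffer touched */

    if (last >= nbufs)
        return false;                               /* range leaves the page */

    for (size_t i = first; i <= last; i++)
        if (!buf_uptodate[i])
            return false;
    return true;
}
```

With the 16k page and 4k block sizes quoted in the test setup, a page holds four buffers, so a read that touches only uptodate buffers can be served without I/O even when another buffer in the same page is stale.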

GFS2: fix sparse warning: Should it be static? · 02ab1721

Authored by Hannes Eder on February 21, 2009

Impact: Make symbol static.

Fix this sparse warning:
  fs/gfs2/rgrp.c:188:5: warning: symbol 'gfs2_bitfit' was not declared. Should it be static?
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

02ab1721

GFS2: fix sparse warnings: constant is so big it is ... · 075ac448

Authored by Hannes Eder on February 21, 2009

Fix these sparse warnings:
fs/gfs2/rgrp.c:156:23: warning: constant 0xffffffffffffffff is so big it is unsigned long long
fs/gfs2/rgrp.c:157:23: warning: constant 0xaaaaaaaaaaaaaaaa is so big it is unsigned long long
fs/gfs2/rgrp.c:158:23: warning: constant 0x5555555555555555 is so big it is long long
fs/gfs2/rgrp.c:194:20: warning: constant 0x5555555555555555 is so big it is long long
fs/gfs2/rgrp.c:204:44: warning: constant 0x5555555555555555 is so big it is long long
Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

075ac448

GFS2: Support quota/noquota mount arguments · b9a96945

Authored by Steven Whitehouse on February 19, 2009

This adds support for "quota" and "noquota" mount options in addition to the
existing "quota=on/off/account" so that we are compatible with the names by
which these options are more generally known.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

b9a96945
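
A minimal sketch of how the short option names could sit alongside the existing `quota=` forms. The enum, the function name and the error convention are hypothetical, and treating `quota` as equivalent to `quota=on` and `noquota` as equivalent to `quota=off` is an assumption based on the usual meaning of those names.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical representation of the three quota settings. */
enum quota_mode { QUOTA_OFF, QUOTA_ACCOUNT, QUOTA_ON };

/* Map one mount option token to a quota mode; -EINVAL if unrecognised. */
static int parse_quota_opt(const char *opt, enum quota_mode *mode)
{
    if (strcmp(opt, "quota") == 0 || strcmp(opt, "quota=on") == 0)
        *mode = QUOTA_ON;
    else if (strcmp(opt, "noquota") == 0 || strcmp(opt, "quota=off") == 0)
        *mode = QUOTA_OFF;
    else if (strcmp(opt, "quota=account") == 0)
        *mode = QUOTA_ACCOUNT;
    else
        return -EINVAL;
    return 0;
}
```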

GFS2: Fix alignment issue and tidy gfs2_bitfit · 223b2b88

Authored by Steven Whitehouse on February 17, 2009

An alignment issue with the existing bitfit algorithm was reported
on IA64. This patch attempts to fix that, and also to tidy up the
code a bit. There is now more documentation about how this works
and it has survived a number of different tests.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

223b2b88

GFS2: Add a "demote a glock" interface to sysfs · 64d576ba

Authored by Steven Whitehouse on February 12, 2009

This adds a sysfs file called demote_rq to GFS2's
per filesystem directory. It's possible to use this
file to demote arbitrary glocks in exactly the same
way as if a request had come in from a remote node.

This is intended for testing issues relating to caching
of data under glocks. Despite that, the interface is
generic enough to send requests to any type of glock,
but be careful as it's not always safe to send an
arbitrary message to an arbitrary glock. For that reason
and to prevent DoS, this interface is restricted to root
only.

The messages look like this:

<type>:<glocknumber> <mode>

Example:

echo -n "2:13324 EX" >/sys/fs/gfs2/unity:myfs/demote_rq

Which means "please demote inode glock (type 2) number 13324 so that
I can get an EX (exclusive) lock". The lock modes are those which
would normally be sent by a remote node in its callback so if you
want to unlock a glock, you use EX, to demote to shared, use SH or PR
(depending on whether you like GFS2 or DLM lock modes better!).

If the glock doesn't exist, you'll get -ENOENT returned. If the
arguments don't make sense, you'll get -EINVAL returned.

The plan is that this interface will be used in combination with
the blktrace patch which I recently posted for comments although
it is, of course, still useful in its own right.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

64d576ba
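
The `<type>:<glocknumber> <mode>` message format from the commit above can be parsed along these lines. This user-space sketch is not the kernel implementation: the struct, enum and function names are hypothetical, only the mode names mentioned in the commit message (EX, SH, PR) are accepted, and the glock lookup that would produce -ENOENT is not modelled.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical request modes: EX on its own, SH and PR as synonyms. */
enum req_mode { REQ_EX, REQ_SH };

struct demote_req {
    unsigned int       type;    /* glock type, e.g. 2 for an inode glock */
    unsigned long long number;  /* glock number */
    enum req_mode      mode;
};

/* Parse "<type>:<glocknumber> <mode>"; return 0 or -EINVAL. */
static int parse_demote_rq(const char *msg, struct demote_req *rq)
{
    char mode[8];
    char extra;
    /* The trailing %c only matches if unexpected text follows the mode. */
    int n = sscanf(msg, "%u:%llu %7s %c",
                   &rq->type, &rq->number, mode, &extra);

    if (n != 3)
        return -EINVAL;

    if (strcmp(mode, "EX") == 0)
        rq->mode = REQ_EX;
    else if (strcmp(mode, "SH") == 0 || strcmp(mode, "PR") == 0)
        rq->mode = REQ_SH;
    else
        return -EINVAL;
    return 0;
}
```

Parsing the example message `2:13324 EX` with this sketch yields type 2, number 13324 and the exclusive mode, while a message with a missing or unknown mode is rejected with -EINVAL.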

GFS2: Expose UUID via sysfs/uevent · 02e3cc70

Authored by Steven Whitehouse on February 10, 2009

Since we have a UUID, we ought to expose it to the user via sysfs
and uevents. We already have the fs name in both of these places
(a combination of the lock proto and lock table name) so if we add
the UUID as well, we have a full set.

For older filesystems (i.e. those created before mkfs.gfs2 was writing
UUIDs by default) the sysfs file will appear zero length, and no UUID
env var will be added to the uevents.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

02e3cc70

GFS2: Support generation of discard requests · f15ab561

Authored by Steven Whitehouse on February 09, 2009

This patch allows GFS2 to generate discard requests for blocks which are
no longer useful to the filesystem (i.e. those which have been freed as
the result of an unlink operation). The requests are generated at the
time at which those blocks become available for reuse in the filesystem.

In order to use this new feature, you have to specify the "discard"
mount option. The code coalesces adjacent blocks into a single extent
when generating the discard requests, thus generating the minimum
number.

If an error occurs when the request has been sent to the block device,
then it will print a message and turn off the requests for that
filesystem. If the problem is temporary, then you can use remount to
turn the option back on again. There is also a nodiscard mount option
so that you can use remount to turn discard requests off, if required.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

f15ab561
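
The coalescing step mentioned in the commit above, merging adjacent freed blocks into a single extent so that as few discard requests as possible are issued, can be sketched like this. The types and names are hypothetical, and the sketch assumes the block numbers arrive sorted in ascending order without duplicates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent: a run of len consecutive blocks from start. */
struct extent {
    uint64_t start;
    uint64_t len;
};

/*
 * Merge sorted, unique block numbers into extents. The out array must
 * have room for n entries (the worst case, with no adjacent blocks).
 * Returns the number of extents written.
 */
static size_t coalesce_blocks(const uint64_t *blocks, size_t n,
                              struct extent *out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (count > 0 &&
            out[count - 1].start + out[count - 1].len == blocks[i]) {
            out[count - 1].len++;          /* extends the current run */
        } else {
            out[count].start = blocks[i];  /* starts a new run */
            out[count].len = 1;
            count++;
        }
    }
    return count;
}
```

For freed blocks 10, 11, 12, 20, 21 and 30 the sketch produces three extents, so three discard requests would be issued instead of six.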

GFS2: Fix deadlock on journal flush · d8348de0

Authored by Steven Whitehouse on February 05, 2009

This patch fixes a deadlock when the journal is flushed and there
are dirty inodes other than the one which caused the journal flush.
Originally the journal flushing code was trying to obtain the
transaction glock while running the flush code for an inode glock.
We no longer require the transaction glock at this point in time
since we know that any attempt to get the transaction glock from
another node will result in a journal flush. So if we are flushing
the journal, we can be sure that the transaction lock is still
cached from when the transaction was started.

By inlining a version of gfs2_trans_begin() (minus the bit which
gets the transaction glock) we can avoid the deadlock problems
caused if there is a demote request queued up on the transaction
glock.

In addition I've also moved the umount rwsem so that it covers
the glock workqueue, since all demotions are done by this
workqueue now. That fixes a bug on umount which I came across
while fixing the original problem.
Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

d8348de0

GFS2: Fix error path ref counting for root inode · e7c8707e

Authored by Steven Whitehouse on January 20, 2009

We were keeping hold of an extra ref to the root inode in one
of the error paths, which resulted in a hang.
Reported-by: Nate Straz <nstraz@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Tested-by: Robert Peterson <rpeterso@redhat.com>

e7c8707e
