Commits · 1b21f458ddbc8fb6fceeb68158e9e04b2571dabd · xiphi1978 / linux

10 Jul, 2007 20 commits

[GFS2] Accept old format NFS filehandles · 3ebf4490

Authored by Steven Whitehouse on Jul 10, 2007

On Tue, 2007-07-10 at 10:06 +0100, Christoph Hellwig wrote:
> > -#define GFS2_LARGE_FH_SIZE 10
> > -
> > -struct gfs2_fh_obj {
> > -   struct gfs2_inum_host this;
> > -   u32 imode;
> > -};
> > +#define GFS2_LARGE_FH_SIZE 8
>
> Because gfs2_decode_fh only accepts file handles with GFS2_LARGE_FH_SIZE
> or GFS2_LARGE_FH_SIZE you don't accept filehandles sent out by and older
> gfs version anymore.  Stale filehandles because of a new kernel version
> are a big no-no, so please add back code to handle the old filehandles
> on the decode side.
>

This should fix that problem, I think: since it only relates to the end of
the fh, we can just ignore that field in order to accept the older
format.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Wendy Cheng <wcheng@redhat.com>

3ebf4490
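
A minimal userspace sketch of the idea in this commit: the decoder accepts the old 10-word handle as well as the new 8-word one and simply ignores the trailing words. The values 8 and 10 come from the quoted diff; the 4-word small handle, the word layout, and all names below are assumptions for illustration, not the kernel's code.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    FH_SMALL_WORDS = 4,  /* assumed: inode identity only */
    FH_LARGE_WORDS = 8,  /* new GFS2_LARGE_FH_SIZE from the diff */
    FH_OLD_WORDS   = 10  /* old GFS2_LARGE_FH_SIZE from the diff */
};

/* Hypothetical inode identity: two 64-bit numbers, two words each. */
struct inum {
    uint64_t formal_ino;
    uint64_t addr;
};

static uint64_t join_words(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Returns true if the handle length is recognised. */
static bool decode_fh(const uint32_t *fh, int words, struct inum *self,
                      struct inum *parent, bool *has_parent)
{
    *has_parent = false;
    switch (words) {
    case FH_OLD_WORDS:   /* old format: trailing words are ignored */
    case FH_LARGE_WORDS:
        parent->formal_ino = join_words(fh[4], fh[5]);
        parent->addr = join_words(fh[6], fh[7]);
        *has_parent = true;
        /* fall through */
    case FH_SMALL_WORDS:
        self->formal_ino = join_words(fh[0], fh[1]);
        self->addr = join_words(fh[2], fh[3]);
        return true;
    default:
        return false;    /* unknown length: caller treats handle as stale */
    }
}
```

Because the extra field sat at the end of the old handle, the first eight words decode identically in both formats.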

pipe: add documentation and comments · 0845718d

Authored by Jens Axboe on Jun 12, 2007

As per Andrew Morton's request, here's a set of documentation for
the generic pipe_buf_operations hooks, the pipe, and pipe_buffer
structures.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

0845718d

pipe: change the ->pin() operation to ->confirm() · cac36bb0

Authored by Jens Axboe on Jun 14, 2007

The name 'pin' was badly chosen, it doesn't pin a pipe buffer
in the most commonly used sense in the kernel. So change the
name to 'confirm', after debating this issue with Hugh
Dickins a bit.

A good return from ->confirm() means that the buffer is really
there, and that the contents are good.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cac36bb0
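
To illustrate the contract described above, here is a small userspace sketch: a consumer calls a `confirm` hook first and only touches the contents if it returns 0. The struct layouts, the `uptodate` field, and the helper names are hypothetical stand-ins, not the kernel's `pipe_buf_operations`.

```c
#include <stddef.h>
#include <string.h>

struct pipe_buf;

/* Hypothetical, cut-down ops table: only the hook discussed above. */
struct pipe_buf_ops {
    /* Return 0 if the data is present and valid, else a negative error. */
    int (*confirm)(struct pipe_buf *buf);
};

struct pipe_buf {
    const struct pipe_buf_ops *ops;
    const char *data;
    size_t len;
    int uptodate; /* stand-in for "the read behind this buffer succeeded" */
};

/* For buffers whose contents are always valid. */
static int confirm_always(struct pipe_buf *buf)
{
    (void)buf;
    return 0;
}

/* For buffers that may still be waiting on, or have failed, I/O. */
static int confirm_uptodate(struct pipe_buf *buf)
{
    return buf->uptodate ? 0 : -5; /* -5 mirrors -EIO */
}

/* A consumer checks ->confirm() before it copies anything out. */
static long consume(struct pipe_buf *buf, char *out, size_t cap)
{
    int err = buf->ops->confirm(buf);
    size_t n;

    if (err)
        return err;
    n = buf->len < cap ? buf->len : cap;
    memcpy(out, buf->data, n);
    return (long)n;
}
```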

Remove remnants of sendfile() · d96e6e71

Authored by Jens Axboe on Jun 11, 2007

There are now zero users of .sendfile() in the kernel, so kill
it from the file_operations structure and in do_sendfile().
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d96e6e71

xip sendfile removal · d054fe3d

Authored by Carsten Otte on Jun 15, 2007

This patch removes xip_file_sendfile, the sendfile implementation for
xip, without replacement. Those customers that use xip on s390 are not
using sendfile() as far as we know, and so far s390 is the only platform
this could potentially be used on.
sendfile is not a popular feature for execute-in-place file
systems; however, we have a working implementation of splice_read() based
on fs/splice.c if anyone asks for it.
At this point in time, it does not seem preferable to merge
splice_read() for xip because it causes extra maintenance effort due to
code duplication and it requires struct page behind the xip memory
segment. We'd like to get rid of that in favor of supporting flash based
embedded platforms (Monta Vista work) soon.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d054fe3d

splice: completely document external interface with kerneldoc · 932cc6d4

Authored by Jens Axboe on Jun 21, 2007

Also add fs/splice.c as a kerneldoc target with a smaller blurb that
should be expanded to better explain the overview of splice.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

932cc6d4

sendfile: remove bad_sendfile() from bad_file_ops · d6f51756

Authored by Jens Axboe on Jun 04, 2007

do_sendfile() prefers splice over sendfile, so it should not trigger
(directly, at least).
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d6f51756

pipe: allow passing around of ops private pointer · 497f9625

Authored by Jens Axboe on Jun 11, 2007

relay needs this for proper consumption handling, and the network
receive support needs it as well to look up the sk_buff on pipe
release.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

497f9625

splice: divorce the splice structure/function definitions from the pipe header · d6b29d7c

Authored by Jens Axboe on Jun 04, 2007

We need to move even more stuff into the header so that folks can use
the splice_to_pipe() implementation instead of open-coding a lot of
pipe knowledge (see relay implementation), so move to our own header
file finally.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d6b29d7c

sendfile: convert nfsd to splice_direct_to_actor() · cf8208d0

Authored by Jens Axboe on Jun 12, 2007

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cf8208d0

sendfile: convert nfs to using splice_read() · f0930fff

Authored by Jens Axboe on Jun 01, 2007

Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

f0930fff

sendfile: remove .sendfile from filesystems that use generic_file_sendfile() · 5ffc4ef4

Authored by Jens Axboe on Jun 01, 2007

They can use generic_file_splice_read() instead. Since sys_sendfile() now
prefers that, there should be no change in behaviour.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5ffc4ef4

sys_sendfile: switch to using ->splice_read, if available · 534f2aaa

Authored by Jens Axboe on Jun 01, 2007

This patch makes sendfile prefer to use ->splice_read(), if it's
available in the file_operations structure.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

534f2aaa
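
The dispatch rule in this commit can be sketched in userspace as a simple "prefer one hook, fall back to the other" check. The `file_ops` struct, the callback signatures, and the stub functions are hypothetical and much simpler than the kernel's `file_operations`.

```c
#include <stddef.h>

/* Hypothetical, cut-down operations table with both hooks. */
struct file_ops {
    long (*splice_read)(const char *src, size_t len);
    long (*sendfile)(const char *src, size_t len);
};

/* Records which path handled the last request, for demonstration only. */
static const char *last_path = "none";

static long via_splice(const char *src, size_t len)
{
    (void)src;
    last_path = "splice_read";
    return (long)len;
}

static long via_sendfile(const char *src, size_t len)
{
    (void)src;
    last_path = "sendfile";
    return (long)len;
}

/* Prefer ->splice_read() when present, otherwise fall back to ->sendfile(). */
static long do_send(const struct file_ops *ops, const char *src, size_t len)
{
    if (ops->splice_read)
        return ops->splice_read(src, len);
    if (ops->sendfile)
        return ops->sendfile(src, len);
    return -22; /* mirrors -EINVAL: no usable hook */
}
```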

vmsplice: add vmsplice-to-user support · 6a14b90b

Authored by Jens Axboe on Jun 14, 2007

A bit of a cheat, it actually just copies the data to userspace. But
this makes the interface nice and symmetric and enables people to build
on splice, with room for future improvement in performance.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

6a14b90b

splice: abstract out actor data · c66ab6fa

Authored by Jens Axboe on Jun 12, 2007

For direct splicing (or private splicing), the output may not be a file.
So abstract out the handling into a specified actor function and put
the data in the splice_desc structure earlier, so we can build on top
of that.

This is the first step in better splice handling for drivers, and also
for implementing vmsplice _to_ user memory.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

c66ab6fa
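
A rough userspace sketch of the abstraction described above: the descriptor carries an opaque destination pointer and an actor callback, so the feeding loop does not need to know whether the output is a file, user memory, or something else. All names here (`splice_desc_sketch`, `mem_sink`, `feed`) are hypothetical and do not match the kernel's `splice_desc`.

```c
#include <stddef.h>
#include <string.h>

struct splice_desc_sketch;

typedef long (*actor_fn)(struct splice_desc_sketch *sd,
                         const char *chunk, size_t len);

struct splice_desc_sketch {
    size_t total_len; /* bytes still wanted */
    void *data;       /* opaque destination, interpreted only by the actor */
    actor_fn actor;
};

/* One possible destination: a plain memory buffer. */
struct mem_sink {
    char *buf;
    size_t used;
    size_t cap;
};

static long mem_actor(struct splice_desc_sketch *sd,
                      const char *chunk, size_t len)
{
    struct mem_sink *sink = sd->data;
    size_t room = sink->cap - sink->used;

    if (len > room)
        len = room;
    memcpy(sink->buf + sink->used, chunk, len);
    sink->used += len;
    return (long)len;
}

/* Generic loop: hands each chunk to whatever actor the descriptor names. */
static long feed(struct splice_desc_sketch *sd,
                 const char *const *chunks, size_t nchunks)
{
    long done = 0;
    size_t i;

    for (i = 0; i < nchunks && sd->total_len > 0; i++) {
        size_t len = strlen(chunks[i]);
        long ret;

        if (len > sd->total_len)
            len = sd->total_len;
        ret = sd->actor(sd, chunks[i], len);
        if (ret <= 0)
            return done ? done : ret;
        done += ret;
        sd->total_len -= (size_t)ret;
    }
    return done;
}
```

Swapping in a different actor and `data` pointer changes the destination without touching `feed`.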

unexport bio_{,un}map_user · 72d3a38e

Authored by Adrian Bunk on Jul 09, 2007

bio_{,un}map_user no longer have any modular users.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

72d3a38e

sched: scheduler debugging, core · 43ae34cb

Authored by Ingo Molnar on Jul 09, 2007

scheduler debugging core: implement /proc/sched_debug and
/proc/<PID>/sched files for scheduler debugging.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

43ae34cb

sched: update delay-accounting to use CFS's precise stats · 172ba844

Authored by Balbir Singh on Jul 09, 2007

update delay-accounting to use CFS's precise stats.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

172ba844

sched: make use of precise accounting for /proc task stats · b27f03d4

Authored by Ingo Molnar on Jul 09, 2007

make use of CFS's precise accounting to drive /proc/<pid>/stat statistics.

this code was co-authored by:

 Balbir Singh <balbir@linux.vnet.ibm.com>
 Dmitry Adamushko <dmitry.adamushko@gmail.com>
 Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>

b27f03d4

sched: remove the SleepAVG field · 62480d13

Authored by Ingo Molnar on Jul 09, 2007

remove the SleepAVG field from /proc/<pid>/status, as
with the removal of the sleep-average code this value
no longer makes sense.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

62480d13

09 Jul, 2007 20 commits

[GFS2] Small fixes to logging code · a0a24741

Authored by Steven Whitehouse on Jul 09, 2007

This reverts part of an earlier patch which tried to reclaim
gfs2_bufdata structures too early and resulted in a "use after free"
case (this bit from me). Also a change to not write out log headers
unless we really need to (in the case of flushing nothing we don't need
a header) from Bob.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

a0a24741

[DLM] dump more lock values · ac90a255

Authored by David Teigland on Jul 06, 2007

Add two more output fields (lkb_flags and rsb nodeid) to the new debugfs
file that dumps one lock per line. Also, dump all locks instead of just
mastered locks. Accordingly, use a suffix of _locks instead of _master.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

ac90a255

[GFS2] Remove i_mode passing from NFS File Handle · 35dcc52e

Authored by Wendy Cheng on Jun 27, 2007

GFS2 has been passing i_mode within NFS File Handle. Other than the
wrong assumption that there is always room for this extra 16 bit value,
the current gfs2_get_dentry doesn't really need the i_mode to work
correctly. Note that GFS2 NFS code does go thru the same lookup code
path as direct file access route (where the mode is obtained from name
lookup) but gfs2_get_dentry() is coded for different purpose. It is not
used during lookup time. It is part of the file access procedure call.
When the call is invoked, if on-disk inode is not in-memory, it has to
be read-in. This makes i_mode passing a useless overhead.
Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

35dcc52e

[GFS2] Obtaining no_formal_ino from directory entry · bb9bcf06

Authored by Wendy Cheng on Jun 27, 2007

GFS2 lookup code doesn't ask for inode shared glock. This implies during
in-memory inode creation for existing file, GFS2 will not disk-read in
the inode contents. This leaves no_formal_ino un-initialized during
lookup time. The un-initialized no_formal_ino is subsequently encoded
into the file handle. Clients will get an ESTALE error whenever they try to
access these files.
Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

bb9bcf06

[GFS2] git-gfs2-nmw-build-fix · f4fadb23

Authored by akpm@linux-foundation.org on Jun 27, 2007

Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

f4fadb23

[GFS2] System won't suspend with GFS2 file system mounted · b3657629

Authored by Abhijith Das on Jun 27, 2007

The kernel threads in gfs2, namely gfs2_scand, gfs2_logd, gfs2_quotad,
gfs2_glockd, gfs2_recoverd weren't doing anything when the suspend
mechanism was trying to freeze them.

I put in calls to refrigerator() in the loops for all the daemons and
suspend works as expected.
Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

b3657629

[GFS2] remounting w/o acl option leaves acls enabled · 569a7b6c

Authored by Bob Peterson on Jun 27, 2007

This patch is for bugzilla bug #245663.  This crosswrites a fix from
gfs1 (bz #210369) so that the mount options are reset properly upon
remount.  This was tested on system trin-10.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

569a7b6c

[GFS2] inode size inconsistency · 090ffaa5

Authored by Wendy Cheng on Jun 27, 2007

This should have been part of the NFS patch #1 but somehow I missed it
when packaging the patches. It is not a critical issue as the others (I
hope). RHEL 5.1 31.el5 kernel runs fine without this change.

Our truncate code is chopped into two parts, one for vfs inode changes
(in vmtruncate()) and one for the gfs inode (in gfs2_truncatei()). These two
operations are, unfortunately, not atomic. So it could happen that
vmtruncate() succeeds (inode->i_size is changed) but gfs2_truncatei()
fails (say the kernel is temporarily out of memory). This would leave the gfs
inode i_di.di_size out of sync with the vfs inode i_size. It will later confuse
gfs2_commit_write() if a write is issued. Last time I checked, it will
cause file corruption.
Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

090ffaa5

[DLM] Telnet to port 21064 can stop all lockspaces · 97d84836

Authored by Patrick Caulfield on Jun 27, 2007

This patch fixes Red Hat bz#245892

Opening a tcp connection from a cluster member to another cluster member
targeting the dlm port is enough to stop every dlm operation in the cluster.
This means that GFS and rgmanager will hang.
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

97d84836

[GFS2] Fix gfs2_block_truncate_page err return · 1875f2f3

Authored by S. Wendy Cheng on Jun 25, 2007

A code segment inside gfs2_block_truncate_page() doesn't set the return
code correctly. This causes NFSD to erroneously return EIO to the client
for the setattr procedure call (truncate error).
Signed-off-by: S. Wendy Cheng <wcheng@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

1875f2f3

[GFS2] Addendum to the journaled file/unmount patch · 773ed1a0

Authored by Robert Peterson on Jun 20, 2007

This patch is an addendum to the previous journaled file/unmount patch.
It fixes a problem discovered during testing.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

773ed1a0

[GFS2] Simplify multiple glock aquisition · eaf5bd3c

Authored by Steven Whitehouse on Jun 19, 2007

There is a bug in the code which acquires multiple glocks where if the
initial out-of-order attempt fails part way through we can end up trying
to acquire the wrong number of glocks. This is part of the fix for Red
Hat bz #239737. The other part of the bz doesn't apply to upstream
kernels since it was fixed by:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d3717bdf8f08a0e1039158c8bab2c24d20f492b6

Since the out-of-order code doesn't appear to add anything to the
performance of GFS2, this patch just removes it rather than trying to
fix it. It should be much easier to see what's going on here now. In
addition, we don't allocate any memory unless we are using a lot of
glocks (which is a relatively uncommon case).
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

eaf5bd3c
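
As a rough illustration of the simplified approach, the sketch below always takes locks in one sorted order and only allocates scratch memory when the batch is larger than a small on-stack array. The threshold, the `lock_req` struct, and the "acquire" step (which only records the order) are assumptions for illustration; the commit message does not spell out these details.

```c
#include <stdlib.h>

#define SMALL_BATCH 4 /* hypothetical threshold for the on-stack array */

struct lock_req {
    unsigned long long number; /* lock identity used for ordering */
    int seq;                   /* order in which it was taken, -1 if not held */
};

static int cmp_req(const void *a, const void *b)
{
    const struct lock_req *x = *(const struct lock_req *const *)a;
    const struct lock_req *y = *(const struct lock_req *const *)b;

    return (x->number > y->number) - (x->number < y->number);
}

/* Take every lock in ascending number order. Returns 0 or -1 on failure. */
static int acquire_all(struct lock_req *reqs, size_t n)
{
    struct lock_req *stack_slots[SMALL_BATCH];
    struct lock_req **order = stack_slots;
    size_t i;

    if (n > SMALL_BATCH) {
        /* Only large batches pay for an allocation. */
        order = malloc(n * sizeof(*order));
        if (!order)
            return -1;
    }
    for (i = 0; i < n; i++)
        order[i] = &reqs[i];
    qsort(order, n, sizeof(*order), cmp_req);
    for (i = 0; i < n; i++)
        order[i]->seq = (int)i; /* stand-in for actually taking the lock */
    if (order != stack_slots)
        free(order);
    return 0;
}
```

A single consistent order is the usual way to avoid deadlock between two holders that want the same set of locks.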

[GFS2] assertion failure after writing to journaled file, umount · 2332c443

Authored by Robert Peterson on Jun 18, 2007

This patch passes all my nasty tests that were causing the code to
fail under one circumstance or another. Here is a complete summary
of all changes from today's git tree, in order of appearance:

1. There are now separate variables for metadata buffer accounting.
2. Variable sd_log_num_hdrs is no longer needed, since the header
accounting is taken care of by the reserve/refund sequence.
3. Fixed a tiny grammatical problem in a comment.
4. Added a new function "calc_reserved" to calculate the reserved
log space. This isn't entirely necessary, but it has two benefits:
First, it simplifies the gfs2_log_refund function greatly.
Second, it allows for easier debugging because I could sprinkle the
code with calls to this function to make sure the accounting is
proper (by adding asserts and printks) at strategic point of the code.
5. In log_pull_tail there apparently was a kludge to fix up the
accounting based on a "pull" parameter. The buffer accounting is
now done properly, so the kludge was removed.
6. File sync operations were making a call to gfs2_log_flush that
writes another journal header. Since that header was unplanned
for (reserved) by the reserve/refund sequence, the free space had
to be decremented so that when log_pull_tail gets called, the free
space is adjusted properly. (Did I hear you call that a kludge?
Well, maybe, but a lot more justifiable than the one I removed).
7. In the gfs2_log_shutdown code, it optionally syncs the log by
specifying the PULL parameter to log_write_header. I'm not sure
this is necessary anymore. It just seems to me there could be
cases where shutdown is called while there are outstanding log
buffers.
8. In the (data)buf_lo_before_commit functions, I changed some offset
values from being calculated on the fly to being constants. That
simplified some code and we might as well let the compiler do the
calculation once rather than redoing those cycles at run time.
9. This version has my rewritten databuf_lo_add function.
This version is much more like its predecessor, buf_lo_add, which
makes it easier to understand. Again, this might not be necessary,
but it seems as if this one works as well as the previous one,
maybe even better, so I decided to leave it in.
10. In databuf_lo_before_commit, a previous data corruption problem
was caused by going off the end of the buffer. The proper solution
is to have the proper limit in place, rather than stopping earlier.
(Thus my previous attempt to fix it is wrong).
If you don't wrap the buffer, you're stopping too early and that
causes more log buffer accounting problems.
11. In lops.h there are two new (previously mentioned) constants for
figuring out the data offset for the journal buffers.
12. There are also two new functions, buf_limit and databuf_limit to
calculate how many entries will fit in the buffer.
13. In function gfs2_meta_wipe, it needs to distinguish between pinned
metadata buffers and journaled data buffers for proper journal buffer
accounting. It can't use the JDATA gfs2_inode flag because it's
sometimes passed the "real" inode and sometimes the "metadata
inode" and the inode flags will be random bits in a metadata
gfs2_inode. It needs to base its decision on which was passed in.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

2332c443
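
Items 8, 11 and 12 above describe compile-time offsets plus helpers that compute how many entries fit in a log buffer. A hedged sketch of that arithmetic follows; the 72-byte header, the 64-bit entry size, and the assumption that journaled-data entries come in pairs are illustrative guesses, and the `_sketch` names are not the kernel's.

```c
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next multiple of a. */
#define ROUND_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

#define HDR_SIZE 72u /* hypothetical descriptor header size in bytes */

/* Offsets are constants, so the compiler computes them once. */
#define BUF_OFFSET_SKETCH     ROUND_UP(HDR_SIZE, sizeof(uint64_t))
#define DATABUF_OFFSET_SKETCH ROUND_UP(HDR_SIZE, 2 * sizeof(uint64_t))

/* Metadata buffers: one 64-bit entry per logged block (assumption). */
static unsigned int buf_limit_sketch(unsigned int block_size)
{
    return (unsigned int)((block_size - BUF_OFFSET_SKETCH) /
                          sizeof(uint64_t));
}

/* Journaled data buffers: two 64-bit words per logged block (assumption). */
static unsigned int databuf_limit_sketch(unsigned int block_size)
{
    return (unsigned int)((block_size - DATABUF_OFFSET_SKETCH) /
                          (2 * sizeof(uint64_t)));
}
```

With these assumed sizes, a 4096-byte block gives (4096 - 72) / 8 = 503 metadata entries and (4096 - 80) / 16 = 251 data entries.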

[GFS2] Use zero_user_page() in stuffed_readpage() · 2840501a

Authored by Steven Whitehouse on Jun 18, 2007

As suggested by Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Robert P. J. Day <rpjday@mindspring.com>

2840501a

[GFS2] Remove bogus '\0' in rgrp.c · c4201214

Authored by Steven Whitehouse on Jun 14, 2007

Not sure how it slipped in, but we don't want it anyway.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

c4201214

[GFS2] Journaled file write/unstuff bug · 8fb68595

Authored by Robert Peterson on Jun 12, 2007

This patch is for bugzilla bug 283162, which uncovered a number of
bugs pertaining to writing to files that have the journaled bit on.
These bugs happen most often when writing to the meta_fs because
the files are always journaled.  So operations like gfs2_grow were
particularly vulnerable, although many of the problems could be
recreated with normal files after setting the journaled bit on.
The problems fixed are:

-GFS2 wasn't ever writing unstuffed journaled data blocks to their
 in-place location on disk. Now it does.

-If you unmounted too quickly after doing IO to a journaled file,
 GFS2 was crashing because you would discard a buffer whose bufdata
 was still on the active items list.  GFS2 now deals with this
 gracefully.

-GFS2 was losing track of the bufdata for journaled data blocks,
 and it wasn't getting freed, causing an error when you tried to
 unmount the module.  GFS2 now frees all the bufdata structures.

-There was a memory corruption occurring because GFS2 wrote
 twice as many log entries for journaled buffers.

-It was occasionally trying to write journal headers in buffers
 that weren't currently mapped.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

8fb68595

[DLM] don't require FS flag on all nodes · fad59c13

Authored by David Teigland on Jun 11, 2007

Mask off the recently added DLM_LSFL_FS flag when setting the exflags.
This way all the nodes in the lockspace aren't required to have the FS
flag set, since we later check that exflags matches among all nodes.
Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

fad59c13
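
The masking described above amounts to clearing one bit before the flags are stored for the cross-node comparison. A minimal sketch, assuming an illustrative bit value for the FS flag and hypothetical helper names:

```c
#include <stdint.h>

#define LSFL_FS_SKETCH 0x4u /* illustrative bit value for the FS flag */

/* Flags stored for comparison: everything except the node-local FS bit. */
static uint32_t exflags_from(uint32_t flags)
{
    return flags & ~LSFL_FS_SKETCH;
}

/* Two nodes agree if their flags match once the FS bit is ignored. */
static int exflags_match(uint32_t local_flags, uint32_t remote_flags)
{
    return exflags_from(local_flags) == exflags_from(remote_flags);
}
```

A node that sets the FS bit and one that does not still compare equal, which is the behaviour the commit asks for.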

[GFS2] Fix deallocation issues · d93cfa98

Authored by Abhijith Das on Jun 11, 2007

There were two issues during deallocation of unlinked inodes. The
first was relating to the use of a "try" lock which in the case of
the inode lock wasn't trying hard enough to deallocate in all
circumstances (now changed to a normal glock) and in the case of
the iopen lock didn't wait for the demotion of the shared lock before
attempting to get the exclusive lock, and thereby sometimes (timing dependent)
not completing the deallocation when it should have done.

The second issue related to the lack of a way to invalidate dcache entries
on remote nodes (now fixed by this patch) which meant that unlinks were
taking a long time to return disk space to the fs. By adding some code to
invalidate the dcache entries across the cluster for unlinked inodes, that
is now fixed.

This patch was written jointly by Abhijith Das and Steven Whitehouse.
Signed-off-by: Abhijith Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

d93cfa98

[GFS2] return conflicts for GETLK · a7a2ff8a

Authored by David Teigland on Jun 08, 2007

We weren't returning the correct result when GETLK found a conflict,
which is indicated by userspace passing back a 1.

Signed-off-by: Abhijith Das <adas redhat com>
Signed-off-by: David Teigland <teigland redhat com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

a7a2ff8a
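
A hedged sketch of how a positive reply from userspace might be turned into a GETLK result: a value greater than zero means "conflict found", so the caller reports the conflicting lock and still returns success. The reply and result structs, their fields, and the function name are hypothetical; only the "1 means conflict" convention comes from the commit message.

```c
#include <stdint.h>

enum lk_type { LK_UNLCK, LK_RDLCK, LK_WRLCK };

/* Hypothetical reply from the userspace lock daemon. */
struct getlk_reply {
    int rv;         /* <0 error, 0 no conflict, >0 conflict found */
    int ex;         /* non-zero if the conflicting lock is exclusive */
    uint64_t start;
    uint64_t end;
    int pid;
};

/* Hypothetical result handed back to the GETLK caller. */
struct getlk_result {
    enum lk_type type;
    uint64_t start;
    uint64_t end;
    int pid;
};

static int finish_getlk(const struct getlk_reply *r, struct getlk_result *out)
{
    out->type = LK_UNLCK; /* default: nothing blocks the request */
    out->start = 0;
    out->end = 0;
    out->pid = 0;

    if (r->rv < 0)
        return r->rv;     /* genuine error */
    if (r->rv > 0) {
        /* Conflict: describe the blocking lock, but report success. */
        out->type = r->ex ? LK_WRLCK : LK_RDLCK;
        out->start = r->start;
        out->end = r->end;
        out->pid = r->pid;
    }
    return 0;
}
```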

[GFS2] set plock owner in GETLK info · d88101d4

Authored by David Teigland on Jun 08, 2007

Set the owner field in the plock info sent to userspace for GETLK.
Without this, gfs_controld won't correctly see when the GETLK from a
process matches one of the process's existing locks.
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

d88101d4