Commits · 3070451eea1ed8e3bde0573183c7d8ac25fd5e97 · openeuler / raspberrypi-kernel

27 Jul, 2010 19 commits

xfs: reduce stack usage in xfs_iomap · 3070451e

Authored by Christoph Hellwig on Jun 24, 2010

xfs_iomap passes an xfs_bmbt_irec pointer to xfs_iomap_write_direct and
xfs_iomap_write_allocate to give them the results of our read-only
xfs_bmapi query.  Instead of allocating a new xfs_bmbt_irec on the stack
for the next call to xfs_bmapi, reuse the one we got passed, as it's not
used after this point.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

3070451e

xfs: avoid synchronous transaction in xfs_fs_write_inode · 7a36c8a9

Authored by Christoph Hellwig on Jun 24, 2010

We already rely on the fact that the sync code will cause a synchronous
log force later on (currently via xfs_fs_sync_fs -> xfs_quiesce_data ->
xfs_sync_data), so no need to do this here.  This allows us to avoid
a lot of synchronous log forces during sync, which pays off especially
with delayed logging enabled.   Some compilebench numbers that show
this:

xfs (delayed logging, 256k logbufs)
===================================

intial create		  25.94 MB/s	  25.75 MB/s	  25.64 MB/s
create			   8.54 MB/s	   9.12 MB/s	   9.15 MB/s
patch			   2.47 MB/s	   2.47 MB/s	   3.17 MB/s
compile			  29.65 MB/s	  30.51 MB/s	  27.33 MB/s
clean			  90.92 MB/s	  98.83 MB/s	 128.87 MB/s
read tree		  11.90 MB/s	  11.84 MB/s	   8.56 MB/s
read compiled		  28.75 MB/s	  29.96 MB/s	  24.25 MB/s
delete tree		8.39 seconds	8.12 seconds	8.46 seconds
delete compiled		8.35 seconds	8.44 seconds	5.11 seconds
stat tree		6.03 seconds	5.59 seconds	5.19 seconds
stat compiled tree	9.00 seconds	9.52 seconds	8.49 seconds

xfs + write_inode log_force removal
===================================
intial create		  25.87 MB/s	  25.76 MB/s	  25.87 MB/s
create			  15.18 MB/s	  14.80 MB/s	  14.94 MB/s
patch			   3.13 MB/s	   3.14 MB/s	   3.11 MB/s
compile			  36.74 MB/s	  37.17 MB/s	  36.84 MB/s
clean			 226.02 MB/s	 222.58 MB/s	 217.94 MB/s
read tree		  15.14 MB/s	  15.02 MB/s	  15.14 MB/s
read compiled tree	  29.30 MB/s	  29.31 MB/s	  29.32 MB/s
delete tree		6.22 seconds	6.14 seconds	6.15 seconds
delete compiled tree	5.75 seconds	5.92 seconds	5.81 seconds
stat tree		4.60 seconds	4.51 seconds	4.56 seconds
stat compiled tree	4.07 seconds	3.87 seconds	3.96 seconds

In addition to that, also remove the delwri inode flush that is unnecessary
now that bulkstat is always coherent.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

7a36c8a9

xfs: simplify xfs_vm_writepage · 20cb52eb

Authored by Christoph Hellwig on Jun 24, 2010

The writepage implementation in XFS still tries to deal with dirty but
unmapped buffers, which used to be caused by writes through shared mmaps. Since
the introduction of ->page_mkwrite these can't happen anymore, so remove the
code dealing with them.

Note that the all_bh variable, which causes us to start I/O on all buffers on
the page, was controlled by the count of unmapped buffers, which also
included those not actually dirty. It's now unconditionally initialized to
0 but set to 1 for the case of small file size extensions. It probably can
be removed entirely, but that's left for another patch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

20cb52eb

xfs: simplify xfs_vm_releasepage · 89f3b363

Authored by Christoph Hellwig on Jun 24, 2010

Currently the xfs releasepage implementation has code to deal with converting
delayed allocated and unwritten space. But we never get called for those as
we always convert delayed and unwritten space when cleaning a page, or drop
the state from the buffers in block_invalidatepage. We still keep a WARN_ON
on those cases for now, but remove all the code dealing with them, which allows
us to fold xfs_page_state_convert into xfs_vm_writepage and remove the !startio
case from the whole writeback path.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

89f3b363

xfs: fix corruption case for block size < page size · 3d9b02e3

Authored by Eric Sandeen on Jun 24, 2010

xfstests 194 first truncates a file back and then extends it again by
truncating it to a larger size. This causes discard_buffer to drop
the mapped, but not the uptodate bit and thus creates something that
xfs_page_state_convert takes for unmapped space created by mmap because
it doesn't check for the dirty bit, which also gets cleared by
discard_buffer and checked by other ->writepage implementations like
block_write_full_page. Handle this kind of buffer early, and unlike
Eric's first version of the patch simply ASSERT that the buffer is
dirty, given that the mmap write case can't happen anymore since the
introduction of ->page_mkwrite. The now dead code dealing with that
will be deleted in a follow-on patch.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

3d9b02e3

xfs: remove unused delta tracking code in xfs_bmapi · b4e9181e

Authored by Christoph Hellwig on Jun 23, 2010

This code was introduced four years ago in commit
3e57ecf6 without any review and has
been unused since.  Remove it, just like the rest of the code introduced
in that commit, to reduce the stack usage and complexity in this central
piece of code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

b4e9181e

xfs: remove unused XFS_BMAPI_ flags · cd8b0bb3

Authored by Christoph Hellwig on Jun 23, 2010

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

cd8b0bb3

xfs: remove the unused XFS_TRANS_NOSLEEP/XFS_TRANS_WAIT flags · a59f5570

Authored by Christoph Hellwig on Jun 23, 2010

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

a59f5570

xfs: remove the unused XFS_LOG_SLEEP and XFS_LOG_NOSLEEP flags · 9134c233

Authored by Christoph Hellwig on Jun 23, 2010

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

9134c233

xfs: kill the unused xlog_debug variable · dbb2f652

Authored by Christoph Hellwig on Jun 23, 2010

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

dbb2f652

xfs: fix the xfs_log_iovec i_addr type · 4e0d5f92

Authored by Christoph Hellwig on Jun 23, 2010

By making this member a void pointer we can get rid of a lot of pointless
casts.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

4e0d5f92
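The effect of the void pointer change above can be illustrated with a minimal userspace sketch. The `demo_iovec` and `demo_hdr` types and both functions are hypothetical stand-ins, not the XFS structures; the only point is that C converts to and from `void *` implicitly, so neither the store nor the load needs a cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a log vector entry with a void pointer member. */
struct demo_iovec {
	void   *i_addr;   /* start of the region to log */
	size_t  i_len;    /* length of the region in bytes */
};

/* Hypothetical payload type. */
struct demo_hdr {
	int type;
};

/* Storing any object pointer into a void * member needs no cast. */
void demo_fill(struct demo_iovec *vec, struct demo_hdr *hdr)
{
	vec->i_addr = hdr;
	vec->i_len = sizeof(*hdr);
}

/* Reading it back into a typed pointer needs no cast either. */
int demo_type(const struct demo_iovec *vec)
{
	const struct demo_hdr *hdr = vec->i_addr;

	return hdr->type;
}
```

With a `char *` member instead, both assignments would need an explicit cast to compile cleanly.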

xfs: simplify inode to transaction joining · 898621d5

Authored by Christoph Hellwig on Jun 24, 2010

Currently we need to either call IHOLD or xfs_trans_ihold on an inode when
joining it to a transaction via xfs_trans_ijoin.

This patch instead makes xfs_trans_ijoin usable on its own by doing
an implicit xfs_trans_ihold, which also allows us to drop the third
argument. For the case where we want to hold a reference on the inode
a xfs_trans_ijoin_ref wrapper is added which does the IHOLD and marks
the inode for needing an xfs_iput. In addition to the cleaner interface
to the caller this also simplifies the implementation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

898621d5

xfs: simplify buffer pinning · 4d16e924

Authored by Christoph Hellwig on Jun 23, 2010

Get rid of the xfs_buf_pin/xfs_buf_unpin/xfs_buf_ispin helpers and opencode
them in their only callers, just like we did for the inode pinning a while
ago. Also remove duplicate trace points - the bufitem tracepoints cover
all the information that is present in a buffer tracepoint.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

4d16e924

xfs: give li_cb callbacks the correct prototype · ca30b2a7

Authored by Christoph Hellwig on Jun 23, 2010

Stop the function pointer casting madness and give all the li_cb instances
the correct prototype.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

ca30b2a7

xfs: give xfs_item_ops methods the correct prototypes · 7bfa31d8

Authored by Christoph Hellwig on Jun 23, 2010

Stop the function pointer casting madness and give all the xfs_item_ops the
correct prototypes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

7bfa31d8

xfs: merge iop_unpin_remove into iop_unpin · 9412e318

Authored by Christoph Hellwig on Jun 23, 2010

The unpin_remove item operation instances always share most of the
implementation with the respective unpin implementation.  So instead
of keeping two different entry points, add a remove flag to the unpin
operation and share the code more easily.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

9412e318
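The idea of folding two entry points into one with a flag can be sketched roughly as follows. Everything here is hypothetical: `demo_item`, its fields, and the "drop from list" step are invented for illustration and do not reflect what the XFS log items actually do on removal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical log item; the fields are illustrative only. */
struct demo_item {
	int  pin_count;   /* outstanding pins */
	bool on_list;     /* stand-in for "still tracked somewhere" */
};

/*
 * Single entry point replacing separate unpin and unpin_remove operations.
 * The common work runs unconditionally; the extra step that only the
 * remove variant needed is guarded by the flag.
 */
void demo_item_unpin(struct demo_item *item, bool remove)
{
	assert(item->pin_count > 0);
	item->pin_count--;                  /* shared by both former paths */

	if (remove && item->pin_count == 0)
		item->on_list = false;      /* hypothetical removal step */
}
```

Callers of the former plain unpin would pass `false`, and callers of the former remove variant would pass `true`.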

xfs: simplify log item descriptor tracking · e98c414f

Authored by Christoph Hellwig on Jun 23, 2010

Currently we track log item descriptors belonging to a transaction using a
complex opencoded chunk allocator. This code has been there since day one
and seems to work around the lack of an efficient slab allocator.

This patch replaces it with dynamically allocated log item descriptors
from a dedicated slab pool, linked to the transaction by a linked list.

This allows us to greatly simplify the log item descriptor tracking to the
point where it's just a couple hundred lines in xfs_trans.c instead of
a separate file. The external API has also been simplified while we're
at it - the xfs_trans_add_item and xfs_trans_del_item functions to add/
delete items from a transaction have been simplified to the bare minimum,
and the xfs_trans_find_item function is replaced with a direct dereference
of the li_desc field. All debug code walking the list of log items in
a transaction is down to a simple list_for_each_entry.

Note that we could easily use a singly linked list here instead of the
doubly linked list from list.h as the fastpath only does deletion from
sequential traversal. But given that we don't have one available as
a library function yet I use the list.h functions for simplicity.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

e98c414f

xfs: remove unneeded #include statements · 3400777f

Authored by Christoph Hellwig on Jun 23, 2010

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>

3400777f

xfs: drop dmapi hooks · 288699fe

Authored by Christoph Hellwig on Jun 23, 2010

Dmapi support was never merged upstream, but we still have a lot of hooks
bloating XFS for it, all over the fast paths of the filesystem.

This patch drops over 700 lines of dmapi overhead.  If we ever get HSM
support in mainline, at least the namespace events can be done much more sanely
in the VFS instead of the individual filesystem, so it's not like this
is much help for future work.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>

288699fe

23 Jul, 2010 1 commit

CIFS: Fix a malicious redirect problem in the DNS lookup code · 4c0c03ca

Authored by David Howells on Jul 22, 2010

Fix the security problem in the CIFS filesystem DNS lookup code in which a
malicious redirect could be installed by a random user by simply adding a
result record into one of their keyrings with add_key() and then invoking a
CIFS CFS lookup [CVE-2010-2524].

This is done by creating an internal keyring specifically for the caching of
DNS lookups.  To enforce the use of this keyring, the module init routine
creates a set of override credentials with the keyring installed as the thread
keyring and instructs request_key() to only install lookup result keys in that
keyring.

The override is then applied around the call to request_key().

This has some additional benefits when a kernel service uses this module to
request a key:

 (1) The result keys are owned by root, not the user that caused the lookup.

 (2) The result keys don't pop up in the user's keyrings.

 (3) The result keys don't come out of the quota of the user that caused the
     lookup.

The keyring can be viewed as root by doing cat /proc/keys:

2a0ca6c3 I-----     1 perm 1f030000     0     0 keyring   .dns_resolver: 1/4

It can then be listed with 'keyctl list' by root.

	# keyctl list 0x2a0ca6c3
	1 key in keyring:
	726766307: --alswrv     0     0 dns_resolver: foo.bar.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Steve French <smfrench@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

4c0c03ca

22 Jul, 2010 1 commit

Fix up trivial spelling errors ('taht' -> 'that') · a4ce96ac

Authored by Linus Torvalds on Jul 21, 2010

Pointed out by Lucas who found the new one in a comment in
setup_percpu.c. And then I fixed the others that I grepped
for.
Reported-by: Lucas <canolucas@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a4ce96ac

20 Jul, 2010 5 commits

xfs: track AGs with reclaimable inodes in per-ag radix tree · 16fd5367

Authored by Dave Chinner on Jul 20, 2010

https://bugzilla.kernel.org/show_bug.cgi?id=16348

When the filesystem grows to a large number of allocation groups,
the summing of reclaimable inodes gets expensive. In many cases,
most AGs won't have any reclaimable inodes and so we are wasting CPU
time aggregating over these AGs. This is particularly important for
the inode shrinker that gets called frequently under memory
pressure.

To avoid the overhead, track AGs with reclaimable inodes in the
per-ag radix tree so that we can find all the AGs with reclaimable
inodes via a simple gang tag lookup. This involves setting the tag
when the first reclaimable inode is tracked in the AG, and removing
the tag when the last reclaimable inode is removed from the tree.
Then the summation process becomes a loop walking the radix tree
summing AGs with the reclaim tag set.

This significantly reduces the overhead of scanning - a 6400 AG
filesystem now only uses about 25% of a CPU in kswapd while slab
reclaim progresses instead of being permanently stuck at 100% CPU
and making little progress. Clean filesystems will see
no overhead and the overhead only increases linearly with the number
of dirty AGs.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

16fd5367
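The tag lifecycle described in the commit above (set on the first reclaimable inode, cleared on the last) can be modelled with a small sketch. This is not the XFS code: it assumes a flat array of hypothetical `demo_ag` structures with a boolean in place of the per-AG radix tree tag, so the loop still touches every slot, whereas a radix tree gang tag lookup would jump straight to the tagged entries.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_AG_COUNT 8   /* illustrative; the commit mentions 6400 AGs */

/* Hypothetical per-AG state. */
struct demo_ag {
	int  reclaimable;   /* reclaimable inodes tracked in this AG */
	bool tagged;        /* stand-in for the radix tree reclaim tag */
};

/* Track one more reclaimable inode; tag the AG on the 0 -> 1 transition. */
void demo_inode_set_reclaim(struct demo_ag *ag)
{
	if (ag->reclaimable++ == 0)
		ag->tagged = true;
}

/* Drop one reclaimable inode; untag the AG on the 1 -> 0 transition. */
void demo_inode_clear_reclaim(struct demo_ag *ag)
{
	assert(ag->reclaimable > 0);
	if (--ag->reclaimable == 0)
		ag->tagged = false;
}

/* Sum reclaimable inodes, doing per-AG work only for tagged AGs. */
int demo_count_reclaimable(const struct demo_ag *ags, int nr, int *visited)
{
	int total = 0;

	*visited = 0;
	for (int i = 0; i < nr; i++) {
		if (!ags[i].tagged)
			continue;           /* clean AG: no summing work */
		(*visited)++;
		total += ags[i].reclaimable;
	}
	return total;
}
```

The `visited` counter is only there to make the saving observable: it grows with the number of AGs that hold reclaimable inodes, not with the total AG count.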

xfs: convert inode shrinker to per-filesystem contexts · 70e60ce7

Authored by Dave Chinner on Jul 20, 2010

Now that the shrinker passes us a context, wire up a shrinker context per
filesystem. This allows us to remove the global mount list and the
locking problems that it introduced. It also means that a shrinker call
does not need to traverse clean filesystems before finding a
filesystem with reclaimable inodes.  This significantly reduces
scanning overhead when lots of filesystems are present.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

70e60ce7

Btrfs: fix checks in BTRFS_IOC_CLONE_RANGE · 2ebc3464

Authored by Dan Rosenberg on Jul 19, 2010

1.  The BTRFS_IOC_CLONE and BTRFS_IOC_CLONE_RANGE ioctls should check
whether the donor file is append-only before writing to it.

2.  The BTRFS_IOC_CLONE_RANGE ioctl appears to have an integer
overflow that allows a user to specify an out-of-bounds range to copy
from the source file (if off + len wraps around).  I haven't been able
to successfully exploit this, but I'd imagine that a clever attacker
could use this to read things he shouldn't.  Even if it's not
exploitable, it couldn't hurt to be safe.
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
cc: stable@kernel.org
Signed-off-by: Chris Mason <chris.mason@oracle.com>

2ebc3464
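The wraparound concern in point 2 of the commit above (off + len overflowing) can be sketched in userspace. The helper names are hypothetical and this is not the btrfs fix itself; it only shows one common way to validate a range without ever computing `off + len`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the byte range [off, off + len) lies
 * entirely inside a file of `size` bytes. The sum off + len is never
 * formed, so it cannot wrap around.
 */
bool demo_range_is_valid(uint64_t off, uint64_t len, uint64_t size)
{
	if (off > size)
		return false;
	/* size - off cannot underflow because off <= size here. */
	return len <= size - off;
}

/*
 * Naive form, for contrast only: with unsigned 64-bit arithmetic a huge
 * len makes off + len wrap to a small value that passes the comparison.
 */
bool demo_range_is_valid_naive(uint64_t off, uint64_t len, uint64_t size)
{
	return off + len <= size;
}
```

For example, with `off = 16`, `len = UINT64_MAX - 8` and `size = 100`, the naive check accepts the range because the sum wraps to 7, while the first helper rejects it.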

Btrfs: fix CLONE ioctl destination file size expansion to block boundary · b5384d48

Authored by Sage Weil on Jun 12, 2010

The CLONE and CLONE_RANGE ioctls round up the range of extents being
cloned to the block size when the range to clone extends to the end of file
(this is always the case with CLONE).  It was then using that offset when
extending the destination file's i_size.  Fix this by not setting i_size
beyond the originally requested ending offset.

This bug was introduced by a22285a6 (2.6.35-rc1).
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

b5384d48
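The distinction the commit above draws, between the block-aligned range used for cloning extents and the end offset used for i_size, can be sketched as follows. The function names and the 4096-byte block size in the usage note are assumptions for illustration, not btrfs code.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of align; align must be a power of two. */
uint64_t demo_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical model of the fixed behaviour: extents may be cloned up to
 * the block-aligned end, but the destination size only grows to the
 * originally requested end offset (destoff + len), never to the rounded
 * value. The caller is assumed to have validated destoff + len already.
 */
uint64_t demo_new_isize(uint64_t old_isize, uint64_t destoff, uint64_t len)
{
	uint64_t requested_end = destoff + len;

	return requested_end > old_isize ? requested_end : old_isize;
}
```

Cloning 5000 bytes into an empty file with 4096-byte blocks would cover extents up to `demo_round_up(5000, 4096)`, which is 8192, while the resulting size stays at 5000.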

Btrfs: fix split_leaf double split corner case · 99d8f83c

Authored by Chris Mason on Jul 07, 2010

split_leaf was not properly balancing leaves when it was forced to
split a leaf twice.  This commit adds an extra push left and right
before forcing the double split in hopes of getting the slot where
we want to insert at either the start or end of the leaf.

If the extra pushes do work, then we are able to avoid splitting twice
and we keep the tree properly balanced.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

99d8f83c

19 Jul, 2010 2 commits

[S390] dasd: use correct label location for diag fba disks · cffab6bc

Authored by Peter Oberparleiter on Jul 19, 2010

Partition boundary calculation fails for DASD FBA disks under the
following conditions:
- disk is formatted with CMS FORMAT with a blocksize of more than
  512 bytes
- all of the disk is reserved to a single CMS file using CMS RESERVE
- the disk is accessed using the DIAG mode of the DASD driver

Under these circumstances, the partition detection code tries to
read the CMS label block containing partition-relevant information
from logical block offset 1, while it is in fact located at physical
block offset 1.

Fix this problem by using the correct CMS label block location
depending on the device type as determined by the DASD SENSE ID
information.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

cffab6bc

mm: add context argument to shrinker callback · 7f8275d0

Authored by Dave Chinner on Jul 19, 2010

The current shrinker implementation requires the registered callback
to have global state to work from. This makes it difficult to shrink
caches that are not global (e.g. per-filesystem caches). Pass the shrinker
structure to the callback so that users can embed the shrinker structure
in the context the shrinker needs to operate on and get back to it in the
callback via container_of().
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

7f8275d0
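The embedding pattern the commit above describes can be sketched in userspace. The `shrinker` struct here is deliberately simplified (the kernel callback takes more arguments), `demo_mount` and `demo_shrink` are hypothetical, and `container_of` is re-implemented with `offsetof` as a stand-in for the kernel macro.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel macro of the same name. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified shrinker: the callback receives the shrinker struct itself. */
struct shrinker {
	int (*shrink)(struct shrinker *shrink, int nr_to_scan);
};

/* Hypothetical per-filesystem context that embeds its own shrinker. */
struct demo_mount {
	int             reclaimable;   /* objects this mount could free */
	struct shrinker shrinker;      /* embedded, not pointed to */
};

/*
 * Callback: recover the owning mount from the shrinker pointer, so no
 * global list of mounts is needed. Returns the objects still reclaimable;
 * nr_to_scan <= 0 is treated as a pure query.
 */
int demo_shrink(struct shrinker *shrink, int nr_to_scan)
{
	struct demo_mount *mp = container_of(shrink, struct demo_mount, shrinker);

	if (nr_to_scan > 0) {
		int freed = nr_to_scan < mp->reclaimable ? nr_to_scan
							 : mp->reclaimable;
		mp->reclaimable -= freed;
	}
	return mp->reclaimable;
}
```

Two mounts registered with the same callback stay independent, because each call resolves to the structure that contains the shrinker it was handed.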

17 Jul, 2010 3 commits

ocfs2: Silence gcc warning in ocfs2_write_zero_page(). · 5453258d

Authored by Joel Becker on Jul 16, 2010

ocfs2_write_zero_page() has a loop that won't ever be skipped, but gcc
doesn't know that.  Set ret=0 just to make gcc happy.
Signed-off-by: Joel Becker <joel.becker@oracle.com>

5453258d

ceph: do not include cap/dentry releases in replayed messages · e979cf50

Authored by Sage Weil on Jul 15, 2010

Strip the cap and dentry releases from replayed messages.  They can
cause the shared state to get out of sync because they were generated
(with the request message) earlier, and no longer reflect the current
client state.
Signed-off-by: Sage Weil <sage@newdream.net>

e979cf50

ceph: reuse request message when replaying against recovering mds · 01a92f17

Authored by Sage Weil on Jul 15, 2010

Replayed rename operations (after an mds failure/recovery) were broken
because the request paths were regenerated from the dentry names, which
get mangled when d_move() is called.

Instead, resend the previous request message when replaying completed
operations.  Just make sure the REPLAY flag is set and the target ino is
filled in.

This fixes problems with workloads doing renames when the MDS restarts,
where the rename operation appears to succeed, but on mds restart then
fails (leading to client confusion, app breakage, etc.).
Signed-off-by: Sage Weil <sage@newdream.net>

01a92f17

16 Jul, 2010 3 commits

jbd2/ocfs2: Fix block checksumming when a buffer is used in several transactions · 13ceef09

Authored by Jan Kara on Jul 14, 2010

OCFS2 uses the t_commit trigger to compute and store the checksum of the
just-committed blocks. When a buffer has b_frozen_data, the checksum is computed
for it instead of b_data, but this can result in an old checksum being
written to the filesystem in the following scenario:

1) transaction1 is opened
2) handle1 is opened
3) journal_access(handle1, bh)
    - This sets jh->b_transaction to transaction1
4) modify(bh)
5) journal_dirty(handle1, bh)
6) handle1 is closed
7) start committing transaction1, opening transaction2
8) handle2 is opened
9) journal_access(handle2, bh)
    - This copies off b_frozen_data to make it safe for transaction1 to commit.
      jh->b_next_transaction is set to transaction2.
10) jbd2_journal_write_metadata() checksums b_frozen_data
11) the journal correctly writes b_frozen_data to the disk journal
12) handle2 is closed
    - There was no dirty call for the bh on handle2, so it is never queued for
      any more journal operation
13) Checkpointing finally happens, and it just spools the bh via normal buffer
    writeback.  This will write b_data, which was never triggered on and thus
    contains a wrong (old) checksum.

This patch fixes the problem by calling the trigger at the moment data is
frozen for journal commit - i.e., either when b_frozen_data is created by
do_get_write_access or just before we write a buffer to the log if
b_frozen_data does not exist. We also rename the trigger to t_frozen as
that better describes when it is called.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

13ceef09

ocfs2/dlm: Remove BUG_ON from migration in the rare case of a down node · a39953dd

Authored by Wengang Wang on Jul 14, 2010

For migration, we are waiting for DLM_LOCK_RES_MIGRATING flag to be set
before sending DLM_MIG_LOCKRES_MSG message to the target. We are using
dlm_migration_can_proceed() for that purpose. However, if the node is
down, dlm_migration_can_proceed() will also return "go ahead". In this
rare case, the DLM_LOCK_RES_MIGRATING flag might not be set yet. Remove
the BUG_ON() that trips over this condition.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

a39953dd

ocfs2: Don't duplicate pages past i_size during CoW. · f5e27b6d

Authored by Tao Ma on Jul 14, 2010

During CoW, the pages after i_size don't contain valid data, so there's
no need to read and duplicate them.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

f5e27b6d

15 Jul, 2010 5 commits

GFS2: rename causes kernel Oops · 728a756b

Authored by Bob Peterson on Jul 14, 2010

This patch fixes a kernel Oops in the GFS2 rename code.

The problem was in the way the gfs2 directory code was trying
to re-use sentinel directory entries.

In the failing case, gfs2's rename function was renaming a
file to another name that had the same non-trivial length.
The file being renamed happened to be the first directory
entry on the leaf block.

First, the rename code (gfs2_rename in ops_inode.c) found the
original directory entry and decided it could do its job by
simply replacing the directory entry with another.  Therefore
it determined correctly that no block allocations were needed.

Next, the rename code deleted the old directory entry prior to
replacing it with the new name.  Therefore, the soon-to-be
replaced directory entry was temporarily made into a directory
entry "sentinel" or a place holder at the start of a leaf block.

Lastly, it went to re-add the replacement directory entry in
that leaf block.  However, when gfs2_dirent_find_space was
looking for space in the leaf block, it used the wrong value
for the sentinel.  That threw off its calculations, so it later
decided it couldn't really re-use the sentinel and therefore
had to allocate a new leaf block.  But because it had previously decided
to re-use the directory entry, it didn't waste the time to
grab a new block allocation for the inode.  Therefore, the
inode's i_alloc pointer was still NULL and it crashed trying to
reference it.

In the case of sentinel directory entries, the entire dirent is
reused, not just the "free space" portion of it, and therefore
the function gfs2_dirent_find_space should use the value 0
rather than GFS2_DIRENT_SIZE(0) for the actual dirent size.

Fixing this calculation enables the reproducer programs to work
properly.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

728a756b

GFS2: BUG in gfs2_adjust_quota · 8b421601

Authored by Abhijith Das on Jul 04, 2010

HighMem pages on i686 do not get mapped to the buffer_heads and this was
causing a NULL pointer dereference when we were trying to memset page buffers
to zero.
We now use zero_user() that kmaps the page and directly manipulates page data.
This patch also fixes a boundary condition that was incorrect.
Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

8b421601

GFS2: Fix kernel NULL pointer dereference by dlm_astd · b1becbde

Authored by Bob Peterson on Jun 17, 2010

This patch fixes a problem in an error path when looking
up dinodes.  There are two sister-functions, gfs2_inode_lookup
and gfs2_process_unlinked_inode.  Both functions acquire and
hold the i_iopen glock for the dinode being looked up. The last
thing they try to do is hold the i_gl glock for the dinode.
If that glock fails for some reason, the error path was
incorrectly calling gfs2_glock_put for the i_iopen glock twice.
This resulted in the glock being prematurely freed.  The
"minimum hold time" usually kept the glock in memory, but the
lock interface to dlm (aka lock_dlm) freed its memory for the
glock.  In some circumstances, it would cause dlm's dlm_astd daemon
to try to call the bast function for the freed lock_dlm memory,
which resulted in a NULL pointer dereference.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

b1becbde

GFS2: recovery stuck on transaction lock · b7dc2df5

Authored by Bob Peterson on Jun 23, 2010

This patch fixes bugzilla bug #590878: GFS2: recovery stuck on
transaction lock. We set the frozen flag on the glock when we receive
a completion that cannot be delivered due to blocked locks. At that
point we check to see whether the first waiting holder has the noexp
flag set. If the noexp lock is queued later, then we need to unfreeze
the glock at that point in time, namely, in the glock work function.

This patch was originally written by Steve Whitehouse, but since
he's on holiday, I'm submitting it. It's been well tested with a
complex recovery test called revolver.
Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>

b7dc2df5

GFS2: O_TRUNC not working on stuffed files across cluster · a8bf2bc2

Authored by Bob Peterson on Jun 24, 2010

This patch replaces a statement that got dropped out by accident.
Without the patch, truncates on stuffed (very small) files cause
those files to have an unpredictable size.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

a8bf2bc2

13 Jul, 2010 1 commit

ocfs2: tighten up strlen() checking · e372357b

Authored by Dan Carpenter on Jul 10, 2010

This function is only called from one place and it's like this:
	dlm_register_domain(conn->cc_name, dlm_key, &fs_version);

The "conn->cc_name" is 64 characters long.  If strlen(conn->cc_name)
were equal to O2NM_MAX_NAME_LEN (64) that would be a bug because
strlen() doesn't count the terminating NUL character.

In fact, if you look how O2NM_MAX_NAME_LEN is used, it mostly describes
64 character buffers.  The only exception is nd_name from struct
o2nm_node.

Anyway, I looked into it and in this case the domain string comes from
osb->uuid_str in ocfs2_setup_osb_uuid().  That's 32 characters and a NUL,
which easily fits into O2NM_MAX_NAME_LEN.  This patch doesn't change how
the code works, but I think it makes the code a little cleaner.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>

e372357b
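The off-by-one reasoning in the commit above can be shown with a small sketch. `DEMO_MAX_NAME_LEN` and `demo_name_fits` are hypothetical names; the value 64 is taken from the commit message, and the check is a generic illustration rather than the actual ocfs2 change.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative constant; the commit gives O2NM_MAX_NAME_LEN as 64. */
#define DEMO_MAX_NAME_LEN 64

/*
 * Hypothetical check: a name fits a DEMO_MAX_NAME_LEN-byte buffer only if
 * its strlen() is strictly less than the buffer size, because strlen()
 * does not count the terminating NUL byte that also has to be stored.
 */
bool demo_name_fits(const char *name)
{
	return strlen(name) < DEMO_MAX_NAME_LEN;
}
```

A 63-character name passes, since 63 characters plus the NUL fill the buffer exactly, while a 64-character name is rejected even though its length equals the constant.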