1. 17 Sep, 2010 (2 commits)
    • ceph: only send one flushsnap per cap_snap per mds session · e835124c
      Sage Weil authored
      Sending multiple flushsnap messages is problematic because we ignore
      the response if the tid doesn't match, and the server may only respond to
      each one once.  It's also a waste.
      
      So, skip cap_snaps that are already on the flushing list, unless the caller
      tells us to resend (because we are reconnecting).
      Signed-off-by: Sage Weil <sage@newdream.net>
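      
      A minimal sketch of the resulting skip logic, assuming hypothetical
      names (a "flushing" flag on the cap_snap and a send_flushsnap()
      helper); the real ceph structures and helpers differ:
      
          list_for_each_entry(capsnap, &ci->i_cap_snaps, ci_item) {
              /* already queued to the MDS? resend only on reconnect */
              if (capsnap->flushing && !resend)
                  continue;
              send_flushsnap(session, capsnap);  /* hypothetical */
              capsnap->flushing = true;
          }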
      e835124c
    • ceph: fix cap_snap and realm split · ae00d4f3
      Sage Weil authored
      The cap_snap creation/queueing relies on both the current i_head_snapc
      _and_ the i_snap_realm pointers being correct, so that the new cap_snap
      can properly reference the old context and the new i_head_snapc can be
      updated to reference the new snaprealm's context.  To fix this, we:
      
       - move inodes completely to the new (split) realm so that i_snap_realm
         is correct, and
       - generate the new snapc's _before_ queueing the cap_snaps in
         ceph_update_snap_trace().
      Signed-off-by: Sage Weil <sage@newdream.net>
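      
      The ordering the fix enforces, sketched with hypothetical helper
      names (move_inode_to_realm(), rebuild_snap_contexts(),
      queue_cap_snap()):
      
          /* 1) i_snap_realm must point at the new (split) realm */
          move_inode_to_realm(inode, new_realm);
          /* 2) build the new realm's snap context before queueing */
          rebuild_snap_contexts(new_realm);
          /* 3) the cap_snap now references the old context via
           *    i_head_snapc, which can then be repointed at the new one */
          queue_cap_snap(inode);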
      ae00d4f3
  2. 15 Sep, 2010 (2 commits)
  3. 14 Sep, 2010 (1 commit)
  4. 12 Sep, 2010 (4 commits)
  5. 28 Aug, 2010 (3 commits)
  6. 27 Aug, 2010 (11 commits)
  7. 26 Aug, 2010 (3 commits)
  8. 25 Aug, 2010 (3 commits)
  9. 24 Aug, 2010 (11 commits)
    • xfs: do not discard page cache data on EAGAIN · b5420f23
      Christoph Hellwig authored
      If xfs_map_blocks returns EAGAIN because of lock contention, we must redirty the
      page rather than discard the pagecache content and return an error from writepage.
      We used to do this correctly, but the logic was lost during the recent
      reshuffle of the writepage code.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reported-by: Mike Gao <ygao.linux@gmail.com>
      Tested-by: Mike Gao <ygao.linux@gmail.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
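      
      A sketch of the corrected error path in writepage; the
      xfs_map_blocks() arguments and labels are approximated, not the
      exact code:
      
          error = xfs_map_blocks(inode, offset, &imap, type, nonblocking);
          if (error == -EAGAIN)
              goto redirty;        /* contention: keep the page dirty */
          if (error)
              goto error;          /* real failure: discard and report */
      
          /* ... normal mapping and I/O submission ... */
      
      redirty:
          redirty_page_for_writepage(wbc, page);
          unlock_page(page);
          return 0;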
      b5420f23
    • xfs: don't do memory allocation under the CIL context lock · 3b93c7aa
      Dave Chinner authored
      Formatting items requires memory allocation when using delayed
      logging. Currently that memory allocation is done while holding the
      CIL context lock in read mode. This means that if memory allocation
      takes some time (e.g. enters reclaim), we cannot push on the CIL
      until the allocation(s) required by formatting complete. This can
      stall CIL pushes for some time, and once a push is stalled so are
      all new transaction commits.
      
      Fix this by splitting the item formatting into two steps. The first
      step which does the allocation and memcpy() into the allocated
      buffer is now done outside the CIL context lock, and only the CIL
      insert is done inside the CIL context lock. This avoids the stall
      issue.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
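      
      The shape of the fix, sketched with approximate function names (the
      real code in xfs_log_cil.c is more involved):
      
          /* step 1: allocate buffers and memcpy() the formatted items
           * without holding the CIL context lock, so memory reclaim
           * cannot stall a CIL push */
          xlog_cil_format_items(log, tp);
      
          /* step 2: take the lock only for the quick list insert */
          down_read(&cil->xc_ctx_lock);
          xlog_cil_insert_items(log, tp);
          up_read(&cil->xc_ctx_lock);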
      3b93c7aa
    • xfs: Reduce log force overhead for delayed logging · a44f13ed
      Dave Chinner authored
      Delayed logging adds some serialisation to the log force process to
      ensure that it does not dereference a bad commit context structure
      when determining if a CIL push is necessary or not. It does this by
      grabbing the CIL context lock exclusively, then dropping it before
      pushing the CIL if necessary. This causes serialisation of all log
      forces and pushes regardless of whether a force is necessary or not.
      As a result fsync heavy workloads (like dbench) can be significantly
      slower with delayed logging than without.
      
      To avoid this penalty, copy the current sequence from the context to
      the CIL structure when they are swapped. This allows us to do
      unlocked checks on the current sequence without having to worry
      about dereferencing context structures that may have already been
      freed. Hence we can remove the CIL context locking in the forcing
      code and only call into the push code if the current context matches
      the sequence we need to force.
      
      By passing the sequence into the push code, we can check the
      sequence again once we have the CIL lock held exclusive and abort if
      the sequence has already been pushed. This avoids a lock round-trip
      and unnecessary CIL pushes when we have racing push calls.
      
      The result is that the regression in dbench performance goes away -
      this change improves dbench performance on a ramdisk from ~2100MB/s
      to ~2500MB/s. This compares favourably to not using delayed logging
      which returns ~2500MB/s for the same workload.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
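      
      A sketch of the unlocked-then-locked sequence check, with field and
      function context approximated from the description above:
      
          /* unlocked check: the CIL carries a copy of the current
           * sequence, so no context structure is dereferenced here */
          if (cil->xc_current_sequence != sequence)
              return;                 /* already pushed, nothing to do */
      
          down_write(&cil->xc_ctx_lock);
          /* re-check under the lock; a racing push may have won */
          if (cil->xc_ctx->sequence != sequence) {
              up_write(&cil->xc_ctx_lock);
              return;
          }
          /* ... swap in a new context and push the old one ... */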
      a44f13ed
    • xfs: dummy transactions should not dirty VFS state · 1a387d3b
      Dave Chinner authored
      When we need to cover the log, we issue dummy transactions to ensure
      the current log tail is on disk. Unfortunately we currently use the
      root inode in the dummy transaction, and the act of committing the
      transaction dirties the inode at the VFS level.
      
      As a result, the VFS writeback of the dirty inode will prevent the
      filesystem from idling long enough for the log covering state
      machine to complete. The state machine gets stuck in a loop issuing
      new dummy transactions to cover the log and never makes progress.
      
      To avoid this problem, the dummy transactions should not cause
      externally visible state changes. To ensure this occurs, make sure
      that dummy transactions log an unchanging field in the superblock as
      its state is never propagated outside the filesystem. This allows
      the log covering state machine to complete successfully and the
      filesystem now correctly enters a fully idle state about 90s after
      the last modification was made.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
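      
      A sketch of the resulting dummy transaction; per the description it
      logs an unchanging superblock field (the UUID here), with the
      reservation details elided:
      
          tp = xfs_trans_alloc(mp, XFS_TRANS_DUMMY1);
          /* ... reserve log space ... */
          /* the UUID never changes, so nothing VFS-visible is dirtied
           * and the filesystem can finally go idle */
          xfs_mod_sb(tp, XFS_SB_UUID);
          xfs_trans_commit(tp, 0);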
      1a387d3b
    • xfs: ensure f_ffree returned by statfs() is non-negative · 2fe33661
      Stuart Brodsky authored
      Because of delayed updates to the sb_icount field in the superblock, it
      is possible to allocate over maxicount number of inodes.  This
      causes the arithmetic to calculate a negative number of free inodes
      in user commands like df or stat -f.
      
      Since maxicount is a somewhat arbitrary number, a slight
      overallocation is not critical, but the value reported to user
      commands should be 0 or greater and never negative.  To ensure this,
      the f_ffree value in the stats buffer is capped so it cannot go negative.
      
      [ Modified to use max_t as per Christoph's comment. ]
      Signed-off-by: Stu Brodsky <sbrodsky@sgi.com>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
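      
      The clamp itself, with the surrounding variable names approximated:
      
          __int64_t ffree;
      
          /* sb_icount is updated lazily, so icount can briefly exceed
           * the maxicount limit and make this difference negative */
          ffree = statp->f_files - (icount - ifree);
          statp->f_ffree = max_t(__int64_t, ffree, 0);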
      2fe33661
    • xfs: handle negative wbc->nr_to_write during sync writeback · efceab1d
      Dave Chinner authored
      During data integrity (WB_SYNC_ALL) writeback, wbc->nr_to_write will
      go negative on inodes with more than 1024 dirty pages due to
      implementation details of write_cache_pages(). Currently XFS will
      abort page clustering in writeback once nr_to_write drops below
      zero, and so for data integrity writeback we will do very
      inefficient page-at-a-time allocation and IO submission for inodes
      with large numbers of dirty pages.
      
      Fix this by only aborting the page clustering code when
      wbc->nr_to_write is negative and the sync mode is WB_SYNC_NONE.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
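      
      The guard in the clustering loop then looks roughly like this:
      
          /* for WB_SYNC_ALL, keep clustering even after nr_to_write
           * goes negative; only background writeback should stop */
          if (wbc->nr_to_write <= 0 && wbc->sync_mode == WB_SYNC_NONE)
              break;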
      efceab1d
    • xfs: fix untrusted inode number lookup · 4536f2ad
      Dave Chinner authored
      Commit 7124fe0a ("xfs: validate untrusted inode
      numbers during lookup") changes the inode lookup code to do btree lookups for
      untrusted inode numbers. This change made an invalid assumption about the
      alignment of inodes and hence incorrectly calculated the first inode in the
      cluster. As a result, some inode numbers were being incorrectly considered
      invalid when they were actually valid.
      
      The issue was not picked up by the xfstests suite because it always runs fsr
      and dump (the two utilities that utilise the bulkstat interface) on cache hot
      inodes and hence the lookup code in the cold cache path was not sufficiently
      exercised to uncover this intermittent problem.
      
      Fix the issue by relaxing the btree lookup criteria and then checking if the
      record returned contains the inode number we are looking up. If we get an
      incorrect record, then the inode number is invalid.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
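      
      A sketch of the relaxed lookup plus containment check; the inobt
      helpers shown are real XFS functions, but the surrounding context
      is approximated:
      
          /* find the inode chunk record at or before agino instead of
           * assuming any particular cluster alignment */
          error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_LE, &i);
          if (error || !i)
              return EINVAL;
          error = xfs_inobt_get_rec(cur, &rec, &i);
          if (error || !i)
              return EINVAL;
          /* valid only if the returned chunk actually covers agino */
          if (agino < rec.ir_startino ||
              agino >= rec.ir_startino + XFS_INODES_PER_CHUNK)
              return EINVAL;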
      4536f2ad
    • xfs: ensure we mark all inodes in a freed cluster XFS_ISTALE · 5b3eed75
      Dave Chinner authored
      Under heavy parallel metadata load (e.g. dbench), we can fail
      to mark all the inodes in a cluster being freed as XFS_ISTALE as we
      skip inodes we cannot get the XFS_ILOCK_EXCL or the flush lock on.
      When this happens and the inode cluster buffer has already been
      marked stale and freed, inode reclaim can try to write the inode out
      as it is dirty and not marked stale. This can result in writing the
      metadata to a freed extent, or, if the extent has already
      been overwritten, trigger a magic number check failure and return an
      EUCLEAN error such as:
      
      Filesystem "ram0": inode 0x442ba1 background reclaim flush failed with 117
      
      Fix this by ensuring that we hoover up all in-memory inodes in the
      cluster and mark them XFS_ISTALE when freeing the cluster.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
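      
      A rough sketch of the tightened loop in the cluster-free path; the
      cache lookup helper is hypothetical and the real code works from
      the cluster buffer's attached log items:
      
          for (i = 0; i < ninodes; i++) {
              ip = lookup_cached_inode(mp, inum + i);  /* hypothetical */
              if (!ip)
                  continue;
              /* spin instead of skipping: every cached inode in the
               * cluster must end up marked XFS_ISTALE */
              while (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL))
                  delay(1);
              xfs_iflags_set(ip, XFS_ISTALE);
              xfs_iunlock(ip, XFS_ILOCK_EXCL);
          }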
      5b3eed75
    • xfs: unlock items before allowing the CIL to commit · d17c701c
      Dave Chinner authored
      When we commit a transaction using delayed logging, we need to
      unlock the items in the transaction before we unlock the CIL context
      and allow it to be checkpointed. If we unlock them after we release
      the CIL context lock, the CIL can checkpoint and complete before
      we free the log items. This breaks stale buffer item unlock and
      unpin processing as there is an implicit assumption that the unlock
      will occur before the unpin.
      
      Also, some log items need to store the LSN of the transaction commit
      in the item (inodes and EFIs) and so can race with other transaction
      completions if we don't prevent the CIL from checkpointing before
      the unlock occurs.
      
      Cc: <stable@kernel.org>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
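      
      The required ordering in the commit path, sketched with
      approximate function names:
      
          /* unlock and free the transaction's log items first ... */
          xfs_trans_free_items(tp, commit_lsn, 0);
      
          /* ... and only then release the CIL context, so a checkpoint
           * cannot complete before the unlock has happened */
          up_read(&cil->xc_ctx_lock);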
      d17c701c
    • cifs: check for NULL session password · 24e6cf92
      Jeff Layton authored
      It's possible for a cifsSesInfo struct to have a NULL password, so we
      need to check for that prior to running strncmp on it.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
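      
      A minimal sketch of the guard when matching an existing session;
      the comparison length constant is an assumption:
      
          /* ses->password may legitimately be NULL, so never hand it
           * to strncmp() unchecked */
          if (!ses->password || !vol->password ||
              strncmp(ses->password, vol->password, MAX_PASSWORD_SIZE))
              continue;  /* not the session we are looking for */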
      24e6cf92