Commits · 23963e54ce187ca6e907c83176c15508b0f6e60d · openeuler / Kernel

02 Sep, 2010 · 1 commit

xfs: Disallow 32bit project quota id · 23963e54

Committed by Arkadiusz Miśkiewicz on Aug 26, 2010

Currently the on-disk structure is able to keep only a 16bit project quota
id, so disallow 32bit ones. This fixes a problem where the parts of
kernel structures holding the project quota id are 32bit while the
on-disk parts are 16bit variables, which causes project quota member
files to be inaccessible for some operations (like mv/rm).

Signed-off-by: Arkadiusz Miśkiewicz <arekm@maven.pl>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
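
The width mismatch described in this commit can be illustrated with a small userspace sketch. The names `disk_inode_sketch`, `store_projid` and `check_projid` are hypothetical, and returning `-EINVAL` for an oversized id is an assumption; the commit message does not say which error the kernel reports.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for an on-disk inode with a 16-bit project id. */
struct disk_inode_sketch {
    uint16_t projid;
};

/* Storing a 32-bit id in the 16-bit field silently drops the high bits. */
static uint16_t store_projid(struct disk_inode_sketch *d, uint32_t projid)
{
    d->projid = (uint16_t)projid;
    return d->projid;
}

/* Reject ids that would not survive a round trip through the disk field. */
static int check_projid(uint32_t projid)
{
    if (projid > UINT16_MAX)
        return -EINVAL; /* assumed error code */
    return 0;
}
```

With a check like this in front of the store, the in-memory and on-disk ids can no longer disagree, which is the inconsistency the message blames for the failing mv/rm operations.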

28 Aug, 2010 · 3 commits

fsnotify: drop two useless bools in the fsnotify main loop · 92b4678e

Committed by Eric Paris on Aug 27, 2010

The fsnotify main loop has 2 bools which indicated if we processed the
inode or vfsmount mark in that particular pass through the loop. These
bools can be replaced with the inode_group and vfsmount_group variables
and actually make the code a little easier to understand.

Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: fix list walk order · f72adfd5

Committed by Eric Paris on Aug 27, 2010

Marks were stored on the inode and vfsmount mark list in order from
highest memory address to lowest memory address.  The code to walk those
lists thought they were in order from lowest to highest with
unpredictable results when trying to match up marks from each.  It was
possible that extra events would be sent to userspace when inode
marks ignoring events wouldn't get matched with the vfsmount marks.

This problem only affected fanotify when using both vfsmount and inode
marks simultaneously.

Signed-off-by: Eric Paris <eparis@redhat.com>

fanotify: Return EPERM when a process is not privileged · a2f13ad0

Committed by Andreas Gruenbacher on Aug 24, 2010

The appropriate error code when privileged operations are denied is
EPERM, not EACCES.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <paris@paris.rdu.redhat.com>

27 Aug, 2010 · 11 commits

eCryptfs: Fix encrypted file name lookup regression · 93c3fe40

Committed by Tyler Hicks on Aug 25, 2010

Fixes a regression caused by 21edad32

When file name encryption was enabled, ecryptfs_lookup() failed to use
the encrypted and encoded version of the upper, plaintext, file name
when performing a lookup in the lower file system. This made it
impossible to look up existing encrypted file names and any newly created
files would have plaintext file names in the lower file system.

https://bugs.launchpad.net/ecryptfs/+bug/623087

Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

ecryptfs: properly mark init functions · 7371a382

Committed by Jerome Marchand on Aug 17, 2010

Some ecryptfs init functions are not prefixed by __init and thus not
freed after initialization. This patch saved about 1kB in the ecryptfs
module.

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>

fs/ecryptfs: Return -ENOMEM on memory allocation failure · f137f150

Committed by Julia Lawall on Aug 11, 2010

In this code, 0 is returned on memory allocation failure, even though other
failures return -ENOMEM or other similar values.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression ret;
expression x,e1,e2,e3;
@@

ret = 0
... when != ret = e1
*x = \(kmalloc\|kcalloc\|kzalloc\)(...)
... when != ret = e2
if (x == NULL) { ... when != ret = e3
  return ret;
}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
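
The bug shape that the semantic match looks for can be sketched in plain userspace C. The function names and the injectable allocator are hypothetical and exist only to make the failure path easy to trigger; this is not the ecryptfs code itself.

```c
#include <errno.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);

/* Hypothetical allocator that always fails, to exercise the error path. */
static void *failing_alloc(size_t len)
{
    (void)len;
    return NULL;
}

/* Buggy shape: ret is still 0 when the allocation fails. */
static int setup_buggy(size_t len, alloc_fn alloc)
{
    int ret = 0;
    void *buf = alloc(len);

    if (buf == NULL)
        return ret; /* reports success to the caller */
    free(buf);
    return ret;
}

/* Fixed shape: the failure is reported as -ENOMEM. */
static int setup_fixed(size_t len, alloc_fn alloc)
{
    void *buf = alloc(len);

    if (buf == NULL)
        return -ENOMEM;
    free(buf);
    return 0;
}
```

Calling `setup_buggy(16, failing_alloc)` yields 0 even though nothing was allocated, while `setup_fixed(16, failing_alloc)` yields `-ENOMEM`.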

nfsd: fix NULL dereference in nfsd_statfs() · f6360efb

Committed by Takashi Iwai on Aug 13, 2010

The commit ebabe9a9
    pass a struct path to vfs_statfs
introduced the struct path initialization, and this seems to trigger
an Oops on my machine.

The fh_dentry field may be NULL and is set later in fh_verify(), thus the
initialization of path must be after fh_verify().

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: fix downgrade/lock logic · 7d947842

Committed by J. Bruce Fields on Aug 20, 2010

If we already had a RW open for a file, and get a readonly open, we were
piggybacking on the existing RW open.  That's inconsistent with the
downgrade logic which blows away the RW open assuming you'll still have
a readonly open.

Also, make sure there is a readonly or writeonly open available for
locking, again to prevent bad behavior in downgrade cases when any RW
open may be lost.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: typo fix in find_any_file · 18608ad4

Committed by J. Bruce Fields on Aug 20, 2010

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

nfsd4: bad BUG() in preprocess_stateid_op · 30c0e1ef

Committed by J. Bruce Fields on Aug 17, 2010

It's OK for this function to return without setting filp--we do it in
the special-stateid case.

And there's a legitimate case where we can hit this, since we do permit
reads on write-only stateid's.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

Cannot allocate memory error on mount · f0138a79

Committed by Suresh Jayaraman on Aug 26, 2010

On 08/26/2010 01:56 AM, joe hefner wrote:
> On a recent Fedora (13), I am seeing a mount failure message that I can not explain. I have a Windows Server 2003 with a share set up for access only for a specific username (say userfoo). If I try to mount it from Linux, using userfoo and the correct password all is well. If I try with a bad password or with some other username (userbar), it fails with "Permission denied" as expected. If I try to mount as username = administrator, and give the correct administrator password, I would also expect "Permission denied", but I see "Cannot allocate memory" instead.

> fs/cifs/netmisc.c: Mapping smb error code 5 to POSIX err -13
> fs/cifs/cifssmb.c: Send error in QPathInfo = -13
> CIFS VFS: cifs_read_super: get root inode failed

Looks like the commit 0b8f18e3 assumed that cifs_get_inode_info() and
friends fail only due to memory allocation error when the inode is NULL
which is not the case if CIFSSMBQPathInfo() fails and returns an error.
Fix this by propagating the actual error code back.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
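
The difference between assuming ENOMEM and propagating the real error can be sketched as follows. `lookup_root`, `mount_buggy` and `mount_fixed` are hypothetical stand-ins, not CIFS functions; only the overall shape follows the commit message.

```c
#include <errno.h>
#include <stddef.h>

struct inode_sketch {
    int ino;
};

/* Hypothetical lookup: returns 0 and sets *out, or a negative errno. */
static int lookup_root(int server_rc, struct inode_sketch **out)
{
    static struct inode_sketch root = { 1 };

    *out = NULL;
    if (server_rc < 0)
        return server_rc; /* e.g. -EACCES from the server */
    *out = &root;
    return 0;
}

/* Buggy shape: any NULL inode is reported as an allocation failure. */
static int mount_buggy(int server_rc)
{
    struct inode_sketch *inode;

    lookup_root(server_rc, &inode);
    if (inode == NULL)
        return -ENOMEM;
    return 0;
}

/* Fixed shape: the error returned by the lookup is passed through. */
static int mount_fixed(int server_rc)
{
    struct inode_sketch *inode;
    int rc = lookup_root(server_rc, &inode);

    if (rc < 0)
        return rc;
    return 0;
}
```

With `-EACCES` (the -13 seen in the quoted log) as the server result, the buggy variant reports `-ENOMEM` and the fixed variant reports `-EACCES`.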

ceph: fix get_ticket_handler() error handling · b545787d

Committed by Dan Carpenter on Aug 26, 2010

get_ticket_handler() returns a valid pointer or it returns
ERR_PTR(-ENOMEM) if kzalloc() fails.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: don't BUG on ENOMEM during mds reconnect · e072f8aa

Committed by Sage Weil on Aug 26, 2010

We are in a position to return an error; do that instead.

Signed-off-by: Sage Weil <sage@newdream.net>

ceph: ceph_mdsc_build_path() returns an ERR_PTR · f44c3890

Committed by Dan Carpenter on Aug 26, 2010

ceph_mdsc_build_path() returns an ERR_PTR but this code is set up to
handle NULL returns.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
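
Why a NULL check misses an ERR_PTR return can be shown with a userspace imitation of the convention. The `*_sketch` helpers below are simplified re-implementations written for this example, not the kernel's own macros, and `build_path_sketch` is a hypothetical function.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno in a pointer value. */
static void *err_ptr_sketch(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode the errno from an error pointer. */
static long ptr_err_sketch(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO_SKETCH addresses. */
static int is_err_sketch(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Hypothetical function that never returns NULL. */
static char *build_path_sketch(int fail)
{
    static char path[] = "/a/b";

    if (fail)
        return err_ptr_sketch(-ENOMEM);
    return path;
}

/* Buggy caller: the NULL check can never fire. */
static int use_path_buggy(int fail)
{
    char *path = build_path_sketch(fail);

    if (path == NULL)
        return -ENOMEM;
    return 0; /* would go on to use an invalid pointer */
}

/* Fixed caller: test for an encoded error instead. */
static int use_path_fixed(int fail)
{
    char *path = build_path_sketch(fail);

    if (is_err_sketch(path))
        return (int)ptr_err_sketch(path);
    return 0;
}
```

The buggy caller returns 0 on failure and carries on with a pointer that is really an error code, whereas the fixed caller returns `-ENOMEM`.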

26 Aug, 2010 · 3 commits

[CIFS] Eliminate unused variable warning · c89e5198

Committed by Steve French on Aug 26, 2010

CC: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

ceph: Fix warnings · ad8453ab

Committed by Alan Cox on Aug 25, 2010

Just scrubbing some warnings so I can see real problem ones in the build
noise. For 32bit we need to coax gcc politely into believing we really
honestly intend to do the casts. Using (u64)(unsigned long) means we cast from
a pointer to a type of the right size and then extend it. This stops the
warning spew.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Sage Weil <sage@newdream.net>
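
The two-step cast mentioned in the message looks like this in a userspace sketch. `ptr_to_u64` is a hypothetical helper, and the sketch assumes a platform where `unsigned long` has the same width as a pointer, as on Linux.

```c
#include <stdint.h>

/*
 * Cast the pointer to an integer of pointer width first, then widen it
 * to 64 bits. A direct pointer-to-64-bit cast on a 32-bit build is what
 * gcc warns about.
 */
static uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(unsigned long)p;
}
```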

ceph: ceph_get_inode() returns an ERR_PTR · ac1f12ef

Committed by Dan Carpenter on Aug 25, 2010

ceph_get_inode() returns an ERR_PTR and it doesn't return a NULL.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

25 Aug, 2010 · 3 commits

ceph: initialize fields on new dentry_infos · 36e21687

Committed by Sage Weil on Aug 24, 2010

Signed-off-by: Sage Weil <sage@newdream.net>

ceph: maintain i_head_snapc when any caps are dirty, not just for data · 7d8cb26d

Committed by Sage Weil on Aug 24, 2010

We used to use i_head_snapc to keep track of which snapc the current epoch
of dirty data was dirtied under.  It is used by queue_cap_snap to set up
the cap_snap.  However, since we queue cap snaps for any dirty caps, not
just for dirty file data, we need to keep a valid i_head_snapc anytime
we have dirty|flushing caps.  This fixes a NULL pointer deref in
queue_cap_snap when writing back dirty caps without data (e.g.,
snaptest-authwb.sh).

Signed-off-by: Sage Weil <sage@newdream.net>

Eliminate sparse warning - bad constant expression · 2d20ca83

Committed by shirishpargaonkar@gmail.com on Aug 24, 2010

Eliminate sparse warning during usage of crypto_shash_* APIs
       error: bad constant expression

Allocate memory for shash descriptors once, so that we do not kmalloc/kfree it
for every signature generation (shash descriptor for md5 hash).

From ed7538619817777decc44b5660b52268077b74f3 Mon Sep 17 00:00:00 2001
From: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Date: Tue, 24 Aug 2010 11:47:43 -0500
Subject: [PATCH] eliminate sparse warnings during crypto_shash_* APIs usage

Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

24 Aug, 2010 · 11 commits

xfs: do not discard page cache data on EAGAIN · b5420f23

Committed by Christoph Hellwig on Aug 24, 2010

If xfs_map_blocks returns EAGAIN because of lock contention we must redirty the
page and not discard the pagecache content and return an error from writepage.
We used to do this correctly, but the logic got lost during the recent
reshuffle of the writepage code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Mike Gao <ygao.linux@gmail.com>
Tested-by: Mike Gao <ygao.linux@gmail.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>

xfs: don't do memory allocation under the CIL context lock · 3b93c7aa

Committed by Dave Chinner on Aug 24, 2010

Formatting items requires memory allocation when using delayed
logging. Currently that memory allocation is done while holding the
CIL context lock in read mode. This means that if memory allocation
takes some time (e.g. enters reclaim), we cannot push on the CIL
until the allocation(s) required by formatting complete. This can
stall CIL pushes for some time, and once a push is stalled so are
all new transaction commits.

Fix this by splitting the item formatting into two steps. The first
step which does the allocation and memcpy() into the allocated
buffer is now done outside the CIL context lock, and only the CIL
insert is done inside the CIL context lock. This avoids the stall
issue.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: Reduce log force overhead for delayed logging · a44f13ed

Committed by Dave Chinner on Aug 24, 2010

Delayed logging adds some serialisation to the log force process to
ensure that it does not dereference a bad commit context structure
when determining if a CIL push is necessary or not. It does this by
grabbing the CIL context lock exclusively, then dropping it before
pushing the CIL if necessary. This causes serialisation of all log
forces and pushes regardless of whether a force is necessary or not.
As a result fsync heavy workloads (like dbench) can be significantly
slower with delayed logging than without.

To avoid this penalty, copy the current sequence from the context to
the CIL structure when they are swapped. This allows us to do
unlocked checks on the current sequence without having to worry
about dereferencing context structures that may have already been
freed. Hence we can remove the CIL context locking in the forcing
code and only call into the push code if the current context matches
the sequence we need to force.

By passing the sequence into the push code, we can check the
sequence again once we have the CIL lock held exclusive and abort if
the sequence has already been pushed. This avoids a lock round-trip
and unnecessary CIL pushes when we have racing push calls.

The result is that the regression in dbench performance goes away -
this change improves dbench performance on a ramdisk from ~2100MB/s
to ~2500MB/s. This compares favourably to not using delayed logging
which returns ~2500MB/s for the same workload.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: dummy transactions should not dirty VFS state · 1a387d3b

Committed by Dave Chinner on Aug 24, 2010

When we need to cover the log, we issue dummy transactions to ensure
the current log tail is on disk. Unfortunately we currently use the
root inode in the dummy transaction, and the act of committing the
transaction dirties the inode at the VFS level.

As a result, the VFS writeback of the dirty inode will prevent the
filesystem from idling long enough for the log covering state
machine to complete. The state machine gets stuck in a loop issuing
new dummy transactions to cover the log and never makes progress.

To avoid this problem, the dummy transactions should not cause
externally visible state changes. To ensure this occurs, make sure
that dummy transactions log an unchanging field in the superblock as
its state is never propagated outside the filesystem. This allows
the log covering state machine to complete successfully and the
filesystem now correctly enters a fully idle state about 90s after
the last modification was made.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: ensure f_ffree returned by statfs() is non-negative · 2fe33661

Committed by Stuart Brodsky on Aug 24, 2010

Because of delayed updates to sb_icount field in the super block, it
is possible to allocate over maxicount number of inodes.  This
causes the arithmetic to calculate a negative number of free inodes
in user commands like df or stat -f.

Since maxicount is a somewhat arbitrary number, a slight over
allocation is not critical but user commands should be displayed as
0 or greater and never go negative.  To do this the value in the
stats buffer f_ffree is capped to never go negative.

[ Modified to use max_t as per Christoph's comment. ]

Signed-off-by: Stu Brodsky <sbrodsky@sgi.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
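
The clamping described in this commit can be sketched in userspace C. `max_t_sketch` is a simplified imitation of a typed max helper, and `free_inodes_sketch` with its parameter names is a hypothetical reduction of the calculation; the commit message does not show the exact expression XFS uses.

```c
#include <stdint.h>

/* Simplified typed max: compare both operands after casting to `type`. */
#define max_t_sketch(type, a, b) \
    ((type)(a) > (type)(b) ? (type)(a) : (type)(b))

/*
 * Free inodes = limit minus inodes in use. When more inodes than the
 * limit have been allocated the signed result is negative, so clamp it
 * at zero before reporting it.
 */
static uint64_t free_inodes_sketch(uint64_t maxicount, uint64_t icount,
                                   uint64_t ifree)
{
    int64_t ffree = (int64_t)(maxicount - (icount - ifree));

    return (uint64_t)max_t_sketch(int64_t, ffree, 0);
}
```

For a limit of 100 with 110 inodes in use the unclamped difference is -10, and the clamped result reported to tools such as df is 0.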

xfs: handle negative wbc->nr_to_write during sync writeback · efceab1d

Committed by Dave Chinner on Aug 24, 2010

During data integrity (WB_SYNC_ALL) writeback, wbc->nr_to_write will
go negative on inodes with more than 1024 dirty pages due to
implementation details of write_cache_pages(). Currently XFS will
abort page clustering in writeback once nr_to_write drops below
zero, and so for data integrity writeback we will do very
inefficient page at a time allocation and IO submission for inodes
with large numbers of dirty pages.

Fix this by only aborting the page clustering code when
wbc->nr_to_write is negative and the sync mode is WB_SYNC_NONE.

Cc: <stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
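
The changed abort condition can be written as a pair of predicates. The enum and function names are hypothetical, and the sketch follows the wording of the message (abort on a negative count) rather than the actual XFS source.

```c
/* Hypothetical stand-ins for the WB_SYNC_NONE / WB_SYNC_ALL modes. */
enum sync_mode_sketch {
    SYNC_NONE_SKETCH,
    SYNC_ALL_SKETCH
};

/* Old behaviour: stop clustering as soon as the budget goes negative. */
static int stop_clustering_old(long nr_to_write)
{
    return nr_to_write < 0;
}

/* New behaviour: a negative budget only matters for non-integrity sync. */
static int stop_clustering_new(long nr_to_write, enum sync_mode_sketch mode)
{
    return nr_to_write < 0 && mode == SYNC_NONE_SKETCH;
}
```

Under data integrity writeback the new predicate stays false even when the count has gone negative, so clustering continues for inodes with many dirty pages.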

xfs: fix untrusted inode number lookup · 4536f2ad

Committed by Dave Chinner on Aug 24, 2010

Commit 7124fe0a ("xfs: validate untrusted inode
numbers during lookup") changes the inode lookup code to do btree lookups for
untrusted inode numbers. This change made an invalid assumption about the
alignment of inodes and hence incorrectly calculated the first inode in the
cluster. As a result, some inode numbers were being incorrectly considered
invalid when they were actually valid.

The issue was not picked up by the xfstests suite because it always runs fsr
and dump (the two utilities that utilise the bulkstat interface) on cache hot
inodes and hence the lookup code in the cold cache path was not sufficiently
exercised to uncover this intermittent problem.

Fix the issue by relaxing the btree lookup criteria and then checking if the
record returned contains the inode number we are looking up. If we get an
incorrect record, then the inode number is invalid.

Cc: <stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: ensure we mark all inodes in a freed cluster XFS_ISTALE · 5b3eed75

Committed by Dave Chinner on Aug 24, 2010

Under heavy parallel metadata loads (e.g. dbench), we can fail
to mark all the inodes in a cluster being freed as XFS_ISTALE as we
skip inodes we cannot get the XFS_ILOCK_EXCL or the flush lock on.
When this happens and the inode cluster buffer has already been
marked stale and freed, inode reclaim can try to write the inode out
as it is dirty and not marked stale. This can result in writing the
metadata to a freed extent, or in the case it has already
been overwritten trigger a magic number check failure and return an
EUCLEAN error such as:

Filesystem "ram0": inode 0x442ba1 background reclaim flush failed with 117

Fix this by ensuring that we hoover up all in memory inodes in the
cluster and mark them XFS_ISTALE when freeing the cluster.

Cc: <stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

xfs: unlock items before allowing the CIL to commit · d17c701c

Committed by Dave Chinner on Aug 24, 2010

When we commit a transaction using delayed logging, we need to
unlock the items in the transaction before we unlock the CIL context
and allow it to be checkpointed. If we unlock them after we release
the CIL context lock, the CIL can checkpoint and complete before
we free the log items. This breaks stale buffer item unlock and
unpin processing as there is an implicit assumption that the unlock
will occur before the unpin.

Also, some log items need to store the LSN of the transaction commit
in the item (inodes and EFIs) and so can race with other transaction
completions if we don't prevent the CIL from checkpointing before
the unlock occurs.

Cc: <stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

cifs: check for NULL session password · 24e6cf92

Committed by Jeff Layton on Aug 23, 2010

It's possible for a cifsSesInfo struct to have a NULL password, so we
need to check for that prior to running strncmp on it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
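
A guarded comparison of the kind this commit calls for might look like the sketch below. `password_matches` is a hypothetical helper; treating a missing password as a non-match is an assumption made for the example, not something the commit message specifies.

```c
#include <stddef.h>
#include <string.h>

/*
 * Compare two passwords, checking for NULL first because passing a
 * NULL pointer to strncmp() is undefined behaviour.
 */
static int password_matches(const char *session_pw, const char *supplied,
                            size_t max_len)
{
    if (session_pw == NULL || supplied == NULL)
        return 0; /* assumed policy: nothing to compare, no match */
    return strncmp(session_pw, supplied, max_len) == 0;
}
```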

missing changes during ntlmv2/ntlmssp auth and sign · 3ec6bbcd

Committed by Shirish Pargaonkar on Aug 23, 2010

Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

23 Aug, 2010 · 8 commits

ceph: fix osd request lru adjustment when sending request · 07a27e22

Committed by Henry C Chang on Aug 22, 2010

Fix argument order.  We want to move the item to the end of the list, not
change the position of the head.

Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
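
Why the argument order matters can be seen with a minimal circular doubly linked list. The `node_*` helpers and `move_tail` are a hypothetical userspace imitation of a move-to-tail helper that takes the entry first and the list head second; the commit message does not name the function it fixes.

```c
#include <stddef.h>

struct node {
    struct node *prev, *next;
};

/* An empty list is a head that points at itself. */
static void node_init(struct node *n)
{
    n->prev = n->next = n;
}

/* Unlink a node from whatever list it is on. */
static void node_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Insert a node just before the head, i.e. at the tail. */
static void node_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Move `item` to the tail of the list anchored at `head`. */
static void move_tail(struct node *item, struct node *head)
{
    node_del(item);
    node_add_tail(item, head);
}
```

Calling `move_tail(&item, &head)` puts the item last in the LRU order. Swapping the arguments would instead unlink the head and reinsert it in front of the item, which shifts the position of the head as the message describes.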

ceph: don't improperly set dir complete when holding EXCL cap · 12451491

Committed by Sage Weil on Aug 22, 2010

If we hold the EXCL cap, we cannot trust the dir stats from the MDS (num
files, subdirs) and must not incorrectly conclude that the directory is
empty.  If we do, we can get bad results from lookup (bad ENOENT) and
bad readdir results.

Signed-off-by: Sage Weil <sage@newdream.net>

fanotify: drop duplicate pr_debug statement · ff8d6e98

Committed by Tvrtko Ursulin on Aug 20, 2010

This reminded me... you have two pr_debugs in fanotify_should_send_event
which output redundant information. Maybe you intended it like that so
it is selectable how much log spam you want, or if not you may want to
apply this patch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com>
Signed-off-by: Eric Paris <eparis@redhat.com>

fanotify: flush outstanding perm requests on group destroy · 2eebf582

Committed by Eric Paris on Aug 18, 2010

When an fanotify listener is closing it may cause a deadlock between the
listener and the original task doing an fs operation. If the original task
is waiting for a permissions response it will be holding the srcu lock. The
listener cannot clean up and exit until after that srcu lock is synchronized.
Thus deadlock. The fix introduced here is to stop accepting new permissions
events when a listener is shutting down and to grant permission for all
outstanding events. Thus the original task will eventually release the srcu
lock and the listener can complete shutdown.

Reported-by: Andreas Gruenbacher <agruen@suse.de>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: fix ignored mask handling between inode and vfsmount marks · 84e1ab4d

Committed by Eric Paris on Aug 18, 2010

The interesting 2 list lockstep walking didn't quite work out if the inode
marks only had ignores and the vfsmount list requested events. The code to
shortcut list traversal would not run the inode list since it didn't have real
event requests. This code forces inode list traversal when a vfsmount mark
matches the event type. Maybe we could add an i_fsnotify_ignored_mask field
to struct inode to get the shortcut back, but it doesn't seem worth it to grow
struct inode again.

I bet with the recent changes to lock the way we do now it would actually not
be a major perf hit to just drop i_fsnotify_mark_mask altogether. But that is
for another day.

Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: reset used_inode and used_vfsmount on each pass · 5f3f259f

Committed by Eric Paris on Aug 18, 2010

The fsnotify main loop has 2 booleans which tell if a particular mark was
sent to the listeners or if it should be processed in the next pass. The
problem is that the booleans were not reset on each traversal of the loop.
So marks could get skipped even when they were not sent to the notifiers.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com>
Signed-off-by: Eric Paris <eparis@redhat.com>

fanotify: do not dereference inode_mark when it is unset · faa9560a

Committed by Eric Paris on Aug 18, 2010

The fanotify code is supposed to get the group from the mark. It accidentally
only used the inode_mark. If the vfsmount_mark was set but not the inode_mark
it would deref the NULL inode_mark. Get the group from the correct place.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com>
Signed-off-by: Eric Paris <eparis@redhat.com>

mm: exporting account_page_dirty · 679ceace

Committed by Michael Rubin on Aug 20, 2010

This allows code outside of the mm core to safely manipulate page state
and not worry about the other accounting. Not using these routines means
that some code will lose track of the accounting and we get bugs. This
has happened once already.

Signed-off-by: Michael Rubin <mrubin@google.com>
Signed-off-by: Sage Weil <sage@newdream.net>

openeuler / Kernel · synced successfully more than a year ago
