Commits · be655596b3de5873f994ddbe205751a5ffb4de39 · openeuler / raspberrypi-kernel

08 Dec, 2011 1 commit

ceph: use i_ceph_lock instead of i_lock · be655596

Sage Weil authored 13 years ago

We have been using i_lock to protect all kinds of data structures in the
ceph_inode_info struct, including lists of inodes that we need to iterate
over while avoiding races with inode destruction.  That requires grabbing
a reference to the inode with the list lock protected, but igrab() now
takes i_lock to check the inode flags.

Changing the list lock ordering would be a painful process.

However, using a ceph-specific i_ceph_lock in the ceph inode instead of
i_lock is a simple mechanical change and avoids the ordering constraints
imposed by igrab().
Reported-by: Amon Ott <a.ott@m-privacy.de>
Signed-off-by: Sage Weil <sage@newdream.net>

03 Dec, 2011 1 commit

ceph: fix rasize reporting by ceph_show_options · 2151937d

Sage Weil authored 13 years ago

Fix typo.
Reported-by: mowang da <whooya.xxl@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

12 Nov, 2011 1 commit

ceph: initialize root dentry · 774ac21d

Sage Weil authored 13 years ago

Set up d_fsdata on the root dentry.  This fixes a NULL pointer dereference
in ceph_d_prune on umount.  It also means we can eventually strip out all
of the conditional checks on d_fsdata because it is now set unconditionally
(prior to setting up the d_ops).

Fix the ceph_d_prune debug print while we're here.
Signed-off-by: Sage Weil <sage@newdream.net>

06 Nov, 2011 4 commits

ceph: fix iput race when queueing inode work · 15a2015f

Sage Weil authored 13 years ago

If we queue a work item that calls iput(), make sure we ihold() before
attempting to queue work. Otherwise our queued work might miraculously run
before we notice the queue_work() succeeded and call ihold(), allowing the
inode to be destroyed.

That is, instead of

	if (queue_work(...))
		ihold();

we need to do

	ihold();
	if (!queue_work(...))
		iput();
Reported-by: Amon Ott <a.ott@m-privacy.de>
Signed-off-by: Sage Weil <sage@newdream.net>
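
The ordering described in this commit can be illustrated with a small userspace sketch. Everything here is hypothetical: `toy_inode`, `toy_ihold()`, `toy_iput()` and `toy_queue_work()` are stand-ins invented for the example, not the kernel's `ihold()`, `iput()` or `queue_work()`, and the sketch is single-threaded, so it shows only the reference accounting, not the race itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode with a reference count. */
struct toy_inode {
	int refcount;
	bool work_pending;
};

static void toy_ihold(struct toy_inode *in)
{
	in->refcount++;
}

static void toy_iput(struct toy_inode *in)
{
	assert(in->refcount > 0);
	in->refcount--;
}

/* Like queue_work(): returns false if the item was already queued. */
static bool toy_queue_work(struct toy_inode *in)
{
	if (in->work_pending)
		return false;
	in->work_pending = true;
	return true;
}

/* Take the reference first; give it back only if nothing was queued. */
static void toy_queue_inode_work(struct toy_inode *in)
{
	toy_ihold(in);
	if (!toy_queue_work(in))
		toy_iput(in);
}

/* The worker drops the reference that was taken on its behalf. */
static void toy_run_work(struct toy_inode *in)
{
	in->work_pending = false;
	toy_iput(in);
}
```

Because the reference is taken before the work item becomes visible to the worker, the worker's final put can never be the one that drops the count to zero ahead of the caller.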

ceph/super.c: quiet sparse noise · 0c6d4b4e

H Hartley Sweeten authored 13 years ago

Quiet the sparse noise:

warning: symbol 'create_fs_client' was not declared. Should it be static?
warning: symbol 'destroy_fs_client' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Sage Weil <sage@newdream.net>
ceph-devel@vger.kernel.org
Signed-off-by: Sage Weil <sage@newdream.net>

ceph/mds_client.c: quiet sparse noise · 7fd7d101

H Hartley Sweeten authored 13 years ago

Quiet the following sparse noise:

warning: symbol 'get_nonsnap_parent' was not declared. Should it be static?
warning: symbol 'done_closing_sessions' was not declared. Should it be static?

Local functions don't need external visibility. Make them static.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Sage Weil <sage@newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: use new D_COMPLETE dentry flag · c6ffe100

Sage Weil authored 13 years ago

We used to use a flag on the directory inode to track whether the dcache
contents for a directory were a complete cached copy. Switch to a dentry
flag CEPH_D_COMPLETE that is safely updated by ->d_prune().
Signed-off-by: Sage Weil <sage@newdream.net>

04 Nov, 2011 1 commit

ceph: clear parent D_COMPLETE flag when on dentry prune · b58dc410

Sage Weil authored 13 years ago

When the VFS prunes a dentry from the cache, clear the D_COMPLETE flag
on the parent dentry. Do this for the live and snapshotted namespaces. Do
not bother for the .snap dir contents, since we do not cache that.
Signed-off-by: Sage Weil <sage@newdream.net>

02 Nov, 2011 1 commit

filesystems: add set_nlink() · bfe86848

Miklos Szeredi authored 13 years ago

Replace remaining direct i_nlink updates with a new set_nlink()
updater function.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

26 Oct, 2011 11 commits

libceph: fix double-free of page vector · 33957340

Sage Weil authored 13 years ago

ceph_release_page_vector() kfrees the vector; we shouldn't do it here too.
Reported-by: Jeff Wu <cpwu@tnsoft.com.cn>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: fix 32-bit ino numbers · 3310f754

Amon Ott authored 13 years ago

Fix 32-bit ino generation to not always be 1.
Signed-off-by: Amon Ott <a.ott@m-privacy.de>
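
As a rough illustration of what generating a 32-bit ino from a 64-bit one can look like, the sketch below folds the high half into the low half with XOR. This is an assumption made for the example, not a copy of the code this commit touches; `fold_ino32()` and the non-zero fallback value are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: fold a 64-bit inode number into 32 bits. */
static uint32_t fold_ino32(uint64_t ino64)
{
	uint32_t ino = (uint32_t)(ino64 & 0xffffffffu);

	/* Mix in the high half so inos differing only there stay distinct. */
	ino ^= (uint32_t)(ino64 >> 32);
	if (ino == 0)
		ino = 1;	/* assumption: 0 is avoided; the fallback is arbitrary */
	return ino;
}

/* Small usage example: different inputs should not all collapse to 1. */
static void fold_ino32_demo(void)
{
	assert(fold_ino32(0x10000000000ULL) == 0x100u);
	assert(fold_ino32(0x10000000001ULL) == 0x101u);
	assert(fold_ino32(0x0000000500000005ULL) == 1u);
}
```

A fold like this only needs the fallback when both halves cancel out; a version that returned the fallback for every input would match the symptom the commit message describes.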

ceph: let the set_layout ioctl set single traits · a35eca95

Greg Farnum authored 13 years ago

Previously we were validating the passed-in stripe unit, object size,
and stripe count against each other (and not testing most other stuff).
Instead, make sure that the composed previous layout and new values are valid,
and only send the new values to the MDS. This lets users change the
pool without setting the whole layout, for instance.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

Revert "ceph: don't truncate dirty pages in invalidate work thread" · 83eaea22

Sage Weil authored 13 years ago

This reverts commit c9af9fb6.

We need to block and truncate all pages in order to reliably invalidate
them.  Otherwise, we could:

 - have some uptodate pages in the cache
 - queue an invalidate
 - write(2) locks some pages
 - invalidate_work skips them
 - write(2) only overwrites part of the page
 - page now dirty and uptodate
 -> partial leakage of invalidated data

It's not entirely clear why we started skipping locked pages in the first
place.  I just ran this through fsx and didn't see any problems.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: replace leading spaces with tabs · 80db8bea

Noah Watkins authored 13 years ago

Trivial formatting fix.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

libceph: don't complain on msgpool alloc failures · b61c2763

Sage Weil authored 13 years ago

The pool allocation failures are masked by the pool; there is no need to
spam the console about them.  (That's the whole point of having the pool
in the first place.)

Mark msg allocations whose failure is safely handled as such.
Signed-off-by: Sage Weil <sage@newdream.net>

libceph: create messenger with client · 6ab00d46

Sage Weil authored 13 years ago

This simplifies the init/shutdown paths, and makes client->msgr available
during the rest of the setup process.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: document ioctls · 6a8ea470

Sage Weil authored 13 years ago

...after some prodding by Christoph.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: implement (optional) max read size · 0d66a487

Sage Weil authored 13 years ago

The 'rsize' mount option limits the maximum size of an individual
read(ahead) operation that is sent off to an OSD.  This is distinct from
'rasize', which controls the size of the readahead window.
Signed-off-by: Sage Weil <sage@newdream.net>
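
The difference between the two options can be sketched as follows: a readahead window of `rasize` bytes is carved into individual requests, each capped at `rsize` bytes. The helper names are hypothetical, and treating `rsize == 0` as "no limit" is an assumption based on the option being described as optional.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: length of the next read request, given how many
 * bytes of the readahead window remain. rsize == 0 is assumed to mean
 * "no per-request limit".
 */
static size_t next_read_len(size_t remaining, size_t rsize)
{
	if (rsize != 0 && remaining > rsize)
		return rsize;
	return remaining;
}

/* Count how many requests a window of rasize bytes is split into. */
static size_t count_reads(size_t rasize, size_t rsize)
{
	size_t n = 0;

	while (rasize > 0) {
		size_t len = next_read_len(rasize, rsize);

		assert(len > 0 && len <= rasize);
		rasize -= len;
		n++;
	}
	return n;
}
```

Under these assumptions an 8 MiB window with a 1 MiB cap turns into eight requests, while the same window with no cap is a single request.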

ceph: rename rsize -> rasize · 83817e35

Sage Weil authored 13 years ago

It controls readahead.
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: make readpages fully async · 7c272194

Sage Weil authored 13 years ago

When we get a ->readpages() aop, submit async reads for all page ranges
in the provided page list.  Lock the pages immediately, so that VFS/MM
will block until the reads complete.
Signed-off-by: Sage Weil <sage@newdream.net>

23 Aug, 2011 1 commit

ceph: fix memory leak · 259a187a

Noah Watkins authored 13 years ago

kfree does not clean up indirect allocations in
ceph_fs_client and ceph_options (e.g. snapdir_name).
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>

16 Aug, 2011 1 commit

ceph: fix encoding of ino only (not relative) paths · 795858db

Sage Weil authored 13 years ago

A 'path' consists of a starting ino and relative component.  Encode even
when there is no relative component.  This is primarily needed by the
NFS reexport code.
Signed-off-by: Sage Weil <sage@newdream.net>

27 Jul, 2011 18 commits

ceph: document unlocked d_parent accesses · d79698da

Sage Weil authored 13 years ago

For the most part we don't care about racing with rename when directing
MDS requests; either the old or new parent is fine.  Document that, and
do some minor cleanup.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: explicitly reference rename old_dentry parent dir in request · 41b02e1f

Sage Weil authored 13 years ago

We carry a pin on the parent directory for the rename source and dest
dentries.  For the source it's r_locked_dir; we need to explicitly
reference the old_dentry parent as well, since the dentry's d_parent may
change between when the request was created and pinned and when it is
freed.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: document locking for ceph_set_dentry_offset · 4f177264

Sage Weil authored 13 years ago

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: avoid d_parent in ceph_dentry_hash; fix ceph_encode_fh() hashing bug · e5f86dc3

Sage Weil authored 13 years ago

Have caller pass in a safely-obtained reference to the parent directory
for calculating a dentry's hash value.

While we're here, simplify the flow through ceph_encode_fh() so that there
is a single exit point and cleanup.

Also fix a bug with the dentry hash calculation: calculate the hash for the
dentry we were given, not its parent.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: protect d_parent access in ceph_d_revalidate · bf1c6aca

Sage Weil authored 13 years ago

Protect d_parent with d_lock.  Carry a reference.  Simplify the flow so
that there is a single exit point and cleanup.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: protect access to d_parent · 5f21c96d

Sage Weil authored 13 years ago

d_parent is protected by d_lock: use it when looking up a dentry's parent
directory inode.  Also take a reference and drop it in the caller to avoid
a use-after-free.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
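
The pattern in this commit, reading the parent pointer under the dentry lock and pinning the parent's inode before unlocking, can be sketched in userspace C. All names below (`toy_dentry`, `toy_inode`, `toy_get_parent_inode()`, `toy_put_inode()`) are hypothetical, and a C11 `atomic_flag` spinlock stands in for `d_lock`; this is not the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_inode {
	atomic_int refcount;
};

struct toy_dentry {
	atomic_flag lock;		/* stand-in for d_lock */
	struct toy_dentry *parent;	/* stand-in for d_parent */
	struct toy_inode *inode;
};

static void toy_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void toy_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/*
 * Look up the parent's inode while holding the child's lock, and take a
 * reference before unlocking so the inode cannot go away under the caller.
 * Returns NULL if there is no parent or the parent has no inode.
 */
static struct toy_inode *toy_get_parent_inode(struct toy_dentry *d)
{
	struct toy_inode *dir = NULL;

	toy_lock(&d->lock);
	if (d->parent && d->parent->inode) {
		dir = d->parent->inode;
		atomic_fetch_add(&dir->refcount, 1);
	}
	toy_unlock(&d->lock);
	return dir;
}

/* The caller drops the reference once it is done with the inode. */
static void toy_put_inode(struct toy_inode *in)
{
	int old = atomic_fetch_sub(&in->refcount, 1);

	assert(old > 0);
	(void)old;
}
```

The caller pairs every successful `toy_get_parent_inode()` with one `toy_put_inode()`, which mirrors the "take a reference and drop it in the caller" wording above.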

ceph: handle racing calls to ceph_init_dentry · 48d0cbd1

Sage Weil authored 13 years ago

The ->lookup() and prepopulate_readdir() callers are working with unhashed
dentries, so we don't have to worry.  The export.c callers, though, need
to initialize something they got back from d_obtain_alias() and are
potentially racing with other callers.  Make sure we don't return unless
the dentry is properly initialized (by us or someone else).
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: set dir complete frag after adding capability · dfabbed6

Sage Weil authored 13 years ago

Currently ceph_add_cap clears the complete bit if we are newly issued the
FILE_SHARED cap, which is normally the case for a newly issued cap on a new
directory.  That means we clear the just-set bit.  Move the check that sets
the flag to after the cap is added/updated.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: set up readahead size when rsize is not passed · e9852227

Yehuda Sadeh authored 13 years ago

This should improve the default read performance, as without it
readahead is practically disabled.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

ceph: ignore lease mask · 2f90b852

Sage Weil authored 13 years ago

The lease mask is no longer used (and it changed a while back).  Instead,
use a non-zero duration to indicate that there is a lease being issued.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: fix ceph_lookup_open intent usage · 468640e3

Sage Weil authored 13 years ago

We weren't properly calling lookup_instantiate_filp when setting up the
lookup intent, which could lead to file leakage on errors.  So:

 - use separate helper for the hidden snapdir translation, immediately
   following the mds request
 - use ceph_finish_lookup for the final dentry/return value dance in the
   exit path
 - lookup_instantiate_filp on success
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: only link open operations to directory unsafe list if O_CREAT|O_TRUNC · 9bae113a

Sage Weil authored 13 years ago

We only need to put these on the directory unsafe list if they have
side effects that fsync(2) should flush out.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: fix bad parent_inode calc in ceph_lookup_open · acda7657

Sage Weil authored 13 years ago

We were always getting NULL here because the intent file f_dentry is always
NULL at this point, which means we were always passing NULL to
ceph_mdsc_do_request.  In reality, this was fine, since this isn't
currently ever a write operation that needs to get strung on the dir's
unsafe list.

Use the dir explicitly, and only pass it if this open has side-effects that
a dir fsync should flush.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: avoid carrying Fw cap during write into page cache · d8de9ab6

Sage Weil authored 13 years ago

The generic_file_aio_write call may block on balance_dirty_pages while we
flush data to the OSDs.  If we hold a reference to the FILE_WR cap during
that interval, revocation by the MDS (e.g., to do a stat(2)) may be very
slow.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: report f_bfree based on kb_avail rather than diffing. · 8f04d422

Greg Farnum authored 13 years ago

Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>

ceph: only queue capsnap if caps are dirty · e77dc3e9

Sage Weil authored 13 years ago

We used to go into this branch if i_wrbuffer_ref_head was non-zero.  This
was an ancient check from before we were careful about dealing with all
kinds of caps (and not just dirty pages).  It is cleaner to only queue a
capsnap if there is an actual dirty cap.  If we are racing with...
something...we will end up here with ci->i_wrbuffer_refs but no dirty
caps.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: fix snap writeback when racing with writes · af0ed569

Sage Weil authored 13 years ago

There are two problems that come up when we try to queue a capsnap while a
write is in progress:

 - The FILE_WR cap is held, but not yet dirty, so we may queue a capsnap
   with dirty == 0.  That will crash later in __ceph_flush_snaps().  Or
   on the FILE_WR cap if a write is in progress.
 - We may not have i_head_snapc set, which causes problems pretty quickly.
   Look to the snaprealm in this case.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>

ceph: use flag bit for at_end readdir flag · 9cfa1098

Sage Weil authored 13 years ago

This saves us a word of memory per file.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
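
A minimal sketch of the idea, assuming a per-file structure that already carries a flags word: the boolean moves into a spare bit of that word instead of occupying a member of its own. The structure, flag names and helpers below are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_F_SYNC	(1u << 0)	/* hypothetical pre-existing flag */
#define TOY_F_ATEND	(1u << 1)	/* hypothetical "readdir hit the end" flag */

struct toy_file_info {
	unsigned int flags;	/* the at_end state lives here, not in its own int */
};

static void toy_set_at_end(struct toy_file_info *fi, bool at_end)
{
	if (at_end)
		fi->flags |= TOY_F_ATEND;
	else
		fi->flags &= ~TOY_F_ATEND;
}

static bool toy_at_end(const struct toy_file_info *fi)
{
	return (fi->flags & TOY_F_ATEND) != 0;
}

/* Small usage example: toggling the bit leaves other flags untouched. */
static void toy_flags_demo(void)
{
	struct toy_file_info fi = { .flags = TOY_F_SYNC };

	toy_set_at_end(&fi, true);
	assert(toy_at_end(&fi));
	toy_set_at_end(&fi, false);
	assert(fi.flags == TOY_F_SYNC);
}
```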
