提交 · 85c7000fda0029ec16569b1eec8fd3a8d026be73 · openeuler / Kernel

23 3月, 2022 1 次提交

ceph: remove reliance on bdi congestion · 503d4fa6

由 NeilBrown 提交于 3月 22, 2022

The bdi congestion tracking in not widely used and will be removed.

CEPHfs is one of a small number of filesystems that uses it, setting just
the async (write) congestion flags at what it determines are appropriate
times.

The only remaining effect of the async flag is to cause (some)
WB_SYNC_NONE writes to be skipped.

So instead of setting the flag, set an internal flag and change:

 - .writepages to do nothing if WB_SYNC_NONE and the flag is set

 - .writepage to return AOP_WRITEPAGE_ACTIVATE if WB_SYNC_NONE and the
   flag is set.

The writepages change causes a behavioural change in that pageout() can
now return PAGE_ACTIVATE instead of PAGE_KEEP, so SetPageActive() will
be called on the page which (I think) wil further delay the next attempt
at writeout.  This might be a good thing.

Link: https://lkml.kernel.org/r/164549983739.9187.14895675781408171186.stgit@noble.brownSigned-off-by: NNeilBrown <neilb@suse.de>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

503d4fa6

02 3月, 2022 1 次提交

ceph: move to a dedicated slabcache for ceph_cap_snap · ab58a5a1

由 Xiubo Li 提交于 2月 15, 2022

There could be huge number of capsnaps around at any given time. On
x86_64 the structure is 248 bytes, which will be rounded up to 256 bytes
by kzalloc. Move this to a dedicated slabcache to save 8 bytes for each.

[ jlayton: use kmem_cache_zalloc ]
Signed-off-by: NXiubo Li <xiubli@redhat.com>
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

ab58a5a1

13 1月, 2022 6 次提交

ceph: move CEPH_SUPER_MAGIC definition to magic.h · a0b3a15e

由 Jeff Layton 提交于 1月 10, 2022

The uapi headers are missing the ceph definition. Move it there so
userland apps can ID cephfs.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Reviewed-by: NIlya Dryomov <idryomov@gmail.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

a0b3a15e

ceph: add new "nopagecache" option · 94cc0877

由 Jeff Layton 提交于 11月 30, 2021

CephFS is a bit unlike most other filesystems in that it only
conditionally does buffered I/O based on the caps that it gets from the
MDS. In most cases, unless there is contended access for an inode the
MDS does give Fbc caps to the client, so the unbuffered codepaths are
only infrequently traveled and are difficult to test.

At one time, the "-o sync" mount option would give you this behavior,
but that was removed in commit 7ab9b380 ("ceph: Don't use
ceph-sync-mode for synchronous-fs.").

Add a new mount option to tell the client to ignore Fbc caps when doing
I/O, and to use the synchronous codepaths exclusively, even on
non-O_DIRECT file descriptors. We already have an ioctl that forces this
behavior on a per-file basis, so we can just always set the CEPH_F_SYNC
flag in the file description on such mounts.

Additionally, this patch also changes the client to not request Fbc when
doing direct I/O. We aren't using the cache with O_DIRECT so we don't
have any need for those caps.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Acked-by: NGreg Farnum <gfarnum@redhat.com>
Reviewed-by: NVenky Shankar <vshankar@redhat.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

94cc0877

ceph: mount syntax module parameter · adbed05e

由 Venky Shankar 提交于 11月 03, 2021

Add read-only module parameters for supported mount syntaxes. Primary
user is the user-space mount helper for catching v2 syntax bugs during
testing by cross verifying if the kernel supports v2 syntax on mount
failure.
Signed-off-by: NVenky Shankar <vshankar@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

adbed05e

ceph: record updated mon_addr on remount · 2167f2cc

由 Venky Shankar 提交于 7月 14, 2021

Note that the new monitors are just shown in /proc/mounts.
Ceph does not (re)connect to new monitors yet.

[ jlayton: s/printk\(KERN_NOTICE/pr_notice(/
	   s/strcmp/strcmp_null/ ]
Signed-off-by: NVenky Shankar <vshankar@redhat.com>
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

2167f2cc

ceph: new device mount syntax · 7b19b4db

由 Venky Shankar 提交于 7月 14, 2021

Old mount device syntax (source) has the following problems:

- mounts to the same cluster but with different fsnames
  and/or creds have identical device string which can
  confuse xfstests.

- Userspace mount helper tool resolves monitor addresses
  and fill in mon addrs automatically, but that means the
  device shown in /proc/mounts is different than what was
  used for mounting.

New device syntax is as follows:

  cephuser@fsid.mycephfs2=/path

Note, there is no "monitor address" in the device string.
That gets passed in as mount option. This keeps the device
string same when monitor addresses change (on remounts).

Also note that the userspace mount helper tool is backward
compatible. I.e., the mount helper will fallback to using
old syntax after trying to mount with the new syntax.

[ idryomov: drop CEPH_MON_ADDR_MNTOPT_DELIM ]
Signed-off-by: NVenky Shankar <vshankar@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

7b19b4db

libceph: generalize addr/ip parsing based on delimiter · 2d7c86a8

由 Venky Shankar 提交于 7月 14, 2021

... and remove hardcoded function name in ceph_parse_ips().

[ idryomov: delim parameter, drop CEPH_ADDR_PARSE_DEFAULT_DELIM ]
Signed-off-by: NVenky Shankar <vshankar@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

2d7c86a8

12 1月, 2022 1 次提交

ceph: conversion to new fscache API · 400e1286

由 Jeff Layton 提交于 12月 07, 2021

Now that the fscache API has been reworked and simplified, change ceph
over to use it.

With the old API, we would only instantiate a cookie when the file was
open for reads. Change it to instantiate the cookie when the inode is
instantiated and call use/unuse when the file is opened/closed.

Also, ensure we resize the cached data on truncates, and invalidate the
cache in response to the appropriate events. This will allow us to
plumb in write support later.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20211129162907.149445-2-jlayton@kernel.org/ # v1
Link: https://lore.kernel.org/r/20211207134451.66296-2-jlayton@kernel.org/ # v2
Link: https://lore.kernel.org/r/163906984277.143852.14697110691303589000.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967188351.1823006.5065634844099079351.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021581427.640689.14128682147127509264.stgit@warthog.procyon.org.uk/ # v4

400e1286

08 11月, 2021 3 次提交

ceph: properly handle statfs on multifs setups · 8cfc0c7e

由 Jeff Layton 提交于 10月 05, 2021

ceph_statfs currently stuffs the cluster fsid into the f_fsid field.
This was fine when we only had a single filesystem per cluster, but now
that we have multiples we need to use something that will vary between
them.

Change ceph_statfs to xor each 32-bit chunk of the fsid (aka cluster id)
into the lower bits of the statfs->f_fsid. Change the lower bits to hold
the fscid (filesystem ID within the cluster).

That should give us a value that is guaranteed to be unique between
filesystems within a cluster, and should minimize the chance of
collisions between mounts of different clusters.

URL: https://tracker.ceph.com/issues/52812Reported-by: NSachin Prabhu <sprabhu@redhat.com>
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Reviewed-by: NXiubo Li <xiubli@redhat.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

8cfc0c7e

ceph: shut down mount on bad mdsmap or fsmap decode · 631ed4b0

由 Jeff Layton 提交于 10月 14, 2021

As Greg pointed out, if we get a mangled mdsmap or fsmap, then something
has gone very wrong, and we should avoid doing any activity on the
filesystem.

When this occurs, shut down the mount the same way we would with a
forced umount by calling ceph_umount_begin when decoding fails on either
map. This causes most operations done against the filesystem to return
an error. Any dirty data or caps in the cache will be dropped as well.

The effect is not reversible, so the only remedy is to umount.

[ idryomov: print fsmap decoding error ]

URL: https://tracker.ceph.com/issues/52303Signed-off-by: NJeff Layton <jlayton@kernel.org>
Acked-by: NGreg Farnum <gfarnum@redhat.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

631ed4b0

ceph: enable async dirops by default · f7a67b46

由 Jeff Layton 提交于 8月 09, 2021

Async dirops have been supported in mainline kernels for quite some time
now, and we've recently (as of June) started doing regular testing in
teuthology with '-o nowsync'. There were a few issues, but we've sorted
those out now.

Enable async dirops by default, and change /proc/mounts to show "wsync"
when they are disabled rather than "nowsync" when they are enabled.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Reviewed-by: NIlya Dryomov <idryomov@gmail.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

f7a67b46

19 10月, 2021 1 次提交

ceph: skip existing superblocks that are blocklisted or shut down when mounting · 98d0a6fb

由 Jeff Layton 提交于 9月 30, 2021

Currently when mounting, we may end up finding an existing superblock
that corresponds to a blocklisted MDS client. This means that the new
mount ends up being unusable.

If we've found an existing superblock with a client that is already
blocklisted, and the client is not configured to recover on its own,
fail the match. Ditto if the superblock has been forcibly unmounted.

While we're in here, also rename "other" to the more conventional "fsc".

Cc: stable@vger.kernel.org
URL: https://bugzilla.redhat.com/show_bug.cgi?id=1901499Signed-off-by: NJeff Layton <jlayton@kernel.org>
Reviewed-by: NXiubo Li <xiubli@redhat.com>
Reviewed-by: NIlya Dryomov <idryomov@gmail.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

98d0a6fb

15 12月, 2020 1 次提交

ceph: add new RECOVER mount_state when recovering session · 50c9132d

由 Jeff Layton 提交于 9月 25, 2020

When recovering a session (a'la recover_session=clean), we want to do
all of the operations that we do on a forced umount, but changing the
mount state to SHUTDOWN is can cause queued MDS requests to fail when
the session comes back. Most of those can idle until the session is
recovered in this situation.

Reserve SHUTDOWN state for forced umount, and make a new RECOVER state
for the forced reconnect situation. Change several tests for equality with
SHUTDOWN to test for that or RECOVER.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Reviewed-by: NXiubo Li <xiubli@redhat.com>
Reviewed-by: N"Yan, Zheng" <zyan@redhat.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

50c9132d

12 10月, 2020 2 次提交

I
libceph, rbd, ceph: "blacklist" -> "blocklist" · 0b98acd6
由 Ilya Dryomov 提交于 9月 14, 2020
```
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
```
0b98acd6

ceph: use kill_anon_super helper · 470a5c77

由 Jeff Layton 提交于 9月 11, 2020

ceph open-codes this around some other activity and the rationale
for it isn't clear. There is no need to delay free_anon_bdev until
the end of kill_sb.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

470a5c77

19 9月, 2020 1 次提交

[PATCH] reduce boilerplate in fsid handling · 6d1349c7

由 Al Viro 提交于 9月 18, 2020

Get rid of boilerplate in most of ->statfs()
instances...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

6d1349c7

05 8月, 2020 1 次提交

ceph: move sb->wb_pagevec_pool to be a global mempool · a0102bda

由 Jeff Layton 提交于 7月 30, 2020

When doing some testing recently, I hit some page allocation failures
on mount, when creating the wb_pagevec_pool for the mount. That
requires 128k (32 contiguous pages), and after thrashing the memory
during an xfstests run, sometimes that would fail.

128k for each mount seems like a lot to hold in reserve for a rainy
day, so let's change this to a global mempool that gets allocated
when the module is plugged in.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Reviewed-by: NIlya Dryomov <idryomov@gmail.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

a0102bda

03 8月, 2020 2 次提交

ceph: delete repeated words in fs/ceph/ · f1f565a2

由 Randy Dunlap 提交于 7月 17, 2020

Drop duplicated words "down" and "the" in fs/ceph/.

[ idryomov: merge into a single patch ]
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

f1f565a2

ceph: periodically send perf metrics to MDSes · 18f473b3

由 Xiubo Li 提交于 7月 16, 2020

This will send the caps/read/write/metadata metrics to any available MDS
once per second, which will be the same as the userland client.  It will
skip the MDS sessions which don't support the metric collection, as the
MDSs will close socket connections when they get an unknown type
message.

We can disable the metric sending via the disable_send_metrics module
parameter.

[ jlayton: fix up endianness bug in ceph_mdsc_send_metrics() ]

URL: https://tracker.ceph.com/issues/43215Signed-off-by: NXiubo Li <xiubli@redhat.com>
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

18f473b3

30 3月, 2020 2 次提交

ceph: perform asynchronous unlink if we have sufficient caps · 2ccb4546

由 Jeff Layton 提交于 4月 02, 2019

The MDS is getting a new lock-caching facility that will allow it
to cache the necessary locks to allow asynchronous directory operations.
Since the CEPH_CAP_FILE_* caps are currently unused on directories,
we can repurpose those bits for this purpose.

When performing an unlink, if we have Fx on the parent directory,
and CEPH_CAP_DIR_UNLINK (aka Fr), and we know that the dentry being
removed is the primary link, then then we can fire off an unlink
request immediately and don't need to wait on reply before returning.

In that situation, just fix up the dcache and link count and return
immediately after issuing the call to the MDS. This does mean that we
need to hold an extra reference to the inode being unlinked, and extra
references to the caps to avoid races. Those references are put and
error handling is done in the r_callback routine.

If the operation ends up failing, then set a writeback error on the
directory inode, and the inode itself that can be fetched later by
an fsync on the dir.

The behavior of dir caps is slightly different from caps on normal
files. Because these are just considered an optimization, if the
session is reconnected, we will not automatically reclaim them. They
are instead considered lost until we do another synchronous op in the
parent directory.

Async dirops are enabled via the "nowsync" mount option, which is
patterned after the xfs "wsync" mount option. For now, the default
is "wsync", but eventually we may flip that.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Reviewed-by: N"Yan, Zheng" <zyan@redhat.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

2ccb4546

ceph: move to a dedicated slabcache for mds requests · 058daab7

由 Jeff Layton 提交于 2月 17, 2020

On my machine (x86_64) this struct is 952 bytes, which gets rounded up
to 1024 by kmalloc. Move this to a dedicated slabcache, so we can
allocate them without the extra 72 bytes of overhead per.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Reviewed-by: NIlya Dryomov <idryomov@gmail.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

058daab7

12 2月, 2020 2 次提交

ceph: noacl mount option is effectively ignored · 3b20bc2f

由 Xiubo Li 提交于 2月 11, 2020

For the old mount API, the module parameters parseing function will
be called in ceph_mount() and also just after the default posix acl
flag set, so we can control to enable/disable it via the mount option.

But for the new mount API, it will call the module parameters
parseing function before ceph_get_tree(), so the posix acl will always
be enabled.

Fixes: 82995cc6 ("libceph, rbd, ceph: convert to use the new mount API")
Signed-off-by: NXiubo Li <xiubli@redhat.com>
Reviewed-by: NIlya Dryomov <idryomov@gmail.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

3b20bc2f

ceph: canonicalize server path in place · b27a939e

由 Ilya Dryomov 提交于 2月 10, 2020

syzbot reported that 4fbc0c71 ("ceph: remove the extra slashes in
the server path") had caused a regression where an allocation could be
done under a spinlock -- compare_mount_options() is called by sget_fc()
with sb_lock held.

We don't really need the supplied server path, so canonicalize it
in place and compare it directly.  To make this work, the leading
slash is kept around and the logic in ceph_real_mount() to skip it
is restored.  CEPH_MSG_CLIENT_SESSION now reports the same (i.e.
canonicalized) path, with the leading slash of course.

Fixes: 4fbc0c71 ("ceph: remove the extra slashes in the server path")
Reported-by: syzbot+98704a51af8e3d9425a9@syzkaller.appspotmail.com
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>

b27a939e

08 2月, 2020 6 次提交

A
ceph: use errorfc() and friends instead of spelling the prefix out · d53d0f74
由 Al Viro 提交于 12月 21, 2019
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
d53d0f74

fs_parse: handle optional arguments sanely · 48ce73b1

由 Al Viro 提交于 12月 17, 2019

Don't bother with "mixed" options that would allow both the
form with and without argument (i.e. both -o foo and -o foo=bar).
Rather than trying to shove both into a single fs_parameter_spec,
allow having with-argument and no-argument specs with the same
name and teach fs_parse to handle that.

There are very few options of that sort, and they are actually
easier to handle that way - callers end up with less postprocessing.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

48ce73b1

fs_parse: fold fs_parameter_desc/fs_parameter_spec · d7167b14

由 Al Viro 提交于 9月 07, 2019

The former contains nothing but a pointer to an array of the latter...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

d7167b14

fs_parser: remove fs_parameter_description name field · 96cafb9c

由 Eric Sandeen 提交于 12月 06, 2019

Unused now.
Signed-off-by: NEric Sandeen <sandeen@redhat.com>
Acked-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

96cafb9c

add prefix to fs_context->log · cc3c0b53

由 Al Viro 提交于 12月 21, 2019

... turning it into struct p_log embedded into fs_context.  Initialize
the prefix with fs_type->name, turning fs_parse() into a trivial
inline wrapper for __fs_parse().

This makes fs_parameter_description->name completely unused.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

cc3c0b53

ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log · c80c98f0

由 Al Viro 提交于 12月 21, 2019

... and now errorf() et.al. are never called with NULL fs_context,
so we can get rid of conditional in those.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

c80c98f0

07 2月, 2020 2 次提交

A
fold struct fs_parameter_enum into struct constant_table · 5eede625
由 Al Viro 提交于 12月 16, 2019
```
no real difference now
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
5eede625

fs_parse: get rid of ->enums · 2710c957

由 Al Viro 提交于 9月 06, 2019

Don't do a single array; attach them to fsparam_enum() entry
instead.  And don't bother trying to embed the names into those -
it actually loses memory, with no real speedup worth mentioning.

Simplifies validation as well.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

2710c957

27 1月, 2020 3 次提交

ceph: use copy-from2 op in copy_file_range · 78beb0ff

由 Luis Henriques 提交于 1月 08, 2020

Instead of using the copy-from operation, switch copy_file_range to the
new copy-from2 operation, which allows to send the truncate_seq and
truncate_size parameters.

If an OSD does not support the copy-from2 operation it will return
-EOPNOTSUPP.  In that case, the kernel client will stop trying to do
remote object copies for this fs client and will always use the generic
VFS copy_file_range.
Signed-off-by: NLuis Henriques <lhenriques@suse.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

78beb0ff

ceph: remove the extra slashes in the server path · 4fbc0c71

由 Xiubo Li 提交于 12月 20, 2019

It's possible to pass the mount helper a server path that has more
than one contiguous slash character. For example:

  $ mount -t ceph 192.168.195.165:40176:/// /mnt/cephfs/

In the MDS server side the extra slashes of the server path will be
treated as snap dir, and then we can get the following debug logs:

  ceph:  mount opening path //
  ceph:  open_root_inode opening '//'
  ceph:  fill_trace 0000000059b8a3bc is_dentry 0 is_target 1
  ceph:  alloc_inode 00000000dc4ca00b
  ceph:  get_inode created new inode 00000000dc4ca00b 1.ffffffffffffffff ino 1
  ceph:  get_inode on 1=1.ffffffffffffffff got 00000000dc4ca00b

And then when creating any new file or directory under the mount
point, we can hit the following BUG_ON in ceph_fill_trace():

  BUG_ON(ceph_snap(dir) != dvino.snap);

Have the client ignore the extra slashes in the server path when
mounting. This will also canonicalize the path, so that identical mounts
can be consilidated.

1) "//mydir1///mydir//"
2) "/mydir1/mydir"
3) "/mydir1/mydir/"

Regardless of the internal treatment of these paths, the kernel still
stores the original string including the leading '/' for presentation
to userland.

URL: https://tracker.ceph.com/issues/42771Signed-off-by: NXiubo Li <xiubli@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

4fbc0c71

ceph: check availability of mds cluster on mount after wait timeout · 97820058

由 Xiubo Li 提交于 12月 10, 2019

If all the MDS daemons are down for some reason, then the first mount
attempt will fail with EIO after the mount request times out. A mount
attempt will also fail with EIO if all of the MDS's are laggy.

This patch changes the code to return -EHOSTUNREACH in these situations
and adds a pr_info error message to help the admin determine the cause.

URL: https://tracker.ceph.com/issues/4386Signed-off-by: NXiubo Li <xiubli@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

97820058

10 12月, 2019 1 次提交

ceph: convert int fields in ceph_mount_options to unsigned int · ad8c28a9

由 Jeff Layton 提交于 9月 09, 2019

Most of these values should never be negative, so convert them to
unsigned values. Add some sanity checking to the parsed values, and
clean up some unneeded casts.

Note that while caps_max should never be negative, this patch leaves
it signed, since this value ends up later being compared to a signed
counter. Just ensure that userland never passes in a negative value
for caps_max.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

ad8c28a9

28 11月, 2019 1 次提交

libceph, rbd, ceph: convert to use the new mount API · 82995cc6

由 David Howells 提交于 3月 25, 2019

Convert the ceph filesystem to the new internal mount API as the old
one will be obsoleted and removed.  This allows greater flexibility in
communication of mount parameters between userspace, the VFS and the
filesystem.

See Documentation/filesystems/mount_api.txt for more information.

[ Numerous string handling, leak and regression fixes; rbd conversion
  was particularly broken and had to be redone almost from scratch. ]
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

82995cc6

08 11月, 2019 1 次提交

ceph: return -EINVAL if given fsc mount option on kernel w/o support · ff29fde8

由 Jeff Layton 提交于 11月 07, 2019

If someone requests fscache on the mount, and the kernel doesn't
support it, it should fail the mount.

[ Drop ceph prefix -- it's provided by pr_err. ]
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Reviewed-by: NIlya Dryomov <idryomov@gmail.com>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

ff29fde8

16 9月, 2019 2 次提交

ceph: call ceph_mdsc_destroy from destroy_fs_client · 3ee5a701

由 Jeff Layton 提交于 9月 12, 2019

They're always called in succession.
Signed-off-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

3ee5a701

ceph: auto reconnect after blacklisted · 131d7eb4

由 Yan, Zheng 提交于 7月 25, 2019

Make client use osd reply and session message to infer if itself is
blacklisted. Client reconnect to cluster using new entity addr if it
is blacklisted. Auto reconnect is limited to once every 30 minutes.

Auto reconnect is disabled by default. It can be enabled/disabled by
recover_session=<no|clean> mount option. In 'clean' mode, client drops
any dirty data/metadata, invalidates page caches and invalidates all
writable file handles. After reconnect, file locks become stale because
MDS loses track of them. If an inode contains any stale file locks,
read/write on the indoe are not allowed until applications release all
stale file locks.
Signed-off-by: N"Yan, Zheng" <zyan@redhat.com>
Reviewed-by: NJeff Layton <jlayton@kernel.org>
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

131d7eb4

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功