提交 · 0fbc5360bf433ffeccbb94ef806d637049899741 · openanolis / cloud-kernel

13 12月, 2016 1 次提交
- Y
  ceph: check availability of mds cluster on mount · e9e427f0
  由 Yan, Zheng 提交于 11月 10, 2016
```
Signed-off-by: NYan, Zheng <zyan@redhat.com>
```
  e9e427f0
29 10月, 2016 2 次提交
- A
  ceph: switch to use of ->d_init() · ad5cb123
  由 Al Viro 提交于 10月 28, 2016
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  ad5cb123
- A
  ceph: unify dentry_operations instances · 18fc8abd
  由 Al Viro 提交于 10月 28, 2016
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  18fc8abd
18 10月, 2016 1 次提交

ceph: fix uninitialized dentry pointer in ceph_real_mount() · 31ca5878

由 Geert Uytterhoeven 提交于 10月 13, 2016

    fs/ceph/super.c: In function ‘ceph_real_mount’:
    fs/ceph/super.c:818: warning: ‘root’ may be used uninitialized in this function

If s_root is already valid, dentry pointer root is never initialized,
and returned by ceph_real_mount(). This will cause a crash later when
the caller dereferences the pointer.

Fixes: ce2728aa ("ceph: avoid accessing / when mounting a subpath")
Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: NYan, Zheng <zyan@redhat.com>

31ca5878

03 10月, 2016 1 次提交

ceph: avoid accessing / when mounting a subpath · ce2728aa

由 Yan, Zheng 提交于 9月 14, 2016

Accessing / causes failuire if the client has caps that restrict path
Signed-off-by: NYan, Zheng <zyan@redhat.com>

ce2728aa

28 7月, 2016 3 次提交

ceph: Mark the file cache as unreclaimable · 6b1a9a6c

由 Nikolay Borisov 提交于 7月 25, 2016

Ceph creates multiple caches with the SLAB_RECLAIMABLE flag set, so
that it can satisfy its internal needs. Inspecting the code shows that
most of the caches are indeed reclaimable since they are directly
related to the generic inode/dentry shrinkers. However, one of the
cache used to satisfy struct file is not reclaimable since its
entries are freed only when the last reference to the file is
dropped. If a heavily loaded node opens a lot of files it can
introduce non-trivial discrepancies between memory shown as reclaimable
and what is actually reclaimed when drop_caches is used.

Fix this by removing the reclaimable flag for the file's cache.
Signed-off-by: NNikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: NYan, Zheng <zyan@redhat.com>

6b1a9a6c

ceph: mount non-default filesystem by name · 430afbad

由 Yan, Zheng 提交于 7月 08, 2016

To mount non-default filesytem, user currently needs to provide mds
namespace ID. This is inconvenience.

This patch makes user be able to mount filesystem by name. If user
wants to mount non-default filesystem. Client first subscribes to
fsmap.user. Subscribe to mdsmap.<ID> after getting ID of filesystem.
Signed-off-by: NYan, Zheng <zyan@redhat.com>

430afbad

ceph: wait unsafe sync writes for evicting inode · 9a5530c6

由 Yan, Zheng 提交于 6月 15, 2016

Otherwise ceph_sync_write_unsafe() may access/modify freed inode.
Signed-off-by: NYan, Zheng <zyan@redhat.com>

9a5530c6

26 5月, 2016 3 次提交

Y
ceph: report mount root in session metadata · 3f384954
由 Yan, Zheng 提交于 4月 21, 2016
```
Signed-off-by: NYan, Zheng <zyan@redhat.com>
```
3f384954
Y
ceph: CEPH_FEATURE_MDSENC support · d463a43d
由 Yan, Zheng 提交于 3月 31, 2016
```
Signed-off-by: NYan, Zheng <zyan@redhat.com>
```
d463a43d

ceph: multiple filesystem support · 235a0982

由 Yan, Zheng 提交于 3月 30, 2016

To access non-default filesystem, we just need to subscribe to
mdsmap.<MDS_NAMESPACE_ID> and add a new mount option for mds
namespace id.
Signed-off-by: NYan, Zheng <zyan@redhat.com>
[idryomov@gmail.com: switch to a new libceph API]
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

235a0982

05 4月, 2016 1 次提交

mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf

由 Kirill A. Shutemov 提交于 4月 01, 2016

PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized.  And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE.  And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

09cbfeaf

26 3月, 2016 4 次提交

ceph: fix mounting same fs multiple times · 132ca7e1

由 Yan, Zheng 提交于 3月 12, 2016

Now __ceph_open_session() only accepts closed client. An opened
client will tigger BUG_ON().
Signed-off-by: NYan, Zheng <zyan@redhat.com>

132ca7e1

ceph: kill ceph_empty_snapc · 34b759b4

由 Ilya Dryomov 提交于 2月 16, 2016

ceph_empty_snapc->num_snaps == 0 at all times.  Passing such a snapc to
ceph_osdc_alloc_request() (possibly through ceph_osdc_new_request()) is
equivalent to passing NULL, as ceph_osdc_alloc_request() uses it only
for sizing the request message.

Further, in all four cases the subsequent ceph_osdc_build_request() is
passed NULL for snapc, meaning that 0 is encoded for seq and num_snaps
and making ceph_empty_snapc entirely useless.  The two cases where it
actually mattered were removed in commits 86056090 ("ceph: avoid
sending unnessesary FLUSHSNAP message") and 23078637 ("ceph: fix
queuing inode to mdsdir's snaprealm").
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
Reviewed-by: NYan, Zheng <zyan@redhat.com>

34b759b4

ceph: don't enable rbytes mount option by default · 133e9156

由 Yan, Zheng 提交于 1月 25, 2016

When rbytes mount option is enabled, directory size is recursive
size. Recursive size is not updated instantly. This can cause
directory size to change between successive stat(1)
Signed-off-by: NYan, Zheng <zyan@redhat.com>

133e9156

libceph: revamp subs code, switch to SUBSCRIBE2 protocol · 82dcabad

由 Ilya Dryomov 提交于 1月 19, 2016

It is currently hard-coded in the mon_client that mdsmap and monmap
subs are continuous, while osdmap sub is always "onetime". To better
handle full clusters/pools in the osd_client, we need to be able to
issue continuous osdmap subs. Revamp subs code to allow us to specify
for each sub whether it should be continuous or not.

Although not strictly required for the above, switch to SUBSCRIBE2
protocol while at it, eliminating the ambiguity between a request for
"every map since X" and a request for "just the latest" when we don't
have a map yet (i.e. have epoch 0). SUBSCRIBE2 feature bit is now
required - it's been supported since pre-argonaut (2010).

Move "got mdsmap" call to the end of ceph_mdsc_handle_map() - calling
in before we validate the epoch and successfully install the new map
can mess up mon_client sub state.
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

82dcabad

15 1月, 2016 1 次提交

kmemcg: account certain kmem allocations to memcg · 5d097056

由 Vladimir Davydov 提交于 1月 14, 2016

Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg.  For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds.  Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5d097056

09 9月, 2015 1 次提交

ceph: EIO all operations after forced umount · 48fec5d0

由 Yan, Zheng 提交于 7月 01, 2015

This patch makes try_get_cap_refs() and __do_request() check
if the file system was forced umount, and return -EIO if it was.
This patch also adds a helper function to drops dirty caps and
wakes up blocking operation.
Signed-off-by: NYan, Zheng <zyan@redhat.com>

48fec5d0

05 9月, 2015 1 次提交

fs: create and use seq_show_option for escaping · a068acf2

由 Kees Cook 提交于 9月 04, 2015

Many file systems that implement the show_options hook fail to correctly
escape their output which could lead to unescaped characters (e.g.  new
lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
could lead to confusion, spoofed entries (resulting in things like
systemd issuing false d-bus "mount" notifications), and who knows what
else.  This looks like it would only be the root user stepping on
themselves, but it's possible weird things could happen in containers or
in other situations with delegated mount privileges.

Here's an example using overlay with setuid fusermount trusting the
contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
of "sudo" is something more sneaky:

  $ BASE="ovl"
  $ MNT="$BASE/mnt"
  $ LOW="$BASE/lower"
  $ UP="$BASE/upper"
  $ WORK="$BASE/work/ 0 0
  none /proc fuse.pwn user_id=1000"
  $ mkdir -p "$LOW" "$UP" "$WORK"
  $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
  $ cat /proc/mounts
  none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
  none /proc fuse.pwn user_id=1000 0 0
  $ fusermount -u /proc
  $ cat /proc/mounts
  cat: /proc/mounts: No such file or directory

This fixes the problem by adding new seq_show_option and
seq_show_option_n helpers, and updating the vulnerable show_option
handlers to use them as needed.  Some, like SELinux, need to be open
coded due to unusual existing escape mechanisms.

[akpm@linux-foundation.org: add lost chunk, per Kees]
[keescook@chromium.org: seq_show_option should be using const parameters]
Signed-off-by: NKees Cook <keescook@chromium.org>
Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
Acked-by: NJan Kara <jack@suse.com>
Acked-by: NPaul Moore <paul@paul-moore.com>
Cc: J. R. Okajima <hooanon05g@gmail.com>
Signed-off-by: NKees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a068acf2

25 6月, 2015 3 次提交

Y
ceph: pre-allocate data structure that tracks caps flushing · f66fd9f0
由 Yan, Zheng 提交于 6月 10, 2015
```
Signed-off-by: NYan, Zheng <zyan@redhat.com>
```
f66fd9f0

libceph: store timeouts in jiffies, verify user input · a319bf56

由 Ilya Dryomov 提交于 5月 15, 2015

There are currently three libceph-level timeouts that the user can
specify on mount: mount_timeout, osd_idle_ttl and osdkeepalive.  All of
these are in seconds and no checking is done on user input: negative
values are accepted, we multiply them all by HZ which may or may not
overflow, arbitrarily large jiffies then get added together, etc.

There is also a bug in the way mount_timeout=0 is handled.  It's
supposed to mean "infinite timeout", but that's not how wait.h APIs
treat it and so __ceph_open_session() for example will busy loop
without much chance of being interrupted if none of ceph-mons are
there.

Fix all this by verifying user input, storing timeouts capped by
msecs_to_jiffies() in jiffies and using the new ceph_timeout_jiffies()
helper for all user-specified waits to handle infinite timeouts
correctly.
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
Reviewed-by: NAlex Elder <elder@linaro.org>

a319bf56

Y
ceph: check OSD caps before read/write · 10183a69
由 Yan, Zheng 提交于 4月 27, 2015
```
Signed-off-by: NYan, Zheng <zyan@redhat.com>
```
10183a69

20 4月, 2015 3 次提交

ceph: show non-default options only · ff7eeb82

由 Ilya Dryomov 提交于 3月 25, 2015

Don't pollute /proc/mounts with default options (presently these are
dcache, nofsc and acl).  Leave the acl/noacl however - it's a bit of
a special case due to CONFIG_CEPH_FS_POSIX_ACL.
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

ff7eeb82

libceph, ceph: split ceph_show_options() · ff40f9ae

由 Ilya Dryomov 提交于 3月 25, 2015

Split ceph_show_options() into two pieces and move the piece
responsible for printing client (libceph) options into net/ceph.  This
way people adding a libceph option wouldn't have to remember to update
code in fs/ceph.
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>

ff40f9ae

ceph: kstrdup() memory handling · a149bb9a

由 Sanidhya Kashyap 提交于 3月 21, 2015

Currently, there is no check for the kstrdup() for r_path2,
r_path1 and snapdir_name as various locations as there is a
possibility of failure during memory pressure. Therefore,
returning ENOMEM where the checks have been missed.
Signed-off-by: NSanidhya Kashyap <sanidhya.gatech@gmail.com>
Signed-off-by: NYan, Zheng <zyan@redhat.com>

a149bb9a

16 4月, 2015 1 次提交

VFS: normal filesystems (and lustre): d_inode() annotations · 2b0143b5

由 David Howells 提交于 3月 17, 2015

that's the bulk of filesystem drivers dealing with inodes of their own
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

2b0143b5

19 2月, 2015 1 次提交
- I
  ceph: show nocephx_require_signatures and notcp_nodelay options · 2a0b61ce
  由 Ilya Dryomov 提交于 2月 02, 2015
```
Signed-off-by: NIlya Dryomov <idryomov@redhat.com>
```
  2a0b61ce
21 1月, 2015 2 次提交

fs: remove default_backing_dev_info · df0ce26c

由 Christoph Hellwig 提交于 1月 14, 2015

Now that default_backing_dev_info is not used for writeback purposes we can
git rid of it easily:

 - instead of using it's name for tracing unregistered bdi we just use
   "unknown"
 - btrfs and ceph can just assign the default read ahead window themselves
   like several other filesystems already do.
 - we can assign noop_backing_dev_info as the default one in alloc_super.
   All filesystems already either assigned their own or
   noop_backing_dev_info.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

df0ce26c

ceph: remove call to bdi_unregister · e4d27509

由 Christoph Hellwig 提交于 1月 14, 2015

bdi_destroy already does all the work, and if we delay freeing the
anon bdev we can get away with just that single call.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJan Kara <jack@suse.cz>
Signed-off-by: NJens Axboe <axboe@fb.com>

e4d27509

18 12月, 2014 3 次提交

Y
ceph: support inline data feature · 65a22662
由 Yan, Zheng 提交于 11月 17, 2014
```
Signed-off-by: NYan, Zheng <zyan@redhat.com>
```
65a22662

ceph: remove unused stringification macros · ca3995ad

由 Ilya Dryomov 提交于 11月 13, 2014

These were used to report git versions a long time ago.
Signed-off-by: NIlya Dryomov <idryomov@redhat.com>

ca3995ad

ceph: introduce global empty snap context · 97c85a82

由 Yan, Zheng 提交于 11月 06, 2014

Current snaphost code does not properly handle moving inode from one
empty snap realm to another empty snap realm. After changing inode's
snap realm, some dirty pages' snap context can be not equal to inode's
i_head_snap. This can trigger BUG() in ceph_put_wrbuffer_cap_refs()

The fix is introduce a global empty snap context for all empty snap
realm. This avoids triggering the BUG() for filesystem with no snapshot.

Fixes: http://tracker.ceph.com/issues/9928Signed-off-by: NYan, Zheng <zyan@redhat.com>
Reviewed-by: NIlya Dryomov <idryomov@redhat.com>

97c85a82

08 8月, 2014 1 次提交

dcache: d_obtain_alias callers don't all want DISCONNECTED · 1a0a397e

由 J. Bruce Fields 提交于 2月 14, 2014

There are a few d_obtain_alias callers that are using it to get the
root of a filesystem which may already have an alias somewhere else.

This is not the same as the filehandle-lookup case, and none of them
actually need DCACHE_DISCONNECTED set.

It isn't really a serious problem, but it would really be clearer if we
reserved DCACHE_DISCONNECTED for those cases where it's actually needed.

In the btrfs case this was causing a spurious printk from
nfsd/nfsfh.c:fh_verify when it found an unexpected DCACHE_DISCONNECTED
dentry.  Josef worked around this by unsetting DCACHE_DISCONNECTED
manually in 3a0dfa6a "Btrfs: unset DCACHE_DISCONNECTED when mounting
default subvol", and this replaces that workaround.

Cc: Josef Bacik <jbacik@fb.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

1a0a397e

05 4月, 2014 1 次提交

ceph: use fl->fl_file as owner identifier of flock and posix lock · eb13e832

由 Yan, Zheng 提交于 3月 09, 2014

flock and posix lock should use fl->fl_file instead of process ID
as owner identifier. (posix lock uses fl->fl_owner. fl->fl_owner
is usually equal to fl->fl_file, but it also can be a customized
value). The process ID of who holds the lock is just for F_GETLK
fcntl(2).

The fix is rename the 'pid' fields of struct ceph_mds_request_args
and struct ceph_filelock to 'owner', rename 'pid_namespace' fields
to 'pid'. Assign fl->fl_file to the 'owner' field of lock messages.
We also set the most significant bit of the 'owner' field. MDS can
use that bit to distinguish between old and new clients.

The MDS counterpart of this patch modifies the flock code to not
take the 'pid_namespace' into consideration when checking conflict
locks.
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: NSage Weil <sage@inktank.com>

eb13e832

18 2月, 2014 1 次提交

ceph: add acl, noacl options for cephfs mount · 45195e42

由 Sage Weil 提交于 2月 16, 2014

Make the 'acl' option dependent on having ACL support compiled in.  Make
the 'noacl' option work even without it so that one can always ask it to
be off and not error out on mount when it is not supported.
Signed-off-by: NGuangliang Zhao <lucienchao@gmail.com>
Signed-off-by: NSage Weil <sage@inktank.com>

45195e42

01 1月, 2014 2 次提交

libceph: all features fields must be u64 · 12b4629a

由 Ilya Dryomov 提交于 12月 24, 2013

In preparation for ceph_features.h update, change all features fields
from unsigned int/u32 to u64.  (ceph.git has ~40 feature bits at this
point.)
Signed-off-by: NIlya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: NSage Weil <sage@inktank.com>

12b4629a

ceph: add acl for cephfs · 7221fe4c

由 Guangliang Zhao 提交于 11月 11, 2013

Signed-off-by: NGuangliang Zhao <lucienchao@gmail.com>
Reviewed-by: NLi Wang <li.wang@ubuntykylin.com>
Reviewed-by: NZheng Yan <zheng.z.yan@intel.com>

7221fe4c

14 12月, 2013 1 次提交

ceph: drop unconnected inodes · 9f12bd11

由 Yan, Zheng 提交于 9月 20, 2013

Positve dentry and corresponding inode are always accompanied in MDS reply.
So no need to keep inode in the cache after dropping all its aliases.
Signed-off-by: NYan, Zheng <zheng.z.yan@intel.com>
Reviewed-by: NSage Weil <sage@inktank.com>

9f12bd11

07 9月, 2013 1 次提交

ceph: use fscache as a local presisent cache · 99ccbd22

由 Milosz Tanski 提交于 8月 21, 2013

Adding support for fscache to the Ceph filesystem. This would bring it to on
par with some of the other network filesystems in Linux (like NFS, AFS, etc...)

In order to mount the filesystem with fscache the 'fsc' mount option must be
passed.
Signed-off-by: NMilosz Tanski <milosz@adfin.com>
Signed-off-by: NSage Weil <sage@inktank.com>

99ccbd22

04 7月, 2013 1 次提交

ceph: avoid accessing invalid memory · 54464296

由 Sasha Levin 提交于 7月 01, 2013

when mounting ceph with a dev name that starts with a slash, ceph
would attempt to access the character before that slash. Since we
don't actually own that byte of memory, we would trigger an
invalid access:

[   43.499934] BUG: unable to handle kernel paging request at ffff880fa3a97fff
[   43.500984] IP: [<ffffffff818f3884>] parse_mount_options+0x1a4/0x300
[   43.501491] PGD 743b067 PUD 10283c4067 PMD 10282a6067 PTE 8000000fa3a97060
[   43.502301] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[   43.503006] Dumping ftrace buffer:
[   43.503596]    (ftrace buffer empty)
[   43.504046] CPU: 0 PID: 10879 Comm: mount Tainted: G        W    3.10.0-sasha #1129
[   43.504851] task: ffff880fa625b000 ti: ffff880fa3412000 task.ti: ffff880fa3412000
[   43.505608] RIP: 0010:[<ffffffff818f3884>]  [<ffffffff818f3884>] parse_mount_options$
[   43.506552] RSP: 0018:ffff880fa3413d08  EFLAGS: 00010286
[   43.507133] RAX: ffff880fa3a98000 RBX: ffff880fa3a98000 RCX: 0000000000000000
[   43.507893] RDX: ffff880fa3a98001 RSI: 000000000000002f RDI: ffff880fa3a98000
[   43.508610] RBP: ffff880fa3413d58 R08: 0000000000001f99 R09: ffff880fa3fe64c0
[   43.509426] R10: ffff880fa3413d98 R11: ffff880fa38710d8 R12: ffff880fa3413da0
[   43.509792] R13: ffff880fa3a97fff R14: 0000000000000000 R15: ffff880fa3413d90
[   43.509792] FS:  00007fa9c48757e0(0000) GS:ffff880fd2600000(0000) knlGS:000000000000$
[   43.509792] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   43.509792] CR2: ffff880fa3a97fff CR3: 0000000fa3bb9000 CR4: 00000000000006b0
[   43.509792] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   43.509792] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   43.509792] Stack:
[   43.509792]  0000e5180000000e ffffffff85ca1900 ffff880fa38710d8 ffff880fa3413d98
[   43.509792]  0000000000000120 0000000000000000 ffff880fa3a98000 0000000000000000
[   43.509792]  ffffffff85cf32a0 0000000000000000 ffff880fa3413dc8 ffffffff818f3c72
[   43.509792] Call Trace:
[   43.509792]  [<ffffffff818f3c72>] ceph_mount+0xa2/0x390
[   43.509792]  [<ffffffff81226314>] ? pcpu_alloc+0x334/0x3c0
[   43.509792]  [<ffffffff81282f8d>] mount_fs+0x8d/0x1a0
[   43.509792]  [<ffffffff812263d0>] ? __alloc_percpu+0x10/0x20
[   43.509792]  [<ffffffff8129f799>] vfs_kern_mount+0x79/0x100
[   43.509792]  [<ffffffff812a224d>] do_new_mount+0xcd/0x1c0
[   43.509792]  [<ffffffff812a2e8d>] do_mount+0x15d/0x210
[   43.509792]  [<ffffffff81220e55>] ? strndup_user+0x45/0x60
[   43.509792]  [<ffffffff812a2fdd>] SyS_mount+0x9d/0xe0
[   43.509792]  [<ffffffff83fd816c>] tracesys+0xdd/0xe2
[   43.509792] Code: 4c 8b 5d c0 74 0a 48 8d 50 01 49 89 14 24 eb 17 31 c0 48 83 c9 ff $
[   43.509792] RIP  [<ffffffff818f3884>] parse_mount_options+0x1a4/0x300
[   43.509792]  RSP <ffff880fa3413d08>
[   43.509792] CR2: ffff880fa3a97fff
[   43.509792] ---[ end trace 22469cd81e93af51 ]---
Signed-off-by: NSasha Levin <sasha.levin@oracle.com>
Reviewed-by: NSage Weil <sage@inktan.com>

54464296

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功