- 06 3月, 2019 2 次提交
-
-
由 Yan, Zheng 提交于
If number of caps exceed the limit, ceph_trim_dentires() also trim dentries with valid leases. Trimming dentry releases references to associated inode, which may evict inode and release caps. By default, there is no limit for caps count. Signed-off-by: N"Yan, Zheng" <zyan@redhat.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
When pending cap releases fill up one message, start a work to send cap release message. (old way is sending cap releases every 5 seconds) Signed-off-by: N"Yan, Zheng" <zyan@redhat.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 08 1月, 2019 1 次提交
-
-
由 Dongsheng Yang 提交于
Introduce a new option abort_on_full, default to false. Then we can get -ENOSPC when the pool is full, or reaches quota. [ Don't show abort_on_full in /proc/mounts. ] Signed-off-by: NDongsheng Yang <dongsheng.yang@easystack.cn> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 12 12月, 2018 1 次提交
-
-
由 Luis Henriques 提交于
Since we found a problem with the 'copy-from' operation after objects have been truncated, offloading object copies to OSDs should be discouraged until the issue is fixed. Thus, this patch adds the 'nocopyfrom' mount option to the default mount options which effectily means that remote copies won't be done in copy_file_range unless they are explicitly enabled at mount time. [ Adjust ceph_show_options() accordingly. ] Link: https://tracker.ceph.com/issues/37378Signed-off-by: NLuis Henriques <lhenriques@suse.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 22 10月, 2018 1 次提交
-
-
由 Luis Henriques 提交于
Add a new mount option 'nocopyfrom' that will prevent the usage of the RADOS 'copy-from' operation in cephfs. This could be useful, for example, for an administrator to temporarily mitigate any possible bugs in the 'copy-from' implementation. Currently, only copy_file_range uses this RADOS operation. Setting this mount option will result in this syscall reverting to the default VFS implementation, i.e. to perform the copies locally instead of doing remote object copies. Signed-off-by: NLuis Henriques <lhenriques@suse.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 06 9月, 2018 1 次提交
-
-
由 Ilya Dryomov 提交于
syzbot reported a use-after-free in ceph_destroy_options(), called from ceph_mount(). The problem was that create_fs_client() consumed the opt pointer on some errors, but not on all of them. Make sure it always consumes both libceph and ceph options. Reported-by: syzbot+8ab6f1042021b4eed062@syzkaller.appspotmail.com Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com>
-
- 03 8月, 2018 2 次提交
-
-
由 Chengguang Xu 提交于
In order to not bother to VFS and other specific filesystems, we decided to do offset validation inside ceph kernel client, so just simply set sb->s_maxbytes to MAX_LFS_FILESIZE so that it can successfully pass VFS check. We add new field max_file_size in ceph_fs_client to store real file size limit and doing proper check based on it. Signed-off-by: NChengguang Xu <cgxu519@gmx.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Don't mention "mount" -- in the rbd case it is "mapping". Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 05 6月, 2018 8 次提交
-
-
由 Chengguang Xu 提交于
In current ceph_show_options(), there is no item for showing 'ino32', so add showing mount option 'ino32' if the value is different with default. Signed-off-by: NChengguang Xu <cgxu519@gmx.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Chengguang Xu 提交于
The check (intval < PAGE_SIZE) will involve type cast, so even when specifying negative value to rsize/wsize/readdir_max_bytes, it will pass the validation check successfully. Signed-off-by: NChengguang Xu <cgxu519@gmx.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Chengguang Xu 提交于
On currently logic: when I specify rasize=0~1 then it will be 4096. when I specify rasize=2~4097 then it will be 8192. Make it the same as rsize & wsize. Signed-off-by: NChengguang Xu <cgxu519@gmx.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Luis Henriques 提交于
KASAN found an UAF in ceph_statfs. This was a one-off bug but looking at the code it looks like the monmap access needs to be protected as it can be modified while we're accessing it. Fix this by protecting the access with the monc->mutex. BUG: KASAN: use-after-free in ceph_statfs+0x21d/0x2c0 Read of size 8 at addr ffff88006844f2e0 by task trinity-c5/304 CPU: 0 PID: 304 Comm: trinity-c5 Not tainted 4.17.0-rc6+ #172 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xa5/0x11b ? show_regs_print_info+0x5/0x5 ? kmsg_dump_rewind+0x118/0x118 ? ceph_statfs+0x21d/0x2c0 print_address_description+0x73/0x2b0 ? ceph_statfs+0x21d/0x2c0 kasan_report+0x243/0x360 ceph_statfs+0x21d/0x2c0 ? ceph_umount_begin+0x80/0x80 ? kmem_cache_alloc+0xdf/0x1a0 statfs_by_dentry+0x79/0xb0 vfs_statfs+0x28/0x110 user_statfs+0x8c/0xe0 ? vfs_statfs+0x110/0x110 ? __fdget_raw+0x10/0x10 __se_sys_statfs+0x5d/0xa0 ? user_statfs+0xe0/0xe0 ? mutex_unlock+0x1d/0x40 ? __x64_sys_statfs+0x20/0x30 do_syscall_64+0xee/0x290 ? syscall_return_slowpath+0x1c0/0x1c0 ? page_fault+0x1e/0x30 ? syscall_return_slowpath+0x13c/0x1c0 ? prepare_exit_to_usermode+0xdb/0x140 ? syscall_trace_enter+0x330/0x330 ? __put_user_4+0x1c/0x30 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Allocated by task 130: __kmalloc+0x124/0x210 ceph_monmap_decode+0x1c1/0x400 dispatch+0x113/0xd20 ceph_con_workfn+0xa7e/0x44e0 process_one_work+0x5f0/0xa30 worker_thread+0x184/0xa70 kthread+0x1a0/0x1c0 ret_from_fork+0x35/0x40 Freed by task 130: kfree+0xb8/0x210 dispatch+0x15a/0xd20 ceph_con_workfn+0xa7e/0x44e0 process_one_work+0x5f0/0xa30 worker_thread+0x184/0xa70 kthread+0x1a0/0x1c0 ret_from_fork+0x35/0x40 Signed-off-by: NLuis Henriques <lhenriques@suse.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
The intent behind making it a per-request setting was that it would be set for writes, but not for reads. As it is, the flag is set for all fs/ceph requests except for pool perm check stat request (technically a read). ceph_osdc_abort_on_full() skips reads since the previous commit and I don't see a use case for marking individual requests. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Acked-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com>
-
由 Yan, Zheng 提交于
Pending works hold inode references, which cause "Busy inodes after unmount" warning. Signed-off-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
This avoid force umount waiting on page writeback: io_schedule+0xd/0x30 wait_on_page_bit_common+0xc6/0x130 __filemap_fdatawait_range+0xbd/0x100 filemap_fdatawait_keep_errors+0x15/0x40 sync_inodes_sb+0x1cf/0x240 sync_filesystem+0x52/0x90 generic_shutdown_super+0x1d/0x110 ceph_kill_sb+0x28/0x80 [ceph] deactivate_locked_super+0x35/0x60 cleanup_mnt+0x36/0x70 task_work_run+0x79/0xa0 exit_to_usermode_loop+0x62/0x70 do_syscall_64+0xdb/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 0xffffffffffffffff Signed-off-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
This is how it was before commit 95cca2b4 ("ceph: limit osd write size") went in. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 02 4月, 2018 6 次提交
-
-
由 Luis Henriques 提交于
This commit changes statfs default behaviour when reporting usage statistics. Instead of using the overall filesystem usage, statfs now reports the quota for the filesystem root, if ceph.quota.max_bytes has been set for this inode. If quota hasn't been set, it falls back to the old statfs behaviour. A new mount option is also added ('noquotadf') to disable this behaviour. Signed-off-by: NLuis Henriques <lhenriques@suse.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Chengguang Xu 提交于
In current code, regular file and directory use same struct ceph_file_info to store fs specific data so the struct has to include some fields which are only used for directory (e.g., readdir related info), when having plenty of regular files, it will lead to memory waste. This patch introduces dedicated ceph_dir_file_info cache for readdir related thins. So that regular file does not include those unused fields anymore. Signed-off-by: NChengguang Xu <cgxu519@gmx.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Chengguang Xu 提交于
Releasing cap is affected by many factors (e.g., avail_count/reserve_count/min_count) and min_count could be specified high volume in client mount option. Hence it's better to mark cap cache as unreclaimable in case of non-trivial discrepancies between memory shown as reclaimable and what is actually reclaimed. Signed-off-by: NChengguang Xu <cgxu519@icloud.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Chengguang Xu 提交于
Using seq_show_option to replace seq_printf for string type options. Signed-off-by: NChengguang Xu <cgxu519@icloud.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Chengguang Xu 提交于
When specifying multiple fscache related options, the result isn't always the same as option order, this fix will keep strict consistent meaning by order. Signed-off-by: NChengguang Xu <cgxu519@icloud.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Chengguang Xu 提交于
Some of dout format do not include newline in the end, fix for the files which are in fs/ceph and net/ceph directories, and changing printk to dout for printing debug info in super.c Signed-off-by: NChengguang Xu <cgxu519@icloud.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 01 3月, 2018 1 次提交
-
-
由 Chengguang Xu 提交于
There is lack of cache destroy operation for ceph_file_cachep when failing from fscache register. Signed-off-by: NChengguang Xu <cgxu519@icloud.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 26 2月, 2018 2 次提交
-
-
由 Chengguang Xu 提交于
When failing from ceph_fs_debugfs_init() in ceph_real_mount(), there is lack of dput of root_dentry and it causes slab errors, so change the calling order of ceph_fs_debugfs_init() and open_root_dentry() and do some cleanups to avoid this issue. Signed-off-by: NChengguang Xu <cgxu519@icloud.com> Reviewed-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Chengguang Xu 提交于
When parsing string option, in order to avoid memory leak we need to carefully free it first in case of specifying same option several times. Signed-off-by: NChengguang Xu <cgxu519@icloud.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 28 11月, 2017 1 次提交
-
-
由 Linus Torvalds 提交于
This is a pure automated search-and-replace of the internal kernel superblock flags. The s_flags are now called SB_*, with the names and the values for the moment mirroring the MS_* flags that they're equivalent to. Note how the MS_xyz flags are the ones passed to the mount system call, while the SB_xyz flags are what we then use in sb->s_flags. The script to do this was: # places to look in; re security/*: it generally should *not* be # touched (that stuff parses mount(2) arguments directly), but # there are two places where we really deal with superblock flags. FILES="drivers/mtd drivers/staging/lustre fs ipc mm \ include/linux/fs.h include/uapi/linux/bfs_fs.h \ security/apparmor/apparmorfs.c security/apparmor/include/lib.h" # the list of MS_... constants SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \ DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \ POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \ I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \ ACTIVE NOUSER" SED_PROG= for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done # we want files that contain at least one of MS_..., # with fs/namespace.c and fs/pnode.c excluded. L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c') for f in $L; do sed -i $f $SED_PROG; done Requested-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 11月, 2017 1 次提交
-
-
由 Jeff Layton 提交于
Since its inception, ceph has presented the fsid as an opaque value without any sort of endianness conversion. This means that the value presented is different on architectures of different endianness. While the value that should be stuffed into f_fsid is poorly-defined, I think it would be best to strive for consistency here between architectures, and clients (we need to present this properly to the userland client as well). Change ceph_statfs to convert the opaque words to host-endian before doing the xor. On an upgrade, a big-endian box may see a different fsid than it did before, but little-endian arches should see no change with this patch. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NSage Weil <sage@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 07 9月, 2017 6 次提交
-
-
由 Markus Elfring 提交于
The script “checkpatch.pl” pointed information out like the following. Comparison to NULL could be written ... Thus fix the affected source code places. Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net> Reviewed-by: NYan, Zheng <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Douglas Fuller 提交于
Improve accuracy of statfs reporting for Ceph filesystems comprising exactly one data pool. In this case, the Ceph monitor can now report the space usage for the single data pool instead of the global data for the entire Ceph cluster. Include support for this message in mon_client and leverage it in ceph/super. Signed-off-by: NDouglas Fuller <dfuller@redhat.com> Reviewed-by: NYan, Zheng <zyan@redhat.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
Signed-off-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
OSD has a configurable limitation of max write size. OSD return error if write request size is larger than the limitation. For now, set max write size to CEPH_MSG_MAX_DATA_LEN. It should be small enough. Signed-off-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
libceph returns -EIO when read size > CEPH_MSG_MAX_DATA_LEN. Link: http://tracker.ceph.com/issues/20528Signed-off-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
Signed-off-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 07 7月, 2017 2 次提交
-
-
由 Yan, Zheng 提交于
Current ceph uses FSID as primary index key of fscache data. This allows ceph to retain cached data across remount. But this causes problem (kernel opps, fscache does not support sharing data) when a filesystem get mounted several times (with fscache enabled, with different mount options). The fix is adding a new mount option, which specifies uniquifier for fscache. Signed-off-by: N"Yan, Zheng" <zyan@redhat.com> Acked-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Yan, Zheng 提交于
extra_mon_dispatch() and debugfs' foo_show functions dereference fsc->mdsc. we should clean up fsc->client->extra_mon_dispatch and debugfs before destroying fsc->mds. Signed-off-by: N"Yan, Zheng" <zyan@redhat.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 04 5月, 2017 1 次提交
-
-
由 Ilya Dryomov 提交于
No reason to hide CephFS-specific features in the rbd case. Recent feature bits mix RADOS and CephFS-specific stuff together anyway. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 21 4月, 2017 1 次提交
-
-
由 Jan Kara 提交于
Allocate struct backing_dev_info separately instead of embedding it inside client structure. This unifies handling of bdi among users. CC: Ilya Dryomov <idryomov@gmail.com> CC: "Yan, Zheng" <zyan@redhat.com> CC: Sage Weil <sage@redhat.com> CC: ceph-devel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJan Kara <jack@suse.cz> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 25 2月, 2017 1 次提交
-
-
由 Ilya Dryomov 提交于
- ask for a commit reply instead of an ack reply in __ceph_pool_perm_get() - don't ask for both ack and commit replies in ceph_sync_write() - since just only one reply is requested now, i_unsafe_writes list will always be empty -- kill ceph_sync_write_wait() and go back to a standard ->evict_inode() Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NSage Weil <sage@redhat.com>
-
- 20 2月, 2017 1 次提交
-
-
由 Andreas Gerstmayr 提交于
This patch sets the io_pages bdi hint based on the rsize mount option. Without this patch large buffered reads (request size > max readahead) are processed sequentially in chunks of the readahead size (i.e. read requests are sent out up to the readahead size, then the do_generic_file_read() function waits until the first page is received). With this patch read requests are sent out at once up to the size specified in the rsize mount option (default: 64 MB). Signed-off-by: NAndreas Gerstmayr <andreas.gerstmayr@catalysts.cc> Acked-by: NJeff Layton <jlayton@redhat.com> Signed-off-by: NYan, Zheng <zyan@redhat.com>
-
- 13 12月, 2016 1 次提交
-
-
由 Yan, Zheng 提交于
Signed-off-by: NYan, Zheng <zyan@redhat.com>
-