- 11 Jun, 2022 1 commit
-
-
Committed by Linus Torvalds
Change the signature of netfs helper functions to take a struct netfs_inode pointer rather than a struct inode pointer where appropriate, thereby relieving the need for the network filesystem to convert its internal inode format down to the VFS inode only for netfslib to bounce it back up. For type safety, it's better not to do that (and it's less typing too).

Give netfs_write_begin() an extra argument to pass in a pointer to the netfs_inode struct rather than deriving it internally from the file pointer. Note that the ->write_begin() and ->write_end() ops are intended to be replaced in the future by netfslib code that manages this without the need to call in twice for each page.

netfs_readpage() and similar are intended to be pointed at directly by the address_space_operations table, so must stick to the signature dictated by the function pointers there.

Changes
=======
- Updated the kerneldoc comments and documentation [DH].

Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/CAHk-=wgkwKyNmNdKpQkqZ6DnmUL-x9hp0YBnUGjaPFEAdxDTbw@mail.gmail.com/
-
- 10 Jun, 2022 2 commits
-
-
Committed by David Howells
While randstruct was satisfied with using an open-coded "void *" offset cast for the netfs_i_context <-> inode casting, __builtin_object_size() as used by FORTIFY_SOURCE was not as easily fooled. This was causing the following complaint[1] from gcc v12:

    In file included from include/linux/string.h:253,
                     from include/linux/ceph/ceph_debug.h:7,
                     from fs/ceph/inode.c:2:
    In function 'fortify_memset_chk',
        inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2,
        inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2:
    include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
      242 |                         __write_overflow_field(p_size_field, size);
          |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix this by embedding a struct inode into struct netfs_i_context (which should perhaps be renamed to struct netfs_inode). The struct inode vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode structs and vfs_inode is then simply changed to "netfs.inode" in those filesystems.

Further, rename netfs_i_context to netfs_inode, get rid of the netfs_inode() function that converted a netfs_i_context pointer to an inode pointer (that can now be done with &ctx->inode) and rename the netfs_i_context() function to netfs_inode() (which is now a wrapper around container_of()).

Most of the changes were done with:

    perl -p -i -e 's/vfs_inode/netfs.inode/'g \
        `git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]`

Kees suggested doing it with a pair structure[2] and a special declarator to insert that into the network filesystem's inode wrapper[3], but I think it's cleaner to embed it - and then it doesn't matter if struct randomisation reorders things.

Dave Chinner suggested using a filesystem-specific VFS_I() function in each filesystem to convert that filesystem's own inode wrapper struct into the VFS inode struct[4].

Version #2:
- Fix a couple of missed name changes due to a disabled cifs option.
- Rename netfs_i_context to netfs_inode
- Use "netfs" instead of "nic" as the member name in per-fs inode wrapper structs.

[ This also undoes commit 507160f4 ("netfs: gcc-12: temporarily disable '-Wattribute-warning' for now") that is no longer needed ]

Fixes: bc899ee1 ("netfs: Add a netfs inode context")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
cc: Jonathan Corbet <corbet@lwn.net>
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Steve French <smfrench@gmail.com>
cc: William Kucharski <william.kucharski@oracle.com>
cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
cc: Dave Chinner <david@fromorbit.com>
cc: linux-doc@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-afs@lists.infradead.org
cc: ceph-devel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: samba-technical@lists.samba.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-hardening@vger.kernel.org
Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1]
Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2]
Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3]
Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4]
Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
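The embedding described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions: the struct contents are invented stand-ins, `struct my_fs_inode` and `MY_FS_I()` are hypothetical names, and the `container_of()` shown is a simplified version without the kernel's type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the VFS inode; the field is invented. */
struct inode {
	unsigned long i_ino;
};

/* The netfs wrapper embeds the VFS inode as an ordinary member. */
struct netfs_inode {
	struct inode inode;
	void *cache;		/* stand-in for the fscache cookie pointer */
};

/* Hypothetical filesystem inode wrapper, named for illustration only. */
struct my_fs_inode {
	struct netfs_inode netfs;
	int fs_private;
};

/* Simplified container_of(); the kernel macro adds type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* VFS inode -> netfs wrapper, with the offset taken from the compiler. */
static inline struct netfs_inode *netfs_inode(struct inode *inode)
{
	return container_of(inode, struct netfs_inode, inode);
}

/* VFS inode -> filesystem wrapper, via the nested "netfs.inode" member. */
static inline struct my_fs_inode *MY_FS_I(struct inode *inode)
{
	assert(inode != NULL);
	return container_of(inode, struct my_fs_inode, netfs.inode);
}
```

Because the offsets come from `offsetof()`, the conversion stays correct wherever the compiler places the embedded member, which is the property the commit message relies on when it mentions struct randomisation.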
-
Committed by Linus Torvalds
This is a pure band-aid so that I can continue merging stuff from people while some of the gcc-12 fallout gets sorted out.

In particular, gcc-12 is very unhappy about the kinds of pointer arithmetic tricks that netfs does, and that makes the fortify checks trigger in afs and ceph:

    In function ‘fortify_memset_chk’,
        inlined from ‘netfs_i_context_init’ at include/linux/netfs.h:327:2,
        inlined from ‘afs_set_netfs_context’ at fs/afs/inode.c:61:2,
        inlined from ‘afs_root_iget’ at fs/afs/inode.c:543:2:
    include/linux/fortify-string.h:258:25: warning: call to ‘__write_overflow_field’ declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
      258 |                         __write_overflow_field(p_size_field, size);
          |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

and the reason is that netfs_i_context_init() is passed a 'struct inode' pointer, and then it does

    struct netfs_i_context *ctx = netfs_i_context(inode);

    memset(ctx, 0, sizeof(*ctx));

where that netfs_i_context() function just does pointer arithmetic on the inode pointer, knowing that the netfs_i_context is laid out immediately after it in memory.

This is all truly disgusting, since the whole "netfs_i_context is laid out immediately after it in memory" is not actually remotely true in general, but is just made to be that way for afs and ceph.

See for example fs/cifs/cifsglob.h:

    struct cifsInodeInfo {
        struct {
            /* These must be contiguous */
            struct inode vfs_inode;              /* the VFS's inode record */
            struct netfs_i_context netfs_ctx;    /* Netfslib context */
        };
        [...]

and realize that this is all entirely wrong, and the pointer arithmetic that netfs_i_context() is doing is also very very wrong and wouldn't give the right answer if netfs_ctx had different alignment rules from a 'struct inode', for example.

Anyway, that's just a long-winded way to say "the gcc-12 warning is actually quite reasonable, and our code happens to work but is pretty disgusting".

This is getting fixed properly, but for now I made the mistake of thinking "the week right after the merge window tends to be calm for me as people take a breather" and I did a system upgrade. And I got gcc-12 as a result, so to continue merging fixes from people and not have the end result drown in warnings, I am fixing all these gcc-12 issues I hit. Including with these kinds of temporary fixes.

Cc: Kees Cook <keescook@chromium.org>
Cc: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/all/AEEBCF5D-8402-441D-940B-105AA718C71F@chromium.org/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
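The alignment concern raised in this message can be illustrated with a small userspace sketch. All types here are invented for the demonstration (a deliberately odd-sized fake inode and a context with stricter alignment); it is not the kernel layout, only an example of how "the context starts right after the inode" can disagree with where the compiler really puts the member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Deliberately odd-sized stand-in for struct inode (invented). */
struct fake_inode {
	char bytes[5];
};

/* Stand-in for the netfs context; uint64_t gives it stricter alignment. */
struct fake_ctx {
	uint64_t cookie;
};

struct fake_wrapper {
	struct fake_inode vfs_inode;
	struct fake_ctx netfs_ctx;
};

/* Offset assumed by "inode pointer + sizeof(inode)" arithmetic. */
static inline size_t assumed_ctx_offset(void)
{
	return sizeof(struct fake_inode);
}

/* Offset the compiler actually uses, including any padding. */
static inline size_t actual_ctx_offset(void)
{
	return offsetof(struct fake_wrapper, netfs_ctx);
}
```

On common ABIs the compiler inserts padding before `netfs_ctx`, so the two offsets differ, and a `memset()` through the assumed pointer would land in the wrong place.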
-
- 26 May, 2022 5 commits
-
-
Committed by Luís Henriques
When a mount uses as its base a directory that has a 'max_bytes' quota, statfs uses that value as the total; if a subdirectory is used instead, statfs reports the same 'max_bytes' too, unless another quota is set. Unfortunately, if this subdirectory only has the 'max_files' quota set, then statfs uses the filesystem total. Fix this by making sure we only look up realms that contain the 'max_bytes' quota.

Cc: Ryan Taylor <rptaylor@uvic.ca>
URL: https://tracker.ceph.com/issues/55090
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Xiubo Li
If any 'x' caps are issued, we can just choose the auth MDS instead of a random replica MDS, because the loner client can only get the 'x' caps when the Locker is in the LOCK_EXEC state. If we send the getattr request to a replica MDS, it must auth pin and try to rdlock from the auth MDS, and the auth MDS then needs to transition the Locker state to LOCK_SYNC. After that the lock state changes back. The Locker state transition is costly and usually requires revoking caps from clients.

URL: https://tracker.ceph.com/issues/55240
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Xiubo Li
According to POSIX and the comments in the commit that first added statx support, AT_STATX_DONT_SYNC requests a lightweight stat and AT_STATX_FORCE_SYNC a heavier-weight one. All other current users of these two flags behave the same way: they skip synchronously retrieving the attributes from storage only when AT_STATX_FORCE_SYNC is not set and AT_STATX_DONT_SYNC is set.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
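The flag rule stated above can be written as a one-line predicate. This is a minimal sketch, not the ceph code: `statx_may_skip_sync()` is a hypothetical helper name, and the flag values are reproduced from the Linux UAPI header so the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as defined in the Linux UAPI header <linux/fcntl.h>. */
#ifndef AT_STATX_SYNC_TYPE
#define AT_STATX_SYNC_TYPE	0x6000
#define AT_STATX_SYNC_AS_STAT	0x0000
#define AT_STATX_FORCE_SYNC	0x2000
#define AT_STATX_DONT_SYNC	0x4000
#endif

/*
 * Hypothetical helper: true only when the caller asked for DONT_SYNC
 * and did not also ask for FORCE_SYNC.
 */
static inline bool statx_may_skip_sync(unsigned int flags)
{
	return !(flags & AT_STATX_FORCE_SYNC) && (flags & AT_STATX_DONT_SYNC);
}
```

With this shape, FORCE_SYNC wins whenever both bits are set, and the default (neither bit set) still synchronises.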
-
Committed by Xiubo Li
Fixes: 400e1286 ("ceph: conversion to new fscache API")
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Xiubo Li
The MDS will always refresh the dentry lease when removing files or directories. If the dentry is still hashed, we can update the dentry lease and avoid a later lookup from the MDS.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 23 Mar, 2022 1 commit
-
-
Committed by Muchun Song
The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb().

Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4]
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Alex Shi <alexs@kernel.org>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Fam Zheng <fam.zheng@bytedance.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kari Argillander <kari.argillander@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 21 Mar, 2022 1 commit
-
-
Committed by Xiubo Li
ceph_get_inode() will search for or insert a new inode into the hash for the given vino, and return a reference to it. If new is non-NULL, its reference is consumed. We should release the reference in the error handling cases.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 18 Mar, 2022 1 commit
-
-
Committed by David Howells
Add a netfs_i_context struct that should be included in the network filesystem's own inode struct wrapper, directly after the VFS's inode struct, e.g.:

    struct my_inode {
        struct {
            /* These must be contiguous */
            struct inode            vfs_inode;
            struct netfs_i_context  netfs_ctx;
        };
    };

The netfs_i_context struct so far contains a single field for the network filesystem to use - the cache cookie:

    struct netfs_i_context {
        ...
        struct fscache_cookie   *cache;
    };

Three functions are provided to help with this:

 (1) void netfs_i_context_init(struct inode *inode,
                               const struct netfs_request_ops *ops);

     Initialise the netfs context and set the operations.

 (2) struct netfs_i_context *netfs_i_context(struct inode *inode);

     Find the netfs context from the VFS inode.

 (3) struct inode *netfs_inode(struct netfs_i_context *ctx);

     Find the VFS inode from the netfs context.

Changes
=======
ver #4)
 - Fix netfs_is_cache_enabled() to check cookie->cache_priv to see if a cache is present[3].
 - Fix netfs_skip_folio_read() to zero out all of the page, not just some of it[3].

ver #3)
 - Split out the bit to move ceph cap-getting on readahead into ceph_init_request()[1].
 - Stick in a comment to the netfs inode structs indicating the contiguity requirements[2].

ver #2)
 - Adjust documentation to match.
 - Use "#if IS_ENABLED()" in netfs_i_cookie(), not "#ifdef".
 - Move the cap check from ceph_readahead() to ceph_init_request() to be called from netfslib.
 - Remove ceph_readahead() and use netfs_readahead() directly instead.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/8af0d47f17d89c06bbf602496dd845f2b0bf25b3.camel@kernel.org/ [1]
Link: https://lore.kernel.org/r/beaf4f6a6c2575ed489adb14b257253c868f9a5c.camel@kernel.org/ [2]
Link: https://lore.kernel.org/r/3536452.1647421585@warthog.procyon.org.uk/ [3]
Link: https://lore.kernel.org/r/164622984545.3564931.15691742939278418580.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/164678213320.1200972.16807551936267647470.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/164692909854.2099075.9535537286264248057.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/306388.1647595110@warthog.procyon.org.uk/ # v4
-
- 02 Mar, 2022 2 commits
-
-
Committed by Milind Changire
Problem: Some directory vxattrs (e.g. ceph.dir.pin.random) are governed by information that isn't necessarily shared with the client. Add support for the new GETVXATTR operation, which allows the client to query the MDS directly for vxattrs. When the client is queried for a vxattr that doesn't have a special handler, have it issue a GETVXATTR to the MDS directly.

Solution: Adds new getvxattr op to fetch ceph.dir.pin*, ceph.dir.layout* and ceph.file.layout* vxattrs. If the entire layout for a dir or a file is being set, then it is expected that the layout be set in standard JSON format. Individual field value retrieval is not wrapped in JSON. The JSON format also applies while setting the vxattr if the entire layout is being set in one go. As a temporary measure, setting a vxattr can also be done in the old format. The old format will be deprecated in the future.

URL: https://tracker.ceph.com/issues/51062
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by hongnanli
inode->i_mutex was replaced with inode->i_rwsem long ago. Fix comments still mentioning i_mutex.

Signed-off-by: hongnanli <hongnan.li@linux.alibaba.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 12 Jan, 2022 1 commit
-
-
Committed by Jeff Layton
Now that the fscache API has been reworked and simplified, change ceph over to use it.

With the old API, we would only instantiate a cookie when the file was open for reads. Change it to instantiate the cookie when the inode is instantiated and call use/unuse when the file is opened/closed.

Also, ensure we resize the cached data on truncates, and invalidate the cache in response to the appropriate events. This will allow us to plumb in write support later.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20211129162907.149445-2-jlayton@kernel.org/ # v1
Link: https://lore.kernel.org/r/20211207134451.66296-2-jlayton@kernel.org/ # v2
Link: https://lore.kernel.org/r/163906984277.143852.14697110691303589000.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/163967188351.1823006.5065634844099079351.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/164021581427.640689.14128682147127509264.stgit@warthog.procyon.org.uk/ # v4
-
- 08 Nov, 2021 3 commits
-
-
Committed by Xiubo Li
If the new size is the same as the current size, the MDS will do nothing but change the mtime/atime. POSIX doesn't mandate that filesystems must update them in this case, so just ignore it instead.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Jeff Layton
Add proper error handling for when an async create fails. The inode never existed, so any dirty caps or data are now toast. We already d_drop the dentry in that case, but the now-stale inode may still be around. We want to shut down access to these inodes, and ensure that they can't harbor any more dirty data, which can cause problems at umount time.

When this occurs, flag such inodes as being SHUTDOWN, and trash any caps and cap flushes that may be in flight for them, and invalidate the pagecache for the inode. Add a new helper that can check whether an inode or an entire mount is now shut down, and call it instead of accessing the mount_state directly in places where we test that now.

URL: https://tracker.ceph.com/issues/51279
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Jeff Layton
We have a lot of log messages that print inode pointer values. This is of dubious utility. Switch a random assortment of the ones I've found most useful to use ceph_vinop to print the snap:inum tuple instead.

[ idryomov: use . as a separator, break unnecessarily long lines ]

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 19 Oct, 2021 1 commit
-
-
Committed by Jeff Layton
Currently, we check the wb_err too early for directories, before all of the unsafe child requests have been waited on. In order to fix that we need to check the mapping->wb_err later, nearer to the end of ceph_fsync.

We also have an overly-complex method for tracking errors after blocklisting. The errors recorded in cleanup_session_requests go to a completely separate field in the inode, but we end up reporting them the same way we would for any other error (in fsync).

There's no real benefit to tracking these errors in two different places, since the only reporting mechanism for them is in fsync, and we'd need to advance them both every time.

Given that, we can just remove i_meta_err, and convert the places that used it to just use mapping->wb_err instead. That also fixes the original problem by ensuring that we do a check_and_advance of the wb_err at the end of the fsync op.

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/52864
Reported-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 03 Sep, 2021 1 commit
-
-
Committed by Jeff Layton
Consolidate some fiddly code for changing an inode's snap_realm into a new helper function, and change the callers to use it. While we're in here, nothing uses the i_snap_realm_counter field, so remove that from the inode.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 29 Jun, 2021 3 commits
-
-
Committed by Jeff Layton
Now that we don't need to hold session->s_mutex or the snap_rwsem when calling ceph_check_caps, we can eliminate ceph_async_iput and just use normal iput calls.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Jeff Layton
Turn the s_cap_gen field into an atomic_t, and just rely on the fact that we hold the s_mutex when changing the s_cap_ttl field.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
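As a rough userspace analogue of turning a generation counter into an atomic, the sketch below uses C11 `<stdatomic.h>` instead of the kernel's `atomic_t`. The `demo_*` names are hypothetical, and the assumption that readers compare a previously recorded generation against the current one is made for illustration, not taken from the commit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical session object holding only the generation counter. */
struct demo_session {
	atomic_int cap_gen;
};

static inline void demo_session_init(struct demo_session *s)
{
	atomic_init(&s->cap_gen, 1);
}

/* Readers load the generation without taking any lock. */
static inline int demo_cap_gen(struct demo_session *s)
{
	return atomic_load(&s->cap_gen);
}

/* Bumping the generation marks everything issued earlier as stale. */
static inline void demo_bump_cap_gen(struct demo_session *s)
{
	atomic_fetch_add(&s->cap_gen, 1);
}

/* A recorded generation is current only if it matches the live value. */
static inline bool demo_cap_is_current(struct demo_session *s, int gen)
{
	return gen == demo_cap_gen(s);
}
```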
-
Committed by Jeff Layton
...to simplify some error paths.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 22 Jun, 2021 1 commit
-
-
Committed by Jeff Layton
...and add a lockdep assertion for it to ceph_fill_inode().

Cc: stable@vger.kernel.org # v5.7+
Fixes: 9a8d03ca ("ceph: attempt to do async create when possible")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 28 Apr, 2021 5 commits
-
-
Committed by Jeff Layton
The MDS reserves a set of inodes for its own usage, and these should never be accessible to clients. Add a new helper to vet a proposed inode number against that range, and complain loudly and refuse to create or look it up if it's in it.

Also, ensure that the MDS doesn't try to delegate inodes that are in that range or lower. Print a warning if it does, and don't save the range in the xarray.

URL: https://tracker.ceph.com/issues/49922
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
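A range check of the kind described here might look like the following sketch. The bounds are hypothetical placeholders chosen for the demonstration, not the values the kernel or the MDS uses, and the `demo_*` helper names are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds for the demo; NOT the kernel's constants. */
#define DEMO_RESERVED_INO_FIRST	0x100ULL
#define DEMO_RESERVED_INO_LAST	0xfffULL

/* True if the inode number falls inside the reserved range. */
static inline bool demo_ino_is_reserved(uint64_t ino)
{
	return ino >= DEMO_RESERVED_INO_FIRST && ino <= DEMO_RESERVED_INO_LAST;
}

/*
 * A delegated range is acceptable only if it is non-empty and starts
 * above the reserved range ("in that range or lower" is rejected).
 */
static inline bool demo_delegation_ok(uint64_t start, uint64_t len)
{
	return len > 0 && start > DEMO_RESERVED_INO_LAST;
}
```

A caller would warn and skip storing the range when `demo_delegation_ok()` returns false, mirroring the "print a warning and don't save the range" behaviour in the message.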
-
Committed by Jeff Layton
We need to use i_size_read(), which properly handles the torn read case on 32-bit arches.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Yanhu Cao
Add support for grabbing the rsnaps value out of the inode info in traces, and exposing that via the ceph.dir.rsnaps xattr.

Signed-off-by: Yanhu Cao <gmayyyha@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Jeff Layton
We want the snapdir to mirror the non-snapped directory's attributes for most things, but i_snap_caps represents the caps granted on the snapshot directory by the MDS itself. A misbehaving MDS could issue different caps for the snapdir and we lose them here.

Only reset i_snap_caps when the inode is I_NEW. Also, move the setting of i_op and i_fop inside the if block since they should never change anyway.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Jeff Layton
Ensure that we invalidate the fscache whenever we invalidate the pagecache.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
- 08 Mar, 2021 2 commits
-
-
Committed by Jeff Layton
Al pointed out that a malicious or broken MDS could change the type or device number of a given inode number. It may also be possible for the MDS to reuse an old inode number.

Ensure that we never allow fill_inode to change the type part of the i_mode or the i_rdev unless I_NEW is set. Throw warnings if the MDS ever changes these on us mid-stream, and return an error.

Don't set i_rdev directly, and rely on init_special_inode to do it. Also, fix up error handling in the callers of ceph_get_inode.

In handle_cap_grant, check for and warn if the inode type changes, and only overwrite the mode if it didn't.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
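The "never change the type bits unless the inode is new" rule can be sketched as follows. The `DEMO_*` constants mirror the usual Linux mode-bit values so the block needs no system headers, and the helper names and the `-1` error return are illustrative assumptions rather than the ceph implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* File-type bits as used on Linux, redefined locally for the demo. */
#define DEMO_S_IFMT	0170000u
#define DEMO_S_IFDIR	0040000u
#define DEMO_S_IFREG	0100000u

/* True if the file-type portion of the two modes differs. */
static inline bool demo_type_changed(unsigned int old_mode,
				     unsigned int new_mode)
{
	return ((old_mode ^ new_mode) & DEMO_S_IFMT) != 0;
}

/*
 * Hypothetical update helper: accepts any mode for a brand-new inode,
 * but refuses a type change on an existing one. Returns 0 on success
 * and -1 (a placeholder error) on rejection, leaving *cur untouched.
 */
static inline int demo_update_mode(unsigned int *cur, unsigned int incoming,
				   bool is_new)
{
	if (!is_new && demo_type_changed(*cur, incoming))
		return -1;
	*cur = incoming;
	return 0;
}
```

Permission-bit changes still go through; only a change in the type bits on an already-initialised inode is rejected.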
-
Committed by Jeff Layton
There are several warts in the snapdir error handling. The -EOPNOTSUPP return in __snapfh_to_dentry is currently lost, and the call to ceph_handle_snapdir is not currently checked at all.

Fix all of this up and eliminate a BUG_ON in ceph_get_snapdir. We can handle that case with a warning and return an error.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 16 Feb, 2021 2 commits
-
-
Committed by Jeff Layton
Testing with the fscache overhaul has triggered some lockdep warnings about circular lock dependencies involving page_mkwrite and the mmap_lock. It'd be better to do the "real work" without the mmap lock being held.

Change the skip_checking_caps parameter in __ceph_put_cap_refs to an enum, and use that to determine whether to queue check_caps, do it synchronously or not at all. Change ceph_page_mkwrite to do a ceph_put_cap_refs_async().

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
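Replacing a boolean parameter with a three-way enum, as this commit describes, could be sketched like this. The enum constants, the `demo_stats` counters, and `demo_put_cap_refs()` are hypothetical names; the counters merely stand in for "run the check now" and "queue the check for later".

```c
#include <assert.h>

/* Hypothetical three-way mode replacing a bool "skip" parameter. */
enum demo_put_mode {
	DEMO_PUT_SYNC,		/* run the cap check immediately */
	DEMO_PUT_NO_CHECK,	/* do not check at all */
	DEMO_PUT_ASYNC,		/* defer the check to a worker */
};

/* Counters standing in for the real side effects. */
struct demo_stats {
	int checked_now;
	int queued;
};

static inline void demo_put_cap_refs(struct demo_stats *st,
				     enum demo_put_mode mode)
{
	switch (mode) {
	case DEMO_PUT_SYNC:
		st->checked_now++;	/* synchronous check */
		break;
	case DEMO_PUT_ASYNC:
		st->queued++;		/* queued for later */
		break;
	case DEMO_PUT_NO_CHECK:
		break;			/* nothing to do */
	}
}
```

A caller holding a lock that the check must not run under would pass the asynchronous mode, which matches the intent of doing the real work without the mmap lock held.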
-
Committed by Jeff Layton
Add a generic function for taking an inode reference, setting the I_WORK bit and queueing i_work. Turn the ceph_queue_* functions into static inline wrappers that pass in the right bit.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
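The "one generic function plus thin inline wrappers" shape described here can be sketched in single-threaded userspace C. Everything below is hypothetical: the `demo_*` names, the bit numbers, and the plain integer fields that stand in for the reference count and the work queue. The kernel version would use atomic bit operations and a real workqueue.

```c
#include <assert.h>

/* Hypothetical work bits, one per kind of deferred work. */
enum {
	DEMO_WORK_WRITEBACK,
	DEMO_WORK_INVALIDATE,
	DEMO_WORK_VMTRUNCATE,
};

/* Stand-in inode: a work bitmask, a refcount and a queue counter. */
struct demo_inode {
	unsigned long work_mask;
	int refcount;
	int times_queued;
};

/* Generic helper: take a reference, set the bit, queue the work. */
static inline void demo_queue_inode_work(struct demo_inode *in, int bit)
{
	in->refcount++;
	in->work_mask |= 1UL << bit;
	in->times_queued++;
}

/* Thin wrappers that only choose the bit. */
static inline void demo_queue_writeback(struct demo_inode *in)
{
	demo_queue_inode_work(in, DEMO_WORK_WRITEBACK);
}

static inline void demo_queue_invalidate(struct demo_inode *in)
{
	demo_queue_inode_work(in, DEMO_WORK_INVALIDATE);
}
```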
-
- 24 Jan, 2021 5 commits
-
-
Committed by Christian Brauner
Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has been marked with. This can be used for additional permission checking and also to enable filesystems to translate between uids and gids if they need to. We have implemented all relevant helpers in earlier patches.

As requested we simply extend the existing inode method instead of introducing new ones. This is a little more code churn but it's mostly mechanical and doesn't leave us with additional inode methods.

Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Committed by Christian Brauner
The generic_fillattr() helper fills in the basic attributes associated with an inode. Enable it to handle idmapped mounts. If the inode is accessed through an idmapped mount, map it into the mount's user namespace before we store the uid and gid. If the initial user namespace is passed nothing changes, so non-idmapped mounts will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-12-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Committed by Christian Brauner
The posix acl permission checking helpers determine whether a caller is privileged over an inode according to the acls associated with the inode. Add helpers that make it possible to handle acls on idmapped mounts.

The vfs and the filesystems targeted by this first iteration make use of posix_acl_fix_xattr_from_user() and posix_acl_fix_xattr_to_user() to translate basic posix access and default permissions such as the ACL_USER and ACL_GROUP type according to the initial user namespace (or the superblock's user namespace) to and from the caller's current user namespace. Adapt these two helpers to handle idmapped mounts whereby we either map from or into the mount's user namespace depending on in which direction we're translating.

Similarly, cap_convert_nscap() is used by the vfs to translate user namespace and non-user namespace aware filesystem capabilities from the superblock's user namespace to the caller's user namespace. Enable it to handle idmapped mounts by accounting for the mount's user namespace.

In addition the filesystems targeted in the first iteration of this patch series make use of the posix_acl_chmod() and posix_acl_update_mode() helpers. Both helpers perform permission checks on the target inode. Let them handle idmapped mounts. These two helpers are called when posix acls are set by the respective filesystems. To handle this case we extend the ->set() method to take an additional user namespace argument to pass the mount's user namespace down.

Link: https://lore.kernel.org/r/20210121131959.646623-9-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Committed by Christian Brauner
When file attributes are changed most filesystems rely on the setattr_prepare(), setattr_copy(), and notify_change() helpers for initialization and permission checking. Let them handle idmapped mounts. If the inode is accessed through an idmapped mount, map it into the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes, so non-idmapped mounts will see identical behavior as before.

Helpers that perform checks on the ia_uid and ia_gid fields in struct iattr assume that ia_uid and ia_gid are intended values and have already been mapped correctly at the userspace-kernelspace boundary as we already do today. If the initial user namespace is passed nothing changes, so non-idmapped mounts will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-8-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Committed by Christian Brauner
The two helpers inode_permission() and generic_permission() are used by the vfs to perform basic permission checking by verifying that the caller is privileged over an inode. In order to handle idmapped mounts we extend the two helpers with an additional user namespace argument. On idmapped mounts the two helpers will make sure to map the inode according to the mount's user namespace and then perform identical permission checks to inode_permission() and generic_permission(). If the initial user namespace is passed nothing changes, so non-idmapped mounts will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-6-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
- 15 Dec, 2020 3 commits
-
-
Committed by Jeff Layton
We already have a pointer to the argument struct in req->r_args. Use that instead of groveling around in the ceph_mds_request_head.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Jeff Layton
Geng Jichao reported a rather complex deadlock involving several moving parts:

1) readahead is issued against an inode and some of its pages are locked while the read is in flight

2) the same inode is evicted from the cache, and this task gets stuck waiting for the page lock because of the above readahead

3) another task is processing a reply trace, and looks up the inode being evicted while holding the s_mutex. That ends up waiting for the eviction to complete

4) a write reply for an unrelated inode is then processed in the ceph_con_workfn job. It calls ceph_check_caps after putting wrbuffer caps, and that gets stuck waiting on the s_mutex held by 3.

The reply to "1" is stuck behind the write reply in "4", so we deadlock at that point.

This patch changes the trace processing to call ceph_get_inode outside of the s_mutex and snap_rwsem, which should break the cycle above.

[ idryomov: break unnecessarily long lines ]

URL: https://tracker.ceph.com/issues/47998
Reported-by: Geng Jichao <gengjichao@jd.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Committed by Jeff Layton
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-