- 08 4月, 2021 4 次提交
-
-
由 Al Viro 提交于
Zero it in set_nameidata() rather than in path_init(). That way it always matches the number of valid nd->stack[] entries. Since terminate_walk() does zero it (after having emptied the stack), we don't need to reinitialize it in subsequent path_init(). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
That way we don't need the callers to mess with manually setting any fields of nameidata instances. Old set_nameidata() gets renamed (__set_nameidata()), new becomes an inlined helper that takes a struct path pointer and deals with setting nd->root and putting ND_ROOT_PRESET in nd->state when new argument is non-NULL. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Separate field in nameidata (nd->state) holding the flags that should be internal-only - that way we both get some spare bits in LOOKUP_... and get simpler rules for nd->root lifetime rules, since we can set the replacement of LOOKUP_ROOT (ND_ROOT_PRESET) at the same time we set nd->root. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
... and provide file_open_root_mnt(), using the root of given mount. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 07 4月, 2021 2 次提交
-
-
由 Al Viro 提交于
That (and traversals in case of umount .) should be done before complete_walk(). Either a braino or mismerge damage on queue reorders - either way, I should've spotted that much earlier. Fucked-up-by: NAl Viro <viro@zeniv.linux.org.uk> X-Paperbag: Brown Fixes: 161aff1d "LOOKUP_MOUNTPOINT: fold path_mountpointat() into path_lookupat()" Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
Initialize them in set_nameidata() and make sure that terminate_walk() clears them once the pointers become potentially invalid (i.e. we leave RCU mode or drop them in non-RCU one). Currently we have "path_init() always initializes them and nobody accesses them outside of path_init()/terminate_walk() segments", which is asking for trouble. With that change we would have nd->path.{mnt,dentry} 1) always valid - NULL or pointing to currently allocated objects. 2) non-NULL while we are successfully walking 3) NULL when we are not walking at all 4) contributing to refcounts whenever non-NULL outside of RCU mode. Fixes: 6c6ec2b0 ("fs: add support for LOOKUP_CACHED") Reported-by: syzbot+c88a7030da47945a3cc3@syzkaller.appspotmail.com Tested-by: NChristian Brauner <christian.brauner@ubuntu.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 23 3月, 2021 3 次提交
-
-
由 Randy Dunlap 提交于
Fix kernel-doc warnings in namei.c: ../fs/namei.c:1149: warning: Excess function parameter 'dir_mode' description in 'may_create_in_sticky' ../fs/namei.c:1149: warning: Excess function parameter 'dir_uid' description in 'may_create_in_sticky' ../fs/namei.c:3396: warning: Function parameter or member 'open_flag' not described in 'vfs_tmpfile' ../fs/namei.c:3396: warning: Excess function parameter 'open_flags' description in 'vfs_tmpfile' ../fs/namei.c:4460: warning: Function parameter or member 'rd' not described in 'vfs_rename' ../fs/namei.c:4460: warning: Excess function parameter 'old_mnt_userns' description in 'vfs_rename' ../fs/namei.c:4460: warning: Excess function parameter 'old_dir' description in 'vfs_rename' ../fs/namei.c:4460: warning: Excess function parameter 'old_dentry' description in 'vfs_rename' ../fs/namei.c:4460: warning: Excess function parameter 'new_mnt_userns' description in 'vfs_rename' ../fs/namei.c:4460: warning: Excess function parameter 'new_dir' description in 'vfs_rename' ../fs/namei.c:4460: warning: Excess function parameter 'new_dentry' description in 'vfs_rename' ../fs/namei.c:4460: warning: Excess function parameter 'delegated_inode' description in 'vfs_rename' ../fs/namei.c:4460: warning: Excess function parameter 'flags' description in 'vfs_rename' Link: https://lore.kernel.org/r/20210216042929.8931-3-rdunlap@infradead.org Fixes: 9fe61450 ("namei: introduce struct renamedata") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Acked-by: NChristian Brauner <christian.brauner@ubuntu.com> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
由 Christian Brauner 提交于
Don't open-code the checks and instead move them into a clean little helper we can call. This also reduces the risk that if we ever change something we forget to change all locations. Link: https://lore.kernel.org/r/20210320122623.599086-4-christian.brauner@ubuntu.comInspired-by: NVivek Goyal <vgoyal@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
由 Christian Brauner 提交于
Vivek pointed out that the fs{g,u}id_into_mnt() naming scheme can be misleading as it could be understood as implying they do the exact same thing as i_{g,u}id_into_mnt(). The original motivation for this naming scheme was to signal to callers that the helpers will always take care to map the k{g,u}id such that the ownership is expressed in terms of the mnt_users. Get rid of the confusion by renaming those helpers to something more sensible. Al suggested mapped_fs{g,u}id() which seems a really good fit. Usually filesystems don't need to bother with these helpers directly only in some cases where they allocate objects that carry {g,u}ids which are either filesystem specific (e.g. xfs quota objects) or don't have a clean set of helpers as inodes have. Link: https://lore.kernel.org/r/20210320122623.599086-3-christian.brauner@ubuntu.comInspired-by: NVivek Goyal <vgoyal@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
- 21 2月, 2021 1 次提交
-
-
由 Al Viro 提交于
After switching to non-RCU mode, we want nd->depth to match the number of entries in nd->stack[] that need eventual path_put(). legitimize_links() takes care of that on failures; unfortunately, failure exits added for LOOKUP_CACHED do not. We could add the logics for that into those failure exits, both in try_to_unlazy() and in try_to_unlazy_next(), but since both checks are immediately followed by legitimize_links() and there's no calls of legitimize_links() other than those two... It's easier to move the check (and required handling of nd->depth on failure) into legitimize_links() itself. [caught by Jens: ... and since we are zeroing ->depth here, we need to do drop_links() first] Fixes: 6c6ec2b0 "fs: add support for LOOKUP_CACHED" Tested-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 24 1月, 2021 9 次提交
-
-
由 Christian Brauner 提交于
IMA does sometimes access the inode's i_uid and compares it against the rules' fowner. Enable IMA to handle idmapped mounts by passing down the mount's user namespace. We simply make use of the helpers we introduced before. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-27-christian.brauner@ubuntu.comSigned-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
由 Christian Brauner 提交于
Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has been marked with. This can be used for additional permission checking and also to enable filesystems to translate between uids and gids if they need to. We have implemented all relevant helpers in earlier patches. As requested we simply extend the exisiting inode method instead of introducing new ones. This is a little more code churn but it's mostly mechanical and doesnt't leave us with additional inode methods. Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
由 Christian Brauner 提交于
When truncating files the vfs will verify that the caller is privileged over the inode. Extend it to handle idmapped mounts. If the inode is accessed through an idmapped mount it is mapped according to the mount's user namespace. Afterwards the permissions checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-16-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
由 Christian Brauner 提交于
The various vfs_*() helpers are called by filesystems or by the vfs itself to perform core operations such as create, link, mkdir, mknod, rename, rmdir, tmpfile and unlink. Enable them to handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace and pass it down. Afterwards the checks and operations are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-15-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
由 Christian Brauner 提交于
In order to handle idmapped mounts we will extend the vfs rename helper to take two new arguments in follow up patches. Since this operations already takes a bunch of arguments add a simple struct renamedata and make the current helper use it before we extend it. Link: https://lore.kernel.org/r/20210121131959.646623-14-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
由 Christian Brauner 提交于
The may_follow_link(), may_linkat(), may_lookup(), may_open(), may_o_create(), may_create_in_sticky(), may_delete(), and may_create() helpers determine whether the caller is privileged enough to perform the associated operations. Let them handle idmapped mounts by mapping the inode or fsids according to the mount's user namespace. Afterwards the checks are identical to non-idmapped inodes. The patch takes care to retrieve the mount's user namespace right before performing permission checks and passing it down into the fileystem so the user namespace can't change in between by someone idmapping a mount that is currently not idmapped. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-13-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJames Morris <jamorris@linux.microsoft.com> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
由 Christian Brauner 提交于
The inode_owner_or_capable() helper determines whether the caller is the owner of the inode or is capable with respect to that inode. Allow it to handle idmapped mounts. If the inode is accessed through an idmapped mount it according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Similarly, allow the inode_init_owner() helper to handle idmapped mounts. It initializes a new inode on idmapped mounts by mapping the fsuid and fsgid of the caller from the mount's user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-7-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJames Morris <jamorris@linux.microsoft.com> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
由 Christian Brauner 提交于
The two helpers inode_permission() and generic_permission() are used by the vfs to perform basic permission checking by verifying that the caller is privileged over an inode. In order to handle idmapped mounts we extend the two helpers with an additional user namespace argument. On idmapped mounts the two helpers will make sure to map the inode according to the mount's user namespace and then peform identical permission checks to inode_permission() and generic_permission(). If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-6-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJames Morris <jamorris@linux.microsoft.com> Acked-by: NSerge Hallyn <serge@hallyn.com> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
由 Christian Brauner 提交于
In order to determine whether a caller holds privilege over a given inode the capability framework exposes the two helpers privileged_wrt_inode_uidgid() and capable_wrt_inode_uidgid(). The former verifies that the inode has a mapping in the caller's user namespace and the latter additionally verifies that the caller has the requested capability in their current user namespace. If the inode is accessed through an idmapped mount map it into the mount's user namespace. Afterwards the checks are identical to non-idmapped inodes. If the initial user namespace is passed all operations are a nop so non-idmapped mounts will not see a change in behavior. Link: https://lore.kernel.org/r/20210121131959.646623-5-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJames Morris <jamorris@linux.microsoft.com> Acked-by: NSerge Hallyn <serge@hallyn.com> Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com>
-
- 05 1月, 2021 3 次提交
-
-
由 Jens Axboe 提交于
io_uring always punts opens to async context, since there's no control over whether the lookup blocks or not. Add LOOKUP_CACHED to support just doing the fast RCU based lookups, which we know will not block. If we can do a cached path resolution of the filename, then we don't have to always punt lookups for a worker. During path resolution, we always do LOOKUP_RCU first. If that fails and we terminate LOOKUP_RCU, then fail a LOOKUP_CACHED attempt as well. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
same as for the previous commit - instead of 0/-ECHILD make it return true/false, rename to try_to_unlazy_child(). Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Jens Axboe 提交于
Most callers check for non-zero return, and assume it's -ECHILD (which it always will be). One caller uses the actual error return. Clean this up and make it fully consistent, by having unlazy_walk() return a bool instead. Rename it to try_to_unlazy() and return true on success, and failure on error. That's easier to read. No functional changes in this patch. Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJens Axboe <axboe@kernel.dk> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 04 1月, 2021 2 次提交
-
-
由 Steven Rostedt (VMware) 提交于
Running my yearly branch profiling code, it detected a 100% wrong branch condition in name.c for lookup_fast(). The code in question has: status = d_revalidate(dentry, nd->flags); if (likely(status > 0)) return dentry; if (unlazy_child(nd, dentry, seq)) return ERR_PTR(-ECHILD); if (unlikely(status == -ECHILD)) /* we'd been told to redo it in non-rcu mode */ status = d_revalidate(dentry, nd->flags); If the status of the d_revalidate() is greater than zero, then the function finishes. Otherwise, if it is an "unlazy_child" it returns with -ECHILD. After the above two checks, the status is compared to -ECHILD, as that is what is returned if the original d_revalidate() needed to be done in a non-rcu mode. Especially this path is called in a condition of: if (nd->flags & LOOKUP_RCU) { And most of the d_revalidate() functions have: if (flags & LOOKUP_RCU) return -ECHILD; It appears that that is the only case that this if statement is triggered on two of my machines, running in production. As it is dependent on what filesystem mix is configured in the running kernel, simply remove the unlikely() from the if statement. Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
由 Al Viro 提交于
use vfs_open() instead Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 11 12月, 2020 1 次提交
-
-
由 Al Viro 提交于
make sure nd->dir_mode is always initialized after success exit from link_path_walk(); in case of empty path it did not happen. Reported-by: NAnant Thazhemadam <anant.thazhemadam@gmail.com> Tested-by: NAnant Thazhemadam <anant.thazhemadam@gmail.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 10 12月, 2020 1 次提交
-
-
由 Jens Axboe 提交于
Pass in the struct filename pointers instead of the user string, and update the three callers to do the same. This behaves like do_unlinkat(), which also takes a filename struct and puts it when it is done. Converting callers is then trivial. Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 9月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
The last user of SB_I_MULTIROOT is disappeared with commit f2aedb71 ("NFS: Add fs_context support.") Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 28 8月, 2020 1 次提交
-
-
由 Mattias Nissler 提交于
For mounts that have the new "nosymfollow" option, don't follow symlinks when resolving paths. The new option is similar in spirit to the existing "nodev", "noexec", and "nosuid" options, as well as to the LOOKUP_NO_SYMLINKS resolve flag in the openat2(2) syscall. Various BSD variants have been supporting the "nosymfollow" mount option for a long time with equivalent implementations. Note that symlinks may still be created on file systems mounted with the "nosymfollow" option present. readlink() remains functional, so user space code that is aware of symlinks can still choose to follow them explicitly. Setting the "nosymfollow" mount option helps prevent privileged writers from modifying files unintentionally in case there is an unexpected link along the accessed path. The "nosymfollow" option is thus useful as a defensive measure for systems that need to deal with untrusted file systems in privileged contexts. More information on the history and motivation for this patch can be found here: https://sites.google.com/a/chromium.org/dev/chromium-os/chromiumos-design-docs/hardening-against-malicious-stateful-data#TOC-Restricting-symlink-traversalSigned-off-by: NMattias Nissler <mnissler@chromium.org> Signed-off-by: NRoss Zwisler <zwisler@google.com> Reviewed-by: NAleksa Sarai <cyphar@cyphar.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 15 8月, 2020 1 次提交
-
-
由 Kees Cook 提交于
Patch series "Fix S_ISDIR execve() errno". Fix an errno change for execve() of directories, noticed by Marc Zyngier. Along with the fix, include a regression test to avoid seeing this return in the future. This patch (of 2): The return code for attempting to execute a directory has always been EACCES. Adjust the S_ISDIR exec test to reflect the old errno instead of the general EISDIR for other kinds of "open" attempts on directories. Fixes: 633fb6ac ("exec: move S_ISREG() check earlier") Reported-by: NMarc Zyngier <maz@kernel.org> Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Tested-by: NGreg Kroah-Hartman <gregkh@android.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@google.com> Link: http://lkml.kernel.org/r/20200813231723.2725102-2-keescook@chromium.org Link: https://lore.kernel.org/lkml/20200813151305.6191993b@whySigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 13 8月, 2020 3 次提交
-
-
由 Kees Cook 提交于
The path_noexec() check, like the regular file check, was happening too late, letting LSMs see impossible execve()s. Check it earlier as well in may_open() and collect the redundant fs/exec.c path_noexec() test under the same robustness comment as the S_ISREG() check. My notes on the call path, and related arguments, checks, etc: do_open_execat() struct open_flags open_exec_flags = { .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC, .acc_mode = MAY_EXEC, ... do_filp_open(dfd, filename, open_flags) path_openat(nameidata, open_flags, flags) file = alloc_empty_file(open_flags, current_cred()); do_open(nameidata, file, open_flags) may_open(path, acc_mode, open_flag) /* new location of MAY_EXEC vs path_noexec() test */ inode_permission(inode, MAY_OPEN | acc_mode) security_inode_permission(inode, acc_mode) vfs_open(path, file) do_dentry_open(file, path->dentry->d_inode, open) security_file_open(f) open() /* old location of path_noexec() test */ Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: http://lkml.kernel.org/r/20200605160013.3954297-4-keescook@chromium.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kees Cook 提交于
The execve(2)/uselib(2) syscalls have always rejected non-regular files. Recently, it was noticed that a deadlock was introduced when trying to execute pipes, as the S_ISREG() test was happening too late. This was fixed in commit 73601ea5 ("fs/open.c: allow opening only regular files during execve()"), but it was added after inode_permission() had already run, which meant LSMs could see bogus attempts to execute non-regular files. Move the test into the other inode type checks (which already look for other pathological conditions[1]). Since there is no need to use FMODE_EXEC while we still have access to "acc_mode", also switch the test to MAY_EXEC. Also include a comment with the redundant S_ISREG() checks at the end of execve(2)/uselib(2) to note that they are present to avoid any mistakes. My notes on the call path, and related arguments, checks, etc: do_open_execat() struct open_flags open_exec_flags = { .open_flag = O_LARGEFILE | O_RDONLY | __FMODE_EXEC, .acc_mode = MAY_EXEC, ... do_filp_open(dfd, filename, open_flags) path_openat(nameidata, open_flags, flags) file = alloc_empty_file(open_flags, current_cred()); do_open(nameidata, file, open_flags) may_open(path, acc_mode, open_flag) /* new location of MAY_EXEC vs S_ISREG() test */ inode_permission(inode, MAY_OPEN | acc_mode) security_inode_permission(inode, acc_mode) vfs_open(path, file) do_dentry_open(file, path->dentry->d_inode, open) /* old location of FMODE_EXEC vs S_ISREG() test */ security_file_open(f) open() [1] https://lore.kernel.org/lkml/202006041910.9EF0C602@keescook/Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: http://lkml.kernel.org/r/20200605160013.3954297-3-keescook@chromium.orgSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Al Viro 提交于
syzbot reported and bisected a use-after-free due to the recent init cleanups. The putname() should happen only after we'd *not* branched to retry, same as it's done in do_unlinkat(). Reported-by: syzbot+bbeb1c88016c7db4aa24@syzkaller.appspotmail.com Fixes: e24ab0ef "fs: push the getname from do_rmdir into the callers" Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 31 7月, 2020 5 次提交
-
-
由 Christoph Hellwig 提交于
Add a simple helper to mknod with a kernel space file name and switch the early init code over to it. Remove the now unused ksys_mknod. Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
Add a simple helper to mkdir with a kernel space file name and switch the early init code over to it. Remove the now unused ksys_mkdir. Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
Add a simple helper to symlink with a kernel space file name and switch the early init code over to it. Remove the now unused ksys_symlink. Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
Add a simple helper to link with a kernel space file name and switch the early init code over to it. Remove the now unused ksys_link. Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Christoph Hellwig 提交于
This mirrors do_unlinkat and will make life a little easier for the init code to reuse the whole function with a kernel filename. Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 09 6月, 2020 2 次提交
-
-
由 Linus Torvalds 提交于
posix_acl_permission() does not care about MAY_NOT_BLOCK, and in fact the permission logic internally must not check that bit (it's only for upper layers to decide whether they can block to do IO to look up the acl information or not). But the way the code was written, it _looked_ like it cared, since the function explicitly did not mask that bit off. But it has exactly two callers: one for when that bit is set, which first clears the bit before calling posix_acl_permission(), and the other call site when that bit was clear. So stop the silly games "saving" the MAY_NOT_BLOCK bit that must not be used for the actual permission test, and that currently is pointlessly cleared by the callers when the function itself should just not care. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Rasmus Villemoes points out that the 'in_group_p()' tests can be a noticeable expense, and often completely unnecessary. A common situation is that the 'group' bits are the same as the 'other' bits wrt the permissions we want to test. So rewrite 'acl_permission_check()' to not bother checking for group ownership when the permission check doesn't care. For example, if we're asking for read permissions, and both 'group' and 'other' allow reading, there's really no reason to check if we're part of the group or not: either way, we'll allow it. Rasmus says: "On a bog-standard Ubuntu 20.04 install, a workload consisting of compiling lots of userspace programs (i.e., calling lots of short-lived programs that all need to get their shared libs mapped in, and the compilers poking around looking for system headers - lots of /usr/lib, /usr/bin, /usr/include/ accesses) puts in_group_p around 0.1% according to perf top. System-installed files are almost always 0755 (directories and binaries) or 0644, so in most cases, we can avoid the binary search and the cost of pulling the cred->groups array and in_group_p() .text into the cpu cache" Reported-by: NRasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 5月, 2020 1 次提交
-
-
由 Miklos Szeredi 提交于
Whiteouts, unlike real device node should not require privileges to create. The general concern with device nodes is that opening them can have side effects. The kernel already avoids zero major (see Documentation/admin-guide/devices.txt). To be on the safe side the patch explicitly forbids registering a char device with 0/0 number (see cdev_add()). This guarantees that a non-O_PATH open on a whiteout will fail with ENODEV; i.e. it won't have any side effect. Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-