- 13 11月, 2019 2 次提交
-
-
由 Tejun Heo 提交于
kernfs_find_and_get_node_by_ino() uses RCU protection. It's currently a bit buggy because it can look up a node which hasn't been activated yet and thus may end up exposing a node that the kernfs user is still prepping. While it can be fixed by pushing it further in the current direction, it's already complicated and isn't clear whether the complexity is justified. The main use of kernfs_find_and_get_node_by_ino() is for exportfs operations. They aren't super hot and all the follow-up operations (e.g. mapping to path) use normal locking anyway. Let's switch to a dumber locking scheme and protect the lookup with kernfs_idr_lock. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Namhyung Kim <namhyung@kernel.org>
-
由 Tejun Heo 提交于
When the 32bit ino wraps around, kernfs increments the generation number to distinguish reused ino instances. The wrap-around detection tests whether the allocated ino is lower than what the cursor but the cursor is pointing to the next ino to allocate so the condition never triggers. Fix it by remembering the last ino and comparing against that. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: 4a3ef68a ("kernfs: implement i_generation") Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org # v4.14+
-
- 08 8月, 2019 1 次提交
-
-
由 Greg Kroah-Hartman 提交于
This reverts commit cc798c83. Tony writes: Somehow this causes a regression in Linux next for me where I'm seeing lots of sysfs entries now missing under /sys/bus/platform/devices. For example, I now only see one .serial entry show up in sysfs. Things work again if I revert commit cc798c83 ("kernfs: fix memleak inkernel_ops_readdir()"). Any ideas why that would be? Tejun says: Ugh, you're right. It can get double-put cuz ctx->pos is put by release too. So reverting it for now. Reported-by: NTony Lindgren <tony@atomide.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Tejun Heo <tj@kernel.org> Fixes: cc798c83 ("kernfs: fix memleak in kernel_ops_readdir()") Cc: stable <stable@vger.kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 06 8月, 2019 1 次提交
-
-
由 Andrea Arcangeli 提交于
If getdents64 is killed or hits on segfault, it'll leave cgroups directories in sysfs pinned leaking memory because the kernfs node won't be freed on rmdir and the parent neither. Repro: # for i in `seq 1000`; do mkdir $i; done # rmdir * # for i in `seq 1000`; do mkdir $i; done # rmdir * # for i in `seq 1000`; do while :; do ls $i/ >/dev/null; done & done # while :; do killall ls; done kernfs_node_cache in /proc/slabinfo keeps going up as expected. Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com> Signed-off-by: NTejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org # goes way back to original sysfs days Link: https://lore.kernel.org/r/20190805173404.GF136335@devbig004.ftw2.facebook.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 7月, 2019 2 次提交
-
-
由 Jia-Ju Bai 提交于
In kernfs_path_from_node_locked(), there is an if statement on line 147 to check whether buf is NULL: if (buf) When buf is NULL, it is used on line 151: len += strlcpy(buf + len, parent_str, ...) and line 158: len += strlcpy(buf + len, "/", ...) and line 160: len += strlcpy(buf + len, kn->name, ...) Thus, possible null-pointer dereferences may occur. To fix these possible bugs, buf is checked before being used. If it is NULL, -EINVAL is returned. These bugs are found by a static analysis tool STCheck written by us. Signed-off-by: NJia-Ju Bai <baijiaju1990@gmail.com> Link: https://lore.kernel.org/r/20190724022242.27505-1-baijiaju1990@gmail.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Peng Wang 提交于
Get root safely after kn is ensureed to be not null. Signed-off-by: NPeng Wang <rocking@whu.edu.cn> Acked-by: NTejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20190708151611.13242-1-rocking@whu.edu.cnSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 05 6月, 2019 1 次提交
-
-
由 Thomas Gleixner 提交于
Based on 1 normalized pattern(s): this file is released under the gplv2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 68 file(s). Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NArmijn Hemel <armijn@tjaldur.nl> Reviewed-by: NAllison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531190114.292346262@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 26 4月, 2019 1 次提交
-
-
由 Andrea Parri 提交于
smp_mb__before_atomic() can not be applied to atomic_set(). Remove the barrier and rely on RELEASE synchronization. Fixes: ba16b284 ("kernfs: add an API to get kernfs node from inode number") Cc: stable@vger.kernel.org Signed-off-by: NAndrea Parri <andrea.parri@amarulasolutions.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 21 3月, 2019 3 次提交
-
-
由 Ondrej Mosnacek 提交于
Use the new security_kernfs_init_security() hook to allow LSMs to possibly assign a non-default security context to a newly created kernfs node based on the attributes of the new node and also its parent node. This fixes an issue with cgroupfs under SELinux, where newly created cgroup subdirectories/files would not inherit its parent's context if it had been set explicitly to a non-default value (other than the genfs context specified by the policy). This can be reproduced as follows (on Fedora/RHEL): # mkdir /sys/fs/cgroup/unified/test # # Need permissive to change the label under Fedora policy: # setenforce 0 # chcon -t container_file_t /sys/fs/cgroup/unified/test # ls -lZ /sys/fs/cgroup/unified total 0 -r--r--r--. 1 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:06 cgroup.controllers -rw-r--r--. 1 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:06 cgroup.max.depth -rw-r--r--. 1 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:06 cgroup.max.descendants -rw-r--r--. 1 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:06 cgroup.procs -r--r--r--. 1 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:06 cgroup.stat -rw-r--r--. 1 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:06 cgroup.subtree_control -rw-r--r--. 1 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:06 cgroup.threads drwxr-xr-x. 2 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:06 init.scope drwxr-xr-x. 26 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:21 system.slice drwxr-xr-x. 3 root root system_u:object_r:container_file_t:s0 0 Jan 29 03:15 test drwxr-xr-x. 3 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:06 user.slice # mkdir /sys/fs/cgroup/unified/test/subdir Actual result: # ls -ldZ /sys/fs/cgroup/unified/test/subdir drwxr-xr-x. 2 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:15 /sys/fs/cgroup/unified/test/subdir Expected result: # ls -ldZ /sys/fs/cgroup/unified/test/subdir drwxr-xr-x. 2 root root unconfined_u:object_r:container_file_t:s0 0 Jan 29 03:15 /sys/fs/cgroup/unified/test/subdir Link: https://github.com/SELinuxProject/selinux-kernel/issues/39Signed-off-by: NOndrej Mosnacek <omosnace@redhat.com> Acked-by: NCasey Schaufler <casey@schaufler-ca.com> Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Ondrej Mosnacek 提交于
Replace the special handling of security xattrs with simple_xattrs, as is already done for the trusted xattrs. This simplifies the code and allows LSMs to use more than just a single xattr to do their business. Signed-off-by: NOndrej Mosnacek <omosnace@redhat.com> Acked-by: NCasey Schaufler <casey@schaufler-ca.com> [PM: manual merge fixes] Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
由 Ondrej Mosnacek 提交于
Right now, kernfs_iattrs embeds the whole struct iattr, even though it doesn't really use half of its fields... This both leads to wasting space and makes the code look awkward. Let's just list the few fields we need directly in struct kernfs_iattrs. Signed-off-by: NOndrej Mosnacek <omosnace@redhat.com> Acked-by: NCasey Schaufler <casey@schaufler-ca.com> [PM: merged a number of chunks manually due to fuzz] Signed-off-by: NPaul Moore <paul@paul-moore.com>
-
- 08 2月, 2019 1 次提交
-
-
由 Ayush Mittal 提交于
Creating a new cache for kernfs_iattrs. Currently, memory is allocated with kzalloc() which always gives aligned memory. On ARM, this is 64 byte aligned. To avoid the wastage of memory in aligning the size requested, a new cache for kernfs_iattrs is created. Size of struct kernfs_iattrs is 80 Bytes. On ARM, it will come in kmalloc-128 slab. and it will come in kmalloc-192 slab if debug info is enabled. Extra bytes taken 48 bytes. Total number of objects created : 4096 Total saving = 48*4096 = 192 KB After creating new slab(When debug info is enabled) : sh-3.2# cat /proc/slabinfo ... kernfs_iattrs_cache 4069 4096 128 32 1 : tunables 0 0 0 : slabdata 128 128 0 ... All testing has been done on ARM target. Signed-off-by: NAyush Mittal <ayush.m@samsung.com> Signed-off-by: NVaneet Narang <v.narang@samsung.com> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 21 7月, 2018 1 次提交
-
-
由 Dmitry Torokhov 提交于
This change allows creating kernfs files and directories with arbitrary uid/gid instead of always using GLOBAL_ROOT_UID/GID by extending kernfs_create_dir_ns() and kernfs_create_file_ns() with uid/gid arguments. The "simple" kernfs_create_file() and kernfs_create_dir() are left alone and always create objects belonging to the global root. When creating symlinks ownership (uid/gid) is taken from the target kernfs object. Co-Developed-by: NTyler Hicks <tyhicks@canonical.com> Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: NTyler Hicks <tyhicks@canonical.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 6月, 2018 1 次提交
-
-
由 Deepa Dinamani 提交于
struct timespec is not y2038 safe. Transition vfs to use y2038 safe struct timespec64 instead. The change was made with the help of the following cocinelle script. This catches about 80% of the changes. All the header file and logic changes are included in the first 5 rules. The rest are trivial substitutions. I avoid changing any of the function signatures or any other filesystem specific data structures to keep the patch simple for review. The script can be a little shorter by combining different cases. But, this version was sufficient for my usecase. virtual patch @ depends on patch @ identifier now; @@ - struct timespec + struct timespec64 current_time ( ... ) { - struct timespec now = current_kernel_time(); + struct timespec64 now = current_kernel_time64(); ... - return timespec_trunc( + return timespec64_trunc( ... ); } @ depends on patch @ identifier xtime; @@ struct \( iattr \| inode \| kstat \) { ... - struct timespec xtime; + struct timespec64 xtime; ... } @ depends on patch @ identifier t; @@ struct inode_operations { ... int (*update_time) (..., - struct timespec t, + struct timespec64 t, ...); ... } @ depends on patch @ identifier t; identifier fn_update_time =~ "update_time$"; @@ fn_update_time (..., - struct timespec *t, + struct timespec64 *t, ...) { ... } @ depends on patch @ identifier t; @@ lease_get_mtime( ... , - struct timespec *t + struct timespec64 *t ) { ... } @te depends on patch forall@ identifier ts; local idexpression struct inode *inode_node; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; identifier fn_update_time =~ "update_time$"; identifier fn; expression e, E3; local idexpression struct inode *node1; local idexpression struct inode *node2; local idexpression struct iattr *attr1; local idexpression struct iattr *attr2; local idexpression struct iattr attr; identifier i_xtime1 =~ "^i_[acm]time$"; identifier i_xtime2 =~ "^i_[acm]time$"; identifier ia_xtime1 =~ "^ia_[acm]time$"; identifier ia_xtime2 =~ "^ia_[acm]time$"; @@ ( ( - struct timespec ts; + struct timespec64 ts; | - struct timespec ts = current_time(inode_node); + struct timespec64 ts = current_time(inode_node); ) <+... when != ts ( - timespec_equal(&inode_node->i_xtime, &ts) + timespec64_equal(&inode_node->i_xtime, &ts) | - timespec_equal(&ts, &inode_node->i_xtime) + timespec64_equal(&ts, &inode_node->i_xtime) | - timespec_compare(&inode_node->i_xtime, &ts) + timespec64_compare(&inode_node->i_xtime, &ts) | - timespec_compare(&ts, &inode_node->i_xtime) + timespec64_compare(&ts, &inode_node->i_xtime) | ts = current_time(e) | fn_update_time(..., &ts,...) | inode_node->i_xtime = ts | node1->i_xtime = ts | ts = inode_node->i_xtime | <+... attr1->ia_xtime ...+> = ts | ts = attr1->ia_xtime | ts.tv_sec | ts.tv_nsec | btrfs_set_stack_timespec_sec(..., ts.tv_sec) | btrfs_set_stack_timespec_nsec(..., ts.tv_nsec) | - ts = timespec64_to_timespec( + ts = ... -) | - ts = ktime_to_timespec( + ts = ktime_to_timespec64( ...) | - ts = E3 + ts = timespec_to_timespec64(E3) | - ktime_get_real_ts(&ts) + ktime_get_real_ts64(&ts) | fn(..., - ts + timespec64_to_timespec(ts) ,...) ) ...+> ( <... when != ts - return ts; + return timespec64_to_timespec(ts); ...> ) | - timespec_equal(&node1->i_xtime1, &node2->i_xtime2) + timespec64_equal(&node1->i_xtime2, &node2->i_xtime2) | - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2) + timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2) | - timespec_compare(&node1->i_xtime1, &node2->i_xtime2) + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2) | node1->i_xtime1 = - timespec_trunc(attr1->ia_xtime1, + timespec64_trunc(attr1->ia_xtime1, ...) | - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2, + attr1->ia_xtime1 = timespec64_trunc(attr2->ia_xtime2, ...) | - ktime_get_real_ts(&attr1->ia_xtime1) + ktime_get_real_ts64(&attr1->ia_xtime1) | - ktime_get_real_ts(&attr.ia_xtime1) + ktime_get_real_ts64(&attr.ia_xtime1) ) @ depends on patch @ struct inode *node; struct iattr *attr; identifier fn; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; expression e; @@ ( - fn(node->i_xtime); + fn(timespec64_to_timespec(node->i_xtime)); | fn(..., - node->i_xtime); + timespec64_to_timespec(node->i_xtime)); | - e = fn(attr->ia_xtime); + e = fn(timespec64_to_timespec(attr->ia_xtime)); ) @ depends on patch forall @ struct inode *node; struct iattr *attr; identifier i_xtime =~ "^i_[acm]time$"; identifier ia_xtime =~ "^ia_[acm]time$"; identifier fn; @@ { + struct timespec ts; <+... ( + ts = timespec64_to_timespec(node->i_xtime); fn (..., - &node->i_xtime, + &ts, ...); | + ts = timespec64_to_timespec(attr->ia_xtime); fn (..., - &attr->ia_xtime, + &ts, ...); ) ...+> } @ depends on patch forall @ struct inode *node; struct iattr *attr; struct kstat *stat; identifier ia_xtime =~ "^ia_[acm]time$"; identifier i_xtime =~ "^i_[acm]time$"; identifier xtime =~ "^[acm]time$"; identifier fn, ret; @@ { + struct timespec ts; <+... ( + ts = timespec64_to_timespec(node->i_xtime); ret = fn (..., - &node->i_xtime, + &ts, ...); | + ts = timespec64_to_timespec(node->i_xtime); ret = fn (..., - &node->i_xtime); + &ts); | + ts = timespec64_to_timespec(attr->ia_xtime); ret = fn (..., - &attr->ia_xtime, + &ts, ...); | + ts = timespec64_to_timespec(attr->ia_xtime); ret = fn (..., - &attr->ia_xtime); + &ts); | + ts = timespec64_to_timespec(stat->xtime); ret = fn (..., - &stat->xtime); + &ts); ) ...+> } @ depends on patch @ struct inode *node; struct inode *node2; identifier i_xtime1 =~ "^i_[acm]time$"; identifier i_xtime2 =~ "^i_[acm]time$"; identifier i_xtime3 =~ "^i_[acm]time$"; struct iattr *attrp; struct iattr *attrp2; struct iattr attr ; identifier ia_xtime1 =~ "^ia_[acm]time$"; identifier ia_xtime2 =~ "^ia_[acm]time$"; struct kstat *stat; struct kstat stat1; struct timespec64 ts; identifier xtime =~ "^[acmb]time$"; expression e; @@ ( ( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1 ; | node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \); | node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \); | node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \); | stat->xtime = node2->i_xtime1; | stat1.xtime = node2->i_xtime1; | ( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1 ; | ( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2; | - e = node->i_xtime1; + e = timespec64_to_timespec( node->i_xtime1 ); | - e = attrp->ia_xtime1; + e = timespec64_to_timespec( attrp->ia_xtime1 ); | node->i_xtime1 = current_time(...); | node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = - e; + timespec_to_timespec64(e); | node->i_xtime1 = node->i_xtime3 = - e; + timespec_to_timespec64(e); | - node->i_xtime1 = e; + node->i_xtime1 = timespec_to_timespec64(e); ) Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com> Cc: <anton@tuxera.com> Cc: <balbi@kernel.org> Cc: <bfields@fieldses.org> Cc: <darrick.wong@oracle.com> Cc: <dhowells@redhat.com> Cc: <dsterba@suse.com> Cc: <dwmw2@infradead.org> Cc: <hch@lst.de> Cc: <hirofumi@mail.parknet.co.jp> Cc: <hubcap@omnibond.com> Cc: <jack@suse.com> Cc: <jaegeuk@kernel.org> Cc: <jaharkes@cs.cmu.edu> Cc: <jslaby@suse.com> Cc: <keescook@chromium.org> Cc: <mark@fasheh.com> Cc: <miklos@szeredi.hu> Cc: <nico@linaro.org> Cc: <reiserfs-devel@vger.kernel.org> Cc: <richard@nod.at> Cc: <sage@redhat.com> Cc: <sfrench@samba.org> Cc: <swhiteho@redhat.com> Cc: <tj@kernel.org> Cc: <trond.myklebust@primarydata.com> Cc: <tytso@mit.edu> Cc: <viro@zeniv.linux.org.uk>
-
- 29 7月, 2017 5 次提交
-
-
由 Shaohua Li 提交于
inode number and generation can identify a kernfs node. We are going to export the identification by exportfs operations, so put ino and generation into a separate structure. It's convenient when later patches use the identification. Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NShaohua Li <shli@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Shaohua Li 提交于
When working on adding exportfs operations in kernfs, I found it's hard to initialize dentry->d_fsdata in the exportfs operations. Looks there is no way to do it without race condition. Look at the kernfs code closely, there is no point to set dentry->d_fsdata. inode->i_private already points to kernfs_node, and we can get inode from a dentry. So this patch just delete the d_fsdata usage. Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NShaohua Li <shli@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Shaohua Li 提交于
Add an API to get kernfs node from inode number. We will need this to implement exportfs operations. This API will be used in blktrace too later, so it should be as fast as possible. To make the API lock free, kernfs node is freed in RCU context. And we depend on kernfs_node count/ino number to filter out stale kernfs nodes. Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NShaohua Li <shli@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Shaohua Li 提交于
Set i_generation for kernfs inode. This is required to implement exportfs operations. The generation is 32-bit, so it's possible the generation wraps up and we find stale files. To reduce the posssibility, we don't reuse inode numer immediately. When the inode number allocation wraps, we increase generation number. In this way generation/inode number consist of a 64-bit number which is unlikely duplicated. This does make the idr tree more sparse and waste some memory. Since idr manages 32-bit keys, idr uses a 6-level radix tree, each level covers 6 bits of the key. In a 100k inode kernfs, the worst case will have around 300k radix tree node. Each node is 576bytes, so the tree will use about ~150M memory. Sounds not too bad, if this really is a problem, we should find better data structure. Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NShaohua Li <shli@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Shaohua Li 提交于
kernfs uses ida to manage inode number. The problem is we can't get kernfs_node from inode number with ida. Switching to use idr, next patch will add an API to get kernfs_node from inode number. Acked-by: NTejun Heo <tj@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NShaohua Li <shli@fb.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 10 2月, 2017 1 次提交
-
-
由 Konstantin Khlebnikov 提交于
Null kernfs nodes could be found at cgroups during construction. It seems safer to handle these null pointers right in kernfs in the same way as printf prints "(null)" for null pointer string. Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 28 12月, 2016 1 次提交
-
-
由 Tejun Heo 提交于
Add ->open/release() methods to kernfs_ops. ->open() is called when the file is opened and ->release() when the file is either released or severed. These callbacks can be used, for example, to manage persistent caching objects over multiple seq_file iterations. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: NAcked-by: Zefan Li <lizefan@huawei.com>
-
- 08 10月, 2016 1 次提交
-
-
由 Andreas Gruenbacher 提交于
These inode operations are no longer used; remove them. Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 07 10月, 2016 1 次提交
-
-
由 Andreas Gruenbacher 提交于
Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: NTejun Heo <tj@kernel.org> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 27 9月, 2016 2 次提交
-
-
由 Miklos Szeredi 提交于
Generated patch: sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2` sed -i "s/\brename2\b/rename/g" `git grep -wl rename2` Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>
-
由 Miklos Szeredi 提交于
This is trivial to do: - add flags argument to foo_rename() - check if flags is zero - assign foo_rename() to .rename2 instead of .rename This doesn't mean it's impossible to support RENAME_NOREPLACE for these filesystems, but it is not trivial, like for local filesystems. RENAME_NOREPLACE must guarantee atomicity (i.e. it shouldn't be possible for a file to be created on one host while it is overwritten by rename on another host). Filesystems converted: 9p, afs, ceph, coda, ecryptfs, kernfs, lustre, ncpfs, nfs, ocfs2, orangefs. After this, we can get rid of the duplicate interfaces for rename. Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: David Howells <dhowells@redhat.com> [AFS] Acked-by: NMike Marshall <hubcap@omnibond.com> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jan Harkes <jaharkes@cs.cmu.edu> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Mark Fasheh <mfasheh@suse.com>
-
- 10 8月, 2016 2 次提交
-
-
由 Tejun Heo 提交于
It doesn't have any in-kernel user and the same result can be obtained from kernfs_path(@kn, NULL, 0). Remove it. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
-
由 Tejun Heo 提交于
kernfs_path*() functions always return the length of the full path but the path content is undefined if the length is larger than the provided buffer. This makes its behavior different from strlcpy() and requires error handling in all its users even when they don't care about truncation. In addition, the implementation can actully be simplified by making it behave properly in strlcpy() style. * Update kernfs_path_from_node_locked() to always fill up the buffer with path. If the buffer is not large enough, the output is truncated and terminated. * kernfs_path() no longer needs error handling. Make it a simple inline wrapper around kernfs_path_from_node(). * sysfs_warn_dup()'s use of kernfs_path() doesn't need error handling. Updated accordingly. * cgroup_path()'s use of kernfs_path() updated to retain the old behavior. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: NSerge Hallyn <serge.hallyn@ubuntu.com>
-
- 11 6月, 2016 1 次提交
-
-
由 Linus Torvalds 提交于
We always mixed in the parent pointer into the dentry name hash, but we did it late at lookup time. It turns out that we can simplify that lookup-time action by salting the hash with the parent pointer early instead of late. A few other users of our string hashes also wanted to mix in their own pointers into the hash, and those are updated to use the same mechanism. Hash users that don't have any particular initial salt can just use the NULL pointer as a no-salt. Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 09 5月, 2016 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 03 5月, 2016 1 次提交
-
-
由 Serge Hallyn 提交于
We've calculated @len to be the bytes we need for '/..' entries from @kn_from to the common ancestor, and calculated @nlen to be the extra bytes we need to get from the common ancestor to @kn_to. We use them as such at the end. But in the loop copying the actual entries, we overwrite @nlen. Use a temporary variable for that instead. Without this, the return length, when the buffer is large enough, is wrong. (When the buffer is NULL or too small, the returned value is correct. The buffer contents are also correct.) Interestingly, no callers of this function are affected by this as of yet. However the upcoming cgroup_show_path() will be. Signed-off-by: NSerge Hallyn <serge.hallyn@ubuntu.com>
-
- 30 3月, 2016 1 次提交
-
-
由 Deepa Dinamani 提交于
This is in preparation for the series that transitions filesystem timestamps to use 64 bit time and hence make them y2038 safe. CURRENT_TIME macro will be deleted before merging the aforementioned series. Use current_fs_time() instead of CURRENT_TIME for inode timestamps. struct kernfs_node is associated with a sysfs file/ directory. Truncate the values to appropriate time granularity when writing to inode timestamps of the files. ktime_get_real_ts() is used to obtain times for struct kernfs_iattrs. Since these times are later assigned to inode times using timespec_truncate() for all filesystem based operations, we can save the supers list traversal time here by using ktime_get_real_ts() directly. Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 17 2月, 2016 1 次提交
-
-
由 Aditya Kali 提交于
The new function kernfs_path_from_node() generates and returns kernfs path of a given kernfs_node relative to a given parent kernfs_node. Signed-off-by: NAditya Kali <adityakali@google.com> Signed-off-by: NSerge E. Hallyn <serge.hallyn@canonical.com> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 08 2月, 2016 1 次提交
-
-
由 Tejun Heo 提交于
kernfs_walk_ns() uses a static path_buf[PATH_MAX] to separate out path components. Keeping around the 4k buffer just for kernfs_walk_ns() is wasteful. This patch makes it piggyback on kernfs_pr_cont_buf[] instead. This requires kernfs_walk_ns() to hold kernfs_rename_lock. Signed-off-by: NTejun Heo <tj@kernel.org> Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 23 1月, 2016 1 次提交
-
-
由 Al Viro 提交于
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 15 1月, 2016 1 次提交
-
-
由 Vladimir Davydov 提交于
Currently, all kmem allocations (namely every kmem_cache_alloc, kmalloc, alloc_kmem_pages call) are accounted to memory cgroup automatically. Callers have to explicitly opt out if they don't want/need accounting for some reason. Such a design decision leads to several problems: - kmalloc users are highly sensitive to failures, many of them implicitly rely on the fact that kmalloc never fails, while memcg makes failures quite plausible. - A lot of objects are shared among different containers by design. Accounting such objects to one of containers is just unfair. Moreover, it might lead to pinning a dead memcg along with its kmem caches, which aren't tiny, which might result in noticeable increase in memory consumption for no apparent reason in the long run. - There are tons of short-lived objects. Accounting them to memcg will only result in slight noise and won't change the overall picture, but we still have to pay accounting overhead. For more info, see - http://lkml.kernel.org/r/20151105144002.GB15111%40dhcp22.suse.cz - http://lkml.kernel.org/r/20151106090555.GK29259@esperanza Therefore this patchset switches to the white list policy. Now kmalloc users have to explicitly opt in by passing __GFP_ACCOUNT flag. Currently, the list of accounted objects is quite limited and only includes those allocations that (1) are known to be easily triggered from userspace and (2) can fail gracefully (for the full list see patch no. 6) and it still misses many object types. However, accounting only those objects should be a satisfactory approximation of the behavior we used to have for most sane workloads. This patch (of 6): Revert 499611ed ("kernfs: do not account ino_ida allocations to memcg"). Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be fragile and difficult to maintain, because there seem to be many more allocations that should not be accounted than those that should be. Besides, false accounting an allocation might result in much worse consequences than not accounting at all, namely increased memory consumption due to pinned dead kmem caches. So it was decided to switch to the white-list policy. This patch reverts bits introducing the black-list policy. The white-list policy will be introduced later in the series. Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 11月, 2015 1 次提交
-
-
由 Tejun Heo 提交于
Implement kernfs_walk_and_get() which is similar to kernfs_find_and_get() but can walk a path instead of just a name. v2: Use strlcpy() instead of strlen() + memcpy() as suggested by David. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: David Miller <davem@davemloft.net>
-
- 19 8月, 2015 1 次提交
-
-
由 Tejun Heo 提交于
Add a function to determine the path length of a kernfs node. This for now will be used by writeback tracepoint updates. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: NJens Axboe <axboe@fb.com>
-
- 01 7月, 2015 1 次提交
-
-
由 Eric W. Biederman 提交于
Add a new function kernfs_create_empty_dir that can be used to create directory that can not be modified. Update the code to use make_empty_dir_inode when reporting a permanently empty directory to the vfs. Update the code to not allow adding to permanently empty directories. Cc: stable@vger.kernel.org Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
-
- 15 5月, 2015 1 次提交
-
-
由 Vladimir Davydov 提交于
root->ino_ida is used for kernfs inode number allocations. Since IDA has a layered structure, different IDs can reside on the same layer, which is currently accounted to some memory cgroup. The problem is that each kmem cache of a memory cgroup has its own directory on sysfs (under /sys/fs/kernel/<cache-name>/cgroup). If the inode number of such a directory or any file in it gets allocated from a layer accounted to the cgroup which the cache is created for, the cgroup will get pinned for good, because one has to free all kmem allocations accounted to a cgroup in order to release it and destroy all its kmem caches. That said we must not account layers of ino_ida to any memory cgroup. Since per net init operations may create new sysfs entries directly (e.g. lo device) or indirectly (nf_conntrack creates a new kmem cache per each namespace, which, in turn, creates new sysfs entries), an easy way to reproduce this issue is by creating network namespace(s) from inside a kmem-active memory cgroup. Signed-off-by: NVladimir Davydov <vdavydov@parallels.com> Acked-by: NTejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Greg Thelen <gthelen@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> [4.0.x] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 4月, 2015 1 次提交
-
-
由 David Howells 提交于
that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-