1. 23 May, 2023 1 commit
    • kernfs: fix use-after-free in __kernfs_remove · 9c14d1f4
      Committed by Christian A. Ehrhardt
      stable inclusion
      from stable-v5.10.153
      commit 6f72a3977ba9d0e5491a5c01315204272e7f9c44
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I64YCA
      
      Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6f72a3977ba9d0e5491a5c01315204272e7f9c44
      
      --------------------------------
      
      commit 4abc9965 upstream.
      
      Syzkaller managed to trigger concurrent calls to
      kernfs_remove_by_name_ns() for the same file resulting in
      a KASAN detected use-after-free. The race occurs when the root
      node is freed during kernfs_drain().
      
      To prevent this, acquire an additional reference for the root
      of the tree that is removed before calling __kernfs_remove().
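
      The pattern described above, pinning the root of the subtree with
      an extra reference for the duration of the removal, can be
      illustrated with a minimal single-threaded userspace sketch. Every
      name in it (toy_node, toy_get, toy_put, toy_remove_pinned) is
      hypothetical; it is not the kernfs code and it does not reproduce
      the race, it only shows why the extra reference keeps the node
      valid while it is still being read.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdlib.h>

      /* Hypothetical stand-in for a reference-counted tree node. */
      struct toy_node {
          int refcount;
          bool removed;
      };

      /* Counts frees so the behaviour of the sketch can be checked. */
      static int toy_freed;

      static struct toy_node *toy_new(void)
      {
          struct toy_node *n = calloc(1, sizeof(*n));

          assert(n != NULL);
          n->refcount = 1; /* base reference held by the tree */
          return n;
      }

      static void toy_get(struct toy_node *n)
      {
          assert(n->refcount > 0);
          n->refcount++;
      }

      static void toy_put(struct toy_node *n)
      {
          assert(n->refcount > 0);
          if (--n->refcount == 0) {
              toy_freed++;
              free(n);
          }
      }

      /* Inner removal: drops the tree's base reference, possibly the last one. */
      static void toy_remove_inner(struct toy_node *root)
      {
          if (root->removed)
              return;
          root->removed = true;
          toy_put(root);
      }

      /* Outer removal: pin the root, remove it, read its state, then unpin. */
      static bool toy_remove_pinned(struct toy_node *root)
      {
          bool seen;

          toy_get(root);          /* extra reference for the duration of the call */
          toy_remove_inner(root);
          seen = root->removed;   /* safe: our own reference keeps the node alive */
          toy_put(root);          /* node is freed here if nobody else holds it */
          return seen;
      }
      ```

      Without the toy_get()/toy_put() pair, the read of root->removed
      after toy_remove_inner() would touch freed memory.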
      
      Found by syzkaller with the following reproducer (slab_nomerge is
      required):
      
      syz_mount_image$ext4(0x0, &(0x7f0000000100)='./file0\x00', 0x100000, 0x0, 0x0, 0x0, 0x0)
      r0 = openat(0xffffffffffffff9c, &(0x7f0000000080)='/proc/self/exe\x00', 0x0, 0x0)
      close(r0)
      pipe2(&(0x7f0000000140)={0xffffffffffffffff, <r1=>0xffffffffffffffff}, 0x800)
      mount$9p_fd(0x0, &(0x7f0000000040)='./file0\x00', &(0x7f00000000c0), 0x408, &(0x7f0000000280)={'trans=fd,', {'rfdno', 0x3d, r0}, 0x2c, {'wfdno', 0x3d, r1}, 0x2c, {[{@cache_loose}, {@mmap}, {@loose}, {@loose}, {@mmap}], [{@mask={'mask', 0x3d, '^MAY_EXEC'}}, {@fsmagic={'fsmagic', 0x3d, 0x10001}}, {@dont_hash}]}})
      
      Sample report:
      
      ==================================================================
      BUG: KASAN: use-after-free in kernfs_type include/linux/kernfs.h:335 [inline]
      BUG: KASAN: use-after-free in kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
      BUG: KASAN: use-after-free in __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
      Read of size 2 at addr ffff8880088807f0 by task syz-executor.2/857
      
      CPU: 0 PID: 857 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3 #5
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x6e/0x91 lib/dump_stack.c:106
       print_address_description mm/kasan/report.c:317 [inline]
       print_report.cold+0x5e/0x5e5 mm/kasan/report.c:433
       kasan_report+0xa3/0x130 mm/kasan/report.c:495
       kernfs_type include/linux/kernfs.h:335 [inline]
       kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
       __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
       __kernfs_remove fs/kernfs/dir.c:1356 [inline]
       kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
       sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
       __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
       create_cache mm/slab_common.c:229 [inline]
       kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
       p9_client_create+0xd4d/0x1190 net/9p/client.c:993
       v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
       v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
       legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
       vfs_get_tree+0x85/0x2e0 fs/super.c:1530
       do_new_mount fs/namespace.c:3040 [inline]
       path_mount+0x675/0x1d00 fs/namespace.c:3370
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount fs/namespace.c:3568 [inline]
       __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7f725f983aed
      Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f725f0f7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
      RAX: ffffffffffffffda RBX: 00007f725faa3f80 RCX: 00007f725f983aed
      RDX: 00000000200000c0 RSI: 0000000020000040 RDI: 0000000000000000
      RBP: 00007f725f9f419c R08: 0000000020000280 R09: 0000000000000000
      R10: 0000000000000408 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000006 R14: 00007f725faa3f80 R15: 00007f725f0d7000
       </TASK>
      
      Allocated by task 855:
       kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
       kasan_set_track mm/kasan/common.c:45 [inline]
       set_alloc_info mm/kasan/common.c:437 [inline]
       __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:470
       kasan_slab_alloc include/linux/kasan.h:224 [inline]
       slab_post_alloc_hook mm/slab.h:727 [inline]
       slab_alloc_node mm/slub.c:3243 [inline]
       slab_alloc mm/slub.c:3251 [inline]
       __kmem_cache_alloc_lru mm/slub.c:3258 [inline]
       kmem_cache_alloc+0xbf/0x200 mm/slub.c:3268
       kmem_cache_zalloc include/linux/slab.h:723 [inline]
       __kernfs_new_node+0xd4/0x680 fs/kernfs/dir.c:593
       kernfs_new_node fs/kernfs/dir.c:655 [inline]
       kernfs_create_dir_ns+0x9c/0x220 fs/kernfs/dir.c:1010
       sysfs_create_dir_ns+0x127/0x290 fs/sysfs/dir.c:59
       create_dir lib/kobject.c:63 [inline]
       kobject_add_internal+0x24a/0x8d0 lib/kobject.c:223
       kobject_add_varg lib/kobject.c:358 [inline]
       kobject_init_and_add+0x101/0x160 lib/kobject.c:441
       sysfs_slab_add+0x156/0x1e0 mm/slub.c:5954
       __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
       create_cache mm/slab_common.c:229 [inline]
       kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
       p9_client_create+0xd4d/0x1190 net/9p/client.c:993
       v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
       v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
       legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
       vfs_get_tree+0x85/0x2e0 fs/super.c:1530
       do_new_mount fs/namespace.c:3040 [inline]
       path_mount+0x675/0x1d00 fs/namespace.c:3370
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount fs/namespace.c:3568 [inline]
       __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Freed by task 857:
       kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
       kasan_set_track+0x21/0x30 mm/kasan/common.c:45
       kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:370
       ____kasan_slab_free mm/kasan/common.c:367 [inline]
       ____kasan_slab_free mm/kasan/common.c:329 [inline]
       __kasan_slab_free+0x108/0x190 mm/kasan/common.c:375
       kasan_slab_free include/linux/kasan.h:200 [inline]
       slab_free_hook mm/slub.c:1754 [inline]
       slab_free_freelist_hook mm/slub.c:1780 [inline]
       slab_free mm/slub.c:3534 [inline]
       kmem_cache_free+0x9c/0x340 mm/slub.c:3551
       kernfs_put.part.0+0x2b2/0x520 fs/kernfs/dir.c:547
       kernfs_put+0x42/0x50 fs/kernfs/dir.c:521
       __kernfs_remove.part.0+0x72d/0x960 fs/kernfs/dir.c:1407
       __kernfs_remove fs/kernfs/dir.c:1356 [inline]
       kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
       sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
       __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
       create_cache mm/slab_common.c:229 [inline]
       kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
       p9_client_create+0xd4d/0x1190 net/9p/client.c:993
       v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
       v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
       legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
       vfs_get_tree+0x85/0x2e0 fs/super.c:1530
       do_new_mount fs/namespace.c:3040 [inline]
       path_mount+0x675/0x1d00 fs/namespace.c:3370
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount fs/namespace.c:3568 [inline]
       __x64_sys_mount+0x282/0x300 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      The buggy address belongs to the object at ffff888008880780
       which belongs to the cache kernfs_node_cache of size 128
      The buggy address is located 112 bytes inside of
       128-byte region [ffff888008880780, ffff888008880800)
      
      The buggy address belongs to the physical page:
      page:00000000732833f8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8880
      flags: 0x100000000000200(slab|node=0|zone=1)
      raw: 0100000000000200 0000000000000000 dead000000000122 ffff888001147280
      raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff888008880680: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
       ffff888008880700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      >ffff888008880780: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                   ^
       ffff888008880800: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
       ffff888008880880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      ==================================================================
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: stable <stable@kernel.org> # -rc3
      Signed-off-by: Christian A. Ehrhardt <lk@c--e.de>
      Link: https://lore.kernel.org/r/20220913121723.691454-1-lk@c--e.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Lipeng Sang <sanglipeng1@jd.com>
  2. 19 October, 2022 1 commit
  3. 14 January, 2020 1 commit
  4. 13 November, 2019 6 commits
    • kernfs: use 64bit inos if ino_t is 64bit · 40430452
      Committed by Tejun Heo
      Each kernfs_node is identified with a 64bit ID.  The low 32bit is
      exposed as ino and the high gen.  While this already allows using inos
      as keys by looking up with wildcard generation number of 0, it's
      adding unnecessary complications for 64bit ino archs which can
      directly use kernfs_node IDs as inos to uniquely identify each cgroup
      instance.
      
      This patch exposes IDs directly as inos on 64bit ino archs.  The
      conversion is mostly straight-forward.
      
      * 32bit ino archs behave the same as before.  64bit ino archs now use
        the whole 64bit ID as ino and the generation number is fixed at 1.
      
      * 64bit inos still use the same idr allocator which guarantees that the
        lower 32bits identify the current live instance uniquely and the
        high 32bits are incremented whenever the low bits wrap.  As the
        upper 32bits are no longer used as gen and we don't wanna start ino
        allocation with 33rd bit set, the initial value for highbits
        allocation is changed to 0 on 64bit ino archs.
      
      * blktrace exposes two 32bit numbers - (INO,GEN) pair - to identify
        the issuing cgroup.  Userland builds FILEID_INO32_GEN fids from
        these numbers to look up the cgroups.  To remain compatible with the
        behavior, always output (LOW32,HIGH32) which will be constructed
        back to the original 64bit ID by __kernfs_fh_to_dentry().
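
      The three points above can be sketched in userspace C. The helper
      names and the ino_is_64bit flag (standing in for the width of
      ino_t on the target arch) are assumptions made for illustration,
      not the kernfs API.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      /* ino as exposed to userspace: the whole ID on 64bit ino archs,
       * only the low 32 bits otherwise. */
      static uint64_t toy_exposed_ino(uint64_t id, bool ino_is_64bit)
      {
          return ino_is_64bit ? id : (uint32_t)id;
      }

      /* Generation as exposed to userspace: fixed at 1 on 64bit ino archs,
       * the high 32 bits of the ID otherwise. */
      static uint32_t toy_exposed_gen(uint64_t id, bool ino_is_64bit)
      {
          return ino_is_64bit ? 1 : (uint32_t)(id >> 32);
      }

      /* Rebuild the 64bit ID from a blktrace-style (LOW32, HIGH32) pair. */
      static uint64_t toy_id_from_pair(uint32_t low32, uint32_t high32)
      {
          return ((uint64_t)high32 << 32) | low32;
      }
      ```

      For an ID with high bits 3 and low bits 42, a 32bit ino arch would
      report ino 42 and gen 3, a 64bit ino arch would report the full ID
      as ino and gen 1, and the pair (42, 3) maps back to the same ID on
      both.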
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
    • kernfs: combine ino/id lookup functions into kernfs_find_and_get_node_by_id() · fe0f726c
      Committed by Tejun Heo
      kernfs_find_and_get_node_by_ino() looks up the kernfs_node matching the
      specified ino.  On top of that, kernfs_get_node_by_id() and
      kernfs_fh_get_inode() implement full ID matching by testing the rest
      of ID.
      
      On the surface, confusingly, the two are slightly different in that the
      latter uses 0 gen as wildcard while the former doesn't - does it mean
      that the latter can't uniquely identify inodes w/ 0 gen?  In practice,
      this is a distinction without a difference because generation number
      starts at 1.  There are no actual IDs with 0 gen, so it can always
      be safely used as wildcard.
      
      Let's simplify the code by renaming kernfs_find_and_get_node_by_ino()
      to kernfs_find_and_get_node_by_id(), moving all lookup logic into it,
      and removing now unnecessary kernfs_get_node_by_id().
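
      The wildcard rule discussed above fits in a few lines. This is a
      hypothetical userspace sketch (toy_id_matches is not a kernfs
      function); it assumes the ID layout stated elsewhere in this
      series, with ino in the low 32 bits and gen in the high 32 bits.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      /* Does node_id match the requested (ino, gen)?  A gen of 0 acts as a
       * wildcard, which is unambiguous because real generations start at 1. */
      static bool toy_id_matches(uint64_t node_id, uint32_t ino, uint32_t gen)
      {
          if ((uint32_t)node_id != ino)
              return false;
          return gen == 0 || (uint32_t)(node_id >> 32) == gen;
      }
      ```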
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kernfs: convert kernfs_node->id from union kernfs_node_id to u64 · 67c0496e
      Committed by Tejun Heo
      kernfs_node->id is currently a union kernfs_node_id which represents
      either a 32bit (ino, gen) pair or u64 value.  I can't see much value
      in the usage of the union - all that's needed is a 64bit ID which the
      current code is already limited to.  Using a union makes the code
      unnecessarily complicated and prevents using 64bit ino without adding
      practical benefits.
      
      This patch drops union kernfs_node_id and makes kernfs_node->id a u64.
      ino is stored in the lower 32bits and gen in the upper.  Accessors -
      kernfs[_id]_ino() and kernfs[_id]_gen() - are added to retrieve the
      ino and gen.  This makes ID handling less cumbersome and will
      allow using 64bit inos on supported archs.
      
      This patch doesn't make any functional changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Alexei Starovoitov <ast@kernel.org>
    • kernfs: kernfs_find_and_get_node_by_ino() should only look up activated nodes · 880df131
      Committed by Tejun Heo
      A kernfs node can be created in two separate steps - allocation and
      activation.  This is used to make kernfs nodes visible only after the
      internal states attached to the node are fully initialized.
      kernfs_find_and_get_node_by_id() currently allows lookups of nodes
      which aren't activated yet and thus can expose nodes which are
      still being prepped by kernfs users.
      
      Fix it by disallowing lookups of nodes which aren't activated yet.
      
      kernfs_find_and_get_node_by_ino()
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
    • kernfs: use dumber locking for kernfs_find_and_get_node_by_ino() · b680b081
      Committed by Tejun Heo
      kernfs_find_and_get_node_by_ino() uses RCU protection.  It's currently
      a bit buggy because it can look up a node which hasn't been activated
      yet and thus may end up exposing a node that the kernfs user is still
      prepping.
      
      While it can be fixed by pushing it further in the current direction,
      it's already complicated and it isn't clear whether the complexity is
      justified.  The main use of kernfs_find_and_get_node_by_ino() is for
      exportfs operations.  They aren't super hot and all the follow-up
      operations (e.g. mapping to path) use normal locking anyway.
      
      Let's switch to a dumber locking scheme and protect the lookup with
      kernfs_idr_lock.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
    • kernfs: fix ino wrap-around detection · e23f568a
      Committed by Tejun Heo
      When the 32bit ino wraps around, kernfs increments the generation
      number to distinguish reused ino instances.  The wrap-around detection
      tests whether the allocated ino is lower than the cursor, but the
      cursor is pointing to the next ino to allocate, so the condition never
      triggers.
      
      Fix it by remembering the last ino and comparing against that.
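
      The "remember the last ino" approach can be sketched with a toy
      cyclic allocator. The struct, the function names and the tiny
      TOY_INO_MAX range are assumptions chosen to make the wrap easy to
      see; the sketch does not track which inos are still in use and is
      not the idr-based kernfs code.

      ```c
      #include <assert.h>
      #include <stdint.h>

      #define TOY_INO_MAX 4 /* tiny range so that a wrap happens quickly */

      struct toy_alloc {
          uint32_t next;     /* cursor: the next ino to hand out */
          uint32_t last_ino; /* the most recently allocated ino */
          uint32_t gen;      /* bumped every time the ino space wraps */
      };

      static void toy_alloc_init(struct toy_alloc *a)
      {
          a->next = 1;
          a->last_ino = 0;
          a->gen = 1;
      }

      static uint32_t toy_alloc_ino(struct toy_alloc *a)
      {
          uint32_t ino = a->next;

          assert(ino >= 1 && ino <= TOY_INO_MAX);
          a->next = (ino == TOY_INO_MAX) ? 1 : ino + 1;

          /* A wrap shows up as the new ino being lower than the previous one. */
          if (ino < a->last_ino)
              a->gen++;
          a->last_ino = ino;
          return ino;
      }
      ```

      Comparing against last_ino rather than the cursor matters because
      the cursor has already moved on to the next candidate by the time
      the comparison could be made.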
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Fixes: 4a3ef68a ("kernfs: implement i_generation")
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: stable@vger.kernel.org # v4.14+
  5. 09 October, 2019 1 commit
    • locking/lockdep: Remove unused @nested argument from lock_release() · 5facae4f
      Committed by Qian Cai
      Since the following commit:
      
        b4adfe8e ("locking/lockdep: Remove unused argument in __lock_release")
      
      @nested is no longer used in lock_release(), so remove it from all
      lock_release() calls and friends.
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Will Deacon <will@kernel.org>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: airlied@linux.ie
      Cc: akpm@linux-foundation.org
      Cc: alexander.levin@microsoft.com
      Cc: daniel@iogearbox.net
      Cc: davem@davemloft.net
      Cc: dri-devel@lists.freedesktop.org
      Cc: duyuyang@gmail.com
      Cc: gregkh@linuxfoundation.org
      Cc: hannes@cmpxchg.org
      Cc: intel-gfx@lists.freedesktop.org
      Cc: jack@suse.com
      Cc: jlbec@evilplan.or
      Cc: joonas.lahtinen@linux.intel.com
      Cc: joseph.qi@linux.alibaba.com
      Cc: jslaby@suse.com
      Cc: juri.lelli@redhat.com
      Cc: maarten.lankhorst@linux.intel.com
      Cc: mark@fasheh.com
      Cc: mhocko@kernel.org
      Cc: mripard@kernel.org
      Cc: ocfs2-devel@oss.oracle.com
      Cc: rodrigo.vivi@intel.com
      Cc: sean@poorly.run
      Cc: st@kernel.org
      Cc: tj@kernel.org
      Cc: tytso@mit.edu
      Cc: vdavydov.dev@gmail.com
      Cc: vincent.guittot@linaro.org
      Cc: viro@zeniv.linux.org.uk
      Link: https://lkml.kernel.org/r/1568909380-32199-1-git-send-email-cai@lca.pw
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 08 August, 2019 1 commit
    • Revert "kernfs: fix memleak in kernel_ops_readdir()" · 8097c43b
      Committed by Greg Kroah-Hartman
      This reverts commit cc798c83.
      
      Tony writes:
      	Somehow this causes a regression in Linux next for me where I'm
      	seeing lots of sysfs entries now missing under
      	/sys/bus/platform/devices.
      
      	For example, I now only see one .serial entry show up in sysfs.
      	Things work again if I revert commit cc798c83 ("kernfs: fix
      	memleak in kernel_ops_readdir()"). Any ideas why that would be?
      
      Tejun says:
      	Ugh, you're right.  It can get double-put cuz ctx->pos is put by
      	release too.
      
      So reverting it for now.
      Reported-by: Tony Lindgren <tony@atomide.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Fixes: cc798c83 ("kernfs: fix memleak in kernel_ops_readdir()")
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  7. 06 August, 2019 1 commit
  8. 25 July, 2019 2 commits
  9. 05 June, 2019 1 commit
  10. 26 April, 2019 1 commit
  11. 21 March, 2019 3 commits
    • kernfs: initialize security of newly created nodes · e19dfdc8
      Committed by Ondrej Mosnacek
      Use the new security_kernfs_init_security() hook to allow LSMs to
      possibly assign a non-default security context to a newly created kernfs
      node based on the attributes of the new node and also its parent node.
      
      This fixes an issue with cgroupfs under SELinux, where newly created
      cgroup subdirectories/files would not inherit their parent's context if
      it had been set explicitly to a non-default value (other than the genfs
      context specified by the policy). This can be reproduced as follows (on
      Fedora/RHEL):
      
          # mkdir /sys/fs/cgroup/unified/test
          # # Need permissive to change the label under Fedora policy:
          # setenforce 0
          # chcon -t container_file_t /sys/fs/cgroup/unified/test
          # ls -lZ /sys/fs/cgroup/unified
          total 0
          -r--r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.controllers
          -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.max.depth
          -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.max.descendants
          -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.procs
          -r--r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.stat
          -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.subtree_control
          -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.threads
          drwxr-xr-x.  2 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 init.scope
          drwxr-xr-x. 26 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:21 system.slice
          drwxr-xr-x.  3 root root system_u:object_r:container_file_t:s0 0 Jan 29 03:15 test
          drwxr-xr-x.  3 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 user.slice
          # mkdir /sys/fs/cgroup/unified/test/subdir
      
      Actual result:
      
          # ls -ldZ /sys/fs/cgroup/unified/test/subdir
          drwxr-xr-x. 2 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:15 /sys/fs/cgroup/unified/test/subdir
      
      Expected result:
      
          # ls -ldZ /sys/fs/cgroup/unified/test/subdir
          drwxr-xr-x. 2 root root unconfined_u:object_r:container_file_t:s0 0 Jan 29 03:15 /sys/fs/cgroup/unified/test/subdir
      
      Link: https://github.com/SELinuxProject/selinux-kernel/issues/39
      Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
    • kernfs: use simple_xattrs for security attributes · 0ac6075a
      Committed by Ondrej Mosnacek
      Replace the special handling of security xattrs with simple_xattrs, as
      is already done for the trusted xattrs. This simplifies the code and
      allows LSMs to use more than just a single xattr to do their business.
      Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      [PM: manual merge fixes]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
    • kernfs: clean up struct kernfs_iattrs · 05895219
      Committed by Ondrej Mosnacek
      Right now, kernfs_iattrs embeds the whole struct iattr, even though it
      doesn't really use half of its fields... This both leads to wasting
      space and makes the code look awkward. Let's just list the few fields
      we need directly in struct kernfs_iattrs.
      Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      [PM: merged a number of chunks manually due to fuzz]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
  12. 08 February, 2019 1 commit
    • kernfs: Allocating memory for kernfs_iattrs with kmem_cache. · 26e28d68
      Committed by Ayush Mittal
      Create a new cache for kernfs_iattrs.
      Currently, memory is allocated with kzalloc(), which
      always gives aligned memory. On ARM, this is 64-byte aligned.
      To avoid wasting memory on aligning the requested size,
      a new cache for kernfs_iattrs is created.
      
      The size of struct kernfs_iattrs is 80 bytes.
      On ARM, it lands in the kmalloc-128 slab,
      or in the kmalloc-192 slab if debug info is enabled.
      Extra bytes taken per object: 48 bytes.
      
      Total number of objects created : 4096
      Total saving = 48*4096 = 192 KB
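
      The arithmetic above can be checked with a short sketch. The
      helper names are hypothetical, and reading the 48-byte figure as
      128 - 80 (slab object size minus struct size) is an
      interpretation of the numbers quoted in this message.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Bytes lost per object when it is served from a larger slab. */
      static size_t toy_wasted_per_object(size_t object_size, size_t slab_size)
      {
          assert(slab_size >= object_size);
          return slab_size - object_size;
      }

      /* Total waste across all objects, in KiB (rounded down). */
      static size_t toy_total_saving_kib(size_t wasted, size_t objects)
      {
          return wasted * objects / 1024;
      }
      ```

      With an 80-byte struct in a 128-byte slab and 4096 objects, this
      gives 48 bytes per object and 192 KiB in total.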
      
      After creating the new slab (when debug info is enabled):
      sh-3.2# cat /proc/slabinfo
      ...
      kernfs_iattrs_cache   4069   4096    128   32    1 : tunables    0    0    0 : slabdata    128    128      0
      ...
      
      All testing has been done on ARM target.
      Signed-off-by: Ayush Mittal <ayush.m@samsung.com>
      Signed-off-by: Vaneet Narang <v.narang@samsung.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  13. 21 July, 2018 1 commit
  14. 06 June, 2018 1 commit
    • vfs: change inode times to use struct timespec64 · 95582b00
      Committed by Deepa Dinamani
      struct timespec is not y2038 safe. Transition vfs to use
      y2038 safe struct timespec64 instead.
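
      The y2038 limit comes from storing seconds since the epoch in a
      signed 32-bit field, as happens where time_t is 32 bits wide. A
      minimal sketch of the range check follows; the helper name is
      hypothetical.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      /* Can this timestamp (seconds since 1970-01-01 UTC) be stored in a
       * signed 32-bit seconds field?  INT32_MAX seconds corresponds to
       * 2038-01-19 03:14:07 UTC; one second later no longer fits. */
      static bool toy_fits_in_32bit_seconds(int64_t seconds_since_epoch)
      {
          return seconds_since_epoch >= INT32_MIN &&
                 seconds_since_epoch <= INT32_MAX;
      }
      ```

      A 64-bit seconds field, as in struct timespec64, does not have
      this limit.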
      
      The change was made with the help of the following coccinelle
      script. This catches about 80% of the changes.
      All the header file and logic changes are included in the
      first 5 rules. The rest are trivial substitutions.
      I avoid changing any of the function signatures or any other
      filesystem specific data structures to keep the patch simple
      for review.
      
      The script can be a little shorter by combining different cases.
      But, this version was sufficient for my usecase.
      
      virtual patch
      
      @ depends on patch @
      identifier now;
      @@
      - struct timespec
      + struct timespec64
        current_time ( ... )
        {
      - struct timespec now = current_kernel_time();
      + struct timespec64 now = current_kernel_time64();
        ...
      - return timespec_trunc(
      + return timespec64_trunc(
        ... );
        }
      
      @ depends on patch @
      identifier xtime;
      @@
       struct \( iattr \| inode \| kstat \) {
       ...
      -       struct timespec xtime;
      +       struct timespec64 xtime;
       ...
       }
      
      @ depends on patch @
      identifier t;
      @@
       struct inode_operations {
       ...
      int (*update_time) (...,
      -       struct timespec t,
      +       struct timespec64 t,
      ...);
       ...
       }
      
      @ depends on patch @
      identifier t;
      identifier fn_update_time =~ "update_time$";
      @@
       fn_update_time (...,
      - struct timespec *t,
      + struct timespec64 *t,
       ...) { ... }
      
      @ depends on patch @
      identifier t;
      @@
      lease_get_mtime( ... ,
      - struct timespec *t
      + struct timespec64 *t
        ) { ... }
      
      @te depends on patch forall@
      identifier ts;
      local idexpression struct inode *inode_node;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn_update_time =~ "update_time$";
      identifier fn;
      expression e, E3;
      local idexpression struct inode *node1;
      local idexpression struct inode *node2;
      local idexpression struct iattr *attr1;
      local idexpression struct iattr *attr2;
      local idexpression struct iattr attr;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      @@
      (
      (
      - struct timespec ts;
      + struct timespec64 ts;
      |
      - struct timespec ts = current_time(inode_node);
      + struct timespec64 ts = current_time(inode_node);
      )
      
      <+... when != ts
      (
      - timespec_equal(&inode_node->i_xtime, &ts)
      + timespec64_equal(&inode_node->i_xtime, &ts)
      |
      - timespec_equal(&ts, &inode_node->i_xtime)
      + timespec64_equal(&ts, &inode_node->i_xtime)
      |
      - timespec_compare(&inode_node->i_xtime, &ts)
      + timespec64_compare(&inode_node->i_xtime, &ts)
      |
      - timespec_compare(&ts, &inode_node->i_xtime)
      + timespec64_compare(&ts, &inode_node->i_xtime)
      |
      ts = current_time(e)
      |
      fn_update_time(..., &ts,...)
      |
      inode_node->i_xtime = ts
      |
      node1->i_xtime = ts
      |
      ts = inode_node->i_xtime
      |
      <+... attr1->ia_xtime ...+> = ts
      |
      ts = attr1->ia_xtime
      |
      ts.tv_sec
      |
      ts.tv_nsec
      |
      btrfs_set_stack_timespec_sec(..., ts.tv_sec)
      |
      btrfs_set_stack_timespec_nsec(..., ts.tv_nsec)
      |
      - ts = timespec64_to_timespec(
      + ts =
      ...
      -)
      |
      - ts = ktime_to_timespec(
      + ts = ktime_to_timespec64(
      ...)
      |
      - ts = E3
      + ts = timespec_to_timespec64(E3)
      |
      - ktime_get_real_ts(&ts)
      + ktime_get_real_ts64(&ts)
      |
      fn(...,
      - ts
      + timespec64_to_timespec(ts)
      ,...)
      )
      ...+>
      (
      <... when != ts
      - return ts;
      + return timespec64_to_timespec(ts);
      ...>
      )
      |
      - timespec_equal(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_equal(&node1->i_xtime2, &node2->i_xtime2)
      |
      - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2)
      + timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2)
      |
      - timespec_compare(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2)
      |
      node1->i_xtime1 =
      - timespec_trunc(attr1->ia_xtime1,
      + timespec64_trunc(attr1->ia_xtime1,
      ...)
      |
      - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2,
      + attr1->ia_xtime1 =  timespec64_trunc(attr2->ia_xtime2,
      ...)
      |
      - ktime_get_real_ts(&attr1->ia_xtime1)
      + ktime_get_real_ts64(&attr1->ia_xtime1)
      |
      - ktime_get_real_ts(&attr.ia_xtime1)
      + ktime_get_real_ts64(&attr.ia_xtime1)
      )
      
      @ depends on patch @
      struct inode *node;
      struct iattr *attr;
      identifier fn;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      expression e;
      @@
      (
      - fn(node->i_xtime);
      + fn(timespec64_to_timespec(node->i_xtime));
      |
       fn(...,
      - node->i_xtime);
      + timespec64_to_timespec(node->i_xtime));
      |
      - e = fn(attr->ia_xtime);
      + e = fn(timespec64_to_timespec(attr->ia_xtime));
      )
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      )
      ...+>
      }
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      struct kstat *stat;
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier i_xtime =~ "^i_[acm]time$";
      identifier xtime =~ "^[acm]time$";
      identifier fn, ret;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(stat->xtime);
      ret = fn (...,
      - &stat->xtime);
      + &ts);
      )
      ...+>
      }
      
      @ depends on patch @
      struct inode *node;
      struct inode *node2;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier i_xtime3 =~ "^i_[acm]time$";
      struct iattr *attrp;
      struct iattr *attrp2;
      struct iattr attr ;
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      struct kstat *stat;
      struct kstat stat1;
      struct timespec64 ts;
      identifier xtime =~ "^[acmb]time$";
      expression e;
      @@
      (
      ( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1  ;
      |
       node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       stat->xtime = node2->i_xtime1;
      |
       stat1.xtime = node2->i_xtime1;
      |
      ( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1  ;
      |
      ( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2;
      |
      - e = node->i_xtime1;
      + e = timespec64_to_timespec( node->i_xtime1 );
      |
      - e = attrp->ia_xtime1;
      + e = timespec64_to_timespec( attrp->ia_xtime1 );
      |
      node->i_xtime1 = current_time(...);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
       node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
      - node->i_xtime1 = e;
      + node->i_xtime1 = timespec_to_timespec64(e);
      )

      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Cc: <anton@tuxera.com>
      Cc: <balbi@kernel.org>
      Cc: <bfields@fieldses.org>
      Cc: <darrick.wong@oracle.com>
      Cc: <dhowells@redhat.com>
      Cc: <dsterba@suse.com>
      Cc: <dwmw2@infradead.org>
      Cc: <hch@lst.de>
      Cc: <hirofumi@mail.parknet.co.jp>
      Cc: <hubcap@omnibond.com>
      Cc: <jack@suse.com>
      Cc: <jaegeuk@kernel.org>
      Cc: <jaharkes@cs.cmu.edu>
      Cc: <jslaby@suse.com>
      Cc: <keescook@chromium.org>
      Cc: <mark@fasheh.com>
      Cc: <miklos@szeredi.hu>
      Cc: <nico@linaro.org>
      Cc: <reiserfs-devel@vger.kernel.org>
      Cc: <richard@nod.at>
      Cc: <sage@redhat.com>
      Cc: <sfrench@samba.org>
      Cc: <swhiteho@redhat.com>
      Cc: <tj@kernel.org>
      Cc: <trond.myklebust@primarydata.com>
      Cc: <tytso@mit.edu>
      Cc: <viro@zeniv.linux.org.uk>
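
The conversion pattern that the semantic patch above applies (copy the 64-bit timestamp into a temporary `struct timespec`, then pass the temporary to code that still expects the old type) can be sketched in userspace C. This is only an illustration under stated assumptions: `my_timespec64`, `my_ts64_to_ts`, `my_ts_to_ts64`, `my_inode`, `legacy_nsec` and `inode_mtime_nsec` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the kernel's struct timespec64. */
struct my_timespec64 {
    int64_t tv_sec;
    long    tv_nsec;
};

/* Hypothetical stand-in for an inode with a 64-bit timestamp field. */
struct my_inode {
    struct my_timespec64 i_mtime;
};

/* Narrowing conversion: on a platform with a 32-bit time_t the seconds
 * value may not fit, which is the y2038 problem being worked around. */
static struct timespec my_ts64_to_ts(struct my_timespec64 t64)
{
    struct timespec ts;

    ts.tv_sec = (time_t)t64.tv_sec;
    ts.tv_nsec = t64.tv_nsec;
    return ts;
}

/* Widening conversion, used when old-style values are stored back. */
static struct my_timespec64 my_ts_to_ts64(struct timespec ts)
{
    struct my_timespec64 t64;

    t64.tv_sec = (int64_t)ts.tv_sec;
    t64.tv_nsec = ts.tv_nsec;
    return t64;
}

/* Hypothetical legacy helper that still takes the old type by pointer. */
static long legacy_nsec(const struct timespec *ts)
{
    return ts->tv_nsec;
}

/* Mirrors the "ts = convert(node->i_xtime); fn(..., &ts, ...)" rewrite. */
static long inode_mtime_nsec(const struct my_inode *node)
{
    struct timespec ts = my_ts64_to_ts(node->i_mtime);

    return legacy_nsec(&ts);
}
```

The temporary is needed because the legacy helper takes a pointer to the old type, so the 64-bit field cannot be passed in place.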
  15. 29 Jul, 2017 5 commits
  16. 10 Feb, 2017 1 commit
  17. 28 Dec, 2016 1 commit
  18. 08 Oct, 2016 1 commit
  19. 07 Oct, 2016 1 commit
  20. 27 Sep, 2016 2 commits
    • M
      fs: rename "rename2" i_op to "rename" · 2773bf00
      Miklos Szeredi committed
      Generated patch:
      
      sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2`
      sed -i "s/\brename2\b/rename/g" `git grep -wl rename2`

      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • M
      fs: make remaining filesystems use .rename2 · 1cd66c93
      Miklos Szeredi committed
      This is trivial to do:
      
       - add flags argument to foo_rename()
       - check if flags is zero
       - assign foo_rename() to .rename2 instead of .rename
      
      This doesn't mean it's impossible to support RENAME_NOREPLACE for these
      filesystems, but it is not trivial, like for local filesystems.
      RENAME_NOREPLACE must guarantee atomicity (i.e. it shouldn't be possible
      for a file to be created on one host while it is overwritten by rename on
      another host).
      
      Filesystems converted:
      
      9p, afs, ceph, coda, ecryptfs, kernfs, lustre, ncpfs, nfs, ocfs2, orangefs.
      
      After this, we can get rid of the duplicate interfaces for rename.

      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: David Howells <dhowells@redhat.com> [AFS]
      Acked-by: Mike Marshall <hubcap@omnibond.com>
      Cc: Eric Van Hensbergen <ericvh@gmail.com>
      Cc: Ilya Dryomov <idryomov@gmail.com>
      Cc: Jan Harkes <jaharkes@cs.cmu.edu>
      Cc: Tyler Hicks <tyhicks@canonical.com>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
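
The three conversion steps listed in this commit can be sketched in plain C. This is a simplified userspace illustration, not the kernel code: `struct foo_inode_ops`, `foo_rename`, `foo_ops` and `FOO_RENAME_NOREPLACE` are hypothetical names, and the argument types are reduced to strings for brevity.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a rename flag such as RENAME_NOREPLACE. */
#define FOO_RENAME_NOREPLACE (1u << 0)

/* Hypothetical, simplified operations table; the real inode operations
 * structure has many more members and different argument types. */
struct foo_inode_ops {
    int (*rename2)(const char *old_name, const char *new_name,
                   unsigned int flags);
};

/* Step 1: the handler gains a flags argument. */
static int foo_rename(const char *old_name, const char *new_name,
                      unsigned int flags)
{
    /* Step 2: this filesystem supports no flags, so reject any that are set. */
    if (flags)
        return -EINVAL;
    if (old_name == NULL || new_name == NULL)
        return -EINVAL;

    /* ... the pre-existing rename logic would run here ... */
    return 0;
}

/* Step 3: the handler is registered as .rename2 instead of .rename. */
static const struct foo_inode_ops foo_ops = {
    .rename2 = foo_rename,
};
```

Rejecting every non-zero flag keeps the old behavior for plain renames while making unsupported requests fail explicitly instead of being silently ignored.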
  21. 10 Aug, 2016 2 commits
    • T
      kernfs: remove kernfs_path_len() · bb09c863
      Tejun Heo committed
      It doesn't have any in-kernel user and the same result can be obtained
      from kernfs_path(@kn, NULL, 0).  Remove it.

      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
    • T
      kernfs: make kernfs_path*() behave in the style of strlcpy() · 3abb1d90
      Tejun Heo committed
      kernfs_path*() functions always return the length of the full path but
      the path content is undefined if the length is larger than the
      provided buffer.  This makes its behavior different from strlcpy() and
      requires error handling in all its users even when they don't care
      about truncation.  In addition, the implementation can actually be
      simplified by making it behave properly in strlcpy() style.
      
      * Update kernfs_path_from_node_locked() to always fill up the buffer
        with path.  If the buffer is not large enough, the output is
        truncated and terminated.
      
      * kernfs_path() no longer needs error handling.  Make it a simple
        inline wrapper around kernfs_path_from_node().
      
      * sysfs_warn_dup()'s use of kernfs_path() doesn't need error handling.
        Updated accordingly.
      
      * cgroup_path()'s use of kernfs_path() updated to retain the old
        behavior.

      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
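
The strlcpy()-style contract described in this commit (always return the full length, truncate and terminate when the buffer is too small) can be sketched as a small userspace helper. `path_copy` is a hypothetical name and this is not the kernfs implementation; it only illustrates the return-value and truncation semantics.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper illustrating strlcpy()-style behavior:
 *  - the return value is always the full length of src;
 *  - if buflen is non-zero, buf receives as much of src as fits and is
 *    always NUL-terminated;
 *  - if buflen is zero, buf is not touched and may be NULL.
 */
static size_t path_copy(char *buf, size_t buflen, const char *src)
{
    size_t len = strlen(src);

    if (buflen > 0) {
        size_t n = len < buflen - 1 ? len : buflen - 1;

        memcpy(buf, src, n);
        buf[n] = '\0';
    }
    return len;
}
```

A caller that does not care about truncation can ignore the return value, while one that does can compare it with the buffer size. Passing a NULL buffer with size 0 yields just the required length.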
  22. 11 Jun, 2016 1 commit
    • L
      vfs: make the string hashes salt the hash · 8387ff25
      Linus Torvalds committed
      We always mixed in the parent pointer into the dentry name hash, but we
      did it late at lookup time.  It turns out that we can simplify that
      lookup-time action by salting the hash with the parent pointer early
      instead of late.
      
      A few other users of our string hashes also wanted to mix in their own
      pointers into the hash, and those are updated to use the same mechanism.
      
      Hash users that don't have any particular initial salt can just use the
      NULL pointer as a no-salt.
      
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
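
The idea of salting a string hash early with a pointer can be sketched as follows. This is an assumption-laden illustration: it uses FNV-1a mixing rather than the kernel's own hash functions, and `salted_name_hash` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical salted string hash. The salt pointer (for example a parent
 * object) is folded into the initial state, so the same name hashes
 * differently under different parents. A NULL salt means "no salt".
 * The mixing steps are FNV-1a, chosen only for brevity.
 */
static uint32_t salted_name_hash(const void *salt, const char *name,
                                 size_t len)
{
    uint64_t s = (uint64_t)(uintptr_t)salt;
    uint32_t hash = 2166136261u ^ (uint32_t)(s ^ (s >> 32));

    for (size_t i = 0; i < len; i++) {
        hash ^= (unsigned char)name[i];
        hash *= 16777619u;
    }
    return hash;
}
```

Because the salt enters before any name bytes are processed, no separate mixing step is needed at lookup time.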
  23. 09 May, 2016 1 commit
  24. 03 May, 2016 1 commit
    • S
      kernfs_path_from_node_locked: don't overwrite nlen · e99ed4de
      Serge Hallyn committed
      We've calculated @len to be the bytes we need for '/..' entries from
      @kn_from to the common ancestor, and calculated @nlen to be the extra
      bytes we need to get from the common ancestor to @kn_to.  We use them
      as such at the end.  But in the loop copying the actual entries, we
      overwrite @nlen.  Use a temporary variable for that instead.
      
      Without this, the return length, when the buffer is large enough, is
      wrong.  (When the buffer is NULL or too small, the returned value is
      correct. The buffer contents are also correct.)
      
      Interestingly, no callers of this function are affected by this as of
      yet.  However the upcoming cgroup_show_path() will be.

      Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
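
The class of bug fixed here (a running total being clobbered inside the copy loop) can be illustrated with a much simpler path builder. This is a hypothetical sketch, not the kernfs function: `join_components` and its parameters are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: build "/a/b/c" from path components and return the
 * number of bytes needed, excluding the terminator. If buf is NULL or too
 * small, nothing is copied and only the required length is returned.
 */
static size_t join_components(char *buf, size_t buflen,
                              const char *const *names, size_t count)
{
    size_t nlen = 0;    /* total length; must survive until the return */
    size_t off = 0;

    for (size_t i = 0; i < count; i++)
        nlen += strlen(names[i]) + 1;    /* '/' plus the component */

    if (buf == NULL || buflen <= nlen)
        return nlen;

    for (size_t i = 0; i < count; i++) {
        /* The per-component length gets its own variable. Storing it in
         * nlen here would corrupt the total returned below. */
        size_t clen = strlen(names[i]);

        buf[off++] = '/';
        memcpy(buf + off, names[i], clen);
        off += clen;
    }
    buf[off] = '\0';
    return nlen;
}
```

As in the commit description, reusing the total for per-component lengths would leave the buffer contents correct and only make the returned length wrong, which is why such a bug can go unnoticed until a caller depends on the return value.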
  25. 30 Mar, 2016 1 commit
    • D
      fs: kernfs: Replace CURRENT_TIME by current_fs_time() · 3a3a5fec
      Deepa Dinamani committed
      This is in preparation for the series that transitions
      filesystem timestamps to use 64 bit time and hence make
      them y2038 safe.
      
      CURRENT_TIME macro will be deleted before merging the
      aforementioned series.
      
      Use current_fs_time() instead of CURRENT_TIME for inode
      timestamps.
      
      struct kernfs_node is associated with a sysfs file/directory.
      Truncate the values to appropriate time granularity when
      writing to inode timestamps of the files.
      
      ktime_get_real_ts() is used to obtain times for
      struct kernfs_iattrs. Since these times are later assigned to
      inode times using timespec_truncate() for all filesystem based
      operations, we can save the supers list traversal time here by
      using ktime_get_real_ts() directly.

      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
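
Truncating a timestamp to a filesystem's time granularity, as mentioned in this commit, can be sketched in userspace C. The helper below is a rough illustration only: `trunc_to_granularity` is a hypothetical name, and the granularity is assumed to be expressed in nanoseconds.

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: round a timestamp down to a multiple of gran
 * nanoseconds. A granularity of 1 leaves the value unchanged, and a
 * granularity of one second or more clears the nanosecond part.
 */
static struct timespec trunc_to_granularity(struct timespec t, long gran)
{
    if (gran >= NSEC_PER_SEC)
        t.tv_nsec = 0;
    else if (gran > 1)
        t.tv_nsec -= t.tv_nsec % gran;
    return t;
}
```

Storing only truncated values keeps the in-memory timestamp consistent with what the filesystem can actually represent.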
  26. 17 Feb, 2016 1 commit