1. 13 11月, 2019 2 次提交
    • T
      kernfs: use dumber locking for kernfs_find_and_get_node_by_ino() · b680b081
      Tejun Heo 提交于
      kernfs_find_and_get_node_by_ino() uses RCU protection.  It's currently
      a bit buggy because it can look up a node which hasn't been activated
      yet and thus may end up exposing a node that the kernfs user is still
      prepping.
      
      While it can be fixed by pushing it further in the current direction,
      it's already complicated and isn't clear whether the complexity is
      justified.  The main use of kernfs_find_and_get_node_by_ino() is for
      exportfs operations.  They aren't super hot and all the follow-up
      operations (e.g. mapping to path) use normal locking anyway.
      
      Let's switch to a dumber locking scheme and protect the lookup with
      kernfs_idr_lock.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      b680b081
    • T
      kernfs: fix ino wrap-around detection · e23f568a
      Tejun Heo 提交于
      When the 32bit ino wraps around, kernfs increments the generation
      number to distinguish reused ino instances.  The wrap-around detection
      tests whether the allocated ino is lower than what the cursor but the
      cursor is pointing to the next ino to allocate so the condition never
      triggers.
      
      Fix it by remembering the last ino and comparing against that.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Fixes: 4a3ef68a ("kernfs: implement i_generation")
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: stable@vger.kernel.org # v4.14+
      e23f568a
  2. 30 8月, 2019 1 次提交
    • D
      timestamp_truncate: Replace users of timespec64_trunc · 3818c190
      Deepa Dinamani 提交于
      Update the inode timestamp updates to use timestamp_truncate()
      instead of timespec64_trunc().
      
      The change was mostly generated by the following coccinelle
      script.
      
      virtual context
      virtual patch
      
      @r1 depends on patch forall@
      struct inode *inode;
      identifier i_xtime =~ "^i_[acm]time$";
      expression e;
      @@
      
      inode->i_xtime =
      - timespec64_trunc(
      + timestamp_truncate(
      ...,
      - e);
      + inode);
      Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: NJeff Layton <jlayton@kernel.org>
      Cc: adrian.hunter@intel.com
      Cc: dedekind1@gmail.com
      Cc: gregkh@linuxfoundation.org
      Cc: hch@lst.de
      Cc: jaegeuk@kernel.org
      Cc: jlbec@evilplan.org
      Cc: richard@nod.at
      Cc: tj@kernel.org
      Cc: yuchao0@huawei.com
      Cc: linux-f2fs-devel@lists.sourceforge.net
      Cc: linux-ntfs-dev@lists.sourceforge.net
      Cc: linux-mtd@lists.infradead.org
      3818c190
  3. 08 8月, 2019 1 次提交
    • G
      Revert "kernfs: fix memleak in kernel_ops_readdir()" · 8097c43b
      Greg Kroah-Hartman 提交于
      This reverts commit cc798c83.
      
      Tony writes:
      	Somehow this causes a regression in Linux next for me where I'm
      	seeing lots of sysfs entries now missing under
      	/sys/bus/platform/devices.
      
      	For example, I now only see one .serial entry show up in sysfs.
      	Things work again if I revert commit cc798c83 ("kernfs: fix
      	memleak inkernel_ops_readdir()"). Any ideas why that would be?
      
      Tejun says:
      	Ugh, you're right.  It can get double-put cuz ctx->pos is put by
      	release too.
      
      So reverting it for now.
      Reported-by: NTony Lindgren <tony@atomide.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Fixes: cc798c83 ("kernfs: fix memleak in kernel_ops_readdir()")
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8097c43b
  4. 06 8月, 2019 1 次提交
  5. 25 7月, 2019 2 次提交
  6. 05 6月, 2019 1 次提交
  7. 21 5月, 2019 1 次提交
  8. 27 4月, 2019 1 次提交
    • A
      fsnotify(): switch to passing const struct qstr * for file_name · 25b229df
      Al Viro 提交于
      Note that in fnsotify_move() and fsnotify_link() we are guaranteed
      that dentry->d_name won't change during the fsnotify() evaluation
      (by having the parent directory locked exclusive), so we don't
      need to fetch dentry->d_name.name in the callers.  In fsnotify_dirent()
      the same stability of dentry->d_name is also true, but it's a bit
      more convoluted - there is one callchain (devpts_pty_new() ->
      fsnotify_create() -> fsnotify_dirent()) where the parent is _not_
      locked, but on devpts ->d_name of everything is unchanging; it
      has neither explicit nor implicit renames.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      25b229df
  9. 26 4月, 2019 1 次提交
  10. 04 4月, 2019 1 次提交
    • O
      kernfs: fix xattr name handling in LSM helpers · 1537ad15
      Ondrej Mosnacek 提交于
      The implementation of kernfs_security_xattr_*() helpers reuses the
      kernfs_node_xattr_*() functions, which take the suffix of the xattr name
      and extract full xattr name from it using xattr_full_name(). However,
      this function relies on the fact that the suffix passed to xattr
      handlers from VFS is always constructed from the full name by just
      incerementing the pointer. This doesn't necessarily hold for the callers
      of kernfs_security_xattr_*(), so their usage will easily lead to
      out-of-bounds access.
      
      Fix this by moving the xattr name reconstruction to the VFS xattr
      handlers and replacing the kernfs_security_xattr_*() helpers with more
      general kernfs_xattr_*() helpers that take full xattr name and allow
      accessing all kernfs node's xattrs.
      Reported-by: Nkernel test robot <rong.a.chen@intel.com>
      Fixes: b230d5ab ("LSM: add new hook for kernfs node initialization")
      Fixes: ec882da5 ("selinux: implement the kernfs_init_security hook")
      Signed-off-by: NOndrej Mosnacek <omosnace@redhat.com>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      1537ad15
  11. 21 3月, 2019 5 次提交
    • O
      kernfs: initialize security of newly created nodes · e19dfdc8
      Ondrej Mosnacek 提交于
      Use the new security_kernfs_init_security() hook to allow LSMs to
      possibly assign a non-default security context to a newly created kernfs
      node based on the attributes of the new node and also its parent node.
      
      This fixes an issue with cgroupfs under SELinux, where newly created
      cgroup subdirectories/files would not inherit its parent's context if
      it had been set explicitly to a non-default value (other than the genfs
      context specified by the policy). This can be reproduced as follows (on
      Fedora/RHEL):
      
          # mkdir /sys/fs/cgroup/unified/test
          # # Need permissive to change the label under Fedora policy:
          # setenforce 0
          # chcon -t container_file_t /sys/fs/cgroup/unified/test
          # ls -lZ /sys/fs/cgroup/unified
          total 0
          -r--r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.controllers
          -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.max.depth
          -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.max.descendants
          -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.procs
          -r--r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.stat
          -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.subtree_control
          -rw-r--r--.  1 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 cgroup.threads
          drwxr-xr-x.  2 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 init.scope
          drwxr-xr-x. 26 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:21 system.slice
          drwxr-xr-x.  3 root root system_u:object_r:container_file_t:s0 0 Jan 29 03:15 test
          drwxr-xr-x.  3 root root system_u:object_r:cgroup_t:s0         0 Jan 29 03:06 user.slice
          # mkdir /sys/fs/cgroup/unified/test/subdir
      
      Actual result:
      
          # ls -ldZ /sys/fs/cgroup/unified/test/subdir
          drwxr-xr-x. 2 root root system_u:object_r:cgroup_t:s0 0 Jan 29 03:15 /sys/fs/cgroup/unified/test/subdir
      
      Expected result:
      
          # ls -ldZ /sys/fs/cgroup/unified/test/subdir
          drwxr-xr-x. 2 root root unconfined_u:object_r:container_file_t:s0 0 Jan 29 03:15 /sys/fs/cgroup/unified/test/subdir
      
      Link: https://github.com/SELinuxProject/selinux-kernel/issues/39Signed-off-by: NOndrej Mosnacek <omosnace@redhat.com>
      Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      e19dfdc8
    • O
      LSM: add new hook for kernfs node initialization · b230d5ab
      Ondrej Mosnacek 提交于
      This patch introduces a new security hook that is intended for
      initializing the security data for newly created kernfs nodes, which
      provide a way of storing a non-default security context, but need to
      operate independently from mounts (and therefore may not have an
      associated inode at the moment of creation).
      
      The main motivation is to allow kernfs nodes to inherit the context of
      the parent under SELinux, similar to the behavior of
      security_inode_init_security(). Other LSMs may implement their own logic
      for handling the creation of new nodes.
      
      This patch also adds helper functions to <linux/kernfs.h> for
      getting/setting security xattrs of a kernfs node so that LSMs hooks are
      able to do their job. Other important attributes should be accessible
      direcly in the kernfs_node fields (in case there is need for more, then
      new helpers should be added to kernfs.h along with the patch that needs
      them).
      Signed-off-by: NOndrej Mosnacek <omosnace@redhat.com>
      Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
      [PM: more manual merge fixes]
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      b230d5ab
    • O
      kernfs: use simple_xattrs for security attributes · 0ac6075a
      Ondrej Mosnacek 提交于
      Replace the special handling of security xattrs with simple_xattrs, as
      is already done for the trusted xattrs. This simplifies the code and
      allows LSMs to use more than just a single xattr to do their business.
      Signed-off-by: NOndrej Mosnacek <omosnace@redhat.com>
      Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
      [PM: manual merge fixes]
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      0ac6075a
    • O
      kernfs: do not alloc iattrs in kernfs_xattr_get · d0c9c153
      Ondrej Mosnacek 提交于
      This is a read-only operation, so we can simply return -ENODATA if
      kn->iattr is NULL.
      Signed-off-by: NOndrej Mosnacek <omosnace@redhat.com>
      Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
      [PM: minor merge fixes]
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      d0c9c153
    • O
      kernfs: clean up struct kernfs_iattrs · 05895219
      Ondrej Mosnacek 提交于
      Right now, kernfs_iattrs embeds the whole struct iattr, even though it
      doesn't really use half of its fields... This both leads to wasting
      space and makes the code look awkward. Let's just list the few fields
      we need directly in struct kernfs_iattrs.
      Signed-off-by: NOndrej Mosnacek <omosnace@redhat.com>
      Acked-by: NCasey Schaufler <casey@schaufler-ca.com>
      [PM: merged a number of chunks manually due to fuzz]
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      05895219
  12. 06 3月, 2019 1 次提交
    • J
      fs: kernfs: add poll file operation · 147e1a97
      Johannes Weiner 提交于
      Patch series "psi: pressure stall monitors", v3.
      
      Android is adopting psi to detect and remedy memory pressure that
      results in stuttering and decreased responsiveness on mobile devices.
      
      Psi gives us the stall information, but because we're dealing with
      latencies in the millisecond range, periodically reading the pressure
      files to detect stalls in a timely fashion is not feasible.  Psi also
      doesn't aggregate its averages at a high enough frequency right now.
      
      This patch series extends the psi interface such that users can
      configure sensitive latency thresholds and use poll() and friends to be
      notified when these are breached.
      
      As high-frequency aggregation is costly, it implements an aggregation
      method that is optimized for fast, short-interval averaging, and makes
      the aggregation frequency adaptive, such that high-frequency updates
      only happen while monitored stall events are actively occurring.
      
      With these patches applied, Android can monitor for, and ward off,
      mounting memory shortages before they cause problems for the user.  For
      example, using memory stall monitors in userspace low memory killer
      daemon (lmkd) we can detect mounting pressure and kill less important
      processes before device becomes visibly sluggish.
      
      In our memory stress testing psi memory monitors produce roughly 10x
      less false positives compared to vmpressure signals.  Having ability to
      specify multiple triggers for the same psi metric allows other parts of
      Android framework to monitor memory state of the device and act
      accordingly.
      
      The new interface is straightforward.  The user opens one of the
      pressure files for writing and writes a trigger description into the
      file descriptor that defines the stall state - some or full, and the
      maximum stall time over a given window of time.  E.g.:
      
              /* Signal when stall time exceeds 100ms of a 1s window */
              char trigger[] = "full 100000 1000000";
              fd = open("/proc/pressure/memory");
              write(fd, trigger, sizeof(trigger));
              while (poll() >= 0) {
                      ...
              }
              close(fd);
      
      When the monitored stall state is entered, psi adapts its aggregation
      frequency according to what the configured time window requires in order
      to emit event signals in a timely fashion.  Once the stalling subsides,
      aggregation reverts back to normal.
      
      The trigger is associated with the open file descriptor.  To stop
      monitoring, the user only needs to close the file descriptor and the
      trigger is discarded.
      
      Patches 1-4 prepare the psi code for polling support.  Patch 5
      implements the adaptive polling logic, the pressure growth detection
      optimized for short intervals, and hooks up write() and poll() on the
      pressure files.
      
      The patches were developed in collaboration with Johannes Weiner.
      
      This patch (of 5):
      
      Kernfs has a standardized poll/notification mechanism for waking all
      pollers on all fds when a filesystem node changes.  To allow polling for
      custom events, add a .poll callback that can override the default.
      
      This is in preparation for pollable cgroup pressure files which have
      per-fd trigger configurations.
      
      Link: http://lkml.kernel.org/r/20190124211518.244221-2-surenb@google.comSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: NSuren Baghdasaryan <surenb@google.com>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      147e1a97
  13. 28 2月, 2019 1 次提交
    • D
      kernfs, sysfs, cgroup, intel_rdt: Support fs_context · 23bf1b6b
      David Howells 提交于
      Make kernfs support superblock creation/mount/remount with fs_context.
      
      This requires that sysfs, cgroup and intel_rdt, which are built on kernfs,
      be made to support fs_context also.
      
      Notes:
      
       (1) A kernfs_fs_context struct is created to wrap fs_context and the
           kernfs mount parameters are moved in here (or are in fs_context).
      
       (2) kernfs_mount{,_ns}() are made into kernfs_get_tree().  The extra
           namespace tag parameter is passed in the context if desired
      
       (3) kernfs_free_fs_context() is provided as a destructor for the
           kernfs_fs_context struct, but for the moment it does nothing except
           get called in the right places.
      
       (4) sysfs doesn't wrap kernfs_fs_context since it has no parameters to
           pass, but possibly this should be done anyway in case someone wants to
           add a parameter in future.
      
       (5) A cgroup_fs_context struct is created to wrap kernfs_fs_context and
           the cgroup v1 and v2 mount parameters are all moved there.
      
       (6) cgroup1 parameter parsing error messages are now handled by invalf(),
           which allows userspace to collect them directly.
      
       (7) cgroup1 parameter cleanup is now done in the context destructor rather
           than in the mount/get_tree and remount functions.
      
      Weirdies:
      
       (*) cgroup_do_get_tree() calls cset_cgroup_from_root() with locks held,
           but then uses the resulting pointer after dropping the locks.  I'm
           told this is okay and needs commenting.
      
       (*) The cgroup refcount web.  This really needs documenting.
      
       (*) cgroup2 only has one root?
      
      Add a suggestion from Thomas Gleixner in which the RDT enablement code is
      placed into its own function.
      
      [folded a leak fix from Andrey Vagin]
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cc: Tejun Heo <tj@kernel.org>
      cc: Li Zefan <lizefan@huawei.com>
      cc: Johannes Weiner <hannes@cmpxchg.org>
      cc: cgroups@vger.kernel.org
      cc: fenghua.yu@intel.com
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      23bf1b6b
  14. 08 2月, 2019 1 次提交
    • A
      kernfs: Allocating memory for kernfs_iattrs with kmem_cache. · 26e28d68
      Ayush Mittal 提交于
      Creating a new cache for kernfs_iattrs.
      Currently, memory is allocated with kzalloc() which
      always gives aligned memory. On ARM, this is 64 byte aligned.
      To avoid the wastage of memory in aligning the size requested,
      a new cache for kernfs_iattrs is created.
      
      Size of struct kernfs_iattrs is 80 Bytes.
      On ARM, it will come in kmalloc-128 slab.
      and it will come in kmalloc-192 slab if debug info is enabled.
      Extra bytes taken 48 bytes.
      
      Total number of objects created : 4096
      Total saving = 48*4096 = 192 KB
      
      After creating new slab(When debug info is enabled) :
      sh-3.2# cat /proc/slabinfo
      ...
      kernfs_iattrs_cache   4069   4096    128   32    1 : tunables    0    0    0 : slabdata    128    128      0
      ...
      
      All testing has been done on ARM target.
      Signed-off-by: NAyush Mittal <ayush.m@samsung.com>
      Signed-off-by: NVaneet Narang <v.narang@samsung.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      26e28d68
  15. 18 1月, 2019 2 次提交
    • A
      kill kernfs_pin_sb() · 6d7fbce7
      Al Viro 提交于
      unused now and impossible to use safely anyway.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      6d7fbce7
    • A
      fix cgroup_do_mount() handling of failure exits · 399504e2
      Al Viro 提交于
      same story as with last May fixes in sysfs (7b745a4e
      "unfuck sysfs_mount()"); new_sb is left uninitialized
      in case of early errors in kernfs_mount_ns() and papering
      over it by treating any error from kernfs_mount_ns() as
      equivalent to !new_ns ends up conflating the cases when
      objects had never been transferred to a superblock with
      ones when that has happened and resulting new superblock
      had been dropped.  Easily fixed (same way as in sysfs
      case).  Additionally, there's a superblock leak on
      kernfs_node_dentry() failure *and* a dentry leak inside
      kernfs_node_dentry() itself - the latter on probably
      impossible errors, but the former not impossible to trigger
      (as the matter of fact, injecting allocation failures
      at that point *does* trigger it).
      
      Cc: stable@kernel.org
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      399504e2
  16. 27 11月, 2018 1 次提交
    • R
      kernfs: Improve kernfs_notify() poll notification latency · 03c0a920
      Radu Rendec 提交于
      kernfs_notify() does two notifications: poll and fsnotify. Originally,
      both notifications were done from scheduled work context and all that
      kernfs_notify() did was schedule the work.
      
      This patch simply moves the poll notification from the scheduled work
      handler to kernfs_notify(). The fsnotify notification still needs to be
      done from scheduled work context because it can sleep (it needs to lock
      a mutex).
      
      If the poll notification is time critical (the notified thread needs to
      wake as quickly as possible), it's better to do it from kernfs_notify()
      directly. One example is calling sysfs_notify_dirent() from a hardware
      interrupt handler to wake up a thread and handle the interrupt in user
      space.
      Signed-off-by: NRadu Rendec <radu.rendec@gmail.com>
      Acked-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      03c0a920
  17. 27 10月, 2018 1 次提交
    • J
      mm: zero-seek shrinkers · 4b85afbd
      Johannes Weiner 提交于
      The page cache and most shrinkable slab caches hold data that has been
      read from disk, but there are some caches that only cache CPU work, such
      as the dentry and inode caches of procfs and sysfs, as well as the subset
      of radix tree nodes that track non-resident page cache.
      
      Currently, all these are shrunk at the same rate: using DEFAULT_SEEKS for
      the shrinker's seeks setting tells the reclaim algorithm that for every
      two page cache pages scanned it should scan one slab object.
      
      This is a bogus setting.  A virtual inode that required no IO to create is
      not twice as valuable as a page cache page; shadow cache entries with
      eviction distances beyond the size of memory aren't either.
      
      In most cases, the behavior in practice is still fine.  Such virtual
      caches don't tend to grow and assert themselves aggressively, and usually
      get picked up before they cause problems.  But there are scenarios where
      that's not true.
      
      Our database workloads suffer from two of those.  For one, their file
      workingset is several times bigger than available memory, which has the
      kernel aggressively create shadow page cache entries for the non-resident
      parts of it.  The workingset code does tell the VM that most of these are
      expendable, but the VM ends up balancing them 2:1 to cache pages as per
      the seeks setting.  This is a huge waste of memory.
      
      These workloads also deal with tens of thousands of open files and use
      /proc for introspection, which ends up growing the proc_inode_cache to
      absurdly large sizes - again at the cost of valuable cache space, which
      isn't a reasonable trade-off, given that proc inodes can be re-created
      without involving the disk.
      
      This patch implements a "zero-seek" setting for shrinkers that results in
      a target ratio of 0:1 between their objects and IO-backed caches.  This
      allows such virtual caches to grow when memory is available (they do
      cache/avoid CPU work after all), but effectively disables them as soon as
      IO-backed objects are under pressure.
      
      It then switches the shrinkers for procfs and sysfs metadata, as well as
      excess page cache shadow nodes, to the new zero-seek setting.
      
      Link: http://lkml.kernel.org/r/20181009184732.762-5-hannes@cmpxchg.orgSigned-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reported-by: NDomas Mituzas <dmituzas@fb.com>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NRik van Riel <riel@surriel.com>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4b85afbd
  18. 17 9月, 2018 1 次提交
  19. 21 7月, 2018 1 次提交
  20. 07 7月, 2018 1 次提交
  21. 06 6月, 2018 1 次提交
    • D
      vfs: change inode times to use struct timespec64 · 95582b00
      Deepa Dinamani 提交于
      struct timespec is not y2038 safe. Transition vfs to use
      y2038 safe struct timespec64 instead.
      
      The change was made with the help of the following cocinelle
      script. This catches about 80% of the changes.
      All the header file and logic changes are included in the
      first 5 rules. The rest are trivial substitutions.
      I avoid changing any of the function signatures or any other
      filesystem specific data structures to keep the patch simple
      for review.
      
      The script can be a little shorter by combining different cases.
      But, this version was sufficient for my usecase.
      
      virtual patch
      
      @ depends on patch @
      identifier now;
      @@
      - struct timespec
      + struct timespec64
        current_time ( ... )
        {
      - struct timespec now = current_kernel_time();
      + struct timespec64 now = current_kernel_time64();
        ...
      - return timespec_trunc(
      + return timespec64_trunc(
        ... );
        }
      
      @ depends on patch @
      identifier xtime;
      @@
       struct \( iattr \| inode \| kstat \) {
       ...
      -       struct timespec xtime;
      +       struct timespec64 xtime;
       ...
       }
      
      @ depends on patch @
      identifier t;
      @@
       struct inode_operations {
       ...
      int (*update_time) (...,
      -       struct timespec t,
      +       struct timespec64 t,
      ...);
       ...
       }
      
      @ depends on patch @
      identifier t;
      identifier fn_update_time =~ "update_time$";
      @@
       fn_update_time (...,
      - struct timespec *t,
      + struct timespec64 *t,
       ...) { ... }
      
      @ depends on patch @
      identifier t;
      @@
      lease_get_mtime( ... ,
      - struct timespec *t
      + struct timespec64 *t
        ) { ... }
      
      @te depends on patch forall@
      identifier ts;
      local idexpression struct inode *inode_node;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn_update_time =~ "update_time$";
      identifier fn;
      expression e, E3;
      local idexpression struct inode *node1;
      local idexpression struct inode *node2;
      local idexpression struct iattr *attr1;
      local idexpression struct iattr *attr2;
      local idexpression struct iattr attr;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      @@
      (
      (
      - struct timespec ts;
      + struct timespec64 ts;
      |
      - struct timespec ts = current_time(inode_node);
      + struct timespec64 ts = current_time(inode_node);
      )
      
      <+... when != ts
      (
      - timespec_equal(&inode_node->i_xtime, &ts)
      + timespec64_equal(&inode_node->i_xtime, &ts)
      |
      - timespec_equal(&ts, &inode_node->i_xtime)
      + timespec64_equal(&ts, &inode_node->i_xtime)
      |
      - timespec_compare(&inode_node->i_xtime, &ts)
      + timespec64_compare(&inode_node->i_xtime, &ts)
      |
      - timespec_compare(&ts, &inode_node->i_xtime)
      + timespec64_compare(&ts, &inode_node->i_xtime)
      |
      ts = current_time(e)
      |
      fn_update_time(..., &ts,...)
      |
      inode_node->i_xtime = ts
      |
      node1->i_xtime = ts
      |
      ts = inode_node->i_xtime
      |
      <+... attr1->ia_xtime ...+> = ts
      |
      ts = attr1->ia_xtime
      |
      ts.tv_sec
      |
      ts.tv_nsec
      |
      btrfs_set_stack_timespec_sec(..., ts.tv_sec)
      |
      btrfs_set_stack_timespec_nsec(..., ts.tv_nsec)
      |
      - ts = timespec64_to_timespec(
      + ts =
      ...
      -)
      |
      - ts = ktime_to_timespec(
      + ts = ktime_to_timespec64(
      ...)
      |
      - ts = E3
      + ts = timespec_to_timespec64(E3)
      |
      - ktime_get_real_ts(&ts)
      + ktime_get_real_ts64(&ts)
      |
      fn(...,
      - ts
      + timespec64_to_timespec(ts)
      ,...)
      )
      ...+>
      (
      <... when != ts
      - return ts;
      + return timespec64_to_timespec(ts);
      ...>
      )
      |
      - timespec_equal(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_equal(&node1->i_xtime2, &node2->i_xtime2)
      |
      - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2)
      + timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2)
      |
      - timespec_compare(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2)
      |
      node1->i_xtime1 =
      - timespec_trunc(attr1->ia_xtime1,
      + timespec64_trunc(attr1->ia_xtime1,
      ...)
      |
      - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2,
      + attr1->ia_xtime1 =  timespec64_trunc(attr2->ia_xtime2,
      ...)
      |
      - ktime_get_real_ts(&attr1->ia_xtime1)
      + ktime_get_real_ts64(&attr1->ia_xtime1)
      |
      - ktime_get_real_ts(&attr.ia_xtime1)
      + ktime_get_real_ts64(&attr.ia_xtime1)
      )
      
      @ depends on patch @
      struct inode *node;
      struct iattr *attr;
      identifier fn;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      expression e;
      @@
      (
      - fn(node->i_xtime);
      + fn(timespec64_to_timespec(node->i_xtime));
      |
       fn(...,
      - node->i_xtime);
      + timespec64_to_timespec(node->i_xtime));
      |
      - e = fn(attr->ia_xtime);
      + e = fn(timespec64_to_timespec(attr->ia_xtime));
      )
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      )
      ...+>
      }
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      struct kstat *stat;
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier i_xtime =~ "^i_[acm]time$";
      identifier xtime =~ "^[acm]time$";
      identifier fn, ret;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(stat->xtime);
      ret = fn (...,
      - &stat->xtime);
      + &ts);
      )
      ...+>
      }
      
      @ depends on patch @
      struct inode *node;
      struct inode *node2;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier i_xtime3 =~ "^i_[acm]time$";
      struct iattr *attrp;
      struct iattr *attrp2;
      struct iattr attr ;
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      struct kstat *stat;
      struct kstat stat1;
      struct timespec64 ts;
      identifier xtime =~ "^[acmb]time$";
      expression e;
      @@
      (
      ( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1  ;
      |
       node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       stat->xtime = node2->i_xtime1;
      |
       stat1.xtime = node2->i_xtime1;
      |
      ( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1  ;
      |
      ( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2;
      |
      - e = node->i_xtime1;
      + e = timespec64_to_timespec( node->i_xtime1 );
      |
      - e = attrp->ia_xtime1;
      + e = timespec64_to_timespec( attrp->ia_xtime1 );
      |
      node->i_xtime1 = current_time(...);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
       node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
      - node->i_xtime1 = e;
      + node->i_xtime1 = timespec_to_timespec64(e);
      )
      Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com>
      Cc: <anton@tuxera.com>
      Cc: <balbi@kernel.org>
      Cc: <bfields@fieldses.org>
      Cc: <darrick.wong@oracle.com>
      Cc: <dhowells@redhat.com>
      Cc: <dsterba@suse.com>
      Cc: <dwmw2@infradead.org>
      Cc: <hch@lst.de>
      Cc: <hirofumi@mail.parknet.co.jp>
      Cc: <hubcap@omnibond.com>
      Cc: <jack@suse.com>
      Cc: <jaegeuk@kernel.org>
      Cc: <jaharkes@cs.cmu.edu>
      Cc: <jslaby@suse.com>
      Cc: <keescook@chromium.org>
      Cc: <mark@fasheh.com>
      Cc: <miklos@szeredi.hu>
      Cc: <nico@linaro.org>
      Cc: <reiserfs-devel@vger.kernel.org>
      Cc: <richard@nod.at>
      Cc: <sage@redhat.com>
      Cc: <sfrench@samba.org>
      Cc: <swhiteho@redhat.com>
      Cc: <tj@kernel.org>
      Cc: <trond.myklebust@primarydata.com>
      Cc: <tytso@mit.edu>
      Cc: <viro@zeniv.linux.org.uk>
      95582b00
  22. 22 5月, 2018 1 次提交
  23. 23 4月, 2018 1 次提交
  24. 12 2月, 2018 1 次提交
    • L
      vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Linus Torvalds 提交于
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But they keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a9a08845
  25. 20 1月, 2018 1 次提交
  26. 28 11月, 2017 2 次提交
    • A
      fs: annotate ->poll() instances · 076ccb76
      Al Viro 提交于
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      076ccb76
    • L
      Rename superblock flags (MS_xyz -> SB_xyz) · 1751e8a6
      Linus Torvalds 提交于
      This is a pure automated search-and-replace of the internal kernel
      superblock flags.
      
      The s_flags are now called SB_*, with the names and the values for the
      moment mirroring the MS_* flags that they're equivalent to.
      
      Note how the MS_xyz flags are the ones passed to the mount system call,
      while the SB_xyz flags are what we then use in sb->s_flags.
      
      The script to do this was:
      
          # places to look in; re security/*: it generally should *not* be
          # touched (that stuff parses mount(2) arguments directly), but
          # there are two places where we really deal with superblock flags.
          FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
                  include/linux/fs.h include/uapi/linux/bfs_fs.h \
                  security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
          # the list of MS_... constants
          SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
                DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
                POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
                I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
                ACTIVE NOUSER"
      
          SED_PROG=
          for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
      
          # we want files that contain at least one of MS_...,
          # with fs/namespace.c and fs/pnode.c excluded.
          L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
      
          for f in $L; do sed -i $f $SED_PROG; done
      Requested-by: NAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1751e8a6
  27. 01 9月, 2017 1 次提交
  28. 28 8月, 2017 1 次提交
  29. 29 7月, 2017 4 次提交