1. 03 Apr, 2017 3 commits
  2. 02 Mar, 2017 2 commits
  3. 09 Feb, 2017 1 commit
  4. 24 Jan, 2017 1 commit
    • inotify: Convert to using per-namespace limits · 1cce1eea
      Committed by Nikolay Borisov
      This patchset converts inotify to using the newly introduced
      per-userns sysctl infrastructure.
      
      Currently the inotify instances/watches are being accounted in the
      user_struct structure. This means that in setups where multiple
      users in unprivileged containers map to the same underlying
      real user (i.e. pointing to the same user_struct) the inotify limits
      are going to be shared as well, allowing one user (or application) to exhaust
      everyone else's limits.
      
      Fix this by switching the inotify sysctls to using the
      per-namespace/per-user limits. This will allow the server admin to
      set sensible global limits, which can further be tuned inside every
      individual user namespace. Additionally, in order to preserve the
      sysctl ABI make the existing inotify instances/watches sysctls
      modify the values of the initial user namespace.
      Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
      Acked-by: Jan Kara <jack@suse.cz>
      Acked-by: Serge Hallyn <serge@hallyn.com>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      1cce1eea
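The existing inotify sysctls mentioned above can be inspected from user space. The following is a minimal sketch, assuming a Linux system that exposes them under /proc/sys/fs/inotify/ as max_user_instances and max_user_watches; the helper name read_limit is hypothetical and is not part of the patch.

```c
#include <stdio.h>

/* Hypothetical helper: parse one integer limit from a sysctl-style file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_limit(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    int parsed = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return parsed ? 0 : -1;
}

/* Print the two inotify limits, or a note when a file is not available. */
void print_inotify_limits(void)
{
    const char *paths[] = {
        "/proc/sys/fs/inotify/max_user_instances",
        "/proc/sys/fs/inotify/max_user_watches",
    };
    for (int i = 0; i < 2; i++) {
        long value;
        if (read_limit(paths[i], &value) == 0)
            printf("%s = %ld\n", paths[i], value);
        else
            printf("%s: unavailable\n", paths[i]);
    }
}
```

Calling print_inotify_limits() from inside and outside a user namespace is one way to compare the values each side sees.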
  5. 24 Dec, 2016 1 commit
    • fsnotify: Remove fsnotify_duplicate_mark() · e3ba7307
      Committed by Jan Kara
      There are only two call sites of fsnotify_duplicate_mark(). Both are
      in kernel/audit_tree.c and both are bogus. The vfsmount pointer is unused
      for the audit tree, the inode pointer and group get set in
      fsnotify_add_mark_locked() later anyway, and mask and free_mark are already
      set in alloc_chunk(). In fact, calling fsnotify_duplicate_mark() is
      actively harmful because the following fsnotify_add_mark_locked() will leak
      a group reference by overwriting the group pointer. So just remove the two
      calls to fsnotify_duplicate_mark() and the function.
      Signed-off-by: Jan Kara <jack@suse.cz>
      [PM: line wrapping to fit in 80 chars]
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      e3ba7307
  6. 13 Dec, 2016 1 commit
    • fsnotify: Fix possible use-after-free in inode iteration on umount · 5716863e
      Committed by Jan Kara
      fsnotify_unmount_inodes() plays complex tricks to pin next inode in the
      sb->s_inodes list when iterating over all inodes. Furthermore the code has a
      bug that if the current inode is the last on i_sb_list that does not have e.g.
      I_FREEING set, then we leave next_i pointing to inode which may get removed
      from the i_sb_list once we drop s_inode_list_lock thus resulting in
      use-after-free issues (usually manifesting as infinite looping in
      fsnotify_unmount_inodes()).
      
      Fix the problem by keeping current inode pinned somewhat longer. Then we can
      make the code much simpler and standard.
      
      CC: stable@vger.kernel.org
      Signed-off-by: Jan Kara <jack@suse.cz>
      5716863e
  7. 06 Dec, 2016 3 commits
  8. 08 Oct, 2016 5 commits
  9. 20 Sep, 2016 2 commits
  10. 20 May, 2016 1 commit
    • fsnotify: avoid spurious EMFILE errors from inotify_init() · 35e48176
      Committed by Jan Kara
      Inotify instance is destroyed when all references to it are dropped.
      That not only means that the corresponding file descriptor needs to be
      closed but also that all corresponding instance marks are freed (as each
      mark holds a reference to the inotify instance).  However, marks are
      freed only after the SRCU period ends, which can take some time.  Thus, if
      a user rapidly creates and frees inotify instances, the number of existing
      inotify instances can exceed the max_user_instances limit although from the
      user's point of view there is always at most one existing instance.  Thus
      inotify_init() returns an EMFILE error, which is hard to justify from the
      user's point of view.  This problem is exposed by the LTP inotify06 testcase
      on some machines.
      
      We fix the problem by making sure all group marks are properly freed
      while destroying inotify instance.  We wait for SRCU period to end in
      that path anyway since we have to make sure there is no event being
      added to the instance while we are tearing down the instance.  So it
      takes only some plumbing to allow for marks to be destroyed in that path
      as well and not from a dedicated work item.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reported-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
      Tested-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      35e48176
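The usage pattern described in this commit (rapidly creating and destroying inotify instances, each holding a mark) can be sketched from user space as follows. This is only an illustration under the assumption of a Linux system with <sys/inotify.h>; the function name count_spurious_emfile is hypothetical and this is not the LTP inotify06 test itself.

```c
#include <errno.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Create an inotify instance, attach one watch so that the instance holds
 * a mark, and close it again, `rounds` times in a row.
 * Returns how many inotify_init() calls failed with EMFILE. */
int count_spurious_emfile(int rounds)
{
    int failures = 0;

    for (int i = 0; i < rounds; i++) {
        int fd = inotify_init();
        if (fd < 0) {
            if (errno == EMFILE)
                failures++;
            continue;
        }
        /* The watch result is ignored on purpose: the watch only exists
         * so that a mark references the instance at close time. */
        (void)inotify_add_watch(fd, ".", IN_CREATE);
        close(fd);
    }
    return failures;
}
```

Since at most one instance is open at any time, a non-zero return value on an otherwise idle system would point at the delayed mark freeing that the commit message describes.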
  11. 19 Feb, 2016 2 commits
  12. 15 Jan, 2016 2 commits
  13. 06 Nov, 2015 2 commits
    • inotify: actually check for invalid bits in sys_inotify_add_watch() · d30e2c05
      Committed by Dave Hansen
      The comment here says that it is checking for invalid bits.  But, the mask
      is *actually* checking to ensure that _any_ valid bit is set, which is
      quite different.
      
      Without this check, an unexpected bit could get set on an inotify object.
      Since these bits are also interpreted by the fsnotify/dnotify code, there
      is the potential for an object to be mishandled inside the kernel.  For
      instance, can we be sure that setting the dnotify flag FS_DN_RENAME on an
      inotify watch is harmless?
      
      Add the actual check which was intended.  Retain the existing check so that
      some inotify bits are being added to the watch.  Plus, this is existing
      behavior which would be nice to preserve.
      
      I did a quick sniff test that inotify functions and that my
      'inotify-tools' package passes 'make check'.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: John McCutchan <john@johnmccutchan.com>
      Cc: Robert Love <rlove@rlove.org>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d30e2c05
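The difference between "any valid bit is set" and "no invalid bit is set" can be shown with a small stand-alone sketch. The constant VALID_BITS and the function names below are illustrative assumptions, not the kernel's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative set of bits the API accepts (an assumption for this sketch). */
#define VALID_BITS 0x00000fffu

/* Old check: passes as long as at least one valid bit is set. */
bool has_any_valid_bit(uint32_t mask)
{
    return (mask & VALID_BITS) != 0;
}

/* Intended check: passes only if no bit outside VALID_BITS is set. */
bool has_no_invalid_bit(uint32_t mask)
{
    return (mask & ~VALID_BITS) == 0;
}

/* Both checks together: some valid bit is set and no invalid bit is set. */
bool mask_is_acceptable(uint32_t mask)
{
    return has_any_valid_bit(mask) && has_no_invalid_bit(mask);
}
```

With a mask such as 0x10000001 (one valid bit plus one stray high bit), the old check alone accepts the mask, while the combined check rejects it.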
    • inotify: hide internal kernel bits from fdinfo · 69335996
      Committed by Dave Hansen
      There was a report that my patch:
      
          inotify: actually check for invalid bits in sys_inotify_add_watch()
      
      broke CRIU.
      
      The reason is that CRIU looks up raw flags in /proc/$pid/fdinfo/* to
      figure out how to rebuild inotify watches and then passes those flags
      directly back in to the inotify API.  One of those flags
      (FS_EVENT_ON_CHILD) is set in mark->mask, but is not part of the inotify
      API.  It is used inside the kernel to _implement_ inotify but it is not
      and has never been part of the API.
      
      My patch above ensured that we only allow bits which are part of the API
      (IN_ALL_EVENTS).  This broke CRIU.
      
      FS_EVENT_ON_CHILD is really internal to the kernel.  It is set _anyway_ on
      all inotify marks.  So, CRIU was really just trying to set a bit that was
      already set.
      
      This patch hides that bit from fdinfo.  CRIU will not see the bit, not try
      to set it, and should work as before.  We should not have been exposing
      this bit in the first place, so this is a good patch independent of the
      CRIU problem.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Reported-by: Andrey Wagin <avagin@gmail.com>
      Acked-by: Andrey Vagin <avagin@openvz.org>
      Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: Eric Paris <eparis@redhat.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: John McCutchan <john@johnmccutchan.com>
      Cc: Robert Love <rlove@rlove.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      69335996
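The idea of hiding an internal bit before reporting a mask to user space can be sketched as below. The constants and function names are illustrative stand-ins chosen for this example (EX_INTERNAL_ON_CHILD plays the role of FS_EVENT_ON_CHILD); the actual patch may filter the mask differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants (assumptions for this sketch). */
#define EX_API_BITS          0x00000fffu /* bits that belong to the user API */
#define EX_INTERNAL_ON_CHILD 0x08000000u /* kernel-internal bit, not API */

/* Mask as it would be reported to user space: internal bit cleared. */
uint32_t mask_for_fdinfo(uint32_t mark_mask)
{
    return mark_mask & ~EX_INTERNAL_ON_CHILD;
}

/* Strict API-side validation: reject any bit outside the API set. */
bool api_accepts(uint32_t mask)
{
    return (mask & ~EX_API_BITS) == 0;
}
```

A checkpoint/restore tool that feeds the reported value straight back into the API then passes the strict validation, because the internal bit never leaves the kernel.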
  14. 05 Sep, 2015 4 commits
  15. 18 Aug, 2015 1 commit
  16. 07 Aug, 2015 1 commit
  17. 22 Jul, 2015 1 commit
    • Revert "fsnotify: fix oops in fsnotify_clear_marks_by_group_flags()" · d725e66c
      Committed by Linus Torvalds
      This reverts commit a2673b6e.
      
      Kinglong Mee reports a memory leak with that patch, and Jan Kara confirms:
      
       "Thanks for report! You are right that my patch introduces a race
        between fsnotify kthread and fsnotify_destroy_group() which can result
        in leaking inotify event on group destruction.
      
        I haven't yet decided whether the right fix is not to queue events for
        dying notification group (as that is pointless anyway) or whether we
        should just fix the original problem differently...  Whenever I look
        at fsnotify code mark handling I get lost in the maze of locks, lists,
        and subtle differences between how different notification systems
        handle notification marks :( I'll think about it over night"
      
      and after thinking about it, Jan says:
      
       "OK, I have looked into the code some more and I found another
        relatively simple way of fixing the original oops.  It will be IMHO
        better than trying to fixup this issue which has more potential for
        breakage.  I'll ask Linus to revert the fsnotify fix he already merged
        and send a new fix"
      Reported-by: Kinglong Mee <kinglongmee@gmail.com>
      Requested-by: Jan Kara <jack@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d725e66c
  18. 18 Jul, 2015 1 commit
  19. 17 Jun, 2015 1 commit
    • fs/notify: don't use module_init for non-modular inotify_user code · c013d5a4
      Committed by Paul Gortmaker
      The INOTIFY_USER option is bool, and hence this code is either
      present or absent.  It will never be modular, so using
      module_init as an alias for __initcall is rather misleading.
      
      Fix this up now, so that we can relocate module_init from
      init.h into module.h in the future.  If we don't do this, we'd
      have to add module.h to obviously non-modular code, and that
      would be a worse thing.
      
      Note that direct use of __initcall is discouraged, vs. one
      of the priority categorized subgroups.  As __initcall gets
      mapped onto device_initcall, our use of fs_initcall (which
      makes sense for fs code) will thus change this registration
      from level 6-device to level 5-fs (i.e. slightly earlier).
      However no observable impact of that small difference has
      been observed during testing, or is expected.
      
      Cc: John McCutchan <john@johnmccutchan.com>
      Cc: Robert Love <rlove@rlove.org>
      Cc: Eric Paris <eparis@parisplace.org>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      c013d5a4
  20. 13 Mar, 2015 1 commit
    • fanotify: fix event filtering with FAN_ONDIR set · b3c1030d
      Committed by Suzuki K. Poulose
      With FAN_ONDIR set, the user can end up getting events that it hasn't
      marked.  This was revealed by the fanotify04 testcase failure on
      Linux-4.0-rc1, and is a regression from 3.19, revealed with 66ba93c0
      ("fanotify: don't set FAN_ONDIR implicitly on a marks ignored mask").
      
         # /opt/ltp/testcases/bin/fanotify04
         [ ... ]
        fanotify04    7  TPASS  :  event generated properly for type 100000
        fanotify04    8  TFAIL  :  fanotify04.c:147: got unexpected event 30
        fanotify04    9  TPASS  :  No event as expected
      
      The testcase adds the following mark: FAN_OPEN | FAN_ONDIR for
      a fanotify on a dir.  It then does an open(), followed by a close() of the
      directory and expects to see an event FAN_OPEN(0x20).  However,
      fanotify returns (FAN_OPEN|FAN_CLOSE_NOWRITE(0x10)).  This happens due to
      a flaw in the check for event_mask in fanotify_should_send_event(), which
      does:
      
      	if (event_mask & marks_mask & ~marks_ignored_mask)
      		return true;
      
      where, event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE),
             marks_mask == (FAN_ONDIR | FAN_OPEN),
             marks_ignored_mask == 0
      
      Fix this by masking the outgoing events to the user, as we already take
      care of FAN_ONDIR and FAN_EVENT_ON_CHILD.
      Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
      Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b3c1030d
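The flawed check and the effect of masking to outgoing events can be reproduced with plain bit arithmetic. In this sketch the EX_ constants are local stand-ins: the values for FAN_OPEN (0x20) and FAN_CLOSE_NOWRITE (0x10) come from the commit message, while the FAN_ONDIR value and the EX_OUTGOING_EVENTS set are assumptions made for illustration. The function names are hypothetical as well.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_FAN_CLOSE_NOWRITE 0x00000010u
#define EX_FAN_OPEN          0x00000020u
#define EX_FAN_ONDIR         0x40000000u /* assumed value for this sketch */

/* Illustrative: the only event types this sketch ever reports to the user. */
#define EX_OUTGOING_EVENTS (EX_FAN_OPEN | EX_FAN_CLOSE_NOWRITE)

/* Flawed check: the shared FAN_ONDIR flag alone makes the masks overlap. */
bool should_send_buggy(uint32_t event_mask, uint32_t marks_mask,
                       uint32_t marks_ignored_mask)
{
    return (event_mask & marks_mask & ~marks_ignored_mask) != 0;
}

/* Sketch of the fix: only real outgoing event bits may cause a match. */
bool should_send_fixed(uint32_t event_mask, uint32_t marks_mask,
                       uint32_t marks_ignored_mask)
{
    return (event_mask & EX_OUTGOING_EVENTS & marks_mask &
            ~marks_ignored_mask) != 0;
}
```

With event_mask == (FAN_ONDIR | FAN_CLOSE_NOWRITE) and marks_mask == (FAN_ONDIR | FAN_OPEN), the flawed check matches on FAN_ONDIR alone, whereas the masked check does not match because no event bit is common to both.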
  21. 23 Feb, 2015 2 commits
    • fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions · 54f2a2f4
      Committed by David Howells
      Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup()
      rather than d_is_dir() when checking a dir watch and give an error on fake
      directories.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      54f2a2f4
    • VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) · e36cb0b8
      Committed by David Howells
      Convert the following where appropriate:
      
       (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
      
       (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
      
       (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry).  This is actually more
           complicated than it appears as some calls should be converted to
           d_can_lookup() instead.  The difference is whether the directory in
           question is a real dir with a ->lookup op or whether it's a fake dir with
           a ->d_automount op.
      
      In some circumstances, we can subsume checks for dentry->d_inode not being
      NULL into this, provided the code isn't in a filesystem that expects
      d_inode to be NULL if the dirent really *is* negative (i.e. if we're going to
      use d_inode() rather than d_backing_inode() to get the inode pointer).
      
      Note that the dentry type field may be set to something other than
      DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
      manages the fall-through from a negative dentry to a lower layer.  In such a
      case, the dentry type of the negative union dentry is set to the same as the
      type of the lower dentry.
      
      However, if you know d_inode is not NULL at the call site, then you can use
      the d_is_xxx() functions even in a filesystem.
      
      There is one further complication: a 0,0 chardev dentry may be labelled
      DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE.  Strictly, this was
      intended for special directory entry types that don't have attached inodes.
      
      The following perl+coccinelle script was used:
      
      use strict;
      
      my @callers;
      my $fd;
      open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
          die "Can't grep for S_ISDIR and co. callers";
      @callers = <$fd>;
      close($fd);
      unless (@callers) {
          print "No matches\n";
          exit(0);
      }
      
      my @cocci = (
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISLNK(E->d_inode->i_mode)',
          '+ d_is_symlink(E)',
          '',
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISDIR(E->d_inode->i_mode)',
          '+ d_is_dir(E)',
          '',
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISREG(E->d_inode->i_mode)',
          '+ d_is_reg(E)' );
      
      my $coccifile = "tmp.sp.cocci";
      open($fd, ">$coccifile") || die $coccifile;
      print($fd "$_\n") || die $coccifile foreach (@cocci);
      close($fd);
      
      foreach my $file (@callers) {
          chomp $file;
          print "Processing ", $file, "\n";
          system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
      	die "spatch failed";
      }
      
      [AV: overlayfs parts skipped]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      e36cb0b8
  22. 11 Feb, 2015 2 commits