1. 07 3月, 2012 1 次提交
  2. 21 2月, 2012 1 次提交
  3. 13 1月, 2012 2 次提交
  4. 11 1月, 2012 1 次提交
    • V
      procfs: add hidepid= and gid= mount options · 0499680a
      Vasiliy Kulikov 提交于
      Add support for mount options to restrict access to /proc/PID/
      directories.  The default backward-compatible "relaxed" behaviour is left
      untouched.
      
      The first mount option is called "hidepid" and its value defines how much
      info about processes we want to be available for non-owners:
      
      hidepid=0 (default) means the old behavior - anybody may read all
      world-readable /proc/PID/* files.
      
      hidepid=1 means users may not access any /proc/<pid>/ directories, but
      their own.  Sensitive files like cmdline, sched*, status are now protected
      against other users.  As permission checking done in proc_pid_permission()
      and files' permissions are left untouched, programs expecting specific
      files' modes are not confused.
      
      hidepid=2 means hidepid=1 plus all /proc/PID/ will be invisible to other
      users.  It doesn't mean that it hides whether a process exists (it can be
      learned by other means, e.g.  by kill -0 $PID), but it hides process' euid
      and egid.  It compicates intruder's task of gathering info about running
      processes, whether some daemon runs with elevated privileges, whether
      another user runs some sensitive program, whether other users run any
      program at all, etc.
      
      gid=XXX defines a group that will be able to gather all processes' info
      (as in hidepid=0 mode).  This group should be used instead of putting
      nonroot user in sudoers file or something.  However, untrusted users (like
      daemons, etc.) which are not supposed to monitor the tasks in the whole
      system should not be added to the group.
      
      hidepid=1 or higher is designed to restrict access to procfs files, which
      might reveal some sensitive private information like precise keystrokes
      timings:
      
      http://www.openwall.com/lists/oss-security/2011/11/05/3
      
      hidepid=1/2 doesn't break monitoring userspace tools.  ps, top, pgrep, and
      conky gracefully handle EPERM/ENOENT and behave as if the current user is
      the only user running processes.  pstree shows the process subtree which
      contains "pstree" process.
      
      Note: the patch doesn't deal with setuid/setgid issues of keeping
      preopened descriptors of procfs files (like
      https://lkml.org/lkml/2011/2/7/368).  We rely on that the leaked
      information like the scheduling counters of setuid apps doesn't threaten
      anybody's privacy - only the user started the setuid program may read the
      counters.
      Signed-off-by: NVasiliy Kulikov <segoon@openwall.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Theodore Tso <tytso@MIT.EDU>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: James Morris <jmorris@namei.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0499680a
  5. 07 1月, 2012 1 次提交
  6. 05 1月, 2012 1 次提交
    • Y
      ext4: add new online resize interface · 19c5246d
      Yongqiang Yang 提交于
      This patch adds new online resize interface, whose input argument is a
      64-bit integer indicating how many blocks there are in the resized fs.
      
      In new resize impelmentation, all work like allocating group tables
      are done by kernel side, so the new resize interface can support
      flex_bg feature and prepares ground for suppoting resize with features
      like bigalloc and exclude bitmap. Besides these, user-space tools just
      passes in the new number of blocks.
      
      We delay initializing the bitmaps and inode tables of added groups if
      possible and add multi groups (a flex groups) each time, so new resize
      is very fast like mkfs.
      Signed-off-by: NYongqiang Yang <xiaoqiangnk@gmail.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      19c5246d
  7. 04 1月, 2012 6 次提交
  8. 30 12月, 2011 1 次提交
  9. 01 12月, 2011 1 次提交
  10. 19 11月, 2011 1 次提交
  11. 08 11月, 2011 1 次提交
  12. 05 11月, 2011 2 次提交
  13. 02 11月, 2011 1 次提交
    • S
      vfs: add d_prune dentry operation · f0023bc6
      Sage Weil 提交于
      This adds a d_prune dentry operation that is called by the VFS prior to
      pruning (i.e. unhashing and killing) a hashed dentry from the dcache.
      Wrap dentry_lru_del() and use the new _prune() helper in the cases where we
      are about to unhash and kill the dentry.
      
      This will be used by Ceph to maintain a flag indicating whether the
      complete contents of a directory are contained in the dcache, allowing it
      to satisfy lookups and readdir without addition server communication.
      
      Renumber a few DCACHE_* #defines to group DCACHE_OP_PRUNE with the other
      DCACHE_OP_ bits.
      Signed-off-by: NSage Weil <sage@newdream.net>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      f0023bc6
  14. 25 10月, 2011 1 次提交
  15. 09 10月, 2011 2 次提交
  16. 28 9月, 2011 1 次提交
    • P
      doc: fix broken references · 395cf969
      Paul Bolle 提交于
      There are numerous broken references to Documentation files (in other
      Documentation files, in comments, etc.). These broken references are
      caused by typo's in the references, and by renames or removals of the
      Documentation files. Some broken references are simply odd.
      
      Fix these broken references, sometimes by dropping the irrelevant text
      they were part of.
      Signed-off-by: NPaul Bolle <pebolle@tiscali.nl>
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      395cf969
  17. 04 9月, 2011 1 次提交
    • T
      ext4: improve handling of conflicting mount options · 56889787
      Theodore Ts'o 提交于
      If the user explicitly specifies conflicting mount options for
      delalloc or dioread_nolock and data=journal, fail the mount, instead
      of printing a warning and continuing (since many user's won't look at
      dmesg and notice the warning).
      
      Also, print a single warning that data=journal implies that delayed
      allocation is not on by default (since it's not supported), and
      furthermore that O_DIRECT is not supported.  Improve the text in
      Documentation/filesystems/ext4.txt so this is clear there as well.
      
      Similarly, if the dioread_nolock mount option is specified when the
      file system block size != PAGE_SIZE, fail the mount instead of
      printing a warning message and ignoring the mount option.
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      56889787
  18. 23 8月, 2011 1 次提交
  19. 17 8月, 2011 1 次提交
  20. 14 8月, 2011 1 次提交
  21. 27 7月, 2011 1 次提交
  22. 26 7月, 2011 1 次提交
  23. 25 7月, 2011 1 次提交
  24. 22 7月, 2011 1 次提交
  25. 21 7月, 2011 5 次提交
    • J
      fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers · 02c24a82
      Josef Bacik 提交于
      Btrfs needs to be able to control how filemap_write_and_wait_range() is called
      in fsync to make it less of a painful operation, so push down taking i_mutex and
      the calling of filemap_write_and_wait() down into the ->fsync() handlers.  Some
      file systems can drop taking the i_mutex altogether it seems, like ext3 and
      ocfs2.  For correctness sake I just pushed everything down in all cases to make
      sure that we keep the current behavior the same for everybody, and then each
      individual fs maintainer can make up their mind about what to do from there.
      Thanks,
      Acked-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      02c24a82
    • J
      fs: add SEEK_HOLE and SEEK_DATA flags · 982d8165
      Josef Bacik 提交于
      This just gets us ready to support the SEEK_HOLE and SEEK_DATA flags.  Turns out
      using fiemap in things like cp cause more problems than it solves, so lets try
      and give userspace an interface that doesn't suck.  We need to match solaris
      here, and the definitions are
      
      *o* If /whence/ is SEEK_HOLE, the offset of the start of the
      next hole greater than or equal to the supplied offset
      is returned. The definition of a hole is provided near
      the end of the DESCRIPTION.
      
      *o* If /whence/ is SEEK_DATA, the file pointer is set to the
      start of the next non-hole file region greater than or
      equal to the supplied offset.
      
      So in the generic case the entire file is data and there is a virtual hole at
      the end.  That means we will just return i_size for SEEK_HOLE and will return
      the same offset for SEEK_DATA.  This is how Solaris does it so we have to do it
      the same way.
      
      Thanks,
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      982d8165
    • D
      vfs: increase shrinker batch size · 8ab47664
      Dave Chinner 提交于
      Now that the per-sb shrinker is responsible for shrinking 2 or more
      caches, increase the batch size to keep econmies of scale for
      shrinking each cache.  Increase the shrinker batch size to 1024
      objects.
      
      To allow for a large increase in batch size, add a conditional
      reschedule to prune_icache_sb() so that we don't hold the LRU spin
      lock for too long. This mirrors the behaviour of the
      __shrink_dcache_sb(), and allows us to increase the batch size
      without needing to worry about problems caused by long lock hold
      times.
      
      To ensure that filesystems using the per-sb shrinker callouts don't
      cause problems, document that the object freeing method must
      reschedule appropriately inside loops.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      8ab47664
    • D
      superblock: add filesystem shrinker operations · 0e1fdafd
      Dave Chinner 提交于
      Now we have a per-superblock shrinker implementation, we can add a
      filesystem specific callout to it to allow filesystem internal
      caches to be shrunk by the superblock shrinker.
      
      Rather than perpetuate the multipurpose shrinker callback API (i.e.
      nr_to_scan == 0 meaning "tell me how many objects freeable in the
      cache), two operations will be added. The first will return the
      number of objects that are freeable, the second is the actual
      shrinker call.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      0e1fdafd
    • J
      locks: rename lock-manager ops · 8fb47a4f
      J. Bruce Fields 提交于
      Both the filesystem and the lock manager can associate operations with a
      lock.  Confusingly, one of them (fl_release_private) actually has the
      same name in both operation structures.
      
      It would save some confusion to give the lock-manager ops different
      names.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      8fb47a4f
  26. 20 7月, 2011 3 次提交