1. 07 Jan, 2011 2 commits
    • fs: change d_compare for rcu-walk · 621e155a
      Committed by Nick Piggin
      Change d_compare so it may be called from lock-free RCU lookups. This
      does put significant restrictions on what may be done from the callback,
      however there don't seem to have been any problems with in-tree fses.
      If some strange use case pops up that _really_ cannot cope with the
      rcu-walk rules, we can just add new rcu-unaware callbacks, which would
      cause name lookup to drop out of rcu-walk mode.
      
      For in-tree filesystems, this is just a mechanical change.
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
    • fs: change d_delete semantics · fe15ce44
      Committed by Nick Piggin
      Change d_delete from a dentry deletion notification to a dentry caching
      advice, more like ->drop_inode. Require it to be constant and idempotent,
      and not take d_lock. This is how all existing filesystems use the callback
      anyway.
      
      This makes fine grained dentry locking of dput and dentry lru scanning
      much simpler.
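
The "caching advice" contract described above can be sketched in user space. This is a minimal model, not kernel code: `struct cache_dentry`, `advise_drop_if_gone` and `keep_cached_on_last_put` are hypothetical names invented for illustration. The callback takes a const pointer, reads state only, and returns the same answer however often it is called.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a dentry; not the kernel structure. */
struct cache_dentry {
	bool backing_object_gone; /* illustrative state the callback may read */
};

/*
 * Caching advice: nonzero means "do not keep this dentry cached".
 * The const pointer models the rule that the callback must not modify
 * the dentry, so it has no reason to take the dentry lock.
 */
typedef int (*cache_d_delete_t)(const struct cache_dentry *);

/* Example advice: drop the dentry once its backing object is gone. */
int advise_drop_if_gone(const struct cache_dentry *dentry)
{
	return dentry->backing_object_gone ? 1 : 0;
}

/* Model of the decision taken when the last reference is dropped. */
bool keep_cached_on_last_put(const struct cache_dentry *dentry,
			     cache_d_delete_t d_delete)
{
	if (d_delete && d_delete(dentry))
		return false; /* the filesystem advised against caching */
	return true;          /* no callback, or callback returned 0 */
}
```

Because the callback is pure advice, the caller is free to invoke it at any point, or more than once, without changing the outcome.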
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
  2. 26 Oct, 2010 1 commit
    • fs: do not assign default i_ino in new_inode · 85fe4025
      Committed by Christoph Hellwig
      Instead of always assigning an increasing inode number in new_inode
      move the call to assign it into those callers that actually need it.
      For now, the set of callers that need it is estimated conservatively; that is,
      the call is added to all filesystems that do not assign an i_ino
      by themselves.  For a few more filesystems we can avoid assigning
      any inode number given that they aren't user visible, and for others
      it could be done lazily when an inode number is actually needed,
      but that's left for later patches.
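
A rough user-space sketch of the split described above: allocation no longer hands out a number, and only callers that want one ask for it. All names here (`model_inode`, `model_new_inode`, `model_get_next_ino`, `model_fs_make_inode`) are hypothetical stand-ins, and the counter is a plain global rather than anything resembling the kernel's implementation.

```c
/* Hypothetical stand-in for an inode; only the number matters here. */
struct model_inode {
	unsigned long i_ino;
};

/* Illustrative global counter; real code would need proper locking. */
static unsigned long model_last_ino;

/* Hands out the next increasing inode number. */
unsigned long model_get_next_ino(void)
{
	return ++model_last_ino;
}

/* Allocation step: initialises the inode but assigns no number. */
void model_new_inode(struct model_inode *inode)
{
	inode->i_ino = 0;
}

/* A filesystem that has no numbers of its own requests one explicitly. */
void model_fs_make_inode(struct model_inode *inode)
{
	model_new_inode(inode);
	inode->i_ino = model_get_next_ino();
}
```

A filesystem whose inodes are never user visible would call only `model_new_inode` in this model and skip the counter entirely.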
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  3. 15 Oct, 2010 1 commit
    • llseek: automatically add .llseek fop · 6038f373
      Committed by Arnd Bergmann
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
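
The rule ordering in the semantic patch can be restated as a small decision function. This is only a simplified reading of the `depends on` chains above, written as plain C; the `fops_facts` structure and the `pick_llseek` function are hypothetical and do not exist in the kernel or in coccinelle.

```c
#include <stdbool.h>

/* Which .llseek value the rules would add (or none, if one exists). */
enum llseek_choice {
	LLSEEK_KEEP_EXISTING,
	LLSEEK_NO_LLSEEK,
	LLSEEK_SEQ_LSEEK,
	LLSEEK_DEFAULT,
	LLSEEK_NOOP
};

/* Facts the patch derives about one struct file_operations instance. */
struct fops_facts {
	bool has_llseek;          /* .llseek already set */
	bool open_is_nonseekable; /* .open is, or calls, nonseekable_open */
	bool uses_seq_read;       /* .read is seq_read */
	bool has_readdir;         /* .readdir is present */
	bool read_touches_pos;    /* .read reads or writes *off */
	bool write_touches_pos;   /* .write reads or writes *off */
};

/* Apply the rules in the same priority order as the semantic patch. */
enum llseek_choice pick_llseek(const struct fops_facts *f)
{
	if (f->has_llseek)
		return LLSEEK_KEEP_EXISTING;
	if (f->open_is_nonseekable)
		return LLSEEK_NO_LLSEEK;
	if (f->uses_seq_read)
		return LLSEEK_SEQ_LSEEK;
	if (f->has_readdir)
		return LLSEEK_DEFAULT;
	if (f->read_touches_pos || f->write_touches_pos)
		return LLSEEK_DEFAULT;
	return LLSEEK_NOOP; /* offset is provably ignored */
}
```

The final fallthrough corresponds to the noop_llseek cases, which keep the current behavior of not returning an error from a seek.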
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
  4. 10 Aug, 2010 1 commit
    • remove inode_setattr · 1025774c
      Committed by Christoph Hellwig
      Replace inode_setattr with opencoded variants of it in all callers.  This
      moves the remaining call to vmtruncate into the filesystem methods where it
      can be replaced with the proper truncate sequence.
      
      In a few cases it was obvious that we would never end up calling vmtruncate
      so it was left out in the opencoded variant:
      
       spufs: explicitly checks for ATTR_SIZE earlier
       btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier
       ufs: contains an opencoded simple_setattr + truncate that sets the filesize just above
      
      In addition to that, ncpfs called inode_setattr with handcrafted iattrs,
      which allowed trimming down the opencoded variant.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  5. 11 Nov, 2009 1 commit
  6. 24 Sep, 2009 1 commit
  7. 28 Mar, 2009 1 commit
  8. 06 Jan, 2009 1 commit
  9. 17 Nov, 2008 1 commit
  10. 23 Oct, 2008 3 commits
    • proc: spread __init · 1e0edd3f
      Committed by Alexey Dobriyan
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
    • [PATCH] move executable checking into ->permission() · f696a365
      Committed by Miklos Szeredi
      For execute permission on a regular file we need to check if the file has
      any execute bits at all, regardless of capabilities.
      
      This check is normally performed by generic_permission() but was also
      added to the case when the filesystem defines its own ->permission()
      method.  In the latter case the filesystem should be responsible for
      performing this check.
      
      Move the check from inode_permission() inside filesystems which are
      not calling generic_permission().
      
      Create a helper function execute_ok() that returns true if the inode
      is a directory or if any execute bits are present in i_mode.
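
The check performed by such a helper can be sketched in user space as follows. This is a minimal model operating on a bare mode value rather than an inode; `model_execute_ok` and the `MODEL_` constants are hypothetical names, with the constants mirroring the usual Linux octal values for the file-type mask, the directory type, and the three execute bits.

```c
#include <stdbool.h>

/* Local copies of the usual Linux mode bit values (octal). */
#define MODEL_S_IFMT  0170000u /* mask for the file type bits */
#define MODEL_S_IFDIR 0040000u /* file type: directory */
#define MODEL_S_IXUGO 0000111u /* execute bits for user, group, other */

/* True for directories, or when at least one execute bit is set. */
bool model_execute_ok(unsigned int mode)
{
	return (mode & MODEL_S_IXUGO) != 0 ||
	       (mode & MODEL_S_IFMT) == MODEL_S_IFDIR;
}
```

For example, a regular file with mode 0644 fails the check, while a directory with the same permission bits passes.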
      
      Also fix up the following code:
      
       - coda control file is never executable
       - sysctl files are never executable
       - hfs_permission seems broken on MAY_EXEC, remove
       - hfsplus_permission is equivalent to generic_permission(), remove
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    • [PATCH] fix ->llseek for more directories · 3222a3e5
      Committed by Christoph Hellwig
      With this patch all directory fops instances that have a readdir
      that doesn't take the BKL are switched to generic_file_llseek.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  11. 10 Oct, 2008 2 commits
  12. 27 Jul, 2008 2 commits
    • [PATCH] sanitize ->permission() prototype · e6305c43
      Committed by Al Viro
      * kill nameidata * argument; map the 3 bits in ->flags anybody cares
        about to new MAY_... ones and pass with the mask.
      * kill redundant gfs2_iop_permission()
      * sanitize ecryptfs_permission()
      * fix remaining places where ->permission() instances might barf on new
        MAY_... found in mask.
      
      The obvious next target in that direction is permission(9)
      
      folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [PATCH] sanitize proc_sysctl · 9043476f
      Committed by Al Viro
      * keep references to ctl_table_head and ctl_table in /proc/sys inodes
      * grab the former during operations, use the latter for access to
        entry if that succeeds
      * have ->d_compare() check if table should be seen for one who does lookup;
        that allows us to avoid flipping inodes - if we have the same name resolve
        to different things, we'll just keep several dentries and ->d_compare()
        will reject the wrong ones.
      * have ->lookup() and ->readdir() scan the table of our inode first, then
        walk all ctl_table_header and scan ->attached_by for those that are
        attached to our directory.
      * implement ->getattr().
      * get rid of insane amounts of tree-walking
      * get rid of the need to know dentry in ->permission() and of the contortions
        induced by that.
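
The ->d_compare() idea in the list above (several cached entries may share a name, and the comparison rejects those the caller should not see) can be modelled in user space roughly like this. Everything here is an assumption made for illustration: the `sysctl_entry_model` type, the integer namespace ids, and the `VISIBLE_TO_ALL` marker are invented, and only the "0 means match" return convention is borrowed from d_compare.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define VISIBLE_TO_ALL (-1) /* hypothetical marker: entry is not per-namespace */

/* Hypothetical cached entry: a name plus the namespace allowed to see it. */
struct sysctl_entry_model {
	const char *name;
	int visible_ns;
};

/* Should the caller in namespace ns see this entry at all? */
bool entry_visible(const struct sysctl_entry_model *e, int ns)
{
	return e->visible_ns == VISIBLE_TO_ALL || e->visible_ns == ns;
}

/* Returns 0 on a match, nonzero otherwise. */
int sysctl_compare_model(const struct sysctl_entry_model *e,
			 const char *name, int ns)
{
	if (strcmp(e->name, name) != 0)
		return 1;
	return entry_visible(e, ns) ? 0 : 1;
}

/* Scan the cached entries; same-named entries for other namespaces are skipped. */
const struct sysctl_entry_model *
lookup_model(const struct sysctl_entry_model *entries, size_t count,
	     const char *name, int ns)
{
	for (size_t i = 0; i < count; i++)
		if (sysctl_compare_model(&entries[i], name, ns) == 0)
			return &entries[i];
	return NULL;
}
```

Two entries with the same name can then coexist in the cache, and each lookup resolves to the one that belongs to the caller.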
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  13. 29 Apr, 2008 2 commits
    • sysctl: add the ->permissions callback on the ctl_table_root · d7321cd6
      Committed by Pavel Emelyanov
      When reading from/writing to some table, a root, which this table came from,
      may affect this table's permissions, depending on who is working with the
      table.
      
      The core hunk is at the bottom of this patch.  All the rest is just pushing
      the ctl_table_root argument up to the sysctl_perm() function.
      
      This will be mostly (only?) used in the net sysctls.
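
A user-space sketch of how an optional per-root callback might narrow a table's permissions. The types `perm_root` and `perm_table`, the integer namespace id, and the read-only policy are all assumptions chosen for illustration; the real callback's arguments and the real policy are not shown in this commit message.

```c
#include <stddef.h>

/* Hypothetical stand-in for a sysctl table entry. */
struct perm_table {
	unsigned int mode; /* permission bits, e.g. 0644 */
};

/* Hypothetical stand-in for a table root with an optional callback. */
struct perm_root {
	unsigned int (*permissions)(const struct perm_root *root, int ns_id,
				    const struct perm_table *table);
};

/* Use the root's callback when present, else the table's own mode. */
unsigned int effective_mode(const struct perm_root *root, int ns_id,
			    const struct perm_table *table)
{
	if (root->permissions)
		return root->permissions(root, ns_id, table);
	return table->mode;
}

/* Example policy (assumed): only namespace 0 keeps the write bits. */
unsigned int readonly_outside_ns0(const struct perm_root *root, int ns_id,
				  const struct perm_table *table)
{
	(void)root;
	if (ns_id == 0)
		return table->mode;
	return table->mode & ~0222u;
}
```

The same table can thus be shared between callers while each sees a mode that depends on who is asking, instead of duplicating the table with the write bits dropped.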
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sysctl: merge equal proc_sys_read and proc_sys_write · 7708bfb1
      Committed by Pavel Emelyanov
      Many (most of) sysctls do not have a per-container sense.  E.g.
      kernel.print_fatal_signals, vm.panic_on_oom, net.core.netdev_budget and so on
      and so forth.  Besides, tuning them from inside a container is not even
      secure.  On the other hand, hiding them completely from the container's tasks
      sometimes causes user-space to stop working.
      
      When developing net sysctl, the common practice was to duplicate a table and
      drop the write bits in table->mode, but this approach was not very elegant,
      led to excessive memory consumption and was not suitable in general.
      
      Here's the alternative solution.  To facilitate the per-container sysctls
      ctl_table_root-s were introduced.  Each root contains a list of
      ctl_table_header-s that are visible to different namespaces.  The idea of this
      set is to add the permissions() callback on the ctl_table_root to allow a ctl
      root to limit permissions to the same ctl_table-s.
      
      The main user of this functionality is the net-namespaces code, but later this
      will (should) be used by more and more namespaces, containers and control
      groups.
      
      Actually, this idea's core is in a single hunk in the third patch.  First two
      patches are cleanups for sysctl code, while the third one mostly extends the
      arguments set of some sysctl functions.
      
      This patch:
      
      These ->read and ->write callbacks act in a very similar way, so merge these
      paths to reduce the number of places to patch later and shrink the .text size
      (a bit).
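
The shape of such a merge can be sketched in user space: one shared handler takes a `write` flag, and the read and write entry points become thin wrappers. The `rw_table` type, the integer payload, and the permission check are simplifications assumed for this example and are not taken from the patch.

```c
#include <errno.h>

/* Hypothetical stand-in for a sysctl entry holding one integer. */
struct rw_table {
	int value;
	unsigned int mode; /* permission bits, e.g. 0644 */
};

/* Shared path: the permission check and the transfer live in one place. */
long rw_call_handler(struct rw_table *table, int *buf, int write)
{
	unsigned int needed = write ? 0222u : 0444u;

	if (!(table->mode & needed))
		return -EPERM;
	if (write)
		table->value = *buf;
	else
		*buf = table->value;
	return (long)sizeof(*buf); /* bytes transferred */
}

/* The two callbacks differ only in the flag they pass. */
long rw_read(struct rw_table *table, int *buf)
{
	return rw_call_handler(table, buf, 0);
}

long rw_write(struct rw_table *table, int *buf)
{
	return rw_call_handler(table, buf, 1);
}
```

Any later change to the permission check then has to be made in one function only.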
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: "David S. Miller" <davem@davemloft.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 15 Feb, 2008 1 commit
    • Embed a struct path into struct nameidata instead of nd->{dentry,mnt} · 4ac91378
      Committed by Jan Blunck
      This is the central patch of a cleanup series. In most cases there is no good
      reason why someone would want to use a dentry for itself. This series reflects
      that fact and embeds a struct path into nameidata.
      
      Together with the other patches of this series
      - it enforces the correct order of getting/releasing the reference count on
        <dentry,vfsmount> pairs
      - it prepares the VFS for stacking support since it is essential to have a
        struct path in every place where the stack can be traversed
      - it reduces the overall code size:
      
      without patch series:
         text    data     bss     dec     hex filename
      5321639  858418  715768 6895825  6938d1 vmlinux
      
      with patch series:
         text    data     bss     dec     hex filename
      5320026  858418  715768 6894212  693284 vmlinux
      
      This patch:
      
      Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.
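
The layout change can be illustrated with a small user-space model. The `pm_` types are hypothetical stand-ins that keep only the two members discussed here; the point is that the dentry and mount travel together as one embedded pair and are reached through `nd->path`.

```c
/* Hypothetical stand-ins for the two objects that form a path. */
struct pm_vfsmount {
	int id;
};

struct pm_dentry {
	const char *name;
};

/* The pair is grouped so it is always passed and copied as a unit. */
struct pm_path {
	struct pm_vfsmount *mnt;
	struct pm_dentry *dentry;
};

/* Lookup state embeds the pair instead of holding two loose pointers. */
struct pm_nameidata {
	struct pm_path path;
};

/* Copying the whole pair at once keeps mnt and dentry consistent. */
void pm_set_path(struct pm_nameidata *nd, const struct pm_path *p)
{
	nd->path = *p;
}
```

Code that previously read `nd->dentry` or `nd->mnt` reads `nd->path.dentry` or `nd->path.mnt` in this model.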
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix cifs]
      [akpm@linux-foundation.org: fix smack]
      Signed-off-by: Jan Blunck <jblunck@suse.de>
      Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
      Acked-by: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 09 Feb, 2008 1 commit
  16. 26 Oct, 2007 1 commit
  17. 09 May, 2007 1 commit
  18. 15 Feb, 2007 2 commits
    • [PATCH] sysctl: hide the sysctl proc inodes from selinux · 86a71dbd
      Committed by Eric W. Biederman
      Since the security checks are applied on each read and write of a sysctl file,
      just like they are applied when calling sys_sysctl, they are redundant on the
      standard VFS constructs.  Since it is difficult to compute the security labels
      on the standard VFS constructs we just mark the sysctl inodes in proc private
      so selinux won't even bother with them.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • [PATCH] sysctl: reimplement the sysctl proc support · 77b14db5
      Committed by Eric W. Biederman
      With this change the sysctl inodes can be cached and nothing needs to be done
      when removing a sysctl table.
      
      For a cost of 2K code we will save about 4K of static tables (when we remove
      de from ctl_table) and 70K in proc_dir_entries that we will not allocate, or
      about half that on a 32bit arch.
      
      The speed feels about the same, even though we can now cache the sysctl
      dentries :(
      
      We get the core advantage that we don't need to have a 1 to 1 mapping between
      ctl table entries and proc files, making it possible to have /proc/sys vary
      depending on the namespace you are in.  The currently merged namespaces don't
      have an issue here but the network namespace under /proc/sys/net needs to have
      different directories depending on which network adapters are visible.  Because
      /proc/sys is now simply a cache, making different directories visible depending
      on who you are is trivial to implement.
      
      [akpm@osdl.org: fix uninitialised var]
      [akpm@osdl.org: fix ARM build]
      [bunk@stusta.de: make things static]
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>