1. 26 2月, 2013 1 次提交
    • J
      vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op · ecf3d1f1
      Jeff Layton 提交于
      The following set of operations on a NFS client and server will cause
      
          server# mkdir a
          client# cd a
          server# mv a a.bak
          client# sleep 30  # (or whatever the dir attrcache timeout is)
          client# stat .
          stat: cannot stat `.': Stale NFS file handle
      
      Obviously, we should not be getting an ESTALE error back there since the
      inode still exists on the server. The problem is that the lookup code
      will call d_revalidate on the dentry that "." refers to, because NFS has
      FS_REVAL_DOT set.
      
      nfs_lookup_revalidate will see that the parent directory has changed and
      will try to reverify the dentry by redoing a LOOKUP. That of course
      fails, so the lookup code returns ESTALE.
      
      The problem here is that d_revalidate is really a bad fit for this case.
      What we really want to know at this point is whether the inode is still
      good or not, but we don't really care what name it goes by or whether
      the dcache is still valid.
      
      Add a new d_op->d_weak_revalidate operation and have complete_walk call
      that instead of d_revalidate. The intent there is to allow for a
      "weaker" d_revalidate that just checks to see whether the inode is still
      good. This is also gives us an opportunity to kill off the FS_REVAL_DOT
      special casing.
      
      [AV: changed method name, added note in porting, fixed confusion re
      having it possibly called from RCU mode (it won't be)]
      
      Cc: NeilBrown <neilb@suse.de>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      ecf3d1f1
  2. 21 12月, 2012 3 次提交
    • M
      documentation: drop vmtruncate · b9f61c3c
      Marco Stornelli 提交于
      Removed vmtruncate
      Signed-off-by: NMarco Stornelli <marco.stornelli@gmail.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      b9f61c3c
    • D
      FS-Cache: Provide proper invalidation · ef778e7a
      David Howells 提交于
      Provide a proper invalidation method rather than relying on the netfs retiring
      the cookie it has and getting a new one.  The problem with this is that isn't
      easy for the netfs to make sure that it has completed/cancelled all its
      outstanding storage and retrieval operations on the cookie it is retiring.
      
      Instead, have the cache provide an invalidation method that will cancel or wait
      for all currently outstanding operations before invalidating the cache, and
      will cause new operations to queue up behind that.  Whilst invalidation is in
      progress, some requests will be rejected until the cache can stack a barrier on
      the operation queue to cause new operations to be deferred behind it.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      ef778e7a
    • D
      FS-Cache: Fix operation state management and accounting · 9f10523f
      David Howells 提交于
      Fix the state management of internal fscache operations and the accounting of
      what operations are in what states.
      
      This is done by:
      
       (1) Give struct fscache_operation a enum variable that directly represents the
           state it's currently in, rather than spreading this knowledge over a bunch
           of flags, who's processing the operation at the moment and whether it is
           queued or not.
      
           This makes it easier to write assertions to check the state at various
           points and to prevent invalid state transitions.
      
       (2) Add an 'operation complete' state and supply a function to indicate the
           completion of an operation (fscache_op_complete()) and make things call
           it.  The final call to fscache_put_operation() can then check that an op
           in the appropriate state (complete or cancelled).
      
       (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
           govern the state of an object:
      
      	(a) The ->n_ops is now the number of extant operations on the object
      	    and is now decremented by fscache_put_operation() only.
      
      	(b) The ->n_in_progress is simply the number of objects that have been
      	    taken off of the object's pending queue for the purposes of being
      	    run.  This is decremented by fscache_op_complete() only.
      
      	(c) The ->n_exclusive is the number of exclusive ops that have been
      	    submitted and queued or are in progress.  It is decremented by
      	    fscache_op_complete() and by fscache_cancel_op().
      
           fscache_put_operation() and fscache_operation_gc() now no longer try to
           clean up ->n_exclusive and ->n_in_progress.  That was leading to double
           decrements against fscache_cancel_op().
      
           fscache_cancel_op() now no longer decrements ->n_ops.  That was leading to
           double decrements against fscache_put_operation().
      
           fscache_submit_exclusive_op() now decides whether it has to queue an op
           based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
           will persist in being true even after all preceding operations have been
           cancelled or completed.  Furthermore, if an object is active and there are
           runnable ops against it, there must be at least one op running.
      
       (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
           provide a function to record completion of the pages as they complete.
      
           When n_pages reaches 0, the operation is deemed to be complete and
           fscache_op_complete() is called.
      
           Add calls to fscache_retrieval_complete() anywhere we've finished with a
           page we've been given to read or allocate for.  This includes places where
           we just return pages to the netfs for reading from the server and where
           accessing the cache fails and we discard the proposed netfs page.
      
      The bugs in the unfixed state management manifest themselves as oopses like the
      following where the operation completion gets out of sync with return of the
      cookie by the netfs.  This is possible because the cache unlocks and returns
      all the netfs pages before recording its completion - which means that there's
      nothing to stop the netfs discarding them and returning the cookie.
      
      
      FS-Cache: Cookie 'NFS.fh' still has outstanding reads
      ------------[ cut here ]------------
      kernel BUG at fs/fscache/cookie.c:519!
      invalid opcode: 0000 [#1] SMP
      CPU 1
      Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
      
      Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090                  /DG965RY
      RIP: 0010:[<ffffffffa007050a>]  [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache]
      RSP: 0018:ffff8800368cfb00  EFLAGS: 00010282
      RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
      RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
      RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
      R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
      R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
      FS:  0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
      Stack:
       ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
       ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
       ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
      Call Trace:
       [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
       [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs]
       [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs]
       [<ffffffff810d8d47>] evict+0xa1/0x15c
       [<ffffffff810d8e2e>] dispose_list+0x2c/0x38
       [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b
       [<ffffffff810c56b7>] prune_super+0xd5/0x140
       [<ffffffff8109b615>] shrink_slab+0x102/0x1ab
       [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595
       [<ffffffff8103e009>] ? process_timeout+0xb/0xb
       [<ffffffff8109dba3>] kswapd+0x270/0x289
       [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46
       [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595
       [<ffffffff8104bf7a>] kthread+0x7f/0x87
       [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10
       [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
       [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe
       [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53
       [<ffffffff813ad6b0>] ? gs_change+0xb/0xb
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      9f10523f
  3. 18 12月, 2012 5 次提交
    • C
      docs: update documentation about /proc/<pid>/fdinfo/<fd> fanotify output · e71ec593
      Cyrill Gorcunov 提交于
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Matthew Helsley <matt.helsley@gmail.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e71ec593
    • C
      docs: add documentation about /proc/<pid>/fdinfo/<fd> output · f1d8c162
      Cyrill Gorcunov 提交于
      [akpm@linux-foundation.org: tweak documentation]
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrey Vagin <avagin@openvz.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Matthew Helsley <matt.helsley@gmail.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f1d8c162
    • K
      /proc/pid/status: add "Seccomp" field · 2f4b3bf6
      Kees Cook 提交于
      It is currently impossible to examine the state of seccomp for a given
      process.  While attaching with gdb and attempting "call
      prctl(PR_GET_SECCOMP,...)" will work with some situations, it is not
      reliable.  If the process is in seccomp mode 1, this query will kill the
      process (prctl not allowed), if the process is in mode 2 with prctl not
      allowed, it will similarly be killed, and in weird cases, if prctl is
      filtered to return errno 0, it can look like seccomp is disabled.
      
      When reviewing the state of running processes, there should be a way to
      externally examine the seccomp mode.  ("Did this build of Chrome end up
      using seccomp?" "Did my distro ship ssh with seccomp enabled?")
      
      This adds the "Seccomp" line to /proc/$pid/status.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Acked-by: NSerge E. Hallyn <serge.hallyn@ubuntu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2f4b3bf6
    • C
      procfs: add VmFlags field in smaps output · 834f82e2
      Cyrill Gorcunov 提交于
      During c/r sessions we've found that there is no way at the moment to
      fetch some VMA associated flags, such as mlock() and madvise().
      
      This leads us to a problem -- we don't know if we should call for mlock()
      and/or madvise() after restore on the vma area we're bringing back to
      life.
      
      This patch intorduces a new field into "smaps" output called VmFlags,
      where all set flags associated with the particular VMA is shown as two
      letter mnemonics.
      
      [ Strictly speaking for c/r we only need mlock/madvise bits but it has been
        said that providing just a few flags looks somehow inconsistent.  So all
        flags are here now. ]
      
      This feature is made available on CONFIG_CHECKPOINT_RESTORE=n kernels, as
      other applications may start to use these fields.
      
      The data is encoded in a somewhat awkward two letters mnemonic form, to
      encourage userspace to be prepared for fields being added or removed in
      the future.
      
      [a.p.zijlstra@chello.nl: props to use for_each_set_bit]
      [sfr@canb.auug.org.au: props to use array instead of struct]
      [akpm@linux-foundation.org: overall redesign and simplification]
      [akpm@linux-foundation.org: remove unneeded braces per sfr, avoid using bloaty for_each_set_bit()]
      Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      834f82e2
    • J
      fat: provide option for setting timezone offset · 58156c8f
      Jan Kara 提交于
      So far FAT either offsets time stamps by sys_tz.minuteswest or leaves them
      as they are (when tz=UTC mount option is used).  However in some cases it
      is useful if one can specify time stamp offset on his own (e.g.  when time
      zone of the camera connected is different from time zone of the computer,
      or when HW clock is in UTC and thus sys_tz.minuteswest == 0).
      
      So provide a mount option time_offset= which allows user to specify offset
      in minutes that should be applied to time stamps on the filesystem.
      
      akpm: this code would work incorrectly when used via `mount -o remount',
      because cached inodes would not be updated.  But fatfs's fat_remount() is
      basically a no-op anyway.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Acked-by: NOGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      58156c8f
  4. 11 12月, 2012 4 次提交
    • H
      f2fs: fix a typo in f2fs documentation · d08ab08d
      Huajun Li 提交于
      In f2fs_fs.h, one f2fs inode contains 923 data block pointers, while
      f2fs documentation says it is 929. Fix this inconsistence.
      Signed-off-by: NHuajun Li <huajun.li.lee@gmail.com>
      d08ab08d
    • J
      f2fs: update the f2fs document · 5bb446a2
      Jaegeuk Kim 提交于
      I moved the f2fs-tools.git into kernel.org.
      And I added a new mailing list, linux-f2fs-devel@lists.sourceforge.net.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      5bb446a2
    • J
      f2fs: add document · 98e4da8c
      Jaegeuk Kim 提交于
      This adds a document describing the mount options, proc entries, usage, and
      design of Flash-Friendly File System, namely F2FS.
      Signed-off-by: NJaegeuk Kim <jaegeuk.kim@samsung.com>
      98e4da8c
    • T
      ext4: Remove CONFIG_EXT4_FS_XATTR · 939da108
      Tao Ma 提交于
      Ted has sent out a RFC about removing this feature. Eric and Jan
      confirmed that both RedHat and SUSE enable this feature in all their
      product.  David also said that "As far as I know, it's enabled in all
      Android kernels that use ext4."  So it seems OK for us.
      
      And what's more, as inline data depends its implementation on xattr,
      and to be frank, I don't run any test again inline data enabled while
      xattr disabled.  So I think we should add inline data and remove this
      config option in the same release.
      
      [ The savings if you disable CONFIG_EXT4_FS_XATTR is only 27k, which
        isn't much in the grand scheme of things.  Since no one seems to be
        testing this configuration except for some automated compile farms, on
        balance we are better removing this config option, and so that it is
        effectively always enabled. -- tytso ]
      
      Cc: David Brown <davidb@codeaurora.org>
      Cc: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NTao Ma <boyu.mt@taobao.com>
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      939da108
  5. 27 11月, 2012 1 次提交
  6. 26 11月, 2012 1 次提交
    • J
      nfsd4: delay filling in write iovec array till after xdr decoding · ffe1137b
      J. Bruce Fields 提交于
      Our server rejects compounds containing more than one write operation.
      It's unclear whether this is really permitted by the spec; with 4.0,
      it's possibly OK, with 4.1 (which has clearer limits on compound
      parameters), it's probably not OK.  No client that we're aware of has
      ever done this, but in theory it could be useful.
      
      The source of the limitation: we need an array of iovecs to pass to the
      write operation.  In the worst case that array of iovecs could have
      hundreds of elements (the maximum rwsize divided by the page size), so
      it's too big to put on the stack, or in each compound op.  So we instead
      keep a single such array in the compound argument.
      
      We fill in that array at the time we decode the xdr operation.
      
      But we decode every op in the compound before executing any of them.  So
      once we've used that array we can't decode another write.
      
      If we instead delay filling in that array till the time we actually
      perform the write, we can reuse it.
      
      Another option might be to switch to decoding compound ops one at a
      time.  I considered doing that, but it has a number of other side
      effects, and I'd rather fix just this one problem for now.
      Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>
      ffe1137b
  7. 17 11月, 2012 1 次提交
  8. 11 11月, 2012 1 次提交
  9. 08 11月, 2012 1 次提交
  10. 03 11月, 2012 1 次提交
  11. 30 10月, 2012 1 次提交
  12. 09 10月, 2012 1 次提交
  13. 02 10月, 2012 1 次提交
    • C
      NFS: Add nfs4_unique_id boot parameter · 6f2ea7f2
      Chuck Lever 提交于
      An optional boot parameter is introduced to allow client
      administrators to specify a string that the Linux NFS client can
      insert into its nfs_client_id4 id string, to make it both more
      globally unique, and to ensure that it doesn't change even if the
      client's nodename changes.
      
      If this boot parameter is not specified, the client's nodename is
      used, as before.
      
      Client installation procedures can create a unique string (typically,
      a UUID) which remains unchanged during the lifetime of that client
      instance.  This works just like creating a UUID for the label of the
      system's root and boot volumes.
      Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
      6f2ea7f2
  14. 22 9月, 2012 1 次提交
  15. 18 9月, 2012 2 次提交
  16. 28 8月, 2012 1 次提交
  17. 22 8月, 2012 2 次提交
  18. 17 8月, 2012 1 次提交
    • T
      ext4: add max_dir_size_kb mount option · df981d03
      Theodore Ts'o 提交于
      Very large directories can cause significant performance problems, or
      perhaps even invoke the OOM killer, if the process is running in a
      highly constrained memory environment (whether it is VM's with a small
      amount of memory or in a small memory cgroup).
      
      So it is useful, in cloud server/data center environments, to be able
      to set a filesystem-wide cap on the maximum size of a directory, to
      ensure that directories never get larger than a sane size.  We do this
      via a new mount option, max_dir_size_kb.  If there is an attempt to
      grow the directory larger than max_dir_size_kb, the system call will
      return ENOSPC instead.
      
      Google-Bug-Id: 6863013
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      
      
      
      df981d03
  19. 04 8月, 2012 1 次提交
  20. 02 8月, 2012 1 次提交
  21. 01 8月, 2012 1 次提交
    • M
      mm: add support for a filesystem to activate swap files and use direct_IO for writing swap pages · 62c230bc
      Mel Gorman 提交于
      Currently swapfiles are managed entirely by the core VM by using ->bmap to
      allocate space and write to the blocks directly.  This effectively ensures
      that the underlying blocks are allocated and avoids the need for the swap
      subsystem to locate what physical blocks store offsets within a file.
      
      If the swap subsystem is to use the filesystem information to locate the
      blocks, it is critical that information such as block groups, block
      bitmaps and the block descriptor table that map the swap file were
      resident in memory.  This patch adds address_space_operations that the VM
      can call when activating or deactivating swap backed by a file.
      
        int swap_activate(struct file *);
        int swap_deactivate(struct file *);
      
      The ->swap_activate() method is used to communicate to the file that the
      VM relies on it, and the address_space should take adequate measures such
      as reserving space in the underlying device, reserving memory for mempools
      and pinning information such as the block descriptor table in memory.  The
      ->swap_deactivate() method is called on sys_swapoff() if ->swap_activate()
      returned success.
      
      After a successful swapfile ->swap_activate, the swapfile is marked
      SWP_FILE and swapper_space.a_ops will proxy to
      sis->swap_file->f_mappings->a_ops using ->direct_io to write swapcache
      pages and ->readpage to read.
      
      It is perfectly possible that direct_IO be used to read the swap pages but
      it is an unnecessary complication.  Similarly, it is possible that
      ->writepage be used instead of direct_io to write the pages but filesystem
      developers have stated that calling writepage from the VM is undesirable
      for a variety of reasons and using direct_IO opens up the possibility of
      writing back batches of swap pages in the future.
      
      [a.p.zijlstra@chello.nl: Original patch]
      Signed-off-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NRik van Riel <riel@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Eric B Munson <emunson@mgebm.net>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Christie <michaelc@cs.wisc.edu>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Xiaotian Feng <dfeng@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      62c230bc
  22. 31 7月, 2012 1 次提交
  23. 14 7月, 2012 7 次提交
    • A
      don't pass nameidata to ->create() · ebfc3b49
      Al Viro 提交于
      boolean "does it have to be exclusive?" flag is passed instead;
      Local filesystem should just ignore it - the object is guaranteed
      not to be there yet.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      ebfc3b49
    • A
      stop passing nameidata to ->lookup() · 00cd8dd3
      Al Viro 提交于
      Just the flags; only NFS cares even about that, but there are
      legitimate uses for such argument.  And getting rid of that
      completely would require splitting ->lookup() into a couple
      of methods (at least), so let's leave that alone for now...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      00cd8dd3
    • A
      stop passing nameidata * to ->d_revalidate() · 0b728e19
      Al Viro 提交于
      Just the lookup flags.  Die, bastard, die...
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      0b728e19
    • A
      kill struct opendata · 30d90494
      Al Viro 提交于
      Just pass struct file *.  Methods are happier that way...
      There's no need to return struct file * from finish_open() now,
      so let it return int.  Next: saner prototypes for parts in
      namei.c
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      30d90494
    • A
      make ->atomic_open() return int · d9585277
      Al Viro 提交于
      Change of calling conventions:
      old		new
      NULL		1
      file		0
      ERR_PTR(-ve)	-ve
      
      Caller *knows* that struct file *; no need to return it.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d9585277
    • A
      ->atomic_open() prototype change - pass int * instead of bool * · 47237687
      Al Viro 提交于
      ... and let finish_open() report having opened the file via that sucker.
      Next step: don't modify od->filp at all.
      
      [AV: FILE_CREATE was already used by cifs; Miklos' fix folded]
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      47237687
    • M
      vfs: add i_op->atomic_open() · d18e9008
      Miklos Szeredi 提交于
      Add a new inode operation which is called on the last component of an open.
      Using this the filesystem can look up, possibly create and open the file in one
      atomic operation.  If it cannot perform this (e.g. the file type turned out to
      be wrong) it may signal this by returning NULL instead of an open struct file
      pointer.
      
      i_op->atomic_open() is only called if the last component is negative or needs
      lookup.  Handling cached positive dentries here doesn't add much value: these
      can be opened using f_op->open().  If the cached file turns out to be invalid,
      the open can be retried, this time using ->atomic_open() with a fresh dentry.
      
      For now leave the old way of using open intents in lookup and revalidate in
      place.  This will be removed once all the users are converted.
      
      David Howells noticed that if ->atomic_open() opens the file but does not create
      it, handle_truncate() will be called on it even if it is not a regular file.
      Fix this by checking the file type in this case too.
      Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d18e9008