1. 06 6月, 2012 1 次提交
    • B
      GFS2: Extend the life of the reservations · 0a305e49
      Bob Peterson 提交于
      This patch lengthens the lifespan of the reservations structure for
      inodes. Before, they were allocated and deallocated for every write
      operation. With this patch, they are allocated when the first write
      occurs, and deallocated when the last process closes the file.
      It's more efficient to do it this way because it saves GFS2 a lot of
      unnecessary allocates and frees. It also gives us more flexibility
      for the future: (1) we can now fold the qadata structure back into
      the structure and save those alloc/frees, (2) we can use this for
      multi-block reservations.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      0a305e49
  2. 24 4月, 2012 1 次提交
  3. 01 4月, 2012 1 次提交
  4. 09 3月, 2012 1 次提交
    • B
      GFS2: call gfs2_write_alloc_required for each chunk · 58a7d5fb
      Benjamin Marzinski 提交于
      gfs2_fallocate was calling gfs2_write_alloc_required() once at the start of
      the function. This caused problems since gfs2_write_alloc_required used a
      long unsigned int for the len, but gfs2_fallocate could allocate a much
      larger amount.  This patch will move the call into the loop where the
      chunks are actually allocated and zeroed out. This will keep the allocation
      size under the limit, and also allow gfs2_fallocate to quickly skip over
      sections of the file that are already completely allocated.
      
      fallcate_chunk was also not correctly setting the file size.  It was using the
      len veriable to find the last block written to, but by the time it was setting
      the size, the len variable had already been decremented to 0.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      58a7d5fb
  5. 29 2月, 2012 1 次提交
    • S
      GFS2: FITRIM ioctl support · 66fc061b
      Steven Whitehouse 提交于
      The FITRIM ioctl provides an alternative way to send discard requests to
      the underlying device. Using the discard mount option results in every
      freed block generating a discard request to the block device. This can
      be slow, since many block devices can only process discard requests of
      larger sizes, and also such operations can be time consuming.
      
      Rather than using the discard mount option, FITRIM allows a sweep of the
      filesystem on an occasional basis, and also to optionally avoid sending
      down discard requests for smaller regions.
      
      In GFS2 FITRIM will work at resource group granularity. There is a flag
      for each resource group which keeps track of which resource groups have
      been trimmed. This flag is reset whenever a deallocation occurs in the
      resource group, and set whenever a successful FITRIM of that resource
      group has taken place. This helps to reduce repeated discard requests
      for the same block ranges, again improving performance.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      66fc061b
  6. 28 2月, 2012 2 次提交
    • S
      GFS2: Read resource groups on mount · a365fbf3
      Steven Whitehouse 提交于
      This makes mount take slightly longer, but at the same time, the first
      write to the filesystem will be faster too. It also means that if there
      is a problem in the resource index, then we can refuse to mount rather
      than having to try and report that when the first write occurs.
      
      In addition, to avoid recursive locking, we hvae to take account of
      instances when the rindex glock may already be held when we are
      trying to update the rbtree of resource groups.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      a365fbf3
    • B
      GFS2: Ensure rindex is uptodate for fallocate · 9e73f571
      Bob Peterson 提交于
      This patch fixes a problem whereby gfs2_grow was failing and causing GFS2
      to assert. The problem was that when GFS2's fallocate operation tried to
      acquire an "allocation" it made sure the rindex was up to date, and if not,
      it called gfs2_rindex_update. However, if the file being fallocated was
      the rindex itself, it was already locked at that point. By calling
      gfs2_rindex_update at an earlier point in time, we bring rindex up to date
      and thereby avoid trying to lock it when the "allocation" is acquired.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      9e73f571
  7. 04 1月, 2012 2 次提交
  8. 22 11月, 2011 1 次提交
  9. 21 11月, 2011 1 次提交
    • S
      GFS2: O_(D)SYNC support for fallocate · 4442f2e0
      Steven Whitehouse 提交于
      Add sync of metadata after fallocate for O_SYNC files to ensure that we
      meet expectations for everything being on disk in this case.
      Unfortunately, the offset and len parameters are modified during the
      course of the fallocate function, so I've had to add a couple of new
      variables to call generic_write_sync() at the end.
      
      I know that potentially this will sync data as well within the range,
      but I think that is a fairly harmless side-effect overall, since we
      would not normally expect there to be any dirty data within the range in
      question.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Benjamin Marzinski <bmarzins@redhat.com>
      4442f2e0
  10. 08 11月, 2011 2 次提交
  11. 28 10月, 2011 1 次提交
    • A
      vfs: do (nearly) lockless generic_file_llseek · ef3d0fd2
      Andi Kleen 提交于
      The i_mutex lock use of generic _file_llseek hurts.  Independent processes
      accessing the same file synchronize over a single lock, even though
      they have no need for synchronization at all.
      
      Under high utilization this can cause llseek to scale very poorly on larger
      systems.
      
      This patch does some rethinking of the llseek locking model:
      
      First the 64bit f_pos is not necessarily atomic without locks
      on 32bit systems. This can already cause races with read() today.
      This was discussed on linux-kernel in the past and deemed acceptable.
      The patch does not change that.
      
      Let's look at the different seek variants:
      
      SEEK_SET: Doesn't really need any locking.
      If there's a race one writer wins, the other loses.
      
      For 32bit the non atomic update races against read()
      stay the same. Without a lock they can also happen
      against write() now.  The read() race was deemed
      acceptable in past discussions, and I think if it's
      ok for read it's ok for write too.
      
      => Don't need a lock.
      
      SEEK_END: This behaves like SEEK_SET plus it reads
      the maximum size too. Reading the maximum size would have the
      32bit atomic problem. But luckily we already have a way to read
      the maximum size without locking (i_size_read), so we
      can just use that instead.
      
      Without i_mutex there is no synchronization with write() anymore,
      however since the write() update is atomic on 64bit it just behaves
      like another racy SEEK_SET.  On non atomic 32bit it's the same
      as SEEK_SET.
      
      => Don't need a lock, but need to use i_size_read()
      
      SEEK_CUR: This has a read-modify-write race window
      on the same file. One could argue that any application
      doing unsynchronized seeks on the same file is already broken.
      But for the sake of not adding a regression here I'm
      using the file->f_lock to synchronize this. Using this
      lock is much better than the inode mutex because it doesn't
      synchronize between processes.
      
      => So still need a lock, but can use a f_lock.
      
      This patch implements this new scheme in generic_file_llseek.
      I dropped generic_file_llseek_unlocked and changed all callers.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      ef3d0fd2
  12. 21 10月, 2011 8 次提交
    • B
      GFS2: rewrite fallocate code to write blocks directly · 64dd153c
      Benjamin Marzinski 提交于
      GFS2's fallocate code currently goes through the page cache. Since it's only
      writing to the end of the file or to holes in it, it doesn't need to, and it
      was causing issues on low memory environments. This patch pulls in some of
      Steve's block allocation work, and uses it to simply allocate the blocks for
      the file, and zero them out at allocation time.  It provides a slight
      performance increase, and it dramatically simplifies the code.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      64dd153c
    • S
      GFS2: Clean up ->page_mkwrite · 13d921e3
      Steven Whitehouse 提交于
      This patch brings gfs2's ->page_mkwrite uptodate with respect to the
      expectations set by the VM. Also added is a check to wait if the fs
      is frozen, before we attempt to get a glock. This will only work on
      the node which initiates the freeze, but thats ok since the transaction
      lock will still provide the expected barrier on other nodes.
      
      The major change here is that we return a locked page now, except when
      we don't return a page at all (error cases). This removes the race
      which required rechecking the page after it was returned.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      13d921e3
    • S
      GFS2: Fix AIL flush issue during fsync · b5b24d7a
      Steven Whitehouse 提交于
      Unfortunately, it is not enough to just ignore locked buffers during
      the AIL flush from fsync. We need to be able to ignore all buffers
      which are locked, dirty or pinned at this stage as they might have
      been added subsequent to the log flush earlier in the fsync function.
      
      In addition, this means that we no longer need to rely on i_mutex to
      keep out writes during fsync, so we can, as a side-effect, remove
      that protection too.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Tested-By: NAbhijith Das <adas@redhat.com>
      b5b24d7a
    • S
      GFS2: Cache the most recently used resource group in the inode · 54335b1f
      Steven Whitehouse 提交于
      This means that after the initial allocation for any inode, the
      last used resource group is cached in the inode for future use.
      This drastically reduces the number of lookups of resource
      groups in the common case, and this the contention on that
      data structure.
      
      The allocation algorithm is the same as previously, except that we
      always check to see if the goal block is within the cached rgrp
      first before going to the rbtree to look one up.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      54335b1f
    • S
      GFS2: Fix lseek after SEEK_DATA, SEEK_HOLE have been added · 9453615a
      Steven Whitehouse 提交于
      We need to take the inode's glock whenever the inode's size
      is referenced, otherwise it might not be uptodate. Even
      though generic_file_llseek_unlocked() doesn't implement
      SEEK_DATA, SEEK_HOLE directly, it does reference the inode's
      size in those cases, so we need to add them to the list
      of origins which need the glock.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      9453615a
    • S
      GFS2: Use ->dirty_inode() · ab9bbda0
      Steven Whitehouse 提交于
      The aim of this patch is to use the newly enhanced ->dirty_inode()
      super block operation to deal with atime updates, rather than
      piggy backing that code into ->write_inode() as is currently
      done.
      
      The net result is a simplification of the code in various places
      and a reduction of the number of gfs2_dinode_out() calls since
      this is now implied by ->dirty_inode().
      
      Some of the mark_inode_dirty() calls have been moved under glocks
      in order to take advantage of then being able to avoid locking in
      ->dirty_inode() when we already have suitable locks.
      
      One consequence is that generic_write_end() now correctly deals
      with file size updates, so that we do not need a separate check
      for that afterwards. This also, indirectly, means that fdatasync
      should work correctly on GFS2 - the current code always syncs the
      metadata whether it needs to or not.
      
      Has survived testing with postmark (with and without atime) and
      also fsx.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      ab9bbda0
    • S
      GFS2: Fix bug trap and journaled data fsync · f1818529
      Steven Whitehouse 提交于
      Journaled data requires that a complete flush of all dirty data for
      the file is done, in order that the ail flush which comes after
      will succeed.
      
      Also the recently enhanced bug trap can trigger falsely in case
      an ail flush from fsync races with a page read. This updates the
      bug trap such that it will ignore buffers which are locked and
      only trigger on dirty and/or pinned buffers when the ail flush
      is run from fsync. The original bug trap is retained when ail
      flush is run from ->go_sync()
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      f1818529
    • S
      GFS2: Split data write & wait in fsync · 2f0264d5
      Steven Whitehouse 提交于
      Now that the data writing is part of fsync proper, we can split
      the waiting part out and do it later on. This reduces the
      number of waits that we do during fsync on average.
      
      There is also no need to take the i_mutex unless we are flushing
      metadata to disk, so we can move that to within the metadata
      flushing code.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      2f0264d5
  13. 21 7月, 2011 1 次提交
    • J
      fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers · 02c24a82
      Josef Bacik 提交于
      Btrfs needs to be able to control how filemap_write_and_wait_range() is called
      in fsync to make it less of a painful operation, so push down taking i_mutex and
      the calling of filemap_write_and_wait() down into the ->fsync() handlers.  Some
      file systems can drop taking the i_mutex altogether it seems, like ext3 and
      ocfs2.  For correctness sake I just pushed everything down in all cases to make
      sure that we keep the current behavior the same for everybody, and then each
      individual fs maintainer can make up their mind about what to do from there.
      Thanks,
      Acked-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJosef Bacik <josef@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      02c24a82
  14. 20 7月, 2011 1 次提交
  15. 15 7月, 2011 1 次提交
  16. 03 5月, 2011 1 次提交
    • B
      GFS2: make sure fallocate bytes is a multiple of blksize · 6905d9e4
      Benjamin Marzinski 提交于
      The GFS2 fallocate code chooses a target size to for allocating chunks of
      space.  Whenever it can't find any resource groups with enough space free, it
      halves its target. Since this target is in bytes, eventually it will no longer
      be a multiple of blksize.  As long as there is more space available in the
      resource group than the target, this isn't a problem, since gfs2 will use the
      actual space available, which is always a multiple of blksize.  However,
      when gfs couldn't fallocate a bigger chunk than the target, it was using the
      non-blksize aligned number. This caused a BUG in later code that required
      blksize aligned offsets.  GFS2 now ensures that bytes is always a multiple of
      blksize
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      6905d9e4
  17. 20 4月, 2011 1 次提交
    • S
      GFS2: Clean up fsync() · dba898b0
      Steven Whitehouse 提交于
      This patch is designed to clean up GFS2's fsync
      implementation and ensure that it really does get everything on
      disk. Since ->write_inode() has been updated, we can call that
      via the vfs library function sync_inode_metadata() and the only
      remaining thing that has to be done is to ensure that we get
      any revoke records in the log after the inode has been written back.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      dba898b0
  18. 18 4月, 2011 1 次提交
  19. 24 3月, 2011 1 次提交
  20. 11 3月, 2011 1 次提交
  21. 09 3月, 2011 1 次提交
  22. 02 2月, 2011 1 次提交
    • S
      GFS2: Improve cluster mmap scalability · b9c93bb7
      Steven Whitehouse 提交于
      The mmap system call grabs a glock when an update to atime maybe
      required. It does this in order to ensure that the flags on the
      inode are uptodate, but since it will only mark atime for a future
      update, an exclusive lock is not required here (one will be taken
      later when the actual update is performed).
      
      Also, the lock can be skipped when the mount is marked noatime in
      addition to the original check which only looked at the noatime
      flag for the inode itself.
      
      This should increase the scalability of the mmap call when multiple
      nodes are all mmaping the same file.
      Reported-by: NScooter Morris <scooter@cgl.ucsf.edu>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b9c93bb7
  23. 17 1月, 2011 1 次提交
    • C
      fallocate should be a file operation · 2fe17c10
      Christoph Hellwig 提交于
      Currently all filesystems except XFS implement fallocate asynchronously,
      while XFS forced a commit.  Both of these are suboptimal - in case of O_SYNC
      I/O we really want our allocation on disk, especially for the !KEEP_SIZE
      case where we actually grow the file with user-visible zeroes.  On the
      other hand always commiting the transaction is a bad idea for fast-path
      uses of fallocate like for example in recent Samba versions.   Given
      that block allocation is a data plane operation anyway change it from
      an inode operation to a file operation so that we have the file structure
      available that lets us check for O_SYNC.
      
      This also includes moving the code around for a few of the filesystems,
      and remove the already unnedded S_ISDIR checks given that we only wire
      up fallocate for regular files.
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      2fe17c10
  24. 07 1月, 2011 1 次提交
  25. 31 10月, 2010 2 次提交
  26. 15 10月, 2010 1 次提交
    • A
      llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann 提交于
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      6038f373
  27. 05 10月, 2010 1 次提交
    • A
      fs/locks.c: prepare for BKL removal · b89f4321
      Arnd Bergmann 提交于
      This prepares the removal of the big kernel lock from the
      file locking code. We still use the BKL as long as fs/lockd
      uses it and ceph might sleep, but we can flip the definition
      to a private spinlock as soon as that's done.
      All users outside of fs/lockd get converted to use
      lock_flocks() instead of lock_kernel() where appropriate.
      
      Based on an earlier patch to use a spinlock from Matthew
      Wilcox, who has attempted this a few times before, the
      earliest patch from over 10 years ago turned it into
      a semaphore, which ended up being slower than the BKL
      and was subsequently reverted.
      
      Someone should do some serious performance testing when
      this becomes a spinlock, since this has caused problems
      before. Using a spinlock should be at least as good
      as the BKL in theory, but who knows...
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NMatthew Wilcox <willy@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Sage Weil <sage@newdream.net>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-fsdevel@vger.kernel.org
      b89f4321
  28. 28 9月, 2010 1 次提交
  29. 20 9月, 2010 1 次提交