1. 28 6月, 2013 1 次提交
  2. 20 6月, 2013 2 次提交
  3. 19 6月, 2013 1 次提交
    • B
      GFS2: aggressively issue revokes in gfs2_log_flush · 5d054964
      Benjamin Marzinski 提交于
      This patch looks at all the outstanding blocks in all the transactions
      on the log, and moves the completed ones to the ail2 list.  Then it
      issues revokes for these blocks.  This will hopefully speed things up
      in situations where there is a lot of contention for glocks, especially
      if they are acquired serially.
      
      revoke_lo_before_commit will issue at most one log block's full of these
      preemptive revokes. The amount of reserved log space that
      gfs2_log_reserve() ignores has been incremented to allow for this extra
      block.
      
      This patch also consolidates the common revoke instructions into one
      function, gfs2_add_revoke().
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      5d054964
  4. 14 6月, 2013 2 次提交
    • B
      GFS2: fix regression in dir_double_exhash · 512cbf02
      Bob Peterson 提交于
      Recent commit e8830d88 introduced a bug in function dir_double_exhash;
      it was failing to set h in the fall-back case. This patch corrects it.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      512cbf02
    • S
      GFS2: Add atomic_open support · 6d4ade98
      Steven Whitehouse 提交于
      I've restricted atomic_open to only operate on regular files, although
      I still don't understand why atomic_open should not be possible also for
      directories on GFS2. That can always be added in later though, if it
      makes sense.
      
      The ->atomic_open function can be passed negative dentries, which
      in most cases means either ENOENT (->lookup) or a call to d_instantiate
      (->create). In the GFS2 case though, we need to actually perform the
      look up, since we do not know whether there has been a new inode created
      on another node. The look up calls d_splice_alias which then tries to
      rehash the dentry - so the solution here is to simply check for that
      in d_splice_alias. The same issue is likely to affect any other cluster
      filesystem implementing ->atomic_open
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "J. Bruce Fields" <bfields fieldses org>
      Cc: Jeff Layton <jlayton@redhat.com>
      6d4ade98
  5. 11 6月, 2013 1 次提交
    • S
      GFS2: Only do one directory search on create · 5a00f3cc
      Steven Whitehouse 提交于
      Creation of a new inode requires a directory search in order to ensure
      that we are not trying to create an inode with the same name as an
      existing one. This was hidden away inside the create_ok() function.
      
      In the case that there was an existing inode, and a lookup can be
      substituted for a create (which is the case with regular files
      when the O_EXCL flag is not in use) then we were doing a second
      lookup in order to return the inode.
      
      This patch merges these two lookups into one. This can be done by
      passing a flag to gfs2_dir_search() to tell it to just return -EEXIST
      in the cases where we don't actually want to look up the inode.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      5a00f3cc
  6. 06 6月, 2013 1 次提交
  7. 05 6月, 2013 4 次提交
  8. 03 6月, 2013 4 次提交
  9. 24 5月, 2013 4 次提交
    • S
      GFS2: Fix typo in gfs2_log_end_write loop · e97e548b
      Steven Whitehouse 提交于
      There was a missing _all in this loop iterator
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      e97e548b
    • R
      GFS2: fix DLM depends to fix build errors · 75f96ce6
      Randy Dunlap 提交于
      Fix build errors by correcting DLM dependencies in GFS2.
      Build errors happen when CONFIG_GFS2_FS_LOCKING_DLM=y and CONFIG_DLM=m:
      
      fs/built-in.o: In function `gfs2_lock':
      file.c:(.text+0xc7abd): undefined reference to `dlm_posix_get'
      file.c:(.text+0xc7ad0): undefined reference to `dlm_posix_unlock'
      file.c:(.text+0xc7ad9): undefined reference to `dlm_posix_lock'
      fs/built-in.o: In function `gdlm_unmount':
      lock_dlm.c:(.text+0xd6e5b): undefined reference to `dlm_release_lockspace'
      fs/built-in.o: In function `sync_unlock':
      lock_dlm.c:(.text+0xd6e9e): undefined reference to `dlm_unlock'
      fs/built-in.o: In function `sync_lock':
      lock_dlm.c:(.text+0xd6fb6): undefined reference to `dlm_lock'
      fs/built-in.o: In function `gdlm_put_lock':
      lock_dlm.c:(.text+0xd7238): undefined reference to `dlm_unlock'
      fs/built-in.o: In function `gdlm_mount':
      lock_dlm.c:(.text+0xd753e): undefined reference to `dlm_new_lockspace'
      lock_dlm.c:(.text+0xd79d3): undefined reference to `dlm_release_lockspace'
      fs/built-in.o: In function `gdlm_lock':
      lock_dlm.c:(.text+0xd8179): undefined reference to `dlm_lock'
      fs/built-in.o: In function `gdlm_cancel':
      lock_dlm.c:(.text+0xd6b22): undefined reference to `dlm_unlock'
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      75f96ce6
    • B
      GFS2: Use single-block reservations for directories · af21ca8e
      Bob Peterson 提交于
      This patch changes the multi-block allocation code, such that
      directory inodes only get a single block reserved in the bitmap.
      That way, the bitmaps are more tightly packed together, and there
      are fewer spans of free blocks for in-use block reservations.
      This means it takes less time to find a free span of blocks in the
      bitmap, which speeds things up. This increases the performance of
      some workloads by almost 2X. In Nate's mockup.py script (which does
      (1) create dir, (2) create dir in dir, (3) create file in that dir)
      the test executes in 23 steps rather than 43 steps, a 47%
      performance improvement.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      af21ca8e
    • B
      GFS2: two minor quota fixups · 37f71577
      Bob Peterson 提交于
      This patch fixes two regression problems that Abhi found in the
      GFS2 quota code.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      37f71577
  10. 08 5月, 2013 1 次提交
  11. 29 4月, 2013 1 次提交
  12. 26 4月, 2013 1 次提交
    • B
      GFS2: Flush work queue before clearing glock hash tables · 222cb538
      Bob Peterson 提交于
      There was a timing window when a GFS2 file system was unmounted
      that caused GFS2 to call BUG() and panic the kernel. The call
      to BUG() is meant to ensure that the glock reference count,
      gl_ref, never gets down to zero and bounce back up again. What was
      happening during umount is that function gfs2_put_super was dequeing
      its glocks for well-known files. In particular, we saw it on the
      journal glock, sd_jinode_gh. The dequeue caused delayed work to be
      queued for the glock state machine, to transition the lock to an
      "unlocked" state. While the work was still queued, gfs2_put_super
      called gfs2_gl_hash_clear to clear out the glock hash tables.
      If the timing was just so, the glock work function would drop the
      reference count at the time when it was being checked for zero,
      and that caused BUG() to be called. This patch calls
      flush_workqueue before clearing the glock hash tables, thereby
      ensuring that the delayed work is executed before the hash tables
      are cleared, and therefore the reference count never goes to zero
      until the glock is cleared.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      222cb538
  13. 10 4月, 2013 2 次提交
    • S
      GFS2: Add origin indicator to glock demote tracing · 7bd8b2eb
      Steven Whitehouse 提交于
      This adds the origin indicator to the trace point for glock
      demotion, so that it is possible to see where demote requests
      have come from.
      
      Note that requests generated from the demote_rq sysfs interface
      will show as remote, since they are intended to replicate
      exactly the effect of a demote reuqest from a remote node. It
      is still possible to tell these apart by looking at the process
      which initiated the demote request.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      7bd8b2eb
    • S
      GFS2: Add origin indicator to glock callbacks · 81ffbf65
      Steven Whitehouse 提交于
      This patch adds a bool indicating whether the demote
      request was originated locally or remotely. This is then
      used by the iopen ->go_callback() to make 100% sure that
      it will only respond to remote callbacks.
      
      Since ->evict_inode() uses GL_NOCACHE when it attempts to
      get an exclusive lock on the iopen lock, this may result
      in extra scheduling of the workqueue in case that the
      exclusive promotion request failed. This patch prevents
      that from happening.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      81ffbf65
  14. 08 4月, 2013 5 次提交
    • B
      GFS2: replace gfs2_ail structure with gfs2_trans · 16ca9412
      Benjamin Marzinski 提交于
      In order to allow transactions and log flushes to happen at the same
      time, gfs2 needs to move the transaction accounting and active items
      list code into the gfs2_trans structure.  As a first step toward this,
      this patch removes the gfs2_ail structure, and handles the active items
      list in the gfs_trans structure.  This keeps gfs2 from allocating an ail
      structure on log flushes, and gives us a struture that can later be used
      to store the transaction accounting outside of the gfs2 superblock
      structure.
      
      With this patch, at the end of a transaction, gfs2 will add the
      gfs2_trans structure to the superblock if there is not one already.
      This structure now has the active items fields that were previously in
      gfs2_ail.  This is not necessary in the case where the transaction was
      simply used to add revokes, since these are never written outside of the
      journal, and thus, don't need an active items list.
      
      Also, in order to make sure that the transaction structure is not
      removed while it's still in use by gfs2_trans_end, unlocking the
      sd_log_flush_lock has to happen slightly later in ending the
      transaction.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      16ca9412
    • B
      GFS2: Remove vestigial parameter ip from function rs_deltree · 20095218
      Bob Peterson 提交于
      The functions that delete block reservations from the rgrp block
      reservations rbtree no longer use the ip parameter. This patch
      eliminates the parameter.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      20095218
    • S
      GFS2: Use gfs2_dinode_out() in the inode create path · 79ba7480
      Steven Whitehouse 提交于
      Over the previous two patches relating to inode creation, the
      content of init_dinode() has been looking more and more like
      gfs2_dinode_out(). This is not an accident! This patch replaces
      the parts of init_dinode() which are duplicated in gfs2_dinode_out()
      with a call to that function.
      
      Mostly that is straightforward, but there is one issue which needed
      to be resolved relating to the link count. The link count has to be
      set to zero in a certain error handling code path, which lands up
      calling iput(). This is now done specifically in that code path
      allowing the link count to be set earlier and written into the
      on disk inode by gfs2_dinode_put() in the normal way.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      79ba7480
    • S
      GFS2: Remove gfs2_refresh_inode from inode creation path · 28fb3027
      Steven Whitehouse 提交于
      The original method for creating inodes used in GFS2 was to fill
      out a buffer, with all the information, and then to read that
      buffer into the in-core inode, using gfs2_refresh_inode()
      
      The problem with this approach is that all the inode's fields
      need to be calculated ahead of time, and were stored in various
      variables making the code rather complicated.
      
      The new approach is simply to allocate the in-core inode earlier
      and fill in as many fields as possible ahead of time. These can
      then be used to initilise the on disk representation. The
      code has been working towards the point where it is possible
      to remove gfs2_refresh_inode() because all the fields are
      correctly initialised ahead of time. We've now reached that
      milestone, and have reversed the order of setting up the in
      core and on disk inodes.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      28fb3027
    • S
      GFS2: Clean up inode creation path · fd4b4e04
      Steven Whitehouse 提交于
      This patch cleans up the inode creation code path in GFS2. After the
      Orlov allocator was merged, a number of potential improvements are
      now possible, and this is a first set of these.
      
      The quota handling is now updated so that it matches the point in
      the code where the allocation takes place. This means that the one
      exception in gfs2_alloc_blocks relating to quota is now no longer
      required, and we can use the generic code everywhere.
      
      In addition the call to figure out whether we need to allocate any
      extra blocks in order to add a directory entry is moved higher up
      gfs2_create_inode. This means that if it returns an error, we
      can deal with that at a stage where it is easier to handle that case.
      The returned status cannot change during the function since we hold
      an exclusive lock on the directory.
      
      Two calls to gfs2_rindex_update have been changed to one, again at
      the top of gfs2_create_inode to simplify error handling.
      
      The time stamps are also now initialised earlier in the creation
      process, this is gradually moving towards being able to remove the
      call to gfs2_refresh_inode in gfs2_inode_create once we have all the
      fields covered.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      fd4b4e04
  15. 06 4月, 2013 1 次提交
    • B
      GFS2: Issue discards in 512b sectors · b2c87cae
      Bob Peterson 提交于
      This patch changes GFS2's discard issuing code so that it calls
      function sb_issue_discard rather than blkdev_issue_discard. The
      code was calling blkdev_issue_discard and specifying the correct
      sector offset and sector size, but blkdev_issue_discard expects
      these values to be in terms of 512 byte sectors, even if the native
      sector size for the device is different. Calling sb_issue_discard
      with the BLOCK size instead ensures the correct block-to-512b-sector
      translation. I verified that "minlen" is specified in blocks, so
      comparing it to a number of blocks is correct.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b2c87cae
  16. 04 4月, 2013 4 次提交
  17. 24 3月, 2013 1 次提交
    • K
      block: Add bio_end_sector() · f73a1c7d
      Kent Overstreet 提交于
      Just a little convenience macro - main reason to add it now is preparing
      for immutable bio vecs, it'll reduce the size of the patch that puts
      bi_sector/bi_size/bi_idx into a struct bvec_iter.
      Signed-off-by: NKent Overstreet <koverstreet@google.com>
      CC: Jens Axboe <axboe@kernel.dk>
      CC: Lars Ellenberg <drbd-dev@lists.linbit.com>
      CC: Jiri Kosina <jkosina@suse.cz>
      CC: Alasdair Kergon <agk@redhat.com>
      CC: dm-devel@redhat.com
      CC: Neil Brown <neilb@suse.de>
      CC: Martin Schwidefsky <schwidefsky@de.ibm.com>
      CC: Heiko Carstens <heiko.carstens@de.ibm.com>
      CC: linux-s390@vger.kernel.org
      CC: Chris Mason <chris.mason@fusionio.com>
      CC: Steven Whitehouse <swhiteho@redhat.com>
      Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
      f73a1c7d
  18. 04 3月, 2013 1 次提交
    • E
      fs: Limit sys_mount to only request filesystem modules. · 7f78e035
      Eric W. Biederman 提交于
      Modify the request_module to prefix the file system type with "fs-"
      and add aliases to all of the filesystems that can be built as modules
      to match.
      
      A common practice is to build all of the kernel code and leave code
      that is not commonly needed as modules, with the result that many
      users are exposed to any bug anywhere in the kernel.
      
      Looking for filesystems with a fs- prefix limits the pool of possible
      modules that can be loaded by mount to just filesystems trivially
      making things safer with no real cost.
      
      Using aliases means user space can control the policy of which
      filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
      with blacklist and alias directives.  Allowing simple, safe,
      well understood work-arounds to known problematic software.
      
      This also addresses a rare but unfortunate problem where the filesystem
      name is not the same as it's module name and module auto-loading
      would not work.  While writing this patch I saw a handful of such
      cases.  The most significant being autofs that lives in the module
      autofs4.
      
      This is relevant to user namespaces because we can reach the request
      module in get_fs_type() without having any special permissions, and
      people get uncomfortable when a user specified string (in this case
      the filesystem type) goes all of the way to request_module.
      
      After having looked at this issue I don't think there is any
      particular reason to perform any filtering or permission checks beyond
      making it clear in the module request that we want a filesystem
      module.  The common pattern in the kernel is to call request_module()
      without regards to the users permissions.  In general all a filesystem
      module does once loaded is call register_filesystem() and go to sleep.
      Which means there is not much attack surface exposed by loading a
      filesytem module unless the filesystem is mounted.  In a user
      namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
      which most filesystems do not set today.
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Reported-by: NKees Cook <keescook@google.com>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      7f78e035
  19. 26 2月, 2013 2 次提交
  20. 23 2月, 2013 1 次提交