1. 17 7月, 2017 1 次提交
    • D
      VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb) · bc98a42c
      David Howells 提交于
      Firstly by applying the following with coccinelle's spatch:
      
      	@@ expression SB; @@
      	-SB->s_flags & MS_RDONLY
      	+sb_rdonly(SB)
      
      to effect the conversion to sb_rdonly(sb), then by applying:
      
      	@@ expression A, SB; @@
      	(
      	-(!sb_rdonly(SB)) && A
      	+!sb_rdonly(SB) && A
      	|
      	-A != (sb_rdonly(SB))
      	+A != sb_rdonly(SB)
      	|
      	-A == (sb_rdonly(SB))
      	+A == sb_rdonly(SB)
      	|
      	-!(sb_rdonly(SB))
      	+!sb_rdonly(SB)
      	|
      	-A && (sb_rdonly(SB))
      	+A && sb_rdonly(SB)
      	|
      	-A || (sb_rdonly(SB))
      	+A || sb_rdonly(SB)
      	|
      	-(sb_rdonly(SB)) != A
      	+sb_rdonly(SB) != A
      	|
      	-(sb_rdonly(SB)) == A
      	+sb_rdonly(SB) == A
      	|
      	-(sb_rdonly(SB)) && A
      	+sb_rdonly(SB) && A
      	|
      	-(sb_rdonly(SB)) || A
      	+sb_rdonly(SB) || A
      	)
      
      	@@ expression A, B, SB; @@
      	(
      	-(sb_rdonly(SB)) ? 1 : 0
      	+sb_rdonly(SB)
      	|
      	-(sb_rdonly(SB)) ? A : B
      	+sb_rdonly(SB) ? A : B
      	)
      
      to remove left over excess bracketage and finally by applying:
      
      	@@ expression A, SB; @@
      	(
      	-(A & MS_RDONLY) != sb_rdonly(SB)
      	+(bool)(A & MS_RDONLY) != sb_rdonly(SB)
      	|
      	-(A & MS_RDONLY) == sb_rdonly(SB)
      	+(bool)(A & MS_RDONLY) == sb_rdonly(SB)
      	)
      
      to make comparisons against the result of sb_rdonly() (which is a bool)
      work correctly.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      bc98a42c
  2. 05 7月, 2017 3 次提交
  3. 03 4月, 2017 1 次提交
    • A
      Revert "GFS2: Wait for iopen glock dequeues" · d4da3198
      Andreas Gruenbacher 提交于
      Revert commit 86d067a7: it turns out
      that waiting for iopen glock dequeues here isn't needed anymore because
      the bugs that commit was meant to fix have been fixed otherwise.
      
      In addition, we want to avoid waiting on glocks in gfs2_evict_inode in
      shrinker context because the shrinker may be invoked on behalf of DLM,
      in which case calling into DLM again would deadlock.  This commit makes
      the described scenario less likely without completely avoiding it; it's
      still a step in the right direction, though.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      d4da3198
  4. 16 3月, 2017 1 次提交
    • B
      GFS2: Prevent BUG from occurring when normal Withdraws occur · 0d1c7ae9
      Bob Peterson 提交于
      When the GFS2 file system withdraws due to metadata corruption, it
      often has outstanding transactions in the journal and delayed work
      queued for its glocks. This patch adds some new checks for a
      withdrawn file system before proceeding with operations that would
      obviously cause a BUG() to be triggered. That allows GFS2 to be
      safely unmounted rather than cause the system to go down.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      0d1c7ae9
  5. 02 3月, 2017 1 次提交
  6. 03 8月, 2016 1 次提交
  7. 27 6月, 2016 1 次提交
    • A
      gfs2: Lock holder cleanup · 6df9f9a2
      Andreas Gruenbacher 提交于
      Make the code more readable by cleaning up the different ways of
      initializing lock holders and checking for initialized lock holders:
      mark lock holders as uninitialized by setting the holder's glock to NULL
      (gfs2_holder_mark_uninitialized) instead of zeroing out the entire
      object or using a separate flag.  Recognize initialized holders by their
      non-NULL glock (gfs2_holder_initialized).  Don't zero out holder objects
      which are immeditiately initialized via gfs2_holder_init or
      gfs2_glock_nq_init.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      6df9f9a2
  8. 11 4月, 2016 1 次提交
  9. 14 1月, 2016 1 次提交
  10. 19 12月, 2015 2 次提交
  11. 15 12月, 2015 4 次提交
    • B
      gfs2: clear journal live bit in gfs2_log_flush · 400ac52e
      Benjamin Marzinski 提交于
      When gfs2 was unmounting filesystems or changing them to read-only it
      was clearing the SDF_JOURNAL_LIVE bit before the final log flush.  This
      caused a race.  If an inode glock got demoted in the gap between
      clearing the bit and the shutdown flush, it would be unable to reserve
      log space to clear out the active items list in inode_go_sync, causing an
      error in inode_go_inval because the glock was still dirty.
      
      To solve this, the SDF_JOURNAL_LIVE bit is now cleared inside the
      shutdown log flush.  This means that, because of the locking on the log
      blocks, either inode_go_sync will be able to reserve space to clean the
      glock before the shutdown flush, or the shutdown flush will clean the
      glock itself, before inode_go_sync fails to reserve the space. Either
      way, the glock will be clean before inode_go_inval.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      400ac52e
    • B
      gfs2: change gfs2 readdir cookie · 471f3db2
      Benjamin Marzinski 提交于
      gfs2 currently returns 31 bits of filename hash as a cookie that readdir
      uses for an offset into the directory.  When there are a large number of
      directory entries, the likelihood of a collision goes up way too
      quickly.  GFS2 will now return cookies that are guaranteed unique for a
      while, and then fail back to using 30 bits of filename hash.
      Specifically, the directory leaf blocks are divided up into chunks based
      on the minimum size of a gfs2 directory entry (48 bytes). Each entry's
      cookie is based off the chunk where it starts, in the linked list of
      leaf blocks that it hashes to (there are 131072 hash buckets). Directory
      entries will have unique names until they take reach chunk 8192.
      Assuming the largest filenames possible, and the least efficient spacing
      possible, this new method will still be able to return unique names when
      the previous method has statistically more than a 99% chance of a
      collision.  The non-unique names it fails back to are guaranteed to not
      collide with the unique names.
      
      unique cookies will be in this format:
      - 1 bit "0" to make sure the the returned cookie is positive
      - 17 bits for the hash table index
      - 1 bit for the mode "0"
      - 13 bits for the offset
      
      non-unique cookies will be in this format:
      - 1 bit "0" to make sure the the returned cookie is positive
      - 17 bits for the hash table index
      - 1 bit for the mode "1"
      - 13 more bits of the name hash
      
      Another benefit of location based cookies, is that once a directory's
      exhash table is fully extended (so that multiple hash table indexs do
      not use the same leaf blocks), gfs2 can skip sorting the directory
      entries until it reaches the non-unique ones, and then it only needs to
      sort these. This provides a significant speed up for directory reads of
      very large directories.
      
      The only issue is that for these cookies to continue to point to the
      correct entry as files are added and removed from the directory, gfs2
      must keep the entries at the same offset in the leaf block when they are
      split (see my previous patch). This means that until all the nodes in a
      cluster are running with code that will split the directory leaf blocks
      this way, none of the nodes can use the new cookie code. To deal with
      this, gfs2 now has the mount option loccookie, which, if set, will make
      it return these new location based cookies.  This option must not be set
      until all nodes in the cluster are at least running this version of the
      kernel code, and you have guaranteed that there are no outstanding
      cookies required by other software, such as NFS.
      
      gfs2 uses some of the extra space at the end of the gfs2_dirent
      structure to store the calculated readdir cookies. This keeps us from
      needing to allocate a seperate array to hold these values.  gfs2
      recomputes the cookie stored in de_cookie for every readdir call.  The
      time it takes to do so is small, and if gfs2 expected this value to be
      saved on disk, the new code wouldn't work correctly on filesystems
      created with an earlier version of gfs2.
      
      One issue with adding de_cookie to the union in the gfs2_dirent
      structure is that it caused the union to align itself to a 4 byte
      boundary, instead of its previous 2 byte boundary. This changed the
      offset of de_rahead. To solve that, I pulled de_rahead out of the union,
      since it does not need to be there.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      471f3db2
    • B
      GFS2: Update master statfs buffer with sd_statfs_spin locked · 901c6c66
      Bob Peterson 提交于
      Before this patch, function update_statfs called gfs2_statfs_change_out
      to update the master statfs buffer without the sd_statfs_spin held.
      In theory, another process could call gfs2_statfs_sync, which takes
      the sd_statfs_spin lock and re-reads m_sc from the buffer. So there's
      a theoretical timing window in which one process could write the
      master statfs buffer, then another comes along and re-reads it, wiping
      out the changes.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      901c6c66
    • B
      GFS2: Make rgrp reservations part of the gfs2_inode structure · a097dc7e
      Bob Peterson 提交于
      Before this patch, multi-block reservation structures were allocated
      from a special slab. This patch folds the structure into the gfs2_inode
      structure. The disadvantage is that the gfs2_inode needs more memory,
      even when a file is opened read-only. The advantages are: (a) we don't
      need the special slab and the extra time it takes to allocate and
      deallocate from it. (b) we no longer need to worry that the structure
      exists for things like quota management. (c) This also allows us to
      remove the calls to get_write_access and put_write_access since we
      know the structure will exist.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      a097dc7e
  12. 24 11月, 2015 1 次提交
    • B
      GFS2: Extract quota data from reservations structure (revert 5407e242) · b54e9a0b
      Bob Peterson 提交于
      This patch basically reverts the majority of patch 5407e242.
      That patch eliminated the gfs2_qadata structure in favor of just
      using the reservations structure. The problem with doing that is that
      it increases the size of the reservations structure. That is not an
      issue until it comes time to fold the reservations structure into the
      inode in memory so we know it's always there. By separating out the
      quota structure again, we aren't punishing the non-quota users by
      making all the inodes bigger, requiring more slab space. This patch
      creates a new slab area to allocate the quota stuff so it's managed
      a little more sanely.
      Signed-off-by: NBob Peterson <rpeterso@redhat.com>
      b54e9a0b
  13. 17 11月, 2015 1 次提交
  14. 05 9月, 2015 1 次提交
    • K
      fs: create and use seq_show_option for escaping · a068acf2
      Kees Cook 提交于
      Many file systems that implement the show_options hook fail to correctly
      escape their output which could lead to unescaped characters (e.g.  new
      lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
      could lead to confusion, spoofed entries (resulting in things like
      systemd issuing false d-bus "mount" notifications), and who knows what
      else.  This looks like it would only be the root user stepping on
      themselves, but it's possible weird things could happen in containers or
      in other situations with delegated mount privileges.
      
      Here's an example using overlay with setuid fusermount trusting the
      contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
      of "sudo" is something more sneaky:
      
        $ BASE="ovl"
        $ MNT="$BASE/mnt"
        $ LOW="$BASE/lower"
        $ UP="$BASE/upper"
        $ WORK="$BASE/work/ 0 0
        none /proc fuse.pwn user_id=1000"
        $ mkdir -p "$LOW" "$UP" "$WORK"
        $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
        $ cat /proc/mounts
        none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
        none /proc fuse.pwn user_id=1000 0 0
        $ fusermount -u /proc
        $ cat /proc/mounts
        cat: /proc/mounts: No such file or directory
      
      This fixes the problem by adding new seq_show_option and
      seq_show_option_n helpers, and updating the vulnerable show_option
      handlers to use them as needed.  Some, like SELinux, need to be open
      coded due to unusual existing escape mechanisms.
      
      [akpm@linux-foundation.org: add lost chunk, per Kees]
      [keescook@chromium.org: seq_show_option should be using const parameters]
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Acked-by: NSerge Hallyn <serge.hallyn@canonical.com>
      Acked-by: NJan Kara <jack@suse.com>
      Acked-by: NPaul Moore <paul@paul-moore.com>
      Cc: J. R. Okajima <hooanon05g@gmail.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a068acf2
  15. 02 6月, 2015 1 次提交
    • T
      writeback: move bandwidth related fields from backing_dev_info into bdi_writeback · a88a341a
      Tejun Heo 提交于
      Currently, a bdi (backing_dev_info) embeds single wb (bdi_writeback)
      and the role of the separation is unclear.  For cgroup support for
      writeback IOs, a bdi will be updated to host multiple wb's where each
      wb serves writeback IOs of a different cgroup on the bdi.  To achieve
      that, a wb should carry all states necessary for servicing writeback
      IOs for a cgroup independently.
      
      This patch moves bandwidth related fields from backing_dev_info into
      bdi_writeback.
      
      * The moved fields are: bw_time_stamp, dirtied_stamp, written_stamp,
        write_bandwidth, avg_write_bandwidth, dirty_ratelimit,
        balanced_dirty_ratelimit, completions and dirty_exceeded.
      
      * writeback_chunk_size() and over_bground_thresh() now take @wb
        instead of @bdi.
      
      * bdi_writeout_fraction(bdi, ...)	-> wb_writeout_fraction(wb, ...)
        bdi_dirty_limit(bdi, ...)		-> wb_dirty_limit(wb, ...)
        bdi_position_ration(bdi, ...)		-> wb_position_ratio(wb, ...)
        bdi_update_writebandwidth(bdi, ...)	-> wb_update_write_bandwidth(wb, ...)
        [__]bdi_update_bandwidth(bdi, ...)	-> [__]wb_update_bandwidth(wb, ...)
        bdi_{max|min}_pause(bdi, ...)		-> wb_{max|min}_pause(wb, ...)
        bdi_dirty_limits(bdi, ...)		-> wb_dirty_limits(wb, ...)
      
      * Init/exits of the relocated fields are moved to bdi_wb_init/exit()
        respectively.  Note that explicit zeroing is dropped in the process
        as wb's are cleared in entirety anyway.
      
      * As there's still only one bdi_writeback per backing_dev_info, all
        uses of bdi->stat[] are mechanically replaced with bdi->wb.stat[]
        introducing no behavior changes.
      
      v2: Typo in description fixed as suggested by Jan.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      a88a341a
  16. 16 4月, 2015 1 次提交
  17. 21 1月, 2015 1 次提交
  18. 17 11月, 2014 1 次提交
    • B
      GFS2: update freeze code to use freeze/thaw_super on all nodes · 2e60d768
      Benjamin Marzinski 提交于
      The current gfs2 freezing code is considerably more complicated than it
      should be because it doesn't use the vfs freezing code on any node except
      the one that begins the freeze.  This is because it needs to acquire a
      cluster glock before calling the vfs code to prevent a deadlock, and
      without the new freeze_super and thaw_super hooks, that was impossible. To
      deal with the issue, gfs2 had to do some hacky locking tricks to make sure
      that a frozen node couldn't be holding on a lock it needed to do the
      unfreeze ioctl.
      
      This patch makes use of the new hooks to simply the gfs2 locking code. Now,
      all the nodes in the cluster freeze and thaw in exactly the same way. Every
      node in the cluster caches the freeze glock in the shared state.  The new
      freeze_super hook allows the freezing node to grab this freeze glock in
      the exclusive state without first calling the vfs freeze_super function.
      All the nodes in the cluster see this lock change, and call the vfs
      freeze_super function. The vfs locking code guarantees that the nodes can't
      get stuck holding the glocks necessary to unfreeze the system.  To
      unfreeze, the freezing node uses the new thaw_super hook to drop the freeze
      glock. Again, all the nodes notice this, reacquire the glock in shared mode
      and call the vfs thaw_super function.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      2e60d768
  19. 21 8月, 2014 1 次提交
  20. 16 7月, 2014 1 次提交
    • N
      sched: Remove proliferation of wait_on_bit() action functions · 74316201
      NeilBrown 提交于
      The current "wait_on_bit" interface requires an 'action'
      function to be provided which does the actual waiting.
      There are over 20 such functions, many of them identical.
      Most cases can be satisfied by one of just two functions, one
      which uses io_schedule() and one which just uses schedule().
      
      So:
       Rename wait_on_bit and        wait_on_bit_lock to
              wait_on_bit_action and wait_on_bit_lock_action
       to make it explicit that they need an action function.
      
       Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
       which are *not* given an action function but implicitly use
       a standard one.
       The decision to error-out if a signal is pending is now made
       based on the 'mode' argument rather than being encoded in the action
       function.
      
       All instances of the old wait_on_bit and wait_on_bit_lock which
       can use the new version have been changed accordingly and their
       action functions have been discarded.
       wait_on_bit{_lock} does not return any specific error code in the
       event of a signal so the caller must check for non-zero and
       interpolate their own error code as appropriate.
      
      The wait_on_bit() call in __fscache_wait_on_invalidate() was
      ambiguous as it specified TASK_UNINTERRUPTIBLE but used
      fscache_wait_bit_interruptible as an action function.
      David Howells confirms this should be uniformly
      "uninterruptible"
      
      The main remaining user of wait_on_bit{,_lock}_action is NFS
      which needs to use a freezer-aware schedule() call.
      
      A comment in fs/gfs2/glock.c notes that having multiple 'action'
      functions is useful as they display differently in the 'wchan'
      field of 'ps'. (and /proc/$PID/wchan).
      As the new bit_wait{,_io} functions are tagged "__sched", they
      will not show up at all, but something higher in the stack.  So
      the distinction will still be visible, only with different
      function names (gds2_glock_wait versus gfs2_glock_dq_wait in the
      gfs2/glock.c case).
      
      Since first version of this patch (against 3.15) two new action
      functions appeared, on in NFS and one in CIFS.  CIFS also now
      uses an action function that makes the same freezer aware
      schedule call as NFS.
      Signed-off-by: NNeilBrown <neilb@suse.de>
      Acked-by: David Howells <dhowells@redhat.com> (fscache, keys)
      Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2)
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brownSigned-off-by: NIngo Molnar <mingo@kernel.org>
      74316201
  21. 14 5月, 2014 1 次提交
    • B
      GFS2: remove transaction glock · 24972557
      Benjamin Marzinski 提交于
      GFS2 has a transaction glock, which must be grabbed for every
      transaction, whose purpose is to deal with freezing the filesystem.
      Aside from this involving a large amount of locking, it is very easy to
      make the current fsfreeze code hang on unfreezing.
      
      This patch rewrites how gfs2 handles freezing the filesystem. The
      transaction glock is removed. In it's place is a freeze glock, which is
      cached (but not held) in a shared state by every node in the cluster
      when the filesystem is mounted. This lock only needs to be grabbed on
      freezing, and actions which need to be safe from freezing, like
      recovery.
      
      When a node wants to freeze the filesystem, it grabs this glock
      exclusively.  When the freeze glock state changes on the nodes (either
      from shared to unlocked, or shared to exclusive), the filesystem does a
      special log flush.  gfs2_log_flush() does all the work for flushing out
      the and shutting down the incore log, and then it tries to grab the
      freeze glock in a shared state again.  Since the filesystem is stuck in
      gfs2_log_flush, no new transaction can start, and nothing can be written
      to disk. Unfreezing the filesytem simply involes dropping the freeze
      glock, allowing gfs2_log_flush() to grab and then release the shared
      lock, so it is cached for next time.
      
      However, in order for the unfreezing ioctl to occur, gfs2 needs to get a
      shared lock on the filesystem root directory inode to check permissions.
      If that glock has already been grabbed exclusively, fsfreeze will be
      unable to get the shared lock and unfreeze the filesystem.
      
      In order to allow the unfreeze, this patch makes gfs2 grab a shared lock
      on the filesystem root directory during the freeze, and hold it until it
      unfreezes the filesystem.  The functions which need to grab a shared
      lock in order to allow the unfreeze ioctl to be issued now use the lock
      grabbed by the freeze code instead.
      
      The freeze and unfreeze code take care to make sure that this shared
      lock will not be dropped while another process is using it.
      Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      24972557
  22. 04 4月, 2014 1 次提交
    • J
      mm + fs: store shadow entries in page cache · 91b0abe3
      Johannes Weiner 提交于
      Reclaim will be leaving shadow entries in the page cache radix tree upon
      evicting the real page.  As those pages are found from the LRU, an
      iput() can lead to the inode being freed concurrently.  At this point,
      reclaim must no longer install shadow pages because the inode freeing
      code needs to ensure the page tree is really empty.
      
      Add an address_space flag, AS_EXITING, that the inode freeing code sets
      under the tree lock before doing the final truncate.  Reclaim will check
      for this flag before installing shadow pages.
      Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Reviewed-by: NMinchan Kim <minchan@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Bob Liu <bob.liu@oracle.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Metin Doslu <metin@citusdata.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Ozgun Erdogan <ozgun@citusdata.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Roman Gushchin <klamm@yandex-team.ru>
      Cc: Ryan Mallon <rmallon@gmail.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      91b0abe3
  23. 31 3月, 2014 1 次提交
  24. 13 3月, 2014 1 次提交
    • T
      fs: push sync_filesystem() down to the file system's remount_fs() · 02b9984d
      Theodore Ts'o 提交于
      Previously, the no-op "mount -o mount /dev/xxx" operation when the
      file system is already mounted read-write causes an implied,
      unconditional syncfs().  This seems pretty stupid, and it's certainly
      documented or guaraunteed to do this, nor is it particularly useful,
      except in the case where the file system was mounted rw and is getting
      remounted read-only.
      
      However, it's possible that there might be some file systems that are
      actually depending on this behavior.  In most file systems, it's
      probably fine to only call sync_filesystem() when transitioning from
      read-write to read-only, and there are some file systems where this is
      not needed at all (for example, for a pseudo-filesystem or something
      like romfs).
      Signed-off-by: N"Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Artem Bityutskiy <dedekind1@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Evgeniy Dushistov <dushistov@mail.ru>
      Cc: Jan Kara <jack@suse.cz>
      Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: Anders Larsen <al@alarsen.net>
      Cc: Phillip Lougher <phillip@squashfs.org.uk>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
      Cc: Petr Vandrovec <petr@vandrovec.name>
      Cc: xfs@oss.sgi.com
      Cc: linux-btrfs@vger.kernel.org
      Cc: linux-cifs@vger.kernel.org
      Cc: samba-technical@lists.samba.org
      Cc: codalist@coda.cs.cmu.edu
      Cc: linux-ext4@vger.kernel.org
      Cc: linux-f2fs-devel@lists.sourceforge.net
      Cc: fuse-devel@lists.sourceforge.net
      Cc: cluster-devel@redhat.com
      Cc: linux-mtd@lists.infradead.org
      Cc: jfs-discussion@lists.sourceforge.net
      Cc: linux-nfs@vger.kernel.org
      Cc: linux-nilfs@vger.kernel.org
      Cc: linux-ntfs-dev@lists.sourceforge.net
      Cc: ocfs2-devel@oss.oracle.com
      Cc: reiserfs-devel@vger.kernel.org
      02b9984d
  25. 07 3月, 2014 2 次提交
  26. 03 3月, 2014 1 次提交
    • S
      GFS2: Clean up journal extent mapping · b50f227b
      Steven Whitehouse 提交于
      This patch fixes a long standing issue in mapping the journal
      extents. Most journals will consist of only a single extent,
      and although the cache took account of that by merging extents,
      it did not actually map large extents, but instead was doing a
      block by block mapping. Since the journal was only being mapped
      on mount, this was not normally noticeable.
      
      With the updated code, it is now possible to use the same extent
      mapping system during journal recovery (which will be added in a
      later patch). This will allow checking of the integrity of the
      journal before any reply of the journal content is attempted. For
      this reason the code is moving to bmap.c, since it will be used
      more widely in due course.
      
      An exercise left for the reader is to compare the new function
      gfs2_map_journal_extents() with gfs2_write_alloc_required()
      
      Additionally, should there be a failure, the error reporting is
      also updated to show more detail about what went wrong.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      b50f227b
  27. 15 1月, 2014 1 次提交
    • S
      GFS2: Only run logd and quota when mounted read/write · 8ad151c2
      Steven Whitehouse 提交于
      While investigating a rather strange bit of code in the quota
      clean up function, I spotted that the reason for its existence
      was that when remounting read only, we were not stopping the
      quotad thread, and thus it was possible for it to still have
      a reference to some of the quotas in that case.
      
      This patch moves the logd and quota thread start and stop into
      the make_fs_rw/ro functions, so that we now stop those threads
      when mounted read only.
      
      This means that quotad will always be stopped before we call
      the quota clean up function, and we can thus dispose of the
      (rather hackish) code that waits for it to give up its
      reference on the quotas.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      Cc: Abhijith Das <adas@redhat.com>
      8ad151c2
  28. 27 9月, 2013 1 次提交
    • S
      GFS2: Clean up reservation removal · af5c2697
      Steven Whitehouse 提交于
      The reservation for an inode should be cleared when it is truncated so
      that we can start again at a different offset for future allocations.
      We could try and do better than that, by resetting the search based on
      where the truncation started from, but this is only a first step.
      
      In addition, there are three callers of gfs2_rs_delete() but only one
      of those should really be testing the value of i_writecount. While
      we get away with that in the other cases currently, I think it would
      be better if we made that test specific to the one case which
      requires it.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      af5c2697
  29. 03 6月, 2013 1 次提交
  30. 08 4月, 2013 1 次提交
  31. 13 2月, 2013 2 次提交
  32. 29 1月, 2013 1 次提交
    • S
      GFS2: Use ->writepages for ordered writes · 45138990
      Steven Whitehouse 提交于
      Instead of using a list of buffers to write ahead of the journal
      flush, this now uses a list of inodes and calls ->writepages
      via filemap_fdatawrite() in order to achieve the same thing. For
      most use cases this results in a shorter ordered write list,
      as well as much larger i/os being issued.
      
      The ordered write list is sorted by inode number before writing
      in order to retain the disk block ordering between inodes as
      per the previous code.
      
      The previous ordered write code used to conflict in its assumptions
      about how to write out the disk blocks with mpage_writepages()
      so that with this updated version we can also use mpage_writepages()
      for GFS2's ordered write, writepages implementation. So we will
      also send larger i/os from writeback too.
      Signed-off-by: NSteven Whitehouse <swhiteho@redhat.com>
      45138990