  1. 25 Aug, 2017 1 commit
    • GFS2: Withdraw for IO errors writing to the journal or statfs · 942b0cdd
      Committed by Bob Peterson
      Before this patch, if GFS2 encountered IO errors while writing to
      the journal, it would not report the problem, so the errors would go
      unnoticed, sometimes for many hours. Sometimes this would only be
      noticed later, when recovery tried to do journal replay and failed
      due to invalid metadata at the blocks that resulted in IO errors.
      
      This patch makes GFS2's log daemon check for IO errors. If it
      encounters one, it withdraws from the file system and reports
      why in dmesg. A similar action is taken when IO errors occur when
      writing to the system statfs file.
      
      These errors are also reported back to any callers of fsync, since
      that requires the journal to be flushed. Therefore, any IO errors
      that would previously go unnoticed are now noticed and the file
      system is withdrawn as early as possible, thus preventing further
      file system damage.
      
      Also note that this reintroduces superblock variable sd_log_error,
      which Christoph removed with commit f729b66f.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
  2. 16 Aug, 2017 1 commit
    • gfs2: fix slab corruption during mounting and umounting gfs file system · cc1dfa8b
      Committed by Thomas Tai
      When using cman-3.0.12.1 and gfs2-utils-3.0.12.1, mounting and
      unmounting a GFS2 file system would cause the kernel to hang. The slab
      allocator suggests that it is likely a double-free memory corruption.
      The issue is traced back to v3.9-rc6, where a patch was submitted to
      use kzalloc() for storing a bitmap instead of using a local variable.
      The intention was to allocate memory during mount and to free it
      during unmount. The original patch missed a code path which had
      already freed the memory, causing memory corruption. This patch sets
      the memory pointer to NULL after the memory is freed, so that the
      double free cannot happen.
      
      gdlm_mount()
        '-- set_recover_size() which uses kzalloc()
        '-- if dlm does not support ops callbacks then
                '--- free_recover_size() which uses kfree()

      gdlm_unmount()
        '-- free_recover_size() which uses kfree()
      
      The previous patch that introduced the double-free issue is
      commit 57c7310b ("GFS2: use kmalloc for lvb bitmap")
      Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
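The fix described in commit cc1dfa8b can be illustrated with a small userspace sketch. It is not the kernel code: calloc() and free() stand in for kzalloc() and kfree(), and struct lockstruct_sketch and its field are hypothetical names chosen for this example. The point is only that clearing the pointer after freeing makes a second call from the unmount path harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-mount lock state (not the kernel struct). */
struct lockstruct_sketch {
    unsigned long *recover_bits; /* bitmap allocated at mount time */
};

/* Allocate a zeroed bitmap, as set_recover_size() is described as doing. */
int set_recover_size(struct lockstruct_sketch *ls, size_t nwords)
{
    ls->recover_bits = calloc(nwords, sizeof(*ls->recover_bits));
    return ls->recover_bits ? 0 : -1;
}

/* Free the bitmap and clear the pointer so a repeated call cannot double free. */
void free_recover_size(struct lockstruct_sketch *ls)
{
    free(ls->recover_bits);  /* free(NULL) is a no-op, like kfree(NULL) */
    ls->recover_bits = NULL; /* the step the patch adds */
}
```

With the pointer cleared, calling free_recover_size() once from the mount error path and again at unmount only frees the memory the first time.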
  3. 10 Aug, 2017 6 commits
    • gfs2: forcibly flush ail to relieve memory pressure · b066a4ee
      Committed by Abhi Das
      On systems with low memory, it is possible for gfs2 to infinitely
      loop in balance_dirty_pages() under heavy IO (creating sparse files).
      
      balance_dirty_pages() attempts to write out the dirty pages via
      gfs2_writepages() but none are found because these dirty pages are
      being used by the journaling code in the ail. Normally, the journal
      has an upper threshold which when hit triggers an automatic flush
      of the ail. But this threshold can be higher than the number of
      allowable dirty pages and result in the ail never being flushed.
      
      This patch forces an ail flush when gfs2_writepages() fails to write
      anything. This is a good indication that the ail might be holding
      some dirty pages.
      Signed-off-by: Abhi Das <adas@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
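The behaviour described in commit b066a4ee can be pictured with a toy model. Everything here is hypothetical (struct fs_sketch, its counters, and the function names are invented for illustration and do not correspond to kernel structures): when a writeback pass makes no progress, the sketch assumes the dirty pages are pinned by the ail and forces a flush.

```c
#include <assert.h>

/* Toy model of file system state; all fields are hypothetical. */
struct fs_sketch {
    int dirty_pages; /* pages ordinary writeback can write */
    int ail_pages;   /* dirty pages pinned by the journal's ail */
    int ail_flushes; /* number of forced ail flushes so far */
};

/* Write out every page ordinary writeback can reach; return the count. */
int write_dirty_pages(struct fs_sketch *fs)
{
    int written = fs->dirty_pages;
    fs->dirty_pages = 0;
    return written;
}

/* Flush the ail, releasing the pages it was holding. */
void flush_ail(struct fs_sketch *fs)
{
    fs->ail_pages = 0;
    fs->ail_flushes++;
}

/* Writeback entry point: force an ail flush when nothing could be written. */
int writepages_sketch(struct fs_sketch *fs)
{
    int written = write_dirty_pages(fs);
    if (written == 0)
        flush_ail(fs); /* no progress: the ail probably holds the pages */
    return written;
}
```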
    • gfs2: Clean up waiting on glocks · a91323e2
      Committed by Andreas Gruenbacher
      The prepare_to_wait_on_glock and finish_wait_on_glock functions introduced in
      commit 56a365be "gfs2: gfs2_glock_get: Wait on freeing glocks" are
      better removed, resulting in cleaner code.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: Defer deleting inodes under memory pressure · 6a1c8f6d
      Committed by Andreas Gruenbacher
      When under memory pressure and an inode's link count has dropped to
      zero, defer deleting the inode to the delete workqueue.  This avoids
      calling into DLM under memory pressure, which can deadlock.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: gfs2_evict_inode: Put glocks asynchronously · 71c1b213
      Committed by Andreas Gruenbacher
      gfs2_evict_inode is called to free inodes under memory pressure.  The
      function calls into DLM when an inode's last cluster-wide reference goes
      away (remote unlink) and to release the glock and associated DLM lock
      before finally destroying the inode.  However, if DLM is blocked waiting
      for memory to become available, calling into DLM again will deadlock.
      
      Avoid that by decoupling releasing glocks from destroying inodes in that
      case: with gfs2_glock_queue_put, glocks will be dequeued asynchronously
      in work queue context, when the associated inodes have likely already
      been destroyed.
      
      With this change, inodes can end up being unlinked, remote-unlink can be
      triggered, and then the inode can be reallocated before all
      remote-unlink callbacks are processed.  To detect that, revalidate the
      link count in gfs2_evict_inode to make sure we're not deleting an
      allocated, referenced inode.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: Get rid of gfs2_set_nlink · eebd2e81
      Committed by Andreas Gruenbacher
      Remove gfs2_set_nlink which prevents the link count of an inode from
      becoming non-zero once it has reached zero.  The next commit reduces the
      amount of waiting on glocks when an inode is evicted from memory.  With
      that, an inode can become reallocated before all the remote-unlink
      callbacks from a previous delete are processed, which causes the link
      count to change from zero to non-zero.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: gfs2_glock_get: Wait on freeing glocks · 0515480a
      Committed by Andreas Gruenbacher
      Keep glocks in their hash table until they are freed instead of removing
      them when their last reference is dropped.  This makes it possible to
      wait in gfs2_glock_get for any previous instances of a glock to go away
      before creating a new glock.
      
      Special thanks to Andy Price for finding and fixing a problem which also
      required us to delete the rcu_read_unlock from the error case in function
      gfs2_glock_get.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
  4. 09 Aug, 2017 6 commits
  5. 21 Jul, 2017 4 commits
    • GFS2: Set gl_object in inode lookup only after block type check · 4d7c18c7
      Committed by Bob Peterson
      Before this patch, the inode glock's gl_object was set after a
      reference was acquired, but before the block type was verified.
      In cases where the block was unlinked, then freed and reused on
      another node, a residual delete callback (delete_work) would try
      to look up the inode, eventually failing the block check, but
      only after it overwrites gl_object with a pointer to the wrong
      inode. This patch moves the assignment of gl_object after the
      block check so it won't be improperly overwritten.
      
      Likewise, at the end of the function, gfs2_inode_lookup was
      clearing gl_object after it unlocked the glock, which meant
      another process might free the glock in the meantime. This
      patch guards against that case.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
    • GFS2: Introduce helper for clearing gl_object · df3d87bd
      Committed by Bob Peterson
      This patch introduces a new helper function in glock.h that
      clears gl_object, with an added integrity check. An additional
      integrity check has been added to glock_set_object, plus comments.
      This is step 1 in a series to ensure gl_object integrity.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: add flag REQ_PRIO for metadata I/O · e477b24b
      Committed by Coly Li
      When gfs2 does metadata I/O, only REQ_META is used as a metadata hint on
      the bio. But REQ_META is just a hint for block tracing; it does not make
      block layer code handle a bio as a metadata request.

      For some gfs2 metadata I/O, a REQ_PRIO flag on the metadata bio would
      be very informative to block layer code. For example, if bcache is
      used as an I/O cache for gfs2, the bcache code can pick up the hint and
      cache the pre-fetched metadata blocks on the cache device. This may
      improve metadata I/O performance if subsequent requests hit the cache.

      Here are the locations in gfs2 code where a REQ_PRIO flag should be added:
      - All places where REQ_READAHEAD is used; gfs2 code uses this flag for
        metadata read-ahead.
      - In gfs2_meta_rq(), where the first metadata block is read in.
      - In gfs2_write_buf_to_page(), which reads in quota metadata blocks to
        bring them up to date.
      These metadata blocks are likely to be accessed again in the future, so
      adding a REQ_PRIO flag may lead bcache to keep such metadata on the fast
      cache device. On systems without a cache layer, REQ_PRIO can still give
      the block layer a hint to handle metadata requests more appropriately.
      Signed-off-by: Coly Li <colyli@suse.de>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • GFS2: fix code parameter error in inode_go_lock · e7cb550d
      Committed by Wang Xibo
      In inode_go_lock(), the arguments to list_add() are in the wrong order.
      According to the definition of list_add(), the first parameter is the
      new entry and the second is the list head, so ip->i_trunc_list should be
      the first argument and sdp->sd_trunc_list the second.
      
      Signed-off-by: Wang Xibo <wang.xibo@zte.com.cn>
      Signed-off-by: Xiao Likun <xiao.likun@zte.com.cn>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
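The argument order discussed in commit e7cb550d can be shown with a minimal userspace list. This is a sketch that mirrors the (new, head) convention of the kernel's list_add(); it is not the kernel header, and init_list_head() and list_len() are helper names invented for this example.

```c
#include <assert.h>

/* Minimal circular doubly linked list node, modelled on struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Make head an empty list that points at itself. */
void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert new_entry right after head: the new entry first, the list head second. */
void list_add(struct list_head *new_entry, struct list_head *head)
{
    new_entry->next = head->next;
    new_entry->prev = head;
    head->next->prev = new_entry;
    head->next = new_entry;
}

/* Count the entries on the list, excluding the head itself. */
int list_len(const struct list_head *head)
{
    int n = 0;
    for (const struct list_head *p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

Swapping the arguments would treat the new, possibly uninitialized entry as the list head and follow its pointers, which is why the order matters even though both parameters have the same type.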
  6. 20 Jul, 2017 1 commit
  7. 19 Jul, 2017 1 commit
    • gfs2: Don't clear SGID when inheriting ACLs · 914cea93
      Committed by Jan Kara
      When a new directory 'DIR1' is created in a directory 'DIR0' with the
      SGID bit set, 'DIR1' is expected to have the SGID bit set (and owning
      group equal to the owning group of 'DIR0'). However, when 'DIR0' also has
      some default ACLs that 'DIR1' inherits, setting these ACLs will clear the
      SGID bit on 'DIR1' if the user is not a member of the owning group.
      
      Fix the problem by moving posix_acl_update_mode() out of
      __gfs2_set_acl() into gfs2_set_acl(). That way the function will not be
      called when inheriting ACLs which is what we want as it prevents SGID
      bit clearing and the mode has been properly set by posix_acl_create()
      anyway.
      
      Fixes: 07393101
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
  8. 18 Jul, 2017 1 commit
  9. 17 Jul, 2017 1 commit
    • GFS2: Prevent double brelse in gfs2_meta_indirect_buffer · 61eaadcd
      Committed by Bob Peterson
      Before this patch, problems reading in indirect buffers would send
      an IO error back to the caller and release the buffer_head with
      brelse() in function gfs2_meta_indirect_buffer. However, it would
      still return the address of the buffer_head it released. After the
      error was discovered, function gfs2_block_map would call function
      release_metapath to free all buffers. That checked:
      if (mp->mp_bh[i] == NULL) but since the value had been set despite the
      error, it was non-zero, so brelse was called a second time. This
      resulted in the following error:
      
      kernel: WARNING: at fs/buffer.c:1224 __brelse+0x3a/0x40() (Tainted: G        W  -- ------------   )
      kernel: Hardware name: RHEV Hypervisor
      kernel: VFS: brelse: Trying to free free buffer
      
      This patch changes gfs2_meta_indirect_buffer so it only sets
      the buffer_head pointer in cases where it isn't released.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Acked-by: Steven Whitehouse <swhiteho@redhat.com>
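The pattern behind commit 61eaadcd can be sketched in userspace. All names below are hypothetical (struct buf_sketch, read_indirect_sketch, release_all_sketch, and SKETCH_EIO are invented for illustration); a reference counter stands in for buffer_head and brelse(). The read helper stores the pointer in the caller's slot only on success, so the later cleanup pass never releases a buffer twice.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_EIO 5 /* stand-in for the kernel's EIO */

/* Hypothetical stand-in for a buffer_head: only a reference count. */
struct buf_sketch {
    int refcount;
};

/* Drop one reference, playing the role of brelse(). */
void buf_release(struct buf_sketch *b)
{
    if (b)
        b->refcount--;
}

/* Take a reference and "read" the buffer; publish it in *slot only on success. */
int read_indirect_sketch(struct buf_sketch *b, int io_fails,
                         struct buf_sketch **slot)
{
    b->refcount++;
    if (io_fails) {
        buf_release(b);     /* released here, so *slot must stay untouched */
        return -SKETCH_EIO;
    }
    *slot = b;              /* caller now owns the reference */
    return 0;
}

/* Cleanup pass in the style of release_metapath(): release every non-NULL slot. */
void release_all_sketch(struct buf_sketch **slots, int n)
{
    for (int i = 0; i < n; i++) {
        if (slots[i] == NULL)
            continue;
        buf_release(slots[i]);
        slots[i] = NULL;
    }
}
```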
  10. 08 Jul, 2017 3 commits
    • exec: Limit arg stack to at most 75% of _STK_LIM · da029c11
      Committed by Kees Cook
      To avoid pathological stack usage or the need to special-case setuid
      execs, just limit all arg stack usage to at most 75% of _STK_LIM (6MB).
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
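The arithmetic in commit da029c11 can be checked with a short sketch. It assumes _STK_LIM is 8 MB, which is what the stated 6 MB figure implies at 75%; the macro and function names below are hypothetical and only illustrate the calculation, not the kernel's exec code.

```c
#include <assert.h>

/* Assumed value of _STK_LIM: 8 MB, consistent with 75% being 6 MB. */
#define STK_LIM_SKETCH (8UL * 1024 * 1024)

/* Upper bound on argument stack usage: 75% of the stack limit. */
unsigned long arg_stack_limit(void)
{
    return STK_LIM_SKETCH / 4 * 3;
}

/* Return 1 if an argument area of the given size fits under the limit. */
int arg_stack_ok(unsigned long bytes)
{
    return bytes <= arg_stack_limit();
}
```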
    • vfs: fix flock compat thinko · b59eea55
      Committed by Linus Torvalds
      Michael Ellerman reported that commit 8c6657cb ("Switch flock
      copyin/copyout primitives to copy_{from,to}_user()") broke his
      networking on a bunch of PPC machines (64-bit kernel, 32-bit userspace).
      
      The reason is a brown-paper bug by that commit, which had the arguments
      to "copy_flock_fields()" in the wrong order, breaking the compat
      handling for file locking.  Apparently very few people run 32-bit user
      space on x86 any more, so the PPC people got the honor of noticing this
      "feature".
      
      Michael also sent a minimal diff that just changed the order of the
      arguments in that macro.
      
      This is not that minimal diff.
      
      This not only changes the order of the arguments in the macro, it also
      changes them to be pointers (to be consistent with all the other uses of
      those pointers), and makes the functions that do all of this also have
      the proper "const" attribution on the source pointers in order to make
      issues like that (using the source as a destination) be really obvious.
      Reported-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
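The idea in commit b59eea55 of making source pointers const so that swapped arguments become obvious can be sketched as follows. The struct layouts, the macro body, and get_compat_flock_sketch() are simplified assumptions for illustration, not the kernel definitions; only the macro name copy_flock_fields comes from the commit message.

```c
#include <assert.h>

/* Simplified native and compat lock descriptions (hypothetical layouts). */
struct flock_sketch {
    short l_type;
    short l_whence;
    long long l_start;
    long long l_len;
    int l_pid;
};

struct compat_flock_sketch {
    short l_type;
    short l_whence;
    int l_start;
    int l_len;
    int l_pid;
};

/* Copy field by field; both arguments are pointers, destination first. */
#define copy_flock_fields(dst, src)        \
    do {                                   \
        (dst)->l_type = (src)->l_type;     \
        (dst)->l_whence = (src)->l_whence; \
        (dst)->l_start = (src)->l_start;   \
        (dst)->l_len = (src)->l_len;       \
        (dst)->l_pid = (src)->l_pid;       \
    } while (0)

/* The source is const, so passing it as the destination will not compile. */
void get_compat_flock_sketch(struct flock_sketch *kfl,
                             const struct compat_flock_sketch *ufl)
{
    copy_flock_fields(kfl, ufl);
}
```

If the call were written as copy_flock_fields(ufl, kfl), the compiler would reject the assignments to members of a const object, turning the argument-order mistake into a build error rather than a runtime bug.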
    • gfs2: Fix glock rhashtable rcu bug · 961ae1d8
      Committed by Andreas Gruenbacher
      Before commit 88ffbf3e "GFS2: Use resizable hash table for glocks",
      glocks were freed via call_rcu to allow reading the glock hashtable
      locklessly using rcu.  This was then changed to free glocks immediately,
      which made reading the glock hashtable unsafe.  Bring back the original
      code for freeing glocks via call_rcu.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Cc: stable@vger.kernel.org # 4.3+
  11. 07 Jul, 2017 9 commits
  12. 06 Jul, 2017 6 commits