1. 27 Jun, 2016 4 commits
    • gfs2: Lock holder cleanup · 6df9f9a2
      Andreas Gruenbacher committed
      Make the code more readable by cleaning up the different ways of
      initializing lock holders and checking for initialized lock holders:
      mark lock holders as uninitialized by setting the holder's glock to NULL
      (gfs2_holder_mark_uninitialized) instead of zeroing out the entire
      object or using a separate flag.  Recognize initialized holders by their
      non-NULL glock (gfs2_holder_initialized).  Don't zero out holder objects
      which are immediately initialized via gfs2_holder_init or
      gfs2_glock_nq_init.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
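The NULL-glock convention described in this commit message can be illustrated with a small user-space sketch. The `sketch_*` types and functions below are hypothetical stand-ins that only mirror the idea; they are not the kernel's `struct gfs2_holder` or its helpers, and the reference counting is simplified to a plain integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a glock; only a reference count is modelled. */
struct sketch_glock {
    int refcount;
};

/* Hypothetical stand-in for a lock holder. A NULL gh_gl pointer is the
 * only marker for "not initialized"; there is no separate flag. */
struct sketch_holder {
    struct sketch_glock *gh_gl;
    unsigned int gh_state;
    unsigned int gh_flags;
};

/* Mark a holder as uninitialized without zeroing the whole object. */
static void sketch_holder_mark_uninitialized(struct sketch_holder *gh)
{
    gh->gh_gl = NULL;
}

/* A holder counts as initialized exactly when its glock is non-NULL. */
static bool sketch_holder_initialized(const struct sketch_holder *gh)
{
    return gh->gh_gl != NULL;
}

/* Initialization sets every field, so the caller has no need to clear
 * the object beforehand. */
static void sketch_holder_init(struct sketch_glock *gl, unsigned int state,
                               unsigned int flags, struct sketch_holder *gh)
{
    gl->refcount++;
    gh->gh_gl = gl;
    gh->gh_state = state;
    gh->gh_flags = flags;
}

/* Drop the reference and return the holder to the uninitialized state. */
static void sketch_holder_uninit(struct sketch_holder *gh)
{
    assert(sketch_holder_initialized(gh));
    gh->gh_gl->refcount--;
    sketch_holder_mark_uninitialized(gh);
}
```

In this sketch a cleanup path can test `sketch_holder_initialized(&gh)` before calling `sketch_holder_uninit(&gh)`, which is the kind of check the commit message describes.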
    • gfs2: Large-filesystem fix for 32-bit systems · cda9dd42
      Andreas Gruenbacher committed
      Commit ff34245d switched from iget5_locked to iget_locked among other
      things, but iget_locked doesn't work for filesystems larger than 2^32
      blocks on 32-bit systems.  Switch back to iget5_locked.  Filesystems
      larger than 2^32 blocks are unlikely to work well on 32-bit systems,
      so this is mostly a code cleanliness fix.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
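The 32-bit limitation comes from the key type: iget_locked identifies an inode by an unsigned long inode number, which is 32 bits wide on 32-bit systems, whereas iget5_locked lets the filesystem supply its own comparison callback. The following user-space sketch models that difference. The `ino32_t` typedef and the `sketch_*` names are assumptions made for illustration, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models unsigned long on a 32-bit kernel, so the sketch behaves the
 * same on any host. */
typedef uint32_t ino32_t;

/* Reducing a 64-bit block address to a 32-bit key drops the upper bits. */
static ino32_t sketch_ino_key(uint64_t no_addr)
{
    return (ino32_t)no_addr;
}

/* Hypothetical in-memory inode that remembers its full 64-bit address. */
struct sketch_inode {
    uint64_t no_addr;
};

/* Comparison in the style of an iget5_locked test callback: it checks the
 * full 64-bit address, so two blocks that share a 32-bit key are still
 * told apart. */
static bool sketch_iget_test(const struct sketch_inode *ip, uint64_t no_addr)
{
    return ip->no_addr == no_addr;
}
```

With block addresses `5` and `5 + 2^32`, the truncated keys are equal while the full comparison distinguishes the two blocks.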
    • gfs2: Get rid of gfs2_ilookup · ec5ec66b
      Andreas Gruenbacher committed
      Now that gfs2_lookup_by_inum only takes the inode glock for new inodes
      (and not for cached inodes anymore), there no longer is a need to
      optimize the cached-inode case in gfs2_get_dentry or delete_work_func,
      and gfs2_ilookup can be removed.
      
      In addition, gfs2_get_dentry wasn't checking the GFS2_DIF_SYSTEM flag in
      i_diskflags in the gfs2_ilookup case (see gfs2_lookup_by_inum); this
      inconsistency goes away as well.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • gfs2: Fix gfs2_lookup_by_inum lock inversion · 3ce37b2c
      Andreas Gruenbacher committed
      The current gfs2_lookup_by_inum takes the glock of a presumed inode
      identified by block number, verifies that the block is indeed an inode,
      and then instantiates and reads the new inode via gfs2_inode_lookup.
      
      However, instantiating a new inode may block on freeing a previous
      instance of that inode (__wait_on_freeing_inode), and freeing an inode
      requires taking the glock already held, leading to lock inversion and
      deadlock.
      
      Fix this by first instantiating the new inode, then verifying that the
      block is an inode (if required), and then reading in the new inode, all
      in gfs2_inode_lookup.
      
      If the block we are looking for is not an inode, we discard the new
      inode via iget_failed, which marks inodes as bad and unhashes them.
      Other tasks waiting on that inode will get a bad inode back from
      ilookup or iget_locked; in that case, retry the lookup.
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
  2. 17 Jun, 2016 1 commit
  3. 10 Jun, 2016 1 commit
    • GFS2: don't set rgrp gl_object until it's inserted into rgrp tree · 36e4ad03
      Bob Peterson committed
      Before this patch, function read_rindex_entry would set a rgrp
      glock's gl_object pointer to itself before inserting the rgrp into
      the rgrp rbtree. The problem is: if another process was also reading
      the rgrp in, and had already inserted its newly created rgrp, then
      the second call to read_rindex_entry would overwrite that value,
      then return a bad return code to the caller. Later, other functions
      would reference the now-freed rgrp memory by way of gl_object.
      In some cases, that could result in gfs2_rgrp_brelse being called
      twice for the same rgrp: once for the failed attempt and once for
      the "real" rgrp release. Eventually the kernel would panic.
      There are also a number of other things that could go wrong when
      a kernel module is accessing freed storage. For example, this could
      result in rgrp corruption because the fake rgrp would point to a
      fake bitmap in memory too, causing gfs2_inplace_reserve to search
      some random memory for free blocks, and find some, since we were
      never setting rgd->rd_bits to NULL before freeing it.
      
      This patch fixes the problem by not setting gl_object until we
      have successfully inserted the rgrp into the rbtree. Also, it sets
      rd_bits to NULL as it frees them, which will ensure any accidental
      access to the wrong rgrp will result in a kernel panic rather than
      file system corruption, which is preferred.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
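The ordering fix described above (publish the back-pointer only after the insertion has succeeded) can be sketched in user space as follows. Everything here is a simplified assumption: a small array stands in for the rgrp rbtree, and the `sketch_*` names are hypothetical rather than the actual gfs2 functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; only the fields needed here are modelled. */
struct sketch_glock {
    void *gl_object;
};

struct sketch_rgrp {
    unsigned long long addr;
    struct sketch_glock *gl;
    unsigned char *bits;
};

#define SKETCH_MAX_RGRPS 8
static struct sketch_rgrp *sketch_index[SKETCH_MAX_RGRPS];
static size_t sketch_count;

/* Insert into the index; returns 0 on success, -1 if an entry with the
 * same address already exists or the index is full. */
static int sketch_index_insert(struct sketch_rgrp *rgd)
{
    for (size_t i = 0; i < sketch_count; i++)
        if (sketch_index[i]->addr == rgd->addr)
            return -1;
    if (sketch_count == SKETCH_MAX_RGRPS)
        return -1;
    sketch_index[sketch_count++] = rgd;
    return 0;
}

/* Build a new entry and publish it through gl_object only once the
 * insertion has succeeded. A caller that loses the race leaves the
 * existing gl_object untouched. */
static int sketch_read_entry(unsigned long long addr, struct sketch_glock *gl)
{
    struct sketch_rgrp *rgd = calloc(1, sizeof(*rgd));

    if (!rgd)
        return -1;
    rgd->addr = addr;
    rgd->gl = gl;
    rgd->bits = malloc(16);
    if (!rgd->bits) {
        free(rgd);
        return -1;
    }
    if (sketch_index_insert(rgd) == 0) {
        gl->gl_object = rgd;   /* publish only after success */
        return 0;
    }
    /* Duplicate: release the private copy. Clearing the pointer mirrors
     * the commit's choice to set rd_bits to NULL when freeing. */
    free(rgd->bits);
    rgd->bits = NULL;
    free(rgd);
    return -1;
}
```

A second call for the same address fails without disturbing the pointer published by the first call, which is the property the patch is after.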
  4. 13 May, 2016 2 commits
  5. 07 May, 2016 1 commit
  6. 03 May, 2016 1 commit
  7. 02 May, 2016 4 commits
  8. 20 Apr, 2016 1 commit
  9. 14 Apr, 2016 1 commit
  10. 13 Apr, 2016 1 commit
  11. 11 Apr, 2016 3 commits
  12. 06 Apr, 2016 1 commit
  13. 05 Apr, 2016 3 commits
    • GFS2: Get rid of dead code in inode_go_demote_ok · 73cc8625
      Bob Peterson committed
      Function inode_go_demote_ok had some code that was only executed
      if gl_holders was not empty. However, if gl_holders was not empty,
      the only caller, demote_ok(), returns before inode_go_demote_ok
      would ever be called. Therefore, it's dead code, so I removed it.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Acked-by: Steven Whitehouse <swhiteho@redhat.com>
    • rhashtable: accept GFP flags in rhashtable_walk_init · 8f6fd83c
      Bob Copeland committed
      In certain cases, the 802.11 mesh pathtable code wants to
      iterate over all of the entries in the forwarding table from
      the receive path, which is inside an RCU read-side critical
      section.  Enable walks inside atomic sections by allowing
      GFP_ATOMIC allocations for the walker state.
      
      Change all existing callsites to pass in GFP_KERNEL.
      Acked-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: Bob Copeland <me@bobcopeland.com>
      [also adjust gfs2/glock.c and rhashtable tests]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov committed
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized, and it is unlikely it ever will.
      
      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion whether a
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      the script below.  For some reason, coccinelle doesn't patch header
      files.  I've called spatch for them manually.
      
      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.
      
      There are a few places in the code that coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
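Because PAGE_CACHE_SHIFT was defined as PAGE_SHIFT, the shift by their difference in the first two rewrite rules is a shift by zero, which is why the expression can be replaced by its operand. A minimal user-space sketch of that arithmetic follows. The `DEMO_*` macros and `demo_*` functions are assumptions for illustration (4 KiB pages are assumed), prefixed to avoid clashing with any system headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page geometry for the sketch: 4 KiB pages. */
#define DEMO_PAGE_SHIFT        12
#define DEMO_PAGE_SIZE         ((uint64_t)1 << DEMO_PAGE_SHIFT)
/* The legacy name was an alias for the same value. */
#define DEMO_PAGE_CACHE_SHIFT  DEMO_PAGE_SHIFT

/* Before: convert a page-cache index to a page index by shifting by the
 * difference of the two shifts, which is zero. */
static uint64_t demo_old_index(uint64_t idx)
{
    return idx << (DEMO_PAGE_CACHE_SHIFT - DEMO_PAGE_SHIFT);
}

/* After: the shift is dropped because it never changed the value. */
static uint64_t demo_new_index(uint64_t idx)
{
    return idx;
}

/* Page index and in-page offset of a byte position, using only the
 * PAGE_* style constants. */
static uint64_t demo_page_index(uint64_t pos)
{
    return pos >> DEMO_PAGE_SHIFT;
}

static uint64_t demo_page_offset(uint64_t pos)
{
    return pos & (DEMO_PAGE_SIZE - 1);
}
```

For byte position 8193 with 4 KiB pages, the page index is 2 and the offset within the page is 1.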
  14. 24 Mar, 2016 1 commit
  15. 15 Mar, 2016 5 commits
  16. 23 Jan, 2016 1 commit
    • wrappers for ->i_mutex access · 5955102c
      Al Viro committed
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  17. 15 Jan, 2016 1 commit
    • kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov committed
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  18. 14 Jan, 2016 1 commit
  19. 07 Jan, 2016 1 commit
  20. 31 Dec, 2015 1 commit
  21. 25 Dec, 2015 1 commit
  22. 22 Dec, 2015 1 commit
  23. 19 Dec, 2015 3 commits
    • GFS2: Don't do glock put on when inode creation fails · 6cc4b6e8
      Bob Peterson committed
      Currently the error path of function gfs2_inode_lookup calls function
      gfs2_glock_put corresponding to an earlier call to gfs2_glock_get for
      the inode glock. That's wrong because the error path also calls
      iget_failed() which eventually calls iput, which eventually calls
      gfs2_evict_inode, which does another gfs2_glock_put. This double-put
      can cause the glock reference count to become unbalanced.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
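The double put described above can be pictured with a tiny reference-counting model. This is a user-space sketch under stated assumptions: the `sketch_*` names are hypothetical, and a single `sketch_evict` call stands in for the iget_failed, iput, gfs2_evict_inode chain mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical glock with a plain integer reference count. */
struct sketch_glock {
    int ref;
};

static void sketch_glock_get(struct sketch_glock *gl)
{
    gl->ref++;
}

static void sketch_glock_put(struct sketch_glock *gl)
{
    assert(gl->ref > 0);   /* an extra put would trip this check */
    gl->ref--;
}

/* Hypothetical inode that owns one glock reference while gl is set. */
struct sketch_inode {
    struct sketch_glock *gl;
};

/* Eviction drops the reference the inode owns, exactly once. */
static void sketch_evict(struct sketch_inode *ip)
{
    if (ip->gl) {
        sketch_glock_put(ip->gl);
        ip->gl = NULL;
    }
}

/* A lookup that fails after taking the glock reference. The error path
 * leaves the put to eviction instead of also calling
 * sketch_glock_put() itself. */
static int sketch_lookup_failing(struct sketch_glock *gl)
{
    struct sketch_inode ip = { NULL };

    sketch_glock_get(gl);
    ip.gl = gl;
    /* ... an error occurs while setting up the inode ... */
    sketch_evict(&ip);
    return -1;
}
```

Starting from a count of 1, a failed lookup in this model returns the count to 1; adding a second put in the error path would leave it at 0, one lower than it should be.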
    • GFS2: Always use iopen glock for gl_deletes · 5ea31bc0
      Bob Peterson committed
      Before this patch, when function try_rgrp_unlink queued a glock for
      delete_work to reclaim the space, it used the inode glock to do so.
      That's different from the iopen callback which uses the iopen glock
      for the same purpose. We should be consistent and always use the
      iopen glock. This may also save us reference counting problems with
      the inode glock, since clear_glock does an extra glock_put() for the
      inode glock.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
    • GFS2: Release iopen glock in gfs2_create_inode error cases · 783013c0
      Bob Peterson committed
      Some error cases in gfs2_create_inode were not unlocking the iopen
      glock, leaving the reference count unbalanced. This adds the proper unlock.
      The error logic in function gfs2_create_inode was also convoluted,
      so this patch simplifies it. It also takes care of a bug in
      which gfs2_qa_delete() was not called in an error case.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>