1. 01 7月, 2016 1 次提交
  2. 30 5月, 2016 2 次提交
  3. 28 5月, 2016 1 次提交
  4. 09 5月, 2016 1 次提交
  5. 02 5月, 2016 1 次提交
  6. 11 4月, 2016 2 次提交
  7. 05 4月, 2016 1 次提交
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  8. 31 3月, 2016 1 次提交
    • A
      posix_acl: Inode acl caching fixes · b8a7a3a6
      Andreas Gruenbacher 提交于
      When get_acl() is called for an inode whose ACL is not cached yet, the
      get_acl inode operation is called to fetch the ACL from the filesystem.
      The inode operation is responsible for updating the cached acl with
      set_cached_acl().  This is done without locking at the VFS level, so
      another task can call set_cached_acl() or forget_cached_acl() before the
      get_acl inode operation gets to calling set_cached_acl(), and then
      get_acl's call to set_cached_acl() results in caching an outdate ACL.
      
      Prevent this from happening by setting the cached ACL pointer to a
      task-specific sentinel value before calling the get_acl inode operation.
      Move the responsibility for updating the cached ACL from the get_acl
      inode operations to get_acl().  There, only set the cached ACL if the
      sentinel value hasn't changed.
      
      The sentinel values are chosen to have odd values.  Likewise, the value
      of ACL_NOT_CACHED is odd.  In contrast, ACL object pointers always have
      an even value (ACLs are aligned in memory).  This allows to distinguish
      uncached ACLs values from ACL objects.
      
      In addition, switch from guarding inode->i_acl and inode->i_default_acl
      upates by the inode->i_lock spinlock to using xchg() and cmpxchg().
      
      Filesystems that do not want ACLs returned from their get_acl inode
      operations to be cached must call forget_cached_acl() to prevent the VFS
      from doing so.
      
      (Patch written by Al Viro and Andreas Gruenbacher.)
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      b8a7a3a6
  9. 23 1月, 2016 1 次提交
    • A
      wrappers for ->i_mutex access · 5955102c
      Al Viro 提交于
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      5955102c
  10. 15 1月, 2016 1 次提交
    • V
      kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov 提交于
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      5d097056
  11. 09 1月, 2016 2 次提交
  12. 31 12月, 2015 1 次提交
  13. 09 12月, 2015 2 次提交
    • A
      replace ->follow_link() with new method that could stay in RCU mode · 6b255391
      Al Viro 提交于
      new method: ->get_link(); replacement of ->follow_link().  The differences
      are:
      	* inode and dentry are passed separately
      	* might be called both in RCU and non-RCU mode;
      the former is indicated by passing it a NULL dentry.
      	* when called that way it isn't allowed to block
      and should return ERR_PTR(-ECHILD) if it needs to be called
      in non-RCU mode.
      
      It's a flagday change - the old method is gone, all in-tree instances
      converted.  Conversion isn't hard; said that, so far very few instances
      do not immediately bail out when called in RCU mode.  That'll change
      in the next commits.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      6b255391
    • A
      9p: ->evict_inode() should kick out ->i_data, not ->i_mapping · 4ad78628
      Al Viro 提交于
      For block devices the pagecache is associated with the inode
      on bdevfs, not with the aliasing ones on the mountable filesystems.
      The latter have its own ->i_data empty and ->i_mapping pointing
      to the (unique per major/minor) bdevfs inode.  That guarantees
      cache coherence between all block device inodes with the same
      device number.
      
      Eviction of an alias inode has no business trying to evict the
      pages belonging to bdevfs one; moreover, ->i_mapping is only
      safe to access when the thing is opened.  At the time of
      ->evict_inode() the victim is definitely *not* opened.  We are
      about to kill the address space embedded into struct inode
      (inode->i_data) and that's what we need to empty of any pages.
      
      9p instance tries to empty inode->i_mapping instead, which is
      both unsafe and bogus - if we have several device nodes with
      the same device number in different places, closing one of them
      should not try to empty the (shared) page cache.
      
      Fortunately, other instances in the tree are OK; they are
      evicting from &inode->i_data instead, as 9p one should.
      
      Cc: stable@vger.kernel.org # v2.6.32+, ones prior to 2.6.36 need only half of that
      Reported-by: N"Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
      Tested-by: N"Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      4ad78628
  14. 07 12月, 2015 2 次提交
  15. 14 11月, 2015 2 次提交
    • A
      9p: xattr simplifications · e409de99
      Andreas Gruenbacher 提交于
      Now that the xattr handler is passed to the xattr handler operations, we
      can use the same get and set operations for the user, trusted, and security
      xattr namespaces.  In those namespaces, we can access the full attribute
      name by "reattaching" the name prefix the vfs has skipped for us.  Add a
      xattr_full_name helper to make this obvious in the code.
      
      For the "system.posix_acl_access" and "system.posix_acl_default"
      attributes, handler->prefix is the full attribute name; the suffix is the
      empty string.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Cc: Eric Van Hensbergen <ericvh@gmail.com>
      Cc: Ron Minnich <rminnich@sandia.gov>
      Cc: Latchesar Ionkov <lucho@ionkov.net>
      Cc: v9fs-developer@lists.sourceforge.net
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      e409de99
    • A
      xattr handlers: Pass handler to operations instead of flags · d9a82a04
      Andreas Gruenbacher 提交于
      The xattr_handler operations are currently all passed a file system
      specific flags value which the operations can use to disambiguate between
      different handlers; some file systems use that to distinguish the xattr
      namespace, for example.  In some oprations, it would be useful to also have
      access to the handler prefix.  To allow that, pass a pointer to the handler
      to operations instead of the flags value alone.
      Signed-off-by: NAndreas Gruenbacher <agruenba@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      d9a82a04
  16. 11 11月, 2015 1 次提交
  17. 10 11月, 2015 1 次提交
  18. 06 11月, 2015 1 次提交
  19. 23 10月, 2015 1 次提交
  20. 24 8月, 2015 2 次提交
  21. 12 7月, 2015 1 次提交
  22. 08 6月, 2015 1 次提交
    • T
      v9fs: fix error handling in v9fs_session_init() · 412a19b6
      Tejun Heo 提交于
      On failure, v9fs_session_init() returns with the v9fs_session_info
      struct partially initialized and expects the caller to invoke
      v9fs_session_close() to clean it up; however, it doesn't track whether
      the bdi is initialized or not and curiously invokes bdi_destroy() in
      both vfs_session_init() failure path too.
      
      A. If v9fs_session_init() fails before the bdi is initialized, the
         follow-up v9fs_session_close() will invoke bdi_destroy() on an
         uninitialized bdi.
      
      B. If v9fs_session_init() fails after the bdi is initialized,
         bdi_destroy() will be called twice on the same bdi - once in the
         failure path of v9fs_session_init() and then by
         v9fs_session_close().
      
      A is broken no matter what.  B used to be okay because bdi_destroy()
      allowed being invoked multiple times on the same bdi, which BTW was
      broken in its own way - if bdi_destroy() was invoked on an initialiezd
      but !registered bdi, it'd fail to free percpu counters.  Since
      f0054bb1 ("writeback: move backing_dev_info->wb_lock and
      ->worklist into bdi_writeback"), this no longer work - bdi_destroy()
      on an initialized but not registered bdi works correctly but multiple
      invocations of bdi_destroy() is no longer allowed.
      
      The obvious culprit here is v9fs_session_init()'s odd and broken error
      behavior.  It should simply clean up after itself on failures.  This
      patch makes the following updates to v9fs_session_init().
      
      * @rc -> @retval error return propagation removed.  It didn't serve
        any purpose.  Just use @rc.
      
      * Move addition to v9fs_sessionlist to the end of the function so that
        incomplete sessions are not put on the list or iterated and error
        path doesn't have to worry about it.
      
      * Update error handling so that it cleans up after itself.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      412a19b6
  23. 15 5月, 2015 1 次提交
  24. 11 5月, 2015 4 次提交
  25. 25 4月, 2015 1 次提交
    • J
      fs/9p: fix readdir() · 8e3c5005
      Johannes Berg 提交于
      Al Viro's IOV changes broke 9p readdir() because the new code
      didn't abort the read when it returned nothing. The original
      code checked if the combined error/length was <= 0 but in the
      new code that accidentally got changed to just an error check.
      
      Add back the return from the function when nothing is read.
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Fixes: e1200fe6 ("9p: switch p9_client_read() to passing struct iov_iter *")
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      8e3c5005
  26. 16 4月, 2015 1 次提交
  27. 12 4月, 2015 4 次提交