1. 03 5月, 2017 1 次提交
  2. 11 4月, 2017 1 次提交
    • G
      CIFS: store results of cifs_reopen_file to avoid infinite wait · 1fa839b4
      Germano Percossi 提交于
      This fixes Continuous Availability when errors during
      file reopen are encountered.
      
      cifs_user_readv and cifs_user_writev would wait for ever if
      results of cifs_reopen_file are not stored and for later inspection.
      
      In fact, results are checked and, in case of errors, a chain
      of function calls leading to reads and writes to be scheduled in
      a separate thread is skipped.
      These threads will wake up the corresponding waiters once reads
      and writes are done.
      
      However, given the return value is not stored, when rc is checked
      for errors a previous one (always zero) is inspected instead.
      This leads to pending reads/writes added to the list, making
      cifs_user_readv and cifs_user_writev wait for ever.
      Signed-off-by: NGermano Percossi <germano.percossi@citrix.com>
      Reviewed-by: NPavel Shilovsky <pshilov@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: NSteve French <smfrench@gmail.com>
      1fa839b4
  3. 25 2月, 2017 1 次提交
  4. 02 2月, 2017 2 次提交
    • P
      CIFS: Add copy into pages callback for a read operation · d70b9104
      Pavel Shilovsky 提交于
      Since we have two different types of reads (pagecache and direct)
      we need to process such responses differently after decryption of
      a packet. The change allows to specify a callback that copies a read
      payload data into preallocated pages.
      Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
      d70b9104
    • P
      CIFS: Fix splice read for non-cached files · 9c25702c
      Pavel Shilovsky 提交于
      Currently we call copy_page_to_iter() for uncached reading into a pipe.
      This is wrong because it treats pages as VFS cache pages and copies references
      rather than actual data. When we are trying to read from the pipe we end up
      calling page_cache_pipe_buf_confirm() which returns -ENODATA. This error
      is translated into 0 which is returned to a user.
      
      This issue is reproduced by running xfs-tests suite (generic test #249)
      against mount points with "cache=none". Fix it by mapping pages manually
      and calling copy_to_iter() that copies data into the pipe.
      
      Cc: Stable <stable@vger.kernel.org>
      Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
      9c25702c
  5. 06 12月, 2016 1 次提交
    • P
      CIFS: Fix a possible double locking of mutex during reconnect · 96a988ff
      Pavel Shilovsky 提交于
      With the current code it is possible to lock a mutex twice when
      a subsequent reconnects are triggered. On the 1st reconnect we
      reconnect sessions and tcons and then persistent file handles.
      If the 2nd reconnect happens during the reconnecting of persistent
      file handles then the following sequence of calls is observed:
      
      cifs_reopen_file -> SMB2_open -> small_smb2_init -> smb2_reconnect
      -> cifs_reopen_persistent_file_handles -> cifs_reopen_file (again!).
      
      So, we are trying to acquire the same cfile->fh_mutex twice which
      is wrong. Fix this by moving reconnecting of persistent handles to
      the delayed work (smb2_reconnect_server) and submitting this work
      every time we reconnect tcon in SMB2 commands handling codepath.
      
      This can also lead to corruption of a temporary file list in
      cifs_reopen_persistent_file_handles() because we can recursively
      call this function twice.
      
      Cc: Stable <stable@vger.kernel.org> # v4.9+
      Signed-off-by: NPavel Shilovsky <pshilov@microsoft.com>
      96a988ff
  6. 14 10月, 2016 2 次提交
  7. 13 10月, 2016 2 次提交
  8. 28 9月, 2016 2 次提交
  9. 27 7月, 2016 1 次提交
    • M
      mm, memcg: use consistent gfp flags during readahead · 8a5c743e
      Michal Hocko 提交于
      Vladimir has noticed that we might declare memcg oom even during
      readahead because read_pages only uses GFP_KERNEL (with mapping_gfp
      restriction) while __do_page_cache_readahead uses
      page_cache_alloc_readahead which adds __GFP_NORETRY to prevent from
      OOMs.  This gfp mask discrepancy is really unfortunate and easily
      fixable.  Drop page_cache_alloc_readahead() which only has one user and
      outsource the gfp_mask logic into readahead_gfp_mask and propagate this
      mask from __do_page_cache_readahead down to read_pages.
      
      This alone would have only very limited impact as most filesystems are
      implementing ->readpages and the common implementation mpage_readpages
      does GFP_KERNEL (with mapping_gfp restriction) again.  We can tell it to
      use readahead_gfp_mask instead as this function is called only during
      readahead as well.  The same applies to read_cache_pages.
      
      ext4 has its own ext4_mpage_readpages but the path which has pages !=
      NULL can use the same gfp mask.  Btrfs, cifs, f2fs and orangefs are
      doing a very similar pattern to mpage_readpages so the same can be
      applied to them as well.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [mhocko@suse.com: restrict gfp mask in mpage_alloc]
        Link: http://lkml.kernel.org/r/20160610074223.GC32285@dhcp22.suse.cz
      Link: http://lkml.kernel.org/r/1465301556-26431-1-git-send-email-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Chris Mason <clm@fb.com>
      Cc: Steve French <sfrench@samba.org>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mike Marshall <hubcap@omnibond.com>
      Cc: Jaegeuk Kim <jaegeuk@kernel.org>
      Cc: Changman Lee <cm224.lee@samsung.com>
      Cc: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8a5c743e
  10. 24 6月, 2016 1 次提交
  11. 18 5月, 2016 1 次提交
  12. 02 5月, 2016 3 次提交
  13. 05 4月, 2016 2 次提交
    • K
      mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage · ea1754a0
      Kirill A. Shutemov 提交于
      Mostly direct substitution with occasional adjustment or removing
      outdated comments.
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ea1754a0
    • K
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov 提交于
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  14. 29 3月, 2016 1 次提交
  15. 23 1月, 2016 1 次提交
    • A
      wrappers for ->i_mutex access · 5955102c
      Al Viro 提交于
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      5955102c
  16. 16 1月, 2016 1 次提交
    • K
      page-flags: define PG_locked behavior on compound pages · 48c935ad
      Kirill A. Shutemov 提交于
      lock_page() must operate on the whole compound page.  It doesn't make
      much sense to lock part of compound page.  Change code to use head
      page's PG_locked, if tail page is passed.
      
      This patch also gets rid of custom helper functions --
      __set_page_locked() and __clear_page_locked().  They are replaced with
      helpers generated by __SETPAGEFLAG/__CLEARPAGEFLAG.  Tail pages to these
      helper would trigger VM_BUG_ON().
      
      SLUB uses PG_locked as a bit spin locked.  IIUC, tail pages should never
      appear there.  VM_BUG_ON() is added to make sure that this assumption is
      correct.
      
      [akpm@linux-foundation.org: fix fs/cifs/file.c]
      Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Steve Capper <steve.capper@linaro.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Jérôme Glisse <jglisse@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      48c935ad
  17. 07 11月, 2015 1 次提交
  18. 23 10月, 2015 1 次提交
  19. 17 10月, 2015 1 次提交
    • M
      mm, fs: obey gfp_mapping for add_to_page_cache() · 063d99b4
      Michal Hocko 提交于
      Commit 6afdb859 ("mm: do not ignore mapping_gfp_mask in page cache
      allocation paths") has caught some users of hardcoded GFP_KERNEL used in
      the page cache allocation paths.  This, however, wasn't complete and
      there were others which went unnoticed.
      
      Dave Chinner has reported the following deadlock for xfs on loop device:
      : With the recent merge of the loop device changes, I'm now seeing
      : XFS deadlock on my single CPU, 1GB RAM VM running xfs/073.
      :
      : The deadlocked is as follows:
      :
      : kloopd1: loop_queue_read_work
      :       xfs_file_iter_read
      :       lock XFS inode XFS_IOLOCK_SHARED (on image file)
      :       page cache read (GFP_KERNEL)
      :       radix tree alloc
      :       memory reclaim
      :       reclaim XFS inodes
      :       log force to unpin inodes
      :       <wait for log IO completion>
      :
      : xfs-cil/loop1: <does log force IO work>
      :       xlog_cil_push
      :       xlog_write
      :       <loop issuing log writes>
      :               xlog_state_get_iclog_space()
      :               <blocks due to all log buffers under write io>
      :               <waits for IO completion>
      :
      : kloopd1: loop_queue_write_work
      :       xfs_file_write_iter
      :       lock XFS inode XFS_IOLOCK_EXCL (on image file)
      :       <wait for inode to be unlocked>
      :
      : i.e. the kloopd, with it's split read and write work queues, has
      : introduced a dependency through memory reclaim. i.e. that writes
      : need to be able to progress for reads make progress.
      :
      : The problem, fundamentally, is that mpage_readpages() does a
      : GFP_KERNEL allocation, rather than paying attention to the inode's
      : mapping gfp mask, which is set to GFP_NOFS.
      :
      : The didn't used to happen, because the loop device used to issue
      : reads through the splice path and that does:
      :
      :       error = add_to_page_cache_lru(page, mapping, index,
      :                       GFP_KERNEL & mapping_gfp_mask(mapping));
      
      This has changed by commit aa4d8616 ("block: loop: switch to VFS
      ITER_BVEC").
      
      This patch changes mpage_readpage{s} to follow gfp mask set for the
      mapping.  There are, however, other places which are doing basically the
      same.
      
      lustre:ll_dir_filler is doing GFP_KERNEL from the function which
      apparently uses GFP_NOFS for other allocations so let's make this
      consistent.
      
      cifs:readpages_get_pages is called from cifs_readpages and
      __cifs_readpages_from_fscache called from the same path obeys mapping
      gfp.
      
      ramfs_nommu_expand_for_mapping is hardcoding GFP_KERNEL as well
      regardless it uses mapping_gfp_mask for the page allocation.
      
      ext4_mpage_readpages is the called from the page cache allocation path
      same as read_pages and read_cache_pages
      
      As I've noticed in my previous post I cannot say I would be happy about
      sprinkling mapping_gfp_mask all over the place and it sounds like we
      should drop gfp_mask argument altogether and use it internally in
      __add_to_page_cache_locked that would require all the filesystems to use
      mapping gfp consistently which I am not sure is the case here.  From a
      quick glance it seems that some file system use it all the time while
      others are selective.
      Signed-off-by: NMichal Hocko <mhocko@suse.com>
      Reported-by: NDave Chinner <david@fromorbit.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Ming Lei <ming.lei@canonical.com>
      Cc: Andreas Dilger <andreas.dilger@intel.com>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      063d99b4
  20. 11 9月, 2015 1 次提交
  21. 21 5月, 2015 1 次提交
  22. 11 5月, 2015 1 次提交
  23. 16 4月, 2015 1 次提交
  24. 12 4月, 2015 5 次提交
  25. 21 3月, 2015 1 次提交
    • D
      cifs: fix use-after-free bug in find_writable_file · e1e9bda2
      David Disseldorp 提交于
      Under intermittent network outages, find_writable_file() is susceptible
      to the following race condition, which results in a user-after-free in
      the cifs_writepages code-path:
      
      Thread 1                                        Thread 2
      ========                                        ========
      
      inv_file = NULL
      refind = 0
      spin_lock(&cifs_file_list_lock)
      
      // invalidHandle found on openFileList
      
      inv_file = open_file
      // inv_file->count currently 1
      
      cifsFileInfo_get(inv_file)
      // inv_file->count = 2
      
      spin_unlock(&cifs_file_list_lock);
      
      cifs_reopen_file()                            cifs_close()
      // fails (rc != 0)                            ->cifsFileInfo_put()
                                             spin_lock(&cifs_file_list_lock)
                                             // inv_file->count = 1
                                             spin_unlock(&cifs_file_list_lock)
      
      spin_lock(&cifs_file_list_lock);
      list_move_tail(&inv_file->flist,
            &cifs_inode->openFileList);
      spin_unlock(&cifs_file_list_lock);
      
      cifsFileInfo_put(inv_file);
      ->spin_lock(&cifs_file_list_lock)
      
        // inv_file->count = 0
        list_del(&cifs_file->flist);
        // cleanup!!
        kfree(cifs_file);
      
        spin_unlock(&cifs_file_list_lock);
      
      spin_lock(&cifs_file_list_lock);
      ++refind;
      // refind = 1
      goto refind_writable;
      
      At this point we loop back through with an invalid inv_file pointer
      and a refind value of 1. On second pass, inv_file is not overwritten on
      openFileList traversal, and is subsequently dereferenced.
      Signed-off-by: NDavid Disseldorp <ddiss@suse.de>
      Reviewed-by: NJeff Layton <jlayton@samba.org>
      CC: <stable@vger.kernel.org>
      Signed-off-by: NSteve French <smfrench@gmail.com>
      e1e9bda2
  26. 17 2月, 2015 1 次提交
  27. 11 2月, 2015 1 次提交
  28. 20 1月, 2015 1 次提交
    • S
      Complete oplock break jobs before closing file handle · ca7df8e0
      Sachin Prabhu 提交于
      Commit
      c11f1df5
      requires writers to wait for any pending oplock break handler to
      complete before proceeding to write. This is done by waiting on bit
      CIFS_INODE_PENDING_OPLOCK_BREAK in cifsFileInfo->flags. This bit is
      cleared by the oplock break handler job queued on the workqueue once it
      has completed handling the oplock break allowing writers to proceed with
      writing to the file.
      
      While testing, it was noticed that the filehandle could be closed while
      there is a pending oplock break which results in the oplock break
      handler on the cifsiod workqueue being cancelled before it has had a
      chance to execute and clear the CIFS_INODE_PENDING_OPLOCK_BREAK bit.
      Any subsequent attempt to write to this file hangs waiting for the
      CIFS_INODE_PENDING_OPLOCK_BREAK bit to be cleared.
      
      We fix this by ensuring that we also clear the bit
      CIFS_INODE_PENDING_OPLOCK_BREAK when we remove the oplock break handler
      from the workqueue.
      
      The bug was found by Red Hat QA while testing using ltp's fsstress
      command.
      Signed-off-by: NSachin Prabhu <sprabhu@redhat.com>
      Acked-by: NShirish Pargaonkar <shirishpargaonkar@gmail.com>
      Signed-off-by: NJeff Layton <jlayton@samba.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NSteve French <steve.french@primarydata.com>
      ca7df8e0
  29. 17 1月, 2015 1 次提交