1. 18 11月, 2008 1 次提交
    • D
      prevent cifs_writepages() from skipping unwritten pages · b066a48c
      Dave Kleikamp 提交于
      Fixes a data corruption under heavy stress in which pages could be left
      dirty after all open instances of a inode have been closed.
      
      In order to write contiguous pages whenever possible, cifs_writepages()
      asks pagevec_lookup_tag() for more pages than it may write at one time.
      Normally, it then resets index just past the last page written before calling
      pagevec_lookup_tag() again.
      
      If cifs_writepages() can't write the first page returned, it wasn't resetting
      index, and the next call to pagevec_lookup_tag() resulted in skipping all of
      the pages it previously returned, even though cifs_writepages() did nothing
      with them.  This can result in data loss when the file descriptor is about
      to be closed.
      
      This patch ensures that index gets set back to the next returned page so
      that none get skipped.
      Signed-off-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
      Acked-by: NJeff Layton <jlayton@redhat.com>
      Cc: Shirish S Pargaonkar <shirishp@us.ibm.com>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      b066a48c
  2. 14 11月, 2008 1 次提交
    • S
      [CIFS] Fix cifs reconnection flags · 3b795210
      Steve French 提交于
      In preparation for Jeff's big umount/mount fixes to remove the possibility of
      various races in cifs mount and linked list handling of sessions, sockets and
      tree connections, this patch cleans up some repetitive code in cifs_mount,
      and addresses a problem with ses->status and tcon->tidStatus in which we
      were overloading the "need_reconnect" state with other status in that
      field.  So the "need_reconnect" flag has been broken out from those
      two state fields (need reconnect was not mutually exclusive from some of the
      other possible tid and ses states).  In addition, a few exit cases in
      cifs_mount were cleaned up, and a problem with a tcon flag (for lease support)
      was not being set consistently for the 2nd mount of the same share
      
      CC: Jeff Layton <jlayton@redhat.com>
      CC: Shirish Pargaonkar <shirishp@us.ibm.com>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      3b795210
  3. 31 10月, 2008 1 次提交
  4. 20 10月, 2008 1 次提交
    • R
      vmscan: split LRU lists into anon & file sets · 4f98a2fe
      Rik van Riel 提交于
      Split the LRU lists in two, one set for pages that are backed by real file
      systems ("file") and one for pages that are backed by memory and swap
      ("anon").  The latter includes tmpfs.
      
      The advantage of doing this is that the VM will not have to scan over lots
      of anonymous pages (which we generally do not want to swap out), just to
      find the page cache pages that it should evict.
      
      This patch has the infrastructure and a basic policy to balance how much
      we scan the anon lists and how much we scan the file lists.  The big
      policy changes are in separate patches.
      
      [lee.schermerhorn@hp.com: collect lru meminfo statistics from correct offset]
      [kosaki.motohiro@jp.fujitsu.com: prevent incorrect oom under split_lru]
      [kosaki.motohiro@jp.fujitsu.com: fix pagevec_move_tail() doesn't treat unevictable page]
      [hugh@veritas.com: memcg swapbacked pages active]
      [hugh@veritas.com: splitlru: BDI_CAP_SWAP_BACKED]
      [akpm@linux-foundation.org: fix /proc/vmstat units]
      [nishimura@mxp.nes.nec.co.jp: memcg: fix handling of shmem migration]
      [kosaki.motohiro@jp.fujitsu.com: adjust Quicklists field of /proc/meminfo]
      [kosaki.motohiro@jp.fujitsu.com: fix style issue of get_scan_ratio()]
      Signed-off-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NLee Schermerhorn <Lee.Schermerhorn@hp.com>
      Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: NHugh Dickins <hugh@veritas.com>
      Signed-off-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4f98a2fe
  5. 25 9月, 2008 1 次提交
  6. 23 9月, 2008 1 次提交
    • J
      cifs: have find_writeable_file prefer filehandles opened by same task · 2846d386
      Jeff Layton 提交于
      When the CIFS client goes to write out pages, it needs to pick a
      filehandle to write to. find_writeable_file however just picks the
      first filehandle that it finds. This can cause problems when a lock
      is issued against a particular filehandle and we pick a different
      filehandle to write to.
      
      This patch tries to avert this situation by having find_writable_file
      prefer filehandles that have a pid that matches the current task.
      This seems to fix lock test 11 from the connectathon test suite when
      run against a windows server.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      2846d386
  7. 28 8月, 2008 1 次提交
    • J
      cifs: fix O_APPEND on directio mounts · 838726c4
      Jeff Layton 提交于
      The direct I/O write codepath for CIFS is done through
      cifs_user_write(). That function does not currently call
      generic_write_checks() so the file position isn't being properly set
      when the file is opened with O_APPEND.  It's also not doing the other
      "normal" checks that should be done for a write call.
      
      The problem is currently that when you open a file with O_APPEND on a
      mount with the directio mount option, the file position is set to the
      beginning of the file. This makes any subsequent writes clobber the data
      in the file starting at the beginning.
      
      This seems to fix the problem in cursory testing. It is, however
      important to note that NFS disallows the combination of
      (O_DIRECT|O_APPEND). If my understanding is correct, the concern is
      races with multiple clients appending to a file clobbering each others'
      data. Since the write model for CIFS and NFS is pretty similar in this
      regard, CIFS is probably subject to the same sort of races. What's
      unclear to me is why this is a particular problem with O_DIRECT and not
      with buffered writes...
      
      Regardless, disallowing O_APPEND on an entire mount is probably not
      reasonable, so we'll probably just have to deal with it and reevaluate
      this flag combination when we get proper support for O_DIRECT. In the
      meantime this patch at least fixes the existing problem.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Cc: Stable Tree <stable@kernel.org>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      838726c4
  8. 06 8月, 2008 1 次提交
  9. 05 8月, 2008 1 次提交
  10. 24 5月, 2008 1 次提交
  11. 15 5月, 2008 1 次提交
  12. 29 4月, 2008 1 次提交
  13. 15 3月, 2008 1 次提交
    • S
      [CIFS] file create with acl support enabled is slow · 8b1327f6
      Steve French 提交于
      Shirish Pargaonkar noted:
      With cifsacl mount option, when a file is created on the Windows server,
      exclusive oplock is broken right away because the get cifs acl code
      again opens the file to obtain security descriptor.
      The client does not have the newly created file handle or inode in any
      of its lists yet so it does not respond to oplock break and server waits for
      its duration and then responds to the second open. This slows down file
      creation signficantly.  The fix is to pass the file descriptor to the get
      cifsacl code wherever available so that get cifs acl code does not send
      second open (NT Create ANDX) and oplock is not broken.
      
      CC: Shirish Pargaonkar <shirishp@us.ibm.com>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      8b1327f6
  14. 13 2月, 2008 1 次提交
  15. 08 2月, 2008 1 次提交
  16. 31 12月, 2007 1 次提交
  17. 21 11月, 2007 1 次提交
    • J
      [CIFS] Fix potential data corruption when writing out cached dirty pages · cea21805
      Jeff Layton 提交于
      Fix RedHat bug 329431
      
      The idea here is separate "conscious" from "unconscious" flushes.
      Conscious flushes are those due to a fsync() or close(). Unconscious
      ones are flushes that occur as a side effect of some other operation or
      due to memory pressure.
      
      Currently, when an error occurs during an unconscious flush (ENOSPC or
      EIO), we toss out the page and don't preserve that error to report to
      the user when a conscious flush occurs. If after the unconscious flush,
      there are no more dirty pages for the inode, the conscious flush will
      simply return success even though there were previous errors when writing
      out pages. This can lead to data corruption.
      
      The easiest way to reproduce this is to mount up a CIFS share that's
      very close to being full or where the user is very close to quota. mv
      a file to the share that's slightly larger than the quota allows. The
      writes will all succeed (since they go to pagecache). The mv will do a
      setattr to set the new file's attributes. This calls
      filemap_write_and_wait,
      which will return an error since all of the pages can't be written out.
      Then later, when the flush and release ops occur, there are no more
      dirty pages in pagecache for the file and those operations return 0. mv
      then assumes that the file was written out correctly and deletes the
      original.
      
      CIFS already has a write_behind_rc variable where it stores the results
      from earlier flushes, but that value is only reported in cifs_close.
      Since the VFS ignores the return value from the release operation, this
      isn't helpful. We should be reporting this error during the flush
      operation.
      
      This patch does the following:
      
      1) changes cifs_fsync to use filemap_write_and_wait and cifs_flush and also
      sync to check its return code. If it returns successful, they then check
      the value of write_behind_rc to see if an earlier flush had reported any
      errors. If so, they return that error and clear write_behind_rc.
      
      2) sets write_behind_rc in a few other places where pages are written
      out as a side effect of other operations and the code waits on them.
      
      3) changes cifs_setattr to only call filemap_write_and_wait for
      ATTR_SIZE changes.
      
      4) makes cifs_writepages accurately distinguish between EIO and ENOSPC
      errors when writing out pages.
      
      Some simple testing indicates that the patch works as expected and that
      it fixes the reproduceable known problem.
      Acked-by: NDave Kleikamp <shaggy@austin.rr.com>
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      cea21805
  18. 17 11月, 2007 1 次提交
  19. 14 11月, 2007 1 次提交
    • S
      [CIFS] Fix buffer overflow if server sends corrupt response to small · 133672ef
      Steve French 提交于
      request
      
      In SendReceive() function in transport.c - it memcpy's
      message payload into a buffer passed via out_buf param. The function
      assumes that all buffers are of size (CIFSMaxBufSize +
      MAX_CIFS_HDR_SIZE) , unfortunately it is also called with smaller
      (MAX_CIFS_SMALL_BUFFER_SIZE) buffers.  There are eight callers
      (SMB worker functions) which are primarily affected by this change:
      
      TreeDisconnect, uLogoff, Close, findClose, SetFileSize, SetFileTimes,
      Lock and PosixLock
      
      CC: Dave Kleikamp <shaggy@austin.ibm.com>
      CC: Przemyslaw Wegrzyn <czajnik@czajsoft.pl>
      Acked-by: NJeff Layton <jlayton@redhat.com>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      133672ef
  20. 26 10月, 2007 1 次提交
  21. 13 10月, 2007 1 次提交
  22. 02 10月, 2007 1 次提交
    • S
      [CIFS] Reduce chance of list corruption in find_writable_file · 9b22b0b7
      Steve French 提交于
      When find_writable_file is racing with close and the session
      to the server goes down, Shaggy noticed that there was a
      chance that an open file in the list of files off the inode
      could have been freed by close since cifs_reconnect can
      block (the spinlock thus not held). This means that
      we have to start over at the beginning of the list in some
      cases.
      
      There is a 2nd change that needs to be made later
      (pointed out by Jeremy Allison and Shaggy) in order to
      prevent cifs_close ever freeing the cifs per file info
      when a write is pending.  Although we delay close from
      freeing this memory for sufficiently long for all known
      cases, ultimately on a very, very slow write
      overlapping a close pending we need to allow close to return
      (without freeing the cifs file info) and defer freeing the
      memory to be the responsibility of the (sloooow) write
      thread (presumably have to look at every place wrtPending
      is decremented - and add a flag for deferred free for
      after wrtPending goes to zero).
      Acked-by: NShaggy <shaggy@us.ibm.com>
      Acked-by: NShirish Pargaonkar <shirishp@us.ibm.com>
      Signed-off-by: NSteve French <sfrench@us.ibm.com>
      9b22b0b7
  23. 11 9月, 2007 1 次提交
  24. 08 9月, 2007 1 次提交
  25. 24 8月, 2007 2 次提交
  26. 26 7月, 2007 1 次提交
  27. 19 7月, 2007 1 次提交
  28. 18 7月, 2007 1 次提交
  29. 16 7月, 2007 1 次提交
  30. 13 7月, 2007 1 次提交
  31. 12 7月, 2007 1 次提交
  32. 10 7月, 2007 1 次提交
  33. 25 6月, 2007 1 次提交
  34. 09 5月, 2007 1 次提交
  35. 03 5月, 2007 1 次提交
  36. 05 4月, 2007 1 次提交
  37. 03 4月, 2007 1 次提交
  38. 06 3月, 2007 1 次提交
  39. 27 2月, 2007 1 次提交