1. 04 Mar, 2010 1 commit
  2. 10 Dec, 2009 1 commit
    • vfs: Implement proper O_SYNC semantics · 6b2f3d1f
      Christoph Hellwig committed
      While Linux has provided an O_SYNC flag basically since day 1, it took until
      Linux 2.4.0-test12pre2 to actually get it implemented for filesystems.
      Since that day we have had generic_osync_around with only minor changes and
      the great "For now, when the user asks for O_SYNC, we'll actually give
      O_DSYNC" comment.  This patch intends to actually give us real O_SYNC
      semantics in addition to the O_DSYNC semantics.  After Jan's O_SYNC
      patches, which are required before this patch, it's actually surprisingly
      simple: we just need to figure out when to set the datasync flag to
      vfs_fsync_range and when not.
      
      This patch renames the existing O_SYNC flag to O_DSYNC while keeping its
      numerical value to keep binary compatibility, and adds a new real O_SYNC
      flag.  To guarantee backwards compatibility it is defined as expanding to
      both the O_DSYNC and the new additional binary flag (__O_SYNC) to make
      sure we are backwards-compatible when compiled against the new headers.
      
      This also means that all places that don't care about the differences can
      just check O_DSYNC and get the right behaviour for O_SYNC, too - only
      places that actually care need to check __O_SYNC in addition.  Drivers and
      network filesystems have been updated in a fail-safe way to always do the
      full sync magic if O_DSYNC is set.  The few places setting O_SYNC for
      lower layers are kept that way for now to stay fail-safe.
      
      We enforce that O_DSYNC is set when __O_SYNC is set early in the open path
      to make sure we always get these sane options.
      
      Note that parisc really screwed up their headers as they already define an
      O_DSYNC that has always been a no-op.  We try to repair it by using it for
      the new O_DSYNC and redefining O_SYNC to set both the traditional
      O_SYNC numerical value _and_ the O_DSYNC one.
      
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andreas Dilger <adilger@sun.com>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: Kyle McMartin <kyle@mcmartin.ca>
      Acked-by: Ulrich Drepper <drepper@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jan Kara <jack@suse.cz>
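The flag layout described in the O_SYNC commit above can be sketched in C roughly as follows. This is a minimal illustration, not the kernel's code: the `EX_` names, the helper functions and the numeric values are assumptions made for the example (real values are architecture specific).

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real ones differ per architecture. */
#define EX_O_DSYNC     00010000u  /* keeps the numerical value of the old O_SYNC */
#define EX_O_SYNC_BIT  04000000u  /* plays the role of the new __O_SYNC bit */
#define EX_O_SYNC      (EX_O_SYNC_BIT | EX_O_DSYNC)

/* Hypothetical open-path helper: force O_DSYNC on whenever __O_SYNC is set. */
static unsigned int ex_sanitize_open_flags(unsigned int flags)
{
    if (flags & EX_O_SYNC_BIT)
        flags |= EX_O_DSYNC;
    return flags;
}

/* Code that does not care about the difference only tests O_DSYNC. */
static bool ex_needs_sync_write(unsigned int flags)
{
    return (flags & EX_O_DSYNC) != 0;
}

/* Code that does care decides whether a data-only sync is sufficient. */
static bool ex_datasync_only(unsigned int flags)
{
    assert(flags & EX_O_DSYNC);  /* only meaningful for synchronous opens */
    return (flags & EX_O_SYNC_BIT) == 0;
}
```

Because `EX_O_SYNC` contains the `EX_O_DSYNC` bit, a binary built against the old headers (which passes only the old numerical value) is treated as requesting data-sync semantics, while a caller passing the full `EX_O_SYNC` gets the stricter behaviour.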
  3. 26 Sep, 2009 1 commit
  4. 25 Sep, 2009 2 commits
    • cifs: eliminate cifs_init_private · 086f68bd
      Jeff Layton committed
      ...it does the same thing as cifs_fill_fileinfo, but doesn't handle the
      flist ordering correctly. Also rename cifs_fill_fileinfo to a more
      descriptive name and have it take an open flags arg instead of just a
      write_only flag. That makes the logic in the callers a little simpler.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
    • cifs: convert oplock breaks to use slow_work facility (try #4) · 3bc303c2
      Jeff Layton committed
      This is the fourth respin of the patch to convert oplock breaks to
      use the slow_work facility.
      
      A customer of ours was testing a backport of one of the earlier
      patchsets, and hit a "Busy inodes after umount..." problem. An oplock
      break job had raced with a umount, and the superblock got torn down and
      its memory reused. When the oplock break job tried to dereference the
      inode->i_sb, the kernel oopsed.
      
      This patchset has the oplock break job hold an inode and vfsmount
      reference until the oplock break completes.  With this, there should be
      no need to take a tcon reference (the vfsmount implicitly holds one
      already).
      
      Currently, when an oplock break comes in there's a chance that the
      oplock break job won't run if the allocation of the oplock_q_entry
      fails. There are also some rather nasty races in the allocation and
      handling of these structs.
      
      Rather than allocating oplock queue entries when an oplock break comes
      in, add a few extra fields to the cifsFileInfo struct. Get rid of the
      dedicated cifs_oplock_thread as well and queue the oplock break job to
      the slow_work thread pool.
      
      This approach also has the advantage that the oplock break jobs can
      potentially run in parallel rather than be serialized like they are
      today.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
  5. 16 Sep, 2009 2 commits
  6. 02 Sep, 2009 1 commit
  7. 10 Jul, 2009 1 commit
  8. 28 Jun, 2009 1 commit
  9. 26 Jun, 2009 1 commit
    • cifs: Fix incorrect return code being printed in cFYI messages · 0f3bc09e
      Suresh Jayaraman committed
      Besides freeing the Xid, FreeXid() also emits a cifsFYI debug message
      that prints rc (the return code). In some code paths where we set or
      return the error code after calling FreeXid(), an incorrect error code
      is printed when cifsFYI is enabled.

      This can be misleading in a few cases. For example, in cifs_open(), if
      cifs_fill_filedata() returns a valid pointer to cifsFileInfo, FreeXid()
      prints rc=-13 whereas 0 is actually being returned. Fix this by setting
      rc before calling FreeXid().
      
      Basically convert
      
      FreeXid(xid);			rc = -ERR;
      return -ERR;		=>	FreeXid(xid);
      				return rc;
      
      [Note that Christoph would like to replace the GetXid/FreeXid calls,
      which are primarily used for debugging.  This seems like a good
      longer-term goal, but although there is an alternative tracing
      facility, I know of no examples yet that we can use to convert this
      cifs function entry/exit logging, or to create an identifier that
      correlates all dmesg log entries for a particular vfs operation (i.e.
      identify all log entries for a particular vfs request to cifs, e.g. a
      particular close, read, write or byte range lock call; just using the
      thread id is harder).  Eventually, when a replacement is available
      (e.g. when NFS switches over and there are various samples to look at
      in other file systems), we can remove the GetXid/FreeXid macro.  In
      the meantime multiple people use this run-time configurable logging
      all the time for debugging, and Suresh's patch fixes a problem which
      made it harder to notice some low memory problems in the log, so it is
      worthwhile to fix this problem until a better logging approach can be
      used.]
      Acked-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
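The before/after conversion in the FreeXid commit above can be illustrated with a small user-space sketch. Everything here is hypothetical: `EX_FREE_XID` only mimics a macro that logs the caller's `rc` at the moment it runs, and `ex_last_logged_rc` stands in for the cifsFYI debug output.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the debug log: remembers the rc value most recently "printed". */
static int ex_last_logged_rc;

/* Hypothetical FreeXid-like macro: logs whatever the caller's rc holds right now. */
#define EX_FREE_XID(xid) do { (void)(xid); ex_last_logged_rc = rc; } while (0)

/* Old pattern: the error is chosen after the xid is freed, so the log shows a stale rc. */
static int ex_open_old(int alloc_failed)
{
    int rc = 0;
    int xid = 1;

    if (alloc_failed) {
        EX_FREE_XID(xid);
        return -ENOMEM;
    }
    EX_FREE_XID(xid);
    return rc;
}

/* Fixed pattern: set rc first, then free the xid, then return rc. */
static int ex_open_fixed(int alloc_failed)
{
    int rc = 0;
    int xid = 1;

    if (alloc_failed)
        rc = -ENOMEM;
    assert(rc <= 0);  /* kernel-style convention: 0 or a negative errno */
    EX_FREE_XID(xid);
    return rc;
}
```

In this sketch both variants return the same value on failure; only the value captured by the log differs, which is exactly the kind of mismatch the commit message describes.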
  10. 28 May, 2009 1 commit
  11. 22 May, 2009 1 commit
  12. 08 May, 2009 1 commit
  13. 17 Apr, 2009 3 commits
  14. 12 Mar, 2009 6 commits
  15. 05 Jan, 2009 1 commit
    • fs: symlink write_begin allocation context fix · 54566b2c
      Nick Piggin committed
      With the write_begin/write_end aops, page_symlink was broken because it
      could no longer pass a GFP_NOFS type mask into the point where the
      allocations happened.  They are done in write_begin, which would always
      assume that the filesystem can be entered from reclaim.  This bug could
      cause filesystem deadlocks.
      
      The funny thing with having a gfp_t mask there is that it doesn't really
      allow the caller to arbitrarily tinker with the context in which it can be
      called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
      take the page lock.  The only thing any callers care about is __GFP_FS
      anyway, so turn that into a single flag.
      
      Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
      this flag in their write_begin function.  Change __grab_cache_page to
      accept a nofs argument as well, to honour that flag (while we're there,
      change the name to grab_cache_page_write_begin which is more instructive
      and does away with random leading underscores).
      
      This is really a more flexible way to go in the end anyway -- if a
      filesystem happens to want any extra allocations aside from the pagecache
      ones in its write_begin function, it may now use GFP_KERNEL (rather than
      GFP_NOFS) for common case allocations (e.g. ocfs2_alloc_write_ctxt, for a
      random example).
      
      [kosaki.motohiro@jp.fujitsu.com: fix ubifs]
      [kosaki.motohiro@jp.fujitsu.com: fix fuse]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: <stable@kernel.org>		[2.6.28.x]
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      [ Cleaned up the calling convention: just pass in the AOP flags
        untouched to the grab_cache_page_write_begin() function.  That
        just simplifies everybody, and may even allow future expansion of the
        logic.   - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
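The idea behind AOP_FLAG_NOFS in the commit above, a single flag that strips the "may enter the filesystem" permission from the page allocation, might be sketched like this. The `EX_` constants and the helper are illustrative assumptions and do not reproduce the kernel's definitions.

```c
#include <assert.h>

/* Illustrative bit values; not the kernel's actual definitions. */
#define EX_GFP_IO         0x1u
#define EX_GFP_FS         0x2u
#define EX_GFP_KERNEL     (EX_GFP_IO | EX_GFP_FS)
#define EX_AOP_FLAG_NOFS  0x4u

/* Derive the mask for the pagecache allocation from the write_begin flags. */
static unsigned int ex_write_begin_gfp(unsigned int aop_flags)
{
    unsigned int gfp = EX_GFP_KERNEL;

    if (aop_flags & EX_AOP_FLAG_NOFS)
        gfp &= ~EX_GFP_FS;  /* reclaim must not re-enter the filesystem */

    /* A NOFS request must never leave the FS bit set. */
    assert(!(aop_flags & EX_AOP_FLAG_NOFS) || !(gfp & EX_GFP_FS));
    return gfp;
}
```

Passing the flags through untouched, as the note from Linus suggests, keeps the decision in one place: callers such as a symlink helper only set the flag, and the page-grabbing function decides what it means for the allocation.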
  16. 26 Dec, 2008 2 commits
    • [CIFS] remove sparse warning · acc18aa1
      Steve French committed
      Signed-off-by: Steve French <sfrench@us.ibm.com>
    • [CIFS] add mount option to send mandatory rather than advisory locks · 13a6e42a
      Steve French committed
      Some applications/subsystems require mandatory byte range locks
      (as used by Windows/DOS/OS2 etc.). Sending advisory (posix style)
      byte range lock requests (instead of mandatory byte range locks) can
      lead to problems for these applications (which expect that other
      clients be prevented from writing to portions of the file which
      they have locked and are updating).  The new mount option "forcemand"
      (or "forcemandatorylock") makes the cifs client use mandatory
      byte range locks (i.e. SMB/CIFS/Windows/NTFS style locks) rather than
      posix byte range lock requests, even if the server would support
      posix byte range lock requests.  This has no effect if the server
      does not support the CIFS Unix Extensions (since posix style locks
      require support for the CIFS Unix Extensions), but for mounts
      to Samba servers this can be helpful for Wine and applications
      that require mandatory byte range locks.
      Acked-by: Jeff Layton <jlayton@redhat.com>
      CC: Alexander Bokovoy <ab@samba.org>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
  17. 27 Nov, 2008 1 commit
    • [CIFS] fix regression in cifs_write_begin/cifs_write_end · a98ee8c1
      Jeff Layton committed
      The conversion to write_begin/write_end interfaces had a bug where we
      were passing a bad parameter to cifs_readpage_worker. Rather than
      passing the page offset of the start of the write, we needed to pass the
      offset of the beginning of the page. This was reliably showing up as
      data corruption in the fsx-linux test from LTP.
      
      It also became evident that this code was occasionally doing unnecessary
      read calls. Optimize those away by using the PG_checked flag to indicate
      that the unwritten part of the page has been initialized.
      
      CC: Nick Piggin <npiggin@suse.de>
      Acked-by: Dave Kleikamp <shaggy@us.ibm.com>
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
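The two offsets that the cifs_write_begin fix above distinguishes can be computed as in the following sketch. The helper names and the 4096-byte page size are assumptions for illustration; the point is only the difference between the offset within the page and the file offset at which the page begins.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SIZE 4096u  /* assumed page size for the example */

/* Offset of a write position within its page. */
static uint64_t ex_offset_in_page(uint64_t pos)
{
    return pos & (EX_PAGE_SIZE - 1);
}

/* File offset of the beginning of the page that contains pos. */
static uint64_t ex_page_start(uint64_t pos)
{
    /* The masking trick relies on the page size being a power of two. */
    assert((EX_PAGE_SIZE & (EX_PAGE_SIZE - 1)) == 0);
    return pos & ~(uint64_t)(EX_PAGE_SIZE - 1);
}
```

According to the commit message, a page read that fills in the unwritten part of a page needs the second value (the start of the page), not the position at which the write itself begins.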
  18. 21 Nov, 2008 1 commit
    • [CIFS] Do not attempt to close invalidated file handles · ddb4cbfc
      Steve French committed
      If a connection with open file handles has gone down
      and come back up and reconnected without reopening
      the file handle yet, do not attempt to send an SMB close
      request for this handle in cifs_close.  We were
      checking for the connection being invalid in cifs_close
      but since the connection may have been reconnected
      we also need to check whether the file handle
      was marked invalid (otherwise we could close the
      wrong file handle by accident).
      Acked-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
  19. 18 Nov, 2008 1 commit
    • prevent cifs_writepages() from skipping unwritten pages · b066a48c
      Dave Kleikamp committed
      Fixes a data corruption under heavy stress in which pages could be left
      dirty after all open instances of an inode have been closed.
      
      In order to write contiguous pages whenever possible, cifs_writepages()
      asks pagevec_lookup_tag() for more pages than it may write at one time.
      Normally, it then resets index just past the last page written before calling
      pagevec_lookup_tag() again.
      
      If cifs_writepages() can't write the first page returned, it wasn't resetting
      index, and the next call to pagevec_lookup_tag() resulted in skipping all of
      the pages it previously returned, even though cifs_writepages() did nothing
      with them.  This can result in data loss when the file descriptor is about
      to be closed.
      
      This patch ensures that index gets set back to the next returned page so
      that none get skipped.
      Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
      Acked-by: Jeff Layton <jlayton@redhat.com>
      Cc: Shirish S Pargaonkar <shirishp@us.ibm.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
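The index handling described in the cifs_writepages() commit above can be modelled with a toy simulation. This is only a sketch under stated assumptions: `ex_lookup_dirty` is a hypothetical stand-in for pagevec_lookup_tag(), the file has 8 pages that all start dirty, lookups return at most 4 pages, and one "busy" page can never be written.

```c
#include <assert.h>
#include <stddef.h>

#define EX_NPAGES 8  /* size of the toy file, in pages (assumed) */
#define EX_BATCH  4  /* pages requested per lookup (assumed) */

/* Hypothetical stand-in for pagevec_lookup_tag(): collects up to EX_BATCH
 * dirty page indices starting at *index, then advances *index just past the
 * last page it returned (or to the end of the file if none were found). */
static size_t ex_lookup_dirty(const int *dirty, size_t *index, size_t *out)
{
    size_t n = 0;

    for (size_t i = *index; i < EX_NPAGES && n < EX_BATCH; i++)
        if (dirty[i])
            out[n++] = i;
    *index = n ? out[n - 1] + 1 : EX_NPAGES;
    return n;
}

/* Runs one writeback pass over a fully dirty toy file and returns how many
 * pages are still dirty afterwards. busy_page can never be written; pass
 * EX_NPAGES for "no busy page". reset_index selects the fixed behaviour. */
static size_t ex_pages_left_dirty(size_t busy_page, int reset_index)
{
    int dirty[EX_NPAGES];
    size_t batch[EX_BATCH];
    size_t index = 0, n, left = 0;

    for (size_t i = 0; i < EX_NPAGES; i++)
        dirty[i] = 1;

    while ((n = ex_lookup_dirty(dirty, &index, batch)) > 0) {
        size_t written = 0;

        /* Write the leading run of pages, stopping at the busy one. */
        while (written < n && batch[written] != busy_page) {
            dirty[batch[written]] = 0;
            written++;
        }
        assert(written <= n);

        if (written > 0)
            index = batch[written - 1] + 1;  /* just past the last page written */
        else if (reset_index)
            index = batch[0] + 1;            /* fix: re-find the pages we skipped */
        /* else: old behaviour, index stays past the whole batch */
    }

    for (size_t i = 0; i < EX_NPAGES; i++)
        left += dirty[i] != 0;
    return left;
}
```

In this model, when the first page of a batch cannot be written and the index is not reset, the rest of that batch is never looked at again in the same pass; with the reset only the busy page itself stays dirty.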
  20. 14 Nov, 2008 1 commit
    • [CIFS] Fix cifs reconnection flags · 3b795210
      Steve French committed
      In preparation for Jeff's big umount/mount fixes to remove the possibility of
      various races in cifs mount and linked list handling of sessions, sockets and
      tree connections, this patch cleans up some repetitive code in cifs_mount,
      and addresses a problem with ses->status and tcon->tidStatus in which we
      were overloading the "need_reconnect" state with other status in that
      field.  So the "need_reconnect" flag has been broken out from those
      two state fields (need reconnect was not mutually exclusive from some of the
      other possible tid and ses states).  In addition, a few exit cases in
      cifs_mount were cleaned up, and a problem in which a tcon flag (for lease
      support) was not being set consistently for the 2nd mount of the same
      share was fixed.
      
      CC: Jeff Layton <jlayton@redhat.com>
      CC: Shirish Pargaonkar <shirishp@us.ibm.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
  21. 31 Oct, 2008 1 commit
  22. 20 Oct, 2008 1 commit
    • vmscan: split LRU lists into anon & file sets · 4f98a2fe
      Rik van Riel committed
      Split the LRU lists in two, one set for pages that are backed by real file
      systems ("file") and one for pages that are backed by memory and swap
      ("anon").  The latter includes tmpfs.
      
      The advantage of doing this is that the VM will not have to scan over lots
      of anonymous pages (which we generally do not want to swap out), just to
      find the page cache pages that it should evict.
      
      This patch has the infrastructure and a basic policy to balance how much
      we scan the anon lists and how much we scan the file lists.  The big
      policy changes are in separate patches.
      
      [lee.schermerhorn@hp.com: collect lru meminfo statistics from correct offset]
      [kosaki.motohiro@jp.fujitsu.com: prevent incorrect oom under split_lru]
      [kosaki.motohiro@jp.fujitsu.com: fix pagevec_move_tail() doesn't treat unevictable page]
      [hugh@veritas.com: memcg swapbacked pages active]
      [hugh@veritas.com: splitlru: BDI_CAP_SWAP_BACKED]
      [akpm@linux-foundation.org: fix /proc/vmstat units]
      [nishimura@mxp.nes.nec.co.jp: memcg: fix handling of shmem migration]
      [kosaki.motohiro@jp.fujitsu.com: adjust Quicklists field of /proc/meminfo]
      [kosaki.motohiro@jp.fujitsu.com: fix style issue of get_scan_ratio()]
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  23. 25 Sep, 2008 1 commit
  24. 23 Sep, 2008 1 commit
    • cifs: have find_writeable_file prefer filehandles opened by same task · 2846d386
      Jeff Layton committed
      When the CIFS client goes to write out pages, it needs to pick a
      filehandle to write to. find_writeable_file however just picks the
      first filehandle that it finds. This can cause problems when a lock
      is issued against a particular filehandle and we pick a different
      filehandle to write to.
      
      This patch tries to avert this situation by having find_writable_file
      prefer filehandles that have a pid that matches the current task.
      This seems to fix lock test 11 from the connectathon test suite when
      run against a windows server.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
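The preference described in the commit above, handles opened by the current task first and any writable handle otherwise, could be expressed as a two-pass search. The struct, the function and the sample table below are hypothetical and much simpler than the real open-file list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of an open file handle. */
struct ex_open_file {
    int pid;        /* task that opened the handle */
    bool writable;  /* opened with write access */
};

/* Hypothetical sample data used to try the search out. */
static const struct ex_open_file ex_sample[] = {
    { 100, false },
    { 200, true },
    { 300, true },
};

/* Pass 0 accepts only writable handles owned by current_pid;
 * pass 1 falls back to the first writable handle of any task. */
static const struct ex_open_file *
ex_find_writable(const struct ex_open_file *files, size_t n, int current_pid)
{
    assert(files != NULL || n == 0);

    for (int any_pid = 0; any_pid <= 1; any_pid++) {
        for (size_t i = 0; i < n; i++) {
            if (!files[i].writable)
                continue;
            if (any_pid || files[i].pid == current_pid)
                return &files[i];
        }
    }
    return NULL;  /* no writable handle at all */
}
```

With the sample table, a task with pid 300 gets its own handle even though the handle of pid 200 comes first in the list, while pid 100 (which holds only a read-only handle) falls back to the first writable one.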
  25. 28 Aug, 2008 1 commit
    • cifs: fix O_APPEND on directio mounts · 838726c4
      Jeff Layton committed
      The direct I/O write codepath for CIFS is done through
      cifs_user_write(). That function does not currently call
      generic_write_checks() so the file position isn't being properly set
      when the file is opened with O_APPEND.  It's also not doing the other
      "normal" checks that should be done for a write call.
      
      The problem is currently that when you open a file with O_APPEND on a
      mount with the directio mount option, the file position is set to the
      beginning of the file. This makes any subsequent writes clobber the data
      in the file starting at the beginning.
      
      This seems to fix the problem in cursory testing. It is, however,
      important to note that NFS disallows the combination of
      (O_DIRECT|O_APPEND). If my understanding is correct, the concern is
      races with multiple clients appending to a file clobbering each others'
      data. Since the write model for CIFS and NFS is pretty similar in this
      regard, CIFS is probably subject to the same sort of races. What's
      unclear to me is why this is a particular problem with O_DIRECT and not
      with buffered writes...
      
      Regardless, disallowing O_APPEND on an entire mount is probably not
      reasonable, so we'll probably just have to deal with it and reevaluate
      this flag combination when we get proper support for O_DIRECT. In the
      meantime this patch at least fixes the existing problem.
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Cc: Stable Tree <stable@kernel.org>
      Signed-off-by: Steve French <sfrench@us.ibm.com>
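The O_APPEND behaviour that the commit above restores for direct I/O writes boils down to choosing the write position from the file size rather than from the current file position. A minimal sketch, assuming a hypothetical helper and an illustrative flag value rather than the kernel's generic_write_checks() itself:

```c
#include <assert.h>
#include <stdint.h>

#define EX_O_APPEND 0x400u  /* illustrative flag value */

/* Pick the effective position for a write: with O_APPEND every write
 * starts at the current end of file, whatever position was requested. */
static int64_t ex_write_pos(unsigned int file_flags, int64_t requested_pos,
                            int64_t file_size)
{
    /* File offsets are signed in this sketch; negative ones are invalid. */
    assert(requested_pos >= 0 && file_size >= 0);

    if (file_flags & EX_O_APPEND)
        return file_size;
    return requested_pos;
}
```

Without such a check, a file opened with O_APPEND starts writing at position 0, which is the clobbering at the beginning of the file that the commit message reports.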
  26. 06 Aug, 2008 1 commit
  27. 05 Aug, 2008 1 commit
  28. 24 May, 2008 1 commit
  29. 15 May, 2008 1 commit
  30. 29 Apr, 2008 1 commit