1. 10 Oct, 2007 9 commits
    • F
      Re: [NFS] [PATCH] Attribute timeout handling and wrapping u32 jiffies · c7e15961
      Committed by Fabio Olive Leite
      I would like to discuss the idea that the current checks for attribute
      timeout using time_after are inadequate for 32bit architectures, since
      time_after works correctly only when the two timestamps being compared
      are within 2^31 jiffies of each other. The signed overflow caused by
      comparing values more than 2^31 jiffies apart will flip the result,
      causing incorrect assumptions of validity.
      
      2^31 jiffies is a fairly large period of time (~25 days) when compared
      to the lifetime of most kernel data structures, but for long lived NFS
      mounts that can sit idle for months (think that for some reason autofs
      cannot be used), it is easy to compare inode attribute timestamps with
      very disparate or even bogus values (as in when jiffies have wrapped
      many times, where the comparison doesn't even make sense).
      
      Currently the code tests for attribute timeout by simply adding the
      desired amount of jiffies to the stored timestamp and comparing that
      with the current timestamp of obtained attribute data with time_after.
      This is incorrect, as it returns true for the desired timeout period
      and another full 2^31 range of jiffies.
      
      In testing with artificial jumps (several small jumps, not one big
      crank) of the jiffies I was able to reproduce a problem found in a
      server with very long lived NFS mounts, where attributes would not be
      refreshed even after touching files and directories in the server:
      
      Initial uptime:
      03:42:01 up 6 min, 0 users, load average: 0.01, 0.12, 0.07
      
      NFS volume is mounted and time is advanced:
      03:38:09 up 25 days, 2 min, 0 users, load average: 1.22, 1.05, 1.08
      
      # ls -l /local/A/foo/bar /nfs/A/foo/bar
      -rw-r--r--  1 root root 0 Dec 17 03:38 /local/A/foo/bar
      -rw-r--r--  1 root root 0 Nov 22 00:36 /nfs/A/foo/bar
      
      # touch /local/A/foo/bar
      
      # ls -l /local/A/foo/bar /nfs/A/foo/bar
      -rw-r--r--  1 root root 0 Dec 17 03:47 /local/A/foo/bar
      -rw-r--r--  1 root root 0 Nov 22 00:36 /nfs/A/foo/bar
      
      We can see the local mtime is updated, but the NFS mount still shows
      the old value. The patch below makes it work:
      
      Initial setup...
      07:11:02 up 25 days, 1 min,  0 users,  load average: 0.15, 0.03, 0.04
      
      # ls -l /local/A/foo/bar /nfs/A/foo/bar
      -rw-r--r--  1 root root 0 Jan 11 07:11 /local/A/foo/bar
      -rw-r--r--  1 root root 0 Jan 11 07:11 /nfs/A/foo/bar
      
      # touch /local/A/foo/bar
      
      # ls -l /local/A/foo/bar /nfs/A/foo/bar
      -rw-r--r--  1 root root 0 Jan 11 07:14 /local/A/foo/bar
      -rw-r--r--  1 root root 0 Jan 11 07:14 /nfs/A/foo/bar
      Signed-off-by: Fabio Olive Leite <fleite@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      c7e15961
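The wrap problem described in this commit message can be reproduced in userspace with 32-bit arithmetic. The sketch below is only an illustration: `after32`, `after_eq32`, `valid_open_ended` and `valid_bounded` are hypothetical stand-ins, not the kernel macros or the code from the patch, and the bounded check is one plausible way to restrict validity to the timeout window rather than a confirmed copy of the fix.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for wrapping time comparisons, fixed at 32 bits
 * to mimic jiffies on a 32-bit architecture. */
static int after32(uint32_t a, uint32_t b)     /* is a after b? */
{
    return (int32_t)(b - a) < 0;
}

static int after_eq32(uint32_t a, uint32_t b)  /* is a at or after b? */
{
    return (int32_t)(a - b) >= 0;
}

/* Open-ended test: valid as long as now is not after stamp + timeo.
 * It also answers "valid" for a further 2^31 jiffies after a wrap. */
static int valid_open_ended(uint32_t now, uint32_t stamp, uint32_t timeo)
{
    return !after32(now, stamp + timeo);
}

/* Bounded test: valid only while stamp <= now <= stamp + timeo. */
static int valid_bounded(uint32_t now, uint32_t stamp, uint32_t timeo)
{
    assert(timeo < 0x80000000u);  /* the window itself must fit in 2^31 */
    return after_eq32(now, stamp) && after_eq32(stamp + timeo, now);
}
```

With a stamp of 1000 and a timeout of 3000, both checks agree for nearby values of `now`. Once `now` is more than 2^31 jiffies past the stamp, the open-ended check reports the attributes as valid again, while the bounded check does not.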
    • P
      64 bit ino support for NFS client · 4e769b93
      Committed by Peter Staubach
      Hi.
      
      Attached is a patch to modify the NFS client code to support
      64 bit ino's, as appropriate for the system and the NFS
      protocol version.
      
      The code basically just expands the NFS interfaces for routines
      which handle ino's from using ino_t to u64 and then uses the
      fileid in the nfs_inode instead of i_ino in the inode.  The
      code paths that were updated are in the getattr method and
      the readdir methods.
      
      This should be no real change on 64 bit platforms.  Since
      the ino_t is an unsigned long, it would already be 64 bits
      wide.
      
          Thanx...
      
                 ps
      Signed-off-by: Peter Staubach <staubach@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      4e769b93
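To see why carrying the full 64-bit fileid matters on a 32-bit system, consider what happens when it is squeezed into a 32-bit inode number. The sketch below is an assumption-laden illustration: `ino32_t`, `truncate_fileid` and `fold_fileid` are hypothetical names, and the XOR fold is just one common way to derive a narrower number, not necessarily what the patch does.

```c
#include <stdint.h>

/* Hypothetical stand-in for ino_t on a 32-bit platform. */
typedef uint32_t ino32_t;

/* Plain truncation: fileids that differ only in the high 32 bits collide. */
static ino32_t truncate_fileid(uint64_t fileid)
{
    return (ino32_t)fileid;
}

/* Folding the high half into the low half makes collisions less likely,
 * but only the full 64-bit fileid identifies the file unambiguously. */
static ino32_t fold_fileid(uint64_t fileid)
{
    return (ino32_t)fileid ^ (ino32_t)(fileid >> 32);
}
```

Two fileids such as 0x100000001 and 0x200000001 truncate to the same value, which is why comparisons and lookups are safer when they use the stored 64-bit fileid.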
    • T
      NFS: Fall back to synchronous writes when a background write errors... · 7b159fc1
      Committed by Trond Myklebust
      This helps prevent huge queues of background writes from building up
      whenever the server runs out of disk or quota space, or if someone changes
      the file access modes behind our backs.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      7b159fc1
    • T
      NFS: Writeback optimisation · 34901f70
      Committed by Trond Myklebust
      Schedule writes using WB_SYNC_NONE first, then come back for a second pass
      using WB_SYNC_ALL.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      34901f70
    • T
      NFS: Clean up NFS writeback flush code · ed90ef51
      Committed by Trond Myklebust
      The only user of nfs_sync_mapping_range() is nfs_getattr(), which uses it
      to flush out the entire inode without sending a commit. We therefore
      replace nfs_sync_mapping_range with a more appropriate helper.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      ed90ef51
    • T
      NFS: Clean up nfs_writepages() · f758c885
      Committed by Trond Myklebust
      Just call write_cache_pages directly instead of hacking the writeback
      control structure in order to find out if we were called from writepages()
      or directly from the VM.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      f758c885
    • T
      NFS: Clean up write code... · 9cccef95
      Committed by Trond Myklebust
      The addition of nfs_page_mkwrite means that we should no longer need to
      create requests inside nfs_writepage().
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      9cccef95
    • T
      NFS: Add the helper nfs_vm_page_mkwrite · 94387fb1
      Committed by Trond Myklebust
      This is needed in order to set up a proper nfs_page request for mmapped
      files.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      94387fb1
    • T
      NLM: Fix a memory leak in nlmsvc_testlock · a6d85430
      Committed by Trond Myklebust
      The recent fix for a circular lock dependency unfortunately introduced a
      potential memory leak in the event where the call to nlmsvc_lookup_host
      fails for some reason.
      
      Thanks to Roel Kluin for spotting this.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a6d85430
  2. 09 Oct, 2007 1 commit
  3. 04 Oct, 2007 1 commit
  4. 02 Oct, 2007 1 commit
  5. 01 Oct, 2007 1 commit
  6. 29 Sep, 2007 1 commit
  7. 27 Sep, 2007 1 commit
  8. 25 Sep, 2007 1 commit
  9. 03 Oct, 2007 2 commits
  10. 21 Sep, 2007 6 commits
    • J
      [PATCH] WE : Add missing auth compat-ioctl · d59952d5
      Committed by Jean Tourrilhes
      Johannes just found that we are missing a compat-ioctl
      declaration. The fix is trivial. As previous patches for compat-ioctl,
      this should also go to stable.
      
      More info :
      	http://marc.info/?l=linux-wireless&m=119029667902588&w=2
      Signed-off-by: Jean Tourrilhes <jt@hpl.hp.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      d59952d5
    • S
      ocfs2: Pack vote message and response structures · 813d974c
      Committed by Sunil Mushran
      The ocfs2_vote_msg and ocfs2_response_msg structs needed to be
      packed to ensure similar sizeofs in 32-bit and 64-bit arches. Without this,
      we had inadvertently broken 32/64 bit cross mounts.
      Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      813d974c
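The size mismatch behind this fix comes from alignment padding: a structure holding a 64-bit field followed by a 32-bit field is typically padded to 16 bytes on x86-64 but stays at 12 bytes on 32-bit x86. The sketch below uses hypothetical structures (`msg_natural`, `msg_packed`), not the real ocfs2 message layouts, and relies on the GCC/Clang `packed` attribute.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural layout: trailing padding depends on the ABI's alignment of
 * 64-bit integers, so sizeof can differ between 32-bit and 64-bit builds. */
struct msg_natural {
    uint64_t cookie;
    uint32_t node;
};

/* Packed layout: no padding is inserted, so both sides of the wire agree
 * on a 12-byte message regardless of word size. */
struct msg_packed {
    uint64_t cookie;
    uint32_t node;
} __attribute__((packed));
```

Comparing `sizeof(struct msg_natural)` across a 32-bit and a 64-bit build shows the discrepancy, while `sizeof(struct msg_packed)` is the same on both.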
    • M
      ocfs2: Don't double set write parameters · 5c26a7b7
      Committed by Mark Fasheh
      The target page offsets were being incorrectly set a second time in
      ocfs2_prepare_page_for_write(), which was causing problems on a 16k page
      size kernel. Additionally, ocfs2_write_failure() was incorrectly using those
      parameters instead of the parameters for the individual page being cleaned
      up.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      5c26a7b7
    • M
      ocfs2: Fix pos/len passed to ocfs2_write_cluster · db56246c
      Committed by Mark Fasheh
      This was broken for file systems whose cluster size is greater than page
      size. Pos needs to be incremented as we loop through the descriptors, and
      len needs to be capped to the size of a single cluster.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      db56246c
    • M
      ocfs2: Allow smaller allocations during large writes · 415cb800
      Committed by Mark Fasheh
      The ocfs2 write code loops through a page much like the block code, except
      that ocfs2 allocation units can be any size, including larger than page
      size. Typically it's equal to or larger than page size - most kernels run 4k
      pages, the minimum ocfs2 allocation (cluster) size.
      
      Some changes introduced during 2.6.23 changed the way writes to pages are
      handled, and inadvertently broke support for > 4k page size. Instead of just
      writing one cluster at a time, we now handle the whole page in one pass.
      
      This means that multiple (small) separate allocations might happen in the
      same pass. The allocation code however typically optimizes by getting the
      maximum which was reserved. This triggered a BUG_ON in the extend code where
      it'd ask for a single bit (for one part of a > 4k page) and get back more
      than it asked for.
      
      Fix this by providing a variant of the high level allocation function which
      allows the caller to specify a maximum. The traditional function remains and
      just calls the new one with a maximum determined from the initial
      reservation.
      Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
      415cb800
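The shape of this fix, a capped variant plus a traditional wrapper that delegates to it, can be sketched as follows. Everything here is hypothetical: `struct reservation`, `claim_with_max` and `claim` are illustrative names and do not correspond to the actual ocfs2 allocation functions.

```c
#include <assert.h>

/* Hypothetical reservation: how many bits were reserved and how many
 * have been handed out so far. */
struct reservation {
    unsigned int reserved;
    unsigned int used;
};

/* New variant: never hand back more than the caller asked for. */
static unsigned int claim_with_max(struct reservation *res, unsigned int max_bits)
{
    unsigned int left, got;

    assert(res->used <= res->reserved);
    left = res->reserved - res->used;
    got = left < max_bits ? left : max_bits;
    res->used += got;
    return got;
}

/* Traditional entry point: the maximum is whatever remains reserved. */
static unsigned int claim(struct reservation *res)
{
    return claim_with_max(res, res->reserved - res->used);
}
```

A caller that needs a single bit for one part of a large page passes a maximum of 1 and gets exactly 1 back, while existing callers keep the old "take everything reserved" behaviour through `claim`.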
    • D
      signalfd simplification · b8fceee1
      Committed by Davide Libenzi
      This simplifies the signalfd code by no longer keeping it attached to
      the sighand during its lifetime.
      
      In this way, the signalfd remains attached to the sighand only during
      poll(2) (and select and epoll) and read(2).  This also allows removing
      all the custom "tsk == current" checks in kernel/signal.c, since
      dequeue_signal() will only be called by "current".
      
      I think this is also what Ben was suggesting some time ago.
      
      The external effect of this is that a thread can extract only its own
      private signals and the group ones.  I think this is an acceptable
      behaviour, in that those are the signals the thread would be able to
      fetch w/out signalfd.
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b8fceee1
  11. 20 Sep, 2007 5 commits
    • C
      [XFS] fix valid but harmless sparse warning · 1bc5858d
      Committed by Christoph Hellwig
      The new xlog_recover_do_reg_buffer checks call be16_to_cpu on di_gen which
      is a 32bit value so sparse rightly complains. Fortunately the warning is
      harmless because we don't care for the value, but only whether it's
      non-NULL. Due to that fact we can simply kill the endian swaps on this and
      the previous di_mode check entirely.
      
      SGI-PV: 969656
      SGI-Modid: xfs-linux-melb:xfs-kern:29709a
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
      1bc5858d
    • E
      [XFS] fix filestreams on 32-bit boxes · bcc7b445
      Committed by Eric Sandeen
      xfs_filestream_mount() sets up an mru cache with:
        err = xfs_mru_cache_create(&mp->m_filestream, lifetime, grp_count,
        (xfs_mru_cache_free_func_t)xfs_fstrm_free_func);
      but that cast is causing problems...
        typedef void (*xfs_mru_cache_free_func_t)(unsigned long, void*);
      but:
        void xfs_fstrm_free_func( xfs_ino_t ino, fstrm_item_t *item)
      so on a 32-bit box, it's casting (32, 32) args into (64, 32) and I assume
      it's getting garbage for *item, which subsequently causes an explosion.
      With this change the filestreams xfsqa tests don't oops on my 32-bit box.
      
      SGI-PV: 967795
      SGI-Modid: xfs-linux-melb:xfs-kern:29510a
      Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
      Signed-off-by: David Chinner <dgc@sgi.com>
      Signed-off-by: Tim Shimmin <tes@sgi.com>
      bcc7b445
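The general remedy for this class of bug is to give the callback exactly the signature the typedef declares, so that no cast is needed and the compiler can check the call. The sketch below is a userspace illustration with hypothetical names (`free_func_t`, `struct item`, `item_free`, `cache_evict`); it is not the XFS code and does not claim to match the actual change.

```c
/* Callback type with the same shape as the one quoted above:
 * a word-sized key plus an opaque pointer. */
typedef void (*free_func_t)(unsigned long key, void *item);

struct item {
    int freed;
};

/* The callback matches the typedef exactly, so it can be passed without
 * a cast and its arguments arrive where the callee expects them. */
static void item_free(unsigned long key, void *data)
{
    struct item *it = data;

    (void)key;          /* unused in this sketch */
    it->freed = 1;
}

/* Stand-in for the cache invoking the registered callback. */
static void cache_evict(free_func_t fn, unsigned long key, void *item)
{
    fn(key, item);
}
```

Had `item_free` taken a 64-bit key on a 32-bit machine and been cast to `free_func_t`, the caller and callee would disagree about where the pointer argument lives, which is the failure mode the commit message describes.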
    • E
      ext34: ensure do_split leaves enough free space in both blocks · ef2b02d3
      Committed by Eric Sandeen
      The do_split() function for htree dir blocks is intended to split a leaf
      block to make room for a new entry.  It sorts the entries in the original
      block by hash value, then moves the last half of the entries to the new
      block - without accounting for how much space this actually moves.  (IOW,
      it moves half of the entry *count* not half of the entry *space*).  If by
      chance we have both large & small entries, and we move only the smallest
      entries, and we have a large new entry to insert, we may not have created
      enough space for it.
      
      The patch below stores each record size when calculating the dx_map, and
      then walks the hash-sorted dx_map, calculating how many entries must be
      moved to more evenly split the existing entries between the old block and
      the new block, guaranteeing enough space for the new entry.
      
      The dx_map "offs" member is reduced to u16 so that the overall map size
      does not change - it is temporarily stored at the end of the new block, and
      if it grows too large it may be overwritten.  By making offs and size both
      u16, we won't grow the map size.
      
      Also add a few comments to the functions involved.
      
      This fixes the testcase reported by hooanon05@yahoo.co.jp on the
      linux-ext4 list, "ext3 dir_index causes an error"
      
      Thanks to Andreas Dilger for discussing the problem & solution with me.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Andreas Dilger <adilger@clusterfs.com>
      Tested-by: Junjiro Okajima <hooanon05@yahoo.co.jp>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ef2b02d3
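The difference between splitting by entry count and splitting by entry space can be sketched with a small helper that walks the hash-sorted sizes from the end. This is a simplified illustration under stated assumptions: `entries_to_move` is a hypothetical name, the sizes are plain byte counts, and the half-entry rounding rule is one reasonable choice rather than a confirmed copy of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Given entry sizes in hash order, return how many entries to move from
 * the tail so that roughly half of the block's space is moved, instead
 * of simply moving count / 2 entries. */
static int entries_to_move(const uint16_t *size, int count, unsigned int blocksize)
{
    unsigned int moved = 0;
    int move = 0;
    int i;

    assert(count >= 0);
    for (i = count - 1; i >= 0; i--) {
        /* Stop once more than half of this entry would land past the
         * midpoint of the block. */
        if (moved + size[i] / 2 > blocksize / 2)
            break;
        moved += size[i];
        move++;
    }
    return move;
}
```

For sizes {1500, 1500, 200, 200, 200, 200, 200} in a 4096-byte block, moving half the count (3 entries) frees only 600 bytes in the old block, whereas the space-based walk moves 6 entries and 2500 bytes.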
    • A
      nfs: fix oops re sysctls and V4 support · 49af7ee1
      Committed by Alexey Dobriyan
      NFS unregisters sysctls only if V4 support is compiled in.  However, the
      sysctl table is not V4 specific, so unregister it always.
      
      Steps to reproduce:
      
      	[build nfs.ko with CONFIG_NFS_V4=n]
      	modprobe nfs
      	rmmod nfs
      	ls /proc/sys
      
      Unable to handle kernel paging request at ffffffff880661c0 RIP:
       [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
      PGD 203067 PUD 207063 PMD 7e216067 PTE 0
      Oops: 0000 [1] SMP
      CPU 1
      Modules linked in: lockd nfs_acl sunrpc
      Pid: 3335, comm: ls Not tainted 2.6.23-rc3-bloat #2
      RIP: 0010:[<ffffffff802af8e3>]  [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
      RSP: 0018:ffff81007fd93e78  EFLAGS: 00010286
      RAX: ffffffff880661c0 RBX: ffffffff80466370 RCX: ffffffff880661c0
      RDX: 00000000000014c0 RSI: ffff81007f3ad020 RDI: ffff81007efd8b40
      RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000001 R11: ffffffff802a8570 R12: ffffffff880661c0
      R13: ffff81007e219640 R14: ffff81007efd8b40 R15: ffff81007ded7280
      FS:  00002ba25ef03060(0000) GS:ffff81007ff81258(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: ffffffff880661c0 CR3: 000000007dfaf000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process ls (pid: 3335, threadinfo ffff81007fd92000, task ffff81007d8a0000)
      Stack:  ffff81007f3ad150 ffffffff80283f30 ffff81007fd93f48 ffff81007efd8b40
       ffff81007ee00440 0000000422222222 0000000200035593 ffffffff88037e9a
       2222222222222222 ffffffff80466500 ffff81007e416400 ffff81007e219640
      Call Trace:
       [<ffffffff80283f30>] filldir+0x0/0xf0
       [<ffffffff80283f30>] filldir+0x0/0xf0
       [<ffffffff802840c7>] vfs_readdir+0xa7/0xc0
       [<ffffffff80284376>] sys_getdents+0x96/0xe0
       [<ffffffff8020bb3e>] system_call+0x7e/0x83
      
      Code: 41 8b 14 24 85 d2 74 dc 49 8b 44 24 08 48 85 c0 74 e7 49 3b
      RIP  [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
       RSP <ffff81007fd93e78>
      CR2: ffffffff880661c0
      Kernel panic - not syncing: Fatal exception
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      49af7ee1
    • E
      dir_index: error out instead of BUG on corrupt dx dirs · 3d82abae
      Committed by Eric Sandeen
      Convert asserts (BUGs) in dx_probe from bad on-disk data to recoverable
      errors with helpful warnings.  With help catching other asserts from Duane
      Griffin <duaneg@dghda.com>
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Acked-by: Duane Griffin <duaneg@dghda.com>
      Acked-by: Theodore Ts'o <tytso@mit.edu>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3d82abae
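The pattern of turning an assertion on untrusted on-disk data into a recoverable error can be sketched as below. The structure, field names and limits (`struct dx_info_sketch`, `MAX_HASH_VERSION`, `MAX_INDIRECT_LEVELS`, `check_dx_info`) are assumptions made for illustration and are not the actual dx_probe code.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical subset of an on-disk index header. */
struct dx_info_sketch {
    uint8_t hash_version;
    uint8_t indirect_levels;
};

/* Assumed limits, for illustration only. */
#define MAX_HASH_VERSION    2
#define MAX_INDIRECT_LEVELS 1

/* Return 0 if the header looks sane, -1 after printing a warning
 * otherwise. A corrupt disk then yields an error to the caller instead
 * of bringing the system down with an assertion. */
static int check_dx_info(const struct dx_info_sketch *info)
{
    if (info->hash_version > MAX_HASH_VERSION) {
        fprintf(stderr, "warning: unrecognised hash version %u\n",
                (unsigned int)info->hash_version);
        return -1;
    }
    if (info->indirect_levels > MAX_INDIRECT_LEVELS) {
        fprintf(stderr, "warning: unsupported index depth %u\n",
                (unsigned int)info->indirect_levels);
        return -1;
    }
    return 0;
}
```

The caller can then fail the directory lookup and leave the rest of the filesystem usable.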
  12. 18 Sep, 2007 2 commits
  13. 17 Sep, 2007 1 commit
  14. 15 Sep, 2007 1 commit
  15. 12 Sep, 2007 7 commits