1. 12 Mar 2011 (7 commits)
  2. 11 Mar 2011 (4 commits)
    • NFSv4.1 reclaim complete must wait for completion · c34c32ea
      Committed by Andy Adamson
      Signed-off-by: Andy Adamson <andros@netapp.com>
      [Trond: fix whitespace errors]
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4.1: Retry CREATE_SESSION on NFS4ERR_DELAY · 7d6d63d6
      Committed by Ricardo Labiaga
      Fix a bug where we currently retry the EXCHANGEID call even though
      we already have a valid clientid.  Instead, delay and retry the
      CREATE_SESSION call.
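      A minimal sketch of the intended control flow; setup_session(),
      do_exchange_id(), do_create_session(), delay_backoff() and struct client
      are hypothetical stand-ins, not the real nfs4proc.c code paths:

        /* Establish a session: EXCHANGE_ID once, then CREATE_SESSION,
         * retrying only CREATE_SESSION when the server asks us to wait.
         * All helpers here are hypothetical. */
        static int setup_session(struct client *clp)
        {
                int status;

                status = do_exchange_id(clp);   /* yields a valid clientid */
                if (status != 0)
                        return status;

                for (;;) {
                        status = do_create_session(clp);
                        if (status != -NFS4ERR_DELAY)
                                break;
                        /* Server is busy: keep the clientid, wait, retry. */
                        delay_backoff(clp);
                }
                return status;
        }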
      Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • nfs: fix compilation warning · 43b7c3f0
      Committed by Jovi Zhang
      This commit fixes the following compilation warning:
      linux-2.6/fs/nfs/nfs4proc.c:3265: warning: comparison of distinct pointer types lacks a cast
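      For context, this warning typically comes from the kernel's type-checked
      min()/max() macros when their two arguments have different types; a
      minimal illustration (not the actual nfs4proc.c hunk) looks like this:

        #include <linux/kernel.h>

        static size_t pick_len(size_t buflen, unsigned int resplen)
        {
                /*
                 * min() compares pointers to temporaries of each argument's
                 * type, so mixing size_t and unsigned int here triggers
                 * "comparison of distinct pointer types lacks a cast":
                 *
                 *      return min(buflen, resplen);
                 *
                 * min_t() casts both sides to the named type and is the
                 * usual way to silence this class of warning.
                 */
                return min_t(size_t, buflen, resplen);
        }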
      Signed-off-by: Jovi Zhang <bookjovi@gmail.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • SUNRPC: Close a race in __rpc_wait_for_completion_task() · bf294b41
      Committed by Trond Myklebust
      Although they run as rpciod background tasks, under normal operation
      (i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
      and nfs4_do_close() want to be fully synchronous. This means that when we
      exit, we want all references to the rpc_task to be gone, and we want
      any dentry references etc. held by that task to be released.
      
      For this reason these functions call __rpc_wait_for_completion_task(),
      followed by rpc_put_task() in the expectation that the latter will be
      releasing the last reference to the rpc_task, and thus ensuring that the
      callback_ops->rpc_release() has been called synchronously.
      
      This patch fixes a race which exists due to the fact that
      rpciod calls rpc_complete_task() (in order to wake up the callers of
      __rpc_wait_for_completion_task()) and then subsequently calls
      rpc_put_task() without ensuring that these two steps are done atomically.
      
      In order to avoid adding new spin locks, the patch uses the existing
      waitqueue spin lock to order the rpc_task reference count releases between
      the waiting process and rpciod.
      The common case where nobody is waiting for completion is optimised for
      by checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
      reference count is 1: in those cases we skip taking the spin lock and
      immediately free up the rpc_task.
      
      Those few processes that need to put the rpc_task from inside an
      asynchronous context and that do not care about ordering are given a new
      helper: rpc_put_task_async().
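      A simplified sketch of the locking idea; struct task_stub, task_free()
      and task_complete_and_put() are hypothetical, and a plain wait queue
      stands in for the bit waitqueue used by the real code:

        #include <linux/atomic.h>
        #include <linux/spinlock.h>
        #include <linux/wait.h>

        struct task_stub {
                atomic_t                refcount;   /* one ref per side here */
                bool                    active;
                wait_queue_head_t       *waitq;     /* outlives the task */
        };

        /* Stand-in for the real release path. */
        static void task_free(struct task_stub *t);

        /* Async (rpciod) side: mark the task complete, wake the waiter and
         * drop our reference, all under the waitqueue lock. */
        static void task_complete_and_put(struct task_stub *t)
        {
                bool last;

                spin_lock(&t->waitq->lock);
                t->active = false;
                wake_up_locked(t->waitq);
                last = atomic_dec_and_test(&t->refcount);
                spin_unlock(&t->waitq->lock);
                if (last)                       /* nobody was waiting */
                        task_free(t);
        }

        /* Waiting side: once woken, take the same lock before the final put,
         * so the async side's put above is ordered before ours and the
         * release runs here, synchronously. */
        static void task_wait_and_put(struct task_stub *t)
        {
                wait_event(*t->waitq, !t->active);
                spin_lock(&t->waitq->lock);
                spin_unlock(&t->waitq->lock);
                if (atomic_dec_and_test(&t->refcount))
                        task_free(t);
        }

      The fast path described above (a synchronous task, or a reference that
      cannot be the last one) simply skips the lock and drops the reference
      directly.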
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  3. 05 Mar 2011 (1 commit)
    • nfs4: Ensure that ACL pages sent over NFS were not allocated from the slab (v3) · e9e3d724
      Committed by Neil Horman
      The "bad_page()" page allocator sanity check was triggered recently (call
      chain as follows):
      
        bad_page+0x69/0x91
        free_hot_cold_page+0x81/0x144
        skb_release_data+0x5f/0x98
        __kfree_skb+0x11/0x1a
        tcp_ack+0x6a3/0x1868
        tcp_rcv_established+0x7a6/0x8b9
        tcp_v4_do_rcv+0x2a/0x2fa
        tcp_v4_rcv+0x9a2/0x9f6
        do_timer+0x2df/0x52c
        ip_local_deliver+0x19d/0x263
        ip_rcv+0x539/0x57c
        netif_receive_skb+0x470/0x49f
        :virtio_net:virtnet_poll+0x46b/0x5c5
        net_rx_action+0xac/0x1b3
        __do_softirq+0x89/0x133
        call_softirq+0x1c/0x28
        do_softirq+0x2c/0x7d
        do_IRQ+0xec/0xf5
        default_idle+0x0/0x50
        ret_from_intr+0x0/0xa
        default_idle+0x29/0x50
        cpu_idle+0x95/0xb8
        start_kernel+0x220/0x225
        _sinittext+0x22f/0x236
      
      It occurs because an skb with a fraglist was freed from the tcp
      retransmit queue when it was acked, but a page on that fraglist had
      PG_Slab set (indicating it was allocated from the slab allocator),
      which means the free path above can't safely free it via put_page.
      
      We tracked this back to an nfsv4 setacl operation, in which the nfs code
      attempted to convert the passed-in buffer to an array of pages in
      __nfs4_proc_set_acl, which gets used by the skb->frags list in
      xs_sendpages.  __nfs4_proc_set_acl just converts each page in the buffer
      to a page struct via virt_to_page, but the vfs allocates the buffer via
      kmalloc, meaning the PG_slab bit is set.  We can't create a buffer with
      kmalloc and free it later in the tcp ack path with put_page, so we need
      to either:
      
      1) ensure that when we create the list of pages, no page struct has
         PG_Slab set
      
       or
      
      2) not use a page list to send this data
      
      Given that these buffers can be multiple pages and arbitrarily sized, I
      think (1) is the right way to go.  I've written the below patch to
      allocate a page from the buddy allocator directly and copy the data over
      to it.  This ensures that we have a put_page-freeable page for every
      entry that winds up on an skb frag list, so it can be safely freed when
      the frame is acked.  We do a put_page on each entry after the
      rpc_call_sync call so as to drop our own reference count to the page,
      leaving only the ref count taken by tcp_sendpages.  This way the data
      will be properly freed when the ack comes in.
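
      A rough sketch of approach (1); copy_buf_to_pages() is a hypothetical
      name, but the idea matches the description above: copy the kmalloc'd
      buffer into pages taken straight from the page allocator, so every
      entry handed to the socket layer can later be released with put_page():

        #include <linux/gfp.h>
        #include <linux/mm.h>
        #include <linux/string.h>

        /* Copy 'len' bytes from a kmalloc'd buffer into freshly allocated
         * pages.  Returns the number of pages filled, or -ENOMEM. */
        static int copy_buf_to_pages(const char *buf, size_t len,
                                     struct page **pages, unsigned int npages)
        {
                unsigned int i;

                for (i = 0; len > 0 && i < npages; i++) {
                        size_t chunk = min_t(size_t, len, PAGE_SIZE);

                        pages[i] = alloc_page(GFP_KERNEL);
                        if (!pages[i])
                                goto out_free;
                        memcpy(page_address(pages[i]), buf, chunk);
                        buf += chunk;
                        len -= chunk;
                }
                return i;

        out_free:
                while (i-- > 0)
                        put_page(pages[i]);
                return -ENOMEM;
        }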
      
      Successfully tested by myself to solve the above oops.
      
      Note, as this is the result of a setacl operation that exceeded a page
      of data, I think this amounts to a local DoS triggerable by an
      unprivileged user, so I'm CCing security on this as well.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      CC: Trond Myklebust <Trond.Myklebust@netapp.com>
      CC: security@kernel.org
      CC: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 26 Jan 2011 (1 commit)
  5. 12 Jan 2011 (1 commit)
  6. 07 Jan 2011 (7 commits)
  7. 05 Jan 2011 (2 commits)
  8. 22 Dec 2010 (2 commits)
  9. 08 Dec 2010 (1 commit)
  10. 16 Nov 2010 (1 commit)
  11. 29 Oct 2010 (1 commit)
  12. 26 Oct 2010 (1 commit)
  13. 25 Oct 2010 (3 commits)
  14. 24 Oct 2010 (3 commits)
    • NFS: Readdir plus in v4 · 82f2e547
      Committed by Bryan Schumaker
      By requesting more attributes during a readdir, we can mimic the readdir plus
      operation that was in NFSv3.
      
      To test, I ran the command `ls -lU --color=none` on directories with various
      numbers of files.  Without readdir plus, I see this:
      
      n files |    100    |   1,000   |  10,000   |  100,000  | 1,000,000
      --------+-----------+-----------+-----------+-----------+----------
      real    | 0m00.153s | 0m00.589s | 0m05.601s | 0m56.691s | 9m59.128s
      user    | 0m00.007s | 0m00.007s | 0m00.077s | 0m00.703s | 0m06.800s
      sys     | 0m00.010s | 0m00.070s | 0m00.633s | 0m06.423s | 1m10.005s
      access  | 3         | 1         | 1         | 4         | 31
      getattr | 2         | 1         | 1         | 1         | 1
      lookup  | 104       | 1,003     | 10,003    | 100,003   | 1,000,003
      readdir | 2         | 16        | 158       | 1,575     | 15,749
      total   | 111       | 1,021     | 10,163    | 101,583   | 1,015,784
      
      With readdir plus enabled, I see this:
      
      n files |    100    |   1,000   |  10,000   |  100,000  | 1,000,000
      --------+-----------+-----------+-----------+-----------+----------
      real    | 0m00.115s | 0m00.206s | 0m01.079s | 0m12.521s | 2m07.528s
      user    | 0m00.003s | 0m00.003s | 0m00.040s | 0m00.290s | 0m03.296s
      sys     | 0m00.007s | 0m00.020s | 0m00.120s | 0m01.357s | 0m17.556s
      access  | 3         | 1         | 1         | 1         | 7
      getattr | 2         | 1         | 1         | 1         | 1
      lookup  | 4         | 3         | 3         | 3         | 3
      readdir | 6         | 62        | 630       | 6,300     | 62,993
      total   | 15        | 67        | 635       | 6,305     | 63,004
      
      With readdir plus disabled, the client issues roughly 16x as many rpc
      calls and is 4-5 times slower on large directories.
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFS: readdir with vmapped pages · 56e4ebf8
      Committed by Bryan Schumaker
      We can use vmapped pages to read more information from the network at once.
      This will reduce the number of calls needed to complete a readdir.
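      A minimal sketch of the underlying technique; alloc_contig_view() and
      free_contig_view() are hypothetical names.  vmap() stitches an array of
      individually allocated pages into one contiguous kernel virtual range,
      so a multi-page readdir buffer can be filled and decoded as a single
      flat buffer:

        #include <linux/gfp.h>
        #include <linux/mm.h>
        #include <linux/vmalloc.h>

        /* Allocate 'npages' pages and map them into one virtual range. */
        static void *alloc_contig_view(struct page **pages, unsigned int npages)
        {
                unsigned int i;
                void *addr;

                for (i = 0; i < npages; i++) {
                        pages[i] = alloc_page(GFP_KERNEL);
                        if (!pages[i])
                                goto out_free;
                }
                addr = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
                if (addr)
                        return addr;

        out_free:
                while (i-- > 0)
                        put_page(pages[i]);
                return NULL;
        }

        static void free_contig_view(void *addr, struct page **pages,
                                     unsigned int npages)
        {
                unsigned int i;

                vunmap(addr);
                for (i = 0; i < npages; i++)
                        put_page(pages[i]);
        }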
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      [trondmy: Added #include for <linux/vmalloc.h> in fs/nfs/dir.c]
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    • NFSv4: The state manager must ignore EKEYEXPIRED. · 168667c4
      Committed by Trond Myklebust
      Otherwise, we cannot recover state correctly.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
  15. 20 Oct 2010 (2 commits)
    • NFSv4: Don't call nfs4_state_mark_reclaim_reboot() from error handlers · ae1007d3
      Committed by Trond Myklebust
      In the case of a server reboot, the state recovery thread starts by calling
      nfs4_state_end_reclaim_reboot() in order to avoid edge conditions when
      the server reboots while the client is in the middle of recovery.
      
      However, if the client has already marked the nfs4_state as requiring
      reboot recovery, then the above behaviour will cause the recovery thread to
      treat the open as if it was part of such an edge condition: the open will
      be recovered as if it was part of a lease expiration (and all the locks
      will be lost).
      The fix is to remove the call to nfs4_state_mark_reclaim_reboot() from
      nfs4_async_handle_error() and nfs4_handle_exception(). Instead we leave
      it to the recovery thread to do this for us.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@kernel.org
    • NFSv4: Fix open recovery · b0ed9dbc
      Committed by Trond Myklebust
      NFSv4 open recovery is currently broken: since we do not clear the
      state->flags states before attempting recovery, we end up with the
      'can_open_cached()' function triggering. This again leads to no OPEN call
      being put on the wire.
      Reported-by: Sachin Prabhu <sprabhu@redhat.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: stable@kernel.org
  16. 24 Sep 2010 (1 commit)
  17. 22 Sep 2010 (1 commit)
  18. 18 Sep 2010 (1 commit)
    • nfs: make sillyrename an async operation · d3d4152a
      Committed by Jeff Layton
      A synchronous rename can be interrupted by a SIGKILL. If that happens
      during a sillyrename operation, it's possible for the rename call to
      be sent to the server, but the task exits before processing the
      reply. If this happens, the sillyrenamed file won't get cleaned up
      during nfs_dentry_iput and the server is left with a dangling .nfs* file
      hanging around.
      
      Fix this problem by turning sillyrename into an asynchronous operation
      and have the task doing the sillyrename just wait on the reply. If the
      task is killed before the sillyrename completes, it'll still proceed
      to completion.
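      The general shape of the change, sketched with hypothetical names
      (struct sillyrename_ctx, sillyrename_work(), do_rename_rpc()) and a
      workqueue standing in for the async RPC machinery:

        #include <linux/completion.h>
        #include <linux/kref.h>
        #include <linux/slab.h>
        #include <linux/workqueue.h>

        struct sillyrename_ctx {
                struct work_struct      work;
                struct completion       done;
                struct kref             ref;    /* caller + async work */
                /* ... dentry/inode references the operation needs ... */
        };

        /* Hypothetical: issues the RENAME call and handles the reply. */
        static void do_rename_rpc(struct sillyrename_ctx *ctx);

        static void ctx_release(struct kref *ref)
        {
                kfree(container_of(ref, struct sillyrename_ctx, ref));
        }

        static void sillyrename_work(struct work_struct *work)
        {
                struct sillyrename_ctx *ctx =
                        container_of(work, struct sillyrename_ctx, work);

                do_rename_rpc(ctx);
                complete(&ctx->done);
                kref_put(&ctx->ref, ctx_release);
        }

        static int sillyrename(struct sillyrename_ctx *ctx)
        {
                kref_init(&ctx->ref);           /* caller's reference */
                kref_get(&ctx->ref);            /* async work's reference */
                init_completion(&ctx->done);
                INIT_WORK(&ctx->work, sillyrename_work);
                schedule_work(&ctx->work);

                /* A SIGKILL only ends this wait; the queued work still runs
                 * to completion and drops its own reference when done. */
                (void)wait_for_completion_killable(&ctx->done);
                kref_put(&ctx->ref, ctx_release);
                return 0;
        }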
      Signed-off-by: Jeff Layton <jlayton@redhat.com>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>