1. 30 Nov, 2016 4 commits
    • C
      SUNRPC: Proper metric accounting when RPC is not transmitted · ae09531d
      Chuck Lever committed
      I noticed recently that during an xfstests on a krb5i mount, the
      retransmit count for certain operations had gone negative, and the
      backlog value became unreasonably large. I recall that Andy has
      pointed this out to me in the past.
      
      When call_refresh fails to find a valid credential for an RPC, the
      RPC exits immediately without sending anything on the wire. This
      leaves rq_ntrans, rq_xtime, and rq_rtt set to zero.
      
      The solution for om_queue is to not add to the RPC's running backlog
      queue total whenever rq_xtime is zero.
      
      For om_ntrans, it's a bit more difficult. A zero rq_ntrans causes
      om_ops to become larger than om_ntrans. The design of the RPC
      metrics API assumes that ntrans will always be equal to or larger
      than the ops count. The result is that when an RPC fails to find
      credentials, the RPC operation's reported retransmit count, which is
      computed in user space as the difference between ops and ntrans,
      goes negative.
      
      Ideally the kernel API should report a separate retransmit and
      "exited before initial transmission" metric, so that user space can
      sort out the difference properly.
      
      To avoid kernel API changes and changes to the way rq_ntrans is used
      when performing transport locking, account for untransmitted RPCs
      so that om_ntrans keeps up with om_ops: always add one or more to
      om_ntrans.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      ae09531d
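The accounting rule described in this commit can be modeled in user space as follows. This is a minimal sketch, not the kernel patch itself: the struct and function names (`req_sample`, `op_metrics`, `count_metrics`, `retransmits`) are hypothetical stand-ins for the rq_* and om_* fields named above, and times are plain nanosecond integers rather than kernel time types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-request fields (rq_ntrans, rq_xtime). */
struct req_sample {
    int      ntrans;    /* times the request went on the wire; 0 if never sent */
    uint64_t start_ns;  /* when the RPC task started */
    uint64_t xtime_ns;  /* time of first transmission; 0 if never sent */
};

/* Hypothetical stand-in for the per-operation totals (om_ops, om_ntrans, om_queue). */
struct op_metrics {
    uint64_t ops;
    uint64_t ntrans;
    uint64_t queue_ns;
};

/* Fold one completed RPC into the running totals. */
static void count_metrics(struct op_metrics *m, const struct req_sample *r)
{
    m->ops++;

    /* Always add at least one, so ntrans never falls behind ops. */
    m->ntrans += (r->ntrans > 0) ? (uint64_t)r->ntrans : 1;

    /* Only a request that was actually transmitted has a backlog time. */
    if (r->xtime_ns != 0)
        m->queue_ns += r->xtime_ns - r->start_ns;
}

/* Retransmit count as user space would derive it from the two totals. */
static int64_t retransmits(const struct op_metrics *m)
{
    return (int64_t)m->ntrans - (int64_t)m->ops;
}
```

With this rule, an RPC that exits before its first transmission contributes one to both totals, so the derived retransmit count stays at zero instead of going negative.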
    • C
      xprtrdma: Support for SG_GAP devices · 5e9fc6a0
      Chuck Lever committed
      Some devices (such as the Mellanox CX-4) can register, under a
      single R_key, a set of memory regions that are not contiguous. When
      this is done, all the segments in a Reply list, say, can then be
      invalidated in a single LocalInv Work Request (or via Remote
      Invalidation, which can invalidate exactly one R_key when completing
      a Receive).
      
      This means a single FastReg WR is used to register, and one or zero
      LocalInv WRs can invalidate, the memory involved with RDMA transfers
      on behalf of an RPC.
      
      In addition, xprtrdma constructs some Reply chunks from three or
      more segments. By registering them with SG_GAP, only one segment
      is needed for the Reply chunk, allowing the whole chunk to be
      invalidated remotely.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      5e9fc6a0
    • C
      xprtrdma: Make FRWR send queue entry accounting more accurate · 8d38de65
      Chuck Lever committed
      Verbs providers may perform house-keeping on the Send Queue during
      each signaled send completion. It is necessary therefore for a verbs
      consumer (like xprtrdma) to occasionally force a signaled send
      completion if it runs unsignaled most of the time.
      
      xprtrdma does not require signaled completions for Send or FastReg
      Work Requests, but does signal some LocalInv Work Requests. To
      ensure that Send Queue house-keeping can run before the Send Queue
      is more than half-consumed, xprtrdma forces a signaled completion
      on occasion by counting the number of Send Queue Entries it
      consumes. It currently does this by counting each ib_post_send as
      one Entry.
      
      Commit c9918ff5 ("xprtrdma: Add ro_unmap_sync method for FRWR")
      introduced the ability for frwr_op_unmap_sync to post more than one
      Work Request with a single post_send. Thus the underlying assumption
      of one Send Queue Entry per ib_post_send is no longer true.
      
      Also, FastReg Work Requests are currently never signaled. They
      should be signaled once in a while, just as Send is, to keep the
      accounting of consumed SQEs accurate.
      
      While we're here, convert the CQCOUNT macros to the currently
      preferred kernel coding style, which is inline functions.
      
      Fixes: c9918ff5 ("xprtrdma: Add ro_unmap_sync method for FRWR")
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      8d38de65
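The idea of charging each Work Request against a budget, rather than each post, can be sketched as a small countdown counter written as inline functions. This is an illustrative user-space model under stated assumptions: `sq_budget`, `sq_budget_init`, `sq_charge`, and `sq_charge_chain` are hypothetical names, the counter is not atomic, and none of this is the xprtrdma code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical budget of Send Queue Entries between forced signals. */
struct sq_budget {
    int init;   /* entries allowed between signaled completions */
    int count;  /* entries left before the next forced signal */
};

static inline void sq_budget_init(struct sq_budget *b, int init)
{
    b->init = init;
    b->count = init;
}

/* Charge one Work Request; returns true when it should be signaled. */
static inline bool sq_charge(struct sq_budget *b)
{
    if (--b->count > 0)
        return false;
    b->count = b->init;  /* budget used up: signal and start over */
    return true;
}

/*
 * Charge a chain of n Work Requests handed to the provider in a single
 * post. Each request in the chain consumes one Send Queue Entry, so each
 * is charged separately. Returns true when the post needs a signaled
 * completion.
 */
static inline bool sq_charge_chain(struct sq_budget *b, int n)
{
    bool signal = false;

    for (int i = 0; i < n; i++)
        if (sq_charge(b))
            signal = true;
    return signal;
}
```

Charging per Work Request keeps the counter aligned with the entries actually consumed, even when one post carries several chained requests.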
    • C
      xprtrdma: Cap size of callback buffer resources · 62aee0e3
      Chuck Lever committed
      When the inline threshold size is set to large values (say, 32KB)
      any NFSv4.1 CB request from the server gets a reply with status
      NFS4ERR_RESOURCE.
      
      Looks like there are some upper layer assumptions about the maximum
      size of a reply (for example, in process_op). Cap the size of the
      NFSv4 client's reply resources at a page.
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      62aee0e3
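Capping the reply resources at a page amounts to clamping the requested size. A minimal sketch, assuming a 4096-byte page: `SKETCH_PAGE_SIZE` and `cap_reply_size` are hypothetical names used only for illustration, and the real page size depends on the architecture.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this sketch only; real kernels vary by architecture. */
#define SKETCH_PAGE_SIZE ((size_t)4096)

/* Clamp a requested callback reply buffer size to at most one page. */
static inline size_t cap_reply_size(size_t requested)
{
    return requested < SKETCH_PAGE_SIZE ? requested : SKETCH_PAGE_SIZE;
}
```

A 32KB inline threshold would then yield a one-page reply buffer, while smaller thresholds pass through unchanged.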
  2. 28 Nov, 2016 3 commits
  3. 27 Nov, 2016 8 commits
  4. 26 Nov, 2016 22 commits
  5. 25 Nov, 2016 3 commits
    • J
      parisc: Also flush data TLB in flush_icache_page_asm · 5035b230
      John David Anglin committed
      This is the second issue I noticed in reviewing the parisc TLB code.
      
      The fic instruction may use either the instruction or data TLB in
      flushing the instruction cache.  Thus, on machines with a split TLB, we
      should also flush the data TLB after setting up the temporary alias
      registers.
      
      Although this has no functional impact, I changed the pdtlb and pitlb
      instructions to consistently use the index register %r0.  These
      instructions do not support integer displacements.
      
      Tested on rp3440 and c8000.
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: <stable@vger.kernel.org> # v3.16+
      Signed-off-by: Helge Deller <deller@gmx.de>
      5035b230
    • J
      parisc: Fix race in pci-dma.c · c0452fb9
      John David Anglin committed
      We are still troubled by occasional random segmentation faults and
      memory corruption on SMP machines.  This causes quite a few
      package builds to fail on the Debian buildd machines for parisc.  When
      gcc-6 failed to build three times in a row, I looked again at the TLB
      related code.  I found a couple of issues.  This is the first.
      
      In general, we need to ensure page table updates and corresponding TLB
      purges are atomic.  The attached patch fixes an instance in pci-dma.c
      where the page table update was not guarded by the TLB lock.
      
      Tested on rp3440 and c8000.  So far, no further random segmentation
      faults have been observed.
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: <stable@vger.kernel.org> # v3.16+
      Signed-off-by: Helge Deller <deller@gmx.de>
      c0452fb9
    • H
      parisc: Switch to generic sched_clock implementation · 43b1f6ab
      Helge Deller committed
      Drop the open-coded sched_clock() function and replace it with the provided
      GENERIC_SCHED_CLOCK implementation.  We have seen quite a few hung tasks in the
      past, which seem to be fixed by this patch.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: <stable@vger.kernel.org> # v4.7+
      Signed-off-by: Helge Deller <deller@gmx.de>
      43b1f6ab