1. 11 October 2008 (3 commits)
  2. 14 August 2008 (1 commit)
    • svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet · 24b8b447
      Authored by Tom Tucker
      RDMA_READ completions are kept on a separate queue from the general
      I/O request queue. Since a separate lock is used to protect the RDMA_READ
      completion queue, a race exists between the dto_tasklet and the
      svc_rdma_recvfrom thread where the dto_tasklet sets the XPT_DATA
      bit and adds I/O to the read-completion queue. Concurrently, the
      recvfrom thread checks the generic queue, finds it empty and resets
      the XPT_DATA bit. A subsequent svc_xprt_enqueue will fail to enqueue
      the transport for I/O and cause the transport to "stall".
      
      The fix is to protect both lists with the same lock and set the XPT_DATA
      bit with this lock held.
      Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
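The locking pattern this fix describes can be sketched in user-space C. All names here (xprt_demo, post_read_completion, recv_check) are hypothetical stand-ins, not the kernel's, and a pthread mutex stands in for the svcrdma spinlock; the point is that the flag and both queues are manipulated under one lock.

```c
/* Minimal sketch of the fix: both the general I/O queue and the
 * read-completion queue share one lock, and the XPT_DATA-style flag
 * is set/cleared only while that lock is held, closing the window in
 * which one side clears the flag while the other is adding work. */
#include <pthread.h>
#include <stdbool.h>

struct xprt_demo {
    pthread_mutex_t lock;   /* single lock protecting both queues */
    int io_queue_len;       /* general I/O request queue */
    int read_queue_len;     /* RDMA_READ completion queue */
    bool xpt_data;          /* "data ready" flag */
};

/* dto_tasklet side: queue a read completion and set the flag under the lock */
void post_read_completion(struct xprt_demo *x)
{
    pthread_mutex_lock(&x->lock);
    x->read_queue_len++;
    x->xpt_data = true;     /* set while the shared lock is held */
    pthread_mutex_unlock(&x->lock);
}

/* recvfrom side: clear the flag only if BOTH queues are empty,
 * checked under the same lock, so no completion can slip in between */
bool recv_check(struct xprt_demo *x)
{
    bool have_data;
    pthread_mutex_lock(&x->lock);
    if (x->io_queue_len == 0 && x->read_queue_len == 0)
        x->xpt_data = false;
    have_data = x->xpt_data;
    pthread_mutex_unlock(&x->lock);
    return have_data;
}
```

With the original two-lock scheme, recv_check could observe an empty general queue and clear the flag just as the tasklet added to the read queue; here that interleaving is impossible.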
  3. 27 July 2008 (1 commit)
    • dma-mapping: add the device argument to dma_mapping_error() · 8d8bb39b
      Authored by FUJITA Tomonori
      Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
      architecture does:
      
      This enables us to cleanly fix the Calgary IOMMU issue that some devices
      are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).
      
      I think that per-device dma_mapping_ops support would be also helpful for
      KVM people to support PCI passthrough but Andi thinks that this makes it
      difficult to support the PCI passthrough (see the above thread).  So I
      CC'ed this to KVM camp.  Comments are appreciated.
      
      A pointer to dma_mapping_ops to struct dev_archdata is added.  If the
      pointer is non NULL, DMA operations in asm/dma-mapping.h use it.  If it's
      NULL, the system-wide dma_ops pointer is used as before.
      
      If it's useful for KVM people, I plan to implement a mechanism to register
      a hook called when a new pci (or dma capable) device is created (it works
      with hot plugging).  It enables IOMMUs to set up an appropriate
      dma_mapping_ops per device.
      
      The major obstacle is that, unlike the other DMA operations,
      dma_mapping_error doesn't take a pointer to the device, so x86 can't
      have dma_mapping_ops per device.  Note that all the POWER IOMMUs use
      the same dma_mapping_error function, so this is not a problem for
      POWER, but x86 IOMMUs use different dma_mapping_error functions.
      
      The first patch adds the device argument to dma_mapping_error.  The
      patch is trivial but large, since it touches many drivers and
      dma-mapping.h in all the architectures.
      
      This patch:
      
      Unlike the other DMA operations, dma_mapping_error() doesn't take a
      pointer to the device, so we can't have dma_mapping_ops per device.
      
      Note that POWER already has dma_mapping_ops per device, but all the
      POWER IOMMUs use the same dma_mapping_error function, whereas x86
      IOMMUs use different ones and therefore need the device argument.
      
      [akpm@linux-foundation.org: fix sge]
      [akpm@linux-foundation.org: fix svc_rdma]
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: fix bnx2x]
      [akpm@linux-foundation.org: fix s2io]
      [akpm@linux-foundation.org: fix pasemi_mac]
      [akpm@linux-foundation.org: fix sdhci]
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: fix sparc]
      [akpm@linux-foundation.org: fix ibmvscsi]
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Muli Ben-Yehuda <muli@il.ibm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Avi Kivity <avi@qumranet.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
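The per-device dispatch this commit enables can be sketched in plain C. The structure and function names below (demo_dma_ops, demo_device, demo_dma_mapping_error) are simplified stand-ins, not the kernel's definitions; the shape mirrors the described behavior: if the device carries its own ops pointer, use it, otherwise fall back to the system-wide dma_ops.

```c
/* Sketch: dma_mapping_error() now receives the device, so the right
 * dma_mapping_ops can be looked up per device instead of relying on
 * a single global function. */
#include <stddef.h>

struct demo_dma_ops {
    int (*mapping_error)(void *dev, unsigned long dma_addr);
};

struct demo_device {
    struct demo_dma_ops *dma_ops;   /* per-device ops; may be NULL */
};

/* system-wide fallback: treat address 0 as the bad-mapping sentinel */
static int global_mapping_error(void *dev, unsigned long dma_addr)
{
    (void)dev;
    return dma_addr == 0;
}

static struct demo_dma_ops global_dma_ops = { global_mapping_error };

/* the device argument makes the per-device lookup possible */
int demo_dma_mapping_error(struct demo_device *dev, unsigned long dma_addr)
{
    struct demo_dma_ops *ops =
        (dev && dev->dma_ops) ? dev->dma_ops : &global_dma_ops;
    return ops->mapping_error(dev, dma_addr);
}
```

Without the device argument there is no way to reach dev->dma_ops, which is exactly the obstacle the commit message describes for x86.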
  4. 03 July 2008 (10 commits)
  5. 19 May 2008 (21 commits)
  6. 24 April 2008 (1 commit)
  7. 17 April 2008 (1 commit)
    • IB/core: Add support for "send with invalidate" work requests · 0f39cf3d
      Authored by Roland Dreier
      Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a
      "send with invalidate" work request as defined in the iWARP verbs and
      the InfiniBand base memory management extensions.  Also put "imm_data"
      and a new "invalidate_rkey" member in a new "ex" union in struct
      ib_send_wr. The invalidate_rkey member can be used to pass in an
      R_Key/STag to be invalidated.  Add this new union to struct
      ib_uverbs_send_wr.  Add code to copy the invalidate_rkey field in
      ib_uverbs_post_send().
      
      Fix up low-level drivers to deal with the change to struct ib_send_wr,
      and just remove the imm_data initialization from net/sunrpc/xprtrdma/,
      since that code never does any send with immediate operations.
      
      Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since
      the iWARP drivers currently in the tree set the bit.  The amso1100
      driver at least will silently fail to honor the IB_SEND_INVALIDATE bit
      if passed in as part of userspace send requests (since it does not
      implement kernel bypass work request queueing).  Remove the flag from
      all existing drivers that set it until we know which ones are OK.
      
      The value chosen for the new flag is not consecutive with the existing
      flags, to avoid clashing with flags defined in the XRC patches, which
      are not yet merged but are already in use and likely to be merged soon.
      
      This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>.
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
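The "ex" union layout described above can be sketched as follows. The names (demo_send_wr, DEMO_WR_SEND_WITH_INV) are illustrative stand-ins with unrelated fields omitted; the point is that imm_data and invalidate_rkey share storage, since a single work request uses at most one of them, selected by the opcode.

```c
/* Sketch of the union added to the send work request: the opcode
 * determines which member of "ex" is meaningful. */
#include <stdint.h>

enum demo_wr_opcode {
    DEMO_WR_SEND,
    DEMO_WR_SEND_WITH_IMM,      /* ex.imm_data is valid */
    DEMO_WR_SEND_WITH_INV,      /* new opcode: ex.invalidate_rkey is valid */
};

struct demo_send_wr {
    enum demo_wr_opcode opcode;
    union {
        uint32_t imm_data;          /* immediate data for ..._WITH_IMM */
        uint32_t invalidate_rkey;   /* R_Key/STag to invalidate for ..._WITH_INV */
    } ex;
};
```

Because the two members overlap, adding the new opcode costs no extra space in the work request, which is why the commit reworks the existing imm_data field into the union rather than adding a new field alongside it.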
  8. 27 March 2008 (1 commit)
    • SVCRDMA: Check num_sge when setting LAST_CTXT bit · c8237a5f
      Authored by Tom Tucker
      The RDMACTXT_F_LAST_CTXT bit was getting set incorrectly
      when the last chunk in the read-list spanned multiple pages. This
      resulted in a kernel panic when the wrong context was used to
      build the RPC iovec page list.
      
      RDMA_READ is used to fetch RPC data from the client for
      NFS_WRITE requests. A scatter-gather list is used to map the
      advertised client-side buffer to the server-side iovec and
      associated page list.
      
      WR contexts are used to convey which scatter-gather entries are
      handled by each WR. When the write data is large, a single RPC may
      require multiple RDMA_READ requests so the contexts for a single RPC
      are chained together in a linked list. The last context in this list
      is marked with a bit RDMACTXT_F_LAST_CTXT so that when this WR completes,
      the CQ handler code can enqueue the RPC for processing.
      
      The code in rdma_read_xdr was setting this bit on the last two
      contexts on this list when the last read-list chunk spanned multiple
      pages. This caused the svc_rdma_recvfrom logic to incorrectly build
      the RPC and caused the kernel to crash because the second-to-last
      context doesn't contain the iovec page list.
      
      Modified the condition that sets this bit so that it correctly detects
      the last context for the RPC.
      Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
      Tested-by: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
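The corrected condition can be sketched on a plain linked list. The names below (demo_ctxt, DEMO_F_LAST_CTXT, mark_last_ctxt) are hypothetical, not the kernel's; the sketch shows the invariant the fix restores: exactly one context per RPC, the one with no successor, carries the LAST_CTXT flag.

```c
/* Sketch: WR contexts for one RPC are chained; only the final context
 * (the one whose ->next is NULL) may carry the LAST_CTXT flag, since
 * only it holds the iovec page list needed to enqueue the RPC. */
#include <stddef.h>

#define DEMO_F_LAST_CTXT 0x1

struct demo_ctxt {
    unsigned int flags;
    struct demo_ctxt *next;     /* contexts chained per RPC */
};

/* correct condition: last means "no successor", regardless of how many
 * pages the final read-list chunk spans */
void mark_last_ctxt(struct demo_ctxt *head)
{
    for (struct demo_ctxt *c = head; c; c = c->next) {
        if (c->next == NULL)
            c->flags |= DEMO_F_LAST_CTXT;
        else
            c->flags &= ~DEMO_F_LAST_CTXT;
    }
}
```

The bug described above amounted to the flag being set on the last two nodes of this chain; the completion handler then built the RPC from the second-to-last context, which lacks the page list, and the kernel crashed.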
  9. 25 March 2008 (1 commit)
    • SVCRDMA: Use only 1 RDMA read scatter entry for iWARP adapters · d3073779
      Authored by Roland Dreier
      The iWARP protocol limits RDMA read requests to a single scatter
      entry.  NFS/RDMA has code in rdma_read_max_sge() that is supposed to
      limit the sge_count for RDMA read requests to 1, but the code to do
      that is inside an #ifdef RDMA_TRANSPORT_IWARP block.  In the mainline
      kernel at least, RDMA_TRANSPORT_IWARP is an enum and not a
      preprocessor #define, so the #ifdef'ed code is never compiled.
      
      In my test of a kernel build with -j8 on an NFS/RDMA mount, this
      problem eventually leads to trouble starting with:
      
          svcrdma: Error posting send = -22
          svcrdma : RDMA_READ error = -22
      
      and things go downhill from there.
      
      The trivial fix is to delete the #ifdef guard.  The check seems to be
      a remnant of when the NFS/RDMA code was not merged and needed to
      compile against multiple kernel versions, although I don't think it
      ever worked as intended.  In any case now that the code is upstream
      there's no need to test whether the RDMA_TRANSPORT_IWARP constant is
      defined or not.
      
      Without this patch, my kernel build on an NFS/RDMA mount using NetEffect
      adapters quickly and 100% reproducibly failed with an error like:
      
          ld: final link failed: Software caused connection abort
      
      With the patch applied I was able to complete a kernel build on the
      same setup.
      
      (Tom Tucker says this is "actually an _ancient_ remnant when it had to
      compile against iWARP vs. non-iWARP enabled OFA trees.")
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
      Acked-by: Tom Tucker <tom@opengridcomputing.com>
      Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
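The class of bug fixed here, a preprocessor #ifdef testing an enumerator, is easy to reproduce in a small sketch. The names below (demo_transport_type, demo_read_max_sge) are illustrative stand-ins for the kernel's; the key fact is that enumeration constants are invisible to the preprocessor, so #ifdef on one is always false and the guarded code is silently never compiled.

```c
/* Sketch: DEMO_TRANSPORT_IWARP is an enum value, not a macro, so
 * "#ifdef DEMO_TRANSPORT_IWARP" would always be false.  The fix is a
 * plain, unguarded runtime comparison against the enum. */
enum demo_transport_type {
    DEMO_TRANSPORT_IB,
    DEMO_TRANSPORT_IWARP,
};

int demo_read_max_sge(enum demo_transport_type t, int sge_count)
{
    /* the old "#ifdef DEMO_TRANSPORT_IWARP" guard around this check
     * compiled to nothing; the check must be unconditional */
    if (t == DEMO_TRANSPORT_IWARP)
        return 1;               /* iWARP: single scatter entry per read */
    return sge_count;
}
```

With the #ifdef in place the iWARP limit was never applied, RDMA read requests were posted with too many scatter entries, and the adapter rejected them with -22 (-EINVAL), matching the errors quoted above.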