1. 15 Jul 2008, 6 commits
    • RDMA/cxgb3: MEM_MGT_EXTENSIONS support · e7e55829
      Steve Wise committed
      - set IB_DEVICE_MEM_MGT_EXTENSIONS capability bit if fw supports it.
      - set max_fast_reg_page_list_len device attribute.
      - add iwch_alloc_fast_reg_mr function.
      - add iwch_alloc_fastreg_pbl.
      - add iwch_free_fastreg_pbl.
      - adjust the WQ depth for kernel mode work queues to account for
        fastreg possibly taking 2 WR slots.
      - add fastreg_mr work request support.
      - add local_inv work request support.
      - add send_with_inv and send_with_se_inv work request support.
      - remove useless duplicate enums/defines for TPT/MW/MR stuff.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • RDMA/core: Add memory management extensions support · 00f7ec36
      Steve Wise committed
      This patch adds support for the IB "base memory management extensions"
      (BMME) and the equivalent iWARP operations (which the iWARP verbs
      spec mandates all devices must implement).  The new operations are:
      
       - Allocate an ib_mr for use in fast register work requests.
      
       - Allocate/free physical buffer lists for use in fast register work
         requests.  This allows device drivers to allocate this memory as
         needed for use in posting send requests (e.g. via dma_alloc_coherent()).
      
       - New send queue work requests:
         * send with remote invalidate
         * fast register memory region
         * local invalidate memory region
         * RDMA read with invalidate local memory region (iWARP only)
      
      Consumer interface details:
      
       - A new device capability flag IB_DEVICE_MEM_MGT_EXTENSIONS is added
         to indicate device support for these features.
      
       - New send work request opcodes IB_WR_FAST_REG_MR, IB_WR_LOCAL_INV,
         IB_WR_RDMA_READ_WITH_INV are added.
      
       - A new consumer API function, ib_alloc_mr(), is added to allocate
         fast register memory regions.

       - New consumer API functions, ib_alloc_fast_reg_page_list() and
         ib_free_fast_reg_page_list(), are added to allocate and free
         device-specific memory for fast registration page lists.
      
       - A new consumer API function, ib_update_fast_reg_key(), is added to
         allow the key portion of the R_Key and L_Key of a fast registration
         MR to be updated.  Consumers call this if desired before posting
         an IB_WR_FAST_REG_MR work request.
      
      Consumers can use this as follows (sketched in code after this list):
      
       - MR is allocated with ib_alloc_mr().
      
       - Page list memory is allocated with ib_alloc_fast_reg_page_list().
      
       - MR R_Key/L_Key "key" field is updated with ib_update_fast_reg_key().
      
       - MR made VALID and bound to a specific page list via
         ib_post_send(IB_WR_FAST_REG_MR).
      
       - MR made INVALID via ib_post_send(IB_WR_LOCAL_INV),
         ib_post_send(IB_WR_RDMA_READ_WITH_INV) or an incoming send with
         invalidate operation.
      
       - MR is deallocated with ib_dereg_mr().

       - Page lists are deallocated via ib_free_fast_reg_page_list().
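
      Put together, a minimal sketch of the flow above, using the API
      names from this description; the fast-register WR fields
      (wr.wr.fast_reg.*), the invalidate field (ex.invalidate_rkey), and
      the surrounding variables (pd, qp, dma_addr, npages, next_key) are
      illustrative assumptions, and error handling is elided:

          struct ib_mr *mr;
          struct ib_fast_reg_page_list *pl;
          struct ib_send_wr wr, inv_wr, *bad_wr;
          int i;

          mr = ib_alloc_mr(pd, 32);                 /* up to 32 pages per bind */
          pl = ib_alloc_fast_reg_page_list(pd->device, 32);

          for (i = 0; i < npages; i++)
                  pl->page_list[i] = dma_addr[i];   /* DMA addresses to map */

          ib_update_fast_reg_key(mr, next_key++);   /* fresh 8-bit key per bind */

          memset(&wr, 0, sizeof(wr));
          wr.opcode = IB_WR_FAST_REG_MR;            /* MR becomes VALID */
          wr.wr.fast_reg.page_list = pl;
          wr.wr.fast_reg.page_list_len = npages;
          wr.wr.fast_reg.page_shift = PAGE_SHIFT;
          wr.wr.fast_reg.length = npages * PAGE_SIZE;
          wr.wr.fast_reg.iova_start = dma_addr[0];
          wr.wr.fast_reg.access_flags = IB_ACCESS_LOCAL_WRITE |
                                        IB_ACCESS_REMOTE_WRITE;
          wr.wr.fast_reg.rkey = mr->rkey;
          ib_post_send(qp, &wr, &bad_wr);

          /* ... peer I/O against mr->rkey completes ... */

          memset(&inv_wr, 0, sizeof(inv_wr));
          inv_wr.opcode = IB_WR_LOCAL_INV;          /* MR becomes INVALID */
          inv_wr.ex.invalidate_rkey = mr->rkey;
          ib_post_send(qp, &inv_wr, &bad_wr);

          ib_free_fast_reg_page_list(pl);
          ib_dereg_mr(mr);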
      
      Applications can allocate a fast register MR once, and then can
      repeatedly bind the MR to different physical block lists (PBLs) via
      posting work requests to a send queue (SQ).  For each outstanding
      MR-to-PBL binding in the SQ pipe, a fast_reg_page_list needs to be
      allocated (the fast_reg_page_list is owned by the low-level driver
      from when the consumer posts a work request until the request completes).
      Thus pipelining can be achieved while still allowing device-specific
      page_list processing.
      
      The 32-bit fast register memory key/STag is composed of a 24-bit index
      and an 8-bit key.  The application can change the key each time it
      fast registers, thus allowing more control over the peer's use of the
      key/STag (i.e. it can effectively be changed each time the rkey is
      rebound to a page list).
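
      For illustration, updating the key amounts to replacing the low byte
      of the L_Key/R_Key while the 24-bit index stays fixed; a sketch of
      what ib_update_fast_reg_key() plausibly does (the real helper may
      differ):

          /* Swap a new 8-bit key in below the fixed 24-bit index. */
          static inline void update_fast_reg_key(struct ib_mr *mr, u8 newkey)
          {
                  mr->lkey = (mr->lkey & 0xffffff00) | newkey;
                  mr->rkey = (mr->rkey & 0xffffff00) | newkey;
          }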
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • RDMA: Remove subversion $Id tags · f3781d2e
      Roland Dreier committed
      They don't get updated by git and so they're worse than useless.
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • fd91b1bf
    • IB/mlx4: Optimize QP stamping · 9670e553
      Eli Cohen committed
      The idea is that for QPs with fixed-size work requests (e.g. selective
      signaling QPs), before stamping the WQE, we read the value of the DS
      field, which gives the effective size of the descriptor as used in the
      previous post.  Then we stamp only that area, since the rest of the
      descriptor is already stamped.
      
      When initializing the send queue buffer, make sure the DS field is
      initialized to the max descriptor size so that the subsequent stamping
      will be done on the entire descriptor area.
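
      Roughly, the optimized stamping looks like the following sketch; the
      structure and field names here are illustrative, not the actual mlx4
      code:

          /* Illustrative only: the real mlx4 control segment differs. */
          struct ctrl_seg {
                  u8 ds;          /* descriptor size in 16-byte units */
                  u8 reserved[63];
          };

          static void stamp_wqe(void *wqe, u32 stamp)
          {
                  struct ctrl_seg *ctrl = wqe;
                  int units = ctrl->ds & 0x3f;  /* size used by previous post */
                  u32 *p = wqe;
                  int i;

                  /* Only the first units * 16 bytes were overwritten by the
                   * last WQE; the rest still holds the old stamp. */
                  for (i = 0; i < units * 16 / 4; i++)
                          p[i] = stamp;
          }

          /* At send queue init, set ds to the max descriptor size so the
           * first stamping pass covers the whole descriptor area. */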
      Signed-off-by: Eli Cohen <eli@mellanox.co.il>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • RDMA/nes: Remove unnecessary memset() · 929555a2
      Christophe Jaillet committed
      Remove an explicit memset(..., 0, ...) of a 'listener' structure
      allocated with kzalloc().
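
      The redundant pattern, in schematic form (surrounding code and
      variable handling simplified):

          listener = kzalloc(sizeof(*listener), GFP_KERNEL); /* zeroed */
          if (!listener)
                  return NULL;
          memset(listener, 0, sizeof(*listener)); /* <-- line being removed */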
      Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
      Acked-by: Faisal Latif <faisal@neteffect.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  2. 09 Jul 2008, 1 commit
  3. 24 Jun 2008, 1 commit
  4. 21 Jun 2008, 2 commits
  5. 11 Jun 2008, 1 commit
    • RDMA/nes: Fix off-by-one in nes_reg_user_mr() error path · 24797a34
      Roland Dreier committed
      nes_reg_user_mr() should fail if page_count becomes >= 1024 * 512
      rather than just testing for strict >, because page_count is
      essentially used as an index into an array with 1024 * 512 entries, so
      allowing the loop to continue with page_count == 1024 * 512 means that
      memory after the end of the array is corrupted.  This leads to a crash
      triggerable by a userspace application that requests registration of a
      too-big region.
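
      Schematically, the bound check changes from > to >= (error-path
      details simplified and names illustrative):

          /* page_count indexes an array of 1024 * 512 entries, so valid
           * indices are 0 .. 1024 * 512 - 1; equality must fail too. */
          if (page_count >= 1024 * 512) {
                  err = -E2BIG;           /* illustrative error code */
                  goto reg_user_mr_err;   /* single shared cleanup path */
          }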
      
      Also get rid of the call to pci_free_consistent() here to avoid
      corrupting state with a double free, since the same memory will be
      freed in the code jumped to at reg_user_mr_err.
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  6. 10 Jun 2008, 1 commit
    • IB/core: Remove IB_DEVICE_SEND_W_INV capability flag · 4c0283fc
      Roland Dreier committed
      In 2.6.26, we added some support for send with invalidate work
      requests, including a device capability flag to indicate whether a
      device supports such requests.  However, the support was incomplete:
      the completion structure was not extended with a field for the key
      contained in incoming send with invalidate requests.
      
      Full support for memory management extensions (send with invalidate,
      local invalidate, fast register through a send queue, etc) is planned
      for 2.6.27.  Since send with invalidate is not very useful by itself,
      just remove the IB_DEVICE_SEND_W_INV bit before the 2.6.26 final
      release; we will add an IB_DEVICE_MEM_MGT_EXTENSIONS bit in 2.6.27,
      which makes things simpler for applications, since they will not have
      quite as confusing an array of fine-grained bits to check.
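
      From a consumer's point of view, the 2.6.27 model then reduces to a
      single query; a sketch against the verbs API of that era (the
      consumer hook is hypothetical):

          struct ib_device_attr attr;

          if (ib_query_device(device, &attr))
                  return;

          /* One coarse capability bit instead of several fine-grained ones. */
          if (attr.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS)
                  enable_fast_registration(device);  /* hypothetical hook */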
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  7. 07 Jun 2008, 2 commits
  8. 27 May 2008, 2 commits
    • IB/ipath: Fix device capability flags · 03031f71
      Ralph Campbell committed
      The driver supports a few features (RNR NAK, port active event, SRQ
      resize) that were not reported in the device capability flags.  This
      patch fixes that.
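
      The fix amounts to OR-ing the missing bits into the advertised
      capabilities; sketched with the standard verbs flag names (the exact
      ipath attribute field is an assumption):

          /* RNR NAK generation, port active events, and SRQ resize were
           * implemented but never advertised to consumers. */
          dd->verbs_dev_attr.device_cap_flags |= IB_DEVICE_RNR_NAK_GEN |
                                                 IB_DEVICE_PORT_ACTIVE_EVENT |
                                                 IB_DEVICE_SRQ_RESIZE;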
      Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • IB/ipath: Avoid test_bit() on u64 SDMA status value · e8ffef73
      Roland Dreier committed
      Gabriel C <nix.or.die@googlemail.com> pointed out that when the x86
      bitops are updated to operate on unsigned long, the code in
      sdma_abort_task() will produce warnings:
      
          drivers/infiniband/hw/ipath/ipath_sdma.c: In function 'sdma_abort_task':
          drivers/infiniband/hw/ipath/ipath_sdma.c:267: warning: passing argument 2 of 'constant_test_bit' from incompatible pointer type
      
      and so on, because it uses test_bit() to operate on a u64 value
      (returned by ipath_read_kreg64() for a hardware register).
      
      Fix up these warnings by converting the test_bit() operations to
      ANDing with appropriate symbolic defines of the bits within the hardware
      register.  This has the benign side-effect of making the code more
      self-documenting as well.
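
      In miniature, the change looks like this; the bit define and
      accessor below are placeholders, not the real ipath names:

          /* Placeholder define for one status bit in the register. */
          #define SDMA_STATUS_ABORT_IN_PROG 0x0000000000001000ULL

          u64 status = read_sdma_status_reg(dd);  /* placeholder accessor */

          /* Before (warns once bitops take unsigned long *):
           *         if (test_bit(12, &status)) ...
           * After: mask the u64 directly with a named bit. */
          if (status & SDMA_STATUS_ABORT_IN_PROG)
                  handle_sdma_abort(dd);          /* placeholder handler */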
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  9. 21 May 2008, 1 commit
    • IB/mlx4: Fix creation of kernel QP with max number of send s/g entries · cd155c1c
      Roland Dreier committed
      When creating a kernel QP where the consumer asked for a send queue
      with lots of scatter/gather entries, set_kernel_sq_size() incorrectly
      returned an error if the send queue stride is larger than the
      hardware's maximum send work request descriptor size.  This is not a
      problem; the only issue is to make sure that the actual descriptors
      used do not overflow the maximum descriptor size, so check this instead.
      
      Clamp the returned max_send_sge value to be no bigger than what
      query_device returns for the max_sge to avoid confusing hapless users,
      even if the hardware is capable of handling a few more s/g entries.
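
      The clamp is one line in spirit (names illustrative):

          /* Never report more s/g entries than query_device's max_sge,
           * even if the hardware could actually handle a few more. */
          cap->max_send_sge = min_t(int, computed_max_sge, dev_attr.max_sge);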
      
      This bug caused NFS/RDMA mounts to fail when the server adapter used
      the mlx4 driver.
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  10. 17 May 2008, 3 commits
  11. 16 May 2008, 2 commits
    • IB/ipath: Fix UC receive completion opcode for RDMA WRITE with immediate · df3f0da8
      Ralph Campbell committed
      When I fixed the RC receive completion opcode in 2bfc8e9e ("IB/ipath:
      Return the correct opcode for RDMA WRITE with immediate"), I forgot to
      fix UC, which had the same problem for RDMA write with immediate
      returning the wrong opcode.
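
      The fix mirrors the RC change: on the receive side, report the
      write-with-immediate completion opcode instead of the plain receive
      one (sketch, surrounding code elided):

          /* Before: wc.opcode = IB_WC_RECV;  (wrong for this case) */
          wc.opcode = IB_WC_RECV_RDMA_WITH_IMM; /* RDMA WRITE w/ immediate */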
      Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • IB/ipath: Fix printk format for ipath_sdma_status · cd80ec6f
      Roland Dreier committed
      Commit f018c7e1 ("IB/ipath: Change ipath_devdata.ipath_sdma_status to be
      unsigned long") changed ipath_sdma_status to be unsigned long, but left
      a few debug messages that printed it out with a %016llx format, which
      generates the warnings
      
          drivers/infiniband/hw/ipath/ipath_sdma.c:348: warning: format '%016llx' expects type 'long long unsigned int', but argument  3 has type 'long unsigned int'
          drivers/infiniband/hw/ipath/ipath_sdma.c:618: warning: format '%016llx' expects type 'long long unsigned int', but argument  3 has type 'long unsigned int'
      
      Fix this by changing the format used to print out the value to %08lx
      (8 hex digits are now sufficient, because the highest bit used is 31).
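
      In miniature (message text illustrative):

          unsigned long status = 0x80000001UL;     /* highest bit used is 31 */

          printk("sdma status %016llx\n", status); /* mismatched: warns */
          printk("sdma status %08lx\n", status);   /* matches unsigned long */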
      
      Warnings reported by Randy Dunlap <randy.dunlap@oracle.com>.
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  12. 14 May 2008, 7 commits
  13. 08 May 2008, 8 commits
  14. 07 May 2008, 2 commits
    • RDMA/cxgb3: Fix severe limit on userspace memory registration size · 273748cc
      Roland Dreier committed
      Currently, iw_cxgb3 is severely limited on the amount of userspace
      memory that can be registered in a single memory region, which
      causes big problems for applications that expect to be able to
      register 100s of MB.
      
      The problem is that the driver uses a single kmalloc()ed buffer to
      hold the physical buffer list (PBL) for the entire memory region
      during registration, which means that 8 bytes of contiguous memory are
      required for each page of memory being registered.  For example, a 64
      MB registration will require 128 KB of contiguous memory with 4 KB
      pages, and it is unlikely that such an allocation will succeed on a busy
      system.
      
      This is purely a driver problem: the temporary page list buffer is not
      needed by the hardware, so we can fix this by writing the PBL to the
      hardware in page-sized chunks rather than all at once.  We do this by
      splitting the memory registration operation up into several steps
      (sketched in code after this list):
      
       - Allocate PBL space in adapter memory for the full registration
       - Copy PBL to adapter memory in chunks
       - Allocate STag and enable memory region
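
      A sketch of the chunked copy step; the adapter-memory writer and
      types here are illustrative, not the real cxio interfaces:

          #define PBL_CHUNK (PAGE_SIZE / sizeof(__be64)) /* entries per write */

          static int write_pbl_chunked(struct adapter *adap, u32 pbl_addr,
                                       __be64 *pbl, int npages)
          {
                  int i, n, ret;

                  for (i = 0; i < npages; i += n) {
                          n = min_t(int, npages - i, PBL_CHUNK);
                          /* Copy one page-sized chunk of PBL entries into
                           * the adapter memory for this registration. */
                          ret = write_adapter_mem(adap,
                                          pbl_addr + i * sizeof(__be64),
                                          n * sizeof(__be64), pbl + i);
                          if (ret)
                                  return ret;
                  }
                  return 0;
          }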
      
      This also allows several other cleanups to the __cxio_tpt_op()
      interface and related parts of the driver.
      
      This change leaves the reregister memory region and memory window
      operations broken, but they already didn't work due to other
      longstanding bugs, so fixing them will be left to a later patch.
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
    • RDMA/cxgb3: Don't add PBL memory to gen_pool in chunks · 0e991336
      Roland Dreier committed
      Current iw_cxgb3 code adds PBL memory to the driver's gen_pool in 2 MB
      chunks.  This limits the largest single allocation that can be done to
      the same size, which means that with 4 KB pages, each of which takes 8
      bytes of PBL memory, the largest memory region that can be allocated
      is 1 GB (256K PBL entries * 4 KB/entry).
      
      Remove this limit by adding all the PBL memory in a single gen_pool
      chunk, if possible.  Add code that falls back to smaller chunks if
      gen_pool_add() fails, which can happen if there is not sufficient
      contiguous lowmem for the internal gen_pool bitmap.
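
      A sketch of the fallback; gen_pool_add() is the real API, while the
      retry policy and names below are simplified for illustration:

          /* Try the whole PBL region as one chunk; gen_pool_add()'s
           * internal bitmap needs contiguous lowmem, so halve the chunk
           * size and retry on failure (re-adding pieces already accepted
           * at a larger size is elided in this sketch). */
          static int add_pbl_memory(struct gen_pool *pool, unsigned long base,
                                    unsigned long size)
          {
                  unsigned long chunk;

                  for (chunk = size; chunk >= PAGE_SIZE; chunk >>= 1) {
                          unsigned long off;

                          for (off = 0; off < size; off += chunk)
                                  if (gen_pool_add(pool, base + off, chunk, -1))
                                          break;
                          if (off >= size)
                                  return 0;  /* all PBL memory added */
                  }
                  return -ENOMEM;
          }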
      Signed-off-by: Roland Dreier <rolandd@cisco.com>
  15. 06 May 2008, 1 commit