1. 19 May, 2015 11 commits
  2. 13 May, 2015 1 commit
  3. 06 May, 2015 2 commits
  4. 05 May, 2015 4 commits
  5. 16 April, 2015 3 commits
  6. 03 April, 2015 1 commit
  7. 19 February, 2015 2 commits
  8. 18 February, 2015 3 commits
  9. 06 February, 2015 1 commit
  10. 04 February, 2015 1 commit
  11. 16 December, 2014 9 commits
    • IB/core: Implement support for MMU notifiers regarding on demand paging regions · 882214e2
      Haggai Eran committed
      * Add an interval tree implementation for ODP umems. Create an
        interval tree for each ucontext (including a count of the number of
        ODP MRs in this context, semaphore, etc.), and register ODP umems in
        the interval tree.
      * Add MMU notifier handling functions, using the interval tree to
        notify only the relevant umems and underlying MRs.
      * Register to receive MMU notifier events from the MM subsystem upon
        ODP MR registration (and unregister accordingly).
      * Add a completion object to synchronize the destruction of ODP umems.
      * Add a mechanism to abort page faults when there's a concurrent invalidation.
      
      The way we synchronize between concurrent invalidations and page
      faults is by keeping a counter of currently running invalidations, and
      a sequence number that is incremented whenever an invalidation is
      caught. The page fault code checks the counter and also verifies that
      the sequence number hasn't progressed before it updates the umem's
      page tables. This is similar to what the kvm module does.
      
      In order to prevent the case where we register a umem in the middle of
      an ongoing notifier, we also keep a per ucontext counter of the total
      number of active mmu notifiers. We only enable new umems when all the
      running notifiers complete.
      Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
      Signed-off-by: Shachar Raindel <raindel@mellanox.com>
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Yuval Dagan <yuvalda@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
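The counter-plus-sequence-number scheme described in the commit message above can be sketched in plain C. This is a single-threaded model under stated assumptions, not the kernel code: the struct and function names are hypothetical, locking is omitted, and bumping the sequence number when an invalidation ends is an assumption modelled on how the message says kvm behaves.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-umem state; field names are illustrative only. */
struct odp_sync {
    unsigned long notifiers_seq; /* bumped once per completed invalidation */
    int notifiers_count;         /* invalidations currently in flight */
};

/* An invalidation begins: faults must not install mappings while it runs. */
static void invalidate_start(struct odp_sync *s)
{
    s->notifiers_count++;
}

/* An invalidation ends: advance the sequence so that faults which sampled
 * the old value can tell they raced with it (assumed placement). */
static void invalidate_end(struct odp_sync *s)
{
    s->notifiers_seq++;
    s->notifiers_count--;
}

/* Fault side, step 1: sample the sequence number before resolving pages. */
static unsigned long fault_begin(const struct odp_sync *s)
{
    return s->notifiers_seq;
}

/* Fault side, step 2: just before updating the page tables, abort if an
 * invalidation is running or has completed since the sample was taken. */
static bool fault_must_retry(const struct odp_sync *s, unsigned long seq)
{
    return s->notifiers_count > 0 || s->notifiers_seq != seq;
}
```

In a real implementation the check in `fault_must_retry` and the page table update would have to happen under the same lock that the invalidation path takes; the sketch only shows the decision logic.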
    • IB/core: Add support for on demand paging regions · 8ada2c1c
      Shachar Raindel committed
      * Extend the umem struct to keep the ODP related data.
      * Allocate and initialize the ODP related information in the umem
        (page_list, dma_list) and free it as needed at the end of the run.
      * Store a reference to the process PID struct in the ucontext.  Used to
        safely obtain the task_struct and the mm during fault handling,
        without preventing the task destruction if needed.
      * Add 2 helper functions: ib_umem_odp_map_dma_pages and
        ib_umem_odp_unmap_dma_pages. These functions get the DMA addresses
        of specific pages of the umem (and, currently, pin them).
      * Support for page faults only - IB core will keep the reference on
        the pages used and call put_page when freeing an ODP umem
        area. Invalidations support will be added in a later patch.
      Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
      Signed-off-by: Shachar Raindel <raindel@mellanox.com>
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/core: Add flags for on demand paging support · 860f10a7
      Sagi Grimberg committed
      * Add a configuration option to enable on-demand paging support in
        the infiniband subsystem (CONFIG_INFINIBAND_ON_DEMAND_PAGING). In a
        later patch, this configuration option will select the MMU_NOTIFIER
        configuration option to enable mmu notifiers.
      * Add a flag for on demand paging (ODP) support in the IB device capabilities.
      * Add a flag to request ODP MR in the access flags to reg_mr.
      * Fail registrations done with the ODP flag when the low-level driver
        doesn't support this.
      * Change the conditions in which an MR will be writable to explicitly
        specify the access flags.  This is to avoid making an MR writable just
        because it is an ODP MR.
      * Add ODP capabilities to the extended query device verb.
      Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
      Signed-off-by: Shachar Raindel <raindel@mellanox.com>
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
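Two of the points in the commit message above (failing ODP registrations on devices without ODP support, and deciding writability from explicit access flags) can be illustrated with a small C sketch. All constants and function names here are hypothetical stand-ins with made-up bit values, and the "implicit" rule is an assumed example of the kind of condition the commit moves away from, not a quote of the old kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for illustration; not the kernel's definitions. */
enum {
    ACCESS_LOCAL_WRITE   = 1 << 0,
    ACCESS_REMOTE_WRITE  = 1 << 1,
    ACCESS_REMOTE_READ   = 1 << 2,
    ACCESS_REMOTE_ATOMIC = 1 << 3,
    ACCESS_ON_DEMAND     = 1 << 6,
};

#define DEVICE_CAP_ODP (1u << 0) /* hypothetical device capability bit */

/* Reject an ODP registration when the device does not advertise ODP.
 * Returns 0 if the registration may proceed, -1 otherwise. */
static int check_reg_mr(unsigned int device_caps, int access)
{
    if ((access & ACCESS_ON_DEMAND) && !(device_caps & DEVICE_CAP_ODP))
        return -1;
    return 0;
}

/* Assumed "implicit" rule: any flag other than remote read means writable.
 * A new flag such as ACCESS_ON_DEMAND then makes the MR writable by accident. */
static bool mr_is_writable_implicit(int access)
{
    return (access & ~ACCESS_REMOTE_READ) != 0;
}

/* Explicit rule: only flags that really imply writing count. Which flags
 * belong in this set is an assumption of the sketch. */
static bool mr_is_writable(int access)
{
    return (access & (ACCESS_LOCAL_WRITE | ACCESS_REMOTE_WRITE |
                      ACCESS_REMOTE_ATOMIC)) != 0;
}
```

With the explicit rule, a read-only ODP registration stays read-only, whereas the implicit rule would have marked it writable.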
    • IB/core: Add support for extended query device caps · 5a77abf9
      Eli Cohen committed
      Add extensible query device capabilities verb to allow adding new features.
      ib_uverbs_ex_query_device is added and copy_query_dev_fields is used to
      copy capability fields to be used by both ib_uverbs_query_device and
      ib_uverbs_ex_query_device.
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/core: Add umem function to read data from user-space · c5d76f13
      Haggai Eran committed
      In some drivers there's a need to read data from a user space area
      that was pinned using ib_umem when running from a different process
      context.
      
      The ib_umem_copy_from function allows reading data from the physical
      pages pinned in the ib_umem struct.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
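The idea of reading from a region through its pinned pages, rather than through the owning process's virtual addresses, can be sketched as a copy loop over a page array. This is a user-space illustration only: `struct pinned_region`, `region_copy_from` and the tiny page size are hypothetical, and the sketch does not reproduce the signature or internals of `ib_umem_copy_from`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 16 /* tiny page size so the example is easy to trace */

/* Hypothetical stand-in for a pinned region: an array of page pointers. */
struct pinned_region {
    unsigned char **pages;    /* one pointer per pinned page */
    size_t npages;
    size_t first_page_offset; /* where the region starts inside pages[0] */
    size_t length;            /* region length in bytes */
};

/* Copy `len` bytes, starting `offset` bytes into the region, into `dst`.
 * Returns 0 on success, -1 if the request falls outside the region. */
static int region_copy_from(void *dst, const struct pinned_region *r,
                            size_t offset, size_t len)
{
    unsigned char *out = dst;
    size_t pos;

    if (r->first_page_offset + r->length > r->npages * SKETCH_PAGE_SIZE)
        return -1; /* inconsistent region description */
    if (offset > r->length || len > r->length - offset)
        return -1;

    pos = r->first_page_offset + offset;
    while (len > 0) {
        size_t page = pos / SKETCH_PAGE_SIZE;
        size_t in_page = pos % SKETCH_PAGE_SIZE;
        size_t chunk = SKETCH_PAGE_SIZE - in_page; /* bytes left in this page */

        if (chunk > len)
            chunk = len;
        memcpy(out, r->pages[page] + in_page, chunk);
        out += chunk;
        pos += chunk;
        len -= chunk;
    }
    return 0;
}
```

Because the copy only dereferences the page pointers, it does not depend on which process context the caller runs in, which is the property the commit message asks for.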
    • IB/core: Replace ib_umem's offset field with a full address · 406f9e5f
      Haggai Eran committed
      In order to allow umems that do not pin memory, we need the umem to
      keep track of its region's address.
      
      This makes the offset field redundant, and so this patch removes it.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
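Why a stored address makes a separate offset field redundant can be shown with a few lines of arithmetic. The struct and helper names below are hypothetical, and the sketch assumes the page size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor that keeps the full start address. */
struct region {
    uintptr_t address; /* user virtual address where the region begins */
    size_t length;     /* region length in bytes */
    size_t page_size;  /* assumed to be a power of two */
};

/* Offset of the region inside its first page, derived from the address. */
static size_t region_offset(const struct region *r)
{
    return (size_t)(r->address & (uintptr_t)(r->page_size - 1));
}

/* Address of the first page that the region touches. */
static uintptr_t region_first_page(const struct region *r)
{
    return r->address & ~(uintptr_t)(r->page_size - 1);
}

/* Number of pages the region spans, rounding the end up to a page boundary. */
static size_t region_num_pages(const struct region *r)
{
    return (region_offset(r) + r->length + r->page_size - 1) / r->page_size;
}
```

For example, a region at 0x10234 of length 0x2000 with 4 KiB pages has offset 0x234, starts in the page at 0x10000, and spans three pages.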
    • IB/addr: Improve address resolution callback scheduling · 346f98b4
      Or Kehati committed
      Address resolution always does a context switch to a work-queue to
      deliver the address resolution event.  When the IP address is already
      cached in the system ARP table, we go through the following
      chain:
      
          rdma_resolve_ip --> addr_resolve (cache hit) -->
      
      which ends up with:
      
          queue_req --> set_timeout (now) --> mod_delayed_work(,, delay=1)
      
      We actually do realize that the timeout should be zero, but the code
      forces it to a minimum of one jiffy.

      Using one jiffy as the minimum delay value results in sub-optimal
      scheduling of this work item by the workqueue, which on the
      testbed below costs about 3-4ms out of 12ms total time.

      To fix that, we allow the minimum delay to be zero.  Note that the
      connect step times change too, as there are address resolution calls
      from that flow.
      
      The results were taken from running both client and server on the
      same node, over mlx4 RoCE port.
      
      before -->
      step              total ms     max ms     min us  us / conn
      create id    :        0.01       0.01       6.00       6.00
      resolve addr :        4.02       4.01    4013.00    4016.00
      resolve route:        0.18       0.18     182.00     183.00
      create qp    :        1.15       1.15    1150.00    1150.00
      connect      :        6.73       6.73    6730.00    6731.00
      disconnect   :        0.55       0.55     549.00     550.00
      destroy      :        0.01       0.01       9.00       9.00
      
      after -->
      step              total ms     max ms     min us  us / conn
      create id    :        0.01       0.01       6.00       6.00
      resolve addr :        0.05       0.05      49.00      52.00
      resolve route:        0.21       0.21     207.00     208.00
      create qp    :        1.10       1.10    1104.00    1104.00
      connect      :        1.22       1.22    1220.00    1221.00
      disconnect   :        0.71       0.71     713.00     713.00
      destroy      :        0.01       0.01       9.00       9.00
      Signed-off-by: Or Kehati <ork@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Acked-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
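The change in minimum delay described in the commit message above comes down to how a deadline that has already been reached is clamped. The two helpers below are a hypothetical user-space sketch of the before and after behaviour; they are not the kernel's `set_timeout`, and the signed cast relies on the usual two's-complement conversion.

```c
#include <assert.h>

/* Before: a deadline that is now or in the past is still pushed out by one
 * tick, so a cache hit always waits for the next timer tick. */
static unsigned long delay_min_one(unsigned long deadline, unsigned long now)
{
    long delay = (long)(deadline - now); /* negative if already past */

    return delay <= 0 ? 1 : (unsigned long)delay;
}

/* After: a deadline that is now or in the past yields a zero delay, so the
 * work item can run as soon as the workqueue gets to it. */
static unsigned long delay_min_zero(unsigned long deadline, unsigned long now)
{
    long delay = (long)(deadline - now);

    return delay < 0 ? 0 : (unsigned long)delay;
}
```

Deadlines that lie in the future are unaffected by the change; only the "now" and "already past" cases differ, which matches the reported drop in the resolve addr step from 4.02 ms to 0.05 ms.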
    • IB/core: Fix mgid key handling in SA agent multicast data-base · 514f3ddf
      Jack Morgenstein committed
      Applications can request that the SM assign an MGID by passing a mcast
      member request containing MGID = 0. When the SM responds by sending
      the allocated MGID, this MGID replaces the 0-MGID in the multicast group.
      
      However, the MGID field in the group is also the key field in the IB
      core multicast code rbtree containing the multicast groups for the
      port.
      
      Since this is a key field, correct handling requires that the group
      entry be deleted from the rbtree and then re-inserted with the new
      key, so that the table structure is properly maintained.
      
      The current code does not do this correctly.  Correct operation
      requires that if the key-field gid has changed at all, it should be
      deleted and re-inserted.
      
      Note that when inserting, if the new MGID is zero (not the case here
      but the code should handle this correctly), we allow duplicate entries
      for 0-MGIDs.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Acked-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
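The rule that a key change must be done as delete plus re-insert applies to any ordered container, not only to an rbtree. The sketch below swaps in a small sorted array with binary search to keep the example short; every name in it is hypothetical, and it is not the multicast code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GID_LEN 16
#define MAX_GROUPS 8

struct group {
    unsigned char mgid[GID_LEN]; /* key field */
    int id;
};

/* Sorted array standing in for the rbtree (an assumption of this sketch). */
struct group_table {
    struct group *slots[MAX_GROUPS];
    size_t count;
};

static const unsigned char zero_gid[GID_LEN] = {0};

/* Binary search by MGID; only valid while the array is sorted by key. */
static struct group *table_find(const struct group_table *t,
                                const unsigned char *mgid)
{
    size_t lo = 0, hi = t->count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = memcmp(mgid, t->slots[mid]->mgid, GID_LEN);

        if (cmp == 0)
            return t->slots[mid];
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL;
}

/* Insert in key order. Duplicate all-zero MGIDs are allowed; any other
 * duplicate key is rejected. Returns 0 on success, -1 on failure. */
static int table_insert(struct group_table *t, struct group *g)
{
    size_t pos = 0;

    if (t->count == MAX_GROUPS)
        return -1;
    if (memcmp(g->mgid, zero_gid, GID_LEN) != 0 && table_find(t, g->mgid))
        return -1;
    while (pos < t->count &&
           memcmp(t->slots[pos]->mgid, g->mgid, GID_LEN) <= 0)
        pos++;
    memmove(&t->slots[pos + 1], &t->slots[pos],
            (t->count - pos) * sizeof(t->slots[0]));
    t->slots[pos] = g;
    t->count++;
    return 0;
}

/* Remove a specific entry (matched by pointer, since zero keys may repeat). */
static void table_remove(struct group_table *t, struct group *g)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (t->slots[i] == g) {
            memmove(&t->slots[i], &t->slots[i + 1],
                    (t->count - i - 1) * sizeof(t->slots[0]));
            t->count--;
            return;
        }
    }
}

/* Change the key: if it differs at all, remove, update, then re-insert. */
static int group_set_mgid(struct group_table *t, struct group *g,
                          const unsigned char *new_mgid)
{
    if (memcmp(g->mgid, new_mgid, GID_LEN) == 0)
        return 0;
    table_remove(t, g);
    memcpy(g->mgid, new_mgid, GID_LEN);
    return table_insert(t, g);
}
```

Overwriting `g->mgid` in place without the remove and re-insert would leave the entry at a position that no longer matches its key, so later lookups could miss it.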
    • IB/core: Do not resolve VLAN if already resolved · c1bd6cde
      Moni Shoua committed
      For RoCE, resolution of layer 2 address attributes forces no VLAN if
      link-local GIDs are used.  This patch allows applications to choose
      the VLAN ID for link-local based RoCE GIDs by setting IB_QP_VID in
      their QP attribute mask, and prevents the core from overriding this
      choice.
      
      Cc: Ursula Braun <ursula.braun@de.ibm.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
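The "do not override an explicit choice" behaviour described above can be reduced to a single mask test. The constants and the helper below are hypothetical and use made-up values; they only illustrate the decision, not the kernel's resolution code or the real value of `IB_QP_VID`.

```c
#include <assert.h>

#define QP_ATTR_VID (1u << 0) /* hypothetical mask bit: caller set a VLAN ID */
#define VLAN_NONE 0xffffu     /* hypothetical "no VLAN" marker */

/* Pick the VLAN ID to use: the caller's value if the mask says one was
 * supplied, otherwise whatever layer 2 resolution produced. */
static unsigned int effective_vlan(unsigned int attr_mask,
                                   unsigned int requested_vlan,
                                   unsigned int resolved_vlan)
{
    return (attr_mask & QP_ATTR_VID) ? requested_vlan : resolved_vlan;
}
```

With a link-local GID, resolution would yield `VLAN_NONE`; an application that set the mask bit keeps its own VLAN ID instead.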
  12. 14 October, 2014 2 commits