  1. 31 Aug 2015 (8 commits)
  2. 29 Aug 2015 (1 commit)
    • RDMA/cma: fix IPv6 address resolution · 6c26a771
      Authored by Spencer Baugh
      Resolving a link-local IPv6 address with an unspecified source address
      was broken by commit 5462eddd, which prevented the IPv6 stack from
      learning the scope id of the link-local IPv6 address, causing random
      failures as the IP stack chose a random link to resolve the address on.
      
      Commit 5462eddd made us bail out of cma_check_linklocal early if
      the address passed in was not an IPv6 link-local address. On the address
      resolution path, the address passed in is the source address; if the
      source address is the unspecified address, which is not link-local, we
      will bail out early.
      
      This is mostly correct, but if the destination address is a link-local
      address, then we will be following a link-local route, and we'll need to
      tell the IPv6 stack what the scope id of the destination address is.
      This used to be done by the last line of cma_check_linklocal, which is
      skipped when bailing out early:
      
      	dev_addr->bound_dev_if = sin6->sin6_scope_id;
      
      (In cma_bind_addr, the sin6_scope_id of the source address is set to the
      sin6_scope_id of the destination address, so this is correct)
      This line is required in turn for the following line, L279 of
      addr6_resolve, to actually inform the IPv6 stack of the scope id:
      
            fl6.flowi6_oif = addr->bound_dev_if;
      
      Since we can only know we are in this failure case when we have access
      to both the source IPv6 address and destination IPv6 address, we have to
      deal with this further up the stack. So detect this failure case in
      cma_bind_addr, and set bound_dev_if to the destination address scope id
      to correct it.
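      
      A minimal sketch of the shape of that fix in cma_bind_addr (the exact
      condition and field names here are an assumption inferred from this
      description, not the literal patch):
      
            /* Sketch: if the source is the unspecified IPv6 address but the
             * destination is link-local, cma_check_linklocal bails out early,
             * so carry the destination's scope id into bound_dev_if here. */
            if (ipv6_addr_any(&((struct sockaddr_in6 *) src_addr)->sin6_addr) &&
                (ipv6_addr_type(&((struct sockaddr_in6 *) dst_addr)->sin6_addr) &
                 IPV6_ADDR_LINKLOCAL))
                    id->route.addr.dev_addr.bound_dev_if =
                            ((struct sockaddr_in6 *) dst_addr)->sin6_scope_id;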
      Signed-off-by: Spencer Baugh <sbaugh@catern.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  3. 02 Jun 2015 (1 commit)
  4. 21 May 2015 (2 commits)
  5. 19 May 2015 (12 commits)
  6. 05 May 2015 (1 commit)
  7. 11 Jun 2014 (1 commit)
    • RDMA/core: Add support for iWARP Port Mapper user space service · 30dc5e63
      Authored by Tatyana Nikolova
      This patch adds iWARP Port Mapper (IWPM) Version 2 support.  The iWARP
      Port Mapper implementation is based on the port mapper specification
      section in the Sockets Direct Protocol paper -
      http://www.rdmaconsortium.org/home/draft-pinkerton-iwarp-sdp-v1.0.pdf
      
      Existing iWARP RDMA providers use the same IP address as the native
      TCP/IP stack when creating RDMA connections.  They need a mechanism to
      claim the TCP ports used for RDMA connections to prevent TCP port
      collisions when other host applications use TCP ports.  The iWARP Port
      Mapper provides a standard mechanism to accomplish this.  Without this
      service it is possible for an RDMA application to bind/listen on the
      same port that is already being used by a native TCP host application.
      If that happens, the incoming TCP connection data can be passed to the
      RDMA stack in error.
      
      The iWARP Port Mapper solution doesn't contain any changes to the
      existing network stack in the kernel space.  All the changes are
      contained within the infiniband tree and in user space.
      
      The iWARP Port Mapper service is implemented as a user space daemon
      process.  Source for the IWPM service is located at
      http://git.openfabrics.org/git?p=~tnikolova/libiwpm-1.0.0/.git;a=summary
      
      When starting a connection, the iWARP driver (port mapper client)
      sends the IWPM service the local IP address and TCP port it received
      from the RDMA application.  The IWPM service performs a socket bind
      from user space to get an available TCP port, called a mapped port,
      and communicates it back to the client.  In that sense, the IWPM
      service maps the TCP port which the RDMA application uses to any port
      available from the host TCP port space.  The mapped ports are used in
      iWARP RDMA connections to avoid collisions with the native TCP stack,
      which is aware that these ports are taken.  When an RDMA connection
      using a mapped port is terminated, the client notifies the IWPM
      service, which then releases the TCP port.
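      
      A minimal user-space sketch of the mapped-port idea (names and error
      handling are simplified; this is illustrative, not the actual IWPM
      daemon code):
      
            #include <string.h>
            #include <netinet/in.h>
            #include <sys/socket.h>
            #include <unistd.h>
      
            /* Bind a TCP socket to port 0 so the kernel picks a free port;
             * that port becomes the "mapped port" reserved for the RDMA
             * connection and stays out of reach of other TCP applications
             * for as long as the socket is held open. */
            static int get_mapped_port(struct in_addr local_ip,
                                       unsigned short *mapped)
            {
                    struct sockaddr_in addr;
                    socklen_t len = sizeof(addr);
                    int fd = socket(AF_INET, SOCK_STREAM, 0);
      
                    if (fd < 0)
                            return -1;
                    memset(&addr, 0, sizeof(addr));
                    addr.sin_family = AF_INET;
                    addr.sin_addr = local_ip;
                    addr.sin_port = 0;      /* let the stack choose */
                    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
                        getsockname(fd, (struct sockaddr *) &addr, &len) < 0) {
                            close(fd);
                            return -1;
                    }
                    *mapped = ntohs(addr.sin_port);
                    return fd;              /* keep fd open to hold the port */
            }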
      
      The message exchange between the IWPM service and the iWARP drivers
      (between user space and kernel space) is implemented using netlink
      sockets.
      
      1) Netlink interface functions are added: ibnl_unicast() and
         ibnl_multicast() for sending netlink messages to user space
      
      2) The signature of the existing ibnl_put_msg() is changed to be more
         generic
      
      3) Two netlink clients are added: RDMA_NL_NES and RDMA_NL_C4IW,
         corresponding to the two iWARP drivers, nes and cxgb4, which use
         the IWPM service
      
      4) Enums are added to enumerate the attributes in the netlink
         messages, which are exchanged between the user space IWPM service
         and the iWARP drivers
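      
      As a sketch of item 4, the attribute enums take roughly this shape
      (the identifiers below are hypothetical placeholders, not the names
      actually added by the patch):
      
            /* Hypothetical attribute enum for one IWPM netlink message
             * type; the real patch defines several such enums. */
            enum {
                    EXAMPLE_IWPM_NLA_UNSPEC = 0,
                    EXAMPLE_IWPM_NLA_LOCAL_ADDR,  /* IP/port the RDMA app asked for */
                    EXAMPLE_IWPM_NLA_MAPPED_ADDR, /* IP/port the IWPM daemon reserved */
                    EXAMPLE_IWPM_NLA_MAX
            };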
      Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Reviewed-by: PJ Waskiewicz <pj.waskiewicz@solidfire.com>
      
      [ Fold in range checking fixes and nlh_next removal as suggested by Dan
        Carpenter and Steve Wise.  Fix sparse endianness in hash.  - Roland ]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  8. 02 Apr 2014 (1 commit)
  9. 23 Jan 2014 (1 commit)
  10. 19 Jan 2014 (1 commit)
  11. 15 Jan 2014 (2 commits)
    • net: replace macros net_random and net_srandom with direct calls to prandom · 63862b5b
      Authored by Aruna-Hewapathirane
      This patch removes the net_random and net_srandom macros and replaces
      them with direct calls to the prandom ones. As new commits only seem to
      use prandom_u32, there is no reason to keep them around.
      This change makes it easier to grep for users of prandom_u32.
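      
      A hedged before/after sketch of what such a conversion looks like at a
      call site (illustrative only, not a specific hunk from this patch):
      
            /* Before: the old wrapper macro */
            delay = net_random() % max_delay;
      
            /* After: the pseudo-random helper called directly */
            delay = prandom_u32() % max_delay;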
      Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com>
      Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • IB/core: Ethernet L2 attributes in verbs/cm structures · dd5f03be
      Authored by Matan Barak
      This patch adds support for Ethernet L2 attributes in the
      verbs/cm/cma structures.
      
      When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and
      priority in a similar manner to how the IB L2 (and the L4 PKEY)
      attributes are used.
      
      Thus, those attributes were added to the following structures:
      
      * ib_ah_attr - added dmac
      * ib_qp_attr - added smac and vlan_id (sl remains vlan priority)
      * ib_wc - added smac, vlan_id
      * ib_sa_path_rec - added smac, dmac, vlan_id
      * cm_av - added smac and vlan_id
      
      For the path record structure, extra care was taken to avoid the new
      fields when packing it into wire format, so we don't break the IB CM
      and SA wire protocol.
      
      On the active side, the CM fills its internal structures from the
      path provided by the ULP.  We add code there that takes the ETH L2
      attributes and places them into the CM Address Handle (struct cm_av).
      
      On the passive side, the CM fills its internal structures from the WC
      associated with the REQ message.  We add code there that takes the
      ETH L2 attributes from the WC.
      
      When the HW driver provides the required ETH L2 attributes in the WC,
      it sets the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags.  The IB core
      code checks for the presence of these flags, and in their absence does
      address resolution from the ib_init_ah_from_wc() helper function.
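      
      A condensed sketch of that consumer-side check (field names follow the
      list above; the fallback helper is a hypothetical placeholder, not the
      actual ib_init_ah_from_wc() code):
      
            if ((wc->wc_flags & IB_WC_WITH_SMAC) &&
                (wc->wc_flags & IB_WC_WITH_VLAN)) {
                    /* the peer's source MAC becomes our destination MAC */
                    memcpy(ah_attr->dmac, wc->smac, ETH_ALEN);
                    vlan_id = wc->vlan_id;
            } else {
                    /* attributes missing: resolve them instead, as done on
                     * the ib_init_ah_from_wc() path */
                    ret = resolve_eth_l2_attrs(ah_attr, wc); /* hypothetical */
            }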
      
      ib_modify_qp_is_ok is also updated to consider the link layer.  Some
      parameters are mandatory for the Ethernet link layer, while they are
      irrelevant for IB.  Vendor drivers are modified to support the new
      function signature.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  12. 12 Nov 2013 (1 commit)
  13. 09 Nov 2013 (2 commits)
    • IB/cma: Check for GID on listening device first · be9130cc
      Authored by Doug Ledford
      As a simple optimization that should speed up the vast majority of
      connect attempts on IB devices, when we are searching for the GID of an
      incoming connection in the cached GID lists of devices, search the
      device that received the incoming connection request first.  If we
      don't find it there, then move on to other devices.
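      
      A rough sketch of the lookup order this describes (structure and helper
      names are simplified illustrations, not the patch itself):
      
            /* Check the cached GIDs of the device that received the request
             * first; only walk the other devices if that lookup misses. */
            ret = find_gid_on_device(recv_device, &gid, &port);
            if (ret) {
                    list_for_each_entry(dev, &device_list, list) {
                            if (dev == recv_device)
                                    continue;       /* already checked */
                            ret = find_gid_on_device(dev, &gid, &port);
                            if (!ret)
                                    break;
                    }
            }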
      
      This reduces the time to perform 10,000 connections considerably.
      Prior to this patch, a bad run of cmtime would look like this:
      
      connect      :    12399.26   12351.10    8609.00    1239.93
      
      With this patch, it looks more like this:
      
      connect      :     5864.86    5799.80    8876.00     586.49
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/cma: Use cached gids · 29f27e84
      Authored by Doug Ledford
      The cma_acquire_dev function was changed by commit 3c86aa70
      ("RDMA/cm: Add RDMA CM support for IBoE devices") to use find_gid_port()
      because multiport devices might have either IB or IBoE formatted gids.
      The old function assumed that all ports on the same device used the
      same GID format.
      
      However, when it was changed to use find_gid_port(), we inadvertently
      lost usage of the GID cache.  This turned out to be a very costly
      change.  In our testing, each iteration through each index of the GID
      table takes roughly 35us.  When you have multiple devices in a system,
      and the GID you are looking for is on one of the later devices, the
      code loops through all of the GID indexes on all of the early devices
      before it finally succeeds on the target device.  This pathological
      search behavior, combined with 35us per GID table index retrieval,
      produces numbers such as the following from the cmtime application
      that's part of the latest librdmacm git repo:
      
      ib1:
      step              total ms     max ms     min us  us / conn
      create id    :       29.42       0.04       1.00       2.94
      bind addr    :   186705.66      19.00   18556.00   18670.57
      resolve addr :       41.93       9.68     619.00       4.19
      resolve route:      486.93       0.48     101.00      48.69
      create qp    :     4021.95       6.18     330.00     402.20
      connect      :    68350.39   68588.17   24632.00    6835.04
      disconnect   :     1460.43     252.65-1862269.00     146.04
      destroy      :       41.16       0.04       2.00       4.12
      
      ib0:
      step              total ms     max ms     min us  us / conn
      create id    :       28.61       0.68       1.00       2.86
      bind addr    :     2178.86       2.95     201.00     217.89
      resolve addr :       51.26      16.85     845.00       5.13
      resolve route:      620.08       0.43      92.00      62.01
      create qp    :     3344.40       6.36     273.00     334.44
      connect      :     6435.99    6368.53    7844.00     643.60
      disconnect   :     5095.38     321.90     757.00     509.54
      destroy      :       37.13       0.02       2.00       3.71
      
      Clearly, both the bind address and connect operations suffer
      a huge penalty for being anything other than the default
      GID on the first port in the system.
      
      After applying this patch, the numbers now look like this:
      
      ib1:
      step              total ms     max ms     min us  us / conn
      create id    :       30.15       0.03       1.00       3.01
      bind addr    :       80.27       0.04       7.00       8.03
      resolve addr :       43.02      13.53     589.00       4.30
      resolve route:      482.90       0.45     100.00      48.29
      create qp    :     3986.55       5.80     330.00     398.66
      connect      :     7141.53    7051.29    5005.00     714.15
      disconnect   :     5038.85     193.63     918.00     503.88
      destroy      :       37.02       0.04       2.00       3.70
      
      ib0:
      step              total ms     max ms     min us  us / conn
      create id    :       34.27       0.05       1.00       3.43
      bind addr    :       26.45       0.04       1.00       2.64
      resolve addr :       38.25      10.54     760.00       3.82
      resolve route:      604.79       0.43      97.00      60.48
      create qp    :     3314.95       6.34     273.00     331.49
      connect      :    12399.26   12351.10    8609.00    1239.93
      disconnect   :     5096.76     270.72    1015.00     509.68
      destroy      :       37.10       0.03       2.00       3.71
      
      It's worth noting that we still suffer a bit of a penalty on
      connect to the wrong device, but the penalty is much less than
      it used to be.  Follow-on patches deal with this penalty.
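      
      The essence of the change is to go back to the GID cache rather than
      querying each GID table index individually; roughly (a sketch based on
      the description above, not the literal diff):
      
            /* One cached lookup per device instead of a ~35us query per
             * GID table index on every port of every earlier device. */
            list_for_each_entry(cma_dev, &dev_list, list) {
                    if (!ib_find_cached_gid(cma_dev->device, &gid, &port, NULL))
                            goto found;     /* this device/port owns the GID */
            }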
      
      Many thanks to Neil Horman for helping to track down the source of
      the slowness, which allowed us to see that the original patch
      mentioned above had backed out cache usage, and to identify just
      how much that impacted the system.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  14. 08 Nov 2013 (1 commit)
  15. 01 Oct 2013 (1 commit)
  16. 13 Aug 2013 (1 commit)
  17. 31 Jul 2013 (3 commits)