1. 31 10月, 2014 1 次提交
    • O
      mlx4: Avoid leaking steering rules on flow creation error flow · 571e1b2c
      Or Gerlitz 提交于
      If mlx4_ib_create_flow() attempts to create > 1 rules with the
      firmware, and one of these registrations fail, we leaked the
      already created flow rules.
      
      One example of the leak is when the registration of the VXLAN ghost
      steering rule fails, we didn't unregister the original rule requested
      by the user, introduced in commit d2fce8a9 "mlx4: Set
      user-space raw Ethernet QPs to properly handle VXLAN traffic".
      
      While here, add dump of the VXLAN portion of steering rules
      so it can actually be seen when flow creation fails.
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      571e1b2c
  2. 23 9月, 2014 10 次提交
  3. 20 9月, 2014 1 次提交
  4. 11 9月, 2014 1 次提交
  5. 30 8月, 2014 1 次提交
  6. 13 8月, 2014 1 次提交
  7. 02 8月, 2014 1 次提交
  8. 02 6月, 2014 2 次提交
  9. 30 5月, 2014 1 次提交
    • J
      IB/mlx4: SET_PORT called by mlx4_ib_modify_port should be wrapped · 61565013
      Jack Morgenstein 提交于
      mlx4_ib_modify_port is invoked in IB for resetting the Q_Key violations
      counters and for modifying the IB port capability flags.
      
      For example, when opensm is started up on the hypervisor,
      mlx4_ib_modify_port is called to set the port's IsSM flag.
      
      In multifunction mode, the SET_PORT command used in this flow should
      be wrapped (so that the PF port capability flags are also tracked,
      thus enabling the aggregate of all the VF/PF capability flags to be
      tracked properly).
      
      The procedure mlx4_SET_PORT() in main.c is also renamed to mlx4_ib_SET_PORT()
      to differentiate it from procedure mlx4_SET_PORT() in port.c.
      mlx4_ib_SET_PORT() is used exclusively by mlx4_ib_modify_port().
      
      Finally, the CM invokes ib_modify_port() to set the IsCMSupported flag
      even when running over RoCE.  Therefore, when RoCE is active,
      mlx4_ib_modify_port should return OK unconditionally (since the
      capability flags and qkey violations counter are not relevant).
      Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      61565013
  10. 17 5月, 2014 1 次提交
    • M
      IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes · 9433c188
      Matan Barak 提交于
      When we receive a netdev event indicating a netdev change and/or
      a netdev address change, we must change the MAC index used by the
      proxy QP1 (in the QP context), otherwise RoCE CM packets sent by the
      VF will not carry the same source MAC address as the non-CM packets.
      
      We use the UPDATE_QP command to perform this change.
      
      In order to avoid modifying a QP context based on netdev event,
      while the driver attempts to destroy this QP (e.g either the mlx4_ib
      or ib_mad modules are unloaded), we use mutex locking in both flows.
      
      Since the relevant mlx4 proxy GSI QP is created indirectly by the
      mad module when they create their GSI QP, the mlx4 didn't need to
      keep track on that QP prior to this change.
      
      Now, when QP modifications are needed to this QP from within the
      driver, we added refernece to it.
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9433c188
  11. 02 4月, 2014 2 次提交
  12. 21 3月, 2014 2 次提交
    • M
      net/mlx4: Adapt code for N-Port VF · 449fc488
      Matan Barak 提交于
      Adds support for N-Port VFs, this includes:
      1. Adding support in the wrapped FW command
      	In wrapped commands, we need to verify and convert
      	the slave's port into the real physical port.
      	Furthermore, when sending the response back to the slave,
      	a reverse conversion should be made.
      2. Adjusting sqpn for QP1 para-virtualization
      	The slave assumes that sqpn is used for QP1 communication.
      	If the slave is assigned to a port != (first port), we need
      	to adjust the sqpn that will direct its QP1 packets into the
      	correct endpoint.
      3. Adjusting gid[5] to modify the port for raw ethernet
      	In B0 steering, gid[5] contains the port. It needs
      	to be adjusted into the physical port.
      4. Adjusting number of ports in the query / ports caps in the FW commands
      	When a slave queries the hardware, it needs to view only
      	the physical ports it's assigned to.
      5. Adjusting the sched_qp according to the port number
      	The QP port is encoded in the sched_qp, thus in modify_qp we need
      	to encode the correct port in sched_qp.
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      449fc488
    • M
      IB/mlx4_ib: Adapt code to use caps.num_ports instead of a constant · 82373701
      Matan Barak 提交于
      Some code in the mlx4 IB driver stack assumed MLX4_MAX_PORTS ports.
      
      Instead, we should only loop until the number of actual ports in i
      the device, which is stored in dev->caps.num_ports.
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      82373701
  13. 13 3月, 2014 1 次提交
  14. 26 2月, 2014 1 次提交
  15. 14 2月, 2014 7 次提交
  16. 20 1月, 2014 1 次提交
  17. 19 1月, 2014 1 次提交
    • M
      IB/mlx4: Use IBoE (RoCE) IP based GIDs in the port GID table · d487ee77
      Moni Shoua 提交于
      Currently, the mlx4 driver set IBoE (RoCE) gids to encode related
      Ethernet netdevice interface MAC address and possibly VLAN id.
      
      Change this scheme such that gids encode interface IP addresses (both
      IP4 and IPv6).
      
      This requires learning the IP addresses which are of use by a
      netdevice associated with the HCA port, formatting them to gids and
      adding them to the port gid table.  Furthermore, events of add and
      delete address are caught to maintain the gid table accordingly.
      
      Associated IP addresses may belong to a master of an Ethernet
      netdevice on top of that port so this should be considered when
      building and maintaining the gid table.
      Signed-off-by: NMoni Shoua <monis@mellanox.com>
      Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      d487ee77
  18. 15 1月, 2014 3 次提交
  19. 18 11月, 2013 2 次提交
    • M
      IB/core: Re-enable create_flow/destroy_flow uverbs · 69ad5da4
      Matan Barak 提交于
      This commit reverts commit 7afbddfa ("IB/core: Temporarily disable
      create_flow/destroy_flow uverbs").  Since the uverbs extensions
      functionality was experimental for v3.12, this patch re-enables the
      support for them and flow-steering for v3.13.
      Signed-off-by: NMatan Barak <matanb@mellanox.com>
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      69ad5da4
    • Y
      IB/core: extended command: an improved infrastructure for uverbs commands · f21519b2
      Yann Droneaud 提交于
      Commit 400dbc96 ("IB/core: Infrastructure for extensible uverbs
      commands") added an infrastructure for extensible uverbs commands
      while later commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow
      through uverbs") exported ib_create_flow()/ib_destroy_flow() functions
      using this new infrastructure.
      
      According to the commit 400dbc96, the purpose of this
      infrastructure is to support passing around provider (eg. hardware)
      specific buffers when userspace issue commands to the kernel, so that
      it would be possible to extend uverbs (eg. core) buffers independently
      from the provider buffers.
      
      But the new kernel command function prototypes were not modified to
      take advantage of this extension. This issue was exposed by Roland
      Dreier in a previous review[1].
      
      So the following patch is an attempt to a revised extensible command
      infrastructure.
      
      This improved extensible command infrastructure distinguish between
      core (eg. legacy)'s command/response buffers from provider
      (eg. hardware)'s command/response buffers: each extended command
      implementing function is given a struct ib_udata to hold core
      (eg. uverbs) input and output buffers, and another struct ib_udata to
      hold the hw (eg. provider) input and output buffers.
      
      Having those buffers identified separately make it easier to increase
      one buffer to support extension without having to add some code to
      guess the exact size of each command/response parts: This should make
      the extended functions more reliable.
      
      Additionally, instead of relying on command identifier being greater
      than IB_USER_VERBS_CMD_THRESHOLD, the proposed infrastructure rely on
      unused bits in command field: on the 32 bits provided by command
      field, only 6 bits are really needed to encode the identifier of
      commands currently supported by the kernel. (Even using only 6 bits
      leaves room for about 23 new commands).
      
      So this patch makes use of some high order bits in command field to
      store flags, leaving enough room for more command identifiers than one
      will ever need (eg. 256).
      
      The new flags are used to specify if the command should be processed
      as an extended one or a legacy one. While designing the new command
      format, care was taken to make usage of flags itself extensible.
      
      Using high order bits of the commands field ensure that newer
      libibverbs on older kernel will properly fail when trying to call
      extended commands. On the other hand, older libibverbs on newer kernel
      will never be able to issue calls to extended commands.
      
      The extended command header includes the optional response pointer so
      that output buffer length and output buffer pointer are located
      together in the command, allowing proper parameters checking. This
      should make implementing functions easier and safer.
      
      Additionally the extended header ensure 64bits alignment, while making
      all sizes multiple of 8 bytes, extending the maximum buffer size:
      
                                   legacy      extended
      
         Maximum command buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)
        Maximum response buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)
      
      For the purpose of doing proper buffer size accounting, the headers
      size are no more taken in account in "in_words".
      
      One of the odds of the current extensible infrastructure, reading
      twice the "legacy" command header, is fixed by removing the "legacy"
      command header from the extended command header: they are processed as
      two different parts of the command: memory is read once and
      information are not duplicated: it's making clear that's an extended
      command scheme and not a different command scheme.
      
      The proposed scheme will format input (command) and output (response)
      buffers this way:
      
      - command:
      
        legacy header +
        extended header +
        command data (core + hw):
      
          +----------------------------------------+
          | flags     |   00      00    |  command |
          |        in_words    |   out_words       |
          +----------------------------------------+
          |                 response               |
          |                 response               |
          | provider_in_words | provider_out_words |
          |                 padding                |
          +----------------------------------------+
          |                                        |
          .              <uverbs input>            .
          .              (in_words * 8)            .
          |                                        |
          +----------------------------------------+
          |                                        |
          .             <provider input>           .
          .          (provider_in_words * 8)       .
          |                                        |
          +----------------------------------------+
      
      - response, if present:
      
          +----------------------------------------+
          |                                        |
          .          <uverbs output space>         .
          .             (out_words * 8)            .
          |                                        |
          +----------------------------------------+
          |                                        |
          .         <provider output space>        .
          .         (provider_out_words * 8)       .
          |                                        |
          +----------------------------------------+
      
      The overall design is to ensure that the extensible infrastructure is
      itself extensible while begin more reliable with more input and bound
      checking.
      
      Note:
      
      The unused field in the extended header would be perfect candidate to
      hold the command "comp_mask" (eg. bit field used to handle
      compatibility).  This was suggested by Roland Dreier in a previous
      review[2].  But "comp_mask" field is likely to be present in the uverb
      input and/or provider input, likewise for the response, as noted by
      Matan Barak[3], so it doesn't make sense to put "comp_mask" in the
      header.
      
      [1]:
      http://marc.info/?i=CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA@mail.gmail.com
      
      [2]:
      http://marc.info/?i=CAL1RGDXJtrc849M6_XNZT5xO1+ybKtLWGq6yg6LhoSsKpsmkYA@mail.gmail.com
      
      [3]:
      http://marc.info/?i=525C1149.6000701@mellanox.comSigned-off-by: NYann Droneaud <ydroneaud@opteya.com>
      Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
      
      [ Convert "ret ? ret : 0" to the equivalent "ret".  - Roland ]
      Signed-off-by: NRoland Dreier <roland@purestorage.com>
      f21519b2