1. 30 5月, 2018 1 次提交
    • P
      IB/core: Fix error code for invalid GID entry · a840c93c
      Parav Pandit 提交于
      When a GID entry is invalid EAGAIN is returned. This is an incorrect error
      code, there is nothing that will make this GID entry valid again in
      bounded time.
      
      Some user space tools fail incorrectly if EAGAIN is returned here, and
      this represents a small ABI change from earlier kernels.
      
      The first patch in the Fixes list makes entries that were valid before
      to become invalid, allowing this code to trigger, while the second patch
      in the Fixes list introduced the wrong EAGAIN.
      
      Therefore revert the return result to EINVAL which matches the historical
      expectations of the ibv_query_gid_type() API of the libibverbs user space
      library.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 598ff6ba ("IB/core: Refactor GID modify code for RoCE")
      Fixes: 03db3a2d ("IB/core: Add RoCE GID table management")
      Reviewed-by: NDaniel Jurgens <danielj@mellanox.com>
      Signed-off-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      a840c93c
  2. 24 4月, 2018 3 次提交
    • P
      IB/core: Fix deleting default GIDs when changing mac adddress · dc5640f2
      Parav Pandit 提交于
      Before [1], When MAC address of the netdevice is changed, default GID is
      supposed to get deleted and added back which affects the node and/or port
      GUID in below sequence.
      
      netdevice_event()
      -> NETDEV_CHANGEADDR
         default_del_cmd()
            del_netdev_default_ips()
                bond_delete_netdev_default_gids()
                    ib_cache_gid_set_default_gid()
                        ib_cache_gid_del()
         add_cmd()
         [..]
      
      However, ib_cache_gid_del() was not getting invoked in non bonding
      scenarios because event_ndev and rdma_ndev are same.
      Therefore, fix such condition to ignore checking upper device when event
      ndev and rdma_dev are same; similar to bond_set_netdev_default_gids().
      
      Which this fix ib_cache_gid_del() is invoked correctly; however
      ib_cache_gid_del() doesn't find the default GID for deletion because
      find_gid() was given default_gid = false with
      GID_ATTR_FIND_MASK_DEFAULT set.
      But it was getting overwritten by ib_cache_gid_set_default_gid() later
      on as part of add_cmd().
      Therefore, mac address change used to work for default GID.
      
      With refactor series [1], this incorrect behavior is detected.
      
      Therefore,
      when deleting default GID, set default_gid and set MASK flag.
      when deleting IP based GID, clear default_gid and set MASK flag.
      
      [1] https://patchwork.kernel.org/patch/10319151/
      
      Fixes: 238fdf48 ("IB/core: Add RoCE table bonding support")
      Fixes: 598ff6ba ("IB/core: Refactor GID modify code for RoCE")
      Signed-off-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      dc5640f2
    • P
      IB/core: Fix to avoid deleting IPv6 look alike default GIDs · 22c01ee4
      Parav Pandit 提交于
      When IPv6 link local address is removed, if it matches with the default
      GID, default GID(s)s gets removed which may not be a desired behavior.
      This behavior is introduced by refactor work in Fixes tag.
      
      When IPv6 link address is removed, removing its equivalent RoCEv2 GID
      which exactly matches with default RoCEv2 GID, is right thing to do.
      However achieving it correctly requires lot more changes, likely in
      roce_gid_mgmt.c and core/cache.c. This should be done as independent
      patch.
      
      Therefore, this patch preserves behavior of not deleteing default GIDs.
      This is done by providing explicit hint to consider default GID property
      using mask and default_gid; similar to add_gid().
      
      Fixes: 598ff6ba ("IB/core: Refactor GID modify code for RoCE")
      Signed-off-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      22c01ee4
    • P
      IB/core: Don't allow default GID addition at non reseved slots · a66ed149
      Parav Pandit 提交于
      Default GIDs are marked reserved at the start of the GID table at index
      0 and 1 by gid_table_reserve_default().  Currently when default GID is
      requested, it can still allocates an empty slot which was not marked as
      RESERVED for default GID, which is incorrect.
      
      At least in current code flow of roce_gid_mgmt.c, in theory we can
      still request to allocate more than one/two default GIDs depending
      on how upper devices are setup.
      
      Therefore, it is better for cache layer to only allow our reserved slots
      to be used by default GID allocation requests.
      
      Fixes: 598ff6ba ("IB/core: Refactor GID modify code for RoCE")
      Signed-off-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NDoug Ledford <dledford@redhat.com>
      a66ed149
  3. 04 4月, 2018 3 次提交
    • P
      RDMA: Use ib_gid_attr during GID modification · 414448d2
      Parav Pandit 提交于
      Now that ib_gid_attr contains device, port and index, simplify the
      provider APIs add_gid() and del_gid() to use device, port and index
      fields from the ib_gid_attr attributes structure.
      Signed-off-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      414448d2
    • P
      IB/core: Refactor GID modify code for RoCE · 598ff6ba
      Parav Pandit 提交于
      Code is refactored to prepare separate functions for RoCE which can do more
      complex operations related to reference counting, while still
      maintainining code readability. This includes
      (a) Simplification to not perform netdevice checks and modifications
      for IB link layer.
      (b) Do not add RoCE GID entry which has NULL netdevice; instead return
      an error.
      (c) If GID addition fails at provider level add_gid(), do not add the
      entry in the cache and keep the entry marked as INVALID.
      (d) Simplify and reuse the ib_cache_gid_add()/del() routines so that they
      can be used even for modifying default GIDs. This avoid some code
      duplication in modifying default GIDs.
      (e) find_gid() routine refers to the data entry flags to qualify a GID
      as valid or invalid GID rather than depending on attributes and zeroness
      of the GID content.
      (f) gid_table_reserve_default() sets the GID default attribute at
      beginning while setting up the GID table. There is no need to use
      default_gid flag in low level functions such as write_gid(), add_gid(),
      del_gid(), as they never need to update the DEFAULT property of the GID
      entry while during GID table update.
      
      As as result of this refactor, reserved GID 0:0:0:0:0:0:0:0 is no longer
      searchable as described below.
      
      A unicast GID entry of 0:0:0:0:0:0:0:0 is Reserved GID as per the IB
      spec version 1.3 section 4.1.1, point (6) whose snippet is below.
      
      "The unicast GID address 0:0:0:0:0:0:0:0 is reserved - referred to as
      the Reserved GID. It shall never be assigned to any endport. It shall
      not be used as a destination address or in a global routing header
      (GRH)."
      
      GID table cache now only stores valid GID entries. Before this patch,
      Reserved GID 0:0:0:0:0:0:0:0 was searchable in the GID table using
      ib_find_cached_gid_by_port() and other similar find routines.
      
      Zero GID is no longer searchable as it shall not to be present in GRH or
      path recored entry as described in IB spec version 1.3 section 4.1.1,
      point (6), section 12.7.10 and section 12.7.20.
      
      ib_cache_update() is simplified to check link layer once, use unified
      locking scheme for all link layers, removed temporary gid table
      allocation/free logic.
      
      Additionally,
      (a) Expand ib_gid_attr to store port and index so that GID query
      routines can get port and index information from the attribute structure.
      (b) Expand ib_gid_attr to store device as well so that in future code when
      GID reference counting is done, device is used to reach back to the GID
      table entry.
      Signed-off-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      598ff6ba
    • P
      IB/core: Simplify ib_query_gid to always refer to cache · f35faa4b
      Parav Pandit 提交于
      Currently following inconsistencies exist.
      1. ib_query_gid() returns GID from the software cache for a RoCE port
      and returns GID from the HCA for an IB port.
      This is incorrect because software GID cache is maintained regardless
      of HCA port type.
      
      2. GID is queries from the HCA via ib_query_gid and updated in the
      software cache for IB link layer. Both of them might not be in sync.
      
      ULPs such as SRP initiator, SRP target, IPoIB driver have historically
      used ib_query_gid() API to query the GID. However CM used cached version
      during CM processing, When software cache was introduced, this
      inconsitency remained.
      
      In order to simplify, improve readability and avoid link layer
      specific above inconsistencies, this patch brings following changes.
      
      1. ib_query_gid() always refers to the cache layer regardless of link
      layer.
      
      2. cache module who reads the GID entry from HCA and builds the cache,
      directly invokes the HCA provider verb's query_gid() callback function.
      
      3. ib_query_port() is being called in early stage where GID cache is not
      yet build while reading port immutable property. Therefore it needs to
      read the default GID from the HCA for IB link layer to publish the
      subnet prefix.
      Signed-off-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      f35faa4b
  4. 28 3月, 2018 3 次提交
  5. 20 3月, 2018 1 次提交
  6. 16 3月, 2018 1 次提交
  7. 09 1月, 2018 2 次提交
    • D
      {net, IB}/mlx5: Manage port association for multiport RoCE · 32f69e4b
      Daniel Jurgens 提交于
      When mlx5_ib_add is called determine if the mlx5 core device being
      added is capable of dual port RoCE operation. If it is, determine
      whether it is a master device or a slave device using the
      num_vhca_ports and affiliate_nic_vport_criteria capabilities.
      
      If the device is a slave, attempt to find a master device to affiliate it
      with. Devices that can be affiliated will share a system image guid. If
      none are found place it on a list of unaffiliated ports. If a master is
      found bind the port to it by configuring the port affiliation in the NIC
      vport context.
      
      Similarly when mlx5_ib_remove is called determine the port type. If it's
      a slave port, unaffiliate it from the master device, otherwise just
      remove it from the unaffiliated port list.
      
      The IB device is registered as a multiport device, even if a 2nd port is
      not available for affiliation. When the 2nd port is affiliated later the
      GID cache must be refreshed in order to get the default GIDs for the 2nd
      port in the cache. Export roce_rescan_device to provide a mechanism to
      refresh the cache after a new port is bound.
      
      In a multiport configuration all IB object (QP, MR, PD, etc) related
      commands should flow through the master mlx5_core_dev, other commands
      must be sent to the slave port mlx5_core_mdev, an interface is provide
      to get the correct mdev for non IB object commands.
      Signed-off-by: NDaniel Jurgens <danielj@mellanox.com>
      Reviewed-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leon@kernel.org>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      32f69e4b
    • D
      IB/core: Change roce_rescan_device to return void · 908d6460
      Daniel Jurgens 提交于
      It always returns 0. Change return type to void.
      Signed-off-by: NDaniel Jurgens <danielj@mellanox.com>
      Reviewed-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NLeon Romanovsky <leon@kernel.org>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      908d6460
  8. 19 12月, 2017 2 次提交
  9. 25 8月, 2017 1 次提交
  10. 24 5月, 2017 1 次提交
    • D
      IB/core: Enforce PKey security on QPs · d291f1a6
      Daniel Jurgens 提交于
      Add new LSM hooks to allocate and free security contexts and check for
      permission to access a PKey.
      
      Allocate and free a security context when creating and destroying a QP.
      This context is used for controlling access to PKeys.
      
      When a request is made to modify a QP that changes the port, PKey index,
      or alternate path, check that the QP has permission for the PKey in the
      PKey table index on the subnet prefix of the port. If the QP is shared
      make sure all handles to the QP also have access.
      
      Store which port and PKey index a QP is using. After the reset to init
      transition the user can modify the port, PKey index and alternate path
      independently. So port and PKey settings changes can be a merge of the
      previous settings and the new ones.
      
      In order to maintain access control if there are PKey table or subnet
      prefix change keep a list of all QPs are using each PKey index on
      each port. If a change occurs all QPs using that device and port must
      have access enforced for the new cache settings.
      
      These changes add a transaction to the QP modify process. Association
      with the old port and PKey index must be maintained if the modify fails,
      and must be removed if it succeeds. Association with the new port and
      PKey index must be established prior to the modify and removed if the
      modify fails.
      
      1. When a QP is modified to a particular Port, PKey index or alternate
         path insert that QP into the appropriate lists.
      
      2. Check permission to access the new settings.
      
      3. If step 2 grants access attempt to modify the QP.
      
      4a. If steps 2 and 3 succeed remove any prior associations.
      
      4b. If ether fails remove the new setting associations.
      
      If a PKey table or subnet prefix changes walk the list of QPs and
      check that they have permission. If not send the QP to the error state
      and raise a fatal error event. If it's a shared QP make sure all the
      QPs that share the real_qp have permission as well. If the QP that
      owns a security structure is denied access the security structure is
      marked as such and the QP is added to an error_list. Once the moving
      the QP to error is complete the security structure mark is cleared.
      
      Maintaining the lists correctly turns QP destroy into a transaction.
      The hardware driver for the device frees the ib_qp structure, so while
      the destroy is in progress the ib_qp pointer in the ib_qp_security
      struct is undefined. When the destroy process begins the ib_qp_security
      structure is marked as destroying. This prevents any action from being
      taken on the QP pointer. After the QP is destroyed successfully it
      could still listed on an error_list wait for it to be processed by that
      flow before cleaning up the structure.
      
      If the destroy fails the QPs port and PKey settings are reinserted into
      the appropriate lists, the destroying flag is cleared, and access control
      is enforced, in case there were any cache changes during the destroy
      flow.
      
      To keep the security changes isolated a new file is used to hold security
      related functionality.
      Signed-off-by: NDaniel Jurgens <danielj@mellanox.com>
      Acked-by: NDoug Ledford <dledford@redhat.com>
      [PM: merge fixup in ib_verbs.h and uverbs_cmd.c]
      Signed-off-by: NPaul Moore <paul@paul-moore.com>
      d291f1a6
  11. 23 5月, 2017 1 次提交
  12. 28 1月, 2017 1 次提交
  13. 25 1月, 2017 1 次提交
  14. 13 1月, 2017 2 次提交
  15. 04 12月, 2016 1 次提交
  16. 23 6月, 2016 1 次提交
  17. 07 6月, 2016 1 次提交
  18. 23 4月, 2016 2 次提交
  19. 03 3月, 2016 1 次提交
  20. 20 1月, 2016 1 次提交
  21. 23 12月, 2015 6 次提交
  22. 22 10月, 2015 3 次提交
  23. 16 10月, 2015 1 次提交