Commits · 5ef8c0c180a6318542dce7e0701dd8e341c1265b · openeuler / Kernel

30 May, 2018 1 commit

RDMA/core: Remove indirection through ib_cache_setup() · 5ef8c0c1

Authored by Jason Gunthorpe on May 28, 2018

This once might have made sense when cache.c was in a different module
from device.c, but today it is just obfuscation. Get rid of the wrappers
and call roce_gid_mgmt_init()/cleanup() directly.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>

5ef8c0c1

16 Mar, 2018 2 commits

IB/core: Move rdma_addr_find_l2_eth_by_grh to core_priv.h · e41a7c41

Authored by Parav Pandit on Mar 13, 2018

Before commit [1], rdma_addr_find_l2_eth_by_grh() was an exported function
and therefore its declaration in include/rdma/ib_addr.h was fine.

But now that its scope is limited to the ib_core module, it's better to have
it in core_priv.h.

[1] commit 1060f865 ("IB/{core/cm}: Fix generating a return AH for
RoCEE")
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

e41a7c41

IB/core: Remove rdma_resolve_ip_route() as exported symbol · a9c06aeb

Authored by Parav Pandit on Mar 13, 2018

rdma_resolve_ip_route() is used only by ib_core module. Therefore it is
removed as an exported symbol.
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

a9c06aeb

17 Feb, 2018 2 commits

RDMA/restrack: don't use uaccess_kernel() · 2f08ee36

Authored by Steve Wise on Feb 14, 2018

uaccess_kernel() isn't sufficient to determine if an rdma resource is
user-mode or not.  For example, resources allocated in the add_one()
function of an ib_client get falsely labeled as user mode, when they
are kernel mode allocations, e.g. MAD QPs.

The result is that these qps are skipped over during a nldev query
because of an erroneous namespace mismatch.

So now we determine if the resource is user-mode by looking at the object
struct's uobject or similar pointer to know if it was allocated for user
mode applications.

Fixes: 02d8883f ("RDMA/restrack: Add general infrastructure to track RDMA resources")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

2f08ee36
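
The idea in the commit above can be illustrated with a minimal user-space sketch: ownership is inferred from whether the object carries a user-object pointer, not from the calling context. The struct layouts and the helper name `res_is_user()` are hypothetical simplifications for illustration, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the needed fields. */
struct uobject { int id; };

struct qp {
    struct uobject *uobject;   /* non-NULL only when created for user space */
};

/* A resource counts as user-mode if it was allocated on behalf of a
 * user-space application, which is inferred from its uobject pointer. */
static bool res_is_user(const struct qp *qp)
{
    return qp->uobject != NULL;
}
```

Under this assumption, a MAD QP created by an ib_client in add_one() has no uobject and is therefore classified as a kernel resource, whatever context created it.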

RDMA/verbs: Check existence of function prior to accessing it · 21885586

Authored by Leon Romanovsky on Feb 14, 2018

Update all the flows to ensure that function pointer exists prior
to accessing it.

This is much safer than checking the uverbs_ex_mask variable, especially
since we know that test isn't working properly and will be removed
in -next.

This prevents a user-triggerable oops.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

21885586
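
A minimal sketch of the pattern this commit describes, checking that an optional callback exists before calling it, might look as follows. The `device_ops` struct, the `create_flow` callback, and the choice of `-EOPNOTSUPP` as the error code are assumptions made for illustration; the commit message does not name them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device with an optional driver callback. */
struct device_ops {
    int (*create_flow)(int flow_id);   /* NULL if the driver lacks support */
};

/* Sample driver implementation used to exercise the check. */
static int sample_create_flow(int flow_id)
{
    return flow_id >= 0 ? 0 : -EINVAL;
}

/* Verify the callback exists before calling it, instead of trusting a
 * separately maintained capability mask. */
static int do_create_flow(const struct device_ops *ops, int flow_id)
{
    if (!ops->create_flow)
        return -EOPNOTSUPP;
    return ops->create_flow(flow_id);
}
```

Testing the pointer itself cannot drift out of sync with what the driver provides, which is the failure mode a separate mask is exposed to.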

30 Jan, 2018 2 commits

RDMA/core: Add resource tracking for create and destroy QPs · 78a0cd64

Authored by Leon Romanovsky on Jan 28, 2018

Track create and destroy operations of QP objects.
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

78a0cd64

RDMA/restrack: Add general infrastructure to track RDMA resources · 02d8883f

Authored by Leon Romanovsky on Jan 28, 2018

The RDMA subsystem has a very strict set of objects to work with, but it
completely lacks tracking facilities and has no visibility of resource
utilization.

The following patch adds such infrastructure to keep track of RDMA
resources to help with debugging of user space applications. The primary
user of this infrastructure is RDMA nldev netlink (following patches), to
be exposed to userspace via rdmatool, but it is not limited to that.

At this stage, the main three objects (PD, CQ and QP) are added, and more
will be added later.
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

02d8883f

11 Jan, 2018 1 commit

IB/core: Increase number of char device minors · 8cf12d77

Authored by Huy Nguyen on Jan 08, 2018

There is a need to increase the number of possible char devices to support
a large number of SR-IOV instances. The current limit is in the range of
64-128 devices/ports. Increase it to support up to 1024.

The patch performs the following steps to refactor the code:
1. Removes the split bitmap for fixed and overflow dev numbers.
2. Pre-allocates the non-legacy major number range during driver
   initialization, chosen for simplicity.
3. Adds a new define (RDMA_MAX_PORTS) that is shared between all drivers.
   This is the maximum total number of ports on all struct ib_devices.
4. Sets RDMA_MAX_PORTS to 1024.
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

8cf12d77
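
To illustrate step 1 of the commit above, here is a minimal user-space sketch of a single bitmap allocator covering all RDMA_MAX_PORTS minors, with no split between fixed and overflow ranges. Only the value 1024 comes from the commit message; the names `dev_map`, `alloc_minor()` and `free_minor()` are hypothetical, and the kernel uses its own bitmap helpers rather than this open-coded loop.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define RDMA_MAX_PORTS 1024u   /* value stated in the commit message */
#define BITS_PER_WORD  ((unsigned)(sizeof(unsigned long) * CHAR_BIT))
#define MAP_WORDS      ((RDMA_MAX_PORTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bitmap for every minor; no fixed/overflow split. */
static unsigned long dev_map[MAP_WORDS];

/* Claim the lowest free minor; returns its number, or -1 if all are in use. */
static int alloc_minor(void)
{
    for (unsigned i = 0; i < RDMA_MAX_PORTS; i++) {
        unsigned long bit = 1UL << (i % BITS_PER_WORD);

        if (!(dev_map[i / BITS_PER_WORD] & bit)) {
            dev_map[i / BITS_PER_WORD] |= bit;
            return (int)i;
        }
    }
    return -1;
}

/* Release a minor previously returned by alloc_minor(). */
static void free_minor(int minor)
{
    unsigned m = (unsigned)minor;

    dev_map[m / BITS_PER_WORD] &= ~(1UL << (m % BITS_PER_WORD));
}
```

With a single map, the allocator has one code path regardless of how many devices are present, which is the simplification the commit aims for.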

09 Jan, 2018 2 commits

{net, IB}/mlx5: Manage port association for multiport RoCE · 32f69e4b

Authored by Daniel Jurgens on Jan 04, 2018

When mlx5_ib_add is called determine if the mlx5 core device being
added is capable of dual port RoCE operation. If it is, determine
whether it is a master device or a slave device using the
num_vhca_ports and affiliate_nic_vport_criteria capabilities.

If the device is a slave, attempt to find a master device to affiliate it
with. Devices that can be affiliated will share a system image guid. If
none are found place it on a list of unaffiliated ports. If a master is
found bind the port to it by configuring the port affiliation in the NIC
vport context.

Similarly when mlx5_ib_remove is called determine the port type. If it's
a slave port, unaffiliate it from the master device, otherwise just
remove it from the unaffiliated port list.

The IB device is registered as a multiport device, even if a 2nd port is
not available for affiliation. When the 2nd port is affiliated later the
GID cache must be refreshed in order to get the default GIDs for the 2nd
port in the cache. Export roce_rescan_device to provide a mechanism to
refresh the cache after a new port is bound.

In a multiport configuration all IB object (QP, MR, PD, etc.) related
commands should flow through the master mlx5_core_dev; other commands
must be sent to the slave port mlx5_core_mdev. An interface is provided
to get the correct mdev for non-IB object commands.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

32f69e4b

IB/core: Change roce_rescan_device to return void · 908d6460

Authored by Daniel Jurgens on Jan 04, 2018

It always returns 0. Change return type to void.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

908d6460

03 Jan, 2018 1 commit

RDMA/netlink: Fix locking around __ib_get_device_by_index · f8978bd9

Authored by Leon Romanovsky on Jan 01, 2018

Holding locks is mandatory when calling __ib_device_get_by_index,
otherwise there are races during the list iteration with device removal.

Since the locks are static to device.c, __ib_device_get_by_index can
never be called correctly by any user outside the file.

Make the function static and provide a safe function that gets the
correct locks and returns a kref'd pointer. Fix all callers.

Fixes: e5c9469e ("RDMA/netlink: Add nldev device doit implementation")
Fixes: c3f66f7b ("RDMA/netlink: Implement nldev port doit callback")
Fixes: 7d02f605 ("RDMA/netlink: Add nldev port dumpit implementation")
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

f8978bd9
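
The shape of the fix in the commit above can be sketched in user space: an unlocked lookup is kept private to the file, and callers get a wrapper that takes the lock and pins the object before the lock is released. This is only an approximation under stated assumptions: a pthread mutex and a plain integer stand in for the kernel's locks and kref, and all names below are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct device {
    unsigned int index;
    int refcount;              /* simplified stand-in for a kref */
    struct device *next;
};

/* The list and its lock are private to this file. */
static struct device *device_list;
static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;

/* Unsafe on its own: the caller must hold device_lock, so it stays static. */
static struct device *get_by_index_locked(unsigned int index)
{
    for (struct device *d = device_list; d; d = d->next)
        if (d->index == index)
            return d;
    return NULL;
}

/* Safe entry point: takes the lock and pins the device before unlocking. */
struct device *device_get_by_index(unsigned int index)
{
    struct device *d;

    pthread_mutex_lock(&device_lock);
    d = get_by_index_locked(index);
    if (d)
        d->refcount++;
    pthread_mutex_unlock(&device_lock);
    return d;
}

/* Callers drop their reference when they are done with the device. */
void device_put(struct device *d)
{
    pthread_mutex_lock(&device_lock);
    d->refcount--;
    pthread_mutex_unlock(&device_lock);
}
```

Because the reference is taken while the lock is still held, a concurrent removal cannot free the device between the lookup and the caller's first use.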

28 Dec, 2017 1 commit

infiniband: drop unknown function from core_priv.h · efac5ac0

Authored by Randy Dunlap on Dec 27, 2017

Delete ibnl_chk_listeners() and its kernel-doc comments from the
core_priv.h header file.  There is no such function.

Fixes: 233c1955 ("RDMA/netlink: Reduce exposure of RDMA netlink functions")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

efac5ac0

19 Dec, 2017 1 commit

IB/core: Avoid exporting module internal function · df8441c6

Authored by Parav Pandit on Nov 14, 2017

ib_security_modify_qp and ib_security_pkey_access are core internal
functions. So avoid exporting them.
ib_security_pkey_access is used only when security hooks are enabled so
avoid defining it otherwise.
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

df8441c6

10 Aug, 2017 4 commits

RDMA/netlink: Convert LS to doit callback · 647c75ac

Authored by Leon Romanovsky on Jun 15, 2017

The RDMA_NL_LS protocol does not actually dump anything, but sets data,
so it should be handled by a doit callback.

This patch actually converts RDMA_NL_LS to doit callback, while
preserving IWCM and RDMA_CM flows through netlink_dump_start().
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>

647c75ac

RDMA/core: Add and expose static device index · ecc82c53

Authored by Leon Romanovsky on Jun 18, 2017

This patch adds a static device index, similar to the one already
available in the netdev world (struct net->ifindex).

In downstream patches, the RDMA netlink will use this idx-to-ib_device
conversion, so as part of this commit, we are exposing the translation
function to be visible for IB/core users.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>

ecc82c53

RDMA/core: Add iterator over ib_devices · 8030c835

Authored by Leon Romanovsky on Jun 19, 2017

The coming nldev needs to iterate over all IB devices in the system
and in order to not expose the ib_devices list outside device.c,
it is necessary to provide an iterator function.

The current version is written explicitly for the nldev callback to avoid
over-engineering at this stage, but it can be easily extended for
other types.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>

8030c835
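
A callback-style iterator of the kind described in the commit above can be sketched as follows. The list stays private to the file and callers only supply a function to run per device. The names `ib_dev`, `for_each_dev()` and `count_cb()` are hypothetical, and locking around the walk is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

struct ib_dev {
    const char *name;
    struct ib_dev *next;
};

/* The device list stays private to this file. */
static struct ib_dev *dev_list;

typedef int (*dev_cb)(struct ib_dev *dev, void *data);

/* Call cb for every device; stop early and return its value if non-zero. */
static int for_each_dev(dev_cb cb, void *data)
{
    for (struct ib_dev *d = dev_list; d; d = d->next) {
        int ret = cb(d, data);

        if (ret)
            return ret;
    }
    return 0;
}

/* Example callback: count the devices visited. */
static int count_cb(struct ib_dev *dev, void *data)
{
    (void)dev;
    (*(int *)data)++;
    return 0;
}
```

A caller that wants the number of devices passes `count_cb` and a pointer to an `int`; it never sees the list head or the link pointers.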

RDMA/netlink: Remove netlink clients infrastructure · c9901724

Authored by Leon Romanovsky on Jun 05, 2017

RDMA netlink has a complicated infrastructure for dynamically
registering and de-registering netlink clients to the NETLINK_RDMA
group. The complicated portion of this code is not widely used because
2 of the 3 current clients are statically compiled together with
netlink.c. The infrastructure, therefore, is deemed overkill.

Refactor the code to eliminate the dynamically added clients. Now all
clients are pre-registered in a client array at compile time, and at run
time they merely check-in with the infrastructure to pass their callback
table for inclusion in the pre-sized client array.

This also allows for future cleanups and removal of unneeded code in the
iwcm* netlink handler.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Chien Tin Tung <chien.tin.tung@intel.com>

c9901724
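
The design described in the commit above, a client array sized at compile time into which each client merely "checks in" its callback table, might be sketched like this. The client identifiers, the `nl_handler` type and all function names are hypothetical placeholders, not the kernel's actual API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical client identifiers; the array is sized at compile time. */
enum nl_client { NL_CLIENT_A, NL_CLIENT_B, NL_CLIENT_C, NL_NUM_CLIENTS };

typedef int (*nl_handler)(int op);

struct nl_client_entry {
    const nl_handler *cb_table;   /* NULL until the client checks in */
    int nops;
};

static struct nl_client_entry clients[NL_NUM_CLIENTS];

/* A client checks in by handing over its callback table. */
static void nl_register(enum nl_client id, const nl_handler *cb_table, int nops)
{
    clients[id].cb_table = cb_table;
    clients[id].nops = nops;
}

static void nl_unregister(enum nl_client id)
{
    clients[id].cb_table = NULL;
    clients[id].nops = 0;
}

/* Dispatch an op; returns -1 for unknown or unregistered clients and ops. */
static int nl_dispatch(int id, int op)
{
    if (id < 0 || id >= NL_NUM_CLIENTS || !clients[id].cb_table)
        return -1;
    if (op < 0 || op >= clients[id].nops || !clients[id].cb_table[op])
        return -1;
    return clients[id].cb_table[op](op);
}

/* Sample handler used to exercise the dispatcher. */
static int echo_op(int op)
{
    return op + 100;
}
```

Since the slots already exist, registration is a plain assignment and no list management or allocation is needed at run time.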

09 Aug, 2017 1 commit

IB/core: Change port_attr.lid size from 16 to 32 bits · 582faf31

Authored by Dasaratharaman Chandramouli on Jun 08, 2017

The lid field in struct ib_port_attr is increased to 32 bits. This enables
core components to use larger LIDs if needed.
The user ABI is unchanged and returns 16-bit values when queried.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

582faf31
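
The split between a wider internal field and an unchanged 16-bit user ABI, as described in the commit above, can be illustrated with a small sketch. The structs are trimmed-down, hypothetical stand-ins, and plain truncation of larger values is an assumption made here; the commit message does not say how the kernel treats LIDs that do not fit in 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down structures. */
struct port_attr      { uint32_t lid; };   /* in-kernel attribute, 32 bits */
struct port_attr_resp { uint16_t lid; };   /* user-visible response, 16 bits */

/* Fill the user response; only the low 16 bits of the LID are reported
 * (assumed behaviour for illustration). */
static void fill_port_resp(struct port_attr_resp *resp,
                           const struct port_attr *attr)
{
    resp->lid = (uint16_t)attr->lid;
}
```

Keeping the response struct at 16 bits means existing user-space binaries see the same layout as before, while in-kernel code can carry the wider value.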

02 Jun, 2017 1 commit

RDMA/netlink: Reduce exposure of RDMA netlink functions · 233c1955

Authored by Leon Romanovsky on May 14, 2017

RDMA netlink is part of ib_core, hence ibnl_chk_listeners(),
ibnl_init() and ibnl_cleanup() don't need to be published
in a public header file.

Let's remove EXPORT_SYMBOL from ibnl_chk_listeners() and move all these
functions to a private header file.

CC: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

233c1955

24 May, 2017 2 commits

IB/core: Enforce security on management datagrams · 47a2b338

Authored by Daniel Jurgens on May 19, 2017

Allocate and free a security context when creating and destroying a MAD
agent.  This context is used for controlling access to PKeys and sending
and receiving SMPs.

When sending or receiving a MAD check that the agent has permission to
access the PKey for the Subnet Prefix of the port.

During MAD and snoop agent registration for SMI QPs check that the
calling process has permission to manage the subnet and register a
callback with the LSM to be notified of policy changes. When
notification of a policy change occurs recheck permission and set a flag
indicating sending and receiving SMPs is allowed.

When sending and receiving MADs check that the agent has access to the
SMI if it's on an SMI QP.  Because security policy can change it's
possible permission was allowed when creating the agent, but no longer
is.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
[PM: remove the LSM hook init code]
Signed-off-by: Paul Moore <paul@paul-moore.com>

47a2b338

IB/core: Enforce PKey security on QPs · d291f1a6

Authored by Daniel Jurgens on May 19, 2017

Add new LSM hooks to allocate and free security contexts and check for
permission to access a PKey.

Allocate and free a security context when creating and destroying a QP.
This context is used for controlling access to PKeys.

When a request is made to modify a QP that changes the port, PKey index,
or alternate path, check that the QP has permission for the PKey in the
PKey table index on the subnet prefix of the port. If the QP is shared
make sure all handles to the QP also have access.

Store which port and PKey index a QP is using. After the reset to init
transition the user can modify the port, PKey index and alternate path
independently. So port and PKey settings changes can be a merge of the
previous settings and the new ones.

In order to maintain access control if there is a PKey table or subnet
prefix change, keep a list of all QPs that are using each PKey index on
each port. If a change occurs all QPs using that device and port must
have access enforced for the new cache settings.

These changes add a transaction to the QP modify process. Association
with the old port and PKey index must be maintained if the modify fails,
and must be removed if it succeeds. Association with the new port and
PKey index must be established prior to the modify and removed if the
modify fails.

1. When a QP is modified to a particular Port, PKey index or alternate
   path insert that QP into the appropriate lists.

2. Check permission to access the new settings.

3. If step 2 grants access attempt to modify the QP.

4a. If steps 2 and 3 succeed remove any prior associations.

4b. If either fails remove the new setting associations.

If a PKey table or subnet prefix changes walk the list of QPs and
check that they have permission. If not send the QP to the error state
and raise a fatal error event. If it's a shared QP make sure all the
QPs that share the real_qp have permission as well. If the QP that
owns a security structure is denied access the security structure is
marked as such and the QP is added to an error_list. Once moving the
QP to the error state is complete the security structure mark is cleared.

Maintaining the lists correctly turns QP destroy into a transaction.
The hardware driver for the device frees the ib_qp structure, so while
the destroy is in progress the ib_qp pointer in the ib_qp_security
struct is undefined. When the destroy process begins the ib_qp_security
structure is marked as destroying. This prevents any action from being
taken on the QP pointer. After the QP is destroyed successfully it
could still be listed on an error_list; wait for it to be processed by
that flow before cleaning up the structure.

If the destroy fails the QP's port and PKey settings are reinserted into
the appropriate lists, the destroying flag is cleared, and access control
is enforced, in case there were any cache changes during the destroy
flow.

To keep the security changes isolated a new file is used to hold security
related functionality.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
[PM: merge fixup in ib_verbs.h and uverbs_cmd.c]
Signed-off-by: Paul Moore <paul@paul-moore.com>

d291f1a6
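
The modify transaction listed as steps 1 to 4b in the commit above (d291f1a6) can be sketched in a heavily simplified form. Per-port, per-PKey QP lists are reduced here to a counter per PKey index, and the permission check and driver call are passed in as hypothetical hooks. None of the names below come from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PKEY_INDEX 8

/* QPs associated with each PKey index (stand-in for the per-port lists). */
static int qp_count[MAX_PKEY_INDEX];

struct qp_sec {
    int pkey_index;            /* current association, -1 if none */
};

/* Hypothetical hooks: a permission check and the driver's modify call. */
typedef bool (*access_fn)(int pkey_index);
typedef int  (*modify_fn)(int pkey_index);

static int modify_qp_checked(struct qp_sec *sec, int new_index,
                             access_fn allowed, modify_fn modify)
{
    int ret;

    qp_count[new_index]++;                 /* 1. associate with new settings */

    if (!allowed(new_index))               /* 2. check permission */
        ret = -1;
    else
        ret = modify(new_index);           /* 3. attempt the modify */

    if (ret == 0) {                        /* 4a. drop the prior association */
        if (sec->pkey_index >= 0)
            qp_count[sec->pkey_index]--;
        sec->pkey_index = new_index;
    } else {                               /* 4b. drop the new association */
        qp_count[new_index]--;
    }
    return ret;
}

/* Sample hooks used to exercise the transaction. */
static bool allow_even(int i) { return i % 2 == 0; }
static int modify_ok(int i)   { (void)i; return 0; }
static int modify_fail(int i) { (void)i; return -5; }
```

Associating with the new settings before the check means a concurrent cache change would already see the QP on the new list, which is why the failure path has to undo that association.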

23 May, 2017 1 commit

IB/core: IB cache enhancements to support Infiniband security · 883c71fe

Authored by Daniel Jurgens on May 19, 2017

Cache the subnet prefix and add a function to access it. Enforcing
security requires frequent queries of the subnet prefix and the pkeys in
the pkey table.
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>

883c71fe

15 Feb, 2017 1 commit

IB/cma: Add default RoCE TOS to CMA configfs · 89052d78

Authored by Majd Dibbiny on Feb 14, 2017

Add new entry to the RDMA-CM configfs that allows users
to select default TOS for RDMA-CM QPs.

This is useful for users that want to control the TOS for legacy
applications without changing their code.

Applications that set the TOS explicitly using the rdma_set_option
API will continue to work as expected, meaning they override the
configfs value.

CC: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

89052d78

11 Jan, 2017 1 commit

IB/core: added support to use rdma cgroup controller · 43579b5f

Authored by Parav Pandit on Jan 10, 2017

Added support APIs for IB core to register/unregister every IB/RDMA
device with rdma cgroup for tracking rdma resources.
IB core registers with rdma cgroup controller.
Added support APIs for uverbs layer to make use of rdma controller.
Added uverbs layer to perform resource charge/uncharge functionality.
Added support during query_device uverb operation to ensure it
returns resource limits by honoring rdma cgroup configured limits.
Signed-off-by: Parav Pandit <pandit.parav@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

43579b5f

14 Dec, 2016 1 commit

IB/core: Change ib_resolve_eth_dmac to use it in create AH · c90ea9d8

Authored by Moni Shoua on Nov 23, 2016

The function ib_resolve_eth_dmac() requires struct qp_attr * and
qp_attr_mask as parameters while the function might be useful to resolve
dmac for address handles. This patch changes the signature of the
function so it can be used in the flow of creating an address handle.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

c90ea9d8

18 Oct, 2016 1 commit

IB/core: Flip to the new dev walk API · 453d3932

Authored by David Ahern on Oct 17, 2016

Convert rdma_is_upper_dev_rcu, handle_netdev_upper and
ipoib_get_net_dev_match_addr to the new upper device walk API.
This is just a code conversion; no functional change is intended.

v2
- removed typecast of data
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

453d3932

25 May, 2016 5 commits

IB/core: Add IP to GID netlink offload · ae43f828

Authored by Mark Bloch on May 19, 2016

There is an assumption that rdmacm is used only between nodes
in the same IB subnet; this is why ARP resolution can be used to turn
IP to GID in rdmacm.

When dealing with IB communication between subnets this assumption
is no longer valid. ARP resolution will get us the next hop device
address and not the peer node's device address.

To solve this issue, we ask user space whether it can provide the
GID of the peer node, and fail if not.

We add a sequence number to identify each request and fill in the GID
upon answer from userspace.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

ae43f828

IB/core: Register SA ibnl client during ib_core initialization · 735c631a

Authored by Mark Bloch on May 19, 2016

Move SA ibnl client registration to ib_core module init.
This will allow us to register a single client to handle
all RDMA_NL_LS operations and make it SA independent.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

735c631a

IB/SA: Integrate ib_sa module into ib_core module · c2e49c92

Authored by Mark Bloch on May 19, 2016

Consolidate ib_sa into ib_core. This commit eliminates
ib_sa.ko and makes it part of ib_core.ko.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

c2e49c92

IB/MAD: Integrate ib_mad module into ib_core module · 4c2cb422

Authored by Mark Bloch on May 19, 2016

Consolidate ib_mad into ib_core. This commit eliminates
ib_mad.ko and makes it part of ib_core.ko.
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

4c2cb422

IB/core: Integrate IB address resolution module into core · e3f20f02

Authored by Leon Romanovsky on May 19, 2016

IB address resolution is declared as a module (ib_addr.ko) which loads
itself before IB core module (ib_core.ko).

This leads to a scenario where IB netlink, which is initialized by IB
core, can't be used by ib_addr.ko.

In order to solve it, we are converting ib_addr.ko to be part of
IB core module.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

e3f20f02

23 Dec, 2015 4 commits

IB/cma: Add configfs for rdma_cm · 045959db

Authored by Matan Barak on Dec 23, 2015

Users would like to control the behaviour of rdma_cm.
For example, old applications which don't set the
required RoCE gid type could be executed on RoCE V2
network types. In order to support this configuration,
we implement a configfs for rdma_cm.

In order to use the configfs, one needs to mount it and
mkdir <IB device name> inside rdma_cm directory.

The patch adds support for a single configuration file,
default_roce_mode. The mode can either be "IB/RoCE v1" or
"RoCE v2".
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

045959db

IB/rdma_cm: Add wrapper for cma reference count · 218a773f

Authored by Matan Barak on Dec 23, 2015

Currently, cma users can't increase or decrease the cma reference
count. This is necessary when setting cma attributes (like the
default GID type) in order to avoid use-after-free errors.
Add the cma_ref_dev and cma_deref_dev APIs.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

218a773f

IB/core: Move rdma_is_upper_dev_rcu to header file · 6020d7e5

Authored by Matan Barak on Dec 23, 2015

In order to validate the route, we need an easy way to check if a
net-device belongs to our RDMA device. Move this helper function
to a header file in order to make this check easier.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

6020d7e5

IB/core: Add gid_type to gid attribute · b39ffa1d

Authored by Matan Barak on Dec 23, 2015

In order to support multiple GID types, we need to store the gid_type
with each GID. This is also aligned with the RoCE v2 annex "RoCEv2 PORT
GID table entries shall have a "GID type" attribute that denotes the L3
Address type". The currently supported GID is IB_GID_TYPE_IB which is
also RoCE v1 GID type.

This implies that gid_type should be added to roce_gid_table meta-data.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

b39ffa1d

22 Oct, 2015 2 commits

IB/core: Use GID table in AH creation and dmac resolution · dbf727de

Authored by Matan Barak on Oct 15, 2015

Previously, vlan id and source MAC were used from QP attributes. Since
the net device is now stored in the GID attributes, they could be used
instead of getting this information from the QP attributes.

IB_QP_SMAC, IB_QP_ALT_SMAC, IB_QP_VID and IB_QP_ALT_VID were removed
because there is no known libibverbs that uses them.

This commit also modifies the vendors (mlx4, ocrdma) drivers in order
to use the new approach.

ocrdma driver changes were done by Somnath Kotur <Somnath.Kotur@Avagotech.Com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

dbf727de

IB/core: Expose and rename ib_find_cached_gid_by_port cache API · d300ec52

Authored by Matan Barak on Oct 15, 2015

Sometimes consumers might want to search for a GID in a specific port.
For example, when a WC arrives and we want to search the GID
that matches that port - it's better to search only the relevant
port.
Expose and rename ib_cache_gid_find_by_port in order to match
the naming convention of the module.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

d300ec52

31 Aug, 2015 2 commits

IB/core: Add RoCE GID table management · 03db3a2d

Authored by Matan Barak on Jul 30, 2015

RoCE GIDs are based on IP addresses configured on Ethernet net-devices
which relate to the RDMA (RoCE) device port.

Currently, each of the low-level drivers that support RoCE (ocrdma,
mlx4) manages its own RoCE port GID table. As there's nothing which is
essentially vendor specific, we generalize that, and enhance the RDMA
core GID cache to do this job.

In order to populate the GID table, we listen for events:

(a) netdev up/down/change_addr events - if a netdev is built onto
    our RoCE device, we need to add/delete its IPs. This involves
    adding all GIDs related to this ndev, add default GIDs, etc.

(b) inet events - add new GIDs (according to the IP addresses)
    to the table.

For programming the port RoCE GID table, providers must implement
the add_gid and del_gid callbacks.

RoCE GID management requires us to state the associated net_device
alongside the GID. This information is necessary in order to manage
the GID table. For example, when a net_device is removed, its
associated GIDs need to be removed as well.

RoCE mandates generating a default GID for each port, based on the
related net-device's IPv6 link local. In contrast to the GID based on
the regular IPv6 link-local (as we generate GID per IP address),
the default GID is also available when the net device is down (in
order to support loopback).

Locking is done as follows:
The patch modifies the GID table code both for new RoCE drivers
implementing the add_gid/del_gid callbacks and for current RoCE and
IB drivers that do not. The flows for updating the table are
different, so the locking requirements are too.

While updating RoCE GID table, protection against multiple writers is
achieved via mutex_lock(&table->lock). Since writing to a table
requires us to find an entry (possibly a free entry) in the table and
then modify it, this mutex protects both the find_gid and write_gid
ensuring the atomicity of the action.
Each entry in the GID cache is protected by rwlock. In RoCE, writing
(usually results from netdev notifier) involves invoking the vendor's
add_gid and del_gid callbacks, which could sleep.
Therefore, an invalid flag is added for each entry. Updates for RoCE are
done via a workqueue, thus sleeping is permitted.

In IB, updates are done in write_lock_irq(&device->cache.lock), thus
write_gid isn't allowed to sleep and add_gid/del_gid are not called.

When passing net-device into/out-of the GID cache, the device
is always passed held (dev_hold).

The code uses a single work item for updating all RDMA devices,
following a netdev or inet notifier.

The patch moves the cache from being a client (which was incorrect,
as the cache is part of the IB infrastructure) to being explicitly
initialized/freed when a device is registered/removed.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

03db3a2d

IB/core: Make ib_alloc_device init the kobject · 55aeed06

Authored by Jason Gunthorpe on Aug 04, 2015

This gets rid of the weird in-between state where struct ib_device
was allocated but the kobject didn't work.

Consequently ib_device_release is now guaranteed to be called in
all situations and we needn't duplicate its kfrees on error paths.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

55aeed06

20 Jan, 2014 1 commit

IB/core: Resolve Ethernet L2 addresses when modifying QP · ed4c54e5

Authored by Or Gerlitz on Dec 12, 2013

Existing user space applications provide only IBoE L3 address
attributes to the kernel when they issue a modify QP.  To work
with them and let such apps (plus kernel consumers which don't use the
RDMA-CM) keep working transparently under the IBoE GID IP addressing
changes, add an Eth L2 address resolution helper.
Signed-off-by: Moni Shoua <monis@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

ed4c54e5
