Commits · 7fd8aefb7ce202dd9d97f752bf249be6215f1004 · openeuler / Kernel

09 Jan, 2018 · 1 commit

IB/core: Introduce driver QP type · 8011c1e3

Committed by Moni Shoua on Jan 02, 2018

Vendors can implement types of QPs that are not described in the
InfiniBand specification. To still be able to use the IB/core layer
services (e.g. user object management) without tainting this layer with
driver proprietary logic, a new QP type is added - IB_QPT_DRIVER. This
will be a general QP type whose true nature the core layer doesn't know.
When a command like create_qp() is passed to a hardware driver, the extra
data that is required is taken from the driver channel.
Downstream patches from this series will use that QP type in the mlx5
driver.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

8011c1e3

19 Dec, 2017 · 2 commits

IB/{core, umad, cm}: Rename ib_init_ah_from_wc to ib_init_ah_attr_from_wc · f6bdb142

Committed by Parav Pandit on Nov 14, 2017

Currently ib_init_ah_from_wc initializes address handle attributes and
not the address handle object itself.
To avoid confusion between ah_attr and ah, ib_init_ah_from_wc is
renamed to ib_init_ah_attr_from_wc to reflect that it initializes
ah_attr.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

f6bdb142

IB/{core, ipoib}: Simplify ib_find_gid to search only for IB link layer · dbb12562

Committed by Parav Pandit on Nov 14, 2017

Currently there are no users of ib_find_gid for RoCE transport. It is
only used by IPoIB.
Therefore it is simplified to ignore RoCE ports and the GID type check,
which was previously done for every port.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

dbb12562

14 Nov, 2017 · 3 commits

RDMA/core: Rename kernel modify_cq to better describe its usage · 4190b4e9

Committed by Leon Romanovsky on Nov 13, 2017

Currently ib_modify_cq() is used to set CQ moderation parameters.

This patch renames ib_modify_cq() to rdma_set_cq_moderation(),
because the kernel version of the RDMA API doesn't need to follow the
pattern already exposed in the user API
(create_XXX/modify_XXX/query_XXX/destroy_XXX), and it is better to have
a more accurate name that describes the actual usage.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

4190b4e9

IB/uverbs: Add CQ moderation capability to query_device · 18bd9072

Committed by Yonatan Cohen on Nov 13, 2017

The query_device function can now obtain the maximum values for
cq_max_count and cq_period, needed for CQ moderation.
cq_max_count is a 16-bit number that determines the number
of CQEs to accumulate before generating an event.
cq_period is a 16-bit number that determines the timeout in
microseconds from the last event generated, upon which a new event will
be generated even if cq_max_count was not reached.

Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

18bd9072

IB/uverbs: Allow CQ moderation with modify CQ · 869ddcf8

Committed by Yonatan Cohen on Nov 13, 2017

Add uverbs support in modify_cq, for CQ moderation only.
This gives the ability to change cq_max_count and cq_period.
CQ moderation enhances performance by moderating the number
of CQEs needed to create an event, instead of the application
having to suffer from an event per CQE.
To achieve CQ moderation the application needs to set cq_max_count
and cq_period.
cq_max_count - defines the number of CQEs needed to create an event.
cq_period - defines the timeout (microseconds) between the last
            event and a new one that will occur even if
            cq_max_count was not satisfied.

Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

869ddcf8
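
The interplay of cq_max_count and cq_period described in 869ddcf8 can be illustrated with a small userland model. This is only a sketch: `struct cq_moderation` and `cq_add_cqe()` are hypothetical names, not kernel or driver symbols, and the model evaluates the timeout only when a CQE arrives, whereas real hardware would presumably use a timer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one CQ's moderation state; not a kernel structure. */
struct cq_moderation {
	uint16_t cq_max_count;  /* CQEs to accumulate before an event */
	uint16_t cq_period;     /* timeout in microseconds since the last event */
	uint32_t pending_cqes;  /* CQEs accumulated since the last event */
	uint64_t last_event_us; /* timestamp of the last generated event */
};

/* Record one new CQE at time now_us; return true if an event should fire. */
bool cq_add_cqe(struct cq_moderation *m, uint64_t now_us)
{
	m->pending_cqes++;
	if (m->pending_cqes >= m->cq_max_count ||
	    now_us - m->last_event_us >= m->cq_period) {
		/* Either threshold was reached: fire and restart accumulation. */
		m->pending_cqes = 0;
		m->last_event_us = now_us;
		return true;
	}
	return false;
}
```

With cq_max_count = 3 and cq_period = 100, the third CQE in a burst fires an event, and so does a lone CQE that arrives more than 100 microseconds after the previous event.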

11 Nov, 2017 · 1 commit

IB/core: Add PCI write end padding flags for WQ and QP · e1d2e887

Committed by Noa Osherovich on Oct 29, 2017

There are root complexes that are able to optimize their
performance when incoming data is a multiple of full cache lines.

PCI write end padding is the device's ability to pad the ending of
incoming packets (scatter) to full cache line such that the last
upstream write generated by an incoming packet will be a full cache
line.

Add a relevant entry to ib_device_cap_flags to report such capability
of an RDMA device.

Add the QP and WQ create flags:
 * A QP/WQ created with a scatter end padding flag will cause
   HW to pad the last upstream write generated by a packet to cache line.

Users should consider several factors before activating this feature:
- In case of high CPU memory load (which may cause PCI back pressure in
  turn), if a large percent of the writes are partial cache line, this
  feature should be checked as an optional solution.
- This feature might reduce performance if most packets are between one
  and two cache lines and PCIe throughput has reached its maximum
  capacity. E.g. 65B packet from the network port will lead to 128B
  write on PCIe, which may cause traffic on PCIe to reach high
  throughput.

Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

e1d2e887
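
The 65B-to-128B example in e1d2e887 is plain round-up arithmetic. A minimal sketch, assuming a 64-byte cache line (the commit message does not state the line size) and using a hypothetical helper name:

```c
#include <stddef.h>

/*
 * Round a write of len bytes up to a whole number of cache lines.
 * padded_write_len() is an illustrative helper, not a kernel function.
 */
size_t padded_write_len(size_t len, size_t cache_line)
{
	return (len + cache_line - 1) / cache_line * cache_line;
}
```

With a 64-byte line, a 65-byte packet is padded to 128 bytes, which is the near-doubling of PCIe traffic the commit message warns about for packets between one and two cache lines.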

19 Oct, 2017 · 2 commits

IB: Let ib_core resolve destination mac address · c0348eb0

Committed by Parav Pandit on Oct 16, 2017

Since IB/core resolves the destination mac address for user and kernel
consumers, avoid resolving it in multiple provider drivers.

Only ib_core resolves the DMAC now, therefore resolve_eth_dmac is removed
as an exported symbol.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

c0348eb0

IB/core: Introduce and use rdma_create_user_ah · 5cda6587

Committed by Parav Pandit on Oct 16, 2017

Introduce the rdma_create_user_ah API, which allows passing udata to the
provider driver and additionally resolves the DMAC for RoCE.

ib_resolve_eth_dmac() resolves the destination mac address for unicast,
multicast, link local ipv4 mapped ipv6 and ipv6 destination gid entries.
This allows all RoCE provider drivers to avoid duplicating such code.

This change brings consistency: IB core always resolves the dmac and passes
it to RoCE provider drivers for user and kernel consumers, so
ah_attr->roce.dmac is always an input field for provider drivers.

This uniformity avoids exporting the ib_resolve_eth_dmac symbol to providers
or other modules. Therefore it is removed as an exported symbol later in
the patch series.

Now uverbs and umad both make use of the rdma_create_user_ah API, which
fixes the issue where umad has an invalid DMAC for the address.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

5cda6587

25 Sep, 2017 · 2 commits

IB: Correct MR length field to be 64-bit · edd31551

Committed by Parav Pandit on Sep 24, 2017

The ib_mr->length represents the length of the MR in bytes as per
the IBTA spec 1.3 section 11.2.10.3 (REGISTER PHYSICAL MEMORY REGION).

Currently the ib_mr->length field is defined as only a 32-bit field.
This might result in truncation and failed WRs for consumers who
register memory regions larger than 4GB and whose WRs access
such MRs.

This patch makes the length 64-bit to avoid such truncation.

Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Faisal Latif <faisal.latif@intel.com>
Fixes: 4c67e2bf ("IB/core: Introduce new fast registration API")
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

edd31551
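
The truncation fixed by edd31551 can be reproduced in userland. The sketch below uses two hypothetical stand-in structs (`mr_len32`, `mr_len64`) rather than the real `struct ib_mr`, and shows what happens to a length above 4GB when it is stored in a 32-bit field.

```c
#include <stdint.h>

/* Hypothetical pre-fix layout: the length is stored in 32 bits. */
struct mr_len32 { uint32_t length; };
/* Hypothetical post-fix layout: the length is stored in 64 bits. */
struct mr_len64 { uint64_t length; };

/* Store bytes in the 32-bit field and read it back; high bits are dropped. */
uint64_t stored_len32(uint64_t bytes)
{
	struct mr_len32 mr = { .length = (uint32_t)bytes };
	return mr.length;
}

/* Store bytes in the 64-bit field and read it back unchanged. */
uint64_t stored_len64(uint64_t bytes)
{
	struct mr_len64 mr = { .length = bytes };
	return mr.length;
}
```

A 5 GiB registration (`5ULL << 30`) reads back as 1 GiB from the 32-bit field, so a work request touching the upper 4 GiB would look out of bounds; the 64-bit field preserves the full length.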

IB/core: Fix typo in the name of the tag-matching cap struct · 78b1beb0

Committed by Leon Romanovsky on Sep 24, 2017

The tag matching functionality is implemented by mlx5 driver
by extending XRQ, however this internal kernel information was
exposed to user space applications with *xrq* name instead of *tm*.

This patch renames *xrq* to *tm* to handle that.

Fixes: 8d50505a ("IB/uverbs: Expose XRQ capabilities")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

78b1beb0

09 Sep, 2017 · 1 commit

lib/interval_tree: fast overlap detection · f808c13f

Committed by Davidlohr Bueso on Sep 08, 2017

Allow interval trees to quickly check for overlaps to avoid unnecessary
tree lookups in interval_tree_iter_first().

As of this patch, all interval tree flavors will require using a
'rb_root_cached' such that we can have the leftmost node easily
available.  While most users will make use of this feature, those with
special functions (in addition to the generic insert, delete, search
calls) will avoid using the cached option as they can do funky things
with insertions -- for example, vma_interval_tree_insert_after().

[jglisse@redhat.com: fix deadlock from typo vm_lock_anon_vma()]
  Link: http://lkml.kernel.org/r/20170808225719.20723-1-jglisse@redhat.com
Link: http://lkml.kernel.org/r/20170719014603.19029-12-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Doug Ledford <dledford@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f808c13f
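
The idea behind the fast overlap check in f808c13f can be sketched without a real red-black tree. The model below assumes the tree exposes two summary values: the start of its leftmost interval (cheap to read once the leftmost node is cached) and the largest end point in the tree. `struct tree_summary` and `may_overlap()` are hypothetical names, not the kernel's `rb_root_cached` API.

```c
#include <stdbool.h>

/* Hypothetical summary of an interval tree; not the kernel's data structure. */
struct tree_summary {
	bool empty;
	unsigned long leftmost_start; /* smallest start point in the tree */
	unsigned long max_last;       /* largest end point in the tree */
};

/*
 * Return false when the closed range [start, last] cannot overlap any
 * interval in the tree, so the tree walk can be skipped entirely.
 */
bool may_overlap(const struct tree_summary *t,
		 unsigned long start, unsigned long last)
{
	if (t->empty)
		return false;
	if (t->max_last < start)      /* query lies right of every interval */
		return false;
	if (t->leftmost_start > last) /* query lies left of every interval */
		return false;
	return true;
}
```

A `true` result only means an overlap is possible; a regular tree search is still needed to find the matching node.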

31 Aug, 2017 · 1 commit

IB/core: Add new ioctl interface · fac9658c

Committed by Matan Barak on Aug 03, 2017

In this ioctl interface, processing the command starts from
properties of the command and fetching the appropriate user objects
before calling the handler.

Parsing and validation is done according to a specifier declared by
the driver's code. In the driver, all supported objects are declared.
These objects are separated into different object namespaces. Dividing
objects into namespaces is done at initialization by using the higher
bits of the object ids. This initialization can mix objects declared
in different places into one parsing tree used in this ioctl interface.

For each object we list all supported methods. Similarly to objects,
methods are separated into method namespaces too. Namespacing is done
similarly to the objects case. This could be used in order to add
methods to an existing object.

Each method has a specific handler, which could be either a default
handler or a driver specific handler.
Along with the handler, a bunch of attributes are specified as well.
Similarly to objects and methods, attributes are namespaced and hashed
by their ids at initialization too. All supported attributes are
subject to automatic fetching and validation. These attributes include
the command, response and the method's related objects' ids.

When these entities (objects, methods and attributes) are used, the
high bits of the entities' ids are used in order to calculate the hash
bucket index. Then, these high bits are masked out in order to have a
zero based index. Since we use these high bits for both bucketing and
namespacing, we get a compact representation and O(1) array access.
This is mandatory for efficient dispatching.

Each attribute has a type (PTR_IN, PTR_OUT, IDR and FD) and a length.
Attributes can be validated through some properties, like:
(*) Minimum size / Exact size
(*) Fops for FD
(*) Object type for IDR

If an IDR/FD attribute is specified, the kernel also states the object
type and the required access (NEW, WRITE, READ or DESTROY).
All uobject/fd management is done automatically by the infrastructure,
meaning - the infrastructure will fail concurrent commands when at
least one of them requires exclusive access (WRITE/DESTROY),
synchronize actions with device removals (dissociate context events)
and take care of reference counting (increase/decrease) for concurrent
action invocations. The reference counts on the actual kernel objects
shall be handled by the handlers.

 objects
+--------+
|        |
|        |   methods                                                                +--------+
|        |   ns         method      method_spec                           +-----+   |len     |
+--------+  +------+[d]+-------+   +----------------+[d]+------------+    |attr1+-> |type    |
| object +> |method+-> | spec  +-> +  attr_buckets  +-> |default_chain+--> +-----+   |idr_type|
+--------+  +------+   |handler|   |                |   +------------+    |attr2|   |access  |
|        |  |      |   +-------+   +----------------+   |driver chain|    +-----+   +--------+
|        |  |      |                                    +------------+
|        |  +------+
|        |
|        |
|        |
|        |
|        |
|        |
|        |
|        |
|        |
|        |
+--------+

[d] = Hash ids to groups using the high order bits

The right types table is also chosen by using the high bits from
the ids. Currently we have either default or driver specific groups.

Once validation and object fetching (or creation) are completed, we call
the handler:
int (*handler)(struct ib_device *ib_dev, struct ib_uverbs_file *ufile,
               struct uverbs_attr_bundle *ctx);

ctx bundles attributes of different namespaces. Each element there
is an array of attributes which corresponds to one namespace of
attributes. For example, in the common case:

 ctx                               core
+----------------------------+     +------------+
| core:                      +---> | valid      |
+----------------------------+     | cmd_attr   |
| driver:                    |     +------------+
|----------------------------+--+  | valid      |
                                |  | cmd_attr   |
                                |  +------------+
                                |  | valid      |
                                |  | obj_attr   |
                                |  +------------+
                                |
                                |  drivers
                                |  +------------+
                                +> | valid      |
                                   | cmd_attr   |
                                   +------------+
                                   | valid      |
                                   | cmd_attr   |
                                   +------------+
                                   | valid      |
                                   | obj_attr   |
                                   +------------+

Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

fac9658c
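
The id scheme described in fac9658c, where the high bits of an id select the namespace (hash bucket) and the remaining bits give a zero-based index, can be sketched as follows. The 16-bit id width, the 4/12 bit split, and the names `ID_NS_SHIFT`, `ID_NS_MASK`, `id_bucket()` and `id_index()` are assumptions made for illustration; the commit message does not spell out the actual constants.

```c
#include <stdint.h>

/*
 * Assumed split of a 16-bit id: the top 4 bits select the namespace
 * (bucket), the low 12 bits are the zero-based index inside it.
 */
#define ID_NS_SHIFT 12
#define ID_NS_MASK  0xF000u

/* Bucket (namespace) an id belongs to, e.g. 0 = default, 1 = driver. */
unsigned int id_bucket(uint16_t id)
{
	return (id & ID_NS_MASK) >> ID_NS_SHIFT;
}

/* Zero-based index inside the bucket, with the namespace bits masked out. */
unsigned int id_index(uint16_t id)
{
	return id & ~ID_NS_MASK;
}
```

Because both values come from simple masking and shifting, a lookup is two array accesses (`buckets[id_bucket(id)][id_index(id)]`), which matches the O(1) dispatch property the commit message emphasizes.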

29 Aug, 2017 · 3 commits

IB/core: Add new SRQ type IB_SRQT_TM · 9c2c8496

Committed by Artemy Kovalyov on Aug 17, 2017

This patch adds a new SRQ type - IB_SRQT_TM. The new SRQ type supports tag
matching and rendezvous offloads for MPI applications.

When the SRQ receives a message it will search through the matching list
for the corresponding posted receive buffer. The process of searching
the matching list is called tag matching.
In case the tag matching results in a match, the received message will
be placed in the address specified by the receive buffer. In case no
match was found, the message will be placed in a generic buffer until the
corresponding receive buffer is posted. These messages are called
unexpected and their set is called an unexpected list.

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Reviewed-by: Yossi Itigin <yosefe@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

9c2c8496

IB/core: Separate CQ handle in SRQ context · 1a56ff6d

Committed by Artemy Kovalyov on Aug 17, 2017

Before this change the CQ attached to an SRQ was part of the XRC specific
extension. Moving the CQ handle out makes it available to other types
extending SRQ functionality.

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Reviewed-by: Yossi Itigin <yosefe@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

1a56ff6d

IB/core: Add XRQ capabilities · 6938fc1e

Committed by Artemy Kovalyov on Aug 17, 2017

This patch adds the following TM XRQ capabilities:

* max_rndv_hdr_size - Max size of rendezvous request message
* max_num_tags - Max number of entries in tag matching list
* max_ops - Max number of outstanding list operations
* max_sge - Max number of SGE in tag matching entry
* flags - the following flags are currently defined:
    - IB_TM_CAP_RC - Support tag matching on RC transport

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Reviewed-by: Yossi Itigin <yosefe@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

6938fc1e

25 Aug, 2017 · 3 commits

RDMA/core: Cleanup device capability enum · 78b57f95

Committed by Leon Romanovsky on Aug 17, 2017

Cleanup patch prior to exporting the ib_device_cap_flags
to user space. In this patch, we are aligning the
indentation and removing the IB_DEVICE_INIT_TYPE and IB_DEVICE_RESERVED
fields, because they are not used in the kernel.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

78b57f95

RDMA/(core, ulp): Convert register/unregister event handler to be void · dcc9881e

Committed by Leon Romanovsky on Aug 17, 2017

The functions ib_register_event_handler() and
ib_unregister_event_handler() always returned success and they can't fail.

Let's convert those functions to be void, remove redundant checks and
clean up tons of goto statements.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

dcc9881e

IB/core: Avoid accessing non-allocated memory when inferring port type · 498ca3c8

Committed by Noa Osherovich on Aug 23, 2017

Commit 44c58487 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
introduced the concept of type in ah_attr:
 * During ib_register_device, each port is checked for its type which
   is stored in ib_device's port_immutable array.
 * During uverbs' modify_qp, the type is inferred using the port number
   in ib_uverbs_qp_dest struct (address vector) by accessing the
   relevant port_immutable array and the type is passed on to
   providers.

The IB spec (version 1.3) enforces a valid port value only in Reset to
Init. During Init to RTR, the address vector must be valid but the port
number is not mentioned as a field in the address vector, so its
value is not validated, which leads to accesses to non-allocated
memory when inferring the port type.

Save the real port number in ib_qp during modify to Init (when the
comp_mask indicates that the port number is valid) and use this value
to infer the port type.

Avoid copying the address vector fields if the matching bit is not set
in the attr_mask. Address vector can't be modified before the port, so
no valid flow is affected.

Fixes: 44c58487 ('IB/core: Define 'ib' and 'roce' rdma_ah_attr types')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

498ca3c8
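
The fix in 498ca3c8 boils down to remembering a validated port on the QP and never trusting the port inside the address vector. A rough userland sketch of that logic follows; every name in it (`qp_state`, `modify_req`, `QP_ATTR_PORT`, `QP_ATTR_AV`, `port_for_type_lookup()`) is hypothetical and only mirrors the description in the commit message, not the actual uverbs code.

```c
#include <stdint.h>

#define QP_ATTR_PORT (1u << 0) /* hypothetical mask bit: port field is valid */
#define QP_ATTR_AV   (1u << 1) /* hypothetical mask bit: address vector is valid */

/* Port remembered from the modify to Init; 0 means "not set yet". */
struct qp_state { uint8_t port; };

struct modify_req {
	uint32_t attr_mask;
	uint8_t port;    /* meaningful only when QP_ATTR_PORT is set */
	uint8_t av_port; /* unvalidated port inside the address vector */
};

/*
 * Return the 1-based port whose type should be looked up, or 0 if the
 * request carries an out-of-range port. The address vector's own port
 * is deliberately ignored.
 */
uint8_t port_for_type_lookup(struct qp_state *qp, const struct modify_req *req,
			     uint8_t num_ports)
{
	if (req->attr_mask & QP_ATTR_PORT) {
		if (req->port < 1 || req->port > num_ports)
			return 0;
		qp->port = req->port; /* save the real port on the QP */
	}
	return qp->port;
}
```

In this model an Init to RTR request with a garbage `av_port` still resolves to the port saved during Reset to Init, so the per-port array is never indexed out of bounds.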

23 Aug, 2017 · 1 commit

IB/hfi1: Determine 9B/16B L2 header type based on Address handle · d98bb7f7

Committed by Don Hiatt on Aug 04, 2017

When address handle attributes are initialized, the LIDs are
transformed to be in the 32-bit LID space.
When constructing the header, the hfi1 driver will look at the LID
to determine the packet header to be created.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

d98bb7f7

19 Aug, 2017 · 1 commit

Add OPA extended LID support · 62ede777

Committed by Hiatt, Don on Aug 14, 2017

This patch series primarily increases sizes of variables that hold
lid values from 16 to 32 bits. Additionally, it adds a check in
the IB mad stack to verify a properly formatted MAD when OPA
extended LIDs are used.

Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

62ede777

10 Aug, 2017 · 2 commits

RDMA: Simplify get firmware interface · 9abb0d1b

Committed by Leon Romanovsky on Jun 27, 2017

There is a need to forward the FW version to user space
applications through RDMA netlink. In order to make it safe, there
is a need to declare an nla_policy and limit the size of the FW string.

The new define IB_FW_VERSION_NAME_MAX will limit the size of the
FW version string. That define was chosen to be equal to
ETHTOOL_FWVERS_LEN, because many drivers are indirectly limited
by that value anyway.

The introduction of this define allows us to remove the string size
from the get_fw_str function signature.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>

9abb0d1b

RDMA/core: Add and expose static device index · ecc82c53

Committed by Leon Romanovsky on Jun 18, 2017

This patch adds a static device index, similar to the one
already available in the netdev world (struct net->ifindex).

In downstream patches, RDMA netlink will use this idx-to-ib_device
conversion, so as part of this commit, we are exposing the translation
function to be visible for IB/core users.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>

ecc82c53

09 Aug, 2017 · 4 commits

RDMA/core: expose affinity mappings per completion vector · c66cd353

Committed by Sagi Grimberg on Jul 13, 2017

This will allow ULPs to intelligently locate threads based
on completion vector cpu affinity mappings. In case the
driver does not expose a get_vector_affinity callout, return
NULL so the caller can maintain fallback logic.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>

c66cd353

IB/core: Change wc.slid from 16 to 32 bits · 7db20ecd

Committed by Hiatt, Don on Jun 08, 2017

The slid field in struct ib_wc is increased to 32 bits.
This enables core components to use larger LIDs if needed.
The user ABI is unchanged and returns 16-bit values when queried.

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

7db20ecd

IB/core: Change port_attr.sm_lid from 16 to 32 bits · db58540b

Committed by Dasaratharaman Chandramouli on Jun 08, 2017

The sm_lid field in struct ib_port_attr is increased to 32 bits. This
enables core components to use larger LIDs if needed.
The user ABI is unchanged and returns 16-bit values when queried.

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

db58540b

IB/core: Change port_attr.lid size from 16 to 32 bits · 582faf31

Committed by Dasaratharaman Chandramouli on Jun 08, 2017

The lid field in struct ib_port_attr is increased to 32 bits. This enables
core components to use larger LIDs if needed.
The user ABI is unchanged and returns 16-bit values when queried.

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

582faf31
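
The three LID-widening commits above (7db20ecd, db58540b, 582faf31) keep a 16-bit field at the user ABI boundary while the kernel side grows to 32 bits. The sketch below shows what that boundary implies; plain truncation is an assumption of this example, and both helper names are hypothetical rather than kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a 32-bit kernel-side LID is representable in the 16-bit ABI field. */
bool lid_fits_user_abi(uint32_t lid)
{
	return lid <= UINT16_MAX;
}

/* Narrow a LID for the unchanged user ABI; the upper 16 bits are dropped. */
uint16_t lid_to_user_abi(uint32_t lid)
{
	return (uint16_t)lid;
}
```

A LID such as 0x00011234 would be reported as 0x1234 through such an interface, which is why extended LIDs are only fully visible to core components.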

24 Jul, 2017 · 3 commits

IB/core: Enable QP creation with a given source QP number · 02984cc7

Committed by Yishai Hadas on Jun 08, 2017

Enable QP creation with a given source QP number.
The created QP will use the source QPN as its wire QP number.

This comes as a pre-patch for downstream patches in this series to
allow user space applications to accelerate traffic which is typically
handled by IPoIB ULP.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

02984cc7

IB/core: Introduce delay drop for a WQ · 7d9336d8

Committed by Maor Gottlieb on May 30, 2017

A work queue created with IB_WQ_FLAGS_DELAY_DROP won't
cause packet drops when there are no receive WQEs, but will wait until
receive WQEs are posted or until a period of time that the device
was configured with has elapsed.

It includes:
 * Add a new creation flag to enable delay drop functionality in a WQ.
 * A new capability was introduced - IB_RAW_PACKET_CAP_DELAY_DROP, which
   is the device's ability to delay packet drops when there aren't receive
   WQEs.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

7d9336d8

IB/core: Add generic function to extract IB speed from netdev · d4186194

Committed by Yuval Shaia on Jun 14, 2017

The logic of retrieving the netdev speed from net_device and translating
it to IB speed is implemented in the rxe, usnic and bnxt drivers.

Define a new function which merges them all.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Christian Benvenuti <benve@cisco.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

d4186194

18 Jul, 2017 · 2 commits

IB/core: Remove NOIO QP create flag · 7855f584

Committed by Leon Romanovsky on May 23, 2017

There are no users for IB_QP_CREATE_USE_GFP_NOIO flag,
so let's remove it.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

7855f584

IB/core: Introduce modify QP operation with udata · a512c2fb

Committed by Parav Pandit on May 23, 2017

This patch adds a new function, ib_modify_qp_with_udata, so that the
uverbs layer can avoid handling the L2 mac address at the verbs layer
and depend on the core layer to resolve the mac address consistently
for all required QPs.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

a512c2fb

06 Jul, 2017 · 1 commit

IB/core, opa_vnic, hfi1, mlx5: Properly free rdma_netdev · 8e959601

Committed by Niranjana Vishwanathapura on Jun 30, 2017

IPOIB is calling free_rdma_netdev even though alloc_rdma_netdev has
returned -EOPNOTSUPP.
Move free_rdma_netdev from the ib_device structure to the rdma_netdev
structure, thus ensuring the proper cleanup function is called for the
rdma net device.

Fix the following trace:

ib0: Failed to modify QP to ERROR state
BUG: unable to handle kernel paging request at 0000000000001d20
IP: hfi1_vnic_free_rn+0x26/0xb0 [hfi1]
Call Trace:
 ipoib_remove_one+0xbe/0x160 [ib_ipoib]
 ib_unregister_device+0xd0/0x170 [ib_core]
 rvt_unregister_device+0x29/0x90 [rdmavt]
 hfi1_unregister_ib_device+0x1a/0x100 [hfi1]
 remove_one+0x4b/0x220 [hfi1]
 pci_device_remove+0x39/0xc0
 device_release_driver_internal+0x141/0x200
 driver_detach+0x3f/0x80
 bus_remove_driver+0x55/0xd0
 driver_unregister+0x2c/0x50
 pci_unregister_driver+0x2a/0xa0
 hfi1_mod_cleanup+0x10/0xf65 [hfi1]
 SyS_delete_module+0x171/0x250
 do_syscall_64+0x67/0x150
 entry_SYSCALL64_slow_path+0x25/0x25

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

8e959601

28 Jun, 2017 · 2 commits

IB/core,rdmavt,hfi1,opa-vnic: Send OPA cap_mask3 in trap · cb49366f

Committed by Vishwanathapura, Niranjana on Jun 01, 2017

Provide the ability for IB clients to modify the OPA specific
capability mask and include this mask in the subsequent trap data.

Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Michael N. Henry <michael.n.henry@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

cb49366f

IB/hfi1: Add functions to parse BTH/IB headers · 7dafbab3

Committed by Don Hiatt on May 12, 2017

Improve code readability by adding inline functions
to read specific BTH/IB fields without knowledge of
byte offsets.

Reviewed-by: Brian Welty <brian.welty@intel.com>
Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

7dafbab3

24 May, 2017 · 1 commit

IB/core: Enforce PKey security on QPs · d291f1a6

Committed by Daniel Jurgens on May 19, 2017

Add new LSM hooks to allocate and free security contexts and check for
permission to access a PKey.

Allocate and free a security context when creating and destroying a QP.
This context is used for controlling access to PKeys.

When a request is made to modify a QP that changes the port, PKey index,
or alternate path, check that the QP has permission for the PKey in the
PKey table index on the subnet prefix of the port. If the QP is shared
make sure all handles to the QP also have access.

Store which port and PKey index a QP is using. After the reset to init
transition the user can modify the port, PKey index and alternate path
independently. So port and PKey settings changes can be a merge of the
previous settings and the new ones.

In order to maintain access control if there are PKey table or subnet
prefix changes, keep a list of all QPs that are using each PKey index on
each port. If a change occurs all QPs using that device and port must
have access enforced for the new cache settings.

These changes add a transaction to the QP modify process. Association
with the old port and PKey index must be maintained if the modify fails,
and must be removed if it succeeds. Association with the new port and
PKey index must be established prior to the modify and removed if the
modify fails.

1. When a QP is modified to a particular Port, PKey index or alternate
   path insert that QP into the appropriate lists.

2. Check permission to access the new settings.

3. If step 2 grants access attempt to modify the QP.

4a. If steps 2 and 3 succeed remove any prior associations.

4b. If either fails remove the new setting associations.

If a PKey table or subnet prefix changes walk the list of QPs and
check that they have permission. If not send the QP to the error state
and raise a fatal error event. If it's a shared QP make sure all the
QPs that share the real_qp have permission as well. If the QP that
owns a security structure is denied access the security structure is
marked as such and the QP is added to an error_list. Once moving
the QP to error is complete the security structure mark is cleared.

Maintaining the lists correctly turns QP destroy into a transaction.
The hardware driver for the device frees the ib_qp structure, so while
the destroy is in progress the ib_qp pointer in the ib_qp_security
struct is undefined. When the destroy process begins the ib_qp_security
structure is marked as destroying. This prevents any action from being
taken on the QP pointer. After the QP is destroyed successfully it
could still be listed on an error_list; wait for it to be processed by
that flow before cleaning up the structure.

If the destroy fails the QP's port and PKey settings are reinserted into
the appropriate lists, the destroying flag is cleared, and access control
is enforced, in case there were any cache changes during the destroy
flow.

To keep the security changes isolated a new file is used to hold security
related functionality.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
[PM: merge fixup in ib_verbs.h and uverbs_cmd.c]
Signed-off-by: Paul Moore <paul@paul-moore.com>

d291f1a6
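
The modify transaction in d291f1a6 (steps 1 to 4b) can be outlined as a small state update with rollback. This is a simplified sketch: the structures, the callback types, and the stub callbacks `allow_all()` and `deny_all()` are hypothetical stand-ins for the LSM permission hook and the driver's modify call, and the per-port QP lists are reduced to a single "current association" record.

```c
#include <stdbool.h>

/* Hypothetical association between a QP and one (port, PKey index) list. */
struct pkey_assoc { int port; int pkey_index; bool listed; };

/* Hypothetical security state of a QP: its current association. */
struct qp_sec { struct pkey_assoc cur; };

/* Callbacks standing in for the permission check and the hardware modify. */
typedef bool (*check_fn)(int port, int pkey_index);
typedef bool (*modify_fn)(int port, int pkey_index);

bool allow_all(int port, int pkey_index) { (void)port; (void)pkey_index; return true; }
bool deny_all(int port, int pkey_index) { (void)port; (void)pkey_index; return false; }

bool modify_qp_pkey(struct qp_sec *sec, int port, int pkey_index,
		    check_fn permitted, modify_fn hw_modify)
{
	/* Step 1: associate the QP with the new settings up front. */
	struct pkey_assoc next = { port, pkey_index, true };

	/* Steps 2 and 3: check permission, then attempt the modify. */
	if (!permitted(port, pkey_index) || !hw_modify(port, pkey_index)) {
		next.listed = false; /* Step 4b: drop the new association. */
		return false;        /* The old association in sec->cur is kept. */
	}

	/* Step 4a: success, so the prior association is replaced. */
	sec->cur = next;
	return true;
}
```

The point of the ordering is that the QP is always covered by at least one association: the old one survives any failure, and the new one exists before the hardware is touched.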

23 May, 2017 · 1 commit

IB/core: IB cache enhancements to support Infiniband security · 883c71fe

Committed by Daniel Jurgens on May 19, 2017

Cache the subnet prefix and add a function to access it. Enforcing
security requires frequent queries of the subnet prefix and the pkeys in
the pkey table.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>

883c71fe

10 May, 2017 · 1 commit

smc_diag.h: fix include from userland · ea6819e1

Committed by Nicolas Dichtel on Mar 27, 2017

This patch prepares the uapi export by fixing the following error:

.../linux/smc_diag.h:6:27: fatal error: rdma/ib_verbs.h: No such file or directory
 #include <rdma/ib_verbs.h>

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

ea6819e1

02 May, 2017 · 2 commits

IB/core: Define 'opa' rdma_ah_attr type · 64b4646e

Committed by Dasaratharaman Chandramouli on Apr 29, 2017

The OPA ah_attr type allows core components to specify
attributes that may be specific to OPA devices.
For instance, the opa type ah_attr provides 32-bit LIDs,
enabling larger OPA fabric sizes.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

64b4646e

IB/core: Define 'ib' and 'roce' rdma_ah_attr types · 44c58487

Committed by Dasaratharaman Chandramouli on Apr 29, 2017

rdma_ah_attr can now be either ib or roce, allowing
core components to use one type or the other and also
to define attributes unique to a specific type. struct
ib_ah is also initialized with the type when it is first
created. This ensures that calls such as modify_ah
don't modify the type of the address handle attribute.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

44c58487

openeuler / Kernel · synced successfully more than 1 year ago
