Commits · 963cab508296a06ed8063c848f32d74f2b4b4c26 · openeuler / raspberrypi-kernel

22 Oct, 2015 · 1 commit

IB/core: avoid 32-bit warning · 5d1e6235

Authored by Arnd Bergmann on Oct 07, 2015

The INIT_UDATA() macro requires a pointer or unsigned long argument for
both input and output buffer, and all callers had a cast from when
the code was merged until a recent restructuring, so now we get

core/uverbs_cmd.c: In function 'ib_uverbs_create_cq':
core/uverbs_cmd.c:1481:66: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]

This makes the code behave as before by adding back the cast to
unsigned long.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 565197dd ("IB/core: Extend ib_uverbs_create_cq")
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
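
The warning quoted in this commit appears when a 64-bit integer is cast straight to a pointer on a 32-bit build. The following is a minimal user-space sketch of the cast the commit restores. The helper name `sketch_u64_to_ptr` is hypothetical and is not a kernel API, and the sketch assumes, as Linux does, that `unsigned long` has the same width as a pointer.

```c
#include <stdint.h>

/*
 * Hypothetical helper: turn a 64-bit ABI field that carries a user address
 * into a pointer. Casting through unsigned long first narrows the value to
 * the native word size, which avoids -Wint-to-pointer-cast on 32-bit builds.
 * Assumes sizeof(unsigned long) == sizeof(void *), as on Linux.
 */
void *sketch_u64_to_ptr(uint64_t addr)
{
    return (void *)(unsigned long)addr;
}
```

On a 64-bit build the intermediate cast changes nothing; on a 32-bit build it makes the truncation explicit instead of implicit.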

04 Sep, 2015 · 1 commit

IB/uverbs: reject invalid or unknown opcodes · b632ffa7

Authored by Christoph Hellwig on Aug 26, 2015

We have many WR opcodes that are only supported in kernel space
and/or require optional information to be copied into the WR
structure.  Reject all those not explicitly handled so that we
can't pass invalid information to drivers.

Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
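
The approach described here, handling a known set of opcodes and rejecting everything else, can be sketched as an allow-list switch. The enum values and the function name below are hypothetical stand-ins and are not the kernel's definitions.

```c
#include <errno.h>

/* Illustrative opcode values; the real enum lives in the kernel headers. */
enum sketch_wr_opcode {
    SKETCH_WR_RDMA_WRITE,
    SKETCH_WR_SEND,
    SKETCH_WR_RDMA_READ,
    SKETCH_WR_KERNEL_ONLY   /* stands in for an opcode user space may not post */
};

/* Return 0 for opcodes that are handled explicitly, -EINVAL for the rest. */
int sketch_check_opcode(int opcode)
{
    switch (opcode) {
    case SKETCH_WR_RDMA_WRITE:
    case SKETCH_WR_SEND:
    case SKETCH_WR_RDMA_READ:
        return 0;
    default:
        /* Unknown or kernel-only opcode: never pass it on to a driver. */
        return -EINVAL;
    }
}
```

Listing the accepted values, rather than the rejected ones, means an opcode added later is refused until someone handles it deliberately.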

31 Aug, 2015 · 4 commits

IB/uverbs: Explicitly pass ib_dev to uverbs commands · 057aec0d

Authored by Yishai Hadas on Aug 13, 2015

Done in preparation for deploying RCU for the device removal
flow. Allows isolating the RCU handling to the uverb_main layer and
keeping the uverbs_cmd code as is.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/uverbs: Fix reference counting usage of event files · 03c40442

Authored by Yishai Hadas on Aug 13, 2015

Fix the reference counting usage to be handled in the event file
creation/destruction function, instead of being done by the caller.
This is done for both async/non-async event files.

Based on Jason Gunthorpe's report at
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg24680.html:
"The existing code for this is broken, in ib_uverbs_get_context all
the error paths between ib_uverbs_alloc_event_file and the
kref_get(file->ref) are wrong - this will result in fput() which will
call ib_uverbs_event_close, which will try to do kref_put and
ib_unregister_event_handler - which are no longer paired."
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/core: Make ib_dealloc_pd return void · 7dd78647

Authored by Jason Gunthorpe on Aug 05, 2015

The majority of callers never check the return value, and even if they
did, they can't do anything about a failure.

All possible failure cases represent a bug in the caller, so just
WARN_ON inside the function instead.

This fixes a few random errors:
 net/rd/iw.c infinite loops while it fails. (racing with EBUSY?)

This also lays the groundwork to get rid of the error return from the
drivers. Most drivers do not error; the few that do are broken since
the error cannot be handled.

Since uverbs can legitimately make use of EBUSY, open code the
check.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/core: Guarantee that a local_dma_lkey is available · 96249d70

Authored by Jason Gunthorpe on Aug 05, 2015

Every single ULP requires a local_dma_lkey to do anything with
a QP, so let us ensure one exists for every PD created.

If the driver can supply a global local_dma_lkey then use that; otherwise
ask the driver to create a local-use, all-physical-memory MR associated
with the new PD.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Acked-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

13 Jun, 2015 · 4 commits

IB/core: Pass hardware specific data in query_device · 2528e33e

Authored by Matan Barak on Jun 11, 2015

Vendors should be able to pass vendor specific data to/from
user-space via query_device uverb. In order to do this,
we need to pass the vendors' specific udata.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/core: Add timestamp_mask and hca_core_clock to query_device · 24306dc6

Authored by Matan Barak on Jun 11, 2015

In order to expose timestamp we need to expose two new attributes in
query_device to be used for CQ completion time-stamping:

timestamp_mask - how many bits are valid in the timestamp; timestamp
values can be at most 64 bits wide.

hca_core_clock - the timestamp is given in HW cycles; this is the frequency
of the HCA in kHz, needed to convert cycles to seconds.

This is added both to ib_query_device and its respective uverbs counterpart.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
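
As a worked illustration of how the two attributes combine, the sketch below masks a raw completion timestamp and converts cycles to nanoseconds. The function name is hypothetical and overflow of the multiplication is ignored; it only shows the arithmetic implied by a clock frequency expressed in kHz (cycles per millisecond).

```c
#include <stdint.h>

/*
 * Convert a raw completion timestamp to nanoseconds.
 * timestamp_mask selects the valid bits; hca_core_clock_khz is the clock
 * frequency in kHz, i.e. cycles per millisecond, so
 *   ns = cycles * 1,000,000 / kHz.
 * Overflow of the multiplication is not handled in this sketch.
 */
uint64_t sketch_cycles_to_ns(uint64_t raw, uint64_t timestamp_mask,
                             uint64_t hca_core_clock_khz)
{
    uint64_t cycles = raw & timestamp_mask;

    return cycles * 1000000ULL / hca_core_clock_khz;
}
```

For example, 1000 cycles of a 1000 kHz (1 MHz) clock correspond to one millisecond, that is 1,000,000 ns.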

IB/core: Extend ib_uverbs_create_cq · 565197dd

Authored by Matan Barak on Jun 11, 2015

ib_uverbs_ex_create_cq follows the extension verbs
mechanism. New features (for example, the CQ creation flags
field which is added in a downstream patch) could be used
via user-space libraries without breaking the ABI.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/core: Change provider's API of create_cq to be extendible · bcf4c1ea

Authored by Matan Barak on Jun 11, 2015

Add a new ib_cq_init_attr structure which contains the
previous cqe (minimum number of CQ entries) and comp_vector
(completion vector) in addition to a new flags field.
All vendors' create_cq callbacks are changed in order
to work with the new API.

This commit does not change any functionality.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> to patch #2
Signed-off-by: Doug Ledford <dledford@redhat.com>

19 Feb, 2015 · 2 commits

IB/core: Add on demand paging caps to ib_uverbs_ex_query_device · f4056bfd

Authored by Haggai Eran on Feb 08, 2015

Add on-demand paging capabilities reporting to the extended query device verb.

Yann Droneaud writes:

    Note: as offsetof() is used to retrieve the size of the lower chunk
    of the response, beware that it only works if the upper chunk
    is right after, without any implicit padding. And, as the size of
    the latter chunk is added to the base size, implicit padding at the
    end of the structure is not taken into account. Both points must be
    taken into account when extending the uverbs functionalities.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/core: Add support for extended query device caps · 02d1aa7a

Authored by Eli Cohen on Feb 08, 2015

Add extensible query device capabilities verb to allow adding new features.
ib_uverbs_ex_query_device is added and copy_query_dev_fields is used to copy
capability fields to be used by both ib_uverbs_query_device and
ib_uverbs_ex_query_device.

Following the discussion about this patch [1], the code now validates
the command's comp_mask is zero, returning -EINVAL for unknown values,
in order to allow extending the verb in the future.

The verb also checks the user-space provided response buffer size and
only fills in capabilities that will fit in the buffer. In an attempt to
follow the spirit of presentation [2] by Tzahi Oved that was presented
during OpenFabrics Alliance International Developer Workshop 2013, the
comp_mask bits will only describe which fields are valid.  Furthermore,
fields that can simply be cleared when they are not supported, do not
require a comp_mask bit at all.  The verb returns a response_length
field containing the actual number of bytes written by the kernel, so
that a newer version running on an older kernel can tell which fields
were actually returned.

[1] [PATCH v1 0/5] IB/core: extended query device caps cleanup for v3.19
    http://thread.gmane.org/gmane.linux.kernel.api/7889/

[2] https://www.openfabrics.org/images/docs/2013_Dev_Workshop/Tues_0423/2013_Workshop_Tues_0830_Tzahi_Oved-verbs_extensions_ofa_2013-tzahio.pdf

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
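
The behaviour described in this message (reject a non-zero comp_mask, fill in only what fits in the caller's buffer, and report how many bytes were written) can be sketched in user space as follows. The structure layout, field names, capability values and the `-ENOSPC` return for a too-small buffer are assumptions made for illustration and are not the kernel's definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical response layout: a base part plus one later extension. */
struct sketch_resp {
    uint32_t comp_mask;
    uint32_t response_length;
    uint64_t base_caps;
    uint64_t ext_caps;      /* imagined as added by a later kernel version */
};

int sketch_ex_query(uint32_t cmd_comp_mask, void *out, size_t outlen)
{
    struct sketch_resp resp;
    /* Size of the base part; valid only because no padding precedes ext_caps. */
    size_t len = offsetof(struct sketch_resp, ext_caps);

    if (cmd_comp_mask)      /* unknown request bits: reject */
        return -EINVAL;
    if (outlen < len)       /* the base part must always fit */
        return -ENOSPC;

    memset(&resp, 0, sizeof(resp));
    resp.base_caps = 0x1;
    if (outlen >= len + sizeof(resp.ext_caps)) {
        /* The caller's buffer has room for the extension as well. */
        resp.ext_caps = 0x2;
        len += sizeof(resp.ext_caps);
    }
    resp.response_length = (uint32_t)len;
    memcpy(out, &resp, len);    /* copy only the bytes that were filled in */
    return 0;
}
```

A caller built against the larger structure can compare `response_length` with the offset of each optional field to tell which fields an older kernel actually returned.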

18 Feb, 2015 · 1 commit

IB/core: Fix deadlock on uverbs modify_qp error flow · 0fb8bcf0

Authored by Moshe Lazer on Feb 05, 2015

The deadlock occurs in __uverbs_modify_qp: we take a lock (idr_read_qp)
and in case of failure in ib_resolve_eth_l2_attrs we don't release
it (put_qp_read).  Fix that.

Fixes: ed4c54e5 ("IB/core: Resolve Ethernet L2 addresses when modifying QP")
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
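
The class of bug fixed here, an early return that skips the matching unlock, is commonly avoided with a single exit label. The sketch below models the lock with a counter; all names are hypothetical stand-ins for idr_read_qp()/put_qp_read() and the resolve step, not the kernel code.

```c
#include <errno.h>

/* A counter models the held lock; 0 means "not held". */
int sketch_lock_depth;

static void sketch_lock(void)   { sketch_lock_depth++; }
static void sketch_unlock(void) { sketch_lock_depth--; }

/* resolve_fails models a failure of the address-resolution step. */
int sketch_modify_qp(int resolve_fails)
{
    int ret = 0;

    sketch_lock();
    if (resolve_fails) {
        ret = -EINVAL;
        goto out;           /* leave through the path that unlocks */
    }
    /* ... the actual modify would happen here ... */
out:
    sketch_unlock();
    return ret;
}
```

After either outcome the counter is back at zero, which is the property the original error path violated.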

06 Feb, 2015 · 1 commit

Revert "IB/core: Add support for extended query device caps" · 43c61165

Authored by Yann Droneaud on Feb 05, 2015

While commit 7e36ef82 ("IB/core: Temporarily disable
ex_query_device uverb") is correct as it makes the extended
QUERY_DEVICE uverb (which came as part of commit 5a77abf9
("IB/core: Add support for extended query device caps") and commit
860f10a7 ("IB/core: Add flags for on demand paging support")) not
available to userspace, it doesn't address the initial issue regarding
ib_copy_to_udata() [1][2].

Additionally, further discussions around this new uverb seem to
conclude it would require a different data structure than the one
currently described in <rdma/ib_user_verbs.h> [3].

Both of these issues require a revert of the changes, so this patch
partially reverts commit 8cdd312c ("IB/mlx5: Implement the ODP
capability query verb") and commit 860f10a7 ("IB/core: Add flags
for on demand paging support") and fully reverts commit 5a77abf9
("IB/core: Add support for extended query device caps").

[1] "Re: [PATCH v3 06/17] IB/core: Add support for extended query device caps"
    http://mid.gmane.org/1418733236.2779.26.camel@opteya.com

[2] "Re: [PATCH] IB/core: Temporarily disable ex_query_device uverb"
    http://mid.gmane.org/1423067503.3030.83.camel@opteya.com

[3] "RE: [PATCH v1 1/5] IB/uverbs: ex_query_device: answer must not depend on request's comp_mask"
    http://mid.gmane.org/2807E5FD2F6FDA4886F6618EAC48510E0CC12C30@CRSMSX101.amr.corp.intel.com

Cc: Eli Cohen <eli@mellanox.com>
Cc: Haggai Eran <haggaie@mellanox.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

16 Dec, 2014 · 4 commits

IB/core: Implement support for MMU notifiers regarding on demand paging regions · 882214e2

Authored by Haggai Eran on Dec 11, 2014

* Add an interval tree implementation for ODP umems. Create an
  interval tree for each ucontext (including a count of the number of
  ODP MRs in this context, semaphore, etc.), and register ODP umems in
  the interval tree.
* Add MMU notifiers handling functions, using the interval tree to
  notify only the relevant umems and underlying MRs.
* Register to receive MMU notifier events from the MM subsystem upon
  ODP MR registration (and unregister accordingly).
* Add a completion object to synchronize the destruction of ODP umems.
* Add mechanism to abort page faults when there's a concurrent invalidation.

The way we synchronize between concurrent invalidations and page
faults is by keeping a counter of currently running invalidations, and
a sequence number that is incremented whenever an invalidation is
caught. The page fault code checks the counter and also verifies that
the sequence number hasn't progressed before it updates the umem's
page tables. This is similar to what the kvm module does.

In order to prevent the case where we register a umem in the middle of
an ongoing notifier, we also keep a per ucontext counter of the total
number of active mmu notifiers. We only enable new umems when all the
running notifiers complete.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Yuval Dagan <yuvalda@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
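
The counter-plus-sequence-number scheme described above can be sketched as below. This is a single-threaded model with hypothetical names: a real implementation would perform these checks under the lock that protects the umem's page tables, and the choice to bump the sequence number when an invalidation ends is an assumption of this sketch.

```c
#include <stdbool.h>

/* Hypothetical per-umem state; the field names are illustrative. */
struct sketch_umem {
    int notifiers_count;            /* invalidations currently running */
    unsigned long notifiers_seq;    /* bumped once per invalidation */
};

void sketch_invalidate_start(struct sketch_umem *u)
{
    u->notifiers_count++;
}

void sketch_invalidate_end(struct sketch_umem *u)
{
    u->notifiers_seq++;
    u->notifiers_count--;
}

/*
 * Called by the page fault path just before it updates the page tables.
 * seq_snapshot was sampled before the fault handler looked up the pages.
 * The fault must be retried if an invalidation is still running, or if one
 * completed since the snapshot was taken.
 */
bool sketch_fault_must_retry(const struct sketch_umem *u,
                             unsigned long seq_snapshot)
{
    return u->notifiers_count > 0 || u->notifiers_seq != seq_snapshot;
}
```

The counter catches an invalidation that is in flight, while the sequence number catches one that started and finished entirely between the snapshot and the check.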

IB/core: Add support for on demand paging regions · 8ada2c1c

Authored by Shachar Raindel on Dec 11, 2014

* Extend the umem struct to keep the ODP related data.
* Allocate and initialize the ODP related information in the umem
  (page_list, dma_list) and free it as needed at the end of the run.
* Store a reference to the process PID struct in the ucontext.  Used to
  safely obtain the task_struct and the mm during fault handling,
  without preventing the task destruction if needed.
* Add 2 helper functions: ib_umem_odp_map_dma_pages and
  ib_umem_odp_unmap_dma_pages. These functions get the DMA addresses
  of specific pages of the umem (and, currently, pin them).
* Support for page faults only - IB core will keep the reference on
  the pages used and call put_page when freeing an ODP umem
  area. Invalidations support will be added in a later patch.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/core: Add flags for on demand paging support · 860f10a7

Authored by Sagi Grimberg on Dec 11, 2014

* Add a configuration option to enable on-demand paging support in
  the infiniband subsystem (CONFIG_INFINIBAND_ON_DEMAND_PAGING). In a
  later patch, this configuration option will select the MMU_NOTIFIER
  configuration option to enable mmu notifiers.
* Add a flag for on demand paging (ODP) support in the IB device capabilities.
* Add a flag to request ODP MR in the access flags to reg_mr.
* Fail registrations done with the ODP flag when the low-level driver
  doesn't support this.
* Change the conditions in which an MR will be writable to explicitly
  specify the access flags.  This is to avoid making an MR writable just
  because it is an ODP MR.
* Add ODP capabilities to the extended query device verb.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/core: Add support for extended query device caps · 5a77abf9

Authored by Eli Cohen on Dec 11, 2014

Add extensible query device capabilities verb to allow adding new features.
ib_uverbs_ex_query_device is added and copy_query_dev_fields is used to
copy capability fields to be used by both ib_uverbs_query_device and
ib_uverbs_ex_query_device.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

14 Oct, 2014 · 1 commit

IB/core: Clear AH attr variable to prevent garbage data · 8b0f93d9

Authored by Devesh Sharma on Sep 26, 2014

During create-ah from userspace, uverbs is sending garbage data in
attr.dmac and attr.vlan_id.  This patch sets attr.dmac and
attr.vlan_id to zero.

Fixes: dd5f03be ("IB/core: Ethernet L2 attributes in verbs/cm structures")
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

02 Aug, 2014 · 1 commit

IB/core: Add user MR re-registration support · 7e6edb9b

Authored by Matan Barak on Jul 31, 2014

Memory re-registration is a feature that enables changing the
attributes of a memory region registered by user-space, including PD,
translation (address and length) and access flags.

Add the required support in uverbs and the kernel verbs API.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

20 Jan, 2014 · 1 commit

IB/core: Resolve Ethernet L2 addresses when modifying QP · ed4c54e5

Authored by Or Gerlitz on Dec 12, 2013

Existing user space applications provide only IBoE L3 address
attributes to the kernel when they issue a modify QP.  To work
with them and let such apps (plus kernel consumers which don't use the
RDMA-CM) keep working transparently under the IBoE GID IP addressing
changes, add an Eth L2 address resolution helper.
Signed-off-by: Moni Shoua <monis@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

21 Dec, 2013 · 4 commits

IB/uverbs: Check input length in flow steering uverbs · 6bcca3d4

Authored by Yann Droneaud on Dec 11, 2013

Since ib_copy_from_udata() doesn't yet check the available input data
length before accessing userspace memory, an explicit check of this
length is required to prevent:

- reading past the user provided buffer,
- underflow when subtracting the expected command size from the input
  length.

This will ensure the newly added flow steering uverbs don't try to
process truncated commands.

Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
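
The two hazards listed above come from subtracting sizes held in unsigned types: if the input is shorter than the fixed command, the difference wraps around to a huge value. A minimal sketch of the check, with a hypothetical helper name:

```c
#include <errno.h>
#include <stddef.h>

/*
 * Return the number of payload bytes that follow a fixed-size command, or
 * -EINVAL if the input is too short to contain the command at all.
 * Comparing before subtracting keeps the unsigned subtraction from
 * wrapping around.
 */
long sketch_payload_len(size_t inlen, size_t cmd_size)
{
    if (inlen < cmd_size)
        return -EINVAL;
    return (long)(inlen - cmd_size);
}
```

Without the comparison, an 8-byte input and a 16-byte command would yield a payload length close to the maximum value of `size_t`.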

IB/uverbs: Set error code when fail to consume all flow_spec items · 98a37510

Authored by Yann Droneaud on Dec 11, 2013

If the flow_spec items parsed count does not match the number of items
declared in the flow_attr command, or if not all bytes are used for
flow_spec items (eg. trailing garbage), a log message is reported and
the function leaves through the error path. Unfortunately the error
code is currently not set.

This patch sets the error code to -EINVAL in such cases, so that the error
is reported to userspace instead of failing silently.

Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/uverbs: Check reserved fields in create_flow · c780d82a

Authored by Yann Droneaud on Dec 11, 2013

As noted by Daniel Vetter in his article "Botching up ioctls"[1]

  "Check *all* unused fields and flags and all the padding for whether
   it's 0, and reject the ioctl if that's not the case.  Otherwise
   your nice plan for future extensions is going right down the
   gutters since someone *will* submit an ioctl struct with random
   stack garbage in the yet unused parts. Which then bakes in the ABI
   that those fields can never be used for anything else but garbage."

It's important to ensure that reserved fields are set to a known value,
so that it will be possible to use them later to extend the ABI.

The same reasoning applies to the comp_mask field present in newer uverbs
commands: per commit 22878dbc ("IB/core: Better checking of
userspace values for receive flow steering"), unsupported values in
comp_mask are rejected.

[1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html

Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
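
A minimal sketch of the validation this message argues for: reject any comp_mask bit that is not supported and any non-zero reserved field. The structure layout and names are hypothetical and do not reproduce the real create_flow command.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command layout with space reserved for future use. */
struct sketch_cmd {
    uint32_t comp_mask;
    uint16_t type;
    uint16_t reserved[3];
};

/* No extension bits are defined yet in this sketch. */
#define SKETCH_SUPPORTED_COMP_MASK 0u

int sketch_validate_cmd(const struct sketch_cmd *cmd)
{
    size_t i;

    /* Any bit outside the supported set is an error. */
    if (cmd->comp_mask & ~SKETCH_SUPPORTED_COMP_MASK)
        return -EINVAL;

    /* Reserved fields must be zero so they can be given a meaning later. */
    for (i = 0; i < sizeof(cmd->reserved) / sizeof(cmd->reserved[0]); i++)
        if (cmd->reserved[i])
            return -EINVAL;

    return 0;
}
```

Because garbage in a reserved field is refused today, a future kernel can assign that field a meaning without breaking programs that already work.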

IB/uverbs: Check comp_mask in destroy_flow · 2782c2d3

Authored by Yann Droneaud on Dec 11, 2013

Just like the check added to create_flow in 22878dbc ("IB/core:
Better checking of userspace values for receive flow steering"),
comp_mask must be checked in destroy_flow too.

Since only empty comp_mask is currently supported, any other value
must be rejected.

This check was silently added in a previous patch[1] to move comp_mask
in extended command header, part of previous patchset[2] against
create/destroy_flow uverbs. The idea of moving comp_mask to the header
was discarded for the final patchset[3].

Unfortunately the check added in destroy_flow uverb was not integrated
in the final patchset.

[1] http://marc.info/?i=40175eda10d670d098204da6aa4c327a0171ae5f.1381510045.git.ydroneaud@opteya.com
[2] http://marc.info/?i=cover.1381510045.git.ydroneaud@opteya.com
[3] http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com

Cc: Matan Barak <matanb@mellanox.com>
Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

18 Nov, 2013 · 5 commits

IB/core: Re-enable create_flow/destroy_flow uverbs · 69ad5da4

Authored by Matan Barak on Nov 06, 2013

This commit reverts commit 7afbddfa ("IB/core: Temporarily disable
create_flow/destroy_flow uverbs").  Since the uverbs extensions
functionality was experimental for v3.12, this patch re-enables the
support for them and flow-steering for v3.13.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/core: extended command: an improved infrastructure for uverbs commands · f21519b2

Authored by Yann Droneaud on Nov 06, 2013

Commit 400dbc96 ("IB/core: Infrastructure for extensible uverbs
commands") added an infrastructure for extensible uverbs commands
while later commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow
through uverbs") exported ib_create_flow()/ib_destroy_flow() functions
using this new infrastructure.

According to the commit 400dbc96, the purpose of this
infrastructure is to support passing around provider (eg. hardware)
specific buffers when userspace issues commands to the kernel, so that
it would be possible to extend uverbs (eg. core) buffers independently
from the provider buffers.

But the new kernel command function prototypes were not modified to
take advantage of this extension. This issue was exposed by Roland
Dreier in a previous review[1].

So the following patch is an attempt at a revised extensible command
infrastructure.

This improved extensible command infrastructure distinguishes
core (eg. legacy) command/response buffers from provider
(eg. hardware) command/response buffers: each extended command
implementing function is given a struct ib_udata to hold core
(eg. uverbs) input and output buffers, and another struct ib_udata to
hold the hw (eg. provider) input and output buffers.

Having those buffers identified separately makes it easier to increase
one buffer to support extension without having to add some code to
guess the exact size of each command/response parts: This should make
the extended functions more reliable.

Additionally, instead of relying on command identifier being greater
than IB_USER_VERBS_CMD_THRESHOLD, the proposed infrastructure relies on
unused bits in the command field: of the 32 bits provided by the command
field, only 6 bits are really needed to encode the identifier of
commands currently supported by the kernel. (Even using only 6 bits
leaves room for about 23 new commands).

So this patch makes use of some high order bits in command field to
store flags, leaving enough room for more command identifiers than one
will ever need (eg. 256).

The new flags are used to specify if the command should be processed
as an extended one or a legacy one. While designing the new command
format, care was taken to make usage of flags itself extensible.

Using high order bits of the command field ensures that newer
libibverbs on older kernel will properly fail when trying to call
extended commands. On the other hand, older libibverbs on newer kernel
will never be able to issue calls to extended commands.

The extended command header includes the optional response pointer so
that output buffer length and output buffer pointer are located
together in the command, allowing proper parameters checking. This
should make implementing functions easier and safer.

Additionally the extended header ensures 64-bit alignment, while making
all sizes multiples of 8 bytes, extending the maximum buffer size:

                             legacy      extended

   Maximum command buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)
  Maximum response buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)

For the purpose of doing proper buffer size accounting, the header
sizes are no longer taken into account in "in_words".

One of the oddities of the current extensible infrastructure, reading
the "legacy" command header twice, is fixed by removing the "legacy"
command header from the extended command header: they are processed as
two different parts of the command, memory is read once, and
information is not duplicated. This makes it clear that this is an extended
command scheme and not a different command scheme.

The proposed scheme will format input (command) and output (response)
buffers this way:

- command:

  legacy header +
  extended header +
  command data (core + hw):

    +----------------------------------------+
    | flags     |   00      00    |  command |
    |        in_words    |   out_words       |
    +----------------------------------------+
    |                 response               |
    |                 response               |
    | provider_in_words | provider_out_words |
    |                 padding                |
    +----------------------------------------+
    |                                        |
    .              <uverbs input>            .
    .              (in_words * 8)            .
    |                                        |
    +----------------------------------------+
    |                                        |
    .             <provider input>           .
    .          (provider_in_words * 8)       .
    |                                        |
    +----------------------------------------+

- response, if present:

    +----------------------------------------+
    |                                        |
    .          <uverbs output space>         .
    .             (out_words * 8)            .
    |                                        |
    +----------------------------------------+
    |                                        |
    .         <provider output space>        .
    .         (provider_out_words * 8)       .
    |                                        |
    +----------------------------------------+

The overall design is to ensure that the extensible infrastructure is
itself extensible while being more reliable with more input and bounds
checking.

Note:

The unused field in the extended header would be a perfect candidate to
hold the command "comp_mask" (eg. bit field used to handle
compatibility).  This was suggested by Roland Dreier in a previous
review[2].  But "comp_mask" field is likely to be present in the uverb
input and/or provider input, likewise for the response, as noted by
Matan Barak[3], so it doesn't make sense to put "comp_mask" in the
header.

[1]:
http://marc.info/?i=CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA@mail.gmail.com

[2]:
http://marc.info/?i=CAL1RGDXJtrc849M6_XNZT5xO1+ybKtLWGq6yg6LhoSsKpsmkYA@mail.gmail.com

[3]:
http://marc.info/?i=525C1149.6000701@mellanox.com

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com

[ Convert "ret ? ret : 0" to the equivalent "ret".  - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>
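
The command word drawn in the diagram above (flags in the top byte, two zero bytes, identifier in the low byte) can be decoded as in the sketch below. The macro names, the exact mask values and the choice of 0x80 as the "extended" flag are assumptions made for illustration; the message itself only states that high-order bits carry flags and that extended sizes are counted in 8-byte units.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed constants: flags in the top byte, identifier in the low byte. */
#define SKETCH_CMD_FLAGS_SHIFT   24
#define SKETCH_CMD_FLAGS_MASK    0xff000000u
#define SKETCH_CMD_COMMAND_MASK  0x000000ffu
#define SKETCH_CMD_FLAG_EXTENDED 0x80u

/* Command identifier: the low byte of the command word. */
uint32_t sketch_cmd_id(uint32_t command)
{
    return command & SKETCH_CMD_COMMAND_MASK;
}

/* Flags: the top byte of the command word, shifted down. */
uint32_t sketch_cmd_flags(uint32_t command)
{
    return (command & SKETCH_CMD_FLAGS_MASK) >> SKETCH_CMD_FLAGS_SHIFT;
}

/* The two middle bytes are unused and must be zero. */
bool sketch_cmd_valid(uint32_t command)
{
    return (command & ~(SKETCH_CMD_FLAGS_MASK | SKETCH_CMD_COMMAND_MASK)) == 0;
}

bool sketch_cmd_is_extended(uint32_t command)
{
    return (sketch_cmd_flags(command) & SKETCH_CMD_FLAG_EXTENDED) != 0;
}

/* Extended-header sizes are counted in 8-byte words. */
uint32_t sketch_words_to_bytes(uint16_t words)
{
    return (uint32_t)words * 8u;
}
```

With a 16-bit word count, each of the core and provider parts can describe at most 65535 * 8 = 524,280 bytes, which is the roughly 512 KBytes per part shown in the size table. A kernel that compares the whole 32-bit word against the identifiers it knows would not recognise a word with a flag bit set, which matches the message's statement that extended calls fail properly on older kernels.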

IB/core: Make uverbs flow structure use names like verbs ones · b68c9560

Authored by Yann Droneaud on Nov 06, 2013

This patch adds "flow" prefix to most of data structure added as part
of commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow through
uverbs") to keep those names in sync with the data structures added in
commit 319a441d ("IB/core: Add receive flow steering support").

It's just a matter of translating 'ib_flow' to 'ib_uverbs_flow'.
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/core: Rename 'flow' structs to match other uverbs structs · d82693da

Authored by Yann Droneaud on Nov 06, 2013

Commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow through
uverbs") added public data structures to support receive flow
steering. The new structs are not following the 'uverbs' pattern:
they're lacking the common prefix 'ib_uverbs'.

This patch replaces ib_kern prefix by ib_uverbs.
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/core: clarify overflow/underflow checks on ib_create/destroy_flow · f8848274

Authored by Matan Barak on Nov 06, 2013

This patch fixes the following issues:

1. Unneeded checks were removed

2. Removed the fixed size out of flow_attr.size, thus simplifying the checks.

3. Remove a 32bit hole on 64bit systems with strict alignment in
   struct ib_kern_flow_att by adding a reserved field.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

16 Nov, 2013 · 1 commit

IB/core: Encorce MR access rights rules on kernel consumers · 1c636f80

Authored by Eli Cohen on Oct 31, 2013

Enforce the rule that when requesting remote write or atomic permissions, local
write must be indicated as well. See IB spec 11.2.8.2.

Spotted by: Hagay Abramovsky <hagaya@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
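
The rule stated here is a simple implication on the access flags: remote write or remote atomic implies local write. A sketch of the check follows, with flag names and bit values chosen for illustration only; they should be treated as assumptions rather than the header's definitions.

```c
#include <errno.h>

/* Illustrative flag bits; not taken from the kernel headers. */
#define SKETCH_ACCESS_LOCAL_WRITE   (1u << 0)
#define SKETCH_ACCESS_REMOTE_WRITE  (1u << 1)
#define SKETCH_ACCESS_REMOTE_READ   (1u << 2)
#define SKETCH_ACCESS_REMOTE_ATOMIC (1u << 3)

/* Return 0 if the flag combination is allowed, -EINVAL otherwise. */
int sketch_check_mr_access(unsigned int flags)
{
    /* Remote write or atomic access requires local write as well. */
    if ((flags & (SKETCH_ACCESS_REMOTE_WRITE | SKETCH_ACCESS_REMOTE_ATOMIC)) &&
        !(flags & SKETCH_ACCESS_LOCAL_WRITE))
        return -EINVAL;
    return 0;
}
```

Remote read on its own remains valid, since the rule only constrains the permissions that let a peer modify memory.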

09 Nov, 2013 · 1 commit

IB/core: Pass imm_data from ib_uverbs_send_wr to ib_send_wr correctly · 6b7d103c

Authored by Latchesar Ionkov on Oct 19, 2013

Currently, we don't copy the immediate data from the userspace struct
to the kernel one when UD messages are being sent.

This patch makes sure that the immediate data is set correctly.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Roland Dreier <roland@purestorage.com>

22 Oct, 2013 · 1 commit

IB/core: Temporarily disable create_flow/destroy_flow uverbs · 7afbddfa

Authored by Yann Droneaud on Oct 10, 2013

The create_flow/destroy_flow uverbs and the associated extensions to
the user-kernel verbs ABI are under review and are too experimental to
freeze at this point.

So that userspace is not exposed to experimental features and an unstable
ABI, temporarily disable this for v3.12 (with a Kconfig option behind
staging to reenable it if desired).

The feature will be enabled after proper cleanup for v3.13.
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1381351016.git.ydroneaud@opteya.com
Link: http://marc.info/?i=cover.1381177342.git.ydroneaud@opteya.com

[ Add a Kconfig option to reenable these verbs.  - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>

03 Sep, 2013 · 1 commit

IB/core: Better checking of userspace values for receive flow steering · 22878dbc

Authored by Matan Barak on Sep 01, 2013

  - Don't allow unsupported comp_mask values, user should check
    ibv_query_device to know which features are supported.
  - Add a check in ib_uverbs_create_flow() to verify the size passed
    from the user space.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

29 Aug, 2013 · 1 commit

IB/core: Export ib_create/destroy_flow through uverbs · 436f2ad0

Authored by Hadar Hen Zion on Aug 14, 2013

Implement ib_uverbs_create_flow() and ib_uverbs_destroy_flow() to
support flow steering for user space applications.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

14 Aug, 2013 · 1 commit

IB/core: Fixes to XRC reference counting in uverbs · 846be90d

Authored by Yishai Hadas on Aug 01, 2013

Added reference counting mechanism for XRC target QPs between
ib_uqp_object and its ib_uxrcd_object.  This prevents closing an XRC
domain that is still attached to a QP.  In addition, add missing code
in ib_uverbs_destroy_srq() to handle ib_uxrcd_object reference
counting correctly when destroying an xsrq.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

09 Jul, 2013 · 1 commit

IB/uverbs: Use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd() · da183c7a

Authored by Roland Dreier on Jul 08, 2013

The macro get_unused_fd() is used to allocate a file descriptor with
default flags.  Those default flags (0) can be "unsafe": O_CLOEXEC must
be used by default to not leak file descriptor across exec().

Replace calls to get_unused_fd() in uverbs with calls to
get_unused_fd_flags(O_CLOEXEC).  Inheriting uverbs fds across exec()
cannot be used to do anything useful.

Based on a patch/suggestion from Yann Droneaud <ydroneaud@opteya.com>.
Signed-off-by: Roland Dreier <roland@purestorage.com>

28 Feb, 2013 · 1 commit

IB/core: convert to idr_alloc() · 3b069c5d

Authored by Tejun Heo on Feb 27, 2013

Convert to the much saner new idr interface.

v2: Mike triggered WARN_ON() in idr_preload() because send_mad(),
    which may be used from non-process context, was calling
    idr_preload() unconditionally.  Preload iff @gfp_mask has
    __GFP_WAIT.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reported-by: "Marciniszyn, Mike" <mike.marciniszyn@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

23 Feb, 2013 · 1 commit

new helper: file_inode(file) · 496ad9aa

Authored by Al Viro on Jan 23, 2013

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

22 Feb, 2013 · 1 commit

IB/uverbs: Implement memory windows support in uverbs · 6b52a12b

Authored by Shani Michaeli on Feb 06, 2013

The existing user/kernel uverbs API has IB_USER_VERBS_CMD_ALLOC/DEALLOC_MW.
Implement these calls, along with destroying user memory windows during
process cleanup.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>