- 13 June 2015, 19 commits
-
-
Committed by Ira Weiny
Add OPA MAD support flags to the core capability immutable flags. In addition, add the rdma_cap_opa_mad helper function for core code to detect OPA MAD support. OPA MADs share a common header with IBTA MADs but differ in a few ways for increased performance. Sharing a common header with IBTA MADs allows us to share most of the MAD processing code when dealing with OPA MADs, in addition to supporting some IBTA MADs on OPA devices. OPA MADs differ in the following ways:
1) MADs are variable size, up to 2K; IBTA-defined MADs remain fixed at 256 bytes.
2) OPA SMPs must carry valid PKeys.
3) OPA SMP packets use a different format.
The MAD stack will use this new functionality to determine whether OPA MAD processing should occur on individual device ports.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
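A minimal sketch of what such a capability helper can look like, assuming the per-port immutable data carries the core capability flags; the field and flag names (port_immutable, core_cap_flags, RDMA_CORE_CAP_OPA_MAD) are assumptions based on the description above, not quoted from the patch.

    /* Sketch: detect OPA MAD support from the immutable core capability flags.
     * Field and flag names are illustrative assumptions. */
    static inline bool rdma_cap_opa_mad(struct ib_device *device, u8 port_num)
    {
            return (device->port_immutable[port_num].core_cap_flags &
                    RDMA_CORE_CAP_OPA_MAD) == RDMA_CORE_CAP_OPA_MAD;
    }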
-
Committed by Ira Weiny
In order to support alternate-sized MADs (and variable-sized MADs on OPA devices), add in/out MAD size parameters to the process_mad core call. In addition, add an out_mad_pkey_index parameter to communicate the PKey index the driver wishes the MAD stack to use when sending OPA MAD responses. The out MAD size and the out MAD PKey index are required by the MAD stack to generate responses on OPA devices. Furthermore, the in and out MAD parameters are made generic by specifying them as ib_mad_hdr rather than ib_mad. Drivers are modified as needed and are protected by BUG_ON checks if the MAD sizes passed to them are incorrect.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
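A sketch of what the extended driver callback could look like after this change; the parameter names and ordering are assumptions inferred from the description, not copied from the patch.

    /* Sketch: process_mad extended with explicit in/out MAD sizes and an out
     * PKey index. Names and ordering are illustrative assumptions. */
    int (*process_mad)(struct ib_device *device, int process_mad_flags,
                       u8 port_num, const struct ib_wc *in_wc,
                       const struct ib_grh *in_grh,
                       const struct ib_mad_hdr *in_mad, size_t in_mad_size,
                       struct ib_mad_hdr *out_mad, size_t *out_mad_size,
                       u16 *out_mad_pkey_index);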
-
Committed by Ira Weiny
This patch implements allocating alternate receive MAD buffers within the MAD stack. Support for OPA send/recv of variable-sized MADs is implemented later.
1) Convert MAD allocations from kmem_cache to kzalloc. kzalloc is more flexible in supporting devices with different MAD sizes, and research and testing showed that the current use of kmem_cache provides no performance benefit over kzalloc.
2) Change struct ib_mad_private to use a flex array for the MAD data.
3) Allocate ib_mad_private based on the size specified by devices in rdma_max_mad_size.
4) Carry the allocated size in ib_mad_private to be used when processing ib_mad_private objects.
5) Alter DMA mappings based on the mad_size of ib_mad_private.
6) Replace the use of sizeof and static defines as appropriate.
7) Add appropriate casts for the MAD data when calling processing functions.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
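A minimal sketch, drawn from points 2)-4) above, of a flex-array receive buffer and a size-aware allocation; the struct layout and the helper name are assumptions, not the patch's exact code.

    /* Sketch: receive MAD buffer with a flexible array member, sized per
     * device. Layout and names are illustrative assumptions. */
    struct ib_mad_private {
            struct ib_mad_private_header header;
            size_t mad_size;    /* allocated size, carried for later processing */
            struct ib_grh grh;
            u8 mad[0];          /* IB MAD (256 bytes) or OPA MAD (up to 2K) */
    };

    static struct ib_mad_private *alloc_mad_private(size_t mad_size, gfp_t flags)
    {
            struct ib_mad_private *mp = kzalloc(sizeof(*mp) + mad_size, flags);

            if (mp)
                    mp->mad_size = mad_size;
            return mp;
    }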
-
Committed by Ira Weiny
Add the max MAD size to the device immutable data set and have all drivers that support MADs report the current IB MAD size (IB_MGMT_MAD_SIZE) to the core. Verify the MAD size data both in the MAD core and when reading the immutable data. OPA drivers will report alternate MAD sizes in subsequent patches.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
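For illustration, a driver's immutable-data callback might populate the new field roughly as below; the callback shape and the max_mad_size field name are assumptions, only IB_MGMT_MAD_SIZE comes from the text above.

    /* Sketch: an IB driver reporting the standard MAD size in its per-port
     * immutable data. Function and field names are illustrative assumptions. */
    static int example_port_immutable(struct ib_device *ibdev, u8 port_num,
                                      struct ib_port_immutable *immutable)
    {
            immutable->core_cap_flags = RDMA_CORE_PORT_IBA_IB;
            immutable->max_mad_size   = IB_MGMT_MAD_SIZE; /* 256 bytes */
            return 0;
    }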
-
Committed by Ira Weiny
In preparation for supporting the new OPA MAD base version, add a base version parameter to ib_create_send_mad and set it to IB_MGMT_BASE_VERSION for current users. Definition of the new base version and its processing will occur in later patches.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
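A hedged sketch of an existing caller updated to pass the base version explicitly; the surrounding argument list reflects the commonly known ib_create_send_mad parameters but should be treated as an assumption rather than a quote from the patch.

    /* Sketch: current callers simply pass IB_MGMT_BASE_VERSION for the new
     * final parameter. The argument list is an illustrative assumption. */
    struct ib_mad_send_buf *msg;

    msg = ib_create_send_mad(mad_agent, remote_qpn, pkey_index,
                             0 /* rmpp_active */, IB_MGMT_MAD_HDR,
                             sizeof(struct ib_mad) - IB_MGMT_MAD_HDR,
                             GFP_KERNEL, IB_MGMT_BASE_VERSION);
    if (IS_ERR(msg))
            return PTR_ERR(msg);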
-
Committed by Ira Weiny
IB and OPA SMPs share the same processing algorithm but have different header formats and permissive LID detection. Add a helper function, generic over the DR forwarding checks, that can be used by both IB and OPA SMP code. Use this function in the current IB function smi_check_forward_dr_smp.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
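One way such a header-agnostic helper can be shaped is to take the already-extracted fields rather than a struct ib_smp, letting the IB wrapper pull them out of its own header format; the signatures below are an assumption sketched from the description, not the patch itself.

    /* Sketch: a DR forwarding check that only sees extracted header fields, so
     * IB and OPA callers can share it. Signatures are illustrative assumptions. */
    static enum smi_forward_action __smi_check_forward_dr_smp(u8 hop_ptr, u8 hop_cnt,
                                                              u8 direction,
                                                              bool dr_dlid_is_permissive,
                                                              bool dr_slid_is_permissive);

    enum smi_forward_action smi_check_forward_dr_smp(struct ib_smp *smp)
    {
            return __smi_check_forward_dr_smp(smp->hop_ptr, smp->hop_cnt,
                                              ib_get_smp_direction(smp),
                                              smp->dr_dlid == IB_LID_PERMISSIVE,
                                              smp->dr_slid == IB_LID_PERMISSIVE);
    }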
-
Committed by Ira Weiny
IB and OPA SMPs share the same processing algorithm but have different header formats and permissive LID detection. Add a helper function, generic over processing DR SMP receive messages, that can be used by both IB and OPA SMP code. Use this function in the current IB function smi_handle_dr_smp_recv.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Ira Weiny
IB and OPA SMPs share the same processing algorithm but have different header formats and permissive LID detection. Add a helper function, generic over processing DR SMP send messages, that can be used by both IB and OPA SMP code. Use this function in the current IB function smi_handle_dr_smp_send.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Ira Weiny
Make a helper function for processing directed-route SMPs, to be called by the IB MAD receive handler, ib_mad_recv_done_handler. This cleans up the MAD receive handler code a bit and allows us to better share the SMP processing code between IB and OPA SMPs. IB and OPA SMPs share the same processing algorithm but have different header formats and permissive LID detection. Therefore this and subsequent patches split the common processing code from the IB-specific code in anticipation of sharing those algorithms with the OPA code.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Ira Weiny
ib_find_send_mad only needs access to the MAD header, not the full IB MAD. Change the local variable to ib_mad_hdr and change the corresponding cast. This allows clean usage of the function with both IB and OPA MADs, because OPA MADs carry the same header as IB MADs.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Ira Weiny
find_mad_agent only needs read-only access to the MAD header. Update the ib_mad pointer to be a const ib_mad_hdr and adjust the call tree.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Matan Barak
This includes:
* support allocation of a CQ with the TIMESTAMP_COMPLETION creation flag;
* add timestamp_mask and hca_core_clock to query_device, reporting the number of supported timestamp bits (mask) and the hca_core_clock frequency;
* return the HCA core clock's offset in the query_device vendor data, which is needed in order to read the HCA's core clock.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Matan Barak
In order to read the HCA's cycle counter efficiently in user space, we need to map the HCA's register. This is done through an mmap call.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Matan Barak
Vendors should be able to pass vendor-specific data to/from user space via the query_device uverb. In order to do this, we need to pass the vendor-specific udata.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
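As a sketch, the extended driver callback could carry the udata alongside the attribute structure; this signature is an assumption consistent with the description, not a quote from the patch.

    /* Sketch: query_device callback extended with a udata pointer so vendors
     * can exchange vendor-specific data with user space. Illustrative only. */
    int (*query_device)(struct ib_device *device,
                        struct ib_device_attr *props,
                        struct ib_udata *udata);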
-
Committed by Matan Barak
In order to expose timestamps we need to expose two new attributes in query_device, to be used for CQ completion time-stamping:
timestamp_mask - how many bits of the timestamp are valid; timestamp values can be at most 64 bits.
hca_core_clock - the timestamp is given in HW cycles; this attribute reports the HCA's frequency in kHz, necessary in order to convert cycles to seconds.
This is added both to ib_query_device and to its uverbs counterpart.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
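To make the two attributes concrete, a hedged user-space helper for turning a raw completion timestamp into nanoseconds, assuming hca_core_clock is reported in kHz as described above; only the attribute names come from the text, everything else is illustrative.

    /* Sketch: convert a CQ completion timestamp (HW cycles) to nanoseconds
     * using the attributes described above. Illustrative only. */
    #include <stdint.h>

    static inline uint64_t cycles_to_ns(uint64_t raw_ts, uint64_t timestamp_mask,
                                        uint64_t hca_core_clock_khz)
    {
            uint64_t cycles = raw_ts & timestamp_mask; /* keep only valid bits */

            /* at f kHz, one cycle lasts 1e6 / f nanoseconds */
            return (cycles * 1000000ULL) / hca_core_clock_khz;
    }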
-
Committed by Matan Barak
ib_uverbs_ex_create_cq follows the extension verbs mechanism. New features (for example, the CQ creation flags field added in a downstream patch) can be used by user-space libraries without breaking the ABI.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Matan Barak
Add a CQ creation flag which dictates that the created CQ will report the completion timestamp value in the WC.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
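For illustration, a kernel consumer could request timestamping at CQ creation roughly as below; the flag name IB_CQ_FLAGS_TIMESTAMP_COMPLETION and the attr-based ib_create_cq call are assumptions consistent with the surrounding patches, not quotes from this one.

    /* Sketch: ask for completion timestamps at CQ creation time.
     * Flag name and call shape are illustrative assumptions. */
    struct ib_cq_init_attr cq_attr = {
            .cqe         = 1024,
            .comp_vector = 0,
            .flags       = IB_CQ_FLAGS_TIMESTAMP_COMPLETION,
    };
    struct ib_cq *cq;

    cq = ib_create_cq(device, comp_handler, event_handler, cq_context, &cq_attr);
    if (IS_ERR(cq))
            return PTR_ERR(cq);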
-
Committed by Matan Barak
Currently, ib_create_cq uses cqe and comp_vector instead of the extendible ib_cq_init_attr struct. Earlier patches already changed the vendors to work with ib_cq_init_attr. This patch changes the consumers too.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Matan Barak
Add a new ib_cq_init_attr structure which contains the previous cqe (minimum number of CQ entries) and comp_vector (completion vector) fields, in addition to a new flags field. All vendors' create_cq callbacks are changed in order to work with the new API. This commit does not change any functionality.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> to patch #2
Signed-off-by: Doug Ledford <dledford@redhat.com>
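A minimal sketch of the extensible attribute structure and of a vendor callback taking it; the field names follow the description above, but the exact layout and callback signature are assumptions.

    /* Sketch: the extensible CQ creation attributes and a vendor create_cq
     * callback taking them. Layout and signature are illustrative assumptions. */
    struct ib_cq_init_attr {
            unsigned int cqe;         /* minimum number of CQ entries */
            int          comp_vector; /* completion vector */
            u32          flags;       /* new extension flags, currently unused */
    };

    struct ib_cq *(*create_cq)(struct ib_device *ibdev,
                               const struct ib_cq_init_attr *attr,
                               struct ib_ucontext *context,
                               struct ib_udata *udata);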
-
- 12 June 2015, 2 commits
-
-
Committed by Hariprasad S
Handle this configuration: Queues Per Page * SGE BAR2 Queue Register Area Size > Page Size. Use cxgb4_bar2_sge_qregs() to obtain the proper location within the BAR2 region for a given qid. Rework the DB and GTS write functions to make use of this BAR2 info.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Hariprasad S
Enhance cxgb4_t4_bar2_sge_qregs() and cxgb4_bar2_sge_qregs() to support T4 user-mode mappings, and update all the current users as well.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 11 June 2015, 6 commits
-
-
Committed by Doug Ledford
-
Committed by Colin Ian King
A reorganisation of the PD allocation and deallocation in commit 9ba1377d ("RDMA/ocrdma: Move PD resource management to driver.") introduced a double free on pd, as detected by smatch static analysis:
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:682 ocrdma_alloc_pd() error: double free of 'pd'
The original call to ocrdma_mbx_dealloc_pd() (which does not kfree pd) was replaced with a call to _ocrdma_dealloc_pd() (which does kfree pd). The kfree following this call causes the double free, so just remove it to fix the problem.
Fixes: 9ba1377d ("RDMA/ocrdma: Move PD resource management to driver.")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-By: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
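A hedged sketch of the corrected unwind path; only _ocrdma_dealloc_pd() and the removed kfree() come from the description, the label and status variable are illustrative assumptions.

    /* Sketch of the corrected error path: _ocrdma_dealloc_pd() already frees
     * the pd, so it must not be followed by kfree(pd). Illustrative only. */
    err:
            _ocrdma_dealloc_pd(dev, pd);    /* frees pd internally */
            /* kfree(pd) removed here -- it caused the double free */
            return ERR_PTR(status);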
-
Committed by Dan Carpenter
This code causes a static checker warning:
drivers/infiniband/hw/usnic/usnic_uiom.c:476 usnic_uiom_alloc_pd() warn: passing zero to 'PTR_ERR'
This code isn't buggy, but iommu_domain_alloc() doesn't return an error pointer, so we can simplify the error handling and silence the static checker warning. The warning is meant to catch places which do:
if (!ptr) return ERR_PTR(ptr);
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dave Goodell <dgoodell@cisco.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
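As a sketch of the simplification: since iommu_domain_alloc() returns NULL on failure rather than an error pointer, the check can become a plain NULL test with a fixed errno; the surrounding pd variable and cleanup are assumptions based on the warning text.

    /* Sketch: iommu_domain_alloc() returns NULL on failure, not ERR_PTR, so a
     * plain NULL check with a fixed errno is enough. Illustrative only. */
    pd->domain = iommu_domain_alloc(&pci_bus_type);
    if (!pd->domain) {
            kfree(pd);
            return ERR_PTR(-ENOMEM);
    }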
-
Committed by Fabian Frederick
Use the kernel.h macro definition. Thanks to Julia Lawall for Coccinelle scripting support.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Moni Shoua
Registering an event handler is done for a device. This device may have one RoCE port (no SA cap) and one InfiniBand port (which has the SA cap). Therefore, a warning from the event handler about a specific port that lacks the SA cap is correct but needlessly pollutes the kernel log.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Moni Shoua
The Subnet Administrator (SA) is not a component of the RoCE spec. Therefore, it should not be a capability of a RoCE port.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 02 June 2015, 10 commits
-
-
Committed by Doug Ledford
-
Committed by Ira Weiny
In order to support constant callers of agent_send_response, add const specifiers to its pointer arguments and adjust the call tree accordingly.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Ira Weiny
The process_mad device function declares some parameters as "in". Make those parameters const and adjust the call tree under process_mad in the various drivers accordingly.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Ira Weiny
The ib_device passed to the new RDMA helpers is constant. Declare the ib_device as const in the following functions:
rdma_protocol_ib
rdma_protocol_roce
rdma_protocol_iwarp
rdma_ib_or_roce
rdma_cap_ib_mad
rdma_cap_ib_smi
rdma_cap_ib_cm
rdma_cap_iw_cm
rdma_cap_ib_sa
rdma_cap_ib_mcast
rdma_cap_af_ib
rdma_cap_eth_ah
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
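For illustration, one of the helpers after this constification might read as below; only the helper name and the const qualifier come from the list above, the body is an assumption.

    /* Sketch: a read-only capability helper taking a const ib_device.
     * The body is an illustrative assumption. */
    static inline bool rdma_protocol_ib(const struct ib_device *device, u8 port_num)
    {
            return device->port_immutable[port_num].core_cap_flags &
                   RDMA_CORE_CAP_PROT_IB;
    }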
-
Committed by Roland Dreier
The unwinding cleanup code at err_create_flow starts at the current index i. That means we shouldn't increment i until we're really sure we won't have to destroy the current flow; otherwise we might increment the index, fail inside an is_bonded block, and end up accessing off the end of the reg_id[] array. This was detected by Coverity (CID 1271229).
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Roland Dreier
If ocrdma_get_pd_num() fails, we need to free the pd struct we allocated. This was detected by Coverity (CID 1271245).
Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-By: Devesh Sharma <devesh.sharma@avagotech.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Wengang Wang
The BUG_ON at lines 452/453 is triggered in function rds_send_xmit:

    441                 while (ret) {
    442                         tmp = min_t(int, ret, sg->length -
    443                                               conn->c_xmit_data_off);
    444                         conn->c_xmit_data_off += tmp;
    445                         ret -= tmp;
    446                         if (conn->c_xmit_data_off == sg->length) {
    447                                 conn->c_xmit_data_off = 0;
    448                                 sg++;
    449                                 conn->c_xmit_sg++;
    450                                 if (ret != 0 && conn->c_xmit_sg == rm->data.op_nents)
    451                                         printk(KERN_ERR "conn %p rm %p sg %p ret %d\n", conn, rm, sg, ret);
    452                                 BUG_ON(ret != 0 &&
    453                                        conn->c_xmit_sg == rm->data.op_nents);
    454                         }
    455                 }

It is complaining that the total sent length is bigger than what we want to send: rds_ib_xmit() returns the wrong value for the second entry of the same rds_message. The sg and off passed by rds_send_xmit to rds_ib_xmit are based on scatterlist.offset/length, but the rds_ib_xmit action is based on scatterlist.dma_address/dma_length; when dma_length is larger than length there is a problem. For the second and later rds_ib_xmit calls for the same rds_message, at least one of the following two is wrong:
1) the scatterlist to start with -- the chosen one can be far beyond the correct one;
2) the offset to start with within that scatterlist.
Fix: add op_dmasg and op_dmaoff to the rm_data_op structure, indicating the scatterlist entry and the offset within it that rds_ib_xmit should start from. op_dmasg and op_dmaoff are initialized to zero when doing the DMA mapping for the first send of the message and are updated when filling send slots. The same applies to rds_iw_xmit.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Bart Van Assche
Avoid sparse complaints about ipoib_neigh_hash_init(). This patch does not change any functionality. See also patch "IPoIB: Fix memory leak in the neigh table deletion flow" (commit ID 66172c09).
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Faisal Latif
RDMA/nes: Enable the use of the tos field in the nes driver.
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Steve Wise
rdma-cma/iw_cm: Export the tos field to iWARP providers.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 21 May 2015, 3 commits
-
-
Committed by Doug Ledford
-
Committed by Ira Weiny
After discussion upstream, it was agreed to transition the kernel's usage of "iboe" to "roce". This keeps our terminology consistent with what was finalized in the IBTA Annex 16 and IBTA Annex 17 publications.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Committed by Ira Weiny
Remove the query_protocol callback and use the new core capability bits for:
rdma_protocol_*
rdma_cap_ib_mad
rdma_cap_ib_smi
rdma_cap_ib_cm
rdma_cap_iw_cm
rdma_cap_ib_sa
rdma_cap_ib_mcast
rdma_cap_af_ib
rdma_cap_eth_ah
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
-