Commits · 15865e7dab62a58407f1b7decdafd89dd0a8b063 · openanolis / cloud-kernel

31 Aug, 2015 4 commits

IB/cm: Expose service ID in request events · 15865e7d

Authored by Haggai Eran on Jul 30, 2015

Expose the service ID on an incoming CM or SIDR request to the event
handler. This will allow the RDMA CM module to de-multiplex connection
requests based on the information encoded in the service ID.
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

15865e7d

IB/core: Find the network device matching connection parameters · 9268f72d

Authored by Yotam Kenneth on Jul 30, 2015

In the case of IPoIB, and maybe in other cases, the network device is
managed by an upper-layer protocol (ULP). In order to expose this
network device to other users of the IB device, let ULPs implement
a callback that returns network device according to connection parameters.

The IB device and port, together with the P_Key and the GID should
be enough to uniquely identify the ULP net device. However, in current
kernels there can be multiple IPoIB interfaces created with the same GID.
Furthermore, such configuration may be desirable to support ipvlan-like
configurations for RDMA CM with IPoIB. To resolve the device in these
cases the code will also take the IP address as an additional input.
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Yotam Kenneth <yotamke@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Guy Shapiro <guysh@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

9268f72d

IB/core: lock client data with lists_rwsem · 7c1eb45a

Authored by Haggai Eran on Jul 30, 2015

An ib_client callback that is called with the lists_rwsem locked only for
read is protected from changes to the IB client lists, but not from
ib_unregister_device() freeing its client data. This is because
ib_unregister_device() will remove the device from the device list with
lists_rwsem locked for write, but perform the rest of the cleanup,
including the call to remove() without that lock.

Mark client data that is undergoing de-registration with a new going_down
flag in the client data context. Lock the client data list with lists_rwsem
for write in addition to using the spinlock, so that functions calling the
callback would be able to lock only lists_rwsem for read and let callbacks
sleep.

Since ib_unregister_client() now marks the client data context, no need for
remove() to search the context again, so pass the client data directly to
remove() callbacks.
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

7c1eb45a

IB/core: Add rwsem to allow reading device list or client list · 5aa44bb9

Authored by Haggai Eran on Jul 30, 2015

Currently the RDMA subsystem's device list and client list are protected by
a single mutex. This prevents adding user-facing APIs that iterate these
lists, since using them may cause a deadlock. The patch attempts to solve
this problem by adding a read-write semaphore to protect the lists. Readers
now don't need the mutex, and are safe just by read-locking the semaphore.

The ib_register_device, ib_register_client, ib_unregister_device, and
ib_unregister_client functions are modified to lock the semaphore for write
during their respective list modification. Also, in order to make sure
client callbacks are called only between add() and remove() calls, the code
is changed to only add items to the lists after the add() calls and remove
from the lists before the remove() calls.

This patch attempts to solve a similar need [1] that was seen in the RoCE
v2 patch series.

[1] http://www.spinics.net/lists/linux-rdma/msg24733.html
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Matan Barak <matanb@mellanox.com>
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

5aa44bb9

29 Aug, 2015 2 commits

RDMA/cma: fix IPv6 address resolution · 6c26a771

Authored by Spencer Baugh on Aug 13, 2015

Resolving a link-local IPv6 address with an unspecified source address
was broken by commit 5462eddd, which prevented the IPv6 stack from
learning the scope id of the link-local IPv6 address, causing random
failures as the IP stack chose a random link to resolve the address on.

This commit 5462eddd made us bail out of cma_check_linklocal early if
the address passed in was not an IPv6 link-local address. On the address
resolution path, the address passed in is the source address; if the
source address is the unspecified address, which is not link-local, we
will bail out early.

This is mostly correct, but if the destination address is a link-local
address, then we will be following a link-local route, and we'll need to
tell the IPv6 stack what the scope id of the destination address is.
This used to be done by the last line of cma_check_linklocal, which is
skipped when bailing out early:

	dev_addr->bound_dev_if = sin6->sin6_scope_id;

(In cma_bind_addr, the sin6_scope_id of the source address is set to the
sin6_scope_id of the destination address, so this is correct.)
This line is required in turn for the following line, L279 of
addr6_resolve, to actually inform the IPv6 stack of the scope id:

      fl6.flowi6_oif = addr->bound_dev_if;

Since we can only know we are in this failure case when we have access
to both the source IPv6 address and destination IPv6 address, we have to
deal with this further up the stack. So detect this failure case in
cma_bind_addr, and set bound_dev_if to the destination address scope id
to correct it.
Signed-off-by: Spencer Baugh <sbaugh@catern.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

6c26a771
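
The failure case handled by 6c26a771 (unspecified source, link-local destination) can be illustrated with a small userspace sketch. This is not the kernel code: `struct dev_addr_sketch` and `bind_scope_from_dst` are hypothetical names, and only the POSIX `struct sockaddr_in6` and `IN6_IS_ADDR_*` macros are real.

```c
#include <netinet/in.h>

/* Hypothetical stand-in for the kernel structure that holds bound_dev_if. */
struct dev_addr_sketch {
	int bound_dev_if;
};

/*
 * If the source is the unspecified address (::) and the destination is
 * link-local, bind to the interface named by the destination scope id.
 * Returns 1 when the binding was updated, 0 otherwise.
 */
static int bind_scope_from_dst(struct dev_addr_sketch *dev_addr,
			       const struct sockaddr_in6 *src,
			       const struct sockaddr_in6 *dst)
{
	if (IN6_IS_ADDR_UNSPECIFIED(&src->sin6_addr) &&
	    IN6_IS_ADDR_LINKLOCAL(&dst->sin6_addr)) {
		dev_addr->bound_dev_if = (int)dst->sin6_scope_id;
		return 1;
	}
	return 0;
}
```

The check needs both addresses at once, which matches the commit's point that the case can only be detected where source and destination are both available.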

IB/ucma: Fix theoretical user triggered use-after-free · 7e967fd0

Authored by Jason Gunthorpe on Aug 04, 2015

Something like this:

CPU A                         CPU B
========================      ================================
ucma_destroy_id()
 wait_for_completion()
                              .. anything
                                ucma_put_ctx()
                                  complete()
 .. continues ...
                              ucma_leave_multicast()
                               mutex_lock(mut)
                                 atomic_inc(ctx->ref)
                               mutex_unlock(mut)
 ucma_free_ctx()
  ucma_cleanup_multicast()
   mutex_lock(mut)
     kfree(mc)
                               rdma_leave_multicast(mc->ctx->cm_id,..

Fix it by latching the ref at 0. Once it goes to 0 mc and ctx cannot
leave the mutex(mut) protection.

The other atomic_inc in ucma_get_ctx is OK because mutex(mut) protects
it from racing with ucma_destroy_id.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

7e967fd0
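
"Latching the ref at 0" in 7e967fd0 can be sketched in userspace with C11 atomics. The helper name `ref_get_unless_zero` is an assumption made for this illustration; in the kernel, `atomic_inc_not_zero()` serves this kind of pattern.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the count is still non-zero. Once the count
 * has reached 0 it stays there ("latched"), so a late caller cannot
 * revive an object that is being torn down.
 */
static bool ref_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}
```

A caller that gets `false` must treat the object as already gone and must not touch it outside the protecting mutex.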

15 Jul, 2015 7 commits

IB/core: Destroy multcast_idr on module exit · 45d25420

Authored by Johannes Thumshirn on Jul 08, 2015

Destroy multcast_idr on module exit, reclaiming the allocated memory.

This was detected by the following semantic patch (written by Luis Rodriguez
<mcgrof@suse.com>)
<SmPL>
@ defines_module_init @
declarer name module_init, module_exit;
declarer name DEFINE_IDR;
identifier init;
@@

module_init(init);

@ defines_module_exit @
identifier exit;
@@

module_exit(exit);

@ declares_idr depends on defines_module_init && defines_module_exit @
identifier idr;
@@

DEFINE_IDR(idr);

@ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 idr_destroy(&idr);
 ...
}

@ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 +idr_destroy(&idr);
}

</SmPL>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>

45d25420

IB/ucm: Fix bitmap wrap when devnum > IB_UCM_MAX_DEVICES · 59d40dd9

Authored by Carol L Soto on Jun 11, 2015

ib_ucm_release_dev clears the wrong bit if devnum is greater
than IB_UCM_MAX_DEVICES.
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

59d40dd9
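
The kind of bug described in 59d40dd9 can be shown with a minimal sketch. It assumes device numbers beyond the first bitmap are tracked in a second, overflow bitmap; that layout, the constant `MAX_DEVICES_SKETCH`, and all function names are assumptions for illustration, not taken from the commit text.

```c
#include <stdint.h>

#define MAX_DEVICES_SKETCH 32	/* hypothetical stand-in for IB_UCM_MAX_DEVICES */

static uint32_t dev_map;	/* device numbers 0..31 */
static uint32_t overflow_map;	/* device numbers 32..63 */

/* Record a device number in whichever bitmap covers it. */
static void mark_devnum(unsigned int devnum)
{
	if (devnum < MAX_DEVICES_SKETCH)
		dev_map |= UINT32_C(1) << devnum;
	else
		overflow_map |= UINT32_C(1) << (devnum - MAX_DEVICES_SKETCH);
}

/*
 * Release must pick the same bitmap and rebase the index. Clearing
 * bit (devnum % 32) in dev_map instead would free the wrong device.
 */
static void release_devnum(unsigned int devnum)
{
	if (devnum < MAX_DEVICES_SKETCH)
		dev_map &= ~(UINT32_C(1) << devnum);
	else
		overflow_map &= ~(UINT32_C(1) << (devnum - MAX_DEVICES_SKETCH));
}
```

The sketch only covers device numbers below 64.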

IB/ucma: Fix lockdep warning in ucma_lock_files · 31b57b87

Authored by Haggai Eran on Jul 07, 2015

ucma_lock_files() locks the mut mutex on two files, e.g. for migrating
an ID. Use mutex_lock_nested() to prevent the warning below.

 =============================================
 [ INFO: possible recursive locking detected ]
 4.1.0-rc6-hmm+ #40 Tainted: G           O
 ---------------------------------------------
 pingpong_rpc_se/10260 is trying to acquire lock:
  (&file->mut){+.+.+.}, at: [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]

 but task is already holding lock:
  (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&file->mut);
   lock(&file->mut);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 1 lock held by pingpong_rpc_se/10260:
  #0:  (&file->mut){+.+.+.}, at: [<ffffffffa047ac4b>] ucma_migrate_id+0xbb/0x248 [rdma_ucm]

 stack backtrace:
 CPU: 0 PID: 10260 Comm: pingpong_rpc_se Tainted: G           O    4.1.0-rc6-hmm+ #40
 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
  ffff8801f85b63d0 ffff880195677b58 ffffffff81668f49 0000000000000001
  ffffffff825cbbe0 ffff880195677c38 ffffffff810bb991 ffff880100000000
  ffff880100000000 ffff880100000001 ffff8801f85b7010 ffffffff8121bee9
 Call Trace:
  [<ffffffff81668f49>] dump_stack+0x4f/0x6e
  [<ffffffff810bb991>] __lock_acquire+0x741/0x1820
  [<ffffffff8121bee9>] ? dput+0x29/0x320
  [<ffffffff810bcb38>] lock_acquire+0xc8/0x240
  [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
  [<ffffffff8166b901>] ? mutex_lock_nested+0x291/0x3e0
  [<ffffffff8166b6d5>] mutex_lock_nested+0x65/0x3e0
  [<ffffffffa047ac55>] ? ucma_migrate_id+0xc5/0x248 [rdma_ucm]
  [<ffffffff810baeed>] ? trace_hardirqs_on+0xd/0x10
  [<ffffffff8166b66e>] ? mutex_unlock+0xe/0x10
  [<ffffffffa047ac55>] ucma_migrate_id+0xc5/0x248 [rdma_ucm]
  [<ffffffffa0478474>] ucma_write+0xa4/0xb0 [rdma_ucm]
  [<ffffffff81200674>] __vfs_write+0x34/0x100
  [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
  [<ffffffff810ec055>] ? current_kernel_time+0xc5/0xe0
  [<ffffffff812aa4d3>] ? security_file_permission+0x23/0x90
  [<ffffffff8120088d>] ? rw_verify_area+0x5d/0xe0
  [<ffffffff812009bb>] vfs_write+0xab/0x120
  [<ffffffff81201519>] SyS_write+0x59/0xd0
  [<ffffffff8112427c>] ? __audit_syscall_entry+0xac/0x110
  [<ffffffff8166ffee>] system_call_fastpath+0x12/0x76
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

31b57b87
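
Taking two locks of the same class, as in 31b57b87, is only safe when every caller uses one consistent order. The userspace sketch below assumes an address-based order and uses pthread mutexes; `struct file_sketch` and the function names are hypothetical. It does not reproduce lockdep: in the kernel, `mutex_lock_nested()` with a subclass such as `SINGLE_DEPTH_NESTING` is the annotation that tells lockdep the second acquisition is intentional.

```c
#include <pthread.h>
#include <stdint.h>

struct file_sketch {
	pthread_mutex_t mut;
};

/* Return whichever of the two files has the lower address. */
static struct file_sketch *first_to_lock(struct file_sketch *a,
					 struct file_sketch *b)
{
	return (uintptr_t)a < (uintptr_t)b ? a : b;
}

/* Lock both files in address order; a and b must be distinct. */
static void lock_files(struct file_sketch *a, struct file_sketch *b)
{
	struct file_sketch *first = first_to_lock(a, b);
	struct file_sketch *second = (first == a) ? b : a;

	pthread_mutex_lock(&first->mut);
	pthread_mutex_lock(&second->mut);
}

/* Unlock order does not matter for correctness. */
static void unlock_files(struct file_sketch *a, struct file_sketch *b)
{
	pthread_mutex_unlock(&a->mut);
	pthread_mutex_unlock(&b->mut);
}
```

Because the order depends only on the addresses, `lock_files(x, y)` and `lock_files(y, x)` acquire the mutexes in the same sequence.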

RDMA/core: Fixes for port mapper client registration · a7f2f24c

Authored by Tatyana Nikolova on Jul 02, 2015

Fixes to allow clients to make remove mapping requests after
they have provided the user space service with the mapping
information they are using when the service is restarted.

1) Adding IWPM_REG_VALID, IWPM_REG_INCOMPL and IWPM_REG_UNDEF
   registration types for the port mapper clients and functions
   to set/check the registration type.
2) If the port mapper user space service is not available to register
   the client, then its registration stays IWPM_REG_UNDEF and the
   registration isn't checked until the service becomes available
   (no mappings are possible, if the user space service isn't running).
3) After the service is restarted, the user space port mapper pid is set
   to valid and the client registration is set to IWPM_REG_INCOMPL
   to allow the client to make remove mapping requests.
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

a7f2f24c

IB/cm: Do not queue work to a device that's going away · be4b4993

Authored by Erez Shitrit on Jun 25, 2015

Whenever ib_cm gets a remove_one call, for example on a hot-unplug
event, the driver should mark itself as going_down and make sure that no
new work is queued for that device.
So the order of the actions is:
1. mark the going_down bit.
2. flush the wq.
3. [make sure no new works for that device.]
4. unregister mad agent.

Otherwise, work that is already queued can be scheduled after the MAD
agent was freed.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

be4b4993
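
The ordering in be4b4993 (mark going_down, flush, then unregister) can be sketched in userspace. Everything here is a hypothetical stand-in: `struct cm_dev_sketch` models the device, a counter models the work queue, and a pthread mutex models the lock that protects the flag.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

struct cm_dev_sketch {
	pthread_mutex_t lock;
	bool going_down;
	int queued;		/* stands in for pending work items */
};

/* Queue work unless the device is going away. */
static int dev_queue_work(struct cm_dev_sketch *dev)
{
	int ret = 0;

	pthread_mutex_lock(&dev->lock);
	if (dev->going_down)
		ret = -ENODEV;
	else
		dev->queued++;
	pthread_mutex_unlock(&dev->lock);
	return ret;
}

static void dev_remove(struct cm_dev_sketch *dev)
{
	/* Step 1: mark the device so that no new work is accepted. */
	pthread_mutex_lock(&dev->lock);
	dev->going_down = true;
	pthread_mutex_unlock(&dev->lock);

	/* Step 2: stands in for flushing the work queue. */
	pthread_mutex_lock(&dev->lock);
	dev->queued = 0;
	pthread_mutex_unlock(&dev->lock);

	/* Step 4 would unregister the MAD agent here; nothing can be queued. */
}
```

Setting the flag before the flush is what guarantees the queue is still empty when the agent is released.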

IB/mad: Fix compare between big endian and cpu endian · cd4cd565

Authored by Ira Weiny on Jun 25, 2015

The define OPA_LID_PERMISSIVE is big endian and was compared to the
cpu endian variable opa_drslid.

Problem caught by 0-day build infrastructure.

Fixes: 8e4349d1 (IB/mad: Add final OPA MAD processing)
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: John, Jubin <jubin.john@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

cd4cd565
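
The class of bug fixed by cd4cd565, comparing a big-endian constant with a CPU-endian variable, can be illustrated in userspace with `htonl()`. The value `0x0000FFFF` is a hypothetical one chosen so that byte order matters; it is not the value of OPA_LID_PERMISSIVE, and the function names are assumptions.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical big-endian constant, stored in wire byte order. */
static uint32_t permissive_be_sketch(void)
{
	return htonl(UINT32_C(0x0000FFFF));
}

/*
 * Convert the CPU-endian variable to big endian before comparing, so
 * both sides of the comparison use the same byte order.
 */
static bool drslid_is_permissive(uint32_t drslid_cpu)
{
	return htonl(drslid_cpu) == permissive_be_sketch();
}
```

Comparing `drslid_cpu` directly against the big-endian constant would give the wrong answer for this value on little-endian machines.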

IB: Add rdma_cap_ib_switch helper and use where appropriate · 4139032b

Authored by Hal Rosenstock on Jun 29, 2015

Pursuant to Liran's comments on node_type on linux-rdma
mailing list:

In an effort to reform the RDMA core and ULPs to minimize use of
node_type in struct ib_device, an additional bit is added to
struct ib_device for is_switch (IB switch). This needs
to be initialized by any IB switch device driver. This is a
NEW requirement on such device drivers which are all
"out of tree".

In addition, an ib_switch helper was added to ib_verbs.h
based on the is_switch device bit rather than node_type
(although those should be consistent).

The RDMA core (MAD, SMI, agent, sa_query, multicast, sysfs)
as well as (IPoIB and SRP) ULPs are updated where
appropriate to use this new helper. In some cases,
the helper is now used under the covers of using
rdma_[start end]_port rather than the open coding
previously used.
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

4139032b

13 Jun, 2015 18 commits

IB/mad: Add final OPA MAD processing · 8e4349d1

Authored by Ira Weiny on Jun 10, 2015

For devices which support OPA MADs

   1) Use previously defined SMP support functions.

   2) Pass correct base version to ib_create_send_mad when processing OPA MADs.

   3) Process out_mad_key_index returned by agents for a response.  This is
      necessary because OPA SMP packets must carry a valid pkey.

   4) Carry the correct segment size (OPA vs IBTA) of RMPP messages within
      ib_mad_recv_wc.

   5) Handle variable length OPA MADs by:

        * Adjusting the 'fake' WC for locally routed SMP's to represent the
          proper incoming byte_len
        * out_mad_size is used from the local HCA agents
                1) when sending agent responses on the wire
                2) when passing responses through the local_completions
		   function

	NOTE: wc.byte_len includes the GRH length and therefore is different
	      from the in_mad_size specified to the local HCA agents.
	      out_mad_size should _not_ include the GRH length as it is added
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

8e4349d1

IB/mad: Add partial Intel OPA MAD support · f28990bc

Authored by Ira Weiny on Jun 06, 2015

Add OPA SMP processing functionality.

Define the new OPA SMP format, create support functions for this format using
the previously defined helper functions as appropriate.

These functions are defined in this patch and used in the final OPA MAD support
patch.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

f28990bc

IB/mad: Add partial Intel OPA MAD support · 548ead17

Authored by Ira Weiny on Jun 06, 2015

This patch is the first of 3 which adds processing of OPA MADs

1) Add Intel Omni-Path Architecture defines
2) Increase max management version to accommodate OPA
3) update ib_create_send_mad
	If the device supports OPA MADs and the MAD being sent is the OPA base
	version alter the MAD size and sg lengths as appropriate
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

548ead17

IB/mad: Add support for additional MAD info to/from drivers · 4cd7c947

Authored by Ira Weiny on Jun 06, 2015

In order to support alternate sized MADs (and variable sized MADs on OPA
devices) add in/out MAD size parameters to the process_mad core call.

In addition, add an out_mad_pkey_index to communicate the pkey index the driver
wishes the MAD stack to use when sending OPA MAD responses.

The out MAD size and the out MAD PKey index are required by the MAD
stack to generate responses on OPA devices.

Furthermore, the in and out MAD parameters are made generic by specifying them
as ib_mad_hdr rather than ib_mad.

Drivers are modified as needed and are protected by BUG_ON checks if the MAD
sizes passed to them are incorrect.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

4cd7c947

IB/mad: Convert allocations from kmem_cache to kzalloc · c9082e51

Authored by Ira Weiny on Jun 06, 2015

This patch implements allocating alternate receive MAD buffers within the MAD
stack.  Support for OPA to send/recv variable sized MADs is implemented later.

    1) Convert MAD allocations from kmem_cache to kzalloc

       kzalloc is more flexible to support devices with different sized MADs
       and research and testing showed that the current use of kmem_cache does
       not provide performance benefits over kzalloc.

    2) Change struct ib_mad_private to use a flex array for the mad data
    3) Allocate ib_mad_private based on the size specified by devices in
       rdma_max_mad_size.
    4) Carry the allocated size in ib_mad_private to be used when processing
       ib_mad_private objects.
    5) Alter DMA mappings based on the mad_size of ib_mad_private.
    6) Replace the use of sizeof and static defines as appropriate
    7) Add appropriate casts for the MAD data when calling processing
       functions.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

c9082e51
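
Items 2 to 4 of c9082e51 (a flex array for the MAD data, allocation sized per device, and the size carried in the object) can be sketched in userspace with a C99 flexible array member. `struct mad_private_sketch` and `alloc_mad_private` are hypothetical names, and `calloc()` stands in for the kernel's `kzalloc()`.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header followed by a variable amount of MAD data. */
struct mad_private_sketch {
	size_t mad_size;	/* allocated size of mad[] in bytes */
	unsigned char mad[];	/* flexible array member */
};

/* Allocate a zeroed object whose data area is mad_size bytes long. */
static struct mad_private_sketch *alloc_mad_private(size_t mad_size)
{
	struct mad_private_sketch *p = calloc(1, sizeof(*p) + mad_size);

	if (p)
		p->mad_size = mad_size;
	return p;
}
```

Code that later processes the object reads `mad_size` from it instead of relying on `sizeof` or a static define, so one code path can serve devices with different MAD sizes.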

IB/core: Add ability for drivers to report an alternate MAD size. · 337877a4

Authored by Ira Weiny on Jun 06, 2015

Add max MAD size to the device immutable data set and have all drivers that
support MADs report the current IB MAD size (IB_MGMT_MAD_SIZE) to the core.

Verify MAD size data in both the MAD core and when reading the immutable data.

OPA drivers will report alternate MAD sizes in subsequent patches.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

337877a4

IB/mad: Support alternate Base Versions when creating MADs · da2dfaa3

Authored by Ira Weiny on Jun 06, 2015

In preparation to support the new OPA MAD Base version, add a base version
parameter to ib_create_send_mad and set it to IB_MGMT_BASE_VERSION for current
users.

Definition of the new base version and its processing will occur in later
patches.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

da2dfaa3

IB/mad: Create a generic helper for DR forwarding checks · 29869eaf

Authored by Ira Weiny on Jun 06, 2015

IB and OPA SMPs share the same processing algorithm but have different header
formats and permissive LID detection.

Add a helper function which is generic to processing the DR forwarding checks which
can be used by both IB and OPA SMP code.

Use this function in the current IB function smi_check_forward_dr_smp.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

29869eaf

IB/mad: Create a generic helper for DR SMP Recv processing · 86f0e67a

Authored by Ira Weiny on Jun 06, 2015

IB and OPA SMPs share the same processing algorithm but have different header
formats and permissive LID detection.

Add a helper function which is generic to processing DR SMP Recv messages which
can be used by both IB and OPA SMP code.

Use this function in the current IB function smi_handle_dr_smp_recv.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

86f0e67a

IB/mad: Create a generic helper for DR SMP Send processing · 92f15056

Authored by Ira Weiny on Jun 06, 2015

IB and OPA SMPs share the same processing algorithm but have different header
formats and permissive LID detection.

Add a helper function which is generic to processing DR SMP Send messages which
can be used by both IB and OPA SMP code.

Use this function in the current IB function smi_handle_dr_smp_send.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

92f15056

IB/mad: Split IB SMI handling from MAD Recv handler · e11ae8aa

Authored by Ira Weiny on Jun 06, 2015

Make a helper function to process Directed Route SMPs to be called by the IB
MAD Recv Handler, ib_mad_recv_done_handler.

This cleans up the MAD receive handler code a bit and allows for us to better
share the SMP processing code between IB and OPA SMPs.

IB and OPA SMPs share the same processing algorithm but have different header
formats and permissive LID detection. Therefore this and subsequent patches
split the common processing code from the IB specific code in anticipation of
sharing those algorithms with the OPA code.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

e11ae8aa

IB/mad cleanup: Generalize processing of MAD data · 83a1d228

Authored by Ira Weiny on Jun 06, 2015

ib_find_send_mad only needs access to the MAD header not the full IB MAD.
Change the local variable to ib_mad_hdr and change the corresponding cast.

This allows for clean usage of this function with both IB and OPA MADs because
OPA MADs carry the same header as IB MADs.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

83a1d228

IB/mad cleanup: Clean up function params -- find_mad_agent · d94bd266

Authored by Ira Weiny on Jun 06, 2015

find_mad_agent only needs read only access to the MAD header.  Update the
ib_mad pointer to be const ib_mad_hdr.  Adjust call tree.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

d94bd266

IB/core: Pass hardware specific data in query_device · 2528e33e

Authored by Matan Barak on Jun 11, 2015

Vendors should be able to pass vendor specific data to/from
user-space via query_device uverb. In order to do this,
we need to pass the vendors' specific udata.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

2528e33e

IB/core: Add timestamp_mask and hca_core_clock to query_device · 24306dc6

Authored by Matan Barak on Jun 11, 2015

In order to expose timestamp we need to expose two new attributes in
query_device to be used for CQ completion time-stamping:

timestamp_mask - which bits of the timestamp are valid; timestamp values
can be at most 64 bits wide.

hca_core_clock - the frequency of the HCA in kHz units. The timestamp is given
in HW cycles, so this is needed to convert cycles to seconds.

This is added both to ib_query_device and its respective uverbs counterpart.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

24306dc6
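
The two attributes added in 24306dc6 are enough to turn a raw completion timestamp into time units. The sketch below is a hedged illustration of that arithmetic only; `cycles_to_ns` is a hypothetical helper, not an API from the patch.

```c
#include <stdint.h>

/*
 * Keep only the valid timestamp bits, then convert HW cycles to
 * nanoseconds. A clock in kHz ticks that many times per millisecond,
 * and one millisecond is 1,000,000 ns, so:
 *
 *   ns = cycles * 1000000 / hca_core_clock_khz
 *
 * The multiplication can overflow for very large cycle counts; a real
 * implementation would need a wider intermediate or a split conversion.
 */
static uint64_t cycles_to_ns(uint64_t raw_timestamp, uint64_t timestamp_mask,
			     uint64_t hca_core_clock_khz)
{
	uint64_t cycles = raw_timestamp & timestamp_mask;

	return cycles * UINT64_C(1000000) / hca_core_clock_khz;
}
```

For example, with a 156250 kHz clock, 156250 cycles correspond to one millisecond.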

IB/core: Extend ib_uverbs_create_cq · 565197dd

Authored by Matan Barak on Jun 11, 2015

ib_uverbs_ex_create_cq follows the extension verbs
mechanism. New features (for example, CQ creation flags
field which is added in a downstream patch) could be used
via user-space libraries without breaking the ABI.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

565197dd

IB/core: Change ib_create_cq to use struct ib_cq_init_attr · 8e37210b

Authored by Matan Barak on Jun 11, 2015

Currently, ib_create_cq uses cqe and comp_vector instead
of the extendible ib_cq_init_attr struct.

Earlier patches already changed the vendors to work with
ib_cq_init_attr. This patch changes the consumers too.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

8e37210b

IB/core: Change provider's API of create_cq to be extendible · bcf4c1ea

Authored by Matan Barak on Jun 11, 2015

Add a new ib_cq_init_attr structure which contains the
previous cqe (minimum number of CQ entries) and comp_vector
(completion vector) in addition to a new flags field.
All vendors' create_cq callbacks are changed in order
to work with the new API.

This commit does not change any functionality.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com> to patch #2
Signed-off-by: Doug Ledford <dledford@redhat.com>

bcf4c1ea
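
The benefit of the attribute structure described in bcf4c1ea is that a new field can be added without touching every caller's argument list. Below is a minimal userspace sketch of that idea: the three fields mirror the ones named in the commit message, while the `_sketch` type names, `create_cq_sketch`, and the choice to reject non-zero flags are assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Creation attributes gathered in one structure. */
struct cq_init_attr_sketch {
	unsigned int cqe;	/* minimum number of CQ entries */
	int comp_vector;	/* completion vector */
	uint32_t flags;		/* reserved for future creation flags */
};

struct cq_sketch {
	unsigned int cqe;
	int comp_vector;
};

/* Callers pass one pointer; new fields do not change this signature. */
static int create_cq_sketch(const struct cq_init_attr_sketch *attr,
			    struct cq_sketch *cq)
{
	if (attr->flags)
		return -EINVAL;	/* no flags are understood in this sketch */

	cq->cqe = attr->cqe;
	cq->comp_vector = attr->comp_vector;
	return 0;
}
```

With designated initializers, a caller that does not care about `flags` simply leaves it out and gets zero.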

11 Jun, 2015 1 commit

IB/core: Don't warn on no SA support in event handler · 9247a8eb

Authored by Moni Shoua on Jun 10, 2015

Registering an event handler is done for a device. This device may have
one RoCE port (no SA cap) and one InfiniBand port (has SA cap).
Therefore, warning from the event handler about a specific port that
doesn't have SA cap is correct but needlessly pollutes the kernel log.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

9247a8eb

02 Jun, 2015 2 commits

IB/core cleanup: Add const to args - agent_send_response · 73cdaaee

Authored by Ira Weiny on May 31, 2015

In order to support constant callers of agent_send_response we add const
specifiers to its pointer arguments.

Adjust the call tree accordingly.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

73cdaaee

RDMA/iw_cm: Export tos field to iwarp providers · 68cdba06

Authored by Steve Wise on May 18, 2015

rdma-cma/iw_cm: Export tos field to iwarp providers
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

68cdba06

21 May, 2015 6 commits

IB/cma: Fix broken AF_IB UD support · c07678bb

Authored by Matthew Finlay on May 19, 2015

Support for using UD and AF_IB is currently broken.  The
IB_CM_SIDR_REQ_RECEIVED message is not handled properly in
cma_save_net_info() and we end up falling into code that will try and
process the request as ipv4/ipv6, which will end up failing.

The resolution is to add a check for the SIDR_REQ and call
cma_save_ib_info() with a NULL path record.  Change cma_save_ib_info()
to copy the src sib info from the listen_id when the path record is NULL.
Reported-by: Hari Shankar <Hari.Shankar@netapp.com>
Signed-off-by: Matt Finlay <matt@mellanox.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

c07678bb

IB/core: Change rdma_protocol_iboe to roce · 5d9fb044

Authored by Ira Weiny on May 14, 2015

After discussion upstream, it was agreed to transition the usage of iboe
in the kernel to roce. This keeps our terminology consistent with what
was finalized in the IBTA Annex 16 and IBTA Annex 17 publications.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

5d9fb044

ib/cm: Change reject message type when destroying cm_id · c29ed5a4

Authored by Ted Kim on May 14, 2015

Problem reported by: Ted Kim <ted.h.kim@oracle.com>:

We have a case where a Linux system and a non-Linux system are
trying to interoperate.  The Linux host is the active side and
starts the connection establishment, but later decides to not go
through with the connection setup and does rdma_destroy_id().

The rdma_destroy_id() eventually works its way down to cm_destroy_id()
in core/cm.c, where a REJ is sent. The non-Linux system
has some trouble recognizing the REJ because of:

A. CM states which can't receive the REJ
B. Some issues about REJ formatting (missing comm ID)

ISSUE A: That part of the spec says, a Consumer Reject REJ can be
sent for a connection abort, but it goes further
and says: can send a REJ message with a "Consumer Reject"
Reason code if they are in a CM state (i.e. REP
Rcvd, MRA(REP) Sent, REQ Rcvd, MRA Sent) that allows
a REJ to be sent (lines 35-38).

Of the states listed there in that sentence, it would
seem to limit the active side to using the Consumer Reject
(for the abort case) in just the REP-Rcvd and MRA-REP-Sent
states. That is basically only after the active side
sees a REP (or alternatively goes down the state transitions
to timeout in which case a Timeout REJ is sent).

As a fix, in cm_destroy_id() move the IB_CM_MRA_REQ_RCVD case
to the same as IB_CM_REQ_SENT.  Essentially, make a REJ sent after
getting an MRA on active side a timeout rather than Consumer-
Reject, which is arguably more correct with the CM state
diagrams previous to getting a REP.
Signed-off-by: Ted Kim <ted.h.kim@oracle.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>

c29ed5a4

IB/core: Convert core to use bitfield for caps · f9b22e35

Authored by Ira Weiny on May 13, 2015

Remove query_protocol callback

Use the new Core Capability bits for:

rdma_protocol_*
rdma_cap_ib_mad
rdma_cap_ib_smi
rdma_cap_ib_cm
rdma_cap_iw_cm
rdma_cap_ib_sa
rdma_cap_ib_mcast
rdma_cap_af_ib
rdma_cap_eth_ah
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

f9b22e35
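
Replacing a per-query callback with capability bits, as f9b22e35 describes, reduces each `rdma_cap_*` style check to a bit test on data stored per port. The sketch below shows the general shape only; the flag values, `struct port_caps_sketch`, and `port_has_cap` are hypothetical and do not reproduce the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the kernel defines its own capability bits. */
#define CAP_SKETCH_IB_MAD (UINT32_C(1) << 0)
#define CAP_SKETCH_IB_CM  (UINT32_C(1) << 1)
#define CAP_SKETCH_IW_CM  (UINT32_C(1) << 2)

/* Hypothetical per-port data, filled in once when the device registers. */
struct port_caps_sketch {
	uint32_t core_cap_flags;
};

/* Test one capability bit for one port. */
static bool port_has_cap(const struct port_caps_sketch *port, uint32_t cap)
{
	return (port->core_cap_flags & cap) != 0;
}
```

Because the flags are stored per port, a device with mixed port types answers differently for each port, which is why callers must pass a real port number rather than a loop index.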

IB/core: Add per port immutable struct to ib_device · 7738613e

Authored by Ira Weiny on May 13, 2015

As of commit 5eb620c8 "IB/core: Add helpers for uncached GID and P_Key
searches"; pkey_tbl_len and gid_tbl_len are immutable data which are stored in
the ib_device.

The per port core capability flags to be added later are also immutable data to
be stored in the ib_device object.

In preparation for this create a structure for per port immutable data and
place the pkey and gid table lengths within this structure.

"get_port_immutable" is added as a mandatory device function to allow the
drivers to fill in this data.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

7738613e

IB/user_mad: Fix buggy usage of port index · 26c45428

Authored by Ira Weiny on May 13, 2015

The addition of the rdma_cap_ib_mad is technically broken in ib_umad_remove_one
because the loop "i" value is not a port value.

This bug resulted in the ib_umad failing to properly remove its resources when
the core capability functions were converted to bit fields.

NOTE: e17371d73908 did not result in broken behavior on its own. It was only
an issue when the implementation of rdma_cap_ib_mad was changed.

Pass the port value to rdma_cap_ib_mad.

Fixes: e17371d73908 ("IB/Verbs: Use management helper rdma_cap_ib_mad()")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

26c45428
