- 10 10月, 2007 40 次提交
-
-
由 Roland Dreier 提交于
Doing min_t(int, foo, INT_MAX) doesn't work correctly, because if foo is bigger than INT_MAX, then when treated as a signed integer, it will become negative and hence such an expression is just an elaborate NOP. Fix such cases in ehca to do min_t(unsigned, foo, INT_MAX) instead. This fixes negative reported values for max_cqe, max_pd and max_ah: Before: max_cqe: -64 max_pd: -1 max_ah: -1 After: max_cqe: 2147483647 max_pd: 2147483647 max_ah: 2147483647 Based on a bug report and fix from Anton Blanchard <anton@samba.org>. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Dotan Barak 提交于
Make the way QP is being created in ipoib_cm_create_tx_qp() consistent with ipoib_cm_create_rx_qp(). Signed-off-by: NDotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
Firmware commands are sent to the HCA by writing multiple words to a command register block. Access to this block of registers is serialized with a mutex. However, on large SGI systems writes to the register block may be reordered within the system interconnect and reach the HCA in a different order than they were issued (even with the mutex). Fix this by adding an mmiowb() before dropping the mutex. This bug was observed with real workloads with the similar FW command code in the mthca driver, and adding the mmiowb() as in commit 66547550 ("IB/mthca: Use mmiowb() to avoid firmware commands getting jumbled up") was confirmed to fix the problems, so we should add the same fix to mlx4. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
Firmware commands are sent to the HCA by writing multiple words to a command register block. Access to this block of registers is serialized with a mutex. However, on large SGI systems, problems were seen with multiple CPUs issuing FW commands at the same time, because the writes to the register block may be reordered within the system interconnect and reach the HCA in a different order than they were issued (even with the mutex). Fix this by adding an mmiowb() before dropping the mutex. Tested-by: NArthur Kepner <akepner@sgi.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Sean Hefty 提交于
Automatically queue MRA message to decrease the number of retries sent by the remote side during connection establishment. This also has the effect of increasing the overall connection timeout without using a longer retry time in the case of dropped packets. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Sean Hefty 提交于
The IB CM provides a message received acknowledged (MRA) message that can be sent to indicate that a REQ or REP message has been received, but will require more time to process than the timeout specified by those messages. In many cases, the application may not know how long it will take to respond to a CM message, but the majority of the time, it will usually respond before a retry has been sent. Rather than sending an MRA in response to all messages just to handle the case where a longer timeout is needed, it is more efficient to queue the MRA for sending in case a duplicate message is received. This avoids sending an MRA when it is not needed, but limits the number of times that a REQ or REP will be resent. It also provides for a simpler implementation than generating the MRA based on a timer event. (That is, trying to send the MRA after receiving the first REQ or REP if a response has not been generated, so that it is received at the remote side before a duplicate REQ or REP has been received) Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
Increase the number of QPs allowed per multicast group from 8 to 56. This allows for one QP per core on 16-core systems, which are now quite common, and allows some space for future growth. This is basically the same patch that Jack Morgenstein <jackm@dev.mellanox.co.il> just supplied for mlx4. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Jack Morgenstein 提交于
Increase the number of QPs allowed per multicast group from 8 to 56. This allows for one QP per core on 16-core systems, which are now quite common, and allows some space for future growth. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Jack Morgenstein 提交于
Implement FMRs for mlx4. This is an adaptation of code from mthca. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NMichael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Jack Morgenstein 提交于
Write MTT entries directly to ICM from the driver (eliminating use of WRITE_MTT command). This reduces the number of FW commands needed to register an MR by at least a factor of 2 and speeds up memory registration significantly. This code will also be used to implement FMRs. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NMichael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
Everything that uses caps.reserved_mtts expects it to be a count of MTT segments, not MTT entries. So convert the value that the FW gives us to a count of segments. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
Taking ilog2(dev->caps.reserved_mtts) to find out the order to pass to the MTT buddy allocator will do the wrong thing if reserved_mtts is ever not a power of 2. Be safe and use fls(dev->caps.reserved_mtts - 1). Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Jack Morgenstein 提交于
Enable having ICM tables in coherent memory, and use coherent memory for the dMPT table. This will allow writing MPT entries for MRs both via the SW2HW_MPT command and also directly by the driver for FMR remapping without needing to flush or worry about cacheline boundaries. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NMichael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
ib_uverbs_release_event_file() is only used in uverbs_main.c, so make it static to that file. Also move the definition before the first use, so a forward declaration is not needed. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
The declaration of struct ib_user_mad_reg_req.method_mask[] exported to userspace was an array of __u32, but the kernel internally treated it as a bitmap made up of longs. This makes a difference for 64-bit big-endian kernels, where numbering the bits in an array of__u32 gives: |31.....0|63....31|95....64|127...96| while numbering the bits in an array of longs gives: |63..............0|127............64| 64-bit userspace can handle this by just treating method_mask[] as an array of longs, but 32-bit userspace is really stuck: the meaning of the bits in method_mask[] depends on whether the kernel is 32-bit or 64-bit, and there's no sane way for userspace to know that. Fix this by updating <rdma/ib_user_mad.h> to make it clear that method_mask[] is an array of longs, and using a compat_ioctl method to convert to an array of 64-bit longs to handle the 32-on-64 problem. This fixes the interface description to match existing behavior (so working binaries continue to work) in almost all situations, and gives consistent semantics in the case of 32-bit userspace that can run on either a 32-bit or 64-bit kernel, so that the same binary can work for both 32-on-32 and 32-on-64 systems. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
Add support for setting the P_Key index of sent MADs and getting the P_Key index of received MADs. This requires a change to the layout of the ABI structure struct ib_user_mad_hdr, so to avoid breaking compatibility, we default to the old (unchanged) ABI and add a new ioctl IB_USER_MAD_ENABLE_PKEY that allows applications that are aware of the new ABI to opt into using it. We plan on switching to the new ABI by default in a year or so, and this patch adds a warning that is printed when an application uses the old ABI, to push people towards converting to the new ABI. Signed-off-by: NRoland Dreier <rolandd@cisco.com> Reviewed-by: NSean Hefty <sean.hefty@intel.com> Reviewed-by: NHal Rosenstock <hal@xsigo.com>
-
由 Joachim Fenkes 提交于
Totally forgot this. Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Hoang-Nam Nguyen 提交于
Signed-off-by: NHoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Hoang-Nam Nguyen 提交于
Signed-off-by: NHoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Jack Morgenstein 提交于
display the following device information under /sys/class/infiniband/mlx4_X: board_id, fw_ver, hw_rev, hca_type. This patch makes this information available to userspace utilities such as ibstat and ibv_devinfo. Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Ralph Campbell 提交于
I was looking at the code for multicast.c and noticed that ib_sa_join_multicast() calls queue_join() which puts the request at the front of the group->pending_list. If this is a second request, it seems like it would interfere with process_join_error() since group->last_join won't point to the member at the head of the pending_list. The sequence would thus be: 1. ib_sa_join_multicast() puts member1 on head of pending_list and starts work thread 2. mcast_work_handler() calls send_join() which sets group->last_join to member1 3. ib_sa_join_multicast() puts member2 on head of pending_list 4. join operation for member1 receives failures response from SA. 5. join_handler() is called with error status 6. process_join_error() fails to process member1 since it doesn't match the first entry in the group->pending_list. The impact is that the failed join request is tossed. The second request is processed, and after it completes, the original request ends up being retried. This change also results in join requests being processed in FIFO order. Signed-off-by: NRalph Campbell <ralph.campbell@qlogic.com> Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Satyam Sharma 提交于
* Replace {un}register_cpu_notifier with {un}register_hotcpu_notifier thereby losing a couple of #ifdef HOTPLUG_CPU pairs. * Move comp_pool_callback_nb declaration to below that of callback function so that initialization of .notifier_call and .priority can occur at build time itself and not runtime. * Mark the notifier_block (and callback function, and another static function used by it) as __cpuinit{data} for the sake of consistency and remove enclosing #ifdef. (This may increase size for modular build of this module, however, because these are no longer dropped unconditionally now.) Signed-off-by: NSatyam Sharma <satyam@infradead.org> Acked-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
The SRC ("scalable RC") transport has been renamed to XRC ("extended RC"), to avoid having an abbreviation that is so easily confused with an abbreviation for "source." Update the HCA capability decoding output to use the new name. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Roland Dreier 提交于
<asm/scatterlist.h> is not needed because everyplace it appears, <linux/scatterlist.h> also appears. <asm/io.h> is not needed because nothing seems to be using device IO anyway. Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Steve Wise 提交于
Calling arp_send() to initiate neighbour discovery (ND) doesn't do the full ND protocol. Namely, it doesn't handle retransmitting the arp request if it is dropped. The function neigh_event_send() does all this. Without doing full ND, RDMA address resolution fails in the presence of dropped ARP broadcast packets. Signed-off-by: NSteve Wise <swise@opengridcomputing.com> Acked-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Joachim Fenkes 提交于
...because, on virtualized hardware like System p, we can't be sure that the physical pages behind them are contiguous otherwise. Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Joachim Fenkes 提交于
During ib_umem_get(), determine whether all pages from the memory region are hugetlb pages and report this in the "hugetlb" member. Low-level drivers can use this information if they need it. Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Sean Hefty 提交于
Provide the target service ID when performing a path record query to support optional QoS capability. QoS requires support from the SA. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Sean Hefty 提交于
Export the ability to set the type of service to user space. Model the interface after setsockopt. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Sean Hefty 提交于
Provide support to specify a type of service for a communication identifier. A new function call is used when dealing with IPv4 addresses. For IPv6 addresses, the ToS is specified through the traffic class field in the sockaddr_in6 structure. Signed-off-by: NSean Hefty <sean.hefty@intel.com> [ The comments Eitan Zahavi and myself have made over the v1 post at <http://lists.openfabrics.org/pipermail/general/2007-August/039247.html> were fully addressed. ] Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Sean Hefty 提交于
The QoS annex defines new fields for path records. Add them to the ib_sa for consumers that want to use them. Signed-off-by: NSean Hefty <sean.hefty@intel.com> Reviewed-by: NOr Gerlitz <ogerlitz@voltaire.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Sean Hefty 提交于
To support QoS within and between subnets, modify IPoIB to request specific Traffic Class values with path record queries, using the value associated with the IPoIB broadcast group. Signed-off-by: NSean Hefty <sean.hefty@intel.com> [ See some comments I made on this at v1 and v2 of the posts <http://lists.openfabrics.org/pipermail/general/2007-August/039275.html> <http://lists.openfabrics.org/pipermail/general/2007-September/040312.html> ] Reviewed-by: NOr Gerlitz <ogerlitz@voltaire.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Hoang-Nam Nguyen 提交于
Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Joachim Fenkes 提交于
Nobody needed the SVNEHCA_ prefix anyway. Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Joachim Fenkes 提交于
We can use raw_smp_processor_id() here because the processor ID is only used for debug output and therefore our use is preemption-unsafe. Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Joachim Fenkes 提交于
Some firmware levels exhibit a race condition between H_ALLOC_RESOURCE(MR) and H_FREE_RESOURCE(MR). Work around this problem by locking these hvCalls against each other. Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Joachim Fenkes 提交于
Fix some modify_qp() issues related to path migration. Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Joachim Fenkes 提交于
Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Joachim Fenkes 提交于
Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-
由 Joachim Fenkes 提交于
...because -12 is easier to read than FFFFFFF4. Signed-off-by: NJoachim Fenkes <fenkes@de.ibm.com> Signed-off-by: NRoland Dreier <rolandd@cisco.com>
-