- 28 Apr, 2009, 10 commits
-
-
Committed by Faisal Latif
Running a large cluster setup, we hang after many hours of testing. Fixing this required going over the code and making sure the rexmit entry is properly removed based on the cm_node's state and the packet received. Also, when receiving a FIN packet, check the seq# and make sure there were no errors before calling handle_fin(). The following changes are made in nes_cm.c: * handle_ack_pkt() needs to return an error value so that, in case of error, handle_fin() is not called. Some cleanup was done while going over the code. * handle_rst_pkt(): handling of the cm_node's NES_CM_STATE_LAST_ACK state was missing. * process_packet(): when a FIN-only packet is received, call check_seq() before processing. * handle_fin_pkt(): cleanup_retrans_entry() was being called for all conditions, even if the packets need to be dropped. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Faisal Latif
Under heavy load with large cluster testing, it may take longer to receive a response to MPA requests. Change the driver to wait longer after each rexmit, up to a maximum time value. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
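A minimal sketch of a retransmit wait that grows after each attempt and is capped at a maximum. The doubling policy, the constants, and the function name are assumptions for illustration; the commit message only says that the wait gets longer after each rexmit up to a maximum value.

```c
#include <stdint.h>

/* Hypothetical values, not taken from the nes driver. */
#define SKETCH_REXMIT_INITIAL_MS 500u
#define SKETCH_REXMIT_MAX_MS     8000u

/*
 * Return the wait before the next retransmit.
 * Assumed policy: start at the initial value, double after each
 * retransmit, and never exceed the maximum.
 */
static uint32_t next_rexmit_wait_ms(uint32_t prev_ms)
{
    if (prev_ms == 0)
        return SKETCH_REXMIT_INITIAL_MS;
    if (prev_ms >= SKETCH_REXMIT_MAX_MS / 2)
        return SKETCH_REXMIT_MAX_MS;
    return prev_ms * 2;
}
```

Capping the wait keeps a slow peer from pushing the timeout up without bound while still giving a loaded cluster more time on each attempt.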
-
Committed by Faisal Latif
check_seq() was not checking whether the seq#s have wrapped. Fix it. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
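The usual way to compare 32-bit sequence numbers that may wrap is to look at the sign of their difference, the same idea as the kernel's TCP before() helper. The sketch below illustrates that technique; it is not the actual check_seq() code, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if sequence number a comes before b, modulo 2^32.
 * The unsigned subtraction wraps, and the signed view of the result
 * is negative when a is "behind" b by less than 2^31.
 * (The conversion to int32_t assumes two's complement, which holds on
 * all mainstream compilers.)
 */
static bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* True if seq lies in the window [start, start + len), even across a wrap. */
static bool seq_in_window(uint32_t seq, uint32_t start, uint32_t len)
{
    return (uint32_t)(seq - start) < len;
}
```

A plain `a < b` would report that 0xFFFFFFF0 comes after 0x10, although 0x10 is the later value once the counter has wrapped.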
-
Committed by Faisal Latif
When a connect request comes, apbvt should only be set for non-loopback connections. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
Remove the NES_DEBUG that is causing the compile warning about an unused variable when INFINIBAND_NES_DEBUG is not enabled. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
/sys/class/infiniband/nes?/fw_ver is not displaying the firmware version properly (it shows 0.0.0 with the current code). Fill in the correct firmware version number. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
With updated PHY firmware for SFP_D, setting the trace length to 1 inch for SFP_D provides a more stable link. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
Enable repause timer for port 1. Without this setting, under stress, the chip may misbehave. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
In commit 1b949324 ("RDMA/nes: Fix SFP+ PHY initialization") there is a mistake in the cleanup code that removed port 1 CDR loop filter settings for 10G cards other than CX4. Put the correct setting back for appropriate PHY types. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
Change thermo mitigation code to flip the SerDes1 reference clock to internal, to match the change in commit a4849fc1 ("RDMA/nes: Add wide_ppm_offset parm for switch compatibility"). Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 22 Apr, 2009, 2 commits
-
-
Committed by Miroslaw Walukiewicz
In error paths where a CQ is not created, the pbl is not freed properly. In nes_destroy_cq(), add the corresponding check for nescq->mcrqf to not call nes_free_resource() when it is already done in nes_create_cq(). Signed-off-by: Miroslaw Walukiewicz <miroslaw.walukiewicz@intel.com> Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Matt Kraai
Signed-off-by: Matt Kraai <kraai@ftbfs.org> Acked-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 21 Apr, 2009, 2 commits
-
-
Committed by Don Wood
The code incorrectly failed memory registration if the buffer was not page aligned. Also, the length field was mangled, causing the hardware to think the registration is much larger than it really is. The fix is to remove the page alignment restriction as well as the incorrect length adjustment. Also make sure that all buffers after the first start at a page boundary, and all buffers except the last end on a page boundary. Signed-off-by: Don Wood <donald.e.wood@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
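The boundary rule in this message (every buffer after the first starts on a page boundary, every buffer except the last ends on one) can be sketched as a standalone check. The struct, the function name, and the 4 KB page size are assumptions for illustration and are not taken from nes_verbs.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical description of one physical buffer in a registration. */
struct buf_chunk {
    uint64_t addr;
    uint64_t len;
};

/*
 * Return true if the chunk list obeys the rule:
 *  - every chunk after the first starts on a page boundary;
 *  - every chunk except the last ends on a page boundary.
 * The first chunk may start anywhere and the last may end anywhere.
 */
static bool chunks_obey_page_rule(const struct buf_chunk *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && (c[i].addr % SKETCH_PAGE_SIZE) != 0)
            return false;
        if (i + 1 < n && ((c[i].addr + c[i].len) % SKETCH_PAGE_SIZE) != 0)
            return false;
    }
    return true;
}
```

Under this rule a single unaligned buffer is acceptable, which matches the removal of the page alignment restriction.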
-
Committed by Chien Tung
Initialize pbl_count_256 to 0 to get rid of the warning: drivers/infiniband/hw/nes/nes_verbs.c: In function 'nes_reg_mr': drivers/infiniband/hw/nes/nes_verbs.c:1955: warning: 'pbl_count_256' may be used uninitialized in this function Reported-by: Roland Dreier <rdreier@cisco.com> Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 09 Apr, 2009, 6 commits
-
-
Committed by Chien Tung
Add new register settings for new SFP+ PHY/firmware. Add new PHY to nes_netdev_get/set_settings. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
We have observed an unstable link with a new BNT switch. Add a wide_ppm_offset parameter to allow the user to control the clock ppm offset on the CX4 interface for better compatibility. The default is 100ppm; setting the parameter to 1 increases it to 300ppm. Change the default SerDes1 reference clock to the external source. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
SFP+ PHY initialization has very long delays, incorrect settings for direct attach copper cables, and inconsistent link detection. Adjust delays to the minimum required by the PHY. Worst case is now less than 4 seconds. Add new register settings for direct attach cables. Change link detection logic to use two new registers for more consistent link state detection. Reorganize code to shorten line length. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Faisal Latif
We are seeing crashes or hangs when running network cable pull tests during RDMA traffic. In schedule_nes_timer(), we return an error if nes_nic_cm_xmit() returns failure. This is changed to success, as the skb is put on the timer routines to be processed later. In the send_syn() case, we were indicating connect failure twice: once from nes_connect() and again when the rexmit retries expire. The other issue is skb->users, which we increment before calling nes_nic_cm_xmit() (which calls dev_queue_xmit()); in case of failure we were decrementing skb->users while at the same time putting the skb on the rexmit path. Even if dev_queue_xmit() fails, skb->users is already decremented. We are removing the decrement of skb->users in case of failure from both schedule_nes_timer() and nes_cm_timer_tick(). The extra check in nes_cm_timer_tick() for rexmit failure, which breaks out of the loop, is also removed. It caused problems because the other nodes have their cm_node->ref_count incremented and are not processed. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Faisal Latif
Fix issues found by static code analysis: (1) Check if cm_node was successfully created for loopback connection. (2) schedule_nes_timer() does not free up allocated memory after encountering an error. There is a WARN_ON() for this condition. (3) There is a cm_node->freed flag which is set but not used. Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Don Wood
There were some incorrect casts to unsigned long that caused 64-bit values to be truncated on 32-bit architectures and made the driver pass invalid addresses and lengths to the hardware. The problems were primarily seen with kernels with highmem configured, but some could show up in non-highmem kernels, too. Signed-off-by: Don Wood <donald.e.wood@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
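The truncation described here can be reproduced in user space by standing in a 32-bit type for `unsigned long` as it is on 32-bit architectures. The type and function names below are hypothetical and only illustrate the class of bug, not the driver's code.

```c
#include <stdint.h>

/* Stand-in for 'unsigned long' on a 32-bit architecture (assumption). */
typedef uint32_t ulong32;

/* Buggy pattern: a 64-bit bus address passes through a 32-bit cast. */
static uint64_t addr_through_narrow_cast(uint64_t bus_addr)
{
    return (uint64_t)(ulong32)bus_addr; /* upper 32 bits are lost */
}

/* Fixed pattern: keep the value in a 64-bit type end to end. */
static uint64_t addr_kept_wide(uint64_t bus_addr)
{
    return bus_addr;
}
```

Addresses below 4 GB survive the narrow cast unchanged, which fits the observation that the problem mostly appeared on highmem kernels, where pages above that limit are in use.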
-
- 07 Apr, 2009, 2 commits
-
-
Committed by Yang Hongyang
Replace all uses of the DMA_32BIT_MASK macro with DMA_BIT_MASK(32). Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Committed by Yang Hongyang
Replace all uses of the DMA_64BIT_MASK macro with DMA_BIT_MASK(64). Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
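DMA_BIT_MASK(n) yields a mask with the low n bits set, with n = 64 handled as a special case because shifting a 64-bit value by 64 is undefined in C. The function below is a user-space sketch with the same behaviour; `sketch_dma_bit_mask` and `addr_fits_mask` are hypothetical names, not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low n bits set; n >= 64 yields all ones. */
static uint64_t sketch_dma_bit_mask(unsigned int n)
{
    /* 1ULL << 64 would be undefined, so treat full width separately. */
    return n >= 64 ? ~0ULL : (1ULL << n) - 1;
}

/* True if addr is reachable by a device limited to n address bits. */
static bool addr_fits_mask(uint64_t addr, unsigned int n)
{
    return (addr & ~sketch_dma_bit_mask(n)) == 0;
}
```

A single parameterized mask replaces one named constant per width, so unusual widths such as 24 or 40 bits need no extra definitions.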
-
- 30 Mar, 2009, 3 commits
-
-
Committed by Steve Wise
The cxgb3 l2t entry, hwtid, and dst entry were being released before all the iwch_ep references were released. This can cause a crash in t3_l2t_send_slow() and other places where the l2t entry is used. The fix is to defer releasing these resources until all endpoint references are gone. Details: - move flags field to the iwch_ep_common struct. - add a flag indicating resources are to be released. - release resources at endpoint free time instead of close/abort time. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Steve Wise
- wrap calls into cxgb3 and fail them if we're in the middle of a PCI EEH event. - correctly unwind and release endpoint and other resources when we are in an EEH event. - dispatch IB_EVENT_DEVICE_FATAL event when cxgb3 notifies iw_cxgb3 of a fatal error. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Roland Dreier
The PAT work on x86 has finally made pgprot_writecombine() a usable API for modular drivers. As the comment indicates, this is exactly what we want to use in mlx4_ib to map BlueFlame pages up to userspace, since using WC for these pages improves small message latency significantly. Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 27 Mar, 2009, 1 commit
-
-
Committed by Roland Dreier
When net-next and infiniband were merged upstream, each branch deleted one of a pair of adjacent lines from nes_nic.c, but when Linus fixed the conflict up, he brought back both of the lines. Fix up to the intended final tree state. Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 25 Mar, 2009, 1 commit
-
-
Committed by Steve Wise
The cxgb3 NIC driver can handle more firmware versions than iw_cxgb3, and since commit 8207befa ("cxgb3: untie strict FW matching") cxgb3 will load with firmware versions that iw_cxgb3 can't handle. The FW major number indicates a specific interface between the FW and iw_cxgb3. Thus if the major number of the running firmware does not match the required version compiled into iw_cxgb3, then iw_cxgb3 must not register that device. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
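The registration rule in this message reduces to comparing the running firmware's major number with the one compiled into the driver. The sketch below shows that comparison only; the struct, the required major value, and the function name are hypothetical and are not taken from iw_cxgb3.

```c
#include <stdbool.h>

/* Hypothetical firmware version triple. */
struct fw_version {
    unsigned int major;
    unsigned int minor;
    unsigned int micro;
};

/* Hypothetical major number the driver was built against. */
#define SKETCH_REQUIRED_FW_MAJOR 7u

/*
 * The major number identifies the firmware/driver interface, so only
 * an exact major match is compatible; minor and micro are ignored.
 */
static bool fw_is_compatible(struct fw_version running)
{
    return running.major == SKETCH_REQUIRED_FW_MAJOR;
}
```

A caller would skip device registration whenever `fw_is_compatible()` returns false, rather than registering and failing later against an unknown interface.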
-
- 22 Mar, 2009, 2 commits
-
-
Committed by Stephen Hemminger
Also, removed unnecessary memset() since alloc_netdev returns zeroed memory. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Committed by Stephen Hemminger
Convert this driver to the new net_device_ops infrastructure. Also use the default net_device get-stats infrastructure. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 19 Mar, 2009, 1 commit
-
-
Committed by Yevgeny Petrilin
According to the ConnectX programmer's reference manual, all operations should be stopped, all QPs should be torn down and all WQEs flushed before the CLOSE_PORT command is invoked. In some cases reversing the order of operations (as implemented now) could cause a loss of completions. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 13 Mar, 2009, 1 commit
-
-
Committed by Faisal Latif
STag zero is a special STag that allows consumers to access any bus address without registering memory. The nes driver unfortunately allows STag zero to be used even with QPs created by unprivileged userspace consumers, which means that any process with direct verbs access to the nes device can read and write any memory accessible to the underlying PCI device (usually any memory in the system). Such access is usually given for cluster software such as MPI to use, so this is a local privilege escalation bug on most systems running this driver. The driver was using STag zero to receive the last streaming mode data; to allow STag zero to be disabled for unprivileged QPs, the driver now registers a special MR for this data. Cc: <stable@kernel.org> Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 07 Mar, 2009, 8 commits
-
-
Committed by Faisal Latif
During testing, there are failures because the MPA Reject call is not handled. To handle the MPA Reject call, the following changes are made: * Handle inbound/outbound MPA Reject response messages. When nes_reject() is called for a pending MPA request reply, send the MPA Reject message to its peer (active side) cm_node. The peer cm_node (active side) will indicate a Reject message event for the pending Connect Request. * Handle MPA Reject response messages for loopback connections and listeners. When an MPA Request is rejected, check whether it is a loopback connection and, if so, send a Reject message event to its peer loopback node. Also, when destroying a listener, check whether the cm_nodes for that listener are loopback or not. * Add graceful connection close with the MPA Reject response message. Send a graceful close (FIN, FIN ACK..) to terminate the cm_nodes. * Some code reorganization while making the above changes. Removed recv_list and recv_list_lock from the cm_node structure, as there can be only one receive close entry on the timer. Also implemented handle_recv_entry(), as the receive close entry is processed from both nes_rem_ref_cm_node() and nes_cm_timer_tick(). Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Don Wood
Two-level 256-byte PBLs were not implemented, so the driver could report out of memory when in fact there were PBLs still available. This solution prefers to use 4KB PBLs over two-level 256B PBLs until the number of 4KB PBLs falls below a threshold. At this point the 4KB PBL structure is converted to use 256B PBLs, which prevents the driver from running out of 4KB PBLs too quickly. Signed-off-by: Don Wood <donald.e.wood@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Faisal Latif
NETIF_F_LLTX is deprecated. Remove private TX locking from the driver and remove the NETIF_F_LLTX feature flag. This also fixes a warning in some configs that comes from calling skb_linearize() in the hard_start_xmit method with IRQs disabled (if HIGHMEM is enabled, skb_linearize() may end up enabling BHs, which is a no-no if hard IRQs are disabled in that context). By getting rid of LLTX, we do not disable IRQs when skb_linearize() is called. Remove the sq_lock as it is not needed for non-LLTX. Fix ethtool not to show the counter for sq_lock. Reported-by: aluno3@poczta.onet.pl Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Don Wood
When asynchronous events are processed by software, it is necessary to let the hardware know that software has handled the event. This frees up the entry in the asynchronous event queue. Signed-off-by: Don Wood <donald.e.wood@intel.com> Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
In find_node(), tmp_addr causes an "unused variable" warning when INFINIBAND_NES_DEBUG is not defined. It's only used in a nes_debug() and the print does not make sense. So take out the whole thing. Reported-by: Manish Katiyar <mkatiyar@gmail.com> Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
ibv_devinfo displays 0 for vendor_id and vendor_part_id. Fill in OUI and device_id for those two fields. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Chien Tung
Update copyright to the new legal entity, Intel-NE, Inc., an Intel company. Update copyright for the new year. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Committed by Don Wood
Fix occurrences where the software PBL counts were changed before the hardware was updated. This bug allowed another thread to overallocate the hardware resources. Add proper PBL accounting in case nes_reg_mr() fails. Signed-off-by: Don Wood <donald.e.wood@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 23 Feb, 2009, 1 commit
-
-
Committed by Roland Dreier
ipath_release_user_pages_on_close() just allocated a structure to schedule work with but just returned (leaking the structure) rather than actually doing schedule_work(). Fix the logic to what was intended. This was spotted by the Coverity checker (CID 2700). Signed-off-by: Roland Dreier <rolandd@cisco.com>
-