- 17 12月, 2018 1 次提交
-
-
由 Piotr Stankiewicz 提交于
commit 36d842194a57f1b21fbc6a6875f2fa2f9a7f8679 upstream. When running with KASAN, the following trace is produced: [ 62.535888] ================================================================== [ 62.544930] BUG: KASAN: slab-out-of-bounds in gut_hw_stats+0x122/0x230 [hfi1] [ 62.553856] Write of size 8 at addr ffff88080e8d6330 by task kworker/0:1/14 [ 62.565333] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted 4.19.0-test-build-kasan+ #8 [ 62.575087] Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS SE5C610.86B.01.01.0019.101220160604 10/12/2016 [ 62.587951] Workqueue: events work_for_cpu_fn [ 62.594050] Call Trace: [ 62.598023] dump_stack+0xc6/0x14c [ 62.603089] ? dump_stack_print_info.cold.1+0x2f/0x2f [ 62.610041] ? kmsg_dump_rewind_nolock+0x59/0x59 [ 62.616615] ? get_hw_stats+0x122/0x230 [hfi1] [ 62.622985] print_address_description+0x6c/0x23c [ 62.629744] ? get_hw_stats+0x122/0x230 [hfi1] [ 62.636108] kasan_report.cold.6+0x241/0x308 [ 62.642365] get_hw_stats+0x122/0x230 [hfi1] [ 62.648703] ? hfi1_alloc_rn+0x40/0x40 [hfi1] [ 62.655088] ? __kmalloc+0x110/0x240 [ 62.660695] ? hfi1_alloc_rn+0x40/0x40 [hfi1] [ 62.667142] setup_hw_stats+0xd8/0x430 [ib_core] [ 62.673972] ? show_hfi+0x50/0x50 [hfi1] [ 62.680026] ib_device_register_sysfs+0x165/0x180 [ib_core] [ 62.687995] ib_register_device+0x5a2/0xa10 [ib_core] [ 62.695340] ? show_hfi+0x50/0x50 [hfi1] [ 62.701421] ? ib_unregister_device+0x2e0/0x2e0 [ib_core] [ 62.709222] ? __vmalloc_node_range+0x2d0/0x380 [ 62.716131] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt] [ 62.723735] ? vmalloc_node+0x5c/0x70 [ 62.729697] ? rvt_driver_mr_init+0x11f/0x2d0 [rdmavt] [ 62.737347] ? rvt_driver_mr_init+0x1f5/0x2d0 [rdmavt] [ 62.744998] ? __rvt_alloc_mr+0x110/0x110 [rdmavt] [ 62.752315] ? rvt_rc_error+0x140/0x140 [rdmavt] [ 62.759434] ? rvt_vma_open+0x30/0x30 [rdmavt] [ 62.766364] ? mutex_unlock+0x1d/0x40 [ 62.772445] ? kmem_cache_create_usercopy+0x15d/0x230 [ 62.780115] rvt_register_device+0x1f6/0x360 [rdmavt] [ 62.787823] ? rvt_get_port_immutable+0x180/0x180 [rdmavt] [ 62.796058] ? __get_txreq+0x400/0x400 [hfi1] [ 62.802969] ? memcpy+0x34/0x50 [ 62.808611] hfi1_register_ib_device+0xde6/0xeb0 [hfi1] [ 62.816601] ? hfi1_get_npkeys+0x10/0x10 [hfi1] [ 62.823760] ? hfi1_init+0x89f/0x9a0 [hfi1] [ 62.830469] ? hfi1_setup_eagerbufs+0xad0/0xad0 [hfi1] [ 62.838204] ? pcie_capability_clear_and_set_word+0xcd/0xe0 [ 62.846429] ? pcie_capability_read_word+0xd0/0xd0 [ 62.853791] ? hfi1_pcie_init+0x187/0x4b0 [hfi1] [ 62.860958] init_one+0x67f/0xae0 [hfi1] [ 62.867301] ? hfi1_init+0x9a0/0x9a0 [hfi1] [ 62.873876] ? wait_woken+0x130/0x130 [ 62.879860] ? read_word_at_a_time+0xe/0x20 [ 62.886329] ? strscpy+0x14b/0x280 [ 62.891998] ? hfi1_init+0x9a0/0x9a0 [hfi1] [ 62.898405] local_pci_probe+0x70/0xd0 [ 62.904295] ? pci_device_shutdown+0x90/0x90 [ 62.910833] work_for_cpu_fn+0x29/0x40 [ 62.916750] process_one_work+0x584/0x960 [ 62.922974] ? rcu_work_rcufn+0x40/0x40 [ 62.928991] ? __schedule+0x396/0xdc0 [ 62.934806] ? __sched_text_start+0x8/0x8 [ 62.941020] ? pick_next_task_fair+0x68b/0xc60 [ 62.947674] ? run_rebalance_domains+0x260/0x260 [ 62.954471] ? __list_add_valid+0x29/0xa0 [ 62.960607] ? move_linked_works+0x1c7/0x230 [ 62.967077] ? trace_event_raw_event_workqueue_execute_start+0x140/0x140 [ 62.976248] ? mutex_lock+0xa6/0x100 [ 62.982029] ? __mutex_lock_slowpath+0x10/0x10 [ 62.988795] ? __switch_to+0x37a/0x710 [ 62.994731] worker_thread+0x62e/0x9d0 [ 63.000602] ? max_active_store+0xf0/0xf0 [ 63.006828] ? __switch_to_asm+0x40/0x70 [ 63.012932] ? __switch_to_asm+0x34/0x70 [ 63.019013] ? __switch_to_asm+0x40/0x70 [ 63.025042] ? __switch_to_asm+0x34/0x70 [ 63.031030] ? __switch_to_asm+0x40/0x70 [ 63.037006] ? __schedule+0x396/0xdc0 [ 63.042660] ? kmem_cache_alloc_trace+0xf3/0x1f0 [ 63.049323] ? kthread+0x59/0x1d0 [ 63.054594] ? ret_from_fork+0x35/0x40 [ 63.060257] ? __sched_text_start+0x8/0x8 [ 63.066212] ? schedule+0xcf/0x250 [ 63.071529] ? __wake_up_common+0x110/0x350 [ 63.077794] ? __schedule+0xdc0/0xdc0 [ 63.083348] ? wait_woken+0x130/0x130 [ 63.088963] ? finish_task_switch+0x1f1/0x520 [ 63.095258] ? kasan_unpoison_shadow+0x30/0x40 [ 63.101792] ? __init_waitqueue_head+0xa0/0xd0 [ 63.108183] ? replenish_dl_entity.cold.60+0x18/0x18 [ 63.115151] ? _raw_spin_lock_irqsave+0x25/0x50 [ 63.121754] ? max_active_store+0xf0/0xf0 [ 63.127753] kthread+0x1ae/0x1d0 [ 63.132894] ? kthread_bind+0x30/0x30 [ 63.138422] ret_from_fork+0x35/0x40 [ 63.146973] Allocated by task 14: [ 63.152077] kasan_kmalloc+0xbf/0xe0 [ 63.157471] __kmalloc+0x110/0x240 [ 63.162804] init_cntrs+0x34d/0xdf0 [hfi1] [ 63.168883] hfi1_init_dd+0x29a3/0x2f90 [hfi1] [ 63.175244] init_one+0x551/0xae0 [hfi1] [ 63.181065] local_pci_probe+0x70/0xd0 [ 63.186759] work_for_cpu_fn+0x29/0x40 [ 63.192310] process_one_work+0x584/0x960 [ 63.198163] worker_thread+0x62e/0x9d0 [ 63.203843] kthread+0x1ae/0x1d0 [ 63.208874] ret_from_fork+0x35/0x40 [ 63.217203] Freed by task 1: [ 63.221844] __kasan_slab_free+0x12e/0x180 [ 63.227844] kfree+0x92/0x1a0 [ 63.232570] single_release+0x3a/0x60 [ 63.238024] __fput+0x1d9/0x480 [ 63.242911] task_work_run+0x139/0x190 [ 63.248440] exit_to_usermode_loop+0x191/0x1a0 [ 63.254814] do_syscall_64+0x301/0x330 [ 63.260283] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 63.270199] The buggy address belongs to the object at ffff88080e8d5500 which belongs to the cache kmalloc-4096 of size 4096 [ 63.287247] The buggy address is located 3632 bytes inside of 4096-byte region [ffff88080e8d5500, ffff88080e8d6500) [ 63.303564] The buggy address belongs to the page: [ 63.310447] page:ffffea00203a3400 count:1 mapcount:0 mapping:ffff88081380e840 index:0x0 compound_mapcount: 0 [ 63.323102] flags: 0x2fffff80008100(slab|head) [ 63.329775] raw: 002fffff80008100 0000000000000000 0000000100000001 ffff88081380e840 [ 63.340175] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000 [ 63.350564] page dumped because: kasan: bad access detected [ 63.361974] Memory state around the buggy address: [ 63.369137] ffff88080e8d6200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 63.379082] ffff88080e8d6280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 63.389032] >ffff88080e8d6300: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc [ 63.398944] ^ [ 63.406141] ffff88080e8d6380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 63.416109] ffff88080e8d6400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 63.426099] ================================================================== The trace happens because get_hw_stats() assumes there is room in the memory allocated in init_cntrs() to accommodate the driver counters. Unfortunately, that routine only allocated space for the device counters. Fix by insuring the allocation has room for the additional driver counters. Cc: <Stable@vger.kernel.org> # v4.14+ Fixes: b7481944 ("IB/hfi1: Show statistics counters under IB stats interface") Reviewed-by: NMike Marciniczyn <mike.marciniszyn@intel.com> Reviewed-by: NMike Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NPiotr Stankiewicz <piotr.stankiewicz@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 22 6月, 2018 7 次提交
-
-
由 Michael J. Ruhl 提交于
The INTx IRQ support does not work for all HF1 IRQ handlers (specifically the receive data IRQs). Remove all supporting code for the INTx IRQ. If the requested MSIx vector request is unsuccessful, do not allow the driver to continue. Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: NKamenee Arumugam <kamenee.arumugam@intel.com> Reviewed-by: NSadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Mike Marciniszyn 提交于
Many fields in ctxtdata are incorrectly sized and the organization of the fields within the structure is a jumble. Fix by: - Correcting oversize fields. - Putting fields common to all contexts at the top with hot fields at the top. - Moving PSM fields to the bottom of the structure. Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Mike Marciniszyn 提交于
Remove the sizeable cache of the chip sizing CSRs and replace with CSR reads as needed. Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Mike Marciniszyn 提交于
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Mike Marciniszyn 提交于
Fields in this structure are sized excessively based on hardware limitations and input values. Fix by reducing fields as appropriate and repositioning to close holes in the structure. Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Mike Marciniszyn 提交于
It is only ever written. Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Mike Marciniszyn 提交于
The usage of this ctxt data field is not hot path and the value can be computed on demand to cut down the ctxtdata bloat. Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 20 6月, 2018 3 次提交
-
-
由 Mike Marciniszyn 提交于
The field is based on a constant that can never change. Use the define to assign the register instead. Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Mike Marciniszyn 提交于
This field should be in ctxtdata to allow for better locality of access by eliminating a dd dereference. The new field is now side-by-side with rcvhdrqentsize since the rhf_offset is a function of the rcvhdrqentsize. Both fields are now correctly sized as u8. Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Mike Marciniszyn 提交于
The current implementation precludes having receive context specific packet type receive handlers. Fix this by adding adding c99 const array for the existing handlers and remove the current 72 bytes of pointers from devdata. A new pointer in hfi1_ctxtdata will point to the const array. Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 05 6月, 2018 1 次提交
-
-
由 Kaike Wan 提交于
The mutex exp_lock in struct hfi1_ctxtdata is used to protect all Expected TID data of a user context. This patch renames it to exp_mutex to better reflect its identity and prepare for upcoming patches. Reviewed-by: NAshutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NHarish Chegondi <harish.chegondi@intel.com> Signed-off-by: NKaike Wan <kaike.wan@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 24 5月, 2018 3 次提交
-
-
由 Mike Marciniszyn 提交于
The knowledge of the internal workings of the expect receive is too distributed. Fix by: - right size several rcd fields associated with expect receive - making an init entrance to init all the lists - consolidate all the allocations into an array anchored in the rcd Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: NKaike Wan <kaike.wan@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Don Hiatt 提交于
16B Management Packets (L4=0x08) replace the BTH and DETH of normal MAD packet packets with a header containing the the source and destination queue pair numbers; fields that were originally retrieved from the BTH/DETH are now populated from this header as well as from the 16B LRH (e.g. pkey). 16B Management Packets are used as an optimized management format on 16B fabrics. These management packets have an opcode of IB_OPCODE_UD_SEND_ONLY, a fixed 3Byte pad, and a header length of 24Bytes. The decision as to when we send a management packet is based upon either the source or destination queue pair number being 0 or 1. Reviewed-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NDon Hiatt <don.hiatt@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Don Hiatt 提交于
Add 16B Management Packet definition. This optimized packet format replaces the ib_other_headers and BTH with a source and destination QP number. To support these packets we introduce struct opa_16b_mgmt into the struct hfi1_16b_header. This packet format is only used for MAD packets using the IB_OPCODE_UD_SEND_ONLY opcode on QP0/1. The original 16B implementation failed to use 16B management packets so now we add their definition. Reviewed-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NDon Hiatt <don.hiatt@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 10 5月, 2018 3 次提交
-
-
由 Sebastian Sanchez 提交于
Currently the driver doesn't support completion vectors. These are used to indicate which sets of CQs should be grouped together into the same vector. A vector is a CQ processing thread that runs on a specific CPU. If an application has several CQs bound to different completion vectors, and each completion vector runs on different CPUs, then the completion queue workload is balanced. This helps scale as more nodes are used. Implement CQ completion vector support using a global workqueue where a CQ entry is queued to the CPU corresponding to the CQ's completion vector. Since the workqueue is global, it's guaranteed to always be there when queueing CQ entries; Therefore, the RCU locking for cq->rdi->worker in the hot path is superfluous. Each completion vector is assigned to a different CPU. The number of completion vectors available is computed by taking the number of online, physical CPUs from the local NUMA node and subtracting the CPUs used for kernel receive queues and the general interrupt. Special use cases: * If there are no CPUs left for completion vectors, the same CPU for the general interrupt is used; Therefore, there would only be one completion vector available. * For multi-HFI systems, the number of completion vectors available for each device is the total number of completion vectors in the local NUMA node divided by the number of devices in the same NUMA node. If there's a division remainder, the first device to get initialized gets an extra completion vector. Upon a CQ creation, an invalid completion vector could be specified. Handle it as follows: * If the completion vector is less than 0, set it to 0. * Set the completion vector to the result of the passed completion vector moded with the number of device completion vectors available. Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Mitko Haralanov 提交于
The packet fault injection code present in the HFI1 driver had some issues which not only fragment the code but also created user confusion. Furthermore, it suffered from the following issues: 1. The fault_packet method only worked for received packets. This meant that the only fault injection mode available for sent packets is fault_opcode, which did not allow for random packet drops on all egressing packets. 2. The mask available for the fault_opcode mode did not really work due to the fact that the opcode values are not bits in a bitmask but rather sequential integer values. Creating a opcode/mask pair that would successfully capture a set of packets was nearly impossible. 3. The code was fragmented and used too many debugfs entries to operate and control. This was confusing to users. 4. It did not allow filtering fault injection on a per direction basis - egress vs. ingress. In order to improve or fix the above issues, the following changes have been made: 1. The fault injection methods have been combined into a single fault injection facility. As such, the fault injection has been plugged into both the send and receive code paths. Regardless of method used the fault injection will operate on both egress and ingress packets. 2. The type of fault injection - by packet or by opcode - is now controlled by changing the boolean value of the file "opcode_mode". When the value is set to True, fault injection is done by opcode. Otherwise, by packet. 2. The masking ability has been removed in favor of a bitmap that holds opcodes of interest (one bit per opcode, a total of 256 bits). This works in tandem with the "opcode_mode" value. When the value of "opcode_mode" is False, this bitmap is ignored. When the value is True, the bitmap lists all opcodes to be considered for fault injection. By default, the bitmap is empty. When the user wants to filter by opcode, the user sets the corresponding bit in the bitmap by echo'ing the bit position into the 'opcodes' file. This gets around the issue that the set of opcodes does not lend itself to effective masks and allow for extremely fine-grained filtering by opcode. 4. fault_packet and fault_opcode methods have been combined. Hence, there is only one debugfs directory controlling the entire operation of the fault injection machinery. This reduces the number of debugfs entries and provides a more unified user experience. 5. A new control files - "direction" - is provided to allow the user to control the direction of packets, which are subject to fault injection. 6. A new control file - "skip_usec" - is added that would allow the user to specify a "timeout" during which no fault injection will occur. In addition, the following bug fixes have been applied: 1. The fault injection code has been split into its own header and source files. This was done to better organize the code and support conditional compilation without littering the code with #ifdef's. 2. The method by which the TX PIO packets were being marked for drop conflicted with the way send contexts were being setup. As a result, the send context was repeatedly being reset. 3. The fault injection only makes sense when the user can control it through the debugfs entries. However, a kernel configuration can enable fault injection but keep fault injection debugfs entries disabled. Therefore, it makes sense that the HFI fault injection code depends on both. 4. Error suppression did not take into account the method by which PIO packets were being dropped. Therefore, even with error suppression turned on, errors would still be displayed to the screen. A larger enough packet drop percentage would case the kernel to crash because the driver would be stuck printing errors. Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: NDon Hiatt <don.hiatt@intel.com> Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NMitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Alex Estrin 提交于
A warm restart will fail to unload the driver, leaving link state potentially flapping up to the point the BIOS resets the adapter. Correct the issue by hooking the shutdown pci method, which will bring port down. Cc: <stable@vger.kernel.org> # 4.9.x Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NAlex Estrin <alex.estrin@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 04 5月, 2018 1 次提交
-
-
由 Mike Marciniszyn 提交于
The code for handling a marked UD packet unconditionally returns the dlid in the header of the FECN marked packet. This is not correct for multicast packets where the DLID is in the multicast range. The subsequent attempt to send the CNP with the multicast lid will cause the chip to halt the ack send context because the source lid doesn't match the chip programming. The send context will be halted and flush any other pending packets in the pio ring causing the CNP to not be sent. A part of investigating the fix, it was determined that the 16B work broke the FECN routine badly with inconsistent use of 16 bit and 32 bits types for lids and pkeys. Since the port's source lid was correctly 32 bits the type mixmatches need to be dealt with at the same time as fixing the CNP header issue. Fix these issues by: - Using the ports lid for as the SLID for responding to FECN marked UD packets - Insure pkey is always 16 bit in this and subordinate routines - Insure lids are 32 bits in this and subordinate routines Cc: <stable@vger.kernel.org> # 4.14.x Fixes: 88733e3b ("IB/hfi1: Add 16B UD support") Reviewed-by: NDon Hiatt <don.hiatt@intel.com> Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 14 3月, 2018 1 次提交
-
-
由 Zhu Yanjun 提交于
In hfi.h, the header file opa_addr.h is included twice. In vt.h, the header file mmap.h is included twice. Signed-off-by: NZhu Yanjun <yanjun.zhu@oracle.com> Acked-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 02 2月, 2018 4 次提交
-
-
由 Kamenee Arumugam 提交于
HFI's counters SendWaitCnt and SendWaitVlCnt are in units of TXE cycle time (at 805MHz). OPA counters PortXmitWait and PortVLXmtWait are in units of flit times. Convert the counter values to flit units using following conversion formula: PortXmitWait = SendWaitCnt * 2 * (4 /link_width) * (25 Gbps /link_speed) PortVLXmitWait = SendWaitVLCnt * 2 * (4 /link_width) * (25 Gbps /link_speed) At link up or downgrade events, the link width can change. To ensure accurate counter calculations, sample the counters after the events, during counter requests, and then aggregate the OPA counters. Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NKamenee Arumugam <kamenee.arumugam@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Sebastian Sanchez 提交于
packet->fecn and packet->becn are calculated in the hot path and are never used. Remove these fields as they show to be costly in a profile. Also, remove initialization for becn and fecn in process_ecn() as they're unconditionally assigned in the function and ensure fecn and becn variables use a boolean type. Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Sebastian Sanchez 提交于
The packet type comparison used to find out if a packet is a bypass packet in the hot path is an expensive operation as seen in a profile. Determine packet's pkey and migration bit through the bypass and 9B code paths instead. Reviewed-by: NDon Hiatt <don.hiatt@intel.com> Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Michael J. Ruhl 提交于
The pci_request_irq() interfaces always adds the IRQF_SHARED bit to all IRQ requests. When the kernel is built with CONFIG_DEBUG_SHIRQ config flag, if the IRQF_SHARED bit is set, a call to the IRQ handler is made from the __free_irq() function. This is testing a race condition between the IRQ cleanup and an IRQ racing the cleanup. The HFI driver should be able to handle this race, but does not. This race can cause traces that start with this footprint: BUG: unable to handle kernel NULL pointer dereference at (null) Call Trace: <hfi1 irq handler> ... __free_irq+0x1b3/0x2d0 free_irq+0x35/0x70 pci_free_irq+0x1c/0x30 clean_up_interrupts+0x53/0xf0 [hfi1] hfi1_start_cleanup+0x122/0x190 [hfi1] postinit_cleanup+0x1d/0x280 [hfi1] remove_one+0x233/0x250 [hfi1] pci_device_remove+0x39/0xc0 Export IRQ cleanup function so it can be called from other modules. Using the exported cleanup function: Re-order the driver cleanup code to clean up IRQ resources before other resources, eliminating the race. Re-order error path for init so that the race does not occur. Reduce severity on spurious error message for SDMA IRQs to info. Reviewed-by: NAlex Estrin <alex.estrin@intel.com> Reviewed-by: NPatel Jay P <jay.p.patel@intel.com> Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 06 1月, 2018 2 次提交
-
-
由 Michael J. Ruhl 提交于
The get_unit_name() function crafts a string based on the device name and the device unit number. It then stores this in a static variable. This has concurrency issues as can be seen with this log: hfi1 0000:02:00.0: hfi1_1: read_idle_message: read idle message 0x203 hfi1 0000:01:00.0: hfi1_1: read_idle_message: read idle message 0x203 The PCI device ID (0000:02:00.0 vs. 0000:01:00.0) is correct for the message, but the device string hfi1_1 is incorrect (it should be hfi1_0 for the second log message). Remove get_unit_name() function. Instead, use the rvt accessor rvt_get_ibdev_name() to get the IB name string. Clean up any hfi1_early_xx calls that can now use the new path. QIB has the same (qib_get_unit_name()) issue. Updating as necessary. Remove qib_get_unit_name() function. Update log message that has redundant device name. Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Michael J. Ruhl 提交于
rdmavt has a down call to client drivers to retrieve a crafted card name. This name should be the IB defined name. Rather than craft the name each time it is needed, simply retrieve the IB allocated name from the IB device. Update the function name to reflect its application. Clean up driver code to match this change. Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 23 12月, 2017 2 次提交
-
-
由 Don Hiatt 提交于
Change the slid arg to ingress_pkey_table_fail() to a full 32Bits and do not convert to 16Bits in caller. This is so we can keep everything 32bit in the kernel and only change to 16bit at the uapi boundary. Signed-off-by: NDon Hiatt <don.hiatt@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Michael J. Ruhl 提交于
During driver init, various registers are saved to allow restoration after an FLR or gen3 bump. Some of these registers are not available in some circumstances (i.e. Virtual machines). This bug makes the driver unusable when the PCI device is passed into a VM, it fails during probe. Delete unnecessary register read/write, and only access register if the capability exists. Cc: <stable@vger.kernel.org> # 4.14.x Fixes: a618b7e4 ("IB/hfi1: Move saving PCI values to a separate function") Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 14 11月, 2017 1 次提交
-
-
由 Jan Sokolowski 提交于
HFI's are hard-wired to send Device Info frames with MgmtAllowed bit set to 0. This means in B2B setups, MgmtAllowed would never be allowed, which prevents remote opa management tools from working properly. Assume MgmtAllowed if a neighbor is also an HFI. Fixes: 98b9ee20 ("IB/hfi1: Cache neighbor secure data after link up") Reviewed-by: NSebastian Sanchez <sebastian.sanchez@intel.com> Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NJan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 31 10月, 2017 1 次提交
-
-
由 Mike Marciniszyn 提交于
This patch adds tx_opcode_stats to parallel the (rx)opcode_stats in the debugfs. Reviewed-by: NKaike Wan <kaike.wan@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 15 10月, 2017 1 次提交
-
-
由 Bart Van Assche 提交于
Move the hfi1_handle_cnp_tbl[] from a header file to a .c file such that only one copy ends up in the hfi1 kernel module. This patch does not change any functionality. Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com> Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 05 10月, 2017 2 次提交
-
-
由 Michael J. Ruhl 提交于
The addition of the VNIC contexts to num_rcv_contexts changes the meaning of the sysfs value nctxts from available user contexts, to user contexts + reserved VNIC contexts. User applications that use nctxts are now broken. Update the calculation so that VNIC contexts are used only if there are hardware contexts available, and do not silently affect nctxts. Update code to use the calculated VNIC context number. Update the sysfs value nctxts to be available user contexts only. Fixes: 2280740f ("IB/hfi1: Virtual Network Interface Controller (VNIC) HW support") Reviewed-by: NIra Weiny <ira.weiny@intel.com> Reviewed-by: NNiranjana Vishwanathapura <Niranjana.Vishwanathapura@intel.com> Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Cc: <Stable@vger.kernel.org> #v4.12+ Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Mike Marciniszyn 提交于
The 16B changes to the output side of the header trace introduced two issues: 1. An uninitialized field "l4" for 9B packets This field needs to be given a value of 0 for 9B packets to insure a correct 9B trace. The fix adds a new define to insure that there is a dummy default for 9B packets to insure the correct string is decoded. 2. Use of entry vs. __entry in field references Fixes: Commit 863cf89d ("IB/hfi1: Add 16B trace support") Reported-by: NKaike Wan <kaike.wan@intel.com> Reviewed-by: NDon Hiatt <don.hiatt@intel.com> Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 27 9月, 2017 5 次提交
-
-
由 Michael J. Ruhl 提交于
hfi1_rcd_get_by_index assumes that the given index is in the correct range. In most cases this is correct because the index is bounded by a loop. For these cases, adding a range check to the function is redundant. For the use case that is not bounded by the loop range, a _safe wrapper function is needed to validate the index before accessing the rcd array. Add a _safe wrapper to _get_by_index to validate the index range. Update appropriate call sites with the new _safe function. Reviewed-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Jan Sokolowski 提交于
Following variables: hfi1_cpulist and hfi1_cpulist_count are unused. Remove them. Reviewed-by: NHarish Chegondi <harish.chegondi@intel.com> Reviewed-by: NJakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: NJan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Michael J. Ruhl 提交于
Calculating the offset to a context is done several times throughout the code. Create a common inlined function for doing this calculation. Reviewed-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Ira Weiny 提交于
devdata->link_default is no longer variable Maintain number of holes by moving dc_shutdown Reviewed-by: NSebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: NIra Weiny <ira.weiny@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
由 Michael J. Ruhl 提交于
The HFI PCI IRQ code uses an obsolete PCI API. Update the code to use the new PCI IRQ API and any necessary changes because of the new API. Reviewed-by: NSebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 29 8月, 2017 1 次提交
-
-
由 Grzegorz Morys 提交于
Ratelimit error prints from sdma_interrupt function that could swarm dmesg otherwise. Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NGrzegorz Morys <grzegorz.morys@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 23 8月, 2017 1 次提交
-
-
由 Kaike Wan 提交于
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: NKaike Wan <kaike.wan@intel.com> Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-