Commits · 0719007663ce2d5da653ec1dc3bcfe2ab681b964 · openeuler / Kernel

02 Feb, 2018 · 3 commits

IB/hfi1: Convert PortXmitWait/PortVLXmitWait counters to flit times · 07190076

Committed by Kamenee Arumugam on Feb 01, 2018

HFI's counters SendWaitCnt and SendWaitVlCnt are in units
of TXE cycle time (at 805MHz). OPA counters PortXmitWait and
PortVLXmitWait are in units of flit times.
Convert the counter values to flit units using the following
conversion formula:

PortXmitWait =
	SendWaitCnt * 2 * (4 / link_width) * (25 Gbps / link_speed)
PortVLXmitWait =
	SendWaitVlCnt * 2 * (4 / link_width) * (25 Gbps / link_speed)

At link up or downgrade events, the link width can change. To ensure
accurate counter calculations, sample the counters after the events,
during counter requests, and then aggregate the OPA counters.
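
The conversion and aggregation described above can be sketched in plain C. This is a minimal userspace illustration, not the driver's code: the names `wait_cycles_to_flits`, `xmit_wait_state`, and `xmit_wait_sample` are hypothetical, and the link speed is assumed to be passed in Mb/s so that speeds below the 25 Gbps reference can be expressed as integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a raw wait count (TXE cycles) to flit times
 * using  cycles * 2 * (4 / link_width) * (25 Gbps / link_speed).
 * link_width is the number of active lanes, link_speed_mbps is in Mb/s.
 * Multiplication is done before division to avoid truncating the ratios.
 * Assumption: cycles is small enough that cycles * 200000 fits in 64 bits.
 */
uint64_t wait_cycles_to_flits(uint64_t cycles, uint32_t link_width,
                              uint32_t link_speed_mbps)
{
    if (link_width == 0 || link_speed_mbps == 0)
        return 0; /* link parameters unknown: nothing to convert */

    return (cycles * 2ULL * 4ULL * 25000ULL) /
           ((uint64_t)link_width * link_speed_mbps);
}

/* Hypothetical running state for one counter. */
struct xmit_wait_state {
    uint64_t prev_raw; /* raw hardware count at the last sample */
    uint64_t flits;    /* aggregated value in flit times */
};

/*
 * Sample the raw counter and aggregate. Only the delta since the previous
 * sample is converted, using the link parameters that were in effect while
 * that delta accumulated. This is why a sample is needed at every link
 * width change.
 */
void xmit_wait_sample(struct xmit_wait_state *s, uint64_t raw,
                      uint32_t link_width, uint32_t link_speed_mbps)
{
    assert(s != NULL);
    s->flits += wait_cycles_to_flits(raw - s->prev_raw, link_width,
                                     link_speed_mbps);
    s->prev_raw = raw;
}
```

Converting only the delta at each sample keeps earlier intervals from being rescaled when the link width later changes.
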
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/hfi1: Fix for early release of sdma context · 473291b3

Committed by Alex Estrin on Feb 01, 2018

With the IRQF_SHARED flag set and CONFIG_DEBUG_SHIRQ enabled,
module removal may panic in the sdma_interrupt() routine
if the associated sdma context was released before pci_free_irq():

[ 9198.939885] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 9198.940514] IP: sdma_make_progress+0xa5/0x450 [hfi1]
[ 9198.941114] PGD 170bdc0067 P4D 170bdc0067 PUD 172063e067 PMD 0
[ 9198.941783] Oops: 0000 [#1] SMP
.....
[ 9198.958877] CPU: 132 PID: 64173 Comm: rmmod Tainted: G           OE   4.14.0-rc4+ #1
[ 9198.961032] Hardware name: Intel Corporation S7200AP/S7200AP, BIOS S72C610.86B.01.02.0118.080620171935 08/06/2017
[ 9198.963323] task: ffff9681397f0000 task.stack: ffffae1647c40000
[ 9198.965695] RIP: 0010:sdma_make_progress+0xa5/0x450 [hfi1]
[ 9198.968082] RSP: 0018:ffffae1647c43be8 EFLAGS: 00010046
[ 9198.970503] RAX: 0000000000000000 RBX: ffff9680ce8b5ca8 RCX: 0000000000000000
[ 9198.973006] RDX: 0000000000000000 RSI: 0000000001a00d28 RDI: ffff9680ce8b5ca0
[ 9198.975546] RBP: ffffae1647c43c40 R08: ffff96814325ec00 R09: 00000000ffffffff
[ 9198.978142] R10: 000000004325e501 R11: ffff96814325ec00 R12: ffff9680ce8b5c44
[ 9198.980779] R13: ffff9680ce8b5ca0 R14: 0000000000000000 R15: ffff9680ce8b5b00
[ 9198.983462] FS:  00007f31196ba740(0000) GS:ffff96819df00000(0000) knlGS:0000000000000000
[ 9198.986231] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9198.989036] CR2: 0000000000000000 CR3: 000000170833f000 CR4: 00000000001406e0
[ 9198.991911] Call Trace:
[ 9198.994847]  sdma_engine_interrupt+0x82/0x100 [hfi1]
[ 9198.997852]  sdma_interrupt+0x61/0xc0 [hfi1]
[ 9199.000852]  __free_irq+0x1b3/0x2d0
[ 9199.003873]  free_irq+0x35/0x70
[ 9199.006909]  pci_free_irq+0x1c/0x30
[ 9199.009999]  clean_up_interrupts+0x53/0xf0 [hfi1]
[ 9199.013137]  hfi1_start_cleanup+0x117/0x190 [hfi1]
[ 9199.016315]  postinit_cleanup+0x1d/0x270 [hfi1]
[ 9199.019529]  remove_one+0x1f3/0x210 [hfi1]
[ 9199.022738]  pci_device_remove+0x39/0xc0
[ 9199.025974]  device_release_driver_internal+0x141/0x210
[ 9199.029268]  driver_detach+0x3f/0x80
[ 9199.032580]  bus_remove_driver+0x55/0xd0
[ 9199.035931]  driver_unregister+0x2c/0x50
[ 9199.039321]  pci_unregister_driver+0x2a/0xa0
[ 9199.042755]  hfi1_mod_cleanup+0x10/0xb50 [hfi1]
[ 9199.046196]  SyS_delete_module+0x171/0x250
...

Fix by exporting sdma_clean() and removing its call from sdma_exit().
sdma_exit() now only manipulates the engine state,
leaving the memory freeing to sdma_clean(), which is now called
just before the dd is freed.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Michael J Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

IB/hfi1: Re-order IRQ cleanup to address driver cleanup race · 82a97926

Committed by Michael J. Ruhl on Feb 01, 2018

The pci_request_irq() interface always adds the IRQF_SHARED bit to
all IRQ requests.

When the kernel is built with the CONFIG_DEBUG_SHIRQ config flag and the
IRQF_SHARED bit is set, the __free_irq() function calls the IRQ handler.
This tests for a race between the IRQ cleanup and an IRQ arriving during
the cleanup.  The HFI driver should be able to handle this race, but
does not.

This race can cause traces that start with this footprint:

BUG: unable to handle kernel NULL pointer dereference at   (null)
Call Trace:
 <hfi1 irq handler>
 ...
 __free_irq+0x1b3/0x2d0
 free_irq+0x35/0x70
 pci_free_irq+0x1c/0x30
 clean_up_interrupts+0x53/0xf0 [hfi1]
 hfi1_start_cleanup+0x122/0x190 [hfi1]
 postinit_cleanup+0x1d/0x280 [hfi1]
 remove_one+0x233/0x250 [hfi1]
 pci_device_remove+0x39/0xc0

Export IRQ cleanup function so it can be called from other modules.

Using the exported cleanup function:

  Re-order the driver cleanup code to clean up IRQ resources before
  other resources, eliminating the race.

  Re-order error path for init so that the race does not occur.

Reduce severity on spurious error message for SDMA IRQs to info.
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Reviewed-by: Patel Jay P <jay.p.patel@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

06 Jan, 2018 · 1 commit

IB/{rdmavt, hfi1, qib}: Self determine driver name · 5084c8ff

Committed by Michael J. Ruhl on Dec 18, 2017

Currently the HFI and QIB drivers allow the IB core to assign a unit
number to the driver name string.

If multiple devices exist in a system, there is a possibility that the
device unit number and the IB core number will be mismatched.

Fix by using the driver defined unit number to generate the device
name.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

14 Nov, 2017 · 1 commit

IB/hfi1: Do not allocate PIO send contexts for VNIC · cc9a97ea

Committed by Niranjana Vishwanathapura on Nov 06, 2017

OPA VNIC does not use PIO contexts and instead only uses SDMA
engines. Do not allocate PIO contexts for VNIC ports.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

31 Oct, 2017 · 1 commit

IB/hfi1: Add tx_opcode_stats like the opcode_stats · 1b311f89

Committed by Mike Marciniszyn on Oct 23, 2017

This patch adds tx_opcode_stats to parallel the
(rx)opcode_stats in the debugfs.
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

18 Oct, 2017 · 1 commit

IB/hfi1: Convert timers to use timer_setup() · 8064135e

Committed by Kees Cook on Oct 16, 2017

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. Switches test of .data field to
.function, since .data will be going away.
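
The callback pattern described above can be sketched in userspace C. This sketch does not use the kernel headers: `struct timer_list`, `container_of`, and `mock_timer_setup` below are simplified stand-ins written for this example, and `demo_dev`, `demo_timeout`, and `demo_run` are hypothetical names. In the kernel, `from_timer()` recovers the enclosing structure in the same `container_of()` style.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct timer_list (illustration only). */
struct timer_list {
    void (*function)(struct timer_list *t);
};

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for timer_setup(): record the callback, which receives the timer. */
void mock_timer_setup(struct timer_list *t, void (*fn)(struct timer_list *))
{
    t->function = fn;
}

/* Hypothetical driver object that embeds its timer. */
struct demo_dev {
    int fired;
    struct timer_list timer;
};

/* The callback gets the timer pointer and derives its owner from it. */
void demo_timeout(struct timer_list *t)
{
    /* Kernel code would write: struct demo_dev *dev = from_timer(dev, t, timer); */
    struct demo_dev *dev = container_of(t, struct demo_dev, timer);

    dev->fired++;
}

int demo_run(void)
{
    struct demo_dev dev = { 0 };

    mock_timer_setup(&dev.timer, demo_timeout);

    /* Test .function (not the old .data field) to see if the timer is set up. */
    if (dev.timer.function)
        dev.timer.function(&dev.timer);

    return dev.fired;
}
```

Because the owner is derived from the timer pointer itself, no separate data argument has to be stored in the timer.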

Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

27 Sep, 2017 · 3 commits

IB/hfi1: Add a safe wrapper for _rcd_get_by_index · d59075ad

Committed by Michael J. Ruhl on Sep 26, 2017

hfi1_rcd_get_by_index assumes that the given index is in the correct
range.  In most cases this is correct because the index is bounded by
a loop.  For these cases, adding a range check to the function is
redundant.

For the use case that is not bounded by the loop range, a _safe wrapper
function is needed to validate the index before accessing the rcd array.

Add a _safe wrapper to _get_by_index to validate the index range.

Update appropriate call sites with the new _safe function.
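
The split between an unchecked accessor and a validating wrapper can be sketched as follows. This is a minimal userspace illustration under assumed names (`demo_devdata`, `demo_rcd_get_by_index`, `demo_rcd_get_by_index_safe`, `DEMO_NUM_CTXTS`); it shows only the range check, not whatever else the driver's accessor does.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_CTXTS 8 /* hypothetical array capacity */

struct demo_rcd {
    uint16_t ctxt;
};

struct demo_devdata {
    struct demo_rcd *rcd[DEMO_NUM_CTXTS];
    uint16_t num_rcv_contexts; /* valid indices are 0..num_rcv_contexts-1 */
};

/*
 * Unchecked accessor: the caller guarantees idx is in range, for example
 * because it comes from a loop bounded by num_rcv_contexts.
 */
struct demo_rcd *demo_rcd_get_by_index(struct demo_devdata *dd, uint16_t idx)
{
    return dd->rcd[idx];
}

/*
 * Safe wrapper: for indices that are not bounded by a loop (for example,
 * values derived from external input). Validate first, then delegate.
 */
struct demo_rcd *demo_rcd_get_by_index_safe(struct demo_devdata *dd,
                                            uint16_t idx)
{
    if (idx >= dd->num_rcv_contexts)
        return NULL;

    return demo_rcd_get_by_index(dd, idx);
}
```

Loop-bounded callers keep the cheap accessor, and only call sites with an unverified index pay for the check.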
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Remove unused hfi1_cpulist variables · 6fab2a88

Committed by Jan Sokolowski on Sep 26, 2017

The variables hfi1_cpulist and hfi1_cpulist_count
are unused. Remove them.
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Remove unnecessary error messages on alloc failures · 6fee0369

Committed by Jan Sokolowski on Sep 26, 2017

Allocation failures for the per-cpu variables int_counter, rcv_limit,
and send_schedule print unnecessary error messages.
Remove the error messages.
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

23 Aug, 2017 · 3 commits

IB/hf1: User context locking is inconsistent · d295dbeb

Committed by Michael J. Ruhl on Aug 04, 2017

A mixture of mutexes and spinlocks protects receive context
(rcd/uctxt) information.  They are not used consistently.

Use the mutex to protect device receive context information only.
Use the spinlock to protect sub context information only.

Protect access to items in the rcd array with a spinlock and
reference count.

Remove spinlock around dd->rcd array cleanup.  Since interrupts are
disabled and cleaned up before this point, this lock is not useful.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Protect context array set/clear with spinlock · f2a3bc00

Committed by Michael J. Ruhl on Aug 04, 2017

The rcd array can be accessed from user context or during interrupts.
Protecting this with a mutex isn't a good idea because the mutex should
not be used from an IRQ.

Protect the allocation and freeing of rcd array elements with a
spinlock.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Revert egress pkey check enforcement · ecdb19f4

Committed by Alex Estrin on Aug 04, 2017

Current code has some serious flaws. Disarm the flag
pending an appropriate patch.

Fixes: 53526500 ("IB/hfi1: Permanently enable P_Key checking in HFI")
Cc: stable@vger.kernel.org
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

01 Aug, 2017 · 3 commits

IB/hfi1: Create workqueue for link events · 71d47008

Committed by Sebastian Sanchez on Jul 29, 2017

Currently, link down interrupts queue link entries
on a workqueue intended for sending events only.
Create a workqueue for queuing link events.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Pass the context pointer rather than the index · 2250563e

Committed by Michael J. Ruhl on Jul 24, 2017

The hfi1_rcvctrl() function receives an index which it then converts
to an rcd.  Since most functions have the rcd, use that instead.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Size rcd array index correctly and consistently · e6f7622d

Committed by Michael J. Ruhl on Jul 24, 2017

The array index for the rcd array is sized several different ways
throughout the code.

Use the user interface size (u16) as the standard size and update the
necessary code to reflect this.

u16 is large enough for the largest number of supported contexts.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

28 Jun, 2017 · 2 commits

IB/hfi1: Resolve kernel panics by reference counting receive contexts · f683c80c

Committed by Michael J. Ruhl on Jun 09, 2017

Base receive contexts can be used by sub contexts.  Because of this,
resources for the context cannot be completely freed until all sub
contexts are done using the base context.

Introduce a reference count so that the base receive context can be
freed only when all sub contexts are done with it.

Use the provided function call for setting default send context
integrity rather than the manual method.

The cleanup path does not set all variables back to NULL after freeing
resources.  Since the clean up code can get called more than once,
(e.g. during context close and on the error path), it is necessary to
make sure that all the variables are NULLed.

Possible crashes are:

BUG: unable to handle kernel paging request at 0000000001908900
IP: read_csr+0x24/0x30 [hfi1]
RIP: 0010:read_csr+0x24/0x30 [hfi1]
Call Trace:
 sc_disable+0x40/0x110 [hfi1]
 hfi1_file_close+0x16f/0x360 [hfi1]
 __fput+0xe7/0x210
 ____fput+0xe/0x10

or

kernel BUG at mm/slub.c:3877!
RIP: 0010:kfree+0x14f/0x170
Call Trace:
 hfi1_free_ctxtdata+0x19a/0x2b0 [hfi1]
 ? hfi1_user_exp_rcv_grp_free+0x73/0x80 [hfi1]
 hfi1_file_close+0x20f/0x360 [hfi1]
 __fput+0xe7/0x210
 ____fput+0xe/0x10
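
The two ideas in this commit message, a reference count on the base context and a cleanup routine that NULLs what it frees, can be sketched in userspace C. All names here (`demo_ctxt`, `demo_ctxt_get`, `demo_ctxt_put`, `egrbufs`) are hypothetical, and the counter is a plain `int` for brevity; kernel code would typically use an atomic reference count such as `kref`.

```c
#include <stdlib.h>

/* Hypothetical base context shared by several sub contexts. */
struct demo_ctxt {
    int refcount;  /* one reference per user of the base context */
    void *egrbufs; /* stand-in for a resource owned by the context */
};

/* Create a context holding one reference; returns NULL on failure. */
struct demo_ctxt *demo_ctxt_create(void)
{
    struct demo_ctxt *c = calloc(1, sizeof(*c));

    if (!c)
        return NULL;

    c->egrbufs = malloc(64);
    if (!c->egrbufs) {
        free(c);
        return NULL;
    }
    c->refcount = 1;
    return c;
}

/* A sub context takes a reference while it uses the base context. */
void demo_ctxt_get(struct demo_ctxt *c)
{
    c->refcount++;
}

/*
 * Release resources. Pointers are reset to NULL so that a second call
 * (for example from both the close path and an error path) is harmless:
 * free(NULL) is a no-op.
 */
void demo_ctxt_free_resources(struct demo_ctxt *c)
{
    free(c->egrbufs);
    c->egrbufs = NULL;
}

/* Drop a reference. Returns 1 if this was the last one and c was freed. */
int demo_ctxt_put(struct demo_ctxt *c)
{
    if (--c->refcount > 0)
        return 0;

    demo_ctxt_free_resources(c);
    free(c);
    return 1;
}
```

With this shape, the base context outlives every sub context that still holds a reference, and repeated cleanup cannot double free.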

Fixes: 62239fc6 ("IB/hfi1: Clean up on context initialization failure")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Initialize TID lists to avoid crash on cleanup · fe4e74ee

Committed by Michael J. Ruhl on Jun 09, 2017

The expected receive lists (tid_xxx_list) are not initialized until
late in the receive context initialization.  If an error happens
before the initialization, a NULL pointer access will occur during
cleanup.

Initialize the lists sooner rather than later to avoid this Oops:

IP: unlock_exp_tids.isra.11+0x26/0xd0 [hfi1]
RIP: 0010:unlock_exp_tids.isra.11+0x26/0xd0 [hfi1]
Call Trace:
 hfi1_user_exp_rcv_free+0x79/0xb0 [hfi1]
 hfi1_file_close+0x87/0x360 [hfi1]
 __fput+0xe7/0x210
 ____fput+0xe/0x10
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

05 May, 2017 · 7 commits

IB/hfi1: Clean up on context initialization failure · 62239fc6

Committed by Michael J. Ruhl on May 04, 2017

The error path for context initialization is not consistent. Clean up all
resources on failure.

Remove the unused variable user_event_mask.

Add the _BASE_FAILED bit to the event flags so that a base context can
notify waiting sub contexts that they cannot continue.

Running out of sub contexts is an EBUSY result, not EINVAL.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Fix an assign/ordering issue with shared context IDs · 8737ce95

Committed by Michael J. Ruhl on May 04, 2017

The current algorithm for generating sub-context IDs is FILO.  If the
contexts are not closed in that order, the uniqueness of the ID will be
compromised. For example, logging the creation/deletion of context IDs
with an application that assigns and closes in a FIFO order reveals:

cache_id: assign: uctxt: 3    sub_ctxt: 0
cache_id: assign: uctxt: 3    sub_ctxt: 1
cache_id: assign: uctxt: 3    sub_ctxt: 2
cache_id: close:  uctxt: 3    sub_ctxt: 0
cache_id: assign: uctxt: 3    sub_ctxt: 2 <<<

The sub_ctxt ID 2 is reused incorrectly.

Update the sub-context ID assign algorithm to use a bitmask of in_use
contexts.  The new algorithm will allow the contexts to be closed in any
order, and will only re-use unused contexts.

Size subctxt and subctxt_cnt to match the user API size.
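
The bitmask-based assignment described above can be sketched as follows. This is a minimal illustration with hypothetical names (`demo_uctxt`, `demo_subctxt_assign`, `demo_subctxt_close`, `DEMO_MAX_SUBCTXT`) and an assumed limit of eight sub-contexts; it is not the driver's implementation.

```c
#include <stdint.h>

#define DEMO_MAX_SUBCTXT 8 /* hypothetical limit, one bit per sub-context */

struct demo_uctxt {
    uint8_t in_use; /* bit n set means sub-context ID n is assigned */
};

/* Assign the lowest free sub-context ID; returns -1 if none is free. */
int demo_subctxt_assign(struct demo_uctxt *u)
{
    for (int id = 0; id < DEMO_MAX_SUBCTXT; id++) {
        if (!(u->in_use & (1u << id))) {
            u->in_use |= (uint8_t)(1u << id);
            return id;
        }
    }
    return -1;
}

/* Release an ID in any order; only its own bit is cleared. */
void demo_subctxt_close(struct demo_uctxt *u, int id)
{
    if (id >= 0 && id < DEMO_MAX_SUBCTXT)
        u->in_use &= (uint8_t)~(1u << id);
}
```

Replaying the logged sequence (assign 0, 1, 2, then close 0) against this sketch, the next assignment returns ID 0 rather than reusing ID 2, because only IDs whose bit is clear are handed out.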
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Clean up context initialization · 9b60d2cb

Committed by Michael J. Ruhl on May 04, 2017

Context initialization mixes base context init with sub context init.
This is bad because contexts can be reused, and on reuse, things are
reinitialized that should not be.

Normalize comments and function names to refer to base context and
sub context (not main, shared or slaves).

Separate the base context initialization from sub context initialization.

hfi1_init_ctxt() cannot return an error, so change it to return void and
remove the error message.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Name function prototype parameters · f4cd8765

Committed by Michael J. Ruhl on May 04, 2017

To improve the readability of function prototypes, give the parameters
names.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Return an error on memory allocation failure · 94679061

Committed by Michael J. Ruhl on May 04, 2017

If the eager buffer allocation fails, it is necessary to return
an error code.

Cc: stable@vger.kernel.org
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Adjust default eager_buffer_size to 8MB · 9746fa43

Committed by Tymoteusz Kielan on May 04, 2017

Performance analysis shows that PSM2 benefits from increasing the eager
buffer size from 2MB to 8MB. The change has a neutral impact on verbs.
Change the module parameter's default value accordingly. Allocation
ring down was verified to work with the larger buffer size.
Reviewed-by: Tadeusz Struk <tadeusz.struk@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Fix yield logic in send engine · dd1ed108

Committed by Mike Marciniszyn on May 04, 2017

When there are many RC QPs and an RDMA READ request
is sent, timeouts occur on the requester side because
of fairness among RC QPs on their relative SDMA engine
on the responder side.  This also hits write and send, but
to a lesser extent.

Complicating the issue, the current code checks whether the workqueue
is congested before scheduling other QPs. However, this
check is based on the number of active entries in the
workqueue, which was found to be too big for
workqueue_congested() to be effective.

Fix by reducing the number of active entries as revealed by
experimentation from the default of num_sdma to
HFI1_MAX_ACTIVE_WORKQUEUE_ENTRIES.  Retry counts were monitored
to determine the correct value.

Tracing to investigate any future issues is also added.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

29 Apr, 2017 · 3 commits

IB/hfi1: Fix unbalanced braces around else · ee495ada

Committed by Dennis Dalessandro on Apr 09, 2017

Add missing braces around else blocks in a few places to make checkpatch
happy.

Fixes: 77241056 ("IB/hfi1: add driver files")
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Permanently enable P_Key checking in HFI · 53526500

Committed by Neel Desai on Apr 09, 2017

Ingress and egress port P_Key checking should always be performed for
HFIs. This patch will enable ingress and egress P_Key checking when
the port is initialized and will ignore the P_Key information sent by
the FM in the port info structure which is meant to be used only by the
switch.
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Neel Desai <neel.desai@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Fix softlockup issue · 22546b74

Committed by Tadeusz Struk on Apr 28, 2017

Soft lockups can occur because the MAD processing on different CPUs acquires
the spin lock dc8051_lock:

[534552.835870]  [<ffffffffa026f993>] ? read_dev_port_cntr.isra.37+0x23/0x160 [hfi1]
[534552.835880]  [<ffffffffa02775af>] read_dev_cntr+0x4f/0x60 [hfi1]
[534552.835893]  [<ffffffffa028d7cd>] pma_get_opa_portstatus+0x64d/0x8c0 [hfi1]
[534552.835904]  [<ffffffffa0290e7d>] hfi1_process_mad+0x48d/0x18c0 [hfi1]
[534552.835908]  [<ffffffff811dc1f1>] ? __slab_free+0x81/0x2f0
[534552.835936]  [<ffffffffa024c34e>] ? ib_mad_recv_done+0x21e/0xa30 [ib_core]
[534552.835939]  [<ffffffff811dd153>] ? __kmalloc+0x1f3/0x240
[534552.835947]  [<ffffffffa024c3fb>] ib_mad_recv_done+0x2cb/0xa30 [ib_core]
[534552.835955]  [<ffffffffa0237c85>] __ib_process_cq+0x55/0xd0 [ib_core]
[534552.835962]  [<ffffffffa0237d70>] ib_cq_poll_work+0x20/0x60 [ib_core]
[534552.835964]  [<ffffffff810a7f3b>] process_one_work+0x17b/0x470
[534552.835966]  [<ffffffff810a8d76>] worker_thread+0x126/0x410
[534552.835969]  [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460
[534552.835971]  [<ffffffff810b052f>] kthread+0xcf/0xe0
[534552.835974]  [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[534552.835977]  [<ffffffff81696418>] ret_from_fork+0x58/0x90
[534552.835980]  [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140

This issue is made worse when the 8051 is busy and the reads take longer.
Fix by using a non-spinning lock procedure.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

21 Apr, 2017 · 3 commits

IB/hfi1: VNIC SDMA support · 64551ede

Committed by Vishwanathapura, Niranjana on Apr 12, 2017

HFI1 VNIC SDMA support enables transmission of VNIC packets over SDMA.
Map VNIC queues to SDMA engines and support halting and wakeup of the
VNIC queues.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Virtual Network Interface Controller (VNIC) HW support · 2280740f

Committed by Vishwanathapura, Niranjana on Apr 12, 2017

HFI1 HW specific support for VNIC functionality.
Dynamically allocate a set of contexts for VNIC when the first vnic
port is instantiated. Allocate VNIC contexts from user contexts pool
and return them back to the same pool while freeing up. Set aside
enough MSI-X interrupts for VNIC contexts and assign them when the
contexts are allocated. On the receive side, use an RSM rule to
spread TCP/UDP streams among VNIC contexts.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: OPA_VNIC RDMA netdev support · d4829ea6

Committed by Vishwanathapura, Niranjana on Apr 12, 2017

Add support to create and free OPA_VNIC rdma netdev devices.
Implement netstack interface functionality including xmit_skb,
receive side NAPI etc. Also implement rdma netdev control functions.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

06 Apr, 2017 · 1 commit

IB/hfi1: Check device id early during init · 5d6f08af

Committed by Tadeusz Struk on Mar 20, 2017

If there is a wrong device passed to the driver it should fail early,
without trying to initialize the device only to find out that it has
an invalid device later during the init.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

19 Feb, 2017 · 1 commit

IB/hfi1: Allocate context data on memory node · b448bf9a

Committed by Sebastian Sanchez on Feb 08, 2017

There are some memory allocation calls in hfi1_create_ctxtdata()
that do not use the numa function parameter. This
can cause cache lines to be filled over QPI.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

16 Nov, 2016 · 5 commits

IB/hfi1: Relocate rcvhdrcnt module parameter check. · 11501ab9

Committed by Krzysztof Blaszkowski on Oct 25, 2016

Validate the rcvhdrcnt module parameter in a single function at module
load time. This allows proper error reporting.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Krzysztof Blaszkowski <krzysztof.blaszkowski@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Delete unused lock · f0f98f74

Committed by Easwar Hariharan on Oct 17, 2016

The lock is an unused vestige from qib. Remove it.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Fix a potential memory leak in hfi1_create_ctxts() · 4dfe7cce

Committed by Jianxin Xiong on Oct 17, 2016

In the function hfi1_create_ctxts the array "dd->rcd" is allocated and
then populated with allocated resources in a loop. Previously, if an
error happened during the loop, only the resources allocated in the current
iteration would be freed. The array itself would then be freed, leaving
the resources that were allocated in previous iterations and referenced
by the array elements in limbo.

This patch makes sure all allocated resources are freed before freeing
the array "dd->rcd". Also the resource allocation now takes account of
the numa node the device is attached to.
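
The unwinding described above can be sketched in userspace C. The names (`demo_create_ctxts`, `demo_alloc_rcd`, `demo_live_count`) and the `fail_at` parameter, which simulates an allocation failure at a chosen index, are assumptions made for this example and do not come from the driver.

```c
#include <stdlib.h>

struct demo_rcd {
    int ctxt;
};

static size_t demo_live; /* number of element allocations not yet freed */

/* Allocate one element; fails on purpose when ctxt == fail_at. */
static struct demo_rcd *demo_alloc_rcd(int ctxt, int fail_at)
{
    struct demo_rcd *r;

    if (ctxt == fail_at)
        return NULL; /* simulated allocation failure */

    r = malloc(sizeof(*r));
    if (r) {
        r->ctxt = ctxt;
        demo_live++;
    }
    return r;
}

static void demo_free_rcd(struct demo_rcd *r)
{
    if (r) {
        free(r);
        demo_live--;
    }
}

/* Free every element, then the array itself. */
void demo_destroy_ctxts(struct demo_rcd **rcd, int n)
{
    for (int i = 0; i < n; i++)
        demo_free_rcd(rcd[i]);
    free(rcd);
}

/* Allocate the array and populate it; on any failure, undo everything. */
struct demo_rcd **demo_create_ctxts(int n, int fail_at)
{
    struct demo_rcd **rcd;

    if (n <= 0)
        return NULL;

    rcd = calloc((size_t)n, sizeof(*rcd)); /* unused slots stay NULL */
    if (!rcd)
        return NULL;

    for (int i = 0; i < n; i++) {
        rcd[i] = demo_alloc_rcd(i, fail_at);
        if (!rcd[i]) {
            /* Release elements from earlier iterations too, not only this one. */
            demo_destroy_ctxts(rcd, n);
            return NULL;
        }
    }
    return rcd;
}

size_t demo_live_count(void)
{
    return demo_live;
}
```

Zero-initializing the array lets a single teardown routine serve both the error path and normal destruction, since unpopulated slots are simply skipped.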
Reviewed-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Return ENODEV for unsupported PCI device ids. · 83fb4af6

Committed by Krzysztof Blaszkowski on Oct 17, 2016

Clean up device type checking.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Krzysztof Blaszkowski <krzysztof.blaszkowski@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Fix an Oops on pci device force remove · acd7c8fe

Committed by Tadeusz Struk on Oct 25, 2016

This patch fixes an Oops on device unbind, when the device is used
by a PSM user process. PSM processes access device resources which
are freed on device removal. Similar protection exists in uverbs
in ib_core for Verbs clients, but PSM doesn't use ib_uverbs hence
a separate protection is required for PSM clients.

Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

02 Oct, 2016 · 2 commits

IB/hfi1: Fix resource release in context allocation · 3a6982df

Committed by Jakub Pawlak on Sep 25, 2016

Correct the resource freeing in the allocate_ctxt() function.
When context creation fails, allocated resources are now properly
released and the pointer in the receive context data table is set back
to NULL.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

IB/hfi1: Fix user-space buffers mapping with IOMMU enabled · 60368186

Committed by Tymoteusz Kielan on Sep 06, 2016

The dma_XXX API functions return bus addresses which are
physical addresses when IOMMU is disabled. Buffer
mapping to user-space is done via remap_pfn_range() with PFN
based on bus address instead of physical. This results in
wrong pages being mapped to user-space when IOMMU is enabled.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
