提交 · b67bbc5923bf99b4898fd3d4ed2337e8b716b448 · openanolis / cloud-kernel

22 6月, 2018 2 次提交

IB/hfi1: Remove rcvctrl from ctxtdata · b67bbc59

由 Mike Marciniszyn 提交于 6月 20, 2018

It is only ever written.
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

b67bbc59

IB/hfi1: Remove rcvhdrq_size · b2578431

由 Mike Marciniszyn 提交于 6月 20, 2018

The usage of this ctxt data field is not hot path and the value can be
computed on demand to cut down the ctxtdata bloat.
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

b2578431

20 6月, 2018 2 次提交

IB/hfi1: Remove rcvhdrsize · 32e3d970

由 Mike Marciniszyn 提交于 6月 04, 2018

The field is based on a constant that can never change.

Use the define to assign the register instead.
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

32e3d970

IB/hfi1: Move rhf_offset from devdata to ctxtdata · 40442b30

由 Mike Marciniszyn 提交于 6月 04, 2018

This field should be in ctxtdata to allow for better locality of access by
eliminating a dd dereference.

The new field is now side-by-side with rcvhdrqentsize since the rhf_offset
is a function of the rcvhdrqentsize.

Both fields are now correctly sized as u8.
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

40442b30

05 6月, 2018 2 次提交

IB/hfi1: Add bypass register defines and replace blind constants · dc2b2a91

由 Mike Marciniszyn 提交于 6月 04, 2018

These registers were not added in the 16B work.

Add them and replace blind constants with the correct defines.

Fixes: 72c07e2b ("IB/hfi1: Add support to receive 16B bypass packets")
Reviewed-by: NDon Hiatt <don.hiatt@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

dc2b2a91

IB/hfi1: Fix user context tail allocation for DMA_RTAIL · 1bc0299d

由 Mike Marciniszyn 提交于 5月 31, 2018

The following code fails to allocate a buffer for the
tail address that the hardware DMAs into when the user
context DMA_RTAIL is set.

if (HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL)) {
	rcd->rcvhdrtail_kvaddr = dma_zalloc_coherent(
		&dd->pcidev->dev, PAGE_SIZE, &dma_hdrqtail,
                gfp_flags);
	if (!rcd->rcvhdrtail_kvaddr)
		goto bail_free;
	rcd->rcvhdrqtailaddr_dma = dma_hdrqtail;
}

So the rcvhdrtail_kvaddr would then be NULL.

The mmap logic fails to check for a NULL rcvhdrtail_kvaddr.

The fix is to test for both user and kernel DMA_TAIL options
during the allocation as well as testing for a NULL
rcvhdrtail_kvaddr during the mmap processing.

Additionally, all downstream testing of the capmask for DMA_RTAIL
have been eliminated in favor of testing rcvhdrtail_kvaddr.

Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

1bc0299d

10 5月, 2018 5 次提交

IB/{hfi1, rdmavt, qib}: Implement CQ completion vector support · 5d18ee67

由 Sebastian Sanchez 提交于 5月 02, 2018

Currently the driver doesn't support completion vectors. These
are used to indicate which sets of CQs should be grouped together
into the same vector. A vector is a CQ processing thread that
runs on a specific CPU.

If an application has several CQs bound to different completion
vectors, and each completion vector runs on different CPUs, then
the completion queue workload is balanced. This helps scale as more
nodes are used.

Implement CQ completion vector support using a global workqueue
where a CQ entry is queued to the CPU corresponding to the CQ's
completion vector. Since the workqueue is global, it's guaranteed
to always be there when queueing CQ entries; Therefore, the RCU
locking for cq->rdi->worker in the hot path is superfluous.

Each completion vector is assigned to a different CPU. The number of
completion vectors available is computed by taking the number of
online, physical CPUs from the local NUMA node and subtracting the
CPUs used for kernel receive queues and the general interrupt.
Special use cases:

  * If there are no CPUs left for completion vectors, the same CPU
    for the general interrupt is used; Therefore, there would only
    be one completion vector available.

  * For multi-HFI systems, the number of completion vectors available
    for each device is the total number of completion vectors in
    the local NUMA node divided by the number of devices in the same
    NUMA node. If there's a division remainder, the first device to
    get initialized gets an extra completion vector.

Upon a CQ creation, an invalid completion vector could be specified.
Handle it as follows:

  * If the completion vector is less than 0, set it to 0.

  * Set the completion vector to the result of the passed completion
    vector moded with the number of device completion vectors
    available.
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

5d18ee67

IB/Hfi1: Read CCE Revision register to verify the device is responsive · c872a1f9

由 Kamenee Arumugam 提交于 5月 02, 2018

When Hfi1 device is unresponsive, reading the RcvArrayCnt register
will return all 1's. This value is then used to remap chip's RcvArray.
The incorrect all ones value used in remapping RcvArray
will cause warn on as shown by trace below:

[<ffffffff81685eac>] dump_stack+0x19/0x1b
[<ffffffff81085820>] warn_slowpath_common+0x70/0xb0
[<ffffffff810858bc>] warn_slowpath_fmt+0x5c/0x80
[<ffffffff81065c29>] __ioremap_caller+0x279/0x320
[<ffffffff8142873c>] ? _dev_info+0x6c/0x90
[<ffffffffa021d155>] ? hfi1_pcie_ddinit+0x1d5/0x330 [hfi1]
[<ffffffff81065d62>] ioremap_wc+0x32/0x40
[<ffffffffa021d155>] hfi1_pcie_ddinit+0x1d5/0x330 [hfi1]
[<ffffffffa0204851>] hfi1_init_dd+0x1d1/0x2440 [hfi1]
[<ffffffff813503dc>] ? pci_write_config_word+0x1c/0x20

Read CCE revision register first to verify that WFR device is
responsive. If the read return "all ones", bail out from init
and fail the driver load.
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NKamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

c872a1f9

IB/hfi1: Rework fault injection machinery · a74d5307

由 Mitko Haralanov 提交于 5月 02, 2018

The packet fault injection code present in the HFI1 driver had some
issues which not only fragment the code but also created user
confusion. Furthermore, it suffered from the following issues:

1. The fault_packet method only worked for received packets. This
meant that the only fault injection mode available for sent
packets is fault_opcode, which did not allow for random packet
drops on all egressing packets.
2. The mask available for the fault_opcode mode did not really work
due to the fact that the opcode values are not bits in a bitmask but
rather sequential integer values. Creating a opcode/mask pair that
would successfully capture a set of packets was nearly impossible.
3. The code was fragmented and used too many debugfs entries to
operate and control. This was confusing to users.
4. It did not allow filtering fault injection on a per direction basis -
egress vs. ingress.

In order to improve or fix the above issues, the following changes have
been made:

1. The fault injection methods have been combined into a single fault
injection facility. As such, the fault injection has been plugged
into both the send and receive code paths. Regardless of method used
the fault injection will operate on both egress and ingress packets.
2. The type of fault injection - by packet or by opcode - is now controlled
by changing the boolean value of the file "opcode_mode". When the value
is set to True, fault injection is done by opcode. Otherwise, by
packet.
2. The masking ability has been removed in favor of a bitmap that holds
opcodes of interest (one bit per opcode, a total of 256 bits). This
works in tandem with the "opcode_mode" value. When the value of
"opcode_mode" is False, this bitmap is ignored. When the value is
True, the bitmap lists all opcodes to be considered for fault injection.
By default, the bitmap is empty. When the user wants to filter by opcode,
the user sets the corresponding bit in the bitmap by echo'ing the bit
position into the 'opcodes' file. This gets around the issue that the set
of opcodes does not lend itself to effective masks and allow for extremely
fine-grained filtering by opcode.
4. fault_packet and fault_opcode methods have been combined. Hence, there
is only one debugfs directory controlling the entire operation of the
fault injection machinery. This reduces the number of debugfs entries
and provides a more unified user experience.
5. A new control files - "direction" - is provided to allow the user to
control the direction of packets, which are subject to fault injection.
6. A new control file - "skip_usec" - is added that would allow the user
to specify a "timeout" during which no fault injection will occur.

In addition, the following bug fixes have been applied:

1. The fault injection code has been split into its own header and source
files. This was done to better organize the code and support conditional
compilation without littering the code with #ifdef's.
2. The method by which the TX PIO packets were being marked for drop
conflicted with the way send contexts were being setup. As a result,
the send context was repeatedly being reset.
3. The fault injection only makes sense when the user can control it
through the debugfs entries. However, a kernel configuration can
enable fault injection but keep fault injection debugfs entries
disabled. Therefore, it makes sense that the HFI fault injection
code depends on both.
4. Error suppression did not take into account the method by which PIO
packets were being dropped. Therefore, even with error suppression
turned on, errors would still be displayed to the screen. A larger
enough packet drop percentage would case the kernel to crash because
the driver would be stuck printing errors.
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: NDon Hiatt <don.hiatt@intel.com>
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NMitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

a74d5307

IB/hfi1: Return correct value for device state · e4607073

由 Michael J. Ruhl 提交于 5月 02, 2018

The driver_pstate() function is used to map internal driver state
information to externally defined states.

The VERIFY_CAP and GOING_UP states are config/training states, but
the mapping routing returns the POLLING value.

Update the return values for VERIFY_CAP and GOING_UP to return the
correct value: TRAINING.
Reviewed-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

e4607073

IB/hfi1: Prevent LNI hang when LCB can't obtain lanes · 254361c1

由 Sebastian Sanchez 提交于 5月 02, 2018

When the LCB isn't able to get any lanes operational on the
first transition into mission mode, the link transfer active
never happens and the LNI stays in the polling state indefinitely.

Reset LCB upon receiving an 8051 interrupt for LCB to try to obtain
lanes with firmware version 1.25.0 or later. Also, update the LCB
reset value in other parts of the code with a macro defined to make
the code more maintainable and rename functions with the link_width
label to link_mode to reflect the fact that those functions set and
read link related data not just the link width.
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

254361c1

09 5月, 2018 1 次提交

IB/hfi1: Use after free race condition in send context error path · f9e76ca3

由 Michael J. Ruhl 提交于 5月 02, 2018

A pio send egress error can occur when the PSM library attempts to
to send a bad packet.  That issue is still being investigated.

The pio error interrupt handler then attempts to progress the recovery
of the errored pio send context.

Code inspection reveals that the handling lacks the necessary locking
if that recovery interleaves with a PSM close of the "context" object
contains the pio send context.

The lack of the locking can cause the recovery to access the already
freed pio send context object and incorrectly deduce that the pio
send context is actually a kernel pio send context as shown by the
NULL deref stack below:

[<ffffffff8143d78c>] _dev_info+0x6c/0x90
[<ffffffffc0613230>] sc_restart+0x70/0x1f0 [hfi1]
[<ffffffff816ab124>] ? __schedule+0x424/0x9b0
[<ffffffffc06133c5>] sc_halted+0x15/0x20 [hfi1]
[<ffffffff810aa3ba>] process_one_work+0x17a/0x440
[<ffffffff810ab086>] worker_thread+0x126/0x3c0
[<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
[<ffffffff810b252f>] kthread+0xcf/0xe0
[<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[<ffffffff816b8798>] ret_from_fork+0x58/0x90
[<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40

This is the best case scenario and other scenarios can corrupt the
already freed memory.

Fix by adding the necessary locking in the pio send context error
handler.

Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

f9e76ca3

02 2月, 2018 2 次提交

IB/hfi1: Convert PortXmitWait/PortVLXmitWait counters to flit times · 07190076

由 Kamenee Arumugam 提交于 2月 01, 2018

HFI's counters SendWaitCnt and SendWaitVlCnt are in units
of TXE cycle time (at 805MHz). OPA counters PortXmitWait and
PortVLXmtWait are in units of flit times.
Convert the counter values to flit units using following
conversion formula:

PortXmitWait =
	SendWaitCnt * 2 * (4 /link_width) * (25 Gbps /link_speed)
PortVLXmitWait =
	SendWaitVLCnt * 2 * (4 /link_width) * (25 Gbps /link_speed)

At link up or downgrade events, the link width can change. To ensure
accurate counter calculations, sample the counters after the events,
during counter requests, and then aggregate the OPA counters.
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NKamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

07190076

IB/hfi1: Re-order IRQ cleanup to address driver cleanup race · 82a97926

由 Michael J. Ruhl 提交于 2月 01, 2018

The pci_request_irq() interfaces always adds the IRQF_SHARED bit to
all IRQ requests.

When the kernel is built with CONFIG_DEBUG_SHIRQ config flag, if the
IRQF_SHARED bit is set, a call to the IRQ handler is made from the
__free_irq() function. This is testing a race condition between the
IRQ cleanup and an IRQ racing the cleanup.  The HFI driver should be
able to handle this race, but does not.

This race can cause traces that start with this footprint:

BUG: unable to handle kernel NULL pointer dereference at   (null)
Call Trace:
 <hfi1 irq handler>
 ...
 __free_irq+0x1b3/0x2d0
 free_irq+0x35/0x70
 pci_free_irq+0x1c/0x30
 clean_up_interrupts+0x53/0xf0 [hfi1]
 hfi1_start_cleanup+0x122/0x190 [hfi1]
 postinit_cleanup+0x1d/0x280 [hfi1]
 remove_one+0x233/0x250 [hfi1]
 pci_device_remove+0x39/0xc0

Export IRQ cleanup function so it can be called from other modules.

Using the exported cleanup function:

  Re-order the driver cleanup code to clean up IRQ resources before
  other resources, eliminating the race.

  Re-order error path for init so that the race does not occur.

Reduce severity on spurious error message for SDMA IRQs to info.
Reviewed-by: NAlex Estrin <alex.estrin@intel.com>
Reviewed-by: NPatel Jay P <jay.p.patel@intel.com>
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

82a97926

06 1月, 2018 2 次提交

IB/{hfi1, qib}: Fix a concurrency issue with device name in logging · 11f0e897

由 Michael J. Ruhl 提交于 12月 18, 2017

The get_unit_name() function crafts a string based on the device name
and the device unit number.  It then stores this in a static variable.

This has concurrency issues as can be seen with this log:

hfi1 0000:02:00.0: hfi1_1: read_idle_message: read idle message 0x203
hfi1 0000:01:00.0: hfi1_1: read_idle_message: read idle message 0x203

The PCI device ID (0000:02:00.0 vs. 0000:01:00.0) is correct for the
message, but the device string hfi1_1 is incorrect (it should be
hfi1_0 for the second log message).

Remove get_unit_name() function.

Instead, use the rvt accessor rvt_get_ibdev_name() to get the IB name
string.

Clean up any hfi1_early_xx calls that can now use the new path.

QIB has the same (qib_get_unit_name()) issue.  Updating as necessary.

Remove qib_get_unit_name() function.

Update log message that has redundant device name.
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

11f0e897

IB/hfi1: Fix infinite loop in 8051 command error path · 9996b049

由 Sebastian Sanchez 提交于 12月 18, 2017

When an 8051 command times out, the entire DC block is restarted. During
the restart, the host interface version bit is set, which calls
do_8051_command() recursively. The host version bit needs to be set
before the link moves into polling, so the host version bit can be set
in set_local_link_attributes() instead. Thus, the 8051 command functions
can be simplied as a non-locking version (dd->dc8051_lock) of those
functions are no longer needed.

Fixes: 9be6a5d7 ("IB/hfi1: Prevent LNI out of sync by resetting host interface version")
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

9996b049

14 11月, 2017 3 次提交

IB/hfi1: Send 'reboot' as planned down remote reason · e8d5aff6

由 Jan Sokolowski 提交于 11月 06, 2017

On host shutdown, driver sends 'SMA_Disabled' as a reason
for link down. This is incorrect.

Send 'reboot' as a linkdown reason.
Signed-off-by: NJan Sokolowski <jan.sokolowski@intel.com>
Reviewed-by: NJakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

e8d5aff6

IB/hfi1: Do not allocate PIO send contexts for VNIC · cc9a97ea

由 Niranjana Vishwanathapura 提交于 11月 06, 2017

OPA VNIC does not use PIO contexts and instead only uses SDMA
engines. Do not allocate PIO contexts for VNIC ports.
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NNiranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

cc9a97ea

IB/hfi1: Allow MgmtAllowed on B2B setups · 641f348b

由 Jan Sokolowski 提交于 11月 06, 2017

HFI's are hard-wired to send Device Info frames with
MgmtAllowed bit set to 0. This means in B2B setups,
MgmtAllowed would never be allowed, which prevents
remote opa management tools from working properly.

Assume MgmtAllowed if a neighbor is also an HFI.

Fixes: 98b9ee20 ("IB/hfi1: Cache neighbor secure data after link up")
Reviewed-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NJan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

641f348b

31 10月, 2017 4 次提交

IB/hfi1: Don't modify num_user_contexts module parameter · 45a041cc

由 Kamenee Arumugam 提交于 10月 23, 2017

The driver parameter num_user_contexts controls global behavior and
should not be modified by the driver.
This patch eliminates modification of num_user_contexts by using a
local variable to keep track of the value.
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NKamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

45a041cc

IB/hfi1: Insure int mask for in-kernel receive contexts is clear · 2d9544aa

由 Mike Marciniszyn 提交于 10月 23, 2017

The only use for the urg interrupt is for priority PSM packets.

There is no reason for this interrupt to be enabled for kernel
contexts.
Reviewed-by: NKaike Wan <kaike.wan@intel.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

2d9544aa

Ib/hfi1: Return actual operational VLs in port info query · 00f92031

由 Patel Jay P 提交于 10月 23, 2017

__subn_get_opa_portinfo stores value returned by hfi1_get_ib_cfg() as
operational vls. hfi1_get_ib_cfg() returns vls_operational field in
hfi1_pportdata. The problem with this is that the value is always equal
to vls_supported field in hfi1_pportdata.

The logic to calculate operational_vls is to set value passed by FM
(in  __subn_set_opa_portinfo routine). If no value is passed then
default value is stored in operational_vls.

Field actual_vls_operational is calculated on the basis of buffer
control table. Hence, modifying hfi1_get_ib_cfg() to return
actual_operational_vls when used with HFI1_IB_CFG_OP_VLS parameter
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NPatel Jay P <jay.p.patel@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

00f92031

IB/hfi1: Race condition between user notification and driver state · 4061f3a4

由 Michael J. Ruhl 提交于 10月 23, 2017

The handler for link init state (HLS_UP_INIT) notifies userspace
(update_statusp()) before enabling the device
(RCV_CTRL_RCV_PORT_ENABLE_SMASK) or setting the device state
(ppd->host_link_state).  This causes a race condition where the
userspace thinks the interface is in the INIT state before the driver
has set that state.

Rework the code path to eliminate the race.

Delay setting the init state until after a HW settling period.
Reviewed-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

4061f3a4

18 10月, 2017 2 次提交

IB/hfi1: Convert timers to use timer_setup() · 8064135e

由 Kees Cook 提交于 10月 16, 2017

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. Switches test of .data field to
.function, since .data will be going away.

Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

8064135e

IB/hfi1: Fix serdes loopback set-up · 242b494b

由 Jan Sokolowski 提交于 10月 09, 2017

Change serdes mode setting to use MISC_CONFIG_BITS in
VERIFY_CAP_LOCAL_LINK_WIDTH register. This method of
setting up serdes loopback is universally compatible
across all firmware versions.
Reviewed-by: NJakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: NJan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

242b494b

05 10月, 2017 3 次提交

IB/rdmavt: Correct issues with read-mostly and send size cache lines · 7ebfc93e

由 Sebastian Sanchez 提交于 10月 02, 2017

The s_ahgpsn was incorrectly placed in the read-mostly section of the QP
and the s_curr_size and s_hdrwords are oversized. The misplaced
s_ahgpsn will cause the read-mostly cachelines to thrash.

Place s_ahgpsn in the send side cache lines and correctly size and
s_hdrwords and s_cur_size to keep the send side cachelines at the same
size.
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

7ebfc93e

IB/hfi1: Prevent LNI out of sync by resetting host interface version · 9be6a5d7

由 Sebastian Sanchez 提交于 10月 02, 2017

When the link is disabled and re-enabled, the host version bit is not
set again, so the firmware behaves as though it’s interacting with an
old driver. This causes LNI to get out of sync. The host version bit
needs to be set at load_8051_firmware() and _dc_start(). Currently, it's
only set at load_8051_firmware().

Create a common function to set the bit with the intent to make the code
more maintainable in the future, set the host version bit at _dc_start()
and modify the 8051 command API to prevent a deadlock as _dc_start() is
already holding the dc8051 lock.

Fixes: 913cc671 ("IB/hfi1: Always perform offline transition")
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

9be6a5d7

IB/hfi1: Fix incorrect available receive user context count · d7d62617

由 Michael J. Ruhl 提交于 10月 02, 2017

The addition of the VNIC contexts to num_rcv_contexts changes the
meaning of the sysfs value nctxts from available user contexts, to
user contexts + reserved VNIC contexts.

User applications that use nctxts are now broken.

Update the calculation so that VNIC contexts are used only if there are
hardware contexts available, and do not silently affect nctxts.

Update code to use the calculated VNIC context number.

Update the sysfs value nctxts to be available user contexts only.

Fixes: 2280740f ("IB/hfi1: Virtual Network Interface Controller (VNIC) HW support")
Reviewed-by: NIra Weiny <ira.weiny@intel.com>
Reviewed-by: NNiranjana Vishwanathapura <Niranjana.Vishwanathapura@intel.com>
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Cc: <Stable@vger.kernel.org> #v4.12+
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

d7d62617

27 9月, 2017 7 次提交

IB/hfi1: Add a safe wrapper for _rcd_get_by_index · d59075ad

由 Michael J. Ruhl 提交于 9月 26, 2017

hfi1_rcd_get_by_index assumes that the given index is in the correct
range.  In most cases this is correct because the index is bounded by
a loop.  For these cases, adding a range check to the function is
redundant.

For the use case that is not bounded by the loop range, a _safe wrapper
function is needed to validate the index before accessing the rcd array.

Add a _safe wrapper to _get_by_index to validate the index range.

Update appropriate call sites with the new _safe function.
Reviewed-by: NIra Weiny <ira.weiny@intel.com>
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

d59075ad

IB/hfi1: Remove unused link_default variable · 156d24d7

由 Ira Weiny 提交于 9月 26, 2017

devdata->link_default is no longer variable

Maintain number of holes by moving dc_shutdown
Reviewed-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NIra Weiny <ira.weiny@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

156d24d7

IB/hfi1: Update HFI to use the latest PCI API · 05cb18fd

由 Michael J. Ruhl 提交于 9月 26, 2017

The HFI PCI IRQ code uses an obsolete PCI API. Update the code to use
the new PCI IRQ API and any necessary changes because of the new API.
Reviewed-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

05cb18fd

IB/hfi1: Add new state complete decodes for LNI failures · e870b4a1

由 Jakub Byczkowski 提交于 9月 26, 2017

Add state decodes for link width negotiation, verify cap time out
and secure data resolution failures.
Reviewed-by: NJan Sokolowski <jan.sokolowski@intel.com>
Reviewed-by: NIra Weiny <ira.weiny@intel.com>
Signed-off-by: NJakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

e870b4a1

IB/hfi1: Return correct value in general interrupt handler · 09592af5

由 Kamenee Arumugam 提交于 9月 26, 2017

The general interrupt handler returns IRQ_HANDLED whether an IRQ
was handled or not.
Determine if an IRQ was handled and return the correct value.
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: NKamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

09592af5

IB/hfi1: Only reset QSFP after link up and turn off AOC TX · 30e10527

由 Sebastian Sanchez 提交于 9月 26, 2017

QSFP reset enables AOC transmitters by default. They should be off
before moving to high power mode to complete the setup. There is no
need to reset the QSFP during LNI failure as it was reset at link down.
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: NJakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

30e10527

IB/hfi1: Turn off AOC TX after offline substates · df5efdd9

由 Sebastian Sanchez 提交于 9月 26, 2017

Offline.quietDuration was added in the 8051 firmware, and the driver
only turns off the AOC transmitters when offline.quiet is reached.
However, the AOC transmitters need to be turned off at the new state.
Therefore, turn off the AOC transmitters at any offline substates
including offline.quiet and offline.quietDuration, then recheck we
reached offline.quiet to support backwards compatibility.
Reviewed-by: NJakub Byczkowski <jakub.byczkowski@intel.com>
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

df5efdd9

29 8月, 2017 1 次提交

IB/hfi1: Ratelimit prints from sdma_interrupt · de42de80

由 Grzegorz Morys 提交于 8月 21, 2017

Ratelimit error prints from sdma_interrupt function
that could swarm dmesg otherwise.
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NGrzegorz Morys <grzegorz.morys@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

de42de80

23 8月, 2017 4 次提交

IB/hfi1: Remove pstate from hfi1_pportdata · d392a673

由 Jakub Byczkowski 提交于 8月 13, 2017

Do not track physical state separately from host_link_state.
Deduce physical state from host_link_state when required.
Change cache_physical_state to log_physical_state to make
sure host_link_state reflects hardwares physical state properly.
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: NIra Weiny <ira.weiny@intel.com>
Signed-off-by: NJakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

d392a673

IB/hfi1: Check xchg returned value for queuing link down entry · b6422bc0

由 Sebastian Sanchez 提交于 8月 13, 2017

Check xchg returned value for queuing link down entry
to guarantee proper atomic value reads.

Fixes: 626c077c ("IB/hfi1: Prevent link down request double queuing")
Reviewed-by: NMike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: NSebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

b6422bc0

IB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids · 51e658f5

由 Dasaratharaman Chandramouli 提交于 8月 04, 2017

Increase lid used in hfi1 driver to 32 bits. qib continues
to use 16 bit lids.
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: NDon Hiatt <don.hiatt@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

51e658f5

IB/hfi1: Add support to receive 16B bypass packets · 72c07e2b

由 Don Hiatt 提交于 8月 04, 2017

We introduce a struct hfi1_16b_header to support 16B headers.
16B bypass packets are received by the driver and processed
similar to 9B packets. Add basic support to handle 16B packets.
Reviewed-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDon Hiatt <don.hiatt@intel.com>
Signed-off-by: NDennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

72c07e2b

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功