Commits · 5fbd98dd20b9e9829868ebb874bc4d97f3ed3c9e · openanolis / cloud-kernel

03 Aug, 2016 39 commits

IB/hfi1: Ignore QSFP interrupts until power stabilizes · 5fbd98dd

Committed by Easwar Hariharan on Jul 25, 2016

Some QSFP cables assert the interrupt line as a side effect of module
plug-in and power up. This causes the SerDes and QSFP tuning algorithm
to begin cable initialization by reading the QSFP memory map over I2C,
which fails. This patch ignores any interrupt line assertion until
the module has completed power up and voltage rails have stabilized,
which can take a maximum of 500 ms per the SFF-8679 specification.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

5fbd98dd
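
As a rough illustration of the gating described in 5fbd98dd, the sketch below refuses to act on a QSFP interrupt until the settle window has elapsed. The struct, the function name, and the millisecond timestamps are hypothetical and are not taken from the hfi1 driver; only the 500 ms maximum comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Maximum power-up settle time cited in the commit message (SFF-8679). */
#define QSFP_SETTLE_MS 500u

/* Hypothetical per-port state: when the module was plugged in. */
struct qsfp_port {
    uint64_t plug_in_ms;
};

/*
 * Return true once interrupts from the module may be acted on.
 * Assumes now_ms comes from a monotonic clock, so now_ms >= plug_in_ms.
 */
static bool qsfp_irq_allowed(const struct qsfp_port *port, uint64_t now_ms)
{
    return now_ms - port->plug_in_ms >= QSFP_SETTLE_MS;
}
```

An interrupt handler built along these lines would return early while `qsfp_irq_allowed()` is false, so the I2C read of the QSFP memory map is never attempted before the rails are stable.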

IB/hfi1: Disable external device configuration requests · 3ca5f4c0

Committed by Easwar Hariharan on Jul 25, 2016

QSFP CDR enablement is now controlled by determining power class
and the configuration file. We disable the DC 8051 from requesting
enablement or disabling of TX and RX CDRs by removing the code
that allowed the DC 8051 to request changes.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

3ca5f4c0

IB/rdmavt, hfi1: Fix NFSoRDMA failure with FRMR enabled · d9b13c20

Committed by Jianxin Xiong on Jul 25, 2016

Hanging has been observed while writing a file over NFSoRDMA. Dmesg on
the server contains messages like these:

[  931.992501] svcrdma: Error -22 posting RDMA_READ
[  952.076879] svcrdma: Error -22 posting RDMA_READ
[  982.154127] svcrdma: Error -22 posting RDMA_READ
[ 1012.235884] svcrdma: Error -22 posting RDMA_READ
[ 1042.319194] svcrdma: Error -22 posting RDMA_READ

Here is why:

With the base memory management extension enabled, FRMR is used instead
of FMR. The xprtrdma server issues each RDMA read request as the following
bundle:

(1)IB_WR_REG_MR, signaled;
(2)IB_WR_RDMA_READ, signaled;
(3)IB_WR_LOCAL_INV, signaled & fencing.

These requests are signaled. In order to generate completion, the fast
register work request is processed by the hfi1 send engine after being
posted to the work queue, and the corresponding lkey is not valid until
the request is processed. However, the rdmavt driver validates lkey when
the RDMA read request is posted and thus it fails immediately with error
-EINVAL (-22).

This patch changes the work flow of local operations (fast register and
local invalidate) so that fast register work requests are always
processed immediately to ensure that the corresponding lkey is valid
when subsequent work requests are posted. Local invalidate requests are
processed immediately if fencing is not required and no previous local
invalidate request is pending.

To allow completion generation for signaled local operations that have
been processed before posting to the work queue, an internal send flag
RVT_SEND_COMPLETION_ONLY is added. The hfi1 send engine checks this flag
and only generates completion for such requests.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

d9b13c20
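
The policy described in d9b13c20 can be summarized as a small decision function. This is a sketch written from the commit message alone; the enum and function names are hypothetical and the real rdmavt code is organized differently. `RUN_NOW_QUEUE_COMPLETION` stands in for the case the message handles with RVT_SEND_COMPLETION_ONLY.

```c
#include <stdbool.h>

enum local_wr { LOCAL_WR_REG_MR, LOCAL_WR_LOCAL_INV };

enum local_action {
    RUN_NOW,                  /* processed at post time, nothing queued */
    RUN_NOW_QUEUE_COMPLETION, /* processed at post time, queued only to signal */
    QUEUE_FOR_ENGINE          /* left for the send engine to process */
};

/*
 * Fast register is always processed immediately. Local invalidate is
 * processed immediately only when it is not fenced and no earlier
 * local invalidate is still pending.
 */
static enum local_action classify_local_wr(enum local_wr kind, bool signaled,
                                           bool fenced,
                                           unsigned pending_local_inv)
{
    bool now = kind == LOCAL_WR_REG_MR ||
               (!fenced && pending_local_inv == 0);

    if (!now)
        return QUEUE_FOR_ENGINE;
    return signaled ? RUN_NOW_QUEUE_COMPLETION : RUN_NOW;
}
```

Applied to the bundle in the message, the signaled IB_WR_REG_MR is run at post time (so its lkey is valid when the RDMA read is validated), while the signaled, fenced IB_WR_LOCAL_INV is left for the send engine.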

IB/hfi1: Add the capability for reserved operations · 856cc4c2

Committed by Mike Marciniszyn on Jul 25, 2016

This fix allows for support of in-kernel reserved operations
without impacting the ULP user.

The low level driver can register a non-zero value which
will be transparently added to the send queue size and hidden
from the ULP in every respect.

ULP post sends will never see a full queue due to a reserved
post send and reserved operations will never exceed that
registered value.

The s_avail will continue to track the ULP swqe availability
and the difference between the reserved value and the reserved
in use will track reserved availability.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

856cc4c2
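
The accounting described in 856cc4c2 can be sketched as two independent budgets that share one queue allocation. Apart from `s_avail`, which the message names, every identifier below is hypothetical; this is an illustration of the idea, not the rdmavt implementation.

```c
#include <stdbool.h>

struct sq_slots {
    unsigned s_avail;       /* slots still free for the ULP */
    unsigned reserved_max;  /* value registered by the low level driver */
    unsigned reserved_used; /* reserved slots currently in use */
};

/* The reserved value is added to the queue size, hidden from the ULP. */
static unsigned sq_alloc_size(unsigned ulp_size, unsigned reserved_max)
{
    return ulp_size + reserved_max;
}

static void sq_init(struct sq_slots *sq, unsigned ulp_size,
                    unsigned reserved_max)
{
    sq->s_avail = ulp_size;
    sq->reserved_max = reserved_max;
    sq->reserved_used = 0;
}

/* Take one slot; ULP and reserved posts never draw on each other's budget. */
static bool sq_take(struct sq_slots *sq, bool reserved_op)
{
    if (reserved_op) {
        if (sq->reserved_used >= sq->reserved_max)
            return false;
        sq->reserved_used++;
        return true;
    }
    if (sq->s_avail == 0)
        return false;
    sq->s_avail--;
    return true;
}
```

Because the budgets are separate, a reserved post cannot make the queue look full to the ULP, and reserved posts cannot exceed the registered value.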

IB/hfi1: Fix trace message units · 23002d5b

Committed by Grzegorz Heldt on Jul 25, 2016

The trace shows an incorrect amount of allocated memory.
Fix the trace to display memory in KB.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Grzegorz Heldt <grzegorz.heldt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

23002d5b

IB/hfi1: Add sysfs entry to override SDMA interrupt affinity · b14db1f0

Committed by Tadeusz Struk on Jul 25, 2016

Add a sysfs entry to allow the user to override the affinity for SDMA
engine interrupts.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

b14db1f0

IB/hfi1: Add static PCIe Gen3 CTLE tuning · c3f8de0b

Committed by Dean Luick on Jul 25, 2016

Enhance the PCIe Gen3 recipe to support static CTLE tuning,
and add a switch to choose between static and dynamic
approaches.  Make discrete chips default to static CTLE
tuning.
Reviewed-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

c3f8de0b

IB/hfi1: Fix "suspicious rcu_dereference_check() usage" warnings · 8adf71fa

Committed by Jianxin Xiong on Jul 25, 2016

This fixes the following warnings with PROVE_LOCKING and PROVE_RCU
enabled in the kernel:

case (1):
[ INFO: suspicious RCU usage. ]
drivers/infiniband/hw/hfi1/init.c:532
suspicious rcu_dereference_check() usage!

case (2):
[ INFO: suspicious RCU usage. ]
drivers/infiniband/hw/hfi1/hfi.h:1624
suspicious rcu_dereference_check() usage!
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

8adf71fa

IB/rdmavt: Add missing spin_lock_init call for rdi->n_cqs_lock · a6580f43

Committed by Jianxin Xiong on Jul 25, 2016

This fixes the following warning on a kernel with PROVE_LOCKING enabled:

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 15 PID: 12286 Comm: modprobe Not tainted 4.7.0-rc5.prove_rcu+ #1
Hardware name: Intel Corporation S2600WT2R/S2600WT2R,
......
Call Trace:
[<ffffffff8139ec0d>] dump_stack+0x85/0xc8
[<ffffffff810eb765>] register_lock_class+0x415/0x4b0
[<ffffffff810ede1c>] ? __lock_acquire+0x40c/0x1960
[<ffffffff810edaa9>] __lock_acquire+0x99/0x1960
[<ffffffff8120ab62>] ? find_vmap_area+0x42/0x60
[<ffffffff8120ab39>] ? find_vmap_area+0x19/0x60
[<ffffffff810ef9d3>] lock_acquire+0xd3/0x200
[<ffffffffa049d598>] ? rvt_create_cq+0xc8/0x250 [rdmavt]
[<ffffffff81763391>] _raw_spin_lock+0x31/0x40
[<ffffffffa049d598>] ? rvt_create_cq+0xc8/0x250 [rdmavt]
[<ffffffffa049d598>] rvt_create_cq+0xc8/0x250 [rdmavt]
[<ffffffff810ead46>] ? static_obj+0x36/0x50
[<ffffffffa0469e39>] ib_alloc_cq+0x49/0x180 [ib_core]
[<ffffffffa047bed4>] ib_mad_init_device+0x204/0x6d0 [ib_core]
[<ffffffff810e968f>] ? up_write+0x1f/0x40
[<ffffffffa046e2c0>] ib_register_device+0x3d0/0x510 [ib_core]
[<ffffffffa0752410>] ? read_cc_setting_bin+0x200/0x200 [hfi1]
[<ffffffff810ead46>] ? static_obj+0x36/0x50
[<ffffffff810eb888>] ? lockdep_init_map+0x88/0x200
[<ffffffffa049cbff>] rvt_register_device+0x17f/0x320 [rdmavt]
[<ffffffffa0766caa>] hfi1_register_ib_device+0x6ca/0x7c0 [hfi1]
[<ffffffffa0733de4>] init_one+0x2b4/0x430 [hfi1]
[<ffffffff813e40a5>] local_pci_probe+0x45/0xa0
[<ffffffff813e5110>] ? pci_match_device+0xe0/0x110
[<ffffffff813e550c>] pci_device_probe+0xfc/0x140
[<ffffffff814daee9>] driver_probe_device+0x239/0x460
[<ffffffff814db1dd>] __driver_attach+0xcd/0xf0
[<ffffffff814db110>] ? driver_probe_device+0x460/0x460
[<ffffffff814d89b3>] bus_for_each_dev+0x73/0xc0
[<ffffffff814da74e>] driver_attach+0x1e/0x20
[<ffffffff814da1b3>] bus_add_driver+0x1d3/0x290
[<ffffffffa04cc114>] ? dev_init+0x114/0x114 [hfi1]
[<ffffffff814dbf60>] driver_register+0x60/0xe0
[<ffffffffa04cc114>] ? dev_init+0x114/0x114 [hfi1]
[<ffffffff813e39d0>] __pci_register_driver+0x60/0x70
[<ffffffffa04cc2aa>] hfi1_mod_init+0x196/0x1fe [hfi1]
[<ffffffff81002190>] do_one_initcall+0x50/0x190
[<ffffffff8110be72>] ? rcu_read_lock_sched_held+0x62/0x70
[<ffffffff8122d4aa>] ? kmem_cache_alloc_trace+0x23a/0x2a0
[<ffffffff811c1881>] ? do_init_module+0x27/0x1dc
[<ffffffff811c18ba>] do_init_module+0x60/0x1dc
[<ffffffff811360cc>] load_module+0x132c/0x1ac0
[<ffffffff81132c40>] ? __symbol_put+0x60/0x60
[<ffffffff8133e50d>] ? ima_post_read_file+0x3d/0x80

Cc: Stable <stable@vger.kernel.org> # 4.6+
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

a6580f43

IB/hfi1: Read all firmware versions · b3bf270b

Committed by Dean Luick on Jul 25, 2016

Read the version of the SBus, PCIe SerDes, and Fabric SerDes
firmwares at driver load time.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

b3bf270b

IB/hfi1: Explain state complete frame details · 6854c692

Committed by Dean Luick on Jul 25, 2016

When link up fails in LNI, the local and peer state complete
frames are reported as numbers.  Explain what the values mean
so the operator can better diagnose the problem.
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

6854c692

IB/hfi1: Modify the default number of kernel receive conexts · 8784ac02

Committed by Harish Chegondi on Jul 25, 2016

Currently, the default number of kernel receive contexts is set to the
number of NUMA nodes on the system plus one for control context. However,
the systems that have a single socket and/or have NUMA disabled in the BIOS
will have only one receive context by default. This patch would ensure that
by default there will be at least two kernel receive contexts plus one for
control context regardless of the number of NUMA nodes on the system. The
user can override the default number of kernel receive contexts with the
krcvqs module parameter.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

8784ac02
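
One way to read the default described in 8784ac02 is sketched below. It interprets "at least two" as the larger of the NUMA node count and two; that interpretation, the function name, and the omission of the krcvqs override are all assumptions, and the driver itself may simply use a fixed default.

```c
/*
 * Default number of kernel receive contexts when the user has not
 * set krcvqs: at least two data contexts, plus one control context.
 */
static unsigned default_kernel_rcv_contexts(unsigned numa_nodes)
{
    unsigned data_ctxts = numa_nodes < 2 ? 2 : numa_nodes;

    return data_ctxts + 1; /* plus the control context */
}
```

Under this reading a single-socket system gets three contexts instead of two, which is the case the commit message sets out to fix.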

IB/hfi1: Add support for extended memory management · c72cfe3e

Committed by Jianxin Xiong on Jul 25, 2016

Advertise and add the capability of handling all aspects of IBTA extended
memory management support in post send.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

c72cfe3e

IB/hfi1: Work request processing for fast register mr and invalidate · 0db3dfa0

Committed by Jianxin Xiong on Jul 25, 2016

To support extended memory management, add send side
processing of work requests of type IB_WR_REG_MR, IB_WR_LOCAL_INV, and
IB_WR_SEND_WITH_INV. The first two are local operations and are supported
for both RC and UC. Send with invalidate is only supported for RC because
the corresponding IB opcodes are not defined for UC.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

0db3dfa0

IB/hfi1: Handle send with invalidate opcode in the RC recv path · a2df0c83

Committed by Jianxin Xiong on Jul 25, 2016

As part of enabling extended memory management support, add the processing
of the RC send with invalidate.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

a2df0c83

IB/rdmavt: Handle local operations in post send · d9f87239

Committed by Jianxin Xiong on Jul 25, 2016

Some work requests are local operations, such as IB_WR_REG_MR and
IB_WR_LOCAL_INV. They differ from non-local operations in that:

(1) Local operations can be processed immediately without being posted
to the send queue if neither fencing nor completion generation is needed.
However, to ensure correct ordering, once a local operation is posted to
the work queue due to fencing or completion requirement, all subsequent
local operations must also be posted to the work queue until all the
local operations on the work queue have completed.

(2) Local operations don't send packets over the wire and thus don't
need (and shouldn't update) the packet sequence numbers.

Define a new flag bit for the post send table to identify local
operations.

Add a new field to the QP structure to track the number of local
operations on the send queue to determine if direct processing of new
local operations should be enabled/disabled.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

d9f87239
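
The ordering rule in point (1) of d9f87239 can be illustrated with a per-QP counter of local operations sitting on the send queue. The struct and function names are hypothetical; the sketch only mirrors the rule as worded in the commit message.

```c
#include <stdbool.h>

/* Hypothetical slice of QP state: local operations on the send queue. */
struct qp_sketch {
    unsigned local_ops_pending;
};

/*
 * Returns true if the local operation may be processed directly,
 * false if it had to be posted to the send queue instead.
 */
static bool post_local_op(struct qp_sketch *qp, bool fenced, bool signaled)
{
    if (!fenced && !signaled && qp->local_ops_pending == 0)
        return true;

    /* Queued: later local operations must queue behind it. */
    qp->local_ops_pending++;
    return false;
}

/* Called when a queued local operation completes. */
static void complete_local_op(struct qp_sketch *qp)
{
    if (qp->local_ops_pending)
        qp->local_ops_pending--;
}
```

Direct processing resumes only after every queued local operation has completed, which is what preserves ordering.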

IB/rdmavt: Add mechanism to invalidate MR keys · e8f8b098

Committed by Jianxin Xiong on Jul 25, 2016

In order to support extended memory management, add the mechanism to
invalidate MR keys. This includes a flag "lkey_invalid" in the MR data
structure that is to be checked when validating access to the MR via
the associated key, and two utility functions to perform fast memory
registration and memory key invalidate operations.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

e8f8b098
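
A minimal sketch of the mechanism in e8f8b098 follows. Only the `lkey_invalid` flag is named in the commit message; the struct layout and the three helper names are hypothetical stand-ins for the real rdmavt code.

```c
#include <stdbool.h>
#include <stdint.h>

struct mr_sketch {
    uint32_t lkey;
    bool lkey_invalid; /* checked on every access through the key */
};

/* Access check: the key must match and must not have been invalidated. */
static bool mr_key_usable(const struct mr_sketch *mr, uint32_t key)
{
    return !mr->lkey_invalid && mr->lkey == key;
}

/* Fast registration: install the key and make it valid. */
static void mr_fast_register(struct mr_sketch *mr, uint32_t key)
{
    mr->lkey = key;
    mr->lkey_invalid = false;
}

/* Invalidate: only the matching key may be invalidated. */
static bool mr_invalidate(struct mr_sketch *mr, uint32_t key)
{
    if (mr->lkey != key)
        return false;
    mr->lkey_invalid = true;
    return true;
}
```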

IB/rdmavt: Add support for ib_map_mr_sg · a41081aa

Committed by Jianxin Xiong on Jul 25, 2016

This implements the device specific function needed by the verbs
API function ib_map_mr_sg().
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

a41081aa

IB/hfi1: Pull FECN/BECN processing to a common place · 5fd2b562

Committed by Mitko Haralanov on Jul 25, 2016

There were multiple places where FECN/BECN processing was
being done for the different types of QPs. All of that code
was very similar, which meant that it could be pulled into
a single function used by the different QP types.

To retain the performance in the fastpath, the common code
starts with an inline function, which only calls the slow
path if the packet has any of the [FB]ECN bits set.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

5fd2b562
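
The fast path / slow path split described in 5fd2b562 follows a common pattern, sketched here. The bit positions (FECN in bit 31, BECN in bit 30 of the second BTH word), the function names, and the call counter are assumptions made for illustration; the slow path body is a placeholder.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions within the second BTH word. */
#define SKETCH_FECN (1u << 31)
#define SKETCH_BECN (1u << 30)

/* Counts slow path entries so the fast path can be observed in a test. */
static unsigned ecn_slowpath_calls;

/* Out-of-line handling shared by all QP types (placeholder body). */
static bool process_ecn_slowpath(uint32_t bth1)
{
    ecn_slowpath_calls++;
    return (bth1 & SKETCH_FECN) != 0; /* report whether FECN was set */
}

/* Inline check: pay for the call only when one of the bits is set. */
static inline bool process_ecn(uint32_t bth1)
{
    if (bth1 & (SKETCH_FECN | SKETCH_BECN))
        return process_ecn_slowpath(bth1);
    return false;
}
```

Packets without congestion bits cost a single mask and branch, which is how the common code avoids slowing the fast path.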

IB/hfi1: Fix to fully initialize send context area · 1b23f02c

Committed by Tymoteusz Kielan on Jul 25, 2016

While handling a buffer control MAD, a partially initialized
dd->kernel_send_context area may lead to dereferencing
uninitialized pointers. Fix by using kzalloc_node()
instead of kmalloc_node().
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

1b23f02c

IB/hfi1: Fix integrity errors counter value calculation · 3210314a

Committed by Jakub Pawlak on Jul 25, 2016

PMA should not sum TX and RX replay counts when reporting
local link integrity errors. Fixed by removing C_DC_TX_REPLAY
counter from calculation of the link integrity errors counter
value.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

3210314a

IB/rdmavt: Use new driver specific post send table · 2821c509

Committed by Mike Marciniszyn on Jul 01, 2016

Change rvt_post_one_wr to use the new table mechanism for
post send.

Validate that each low level driver specifies the table.
Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

2821c509

IB/qib: Add qib post send table · 9ec4faa3

Committed by Mike Marciniszyn on Jul 01, 2016

Add initial table for table driven post_send support.
Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

9ec4faa3

IB/hfi1: Add hfi1 post send tables · 1ac57c50

Committed by Mike Marciniszyn on Jul 01, 2016

Add initial table for table driven post_send support.
Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

1ac57c50

IB/rdmavt: Add data structures and routines for table driven post send · afcf8f76

Committed by Mike Marciniszyn on Jul 01, 2016

Add flexibility for driver dependent operations in post send
because different drivers will have differing post send
operation support.

This includes data structure definitions to support a table
driven scheme along with the necessary validation routine
using the new table.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

afcf8f76
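
A table driven validation scheme of the kind described in afcf8f76 might look like the sketch below. The opcode and QP type enums, the table contents, and the function name are all hypothetical; the real rdmavt structures carry more fields than shown here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical opcode and QP type numbering for the sketch. */
enum sketch_op { OP_SEND, OP_RDMA_READ, OP_REG_MR, OP_UNUSED, OP_COUNT };
enum sketch_qpt { QPT_RC, QPT_UC, QPT_UD };

#define QPT_BIT(t) (1u << (t))

struct op_params {
    unsigned qpt_support; /* bitmask of QP types that accept the opcode */
};

/* Each driver would supply its own table; this one is made up. */
static const struct op_params hypothetical_table[OP_COUNT] = {
    [OP_SEND]      = { QPT_BIT(QPT_RC) | QPT_BIT(QPT_UC) | QPT_BIT(QPT_UD) },
    [OP_RDMA_READ] = { QPT_BIT(QPT_RC) },
    [OP_REG_MR]    = { QPT_BIT(QPT_RC) | QPT_BIT(QPT_UC) },
    /* OP_UNUSED stays zeroed: not supported by this driver */
};

/* Reject opcodes that are out of range or unsupported on this QP type. */
static int validate_wr(const struct op_params *table, size_t entries,
                       unsigned opcode, enum sketch_qpt qpt)
{
    if (!table || opcode >= entries)
        return -EINVAL;
    if (!(table[opcode].qpt_support & QPT_BIT(qpt)))
        return -EINVAL;
    return 0;
}
```

Keeping the per-opcode rules in data lets each low level driver describe its own supported operations without changing the shared validation routine.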

IB/hfi1: Correct receive packet handler assignment · 71e68e3d

Committed by Jakub Pawlak on Jul 01, 2016

Prevent processing of a received packet when its opcode is
accepted by the QP but no handler is defined for that type
of packet.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

71e68e3d

IB/hfi1: Improve SDMA engine assignment for user SDMA · 14833b8c

Committed by Jianxin Xiong on Jul 01, 2016

Currently each user context is assigned a single SDMA engine
based on the VL, context id, and subcontext id. That means for
MPI applications, each rank can only use one SDMA engine for
all messages. This may create unwanted backup for independent
messages going to different destinations upon congestion at one
destination.

This patch adds the packet "dlid" to the formula of SDMA engine
selection for user SDMA requests. A simple hash table is used
to maintain even distribution among the available SDMA engines
regardless of how the "dlid" values are distributed.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

14833b8c
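
To illustrate the idea in 14833b8c, the sketch below mixes a hashed "dlid" into the engine selector. The commit message does not give the formula or the hash table layout, so the multiplicative hash, the way the inputs are combined, and all names here are assumptions chosen only to show the effect.

```c
#include <stdint.h>

/* Multiplicative hash to spread clustered dlid values (assumed choice). */
static uint32_t dlid_hash(uint32_t dlid)
{
    return (dlid * 2654435761u) >> 16;
}

/*
 * Pick an SDMA engine from the VL, context id, subcontext id, and the
 * destination LID. num_engines must be non-zero.
 */
static unsigned pick_sdma_engine(unsigned num_engines, unsigned vl,
                                 unsigned ctxt, unsigned subctxt,
                                 uint32_t dlid)
{
    uint32_t selector = vl + ctxt + subctxt + dlid_hash(dlid);

    return selector % num_engines;
}
```

With the dlid in the selector, one rank sending to two destinations can land on two different engines, so congestion at one destination need not stall traffic to the other.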

IB/hfi1: Remove TWSI references · e014991d

Committed by Dean Luick on Jul 01, 2016

Remove the TWSI code.  The driver now uses the kernel's built-in
i2c bit bus module.

Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

e014991d

IB/hfi1: Use built-in i2c bit-shift bus adapter · dba715f0

Committed by Dean Luick on Jul 06, 2016

Use built-in i2c bit-shift bus adapter to control the
i2c buses on the chip.

Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

dba715f0

IB/hfi1: Refine user process affinity algorithm · b094a36f

Committed by Sebastian Sanchez on Jul 25, 2016

When performing process affinity recommendations for MPI ranks, the current
algorithm doesn't take into account multiple HFI units. Also, real
cores and HT cores are not distinguished from one another. Therefore,
all HT cores are recommended to be assigned first within the local NUMA
node before recommending the assignments of cores in other NUMA nodes.
It's ideal to assign all real cores across all NUMA nodes first, then all
HT 1 cores, then all HT 2 cores, and so on to balance CPU workload. CPU
cores in other NUMA nodes could be running interrupt handlers, and this is
not taken into account.

To balance the CPU workload for user processes, the following
recommendation algorithm is used:

 For each user process that is opening a context on HFI Y:
  a) If all cores are assigned to user processes, start assignments all
	 over from the first core
  b) Assign real cores first, then HT cores (First set of HT cores on
	 all physical cores, then second set of HT cores, and, so on) in the
	 following order:

	 1. Same NUMA node as HFI Y and not running an IRQ handler
	 2. Same NUMA node as HFI Y and running an IRQ handler
	 3. Different NUMA node to HFI Y and not running an IRQ handler
	 4. Different NUMA node to HFI Y and running an IRQ handler
  c) Mark core as assigned in the global affinity structure. As user
	 processes are done, remove core assignments from global affinity
	 structure.

This implementation allows an arbitrary number of HT cores and provides
support for multiple HFIs.

This is being included in the kernel rather than user space due to the
fact that user space has no way of knowing the CPU recommendations for
contexts running as part of other jobs.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

b094a36f
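
Steps a) to c) of b094a36f can be sketched as a ranking over candidate cores. The sketch reads the message as ordering by HT level first and by the four NUMA/IRQ classes second; that reading, the struct, and the function names are assumptions, and the real driver works on CPU masks rather than arrays.

```c
#include <stdbool.h>
#include <stddef.h>

struct core_info {
    unsigned ht_level;     /* 0 = real core, 1 = first HT set, ... */
    bool same_numa_node;   /* same NUMA node as the HFI being opened */
    bool runs_irq_handler;
    bool assigned;         /* marked in the (global) affinity structure */
};

/* Lower rank is preferred: HT level first, then the four classes. */
static unsigned core_rank(const struct core_info *c)
{
    unsigned cls = (c->same_numa_node ? 0u : 2u) +
                   (c->runs_irq_handler ? 1u : 0u);

    return c->ht_level * 4u + cls;
}

/* Returns the index of the recommended core, or -1 if n == 0. */
static int pick_core(struct core_info *cores, size_t n)
{
    size_t i;
    int best = -1;
    bool all_assigned = true;

    for (i = 0; i < n; i++)
        if (!cores[i].assigned)
            all_assigned = false;

    /* Step a): everything is taken, so start over from the beginning. */
    if (all_assigned)
        for (i = 0; i < n; i++)
            cores[i].assigned = false;

    /* Step b): choose the best ranked core that is still free. */
    for (i = 0; i < n; i++) {
        if (cores[i].assigned)
            continue;
        if (best < 0 || core_rank(&cores[i]) < core_rank(&cores[best]))
            best = (int)i;
    }

    /* Step c): mark the core as assigned. */
    if (best >= 0)
        cores[best].assigned = true;
    return best;
}
```

Releasing a core when a user process exits would simply clear its `assigned` flag.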

IB/hfi1: Reserve and collapse CPU cores for contexts · d6373019

Committed by Sebastian Sanchez on Jul 25, 2016

Kernel receive queues oversubscribe CPU cores on multi-HFI systems.
To prevent this, the kernel receive queues are separated onto
different cores, and the SDMA engine interrupts are constrained to
a lesser number of cores.

hfi1s_on_numa_node*krcvqs is the number of CPU cores that are
reserved for kernel receive queues for all HFIs. Each HFI initializes
its kernel receive queues to one of the reserved CPU cores. If there
ends up being 0 CPU cores leftover for SDMA engines, use the same
CPU cores as receive contexts.

In addition, general and control contexts are assigned to their own
CPU core, however, both types of contexts tend to have low traffic.
To save CPU cores, collapse general and control contexts to one CPU
core for all HFI units. This change prevents SDMA engine interrupts
from wrapping around general contexts.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

d6373019
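
The core budget in d6373019 comes down to simple arithmetic, sketched below for one NUMA node. The struct and function names are hypothetical, and the sketch leaves out the single core shared by general and control contexts.

```c
#include <stdbool.h>

struct core_split {
    unsigned rcv_cores;    /* cores reserved for kernel receive queues */
    unsigned sdma_cores;   /* cores available to SDMA engine interrupts */
    bool sdma_shares_rcv;  /* true when SDMA must reuse the receive cores */
};

/* Reserve hfis_on_node * krcvqs cores; SDMA gets whatever is left. */
static struct core_split split_node_cores(unsigned node_cores,
                                          unsigned hfis_on_node,
                                          unsigned krcvqs)
{
    struct core_split s;
    unsigned wanted = hfis_on_node * krcvqs;

    s.rcv_cores = wanted < node_cores ? wanted : node_cores;
    s.sdma_cores = node_cores - s.rcv_cores;
    s.sdma_shares_rcv = s.sdma_cores == 0;

    /* Nothing left over: fall back to the receive context cores. */
    if (s.sdma_shares_rcv)
        s.sdma_cores = s.rcv_cores;
    return s;
}
```

For example, a 14-core node with two HFIs and krcvqs=2 reserves 4 cores for receive queues and leaves 10 for SDMA interrupts.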

IB/hfi1: Add global structure for affinity assignments · 4197344b

Committed by Dennis Dalessandro on Jul 25, 2016

When HFI units get initialized, they each use their own mask copy for
affinity assignments. On a multi-HFI system, affinity assignments
overbook CPU cores as each HFI doesn't have knowledge of affinity
assignments for other HFI units. Therefore, some CPU cores are never
used for interrupt handlers in systems with high number of CPU cores
per NUMA node.

For multi-HFI systems, SDMA engine interrupt assignments start all over
from the first CPU in the local NUMA node after the first HFI
initialization. This change allows assignments to continue where the
last HFI unit left off.

Add global structure for affinity assignments for multiple HFIs to share
affinity mask.
Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reviewed-by: Jubin John <jubin.john@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

4197344b

IB/hfi1: Add counter to track unsupported packets drop · 2b719046

Committed by Jakub Pawlak on Jul 01, 2016

Add a software counter to track dropped unsupported packets.
Report unsupported packet drops as RcvError.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

2b719046

IB/hfi1: Add VL XmitDiscards counters to the opapmaquery · 583eb8b8

Committed by Jakub Pawlak on Jul 01, 2016

Add per-VL XmitDiscards counters to the opapmaquery
status and error response.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

583eb8b8

IB/hfi1: Fix trace sparse errors · ad421082

Committed by Mike Marciniszyn on Jul 01, 2016

Fix sparse errors by making sure the fast assign destinations
are host cpu typed.

For the void __iomem *, just make the field match source
data.

Fix a bug where the hw_free trace printed the pointer vs.
the dereferenced value.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

ad421082

IB/hfi1: Separate tracepoints into specific headers · 462b6b21

Committed by Sebastian Sanchez on Jul 01, 2016

The ftrace infrastructure used to evaluate the TRACE_SYSTEM
macro on every DEFINE_EVENT() macro. Now the TRACE_SYSTEM
macro only gets evaluated when trace/define_trace.h is
included, so the group event information is lost. This was
introduced in
commit acd388fd ("tracing: Give system name a pointer")
Therefore, each system tracepoint must be in its own file.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

462b6b21

IB/hfi1: Fix typo · 21a4c95d

Committed by Tadeusz Struk on Jul 01, 2016

Fix a copy and paste typo in comment.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

21a4c95d

IB/hfi1: Remove unnecessary done label in hfi1_write_iter · 0904f327

Committed by Ira Weiny on Jul 01, 2016

Simple code clean up of hfi1_write_iter.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

0904f327

IB/hfi1: Clean up port state structure definition · f8181697

Committed by Ira Weiny on Jul 01, 2016

The definition of port state changed mid-development and the
old structure was kept accidentally.  Remove this dead code.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

f8181697

25 Jul, 2016 1 commit

Linux 4.7 · 523d939e

Committed by Linus Torvalds on Jul 24, 2016

523d939e

openanolis / cloud-kernel synced successfully about 1 year ago