Commits · 04e10e2164fcfa05e14eff3c2757a5097f11d258 · openanolis / cloud-kernel

16 Jul, 2014 2 commits

iw_cxgb4: Detect Ing. Padding Boundary at run-time · 04e10e21

Authored by Hariprasad Shenai on Jul 14, 2014

Updates iw_cxgb4 to determine the Ingress Padding Boundary from
cxgb4_lld_info, and take subsequent actions.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04e10e21

net: set name_assign_type in alloc_netdev() · c835a677

Authored by Tom Gundersen on Jul 14, 2014

Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
all users to pass NET_NAME_UNKNOWN.

Coccinelle patch:

@@
expression sizeof_priv, name, setup, txqs, rxqs, count;
@@

(
-alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
+alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
|
-alloc_netdev_mq(sizeof_priv, name, setup, count)
+alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
|
-alloc_netdev(sizeof_priv, name, setup)
+alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
)

v9: move comments here from the wrong commit
Signed-off-by: Tom Gundersen <teg@jklm.no>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c835a677

02 Jul, 2014 1 commit

rdma/cxgb4: Fixes cxgb4 probe failure in VM when PF is exposed through PCI Passthrough · 35b1de55

Authored by Hariprasad Shenai on Jun 27, 2014

Change logic which determines our Physical Function at PCI Probe time.
Now we read the PL_WHOAMI register and get the Physical Function.

Pass Physical Function to Upper Layer Drivers in lld_info structure in the
new field "pf" added to lld_info. This is useful for the cases where the
PF, say PF4, is attached to a Virtual Machine via some form of "PCI
Pass Through" technology and the PCI Function shows up as PF0 in the VM.

Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

35b1de55

12 Jun, 2014 3 commits

libiscsi, iser: Adjust data_length to include protection information · d77e6535

Authored by Sagi Grimberg on Jun 11, 2014

In case protection information exists over the wire
iscsi header data length is required to include it.
Use protection information aware scsi helpers to set
the correct transfer length.

In order to avoid breakage, remove iser transfer length
checks for each task as they are not always true and
somewhat redundant anyway.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: stable@vger.kernel.org # 3.15+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

d77e6535

Target/iscsi: Fix sendtargets response pdu for iser transport · 22c7aaa5

Authored by Sagi Grimberg on Jun 10, 2014

In case the transport is iser we should not include the
iscsi target info in the sendtargets text response pdu.
This causes sendtargets response to include the target
info twice.

Modify iscsit_build_sendtargets_response to filter
transport types that don't match.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reported-by: Slava Shwartsman <valyushash@gmail.com>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

22c7aaa5

Target/iser: Fix a wrong dereference in case discovery session is over iser · e0546fc1

Authored by Sagi Grimberg on Jun 10, 2014

In case the discovery session is carried over iser, we can't
access the assumed network portal since the default portal is
used. In this case we don't really need to allocate the fastreg
pool; just prepare for the text pdu that will follow.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reported-by: Alex Tabachnik <alext@mellanox.com>
Cc: stable@vger.kernel.org # 3.15+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

e0546fc1

11 Jun, 2014 6 commits

iw_cxgb4: don't truncate the recv window size · b408ff28

Authored by Hariprasad Shenai on Jun 06, 2014

Fixed a bug that shows up with recv window sizes that exceed the size of
the RCV_BUFSIZ field in opt0 (>= 1024K).  If the recv window exceeds
this, then we specify the max possible in opt0, and add the rest in via
RX_DATA_ACK credits.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b408ff28
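
The window split that b408ff28 describes can be sketched in plain user-space C. This is an illustration under stated assumptions, not the driver's code: the 10-bit field width and the 1 KiB unit are inferred from the ">= 1024K" limit in the message, and `demo_split_rcv_win()`, `struct demo_rcv_win` and `DEMO_RCV_BUFSIZ_MAX` are hypothetical names.

```c
#include <stdint.h>

/* Assumption: RCV_BUFSIZ is a 10-bit field counted in 1 KiB units. */
#define DEMO_RCV_BUFSIZ_MAX 0x3FFu

struct demo_rcv_win {
    uint32_t opt0_kib;      /* value that fits in the opt0 field */
    uint32_t extra_credits; /* bytes left over, to be granted later as credits */
};

/* Clamp the window to what the field can hold; report the remainder. */
static struct demo_rcv_win demo_split_rcv_win(uint32_t rcv_win_bytes)
{
    struct demo_rcv_win out;
    uint32_t kib = rcv_win_bytes >> 10;

    if (kib > DEMO_RCV_BUFSIZ_MAX)
        kib = DEMO_RCV_BUFSIZ_MAX;

    out.opt0_kib = kib;
    out.extra_credits = rcv_win_bytes - (kib << 10);
    return out;
}
```

With these assumptions, a window that fits in the field produces no extra credits, while a larger window is advertised as the field maximum and the difference is carried separately instead of being truncated.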

iw_cxgb4: Choose appropriate hw mtu index and ISS for iWARP connections · 92e7ae71

Authored by Hariprasad Shenai on Jun 06, 2014

Select the appropriate hw mtu index and initial sequence number to optimize
hw memory performance.

Add new cxgb4_best_aligned_mtu() which allows callers to provide enough
information to be used to [possibly] select an MTU which will result in the
TCP Data Segment Size (AKA Maximum Segment Size) to be an aligned value.

If an RTR message exchange is required, then align the ISS to 8B - 1 + 4, so
that after the SYN the send seqno will align on a 4B boundary. The RTR
message exchange will leave the send seqno aligned on an 8B boundary.
If an RTR is not required, then align the ISS to 8B - 1. The goal is
to have the send seqno be 8B aligned when we send the first FPDU.

Based on original work by Casey Leedom <leedom@chelsio.com> and
Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

92e7ae71
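
The ISS alignment rule in 92e7ae71 is easy to check with a few lines of arithmetic. The sketch below is a user-space illustration only; `demo_align_iss()` and its `seed` parameter are hypothetical names, and the sketch relies on the standard TCP fact that the SYN consumes one sequence number.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Round an arbitrary starting value down to an 8-byte boundary and step
 * back by one, so that (iss + 1), the first sequence number after the
 * SYN, is 8-byte aligned. When an RTR exchange is required, add 4 so
 * that (iss + 1) lands on a 4-byte boundary instead; per the commit
 * message, the RTR exchange then restores 8-byte alignment.
 * Unsigned arithmetic wraps modulo 2^32, which preserves the residues.
 */
static uint32_t demo_align_iss(uint32_t seed, bool rtr_required)
{
    uint32_t iss = (seed & ~7u) - 1u; /* iss % 8 == 7 */

    if (rtr_required)
        iss += 4u;                    /* iss % 8 == 3 */
    return iss;
}
```

In both cases the intent stated in the message holds: the send sequence number is meant to be 8-byte aligned by the time the first FPDU goes out.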

iw_cxgb4: Allocate and use IQs specifically for indirect interrupts · cf38be6d

Authored by Hariprasad Shenai on Jun 06, 2014

Currently indirect interrupts for RDMA CQs funnel through the LLD's RDMA
RXQs, which also handle direct interrupts for offload CPLs during RDMA
connection setup/teardown.  The intended T4 usage model, however, is to
have indirect interrupts flow through dedicated IQs, i.e. not to mix
indirect interrupts with CPL messages in an IQ.  This patch adds the
concept of RDMA concentrator IQs, or CIQs, set up and maintained by the
LLD and exported to iw_cxgb4 for use when creating CQs. RDMA CPLs will
flow through the LLD's RDMA RXQs, and CQ interrupts flow through the
CIQs.

Design:

cxgb4 creates and exports an array of CIQs for the RDMA ULD.  These IQs
are sized according to the max number of CQs available at adapter init.
In addition, these IQs don't need FL buffers since they only service
indirect interrupts.  One CIQ is set up per RX channel, similar to the
RDMA RXQs.

iw_cxgb4 will utilize these CIQs based on the vector value passed into
create_cq().  The num_comp_vectors advertised by iw_cxgb4 will be the
number of CIQs configured, and thus the vector value will be the index
into the array of CIQs.

Based on original work by Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cf38be6d

RDMA/cxgb4: Add support for iWARP Port Mapper user space service · 9eccfe10

Authored by Steve Wise on Mar 26, 2014

Based on original work by Vipul Pandya.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>

[ Fix htons -> ntohs to make sparse happy.  - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>

9eccfe10

RDMA/nes: Add support for iWARP Port Mapper user space service · 5647263c

Authored by Tatyana Nikolova on Mar 26, 2014

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

5647263c

RDMA/core: Add support for iWARP Port Mapper user space service · 30dc5e63

Authored by Tatyana Nikolova on Mar 26, 2014

This patch adds iWARP Port Mapper (IWPM) Version 2 support.  The iWARP
Port Mapper implementation is based on the port mapper specification
section in the Sockets Direct Protocol paper -
http://www.rdmaconsortium.org/home/draft-pinkerton-iwarp-sdp-v1.0.pdf

Existing iWARP RDMA providers use the same IP address as the native
TCP/IP stack when creating RDMA connections.  They need a mechanism to
claim the TCP ports used for RDMA connections to prevent TCP port
collisions when other host applications use TCP ports.  The iWARP Port
Mapper provides a standard mechanism to accomplish this.  Without this
service it is possible for an RDMA application to bind/listen on the same
port which is already being used by a native TCP host application.  If
that happens, the incoming TCP connection data can be passed to the
RDMA stack with error.

The iWARP Port Mapper solution doesn't contain any changes to the
existing network stack in the kernel space.  All the changes are
contained within the infiniband tree and also in user space.

The iWARP Port Mapper service is implemented as a user space daemon
process.  Source for the IWPM service is located at
http://git.openfabrics.org/git?p=~tnikolova/libiwpm-1.0.0/.git;a=summary

The iWARP driver (port mapper client) sends to the IWPM service the
local IP address and TCP port it has received from the RDMA
application, when starting a connection.  The IWPM service performs a
socket bind from user space to get an available TCP port, called a
mapped port, and communicates it back to the client.  In that sense,
the IWPM service is used to map the TCP port which the RDMA
application uses to any port available from the host TCP port
space. The mapped ports are used in iWARP RDMA connections to avoid
collisions with the native TCP stack, which is aware that these ports are
taken. When an RDMA connection using a mapped port is terminated, the
client notifies the IWPM service, which then releases the TCP port.

The message exchange between the IWPM service and the iWARP drivers
(between user space and kernel space) is implemented using netlink
sockets.

1) Netlink interface functions are added: ibnl_unicast() and
   ibnl_multicast() for sending netlink messages to user space

2) The signature of the existing ibnl_put_msg() is changed to be more
   generic

3) Two netlink clients are added: RDMA_NL_NES, RDMA_NL_C4IW
   corresponding to the two iWARP drivers - nes and cxgb4 which use
   the IWPM service

4) Enums are added to enumerate the attributes in the netlink
   messages, which are exchanged between the user space IWPM service
   and the iWARP drivers
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: PJ Waskiewicz <pj.waskiewicz@solidfire.com>

[ Fold in range checking fixes and nlh_next removal as suggested by Dan
  Carpenter and Steve Wise.  Fix sparse endianness in hash.  - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>

30dc5e63

10 Jun, 2014 1 commit

IB/mlx4: Fix gfp passing in create_qp_common() · 6fcd8d0d

Authored by Jiri Kosina on Jun 09, 2014

There are two kzalloc() calls which were not converted to use value of
gfp passed to create_qp_common() instead of using hardcoded GFP_KERNEL
in 40f2287b ("IB/mlx4: Implement IB_QP_CREATE_USE_GFP_NOIO").  Fix
this by passing gfp value down properly.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Roland Dreier <roland@purestorage.com>

6fcd8d0d

07 Jun, 2014 1 commit

IB/umad: Fix use-after-free on close · 60e1751c

Authored by Bart Van Assche on Jun 06, 2014

Avoid that closing /dev/infiniband/umad<n> or /dev/infiniband/issm<n>
triggers a use-after-free.  __fput() invokes f_op->release() before it
invokes cdev_put().  Make sure that the ib_umad_device structure is
freed by the cdev_put() call instead of f_op->release().  This avoids
that changing the port mode from IB into Ethernet and back to IB
followed by restarting opensmd triggers the following kernel oops:

    general protection fault: 0000 [#1] PREEMPT SMP
    RIP: 0010:[<ffffffff810cc65c>]  [<ffffffff810cc65c>] module_put+0x2c/0x170
    Call Trace:
     [<ffffffff81190f20>] cdev_put+0x20/0x30
     [<ffffffff8118e2ce>] __fput+0x1ae/0x1f0
     [<ffffffff8118e35e>] ____fput+0xe/0x10
     [<ffffffff810723bc>] task_work_run+0xac/0xe0
     [<ffffffff81002a9f>] do_notify_resume+0x9f/0xc0
     [<ffffffff814b8398>] int_signal+0x12/0x17

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=75051
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
Cc: <stable@vger.kernel.org> # 3.x: 8ec0a0e6: IB/umad: Fix error handling
Signed-off-by: Roland Dreier <roland@purestorage.com>

60e1751c

06 Jun, 2014 2 commits

IB/core: Fix kobject leak on device register error flow · 584482ac

Authored by Haggai Eran on May 18, 2014

The ports kobject isn't being released during error flow in device
registration.  This patch refactors the ports kobject cleanup into a
single function called from both the error flow in device registration
and from the unregistration function.

A couple of attributes aren't being deleted (iw_stats_group, and
ib_class_attributes).  While this may be handled implicitly by the
destruction of their kobjects, it seems better to handle all the
attributes the same way.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>

[ Make free_port_list_attributes() static.  - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>

584482ac

RDMA/cxgb4: add missing padding at end of struct c4iw_alloc_ucontext_resp · b7dfa889

Authored by Yann Droneaud on May 05, 2014

The i386 ABI disagrees with most other ABIs regarding alignment of
data types larger than 4 bytes: on most ABIs a padding must be added
at end of the structures, while it is not required on i386.

So for most ABIs, struct c4iw_alloc_ucontext_resp gets implicitly padded
to be aligned on an 8-byte multiple, while for i386, such padding is
not added.

The tool pahole can be used to find such implicit padding:

  $ pahole --anon_include \
           --nested_anon_include \
           --recursive \
           --class_name c4iw_alloc_ucontext_resp \
           drivers/infiniband/hw/cxgb4/iw_cxgb4.o

Then, structure layout can be compared between i386 and x86_64:

  +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt   2014-03-28 11:43:05.547432195 +0100
  --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100
  @@ -2,9 +2,8 @@ struct c4iw_alloc_ucontext_resp {
          __u64                      status_page_key;      /*     0     8 */
          __u32                      status_page_size;     /*     8     4 */

  -       /* size: 12, cachelines: 1, members: 2 */
  -       /* last cacheline: 12 bytes */
  +       /* size: 16, cachelines: 1, members: 2 */
  +       /* padding: 4 */
  +       /* last cacheline: 16 bytes */
   };

This ABI disagreement will make an x86_64 kernel try to write past the
buffer provided by an i386 binary.

Once boundary checks are implemented, the x86_64 kernel will refuse
to write past the i386 userspace provided buffer and the uverb will
fail.

If the structure is on a page boundary and the next page is not
mapped, ib_copy_to_udata() will fail and the uverb will fail.

Additionally, as reported by Dan Carpenter, without the implicit
padding being properly cleared, an information leak would take place
in most architectures.

This patch adds explicit padding to struct c4iw_alloc_ucontext_resp,
and, like 92b0ca7c ("IB/mlx5: Fix stack info leak in
mlx5_ib_alloc_ucontext()"), makes function c4iw_alloc_ucontext()
not write this padding field to userspace. This way, an x86_64 kernel
will be able to write struct c4iw_alloc_ucontext_resp as expected by
unpatched and patched i386 libcxgb4.

Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com
Link: http://marc.info/?i=1395848977.3297.15.camel@localhost.localdomain
Link: http://marc.info/?i=20140328082428.GH25192@mwanda
Cc: <stable@vger.kernel.org>
Fixes: 05eb2389 ("cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes")
Reported-by: Yann Droneaud <ydroneaud@opteya.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

b7dfa889
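
The padding fix in b7dfa889 can be illustrated with a small user-space sketch. The struct below mirrors the two members shown in the pahole output and adds an explicit trailing field; the names `demo_ucontext_resp`, `reserved` and `demo_resp_copy_len()` are hypothetical, and computing the copy length from the offset of the padding field is an assumption about how "not writing the padding field" could be done, not a quote of the driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Two 4-byte members after the 8-byte one fill the struct to 16 bytes
 * without any implicit tail padding, so i386 and x86_64 agree on sizeof.
 */
struct demo_ucontext_resp {
    uint64_t status_page_key;
    uint32_t status_page_size;
    uint32_t reserved;          /* explicit padding */
};

/*
 * Number of bytes to copy to user space when the padding field is left
 * out: everything up to, but not including, 'reserved'. This matches the
 * 12-byte layout an unpatched i386 library expects.
 */
static size_t demo_resp_copy_len(void)
{
    return offsetof(struct demo_ucontext_resp, reserved);
}
```

Copying only the leading 12 bytes keeps the kernel inside the buffer supplied by an old 32-bit binary, while a rebuilt library that knows about the explicit field sees the same 16-byte layout on every ABI.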

05 Jun, 2014 3 commits

IB/core: Fix port kobject deletion during error flow · cad6d02a

Authored by Haggai Eran on May 18, 2014

When encountering an error during the add_port function, adding a port
to sysfs, the port kobject is freed without being deleted from sysfs.

Instead of freeing it directly, the patch uses kobject_put to release
the kobject and delete it.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

cad6d02a

IB/core: Remove unneeded kobject_get/put calls · 373c0ea1

Authored by Haggai Eran on May 18, 2014

The ib_core module will call kobject_get on the parent object of each
kobject it creates.  This is redundant since kobject_add does that
anyway.

As a side effect, this patch should fix leaking the ports kobject and
the device kobject during unregister flow, since the previous code
didn't seem to take into account the kobject_get calls on behalf of
the child kobjects.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

373c0ea1

IB/core: Fix sparse warnings about redeclared functions · 8385fd84

Authored by Roland Dreier on Jun 04, 2014

Fix a few functions that are declared with __attribute_const__ in the
ib_verbs.h header file but defined without it in verbs.c. This gets rid
of the following sparse warnings:

drivers/infiniband/core/verbs.c:51:5: error: symbol 'ib_rate_to_mult' redeclared with different type (originally declared at include/rdma/ib_verbs.h:469) - different modifiers
drivers/infiniband/core/verbs.c:68:14: error: symbol 'mult_to_ib_rate' redeclared with different type (originally declared at include/rdma/ib_verbs.h:607) - different modifiers
drivers/infiniband/core/verbs.c:85:5: error: symbol 'ib_rate_to_mbps' redeclared with different type (originally declared at include/rdma/ib_verbs.h:476) - different modifiers
drivers/infiniband/core/verbs.c:111:1: error: symbol 'rdma_node_get_transport' redeclared with different type (originally declared at include/rdma/ib_verbs.h:84) - different modifiers
Signed-off-by: Roland Dreier <roland@purestorage.com>

8385fd84

04 Jun, 2014 2 commits

iser-target: Add missing target_put_sess_cmd for ImmediateData failure · 6cc44a6f

Authored by Nicholas Bellinger on May 23, 2014

This patch addresses a bug where an early exception for SCSI WRITE
with ImmediateData=Yes was missing the target_put_sess_cmd() call
to drop the extra se_cmd->cmd_kref reference obtained during the
normal iscsit_setup_scsi_cmd() codepath execution.

This bug was manifesting itself during session shutdown within
isert_cq_rx_comp_err() where target_wait_for_sess_cmds() would
end up waiting indefinitely for the last se_cmd->cmd_kref put to
occur for the failed SCSI WRITE + ImmediateData descriptors.

This fix follows what traditional iscsi-target code already does
for the same failure case within iscsit_get_immediate_data().
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: Sagi Grimberg <sagig@dev.mellanox.co.il>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

6cc44a6f

IB/mad: Fix sparse warning about gfp_t use · 5343c00d

Authored by Roland Dreier on Jun 03, 2014

Properly convert gfp_t & result to bool to fix:

drivers/infiniband/core/sa_query.c:621:33: warning: incorrect type in initializer (different base types)
drivers/infiniband/core/sa_query.c:621:33: expected bool [unsigned] [usertype] preload
drivers/infiniband/core/sa_query.c:621:33: got restricted gfp_t
Signed-off-by: Roland Dreier <roland@purestorage.com>

5343c00d

03 Jun, 2014 4 commits

IB/mlx4: Implement IB_QP_CREATE_USE_GFP_NOIO · 40f2287b

Authored by Jiri Kosina on May 11, 2014

Modify the various routines used to allocate memory resources which
serve QPs in mlx4 to get an input GFP directive.  Have the Ethernet
driver use GFP_KERNEL in its QP allocations as done prior to this
commit, and the IB driver use GFP_NOIO when the IB verbs
IB_QP_CREATE_USE_GFP_NOIO QP creation flag is provided.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

40f2287b

IB: Add a QP creation flag to use GFP_NOIO allocations · 09b93088

Authored by Or Gerlitz on May 11, 2014

This addresses a problem where NFS client writes over IPoIB connected
mode may deadlock on memory allocation/writeback.

The problem is not directly memory reclamation.  There is an indirect
dependency between network filesystems writing back pages and
ipoib_cm_tx_init() due to how a kworker is used.  Page reclaim cannot
make forward progress until ipoib_cm_tx_init() succeeds and it is
stuck in page reclaim itself waiting for network transmission.
Ordinarily this situation may be avoided by having the caller use
GFP_NOFS but ipoib_cm_tx_init() does not have that information.

To address this, take a general approach and add a new QP creation
flag that tells the low-level hardware driver to use GFP_NOIO for the
memory allocations related to the new QP.

Use the new flag in the ipoib connected mode path, and if the driver
doesn't support it, re-issue the QP creation without the flag.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

09b93088
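
The "try with the flag, then re-issue without it" pattern from 09b93088 can be sketched with stand-in functions. Everything here is hypothetical scaffolding for illustration: the flag value, the `demo_*` names, and the choice of `-EINVAL` as the "unsupported flag" error are assumptions, not taken from the commit message.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_QP_CREATE_USE_GFP_NOIO (1u << 0) /* hypothetical flag value */

typedef int (*demo_create_qp_fn)(uint32_t create_flags);

/* Stand-in for a driver that rejects any creation flag it does not know. */
static int demo_driver_without_noio(uint32_t create_flags)
{
    return create_flags ? -EINVAL : 0;
}

/* Stand-in for a driver that understands the NOIO flag. */
static int demo_driver_with_noio(uint32_t create_flags)
{
    return (create_flags & ~DEMO_QP_CREATE_USE_GFP_NOIO) ? -EINVAL : 0;
}

/*
 * Ask for a QP with the requested flags; if the driver refuses and the
 * NOIO flag was set, clear it and try once more. On return,
 * *create_flags holds the flags of the last attempt.
 */
static int demo_create_qp(demo_create_qp_fn create, uint32_t *create_flags)
{
    int ret = create(*create_flags);

    if (ret == -EINVAL && (*create_flags & DEMO_QP_CREATE_USE_GFP_NOIO)) {
        *create_flags &= ~DEMO_QP_CREATE_USE_GFP_NOIO;
        ret = create(*create_flags);
    }
    return ret;
}
```

The fallback only works if drivers reliably report flags they do not understand, which is the behaviour the companion commit 60093dc0 asks of the usnic and qib drivers.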

IB: Return error for unsupported QP creation flags · 60093dc0

Authored by Or Gerlitz on May 11, 2014

Fix the usnic and the qib drivers to err when QP creation flags that
they don't understand are provided.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

60093dc0

IB: Allow build of hw/ and ulp/ subdirectories independently · 729ee4ef

Authored by Yann Droneaud on Mar 27, 2014

It is not possible to build only the drivers/infiniband/hw/ (or ulp/)
subdirectory with a command such as:

    $ make ARCH=x86_64 O=./obj-x86_64/ drivers/infiniband/hw/

This fails with the following error messages:

    make[2]: Nothing to be done for `all'.
    make[2]: Nothing to be done for `relocs'.
      CHK     include/config/kernel.release
      Using /home/ydroneaud/src/linux as source for kernel
      GEN     /home/ydroneaud/src/linux/obj-x86_64/Makefile
      CHK     include/generated/uapi/linux/version.h
      CHK     include/generated/utsrelease.h
      CALL    /home/ydroneaud/src/linux/scripts/checksyscalls.sh
    /home/ydroneaud/src/linux/scripts/Makefile.build:44: /home/ydroneaud/src/linux/drivers/infiniband/hw/Makefile: No such file or directory
    make[2]: *** No rule to make target `/home/ydroneaud/src/linux/drivers/infiniband/hw/Makefile'.  Stop.
    make[1]: *** [drivers/infiniband/hw/] Error 2
    make: *** [sub-make] Error 2

This patch creates a Makefile in hw/ and ulp/ and moves each
corresponding parts of drivers/infiniband/Makefile in the new
Makefiles.

It should not break build except if some hw/ drivers or ulp/ were
allowed previously to be built while CONFIG_INFINIBAND is set to 'n',
but according to drivers/infiniband/Kconfig, it's not possible. So it
should be safe to apply.
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>

729ee4ef

02 Jun, 2014 2 commits

Revert "net/mlx4_en: Use affinity hint" · 96b2e73c

Authored by David S. Miller on Jun 02, 2014

This reverts commit 70a640d0.
Signed-off-by: David S. Miller <davem@davemloft.net>

96b2e73c

net/mlx4_en: Use affinity hint · 70a640d0

Authored by Yuval Atias on May 25, 2014

The “affinity hint” mechanism is used by the user space
daemon, irqbalancer, to indicate a preferred CPU mask for irqs.
Irqbalancer can use this hint to balance the irqs between the
cpus indicated by the mask.

We wish the HCA to preferentially map the IRQs it uses to numa cores
close to it.  To accomplish this, we use cpumask_set_cpu_local_first(), which
sets the affinity hint according to the following policy:
First it maps IRQs to “close” numa cores.  If these are exhausted, the
remaining IRQs are mapped to “far” numa cores.
Signed-off-by: Yuval Atias <yuvala@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

70a640d0

30 May, 2014 8 commits

RDMA/cxgb4: Add missing padding at end of struct c4iw_create_cq_resp · b6f04d3d

Authored by Yann Droneaud on May 05, 2014

The i386 ABI disagrees with most other ABIs regarding alignment of
data types larger than 4 bytes: on most ABIs a padding must be added
at end of the structures, while it is not required on i386.

So for most ABIs, struct c4iw_create_cq_resp gets implicitly padded
to be aligned on an 8-byte multiple, while for i386, such padding
is not added.

The tool pahole can be used to find such implicit padding:

  $ pahole --anon_include \
           --nested_anon_include \
           --recursive \
           --class_name c4iw_create_cq_resp \
           drivers/infiniband/hw/cxgb4/iw_cxgb4.o

Then, structure layout can be compared between i386 and x86_64:

  +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt   2014-03-28 11:43:05.547432195 +0100
  --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100
  @@ -14,9 +13,8 @@ struct c4iw_create_cq_resp {
          __u32                      size;                 /*    28     4 */
          __u32                      qid_mask;             /*    32     4 */

  -       /* size: 36, cachelines: 1, members: 6 */
  -       /* last cacheline: 36 bytes */
  +       /* size: 40, cachelines: 1, members: 6 */
  +       /* padding: 4 */
  +       /* last cacheline: 40 bytes */
   };

This ABI disagreement will make an x86_64 kernel try to write past the
buffer provided by an i386 binary.

Once boundary checks are implemented, the x86_64 kernel will refuse
to write past the i386 userspace provided buffer and the uverb will
fail.

If the structure is on a page boundary and the next page is not
mapped, ib_copy_to_udata() will fail and the uverb will fail.

This patch adds explicit padding at the end of structure
c4iw_create_cq_resp, and, like 92b0ca7c ("IB/mlx5: Fix stack info
leak in mlx5_ib_alloc_ucontext()"), makes function c4iw_create_cq()
not write this padding field to userspace. This way, an x86_64 kernel
will be able to write struct c4iw_create_cq_resp as expected by
unpatched and patched i386 libcxgb4.

Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com
Cc: <stable@vger.kernel.org>
Fixes: cfdda9d7 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC")
Fixes: e24a72a3 ("RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()")
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

b6f04d3d

IB/srp: Avoid problems if a header uses pr_fmt · d236cd0e

Authored by Joe Perches on Feb 01, 2013

SRP defines pr_fmt(fmt) to be "PFX fmt", and then includes a bunch of
header files before it gets around to defining PFX.  This causes
problems if any of the header files do a pr_... and use pr_fmt().

Fix this by using KBUILD_MODNAME instead of the private PFX.
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

d236cd0e

IB/umad: Fix error handling · 8ec0a0e6

Authored by Bart Van Assche on May 20, 2014

Avoid leaking a kref count in ib_umad_open() if port->ib_dev == NULL
or if nonseekable_open() fails.

Avoid leaking a kref count, that sm_sem is kept down and also that the
IB_PORT_SM capability mask is not cleared in ib_umad_sm_open() if
nonseekable_open() fails.

Since container_of() never returns NULL, remove the code that tests
whether container_of() returns NULL.

Moving the kref_get() call from the start of ib_umad_*open() to the
end is safe since it is the responsibility of the caller of these
functions to ensure that the cdev pointer remains valid until at least
when these functions return.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: <stable@vger.kernel.org>

[ydroneaud@opteya.com: rework a bit to reduce the amount of code changed]
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>

[ nonseekable_open() can't actually fail, but....  - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>

8ec0a0e6
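
The remark in 8ec0a0e6 that container_of() never returns NULL follows from what the macro does: it subtracts a fixed member offset from a member pointer. The sketch below is a simplified user-space version for illustration; `demo_container_of`, `struct demo_cdev` and `struct demo_port` are hypothetical names, and the kernel's real macro adds type checking that is omitted here.

```c
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_cdev {
    int dummy;
};

struct demo_port {
    int port_num;
    struct demo_cdev cdev; /* embedded member, not at offset 0 */
};

/* Recover the enclosing port from a pointer to its embedded cdev. */
static struct demo_port *demo_port_from_cdev(struct demo_cdev *cdev)
{
    return demo_container_of(cdev, struct demo_port, cdev);
}
```

Because the result is derived purely by arithmetic from a valid member pointer, it always points at the enclosing object, so a NULL test on it is dead code.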

IB/mlx4: Add interface for selecting VFs to enable QP0 via MLX proxy QPs · 65fed8a8

Authored by Jack Morgenstein on May 29, 2014

This commit adds the sysfs interface for enabling QP0 on VFs for
selected VF/port.

By default, no VFs are enabled for QP0 operation.

To enable QP0 operation on a VF/port, under
/sys/class/infiniband/mlx4_x/iov/<b:d:f>/ports/x there are two new entries:

- smi_enabled (read-only). Indicates whether smi is currently
  enabled for the indicated VF/port

- enable_smi_admin (rw). Used by the admin to request that smi
  capability be enabled or disabled for the indicated VF/port.
  0 = disable, 1 = enable.
  The requested enablement will occur at the next reset of the
  VF (e.g. driver restart on the VM which owns the VF).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

65fed8a8

mlx4: Add infrastructure for selecting VFs to enable QP0 via MLX proxy QPs · 99ec41d0

Authored by Jack Morgenstein on May 29, 2014

This commit adds the infrastructure for enabling selected VFs to
operate SMI (QP0) MADs without restriction.

Additionally, for these enabled VFs, their QP0 proxy and tunnel QPs
are MLX QPs.  As such, they operate over VL15.  Therefore, they are
not affected by "credit" problems or changes in the VLArb table (which
may shut down VL0).

Non-enabled VFs may only create UD proxy QP0 qps (which are forced by
the hypervisor to send packets using the q-key it assigns and places
in the qp-context).  Thus, non-enabled VFs will not pose a security
risk.  The hypervisor discards any privileged MADs it receives from
these non-enabled VFs.

By default, all VFs are NOT enabled, and must explicitly be enabled
by the administrator.

The sysfs interface which operates the VF enablement infrastructure
is provided in the next commit.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

99ec41d0

IB/mlx4: Preparation for VFs to issue/receive SMI (QP0) requests/responses · 97982f5a

Authored by Jack Morgenstein on May 29, 2014

Currently, SRIOV VFs are denied QP0 access.  The main reason
for this decision is security, since Subnet Management Datagrams
(SMPs) are not restricted by network partitioning and may affect the
physical network topology.  Moreover, even the SM may be denied access
from portions of the network by setting management keys unknown to the
SM.

However, it is desirable to grant SMI access to certain privileged
VFs, so that certain network management activities may be conducted
within virtual machines instead of the hypervisor.

This commit does the following:

1. Create QP0 tunnel QPs for all VFs.

2. Discard SMI mads sent-from/received-for non-privileged VFs in the
   hypervisor MAD multiplex/demultiplex logic.  SMI mads from/for
   privileged VFs are allowed to pass.

3. MAD_IFC wrapper changes/fixes.  For non-privileged VFs, only
   host-view MAD_IFC commands are allowed, and only for SMI LID-Routed
   GET mads.  For privileged VFs, there are no restrictions.

This commit does not allow privileged VFs as yet.  To determine if a VF
is privileged, it calls function mlx4_vf_smi_enabled().  This function
returns 0 unconditionally for now.

The next two commits allow defining and activating privileged VFs.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

97982f5a
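
The discard rule described in commit 97982f5a can be illustrated with a small userspace sketch. This is not the driver's code: `vf_smi_enabled()` and `filter_mad()` are hypothetical stand-ins, and the real multiplex/demultiplex logic handles far more cases. The sketch only shows the shape of the decision: traffic addressed to QP0 (SMI) is dropped unless the VF is privileged, and the privilege check returns 0 unconditionally for now.

```c
/* Hypothetical verdict type for this sketch only. */
enum mad_verdict { MAD_PASS, MAD_DROP };

/*
 * Stand-in for the privilege check named in the commit message.
 * As in that commit, it returns 0 unconditionally, so no VF is
 * treated as privileged yet.
 */
static int vf_smi_enabled(int slave, int port)
{
    (void)slave;
    (void)port;
    return 0;
}

/*
 * Hypothetical filter: MADs for QP0 (SMI) pass only for privileged
 * VFs; everything else is left to the normal path.
 */
static enum mad_verdict filter_mad(int slave, int port, unsigned int dest_qpn)
{
    if (dest_qpn != 0)
        return MAD_PASS;    /* not SMI traffic, not filtered here */

    return vf_smi_enabled(slave, port) ? MAD_PASS : MAD_DROP;
}
```

With the check hard-wired to 0, every SMI MAD from a VF is dropped, which matches the statement that this commit does not yet allow privileged VFs.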

IB/mlx4: SET_PORT called by mlx4_ib_modify_port should be wrapped · 61565013

Committed by Jack Morgenstein on May 29, 2014

mlx4_ib_modify_port is invoked in IB for resetting the Q_Key violations
counters and for modifying the IB port capability flags.

For example, when opensm is started up on the hypervisor,
mlx4_ib_modify_port is called to set the port's IsSM flag.

In multifunction mode, the SET_PORT command used in this flow should
be wrapped (so that the PF port capability flags are also tracked,
thus enabling the aggregate of all the VF/PF capability flags to be
tracked properly).

The procedure mlx4_SET_PORT() in main.c is also renamed to mlx4_ib_SET_PORT()
to differentiate it from procedure mlx4_SET_PORT() in port.c.
mlx4_ib_SET_PORT() is used exclusively by mlx4_ib_modify_port().

Finally, the CM invokes ib_modify_port() to set the IsCMSupported flag
even when running over RoCE.  Therefore, when RoCE is active,
mlx4_ib_modify_port should return OK unconditionally (since the
capability flags and qkey violations counter are not relevant).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

61565013

IB/qib: Additional Intel branding changes · 0a66d2bd

Committed by Vinit Agnihotri on May 29, 2014

This patch changes user-visible function names containing "qlogic"
in module init and cleanup.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

0a66d2bd

29 May, 2014 4 commits

RDMA/cxgb3: Remove a couple unneeded conditions · 3c735d48

Committed by Dan Carpenter on Feb 06, 2014

We know that "reset_tpt_entry" is false on this side of the if else
statement so there is no need to check again.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

3c735d48

IB/mlx4: fix uninitialised variable is_mcast · bfdfcfee

Committed by Colin Ian King on May 17, 2014

Commit 297e0dad ("IB/mlx4: Handle Ethernet L2 parameters for IP
based GID addressing") introduced a bug where is_mcast is now no
longer initialized on the non-multicast condition and so it can be
any random value from the stack.  This issue was detected by cppcheck:

    [drivers/infiniband/hw/mlx4/ah.c:103]: (error) Uninitialized
      variable: is_mcast

Simple fix is to initialise is_mcast to zero.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

bfdfcfee
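
The class of bug fixed in bfdfcfee can be shown with a minimal sketch. The function below is a hypothetical stand-in, not the code in ah.c: it sets a flag only on the multicast branch, so the flag must be initialised up front or the other branch returns an indeterminate value. The 0xff test relies on the fact that IPv6 multicast addresses begin with that byte.

```c
#include <stdint.h>

/*
 * Hypothetical helper: report whether a 16-byte GID, laid out like an
 * IPv6 address, is multicast.
 */
static int gid_is_mcast(const uint8_t gid[16])
{
    int is_mcast = 0;   /* initialised, so the unicast path is defined */

    if (gid[0] == 0xff)
        is_mcast = 1;   /* only the multicast branch assigns the flag */

    return is_mcast;
}
```

Without the `= 0` initialiser, the unicast path would read whatever value happened to be on the stack, which is what cppcheck reported.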

IB/ipath: Use time_before()/_after() · 49410185

Committed by Manuel Schölling on May 25, 2014

Time comparisons must use time_after / time_before to avoid problems
when jiffies wraps.
Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de>
Signed-off-by: Roland Dreier <roland@purestorage.com>

49410185
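
Why commit 49410185 replaces plain comparisons can be seen in a userspace sketch. `ticks_t`, `ticks_after()`, `ticks_before()` and `naive_after()` are hypothetical names for illustration; the wrap-safe form is modelled on the signed-difference idea behind the kernel's time_after() and time_before(), and it assumes the usual two's-complement conversion from unsigned long to long.

```c
#include <stdbool.h>

/* Userspace stand-in for a free-running tick counter such as jiffies. */
typedef unsigned long ticks_t;

/*
 * Wrap-safe "a is after b": the unsigned difference is reinterpreted
 * as signed, so the result stays correct across a counter wrap as long
 * as the two values are less than half the counter range apart.
 */
static bool ticks_after(ticks_t a, ticks_t b)
{
    return (long)(b - a) < 0;
}

static bool ticks_before(ticks_t a, ticks_t b)
{
    return ticks_after(b, a);
}

/* Naive comparison, shown only to illustrate the failure at wrap. */
static bool naive_after(ticks_t a, ticks_t b)
{
    return a > b;
}
```

If a deadline sits just below the maximum counter value and the counter then wraps to a small number, `naive_after()` reports that the deadline has not passed, while `ticks_after()` reports that it has.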

IB/mlx5: Fix warning about cast of wr_id back to pointer on 32 bits · 6c9b5d9b

Committed by Roland Dreier on May 28, 2014

We need to cast wr_id to unsigned long before casting to a pointer.
This fixes:

       drivers/infiniband/hw/mlx5/mr.c: In function 'mlx5_umr_cq_handler':
    >> drivers/infiniband/hw/mlx5/mr.c:724:13: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
          context = (struct mlx5_ib_umr_context *)wc.wr_id;
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

6c9b5d9b
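
The cast pattern from commit 6c9b5d9b can be sketched in userspace as follows. The struct and function names are hypothetical stand-ins, not the mlx5 definitions; the only point is the intermediate cast through unsigned long, which is pointer-sized on Linux targets, so that a 64-bit wr_id is never converted directly to a 32-bit pointer.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's context and completion types. */
struct umr_context {
    int status;
};

struct work_completion {
    uint64_t wr_id;     /* always 64 bits wide, even on 32-bit builds */
};

/* Store a context pointer in the 64-bit wr_id field. */
static void wc_set_context(struct work_completion *wc, struct umr_context *ctx)
{
    wc->wr_id = (uint64_t)(unsigned long)ctx;
}

/*
 * Recover the pointer.  Casting through unsigned long first narrows the
 * value to pointer width explicitly, which avoids the
 * -Wint-to-pointer-cast warning on 32-bit builds.
 */
static struct umr_context *wc_get_context(const struct work_completion *wc)
{
    return (struct umr_context *)(unsigned long)wc->wr_id;
}
```

Portable userspace code would normally use uintptr_t for the intermediate type; unsigned long is used here because that is the cast the commit message describes.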

28 May, 2014 1 commit

IB/usnic: Fix source file missing copyright and license · ed477c4c

Committed by Upinder Malhi on Apr 19, 2014

Prepends copyright and license to usnic_uiom_interval_tree.c
Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

ed477c4c

openanolis / cloud-kernel synced successfully more than 1 year ago