Commits · 9392fa06411cf93885c4cafc8058085d98f52fec · openeuler / raspberrypi-kernel

20 Jan, 2014 5 commits

RDMA/ocrdma: Add dependency on INET · 9392fa06

Authored by Roland Dreier on Jan 19, 2014

Now that ocrdma supports IP-based addressing, we need to depend on
INET, since ocrdma registers itself for net device events.

Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Move ocrdma_inetaddr_event outside of "#if CONFIG_IPV6" · 31ab8acb

Authored by Roland Dreier on Jan 19, 2014

This fixes the build if IPV6 isn't enabled.

Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx4: Add dependency INET · f282651d

Authored by Matan Barak on Jan 16, 2014

Since mlx4_ib supports IP based addressing, a dependency on INET needs
to be added, since mlx4_ib registers itself for net device events.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Populate GID table with IP based gids · 37721d85

Authored by Moni Shoua on Dec 12, 2013

This patch is similar in spirit to the "IB/mlx4: Use IBoE (RoCE) IP
based GIDs in the port GID table" patch.

Changes to inet4 and inet6 addresses for the host are monitored and if
the address is associated with an ocrdma device then a gid is added or
deleted from the device's gid table. The gid format will be either an
IPv4-mapped IPv6 address or the IPv6 address itself.
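
As a rough illustration of the two gid formats named above (this is not
the driver code; struct gid16 and both helpers are hypothetical names
chosen for the sketch), an IPv4 address can be placed in the standard
IPv4-mapped IPv6 layout ::ffff:a.b.c.d, while an IPv6 address is copied
unchanged:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte GID container; not the kernel's union ib_gid. */
struct gid16 {
	uint8_t raw[16];
};

/* Build an IPv4-mapped IPv6 GID (::ffff:a.b.c.d) from four IPv4 octets. */
static struct gid16 gid_from_ipv4(const uint8_t ip[4])
{
	struct gid16 gid;

	assert(ip != NULL);
	memset(gid.raw, 0, sizeof(gid.raw));	/* bytes 0..9 stay zero */
	gid.raw[10] = 0xff;
	gid.raw[11] = 0xff;
	memcpy(&gid.raw[12], ip, 4);		/* IPv4 octets in network order */
	return gid;
}

/* An IPv6 address is used as the GID unchanged. */
static struct gid16 gid_from_ipv6(const uint8_t ip6[16])
{
	struct gid16 gid;

	assert(ip6 != NULL);
	memcpy(gid.raw, ip6, sizeof(gid.raw));
	return gid;
}
```

With this layout, 192.168.1.10 becomes a gid whose last six bytes are
ff ff c0 a8 01 0a.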

Cc: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Handle Ethernet L2 parameters for IP based GID addressing · 40aca6ff

Authored by Moni Shoua on Dec 12, 2013

This patch is similar in spirit to the "IB/mlx4: Handle Ethernet L2
parameters for IP based GID addressing".  It handles the fact that IP
based RoCE gids don't store Ethernet L2 parameters, MAC and VLAN.

When building an address handle, instead of parsing the dgid to
get the MAC and VLAN, take them from the address handle attributes.

Cc: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

19 Jan, 2014 2 commits

IB/mlx4: Handle Ethernet L2 parameters for IP based GID addressing · 297e0dad

Authored by Moni Shoua on Dec 12, 2013

IP based RoCE gids don't store Ethernet L2 parameters, MAC and VLAN.

Therefore, we need to extract them from the CQE and place them in
struct ib_wc (to be used for cases where they were taken from the gid).

Also, when modifying a QP or building address handle, instead of
parsing the dgid to get the MAC and VLAN, take them from the address
handle attributes.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx4: Use IBoE (RoCE) IP based GIDs in the port GID table · d487ee77

Authored by Moni Shoua on Dec 12, 2013

Currently, the mlx4 driver sets IBoE (RoCE) gids to encode the related
Ethernet netdevice interface MAC address and possibly VLAN id.

Change this scheme such that gids encode interface IP addresses (both
IPv4 and IPv6).

This requires learning the IP addresses which are of use by a
netdevice associated with the HCA port, formatting them to gids and
adding them to the port gid table.  Furthermore, events of add and
delete address are caught to maintain the gid table accordingly.

Associated IP addresses may belong to a master of an Ethernet
netdevice on top of that port so this should be considered when
building and maintaining the gid table.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

15 Jan, 2014 1 commit

IB/core: Ethernet L2 attributes in verbs/cm structures · dd5f03be

Authored by Matan Barak on Dec 12, 2013

This patch adds support for Ethernet L2 attributes in the
verbs/cm/cma structures.

When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
in a manner similar to how the IB L2 (and the L4 PKEY) attributes are used.

Thus, those attributes were added to the following structures:

* ib_ah_attr - added dmac
* ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
* ib_wc - added smac, vlan_id
* ib_sa_path_rec - added smac, dmac, vlan_id
* cm_av - added smac and vlan_id

For the path record structure, extra care was taken to avoid the new
fields when packing it into wire format, so we don't break the IB CM
and SA wire protocol.

On the active side, the CM fills its internal structures from the
path provided by the ULP.  We add there taking the ETH L2 attributes
and placing them into the CM Address Handle (struct cm_av).

On the passive side, the CM fills its internal structures from the WC
associated with the REQ message.  We add there taking the ETH L2
attributes from the WC.

When the HW driver provides the required ETH L2 attributes in the WC,
they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
code checks for the presence of these flags, and in their absence does
address resolution from the ib_init_ah_from_wc() helper function.
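
A minimal sketch of the flag check described above. The flag values,
struct wc_sketch and wc_has_l2_attrs() are placeholders invented for
illustration, not copied from ib_verbs.h, and the sketch assumes that
address resolution is needed whenever either flag is missing:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder flag values for illustration only. */
enum {
	WC_WITH_SMAC = 1u << 0,
	WC_WITH_VLAN = 1u << 1,
};

/* Reduced stand-in for the work completion fields discussed above. */
struct wc_sketch {
	uint32_t wc_flags;
	uint8_t  smac[6];
	uint16_t vlan_id;
};

/*
 * True when the driver supplied both L2 attributes in the WC, so the
 * core can use them directly; false means the caller must fall back
 * to address resolution.
 */
static bool wc_has_l2_attrs(const struct wc_sketch *wc)
{
	assert(wc != NULL);
	return (wc->wc_flags & WC_WITH_SMAC) != 0 &&
	       (wc->wc_flags & WC_WITH_VLAN) != 0;
}
```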

ib_modify_qp_is_ok is also updated to consider the link layer. Some
parameters are mandatory for Ethernet link layer, while they are
irrelevant for IB.  Vendor drivers are modified to support the new
function signature.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

23 Dec, 2013 3 commits

RDMA/cxgb4: Use cxgb4_select_ntuple to correctly calculate ntuple fields · 41b4f86c

Authored by Kumar Sanghvi on Dec 18, 2013

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDMA/cxgb4: Server filters are supported only for IPv4 · 8c044690

Authored by Kumar Sanghvi on Dec 18, 2013

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDMA/cxgb4: Calculate the filter server TID properly · a4ea025f

Authored by Kumar Sanghvi on Dec 18, 2013

Based on original work by Santosh Rastapur <santosh@chelsio.com>

Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Dec, 2013 1 commit

RDMA/cxgb4: Make _c4iw_write_mem_dma() static · c00850dd

Authored by Rashika on Dec 14, 2013

This patch marks the function _c4iw_write_mem_dma() as static
because it is not used outside this file, which fixes the warning:

drivers/infiniband/hw/cxgb4/mem.c:176:5: warning: no previous prototype for ‘_c4iw_write_mem_dma’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>

18 Nov, 2013 2 commits

IB/core: Re-enable create_flow/destroy_flow uverbs · 69ad5da4

Authored by Matan Barak on Nov 06, 2013

This commit reverts commit 7afbddfa ("IB/core: Temporarily disable
create_flow/destroy_flow uverbs").  Since the uverbs extensions
functionality was experimental for v3.12, this patch re-enables the
support for them and flow-steering for v3.13.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/core: extended command: an improved infrastructure for uverbs commands · f21519b2

Authored by Yann Droneaud on Nov 06, 2013

Commit 400dbc96 ("IB/core: Infrastructure for extensible uverbs
commands") added an infrastructure for extensible uverbs commands
while later commit 436f2ad0 ("IB/core: Export ib_create/destroy_flow
through uverbs") exported ib_create_flow()/ib_destroy_flow() functions
using this new infrastructure.

According to the commit 400dbc96, the purpose of this
infrastructure is to support passing around provider (eg. hardware)
specific buffers when userspace issues commands to the kernel, so that
it would be possible to extend uverbs (eg. core) buffers independently
from the provider buffers.

But the new kernel command function prototypes were not modified to
take advantage of this extension. This issue was exposed by Roland
Dreier in a previous review[1].

So the following patch is an attempt at a revised extensible command
infrastructure.

This improved extensible command infrastructure distinguishes
core (eg. legacy) command/response buffers from provider
(eg. hardware) command/response buffers: each extended command
implementing function is given a struct ib_udata to hold core
(eg. uverbs) input and output buffers, and another struct ib_udata to
hold the hw (eg. provider) input and output buffers.

Having those buffers identified separately makes it easier to increase
one buffer to support extension without having to add some code to
guess the exact size of each command/response part. This should make
the extended functions more reliable.

Additionally, instead of relying on the command identifier being greater
than IB_USER_VERBS_CMD_THRESHOLD, the proposed infrastructure relies on
unused bits in the command field: of the 32 bits provided by the command
field, only 6 bits are really needed to encode the identifier of
commands currently supported by the kernel. (Even using only 6 bits
leaves room for about 23 new commands).

So this patch makes use of some high order bits in command field to
store flags, leaving enough room for more command identifiers than one
will ever need (eg. 256).
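
To make the bit split concrete, here is a small sketch that follows the
"flags | 00 00 | command" row of the header diagram in this message:
the low byte carries the command identifier (256 values) and the top
byte carries flags. The macro names, the helper names and the choice of
0x80 as the "extended" flag bit are assumptions made for this example,
not definitions quoted from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative masks: flags in the top byte, identifier in the low byte. */
#define CMD_ID_MASK		0x000000ffu
#define CMD_FLAGS_MASK		0xff000000u
#define CMD_FLAGS_SHIFT		24
#define CMD_FLAG_EXTENDED	0x80u	/* assumed flag bit */

/* Compose a command word from an identifier and a flags byte. */
static uint32_t cmd_make(uint32_t id, uint32_t flags)
{
	assert(id <= 0xff && flags <= 0xff);
	return (flags << CMD_FLAGS_SHIFT) | id;
}

static uint32_t cmd_id(uint32_t command)
{
	return command & CMD_ID_MASK;
}

static uint32_t cmd_flags(uint32_t command)
{
	return (command & CMD_FLAGS_MASK) >> CMD_FLAGS_SHIFT;
}

static bool cmd_is_extended(uint32_t command)
{
	return (cmd_flags(command) & CMD_FLAG_EXTENDED) != 0;
}

/* The two middle bytes are unused and must be zero. */
static bool cmd_is_valid(uint32_t command)
{
	return (command & ~(CMD_ID_MASK | CMD_FLAGS_MASK)) == 0;
}
```

A kernel that only understands legacy commands sees a set high-order
bit as an unknown command and rejects it, which is the failure mode the
message relies on for newer libibverbs on older kernels.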

The new flags are used to specify if the command should be processed
as an extended one or a legacy one. While designing the new command
format, care was taken to make usage of flags itself extensible.

Using high order bits of the command field ensures that newer
libibverbs on older kernel will properly fail when trying to call
extended commands. On the other hand, older libibverbs on newer kernel
will never be able to issue calls to extended commands.

The extended command header includes the optional response pointer so
that output buffer length and output buffer pointer are located
together in the command, allowing proper parameter checking. This
should make implementing functions easier and safer.

Additionally, the extended header ensures 64-bit alignment, while making
all sizes multiples of 8 bytes, extending the maximum buffer size:

                             legacy      extended

   Maximum command buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)
  Maximum response buffer:  256KBytes   1024KBytes (512KBytes + 512KBytes)
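
The figures in the table follow from 16-bit word counts. A short worked
sketch, assuming the legacy counts are in 4-byte words (which is what
the 256KBytes limit implies) and the extended counts are in 8-byte
words as stated above; the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define LEGACY_WORD_BYTES	4u	/* assumed legacy word size */
#define EXTENDED_WORD_BYTES	8u	/* extended sizes are multiples of 8 bytes */

/* Bytes described by a legacy 16-bit word count. */
static uint32_t legacy_buffer_bytes(uint16_t words)
{
	return (uint32_t)words * LEGACY_WORD_BYTES;
}

/* Bytes described by the core and provider 16-bit word counts together. */
static uint32_t extended_buffer_bytes(uint16_t core_words,
				      uint16_t provider_words)
{
	assert(EXTENDED_WORD_BYTES == 8u);
	return ((uint32_t)core_words + provider_words) * EXTENDED_WORD_BYTES;
}
```

With both counts at 65535, the legacy limit is 262140 bytes (just under
256 KBytes) and the extended limit is 2 * 524280 = 1048560 bytes (just
under 1024 KBytes).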

For the purpose of doing proper buffer size accounting, the header
sizes are no longer taken into account in "in_words".

One of the oddities of the current extensible infrastructure, reading
the "legacy" command header twice, is fixed by removing the "legacy"
command header from the extended command header: they are processed as
two different parts of the command, memory is read once and
information is not duplicated. This makes it clear that this is an
extended command scheme and not a different command scheme.

The proposed scheme will format input (command) and output (response)
buffers this way:

- command:

  legacy header +
  extended header +
  command data (core + hw):

    +----------------------------------------+
    | flags     |   00      00    |  command |
    |        in_words    |   out_words       |
    +----------------------------------------+
    |                 response               |
    |                 response               |
    | provider_in_words | provider_out_words |
    |                 padding                |
    +----------------------------------------+
    |                                        |
    .              <uverbs input>            .
    .              (in_words * 8)            .
    |                                        |
    +----------------------------------------+
    |                                        |
    .             <provider input>           .
    .          (provider_in_words * 8)       .
    |                                        |
    +----------------------------------------+

- response, if present:

    +----------------------------------------+
    |                                        |
    .          <uverbs output space>         .
    .             (out_words * 8)            .
    |                                        |
    +----------------------------------------+
    |                                        |
    .         <provider output space>        .
    .         (provider_out_words * 8)       .
    |                                        |
    +----------------------------------------+
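
The two diagrams can be read as the following C layout. This is a
sketch derived from the pictures only: the struct names, the "reserved"
field name for the padding row and the two size helpers are assumptions
of this example, not the definitions added by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First diagram row group: one 64-bit word. */
struct legacy_hdr_sketch {
	uint32_t command;	/* flags in the top byte, identifier in the low byte */
	uint16_t in_words;	/* uverbs input size in 8-byte words, headers excluded */
	uint16_t out_words;	/* uverbs output size in 8-byte words */
};

/* Second diagram row group: two 64-bit words. */
struct ext_hdr_sketch {
	uint64_t response;		/* userspace address of the response, 0 if none */
	uint16_t provider_in_words;	/* provider input size in 8-byte words */
	uint16_t provider_out_words;	/* provider output size in 8-byte words */
	uint32_t reserved;		/* the "padding" row */
};

_Static_assert(sizeof(struct legacy_hdr_sketch) == 8, "one 64-bit word");
_Static_assert(sizeof(struct ext_hdr_sketch) == 16, "two 64-bit words");

/* Total bytes of a command: both headers plus uverbs and provider input. */
static size_t ext_cmd_total_bytes(const struct legacy_hdr_sketch *h,
				  const struct ext_hdr_sketch *ex)
{
	assert(h != NULL && ex != NULL);
	return sizeof(*h) + sizeof(*ex) +
	       8u * ((size_t)h->in_words + ex->provider_in_words);
}

/* Total bytes of the response area: uverbs plus provider output space. */
static size_t ext_resp_total_bytes(const struct legacy_hdr_sketch *h,
				   const struct ext_hdr_sketch *ex)
{
	assert(h != NULL && ex != NULL);
	return 8u * ((size_t)h->out_words + ex->provider_out_words);
}
```

Because every field group is a multiple of 8 bytes, the uverbs input
that follows the headers starts on a 64-bit boundary.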

The overall design is to ensure that the extensible infrastructure is
itself extensible while being more reliable with more input and bounds
checking.

Note:

The unused field in the extended header would be perfect candidate to
hold the command "comp_mask" (eg. bit field used to handle
compatibility).  This was suggested by Roland Dreier in a previous
review[2].  But "comp_mask" field is likely to be present in the uverb
input and/or provider input, likewise for the response, as noted by
Matan Barak[3], so it doesn't make sense to put "comp_mask" in the
header.

[1]:
http://marc.info/?i=CAL1RGDWxmM17W2o_era24A-TTDeKyoL6u3NRu_=t_dhV_ZA9MA@mail.gmail.com

[2]:
http://marc.info/?i=CAL1RGDXJtrc849M6_XNZT5xO1+ybKtLWGq6yg6LhoSsKpsmkYA@mail.gmail.com

[3]:
http://marc.info/?i=525C1149.6000701@mellanox.com

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com

[ Convert "ret ? ret : 0" to the equivalent "ret".  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>

16 Nov, 2013 4 commits

IB/mlx5: Fix page shift in create CQ for userspace · cf1c5e1f

Authored by Eli Cohen on Oct 31, 2013

When creating a CQ, we must use mlx5 adapter page shift.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx4: Fix device max capabilities check · 79d3da9c

Authored by Eli Cohen on Oct 31, 2013

Move the check on max supported CQEs after the final number of entries is
evaluated.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx5: Remove dead code · 7e2e1921

Authored by Eli Cohen on Oct 31, 2013

The value of the local variable index is never used in reg_mr_callback().

Signed-off-by: Eli Cohen <eli@mellanox.com>

[ Remove now-unused variable delta too.  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx4: Fix endless loop in resize CQ · 93b80ac2

Authored by Eli Cohen on Oct 31, 2013

When calling get_sw_cqe() we need to pass the consumer_index and not the
masked value. Failure to do so will cause an incorrect result from
get_sw_cqe(), possibly leading to an endless loop.
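
One way to see why the unmasked index matters, under an assumed
ownership scheme in which the CQE owner bit flips on every pass over a
power-of-two ring: the pass parity lives in the bit just above the ring
mask, and masking the index throws that bit away. The function below is
a hypothetical stand-in for illustration, not the mlx4 get_sw_cqe():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the entry belongs to software. nent is the ring
 * size (a power of two), so (index & nent) is the pass-parity bit.
 * An index that was already masked with (nent - 1) always has that
 * bit clear and gives the wrong answer on every other pass.
 */
static bool cqe_is_sw_owned(uint8_t owner_bit, uint32_t index, uint32_t nent)
{
	bool parity;

	assert(nent != 0 && (nent & (nent - 1)) == 0);
	parity = (index & nent) != 0;
	return (owner_bit != 0) == parity;
}
```

For an 8-entry ring with consumer index 9 (second pass) and an owner
bit of 1, the unmasked index reports the entry as software-owned, while
the masked index 1 reports the opposite; a loop that waits for the
answer to change would then never terminate.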

This problem was reported and analyzed by Michael Rice from HP.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

13 Nov, 2013 1 commit

qib_fs: fix (some) dcache abuses · 441a9d0e

Authored by Al Viro on Nov 13, 2013

* lookup_one_len() really wants i_mutex held on directory.
* leaks galore - just mount ipathfs, then
  cd /sys/bus/pci/drivers/qib_ib; echo *:*:*.* >unbind
  on a box with that card present and try to umount ipathfs...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

09 Nov, 2013 19 commits

RDMA/nes: Remove self-assignment from nes_query_qp() · 4127c365

Authored by Dave Jones on Sep 17, 2013

Assigning a value to itself is pointless.

Spotted with coverity, no hardware to test.

Signed-off-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/qib: Fix txselect regression · 2fadd831

Authored by Mike Marciniszyn on Oct 25, 2013

Commit 7fac3301 ("IB/qib: checkpatch fixes") was overzealous in
removing a simple_strtoul for a parse routine, setup_txselect().  That
routine is required to handle a multi-value string.

Unwind that aspect of the fix.
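
The reason a multi-value string needs an end-pointer style parser can
be shown in userspace with the standard strtoul(): each call reports
where the number stopped, so parsing can continue after the separator,
whereas a parser that rejects any trailing characters cannot walk a
list. parse_value_list() and the comma separator are assumptions of
this sketch, not the setup_txselect() code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse up to max comma-separated unsigned values from str into out.
 * Returns the number of values parsed, or -1 on malformed input or
 * when the string holds more than max values.
 */
static int parse_value_list(const char *str, unsigned long *out, size_t max)
{
	size_t n = 0;

	assert(str != NULL && out != NULL);
	while (n < max) {
		char *end;
		unsigned long val = strtoul(str, &end, 0);

		if (end == str)
			return -1;	/* no digits where a value was expected */
		out[n++] = val;
		if (*end == '\0')
			return (int)n;	/* clean end of string */
		if (*end != ',')
			return -1;	/* unexpected separator */
		str = end + 1;		/* continue after the comma */
	}
	return -1;			/* more values than the caller allows */
}
```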

Cc: <stable@vger.kernel.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/qib: Fix checkpatch __packed warnings · 78a58864

Authored by Mike Marciniszyn on Oct 24, 2013

Convert __attribute__ ((packed)) to __packed.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/qib: Convert qib_user_sdma_pin_pages() to use get_user_pages_fast() · 603e7729

Authored by Jan Kara on Oct 04, 2013

qib_user_sdma_queue_pkts() gets called with mmap_sem held for
writing. Except for get_user_pages() deep down in
qib_user_sdma_pin_pages() we don't seem to need mmap_sem at all.  Even
more interestingly the function qib_user_sdma_queue_pkts() (and also
qib_user_sdma_coalesce() called somewhat later) call copy_from_user()
which can hit a page fault and we deadlock on trying to get mmap_sem
when handling that fault.

So just make qib_user_sdma_pin_pages() use get_user_pages_fast() and
leave mmap_sem locking for mm.

This deadlock has actually been observed in the wild when the node
is under memory pressure.

Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/ipath: Convert ipath_user_sdma_pin_pages() to use get_user_pages_fast() · 4adcf7fb

Authored by Jan Kara on Oct 04, 2013

ipath_user_sdma_queue_pkts() gets called with mmap_sem held for
writing.  Except for get_user_pages() deep down in
ipath_user_sdma_pin_pages() we don't seem to need mmap_sem at all.

Even more interestingly the function ipath_user_sdma_queue_pkts() (and
also ipath_user_sdma_coalesce() called somewhat later) call
copy_from_user() which can hit a page fault and we deadlock on trying
to get mmap_sem when handling that fault.  So just make
ipath_user_sdma_pin_pages() use get_user_pages_fast() and leave
mmap_sem locking for mm.

This deadlock has actually been observed in the wild when the node
is under memory pressure.

Cc: <stable@vger.kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>

[ Merged in fix for call to get_user_pages_fast from Tetsuo Handa
  <penguin-kernel@I-love.SAKURA.ne.jp>.  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Remove redundant check in ocrdma_build_fr() · d5e3f378

Authored by Naresh Gottumukkala on Oct 28, 2013

Remove the redundant check of comparing if a 32-bit value is greater
than 0xffffffffULL.

Reported by Dan Carpenter.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Fix a crash in rmmod · 1852d1da

Authored by Naresh Gottumukkala on Sep 06, 2013

1) ocrdma_remove_free() is called from a call_rcu callback function
   context, which can be a bottom-half context. So the code in
   ocrdma_remove_free() should not sleep.

   But ocrdma_cleanup_hw() can sleep, so move it to ocrdma_remove()
   instead of ocrdma_remove_free().

2) Fix a couple of kbuild test robot warnings.

Signed-off-by: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

RDMA/ocrdma: Silence an integer underflow warning · 6ebacdfc

Authored by Dan Carpenter on Sep 06, 2013

We recently added a cap on "max_wqe_allocated" in 43a6b402
('RDMA/ocrdma: Create IRD queue fix').

My static checker complains that the cap has a problem because it
casts large values to negative.  "attrs->cap.max_send_wr" is a u32.
It comes from the user, but it's capped in ocrdma_check_qp_params() so
it can't wrap here.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

mlx5: Use enum to indicate adapter page size · 1b77d2bd

Authored by Eli Cohen on Oct 24, 2013

The Connect-IB adapter has an inherent page size which equals 4K.
Define a new enum that equals the page shift and use it instead of
using the value 12 throughout the code.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx5: Update opt param mask for RTS2RTS · c2a3431e

Authored by Eli Cohen on Oct 24, 2013

RTS to RTS transition should allow update of alternate path.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx5: Remove "Always false" comparison · 07c9113f

Authored by Eli Cohen on Oct 24, 2013

mlx5_cur and mlx5_new cannot have negative values so remove the
redundant condition.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx5: Remove dead code in mr.c · 2d036fad

Authored by Eli Cohen on Oct 24, 2013

In mlx5_mr_cache_init() the size variable is not used so remove it to
avoid compiler warnings when running with make W=1.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

mlx5: Support communicating arbitrary host page size to firmware · bf0bf77f

Authored by Eli Cohen on Oct 23, 2013

Connect-IB firmware requires 4K pages to be communicated with the
driver. This patch breaks larger pages to 4K units to enable support
for architectures utilizing larger page size, such as PowerPC. This
patch also fixes several places that referred to PAGE_SHIFT instead of
explicit 12 which is the inherent page shift on Connect-IB.
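
A minimal sketch of the splitting step, assuming a host page is simply
walked in 4K strides. The macro and function names are hypothetical and
the real driver's bookkeeping is not shown:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADAPTER_PAGE_SHIFT	12u	/* inherent 4K page of the adapter */
#define ADAPTER_PAGE_SIZE	(1u << ADAPTER_PAGE_SHIFT)

/* Number of 4K adapter pages contained in one host page. */
static unsigned int adapter_pages_per_host_page(unsigned int host_page_shift)
{
	assert(host_page_shift >= ADAPTER_PAGE_SHIFT && host_page_shift < 32);
	return 1u << (host_page_shift - ADAPTER_PAGE_SHIFT);
}

/*
 * Write the address of each 4K unit of the host page at host_addr into
 * out[], up to max entries. Returns the number of entries written.
 */
static size_t split_host_page(uint64_t host_addr, unsigned int host_page_shift,
			      uint64_t *out, size_t max)
{
	unsigned int n = adapter_pages_per_host_page(host_page_shift);
	size_t i;

	assert(out != NULL);
	for (i = 0; i < n && i < max; i++)
		out[i] = host_addr + (uint64_t)i * ADAPTER_PAGE_SIZE;
	return i;
}
```

On a host with 64K pages (shift 16) each host page yields 16 adapter
pages; on a 4K host the split is the identity.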

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx5: Fix srq free in destroy qp · cfd8f1d4

Authored by Moshe Lazer on Oct 23, 2013

On destroy QP the driver walks over the relevant CQ and removes CQEs
reported for the destroyed QP. It also frees the related SRQ entry
without checking that this is actually an SRQ-related CQE. In case of
a CQ used for both send and receive QP, we could free SRQ entries for
send CQEs. This patch resolves this issue by verifying that this is an
SRQ-related CQE by checking that the SRQ number in the CQE is not zero.

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx5: Simplify mlx5_ib_destroy_srq · 1faacf82

Authored by Eli Cohen on Oct 23, 2013

Make use of destroy_srq_kernel() to clear SRQ resources.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx5: Fix overflow check in IB_WR_FAST_REG_MR · 9641b74e

Authored by Eli Cohen on Oct 23, 2013

Make sure not to overflow when reading the page list from struct
ib_fast_reg_page_list.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx5: Multithreaded create MR · 746b5583

Authored by Eli Cohen on Oct 23, 2013

Use asynchronous commands to execute up to eight concurrent create MR
commands. This is to fill memory caches faster so we keep consuming
from there.  Also, increase timeout for shrinking caches to five
minutes.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/mlx5: Fix check of number of entries in create CQ · 51ee86a4

Authored by Eli Cohen on Oct 23, 2013

Verify that the value is non-negative before rounding up to power of 2.
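
A small sketch of why the order of the two steps matters: once a
negative count has been converted to an unsigned type it becomes a huge
value for which no 32-bit power of two exists. Both function names are
hypothetical and the rounding loop is a plain userspace stand-in for
the kernel helper:

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two; v must not exceed 2^31. */
static uint32_t roundup_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v <= 0x80000000u);
	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Validate first, then round. Returns the rounded entry count, or -1
 * when the requested count is negative.
 */
static int64_t cq_entries_checked(int requested)
{
	if (requested < 0)
		return -1;
	return (int64_t)roundup_pow2((uint32_t)requested);
}
```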

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

IB/cxgb4: Fix formatting of physical address · 649fb5ec

Authored by Ben Hutchings on Oct 27, 2013

Physical addresses may be wider than virtual addresses (e.g. on i386
with PAE) and must not be formatted with %p.
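
A userspace sketch of the safer pattern: carry the physical address in
a fixed 64-bit integer and print it with an integer format instead of
%p. format_phys_addr() is a hypothetical helper and this is not the
format string used in the driver:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a physical address held in a 64-bit integer. Returns the
 * number of characters that snprintf() reports.
 */
static int format_phys_addr(char *buf, size_t len, uint64_t phys)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "0x%" PRIx64, phys);
}
```

An address such as 0x123456789 needs 36 bits, so it would not survive a
round trip through a 32-bit pointer.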

Compile-tested only.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Roland Dreier <roland@purestorage.com>

08 Nov, 2013 1 commit

net/mlx4_core: Initialize all mailbox buffers to zero before use · 571b8b92

Authored by Jack Morgenstein on Nov 07, 2013

To guarantee that all unused fields in all FW commands for both inboxes
and outboxes are zeroed out, initialize the mailbox buffer to all zeroes.

This is especially important for SRIOV comm-channel virtual commands
(such as QUERY_FUNC_CAP), where if new fields are added to support new
features, the driver can depend on older kernels passing zeroes in these
fields.

In addition to zeroing out the mailbox buffer at allocation time, all
(now unnecessary) calls to memset by the callers of
mlx4_alloc_cmd_mailbox() are removed.
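
The idea of zeroing at allocation time, so that callers no longer need
their own memset, can be sketched in userspace as follows. The struct,
the function names and the 4096-byte size are assumptions of this
example and do not mirror mlx4_alloc_cmd_mailbox():

```c
#include <assert.h>
#include <stdlib.h>

#define MAILBOX_SIZE 4096u	/* assumed size, for illustration only */

struct mailbox_sketch {
	unsigned char *buf;
};

/*
 * Allocate a mailbox whose buffer is already all zeroes, so fields the
 * caller does not set are guaranteed to read as zero. Returns 0 on
 * success and -1 on allocation failure.
 */
static int mailbox_alloc(struct mailbox_sketch *mb)
{
	assert(mb != NULL);
	mb->buf = calloc(1, MAILBOX_SIZE);
	return mb->buf ? 0 : -1;
}

static void mailbox_free(struct mailbox_sketch *mb)
{
	assert(mb != NULL);
	free(mb->buf);
	mb->buf = NULL;
}
```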

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Nov, 2013 1 commit

mlx4: Structures and init/teardown for VF resource quotas · 5a0d0a61

Authored by Jack Morgenstein on Nov 03, 2013

This is step #1 for implementing SRIOV resource quotas for VFs.

Quotas are implemented per resource type for VFs and the PF, to prevent
any entity from simply grabbing all the resources for itself and leaving
the other entities unable to obtain such resources.

Resources which are allocated using quotas:  QPs, CQs, SRQs, MPTs, MTTs, MAC,
                                             VLAN, and Counters.

The quota system works as follows:
Each entity (VF or PF) is given a max number of a given resource (its quota),
and a guaranteed minimum number for each resource (starvation prevention).

For QPs, CQs, SRQs, MPTs and MTTs:
50% of the available quantity for the resource is divided equally among
the PF and all the active VFs (i.e., the number of VFs in the mlx4_core module
parameter "num_vfs"). This 50% represents the "guaranteed minimum" pool.
The other 50% is the "free pool", allocated on a first-come-first-serve basis.
For each VF/PF, resources are first allocated from its "guaranteed-minimum"
pool. When that pool is exhausted, the driver attempts to allocate from
the resource "free-pool".

The quota (i.e., max) for the VFs and the PF is:
  The free-pool amount (50% of the real max) + the guaranteed minimum
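
The arithmetic for QPs, CQs, SRQs, MPTs and MTTs described above can be
worked through with a short sketch. The struct and function names are
hypothetical, and integer division is assumed for the equal split:

```c
#include <assert.h>
#include <stdint.h>

struct quota_sketch {
	uint32_t guaranteed;	/* per-function guaranteed minimum */
	uint32_t quota;		/* per-function maximum */
};

/*
 * Half of the resource is split equally among the PF and num_vfs VFs
 * as the guaranteed minimum; the other half is the shared free pool.
 * Each function's quota is the free pool plus its own guarantee.
 */
static struct quota_sketch compute_quota(uint32_t total, uint32_t num_vfs)
{
	struct quota_sketch q;
	uint32_t guaranteed_pool = total / 2;
	uint32_t free_pool = total - guaranteed_pool;

	assert(num_vfs < UINT32_MAX);
	q.guaranteed = guaranteed_pool / (num_vfs + 1);	/* +1 for the PF */
	q.quota = free_pool + q.guaranteed;
	return q;
}
```

For example, with 1024 units of a resource and 7 VFs, each of the 8
functions is guaranteed 64 units and may use at most 512 + 64 = 576.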

For MACs:
  Guarantee 2 MACs per VF/PF per port. As a result, since we have only
  128 MACs per port, reduce the allowable number of VFs from 64 to 63.
  Any remaining MACs are put into a free pool.

For VLANs:
  For the PF, the per-port quota is 128 and guarantee is 64
     (to allow the PF to register at least a VLAN per VF in VST mode).
  For the VFs, the per-port quota is 64 and the guarantee is 0.
      We assume that VGT VFs are trusted not to abuse the VLAN resource.

For Counters:
  For all functions (PF and VFs), the quota is 128 and the guarantee is 0.

In this patch, we define the needed structures, which are added to the
resource-tracker struct.  In addition, we do initialization
for the resource quota, and adjust the query_device response to use quotas
rather than resource maxima.

As part of the implementation, we introduce a new field in
mlx4_dev: quotas.  This field holds the resource quotas used
to report maxima to the upper layers (ib_core, via query_device).

The HCA maxima of these values are passed to the VFs (via
QUERY_HCA) so that they may continue to use these in handling
QPs, CQs, SRQs and MPTs.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>