Commits · a4ef1451dfba92f51934e8331f634497b9ed3393 · openanolis / cloud-kernel

26 Jan, 2008 · 40 commits

IB/iser: Print information about unhandled RDMA CM events · a4ef1451

Committed by Erez Zilber on Jan 17, 2008

Some RDMA CM events are not supported or not handled in iSER.
This patch adds some info (printk) for the user about them.
Signed-off-by: Erez Zilber <erezz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/fmr_pool: ib_fmr_pool_flush() should flush all dirty FMRs · a3cd7d90

Committed by Olaf Kirch on Jan 16, 2008

When a FMR is released via ib_fmr_pool_unmap(), the FMR usually ends
up on the free_list rather than the dirty_list (because we allow a
certain number of remappings before actually requiring a flush).

However, ib_fmr_batch_release() only looks at dirty_list when flushing
out old mappings.  This means that when ib_fmr_pool_flush() is used to
force a flush of the FMR pool, some dirty FMRs that have not reached
their maximum remap count will not actually be flushed.

Fix this by flushing all FMRs that have been used at least once in
ib_fmr_batch_release().
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/fmr_pool: Flush serial numbers can get out of sync · a656eb75

Committed by Olaf Kirch on Jan 16, 2008

Normally, the serial numbers for flush requests and flushes executed
for an FMR pool should be in sync.

However, if the FMR pool flushes dirty FMRs because the
dirty_watermark was reached, we wake up the cleanup thread and let it
do its stuff.  As a side effect, the cleanup thread increments
pool->flush_ser, which leaves it one higher than pool->req_ser.  The
next time the user calls ib_flush_fmr_pool(), the cleanup thread will
be woken up, but ib_flush_fmr_pool() won't wait for the flush to
complete because flush_ser is already past req_ser.  This means the
FMRs that the user expects to be flushed may not have all been flushed
when the function returns.

Fix this by telling the cleanup thread to do work exclusively by
incrementing req_ser, and by moving the comparison of dirty_len and
dirty_watermark into ib_fmr_pool_unmap().
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
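
The serial-number scheme described in this commit can be sketched in plain user-space C. This is only a rough model under stated assumptions, not the kernel code: the struct `fmr_pool_model` and the helpers `note_dirty_fmr()`, `cleanup_thread_run()`, `request_flush()` and `demo_serials_in_sync()` are hypothetical names, and waking or waiting on the cleanup thread is replaced by a direct function call.

```c
#include <assert.h>

/* Hypothetical model of the pool state named in the commit message. */
struct fmr_pool_model {
	int req_ser;         /* number of flushes requested so far */
	int flush_ser;       /* number of flushes completed so far */
	int dirty_len;       /* FMRs currently waiting for a flush */
	int dirty_watermark; /* threshold that triggers a flush    */
};

/* Unmap path: the watermark check lives here and asks for work
 * only by incrementing req_ser. */
static void note_dirty_fmr(struct fmr_pool_model *p)
{
	if (++p->dirty_len >= p->dirty_watermark)
		p->req_ser++; /* the real code would also wake the thread */
}

/* Cleanup thread body: flush only while flush_ser lags req_ser,
 * so flush_ser can never run ahead of req_ser. */
static void cleanup_thread_run(struct fmr_pool_model *p)
{
	while (p->flush_ser - p->req_ser < 0) {
		p->dirty_len = 0; /* stands in for the batch release */
		p->flush_ser++;
	}
}

/* Explicit flush: take a new serial, then "wait" for it to complete.
 * Returns 1 if the requested flush has been executed. */
static int request_flush(struct fmr_pool_model *p)
{
	int serial = ++p->req_ser;

	cleanup_thread_run(p); /* stands in for wake-up plus wait */
	return p->flush_ser - serial >= 0;
}

/* Watermark-triggered flush followed by an explicit flush. */
static int demo_serials_in_sync(void)
{
	struct fmr_pool_model p = { .dirty_watermark = 2 };

	note_dirty_fmr(&p);
	note_dirty_fmr(&p);     /* watermark reached, req_ser becomes 1 */
	cleanup_thread_run(&p); /* flush_ser catches up to 1            */
	return request_flush(&p) && p.req_ser == p.flush_ser;
}
```

In this model the two counters stay equal after a watermark-triggered flush, so a later explicit flush still has a fresh serial to wait for.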

IB/umad: Simplify and fix locking · 2fe7e6f7

Committed by Roland Dreier on Jan 25, 2008

In addition to being overly complex, the locking in user_mad.c is
broken: there were multiple reports of deadlocks and lockdep warnings.
In particular it seems that a single thread may end up trying to take
the same rwsem for reading more than once, which is explicitly
forbidden in the comments in <linux/rwsem.h>.

To solve this, we change the locking to use plain mutexes instead of
rwsems. There is one mutex per open file, which protects the contents
of the struct ib_umad_file, including the array of agents and list of
queued packets; and there is one mutex per struct ib_umad_port, which
protects the contents, including the list of open files. We never
hold the file mutex across calls to functions like ib_unregister_mad_agent(),
which can call back into other ib_umad code to queue a packet, and we
always hold the port mutex as long as we need to make sure that a
device is not hot-unplugged from under us.

This even makes things nicer for users of the -rt patch, since we
remove calls to downgrade_write() (which is not implemented in -rt).
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Fix some sparse warnings about shadowed symbols · cf9542aa

Committed by Roland Dreier on Jan 25, 2008

There are a few places in the ipath driver where a variable is
re-declared within a block where it is already in scope. Most of these
extra declarations can simply be removed, since the variable from the
outer scope is used in a way so that it does not need to keep its
value across the block with the re-declaration.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

RDMA/cxgb3: Endianness annotation for irs field · 1d6e658e

Committed by Roland Dreier on Jan 25, 2008

t3_rdma_init_wr.irs is a big-endian field, so declare it as __be32.
This fixes one sparse warning.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ehca: Use round_jiffies() for EQ polling timer · 1a7d2dce

Committed by Anton Blanchard on Oct 15, 2007

Use round_jiffies() to align ehca's 1-second timer with other timers
and potentially save power by sleeping cores for longer.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

RDMA/cma: Override default responder_resources with user value · 5851bb89

Committed by Sean Hefty on Jan 04, 2008

By default, the responder_resources parameter is set to that received
in a connection request.  The passive side may override this value
when accepting the connection.  Use the value provided by the passive
side when transitioning the QP to RTR state, rather than the value
given in the connect request.  Without this change, the RTR transition
may fail if the passive side supports fewer responder_resources than
that in the request.

For code consistency and to protect against QP destruction, restructure
overriding initiator_depth to match how responder_resources is set.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Drop support for the original QHT7040 board · 1f813ca8

Committed by Dave Olson on Jan 06, 2008

The original QHT7040 had significant performance issues so there was an
additional check in the driver for a newer serial number.  Support for
the small quantities of that board shipped has been dropped, so this
patch removes the special checks to simplify the code.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Add ipath_read_ireg() abstraction · 7da0498e

Committed by Arthur Jones on Jan 06, 2008

Different chips have different width interrupt status registers, so add
a flag and accessor function to decide which width register read to use.
Signed-off-by: Arthur Jones <arthur.jones@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Add flag and handling for chips with swapped register bug · 4ea61b54

Committed by Ralph Campbell on Jan 06, 2008

The 6110 had a bug that caused some registers to be swapped; it was
fixed for the 7220 (and didn't affect the 6120 because it had fewer
registers).  This adds a flag and related code to handle that, and
includes some minor cleanups in the same area.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>

IB/ipath: Port config has on-chip effects for 7220 · 60948a41

Committed by Ralph Campbell on Jan 06, 2008

The number of configured ports for the 7220 changes the number of eager
TIDs available per port, for all but port 0 (kernel port) which remains
constant, so add a field to give port0 count separate from the portdata
structure.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Allow more flexible user register alignments · a18e26ae

Committed by Ralph Campbell on Jan 06, 2008

User registers have different alignments on different chips (4KB on
older, 64KB on 7220).  Allow mapping the user registers on kernels with
page sizes up to 64K.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Clean up some comments · 9e2ef36b

Committed by Dave Olson on Jan 06, 2008

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Export hardware counters more consistently · 3029fcc3

Committed by Ralph Campbell on Jan 06, 2008

Various hardware counters are exported via the ipath file system (since
it is binary data). The old file format was very dependent on the HW
offsets for these registers. Newer HCA chips can have different
counters at different offsets. This patch adds a level of indirection
to make the file format consistent across HCAs.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: MAD performance sampling registers support · 6c719cae

Committed by Ralph Campbell on Jan 06, 2008

Add support for QLogic HCAs which have hardware performance sampling
registers for PortSamplesControl and PortSamplesResult MADs.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/srp: Add identifying information to log messages · 7aa54bd7

Committed by David Dillow on Jan 07, 2008

When you have multiple targets, it gets really confusing when you try
to track down who did a reset when there is no identifying information
in the log message, especially when the same extension ID is mapped
through two different local IB ports.  So, add an identifier that can
be used to track back to which local IB port/remote target pair is the
one having problems.
Signed-off-by: David Dillow <dillowda@ornl.gov>
Acked-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IPoIB/CM: Enable SRQ support on HCAs that support fewer than 16 SG entries · 586a6934

Committed by Pradeep Satyanarayana on Dec 21, 2007

Some HCAs (such as ehca2) support SRQ, but only support fewer than 16 SG
entries for SRQs. Currently IPoIB/CM implicitly assumes all HCAs will
support 16 SG entries for SRQs (to handle a 64K MTU with 4K pages). This
patch removes that restriction by limiting the maximum MTU in connected
mode to what the maximum number of SRQ SG entries allows.

This patch addresses <https://bugs.openfabrics.org/show_bug.cgi?id=728>
Signed-off-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
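
The arithmetic behind this change (16 SG entries of 4K each give a 64K MTU) can be sketched as follows. This is a simplified illustration, not the driver code: `CM_MAX_SG` and `cm_max_mtu()` are hypothetical names, and the sketch ignores any per-packet header overhead that a real driver would subtract.

```c
#include <assert.h>
#include <stddef.h>

/* SG entries needed for a 64K MTU with 4K pages, per the commit text. */
#define CM_MAX_SG 16

/* Hypothetical helper: largest connected-mode MTU that fits in the
 * SRQ scatter/gather entries the HCA reports, one page per entry. */
static size_t cm_max_mtu(unsigned int max_srq_sge, size_t page_size)
{
	unsigned int frags = max_srq_sge < CM_MAX_SG ? max_srq_sge : CM_MAX_SG;

	return (size_t)frags * page_size;
}
```

Under these assumptions, an HCA that reports only 8 SRQ SG entries with 4K pages would be limited to a 32K MTU rather than being refused SRQ support.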

IB/srp: Enable SG list chaining · fff09a8e

Committed by David Dillow on Dec 19, 2007

By default, the SCSI mid-layer seems to send down 512KB requests
(sg_tablesize = 256), with some requests occasionally combined. By
allowing the mid-layer to chain requests, we can easily grow to 1024KB
or larger -- I've tested 4096KB I/O requests with no problems.

I looked through the DMA paths on the hardware drivers to ensure they
could take advantage of the SG chaining, and it seems that every one
except ipath uses the system's DMA routines, which have been converted
to handle chaining.  ipath looks like it should be OK, but I have no
way to test it.
Signed-off-by: David Dillow <dillowda@ornl.gov>

[ Tested on ipath.  - Roland ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/srp: Respect target credit limit · 8cba2077

Committed by David Dillow on Dec 19, 2007

The current SRP initiator will send requests even if it has no credits
available.  The results of sending extra requests are vendor specific,
but on some devices, overrunning credits will cost 85% of peak
performance -- e.g. 100 MB/s vs 720 MB/s.  Other devices may just drop
the requests.

This patch will tell the SCSI midlayer to queue requests if there are
fewer than two credits remaining, and will not issue a task management
request if there are no credits remaining.  The mid-layer will retry
the queued command once an outstanding command completes.

The patch also removes the unlikely() in __srp_get_tx_iu(), as it is
not at all unlikely to hit this limit under heavy load.
Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
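
The credit rule stated in this commit can be written as a small predicate. This is a sketch of the rule as described in the message, not the driver source: `enum srp_req_kind` and `srp_may_send()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request kinds for illustration only. */
enum srp_req_kind { SRP_KIND_NORMAL, SRP_KIND_TASK_MGMT };

/* Returns true if a request of this kind may be sent now.
 * Normal commands are held back (queued by the mid-layer) when fewer
 * than two credits remain; a task management request needs at least
 * one credit. */
static bool srp_may_send(int credits, enum srp_req_kind kind)
{
	int min_credits = (kind == SRP_KIND_TASK_MGMT) ? 1 : 2;

	return credits >= min_credits;
}
```

With these thresholds, the last remaining credit is effectively reserved for task management, which is one plausible reading of why the two limits differ.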

IPoIB: improve IPv4/IPv6 to IB mcast mapping functions · a9e527e3

Committed by Rolf Manderscheid on Dec 10, 2007

An IPoIB subnet on an IB fabric that spans multiple IB subnets can't
use link-local scope in multicast GIDs.  The existing routines that
map IP/IPv6 multicast addresses into IB link-level addresses hard-code
the scope to link-local, and they also leave the partition key field
uninitialised.  This patch adds a parameter (the link-level broadcast
address) to the mapping routines, allowing them to initialise both the
scope and the P_Key appropriately, and fixes up the call sites.

The next step will be to add a way to configure the scope for an IPoIB
interface.
Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Changes for fields moving from devdata to portdata · 755807a2

Committed by Dave Olson on Dec 06, 2007

This patch moves some arrays that were defined per-device to be
variables defined in the per context data structure, thus avoiding extra
kzalloc() calls.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Generalize some xxx_SHIFT macros · d8274869

Committed by Dave Olson on Dec 21, 2007

In preparation for upcoming chips that have different values for
INFINIPATH_R_PORTENABLE_SHIFT, INFINIPATH_R_INTRAVAIL_SHIFT,
INFINIPATH_R_TAILUPD_SHIFT, and portcfg_shift, remove the shared
#defines and use device-specific variables instead.
Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: kreceive uses portdata rather than devdata · c59a80ac

Committed by Ralph Campbell on Dec 20, 2007

kreceive now takes a portdata * instead of a devdata *, plus other
kreceive-related cleanups.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Cleanup ipath_get_egrbuf() · d65708f3

Committed by Ralph Campbell on Dec 21, 2007

Remove an unused parameter and fix up the comment.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Fix RNR NAK handling · cc65edcf

Committed by Ralph Campbell on Dec 14, 2007

This patch fixes a couple of minor problems with RNR NAK handling:
 - The insertion sort was causing extra delay when inserting ahead
   vs. behind an existing entry on the list.
 - A resend of the first packet of a message that is still not ready
   needs another RNR NAK (i.e., it was suppressed when it shouldn't
   have been).
 - Also, the resend tasklet doesn't need to be woken up unless the
   ACK/NAK actually indicates progress has been made.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ehca: Forward event client-reregister-required to registered clients · e57d62a1

Committed by Hoang-Nam Nguyen on Dec 20, 2007

This patch allows ehca to forward the client-reregister-required event
to registered clients.  One such event is generated by a switch, e.g.,
after it reboots.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mlx4: Micro-optimize mlx4_ib_poll_one() · b3226184

Committed by Roland Dreier on Jan 25, 2008

Rather than byte-swapping cqe->g_mlpath_rqpn each time we extract a
field from it, byte-swap it once into a temporary variable.  This
results in smaller, better code -- e.g., on 32-bit x86:

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5)
function                                     old     new   delta
mlx4_ib_poll_cq                             1188    1183      -5
Signed-off-by: Roland Dreier <rolandd@cisco.com>
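
The swap-once pattern can be illustrated in user-space C. This is a sketch only: the struct `wc_fields`, the function `decode_g_mlpath_rqpn()` and the bit layout used here (low 24 bits as a QP number, the next 7 bits as path bits, the top bit as a GRH flag) are assumptions made for the example and are not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h> /* ntohl(), htonl() */

/* Hypothetical container for the fields pulled out of the word. */
struct wc_fields {
	uint32_t src_qp;
	uint8_t  dlid_path_bits;
	int      has_grh;
};

/* Convert the big-endian word to host order once, then extract every
 * field from the temporary instead of swapping per field. */
static struct wc_fields decode_g_mlpath_rqpn(uint32_t be_word)
{
	uint32_t w = ntohl(be_word); /* the single byte swap */
	struct wc_fields f;

	f.src_qp         = w & 0xffffff;        /* assumed: bits 0..23  */
	f.dlid_path_bits = (w >> 24) & 0x7f;    /* assumed: bits 24..30 */
	f.has_grh        = (w & 0x80000000u) != 0; /* assumed: bit 31   */
	return f;
}
```

The design point is simply that each additional field costs a mask and shift on the temporary rather than another byte swap.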

IB/mthca: Remove MSI support as scheduled · e57895d3

Committed by Adrian Bunk on Jan 01, 2008

Remove MSI support from the mthca driver, as scheduled.  There is no
reason to use MSI instead of MSI-X, since MSI-X performs better.  No
one has spoken up since MSI support was deprecated in commit f6be6fbe
("IB/mthca: Schedule MSI support for removal"), so apparently the MSI
support is unused.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/iser: Typo fix (s/destory/destroy/) · 38dc732f

Committed by Oliver Pinter on Jan 25, 2008

Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/iser: update URLs of iSER docs · bd5d7a85

Committed by Erez Zilber on Jan 25, 2008

Signed-off-by: Erez Zilber <erezz@voltaire.com>

RDMA/cma: add support for rdma_migrate_id() · 88314e4d

Committed by Sean Hefty on Nov 14, 2007

This is based on user feedback from Doug Ledford at Red Hat:

Events that occur on an rdma_cm_id are reported to userspace through an
event channel.  Connection request events are reported on the event
channel associated with the listen.  When the connection is accepted, a
new rdma_cm_id is created and automatically uses the listen event
channel.  This is suboptimal where the user only wants listen events on
that channel.

Additionally, it may be desirable to have events related to connection
establishment use a different event channel than those related to
already established connections.

Allow the user to migrate an rdma_cm_id between event channels. All
pending events associated with the rdma_cm_id are moved to the new event
channel.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

RDMA/cma: Reenable device removal on passive side · 45d9478d

Committed by Vladimir Sokolovsky on Dec 07, 2007

Enable conn_id remove on the passive side after connection
establishment.  This corrects an issue where the IB driver can't be
unloaded after running applications over RDS.  The 'dev_remove' counter
does not reach 0 for established connections on the passive side.

This problem is limited to device removal, and only occurs on the
passive side if there are established connections.
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mad: Fix incorrect access to items on local_list · b61d92d8

Committed by Sean Hefty on Nov 30, 2007

In cancel_mads(), MADs are moved from the wait_list and local_list
to a cancel_list for processing.  However, the structures on these two
lists are not the same.  The wait_list references struct
ib_mad_send_wr_private, but local_list references struct
ib_mad_local_private.  cancel_mads() treats all items moved to the
cancel_list as struct ib_mad_send_wr_private.  This leads to a system
crash when requests are moved from the local_list to the cancel_list.

Fix this by leaving local_list alone.  All requests on the local_list
have completed and are just awaiting processing by a queued worker thread.

Bug (crash) reported by Dotan Barak <dotanb@dev.mellanox.co.il>.
Problem with local_list access reported by Robert Reynolds
<rreynolds@opengridcomputing.com>.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/cm: Add basic performance counters · 9af57b7a

Committed by Sean Hefty on Jul 16, 2007

Add performance/debug counters to track sent/received messages, retries,
and duplicates. Counters are tracked per CM message type, per port.

The counters are always enabled, so intrusive state tracking is not done.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mad: Report number of times a mad was retried · 4fc8cd49

Committed by Sean Hefty on Nov 27, 2007

To allow ULPs to tune timeout values and capture retry statistics,
report the number of times that a mad send operation was retried.

For RMPP mads, report the total number of times that any portion
(send window) of the send operation was retried.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/multicast: Report errors on multicast groups if P_key changes · 547af765

Committed by Sean Hefty on Oct 22, 2007

P_key changes can invalidate multicast groups.  Report errors on all
multicast groups affected by a pkey change.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB: Spelling fixes in comments · 94545e8c

Committed by Joe Perches on Dec 17, 2007

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

mlx4_core: Fix max_eqs masking in QUERY_DEV_CAP · 5920869f

Committed by Jack Morgenstein on Dec 10, 2007

log_max_eqs is a 4-bit field, not a 3-bit field in the response to the
QUERY_DEV_CAP FW command, so we should mask with 0xf instead of 0x7
when reading it.

Found by Yossi Leybovitch of Mellanox.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
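
The effect of the mask width can be shown with a short user-space sketch. The helper names `log_max_eqs_from_field()` and `max_eqs_from_field()` are hypothetical, and the sketch assumes the field arrives in the low bits of a single byte, which the commit message does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract the 4-bit log_max_eqs value.
 * Masking with 0x7 instead would drop bit 3, so a value such as 9
 * would be read back as 1. */
static unsigned int log_max_eqs_from_field(uint8_t field)
{
	return field & 0xf;
}

/* Hypothetical helper: the EQ count is 2 to the power of the log value. */
static unsigned int max_eqs_from_field(uint8_t field)
{
	return 1u << log_max_eqs_from_field(field);
}
```

For a device reporting a log value of 9, the 3-bit mask would yield 2 EQs instead of 512, which shows why the narrower mask only misbehaves on values of 8 and above.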

IB/ipath: Convert from .nopage to .fault · 3c845086

Committed by Nick Piggin on Dec 13, 2007

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

openanolis / cloud-kernel · synced successfully about 1 year ago