1. 10 May 2011 (1 commit)
  2. 27 Apr 2011 (1 commit)
  3. 31 Mar 2011 (1 commit)
  4. 25 Mar 2011 (1 commit)
  5. 24 Mar 2011 (2 commits)
  6. 23 Mar 2011 (1 commit)
    • IB: Increase DMA max_segment_size on Mellanox hardware · 7f9e5c48
      David Dillow committed
      By default, each device is assumed to be able to handle only 64 KB
      chunks during DMA. By giving the segment size a larger value, the
      block layer will coalesce more S/G entries together for SRP,
      allowing larger requests with the same sg_tablesize setting.  The
      block layer is the only direct user of it, though a few IOMMU
      drivers reference it as well for their *_map_sg coalescing code:
      pci-gart_64 on x86, and a smattering on sparc, powerpc, and ia64.
      
      Since other IB protocols could potentially see larger segments with
      this, let's check those:
      
       - iSER is fine, because it limits its maximum request size to 512
         KB, so we'll never overrun the page vector in struct iser_page_vec
         (128 entries currently). It is independent of the DMA segment size,
         and handles multi-page segments already.
      
       - IPoIB is fine, as it maps each page individually, and doesn't use
         ib_dma_map_sg().
      
       - RDS appears to do the right thing and has no dependencies on DMA
         segment size, but I don't claim to have done a complete audit.
      
       - NFSoRDMA and 9p are OK -- they do not use ib_dma_map_sg(), so
         they don't care about the coalescing.
      
       - Lustre's ko2iblnd does not care about coalescing -- it properly
         walks the returned sg list.
      
      This patch ups the value on Mellanox hardware to 1 GB, which matches
      reported firmware limits on mlx4.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
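      The knob itself is a one-line, per-device hint. A minimal sketch of
      a probe path raising it, assuming pdev is the PCI device being set
      up (1 GB matches the mlx4 firmware limit cited above; the helper
      name raise_dma_seg_limit is illustrative):

          #include <linux/pci.h>
          #include <linux/dma-mapping.h>

          static void raise_dma_seg_limit(struct pci_dev *pdev)
          {
                  /* default is 64 KB; advertising 1 GB lets the block
                   * layer merge far more S/G entries per segment */
                  dma_set_max_seg_size(&pdev->dev, 1024 * 1024 * 1024);
          }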
  7. 19 Mar 2011 (1 commit)
  8. 18 Mar 2011 (3 commits)
  9. 16 Mar 2011 (10 commits)
    • IB/srp: try to use larger FMR sizes to cover our mappings · be8b9814
      David Dillow committed
      Now that we can get larger SG lists, we can take advantage of HCAs that
      allow us to use larger FMR sizes. In many cases, we can use up to 512
      entries, so start there and work our way down.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
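      A minimal sketch of that step-down, assuming param is an otherwise
      initialized struct ib_fmr_pool_param (the helper name is
      illustrative; field and function names follow the FMR pool API of
      this era):

          #include <rdma/ib_fmr_pool.h>

          static struct ib_fmr_pool *
          create_widest_fmr_pool(struct ib_pd *pd,
                                 struct ib_fmr_pool_param *param)
          {
                  struct ib_fmr_pool *pool = ERR_PTR(-ENOMEM);

                  /* start at 512 pages per FMR, halving until the
                   * HCA accepts the pool */
                  for (param->max_pages_per_fmr = 512;
                       param->max_pages_per_fmr >= 2;
                       param->max_pages_per_fmr /= 2) {
                          pool = ib_create_fmr_pool(pd, param);
                          if (!IS_ERR(pool))
                                  break;
                  }
                  return pool;
          }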
    • IB/srp: add support for indirect tables that don't fit in SRP_CMD · c07d424d
      David Dillow committed
      This allows us to guarantee the ability to submit up to 8 MB requests
      based on the current value of SCSI_MAX_SG_CHAIN_SEGMENTS. While FMR will
      usually condense the requests into 8 SG entries, it is imperative that
      the target support external tables in case the FMR mapping fails or is
      not supported.
      
      We add a safety valve to allow targets without the needed support to
      reap the benefits of the large tables, but fail in a manner that lets
      the user know that the data didn't make it to the device. The user must
      add "allow_ext_sg=1" to the target parameters to indicate that the
      target has the needed support.
      
      If indirect_sg_entries is not specified in the module options, then
      the sg_tablesize for the target will default to cmd_sg_entries unless
      overridden by the target options.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
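      As a usage sketch, the flag rides along in the add_target string
      (every value below is a placeholder):

          echo "id_ext=200100a0b8130611,ioc_guid=00a0b80200402bd9,\
          dgid=fe800000000000000002c90200402bd5,pkey=ffff,\
          service_id=200100a0b8130611,allow_ext_sg=1,sg_tablesize=2048" \
              > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target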
    • IB/srp: rework mapping engine to use multiple FMR entries · 8f26c9ff
      David Dillow committed
      Instead of forcing all of the S/G entries to fit in one FMR, and falling
      back to indirect descriptors if that fails, allow the use of as many
      FMRs as needed to map the request. This lays the groundwork for allowing
      indirect descriptor tables that are larger than can fit in the command
      IU, but should marginally improve performance now by reducing the number
      of indirect descriptors needed.
      
      We increase the minimum page size for the FMR pool to 4K, as larger
      pages help increase the coverage of each FMR, and it is rare that the
      kernel would send down a request with scattered 512 byte fragments.
      
      This patch also moves some of the target initialization code after
      the parsing of options, to keep it together with the new code that
      needs to allocate memory based on the options given.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
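      Conceptually, the reworked engine walks the S/G list and packs
      pages into the current FMR, flushing to a fresh one whenever it
      fills up or a discontinuity appears. A simplified sketch, with the
      hypothetical helpers srp_map_fmr() and srp_add_page() standing in
      for the real state machine:

          for_each_sg(scat, sg, count, i) {
                  u64 addr = ib_sg_dma_address(ibdev, sg);
                  unsigned int len = ib_sg_dma_len(ibdev, sg);

                  while (len) {
                          if (state->npages == max_pages_per_fmr)
                                  srp_map_fmr(state);  /* flush full FMR */
                          srp_add_page(state, addr);   /* one 4K page */
                          addr += fmr_page_size;
                          len  -= min_t(unsigned int, len, fmr_page_size);
                  }
          }
          srp_map_fmr(state);  /* flush the final, partial FMR */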
    • IB/srp: allow sg_tablesize to be set for each target · 49248644
      David Dillow committed
      Different configurations of target software allow differing max sizes of
      the command IU. Allowing this to be changed per-target allows all
      targets on an initiator to get an optimal setting.
      
      We deprecate srp_sg_tablesize and replace it with cmd_sg_entries in
      preparation for allowing more indirect descriptors than can fit in the
      IU.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
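      A usage sketch (values are illustrative): the module option sets
      the per-initiator default, and the per-target string overrides it:

          # default for all targets created by this initiator
          modprobe ib_srp cmd_sg_entries=64

          # per-target override in the add_target string
          echo "...,cmd_sg_entries=255,..." \
              > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target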
    • IB/srp: move IB CM setup completion into its own function · 961e0be8
      David Dillow committed
      This is to clean up prior to further changes.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
    • IB/srp: always avoid non-zero offsets into an FMR · 8c4037b5
      David Dillow committed
      It is unclear exactly how this code works around Mellanox SRP
      targets, or whether the problem is on the target side or in the
      HCA itself. Out of an abundance of caution, we should always
      enable the workaround.
      Signed-off-by: David Dillow <dillowda@ornl.gov>
    • RDMA/cma: Replace global lock in rdma_destroy_id() with id-specific one · a396d43a
      Sean Hefty committed
      rdma_destroy_id currently uses the global rdma cm 'lock' to test if an
      rdma_cm_id has been bound to a device.  This prevents an active
      address resolution callback handler from assigning a device to the
      rdma_cm_id after rdma_destroy_id checks for one.
      
      Instead, we can replace the use of the global lock around the check to
      the rdma_cm_id device pointer by setting the id state to destroying,
      then flushing all active callbacks.  The latter is accomplished by
      acquiring and releasing the handler_mutex.  Any active handler will
      complete first, and any newly scheduled handlers will find the
      rdma_cm_id in an invalid state.
      
      In addition to optimizing the current locking scheme, the use of the
      rdma_cm_id mutex is a more intuitive synchronization mechanism than
      that of the global lock.  These changes are based on feedback from
      Doug Ledford <dledford@redhat.com> while he was trying to debug a
      crash in the rdma cm destroy path.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
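      A minimal sketch of the destroy-side flush, assuming the
      rdma_id_private layout of this era (the function name is
      illustrative); the empty lock/unlock pair is the whole trick:

          static void cma_wait_for_handlers(struct rdma_id_private *id_priv)
          {
                  /* a running handler holds handler_mutex, so taking it
                   * here waits that handler out; handlers that start
                   * later see the destroying state and bail */
                  mutex_lock(&id_priv->handler_mutex);
                  mutex_unlock(&id_priv->handler_mutex);
          }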
    • IB/cm: Cancel pending LAP message when exiting IB_CM_ESTABLISH state · 8d8ac865
      Sean Hefty committed
      This problem was reported by Moni Shoua <monis@mellanox.com> and Amir
      Vadai <amirv@mellanox.com>:
      
      	When destroying a cm_id from the context of a work queue, and
      	if the lap_state of this cm_id is IB_CM_LAP_SENT, we need to
      	release the reference on this id that was taken upon the send
      	of the LAP message.  Otherwise, if the expected APR message
      	gets lost, it is only after a long time that the reference
      	will be released, and during that time the work handler
      	thread is not available to process other things.
      
      It turns out that we need to cancel any pending LAP messages whenever
      we transition out of the IB_CM_ESTABLISH state.  This occurs when
      disconnecting - either sending or receiving a DREQ.  It can also
      happen in a corner case where we receive a REJ message after sending
      an RTU, followed by a LAP.  Add checks and cancel any outstanding LAP
      messages in these three cases.
      
      Canceling the LAP when sending a DREQ fixes the destroy problem
      reported by Moni.  When a cm_id is destroyed in the IB_CM_ESTABLISHED
      state, it sends a DREQ to the remote side to notify the peer that the
      connection is going away.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
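      A hedged sketch of the cancellation at one such exit point,
      assuming cm_id_priv->msg still references the LAP's MAD send
      buffer:

          /* leaving IB_CM_ESTABLISHED: a LAP still in flight would pin
           * the cm_id until the APR timeout, so cancel it now */
          if (cm_id_priv->id.lap_state == IB_CM_LAP_SENT)
                  ib_cancel_mad(cm_id_priv->av.port->mad_agent,
                                cm_id_priv->msg);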
    • IB/cm: Bump reference count on cm_id before invoking callback · 29963437
      Sean Hefty committed
      When processing a SIDR REQ, the ib_cm allocates a new cm_id.  The
      refcount of the cm_id is initialized to 1.  However, cm_process_work
      will decrement the refcount after invoking all callbacks.  The result
      is that the cm_id will end up with refcount set to 0 by the end of the
      sidr req handler.
      
      If a user tries to destroy the cm_id, the destruction will proceed,
      under the incorrect assumption that no other threads are referencing
      the cm_id.  This can lead to a crash when the cm callback thread tries
      to access the cm_id.
      
      This problem was noticed as part of a larger investigation of
      kernel crashes in the rdma_cm when running on a real-time OS.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Acked-by: Doug Ledford <dledford@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
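      The shape of the fix, in sketch form: since cm_process_work()
      drops one reference after the callbacks run, the handler pins the
      id first:

          atomic_inc(&cm_id_priv->refcount); /* pin across callbacks */
          cm_process_work(cm_id_priv, work); /* drops one ref when done */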
    • RDMA/cma: Fix crash in request handlers · 25ae21a1
      Sean Hefty committed
      Doug Ledford and Red Hat reported a crash when running the rdma_cm on
      a real-time OS.  The crash has the following call trace:
      
          cm_process_work
             cma_req_handler
                cma_disable_callback
                rdma_create_id
                   kzalloc
                   init_completion
                cma_get_net_info
                cma_save_net_info
                cma_any_addr
                   cma_zero_addr
                rdma_translate_ip
                   rdma_copy_addr
                cma_acquire_dev
                   rdma_addr_get_sgid
                   ib_find_cached_gid
                   cma_attach_to_dev
                ucma_event_handler
                   kzalloc
                   ib_copy_ah_attr_to_user
                cma_comp
      
      [ preempted ]
      
          cma_write
              copy_from_user
              ucma_destroy_id
                 copy_from_user
                 _ucma_find_context
                 ucma_put_ctx
                 ucma_free_ctx
                    rdma_destroy_id
                       cma_exch
                       cma_cancel_operation
                       rdma_node_get_transport
      
              rt_mutex_slowunlock
              bad_area_nosemaphore
              oops_enter
      
      They were able to reproduce the crash multiple times with the
      following details:
      
          Crash seems to always happen on the:
                  mutex_unlock(&conn_id->handler_mutex);
          as conn_id looks to have been freed during this code path.
      
      An examination of the code shows that a race exists in the request
      handlers.  When a new connection request is received, the rdma_cm
      allocates a new connection identifier.  This identifier has a single
      reference count on it.  If a user calls rdma_destroy_id() from another
      thread after receiving a callback, rdma_destroy_id will proceed to
      destroy the id and free the associated memory.  However, the request
      handlers may still be in the process of running.  When control returns
      to the request handlers, they can attempt to access the newly created
      identifiers.
      
      Fix this by holding a reference on the newly created rdma_cm_id until
      the request handler is through accessing it.
      Signed-off-by: Sean Hefty <sean.hefty@intel.com>
      Acked-by: Doug Ledford <dledford@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
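      The same pin-while-calling-up pattern, sketched for the request
      handler (names follow cma.c of this era):

          /* keep the new id alive for the duration of the upcall, so a
           * concurrent rdma_destroy_id() can free it only after the
           * handler lets go */
          atomic_inc(&conn_id->refcount);
          ret = conn_id->id.event_handler(&conn_id->id, &event);
          cma_deref_id(conn_id);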
  10. 15 Mar 2011 (10 commits)
  11. 13 Mar 2011 (3 commits)
  12. 03 Mar 2011 (1 commit)
  13. 02 Mar 2011 (2 commits)
  14. 25 Feb 2011 (1 commit)
  15. 23 Feb 2011 (1 commit)
  16. 18 Feb 2011 (1 commit)
    • IB/qib: Prevent double completions after a timeout or RNR error · c0af2c05
      Mike Marciniszyn committed
      There is a double completion associated with error handling for RC QPs.
      
      The sequence is:
      
       - The do_rc_ack() routine fields an RNR nack and there are 0
         rnr_retries configured on the QP.
       - qib_error_qp() stops the pending timer
       - qib_rc_send_complete() is called from sdma_complete()
       - qib_rc_send_complete() starts the timer because the msb of the psn
         just completed says an ack is needed.
       - a bunch of flushes occur as ipoib posts WQEs to an error'ed QP
       - rc_timeout() calls qib_restart_rc()
       - qib_restart_rc() calls qib_send_complete() with an
         IB_WC_RETRY_EXC_ERR on a wqe that has already been completed in
         the past
      
      The fix avoids starting the timer since another packet will never
      arrive.
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
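      A hedged sketch of the guard, assuming qib's state-ops conventions
      (ib_qib_state_ops, QIB_PROCESS_RECV_OK): arm the retry timer only
      while the QP can still receive, since an error'ed QP will never
      see the awaited ACK:

          if (ib_qib_state_ops[qp->state] & QIB_PROCESS_RECV_OK)
                  start_timer(qp); /* otherwise a later rc_timeout()
                                    * would complete the WQE twice */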