提交 · 2ade0b7f795293ca0b11c1bf3d2f99ed33e04a2a · openanolis / cloud-kernel

13 9月, 2012 2 次提交

IPoIB: Fix AB-BA deadlock when deleting neighbours · b5120a6e

由 Shlomo Pongratz 提交于 8月 29, 2012

Lockdep points out a circular locking dependency betwwen the ipoib
device priv spinlock (priv->lock) and the neighbour table rwlock
(ntbl->rwlock).

In the normal path, ie neigbour garbage collection task, the neigh
table rwlock is taken first and then if the neighbour needs to be
deleted, priv->lock is taken.

However in some error paths, such as in ipoib_cm_handle_tx_wc(),
priv->lock is taken first and then ipoib_neigh_free routine is called
which in turn takes the neighbour table ntbl->rwlock.

The solution is to get rid the neigh table rwlock completely and use
only priv->lock.
Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

b5120a6e

IPoIB: Fix memory leak in the neigh table deletion flow · 66172c09

由 Shlomo Pongratz 提交于 8月 29, 2012

If the neighbours hash table is empty when unloading the module, then
ipoib_flush_neighs(), the cleanup routine, isn't called and the
memory used for the hash table itself leaked.

To fix this, ipoib_flush_neighs() is allways called, and another
completion object is added to signal when the table is freed.

Once invoked, ipoib_flush_neighs() flushes all the neighbours (if
there are any), calls the the hash table RCU free routine, which now
signals completion of the deletion process, and waits for the last
neighbour to be freed.
Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

66172c09

15 8月, 2012 1 次提交

IB/ipoib: Fix RCU pointer dereference of wrong object · 6c723a68

由 Shlomo Pongratz 提交于 8月 13, 2012

Commit b63b70d8 ("IPoIB: Use a private hash table for path lookup
in xmit path") introduced a bug where in ipoib_neigh_free() (which is
called from a few errors flows in the driver), rcu_dereference() is
invoked with the wrong pointer object, which results in a crash.
Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

6c723a68

30 7月, 2012 1 次提交

IPoIB: Use a private hash table for path lookup in xmit path · b63b70d8

由 Shlomo Pongratz 提交于 7月 24, 2012

Dave Miller <davem@davemloft.net> provided a detailed description of
why the way IPoIB is using neighbours for its own ipoib_neigh struct
is buggy:

    Any time an ipoib_neigh is changed, a sequence like the following is made:

    			spin_lock_irqsave(&priv->lock, flags);
    			/*
    			 * It's safe to call ipoib_put_ah() inside
    			 * priv->lock here, because we know that
    			 * path->ah will always hold one more reference,
    			 * so ipoib_put_ah() will never do more than
    			 * decrement the ref count.
    			 */
    			if (neigh->ah)
    				ipoib_put_ah(neigh->ah);
    			list_del(&neigh->list);
    			ipoib_neigh_free(dev, neigh);
    			spin_unlock_irqrestore(&priv->lock, flags);
    			ipoib_path_lookup(skb, n, dev);

    This doesn't work, because you're leaving a stale pointer to the freed up
    ipoib_neigh in the special neigh->ha pointer cookie.  Yes, it even fails
    with all the locking done to protect _changes_ to *ipoib_neigh(n), and
    with the code in ipoib_neigh_free() that NULLs out the pointer.

    The core issue is that read side calls to *to_ipoib_neigh(n) are not
    being synchronized at all, they are performed without any locking.  So
    whether we hold the lock or not when making changes to *ipoib_neigh(n)
    you still can have threads see references to freed up ipoib_neigh
    objects.

    	cpu 1			cpu 2
    	n = *ipoib_neigh()
    				*ipoib_neigh() = NULL
    				kfree(n)
    	n->foo == OOPS

    [..]

    Perhaps the ipoib code can have a private path database it manages
    entirely itself, which holds all the necessary information and is
    looked up by some generic key which is available easily at transmit
    time and does not involve generic neighbour entries.

See <http://marc.info/?l=linux-rdma&m=132812793105624&w=2> and
<http://marc.info/?l=linux-rdma&w=2&r=1&s=allows+references+to+freed+memory&q=b>
for the full discussion.

This patch aims to solve the race conditions found in the IPoIB driver.

The patch removes the connection between the core networking neighbour
structure and the ipoib_neigh structure.  In addition to avoiding the
race described above, it allows us to handle SKBs carrying IP packets
that don't have any associated neighbour.

We add an ipoib_neigh hash table with N buckets where the key is the
destination hardware address.  The ipoib_neigh is fetched from the
hash table and instead of the stashed location in the neighbour
structure. The hash table uses both RCU and reference counting to
guarantee that no ipoib_neigh instance is ever deleted while in use.

Fetching the ipoib_neigh structure instance from the hash also makes
the special code in ipoib_start_xmit that handles remote and local
bonding failover redundant.

Aged ipoib_neigh instances are deleted by a garbage collection task
that runs every M seconds and deletes every ipoib_neigh instance that
was idle for at least 2*M seconds. The deletion is safe since the
ipoib_neigh instances are protected using RCU and reference count
mechanisms.

The number of buckets (N) and frequency of running the GC thread (M),
are taken from the exported arb_tbl.
Signed-off-by: NShlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

b63b70d8

05 7月, 2012 1 次提交
- D
  ipoib: Convert over to dev_lookup_neigh_skb(). · 178709bb
  由 David S. Miller 提交于 7月 02, 2012
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  178709bb
09 2月, 2012 2 次提交

IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses · 936d7de3

由 Roland Dreier 提交于 2月 07, 2012

Commit a0417fa3 ("net: Make qdisc_skb_cb upper size bound
explicit.") made it possible for a netdev driver to use skb->cb
between its header_ops.create method and its .ndo_start_xmit
method.  Use this in ipoib_hard_header() to stash away the LL address
(GID + QPN), instead of the "ipoib_pseudoheader" hack.  This allows
IPoIB to stop lying about its hard_header_len, which will let us fix
the L2 check for GRO.
Signed-off-by: NRoland Dreier <roland@purestorage.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

936d7de3

IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses · 377cb4f9

由 Roland Dreier 提交于 2月 07, 2012

Commit a0417fa3 ("net: Make qdisc_skb_cb upper size bound
explicit.") made it possible for a netdev driver to use skb->cb
between its header_ops.create method and its .ndo_start_xmit
method.  Use this in ipoib_hard_header() to stash away the LL address
(GID + QPN), instead of the "ipoib_pseudoheader" hack.  This allows
IPoIB to stop lying about its hard_header_len, which will let us fix
the L2 check for GRO.
Signed-off-by: NRoland Dreier <roland@purestorage.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

377cb4f9

06 12月, 2011 2 次提交

infiniband: ipoib: Sanitize neighbour handling in ipoib_main.c · 17e6abee

由 David Miller 提交于 12月 02, 2011

Reduce the number of dst_get_neighbour_noref() calls within a single
call chain.  Primarily by passing the neighbour pointer down to the
helper functions.

Handle dst_get_neighbour_noref() returning NULL in ipoib_start_xmit()
by incrementing the dropped counter and freeing the packet.  We don't
want it to fall through into the ARP/RARP/multicast handling, since
that should only happen when skb_dst() is NULL.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: NRoland Dreier <roland@purestorage.com>

17e6abee

net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}. · 27217455

由 David Miller 提交于 12月 02, 2011

To reflect the fact that a refrence is not obtained to the
resulting neighbour entry.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
Acked-by: NRoland Dreier <roland@purestorage.com>

27217455

01 12月, 2011 1 次提交

neigh: Add infrastructure for allocating device neigh privates. · 596b9b68

由 David Miller 提交于 7月 25, 2011

netdev->neigh_priv_len records the private area length.

This will trigger for neigh_table objects which set tbl->entry_size
to zero, and the first instances of this will be forthcoming.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

596b9b68

30 11月, 2011 2 次提交

IB: Fix RCU lockdep splats · 580da35a

由 Eric Dumazet 提交于 11月 29, 2011

Commit f2c31e32 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.

Many thanks to Marc Aurele who provided a nice bug report and feedback.
Reported-by: NMarc Aurele La France <tsi@ualberta.ca>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: <stable@kernel.org>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

580da35a

IB/ipoib: Prevent hung task or softlockup processing multicast response · 3874397c

由 Mike Marciniszyn 提交于 11月 21, 2011

This following can occur with ipoib when processing a multicast reponse:

    BUG: soft lockup - CPU#0 stuck for 67s! [ib_mad1:982]
    Modules linked in: ...
    CPU 0:
    Modules linked in: ...
    Pid: 982, comm: ib_mad1 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 ProLiant DL160 G5
    RIP: 0010:[<ffffffff814ddb27>]  [<ffffffff814ddb27>] _spin_unlock_irqrestore+0x17/0x20
    RSP: 0018:ffff8802119ed860  EFLAGS: 00000246
    0000000000000004 RBX: ffff8802119ed860 RCX: 000000000000a299
    RDX: ffff88021086c700 RSI: 0000000000000246 RDI: 0000000000000246
    RBP: ffffffff8100bc8e R08: ffff880210ac229c R09: 0000000000000000
    R10: ffff88021278aab8 R11: 0000000000000000 R12: ffff8802119ed860
    R13: ffffffff8100be6e R14: 0000000000000001 R15: 0000000000000003
    FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 00000000006d4840 CR3: 0000000209aa5000 CR4: 00000000000406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Call Trace:
    [<ffffffffa032c247>] ? ipoib_mcast_send+0x157/0x480 [ib_ipoib]
    [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20
    [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20
    [<ffffffffa03283d4>] ? ipoib_path_lookup+0x124/0x2d0 [ib_ipoib]
    [<ffffffffa03286fc>] ? ipoib_start_xmit+0x17c/0x430 [ib_ipoib]
    [<ffffffff8141e758>] ? dev_hard_start_xmit+0x2c8/0x3f0
    [<ffffffff81439d0a>] ? sch_direct_xmit+0x15a/0x1c0
    [<ffffffff81423098>] ? dev_queue_xmit+0x388/0x4d0
    [<ffffffffa032d6b7>] ? ipoib_mcast_join_finish+0x2c7/0x510 [ib_ipoib]
    [<ffffffffa032dab8>] ? ipoib_mcast_sendonly_join_complete+0x1b8/0x1f0 [ib_ipoib]
    [<ffffffffa02a0946>] ? mcast_work_handler+0x1a6/0x710 [ib_sa]
    [<ffffffffa015f01e>] ? ib_send_mad+0xfe/0x3c0 [ib_mad]
    [<ffffffffa00f6c93>] ? ib_get_cached_lmc+0xa3/0xb0 [ib_core]
    [<ffffffffa02a0f9b>] ? join_handler+0xeb/0x200 [ib_sa]
    [<ffffffffa029e4fc>] ? ib_sa_mcmember_rec_callback+0x5c/0xa0 [ib_sa]
    [<ffffffffa029e79c>] ? recv_handler+0x3c/0x70 [ib_sa]
    [<ffffffffa01603a4>] ? ib_mad_completion_handler+0x844/0x9d0 [ib_mad]
    [<ffffffffa015fb60>] ? ib_mad_completion_handler+0x0/0x9d0 [ib_mad]
    [<ffffffff81088830>] ? worker_thread+0x170/0x2a0
    [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
    [<ffffffff810886c0>] ? worker_thread+0x0/0x2a0
    [<ffffffff8108ddf6>] ? kthread+0x96/0xa0
    [<ffffffff8100c1ca>] ? child_rip+0xa/0x20

Coinciding with stack trace is the following message:

    ib0: ib_address_create failed

The code below in ipoib_mcast_join_finish() will note the above
failure in the address handle but otherwise continue:

                ah = ipoib_create_ah(dev, priv->pd, &av);
                if (!ah) {
                        ipoib_warn(priv, "ib_address_create failed\n");
                } else {

The while loop at the bottom of ipoib_mcast_join_finish() will attempt
to send queued multicast packets in mcast->pkt_queue and eventually
end up in ipoib_mcast_send():

        if (!mcast->ah) {
                if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE)
                        skb_queue_tail(&mcast->pkt_queue, skb);
                else {
                        ++dev->stats.tx_dropped;
                        dev_kfree_skb_any(skb);
                }

My read is that the code will requeue the packet and return to the
ipoib_mcast_join_finish() while loop and the stage is set for the
"hung" task diagnostic as the while loop never sees a non-NULL ah, and
will do nothing to resolve.

There are GFP_ATOMIC allocates in the provider routines, so this is
possible and should be dealt with.

The test that induced the failure is associated with a host SM on the
same server during a shutdown.

This patch causes ipoib_mcast_join_finish() to exit with an error
which will flush the queued mcast packets.  Nothing is done to unwind
the QP attached state so that subsequent sends from above will retry
the join.
Reviewed-by: NRam Vepa <ram.vepa@qlogic.com>
Reviewed-by: NGary Leshner <gary.leshner@qlogic.com>
Signed-off-by: NMike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

3874397c

17 11月, 2011 1 次提交
- D
  infiniband: Update net drivers for netdev_features_t changes. · 9ca36f7d
  由 David S. Miller 提交于 11月 16, 2011
```
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  9ca36f7d
18 8月, 2011 1 次提交

net: remove use of ndo_set_multicast_list in drivers · afc4b13d

由 Jiri Pirko 提交于 8月 16, 2011

replace it by ndo_set_rx_mode
Signed-off-by: NJiri Pirko <jpirko@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

afc4b13d

17 8月, 2011 1 次提交

IPoIB: Fix possible NULL dereference in ipoib_start_xmit() · 22cfb0bf

由 Bernd Schubert 提交于 8月 16, 2011

Fix a bug introduced in 69cce1d1 ("net: Abstract dst->neighbour
accesses behind helpers.") where we might dereference skb_dst(skb)
even if it is NULL, which causes:

    [  240.944030] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
    [  240.948007] IP: [<ffffffffa0366ce9>] ipoib_start_xmit+0x39/0x280 [ib_ipoib]
    [...]
    [  240.948007] Call Trace:
    [  240.948007]  <IRQ>
    [  240.948007]  [<ffffffff812cd5e0>] dev_hard_start_xmit+0x2a0/0x590
    [  240.948007]  [<ffffffff8131f680>] ? arp_create+0x70/0x200
    [  240.948007]  [<ffffffff812e8e1f>] sch_direct_xmit+0xef/0x1c0

Addresses: https://bugzilla.kernel.org/show_bug.cgi?id=41212Signed-off-by: NBernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: NRoland Dreier <roland@purestorage.com>

22cfb0bf

18 7月, 2011 1 次提交
- D
  net: Abstract dst->neighbour accesses behind helpers. · 69cce1d1
  由 David S. Miller 提交于 7月 17, 2011
```
dst_{get,set}_neighbour()
Signed-off-by: NDavid S. Miller <davem@davemloft.net>
```
  69cce1d1
20 4月, 2011 1 次提交

net: infiniband/ulp/ipoib: convert to hw_features · 3d96c74d

由 Michał Mirosław 提交于 4月 19, 2011

Signed-off-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3d96c74d

13 1月, 2011 1 次提交
- J
  RDMA: Use vzalloc() to replace vmalloc()+memset(0) · 948579cd
  由 Joe Perches 提交于 11月 05, 2010
```
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>
```
  948579cd
11 1月, 2011 2 次提交

IPoIB: Add GRO support · 8ae31e5b

由 Or Gerlitz 提交于 1月 10, 2011

Signed-off-by: NOr Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

8ae31e5b

IPoIB: Remove LRO support · 19e364f6

由 Or Gerlitz 提交于 1月 10, 2011

As a first step in moving from LRO to GRO, revert commit af40da89
("IPoIB: add LRO support").  Also eliminate the ethtool set_flags
callback which isn't needed anymore.  Finally, we need to include
<linux/sched.h> directly to get the declaration of restart_syscall()
(which used to be included implicitly through <linux/inet_lro.h>).

Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: NOr Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

19e364f6

27 10月, 2010 1 次提交

replace nested max/min macros with {max,min}3 macro · 732eacc0

由 Hagen Paul Pfeifer 提交于 10月 26, 2010

Use the new {max,min}3 macros to save some cycles and bytes on the stack.
This patch substitutes trivial nested macros with their counterpart.
Signed-off-by: NHagen Paul Pfeifer <hagen@jauu.net>
Cc: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

732eacc0

24 10月, 2010 1 次提交

IPoIB: Set dev_id field of net_device · c3aa9b18

由 Eli Cohen 提交于 9月 20, 2010

Use the net device's dev_id field to encode the port number of the pci
device.  This can be used to to associate a net device with the pci
device's port. The encoding is: dev_id = port - 1.
Signed-off-by: NEli Cohen <eli@mellanox.co.il>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

c3aa9b18

14 10月, 2010 1 次提交

IPoIB: Skip IBoE ports · 7b4c8769

由 Eli Cohen 提交于 9月 27, 2010

IPoIB is IP-over-Infiniband link layer. In the case of IBoE, the link
layer is Ethernet and IP can work directly over Ethernet, so disable
IPoIB for non-IB_LINK_LAYER_INFINIBAND ports.
Signed-off-by: NEli Cohen <eli@mellanox.co.il>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

7b4c8769

07 7月, 2010 1 次提交

IPoIB: Fix world-writable child interface control sysfs attributes · 7a52b34b

由 Or Gerlitz 提交于 6月 06, 2010

Sumeet Lahorani <sumeet.lahorani@oracle.com> reported that the IPoIB
child entries are world-writable; however we don't want ordinary users
to be able to create and destroy child interfaces, so fix them to be
writable only by root.
Signed-off-by: NOr Gerlitz <ogerlitz@voltaire.com>
Cc: <stable@kernel.org>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

7a52b34b

10 12月, 2009 1 次提交

IPoIB: Clear ipoib_neigh.dgid in ipoib_neigh_alloc() · 0cd4d0fd

由 David J. Wilder 提交于 12月 09, 2009

IPoIB can miss a change in destination GID under some conditions.  The
problem is caused when ipoib_neigh->dgid contains a stale address.
The fix is to set ipoib_neigh->dgid to zero in ipoib_neigh_alloc().

This can happen when a system using bonding on its IPoIB interfaces
has switched its active interface from interface A to B and back to A.
The system that fails over will not correctly processes the 2nd
address change, as described below.

When an address has changed neighbor->ha is updated with the new
address.  Each neighbor has an associated ipoib_neigh.
ipoib_neigh->dgid also holds a copy of the remote node's hardware
address.  When an address changes neighbor->ha is updated by the
network layer (arp code) with the new address.  IPoIB detects this
change in ipoib_start_xmit() by comparing neighbor->ha with
ipoib_neigh->dgid.  The bug is that ipoib_neigh->dgid may already
contain the new address (A) thus the change from B to A is missed by
ipoib.  Here is the sequence of events:

    ipoib_neigh->dgid = A  and  neighbor->ha = A

The address is switched to B (the first switch)

    neighbor->ha = B

The change is seen in ipoib_start_xmit() -- neighbor->ha !=
ipoib_neigh->dgid so ipoib_neigh is released, and a new one is
allocated.

The allocator may return the same chunk of memory that was just
released, therefore ipoib_neigh->dgid still contains A at this point.

ipoib_neigh->dgid should be updated in neigh_add_path(), but if the
following conditions are true dgid is not updated:

        1) __path_find() returns a path
        2) path->ah is NULL

The remote system now switches from address B to A, neighbor->ha is
updated to A.

Now we have again : ipoib_neigh->dgid = A  and  neighbor->ha = A

Since the addresses are the same ipoib won't process the change in
address.  Fix this by zeroing out the dgid field when allocating a new
struct ipoib_neigh.
Signed-off-by: NDavid Wilder <dwilder@us.ibm.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

0cd4d0fd

06 9月, 2009 1 次提交

IPoIB: Drop priv->lock before calling ipoib_send() · 721d67cd

由 Roland Dreier 提交于 9月 05, 2009

IPoIB currently must use irqsave locking for priv->lock, since it is
taken from interrupt context in one path. However, ipoib_send() does
skb_orphan(), and the network stack locking is not IRQ-safe.
Therefore we need to make sure we don't hold priv->lock when calling
ipoib_send() to avoid lockdep warnings (the code was almost certainly
safe in practice, since the only code path that takes priv->lock from
interrupt context would never call into the network stack).

Addresses: http://bugzilla.kernel.org/show_bug.cgi?id=13757Reported-by: NBart Van Assche <bart.vanassche@gmail.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

721d67cd

03 6月, 2009 1 次提交

net: skb->dst accessors · adf30907

由 Eric Dumazet 提交于 6月 02, 2009

Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

adf30907

31 5月, 2009 1 次提交

net: unset IFF_XMIT_DST_RELEASE for qeth and ipoib · 86d15cd8

由 Eric Dumazet 提交于 5月 30, 2009

Last two drivers that need skb->dst in their start_xmit() function

Tell dev_hard_start_xmit() to no release it by unsetting IFF_XMIT_DST_RELEASE
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

86d15cd8

21 4月, 2009 1 次提交

IPoIB: Disable NAPI while CQ is being drained · e028cc55

由 Yossi Etigin 提交于 4月 20, 2009

If NAPI is enabled while IPoIB's CQ is being drained, it creates a
race on priv->ibwc between ipoib_poll() and ipoib_drain_cq(), leading
to memory corruption.

The solution is to enable/disable NAPI in ipoib_ib_dev_{open/stop}()
instead of in ipoib_{open/stop}(), and sync NAPI on the INITIALIZED
flag instead on the ADMIN_UP flag. This way NAPI will be disabled when
ipoib_drain_cq() is called.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1587>.
Signed-off-by: NYossi Etigin <yosefe@voltaire.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

e028cc55

22 3月, 2009 1 次提交

infiniband: convert ipoib to net_device_ops · fe8114e8

由 Stephen Hemminger 提交于 3月 20, 2009

Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fe8114e8

18 2月, 2009 1 次提交

IPoIB: In unicast_arp_send(), only free newly-created paths · 71d98b46

由 Jack Morgenstein 提交于 2月 17, 2009

If path_rec_start() returns error, call path_free() only if the path
was newly-created.  If we free an existing path whose valid flag was zero,
(but do not detach it from the list) we cause corruption of the
path list (of which it is a member), and get a kernel crash.

The simplest solution is to not free an existing path -- just leave it
in the list as-is (i.e., with its valid flag cleared).

Thanks to Yossi Etigin of Voltaire for identifying the problem flow
which caused the kernel crash.
Signed-off-by: NJack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: NMoni Shua <monis@voltaire.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

71d98b46

15 1月, 2009 1 次提交

IPoIB: Fix hang in napi_disable() if P_Key is never found · b8a1b1ce

由 Roland Dreier 提交于 1月 14, 2009

After commit fe25c561 ("IPoIB: Don't enable NAPI when it's already
enabled"), if an interface is brought up but the corresponding P_Key
never appears, then ipoib_stop() will hang in napi_disable(), because
ipoib_open() returns before it does napi_enable().

Fix this by changing ipoib_open() to call napi_enable() even if the
P_Key isn't present.
Reported-by: NYossi Etigin <yosefe@Voltaire.COM>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

b8a1b1ce

10 1月, 2009 1 次提交

IPoIB: Fix loss of connectivity after bonding failover on both sides · a50df398

由 Yossi Etigin 提交于 1月 09, 2009

Fix bonding failover in the case both peers failover and the
gratuitous ARP is lost.  In that case, the sender side will create an
ipoib_neigh and issue a path request with the old GID first.  When
skb->dst->neighbour->ha changes due to ARP refresh, this ipoib_neigh
will not be added to the path->list of the path of the new GID,
because the ipoib_neigh already exists.  It will not have an AH
either, because of sender-side failover.  Therefore, it will not get
an AH when the path is resolved.

The solution here is to compare GIDs in ipoib_start_xmit() even if
neigh->ah is invalid.  Comparing with an uninitialized value of
neigh->dgid should be fine, since a spurious match is harmless (and
astronomically unlikely too).
Signed-off-by: NMoni Shoua <monis@voltaire.com>
Signed-off-by: NYossi Etigin <yosefe@voltaire.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

a50df398

13 11月, 2008 3 次提交

IPoIB: Fix crash in path_rec_completion() · ff79ae80

由 Yossi Etigin 提交于 11月 12, 2008

Fix a crash in path_rec_completion() during an SM up/down loop.  If
more than one path record request is issued, the first completion
releases path->done, allowing ipoib_flush_paths() to free the path,
and thus corrupting it for the second completion.

Commit ee1e2c82 ("IPoIB: Refresh paths instead of flushing them on SM
change events") added the field path->valid and changed the test "if
(!path)" to "if (!path || !path->valid)".  This change made it
possible for a path with an outstanding query to pass the test and
issue another query on the same path.  Having two queries on the same
path leads to a crash.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1325>.
Signed-off-by: NYossi Etigin <yosefe@voltaire.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

ff79ae80

IPoIB: Fix hang in ipoib_flush_paths() · 93a3ab93

由 Yossi Etigin 提交于 11月 12, 2008

ipoib_flush_paths() can hang during an SM up/down loop: if
path_rec_start() fails (for instance, because there is no sm_ah), the
path is still added to the path list by neigh_add_path().  Then,
ipoib_flush_paths() will wait for path->done, but it will never
complete because the request was not issued at all.  Fix this by
completing path->done if issuing the query fails.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1329>.
Signed-off-by: NYossi Etigin <yosefe@voltaire.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

93a3ab93

IPoIB: Don't enable NAPI when it's already enabled · fe25c561

由 Yossi Etigin 提交于 11月 12, 2008

If a P_Key is not present when an interface is created, ipoib_open()
will return after doing napi_enable().  ipoib_open() will be called
again from ipoib_pkey_poll() when the P_Key appears, after NAPI has
already been enabled, and try to enable it again. This triggers a
BUG_ON() in napi_enable().

Fix this by moving the call to napi_enable() to after the test for
P_Key presence.
Signed-off-by: NYossi Etigin <yosefe@voltaire.com>
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

fe25c561

30 10月, 2008 1 次提交

net: replace %p6 with %pI6 · 5b095d98

由 Harvey Harrison 提交于 10月 29, 2008

Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5b095d98

29 10月, 2008 1 次提交

infiniband: ipoib replace IPOIB_GID_FMT with %p6 · fcace2fe

由 Harvey Harrison 提交于 10月 28, 2008

Replace all uses of IPOIB_GID_FMT, IPOIB_GID_RAW_ARG() and IPOIB_GID_ARG()
Signed-off-by: NHarvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fcace2fe

23 10月, 2008 1 次提交

IPoIB: Set netdev offload features properly for child (VLAN) interfaces · 83bb63f6

由 Or Gerlitz 提交于 10月 22, 2008

Child devices were created without any offload features set, fix this by
moving the code that computes the features into generic function which is
now called through non-child and child device creation.
Signed-off-by: NOr Gerlitz <ogerlitz@voltaire.com>

-- v1 has a bug where the 'result' flag in ipoib_vlan_add may be used uninitialized
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

83bb63f6

01 10月, 2008 1 次提交

IPoIB: Use netif_tx_lock() and get rid of private tx_lock, LLTX · 943c246e

由 Roland Dreier 提交于 9月 30, 2008

Currently, IPoIB is an LLTX driver that uses its own IRQ-disabling
tx_lock.  Not only do we want to get rid of LLTX, this actually causes
problems because of the skb_orphan() done with this tx_lock held: some
skb destructors expect to be run with interrupts enabled.

The simplest fix for this is to get rid of the driver-private tx_lock
and stop using LLTX.  We kill off priv->tx_lock and use
netif_tx_lock[_bh]() instead; the patch to do this is a tiny bit
tricky because we need to update places that take priv->lock inside
the tx_lock to disable IRQs, rather than relying on tx_lock having
already disabled IRQs.

Also, there are a couple of places where we need to disable BHs to
make sure we have a consistent context to call netif_tx_lock() (since
we no longer can use _irqsave() variants), and we also have to change
ipoib_send_comp_handler() to call drain_tx_cq() through a timer rather
than directly, because ipoib_send_comp_handler() runs in interrupt
context and drain_tx_cq() must run in BH context so it can call
netif_tx_lock().
Signed-off-by: NRoland Dreier <rolandd@cisco.com>

943c246e

openanolis / cloud-kernel 大约 1 年 前同步成功

openanolis / cloud-kernel
大约 1 年前同步成功