1. 14 Nov, 2022 3 commits
  2. 03 Nov, 2022 1 commit
  3. 29 Sep, 2022 1 commit
  4. 04 Jul, 2022 1 commit
  5. 06 May, 2022 1 commit
  6. 29 Apr, 2022 1 commit
  7. 16 Apr, 2022 6 commits
  8. 18 Mar, 2022 1 commit
    • ibmvnic: fix race between xmit and reset · 4219196d
      Sukadev Bhattiprolu authored
      There is a race between reset and the transmit paths that can lead to
      ibmvnic_xmit() accessing an scrq after it has been freed in the reset
      path. It can result in a crash like:
      
      	Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
      	BUG: Kernel NULL pointer dereference on read at 0x00000000
      	Faulting instruction address: 0xc0080000016189f8
      	Oops: Kernel access of bad area, sig: 11 [#1]
      	...
      	NIP [c0080000016189f8] ibmvnic_xmit+0x60/0xb60 [ibmvnic]
      	LR [c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
      	Call Trace:
      	[c008000001618f08] ibmvnic_xmit+0x570/0xb60 [ibmvnic] (unreliable)
      	[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
      	[c000000000c9cfcc] sch_direct_xmit+0xec/0x330
      	[c000000000bfe640] __dev_xmit_skb+0x3a0/0x9d0
      	[c000000000c00ad4] __dev_queue_xmit+0x394/0x730
      	[c008000002db813c] __bond_start_xmit+0x254/0x450 [bonding]
      	[c008000002db8378] bond_start_xmit+0x40/0xc0 [bonding]
      	[c000000000c0046c] dev_hard_start_xmit+0x11c/0x280
      	[c000000000c00ca4] __dev_queue_xmit+0x564/0x730
      	[c000000000cf97e0] neigh_hh_output+0xd0/0x180
      	[c000000000cfa69c] ip_finish_output2+0x31c/0x5c0
      	[c000000000cfd244] __ip_queue_xmit+0x194/0x4f0
      	[c000000000d2a3c4] __tcp_transmit_skb+0x434/0x9b0
      	[c000000000d2d1e0] __tcp_retransmit_skb+0x1d0/0x6a0
      	[c000000000d2d984] tcp_retransmit_skb+0x34/0x130
      	[c000000000d310e8] tcp_retransmit_timer+0x388/0x6d0
      	[c000000000d315ec] tcp_write_timer_handler+0x1bc/0x330
      	[c000000000d317bc] tcp_write_timer+0x5c/0x200
      	[c000000000243270] call_timer_fn+0x50/0x1c0
      	[c000000000243704] __run_timers.part.0+0x324/0x460
      	[c000000000243894] run_timer_softirq+0x54/0xa0
      	[c000000000ea713c] __do_softirq+0x15c/0x3e0
      	[c000000000166258] __irq_exit_rcu+0x158/0x190
      	[c000000000166420] irq_exit+0x20/0x40
      	[c00000000002853c] timer_interrupt+0x14c/0x2b0
      	[c000000000009a00] decrementer_common_virt+0x210/0x220
      	--- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
      
      The immediate cause of the crash is the access of tx_scrq in the following
      snippet during a reset, where the tx_scrq can be either NULL or an address
      that will soon be invalid:
      
      	ibmvnic_xmit()
      	{
      		...
      		tx_scrq = adapter->tx_scrq[queue_num];
      		txq = netdev_get_tx_queue(netdev, queue_num);
      		ind_bufp = &tx_scrq->ind_buf;
      
      		if (test_bit(0, &adapter->resetting)) {
      		...
      	}
      
      But beyond that, the call to ibmvnic_xmit() itself is not safe during a
      reset, and the reset path attempts to avoid this by stopping the queue in
      ibmvnic_cleanup(). However, just after the queue was stopped, an in-flight
      ibmvnic_complete_tx() could have restarted the queue even as the reset is
      progressing.
      
      Since the queue was restarted we could get a call to ibmvnic_xmit() which
      can then access the bad tx_scrq (or other fields).
      
      We cannot, however, simply have ibmvnic_complete_tx() check the ->resetting
      bit and skip starting the queue. This can race at the "back-end" of a good
      reset which just restarted the queue but has not cleared the ->resetting
      bit yet. If we skip restarting the queue due to ->resetting being true,
      the queue would remain stopped indefinitely, potentially leading to
      transmit timeouts.
      
      IOW ->resetting is too broad for this purpose. Instead use a new flag
      that indicates whether or not the queues are active. Only the open/
      reset paths control when the queues are active. ibmvnic_complete_tx()
      and others wake up the queue only if the queue is marked active.
      
      So we will have:
      	A. reset/open thread in ibmvnic_cleanup() and __ibmvnic_open()
      
      		->resetting = true
      		->tx_queues_active = false
      		disable tx queues
      		...
      		->tx_queues_active = true
      		start tx queues
      
      	B. Tx interrupt in ibmvnic_complete_tx():
      
      		if (->tx_queues_active)
      			netif_wake_subqueue();
      
      To ensure that ->tx_queues_active and the state of the queues are
      consistent, we need a lock which:
      
      	- must also be taken in the interrupt path (ibmvnic_complete_tx())
      	- is shared across the multiple queues in the adapter (so they don't
      	  become serialized)
      
      Use rcu_read_lock() and have the reset thread call synchronize_rcu()
      after updating the ->tx_queues_active state.
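      
      As a rough sketch of the pattern (field and function names follow the
      commit text; the call sites are simplified, not verbatim driver code):
      
      	/* Tx completion (ibmvnic_complete_tx, interrupt context) */
      	rcu_read_lock();
      	if (adapter->tx_queues_active &&
      	    __netif_subqueue_stopped(adapter->netdev, queue_num))
      		netif_wake_subqueue(adapter->netdev, queue_num);
      	rcu_read_unlock();
      
      	/* Reset path (reset/open thread) */
      	adapter->tx_queues_active = false;
      	/* wait for in-flight readers to observe the update before
      	 * the queues and scrqs are torn down
      	 */
      	synchronize_rcu();
      	netif_tx_disable(adapter->netdev);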
      
      While here, consolidate a few boolean fields in ibmvnic_adapter for
      better alignment.
      
      Based on discussions with Brian King and Dany Madden.
      
      Fixes: 7ed5b31f ("net/ibmvnic: prevent more than one thread from running in reset")
      Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 25 Feb, 2022 8 commits
    • ibmvnic: Allow queueing resets during probe · fd98693c
      Sukadev Bhattiprolu authored
      We currently don't allow queuing resets when the adapter is in the
      VNIC_PROBING state - instead we throw away the reset and return EBUSY.
      The reasoning is probably that during ibmvnic_probe() the
      ibmvnic_adapter itself is being initialized, so performing a reset
      during this time can lead to accessing fields in the ibmvnic_adapter
      that are not fully initialized.
      A review of the code shows that all the adapter state needed to process
      a reset is initialized before registering the CRQ, so that should no
      longer be a concern.
      
      Further the expectation is that if we do get a reset (transport event)
      during probe, the do..while() loop in ibmvnic_probe() will handle this
      by reinitializing the CRQ.
      
      While that is true to some extent, it is possible that the reset might
      occur _after_ the CRQ is registered and the CRQ_INIT message was exchanged
      but _before_ the adapter state is set to VNIC_PROBED. As mentioned above,
      such a reset will be thrown away. While the client assumes that the
      adapter is functional, the vnic server will wait for the client to reinit
      the adapter. This disconnect between the two leaves the adapter down,
      needing manual intervention.
      
      Because ibmvnic_probe() has other work to do after initializing the CRQ
      (such as registering the netdev at a minimum) and because the reset event
      can occur at any instant after the CRQ is initialized, there will always
      be a window between initializing the CRQ and considering the adapter
      ready for resets (i.e. state == PROBED).
      
      So rather than discarding resets during this window, allow queueing them
      - but only process them after the adapter is fully initialized.
      
      To do this, introduce a new completion state ->probe_done and have the
      reset worker thread wait on this before processing resets.
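      
      In sketch form (assuming a struct completion member named probe_done,
      per the text above):
      
      	/* ibmvnic_probe() */
      	init_completion(&adapter->probe_done);
      	/* ... register CRQ, register netdev, finish initializing ... */
      	complete(&adapter->probe_done);
      
      	/* __ibmvnic_reset() worker thread */
      	wait_for_completion(&adapter->probe_done);
      	/* adapter fully initialized; safe to process queued resets */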
      
      This change brings up two new situations in or just after ibmvnic_probe().
      First, after one or more resets were queued, we encounter an error and
      decide to retry the initialization. At that point the queued resets are
      no longer relevant since we could be talking to a new vnic server. So we
      must purge/flush the queued resets before restarting the initialization.
      As a side note, since we are still in the probing stage and we have not
      registered the netdev, it will not be a CHANGE_PARAM reset.
      
      Second, this change opens up a potential race between the worker thread
      in __ibmvnic_reset(), the tasklet, and ibmvnic_open(), due to the
      following sequence of events:
      
      	1. Register CRQ
      	2. Get transport event before CRQ_INIT completes.
      	3. Tasklet schedules reset:
      		a) add rwi to list
      		b) schedule_work() to start worker thread which runs
      		   and waits for ->probe_done.
      	4. ibmvnic_probe() decides to retry, purges rwi_list
      	5. Re-register the CRQ and this time the rest of probe succeeds -
      	   register netdev and complete(->probe_done).
      	6. Worker thread resumes in __ibmvnic_reset() from 3b.
      	7. Worker thread sets ->resetting bit
      	8. ibmvnic_open() comes in, notices ->resetting bit, sets state
      	   to IBMVNIC_OPEN and returns early expecting worker thread to
      	   finish the open.
      	9. Worker thread finds rwi_list empty and returns without
      	   opening the interface.
      
      If this happens, the ->ndo_open() call is effectively lost and the
      interface remains down. To address this, ensure that ->rwi_list is
      not empty before setting the ->resetting bit. See also the comments
      in __ibmvnic_reset().
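      
      A sketch of that check (the lock name is assumed for illustration):
      
      	unsigned long flags;
      
      	spin_lock_irqsave(&adapter->rwi_lock, flags);
      	if (!list_empty(&adapter->rwi_list))
      		set_bit(0, &adapter->resetting);
      	spin_unlock_irqrestore(&adapter->rwi_lock, flags);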
      
      Fixes: 6a2fb0e9 ("ibmvnic: driver initialization for kdump/kexec")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: clear fop when retrying probe · f628ad53
      Sukadev Bhattiprolu authored
      Clear ->failover_pending flag that may have been set in the previous
      pass of registering CRQ. If we don't clear, a subsequent ibmvnic_open()
      call would be misled into thinking a failover is pending and assuming
      that the reset worker thread would open the adapter. If this pass of
      registering the CRQ succeeds (i.e. there is no transport event), there
      wouldn't be a reset worker thread.
      
      This would leave the adapter unconfigured and require manual intervention
      to bring it up during boot.
      
      Fixes: 5a18e1e0 ("ibmvnic: Fix failover case for non-redundant configuration")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: init init_done_rc earlier · ae16bf15
      Sukadev Bhattiprolu authored
      We currently initialize the ->init_done completion/return code fields
      before issuing a CRQ_INIT command. But if we get a transport event soon
      after registering the CRQ, the tasklet may already have recorded the
      completion and error code. If we initialize here, we might overwrite/
      lose that and end up issuing the CRQ_INIT only to timeout later.
      
      If that timeout happens during probe, we will leave the adapter in the
      DOWN state rather than retrying to register/init the CRQ.
      
      Initialize the completion before registering the CRQ so we don't lose
      the notification.
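      
      In sketch form, the fix is just a reordering (function names follow the
      driver, details elided):
      
      	/* before: the tasklet can complete() between these two steps */
      	rc = init_crq_queue(adapter);
      	reinit_completion(&adapter->init_done);
      
      	/* after: prepare the completion first, then register the CRQ */
      	reinit_completion(&adapter->init_done);
      	adapter->init_done_rc = 0;
      	rc = init_crq_queue(adapter);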
      
      Fixes: 032c5e82 ("Driver for IBM System i/p VNIC protocol")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: register netdev after init of adapter · 570425f8
      Sukadev Bhattiprolu authored
      Finish initializing the adapter before registering netdev so state
      is consistent.
      
      Fixes: c26eba03 ("ibmvnic: Update reset infrastructure to support tunable parameters")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: complete init_done on transport events · 36491f2d
      Sukadev Bhattiprolu authored
      If we get a transport event, set the error and mark the init as
      complete so the attempts to send crq-init or login fail sooner
      rather than waiting for the timeout.
      
      Fixes: bbd669a8 ("ibmvnic: Fix completion structure initialization")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: define flush_reset_queue helper · 83da53f7
      Sukadev Bhattiprolu authored
      Define and use a helper to flush the reset queue.
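      
      A plausible shape for the helper, sketched from the commit title (not
      verbatim kernel code; callers are assumed to hold the rwi list lock):
      
      	static void flush_reset_queue(struct ibmvnic_adapter *adapter)
      	{
      		struct list_head *entry, *tmp_entry;
      
      		list_for_each_safe(entry, tmp_entry, &adapter->rwi_list) {
      			list_del(entry);
      			kfree(list_entry(entry, struct ibmvnic_rwi, list));
      		}
      	}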
      
      Fixes: 2770a798 ("ibmvnic: Introduce hard reset recovery")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: initialize rc before completing wait · 765559b1
      Sukadev Bhattiprolu authored
      We should initialize ->init_done_rc before calling complete(). Otherwise
      the waiting thread may see ->init_done_rc as 0 before we have updated it
      and may assume that the CRQ was successful.
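      
      The required ordering, in sketch form:
      
      	adapter->init_done_rc = rc;	/* publish the result first */
      	complete(&adapter->init_done);	/* then wake up the waiter */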
      
      Fixes: 6b278c0c ("ibmvnic delay complete()")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: free reset-work-item when flushing · 8d0657f3
      Sukadev Bhattiprolu authored
      Fix a tiny memory leak when flushing the reset work queue.
      
      Fixes: 2770a798 ("ibmvnic: Introduce hard reset recovery")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 23 Feb, 2022 1 commit
  11. 18 Feb, 2022 1 commit
  12. 09 Feb, 2022 1 commit
    • ibmvnic: don't release napi in __ibmvnic_open() · 61772b09
      Sukadev Bhattiprolu authored
      If __ibmvnic_open() encounters an error such as when setting the link
      state, it calls release_resources() which frees the napi structures
      needlessly. Instead, have __ibmvnic_open() only clean up the work it
      did so far (i.e. disable napi and irqs) and leave the rest to the
      callers.
      
      If the caller of __ibmvnic_open() is ibmvnic_open(), it should release
      the resources immediately. If the caller is do_reset() or
      do_hard_reset(), they will release the resources on the next reset.
      
      This fixes the following crash that occurred when running the drmgr
      command several times to add/remove a vnic interface:
      
      	[102056] ibmvnic 30000003 env3: Disabling rx_scrq[6] irq
      	[102056] ibmvnic 30000003 env3: Disabling rx_scrq[7] irq
      	[102056] ibmvnic 30000003 env3: Replenished 8 pools
      	Kernel attempted to read user page (10) - exploit attempt? (uid: 0)
      	BUG: Kernel NULL pointer dereference on read at 0x00000010
      	Faulting instruction address: 0xc000000000a3c840
      	Oops: Kernel access of bad area, sig: 11 [#1]
      	LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
      	...
      	CPU: 9 PID: 102056 Comm: kworker/9:2 Kdump: loaded Not tainted 5.16.0-rc5-autotest-g6441998e #1
      	Workqueue: events_long __ibmvnic_reset [ibmvnic]
      	NIP:  c000000000a3c840 LR: c0080000029b5378 CTR: c000000000a3c820
      	REGS: c0000000548e37e0 TRAP: 0300   Not tainted  (5.16.0-rc5-autotest-g6441998e)
      	MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28248484  XER: 00000004
      	CFAR: c0080000029bdd24 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0
      	GPR00: c0080000029b55d0 c0000000548e3a80 c0000000028f0200 0000000000000000
      	...
      	NIP [c000000000a3c840] napi_enable+0x20/0xc0
      	LR [c0080000029b5378] __ibmvnic_open+0xf0/0x430 [ibmvnic]
      	Call Trace:
      	[c0000000548e3a80] [0000000000000006] 0x6 (unreliable)
      	[c0000000548e3ab0] [c0080000029b55d0] __ibmvnic_open+0x348/0x430 [ibmvnic]
      	[c0000000548e3b40] [c0080000029bcc28] __ibmvnic_reset+0x500/0xdf0 [ibmvnic]
      	[c0000000548e3c60] [c000000000176228] process_one_work+0x288/0x570
      	[c0000000548e3d00] [c000000000176588] worker_thread+0x78/0x660
      	[c0000000548e3da0] [c0000000001822f0] kthread+0x1c0/0x1d0
      	[c0000000548e3e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64
      	Instruction dump:
      	7d2948f8 792307e0 4e800020 60000000 3c4c01eb 384239e0 f821ffd1 39430010
      	38a0fff6 e92d1100 f9210028 39200000 <e9030010> f9010020 60420000 e9210020
      	---[ end trace 5f8033b08fd27706 ]---
      
      Fixes: ed651a10 ("ibmvnic: Updated reset handling")
      Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Reviewed-by: Dany Madden <drt@linux.ibm.com>
      Link: https://lore.kernel.org/r/20220208001918.900602-1-sukadev@linux.ibm.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  13. 24 Jan, 2022 4 commits
    • ibmvnic: remove unused ->wait_capability · 3a5d9db7
      Sukadev Bhattiprolu authored
      With the previous bug fix, the ->wait_capability flag is no longer
      needed and can be removed.
      
      Fixes: 249168ad ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Reviewed-by: Dany Madden <drt@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: don't spin in tasklet · 48079e7f
      Sukadev Bhattiprolu authored
      ibmvnic_tasklet() continuously spins waiting for responses to all
      capability requests. It does this to avoid encountering an error
      during initialization of the vnic. However, if there is a bug in the
      VIOS and we do not receive a response to one or more queries, the
      tasklet ends up spinning continuously, leading to hard lockups.
      
      If we fail to receive a message from the VIOS, it is reasonable to
      time out the login attempt rather than spin indefinitely in the tasklet.
      
      Fixes: 249168ad ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Reviewed-by: Dany Madden <drt@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: init ->running_cap_crqs early · 151b6a5c
      Sukadev Bhattiprolu authored
      We use ->running_cap_crqs to determine when the ibmvnic_tasklet() should
      send out the next protocol message type, i.e. when we get back responses
      to all our QUERY_CAPABILITY CRQs we send out REQUEST_CAPABILITY crqs.
      Similarly, when we get responses to all the REQUEST_CAPABILITY crqs, we
      send out the QUERY_IP_OFFLOAD CRQ.
      
      We currently increment ->running_cap_crqs as we send out each CRQ and
      have the ibmvnic_tasklet() send out the next message type, when this
      running_cap_crqs count drops to 0.
      
      This assumes that all the CRQs of the current type were sent out before
      the count drops to 0. However it is possible that we send out, say, 6
      CRQs, get preempted, and receive all 6 responses before we send out the
      remaining CRQs. This can result in the ->running_cap_crqs count dropping
      to zero before all messages of the current type were sent, and we end up
      sending the next protocol message too early.
      
      Instead, initialize ->running_cap_crqs upfront so the tasklet will
      only send the next protocol message after all responses are received.
      
      Use the cap_reqs local variable to also detect any discrepancy (either
      now or in the future) in the number of capability requests we actually
      send.
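      
      A sketch of the new accounting in send_query_cap() (the count and the
      capability names are illustrative):
      
      	union ibmvnic_crq crq;
      	int cap_reqs = 25;	/* number of query-capability CRQs below */
      
      	/* set upfront so the tasklet never sees a premature zero */
      	atomic_set(&adapter->running_cap_crqs, cap_reqs);
      
      	crq.query_capability.capability = cpu_to_be64(MIN_TX_QUEUES);
      	ibmvnic_send_crq(adapter, &crq);
      	cap_reqs--;
      	/* ... one send and one decrement per capability ... */
      
      	/* any mismatch between the count and the actual sends is a bug */
      	WARN_ON(cap_reqs != 0);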
      
      Currently only send_query_cap() is affected by this behavior (of sending
      the next message early) since it is called from the worker thread (during
      reset) and from the application thread (during ->ndo_open()) and they can
      be preempted. send_request_cap() is only called from the tasklet, which
      processes CRQ responses sequentially, and is not affected. But to
      maintain the existing symmetry with send_query_capability() we update
      send_request_capability() also.
      
      Fixes: 249168ad ("ibmvnic: Make CRQ interrupt tasklet wait for all capabilities crqs")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Reviewed-by: Dany Madden <drt@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: Allow extra failures before disabling · db9f0e8b
      Sukadev Bhattiprolu authored
      If auto-priority-failover (APF) is enabled and there are at least two
      backing devices of different priorities, some resets like fail-over,
      change-param, etc. can cause at least two back-to-back failovers (failover
      from the high priority backing device to the lower priority one, and then
      back to the higher priority one if that is still functional).
      
      Depending on the timing of the two failovers it is possible to trigger
      a "hard" reset and for the hard reset to fail due to failovers. When this
      occurs, the driver assumes that the network is unstable and disables the
      VNIC for a 60-second "settling time". This in turn can cause the ethtool
      command to fail with "No such device" while the vnic automatically recovers
      a little while later.
      
      Given that it's possible to have two back-to-back failures, allow for
      extra failures before disabling the vnic for the settling time.
      
      Fixes: f15fde9d ("ibmvnic: delay next reset if hard reset fails")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Reviewed-by: Dany Madden <drt@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 14 Dec, 2021 1 commit
  15. 02 Dec, 2021 2 commits
  16. 22 Nov, 2021 1 commit
  17. 17 Nov, 2021 1 commit
  18. 01 Nov, 2021 3 commits
  19. 02 Oct, 2021 1 commit
  20. 27 Sep, 2021 1 commit