- 20 July 2020, 3 commits
By Julian Wiedmann
For non-thinint devices in LPAR, qdio polls an idle Input Queue for a little while to catch more work. But platform support for thinints has been around practically _forever_ by now, so this micro-optimization is seeing 0 actual use. Remove it to reduce the overall complexity of the hot path. In the meantime we also grew support for driver-level polling (eg. NAPI in qeth), so it's quite questionable how useful this would actually be on current kernels. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
By Julian Wiedmann
The comment is inaccurate: qdio_inbound_q_moved() and/or its callers no longer get confused by a count of 128 completed SBALs. Scanning all 128 SBALs at once can improve IRQ reduction (as we now place the ACK at the right spot), and reduce the amount of processing needed to handle all completed SBALs. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
-
By Julian Wiedmann
Old code would only scan up to 127 SBALs at once. So the last statistics bucket was set aside to count "discovered 127 SBALs with new work" events. But nowadays we allow scanning all 128 SBALs for Output Queues, and a subsequent patch will introduce the same for Input Queues. So fix up the accounting to use the last bucket only when all 128 SBALs have been discovered with new work. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
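As a rough standalone illustration of the bucketing described above (the struct and helpers below are simplified stand-ins, not the actual qdio statistics code), assume log2-sized buckets: with counts of 1..128, the last bucket is only reached when a full queue's worth of SBALs was found, since ilog2(128) == 7 while ilog2(127) == 6.

```c
#include <stdio.h>

#define QDIO_MAX_BUFFERS_PER_Q 128

/* Illustrative per-queue statistics: one bucket per power-of-two count range. */
struct q_stats {
	unsigned long nr_sbals[8];	/* bucket i counts scans that found [2^i, 2^(i+1)) SBALs */
	unsigned long nr_sbal_total;
};

/* Integer log2 for counts in the range 1..128. */
static unsigned int ilog2_u32(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Account one scan that discovered 'count' SBALs with new work. */
static void account_sbals(struct q_stats *stats, unsigned int count)
{
	stats->nr_sbal_total += count;
	/* Only a count of 128 maps to the last bucket: ilog2(128) == 7. */
	stats->nr_sbals[ilog2_u32(count)]++;
}

int main(void)
{
	struct q_stats stats = { { 0 }, 0 };

	account_sbals(&stats, 1);			/* bucket 0 */
	account_sbals(&stats, 127);			/* bucket 6 */
	account_sbals(&stats, QDIO_MAX_BUFFERS_PER_Q);	/* bucket 7 */

	for (int i = 0; i < 8; i++)
		printf("bucket %d: %lu\n", i, stats.nr_sbals[i]);
	return 0;
}
```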
-
- 18 June 2020, 2 commits
By Julian Wiedmann
The way we produce SBALs to the device (first update q->nr_buf_used, then update the SLSB) should ensure that we never see some of the SLSB states when scanning the queue for progress. So make some noise if we do; this implies a bug in our SBAL tracking. Also tweak the WARN msg to provide more information. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
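A minimal sketch of the "make some noise" idea, with simplified stand-in states rather than the driver's real SLSB enum: a scan routine whose default arm reports the queue and buffer context instead of silently ignoring a state that the SBAL accounting should have made impossible.

```c
#include <stdio.h>

/* Simplified stand-ins for a few SLSB states. */
enum slsb_state {
	SLSB_OWNER_CU,		/* owned by the control unit (device) */
	SLSB_OWNER_PROG,	/* owned by the program (driver) */
	SLSB_NOT_INIT,		/* should never be seen while scanning */
};

/* Scan step: states we don't expect imply a bug in the SBAL tracking. */
static int handle_state(unsigned int q_nr, unsigned int bufnr, enum slsb_state state)
{
	switch (state) {
	case SLSB_OWNER_CU:
		return 0;	/* no progress, device still owns the SBAL */
	case SLSB_OWNER_PROG:
		return 1;	/* completed, process it */
	default:
		/* Unexpected state: complain with enough context to debug it. */
		fprintf(stderr, "queue %u: unexpected SLSB state %d at buffer %u\n",
			q_nr, state, bufnr);
		return -1;
	}
}

int main(void)
{
	return handle_state(0, 5, SLSB_NOT_INIT) < 0 ? 1 : 0;
}
```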
-
By Julian Wiedmann
This removes the last remaining accesses to ->qdio_data from internal code. Just pass the qdio_irq struct where needed instead. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 16 June 2020, 2 commits
By Julian Wiedmann
Streamline the processing of QDIO Input Queues, and remove some intermittent SLSB updates (no deleting of old ACKs, no redundant transitions through NOT_INIT). Rather than counting ACKs, we now keep track of the whole batch of SBALs that were completed during the current polling cycle. Most completed SBALs stay in their initial state (ie. PRIMED or ERROR), except that the most recent SBAL in each sub-run is ACKed for IRQ reduction. The only logic changes happen in inbound_handle_work(); the other delta is just a renaming of the variables that track the SBAL batch. Note that in particular we don't need to flip the _oldest_ SBAL to an idle state (eg. NOT_INIT or ACKed) as a guard against catching our own tail. Since get_inbound_buffer_frontier() will never scan more than the remaining nr_buf_used SBALs, this scenario just doesn't occur. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
xchg() for a single-byte location assembles to a 4-byte Compare&Swap, wrapped into a non-trivial amount of retry code that deals with concurrent modifications to the unaffected bytes. Change it to a simple byte-store, but preserve the memory ordering semantics that the CS provided. This simplifies the generated code for a hot path, and in theory also allows us to amortize the memory barriers over multiple SLSB updates. CC: Andreas Krebbel <krebbel@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
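The trade-off can be modelled with portable C11 atomics (a standalone sketch, not the s390 instruction sequences the driver actually emits): an atomic exchange on a one-byte slot forces a read-modify-write, while a release store merely publishes the new state and still orders it after the preceding SBAL updates.

```c
#include <stdatomic.h>
#include <stdio.h>

#define QDIO_MAX_BUFFERS_PER_Q 128

/* One SLSB entry per SBAL; each entry is a single byte. */
static _Atomic unsigned char slsb[QDIO_MAX_BUFFERS_PER_Q];

/* Old approach: a full read-modify-write just to install one byte. */
static unsigned char set_state_xchg(unsigned int bufnr, unsigned char state)
{
	return atomic_exchange(&slsb[bufnr], state);
}

/* New approach: a plain byte store with release ordering, so all prior
 * writes to the SBAL are visible before the state change is published. */
static void set_state_store(unsigned int bufnr, unsigned char state)
{
	atomic_store_explicit(&slsb[bufnr], state, memory_order_release);
}

int main(void)
{
	set_state_store(0, 0x01 /* illustrative state value */);
	unsigned char old = set_state_xchg(1, 0x01);

	printf("buffer 1 previous state: 0x%02x\n", old);
	return 0;
}
```

The release store also makes it possible to batch several SLSB updates and pay for the ordering only once at the end, which is the amortization the message alludes to.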
-
- 28 May 2020, 3 commits
By Alexandra Winter
CHSC3D (PNSO - perform network subchannel operation) is used for OC0 (Store-network-bridging-information) as well as for OC3 (Store-network-address-information). So common fields are renamed from *brinfo* to *pnso*. Also *_bridge_host_* is changed into *_addr_change_*, e.g. qeth_bridge_host_event to qeth_addr_change_event, for the same reasons. The keywords in the card traces are changed accordingly. Remove unused L3 types, as PNSO will only return Layer2 entries. Make the PNSO CHSC implementation more consistent with existing API usage: add a new function ccw_device_pnso() to drivers/s390/cio/device_ops.c and the function declaration to arch/s390/include/asm/ccwdev.h, which takes a struct ccw_device * as parameter instead of a schid and calls chsc_pnso(). PNSO CHSC has no strict relationship to qdio. So move the calling function from qdio to qeth_l2 and move the necessary structures to a new file arch/s390/include/asm/chsc.h. Do response code evaluation only in chsc_error_from_response() and use the return code in all other places. qeth_anset_makerc() was meant to evaluate the PNSO response code, but never did, because pnso_rc was already non-zero. Indentation was corrected in some places. Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
q->first_to_kick is obsolete, and can be replaced by q->first_to_check. Both cursors start off at 0. Out of the three code paths that update first_to_check, the qdio_inspect_queue() path is irrelevant as it doesn't even touch first_to_kick anymore. This leaves us with the two tasklet-driven code paths. Here any update to first_to_check is followed by a call to qdio_kick_handler(), which advances first_to_kick by the same amount. So the two cursors will differ only for a tiny moment. Drivers have no way of deterministically observing this difference, and thus it doesn't matter which of the cursors we use for reporting an error to q->handler. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
Document the actual semantics, correcting an old copy & paste mistake. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 20 May 2020, 3 commits
By Julian Wiedmann
SBALs in PRIMED or ERROR state represent new work on the Input Queue. But while inbound_primed() does all sorts of ACK management for new PRIMED work, the same handling is currently missing for ERROR work. In particular the path for ERROR work doesn't clear up _old_ ACKs. Treat ERROR work the same as PRIMED work, but consider that the QEBSM auto-ACK feature doesn't apply here. So we need to set the ACK manually, as if it was a non-QEBSM device. Note that this doesn't aspire to actually improve performance; the main goal is to just unify the code paths and have consistent behaviour. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
inbound_primed() currently has two code paths - one for QEBSM that knows how to deal with multiple ACKs, and a non-QEBSM path that strictly assumes a single ACK on the queue. In preparation for a subsequent patch, slightly adjust the non-QEBSM path so that it can manage a queue with multiple ACKs. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
Refilling the Input Queue requires additional checks, as the refilled SBALs can overlap with the ACKs that qdio maintains on the queue. This code path is way too complex, and does a whole bunch of wrap-around checks that the modulo arithmetic in sub_buf() takes care of by itself. So shrink down all that code into a few lines of equivalent functionality. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
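Because the queue holds 128 SBALs (a power of two), wrap-around falls out of masked arithmetic with no explicit boundary checks. The helpers below are a standalone sketch of that idea, modelled on the sub_buf()/add_buf() style the message refers to; treat the exact names as illustrative.

```c
#include <stdio.h>

#define QDIO_MAX_BUFFERS_PER_Q 128
#define QDIO_BUFNR(num) ((num) & (QDIO_MAX_BUFFERS_PER_Q - 1))

/* Advance a buffer index by 'count', wrapping at 128. */
static unsigned int add_buf(unsigned int bufnr, unsigned int count)
{
	return QDIO_BUFNR(bufnr + count);
}

/* Step a buffer index back by 'count', wrapping at 128. */
static unsigned int sub_buf(unsigned int bufnr, unsigned int count)
{
	return QDIO_BUFNR(bufnr + QDIO_MAX_BUFFERS_PER_Q - count);
}

int main(void)
{
	/* Refilling 16 SBALs starting at index 120 wraps cleanly to 8. */
	printf("add_buf(120, 16) = %u\n", add_buf(120, 16));
	/* Looking one SBAL behind index 0 lands on 127 without a special case. */
	printf("sub_buf(0, 1)    = %u\n", sub_buf(0, 1));
	return 0;
}
```

Every overlap test against the ACKed range can then be phrased as a distance computed with these helpers, which is what lets the refill path shrink to a few lines.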
-
- 28 April 2020, 8 commits
By Julian Wiedmann
buf_in_between() gets passed q->u.in.ack_start as its 'bufnr' parameter. The ack_start always ranges between 0 and QDIO_MAX_BUFFERS_PER_Q - 1, so the subsequent check will always return true. Remove it. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
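A small standalone illustration of why the removed check was dead code (bufnr_is_valid() is a hypothetical stand-in for the dropped test, not the driver's function): an index produced by masked arithmetic already lies in 0..127, so a range check on it can never fail.

```c
#include <stdbool.h>
#include <stdio.h>

#define QDIO_MAX_BUFFERS_PER_Q 128
#define QDIO_BUFNR(num) ((num) & (QDIO_MAX_BUFFERS_PER_Q - 1))

/* Hypothetical stand-in for the removed sanity check. */
static bool bufnr_is_valid(unsigned int bufnr)
{
	return bufnr < QDIO_MAX_BUFFERS_PER_Q;
}

int main(void)
{
	/* ack_start is only ever produced by masked arithmetic ... */
	unsigned int ack_start = QDIO_BUFNR(300);

	/* ... so this check is a tautology and can simply be dropped. */
	printf("valid: %s\n", bufnr_is_valid(ack_start) ? "always" : "never");
	return 0;
}
```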
-
By Julian Wiedmann
Except for some initial thinint-only steps, the processing is identical to the non-thinint case. So re-use the existing helper. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
Knowing how many queues we initially allocated allows us to 1) sanity-check a subsequent qdio_establish() request, and 2) walk the queue arrays without further checks. Apply this while cleanly splitting qdio_free_queues() into two separate helpers. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
When qdio_allocate_qs() fails, have it deal with its previous allocations. This way qdio_allocate() doesn't need to clean up afterwards. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
Instead of having a catch-all qdio_release_memory() helper, free the individual allocations from the respective error path. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
Wrap the init/exit steps for thinint into a single helper that follows the established naming scheme. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
qdio_establish() calls qdio_establish_thinint(), but later has an error exit path that doesn't roll this call back. Fix it. Fixes: 779e6e1c ("[S390] qdio: new qdio driver.") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
For rolling back after an error, qdio_establish() calls qdio_shutdown(). If the error occurs early enough, then the qdio_irq's state is still QDIO_IRQ_STATE_INACTIVE and qdio_shutdown() does nothing. But at _any_ point where qdio_establish() bails out in this way, qdio_setup_irq() will have already replaced the IRQ handler. This then won't be restored after an early error, and the device can end up being returned to the device driver with qdio's IRQ handler still installed. Slightly reorder qdio_setup_irq() so we can be 100% sure that the IRQ handler was replaced. Then fix the bug in qdio_establish() by calling a helper that rolls back only the IRQ handler modification. Also use the new helper in qdio_shutdown() to keep things in sync, and slightly clean up the locking while doing so. This makes minor semantic changes, but holding setup_mutex gives us sufficient leeway to eg. pull qdio_shutdown_thinint() outside of the ccwdev lock's scope. Fixes: 779e6e1c ("[S390] qdio: new qdio driver.") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
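The shape of the fix can be sketched generically; the types and names below are illustrative, not the cio/qdio API. The point is to remember the original handler when installing the replacement and to expose a helper that undoes exactly that one step, so every early error exit can roll back without going through the full shutdown path.

```c
#include <stdio.h>

typedef void (*irq_handler_t)(void *dev);

/* Illustrative device and IRQ-context state. */
struct dev {
	irq_handler_t handler;
};

struct irq_ctx {
	struct dev *dev;
	irq_handler_t orig_handler;	/* saved so we can roll back */
};

static void driver_handler(void *dev) { (void)dev; }
static void qdio_like_handler(void *dev) { (void)dev; }

/* Install our interrupt handler, remembering the driver's original one. */
static void irq_setup(struct irq_ctx *ctx, struct dev *dev)
{
	ctx->dev = dev;
	ctx->orig_handler = dev->handler;
	dev->handler = qdio_like_handler;
}

/* Roll back only the handler swap - usable from any early error exit. */
static void irq_restore_handler(struct irq_ctx *ctx)
{
	ctx->dev->handler = ctx->orig_handler;
}

static int establish(struct irq_ctx *ctx, struct dev *dev, int simulate_error)
{
	irq_setup(ctx, dev);

	if (simulate_error) {
		/* Early failure: the device must not keep our handler. */
		irq_restore_handler(ctx);
		return -1;
	}
	return 0;
}

int main(void)
{
	struct dev dev = { .handler = driver_handler };
	struct irq_ctx ctx;

	establish(&ctx, &dev, 1);
	printf("handler restored: %s\n",
	       dev.handler == driver_handler ? "yes" : "no");
	return 0;
}
```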
-
- 06 April 2020, 3 commits
By Julian Wiedmann
Polling drivers in a configuration with 1 Input Queue currently keep their DSCI armed all the way through the poll cycle, until qdio_start_irq() clears it. _Any_ intermittent QDIO interrupt delivered to tiqdio_thinint_handler() will thus cause 1) the 'adapter_int' statistic to be incremented, 2) a call to tiqdio_call_inq_handlers() for this device, and then 3) the 'int_discarded' statistic to be incremented. This causes overhead & complexity in the IRQ path, along with ambiguity in the statistics. On the other hand the device should be in IRQ avoidance mode during a poll cycle, so there won't be a lot of DSCI ping-pong that this micro-optimization could prevent. So align the DSCI handling with what we already do for devices with multiple Input Queues: clear it right away while processing the IRQ. For the non-polling path this means that we no longer need to handle the 1-queue case separately. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
It's no longer needed. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
All that qdio_allocate() actually uses from the init_data is the cdev, and the number of Input and Output Queues. Have the driver pass those as parameters, and defer the init_data processing into qdio_establish(). This includes writing per-device(!) trace entries, and most of the sanity checks. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 27 March 2020, 1 commit
By Julian Wiedmann
Set up qdio_irq->cdev right when the qdio_irq struct is allocated, so that all subsequent code can rely on this pointer. Then convert two helper functions to not pass a cdev parameter around. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 26 March 2020, 1 commit
By Julian Wiedmann
When the support for polling drivers was initially added, it only considered Input Queue 0. But as QDIO interrupts are actually for the full device and not a single queue, this doesn't really fit for configurations where multiple Input Queues are used. Rework the qdio code so that interrupts for a polling driver are not split up into actions for each queue. Instead deliver the interrupt as a single event, and let the driver decide which queue needs what action. When re-enabling the QDIO interrupt via qdio_start_irq(), this means that the qdio code needs to (1) put _all_ eligible queues back into a state where they raise IRQs, and (2) afterwards check _all_ eligible queues for new work to bridge the race window. On the qeth side of things (as the only qdio polling driver), we can now add CQ polling support to the main NAPI poll routine. It doesn't consume NAPI budget, and to avoid hogging the CPU we yield control after completing one full queue worth of buffers. The subsequent qdio_start_irq() will check for any additional work, and have us re-schedule the NAPI instance accordingly. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
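The race-window handling follows a "rearm, then recheck" pattern that can be sketched in a standalone model (the names are made up, not the qdio API): re-enable the interrupt for all eligible queues first, then scan them all again, and stay in polling mode if anything slipped in between.

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_QUEUES 4

/* Illustrative per-device state for a polling driver. */
struct device_state {
	int nr_input_qs;
	bool irq_armed;			/* interrupts re-enabled? */
	int pending_work[MAX_QUEUES];	/* completed-but-unprocessed SBALs */
};

static bool queue_has_work(const struct device_state *dev, int q)
{
	return dev->pending_work[q] != 0;
}

/*
 * Re-enable interrupts for the whole device. Returns true if any queue
 * still has work, in which case the caller must keep polling: the work
 * may have arrived after its scan but before the IRQ was re-armed.
 */
static bool start_irq(struct device_state *dev)
{
	bool more_work = false;

	/* 1) put all queues back into an IRQ-raising state */
	dev->irq_armed = true;

	/* 2) bridge the race window: recheck every queue afterwards */
	for (int q = 0; q < dev->nr_input_qs; q++)
		more_work |= queue_has_work(dev, q);

	if (more_work)
		dev->irq_armed = false;	/* stay in polling mode */
	return more_work;
}

int main(void)
{
	struct device_state dev = { .nr_input_qs = 2 };

	dev.pending_work[1] = 3;	/* work slipped in before re-arming */
	printf("reschedule polling: %s\n", start_irq(&dev) ? "yes" : "no");
	return 0;
}
```

This mirrors the qeth behaviour described above: a "true" result is what would make the driver re-schedule its NAPI instance instead of going idle.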
-
- 20 February 2020, 1 commit
By Julian Wiedmann
Remove all usage of cdev->private->qdio_data that's buried deep in internal code. This should only be used by the exported driver API, which can then pass around a proper qdio_irq pointer. Also trivially merge some initializations with their definitions. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 10 February 2020, 1 commit
By Julian Wiedmann
Current code uses a 'polling' flag to keep track of whether an Input Queue has any ACKed SBALs. QEBSM devices might have multiple ACKed SBALs, and those are tracked separately with 'ack_count'. By also setting ack_count for non-QEBSM devices (to a fixed value of 1), we can use 'ack_count != 0' as a replacement for the polling flag. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
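A small standalone model of the substitution (field names simplified, not the driver's struct): once non-QEBSM devices also record their single ACK in ack_count, the question "does this queue have any ACKed SBALs?" becomes a plain 'ack_count != 0', and the separate flag is redundant.

```c
#include <stdbool.h>
#include <stdio.h>

/* Simplified Input Queue state. */
struct input_queue {
	unsigned int ack_start;	/* first ACKed SBAL */
	unsigned int ack_count;	/* number of ACKed SBALs (fixed 1 for non-QEBSM) */
};

/* Place an ACK; non-QEBSM devices always ACK exactly one SBAL. */
static void set_ack(struct input_queue *q, unsigned int bufnr, bool is_qebsm,
		    unsigned int count)
{
	q->ack_start = bufnr;
	q->ack_count = is_qebsm ? count : 1;
}

/* Replacement for the old 'polling' flag. */
static bool has_ack(const struct input_queue *q)
{
	return q->ack_count != 0;
}

int main(void)
{
	struct input_queue q = { 0, 0 };

	printf("before: %d\n", has_ack(&q));
	set_ack(&q, 42, false, 0);
	printf("after:  %d\n", has_ack(&q));
	return 0;
}
```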
-
- 01 November 2019, 5 commits
By Julian Wiedmann
This allows IQD drivers to send out multiple SBALs with a single SIGA instruction. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
By Julian Wiedmann
Output interrupts are not subject to SLSB-based avoidance, so remove the gratuitous SLSB updates for Output SBALs in ERROR state. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
On an interrupt, tiqdio_thinint_handler() walks a list of all objects that might require attention, and checks their DSCI. This list is awkwardly built from Input Queues, even though the IRQs are per-device and the queue is then only used to dereference its qdio_irq parent. To simplify the logic, change the code so that tiq_list contains qdio_irq entries. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
qperf_inc() takes a queue as input, but actually updates the statistics in its qdio_irq parent. In some contexts we already have access to the qdio_irq struct, and can avoid the additional dereference. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
By Julian Wiedmann
Partial EQBS completion is not a significant event, and the WARN ends up spamming the debug logs for no good reason. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 25 August 2019, 2 commits
By Julian Wiedmann
If a driver wants to use the new Output Queue poll code, then the qdio layer must disable its internal Queue scanning. Let the driver select this mode by passing a special scan_threshold of 0. As the scan_threshold is the same for all Output Queues, also move it into the main qdio_irq struct. This allows for fast opt-out checking; a driver is expected to operate either _all_ or none of its Output Queues in polling mode. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
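A rough sketch of the opt-out described here, using illustrative names rather than the real qdio structs: keeping the threshold per device makes the "is this device in Output Queue polling mode?" test a single comparison.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative per-device qdio state. */
struct qdio_dev {
	unsigned int scan_threshold;	/* 0 = driver polls its Output Queues itself */
	unsigned int out_pending[4];	/* SBALs since the last internal scan, per queue */
};

/* Should the qdio layer itself schedule an Output Queue scan? */
static bool want_internal_scan(const struct qdio_dev *dev, unsigned int q_nr)
{
	if (!dev->scan_threshold)	/* fast opt-out for polling drivers */
		return false;
	return dev->out_pending[q_nr] >= dev->scan_threshold;
}

int main(void)
{
	struct qdio_dev polling = { .scan_threshold = 0, .out_pending = { 99 } };
	struct qdio_dev classic = { .scan_threshold = 32, .out_pending = { 40 } };

	printf("polling driver: %d\n", want_internal_scan(&polling, 0));
	printf("classic driver: %d\n", want_internal_scan(&classic, 0));
	return 0;
}
```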
-
By Julian Wiedmann
While commit d36deae7 ("qdio: extend API to allow polling") enhanced the qdio layer so that drivers can poll their Input Queues, we don't have the corresponding infrastructure for Output Queues yet. Factor out a helper that scans a single QDIO Queue, so that qeth can implement TX NAPI on top of it. While doing so, remove the duplicated tracking of the next-to-scan index (q->first_to_check vs q->first_to_kick) in this code path. qdio_handle_aobs() needs to move slightly upwards in the code hierarchy, so that it's still called from the polling path. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 July 2019, 2 commits
By Julian Wiedmann
The IQD mcast queue doesn't support QAOB mode, so skip the qdio_enable_async_operation() setup call for this queue. This avoids the allocation of an unneeded QAOB pointer array, and sets up q->use_cq properly so that drivers are prohibited from using QAOBs for mcast traffic. Take this opportunity to streamline the q->use_cq and aob != 0 checks. The path to qdio_siga_output() is straight-forward, so we don't need to worry about being called with bad operands. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
-
By Julian Wiedmann
If the device driver were to send out a full queue's worth of SBALs, current code would end up discovering the last of those SBALs as PRIMED and erroneously skip the SIGA-w. This immediately stalls the queue. Add a check to not attempt fast-requeue in this case. While at it, also make sure that the state of the previous SBAL was successfully extracted before inspecting it. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
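The added guard can be sketched as follows (a simplified standalone model; the state names stand in for the real SLSB values): only skip the SIGA when the driver did not just fill the whole ring, and when the previous SBAL's state was read successfully and shows that an earlier SIGA is still outstanding.

```c
#include <stdbool.h>
#include <stdio.h>

#define QDIO_MAX_BUFFERS_PER_Q 128
#define QDIO_BUFNR(num) ((num) & (QDIO_MAX_BUFFERS_PER_Q - 1))

/* Simplified SBAL states for an Output Queue. */
enum out_state { OUT_EMPTY, OUT_PRIMED };

struct out_queue {
	enum out_state slsb[QDIO_MAX_BUFFERS_PER_Q];
};

/* Returns 1 and the state on success (failure handling elided in this sketch). */
static int get_buf_state(const struct out_queue *q, unsigned int bufnr,
			 enum out_state *state)
{
	*state = q->slsb[bufnr];
	return 1;
}

/*
 * Decide whether the SIGA-w can be skipped after queueing 'count' SBALs
 * starting at 'bufnr'. If the driver queued a full ring, the "previous"
 * SBAL is one of the buffers just queued, so a SIGA is always required.
 */
static bool can_fast_requeue(const struct out_queue *q, unsigned int bufnr,
			     unsigned int count)
{
	enum out_state state;

	if (count == QDIO_MAX_BUFFERS_PER_Q)
		return false;
	if (get_buf_state(q, QDIO_BUFNR(bufnr - 1), &state) <= 0)
		return false;
	return state == OUT_PRIMED;	/* earlier SIGA still pending */
}

int main(void)
{
	struct out_queue q;

	for (int i = 0; i < QDIO_MAX_BUFFERS_PER_Q; i++)
		q.slsb[i] = OUT_PRIMED;

	printf("full ring, skip SIGA? %d\n", can_fast_requeue(&q, 0, 128));
	printf("partial,   skip SIGA? %d\n", can_fast_requeue(&q, 64, 16));
	return 0;
}
```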
-
- 07 June 2019, 1 commit
By Julian Wiedmann
When a CQ-enabled device uses QEBSM for SBAL state inspection, get_buf_states() can return the PENDING state for an Output Queue. get_outbound_buffer_frontier() isn't prepared for this, and any PENDING buffer will permanently stall all further completion processing on this Queue. This isn't a concern for non-QEBSM devices, as get_buf_states() for such devices will manually turn PENDING buffers into EMPTY ones. Fixes: 104ea556 ("qdio: support asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
-
- 08 May 2019, 2 commits
By Julian Wiedmann
When get_buf_states() gets called with count > 1, it scans the corresponding number of SBAL states until it encounters a mismatch. But when these SBALs are in a HW-owned state, the callers don't actually care _how many_ such SBALs are on the queue. If we can't process the first SBAL, we can't process any of the following SBALs either. So when the first SBAL is HW-owned, skip the scan of the remaining SBALs and thus save some CPU time. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
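A standalone sketch of the short-circuit with simplified types and names: when the first SBAL is still owned by the hardware, the caller cannot make progress regardless of how many more such SBALs follow, so the scan stops after one lookup.

```c
#include <stdio.h>

#define QDIO_MAX_BUFFERS_PER_Q 128
#define QDIO_BUFNR(num) ((num) & (QDIO_MAX_BUFFERS_PER_Q - 1))

/* Simplified SLSB states. */
enum state { OWNED_BY_HW, OWNED_BY_DRIVER };

/*
 * Count how many consecutive SBALs starting at 'bufnr' share the state of
 * the first one. If that first state is HW-owned, the count is irrelevant
 * to the caller, so skip the scan of the remaining SBALs.
 */
static int get_buf_states(const enum state *slsb, unsigned int bufnr,
			  enum state *state, int count)
{
	int i;

	*state = slsb[bufnr];
	if (*state == OWNED_BY_HW)
		return 1;	/* early exit: no point scanning further */

	for (i = 1; i < count; i++)
		if (slsb[QDIO_BUFNR(bufnr + i)] != *state)
			break;
	return i;
}

int main(void)
{
	enum state slsb[QDIO_MAX_BUFFERS_PER_Q] = { OWNED_BY_HW };
	enum state state;

	printf("scanned: %d\n", get_buf_states(slsb, 0, &state, 64));
	return 0;
}
```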
-
By Julian Wiedmann
For a 1-SBAL state inspection, use the corresponding helper. No functional change, just reducing the number of immediate callers to get_buf_states(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
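Sketched against a similarly simplified model (illustrative code, not the driver's exact functions), the change boils down to a one-line wrapper so that single-SBAL call sites no longer invoke the batch variant directly.

```c
#include <stdio.h>

#define QDIO_MAX_BUFFERS_PER_Q 128

/* Batch variant (simplified): count consecutive SBALs sharing one state. */
static int get_buf_states(const unsigned char *slsb, unsigned int bufnr,
			  unsigned char *state, int count)
{
	int i;

	*state = slsb[bufnr];
	for (i = 1; i < count; i++)
		if (slsb[(bufnr + i) % QDIO_MAX_BUFFERS_PER_Q] != *state)
			break;
	return i;
}

/* 1-SBAL convenience wrapper: the only change needed at the call sites. */
static int get_buf_state(const unsigned char *slsb, unsigned int bufnr,
			 unsigned char *state)
{
	return get_buf_states(slsb, bufnr, state, 1);
}

int main(void)
{
	unsigned char slsb[QDIO_MAX_BUFFERS_PER_Q] = { 0x42 };
	unsigned char state;

	get_buf_state(slsb, 0, &state);
	printf("state of SBAL 0: 0x%02x\n", state);
	return 0;
}
```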
-