- 05 11月, 2019 1 次提交
-
-
由 Maciej Fijalkowski 提交于
Add support for XDP. Implement ndo_bpf and ndo_xdp_xmit. Upon load of an XDP program, allocate additional Tx rings for dedicated XDP use. The following actions are supported: XDP_TX, XDP_DROP, XDP_REDIRECT, XDP_PASS, and XDP_ABORTED. Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 05 9月, 2019 3 次提交
-
-
由 Jesse Brandeburg 提交于
Add a small bit of efficiency to the code by adding a prefetch of the port_info structure in order to help avoid a cache miss a little later on in execution. Also add an unlikely statement to a branch which generally will never happen in normal operation. Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Jesse Brandeburg 提交于
This is a simple patch to move the assignment to a local variable closer to the site where the local variable is used. This can help readability and also maybe performance, although the performance enhancement is really dependent upon the compiler. No functional change. Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Jesse Brandeburg 提交于
There are a couple of functions that don't need two arguments passed in when the second argument already had access to the pointer pointed to by the first. Remove the unnecessary arguments. Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 24 8月, 2019 1 次提交
-
-
由 Dave Ertman 提交于
For control packets (i.e. LLDP packets) to be able to egress from the main VSI, a bit has to be set in the TX_descriptor. This should only be done for the main VSI and only if the FW LLDP agent is disabled. A bit to allow this also has to be set in the VSI context. Add the logic to add the necessary bits in the VSI context for the PF_VSI and the TX_descriptors for control packets egressing the PF_VSI. Signed-off-by: NDave Ertman <david.m.ertman@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 21 8月, 2019 3 次提交
-
-
由 Mitch Williams 提交于
In some circumstances, the hardware will hand us a receive descriptor which has no data attached, but is otherwise valid. The receive code was improperly ignoring these descriptors, which result in an infinite loop. To fix this, change the receive code to process all descriptors, regardless of the size of the associated data. Add checks to the memory-handling functions to allow for zero size. Signed-off-by: NMitch Williams <mitch.a.williams@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Brett Creeley 提交于
Currently when busy polling is enabled we aren't setting/enabling WB_ON_ITR in the driver. This doesn't break the driver, but it does cause issues. If we don't enable WB_ON_ITR mode we will still get write-backs from hardware during polling when a cache line has been filled, but if a cache line is not filled we will not get the write-back because WB_ON_ITR is not set. Fix this by enabling WB_ON_ITR in the driver when interrupts are disabled. Signed-off-by: NBrett Creeley <brett.creeley@intel.com> Signed-off-by: NTony Nguyen <anthony.l.nguyen@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Brett Creeley 提交于
Currently we divide budget by the number of Rx queues per Rx ring container in ice_napi_poll even if there is only 1. This is an unnecessary divide for the normal case of 1 Rx ring per Rx ring container. Fix this by using an unlikely() call in the case where we actually need to divide. Also, we will always set budget_per_ring even if there are no Rx rings in the Rx ring container so we don't need to initialize it to 0. Signed-off-by: NBrett Creeley <brett.creeley@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 01 8月, 2019 3 次提交
-
-
由 Brett Creeley 提交于
This flag is not needed and is called every time we re-enable interrupts in the hotpath so remove it. Also remove ice_vsi_req_irq() because it was a wrapper function for ice_vsi_req_irq_msix() whose sole purpose was checking the ICE_FLAG_MSIX_ENA flag. Signed-off-by: NBrett Creeley <brett.creeley@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Brett Creeley 提交于
Currently if the call to ice_alloc_mapped_page() fails we jump to the no_buf label, possibly call ice_release_rx_desc(), and return true indicating that there is more work to do. In the success case we just fall out of the while loop, possibly call ice_alloc_mapped_page(), and return false saying we exhausted cleaned_count. This flow can be improved by breaking if ice_alloc_mapped_page() fails and then the flow outside of the while loop is the same for the failure and success case. Signed-off-by: NBrett Creeley <brett.creeley@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Brett Creeley 提交于
Currently we bump the Rx tail and release/give buffers to hardware every 16 descriptors. This causes us to bump Rx tail up to 4 times per napi_poll call. Also we are always bumping tail on an odd index and this is a problem because hardware ignores the lower 3 bits in the QRX_TAIL register. This is making it so hardware sees tail bumps only every 8 descriptors. Instead lets only bump Rx tail once per napi_poll if the value aligns with hardware's expectations of the lower 3 bits being cleared. Also only release/give Rx buffers once per napi_poll call. Signed-off-by: NBrett Creeley <brett.creeley@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 23 7月, 2019 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
In preparation for unifying the skb_frag and bio_vec, use the fine accessors which already exist and use skb_frag_t instead of struct skb_frag_struct. Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 5月, 2019 1 次提交
-
-
由 Anirudh Venkataramanan 提交于
This patch mostly capitalizes abbreviations in code comments. Fixed some typos and removed some unnecessary newlines as well. Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 29 5月, 2019 1 次提交
-
-
由 Bruce Allan 提交于
Some static analysis tools can complain when doing a bitop assignment using operands of different sizes. Fix that. Signed-off-by: NBruce Allan <bruce.w.allan@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 05 5月, 2019 1 次提交
-
-
由 Bruce Allan 提交于
A recent version of cppcheck falsely reports- Variable ip.hdr is assigned a value that is never used. ip is a union so the pointer ip.hdr is actually used when referenced as ip.v4 and ip.v6. Silence these false reports when using cppcheck with the --inline-suppr command-line option. Signed-off-by: NBruce Allan <bruce.w.allan@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 02 5月, 2019 1 次提交
-
-
由 Brett Creeley 提交于
Every time we want to re-enable interrupts and/or write to a register that requires an interrupt vector's hardware index we do the following: vsi->hw_base_vector + q_vector->v_idx This is a wasteful operation, especially in the hot path. Fix this by adding a u16 reg_idx member to the ice_q_vector structure and make the necessary changes to make this work. Signed-off-by: NBrett Creeley <brett.creeley@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 24 4月, 2019 1 次提交
-
-
由 Stanislav Fomichev 提交于
Update all users of eth_get_headlen to pass network device, fetch network namespace from it and pass it down to the flow dissector. This commit is a noop until administrator inserts BPF flow dissector program. Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: NStanislav Fomichev <sdf@google.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 18 4月, 2019 3 次提交
-
-
由 Brett Creeley 提交于
Currently when calculating how much to increment ITR by inside of ice_update_itr() we do some estimations and intermediate calculations. Instead of doing estimations, just do the calculation directly. This allows for a more accurate value and it makes it easier for the next person to understand and update. Also, remove the dividing the ITR value by 2 when latency driven because the ITR values are already so low for 100Gbps speed. This should help get to the desired ITR value faster. Signed-off-by: NBrett Creeley <brett.creeley@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Anirudh Venkataramanan 提交于
This patch introduces a new function ice_tx_prepare_vlan_flags_dcb to insert 802.1p priority information into the VLAN header Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Anirudh Venkataramanan 提交于
Capitalize abbreviations and spell out some that aren't obvious. Reviewed-by: NBruce Allan <bruce.w.allan@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 08 4月, 2019 1 次提交
-
-
由 Will Deacon 提交于
mmiowb() is now implied by spin_unlock() on architectures that require it, so there is no reason to call it from driver code. This patch was generated using coccinelle: @mmiowb@ @@ - mmiowb(); and invoked as: $ for d in drivers include/linux/qed sound; do \ spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done NOTE: mmiowb() has only ever guaranteed ordering in conjunction with spin_unlock(). However, pairing each mmiowb() removal in this patch with the corresponding call to spin_unlock() is not at all trivial, so there is a small chance that this change may regress any drivers incorrectly relying on mmiowb() to order MMIO writes between CPUs using lock-free synchronisation. If you've ended up bisecting to this commit, you can reintroduce the mmiowb() calls using wmb() instead, which should restore the old behaviour on all architectures other than some esoteric ia64 systems. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 02 4月, 2019 1 次提交
-
-
由 Florian Westphal 提交于
There are two reasons for this. First, the xmit_more flag conceptually doesn't fit into the skb, as xmit_more is not a property related to the skb. Its only a hint to the driver that the stack is about to transmit another packet immediately. Second, it was only done this way to not have to pass another argument to ndo_start_xmit(). We can place xmit_more in the softnet data, next to the device recursion. The recursion counter is already written to on each transmit. The "more" indicator is placed right next to it. Drivers can use the netdev_xmit_more() helper instead of skb->xmit_more to check the "more packets coming" hint. skb->xmit_more is retained (but always 0) to not cause build breakage. This change takes care of the simple s/skb->xmit_more/netdev_xmit_more()/ conversions. Remaining drivers are converted in the next patches. Suggested-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 3月, 2019 3 次提交
-
-
由 Anirudh Venkataramanan 提交于
Single statement if conditions don't need braces. Remove it. Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Brett Creeley 提交于
Currently the ice_q_vector structure and ice_ring_container structure are taking up more space than necessary due to cache alignment holes and unnecessary variables respectively. This is not helping the driver's performance. The following fixes were done to improve cache alignment, reduce wasted space, and increase performance. 1. Remove the ice_latency_range enum as it is unused. 2. Remove the latency_range variable in the ice_ring_container structure. 3. Change the size of the itr_idx in the ice_ring_container structure from an int to an u16. This reduced the size of ice_ring_container structure to 32 Bytes so it has no holes or padding. 4. Re-arrange the ice_q_vector structure using pahole to align members as best as possible in regards to 64 Byte cache line size. Signed-off-by: NBrett Creeley <brett.creeley@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Anirudh Venkataramanan 提交于
commit 63f545ed ("ice: Add support for adaptive interrupt moderation") was meant to add support for adaptive interrupt moderation but there was an error on my part while formatting the patch, and thus only part of the patch ended up being submitted. This patch rectifies the error by adding the rest of the code. Fixes: 63f545ed ("ice: Add support for adaptive interrupt moderation") Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 26 3月, 2019 5 次提交
-
-
由 Maciej Fijalkowski 提交于
Provide DMA_ATTR_WEAK_ORDERING and DMA_ATTR_SKIP_CPU_SYNC attributes to the DMA API during the mapping operations on Rx side. With this change the non-x86 platforms will be able to sync only with what is being used (2k buffer) instead of entire page. This should yield a slight performance improvement. Furthermore, DMA unmap may destroy the changes that were made to the buffer by CPU when platform is not a x86 one. DMA_ATTR_SKIP_CPU_SYNC attribute usage fixes this issue. Also add a sync_single_for_device call during the Rx buffer assignment, to make sure that the cache lines are cleared before device attempting to write to the buffer. Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Maciej Fijalkowski 提交于
Refactor ice_fetch_rx_buf and ice_add_rx_frag in a way that we have standalone functions that do either the skb construction or frag addition to previously constructed skb. The skb handling between rx_bufs is spread among various functions. The ice_get_rx_buf will retrieve the skb pointer from rx_buf and if it is a NULL pointer then we do the ice_construct_skb, otherwise we add a frag to the current skb via ice_add_rx_frag. Then, on the ice_put_rx_buf the skb pointer that belongs to rx_buf will be cleared. Moving further, if the current frame is not EOP frame we assign the current skb to the rx_buf that is pointed by updated next_to_clean indicator. What is more during the buffer reuse let's assign each member of ice_rx_buf individually so we avoid the unnecessary copy of skb. Last but not least, this logic split will allow us for better code reuse when adding a support for build_skb. Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Maciej Fijalkowski 提交于
Pull out the code responsible for page counting and buffer recycling so that it will be possible to clean up the Rx buffers in cases where we won't allocate skb (ex. XDP) Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Maciej Fijalkowski 提交于
{get,put}_page are atomic operations which we use for page count handling. The current logic for refcount handling is that we increment it when passing a skb with the data from the first half of page up to netstack and recycle the second half of page. This operation protects us from losing a page since the network stack can decrement the refcount of page from skb. The performance can be gently improved by doing the bulk updates of refcount instead of doing it one by one. During the buffer initialization, maximize the page's refcount and don't allow the refcount to become less than two. Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Maciej Fijalkowski 提交于
Instead of adding a frag and later when dealing with EOP frame accessing that frag in order to copy the headers onto linear part of skb, we can do this in ice_add_rx_frag in case where the data_len is still 0 and frame won't fit onto the linear part as a whole. Function comment of ice_pull_tail was a bit misleading because of mentioned optimizations that can be performed (drop a frag/maintaining accurate truesize of skb) - it seems that this part of logic was dropped and the comment was not updated to reflect this change. Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 25 3月, 2019 2 次提交
-
-
由 Maciej Fijalkowski 提交于
Introduce ice_can_reuse_rx_page which will verify whether the page can be reused and return the boolean result to caller. Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Maciej Fijalkowski 提交于
Introduce ice_get_rx_buf, which will fetch the Rx buffer and do the DMA synchronization. Length of the packet that hardware Rx descriptor contains is now read in ice_clean_rx_irq, so we can feed ice_get_rx_buf with it and resign from rx_desc passed as argument in ice_fetch_rx_buf and ice_add_rx_frag. Signed-off-by: NMaciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 22 3月, 2019 1 次提交
-
-
由 Bruce Allan 提交于
Put the return type on a separate line for function prototypes and signatures that would exceed the 80-character limit if both were on the same line. Signed-off-by: NBruce Allan <bruce.w.allan@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 20 3月, 2019 1 次提交
-
-
由 Brett Creeley 提交于
Currently we set the default number of Tx and Rx descriptors to 128 by default. For Rx this amounts to a full page (assuming 4K pages) because each Rx descriptor is 32 Bytes, but for Tx it only amounts to a half page because each Tx descriptor is 16 Bytes (assuming 4K pages). Instead of assuming 4K pages, determine the ring size and the number of descriptors for Tx and Rx based on a calculation using the PAGE_SIZE, ICE_MAX_NUM_DESC, and ICE_REQ_DESC_MULTIPLE. This change is being made to improve the performance of the driver when using the default settings. Signed-off-by: NBrett Creeley <brett.creeley@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 26 2月, 2019 2 次提交
-
-
由 Bruce Allan 提交于
When compiling and analyzing the driver on newer kernels, a static analyzer warns about the following "numeric overflow" issues: "The result of expression: 'budget-1' generates 4-byte type while casting to a bigger size of 8-byte". "The result of expression: '*words-words_read' generates 4-byte type while casting to a bigger size of 8-byte". Fix them both. Signed-off-by: NBruce Allan <bruce.w.allan@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Bruce Allan 提交于
With sizeof(), it is preferable to use the variable of type <type> instead of sizeof(<type>). There are multiple places where a temporary variable is used to hold a 'size' value which is then used for a subsequent alloc/memset. Get rid of the temporary variable by calculating size as part of the alloc/memset statement. Also remove unnecessary type-cast. Signed-off-by: NBruce Allan <bruce.w.allan@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 16 1月, 2019 2 次提交
-
-
由 Anirudh Venkataramanan 提交于
This patch adds the ability to offload SCTP checksum calculations to the NIC. Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
由 Brett Creeley 提交于
Currently the driver does not support adaptive/dynamic interrupt moderation. This patch adds support for this. Also, adaptive/dynamic interrupt moderation is turned on by default upon driver load. In order to support adaptive interrupt moderation, two functions were added, ice_update_itr() and ice_itr_divisor(). These are used to determine the current packet load and to determine a divisor based on link speed respectively. This patch also adds the ICE_ITR_GRAN_S define that is used in the hot-path when setting a new ITR value. The shift is used to pet two birds with one hand, set the ITR value while re-enabling the interrupt. Also, the ICE_ITR_GRAN_S is defined as 1 because the device has a ITR granularity of 2usecs. Signed-off-by: NBrett Creeley <brett.creeley@intel.com> Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 22 11月, 2018 1 次提交
-
-
由 Jesse Brandeburg 提交于
While reviewing code, I noticed that Eric Dumazet recommends that drivers check the return code of napi_complete_done, and use that to decide to enable interrupts or not when exiting poll. One of the Intel drivers was already fixed (ixgbe). Upon looking at the Intel drivers as a whole, we are handling our polling and NAPI exit in a few different ways based on whether we have multiqueue and whether we have Tx cleanup included. Several drivers had the bug of exiting NAPI with return 0, which appears to mess up the accounting in the stack. Consolidate all the NAPI routines to do best known way of exiting and to just mostly look like each other. 1) check return code of napi_complete_done to control interrupt enable 2) return the actual amount of work done. 3) return budget immediately if need NAPI poll again Tested the changes on e1000e with a high interrupt rate set, and it shows about an 8% reduction in the CPU utilization when busy polling because we aren't re-enabling interrupts when we're about to be polled. Signed-off-by: NJesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Reviewed-by: NJacob Keller <jacob.e.keller@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 21 11月, 2018 1 次提交
-
-
由 Anirudh Venkataramanan 提交于
In code comments, use Tx|Rx instead of tx|rx Signed-off-by: NAnirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: NAndrew Bowers <andrewx.bowers@intel.com> Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
-