1. 25 September 2016 (2 commits)
  2. 06 May 2016 (4 commits)
  3. 14 April 2016 (1 commit)
  4. 07 April 2016 (1 commit)
  5. 05 April 2016 (1 commit)
    • i40e/i40evf: Allow up to 12K bytes of data per Tx descriptor instead of 8K · 5c4654da
      Authored by Alexander Duyck
      From what I can tell the practical limitation on the size of the Tx data
      buffer is the fact that the Tx descriptor is limited to 14 bits.  As such
      we cannot use 16K as is typically used on the other Intel drivers.  However
      artificially limiting ourselves to 8K can be expensive as this means that
      we will consume up to 10 descriptors (1 context, 1 for header, and 9 for
      payload, non-8K aligned) in a single send.
      
      I propose that we can reduce this by increasing the maximum data for a 4K
      aligned block to 12K.  This reduces by one the number of descriptors used
      for a 32K aligned block.  In addition, the remaining 4K - 1 bytes of
      otherwise unused space can serve as extra padding when dealing with data
      that is not aligned to 4K.
      
      By aligning the descriptors after the first to 4K we can improve the
      efficiency of PCIe accesses as we can avoid using byte enables and can fetch
      full TLP transactions after the first fetch of the buffer.  This helps to
      improve PCIe efficiency.  Below are the results of testing before and after
      with this patch:
      
      Recv   Send   Send                         Utilization      Service Demand
      Socket Socket Message  Elapsed             Send     Recv    Send    Recv
      Size   Size   Size     Time    Throughput  local    remote  local   remote
      bytes  bytes  bytes    secs.   10^6bits/s  % S      % U     us/KB   us/KB
      Before:
      87380  16384  16384    10.00     33682.24  20.27    -1.00   0.592   -1.00
      After:
      87380  16384  16384    10.00     34204.08  20.54    -1.00   0.590   -1.00
      
      So the net result of this patch is that we have a small gain in throughput
      due to a reduction in overhead for putting together the frame.
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      5c4654da
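The arithmetic behind the change can be sketched in a few lines. This is a minimal, illustrative model (the constant names echo the driver's `I40E_MAX_DATA_PER_TXD`-style defines, but the code below is not the driver's source): the 14-bit size field caps a buffer at 16K - 1 bytes, and rounding that down to the 4K read-request boundary yields the 12K per-descriptor maximum.

```c
#include <assert.h>

#define MAX_READ_REQ_SIZE        4096           /* PCIe max read request */
#define MAX_DATA_PER_TXD         (16 * 1024 - 1) /* 14-bit size field */
#define MAX_DATA_PER_TXD_ALIGNED \
        (MAX_DATA_PER_TXD & ~(MAX_READ_REQ_SIZE - 1)) /* = 12K */

/* descriptors needed for a data buffer of the given size */
static unsigned int txd_use_count(unsigned int size)
{
    return (size + MAX_DATA_PER_TXD_ALIGNED - 1) / MAX_DATA_PER_TXD_ALIGNED;
}
```

With the old 8K limit a 32K-aligned block needed 4 data descriptors; at 12K it needs only 3, which is the one-descriptor saving the message describes.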
  6. 19 February 2016 (4 commits)
    • i40e/i40evf: Rewrite logic for 8 descriptor per packet check · 2d37490b
      Authored by Alexander Duyck
      This patch is meant to rewrite the logic for how we determine if we can
      transmit the frame or if it needs to be linearized.
      
      The previous code for this function was using a mix of division and modulus
      division as a part of computing if we need to take the slow path.  Instead
      I have replaced this by simply working with a sliding window which will
      tell us if the frame would be capable of causing a single packet to span
      several descriptors.
      
      The logic for the scan is fairly simple.  If any given group of 6 fragments
      totals less than gso_size - 1 bytes, then it is possible for us to have one byte
      coming out of the first fragment, 6 fragments, and one or more bytes coming
      out of the last fragment.  This gives us a total of 8 fragments
      which exceeds what we can allow so we send such frames to be linearized.
      
      Arguably the use of modulus might be more exact as the approach I propose
      may generate some false positives.  However the likelihood of us taking much
      of a hit for those false positives is fairly low, and I would rather not
      add more overhead in the case where we are receiving a frame composed of 4K
      pages.
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      2d37490b
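The sliding-window check described above can be modeled standalone. This is an illustrative sketch, not the driver's implementation (the real code walks skb fragments with an O(n) running sum; the function and array here are hypothetical): if one gso_size segment could take one byte from a leading fragment, consume six whole fragments, and spill into an eighth, the frame is sent to be linearized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if some gso_size segment could span 8 data fragments,
 * which exceeds the per-packet descriptor limit. */
static bool chk_linearize(const size_t *frag_sz, int nr_frags,
                          size_t gso_size)
{
    /* with 7 or fewer fragments no segment can touch 8 of them */
    if (nr_frags < 8)
        return false;

    /* slide a window of 6 fragments with one fragment on each side */
    for (int i = 1; i + 6 < nr_frags; i++) {
        size_t sum = 1;                 /* worst case: 1 byte ahead */
        for (int j = i; j < i + 6; j++)
            sum += frag_sz[j];
        if (sum < gso_size)             /* segment spills past window */
            return true;
    }
    return false;
}
```

The false positives the message mentions come from assuming the worst-case one-byte head; an exact modulus-based test would avoid them at the cost of division in the hot path.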
    • i40e/i40evf: Break up xmit_descriptor_count from maybe_stop_tx · 4ec441df
      Authored by Alexander Duyck
      In an upcoming patch I would like to have access to the descriptor count
      used for the data portion of the frame.  For this reason I am splitting up
      the descriptor count function from the function that stops the ring.
      
      Also, in order to reduce unnecessary code duplication, I am moving the
      slow-path portions of the code out of line, so that we can jump to them
      and process them instead of having to build a copy into each function
      that calls them.
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      4ec441df
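The inline-fast-path/out-of-line-slow-path split can be sketched as follows. This is a simplified model under assumed names (the real driver stops the netdev subqueue and re-checks after a memory barrier in its slow path):

```c
#include <assert.h>
#include <stdbool.h>

struct tx_ring { int next_to_use, next_to_clean, count; };

/* descriptors still unused on the ring */
static int desc_unused(const struct tx_ring *r)
{
    return ((r->next_to_clean > r->next_to_use) ? 0 : r->count) +
           r->next_to_clean - r->next_to_use - 1;
}

/* out-of-line slow path: in the driver this stops the queue, re-checks
 * after a barrier, and restarts if space appeared; modeled here as a
 * plain "busy" result */
static int maybe_stop_tx_slow(struct tx_ring *r, int needed)
{
    (void)r; (void)needed;
    return -1;
}

/* fast path stays inline; only the rare full-ring case takes the call */
static inline int maybe_stop_tx(struct tx_ring *r, int needed)
{
    if (desc_unused(r) >= needed)
        return 0;
    return maybe_stop_tx_slow(r, needed);
}
```

Keeping only the cheap comparison inline means each transmit function pays one branch in the common case, while the descriptor count itself can be computed separately and reused, which is what the next patch relies on.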
    • i40e/i40evf: Add exception handling for Tx checksum · 529f1f65
      Authored by Alexander Duyck
      Add exception handling to the Tx checksum path so that we can handle cases
      of TSO where the frame is bad, or Tx checksum where we didn't recognize a
      protocol.
      
      Drop I40E_TX_FLAGS_CSUM as it is unused, and move the CHECKSUM_PARTIAL check
      into the function itself so that we can decrease the indent.
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      529f1f65
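The shape of the change can be sketched as turning a silent skip into a reportable error. This is an illustrative model with hypothetical names and return conventions, not the driver's actual signature:

```c
#include <assert.h>
#include <stdbool.h>

enum { PROTO_TCP = 6, PROTO_UDP = 17, PROTO_SCTP = 132 };

/* Before: a void function that silently skipped unknown protocols.
 * After: a return value lets the caller drop a bad TSO frame or fall
 * back to a software checksum instead of transmitting half-offloaded. */
static int tx_enable_csum(int l4_proto, bool is_tso)
{
    switch (l4_proto) {
    case PROTO_TCP:
    case PROTO_UDP:
    case PROTO_SCTP:
        return 0;       /* offload programmed into the descriptor */
    default:
        if (is_tso)
            return -1;  /* bad TSO frame: caller drops it */
        return 1;       /* unknown protocol: software checksum */
    }
}
```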
    • i40e/i40evf: Drop outer checksum offload that was not requested · a9c9a81f
      Authored by Alexander Duyck
      The i40e and i40evf drivers contained code for inserting an outer checksum
      on UDP tunnels.  The issue however is that the upper levels of the stack
      never requested such an offload and it results in possible errors.
      
      In addition the same logic was being applied to the Rx side where it was
      attempting to validate the outer checksum, but the logic there was
      incorrect in that it was testing for the resultant sum to be equal to the
      header checksum instead of being equal to 0.
      
      Since this code is so massively flawed, and doing things that we didn't ask
      for, I am just dropping it, and will bring it back later to use as an
      offload for SKB_GSO_UDP_TUNNEL_CSUM, which can make use of such a feature.
      
      As for the Rx feature, I am dropping it completely, since it would need to
      be massively expanded and applied to IPv4 and IPv6 checksums for all parts,
      not just the one that supports Tx checksum offload for the outer header.
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      a9c9a81f
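The Rx bug the message describes is worth spelling out: an RFC 1071 Internet checksum is valid when the ones'-complement sum over the data *including* the checksum field folds to all-ones (equivalently, its complement is zero), not when the sum equals the header checksum. A minimal sketch of the correct check (standalone, big-endian byte pairing; not the driver's code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* fold a 32-bit running sum into 16 bits (RFC 1071) */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* ones'-complement sum over a buffer, 16 bits at a time */
static uint32_t csum_partial(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((buf[i] << 8) | buf[i + 1]);
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;
    return sum;
}

/* valid iff the sum INCLUDING the checksum field folds to all-ones */
static int csum_ok(const uint8_t *hdr, size_t len)
{
    return csum_fold(csum_partial(hdr, len)) == 0xffff;
}
```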
  7. 18 February 2016 (5 commits)
  8. 03 December 2015 (1 commit)
  9. 02 December 2015 (1 commit)
    • i40e/i40evf: Fix RS bit update in Tx path and disable force WB workaround · 6a7fded7
      Authored by Anjali Singhai Jain
      This patch fixes the issue of forcing write-backs (WB) too often, which
      kept us from benefiting from NAPI.
      
      Without this patch we were forcing a WB/arming the interrupt too often,
      taking away the benefits of NAPI and causing a performance impact.
      
      With this patch we disable force WB in the clean routine for X710
      and XL710 adapters. X722 adapters do not enable interrupt to force
      a WB and benefit from WB_ON_ITR and hence force WB is left enabled
      for those adapters.
      For XL710 and X710 adapters, if we have fewer than 4 packets pending,
      a software interrupt triggered from the service task will force a WB.
      
      This patch also changes the conditions for setting the RS bit as described
      in the code comments. This optimizes when the HW does a tail bump and when
      it does a WB. It also optimizes when we do a wmb().
      
      Change-ID: Id831e1ae7d3e2ec3f52cd0917b41ce1d22d75d9d
      Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      6a7fded7
  10. 26 November 2015 (1 commit)
  11. 20 October 2015 (3 commits)
  12. 16 October 2015 (1 commit)
    • i40e/i40evf: moderate interrupts differently · ac26fc13
      Authored by Jesse Brandeburg
      The XL710 hardware has a different interrupt moderation design
      that can support a limit of total interrupts per second per
      vector, in addition to the "number of interrupts per second"
      controls already established in the driver.  This combination
      of hardware features allows us to set very low default latency
      settings while minimizing the total CPU utilization by not
      generating too many interrupts, should the user so desire.
      
      The current driver implementation is still enabling the dynamic
      moderation in the driver, and only using the rx/tx-usecs
      limit in ethtool to limit the interrupt rate per second, by default.
      
      The new code implemented in this patch:
      1) adds init/use of the new "Interrupt Limit" register
      2) adds an ethtool knob to control/report the limits above
      
      Usage is "ethtool -C ethx rx-usecs-high <value>", where <value> is a
      number of microseconds that caps the vector at one interrupt per <value>
      microseconds, regardless of the rx-usecs or tx-usecs values. Since there
      is a credit-based scheme in the hardware, rx-usecs and tx-usecs can be
      configured for very low latency over short bursts, but once the credit
      runs out the refill rate of the credits is limited by rx-usecs-high.
      
      Change-ID: I3a1075d3296123b0f4f50623c779b027af5b188d
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      ac26fc13
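The relationship between the microsecond knob and the resulting interrupt rate is simple arithmetic; a one-line sketch (illustrative, not the driver's register encoding):

```c
#include <assert.h>

/* total interrupt rate implied by an interval given in microseconds;
 * rx-usecs-high caps the vector's overall rate at this value, while
 * rx-usecs/tx-usecs still set the short-burst latency */
static unsigned long ints_per_sec(unsigned long usecs)
{
    return usecs ? 1000000UL / usecs : 0;
}
```

For example, rx-usecs-high of 100 limits the vector to at most 10000 interrupts per second, even if rx-usecs is set low enough to fire far more often during a burst.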
  13. 09 October 2015 (1 commit)
  14. 08 October 2015 (1 commit)
  15. 29 September 2015 (1 commit)
  16. 06 August 2015 (3 commits)
  17. 23 July 2015 (1 commit)
  18. 28 May 2015 (2 commits)
  19. 26 February 2015 (1 commit)
  20. 24 February 2015 (1 commit)
    • i40e/i40evf: Refactor the receive routines · a132af24
      Authored by Mitch Williams
      Split the receive hot path code into two, one for packet split and one
      for single buffer. This improves receive performance since we only need
      to check if the ring is in packet split mode once per NAPI poll time,
      not several times per packet. The single buffer code is further improved
      by the removal of a bunch of code and several variables that are not
      needed. On a receive-oriented test this can improve single-threaded
      throughput.
      
      Also refactor the packet split receive path to use a fixed buffer for
      headers, like ixgbe does. This vastly reduces the number of DMA mappings
      and unmappings we need to do, allowing for much better performance in
      the presence of an IOMMU.
      
      Lastly, correct packet split descriptor types now that we are actually
      using them.
      
      Change-ID: I3a194a93af3d2c31e77ff17644ac7376da6f3e4b
      Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
      Tested-by: Jim Young <james.m.young@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      a132af24
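The "check once per poll, not per packet" idea can be sketched with a trivial dispatcher. This is a schematic model under assumed names (the driver's real functions take a ring and budget and walk descriptors):

```c
#include <assert.h>
#include <stdbool.h>

struct rx_ring { bool ps_enabled; /* packet-split mode flag */ };

/* stand-ins for the two specialized hot paths; each would walk the
 * ring and return the number of packets cleaned */
static int clean_rx_irq_ps(struct rx_ring *r, int budget)
{
    (void)r;
    return budget;
}

static int clean_rx_irq_1buf(struct rx_ring *r, int budget)
{
    (void)r;
    return budget;
}

/* the packet-split test runs once per NAPI poll, not once per packet,
 * so each hot path stays branch-free on a per-packet basis */
static int napi_clean_rx(struct rx_ring *r, int budget)
{
    return r->ps_enabled ? clean_rx_irq_ps(r, budget)
                         : clean_rx_irq_1buf(r, budget);
}
```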
  21. 09 February 2015 (1 commit)
  22. 11 November 2014 (1 commit)
  23. 27 August 2014 (1 commit)
  24. 03 July 2014 (1 commit)