- 19 Mar, 2012 3 commits
-
-
Committed by Alexander Duyck
This patch adds support for enabling or disabling UDP RSS via the ethtool -N rx-flow-hash command. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
This change makes it so that only the 2nd cache line in the ring structure should see frequent updates. The advantage of this is that it should reduce the amount of cross-CPU cache bouncing, since only the 2nd cache line will be changing between most network transactions. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
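The layout idea can be illustrated with a minimal sketch. It assumes a 64-byte cache line and uses a hypothetical `struct ring_sketch` with made-up field groupings; it is not the driver's actual ring definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical ring layout: read-mostly fields first, hot fields second. */
struct ring_sketch {
    /* first cache line: written at setup, then only read */
    void *desc;
    void *buffer_info;
    uint16_t count;
    uint16_t queue_index;

    /* second cache line: written on nearly every packet */
    _Alignas(CACHE_LINE) uint16_t next_to_use;
    uint16_t next_to_clean;
    uint64_t packets;
    uint64_t bytes;
};

/* The hot fields must start on, and fit inside, the second cache line. */
static_assert(offsetof(struct ring_sketch, next_to_use) == CACHE_LINE,
              "hot fields must start on the second cache line");
static_assert(offsetof(struct ring_sketch, bytes) + sizeof(uint64_t)
              <= 2 * CACHE_LINE,
              "hot fields must fit in the second cache line");
```

With this grouping, CPUs that only read the setup fields keep a clean copy of the first line while the owning CPU dirties the second.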
-
Committed by Alexander Duyck
This change makes it so that we store the tx_flags and protocol information to the tx_buffer_info structure sooner. This allows us to avoid unnecessary read/write transactions since we are placing the data in the final location earlier. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 17 Mar, 2012 4 commits
-
-
Committed by Alexander Duyck
This change makes it so that we always write the DMA address for the skb itself on the same tx_buffer struct that the skb is written on. This way we don't need the MAPPED_AS_PAGE flag and we always know it will be the first DMA value that we will have to unmap. In addition, I have found an issue in which we were leaking a DMA mapping if the value happened to be 0, which is possible on some platforms. In order to resolve that, I have updated the transmit path to use the length instead of the DMA mapping in order to determine if a mapping is actually present. One other tweak in this patch is that it only writes the olinfo information on the first descriptor. As it turns out, it isn't necessary to write it for anything but the first descriptor, so there is no need to carry it forward. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
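The length-based check might look roughly like the sketch below. The `struct tx_buffer` layout, `fake_dma_unmap()`, and the `unmap_calls` counter are stand-ins invented for illustration; the real driver calls the kernel DMA API instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t; /* stand-in for the kernel type */

/* Hypothetical, simplified Tx buffer bookkeeping entry. */
struct tx_buffer {
    dma_addr_t dma; /* bus address; 0 can be a valid mapping */
    size_t len;     /* mapped length; 0 means nothing is mapped */
};

static unsigned int unmap_calls; /* counts simulated unmaps */

/* Stand-in for the real DMA unmap call. */
static void fake_dma_unmap(dma_addr_t dma, size_t len)
{
    (void)dma;
    assert(len > 0);
    unmap_calls++;
}

/* Release a buffer, keyed on the length rather than on the address. */
static void tx_buffer_unmap(struct tx_buffer *buf)
{
    if (buf->len) {
        fake_dma_unmap(buf->dma, buf->len);
        buf->len = 0; /* mark the slot as unmapped */
    }
}
```

A buffer whose `dma` is 0 but whose `len` is nonzero is still unmapped exactly once, which is the case a test such as `if (buf->dma)` would leak.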
-
Committed by Alexander Duyck
Instead of keeping a local copy of the skb on the stack for as long as we do, it makes sense to just place it on the first tx_buffer structure so that we can save space on the stack and avoid unnecessary read/write operations copying the pointer out of the stack and onto the ring later. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
A separate value was added to track Tx completions in order to determine if the Tx unit was hung. However, we can do the same thing using the number of packets completed without having to add another stat to the Tx ring. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
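As a rough sketch of the idea, a hang check can compare the completed-packet counter against its value at the previous check. The structure, the field names, and the two-step "arm, then report" policy below are assumptions made for illustration, not a copy of the driver's logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-ring state needed for hang detection. */
struct tx_ring_state {
    uint64_t packets;     /* packets completed so far (existing stat) */
    uint64_t tx_done_old; /* value of 'packets' at the previous check */
    bool hang_armed;      /* set when one stalled check was already seen */
};

/*
 * Returns true only when two consecutive checks see no completed packets
 * while work is still pending on the ring.
 */
static bool check_tx_hang(struct tx_ring_state *ring, bool tx_pending)
{
    uint64_t tx_done;

    assert(ring != NULL);
    tx_done = ring->packets;

    if (ring->tx_done_old == tx_done && tx_pending) {
        bool was_armed = ring->hang_armed;

        ring->hang_armed = true;
        return was_armed;
    }

    /* Progress was made (or nothing is pending): remember it and disarm. */
    ring->tx_done_old = tx_done;
    ring->hang_armed = false;
    return false;
}
```

Because the packet counter is already maintained for statistics, no extra per-completion write is needed to support the check.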
-
Committed by Alexander Duyck
This patch replaces the existing Rx hot-path in the ixgbe driver with a new implementation that is based on performing a double buffered receive. The ixgbe driver already had something similar in place for its packet split path, however in that case we were still receiving the header for the packet into the sk_buff. The big change here is that the entire receive path will receive into pages only, and then pull the header out of the page and copy it into the sk_buff data. There are several motivations behind this approach. First, this allows us to avoid several cache misses, as we were taking a set of cache misses for allocating the sk_buff and then another set for receiving data into the sk_buff. We are able to avoid these misses on receive now as we allocate the sk_buff when data is available. Second, we are able to see a considerable performance gain when an IOMMU is enabled because we are no longer unmapping every buffer on receive. Instead we can delay the unmap until we are unable to use the page, and simply call sync_single_range on the half of the page that contains new data. Finally, we are able to drop a considerable amount of code from the driver as we no longer have to support 2 different receive modes, packet split and one buffer. This allows us to optimize the Rx path further since less branching is required. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
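The half-page reuse can be pictured with a small sketch. It assumes a 4 KiB page split into two equal receive buffers; `struct rx_buffer` and `rx_buffer_flip()` are hypothetical names, and the real driver also has to check whether the page can still be reused.

```c
#include <assert.h>

#define RX_PAGE_SIZE 4096u               /* assumed page size */
#define RX_BUF_SIZE  (RX_PAGE_SIZE / 2u) /* each half is one Rx buffer */

/* Hypothetical per-descriptor receive buffer state. */
struct rx_buffer {
    unsigned int page_offset; /* half of the page handed to hardware */
};

/*
 * Called once hardware has filled the current half. Returns the offset of
 * the half that now holds packet data (the only part that needs a DMA
 * sync) and hands the other half to hardware for the next receive.
 */
static unsigned int rx_buffer_flip(struct rx_buffer *buf)
{
    unsigned int filled = buf->page_offset;

    assert(filled == 0 || filled == RX_BUF_SIZE);
    buf->page_offset ^= RX_BUF_SIZE; /* toggle between 0 and half a page */
    return filled;
}
```

The page stays mapped across both uses, so only a sync of the filled half is required per packet rather than an unmap and a remap.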
-
- 14 Mar, 2012 1 commit
-
-
Committed by Alexander Duyck
There isn't much point in using variables to store the values of eitr_low and eitr_high since they are not user changeable. As such, I am replacing them with the constants 10 and 20 in order to avoid any confusion about what the values actually are. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 13 Mar, 2012 5 commits
-
-
Committed by Alexander Duyck
Since there are multiple spots where we have to cycle through all of the rings on a q_vector, it makes sense to just add a function for iterating through all of them. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
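One plausible shape for such an iterator is a for-each macro over a singly linked list of rings. The `struct ring`, `struct ring_container`, and `for_each_ring()` names below are illustrative assumptions, not necessarily what the patch introduces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ring, chained to the next ring on the same q_vector. */
struct ring {
    struct ring *next;
    unsigned int packets;
};

/* Hypothetical container holding the head of one ring list (Rx or Tx). */
struct ring_container {
    struct ring *ring;
};

/* Walk every ring in a container, in list order. */
#define for_each_ring(pos, head) \
    for ((pos) = (head).ring; (pos) != NULL; (pos) = (pos)->next)

/* Example user: sum a per-ring counter across the whole container. */
static unsigned int total_packets(const struct ring_container *rc)
{
    struct ring *r;
    unsigned int sum = 0;

    assert(rc != NULL);
    for_each_ring(r, *rc)
        sum += r->packets;
    return sum;
}
```

Every call site that previously open-coded the list walk can then share the one macro.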
-
Committed by Alexander Duyck
This patch makes the rings a part of the q_vector directly instead of indirectly. Specifically on x86 systems this helps to avoid any cache set conflicts between the q_vector, the tx_rings, and the rx_rings, as the critical stride is 4K and in order to cross that boundary you would need to have over 15 rings on a single q_vector. In addition, this allows for smarter allocations when Flow Director is enabled. Previously Flow Director would set the irq_affinity hints based on the CPU and was still using a node interleaving approach, which on some systems would end up with the two values mismatched. With the new approach we can set the affinity for the irq_vector and use the CPU for that affinity to determine the node value for the node and the rings. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
The old code had several errors in how it was determining the vector budget. In order to simplify things, this patch updates the code so that it will always attempt to allocate paired Rx/Tx vectors instead of attempting to allocate individual vectors when the number of queues is less than the number of CPUs. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
This change moves several frequently accessed items together into one cache line in order to reduce cache misses in the hot-path. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
This adds support for byte queue limits (BQL). Based on a patch from Eric Dumazet for igb. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com>
-
- 11 Feb, 2012 4 commits
-
-
Committed by Alexander Duyck
This change combines a number of post-DMA Rx packet processing functions into a single function. The advantage of this is that it combines most of the Rx descriptor processing into one spot so it should all be warm in the cache. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
It doesn't make much sense to differentiate between advanced and legacy descriptors when the only descriptors that ixgbe uses are advanced descriptors. As such, we can drop the _ADV suffix since all ixgbe descriptors are automatically advanced. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
This change adds a small function for testing Rx status bits in the descriptor. The advantage of this is that we can avoid unnecessary byte swaps on big-endian systems. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
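The byte-swap saving comes from converting the mask instead of the descriptor field. The sketch below is a user-space approximation: `cpu_to_le32()` is reimplemented here with a runtime endianness probe, and the descriptor struct, bit values, and `test_staterr()` name are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the kernel helper of the same name. */
static uint32_t cpu_to_le32(uint32_t v)
{
    const union { uint16_t u16; uint8_t u8[2]; } probe = { .u16 = 1 };

    if (probe.u8[0] == 1)
        return v; /* little-endian host: nothing to do */
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v >> 8) & 0x0000ff00u) | (v >> 24);
}

/* Hypothetical write-back descriptor; hardware stores it little-endian. */
struct rx_desc_wb {
    uint32_t status_error;
};

#define RXD_STAT_DD  0x01u /* illustrative: descriptor done */
#define RXD_STAT_EOP 0x02u /* illustrative: end of packet */

/* Test status bits by converting the mask, not the descriptor field. */
static bool test_staterr(const struct rx_desc_wb *desc, uint32_t bits)
{
    assert(desc != NULL);
    return (desc->status_error & cpu_to_le32(bits)) != 0;
}
```

When the mask is a compile-time constant, a compiler can resolve the conversion ahead of time, so a big-endian host does not need to swap the descriptor word on every test.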
-
Committed by Alexander Duyck
This change addresses several issues. First, I had left the use of the next and prev skb pointers floating around in the code, and they were overdue to be pulled since I had rewritten the RSC code in the out-of-tree driver some time ago to address issues brought up by David Miller in regards to this. I am also now defaulting to always leaving the first buffer unmapped on any packet and then unmapping it after we read the EOP descriptor. This allows a simplification of the path with less branching. Instead of counting packets received, the code was changed some time ago to track the number of buffers received. This leads to inaccurate counting when you compare numbers of packets received by the hardware versus what is tracked by the software. To correct this, I am revising things so that the append_cnt value for RSC accurately tracks the number of frames received. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 03 Feb, 2012 1 commit
-
-
Committed by Don Skidmore
New year so bump the copyright date. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 06 Jan, 2012 1 commit
-
-
Committed by Neerav Parikh
This patch implements support for the ndo_get_fcoe_hbainfo() call in the ixgbe driver. This function will be called by the FCoE protocol stack to obtain device specific information from the underlying device configured to do FCoE. Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 18 Oct, 2011 1 commit
-
-
Committed by Emil Tantilov
Use the 32-bit value starting at offset 0x2d for displaying the firmware version in ethtool. This should work for all current ixgbe HW. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 17 Oct, 2011 1 commit
-
-
Committed by Greg Rose
Implements the new netdev op to allow user configuration of spoof checking on a per VF basis. V2 - Change netdev spoof check op setting to bool. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 13 Oct, 2011 1 commit
-
-
Committed by Greg Rose
It is possible for a VF to set an invalid target DMA address in its Tx/Rx descriptor buffer pointers. The workarounds in this patch will guard against such an event and issue a VFLR to the VF in response. The VFLR will shut down the VF until an administrator can take action to investigate the event and correct the problem. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 29 Sep, 2011 1 commit
-
-
Committed by Emil Tantilov
This patch is meant to help clean up the interrupt throttle rate logic by storing the interrupt throttle rate as a value in microseconds instead of interrupts per second. The advantage of this approach is that the value can now be stored in a 16-bit field and doesn't require as much math to flip the value back and forth, since the hardware already used microseconds when setting the rate. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
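The arithmetic behind the two representations is simple enough to sketch. The helper names are hypothetical, and treating 0 as "throttling disabled" as well as clamping to 16 bits are assumptions of this example rather than documented driver behaviour.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000u

/* Convert a rate in interrupts per second to an interval in microseconds. */
static uint16_t itr_usecs_from_rate(uint32_t ints_per_sec)
{
    uint32_t usecs;

    if (ints_per_sec == 0)
        return 0; /* assumed meaning: throttling disabled */

    usecs = USEC_PER_SEC / ints_per_sec;
    return usecs > UINT16_MAX ? UINT16_MAX : (uint16_t)usecs;
}

/* Convert a nonzero interval in microseconds back to interrupts per second. */
static uint32_t itr_rate_from_usecs(uint16_t usecs)
{
    assert(usecs != 0);
    return USEC_PER_SEC / usecs;
}
```

For example, 8000 interrupts per second corresponds to an interval of 125 microseconds, which fits comfortably in 16 bits.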
-
- 24 Sep, 2011 2 commits
-
-
Committed by Emil Tantilov
Add support for WOL as determined by the EEPROM. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Greg Rose
Use the PCI device flag indicating if a VF is assigned to a guest VM to guard against destroying VFs upon driver removal. Implement an additional feature to detect if VFs already exist when the driver is loaded and, if so, configure them and set the driver state to SR-IOV enabled. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 17 Sep, 2011 2 commits
-
-
Committed by Alexander Duyck
This patch improves the memory utilization with RSC when in one-buffer mode. This is accomplished by making the default buffer sizes match up with the standard memory allocation sizes minus 1K for shared info and padding overhead. By doing this, CPU utilization when doing large receives can be reduced by as much as 8%. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
ixgbe_up and ixgbe_up_complete will always return 0. Since this doesn't provide any useful information, we might as well just make them both void and save ourselves from having to return an unused value. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 16 Sep, 2011 2 commits
-
-
Committed by Alexander Duyck
It was possible to inadvertently add additional interrupt causes to the MSI-X other interrupt. This occurred when things such as RX buffer overrun events were being triggered at the same time as an event such as a Flow Director table reinit request. In order to avoid this, we should be explicitly programming only the interrupts that we want enabled. In addition, I am renaming the ixgbe_msix_lsc function and interrupt to drop any implied meaning of this being a link status only interrupt. Unfortunately the patch is a bit ugly due to the fact that ixgbe_irq_enable needed to be moved up before ixgbe_msix_other in order to have things defined in the correct order. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
This change makes it so that the default Tx work limit is 256 buffers or 1/2 of an entire ring, instead of a full ring size, so that it is much more likely that we will be able to actually reach the work limit value. Previously, with the value set to an entire ring, it would not have been possible for us to trigger an event due to the fact that the Tx work is stopped at the point where we cannot place one more buffer on the ring and it is not restarted until cleanup is complete. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 29 Aug, 2011 1 commit
-
-
Committed by Alexander Duyck
This change makes it so that the CC bit in the descriptor is set when SR-IOV is enabled. This is needed in order to support offloading functionality when passing traffic over the internal TX switch. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 27 Aug, 2011 1 commit
-
-
Committed by Alexander Duyck
This change converts the current bit array into a linked list so that the q_vectors can simply go through ring by ring and locate each ring needing to be cleaned. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 19 Aug, 2011 2 commits
-
-
Committed by Alexander Duyck
This change is meant to further clean up the transmit path by streamlining some of the VLAN and FCOE/DCB tasks in the transmit path. In addition, it adds code to support software VLANs in the event that they are used in conjunction with DCB and/or FCOE. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
This patch implements a partial refactor of the TX map/queue and cleanup routines. It merges the map and queue functionality and as a result improves the transmit performance by avoiding unnecessary reads from memory. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 11 Aug, 2011 1 commit
-
-
Committed by Jeff Kirsher
Moves the Intel wired LAN drivers into drivers/net/ethernet/intel/ and makes the necessary Kconfig and Makefile changes. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 22 Jul, 2011 4 commits
-
-
Committed by Don Skidmore
Private rx_csum flags are now duplicates of netdev->features & NETIF_F_RXCSUM. We remove those duplicates and now use the net_device_ops ndo_set_features. This was based on the original patch submitted by Michal Miroslaw <mirq-linux@rere.qmqm.pl>. I also removed the special case not requiring a reset for X540 hardware. It is needed just as it is in 82599 hardware. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
This change is meant to address possible race conditions from the status and error bits on the RX descriptors being re-read by multiple functions in the RX cleanup path. To resolve this, I have added code that will pass the staterr value to those functions. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
This change moves work_limit, total_packets, and total_bytes into the ring container struct of the q_vector. The advantage of this is that it should reduce the size of memory used in the event of multiple rings being assigned to a single q_vector. In addition, it should help to reduce the total workload for calculating itr, since now total_packets and total_bytes will be the total work done for the interrupt instead of for the ring. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Committed by Alexander Duyck
This patch adds support for a ring container structure to be used within the q_vector. The basic idea is to provide a means of separating the RX and TX rings while maintaining a common structure for their containment. The advantage of this is that later we should be able to pass this structure to the update_itr functions without needing to pass individual rings. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 25 Jun, 2011 2 commits
-
-
Committed by Alexander Duyck
This patch updates the current methods used for determining if we have enough space to transmit a given skb. The current method is quite wasteful as it has us go through and determine how each page is going to be broken up. That only needs to be done if pages are larger than our maximum data per TXD. As such, I have wrapped that in a page size check. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
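A compile-time page size check of this kind could be sketched as follows. The 16 KiB per-descriptor limit, the `SKETCH_PAGE_SIZE` macro, and the function name are assumptions for illustration; fragments are assumed to be at most one page each.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DATA_PER_TXD 16384u /* assumed per-descriptor data limit */
#ifndef SKETCH_PAGE_SIZE
#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#endif

/* Number of descriptors needed for 's' bytes (rounded up). */
#define TXD_USE_COUNT(s) \
    (((s) + MAX_DATA_PER_TXD - 1u) / MAX_DATA_PER_TXD)

/* Count descriptors for a linear head plus page fragments. */
static unsigned int tx_desc_needed(size_t head_len, const size_t *frag_len,
                                   unsigned int nr_frags)
{
    unsigned int count = (unsigned int)TXD_USE_COUNT(head_len);

    assert(frag_len != NULL || nr_frags == 0);
#if SKETCH_PAGE_SIZE > MAX_DATA_PER_TXD
    /* Large pages: a fragment may span several descriptors. */
    for (unsigned int f = 0; f < nr_frags; f++)
        count += (unsigned int)TXD_USE_COUNT(frag_len[f]);
#else
    /* Small pages: every fragment fits in one descriptor, no loop. */
    (void)frag_len;
    count += nr_frags;
#endif
    return count;
}
```

On a build where a page is no larger than the per-descriptor limit, the per-fragment loop disappears entirely and the count is a single addition.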
-
Committed by Alexander Duyck
There is a significant amount of functionality shared between the checksum and TSO offload configuration in regards to how they set up the context descriptors. Since so much of the functionality is shared, it makes sense to move it into a single function and just call that function from the two context descriptor specific routines. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-