1. 25 Jun, 2013 2 commits
  2. 12 Jun, 2013 1 commit
  3. 15 May, 2013 2 commits
    • B
      sfc: Reduce RX scatter buffer size, and reduce alignment if appropriate · 950c54df
      Ben Hutchings committed
      efx_start_datapath() asserts that we can fit 2 RX scatter buffers plus
      a software structure, each appropriately aligned, into a single page.
      Where L1_CACHE_BYTES == 256 and PAGE_SIZE == 4096, which is the case
      on s390, this assertion fails.
      
      The current scatter buffer size is also not a multiple of 64 or 128,
      which are more common cache line sizes.  If we can make both the start
      and end of a scatter buffer cache-aligned, this will reduce the need
      for read-modify-write operations on inter-processor links.
      
      Fix the alignment by reducing EFX_RX_USR_BUF_SIZE to 2048 - 256 ==
      1792.  (We could use 2048 - L1_CACHE_BYTES, but EFX_RX_USR_BUF_SIZE
      also affects user-level networking where a larger amount of
      housekeeping data may be needed.  Although this version of the driver
      does not support user-level networking, I prefer to keep scattering
      behaviour consistent with the out-of-tree version.)
      
      This still doesn't fix the s390 build because like most architectures
      it has NET_IP_ALIGN == 2.  When NET_IP_ALIGN != 0 we cannot achieve
      cache line alignment at either the start or end of a scatter buffer,
      so there is actually no point in padding the buffers to a multiple of
      the cache line size.  All we need is 4-byte alignment of the network
      header, so do that.
      
      Adjust the assertions accordingly.
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
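The page-fit check described in the commit above can be illustrated with a small user-space sketch. This is not the driver's code: `sketch_align_up()` and `sketch_page_usage()` are hypothetical helpers, the 16-byte software structure is a made-up figure, and the rule "pad to the cache line when the IP alignment is 0, otherwise keep only 4-byte alignment" is taken from the commit message.

```c
#include <assert.h>

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned int sketch_align_up(unsigned int x, unsigned int a)
{
	assert(a != 0 && (a & (a - 1u)) == 0);
	return (x + a - 1u) & ~(a - 1u);
}

/*
 * Bytes needed in one page for a software structure of state_size bytes
 * plus two scatter buffers of buf_size bytes each.
 * Assumption (from the commit message): with ip_align == 0 every piece is
 * padded to a cache line; otherwise only 4-byte alignment is kept.
 */
static unsigned int sketch_page_usage(unsigned int state_size,
				      unsigned int buf_size,
				      unsigned int ip_align,
				      unsigned int cache_bytes)
{
	unsigned int align = ip_align ? 4u : cache_bytes;
	unsigned int buf_step = sketch_align_up(ip_align + buf_size, align);

	return sketch_align_up(state_size, align) + 2u * buf_step;
}
```

Under these assumptions, a 1792-byte buffer with 256-byte cache lines and no IP alignment needs 3840 bytes and fits a 4096-byte page, whereas the earlier 1824-byte size would need 4352 bytes and would not.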
    • B
      sfc: Delete EFX_PAGE_IP_ALIGN, equivalent to NET_IP_ALIGN · c14ff2ea
      Ben Hutchings committed
      The two architectures that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
      (powerpc and x86) now both define NET_IP_ALIGN as 0, so there is no
      need for this optimisation any more.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 08 Mar, 2013 11 commits
    • D
      sfc: allocate more RX buffers per page · 1648a23f
      Daniel Pieczko committed
      Allocating 2 buffers per page is insanely inefficient when MTU is 1500
      and PAGE_SIZE is 64K (as it usually is on POWER).  Allocate as many as
      we can fit, and choose the refill batch size at run-time so that we
      still always use a whole page at once.
      
      [bwh: Fix loop condition to allow for compound pages; rebase]
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
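The arithmetic behind "allocate as many as we can fit" and "always use a whole page at once" might look roughly like the sketch below. The function names, the per-page state size, the buffer step and the minimum batch of 8 buffers are all assumptions chosen for illustration, not values confirmed by the commit.

```c
#include <assert.h>

/* Number of RX buffers of buf_step bytes that fit in one page after
 * reserving state_size bytes for per-page bookkeeping. */
static unsigned int sketch_bufs_per_page(unsigned int page_size,
					 unsigned int state_size,
					 unsigned int buf_step)
{
	assert(buf_step > 0 && page_size >= state_size);
	return (page_size - state_size) / buf_step;
}

/* Pages per refill batch: the smallest whole number of pages that
 * provides at least min_batch buffers. */
static unsigned int sketch_pages_per_batch(unsigned int min_batch,
					   unsigned int bufs_per_page)
{
	assert(bufs_per_page > 0);
	return (min_batch + bufs_per_page - 1u) / bufs_per_page;
}
```

With a hypothetical 1920-byte buffer step and 64 bytes of page state, a 4 KiB page holds 2 buffers and a 64 KiB page holds 34, so a batch of at least 8 buffers takes 4 small pages but only 1 large page.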
    • B
      sfc: Replace efx_rx_is_last_buffer() with a flag · 179ea7f0
      Ben Hutchings committed
      This condition is brittle and we have lots of flags to spare.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
    • D
      sfc: reuse pages to avoid DMA mapping/unmapping costs · 2768935a
      Daniel Pieczko committed
      On POWER systems, DMA mapping/unmapping operations are very expensive.
      These changes reduce these costs by trying to reuse DMA mapped pages.
      
      After all the buffers associated with a page have been processed and
      passed up, the page is placed into a ring (if there is room).  For
      each page that is required for a refill operation, a page in the ring
      is examined to determine if its page count has fallen to 1, i.e. the
      kernel has released its reference to these packets.  If this is the
      case, the page can be immediately added back into the RX descriptor
      ring, without having to re-map it for DMA.
      
      If the kernel is still holding a reference to this page, it is removed
      from the ring and unmapped for DMA.  Then a new page, which can
      immediately be used by RX buffers in the descriptor ring, is allocated
      and DMA mapped.
      
      The time a page needs to spend in the recycle ring before the kernel
      has released its page references is based on the number of buffers
      that use this page.  As large pages can hold more RX buffers, the RX
      recycle ring can be shorter.  This reduces memory usage on POWER
      systems, while maintaining the performance gain achieved by recycling
      pages, following the driver change to pack more than two RX buffers
      into large pages.
      
      When an IOMMU is not present, the recycle ring can be small to reduce
      memory usage, since DMA mapping operations are inexpensive.
      
      With a small recycle ring, attempting to refill the descriptor queue
      with more buffers than the equivalent size of the recycle ring could
      ultimately lead to memory leaks if page entries in the recycle ring
      were overwritten.  To prevent this, the check to see if the recycle
      ring is full is changed to check if the next entry to be written is
      NULL.
      
      [bwh: Combine and rebase several commits so this is complete
       before the following buffer-packing changes.  Remove module
       parameter.]
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
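The recycle ring described above can be modelled in user space as follows. This is a simplified sketch, not the driver's implementation: `struct sketch_page`, its `refcount` and `dma_mapped` fields, and the ring size of 4 are stand-ins invented here for the kernel page count and the DMA mapping state.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RING_SIZE 4u	/* hypothetical; must be a power of two */

struct sketch_page {
	int refcount;	/* stands in for the kernel page count */
	int dma_mapped;	/* 1 while the page has a DMA mapping */
};

struct sketch_recycle_ring {
	struct sketch_page *slot[SKETCH_RING_SIZE];
	unsigned int add;	/* next index to write */
	unsigned int remove;	/* next index to examine */
};

/* Park a fully processed page. "Full" is detected by the next slot to be
 * written being non-NULL, as the commit message describes.
 * Returns 1 on success, 0 if the caller must unmap and free the page. */
static int sketch_ring_put(struct sketch_recycle_ring *r, struct sketch_page *p)
{
	unsigned int i = r->add & (SKETCH_RING_SIZE - 1u);

	if (r->slot[i] != NULL)
		return 0;
	r->slot[i] = p;
	r->add++;
	return 1;
}

/* Examine one ring entry for a refill. The page is returned, still DMA
 * mapped, only if ours is the sole remaining reference. Otherwise it is
 * dropped from the ring and unmapped, and NULL tells the caller to
 * allocate and map a fresh page. */
static struct sketch_page *sketch_ring_get(struct sketch_recycle_ring *r)
{
	unsigned int i = r->remove & (SKETCH_RING_SIZE - 1u);
	struct sketch_page *p = r->slot[i];

	if (p == NULL)
		return NULL;
	r->slot[i] = NULL;
	r->remove++;
	if (p->refcount == 1)
		return p;
	p->dma_mapped = 0;	/* still referenced elsewhere: unmap */
	p->refcount--;		/* and drop our own reference */
	return NULL;
}
```

Checking the slot for NULL rather than comparing the two counters means a refill larger than the ring can never overwrite an entry that is still occupied.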
    • B
      sfc: Enable RX DMA scattering where possible · 85740cdf
      Ben Hutchings committed
      Enable RX DMA scattering iff an RX buffer large enough for the current
      MTU will not fit into a single page and the NIC supports DMA
      scattering for kernel-mode RX queues.
      
      On Falcon and Siena, the RX_USR_BUF_SIZE field is used as the DMA
      limit for all RX queues with scatter enabled.  Set it to 1824,
      matching what Onload uses now.
      
      Maintain a statistic for frames truncated due to lack of descriptors
      (rx_nodesc_trunc).  This is distinct from rx_frm_trunc which may be
      incremented when scattering is disabled and implies an over-length
      frame.
      
      Whenever an MTU change causes scattering to be turned on or off,
      update filters that point to the PF queues, but leave others
      unchanged, as VF drivers assume scattering is off.
      
      Add n_frags parameters to various functions, and make them iterate:
      - efx_rx_packet()
      - efx_recycle_rx_buffers()
      - efx_rx_mk_skb()
      - efx_rx_deliver()
      
      Make efx_handle_rx_event() responsible for updating
      efx_rx_queue::removed_count.
      
      Change the RX pipeline state to a starting ring index and number of
      fragments, and make __efx_rx_packet() responsible for clearing it.
      
      Based on earlier versions by David Riddoch and Jon Cooper.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
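The "iff" condition for enabling scatter, and the fragment count that the new n_frags parameters carry, can be sketched as below. Both helpers and the `overhead` parameter (bytes of a page not available to the buffer) are hypothetical; only the condition itself and the 1824-byte DMA limit come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Scatter is wanted only when a buffer big enough for the current MTU
 * does not fit in one page and the NIC can scatter on this queue. */
static bool sketch_rx_scatter_wanted(unsigned int rx_buf_len,
				     unsigned int page_size,
				     unsigned int overhead,
				     bool nic_can_scatter)
{
	return nic_can_scatter && rx_buf_len + overhead > page_size;
}

/* Number of descriptors a frame occupies when each one is limited to
 * dma_limit bytes. */
static unsigned int sketch_rx_frags(unsigned int frame_len,
				    unsigned int dma_limit)
{
	assert(dma_limit > 0);
	return (frame_len + dma_limit - 1u) / dma_limit;
}
```

For example, with a 1824-byte limit a 1514-byte frame uses one descriptor and a 9018-byte jumbo frame uses five.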
    • B
      sfc: Update RX buffer address together with length · b74e3e8c
      Ben Hutchings committed
      Adjust rx_buf->page_offset when we eat the RX hash prefix.  Remove
      efx_rx_buf_offset(), which is now redundant.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
    • B
      sfc: Properly distinguish RX buffer and DMA lengths · 272baeeb
      Ben Hutchings committed
      Replace efx_nic::rx_buffer_len with efx_nic::rx_dma_len, the maximum
      RX DMA length.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
    • A
      sfc: Add AER and EEH support for Siena · 626950db
      Alexandre Rames committed
      The Linux side of EEH is triggered by MMIO reads, but this
      driver's data path does not issue any MMIO reads (except in
      legacy interrupt mode).  Therefore add a monitor function
      to poll EEH periodically.
      
      When preparing to reset the device based on our own error
      detection, also poll EEH and defer to its recovery mechanism
      if appropriate.
      
      [bwh: Use a separate condition for the initial link poll; fix some
       style errors]
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
    • A
      sfc: Remove rx_alloc_method SKB · 97d48a10
      Alexandre Rames committed
      [bwh: Remove more dead code, and make efx_ptp_rx() pull the data it
       needs into the header area.]
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
    • B
      sfc: Allow efx_channel_type::receive_skb() to reject a packet · 4a74dc65
      Ben Hutchings committed
      Instead of having efx_ptp_rx() call netif_receive_skb() for an invalid
      PTP packet, make it return false for rejected packets and have
      efx_rx_deliver() pass them up.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
  5. 26 Feb, 2013 1 commit
  6. 01 Dec, 2012 2 commits
  7. 01 Nov, 2012 1 commit
  8. 06 Oct, 2012 1 commit
  9. 19 Sep, 2012 2 commits
  10. 08 Sep, 2012 2 commits
  11. 25 Aug, 2012 4 commits
    • B
      sfc: Change state names to be clearer, and comment them · f16aeea0
      Ben Hutchings committed
      STATE_INIT and STATE_FINI are equivalent and represent incompletely
      initialised states; combine them as STATE_UNINIT.
      
      Rename STATE_RUNNING to STATE_READY, to avoid confusion with
      netif_running() and IFF_RUNNING.
      
      The comments do not quite match current usage, but this will be
      corrected in subsequent fixes.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
    • B
      sfc: Simplify TSO header buffer allocation · f7251a9c
      Ben Hutchings committed
      TSO header buffers contain a control structure immediately followed by
      the packet headers, and are kept on a free list when not in use.  This
      complicates buffer management and tends to result in cache read misses
      when we recycle such buffers (particularly if DMA-coherent memory
      requires caches to be disabled).
      
      Replace the free list with a simple mapping by descriptor index.  We
      know that there is always a payload descriptor between any two
      descriptors with TSO header buffers, so we can allocate only one
      such buffer for each two descriptors.
      
      While we're at it, use a standard error code for allocation failure,
      not -1.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
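The "simple mapping by descriptor index" can be sketched as follows. The helper names and the 128-byte slot size are assumptions for illustration. The point is that if two header descriptors are never adjacent, dividing the index by two cannot map both to the same slot.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_TSOH_SIZE 128u	/* hypothetical space per TSO header */

/* One header slot is enough for every two descriptors. */
static size_t sketch_tsoh_count(size_t ring_entries)
{
	return (ring_entries + 1u) / 2u;
}

/* Byte offset of the header slot used by descriptor desc_index.
 * Header descriptors are at least two indices apart, so two live
 * headers never share a slot. */
static size_t sketch_tsoh_offset(size_t desc_index, size_t ring_entries)
{
	assert(desc_index < ring_entries);
	return (desc_index / 2u) * SKETCH_TSOH_SIZE;
}
```

Because the slot is computed directly from the descriptor index, no free list or per-buffer control structure is needed to find or release it.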
    • B
      sfc: Stop TX queues before they fill up · 14bf718f
      Ben Hutchings committed
      We now have a definite upper bound on the number of descriptors per
      skb; use that to stop the queue when the next packet might not fit.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
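One way to express "stop the queue when the next packet might not fit" is shown below. The function names, the free-running insert/read counters and the example limits are assumptions, not the driver's actual fields.

```c
#include <assert.h>
#include <stdbool.h>

/* Descriptors currently in flight. Both counters are free-running, so
 * unsigned subtraction gives the right answer across wrap-around. */
static unsigned int sketch_tx_fill(unsigned int insert_count,
				   unsigned int read_count)
{
	return insert_count - read_count;
}

/* Stop the queue as soon as a worst-case skb, needing up to
 * max_descs_per_skb descriptors, might no longer fit. */
static bool sketch_tx_should_stop(unsigned int insert_count,
				  unsigned int read_count,
				  unsigned int ring_entries,
				  unsigned int max_descs_per_skb)
{
	assert(max_descs_per_skb <= ring_entries);
	return sketch_tx_fill(insert_count, read_count) >
	       ring_entries - max_descs_per_skb;
}
```

Checking after each enqueue, rather than on the next one, means the transmit path never has to discover mid-packet that the ring is full.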
    • B
      sfc: Refactor struct efx_tx_buffer to use a flags field · 7668ff9c
      Ben Hutchings committed
      Add a flags field to struct efx_tx_buffer, replacing the
      continuation and map_single booleans.
      
      Since a single descriptor cannot be both a TSO header and the last
      descriptor for an skb, unionise efx_tx_buffer::{skb,tsoh} and add
      flags for validity of these fields.
      
      Clear all flags in free buffers (whereas previously the continuation
      flag would be set).
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
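A rough picture of the flags-plus-union layout described above is given below. The struct, the flag names and their values are hypothetical stand-ins, and the anonymous union requires C11.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
#define SKETCH_TX_BUF_CONT       0x1u	/* not the last descriptor of a packet */
#define SKETCH_TX_BUF_SKB        0x2u	/* the skb member is valid */
#define SKETCH_TX_BUF_TSOH       0x4u	/* the tsoh member is valid */
#define SKETCH_TX_BUF_MAP_SINGLE 0x8u	/* mapped as a single buffer */

struct sketch_tx_buffer {
	union {			/* never both: see SKB/TSOH flags */
		const void *skb;
		void *tsoh;
	};
	unsigned short flags;
	unsigned short len;
};

/* A free buffer has all flags clear. */
static void sketch_tx_buf_reset(struct sketch_tx_buffer *buf)
{
	buf->skb = NULL;
	buf->flags = 0;
	buf->len = 0;
}

/* Mark this buffer as the last descriptor of a packet owning skb. */
static void sketch_tx_buf_set_skb(struct sketch_tx_buffer *buf, const void *skb)
{
	buf->skb = skb;
	buf->flags = SKETCH_TX_BUF_SKB;
}

/* Read the skb pointer only when the flag says the member is valid. */
static const void *sketch_tx_buf_get_skb(const struct sketch_tx_buffer *buf)
{
	return (buf->flags & SKETCH_TX_BUF_SKB) ? buf->skb : NULL;
}
```

Since the two pointers share storage, the validity flags are the only safe way to tell which one a given buffer currently holds.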
  12. 17 Jul, 2012 2 commits
    • B
      sfc: Disable VF queues during register self-test · d4f2cecc
      Ben Hutchings committed
      Currently VF queues and drivers may remain active during this test.
      This could cause memory corruption or spurious test failures.
      Therefore we reset the port/function before running these tests on
      Siena.
      
      On Falcon this doesn't work: we have to do some additional
      initialisation before some blocks will work again.  So refactor the
      reset/register-test sequence into an efx_nic_type method so
      efx_selftest() doesn't have to consider such quirks.
      
      In the process, fix another minor bug: Siena does not have an
      'invisible' reset and the self-test currently fails to push the PHY
      configuration after resetting.  Passing RESET_TYPE_ALL to
      efx_reset_{down,up}() fixes this.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
    • B
      sfc: Use generic DMA API, not PCI-DMA API · 0e33d870
      Ben Hutchings committed
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
  13. 11 Jul, 2012 1 commit
  14. 10 May, 2012 2 commits
  15. 07 Mar, 2012 2 commits
  16. 23 Feb, 2012 1 commit
  17. 16 Feb, 2012 3 commits
    • B
      sfc: Add SR-IOV back-end support for SFC9000 family · cd2d5b52
      Ben Hutchings committed
      On the SFC9000 family, each port has 1024 Virtual Interfaces (VIs),
      each with an RX queue, a TX queue, an event queue and a mailbox
      register.  These may be assigned to up to 127 SR-IOV virtual functions
      per port, with up to 64 VIs per VF.
      
      We allocate an extra channel (IRQ and event queue only) to receive
      requests from VF drivers.
      
      There is a per-port limit of 4 concurrent RX queue flushes, and queue
      flushes may be initiated by the MC in response to a Function Level
      Reset (FLR) of a VF.  Therefore, when SR-IOV is in use, we submit all
      flush requests via the MC.
      
      The RSS indirection table is shared with VFs, so the number of RX
      queues used in the PF is limited to the number of VIs per VF.
      
      This is almost entirely the work of Steve Hodgson, formerly
      shodgson@solarflare.com.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
    • B
      sfc: Allocate SRAM between buffer table and descriptor caches at init time · 28e47c49
      Ben Hutchings committed
      Each port has a block of 64-bit SRAM that is divided between buffer
      table and descriptor cache regions at initialisation time.  Currently
      we use a fixed allocation, but it needs to be changed to support
      larger numbers of queues.
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
    • B
      sfc: Add support for 'extra' channel types · 7f967c01
      Ben Hutchings committed
      Abstract some of the channel operations to allow for 'extra'
      channels that do not have RX or TX queues.
      
      - Try to assign a channel to each extra channel type that is enabled
        for the NIC, but gracefully degrade if we can't allocate sufficient
        MSI-X vectors
      - Allow each extra channel type to generate its own channel name
      - Allow channel types to disable reallocation and reinitialisation
        of their channels
      Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>