1. 16 Jun, 2012 1 commit
    • E
      bnx2x: fix panic when TX ring is full · bc14786a
      Eric Dumazet committed
      There is an off-by-one error in the minimum number of BDs required in
      bnx2x_start_xmit() and bnx2x_tx_int() before stopping/resuming the tx queue.
      
      A full-size GSO packet with data included in skb->head really needs
      (MAX_SKB_FRAGS + 4) BDs, because of bnx2x_tx_split().
      
      This error triggers if BQL is disabled and heavy TCP transmit traffic
      occurs.
      
      bnx2x_tx_split() can definitely be called, so remove a wrong comment.
      Reported-by: Tomas Hruby <thruby@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Cc: Yaniv Rosner <yanivr@broadcom.com>
      Cc: Merav Sicron <meravs@broadcom.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Robert Evans <evansr@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc14786a
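The threshold described in this commit can be sketched in user-space C. This is a minimal illustration, not the driver's code: the `sketch_` names are hypothetical, the value given to `SKETCH_MAX_SKB_FRAGS` is chosen only for illustration, and the old threshold of `MAX_SKB_FRAGS + 3` is an assumption inferred from the phrase "off by one" in the message.

```c
#include <stdbool.h>

/* Illustrative value only; the real MAX_SKB_FRAGS depends on kernel
 * version and page size. */
#define SKETCH_MAX_SKB_FRAGS 17u

/* Worst case stated in the commit message: a full-size GSO packet with
 * data in skb->head needs MAX_SKB_FRAGS + 4 buffer descriptors (BDs). */
#define SKETCH_MAX_BDS_PER_PKT (SKETCH_MAX_SKB_FRAGS + 4u)

/* Fixed check: stop the tx queue unless a worst-case packet still fits. */
static bool sketch_must_stop_queue(unsigned int free_bds)
{
    return free_bds < SKETCH_MAX_BDS_PER_PKT;
}

/* Assumed pre-fix check: one BD short of the true worst case. */
static bool sketch_old_must_stop_queue(unsigned int free_bds)
{
    return free_bds < SKETCH_MAX_SKB_FRAGS + 3u;
}
```

With exactly `SKETCH_MAX_SKB_FRAGS + 3` free BDs, the assumed old check leaves the queue running while the fixed check stops it. That single-descriptor gap is the case in which a worst-case packet would not fit in the ring.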
  2. 14 Jun, 2012 1 commit
    • E
      bnx2x: fix checksum validation · d6cb3e41
      Eric Dumazet committed
      The bnx2x driver incorrectly sets ip_summed to CHECKSUM_UNNECESSARY on
      encapsulated segments. As a result, the TCP stack happily accepts frames with bad
      checksums if they are inside a GRE or IPIP encapsulation.
      
      Our understanding is that if no IP or L4 csum validation was done by the
      hardware, we should leave ip_summed as is (CHECKSUM_NONE), since
      hardware doesn't provide CHECKSUM_COMPLETE support in its cqe.
      
      Then, if IP/L4 checksumming was done by the hardware, set
      CHECKSUM_UNNECESSARY if no error was flagged.
      
      Patch based on findings and analysis from Robert Evans
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Cc: Yaniv Rosner <yanivr@broadcom.com>
      Cc: Merav Sicron <meravs@broadcom.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Robert Evans <evansr@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Acked-by: Eilon Greenstein <eilong@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d6cb3e41
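The decision rule in this commit message can be modelled as a small pure function. This is a hedged sketch, not the driver's receive path: the enum, the function name, and the two boolean inputs are hypothetical stand-ins for whatever the hardware completion entry actually reports.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's ip_summed states. */
enum sketch_ip_summed {
    SKETCH_CHECKSUM_NONE = 0,
    SKETCH_CHECKSUM_UNNECESSARY = 1
};

/*
 * hw_validated: the hardware performed IP/L4 checksum validation.
 * hw_error:     the hardware flagged a checksum error.
 */
static enum sketch_ip_summed sketch_rx_csum(bool hw_validated, bool hw_error)
{
    /* No validation done (e.g. an encapsulated segment): leave the
     * default so the stack verifies the checksum in software. */
    if (!hw_validated)
        return SKETCH_CHECKSUM_NONE;

    /* Validation done but an error was flagged: do not vouch for it. */
    if (hw_error)
        return SKETCH_CHECKSUM_NONE;

    /* Validation done and no error flagged. */
    return SKETCH_CHECKSUM_UNNECESSARY;
}
```

The sketch returns the trusted state only when validation happened and no error was flagged, so the untrusted state is the default for every other combination.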
  3. 01 May, 2012 3 commits
    • E
      bnx2x: remove some bloat · 1191cb83
      Eric Dumazet committed
      Before doing skb->head_frag work on the bnx2x driver, I found that too much code
      was inlined in bnx2x/bnx2x_cmn.h for no good reason, which made my work
      harder.
      
      Move some big functions out of this include file to the respective .c
      file.
      
      A lot of inline keywords are not needed at all in this huge driver.
      
         text	   data	    bss	    dec	    hex	filename
       490083	   1270	     56	 491409	  77f91	bnx2x/bnx2x.ko.before
       484206	   1270	     56	 485532	  7689c	bnx2x/bnx2x.ko
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Eilon Greenstein <eilong@broadcom.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Matt Carlson <mcarlson@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1191cb83
    • E
      net: allow skb->head to be a page fragment · d3836f21
      Eric Dumazet committed
      skb->head is currently allocated from kmalloc(). This is convenient but
      has the drawback that the data cannot be converted to a page fragment if
      needed.
      
      There are three spots where this hurts:
      
      1) GRO aggregation
      
       When a linear skb must be appended to another skb, GRO uses the
      frag_list fallback, which is very inefficient since we keep all struct sk_buff
      around. So drivers that enable GRO but deliver linear skbs to the network
      stack don't get the full benefit of GRO.
      
      2) splice(socket -> pipe).
      
       We must copy the linear part to a page fragment.
       This largely defeats the purpose of splice() (the zero-copy claim).
      
      3) TCP coalescing.
      
       Recently introduced, this allows several contiguous segments to be grouped
      into a single skb. This shortens queue lengths, saves kernel memory,
      and greatly reduces the probability of TCP collapses. This coalescing
      doesn't work on linear skbs (or we would need to copy data, which would be
      too slow).
      
      Given all these issues, the following patch introduces the possibility
      of having skb->head be a fragment in itself. We use a new skb flag,
      skb->head_frag to carry this information.
      
      build_skb() is changed to accept a frag_size argument. Drivers willing
      to provide a page fragment instead of kmalloc() data will pass a non-zero
      value equal to the fragment size.
      
      Then, in situations where we need to convert the skb head to a frag,
      we can check whether skb->head_frag is set and avoid the copies or the various
      fallbacks we have.
      
      This means drivers currently using frags could be updated to avoid the
      current skb->head allocation and reduce their memory footprint (aka skb
      truesize); that's 512 or 1024 bytes saved per skb. This also makes
      bpf/netfilter faster, since the 'first frag' will be part of the skb linear
      part and there is no need to copy data.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Cc: Ben Hutchings <bhutchings@solarflare.com>
      Cc: Matt Carlson <mcarlson@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d3836f21
    • W
  4. 24 Apr, 2012 2 commits
  5. 05 Apr, 2012 1 commit
  6. 04 Apr, 2012 3 commits
  7. 28 Mar, 2012 1 commit
  8. 20 Mar, 2012 4 commits
  9. 13 Mar, 2012 2 commits
  10. 21 Feb, 2012 2 commits
    • D
      bnx2x: add gro_check · fe603b4d
      Dmitry Kravkov committed
      This patch provides a workaround for a bug in FW 7.2.16,
      which in GRO mode may miscalculate the buffer and
      place one frag fewer on the SGE than it could.
      This can happen only for some MTUs; we mark these MTUs
      with the gro_check flag during device initialization or
      MTU change.
      
      The next FW should include a fix for the issue, at which point this
      patch can be reverted.
      Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fe603b4d
    • D
      use FW 7.2.16 · 621b4d66
      Dmitry Kravkov committed
      The patch integrates the FW 7.2.16 HSI and implements the driver
      part of the GRO flow.
      
      FW 7.2.16 adds the ability to aggregate packets for GRO
      (and not just LRO) and also fixes some bugs.
      
      1. Added new aggregation mode: GRO. In this mode packets are aggregated
         such that the original packets can be reconstructed by the OS.
      2. 57712 HW bug workaround - initialized all CAM TM registers to 0x32.
      3. Adding the FCoE statistics structures to the BNX2X HSI.
      4. An incorrect configuration of the TX HW input buffer size could
         theoretically affect performance. The configuration has been fixed.
      5. FCOE - Arrival of packets beyond task IO size can lead to crash.
         Fix firmware data-in flow.
      6. iSCSI - In rare cases of on-chip termination the graceful termination
         timer hangs, and the termination doesn't complete. Firmware fix to MSL
         timer tolerance.
      7. iSCSI - Chip hangs when target sends FIN out-of-order or with isles
         open at the initiator side. Firmware implementation corrected to drop
         FIN received out-of-order or with isles still open.
      8. iSCSI - Chip hangs in case of a retransmission not aligned to 4 bytes
         from the beginning of an iSCSI PDU. Firmware implementation corrected
         to support arbitrarily aligned retransmissions.
      9. iSCSI - Arrival of target-initiated NOP-IN during intense iSCSI traffic
         might lead to crash. Firmware fix to relevant flow.
      Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      621b4d66
  11. 16 Feb, 2012 3 commits
  12. 08 Feb, 2012 1 commit
  13. 27 Jan, 2012 4 commits
  14. 24 Jan, 2012 1 commit
  15. 17 Dec, 2011 1 commit
  16. 14 Dec, 2011 1 commit
  17. 04 Dec, 2011 1 commit
  18. 30 Nov, 2011 2 commits
  19. 17 Nov, 2011 1 commit
  20. 15 Nov, 2011 1 commit
    • E
      bnx2x: uses build_skb() in receive path · e52fcb24
      Eric Dumazet committed
      bnx2x uses the following formula to compute its rx_buf_sz:
      
      dev->mtu + 2*L1_CACHE_BYTES + 14 + 8 + 8 + 2
      
      Then the core network stack adds NET_SKB_PAD and SKB_DATA_ALIGN(sizeof(struct
      skb_shared_info)).
      
      Final allocated size for skb head on x86_64 (L1_CACHE_BYTES = 64,
      MTU=1500): 2112 bytes, which SLUB/SLAB rounds up to 4096 bytes.
      
      Since skb truesize is then bigger than SK_MEM_QUANTUM, we have a lot of
      false sharing because of mem_reclaim in the UDP stack.
      
      One possible way to halve truesize is to reduce the need by 64 bytes
      (2112 -> 2048 bytes).
      
      Instead of allocating a full cache line at the end of packet for
      alignment, we can use the fact that skb_shared_info sits at the end of
      skb->head, and we can use this room, if we convert bnx2x to new
      build_skb() infrastructure.
      
      skb_shared_info will be initialized after the hardware has finished its
      transfer, so we can eventually overwrite the final padding.
      
      Using build_skb() also reduces cache line misses in the driver, since we
      use cache hot skb instead of cold ones. Number of in-flight sk_buff
      structures is lower, they are recycled while still hot.
      
      Performance results:
      
      (820,000 pps on a single-threaded rx UDP benchmark, instead of 720,000 pps)
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Eilon Greenstein <eilong@broadcom.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      CC: Tom Herbert <therbert@google.com>
      CC: Jamal Hadi Salim <hadi@mojatatu.com>
      CC: Stephen Hemminger <shemminger@vyatta.com>
      CC: Thomas Graf <tgraf@infradead.org>
      CC: Herbert Xu <herbert@gondor.apana.org.au>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Acked-by: Eilon Greenstein <eilong@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e52fcb24
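The size arithmetic in this commit message can be checked with a short user-space sketch. The `sketch_` helpers are hypothetical, and rounding up to the next power of two is a simplifying assumption that stands in for the allocator behaviour the message describes (2112 bytes being served from a 4096-byte object). The sketch does not attempt to reproduce NET_SKB_PAD or sizeof(struct skb_shared_info), whose values depend on the kernel configuration.

```c
#include <stddef.h>

/* rx_buf_sz formula quoted in the commit message:
 * dev->mtu + 2*L1_CACHE_BYTES + 14 + 8 + 8 + 2 */
static size_t sketch_rx_buf_sz(size_t mtu, size_t l1_cache_bytes)
{
    return mtu + 2 * l1_cache_bytes + 14 + 8 + 8 + 2;
}

/* Simplified model of allocator rounding for sizes in this range:
 * round up to the next power of two. */
static size_t sketch_round_up_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}
```

For MTU 1500 and 64-byte cache lines the formula gives 1660 bytes before the core stack adds its own padding. Under the rounding model, the 2112-byte total from the message lands in a 4096-byte object, while 2048 bytes fits exactly, which is why trimming 64 bytes halves the allocation.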
  21. 14 Nov, 2011 4 commits