1. 24 Jan, 2017 (1 commit)
  2. 04 Dec, 2016 (1 commit)
  3. 17 Nov, 2016 (1 commit)
    • bnx2x: switch to napi_complete_done() · 80f1c21c
      Committed by Eric Dumazet
      Switch from napi_complete() to napi_complete_done()
      for better GRO support (gro_flush_timeout) and core NAPI
      features.

      Do not rearm interrupts if we are busy polling,
      to reduce bus and interrupt overhead (the resulting poll
      pattern is sketched after this entry).
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Adam Belay <abelay@google.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Yuval Mintz <Yuval.Mintz@cavium.com>
      Cc: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
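      A minimal sketch of the poll-handler pattern this change moves to
      (generic NAPI driver shape, not the actual bnx2x code; my_process_rx()
      and my_enable_irq() are hypothetical helpers, and the bool return of
      napi_complete_done() assumes a kernel of this commit's era or newer):

          #include <linux/netdevice.h>

          static int my_process_rx(struct napi_struct *napi, int budget);
          static void my_enable_irq(struct napi_struct *napi);

          /* Report how much work was done so the core can honor
           * gro_flush_timeout, and only rearm the device IRQ when NAPI
           * really finished (napi_complete_done() returns false while a
           * busy-polling user still owns the context). */
          static int my_poll(struct napi_struct *napi, int budget)
          {
                  int work_done = my_process_rx(napi, budget);

                  if (work_done < budget &&
                      napi_complete_done(napi, work_done))
                          my_enable_irq(napi);

                  return work_done;
          }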
  4. 18 Oct, 2016 (1 commit)
    • ethernet/broadcom: use core min/max MTU checking · e1c6dcca
      Committed by Jarod Wilson
      tg3: min_mtu 60, max_mtu 9000/1500
      
      bnxt: min_mtu 60, max_mtu 9000
      
      bnx2x: min_mtu 46, max_mtu 9600
      - Fix up ETH_OVREHEAD -> ETH_OVERHEAD while we're in here, remove
        duplicated defines from bnx2x_link.c.
      
      bnx2: min_mtu 46, max_mtu 9000
      - Use more standard ETH_* defines while we're at it.
      
      bcm63xx_enet: min_mtu 46, max_mtu 2028
      - compute_hw_mtu was made largely pointless, and thus merged back into
        bcm_enet_change_mtu.
      
      b44: min_mtu 60, max_mtu 1500
      
      CC: netdev@vger.kernel.org
      CC: Michael Chan <michael.chan@broadcom.com>
      CC: Sony Chacko <sony.chacko@qlogic.com>
      CC: Ariel Elior <ariel.elior@qlogic.com>
      CC: Dept-HSGLinuxNICDev@qlogic.com
      CC: Siva Reddy Kallam <siva.kallam@broadcom.com>
      CC: Prashant Sreedharan <prashant@broadcom.com>
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
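      A minimal sketch of the pattern this series applies: the per-driver
      bounds check in ndo_change_mtu goes away, and the driver instead
      advertises its range at probe time so the core validates new MTUs.
      The values and the helper name below are illustrative, not those of
      any driver listed above:

          #include <linux/if_ether.h>
          #include <linux/netdevice.h>

          static void my_set_mtu_range(struct net_device *dev)
          {
                  /* The core now rejects MTUs outside [min_mtu, max_mtu]
                   * before ndo_change_mtu is ever called. */
                  dev->min_mtu = ETH_ZLEN - ETH_HLEN;     /* 46 */
                  dev->max_mtu = 9600;                    /* jumbo frames */
          }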
  5. 17 Mar, 2016 (1 commit)
    • bnx2x: don't wait for Tx completion on recovery · d78a1f08
      Committed by Yuval Mintz
      When the driver has hit a parity event, the HW can no longer write to
      host memory. As a result, Tx completions cannot be written to the host
      SB memory, and waiting for Tx completions eventually times out.
      Since the driver is willing to wait as much as 1-2 seconds per Tx queue
      for it to drain, and these delays are sequential, recovery time can
      lengthen needlessly when the recovery happens under multi-connection
      traffic.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 04 Mar, 2016 (1 commit)
    • net: relax setup_tc ndo op handle restriction · 5eb4dce3
      Committed by John Fastabend
      I added this check in setup_tc to multiple drivers,
      
       if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
      
      Unfortunately, restricting the handle to TC_H_ROOT like this breaks the
      old instantiation of mqprio to set up a hardware qdisc. This patch
      relaxes the test to only check the type, making it equivalent to the
      check before I broke it (see the sketch after this entry). With this,
      the old instantiation continues to work.
      
      A good smoke test is to setup mqprio with,
      
      # tc qdisc add dev eth4 root mqprio num_tc 8 \
        map 0 1 2 3 4 5 6 7 \
        queues 0@0 1@1 2@2 3@3 4@4 5@5 6@6 7@7
      
      Fixes: e4c6734e ("net: rework ndo tc op to consume additional qdisc handle parameter")
      Reported-by: Singh Krishneil <krishneil.k.singh@intel.com>
      Reported-by: Jake Keller <jacob.e.keller@intel.com>
      CC: Murali Karicheri <m-karicheri2@ti.com>
      CC: Shradha Shah <sshah@solarflare.com>
      CC: Or Gerlitz <ogerlitz@mellanox.com>
      CC: Ariel Elior <ariel.elior@qlogic.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: Bruce Allan <bruce.w.allan@intel.com>
      CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
      CC: Don Skidmore <donald.c.skidmore@intel.com>
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
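      The relaxed check, sketched against the ndo_setup_tc signature of that
      era (my_setup_tc and my_setup_tc_mqprio are hypothetical driver
      helpers; only the test itself comes from the changelog above):

          #include <linux/netdevice.h>

          static int my_setup_tc_mqprio(struct net_device *dev, u8 num_tc);

          static int my_setup_tc(struct net_device *dev, u32 handle,
                                 __be16 proto, struct tc_to_netdev *tc)
          {
                  /* The old, too-strict test rejected the legacy mqprio
                   * instantiation, which does not pass TC_H_ROOT:
                   *
                   *      if (handle != TC_H_ROOT ||
                   *          tc->type != TC_SETUP_MQPRIO)
                   *              return -EINVAL;
                   *
                   * Only the offload type needs checking: */
                  if (tc->type != TC_SETUP_MQPRIO)
                          return -EINVAL;

                  return my_setup_tc_mqprio(dev, tc->tc);
          }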
  7. 17 Feb, 2016 (3 commits)
  8. 19 Dec, 2015 (1 commit)
  9. 09 Dec, 2015 (2 commits)
  10. 06 Dec, 2015 (2 commits)
  11. 19 Nov, 2015 (3 commits)
    • net: provide generic busy polling to all NAPI drivers · 93d05d4a
      Committed by Eric Dumazet
      NAPI drivers no longer need to observe a particular protocol
      to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y).

      napi_hash_add() and napi_hash_del() are automatically called
      from the core networking stack, from netif_napi_add() and
      netif_napi_del() respectively (see the sketch after this entry).

      This patch depends on free_netdev() and netif_napi_del() being
      called from process context, which seems to be the norm.

      Drivers might still prefer to call napi_hash_del() on their
      own, since they might combine all the rcu grace periods into
      a single one, knowing their NAPI structures' lifetimes, while
      the core networking stack has no idea whether such combining
      is possible.

      Once this patch proves not to bring serious regressions,
      we will clean up drivers to either remove napi_hash_del()
      or provide appropriate combining of rcu grace periods.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
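      A sketch of what this lets driver init code drop (struct my_priv and
      my_poll are illustrative placeholders, not real driver symbols):

          #include <linux/netdevice.h>

          struct my_priv {
                  struct napi_struct napi;
          };

          static int my_poll(struct napi_struct *napi, int budget);

          static void my_init_napi(struct net_device *dev, struct my_priv *priv)
          {
                  /* Previously, a driver opting into busy polling also had
                   * to hash its NAPI context:
                   *
                   *      netif_napi_add(dev, &priv->napi, my_poll,
                   *                     NAPI_POLL_WEIGHT);
                   *      napi_hash_add(&priv->napi);
                   *
                   * Now netif_napi_add() hashes it and netif_napi_del()
                   * unhashes it, so this is enough: */
                  netif_napi_add(dev, &priv->napi, my_poll, NAPI_POLL_WEIGHT);
          }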
    • net: move skb_mark_napi_id() into core networking stack · 93f93a44
      Committed by Eric Dumazet
      We would like to automatically provide busy polling support
      to all NAPI drivers, without them having to implement anything.
      
      skb_mark_napi_id() can be called from napi_gro_receive() and
      napi_get_frags().
      
      A few drivers are still calling skb_mark_napi_id() themselves
      because they use netif_receive_skb(); they should eventually call
      napi_gro_receive() instead (see the sketch after this entry).
      I will leave this to the driver maintainers.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
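      A sketch of the driver-side implication (struct my_fastpath is an
      illustrative stand-in for a driver's rx queue structure):

          #include <linux/netdevice.h>

          struct my_fastpath {
                  struct napi_struct napi;
          };

          static void my_deliver_rx(struct my_fastpath *fp, struct sk_buff *skb)
          {
                  /* Drivers still using netif_receive_skb() must mark the
                   * skb themselves so busy polling can find the NAPI
                   * context:
                   *
                   *      skb_mark_napi_id(skb, &fp->napi);
                   *      netif_receive_skb(skb);
                   *
                   * The preferred path: napi_gro_receive() now marks the
                   * napi_id in the core. */
                  napi_gro_receive(&fp->napi, skb);
          }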
    • bnx2x: remove bnx2x_low_latency_recv() support · b59768c6
      Committed by Eric Dumazet
      Switch to native NAPI polling, as this reduces overhead and complexity.
      
      The normal path is faster, since a cmpxchg() is no longer required,
      and busy polling through NAPI polling has the same performance.
      
      Tested:
      lpk50:~# cat /proc/sys/net/core/busy_read
      70
      lpk50:~# nstat >/dev/null;./netperf -H lpk55 -t TCP_RR;nstat
      MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpk55.prod.google.com () port 0 AF_INET : first burst 0
      Local /Remote
      Socket Size   Request  Resp.   Elapsed  Trans.
      Send   Recv   Size     Size    Time     Rate
      bytes  Bytes  bytes    bytes   secs.    per sec
      
      16384  87380  1        1       10.00    40095.07
      16384  87380
      IpInReceives                    401062             0.0
      IpInDelivers                    401062             0.0
      IpOutRequests                   401079             0.0
      TcpActiveOpens                  7                  0.0
      TcpPassiveOpens                 3                  0.0
      TcpAttemptFails                 3                  0.0
      TcpEstabResets                  5                  0.0
      TcpInSegs                       401036             0.0
      TcpOutSegs                      401052             0.0
      TcpOutRsts                      38                 0.0
      UdpInDatagrams                  26                 0.0
      UdpOutDatagrams                 27                 0.0
      Ip6OutNoRoutes                  1                  0.0
      TcpExtDelayedACKs               1                  0.0
      TcpExtTCPPrequeued              98                 0.0
      TcpExtTCPDirectCopyFromPrequeue 98                 0.0
      TcpExtTCPHPHits                 4                  0.0
      TcpExtTCPHPHitsToUser           98                 0.0
      TcpExtTCPPureAcks               5                  0.0
      TcpExtTCPHPAcks                 101                0.0
      TcpExtTCPAbortOnData            6                  0.0
      TcpExtBusyPollRxPackets         400832             0.0
      TcpExtTCPOrigDataSent           400983             0.0
      IpExtInOctets                   21273867           0.0
      IpExtOutOctets                  21261254           0.0
      IpExtInNoECTPkts                401064             0.0
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 07 Nov, 2015 (1 commit)
    • mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd · d0164adc
      Committed by Mel Gorman
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access to one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT, leading to a situation where
      an optimistic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM identifies
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible (see the sketch after
        this entry). This is because checking for __GFP_WAIT as was done
        historically can now trigger false positives. Some exceptions like
        dm-crypt.c exist where the code intent is clearer if
        __GFP_DIRECT_RECLAIM is used instead of the helper due to flag
        manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and were depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
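      A sketch of the recommended helper usage (the allocation site and the
      function name are illustrative, not taken from any particular caller):

          #include <linux/gfp.h>
          #include <linux/slab.h>

          static void *my_alloc(size_t len, gfp_t gfp)
          {
                  /* Ask the helper instead of testing __GFP_WAIT directly,
                   * which can now yield false positives. */
                  if (gfpflags_allow_blocking(gfp))
                          return kmalloc(len, gfp);  /* may enter direct reclaim */

                  /* Non-blocking context: skip direct reclaim but still
                   * allow kswapd to be woken. */
                  return kmalloc(len, GFP_NOWAIT | __GFP_NOWARN);
          }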
  13. 18 Aug, 2015 (1 commit)
    • bnx2x: Fix bandwidth allocation for some MF modes · da3cc2da
      Committed by Yuval Mintz
      Management firmware tells the driver when a bandwidth configuration for
      a specific function exists, but, regrettably, the same field has
      different meanings depending on the multi-function mode: it can be
      either a percentage value or an actual speed.

      For newer multi-function modes the current logic is incorrect:
      the driver interprets the values as actual speeds instead of
      percentages, causing the resulting chip configuration to be incorrect.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 11 Aug, 2015 (1 commit)
  15. 30 Jul, 2015 (1 commit)
  16. 23 Jul, 2015 (4 commits)
  17. 29 Jun, 2015 (1 commit)
    • bnx2x: fix DMA API usage · 8031612d
      Committed by Michal Schmidt
      With CONFIG_DMA_API_DEBUG=y bnx2x triggers the error "DMA-API: device
      driver frees DMA memory with wrong function".
      On archs where PAGE_SIZE > SGE_PAGE_SIZE it also triggers "DMA-API:
      device driver frees DMA memory with different size".
      
      Fix this by making the mapping and unmapping symmetric:
       - Do not map the whole pool page at once. Instead map the
         SGE_PAGE_SIZE-sized pieces individually, so they can be unmapped in
         the same manner.
       - What's mapped using dma_map_page() must be unmapped using
         dma_unmap_page() (see the sketch after this entry).
      
      Tested on ppc64.
      
      Fixes: 4cace675 ("bnx2x: Alloc 4k fragment for each rx ring buffer element")
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
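      A sketch of the symmetric mapping described above (the helper names and
      the fixed 4k piece size are illustrative, not the actual bnx2x code):

          #include <linux/dma-mapping.h>

          #define MY_SGE_PAGE_SIZE 4096   /* size of one rx SGE piece */

          /* Map one SGE-sized piece of a (possibly larger) pool page... */
          static dma_addr_t my_map_sge(struct device *dev, struct page *page,
                                       unsigned int offset)
          {
                  return dma_map_page(dev, page, offset, MY_SGE_PAGE_SIZE,
                                      DMA_FROM_DEVICE);
          }

          /* ...and unmap it with the matching size and API, so
           * CONFIG_DMA_API_DEBUG sees symmetric map/unmap calls. */
          static void my_unmap_sge(struct device *dev, dma_addr_t mapping)
          {
                  dma_unmap_page(dev, mapping, MY_SGE_PAGE_SIZE,
                                 DMA_FROM_DEVICE);
          }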
  18. 25 Jun, 2015 (1 commit)
  19. 02 Jun, 2015 (1 commit)
    • bnx2x: Alloc 4k fragment for each rx ring buffer element · 4cace675
      Committed by Gabriel Krisman Bertazi
      The driver allocates one page for each buffer on the rx ring, which is
      too much on architectures like ppc64 and can cause unexpected allocation
      failures when the system is under stress.  Now, we keep a memory pool
      per queue, and if the architecture's PAGE_SIZE is greater than 4k, we
      fragment pages and assign each 4k segment to a ring element (see the
      sketch after this entry), which reduces the overall memory consumption
      on such architectures.  This helps avoid errors like the example below:
      
      [bnx2x_alloc_rx_sge:435(eth1)]Can't alloc sge
      [c00000037ffeb900] [d000000075eddeb4] .bnx2x_alloc_rx_sge+0x44/0x200 [bnx2x]
      [c00000037ffeb9b0] [d000000075ee0b34] .bnx2x_fill_frag_skb+0x1ac/0x460 [bnx2x]
      [c00000037ffebac0] [d000000075ee11f0] .bnx2x_tpa_stop+0x160/0x2e8 [bnx2x]
      [c00000037ffebb90] [d000000075ee1560] .bnx2x_rx_int+0x1e8/0xc30 [bnx2x]
      [c00000037ffebcd0] [d000000075ee2084] .bnx2x_poll+0xdc/0x3d8 [bnx2x] (unreliable)
      Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
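      A simplified sketch of the per-queue pool idea for architectures where
      PAGE_SIZE > 4k (no refcounting, DMA mapping or teardown shown; all
      names are illustrative, not the actual bnx2x implementation):

          #include <linux/errno.h>
          #include <linux/gfp.h>
          #include <linux/mm.h>

          #define MY_FRAG_SIZE 4096

          struct my_page_pool {
                  struct page *page;      /* current pool page */
                  unsigned int offset;    /* next free fragment within it */
          };

          /* Hand out one 4k fragment per rx ring element instead of a
           * whole (e.g. 64k on ppc64) page, allocating a new pool page
           * only when the current one is exhausted. */
          static int my_get_frag(struct my_page_pool *pool,
                                 struct page **page, unsigned int *offset)
          {
                  if (!pool->page ||
                      pool->offset + MY_FRAG_SIZE > PAGE_SIZE) {
                          pool->page = alloc_page(GFP_ATOMIC);
                          if (!pool->page)
                                  return -ENOMEM;
                          pool->offset = 0;
                  }

                  *page = pool->page;
                  *offset = pool->offset;
                  pool->offset += MY_FRAG_SIZE;
                  return 0;
          }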
  20. 12 May, 2015 (1 commit)
  21. 05 May, 2015 (1 commit)
  22. 30 Apr, 2015 (3 commits)
  23. 28 Apr, 2015 (1 commit)
    • bnx2x: really disable TPA if 'disable_tpa' option is set · 22a8f237
      Committed by Michal Schmidt
      bnx2x's 'disable_tpa=1' module option is not respected properly and TPA
      (transparent packet aggregation) remains enabled. Even though the
      module option causes LRO to be disabled, TPA is enabled in GRO mode.
      
      Additionally, disabling GRO via ethtool then has no effect. One can
      still observe tpa_* statistics increase and large packets being received
      in tcpdump.
      
      The bug was an unintended consequence of commit aebf6244 "bnx2x: Be
      more forgiving toward SW GRO".
      
      Fix it by following the bp->disable_tpa flag when initializing fp's.
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  24. 23 Apr, 2015 (1 commit)
  25. 16 Apr, 2015 (1 commit)
    • bnx2x: Fix busy_poll vs netpoll · 074975d0
      Committed by Eric Dumazet
      Commit 9a2620c8 ("bnx2x: prevent WARN during driver unload")
      switched the napi/busy_lock locking mechanism from spin_lock() to
      spin_lock_bh(), breaking interoperability with netconsole, as netpoll
      disables interrupts prior to calling our napi mechanism.
      
      This switches the driver to using atomic assignments instead of the
      spinlock mechanisms previously employed (see the sketch after this
      entry).
      
      Based on initial patch from Yuval Mintz & Ariel Elior
      
      I basically added softirq starvation avoidance, and a mixture
      of atomic operations, plain writes and barriers.
      
      Note this slightly reduces the overhead for this driver when no
      busy_poll sockets are in use.
      
      Fixes: 9a2620c8 ("bnx2x: prevent WARN during driver unload")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
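      A rough sketch of the direction described above; the actual patch uses
      a mix of plain writes, barriers and atomics tuned to the bnx2x fast
      path, so this only illustrates replacing a BH-disabling spinlock with
      an atomic claim (all names are illustrative):

          #include <linux/atomic.h>
          #include <linux/types.h>

          enum my_fp_state {
                  MY_FP_IDLE = 0,         /* nobody owns the queue */
                  MY_FP_NAPI,             /* NAPI (or netpoll) owns it */
                  MY_FP_POLL,             /* a busy-polling socket owns it */
          };

          /* Claim the queue for NAPI without spin_lock_bh(), so netpoll,
           * which runs with interrupts disabled, can call in safely. */
          static bool my_fp_lock_napi(atomic_t *state)
          {
                  return atomic_cmpxchg(state, MY_FP_IDLE, MY_FP_NAPI) ==
                         MY_FP_IDLE;
          }

          static void my_fp_unlock_napi(atomic_t *state)
          {
                  atomic_set(state, MY_FP_IDLE);
          }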
  26. 27 Jan, 2015 (1 commit)
  27. 14 Jan, 2015 (1 commit)
  28. 11 Dec, 2014 (1 commit)
  29. 17 Nov, 2014 (1 commit)