1. 28 Dec, 2017 1 commit
  2. 08 Nov, 2017 1 commit
  3. 08 Aug, 2017 4 commits
  4. 11 Jun, 2017 1 commit
  5. 08 Jun, 2017 1 commit
  6. 02 Jun, 2017 1 commit
    • M
      bnx2x: Fix Multi-Cos · 3968d389
      Mintz, Yuval committed
      Apparently multi-CoS has not been working for bnx2x for quite some time:
      the driver implements ndo_select_queue() to allow queue selection
      for FCoE, but the regular L2 flow would cause it to modulo the
      fallback's result by the number of queues.
      The fallback would return a queue matching the needed TC
      [via __skb_tx_hash()], but since the modulo is by the number of TSS
      queues, which does not account for the number of TCs, transmission would
      always be done by a queue configured to use TC0.
      
      Fixes: ada7c19e ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash")
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
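The modulo problem described above can be illustrated with a small user-space sketch. It assumes a layout in which each traffic class owns a contiguous block of Tx queues; the function names, the `max_cos` parameter and that layout are illustrative assumptions, not the driver's actual code.

```c
#include <stdint.h>

/* Assumed layout: traffic class `tc` owns the contiguous Tx queues
 * [tc * num_eth_queues, (tc + 1) * num_eth_queues). */
unsigned int fallback_queue(unsigned int tc, uint32_t hash,
                            unsigned int num_eth_queues)
{
    return tc * num_eth_queues + hash % num_eth_queues;
}

/* Old behaviour: the modulo ignores the number of TCs, so the result
 * always falls into the first block of queues (TC0). */
unsigned int select_queue_buggy(unsigned int tc, uint32_t hash,
                                unsigned int num_eth_queues)
{
    return fallback_queue(tc, hash, num_eth_queues) % num_eth_queues;
}

/* Sketch of a fix: take the modulo by the total number of Tx queues. */
unsigned int select_queue_fixed(unsigned int tc, uint32_t hash,
                                unsigned int num_eth_queues,
                                unsigned int max_cos)
{
    return fallback_queue(tc, hash, num_eth_queues) %
           (num_eth_queues * max_cos);
}

/* Which TC a given Tx queue belongs to under the assumed layout. */
unsigned int queue_to_tc(unsigned int queue, unsigned int num_eth_queues)
{
    return queue / num_eth_queues;
}
```

With 4 queues per TC and 3 TCs, a packet destined for TC 2 with hash 7 lands on queue 3 (a TC0 queue) in the buggy variant and on queue 11 (a TC2 queue) in the fixed one.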
  7. 01 May, 2017 1 commit
    • S
      bnx2x: Align RX buffers · 9b70de6d
      Scott Wood committed
      The bnx2x driver is not providing proper alignment on the receive buffers it
      passes to build_skb(), causing skb_shared_info to be misaligned.
      skb_shared_info contains an atomic, and while PPC normally supports
      unaligned accesses, it does not support unaligned atomics.
      
      Aligning the size of rx buffers will ensure that page_frag_alloc() returns
      aligned addresses.
      
      This can be reproduced on PPC by setting the network MTU to 1450 (or other
      non-multiple-of-4) and then generating sufficient inbound network traffic
      (one or two large "wget"s usually does it), producing the following oops:
      
      Unable to handle kernel paging request for unaligned access at address 0xc00000ffc43af656
      Faulting instruction address: 0xc00000000080ef8c
      Oops: Kernel access of bad area, sig: 7 [#1]
      SMP NR_CPUS=2048
      NUMA
      PowerNV
      Modules linked in: vmx_crypto powernv_rng rng_core powernv_op_panel leds_powernv led_class nfsd ip_tables x_tables autofs4 xfs lpfc bnx2x mdio libcrc32c crc_t10dif crct10dif_generic crct10dif_common
      CPU: 104 PID: 0 Comm: swapper/104 Not tainted 4.11.0-rc8-00088-g4c761daf #2
      task: c00000ffd4892400 task.stack: c00000ffd4920000
      NIP: c00000000080ef8c LR: c00000000080eee8 CTR: c0000000001f8320
      REGS: c00000ffffc33710 TRAP: 0600   Not tainted  (4.11.0-rc8-00088-g4c761daf)
      MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
        CR: 24082042  XER: 00000000
      CFAR: c00000000080eea0 DAR: c00000ffc43af656 DSISR: 00000000 SOFTE: 1
      GPR00: c000000000907f64 c00000ffffc33990 c000000000dd3b00 c00000ffcaf22100
      GPR04: c00000ffcaf22e00 0000000000000000 0000000000000000 0000000000000000
      GPR08: 0000000000b80008 c00000ffc43af636 c00000ffc43af656 0000000000000000
      GPR12: c0000000001f6f00 c00000000fe1a000 000000000000049f 000000000000c51f
      GPR16: 00000000ffffef33 0000000000000000 0000000000008a43 0000000000000001
      GPR20: c00000ffc58a90c0 0000000000000000 000000000000dd86 0000000000000000
      GPR24: c000007fd0ed10c0 00000000ffffffff 0000000000000158 000000000000014a
      GPR28: c00000ffc43af010 c00000ffc9144000 c00000ffcaf22e00 c00000ffcaf22100
      NIP [c00000000080ef8c] __skb_clone+0xdc/0x140
      LR [c00000000080eee8] __skb_clone+0x38/0x140
      Call Trace:
      [c00000ffffc33990] [c00000000080fb74] skb_clone+0x74/0x110 (unreliable)
      [c00000ffffc339c0] [c000000000907f64] packet_rcv+0x144/0x510
      [c00000ffffc33a40] [c000000000827b64] __netif_receive_skb_core+0x5b4/0xd80
      [c00000ffffc33b00] [c00000000082b2bc] netif_receive_skb_internal+0x2c/0xc0
      [c00000ffffc33b40] [c00000000082c49c] napi_gro_receive+0x11c/0x260
      [c00000ffffc33b80] [d000000066483d68] bnx2x_poll+0xcf8/0x17b0 [bnx2x]
      [c00000ffffc33d00] [c00000000082babc] net_rx_action+0x31c/0x480
      [c00000ffffc33e10] [c0000000000d5a44] __do_softirq+0x164/0x3d0
      [c00000ffffc33f00] [c0000000000d60a8] irq_exit+0x108/0x120
      [c00000ffffc33f20] [c000000000015b98] __do_irq+0x98/0x200
      [c00000ffffc33f90] [c000000000027f14] call_do_irq+0x14/0x24
      [c00000ffd4923a90] [c000000000015d94] do_IRQ+0x94/0x110
      [c00000ffd4923ae0] [c000000000008d90] hardware_interrupt_common+0x150/0x160
      Signed-off-by: David S. Miller <davem@davemloft.net>
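Why aligning the buffer size is enough can be seen in a simplified model of a fragment allocator that carves fixed-size buffers back to back from the end of a larger chunk. The model, the helper names and the example sizes (1526 bytes, a 32 KiB chunk, 64-byte alignment) are assumptions for illustration; this is not the kernel's page_frag_alloc().

```c
#include <stddef.h>

/* Round `size` up to a multiple of `align` (which must be a power of two). */
size_t align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}

/* Offset of the n-th fragment when fragments of `frag_size` bytes are
 * carved back to back from the end of a `chunk_size`-byte chunk.
 * Simplified model only. */
size_t frag_offset(size_t chunk_size, size_t frag_size, size_t n)
{
    return chunk_size - (n + 1) * frag_size;
}

/* Non-zero when `offset` is a multiple of `align` (a power of two). */
int is_aligned(size_t offset, size_t align)
{
    return (offset & (align - 1)) == 0;
}
```

In this model an unaligned size such as 1526 already yields a misaligned first fragment, while the rounded-up size 1536 keeps every fragment offset a multiple of 64.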
  8. 16 Mar, 2017 1 commit
    • A
      mqprio: Modify mqprio to pass user parameters via ndo_setup_tc. · 56f36acd
      Amritha Nambiar committed
      The configurable priority to traffic class mapping and the user specified
      queue ranges are used to configure the traffic class, overriding the
      hardware defaults when the 'hw' option is set to 0. However, when the 'hw'
      option is non-zero, the hardware QOS defaults are used.
      
      This patch makes it so that we can pass the data the user provided to
      ndo_setup_tc. This allows us to pull in the queue configuration if the
      user requested it as well as any additional hardware offload type
      requested by using a value other than 1 for the hw value.
      
      Finally, it also provides a means for the device driver to return the level
      supported for the offload type via the qopt->hw value. Previously we
      always assumed the value to be 1; in the future, values beyond 1
      may be supported.
      Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 31 Jan, 2017 1 commit
  10. 24 Jan, 2017 1 commit
  11. 04 Dec, 2016 1 commit
  12. 17 Nov, 2016 1 commit
    • E
      bnx2x: switch to napi_complete_done() · 80f1c21c
      Eric Dumazet committed
      Switch from napi_complete() to napi_complete_done()
      for better GRO support (gro_flush_timeout) and core NAPI
      features.
      
      Do not rearm interrupts if we are busy polling,
      to reduce bus and interrupt overhead.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Adam Belay <abelay@google.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Yuval Mintz <Yuval.Mintz@cavium.com>
      Cc: Ariel Elior <ariel.elior@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  13. 18 Oct, 2016 1 commit
    • J
      ethernet/broadcom: use core min/max MTU checking · e1c6dcca
      Jarod Wilson committed
      tg3: min_mtu 60, max_mtu 9000/1500
      
      bnxt: min_mtu 60, max_mtu 9000
      
      bnx2x: min_mtu 46, max_mtu 9600
      - Fix up ETH_OVREHEAD -> ETH_OVERHEAD while we're in here, remove
        duplicated defines from bnx2x_link.c.
      
      bnx2: min_mtu 46, max_mtu 9000
      - Use more standard ETH_* defines while we're at it.
      
      bcm63xx_enet: min_mtu 46, max_mtu 2028
      - compute_hw_mtu was made largely pointless, and thus merged back into
        bcm_enet_change_mtu.
      
      b44: min_mtu 60, max_mtu 1500
      
      CC: netdev@vger.kernel.org
      CC: Michael Chan <michael.chan@broadcom.com>
      CC: Sony Chacko <sony.chacko@qlogic.com>
      CC: Ariel Elior <ariel.elior@qlogic.com>
      CC: Dept-HSGLinuxNICDev@qlogic.com
      CC: Siva Reddy Kallam <siva.kallam@broadcom.com>
      CC: Prashant Sreedharan <prashant@broadcom.com>
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
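The idea of core-side MTU checking can be sketched as a single range test against per-device limits, so that individual drivers no longer repeat it. The struct and function names below are hypothetical stand-ins; only the limit values (46 and 9600 for bnx2x) come from the commit message.

```c
#include <errno.h>

/* Hypothetical stand-in for the per-device MTU limits. */
struct mtu_limits {
    unsigned int min_mtu;
    unsigned int max_mtu;
};

/* Range check done once in common code before the driver's
 * change-MTU handler would be invoked. Returns 0 or -EINVAL. */
int check_mtu(const struct mtu_limits *lim, unsigned int new_mtu)
{
    if (new_mtu < lim->min_mtu || new_mtu > lim->max_mtu)
        return -EINVAL;
    return 0;
}
```

For example, with limits of 46 and 9600, a request for 1500 passes while requests for 45 or 9601 are rejected before the driver is involved.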
  14. 17 Mar, 2016 1 commit
    • Y
      bnx2x: don't wait for Tx completion on recovery · d78a1f08
      Yuval Mintz committed
      When the driver has hit a parity event, HW can no longer write to host memory.
      As a result, Tx completions cannot be written to the host SB memory, and
      waiting for Tx completions eventually times out.
      As the driver is willing to delay as much as 1-2 seconds per Tx queue for its
      draining and this delay is sequential, the time to recover might lengthen
      greatly and needlessly when the recovery is done under multi-connection
      traffic.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 04 Mar, 2016 1 commit
    • J
      net: relax setup_tc ndo op handle restriction · 5eb4dce3
      John Fastabend committed
      I added this check in setup_tc to multiple drivers,
      
       if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
      
      Unfortunately restricting to TC_H_ROOT like this breaks the old
      instantiation of mqprio to set up a hardware qdisc. This patch
      relaxes the test to only check the type to make it equivalent
      to the check before I broke it. With this the old instantiation
      continues to work.
      
      A good smoke test is to set up mqprio with,
      
      # tc qdisc add dev eth4 root mqprio num_tc 8 \
        map 0 1 2 3 4 5 6 7 \
        queues 0@0 1@1 2@2 3@3 4@4 5@5 6@6 7@7
      
      Fixes: e4c6734e ("net: rework ndo tc op to consume additional qdisc handle paramete")
      Reported-by: Singh Krishneil <krishneil.k.singh@intel.com>
      Reported-by: Jake Keller <jacob.e.keller@intel.com>
      CC: Murali Karicheri <m-karicheri2@ti.com>
      CC: Shradha Shah <sshah@solarflare.com>
      CC: Or Gerlitz <ogerlitz@mellanox.com>
      CC: Ariel Elior <ariel.elior@qlogic.com>
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: Bruce Allan <bruce.w.allan@intel.com>
      CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
      CC: Don Skidmore <donald.c.skidmore@intel.com>
      Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
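The difference between the strict and the relaxed check can be shown side by side in a small sketch. The SKETCH_* constants and both function names are stand-ins invented for illustration (only the shape of the strict condition is taken from the commit message); the real drivers operate on kernel structures not reproduced here.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-ins for TC_H_ROOT and the setup type enum; illustrative only. */
#define SKETCH_TC_H_ROOT 0xFFFFFFFFU

enum sketch_tc_type {
    SKETCH_TC_SETUP_MQPRIO,
    SKETCH_TC_SETUP_OTHER
};

/* Strict check: rejects any handle other than the root handle. */
int setup_tc_strict(uint32_t handle, enum sketch_tc_type type)
{
    if (handle != SKETCH_TC_H_ROOT || type != SKETCH_TC_SETUP_MQPRIO)
        return -EINVAL;
    return 0;
}

/* Relaxed check: only the setup type is validated. */
int setup_tc_relaxed(uint32_t handle, enum sketch_tc_type type)
{
    (void)handle; /* the handle is deliberately ignored */
    if (type != SKETCH_TC_SETUP_MQPRIO)
        return -EINVAL;
    return 0;
}
```

A request of the mqprio type with a non-root handle is rejected by the strict variant and accepted by the relaxed one; requests of any other type are rejected by both.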
  16. 17 Feb, 2016 3 commits
  17. 19 Dec, 2015 1 commit
  18. 09 Dec, 2015 2 commits
  19. 06 Dec, 2015 2 commits
  20. 19 Nov, 2015 3 commits
    • E
      net: provide generic busy polling to all NAPI drivers · 93d05d4a
      Eric Dumazet committed
      NAPI drivers no longer need to observe a particular protocol
      to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y)
      
      napi_hash_add() and napi_hash_del() are automatically called
      from core networking stack, respectively from
      netif_napi_add() and netif_napi_del()
      
      This patch depends on free_netdev() and netif_napi_del() being
      called from process context, which seems to be the norm.
      
      Drivers might still prefer to call napi_hash_del() on their
      own, since they might combine all the rcu grace periods into
      a single one, knowing their NAPI structures lifetime, while
      core networking stack has no idea of a possible combining.
      
      Once this patch proves not to bring serious regressions,
      we will clean up drivers to either remove napi_hash_del()
      or provide appropriate rcu grace period combining.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      net: move skb_mark_napi_id() into core networking stack · 93f93a44
      Eric Dumazet committed
      We would like to automatically provide busy polling support
      to all NAPI drivers, without them having to implement anything.
      
      skb_mark_napi_id() can be called from napi_gro_receive() and
      napi_get_frags().
      
      A few drivers are still calling skb_mark_napi_id() because
      they use netif_receive_skb(). They should eventually call
      napi_gro_receive() instead. I will leave this to the driver
      maintainers.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • E
      bnx2x: remove bnx2x_low_latency_recv() support · b59768c6
      Eric Dumazet committed
      Switch to native NAPI polling, as this reduces overhead and complexity.
      
      The normal path is faster, since one cmpxchg() is no longer required,
      and busy polling with NAPI polling has the same performance.
      
      Tested:
      lpk50:~# cat /proc/sys/net/core/busy_read
      70
      lpk50:~# nstat >/dev/null;./netperf -H lpk55 -t TCP_RR;nstat
      MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpk55.prod.google.com () port 0 AF_INET : first burst 0
      Local /Remote
      Socket Size   Request  Resp.   Elapsed  Trans.
      Send   Recv   Size     Size    Time     Rate
      bytes  Bytes  bytes    bytes   secs.    per sec
      
      16384  87380  1        1       10.00    40095.07
      16384  87380
      IpInReceives                    401062             0.0
      IpInDelivers                    401062             0.0
      IpOutRequests                   401079             0.0
      TcpActiveOpens                  7                  0.0
      TcpPassiveOpens                 3                  0.0
      TcpAttemptFails                 3                  0.0
      TcpEstabResets                  5                  0.0
      TcpInSegs                       401036             0.0
      TcpOutSegs                      401052             0.0
      TcpOutRsts                      38                 0.0
      UdpInDatagrams                  26                 0.0
      UdpOutDatagrams                 27                 0.0
      Ip6OutNoRoutes                  1                  0.0
      TcpExtDelayedACKs               1                  0.0
      TcpExtTCPPrequeued              98                 0.0
      TcpExtTCPDirectCopyFromPrequeue 98                 0.0
      TcpExtTCPHPHits                 4                  0.0
      TcpExtTCPHPHitsToUser           98                 0.0
      TcpExtTCPPureAcks               5                  0.0
      TcpExtTCPHPAcks                 101                0.0
      TcpExtTCPAbortOnData            6                  0.0
      TcpExtBusyPollRxPackets         400832             0.0
      TcpExtTCPOrigDataSent           400983             0.0
      IpExtInOctets                   21273867           0.0
      IpExtOutOctets                  21261254           0.0
      IpExtInNoECTPkts                401064             0.0
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 07 Nov, 2015 1 commit
    • M
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Mel Gorman committed
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access to one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimistic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM identifies
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites:
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and were depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
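The flag split described in this commit can be modelled with a few bit masks. The SK_* names, the bit values and the exact composition of the combined masks are assumptions made for illustration (the commit text only states that GFP_ATOMIC uses __GFP_ATOMIC and that the old "wait" meaning becomes direct reclaim plus waking kswapd); the helpers only mirror the idea of gfpflags_allow_blocking() and of clearing direct reclaim for mempool-backed callers.

```c
/* Illustrative bit values; not the kernel's actual encoding. */
#define SK_HIGH            0x01u  /* may use the high priority reserve */
#define SK_ATOMIC          0x02u  /* truly atomic, no alternative */
#define SK_DIRECT_RECLAIM  0x04u  /* may sleep and enter direct reclaim */
#define SK_KSWAPD_RECLAIM  0x08u  /* wants kswapd woken */
#define SK_IO              0x10u
#define SK_FS              0x20u

/* Assumed compositions of the common masks. */
#define SK_RECLAIM     (SK_DIRECT_RECLAIM | SK_KSWAPD_RECLAIM)
#define SK_GFP_ATOMIC  (SK_HIGH | SK_ATOMIC | SK_KSWAPD_RECLAIM)
#define SK_GFP_KERNEL  (SK_RECLAIM | SK_IO | SK_FS)

/* A caller may block only if it is willing to enter direct reclaim. */
int allows_blocking(unsigned int flags)
{
    return (flags & SK_DIRECT_RECLAIM) != 0;
}

/* Mempool-backed callers: never block, but still wake kswapd. */
unsigned int mempool_flags(unsigned int flags)
{
    return flags & ~SK_DIRECT_RECLAIM;
}
```

Under these assumptions an atomic request never blocks, a kernel-context request may, and a mempool-style request derived from it stops blocking while still asking for background reclaim.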
  22. 18 Aug, 2015 1 commit
    • Y
      bnx2: Fix bandwidth allocation for some MF modes · da3cc2da
      Yuval Mintz committed
      Management firmware tells the driver whether a bandwidth configuration for
      a specific function exists, but [regrettably] the same field has different
      meanings depending on the multi-function mode: it can be either
      a percentage or an actual speed.
      
      For newer multi-function modes the current logic is incorrect:
      the driver interprets values as actual speeds instead of percentages,
      causing the resulting chip configuration to be incorrect.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
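The ambiguity of the bandwidth field can be made concrete with a short sketch that interprets the same raw value either as a percentage of the line speed or as an absolute speed. The function name, the boolean mode selector and the 100 Mbps unit for absolute speeds are all assumptions for illustration, not details confirmed by the commit.

```c
/* Interpret a raw bandwidth configuration value.
 * cfg_is_percent != 0: `cfg` is a percentage of the line speed.
 * cfg_is_percent == 0: `cfg` is an absolute speed in units of
 *                      100 Mbps (assumed unit). */
unsigned int max_rate_mbps(unsigned int cfg, unsigned int line_speed_mbps,
                           int cfg_is_percent)
{
    if (cfg_is_percent)
        return line_speed_mbps * cfg / 100;
    return cfg * 100;
}
```

On a 20000 Mbps link, a raw value of 25 means 5000 Mbps when read as a percentage but 2500 Mbps when read as a speed, which is the kind of mismatch the commit describes.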
  23. 11 Aug, 2015 1 commit
  24. 30 Jul, 2015 1 commit
  25. 23 Jul, 2015 4 commits
  26. 29 Jun, 2015 1 commit
    • M
      bnx2x: fix DMA API usage · 8031612d
      Michal Schmidt committed
      With CONFIG_DMA_API_DEBUG=y bnx2x triggers the error "DMA-API: device
      driver frees DMA memory with wrong function".
      On archs where PAGE_SIZE > SGE_PAGE_SIZE it also triggers "DMA-API:
      device driver frees DMA memory with different size".
      
      Fix this by making the mapping and unmapping symmetric:
       - Do not map the whole pool page at once. Instead map the
         SGE_PAGE_SIZE-sized pieces individually, so they can be unmapped in
         the same manner.
       - What's mapped using dma_map_page() must be unmapped using
         dma_unmap_page().
      
      Tested on ppc64.
      
      Fixes: 4cace675 ("bnx2x: Alloc 4k fragment for each rx ring buffer element")
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  27. 25 Jun, 2015 1 commit
  28. 02 Jun, 2015 1 commit
    • G
      bnx2x: Alloc 4k fragment for each rx ring buffer element · 4cace675
      Gabriel Krisman Bertazi committed
      The driver allocates one page for each buffer on the rx ring, which is
      too much on architectures like ppc64 and can cause unexpected allocation
      failures when the system is under stress.  Now, we keep a memory pool
      per queue, and if the architecture's PAGE_SIZE is greater than 4k, we
      fragment pages and assign each 4k segment to a ring element, which
      reduces the overall memory consumption on such architectures.  This
      helps avoid errors like the example below:
      
      [bnx2x_alloc_rx_sge:435(eth1)]Can't alloc sge
      [c00000037ffeb900] [d000000075eddeb4] .bnx2x_alloc_rx_sge+0x44/0x200 [bnx2x]
      [c00000037ffeb9b0] [d000000075ee0b34] .bnx2x_fill_frag_skb+0x1ac/0x460 [bnx2x]
      [c00000037ffebac0] [d000000075ee11f0] .bnx2x_tpa_stop+0x160/0x2e8 [bnx2x]
      [c00000037ffebb90] [d000000075ee1560] .bnx2x_rx_int+0x1e8/0xc30 [bnx2x]
      [c00000037ffebcd0] [d000000075ee2084] .bnx2x_poll+0xdc/0x3d8 [bnx2x] (unreliable)
      Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
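The memory saving from fragmenting pages can be estimated with simple arithmetic. The helpers below are a hypothetical sketch: the names, the ring size of 1024 elements and the 64 KiB page size are example assumptions, while the 4k fragment size comes from the commit message.

```c
#include <stddef.h>

/* Number of fixed-size fragments that fit in one page. */
size_t frags_per_page(size_t page_size, size_t frag_size)
{
    return page_size / frag_size;
}

/* Pages needed to back `ring_elems` ring elements, one fragment each. */
size_t pages_needed(size_t ring_elems, size_t page_size, size_t frag_size)
{
    size_t per_page = frags_per_page(page_size, frag_size);

    return (ring_elems + per_page - 1) / per_page;
}

/* Byte offset inside its page of the fragment for ring element `index`. */
size_t frag_byte_offset(size_t index, size_t page_size, size_t frag_size)
{
    return (index % frags_per_page(page_size, frag_size)) * frag_size;
}
```

With 64 KiB pages, a ring of 1024 elements needs 64 pages when each element gets a 4k fragment, instead of 1024 pages when each element gets a whole page; with 4 KiB pages nothing changes.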