1. 18 Oct, 2016 1 commit
    • ethernet/intel: use core min/max MTU checking · 91c527a5
      Jarod Wilson committed
      e100: min_mtu 68, max_mtu 1500
      - remove e100_change_mtu entirely, is identical to old eth_change_mtu,
        and no longer serves a purpose. No need to set min_mtu or max_mtu
        explicitly, as ether_setup() will already set them to 68 and 1500.
      
      e1000: min_mtu 46, max_mtu 16110
      
      e1000e: min_mtu 68, max_mtu varies based on adapter
      
      fm10k: min_mtu 68, max_mtu 15342
      - remove fm10k_change_mtu entirely, does nothing now
      
      i40e: min_mtu 68, max_mtu 9706
      
      i40evf: min_mtu 68, max_mtu 9706
      
      igb: min_mtu 68, max_mtu 9216
      - There are two different "max" frame sizes claimed and both checked in
        the driver, the larger value wasn't relevant though, so I've set max_mtu
        to the smaller of the two values here to retain identical behavior.
      
      igbvf: min_mtu 68, max_mtu 9216
      - Same issue as igb duplicated
      
      ixgb: min_mtu 68, max_mtu 16114
      - Also remove pointless old == new check, as that's done in dev_set_mtu
      
      ixgbe: min_mtu 68, max_mtu 9710
      
      ixgbevf: min_mtu 68, max_mtu dependent on hardware/firmware
      - Some hw can only handle up to max_mtu 1504 on a vf, others 9710
      
      CC: netdev@vger.kernel.org
      CC: intel-wired-lan@lists.osuosl.org
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
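The range check that this commit moves out of the individual drivers can be illustrated in user space. The following is only a minimal sketch, assuming a core check that rejects an MTU below `min_mtu`, or above `max_mtu` when `max_mtu` is non-zero. The names `mtu_limits` and `check_mtu_range` are hypothetical and are not kernel APIs; the numeric limits are the ones quoted in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the two limits a driver fills in at probe time. */
struct mtu_limits {
	unsigned int min_mtu;
	unsigned int max_mtu;	/* 0 is treated as "no upper bound" in this sketch */
};

/* Returns 0 if new_mtu is within the limits, -EINVAL otherwise. */
static int check_mtu_range(const struct mtu_limits *lim, int new_mtu)
{
	assert(lim != NULL);

	if (new_mtu < 0 || (unsigned int)new_mtu < lim->min_mtu)
		return -EINVAL;
	if (lim->max_mtu > 0 && (unsigned int)new_mtu > lim->max_mtu)
		return -EINVAL;
	return 0;
}

/* Limits quoted in the commit message for e1000 and igb. */
static const struct mtu_limits e1000_limits = { 46, 16110 };
static const struct mtu_limits igb_limits = { 68, 9216 };
```

With a single check of this shape in one place, a driver whose change_mtu handler did nothing beyond range validation no longer needs the handler at all, which matches the removals listed for e100 and fm10k.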
  2. 07 Apr, 2016 1 commit
  3. 06 Apr, 2016 2 commits
  4. 13 Dec, 2015 4 commits
  5. 16 Oct, 2015 1 commit
    • drivers/net/intel: use napi_complete_done() · 32b3e08f
      Jesse Brandeburg committed
      As per Eric Dumazet's previous patches:
      (see commit (24d2e4a5) - tg3: use napi_complete_done())
      
      Quoting verbatim:
      Using napi_complete_done() instead of napi_complete() allows
      us to use /sys/class/net/ethX/gro_flush_timeout
      
      GRO layer can aggregate more packets if the flush is delayed a bit,
      without having to set too big coalescing parameters that impact
      latencies.
      </end quote>
      
      Tested
      configuration: low latency via ethtool -C ethx adaptive-rx off
      				rx-usecs 10 adaptive-tx off tx-usecs 15
      workload: streaming rx using netperf TCP_MAERTS
      
      igb:
      MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
      ...
      Interim result:  941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589
      
      Alignment      Offset         Bytes    Bytes       Recvs   Bytes    Sends
      Local  Remote  Local  Remote  Xfered   Per                 Per
      Recv   Send    Recv   Send             Recv (avg)          Send (avg)
          8       8      0       0 1176930056  1475.36    797726   16384.00  71905
      
      MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
      ...
      Interim result:  941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763
      
      Alignment      Offset         Bytes    Bytes       Recvs   Bytes    Sends
      Local  Remote  Local  Remote  Xfered   Per                 Per
      Recv   Send    Recv   Send             Recv (avg)          Send (avg)
          8       8      0       0 1175182320  50476.00     23282   16384.00  71816
      
      i40e:
      Hard to test because the traffic is incoming so fast (24Gb/s) that GRO
      always receives 87kB, even at the highest interrupt rate.
      
      Other drivers were only compile tested.
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  6. 12 May, 2015 1 commit
  7. 09 Apr, 2015 1 commit
  8. 09 Mar, 2015 1 commit
  9. 06 Mar, 2015 2 commits
    • e1000: add dummy allocator to fix race condition between mtu change and netpoll · 08e83316
      Sabrina Dubroca committed
      There is a race condition between e1000_change_mtu's cleanups and
      netpoll, when we change the MTU across jumbo size:
      
      Changing MTU frees all the rx buffers:
          e1000_change_mtu -> e1000_down -> e1000_clean_all_rx_rings ->
              e1000_clean_rx_ring
      
      Then, close to the end of e1000_change_mtu:
          pr_info -> ... -> netpoll_poll_dev -> e1000_clean ->
              e1000_clean_rx_irq -> e1000_alloc_rx_buffers -> e1000_alloc_frag
      
      And when we come back to do the rest of the MTU change:
          e1000_up -> e1000_configure -> e1000_configure_rx ->
              e1000_alloc_jumbo_rx_buffers
      
      alloc_jumbo finds the buffers already != NULL, since data (shared with
      page in e1000_rx_buffer->rxbuf) has been re-alloc'd, but it's garbage,
      or at least not what is expected when in jumbo state.
      
      This results in an unusable adapter (packets don't get through), and a
      NULL pointer dereference on the next call to e1000_clean_rx_ring
      (other mtu change, link down, shutdown):
      
      BUG: unable to handle kernel NULL pointer dereference at           (null)
      IP: [<ffffffff81194d6e>] put_compound_page+0x7e/0x330
      
          [...]
      
      Call Trace:
       [<ffffffff81195445>] put_page+0x55/0x60
       [<ffffffff815d9f44>] e1000_clean_rx_ring+0x134/0x200
       [<ffffffff815da055>] e1000_clean_all_rx_rings+0x45/0x60
       [<ffffffff815df5e0>] e1000_down+0x1c0/0x1d0
       [<ffffffff811e2260>] ? deactivate_slab+0x7f0/0x840
       [<ffffffff815e21bc>] e1000_change_mtu+0xdc/0x170
       [<ffffffff81647050>] dev_set_mtu+0xa0/0x140
       [<ffffffff81664218>] do_setlink+0x218/0xac0
       [<ffffffff814459e9>] ? nla_parse+0xb9/0x120
       [<ffffffff816652d0>] rtnl_newlink+0x6d0/0x890
       [<ffffffff8104f000>] ? kvm_clock_read+0x20/0x40
       [<ffffffff810a2068>] ? sched_clock_cpu+0xa8/0x100
       [<ffffffff81663802>] rtnetlink_rcv_msg+0x92/0x260
      
      By setting the allocator to a dummy version, netpoll can't mess up our
      rx buffers.  The allocator is set back to a sane value in
      e1000_configure_rx.
      
      Fixes: edbbb3ca ("e1000: implement jumbo receive with partial descriptors")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • e1000: call netif_carrier_off early on down · f9c029db
      Eliezer Tamir committed
      When bringing down an interface netif_carrier_off() should be
      one of the first things we do, since this will prevent the stack
      from queuing more packets to this interface.
      This operation is very fast, and should make the device behave
      much nicer when trying to bring down an interface under load.
      
      Also, this would Do The Right Thing (TM) if this device has some
      sort of fail-over teaming and redirect traffic to the other IF.
      
      Move netif_carrier_off as early as possible.
      Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
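The dummy-allocator idea from commit 08e83316 above can be modelled in plain C. This is only a sketch under simplifying assumptions: the types and function names (`adapter`, `rx_poll`, `adapter_down`, `adapter_configure_rx`) are hypothetical stand-ins, not the driver's own symbols, and the ring is reduced to an array of flags. The point it shows is that while the allocator pointer refers to a no-op, a stray poll cannot repopulate a ring that has just been cleaned.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 4

struct adapter {
	int buf_allocated[RING_SIZE];	/* 1 = slot currently holds a buffer */
	/* Allocator used by the poll path; swapped while the device is down. */
	void (*alloc_rx_buf)(struct adapter *a);
};

/* Normal allocator: fill every slot. */
static void alloc_normal(struct adapter *a)
{
	for (size_t i = 0; i < RING_SIZE; i++)
		a->buf_allocated[i] = 1;
}

/* Dummy allocator: deliberately does nothing. */
static void alloc_dummy(struct adapter *a)
{
	(void)a;
}

static int count_allocated(const struct adapter *a)
{
	int n = 0;

	for (size_t i = 0; i < RING_SIZE; i++)
		n += a->buf_allocated[i];
	return n;
}

/* Poll path: may be entered at any time, e.g. by netpoll during a printk. */
static void rx_poll(struct adapter *a)
{
	assert(a->alloc_rx_buf != NULL);
	a->alloc_rx_buf(a);
}

/* Going down: block reallocation first, then free all buffers. */
static void adapter_down(struct adapter *a)
{
	a->alloc_rx_buf = alloc_dummy;
	for (size_t i = 0; i < RING_SIZE; i++)
		a->buf_allocated[i] = 0;
}

/* Reconfiguration restores a real allocator. */
static void adapter_configure_rx(struct adapter *a)
{
	a->alloc_rx_buf = alloc_normal;
}
```

In this model, calling `rx_poll()` between `adapter_down()` and `adapter_configure_rx()` leaves the ring empty, so the later jumbo-buffer setup would never see stale entries.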
  10. 23 Jan, 2015 1 commit
  11. 14 Jan, 2015 1 commit
  12. 11 Dec, 2014 1 commit
  13. 09 Dec, 2014 1 commit
  14. 30 Oct, 2014 1 commit
  15. 12 Sep, 2014 7 commits
  16. 26 Aug, 2014 1 commit
    • e1000: Fix TSO for non-accelerated vlan traffic · 06f4d033
      Vlad Yasevich committed
      This device claims TSO and checksum support for vlans.  It also
      allows a user to control vlan acceleration offloading.  As such,
      it is possible to turn off vlan acceleration and configure a vlan
      which will continue to support TSO.
      
      In such a situation the packet passed down to the device will contain
      a vlan header and skb->protocol will be set to ETH_P_8021Q.
      The device assumes that skb->protocol contains the network protocol
      value and uses that value to set up TSO and checksum information.
      This will result in corrupted frames sent on the wire.
      
      This patch extracts the protocol value correctly and corrects TSO
      for non-accelerated traffic.
      
      CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
      CC: Bruce Allan <bruce.w.allan@intel.com>
      CC: Carolyn Wyborny <carolyn.wyborny@intel.com>
      CC: Don Skidmore <donald.c.skidmore@intel.com>
      CC: Greg Rose <gregory.v.rose@intel.com>
      CC: Alex Duyck <alexander.h.duyck@intel.com>
      CC: John Ronciak <john.ronciak@intel.com>
      CC: Mitch Williams <mitch.a.williams@intel.com>
      CC: Linux NICS <linux.nics@intel.com>
      CC: e1000-devel@lists.sourceforge.net
      Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
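To make the protocol-extraction problem concrete, here is a small user-space sketch that finds the network-layer EtherType of a raw Ethernet frame by skipping any in-band VLAN tags. It is an illustration only, not the driver's code: the function name `frame_network_proto` is hypothetical, the constants are defined locally, and the kernel works on an skb rather than a byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN	14	/* dst MAC + src MAC + EtherType */
#define VLAN_HLEN	4	/* TPID + TCI */
#define ETH_P_IP	0x0800
#define ETH_P_8021Q	0x8100
#define ETH_P_8021AD	0x88A8

/* Read a 16-bit big-endian value from the frame. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Return the network-layer EtherType, skipping any VLAN tags carried
 * in the frame itself. Returns 0 if the frame is too short.
 */
static uint16_t frame_network_proto(const uint8_t *frame, size_t len)
{
	size_t off = 12;	/* EtherType follows the two 6-byte addresses */
	uint16_t proto;

	assert(frame != NULL);
	if (len < ETH_HLEN)
		return 0;

	proto = read_be16(frame + off);
	while (proto == ETH_P_8021Q || proto == ETH_P_8021AD) {
		off += VLAN_HLEN;	/* step over the tag to the next type field */
		if (len < off + 2)
			return 0;
		proto = read_be16(frame + off);
	}
	return proto;
}
```

For a frame tagged with 802.1Q, the first type field reads 0x8100, which is what a driver would see if it trusted the outer value alone; the loop above steps past the tag and returns the inner type, such as 0x0800 for IPv4.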
  17. 13 Aug, 2014 1 commit
  18. 04 Jun, 2014 1 commit
  19. 11 Apr, 2014 1 commit
  20. 30 Nov, 2013 3 commits
    • e1000: fix possible reset_task running after adapter down · 74a1b1ea
      Vladimir Davydov committed
      On e1000_down(), we should ensure every asynchronous work is canceled
      before proceeding. Since the watchdog_task can schedule other works
      apart from itself, it should be stopped first, but currently it is
      stopped after the reset_task. This can result in the following race
      leading to the reset_task running after the module unload:
      
      e1000_down_and_stop():			e1000_watchdog():
      ----------------------			-----------------
      
      cancel_work_sync(reset_task)
      					schedule_work(reset_task)
      cancel_delayed_work_sync(watchdog_task)
      
      The patch moves cancel_delayed_work_sync(watchdog_task) to the beginning
      of e1000_down_and_stop(), thus ensuring the race is impossible.
      
      Cc: Tushar Dave <tushar.n.dave@intel.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • e1000: fix lockdep warning in e1000_reset_task · b2f963bf
      Vladimir Davydov committed
      The patch fixes the following lockdep warning, which is 100%
      reproducible on network restart:
      
      ======================================================
      [ INFO: possible circular locking dependency detected ]
      3.12.0+ #47 Tainted: GF
      -------------------------------------------------------
      kworker/1:1/27 is trying to acquire lock:
       ((&(&adapter->watchdog_task)->work)){+.+...}, at: [<ffffffff8108a5b0>] flush_work+0x0/0x70
      
      but task is already holding lock:
       (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000]
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&adapter->mutex){+.+...}:
             [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
             [<ffffffff816b8cbc>] mutex_lock_nested+0x4c/0x390
             [<ffffffffa017233d>] e1000_watchdog+0x7d/0x5b0 [e1000]
             [<ffffffff8108b972>] process_one_work+0x1d2/0x510
             [<ffffffff8108ca80>] worker_thread+0x120/0x3a0
             [<ffffffff81092c1e>] kthread+0xee/0x110
             [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
      
      -> #0 ((&(&adapter->watchdog_task)->work)){+.+...}:
             [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810
             [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
             [<ffffffff8108a5eb>] flush_work+0x3b/0x70
             [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140
             [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20
             [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000]
             [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000]
             [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000]
             [<ffffffff8108b972>] process_one_work+0x1d2/0x510
             [<ffffffff8108ca80>] worker_thread+0x120/0x3a0
             [<ffffffff81092c1e>] kthread+0xee/0x110
             [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&adapter->mutex);
                                     lock((&(&adapter->watchdog_task)->work));
                                     lock(&adapter->mutex);
        lock((&(&adapter->watchdog_task)->work));
      
       *** DEADLOCK ***
      
      3 locks held by kworker/1:1/27:
       #0:  (events){.+.+.+}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510
       #1:  ((&adapter->reset_task)){+.+...}, at: [<ffffffff8108b906>] process_one_work+0x166/0x510
       #2:  (&adapter->mutex){+.+...}, at: [<ffffffffa0177c0a>] e1000_reset_task+0x4a/0xa0 [e1000]
      
      stack backtrace:
      CPU: 1 PID: 27 Comm: kworker/1:1 Tainted: GF            3.12.0+ #47
      Hardware name: System manufacturer System Product Name/P5B-VM SE, BIOS 0501    05/31/2007
      Workqueue: events e1000_reset_task [e1000]
       ffffffff820f6000 ffff88007b9dba98 ffffffff816b54a2 0000000000000002
       ffffffff820f5e50 ffff88007b9dbae8 ffffffff810ba936 ffff88007b9dbac8
       ffff88007b9dbb48 ffff88007b9d8f00 ffff88007b9d8780 ffff88007b9d8f00
      Call Trace:
       [<ffffffff816b54a2>] dump_stack+0x49/0x5f
       [<ffffffff810ba936>] print_circular_bug+0x216/0x310
       [<ffffffff810bd9c0>] __lock_acquire+0x1710/0x1810
       [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
       [<ffffffff810bdb5d>] lock_acquire+0x9d/0x120
       [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
       [<ffffffff8108a5eb>] flush_work+0x3b/0x70
       [<ffffffff8108a5b0>] ? __flush_work+0x250/0x250
       [<ffffffff8108b5d8>] __cancel_work_timer+0x98/0x140
       [<ffffffff8108b693>] cancel_delayed_work_sync+0x13/0x20
       [<ffffffffa0170cec>] e1000_down_and_stop+0x3c/0x60 [e1000]
       [<ffffffffa01775b1>] e1000_down+0x131/0x220 [e1000]
       [<ffffffffa0177c12>] e1000_reset_task+0x52/0xa0 [e1000]
       [<ffffffff8108b972>] process_one_work+0x1d2/0x510
       [<ffffffff8108b906>] ? process_one_work+0x166/0x510
       [<ffffffff8108ca80>] worker_thread+0x120/0x3a0
       [<ffffffff8108c960>] ? manage_workers+0x2c0/0x2c0
       [<ffffffff81092c1e>] kthread+0xee/0x110
       [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70
       [<ffffffff816c3d7c>] ret_from_fork+0x7c/0xb0
       [<ffffffff81092b30>] ? __init_kthread_worker+0x70/0x70
      
      == The issue background ==
      
      The problem occurs, because e1000_down(), which is called under
      adapter->mutex by e1000_reset_task(), tries to synchronously cancel
      e1000 auxiliary works (reset_task, watchdog_task, phy_info_task,
      fifo_stall_task), which take adapter->mutex in their handlers. So the
      question is what does adapter->mutex protect there?
      
      The adapter->mutex was introduced by commit 0ef4ee ("e1000: convert to
      private mutex from rtnl") as a replacement for rtnl_lock() taken in the
      asynchronous handlers. It was aimed at fixing a similar lockdep warning
      issued when e1000_down() was called under rtnl_lock(), and it fixed it,
      but unfortunately it introduced the lockdep warning described above.
      Anyway, that said the source of this bug is that the asynchronous works
      were made to take rtnl_lock() some time ago, so let's look deeper and
      find why it was added there.
      
      The rtnl_lock() was added to asynchronous handlers by commit 338c15
      ("e1000: fix occasional panic on unload") in order to prevent
      asynchronous handlers from execution after the module is unloaded
      (e1000_down() is called) as it follows from the comment to the commit:
      
      > Net drivers in general have an issue where timers fired
      > by mod_timer or work threads with schedule_work are running
      > outside of the rtnl_lock.
      >
      > With no other lock protection these routines are vulnerable
      > to races with driver unload or reset paths.
      >
      > The longer term solution to this might be a redesign with
      > safer locks being taken in the driver to guarantee no
      > reentrance, but for now a safe and effective fix is
      > to take the rtnl_lock in these routines.
      
      I'm not sure if this locking scheme fixed the problem or just made it
      unlikely, although I lean toward the latter. Anyway, this was a long time
      ago when e1000 auxiliary works were implemented as timers scheduling
      real work handlers in their routines. The e1000_down() function only
      canceled the timers, but left the real handlers running if they were
      running, which could result in work execution after module unload.
      Today, the e1000 driver uses sane delayed works instead of the pair
      timer+work to implement its delayed asynchronous handlers, and the
      e1000_down() synchronously cancels all the works so that the problem
      that commit 338c15 tried to cope with disappeared, and we don't need any
      locks in the handlers any more. Moreover, any locking there can
      potentially result in a deadlock.
      
      So, this patch reverts commits 0ef4ee and 338c15.
      
      Fixes: 0ef4eedc ("e1000: convert to private mutex from rtnl")
      Fixes: 338c15e4 ("e1000: fix occasional panic on unload")
      Cc: Tushar Dave <tushar.n.dave@intel.com>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • e1000: prevent oops when adapter is being closed and reset simultaneously · 6a7d64e3
      yzhu1 committed
      This change is based on a similar change made to e1000e support in
      commit bb9e44d0 ("e1000e: prevent oops when adapter is being closed
      and reset simultaneously").  The same issue has also been observed
      on the older e1000 cards.
      
      Here, we have increased the RESET_COUNT value to 50: under stress
      tests the e1000 NIC is accessed so often that a RESET_COUNT of 25 is
      not enough. Experimentation has shown that a RESET_COUNT of 50 is
      sufficient.
      Signed-off-by: yzhu1 <yanjun.zhu@windriver.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
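The cancellation-order race described in commit 74a1b1ea above can be replayed with a deliberately simplified, single-threaded model. This is a sketch, not kernel code: the struct and function names are hypothetical, real workqueues are concurrent, and the "interleaving" here is just a fixed call order chosen to mirror the diagram in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct works {
	bool watchdog_enabled;	/* false once the watchdog has been cancelled */
	bool reset_pending;	/* true while reset_task is queued */
};

/* Watchdog body: in this model it always queues a reset if it still runs. */
static void watchdog_run(struct works *w)
{
	assert(w != NULL);
	if (w->watchdog_enabled)
		w->reset_pending = true;
}

static void cancel_reset(struct works *w)
{
	w->reset_pending = false;
}

static void cancel_watchdog(struct works *w)
{
	w->watchdog_enabled = false;
}

/* Old order: the watchdog fires between the two cancel calls. */
static bool old_order_leaves_reset(void)
{
	struct works w = { true, false };

	cancel_reset(&w);
	watchdog_run(&w);	/* runs on another CPU and re-queues the reset */
	cancel_watchdog(&w);
	return w.reset_pending;
}

/* New order: the watchdog is stopped first, so it cannot re-queue. */
static bool new_order_leaves_reset(void)
{
	struct works w = { true, false };

	cancel_watchdog(&w);
	watchdog_run(&w);	/* a late attempt has no effect */
	cancel_reset(&w);
	return w.reset_pending;
}
```

In this model the old order ends with a reset still queued after the adapter is down, while the new order does not, which is the property the reordering in the patch is meant to guarantee.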
  21. 01 Nov, 2013 1 commit
  22. 22 Sep, 2013 1 commit
  23. 20 Apr, 2013 3 commits
  24. 15 Mar, 2013 1 commit
  25. 16 Feb, 2013 1 commit