1. 24 9月, 2015 1 次提交
    • N
      netpoll: Close race condition between poll_one_napi and napi_disable · 2d8bff12
      Neil Horman 提交于
      Drivers might call napi_disable while not holding the napi instance poll_lock.
      In those instances, its possible for a race condition to exist between
      poll_one_napi and napi_disable.  That is to say, poll_one_napi only tests the
      NAPI_STATE_SCHED bit to see if there is work to do during a poll, and as such
      the following may happen:
      
      CPU0				CPU1
      ndo_tx_timeout			napi_poll_dev
       napi_disable			 poll_one_napi
        test_and_set_bit (ret 0)
      				  test_bit (ret 1)
         reset adapter		   napi_poll_routine
      
      If the adapter gets a tx timeout without a napi instance scheduled, its possible
      for the adapter to think it has exclusive access to the hardware  (as the napi
      instance is now scheduled via the napi_disable call), while the netpoll code
      thinks there is simply work to do.  The result is parallel hardware access
      leading to corrupt data structures in the driver, and a crash.
      
      Additionaly, there is another, more critical race between netpoll and
      napi_disable.  The disabled napi state is actually identical to the scheduled
      state for a given napi instance.  The implication being that, if a napi instance
      is disabled, a netconsole instance would see the napi state of the device as
      having been scheduled, and poll it, likely while the driver was dong something
      requiring exclusive access.  In the case above, its fairly clear that not having
      the rings in a state ready to be polled will cause any number of crashes.
      
      The fix should be pretty easy.  netpoll uses its own bit to indicate that that
      the napi instance is in a state of being serviced by netpoll (NAPI_STATE_NPSVC).
      We can just gate disabling on that bit as well as the sched bit.  That should
      prevent netpoll from conducting a napi poll if we convert its set bit to a
      test_and_set_bit operation to provide mutual exclusion
      
      Change notes:
      V2)
      	Remove a trailing whtiespace
      	Resubmit with proper subject prefix
      
      V3)
      	Clean up spacing nits
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: jmaxwell@redhat.com
      Tested-by: jmaxwell@redhat.com
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2d8bff12
  2. 29 8月, 2015 1 次提交
  3. 14 1月, 2015 1 次提交
  4. 22 11月, 2014 2 次提交
  5. 02 9月, 2014 2 次提交
  6. 30 8月, 2014 1 次提交
  7. 25 8月, 2014 1 次提交
  8. 09 7月, 2014 1 次提交
  9. 02 4月, 2014 1 次提交
  10. 30 3月, 2014 5 次提交
  11. 27 3月, 2014 1 次提交
  12. 25 3月, 2014 1 次提交
  13. 18 3月, 2014 10 次提交
  14. 07 2月, 2014 1 次提交
  15. 22 1月, 2014 1 次提交
  16. 11 1月, 2014 1 次提交
    • J
      net: core: explicitly select a txq before doing l2 forwarding · f663dd9a
      Jason Wang 提交于
      Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The
      will cause several issues:
      
      - NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan
        instead of lower device which misses the necessary txq synchronization for
        lower device such as txq stopping or frozen required by dev watchdog or
        control path.
      - dev_hard_start_xmit() was called with NULL txq which bypasses the net device
        watchdog.
      - dev_hard_start_xmit() does not check txq everywhere which will lead a crash
        when tso is disabled for lower device.
      
      Fix this by explicitly introducing a new param for .ndo_select_queue() for just
      selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also
      extended to accept this parameter and dev_queue_xmit_accel() was used to do l2
      forwarding transmission.
      
      With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need
      to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep
      a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of
      dev_queue_xmit() to do the transmission.
      
      In the future, it was also required for macvtap l2 forwarding support since it
      provides a necessary synchronization method.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: e1000-devel@lists.sourceforge.net
      Signed-off-by: NJason Wang <jasowang@redhat.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NJohn Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f663dd9a
  17. 03 1月, 2014 1 次提交
  18. 26 10月, 2013 1 次提交
  19. 20 9月, 2013 1 次提交
    • N
      netpoll: fix NULL pointer dereference in netpoll_cleanup · d0fe8c88
      Nikolay Aleksandrov 提交于
      I've been hitting a NULL ptr deref while using netconsole because the
      np->dev check and the pointer manipulation in netpoll_cleanup are done
      without rtnl and the following sequence happens when having a netconsole
      over a vlan and we remove the vlan while disabling the netconsole:
      	CPU 1					CPU2
      					removes vlan and calls the notifier
      enters store_enabled(), calls
      netdev_cleanup which checks np->dev
      and then waits for rtnl
      					executes the netconsole netdev
      					release notifier making np->dev
      					== NULL and releases rtnl
      continues to dereference a member of
      np->dev which at this point is == NULL
      Signed-off-by: NNikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d0fe8c88
  20. 13 9月, 2013 1 次提交
  21. 06 6月, 2013 1 次提交
  22. 05 6月, 2013 1 次提交
  23. 29 5月, 2013 1 次提交
  24. 28 5月, 2013 1 次提交
  25. 06 5月, 2013 1 次提交