1. 11 May, 2015 1 commit
  2. 24 Mar, 2015 2 commits
  3. 10 Mar, 2015 1 commit
    • net: delete stale packet_mclist entries · 82f17091
      Francesco Ruggeri committed
      When an interface is deleted from a net namespace the ifindex in the
      corresponding entries in PF_PACKET sockets' mclists becomes stale.
      This can create inconsistencies if later an interface with the same ifindex
      is moved from a different namespace (not that unlikely since ifindexes are
      per-namespace).
      In particular we saw problems with dev->promiscuity, resulting
      in "promiscuity touches roof, set promiscuity failed. promiscuity
      feature of device might be broken" warnings and EOVERFLOW failures of
      setsockopt(PACKET_ADD_MEMBERSHIP).
      This patch deletes the mclist entries for interfaces that are deleted.
      Since this now causes setsockopt(PACKET_DROP_MEMBERSHIP) to fail with
      EADDRNOTAVAIL if called after the interface is deleted, also make
      packet_mc_drop not fail.
      Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
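The cleanup described in this commit can be pictured with a small user-space model. This is only a sketch: `PacketSocketModel`, its methods, and the `(ifindex, mc_type)` tuples are hypothetical names invented for illustration, not kernel structures. The return value of `0` for a missing entry mirrors the commit's statement that `packet_mc_drop` should not fail.

```python
from dataclasses import dataclass, field


@dataclass
class PacketSocketModel:
    """Toy stand-in for a PF_PACKET socket (hypothetical, not the kernel struct)."""
    mclist: list = field(default_factory=list)  # (ifindex, mc_type) memberships

    def add_membership(self, ifindex, mc_type):
        self.mclist.append((ifindex, mc_type))

    def drop_membership(self, ifindex, mc_type):
        # Dropping an entry that is already gone is treated as success,
        # so callers that drop after interface deletion do not see an error.
        try:
            self.mclist.remove((ifindex, mc_type))
        except ValueError:
            pass
        return 0

    def on_device_unregister(self, ifindex):
        # Purge every entry that refers to the deleted interface so that a
        # later interface reusing the same ifindex starts with a clean slate.
        self.mclist = [m for m in self.mclist if m[0] != ifindex]
```

In this model, an interface that later arrives with the same ifindex never inherits memberships from its predecessor, which is the inconsistency the commit message describes.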
  4. 03 Mar, 2015 1 commit
  5. 02 Mar, 2015 3 commits
  6. 25 Feb, 2015 1 commit
    • af_packet: don't pass empty blocks for PACKET_V3 · 41a50d62
      Alexander Drozdov committed
      Before da413eec ("packet: Fixed TPACKET V3 to signal poll when block is
      closed rather than every packet") poll listening on an af_packet socket was
      not signaled if there were no packets to process. After that patch poll is
      signaled every time the block retire timer expires. That happens because
      af_packet closes the current block on timeout even if the block is empty.
      
      Passing empty blocks to the user not only wastes CPU but also wastes ring
      buffer space, increasing the probability of packet drops with small timeouts.
      Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
      Cc: Dan Collins <dan@dcollins.co.nz>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Guy Harris <guy@alum.mit.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
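The effect of not passing empty blocks can be illustrated with a toy ring model. This is a sketch under stated assumptions: `RingModel` and its fields are hypothetical, and the real timer handler also deals with frozen queues and timer re-arming, which are left out here.

```python
class RingModel:
    """Toy TPACKET_V3-style ring: one open block plus a count of free blocks."""

    def __init__(self, nblocks):
        self.free_blocks = nblocks
        self.current = []      # packets in the currently open block
        self.delivered = []    # blocks handed to user space

    def receive(self, pkt):
        self.current.append(pkt)

    def timer_expired(self):
        """Return True if poll() would be signaled by this timer tick."""
        if not self.current:
            # Empty block: keep it open, consume no ring space, no wakeup.
            return False
        # Non-empty block: retire it to user space and open a fresh one.
        self.delivered.append(self.current)
        self.current = []
        self.free_blocks -= 1
        return True
```

With a short retire timeout and sparse traffic, most ticks in this model return `False`, so neither CPU time nor ring space is spent on blocks that carry nothing.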
  7. 22 Feb, 2015 1 commit
  8. 14 Jan, 2015 1 commit
  9. 13 Jan, 2015 1 commit
  10. 12 Jan, 2015 1 commit
  11. 23 Dec, 2014 1 commit
    • packet: Fixed TPACKET V3 to signal poll when block is closed rather than every packet · da413eec
      Dan Collins committed
      Make TPACKET_V3 signal poll when block is closed rather than for every
      packet. Side effect is that poll will be signaled when block retire
      timer expires which didn't previously happen. Issue was visible when
      sending packets at a very low frequency such that all blocks are retired
      before packets are received by TPACKET_V3. This caused avoidable packet
      loss. The fix ensures that the signal is sent when blocks are closed
      which covers the normal path where the block is filled as well as the
      path where the timer expires. The case where a block is filled without
      moving to the next block (i.e. all blocks are full) will still cause poll
      to be signaled.
      Signed-off-by: Dan Collins <dan@dcollins.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  12. 10 Dec, 2014 1 commit
    • put iov_iter into msghdr · c0371da6
      Al Viro committed
      Note that the code _using_ ->msg_iter at that point will be very
      unhappy with anything other than unshifted iovec-backed iov_iter.
      We still need to convert users to proper primitives.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  13. 09 Dec, 2014 1 commit
  14. 25 Nov, 2014 1 commit
    • af_packet: fix sparse warning · 6e58040b
      Michael S. Tsirkin committed
      af_packet produces lots of these:
      	net/packet/af_packet.c:384:39: warning: incorrect type in return expression (different modifiers)
      	net/packet/af_packet.c:384:39:    expected struct page [pure] *
      	net/packet/af_packet.c:384:39:    got struct page *
      
      This seems to be because sparse does not realize that __pure
      refers to the function, not the returned pointer.
      
      Tweak code slightly to avoid the warning.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 24 Nov, 2014 3 commits
  16. 22 Nov, 2014 1 commit
  17. 06 Nov, 2014 1 commit
    • net: Add and use skb_copy_datagram_msg() helper. · 51f3d02b
      David S. Miller committed
      This encapsulates all of the skb_copy_datagram_iovec() callers
      with call argument signature "skb, offset, msghdr->msg_iov, length".
      
      When we move to iov_iters in networking, the iov_iter object will
      sit in the msghdr.
      
      Having a helper like this means there will be fewer places to touch
      during that transformation.
      
      Based upon descriptions and patch from Al Viro.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 02 Sep, 2014 2 commits
  19. 30 Aug, 2014 1 commit
  20. 25 Aug, 2014 1 commit
  21. 22 Aug, 2014 1 commit
  22. 30 Jul, 2014 1 commit
  23. 16 Jul, 2014 1 commit
  24. 12 Apr, 2014 1 commit
    • net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369
      David S. Miller committed
      Several spots in the kernel perform a sequence like:
      
      	skb_queue_tail(&sk->sk_receive_queue, skb);
      	sk->sk_data_ready(sk, skb->len);
      
      But the moment we place the SKB onto the socket receive queue it
      can be consumed and freed.  So this skb->len access potentially
      touches freed memory.
      
      Furthermore, the skb->len can be modified by the consumer so it is
      possible that the value isn't accurate.
      
      And finally, no implementation of this callback actually uses
      the length argument.  And since nobody actually cared about its
      value, lots of call sites pass arbitrary values such as '0' and
      even '1'.
      
      So just remove the length argument from the callback, that way there
      is no confusion whatsoever and all of these use-after-free cases get
      fixed as a side effect.
      
      Based upon a patch by Eric Dumazet and his suggestion to audit this
      issue tree-wide.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 04 Apr, 2014 2 commits
  26. 29 Mar, 2014 1 commit
    • packet: respect devices with LLTX flag in direct xmit · 43279500
      Daniel Borkmann committed
      Quite often it can be useful to test with dummy or similar
      devices as a blackhole sink for skbs. Such devices are only
      equipped with a single txq, but marked as NETIF_F_LLTX as
      they do not require locking their internal queues on xmit
      (or implement locking themselves). Therefore, rather use
      HARD_TX_{UN,}LOCK API, so that NETIF_F_LLTX will be respected.
      
      trafgen mmap/TX_RING example against dummy device with config
      foo: { fill(0xff, 64) } results in the following performance
      improvements for such scenarios on an ordinary Core i7/2.80GHz:
      
      Before:
      
       Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs):
      
         160,975,944,159 instructions:k            #    0.55  insns per cycle          ( +-  0.09% )
         293,319,390,278 cycles:k                  #    0.000 GHz                      ( +-  0.35% )
             192,501,104 branch-misses:k                                               ( +-  1.63% )
                     831 context-switches:k                                            ( +-  9.18% )
                       7 cpu-migrations:k                                              ( +-  7.40% )
                  69,382 cache-misses:k            #    0.010 % of all cache refs      ( +-  2.18% )
             671,552,021 cache-references:k                                            ( +-  1.29% )
      
            22.856401569 seconds time elapsed                                          ( +-  0.33% )
      
      After:
      
       Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs):
      
         133,788,739,692 instructions:k            #    0.92  insns per cycle          ( +-  0.06% )
         145,853,213,256 cycles:k                  #    0.000 GHz                      ( +-  0.17% )
              59,867,100 branch-misses:k                                               ( +-  4.72% )
                     384 context-switches:k                                            ( +-  3.76% )
                       6 cpu-migrations:k                                              ( +-  6.28% )
                  70,304 cache-misses:k            #    0.077 % of all cache refs      ( +-  1.73% )
              90,879,408 cache-references:k                                            ( +-  1.35% )
      
            11.719372413 seconds time elapsed                                          ( +-  0.24% )
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
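The idea behind using the HARD_TX_{UN,}LOCK API is that the queue lock is taken only when the device does not declare NETIF_F_LLTX. A minimal user-space sketch of that decision follows. `DeviceModel`, `hard_tx_lock`, and `direct_xmit` are hypothetical names for illustration, and the `NETIF_F_LLTX` value below is a placeholder bit, not the kernel's.

```python
import threading
from contextlib import contextmanager
from dataclasses import dataclass, field

NETIF_F_LLTX = 1 << 0  # placeholder bit for this sketch, not the kernel's value


@dataclass
class DeviceModel:
    """Toy device: a feature bitmask plus one TX lock (hypothetical)."""
    features: int = 0
    tx_lock: threading.Lock = field(default_factory=threading.Lock)


@contextmanager
def hard_tx_lock(dev):
    """Take the TX lock only when the device does not declare LLTX."""
    if dev.features & NETIF_F_LLTX:
        yield False              # lockless device: skip the queue lock
    else:
        with dev.tx_lock:
            yield True           # ordinary device: hold the lock during xmit


def direct_xmit(dev, frame, sink):
    """Append the frame to the sink; report whether the lock was taken."""
    with hard_tx_lock(dev) as locked:
        sink.append(frame)
        return locked
```

For a dummy-like device that sets the flag, the model transmits without touching the lock at all, which is where the commit's reduction in cycles and cache references would come from.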
  27. 27 Mar, 2014 1 commit
  28. 01 Mar, 2014 1 commit
    • packet: allow to transmit +4 byte in TX_RING slot for VLAN case · 52f1454f
      Daniel Borkmann committed
      Commit 57f89bfa ("network: Allow af_packet to transmit +4 bytes
      for VLAN packets.") added the possibility for non-mmapped frames to
      send an extra 4 bytes for the VLAN header, so the MTU increases from
      1500 to 1504 bytes, for example.
      
      Commit cbd89acb ("af_packet: fix for sending VLAN frames via
      packet_mmap") attempted to fix that for the mmap part but was
      reverted as it caused regressions while using eth_type_trans()
      on output path.
      
      Let's just act analogously to 57f89bfa and add similar logic
      to TX_RING. We presume size_max as overcharged with +4 bytes and
      later on, after the skb has been built by tpacket_fill_skb(), check
      for an ETH_P_8021Q header on packets larger than the normal MTU. This
      can be easily reproduced with a slightly modified trafgen in mmap(2)
      mode, test cases:
      
       { fill(0xff, 12) const16(0x8100) fill(0xff, <1504|1505>) }
       { fill(0xff, 12) const16(0x0806) fill(0xff, <1500|1501>) }
      
      Note that we need to do the test right after tpacket_fill_skb()
      as sockets can have PACKET_LOSS set where we would not fail but
      instead just continue to traverse the ring.
      Reported-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Ben Greear <greearb@candelatech.com>
      Cc: Phil Sutter <phil@nwl.cc>
      Tested-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
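The length check described in this commit can be sketched as a standalone function. This is an illustrative model only: `frame_fits` and `build_frame` are hypothetical helpers, the 1500-byte MTU and 14-byte Ethernet header are assumed defaults, and the real code operates on an skb after tpacket_fill_skb() rather than on a byte string.

```python
import struct

ETH_HLEN = 14          # destination MAC + source MAC + EtherType
VLAN_HLEN = 4          # extra bytes allowed for an 802.1Q tag
ETH_P_8021Q = 0x8100


def frame_fits(frame: bytes, mtu: int = 1500) -> bool:
    """Accept frames up to mtu + header, or +4 more if they carry a VLAN tag."""
    if len(frame) < ETH_HLEN:
        return False                     # too short to hold an Ethernet header
    limit = mtu + ETH_HLEN
    if len(frame) <= limit:
        return True                      # fits without any allowance
    if len(frame) > limit + VLAN_HLEN:
        return False                     # too large even with the allowance
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return ethertype == ETH_P_8021Q      # oversize is fine only for VLAN frames


def build_frame(ethertype: int, payload_len: int) -> bytes:
    """Mirror the trafgen test cases: 12 filler bytes, EtherType, payload."""
    return b"\xff" * 12 + struct.pack("!H", ethertype) + b"\xff" * payload_len
```

The two trafgen test cases from the commit message map onto this model directly: a 0x8100 frame with a 1504-byte payload is accepted and 1505 is rejected, while a 0x0806 frame is accepted at 1500 and rejected at 1501.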
  29. 19 Feb, 2014 1 commit
  30. 17 Feb, 2014 1 commit
    • packet: check for ndo_select_queue during queue selection · 0fd5d57b
      Daniel Borkmann committed
      Mathias reported that on an AMD Geode LX embedded board (ALiX)
      with ath9k driver PACKET_QDISC_BYPASS, introduced in commit
      d346a3fa ("packet: introduce PACKET_QDISC_BYPASS socket
      option"), triggers a WARN_ON() coming from the driver itself
      via 066dae93 ("ath9k: rework tx queue selection and fix
      queue stopping/waking").
      
      The reason why this happened is that ndo_select_queue() call
      is not invoked from direct xmit path i.e. for ieee80211 subsystem
      that sets queue and TID (similar to 802.1d tag) which is being
      put into the frame through 802.11e (WMM, QoS). If that is not
      set, pending frame counter for e.g. ath9k can get messed up.
      
      So the WARN_ON() in ath9k is absolutely legitimate. Generally,
      the hw queue selection in ieee80211 depends on the type of
      traffic, and priorities are set according to ieee80211_ac_numbers
      mapping; working in a similar way as DiffServ only on a lower
      layer, so that the AP can favour frames that have "real-time"
      requirements like voice or video data frames.
      
      Therefore, check for the presence of ndo_select_queue() in netdev
      ops and, if available, invoke it with a fallback handler to
      __packet_pick_tx_queue(), so that drivers such as bnx2x, ixgbe,
      or mlx4 can still select a hw queue for transmission in
      relation to the current CPU while e.g. the ieee80211 subsystem
      can make its own choices.
      Reported-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
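The selection logic in this commit, a driver hook with a CPU-based fallback, can be modeled in a few lines. This is a sketch under assumptions: `NetDeviceModel`, `pick_by_cpu`, and the dictionary used as an skb are hypothetical, and clamping an out-of-range driver answer to queue 0 is an assumption of this model rather than something the commit message states.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class NetDeviceModel:
    """Toy device: a TX queue count and an optional driver hook (hypothetical)."""
    real_num_tx_queues: int
    select_queue: Optional[Callable] = None  # stands in for ndo_select_queue


def pick_by_cpu(dev, skb, cpu):
    """Fallback: spread traffic over the TX queues by the current CPU."""
    return cpu % dev.real_num_tx_queues


def packet_pick_tx_queue(dev, skb, cpu=0):
    def fallback(d, s):
        return pick_by_cpu(d, s, cpu)

    if dev.select_queue is not None:
        # The driver decides, and may call the fallback itself.
        queue = dev.select_queue(dev, skb, fallback)
        if queue >= dev.real_num_tx_queues:
            queue = 0  # assumption of this sketch: clamp bad answers
    else:
        queue = fallback(dev, skb)
    return queue
```

A wireless-style device in this model can map traffic classes to queues on its own, while a device whose hook simply defers to the fallback still gets per-CPU queue selection.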
  31. 23 Jan, 2014 1 commit
  32. 22 Jan, 2014 2 commits