1. 20 4月, 2009 1 次提交
    • J
      net: sch_netem: Fix an inconsistency in ingress netem timestamps. · 8caf1539
      Jarek Poplawski 提交于
      Alex Sidorenko reported:
      
      "while experimenting with 'netem' we have found some strange behaviour. It
      seemed that ingress delay as measured by 'ping' command shows up on some
      hosts but not on others.
      
      After some investigation I have found that the problem is that skbuff->tstamp
      field value depends on whether there are any packet sniffers enabled. That
      is:
      
      - if any ptype_all handler is registered, the tstamp field is as expected
      - if there are no ptype_all handlers, the tstamp field does not show the delay"
      
      This patch prevents unnecessary update of tstamp in dev_queue_xmit_nit()
      on ingress path (with act_mirred) adding a check, so minimal overhead on
      the fast path, but only when sniffers etc. are active.
      
      Since netem at ingress seems to logically emulate a network before a host,
      tstamp is zeroed to trigger the update and pretend delays are from the
      outside.
      Reported-by: NAlex Sidorenko <alexandre.sidorenko@hp.com>
      Tested-by: NAlex Sidorenko <alexandre.sidorenko@hp.com>
      Signed-off-by: NJarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8caf1539
  2. 15 4月, 2009 1 次提交
  3. 11 4月, 2009 1 次提交
  4. 02 4月, 2009 1 次提交
  5. 27 3月, 2009 1 次提交
    • H
      GRO: Disable GRO on legacy netif_rx path · 8f1ead2d
      Herbert Xu 提交于
      When I fixed the GRO crash in the legacy receive path I used
      napi_complete to replace __napi_complete.  Unfortunately they're
      not the same when NETPOLL is enabled, which may result in us
      not calling __napi_complete at all.
      
      What's more, we really do need to keep the __napi_complete call
      within the IRQ-off section since in theory an IRQ can occur in
      between and fill up the backlog to the maximum, causing us to
      lock up.
      
      Since we can't seem to find a fix that works properly right now,
      this patch reverts all the GRO support from the netif_rx path.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8f1ead2d
  6. 22 3月, 2009 2 次提交
  7. 19 3月, 2009 1 次提交
  8. 18 3月, 2009 1 次提交
  9. 17 3月, 2009 1 次提交
    • H
      GRO: Move netpoll checks to correct location · d1c76af9
      Herbert Xu 提交于
      As my netpoll fix for net doesn't really work for net-next, we
      need this update to move the checks into the right place.  As it
      stands we may pass freed skbs to netpoll_receive_skb.
      
      This patch also introduces a netpoll_rx_on function to avoid GRO
      completely if we're invoked through netpoll.  This might seem
      paranoid but as netpoll may have an external receive hook it's
      better to be safe than sorry.  I don't think we need this for
      2.6.29 though since there's nothing immediately broken by it.
      
      This patch also moves the GRO_* return values to netdevice.h since
      VLAN needs them too (I tried to avoid this originally but alas
      this seems to be the easiest way out).  This fixes a bug in VLAN
      where it continued to use the old return value 2 instead of the
      correct GRO_DROP.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d1c76af9
  10. 14 3月, 2009 1 次提交
  11. 05 3月, 2009 2 次提交
  12. 03 3月, 2009 1 次提交
    • E
      netns: Remove net_alive · 17edde52
      Eric W. Biederman 提交于
      It turns out that net_alive is unnecessary, and the original problem
      that led to it being added was simply that the icmp code thought
      it was a network device and wound up being unable to handle packets
      while there were still packets in the network namespace.
      
      Now that icmp and tcp have been fixed to properly register themselves
      this problem is no longer present and we have a stronger guarantee
      that packets will not arrive in a network namespace then that provided
      by net_alive in netif_receive_skb.  So remove net_alive allowing
      packet reception run a little faster.
      
      Additionally document the strong reason why network namespace cleanup
      is safe so that if something happens again someone else will have
      a chance of figuring it out.
      Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      17edde52
  13. 01 3月, 2009 1 次提交
    • H
      netpoll: Add drop checks to all entry points · 4ead4431
      Herbert Xu 提交于
      The netpoll entry checks are required to ensure that we don't
      receive normal packets when invoked via netpoll.  Unfortunately
      it only ever worked for the netif_receive_skb/netif_rx entry
      points.  The VLAN (and subsequently GRO) entry point didn't
      have the check and therefore can trigger all sorts of weird
      problems.
      
      This patch adds the netpoll check to all entry points.
      
      I'm still uneasy with receiving at all under netpoll (which
      apparently is only used by the out-of-tree kdump code).  The
      reason is it is perfectly legal to receive all data including
      headers into highmem if netpoll is off, but if you try to do
      that with netpoll on and someone gets a printk in an IRQ handler                                             
      you're going to get a nice BUG_ON.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4ead4431
  14. 23 2月, 2009 1 次提交
    • E
      netns: Remove net_alive · ce16c533
      Eric W. Biederman 提交于
      It turns out that net_alive is unnecessary, and the original problem
      that led to it being added was simply that the icmp code thought
      it was a network device and wound up being unable to handle packets
      while there were still packets in the network namespace.
      
      Now that icmp and tcp have been fixed to properly register themselves
      this problem is no longer present and we have a stronger guarantee
      that packets will not arrive in a network namespace then that provided
      by net_alive in netif_receive_skb.  So remove net_alive allowing
      packet reception run a little faster.
      
      Additionally document the strong reason why network namespace cleanup
      is safe so that if something happens again someone else will have
      a chance of figuring it out.
      Signed-off-by: NEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ce16c533
  15. 21 2月, 2009 1 次提交
  16. 19 2月, 2009 1 次提交
  17. 16 2月, 2009 2 次提交
    • P
      d24fff22
    • P
      net: infrastructure for hardware time stamping · ac45f602
      Patrick Ohly 提交于
      The additional per-packet information (16 bytes for time stamps, 1
      byte for flags) is stored for all packets in the skb_shared_info
      struct. This implementation detail is hidden from users of that
      information via skb_* accessor functions. A separate struct resp.
      union is used for the additional information so that it can be
      stored/copied easily outside of skb_shared_info.
      
      Compared to previous implementations (reusing the tstamp field
      depending on the context, optional additional structures) this
      is the simplest solution. It does not extend sk_buff itself.
      
      TX time stamping is implemented in software if the device driver
      doesn't support hardware time stamping.
      
      The new semantic for hardware/software time stamping around
      ndo_start_xmit() is based on two assumptions about existing
      network device drivers which don't support hardware time
      stamping and know nothing about it:
       - they leave the new skb_shared_tx unmodified
       - the keep the connection to the originating socket in skb->sk
         alive, i.e., don't call skb_orphan()
      
      Given that skb_shared_tx is new, the first assumption is safe.
      The second is only true for some drivers. As a result, software
      TX time stamping currently works with the bnx2 driver, but not
      with the unmodified igb driver (the two drivers this patch series
      was tested with).
      Signed-off-by: NPatrick Ohly <patrick.ohly@intel.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac45f602
  18. 09 2月, 2009 2 次提交
  19. 07 2月, 2009 1 次提交
  20. 06 2月, 2009 1 次提交
    • H
      gro: Fix frag_list merging on imprecisely split packets · 56035022
      Herbert Xu 提交于
      The previous fix ad0f9904 (gro:
      Fix handling of imprecisely split packets) only fixed the case
      of frags merging, frag_list merging in the same circumstances
      were still broken.
      
      In particular, the packet headers end up in the data stream.
      
      This patch fixes this plus another issue where an imprecisely
      split packet header may be read incorrectly (this is mostly
      harmless since it'll simply cause the packet to not match and
      be rejected for GRO).
      
      Thanks to Emil Tantilov and Jeff Kirsher for helping to track
      this down.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56035022
  21. 05 2月, 2009 1 次提交
    • H
      net: Partially allow skb destructors to be used on receive path · 9a279bcb
      Herbert Xu 提交于
      As it currently stands, skb destructors are forbidden on the
      receive path because the protocol end-points will overwrite
      any existing destructor with their own.
      
      This is the reason why we have to call skb_orphan in the loopback
      driver before we reinject the packet back into the stack, thus
      creating a period during which loopback traffic isn't charged
      to any socket.
      
      With virtualisation, we have a similar problem in that traffic
      is reinjected into the stack without being associated with any
      socket entity, thus providing no natural congestion push-back
      for those poor folks still stuck with UDP.
      
      Now had we been consistent in telling them that UDP simply has
      no congestion feedback, I could just fob them off.  Unfortunately,
      we appear to have gone to some length in catering for this on
      the standard UDP path, with skb/socket accounting so that has
      created a very unhealthy dependency.
      
      Alas habits are difficult to break out of, so we may just have
      to allow skb destructors on the receive path.
      
      It turns out that making skb destructors useable on the receive path
      isn't as easy as it seems.  For instance, simply adding skb_orphan
      to skb_set_owner_r isn't enough.  This is because we assume all
      over the IP stack that skb->sk is an IP socket if present.
      
      The new transparent proxy code goes one step further and assumes
      that skb->sk is the receiving socket if present.
      
      Now all of this can be dealt with by adding simple checks such
      as only treating skb->sk as an IP socket if skb->sk->sk_family
      matches.  However, it turns out that for bridging at least we
      don't need to do all of this work.
      
      This is of interest because most virtualisation setups use bridging
      so we don't actually go through the IP stack on the host (with
      the exception of our old nemesis the bridge netfilter, but that's
      easily taken care of).
      
      So this patch simply adds skb_orphan to the point just before we
      enter the IP stack, but after we've gone through the bridge on the
      receive path.  It also adds an skb_orphan to the one place in
      netfilter that touches skb->sk/skb->destructor, that is, tproxy.
      
      One word of caution, because of the internal code structure, anyone
      wishing to deploy this must use skb_set_owner_w as opposed to
      skb_set_owner_r since many functions that create a new skb from
      an existing one will invoke skb_set_owner_w on the new skb.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9a279bcb
  22. 01 2月, 2009 1 次提交
    • H
      gro: Fix handling of imprecisely split packets · ad0f9904
      Herbert Xu 提交于
      The commit 89a1b249edcf9be884e71f92df84d48355c576aa (gro: Avoid
      copying headers of unmerged packets) only worked for packets
      which are either completely linear, completely non-linear, or
      packets which exactly split at the boundary between headers and
      payload.
      
      Anything else would cause bits in the header to go missing if
      the packet is held by GRO.
      
      This may have broken drivers such as ixgbe.
      
      This patch fixes the places that assumed or only worked with
      the above cases.
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad0f9904
  23. 30 1月, 2009 3 次提交
  24. 28 1月, 2009 3 次提交
  25. 21 1月, 2009 1 次提交
  26. 20 1月, 2009 1 次提交
  27. 15 1月, 2009 3 次提交
  28. 11 1月, 2009 1 次提交
  29. 07 1月, 2009 2 次提交