1. 21 Nov, 2009 1 commit
  2. 17 Nov, 2009 1 commit
  3. 12 Nov, 2009 1 commit
    • skbuff: Do not allow skb recycling with disabled IRQs · e84af6dd
      Committed by Anton Vorontsov
      NAPI drivers try to recycle SKBs in their polling routine, but we
      generally don't know the context in which the polling will be called,
      and the skb recycling itself may require IRQs to be enabled.
      
      This patch adds irqs_disabled() test to the skb_recycle_check()
      routine, so that we'll not let the drivers hit the skb recycling
      path with IRQs disabled.
      
      As a side effect, this patch actually disables skb recycling for some
      [broken] drivers. E.g. gianfar driver grabs an irqsave spinlock during
      TX ring processing, and then tries to recycle an skb, and that caused
      the following badness:
      
      nf_conntrack version 0.5.0 (1008 buckets, 4032 max)
      ------------[ cut here ]------------
      Badness at kernel/softirq.c:143
      NIP: c003e3c4 LR: c423a528 CTR: c003e344
      ...
      NIP [c003e3c4] local_bh_enable+0x80/0xc4
      LR [c423a528] destroy_conntrack+0xd4/0x13c [nf_conntrack]
      Call Trace:
      [c15d1b60] [c003e32c] local_bh_disable+0x1c/0x34 (unreliable)
      [c15d1b70] [c423a528] destroy_conntrack+0xd4/0x13c [nf_conntrack]
      [c15d1b80] [c02c6370] nf_conntrack_destroy+0x3c/0x70
      Signed-off-by: David S. Miller <davem@davemloft.net>
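The guard described in this commit can be sketched in user space as follows. This is a reduced model, not the kernel's skb_recycle_check(): `struct buf_model`, `recycle_check_model()` and the `irqs_disabled_model` flag are hypothetical stand-ins, and the real routine tests more conditions than the two shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CPU's interrupt state. */
static bool irqs_disabled_model;

/* Hypothetical, heavily reduced buffer descriptor. */
struct buf_model {
	size_t capacity; /* bytes available in the buffer */
	bool shared;     /* true if another user still holds a reference */
};

/*
 * Returns true only if the buffer may be recycled for a packet of
 * `needed` bytes. The IRQ test comes first, so a caller that holds an
 * irqsave lock never reaches the recycling path at all.
 */
static bool recycle_check_model(const struct buf_model *b, size_t needed)
{
	assert(b != NULL);

	if (irqs_disabled_model)
		return false;
	if (b->shared)
		return false;
	return b->capacity >= needed;
}
```

With this shape, a driver that polls with interrupts off does not crash; it simply falls back to freeing and reallocating its buffers.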
  4. 25 Jul, 2009 1 commit
  5. 18 Jun, 2009 2 commits
  6. 15 Jun, 2009 1 commit
  7. 11 Jun, 2009 1 commit
    • mac80211: do not pass PS frames out of mac80211 again · 8f77f384
      Committed by Johannes Berg
      In order to handle powersave frames properly we needed
      to pass these out to the device queues again, and introduced
      the skb->requeue bit. This, however, adds unnecessary
      overhead because already tried frames must be 'cleaned up', and
      this clean-up code is also buggy when software encryption
      is used.
      
      Instead of sending the frames via the master netdev queue
      again, simply put them into the pending queue. This also
      fixes a problem where frames for that particular station
      could be reordered when some were still on the software
      queues and older ones are re-injected into the software
      queue after them.
      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
  8. 09 Jun, 2009 1 commit
  9. 08 Jun, 2009 1 commit
    • net: Ensure partial checksum offset is inside the skb head · 5ff8dda3
      Committed by Herbert Xu
      On Thu, Jun 04, 2009 at 09:06:00PM +1000, Herbert Xu wrote:
      >
      > tun: Optimise handling of bogus gso->hdr_len
      >
      > As all current versions of virtio_net generate a value for the
      > header length that's too small, we should optimise this so that
      > we don't copy it twice.  This can be done by ensuring that it is
      > at least as large as the place where we'll write the checksum.
      >
      > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      
      With this applied we can strengthen the partial checksum check:
      
      In skb_partial_csum_set we check to see if the checksum offset
      is within the packet.  However, we really should check that it
      is within the skb head as that's the only bit we can modify
      without copying.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
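The strengthened check can be illustrated with a small user-space model. It is a sketch under stated assumptions rather than the kernel's skb_partial_csum_set(): `struct pkt_model` and `csum_offset_ok_model()` are hypothetical names, and the checksum field is assumed to be 16 bits wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical reduced packet descriptor: `len` is the total packet
 * length, `head_len` the part stored in the directly writable head.
 */
struct pkt_model {
	size_t len;
	size_t head_len;
};

/*
 * Accepts a checksum location only if the 16-bit field at start + off
 * lies entirely inside the head, not merely inside the packet.
 * The comparisons are arranged so that no subtraction can underflow.
 */
static bool csum_offset_ok_model(const struct pkt_model *p,
				 size_t start, size_t off)
{
	size_t room;

	assert(p != NULL && p->head_len <= p->len);

	if (start > p->head_len)
		return false;
	room = p->head_len - start;
	if (off > room)
		return false;
	return room - off >= sizeof(uint16_t);
}
```

For a 1500-byte packet with a 64-byte head, an offset pair such as start 100, off 16 lies inside the packet but outside the head, so this model rejects it.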
  10. 03 Jun, 2009 1 commit
  11. 27 May, 2009 4 commits
  12. 25 May, 2009 2 commits
  13. 19 May, 2009 1 commit
  14. 07 May, 2009 1 commit
  15. 30 Apr, 2009 1 commit
  16. 15 Apr, 2009 1 commit
    • tracing/events: move trace point headers into include/trace/events · ad8d75ff
      Committed by Steven Rostedt
      Impact: clean up
      
      Create a sub directory in include/trace called events to keep the
      trace point headers in their own separate directory. Only headers that
      declare trace points should be defined in this directory.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
      Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  17. 29 Mar, 2009 1 commit
  18. 14 Mar, 2009 1 commit
  19. 27 Feb, 2009 1 commit
  20. 18 Feb, 2009 1 commit
    • net: Kill skb_truesize_check(), it only catches false-positives. · 92a0acce
      Committed by David S. Miller
      A long time ago we had bugs, primarily in TCP, where we would modify
      skb->truesize (for TSO queue collapsing) in ways which would corrupt
      the socket memory accounting.
      
      skb_truesize_check() was added in order to try and catch this error
      more systematically.
      
      However this debugging check has morphed into a Frankenstein of sorts
      and these days it does nothing other than catch false-positives.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 16 Feb, 2009 1 commit
    • net: infrastructure for hardware time stamping · ac45f602
      Committed by Patrick Ohly
      The additional per-packet information (16 bytes for time stamps, 1
      byte for flags) is stored for all packets in the skb_shared_info
      struct. This implementation detail is hidden from users of that
      information via skb_* accessor functions. A separate struct resp.
      union is used for the additional information so that it can be
      stored/copied easily outside of skb_shared_info.
      
      Compared to previous implementations (reusing the tstamp field
      depending on the context, optional additional structures) this
      is the simplest solution. It does not extend sk_buff itself.
      
      TX time stamping is implemented in software if the device driver
      doesn't support hardware time stamping.
      
      The new semantic for hardware/software time stamping around
      ndo_start_xmit() is based on two assumptions about existing
      network device drivers which don't support hardware time
      stamping and know nothing about it:
       - they leave the new skb_shared_tx unmodified
       - they keep the connection to the originating socket in skb->sk
         alive, i.e., don't call skb_orphan()
      
      Given that skb_shared_tx is new, the first assumption is safe.
      The second is only true for some drivers. As a result, software
      TX time stamping currently works with the bnx2 driver, but not
      with the unmodified igb driver (the two drivers this patch series
      was tested with).
      Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 13 Feb, 2009 1 commit
  23. 10 Feb, 2009 1 commit
  24. 06 Feb, 2009 1 commit
    • gro: Fix frag_list merging on imprecisely split packets · 56035022
      Committed by Herbert Xu
      The previous fix ad0f9904 (gro:
      Fix handling of imprecisely split packets) only fixed the case
      of frags merging; frag_list merging in the same circumstances
      was still broken.
      
      In particular, the packet headers end up in the data stream.
      
      This patch fixes this plus another issue where an imprecisely
      split packet header may be read incorrectly (this is mostly
      harmless since it'll simply cause the packet to not match and
      be rejected for GRO).
      
      Thanks to Emil Tantilov and Jeff Kirsher for helping to track
      this down.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  25. 01 Feb, 2009 1 commit
  26. 30 Jan, 2009 4 commits
    • gro: Do not merge paged packets into frag_list · 81705ad1
      Committed by Herbert Xu
      Bigger is not always better :)
      
      It was easy to continue to merge packets into frag_list after the
      page array is full.  However, this turns out to be worse than LRO
      because frag_list is a much less efficient form of storage than the
      page array.  So we're better off stopping the merge and starting
      a new entry with an empty page array.
      
      In future we can optimise this further by doing frag_list merging
      but making sure that we continue to fill in the page array.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gro: Avoid copying headers of unmerged packets · 86911732
      Committed by Herbert Xu
      Unfortunately simplicity isn't always the best.  The fraginfo
      interface turned out to be suboptimal.  The problem was quite
      obvious.  For every packet, we have to copy the headers from
      the frags structure into skb->head, even though for 99% of the
      packets this part is immediately thrown away after the merge.
      
      LRO didn't have this problem because it directly read the headers
      from the frags structure.
      
      This patch attempts to address this by creating an interface
      that allows GRO to access the headers in the first frag without
      having to copy it.  Because all drivers that use frags place the
      headers in the first frag this optimisation should be enough.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Fix OOPS in skb_seq_read(). · 71b3346d
      Committed by Shyam Iyer
      It oopsed for me in skb_seq_read(). addr2line said it was
      linux-2.6/net/core/skbuff.c:2228, which is this line:
      
      
      	while (st->frag_idx < skb_shinfo(st->cur_skb)->nr_frags) {
      
      
      I added some printks in there and it looks like we hit this:
      
              } else if (st->root_skb == st->cur_skb &&
                         skb_shinfo(st->root_skb)->frag_list) {
                       st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
                       st->frag_idx = 0;
                       goto next_skb;
              }
      
      
      
      Actually I did some testing and added a few printks and found that the
      st->cur_skb->data was 0 and hence the ptr used by iscsi_tcp was null.
      This caused the kernel panic.
      
       	if (abs_offset < block_limit) {
      -		*data = st->cur_skb->data + abs_offset;
      +		*data = st->cur_skb->data + (abs_offset - st->stepped_offset);
      
      I enabled the debug_tcp and with a few printks found that the code did
      not go to the next_skb label and could find that the sequence being
      followed was this -
      
      It hit this if condition -
      
              if (st->cur_skb->next) {
                      st->cur_skb = st->cur_skb->next;
                      st->frag_idx = 0;
                      goto next_skb;
      
      And so, now the st pointer is shifted to the next skb whereas actually
      it should have hit the second else if first since the data is in the
      frag_list.
      
              else if (st->root_skb == st->cur_skb &&
                       skb_shinfo(st->root_skb)->frag_list) {
                      st->cur_skb = skb_shinfo(st->root_skb)->frag_list;
                      goto next_skb;
              }
      
      By reversing the two conditions, the attached patch fixes the issue
      for me on top of Herbert's patches.
      Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Fix frag_list handling in skb_seq_read · 95e3b24c
      Committed by Herbert Xu
      The frag_list handling was broken in skb_seq_read:
      
      1) We didn't add the stepped offset when looking at the head
      area of fragments other than the first.
      
      2) We didn't take the stepped offset away when setting the data
      pointer in the head area.
      
      3) The frag index wasn't reset.
      
      This patch fixes all three issues.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
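The role of the stepped offset in points 1 and 2 can be shown with a reduced sequential reader. This is an illustrative model only: it walks a plain chain of linear segments, has no page fragments (so point 3 is not modelled), and every name ending in `_model` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical segment: one linear data area in a chain. */
struct seg_model {
	const unsigned char *data;
	size_t len;
	struct seg_model *next;
};

/* Reader state: the current segment and the bytes stepped over so far. */
struct seq_state_model {
	struct seg_model *cur;
	size_t stepped_offset;
};

static void seq_prepare_model(struct seq_state_model *st,
			      struct seg_model *first)
{
	assert(st != NULL);
	st->cur = first;
	st->stepped_offset = 0;
}

/*
 * `consumed` is an absolute offset into the whole chain and must not
 * move backwards. Returns the number of bytes readable at *data, or 0
 * at the end of the chain.
 */
static size_t seq_read_model(struct seq_state_model *st, size_t consumed,
			     const unsigned char **data)
{
	assert(st != NULL && data != NULL);
	assert(consumed >= st->stepped_offset);

	while (st->cur != NULL) {
		/* Point 1: the limit includes what was already stepped over. */
		size_t block_limit = st->stepped_offset + st->cur->len;

		if (consumed < block_limit) {
			/* Point 2: the pointer is relative to this segment. */
			*data = st->cur->data + (consumed - st->stepped_offset);
			return block_limit - consumed;
		}
		st->stepped_offset = block_limit;
		st->cur = st->cur->next;
	}
	*data = NULL;
	return 0;
}
```

Dropping either adjustment makes the reader index the second segment with an absolute offset, which is the out-of-range access the commit describes.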
  27. 21 Jan, 2009 1 commit
  28. 20 Jan, 2009 1 commit
    • net: Fix data corruption when splicing from sockets. · 8b9d3728
      Committed by Jarek Poplawski
      The trick in socket splicing where we try to convert the skb->data
      into a page based reference using virt_to_page() does not work so
      well.
      
      The idea is to pass the virt_to_page() reference via the pipe
      buffer, and refcount the buffer using a SKB reference.
      
      But if we are splicing from a socket to a socket (via sendpage)
      this doesn't work.
      
      The from side processing will grab the page (and SKB) references.
      The sendpage() calls will grab page references only, return, and
      then the from side processing completes and drops the SKB ref.
      
      The page based reference to skb->data is not enough to keep the
      kmalloc() buffer backing it from being reused.  Yet, that is
      all that the socket send side has at this point.
      
      This leads to data corruption if the skb->data buffer is reused
      by SLAB before the send side socket actually gets the TX packet
      out to the device.
      
      The fix employed here is to simply allocate a page and copy the
      skb->data bytes into that page.
      
      This will hurt performance, but there is no clear way to fix this
      properly without a copy at the present time, and it is important
      to get rid of the data corruption.
      
      With fixes from Herbert Xu.
      Tested-by: Willy Tarreau <w@1wt.eu>
      Foreseen-by: Changli Gao <xiaosuo@gmail.com>
      Diagnosed-by: Willy Tarreau <w@1wt.eu>
      Reported-by: Willy Tarreau <w@1wt.eu>
      Fixed-by: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
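The copy-based fix can be pictured with a small user-space analogue. It is a sketch, not the kernel code: `linear_to_page_model()` and `PAGE_SIZE_MODEL` are hypothetical, and malloc() stands in for a page allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical page size for the model. */
#define PAGE_SIZE_MODEL 4096u

/*
 * Copies `len` bytes out of a linear buffer into a freshly allocated
 * page-sized block that the consumer owns outright. Returns NULL if
 * the data does not fit or the allocation fails. The caller frees it.
 */
static unsigned char *linear_to_page_model(const unsigned char *src,
					   size_t len)
{
	unsigned char *page;

	assert(src != NULL);

	if (len > PAGE_SIZE_MODEL)
		return NULL;
	page = malloc(PAGE_SIZE_MODEL);
	if (page == NULL)
		return NULL;
	memcpy(page, src, len);
	return page;
}
```

Once the bytes live in their own block, the source buffer can be reused or overwritten without affecting what the sending side later transmits. The price is the extra copy that the commit message acknowledges.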
  29. 15 Jan, 2009 1 commit
  30. 05 Jan, 2009 2 commits
    • gro: Add page frag support · 5d38a079
      Committed by Herbert Xu
      This patch allows GRO to merge page frags (skb_shinfo(skb)->frags)
      in one skb, rather than using the less efficient frag_list.
      
      It also adds a new interface, napi_gro_frags to allow drivers
      to inject page frags directly into the stack without allocating
      an skb.  This is intended to be the GRO equivalent for LRO's
      lro_receive_frags interface.
      
      The existing GSO interface can already handle page frags with
      or without an appended frag_list so nothing needs to be changed
      there.
      
      The merging itself is rather simple.  We store any new frag entries
      after the last existing entry, without checking whether the first
      new entry can be merged with the last existing entry.  Making this
      check would actually be easy but since no existing driver can
      produce contiguous frags anyway it would just be mental masturbation.
      
      If the total number of entries would exceed the capacity of a
      single skb, we simply resort to using frag_list as we do now.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
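The merging rule described in this commit (append new frag entries after the last existing one, with no attempt to coalesce neighbours, and give up when the array would overflow) can be sketched as follows. The names and the tiny array capacity are assumptions made for illustration; this is not the kernel's skb_gro_receive().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, deliberately small capacity for the model. */
#define MAX_FRAGS_MODEL 4

struct frag_model {
	const void *page; /* backing page, opaque here */
	size_t offset;
	size_t size;
};

struct pkt_frags_model {
	struct frag_model frags[MAX_FRAGS_MODEL];
	size_t nr_frags;
	size_t data_len; /* total bytes held in frags */
};

/*
 * Appends every frag of `src` after the last frag of `dst`. Returns 0
 * on success, or -1 without touching `dst` if the entries would not
 * fit, leaving the fallback choice to the caller.
 */
static int merge_frags_model(struct pkt_frags_model *dst,
			     const struct pkt_frags_model *src)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	assert(dst->nr_frags <= MAX_FRAGS_MODEL);

	if (src->nr_frags > MAX_FRAGS_MODEL - dst->nr_frags)
		return -1;
	for (i = 0; i < src->nr_frags; i++)
		dst->frags[dst->nr_frags++] = src->frags[i];
	dst->data_len += src->data_len;
	return 0;
}
```

In this commit the overflow case falls back to frag_list; the later commit 81705ad1 listed above changes that to starting a new entry instead.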
    • gro: Use gso_size to store MSS · b530256d
      Committed by Herbert Xu
      In order to allow GRO packets without frag_list at all, we need to
      store the MSS in the packet itself.  The obvious place is gso_size.
      The only thing to watch out for is if the packet ends up not being
      GRO then we need to clear gso_size before pushing the packet into
      the stack.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  31. 16 Dec, 2008 1 commit
    • net: Add skb_gro_receive · 71d93b39
      Committed by Herbert Xu
      This patch adds the helper skb_gro_receive to merge packets for
      GRO.  The current method is to allocate a new header skb and then
      chain the original packets to its frag_list.  This is done to
      make it easier to integrate into the existing GSO framework.
      
      In future as GSO is moved into the drivers, we can undo this and
      simply chain the original packets together.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
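The "header skb plus frag_list chain" arrangement can be modelled with an ordinary singly linked list. The sketch below is an assumption-laden illustration, not the kernel helper: `struct gro_head_model`, `struct pkt_node_model` and both functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical original packet, reduced to its payload length. */
struct pkt_node_model {
	size_t len;
	struct pkt_node_model *next;
};

/* Hypothetical header object that owns the chain of merged packets. */
struct gro_head_model {
	struct pkt_node_model *frag_list; /* first chained packet */
	struct pkt_node_model *last;      /* tail, for O(1) append */
	size_t len;                       /* total payload bytes */
	unsigned int count;               /* number of merged packets */
};

static void gro_head_init_model(struct gro_head_model *h)
{
	assert(h != NULL);
	h->frag_list = NULL;
	h->last = NULL;
	h->len = 0;
	h->count = 0;
}

/* Chains `pkt` to the end of the header's list and updates the totals. */
static void gro_chain_model(struct gro_head_model *h,
			    struct pkt_node_model *pkt)
{
	assert(h != NULL && pkt != NULL);

	pkt->next = NULL;
	if (h->last != NULL)
		h->last->next = pkt;
	else
		h->frag_list = pkt;
	h->last = pkt;
	h->len += pkt->len;
	h->count++;
}
```

Keeping a tail pointer avoids walking the chain on every merge; whether the kernel tracks the tail the same way is not stated in the commit message, so treat that detail as a design choice of this sketch.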