提交 · bad43ca8325f493dcaa0896c2f036276af059c7e · FengXiao2002 / Linux Imx

20 5月, 2012 1 次提交

net: introduce skb_try_coalesce() · bad43ca8

由 Eric Dumazet 提交于 5月 19, 2012

Move tcp_try_coalesce() protocol independent part to
skb_try_coalesce().

skb_try_coalesce() can be used in IPv4 defrag and IPv6 reassembly,
to build optimized skbs (less sk_buff, and possibly less 'headers')

skb_try_coalesce() is zero copy, unless the copy can fit in destination
header (its a rare case)

kfree_skb_partial() is also moved to net/core/skbuff.c and exported,
because IPv6 will need it in patch (ipv6: use skb coalescing in
reassembly).
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bad43ca8

19 5月, 2012 1 次提交

net: introduce netdev_alloc_frag() · 6f532612

由 Eric Dumazet 提交于 5月 18, 2012

Fix two issues introduced in commit a1c7fff7
( net: netdev_alloc_skb() use build_skb() )

- Must be IRQ safe (non NAPI drivers can use it)
- Must not leak the frag if build_skb() fails to allocate sk_buff

This patch introduces netdev_alloc_frag() for drivers willing to
use build_skb() instead of __netdev_alloc_skb() variants.

Factorize code so that :
__dev_alloc_skb() is a wrapper around __netdev_alloc_skb(), and
dev_alloc_skb() a wrapper around netdev_alloc_skb()

Use __GFP_COLD flag.

Almost all network drivers now benefit from skb->head_frag
infrastructure.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6f532612

07 5月, 2012 1 次提交

skb: Add inline helper for getting the skb end offset from head · ec47ea82

由 Alexander Duyck 提交于 5月 04, 2012

With the recent changes for how we compute the skb truesize it occurs to me
we are probably going to have a lot of calls to skb_end_pointer -
skb->head. Instead of running all over the place doing that it would make
more sense to just make it a separate inline skb_end_offset(skb) that way
we can return the correct value without having gcc having to do all the
optimization to cancel out skb->head - skb->head.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ec47ea82

04 5月, 2012 1 次提交

skb: Add skb_head_is_locked helper function · 3a7c1ee4

由 Alexander Duyck 提交于 5月 03, 2012

This patch adds support for a skb_head_is_locked helper function.  It is
meant to be used any time we are considering transferring the head from
skb->head to a paged frag.  If the head is locked it means we cannot remove
the head from the skb so it must be copied or we must take the skb as a
whole.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3a7c1ee4

01 5月, 2012 4 次提交

net: fix two typos in skbuff.h · d9619496

由 Eric Dumazet 提交于 4月 30, 2012

fix kernel doc typos in function names
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d9619496

net: skb_peek()/skb_peek_tail() cleanups · 18d07000

由 Eric Dumazet 提交于 4月 30, 2012

remove useless casts and rename variables for less confusion.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

18d07000

net: make GRO aware of skb->head_frag · d7e8883c

由 Eric Dumazet 提交于 4月 30, 2012

GRO can check if skb to be merged has its skb->head mapped to a page
fragment, instead of a kmalloc() area.

We 'upgrade' skb->head as a fragment in itself

This avoids the frag_list fallback, and permits to build true GRO skb
(one sk_buff and up to 16 fragments), using less memory.

This reduces number of cache misses when user makes its copy, since a
single sk_buff is fetched.

This is a followup of patch "net: allow skb->head to be a page fragment"
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d7e8883c

net: allow skb->head to be a page fragment · d3836f21

由 Eric Dumazet 提交于 4月 27, 2012

skb->head is currently allocated from kmalloc(). This is convenient but
has the drawback the data cannot be converted to a page fragment if
needed.

We have three spots were it hurts :

1) GRO aggregation

 When a linear skb must be appended to another skb, GRO uses the
frag_list fallback, very inefficient since we keep all struct sk_buff
around. So drivers enabling GRO but delivering linear skbs to network
stack aren't enabling full GRO power.

2) splice(socket -> pipe).

 We must copy the linear part to a page fragment.
 This kind of defeats splice() purpose (zero copy claim)

3) TCP coalescing.

 Recently introduced, this permits to group several contiguous segments
into a single skb. This shortens queue lengths and save kernel memory,
and greatly reduce probabilities of TCP collapses. This coalescing
doesnt work on linear skbs (or we would need to copy data, this would be
too slow)

Given all these issues, the following patch introduces the possibility
of having skb->head be a fragment in itself. We use a new skb flag,
skb->head_frag to carry this information.

build_skb() is changed to accept a frag_size argument. Drivers willing
to provide a page fragment instead of kmalloc() data will set a non zero
value, set to the fragment size.

Then, on situations we need to convert the skb head to a frag in itself,
we can check if skb->head_frag is set and avoid the copies or various
fallbacks we have.

This means drivers currently using frags could be updated to avoid the
current skb->head allocation and reduce their memory footprint (aka skb
truesize). (thats 512 or 1024 bytes saved per skb). This also makes
bpf/netfilter faster since the 'first frag' will be part of skb linear
part, no need to copy data.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d3836f21

24 4月, 2012 1 次提交

net: skb_can_coalesce returns a boolean · 38ba0a65

由 Eric Dumazet 提交于 4月 23, 2012

Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

38ba0a65

20 4月, 2012 1 次提交

nf_bridge: remove holes in struct nf_bridge_info · bf1ac5ca

由 Eric Dumazet 提交于 4月 18, 2012

Put use & mask on same location to avoid two holes on 64bit arches
Signed-off-by: NEric Dumazet <edumazet@google.com>
Acked-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bf1ac5ca

14 4月, 2012 1 次提交

skbuff: struct ubuf_info callback type safety · ca8f4fb2

由 Michael S. Tsirkin 提交于 4月 09, 2012

The skb struct ubuf_info callback gets passed struct ubuf_info
itself, not the arg value as the field name and the function signature
seem to imply. Rename the arg field to ctx to match usage,
add documentation and change the callback argument type
to make usage clear and to have compiler check correctness.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ca8f4fb2

11 4月, 2012 1 次提交

tcp: avoid order-1 allocations on wifi and tx path · a21d4572

由 Eric Dumazet 提交于 4月 10, 2012

Marc Merlin reported many order-1 allocations failures in TX path on its
wireless setup, that dont make any sense with MTU=1500 network, and non
SG capable hardware.

After investigation, it turns out TCP uses sk_stream_alloc_skb() and
used as a convention skb_tailroom(skb) to know how many bytes of data
payload could be put in this skb (for non SG capable devices)

Note : these skb used kmalloc-4096 (MTU=1500 + MAX_HEADER +
sizeof(struct skb_shared_info) being above 2048)

Later, mac80211 layer need to add some bytes at the tail of skb
(IEEE80211_ENCRYPT_TAILROOM = 18 bytes) and since no more tailroom is
available has to call pskb_expand_head() and request order-1
allocations.

This patch changes sk_stream_alloc_skb() so that only
sk->sk_prot->max_header bytes of headroom are reserved, and use a new
skb field, avail_size to hold the data payload limit.

This way, order-0 allocations done by TCP stack can leave more than 2 KB
of tailroom and no more allocation is performed in mac80211 layer (or
any layer needing some tailroom)

avail_size is unioned with mark/dropcount, since mark will be set later
in IP stack for output packets. Therefore, skb size is unchanged.
Reported-by: NMarc MERLIN <marc@merlins.org>
Tested-by: NMarc MERLIN <marc@merlins.org>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a21d4572

29 3月, 2012 1 次提交

Remove all #inclusions of asm/system.h · 9ffc93f2

由 David Howells 提交于 3月 28, 2012

Remove all #inclusions of asm/system.h preparatory to splitting and killing
it. Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`
Signed-off-by: NDavid Howells <dhowells@redhat.com>

9ffc93f2

26 3月, 2012 1 次提交

net: add a truesize parameter to skb_add_rx_frag() · 50269e19

由 Eric Dumazet 提交于 3月 23, 2012

skb_add_rx_frag() API is misleading.

Network skbs built with this helper can use uncharged kernel memory and
eventually stress/crash machine in OOM.

Add a 'truesize' parameter and then fix drivers in followup patches.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

50269e19

20 3月, 2012 1 次提交

net: update the usage of CHECKSUM_UNNECESSARY · 3af79302

由 Yi Zou 提交于 3月 19, 2012

As suggested by Ben, this adds the clarification on the usage of
CHECKSUM_UNNECESSARY on the outgoing patch. Also add the usage
description of NETIF_F_FCOE_CRC and CHECKSUM_UNNECESSARY
for the kernel FCoE protocol driver.

This is a follow-up to the following:
http://patchwork.ozlabs.org/patch/147315/Signed-off-by: NYi Zou <yi.zou@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: www.Open-FCoE.org <devel@open-fcoe.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3af79302

10 3月, 2012 1 次提交

net: Use bool in skbuff.h helper functions. · bdcc0924

由 David S. Miller 提交于 3月 07, 2012

In particular do this for skb_is_nonlinear(), skb_is_gso(), and
skb_is_gso_v6().
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bdcc0924

05 3月, 2012 1 次提交

BUG: headers with BUG/BUG_ON etc. need linux/bug.h · 187f1882

由 Paul Gortmaker 提交于 11月 23, 2011

If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any
other BUG variant in a static inline (i.e. not in a #define) then
that header really should be including <linux/bug.h> and not just
expecting it to be implicitly present.

We can make this change risk-free, since if the files using these
headers didn't have exposure to linux/bug.h already, they would have
been causing compile failures/warnings.
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>

187f1882

24 2月, 2012 2 次提交

net: Add framework to allow sending packets with customized CRC. · 3bdc0eba

由 Ben Greear 提交于 2月 11, 2012

This is useful for testing RX handling of frames with bad
CRCs.

Requires driver support to actually put the packet on the
wire properly.
Signed-off-by: NBen Greear <greearb@candelatech.com>
Tested-by: NAaron Brown <aaron.f.brown@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

3bdc0eba

ipsec: be careful of non existing mac headers · 03606895

由 Eric Dumazet 提交于 2月 23, 2012

Niccolo Belli reported ipsec crashes in case we handle a frame without
mac header (atm in his case)

Before copying mac header, better make sure it is present.

Bugzilla reference: https://bugzilla.kernel.org/show_bug.cgi?id=42809Reported-by: NNiccolò Belli <darkbasic@linuxsystems.it>
Tested-by: NNiccolò Belli <darkbasic@linuxsystems.it>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

03606895

22 2月, 2012 2 次提交

skb: Add skb_peek_next helper · da5ef6e5

由 Pavel Emelyanov 提交于 2月 21, 2012

Signed-off-by: NPavel Emelyanov <xemul@parallels.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

da5ef6e5

datagram: Add offset argument to __skb_recv_datagram · 3f518bf7

由 Pavel Emelyanov 提交于 2月 21, 2012

This one is only considered for MSG_PEEK flag and the value pointed by
it specifies where to start peeking bytes from. If the offset happens to
point into the middle of the returned skb, the offset within this skb is
put back to this very argument.
Signed-off-by: NPavel Emelyanov <xemul@parallels.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3f518bf7

11 2月, 2012 1 次提交

skbuff: Move rxhash and vlan_tci to consolidate holes in sk_buff · 4031ae6e

由 Alexander Duyck 提交于 1月 27, 2012

This change helps to reduce the overall size of the sk_buff by moving
rxhash and vlan_tci so that the u16 values and u8 bitfields can be better
combined to create only one hole instead of multiple.
Signed-off-by: NAlexander Duyck <alexander.h.duyck@intel.com>
Tested-by: NStephen Ko <stephen.s.ko@intel.com>
Signed-off-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>

4031ae6e

06 1月, 2012 1 次提交

net: pack skb_shared_info more efficiently · 9f42f126

由 Ian Campbell 提交于 1月 05, 2012

nr_frags can be 8 bits since 256 is plenty of fragments. This allows it to be
packed with tx_flags.

Also by moving ip6_frag_id and dataref (both 4 bytes) next to each other we can
avoid a hole between ip6_frag_id and frag_list on 64 bit systems.
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9f42f126

24 12月, 2011 1 次提交

net: only use a single page of slop in MAX_SKB_FRAGS · 9d4dde52

由 Ian Campbell 提交于 12月 22, 2011

In order to accommodate a 64K buffer we need 64K/PAGE_SIZE plus one more page
in order to allow for a buffer which does not start on a page boundary.
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9d4dde52

05 12月, 2011 1 次提交

tcp: take care of misalignments · 117632e6

由 Eric Dumazet 提交于 12月 03, 2011

We discovered that TCP stack could retransmit misaligned skbs if a
malicious peer acknowledged sub MSS frame. This currently can happen
only if output interface is non SG enabled : If SG is enabled, tcp
builds headless skbs (all payload is included in fragments), so the tcp
trimming process only removes parts of skb fragments, header stay
aligned.

Some arches cant handle misalignments, so force a head reallocation and
shrink headroom to MAX_TCP_HEADER.

Dont care about misaligments on x86 and PPC (or other arches setting
NET_IP_ALIGN to 0)

This patch introduces __pskb_copy() which can specify the headroom of
new head, and pskb_copy() becomes a wrapper on top of __pskb_copy()
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

117632e6

23 11月, 2011 1 次提交

net: remove netdev_alloc_page and use __GFP_COLD · 1f2149c1

由 Eric Dumazet 提交于 11月 22, 2011

Given we dont use anymore the struct net_device *dev argument, and this
interface brings litle benefit, remove netdev_{alloc|free}_page(), to
debloat include/linux/skbuff.h a bit.

(Some drivers used a mix of these interfaces and alloc_pages())

When allocating a page given to device for DMA transfer (device to
memory), it makes sense to use a cold one (__GFP_COLD)
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1f2149c1

17 11月, 2011 2 次提交

net: remove NETIF_F_NO_CSUM feature bit · 34324dc2

由 Michał Mirosław 提交于 11月 15, 2011

Only distinct use is checking if NETIF_F_NOCACHE_COPY should be
enabled by default. The check heuristics is altered a bit here,
so it hits other people than before. The default shouldn't be
trusted for performance-critical cases anyway.

For all other uses NETIF_F_NO_CSUM is equivalent to NETIF_F_HW_CSUM.
Signed-off-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

34324dc2

net: introduce and use netdev_features_t for device features sets · c8f44aff

由 Michał Mirosław 提交于 11月 15, 2011

v2:	add couple missing conversions in drivers
	split unexporting netdev_fix_features()
	implemented %pNF
	convert sock::sk_route_(no?)caps
Signed-off-by: NMichał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c8f44aff

15 11月, 2011 1 次提交

net: introduce build_skb() · b2b5ce9d

由 Eric Dumazet 提交于 11月 14, 2011

One of the thing we discussed during netdev 2011 conference was the idea
to change some network drivers to allocate/populate their skb at RX
completion time, right before feeding the skb to network stack.

In old days, we allocated skbs when populating the RX ring.

This means bringing into cpu cache sk_buff and skb_shared_info cache
lines (since we clear/initialize them), then 'queue' skb->data to NIC.

By the time NIC fills a frame in skb->data buffer and host can process
it, cpu probably threw away the cache lines from its caches, because lot
of things happened between the allocation and final use.

So the deal would be to allocate only the data buffer for the NIC to
populate its RX ring buffer. And use build_skb() at RX completion to
attach a data buffer (now filled with an ethernet frame) to a new skb,
initialize the skb_shared_info portion, and give the hot skb to network
stack.

build_skb() is the function to allocate an skb, caller providing the
data buffer that should be attached to it. Drivers are expected to call
skb_reserve() right after build_skb() to adjust skb->data to the
Ethernet frame (usually skipping NET_SKB_PAD and NET_IP_ALIGN, but some
drivers might add a hardware provided alignment)

Data provided to build_skb() MUST have been allocated by a prior
kmalloc() call, with enough room to add SKB_DATA_ALIGN(sizeof(struct
skb_shared_info)) bytes at the end of the data without corrupting
incoming frame.

data = kmalloc(NET_SKB_PAD + NET_IP_ALIGN + 1536 +
               SKB_DATA_ALIGN(sizeof(struct skb_shared_info)),
	       GFP_ATOMIC);
...
skb = build_skb(data);
if (!skb) {
	recycle_data(data);
} else {
	skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
	...
}
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Eilon Greenstein <eilong@broadcom.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Tom Herbert <therbert@google.com>
CC: Jamal Hadi Salim <hadi@mojatatu.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Thomas Graf <tgraf@infradead.org>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b2b5ce9d

10 11月, 2011 1 次提交

net: add wireless TX status socket option · 6e3e939f

由 Johannes Berg 提交于 11月 09, 2011

The 802.1X EAPOL handshake hostapd does requires
knowing whether the frame was ack'ed by the peer.
Currently, we fudge this pretty badly by not even
transmitting the frame as a normal data frame but
injecting it with radiotap and getting the status
out of radiotap monitor as well. This is rather
complex, confuses users (mon.wlan0 presence) and
doesn't work with all hardware.

To get rid of that hack, introduce a real wifi TX
status option for data frame transmissions.

This works similar to the existing TX timestamping
in that it reflects the SKB back to the socket's
error queue with a SCM_WIFI_STATUS cmsg that has
an int indicating ACK status (0/1).

Since it is possible that at some point we will
want to have TX timestamping and wifi status in a
single errqueue SKB (there's little point in not
doing that), redefine SO_EE_ORIGIN_TIMESTAMPING
to SO_EE_ORIGIN_TXSTATUS which can collect more
than just the timestamp; keep the old constant
as an alias of course. Currently the internal APIs
don't make that possible, but it wouldn't be hard
to split them up in a way that makes it possible.

Thanks to Neil Horman for helping me figure out
the functions that add the control messages.
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
Signed-off-by: NJohn W. Linville <linville@tuxdriver.com>

6e3e939f

01 11月, 2011 1 次提交

include: linux: skbuf.h: Fix parameter documentation · f83347df

由 Marcos Paulo de Souza 提交于 10月 31, 2011

Fixes parameter name of skb_frag_dmamap function to silence warning on
make htmldocs.
Signed-off-by: NMarcos Paulo de Souza <marcos.mage@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f83347df

24 10月, 2011 1 次提交

net: hold sock reference while processing tx timestamps · da92b194

由 Richard Cochran 提交于 10月 21, 2011

The pair of functions,

 * skb_clone_tx_timestamp()
 * skb_complete_tx_timestamp()

were designed to allow timestamping in PHY devices. The first
function, called during the MAC driver's hard_xmit method, identifies
PTP protocol packets, clones them, and gives them to the PHY device
driver. The PHY driver may hold onto the packet and deliver it at a
later time using the second function, which adds the packet to the
socket's error queue.

As pointed out by Johannes, nothing prevents the socket from
disappearing while the cloned packet is sitting in the PHY driver
awaiting a timestamp. This patch fixes the issue by taking a reference
on the socket for each such packet. In addition, the comments
regarding the usage of these function are expanded to highlight the
rule that PHY drivers must use skb_complete_tx_timestamp() to release
the packet, in order to release the socket reference, too.

These functions first appeared in v2.6.36.
Reported-by: NJohannes Berg <johannes@sipsolutions.net>
Signed-off-by: NRichard Cochran <richard.cochran@omicron.at>
Cc: <stable@vger.kernel.org>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: NJohannes Berg <johannes@sipsolutions.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

da92b194

21 10月, 2011 2 次提交

net: add opaque struct around skb frag page · a8605c60

由 Ian Campbell 提交于 10月 19, 2011

I've split this bit out of the skb frag destructor patch since it helps enforce
the use of the fragment API.
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a8605c60

net: constify skbuff and Qdisc elements · 05bdd2f1

由 Eric Dumazet 提交于 10月 20, 2011

Preliminary patch before tcp constification
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

05bdd2f1

20 10月, 2011 2 次提交

net: do not take an additional reference in skb_frag_set_page · a0bec1cd

由 Ian Campbell 提交于 10月 18, 2011

I audited all of the callers in the tree and only one of them (pktgen) expects
it to do so. Taking this reference is pretty obviously confusing and error
prone.

In particular I looked at the following commits which switched callers of
(__)skb_frag_set_page to the skb paged fragment api:

6a930b9f cxgb3: convert to SKB paged frag API.
5dc3e196 myri10ge: convert to SKB paged frag API.
0e0634d2 vmxnet3: convert to SKB paged frag API.
86ee8130 virtionet: convert to SKB paged frag API.
4a22c4c9 sfc: convert to SKB paged frag API.
18324d69 cassini: convert to SKB paged frag API.
b061b39e benet: convert to SKB paged frag API.
b7b6a688 bnx2: convert to SKB paged frag API.
804cf14e net: xfrm: convert to SKB frag APIs
ea2ab693 net: convert core to skb paged frag APIs
Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a0bec1cd

net: Allow skb_recycle_check to be done in stages · 3d153a7c

由 Andy Fleming 提交于 10月 13, 2011

skb_recycle_check resets the skb if it's eligible for recycling.
However, there are times when a driver might want to optionally
manipulate the skb data with the skb before resetting the skb,
but after it has determined eligibility.  We do this by splitting the
eligibility check from the skb reset, creating two inline functions to
accomplish that task.
Signed-off-by: NAndy Fleming <afleming@freescale.com>
Acked-by: NDavid Daney <david.daney@cavium.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3d153a7c

19 10月, 2011 1 次提交

net: add skb frag size accessors · 9e903e08

由 Eric Dumazet 提交于 10月 18, 2011

To ease skb->truesize sanitization, its better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9e903e08

14 10月, 2011 1 次提交

net: more accurate skb truesize · 87fb4b7b

由 Eric Dumazet 提交于 10月 13, 2011

skb truesize currently accounts for sk_buff struct and part of skb head.
kmalloc() roundings are also ignored.

Considering that skb_shared_info is larger than sk_buff, its time to
take it into account for better memory accounting.

This patch introduces SKB_TRUESIZE(X) macro to centralize various
assumptions into a single place.

At skb alloc phase, we put skb_shared_info struct at the exact end of
skb head, to allow a better use of memory (lowering number of
reallocations), since kmalloc() gives us power-of-two memory blocks.

Unless SLUB/SLUB debug is active, both skb->head and skb_shared_info are
aligned to cache lines, as before.

Note: This patch might trigger performance regressions because of
misconfigured protocol stacks, hitting per socket or global memory
limits that were previously not reached. But its a necessary step for a
more accurate memory accounting.
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
CC: Andi Kleen <ak@linux.intel.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

87fb4b7b

16 9月, 2011 1 次提交

net: copy userspace buffers on device forwarding · 48c83012

由 Michael S. Tsirkin 提交于 8月 31, 2011

dev_forward_skb loops an skb back into host networking
stack which might hang on the memory indefinitely.
In particular, this can happen in macvtap in bridged mode.
Copy the userspace fragments to avoid blocking the
sender in that case.

As this patch makes skb_copy_ubufs extern now,
I also added some documentation and made it clear
the SKBTX_DEV_ZEROCOPY flag automatically instead
of doing it in all callers. This can be made into a separate
patch if people feel it's worth it.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

48c83012

25 8月, 2011 1 次提交

net: convert core to skb paged frag APIs · ea2ab693

由 Ian Campbell 提交于 8月 22, 2011

Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: netdev@vger.kernel.org
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ea2ab693