1. 13 Sep 2014, 1 commit
  2. 11 Sep 2014, 1 commit
  3. 10 Sep 2014, 2 commits
  4. 06 Sep 2014, 8 commits
    • net: Add function for parsing the header length out of linear ethernet frames · 56193d1b
      Authored by Alexander Duyck

      This patch updates some of the flow_dissector API so that it can be used
      to parse the length of Ethernet buffers stored in fragments.  Most of the
      changes needed were to __skb_get_poff, which had to be updated to support
      taking a linear buffer instead of an skb.

      I have split __skb_get_poff into two functions.  The first is
      skb_get_poff, and it retains the functionality of the original
      __skb_get_poff.  The other function is __skb_get_poff, which now works
      much like __skb_flow_dissect in relation to skb_flow_dissect: it provides
      the same functionality but works with just a data buffer and hlen instead
      of needing an skb.  A sketch of the resulting pair follows this entry.

      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      56193d1b
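      A minimal sketch of the resulting split; the exact parameter lists here
      are an assumption based on the description above, not a quotation of the
      final code:

        /* include/linux/skbuff.h (sketch) */
        u32 skb_get_poff(const struct sk_buff *skb);
        u32 __skb_get_poff(const struct sk_buff *skb, void *data,
                           const struct flow_keys *keys, int hlen);

        /* A driver holding only a raw DMA'd buffer can now ask for the
         * payload offset before it has built an skb at all. */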
    • net: merge cases where sock_efree and sock_edemux are the same function · 82eabd9e
      Authored by Alexander Duyck

      Since sock_efree and sock_edemux are essentially the same code for
      non-TCP sockets, and for the case where CONFIG_INET is not defined, we
      can combine the code or replace the call to sock_edemux in several
      spots.  As a result we can avoid a bit of unnecessary code duplication
      (see the sketch after this entry).

      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      82eabd9e
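      A hedged sketch of the shape this merge takes; treating the !CONFIG_INET
      sock_edemux as a plain alias of sock_efree is an assumption drawn from
      the description above:

        /* net/core/sock.c (sketch) */
        void sock_efree(struct sk_buff *skb)
        {
                sock_put(skb->sk);
        }

        #ifdef CONFIG_INET
        void sock_edemux(struct sk_buff *skb)
        {
                struct sock *sk = skb->sk;

                if (sk->sk_state == TCP_TIME_WAIT)
                        inet_twsk_put(inet_twsk(sk));
                else
                        sock_put(sk);
        }
        #else
        /* without CONFIG_INET there is no TIME_WAIT case to special-case */
        #define sock_edemux sock_efree
        #endif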
    • net-timestamp: Make the clone operation stand-alone from phy timestamping · 62bccb8c
      Authored by Alexander Duyck

      The phy timestamping takes a different path than the regular
      timestamping does, in that it first creates a clone so that the packets
      needing to be timestamped can be placed in a queue, or so that the
      context block can be used.

      In order to support these use cases I am pulling the core of the code
      out so it can be used in other drivers beyond just phy devices.

      In addition I have added a destructor named sock_efree, which is meant
      to provide a simple way of dropping the reference to skb exceptions that
      aren't part of either the receive or send windows for the socket, and I
      have removed some duplication in spots where this destructor can be used
      in place of sock_edemux.  A sketch of the clone helper follows this
      entry.

      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      62bccb8c
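      A minimal sketch of a stand-alone clone helper in the spirit of this
      change; the name skb_clone_sk and its exact body are assumptions here:

        /* net/core/skbuff.c (sketch) */
        struct sk_buff *skb_clone_sk(struct sk_buff *skb)
        {
                struct sock *sk = skb->sk;
                struct sk_buff *clone;

                /* only clone if the socket can still be referenced */
                if (!sk || !atomic_inc_not_zero(&sk->sk_refcnt))
                        return NULL;

                clone = skb_clone(skb, GFP_ATOMIC);
                if (!clone) {
                        sock_put(sk);
                        return NULL;
                }

                clone->sk = sk;
                clone->destructor = sock_efree;  /* drops the sk reference */

                return clone;
        }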
    • net-timestamp: Merge shared code between phy and regular timestamping · 37846ef0
      Authored by Alexander Duyck

      This change merges the shared bits that exist between skb_tx_tstamp and
      skb_complete_tx_timestamp.  By doing this we can avoid the two
      diverging, as there were already changes pushed into skb_tx_tstamp that
      hadn't made it into the other function.

      In addition this resolves issues with the fact that
      skb_complete_tx_timestamp was declared in linux/skbuff.h even though it
      was only compiled in if phy timestamping was enabled.  A sketch of the
      consolidated shape follows this entry.

      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      37846ef0
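      A hedged sketch of the consolidation; the helper name __skb_tstamp_tx
      fits this series, but its parameters and body here are assumptions:

        /* net/core/skbuff.c (sketch) */
        void __skb_tstamp_tx(struct sk_buff *orig_skb,
                             struct skb_shared_hwtstamps *hwtstamps,
                             struct sock *sk, int tstype)
        {
                /* shared body: clone the skb, attach either the provided
                 * hardware timestamps or a software timestamp, and queue
                 * the clone on the socket's error queue. */
        }

        void skb_tstamp_tx(struct sk_buff *orig_skb,
                           struct skb_shared_hwtstamps *hwtstamps)
        {
                __skb_tstamp_tx(orig_skb, hwtstamps, orig_skb->sk,
                                SCM_TSTAMP_SND);
        }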
    • net: treewide: Fix typo found in DocBook/networking.xml · e793c0f7
      Authored by Masanari Iida

      This patch fixes spelling typos found in DocBook/networking.xml.
      Because networking.xml is generated from comments in the source, the
      typos have to be fixed in those comments.

      Signed-off-by: Masanari Iida <standby24x7@gmail.com>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e793c0f7
    • ethtool: Add generic options for tunables · f0db9b07
      Authored by Govindarajulu Varadarajan

      This patch adds new ethtool commands, ETHTOOL_GTUNABLE and
      ETHTOOL_STUNABLE, for getting and setting tunable values in a driver.

      It adds get_tunable and set_tunable to ethtool_ops; a driver implements
      these functions to get and set a tunable's value (see the sketch after
      this entry).

      Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f0db9b07
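      A sketch of what the two driver hooks plausibly look like; these
      signatures follow the description above and are an assumption, not a
      quotation of the final API:

        /* include/linux/ethtool.h (sketch) */
        struct ethtool_ops {
                /* ... existing ops ... */
                int (*get_tunable)(struct net_device *dev,
                                   const struct ethtool_tunable *tuna,
                                   void *data);
                int (*set_tunable)(struct net_device *dev,
                                   const struct ethtool_tunable *tuna,
                                   const void *data);
        };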
    • dev_ioctl: remove dev_load() CAP_SYS_MODULE message · e020836d
      Authored by Daniel Borkmann

      Marcel reported seeing the following message when autoloading is
      triggered while adding an nlmon device:

        Loading kernel module for a network device with
        CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias
        netdev-nlmon instead.

      This false positive happens despite having the correct capabilities
      set, e.g. through issuing `ip link del dev nlmon` more than once on a
      valid device with the name nlmon, but Marcel has also seen it at
      creation time, when no nlmon module is previously compiled in or loaded
      as a module and the device name equals a link type name (e.g. nlmon,
      vxlan, team).
      
      Stephen says:
      
        The netdev module alias is a holdover from the past.  For
        normal devices, people used to create an alias eth0 and
        point it to the type of network device used; that was back
        in the bad old ISA days before real discovery.

        Also, the tunnels create a module alias for the control device,
        and ip used to use this to autoload the tunnel device.

        The message is bogus and should just be removed; I also see
        it in a couple of other cases where tap devices are renamed
        for other uses.
      
      As mentioned in 8909c9ad ("net: don't allow CAP_NET_ADMIN
      to load non-netdev kernel modules"), we nevertheless might still
      want to leave the old autoloading behaviour in place, as removing
      it could break old scripts; so for now, let's just remove the
      log message as Stephen suggests.

      Reference: http://thread.gmane.org/gmane.linux.kernel/1105168
      Reported-by: Marcel Holtmann <marcel@holtmann.org>
      Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Vasiliy Kulikov <segoon@openwall.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e020836d
    • net: bpf: make eBPF interpreter images read-only · 60a3b225
      Authored by Daniel Borkmann

      With eBPF getting more extended and its exposure to user space on the
      way, hardening the memory range the interpreter uses to steer its
      command flow seems appropriate.  This patch moves the to-be-interpreted
      bytecode to read-only pages.
      
      In case we execute a corrupted BPF interpreter image for some reason,
      e.g. caused by an attacker who got past a verifier stage, it would not
      only provide arbitrary read/write memory access but arbitrary function
      calls as well.  After setting up the BPF interpreter image, its contents
      do not change until destruction time, thus we can set up the image on
      pages made immutable in order to mitigate modifications to that code.
      The idea is derived from commit 314beb9b ("x86: bpf_jit_comp: secure
      bpf jit against spraying attacks").

      This is possible because bpf_prog is not part of sk_filter anymore.
      After setup, bpf_prog cannot be altered during its lifetime.  This
      prevents any modifications to the entire bpf_prog structure (incl.
      function/JIT image pointer).

      Every eBPF program (including migrated classic BPF programs) has to
      call bpf_prog_select_runtime() to select either the interpreter or a
      JIT image as a last setup step, and they are all freed via
      bpf_prog_free(), including non-JIT.  Therefore, we can easily integrate
      this into the eBPF lifetime, plus since we directly allocate a
      bpf_prog, we have no performance penalty.  A sketch of the locking step
      follows this entry.

      Tested with seccomp and the test_bpf testsuite in JIT/non-JIT mode and
      by manual inspection of kernel_page_tables.  Brad Spengler proposed the
      same idea via Twitter during development of this patch.
      
      Joint work with Hannes Frederic Sowa.
      Suggested-by: Brad Spengler <spender@grsecurity.net>
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Kees Cook <keescook@chromium.org>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      60a3b225
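      A hedged sketch of the locking step; set_memory_ro() and set_memory_rw()
      are existing kernel primitives, while wiring them up exactly like this
      is a simplification of the actual patch:

        /* include/linux/filter.h (sketch) */
        static inline void bpf_prog_lock_ro(struct bpf_prog *fp)
        {
                set_memory_ro((unsigned long)fp, fp->pages);
        }

        static inline void bpf_prog_unlock_ro(struct bpf_prog *fp)
        {
                set_memory_rw((unsigned long)fp, fp->pages);
        }

        /* bpf_prog_select_runtime() would lock the image read-only as its
         * last setup step; bpf_prog_free() would flip it writable again
         * before releasing the pages. */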
  5. 04 Sep 2014, 1 commit
    • qdisc: validate frames going through the direct_xmit path · 1f59533f
      Authored by Jesper Dangaard Brouer
      In commit 50cbe9ab ("net: Validate xmit SKBs right when we
      pull them out of the qdisc") the validation code was moved out of
      dev_hard_start_xmit and into dequeue_skb.
      
      However this overlooked the fact that we do not always enqueue
      the skb onto a qdisc.  The first situation is when the qdisc has the
      TCQ_F_CAN_BYPASS flag set and is empty.  The second situation is when
      there is no qdisc on the device, which is a common case for
      software devices.

      Originally spotted, and the initial patch written, by Alexander Duyck.
      Alex was seeing issues trying to connect to a vhost_net interface
      after commit 50cbe9ab was applied.
      
      Added a call to validate_xmit_skb() in __dev_xmit_skb(), in the
      code path for qdiscs with the TCQ_F_CAN_BYPASS flag, and in
      __dev_queue_xmit() when there is no qdisc (see the sketch after
      this entry).

      Also handle the error situation where dev_hard_start_xmit() could
      return an skb list and does not return dev_xmit_complete(rc), falling
      through to the kfree_skb(); in that situation it should call
      kfree_skb_list().

      Fixes: 50cbe9ab ("net: Validate xmit SKBs right when we pull them out of the qdisc")
      Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1f59533f
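      A minimal sketch of the bypass-path part of the fix; the surrounding
      locking and accounting logic is elided, so this illustrates the
      placement rather than the full diff:

        /* net/core/dev.c, __dev_xmit_skb() (sketch) */
        if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q) &&
            qdisc_run_begin(q)) {
                /* validate here, since the skb never passes through
                 * dequeue_skb() on this path */
                skb = validate_xmit_skb(skb, dev);
                if (skb && sch_direct_xmit(skb, q, dev, txq, root_lock)) {
                        /* ... */
                }
                /* ... */
        }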
  6. 03 Sep 2014, 4 commits
  7. 02 Sep 2014, 12 commits
  8. 30 Aug 2014, 3 commits
    • net: Allow GRO to use and set levels of checksum unnecessary · 662880f4
      Authored by Tom Herbert

      Allow the GRO path to "consume" checksums provided in
      CHECKSUM_UNNECESSARY and to report newly verified checksums for use in
      the fallback to the normal path.

      Change the GRO checksum path to track csum_level using a csum_cnt field
      in NAPI_GRO_CB.  On GRO initialization, if ip_summed is
      CHECKSUM_UNNECESSARY, set NAPI_GRO_CB(skb)->csum_cnt to
      skb->csum_level + 1.  For each checksum verified, decrement
      NAPI_GRO_CB(skb)->csum_cnt while it is greater than zero.  If a checksum
      is verified while NAPI_GRO_CB(skb)->csum_cnt == 0, we have verified a
      deeper checksum than originally indicated in the skb, so increment
      csum_level (or initialize to CHECKSUM_UNNECESSARY if ip_summed is
      CHECKSUM_NONE or CHECKSUM_COMPLETE).  The sketch after this entry
      restates that counting scheme in code.

      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      662880f4
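      A restatement of the counting scheme above as a hedged sketch; the
      helper name and its exact placement are assumptions:

        /* at GRO initialization (sketch) */
        NAPI_GRO_CB(skb)->csum_cnt =
                skb->ip_summed == CHECKSUM_UNNECESSARY ?
                                  skb->csum_level + 1 : 0;

        /* after a protocol layer verifies one checksum (sketch) */
        static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
        {
                if (NAPI_GRO_CB(skb)->csum_cnt > 0) {
                        /* consume one checksum the device already covered */
                        NAPI_GRO_CB(skb)->csum_cnt--;
                } else {
                        /* verified deeper than the device indicated */
                        __skb_incr_checksum_unnecessary(skb);
                }
        }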
    • net: attempt a single high order allocation · d9b2938a
      Authored by Eric Dumazet

      In commit ed98df33 ("net: use __GFP_NORETRY for high order
      allocations") we tried to address one issue caused by order-3
      allocations.

      We still observe high latencies and system overhead in situations where
      compaction is not successful.

      Instead of trying order-3, order-2, and order-1, do a single best-effort
      order-3 attempt and immediately fall back to a plain order-0 page (see
      the sketch after this entry).

      This mimics the slub strategy of falling back to the slab min order when
      the high-order allocation used for performance fails.

      Order-3 allocations give a performance boost only if they can be done
      without recurring, expensive memory scans.

      Quoting David:

      The page allocator relies on synchronous (sync light) memory compaction
      after direct reclaim for allocations that don't retry, and deferred
      compaction doesn't work with this strategy because the allocation order
      is always decreasing from the previous failed attempt.

      This means sync light compaction will always be encountered if memory
      cannot be defragmented or reclaimed several times during the
      skb_page_frag_refill() iteration.

      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d9b2938a
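      A hedged sketch of the resulting strategy in skb_page_frag_refill();
      the refcounting and reuse of the currently held page are elided:

        /* net/core/sock.c (sketch) */
        bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag,
                                  gfp_t gfp)
        {
                pfrag->offset = 0;

                /* single best-effort order-3 attempt: no retries, no
                 * warnings, then an immediate fallback to order-0 */
                pfrag->page = alloc_pages(gfp | __GFP_COMP | __GFP_NOWARN |
                                          __GFP_NORETRY,
                                          SKB_FRAG_PAGE_ORDER);
                if (likely(pfrag->page)) {
                        pfrag->size = PAGE_SIZE << SKB_FRAG_PAGE_ORDER;
                        return true;
                }

                pfrag->page = alloc_page(gfp);
                if (likely(pfrag->page)) {
                        pfrag->size = PAGE_SIZE;
                        return true;
                }
                return false;
        }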
    • net: add skb_get_tx_queue() helper · 10c51b56
      Authored by Daniel Borkmann

      Replace occurrences of skb_get_queue_mapping() and the follow-up
      netdev_get_tx_queue() with an actual helper function (see the sketch
      after this entry).

      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      10c51b56
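      Given the description, the helper plausibly reduces to the following
      one-liner (a sketch, not a quotation of the final code):

        /* include/linux/netdevice.h (sketch) */
        static inline struct netdev_queue *
        skb_get_tx_queue(const struct net_device *dev,
                         const struct sk_buff *skb)
        {
                return netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));
        }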
  9. 26 Aug 2014, 4 commits
  10. 25 Aug 2014, 2 commits
  11. 24 Aug 2014, 2 commits
    • net: use reciprocal_scale() helper · 8fc54f68
      Authored by Daniel Borkmann

      Replace open codings of (((u64) <x> * <y>) >> 32) with
      reciprocal_scale() (see the sketch after this entry).

      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8fc54f68
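      For reference, a sketch of the helper this maps onto; it is a one-liner
      of exactly the open-coded form, scaling val from [0, 2^32) into
      [0, ep_ro):

        /* include/linux/kernel.h (sketch) */
        static inline u32 reciprocal_scale(u32 val, u32 ep_ro)
        {
                return (u32)(((u64) val * ep_ro) >> 32);
        }

        /* e.g. picking a hash bucket without a modulo operation:
         *   idx = reciprocal_scale(hash, table_size);
         */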
    • net: Allow raw buffers to be passed into the flow dissector. · 690e36e7
      Authored by David S. Miller
      Drivers, and perhaps other entities we have not yet considered,
      sometimes want to know how deep the protocol headers go before
      deciding how large of an SKB to allocate and how much of the packet to
      place into the linear SKB area.
      
      For example, consider a driver which has a device which DMAs into
      pools of pages and then tells the driver where the data went in the
      DMA descriptor(s).  The driver can then build an SKB and reference
      most of the data via SKB fragments (which are page/offset/length
      triplets).
      
      However at least some of the front of the packet should be placed into
      the linear SKB area, which comes before the fragments, so that packet
      processing can get at the headers efficiently.  The first thing each
      protocol layer is going to do is a "pskb_may_pull()" so we might as
      well aggregate as much of this as possible while we're building the
      SKB in the driver.
      
      Part of supporting this is that we don't have an SKB yet, so we want
      to be able to let the flow dissector operate on a raw buffer in order
      to compute the offset of the end of the headers.
      
      So now we have a __skb_flow_dissect() which takes an explicit data
      pointer and length (see the sketch after this entry).

      Signed-off-by: David S. Miller <davem@davemloft.net>
      690e36e7
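      A hedged sketch of the resulting entry points; the exact parameter list
      of __skb_flow_dissect() here is an assumption based on the description
      above:

        /* net/core/flow_dissector.c (sketch) */
        bool __skb_flow_dissect(const struct sk_buff *skb,
                                struct flow_keys *flow,
                                void *data, __be16 proto, int nhoff, int hlen);

        /* skb-based wrapper (sketch): passing NULL data lets the dissector
         * pull the buffer and length from the skb itself */
        static inline bool skb_flow_dissect(const struct sk_buff *skb,
                                            struct flow_keys *flow)
        {
                return __skb_flow_dissect(skb, flow, NULL, 0, 0, 0);
        }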