1. 03 Sep 2014, 18 commits
  2. 02 Sep 2014, 22 commits
    • W
      sock: deduplicate errqueue dequeue · 364a9e93
      Committed by Willem de Bruijn
      sk->sk_error_queue is dequeued in four locations. All share the
      exact same logic. Deduplicate.
      
      Also collapse the two critical sections for dequeue (at the top of
      the recv handler) and signal (at the bottom).
      
      This moves signal generation for the next packet forward, which should
      be harmless.
      
      It also changes the behavior if the recv handler exits early with an
      error. Previously, a signal for follow-up packets on the errqueue
      would then not be scheduled. The new behavior, to always signal, is
      arguably a bug fix.
      
      For rxrpc, the change causes the same function to be called repeatedly
      for each queued packet (because the recv handler == sk_error_report).
      It is likely that all packets will fail for the same reason (e.g.,
      memory exhaustion).
      
      This code runs without sk_lock held, so it is not safe to trust that
      sk->sk_err is immutable in between releasing q->lock and the subsequent
      test. Introduce int err just to avoid this potential race.
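      A minimal userspace sketch of the deduplicated dequeue described above, with the pop and the "more pending, so signal" test collapsed into one critical section. The struct and function names are illustrative, not the kernel's; the lock is shown as comments so the sketch stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative model of the deduplicated errqueue dequeue. The real
 * code holds q->lock around this whole body. */
struct skb { struct skb *next; int id; };

struct err_queue { struct skb *head; };

/* Pop the head and, in the same critical section, decide whether a
 * signal for the next queued packet is needed (always, per the new
 * behavior, even if the caller later exits with an error). */
static struct skb *errqueue_dequeue(struct err_queue *q, bool *signal_next)
{
    /* lock(q->lock) would go here */
    struct skb *skb = q->head;
    if (skb)
        q->head = skb->next;
    *signal_next = (q->head != NULL);
    /* unlock(q->lock) */
    return skb;
}
```

      Computing `signal_next` under the same lock as the dequeue is what makes the two critical sections collapsible.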
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      364a9e93
    • W
      net-timestamp: expand documentation · 8fe2f761
      Committed by Willem de Bruijn
      Expand Documentation/networking/timestamping.txt with new
      interfaces and bytestream timestamping. Also minor
      cleanup of the other text.
      
      Import txtimestamp.c test of the new features.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8fe2f761
    • D
      Merge branch 'csums-next' · c5a65680
      Committed by David S. Miller
      Tom Herbert says:
      
      ====================
      net: Checksum offload changes - Part VI
      
      I am working on overhauling RX checksum offload. Goals of this effort
      are:
      
      - Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
      - Preserve CHECKSUM_COMPLETE through encapsulation layers
      - Don't do skb_checksum more than once per packet
      - Unify GRO and non-GRO csum verification as much as possible
      - Unify the checksum functions (checksum_init)
      - Simplify code
      
      What is in this patch set:
      
      - Add skb->csum_bad. This allows a device or GRO to indicate that an
        invalid checksum was detected.
      - Checksum unnecessary to checksum complete conversions.
      
      With these changes, I believe that the third goal of the overhaul is
      now mostly achieved. In the case of no encapsulation or one layer of
      encapsulation, there should only be at most one skb_checksum over
      each packet (between GRO and normal path). In the case of two layers
      of encapsulation, it is still possible with the right combination of
      non-zero and zero UDP checksums to have >1 skb_checksum. For instance:
      IP>GRE(with csum)>IP>UDP(zero csum)>VXLAN>IP>UDP(non-zero csum),
      would likely necessitate an skb_checksum in both the GRO and normal paths.
      This doesn't seem like a common scenario at all, so I'm inclined not
      to address it now; if multiple layers of encapsulation become
      popular, we can reassess.
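      For context on why avoiding repeat skb_checksum passes matters, here is a standalone RFC 1071 ones'-complement checksum over a buffer, the same arithmetic skb_checksum performs per packet. This is illustrative userspace code, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum: 16-bit ones'-complement sum of the
 * buffer, folded, then complemented. One full pass over the data. */
static uint16_t csum_fold_buf(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    while (len > 1) {
        sum += (uint32_t)data[0] << 8 | data[1];
        data += 2;
        len -= 2;
    }
    if (len) /* odd trailing byte, conceptually padded with zero */
        sum += (uint32_t)data[0] << 8;
    while (sum >> 16) /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

      CHECKSUM_COMPLETE means the device already did this pass; the overhaul's goal is that software never has to repeat it.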
      
      Note that checksum conversion shows a nice improvement for RX VXLAN when
      outer UDP checksum is enabled (12.65% CPU compared to 20.94%). This
      is not only because we avoid the checksum calculation on the host,
      but also because GRO is enabled for VXLAN in this case. Checksum
      conversion does not help send side (which still needs to perform
      a checksum on host). For that we will implement remote checksum offload
      in a later patch
      (http://tools.ietf.org/html/draft-herbert-remotecsumoffload-00).
      
      Please review carefully and test if possible, mucking with basic
      checksum functions is always a little precarious :-)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c5a65680
    • T
      gre: Add support for checksum unnecessary conversions · 884d338c
      Committed by Tom Herbert
      Call skb_checksum_try_convert and skb_gro_checksum_try_convert
      after checksum is found present and validated in the GRE header
      for normal and GRO paths respectively.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      884d338c
    • T
      udp: Add support for doing checksum unnecessary conversion · 2abb7cdc
      Committed by Tom Herbert
      Add support for doing CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE
      conversion in UDP tunneling path.
      
      In the normal UDP path, we call skb_checksum_try_convert after locating
      the UDP socket. The check is that checksum conversion is enabled for
      the socket (new flag in UDP socket) and that checksum field is
      non-zero.
      
      In the UDP GRO path, we call skb_gro_checksum_try_convert after
      checksum is validated and checksum field is non-zero. Since this is
      already in GRO we assume that checksum conversion is always wanted.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2abb7cdc
    • T
      net: Infrastructure for checksum unnecessary conversions · d96535a1
      Committed by Tom Herbert
      For normal path, added skb_checksum_try_convert which is called
      to attempt to convert CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE. The
      primary condition to allow this is that ip_summed is CHECKSUM_NONE
      and csum_valid is true, which will be the state after consuming
      a CHECKSUM_UNNECESSARY.
      
      For GRO path, added skb_gro_checksum_try_convert which is the GRO
      analogue of skb_checksum_try_convert. The primary condition to allow
      this is that NAPI_GRO_CB(skb)->csum_cnt == 0 and
      NAPI_GRO_CB(skb)->csum_valid is set. This implies that we have consumed
      all available CHECKSUM_UNNECESSARY checksums in the GRO path.
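      The gating condition for the normal-path conversion can be modeled in a few lines of userspace C. The enum values and helper below mirror the description above, not the actual kernel signatures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the skb_checksum_try_convert condition:
 * ip_summed must be CHECKSUM_NONE (the state after consuming a
 * CHECKSUM_UNNECESSARY) and the consumed checksum must be valid. */
enum ip_summed { CHECKSUM_NONE, CHECKSUM_UNNECESSARY, CHECKSUM_COMPLETE };

static bool checksum_try_convert(enum ip_summed *ip_summed, bool csum_valid)
{
    if (*ip_summed != CHECKSUM_NONE || !csum_valid)
        return false;
    *ip_summed = CHECKSUM_COMPLETE; /* caller would also fill skb->csum */
    return true;
}
```

      The GRO analogue substitutes NAPI_GRO_CB(skb)->csum_cnt == 0 and csum_valid for the two conditions.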
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d96535a1
    • T
      net: Support for csum_bad in skbuff · 5a212329
      Committed by Tom Herbert
      This flag indicates that an invalid checksum was detected in the
      packet. __skb_mark_checksum_bad helper function was added to set this.
      
      Checksums can be marked bad from a driver or the GRO path (the latter
      is implemented in this patch). csum_bad is checked in
      __skb_checksum_validate_complete (i.e. calling that when ip_summed ==
      CHECKSUM_NONE).
      
      csum_bad works in conjunction with ip_summed value. In the case that
      ip_summed is CHECKSUM_NONE and csum_bad is set, this implies that the
      first (or next) checksum encountered in the packet is bad. When
      ip_summed is CHECKSUM_UNNECESSARY, the first checksum after the last
      one validated is bad. For example, if ip_summed == CHECKSUM_UNNECESSARY,
      csum_level == 1, and csum_bad is set, then the third checksum in the
      packet is bad. In the normal path, the packet will be dropped when
      processing the protocol layer of the bad checksum:
      __skb_decr_checksum_unnecessary is called twice for the good checksums,
      changing ip_summed to CHECKSUM_NONE, so that
      __skb_checksum_validate_complete is called to validate the third
      checksum; that validation will fail since csum_bad is set.
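      The indexing rule above can be captured as a small helper, mapping the state to the zero-based position of the bad checksum in the packet. The function is invented for illustration; it is not a kernel API.

```c
#include <assert.h>
#include <stdbool.h>

enum ip_summed { CHECKSUM_NONE, CHECKSUM_UNNECESSARY };

/* Hypothetical mapping of (ip_summed, csum_level, csum_bad) to the
 * zero-based index of the bad checksum; -1 if nothing is marked bad. */
static int bad_csum_index(enum ip_summed ip_summed, int csum_level,
                          bool csum_bad)
{
    if (!csum_bad)
        return -1;
    if (ip_summed == CHECKSUM_NONE)
        return 0;               /* first (or next) checksum is bad */
    return csum_level + 1;      /* first after the validated ones */
}
```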
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5a212329
    • H
      r8152: rename rx_buf_sz · 52aec126
      Committed by hayeswang
      The variable "rx_buf_sz" is used by both tx and rx buffers. Replace
      it with "agg_buf_sz".
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52aec126
    • F
      net: phy: mdio-bcm-unimac: NULL-terminate unimac_mdio_ids · 4559154a
      Committed by Florian Fainelli
      drivers/net/phy/mdio-bcm-unimac.c:195:37-38: unimac_mdio_ids is not NULL
      terminated at line 195
      
      Make sure of_device_id tables are NULL terminated
      Generated by: scripts/coccinelle/misc/of_table.cocci
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4559154a
    • F
      net: dsa: make dsa_pack_type static · 61b7363f
      Committed by Florian Fainelli
      net/dsa/dsa.c:624:20: sparse: symbol 'dsa_pack_type' was not declared.
      Should it be static?
      
      Fixes: 3e8a72d1 ("net: dsa: reduce number of protocol hooks")
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      61b7363f
    • N
      bonding: add slave_changelink support and use it for queue_id · 0f23124a
      Committed by Nikolay Aleksandrov
      This patch adds support for slave_changelink to the bonding and uses it
      to give the ability to change the queue_id of the enslaved devices via
      netlink. It sets slave_maxtype and uses bond_changelink as a prototype for
      bond_slave_changelink.
      Example/test command after the iproute2 patch:
       ip link set eth0 type bond_slave queue_id 10
      
      CC: David S. Miller <davem@davemloft.net>
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      Suggested-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
      Acked-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0f23124a
    • S
      tcp: whitespace fixes · 688d1945
      Committed by stephen hemminger
      Fix places where there is space before tab, long lines, awkward
      if(){ placement, double spacing, etc. Add a blank line after
      declarations/initializations.
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      688d1945
    • F
      net: systemport: tell RXCHK if we are using Broadcom tags · d09d3038
      Committed by Florian Fainelli
      When Broadcom tags are enabled, e.g: when interfaced to an Ethernet
      switch, make sure that we tell the RXCHK engine that it should be
      expecting a 4-bytes Broadcom tag after the Ethernet MAC Source Address.
      
      Use netdev_uses_dsa() to check for that condition since that will tell
      us if a switch is attached to our network interface.
      
      Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d09d3038
    • J
      pktgen: add flag NO_TIMESTAMP to disable timestamping · afb84b62
      Committed by Jesper Dangaard Brouer
      When testing the TX limits of the stack, it is useful to be able
      to disable the do_gettimeofday() timestamping on every packet.
      
      This implements a pktgen flag NO_TIMESTAMP which will disable this
      call to do_gettimeofday().
      
      On my system (E5-2695), with skb_clone=0, performance goes from
      TX 2,423,751 pps to 2,567,165 pps with flag NO_TIMESTAMP. Thus,
      skipping do_gettimeofday() saves approximately 23 ns per packet.
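      The 23 ns figure follows directly from the two packet rates; a tiny check of that arithmetic:

```c
#include <assert.h>

/* Convert a packets-per-second rate to nanoseconds per packet; the
 * difference between the two figures above is the per-packet cost
 * of the do_gettimeofday() call. */
static double ns_per_packet(double pps)
{
    return 1e9 / pps;
}
```

      1e9/2423751 is about 412.6 ns and 1e9/2567165 about 389.5 ns, so the saving is roughly 23 ns per packet, matching the commit's claim.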
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      afb84b62
    • D
      bnx2x: fix tunneled GSO over IPv6 · 05f8461b
      Committed by Dmitry Kravkov
      Set the correct bit in the packet description.
      
      Introduced in e42780b6
          bnx2x: Utilize FW 7.10.51
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      05f8461b
    • D
      bnx2x: prevent incorrect byte-swap in BE · 55ef5c89
      Committed by Dmitry Kravkov
      Fixes an incorrectly defined struct in the FW HSI for big-endian
      platforms. Affects tunneling, tx-switching and anti-spoofing.
      
      Introduced in e42780b6
          bnx2x: Utilize FW 7.10.51
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      55ef5c89
    • E
      tipc: add name distributor resiliency queue · a5325ae5
      Committed by Erik Hugne
      TIPC name table updates are distributed asynchronously in a cluster,
      entailing a risk of certain race conditions. E.g., if two nodes
      simultaneously issue conflicting (overlapping) publications, this may
      not be detected until both publications have reached a third node, in
      which case one of the publications will be silently dropped on that
      node. Hence, we end up with an inconsistent name table.
      
      In most cases this conflict is just a temporary race, e.g., one
      node is issuing a publication under the assumption that a previous,
      conflicting, publication has already been withdrawn by the other node.
      However, because of the (rtt related) distributed update delay, this
      may not yet hold true on all nodes. The symptom of this failure is a
      syslog message: "tipc: Cannot publish {%u,%u,%u}, overlap error".
      
      In this commit we add a resiliency queue at the receiving end of
      the name table distributor. When insertion of an arriving publication
      fails, we retain it in this queue for a short amount of time, assuming
      that another update will arrive very soon and clear the conflict. If it
      does, we insert the publication; otherwise we drop it.
      
      The (configurable) retention value defaults to 2000 ms. Knowing from
      experience that the situation described above is extremely rare, there
      is no risk that the queue will accumulate any large number of items.
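      A rough userspace model of this retry-then-drop behavior, with every name and the conflict flag invented purely for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of the resiliency queue: a publication that
 * fails to insert is retained until an expiry time; later updates
 * retry it, and anything past the (configurable, 2000 ms default)
 * retention period is dropped. */
struct deferred_publ {
    int key;                  /* stand-in for the {type,lower,upper} triple */
    long expires_ms;
    struct deferred_publ *next;
};

static struct deferred_publ *defer_queue;
static bool conflict_cleared; /* models the conflicting entry being withdrawn */

static bool try_insert(int key)
{
    (void)key;
    return conflict_cleared;  /* insertion succeeds once the conflict clears */
}

static void defer_publ(int key, long now_ms, long retention_ms)
{
    struct deferred_publ *d = malloc(sizeof(*d));
    d->key = key;
    d->expires_ms = now_ms + retention_ms;
    d->next = defer_queue;
    defer_queue = d;
}

/* On every name-table update, retry deferred publications: insert
 * those that now succeed, drop those past retention, keep the rest. */
static void process_deferred(long now_ms)
{
    struct deferred_publ **p = &defer_queue;
    while (*p) {
        struct deferred_publ *d = *p;
        if (try_insert(d->key) || now_ms >= d->expires_ms) {
            *p = d->next;
            free(d);
        } else {
            p = &d->next;
        }
    }
}
```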
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a5325ae5
    • E
      tipc: refactor name table updates out of named packet receive routine · f4ad8a4b
      Committed by Erik Hugne
      We need to perform the same actions when processing deferred name
      table updates, so this functionality is moved to a separate
      function.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f4ad8a4b
    • H
      r8152: reduce the number of Tx · 1764bcd9
      Committed by hayeswang
      Because the Tx path has queue stopping and aggregation, we don't
      need many tx buffers. Change the number of tx buffers from 10 to 4
      to reduce memory usage. This saves 6 buffers * 16 KB of memory.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1764bcd9
    • D
      Merge branch 'xmit_list' · 53fda7f7
      Committed by David S. Miller
      David Miller says:
      
      ====================
      net: Make dev_hard_start_xmit() work fundamentally on lists
      
      After this patch set, dev_hard_start_xmit() will work fundamentally on
      any and all SKB lists.
      
      This opens the path for a clean implementation of pulling multiple
      packets out during qdisc_restart(), and then passing that blob in one
      shot to dev_hard_start_xmit().
      
      There were two main architectural blockers to this:
      
      1) The GSO handling, we kept the original GSO head SKB around simply
         because dev_hard_start_xmit() had no way to communicate to the
         caller how far into the segmented list it was able to go.  Now it
         can, so the head GSO can be liberated immediately.
      
         All of the special GSO head SKB destructor et al. handling goes
         away too.
      
      2) Validation of VLAN, CSUM, and segmentation characteristics was being
         performed inside of dev_hard_start_xmit().  If we want to truly batch,
         we have to let the higher levels do this.  In particular, this is
         now dequeue_skb()'s job.
      
      And with those two issues out of the way, it should now be trivial to
      build experiments on top of this patch set, all of the framework
      should be there now.  You could do something as simple as:
      
      	skb = q->dequeue(q);
      	if (skb)
      		skb = validate_xmit_skb(skb, qdisc_dev(q));
      	if (skb) {
      		struct sk_buff *new, *head = skb;
      		int limit = 5;
      
      		do {
      			new = q->dequeue(q);
      			if (new)
      				new = validate_xmit_skb(new, qdisc_dev(q));
      			if (new) {
      				skb->next = new;
      				skb = new;
      			}
      		} while (new && --limit);
      		skb = head;
      	}
      
      inside of the else branch of dequeue_skb().
      Signed-off-by: David S. Miller <davem@davemloft.net>
      53fda7f7