Commits · 8d4057a938481351dc690fbe23e8c72af08d5890 · openeuler / Kernel

01 May, 2012 8 commits

tg3: provide frags as skb head · 8d4057a9

Authored by Eric Dumazet on Apr 27, 2012

This patch converts tg3 driver, one of our reference drivers, to use new
build_skb() api in frag mode.

Instead of using kmalloc() to allocate the memory block that will be
used by build_skb() as skb->head, we use a page fragment.

This is a followup of patch "net: allow skb->head to be a page fragment"

This allows GRO, TCP coalescing, and splice() to be more efficient.

Incidentally, this also removes SLUB slow path contention in kfree().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: allow skb->head to be a page fragment · d3836f21

Authored by Eric Dumazet on Apr 27, 2012

skb->head is currently allocated from kmalloc(). This is convenient but
has the drawback the data cannot be converted to a page fragment if
needed.

We have three spots where it hurts:

1) GRO aggregation

 When a linear skb must be appended to another skb, GRO uses the
frag_list fallback, very inefficient since we keep all struct sk_buff
around. So drivers enabling GRO but delivering linear skbs to network
stack aren't enabling full GRO power.

2) splice(socket -> pipe).

 We must copy the linear part to a page fragment.
 This kind of defeats splice() purpose (zero copy claim)

3) TCP coalescing.

 Recently introduced, this permits grouping several contiguous segments
into a single skb. This shortens queue lengths, saves kernel memory,
and greatly reduces the probability of TCP collapses. This coalescing
doesn't work on linear skbs (or we would need to copy data, which would be
too slow).

Given all these issues, the following patch introduces the possibility
of having skb->head be a fragment in itself. We use a new skb flag,
skb->head_frag to carry this information.

build_skb() is changed to accept a frag_size argument. Drivers willing
to provide a page fragment instead of kmalloc() data will set a non zero
value, set to the fragment size.

Then, in situations where we need to convert the skb head to a frag in itself,
we can check if skb->head_frag is set and avoid the copies or various
fallbacks we have.

This means drivers currently using frags could be updated to avoid the
current skb->head allocation and reduce their memory footprint (aka skb
truesize); that's 512 or 1024 bytes saved per skb. This also makes
bpf/netfilter faster since the 'first frag' will be part of the skb linear
part, with no need to copy data.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

forcedeth: add transmit timestamping support · 49cbb1c1

Authored by Willem de Bruijn on Apr 27, 2012

Insert an skb_tx_timestamp call in both ndo_start_xmit routines.
Tested to work for the nv_start_xmit_optimized case.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: add transmit timestamping support · 8373c57d

Authored by Willem de Bruijn on Apr 27, 2012

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: add transmit timestamping support · 80be3129

Authored by Willem de Bruijn on Apr 27, 2012

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1000: add transmit timestamping support · eab467f5

Authored by Willem de Bruijn on Apr 27, 2012

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Fix fatal typo in setup of multicast_querier_expired · bb63f1f8

Authored by Herbert Xu on Apr 30, 2012

Unfortunately it seems that I didn't properly test the case of
an expired external querier in the recent multicast bridge series.

The setup of the timer in that case is completely broken and leads
to a NULL-pointer dereference.  This patch fixes it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: Add missing net/net/ip6_checksum.h include. · d499bd2e

Authored by David S. Miller on Apr 30, 2012

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Apr, 2012 10 commits

net/l2tp: add support for L2TP over IPv6 UDP · d2cf3361

Authored by Benjamin LaHaise on Apr 27, 2012

Now that encap_rcv() works on IPv6 UDP sockets, wire L2TP up to IPv6.
Support has been tested with and without hardware offloading.  This
version fixes the L2TP over localhost issue with incorrect checksums
being reported.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/ipv6/udp: UDP encapsulation: introduce encap_rcv hook into IPv6 · d7f3f621

Authored by Benjamin LaHaise on Apr 27, 2012

Now that the semantics of udpv6_queue_rcv_skb() match IPv4's
udp_queue_rcv_skb(), introduce the UDP encap_rcv() hook for IPv6.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/ipv6/udp: UDP encapsulation: move socket locking into udpv6_queue_rcv_skb() · cb80ef46

Authored by Benjamin LaHaise on Apr 27, 2012

In order to make sure that when the encap_rcv() hook is introduced it is
not called with the socket lock held, move socket locking from callers into
udpv6_queue_rcv_skb(), matching what happens in IPv4.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb · f7ad74fe

Authored by Benjamin LaHaise on Apr 27, 2012

This is the first step in reworking the IPv6 UDP code to be structured more
like the IPv4 UDP code.  This patch creates __udpv6_queue_rcv_skb() with
semantics equivalent to __udp_queue_rcv_skb(), and wires it up to the
backlog_rcv method.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · a319726a

Authored by David S. Miller on Apr 28, 2012

drivers/net/oki-semi: Donot recompute IP header checksum · 62ecc379

Authored by RongQing.Li on Apr 26, 2012

If I understand correctly, NETIF_F_IP_CSUM only means the hardware
will compute the TCP/UDP checksum; the IP checksum is always computed
in software.

So the workaround for hardware that cannot compute the checksum of
small packets does not need to compute the IP header checksum.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net/oki-semi: Remove the definition of PCH_GBE_ETH_ALEN · d89bdff1

Authored by RongQing.Li on Apr 26, 2012

PCH_GBE_ETH_ALEN is equal to ETH_ALEN, so we can replace it with
ETH_ALEN.

If they are not equal, it must be a bug, since this is ethernet,
and the address has already been stored to mc_addr_list as ETH_ALEN
bytes when calling pch_gbe_mac_mc_addr_list_update.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/at91_ether: use gpio_to_irq for phy IRQ line · 86cc070e

Authored by Nicolas Ferre on Apr 26, 2012

Use the gpio_to_irq() function to retrieve the phy IRQ line
from the GPIO pin specification.
This fix is needed now that we have moved to irqdomains on AT91.

Reported-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Andrew Victor <avictor.za@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

AT91: Remove fixed mapping for AT91RM9200 ethernet · c5f0f83c

Authored by Andrew Victor on Apr 26, 2012

The AT91RM9200 Ethernet controller still has a fixed IO mapping.
So:
* Remove the fixed IO mapping and AT91_VA_BASE_EMAC definition.
* Pass the physical base-address via platform-resources to the driver.
* Convert at91_ether.c driver to perform an ioremap().
* Ethernet PHY detection needs to be performed during the driver
  initialization process, it can no longer be done first.

Signed-off-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Fixed a coding style issue related to spaces. · cb75a36c

Authored by Jeffrin Jose on Apr 25, 2012

Fixed a coding style issue relating to spaces
in net/core/sock.c.

Signed-off-by: Jeffrin Jose <ahiliation@yahoo.co.in>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Apr, 2012 21 commits

ixgbe: check for WoL support in single function · 8e2813f5

Authored by Jacob Keller on Apr 21, 2012

This patch consolidates the case logic for checking whether a device supports
WoL into a single place. Previously ethtool and probe used similar logic that
was copied and maintained separately. This patch encapsulates the core logic
into a function so that a user only has to update one place.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

igb: Force flow control off during reset when forcing speed. · a27416bb

Authored by Matthew Vick on Apr 18, 2012

During igb_reset(), we initiate a hardware reset which will clear our
flow control settings. For auto-negotiation, we re-negotiate them when
linking up again, but we need to force them off properly for the forced
speed case.

Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: 82579 potential system hang on stress when ME enabled · bdc125f7

Authored by Bruce Allan on Mar 20, 2012

Previously, a workaround was added to address a hardware bug in the
PCIm2PCI arbiter where a write by the driver of the Transmit/Receive
Descriptor Tail register could happen concurrently with a write of any
MAC CSR register by the Manageability Engine (ME) which could cause the
Tail register to have an incorrect value. The arbiter is supposed to
prevent the concurrent writes but there is a bug that can cause the Host
(driver) access to be acknowledged later than it should.
After further investigation, it was discovered that a driver write access
of any MAC CSR register after being idle for some time can be lost when
ME is accessing a MAC CSR register. When this happens, no further target
access is claimed by the MAC which could hang the system.
The workaround to check bit 24 in the FWSM register (set only when ME is
accessing a MAC CSR register) and delay for a limited amount of time until
it is cleared is now done for all driver writes of MAC CSR registers on
82579 with ME enabled. In the rare case when the driver is writing the
Tail register and ME is accessing any MAC CSR register for a duration
longer than the maximum delay, write the register and verify it has the
correct value before continuing, otherwise reset the device.

This patch also moves some pre-existing macros from the hardware-specific
header file to the more appropriate generic driver header file.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
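
The bounded wait described in the message above (check bit 24 of FWSM, give up after a limited time) could be sketched in plain C as follows. This is only an illustration and not the driver's code: the macro name, the callback type, the poll budget and the fake register are all assumptions made for the example, and a real driver would insert a short delay between polls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 24 of FWSM, per the commit message; the macro name is made up here. */
#define FWSM_ME_BUSY_BIT (UINT32_C(1) << 24)

/* Hypothetical register accessor so the sketch can run without hardware. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until bit 24 clears or the poll budget is exhausted.
 * Returns true if the bit was seen clear, false on timeout.
 */
static bool wait_for_me_idle(read_reg_fn read_fwsm, void *ctx,
                             unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if ((read_fwsm(ctx) & FWSM_ME_BUSY_BIT) == 0)
            return true;
        /* A real driver would delay briefly here; omitted in this sketch. */
    }
    return false;
}

/* Fake register: reports "busy" for a fixed number of reads, then idle. */
struct fake_fwsm {
    unsigned int busy_reads_left;
};

static uint32_t fake_fwsm_read(void *ctx)
{
    struct fake_fwsm *f = ctx;

    if (f->busy_reads_left > 0) {
        f->busy_reads_left--;
        return FWSM_ME_BUSY_BIT;
    }
    return 0;
}
```

On timeout, the message says the Tail register is written anyway and then verified, with a device reset if the value is wrong; that fallback is left out of the sketch.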

e1000e: 82579 packet drop workaround · 36ceeb43

Authored by Bruce Allan on Mar 20, 2012

In K1 mode (a MAC/PHY interconnect power mode), the 82579 device shuts down
the Phase Lock Loop (PLL) of the interconnect to save power. When the PLL
starts working, the 82579 device may start to transfer the packet through
the interconnect before it is fully functional causing packet drops. This
workaround disables shutting down the PLL in K1 mode for 1G link speed.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: Enable DMA Burst Mode on 82574 by default. · 2cb7a9cc

Authored by Matthew Vick on Mar 16, 2012

Performance testing has shown that enabling DMA burst on 82574
improves performance on small packets, so enable it by default.

Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

e1000e: Disable Far-End LoopBack following reset on 80003ES2LAN. · 1c1093a4

Authored by Matthew Vick on Mar 16, 2012

80003ES2LAN has an errata such that far-end loopback may be activated by
bit errors producing a reserved symbol. In order to disable far-end
loopback quickly enough, disable it immediately following a reset.

Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

net: cleanups in sock_setsockopt() · 82981930

Authored by Eric Dumazet on Apr 26, 2012

Use min_t()/max_t() macros, reformat two comments, use !!test_bit() to
match !!sock_flag().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
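
For readers unfamiliar with the idioms named in the message above, here is a rough userspace sketch. The two macros are simplified stand-ins written for this example, not the kernel definitions (they evaluate their arguments more than once), and flag_is_set() is a hypothetical helper showing what the !! normalization achieves.

```c
/* Simplified stand-ins: cast both operands to one type before comparing. */
#define min_t(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))
#define max_t(type, a, b) ((type)(a) > (type)(b) ? (type)(a) : (type)(b))

/*
 * Hypothetical helper: !! collapses any non-zero masked value to exactly 1,
 * so results from different flag tests compare equal when both are set.
 */
static int flag_is_set(unsigned long flags, unsigned int bit)
{
    return !!(flags & (1UL << bit));
}
```

The explicit type makes the comparison rule visible: min_t(unsigned int, -1, 10) compares UINT_MAX with 10 and therefore yields 10.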

net: doc: merge /proc/sys/net/core/* documents into one place · c60f6aa8

Authored by Shan Wei on Apr 26, 2012

The parameter descriptions for /proc/sys/net/core/* are currently split
across two places. So, merge them into Documentation/sysctl/net.txt.

Signed-off-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: update the driver version · 06b0ab37

Authored by Ajit Khaparde on Apr 26, 2012

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix speed displayed by ethtool on certain SKUs · 2a89611a

Authored by Ajit Khaparde on Apr 26, 2012

The logical speed returned by link_status_query needs to be multiplied by 10.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Ignore status of some ioctls during driver load · ddc3f5cb

Authored by Ajit Khaparde on Apr 26, 2012

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

NET: smsc-ircc2: mark non-experimental · 97076d27

Authored by Linus Walleij on Apr 26, 2012

This has been used by me and others for ages, let's stop calling it
experimental.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: bond_update_speed_duplex() can return void since no callers check its return · 13b95fb7

Authored by Rick Jones on Apr 26, 2012

As none of the callers of bond_update_speed_duplex (need to) check its
return value, there is little point in it returning anything.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Allow a predefined set of capture masks for FW dump · 4fbec4d8

Authored by Manish Chopra on Apr 26, 2012

o 0x3, 0x7, 0xF, 0x1F, 0x3F, 0x7F and 0xFF are the allowed capture masks.
o Updated driver version to 5.0.28

Signed-off-by: Manish chopra <manish.chopra@qlogic.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
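
The seven allowed masks listed above share a pattern: each is a contiguous run of low bits, from two bits (0x3) up to eight (0xFF). A check for that pattern might look like the sketch below. The helper name is hypothetical and this is not taken from the qlcnic driver, which may well validate the value differently (for example with an explicit list).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the driver's code: accept only the masks named
 * in the commit message (0x3, 0x7, 0xF, 0x1F, 0x3F, 0x7F, 0xFF).
 */
static bool capture_mask_allowed(uint32_t mask)
{
    /* A value of the form 2^n - 1 shares no bits with its successor. */
    bool low_bits_contiguous = (mask & (mask + 1)) == 0;

    /* Restrict n to 2..8, i.e. 0x3 through 0xFF. */
    return low_bits_contiguous && mask >= 0x3 && mask <= 0xFF;
}
```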

qlcnic: Adding mac statistics to ethtool. · 54a8997c

Authored by Jitendra Kalsaria on Apr 26, 2012

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Register device in FAILED state. · b43e5ee7

Authored by Sucheta Chakraborty on Apr 26, 2012

o Without failing probe, register netdevice when device is in FAILED state.
o Device will come up with minimum functionality.

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

crush: include header for global symbols · feb50ac1

Authored by hartleys on Apr 24, 2012

Include the header to pick up the definitions of the global symbols.

Quiets the following sparse warnings:

warning: symbol 'crush_find_rule' was not declared. Should it be static?
warning: symbol 'crush_do_rule' was not declared. Should it be static?

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Sage Weil <sage@newdream.net>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

isdn/eicon: use standard __init,__exit function markup · d7398892

Authored by hartleys on Apr 24, 2012

Remove the custom DIVA_{INIT,EXIT}_FUNCTION defines and use
the standard __init,__exit markup.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Armin Schindler <mac@melware.de>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: RTAX_FEATURE_ALLFRAG causes inefficient TCP segment sizing · 67469601

Authored by Eric Dumazet on Apr 24, 2012

Quoting Tore Anderson from:
https://bugzilla.kernel.org/show_bug.cgi?id=42572

When RTAX_FEATURE_ALLFRAG is set on a route, the effective TCP segment
size does not take into account the size of the IPv6 Fragmentation
header that needs to be included in outbound packets, causing every
transmitted TCP segment to be fragmented across two IPv6 packets, the
latter of which will only contain 8 bytes of actual payload.

RTAX_FEATURE_ALLFRAG is typically set on a route in response to
receiving an ICMPv6 Packet Too Big message indicating a Path MTU of less
than 1280 bytes. 1280 bytes is the minimum IPv6 MTU, however ICMPv6
PTBs with MTU < 1280 are still valid, in particular when an IPv6
packet is sent to an IPv4 destination through a stateless translator.
Any ICMPv4 Need To Fragment packets originated from the IPv4 part of
the path will be translated to ICMPv6 PTB which may then indicate an
MTU of less than 1280.

The Linux kernel refuses to reduce the effective MTU to anything below
1280 bytes, instead it sets it to exactly 1280 bytes, and
RTAX_FEATURE_ALLFRAG is also set. However, the TCP segment size appears
to be set to 1240 bytes (1280 Path MTU - 40 bytes of IPv6 header),
instead of 1232 (additionally taking into account the 8 bytes required
by the IPv6 Fragmentation extension header).

This in turn results in rather inefficient transmission, as every
transmitted TCP segment now is split in two fragments containing
1232+8 bytes of payload.

After this patch, all outgoing packets that include a
Fragmentation header are "atomic" or "non-fragmented" fragments,
i.e., they have both Offset=0 and More Fragments=0.

With help from David S. Miller.

Reported-by: Tore Anderson <tore@fud.no>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Tom Herbert <therbert@google.com>
Tested-by: Tore Anderson <tore@fud.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
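
The sizes quoted in the report above follow from simple subtraction, reproduced in the sketch below. The constant and function names are invented for the example; only the numbers (1280, 40, 8) come from the message, and "budget" here means the bytes left after the IPv6 header, the same quantity the report calls the TCP segment size.

```c
enum {
    IPV6_MIN_MTU_BYTES  = 1280, /* minimum IPv6 MTU cited in the report */
    IPV6_HDR_BYTES      = 40,   /* fixed IPv6 header */
    IPV6_FRAG_HDR_BYTES = 8     /* IPv6 Fragmentation extension header */
};

/*
 * Bytes left in one packet after the IPv6 header, and additionally after
 * the Fragmentation header when every packet must carry one (allfrag).
 */
static int payload_budget(int path_mtu, int allfrag)
{
    int budget = path_mtu - IPV6_HDR_BYTES;

    if (allfrag)
        budget -= IPV6_FRAG_HDR_BYTES;
    return budget;
}
```

With a path MTU of 1280, the budget is 1240 bytes when the Fragmentation header is ignored and 1232 when it is counted. Sizing segments to 1240 therefore leaves 8 bytes that spill into a second fragment, which matches the 1232+8 split described in the report.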

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next · a85c9bb8

Authored by David S. Miller on Apr 26, 2012

Merge branch 'master' of... · d9b8ae6b

Authored by John W. Linville on Apr 26, 2012

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem

Conflicts:
	drivers/net/wireless/iwlwifi/iwl-testmode.c

26 Apr, 2012 1 commit

tcp repair: Fix unaligned access when repairing options (v2) · de248a75

Authored by Pavel Emelyanov on Apr 25, 2012

Don't pick __u8/__u16 values directly from raw pointers, but instead use
an array of structures of code:value pairs. This is OK, since the buffer
we take options from is not an skb memory, but a user-to-kernel one.

For those options which don't require any value now, require this to be
zero (for potential future extension of this API).

v2: Changed tcp_repair_opt to use two __u32-s as spotted by David Laight.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
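
To illustrate the layout the message above describes, here is a small userspace sketch of walking an array of code:value pairs by index rather than picking bytes out of a raw buffer. The struct is a local stand-in with two 32-bit fields, as the v2 note describes; its name, the field names and the lookup helper are assumptions for the example, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in: two 32-bit fields, so every element is naturally aligned. */
struct repair_opt {
    uint32_t code;
    uint32_t value;
};

/*
 * Hypothetical lookup: scan the array by index and copy out the value of
 * the first entry whose code matches. Returns 0 on success, -1 if absent.
 */
static int find_opt(const struct repair_opt *opts, size_t count,
                    uint32_t code, uint32_t *value_out)
{
    for (size_t i = 0; i < count; i++) {
        if (opts[i].code == code) {
            *value_out = opts[i].value;
            return 0;
        }
    }
    return -1;
}
```

Because each element is accessed as a whole struct, no load straddles an alignment boundary, which is the problem the fix addresses.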

openeuler / Kernel: synced successfully more than 1 year ago