Commits · 4c0ec6544a0cd5e3eed08df2c14cf98185098abe · openeuler / Kernel

17 Nov, 2010 15 commits

ixgbe: remove unnecessary re-init of adapter on Rx-csum change · 4c0ec654

Authored by Alexander Duyck on Nov 16, 2010

There is no need to reset the adapter when changing the Rx checksum
settings. Since the only change is a software flag we can disable it
without needing to reset the entire adapter.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: DCB: credit max only needs to be gt TSO size for 82598 · 80ab193d

Authored by John Fastabend on Nov 16, 2010

The maximum credits per traffic class only needs to be greater
than the TSO size for 82598 devices. The 82599 devices do not
have this requirement so only do this test for 82598 devices.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: DCB set PFC high and low water marks per data sheet specs · 16b61beb

Authored by John Fastabend on Nov 16, 2010

Currently the high and low water marks for PFC are being set
conservatively for jumbo frames. This means the RX buffers
are being underutilized in the default 1500 MTU. This patch
fixes this so that the water marks are set as described in
the data sheet considering the MTU size.

The equation used is,

RTT * 1.44 + MTU * 1.44 + MTU

Where RTT is the round trip time and MTU is the max frame size
in KB. To avoid floating point arithmetic FC_HIGH_WATER is
defined

((((RTT + MTU) * 144) + 99) / 100) + MTU

This changes how the hardware field fc.low_water and
fc.high_water are used. With this change they are no longer
storing the actual low water and high water markers but are
storing the required head room in the buffer. This simplifies
the logic and we do not need to account for the size of the
buffer when setting the thresholds.

Testing with iperf and 16 threads showed a slight uptick in
throughput over a single traffic class (0.1-0.2 Gbps) and a reduction
in pause frames. Without the patch a 30 second run would show
~10-15 pause frames being transmitted; with the patch ~2-5 are
seen. Tests were run back to back with 82599.

Note RXPBSIZE is in KB and the low and high water mark fields are
also in KB. However the FCRT* registers have 32B granularity and the
value is shifted left by 5 into the register, so

(((rx_pbsize - water_mark) * 1024) / 32) << 5

is the most explicit conversion. Here we simplify it to

(rx_pbsize - water_mark) * 32 << 5 = (rx_pbsize - water_mark) << 10

This patch updates the PFC thresholds and legacy FC thresholds.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
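
The integer arithmetic in the message above can be checked with a small user-space sketch. This is not the driver code: the function names and the sample values (RTT of 10 KB, MTU of 2 KB, a 512 KB packet buffer) are assumptions chosen only to exercise the two formulas quoted in the commit message.

```c
#include <stdint.h>

/* Hypothetical helper: required headroom in KB, following the
 * FC_HIGH_WATER expression from the commit message.
 * rtt_kb and mtu_kb are both expressed in KB. */
static uint32_t fc_high_water_kb(uint32_t rtt_kb, uint32_t mtu_kb)
{
    /* (RTT + MTU) * 1.44, rounded up to a whole KB, plus one more MTU */
    return ((((rtt_kb + mtu_kb) * 144) + 99) / 100) + mtu_kb;
}

/* Explicit conversion: KB -> bytes -> 32-byte units, placed at bit 5. */
static uint32_t fcrt_explicit(uint32_t rx_pbsize_kb, uint32_t headroom_kb)
{
    return (((rx_pbsize_kb - headroom_kb) * 1024) / 32) << 5;
}

/* Simplified conversion: (1024 / 32) << 5 equals 1 << 10. */
static uint32_t fcrt_simplified(uint32_t rx_pbsize_kb, uint32_t headroom_kb)
{
    return (rx_pbsize_kb - headroom_kb) << 10;
}
```

With these sample inputs, (10 + 2) * 1.44 = 17.28 rounds up to 18 KB, so the headroom is 20 KB, and both conversions produce the same register value.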

ixgbevf: Update Version String and Copyright Notice · 66c87bd5

Authored by Greg Rose on Nov 16, 2010

Update version string and copyright notice.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ixgbe: delay rx_ring freeing · 1a51502b

Authored by Eric Dumazet on Nov 16, 2010

"cat /proc/net/dev" uses RCU protection only.

It's quite possible we call a driver get_stats() method while device is
dismantling and freeing its data structures.

So get_stats() methods must be very careful not to access driver private
data without appropriate locking.

In ixgbe case, we access rx_ring pointers. These pointers are freed in
ixgbe_clear_interrupt_scheme() and set to NULL; this can trigger a NULL
dereference in ixgbe_get_stats64().

A possible fix is to use RCU locking in ixgbe_get_stats64() and defer
rx_ring freeing until after a grace period in ixgbe_clear_interrupt_scheme().

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Tantilov, Emil S <emil.s.tantilov@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

net: reorder struct sock fields · b178bb3d

Authored by Eric Dumazet on Nov 16, 2010

Right now, fields in struct sock are not optimally ordered, because each
path (RX softirq, TX completion, RX user,  TX user) has to touch fields
that are contained in many different cache lines.

The really critical thing is to shrink number of cache lines that are
used at RX softirq time : CPU handling softirqs for a device can receive
many frames per second for many sockets. If load is too big, we can drop
frames at NIC level. RPS or multiqueue cards can help, but better reduce
latency if possible.

This patch starts with UDP protocol, then additional patches will try to
reduce latencies of other ones as well.

At RX softirq time, fields of interest for UDP protocol are :
(not counting ones in inet struct for the lookup)

Read/Written:
sk_refcnt   (atomic increment/decrement)
sk_rmem_alloc & sk_backlog.len (to check if there is room in queues)
sk_receive_queue
sk_backlog (if socket locked by user program)
sk_rxhash
sk_forward_alloc
sk_drops

Read only:
sk_rcvbuf (sk_rcvqueues_full())
sk_filter
sk_wq
sk_policy[0]
sk_flags

Additional notes :

- sk_backlog has one hole on 64bit arches. We can fill it to save 8
bytes.
- sk_backlog is used only if RX softirq handler finds the socket while
locked by user.
- sk_rxhash is written only once per flow.
- sk_drops is written only if queues are full.

Final layout :

[1] One section grouping all read/write fields, but placing rxhash and
sk_backlog at the end of this section.

[2] One section grouping all read fields in RX handler
   (sk_filter, sk_rcvbuf, sk_wq)

[3] Section used by other paths

I'll post a patch on its own to put sk_refcnt at the end of struct
sock_common so that it shares the same cache line as section [1].

New offsets on 64bit arch :

sizeof(struct sock)=0x268
offsetof(struct sock, sk_refcnt)  =0x10
offsetof(struct sock, sk_lock)    =0x48
offsetof(struct sock, sk_receive_queue)=0x68
offsetof(struct sock, sk_backlog)=0x80
offsetof(struct sock, sk_rmem_alloc)=0x80
offsetof(struct sock, sk_forward_alloc)=0x98
offsetof(struct sock, sk_rxhash)=0x9c
offsetof(struct sock, sk_rcvbuf)=0xa4
offsetof(struct sock, sk_drops) =0xa0
offsetof(struct sock, sk_filter)=0xa8
offsetof(struct sock, sk_wq)=0xb0
offsetof(struct sock, sk_policy)=0xd0
offsetof(struct sock, sk_flags) =0xe0

Instead of :

sizeof(struct sock)=0x270
offsetof(struct sock, sk_refcnt)  =0x10
offsetof(struct sock, sk_lock)    =0x50
offsetof(struct sock, sk_receive_queue)=0xc0
offsetof(struct sock, sk_backlog)=0x70
offsetof(struct sock, sk_rmem_alloc)=0xac
offsetof(struct sock, sk_forward_alloc)=0x10c
offsetof(struct sock, sk_rxhash)=0x128
offsetof(struct sock, sk_rcvbuf)=0x4c
offsetof(struct sock, sk_drops) =0x16c
offsetof(struct sock, sk_filter)=0x198
offsetof(struct sock, sk_wq)=0x88
offsetof(struct sock, sk_policy)=0x98
offsetof(struct sock, sk_flags) =0x130

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
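
To make the effect of the reordering concrete, the sketch below maps the offsets listed in the message above onto cache lines and counts how many distinct lines the RX softirq fields start in. It is only an illustration: the 64-byte line size is an assumption, the helper names are hypothetical, and only each field's starting offset is considered (a field that straddles two lines is counted once).

```c
#include <stddef.h>

#define CACHE_LINE_BYTES 64u /* assumption: 64-byte cache lines */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

struct field_offset {
    const char *name;
    size_t offset;
};

/* RX softirq fields, offsets copied from the "new offsets" list. */
static const struct field_offset new_layout[] = {
    {"sk_refcnt", 0x10},        {"sk_receive_queue", 0x68},
    {"sk_backlog", 0x80},       {"sk_rmem_alloc", 0x80},
    {"sk_forward_alloc", 0x98}, {"sk_rxhash", 0x9c},
    {"sk_rcvbuf", 0xa4},        {"sk_drops", 0xa0},
    {"sk_filter", 0xa8},        {"sk_wq", 0xb0},
    {"sk_policy", 0xd0},        {"sk_flags", 0xe0},
};

/* Same fields, offsets copied from the "instead of" list. */
static const struct field_offset old_layout[] = {
    {"sk_refcnt", 0x10},         {"sk_receive_queue", 0xc0},
    {"sk_backlog", 0x70},        {"sk_rmem_alloc", 0xac},
    {"sk_forward_alloc", 0x10c}, {"sk_rxhash", 0x128},
    {"sk_rcvbuf", 0x4c},         {"sk_drops", 0x16c},
    {"sk_filter", 0x198},        {"sk_wq", 0x88},
    {"sk_policy", 0x98},         {"sk_flags", 0x130},
};

static size_t cache_line_index(size_t offset)
{
    return offset / CACHE_LINE_BYTES;
}

/* Count distinct cache lines in which the listed fields start.
 * A 64-bit mask is enough for offsets below 64 * 64 = 4096 bytes. */
static size_t distinct_lines(const struct field_offset *f, size_t n)
{
    unsigned long long seen = 0;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned long long bit = 1ULL << cache_line_index(f[i].offset);
        if (!(seen & bit)) {
            seen |= bit;
            count++;
        }
    }
    return count;
}
```

Under these assumptions the listed fields start in 7 distinct cache lines with the old layout and in 4 with the new one.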

udp: use atomic_inc_not_zero_hint · c31504dc

Authored by Eric Dumazet on Nov 15, 2010

A UDP socket's refcount is usually 2, unless an incoming frame is going to
be queued in receive or backlog queue.

Using atomic_inc_not_zero_hint() reduces latency, because the
processor issues fewer memory transactions.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
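
The idea behind a hinted increment can be sketched in user space with C11 atomics. This is an analogy, not the kernel implementation: `inc_not_zero_hint` is a hypothetical name, and the point is only that a correct guess lets the first compare-and-swap succeed without a separate initial load of the counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Increment *v unless it is zero. `hint` is the caller's guess at the
 * current value; a wrong guess costs one extra loop iteration. */
static bool inc_not_zero_hint(atomic_int *v, int hint)
{
    int expected = hint;

    /* A zero hint carries no information, so read the real value. */
    if (expected == 0)
        expected = atomic_load(v);

    while (expected != 0) {
        /* On failure, `expected` is updated to the current value. */
        if (atomic_compare_exchange_strong(v, &expected, expected + 1))
            return true;
    }
    return false; /* counter was zero: the object is going away */
}
```

For a UDP socket whose refcount is usually 2, a caller would pass 2 as the hint, so the common case is a single atomic operation.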

vlan: remove ndo_select_queue() logic · 213b15ca

Authored by Eric Dumazet on Nov 11, 2010

Now that vlan devices are lockless, we don't need special
ndo_select_queue() logic. dev_pick_tx() will do the multiqueue stuff on
the real device transmit.

Suggested-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: lockless transmit path · 4af429d2

Authored by Eric Dumazet on Nov 10, 2010

vlan is a stacked device, like tunnels. We should use the lockless
mechanism we are using in tunnels and loopback.

This patch completely removes locking in TX path.

tx stat counters are added into existing percpu stat structure, renamed
from vlan_rx_stats to vlan_pcpu_stats.

Note : this partially reverts commit 2e59af3d (vlan: multiqueue vlan
device)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

macvlan: lockless tx path · 8ffab51b

Authored by Eric Dumazet on Nov 10, 2010

macvlan is a stacked device, like tunnels. We should use the lockless
mechanism we are using in tunnels and loopback.

This patch completely removes locking in TX path.

tx stat counters are added into existing percpu stat structure, renamed
from rx_stats to pcpu_stats.

Note : this reverts commit 2c114553 (macvlan: add multiqueue
capability)

Note : rx_errors converted to a 32bit counter, like tx_dropped, since
they don't need 64bit range.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

packet: Enhance AF_PACKET implementation to not require high order contiguous... · 0e3125c7

Authored by Neil Horman on Nov 16, 2010

packet: Enhance AF_PACKET implementation to not require high order contiguous memory allocation (v4)

Version 4 of this patch.

Change notes:
1) Removed extra memset. Didn't think kcalloc added a GFP_ZERO the way kzalloc did :)

Summary:
It was shown to me recently that systems under high load were driven very deep
into swap when tcpdump was run. The reason this happened was because the
AF_PACKET protocol has a SET_RINGBUFFER socket option that allows the user space
application to specify how many entries an AF_PACKET socket will have and how
large each entry will be. It seems the default setting for tcpdump is to set
the ring buffer to 32 entries of 64 KB each, which implies 32 order 5
allocations. That's difficult under good circumstances, and horrid under memory
pressure.

I thought it would be good to make that a bit more usable. I was going to do a
simple conversion of the ring buffer from contiguous pages to iovecs, but
unfortunately, the metadata which AF_PACKET places in these buffers can easily
span a page boundary, and given that these buffers get mapped into user space,
and the data layout doesn't easily allow for a change to padding between frames
to avoid that, a simple iovec change is just going to break user space ABI
consistency.

So I've done this: I've added a three-tiered mechanism to the af_packet set_ring
socket option. It attempts to allocate memory in the following order:

1) Using __get_free_pages with GFP_NORETRY set, so as to fail quickly without
digging into swap

2) Using vmalloc

3) Using __get_free_pages with GFP_NORETRY clear, causing us to try as hard as
needed to get the memory

The effect is that we don't disturb the system as much when we're under load,
while still being able to conduct tcpdumps effectively.

Tested successfully by me.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Maciej Żenczykowski <zenczykowski@gmail.com>
Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
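
The ordered fallback described in the message above can be sketched in user space as a list of allocators tried in turn. Everything here is a stand-in: `alloc_tiered` and the three stub allocators are hypothetical, and `malloc` merely plays the role of the kernel allocators named in the commit message.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t size);

/* Try each allocator in order and return the first non-NULL result.
 * *tier_used receives the index of the tier that succeeded, or -1. */
static void *alloc_tiered(const alloc_fn *tiers, size_t ntiers,
                          size_t size, int *tier_used)
{
    for (size_t i = 0; i < ntiers; i++) {
        void *p = tiers[i](size);
        if (p) {
            if (tier_used)
                *tier_used = (int)i;
            return p;
        }
    }
    if (tier_used)
        *tier_used = -1;
    return NULL;
}

/* Stand-in for tier 1: a cheap attempt that gives up immediately,
 * simulating a contiguous allocation failing under memory pressure. */
static void *alloc_fast_fail(size_t size)
{
    (void)size;
    return NULL;
}

/* Stand-in for tier 2: an allocator without the contiguity constraint. */
static void *alloc_fallback(size_t size)
{
    return malloc(size);
}

/* Stand-in for tier 3: the same request as tier 1, trying as hard as needed. */
static void *alloc_try_hard(size_t size)
{
    return malloc(size);
}
```

The ordering is the design point: the cheap attempt comes first so a loaded system is not pushed into swap, and the expensive attempt is only reached when the first two tiers have failed.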

drivers/isdn/mISDN: Use printf extension %pV · 020f01eb

Authored by Joe Perches on Nov 09, 2010

Using %pV reduces the number of printk calls and
eliminates any possible message interleaving from
other printk calls.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netlink: let nlmsg and nla functions take pointer-to-const args · 3654654f

Authored by Jan Engelhardt on Nov 16, 2010

The changed functions do not modify the NL messages and/or attributes
at all. They should use const (similar to strchr), so that callers
which have a const nlmsg/nlattr around can make use of them without
casting.

While at it, constify a data array.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: fix missing in6_ifa_put in addrconf · 9d82ca98

Authored by John Fastabend on Nov 15, 2010

Fix ref count bug introduced by

commit 2de79570
Author: Lorenzo Colitti <lorenzo@google.com>
Date:   Wed Oct 27 18:16:49 2010 +0000

ipv6: addrconf: don't remove address state on ifdown if the address
is being kept

Fix logic so that addrconf_ifdown() decrements the inet6_ifaddr
refcnt correctly with in6_ifa_put().

Reported-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 · b5e41567

Authored by David S. Miller on Nov 16, 2010

16 Nov, 2010 25 commits

net: Export netif_get_vlan_features(). · 6b353088

Authored by David S. Miller on Nov 15, 2010

ERROR: "netif_get_vlan_features" [drivers/net/xen-netfront.ko] undefined!

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: Fix build warnings · 1f4f067f

Authored by Vasanthy Kolluri on Nov 15, 2010

Fix data type of argument passed to pci_alloc_consistent and pci_free_consistent routines.

Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hso: Fix unused variable warning · ce5a1213

Authored by Alan Cox on Nov 15, 2010

Fallout from the TIOCGICOUNT work

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: add RCU annotations to bridge port lookup · ec1e5610

Authored by Eric Dumazet on Nov 15, 2010

br_port_get() renamed to br_port_get_rtnl() to make clear RTNL is held.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: fix RCU races with bridge port · b5ed54e9

Authored by stephen hemminger on Nov 15, 2010

The macro br_port_exists() is not enough protection when only
RCU is being used. There is a tiny race where another CPU has cleared the
port handler hook, but the bridge port flag might still be set.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdev: add rcu annotations to receive handler hook · 61391cde

Authored by stephen hemminger on Nov 15, 2010

Suggested by Eric's bridge RCU changes.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: add proper RCU annotation to should_route_hook · a386f990

Authored by Eric Dumazet on Nov 15, 2010

Add br_should_route_hook_t typedef; this is the only way we can
get a clean RCU implementation for a function pointer.

Move route_hook to location where it is used.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: add RCU annotation to bridge multicast table · e8051688

Authored by Eric Dumazet on Nov 15, 2010

Add modern __rcu annotations to bridge multicast table.
Use newer hlist macros to avoid direct access to hlist internals.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/ipv6/mcast.c: Remove unnecessary semicolons · 8a22c99a

Authored by Joe Perches on Nov 14, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

include/net/caif/cfctrl.h: Remove unnecessary semicolons · d577f1cc

Authored by Joe Perches on Nov 14, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

include/linux/if_macvlan.h: Remove unnecessary semicolons · c59504eb

Authored by Joe Perches on Nov 14, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net/cnic.c: Remove unnecessary semicolons · 779bb41d

Authored by Joe Perches on Nov 14, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net/ixgbe: Remove unnecessary semicolons · e81a1ba8

Authored by Joe Perches on Nov 14, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net/e1000e: Remove unnecessary semicolons · 1d51c418

Authored by Joe Perches on Nov 14, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net/bnx2x: Remove unnecessary semicolons · 6f38ad93

Authored by Joe Perches on Nov 14, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/isdn: Remove unnecessary semicolons · ad65ffd1

Authored by Joe Perches on Nov 14, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6 · 6c1b6c6b

Authored by David S. Miller on Nov 15, 2010

net: Simplify RX queue allocation · fe822240

Authored by Tom Herbert on Nov 09, 2010

This patch moves RX queue allocation to alloc_netdev_mq and freeing of
the queues to free_netdev (symmetric to TX queue allocation). Each
kobject RX queue takes a reference to the queue's device so that the
device can't be freed before all the kobjects have been released; this
obviates the need for reference counts specific to RX queues.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Move TX queue allocation to alloc_netdev_mq · ed9af2e8

Authored by Tom Herbert on Nov 09, 2010

TX queues are now allocated in alloc_netdev_mq and freed in
free_netdev.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xfrm: use gre key as flow upper protocol info · cc9ff19d

Authored by Timo Teräs on Nov 03, 2010

The GRE Key field is intended to be used for identifying an individual
traffic flow within a tunnel. It is useful for XFRM policy selectors
to match on it, so that different GRE tunnels can have different
policies.

Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: Fix build warning in vlandev_seq_show() · e1f2d8c2

Authored by David S. Miller on Nov 15, 2010

net/8021q/vlanproc.c: In function 'vlandev_seq_show':
net/8021q/vlanproc.c:283:20: warning: unused variable 'fmt'

Signed-off-by: David S. Miller <davem@davemloft.net>

carl9170: use generic sign_extend32 · b1d771ee

Authored by Christian Lamparter on Oct 29, 2010

This patch replaces the handcrafted
sign extension cruft with a generic
bitop function.

Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

bitops: Provide generic sign_extend32 function · 7919a57b

Authored by Andreas Herrmann on Aug 30, 2010

This patch moves code out from wireless drivers where two different
functions are defined in three code locations for the same purpose and
provides a common function to sign extend a 32-bit value.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
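
For readers unfamiliar with the operation, here is a user-space sketch of what a generic 32-bit sign-extension helper can look like. The types and exact form are assumptions and may differ from the kernel's definition. The sketch also relies on right-shifting a negative signed value being an arithmetic shift, which is implementation-defined in C but holds on common compilers.

```c
#include <stdint.h>

/* Sign-extend `value`, treating bit `index` (0 = least significant)
 * as the sign bit. Bits above `index` are discarded. */
static int32_t sign_extend32(uint32_t value, int index)
{
    uint8_t shift = 31 - index;

    /* Move the sign bit up to bit 31, then shift back arithmetically
     * so that it is replicated into the upper bits. */
    return (int32_t)(value << shift) >> shift;
}
```

For example, an 8-bit field holding 0x80 has its sign bit at index 7 and extends to -128, while 0x7f stays 127.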

wl1251: use wl12xx_platform_data to pass data · e4b3fdb8

Authored by Grazvydas Ignotas on Nov 04, 2010

Make use of the newly added method to pass platform data for wl1251 too.
This eliminates some redundant code.

Cc: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Kalle Valo <kvalo@adurom.com>
Acked-by: Luciano Coelho <luciano.coelho@nokia.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

wl1251: add runtime PM support for SDIO · 1d4b89f2

Authored by Grazvydas Ignotas on Nov 08, 2010

Add runtime PM support, similar to how it's done for wl1271.
This allows the card to be powered down when the driver is loaded but
the network is not in use.

Cc: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Kalle Valo <kvalo@adurom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

openeuler / Kernel synced successfully about 1 year ago