Commits · a73ed26bbae7327370c5bd298f07de78df9e3466 · OpenHarmony / kernel_linux

10 Dec, 2011 1 commit

sch_red: generalize accurate MAX_P support to RED/GRED/CHOKE · a73ed26b

Authored by Eric Dumazet on Dec 09, 2011

Now that RED uses a Q0.32 number to store max_p (max probability), allow
RED/GRED/CHOKE to use/report full resolution at config/dump time.

Old tc binaries are not aware of the new attributes, and still set/get Plog.

New tc binaries set/get both Plog and max_p for backward compatibility;
they display a "probability" value if they get max_p from new kernels.

# tc -d  qdisc show dev ...
...
qdisc red 10: parent 1:1 limit 360Kb min 30Kb max 90Kb ecn ewma 5
probability 0.09 Scell_log 15

Make sure we avoid potential divides by 0 in reciprocal_value(), if
(max_th - min_th) is big.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
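
To illustrate why the extra resolution matters, here is a minimal Python sketch (not the kernel or tc code) comparing a Plog-style probability, 1/(2^Plog), with a Q0.32 value. The helper names are hypothetical and the rounding choices are assumptions made for illustration only.

```python
# Illustrative sketch only: helper names and rounding are assumptions,
# not the kernel's or tc's actual conversion routines.
import math

Q32_SCALE = 1 << 32  # Q0.32: value = integer / 2**32


def prob_to_q32(p: float) -> int:
    """Encode a probability in [0, 1) as an unsigned 32-bit Q0.32 integer."""
    if not 0.0 <= p < 1.0:
        raise ValueError("probability must be in [0, 1)")
    return int(p * Q32_SCALE)


def q32_to_prob(q: int) -> float:
    """Decode a Q0.32 integer back to a float probability."""
    return q / Q32_SCALE


def plog_to_prob(plog: int) -> float:
    """Probability expressible with Plog alone: a negative power of two."""
    return 1.0 / (1 << plog)


def nearest_plog(p: float) -> int:
    """Plog whose 1/(2^Plog) is closest to p on a log scale (assumed rounding)."""
    return max(1, round(-math.log2(p)))


wanted = 0.09
plog = nearest_plog(wanted)
print("Plog", plog, "gives", plog_to_prob(plog))        # coarse power of two
print("Q0.32 gives", q32_to_prob(prob_to_q32(wanted)))  # very close to 0.09
```

With Plog only the powers of two are reachable, whereas the Q0.32 encoding has a quantization step of 2^-32.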


09 Dec, 2011 6 commits

sch_red: Adaptative RED AQM · 8af2a218

Authored by Eric Dumazet on Dec 08, 2011

Adaptative RED AQM for Linux, based on the paper by Sally Floyd,
Ramakrishna Gummadi, and Scott Shenker, August 2001:

http://icir.org/floyd/papers/adaptiveRed.pdf

The goal of Adaptative RED is to make max_p a dynamic value between 1% and
50% so that the average queue reaches the target: min_th + (max_th - min_th) / 2

Every 500 ms:
 if (avg > target and max_p <= 0.5)
  increase max_p : max_p += alpha;
 else if (avg < target and max_p >= 0.01)
  decrease max_p : max_p *= beta;

target : [min_th + 0.4*(max_th - min_th),
          min_th + 0.6*(max_th - min_th)].
alpha : min(0.01, max_p / 4)
beta : 0.9
max_p is a Q0.32 fixed point number (unsigned, with a 32-bit mantissa)

Changes against our RED implementation are :

max_p is no longer a negative power of two (1/(2^Plog)), but a Q0.32
fixed point number, to allow the full range described in the Adaptative paper.

To deliver a random number, we now use a reciprocal divide (that's really
a multiply), but this operation is done once per marked/dropped packet
when in the RED_BETWEEN_TRESH window, so the added cost (compared to the previous
AND operation) is near zero.

dump operation gives current max_p value in a new TCA_RED_MAX_P
attribute.

Example on a 10Mbit link :

tc qdisc add dev $DEV parent 1:1 handle 10: est 1sec 8sec red \
   limit 400000 min 30000 max 90000 avpkt 1000 \
   burst 55 ecn adaptative bandwidth 10Mbit

# tc -s -d qdisc show dev eth3
...
qdisc red 10: parent 1:1 limit 400000b min 30000b max 90000b ecn
adaptative ewma 5 max_p=0.113335 Scell_log 15
 Sent 50414282 bytes 34504 pkt (dropped 35, overlimits 1392 requeues 0)
 rate 9749Kbit 831pps backlog 72056b 16p requeues 0
  marked 1357 early 35 pdrop 0 other 0
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
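
The periodic max_p adjustment described in this commit message can be sketched in Python as follows. This is a simplified illustration, not the kernel code: the function names are hypothetical, max_p is kept as a Q0.32 integer, and comparing avg against the upper and lower ends of the target range (as in the Adaptive RED paper) is an assumption about how "avg > target" and "avg < target" are meant.

```python
# Minimal sketch of the Adaptive RED max_p update, run every 500 ms.
# Names and integer rounding are assumptions, not the kernel implementation.

Q32_SCALE = 1 << 32


def to_q32(p: float) -> int:
    """Encode a probability in [0, 1) as a Q0.32 integer."""
    return int(p * Q32_SCALE)


def from_q32(q: int) -> float:
    return q / Q32_SCALE


MAX_P_MIN = to_q32(0.01)   # lower bound: 1%
MAX_P_MAX = to_q32(0.5)    # upper bound: 50%
ALPHA_CAP = to_q32(0.01)


def target_range(min_th: int, max_th: int):
    """Target band for the average queue: 40% to 60% of the way up."""
    span = max_th - min_th
    return min_th + 0.4 * span, min_th + 0.6 * span


def adapt_max_p(max_p: int, avg: float, min_th: int, max_th: int) -> int:
    """Return the new Q0.32 max_p after one adaptation interval."""
    low, high = target_range(min_th, max_th)
    if avg > high and max_p <= MAX_P_MAX:
        # additive increase: alpha = min(0.01, max_p / 4)
        max_p += min(ALPHA_CAP, max_p // 4)
    elif avg < low and max_p >= MAX_P_MIN:
        # multiplicative decrease: beta = 0.9
        max_p = max_p * 9 // 10
    return max_p


p = to_q32(0.10)
p = adapt_max_p(p, avg=72056, min_th=30000, max_th=90000)
print(from_q32(p))
```

Here the average queue is above the target band, so max_p grows by alpha; an average below the band would instead shrink it by the factor beta.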


vlan: introduce functions to do mass addition/deletion of vids by another device · 348a1443

Authored by Jiri Pirko on Dec 08, 2011

Introduce handy functions to copy vlan ids from one driver's list to
another.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: introduce vid list with reference counting · 5b9ea6e0

Authored by Jiri Pirko on Dec 08, 2011

This allows keeping track of the vids that need to be in the rx vlan filters
of devices even if they are used in bond/team etc.

vlan_info, as vlan_group was previously, is allocated when the first
vid is added and deallocated when the last vid is deleted.

The vlan_group definition is moved to a private header.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
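
The reference-counting idea can be sketched in Python as below. This is a toy model under stated assumptions, not the kernel data structures: the Device class, vid_add() and vid_del() are hypothetical names, and the rx_filter set merely stands in for the hardware rx vlan filter.

```python
# Toy model of a per-device vid list with reference counting.
# All names are hypothetical stand-ins for the kernel structures.


class Device:
    def __init__(self):
        self.vid_refs = None      # allocated lazily, like vlan_info
        self.rx_filter = set()    # stands in for the hardware rx vlan filter


def vid_add(dev: Device, vid: int) -> None:
    """Take a reference on vid; program the filter on the first reference."""
    if dev.vid_refs is None:
        dev.vid_refs = {}         # allocate when the first vid is added
    if vid not in dev.vid_refs:
        dev.rx_filter.add(vid)
        dev.vid_refs[vid] = 0
    dev.vid_refs[vid] += 1


def vid_del(dev: Device, vid: int) -> None:
    """Drop a reference on vid; clear the filter when the last one goes."""
    if dev.vid_refs is None or vid not in dev.vid_refs:
        return
    dev.vid_refs[vid] -= 1
    if dev.vid_refs[vid] == 0:
        del dev.vid_refs[vid]
        dev.rx_filter.discard(vid)
    if not dev.vid_refs:
        dev.vid_refs = None       # deallocate when the last vid is deleted


dev = Device()
vid_add(dev, 100)   # e.g. a vlan device on top of dev
vid_add(dev, 100)   # e.g. a bond/team user needing the same vid
vid_del(dev, 100)   # vid 100 stays in the filter: one user remains
```

The filter entry survives until every user of the vid has dropped its reference.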


net: introduce vlan_vid_[add/del] and use them instead of direct [add/kill]_vid ndo calls · 87002b03

Authored by Jiri Pirko on Dec 08, 2011

This patch adds wrappers for the ndo_vlan_rx_add_vid/ndo_vlan_rx_kill_vid
functions. The check for the NETIF_F_HW_VLAN_FILTER feature is done in these
wrappers.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: make vlan ndo_vlan_rx_[add/kill]_vid return error value · 8e586137

Authored by Jiri Pirko on Dec 08, 2011

Let the caller know the result of adding/removing a vlan id to/from the vlan
filter.

In some drivers I make those functions just return 0. But in those
where it is possible to see whether the hw setup went correctly, the return
value is set appropriately.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: rename vlan_dev_info to vlan_dev_priv · 7da82c06

Authored by Jiri Pirko on Dec 08, 2011

As this structure is priv, name it appropriately. Also, for a pointer to it
use the name "vlan".
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Dec, 2011 7 commits

ipv6: Move xfrm_lookup() call down into icmp6_dst_alloc(). · 87a11578

Authored by David S. Miller on Dec 06, 2011

And return error pointers.
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Make third arg to anycast_dst_alloc() bool. · 8f031519

Authored by David S. Miller on Dec 06, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

inet_diag: Introduce new inet_diag_req header · 126fdc32

Authored by Pavel Emelyanov on Dec 06, 2011

This one coincides with the sock_diag_req in the beginning and
contains only the used fields from its previous analogue.

The existing code is patched to use the _compat version of it
for now.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sock_diag: Initial skeleton · d366477a

Authored by Pavel Emelyanov on Dec 06, 2011

When receiving the SOCK_DIAG_BY_FAMILY message we have to find the
handler for the provided family and pass the nl message to it.

This patch describes an infrastructure to work with such handlers
and implements stubs for the AF_INET(6) ones.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
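
The family-based dispatch described here can be pictured with a small Python sketch. It is only an analogy for the kernel infrastructure: register_handler(), rcv_by_family() and inet_stub() are hypothetical names, and the specific error codes returned are assumptions.

```python
# Analogy for per-family handler lookup; names and error codes are assumed.
import errno

AF_INET = 2     # Linux address family numbers
AF_INET6 = 10

_handlers = {}  # family -> callable taking the netlink message


def register_handler(family: int, dump) -> int:
    """Register a handler for a family; refuse duplicates."""
    if family in _handlers:
        return -errno.EBUSY
    _handlers[family] = dump
    return 0


def unregister_handler(family: int) -> None:
    _handlers.pop(family, None)


def rcv_by_family(family: int, nl_msg: bytes) -> int:
    """Find the handler for the given family and pass the message to it."""
    handler = _handlers.get(family)
    if handler is None:
        return -errno.ENOENT
    return handler(nl_msg)


def inet_stub(nl_msg: bytes) -> int:
    """Stub handler: accepts the message but does nothing useful yet."""
    return -errno.EOPNOTSUPP


register_handler(AF_INET, inet_stub)
register_handler(AF_INET6, inet_stub)
```

A message for a family without a registered handler is rejected before any family-specific code runs.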


sock_diag: Introduce new message type · 8d34172d

Authored by Pavel Emelyanov on Dec 06, 2011

This type will run the family+protocol based socket dumping.
Also prepare the stub function for it.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet_diag: Partly rename inet_ to sock_ · 7f1fb60c

Authored by Pavel Emelyanov on Dec 06, 2011

The ultimate goal is to get a sock_diag module that works in
family+protocol terms. Currently this is suitable to do on the
inet_diag basis, so rename parts of the code. It will be moved
to sock_diag.c later.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

caif-spi: Bugfix for dump upon device removal · d5f43c1e

Authored by Erwan Bracq on Dec 06, 2011

Fix dump upon device removal, by moving deinitialization from
platform-device-remove to network-interface-uninit.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Dec, 2011 4 commits

if_ether.h: Add IEEE 802.1 Local Experimental Ethertype 1. · 63afe12f

Authored by sjur.brandeland@stericsson.com on Dec 04, 2011

Add EthType 0x88b5.
This Ethertype value is available for public use for prototype and
vendor-specific protocol development, as defined in Amendment 802a
to IEEE Std 802.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}. · 27217455

Authored by David Miller on Dec 02, 2011

To reflect the fact that a reference is not obtained to the
resulting neighbour entry.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>

tracing: Restore system filter behavior · 27b14b56

Authored by Li Zefan on Nov 01, 2011

Though not all events have field 'prev_pid', it was allowed to do this:

  # echo 'prev_pid == 100' > events/sched/filter

but commit 75b8e982 (tracing/filter: Swap
entire filter of events) broke it without any reason.

Link: http://lkml.kernel.org/r/4EAF46CF.8040408@cn.fujitsu.com
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

x86, amd: Fix up numa_node information for AMD CPU family 15h model 0-0fh northbridge functions · f62ef5f3

Authored by Andreas Herrmann on Dec 02, 2011

I've received complaints that the numa_node attribute for family
15h model 00-0fh (e.g. Interlagos) northbridge functions shows
-1 instead of the proper node ID.

Correct this with attached quirks (similar to quirks for other
AMD CPU families used in multi-socket systems).
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Frank Arnold <frank.arnold@amd.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/r/20111202072143.GA31916@alberich.amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>

05 Dec, 2011 2 commits

perf: Fix loss of notification with multi-event · 10c6db11

Authored by Peter Zijlstra on Nov 26, 2011

When you do:
$ perf record -e cycles,cycles,cycles noploop 10

You expect about 10,000 samples for each event, i.e., 10s at
1000samples/sec. However, this is not what's happening. You
get much fewer samples, maybe 3700 samples/event:

$ perf report -D | tail -15
Aggregated stats:
TOTAL events: 10998
MMAP events: 66
COMM events: 2
SAMPLE events: 10930
cycles stats:
TOTAL events: 3644
SAMPLE events: 3644
cycles stats:
TOTAL events: 3642
SAMPLE events: 3642
cycles stats:
TOTAL events: 3644
SAMPLE events: 3644

On an Intel Nehalem or even AMD64, there are 4 counters capable
of measuring cycles, so there is plenty of space to measure those
events without multiplexing (even with the NMI watchdog active).
And even with multiplexing, we'd expect roughly the same number
of samples per event.

The root of the problem was that when the event that caused the buffer
to become full was not the first event passed on the cmdline, the user
notification would get lost. The notification was sent to the file
descriptor of the overflowed event but the perf tool was not polling
on it. The perf tool aggregates all samples into a single buffer,
i.e., the buffer of the first event. Consequently, it assumes
notifications for any event will come via that descriptor.

The seemingly straightforward solution of moving the waitq into the
ringbuffer object doesn't work because of life-time issues. One could
perf_event_set_output() on a fd that you're also blocking on and cause
the old rb object to be freed while its waitq would still be
referenced by the blocked thread -> FAIL.

Therefore link all events to the ringbuffer and broadcast the wakeup
from the ringbuffer object to all possible events that could be waited
upon. This is rather ugly, and we're open to better solutions but it
works for now.
Reported-by: Stephane Eranian <eranian@google.com>
Finished-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20111126014731.GA7030@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tcp: take care of misalignments · 117632e6

Authored by Eric Dumazet on Dec 03, 2011

We discovered that the TCP stack could retransmit misaligned skbs if a
malicious peer acknowledged a sub-MSS frame. This can currently happen
only if the output interface is not SG enabled: if SG is enabled, tcp
builds headless skbs (all payload is included in fragments), so the tcp
trimming process only removes parts of skb fragments and the header stays
aligned.

Some arches can't handle misalignments, so force a head reallocation and
shrink headroom to MAX_TCP_HEADER.

Don't care about misalignments on x86 and PPC (or other arches setting
NET_IP_ALIGN to 0).

This patch introduces __pskb_copy() which can specify the headroom of
new head, and pskb_copy() becomes a wrapper on top of __pskb_copy()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Dec, 2011 7 commits

ipv6: Kill ndisc_get_neigh() inline helper. · 04a6f441

Authored by David S. Miller on Dec 03, 2011

It's only used in net/ipv6/route.c and the NULL device check is
superfluous for all of the existing call sites.

Just expand the __ndisc_lookup_errno() call at each location.
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add Open vSwitch kernel components. · ccb1352e

Authored by Jesse Gross on Oct 25, 2011

Open vSwitch is a multilayer Ethernet switch targeted at virtualized
environments.  In addition to supporting a variety of features
expected in a traditional hardware switch, it enables fine-grained
programmatic extension and flow-based control of the network.
This control is useful in a wide variety of applications but is
particularly important in multi-server virtualization deployments,
which are often characterized by highly dynamic endpoints and the need
to maintain logical abstractions for multiple tenants.

The Open vSwitch datapath provides an in-kernel fast path for packet
forwarding.  It is complemented by a userspace daemon, ovs-vswitchd,
which is able to accept configuration from a variety of sources and
translate it into packet processing rules.

See http://openvswitch.org for more information and userspace
utilities.
Signed-off-by: Jesse Gross <jesse@nicira.com>

ipv6: Add fragment reporting to ipv6_skip_exthdr(). · 75f2811c

Authored by Jesse Gross on Nov 30, 2011

While parsing through IPv6 extension headers, fragment headers are
skipped, making them invisible to the caller.  This reports the
fragment offset of the last header in order to make it possible to
determine whether the packet is fragmented and, if so, whether it is
a first or last fragment.
Signed-off-by: Jesse Gross <jesse@nicira.com>
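
As an illustration of what a caller can do with the reported value, here is a small Python sketch that classifies a packet from the 16-bit offset/flags field of an IPv6 fragment header. It assumes the field has already been converted to host byte order and that a value of zero means no fragmentation was seen; classify_fragment() is a hypothetical helper, not a kernel function.

```python
# Sketch: interpret the offset/flags field of an IPv6 fragment header.
# Assumes host byte order; classify_fragment() is a hypothetical helper.

IP6_OFFSET = 0xFFF8  # upper 13 bits: fragment offset in 8-octet units
IP6_MF = 0x0001      # lowest bit: "more fragments" flag


def classify_fragment(frag_off: int) -> str:
    """Return 'unfragmented', 'first', 'middle' or 'last'."""
    offset = frag_off & IP6_OFFSET
    more = bool(frag_off & IP6_MF)
    if offset == 0:
        # Zero offset: either the first fragment or no fragmentation
        # (a fragment header with offset 0 and M=0 is treated the same here).
        return "first" if more else "unfragmented"
    return "middle" if more else "last"


for value in (0x0000, 0x0001, 0x05A9, 0x05A8):
    print(hex(value), classify_fragment(value))
```

Only the first fragment carries the upper-layer header, which is why a caller parsing flows needs to tell the cases apart.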


vlan: Move vlan_set_encap_proto() to vlan header file · 396cf943

Authored by Pravin B Shelar on Nov 18, 2011

Open vSwitch needs this function for vlan handling.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>

genetlink: Add rcu_dereference_genl and genl_dereference. · b4e16611

Authored by Jesse Gross on Nov 19, 2011

This adds rcu_dereference_genl and genl_dereference, which are genl
variants of the RTNL functions to enforce proper locking with lockdep
and sparse.
Signed-off-by: Jesse Gross <jesse@nicira.com>

genetlink: Add lockdep_genl_is_held(). · 86b1309c

Authored by Pravin B Shelar on Nov 10, 2011

Open vSwitch uses genl_mutex locking to protect datapath
data structures like flow-table and flow-actions. The following patch adds
lockdep_genl_is_held(), which is used for rcu annotation to prove
locking.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>

genetlink: Add genl_notify() · 263ba61d

Authored by Pravin B Shelar on Nov 10, 2011

Open vSwitch uses the Generic Netlink interface for communication
between userspace and the kernel module. genl_notify() is used
for sending notifications back to userspace.

genl_notify() is analogous to rtnl_notify() but uses genl_sock
instead of rtnl.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>

02 Dec, 2011 5 commits

net: Make ndo_neigh_destroy return void. · 65698610

Authored by David S. Miller on Dec 01, 2011

The return value isn't used.

Suggested by Ben Hutchings.
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: use a 64bit load/store in output path · 84f9307c

Authored by Eric Dumazet on Nov 30, 2011

The gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE.

In the IP header, daddr immediately follows saddr; this won't change in the
future. We only need to make sure our flowi4 (saddr,daddr) fields won't
break the rule.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: cc770: add driver core for the Bosch CC770 and Intel AN82527 · 2a367c3a

Authored by Wolfgang Grandegger on Nov 30, 2011

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

dsa: Include linux/if_ether.h to fix build error · ea1f51be

Authored by Axel Lin on Nov 30, 2011

Include linux/if_ether.h to fix below build errors:

  CC      arch/arm/mach-kirkwood/common.o
In file included from arch/arm/mach-kirkwood/common.c:19:
include/net/dsa.h: In function 'dsa_uses_dsa_tags':
include/net/dsa.h:192: error: 'ETH_P_DSA' undeclared (first use in this function)
include/net/dsa.h:192: error: (Each undeclared identifier is reported only once
include/net/dsa.h:192: error: for each function it appears in.)
include/net/dsa.h: In function 'dsa_uses_trailer_tags':
include/net/dsa.h:197: error: 'ETH_P_TRAILER' undeclared (first use in this function)
make[1]: *** [arch/arm/mach-kirkwood/common.o] Error 1
make: *** [arch/arm/mach-kirkwood] Error 2
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drm/radeon/kms: add some new pci ids · 2ed4d9d6

Authored by Alex Deucher on Dec 01, 2011

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

01 Dec, 2011 8 commits

caif: Remove unused attributes from struct cflayer · 8aa953d0

Authored by sjur.brandeland@stericsson.com on Nov 30, 2011

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

caif: Remove unused enum and parameter in cfserl · e977b4cf

Authored by sjur.brandeland@stericsson.com on Nov 30, 2011

Remove unused enum cfcnfg_phy_type and the parameter to cfserl_create.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

caif: Restructure how link caif link layer enroll · 7c18d220

Authored by sjur.brandeland@stericsson.com on Nov 30, 2011

The enrolling of CAIF link layers is refactored.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sch_red: fix red_calc_qavg_from_idle_time · ea6a5d3b

Authored by Eric Dumazet on Nov 30, 2011

Since commit a4a710c4 (pkt_sched: Change PSCHED_SHIFT from 10 to
6) it seems RED/GRED are broken.

red_calc_qavg_from_idle_time() computes a delay in us units, but this
delay is now 16 times bigger than the real delay, so the final qavg result
is smaller than expected.

Use standard kernel time services since there is no need to obfuscate
them.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
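
The factor of 16 follows from the shift change, since 2^(10-6) = 16. The Python sketch below shows the arithmetic under the assumption that a psched tick count is the time in nanoseconds shifted right by PSCHED_SHIFT; ns_to_ticks() is a hypothetical helper, not the kernel macro.

```python
# Arithmetic sketch: why treating psched ticks as microseconds inflates
# the delay 16x after PSCHED_SHIFT went from 10 to 6.
# Assumption: ticks = nanoseconds >> PSCHED_SHIFT.


def ns_to_ticks(ns: int, shift: int) -> int:
    """Convert nanoseconds to psched ticks for a given shift."""
    return ns >> shift


idle_ns = 1_024_000  # about one millisecond of idle time

ticks_shift10 = ns_to_ticks(idle_ns, 10)  # one tick is 1024 ns, roughly 1 us
ticks_shift6 = ns_to_ticks(idle_ns, 6)    # one tick is 64 ns

# Code that still reads the tick count as microseconds now sees a delay
# that is 2**(10 - 6) times too large.
inflation = ticks_shift6 // ticks_shift10
print(ticks_shift10, ticks_shift6, inflation)
```

An overestimated idle time makes the averaged queue length decay too far, which matches the "smaller than expected" qavg described above.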


netem: rate extension · 7bc0f28c

Authored by Hagen Paul Pfeifer on Nov 30, 2011

Currently netem is not able to emulate channel bandwidth. Only static
delay (and optional random jitter) can be configured.

To emulate the channel rate the token bucket filter (sch_tbf) can be used. But
TBF has some major emulation flaws. The buffer (token bucket depth/rate) cannot
be 0. Also, the idea behind TBF is that the credit (tokens in the bucket) fills
up if no packet is transmitted, so there is always a "positive" credit for new
packets. In real life this behavior contradicts the laws of nature, where
nothing can travel faster than the speed of light. E.g.: on an emulated
1000 byte/s link a small IPv4/TCP SYN packet of ~50 bytes requires ~0.05
seconds, not 0 seconds.

Netem is an excellent place to implement a rate limiting feature: static
delay is already implemented, tfifo already has time information and the
user can skip TBF configuration completely.

This patch implements a rate feature which can be configured via tc, e.g.:

tc qdisc add dev eth0 root netem rate 10kbit

To emulate a link of 5000byte/s and add an additional static delay of 10ms:

tc qdisc add dev eth0 root netem delay 10ms rate 5KBps

Note: similar to TBF, the rate extension is bound to the kernel timing
system. Depending on the architecture's timer granularity, higher rates (e.g.
10mbit/s and higher) tend to produce transmission bursts. Also note: there are
further queues living in network adaptors; see ethtool(8).
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
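
The serialization delay that the rate extension is meant to emulate can be sketched in a few lines of Python. This is a simplified model, not the netem implementation: both function names are hypothetical, and the model assumes packets are sent back to back on an otherwise idle link.

```python
# Simplified model of rate emulation: each packet occupies the link for
# length / rate seconds, then an optional static delay is added.
# Function names are hypothetical; this is not the netem code.


def tx_delay_seconds(packet_len_bytes: int, rate_bytes_per_sec: float) -> float:
    """Time needed to serialize one packet at the given rate."""
    if rate_bytes_per_sec <= 0:
        raise ValueError("rate must be positive")
    return packet_len_bytes / rate_bytes_per_sec


def departure_times(packet_lens, rate_bytes_per_sec, static_delay=0.0):
    """Delivery times for packets queued back to back at time 0."""
    times = []
    link_free_at = 0.0
    for length in packet_lens:
        link_free_at += tx_delay_seconds(length, rate_bytes_per_sec)
        times.append(link_free_at + static_delay)
    return times


# The ~50 byte SYN on a 1000 byte/s link from the text above:
print(tx_delay_seconds(50, 1000))

# Two 1000 byte packets on a 5000 byte/s link with 10 ms static delay:
print(departure_times([1000, 1000], 5000, static_delay=0.010))
```

Unlike a token bucket with accumulated credit, every packet in this model pays its own serialization time, including the first one after an idle period.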


atm: clip: Use device neigh support on top of "arp_tbl". · 32092ecf

Authored by David Miller on Jul 25, 2011

Instead of instantiating an entire new neigh_table instance
just for ATM handling, use the neigh device private facility.
Signed-off-by: David S. Miller <davem@davemloft.net>

neigh: Add device constructor/destructor capability. · da6a8fa0

Authored by David Miller on Jul 25, 2011

If the neigh entry has device private state, it will need
constructor/destructor ops.
Signed-off-by: David S. Miller <davem@davemloft.net>

atm: clip: Convert over to neighbour_priv() · 869759b9

Authored by David Miller on Jul 25, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

OpenHarmony / kernel_linux · last synced almost 4 years ago