Commits · 75f2811c6460ccc59d83c66059943ce9c9f81a18 · openeuler / Kernel

04 Dec, 2011 4 commits

ipv6: Add fragment reporting to ipv6_skip_exthdr(). · 75f2811c

Authored by Jesse Gross on Nov 30, 2011

While parsing through IPv6 extension headers, fragment headers are
skipped, making them invisible to the caller.  This reports the
fragment offset of the last header in order to make it possible to
determine whether the packet is fragmented and, if so, whether it is
a first or last fragment.
Signed-off-by: Jesse Gross <jesse@nicira.com>

75f2811c
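
As an illustration of how a caller might use the offset reported by 75f2811c, the C sketch below classifies a packet from the 16-bit fragment field. It assumes the value has already been converted to host byte order. The enum, the macro names and `classify_fragment()` are hypothetical and not part of the patch; only the bit layout (13-bit offset in the upper bits, M flag in the lowest bit) comes from the IPv6 fragment header definition.

```c
#include <stdint.h>

/* Bit layout of the 16-bit fragment field, in host byte order. */
#define FRAG_OFFSET_MASK 0xFFF8u /* upper 13 bits: offset in 8-octet units */
#define FRAG_MORE_FLAG   0x0001u /* M flag: more fragments follow */

/* Hypothetical classification, not taken from the patch. */
enum frag_kind { FRAG_NONE, FRAG_FIRST, FRAG_MIDDLE, FRAG_LAST };

static enum frag_kind classify_fragment(uint16_t frag_off)
{
	int has_offset = (frag_off & FRAG_OFFSET_MASK) != 0;
	int more = (frag_off & FRAG_MORE_FLAG) != 0;

	/* Offset 0: either the first fragment or no fragmentation at all. */
	if (!has_offset)
		return more ? FRAG_FIRST : FRAG_NONE;
	/* Non-zero offset: a later fragment; M tells whether it is the last. */
	return more ? FRAG_MIDDLE : FRAG_LAST;
}
```

A value of zero is treated as "not fragmented" here, which also covers a fragment header with offset 0 and no M flag.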

vlan: Move vlan_set_encap_proto() to vlan header file · 396cf943

Authored by Pravin B Shelar on Nov 18, 2011

Open vSwitch needs this function for vlan handling.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>

396cf943

genetlink: Add lockdep_genl_is_held(). · 86b1309c

Authored by Pravin B Shelar on Nov 10, 2011

Open vSwitch uses genl_mutex locking to protect datapath
data structures like the flow table and flow actions. The following patch
adds lockdep_genl_is_held(), which is used for RCU annotation to prove
locking.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>

86b1309c

genetlink: Add genl_notify() · 263ba61d

Authored by Pravin B Shelar on Nov 10, 2011

Open vSwitch uses the Generic Netlink interface for communication
between userspace and the kernel module. genl_notify() is used
for sending notifications back to userspace.

genl_notify() is analogous to rtnl_notify() but uses genl_sock
instead of rtnl.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>

263ba61d

03 Dec, 2011 1 commit

atm: clip: Remove code commented out since eternity. · 340e8dc1

Authored by David S. Miller on Dec 02, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

340e8dc1

02 Dec, 2011 11 commits

netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS · 3ced1be5

Authored by David S. Miller on Dec 01, 2011

firewalld in Fedora 16 needs this.
Signed-off-by: David S. Miller <davem@davemloft.net>

3ced1be5

ipv4: flush route cache after change accept_local · d01ff0a0

Authored by Peter Pan(潘卫平) on Dec 01, 2011

After resetting ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0,
we should flush the route cache; otherwise the host will continue to receive
packets with a local source address, which should be dropped.
Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d01ff0a0

sch_red: fix red_change · 1ee5fa1e

Authored by Eric Dumazet on Dec 01, 2011

On Wednesday, 30 November 2011 at 14:36 -0800, Stephen Hemminger wrote:

> (Almost) nobody uses RED because they can't figure it out.
> According to Wikipedia, VJ says that:
>  "there are not one, but two bugs in classic RED."

RED is useful for high throughput routers, I doubt many linux machines
act as such devices.

I was considering adding Adaptive RED (Sally Floyd, Ramakrishna
Gummadi, Scott Shenker), August 2001

In this version, maxp is dynamic (from 1% to 50%), and the user only has to
set up min_th (target average queue size)
(max_th and wq (burst in linux RED) are automatically set up)

By the way it seems we have a small bug in red_change()

if (skb_queue_empty(&sch->q))
	red_end_of_idle_period(&q->parms);

First, if the queue is empty, we should call
red_start_of_idle_period(&q->parms);

Second, since we no longer use sch->q, but q->qdisc, the test is
meaningless.

Oh well...

[PATCH] sch_red: fix red_change()

Now that RED is classful, we must check q->qdisc->q.qlen, and if the queue
is empty, we start an idle period, not end it.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1ee5fa1e

dccp: Fix compile warning in probe code. · d984e619

Authored by David S. Miller on Dec 01, 2011

Commit 1386be55 ("dccp: fix
auto-loading of dccp(_probe)") fixed a bug but created a new
compiler warning:

net/dccp/probe.c: In function ‘dccpprobe_init’:
net/dccp/probe.c:166:2: warning: the omitted middle operand in ?: will always be ‘true’, suggest explicit middle operand [-Wparentheses]

try_then_request_module() is built for situations where the
"existence" test is some lookup function that returns a non-NULL
object on success, and with a reference count of some kind held.

Here we're looking for a success return of zero from the jprobe
registry.

Instead of fighting the way try_then_request_module() works, simply
open code what we want to happen in a local helper function.
Signed-off-by: David S. Miller <davem@davemloft.net>

d984e619
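
The "open code it in a local helper" approach described in d984e619 can be sketched in plain C as follows. This is a user-space illustration only: `fake_register()` and `fake_request_module()` are hypothetical stand-ins for the jprobe registration and the module request, and the -1 error value is an assumption. The point is that a zero-on-success return is handled with an explicit retry rather than with the `?:` shorthand.

```c
/* Hypothetical stand-ins: registration fails until the "module" is loaded. */
static int module_loaded;

static int fake_register(void)
{
	return module_loaded ? 0 : -1; /* 0 means success, as in the commit text */
}

static void fake_request_module(void)
{
	module_loaded = 1;
}

/* Try once; on failure request the module and try exactly once more. */
static int register_with_retry(void)
{
	int ret = fake_register();

	if (ret) {
		fake_request_module();
		ret = fake_register();
	}
	return ret;
}
```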

Revert "udp: remove redundant variable" · 59c2cdae

Authored by David S. Miller on Dec 01, 2011

This reverts commit 81d54ec8.

If we take the "try_again" goto, due to a checksum error,
the 'len' has already been truncated.  So we won't compute
the same values as the original code did.
Reported-by: paul bilke <fsmail@conspiracy.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

59c2cdae

bridge: master device stuck in no-carrier state forever when in user-stp mode · b03b6dd5

Authored by Vitalii Demianets on Nov 25, 2011

When in user-stp mode, the bridge master does not follow the state of its
slaves, so after the following sequence of events it can get stuck forever
in no-carrier state:
1) turn stp off
2) put all slaves down - master device will follow their state and also go
into no-carrier state
3) turn stp on with bridge-stp script returning 0 (go to the user-stp mode)
Now the bridge master won't follow the slaves' state and will never reach
running state.

This patch solves the problem by making user-stp and kernel-stp behavior
similar regarding master following slaves' states.
Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b03b6dd5

ipv4: Perform peer validation on cached route lookup. · efbc368d

Authored by David S. Miller on Dec 01, 2011

Otherwise we won't notice the peer GENID change.
Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

efbc368d

ipv4: use a 64bit load/store in output path · 84f9307c

Authored by Eric Dumazet on Nov 30, 2011

The gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In the IP header, daddr immediately follows saddr; this won't change in the
future. We only need to make sure our flowi4 (saddr,daddr) fields won't
break the rule.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

84f9307c
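
The adjacency rule from 84f9307c can be illustrated with a small user-space sketch. The struct names and `flow_key_set_addrs()` are hypothetical (they are not the kernel's `iphdr` or `flowi4`), and C11 `static_assert` stands in for whatever compile-time check the kernel uses. Whether the compiler really emits a single 64-bit load/store depends on the target and optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the address part of an IPv4 header. */
struct ip_addr_pair {
	uint32_t saddr;
	uint32_t daddr;
};

/* Hypothetical flow key with other fields before the addresses. */
struct flow_key {
	uint8_t  proto;
	uint32_t saddr;
	uint32_t daddr;
};

/* The 8-byte copy is only valid while daddr directly follows saddr. */
static_assert(offsetof(struct flow_key, daddr) ==
	      offsetof(struct flow_key, saddr) + sizeof(uint32_t),
	      "saddr and daddr must be adjacent");
static_assert(sizeof(struct ip_addr_pair) == 8, "expected two packed u32");

/* Copy both addresses with one 8-byte memcpy instead of two assignments. */
static void flow_key_set_addrs(struct flow_key *key,
			       const struct ip_addr_pair *hdr)
{
	memcpy((char *)key + offsetof(struct flow_key, saddr), hdr,
	       sizeof(*hdr));
}
```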

dccp: Evaluate ip_hdr() only once in dccp_v4_route_skb(). · 898f7358

Authored by David S. Miller on Dec 01, 2011

This also works around a bogus gcc warning generated by an
upcoming patch from Eric Dumazet that rearranges the layout
of struct flowi4.
Signed-off-by: David S. Miller <davem@davemloft.net>

898f7358

net: net_device flags is an unsigned int · b536db93

Authored by Eric Dumazet on Nov 30, 2011

commit b00055aa ([NET] core: add RFC2863 operstate) changed
net_device flags from unsigned short to unsigned int.

Some core functions still assume it's an unsigned short.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b536db93
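
Why the stale `unsigned short` assumption in b536db93 matters can be shown with a short sketch. The two helper functions are hypothetical, and the example assumes a 16-bit `unsigned short` and an `unsigned int` of at least 32 bits, which holds on common Linux targets: any flag at bit 16 or above is silently dropped when the value passes through the narrower type.

```c
/* Passing flags through an unsigned short keeps only the low 16 bits
 * (assuming a 16-bit short). */
static unsigned int flags_through_short(unsigned int flags)
{
	unsigned short narrowed = (unsigned short)flags;

	return narrowed;
}

/* Passing flags through an unsigned int preserves every bit. */
static unsigned int flags_through_int(unsigned int flags)
{
	unsigned int same_width = flags;

	return same_width;
}
```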

netem: fix build error on 32bit arches · fc33cc72

Authored by Eric Dumazet on Nov 30, 2011

ERROR: "__udivdi3" [net/sched/sch_netem.ko] undefined!
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

fc33cc72

01 Dec, 2011 16 commits

net/core: fix rollback handler in register_netdevice_notifier · 8f891489

Authored by RongQing.Li on Nov 30, 2011

Within nested statements, the break statement terminates only the
do, for, switch, or while statement that immediately encloses it,
so replace the break with a goto.
Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8f891489
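
The language rule cited in 8f891489 can be demonstrated with two otherwise identical nested loops. Both functions are hypothetical illustrations and are not taken from register_netdevice_notifier(): they count how many cells are visited before the loop stops at a "failing" cell.

```c
/* break leaves only the inner loop, so the outer loop keeps going. */
static int visit_with_break(int rows, int cols, int fail_r, int fail_c)
{
	int visited = 0;

	for (int r = 0; r < rows; r++) {
		for (int c = 0; c < cols; c++) {
			if (r == fail_r && c == fail_c)
				break;
			visited++;
		}
	}
	return visited;
}

/* goto leaves both loops at once, which is what a rollback path needs. */
static int visit_with_goto(int rows, int cols, int fail_r, int fail_c)
{
	int visited = 0;

	for (int r = 0; r < rows; r++) {
		for (int c = 0; c < cols; c++) {
			if (r == fail_r && c == fail_c)
				goto out;
			visited++;
		}
	}
out:
	return visited;
}
```

On a 3x3 grid failing at cell (1, 1), the `break` version still visits all of row 2, while the `goto` version stops immediately.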

caif: Remove unused enum and parameter in cfserl · e977b4cf

Authored by sjur.brandeland@stericsson.com on Nov 30, 2011

Remove unused enum cfcnfg_phy_type and the parameter to cfserl_create.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e977b4cf

caif: Restructure how link caif link layer enroll · 7c18d220

Authored by sjur.brandeland@stericsson.com on Nov 30, 2011

Enrolling of CAIF link layers is refactored.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c18d220

caif: Allow cfpkt_extr_head to process empty message · 200c5a3b

Authored by sjur.brandeland@stericsson.com on Nov 30, 2011

Allow a NULL pointer in cfpkt_extr_head in order to
skip past header data.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

200c5a3b

netem: rate extension · 7bc0f28c

Authored by Hagen Paul Pfeifer on Nov 30, 2011

Currently netem is not able to emulate channel bandwidth. Only static
delay (and optional random jitter) can be configured.

To emulate the channel rate the token bucket filter (sch_tbf) can be used. But
TBF has some major emulation flaws. The buffer (token bucket depth/rate) cannot
be 0. Also the idea behind TBF is that the credit (tokens in the bucket) fills
if no packet is transmitted, so that there is always a "positive" credit for
new packets. In real life this behavior contradicts the law of nature that
nothing can travel faster than the speed of light. E.g.: on an emulated
1000 byte/s link a small IPv4/TCP SYN packet of ~50 bytes requires ~0.05
seconds - not 0 seconds.

Netem is an excellent place to implement a rate limiting feature: static
delay is already implemented, tfifo already has time information and the
user can skip TBF configuration completely.

This patch implements a rate feature which can be configured via tc, e.g.:

tc qdisc add dev eth0 root netem rate 10kbit

To emulate a link of 5000byte/s and add an additional static delay of 10ms:

tc qdisc add dev eth0 root netem delay 10ms rate 5KBps

Note: similar to TBF, the rate extension is bound to the kernel timing
system. Depending on the architecture's timer granularity, higher rates (e.g.
10mbit/s and higher) tend to produce transmission bursts. Also note: further
queues live in network adaptors; see ethtool(8).
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>

7bc0f28c
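
The arithmetic behind the 1000 byte/s example in 7bc0f28c is simply packet length divided by link rate. The sketch below is a user-space illustration, not the code from the patch; `tx_time_ns()` is a hypothetical helper, and treating a rate of 0 as "no rate limit" is an assumption made for the example.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/* Time needed to serialize len_bytes on a link of rate_bytes_per_sec. */
static uint64_t tx_time_ns(uint32_t len_bytes, uint32_t rate_bytes_per_sec)
{
	if (rate_bytes_per_sec == 0)
		return 0; /* assumption: 0 means no rate limit configured */
	return (uint64_t)len_bytes * NSEC_PER_SEC / rate_bytes_per_sec;
}
```

For a 50-byte packet at 1000 byte/s this gives 50,000,000 ns, which matches the ~0.05 seconds quoted in the commit message.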

ipv6 : mcast : Delete useless parameter in ip6_mc_add1_src() · 99d2f47a

Authored by Jun Zhao on Nov 30, 2011

There is no need to use the 'delta' flag when adding a single source to the
interface filter source list.
Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>

99d2f47a

ipv4 : igmp : Delete useless parameter in ip_mc_add1_src() · 5eb81e89

Authored by Jun Zhao on Nov 30, 2011

There is no need to use the 'delta' flag when adding a single source to the
interface filter source list.
Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>

5eb81e89

atm: clip: Use device neigh support on top of "arp_tbl". · 32092ecf

Authored by David Miller on Jul 25, 2011

Instead of instantiating an entire new neigh_table instance
just for ATM handling, use the neigh device private facility.
Signed-off-by: David S. Miller <davem@davemloft.net>

32092ecf

neigh: Add device constructor/destructor capability. · da6a8fa0

Authored by David Miller on Jul 25, 2011

If the neigh entry has device private state, it will need
constructor/destructor ops.
Signed-off-by: David S. Miller <davem@davemloft.net>

da6a8fa0

atm: clip: Convert over to neighbour_priv() · 869759b9

Authored by David Miller on Jul 25, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

869759b9

neigh: Do not set tbl->entry_size in ipv4/ipv6 neigh tables. · 76cc714e

Authored by David Miller on Jul 25, 2011

Let the core self-size the neigh entry based upon the key length.
Signed-off-by: David S. Miller <davem@davemloft.net>

76cc714e

neigh: Add infrastructure for allocating device neigh privates. · 596b9b68

Authored by David Miller on Jul 25, 2011

netdev->neigh_priv_len records the private area length.

This will trigger for neigh_table objects which set tbl->entry_size
to zero, and the first instances of this will be forthcoming.
Signed-off-by: David S. Miller <davem@davemloft.net>

596b9b68

neigh: Get rid of neigh_table->kmem_cachep · 5b8b0060

Authored by David Miller on Jul 25, 2011

We are going to allocate device specific private areas for
neighbour entries, and in order to do that we have to move
away from the fixed allocation size enforced by using
neigh_table->kmem_cachep.

As a nice side effect we can now use kfree_rcu().
Signed-off-by: David S. Miller <davem@davemloft.net>

5b8b0060

ipv4: fix lockdep splat in rt_cache_seq_show · 218fa90f

Authored by Eric Dumazet on Nov 29, 2011

After commit f2c31e32 (fix NULL dereferences in check_peer_redir()),
dst_get_neighbour() should be guarded by an rcu_read_lock() /
rcu_read_unlock() section.
Reported-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

218fa90f

sch_teql: fix lockdep splat · f7e57044

Authored by Eric Dumazet on Nov 30, 2011

We need rcu_read_lock() protection before using dst_get_neighbour(), and
we must cache its value (pass it to __teql_resolve())

teql_master_xmit() is called under rcu_read_lock_bh() protection, which is
not enough.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f7e57044

tcp: inherit listener congestion control for passive cnx · d8a6e65f

Authored by Eric Dumazet on Nov 30, 2011

Rick Jones reported that the TCP_CONGESTION sockopt performed on a listener
was ignored for its child sockets: right after accept() the
congestion control for the new socket is the system default one.

This seems to be an oversight of the initial design (quoted from Stephen).

Based on prior investigation and patch from Rick.
Reported-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Yuchung Cheng <ycheng@google.com>
Tested-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d8a6e65f

30 Nov, 2011 8 commits

ipv4: remove useless codes in ipmr_device_event() · e92036a6

Authored by RongQing.Li on Nov 23, 2011

Commit 7dc00c82 added a 'notify' parameter for vif_delete() to
distinguish whether to unregister the device.

notify=1 means we do not need to unregister the device,
so calling unregister_netdevice_many is useless.
Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e92036a6

net: Fix skb_update_prio RCU usage. · 6977a79d

Authored by Igor Maravic on Nov 25, 2011

Change rcu_dereference() to rcu_dereference_bh() to avoid the warning

[ INFO: suspicious RCU usage. ]
-------------------------------
net/core/dev.c:2459 suspicious rcu_dereference_check() usage!

because we are locking with

rcu_read_lock_bh();

in function dev_queue_xmit(struct sk_buff *skb)
Signed-off-by: Igor Maravic <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>

6977a79d

netlabel: Fix build problems when IPv6 is not enabled · 1281bc25

Authored by Paul Moore on Nov 29, 2011

A recent fix to the NetLabel code caused a build problem with
configurations that did not have IPv6 enabled; see below:

 netlabel_kapi.c: In function 'netlbl_cfg_unlbl_map_add':
 netlabel_kapi.c:165:4:
  error: implicit declaration of function 'netlbl_af6list_add'

This patch fixes this problem by making the IPv6 specific code conditional
on the IPv6 configuration flags, as is done in the rest of NetLabel and the
network stack as a whole.  We have to move some variable declarations
around as a result so things may not be quite as pretty, but at least it
builds cleanly now.

Some additional IPv6 conditionals were added to the NetLabel code as well
for the sake of consistency.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

1281bc25

sctp: better integer overflow check in sctp_auth_create_key() · c89304b8

Authored by Xi Wang on Nov 29, 2011

The check from commit 30c2235c is incomplete and cannot prevent
cases like key_len = 0x80000000 (INT_MAX + 1).  In that case, the
left-hand side of the check (INT_MAX - key_len), which is unsigned,
becomes 0xffffffff (UINT_MAX) and bypasses the check.

However this shouldn't be a security issue.  The function is called
from the following two code paths:

 1) setsockopt()

 2) sctp_auth_asoc_set_secret()

In case (1), sca_keylength is never going to exceed 65535 since it's
bounded by a u16 from the user API.  As such, the key length will
never overflow.

In case (2), sca_keylength is computed based on the user key (1 short)
and 2 * key_vector (3 shorts) for a total of 7 * USHRT_MAX, which still
will not overflow.

In other words, this overflow check is not really necessary.  Just
make it more correct.
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c89304b8
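
The bypass described in c89304b8 can be reproduced in user space. In the sketch below, `HDR_SIZE` is a hypothetical stand-in for the size of the key header structure, and `new_check_rejects()` shows one way to write a check that cannot wrap; it is an assumption for illustration, not necessarily the exact form used in the patch. The example assumes a 32-bit `int`.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define HDR_SIZE 16u /* hypothetical header size, not the kernel's value */

/* Old form: INT_MAX - key_len is computed as unsigned and can wrap. */
static bool old_check_rejects(uint32_t key_len)
{
	return (INT_MAX - key_len) < HDR_SIZE;
}

/* Safer form: the subtraction only involves constants, so it cannot wrap. */
static bool new_check_rejects(uint32_t key_len)
{
	return key_len > (INT_MAX - HDR_SIZE);
}
```

With key_len = 0x80000000 the old expression wraps to 0xffffffff and the length is accepted, while the second form rejects it.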

sch_choke: use skb_flow_dissect() · 2bcc34bb

Authored by Eric Dumazet on Nov 29, 2011

Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2bcc34bb

sch_sfq: use skb_flow_dissect() · 11fca931

Authored by Eric Dumazet on Nov 29, 2011

Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11fca931

tcp: avoid frag allocation for small frames · f07d960d

Authored by Eric Dumazet on Nov 28, 2011

tcp_sendmsg() uses select_size() helper to choose skb head size when a
new skb must be allocated.

If GSO is enabled for the socket, current strategy is to force all
payload data to be outside of headroom, in PAGE fragments.

This strategy is not welcome for small packets, wasting memory.

Experiments show that best results are obtained when using 2048 bytes
for skb head (This includes the skb overhead and various headers)

This patch provides better len/truesize ratios for packets sent to
the loopback device, and reduces memory needs for in-flight loopback packets,
particularly on arches with big pages.

If a sender sends many 1-byte packets to an unresponsive application,
receiver rmem_alloc will grow faster and will stop queuing these packets
sooner, or will collapse its receive queue to free excess memory.

netperf -t TCP_RR results are improved by ~4 %, and many workloads are
improved as well (tbench, mysql...)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f07d960d

flow_dissector: use a 64bit load/store · 4d77d2b5

Authored by Eric Dumazet on Nov 28, 2011

On Monday, 28 November 2011 at 19:06 -0500, David Miller wrote:
> From: Dimitris Michailidis <dm@chelsio.com>
> Date: Mon, 28 Nov 2011 08:25:39 -0800
>
> >> +bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys
> >> *flow)
> >> +{
> >> +	int poff, nhoff = skb_network_offset(skb);
> >> +	u8 ip_proto;
> >> +	u16 proto = skb->protocol;
> >
> > __be16 instead of u16 for proto?
>
> I'll take care of this when I apply these patches.

( CC trimmed )

Thanks David !

Here is a small patch to use one 64bit load/store on x86_64 instead of
two 32bit load/stores.

[PATCH net-next] flow_dissector: use a 64bit load/store

gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In the IP header, daddr immediately follows saddr; this won't change in the
future. We only need to make sure our flow_keys (src,dst) fields won't
break the rule.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4d77d2b5
