Commits · 8f029de281b26ec9fd5cd77294db1d35d9876f1a · openeuler / Kernel

23 Feb, 2011 6 commits
- xfrm: Mark flowi arg to xfrm_type->reject() const. · 8f029de2
  Committed by David S. Miller on Feb 22, 2011
```
Signed-off-by: David S. Miller <davem@davemloft.net>
```
- xfrm: Mark flowi arg to ->init_tempsel() const. · 73e5ebb2
  Committed by David S. Miller on Feb 22, 2011
```
Signed-off-by: David S. Miller <davem@davemloft.net>
```
- xfrm: Mark flowi arg to ->fill_dst() const. · 0c7b3eef
  Committed by David S. Miller on Feb 22, 2011
```
Signed-off-by: David S. Miller <davem@davemloft.net>
```
- xfrm: Mark flowi arg to ->get_tos() const. · 05d84025
  Committed by David S. Miller on Feb 22, 2011
```
Signed-off-by: David S. Miller <davem@davemloft.net>
```
- xfrm: Mark flowi arg const in key extraction helpers. · e8a4e377
  Committed by David S. Miller on Feb 22, 2011
```
Signed-off-by: David S. Miller <davem@davemloft.net>
```
- net: add __rcu annotations to sk_wq and wq · eaefd110
  Committed by Eric Dumazet on Feb 18, 2011
```
Add proper RCU annotations/verbs to sk_wq and wq members

Fix __sctp_write_space() sk_sleep() abuse (and sock->wq access)

Fix sunrpc sk_sleep() abuse too
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
21 Feb, 2011 1 commit

tcp: Remove debug macro of TCP_CHECK_TIMER · 089c3482

Committed by Shan Wei on Feb 19, 2011

TCP_CHECK_TIMER is no longer used for debugging; it does nothing.
It has been there for several years, maybe six.

Remove it to keep the code clearer.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Feb, 2011 3 commits

ipv4: Use const'ify fib_result deep in the route call chains. · 982721f3

Committed by David S. Miller on Feb 16, 2011

The only troublesome bit here is __mkroute_output, which wants
to override res->fi and res->type; compute those in local
variables instead.
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: Mark fib_combine_itag()'s 'res' arg as const. · b6bf3ca0

Committed by David S. Miller on Feb 16, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add initial_ref arg to dst_alloc(). · 3c7bd1a1

Committed by David S. Miller on Feb 16, 2011

This allows avoiding multiple writes to the initial __refcnt.

The simplest cases of wanting an initial reference of "1"
in ipv4 and ipv6 have been converted; the rest have been left
alone and kept at the existing "0".
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Feb, 2011 1 commit

netfilter: tproxy: do not assign timewait sockets to skb->sk · d503b30b

Committed by Florian Westphal on Feb 17, 2011

Assigning a socket in timewait state to skb->sk can trigger
kernel oops, e.g. in nfnetlink_log, which does:

if (skb->sk) {
        read_lock_bh(&skb->sk->sk_callback_lock);
        if (skb->sk->sk_socket && skb->sk->sk_socket->file) ...

In the timewait case, accessing sk->sk_callback_lock and sk->sk_socket
is invalid.

Either all of these spots will need to add a test for sk->sk_state != TCP_TIME_WAIT,
or xt_TPROXY must not assign a timewait socket to skb->sk.

This does the latter.

If a TW socket is found, assign the tproxy nfmark, but skip the skb->sk assignment,
thus mimicking behaviour of a '-m socket .. -j MARK/ACCEPT' re-routing rule.

The 'SYN to TW socket' case is left unchanged -- we try to redirect to the
listener socket.

Cc: Balazs Scheidler <bazsi@balabit.hu>
Cc: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
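
The choice described in this commit (set the tproxy mark, but never attach a timewait socket to the packet) can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the xt_TPROXY code: `sock_sketch`, `skb_sketch`, `tproxy_assign` and the state constants are hypothetical stand-ins for the kernel types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel socket states; illustration only. */
enum sock_state_sketch { ST_ESTABLISHED, ST_TIME_WAIT, ST_LISTEN };

struct sock_sketch {
    enum sock_state_sketch state;
};

struct skb_sketch {
    struct sock_sketch *sk; /* socket attached to the packet, or NULL */
    uint32_t mark;          /* packet mark used for re-routing */
};

/*
 * Mark the packet whenever a socket was found, but attach the socket
 * only if it is a full socket. Returns true if skb->sk was assigned.
 */
bool tproxy_assign(struct skb_sketch *skb, struct sock_sketch *sk,
                   uint32_t tproxy_mark)
{
    if (sk == NULL)
        return false;

    skb->mark = tproxy_mark;

    /* A timewait socket lacks the fields later code would dereference. */
    if (sk->state == ST_TIME_WAIT)
        return false;

    skb->sk = sk;
    return true;
}
```

With this shape, consumers that test `skb->sk` before touching socket fields never see a timewait socket, so they need no extra state check.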

11 Feb, 2011 3 commits

inet: Create a mechanism for upward inetpeer propagation into routes. · 6431cbc2

Committed by David S. Miller on Feb 07, 2011

If we didn't have a routing cache, we would not be able to properly
propagate certain kinds of dynamic path attributes, for example
PMTU information and redirects.

The reason is that if we didn't have a routing cache, then there would
be no way to look up all of the active cached routes hanging off of
sockets, tunnels, IPSEC bundles, etc.

Consider the case where we created a cached route, but no inetpeer
entry existed and also we were not asked to pre-COW the route metrics
and therefore did not force the creation of a new inetpeer entry.

If we later get a PMTU message, or a redirect, and store this
information in a new inetpeer entry, there is no way to teach that
cached route about the newly existing inetpeer entry.

The facilities implemented here handle this problem.

First we create a generation ID.  When we create a cached route of any
kind, we remember the generation ID at the time of attachment.  Any
time we force-create an inetpeer entry in response to new path
information, we bump that generation ID.

The dst_ops->check() callback is where the knowledge of this event
is propagated.  If the global generation ID does not equal the one
stored in the cached route, and the cached route has not attached
to an inetpeer yet, we look it up and attach if one is found.  Now
that we've updated the cached route's information, we update the
route's generation ID too.

This clears the way for implementing PMTU and redirects directly in
the inetpeer cache.  There is absolutely no need to consult cached
route information in order to maintain this information.

At this point nothing bumps the inetpeer genids; that comes in the
later changes which handle PMTUs and redirects using inetpeers.
Signed-off-by: David S. Miller <davem@davemloft.net>
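
The generation-ID scheme in this commit message can be sketched as follows. This is a simplified user-space model, not the kernel implementation: `peer_genid_sketch`, `route_sketch`, `peer_force_created` and `route_check` are hypothetical names, and the result of the inetpeer lookup is passed in as a plain pointer instead of being performed inside the check.

```c
#include <stddef.h>

/* Hypothetical model of an inetpeer entry. */
struct peer_sketch {
    unsigned int pmtu;
};

/* Hypothetical model of a cached route. */
struct route_sketch {
    unsigned int peer_genid;  /* generation seen when last validated */
    struct peer_sketch *peer; /* attached inetpeer, or NULL */
};

/* Global generation ID, bumped when a peer entry is force-created. */
unsigned int peer_genid_sketch;

void peer_force_created(void)
{
    peer_genid_sketch++;
}

/*
 * Models the check callback: if the generation moved and the route has
 * no peer yet, attach whatever the lookup found, then record the
 * current generation so the work is not repeated.
 */
void route_check(struct route_sketch *rt, struct peer_sketch *found)
{
    if (rt->peer_genid == peer_genid_sketch)
        return; /* nothing new since the last check */

    if (rt->peer == NULL)
        rt->peer = found; /* may still be NULL if none exists */

    rt->peer_genid = peer_genid_sketch;
}
```

The point of the counter is that a route only pays for a lookup after some peer entry has actually been created since its last validation.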

inetpeer: Add redirect and PMTU discovery cached info. · ddd4aa42

Committed by David S. Miller on Feb 09, 2011

Validity of the cached PMTU information is indicated by its
expiration value being non-zero, just as per dst->expires.

The scheme we will use is that we will remember the pre-ICMP value
held in the metrics or route entry, and then at expiration time
we will restore that value.

In this way PMTU expiration does not kill off the cached route as is
done currently.

Redirect information is permanent, or at least until another redirect
is received.
Signed-off-by: David S. Miller <davem@davemloft.net>
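
A minimal sketch of the remember-and-restore scheme described in this commit message, assuming a simple monotonically increasing clock. The structure and function names (`peer_pmtu_sketch`, `pmtu_learn`, `pmtu_check_expiry`) are hypothetical, and timer wraparound is ignored for brevity.

```c
#include <stdint.h>

/* Hypothetical per-peer PMTU cache; expires == 0 means "not valid". */
struct peer_pmtu_sketch {
    unsigned long expires; /* expiry time, 0 when nothing is cached */
    uint32_t orig;         /* value held before the ICMP message */
    uint32_t learned;      /* value learned from the ICMP message */
};

/* Record a newly learned PMTU and remember the value it replaces. */
void pmtu_learn(struct peer_pmtu_sketch *p, uint32_t *route_mtu,
                uint32_t new_mtu, unsigned long now, unsigned long lifetime)
{
    if (p->expires == 0)
        p->orig = *route_mtu; /* keep only the pre-ICMP value */

    p->learned = new_mtu;
    p->expires = now + lifetime;
    if (p->expires == 0)
        p->expires = 1; /* zero is reserved for "not valid" */

    *route_mtu = new_mtu;
}

/* At expiry, restore the old value instead of dropping the route. */
void pmtu_check_expiry(struct peer_pmtu_sketch *p, uint32_t *route_mtu,
                       unsigned long now)
{
    if (p->expires != 0 && now >= p->expires) {
        *route_mtu = p->orig;
        p->expires = 0;
    }
}
```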

inetpeer: Abstract address representation further. · 7a71ed89

Committed by David S. Miller on Feb 09, 2011

Future changes will add caching information, and some of
these new elements will be addresses.

Since the family is implicit via the ->daddr.family member,
replicating the family in every address we store is entirely
redundant.
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Feb, 2011 3 commits

net: Kill NETEVENT_PMTU_UPDATE. · 8d13a2a9

Committed by David S. Miller on Feb 08, 2011

Nobody actually does anything in response to the event,
so just kill it off.
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Remove bogus barrier() in dst_allfrag(). · e7b66bdc

Committed by David S. Miller on Feb 08, 2011

I simply missed this one when modifying the other dst
metric interfaces earlier.
Signed-off-by: David S. Miller <davem@davemloft.net>

ipsec: allow to align IPv4 AH on 32 bits · fa9921e4

Committed by Nicolas Dichtel on Feb 02, 2011

The Linux IPv4 AH stack aligns the AH header on a 64 bit boundary
(like in IPv6). This is not RFC compliant (see RFC4302, Section
3.3.3.2.1); it should be aligned on 32 bits.

For most of the authentication algorithms, the ICV size is 96 bits.
The AH header alignment on 32 or 64 bits gives the same results.

However for SHA-256-128 for instance, the wrong 64 bit alignment results
in adding useless padding in IPv4 AH, which is forbidden by the RFC.

To avoid breaking backward compatibility, we use a new flag
(XFRM_STATE_ALIGN4) to change the original behavior.

Initial patch from Dang Hongwu <hongwu.dang@6wind.com> and
Christophe Gouault <christophe.gouault@6wind.com>.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
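
The padding difference is easy to check by hand. The sketch below assumes the 12-byte fixed part of the AH header from RFC 4302 (next header, payload length, reserved, SPI, sequence number) followed by the ICV. `align_up` and `ah_hdr_len` are hypothetical helper names, not kernel functions.

```c
#include <stddef.h>

/* Fixed AH fields before the ICV: 1 + 1 + 2 + 4 + 4 bytes. */
#define AH_FIXED_LEN 12u

/* Round len up to a multiple of boundary (boundary: power of two). */
size_t align_up(size_t len, size_t boundary)
{
    return (len + boundary - 1) & ~(boundary - 1);
}

/* Total AH header length in bytes for a given ICV size and alignment. */
size_t ah_hdr_len(size_t icv_bits, size_t align_bytes)
{
    return align_up(AH_FIXED_LEN + icv_bits / 8, align_bytes);
}
```

For a 96-bit ICV the header is 12 + 12 = 24 bytes, a multiple of both 4 and 8, so the two alignments agree. For a 128-bit ICV it is 12 + 16 = 28 bytes: 32-bit alignment leaves it at 28, while 64-bit alignment pads it to 32, which is the extra padding the commit message refers to.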

06 Feb, 2011 1 commit
- tcp: Add reference to initial CWND ietf draft. · 7eb38527
  Committed by David S. Miller on Feb 05, 2011
```
Suggested by Alexander Zimmermann
Signed-off-by: David S. Miller <davem@davemloft.net>
```
05 Feb, 2011 1 commit

inetpeer: Move ICMP rate limiting state into inet_peer entries. · 92d86829

Committed by David S. Miller on Feb 04, 2011

Like metrics, the ICMP rate limiting bits are cached state about
a destination.  So move them into the inet_peer entries.

If an inet_peer cannot be bound (because of memory allocation
failure or similar), the policy is to allow.
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Feb, 2011 3 commits

include/net/genetlink.h: Allow genlmsg_cancel to accept a NULL argument · 38db9e1d

Committed by Julia Lawall on Jan 28, 2011

nlmsg_cancel can accept NULL as its second argument, so for similarity,
this patch extends genlmsg_cancel to be able to accept a NULL second
argument as well.
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

mac80211: Add testing functionality for TKIP · 681d1190

Committed by Jouni Malinen on Feb 03, 2011

TKIP countermeasures depend on devices being able to detect Michael
MIC failures on received frames and for stations to report errors to
the AP. In order to test that behavior, it is useful to be able to
send out TKIP frames with incorrect Michael MIC. This testing behavior
has minimal effect on the TX path, so it can be added to mac80211 for
convenient use.

The interface for using this functionality is a file in mac80211
netdev debugfs (tkip_mic_test). Writing a MAC address to the file
makes mac80211 generate a dummy data frame that will be sent out using
invalid Michael MIC value. In AP mode, the address needs to be for one
of the associated stations or ff:ff:ff:ff:ff:ff to use a broadcast
frame. In station mode, the address can be anything, e.g., the current
BSSID. It should be noted that this functionality works correctly only
when associated and using TKIP.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

mac80211: add HW flag for disabling auto link-PS in AP mode · d057e5a3

Committed by Arik Nemtsov on Jan 31, 2011

When operating in AP mode the wl1271 hardware filters out null-data
packets as well as management packets. This makes it impossible for
mac80211 to monitor the PS mode by using the PM bit of incoming frames.

Implement a HW flag to indicate that mac80211 should ignore the PM bit.
In addition, expose ieee80211_sta_ps_transition() to make low-level
drivers capable of controlling PS-mode.
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

03 Feb, 2011 1 commit
- tcp: Increase the initial congestion window to 10. · 442b9635
  Committed by David S. Miller on Feb 02, 2011
```
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Nandita Dukkipati <nanditad@google.com>
```
02 Feb, 2011 2 commits

ipv4: Update some fib_hash centric interface names. · 5348ba85

Committed by David S. Miller on Feb 01, 2011

fib_hash_init() --> fib_trie_init()
fib_hash_table() --> fib_trie_table()
Signed-off-by: David S. Miller <davem@davemloft.net>

IPVS: Remove unused variables · a1367647

Committed by Simon Horman on Feb 01, 2011

These variables are unused as a result of the recent netns work.
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Tested-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>

01 Feb, 2011 4 commits

netfilter: ecache: always set events bits, filter them later · 3db7e93d

Committed by Pablo Neira Ayuso on Feb 01, 2011

For the following rule:

iptables -I PREROUTING -t raw -j CT --ctevents assured

The event delivered looks like the following:

 [UPDATE] tcp      6 src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]

Note that the TCP protocol state is not included. For that reason
the CT event filtering is not very useful for conntrackd.

To resolve this issue, instead of conditionally setting the CT events
bits based on the ctmask, we always set them and perform the filtering
in the late stage, just before the delivery.

Thus, the event delivered looks like the following:

 [UPDATE] tcp      6 432000 ESTABLISHED src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>

netfilter: NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros · f703651e

Committed by Jozsef Kadlecsik on Feb 01, 2011

The patch adds the NFNL_SUBSYS_IPSET id and NLA_PUT_NET* macros to the
vanilla kernel.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

ipv4: Consolidate all default route selection implementations. · 0c838ff1

Committed by David S. Miller on Jan 31, 2011

Both fib_trie and fib_hash have a local implementation of
fib_table_select_default().  This is completely unnecessary
code duplication.

Since we now remember the fib_table and the head of the fib
alias list of the default route, we can implement one single
generic version of this routine.

Looking at the fib_hash implementation you may get the impression
that it's possible for there to be multiple top-level routes in
the table for the default route.  The truth is, it isn't, the
insert code will only allow one entry to exist in the zero
prefix hash table, because all keys evaluate to zero and all
keys in a hash table must be unique.
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: Remember FIB alias list head and table in lookup results. · 5b470441

Committed by David S. Miller on Jan 31, 2011

This will be used later to implement fib_select_default() in a
completely generic manner, instead of the current situation where the
default route is re-looked up in the TRIE/HASH table and then the
available aliases are analyzed.
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Jan, 2011 1 commit

net: Add compat ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT · 709b46e8

Committed by Eric W. Biederman on Jan 29, 2011

SIOCGETSGCNT is not a unique ioctl value, as it maps to SIOCPROTOPRIVATE + 1,
which unfortunately means the existing infrastructure for compat networking
ioctls is insufficient.  A trivial compat ioctl implementation would conflict
with:

SIOCAX25ADDUID
SIOCAIPXPRISLT
SIOCGETSGCNT_IN6
SIOCGETSGCNT
SIOCRSSCAUSE
SIOCX25SSUBSCRIP
SIOCX25SDTEFACILITIES

To make this work I have updated the compat_ioctl decode path to mirror the
normal ioctl decode path.  I have added an ipv4 inet_compat_ioctl function
so that I can have ipv4 specific compat ioctls.  I have added a compat_ioctl
function into struct proto so I can break out ioctls by which kind of ip socket
I am using.  I have added a compat_raw_ioctl function because SIOCGETSGCNT only
works on raw sockets.  I have added an ipmr_compat_ioctl that mirrors the normal
ipmr_ioctl.

This was necessary because unfortunately the struct layout for SIOCGETSGCNT
has unsigned longs in it, so it changes between 32bit and 64bit kernels.

This change was sufficient to run a 32bit ip multicast routing daemon on a
64bit kernel.
Reported-by: Bill Fenner <fenner@aristanetworks.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
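
The layout problem can be illustrated with fixed-width types. The structures below are hypothetical models of the request as seen by a 32-bit and a 64-bit user space. The field list (two IPv4 addresses followed by three `unsigned long` counters) is an assumption made for the example, not a copy of the kernel header, and `sg_req_to_compat` is an invented helper name.

```c
#include <stdint.h>

/* Model of the request where unsigned long is 32 bits wide. */
struct sg_req_compat32 {
    uint32_t src;      /* source address */
    uint32_t grp;      /* multicast group address */
    uint32_t pktcnt;
    uint32_t bytecnt;
    uint32_t wrong_if;
};

/* Model of the same request where unsigned long is 64 bits wide. */
struct sg_req_native64 {
    uint32_t src;
    uint32_t grp;
    uint64_t pktcnt;
    uint64_t bytecnt;
    uint64_t wrong_if;
};

/*
 * A compat handler cannot copy the native structure out byte for byte;
 * it has to translate field by field, narrowing the counters.
 */
void sg_req_to_compat(const struct sg_req_native64 *in,
                      struct sg_req_compat32 *out)
{
    out->src = in->src;
    out->grp = in->grp;
    out->pktcnt = (uint32_t)in->pktcnt;
    out->bytecnt = (uint32_t)in->bytecnt;
    out->wrong_if = (uint32_t)in->wrong_if;
}
```

Under these assumptions the 32-bit layout occupies 20 bytes and the 64-bit layout 32 bytes, so a 32-bit daemon and a 64-bit kernel disagree about where every counter lives.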

29 Jan, 2011 3 commits

ipv4: Attach FIB info to dst_default_metrics when possible · 725d1e1b

Committed by David S. Miller on Jan 28, 2011

If there are no explicit metrics attached to a route, hook
fi->fib_info up to dst_default_metrics.
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: Allocate fib metrics dynamically. · 9c150e82

Committed by David S. Miller on Jan 28, 2011

This is the initial gateway towards super-sharing metrics
if they are all set to zero for a route.
Signed-off-by: David S. Miller <davem@davemloft.net>

mac80211: add MCS information to radiotap · 6d744bac

Committed by Johannes Berg on Jan 27, 2011

This adds the MCS information we currently get
from the drivers into radiotap.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

28 Jan, 2011 3 commits

net: Pre-COW metrics for TCP. · a4daad6b

Committed by David S. Miller on Jan 27, 2011

TCP is going to record metrics for the connection,
so pre-COW the route metrics at route cache entry
creation time.

This avoids several atomic operations that have to
occur if we COW the metrics after the entry reaches
global visibility.
Signed-off-by: David S. Miller <davem@davemloft.net>

inetpeer: Mark metrics as "new" in fresh inetpeer entries. · 144001bd

Committed by David S. Miller on Jan 27, 2011

Set the RTAX_LOCKED metric to INETPEER_METRICS_NEW (basically,
all ones) on fresh inetpeer entries.

This way code can determine if default metrics have been loaded
in from a routing table entry already.
Signed-off-by: David S. Miller <davem@davemloft.net>

inetpeer: Add metrics storage to inetpeer entries. · 60659823

Committed by David S. Miller on Jan 26, 2011

Signed-off-by: David S. Miller <davem@davemloft.net>

27 Jan, 2011 1 commit

net: Implement read-only protection and COW'ing of metrics. · 62fa8a84

Committed by David S. Miller on Jan 26, 2011

Routing metrics are now copy-on-write.

Initially a route entry points its metrics at a read-only location.
If a routing table entry exists, it will point there.  Else it will
point at the all zero metric place-holder called 'dst_default_metrics'.

The writeability state of the metrics is stored in the low bits of the
metrics pointer, we have two bits left to spare if we want to store
more states.

For the initial implementation, COW is implemented simply via kmalloc.
However future enhancements will change this to place the writable
metrics somewhere else, in order to increase sharing.  Very likely
this "somewhere else" will be the inetpeer cache.

Note also that this means that metrics updates may transiently fail
if we cannot COW the metrics successfully.

But even by itself, this patch should decrease memory usage and
increase cache locality especially for routing workloads.  In those
cases the read-only metric copies stay in place and never get written
to.

TCP workloads where metrics get updated, and those rare cases where
PMTU triggers occur, will take a very slight performance hit.  But
that hit will be alleviated when the long-term writable metrics
move to a more sharable location.

Since the metrics storage went from a u32 array of RTAX_MAX entries to
what is essentially a pointer, some retooling of the dst_entry layout
was necessary.

Most importantly, we need to preserve the alignment of the reference
count so that it doesn't share cache lines with the read-mostly state,
as per Eric Dumazet's alignment assertion checks.

The only non-trivial bit here is the move of the 'flags' member into
the writeable cacheline.  This is OK since we are always accessing the
flags around the same moment when we made a modification to the
reference count.
Signed-off-by: David S. Miller <davem@davemloft.net>
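
The idea of keeping the writeability state in the low bits of the metrics pointer, with copy-on-write on the first update, can be sketched in user-space C. This is a simplified single-threaded model under stated assumptions: all names carry a `_sketch` suffix or are otherwise hypothetical, `malloc` stands in for kmalloc, and the atomic pointer swap a concurrent implementation would need is left out.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NMETRICS_SKETCH 4        /* hypothetical number of metrics */
#define METRICS_READ_ONLY 0x1UL  /* low bit: pointer is read-only */
#define METRICS_FLAGS 0x3UL      /* low bits reserved for state */

/* Shared all-zero placeholder; its 4-byte alignment keeps 2 bits free. */
const uint32_t default_metrics_sketch[NMETRICS_SKETCH] = { 0 };

struct dst_sketch {
    uintptr_t metrics; /* pointer value with state in the low bits */
};

void dst_init_metrics(struct dst_sketch *d, const uint32_t *ro)
{
    d->metrics = (uintptr_t)ro | METRICS_READ_ONLY;
}

int dst_metrics_read_only(const struct dst_sketch *d)
{
    return (d->metrics & METRICS_READ_ONLY) != 0;
}

const uint32_t *dst_metrics_ptr(const struct dst_sketch *d)
{
    return (const uint32_t *)(d->metrics & ~METRICS_FLAGS);
}

/* Returns a writable array, copying first if needed; NULL on failure. */
uint32_t *dst_metrics_write_ptr(struct dst_sketch *d)
{
    if (dst_metrics_read_only(d)) {
        uint32_t *copy = malloc(NMETRICS_SKETCH * sizeof(*copy));

        if (copy == NULL)
            return NULL; /* the update transiently fails */
        memcpy(copy, dst_metrics_ptr(d), NMETRICS_SKETCH * sizeof(*copy));
        d->metrics = (uintptr_t)copy; /* low bits clear: writable */
    }
    return (uint32_t *)(d->metrics & ~METRICS_FLAGS);
}
```

Routes that are only read keep sharing the read-only array; only a route that is actually written pays for a private copy.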

25 Jan, 2011 1 commit

net: change netdev->features to u32 · 04ed3e74

Committed by Michał Mirosław on Jan 24, 2011

Quoting Ben Hutchings: we presumably won't be defining features that
can only be enabled on 64-bit architectures.

Occurrences found by `grep -r` on net/, drivers/net, include/

[ Move features and vlan_features next to each other in
  struct netdev, as per Eric Dumazet's suggestion -DaveM ]
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

22 Jan, 2011 1 commit

cfg80211: Extend channel to frequency mapping for 802.11j · 59eb21a6

Committed by Bruno Randolf on Jan 17, 2011

Extend channel to frequency mapping for 802.11j Japan 4.9GHz band, according to
IEEE802.11 section 17.3.8.3.2 and Annex J. Because there are now overlapping
channel numbers in the 2GHz and 5GHz band we can't map from channel to
frequency without knowing the band. This is no problem as in most contexts we
know the band. In places where we don't know the band (and WEXT compatibility)
we assume the 2GHz band for channels below 14.

This patch does not implement all channel to frequency mappings defined in
802.11, it's just an extension for 802.11j 20MHz channels. 5MHz and 10MHz
channels as well as 802.11y channels have been omitted.

The following drivers have been updated to reflect the API changes:
iwl-3945, iwl-agn, iwmc3200wifi, libertas, mwl8k, rt2x00, wl1251, wl12xx.
The drivers have been compile-tested only.
Signed-off-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: Brian Prodoehl <bprodoehl@gmail.com>
Acked-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
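
Why the band is needed can be shown with a small mapping function. This is a sketch, not the cfg80211 code: `band_sketch` and `channel_to_frequency_sketch` are hypothetical names, and the constants follow the usual 802.11 channel numbering (2407 + 5 × channel in the 2.4 GHz band with channel 14 at 2484 MHz, 5000 + 5 × channel in the 5 GHz band, and 4000 + 5 × channel for the 4.9 GHz channels 182 to 196).

```c
/* Hypothetical band identifiers for the example. */
enum band_sketch { BAND_2GHZ_SKETCH, BAND_5GHZ_SKETCH };

/* Returns the center frequency in MHz, or 0 if the channel is unknown. */
int channel_to_frequency_sketch(int chan, enum band_sketch band)
{
    if (band == BAND_5GHZ_SKETCH) {
        /* 802.11j 4.9 GHz channels share the 5 GHz band's numbering. */
        if (chan >= 182 && chan <= 196)
            return 4000 + chan * 5;
        return 5000 + chan * 5;
    }

    /* 2.4 GHz band: channel 14 is a special case. */
    if (chan == 14)
        return 2484;
    if (chan >= 1 && chan < 14)
        return 2407 + chan * 5;
    return 0; /* not supported */
}
```

Channel 8, for example, maps to 2447 MHz in the 2.4 GHz band but to 5040 MHz in the 5 GHz band, so the channel number alone is ambiguous.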

21 Jan, 2011 1 commit

net_sched: accurate bytes/packets stats/rates · 9190b3b3

Committed by Eric Dumazet on Jan 20, 2011

In commit 44b82883 (net_sched: pfifo_head_drop problem), we fixed
a problem with pfifo_head drops that incorrectly decreased
sch->bstats.bytes and sch->bstats.packets.

Several qdiscs (CHOKe, SFQ, pfifo_head, ...) are able to drop a
previously enqueued packet, and bstats cannot be changed, so
bstats/rates are not accurate (overestimated).

This patch changes the qdisc_bstats updates to be done at dequeue() time
instead of enqueue() time. bstats counters no longer account for dropped
frames, and rates are more correct, since enqueue() bursts don't have
an effect on the dequeue() rate.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
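
The effect of counting at dequeue() time can be seen in a toy head-drop FIFO. This is a user-space sketch with hypothetical names (`qdisc_sketch`, `enqueue_sketch`, `dequeue_sketch`) and a fixed four-slot ring, not one of the kernel qdiscs.

```c
#include <stddef.h>
#include <stdint.h>

#define QLEN_SKETCH 4

/* Toy queue: a ring of packet lengths plus byte/packet counters. */
struct qdisc_sketch {
    uint32_t len[QLEN_SKETCH];
    size_t head, count;
    uint64_t bytes, packets; /* stand-ins for bstats */
    uint64_t drops;
};

/* Enqueue never touches bytes/packets; a full queue drops its head. */
void enqueue_sketch(struct qdisc_sketch *q, uint32_t pkt_len)
{
    if (q->count == QLEN_SKETCH) {
        q->head = (q->head + 1) % QLEN_SKETCH; /* discard oldest */
        q->count--;
        q->drops++;
    }
    q->len[(q->head + q->count) % QLEN_SKETCH] = pkt_len;
    q->count++;
}

/* Counters are updated only for packets that really leave the queue. */
int dequeue_sketch(struct qdisc_sketch *q, uint32_t *pkt_len)
{
    if (q->count == 0)
        return 0;

    *pkt_len = q->len[q->head];
    q->head = (q->head + 1) % QLEN_SKETCH;
    q->count--;
    q->bytes += *pkt_len;
    q->packets++;
    return 1;
}
```

Because a dropped packet never reaches `dequeue_sketch`, it is never counted, so there is nothing to subtract afterwards.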

openeuler / Kernel: synced successfully more than 1 year ago
